
Linear iteration

We want to solve the linear problem (Mst), ${\bf M}{\bf s} = {\bf t}$, in an iterative fashion, so we assume that the updates to the solution take the general form

${\bf s}_n = {\bf s}_{n-1} + \alpha_n {\bf p}_{n-1} ,$

where ${\bf s}_{n-1}$ is the preceding estimate of ${\bf s}$, ${\bf s}_n$ is the new estimate of ${\bf s}$, ${\bf p}_{n-1}$ is some direction to be specified in the model space, and $\alpha_n$ is an optimization parameter (or direction weight factor). Defining the residual data error as ${\bf r}_n \equiv {\bf t} - {\bf M}{\bf s}_n$, we find the general relation

${\bf r}_n = {\bf r}_{n-1} - \alpha_n {\bf M}{\bf p}_{n-1} .$

One useful way to proceed is to choose the optimization parameter $\alpha_n$ so that the residual vector is decreased, and preferably minimized, at each step of the iteration scheme.
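As a concrete illustration, here is a minimal NumPy sketch of one such update step; the operator ${\bf M}$, data ${\bf t}$, and direction ${\bf p}_{n-1}$ are random placeholders, not taken from the text. It checks that the two expressions for ${\bf r}_n$ agree.

import numpy as np

# Minimal sketch of one linear-iteration step (all names are placeholders).
rng = np.random.default_rng(0)
M = rng.standard_normal((6, 4))   # forward operator M
t = rng.standard_normal(6)        # data vector t
s_prev = np.zeros(4)              # preceding estimate s_{n-1}
p_prev = rng.standard_normal(4)   # search direction p_{n-1}
alpha = 0.5                       # some step length, for illustration

s_new = s_prev + alpha * p_prev   # s_n = s_{n-1} + alpha_n p_{n-1}
r_prev = t - M @ s_prev           # r_{n-1} = t - M s_{n-1}
r_new = t - M @ s_new             # r_n = t - M s_n

# The recursion r_n = r_{n-1} - alpha_n M p_{n-1} yields the same vector:
assert np.allclose(r_new, r_prev - alpha * (M @ p_prev))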

Using the standard inner product notation $(\cdot,\cdot)$, we consider

$\Vert{\bf r}_n\Vert^2 = \Vert{\bf r}_{n-1}\Vert^2 - 2\alpha_n ({\bf r}_{n-1},{\bf M}{\bf p}_{n-1}) + \alpha_n^2 \Vert{\bf M}{\bf p}_{n-1}\Vert^2 .$   (sqres)

Setting the derivative of this quadratic in $\alpha_n$ to zero, we find easily that the optimum choice of $\alpha_n$ under this criterion is

$\alpha_n = \frac{({\bf r}_{n-1},{\bf M}{\bf p}_{n-1})}{\Vert{\bf M}{\bf p}_{n-1}\Vert^2} .$   (optimumalpha)

This formula has the significance that, whenever the residual ${\bf r}_{n-1}$ has a component along the direction ${\bf M}{\bf p}_{n-1}$, $\alpha_n$ is chosen to scale ${\bf M}{\bf p}_{n-1}$ so that this component cancels exactly, thereby removing the contribution that ${\bf M}{\bf p}_{n-1}$ makes to ${\bf r}_n$.
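A small sketch of this choice of step length, again with made-up NumPy vectors: the helper optimal_alpha is a hypothetical name, and the loop merely spot-checks that perturbing $\alpha_n$ can only increase the residual norm.

import numpy as np

def optimal_alpha(r_prev, Mp):
    # alpha_n = (r_{n-1}, M p_{n-1}) / ||M p_{n-1}||^2, equation (optimumalpha)
    return np.dot(r_prev, Mp) / np.dot(Mp, Mp)

rng = np.random.default_rng(1)
r = rng.standard_normal(5)    # residual r_{n-1}
Mp = rng.standard_normal(5)   # the direction mapped through M

a = optimal_alpha(r, Mp)
for da in (-1e-3, 1e-3):      # nearby step lengths do no better
    assert np.linalg.norm(r - a * Mp) <= np.linalg.norm(r - (a + da) * Mp)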

This result implies, therefore, that if $({\bf r}_{n-1},{\bf M}{\bf p}_{n-1}) \ne 0$, then with this choice of $\alpha_n$ we have

$({\bf r}_n, {\bf M}{\bf p}_{n-1}) = ({\bf M}^T{\bf r}_n, {\bf p}_{n-1}) = 0 .$   (porthoggrad)

We used the adjoint property of the inner product in (porthoggrad) to show that ${\bf p}_{n-1}$ is orthogonal to the gradient vector ${\bf g}_n \equiv {\bf M}^T{\bf r}_n$, so called because it is the gradient obtained by differentiating the squared residual error functional associated with (Mst) with respect to ${\bf s}_n^T$.
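The orthogonality in (porthoggrad) is easy to confirm numerically; the sketch below, with arbitrary random inputs of our own choosing, prints a value at the level of rounding error.

import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((6, 4))
t = rng.standard_normal(6)
p = rng.standard_normal(4)         # direction p_{n-1}
s = np.zeros(4)                    # estimate s_{n-1}

r = t - M @ s                      # residual r_{n-1}
Mp = M @ p
alpha = np.dot(r, Mp) / np.dot(Mp, Mp)
r_new = r - alpha * Mp             # residual r_n

g_new = M.T @ r_new                # gradient g_n = M^T r_n
print(np.dot(g_new, p))            # ~0 up to rounding: p_{n-1} orthogonal to g_n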

Thus, at each step of this iterative sequence a vector proportional to some vector ${\bf p}_n$ is added to the solution, while a vector proportional to ${\bf Mp}_n$ is subtracted from the residual. According to formulas (sqres) and (optimumalpha), the squared norm of the residual decreases at each iteration as

$\Vert{\bf r}_n\Vert^2 = \Vert{\bf r}_{n-1}\Vert^2 - \frac{({\bf r}_{n-1},{\bf M}{\bf p}_{n-1})^2}{\Vert{\bf M}{\bf p}_{n-1}\Vert^2} .$
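A quick numerical check of this decrease formula (a sketch with random data, not from the original):

import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((6, 4))
r = rng.standard_normal(6)             # residual r_{n-1}
Mp = M @ rng.standard_normal(4)        # M p_{n-1}

alpha = np.dot(r, Mp) / np.dot(Mp, Mp)
r_new = r - alpha * Mp

lhs = np.dot(r_new, r_new)                               # ||r_n||^2
rhs = np.dot(r, r) - np.dot(r, Mp)**2 / np.dot(Mp, Mp)   # predicted decrease
assert np.isclose(lhs, rhs)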

The sequence of directions will be most efficient if the vectors used in decimating the residual are orthogonal, i.e., if

$({\bf M}{\bf p}_n, {\bf M}{\bf p}_j) = 0 \quad \hbox{for} \quad j = 1, 2, \ldots, n-1 .$

In this case, as follows by induction from formula (porthoggrad), the residual vector is also orthogonal to all those vectors:

$({\bf r}_n, {\bf M}{\bf p}_j) = 0 \quad \hbox{for} \quad j = 1, 2, \ldots, n-1 .$

Using the adjoint relation for the inner product again, we find that

$({\bf M}^T{\bf r}_n, {\bf p}_j) = ({\bf g}_n, {\bf p}_j) = 0 \quad \hbox{for} \quad j = 1, 2, \ldots, n-1 ,$

and

$({\bf p}_n, {\bf M}^T{\bf M}{\bf p}_j) = 0 \quad \hbox{for} \quad j = 1, 2, \ldots, n-1 ,$

which is a statement of conjugacy for the vectors ${\bf p}_n$. Conjugacy is just a generalization of orthogonality in which the vectors are orthogonal relative to the nonstandard inner product $(\cdot,{\bf A}\cdot)$, with ${\bf A}$ a symmetric, positive semidefinite matrix (operator), instead of the standard inner product $(\cdot,\cdot)$, in which ${\bf A}$ is replaced by the identity.
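To make the equivalence concrete, the sketch below builds a direction conjugate to another by a single Gram-Schmidt step in the $(\cdot,{\bf M}^T{\bf M}\,\cdot)$ inner product (the construction is illustrative, not taken from the text) and confirms that conjugacy of the ${\bf p}$'s is the same thing as ordinary orthogonality of the ${\bf Mp}$'s.

import numpy as np

rng = np.random.default_rng(4)
M = rng.standard_normal((6, 4))
A = M.T @ M                        # symmetric, positive semidefinite

p1 = rng.standard_normal(4)
v = rng.standard_normal(4)
# One Gram-Schmidt step in the (., A .) inner product makes p2 conjugate to p1:
p2 = v - (np.dot(v, A @ p1) / np.dot(p1, A @ p1)) * p1

print(np.dot(p2, A @ p1))          # ~0: (p_2, M^T M p_1) = 0
print(np.dot(M @ p2, M @ p1))      # the same quantity: (M p_2, M p_1) = 0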

We conclude that conjugacy is a desirable property of the set of direction vectors ${\bf p}_n$, so the next step needed to obtain a definite iterative process is to construct a convenient sequence of vectors having this property. One set of model vectors that will be available in this iteration sequence is the set of gradient vectors themselves, ${\bf g}_n = {\bf M}^T{\bf r}_n$. We show next why this set plays an important role in constructing the desired sequence.

