
Linear iteration

We want to solve the linear problem (Mst), ${\bf M}{\bf s} = {\bf t}$, in an iterative fashion, so we assume that the updates to the solution take the general form

${\bf s}_n = {\bf s}_{n-1} + \alpha_n {\bf p}_{n-1} ,$

where ${\bf s}_{n-1}$ is the preceding estimate of ${\bf s}$, ${\bf s}_n$ is the new estimate of ${\bf s}$, ${\bf p}_{n-1}$ is some direction to be specified in the model space, and $\alpha_n$ is an optimization parameter (or direction weight factor). Defining the residual data error as ${\bf r}_n \equiv {\bf t} - {\bf M}{\bf s}_n$, we find the general relation

${\bf r}_n = {\bf r}_{n-1} - \alpha_n {\bf M}{\bf p}_{n-1} .$

One useful way to proceed is to choose the optimization parameter $\alpha_n$ so that the residual vector is decreased, and preferably minimized, at each step of the iteration scheme.
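As a concrete illustration, here is a minimal NumPy sketch of one such update step; the operator ${\bf M}$, data ${\bf t}$, and direction ${\bf p}_{n-1}$ are random placeholders, not taken from the text. It checks that the two expressions for ${\bf r}_n$ agree.

import numpy as np

# Minimal sketch of one linear-iteration step (all names are placeholders).
rng = np.random.default_rng(0)
M = rng.standard_normal((6, 4))   # forward operator M
t = rng.standard_normal(6)        # data vector t
s_prev = np.zeros(4)              # preceding estimate s_{n-1}
p_prev = rng.standard_normal(4)   # search direction p_{n-1}
alpha = 0.5                       # some step length, for illustration

s_new = s_prev + alpha * p_prev   # s_n = s_{n-1} + alpha_n p_{n-1}
r_prev = t - M @ s_prev           # r_{n-1} = t - M s_{n-1}
r_new = t - M @ s_new             # r_n = t - M s_n

# The recursion r_n = r_{n-1} - alpha_n M p_{n-1} yields the same vector:
assert np.allclose(r_new, r_prev - alpha * (M @ p_prev))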

Using the standard inner product notation $(\cdot,\cdot)$, we consider

$\Vert{\bf r}_n\Vert^2 = \Vert{\bf r}_{n-1}\Vert^2 - 2\alpha_n ({\bf r}_{n-1},{\bf M}{\bf p}_{n-1}) + \alpha_n^2 \Vert{\bf M}{\bf p}_{n-1}\Vert^2 .$   (sqres)

Setting the derivative of this quadratic in $\alpha_n$ to zero, we find easily that the optimum choice of $\alpha_n$ under this criterion is

$\alpha_n = \frac{({\bf r}_{n-1},{\bf M}{\bf p}_{n-1})}{\Vert{\bf M}{\bf p}_{n-1}\Vert^2} .$   (optimumalpha)

This formula has the significance that, whenever the residual ${\bf r}_{n-1}$ has a component along the direction ${\bf M}{\bf p}_{n-1}$, $\alpha_n$ is chosen to scale ${\bf M}{\bf p}_{n-1}$ so that this component cancels exactly, thereby removing the contribution that ${\bf M}{\bf p}_{n-1}$ makes to ${\bf r}_n$.
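A small sketch of this choice of step length, again with made-up NumPy vectors: the helper optimal_alpha is a hypothetical name, and the loop merely spot-checks that perturbing $\alpha_n$ can only increase the residual norm.

import numpy as np

def optimal_alpha(r_prev, Mp):
    # alpha_n = (r_{n-1}, M p_{n-1}) / ||M p_{n-1}||^2, equation (optimumalpha)
    return np.dot(r_prev, Mp) / np.dot(Mp, Mp)

rng = np.random.default_rng(1)
r = rng.standard_normal(5)    # residual r_{n-1}
Mp = rng.standard_normal(5)   # the direction mapped through M

a = optimal_alpha(r, Mp)
for da in (-1e-3, 1e-3):      # nearby step lengths do no better
    assert np.linalg.norm(r - a * Mp) <= np.linalg.norm(r - (a + da) * Mp)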

This result implies, therefore, that if $({\bf r}_{n-1},{\bf M}{\bf p}_{n-1}) \ne 0$, then with this choice of $\alpha_n$ we have

$({\bf r}_n, {\bf M}{\bf p}_{n-1}) = ({\bf M}^T{\bf r}_n, {\bf p}_{n-1}) = 0 .$   (porthoggrad)

We used the adjoint property of the inner product in (porthoggrad) to show that ${\bf p}_{n-1}$ is orthogonal to the gradient vector ${\bf g}_n \equiv {\bf M}^T{\bf r}_n$, so called because it is the gradient obtained by differentiating the squared residual error functional associated with (Mst) with respect to ${\bf s}_n^T$.
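The orthogonality in (porthoggrad) is easy to confirm numerically; the sketch below, with arbitrary random inputs of our own choosing, prints a value at the level of rounding error.

import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((6, 4))
t = rng.standard_normal(6)
p = rng.standard_normal(4)         # direction p_{n-1}
s = np.zeros(4)                    # estimate s_{n-1}

r = t - M @ s                      # residual r_{n-1}
Mp = M @ p
alpha = np.dot(r, Mp) / np.dot(Mp, Mp)
r_new = r - alpha * Mp             # residual r_n

g_new = M.T @ r_new                # gradient g_n = M^T r_n
print(np.dot(g_new, p))            # ~0 up to rounding: p_{n-1} orthogonal to g_n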

Thus, at each step of this iterative sequence a vector proportional to some vector ${\bf p}_n$ is added to the solution, while a vector proportional to ${\bf Mp}_n$ is subtracted from the residual. According to formulas (sqres) and (optimumalpha), the squared norm of the residual decreases at each iteration as

$\Vert{\bf r}_n\Vert^2 = \Vert{\bf r}_{n-1}\Vert^2 - \frac{({\bf r}_{n-1},{\bf M}{\bf p}_{n-1})^2}{\Vert{\bf M}{\bf p}_{n-1}\Vert^2} .$
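A quick numerical check of this decrease formula (a sketch with random data, not from the original):

import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((6, 4))
r = rng.standard_normal(6)             # residual r_{n-1}
Mp = M @ rng.standard_normal(4)        # M p_{n-1}

alpha = np.dot(r, Mp) / np.dot(Mp, Mp)
r_new = r - alpha * Mp

lhs = np.dot(r_new, r_new)                               # ||r_n||^2
rhs = np.dot(r, r) - np.dot(r, Mp)**2 / np.dot(Mp, Mp)   # predicted decrease
assert np.isclose(lhs, rhs)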

The sequence of directions will be most efficient if the vectors used in decimating the residual are orthogonal, i.e., if

$({\bf M}{\bf p}_n, {\bf M}{\bf p}_j) = 0 \quad \hbox{for} \quad j = 1, 2, \ldots, n-1 .$

In this case, as follows by induction from formula (porthoggrad), the residual vector is also orthogonal to all those vectors:

$({\bf r}_n, {\bf M}{\bf p}_j) = 0 \quad \hbox{for} \quad j = 1, 2, \ldots, n-1 .$

Using the adjoint relation for the inner product again, we find that

$({\bf M}^T{\bf r}_n, {\bf p}_j) = ({\bf g}_n, {\bf p}_j) = 0 \quad \hbox{for} \quad j = 1, 2, \ldots, n-1 ,$

and

$({\bf p}_n, {\bf M}^T{\bf M}{\bf p}_j) = 0 \quad \hbox{for} \quad j = 1, 2, \ldots, n-1 ,$

which is a statement of conjugacy for the vectors ${\bf p}_n$. Conjugacy is just a generalization of orthogonality in which the vectors are orthogonal relative to the nonstandard inner product $(\cdot,{\bf A}\cdot)$, with ${\bf A}$ a symmetric, positive semidefinite matrix (operator), instead of the standard inner product $(\cdot,\cdot)$, in which ${\bf A}$ is replaced by the identity.
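To make the equivalence concrete, the sketch below builds a direction conjugate to another by a single Gram-Schmidt step in the $(\cdot,{\bf M}^T{\bf M}\,\cdot)$ inner product (the construction is illustrative, not taken from the text) and confirms that conjugacy of the ${\bf p}$'s is the same thing as ordinary orthogonality of the ${\bf Mp}$'s.

import numpy as np

rng = np.random.default_rng(4)
M = rng.standard_normal((6, 4))
A = M.T @ M                        # symmetric, positive semidefinite

p1 = rng.standard_normal(4)
v = rng.standard_normal(4)
# One Gram-Schmidt step in the (., A .) inner product makes p2 conjugate to p1:
p2 = v - (np.dot(v, A @ p1) / np.dot(p1, A @ p1)) * p1

print(np.dot(p2, A @ p1))          # ~0: (p_2, M^T M p_1) = 0
print(np.dot(M @ p2, M @ p1))      # the same quantity: (M p_2, M p_1) = 0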

We conclude that conjugacy is a desirable property of the set of direction vectors ${\bf p}_n$, so the next step needed to obtain a definite iterative process is to construct a convenient sequence of vectors having this property. One set of model vectors that will be available in this iteration sequence is the set of gradient vectors themselves, ${\bf g}_n = {\bf M}^T{\bf r}_n$. We show next why this set plays an important role in constructing the desired sequence.

