
METHOD OF LANCZOS

One common method of solving linear inversion problems of the form $\matM\vecs = \vect$ is Lanczos's method (Lanczos, 1950; Golub and Van Loan, 1983; van der Sluis and van der Vorst, 1987). This method introduces a sequence of orthonormal vectors $\vecz^{(k)}$ through a process of tridiagonalization. Applied to the normal equations $\matM^T\matM\vecs = \matM^T\vect$, Lanczos's method is a projection procedure equivalent to the following:

$$\vecz^{(1)}(\vecz^{(1)})^T\matM^T\vect = \matM^T\vect, \eqno(1)$$

$$\left[\vecz^{(1)}(\vecz^{(1)})^T + \vecz^{(2)}(\vecz^{(2)})^T\right]\matM^T\matM\vecz^{(1)} = \matM^T\matM\vecz^{(1)}, \eqno(2)$$
and, for $k \ge 2$,

$$\left[\vecz^{(k-1)}(\vecz^{(k-1)})^T + \vecz^{(k)}(\vecz^{(k)})^T + \vecz^{(k+1)}(\vecz^{(k+1)})^T\right]\matM^T\matM\vecz^{(k)} = \matM^T\matM\vecz^{(k)}. \eqno(3)$$
It is now clear that (1) defines $\vecz^{(1)}$ as the unit vector in the direction of $\matM^T\vect$, which can have no terms from the right null space of $\matM$. Equation (2) defines $\vecz^{(2)}$ as the unit vector found by removing the component of $\matM^T\matM\vecz^{(1)}$ in the direction of $\vecz^{(1)}$ and then normalizing. Equation (3) determines $\vecz^{(k+1)}$ as the unit vector along the component of $\matM^T\matM\vecz^{(k)}$ that is orthogonal to both $\vecz^{(k)}$ and $\vecz^{(k-1)}$. By construction,

$$(\vecz^{(i)})^T\vecz^{(j)} = \delta_{ij}, \eqno(4)$$
so the vectors are orthonormal.
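To make the recursion concrete, here is a minimal NumPy sketch of the three-term recurrence (1)-(3). The helper name lanczos_normal and its return conventions are mine, not the paper's, and no reorthogonalization is performed, which suffices only for small, well-conditioned examples like the one below.
\begin{verbatim}
import numpy as np

def lanczos_normal(M, t, tol=1e-10):
    # Tridiagonalize A = M^T M starting from b = M^T t.
    # Returns Z = [z^(1) ... z^(r)], the diagonals D_1..D_r, and
    # N_1..N_r, where N_1 = ||M^T t|| and N_2..N_r are off-diagonals.
    A = M.T @ M
    b = M.T @ t
    N = [np.linalg.norm(b)]            # N_1 = ||M^T t||, equation (5)
    Z = [b / N[0]]                     # z^(1): unit vector along M^T t
    D = []
    z_prev = np.zeros_like(b)
    while True:
        w = A @ Z[-1]                  # M^T M z^(k)
        D.append(Z[-1] @ w)            # D_k, equation (6)
        w = w - D[-1] * Z[-1]          # remove the z^(k) component
        if len(Z) > 1:
            w = w - N[-1] * z_prev     # remove the z^(k-1) component
        nk = np.linalg.norm(w)         # N_{k+1}, equation (7)
        if nk < tol:                   # N_{r+1} = 0: the rank is reached
            break
        N.append(nk)
        z_prev = Z[-1]
        Z.append(w / nk)               # z^(k+1)
    return np.column_stack(Z), np.array(D), np.array(N)

# orthonormality check, equation (4)
rng = np.random.default_rng(0)
M = rng.standard_normal((20, 8))
t = rng.standard_normal(20)
Z, D, N = lanczos_normal(M, t)
assert np.allclose(Z.T @ Z, np.eye(Z.shape[1]))
\end{verbatim}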

Defining the constants

$$N_1 = \|\matM^T\vect\| = (\vecz^{(1)})^T\matM^T\vect, \eqno(5)$$

$$D_k = (\vecz^{(k)})^T\matM^T\matM\vecz^{(k)} \quad\hbox{for}\quad k = 1, 2, \ldots, \eqno(6)$$
and

$$N_{k+1} = (\vecz^{(k+1)})^T\matM^T\matM\vecz^{(k)} \quad\hbox{for}\quad k = 1, 2, \ldots, \eqno(7)$$
it is then clear that equations (1)-(3) determine a tridiagonal system of the form

$$\left(\begin{array}{cccc} 0 & \cdots & 0 & \vecz^{(k+1)}N_{k+1} \end{array}\right) + \matZ_k \left(\begin{array}{ccccc} D_1 & N_2 & & & \\ N_2 & D_2 & N_3 & & \\ & N_3 & D_3 & N_4 & \\ & & \ddots & \ddots & \ddots \\ & & & N_k & D_k \end{array}\right) = \matM^T\matM\matZ_k \quad\hbox{for}\quad 2 \le k \le r, \eqno(8)$$
where the matrix whose columns are the resulting orthonormal vectors is

$$\matZ_k = \left(\begin{array}{ccccc} \vecz^{(1)} & \vecz^{(2)} & \vecz^{(3)} & \cdots & \vecz^{(k)} \end{array}\right). \eqno(9)$$
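As a concrete check on (8), the following sketch builds a deliberately rank-deficient $\matM$, assembles the tridiagonal coefficient matrix from the $D_k$ and $N_k$, and verifies the three-term relation including the remainder term $\vecz^{(k+1)}N_{k+1}$. It reuses the hypothetical lanczos_normal helper above; the tridiag helper is likewise this note's own. Note that the iteration halts after $r = 6$ steps, anticipating the stopping rule discussed next.
\begin{verbatim}
import numpy as np
# assumes lanczos_normal from the sketch above is in scope

def tridiag(D, N, k):
    # T_k: D_1..D_k on the diagonal, N_2..N_k on the off-diagonals
    # (N[0] holds N_1 = ||M^T t||, which is not an entry of T_k)
    return np.diag(D[:k]) + np.diag(N[1:k], 1) + np.diag(N[1:k], -1)

rng = np.random.default_rng(1)
M = rng.standard_normal((30, 6)) @ rng.standard_normal((6, 10))  # rank 6
t = rng.standard_normal(30)
Z, D, N = lanczos_normal(M, t)
r = len(D)
assert r == 6                     # the recursion stops at the rank of M

# equation (8): M^T M Z_k = Z_k T_k + (0 ... 0  z^(k+1) N_{k+1})
A = M.T @ M
k = r - 2
ek = np.zeros(k); ek[-1] = 1.0
lhs = A @ Z[:, :k]
rhs = Z[:, :k] @ tridiag(D, N, k) + N[k] * np.outer(Z[:, k], ek)
assert np.allclose(lhs, rhs)
\end{verbatim}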

The process stops when $k = r$ (the rank of the matrix $\matM$) because then $N_{r+1} = 0$, or is numerically negligible. It follows that this tridiagonalization process results in the identity

$$\matM^T\matM = \matZ_r\matT_r\matZ_r^T, \eqno(10)$$
where the tridiagonal matrix of coefficients is defined by

$$\matT_k = \left(\begin{array}{ccccc} D_1 & N_2 & & & \\ N_2 & D_2 & N_3 & & \\ & N_3 & D_3 & N_4 & \\ & & \ddots & \ddots & \ddots \\ & & & N_k & D_k \end{array}\right) \quad\hbox{for}\quad 2 \le k \le r. \eqno(11)$$
Since $\matT_r$ is invertible (by the definition of the rank $r$ of $\matM$),

$$(\matM^T\matM)^\dagger = \matZ_r(\matT_r)^{-1}\matZ_r^T. \eqno(12)$$
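Both (10) and (12) can be confirmed numerically against NumPy's pseudoinverse; this sketch continues the rank-deficient example above (the variables M, Z, D, N and the helper tridiag carry over).
\begin{verbatim}
import numpy as np
# continues the rank-deficient example above

Tr = tridiag(D, N, len(D))
A = M.T @ M
assert np.allclose(A, Z @ Tr @ Z.T)          # equation (10)
Adag = Z @ np.linalg.inv(Tr) @ Z.T           # equation (12)
assert np.allclose(Adag, np.linalg.pinv(A))
\end{verbatim}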

The solution to the least-squares inversion problem may therefore be written as

$$\vecs = \matZ_r(\matT_r)^{-1}\matZ_r^T\matM^T\vect = N_1\matZ_r(\matT_r)^{-1}\left(\begin{array}{c} 1 \\ 0 \\ \vdots \\ 0 \end{array}\right), \eqno(13)$$
where I used (1), (5), and the fact that

$$\matZ_r^T\vecz^{(1)} = \left(\begin{array}{c} 1 \\ 0 \\ \vdots \\ 0 \end{array}\right). \eqno(14)$$
It follows that only the first column of the inverse of $\matT_r$ is needed to solve this inversion problem.
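In practice, then, one tridiagonal solve with right-hand side $N_1\vece_1$ yields the solution. A sketch, again reusing the running example; np.linalg.lstsq returns the minimum-norm least-squares solution, which is what (13) produces for a rank-deficient $\matM$.
\begin{verbatim}
import numpy as np
# continues the example: one tridiagonal solve gives the solution

r = len(D)
e1 = np.zeros(r); e1[0] = 1.0
y = np.linalg.solve(tridiag(D, N, r), N[0] * e1)  # T_r y = N_1 e_1
s = Z @ y                                         # s = N_1 Z_r T_r^{-1} e_1
s_ref, *_ = np.linalg.lstsq(M, t, rcond=None)     # minimum-norm solution
assert np.allclose(s, s_ref)
# a banded solver would exploit the tridiagonal structure of T_r;
# the dense solve is kept here only for brevity
\end{verbatim}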

Since Lanczos's method directly produces a sequence of orthonormal vectors in the model space, it is straightforward to see that the model resolution matrix for this method is given by

$$\calR_{model} = \matM^T\matM\,(\matM^T\matM)^\dagger = \matZ_r\matZ_r^T = \sum_{k=1}^{r} \vecz^{(k)}(\vecz^{(k)})^T, \eqno(15)$$
which is also clearly symmetric. It is, however, more difficult to compute the data resolution matrix. Using the fact that

$$\calR_{data} = \matM\matM^\dagger = \matM(\matM^T\matM)^\dagger\matM^T, \eqno(16)$$
together with (12), I find that

$$\calR_{data} = \matM\matZ_r(\matT_r)^{-1}\matZ_r^T\matM^T. \eqno(17)$$
It is clear that both of the resolution matrices are symmetric when the full Lanczos inverse has been computed. If the process is terminated early, so that $k < r$, by setting the constant $N_{k+1}$ equal to zero, then a Lanczos approximate inverse is given by

$$\matX_k = \matZ_k(\matT_k)^{-1}\matZ_k^T\matM^T, \eqno(18)$$
so that $\matX_k\matM\matX_k = \matX_k$ for all $k$, but $\matM\matX_k\matM \ne \matM$ if $k < r$. The effective resolution matrices are given by $\calE_{model} = \matX_k\matM$ and $\calE_{data} = \matM\matX_k$, or are found by replacing $\matZ_r$ and $\matT_r$ in (15) and (17) by $\matZ_k$ and $\matT_k$, respectively. Clearly, the effective resolutions are also symmetric.
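The resolution matrices and the truncation behavior can be checked in the same way. In this sketch, which again continues the running example, the effective model resolution is formed as $\matZ_k\matZ_k^T$, following the convention above that truncation sets $N_{k+1}$ to zero.
\begin{verbatim}
import numpy as np
# continues the example: resolution matrices and the truncated inverse

r = len(D)
Xr = Z @ np.linalg.solve(tridiag(D, N, r), Z.T @ M.T)  # full inverse
R_model = Xr @ M                                       # equation (15)
R_data = M @ Xr                                        # equation (17)
assert np.allclose(R_model, Z @ Z.T)
assert np.allclose(R_model, R_model.T)
assert np.allclose(R_data, R_data.T)

k = r - 2                                # early termination, N_{k+1} -> 0
Zk = Z[:, :k]
Xk = Zk @ np.linalg.solve(tridiag(D, N, k), Zk.T @ M.T)  # equation (18)
assert np.allclose(Xk @ M @ Xk, Xk)      # X_k M X_k = X_k for all k
assert not np.allclose(M @ Xk @ M, M)    # but M X_k M != M when k < r
E_model = Zk @ Zk.T                      # symmetric effective resolution
assert np.allclose(E_model, E_model.T)
\end{verbatim}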

An alternative method of computing the resolution matrices involves noting that the vector sequence $\vecz^{(k)}$ may also be used as the starting set of vectors for the method of conjugate directions. I can then produce a set of conjugate vectors from this orthogonal set and easily compute both the model resolution matrix and the data resolution matrix if desired. This alternative requires additional work to produce the sequence of conjugate vectors, however, and it generally yields a model resolution matrix that is not symmetric and is therefore difficult to interpret. Thus, the Lanczos procedure appears to be superior to conjugate directions/gradients from this point of view. The main disadvantage of the Lanczos method is that the amount of storage increases with the iteration number $k$, since all the vectors $\vecz^{(1)}, \ldots, \vecz^{(k)}$ must be stored until the final calculation of $\vecs^{(k)} = \matX_k\vect$. In contrast, conjugate gradients requires a fixed amount of storage, but an easily interpretable model resolution must be sacrificed to gain this advantage.

