One common method of solving linear inversion problems of the form is Lanczos's method (Lanczos, 1950; Golub and Van Loan, 1983; van der Sluis and van der Vorst, 1987). This method introduces a sequence of orthonormal vectors through a process of tridiagonalization. Applying this method to the normal equations ,Lanczos's method is a projection procedure equivalent to the following:
[^(1)(^(1))^T+^(2)(^(2) )^T]^T^(1) = ^T^(1), and, for ,
[^(k-1)(^(k-1))^T+^(k)(^(k) )^T+^(k+1)(^(k+1))^T]^T ^(k) = ^T^(k). It is now clear that (lanczos1) defines as the unit vector in the direction of , and can have no terms from the right null space of . Equation (lanczos2) defines as the unit vector found by removing the component of in the direction of and then normalizing. Equation (lanczos3) determines as the unit vector along the component of that is orthogonal to both and . By construction,
(^(i))^T^(j) = _ij, so the vectors are orthonormal.
Defining the constants
D_k = (^(k))^T^T^(k) for k = 1,2,..., and
N_k+1 = (^(k+1))^T^T^(k) for k = 1,2,..., it is then clear that the equations (lanczos1)-(lanczos3) determine a tridiagonal system of the form
0&...&0&^(k+1)N_k+1+ _kD_1 & N_2 & & & N_2 & D_2 & N_3 & & & N_3 & D_3 & N_4 & & & & & & & & N_k & D_k = ^T_k for 2kr, where the orthogonal matrix composed of the resulting orthonormal vectors is
The process stops when k=r (the rank of the matrix) because then Nr+1 = 0, or is numerically negligible. It follows that this tridiagonalization process results in the identity
^T= _r_r_r^T, where the tridiagonal matrix of coefficients is defined by
_k = D_1 & N_2 & & & N_2 & D_2 & N_3 & & & N_3 & D_3 & N_4 & & & & & & & & N_k & D_k for 2k r. Since is invertible (by the definition of the rank r of ),
The solution to the least-squares inversion problem may therefore be written as
= _r(_r)^-1_r^T^T= N_1_r(_r)^-1100, where I used (lanczos1), (d1), and the fact that
_r^T^(1) = 100. It follows that only the first column of the inverse of is needed to solve this inversion problem.
Since Lanczos's method directly produces a sequence of orthonormal vectors in the model space, it is straightforward to see that the model resolution matrix for this method is given by
_model = ^T(^T)^= _r_r^T = _k=1^r ^(k)(^(k))^T, which is also clearly symmetric. It is however more difficult to compute the data resolution matrix. Using the fact that
_data = ^= (^T)^ ^T together with (Adagger), I find that
_data = _r(_r)^-1_r^T^T. It is clear that both of the resolution matrices are symmetric when the full Lanczos inverse has been computed. If the process is terminated early so that k < r by setting the constant Nk+1 equal to zero, then a Lanczos approximate inverse is given by
_k = _k(_k)^-1_k^T^T, so that for all k, but if k < r. The effective resolution matrices are given by and ,or are found by replacing and in (modelreslanczos) and (datareslanczos) by and , respectively. Clearly, the effective resolutions are also symmetric.
An alternative method of computing the resolution matrices involves noting that the vector sequence may also be used as the starting set of vectors for the method of conjugate directions. Then, I can produce a set of conjugate vectors from this orthogonal set and easily compute both the model resolution matrix and the data resolution matrix if desired. This alternative requires additional work, however, to produce the sequence of conjugate vectors and it also generally produces a model resolution that is not symmetric, and therefore difficult to interpret. Thus, the Lanczos procedure appears to be superior to conjugate directions/gradients from this point of view. The main disadvantage of the Lanczos method is that the amount of storage increases with the iteration number k, since all the vectors must be stored until the final calculation of . In contrast, conjugate gradients requires a fixed amount of storage, but an easily interpretable model resolution must be sacrificed to gain this advantage.