The method of Lanczos may be applied to any square, symmetric matrix inversion problem. To solve the least-squares inversion problem, I applied the method to the normal matrix in the preceding section. But I could instead have ``completed the square'' and considered the problem
$$
\begin{pmatrix} \mathbf{I} & \mathbf{G} \\ \mathbf{G}^T & \mathbf{0} \end{pmatrix}
\begin{pmatrix} \mathbf{e} \\ \mathbf{m} \end{pmatrix}
=
\begin{pmatrix} \mathbf{d} \\ \mathbf{0} \end{pmatrix},
\tag{PSLSQR}
$$
where $\mathbf{e} = \mathbf{d} - \mathbf{G}\mathbf{m}$ is the residual vector: the first block row defines $\mathbf{e}$, while the second block row, $\mathbf{G}^T\mathbf{e} = \mathbf{0}$, reproduces the normal equations. The resulting modification of Lanczos's method is the LSQR algorithm of Paige and Saunders (1982).
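As a minimal numerical check (my own sketch, not from the text, with made-up $\mathbf{G}$, $\mathbf{d}$, and sizes), one can confirm that solving the augmented symmetric system recovers the ordinary least-squares solution:

```python
import numpy as np

# Sketch: the "completed square" augmented system reproduces least squares.
rng = np.random.default_rng(0)
m, n = 5, 3
G = rng.standard_normal((m, n))
d = rng.standard_normal(m)

# Assemble the symmetric (m+n) x (m+n) augmented matrix and solve.
H = np.block([[np.eye(m), G],
              [G.T, np.zeros((n, n))]])
sol = np.linalg.solve(H, np.concatenate([d, np.zeros(n)]))
e, mod = sol[:m], sol[m:]                # residual part, model part

m_ls = np.linalg.lstsq(G, d, rcond=None)[0]
ok_model = np.allclose(mod, m_ls)        # model part = least-squares solution
ok_resid = np.allclose(e, d - G @ mod)   # e is the residual d - G m
print(ok_model, ok_resid)
```

For full-column-rank $\mathbf{G}$ the augmented matrix is nonsingular, so an ordinary dense solve suffices for this small check.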
To derive this algorithm, first apply Lanczos's method to (PSLSQR) and divide the $(m+n)$-vector $\mathbf{z}^{(k)}$ into two pieces: $\mathbf{g}^{(k)}$ in the $m$-dimensional data space and $\mathbf{h}^{(k)}$ in the $n$-dimensional model space. Thus,
$$
\mathbf{z}^{(k)} = \begin{pmatrix} \mathbf{g}^{(k)} \\ \mathbf{h}^{(k)} \end{pmatrix}.
\tag{zgh}
$$
Then, the tridiagonalization process takes the form
$$
\mathbf{z}^{(1)}(\mathbf{z}^{(1)})^T \begin{pmatrix} \mathbf{d} \\ \mathbf{0} \end{pmatrix}
= \begin{pmatrix} \mathbf{d} \\ \mathbf{0} \end{pmatrix},
\tag{Hlanczos1}
$$
$$
\left[\mathbf{z}^{(1)}(\mathbf{z}^{(1)})^T + \mathbf{z}^{(2)}(\mathbf{z}^{(2)})^T\right]
\begin{pmatrix} \mathbf{I} & \mathbf{G} \\ \mathbf{G}^T & \mathbf{0} \end{pmatrix} \mathbf{z}^{(1)}
=
\begin{pmatrix} \mathbf{I} & \mathbf{G} \\ \mathbf{G}^T & \mathbf{0} \end{pmatrix} \mathbf{z}^{(1)},
\tag{Hlanczos2}
$$
and, for $k \ge 2$,
$$
\left[\mathbf{z}^{(k-1)}(\mathbf{z}^{(k-1)})^T + \mathbf{z}^{(k)}(\mathbf{z}^{(k)})^T + \mathbf{z}^{(k+1)}(\mathbf{z}^{(k+1)})^T\right]
\begin{pmatrix} \mathbf{I} & \mathbf{G} \\ \mathbf{G}^T & \mathbf{0} \end{pmatrix} \mathbf{z}^{(k)}
=
\begin{pmatrix} \mathbf{I} & \mathbf{G} \\ \mathbf{G}^T & \mathbf{0} \end{pmatrix} \mathbf{z}^{(k)}.
\tag{Hlanczos3}
$$
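The three-term relations above can be checked numerically (a sketch of my own, with made-up $\mathbf{G}$ and $\mathbf{d}$): running the ordinary Lanczos recursion on the augmented matrix with starting vector $(\mathbf{d}, \mathbf{0})^T$ shows the Lanczos vectors alternating between purely data-space and purely model-space form.

```python
import numpy as np

# Sketch: plain Lanczos on the augmented matrix, starting from (d, 0).
rng = np.random.default_rng(3)
m, n = 5, 3
G = rng.standard_normal((m, n))
d = rng.standard_normal(m)

H = np.block([[np.eye(m), G],
              [G.T, np.zeros((n, n))]])
z = np.concatenate([d, np.zeros(n)])
z /= np.linalg.norm(z)
z_prev, beta = np.zeros(m + n), 0.0

pattern = []
for k in range(1, 6):
    # classify z^(k): 'g' if its model part vanishes, 'h' otherwise
    pattern.append('g' if np.linalg.norm(z[m:]) < 1e-10 else 'h')
    w = H @ z - beta * z_prev        # three-term Lanczos recursion
    alpha = z @ w
    w -= alpha * z
    beta = np.linalg.norm(w)
    z_prev, z = z, w / beta
print(pattern)   # alternates between data-space and model-space vectors
```

This alternation is exactly what the decoupling derived next makes explicit.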
Solving equations (Hlanczos1)--(Hlanczos3) after substituting (zgh) shows that
$$
\mathbf{g}^{(1)}(\mathbf{g}^{(1)})^T \mathbf{d} = \mathbf{d}
\quad\text{and}\quad
\mathbf{h}^{(1)} = \mathbf{0},
\tag{gh1}
$$
$$
\mathbf{g}^{(2)} = \mathbf{0}
\quad\text{and}\quad
\mathbf{h}^{(2)}(\mathbf{h}^{(2)})^T \mathbf{G}^T\mathbf{g}^{(1)} = \mathbf{G}^T\mathbf{g}^{(1)},
$$
and, for $k \ge 1$,
$$
\left[\mathbf{g}^{(2k-1)}(\mathbf{g}^{(2k-1)})^T + \mathbf{g}^{(2k+1)}(\mathbf{g}^{(2k+1)})^T\right]\mathbf{G}\,\mathbf{h}^{(2k)} = \mathbf{G}\,\mathbf{h}^{(2k)}
\quad\text{and}\quad
\mathbf{h}^{(2k+1)} = \mathbf{0},
$$
and
$$
\mathbf{g}^{(2k+2)} = \mathbf{0}
\quad\text{and}\quad
\left[\mathbf{h}^{(2k)}(\mathbf{h}^{(2k)})^T + \mathbf{h}^{(2k+2)}(\mathbf{h}^{(2k+2)})^T\right]\mathbf{G}^T\mathbf{g}^{(2k+1)} = \mathbf{G}^T\mathbf{g}^{(2k+1)}.
$$
Thus, the tridiagonal system of Lanczos has been reduced to two coupled bidiagonal systems in LSQR.
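The two coupled recursions implied here can be sketched directly (this is Golub-Kahan bidiagonalization; the variable names, sizes, and test data below are mine, not the text's):

```python
import numpy as np

# Sketch of the coupled bidiagonal recursions:
#   G^T g^(2k-1) = b_{2k-1} h^(2k-2) + b_{2k} h^(2k)
#   G   h^(2k)   = b_{2k}   g^(2k-1) + b_{2k+1} g^(2k+1)
rng = np.random.default_rng(1)
mdim, ndim = 6, 4
G = rng.standard_normal((mdim, ndim))
d = rng.standard_normal(mdim)

b = {1: np.linalg.norm(d)}       # the scalars b_1, b_2, ...
g = {1: d / b[1]}                # data-space vectors g^(1), g^(3), ...
h = {0: np.zeros(ndim)}          # model-space vectors (h^(0) = 0 for k = 1)

for k in range(1, ndim + 1):
    w = G.T @ g[2*k - 1] - b[2*k - 1] * h[2*k - 2]
    b[2*k] = np.linalg.norm(w)
    h[2*k] = w / b[2*k]
    u = G @ h[2*k] - b[2*k] * g[2*k - 1]
    b[2*k + 1] = np.linalg.norm(u)
    g[2*k + 1] = u / b[2*k + 1]

# The g's and the h's should each come out orthonormal.
Gk = np.column_stack([g[2*j - 1] for j in range(1, ndim + 1)])
Hk = np.column_stack([h[2*j] for j in range(1, ndim + 1)])
ortho_g = np.allclose(Gk.T @ Gk, np.eye(ndim), atol=1e-8)
ortho_h = np.allclose(Hk.T @ Hk, np.eye(ndim), atol=1e-8)
print(ortho_g, ortho_h)
```

Only matrix-vector products with $\mathbf{G}$ and $\mathbf{G}^T$ are required, which is what makes the scheme attractive for large sparse problems.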
Define the constants
$$
b_1 = \|\mathbf{d}\| = (\mathbf{g}^{(1)})^T\mathbf{d},
\tag{b1}
$$
and, for $k \ge 1$,
$$
b_{2k} = (\mathbf{h}^{(2k)})^T\mathbf{G}^T\mathbf{g}^{(2k-1)} = (\mathbf{g}^{(2k-1)})^T\mathbf{G}\,\mathbf{h}^{(2k)}
$$
and
$$
b_{2k+1} = (\mathbf{h}^{(2k)})^T\mathbf{G}^T\mathbf{g}^{(2k+1)} = (\mathbf{g}^{(2k+1)})^T\mathbf{G}\,\mathbf{h}^{(2k)}.
$$
Then the orthogonal matrix of the $\mathbf{g}$ vectors is defined by
$$
\mathbf{G}_k \equiv \left[\,\mathbf{g}^{(1)} \;\; \mathbf{g}^{(3)} \;\; \cdots \;\; \mathbf{g}^{(2k-1)}\,\right],
$$
while the corresponding matrix of the $\mathbf{h}$ vectors is defined by
$$
\mathbf{H}_k \equiv \left[\,\mathbf{h}^{(2)} \;\; \mathbf{h}^{(4)} \;\; \cdots \;\; \mathbf{h}^{(2k)}\,\right].
$$
Now the two bidiagonal systems may be written, for $k \ge 1$, as
$$
\mathbf{H}_k
\begin{pmatrix}
b_2 & b_3 & & & \\
& b_4 & b_5 & & \\
& & b_6 & b_7 & \\
& & & \ddots & \ddots \\
& & & & b_{2k}
\end{pmatrix}
= \mathbf{G}^T\mathbf{G}_k
\tag{HBG}
$$
and
$$
\left[\,\mathbf{0} \;\; \cdots \;\; \mathbf{0} \;\; b_{2k+1}\,\mathbf{g}^{(2k+1)}\,\right]
+ \mathbf{G}_k
\begin{pmatrix}
b_2 & & & \\
b_3 & b_4 & & \\
& b_5 & b_6 & \\
& & \ddots & \ddots \\
& & & b_{2k-1} & b_{2k}
\end{pmatrix}
= \mathbf{G}\,\mathbf{H}_k.
\tag{GBH}
$$
Then, introducing the lower bidiagonal matrix
$$
\mathbf{B}_k \equiv
\begin{pmatrix}
b_2 & & & \\
b_3 & b_4 & & \\
& b_5 & b_6 & \\
& & \ddots & \ddots \\
& & & b_{2k-1} & b_{2k}
\end{pmatrix},
$$
it is clear that, when $b_{2r+1} = 0$ (or is numerically negligible), multiplying (GBH) on the right by $\mathbf{H}_r^T$ yields the formal relation
$$
\mathbf{G} \simeq \mathbf{G}_r\,\mathbf{B}_r\,\mathbf{H}_r^T,
$$
and also that
$$
\mathbf{G}^{-g}_r = \mathbf{H}_r(\mathbf{B}_r)^{-1}\mathbf{G}_r^T.
$$
The coupled equations (HBG) and (GBH) are equivalent to
$$
\mathbf{H}_r\,\mathbf{B}_r^T\mathbf{B}_r = \mathbf{G}^T\mathbf{G}\,\mathbf{H}_r,
$$
which should be compared to (tridiagonal). If the tridiagonal matrix satisfies $\mathbf{T}_r = \mathbf{B}_r^T\mathbf{B}_r$ (so that $D_k = b_{2k}^2 + b_{2k+1}^2$ and $N_k = b_{2k-1}b_{2k}$), then Lanczos's orthogonal matrix $\mathbf{Q}_r$ for the least-squares problem satisfies $\mathbf{Q}_r = \mathbf{H}_r$.
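These relations can be verified numerically. The sketch below (my own, with made-up sizes; the names `Gr`, `Hr`, `B` are not from the text) uses consistent data $\mathbf{d} = \mathbf{G}\mathbf{m}_{\mathrm{true}}$, so that the recursion terminates with $b_{2r+1} = 0$ at $r = n$ and the formal relation becomes exact:

```python
import numpy as np

# Sketch: build B_r from the recursion and check the matrix relations.
rng = np.random.default_rng(4)
mdim, r = 6, 3
G = rng.standard_normal((mdim, r))
d = G @ rng.standard_normal(r)          # d lies in the range of G

b1 = np.linalg.norm(d)
gs, hs, bs = [d / b1], [], {1: b1}
for k in range(1, r + 1):
    w = G.T @ gs[-1] - (bs[2*k - 1] * hs[-1] if hs else 0.0)
    bs[2*k] = np.linalg.norm(w)
    hs.append(w / bs[2*k])
    u = G @ hs[-1] - bs[2*k] * gs[-1]
    bs[2*k + 1] = np.linalg.norm(u)
    if bs[2*k + 1] > 1e-8 * b1:
        gs.append(u / bs[2*k + 1])

Gr, Hr = np.column_stack(gs), np.column_stack(hs)
B = np.zeros((r, r))                    # the lower bidiagonal B_r
for j in range(r):
    B[j, j] = bs[2*j + 2]
    if j + 1 < r:
        B[j + 1, j] = bs[2*j + 3]

term_ok = bs[2*r + 1] < 1e-8 * b1       # b_{2r+1} vanishes (consistent d)
rel_ok = np.allclose(G, Gr @ B @ Hr.T)  # the formal relation, exact here
trid_ok = np.allclose(Hr @ B.T @ B, G.T @ G @ Hr)
print(term_ok, rel_ok, trid_ok)
```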
The solution of the inversion problem may now be written as
$$
\mathbf{m} = \mathbf{G}^{-g}_r\,\mathbf{d}
= \mathbf{H}_r(\mathbf{B}_r)^{-1}\mathbf{G}_r^T\,\mathbf{d}
= b_1\,\mathbf{H}_r(\mathbf{B}_r)^{-1}
\begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix},
\tag{MHBG}
$$
where I used (gh1), (b1), and the fact that
$$
\mathbf{G}_r^T\,\mathbf{g}^{(1)} = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}.
$$
It follows that only the first column of the inverse of $\mathbf{B}_r$ is needed to solve the inversion problem.
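Because $\mathbf{B}_r$ is lower bidiagonal, that first column is obtained by an $O(r)$ forward substitution. A sketch of the whole solve (my own names and made-up data; consistent $\mathbf{d}$ is used so the finite recursion terminates cleanly):

```python
import numpy as np

# Sketch of (MHBG): bidiagonalize, forward-substitute, compare to lstsq.
rng = np.random.default_rng(5)
mdim, r = 6, 3
G = rng.standard_normal((mdim, r))
d = G @ rng.standard_normal(r)           # consistent data, so b_{2r+1} = 0

b1 = np.linalg.norm(d)
gs, hs, bs = [d / b1], [], {1: b1}
for k in range(1, r + 1):
    w = G.T @ gs[-1] - (bs[2*k - 1] * hs[-1] if hs else 0.0)
    bs[2*k] = np.linalg.norm(w)
    hs.append(w / bs[2*k])
    u = G @ hs[-1] - bs[2*k] * gs[-1]
    bs[2*k + 1] = np.linalg.norm(u)
    if bs[2*k + 1] > 1e-8 * b1:
        gs.append(u / bs[2*k + 1])

# Forward substitution for y = B_r^{-1} (b_1, 0, ..., 0)^T.
y = np.zeros(r)
y[0] = b1 / bs[2]
for j in range(1, r):
    y[j] = -bs[2*j + 1] * y[j - 1] / bs[2*j + 2]

m_lsqr = np.column_stack(hs) @ y         # m = H_r B_r^{-1} (b_1 e_1)
m_ls = np.linalg.lstsq(G, d, rcond=None)[0]
solved = np.allclose(m_lsqr, m_ls)
print(solved)
```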
Once I have the approximate generalized inverse via (MHBG), I can write the model resolution matrix as
$$
\mathbf{R}_{\mathrm{model}} = \mathbf{G}^{-g}_r\,\mathbf{G}
= \mathbf{H}_r(\mathbf{B}_r)^{-1}\mathbf{G}_r^T\,\mathbf{G}_r\,\mathbf{B}_r\,\mathbf{H}_r^T
= \mathbf{H}_r\,\mathbf{H}_r^T.
$$
Although it is tempting to think that the data resolution is given similarly by
$$
\mathbf{R}_{\mathrm{data}} = \mathbf{G}\,\mathbf{G}^{-g}_r
= \mathbf{G}_r\,\mathbf{B}_r\,\mathbf{H}_r^T\,\mathbf{H}_r(\mathbf{B}_r)^{-1}\mathbf{G}_r^T
= \mathbf{G}_r\,\mathbf{G}_r^T,
$$
this is not true in general! It is easy to see that
$$
\mathbf{G}_r\,\mathbf{G}_r^T\,\mathbf{d} = \mathbf{d},
\tag{counterexample}
$$
by construction. Yet the absurd result seemingly implied by (counterexample), that the data resolution is always perfect for this method, cannot hold generally for $\mathbf{R}_{\mathrm{data}}$, since it implies that the data can be perfectly matched in all problems (all possible choices of $\mathbf{G}$ and $\mathbf{d}$), which is clearly false. The difficulty is that the initial vector in LSQR is proportional to the data vector itself, which may contain contributions from the data null space if $\mathbf{G}$ and $\mathbf{d}$ are inconsistent (i.e., there may exist a vector $\mathbf{y}$ such that $\mathbf{G}^T\mathbf{y} = \mathbf{0}$ but $\mathbf{y}^T\mathbf{d} \neq 0$). Since the LSQR process never modifies this initial vector and never multiplies it by $\mathbf{G}^T$, these terms in the data null space never get stripped off during the LSQR computation. This difficulty is equivalent to the observation that $b_{2r+1} \neq 0$ for the LSQR algorithm if $\mathbf{G}$ and $\mathbf{d}$ are inconsistent.
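The failure mode just described can be exhibited numerically (a sketch of my own, with made-up sizes; `Gr`, `bs`, and the thresholds are mine): for inconsistent data, (counterexample) still holds exactly, since $\mathbf{g}^{(1)} = \mathbf{d}/\|\mathbf{d}\|$ is a column of $\mathbf{G}_r$, yet the fitted data never match $\mathbf{d}$, and $b_{2r+1}$ stays away from zero.

```python
import numpy as np

# Sketch: inconsistent d gives G_r G_r^T d = d but G m_r != d.
rng = np.random.default_rng(6)
mdim, r = 6, 3
G = rng.standard_normal((mdim, r))
d = rng.standard_normal(mdim)            # generic d: inconsistent with G

b1 = np.linalg.norm(d)
gs, hs, bs = [d / b1], [], {1: b1}
for k in range(1, r + 1):
    w = G.T @ gs[-1] - (bs[2*k - 1] * hs[-1] if hs else 0.0)
    bs[2*k] = np.linalg.norm(w)
    hs.append(w / bs[2*k])
    u = G @ hs[-1] - bs[2*k] * gs[-1]
    bs[2*k + 1] = np.linalg.norm(u)
    gs.append(u / bs[2*k + 1])

Gr = np.column_stack(gs[:r])
perfect = np.allclose(Gr @ Gr.T @ d, d)  # (counterexample) holds exactly...
leftover = bs[2*r + 1] > 1e-8 * b1       # ...but b_{2r+1} does not vanish

y = np.zeros(r)
y[0] = b1 / bs[2]
for j in range(1, r):
    y[j] = -bs[2*j + 1] * y[j - 1] / bs[2*j + 2]
m_r = np.column_stack(hs) @ y
misfit = np.linalg.norm(d - G @ m_r) > 1e-8 * b1   # data are NOT matched
print(perfect, leftover, misfit)
```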