
LSQR ALGORITHM OF PAIGE AND SAUNDERS

The method of Lanczos may be applied to any square, symmetric matrix inversion problem. To solve the least-squares inversion problem, I applied the method to the normal matrix $\matM^T\matM$ in the preceding section. But, I could have ``completed the square'' with the residual vector and considered instead the problem
\begin{equation}
\left(\begin{array}{cc} \matI & \matM \\ \matM^T & 0 \end{array}\right)
\left(\begin{array}{c} \vecr \\ \vecs \end{array}\right)
= \left(\begin{array}{c} \vect \\ 0 \end{array}\right),
\tag{PSLSQR}
\end{equation}
where $\vecr = \vect - \matM\vecs$ is the residual vector. The resulting modification of Lanczos's method is the LSQR algorithm of Paige and Saunders (1982).
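To make this equivalence concrete, here is a minimal numerical sketch (not from the paper; the dimensions and names are illustrative) checking that solving (PSLSQR) reproduces the ordinary least-squares solution:
\begin{verbatim}
# Sketch: the augmented system (PSLSQR) reproduces the least-squares
# solution.  Dimensions and names here are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
m, n = 6, 4
M = rng.standard_normal((m, n))
t = rng.standard_normal(m)

# [[I, M], [M^T, 0]] (r, s) = (t, 0)
A = np.block([[np.eye(m), M], [M.T, np.zeros((n, n))]])
rs = np.linalg.solve(A, np.concatenate([t, np.zeros(n)]))
r, s = rs[:m], rs[m:]

assert np.allclose(s, np.linalg.lstsq(M, t, rcond=None)[0])
assert np.allclose(r, t - M @ s)   # r is the residual vector
\end{verbatim}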

To derive this algorithm, first apply Lanczos's method to (PSLSQR) and divide the (m+n)-vector $\vecz^{(k)}$ into two pieces: $\vecg^{(k)}$ in the m-dimensional data space and $\vech^{(k)}$ in the n-dimensional model space. Thus,
\begin{equation}
\vecz^{(k)} = \left(\begin{array}{c} \vecg^{(k)} \\ \vech^{(k)} \end{array}\right).
\tag{zgh}
\end{equation}
Then, the tridiagonalization process takes the form

\begin{equation}
\vecz^{(1)}(\vecz^{(1)})^T \left(\begin{array}{c} \vect \\ 0 \end{array}\right)
= \left(\begin{array}{c} \vect \\ 0 \end{array}\right),
\tag{Hlanczos1}
\end{equation}

\begin{equation}
\left[\vecz^{(1)}(\vecz^{(1)})^T + \vecz^{(2)}(\vecz^{(2)})^T\right]
\left(\begin{array}{cc} \matI & \matM \\ \matM^T & 0 \end{array}\right) \vecz^{(1)}
= \left(\begin{array}{cc} \matI & \matM \\ \matM^T & 0 \end{array}\right) \vecz^{(1)},
\tag{Hlanczos2}
\end{equation}

and, for $k \ge 2$,

\begin{equation}
\left[\vecz^{(k-1)}(\vecz^{(k-1)})^T + \vecz^{(k)}(\vecz^{(k)})^T + \vecz^{(k+1)}(\vecz^{(k+1)})^T\right]
\left(\begin{array}{cc} \matI & \matM \\ \matM^T & 0 \end{array}\right) \vecz^{(k)}
= \left(\begin{array}{cc} \matI & \matM \\ \matM^T & 0 \end{array}\right) \vecz^{(k)}.
\tag{Hlanczos3}
\end{equation}

Solving equations (Hlanczos1)-(Hlanczos3) after substituting (zgh) shows that

\begin{equation}
\vecg^{(1)}(\vecg^{(1)})^T \vect = \vect
\quad\mbox{and}\quad
\vech^{(1)} = 0,
\tag{gh1}
\end{equation}

\begin{displaymath}
\vecg^{(2)} = 0
\quad\mbox{and}\quad
\vech^{(2)}(\vech^{(2)})^T \matM^T\vecg^{(1)} = \matM^T\vecg^{(1)},
\end{displaymath}

and, for $k \ge 1$,

\begin{displaymath}
\left[\vecg^{(2k-1)}(\vecg^{(2k-1)})^T + \vecg^{(2k+1)}(\vecg^{(2k+1)})^T\right]\matM\vech^{(2k)} = \matM\vech^{(2k)}
\quad\mbox{and}\quad
\vech^{(2k+1)} = 0,
\end{displaymath}

and

\begin{displaymath}
\vecg^{(2k+2)} = 0
\quad\mbox{and}\quad
\left[\vech^{(2k)}(\vech^{(2k)})^T + \vech^{(2k+2)}(\vech^{(2k+2)})^T\right]\matM^T\vecg^{(2k+1)} = \matM^T\vecg^{(2k+1)}.
\end{displaymath}

Thus, the tridiagonal system of Lanczos has been reduced to two coupled bidiagonal systems in LSQR.

Defining the constants

\begin{equation}
b_1 = \|\vect\| = (\vecg^{(1)})^T\vect,
\tag{b1}
\end{equation}

and, for $k \ge 1$,

\begin{displaymath}
b_{2k} = (\vech^{(2k)})^T\matM^T\vecg^{(2k-1)} = (\vecg^{(2k-1)})^T\matM\vech^{(2k)},
\end{displaymath}

and

\begin{displaymath}
b_{2k+1} = (\vech^{(2k)})^T\matM^T\vecg^{(2k+1)} = (\vecg^{(2k+1)})^T\matM\vech^{(2k)}.
\end{displaymath}

Then, the orthogonal matrix of the odd-indexed vectors $\vecg^{(2k-1)}$ is defined by

\begin{displaymath}
\matG_k = \left(\begin{array}{cccc} \vecg^{(1)} & \vecg^{(3)} & \cdots & \vecg^{(2k-1)} \end{array}\right),
\end{displaymath}

while the corresponding matrix of the vectors $\vech^{(2k)}$ is defined by

\begin{displaymath}
\matH_k = \left(\begin{array}{cccc} \vech^{(2)} & \vech^{(4)} & \cdots & \vech^{(2k)} \end{array}\right).
\end{displaymath}
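In computational form, these recursions are the Golub-Kahan bidiagonalization that underlies LSQR. The sketch below (an illustrative reconstruction, not the author's code) builds $\matG_r$, $\matH_r$, and the scalars $b_k$ directly from $\matM$ and $\vect$; a production code would test each $b_k$ for a numerical zero before dividing, since $b_{2r+1} = 0$ is the termination condition used below.
\begin{verbatim}
# Sketch of the coupled recursions: each new h strips the previous h
# component from M^T g, and each new g strips the previous g component
# from M h; the b_k are the normalization constants defined above.
import numpy as np

def bidiagonalize(M, t, r):
    """Return G_r = (g^(1) g^(3) ... g^(2r-1)),
    H_r = (h^(2) h^(4) ... h^(2r)), and the scalars b_1..b_{2r+1}."""
    b = {1: np.linalg.norm(t)}
    g = t / b[1]                      # g^(1): the normalized data vector
    h = np.zeros(M.shape[1])          # placeholder for h^(0) = 0
    G, H = [g], []
    for k in range(1, r + 1):
        u = M.T @ g - b[2*k - 1] * h  # remove the h^(2k-2) component
        b[2*k] = np.linalg.norm(u)
        h = u / b[2*k]                # h^(2k)
        H.append(h)
        w = M @ h - b[2*k] * g        # remove the g^(2k-1) component
        b[2*k + 1] = np.linalg.norm(w)
        g = w / b[2*k + 1]            # g^(2k+1)
        G.append(g)
    return np.column_stack(G[:r]), np.column_stack(H), b
\end{verbatim}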

Now the two bidiagonal systems may be written, for $2 \le k \le r$, as

\begin{equation}
\matH_k \left(\begin{array}{ccccc}
b_2 & b_3 & & & \\
& b_4 & b_5 & & \\
& & b_6 & b_7 & \\
& & & \ddots & \ddots \\
& & & & b_{2k}
\end{array}\right) = \matM^T\matG_k,
\tag{HBG}
\end{equation}

and

\begin{equation}
\left(\begin{array}{cccc} 0 & \cdots & 0 & \vecg^{(2k+1)}b_{2k+1} \end{array}\right)
+ \matG_k \left(\begin{array}{ccccc}
b_2 & & & & \\
b_3 & b_4 & & & \\
& b_5 & b_6 & & \\
& & \ddots & \ddots & \\
& & & b_{2k-1} & b_{2k}
\end{array}\right) = \matM\matH_k.
\tag{GBH}
\end{equation}

Then, introducing the $k\times k$ lower bidiagonal matrix

\begin{displaymath}
\matB_k = \left(\begin{array}{ccccc}
b_2 & & & & \\
b_3 & b_4 & & & \\
& b_5 & b_6 & & \\
& & \ddots & \ddots & \\
& & & b_{2k-1} & b_{2k}
\end{array}\right),
\end{displaymath}

it is clear that, when $b_{2r+1} = 0$ (or is numerically negligible), multiplying (GBH) on the right by $\matH_r^T$ yields the formal relation (formal because $\matH_r\matH_r^T$ is a projector onto an $r$-dimensional subspace of the model space, not the identity, unless $r = n$)

\begin{displaymath}
\matM \simeq \matG_r\matB_r\matH_r^T,
\end{displaymath}

and also that

\begin{equation}
\matX_r = \matH_r(\matB_r)^{-1}\matG_r^T.
\tag{MHBG}
\end{equation}
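Continuing the sketch above (names still illustrative), $\matB_r$ and $\matX_r$ can be assembled and the bidiagonal systems verified numerically:
\begin{verbatim}
# Sketch: build the lower bidiagonal B_r and the approximate
# generalized inverse X_r of (MHBG); bidiagonalize() is defined above.
import numpy as np

def lower_bidiagonal(b, r):
    B = np.zeros((r, r))
    for j in range(1, r + 1):
        B[j - 1, j - 1] = b[2*j]          # diagonal: b_2, b_4, ..., b_2r
        if j > 1:
            B[j - 1, j - 2] = b[2*j - 1]  # subdiagonal: b_3, b_5, ...
    return B

rng = np.random.default_rng(1)
M = rng.standard_normal((8, 5))
t = rng.standard_normal(8)
r = 4
G, H, b = bidiagonalize(M, t, r)
B = lower_bidiagonal(b, r)

assert np.allclose(H @ B.T, M.T @ G)      # (HBG), exact for any r
# (GBH) holds up to the g^(2r+1) b_{2r+1} term in its last column.
X = H @ np.linalg.inv(B) @ G.T            # approximate inverse (MHBG)
\end{verbatim}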

The coupled equations (HBG) and (GBH) are equivalent to

\begin{displaymath}
\matH_r\matB_r^T\matB_r = \matM^T\matM\matH_r,
\end{displaymath}

which should be compared to (tridiagonal). If the tridiagonal matrix $\matT_r$ satisfies

\begin{displaymath}
\matT_r = \matB_r^T\matB_r
\end{displaymath}

(so that $D_k = b_{2k}^2 + b_{2k+1}^2$ and $N_k = b_{2k-1}b_{2k}$), then Lanczos's orthogonal matrix for the least-squares problem satisfies $\matZ_r = \matH_r$.
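A quick entrywise check of this identity, continuing the sketch above:
\begin{verbatim}
# Sketch: B_r^T B_r is tridiagonal with D_k = b_{2k}^2 + b_{2k+1}^2 and
# N_k = b_{2k-1} b_{2k}; its last diagonal entry lacks the b_{2r+1}^2
# term, consistent with T_r only when b_{2r+1} is negligible.
import numpy as np

T = B.T @ B                               # B, b, r from the sketch above
for k in range(1, r):                     # diagonal entries D_k, k < r
    assert np.isclose(T[k-1, k-1], b[2*k]**2 + b[2*k + 1]**2)
for k in range(2, r + 1):                 # off-diagonal entries N_k
    assert np.isclose(T[k-2, k-1], b[2*k - 1] * b[2*k])
\end{verbatim}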

The solution of the inversion problem may now be written as

\begin{displaymath}
\vecs = \matX_r\vect = \matH_r(\matB_r)^{-1}\matG_r^T\vect
= b_1\matH_r(\matB_r)^{-1}
\left(\begin{array}{c} 1 \\ 0 \\ \vdots \\ 0 \end{array}\right),
\end{displaymath}

where I used (gh1), (b1), and the fact that

\begin{displaymath}
\matG_r^T\vecg^{(1)} = \left(\begin{array}{c} 1 \\ 0 \\ \vdots \\ 0 \end{array}\right).
\end{displaymath}

It follows that only the first column of the inverse of $\matB_r$ is needed to solve the inversion problem.
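Because $\matB_r$ is lower bidiagonal, that first column follows from a trivial forward substitution, as in this continuation of the sketch:
\begin{verbatim}
# Sketch: solve B_r y = b_1 e_1 by forward substitution, so that
# s = H_r y is obtained without ever forming B_r^{-1} explicitly.
import numpy as np

def first_column_solve(b, r):
    y = np.zeros(r)
    y[0] = b[1] / b[2]            # row 1:  b_2 y_1 = b_1
    for j in range(2, r + 1):     # row j:  b_{2j-1} y_{j-1} + b_{2j} y_j = 0
        y[j - 1] = -b[2*j - 1] * y[j - 2] / b[2*j]
    return y

y = first_column_solve(b, r)      # b, r, H, X, t from the sketches above
assert np.allclose(H @ y, X @ t)  # agrees with (MHBG)
\end{verbatim}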

Once I have the approximate generalized inverse via (MHBG), I can write the model resolution matrix as

\begin{displaymath}
\calR_{model} = \matX_r\matM = \matH_r\matH_r^T.
\end{displaymath}
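Numerically, with $\matX_r$ and $\matH_r$ from the sketches above, this identity holds to machine precision for any $r$:
\begin{verbatim}
# Sketch: the model resolution of the rank-r LSQR inverse.
R_model = X @ M
assert np.allclose(R_model, H @ H.T)   # X_r M = H_r H_r^T
\end{verbatim}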

Although it is tempting to think that the data resolution is given similarly by

\begin{displaymath}
\calR_{data} = \matM\matX_r = \matG_r\matG_r^T,
\end{displaymath}

this is not true in general! It is easy to see that

\begin{equation}
\matG_r\matG_r^T\vect = \vect,
\tag{counterexample}
\end{equation}

by construction. Yet the absurd result seemingly implied by (counterexample), that the data resolution is always perfect for this method, cannot hold generally for $\calR_{data}$: it would imply that the data can be perfectly matched in all problems (for all possible choices of $\matM$ and $\vect$), which is clearly false. The difficulty is that the initial vector $\vecg^{(1)}$ in LSQR is proportional to the data vector itself, which may contain contributions from the data null space if $\matM$ and $\vect$ are inconsistent (i.e., there may exist a vector $\vecn$ such that $\vecn^T\matM = 0$ but $\vecn^T\vect \ne 0$). Since the LSQR process never modifies this initial vector and never multiplies it by $\matM\matM^T$, these data null-space terms never get stripped off during the LSQR computation. This difficulty is equivalent to the observation that $\matX_r\matM\matX_r \ne \matX_r$ for the LSQR algorithm if $\matM$ and $\vect$ are inconsistent.
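The caveat is easy to exhibit numerically, continuing the sketch above with a generic (hence inconsistent) right-hand side:
\begin{verbatim}
# Sketch: G_r G_r^T reproduces t exactly, as in (counterexample), yet
# the data are not actually fit: M X_r t retains a misfit because the
# residual never vanishes for inconsistent t.
print(np.allclose(G @ G.T @ t, t))   # True, by construction
print(np.allclose(M @ X @ t, t))     # False for this inconsistent t
\end{verbatim}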



 