[1]
[1] ()
[1]
[1]
[1]
[1]
[1]
[1]
[1]
[1]
[1]
*et al.*
[1]
*Proof:*

(1) |

S T V W min LS [1]LS[#1] [1]LS[#1] Int Bdy k_- k_+ k_s i(r - t) i(r - t) i(z - t)

berryman@sep.stanford.edu

## ABSTRACTResolution matrices are useful in seismic tomography because they allow us to evaluate the information content of reconstructed images. Techniques based on the multiplicity of equivalent exact formulas that may be used to define the resolution matrices have been used previously by the author to design algorithms that avoid the need for any singular value decomposition of the ray-path matrix. An explicit procedure is presented for computing both model and data resolution matrices using Arnoldi's algorithm for iterative inversion in seismic tomography. Arnoldi's method differs from the Lanczos scheme by including explicit reorthogonalization of basis vectors. Some convenient notation is introduced to permit ready comparison of Arnoldi's method with the Lanczos approach. Arnoldi's method requires greater storage of basis vectors but completely overcomes the lack of basis vector orthogonality, which is the major practical limitation of the Lanczos method. |

INTRODUCTION

In an earlier contribution to SEP-77 (Berryman, 1993), I discussed general methods of constructing resolution matrices for iterative inverses and applied these techniques to both Lanczos and Paige-Saunders (LSQR) algorithms (Lanczos, 1950; Paige and Saunders, 1982). The comment was made that those methods could be applied to a wide variety of iterative inversion methods. To illustrate the general utility of the methods, I show how the methods can be applied to Arnoldi's algorithm for iterative inversion. This algorithm may be viewed as a variant of the Lanczos algorithm, but with reorthogonalization of the basis vectors explicitly built in. I will first review the Lanczos method and then present the Arnoldi approach using a similar notation. Hopefully the similarities and differences of the two techniques will then become clear.

REVIEW OF RESOLUTION MATRICES

Discussions of resolution played a central role in the classic paper of Aki
*et al.* (1977) on large scale seismic tomography. Such discussions
continue to play a key role in the interpretation of inversion results in
seismology. For example, in their recent review article -- which also
includes an extensive discussion of resolution in seismic inverse problems --
Evans and Achauer (1993) state that `` evaluating the whole resolution
matrix, not just its diagonal elements, is a required part of interpreting [the
reconstructed model].''
Indeed, any inverse method based on incomplete or imperfect data should be
expected to produce a filtered or blurred image of the object or region
of interest. Apparent resolution is therefore an important figure of merit for
practical imaging and inversion schemes, since the user of such schemes will
eventually want to know what the smallest object is that can be distinguished.
Judgments concerning reliability of features observed in the
reconstructed image depend strongly on our understanding of inherent
resolution capabilities of the inverse method used to produce the image.
Therefore, a convenient quantitative measure of pointwise or cellwise model
reliability is certainly helpful and perhaps essential for subsequent
interpretation.

For completeness, I will review the main ideas about resolution matrices. For linear inversion problems, resolution matrices can be understood most easily by considering a matrix equation of the form

= , and first asking the question: Given matrix and data vector ,what model vector solves this equation? When the matrix is square and invertible, the answer to the question is relatively easy: ,with being the usual matrix inverse of .However, it often happens in geophysical inversion problems that is not square, or not invertible even if it is square. In these situations, the least-squares method is often used, resulting in the normal equations

^T= ^T, which can often be solved approximately for since the normal matrix is square and symmetric -- although it may still be singular. It proves convenient now to introduce an approximate inverse called the Moore-Penrose pseudoinverse (Moore, 1920; Penrose, 1955a). This generalized inverse is the unique matrix that satisfies the four conditions: , , ,and .Although other choices for the approximate inverse are known [for example, see Rao (1965)], I will restrict discussion here to this best known approximate inverse. Then, after multiplying (Mst) on the left by ,the result is

^= ^. If it were true that (the identity matrix), then I would have solved the inversion problem exactly, and also have perfect model resolution. But it is precisely in those problems for which no such inverse exists that the following analysis is needed. Define the matrix coefficient of in (MMsMt) to be the resolution matrix

^.
The deviations of from the identity matrix , *i.e.,*
the components of the difference matrix , determine
the degree of distrust appropriate for the components of the solution
vector that are most poorly resolved.

For definiteness, consider the seismic tomography problem (see Figure 1):
is an ray-path matrix,
is a data *m*-vector of first arrival traveltimes, and is the model *n*-vector for (possibly rectangular) cells of constant slowness
(inverse velocity). I seek the slownesses given the
measured traveltimes in and the estimates of the ray paths between
source and receiver locations contained in the matrix [see
Berryman (1990) and references therein]. Then, the resolution matrix
defined in
(RMM) is the model resolution, since the slowness vector is the desired
model of acoustic wave slowness. A data resolution matrix may also be defined.
First, multiply (MMsMt) on the left by so

^= = ^, and then compare (MMMsMMt) to (Mst), noting that the matrix product multiplying should equal the identity matrix if the approximate inverse is a true inverse. Again, deviations of this matrix from the identity provide information about the degree to which the computed solution makes use of all the data in . Thus, the data resolution matrix (Jackson, 1972; Wiggins, 1972) is defined by

_data ^, while the resolution matrix defined previously in (RMM) is the model resolution (Backus and Gilbert, 1968; Jackson, 1972)

_model ^. Furthermore, for seismic inversion, I must also consider mathematial nonlinearities involved in the process of finding a ray-path matrix that is consistent with the model .For the present purposes, assume that and are the final (and mutually consistent) products of an iterative algorithm (Berryman, 1990). Then, the question of resolution needs to be studied carefully in order to explore fully the range of possible solutions resulting from inherent nonuniqueness of the inverse problem.

The significance of these two resolution matrices may be understood by considering the singular value decomposition (SVD) of the matrix ,given by

= _i=1^r _i _i_i^T,
where the *m*-vectors and *n*-vectors are the eigenvectors
of determined by and
and the s are the eigenvalues.
The eigenvectors are also assumed to satisfy orthonormality
conditions and .The rank of (the number of nonzero eigenvalues) has a value
. The Moore-Penrose pseudoinverse is then known to be given
by

^= _i=1^r _i^-1_i_i^T, so the resolution matrices are written explicitly in terms of sums of the outer products of the eigenvectors as

_data = _i=1^r _i_i^T. When displayed in this form, it is clear that the resolution matrices simply express the completeness of the resolved model or data spaces respectively. They are projection operators on the span of the resolved parts of the model and data vector spaces.

COMPUTING RESOLUTION

Now it is important to recognize that, although the resolution matrices
have generally been *defined* by equations (Rd) and (Rm)
-- or by (vv) and (uu) (which implicitly assume that a singular
value decomposition has been performed), it may nevertheless be possible
*to compute* these matrices in other ways.
Of particular importance is the computation of an effective inverse matrix
generated by an iterative inversion procedure.

To establish the plausibility of computing resolution without singular value decomposition, first consider a simple pedagogical example that would not be ideal for computations. For convenience define to be a parameter having the significance of a continuous iteration number and let be the current approximation to the pseudoinverse .Then the current value of the approximate solution vector is given by and the unresolved part of the data vector is clearly given by the difference vector . The length of this vector is a scalar measure of the unresolved portion of the data. By designing the iterative inversion scheme to decrease the length of this vector progressively as , the derivative of its square with respect to the continuous iteration number is given by

^T(-^T^T)(-) = - ^T^T^T(-)- ^T(-^T^T). A sufficient condition for the traveltime data resolution to improve continuously as is then [see Lu and Berryman (1990)] the equation of motion for given by

= ^T(-), where is some arbitrary scalar that determines the rate of convergence. It follows by construction that the right hand side of (firstderivative) is always negative or zero. Thus, the nonnegative length is a continuously decreasing function of the iteration parameter as long as has the equation of motion given by (eom). Clearly, the right hand side of (eom) vanishes if and only if the approximate inverse matrix satisfies

^T() = ^T, which is equivalent to the normal equations of least-squares, so as expected (Penrose, 1955b).

Defining effective resolution matrices and ,then it follows from (eom) that

_data = ^T(-_data). Assuming initial conditions for the integration are given by and , it is easy to see that the solutions of these equations will be symmetric matrices as desired. This approach becomes an iterative procedure when the equations are solved numerically by discretizing the independent parameter and stepping from one discrete value of to the next.

The procedure just outlined establishes that iterative procedures for computing resolution matrices are certainly possible. However, this particular method is less than ideal because it would be computationally intensive. So next consider a general procedure that could be applied to many iterative methods that are used in practice.

To find a more efficient approach that is also inherently symmetric,
first analyze the SVD of the normal matrix, *i.e.,*

^T= _i=1^r _i_i_i^T_j=1^r _j _j_j^T = _i=1^r _i^2_i_i^T. Then, it is easily shown that

(^T)= _i=1^r _i_i^T = _model, and similarly that an alternative is given by

^T(^T)^= _i=1^r _i_i^T = _model, a formula that is automatically symmetric. The data resolution is given by the various alternative forms

^= ^T(^T)^= (^T) = _j=1^r _j_j^T = _data.

Iterative methods such as the method of Lanczos (1950) for solving (Mst) in the least-squares sense often compute the pseudoinverse -- not of itself but rather -- of the normal matrix . The appearance of the pseudoinverse of the normal matrix in these alternative expressions for the resolution matrices (Rmagain)-(Rdagain) then provides both a motivation and a clue to constructing methods for computing resolution with such iterative methods. Thus, in the following discussion, I am able to show how these various equivalent formulas for the resolution matrices may be used efficiently, in the course of a routine computation whose primary goal is to find an approximate solution to by iterative methods.

TRIDIAGONALIZATION METHOD OF LANCZOS

Consider the linear inversion problem of the form ,where is the unknown.
For a crosshole seismic tomography problem (Berryman, 1990),
is an ray-path matrix, is an *m*-vector of
first arrival traveltimes, and is the model slowness (inverse velocity)
vector.
The method of Lanczos (1950) solves
this problem by introducing a sequence of orthonormal vectors through a process of tridiagonalization. To obtain the minimum norm
least-squares solution of , the method may be applied
to the normal equations (normaleqns),
since they have the form with square symmetric matrix
, unknown vector , and data vector
.

Lanczos algorithm

With and ,the Lanczos algorithm is a projection procedure equivalent to the following:

[^(1)(^(1))^T+^(2)(^(2) )^T]^(1) = ^(1), and, for ,

[^(k-1)(^(k-1))^T+^(k)(^(k) )^T+^(k+1)(^(k+1))^T] ^(k) = ^(k). By simple induction on the recursion formulas, all the basis vectors are ideally orthonormal and therefore satisfy . Of course, these conditions are not satisfied exactly for finite precision calculations -- leading to some practical difficulties that will not be discussed here for lack of space (Parlett, 1980).

For the application to traveltime tomography, the pertinent constants are defined by ,,and . The equations (lanczos1)-(lanczos3) determine a tridiagonal system of the form

^(k+1)_k^TN_k+1+
_k_k = ^T_k for 2kr,
where the unit vector is all zeroes except for a one in the *k*th
position. The tridiagonal matrix of coefficients is defined by

_k = D_1 & N_2 & & &
N_2 & D_2 & N_3 & &
& N_3 & D_3 & N_4 &
& & & &
& & & N_k & D_k
for 2k r,
and where the matrix
is composed of the resulting orthonormal vectors.
In practical implementations of the algorithm, the constants *N*_{k+1}
are generally found through the normalization condition implicit in
(lanczos3).

Assuming infinite precision, the process stops when *k*=*r* (the rank of the
matrix) because then (or is numerically negligible).
It follows from (tridiagonal) that this tridiagonalization process results
in the identity

^T= _r_r_r^T. Since is invertible, the Moore-Penrose inverse (Penrose, 1955a) of the normal matrix is given by

The solution to the least-squares inversion problem may therefore be written as .The first column of the inverse of (the only one needed) may be found using an elementary recursion relation.

Lanczos resolution

Since the Lanczos algorithm directly produces a sequence of orthonormal vectors in the model space, it is straightforward to see that the model resolution matrix for this method is

_model = ^T(^T)^= _r_r^T, which is also clearly symmetric. I compute the data resolution matrix using the fact that together with (Adagger), so that

_data = _r(_r)^-1_r^T^T. Both resolution matrices are symmetric if the full Lanczos inverse is computed.

Assuming only that orthogonality has been maintained among the vectors [a nontrivial assumption for realistic applications (Parlett, 1980)], the matrix is just the span of the model space associated with the ray-path matrix . Thus, (modelreslanczos) is understood intuitively as the result one obtains by finding a new orthonormal set of basis vectors for the span of the resolution matrix. In other methods such as the Paige-Saunders LSQR algorithm (Paige and Saunders, 1982), it is also possible to obtain a corresponding formula for the data resolution. However, for the Lanczos algorithm, the data resolution must be computed indirectly from the available information using (datareslanczos).

ARNOLDI INVERSION WITH REORTHOGONALIZATION

Once again, consider the linear inversion problem of the form
, where is the unknown.
For a crosshole seismic tomography problem (Berryman, 1990),
is an ray-path matrix, is an *m*-vector of
first arrival traveltimes, and is the model slowness (inverse velocity)
vector.
The method of Arnoldi (1951) solves
this problem by introducing a sequence of orthonormal vectors through a process similar to Lanczos tridiagonalization, but with all vectors reorthogonalized against all previous for *i* < *k*. To obtain the minimum norm
least-squares solution of , the method may be applied
to the normal equations (normaleqns),
since they have the form with square symmetric matrix
, unknown vector , and data vector
.

Arnoldi algorithm

With and ,the Arnoldi algorithm is a projection procedure equivalent to the following:

[^(1)(^(1))^T+^(2)(^(2) )^T]^(1) = ^(1), and, for ,

[_i=1^k+1^(i)(^(i))^T]
^(k) = ^(k).
By construction, all the basis vectors satisfy
,so to numerical accuracy the vectors are orthonormal and furthermore
they would each be identical to the corresponding vectors appearing in the Lanczos process if the computations could be performed
with infinite precision. Results would be identical to the results
of Lanczos method in this idealized situation.
In fact, (arnoldi1) and (arnoldi2) are identical to (lanczos1) and (lanczos2),
while (arnoldi3) is identical to (lanczos3) for *k*=2. Differences only begin
to arise for .

For the application to traveltime tomography, the pertinent constants are
defined by
,,and
and (except that ).
Note that *N _{1}* is exactly the same as the constant of the same name
in the Lanczos algorithm. The equations (arnoldi1)-(arnoldi3)
determine a system of the form

_k+1_k+1 = ^T_k for 2kr, where the upper Hessenberg matrix () of coefficients is defined by

_k+1 = d_1 & n_1,2 & n_1,3 & n_1,4 & & n_1,k n_2,1 & d_2 & n_2,3 & n_2,4 & & n_2,k & n_3,2 & d_3 & n_3,4 & & & & n_4,3 & & & & & & & d_k-1 & n_k-1,k & & & & n_k,k-1 & d_k & & & & & n_k+1,k for 2k r, and where the matrix is composed of the resulting orthonormal vectors.

Assuming infinite precision, the process stops when *k*=*r* (the rank of the
matrix) because then (or is numerically negligible).
Thus, the bottom row of the Hessenberg matrix is full of zeroes,
and therefore may be truncated to create the square () matrix
which contains only the first *r* rows of .Then, it follows from (hessenberg) that this process results
in the identity

^T= _r_r_r^T. Since is invertible, the Moore-Penrose inverse (Penrose, 1955a) of the normal matrix is given by

The solution to the least-squares inversion problem may therefore be written as .The first column of the inverse of is the only one needed and may be found using an elementary recursion relation.

Arnoldi resolution

Like the Lanczos algorithm, the Arnoldi algorithm produces a sequence of orthonormal vectors in the model space. It is therefore straightforward to see that the model resolution matrix for this method is

_model = ^T(^T)^= _r_r^T, which is also clearly symmetric. In analogy with the results for the Lanczos algorithm, I can again compute the data resolution matrix using the fact that together with (AdaggerH), so

_data = _r(_r)^-1_r^T^T. Both resolution matrices are symmetric if the full Arnoldi inverse is computed.

By design, orthogonality is maintained among the vectors ,so the matrix is just the span of the model space associated with the ray-path matrix . Thus, (modelresarnoldi) is understood intuitively as the result one obtains by finding a new orthonormal set of basis vectors for the span of the resolution matrix. However, for the Arnoldi algorithm as for the Lanczos algorithm, the data resolution must be computed indirectly from the available information using (dataresarnoldi).

DISCUSSION

If computations could be performed with infinite precision, the Lanczos and
Arnoldi algorithms would be the same. To see that this is so,
consider the definition of Arnoldi matrix element *n*_{i,j}:

n_i,j (^(i))^T^T^(j).
Since the normal matrix is symmetric, it follows immediately
that *n*_{i,j} = *n*_{j,i}, at least in principle. For example, the Lanczos
algorithm computes *N _{2}* =

ACKNOWLEDGMENTS

I thank H. Tal-Ezer for introducing me to the Arnoldi algorithm.

REFERENCES

- []
Aki, K., Christoffersson, A., and Husebye, E. S., 1977, Determination
of the three-dimensional structure of the lithosphere:
*J. Geophys.Res.***82**, 277-296. - []
Arnoldi, W. E., 1951, The principle of minimized iterations in the
solution of the matrix eigenvalue problem:
*Quart.Appl.Math.***9**, 17-29. - []
Backus, G., and Gilbert, F., 1968, The resolving power of gross earth data:
*Geophys.J. R. Astron.Soc.***16**, 169-205. - []
Berryman, J. G., 1990, Stable iterative reconstruction algorithm for nonlinear
traveltime tomography:
*Inverse Problems***6**, 21-42. - []
Berryman, J. G., 1993, Resolution for Lanczos and Paige-Saunders
inverses in tomography: SEP-
**77**, 161-173. - []
Evans, J. R., and Achauer, U., 1993, Teleseismic velocity tomography
using the ACH method: Theory and application to continental-scale studies:
in
*Seismic Tomography: Theory and Practice*, H. M. Iyer and K. Hirahara (eds.), Chapman and Hall, London, pp.319-360. - []
Jackson, D. D., 1972, Interpretation of inaccurate, insufficient and
inconsistent data:
*Geophys.J. R. Astron.Soc.***28**, 97-109. - []
Lanczos, C., 1950, An iteration method for the solution of the eigenvalue
problem of linear differential and integral operators:
*J. Res.Nat.Bur.Stand.***45**, 255-282. - []
Lu, S.-Y., and Berryman, J. G., 1990, Inverse scattering, seismic traveltime
tomography, and neural networks:
*Int.J. Imaging Sys.Techn.***2**, 112-118. - []
Moore, E. H., 1920,
*Bull.Amer.Math.Soc.***26**, 394-395. - []
Paige, C. C., and Saunders, M. A., 1982, LSQR: An algorithm for sparse
linear equations and sparse least squares:
*ACM Trans.Math.Software***8**, 43-71. - []
Parlett, B. N., 1980,
*The Symmetric Eigenvalue Problem*, Chapter 13, Prentice-Hall, Englewood Cliffs, NJ, pp.257-287. - []
Penrose, R., 1955a, A generalized inverse for matrices:
*Proc.Cambridge Philos.Soc.***51**, 406-413. - []
Penrose, R., 1955b, On best approximation solutions of linear matrix equations:
*Proc.Cambridge Philos.Soc.***52**, 17-19. - []
Rao, C. R., 1965,
*Linear Statistical Inference and Its Applications*, Wiley, New York, pp. 24-26. - []
Wiggins, R. A., 1972, The general linear inverse problem: Implications of
surface waves and free oscillations for Earth structure:
*Rev.Geophys.Space Phys.***10**, 251-285.

5/11/2001