Next: CROSSWELL TOMOGRAPHY APPLICATION Up: Berryman & Fomel: Iterative Previous: Data resolution estimate

IMPORTANCE OF MAINTAINING ORTHOGONALITY

As we have been discussing, suppose we want to invert a square, $n\times n$ symmetric matrix ${\bf A}$ and we want to do so using some iterative method like Lanczos 1950, LSQR Paige and Saunders (1982), conjugate gradients Hestenes and Stiefel (1952), etc. Then the iterative algorithm (especially for Lanczos and LSQR) generally may be expressed in the form

AZ_k = Z_kT_k + N_k+1z_k+1e_k^T for $k \le r$ where r is the rank of the matrix ${\bf A}$ and the matrix ${\bf Z}_k$ , given by

Z_k = z_1 & z_2 & ...& z_k , is composed of the column vectors ${\bf z}_i$ for $i = 1,\ldots,k$ that have been generated so far in the iterative process. The other terms in (tridiag) are the unit vector ${\bf e}_k$ , with its single component in the kth position, and a norming constant N_k+1. The formula (tridiag) with ${\bf T}$ being a tridiagonal matrix is just exactly the iterative scheme of Lanczos.

Now the model resolution matrix $\calR_{model}$ for the iterative scheme at the kth iteration is given by

_k = Z_kZ_k^T. However, care must be taken to make sure that the ${\bf z}_i$ 's are orthogonal as they are expected to be in this scheme. The tridiagonalization process produces a sequence of orthogonal vectors in principle, but in practice the orthogonality may break down after several iterations when the calculations are performed at finite precision. To demonstrate how the lack of orthogonality affects the process consider the following facts. Let the eigenvalues of the matrix ${\bf A}$ be ordered so that $\lambda_i^2 \ne 0$ for $i=1,\ldots,r$ and $\lambda_i^2 = 0$ for $i=r+1,\ldots,n$ .(We write the eigenvalues as squares because ${\bf A}$ often takes the form of a normal matrix ${\bf A} = {\bf M}^T{\bf M}$ ,in which case the eigenvalues of ${\bf M}$ are the $\lambda_i$ s.) Then the trace of the matrix is just

TrA = _i=1^r _i^2. The model resolution $\calR_k$ is a projection operator onto a k-dimensional Krylov subspace (the one explored so far by the iterative method) of the n-dimensional space that is both the range and domain of ${\bf A}$ .Taking the trace of $\calR_k{\bf A}$ shows that

Tr[Z_kZ_k^TA] = Tr[Z_k^TAZ_k] = _i=1^k z_i^TAz_i.

Let ${\bf v}_i$ for $i=1,\ldots,r$ be the normalized eigenvectors associated with the eigenvalues $\lambda_i^2$ . Then, assuming only that ${\bf z}_1$ has no components in the null-space of ${\bf A}$ , each of the iteration vectors ${\bf z}_i$ can be expanded in terms of these eigenvectors with the result that

z_i = _j=1^r _ijv_j. Similarly, the eigenvectors can be expanded in terms of the full set of normalized iteration vectors ${\bf z}_i$ according to

v_j = _i=1^r _ijz_i, where the same set of coefficients $\zeta_{ij}$ is used in both expansions and these coefficients must satisfy

_i=1^r _ij^2 = 1 = _j=1^r _ij^2 in order for both sets of vectors to be normalized to unity.

Substituting (zexpansion) into (Krylovtrace), we find easily that

Tr[Z_kZ_k^TA] = _j=1^r _j^2[_i=1^k _ij^2] _j=1^r _j^2. The inequality follows, since the coefficients $\zeta_{ij}$ are all real, and therefore

_j=1^k _ij^2 _j=1^r _ij^2 1. Thus, we can expect that as long as the vectors returned by the iterative scheme are orthogonal, the effective trace of the operator ${\bf T}$ must remain smaller than that of the original operator ${\bf A}$ . If this constraint is violated, then we know that the set of ${\bf z}_i$ 's generated so far are not orthogonal and furthermore that this lack of orthogonality is having serious negative consequences on our ability to compute the resolution operator correctly.

This fact leads to a test for orthogonality that is very easy to implement when our operator is a matrix with known elements. We will make use of this test in the next section.

Next: CROSSWELL TOMOGRAPHY APPLICATION Up: Berryman & Fomel: Iterative Previous: Data resolution estimate

Stanford Exploration Project
11/11/1997