As we have been discussing, suppose we want to invert a square, symmetric matrix $A$, and we want to do so using some iterative method such as that of Lanczos (1950), LSQR (Paige and Saunders, 1982), or conjugate gradients (Hestenes and Stiefel, 1952). Then the iterative algorithm (especially for Lanczos and LSQR) may generally be expressed in the form
\begin{equation}
A Z_k = Z_k T_k + N_{k+1} z_{k+1} e_k^T ,
\label{tridiag}
\end{equation}
for $k \le r$, where $r$ is the rank of the matrix $A$, and the matrix $Z_k$, given by $Z_k = \left( z_1 \; z_2 \; \cdots \; z_k \right)$, is composed of the column vectors $z_i$ for $i = 1, \ldots, k$ that have been generated so far in the iterative process. The other terms in \eqref{tridiag} are the unit vector $e_k$, with its single nonzero component in the $k$th position, and a norming constant $N_{k+1}$. The formula \eqref{tridiag} with $T_k$ being a tridiagonal matrix is exactly the iterative scheme of Lanczos.
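To make the recurrence concrete, here is a minimal sketch of a bare Lanczos iteration without reorthogonalization, written in Python with NumPy (our choice of language for illustration, not anything prescribed by the scheme itself); the function name \texttt{lanczos} and the random test matrix are illustrative assumptions only:
\begin{verbatim}
import numpy as np

def lanczos(A, z1, k):
    """Minimal Lanczos sketch (no reorthogonalization).

    Returns Z_k (n x k), the tridiagonal T_k, the norming constant
    N_{k+1}, and z_{k+1}, so that
    A Z_k = Z_k T_k + N_{k+1} z_{k+1} e_k^T.
    """
    n = A.shape[0]
    Z = np.zeros((n, k))
    alpha = np.zeros(k)          # diagonal of T_k
    beta = np.zeros(k)           # off-diagonals of T_k (beta[0] unused)
    z_prev = np.zeros(n)
    z = z1 / np.linalg.norm(z1)
    for i in range(k):
        Z[:, i] = z
        w = A @ z - beta[i] * z_prev
        alpha[i] = z @ w         # diagonal entry, z orthogonal to z_prev
        w = w - alpha[i] * z
        norm_w = np.linalg.norm(w)
        if i + 1 < k:
            beta[i + 1] = norm_w
        z_prev, z = z, w / norm_w
    T = np.diag(alpha) + np.diag(beta[1:], 1) + np.diag(beta[1:], -1)
    return Z, T, norm_w, z       # norm_w is N_{k+1}, z is z_{k+1}

# Check the three-term relation (tridiag) on a random symmetric matrix.
rng = np.random.default_rng(0)
B = rng.standard_normal((8, 8))
A = B + B.T
Z, T, N, z_next = lanczos(A, rng.standard_normal(8), 5)
e_k = np.zeros(5)
e_k[-1] = 1.0
print(np.max(np.abs(A @ Z - Z @ T - N * np.outer(z_next, e_k))))  # ~1e-15
\end{verbatim}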
Now the model resolution matrix for the iterative scheme at the $k$th iteration is given by
\begin{equation}
R_k = Z_k Z_k^T .
\end{equation}
However, care must be taken to ensure that the $z_i$'s are orthogonal, as they are expected to be in this scheme. The tridiagonalization process produces a sequence of orthogonal vectors in principle, but in practice the orthogonality may break down after several iterations when the calculations are performed in finite precision.
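Continuing the sketch above, one can watch the orthogonality of the computed $z_i$'s directly by forming $Z_k^T Z_k$ and comparing it with the identity; for a small well-conditioned test matrix the drift is tiny, but for large ill-conditioned problems it is known to grow to order one:
\begin{verbatim}
# Monitor loss of orthogonality in the computed iteration vectors.
for k in (2, 4, 6):
    Z, T, N, z_next = lanczos(A, rng.standard_normal(8), k)
    R_k = Z @ Z.T                                    # model resolution matrix
    ortho_err = np.linalg.norm(Z.T @ Z - np.eye(k))  # 0 in exact arithmetic
    print(k, ortho_err)
\end{verbatim}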
To demonstrate how the lack of orthogonality affects the process, consider the following facts. Let the eigenvalues of the matrix $A$ be ordered so that $\lambda_i^2 > 0$ for $i = 1, \ldots, r$ and $\lambda_i^2 = 0$ for $i = r+1, \ldots, n$. (We write the eigenvalues as squares because $A$ often takes the form of a normal matrix $G^T G$, in which case the eigenvalues of $A$ are the squares $\lambda_i^2$ of the singular values of $G$.) Then the trace of the matrix is just
\begin{equation}
\mathrm{Tr}\,A = \sum_{i=1}^r \lambda_i^2 .
\end{equation}
The model resolution is a projection operator onto a $k$-dimensional Krylov subspace (the one explored so far by the iterative method) of the $n$-dimensional space that is both the range and domain of $A$. Taking the trace of $R_k A$ shows that
\begin{equation}
\mathrm{Tr}\left[ Z_k Z_k^T A \right] = \mathrm{Tr}\left[ Z_k^T A Z_k \right] = \sum_{i=1}^k z_i^T A z_i .
\label{Krylovtrace}
\end{equation}
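As a quick numerical sanity check of \eqref{Krylovtrace}, still in the assumed Python/NumPy setting, we now take $A = G^T G$ positive semidefinite so that the squared-eigenvalue convention above applies:
\begin{verbatim}
# Verify the trace identity (Krylovtrace) numerically.
G = rng.standard_normal((8, 8))
A = G.T @ G                      # normal-matrix form, eigenvalues lambda_i^2
k = 5
Z, T, N, z_next = lanczos(A, rng.standard_normal(8), k)
lhs = np.trace(Z @ Z.T @ A)
rhs = sum(Z[:, i] @ A @ Z[:, i] for i in range(k))
print(lhs, rhs)                  # equal up to rounding
print(np.trace(A))               # the upper bound derived below
\end{verbatim}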
Let $v_j$ for $j = 1, \ldots, r$ be the normalized eigenvectors associated with the eigenvalues $\lambda_j^2$. Then, assuming only that the starting vector $z_1$ has no components in the null-space of $A$, each of the iteration vectors $z_i$ can be expanded in terms of these eigenvectors, with the result that
\begin{equation}
z_i = \sum_{j=1}^r \gamma_{ij} v_j .
\label{zexpansion}
\end{equation}
Similarly, the eigenvectors can be expanded in terms of the full set of normalized iteration vectors according to
\begin{equation}
v_j = \sum_{i=1}^r \gamma_{ij} z_i ,
\end{equation}
where the same set of coefficients is used in both expansions, and these coefficients must satisfy
\begin{equation}
\sum_{i=1}^r \gamma_{ij}^2 = 1 = \sum_{j=1}^r \gamma_{ij}^2
\end{equation}
in order for both sets of vectors to be normalized to unity.
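The coefficients can be computed explicitly as inner products $\gamma_{ij} = z_i^T v_j$, which gives a quick numerical check of the two normalization conditions (again only a sketch under the assumptions above; note that the column sums drift away from unity exactly when the $z_i$'s lose orthogonality):
\begin{verbatim}
# gamma_{ij} = z_i . v_j from the eigendecomposition of A (full rank here).
evals, V = np.linalg.eigh(A)
Zfull, *_ = lanczos(A, rng.standard_normal(8), 8)  # full set of z_i
Gamma = Zfull.T @ V                                # Gamma[i, j] = gamma_{ij}
print(np.sum(Gamma**2, axis=1))  # rows ~ 1: each z_i is normalized
print(np.sum(Gamma**2, axis=0))  # cols ~ 1: requires the z_i orthonormal
\end{verbatim}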
Substituting \eqref{zexpansion} into \eqref{Krylovtrace}, we find easily that
\begin{equation}
\mathrm{Tr}\left[ Z_k Z_k^T A \right] = \sum_{j=1}^r \lambda_j^2 \left[ \sum_{i=1}^k \gamma_{ij}^2 \right] \le \sum_{j=1}^r \lambda_j^2 .
\end{equation}
The inequality follows since the coefficients are all real, and therefore
\begin{equation}
\sum_{i=1}^k \gamma_{ij}^2 \le \sum_{i=1}^r \gamma_{ij}^2 = 1 .
\end{equation}
Thus, we can expect that, as long as the vectors returned by the iterative scheme are orthogonal, the effective trace of the operator $Z_k Z_k^T A$ must remain smaller than that of the original operator $A$. If this constraint is violated, then we know that the set of $z_i$'s generated so far is not orthogonal, and furthermore that this lack of orthogonality is having serious negative consequences on our ability to compute the resolution operator correctly.
This fact leads to a test for orthogonality that is very easy to implement when our operator is a matrix with known elements. We will make use of this test in the next section.
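A minimal sketch of such a test, in the same assumed Python/NumPy setting as the earlier fragments (the function name, tolerance, and return convention are all illustrative choices), might look as follows:
\begin{verbatim}
def trace_test(A, Z, tol=1e-8):
    """Trace-based orthogonality test described in the text.

    Assumes A is symmetric positive semidefinite. If Tr[Z_k Z_k^T A]
    exceeds Tr[A] beyond rounding, the columns of Z cannot all be
    orthogonal, and R_k = Z Z^T should not be trusted.
    """
    effective = np.trace(Z.T @ A @ Z)   # = Tr[Z_k Z_k^T A], cheaper to form
    total = np.trace(A)
    return effective <= total + tol * abs(total)
\end{verbatim}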