First, we notice that the scaling coefficient simplifies with
the choice
to the form
_n = (r_n-1,Mg_n-1) ||Mp_n-1||^2 = (M^Tr_n-1,g_n-1) ||Mp_n-1||^2 = ||g_n-1||^2 ||Mp_n-1||^2, and the residual decrease (minres) becomes
||r_n||^2 = ||r_n-1||^2 - ||g_n-1||^2 ||Mp_n-1||^2 According to formula (cgminres), the residual norm is guaranteed to decrease monotonically at each iteration as long as the gradient is different from zero.
Second, applying formula (resmpj), we notice the equality
(g_n,g_j) =
(g_n,p_j) +
_i=1^j-1_j^(i)(g_n,p_i) = 0
for
j = 1,2,...,n-1,
which is precisely equivalent to the conjugacy of the residual
vectors, suggested earlier by (conjugateres). Equation
(ortgrad) states that the residuals from successive
conjugate-gradient iterations form an orthogonal basis in the space of
the model . This fact assures that the global minimum in an
n-dimensional space can be found, in precise arithmetic, in exactly
n iterations. We can see that the validity of this remarkable fact
is based entirely upon the orthogonality condition (ortgrad).
With , we can rewrite equation (betanj) in
the form
_n^(j) = (M g_n,Mp_j) ||Mp_j||^2 = (M g_n,r_j+1-r_j) _j+1||Mp_j||^2 = (g_n,g_j+1-g_j) _j+1||Mp_j||^2. It follows immediately from formula (cgbetanj) and the orthogonality condition that
_n^(j) = 0 for j = 1,2,...,n-2, and
_n^(n-1) = ||g_n||^2 _n||Mp_n-1||^2 = ||g_n||^2 ||g_n-1||^2. The latter equality follows from formula (cgalpha). Thus, the next direction of the conjugate-gradient iteration is completely defined by a linear combination of the current gradient and the previous direction:
p_n = g_n - _n^(n-1)p_n-1. Equations (cgalpha), (cgbetan), and (pn) provide a complete definition of the classic conjugate-gradient algorithm Fletcher and Reeves (1964); Hestenes and Stiefel (1952).
Summarizing our derivation, we conclude that the success of the conjugate-direction method is supported by the orthogonality condition (preconjugacy). The success of the conjugate-gradient method requires, in addition, the conjugacy condition (conjugateres), which can be expressed in the model-space as the orthogonality of the gradients (ortgrad).
The next section shows how the orthogonal sets of vectors in the data and model spaces translate into the effective resolution operators.