First, we notice that, with this choice of direction, the scaling coefficient $\alpha_n$ simplifies to the form

\begin{equation}
\alpha_n =
\frac{(\mathbf{r}_{n-1},\,\mathbf{M}\,\mathbf{g}_{n-1})}{\|\mathbf{M}\,\mathbf{p}_{n-1}\|^2} =
\frac{(\mathbf{M}^T \mathbf{r}_{n-1},\,\mathbf{g}_{n-1})}{\|\mathbf{M}\,\mathbf{p}_{n-1}\|^2} =
\frac{\|\mathbf{g}_{n-1}\|^2}{\|\mathbf{M}\,\mathbf{p}_{n-1}\|^2}\;,
\tag{cgalpha}
\end{equation}
and the residual decrease (minres) becomes

\begin{equation}
\|\mathbf{r}_n\|^2 = \|\mathbf{r}_{n-1}\|^2 -
\frac{\|\mathbf{g}_{n-1}\|^4}{\|\mathbf{M}\,\mathbf{p}_{n-1}\|^2}\;.
\tag{cgminres}
\end{equation}
According to formula (cgminres), the residual norm is guaranteed
to decrease monotonically at each iteration as long as the gradient is
different from zero.
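This monotonic decrease is easy to verify numerically. The following sketch (my own illustration, not from the paper) runs least-squares conjugate gradients on a small random operator and records the residual norms; all variable names are assumptions, and the direction update uses the common textbook sign convention (a plus sign), which may differ from the sign conventions adopted here.

```python
import numpy as np

# Illustrative sketch: least-squares CG on a random full-rank operator M,
# checking that the data-residual norm ||r_n|| decreases monotonically,
# as formula (cgminres) predicts.
rng = np.random.default_rng(0)
M = rng.standard_normal((8, 5))   # forward operator (assumed name)
d = rng.standard_normal(8)        # data

m = np.zeros(5)                   # model estimate
r = d - M @ m                     # data residual
g = M.T @ r                       # gradient
p = g.copy()                      # first direction is the gradient

norms = [np.linalg.norm(r)]
for _ in range(5):
    Mp = M @ p
    alpha = (g @ g) / (Mp @ Mp)   # step length: ||g||^2 / ||Mp||^2
    m += alpha * p
    r -= alpha * Mp
    g_new = M.T @ r
    beta = (g_new @ g_new) / (g @ g)  # ratio ||g_n||^2 / ||g_{n-1}||^2
    p = g_new + beta * p          # textbook sign convention
    g = g_new
    norms.append(np.linalg.norm(r))

# Residual norms never increase while the gradient is nonzero.
assert all(norms[i + 1] <= norms[i] + 1e-12 for i in range(len(norms) - 1))
```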

Second, applying formula (resmpj), we notice the equality

\begin{equation}
(\mathbf{g}_n,\,\mathbf{g}_j) =
(\mathbf{g}_n,\,\mathbf{p}_j) +
\sum_{i=1}^{j-1} \beta_j^{(i)}\,(\mathbf{g}_n,\,\mathbf{p}_i) = 0
\quad \mbox{for} \quad j = 1, 2, \ldots, n-1\;,
\tag{ortgrad}
\end{equation}
which is precisely equivalent to the conjugacy of the residual
vectors, suggested earlier by (conjugateres). Equation
(ortgrad) states that the gradients from successive
conjugate-gradient iterations form an orthogonal basis in the model
space. This fact assures that the global minimum in an
*n*-dimensional model space can be found, in exact arithmetic, in exactly
*n* iterations. We can see that the validity of this remarkable fact
is based entirely upon the orthogonality condition (ortgrad).
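Both properties, orthogonality of the gradients and convergence in exactly *n* steps, can be checked numerically. The sketch below is my own illustration under the usual textbook sign conventions; the operator, data, and variable names are assumptions.

```python
import numpy as np

# Numerical check of the orthogonality condition (ortgrad): gradients from
# successive CG iterations are mutually orthogonal, and the least-squares
# minimum is reached in n = dim(model space) iterations.
rng = np.random.default_rng(1)
n = 6
M = rng.standard_normal((10, n))
d = rng.standard_normal(10)

m = np.zeros(n)
r = d - M @ m
g = M.T @ r
p = g.copy()
grads = [g.copy()]                # store successive gradients

for _ in range(n):
    Mp = M @ p
    alpha = (g @ g) / (Mp @ Mp)
    m += alpha * p
    r -= alpha * Mp
    g_new = M.T @ r
    beta = (g_new @ g_new) / (g @ g)
    p = g_new + beta * p
    g = g_new
    grads.append(g.copy())

# Pairwise inner products of the first n gradients vanish (up to round-off).
G = np.array(grads[:n])
dots = G @ G.T
assert np.allclose(dots - np.diag(np.diag(dots)), 0, atol=1e-6)

# After n iterations the gradient is numerically zero: the minimum is found.
assert np.linalg.norm(g) < 1e-6 * np.linalg.norm(M.T @ d)
```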

Using the recursion for the residuals, we can rewrite equation (betanj) in the form

\begin{equation}
\beta_n^{(j)} =
\frac{(\mathbf{M}\,\mathbf{g}_n,\,\mathbf{M}\,\mathbf{p}_j)}{\|\mathbf{M}\,\mathbf{p}_j\|^2} =
\frac{(\mathbf{M}\,\mathbf{g}_n,\,\mathbf{r}_{j+1}-\mathbf{r}_j)}{\alpha_{j+1}\,\|\mathbf{M}\,\mathbf{p}_j\|^2} =
\frac{(\mathbf{g}_n,\,\mathbf{g}_{j+1}-\mathbf{g}_j)}{\alpha_{j+1}\,\|\mathbf{M}\,\mathbf{p}_j\|^2}\;.
\tag{cgbetanj}
\end{equation}
It follows immediately from formula (cgbetanj) and the
orthogonality condition that

\begin{equation}
\beta_n^{(j)} = 0 \quad \mbox{for} \quad j = 1, 2, \ldots, n-2\;,
\end{equation}
and

\begin{equation}
\beta_n^{(n-1)} =
\frac{\|\mathbf{g}_n\|^2}{\alpha_n\,\|\mathbf{M}\,\mathbf{p}_{n-1}\|^2} =
\frac{\|\mathbf{g}_n\|^2}{\|\mathbf{g}_{n-1}\|^2}\;.
\tag{cgbetan}
\end{equation}
The latter equality follows from formula (cgalpha).
Thus, the next direction of the conjugate-gradient iteration is
completely defined by a linear combination of the current gradient and
the previous direction:

\begin{equation}
\mathbf{p}_n = \mathbf{g}_n - \beta_n^{(n-1)}\,\mathbf{p}_{n-1}\;.
\tag{pn}
\end{equation}
Equations (cgalpha), (cgbetan), and (pn) provide a
complete definition of the classic conjugate-gradient algorithm
(Hestenes and Stiefel, 1952; Fletcher and Reeves, 1964).
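The three formulas can be collected into a compact implementation. The sketch below is my own assembly of the algorithm for the least-squares problem $\min \|\mathbf{d} - \mathbf{M}\,\mathbf{m}\|$; the function name is an assumption, and the direction update uses the common textbook convention (a plus sign), which may differ from the sign conventions of equation (pn).

```python
import numpy as np

def conjugate_gradients(M, d, niter):
    """Minimize ||d - M m||^2 by conjugate gradients, starting from m = 0."""
    m = np.zeros(M.shape[1])
    r = d - M @ m                  # data residual
    g = M.T @ r                    # gradient
    p = g.copy()                   # first search direction is the gradient
    for _ in range(niter):
        gg = g @ g
        if gg == 0.0:              # gradient vanished: minimum reached
            break
        Mp = M @ p
        alpha = gg / (Mp @ Mp)     # step length, cf. (cgalpha)
        m += alpha * p
        r -= alpha * Mp
        g = M.T @ r
        beta = (g @ g) / gg        # gradient-norm ratio, cf. (cgbetan)
        p = g + beta * p           # new direction, cf. (pn)
    return m

# In exact arithmetic, n iterations suffice for an n-dimensional model space;
# compare against a direct least-squares solve.
rng = np.random.default_rng(2)
M = rng.standard_normal((12, 7))
d = rng.standard_normal(12)
m_cg = conjugate_gradients(M, d, niter=7)
m_ls, *_ = np.linalg.lstsq(M, d, rcond=None)
assert np.allclose(m_cg, m_ls, atol=1e-6)
```

The algorithm touches the operator only through the products $\mathbf{M}\,\mathbf{p}$ and $\mathbf{M}^T\mathbf{r}$, so it applies unchanged to operators available only as matrix-vector routines.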

Summarizing our derivation, we conclude that the success of the conjugate-direction method is supported by the orthogonality condition (preconjugacy). The success of the conjugate-gradient method requires, in addition, the conjugacy condition (conjugateres), which can be expressed in the model space as the orthogonality of the gradients (ortgrad).

The next section shows how the orthogonal sets of vectors in the data and model spaces translate into the effective resolution operators.

11/11/1997