For linear problems, the conjugate-direction method described in this book follows the same path as the better-known conjugate-gradient method. We use the conjugate-direction method because it is more convenient for exposition and programming.

The simple form of the conjugate-direction algorithm covered here
is a sequence of steps.
In each step the minimum is found in the plane given by two vectors:
the gradient vector and the vector of the previous step.
Given the linear operator and a generator of solution steps
(random in the case of random directions, or the gradient in the case of
steepest descent),
we can construct an optimally convergent iteration process,
which finds the solution in no more than $n$ steps,
where $n$ is the size of the problem.
This result should not be surprising.
If the operator is represented by a full matrix,
then the cost of direct inversion is proportional to $n^3$,
and the cost of a matrix-vector multiplication is proportional to $n^2$.
Each step of an iterative method boils down to a matrix-vector multiplication.
Therefore, we need at least $n$ steps to arrive at the exact solution.
Two circumstances make large-scale optimization practical.
First, for sparse convolution matrices
the cost of a matrix-vector multiplication is proportional to $n$ instead of $n^2$.
Second, we can often find a reasonably good solution
after a limited number of iterations.
If both these conditions are met, the cost of optimization
grows only linearly with $n$,
which is a practical rate even for very large problems.
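As a concrete illustration of the cost gap, a dense matrix-vector product performs $n^2$ multiplications, while applying a short convolution filter performs only a few per output sample, a cost linear in $n$. A minimal sketch in Python (the operator and the 3-point filter are invented for illustration):

```python
import numpy as np

n = 256
x = np.random.randn(n)

# Dense operator: applying it costs n*n multiplications.
F_dense = np.random.randn(n, n)
y_dense = F_dense @ x

# Convolution with a 3-point filter: about 3*n multiplications.
filt = np.array([1.0, -2.0, 1.0])
y_conv = np.convolve(x, filt, mode='same')

# The convolution is multiplication by a sparse (banded) matrix,
# but we never need to form that matrix to apply the operator.
F_band = np.zeros((n, n))
for i in range(n):
    for j, f in zip(range(i - 1, i + 2), filt):
        if 0 <= j < n:
            F_band[i, j] = f
assert np.allclose(F_band @ x, y_conv)
```

The point of the sketch is that the banded matrix `F_band` is only built to verify the equivalence; in practice the operator is applied by the convolution alone, at linear cost.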
Fourier-transformed variables are often capitalized.
This convention will be helpful here,
so in this subsection only,
we capitalize vectors transformed by the matrix $\mathbf{F}$.
As everywhere, a matrix such as $\mathbf{F}$ is printed in boldface type,
but in this subsection vectors are not printed in boldface.
Thus we define the solution, the solution step
(from one iteration to the next),
and the gradient, along with their images in data space under $\mathbf{F}$, by

$$ X \;=\; \mathbf{F}\, x \tag{55} $$

$$ S_j \;=\; \mathbf{F}\, s_j \tag{56} $$

$$ G_j \;=\; \mathbf{F}\, g_j \tag{57} $$

The residual is

$$ r \;=\; \mathbf{F}\, x - d \tag{58} $$

The solution $x$ is obtained by a succession of steps $s_j$:

$$ x \;=\; s_1 + s_2 + s_3 + \cdots \tag{59} $$

The last stage of each iteration is to update the solution and the residual:

$$ \text{solution update:}\qquad x \;\leftarrow\; x + s \tag{60} $$

$$ \text{residual update:}\qquad r \;\leftarrow\; r + S \tag{61} $$
The gradient vector g is a vector with the same number
of components as the solution vector x.
A vector with this number of components is

$$ g \;=\; \mathbf{F}'\, r \qquad \text{(gradient)} \tag{62} $$

$$ G \;=\; \mathbf{F}\, g \qquad \text{(conjugate gradient)} \tag{63} $$
The minimization (45) is now generalized
to scan not only the line with $\alpha$, but simultaneously another line with $\beta$. The combination of the two lines is a plane:

$$ Q(\alpha, \beta) \;=\; (r + \alpha G + \beta S) \cdot (r + \alpha G + \beta S) \tag{64} $$

The minimization (64) takes place over the two scalars $\alpha$ and $\beta$. Setting the partial derivatives $\partial Q/\partial\alpha$ and $\partial Q/\partial\beta$ to zero gives

$$ 0 \;=\; G \cdot (r + \alpha G + \beta S) \tag{65} $$

$$ 0 \;=\; S \cdot (r + \alpha G + \beta S) \tag{66} $$

which is a $2\times 2$ set of linear equations whose solution is

$$ \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \;=\; \frac{-1}{(G\cdot G)(S\cdot S) - (G\cdot S)^2} \begin{bmatrix} (S\cdot S) & -(S\cdot G) \\ -(G\cdot S) & (G\cdot G) \end{bmatrix} \begin{bmatrix} (G\cdot r) \\ (S\cdot r) \end{bmatrix} \tag{67} $$
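The $2\times 2$ solve for $\alpha$ and $\beta$ in (67) can be checked numerically: at the minimum of the plane search, the updated residual must be orthogonal to both $G$ and $S$. A small sketch in Python (the operator `F`, data `d`, current solution `x`, and previous step `s` are all invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 5
F = rng.standard_normal((n, m))   # illustrative linear operator
d = rng.standard_normal(n)        # illustrative data
x = rng.standard_normal(m)        # current solution estimate
s = rng.standard_normal(m)        # previous solution step (invented)

r = F @ x - d       # residual
g = F.T @ r         # gradient
G = F @ g           # conjugate gradient
S = F @ s           # previous step mapped to data space

# Solve the 2x2 system (67) for alpha and beta.
det = (G @ G) * (S @ S) - (G @ S) ** 2
alpha = -((S @ S) * (G @ r) - (S @ G) * (S @ r)) / det
beta = -(-(G @ S) * (G @ r) + (G @ G) * (S @ r)) / det

# At the minimum of the plane search, the updated residual is
# orthogonal to both G and S.
r_new = r + alpha * G + beta * S
assert abs(G @ r_new) < 1e-8
assert abs(S @ r_new) < 1e-8
```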
The many applications in this book all need to
find $\alpha$ and $\beta$ with (67) and then
update the solution with (60) and
update the residual with (61).
Thus we package these activities in a subroutine
named cgstep().
To use that subroutine we will have a computation template
like the one we had for steepest descent, except that the
repetitive work is done by subroutine cgstep().
This template (or pseudocode) for minimizing the residual
by the conjugate-direction method is
    r ← F x − d
    iterate {
        g ← F′ r
        G ← F g
        (x, r) ← cgstep(x, g, r, G)
    }
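A runnable sketch of this template in Python follows. The subroutine name cgstep() comes from the text; the dense test operator, the data, and the variable names are invented for illustration. On the first call there is no previous step, so the sketch falls back to a plain steepest-descent step. Consistent with the $n$-step claim above, the exact least-squares solution is reached after a number of steps equal to the number of unknowns.

```python
import numpy as np

def cgstep(x, g, r, G, s=None, S=None):
    """One conjugate-direction step.

    Chooses alpha, beta minimizing |r + alpha*G + beta*S|^2 via the
    2x2 solve of (67), then applies the solution update (60) and the
    residual update (61).  On the first call (no previous step s, S)
    it reduces to a steepest-descent step (beta = 0).
    """
    if s is None:
        alpha = -(G @ r) / (G @ G)
        beta = 0.0
        s = np.zeros_like(x)
        S = np.zeros_like(r)
    else:
        M = np.array([[G @ G, G @ S],
                      [S @ G, S @ S]])
        alpha, beta = np.linalg.solve(M, -np.array([G @ r, S @ r]))
    s = alpha * g + beta * s      # new solution step
    S = alpha * G + beta * S      # its image in data space
    return x + s, r + S, s, S     # updates (60) and (61)

# The template: form the residual, then iterate gradient -> cgstep().
rng = np.random.default_rng(1)
n, m = 12, 6                      # data and model sizes (invented)
F = rng.standard_normal((n, m))   # illustrative operator
d = rng.standard_normal(n)        # illustrative data

x = np.zeros(m)
r = F @ x - d
s = S = None
for _ in range(m):                # m steps suffice for the exact answer
    g = F.T @ r                   # gradient (62)
    G = F @ g                     # conjugate gradient (63)
    x, r, s, S = cgstep(x, g, r, G, s, S)

x_exact, *_ = np.linalg.lstsq(F, d, rcond=None)
assert np.allclose(x, x_exact)
```

Keeping the previous step `s` and its image `S` as state between calls is the design point of cgstep(): the calling template only ever supplies the current gradient pair, and the plane search does the rest.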

4/27/2004