This paper describes the method of conjugate directions for solving linear operator equations in Hilbert space. The method is usually presented in textbooks on unconstrained optimization as an introduction to the much more popular method of conjugate gradients; see, for example, Practical Optimization by Gill et al. (1995) and its bibliography. The conjugate-gradient solver possesses remarkable properties, well known from the original works of Hestenes and Stiefel (1952) and Fletcher and Reeves (1964). For linear operators and exact computations, it is guaranteed to find the solution after at most n iterative steps, where n is the number of dimensions in the solution space. The method of conjugate gradients requires neither explicit computation of the objective function nor explicit inversion of the Hessian matrix, which makes it particularly attractive for large-scale inverse problems, such as those of seismic data processing and interpretation. It does, however, require explicit computation of the adjoint operator. Jon Claerbout shows dozens of successful examples of applying conjugate gradients with numerically precise adjoint operators.
The motivation for this tutorial is to explore the possibility of using different types of preconditioning operators in place of adjoints in iterative least-square inversion. For some linear or linearized operators, implementing the exact adjoint may pose a difficult problem. For others, one may prefer different preconditioners because of their smoothness (Claerbout, 1995a; Crawley, 1995a), simplicity (Kleinman and van den Berg, 1991), or asymptotic properties (Sevink and Herman, 1994). In those cases, one can apply the natural generalization of the conjugate-gradient method: the method of conjugate directions. The difference in cost between the two methods lies in memory storage: conjugate directions must retain all previously used directions, while conjugate gradients need only the most recent one. In the days when the conjugate-gradient method was invented, this difference looked too large to even consider a practical application of conjugate directions. With the evident increase of computer power over the last 30 years, we can afford it now.
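To make the memory-cost remark concrete, here is a rough sketch of the conjugate-direction idea (this is an illustrative NumPy translation, not the ratfor program discussed in this paper). Each iteration may pick an arbitrary new trial direction; the plain residual is used below purely as an example of an "inexact adjoint" choice. The image of the new direction under the operator is then orthogonalized against the images of all previously stored directions, and it is precisely this stored history that constitutes the extra storage relative to conjugate gradients.

```python
# Illustrative sketch of the conjugate-direction method for solving A x = b
# in the least-squares sense. The choice of the residual as the trial
# direction and the matrix A in the example are assumptions for the demo;
# any direction choice that keeps spanning new subspaces would do.
import numpy as np

def conjugate_directions(A, b, niter):
    n = len(b)
    x = np.zeros(n)
    r = b.copy()          # residual r = b - A x
    S, C = [], []         # ALL previous directions s_j and images c_j = A s_j
    for _ in range(niter):
        if np.dot(r, r) < 1e-28:     # already converged
            break
        s = r.copy()                 # trial direction (arbitrary choice)
        c = A @ s
        for sj, cj in zip(S, C):     # orthogonalize c against previous c_j
            beta = np.dot(c, cj) / np.dot(cj, cj)
            s -= beta * sj
            c -= beta * cj
        alpha = np.dot(r, c) / np.dot(c, c)   # line search: min |r - alpha c|
        x += alpha * s
        r -= alpha * c
        S.append(s)                  # growing storage: the price of
        C.append(c)                  # arbitrary direction choices
    return x
```

In exact arithmetic the stored images c_j are mutually orthogonal, so after at most n steps the residual is exhausted; conjugate gradients avoid the growing lists S and C by choosing directions that make all but the latest orthogonalization term vanish automatically.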
I derive the main equations used in the conjugate-direction method from very general optimization criteria, with minimal restrictions implied. The textbook algebra is illustrated with a ratfor program and three simple examples.