Next: IRLS ALGORITHM APPLIED TO L Up: Lp MINIMIZATION WITH THE Previous: IRLS algorithm

## Resolution of the normal weighted equations

We still have to solve the normal equations (6), at each step of the general algorithm. I omit the indexes k and k+1 in what follows.

The matrix ATWA is symmetric and positive. However, it does not have a particular structure (such as the Toeplitz structure of ATA in deconvolution). The computations of ATWA and of its inverse are very expensive, in time and in memory. That is why several authors (Yarlagadda et al., 1985; Gersztenkorn et al., 1986; Scales et al., 1988) prefer iterative methods, especially conjugate gradient (CG'') methods, to direct inversion methods, like singular-value decomposition or orthogonalization by Givens rotations.

The advantages of conjugate-gradient methods are numerous. The matrix ATWA does not need to be computed and stored. The algorithm only'' requires the computation of the gradient g = ATW.e (e: residuals) and of WA.h (h: conjugate direction). The vectors e and h are updated at each iteration of the CG algorithm. W being diagonal, the product between W and a vector is trivial. Moreover, if the matrix A is sparse, its product by a vector can also be easily computed. If the problem is ill-conditioned, limiting the number of iterations is supposed to have an effect similar to that produced by adding a damping factor.

Finally, the convergence of the CG algorithm is controlled by three factors:

• the condition number of the problem;
• the choice of an initial estimate x0 of x; the problem is modified into the resolution of: = 0.
• the structure of the eigenspectrum of ATWA itself: the more gathered the eigenvalues are, the quicker the algorithm will converge.
In theory, if the matrix ATWA has N different non-zero eigenvalues, the CG algorithm should converge without round-off errors in N iterations.

I will use the conjugate-gradient method in the IRLS algorithm applied to deconvolution. However, as I will show, its efficiency will not be optimal, unless some precautions are taken. For what follows, I mainly consider the case of the L1 deconvolution, with p=1.

Next: IRLS ALGORITHM APPLIED TO L Up: Lp MINIMIZATION WITH THE Previous: IRLS algorithm
Stanford Exploration Project
1/13/1998