The matrix AᵀWA is symmetric and positive definite. However, it has no particular structure (such as the Toeplitz structure of AᵀA in deconvolution). Computing AᵀWA and its inverse is very expensive, in both time and memory. This is why several authors (Yarlagadda et al., 1985; Gersztenkorn et al., 1986; Scales et al., 1988) prefer iterative methods, especially conjugate-gradient (``CG'') methods, to direct inversion methods such as singular-value decomposition or orthogonalization by Givens rotations.
The advantages of conjugate-gradient methods are numerous. The matrix AᵀWA need not be computed or stored: the algorithm ``only'' requires the computation of the gradient g = AᵀW·e (e: the residual vector) and of WA·h (h: the conjugate direction). The vectors e and h are updated at each iteration of the CG algorithm. Since W is diagonal, its product with a vector is trivial; moreover, if the matrix A is sparse, its product with a vector is also cheap to compute. If the problem is ill-conditioned, limiting the number of iterations is expected to have an effect similar to that of adding a damping factor.
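The point above can be made concrete with a minimal sketch (my own illustration, not the paper's code) of CG applied to the weighted normal equations AᵀWA x = AᵀW d. Note that AᵀWA is never formed: each iteration touches A only through the two matrix-vector products named in the text, and the diagonal W is stored as a vector w.

```python
import numpy as np

def weighted_cg(A, w, d, n_iter):
    """Solve A^T W A x = A^T W d by conjugate gradients, W = diag(w).
    Only products A @ v and A.T @ v are needed; A^T W A is never
    computed or stored."""
    x = np.zeros(A.shape[1])
    e = d.copy()                 # data residuals, e = d - A x
    g = A.T @ (w * e)            # gradient, g = A^T W e
    h = g.copy()                 # first conjugate direction
    gg = g @ g
    for _ in range(n_iter):
        q = A @ h                # the only product involving A
        alpha = gg / (q @ (w * q))
        x += alpha * h           # model update
        e -= alpha * q           # residual update (avoids recomputing A x)
        g = A.T @ (w * e)        # new gradient
        gg_new = g @ g
        h = g + (gg_new / gg) * h  # new conjugate direction
        gg = gg_new
    return x
```

In exact arithmetic this converges in at most as many iterations as there are unknowns; stopping earlier acts as the implicit regularization mentioned above.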
Finally, the convergence of the CG algorithm is controlled by three factors:
I will use the conjugate-gradient method within the IRLS algorithm applied to deconvolution. However, as I will show, its efficiency is not optimal unless some precautions are taken. In what follows, I mainly consider the case of L1 deconvolution, with p = 1.
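To fix ideas, here is a minimal sketch of a generic IRLS outer loop for the p = 1 case. It is my own illustration, not the author's algorithm: the weight formula w_i = 1/max(|e_i|, eps) is the standard IRLS choice for the L1 norm, eps is a hypothetical stabilization parameter guarding against division by zero, and the inner weighted least-squares problem is solved directly for brevity (in practice this is where the CG solver would be used).

```python
import numpy as np

def irls_l1(A, d, n_outer=20, eps=1e-6):
    """Generic IRLS sketch for min ||d - A x||_1 (p = 1).
    Each outer iteration solves a weighted least-squares problem whose
    weights w_i = 1 / max(|e_i|, eps) turn the L2 misfit into an
    approximation of the L1 misfit; eps avoids division by zero."""
    x = np.linalg.lstsq(A, d, rcond=None)[0]     # plain L2 start
    for _ in range(n_outer):
        e = d - A @ x                            # current residuals
        w = 1.0 / np.maximum(np.abs(e), eps)     # IRLS weights for p = 1
        # Inner weighted LS solve (direct here; CG in practice):
        x = np.linalg.solve(A.T @ (w[:, None] * A), A.T @ (w * d))
    return x
```

On data contaminated by a few large outliers, this loop behaves like a robust (L1) fit: the outliers receive small weights and stop influencing the solution.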