
Normal equations

The iteratively reweighted least-squares (IRLS) algorithm provides a means by which linear systems can be solved by minimizing the Lp norm of the residuals ($1\leq p\leq 2$). Suppose that we want to solve the set of equations:

\begin{displaymath}
A.x = y \mbox{\hspace{2.0cm}} x\in {\cal R}^m,\; y\in {\cal R}^n,\; A\in {\cal R}^{n\times m}\:.\end{displaymath}

To find the Lp solution of this system we minimize the Lp norm of the residuals:

\begin{displaymath}
{\cal N}_p(x) = (\sum_{i=1}^n \vert y_i - \sum_{j=1}^m A_{ij}x_j\vert^p)^{{1\over p}} = {\cal L}_p(y-A.x)\:.\end{displaymath}
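As a concrete numerical illustration, this misfit can be evaluated as in the following minimal sketch (Python with NumPy; the name residual_lp_norm is illustrative and not part of the derivation):

\begin{verbatim}
import numpy as np

def residual_lp_norm(A, x, y, p):
    # N_p(x): the Lp norm of the residuals r = y - A x.
    r = y - A @ x
    return np.sum(np.abs(r) ** p) ** (1.0 / p)
\end{verbatim}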

${\cal L}_p$ is not a norm for p<1, so, for the moment, I will only consider $p\geq 1$. I also omit the exponent 1/p, since it does not change the location of the minimum. To establish the normal equations, we set:

\begin{displaymath}
{\partial{\cal N}_p \over\partial x_k} = 0 \mbox{\hspace{2.0cm} for k=1 to m}\:.\end{displaymath}

The partial derivatives can be expressed as follows:

\begin{displaymath}
{\partial{\cal N}_p \over\partial x_k} = -p\sum_{i=1}^n A_{ik}\,{\rm sign}(r(i))\:\vert r(i)\vert^{p-1}\:,\end{displaymath}

where $r(i)=y_i-\sum_{j=1}^mA_{ij}x_j$ are the residuals. Since ${\rm sign}(r(i))\,\vert r(i)\vert^{p-1}=r(i)\,\vert r(i)\vert^{p-2}$, this becomes:
\begin{displaymath}
{\partial{\cal N}_p \over\partial x_k} = -p\sum_{i=1}^n A_{ik}r(i) \vert r(i)\vert^{p-2}\:.\end{displaymath} (1)
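Equation (1) is simply the gradient of the un-exponentiated misfit $\sum_{i=1}^n\vert r(i)\vert^p$. As a quick sanity check, a minimal sketch (Python with NumPy; it assumes $p>1$ and nonzero residuals so that $\vert r\vert^{p-2}$ stays finite):

\begin{verbatim}
import numpy as np

def misfit(A, x, y, p):
    # Sum of |r(i)|^p, with the 1/p exponent omitted.
    return np.sum(np.abs(y - A @ x) ** p)

def misfit_gradient(A, x, y, p):
    # Equation (1): d/dx_k = -p * sum_i A_ik r(i) |r(i)|^(p-2).
    r = y - A @ x
    return -p * A.T @ (r * np.abs(r) ** (p - 2))
\end{verbatim}

Comparing misfit_gradient against a finite-difference approximation of misfit is a convenient way to verify the sign conventions.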
To form the normal equations we can now replace $r(i)$ with its expression in terms of $x$:

\begin{displaymath}
\sum_{i=1}^n A_{ik}\vert r(i)\vert^{p-2}(\sum_{j=1}^mA_{ij}x_j) = \sum_{i=1}^n A_{ik}\vert r(i)\vert^{p-2}y_i\end{displaymath}

or, after some reorganization:
\begin{displaymath}
\sum_{j=1}^m(\sum_{i=1}^nA_{ik}\vert r(i)\vert^{p-2}A_{ij})x_j = \sum_{i=1}^n A_{ik}\vert r(i)\vert^{p-2}y_i\:.\end{displaymath} (2)
There are m such equations (k=1 to m). They can be written compactly in matrix form by defining the weights:
 \begin{displaymath}
W(i) = \vert r(i)\vert^{p-2} \mbox{\hspace{1.0cm} for i=1 to n ,}\end{displaymath} (3)

\begin{displaymath}
W = {\rm diag}(W(1),\cdots,W(n))\;,\end{displaymath}

and finally:  
 \begin{displaymath}
A^TWA.x = A^TW.y\;.\end{displaymath} (4)
This formulation is implicit and nonlinear, since the residuals r(i), and hence the matrix W, depend on the unknown vector x. However, for classical least-squares inversion (p=2), W is the identity matrix, and equation (4) reduces to the usual least-squares equation:
 \begin{displaymath}
A^TA.x = A^T.y\;.\end{displaymath} (5)
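Holding the weights of equation (3) fixed, a single solve of equation (4) might look like the following sketch (Python with NumPy; the name weighted_normal_solve and the small eps guarding $\vert r\vert^{p-2}$ against zero residuals for $p<2$ are illustrative additions):

\begin{verbatim}
import numpy as np

def weighted_normal_solve(A, y, r, p, eps=1e-8):
    # W(i) = |r(i)|^(p-2), equation (3); eps avoids division by zero.
    w = np.maximum(np.abs(r), eps) ** (p - 2)
    AtW = A.T * w                    # A^T W (columns scaled by W(i))
    return np.linalg.solve(AtW @ A, AtW @ y)   # equation (4)
\end{verbatim}

For p=2 all the weights equal one and this reduces to equation (5); refreshing r and W and repeating the solve is the subject of the next section.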
