
THEORY OF UNDERDETERMINED LEAST-SQUARES

Construct theoretical data with  
\begin{displaymath}
\bold d \quad = \quad \bold F \bold m
\end{displaymath} (30)
Assume there are fewer data points than model points and that the matrix $\bold F \bold F'$ is invertible. From the theoretical data we estimate a model $\bold m_0$ with  
\begin{displaymath}
\bold m_0 \quad = \quad \bold F' (\bold F \bold F')^{-1} \bold d
\end{displaymath} (31)
To verify the validity of the estimate, insert the estimate (31) into the data modeling equation (30) and notice that $\bold m_0$ predicts the correct data: $\bold F \bold m_0 = \bold F \bold F' (\bold F \bold F')^{-1} \bold d = \bold d$. Notice that equation (31) is not the same as equation ([*]), which we derived much earlier. What's the difference? The central issue is which of the two matrices $\bold F \bold F'$ and $\bold F' \bold F$ actually has an inverse. If $\bold F$ is a rectangular matrix, then it is certain that one of the two is not invertible. (There are plenty of real cases where neither matrix is invertible. That's one reason we use iterative solvers.) Here we are dealing with the case of more model points than data points, so $\bold F \bold F'$ is the smaller matrix and the one we assume to be invertible.
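The small numerical sketch below (not part of the original codes) checks equations (30) and (31) with NumPy, using a made-up 3-by-5 matrix $\bold F$: theoretical data are built from an arbitrary model, and the estimate (31) reproduces those data exactly.

# A minimal check of (30)-(31), assuming NumPy and a made-up 3-by-5 operator F
# (3 data points, 5 model points).
import numpy as np

rng = np.random.default_rng(0)
F = rng.standard_normal((3, 5))   # fat matrix: fewer data points than model points
m_true = rng.standard_normal(5)   # arbitrary "true" model
d = F @ m_true                    # theoretical data, equation (30)

# Estimate m0 = F' (F F')^{-1} d, equation (31).
m0 = F.T @ np.linalg.solve(F @ F.T, d)

# The estimate predicts the data exactly (up to roundoff).
print(np.allclose(F @ m0, d))     # True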

Now we will show that of all possible models $\bold m$ that predict the correct data, $\bold m_0$ has the least energy. (I'd like to thank Sergey Fomel for this clear and simple proof that does not use Lagrange multipliers.) First split (31) into an intermediate result $\bold d_0$ and a final result:
\begin{eqnarray}
\bold d_0 &=& (\bold F \bold F')^{-1} \bold d \\
\bold m_0 &=& \bold F' \bold d_0
\end{eqnarray} (32), (33)
Consider another model ($\bold x$ not equal to zero)
\begin{displaymath}
\bold m \quad = \quad \bold m_0 + \bold x
\end{displaymath} (34)
which fits the theoretical data $\bold d = \bold F(\bold m_0 + \bold x)$. Since $\bold d = \bold F \bold m_0$, we see that $\bold x$ is a null-space vector:
\begin{displaymath}
\bold F \bold x \quad = \quad \bold 0
\end{displaymath} (35)
Next, we see that $\bold m_0$ is orthogonal to $\bold x$ because
\begin{displaymath}
\bold m_0' \bold x \quad = \quad (\bold F' \bold d_0)' \bold x \quad = \quad
\bold d_0' \bold F \bold x \quad = \quad \bold d_0' \bold 0 \quad = \quad 0
\end{displaymath} (36)
Therefore,
\begin{displaymath}
\bold m' \bold m \quad = \quad
\bold m_0' \bold m_0 + \bold x' \bold m_0 + \bold m_0' \bold x + \bold x' \bold x
\quad = \quad \bold m_0' \bold m_0 + \bold x' \bold x
\quad \ge \quad \bold m_0' \bold m_0
\end{displaymath} (37)
so adding a null-space component to $\bold m_0$ can only increase its energy. In summary, the solution $\bold m_0 = \bold F' (\bold F \bold F')^{-1} \bold d$ has less energy than any other model that satisfies the data.
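Here is a short numerical sketch of the argument in (34)-(37), again assuming NumPy and SciPy with the same made-up 3-by-5 operator as above: a vector drawn from the null space of $\bold F$ leaves the predicted data unchanged, is orthogonal to $\bold m_0$, and can only add energy.

# A minimal sketch of (34)-(37), assuming NumPy/SciPy and a made-up 3-by-5 operator F.
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(0)
F = rng.standard_normal((3, 5))
d = F @ rng.standard_normal(5)
m0 = F.T @ np.linalg.solve(F @ F.T, d)       # minimum-energy solution, equation (31)

# Build a null-space vector x (so F x = 0) and the perturbed model m = m0 + x.
x = null_space(F) @ rng.standard_normal(2)   # null space has dimension 5 - 3 = 2
m = m0 + x

print(np.allclose(F @ m, d))                 # m still fits the data
print(np.isclose(m0 @ x, 0.0))               # orthogonality, equation (36)
print(m @ m >= m0 @ m0)                      # energy inequality, equation (37)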

Not only does the theoretical solution $\bold m_0 = \bold F' (\bold F \bold F')^{-1} \bold d$ have minimum energy, but the result of iterative descent does too, provided that we begin the iterations from $\bold m = \bold 0$ or from any starting model with no null-space component. In (36) we see that the orthogonality $\bold m_0' \bold x = 0$ does not arise because $\bold d_0$ has any particular value. It arises because $\bold m_0$ is of the form $\bold F' \bold d_0$. Gradient methods contribute model updates $\Delta\bold m = \bold F' \bold r$, which are of the required form.
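The sketch below illustrates this remark with NumPy: steepest-descent iterations started from $\bold m = \bold 0$ build the model entirely from increments of the form $\bold F' \bold r$, and the result agrees with the minimum-energy solution (31). The operator and the exact-line-search step size are illustrative assumptions, not part of the original codes.

# A minimal sketch, assuming NumPy: steepest descent on ||d - F m||^2 started
# from m = 0 stays free of null-space components and converges to m0.
import numpy as np

rng = np.random.default_rng(0)
F = rng.standard_normal((3, 5))
d = F @ rng.standard_normal(5)
m0 = F.T @ np.linalg.solve(F @ F.T, d)   # minimum-energy solution, equation (31)

m = np.zeros(5)                          # start with no null-space component
for _ in range(5000):
    r = d - F @ m                        # residual
    if r @ r < 1e-28:                    # stop once the data are fit to roundoff
        break
    g = F.T @ r                          # descent update is of the form F' r
    Fg = F @ g
    m += (g @ g) / (Fg @ Fg) * g         # exact line search on ||d - F m||^2

print(np.allclose(m, m0))                # descent arrives at the minimum-energy model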

