Let us minimize the sum of the squares of the components
of the residual vector given by
\begin{displaymath}
\hbox{residual} \;=\; \hbox{data space} \;-\; \hbox{transform} \;\; \hbox{model space}
\qquad (30)
\end{displaymath}

\begin{displaymath}
\left[ \matrix{ \cr \cr R \cr \cr \cr } \right]
\;=\;
\left[ \matrix{ \cr \cr Y \cr \cr \cr } \right]
\;-\;
\left[ \matrix{ \cr \cr \bold A \cr \cr \cr } \right]
\left[ \matrix{ \cr x \cr \cr } \right]
\qquad (31)
\end{displaymath}
Fourier-transformed variables are often capitalized.
Here we capitalize vectors transformed by the $\bold A$ matrix.
A matrix such as $\bold A$ is denoted by boldface print.
A contour plot is based on an altitude function of space.
The altitude is the dot product $R \cdot R$.
By finding the lowest altitude
we are driving the residual vector $R$ as close as we can to zero.
If the residual vector $R$ reaches zero, then we have solved
the simultaneous equations $Y = \bold A x$.
In a two-dimensional world the vector $x$ has two components,
$(x_1, x_2)$.
A contour is a curve of constant $R \cdot R$
in $(x_1, x_2)$-space.
These contours have a statistical interpretation as contours
of uncertainty in $(x_1, x_2)$, given measurement errors in $Y$.
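As a minimal numerical sketch of this altitude function (the matrix $\bold A$ and data $Y$ below are invented for illustration), one can tabulate $R \cdot R$ over a grid of $(x_1, x_2)$ points; each contour of the resulting table is a curve of constant altitude:

```python
import numpy as np

# Invented toy problem: three data values Y, two model components (x1, x2).
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
Y = np.array([1.0, 2.0, 2.0])

def altitude(x1, x2):
    """The altitude R . R at the point (x1, x2) of model space."""
    R = Y - A @ np.array([x1, x2])   # residual R = Y - A x
    return R @ R

# Sampling the altitude on a grid is enough to trace its contours.
grid = np.linspace(-1.0, 2.0, 7)
for x2 in grid:
    print("  ".join(f"{altitude(x1, x2):6.2f}" for x1 in grid))
```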
Starting from a trial solution $x$, let us see how a random search
direction can be used to try to reduce the residual.
Let $g$ be an abstract vector with the same number of components as
the solution $x$,
and let $g$ contain arbitrary or random numbers.
Let us add an unknown quantity $\alpha$
of vector $g$ to vector $x$,
thereby changing $x$ to $x + \alpha g$.
The new residual $R + dR$ becomes
\begin{eqnarray}
R + dR &=& Y - \bold A \, ( x + \alpha g ) \qquad (32) \\
       &=& Y - \bold A \, x - \alpha \, \bold A \, g \qquad (33) \\
       &=& R - \alpha \, G \qquad (34)
\end{eqnarray}
where $G = \bold A g$, following the capitalization convention above.
We seek to minimize the dot product
\begin{displaymath}
(R + dR) \cdot (R + dR) \;=\; (R - \alpha G) \cdot (R - \alpha G)
\qquad (35)
\end{displaymath}
Setting to zero the derivative with respect to $\alpha$ gives
\begin{displaymath}
\alpha \;=\; { R \cdot G \over G \cdot G }
\qquad (36)
\end{displaymath}
Geometrically and algebraically
the new residual $R_+ = R - \alpha G$ is
perpendicular to the ``fitting function'' $G$.
(We confirm this by substituting (36), which leads to $R_+ \cdot G = 0$.)
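A short sketch of one such random-direction step (again with an invented $\bold A$, $Y$, and starting $x$) confirms both that the altitude decreases and that $R_+ \cdot G = 0$ up to roundoff:

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented toy problem: the names A, Y, x are illustrative assumptions.
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
Y = np.array([1.0, 2.0, 2.0])
x = np.zeros(2)                     # trial solution

R = Y - A @ x                       # current residual
g = rng.standard_normal(2)          # random search direction in model space
G = A @ g                           # transformed direction, G = A g
alpha = (R @ G) / (G @ G)           # minimizing step size, equation (36)
R_plus = R - alpha * G              # new residual, equation (34)

print(R_plus @ R_plus <= R @ R)     # True: the altitude never increases
print(np.isclose(R_plus @ G, 0.0))  # True: R_+ is perpendicular to G
```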
In practice, random directions are rarely used.
It is more common to use the gradient vector.
Notice that a vector of the size of $x$ is
\begin{displaymath}
g \;=\; \bold A' R
\qquad (37)
\end{displaymath}
Notice also that this vector can be found by taking the gradient
of the size of the residuals:
\begin{displaymath}
{\partial \over \partial x'} \; R \cdot R
\;=\;
{\partial \over \partial x'} \;
( Y' - x' \, \bold A' ) \,
( Y - \bold A \, x )
\;=\;
-\bold A' \, R
\qquad (38)
\end{displaymath}
Descending by use of the gradient vector is called
``the method of steepest descent.''
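A minimal steepest-descent loop, assuming the same invented toy problem as above, combines the gradient direction (37) with the step size (36):

```python
import numpy as np

# Invented toy problem: three equations, two unknowns.
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
Y = np.array([1.0, 2.0, 2.0])
x = np.zeros(2)                  # starting guess

for _ in range(100):
    R = Y - A @ x                # residual
    g = A.T @ R                  # gradient direction g = A' R, equation (37)
    G = A @ g                    # its image in data space, G = A g
    if G @ G == 0.0:             # gradient vanished: R . R is already minimal
        break
    alpha = (R @ G) / (G @ G)    # step size, equation (36)
    x = x + alpha * g            # update the model

R = Y - A @ x
print(x, R @ R)                  # least-squares solution and final altitude
```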