
Method of random directions and steepest descent


Let us minimize the sum of the squares of the components of the residual vector given by
   \begin{eqnarray}
{\rm residual}
&=&
\hbox{transform} \quad \hbox{model space} \quad - \quad \hbox{data space} \\
\bold r
&=&
\bold F \, \bold x \ -\ \bold d
\end{eqnarray} (39)
(40)

A contour plot is based on an altitude function of space. The altitude is the dot product $\bold r \cdot \bold r$. By finding the lowest altitude, we are driving the residual vector $\bold r$ as close as we can to zero. If the residual vector $\bold r$ reaches zero, then we have solved the simultaneous equations $\bold d = \bold F \bold x$. In a two-dimensional world the vector $\bold x$ has two components, $(x_1, x_2)$. A contour is a curve of constant $\bold r \cdot \bold r$ in $(x_1, x_2)$-space. These contours have a statistical interpretation as contours of uncertainty in $(x_1, x_2)$, given measurement errors in $\bold d$.
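
As a small illustration (a minimal NumPy sketch; the $3\times 2$ matrix $\bold F$, the data $\bold d$, and the grid range below are made-up placeholders, not values from the text), the altitude function can be evaluated over a grid of two-component models and then contoured:

\begin{verbatim}
import numpy as np

# Made-up placeholder operator and data (any 2-column F and matching d would do).
F = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [1.0, 1.0]])
d = np.array([1.0, 2.0, 2.0])

# Evaluate the "altitude" r.r over a grid of two-component models (x1, x2).
x1, x2 = np.meshgrid(np.linspace(-2.0, 2.0, 101),
                     np.linspace(-2.0, 2.0, 101))
altitude = np.zeros_like(x1)
for i in range(x1.shape[0]):
    for j in range(x1.shape[1]):
        r = F @ np.array([x1[i, j], x2[i, j]]) - d   # r = F x - d
        altitude[i, j] = r @ r                       # dot product r . r
# Contour lines of `altitude` are the curves of constant r . r; the lowest
# point of the surface is the least-squares solution.
\end{verbatim}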

Let us see how a random search direction can be used to reduce the residual $0\approx \bold r= \bold F \bold x - \bold d$. Let $\Delta \bold x$ be an abstract vector with the same number of components as the solution $\bold x$, and let $\Delta \bold x$ contain arbitrary or random numbers. We add an unknown amount $\alpha$ of the vector $\Delta \bold x$ to $\bold x$, and thereby create $\bold x_{\rm new}$: 
 \begin{displaymath}
\bold x_{\rm new} \eq \bold x+\alpha \Delta \bold x\end{displaymath} (41)
This gives a new residual:
   \begin{eqnarray}
\bold r_{\rm new} &=& \bold F\ \bold x_{\rm new} -\bold d \\
\bold r_{\rm new} &=& \bold F\, (\bold x+\alpha\Delta \bold x) -\bold d \\
\bold r_{\rm new} \eq \bold r+\alpha\Delta\bold r
 &=& (\bold F \bold x-\bold d)
 +\alpha\bold F\Delta\bold x
\end{eqnarray} (42)
(43)
(44)
which defines $\Delta \bold r = \bold F \Delta \bold x$.

Next we adjust $\alpha$ to minimize the dot product $\bold r_{\rm new} \cdot \bold r_{\rm new}$: 
 \begin{displaymath}
(\bold r+\alpha\Delta \bold r)\cdot (\bold r+\alpha\Delta \bold r)
\eq
\bold r \cdot \bold r
\ +\ 2\alpha\, (\bold r \cdot \Delta \bold r) \ +\ 
\alpha^2 \Delta \bold r \cdot \Delta \bold r\end{displaymath} (45)
Set to zero its derivative with respect to $\alpha$ using the chain rule  
 \begin{displaymath}
0\eq 
(\bold r+\alpha\Delta \bold r)
\cdot
\Delta \bold r
\ +\ 
\Delta \bold r
\cdot
(\bold r+\alpha\Delta \bold r)
\eq
2
(\bold r+\alpha\Delta \bold r)
\cdot
\Delta \bold r\end{displaymath} (46)
which says that the new residual $\bold r_{\rm new} = \bold r + \alpha \Delta \bold r$ is perpendicular to the ``fitting function'' $\Delta \bold r$. Solving gives the required value of $\alpha$. 
 \begin{displaymath}
\alpha \eq - \ { (\bold r \cdot \Delta \bold r ) \over
( \Delta \bold r \cdot \Delta \bold r ) }\end{displaymath} (47)
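
Substituting this value of $\alpha$ back into the condition above confirms that the updated residual has no component left along $\Delta \bold r$:
 \begin{displaymath}
\bold r_{\rm new} \cdot \Delta \bold r
\eq
\left( \bold r \ -\ 
{ \bold r \cdot \Delta \bold r \over \Delta \bold r \cdot \Delta \bold r }\,
\Delta \bold r \right) \cdot \Delta \bold r
\eq
\bold r \cdot \Delta \bold r \ -\ \bold r \cdot \Delta \bold r
\eq
0
\end{displaymath}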

A ``computation template'' for the method of random directions is

 


		 $\bold r \quad\longleftarrow\quad\bold F \bold x - \bold d$ 
		 iterate { 
		 		  $\Delta \bold x \quad\longleftarrow\quad{\rm random\ numbers}$ 
		 		  $\Delta \bold r\ \quad\longleftarrow\quad\bold F \ \Delta \bold x$ 
		 		 $\alpha \quad\longleftarrow\quad -( \bold r \cdot \Delta\bold r )/(\Delta \bold r \cdot \Delta\bold r )$ 
		 		 $\bold x \quad\longleftarrow\quad\bold x + \alpha\Delta \bold x$ 
		 		 $\bold r\ \quad\longleftarrow\quad\bold r\ + \alpha\Delta \bold r$ 
		 		 } 
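
As a rough translation of this template into code (a minimal NumPy sketch; the operator `F`, data `d`, starting model, and iteration count below are made-up placeholders, not anything from the text):

\begin{verbatim}
import numpy as np

def random_directions(F, d, x, niter=100, seed=0):
    """Reduce |F x - d|^2 by stepping along random directions."""
    rng = np.random.default_rng(seed)
    r = F @ x - d                             # r <- F x - d
    for _ in range(niter):
        dx = rng.standard_normal(x.shape)     # Delta x <- random numbers
        dr = F @ dx                           # Delta r <- F Delta x
        alpha = -(r @ dr) / (dr @ dr)         # alpha <- -(r.dr)/(dr.dr)
        x = x + alpha * dx                    # x <- x + alpha Delta x
        r = r + alpha * dr                    # r <- r + alpha Delta r
    return x, r

# Tiny made-up example: an overdetermined 3x2 system, starting from zero.
F = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
d = np.array([1.0, 2.0, 2.0])
x, r = random_directions(F, d, x=np.zeros(2))
\end{verbatim}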
A nice thing about the method of random directions is that you do not need to know the adjoint operator $\bold F'$.

In practice, random directions are rarely used; the gradient direction is the usual choice. Notice that a vector with the same number of components as $\Delta \bold x$ is
\begin{displaymath}
\bold g \eq \bold F' \bold r\end{displaymath} (48)
Notice also that this vector can be found by taking the gradient of the squared size of the residual, $\bold r \cdot \bold r$:
\begin{displaymath}
{\partial \over \partial \bold x' } \ \bold r \cdot \bold r
\eq
{\partial \over \partial \bold x' } \
(\bold x' \bold F' \ -\ \bold d') \,
( \bold F \, \bold x \ -\ \bold d)
\eq
\bold F' \,
( \bold F \, \bold x \ -\ \bold d)
\eq
\bold F' \ \bold r\end{displaymath} (49)
Choosing $\Delta \bold x$ to be the gradient vector $\Delta\bold x = \bold g = \bold F' \bold r$ is called ``the method of steepest descent.''

Starting from a model $\bold x = \bold m$ (which may be zero), below is a template of pseudocode for minimizing the residual $\bold 0\approx \bold r = \bold F \bold x - \bold d$ by the steepest-descent method:  


		 $\bold r \quad\longleftarrow\quad\bold F \bold x - \bold d$ 
		 iterate { 
		 		  $\Delta \bold x \quad\longleftarrow\quad\bold F'\ \bold r$ 
		 		  $\Delta \bold r\ \quad\longleftarrow\quad\bold F \ \Delta \bold x$ 
		 		 $\alpha \quad\longleftarrow\quad -( \bold r \cdot \Delta\bold r )/(\Delta \bold r \cdot \Delta\bold r )$ 
		 		 $\bold x \quad\longleftarrow\quad\bold x + \alpha\Delta \bold x$ 
		 		 $\bold r\ \quad\longleftarrow\quad\bold r\ + \alpha\Delta \bold r$ 
		 		 } 
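
A corresponding sketch for the steepest-descent template (again in NumPy; for an explicit matrix the adjoint $\bold F'$ is simply the transpose, and the operator, data, starting model, and iteration count below are placeholders, not from the text):

\begin{verbatim}
import numpy as np

def steepest_descent(F, d, m, niter=100):
    """Reduce |F x - d|^2 by stepping along the gradient direction F' r."""
    x = np.array(m, dtype=float)
    r = F @ x - d                         # r <- F x - d
    for _ in range(niter):
        dx = F.T @ r                      # Delta x <- F' r  (the gradient g)
        dr = F @ dx                       # Delta r <- F Delta x
        if dr @ dr == 0.0:                # residual already minimized
            break
        alpha = -(r @ dr) / (dr @ dr)     # alpha <- -(r.dr)/(dr.dr)
        x += alpha * dx                   # x <- x + alpha Delta x
        r += alpha * dr                   # r <- r + alpha Delta r
    return x, r

# Same made-up 3x2 system as before, starting from the zero model m = 0.
F = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
d = np.array([1.0, 2.0, 2.0])
x, r = steepest_descent(F, d, m=np.zeros(2))
\end{verbatim}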
