
Method of random directions and steepest descent


Let us minimize the sum of the squares of the components of the residual vector given by
   \begin{eqnarray}
{\rm residual}
&=&
\hbox{transform} \quad \hbox{model space} \quad - \quad \hbox{data space} \\
\bold r
&=&
\bold F \, \bold x \ -\ \bold d
\end{eqnarray} (39)
(40)

A contour plot is based on an altitude function of space. The altitude is the dot product $\bold r \cdot \bold r$. By finding the lowest altitude, we are driving the residual vector $\bold r$ as close as we can to zero. If the residual vector $\bold r$ reaches zero, then we have solved the simultaneous equations $\bold d = \bold F \bold x$. In a two-dimensional world the vector $\bold x$ has two components, $(x_1, x_2)$. A contour is a curve of constant $\bold r \cdot \bold r$ in $(x_1, x_2)$-space. These contours have a statistical interpretation as contours of uncertainty in $(x_1, x_2)$, given measurement errors in $\bold d$.
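
As a small illustration (a minimal NumPy sketch; the $3\times 2$ matrix $\bold F$, the data $\bold d$, and the grid range below are made-up placeholders, not values from the text), the altitude function can be evaluated over a grid of two-component models and then contoured:

\begin{verbatim}
import numpy as np

# Made-up placeholder operator and data (any 2-column F and matching d would do).
F = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [1.0, 1.0]])
d = np.array([1.0, 2.0, 2.0])

# Evaluate the "altitude" r.r over a grid of two-component models (x1, x2).
x1, x2 = np.meshgrid(np.linspace(-2.0, 2.0, 101),
                     np.linspace(-2.0, 2.0, 101))
altitude = np.zeros_like(x1)
for i in range(x1.shape[0]):
    for j in range(x1.shape[1]):
        r = F @ np.array([x1[i, j], x2[i, j]]) - d   # r = F x - d
        altitude[i, j] = r @ r                       # dot product r . r
# Contour lines of `altitude` are the curves of constant r . r; the lowest
# point of the surface is the least-squares solution.
\end{verbatim}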

Let us see how a random search direction can be used to reduce the residual $0\approx \bold r= \bold F \bold x - \bold d$. Let $\Delta \bold x$ be an abstract vector with the same number of components as the solution $\bold x$, and let $\Delta \bold x$ contain arbitrary or random numbers. We add an unknown amount $\alpha$ of the vector $\Delta \bold x$ to $\bold x$, and thereby create $\bold x_{\rm new}$: 
 \begin{displaymath}
\bold x_{\rm new} \eq \bold x+\alpha \Delta \bold x\end{displaymath} (41)
This gives a new residual:
   \begin{eqnarray}
\bold r_{\rm new} &=& \bold F\ \bold x_{\rm new} -\bold d \\
\bold r_{\rm new} &=& \bold F\, (\bold x+\alpha\Delta \bold x) -\bold d \\
\bold r_{\rm new} \eq \bold r+\alpha\Delta\bold r
 &=& (\bold F \bold x-\bold d)
 +\alpha\bold F\Delta\bold x
\end{eqnarray} (42)
(43)
(44)
which defines $\Delta \bold r = \bold F \Delta \bold x$.

Next we adjust $\alpha$ to minimize the dot product $\bold r_{\rm new} \cdot \bold r_{\rm new}$: 
 \begin{displaymath}
(\bold r+\alpha\Delta \bold r)\cdot (\bold r+\alpha\Delta \bold r)
\eq
\bold r \cdot \bold r
\ +\ 2\alpha\, (\bold r \cdot \Delta \bold r) \ +\ 
\alpha^2 \Delta \bold r \cdot \Delta \bold r\end{displaymath} (45)
Set to zero its derivative with respect to $\alpha$ using the chain rule  
 \begin{displaymath}
0\eq 
(\bold r+\alpha\Delta \bold r)
\cdot
\Delta \bold r
\ +\ 
\Delta \bold r
\cdot
(\bold r+\alpha\Delta \bold r)
\eq
2
(\bold r+\alpha\Delta \bold r)
\cdot
\Delta \bold r\end{displaymath} (46)
which says that the new residual $\bold r_{\rm new} = \bold r + \alpha \Delta \bold r$ is perpendicular to the ``fitting function'' $\Delta \bold r$. Solving gives the required value of $\alpha$. 
 \begin{displaymath}
\alpha \eq - \ { (\bold r \cdot \Delta \bold r ) \over
( \Delta \bold r \cdot \Delta \bold r ) }\end{displaymath} (47)
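
Substituting this value of $\alpha$ back into the condition above confirms that the updated residual has no component left along $\Delta \bold r$:
 \begin{displaymath}
\bold r_{\rm new} \cdot \Delta \bold r
\eq
\left( \bold r \ -\ 
{ \bold r \cdot \Delta \bold r \over \Delta \bold r \cdot \Delta \bold r }\,
\Delta \bold r \right) \cdot \Delta \bold r
\eq
\bold r \cdot \Delta \bold r \ -\ \bold r \cdot \Delta \bold r
\eq
0
\end{displaymath}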

A ``computation template'' for the method of random directions is

 


		 $\bold r \quad\longleftarrow\quad\bold F \bold x - \bold d$ 
		 iterate { 
		 		  $\Delta \bold x \quad\longleftarrow\quad{\rm random\ numbers}$ 
		 		  $\Delta \bold r\ \quad\longleftarrow\quad\bold F \ \Delta \bold x$ 
		 		 $\alpha \quad\longleftarrow\quad -( \bold r \cdot \Delta\bold r )/(\Delta \bold r \cdot \Delta\bold r )$ 
		 		 $\bold x \quad\longleftarrow\quad\bold x + \alpha\Delta \bold x$ 
		 		 $\bold r\ \quad\longleftarrow\quad\bold r\ + \alpha\Delta \bold r$ 
		 		 } 
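
As a rough translation of this template into code (a minimal NumPy sketch; the operator `F`, data `d`, starting model, and iteration count below are made-up placeholders, not anything from the text):

\begin{verbatim}
import numpy as np

def random_directions(F, d, x, niter=100, seed=0):
    """Reduce |F x - d|^2 by stepping along random directions."""
    rng = np.random.default_rng(seed)
    r = F @ x - d                             # r <- F x - d
    for _ in range(niter):
        dx = rng.standard_normal(x.shape)     # Delta x <- random numbers
        dr = F @ dx                           # Delta r <- F Delta x
        alpha = -(r @ dr) / (dr @ dr)         # alpha <- -(r.dr)/(dr.dr)
        x = x + alpha * dx                    # x <- x + alpha Delta x
        r = r + alpha * dr                    # r <- r + alpha Delta r
    return x, r

# Tiny made-up example: an overdetermined 3x2 system, starting from zero.
F = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
d = np.array([1.0, 2.0, 2.0])
x, r = random_directions(F, d, x=np.zeros(2))
\end{verbatim}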
A nice thing about the method of random directions is that you do not need to know the adjoint operator $\bold F'$.

In practice, random directions are rarely used; the gradient direction is the usual choice. Notice that a vector with the same number of components as $\Delta \bold x$ is
\begin{displaymath}
\bold g \eq \bold F' \bold r\end{displaymath} (48)
Notice also that this vector can be found by taking the gradient of the squared size of the residual, $\bold r \cdot \bold r$:
\begin{displaymath}
{\partial \over \partial \bold x' } \ \bold r \cdot \bold r
\eq
{\partial \over \partial \bold x' } \
(\bold x' \bold F' \ -\ \bold d') \,
( \bold F \, \bold x \ -\ \bold d)
\eq
\bold F' \,
( \bold F \, \bold x \ -\ \bold d)
\eq
\bold F' \ \bold r\end{displaymath} (49)
Choosing $\Delta \bold x$ to be the gradient vector $\Delta\bold x = \bold g = \bold F' \bold r$ is called ``the method of steepest descent.''

Starting from a model $\bold x = \bold m$ (which may be zero), below is a template of pseudocode for minimizing the residual $\bold 0\approx \bold r = \bold F \bold x - \bold d$ by the steepest-descent method:  


		 $\bold r \quad\longleftarrow\quad\bold F \bold x - \bold d$ 
		 iterate { 
		 		  $\Delta \bold x \quad\longleftarrow\quad\bold F'\ \bold r$ 
		 		  $\Delta \bold r\ \quad\longleftarrow\quad\bold F \ \Delta \bold x$ 
		 		 $\alpha \quad\longleftarrow\quad -( \bold r \cdot \Delta\bold r )/(\Delta \bold r \cdot \Delta\bold r )$ 
		 		 $\bold x \quad\longleftarrow\quad\bold x + \alpha\Delta \bold x$ 
		 		 $\bold r\ \quad\longleftarrow\quad\bold r\ + \alpha\Delta \bold r$ 
		 		 } 
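
A corresponding sketch for the steepest-descent template (again in NumPy; for an explicit matrix the adjoint $\bold F'$ is simply the transpose, and the operator, data, starting model, and iteration count below are placeholders, not from the text):

\begin{verbatim}
import numpy as np

def steepest_descent(F, d, m, niter=100):
    """Reduce |F x - d|^2 by stepping along the gradient direction F' r."""
    x = np.array(m, dtype=float)
    r = F @ x - d                         # r <- F x - d
    for _ in range(niter):
        dx = F.T @ r                      # Delta x <- F' r  (the gradient g)
        dr = F @ dx                       # Delta r <- F Delta x
        if dr @ dr == 0.0:                # residual already minimized
            break
        alpha = -(r @ dr) / (dr @ dr)     # alpha <- -(r.dr)/(dr.dr)
        x += alpha * dx                   # x <- x + alpha Delta x
        r += alpha * dr                   # r <- r + alpha Delta r
    return x, r

# Same made-up 3x2 system as before, starting from the zero model m = 0.
F = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
d = np.array([1.0, 2.0, 2.0])
x, r = steepest_descent(F, d, m=np.zeros(2))
\end{verbatim}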
