
SCALING THE ADJOINT

Given the usual linearized fitting goal between data space and model space, $\bold d \approx \bold F \bold m$, the simplest image of the model space results from application of the adjoint operator, $\hat \bold m = \bold F' \bold d$. Unless $\bold F$ has no physical units, however, the physical units of $\hat \bold m$ do not match those of $\bold m$, so we need a scaling factor. The theoretical solution $\bold m_{\rm theor} = (\bold F'\bold F)^{-1}\bold F'\bold d$ suggests that the scaling units should be those of $(\bold F'\bold F)^{-1}$. We could probe the operator $\bold F$ or its adjoint with white noise or a zero-frequency input. Bill Symes suggests we probe with the data $\bold d$ because it has the spectrum of interest. He proposes we make our image with $\hat \bold m = \bold W^2 \bold F'\bold d$, where we choose the weighting function to be
\begin{displaymath}
\bold W^2 \quad = \quad {\bf diag}\left( { \bold F' \bold d \over \bold F' \bold F \bold F' \bold d } \right)
\end{displaymath} (47)
which obviously has the correct physical units. The weight $\bold W^2$ can be thought of as a diagonal matrix containing the ratio of two images. A problem with the choice (47) is that the denominator might vanish or might even be negative. The way to stabilize any ratio is suggested at the beginning of Chapter [*]; that is, we revise the ratio (47) by changing $\bold W^2 = {\bf diag}(a/b)$ to
\begin{displaymath}
\bold W^2 \quad = \quad {\bf diag}\left( { \langle a b \rangle \over \langle b^2 + \epsilon^2 \rangle } \right)
\end{displaymath} (48)
where $\epsilon$ is a parameter to be chosen, and the angle brackets indicate the possible need for local smoothing.
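
For readers who want to see the weight in action, here is a minimal numerical sketch of (47) and its stabilized form (48), assuming a small dense random matrix as a stand-in for $\bold F$; the names F, d, smooth, eps, and nsmooth are illustrative choices, not part of the text.

  # Sketch of the Symes weight (47) and its stabilized form (48),
  # using a toy dense matrix as a stand-in for the operator F.
  import numpy as np

  rng = np.random.default_rng(0)
  nd, nm = 80, 60
  F = rng.standard_normal((nd, nm))      # stand-in for the linear operator
  d = F @ rng.standard_normal(nm)        # data: it has the spectrum of interest

  a = F.T @ d                            # numerator image    F'd
  b = F.T @ (F @ a)                      # denominator image  F'F F'd

  def smooth(x, nsmooth=5):
      """Local smoothing: the angle brackets in (48)."""
      return np.convolve(x, np.ones(nsmooth) / nsmooth, mode="same")

  eps = 0.1 * np.sqrt(np.mean(b**2))     # stabilization parameter, an assumed choice
  W2 = smooth(a * b) / smooth(b**2 + eps**2)   # diagonal of W^2, equation (48)

  m_hat = W2 * a                         # scaled-adjoint image  W^2 F'd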

Because a scaled adjoint is a guess at the solution to the fitting problem, it is logical to choose values for $\epsilon$ and the smoothing parameters that give fastest convergence of the conjugate-direction solver.
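
One way to read this suggestion concretely: start a conjugate-gradient solver on the normal equations from the scaled-adjoint image and compare residual decay for a few trial values of $\epsilon$. The sketch below assumes the same kind of toy $\bold F$ and $\bold d$ as above; the routine cgnorm, the iteration count, and the list of $\epsilon$ values are illustrative, not prescribed by the text.

  # Sketch of picking epsilon by convergence speed: run a few
  # conjugate-gradient steps on F'F m = F'd starting from W^2 F'd.
  import numpy as np

  rng = np.random.default_rng(0)
  nd, nm = 80, 60
  F = rng.standard_normal((nd, nm))
  d = F @ rng.standard_normal(nm)

  def smooth(x, nsmooth=5):
      return np.convolve(x, np.ones(nsmooth) / nsmooth, mode="same")

  def cgnorm(F, d, m0, niter=10):
      """Residual norm after niter CG steps on the normal equations."""
      m = m0.copy()
      r = F.T @ (d - F @ m)
      p = r.copy()
      for _ in range(niter):
          Fp = F @ p
          alpha = (r @ r) / (Fp @ Fp)
          m += alpha * p
          r_new = r - alpha * (F.T @ Fp)
          beta = (r_new @ r_new) / (r @ r)
          p = r_new + beta * p
          r = r_new
      return np.linalg.norm(d - F @ m)

  a = F.T @ d
  b = F.T @ (F @ a)
  for eps in (0.01, 0.1, 1.0):
      W2 = smooth(a * b) / smooth(b**2 + eps**2)
      print(eps, cgnorm(F, d, W2 * a))   # smaller residual = faster start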

To go beyond the scaled adjoint, we can use $\bold W$ as a preconditioner. To use $\bold W$ as a preconditioner, we define implicitly a new set of variables $\bold p$ by the substitution $\bold m = \bold W \bold p$. Then $\bold d \approx \bold F \bold m = \bold F \bold W \bold p$. To find $\bold p$ instead of $\bold m$, we do CD iteration with the operator $\bold F \bold W$ instead of with $\bold F$. As usual, the first step of the iteration is to use the adjoint of $\bold d \approx \bold F \bold W \bold p$ to form the image $\hat\bold p = (\bold F \bold W)' \bold d$. At the end of the iterations, we convert from $\bold p$ back to $\bold m$ with $\bold m = \bold W \bold p$. The result after the first iteration, $\hat\bold m = \bold W \hat\bold p = \bold W (\bold F \bold W)' \bold d = \bold W^2 \bold F' \bold d$, turns out to be the same as the Symes scaling.
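
A short sketch of this change of variables, again assuming a toy dense $\bold F$ and an assumed positive diagonal weight, confirms numerically that the adjoint step for $\bold F\bold W$, mapped back to model space, reproduces the Symes-scaled image $\bold W^2 \bold F' \bold d$.

  # Sketch of the substitution m = W p with a given diagonal weight W:
  # the first (adjoint) step for F W, mapped back to model space,
  # equals the Symes-scaled image W^2 F'd.
  import numpy as np

  rng = np.random.default_rng(0)
  nd, nm = 80, 60
  F = rng.standard_normal((nd, nm))
  d = rng.standard_normal(nd)
  W = np.abs(rng.standard_normal(nm))    # stand-in diagonal of W

  FW = F * W                             # the operator F W acting on p
  p_hat = FW.T @ d                       # adjoint step: p_hat = (F W)' d
  m_hat = W * p_hat                      # back to model space, m = W p

  assert np.allclose(m_hat, W**2 * (F.T @ d))   # same as the Symes scaling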

By (47), $\bold W$ has physical units inverse to $\bold F$. Thus the transformation $\bold F \bold W$ has no units, so the $\bold p$ variables have the physical units of data space. It might be more practical to view the solution $\bold p$ with data units than to view the solution $\bold m$ with the more theoretical model units.

Some experience tells me that the ideas of this section are defective. Appropriate scaling is required in both data space and model space. We need both $\bold W_1$ and $\bold W_2$ where $\hat \bold m = \bold W_1 \bold F'\bold W_2 \bold d$.

I have a useful practical example (stacking in v(z) media) in another of my electronic books (BEI), where I found both $\bold W_1$ and $\bold W_2$ by iterative guessing. But I don't know how to give you a general strategy. I feel this is a major unsolved(?) opportunity for someone.

