
Statistical interpretation

This book is not a statistics book. Nevertheless, many readers have enough statistical background to give a statistical interpretation to these views of preconditioning.

A statistical concept is that we can combine many streams of random numbers into a composite model. Each stream of random numbers is generally taken to be uncorrelated with the others, to have zero mean, and to have the same variance as all the others. This is often abbreviated as IID, denoting Independent, Identically Distributed. Linear combinations of these IID random streams, such as filtering and weighting operations, can build correlated random functions much like those observed in geophysics. A geophysical practitioner seeks to do the inverse: to operate on the correlated, unequal random variables and recover the statistically ideal random streams. The identity matrix required for the ``second miracle'' and our search for a good preconditioning transformation are related ideas. The relationship will become clearer in chapter [*], where we learn how to estimate the best roughening operator $\bold A$ as a prediction-error filter.
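
As a concrete illustration of this coloring/whitening duality, here is a minimal sketch in Python, assuming NumPy and SciPy are available; the recursive filter with coefficient 0.9 is an arbitrary illustrative choice, not a filter from the book.

    import numpy as np
    from scipy.signal import lfilter

    rng = np.random.default_rng(0)
    w = rng.standard_normal(1000)          # IID stream: zero mean, unit variance

    # "Coloring": a recursive (smoothing-like) operator S builds a correlated
    # series from white noise, here an AR(1) model y[n] = 0.9*y[n-1] + w[n].
    colored = lfilter([1.0], [1.0, -0.9], w)

    # "Whitening": the prediction-error filter A = (1, -0.9) is the inverse
    # operation; applying it recovers the original IID stream.
    whitened = lfilter([1.0, -0.9], [1.0], colored)

    print(np.allclose(whitened, w))        # True (up to round-off)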

There are two philosophies for finding a preconditioner:

1.
Dream up a smoothing operator $\bold S$.
2.
Estimate a prediction-error filter $\bold A$, and then use its inverse $\bold S = \bold A^{-1}$ (a small sketch of this approach follows the list).
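
A minimal sketch of the second philosophy on a one-dimensional signal; the helper estimate_pef below is a hypothetical least-squares PEF estimator written for illustration, not the book's code.

    import numpy as np
    from scipy.signal import lfilter

    def estimate_pef(x, n_lags=3):
        """Least-squares prediction-error filter (1, -a_1, ..., -a_n):
        each sample is predicted from the n_lags samples before it."""
        rows = [x[i - n_lags:i][::-1] for i in range(n_lags, len(x))]
        X = np.asarray(rows)                 # past samples
        y = x[n_lags:]                       # samples to predict
        a, *_ = np.linalg.lstsq(X, y, rcond=None)
        return np.concatenate(([1.0], -a))   # PEF coefficients A

    rng = np.random.default_rng(1)
    colored = lfilter([1.0], [1.0, -0.9], rng.standard_normal(2000))

    pef = estimate_pef(colored, n_lags=1)    # expect roughly (1, -0.9)
    residual = lfilter(pef, [1.0], colored)  # A applied: nearly white

    # S = A^{-1} by polynomial division: recoloring a white input.
    recolored = lfilter([1.0], pef, residual)
    print(pef, np.allclose(recolored, colored))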

From examining these toy problems, we suspect that on large problems we will speed convergence if we change variables so that the regularization matrix becomes an identity. These small examples are confirmed by many people's experience with larger ones. Luckily, we have multidimensional filters on the helix, so we can readily transform one regularization type to the other.
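
Here is a minimal sketch of that change of variables on a dense toy least-squares problem; the random modeling operator F, the first-difference roughener A, and all sizes are illustrative assumptions, not an operator from the book.

    import numpy as np

    rng = np.random.default_rng(2)
    nd, nm, eps = 12, 20, 0.1
    F = rng.standard_normal((nd, nm))        # toy modeling operator
    d = rng.standard_normal(nd)              # toy data

    # Roughening operator A (first difference); its inverse S is causal integration.
    A = np.eye(nm) - np.eye(nm, k=-1)
    S = np.linalg.inv(A)

    # Regularized formulation: minimize |F m - d|^2 + eps^2 |A m|^2.
    m_reg = np.linalg.solve(F.T @ F + eps**2 * (A.T @ A), F.T @ d)

    # Preconditioned formulation: change variables m = S p and minimize
    # |F S p - d|^2 + eps^2 |p|^2, so the regularization matrix is the identity.
    FS = F @ S
    p = np.linalg.solve(FS.T @ FS + eps**2 * np.eye(nm), FS.T @ d)
    m_pre = S @ p

    print(np.allclose(m_reg, m_pre))         # the two formulations agree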

The outstanding acceleration of convergence by preconditioning suggests that the philosophy of image creation by optimization has a dual orthonormality. First, Gauss (and common sense) tells us that the data residuals should be roughly equal in size. Likewise, in Fourier space they should be roughly equal in size, which means they should be roughly white, i.e., orthonormal. (I use the word ``orthonormal'' because white means the autocorrelation is an impulse, which means the signal is statistically orthogonal to shifted versions of itself.) Second, to speed convergence of iterative methods, we need a whiteness, another orthonormality, in the solution. The map image, the physical function that we seek, might not itself be white, so we should solve first for another variable, the whitened map image, and as a final step, transform it to the ``natural colored'' map.

