Let $\mathbf{d}$ be the vector of observed data, and $\mathbf{m}$ be the ideal underlying model. The regularized data represent the model estimate $\hat{\mathbf{m}}$. Taking into account the lack of information about $\mathbf{m}$, we can treat both $\mathbf{d}$ and $\mathbf{m}$ as random vectors and approach the problem of finding $\hat{\mathbf{m}}$ statistically.

For any two random vectors $\mathbf{x}$ and $\mathbf{y}$, let us denote by $\mathbf{C}_{xy}$ the mathematical expectation of the random matrix $\mathbf{x}\,\mathbf{y}^{*}$, where $\mathbf{y}^{*}$ denotes the adjoint of $\mathbf{y}$. Analogously, $\mathbf{C}_{x}$ will denote the mathematical expectation of $\mathbf{x}\,\mathbf{x}^{*}$. For zero-mean vectors, the matrices $\mathbf{C}_{xy}$ and $\mathbf{C}_{x}$ correspond to covariances. In a more general case, they are second-moment statistics of the corresponding random processes.

Applying the Gauss-Markoff theorem, one can obtain an explicit form of the estimate $\hat{\mathbf{m}}$ under three very general assumptions (Liebelt, 1967):

1. The estimate has a linear relationship with the input data:
   $$\hat{\mathbf{m}} = \mathbf{A}\,\mathbf{d}\;,\qquad(1)$$
   where $\mathbf{A}$ is a linear operator (matrix).
2. The estimate corresponds to the minimum of $E\left[\mathbf{e}\,\mathbf{e}^{*}\right]$, where $E$ is the mathematical expectation and $\mathbf{e}$ denotes the model error $\hat{\mathbf{m}} - \mathbf{m}$. For unbiased estimates (zero mathematical expectation of $\mathbf{e}$), the matrix $\mathbf{C}_{e} = E\left[\mathbf{e}\,\mathbf{e}^{*}\right]$ corresponds to the model error covariance. Although we do not make any explicit assumptions about the statistical distribution of the error, minimizing $\mathbf{C}_{e}$ is particularly meaningful in the case of normal (Gaussian) distributions (Tarantola, 1987).
3. The square matrix $\mathbf{C}_{d}$ is invertible.

Minimization of the mean-square error under these assumptions leads to the estimate
$$\hat{\mathbf{m}} = \mathbf{C}_{md}\,\mathbf{C}_{d}^{-1}\,\mathbf{d}\;,\qquad(2)$$
with the corresponding model error covariance
$$\mathbf{C}_{e} = \mathbf{C}_{m} - \mathbf{C}_{md}\,\mathbf{C}_{d}^{-1}\,\mathbf{C}_{md}^{*}\;.\qquad(3)$$
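The Gauss-Markoff estimate and its error covariance can be sketched numerically. The following minimal example assumes the standard estimator form $\hat{\mathbf{m}} = \mathbf{C}_{md}\,\mathbf{C}_{d}^{-1}\,\mathbf{d}$; the Gaussian-shaped model covariance, grid size, and noise level are illustrative choices, not values prescribed by the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: a model of N points observed at n irregular locations.
N, n = 50, 20
x = np.linspace(0.0, 1.0, N)
idx = np.sort(rng.choice(N, size=n, replace=False))

# Assumed Gaussian-shaped model covariance and noise variance (both
# illustrative assumptions for this sketch).
C_m = np.exp(-((x[:, None] - x[None, :]) / 0.2) ** 2)
sigma2 = 0.01

# Observing d = m at idx plus independent noise implies
#   C_d  = C_m[idx, idx] + sigma2 * I,   C_md = C_m[:, idx].
C_d = C_m[np.ix_(idx, idx)] + sigma2 * np.eye(n)
C_md = C_m[:, idx]

# One realization of the model and the data.
L = np.linalg.cholesky(C_m + 1e-10 * np.eye(N))
m = L @ rng.standard_normal(N)
d = m[idx] + np.sqrt(sigma2) * rng.standard_normal(n)

# Gauss-Markoff estimate (2) and model error covariance (3).
m_hat = C_md @ np.linalg.solve(C_d, d)
C_e = C_m - C_md @ np.linalg.solve(C_d, C_md.T)
```

The diagonal of `C_e` predicts the estimation variance at each model point; it vanishes where the data constrain the model well and approaches the prior variance far from the observations.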

Equation (2) has fundamental importance in different data regularization schemes. With some slight modifications, it appears as the basis for such methods as optimal interpolation in atmospheric data analysis (Daley, 1991; Gandin, 1965), least-squares collocation in geodesy (Moritz, 1980), and linear kriging in petroleum and mining engineering (Hohn, 1999; Journel and Huijbregts, 1978). In order to apply formula (2) in practice, one needs first to get an estimate of the matrices $\mathbf{C}_{md}$ and $\mathbf{C}_{d}$. In geostatistics, the covariance matrices are usually chosen from simple variogram models (Deutsch and Journel, 1997).
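As a brief illustration of the variogram route, a second-order stationary process with sill $\sigma^2$ has covariance $C(h) = \sigma^2 - \gamma(h)$, where $\gamma$ is the variogram. The sketch below builds a covariance matrix from the classical spherical model; the sill and range values are illustrative assumptions, not parameters from the text.

```python
import numpy as np

def spherical_cov(h, sill=1.0, a=0.3):
    """Covariance implied by the spherical variogram with the given sill
    and range a:  C(h) = sill * (1 - 1.5*h/a + 0.5*(h/a)**3) for h < a,
    and C(h) = 0 beyond the range."""
    h = np.asarray(h, dtype=float)
    c = sill * (1.0 - 1.5 * h / a + 0.5 * (h / a) ** 3)
    return np.where(h < a, c, 0.0)

# Covariance matrix of a process sampled on a 1-D grid:
x = np.linspace(0.0, 1.0, 8)
C = spherical_cov(np.abs(x[:, None] - x[None, :]))
```

By construction the matrix is symmetric, equals the sill on the diagonal, and decays to zero for point pairs farther apart than the range.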

Unfortunately, a straightforward application of the Gauss-Markoff formula (2) is computationally unaffordable for typical seismic data applications. If the data vector contains $N$ parameters, a straightforward application will lead to an $N \times N$ matrix inversion, which requires storage proportional to $N^{2}$ and a number of operations proportional to $N^{3}$.

We can take two major theoretical steps to reduce the computational complexity of the method. The first step is to approximate the covariance matrices with sparse operators so that the matrix multiplication is reduced from $N^{2}$ operations to something linear in $N$.
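To make the cost reduction concrete, the sketch below applies a banded (tridiagonal) operator stored by its diagonals, so a matrix-vector product costs $O(N)$ instead of $O(N^{2})$. The tridiagonal stencil is only a generic stand-in for a sparse covariance approximation, not the particular operator developed in the text.

```python
import numpy as np

def apply_tridiag(main, lower, upper, v):
    """y = T v for a tridiagonal matrix T given by its three diagonals.
    Touches ~3N entries, so the cost is linear in N."""
    y = main * v
    y[1:] += lower * v[:-1]   # subdiagonal contribution
    y[:-1] += upper * v[1:]   # superdiagonal contribution
    return y

N = 1000
main = 2.0 * np.ones(N)
lower = -1.0 * np.ones(N - 1)
upper = -1.0 * np.ones(N - 1)

v = np.ones(N)
y = apply_tridiag(main, lower, upper, v)
```

Storing only the nonzero diagonals also reduces the memory footprint from $N^{2}$ to a small multiple of $N$, which is what makes the approach feasible for seismic-scale problems.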