
## Prediction-error or annihilation filters

One of the most common geophysical applications of inversion is the calculation of prediction-error filters. A prediction error is defined as

$$
r_i = \sum_{j=0}^{n} a_j\, d_{i-j}, \qquad a_0 = 1,
\qquad (26)
$$

where $\mathbf{a}$ is the prediction-error filter of length $n+1$, and $\mathbf{d}$ is an input data series. The filtering operation may also be expressed as $\mathbf{r} = \mathbf{a} * \mathbf{d}$, where $*$ indicates convolution. For the sake of the discussion here, $\mathbf{d}$ is an infinitely long time series. As in the previous discussion, the error $\mathbf{r}$ and the data $\mathbf{d}$ will be assumed to be stationary and to have a Gaussian distribution with a zero mean. The error will also be assumed to be uncorrelated, so that $E[r_i r_j] = \sigma^2 \delta_{i-j}$.
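
As a quick numerical illustration of equation (26), the sketch below applies a short prediction-error filter (leading coefficient one) to a data series by convolution. The coefficients and series length here are arbitrary illustrative choices, not values from the text.

```python
import numpy as np

# Illustrative only: filter coefficients and data are made-up examples.
rng = np.random.default_rng(0)
d = rng.standard_normal(1000)          # input data series d
a = np.array([1.0, -0.5, 0.25])        # prediction-error filter, a_0 = 1

r = np.convolve(a, d, mode="full")     # r = a * d (convolution)
print(len(r))                          # full convolution lengthens the series
```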

Application of a prediction-error filter removes the predictable information from a dataset, leaving the unpredictable information, that is, the prediction error. A typical use of prediction-error filters is seen in the deconvolution problem, where the predictable parts of a seismic trace, such as the source wavelet and multiples, are removed, leaving the unpredictable reflections.

The condition that $\mathbf{r}$ contains no predictable information may be expressed in several ways. One method is to minimize $\mathbf{r}^H \mathbf{r}$, where $\mathbf{r}^H$ is the conjugate transpose of $\mathbf{r}$, by calculating a filter $\mathbf{a}$ that minimizes $(\mathbf{a} * \mathbf{d})^H (\mathbf{a} * \mathbf{d})$. This minimization reduces $\mathbf{r}$ to the least energy possible, where the smallest $\mathbf{r}$ is assumed to contain only unpredictable information.

Another, equivalent, expression of unpredictability is that the non-zero lags of the normalized autocorrelation of $\mathbf{r}$ are zero, or that

$$
\frac{\sum_i r_i\, r_{i+k}}{\sum_i r_i^2} = \delta_k,
\qquad (27)
$$

where $\delta_k$ is one when $k$ is zero and is zero otherwise. This approximation may also be expressed as $\mathbf{r} \star \mathbf{r} = \sigma^2 \boldsymbol{\delta}$, where $\star$ denotes correlation and $\sigma^2$ is a scale factor that may be ignored. These two methods of expressing unpredictability are the basis for the sample calculations of the prediction-error filter presented in the next section.
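
The condition in equation (27) can be checked on synthetic data. In this sketch the data follow an AR(1) recursion whose exact prediction-error filter is known, so the filtered output should have a delta-like normalized autocorrelation; the coefficient 0.5 and the series length are illustrative assumptions.

```python
import numpy as np

# Synthetic AR(1) data: the predictable part is d_t = 0.5 d_{t-1}, so the
# exact prediction-error filter here is (1, -0.5).
rng = np.random.default_rng(1)
e = rng.standard_normal(5000)
d = np.zeros_like(e)
for t in range(1, len(d)):
    d[t] = 0.5 * d[t - 1] + e[t]

a = np.array([1.0, -0.5])
r = np.convolve(a, d, mode="valid")        # prediction error

def norm_autocorr(x, k):
    """Normalized autocorrelation of x at lag k."""
    return np.dot(x[: len(x) - k], x[k:]) / np.dot(x, x)

# Lag 0 is one; non-zero lags should be near zero, as in equation (27).
print([round(norm_autocorr(r, k), 3) for k in range(4)])
```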

The prediction-error filter may also be defined in the frequency domain using the condition that the expectation $E[r_i r_j] = 0$ for $i \neq j$. Transforming the autocorrelation into the frequency domain gives $E[R(\omega)\bar{R}(\omega)] = \sigma^2$, where $\bar{R}(\omega)$ is the complex conjugate of $R(\omega)$. Since $\mathbf{r}$ is the convolution of $\mathbf{a}$ and $\mathbf{d}$, $R(\omega) = A(\omega)D(\omega)$, and

$$
E\left[ A(\omega) D(\omega)\, \bar{A}(\omega) \bar{D}(\omega) \right] = \sigma^2.
\qquad (28)
$$

Since the filter is a linear operator, it can be taken outside of the expectation (Papoulis, 1984) to make the previous expression become

$$
A(\omega) \bar{A}(\omega)\, E\left[ D(\omega) \bar{D}(\omega) \right] = \sigma^2.
\qquad (29)
$$

Thus, the power spectrum of the filter $\mathbf{a}$ is the inverse of the power spectrum of the data $\mathbf{d}$. Although the phase of the data is lost when creating the cross-correlation of $\mathbf{d}$ with itself to get $E[D(\omega)\bar{D}(\omega)]$, the phase of the filter is generally unimportant when the filter is being used as an annihilation filter in an inversion. For applications where a minimum-phase filter is required, Kolmogoroff spectral factorization (Claerbout, 1992a) may be used.
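
The spectral relation in equation (29) can be verified numerically: the filter's power spectrum times the data's power spectrum should be roughly flat. The sketch below estimates the data spectrum by averaging periodograms of synthetic AR(1) realizations; the coefficient 0.5, window length, and realization count are all arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
nw, nreal = 256, 200
P = np.zeros(nw // 2 + 1)
for _ in range(nreal):                      # average periodograms of AR(1) data
    e = rng.standard_normal(nw)
    d = np.zeros(nw)
    for t in range(1, nw):
        d[t] = 0.5 * d[t - 1] + e[t]
    P += np.abs(np.fft.rfft(d)) ** 2 / nw
P /= nreal

w = 2 * np.pi * np.fft.rfftfreq(nw)             # frequencies in radians/sample
A2 = np.abs(1.0 - 0.5 * np.exp(-1j * w)) ** 2   # |A(w)|^2 for the filter (1, -0.5)
flat = A2 * P                                   # should be roughly constant
print(flat.std() / flat.mean())                 # small relative spread
```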

Another way of expressing the unpredictability of $\mathbf{r}$ is $E[\mathbf{r}\mathbf{r}^T] = \sigma^2\mathbf{I}$, that is, the expectation of $\mathbf{r}\mathbf{r}^T$ is the identity matrix (scaled by $\sigma^2$). This states that the expectations of the cross-terms of the errors are zero, that is, the errors are uncorrelated. It also states that the variances of the errors have equal weights. To make the matrices factor with an $LU$ decomposition (Strang, 1988), the expression needs to be posed as a matrix operation $\mathbf{r} = \mathbf{A}\mathbf{d}$, with $\mathbf{A}$ an upper triangular matrix. To do this, the indices of $\mathbf{r}$ and $\mathbf{d}$ are reversed from the usual order in their vector representations. A small example of $\mathbf{r} = \mathbf{A}\mathbf{d}$ is

$$
\begin{bmatrix} r_4 \\ r_3 \\ r_2 \\ r_1 \end{bmatrix}
=
\begin{bmatrix}
a_0 & a_1 & a_2 & a_3 \\
0 & a_0 & a_1 & a_2 \\
0 & 0 & a_0 & a_1 \\
0 & 0 & 0 & a_0
\end{bmatrix}
\begin{bmatrix} d_4 \\ d_3 \\ d_2 \\ d_1 \end{bmatrix}.
\qquad (30)
$$

Building one small realization of $\mathbf{r}\mathbf{r}^T$ gives

$$
\mathbf{r}\mathbf{r}^T =
\begin{bmatrix}
r_4 r_4 & r_4 r_3 & r_4 r_2 & r_4 r_1 \\
r_3 r_4 & r_3 r_3 & r_3 r_2 & r_3 r_1 \\
r_2 r_4 & r_2 r_3 & r_2 r_2 & r_2 r_1 \\
r_1 r_4 & r_1 r_3 & r_1 r_2 & r_1 r_1
\end{bmatrix}.
\qquad (31)
$$
To get an estimate of the expectation of this expression, it must be remembered that the data series is stationary, so the error will also be stationary. Only the differences in the indices are important, not their locations. It can be seen that the elements of (31) are the elements of the autocorrelation of $\mathbf{r}$ at various lags. Using equation (27) makes the expectation of (31) become the identity matrix scaled by $\sigma^2$, where the $\sigma^2$ may be dropped, since it is only a scale factor that may be incorporated into the filter or treated as a normalization of the autocorrelation.

Starting from the expression $E[\mathbf{r}\mathbf{r}^T] = \mathbf{I}$ and substituting $\mathbf{A}\mathbf{d}$ for $\mathbf{r}$ gives

$$
E\left[ \mathbf{A}\mathbf{d} \left( \mathbf{A}\mathbf{d} \right)^T \right] = \mathbf{I},
\qquad (32)
$$

or

$$
E\left[ \mathbf{A}\mathbf{d}\mathbf{d}^T\mathbf{A}^T \right] = \mathbf{I}.
\qquad (33)
$$

Once again, since $\mathbf{A}$ and $\mathbf{A}^T$ are linear operators, they can be taken outside of the expectation (Papoulis, 1984):

$$
\mathbf{A}\, E\left[ \mathbf{d}\mathbf{d}^T \right] \mathbf{A}^T = \mathbf{I}.
\qquad (34)
$$

Moving the $\mathbf{A}$s to the right-hand side gives

$$
E\left[ \mathbf{d}\mathbf{d}^T \right] = \mathbf{A}^{-1}\mathbf{A}^{-T}.
\qquad (35)
$$
Expanding one realization of a small example of $\mathbf{d}\mathbf{d}^T$ gives

$$
\mathbf{d}\mathbf{d}^T =
\begin{bmatrix}
d_4 d_4 & d_4 d_3 & d_4 d_2 & d_4 d_1 \\
d_3 d_4 & d_3 d_3 & d_3 d_2 & d_3 d_1 \\
d_2 d_4 & d_2 d_3 & d_2 d_2 & d_2 d_1 \\
d_1 d_4 & d_1 d_3 & d_1 d_2 & d_1 d_1
\end{bmatrix}.
\qquad (36)
$$

Once again, to get an estimate of the expectation of this expression, it should be remembered that $\mathbf{d}$ is stationary, and only the differences in the indices are important. It can then be seen that the terms $E[d_i d_{i-j}]$ are elements of the autocorrelation, and $E[\mathbf{d}\mathbf{d}^T]$ is the autocorrelation matrix $\mathbf{R}$ of $\mathbf{d}$. Setting $E[\mathbf{d}\mathbf{d}^T] = \mathbf{R}$ in equation (35) and inverting gives

$$
\mathbf{R}^{-1} = \mathbf{A}^T \mathbf{A}.
\qquad (37)
$$

Generally, $\mathbf{R}$ will be invertible and positive definite for real data. There are some special cases where $\mathbf{R}$ is not invertible, for example, when $\mathbf{d}$ contains a single sine wave. To avoid these problems, the stabilizer $\epsilon\mathbf{I}$ is often added to the autocorrelation matrix, where $\epsilon$ is a small number and $\mathbf{I}$ is the identity matrix. In the geophysical industry, this is referred to as adding white noise, or whitening, since adding $\epsilon\mathbf{I}$ to the autocorrelation matrix is equivalent to adding uncorrelated noise to the data $\mathbf{d}$.
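
The single-sine-wave degeneracy and its cure can be seen numerically. The sketch below builds the Toeplitz autocorrelation matrix of a pure cosine (exactly rank two, hence not invertible) and then adds the stabilizer; the frequency, window size, and value of the small constant are all arbitrary illustrative choices.

```python
import numpy as np

# Degenerate case from the text: autocorrelation of a single sine wave.
n = 6
c = 0.5 * np.cos(0.3 * np.arange(n))        # autocorrelation lags of cos(0.3 t)
lag = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
R = c[lag]                                   # Toeplitz autocorrelation matrix

eps = 1e-3 * c[0]                            # small fraction of the zero lag
R_stab = R + eps * np.eye(n)                 # "adding white noise"

# R is rank 2 (numerically singular); the stabilized matrix is well conditioned.
print(np.linalg.cond(R), np.linalg.cond(R_stab))
```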

The matrix $\mathbf{A}$ may be obtained from the matrix $\mathbf{R}^{-1}$ by Cholesky factorization (Strang, 1988), since $\mathbf{R}^{-1}$ is symmetric positive definite. Cholesky factorization factors a matrix into $\mathbf{U}^T\mathbf{U}$, where $\mathbf{U}$ looks like

$$
\mathbf{U} =
\begin{bmatrix}
u_{11} & u_{12} & u_{13} & u_{14} \\
0 & u_{22} & u_{23} & u_{24} \\
0 & 0 & u_{33} & u_{34} \\
0 & 0 & 0 & u_{44}
\end{bmatrix}.
\qquad (38)
$$

The matrix $\mathbf{A}$ obtained from this factorization will be upper triangular, as seen in (30), so the filter is seen to predict a given sample of $\mathbf{d}$ from the past samples, and the maximum filter length is the length of the window. This matrix could be considered as four filters of increasing length. The longest filter, assuming it is the most effective filter, could be taken as the prediction-error filter.
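
A minimal sketch of this factorization, assuming AR(1) data with coefficient 0.5 and unit innovations (so the true prediction-error filter is (1, -0.5) and the theoretical autocorrelation lags are known in closed form):

```python
import numpy as np

# Assumed setup: d_t = 0.5 d_{t-1} + e_t with unit innovations, whose
# theoretical autocorrelation is c_k = 0.5**|k| / (1 - 0.5**2).
phi, n = 0.5, 4
lag = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
R = phi**lag / (1.0 - phi**2)

# Equation (37): R^-1 = A^T A with A upper triangular. NumPy's Cholesky
# returns the lower factor L with R^-1 = L L^T, so A = L^T.
A = np.linalg.cholesky(np.linalg.inv(R)).T
print(np.round(A, 3))
```

The rows of the result are filters of increasing length, and the longest (first) row reproduces the prediction-error filter (1, -0.5) padded with zeros, while `A @ R @ A.T` is the identity, as equation (34) requires.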

Another way of looking at this definition of a prediction-error filter is to consider the filter as a set of weights producing a weighted least-squares solution. Following Strang (1986), the weighted error is $\mathbf{e} = \mathbf{W}\mathbf{r}$, where $\mathbf{W}$ is to be determined and $\mathbf{r}$ is the error of the original system. The best $\mathbf{W}$ will make

$$
E\left[ \mathbf{e}\mathbf{e}^T \right] = \mathbf{I}.
\qquad (39)
$$

Since $\mathbf{e} = \mathbf{W}\mathbf{r}$,

$$
\mathbf{W}\, E\left[ \mathbf{r}\mathbf{r}^T \right] \mathbf{W}^T = \mathbf{I},
\qquad (40)
$$

where $E[\mathbf{r}\mathbf{r}^T]$ is the covariance matrix of $\mathbf{r}$. Strang, quoting Gauss, says the best $\mathbf{W}^T\mathbf{W}$ is the inverse of the covariance matrix of $\mathbf{r}$. Setting $\mathbf{r} = \mathbf{d}$ and $E[\mathbf{d}\mathbf{d}^T] = \mathbf{R}$ makes the weight $\mathbf{W}$ become the prediction-error filter $\mathbf{A}$ seen in equation (37).

While I have neglected a number of issues, such as the invertibility of $\mathbf{R}$, the finite length of $\mathbf{d}$, and the quality of the estimates of the expectations, in practice the explicit solution to (37) will not be used to calculate a prediction-error filter. Most practical prediction-error filters are calculated using other, more efficient methods. A traditional method for calculating a short prediction-error filter using Levinson recursion will be shown in the next section. In this thesis, most of the filters are calculated using a conjugate-gradient technique such as that shown in Claerbout (1992a).
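
For a small filter, the explicit route can nonetheless be sketched: estimate autocorrelation lags from the data and solve the resulting normal (Yule-Walker) equations. The AR(2) model and every number below are illustrative assumptions; Levinson recursion or conjugate gradients would replace the dense solve in practice.

```python
import numpy as np

# Synthetic AR(2) data whose exact prediction-error filter is (1, -0.8, 0.3).
rng = np.random.default_rng(4)
e = rng.standard_normal(20000)
d = np.zeros_like(e)
for t in range(2, len(d)):
    d[t] = 0.8 * d[t - 1] - 0.3 * d[t - 2] + e[t]

# Estimate autocorrelation lags, then solve the 2x2 normal equations.
c = np.array([np.dot(d[: len(d) - k], d[k:]) for k in range(3)]) / len(d)
T = np.array([[c[0], c[1]],
              [c[1], c[0]]])
f = np.linalg.solve(T, c[1:])            # prediction coefficients
a = np.concatenate(([1.0], -f))          # prediction-error filter
print(np.round(a, 3))
```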

Most prediction-error filters are used as simple filters, and the desired output is just the error $\mathbf{r}$ from the application of the filter. Another use for these prediction-error filters is to describe some class of information in an inversion, such as signal or noise. In this case, these filters are better described as annihilation filters, since the inversion depends on the condition that the filter applied to some data annihilates, or zeros, a particular class of information to a good approximation. For example, a signal $\mathbf{s}$ may be characterized by a signal-annihilation filter expressed as a matrix $\mathbf{S}$, so that $\mathbf{S}\mathbf{s} \approx \mathbf{0}$, which may be expressed as $\|\mathbf{S}\mathbf{s}\| = \epsilon$, where $\epsilon$ is small compared to $\|\mathbf{s}\|$. A noise $\mathbf{n}$ may be characterized by a noise-annihilation filter $\mathbf{N}$, so that $\mathbf{N}\mathbf{n} \approx \mathbf{0}$. Examples of annihilation filters used to characterize signal and noise will be shown in later chapters.
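
A concrete annihilation example, under the assumption that the "signal" is a single sinusoid: a three-term filter with coefficients (1, -2 cos w, 1) zeros any sinusoid at frequency w, since s[t+1] - 2 cos(w) s[t] + s[t-1] = 0. The frequency, phase, and length below are arbitrary choices.

```python
import numpy as np

# Hypothetical signal: a single sinusoid at frequency w (radians/sample).
w = 0.3
t = np.arange(500)
s = np.cos(w * t + 0.7)

# Annihilation filter for any sinusoid at frequency w.
a = np.array([1.0, -2.0 * np.cos(w), 1.0])
Ss = np.convolve(a, s, mode="valid")       # the "S s" product of the text

# epsilon = ||S s|| is tiny compared to ||s||.
print(np.linalg.norm(Ss) / np.linalg.norm(s))
```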

In this thesis, prediction-error filters will be referred to as annihilation filters when used in an inversion context. Although the two kinds of filters are calculated in the same way, their use in simple filtering and in inversion is quite different. Simple filtering, whether one-, two-, or three-dimensional, involves samples that are relatively close to the output point and makes some simplifying assumptions. An important assumption is that the prediction error is not affected by the application of the filter. Inversion requires that the filters describe the data, and the characterization of the data is less local than it is with simple filtering. The assumption that the prediction error is not affected by the filter can be relaxed in inversion, a topic to be considered further in later chapters.

Stanford Exploration Project
2/9/2001