Next: JOINT ESTIMATION OF OPERATOR Up: Nichols: Estimation of missing Previous: Introduction

LINEAR ESTIMATION OF MISSING DATA

Consider a list of samples with some elements missing

$\begin{displaymath} \phi = \{ d_1, d_2, m_1, d_3, d_4, d_5, m_2, \ldots , m_m, d_l \};\end{displaymath}$

there are l known data samples, $\{ d_1, \ldots, d_l \}$ , and m missing data samples, $\{ m_1, \ldots, m_m \}$ . I wish to estimate values for the missing samples. If I have some model of the expected output I can choose interpolated values that minimize the power of the residual data that does not obey the model. e.g. If we have an expected band limited spectrum we can minimize the power of the output spectrum outside the expected band. In general I write the problem as one of minimizing the residuals resulting from applying a linear operator to the data. The model parameters that can be varied in the minimization process are the missing data values.

I define two new vectors k and u. The known data, k, is a vector of length n with zeros in place of the unknown values. The unknown data, u, is a vector of length n with zeros is place of the known values. Then,

$\begin{displaymath} \phi \equiv \left( \begin{array} {c} \phi_1 \\ \phi_2 \\ ... ...} 0 \\ 0 \\ m_1 \\ \vdots \\ m_m \\ 0 \end{array} \right)\end{displaymath}$

The objective function I minimize is

$\begin{displaymath} \vert R \cdot \phi \vert ^2 = \vert R\cdot u + R\cdot k\vert^2 \end{displaymath}$

This is easily related to the standard form of linear problems, Am = y where $y= -R \cdot k$ and $A = R \cdot S .$ Here R is the user supplied operator that is derived from the assumptions made about the output. S is a scatter/gather operator that maps the vector of unknowns, m, into the vector u.

One simple example of a suitable linear operator is a roughening operator. The model that this implies is that the output should be locally smooth. If I have a smoothing operator, SM, I can define a roughening operator, R = 1 - SM. The residual vector that I want to minimize is $R \cdot \phi$ .

Many numerical methods can be used to solve this problem. I choose to use a simple conjugate-gradient algorithm ( Claerbout, 1985). The smoothing operator I use is the triangular smoothing operator "Triconv" described by Claerbout(1989). The adjustable parameters in the interpolation algorithm are the length of the smoothing operator and the number of iterations of the conjugate gradient algorithm.

An important property of this type of interpolation method is that it can be used to estimate extensions off the ends of a dataset. If I had simply designed a filter from the input data and then used the filter to predict off the end of the data it might have produced a diverging solution.

When processing a seismic dataset I treat each time slice independently. This produces one-dimensional vectors of data with unknown samples can be processed using the method described above. Figure 1 shows the processing of a synthetic dataset. The input (a) has equal amplitude spikes across half the gather. Panels (b), (c) and (d) show the results as more iterations of the conjugate gradient algorithm are performed.

plotlin1
Figure 1 Linear estimation with filter of half width 3. (a) Input data, (b) 1 iteration of conjugate gradient algorithm, (c) 5 iterations, (d) 11 iterations.

Figures 2 and 3 show the processing of a real dataset. A dataset with half the traces missing at far offsets, wz27 (Yilmaz and Cumro,1983), has been windowed to provide a small test dataset. Normal moveout was applied to the data with water velocity. This has made most events reasonably, but not totally, flat. Figure 2 shows the results of processing with a filter of half width 3 samples. The missing traces at far offsets are well interpolated but as the iterations progress, more energy is put in flat extensions at the ends of the data. This is because the algorithm tries to produce a smooth answer on each time slice separately, and the flat extensions are certainly smooth. Figure 3 shows the results of processing with a filter of half width 7 samples. Because the data has some residual dip it is not smooth over a length of 15 samples. The algorithm only interpolates large amplitudes when it is surrounded by data that is smooth over the length of the filter. Because of the residual dip the results for a longer filter are less successful.

plotlin2
Figure 2 Linear estimation with filter of half width 3. (a) Input data, (b) 1 iteration of conjugate gradient algorithm, (c) 10 iterations, (d) 30 iterations.

plotlin3
Figure 3 Linear estimation with filter of half width 7. (a) 1 iteration of conjugate gradient algorithm, (b) 10 iterations, (c) 30 iterations, (d) 50 iterations.

A better approach to the problem might be to avoid any assumptions about the desired output spectrum and instead use an error filter that matches the spectral characteristics of the input data.

Next: JOINT ESTIMATION OF OPERATOR Up: Nichols: Estimation of missing Previous: Introduction

Stanford Exploration Project
1/13/1998