Next: TIME-FREQUENCY-STATISTICAL RESOLUTION
Up: Resolution
Previous: TIME-STATISTICAL RESOLUTION
Observations of sea level over a long period of time can be summarized in terms
of a few statistical averages,
such as the mean height $m$ and the variance $\sigma^2$. Another important kind of statistical average for such
geophysical time series is the power spectrum.
Some mathematical
models explain only statistical averages of data and not the data themselves.
In order to recognize certain pitfalls and understand certain fundamental
limitations on work with power spectra,
we first consider an idealized example.
Let $x_t$ be a time series made up of independently chosen random numbers.
Suppose we have $n$ of these numbers.
We can then define the data sample polynomial $X(Z)$ as
\[ X(Z) \;=\; x_0 + x_1 Z + x_2 Z^2 + \cdots + x_{n-1} Z^{n-1} \tag{38} \]
We can now make up a power spectral estimate from this sample of
random numbers by
\[ \hat R(Z) \;=\; \frac{1}{n}\, X\!\left(\frac{1}{Z}\right) X(Z) \tag{39} \]
The difference between this and our earlier definition of spectrum is that
a power spectrum has the divisor n to keep the expected result from
increasing linearly with the somewhat arbitrary sample size n.
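As a concrete illustration of (39), here is a minimal sketch (not part of the original text; it assumes NumPy, unit-variance Gaussian samples, and evaluation of $\hat R$ at the FFT frequencies; the function name and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def spectral_estimate(x):
    """Power spectral estimate of eq. (39): R_hat(omega) = |X(omega)|^2 / n."""
    n = len(x)
    X = np.fft.rfft(x)             # X(Z) sampled on the upper half of the unit circle
    return (X * np.conj(X)).real / n

for n in (32, 512):
    x = rng.standard_normal(n)     # independent random numbers, sigma^2 = 1
    R_hat = spectral_estimate(x)
    # the frequency-to-frequency scatter stays on the order of sigma^2 for both n
    print(n, R_hat.mean(), R_hat.std())
```

The scatter does not shrink as $n$ grows, which is exactly the fuzziness visible in Figure 3.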
The definition of the power spectrum is the expected value of $\hat R$, namely
\[ R(Z) \;=\; E\!\left[\hat R(Z)\right] \;=\; E\!\left[\frac{1}{n}\, X\!\left(\frac{1}{Z}\right) X(Z)\right] \tag{40} \]
It might seem that a practical definition would be to let n tend to infinity
in (39).
Such a definition would lead us into a pitfall which is
the main topic of the present section.
Specifically, from Figure 3 we conclude that $\hat R$ is a much fuzzier function than $R(Z)$,
so that
\[ \lim_{n \to \infty} \hat R(Z) \;\neq\; R(Z) \tag{41} \]
Figure 3: Amplitude spectra of samples of $n$ random numbers.
These functions seem to oscillate over about
the same range for $n = 512$ as they do for $n = 32$.
As $n$ tends to infinity we expect infinitely rapid oscillation.
To understand why the spectrum is rough,
we identify coefficients of like powers of $Z$ in (39).
We have
\[ \hat r_k \;=\; \frac{1}{n} \sum_{t} x_t\, x_{t+k} \tag{42} \]
enabling us
to write (39) for real time series as
\[ \hat R(Z) \;=\; \hat r_0 \;+\; \sum_{k=1}^{n-1} \hat r_k \left( Z^k + Z^{-k} \right) \tag{43} \]
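As a quick numerical check (a sketch of my own, assuming NumPy; the frequency 0.73 is an arbitrary choice), the autocorrelation coefficients of (42) inserted into the cosine series (43) reproduce $|X|^2/n$ exactly:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 64
x = rng.standard_normal(n)

# sample autocorrelation, eq. (42): r_hat[k] = (1/n) * sum_t x_t x_{t+k}
r_hat = np.correlate(x, x, mode="full")[n - 1:] / n   # lags k = 0 .. n-1

omega = 0.73                                          # any frequency in (0, pi)
k = np.arange(1, n)
R_from_r = r_hat[0] + 2.0 * np.sum(r_hat[1:] * np.cos(k * omega))   # eq. (43) on |Z| = 1

X = np.sum(x * np.exp(-1j * omega * np.arange(n)))    # X(Z) evaluated on the unit circle
R_from_X = abs(X) ** 2 / n                            # eq. (39)

print(np.allclose(R_from_r, R_from_X))                # True
```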
Let us examine (43) for large $n$.
To do this,
we will need to know some of the statistical properties of the random numbers.
Let them have zero mean, $m = E(x_t) = 0$,
and known constant variance $\sigma^2 = E(x_t^2)$, and recall our assumption of independence, which means that
$E(x_t x_{t+s}) = 0$ if $s \neq 0$. Because of random fluctuations,
we have learned to expect that $\hat r_0$ will come out to be $\sigma^2$ plus
a random fluctuation component which decreases with sample size as $1/\sqrt{n}$, namely
\[ \hat r_0 \;\approx\; \sigma^2 \pm \frac{\sigma^2}{\sqrt{n}} \tag{44} \]
Likewise, $\hat r_1$ should come out to be zero, but the definition
(42) leads us to expect a fluctuation component
\[ \hat r_1 \;\approx\; 0 \pm \frac{\sigma^2}{\sqrt{n}} \tag{45} \]
For the $k$th correlation value, $k > 1$, we expect a fluctuation of
order
\[ \hat r_k \;\approx\; \pm \frac{\sigma^2}{\sqrt{n}} \tag{46} \]
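The $\sigma^2/\sqrt{n}$ scatter in (44)-(46) is easy to verify by simulation; the following sketch (my own check, assuming NumPy and unit-variance Gaussian samples) measures the fluctuation of the first few lags over many independent realizations:

```python
import numpy as np

rng = np.random.default_rng(2)
n, trials = 400, 2000

r_hat = np.empty((trials, 3))                        # lags k = 0, 1, 2
for i in range(trials):
    x = rng.standard_normal(n)                       # sigma^2 = 1
    for k in range(3):
        r_hat[i, k] = np.dot(x[:n - k], x[k:]) / n   # eq. (42)

print("sigma^2 / sqrt(n)   =", 1.0 / np.sqrt(n))
print("mean over trials    =", r_hat.mean(axis=0))   # about [1, 0, 0]
print("scatter over trials =", r_hat.std(axis=0))    # each of order sigma^2/sqrt(n)
```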
An autocorrelation of a particular set of random numbers is displayed in
Figure 4.
Figure 4: Positive lags of the autocorrelation of 36 random numbers.
Now one might imagine that as $n$ goes to infinity the
fluctuation terms vanish and (39) takes the limiting form
$\hat R(Z) = \sigma^2$. Such a conclusion is false.
The reason is that although the
individual fluctuation terms go as $\sigma^2/\sqrt{n}$, the summation in
(43) contains $n$ such terms.
Luckily, these terms randomly
cancel one another, so the sum does not diverge as $n \to \infty$. We recall
that the sum of $n$ randomly signed numbers of unit magnitude is expected to
add up to a random number in the range $\pm\sqrt{n}$. Thus the sum
(43) adds up to
\[ \hat R \;\approx\; \sigma^2 \pm \sqrt{n}\,\frac{\sigma^2}{\sqrt{n}} \;=\; \sigma^2 \pm \sigma^2 \tag{47} \]
This is the basic result that a power spectrum estimated from the energy
density of a sample of random numbers has a fluctuation from frequency to
frequency and from sample to sample which is as large as the expected
spectrum.
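That statement can also be checked directly; the sketch below (again an illustration of mine, assuming NumPy and $\sigma^2 = 1$) collects the estimate (39) at one interior frequency over many independent samples and finds a standard deviation about as large as the mean, whatever the value of $n$:

```python
import numpy as np

rng = np.random.default_rng(3)

def spectral_estimate(x):
    X = np.fft.rfft(x)
    return (X * np.conj(X)).real / len(x)            # eq. (39)

for n in (64, 1024):
    vals = np.array([spectral_estimate(rng.standard_normal(n))[n // 4]
                     for _ in range(2000)])
    # mean is near sigma^2 = 1, and so is the sample-to-sample scatter,
    # no matter how large n becomes
    print(n, vals.mean(), vals.std())
```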
It should be clear that letting $n$ go to infinity does not take us to the
theoretical result $R(Z) = \sigma^2$. The problem is that, as we increase $n$,
we increase the frequency resolution but not the statistical resolution.
To increase the statistical resolution we need to simulate ensemble averaging.
There are two ways to do this: (1) take the
sample of $n$ points and break it into $k$ equal-length segments of $n/k$
points each; compute an $\hat R$ for each segment and then add all $k$
of the $\hat R$'s together;
or (2) form $\hat R$ from the full $n$-point sample and,
of the $n/2$ independent amplitudes,
replace each one by an average over its $k$ nearest neighbors.
Whichever method, (1) or (2),
is used, it will be found that the frequency resolution is $\Delta f \approx k/(n\,\Delta t)$ and that the statistical resolution $\Delta s$, the inverse
of the number of degrees of freedom averaged over, is $1/k$.
Thus, we have
\[ \Delta f \, \Delta s \;\approx\; \frac{1}{n\,\Delta t} \]
If some of the data are not used,
or are not used effectively, we get the usual inequality
\[ \Delta f \, \Delta s \;\geq\; \frac{1}{n\,\Delta t} \]
Thus we see that,
if there are enough data available (n large enough),
we can get as good resolution as we like.
Otherwise, improved statistical resolution
is at the cost of frequency resolution and vice versa.
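A minimal sketch of method (1), segment averaging (my own illustration, assuming NumPy; it averages rather than sums the segment estimates so that the level stays near $\sigma^2$), makes the tradeoff concrete: the relative scatter falls roughly as $1/\sqrt{k}$ while the number of distinct frequencies, and hence the frequency resolution, falls by the same factor $k$:

```python
import numpy as np

rng = np.random.default_rng(4)

def averaged_estimate(x, k):
    """Method (1): split x into k segments and average the k estimates of eq. (39)."""
    m = len(x) // k                                  # points per segment
    segments = x[:k * m].reshape(k, m)
    X = np.fft.rfft(segments, axis=1)
    return ((X * np.conj(X)).real / m).mean(axis=0)

n = 4096
x = rng.standard_normal(n)                           # sigma^2 = 1
for k in (1, 4, 16, 64):
    R = averaged_estimate(x, k)
    interior = R[1:-1]                               # skip the DC and Nyquist points
    print(f"k={k:3d}  frequencies={len(R):5d}  "
          f"relative scatter={interior.std() / interior.mean():.2f}")
```

With all the data used, the product of frequency spacing and statistical uncertainty stays fixed, which is the equality case of the relation above.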
We are right on the verge of recognizing a resolution tradeoff,
not only between $\Delta f$ and $\Delta s$ but also with $\Delta T$, the time duration of the data sample.
Recognizing now that the time duration of
our data sample is given by $\Delta T = n\,\Delta t$, we obtain the inequality
\[ 1 \;\leq\; \Delta T \, \Delta f \, \Delta s \tag{48} \]
The inequality will be further interpreted and rederived from a somewhat
different point of view in the next section.
In time-series analysis we have the concept of coherency, which is analogous
to the concept of correlation defined in Sec. 4-2.
There we had, for two random variables $x$ and $y$, the correlation
\[ c \;=\; \frac{E(x\,y)}{\sqrt{E(x^2)\,E(y^2)}} \]
Now if $x_t$ and $y_t$ are time series,
they may have a relationship
between them which depends on time delay, scaling, or even filtering.
For example, perhaps $Y(Z) = F(Z)\,X(Z) + N(Z)$, where $F(Z)$ is a filter and
$n_t$ is unrelated noise.
The generalization of the correlation concept is to define coherency by
\[ C \;=\; \frac{E\!\left[X\!\left(\tfrac{1}{Z}\right) Y(Z)\right]}{\sqrt{E\!\left[X\!\left(\tfrac{1}{Z}\right) X(Z)\right]\, E\!\left[Y\!\left(\tfrac{1}{Z}\right) Y(Z)\right]}} \]
Correlation is a real scalar.
Coherency is complex and expresses the frequency dependence of correlation.
In forming an estimate of coherency it is
always essential to simulate some ensemble averaging.
Note that if the ensemble averaging were to be omitted,
the coherency (squared) calculation would give
\[ |C|^2 \;=\; \frac{\left[X\!\left(\tfrac{1}{Z}\right) Y(Z)\right]\left[Y\!\left(\tfrac{1}{Z}\right) X(Z)\right]}{\left[X\!\left(\tfrac{1}{Z}\right) X(Z)\right]\left[Y\!\left(\tfrac{1}{Z}\right) Y(Z)\right]} \;=\; 1 \]
which states that the coherency squared is +1 independent of the data.
Because correlation scatters away from zero we find that coherency squared is
biased away from zero.
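To see the bias concretely, here is a sketch (my own, assuming NumPy; the ensemble is simulated by splitting into $k$ segments as in method (1)) that estimates coherency squared for two completely unrelated noise series:

```python
import numpy as np

rng = np.random.default_rng(5)

def coherency_squared(x, y, k):
    """|E[X conj(Y)]|^2 / (E[|X|^2] E[|Y|^2]), the ensemble simulated by k segments."""
    m = len(x) // k
    X = np.fft.rfft(x[:k * m].reshape(k, m), axis=1)
    Y = np.fft.rfft(y[:k * m].reshape(k, m), axis=1)
    Sxy = (X * np.conj(Y)).mean(axis=0)
    Sxx = (X * np.conj(X)).real.mean(axis=0)
    Syy = (Y * np.conj(Y)).real.mean(axis=0)
    return abs(Sxy) ** 2 / (Sxx * Syy)

n = 4096
x = rng.standard_normal(n)
y = rng.standard_normal(n)                        # unrelated to x

print(coherency_squared(x, y, k=1)[1:-1].mean())  # ~1.0: no averaging, always +1
print(coherency_squared(x, y, k=32)[1:-1].mean()) # ~1/32: small, but still biased above the true value 0
```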
Figure 5: Model of random time series generation.