
The Variational Principle

Given a discrete (possibly complex) time series $ \left\{X_1,\ldots, X_N\right\}$ of $ N$ values with sampling interval $ \Delta t$ (and Nyquist frequency $ W = 1/(2\Delta t)$), we wish to compute an estimate of the power spectrum $ P(f)$, where $ f$ is the frequency. It is well known that

$\displaystyle P(f) = \lim_{N\to\infty} \frac{1}{N} \left\vert \sum_{n=1}^N X_n \exp{(-i2\pi fn\Delta t)}\right\vert^2 = \sum_{n=-\infty}^\infty R_n \exp{(i2\pi fn\Delta t)},$ (A-1)

where the autocorrelation function $ R$ is defined by (for $ n \ge 0$)

$\displaystyle R_n = \lim_{N\to\infty} \frac{1}{N} \sum_{i=1}^{N-n} X_i^*X_{i+n} = R_{-n}^*.$ (A-2)
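For concreteness, here is a minimal numerical sketch of the biased lag estimator implied by (2), with the $ N\to\infty$ limit dropped; the function name and the NumPy implementation are illustrative, not part of the original analysis.

import numpy as np

def autocorr_estimate(X, n):
    """Biased lag-n estimate of (2): R_n ~ (1/N) sum_i X_i^* X_{i+n}."""
    X = np.asarray(X)
    N = len(X)
    return np.sum(np.conj(X[:N - n]) * X[n:]) / N

The negative lags follow from the symmetry in (2): $ R_{-n} = R_n^*$.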

Now suppose that we use the finite sequence $ \left\{X_i\right\}$ to estimate the first $ M$ autocorrelation values $ R_0, \ldots, R_{M-1}$. (Methods of obtaining these estimates are discussed in the section on Computing the Prediction Error Filter.) Burg has shown that maximizing the average entropy (see Appendix A for a derivation)

$\displaystyle h = \frac{1}{4W}\int_{-W}^W \ln\left[2WP(f)\right] df,$ (A-3)

subject to the constraint that (1) holds for the known values $ R_0, \ldots, R_{M-1}$, is equivalent to extrapolating the autocorrelation $ R_n$ for $ \vert n\vert \ge M$ in the most random possible manner.

Carrying out the variation, we find that

$\displaystyle \frac{\delta h}{\delta R_n} = \frac{1}{4W} \int_{-W}^W P^{-1}(f) \exp{(i2\pi fn\Delta t)}\, df = \left\{ \begin{array}{l} \lambda_n \qquad\hbox{for}\quad \vert n\vert < M, \cr 0 \qquad\hbox{for}\quad \vert n\vert \ge M. \end{array} \right.$ (A-4)

The $ \lambda$'s are Lagrange multipliers to be determined. That the variation of $ h$ with respect to $ R_n$ for $ \vert n\vert \ge M$ should be zero is the essence of the variational principle. The value of $ h$ is then stationary with respect to changes in the $ R_n$'s, which are unknown. We can infer from Equation (4) that

$\displaystyle P^{-1}(f) = \sum_{n= - (M-1)}^{M-1} \lambda_n\exp{(-i2\pi fn\Delta t)}.$ (A-5)

Introducing the $ Z$-transform variable $ Z = \exp{(-i2\pi f\Delta t)}$, we can write Equation (5) as a polynomial in the complex parameter $ Z$:

$\displaystyle P^{-1}(f) = \sum \lambda_n Z^n.$ (A-6)

Since $ P$ is necessarily real and nonnegative, Equation (6) can be uniquely factored as

$\displaystyle P^{-1}(f) = 2WE_M^{-1}\left[\sum_m a_mZ^m\right]\left[\sum_n a_n^* Z^{-n}\right] = 2WE_M^{-1}\left\vert\sum_n a_nZ^n\right\vert^2,$ (A-7)

with $ a_0 = 1$. The first sum in (7) has all of its zeroes outside the unit circle (minimum phase) and the second sum has its zeroes inside the unit circle (maximum phase).
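As a bookkeeping check on (7), the spectrum can be evaluated directly from the filter coefficients. The sketch below assumes $ 2W = 1/\Delta t$ and uses hypothetical names; it is an illustration, not the author's implementation.

import numpy as np

def mesa_spectrum(a, E_M, dt, freqs):
    """Evaluate P(f) = E_M * dt / |sum_m a_m Z^m|^2, i.e. (7) inverted,
    with 2W = 1/dt and Z = exp(-i 2 pi f dt)."""
    Z = np.exp(-2j * np.pi * np.asarray(freqs) * dt)
    A = np.polynomial.polynomial.polyval(Z, a)  # sum_m a_m Z^m
    return E_M * dt / np.abs(A) ** 2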

Fourier transforming Equation (1), we find that

$\displaystyle R_n = \int_{-W}^W P(f) \exp{(-i2\pi fn\Delta t)} df.$ (A-8)

Substituting (7) into (8) and changing the integration variable from $ f$ to $ Z$, we find that $ R_n$ is given by the contour (complex) integral

$\displaystyle R_n = \frac{E_M}{2\pi i}\oint_{\vert Z\vert=1} \frac{Z^{n-1}}{\vert\sum a_m Z^m\vert^2} dZ.$ (A-9)

The integrand of (9) can have simple poles inside the contour of integration at $ Z = 0$ and at any zero of the maximum phase factor. The poles for $ Z \ne 0$ can be eliminated by taking a linear combination of Equation (9) ``for various values of $ n$.'' Using the Cauchy integral theorem, we find that

$\displaystyle \sum_j a_j^* R_{n-j} = \frac{E_M}{2\pi i}\oint_{\vert Z\vert=1} \frac{Z^{n-1}}{\sum_m a_m Z^m}\, dZ = \left\{ \begin{array}{l} E_M \qquad \hbox{for} \quad n = 0,\cr 0 \qquad \hbox{for} \quad n > 0, \end{array} \right.$ (A-10)

since $ a_0 = 1$. Equation (10) and its complex conjugate for the $ a_m$ are exactly the standard equations for the maximum and minimum phase spike deconvolution operators $ \left\{a_m^*\right\}$ and $ \left\{a_m\right\}$, respectively.
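Note that (10) for $ n > 0$ is precisely the recursion that extrapolates the autocorrelation beyond the known lags, making the ``most random'' extrapolation concrete. A minimal sketch, assuming the filter $ \left\{a_m\right\}$ (with $ a_0 = 1$) has already been found, e.g. by solving the Toeplitz system discussed below:

import numpy as np

def extrapolate_autocorr(R, a, n_extra):
    """Extend R_0..R_{M-1} using (10) for n > 0:
    R_n = -sum_{j=1}^{M-1} conj(a_j) * R_{n-j}."""
    R = list(R)
    for n in range(len(R), len(R) + n_extra):
        R.append(-sum(np.conj(a[j]) * R[n - j] for j in range(1, len(a))))
    return np.array(R)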

Notice that, if we define the $ N\times N$ matrix $ T_{N-1}$ as the equidiagonal matrix of autocorrelation values whose elements are given by

$\displaystyle \left[T_{N-1}\right]_{ij} \equiv R_{i-j},$ (A-11)

then Equation (10), with $ N = M$, may be seen as a problem of inverting the matrix $ T$ to find the vector $ \left\{a_{N-1}^*,\ldots,1\right\}$. Equation (10) can be solved using the well-known Levinson algorithm for inverting a Toeplitz matrix. Therefore, a power spectral estimate can be computed by using (10) to find the $ a_n$'s, and (7) to compute the spectrum.
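As an illustration (not the author's code), the system (10) can be solved with SciPy's Levinson-type Toeplitz solver; the helper name is ours.

import numpy as np
from scipy.linalg import solve_toeplitz

def mesa_filter(R):
    """Solve (10), sum_j a_j^* R_{n-j} = E_M delta_{n0}, for the filter
    {a_m} (a_0 = 1) and the prediction error power E_M, given the
    estimated lags R_0 .. R_{M-1}."""
    R = np.asarray(R)
    rhs = np.zeros(len(R)); rhs[0] = 1.0
    # The Hermitian Toeplitz matrix T has first column R_0..R_{M-1}
    # and first row R_0, R_{-1} = R_1^*, ...
    x = solve_toeplitz((R, np.conj(R)), rhs)   # x = {a_j^*} / E_M
    E_M = 1.0 / x[0].real                      # a_0^* = 1 fixes the scale
    return np.conj(x) * E_M, E_M

A full estimate then chains mesa_filter with the spectrum evaluation sketched after (7).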

One gap in the analysis should be filled before we proceed. That the variational principle is a stationary principle (i.e., $ \delta h = 0$) is obvious. That it is truly a maximum principle, however, requires some proof. First note that the average entropy $ h$ computed by substituting (7) into (3) is exactly

$\displaystyle h = \frac{1}{2}\ln E_M.$ (A-12)

This fact can be proven by writing (3) as

\begin{displaymath}\begin{array}{rl} 2h = \ln E_M & + \frac{M-1}{2\pi i}\oint \ln Z\, \frac{dZ}{Z} - \frac{1}{2\pi i}\oint \ln\left(\sum_n a_n Z^n\right) \frac{dZ}{Z} \cr & - \frac{1}{2\pi i}\oint \ln\left(\sum_n a_n^* Z^{M-n-1}\right) \frac{dZ}{Z}. \end{array}\end{displaymath} (A-13)

The first integral in (13) vanishes identically as is shown in Appendix B. The second integral vanishes because its argument is analytic for all $ \vert Z\vert < 1$ except for $ Z = 0$, and the residue there is $ \ln a_0 = 0$. The third integral can be rewritten as

$\displaystyle \frac{1}{2\pi i}\oint \ln\left(\sum a_n^* Z^{M-n-1}\right) \frac{dZ}{Z} = \ln a_0^* + \sum_{n=1}^{M-1} \frac{1}{2\pi i}\oint\ln\left(Z - Z_n\right)\frac{dZ}{Z},$ (A-14)

where the $ Z_n$'s are the $ M-1$ zeroes of the maximum phase factor (all with $ \vert Z_n\vert < 1$). The term $ \ln a_0^*$ vanishes since $ a_0 = 1$, and each of the integrals on the right side of (14) vanishes because of the identities proven in Appendix B.
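The identity (12) is also easy to verify numerically. The following sketch integrates (3) for an assumed two-coefficient minimum phase filter and compares the result with $ \frac{1}{2}\ln E_M$; all specific numbers here are illustrative.

import numpy as np

dt = 1.0
W = 0.5 / dt                          # Nyquist frequency
a, E_M = np.array([1.0, -0.5]), 2.0   # assumed filter (a_0 = 1) and error power
f = np.linspace(-W, W, 20001)
Z = np.exp(-2j * np.pi * f * dt)
P = E_M * dt / np.abs(np.polynomial.polynomial.polyval(Z, a)) ** 2
h = 0.5 * np.mean(np.log(2 * W * P))  # (1/4W) * integral over [-W, W]
print(h, 0.5 * np.log(E_M))           # both ~ 0.3466 = 0.5 ln 2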

For small deviations from the constraining values of $ R_n$, and from the values of $ R_n$ computed from (8) once $ P_M(f)$ is known, we can expand $ h$ in a Taylor series:

$\displaystyle h = \frac{1}{2}\ln E_M + \sum_{n = -(M-1)}^{M-1}\lambda_n r_n - \sum_{m,n = -\infty}^\infty H_{mn} r_m r_n^*.$ (A-15)

The $ r_n$'s are small deviations in the $ R_n$'s. The $ \lambda_n$'s are defined by (4). The matrix elements of $ H$ are given by

$\displaystyle H_{mn} = - \frac{\delta^2 h}{\delta R_m\delta R_n^*} = \frac{1}{4W} \int_{-W}^W \frac{Z^{n-m}}{P^2(f)} df,$ (A-16)

with $ Z = \exp{(-i2\pi f\Delta t)}$. $ H$ is obviously Hermitian and is seen to be positive definite because

$\displaystyle \sum_{mn} H_{mn}v_mv_n^* = \frac{1}{4W}\int_{-W}^W \frac{\vert\sum v_mZ^{-m}\vert^2}{P^2(f)} df \ge 0,$ (A-17)

where $ \left\{v_n\right\}$ is an arbitrary complex vector and the equality in (17) holds only when $ \left\{v_n\right\}$ is identically zero.

The result (17) is sufficient to prove that $ h$ is not only stationary, but actually a maximum.
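This claim can be checked numerically as well. The sketch below builds $ H_{mn}$ by quadrature for a small range of lags and an assumed spectrum, then confirms the Hermitian and positive definite properties; the names and numbers are illustrative.

import numpy as np

dt = 1.0
W = 0.5 / dt
f = np.linspace(-W, W, 4001)
Z = np.exp(-2j * np.pi * f * dt)
P = 2.0 * dt / np.abs(1.0 - 0.5 * Z) ** 2   # assumed MESA spectrum
lags = np.arange(-3, 4)
# H_mn = (1/4W) * integral of Z^(n-m) / P^2 df, approximated by a mean
H = np.array([[0.5 * np.mean(Z ** (n - m) / P ** 2) for n in lags]
              for m in lags])
print(np.allclose(H, H.conj().T))           # Hermitian
print(np.linalg.eigvalsh(H).min() > 0)      # positive definite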

The analysis given in this section has at least two weak points: (a) For real data, we never measure the autocorrelation function directly. Rather, a finite time series is obtained and an autocorrelation estimate is computed. Given the autocorrelation estimate, an estimate of the minimum phase operator must then be inferred. A discussion of various estimates of the autocorrelation is given in the next section on Computing the Prediction Error Filter, along with a method of estimating the prediction error filter without computing an autocorrelation estimate. (b) Even assuming we could compute the ``best'' estimate of the autocorrelation, that estimate is still subject to random error. The probability of error increases as we compute values of $ R_n$ with greater lag $ n$. Since there is a one-to-one correspondence between the $ R_n$'s and the $ a_n$'s, the length of the operator can strongly affect the accuracy of the estimated MESA power spectrum. A method of estimating the optimum operator length for a given sample length $ N$ will be discussed in the subsequent section on Choosing the Operator Length.

