%+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ %+ + %+ NOTICE: This article is also available formatted with illustrations. + %+ + %+ SEE: http://sepwww.stanford.edu/oldreports/sep61/ + %+ + %+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ \part{THEORY} \chapter{Spectrum and phase} \setcounter{chapter}{\SPEC} \par In this chapter we will examine \begin{itemize} \item $90^{\circ}$ phase shift, analytic signal, and Hilbert transform \item spectral factorization, i.e.~finding a minimum phase wavelet to fit any spectrum \item a ``cookbook'' for Butterworth causal bandpass filters \item phase delay, group delay, and beating \item where the name ``minimum phase'' came from \item what minimum phase means about energy delay \end{itemize} \section{HILBERT TRANSFORM} Chapter \FT\ explains that many plots in this book have various interpretations. Superficially, the plot pairs represent cosine transforms of real even functions. But since the functions are even, their negative halves are not shown. An alternate interpretation of the plot pairs is that one signal is real and causal. This is illustrated in full detail in Figure~\ref{spec-intro}. \sideplot{spec-intro}{2.96in}{\FIGS/spec/intro}{ (spec-intro) Both positive and negative times and frequencies of a real causal response (top) and real (mid) and imaginary (bottom) parts of its FT. } Half of the values in Figure~\ref{spec-intro} convey no information. These are the zero values at negative time, and the negative frequencies of the FT. In other words, the right half of Figure~\ref{spec-intro} is redundant, so it is generally not shown. Likewise the bottom plot, the imaginary part, is generally not shown because it is derivable in a simple way from the given information. Computation of the unseen imaginary part is called the {\it Hilbert transform}. Here we will investigate details and applications of the Hilbert transform. There are surprisingly many, including $90^{\circ}$ phase-shift filtering, envelope functions, the instantaneous frequency function, and relating amplitude spectra to phase spectra. \par Ordinarily a function is specified entirely in the time domain or entirely in the frequency domain. The Fourier transform then finds it in the other domain. In a mathematical view, the Hilbert transform arises when half the information is in the time domain and the other half in the frequency domain. (Algebraically speaking, any fractional part could be given in either domain.) \subsection{A $Z$-transform view of Hilbert transformation} Let $x_t$ be an even function of $t$. Recalling that $Z^{-n}+Z^n=2 \cos\omega n$, we have \begin{eqnarray} X(Z) &= & \cdots + x_2 Z^{-2} + x_1 Z^{-1} + x_0 + x_1 Z + x_2 Z^2 + \cdots \\ X(Z) &= & x_0 + 2x_1 \cos \omega + 2x_2 \cos 2\omega + \cdots \label{spec.cos} \end{eqnarray} Now make up a new function $Y(Z)$ by replacing cosine by sine in (\ref{spec.cos}). \begin{equation} Y(Z)\eq 2x_1 \sin\omega + 2x_2 \sin 2\omega + \cdots \label{spec.sine} \end{equation} Recalling that $Z=\cos\omega +i\sin\omega$ we see that all the negative powers of $Z$ cancel from $X(Z)+iY(Z)$, giving a causal $C(Z)$. \begin{equation} C(Z) \eq {1\over 2} [X(Z)+iY(Z)] \eq {1\over 2} x_0 + x_1 Z + x_2 Z^2 + \cdots \end{equation} So for plot pairs the causal response is $c_t$, the real part of the FT is equation~(\ref{spec.cos}), and the imaginary part not usually shown is given by equation~(\ref{spec.sine}).
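\par The construction above is easy to check numerically. Below is a small numpy sketch (mine, not part of the program library of this book; all names are illustrative) that builds $X$ and $Y$ for the even sequence $x_0=3$, $x_1=2$, $x_2=1$ and confirms that $C=(X+iY)/2$ returns to the time domain as a causal sequence. With the convention $Z=e^{i\omega}$ used here, the needed inverse transform coincides with numpy's forward {\tt fft} up to the factor $1/n$.
\begin{verbatim}
import numpy as np

n = 8               # even x_t in wraparound order: x_0=3, x_1=2, x_2=1
x = np.array([3., 2., 1., 0., 0., 0., 1., 2.])
omega = 2*np.pi*np.arange(n)/n

X = np.fft.fft(x).real                        # cosine series (spec.cos)
Y = 2*2*np.sin(omega) + 2*1*np.sin(2*omega)   # sine series (spec.sine)

C = 0.5*(X + 1j*Y)            # C(Z) = (X + iY)/2 on the unit circle
c = np.fft.fft(C).real/n      # back to time with Z = exp(i omega)
print(np.round(c, 6))         # [1.5 2. 1. 0. 0. 0. 0. 0.] -- causal
\end{verbatim}
The printed sequence is $({1\over 2}x_0,\ x_1,\ x_2,\ 0,\ldots)$, exactly the coefficients of $C(Z)$ above.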
\subsection{The quadrature filter} \par We had a causal response and we switched cosines and sines in the frequency domain. Here we do it again except that the time and frequency domains are interchanged. The result has a more physical interpretation. \par A filter that converts sines into cosines is called a $90^\circ$ phase shift filter or a {\it quadrature} filter. More specifically, if the input is $\cos (\omega t +\phi_1)$, then the output should be $\cos(\omega t + \phi_1 - \pi /2)$. An example is shown in Figure~\ref{spec-hilb0}. \sideplot{spec-hilb0}{2.0in}{\FIGS/spec/hilb0}{ (spec-hilb0) Input (top) filtered with quadrature filter yields phase shifted signal (bottom). } Let $U(Z)$ denote the $Z$-transform of a real input signal and $Q(Z)$ a quadrature filter; then the output signal is \begin{equation} V(Z) \eq Q(Z) \ U(Z) \label{spec.VQU} \end{equation} \par Let us find the numerical values of $q_t$. The time-derivative operation has the $90^\circ$ phase-shifting property we seek. The trouble with a differentiator is that higher frequencies are amplified with respect to lower frequencies. Recall the FT and take its time derivative. \begin{eqnarray} b(t) & \eq & \int B(\omega) e^{-i\omega t} d\omega \\ \frac{db}{dt} & \eq & \int -i\omega B(\omega) e^{-i\omega t} d\omega \end{eqnarray} Thus we see that time differentiation corresponds to the weight factor $-i\omega$ in the frequency domain. The weight $-i\omega$ has the proper phase but the wrong amplitude. The desired weight factor is \begin{equation} Q(\omega) = -i\omega/ |\omega| = -i \, {\rm sgn}\,\omega \label{spec.Qdef} \end{equation} where ``sgn'' is called the ``signum'' or ``sign'' function. Let us transform $Q(\omega)$ into the domain of sampled time $t=n$. \begin{eqnarray} q_n & = & \frac{1}{2\pi} \int_{-\pi}^{\pi} Q(\omega) e^{-i\omega n} d\omega \\ & = & \frac{i}{2\pi} \int_{-\pi}^{0} e^{-i\omega n} d\omega - \frac{i}{2\pi} \int_{0}^{\pi} e^{-i\omega n} d\omega \nonumber \\ & = & \frac{i}{2\pi} \left( \frac{e^{-i\omega n}}{-in} \Big|_{-\pi}^{0} - \frac{e^{-i\omega n}}{-in} \Big|_{0}^{\pi} \right) \nonumber \\ & = & \frac{1}{2\pi n}(-1+e^{+in\pi} + e^{-in\pi} -1) \nonumber \\ & = & \left \{ \begin{array}{ll} 0 & \mbox{for $n$ even} \\ {-2 \over \pi n} & \mbox{for $n$ odd} \end{array} \right. \end{eqnarray} Examples of filtering with $q_n$ are shown in Figures~\ref{spec-hilb0} and \ref{spec-hilb}. \par Since $q_n$ does not vanish for negative $n$, the quadrature filter is nonrealizable (it requires future inputs to create its present output). If the discussion were in continuous time rather than sampled time, the filter would be of the form $1/t$, a function that has a singularity at $t=0$ and whose integral over positive $t$ is divergent. Convolution with the filter coefficients $q_n$ is therefore painful because the infinite sequence drops off slowly. Convolution with the filter $q_t$ is called {\em Hilbert transformation}. \plot{spec-hilb}{1.7in}{\FIGS/spec/hilb} { (spec-hilb) A Hilbert transform pair. } \subsection{The analytic signal} \par The so-called {\it analytic signal} can be constructed from a real-valued time series $u_t$ and a $90^\circ$ phase-shifted version of itself, i.e., $v_t$ from equation~(\ref{spec.VQU}). The analytic signal is $g_t$ where \begin{equation} G(Z) \eq U(Z) + i V(Z) \eq [1+ iQ(Z)] \ U(Z) \end{equation} In the time domain the filter $ [1+ iQ(Z)]$ is $\delta_t+iq_t$ where $\delta_t$ is an impulse function at time $t=0$. The filter $1+iQ(\omega) = 1 +\omega/ |\omega|$ vanishes for negative $\omega$.
Thus it is a real {\it step} function in the {\it frequency} domain. \par We can guess where the name ``analytic signal'' comes from if we think back to $Z$-transforms and causal functions. The $Z$-transforms of causal functions are free of poles inside the unit circle, so they are ``analytic'' there. Their causality is the Fourier dual to the one-sidedness we see in the frequency domain here. So, an ``analytic'' function is one that in the dual domain is one-sided. \subsection{Instantaneous envelope} The quadrature filter is often used to make the envelope of a signal. The envelope signal may be defined by $e_t= \sqrt{u_t^2 + v_t^2}$. Alternatively, with the analytic signal $g_t=u_t+iv_t$, the squared envelope is $e_t^2=g_t \bar{g}_t$. \par A quick way to accomplish the $90^\circ$ phase shift operation is to use Fourier transformation. Begin with $u_t+i\cdot 0$ and transform it to the frequency domain. Then multiply by the step function. Finally, inverse transform to get $g_t=u_t+iv_t$, which is equivalent to $(\delta_t+iq_t)*u_t$. \plot{spec-envelope}{2.82in}{\FIGS/spec/envelope} { (spec-envelope) Left is a field profile. Middle is the unsmoothed envelope function. Right is the smoothed envelope. The vertical axis is time and the horizontal axis is space. Independent time-domain calculations are done at each point in space. } \par Sinusoids have smooth envelope functions, but that doesn't mean real seismograms do. Figure~\ref{spec-envelope} shows an example of a field profile and unsmoothed and smoothed envelopes. Before smoothing, the stepout (alignment) of the reflections is quite clear. In the practical world, alignment is considered to be a manifestation of phase. In the theoretical world, an envelope is a smooth function such as might be used to scale data without altering its phase. Hence the reason for smoothing the envelope. \par If you are interested in wave propagation you might recognize the possibility of using analytic signals. Energy stored as potential energy is $90^\circ$ out of phase with kinetic energy, so $u_t$ might represent scaled pressure while $v_t$ represents scaled velocity. Then $\bar g_t g_t$ is the instantaneous energy. (The scales are the square root of compressibility and the square root of density.) \subsection{Instantaneous frequency} \par The phase $\phi_t$ of a complex function of time $g_t = u_t+iv_t$ is defined by $\phi_t=\arctan (v_t/u_t)$. The {\it instantaneous frequency} is $d\phi/dt$. Before forming the derivative, recall the definition of the complex logarithm of $g$ \begin{equation} \begin{array}{rcl} g & = & r e^{i\phi} \\ \ln g & = & \ln |r| + \ln e^{i\phi} \\ & = & \ln |r| + i\phi \end{array} \label{spec.cxlog} \end{equation} So $\phi =\Im \ln g $. The instantaneous frequency is \begin{equation} \omega_{\rm instantaneous} \eq \frac{d\phi}{dt} \eq \Im \frac{d\;}{dt} \ln g(t) \eq \Im \frac{1}{g} \frac{dg}{dt} \label{spec.instant} \end{equation} For a signal that is a pure sinusoid, such as $g(t) = g_0 e^{i\omega t}$, equation~(\ref{spec.instant}) clearly gives the right answer. When a mixture of frequencies is simultaneously present, we can hope that (\ref{spec.instant}) gives a sensible average. \par Trouble can arise in (\ref{spec.instant}) when the denominator $g$ gets small, which happens whenever the envelope of the signal gets small. This difficulty can be overcome by careful averaging. Rationalize the denominator by multiplying by the conjugate signal, and then smooth locally a little (as indicated by the summation sign below).
\begin{equation} \hat \omega_{\rm smoothed} \eq \Im \ { { \sum \ \bar g(t)\ { d \over dt } g(t) } \over { \sum \ \bar g(t)\ g(t)} } \label{spec.smoofreq} \end{equation} (Those of you who have had a class in quantum mechanics may recognize the notion of the ``expectation of an operator.'' You also see why the probability wavefunction of quantum physics must be complex valued---it is a consequence of the analytic signal eliminating negative frequencies from the average. If the negative frequencies were not eliminated, then the average frequency would be zero.) \par What range of times should be smoothed in equation~(\ref{spec.smoofreq})? Besides the nature of the data, the appropriate smoothing depends on the method of representing $d \over dt$. To prepare a figure, I implemented $d \over dt$ by multiplication by $-i\omega$. (This has the advantage of more accuracy than finite differences at high frequencies, but the disadvantage that the discontinuity in slope at the Nyquist frequency gives an extended transient in the time domain.) The result is shown in Figure~\ref{spec-node}. \plot{spec-node}{3.3in}{\FIGS/spec/node}{ (spec-node) A sum of three sinusoids (top), unsmoothed instantaneous frequency (middle), and smoothed instantaneous frequency (bottom). } Inspection of Figure~\ref{spec-node} shows that smoothing is even more necessary for instantaneous frequency than for envelopes, and this is not surprising because the presence of $d \over dt$ makes it rougher. Particularly notice times in the range 400-512 where the sinusoids are truncated. There the unsmoothed instantaneous frequency becomes a large rapid oscillation near the Nyquist frequency. This roughness is nicely controlled by $(1,2,1)$ smoothing. \par It is also gratifying to see that a spike added to the sinusoids (at point 243) makes a burst of high frequency, and curious to notice the behavior where an oscillation approaches the axis and then turns away just before or just after crossing the axis. \par An example of instantaneous frequency applied to field data is shown below in Figure~\ref{spec-frequency}. \plot{spec-frequency}{2.82in}{\FIGS/spec/frequency} { (spec-frequency) A field profile (left), instantaneous frequency smoothed only with (1,2,1) (middle) and smoothed more heavily (right). } Although the instantaneous frequency formalism is appealing, it has not found routine use in practice. Part of the reason might be that much of the literature does not seem to include the stabilized form (\ref{spec.smoofreq}). A more direct reason is that instantaneous frequency is not a part of the standard algorithms of earth imaging. \par The instantaneous frequency idea can also be applied to the space axis. This will be more natural to readers familiar with the methodology of imaging and migration. Instead of temporal frequency $\omega = d\phi / dt$ we compute the spatial frequency $k_x =d\phi / dx$. Figure~\ref{spec-kx} shows an example. \plot{spec-kx}{2.82in}{\FIGS/spec/kx} { (spec-kx) A field profile (left), unsmoothed instantaneous frequency (center) and instantaneous spatial frequency $k_x$ (right). } After you study the example, you might be led to suggest that the results are not especially smooth and perhaps the smoothing implied by the summations in equation~(\ref{spec.smoofreq}) should be over a window in both $x$ and $t$. \exer \item Let $c_t$ be a {\it complex} causal function of time. How does $X(Z)$ change in equation (\ref{spec.cos}), and how must $Y(Z)$ in equation (\ref{spec.sine}) be deduced from $X(Z)$?
\item Figure~\ref{spec-hilb} shows a Hilbert transform pair, the real and imaginary parts of the Fourier transform of a causal response. Describe the causal response. \item Given $Y(Z)=Q(Z)X(Z)$ prove that the envelope of $y_t$ is the same as the envelope of $x_t$. \item Using partial fractions, convolve the waveform \[ (2/\pi)(\ldots,-\frac{1}{5},0,-\frac{1}{3},0,-1,0,1,0,\frac{1}{3}, 0,\frac{1}{5}, \ldots) \] with itself. What is the interpretation of the fact that the result is $(\ldots,0,0,-1,0,0,\ldots)$? ({\sc hint:} $\pi^2/8=1+ \frac{1}{9}+\frac{1}{25}+\frac{1}{49}+\ldots$). \item Using the fast Fourier transform matrix, the quadrature filter $Q(\omega)$ may be represented by the column vector \[ -i(0,1,1,1,\ldots,0,-1,-1,-1,\ldots,-1)^{\rm T} \] Multiply this into the inverse transform matrix to show that the transform is proportional to $(\cos \pi k /N)/(\sin \pi k/N)$. What is the scale factor? Sketch it for $k \ll N$ indicating the limit $N \rightarrow \infty$. ({\sc hint:} $1+x+x^2+\ldots+x^N=(1-x^{N+1})/(1-x)$). \exerend \section{SPECTRAL FACTORIZATION} A problem that arises in a variety of physical contexts is this: Given a spectrum, find a minimum phase wavelet that has that spectrum. We will see how to make this wavelet and see that it is unique (except for a trivial aspect: the negative of any wavelet has the same spectrum as the wavelet, and more generally any wavelet can be multiplied by any complex number of unit magnitude, such as $\pm i$). \par First consider the simpler problem where the wavelet need not be causal. You can easily find a symmetric wavelet with any spectrum. Just take the spectrum (which by definition is an energy or power) and take its square root, getting what is called the {\it amplitude} spectrum. Inverse transform that to the time domain and you have a symmetric wavelet with the desired spectrum. \par This section solves the {\it spectral factorization} problem, namely, finding a minimum phase wavelet to go along with any given spectrum. Thus, given any wavelet it shows how to find another wavelet that is minimum phase and has the same spectrum as the given wavelet. \par The prediction-error wavelet in chapter~\TSA\ is theoretically obtainable by spectral factorization of an inverse spectrum. The Kolmogoroff method of spectral factorization that we'll be looking at here is much faster than the time-domain least-squares methods considered in chapters~\TSA\ and \TOE, which motivates its widespread practical use. \plot{spec-mpsamples}{1.77in}{\FIGS/spec/mpsamples} { (spec-mpsamples) Left are given wavelets and right are minimum phase equivalents. } \par Some simple examples are shown in Figure~\ref{spec-mpsamples}. For all but the 4th signal, the shape of the minimum phase wavelet clearly matches that of the input: wavelets are shifted to $t=0$ and turned backwards. In the 4th case the wave shape changes into a big pulse at zero lag. As the Robinson theorem on page~\pageref{'Robinson'} suggests, minimum-phase wavelets tend to decay rapidly after a strong onset. I imagined that hand-drawn wavelets with a strong onset would rarely turn out to be perfectly minimum phase, but when I tried it, I was surprised at how easy it seemed to be to draw a minimum-phase wavelet. This is shown on the bottom of Figure~\ref{spec-mpsamples}. \par To begin understanding spectral factorization notice that the polar form of any complex number puts the phase into the exponential, i.e., $x+iy = |r| e^{i \phi} = e^{\ln |r| + i \phi}$.
So we look first into the behavior of exponentials and logarithms of Fourier transforms. \subsection{Exponential of a causal is causal} Begin with a causal response $c_t$ and its associated $C(Z)$. The $Z$-transform $C(Z)$ could be evaluated giving a complex value for each real $\omega$. This complex value could be exponentiated to get another value, say \begin{equation} B(Z(\omega )) \eq e^{C(Z(\omega ))} \label{spec.exp} \end{equation} Next we could inverse transform $B(Z(\omega ))$ back to $b_t$. We will prove the amazing fact that $b_t$ must be causal too. \par First notice that if $C(Z)$ has no negative powers of $Z$, then $C(Z)^2$ doesn't either. Likewise for the third power or any positive integer power, or sum of positive integer powers. Now recall the basic power series definition of the exponential \begin{equation} e^x \eq 1 + x + {x^2 \over 2} + {x^3 \over 2 \cdot 3} + {x^4 \over 2 \cdot 3 \cdot 4} + {x^5 \over 2 \cdot 3 \cdot 4 \cdot 5} + \cdots \label{spec.exp.series} \end{equation} Applying this series to equation (\ref{spec.exp}) gives \label{'exp_causal'} \begin{equation} B(Z) \eq e^{C(Z)} \eq 1 + C(Z) + {C(Z)^2 \over 2} + {C(Z)^3 \over 2 \cdot 3} + {C(Z)^4 \over 2 \cdot 3 \cdot 4} + \cdots \label{spec.expc} \end{equation} Each term in the infinite series corresponds to a causal response, so the sum $b_t$ is causal. (If you have forgotten the series for the exponential function, then recall that the solution to $dy/dx=y$ is the definition of the exponential function $y(x)=e^x$ and the power series satisfies the differential equation term by term, so it must be the exponential too. The factorials in the denominators assure us that the power series always converges, i.e.~it is finite for any finite $x$.) \par Putting one polynomial into another or one infinite series into another is an onerous task, though it does lead to a wavelet that is exactly causal. In practice we do operations that are conceptually the same, but for speed we do them with discrete Fourier transforms. The disadvantage is periodicity, i.e.~negative times are represented computationally like negative frequencies: they are the last half of the elements of a vector, so there can be some blurring of late times into negative times. \plot{spec-eZ}{2.2in}{\FIGS/spec/eZ} { (spec-eZ) Exponentials. } \par Figure~\ref{spec-eZ} shows examples of equation (\ref{spec.expc}) for $C=Z$ and $C=4Z$. Unfortunately I don't have an analytic calculation to confirm the validity of that example. \todo{This suggests I should adapt the program for complex time series to look at examples like $e^{i\hat \omega t_0}$ and $\exp( (-i\omega)^\gamma t_0)$. Anyway, I should try $\exp( (-i\omega)^{1/2} t_0)$. } \subsection{Getting a causal wavelet from a prescribed spectrum} \par To find a causal wavelet from a prescribed spectrum, we will need to form the logarithm of the spectrum. Since a spectrum can easily vanish, and since the logarithm of zero is infinite, there is a pitfall. To prepare ourselves we first examine the log spectra shown in Figure~\ref{spec-logspec}. On the infinite domain the FT of a box function is a sinc whose zeros become minus infinities in the logarithm. On the discrete domain exact zeros may occur or not. The transform of a triangle is a sinc squared, but since this triangle was imperfectly drawn (by me), its transform does not go identically to zero. The sinc function drops off as $1/\omega$ and sinc squared drops off as $1/\omega^2$. We confirm this on the logarithm plot by sinc squared dropping off {\it twice} as much.
\plot{spec-logspec}{2.04in}{\FIGS/spec/logspec} { (spec-logspec) Log spectra of a box function and a triangle function. } \par Now for the task of going from a spectrum to a causal wavelet. Take as given the spectrum of the causal wavelet $B(Z)$. That means that we are not given $B(Z)$ itself, but we are given $S(Z) = \overline{B} (1/Z) B(Z)$. The spectrum may be given as autocorrelation coefficients or values on the unit circle. Thus \begin{equation} \overline{B}(1/Z) \, B(Z) = S(Z) = e^{\ln S(Z)} = e^{U(Z)} = e^{\overline{C}(1/Z) + C(Z)} = e^{\overline{C}(1/Z)} \, e^{C(Z)} \label{spec.prelog} \end{equation} Given the spectrum $S(Z)$ for each value on the unit circle we could deduce the log spectrum $U(Z) = \ln S(Z)$ at each point on the unit circle. \begin{equation} U(Z) \eq \ln [ S(Z) ] \eq {\overline{C}(1/Z) + C(Z)} \label{spec.log} \end{equation} This is the answer we have been looking for. Given $U(Z)$ for all real values of $\omega$ we could inverse transform it to the time domain, getting the two-sided function $u_t = \bar c_{-t} + c_t$. Setting to zero the coefficients at negative times eliminates $\bar c_{-t}$, leaving just $c_t$ hence $C(Z)$, and we already know that the exponential of $C(Z)$ gives $B(Z)$ with a causal $b_t$. This method is known as Kolmogoroff spectral factorization, named after the mathematician who first discovered it. \par The program {\tt mpwave()} below begins with a wavelet, forms its spectrum, and then calls {\tt kolmogoroff()} to factor the spectrum. The program {\tt kolmogoroff()} first takes the logarithm of the spectrum, then it returns to the time domain and sets to zero the noncausal part. It returns to frequency, exponentiates, and returns to the time domain with a wavelet that will be proven to be minimum phase. (Between the negative lags, which are set to zero, and the positive lags, which are left untouched, lie two points that are scaled by half.) \progblock{mpwave}{\PROG/mpwave.r} \progblock{kolmogoroff}{\PROG/kolmogoroff.r} \par The first test I tried on this program was the input wavelet $(1,2,0,0)$. The desired result is that the wavelet should time-reverse itself to $(2,1,0,0)$. The actual result was (1.9536, 1.0837, 0.0464, -0.0837). This result is imperfect because the four-point Fourier transform is a summation around the unit circle while theoretically an integration is called for. So, better results can be obtained by padding additional zeros after the input wavelet. Also, you might notice that the program is designed for complex-valued wavelets. As is typical of Fourier transforms with single-word precision, the imaginary parts were about $10^{-6}$ of the real parts instead of being precisely zero. \par Some examples of spectral factorization are shown in Figure~\ref{spec-example}. \plot{spec-example}{2.18in}{\FIGS/spec/example} { (spec-example) Examples of log spectra and their associated minimum-phase wavelets. } \subsection{Why the causal wavelet is minimum phase} \label{'Kolmog_MP'} Next we see why the causal wavelet $B(Z)$ we have made from the prescribed spectrum turns out to be minimum phase. First return to the original definition of minimum phase: a causal wavelet is minimum phase if and only if its inverse is causal. We have our wavelet in the form $B(Z)= e^{C(Z)}$. Consider another wavelet $A(Z) = e^{-C(Z)}$ constructed analogously. By the same reasoning, $a_t$ is also causal. Since $A(Z)B(Z)=1$ we have found a causal inverse wavelet. Thus $b_t$ is a minimum phase wavelet.
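\par The factorization steps are compact enough to restate in a few lines. Below is a numpy paraphrase of the algorithm (my own sketch, not the Ratfor program {\tt kolmogoroff()} itself; all names are illustrative). Run on the test wavelet $(1,2,0,0)$ it reproduces the four numbers quoted above.
\begin{verbatim}
import numpy as np

def kolmogoroff(S):
    # minimum-phase wavelet from spectrum samples S(omega_k) > 0
    n = len(S)
    u = np.fft.ifft(np.log(S))   # log spectrum -> two-sided u_t
    c = u.copy()
    c[0]       *= 0.5            # the two boundary points scaled by half
    c[n//2]    *= 0.5
    c[n//2+1:]  = 0.             # negative lags set to zero
    return np.fft.ifft(np.exp(np.fft.fft(c)))   # b_t from B = exp(C)

b = np.array([1., 2., 0., 0.])      # test wavelet
S = np.abs(np.fft.fft(b))**2        # its spectrum
print(np.round(kolmogoroff(S).real, 4))
# [ 1.9536  1.0837  0.0464 -0.0837]
\end{verbatim}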
\par Since the phase is a Fourier series it must be periodic; it cannot increase indefinitely with $\omega$ as it does for the nonminimum-phase wavelet (see Figure~\ref{spec-phase}). \subsection{Pathological examples} The spectral factorization algorithm fails when an infinity is encountered. This happens when the spectrum becomes zero so that its logarithm becomes minus infinity. This can happen in a benign way, for example with the spectrum of the wavelet $(1,1)$, where the infinity occurs at the Nyquist frequency. You could smooth the spectrum near the Nyquist before you take the logarithm. On the other hand, the pathology can be more extreme. With the wavelet $(1,1)$ convolved with itself $N$ times, the wavelet and its spectrum both tend to Gaussians. So, at the Nyquist frequency, smoothing would only replace zero by a very tiny number. \par Figure~\ref{spec-patho} shows functions whose spectrum contains zeros, along with their minimum phase equivalents. When the logarithm of zero arises during the computation it is replaced by the log of $10^{-30}$. It is surprising that the triangle suffered so much less than the other two functions. It seems that a minor imperfection in specifying the triangle resulted in a spectrum that did not have the theoretical zeros of sinc squared. \plot{spec-patho}{2.72in}{\FIGS/spec/patho} { (spec-patho) Functions whose spectrum contains zeros, along with their minimum phase equivalents, as computed by discrete Fourier transform. } \subsection{Relation of amplitude to phase} \par As we learned from equation (\ref{spec.log}), a minimum phase function is determined completely from its spectrum. Thus its phase is determinable from its spectrum. Likewise, we will see that except for a scale factor, the spectrum is determinable from the phase. \par It wasn't mentioned at the time, but spectral factorization implicitly uses Hilbert transformation. Somehow we generated a phase. To see how the phase arose, recall equations~(\ref{spec.prelog}) and (\ref{spec.log}). \begin{equation} S_k \ = \ e^{\ln S_k} \ = \ e^{U_k} \ = \ e^{ (U_k - i\Phi_k)/2 } \, e^{ (U_k + i\Phi_k)/2 } \ = \ e^{\overline{C}_k} e^{C_k} \ = \ \overline{B}_k B_k \label{spec.99} \end{equation} Where did $\Phi_k$ come from? We took $U_k + i 0$ to the time domain getting $u_t$. Then we multiplied $u_t$ by a real-valued step function of time. This multiplication in the time domain is what created the phase. It created the phase because multiplication in the time domain implies a convolution in the frequency domain. Recall the Fourier transform of a real-valued step function that arises with the Hilbert transform. Multiplying in time with a step means that in frequency $U_k$ has been convolved with $\delta_{k =0} +i \times (90^{\circ}$ phase-shift filter). So $U_k$ is unchanged, and a phase $\Phi_k$ has been generated. The explanation will be somewhat clearer if you review the $Z$-transform approach at the beginning of the chapter because there you can see both the frequency domain and the time domain in one expression. \par To illustrate different classes of discontinuity (pulse, step, and slope), Figure~\ref{spec-hilb2} shows another Hilbert transform pair. \plot{spec-hilb2}{0.90in}{\FIGS/spec/hilb2} { (spec-hilb2) A Hilbert transform pair. } \exer \item What is the meaning of {\it minimum-phase waveform} if the roles of time domain and frequency domain are interchanged? \item Show how to do the inverse Hilbert transform: given $\phi$, find $u$. What is the interpretation of the fact that you cannot get $u_0$?
\item Consider a model of a portion of the earth where $x$ is the north coordinate, $+z$ represents altitude above the earth, and magnetic bodies are distributed in the earth creating no magnetic field component in the east-west direction. One may show that the magnetic field $h$ above the earth is represented by $$\left[ \begin{array}{c} h_x (x, z) \\ h_z (x, z) \end{array} \right] = \int^{+\infty}_{-\infty} F(k) \, \left[ \begin{array}{c} -ik \\ |k| \end{array} \right] \, e^{ikx - |k|z} \, dk $$ Here $F(k)$ is some spatial frequency spectrum. \begin{description} \item[(a)] By using Fourier transforms, how does one compute $h_x(x, 0)$ from $h_z (x, 0)$ and vice versa? \item[(b)] Given $h_z(x, 0)$, how does one compute $h_z(x, z)$? \item[(c)] Notice that at $z = 0$ $$f(x) = h_z(x) + ih_x(x) = \int^{+\infty}_{-\infty} e^{ikx} F(k)\, (|k| + k)\, dk $$ \item[ ] and that $F(k) (|k| + k)$ is a one-sided function of $k$. With a total field magnetometer one observes $$h^2_x (x) + h^2_z(x) = w(x) \bar w (x)$$ What can you say about getting $F(k)$ from this? \item[(d)] How unique are $h_x(x)$ and $h_z(x)$ if $f(x) \bar f(x)$ is given? \end{description} \exerend \section{A BUTTERWORTH FILTER COOKBOOK} An ideal bandpass filter passes some range of frequencies without distortion and suppresses all other frequencies. Further thought shows that the ideal bandpass filter, a rectangle function of frequency, is far from ideal because its time-domain representation $(\sin\,\omega_0 t)/(\omega_0 t)$ is non-causal and decays much too slowly with time for many practical uses. The appropriate bandpass filter is one whose time decay can be chosen to be reasonable (along with the necessary compromise on the shape of the rectangle). Butterworth filters fulfill these needs. They are causal and of various orders, the lowest order being best (shortest) in the time domain, and the higher orders being better in the frequency domain. Butterworth filters are often chosen for well-engineered projects. Unfortunately they are less often used in experimental work because of a complicated setting-up issue that I am going to solve for you here, along with giving some examples and discussing pitfalls. The main problem is that there is no simple mathematical expression for the filter coefficients as a function of order and cutoff frequency. \par Analysis starts from an equation that for large order $n$ describes a box \begin{equation} \overline{B(\omega)} B(\omega) \eq {1 \over 1 + \left( {\omega\, \over \omega_0} \right)^{2n}} \label{spec.butw} \end{equation} When $|\omega|<\omega_0$ this Butterworth lowpass spectrum is about unity. When $|\omega|>\omega_0$ the spectrum drops rapidly to zero. The magnitude $|B(\omega)|$ (with some truncation effects to be described later) is plotted in Figure~\ref{spec-butf} for various values of $n$. \plot{spec-butf}{2.06in}{\FIGS/spec/butf}{ (spec-butf) Spectra of Butterworth filters of various order $n$. } \par Conceptually the easiest form of Butterworth filtering is to take your data to the frequency domain and multiply by equation (\ref{spec.butw}), where you have selected some value of $n$ to compromise between the demands of the frequency domain (sharp cutoff) and the time domain (rapid decay). Of course the time-domain representation of equation (\ref{spec.butw}) is non-causal. So if you prefer a causal filter, you could take the Butterworth spectrum into a spectral factorization program such as {\tt kolmogoroff()}.
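\par As a concrete illustration of that frequency-domain recipe, here is a minimal numpy sketch (my own, with illustrative names): it is zero phase and noncausal. A causal filter with the same amplitude could then be obtained by passing {\tt spec} through a spectral factorization such as the {\tt kolmogoroff()} sketch above.
\begin{verbatim}
import numpy as np

def butterworth_smooth(x, w0, npow):
    # multiply the data spectrum by equation (spec.butw):
    # cutoff w0 in radians per sample, order npow
    w = 2*np.pi*np.fft.fftfreq(len(x))
    spec = 1./(1. + (w/w0)**(2*npow))
    return np.fft.ifft(np.fft.fft(x)*spec).real
\end{verbatim}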
\par The time-domain response of the Butterworth filter is infinitely long, though a Butterworth filter of degree $n$ can be well approximated by a ratio of $n^{\rm th}$-order polynomials. Since, as we'll see, $n$ is typically in the range 2-5, time-domain filtering is quicker than FT. To proceed we need to express $\omega$ in terms of $Z$ where $Z=e^{i\omega \Delta t}$. This is done in an approximate way that is valid for frequencies far from the Nyquist frequency. Intuitively, time differentiation is implied by $-i\omega$. We saw that in sampled time, differentiation is generally represented by the bilinear transform, equation~(\ref{zz.bilin}), $-i\hat\omega\Delta t = 2(1-Z)/(1+Z)$. Thus a sampled-time representation of $\omega^2= (i\omega) (-i\omega) $ is \begin{equation} \omega^2 \eq 4 \ {{1-Z^{-1}} \over {1+Z^{-1}}} \ {{1-Z} \over {1+Z}} \label{spec.butw2} \end{equation} Substituting equation (\ref{spec.butw2}) into (\ref{spec.butw}) we find \begin{eqnarray} B({1 \over Z} ) B(Z) &=& {[ (1+Z^{-1})(1+Z)]^n \over [ (1+Z^{-1})(1+Z)]^n \ + \ [{4\over\omega_0^2}\, (1-Z^{-1})(1-Z)]^n } \label{spec.butz} \\ B({1 \over Z} ) B(Z) &=& {N(Z^{-1})N(Z) \over D(Z^{-1}) D(Z)} \label{spec.butz2} \end{eqnarray} where the desired causal Butterworth discrete-domain filter is $B(Z)=N(Z)/D(Z)$. You can now appreciate the size of the task when you realize that the denominator in (\ref{spec.butz}) must be factored into a product of a function of $Z$ times the same function of $Z^{-1}$ to get equation~(\ref{spec.butz2}). Since the function is positive, it can be considered to be a spectrum, and factorization must be possible. \subsection{Butterworth filter finding program} To express equation (\ref{spec.butz}) in the Fourier domain, multiply every parenthesized factor by $\sqrt{Z}$ and recall that $\sqrt{Z}+1/\sqrt{Z} = 2\cos(\omega/2)$. Thus \begin{equation} \overline{B(\omega)} B(\omega) \eq { [2\, \cos\,\omega/2]^{2n} \over [2\, \cos\,\omega/2]^{2n} \ +\ [{4\over \omega_0}\ \sin\,\omega/2]^{2n} } \label{spec.butprog} \end{equation} An analogous equation holds for highpass filters. The program below does both. First the denominator of equation~(\ref{spec.butprog}) is set up as a spectrum and factored. The numerator could be found in the same way, but the result is already apparent from the numerator of (\ref{spec.butz}), i.e.~we need the coefficients of $(1+Z)^n$. In subroutine {\tt butter()} they are simply obtained by Fourier transformation. The occurrence of a tangent in the program arises from equation~(\ref{zz.tan}). \progblock{butter}{\PROG/butter.r} \subsection{Examples of Butterworth filters} Spectra and log spectra of various orders of Butterworth filters are in Figure~\ref{spec-butf}. They match a rectangle function passing frequencies below the half-Nyquist. Convergence is rapid with order. The logarithm plot shows a range of 0-3, meaning an amplitude ratio of $10^3=1000$. Tiny glitches near the bottom for high-order curves result from truncating the time axis in the time domain shown in Figure~\ref{spec-butm}. \sideplot{spec-butm}{2.15in}{\FIGS/spec/butm}{ (spec-butm) Butterworth filter time responses for half-Nyquist low pass. } The time-domain truncation also explains a slight roughness on the top of the rectangle function. \par In practice, the filter is sometimes run both forward and backwards to achieve a phaseless symmetrical response. This squares the spectral amplitudes, giving convergence twice as rapid as the figure shows.
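\par The forward-and-backward idea deserves a tiny sketch of its own (again an illustrative fragment of mine, assuming the causal filter is given by its impulse response {\tt filt}). Reversing the data, filtering, and reversing again applies $B(1/Z)$, so the cascade applies the zero-phase spectrum $B(1/Z)B(Z)$.
\begin{verbatim}
import numpy as np

def phaseless(filt, x):
    y = np.convolve(x, filt)         # forward pass: B(Z)
    y = np.convolve(y[::-1], filt)   # filter the time-reversed output
    return y[::-1]                   # reverse again: net B(1/Z)B(Z)
\end{verbatim}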
Notice that the higher-order curves in the time domain (Figure~\ref{spec-butm}) have undesirable sidelobes that ring longer as the order increases. Also, higher-order curves have increasing delay for the main signal burst. This delay is a consequence of the binomial coefficients in the numerator. \par Another example of a low-pass Butterworth filter shows some lurking instabilities. This is not surprising because a causal bandpass operator is almost a contradiction in terms. It is a contradiction because the word ``bandpass'' implies multiplying the spectrum by zero outside the chosen band, and ``causal'' implies a well-behaved spectral logarithm, and these can't coexist because the logarithm of zero is minus infinity. All this is another way of saying that when you use Butterworth filters, you probably should not use high orders. Figure~\ref{spec-butl} illustrates that an instability arises in the seventh-order Butterworth filter, and the sixth-order filter looks suspicious. \sideplot{spec-butl}{2.05in}{\FIGS/spec/butl}{ (spec-butl) Butterworth time responses for narrow band low pass filter. } If you insist on using high order filters, you can probably get to about twice as high an order by using double precision, increasing the spectral width {\tt nw}, and if you are really persistent, using the method of the exercises. My favorite Butterworth filters for making synthetic seismograms have five coefficients (fourth order). I do one pass through a low cut at {\tt cutoff=.1} and another through a high cut with {\tt cutoff=.4}. \exer \item Notice that equation (\ref{spec.butw}) can be factored analytically. Individual factors could be implemented as $Z$-transform filters, and the filters cascaded. This avoids the instability that arises when many poles are combined. Identify the poles of equation (\ref{spec.butw}). Which belong in the causal filter and which in its time reverse? \exerend \section{PHASE DELAY AND GROUP DELAY} The Fourier-domain ratio of a wave seen at $B$ divided by the wave seen at $A$ may be regarded as a filter. Propagation velocity is the distance from $A$ to $B$ divided by the delay. But there are at least two ways to {\it define} the delay. \subsection{Phase delay} When you put a sinusoid into a filter, a sinusoid must come out. The only things that can change are the amplitude and the phase. Comparing a zero crossing of the input to a zero crossing of the output measures the so-called {\it phase delay}. To quantify this, define an input, $\sin \omega t$, and an output, $\sin (\omega t - \phi )$. Then the phase delay $t_p$ is found by solving \begin{equation} \begin{array}{rcl} \sin( \omega t - \phi ) &=& \sin \omega (t-t_p) \nonumber \\ \omega t - \phi &=& \omega t - \omega t_p \nonumber \\ t_p &=& \frac{\phi}{\omega} \label{1-4-1} \end{array} \end{equation} A problem with phase delay is that the phase can be ambiguous within an additive constant of $2\pi N$ where $N$ is any integer. In wave propagation theory, {\it phase velocity} is defined by the distance divided by the phase delay. There it is hoped that the $2\pi N$ ambiguity can be resolved by observations tending to zero frequency or zero physical separation. \subsection{Group delay} A more interesting kind of delay is {\it group delay}, corresponding to group velocity in wave propagation theory. Often the group delay is nothing more than the phase delay. This happens when the phase delay is independent of frequency.
But when the phase delay depends on frequency, then a completely new velocity called the {\it group velocity} appears. Curiously, the group velocity is {\it not} an average of phase velocities. \par The simplest analysis of group delay begins by defining a filter input $x_t$ as the sum of two frequencies. \begin{equation} x_t \eq \cos \omega_1 t + \cos \omega_2 t \label{1-4-2} \end{equation} By using a trigonometric identity \begin{equation} x_t \eq 2 \underbrace{ \ \cos(\frac{\omega_1 - \omega_2}{2} t) }_{\rm beat} \ \cos (\frac{\omega_1 + \omega_2}{2} t) \label{1-4-3} \end{equation} we see that the sum of two cosines looks like a cosine of the average frequency multiplied by a cosine of half the difference frequency. Since the frequencies in Figure~\ref{spec-beat} are taken close together, \plot{spec-beat}{0.85in}{\FIGS/spec/beat}{ (spec-beat) Two nearby frequencies beating. } the difference-frequency factor in (\ref{1-4-3}) represents a slowly varying amplitude multiplying the average frequency. The slow (difference frequency) modulation of the higher (average) frequency is called {\it beating}. \par The beating phenomenon is also called ``interference,'' though that word is deceptive. If the two sinusoids were two wave beams crossing one another, they would simply cross each other {\it without} interfering. Where they are present simultaneously, they simply add. \par Each of the two frequencies could be delayed a different amount by a filter, so take the output of the filter $y_t$ to be \begin{equation} y_t \eq \cos (\omega_1 t - \phi_1) + \cos (\omega_2 t - \phi_2) \label{1-4-4} \end{equation} In taking the output of the filter to be of the form of (\ref{1-4-4}), we have assumed that neither frequency was attenuated. (The group velocity concept loses its simplicity and much of its utility in dissipative media.) Using the same trigonometric identity on (\ref{1-4-4}) gives \begin{equation} y_t \eq 2 \ \underbrace{ \cos (\frac{\omega_1 - \omega_2}{2} t - \frac{\phi_1 - \phi_2}{2}) }_{\rm beat} \ \cos (\frac{\omega_1 + \omega_2}{2} t - \frac{\phi_1 + \phi_2}{2}) \label{1-4-5} \end{equation} Rewriting the beat factor in terms of a time delay $t_g$, we have \begin{eqnarray} \cos [ \frac{\omega_1 -\omega_2}{2} (t-t_g) ] &=& \cos ( \frac{\omega_1 - \omega_2}{2} t - \frac{\phi_1 - \phi_2}{2}) \\ (\omega_1 - \omega_2 ) t_g &=& \phi_1 - \phi_2 \nonumber \\ t_g &=& \frac{\phi_1 - \phi_2 }{\omega_1 - \omega_2} \eq \frac{\Delta \phi}{\Delta \omega} \label{1-4-6} \end{eqnarray} For a continuum of frequencies the group delay is \begin{equation} t_g \eq {d\phi \over d\omega } \label{spec.group} \end{equation} \subsection{Group delay as a function of the FT} We'll see that the group delay is a simple function of the Fourier transform of the filter. We'll assign the name $P$ to this filter to suggest that it is an all-pass filter. It need not be a causal all-pass filter, and in practice a bit of energy absorption might be OK too. The phase angle $\phi$ could be computed as the arctangent of the ratio of imaginary to real parts of the Fourier transform, namely $\phi(\omega) = \arctan [ \Im P(\omega)/ \Re P(\omega)]$. As with~(\ref{spec.cxlog}) we use $\phi =\Im \ln P $ and from (\ref{spec.group}) we get \begin{equation} t_g \eq \frac{d\phi}{d\omega} \eq \Im \frac{d\;}{d\omega} \ln P(\omega ) \eq \Im \frac{1}{P} \frac{dP}{d\omega} \label{spec.dphidom} \end{equation} which could be expressed as the Fourier dual to equation~(\ref{spec.smoofreq}).
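\par Equation (\ref{spec.dphidom}) is easy to evaluate numerically. A convenient discrete identity is $t_g = \Re\,[\,{\rm FT}\{t\,p_t\}/{\rm FT}\{p_t\}\,]$, which follows from differentiating the transform with respect to $\omega$. The sketch below (mine; names illustrative) checks it on a pure delay. As with instantaneous frequency, smoothing the numerator and denominator separately would stabilize the ratio where $|P|$ is small.
\begin{verbatim}
import numpy as np

def group_delay(p, nfft=512):
    # t_g = Re( FT{t p_t} / FT{p_t} ), a discrete form of (spec.dphidom)
    t = np.arange(len(p))
    P  = np.fft.fft(p, nfft)
    tP = np.fft.fft(t*p, nfft)
    return (tP/P).real

print(np.round(group_delay(np.array([0., 0., 0., 1.]))[:4], 6))
# 3.0 at all frequencies: a pure delay of three samples
\end{verbatim}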
\subsection{Observation of dispersive waves} \par There are various formulas relating energy delay to group delay, and this chapter illuminates those that are one dimensional. In observational work, it is commonly said that ``what you see is the group velocity.'' This means that when you see an apparently sinusoidal wave train, its distance from the source divided by its travel time (group delay) is the group velocity. An interesting example of a dispersive wave is in ``\FGDP'' Figure~1-11. \subsection{Group delay of all-pass filters} On page~\pageref{'all-pass'} we introduced all-pass filters, i.e., filters with constant unit spectra, that is, $P(Z) \overline{P}(1/Z) = 1$. In the frequency domain $P(Z)$ can be expressed as $e^{i \phi (\omega)}$ where $\phi$ is real and is called the {\it phase shift}. Clearly $P \overline{P} = 1$ for all real $\phi$. It is an easy matter to make a filter with any desired phase shift---you merely Fourier transform $e^{i \phi (\omega)}$ into the time domain. If $\phi(\omega)$ is arbitrary, the resulting time function is likely to be two-sided. Since we are interested in physical processes that are causal, we may wonder what class of functions $\phi(\omega)$ corresponds to one-sided time functions. The answer is that the group delay $\tau_g = d\phi / d\omega$ of a causal all-pass filter must be positive. \par Proof that $d\phi /d\omega > 0$ for a causal all-pass filter is found in ``\FGDP'' and there is no need to reproduce the algebra here. The proof begins from equation~(\ref{zz.all}) and takes the imaginary part of the logarithm to get phase. Differentiation with respect to $\omega$ yields a form that is recognizable as a spectrum, hence always positive. \par A single-pole, single-zero all-pass filter passes all frequency components with constant gain and a phase shift that can be adjusted by the placement of the pole. Taking $Z_0$ near the unit circle causes most of the phase shift to be concentrated near the frequency where the pole is located. Taking the pole further away causes the delay to be spread over more frequencies. Complicated phase shifts or group delays may be built up by cascading single-pole filters. \par The above reasoning for a single-pole, single-zero all-pass filter also applies to many roots, because the phases of the factors add and a sum of positive group delays $\tau_g = d\phi / d\omega$ remains positive. \par The Fourier dual to the group delay of a causal all-pass filter is that the instantaneous frequency of a certain class of analytic signals must be positive. This class of analytic signals is all those with a constant envelope function, as might be approximated by field data after the process of automatic gain control. \exer \item Let $x_t$ be some real signal. Let $y_t =x_{t+3}$ be another real signal. Sketch the phase as a function of frequency of the cross spectrum $X(1/Z)Y(Z)$ as would a computer that put all arctangents in the principal quadrants $-\pi /2 < \arctan < \pi /2$. Label axis scales. \item Sketch the amplitude, phase, and group delay of the all-pass filter $(1 - \overline{Z_0} Z)/(Z_0 - Z)$ where $Z_0 = (1 + \epsilon)e^{i\omega_0}$ and $\epsilon$ is small. Label important parameters on the curve. \item Show that the coefficients of an all-pass, phase-shifting filter made by cascading $(1 - \overline{Z_0} Z)/$ $(Z_0 - Z)$ with $(1 - Z_0 Z)/(\overline{Z_0} - Z)$ are real. \item A continuous signal is the impulse response of a continuous-time, all-pass filter. Describe the function in both time domain and frequency domain.
Interchange the words {\it time} and {\it frequency} in your description of the function. What is a physical example of such a function? What happens to the statement: ``The group delay of an all-pass filter is positive.''? \item A graph of the group delay $\tau_g (\omega)$ shows $\tau_g$ to be positive for all $\omega$. What is the area under $\tau_g$ in the range $0 < \omega < 2\pi$? ({\sc hint}: This is a trick question you can solve in your head.) \exerend \section{MINIMUM PHASE} In Chapter \ZZ\ we learned that the inverse of a causal filter $B(Z)$ is causal if $B(Z)$ has no roots inside the unit circle. The terminology ``minimum phase'' was introduced there without motivation. Here we examine the phase, and learn why it is called {\it minimum}. \subsection{Phase of a single root} \par For real $\omega$, a plot of real and imaginary parts of $Z$ is the circle $(x,y) = (\cos\,\omega, \sin\,\omega)$. A smaller circle is $.9Z$. A right-shifted circle is $1+.9Z$. Let $Z_0$ be a complex number. It could be $x_0+iy_0$ or $Z_0 = e^{i\omega_0}/\rho$ where $\rho$ and $\omega_0$ are fixed constants. Consider the complex $Z$ plane for the two-term filter \begin{eqnarray} B(Z) &=& 1 - {Z \over Z_0} \\ B(Z(\omega )) &=& 1 -\rho e^{i(\omega - \omega_0)}\\ B(Z(\omega )) &=& 1 -\rho \cos(\omega - \omega_0) -i\rho \sin(\omega -\omega_0) \label{spec.Bplane} \end{eqnarray} \par \plot{spec-origin}{1.9in}{\FIGS/spec/origin}{ (spec-origin) Left, complex $B$ plane for $\rho<1$. Right, for $\rho>1$. } Real and imaginary parts of $B$ are plotted in Figure~\ref{spec-origin}. Arrows are at frequency $\omega$ intervals of $20^\circ$. Observe that for $\rho>1$ the sequence of arrows has a sequence of angles that ranges over $360^\circ$, whereas for $\rho<1$ the sequence of arrows has a sequence of angles between $\pm 90^\circ$. Now let us replot equation~(\ref{spec.Bplane}) in a more conventional way, with $\omega$ as the horizontal axis. Whereas the phase is the angle of an arrow in Figure~\ref{spec-origin}, in Figure~\ref{spec-phase} it is the arctangent of $\Im B/ \Re B$. Notice how different the phase curve in Figure~\ref{spec-phase} is for $\rho < 1$ than for $\rho > 1$. \par Real and imaginary parts of $B$ are {\it periodic} functions of frequency $\omega$ since $B(\omega ) = B(\omega +2\pi )$. You might be tempted to conclude that the phase would be periodic too. But Figure~\ref{spec-phase} shows that for a nonminimum phase filter, as $\omega$ ranges from $-\pi$ to $\pi$ the phase $\phi$ increases by $2\pi$ (because the circular path in Figure~\ref{spec-origin} surrounds the origin). To make Figure~\ref{spec-phase} I used the Fortran arctangent function that takes two arguments, $y$ and $x$. It returns an angle between $-\pi$ and $+\pi$. As I was plotting the nonminimum-phase case, the phase suddenly jumped discontinuously from a value near $\pi$ to $-\pi$, and I needed to add $2\pi$ to keep the curve continuous. This is called ``phase unwinding.'' \sideplot{spec-phase}{2.55in}{\FIGS/spec/phase}{ (spec-phase) Left shows real and imaginary parts and phase angle of equation (\protect\ref{spec.Bplane}) for $\rho<1$. Right, for $\rho>1$. Left is minimum phase and right is nonminimum phase.
} \par You would write a similar program if you ever had the job of writing a program to answer the question, ``Given an earthquake at location $(x,y)$, did it occur in country X?'' You would circumnavigate the country, like the circle in Figure~\ref{spec-origin}, and see if the phase angle from the earthquake to the country's boundary accumulated to $2\pi$ (yes) or to $0$ (no). \par The word {\it minimum} in minimum phase arises because delaying a filter can always add more phase. For example, multiplying any polynomial by $Z$ delays it and adds $\omega$ to its phase. \par For the minimum-phase filter the group delay $d\phi / d\omega$ applied to Figure~\ref{spec-phase} is a periodic function of $\omega$. For the nonminimum-phase filter it happens to be a monotonically increasing function of $\omega$. Since it is not an all-pass filter the monotonicity is accidental. \par Because group delay $d\phi / d\omega$ is the Fourier dual to instantaneous frequency $d\phi /dt$, we can now go back to Figure~\ref{spec-node} and explain the discontinuous behavior of instantaneous frequency where the signal amplitude is near zero. \subsection{Phase of a rational filter} \par Now we are ready to consider a complicated filter like \begin{equation} B(Z) = {(Z - c_{1}) (Z - c_{2}) \cdots \over {(Z - a_{1}) (Z - a_{2}) \cdots}} \label{2-2-1} \end{equation} By the rules of complex-number multiplication the phase of $B(Z)$ is the sum of the phases in the numerator minus the sum of the phases in the denominator. Since we are discussing realizable filters, the denominator factors must all be minimum phase, and so the denominator phase curve is a sum of periodic phase curves like the lower left of Figure~\ref{spec-phase}. \par The numerator factors may or may not be minimum phase. Thus the numerator phase curve is a sum of phase curves like either type in Figure~\ref{spec-phase}. If any factor augments the phase by $2\pi$, then the phase is not periodic and the filter is nonminimum phase. \section{ROBINSON'S ENERGY DELAY THEOREM} Here we will see that a minimum-phase wavelet has less energy delay than any other one-sided wavelet with the same spectrum. \label{'Robinson'} More precisely, the energy summed from zero to any time $t$ for the minimum-phase wavelet is greater than or equal to that of any other wavelet with the same spectrum. \par Here is how I prove Robinson's energy delay theorem: Compare two wavelets $F_{\rm in}$ and $F_{\rm out}$ which are identical except for one zero, which is outside the unit circle for $F_{\rm out}$ and inside for $F_{\rm in}$. We may write this as \begin{eqnarray} F_{\rm out}(Z) &= & (b + sZ)\, F(Z) \\ F_{\rm in}(Z) &= & (s + bZ)\, F(Z) \end{eqnarray} where $b$ is bigger than $s$ and $F$ is arbitrary but of degree $n$. Complex-valued $b$ and $s$ are left for an exercise. Notice that the spectrum of $b+sZ$ is the same as that of $s+bZ$. Next tabulate the terms in question.
\begin{tabular}{l|l|l|l|l} \hline $t$ & $F_{\rm out}$ & $F_{\rm in}$ & $F^2_{\rm out} - F^2_{\rm in}$ & $\sum^t_{k = 0} (F^2_{\rm out} - F^2_{\rm in})$ \\ \hline $0$ & $bf_0$ & $sf_0$ & $(b^2 - s^2)\, {f_0}^2 $ & $(b^2 - s^2)\, {f_0}^2$ \\ $1$ & $bf_1 + sf_0$ & $sf_1 + bf_0$ & $(b^2 - s^2)\, ({f_1}^2 - {f_0}^2)$ & $(b^2 - s^2)\, {f_1}^2$ \\ $\vdots$ & $\vdots$ & & & \\ $k$ & $bf_k + sf_{k -1}$ & $sf_k + bf_{k -1}$ & $(b^2 - s^2)\, ({f_k}^2 - f_{k - 1}^2)$ & $(b^2 - s^2)\, {f_k}^2$ \\ $\vdots$ & $\vdots$ & & & \\ $n + 1$ & $sf_n$ & $bf_n$ & $(b^2 - s^2) (-{f_n}^2)$ & $0$ \\ \hline \end{tabular} \par %%%%%% a break is required! The running sum of the energy differences, given in the right-hand column, is never negative. An example of the result is shown in Figure~\ref{spec-robinson}. \sideplot{spec-robinson}{2.24in}{\FIGS/spec/robinson}{ (spec-robinson) Total energy percentage versus time. `m' for minimum phase wavelet. `n' for nonminimum phase. } \par Notice that $(s+bZ)/(b+sZ)$ is an all-pass filter. Multiplying by an all-pass filter does not change the amplitude spectrum. Multiplying by the all-pass filter introduces a zero and a pole, but the pole could cancel a pre-existing zero. So, if we have a collection of zeros inside the unit circle, we can eliminate them one at a time by multiplying by an all-pass filter. Each time we eliminate a zero inside the unit circle, we make the energy of the filter come out earlier. Eventually we run out of zeros inside the unit circle and the energy comes out as early as possible. \exer \item Repeat the proof of Robinson's minimum-energy-delay theorem for complex-valued $b$, $s$, and $f_k$. [{\sc hint:} Does $F_{\rm in} = (s + bZ)\, F$ or $F_{\rm in} = (\overline{s} + \overline{b} Z)F$?] \exerend \section{FILTERS IN PARALLEL} We have seen that in a cascade of filters the filter polynomials are multiplied together. When filters operate in parallel, their polynomials add. See Figure~\ref{spec-parallel}. \sideplot{spec-parallel}{0.88in}{\FIGS/spec/parallel}{ (spec-parallel) Filters operating in parallel. } \par We have seen that a cascade of filters is minimum phase if, and only if, each element of the product is minimum phase. Now we will see a sufficient (but not necessary) condition that a sum $A(Z) + G(Z)$ be minimum phase. First, assume that $A(Z)$ is minimum phase. Then write \begin{equation} A(Z) + G(Z) = A(Z) \left( 1 + {G(Z) \over A(Z)} \right) \end{equation} The question whether $A(Z) + G(Z)$ is minimum phase is now reduced to determining whether $A(Z)$ and $1 + G(Z)/A(Z)$ are both minimum phase. We have assumed that $A(Z)$ is minimum phase. Before we ask whether $1 + G(Z)/A(Z)$ is minimum phase we need to be sure that it's causal. Since $1/A(Z)$ is expandable in positive powers of $Z$ only, $G(Z)/A(Z)$ is also causal. We will next see that a sufficient condition for $1 + G(Z)/A(Z)$ to be minimum phase is that the spectrum of $A$ exceeds that of $G$ at all frequencies. In other words, for any real $\omega$, $|A |> |G |$. Thus, if we plot the curve of $G(Z)/A(Z)$ in the complex plane, for real $0 \leq \omega \leq 2\pi$ it lies everywhere inside the unit circle. Now if we add unity, getting $1 + G(Z)/A(Z)$, the curve always has a positive real part, as in Figure~\ref{spec-garbage}. \sideplot{spec-garbage}{1.4in}{\FIGS/spec/garbage}{ (spec-garbage) A phase trajectory like in Figure~\protect\ref{spec-origin} left, but more complicated. } Since the curve cannot enclose the origin, the phase must be that of a minimum-phase function.
In words, ``You can add garbage to a minimum-phase wavelet if you do not add too much.'' \par This abstract theorem has an immediate physical consequence. Suppose a wave characterized by a minimum phase $A(Z)$ is emitted from a source and detected at a receiver some time later. At a still later time an echo bounces off a nearby object and is also detected at the receiver. The receiver sees the signal $Y(Z) = A(Z) + Z^n \alpha A(Z)$ where $n$ measures the delay from the first arrival to the echo and $\alpha$ represents the amplitude attenuation of the echo. To see that $Y(Z)$ is minimum phase, we note that the magnitude of $Z^n$ is unity and that the reflection coefficient $\alpha$ must be less than unity (to avoid perpetual motion), so that $Z^n \alpha A(Z)$ takes the role of $G(Z)$. Thus a minimum-phase wave along with its echo is minimum phase. We will later consider wave propagation with echoes of echoes ad infinitum. \exer \item Find two nonminimum-phase wavelets whose sum is minimum phase. \item Let $A(Z)$ be a minimum-phase polynomial of degree $N$. Let $A'(Z) = Z^N \overline{A}(1/Z)$. Locate in the complex $Z$ plane the roots of $A'(Z)$. $A'(Z)$ is called {\it maximum phase}. [{\sc hint:} Work the simple case $A(Z) = a_0 + a_1Z$ first.] \item Suppose $A(Z)$ is maximum phase and that the degree of $G(Z)$ is less than or equal to the degree of $A(Z)$. Assume $|A|>|G|$. Show that $A(Z) + G(Z)$ is maximum phase. \item Let $A(Z)$ be minimum phase. Where are the roots of $A(Z) + cZ^N \bar{A}(1/Z)$ in the three cases $| c | < 1, | c | > 1, | c | = 1$? ({\sc hint:} The roots of a polynomial are continuous functions of the polynomial coefficients.) \exerend