Next: SYNTHETIC EXAMPLE Up: Harlan: Flexible tomography Previous: VELOCITY PERTURBATIONS

OPTIMIZATION

For this application, I found it advantageous to write a generic ``Gauss-Newton'' optimization routine that minimizes a least-squares objective function with a non-linear forward model. Both ray tracing and tomographic inversion of velocities are optimized with this algorithm. The ray parameters in equation (7) are perturbed until the traveltime is minimized. (The traveltime is a nonlinear function of local velocities.) The error between picked and modeled traveltimes (9) is minimized by perturbations of velocity parameters (4). In each case, a model damping term is included for numerical stability.

Let the vector $\vec{ \bf m}$ describe a model, and a vector $\vec{ \bf d}$ contain the data whose errors will be minimized. Define also a scalar product for each of these vectors: $< \vec{ \bf m}_1 , \vec{ \bf m}_2 > _{m}$ and $< \vec{ \bf d}_1 , \vec{ \bf d}_2 > _{d}$. The squared magnitude of each is defined by
\begin{displaymath}
\Vert \vec{ \bf m} \Vert _m^2 \equiv < \vec{ \bf m} , \vec{ \bf m} > _{m}
\quad \mbox{and} \quad
\Vert \vec{ \bf d} \Vert _d^2 \equiv < \vec{ \bf d} , \vec{ \bf d} > _{d} .
\end{displaymath} (11)
These scalar products incorporate any non-stationary variances or covariances that can be assumed for the problem. For example, I assume smaller variances for the higher-order polynomials used to describe raypaths. The velocity parameters $\eta$ and $\epsilon$ have small variances, on the order of 0.05, whereas the variance of the velocity $V_x$ depends on the physical units of the survey. Rather than introduce correlations between samples into the scalar product, I prefer to encourage such correlations through the choice of basis functions. With correctly scaled basis functions, the model norm reduces to the trivial Cartesian norm, a simple sum of the squares of the model parameters. I do not assume any correlation in the errors of traveltime data.
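
As a minimal sketch of the scaling argument above, the following Python fragment assumes a diagonal covariance, so that the scalar product simply divides each term by a variance. The function names and the numerical values are illustrative assumptions and are not taken from the original implementation. Dividing each parameter by its standard deviation (equivalently, rescaling its basis function) yields the same squared magnitude from a plain sum of squares.

```python
import math

def weighted_dot(a, b, variances):
    """Scalar product <a, b> assuming a diagonal covariance: sum of a_i * b_i / var_i."""
    return sum(x * y / v for x, y, v in zip(a, b, variances))

def to_unit_variance(m, variances):
    """Divide each parameter by its standard deviation."""
    return [x / math.sqrt(v) for x, v in zip(m, variances)]

# Hypothetical values: two small-variance parameters and one in survey units.
variances = [0.05, 0.05, 4.0]
m = [0.02, -0.01, 1.5]

norm_weighted = weighted_dot(m, m, variances)      # squared magnitude with weights
m_scaled = to_unit_variance(m, variances)
norm_cartesian = sum(x * x for x in m_scaled)      # plain sum of squares

print(norm_weighted, norm_cartesian)
```

The two printed values agree to rounding error, which is the sense in which the rescaled parameters make the norm Cartesian.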

Assume we wish to fit the data $\vec{ \bf d}$ with a non-linear forward model $\vec{ \bf f} (\vec{ \bf m})$. We also must apply a linearized forward transform ${\bf F}(\vec{ \bf m}_0 )$ for a given reference model $\vec{ \bf m}_0$, so that
\begin{displaymath}
\vec{ \bf f} (\vec{ \bf m}_0 + \Delta \vec{ \bf m})
\approx
\vec{ \bf f} ( \vec{ \bf m}_0 ) + {\bf F}(\vec{ \bf m}_0 ) \cdot
\Delta \vec{ \bf m} .
\end{displaymath} (12)
We must be able to apply the transforms $\vec{ \bf f}( \vec{ \bf m}_0 )$ and ${\bf F}(\vec{ \bf m}_0 )$ when necessary and apply the adjoint ${\bf F}^* ( \vec{ \bf m}_0 )$ of the linear transform, defined by  
\begin{displaymath}
< \vec{ \bf d} , {\bf F}(\vec{ \bf m}_0 ) \cdot \Delta \vec{ \bf m} > _d
=
< {\bf F}^* ( \vec{ \bf m}_0 ) \cdot \vec{ \bf d} , \Delta \vec{ \bf m} > _m .
\end{displaymath} (13)
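
A common way to check that an adjoint satisfies definition (13) is a dot-product test with random vectors. The sketch below assumes diagonal variances in both scalar products and a small dense matrix standing in for ${\bf F}(\vec{ \bf m}_0 )$; all names and values are hypothetical. Under those assumptions the adjoint is the matrix transpose rescaled by the model and data variances, which shows that the adjoint depends on the chosen scalar products.

```python
import random

def dot(a, b, variances):
    """Scalar product with assumed diagonal variances."""
    return sum(x * y / v for x, y, v in zip(a, b, variances))

def forward(F, dm):
    """Apply the linearized transform: F . dm."""
    return [sum(f_ij * x for f_ij, x in zip(row, dm)) for row in F]

def adjoint(F, d, var_m, var_d):
    """Adjoint under the weighted scalar products: var_m * F^T * (d / var_d)."""
    n_data, n_model = len(F), len(F[0])
    return [var_m[j] * sum(F[i][j] * d[i] / var_d[i] for i in range(n_data))
            for j in range(n_model)]

rng = random.Random(0)                     # fixed seed for repeatability
n_data, n_model = 4, 3
F = [[rng.uniform(-1.0, 1.0) for _ in range(n_model)] for _ in range(n_data)]
var_m = [0.05, 0.05, 4.0]                  # hypothetical model variances
var_d = [1.0, 0.5, 2.0, 1.5]               # hypothetical data variances
dm = [rng.uniform(-1.0, 1.0) for _ in range(n_model)]
d = [rng.uniform(-1.0, 1.0) for _ in range(n_data)]

lhs = dot(d, forward(F, dm), var_d)                  # <d, F dm>_d
rhs = dot(adjoint(F, d, var_m, var_d), dm, var_m)    # <F* d, dm>_m
print(lhs, rhs)
```

If the two numbers differ by more than rounding error, the forward and adjoint routines are not a consistent pair for the scalar products in use.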
Let us assume that all optimum models $\vec{ \bf m}$ can then be specified to minimize an objective function of the form
\begin{displaymath}
\min_{\vec{ \bf \scriptstyle m}}
J_1 ( \vec{ \bf m} ) =
\Vert \vec{ \bf d} - \vec{ \bf f} ( \vec{ \bf m} ) \Vert _d^2
+ \Vert \vec{ \bf m} - \vec{ \bf \bar m} \Vert _m^2 ,
\end{displaymath} (14)
where $\vec{ \bf \bar m}$ contains the expected mean of the model. The relative weighting of the two terms ideally should be equal when covariances are included properly in the dot products. To optimize a raypath I minimize the traveltime. To optimize velocities I minimize the differences between measured and modeled traveltimes.

The objective function is iteratively approximated by a quadratic objective function, using the linearized forward model  
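\begin{displaymath}
\min_{\Delta \vec{ \bf \scriptstyle m}}
J_2 ( \Delta \vec{ \bf m} ) =
\Vert \vec{ \bf d} - \vec{ \bf f} ( \vec{ \bf m}_0 ) - {\bf F}(\vec{ \bf m}_0 ) \cdot \Delta \vec{ \bf m} \Vert _d^2
+ \Vert \vec{ \bf m}_0 + \Delta \vec{ \bf m} - \vec{ \bf \bar m} \Vert _m^2 .
\end{displaymath} (15)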
This quadratic objective function is easily optimized by a gradient method such as conjugate gradients. The gradient
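\begin{displaymath}
\nabla_{\Delta \vec{ \bf \scriptstyle m}} J_2 ( \Delta \vec{ \bf m} ) \propto
- {\bf F}^* ( \vec{ \bf m}_0 ) \cdot [ \vec{ \bf d} - \vec{ \bf f} ( \vec{ \bf m}_0 ) - {\bf F}(\vec{ \bf m}_0 ) \cdot \Delta \vec{ \bf m} ] +
( \vec{ \bf m}_0 + \Delta \vec{ \bf m} - \vec{ \bf \bar m} )
\end{displaymath} (16)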
requires application of the adjoint linearized transform. The resulting linearized perturbation is added to the reference model, after optimizing a scale factor $\lambda$ by a line search:  
\begin{displaymath}
\min_{\lambda}
J_3 ( \lambda ) =
\Vert \vec{ \bf d} - \vec{ \bf f} ( \vec{ \bf m}_0 + \lambda \Delta \vec{ \bf m} ) \Vert _d^2
+ \Vert \vec{ \bf m}_0 + \lambda \Delta \vec{ \bf m} - \vec{ \bf \bar m} \Vert _m^2 .
\end{displaymath} (17)
The reference model is updated by the scaled perturbation, the transform is relinearized, and the new quadratic (15) is optimized again, until convergence.
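
The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the routine used for the paper: the norms are taken to be Cartesian (basis functions already scaled), the two-parameter `forward_model` and its `jacobian` are hypothetical toys, and the line search is a crude scan over a handful of trial values of $\lambda$.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def axpy(alpha, x, y):
    """Return alpha * x + y for plain Python lists."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

def apply_F(J, v):
    return [dot(row, v) for row in J]

def apply_F_adjoint(J, r):
    return [sum(J[i][j] * r[i] for i in range(len(J))) for j in range(len(J[0]))]

def objective(f, d, m, m_bar):
    """J_1(m) = |d - f(m)|^2 + |m - m_bar|^2 with Cartesian norms."""
    r = axpy(-1.0, f(m), d)
    dev = axpy(-1.0, m_bar, m)
    return dot(r, r) + dot(dev, dev)

def solve_quadratic_cg(J, r0, dev0, n_iter):
    """Minimize |r0 - F dm|^2 + |dev0 + dm|^2 by conjugate gradients
    on the normal equations (F* F + I) dm = F* r0 - dev0."""
    dm = [0.0] * len(dev0)
    g = axpy(-1.0, dev0, apply_F_adjoint(J, r0))   # residual at dm = 0
    p = g[:]
    gg = dot(g, g)
    for _ in range(n_iter):
        if gg < 1e-30:
            break
        Ap = axpy(1.0, p, apply_F_adjoint(J, apply_F(J, p)))
        alpha = gg / dot(p, Ap)
        dm = axpy(alpha, p, dm)
        g = axpy(-alpha, Ap, g)
        gg_new = dot(g, g)
        p = axpy(gg_new / gg, p, g)
        gg = gg_new
    return dm

def gauss_newton(f, jacobian, d, m_bar, m0, n_outer=20, n_cg=10):
    m = m0[:]
    for _ in range(n_outer):
        J = jacobian(m)                        # relinearize about the reference model
        r0 = axpy(-1.0, f(m), d)               # d - f(m0)
        dev0 = axpy(-1.0, m_bar, m)            # m0 - m_bar
        dm = solve_quadratic_cg(J, r0, dev0, n_cg)
        # Crude line search: keep the best of a few trial scale factors.
        best_lam, best_val = 0.0, objective(f, d, m, m_bar)
        for k in range(12):
            lam = 2.0 ** (1 - k)               # 2, 1, 0.5, 0.25, ...
            val = objective(f, d, axpy(lam, dm, m), m_bar)
            if val < best_val:
                best_lam, best_val = lam, val
        if best_lam == 0.0:                    # no further improvement
            break
        m = axpy(best_lam, dm, m)
    return m

# Hypothetical toy problem, chosen only to exercise the loop.
def forward_model(m):
    return [m[0] ** 2 + m[1], m[0] * m[1], math.sin(m[0]) + m[1] ** 2]

def jacobian(m):
    return [[2.0 * m[0], 1.0], [m[1], m[0]], [math.cos(m[0]), 2.0 * m[1]]]

m_true = [1.0, 0.5]
d_obs = forward_model(m_true)                  # noise-free synthetic data
m_bar = [0.8, 0.6]                             # assumed expected model

J_start = objective(forward_model, d_obs, m_bar, m_bar)
m_est = gauss_newton(forward_model, jacobian, d_obs, m_bar, m_bar)
J_end = objective(forward_model, d_obs, m_est, m_bar)
print(m_est, J_start, J_end)
```

Because a step is accepted only when it lowers $J_1$, the objective never increases from one outer iteration to the next. The damping term pulls the estimate toward $\vec{ \bf \bar m}$, so `m_est` is not expected to equal `m_true` even with noise-free data.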


Stanford Exploration Project
11/12/1997