
The Method

The general form of the problem examined by Kim et al. (submitted) is
\begin{displaymath}
\mbox{minimize } \vert{\bf Cu}-{\bf d}\vert _2^2+\sum_{i=1}^n\epsilon\vert{\bf u}_i\vert,
\end{displaymath} (1)
where $\bf{u}$ is the model, $\bf{C}$ is an operator relating the model to the data $\bf{d}$, and $\bf{u}_i$ is the model value at point $i$.

In this paper a very similar problem of the form
\begin{displaymath}
\mbox{minimize } \vert{\bf W}({\bf Cu}-{\bf d})\vert _2^2+\sum_{i=1}^n\epsilon_x\vert{\bf D}_x {\bf u}_i\vert+\sum_{i=1}^n\epsilon_\tau\vert{\bf D}_\tau {\bf u}_i\vert,
\end{displaymath} (2)
will be explored. Here $\bf{u}$ is a vector whose components range over vertical travel-time depth $\tau$ and whose values are the interval velocities squared, $v^2_{int}$, and $\bf{d}$ is the data vector with the same range as $\bf{u}$, but whose values are the scaled root-mean-square (RMS) velocities squared, $\tau v^2_{RMS}/\Delta\tau$, where $\tau/\Delta\tau$ is the index on the time axis. $\bf{C}$ is the causal integration operator, and $\bf W$ is a weight matrix proportional to our confidence in the RMS velocities. In addition, ${\bf D}_x$ and ${\bf D}_\tau$ are the first-order finite-difference derivative operators along the midpoint and travel-time axes, respectively, and $\epsilon_x$ and $\epsilon_\tau$ are the regularization parameters that control the weight of the two model-residual terms, effectively controlling the smoothing.
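The relation between model and data can be made concrete with a small numerical sketch. Assuming a unit $\Delta\tau$ and illustrative velocity values (not from the paper), causal integration of the interval velocities squared yields the scaled RMS data, and dividing by the time index recovers $v^2_{RMS}$:

```python
import numpy as np

# Illustrative sizes and values; the paper's C is causal integration.
n = 5
v2_int = np.array([4.0, 4.5, 5.0, 5.5, 6.0])  # interval velocity squared, u

# Causal integration operator: lower-triangular matrix of ones,
# so (C u)_k = sum_{j<=k} u_j.
C = np.tril(np.ones((n, n)))

d = C @ v2_int                     # scaled RMS velocity squared, k * v2_RMS_k
v2_rms = d / np.arange(1, n + 1)   # recover v2_RMS by dividing by the index

print(v2_rms)                      # running mean of v2_int
```

This shows why $\bf{C}$ is simply a running sum: the RMS velocity squared at time index $k$ is the average of the interval velocities squared above it.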

This problem can be transformed to a convex quadratic problem with linear inequality constraints,
\begin{displaymath}
\begin{array}{lll}
\mbox{minimize}& \vert{\bf W}({\bf Cu}-{\bf d})\vert _2^2 + \sum_{i=1}^n\epsilon_x{\bf v}^x_i + \sum_{i=1}^n\epsilon_\tau{\bf v}^\tau_i & \\
\mbox{subject to}& -{\bf v}^x_i \leq {\bf D}_x {\bf u}_i \leq {\bf v}^x_i, & i=1,\ldots,n,\\
& -{\bf v}^\tau_i \leq {\bf D}_\tau {\bf u}_i \leq {\bf v}^\tau_i, & i=1,\ldots,n,
\end{array}
\end{displaymath} (3)
where the auxiliary variables ${\bf v}^x$ and ${\bf v}^\tau$ serve to remove the absolute values from the problem. The new problem (3) can be solved by interior point methods (e.g. Wright (1997); Ye (1997)). With this goal in mind, we can now construct logarithmic barrier functions, which approximate an inequality constraint by increasing to infinity as the point approaches the constraint boundary. For a simple problem,
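One way to see why this transformation works: for fixed ${\bf u}$, minimizing over the auxiliary variable recovers the absolute value,
\begin{displaymath}
\min\left\{\epsilon_x {\bf v}^x_i \;:\; -{\bf v}^x_i \leq {\bf D}_x {\bf u}_i \leq {\bf v}^x_i\right\} = \epsilon_x \vert{\bf D}_x {\bf u}_i\vert,
\end{displaymath}
since the constraints force ${\bf v}^x_i \geq \vert{\bf D}_x {\bf u}_i\vert$ and the objective is increasing in ${\bf v}^x_i$; the same holds for ${\bf v}^\tau_i$.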
\begin{displaymath}
\begin{array}{lll}
\mbox{minimize}& f_0(x) & \\
\mbox{subject to}& f_i(x) \leq 0, & i=1,\ldots,m,
\end{array}
\end{displaymath} (4)
the logarithmic barrier function is
\begin{displaymath}
{\bf \Phi}(x)=-\sum_{i=1}^m \log(-f_i(x)),
\end{displaymath} (5)
which is added to the scaled objective $t f_0(x)$, where $t > 0$ is a parameter that determines how closely the barrier formulation approximates the constrained problem (Boyd and Vandenberghe, 2004).
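The effect of $t$ can be seen on a toy one-dimensional problem (not from the paper): minimize $(x-2)^2$ subject to $x \leq 1$, whose constrained minimizer is $x^\ast = 1$. The sketch below, with a hypothetical `barrier_minimizer` helper, minimizes the barrier objective $(x-2)^2 - \frac{1}{t}\log(1-x)$ by bisecting on its derivative; as $t$ grows, the minimizer approaches the constraint:

```python
import numpy as np

# Toy 1-D problem: minimize f0(x) = (x - 2)^2 subject to x <= 1.
# Barrier approximation: minimize (x - 2)^2 - (1/t) * log(1 - x).

def barrier_minimizer(t, lo=-10.0, hi=1.0 - 1e-12, iters=200):
    """Bisect on the derivative 2(x - 2) + 1/(t(1 - x)), which is
    strictly increasing in x on (-inf, 1)."""
    def dphi(x):
        return 2.0 * (x - 2.0) + 1.0 / (t * (1.0 - x))
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if dphi(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for t in [1.0, 10.0, 1000.0]:
    print(t, barrier_minimizer(t))
# The minimizer stays strictly feasible and approaches x = 1 as t grows.
```

The barrier keeps every iterate strictly inside the feasible region, which is exactly the property exploited by the interior point method below.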

For the bound constraints in equation 3 the barrier functions are:
\begin{displaymath}
{\bf \Phi}^x({\bf u},{\bf v}^x)=-\sum_{i=1}^n \log\left({{\bf v}^x_i}^2-({\bf D}_x {\bf u}_i)^2\right),
\end{displaymath} (6)
\begin{displaymath}
{\bf \Phi}^\tau({\bf u},{\bf v}^\tau)=-\sum_{i=1}^n \log\left({{\bf v}^\tau_i}^2-({\bf D}_\tau {\bf u}_i)^2\right).
\end{displaymath} (7)
Now we can define the centering problem as,
\begin{displaymath}
\mbox{minimize } \phi_t({\bf u},{\bf v}^x,{\bf v}^\tau)=t\left(\vert{\bf W}({\bf Cu}-{\bf d})\vert _2^2+\sum_{i=1}^n\epsilon_x{\bf v}^x_i+\sum_{i=1}^n\epsilon_\tau{\bf v}^\tau_i\right)+{\bf \Phi}^x({\bf u},{\bf v}^x)+{\bf \Phi}^\tau({\bf u},{\bf v}^\tau).
\end{displaymath} (8)
The centering problem is an equivalent representation of problem (3) and has a unique solution parametrized by $t$; the curve traced by these solutions, called the central path, leads to an optimal solution (Boyd and Vandenberghe, 2004). Newton's method can now be applied to the centering problem, which involves solving a system of linear equations,

\begin{displaymath}
H\left[\begin{array}{c}
\Delta {\bf u} \\
\Delta {\bf v}^x \\
\Delta {\bf v}^\tau
\end{array}\right]= -g,
\end{displaymath} (9)
where $H=\nabla^2\phi_t({\bf u},{\bf v}^x,{\bf v}^\tau)$ is the Hessian and $g=\nabla\phi_t({\bf u},{\bf v}^x,{\bf v}^\tau)$ is the gradient. Conjugate gradients is used to find an approximate solution to this system. We differ from Kim et al. (submitted) by choosing not to solve the whole system with conjugate gradients. Instead, ${\bf v}^x$ and ${\bf v}^\tau$ are solved for analytically, decreasing the size of the system of equations to be solved from $3n$ to $n$ and substantially reducing computational time.

To solve for ${\bf v}^x$ analytically, take the derivative of $\phi_t({\bf u},{\bf v}^x,{\bf v}^\tau)$ with respect to ${\bf v}^x$, set it to zero, and solve for ${\bf v}^x$. This gives
\begin{displaymath}
{\bf v}^x_i=\frac{1}{t\epsilon_x}+\sqrt{\left(\frac{1}{t\epsilon_x}\right)^2+({\bf D}_x {\bf u}_i)^2}.
\end{displaymath} (10)
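A quick numerical check (with arbitrary illustrative values, not from the paper) confirms that this closed form zeroes the relevant partial derivative of the centering objective:

```python
import numpy as np

# The v-dependent part of phi_t for one component is
#   t*eps*v - log(v^2 - w^2),  with w = (D_x u)_i,
# whose stationarity condition is  t*eps - 2v/(v^2 - w^2) = 0.
t, eps, w = 7.0, 0.3, 1.5          # arbitrary test inputs

# Closed-form solution (equation 10): positive root of the quadratic
# t*eps*v^2 - 2v - t*eps*w^2 = 0.
v = 1.0 / (t * eps) + np.sqrt((1.0 / (t * eps)) ** 2 + w ** 2)

residual = t * eps - 2.0 * v / (v ** 2 - w ** 2)
print(residual)                    # ~ 0; also v > |w| keeps the log defined
```

Note that the formula always returns $v > |w|$, so the barrier's logarithm argument stays strictly positive.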
The same can be done for $\bf{v}^\tau$. We can also write the Hessian and gradient succinctly as,
\begin{displaymath}
H=t\nabla^2 \vert {\bf W}({\bf Cu}-{\bf d}) \vert _2^2 + \nabla^2\left({\bf \Phi}^x({\bf u},{\bf v}^x)+{\bf \Phi}^\tau({\bf u},{\bf v}^\tau)\right) = 2t ({\bf WC})^T{\bf WC} + {\bf D},
\end{displaymath} (11)
\begin{displaymath}
{\bf D}=\mbox{diag}\left(2\,\frac{{{\bf v}^x_i}^2+({\bf D}_x {\bf u}_i)^2}{\left({{\bf v}^x_i}^2-({\bf D}_x {\bf u}_i)^2\right)^2}+2\,\frac{{{\bf v}^\tau_i}^2+({\bf D}_\tau {\bf u}_i)^2}{\left({{\bf v}^\tau_i}^2-({\bf D}_\tau {\bf u}_i)^2\right)^2}\right),
\end{displaymath} (12)
where diag denotes the diagonal matrix with these elements for $i=1,\ldots,n$. The gradient can be written as
\begin{displaymath}
g = \nabla_u\phi_t({\bf u},{\bf v}^x,{\bf v}^\tau) = 2t({\bf WC})^T{\bf W}({\bf Cu}-{\bf d})+\left[\frac{2{\bf D}_x {\bf u}_i}{{{\bf v}^x_i}^2-({\bf D}_x {\bf u}_i)^2}+\frac{2{\bf D}_\tau {\bf u}_i}{{{\bf v}^\tau_i}^2-({\bf D}_\tau {\bf u}_i)^2}\right]^T.
\end{displaymath} (13)

Since the Hessian is constructed from more than the linear operator (it incorporates the barrier functions), matrix multiplication is used to apply it within the Newton system. Each conjugate gradient step is therefore slow, but time is saved by reducing the overall number of conjugate gradient steps.
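The structure of equations (11)-(12) still permits a matrix-free product, which is what conjugate gradients actually needs. The sketch below (with a hypothetical `hessian_vector_product` helper and placeholder inputs) applies $H$ to a vector without ever forming it:

```python
import numpy as np

# Matrix-free Hessian-vector product using H = 2t (WC)^T (WC) + D_bar,
# where D_bar is the diagonal barrier term of equation (12), passed as a
# vector.  All inputs are illustrative placeholders.

def hessian_vector_product(x, t, W, C, d_bar):
    """Apply H = 2t C^T W^T W C x + diag(d_bar) x without forming H."""
    y = W @ (C @ x)
    return 2.0 * t * (C.T @ (W.T @ y)) + d_bar * x

# Tiny check against the explicitly formed matrix.
n = 4
rng = np.random.default_rng(1)
W = np.diag(rng.uniform(0.5, 1.5, n))   # diagonal weight matrix
C = np.tril(np.ones((n, n)))            # causal integration
d_bar = rng.uniform(1.0, 2.0, n)        # diagonal barrier entries
t, x = 3.0, rng.standard_normal(n)

H = 2 * t * C.T @ W.T @ W @ C + np.diag(d_bar)
print(np.allclose(hessian_vector_product(x, t, W, C, d_bar), H @ x))
```

The trade-off noted above is visible here: each product costs a few dense (or operator) applications, but no $n \times n$ Hessian is ever stored.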

The final algorithm is:

Given the update parameters for $t$, set the initial values $t=1/\epsilon$, ${\bf u}=0$, and ${\bf v}^{x,\tau}=1$.

1. Calculate ${\bf v}^x$ and ${\bf v}^\tau$.
2. Compute the search direction $\Delta {\bf u}$ using conjugate gradients.
3. Compute the step size $s$ by backtracking line search.
4. Update ${\bf u}={\bf u}+s\,\Delta {\bf u}$.
5. Update ${\bf v}^x$ and ${\bf v}^\tau$.
6. Evaluate the duality gap and quit if appropriate (see Boyd and Vandenberghe (2004) for more on duality).
7. Update $t$.
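The steps above can be sketched end-to-end on a simplified toy case. The version below (not the paper's implementation) keeps only the travel-time regularization term, substitutes a dense Newton solve for conjugate gradients, keeps the ${\bf D}^T(\cdot){\bf D}$ factors explicit, and uses illustrative sizes, data, and tolerances throughout:

```python
import numpy as np

# Toy problem: minimize |C u - d|_2^2 + eps * sum |(D u)_i| via the barrier
# method, eliminating v analytically at every step.
rng = np.random.default_rng(0)
n = 20
C = np.tril(np.ones((n, n)))              # causal integration operator
D = np.eye(n) - np.eye(n, k=-1)           # first-difference operator
u_true = np.concatenate([np.full(10, 4.0), np.full(10, 5.0)])
d = C @ u_true + 0.01 * rng.standard_normal(n)
eps = 1e-6                                # tiny -> nearly unregularized

def v_analytic(w, t):
    a = 1.0 / (t * eps)
    return a + np.sqrt(a * a + w * w)     # analytic v, as in equation (10)

def phi(u, t):
    """Centering objective with v eliminated (always strictly feasible)."""
    w = D @ u
    v = v_analytic(w, t)
    r = C @ u - d
    return t * (r @ r + eps * v.sum()) - np.log(v * v - w * w).sum()

t, u = 1.0 / eps, np.zeros(n)
while 2 * n / t >= 1e-8:                  # duality-gap criterion m/t
    for _ in range(50):                   # Newton steps for centering
        w = D @ u
        v = v_analytic(w, t)
        g = 2 * t * C.T @ (C @ u - d) + D.T @ (2 * w / (v * v - w * w))
        H = 2 * t * C.T @ C + D.T @ np.diag(
            2 * (v * v + w * w) / (v * v - w * w) ** 2) @ D
        du = np.linalg.solve(H, -g)       # dense solve stands in for CG
        if -g @ du < 1e-10:               # Newton decrement small: centered
            break
        s = 1.0                           # backtracking line search
        while s > 1e-12 and phi(u + s * du, t) > phi(u, t) + 1e-4 * s * (g @ du):
            s *= 0.5
        u = u + s * du
    t *= 10.0                             # update t

print(np.round(u, 2))                     # close to the unregularized solution
```

With $\epsilon$ this small the recovered model should essentially match the unregularized solution ${\bf C}^{-1}{\bf d}$; increasing $\epsilon$ trades data fit for blockier, smoother models.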

Stanford Exploration Project