Next: Test geometries Up: De Ridder et al.: Previous: Time-domain kernel

Optimization scheme

The FWI objective function $ J_{\rm FWI}$ can be written as:

$\displaystyle J_{\rm FWI}(\mathbf v) = \lVert \mathbf d (\mathbf v) - \mathbf d_{\rm obs} \rVert^2_2,$ (4)

where $ \mathbf v$ is the velocity model, $ \mathbf d (\mathbf v)$ is the computed data, and $ \mathbf d_{\rm obs}$ is the observed data. $ \mathbf d (\mathbf v)$ is computed as:

$\displaystyle d(\mathbf x_s, \mathbf x_r, \omega; \mathbf v) = f(\mathbf x_s, \omega) G(\mathbf x_s, \mathbf x, \omega; \mathbf v) \delta(\mathbf x_r - \mathbf x),$ (5)

where $ f(\mathbf x_s, \omega)$ is the source function, $ \omega$ is frequency, $ \mathbf x_s$ and $ \mathbf x_r$ are the source and receiver coordinates, and $ \mathbf x$ is the model coordinate. In the acoustic, constant-density case the Green's function $ G(\mathbf x_s, \mathbf x, \omega; \mathbf v)$ satisfies:

$\displaystyle \left( \nabla^2 + v^{-2}(\mathbf x)\omega^2 \right) G(\mathbf x_s, \mathbf x, \omega) = \delta(\mathbf x_s - \mathbf x).$ (6)
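As an illustration of equations (4)-(6), the following minimal sketch computes a frequency-domain Green's function and evaluates the objective function for a single source and a single frequency. It is not the implementation used in this paper. The 1-D grid, the dense second-order finite-difference operator, the Dirichlet boundaries, the small imaginary part added to the frequency for stability, and all function names are assumptions made only for this example.

```python
import numpy as np

def greens_function(v, omega, dx, isrc):
    """Solve (d^2/dx^2 + omega^2 / v^2) G = delta(x - x_s) on a 1-D grid."""
    n = v.size
    # Second-order finite-difference Laplacian; Dirichlet boundaries (assumption).
    lap = (np.diag(np.full(n, -2.0)) + np.diag(np.ones(n - 1), 1)
           + np.diag(np.ones(n - 1), -1)) / dx**2
    A = lap + np.diag(omega**2 / v**2)
    rhs = np.zeros(n, dtype=complex)
    rhs[isrc] = 1.0 / dx                       # discrete delta function
    return np.linalg.solve(A, rhs)

def modeled_data(v, omega, dx, isrc, irecs, f=1.0):
    """Equation (5): source function times the Green's function at the receivers."""
    return f * greens_function(v, omega, dx, isrc)[irecs]

def objective(v, d_obs, omega, dx, isrc, irecs):
    """Equation (4): squared L2 norm of the data residual."""
    r = modeled_data(v, omega, dx, isrc, irecs) - d_obs
    return float(np.real(np.vdot(r, r)))

# Hypothetical test case: constant starting model, true model with an anomaly.
n, dx, isrc = 101, 10.0, 5
omega = 2.0 * np.pi * 5.0 * (1.0 + 0.02j)      # complex frequency adds damping (assumption)
irecs = np.arange(10, 100, 10)
v_true = np.full(n, 2000.0)
v_true[40:60] = 2200.0
v_start = np.full(n, 2000.0)

d_obs = modeled_data(v_true, omega, dx, isrc, irecs)
J_true = objective(v_true, d_obs, omega, dx, isrc, irecs)
J_start = objective(v_start, d_obs, omega, dx, isrc, irecs)
print(J_true, J_start)
```

The objective function vanishes at the model that generated the observed data and is positive at the starting model.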

We then separate the model into a background and a perturbation:

$\displaystyle v^{-2}(\mathbf x) = b(\mathbf x) + m(\mathbf x),$ (7)

where $ b(\mathbf x)$ is the background component, which is the current model in units of slowness squared, and $ m(\mathbf x)$ is the perturbation component. After this separation, we can expand the data in a Taylor series around the background component as follows:

$\displaystyle \mathbf d(\mathbf v) = \mathbf d(\mathbf b) + \left.\frac{\partial \mathbf d}{\partial \mathbf v}\right\vert_{\mathbf b} \mathbf m + \cdots .$ (8)

By neglecting the higher-order terms in the data series, we can define the linearized modeling operator $ \mathbf L$ as:

$\displaystyle \Delta \mathbf d(\mathbf v) = \left.\frac{\partial \mathbf d}{\partial \mathbf v}\right\vert_{\mathbf b} \mathbf m = \mathbf L(\mathbf b) \mathbf m.$ (9)

The first-order Born approximation can be used to define the operator:

$\displaystyle \Delta d(\mathbf x_s,\mathbf x_r,\omega; \mathbf b, \mathbf m) = -\omega^2 f(\mathbf x_s,\omega) \sum_{\mathbf x} G(\mathbf x_s,\mathbf x,\omega;\mathbf b) m(\mathbf x) G(\mathbf x,\mathbf x_r,\omega;\mathbf b),$ (10)

where the Green's functions now satisfy the acoustic wave equation in the background medium:

$\displaystyle \left(\nabla^2 + b(\mathbf x) \omega^2 \right) G(\mathbf x_s,\mathbf x,\omega) = \delta(\mathbf x_s-\mathbf x),$ (11)
$\displaystyle \left(\nabla^2 + b(\mathbf x) \omega^2 \right) G(\mathbf x,\mathbf x_r,\omega) = \delta(\mathbf x-\mathbf x_r).$ (12)
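The linearization can be checked numerically. The sketch below compares the first-order scattering term that follows from equations (6), (7), and (11) with the difference between two full modeling runs. It uses an assumed 1-D finite-difference discretization with Dirichlet boundaries and a complex frequency for damping. The function names are hypothetical, and the factor `dx` is the cell size that turns the sum over model points into a discrete integral.

```python
import numpy as np

def greens_function(s2, omega, dx, isrc):
    """Solve (d^2/dx^2 + s2 * omega^2) G = delta(x - x_src) on a 1-D grid."""
    n = s2.size
    lap = (np.diag(np.full(n, -2.0)) + np.diag(np.ones(n - 1), 1)
           + np.diag(np.ones(n - 1), -1)) / dx**2   # Dirichlet boundaries (assumption)
    A = lap + np.diag(omega**2 * s2)
    rhs = np.zeros(n, dtype=complex)
    rhs[isrc] = 1.0 / dx
    return np.linalg.solve(A, rhs)

def born_data(b, m, omega, dx, isrc, irecs, f=1.0):
    """First-order scattered data: -omega^2 f sum_x G(x_s, x) m(x) G(x, x_r) dx."""
    G_src = greens_function(b, omega, dx, isrc)
    out = np.empty(len(irecs), dtype=complex)
    for k, irec in enumerate(irecs):
        G_rec = greens_function(b, omega, dx, irec)   # background Green's function at x_r
        out[k] = -omega**2 * f * dx * np.sum(G_src * m * G_rec)
    return out

def full_difference(b, m, omega, dx, isrc, irecs, f=1.0):
    """Nonlinear data difference d(b + m) - d(b)."""
    return f * (greens_function(b + m, omega, dx, isrc)[irecs]
                - greens_function(b, omega, dx, isrc)[irecs])

# Hypothetical test case: 0.1% slowness-squared perturbation in a constant background.
n, dx, isrc = 101, 10.0, 5
omega = 2.0 * np.pi * 5.0 * (1.0 + 0.02j)
irecs = np.arange(10, 100, 10)
b = np.full(n, 1.0 / 2000.0**2)
m = np.zeros(n)
m[40:60] = 1e-3 * b[40:60]

def born_error(scale):
    diff = (born_data(b, scale * m, omega, dx, isrc, irecs)
            - full_difference(b, scale * m, omega, dx, isrc, irecs))
    return np.linalg.norm(diff)

err_full, err_half = born_error(1.0), born_error(0.5)
rel_err = err_full / np.linalg.norm(full_difference(b, m, omega, dx, isrc, irecs))
print(rel_err, err_full / err_half)
```

Because the neglected terms are of second order in the perturbation, halving the perturbation should reduce the mismatch by roughly a factor of four.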

We can now compute the model gradient $ g(\mathbf x)$ as follows:

$\displaystyle g(\mathbf x) = \frac{\partial J_{\rm FWI}}{\partial \mathbf m} = \mathbf L^* \Delta \mathbf d.$ (13)
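A gradient of this form can be validated against a finite-difference directional derivative of the objective function. The sketch below is one way to do so under assumed conditions: a 1-D finite-difference discretization, a single source and frequency, and hypothetical function names. Since the objective function in equation (4) carries no factor of 1/2 and the data are complex, the code takes twice the real part of the adjoint Born operator applied to the data residual; this normalization is a choice made for the example.

```python
import numpy as np

def greens_function(s2, omega, dx, isrc):
    """Solve (d^2/dx^2 + s2 * omega^2) G = delta(x - x_src) on a 1-D grid."""
    n = s2.size
    lap = (np.diag(np.full(n, -2.0)) + np.diag(np.ones(n - 1), 1)
           + np.diag(np.ones(n - 1), -1)) / dx**2   # Dirichlet boundaries (assumption)
    A = lap + np.diag(omega**2 * s2)
    rhs = np.zeros(n, dtype=complex)
    rhs[isrc] = 1.0 / dx
    return np.linalg.solve(A, rhs)

def objective(s2, d_obs, omega, dx, isrc, irecs, f=1.0):
    r = f * greens_function(s2, omega, dx, isrc)[irecs] - d_obs
    return float(np.real(np.vdot(r, r)))

def gradient(b, d_obs, omega, dx, isrc, irecs, f=1.0):
    """Adjoint of the Born operator applied to the data residual."""
    G_src = greens_function(b, omega, dx, isrc)
    resid = f * G_src[irecs] - d_obs
    # Rows of the Born operator: L[k, x] = -omega^2 f dx G(x_s, x) G(x, x_r_k).
    L = np.array([-omega**2 * f * dx * G_src * greens_function(b, omega, dx, irec)
                  for irec in irecs])
    return 2.0 * np.real(L.conj().T @ resid)

# Hypothetical test case: observed data from a model with a velocity anomaly.
n, dx, isrc = 101, 10.0, 5
omega = 2.0 * np.pi * 5.0 * (1.0 + 0.02j)
irecs = np.arange(10, 100, 10)
v_true = np.full(n, 2000.0)
v_true[40:60] = 2200.0
d_obs = greens_function(1.0 / v_true**2, omega, dx, isrc)[irecs]
b = np.full(n, 1.0 / 2000.0**2)

g = gradient(b, d_obs, omega, dx, isrc, irecs)

# Central finite difference of J along a scaled copy of the gradient.
p = g * (b.min() / np.abs(g).max())
eps = 1e-4
fd = (objective(b + eps * p, d_obs, omega, dx, isrc, irecs)
      - objective(b - eps * p, d_obs, omega, dx, isrc, irecs)) / (2.0 * eps)
print(fd, g @ p)
```

The two printed numbers should agree to several digits if the adjoint is consistent with the forward operator.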

Finally, we can update the model with the gradient:

$\displaystyle b_{\rm new}(\mathbf x) = b(\mathbf x) - \alpha g(\mathbf x),$ (14)

where $ \alpha$ is the step size. To estimate the step size, we first evaluate the objective function at two trial models, obtained by scaling the gradient so that its maximum equals 2% and then 4% of the minimum value of the current model. We fit a parabola through these two points and the objective function value at the current model, which is already computed during the gradient calculation. If the parabola has a minimum on the positive side, i.e., both its curvature and the position of its vertex along the step-size axis are positive, we evaluate the objective function once more at the parabola minimum. We then compare the two or three trial evaluations and take the scale that gives the smallest objective function as the step size, provided that the objective function decreases relative to the current model. Otherwise, we repeat the line search after shrinking the gradient by a factor of 4. The optimization scheme is formulated in the Fourier domain and runs on the CPU, using frequency-domain Green's function solutions computed on the GPU.
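The line search described above can be sketched as follows. This is an illustrative reading of the procedure, not the code used for the results in this paper. The function name, the `max_tries` safeguard, and the quadratic test objective are assumptions introduced for the example.

```python
import numpy as np

def parabolic_step(objective, b, g, J0, fractions=(0.02, 0.04),
                   shrink=4.0, max_tries=10):
    """Return (alpha, J_new) with objective(b - alpha * g) < J0."""
    # Step size that scales max|g| to 100% of min|b|; trial steps are fractions of it.
    unit = np.min(np.abs(b)) / np.max(np.abs(g))
    for _ in range(max_tries):                 # max_tries is a safeguard (assumption)
        a1, a2 = fractions[0] * unit, fractions[1] * unit
        J1 = objective(b - a1 * g)
        J2 = objective(b - a2 * g)
        trials = [(a1, J1), (a2, J2)]
        # Parabola J0 + c1 * a + c2 * a^2 through (0, J0), (a1, J1), (a2, J2).
        c2 = ((J2 - J0) / a2 - (J1 - J0) / a1) / (a2 - a1)
        c1 = (J1 - J0) / a1 - c2 * a1
        if c2 > 0.0 and -c1 / (2.0 * c2) > 0.0:
            a_min = -c1 / (2.0 * c2)           # vertex on the positive side
            trials.append((a_min, objective(b - a_min * g)))
        alpha, J_new = min(trials, key=lambda t: t[1])
        if J_new < J0:
            return alpha, J_new
        unit /= shrink                         # shrink the gradient scale and retry
    raise RuntimeError("line search failed to decrease the objective")

# Hypothetical quadratic objective whose minimizer along -g is alpha = 0.5.
b_star = np.array([1.9, 1.8, 2.0, 2.1, 1.95])
b = np.full(5, 2.0)

def quadratic(x):
    return float(np.sum((x - b_star) ** 2))

g = 2.0 * (b - b_star)
alpha, J_new = parabolic_step(quadratic, b, g, quadratic(b))
print(alpha, J_new)
```

For a quadratic objective the fitted parabola is exact, so the third evaluation lands on the minimum along the search direction. For the FWI objective the fit is only approximate, which is why the trial evaluations are compared before a step is accepted.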
