Next: Dix inversion by an Up: Li and Maysami: L1 Previous: Dix inversion as an

Dix inversion by IRLS method

Bube and Langan (1997) and Tang (2006) show that the nonlinear objective functions, such as the regression equation (3), can be solved by the IRLS algorithm. Many authors have demonstrated successful applications of IRLS as a robust estimator to yield sparse models. To take advantage of the well-established norm regression, we can transform the problem by introducing a diagonal weighting function $\boldsymbol{W_p}$ . Then the fitting goal 2 and the regularization 3 become:

$\displaystyle \vert\vert\boldsymbol{W_d}(\boldsymbol{C}\bold u-d)\vert\vert _2 \approx 0,$

(4)

$\displaystyle \vert\vert\epsilon \boldsymbol{W_p} \boldsymbol{D_{z}}\bold u \vert\vert _2\approx 0,$

(5)

where $\boldsymbol{W_p}$ is a diagonal weighting function on the model residual. To use the gradient-based method, we have to recompute the weight $\boldsymbol{W_p}$ at each iteration, and the algorithm can be summarized as follows:

Set the initial weighting matrix $\boldsymbol{W^{(0)}_p}$ to equal the identity matrix:

$\displaystyle \boldsymbol{W^{(0)}_p} = I$ (6)
At the iteration, the element of the weighting matrix is recomputed as

$\displaystyle W^{(k-1)}_{pi} = (\vert p_i^{k-1}\vert)^{-1/2}$ (7)

where $\bold p = \boldsymbol{D_{z}}\bold u$ is the model residual. However, applying equation (7) to compute the weighting matrix is dangerous when

is zero or too small. One way to avoid this issue is to bound the weights at a certain cutoff $\sigma$ :

$\displaystyle \bold W^{(k-1)}_{pi} = \left\{ \begin{array}{rl} (\vert p_i^{k-1}... ...-1}\vert \geq \sigma$} \sigma^{-1/2} &\mbox{ otherwise}. \end{array} \right.$

(8)

One of the most important disadvantages of IRLS algorithm arises in equation 8: how should we choose the cutoff number $\sigma$ ? We would like to derive this number automatically according to its physical meaning, instead of cumbersome numerical experiments.

Further examining the weighting function, we notice that when applying the truncated weights, we end up treating small residuals in the norm, and at the turning point ( $p=\sigma$ ) we have a sharp transition to the norm. Thus, $\sigma$ is the cutoff between the region and the region, determining the tolerance to the large residuals. Therefore, we can choose $\sigma$ according to the desired blockiness of the model space. For the synthetic example, which is a 40-point-long interval velocity model with three layers, we expect only two spikes out of those 40 points in the derivative. Therefore we would like $\sigma$ to be around the $95^{th}$ percentile of the derivative, allowing 5% of the spikes to be of unlimited size, while the others are small.

Next: Dix inversion by an Up: Li and Maysami: L1 Previous: Dix inversion as an

2009-10-19