
Introduction

I've seen many applications improved when least-squares ($L_2$) model fitting was changed to least absolute values ($L_1$). I've never seen the reverse. Nevertheless, we always return to $L_2$ because the solving method is easier and faster: it does not require us to specify numerical-analysis parameters whose meaning is unclear.

Another reason to re-investigate $L_1$ is its natural ability to estimate blocky models. Sedimentary sections tend to fluctuate randomly, but sometimes there is a homogeneous material continuing for some distance. A function with such homogeneous regions is called ``blocky''. The derivative of such a function is called ``sparse''. $L_2$ gives huge penalties to large values and minuscule penalties to small ones, hence it never really produces sparse functions, and their integrals are never really blocky. If we had an easy, reliable $L_1$ solver, we could expect to see many more realistic solutions.
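
To make the penalty asymmetry concrete, here is a minimal numerical sketch (NumPy; the two toy models are invented for illustration, not taken from any real section). The blocky model has a sparse derivative with two large jumps; the smooth model spreads a similar total rise over many tiny steps. $L_2$ penalizes the blocky derivative far more than the smooth one, while $L_1$ is nearly indifferent.

    import numpy as np

    # A blocky model: piecewise constant, so its derivative is sparse.
    blocky = np.concatenate([np.full(20, 1.0), np.full(20, 3.0), np.full(20, 2.0)])
    # A smooth model spanning a similar range, with a small derivative everywhere.
    smooth = np.linspace(1.0, 3.0, 60)

    for name, model in [("blocky", blocky), ("smooth", smooth)]:
        d = np.diff(model)                 # the model derivative
        print(name, "L1 =", np.sum(np.abs(d)), "L2 =", np.sum(d**2))
    # blocky: L1 = 3.0,  L2 = 5.0
    # smooth: L1 = 2.0,  L2 ~ 0.07  (L2 strongly prefers the smooth model)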

There are reasons to abandon strict $L_1$ and revert to an $L_1/L_2$ hybrid solver. A hybrid solver has a parameter, a threshold, at which $L_2$ behavior transitions to $L_1$. We have good reasons to use two different hybrid solvers, one for the data fitting and the other for the model styling (prior knowledge or regularization). Each requires a threshold on the residual; call it $R_d$ for the data fitting and $R_m$ for the model styling. Processes that require parameters are detestable when we have a poor idea of what the parameters mean (especially if they relate to numerical analysis); however, the meaning of the thresholds $R_d$ and $R_m$ is quite clear. When we look at a shot gather and see that about 30% of the area is covered with ground roll, it is clear we would like to choose $R_d$ at about the 70th percentile of the fitting residual. As for the model styling, if we would like to see blocks about 20 points long, our spikes should average about 20 points apart, so we would like $R_m$ at about the 95th percentile, allowing 5% of the spikes to be of unlimited size while the others stay small.
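
As a hedged sketch of such a hybrid penalty: the Huber function is one standard $L_1/L_2$ hybrid, quadratic below the threshold and linear above it; the text does not commit to this particular form, and the names below (hybrid_penalty, data_resid, model_deriv) are mine, with synthetic stand-in arrays. The percentile rule from the paragraph above translates directly into numpy.percentile calls.

    import numpy as np

    def hybrid_penalty(r, thresh):
        # Huber-style hybrid: L2 (quadratic) below thresh, L1 (linear) above,
        # with value and slope matched at |r| = thresh.
        a = np.abs(r)
        return np.where(a <= thresh,
                        0.5 * a**2,
                        thresh * a - 0.5 * thresh**2)

    # Thresholds picked by percentile, as described in the text.
    rng = np.random.default_rng(seed=1)
    data_resid = rng.standard_normal(1000)        # stand-in data-fitting residual
    model_deriv = rng.standard_normal(1000)       # stand-in model-styling residual

    R_d = np.percentile(np.abs(data_resid), 70)   # ~30% of residuals (ground roll) go L1
    R_m = np.percentile(np.abs(model_deriv), 95)  # ~5% of spikes allowed unlimited size
    total = hybrid_penalty(data_resid, R_d).sum() + hybrid_penalty(model_deriv, R_m).sum()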

I was first attracted to strict $L_1$ by its potential for blocky models. But then I realized that for each nonspike (zero) on the time axis, theory says I would need a ``basis equation''. That implies an immense number of iterations, which makes it unacceptable in imaging applications. With the hybrid solvers, instead of exact zeros we have a large region driven down by the $L_2$ norm and a small $L_1$ region where large spikes are welcomed.
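
A one-line gradient computation shows why the hybrid behaves this way (again assuming the Huber-style penalty sketched above, which may differ from the author's exact solver). In the $L_2$ region the gradient is proportional to the residual, so small values are pushed smoothly toward zero with no ``basis equation'' per zero; in the $L_1$ region the gradient saturates at the threshold, so a spike can grow large at constant marginal cost.

    import numpy as np

    def hybrid_gradient(r, thresh):
        # Derivative of the Huber-style penalty: r itself in the L2 region,
        # +/- thresh (a bounded, constant pull) in the L1 region.
        return np.clip(r, -thresh, thresh)

    print(hybrid_gradient(np.array([0.01, 0.1, 1.0, 10.0]), thresh=0.5))
    # -> [0.01 0.1  0.5  0.5]  small values feel a proportional (L2) pull;
    #    large spikes feel only the bounded (L1) pull and are welcomed.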



2009-10-19