Pure L1 Solutions

Early work focused on a pure conjugate-gradient or steepest-descent method using the true $ L_1$ norm. We implemented a modified version of cg_step, first performing the line search as a weighted-median search, and also with a full, explicit calculation of the Fréchet derivative. We found that the gradient of the pure $ L_1$ objective introduces spurious local minima, and the method therefore fails to converge. This has been noted in prior work (Bube and Langan, 1997): the pure $ L_1$ norm is not strictly convex. The difficulty is illustrated by the trivial example of finding the $ L_1$ minimum of a median problem with an even number of elements:

$\displaystyle \begin{bmatrix}1 \\ 1\end{bmatrix} \begin{bmatrix}m_1\end{bmatrix} = \begin{bmatrix}1 \\ 2\end{bmatrix}$ (1)

The $ L_1$ residual is the sum $ \vert m - 1\vert + \vert m - 2\vert$, which attains its global minimum value of 1 at every $ m \in [1, 2]$: the minimizer is nonunique. Throughout this entire interval the gradient is exactly zero. In a larger, nontrivial data-fitting problem, these zero-gradient regions can occur at locations other than the global optimum, and gradient-based methods therefore do not function properly.
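
As a concrete check, the following minimal Python sketch (an illustration added here, not the cg_step implementation described above) evaluates the misfit of equation (1) and its gradient on a grid of $ m$ values, confirming that the gradient vanishes across the whole interior of $ [1, 2]$:

import numpy as np

# L1 misfit of equation (1): |m - 1| + |m - 2|.
m = np.linspace(0.0, 3.0, 301)
residual = np.abs(m - 1.0) + np.abs(m - 2.0)

# Gradient of the L1 misfit: sign(m - 1) + sign(m - 2).
# (At the kinks m = 1 and m = 2 only a subgradient exists.)
gradient = np.sign(m - 1.0) + np.sign(m - 2.0)

interior = (m > 1.0) & (m < 2.0)
print(residual[interior].min(), residual[interior].max())  # ~1.0 1.0: flat minimum
print(np.abs(gradient[interior]).max())                    # 0.0: zero gradient

Any gradient-based update started inside this flat region sees a zero search direction and stalls, which is the failure mode described above.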

