Pure L1 Solutions

Early work focused on a pure conjugate-gradient or steepest-descent method using the true $ L_1$ norm. We implemented a modified version of cg_step, first performing the line search as a weighted-median search, and also with a full, explicit calculation of the Fréchet derivative. We found that the gradient of the pure $ L_1$ objective introduces spurious local minima, and the method therefore fails to converge. This has been noted in prior work (Bube and Langan, 1997): the pure $ L_1$ norm is not strictly convex. The difficulty is illustrated by the trivial example of finding the $ L_1$ minimum of a median problem with an even number of elements:

$\displaystyle \begin{bmatrix}1 \\ 1\end{bmatrix} \begin{bmatrix}m_1\end{bmatrix} = \begin{bmatrix}1 \\ 2\end{bmatrix}$ (1)

The $ L_1$ residual is the sum $ \vert m - 1\vert + \vert m - 2\vert$, which attains its global minimum value of 1 at every $ m \in [1, 2]$: the minimizer is nonunique. Throughout this entire interval the gradient is exactly zero. In a larger, nontrivial data-fitting problem, these zero-gradient regions can occur at locations other than the global optimum, and gradient-based methods therefore do not function properly.
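
As a concrete check, the following minimal Python sketch (an illustration added here, not the cg_step implementation described above) evaluates the misfit of equation (1) and its gradient on a grid of $ m$ values, confirming that the gradient vanishes across the whole interior of $ [1, 2]$:

import numpy as np

# L1 misfit of equation (1): |m - 1| + |m - 2|.
m = np.linspace(0.0, 3.0, 301)
residual = np.abs(m - 1.0) + np.abs(m - 2.0)

# Gradient of the L1 misfit: sign(m - 1) + sign(m - 2).
# (At the kinks m = 1 and m = 2 only a subgradient exists.)
gradient = np.sign(m - 1.0) + np.sign(m - 2.0)

interior = (m > 1.0) & (m < 2.0)
print(residual[interior].min(), residual[interior].max())  # ~1.0 1.0: flat minimum
print(np.abs(gradient[interior]).max())                    # 0.0: zero gradient

Any gradient-based update started inside this flat region sees a zero search direction and stalls, which is the failure mode described above.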

