How To Make An L-1 Norm Conjugate Direction Solver
			  by Jon 6-21-2009


cg_step solves about all the problems in GEE and a great number in
the SEP reports and theses.   Let us convert cg_step to an L1 program.
That way we will have a great number of ready made test cases!  Hooray!
Wouldn't it be nice if we could abandon IRLS
(since it is a kludge that nobody understands very well)?

Consider the L1 weighted median
0 \approx SUM_i  W_i | m - d_i| 

Let us review the weighted median solver:
	Draw a collection of shifted absolute value functions on the real line
	each based at a data value d_i.  The d_i are NOT sorted by size.
	We are going to find the m that minimizes their sum, i.e. SUM_i w_i|m-d_i|.

	We take a pile of data values and split it into two piles.
	Choose starting m= the d_i of a random i.
	Scan all data d_i {
		if(m>d_i)   splus  = splus  + w_i  # add positive slopes
		else        sminus = sminus + w_i  # add negative slopes
		Compare the slopes.  All the data values in the weaker
		slope sum do not contain the weighted median.
		We can ignore them from now on (but remember their slope sum).
		Using data values from the stronger pile only,
		go back and split the stronger pile.
		Iterate until your pile size drops to one.
		}
	The data value in your pile is the answer.
	For randomly sorted data the cost is 3*N comparisons.
	My 35 year old code is in /homes/sep/jon/math/l1/1973/{split.r,elskew.r}


Simplify the notation for weighted median
0 \approx SUM_i  W_i | m - d_i| = |w_i m - w_i d_i|
0 \approx |b_i m -c_i|
Which may be solved for m by the method above.

Recall the L2 gradient is simply the residual vector put into the adjoint.
Recall from GEE chapter noiz that the L1 gradient
is the residual vector stuck into the signum function stuck into the adjoint.

Recall that steepest descent does a line search analytically
finding an alpha = - (dr . r) / ( dr . dr)

Recall that conjugate directions searches the (alpha, beta) plane
for the minimum of
( alpha g(i) + beta s(i) -r(i)) dotted on itself.
This is done in cg_step with a 2x2 set of simultaneous equations.

What is the L1 norm solution (alpha,beta) of the regression

0 \approx SUM_i | alpha g(i) + beta s(i) -r(i)|    ?

That's a new problem in two unknowns, but it shouldn't be difficult
for those of us who know how to find the weighted median.  
Here's a first guess:
	set alpha=0, beta=0
	iterate{
		Take beta=0, solve weighted median for alpha, alpha1=alpha
		Take alpha=alpha1, solve weighted median for beta1
		}
Here is a second guess:
	We have an Nx2 problem.  Get the two component gradient
	d alpha = g dot sgn(r)
	d beta  = s dot sgn(r)
	Do the weighted median scan on the dm=d(alfa,beta)
Here is a third guess:
	Set beta=0, solve for alfa finding the j-th equation exactly satisfied.
	Set alfa=0, solve for beta finding the k-th equation exactly satisfied.
	Solve the 2x2 system of those two equations.
	Now you have an (alfa,beta).  Multiply it by unknown gamma
	and scan (weighted median) for gamma.
Fourth guess:
	There are only two unknowns.
	We should be able to get it exactly
	in a countable number of steps proportional to N.