The L-BFGS-B algorithm is an extension of the L-BFGS algorithm that
handles simple bounds on the model (Zhu et al., 1997). The L-BFGS algorithm is a very
efficient algorithm for solving large-scale problems.
L-BFGS-B borrows ideas from trust region methods while keeping
the L-BFGS update of the Hessian and the line search algorithms.
Methods based entirely on trust region techniques exist and
are freely available. Among them, the program SBMIN from the LANCELOT
package
(written in Fortran 77) is very popular (Conn et al., 1992). The original L-BFGS-B code is
freely available in Fortran 77 from the Northwestern University
webpage.
One iteration of L-BFGS-B works as follows:
1. Find an approximation of the Cauchy point,
   \[
     x^c = \arg\min_{t \ge 0} \; m\!\left( P(x_k - t\,\nabla f(x_k),\, l,\, u) \right),  (8)
   \]
   where $m$ is the quadratic form in equation (5) and $P(\cdot, l, u)$
   projects onto the bounds $l \le x \le u$. Identify the active
   (bounded) and free variables.
2. Minimize the quadratic form in equation (5) over the free
   (unconstrained) variables. This step gives a search direction.
3. Perform a line search along the new search direction
   to find the minimum of the objective function.
4. Update the Hessian with the L-BFGS method (Nocedal, 1980)
   and check if convergence is obtained.
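The L-BFGS update in step (4) is usually applied implicitly through Nocedal's two-loop recursion, which computes the product of the inverse-Hessian approximation with the gradient from the stored update pairs. A minimal pure-Python sketch (function and variable names are illustrative, not taken from the original Fortran code):

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def two_loop_recursion(grad, s_list, y_list):
    """Return H_k * grad, where H_k is the L-BFGS inverse-Hessian
    approximation built from the stored pairs s_i = x_{i+1} - x_i and
    y_i = g_{i+1} - g_i (Nocedal, 1980)."""
    q = list(grad)
    rhos = [1.0 / dot(y, s) for s, y in zip(s_list, y_list)]
    alphas = []
    # First loop: walk the stored pairs from newest to oldest.
    for s, y, rho in zip(reversed(s_list), reversed(y_list), reversed(rhos)):
        a = rho * dot(s, q)
        alphas.append(a)
        q = [qi - a * yi for qi, yi in zip(q, y)]
    # Scale by gamma = s.y / y.y, a common choice of initial Hessian.
    if s_list:
        gamma = dot(s_list[-1], y_list[-1]) / dot(y_list[-1], y_list[-1])
    else:
        gamma = 1.0
    r = [gamma * qi for qi in q]
    # Second loop: walk the pairs from oldest to newest.
    for s, y, rho, a in zip(s_list, y_list, rhos, reversed(alphas)):
        b = rho * dot(y, r)
        r = [ri + (a - b) * si for ri, si in zip(r, s)]
    return r
```

As a sanity check, for a quadratic objective with identity Hessian every pair satisfies y_i = s_i, and the recursion reproduces the gradient exactly.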
In a ``pure'' trust region method, step (3) is replaced by tests
assessing the success of the update, which measure the difference
between the quadratic form and the true objective function at the
update. In addition, the trust-region radius is also examined.
The L-BFGS-B algorithm is affordable for very large problems.
The memory requirement is roughly (12+2m)N, where m is the number
of BFGS updates kept in memory and N is the size of the model space.
In practice, m=5 is a typical choice. Per iteration, the number
of multiplications ranges from 4mN+N when no constraints are applied to
m^2 N when all variables are bounded. The program offers the freedom
to set different bounds for different points of the model space.
In addition, some points can be constrained while others are not.
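A quick back-of-the-envelope check of the memory figure above, with N = 10^6 as an arbitrary example size:

```python
m = 5                         # number of stored BFGS updates (typical choice)
N = 10**6                     # size of the model space (example value)
n_floats = (12 + 2 * m) * N   # roughly (12+2m)N floats of workspace
bytes_single = 4 * n_floats   # 4 bytes per single-precision float
print(n_floats, bytes_single) # 22000000 88000000
```

That is, a million-parameter model needs on the order of 22 million floats, or about 88 MB in single precision.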
There are three different stopping criteria for the L-BFGS-B algorithm.
First, the program stops when the maximum number of iterations is reached.
Second, it stops when the decrease of the objective
function between two iterations becomes small enough.
Third, it stops when the infinity norm of the projected gradient is small enough.
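The three criteria can be combined into a single test. A hedged sketch follows; the names ftol and pgtol mirror the role of the factr and pgtol parameters of the Fortran code, but the exact formulas used there differ in detail:

```python
def should_stop(iteration, max_iter, f_old, f_new, proj_grad,
                ftol=1e-9, pgtol=1e-5):
    """Return True when any of the three L-BFGS-B stopping tests fires."""
    if iteration >= max_iter:                    # criterion 1: iteration cap
        return True
    denom = max(abs(f_old), abs(f_new), 1.0)     # guard against f near zero
    if abs(f_old - f_new) <= ftol * denom:       # criterion 2: small decrease
        return True
    if max(abs(g) for g in proj_grad) <= pgtol:  # criterion 3: small projected
        return True                              # gradient (infinity norm)
    return False
```

In practice the gradient test usually fires first on well-scaled problems, while the relative-decrease test catches objectives that flatten out before the gradient is small.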
Tests indicate that the L-BFGS-B algorithm,
run in single precision with no constraints, is not quite twice
as slow per iteration as a conjugate gradient solver.
This result is quite remarkable considering that L-BFGS-B works for any type
of non-linear (or linear) problem with line searches. In addition, the
number of iterations needed to converge is almost
identical for both L-BFGS-B and the conjugate gradient solver.
In the next section, the L-BFGS-B algorithm
is utilized to estimate local dips from seismic data.
Stanford Exploration Project
10/23/2004