
Proof

Starting from equation (2), and assuming the curvature condition $\bold{y}^T \bold{s} > 0$, we can write for all $\bold{z} \neq 0$,

\begin{displaymath}
\bold{z}^T\bold{H}_{n+1}\bold{z} = \bold{z}^T\bold{H}_n\bold{z} + \frac{(\bold{z}^T\bold{y})^2}{\bold{y}^T\bold{s}} - \frac{(\bold{z}^T\bold{H}_n\bold{s})^2}{\bold{s}^T\bold{H}_n\bold{s}}.
\end{displaymath}

Since $\bold{H}_n$ is spd, the Cauchy-Schwarz inequality in the inner product induced by $\bold{H}_n$ gives

\begin{displaymath}
(\bold{z}^T\bold{H}_n\bold{s})^2 \leq (\bold{s}^T\bold{H}_n\bold{s})(\bold{z}^T\bold{H}_n\bold{z})\end{displaymath}

with equality if and only if $\bold{z}$ and $\bold{s}$ are linearly dependent. If $\bold{z}$ is not a multiple of $\bold{s}$, the inequality is strict, and since $\bold{y}^T \bold{s} > 0$,

\begin{displaymath}
\bold{z}^T\bold{H}_{n+1}\bold{z} > \frac{(\bold{z}^T\bold{y})^2}{\bold{y}^T\bold{s}} \geq 0.
\end{displaymath}

If instead $\bold{z} = \lambda \bold{s}$ with $\lambda \neq 0$, then $\bold{z}^T\bold{H}_{n+1}\bold{z} = (\bold{z}^T\bold{y})^2/(\bold{y}^T\bold{s}) = \lambda^2 \, \bold{y}^T\bold{s} > 0$. In both cases $\bold{z}^T\bold{H}_{n+1}\bold{z} > 0$, so $\bold{H}_{n+1}$ is spd. If during the iterations we have $\bold{y}^T\bold{s} \leq 0$, then the update fails.
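
This failure test translates directly into a safeguard in code. The following is a minimal NumPy sketch, assuming equation (2) is the standard BFGS Hessian update $\bold{H}_{n+1} = \bold{H}_n + \bold{y}\bold{y}^T/(\bold{y}^T\bold{s}) - \bold{H}_n\bold{s}\bold{s}^T\bold{H}_n/(\bold{s}^T\bold{H}_n\bold{s})$; the function name is an illustrative choice, not part of the original text.

\begin{verbatim}
import numpy as np

def bfgs_update(H, s, y):
    """One BFGS update of the Hessian approximation H.

    If the curvature condition y^T s > 0 fails, the update is
    skipped so that H stays symmetric positive definite.
    """
    ys = y @ s
    if ys <= 0.0:
        return H                  # update failure: keep previous H
    Hs = H @ s
    return H + np.outer(y, y) / ys - np.outer(Hs, Hs) / (s @ Hs)
\end{verbatim}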

The previous lemma is very important since it shows that, starting from an initial spd Hessian $\bold{H}_0$, every subsequent approximation of the Hessian is spd (given that $\bold{y}^T \bold{s} > 0$). Since the inverse $\bold{H}^{-1}$ is then also spd, the local quadratic model of $f$ has a unique minimizer and the quasi-Newton step is well defined. It can be shown (Kelley, 1999) that, under suitable assumptions, the BFGS iterates are well defined and converge q-superlinearly to the local minimizer $\bold{x}^*$.

In practice, the storage needed to compute the update and the possibility that $\bold{y}^T\bold{s} \leq 0$ are important issues. The updated Hessian is computed recursively at each iteration. For this, we need to store a solution step vector $\bold{s}$ and a gradient step vector $\bold{y}$ after each iteration. While this storage is not an issue for small problems, it can become critical for large-scale problems. Unfortunately, such large-scale problems are common in geophysics, and we need a cheaper storage scheme. Nocedal (1980) gives an interesting answer to this problem: instead of keeping the $\bold{s}$ and $\bold{y}$ vectors from all past iterations, we update the Hessian using the information from only the $m$ previous iterations, where $m$ is chosen by the end user. This means that while the number of iterations is smaller than $m$ we have a ``real'' BFGS update, and once it is larger than $m$ we have a Limited-memory BFGS (L-BFGS) update.
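
A minimal sketch of this limited-memory scheme follows, using the standard two-loop recursion of Nocedal (1980), which applies the inverse of the approximate Hessian to the gradient directly; the initial scaling $\gamma = \bold{s}^T\bold{y}/\bold{y}^T\bold{y}$ and all variable names are illustrative choices.

\begin{verbatim}
import numpy as np
from collections import deque

def lbfgs_direction(g, pairs):
    """Two-loop recursion: apply the inverse L-BFGS Hessian to g.

    `pairs` holds only the m most recent (s, y) vectors, so the
    storage is 2*m vectors regardless of the iteration count.
    """
    q = np.array(g, dtype=float)
    alphas = []
    for s, y in reversed(pairs):       # first loop: newest to oldest
        a = (s @ q) / (y @ s)
        alphas.append(a)
        q -= a * y
    if pairs:                          # scale by gamma = s^T y / y^T y
        s, y = pairs[-1]
        q *= (s @ y) / (y @ y)
    for (s, y), a in zip(pairs, reversed(alphas)):   # oldest to newest
        b = (y @ q) / (y @ s)
        q += (a - b) * s
    return -q                          # quasi-Newton search direction
\end{verbatim}

After each iteration the caller appends the new pair with \texttt{pairs.append((s, y))}; with \texttt{pairs = deque(maxlen=m)} the pair from $m+1$ iterations ago is discarded automatically, which is exactly the limited-memory behavior described above.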

