
Proof

Starting from equation (2), and assuming the curvature condition $\bold{y}^T \bold{s} > 0$, we can write for all $\bold{z} \neq 0$,

\begin{displaymath}
\bold{z}^T\bold{H}_{n+1}\bold{z} = \bold{z}^T\bold{H}_n\bold{z} + \frac{(\bold{z}^T\bold{y})^2}{\bold{y}^T\bold{s}} - \frac{(\bold{z}^T\bold{H}_n\bold{s})^2}{\bold{s}^T\bold{H}_n\bold{s}}.
\end{displaymath}

Since $\bold{H}_n$ is spd, the Cauchy-Schwarz inequality in the inner product induced by $\bold{H}_n$ gives

\begin{displaymath}
(\bold{z}^T\bold{H}_n\bold{s})^2 \leq (\bold{s}^T\bold{H}_n\bold{s})(\bold{z}^T\bold{H}_n\bold{z})\end{displaymath}

with equality if and only if $\bold{z}$ and $\bold{s}$ are linearly dependent. If $\bold{z}$ is not a multiple of $\bold{s}$, the inequality is strict, and since $\bold{y}^T \bold{s} > 0$,

\begin{displaymath}
\bold{z}^T\bold{H}_{n+1}\bold{z} > \frac{(\bold{z}^T\bold{y})^2}{\bold{y}^T\bold{s}} \geq 0.
\end{displaymath}

If instead $\bold{z} = \lambda \bold{s}$ with $\lambda \neq 0$, then $\bold{z}^T\bold{H}_{n+1}\bold{z} = (\bold{z}^T\bold{y})^2/(\bold{y}^T\bold{s}) = \lambda^2 \, \bold{y}^T\bold{s} > 0$. In both cases $\bold{z}^T\bold{H}_{n+1}\bold{z} > 0$, so $\bold{H}_{n+1}$ is spd. If during the iterations we have $\bold{y}^T\bold{s} \leq 0$, then the update fails.
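
This failure test translates directly into a safeguard in code. The following is a minimal NumPy sketch, assuming equation (2) is the standard BFGS Hessian update $\bold{H}_{n+1} = \bold{H}_n + \bold{y}\bold{y}^T/(\bold{y}^T\bold{s}) - \bold{H}_n\bold{s}\bold{s}^T\bold{H}_n/(\bold{s}^T\bold{H}_n\bold{s})$; the function name is an illustrative choice, not part of the original text.

\begin{verbatim}
import numpy as np

def bfgs_update(H, s, y):
    """One BFGS update of the Hessian approximation H.

    If the curvature condition y^T s > 0 fails, the update is
    skipped so that H stays symmetric positive definite.
    """
    ys = y @ s
    if ys <= 0.0:
        return H                  # update failure: keep previous H
    Hs = H @ s
    return H + np.outer(y, y) / ys - np.outer(Hs, Hs) / (s @ Hs)
\end{verbatim}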

The previous lemma is very important since it shows that, starting from an initial spd Hessian $\bold{H}_0$, every subsequent approximation of the Hessian is spd (given that $\bold{y}^T \bold{s} > 0$). Since the inverse $\bold{H}^{-1}$ is then also spd, the local quadratic model of $f$ has a unique minimizer and the quasi-Newton step is well defined. It can be shown (Kelley, 1999) that, under suitable assumptions, the BFGS iterates are well defined and converge q-superlinearly to the local minimizer $\bold{x}^*$.

In practice, the storage needed to compute the update and the possibility that $\bold{y}^T\bold{s} \leq 0$ are important issues. The updated Hessian is computed recursively at each iteration. For this, we need to store a solution step vector $\bold{s}$ and a gradient step vector $\bold{y}$ after each iteration. While this storage is not an issue for small problems, it can become critical for large-scale problems. Unfortunately, such large-scale problems are common in geophysics, and we need a cheaper storage scheme. Nocedal (1980) gives an interesting answer to this problem: instead of keeping the $\bold{s}$ and $\bold{y}$ vectors from all past iterations, we update the Hessian using the information from only the $m$ previous iterations, where $m$ is chosen by the end user. This means that while the number of iterations is smaller than $m$ we have a ``real'' BFGS update, and once it is larger than $m$ we have a Limited-memory BFGS (L-BFGS) update.
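
A minimal sketch of this limited-memory scheme follows, using the standard two-loop recursion of Nocedal (1980), which applies the inverse of the approximate Hessian to the gradient directly; the initial scaling $\gamma = \bold{s}^T\bold{y}/\bold{y}^T\bold{y}$ and all variable names are illustrative choices.

\begin{verbatim}
import numpy as np
from collections import deque

def lbfgs_direction(g, pairs):
    """Two-loop recursion: apply the inverse L-BFGS Hessian to g.

    `pairs` holds only the m most recent (s, y) vectors, so the
    storage is 2*m vectors regardless of the iteration count.
    """
    q = np.array(g, dtype=float)
    alphas = []
    for s, y in reversed(pairs):       # first loop: newest to oldest
        a = (s @ q) / (y @ s)
        alphas.append(a)
        q -= a * y
    if pairs:                          # scale by gamma = s^T y / y^T y
        s, y = pairs[-1]
        q *= (s @ y) / (y @ y)
    for (s, y), a in zip(pairs, reversed(alphas)):   # oldest to newest
        b = (y @ q) / (y @ s)
        q += (a - b) * s
    return -q                          # quasi-Newton search direction
\end{verbatim}

After each iteration the caller appends the new pair with \texttt{pairs.append((s, y))}; with \texttt{pairs = deque(maxlen=m)} the pair from $m+1$ iterations ago is discarded automatically, which is exactly the limited-memory behavior described above.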

