Since is spd, we have
with equality only if or . But we have and so that
Then is spd. If during the iterations we have , then the update is a failure.
The previous lemma is very important since it shows that starting from an initial spd Hessian , the next approximation of the Hessian is spd (given that ). This guarantees the existence of a minimizer for the function f (the inverse is also spd). It can be shown Kelley (1999) that given some assumptions, the BFGS iterates are defined and converge q-superlinearly to the local minimizer .
In practice, the storage needed to compute the update and the possibility that are important issues. The updated Hessian is computed at each iteration recursively. For this, we need to store a solution step vector and a gradient step vector after each iteration. If for small problems this storage is not an issue, it may become critical for large-scale problems. Unfortunately, these large-scale problems occur in geophysics, and we need to find a better storage solution. Nocedal (1980) gives an interesting answer to this problem. Instead of keeping all the and from the past iterations, we update the Hessian using the information from the m previous iterations, where m is given by the end user. This means that when the number of iterations is smaller than m, we have a ``real'' BFGS update, and when it is larger than m, we have a Limited-memory BFGS (L-BFGS) update.