Abstract
We study the numerical performance of a limited memory quasi-Newton method for large scale optimization, which we call the L-BFGS method. We compare its performance with that of the method developed by Buckley and LeNir (1985), which combines cycles of BFGS steps and conjugate direction steps. Our numerical tests indicate that the L-BFGS method is faster than the method of Buckley and LeNir, and is better able to use additional storage to accelerate convergence. We show that the L-BFGS method can be greatly accelerated by means of a simple scaling. We then compare the L-BFGS method with the partitioned quasi-Newton method of Griewank and Toint (1982a). The results show that, for some problems, the partitioned quasi-Newton method is clearly superior to the L-BFGS method. However, we find that for other problems the L-BFGS method is very competitive because of its low cost per iteration. We also study the convergence properties of the L-BFGS method, and prove global convergence on uniformly convex problems.
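To make the idea of a limited memory update with a simple scaling concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' code. The abstract does not spell out the scaling, so the sketch assumes the widely used choice of scaling the initial inverse-Hessian approximation by gamma = (s·y)/(y·y), taken from the most recent correction pair. It also substitutes a plain backtracking (Armijo) line search for whatever line search the paper uses. The names `dot`, `lbfgs_direction`, and `minimize_lbfgs` are hypothetical.

```python
from collections import deque


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def lbfgs_direction(grad, pairs):
    """Return -H * grad using the two-loop recursion.

    `pairs` is a list of (s, y) correction pairs, oldest first, with
    s = x_new - x_old and y = grad_new - grad_old.
    """
    q = list(grad)
    alphas = []
    # First loop: walk the stored pairs from newest to oldest.
    for s, y in reversed(pairs):
        alpha = dot(s, q) / dot(y, s)
        alphas.append(alpha)
        q = [qi - alpha * yi for qi, yi in zip(q, y)]
    # Assumed scaling of the initial matrix: H0 = gamma * I.
    if pairs:
        s, y = pairs[-1]
        gamma = dot(s, y) / dot(y, y)
    else:
        gamma = 1.0  # no curvature information yet: steepest descent
    r = [gamma * qi for qi in q]
    # Second loop: walk the stored pairs from oldest to newest.
    for (s, y), alpha in zip(pairs, reversed(alphas)):
        beta = dot(y, r) / dot(y, s)
        r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]
    return [-ri for ri in r]


def minimize_lbfgs(f, grad_f, x0, m=5, max_iter=200, tol=1e-8):
    """Minimize f, keeping at most m correction pairs in memory."""
    x = list(x0)
    g = grad_f(x)
    pairs = deque(maxlen=m)  # the oldest pair is dropped automatically
    for _ in range(max_iter):
        if dot(g, g) ** 0.5 < tol:
            break
        d = lbfgs_direction(g, list(pairs))
        # Backtracking line search (an assumption of this sketch).
        fx, slope, t = f(x), dot(g, d), 1.0
        while t > 1e-12:
            x_new = [xi + t * di for xi, di in zip(x, d)]
            if f(x_new) <= fx + 1e-4 * t * slope:
                break
            t *= 0.5
        g_new = grad_f(x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        if dot(s, y) > 0.0:  # keep only pairs with positive curvature
            pairs.append((s, y))
        x, g = x_new, g_new
    return x
```

The storage parameter `m` is the quantity the abstract refers to when it discusses using additional storage: the sketch keeps only the last `m` pairs, so memory and work per iteration grow linearly with `m` and with the number of variables.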
References
E.M.L. Beale, “Algorithms for very large nonlinear optimization problems,” in: M.J.D. Powell, ed., Nonlinear Optimization 1981 (Academic Press, London, 1981) pp. 281–292.
A. Buckley, “A combined conjugate gradient quasi-Newton minimization algorithm,” Mathematical Programming 15 (1978) 200–210.
A. Buckley, “Update to TOMS Algorithm 630,” Rapports Techniques No. 91, Institut National de Recherche en Informatique et en Automatique, Domaine Voluceau, Rocquencourt, B.P. 105 (Le Chesnay, 1987).
A. Buckley and A. LeNir, “QN-like variable storage conjugate gradients,” Mathematical Programming 27 (1983) 155–175.
A. Buckley and A. LeNir, “BBVSCG—A variable storage algorithm for function minimization,” ACM Transactions on Mathematical Software 11/2 (1985) 103–119.
R.H. Byrd and J. Nocedal, “A tool for the analysis of quasi-Newton methods with application to unconstrained minimization,” SIAM Journal on Numerical Analysis 26 (1989) 727–739.
J.E. Dennis Jr. and R.B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations (Prentice-Hall, 1983).
J.E. Dennis Jr. and R.B. Schnabel, “A view of unconstrained optimization,” in: G.L. Nemhauser, A.H.G. Rinnooy Kan and M.J. Todd, eds., Handbooks in Operations Research and Management Science, Vol. 1, Optimization (North-Holland, Amsterdam, 1989) pp. 1–72.
R. Fletcher, Practical Methods of Optimization, Vol. 1, Unconstrained Optimization (Wiley, New York, 1980).
J.C. Gilbert and C. Lemaréchal, “Some numerical experiments with variable storage quasi-Newton algorithms,” IIASA Working Paper WP-88, A-2361 (Laxenburg, 1988).
P.E. Gill and W. Murray, “Conjugate-gradient methods for large-scale nonlinear optimization,” Technical Report SOL 79-15, Department of Operations Research, Stanford University (Stanford, CA, 1979).
P.E. Gill, W. Murray and M.H. Wright, Practical Optimization (Academic Press, London, 1981).
A. Griewank, “The global convergence of partitioned BFGS on semi-smooth problems with convex decompositions,” ANL/MCS-TM-105, Mathematics and Computer Science Division, Argonne National Laboratory (Argonne, IL, 1987).
A. Griewank and Ph.L. Toint, “Partitioned variable metric updates for large structured optimization problems,” Numerische Mathematik 39 (1982a) 119–137.
A. Griewank and Ph.L. Toint, “Local convergence analysis of partitioned quasi-Newton updates,” Numerische Mathematik 39 (1982b) 429–448.
A. Griewank and Ph.L. Toint, “Numerical experiments with partially separable optimization problems,” in: D.F. Griffiths, ed., Numerical Analysis: Proceedings Dundee 1983, Lecture Notes in Mathematics, Vol. 1066 (Springer, Berlin, 1984) pp. 203–220.
D.C. Liu and J. Nocedal, “Test results of two limited memory methods for large scale optimization,” Technical Report NAM 04, Department of Electrical Engineering and Computer Science, Northwestern University (Evanston, IL, 1988).
J.J. Moré, B.S. Garbow and K.E. Hillstrom, “Testing unconstrained optimization software,” ACM Transactions on Mathematical Software 7 (1981) 17–41.
S.G. Nash, “Preconditioning of truncated-Newton methods,” SIAM Journal on Scientific and Statistical Computing 6 (1985) 599–616.
L. Nazareth, “A relationship between the BFGS and conjugate gradient algorithms and its implications for new algorithms,” SIAM Journal on Numerical Analysis 16 (1979) 794–800.
J. Nocedal, “Updating quasi-Newton matrices with limited storage,” Mathematics of Computation 35 (1980) 773–782.
D.P. O'Leary, “A discrete Newton algorithm for minimizing a function of many variables,” Mathematical Programming 23 (1982) 20–33.
J.D. Pearson, “Variable metric methods of minimization,” Computer Journal 12 (1969) 171–178.
J.M. Perry, “A class of conjugate gradient algorithms with a two-step variable-metric memory,” Discussion Paper 269, Center for Mathematical Studies in Economics and Management Science, Northwestern University (Evanston, IL, 1977).
M.J.D. Powell, “Some global convergence properties of a variable metric algorithm for minimization without exact line search,” in: R.W. Cottle and C.E. Lemke, eds., Nonlinear Programming, SIAM-AMS Proceedings IX (SIAM, Philadelphia, PA, 1976).
M.J.D. Powell, “Restart procedures for the conjugate gradient method,” Mathematical Programming 12 (1977) 241–254.
D.F. Shanno, “On the convergence of a new conjugate gradient algorithm,” SIAM Journal on Numerical Analysis 15 (1978a) 1247–1257.
D.F. Shanno, “Conjugate gradient methods with inexact searches,” Mathematics of Operations Research 3 (1978b) 244–256.
D.F. Shanno and K.H. Phua, “Matrix conditioning and nonlinear optimization,” Mathematical Programming 14 (1978) 149–160.
D.F. Shanno and K.H. Phua, “Remark on algorithm 500: minimization of unconstrained multivariate functions,” ACM Transactions on Mathematical Software 6 (1980) 618–622.
T. Steihaug, “The conjugate gradient method and trust regions in large scale optimization,” SIAM Journal on Numerical Analysis 20 (1983) 626–637.
Ph.L. Toint, “Some numerical results using a sparse matrix updating formula in unconstrained optimization,” Mathematics of Computation 32 (1978) 839–851.
Ph.L. Toint, “Towards an efficient sparsity exploiting Newton method for minimization,” in: I.S. Duff, ed., Sparse Matrices and their Uses (Academic Press, New York, 1981) pp. 57–87.
Ph.L. Toint, “Test problems for partially separable optimization and results for the routine PSPMIN,” Report Nr 83/4, Department of Mathematics, Facultés Universitaires de Namur (Namur, 1983a).
Ph.L. Toint, “VE08AD, a routine for partially separable optimization with bounded variables,” Harwell Subroutine Library, A.E.R.E. (UK, 1983b).
Ph.L. Toint, “A view of nonlinear optimization in a large number of variables,” Report Nr 86/16, Department of Mathematics, Facultés Universitaires de Namur (Namur, 1986).
Additional information
This work was supported by the Applied Mathematical Sciences subprogram of the Office of Energy Research, U.S. Department of Energy, under contract DE-FG02-87ER25047, and by National Science Foundation Grant No. DCR-86-02071.
Cite this article
Liu, D.C., Nocedal, J. On the limited memory BFGS method for large scale optimization. Mathematical Programming 45, 503–528 (1989). https://doi.org/10.1007/BF01589116
DOI: https://doi.org/10.1007/BF01589116