Mathematical Programming

Volume 45, Issue 1–3, pp 503–528

On the limited memory BFGS method for large scale optimization

  • Dong C. Liu
  • Jorge Nocedal

Abstract

We study the numerical performance of a limited memory quasi-Newton method for large scale optimization, which we call the L-BFGS method. We compare its performance with that of the method developed by Buckley and LeNir (1985), which combines cycles of BFGS steps and conjugate direction steps. Our numerical tests indicate that the L-BFGS method is faster than the method of Buckley and LeNir, and is better able to use additional storage to accelerate convergence. We show that the L-BFGS method can be greatly accelerated by means of a simple scaling. We then compare the L-BFGS method with the partitioned quasi-Newton method of Griewank and Toint (1982a). The results show that, for some problems, the partitioned quasi-Newton method is clearly superior to the L-BFGS method. However, we find that for other problems the L-BFGS method is very competitive due to its low iteration cost. We also study the convergence properties of the L-BFGS method, and prove global convergence on uniformly convex problems.
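The limited memory update studied here never forms the inverse Hessian approximation H_k explicitly. As a minimal illustration (our own NumPy sketch, not the paper's Fortran implementation), the following function computes the L-BFGS search direction by the two-loop recursion of Nocedal (1980), taking the initial matrix as gamma_k * I with gamma_k = s'y / y'y for the most recent correction pair, which is the kind of simple scaling the abstract refers to. All identifiers below are ours.

import numpy as np

def lbfgs_direction(grad, s_list, y_list):
    """Return the search direction d = -H_k @ grad via the two-loop
    recursion, where s_list and y_list hold the m most recent pairs
    s_i = x_{i+1} - x_i and y_i = g_{i+1} - g_i, ordered oldest first."""
    q = grad.copy()
    rhos = [1.0 / np.dot(y, s) for s, y in zip(s_list, y_list)]
    alphas = []  # filled newest pair first
    # First loop: run from the newest pair back to the oldest.
    for s, y, rho in zip(reversed(s_list), reversed(y_list), reversed(rhos)):
        alpha = rho * np.dot(s, q)
        alphas.append(alpha)
        q = q - alpha * y
    # Scale the initial matrix by gamma = s'y / y'y (newest pair);
    # on the first iteration this reduces to steepest descent.
    if s_list:
        gamma = np.dot(s_list[-1], y_list[-1]) / np.dot(y_list[-1], y_list[-1])
    else:
        gamma = 1.0
    r = gamma * q
    # Second loop: run from the oldest pair forward to the newest.
    for s, y, rho, alpha in zip(s_list, y_list, rhos, reversed(alphas)):
        beta = rho * np.dot(y, r)
        r = r + (alpha - beta) * s
    return -r

In a complete method this direction feeds a line search satisfying the Wolfe conditions, and once more than m pairs are stored the oldest pair is discarded; this is what bounds both the storage and the cost per iteration.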

Key words

Large scale nonlinear optimization, limited memory methods, partitioned quasi-Newton method, conjugate gradient method


References

  1. E.M.L. Beale, "Algorithms for very large nonlinear optimization problems," in: M.J.D. Powell, ed., Nonlinear Optimization 1981 (Academic Press, London, 1981) pp. 281–292.
  2. A. Buckley, "A combined conjugate gradient quasi-Newton minimization algorithm," Mathematical Programming 15 (1978) 200–210.
  3. A. Buckley, "Update to TOMS Algorithm 630," Rapports Techniques No. 91, Institut National de Recherche en Informatique et en Automatique, Domaine Voluceau, Rocquencourt, B.P. 105 (Le Chesnay, 1987).
  4. A. Buckley and A. LeNir, "QN-like variable storage conjugate gradients," Mathematical Programming 27 (1983) 155–175.
  5. A. Buckley and A. LeNir, "BBVSCG—A variable storage algorithm for function minimization," ACM Transactions on Mathematical Software 11 (1985) 103–119.
  6. R.H. Byrd and J. Nocedal, "A tool for the analysis of quasi-Newton methods with application to unconstrained minimization," SIAM Journal on Numerical Analysis 26 (1989) 727–739.
  7. J.E. Dennis Jr. and R.B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations (Prentice-Hall, Englewood Cliffs, NJ, 1983).
  8. J.E. Dennis Jr. and R.B. Schnabel, "A view of unconstrained optimization," in: G.L. Nemhauser, A.H.G. Rinnooy Kan and M.J. Todd, eds., Handbooks in Operations Research and Management Science, Vol. 1, Optimization (North-Holland, Amsterdam, 1989) pp. 1–72.
  9. R. Fletcher, Practical Methods of Optimization, Vol. 1, Unconstrained Optimization (Wiley, New York, 1980).
  10. J.C. Gilbert and C. Lemaréchal, "Some numerical experiments with variable storage quasi-Newton algorithms," IIASA Working Paper WP-88, A-2361 (Laxenburg, 1988).
  11. P.E. Gill and W. Murray, "Conjugate-gradient methods for large-scale nonlinear optimization," Technical Report SOL 79-15, Department of Operations Research, Stanford University (Stanford, CA, 1979).
  12. P.E. Gill, W. Murray and M.H. Wright, Practical Optimization (Academic Press, London, 1981).
  13. A. Griewank, "The global convergence of partitioned BFGS on semi-smooth problems with convex decompositions," Technical Memorandum ANL/MCS-TM-105, Mathematics and Computer Science Division, Argonne National Laboratory (Argonne, IL, 1987).
  14. A. Griewank and Ph.L. Toint, "Partitioned variable metric updates for large structured optimization problems," Numerische Mathematik 39 (1982a) 119–137.
  15. A. Griewank and Ph.L. Toint, "Local convergence analysis of partitioned quasi-Newton updates," Numerische Mathematik 39 (1982b) 429–448.
  16. A. Griewank and Ph.L. Toint, "Numerical experiments with partially separable optimization problems," in: D.F. Griffiths, ed., Numerical Analysis: Proceedings Dundee 1983, Lecture Notes in Mathematics, Vol. 1066 (Springer, Berlin, 1984) pp. 203–220.
  17. D.C. Liu and J. Nocedal, "Test results of two limited memory methods for large scale optimization," Technical Report NAM 04, Department of Electrical Engineering and Computer Science, Northwestern University (Evanston, IL, 1988).
  18. J.J. Moré, B.S. Garbow and K.E. Hillstrom, "Testing unconstrained optimization software," ACM Transactions on Mathematical Software 7 (1981) 17–41.
  19. S.G. Nash, "Preconditioning of truncated-Newton methods," SIAM Journal on Scientific and Statistical Computing 6 (1985) 599–616.
  20. L. Nazareth, "A relationship between the BFGS and conjugate gradient algorithms and its implications for new algorithms," SIAM Journal on Numerical Analysis 16 (1979) 794–800.
  21. J. Nocedal, "Updating quasi-Newton matrices with limited storage," Mathematics of Computation 35 (1980) 773–782.
  22. D.P. O'Leary, "A discrete Newton algorithm for minimizing a function of many variables," Mathematical Programming 23 (1982) 20–33.
  23. J.D. Pearson, "Variable metric methods of minimization," Computer Journal 12 (1969) 171–178.
  24. J.M. Perry, "A class of conjugate gradient algorithms with a two-step variable-metric memory," Discussion Paper 269, Center for Mathematical Studies in Economics and Management Science, Northwestern University (Evanston, IL, 1977).
  25. M.J.D. Powell, "Some global convergence properties of a variable metric algorithm for minimization without exact line search," in: R.W. Cottle and C.E. Lemke, eds., Nonlinear Programming, SIAM-AMS Proceedings IX (SIAM, Philadelphia, PA, 1976).
  26. M.J.D. Powell, "Restart procedures for the conjugate gradient method," Mathematical Programming 12 (1977) 241–254.
  27. D.F. Shanno, "On the convergence of a new conjugate gradient algorithm," SIAM Journal on Numerical Analysis 15 (1978a) 1247–1257.
  28. D.F. Shanno, "Conjugate gradient methods with inexact searches," Mathematics of Operations Research 3 (1978b) 244–256.
  29. D.F. Shanno and K.H. Phua, "Matrix conditioning and nonlinear optimization," Mathematical Programming 14 (1978) 149–160.
  30. D.F. Shanno and K.H. Phua, "Remark on Algorithm 500: minimization of unconstrained multivariate functions," ACM Transactions on Mathematical Software 6 (1980) 618–622.
  31. T. Steihaug, "The conjugate gradient method and trust regions in large scale optimization," SIAM Journal on Numerical Analysis 20 (1983) 626–637.
  32. Ph.L. Toint, "Some numerical results using a sparse matrix updating formula in unconstrained optimization," Mathematics of Computation 32 (1978) 839–851.
  33. Ph.L. Toint, "Towards an efficient sparsity exploiting Newton method for minimization," in: I.S. Duff, ed., Sparse Matrices and Their Uses (Academic Press, New York, 1981) pp. 57–87.
  34. Ph.L. Toint, "Test problems for partially separable optimization and results for the routine PSPMIN," Report Nr 83/4, Department of Mathematics, Facultés Universitaires de Namur (Namur, 1983a).
  35. Ph.L. Toint, "VE08AD, a routine for partially separable optimization with bounded variables," Harwell Subroutine Library, A.E.R.E. (UK, 1983b).
  36. Ph.L. Toint, "A view of nonlinear optimization in a large number of variables," Report Nr 86/16, Department of Mathematics, Facultés Universitaires de Namur (Namur, 1986).

Copyright information

© North-Holland 1989

Authors and Affiliations

  • Dong C. Liu¹
  • Jorge Nocedal¹

  1. Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, USA
