Local and Global Convergence

  • Kenneth Lange
Part of the Statistics and Computing book series (SCO)


Proving convergence of the various optimization algorithms is a delicate exercise. In general, it is helpful to consider local and global convergence patterns separately. The local convergence rate of an algorithm provides a useful benchmark for comparing it to other algorithms. On this basis, Newton’s method wins hands down. However, the tradeoffs are subtle. Besides the sheer number of iterations until convergence, the computational complexity and numerical stability of an algorithm are critically important. The MM algorithm is often the epitome of numerical stability and computational simplicity. Scoring lies somewhere between these two extremes. It tends to converge more quickly than the MM algorithm and to behave more stably than Newton’s method. Quasi-Newton methods also occupy this intermediate zone. Because the issues are complex, all of these algorithms survive and prosper in certain computational niches.
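The contrast between local convergence rates can be illustrated with a small sketch. It is not taken from the chapter; it assumes a hypothetical one-dimensional objective f(x) = log(1 + e^x) - p x with p = 0.3, whose second derivative is bounded above by 1/4. Newton's method uses the exact second derivative, while a simple MM algorithm replaces it with the bound 1/4 (a quadratic majorizer). The names `iterate`, `newton_step`, and `mm_step` are illustrative only.

```python
import math

P = 0.3  # hypothetical parameter of the toy objective f(x) = log(1 + e^x) - P*x


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def grad(x):
    # f'(x) = sigmoid(x) - P
    return sigmoid(x) - P


def hess(x):
    # f''(x) = sigmoid(x) * (1 - sigmoid(x)), which never exceeds 1/4
    s = sigmoid(x)
    return s * (1.0 - s)


def newton_step(x):
    # Newton's method: scale the gradient by the exact curvature
    return x - grad(x) / hess(x)


def mm_step(x):
    # MM step from the quadratic majorizer built on the bound f'' <= 1/4
    return x - grad(x) / 0.25


def iterate(update, x0=0.0, tol=1e-12, max_iter=1000):
    """Apply x <- update(x) until successive iterates differ by less than tol."""
    x = x0
    for n in range(1, max_iter + 1):
        x_new = update(x)
        if abs(x_new - x) < tol:
            return x_new, n
        x = x_new
    return x, max_iter


x_star = math.log(P / (1.0 - P))  # exact minimizer, where sigmoid(x) = P
x_newton, n_newton = iterate(newton_step)
x_mm, n_mm = iterate(mm_step)

print(f"exact minimum: {x_star:.12f}")
print(f"Newton: {x_newton:.12f} after {n_newton} iterations")
print(f"MM:     {x_mm:.12f} after {n_mm} iterations")
```

On this toy problem Newton's method needs fewer iterations because its error is roughly squared at each step, whereas the MM error shrinks by a roughly constant factor. The MM step is cheaper and decreases the objective at every iteration, which is the kind of tradeoff the abstract describes.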


Keywords: stationary point, line search, spectral radius, global convergence, gradient algorithm





Copyright information

© Springer New York 2010

Authors and Affiliations

  1. Departments of Biomathematics, Human Genetics, and Statistics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, USA
