
On the connection between the conjugate gradient method and quasi-Newton methods on quadratic problems

Published in Computational Optimization and Applications.

Abstract

It is well known that the conjugate gradient method and a quasi-Newton method, using any well-defined update matrix from the one-parameter Broyden family of updates, produce identical iterates on a quadratic problem with positive-definite Hessian. This equivalence does not hold for an arbitrary quasi-Newton method. We define precisely the conditions on the update matrix in the quasi-Newton method that give rise to this behavior. We show that the crucial facts are that the range of each update matrix lies in the last two dimensions of the Krylov subspaces defined by the conjugate gradient method, and that the quasi-Newton condition is satisfied. Within a framework based on a sufficient condition for obtaining mutually conjugate search directions, we show that the one-parameter Broyden family is complete. We derive a one-to-one correspondence between the Broyden parameter and the non-zero scaling by which the search direction of the corresponding quasi-Newton method differs from that of the conjugate gradient method. In addition, we show that the update matrices from the one-parameter Broyden family are almost always well-defined on a quadratic problem with positive-definite Hessian. The only exception occurs when the symmetric rank-one update is used and the unit steplength is taken in the same iteration. In this case it is the Broyden parameter that becomes undefined.
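
The central equivalence described above is easy to observe numerically. The following is a minimal sketch, not taken from the paper, that runs the conjugate gradient method and a BFGS quasi-Newton method (the Broyden member with \(\phi = 0\) in the Nocedal–Wright convention) with exact linesearch and \(B_0 = I\) on a random quadratic with positive-definite Hessian, and checks that the iterates coincide up to rounding error. The problem data and all function names are our own choices; the paper itself works at the level of the general Broyden family.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n))
H = A @ A.T + n * np.eye(n)    # positive-definite Hessian
b = rng.standard_normal(n)     # minimize f(x) = 0.5 x'Hx - b'x

def cg_iterates(x0, iters):
    """Conjugate gradient iterates on the quadratic f."""
    x, xs = x0.copy(), [x0.copy()]
    g = H @ x - b
    p = -g
    for _ in range(iters):
        alpha = -(g @ p) / (p @ H @ p)    # exact linesearch on a quadratic
        x = x + alpha * p
        g_new = H @ x - b
        p = -g_new + (g_new @ g_new) / (g @ g) * p   # Fletcher-Reeves beta
        g = g_new
        xs.append(x.copy())
    return xs

def bfgs_iterates(x0, iters):
    """Quasi-Newton iterates with the BFGS update and B_0 = I."""
    x, xs = x0.copy(), [x0.copy()]
    B = np.eye(n)
    g = H @ x - b
    for _ in range(iters):
        p = -np.linalg.solve(B, g)
        alpha = -(g @ p) / (p @ H @ p)    # exact linesearch again
        s = alpha * p
        x = x + s
        g_new = H @ x - b
        y = g_new - g
        # BFGS update of the Hessian approximation; y's = s'Hs > 0 here,
        # so the update is well-defined and B stays positive definite
        B = B - np.outer(B @ s, B @ s) / (s @ B @ s) + np.outer(y, y) / (y @ s)
        g = g_new
        xs.append(x.copy())
    return xs

x0 = np.zeros(n)
for xc, xq in zip(cg_iterates(x0, n), bfgs_iterates(x0, n)):
    print(np.linalg.norm(xc - xq))    # all differences at rounding level
```

Replacing the BFGS update with another well-defined member of the Broyden family changes the search direction only by the non-zero scaling discussed in the abstract, so the iterates remain the same.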


Notes

  1. It may happen, depending on the number of distinct eigenvalues of \(H\) and the orientation of \(p_0\), that the optimal solution is found after fewer than \(n\) iterations, see, e.g., [19, Chapter 6]. A small numerical illustration follows these notes.

  2. The choice \(B_k=H\) would give Newton’s method, whereas the choice \(B_k=I\) would give the steepest-descent method.

  3. “We have made both a simplification by which certain orthogonality conditions which are important to the rate of attaining the solution are preserved, and also an improvement in the criterion of convergence.” [5]

  4. A generalization of exact linesearch to general smooth functions; see [3].

  5. Any symmetric matrix can be expressed in this form.

  6. In the literature, much emphasis is placed on the requirement that updates satisfy (15). This condition alone is not sufficient for \(B_k\) to give conjugate directions.

  7. See the next paragraph and Sect. 3.3.

  8. For general functions we may have \(p_{k-1}^TB_{k-1}p_{k-1}=0\) with \(B_{k-1}p_{k-1} \ne 0\); in [7] the values of \(\phi_k\) that give rise to this situation are characterized and also termed degenerate values.

  9. An illustration that the conditions (i)–(iii) of Proposition 2 are indeed only sufficient.
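
The fact in note 1 can be observed directly: with exact arithmetic, the conjugate gradient method terminates in as many iterations as the Hessian has distinct eigenvalues. The sketch below, ours rather than the paper's, builds an \(8 \times 8\) Hessian with only two distinct eigenvalues and observes termination after two iterations.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))   # random orthogonal basis
H = Q @ np.diag([2.0] * 4 + [5.0] * 4) @ Q.T       # eigenvalues {2, 5} only
b = rng.standard_normal(n)

x = np.zeros(n)
g = H @ x - b
p = -g
for k in range(n):
    alpha = -(g @ p) / (p @ H @ p)                 # exact linesearch
    x = x + alpha * p
    g_new = H @ x - b
    if np.linalg.norm(g_new) < 1e-10:              # converged
        print(f"optimal after {k + 1} iterations (2 distinct eigenvalues)")
        break
    p = -g_new + (g_new @ g_new) / (g @ g) * p     # Fletcher-Reeves beta
    g = g_new
```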

References

  1. Broyden, C.G.: Quasi–Newton methods and their application to function minimisation. Math. Comp. 21, 368–381 (1967)

  2. Davidon, W.C.: Variable metric method for minimization. SIAM J. Optim. 1(1), 1–17 (1991)

  3. Dixon, L.C.W.: Quasi–Newton algorithms generate identical points. Math. Program. 2, 383–387 (1972)

  4. Fletcher, R.: Practical Methods of Optimization, 2nd edn. A Wiley-Interscience Publication. John Wiley & Sons Ltd., Chichester (1987)

  5. Fletcher, R., Powell, M.J.D.: A rapidly convergent descent method for minimization. Comput. J. 6(2), 163–168 (1963)

  6. Fletcher, R., Reeves, C.M.: Function minimization by conjugate gradients. Comput. J. 7, 149–154 (1964)

  7. Fletcher, R., Sinclair, J.W.: Degenerate values for Broyden methods. J. Optim. Theory Appl. 33(3), 311–324 (1981)

  8. Gill, P.E., Murray, W., Wright, M.H.: Practical Optimization. Academic Press Inc. Harcourt Brace Jovanovich, London (1981)

  9. Gutknecht, M.H.: A brief introduction to Krylov space methods for solving linear systems. In: Frontiers of Computational Science (Proceedings of the International Symposium on Frontiers of Computational Science, Nagoya University, Nagoya, Dec 12–13, 2005), pp. 53–62. Springer-Verlag, Berlin (2007)

  10. Hager, W.W., Zhang, H.: The limited memory conjugate gradient method. SIAM J. Optim. 23(4), 2150–2168 (2013)

  11. Hestenes, M.R., Stiefel, E.: Methods of conjugate gradients for solving linear systems. J. Res. Nat. Bur. Stand. 49, 409–436 (1952)

  12. Huang, H.Y.: Unified approach to quadratically convergent algorithms for function minimization. J. Optim. Theory Appl. 5, 405–423 (1970)

  13. Luenberger, D.G.: Linear and Nonlinear Programming, 2nd edn. Addison-Wesley, Boston, MA (1984)

  14. Mifflin, R.B., Nazareth, J.L.: The least prior deviation quasi-Newton update. Math. Program. Ser. A 65(3), 247–261 (1994)

  15. Nazareth, L.: A relationship between the BFGS and conjugate gradient algorithms and its implications for new algorithms. SIAM J. Numer. Anal. 16(5), 794–800 (1979)

  16. Nazareth, L.: An alternative variational principle for variable metric updating. Math. Program. 30(1), 99–104 (1984)

  17. Nocedal, J., Wright, S.J.: Numerical Optimization. Springer Series in Operations Research. Springer-Verlag, New York (1999)

  18. Powell, M.J.D.: Recent advances in unconstrained optimization. Math. Program. 1(1), 26–57 (1971)

  19. Saad, Y.: Iterative Methods for Sparse Linear Systems, 2nd edn. Society for Industrial and Applied Mathematics, Philadelphia, PA (2003)

  20. Shewchuk, J.R.: An introduction to the conjugate gradient method without the agonizing pain. Technical Report, Carnegie Mellon University, Pittsburgh, PA (1994)

Acknowledgments

We thank the referees and the associate editor for their insightful comments and suggestions, which significantly improved the presentation of this paper. Research supported by the Swedish Research Council (VR).

Author information

Correspondence to Anders Forsgren.


Cite this article

Forsgren, A., Odland, T. On the connection between the conjugate gradient method and quasi-Newton methods on quadratic problems. Comput Optim Appl 60, 377–392 (2015). https://doi.org/10.1007/s10589-014-9677-5
