Abstract
It is well known that the conjugate gradient method and a quasi-Newton method, using any well-defined update matrix from the one-parameter Broyden family of updates, produce identical iterates on a quadratic problem with positive-definite Hessian. This equivalence does not, however, hold for every quasi-Newton method. We give precise conditions on the update matrix in the quasi-Newton method under which this behavior arises. We show that the crucial facts are that the range of each update matrix lies in the last two dimensions of the Krylov subspaces defined by the conjugate gradient method and that the quasi-Newton condition is satisfied. Within a framework based on a sufficient condition for obtaining mutually conjugate search directions, we show that the one-parameter Broyden family is complete. We derive a one-to-one correspondence between the Broyden parameter and the non-zero factor by which the search direction of the corresponding quasi-Newton method is scaled relative to that of the conjugate gradient method. In addition, we show that the update matrices from the one-parameter Broyden family are almost always well defined on a quadratic problem with positive-definite Hessian. The only exception occurs when the symmetric rank-one update is used and the unit steplength is taken in the same iteration; in this case it is the Broyden parameter that becomes undefined.
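The equivalence stated in the abstract can be illustrated numerically: on a quadratic with positive-definite Hessian, BFGS (a member of the Broyden family) started from \(B_0 = I\) with exact linesearch reproduces the conjugate gradient iterates. The following is a minimal sketch, not taken from the paper, assuming NumPy; the function names are illustrative.

```python
import numpy as np

def cg_iterates(H, b, x0, tol=1e-12):
    """Conjugate gradient on f(x) = 0.5 x^T H x - b^T x; returns all iterates."""
    x = x0.copy()
    g = H @ x - b
    p = -g
    xs = [x.copy()]
    for _ in range(len(b)):
        if np.linalg.norm(g) < tol:
            break
        alpha = (g @ g) / (p @ H @ p)          # exact linesearch step for a quadratic
        x = x + alpha * p
        g_new = H @ x - b
        beta = (g_new @ g_new) / (g @ g)       # Fletcher-Reeves (equals HS here)
        p = -g_new + beta * p
        g = g_new
        xs.append(x.copy())
    return xs

def bfgs_iterates(H, b, x0, tol=1e-12):
    """BFGS quasi-Newton with B_0 = I and exact linesearch on the same quadratic."""
    n = len(b)
    x = x0.copy()
    B = np.eye(n)
    xs = [x.copy()]
    for _ in range(n):
        g = H @ x - b
        if np.linalg.norm(g) < tol:
            break
        p = -np.linalg.solve(B, g)             # quasi-Newton search direction
        alpha = -(g @ p) / (p @ H @ p)         # exact linesearch for a quadratic
        s = alpha * p
        y = H @ s                              # gradient difference on a quadratic
        # BFGS update; satisfies the quasi-Newton condition B_{k+1} s = y
        B = B - np.outer(B @ s, B @ s) / (s @ B @ s) + np.outer(y, y) / (y @ s)
        x = x + s
        xs.append(x.copy())
    return xs
```

Running both on the same random positive-definite \(H\) yields iterate sequences that agree to machine precision, which is the classical equivalence result the paper refines.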
Notes
It may happen, depending on the number of distinct eigenvalues of \(H\) and the orientation of \(p_0\), that the optimal solution is found after fewer than \(n\) iterations, see, e.g., [19, Chapter 6].
The choice \(B_k=H\) would give Newton’s method, whereas the choice \(B_k=I\) would give the steepest-descent method.
“We have made both a simplification by which certain orthogonality conditions which are important to the rate of attaining the solution are preserved, and also an improvement in the criterion of convergence.” [5]
A generalization of exact linesearch for general smooth functions, see [3].
Any symmetric matrix can be expressed in this form.
In the literature, much emphasis is placed on the requirement that updates satisfy (15). This condition alone is not sufficient for \(B_k\) to give conjugate directions.
See the next paragraph and Sect. 3.3.
For general functions we may have \(p_{k-1}^TB_{k-1}p_{k-1}=0\) and \(B_{k-1}p_{k-1} \ne 0\), and in [7] the values of \(\phi _k\) that give rise to this situation are characterized and also termed degenerate values.
An illustration that the conditions (i)–(iii) of Proposition 2 are indeed only sufficient.
References
Broyden, C.G.: Quasi–Newton methods and their application to function minimisation. Math. Comp. 21, 368–381 (1967)
Davidon, W.C.: Variable metric method for minimization. SIAM J. Optim. 1(1), 1–17 (1991)
Dixon, L.C.W.: Quasi–Newton algorithms generate identical points. Math. Program. 2, 383–387 (1972)
Fletcher, R.: Practical Methods of Optimization, 2nd edn. A Wiley-Interscience. John Wiley & Sons Ltd., Chichester (1987)
Fletcher, R., Powell, M.J.D.: A rapidly convergent descent method for minimization. Comput. J. 6(2), 163–168 (1963/1964)
Fletcher, R., Reeves, C.M.: Function minimization by conjugate gradients. Comput. J. 7, 149–154 (1964)
Fletcher, R., Sinclair, J.W.: Degenerate values for Broyden methods. J. Optim. Theory Appl. 33(3), 311–324 (1981)
Gill, P.E., Murray, W., Wright, M.H.: Practical Optimization. Academic Press Inc. Harcourt Brace Jovanovich, London (1981)
Gutknecht, M.H.: A brief introduction to Krylov space methods for solving linear systems. In: Frontiers of Computational Science (Proceedings of the International Symposium on Frontiers of Computational Science, Nagoya University, Nagoya, Dec 12–13, 2005), pp. 53–62. Springer-Verlag, Berlin (2007)
Hager, W.W., Zhang, H.: The limited memory conjugate gradient method. SIAM J. Optim. 23(4), 2150–2168 (2013)
Hestenes, M.R., Stiefel, E.: Methods of conjugate gradients for solving linear systems. J. Res. Nat. Bur. Stand. 49, 409–436 (1952)
Huang, H.Y.: Unified approach to quadratically convergent algorithms for function minimization. J. Optim. Theory Appl. 5, 405–423 (1970)
Luenberger, D.G.: Linear and Nonlinear Programming, 2nd edn. Addison-Wesley, Boston, MA (1984)
Mifflin, R.B., Nazareth, J.L.: The least prior deviation quasi-Newton update. Math. Program. Ser. A 65(3), 247–261 (1994)
Nazareth, L.: A relationship between the BFGS and conjugate gradient algorithms and its implications for new algorithms. SIAM J. Numer. Anal. 16(5), 794–800 (1979)
Nazareth, L.: An alternative variational principle for variable metric updating. Math. Program. 30(1), 99–104 (1984)
Nocedal, J., Wright, S.J.: Numerical Optimization. Springer Series in Operations Research. Springer-Verlag, New York (1999)
Powell, M.J.D.: Recent advances in unconstrained optimization. Math. Program. 1(1), 26–57 (1971)
Saad, Y.: Iterative Methods for Sparse Linear Systems, 2nd edn. Society for Industrial and Applied Mathematics, Philadelphia, PA (2003)
Shewchuk, J.R.: An introduction to the conjugate gradient method without the agonizing pain. Technical Report, Carnegie Mellon University, Pittsburgh, PA (1994)
Acknowledgments
We thank the referees and the associate editor for their insightful comments and suggestions which significantly improved the presentation of this paper. Research supported by the Swedish Research Council (VR).
Cite this article
Forsgren, A., Odland, T. On the connection between the conjugate gradient method and quasi-Newton methods on quadratic problems. Comput Optim Appl 60, 377–392 (2015). https://doi.org/10.1007/s10589-014-9677-5