Large-scale quasi-Newton trust-region methods with low-dimensional linear equality constraints

Abstract

We propose two limited-memory BFGS (L-BFGS) trust-region methods for large-scale optimization with linear equality constraints. The methods are intended for problems where the number of equality constraints is small. By exploiting the structure of the quasi-Newton compact representation, both proposed methods solve the trust-region subproblems nearly exactly, even for large problems. We derive global convergence results for the proposed algorithms and compare their numerical performance on a variety of large-scale problems.

Author information

Corresponding author

Correspondence to Johannes J. Brust.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. R. Marcia’s research is partially supported by NSF Grant IIS 1741490. C. Petra also acknowledges support from the LDRD Program of Lawrence Livermore National Laboratory under Projects 16-ERD-025 and 17-SI-005.

J. J. Brust was formerly at University of California Merced, Merced, CA.

Appendix A

Notation

Section 2: Background

\(\mathbf{s}_{k-1} = \mathbf{x}_{k} - \mathbf{x}_{k-1}\)
\(\mathbf{S}_k = [\, \mathbf{s}_{k-l} \ \cdots \ \mathbf{s}_{k-1} \,]\)
\(\mathbf{y}_{k-1} = \nabla f(\mathbf{x}_{k}) - \nabla f(\mathbf{x}_{k-1})\)
\(\mathbf{Y}_k = [\, \mathbf{y}_{k-l} \ \cdots \ \mathbf{y}_{k-1} \,]\)
\(\mathbf{S}_k^T \mathbf{Y}_k = \mathbf{L}_k + \mathbf{T}_k\)
\(\mathbf{D}_k = \text{diag}(\mathbf{S}_k^T \mathbf{Y}_k)\)
\(\mathbf{B}^{(k)}_0 = \gamma_k \mathbf{I}_n\)
\(\mathbf{H}_k = \mathbf{B}_k^{-1}\)
\(\gamma_k = (\mathbf{y}_{k-1}^T \mathbf{y}_{k-1}) / (\mathbf{y}_{k-1}^T \mathbf{s}_{k-1})\)
\(\delta_k = 1/\gamma_k\)
\(\mathbf{B}_k = \gamma_k \mathbf{I}_n + \widehat{\boldsymbol{\Psi}}_k \widehat{\boldsymbol{\Xi}}_k \widehat{\boldsymbol{\Psi}}_k^T\)
\(\widehat{\boldsymbol{\Psi}}_k = [\, \mathbf{S}_k \ \ \mathbf{Y}_k \,]\)
\(\mathbf{H}_k = \delta_k \mathbf{I}_n + \widehat{\boldsymbol{\Psi}}_k \widehat{\mathbf{M}}_k \widehat{\boldsymbol{\Psi}}_k^T\)
\(\widehat{\boldsymbol{\Xi}}_k = \gamma_k \left[ \begin{array}{cc} -\mathbf{S}_k^T \mathbf{S}_k & -\mathbf{L}_k \\ -\mathbf{L}_k^T & \gamma_k \mathbf{D}_k \end{array} \right]^{-1}\)
\(\widehat{\mathbf{M}}_k = -(\gamma_k^{2} \widehat{\boldsymbol{\Xi}}_k^{-1} + \gamma_k \widehat{\boldsymbol{\Psi}}_k^T \widehat{\boldsymbol{\Psi}}_k)^{-1}\)
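The Section 2 quantities can be checked numerically. The following is a minimal NumPy sketch, not the authors' implementation: it assumes that \(\mathbf{L}_k\) is the strictly lower triangular part of \(\mathbf{S}_k^T\mathbf{Y}_k\) (as in the standard Byrd-Nocedal-Schnabel compact representation), and the names `compact_lbfgs` and `bfgs_recursive` as well as the random test data are hypothetical. It builds \(\mathbf{B}_k\) and \(\mathbf{H}_k\) densely only for the sake of comparison on a small problem.

```python
import numpy as np

def compact_lbfgs(S, Y):
    """Build gamma_k, Psi_hat, Xi_hat and M_hat from the stored pairs (columns of S, Y)."""
    s, y = S[:, -1], Y[:, -1]                 # most recent pair (s_{k-1}, y_{k-1})
    gamma = (y @ y) / (y @ s)
    SY = S.T @ Y
    L = np.tril(SY, k=-1)                     # assumption: strictly lower triangular part
    D = np.diag(np.diag(SY))                  # diag(S^T Y)
    Psi_hat = np.hstack([S, Y])               # n x 2l
    Xi_inv = np.block([[-S.T @ S, -L], [-L.T, gamma * D]]) / gamma
    Xi_hat = np.linalg.inv(Xi_inv)
    M_hat = -np.linalg.inv(gamma**2 * Xi_inv + gamma * (Psi_hat.T @ Psi_hat))
    return gamma, Psi_hat, Xi_hat, M_hat

def bfgs_recursive(S, Y, gamma):
    """Reference: apply the BFGS update pair by pair, starting from B_0 = gamma * I."""
    B = gamma * np.eye(S.shape[0])
    for s, y in zip(S.T, Y.T):
        Bs = B @ s
        B = B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)
    return B

# Hypothetical test data: pairs from a convex quadratic, so that s_i^T y_i > 0.
rng = np.random.default_rng(0)
n, l = 50, 4
R = rng.standard_normal((n, n))
G = R @ R.T / n + np.eye(n)                   # symmetric positive definite Hessian
S = rng.standard_normal((n, l))
Y = G @ S

gamma, Psi_hat, Xi_hat, M_hat = compact_lbfgs(S, Y)
B_compact = gamma * np.eye(n) + Psi_hat @ Xi_hat @ Psi_hat.T
H_compact = np.eye(n) / gamma + Psi_hat @ M_hat @ Psi_hat.T

print("compact vs. recursive B_k:", np.linalg.norm(B_compact - bfgs_recursive(S, Y, gamma)))
print("B_k H_k - I:", np.linalg.norm(B_compact @ H_compact - np.eye(n)))
```

In a large-scale setting one would keep only the thin matrix \(\widehat{\boldsymbol{\Psi}}_k\) and the small \(2l \times 2l\) matrices, and apply \(\mathbf{B}_k\) or \(\mathbf{H}_k\) through matrix-vector products.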

Section 3: Trust-Region Subproblem Solution without an Inequality Constraint

\(\mathbf{K} = \left[ \begin{array}{cc} \mathbf{B}_k & \mathbf{A}^T \\ \mathbf{A} & \mathbf{0} \end{array} \right]\)
\(\boldsymbol{\Omega}_k = (\mathbf{A} \mathbf{B}_k^{-1} \mathbf{A}^T)^{-1}\)
\(\boldsymbol{\Psi}_k = [\, \mathbf{A}^T \ \ \widehat{\boldsymbol{\Psi}}_k \,]\)
\(\mathbf{K}^{-1} = \left[ \begin{array}{cc} \mathbf{B}_k^{-1} - \mathbf{B}_k^{-1} \mathbf{A}^T \boldsymbol{\Omega}_k \mathbf{A} \mathbf{B}_k^{-1} & \mathbf{B}_k^{-1} \mathbf{A}^T \boldsymbol{\Omega}_k \\ (\mathbf{B}_k^{-1} \mathbf{A}^T \boldsymbol{\Omega}_k)^T & -\boldsymbol{\Omega}_k \end{array} \right]\)
\(\mathbf{V}_k = \mathbf{B}_k^{-1} - \mathbf{B}_k^{-1} \mathbf{A}^T \boldsymbol{\Omega}_k \mathbf{A} \mathbf{B}_k^{-1}\)
\(\mathbf{V}_k = \delta_k \mathbf{I}_n + \boldsymbol{\Psi}_k \mathbf{M}_k \boldsymbol{\Psi}_k^T\)
\(\mathbf{W}_k = \mathbf{B}_k^{-1} \mathbf{A}^T \boldsymbol{\Omega}_k\)
\(\mathbf{M}_k = \left[ \begin{array}{cc} -\delta_k^2 \boldsymbol{\Omega}_k & -\delta_k \boldsymbol{\Omega}_k \mathbf{C}_k \\ -\delta_k \mathbf{C}_k^T \boldsymbol{\Omega}_k & \widehat{\mathbf{M}}_k - \mathbf{C}_k^T \boldsymbol{\Omega}_k \mathbf{C}_k \end{array} \right]\)
\(\mathbf{C}_k = \mathbf{A} \widehat{\boldsymbol{\Psi}}_k \widehat{\mathbf{M}}_k\)
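As a sanity check of the Section 3 notation, the sketch below forms \(\mathbf{M}_k\) and \(\boldsymbol{\Psi}_k\) and computes the step \(\mathbf{s} = -\mathbf{V}_k \mathbf{g}\) using only thin-matrix products, then compares it with a dense solve of the KKT system \(\mathbf{K}\,[\mathbf{s};\, \boldsymbol{\lambda}] = [-\mathbf{g};\, \mathbf{0}]\). It is an illustration under assumptions, not the paper's code: the function names, the gradient `g`, and the random data are hypothetical, and \(\mathbf{L}_k\) is again taken to be the strictly lower triangular part of \(\mathbf{S}_k^T\mathbf{Y}_k\).

```python
import numpy as np

def compact_lbfgs(S, Y):
    """gamma_k, Psi_hat = [S Y], Xi_hat and M_hat, following the Section 2 notation."""
    s, y = S[:, -1], Y[:, -1]
    gamma = (y @ y) / (y @ s)
    SY = S.T @ Y
    L, D = np.tril(SY, k=-1), np.diag(np.diag(SY))   # assumption: L strictly lower
    Psi_hat = np.hstack([S, Y])
    Xi_inv = np.block([[-S.T @ S, -L], [-L.T, gamma * D]]) / gamma
    M_hat = -np.linalg.inv(gamma**2 * Xi_inv + gamma * (Psi_hat.T @ Psi_hat))
    return gamma, Psi_hat, np.linalg.inv(Xi_inv), M_hat

def projected_inverse(A, gamma, Psi_hat, M_hat):
    """Return delta_k, Psi_k = [A^T Psi_hat] and M_k with V_k = delta I + Psi M Psi^T."""
    delta = 1.0 / gamma
    H_At = delta * A.T + Psi_hat @ (M_hat @ (Psi_hat.T @ A.T))   # B_k^{-1} A^T
    Omega = np.linalg.inv(A @ H_At)                               # m x m, m small
    C = A @ Psi_hat @ M_hat
    M = np.block([[-delta**2 * Omega, -delta * Omega @ C],
                  [-delta * C.T @ Omega, M_hat - C.T @ Omega @ C]])
    return delta, np.hstack([A.T, Psi_hat]), M

def equality_step(g, delta, Psi, M):
    """s = -V_k g, using only products with the thin matrix Psi_k."""
    return -(delta * g + Psi @ (M @ (Psi.T @ g)))

# Hypothetical test data.
rng = np.random.default_rng(1)
n, l, m = 50, 4, 3
R = rng.standard_normal((n, n))
G = R @ R.T / n + np.eye(n)
S = rng.standard_normal((n, l))
Y = G @ S
A = rng.standard_normal((m, n))
g = rng.standard_normal(n)

gamma, Psi_hat, Xi_hat, M_hat = compact_lbfgs(S, Y)
delta, Psi, M = projected_inverse(A, gamma, Psi_hat, M_hat)
s = equality_step(g, delta, Psi, M)

# Dense reference: solve the KKT system directly (only sensible for small n).
B = gamma * np.eye(n) + Psi_hat @ Xi_hat @ Psi_hat.T
K = np.block([[B, A.T], [A, np.zeros((m, m))]])
s_ref = np.linalg.solve(K, np.concatenate([-g, np.zeros(m)]))[:n]

print("difference to KKT solve:", np.linalg.norm(s - s_ref))
print("constraint residual ||A s||:", np.linalg.norm(A @ s))
```

The only matrices inverted in `projected_inverse` have dimension \(m\) or \(2l\), which is what makes the formula attractive when the number of constraints is small.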

Section 4: Trust-Region Subproblem Solution with an \(\ell _2\)-Norm Inequality Constraint

\(\mathbf{H}_k(\sigma) = (\mathbf{B}_k + \sigma \mathbf{I})^{-1}\)
\(\mathbf{H}_k = \mathbf{H}_k(0)\)
\(\boldsymbol{\Phi}_k(\sigma) = \mathbf{I}_n - \mathbf{A}^T \boldsymbol{\Omega}_k(\sigma) \mathbf{A} \mathbf{H}_k(\sigma)\)
\(\boldsymbol{\Phi}_k = \boldsymbol{\Phi}_k(0)\)
\(\mathbf{H}_k(\sigma) = \frac{1}{\gamma_k + \sigma} \mathbf{I}_n + \widehat{\boldsymbol{\Psi}}_k \widehat{\mathbf{M}}_k(\sigma) \widehat{\boldsymbol{\Psi}}_k^T\)
\(\boldsymbol{\Omega}_k(\sigma) = (\mathbf{A} \mathbf{H}_k(\sigma) \mathbf{A}^T)^{-1}\)
\(\widehat{\mathbf{M}}_k(\sigma) = -\big( (\gamma_k + \sigma)^2 \widehat{\boldsymbol{\Xi}}_k^{-1} + (\gamma_k + \sigma) \widehat{\boldsymbol{\Psi}}_k^T \widehat{\boldsymbol{\Psi}}_k \big)^{-1}\)
\(\mathbf{V}_k(\sigma) = \mathbf{H}_k(\sigma) - \mathbf{H}_k(\sigma) \mathbf{A}^T \boldsymbol{\Omega}_k(\sigma) \mathbf{A} \mathbf{H}_k(\sigma)\)
\(\mathbf{V}_k(\sigma) = \mathbf{H}_k(\sigma) \boldsymbol{\Phi}_k(\sigma)\)
\(\mathbf{s}(\sigma) = -\mathbf{H}_k(\sigma) \boldsymbol{\Phi}_k(\sigma) \mathbf{g}_k\)
\(\mathbf{s}'(\sigma) = -\mathbf{H}_k(\sigma) \boldsymbol{\Phi}_k(\sigma) \mathbf{s}(\sigma)\)
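The pair \(\mathbf{s}(\sigma)\), \(\mathbf{s}'(\sigma)\) is what a one-dimensional Newton iteration on the scalar \(\sigma\) needs. The sketch below evaluates both from the Section 4 formulas and then, as an assumption of this example rather than a statement of the paper's algorithm, applies Newton's method to the commonly used secular function \(\phi(\sigma) = 1/\Vert \mathbf{s}(\sigma)\Vert_2 - 1/\Delta\). The trust-region radius `radius`, the function names, and the test data are hypothetical.

```python
import numpy as np

def compact_lbfgs(S, Y):
    """gamma_k, Psi_hat = [S Y] and Xi_hat^{-1}, following the Section 2 notation."""
    s, y = S[:, -1], Y[:, -1]
    gamma = (y @ y) / (y @ s)
    SY = S.T @ Y
    L, D = np.tril(SY, k=-1), np.diag(np.diag(SY))   # assumption: L strictly lower
    Xi_inv = np.block([[-S.T @ S, -L], [-L.T, gamma * D]]) / gamma
    return gamma, np.hstack([S, Y]), Xi_inv

def step_and_derivative(sigma, g, A, gamma, Psi_hat, Xi_inv):
    """Return s(sigma) = -H Phi g and s'(sigma) = -H Phi s(sigma)."""
    tau = gamma + sigma
    M_sig = -np.linalg.inv(tau**2 * Xi_inv + tau * (Psi_hat.T @ Psi_hat))

    def H(v):                                  # H_k(sigma) applied to a vector or matrix
        return v / tau + Psi_hat @ (M_sig @ (Psi_hat.T @ v))

    Omega = np.linalg.inv(A @ H(A.T))          # Omega_k(sigma), m x m

    def H_Phi(v):                              # H (v - A^T Omega A H v)
        return H(v - A.T @ (Omega @ (A @ H(v))))

    s = -H_Phi(g)
    return s, -H_Phi(s)

def solve_l2_subproblem(g, A, gamma, Psi_hat, Xi_inv, radius, tol=1e-10, max_iter=50):
    """Newton iteration on phi(sigma) = 1/||s(sigma)|| - 1/radius, started at sigma = 0."""
    sigma = 0.0
    s, ds = step_and_derivative(sigma, g, A, gamma, Psi_hat, Xi_inv)
    if np.linalg.norm(s) <= radius:            # unconstrained step already fits
        return s, sigma
    for _ in range(max_iter):
        norm_s = np.linalg.norm(s)
        phi = 1.0 / norm_s - 1.0 / radius
        if abs(phi) * radius <= tol:
            break
        dphi = -(s @ ds) / norm_s**3           # derivative of 1/||s(sigma)||
        sigma -= phi / dphi
        s, ds = step_and_derivative(sigma, g, A, gamma, Psi_hat, Xi_inv)
    return s, sigma

# Hypothetical test data.
rng = np.random.default_rng(2)
n, l, m = 40, 3, 2
R = rng.standard_normal((n, n))
G = R @ R.T / n + np.eye(n)
S = rng.standard_normal((n, l))
Y = G @ S
A = rng.standard_normal((m, n))
g = rng.standard_normal(n)

gamma, Psi_hat, Xi_inv = compact_lbfgs(S, Y)
s0, _ = step_and_derivative(0.0, g, A, gamma, Psi_hat, Xi_inv)
radius = 0.5 * np.linalg.norm(s0)              # force the norm constraint to be active
s_star, sigma_star = solve_l2_subproblem(g, A, gamma, Psi_hat, Xi_inv, radius)
print("sigma*:", sigma_star, " ||s*|| / radius:", np.linalg.norm(s_star) / radius)
```

Each evaluation of `step_and_derivative` inverts only a \(2l \times 2l\) and an \(m \times m\) matrix, so the cost per Newton step stays linear in \(n\).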

Section 5: Trust-Region Subproblem Solution with a Shape-Changing Norm Inequality Constraint

\(\mathbf{U}_k = -\boldsymbol{\Psi}_k \mathbf{M}_k \boldsymbol{\Psi}_k^T\)
\(\mathbf{A}^T = \mathbf{Q}_1 \mathbf{R}_1\)
\(\mathbf{Q}_1 \mathbf{Q}_1^T = \mathbf{A}^T (\mathbf{A} \mathbf{A}^T)^{-1} \mathbf{A}\)
\(\mathbf{P} = \mathbf{I}_n - \mathbf{A}^T (\mathbf{A} \mathbf{A}^T)^{-1} \mathbf{A}\)
\(\mathbf{P} \widehat{\boldsymbol{\Psi}}_k = \widehat{\mathbf{Q}}_2 \widehat{\mathbf{R}}_2\)
\(\widehat{\mathbf{V}}_2 \widehat{\boldsymbol{\Lambda}}_k \widehat{\mathbf{V}}_2^T = -\widehat{\mathbf{R}}_2 (\widehat{\mathbf{M}}_k - \mathbf{C}_k^T \boldsymbol{\Omega}_k \mathbf{C}_k) \widehat{\mathbf{R}}_2^T\)
\(\mathbf{Q}_2 = \widehat{\mathbf{Q}}_2 \widehat{\mathbf{V}}_2\)
\(\mathbf{Q} = [\, \mathbf{Q}_1 \ \ \mathbf{Q}_2 \ \ \mathbf{Q}_3 \,]\)
\(\mathbf{Q}_{\parallel} = [\, \mathbf{Q}_1 \ \ \mathbf{Q}_2 \,]\)
\(\mathbf{Q}_{\perp} = \mathbf{Q}_3\)
\(\mathbf{z} = \left[ \begin{array}{c} \mathbf{z}_1 \\ \mathbf{z}_2 \\ \mathbf{z}_3 \end{array} \right]\)
\(\mathbf{s} = \mathbf{Q} \mathbf{z}\)
\(\mathbf{z}_{\parallel} = \mathbf{z}_2 = \mathbf{Q}_2^T \mathbf{s}\)
\(\mathbf{z}_{\perp} = \mathbf{z}_3 = \mathbf{Q}_3^T \mathbf{s}\)
\(\mathbf{g}_{\parallel} = \mathbf{Q}_2^T \mathbf{g}_k\)
\(\mathbf{g}_{\perp} = \mathbf{Q}_{\perp}^T \mathbf{g}_k\)
\(\mathbf{V}_k = \mathbf{Q} \boldsymbol{\Lambda} \mathbf{Q}^T = [\, \mathbf{Q}_1 \ \ \mathbf{Q}_2 \ \ \mathbf{Q}_3 \,] \left[ \begin{array}{ccc} \mathbf{0} & & \\ & \delta_k \mathbf{I} - \widehat{\boldsymbol{\Lambda}}_k & \\ & & \delta_k \mathbf{I} \end{array} \right] \left[ \begin{array}{c} \mathbf{Q}_1^T \\ \mathbf{Q}_2^T \\ \mathbf{Q}_3^T \end{array} \right]\)
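The block structure of \(\mathbf{V}_k = \mathbf{Q} \boldsymbol{\Lambda} \mathbf{Q}^T\) can be verified on a small example without ever forming \(\mathbf{Q}_3\). The sketch below is an illustration under stated assumptions: \(\widehat{\boldsymbol{\Lambda}}_k\) is taken to hold the eigenvalues of the projected matrix \(\mathbf{U}_k = -\boldsymbol{\Psi}_k \mathbf{M}_k \boldsymbol{\Psi}_k^T\), that is, of \(-\widehat{\mathbf{R}}_2 (\widehat{\mathbf{M}}_k - \mathbf{C}_k^T \boldsymbol{\Omega}_k \mathbf{C}_k) \widehat{\mathbf{R}}_2^T\), which is the sign convention under which the middle block equals \(\delta_k \mathbf{I} - \widehat{\boldsymbol{\Lambda}}_k\). It also assumes that \(\mathbf{P}\widehat{\boldsymbol{\Psi}}_k\) has full column rank. All function names and data are hypothetical.

```python
import numpy as np

def compact_lbfgs(S, Y):
    """gamma_k, Psi_hat = [S Y] and M_hat, following the Section 2 notation."""
    s, y = S[:, -1], Y[:, -1]
    gamma = (y @ y) / (y @ s)
    SY = S.T @ Y
    L, D = np.tril(SY, k=-1), np.diag(np.diag(SY))   # assumption: L strictly lower
    Psi_hat = np.hstack([S, Y])
    Xi_inv = np.block([[-S.T @ S, -L], [-L.T, gamma * D]]) / gamma
    M_hat = -np.linalg.inv(gamma**2 * Xi_inv + gamma * (Psi_hat.T @ Psi_hat))
    return gamma, Psi_hat, M_hat

def partial_eigendecomposition(A, gamma, Psi_hat, M_hat):
    """Return delta_k, Q1, Q2 and the eigenvalues Lambda_hat (sign convention: see lead-in)."""
    delta = 1.0 / gamma
    Q1, _ = np.linalg.qr(A.T)                        # thin QR: A^T = Q1 R1
    H_At = delta * A.T + Psi_hat @ (M_hat @ (Psi_hat.T @ A.T))
    Omega = np.linalg.inv(A @ H_At)
    C = A @ Psi_hat @ M_hat
    P_Psi = Psi_hat - Q1 @ (Q1.T @ Psi_hat)          # P Psi_hat, with P = I - Q1 Q1^T
    Q2_hat, R2_hat = np.linalg.qr(P_Psi)             # thin QR: P Psi_hat = Q2_hat R2_hat
    U_small = -R2_hat @ (M_hat - C.T @ Omega @ C) @ R2_hat.T
    lam_hat, V2_hat = np.linalg.eigh(0.5 * (U_small + U_small.T))  # symmetrize first
    return delta, Q1, Q2_hat @ V2_hat, lam_hat

# Hypothetical test data.
rng = np.random.default_rng(3)
n, l, m = 30, 3, 3
R = rng.standard_normal((n, n))
G = R @ R.T / n + np.eye(n)
S = rng.standard_normal((n, l))
Y = G @ S
A = rng.standard_normal((m, n))

gamma, Psi_hat, M_hat = compact_lbfgs(S, Y)
delta, Q1, Q2, lam_hat = partial_eigendecomposition(A, gamma, Psi_hat, M_hat)

# Dense V_k for comparison (only sensible for small n).
H = delta * np.eye(n) + Psi_hat @ M_hat @ Psi_hat.T
V = H - H @ A.T @ np.linalg.inv(A @ H @ A.T) @ A @ H

# A vector orthogonal to range([Q1 Q2]) plays the role of a column of Q3.
q = rng.standard_normal(n)
q -= Q1 @ (Q1.T @ q)
q -= Q2 @ (Q2.T @ q)

print("||V Q1||:", np.linalg.norm(V @ Q1))
print("||V Q2 - Q2 (delta I - Lambda_hat)||:", np.linalg.norm(V @ Q2 - Q2 * (delta - lam_hat)))
print("||V q - delta q||:", np.linalg.norm(V @ q - delta * q))
```

Because only \(\mathbf{Q}_1\) (\(n \times m\)) and \(\mathbf{Q}_2\) (\(n \times 2l\)) are formed, the remaining eigenspace with eigenvalue \(\delta_k\) is handled implicitly through projections, as in the computation of `q` above.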

Cite this article

Brust, J.J., Marcia, R.F. & Petra, C.G. Large-scale quasi-Newton trust-region methods with low-dimensional linear equality constraints. Comput Optim Appl 74, 669–701 (2019). https://doi.org/10.1007/s10589-019-00127-4

Keywords

  • Linear equality constraints
  • Quasi-Newton
  • L-BFGS
  • Trust-region algorithm
  • Compact representation
  • Eigendecomposition
  • Shape-changing norm