Complexity of Proximal Augmented Lagrangian for Nonconvex Optimization with Nonlinear Equality Constraints

Abstract

We analyze worst-case complexity of a Proximal augmented Lagrangian (Proximal AL) framework for nonconvex optimization with nonlinear equality constraints. When an approximate first-order (second-order) optimal point is obtained in the subproblem, an \(\epsilon \) first-order (second-order) optimal point for the original problem can be guaranteed within \({\mathcal {O}}(1/ \epsilon ^{2 - \eta })\) outer iterations (where \(\eta \) is a user-defined parameter with \(\eta \in [0,2]\) for the first-order result and \(\eta \in [1,2]\) for the second-order result) when the proximal term coefficient \(\beta \) and penalty parameter \(\rho \) satisfy \(\beta = {\mathcal {O}}(\epsilon ^\eta )\) and \(\rho = \varOmega (1/\epsilon ^\eta )\), respectively. We also investigate the total iteration complexity and operation complexity when a Newton-conjugate-gradient algorithm is used to solve the subproblems. Finally, we discuss an adaptive scheme for determining a value of the parameter \(\rho \) that satisfies the requirements of the analysis.
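
As a concrete illustration of the framework analyzed here, the sketch below implements a generic proximal AL outer loop: each outer iteration approximately minimizes the augmented Lagrangian plus a proximal term \(\frac{\beta }{2}\Vert x - x_k \Vert ^2\), then applies the classical first-order multiplier update \(\lambda \leftarrow \lambda + \rho \, c(x)\). The gradient-descent subproblem solver, function names, and parameter values are our own illustrative assumptions, not the paper's method; the complexity results above are stated for a Newton-conjugate-gradient subproblem solver.

```python
import numpy as np

def proximal_al(grad_f, c, jac_c, x0, rho=10.0, beta=1e-2,
                outer_iters=50, inner_iters=2000, lr=2e-3):
    """Hedged sketch of a proximal AL loop for min f(x) s.t. c(x) = 0.

    Each outer iteration approximately solves
        min_x f(x) + lam^T c(x) + (rho/2)||c(x)||^2 + (beta/2)||x - x_k||^2
    (here by plain gradient descent, standing in for the Newton-CG solver
    analyzed in the paper), then updates the multiplier estimate.
    """
    x = np.asarray(x0, dtype=float)
    lam = np.zeros(c(x).shape)
    for _ in range(outer_iters):
        xk = x.copy()
        for _ in range(inner_iters):
            # gradient of the proximal augmented Lagrangian in x
            g = grad_f(x) + jac_c(x).T @ (lam + rho * c(x)) + beta * (x - xk)
            x = x - lr * g
        lam = lam + rho * c(x)  # classical first-order multiplier update
    return x, lam

# Toy example: min x1 + x2 s.t. x1^2 + x2^2 = 2; solution (-1, -1), lam* = 1/2.
grad_f = lambda x: np.array([1.0, 1.0])
c = lambda x: np.array([x[0]**2 + x[1]**2 - 2.0])
jac_c = lambda x: np.array([[2.0 * x[0], 2.0 * x[1]]])
x, lam = proximal_al(grad_f, c, jac_c, x0=[0.5, -2.0])
print(x, lam)  # expect approximately [-1, -1] and [0.5]
```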

Notes

  1. Circumstances under which the penalty parameter sequence of ALGENCAN is bounded are discussed in [1, Section 5].

References

  1. Andreani, R., Birgin, E.G., Martínez, J.M., Schuverdt, M.L.: On augmented Lagrangian methods with general lower-level constraints. SIAM J. Optim. 18(4), 1286–1309 (2008). https://doi.org/10.1137/060654797

  2. Andreani, R., Birgin, E.G., Martínez, J.M., Schuverdt, M.L.: Second-order negative-curvature methods for box-constrained and general constrained optimization. Comput. Optim. Appl. 45(2), 209–236 (2010). https://doi.org/10.1007/s10589-009-9240-y

  3. Andreani, R., Fazzio, N., Schuverdt, M., Secchin, L.: A sequential optimality condition related to the quasi-normality constraint qualification and its algorithmic consequences. SIAM J. Optim. 29(1), 743–766 (2019). https://doi.org/10.1137/17M1147330

  4. Andreani, R., Haeser, G., Ramos, A., Silva, P.J.S.: A second-order sequential optimality condition associated to the convergence of optimization algorithms. IMA J. Numer. Anal. 37(4), 1902–1929 (2017)

  5. Andreani, R., Martínez, J.M., Ramos, A., Silva, P.J.S.: A cone-continuity constraint qualification and algorithmic consequences. SIAM J. Optim. 26(1), 96–110 (2016). https://doi.org/10.1137/15M1008488

  6. Andreani, R., Secchin, L., Silva, P.: Convergence properties of a second order augmented Lagrangian method for mathematical programs with complementarity constraints. SIAM J. Optim. 28(3), 2574–2600 (2018). https://doi.org/10.1137/17M1125698

  7. Bertsekas, D.P.: Constrained Optimization and Lagrange Multiplier Methods. Academic Press, Cambridge (2014)

  8. Bian, W., Chen, X., Ye, Y.: Complexity analysis of interior point algorithms for non-Lipschitz and nonconvex minimization. Math. Program. 149(1), 301–327 (2015). https://doi.org/10.1007/s10107-014-0753-5

  9. Birgin, E.G., Floudas, C.A., Martínez, J.M.: Global minimization using an augmented Lagrangian method with variable lower-level constraints. Math. Program. 125(1), 139–162 (2010). https://doi.org/10.1007/s10107-009-0264-y

  10. Birgin, E.G., Gardenghi, J., Martínez, J.M., Santos, S.A., Toint, P.L.: Worst-case evaluation complexity for unconstrained nonlinear optimization using high-order regularized models. Math. Program. 163(1–2), 359–368 (2017)

  11. Birgin, E.G., Haeser, G., Ramos, A.: Augmented Lagrangians with constrained subproblems and convergence to second-order stationary points. Comput. Optim. Appl. 69(1), 51–75 (2018). https://doi.org/10.1007/s10589-017-9937-2

  12. Birgin, E.G., Martínez, J.M.: Complexity and performance of an augmented Lagrangian algorithm. Optim. Methods Softw. (2020). https://doi.org/10.1080/10556788.2020.1746962

  13. Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2011). https://doi.org/10.1561/2200000016

  14. Cartis, C., Gould, N., Toint, P.: On the evaluation complexity of composite function minimization with applications to nonconvex nonlinear programming. SIAM J. Optim. 21(4), 1721–1739 (2011)

  15. Cartis, C., Gould, N., Toint, P.: Complexity bounds for second-order optimality in unconstrained optimization. J. Complex. 28(1), 93–108 (2012)

  16. Cartis, C., Gould, N.I.M., Toint, P.L.: On the evaluation complexity of cubic regularization methods for potentially rank-deficient nonlinear least-squares problems and its relevance to constrained nonlinear optimization. SIAM J. Optim. 23(3), 1553–1574 (2013). https://doi.org/10.1137/120869687

  17. Cartis, C., Gould, N.I.M., Toint, P.L.: On the complexity of finding first-order critical points in constrained nonlinear optimization. Math. Program. Ser. A 144, 93–106 (2014)

  18. Cartis, C., Gould, N.I.M., Toint, P.L.: Optimization of orders one to three and beyond: characterization and evaluation complexity in constrained nonconvex optimization. J. Complex. 53, 68–94 (2019)

  19. Curtis, F.E., Jiang, H., Robinson, D.P.: An adaptive augmented Lagrangian method for large-scale constrained optimization. Math. Program. 152(1), 201–245 (2015). https://doi.org/10.1007/s10107-014-0784-y

  20. Ghadimi, S., Lan, G.: Accelerated gradient methods for nonconvex nonlinear and stochastic programming. Math. Program. 156(1), 59–99 (2016). https://doi.org/10.1007/s10107-015-0871-8

  21. Grapiglia, G.N., Nesterov, Y.: Regularized Newton methods for minimizing functions with Hölder continuous Hessians. SIAM J. Optim. 27(1), 478–506 (2017). https://doi.org/10.1137/16M1087801

  22. Grapiglia, G.N., Yuan, Y.X.: On the complexity of an augmented Lagrangian method for nonconvex optimization. arXiv e-prints arXiv:1906.05622 (2019)

  23. Haeser, G., Liu, H., Ye, Y.: Optimality condition and complexity analysis for linearly-constrained optimization without differentiability on the boundary. Math. Program. (2018). https://doi.org/10.1007/s10107-018-1290-4

  24. Hajinezhad, D., Hong, M.: Perturbed proximal primal-dual algorithm for nonconvex nonsmooth optimization. Math. Program. (2019). https://doi.org/10.1007/s10107-019-01365-4

  25. Hestenes, M.R.: Multiplier and gradient methods. J. Optim. Theory Appl. 4(5), 303–320 (1969). https://doi.org/10.1007/BF00927673

  26. Hong, M., Hajinezhad, D., Zhao, M.M.: Prox-PDA: The proximal primal-dual algorithm for fast distributed nonconvex optimization and learning over networks. In: D. Precup, Y.W. Teh (eds.) Proceedings of the 34th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 70, pp. 1529–1538. PMLR (2017). http://proceedings.mlr.press/v70/hong17a.html

  27. Jiang, B., Lin, T., Ma, S., Zhang, S.: Structured nonconvex and nonsmooth optimization: algorithms and iteration complexity analysis. Comput. Optim. Appl. 72(1), 115–157 (2019). https://doi.org/10.1007/s10589-018-0034-y

  28. Liu, K., Li, Q., Wang, H., Tang, G.: Spherical principal component analysis. In: Proceedings of the 2019 SIAM International Conference on Data Mining, pp. 387–395 (2019). https://doi.org/10.1137/1.9781611975673.44

  29. Nouiehed, M., Lee, J.D., Razaviyayn, M.: Convergence to second-order stationarity for constrained non-convex optimization. arXiv e-prints arXiv:1810.02024 (2018)

  30. O’Neill, M., Wright, S.J.: A log-barrier Newton-CG method for bound constrained optimization with complexity guarantees. IMA J. Numer. Anal. (2020). https://doi.org/10.1093/imanum/drz074

  31. Powell, M.J.D.: A method for nonlinear constraints in minimization problems. In: Optimization (Sympos., Univ. Keele, Keele, 1968), pp. 283–298. Academic Press, London (1969)

  32. Rockafellar, R.T.: Augmented Lagrangians and applications of the proximal point algorithm in convex programming. Math. Oper. Res. 1(2), 97–116 (1976). https://doi.org/10.1287/moor.1.2.97

  33. Royer, C.W., O’Neill, M., Wright, S.J.: A Newton-CG algorithm with complexity guarantees for smooth unconstrained optimization. Math. Program. (2019)

  34. Sun, J., Qu, Q., Wright, J.: Complete dictionary recovery over the sphere. In: 2015 International Conference on Sampling Theory and Applications (SampTA), pp. 407–410 (2015)

  35. Zhang, J., Luo, Z.Q.: A proximal alternating direction method of multiplier for linearly constrained nonconvex minimization. SIAM J. Optim. 30(3), 2272–2302 (2020). https://doi.org/10.1137/19M1242276

Acknowledgements

Research supported by Award N660011824020 from the DARPA Lagrange Program, NSF Awards 1628384, 1634597, and 1740707; and Subcontract 8F-30039 from Argonne National Laboratory.

Author information

Correspondence to Yue Xie.

Appendix: Proofs of Elementary Results

Proof of Theorem 1

Since \(x^*\) is a local minimizer of (1), it is the unique global solution of

$$\begin{aligned} \min \, f(x) + \frac{1}{4} \Vert x - x^* \Vert ^4 \quad \hbox {subject to}\;\; c(x) = 0, \;\; \Vert x - x^* \Vert \le \delta , \end{aligned}$$
(68)

for \(\delta > 0\) sufficiently small. For the same \(\delta \), we define \(x_k\) to be the global solution of

$$\begin{aligned} \min \, \quad f(x) + \frac{\rho _k}{2} \Vert c(x) \Vert ^2 + \frac{1}{4} \Vert x - x^* \Vert ^4 \quad \hbox {subject to}\;\; \Vert x - x^* \Vert \le \delta , \end{aligned}$$
(69)

for a given \(\rho _k\), where \(\{ \rho _k \}_{k \ge 1}\) is a positive sequence such that \(\rho _k \rightarrow +\infty \). Note that \(x_k\) is well defined, because the feasible region of (69) is compact and its objective is continuous. Let z be any accumulation point of \(\{ x_k \}_{k \ge 1}\), that is, \(x_k \rightarrow z\) for \(k \in {\mathcal {K}}\), for some subsequence \({\mathcal {K}}\). Such a z exists because \(\{ x_k \}_{k \ge 1}\) lies in a compact set, and moreover \(\Vert z - x^* \Vert \le \delta \). We want to show that \(z = x^*\). By the definition of \(x_k\) as the global solution of (69), we have for any \(k \ge 1\) that

$$\begin{aligned} f(x^*)&= f(x^*) + \frac{\rho _k}{2}\Vert c(x^*) \Vert ^2 + \frac{1}{4} \Vert x^* - x^* \Vert ^4 \nonumber \\&\ge f(x_k) + \frac{\rho _k}{2}\Vert c(x_k) \Vert ^2 + \frac{1}{4} \Vert x_k - x^* \Vert ^4 \ge f(x_k) + \frac{1}{4} \Vert x_k - x^* \Vert ^4. \end{aligned}$$
(70)

By taking the limit over \({\mathcal {K}}\) in the final inequality of (70), we have \(f(x^*) \ge f(z) + \frac{1}{4} \Vert z - x^* \Vert ^4\). From the first inequality in (70), after dropping the nonnegative quartic term, we also have

$$\begin{aligned} \frac{\rho _k}{2}\Vert c(x_k) \Vert ^2 \le f(x^*) - f(x_k) \le f(x^*) - \inf _{k \ge 1} f(x_k) < +\infty . \end{aligned}$$
(71)

Since \(\rho _k \rightarrow +\infty \) while the right-hand side of (71) is bounded, taking limits over \({\mathcal {K}}\) yields \(c(z) = 0\). Thus z is feasible for (68) and satisfies \(f(z) + \frac{1}{4} \Vert z - x^* \Vert ^4 \le f(x^*)\), so z is a global solution of (68); by uniqueness, \(z = x^*\).

Without loss of generality (passing to the subsequence \({\mathcal {K}}\) if necessary), suppose that \(x_k \rightarrow x^*\) and \(\Vert x_k - x^* \Vert < \delta \) for all k, so that the ball constraint in (69) is inactive. By the first- and second-order optimality conditions for (69), we have

$$\begin{aligned}&\nabla f(x_k) + \rho _k \nabla c(x_k) c(x_k) + \Vert x_k - x^* \Vert ^2 (x_k - x^*) = 0, \end{aligned}$$
(72)
$$\begin{aligned}&\nabla ^2 f(x_k) + \rho _k \sum _{i=1}^m c_i(x_k) \nabla ^2 c_i(x_k) + \rho _k \nabla c(x_k) [ \nabla c(x_k) ]^T \nonumber \\&\quad + 2(x_k-x^*)(x_k-x^*)^T + \Vert x_k-x^* \Vert ^2 I \succeq 0. \end{aligned}$$
(73)

Define \(\lambda _k \triangleq \rho _k c(x_k)\) and \(\epsilon _k \triangleq \max \{ \Vert x_k - x^* \Vert ^3, 3 \Vert x_k - x^* \Vert ^2, \sqrt{ 2( f(x^*) - \inf _{k \ge 1} f(x_k) )/\rho _k} \}\). Then by (71), (72), (73) and Definition 2, \(x_k\) is \(\epsilon _k\)-2o. Note that \(x_k \rightarrow x^*\) and \(\rho _k \rightarrow +\infty \), so \(\epsilon _k \rightarrow 0^+\). \(\square \)
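
To make the penalty-path construction of this proof concrete, the snippet below (a toy instance of our own, not taken from the paper) solves (69) by grid search for \(f(x) = x\) and \(c(x) = x\), where \(x^* = 0\) is the unique feasible point and the multiplier satisfying \(\nabla f(x^*) + \lambda ^* \nabla c(x^*) = 0\) is \(\lambda ^* = -1\). As \(\rho _k\) grows, \(x_k \rightarrow x^*\) and \(\lambda _k = \rho _k c(x_k) \rightarrow \lambda ^*\), consistent with \(\epsilon _k \rightarrow 0^+\).

```python
import numpy as np

# Toy instance of (69): f(x) = x, c(x) = x, so x* = 0 and lam* = -1.
delta = 0.5
xs = np.linspace(-delta, delta, 400001)           # fine grid over the ball
for rho in [1e1, 1e2, 1e3, 1e4]:
    vals = xs + 0.5 * rho * xs**2 + 0.25 * xs**4  # objective of (69)
    xk = xs[np.argmin(vals)]                      # global solution on grid
    lam_k = rho * xk                              # lambda_k := rho * c(x_k)
    print(f"rho={rho:8.0f}  x_k={xk:+.5f}  lam_k={lam_k:+.4f}")
# x_k -> x* = 0 and lam_k -> -1, illustrating eps_k -> 0 in the proof.
```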

Proof of Lemma 1

We argue by contradiction. If the claim failed for some \(\alpha \), we could select a sequence \(\{ x_k \}_{k \ge 1} \subseteq S_{\alpha }^0\) such that \(f(x_k) + \frac{\rho _0}{2} \Vert c(x_k) \Vert ^2 < - k\) for every k. Let \(x^*\) be an accumulation point of \(\{ x_k \}_{k \ge 1}\), which exists by compactness of \(S_\alpha ^0\). Then there exists an index K such that \(f(x^*) + \frac{\rho _0}{2} \Vert c(x^*) \Vert ^2 \ge -K + 1 > f(x_k) + \frac{\rho _0}{2} \Vert c(x_k) \Vert ^2 + 1\) for all \(k \ge K\). Since \(x_k \rightarrow x^*\) along a subsequence, continuity of \(f(x) + \frac{\rho _0}{2} \Vert c(x) \Vert ^2\) implies \(f(x_k) + \frac{\rho _0}{2} \Vert c(x_k) \Vert ^2 \rightarrow f(x^*) + \frac{\rho _0}{2} \Vert c(x^*) \Vert ^2\) along it, contradicting the gap of at least 1 in the inequality above. \(\square \)

Cite this article

Xie, Y., Wright, S.J. Complexity of Proximal Augmented Lagrangian for Nonconvex Optimization with Nonlinear Equality Constraints. J Sci Comput 86, 38 (2021). https://doi.org/10.1007/s10915-021-01409-y

Keywords

  • Optimization with nonlinear equality constraints
  • Nonconvex optimization
  • Proximal augmented Lagrangian
  • Complexity analysis
  • Newton-conjugate-gradient

Mathematics Subject Classification

  • 68Q25
  • 90C06
  • 90C26
  • 90C30
  • 90C60