Abstract
We present a local convergence analysis of the method of multipliers for equality-constrained variational problems (in the special case of optimization, also called the augmented Lagrangian method) under the sole assumption that the dual starting point is close to a noncritical Lagrange multiplier (which is weaker than second-order sufficiency). Local \(Q\)-superlinear convergence is established under appropriate control of the penalty parameter values. For optimization problems, we demonstrate in addition local \(Q\)-linear convergence for sufficiently large fixed penalty parameters. Both exact and inexact versions of the method are considered. Contributions with respect to previous state-of-the-art analyses for equality-constrained problems consist in the extension to the variational setting, in using the weaker noncriticality assumption instead of the usual second-order sufficient optimality condition (SOSC), and in relaxing the smoothness requirements on the problem data. In the context of optimization problems, this gives the first local convergence results for the augmented Lagrangian method under assumptions that do not include any constraint qualifications and are weaker than the SOSC. We also show that the analysis under the noncriticality assumption cannot be extended to the case with inequality constraints, unless the strict complementarity condition is added (this, however, still gives a new result).
References
ALGENCAN: http://www.ime.usp.br/egbirgin/tango/
Andreani, R., Birgin, E.G., Martínez, J.M., Schuverdt, M.L.: On augmented Lagrangian methods with general lower-level constraints. SIAM J. Optim. 18, 1286–1309 (2007)
Andreani, R., Birgin, E.G., Martínez, J.M., Schuverdt, M.L.: Augmented Lagrangian methods under the constant positive linear dependence constraint qualification. Math. Program. 111, 5–32 (2008)
Andreani, R., Haeser, G., Schuverdt, M.L., Silva, P.J.S.: A relaxed constant positive linear dependence constraint qualification and applications. Math. Program. 135(1–2), 255–273 (2012)
Bertsekas, D.P.: Constrained optimization and Lagrange multiplier methods. Academic Press, New York (1982)
Clarke, F.H.: Optimization and nonsmooth analysis. John Wiley & Sons, New York, USA (1983)
Conn, A., Gould, N., Sartenaer, A., Toint, P.: A globally convergent augmented Lagrangian algorithm for optimization with general constraints and simple bounds. SIAM J. Numer. Anal. 28, 545–572 (1991)
Conn, A., Gould, N., Sartenaer, A., Toint, P.: Convergence properties of an augmented Lagrangian algorithm for optimization with a combination of general equality and linear constraints. SIAM J. Optim. 6(3), 674–703 (1996)
Facchinei, F., Pang, J.S.: Finite-dimensional variational inequalities and complementarity problems. Springer-Verlag, New York (2003)
Fernández, D., Solodov, M.: Stabilized sequential quadratic programming for optimization and a stabilized Newton-type method for variational problems. Math. Program. 125(1), 47–73 (2010)
Fernández, D., Solodov, M.V.: Local convergence of exact and inexact augmented Lagrangian methods under the second-order sufficient optimality condition. SIAM J. Optim. 22, 384–407 (2012)
Fischer, A.: Local behavior of an iterative framework for generalized equations with nonisolated solutions. Math. Program. 94, 91–124 (2002)
Hestenes, M.R.: Multiplier and gradient methods. J. Optim. Theory Appl. 4, 303–320 (1969)
Izmailov, A.F.: On the analytical and numerical stability of critical Lagrange multipliers. Comput. Math. Math. Phys. 45(6), 930–946 (2005)
Izmailov, A.F., Kurennoy, A.S.: On regularity conditions for complementarity problems. Comput. Optim. Appl. 1–18 (2013). doi:10.1007/s10589-013-9604-1
Izmailov, A.F., Kurennoy, A.S., Solodov, M.V.: The Josephy-Newton method for semismooth generalized equations and semismooth SQP for optimization. Set-Valued Var. Anal. 21, 17–45 (2013)
Izmailov, A.F., Kurennoy, A.S., Solodov, M.V.: A note on upper Lipschitz stability, error bounds, and critical multipliers for Lipschitz-continuous KKT systems. Math. Program. 142, 591–604 (2013)
Izmailov, A.F., Pogosyan, A.L., Solodov, M.V.: Semismooth Newton method for the lifted reformulation of mathematical programs with complementarity constraints. Comput. Optim. Appl. 51(1), 199–221 (2012)
Izmailov, A.F., Solodov, M.V.: The theory of 2-regularity for mappings with Lipschitzian derivatives and its applications to optimality conditions. Math. Oper. Res. 27(3), 614–635 (2002)
Izmailov, A.F., Solodov, M.V.: Examples of dual behaviour of Newton-type methods on optimization problems with degenerate constraints. Comput. Optim. Appl. 42(2), 231–264 (2009)
Izmailov, A.F., Solodov, M.V.: On attraction of Newton-type iterates to multipliers violating second-order sufficiency conditions. Math. Program. 117(1–2), 271–304 (2009)
Izmailov, A.F., Solodov, M.V.: On attraction of linearly constrained Lagrangian methods and of stabilized and quasi-Newton SQP methods to critical multipliers. Math. Program. 126(2), 231–257 (2011)
Izmailov, A.F., Solodov, M.V.: Stabilized SQP revisited. Math. Program. 133, 93–120 (2012)
Izmailov, A.F., Solodov, M.V.: Newton-type methods for optimization and variational problems. Springer International Publishing Switzerland, Springer Series in Operations Research and Financial Engineering, Switzerland (2014)
Izmailov, A.F., Solodov, M.V., Uskov, E.I.: Global convergence of augmented Lagrangian methods applied to optimization problems with degenerate constraints, including problems with complementarity constraints. SIAM J. Optim. 22(4), 1579–1606 (2012)
Klatte, D., Kummer, B.: Nonsmooth equations in optimization: regularity, calculus, methods and applications. Kluwer Academic Publishers, Dordrecht (2002)
Klatte, D., Tammer, K.: On the second order sufficient conditions to perturbed \(\text{ C }^{1,\, 1}\) optimization problems. Optimization 19, 169–180 (1988)
LANCELOT: http://www.cse.scitech.ac.uk/nag/lancelot/lancelot.shtml
Nocedal, J., Wright, S.J.: Numerical optimization, 2nd edn. Springer, New York (2006)
Powell, M.J.D.: A method for nonlinear constraints in minimization problems. In: Fletcher, R. (ed.) Optimization, pp. 283–298. Academic Press, London and New York (1969)
Qi, L.: LC\(^1\) functions and LC\(^1\) optimization problems. Technical report AMR 91/21, School of Mathematics, The University of New South Wales, Sydney (1991)
Qi, L.: Superlinearly convergent approximate Newton methods for LC\(^1\) optimization problems. Math. Program. 64, 277–294 (1994)
Rockafellar, R.T.: Computational schemes for large-scale problems in extended linear-quadratic programming. Math. Program. 48(1–3), 447–474 (1990)
Rockafellar, R.T., Wets, R.J.B.: Generalized linear-quadratic problems of deterministic and stochastic optimal control in discrete time. SIAM J. Control Optim. 28(4), 810–822 (1990)
Ruszczyński, A.P.: Nonlinear optimization. Princeton University Press, Princeton (2006)
Serre, D.: Matrices: Theory and applications, vol. 216, 2nd edn. Springer (2010)
Solodov, M.V.: Constraint qualifications. In: Cochran, J.J., Cox, L.A., Keskinocak, P., Kharoufeh, J.P., Smith, J.C. (eds.) Wiley encyclopedia of operations research and management science. John Wiley & Sons, Inc., Hoboken (2010)
Stein, O.: Lifting mathematical programs with complementarity constraints. Math. Program. 131(1–2), 71–94 (2012)
Acknowledgments
The authors thank E. I. Uskov for a useful discussion on the relations between results obtained in this work and other existing local convergence theories for multiplier methods.
Additional information
Research of the first two authors is supported by the Russian Foundation for Basic Research Grant 14-01-00113. The third author is supported in part by CNPq Grant 302637/2011-7, by PRONEX–Optimization, and by FAPERJ.
Appendix
This appendix contains lemmas concerning nonsingularity of matrices of certain structure, used in the analysis above. The first one is a refined version of [23, Lemma 1].
Lemma 3
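Let \(H\) be an \(n\times n\)-matrix, \(B\) be an \(l\times n\)-matrix, and assume that
\[ H\xi \notin \mathop {\hbox {im}}B^{\mathrm{T}}\quad \forall \, \xi \in \ker B\setminus \{0\}. \tag{64} \]
Then for any \(M > 0\) there exists \(\gamma >0\) such that
\[ \Vert (\tilde{H} + t(B+\varOmega )^{\mathrm{T}}\tilde{B})\xi \Vert \ge \gamma \Vert \xi \Vert \quad \forall \, \xi \in \mathrm{I\! \mathrm R}^n \]
for every \(n\times n\)-matrix \(\tilde{H}\) close enough to \(H\), every \(l\times n\)-matrix \(\tilde{B}\) close enough to \(B\), every \(t\in \mathrm{I\! \mathrm R}\) such that \(|t|\) is sufficiently large, and for every \(l\times n\)-matrix \(\varOmega \) satisfying \(\Vert \varOmega \Vert \le M/|t|\).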
Proof
Suppose the contrary, i.e., that for some \(M>0\) there exist sequences \(\{H_k\}\) of \(n\times n\)-matrices, \(\{B_k\}\) and \(\{\varOmega _k\}\) of \(l\times n\)-matrices, \(\{t_k\}\subset \mathrm{I\! \mathrm R}\), and \(\{\xi ^k\}\subset \mathrm{I\! \mathrm R}^n\setminus \{0\}\), such that \(\{H_k\}\rightarrow H, \{B_k\}\rightarrow B, |t_k|\rightarrow \infty , \Vert \varOmega _k\Vert \le M/|t_k|\) for all \(k\), and
\[ \frac{\Vert (H_k + t_k(B+\varOmega _k)^{\mathrm{T}}B_k)\xi ^k\Vert }{\Vert \xi ^k\Vert }\rightarrow 0 \tag{65} \]
as \(k\rightarrow \infty \). Without loss of generality we may assume that \(\Vert \xi ^k\Vert =1\) for all \(k\) and that \(\{\xi ^k\}\rightarrow \xi \ne 0\). Then (65) means the existence of a sequence \(\{ w^k\} \subset \mathrm{I\! \mathrm R}^n\) such that \(\{ w^k\} \rightarrow 0\) and
\[ H_k\xi ^k + t_k(B+\varOmega _k)^{\mathrm{T}}B_k\xi ^k = w^k \tag{66} \]
for all \(k\). Therefore, it must hold that \(B^{\mathrm{T}}B\xi =0\), since
\[ (B+\varOmega _k)^{\mathrm{T}}B_k\xi ^k = \frac{1}{t_k}(w^k - H_k\xi ^k) \]
tends to \(0\) as \(k\rightarrow \infty \), while its left-hand side tends to \(B^{\mathrm{T}}B\xi \). Consequently, \(\Vert B\xi \Vert ^2=\langle \xi , B^{\mathrm{T}}B\xi \rangle =0\), i.e., \(\xi \in \ker B\).
On the other hand, (66) implies that
\[ H_k\xi ^k + (t_k\varOmega _k)^{\mathrm{T}}B_k\xi ^k - w^k = -t_kB^{\mathrm{T}}B_k\xi ^k\in \mathop {\hbox {im}}B^{\mathrm{T}} \]
for all \(k\), where the second term in the left-hand side tends to zero as \(k\rightarrow \infty \) because \(\{t_k\varOmega _k\}\) is bounded and \(\{B_k\xi ^k\}\rightarrow B\xi =0\), and the third term tends to zero because \(\{w^k\}\rightarrow 0\). Hence, \(H\xi \in \mathop {\hbox {im}}B^{\mathrm{T}}\) by the closedness of \(\mathop {\hbox {im}}B^{\mathrm{T}}\). Since \(\xi \in \ker B\setminus \{0\}\), this contradicts (64). \(\square \)
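The uniform nonsingularity in Lemma 3 can be checked numerically on a small instance. The following is only a sketch under stated assumptions: the data \(H=\mathrm{diag}(1,-1)\), \(B=(1\ \ 0)\), the choice \(\tilde{H}=H\), \(\tilde{B}=B\), \(\varOmega =(0\ \ M/|t|)\), and the helper names are hypothetical and do not come from the paper. Here \(\ker B\) is spanned by \((0,1)\), which \(H\) maps to \((0,-1)\notin \mathop {\hbox {im}}B^{\mathrm{T}}\), although \(H\) is negative on \(\ker B\). The code reports the smallest singular value of \(H+t(B+\varOmega )^{\mathrm{T}}B\) for several values of \(t\) of both signs.

```python
import math

# Hypothetical toy data (not from the paper): n = 2, l = 1.
H = [[1.0, 0.0], [0.0, -1.0]]   # indefinite; negative on ker B
B = [1.0, 0.0]                  # the single row of B; ker B = span{(0, 1)}

def perturbed_matrix(t, M):
    """Return H + t (B + Omega)^T B with Omega = [0, M/|t|], so ||Omega|| = M/|t|."""
    row = [B[0], B[1] + M / abs(t)]          # the row B + Omega
    return [[H[i][j] + t * row[i] * B[j] for j in range(2)] for i in range(2)]

def sigma_min(A):
    """Smallest singular value of a 2x2 matrix, computed as |det A| / sigma_max."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    fro2 = sum(a * a for r in A for a in r)   # trace(A^T A)
    disc = math.sqrt(max(fro2 * fro2 - 4.0 * det * det, 0.0))
    sigma_max = math.sqrt((fro2 + disc) / 2.0)
    return abs(det) / sigma_max

# Smallest singular values for growing |t|, both signs of t.
gammas = {t: sigma_min(perturbed_matrix(t, M=1.0)) for t in (10.0, -10.0, 1e3, -1e3, 1e6)}
for t, g in gammas.items():
    print(f"t = {t:>10.1f}   sigma_min = {g:.6f}")
```

In this instance the smallest singular value stays close to 1 for all tested \(t\), so a single \(\gamma \) works for both signs of \(t\). This illustrates the statement on one example and is not a substitute for the proof.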
Lemma 4
Under the assumptions of Lemma 3, for any \(M>0\) and any \(\varepsilon >0\) it holds that for every \(n\times n\)-matrix \(\tilde{H}\) close enough to \(H\), every \(l\times n\)-matrix \(\tilde{B}\) close enough to \(B\), every real \(t\) such that \(|t|\) is sufficiently large, and for all \(l\times n\)-matrices \(\varOmega \) satisfying \(\Vert \varOmega \Vert \le M/|t|\), the matrix \(\tilde{H} + t(B+\varOmega )^{\mathrm{T}}\tilde{B}\) is nonsingular and
\[ \Vert (\tilde{H} + t(B+\varOmega )^{\mathrm{T}}\tilde{B})^{-1}(B+\varOmega )^{\mathrm{T}}\Vert \le \varepsilon . \tag{67} \]
Proof
Fix arbitrary \(M>0\) and \(\varepsilon >0\). The assertion regarding nonsingularity of the matrix \(\tilde{H} + t(B+\varOmega )^{\mathrm{T}}\tilde{B}\) follows directly from Lemma 3. Therefore, we only have to prove that (possibly by making \(\tilde{H}\) closer to \(H, \tilde{B}\) closer to \(B\), and \(|t|\) larger) one can additionally ensure (67).
By contradiction, suppose that there exist sequences \(\{H_k\}\) of \(n\times n\)-matrices, \(\{B_k\}\) and \(\{\varOmega _k\}\) of \(l\times n\)-matrices, \(\{t_k\}\) of reals, and \(\{\eta ^k\}\subset \mathrm{I\! \mathrm R}^l\), such that \(\{H_k\}\rightarrow H, \{B_k\}\rightarrow B, |t_k|\rightarrow \infty , \Vert \varOmega _k\Vert \le M/|t_k|, \Vert \eta ^k\Vert =1\) and \(\det (H_k + t_k(B+\varOmega _k)^{\mathrm{T}}B_k)\ne 0\) for all \(k\), and for
\[ \xi ^k = (H_k + t_k(B+\varOmega _k)^{\mathrm{T}}B_k)^{-1}(B+\varOmega _k)^{\mathrm{T}}\eta ^k \tag{68} \]
it holds that
\[ \Vert \xi ^k\Vert > \varepsilon \tag{69} \]
for all \(k\). By (68) we have that
\[ H_k\xi ^k + t_k(B+\varOmega _k)^{\mathrm{T}}B_k\xi ^k = (B+\varOmega _k)^{\mathrm{T}}\eta ^k. \tag{70} \]
Due to (69), the sequence \(\{\eta ^k/\Vert \xi ^k\Vert \}\) is bounded. Without loss of generality we may assume that the sequence \(\{\xi ^k/\Vert \xi ^k\Vert \}\) converges to some \(\xi \in \mathrm{I\! \mathrm R}^n\) such that \(\Vert \xi \Vert =1\). Then dividing both sides of (70) by \(t_k\Vert \xi ^k\Vert \) and passing to the limit as \(k\rightarrow \infty \), we obtain that \(B^{\mathrm{T}}B\xi =0\), and hence, \(\xi \in \ker B\).
Furthermore, by (70), it holds that
\[ \frac{H_k\xi ^k}{\Vert \xi ^k\Vert } - \frac{\varOmega _k^{\mathrm{T}}\eta ^k}{\Vert \xi ^k\Vert } + \frac{(t_k\varOmega _k)^{\mathrm{T}}B_k\xi ^k}{\Vert \xi ^k\Vert } = \frac{B^{\mathrm{T}}(\eta ^k - t_kB_k\xi ^k)}{\Vert \xi ^k\Vert }\in \mathop {\hbox {im}}B^{\mathrm{T}} \]
for all \(k\). The second term in the left-hand side tends to zero because \(\{\Vert \varOmega _k\Vert \}\rightarrow 0\) while the sequence \(\{\eta ^k/\Vert \xi ^k\Vert \}\) is bounded. Moreover, the third term in the left-hand side tends to zero as well, because \(\{t_k\varOmega _k\}\) is bounded while \(\{B_k\xi ^k/\Vert \xi ^k\Vert \}\rightarrow B\xi =0\). Therefore, by closedness of \(\mathop {\hbox {im}}B^{\mathrm{T}}\), it follows that \(H\xi \in \mathop {\hbox {im}}B^{\mathrm{T}}\), which contradicts (64). \(\square \)
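The decay asserted by Lemma 4 can be observed on the same kind of small instance. The sketch below rests on assumptions that are not in the paper: the estimate is read as a bound on the norm of \((\tilde{H} + t(B+\varOmega )^{\mathrm{T}}\tilde{B})^{-1}(B+\varOmega )^{\mathrm{T}}\), and the data \(H=\mathrm{diag}(1,-1)\), \(B=\tilde{B}=(1\ \ 0)\), \(\varOmega =(0\ \ M/|t|)\) and the function name are hypothetical. Since \(l=1\), the quantity is a vector in \(\mathrm{I\! \mathrm R}^2\), and its Euclidean length is printed for increasing \(t\).

```python
import math

# Hypothetical toy data (not from the paper): n = 2, l = 1, B_tilde = B.
H = [[1.0, 0.0], [0.0, -1.0]]
B = [1.0, 0.0]

def inverse_times_row(t, M):
    """Return (H + t (B+Omega)^T B)^{-1} (B+Omega)^T for Omega = [0, M/|t|]."""
    r = [B[0], B[1] + M / abs(t)]                      # the row B + Omega
    A = [[H[i][j] + t * r[i] * B[j] for j in range(2)] for i in range(2)]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    # Solve A x = r with the explicit 2x2 inverse (Cramer's rule).
    return [(A[1][1] * r[0] - A[0][1] * r[1]) / det,
            (A[0][0] * r[1] - A[1][0] * r[0]) / det]

# Euclidean norms of the solution vector for increasing penalty-like parameter t.
norms = [math.hypot(*inverse_times_row(t, M=1.0)) for t in (1e1, 1e2, 1e3, 1e4)]
for t, nrm in zip((1e1, 1e2, 1e3, 1e4), norms):
    print(f"t = {t:>8.0f}   norm = {nrm:.3e}")
```

For this instance the norm behaves like \(1/(1+t)\), so any prescribed \(\varepsilon >0\) is eventually met once \(t\) is large enough.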
Lemma 5
In addition to the assumptions of Lemma 3, let \(H\) be symmetric.
Then for any \(M>0\) and any \(\varepsilon >0\) it holds that for every symmetric \(n\times n\)-matrix \(\tilde{H}\) close enough to \(H\), every real \(t\) such that \(|t|\) is sufficiently large, and for all \(l\times n\)-matrices \(\varOmega \) satisfying \(\Vert \varOmega \Vert \le M/|t|\), the matrix \(\tilde{H} + t(B+\varOmega )^{\mathrm{T}}(B+\varOmega )\) is nonsingular and the following estimate is valid:
\[ \Vert t(B+\varOmega )(\tilde{H} + t(B+\varOmega )^{\mathrm{T}}(B+\varOmega ))^{-1}(B+\varOmega )^{\mathrm{T}}\Vert \le 1+\varepsilon . \tag{71} \]
Proof
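Again, nonsingularity of \(\tilde{H} + t(B+\varOmega )^{\mathrm{T}}(B+\varOmega )\) is given by Lemma 3. If at the same time the estimate (71) does not hold, there must exist sequences \(\{H_k\}\) of symmetric \(n\times n\)-matrices, \(\{\varOmega _k\}\) of \(l\times n\)-matrices, \(\{t_k\}\) of reals, and \(\{\eta ^k\}\subset \mathrm{I\! \mathrm R}^l\), such that \(\{H_k\}\rightarrow H, |t_k|\rightarrow \infty \), and for all \(k\) it holds that \(\Vert \varOmega _k\Vert \le M/|t_k|, \Vert \eta ^k\Vert =1, \det (H_k + t_k(B+\varOmega _k)^{\mathrm{T}}(B+\varOmega _k))\ne 0\), and
\[ \Vert t_k(B+\varOmega _k)(H_k + t_k(B+\varOmega _k)^{\mathrm{T}}(B+\varOmega _k))^{-1}(B+\varOmega _k)^{\mathrm{T}}\eta ^k\Vert > 1+\varepsilon . \tag{72} \]
For each \(k\) set
\[ W_k = (B+\varOmega _k)(H_k + t_k(B+\varOmega _k)^{\mathrm{T}}(B+\varOmega _k))^{-1} = \left( (H_k + t_k(B+\varOmega _k)^{\mathrm{T}}(B+\varOmega _k))^{-1}(B+\varOmega _k)^{\mathrm{T}}\right) ^{\mathrm{T}}, \tag{73} \]
where the symmetry of \(H_k\) was taken into account. Due to Lemma 4 (applied with \(\tilde{B}=B+\varOmega _k\)) we have that \(\{ W_k\} \rightarrow 0\).
Furthermore, for each \(k\) the vector \(\eta ^k\) can be decomposed into the sum
\[ \eta ^k = \eta ^k_1 + \eta ^k_2, \]
where \(\eta ^k_1\in \ker B^{\mathrm{T}}=(\mathop {\hbox {im}}B)^\bot \) and \(\eta ^k_2\in \mathop {\hbox {im}}B\). Observe that \(t_kW_k(B+\varOmega _k)^{\mathrm{T}}\eta ^k_1=W_k(t_k\varOmega _k^{\mathrm{T}})\eta ^k_1\), and since the sequences \(\{\eta ^k_1\}\) and \(\{t_k\varOmega _k\}\) are bounded, and \(\{ W_k\} \rightarrow 0\), we conclude that \(\{ t_kW_k(B+\varOmega _k)^{\mathrm{T}}\eta ^k_1\} \rightarrow 0\). On the other hand, as \(\eta ^k_2\in \mathop {\hbox {im}}B\), there exists \(\xi ^k_2\in \mathrm{I\! \mathrm R}^n\) such that \(B\xi ^k_2=\eta ^k_2\) and the sequence \(\{\xi ^k_2\}\) is bounded. Therefore, employing (73),
\[ t_kW_k(B+\varOmega _k)^{\mathrm{T}}\eta ^k_2 = W_k(H_k + t_k(B+\varOmega _k)^{\mathrm{T}}(B+\varOmega _k))\xi ^k_2 - W_k(H_k + t_k(B+\varOmega _k)^{\mathrm{T}}\varOmega _k)\xi ^k_2 = \eta ^k_2 + \varOmega _k\xi ^k_2 - W_k(H_k + t_k(B+\varOmega _k)^{\mathrm{T}}\varOmega _k)\xi ^k_2. \]
The last two terms in the right-hand side tend to zero because the sequences \(\{\xi ^k_2\}\) and \(\{H_k +t_k(B+\varOmega _k)^{\mathrm{T}}\varOmega _k\}\) are bounded, while \(\{\varOmega _k\} \rightarrow 0\) and \(\{ W_k\} \rightarrow 0\). Therefore,
\[ \{ t_kW_k(B+\varOmega _k)^{\mathrm{T}}\eta ^k - \eta ^k_2\} \rightarrow 0,\quad \Vert \eta ^k_2\Vert \le \Vert \eta ^k\Vert =1\quad \text{for all } k, \]
which contradicts (72). \(\square \)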
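The estimate of Lemma 5 can also be illustrated on a toy instance. The sketch below assumes, without confirmation from the paper, that the estimate bounds \(t(B+\varOmega )(\tilde{H} + t(B+\varOmega )^{\mathrm{T}}(B+\varOmega ))^{-1}(B+\varOmega )^{\mathrm{T}}\) by \(1+\varepsilon \). The data \(H=\mathrm{diag}(1,-1)\), \(B=(1\ \ 0)\), \(\varOmega =(0\ \ M/|t|)\) and the function name are hypothetical. With \(l=1\) the quantity is a scalar, so it can be compared with 1 directly.

```python
# Hypothetical toy data (not from the paper): n = 2, l = 1, H symmetric.
H = [[1.0, 0.0], [0.0, -1.0]]
B = [1.0, 0.0]

def scaled_quadratic_form(t, M):
    """Return t (B+Omega) (H + t (B+Omega)^T (B+Omega))^{-1} (B+Omega)^T (a scalar, since l = 1)."""
    r = [B[0], B[1] + M / abs(t)]                      # the row B + Omega
    A = [[H[i][j] + t * r[i] * r[j] for j in range(2)] for i in range(2)]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    # x = A^{-1} r via the explicit 2x2 inverse (Cramer's rule).
    x = [(A[1][1] * r[0] - A[0][1] * r[1]) / det,
         (A[0][0] * r[1] - A[1][0] * r[0]) / det]
    return t * (r[0] * x[0] + r[1] * x[1])

# Evaluate for large |t| of both signs.
values = {t: scaled_quadratic_form(t, M=1.0) for t in (1e2, -1e2, 1e4, -1e4)}
for t, v in values.items():
    print(f"t = {t:>9.0f}   value = {v:.6f}")
```

In this instance the value approaches 1 from below for positive \(t\) and from above for negative \(t\), which suggests why a bound of the form \(1+\varepsilon \), rather than 1, is the natural one when both signs of \(t\) are allowed.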
Cite this article
Izmailov, A.F., Kurennoy, A.S. & Solodov, M.V. Local convergence of the method of multipliers for variational and optimization problems under the noncriticality assumption. Comput Optim Appl 60, 111–140 (2015). https://doi.org/10.1007/s10589-014-9658-8
DOI: https://doi.org/10.1007/s10589-014-9658-8