Local convergence analysis of augmented Lagrangian method for nonlinear semidefinite programming

Abstract

The augmented Lagrangian method (ALM) has enjoyed tremendous popularity for its elegant theory and impressive numerical performance since it was proposed by Hestenes and Powell in 1969, and it underpins numerous efficient solvers for a wide range of problems. In this paper, without requiring the uniqueness of multipliers, the local Q-linear (asymptotically Q-superlinear) convergence rate of the primal-dual sequence generated by the ALM for nonlinear semidefinite programming is established under the second-order sufficient condition, the semi-isolated calmness of the Karush–Kuhn–Tucker solution, and some additional mild conditions.

Data availability

No datasets were generated or analyzed during this study, as the work is purely theoretical.

References

1. Arrow, K.J., Solow, R.M.: Gradient methods for constrained maxima with weakened assumptions. In: Arrow, K.J., Hurwicz, L., Uzawa, H. (eds.) Studies in Linear and Nonlinear Programming, pp. 165–176. Stanford University Press, Stanford (1958)

2. Bauschke, H.H., Borwein, J.M.: On projection algorithms for solving convex feasibility problems. SIAM Rev. 38, 367–426 (1996)

3. Bauschke, H.H., Borwein, J.M., Li, W.: Strong conical hull intersection property, bounded linear regularity, Jameson's property (G), and error bounds in convex optimization. Math. Program. 86, 135–160 (1999)

4. Bertsekas, D.P.: Constrained Optimization and Lagrange Multiplier Methods. Academic Press, New York (1982)

5. Bonnans, J.F., Shapiro, A.: Perturbation Analysis of Optimization Problems. Springer, New York (2000)

6. Buys, J.D.: Dual Algorithms for Constrained Optimization Problems. Doctoral dissertation, University of Leiden, Leiden, The Netherlands (1972)

7. Clarke, F.H.: Optimization and Nonsmooth Analysis. Wiley-Interscience, New York (1983)

8. Conn, A.R., Gould, N.I.M., Toint, Ph.L.: A globally convergent augmented Lagrangian algorithm for optimization with general constraints and simple bounds. SIAM J. Numer. Anal. 28, 545–572 (1991)

9. Contesse-Becker, L.: Extended convergence results for the method of multipliers for non-strictly binding inequality constraints. J. Optim. Theory Appl. 79, 273–310 (1993)

10. Cui, Y., Sun, D.F., Toh, K.C.: On the R-superlinear convergence of the KKT residuals generated by the augmented Lagrangian method for convex composite conic programming. Math. Program. 178, 381–415 (2019). arXiv:1610.00875

11. Cui, Y., Ding, C.: Nonsmooth composite matrix optimization: strong regularity, constraint nondegeneracy and beyond (2019). arXiv:1907.13253

12. Cui, Y., Ding, C., Zhao, X.Y.: Quadratic growth conditions for convex matrix optimization problems associated with spectral functions. SIAM J. Optim. 27, 2332–2355 (2017)

13. Ding, C.: An Introduction to a Class of Matrix Optimization Problems. Ph.D. thesis, National University of Singapore (2012)

14. Ding, C., Sun, D.F., Ye, J.J.: First order optimality conditions for mathematical programs with semidefinite cone complementarity constraints. Math. Program. 147, 539–579 (2014)

15. Ding, C., Sun, D.F., Zhang, L.W.: Characterization of the robust isolated calmness for a class of conic programming problems. SIAM J. Optim. 27, 67–90 (2017)

16. Dontchev, A.L., Rockafellar, R.T.: Characterizations of Lipschitz stability in nonlinear programming. In: Mathematical Programming with Data Perturbations, pp. 65–82. Marcel Dekker, New York (1997)

17. Dontchev, A.L., Rockafellar, R.T.: Implicit Functions and Solution Mappings. Springer, Heidelberg (2009)

18. Eckstein, J., Silva, P.J.S.: A practical relative error criterion for augmented Lagrangians. Math. Program. 141, 319–348 (2013)

19. Fernandez, D., Solodov, M.V.: Local convergence of exact and inexact augmented Lagrangian methods under the second-order sufficient optimality condition. SIAM J. Optim. 22, 384–407 (2012)

20. Golshtein, E.G., Tretyakov, N.V.: Modified Lagrangians and Monotone Maps in Optimization. Wiley, New York (1989)

21. Hang, N.T.V., Mordukhovich, B.S., Sarabi, M.E.: Augmented Lagrangian method for second-order cone programs under second-order sufficiency. J. Glob. Optim. 1–31 (2021)

22. Hang, N.T.V., Sarabi, M.E.: Local convergence analysis of augmented Lagrangian methods for piecewise linear-quadratic composite optimization problems (2020). arXiv:2010.11379

23. Hestenes, M.R.: Multiplier and gradient methods. J. Optim. Theory Appl. 4, 303–320 (1969)

24. Hoffman, A.J.: On approximate solutions of systems of linear inequalities. J. Res. Natl. Bur. Stand. 49, 263–265 (1952)

25. Ito, K., Kunisch, K.: The augmented Lagrangian method for equality and inequality constraints in Hilbert spaces. Math. Program. 46, 341–360 (1990)

26. Izmailov, A.F., Kurennoy, A.S., Solodov, M.V.: A note on upper Lipschitz stability, error bounds, and critical multipliers for Lipschitz-continuous KKT systems. Math. Program. 142, 591–604 (2013)

27. Izmailov, A.F., Solodov, M.V.: Newton-Type Methods for Optimization and Variational Problems. Springer, New York (2014)

28. Kanzow, C., Steck, D.: On error bounds and multiplier methods for variational problems in Banach spaces. SIAM J. Control Optim. 56, 1716–1738 (2018)

29. Kanzow, C., Steck, D.: Improved local convergence results for augmented Lagrangian methods in \(C^2\)-cone reducible constrained optimization. Math. Program. 177, 425–438 (2019)

30. Klatte, D.: Upper Lipschitz behavior of solutions to perturbed \(C^{1,1}\) programs. Math. Program. 88, 285–311 (2000)

31. Leibfritz, F.: COMPleib 1.1: COnstraint Matrix-optimization Problem Library—a collection of test examples for nonlinear semidefinite programs, control system design and related problems. Technical Report, Department of Mathematics, University of Trier, Germany (2005)

32. Li, X.D., Sun, D.F., Toh, K.C.: QSDPNAL: a two-phase augmented Lagrangian method for convex quadratic semidefinite programming. Math. Program. Comput. 10, 703–743 (2018)

33. Li, X.D., Sun, D.F., Toh, K.C.: A highly efficient semismooth Newton augmented Lagrangian method for solving Lasso problems. SIAM J. Optim. 28, 433–458 (2018)

34. Luque, F.J.: Asymptotic convergence analysis of the proximal point algorithm. SIAM J. Control Optim. 22, 277–293 (1984)

35. Mohammadi, A., Mordukhovich, B.S., Sarabi, M.E.: Parabolic regularity in geometric variational analysis. Trans. Am. Math. Soc. 374, 1711–1763 (2021)

36. Mordukhovich, B.S.: Variational Analysis and Generalized Differentiation, Vol. 1: Basic Theory, Vol. 2: Applications. Springer, Berlin (2006)

37. Mordukhovich, B.S., Sarabi, M.E.: Critical multipliers in variational systems via second-order generalized differentiation. Math. Program. 169, 605–648 (2018)

38. Mordukhovich, B.S., Sarabi, M.E.: Criticality of Lagrange multipliers in variational systems. SIAM J. Optim. 29, 425–438 (2019)

39. Pennanen, T.: Local convergence of the proximal point algorithm and multiplier methods without monotonicity. Math. Oper. Res. 27, 170–191 (2002)

40. Poliquin, R.A., Rockafellar, R.T.: Generalized Hessian properties of regularized nonsmooth functions. SIAM J. Optim. 6, 1121–1137 (1996)

41. Powell, M.J.D.: A method for nonlinear constraints in minimization problems. In: Fletcher, R. (ed.) Optimization, pp. 283–298. Academic Press, New York (1969)

42. Robinson, S.M.: Some continuity properties of polyhedral multifunctions. Math. Program. Study 14, 206–214 (1981)

43. Robinson, S.M.: Generalized equations and their solutions, Part II: applications to nonlinear programming. Math. Program. Study 19, 200–221 (1982)

44. Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)

45. Rockafellar, R.T.: New applications of duality in convex programming. In: Proceedings of the 4th Conference on Probability, Brasov, Romania, pp. 37–81 (1971); written version of a talk also given at the 7th International Symposium on Mathematical Programming, The Hague (1970)

46. Rockafellar, R.T.: A dual approach to solving nonlinear programming problems by unconstrained optimization. Math. Program. 5, 354–373 (1973)

47. Rockafellar, R.T.: The multiplier method of Hestenes and Powell applied to convex programming. J. Optim. Theory Appl. 12, 555–562 (1973)

48. Rockafellar, R.T.: Augmented Lagrangians and applications of the proximal point algorithm in convex programming. Math. Oper. Res. 1, 97–116 (1976)

49. Rockafellar, R.T.: Monotone operators and the proximal point algorithm. SIAM J. Control Optim. 14, 877–898 (1976)

50. Rockafellar, R.T.: Lagrange multipliers and optimality. SIAM Rev. 35, 183–238 (1993)

51. Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. Springer, Berlin (1998)

52. Rockafellar, R.T.: Augmented Lagrangians and hidden convexity in sufficient conditions for local optimality. Math. Program., to appear

53. Rockafellar, R.T.: Convergence of augmented Lagrangian methods in extensions beyond nonlinear programming. Preprint, https://sites.math.washington.edu/~rtr/papers/rtr258-ExtendedALM.pdf (2021)

54. Rudin, W.: Principles of Mathematical Analysis, 3rd edn. McGraw-Hill, New York (1976)

55. Shapiro, A., Sun, J.: Some properties of the augmented Lagrangian in cone constrained optimization. Math. Oper. Res. 29, 479–491 (2004)

56. Sun, D.F.: The strong second order sufficient condition and constraint nondegeneracy in nonlinear semidefinite programming and their implications. Math. Oper. Res. 31, 761–776 (2006)

57. Sun, D.F., Sun, J.: Semismooth matrix-valued functions. Math. Oper. Res. 27, 150–169 (2002)

58. Sun, D.F., Sun, J.: Strong semismoothness of eigenvalues of symmetric matrices and its application to inverse eigenvalue problems. SIAM J. Numer. Anal. 40, 2352–2367 (2003)

59. Sun, D.F., Sun, J., Zhang, L.W.: The rate of convergence of the augmented Lagrangian method for nonlinear semidefinite programming. Math. Program. 114, 349–391 (2008)

60. Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B Stat. Methodol. 58, 267–288 (1996)

61. Tretyakov, N.V.: A method of penalty estimates for convex programming problems. Ékonomika i Matematicheskie Metody 9, 525–540 (1973)

62. Yang, L.Q., Sun, D.F., Toh, K.C.: SDPNAL\(+\): a majorized semismooth Newton-CG augmented Lagrangian method for semidefinite programming with nonnegative constraints. Math. Program. Comput. 7, 331–366 (2015)

63. Ye, J.J., Ye, X.Y.: Necessary optimality conditions for optimization problems with variational inequality constraints. Math. Oper. Res. 22, 977–997 (1997)

64. Zarantonello, E.H.: Projections on convex sets in Hilbert space and spectral theory, Parts I and II. In: Contributions to Nonlinear Functional Analysis, pp. 237–424. Academic Press, New York (1971)

65. Zhao, X.Y., Sun, D.F., Toh, K.C.: A Newton-CG augmented Lagrangian method for semidefinite programming. SIAM J. Optim. 20, 1737–1765 (2010)

Author information

Corresponding author

Correspondence to Shiwei Wang.

Ethics declarations

Conflicts of interest

All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.

Additional information

The research of this author was supported by the National Natural Science Foundation of China under project No. 12071464 and the Beijing Natural Science Foundation (Z190002).

Appendices

Appendix A: Proof of Lemma 1

For each \(s\in \{1,\dots ,\bar{d}\}\), let \(\varLambda _{\bar{\iota }_s\bar{\iota }_s}=\textrm{Diag}(\lambda _{\bar{\iota }_s}(A))\) and \(\varXi _{\bar{\iota }_s\bar{\iota }_s}=\textrm{Diag}(\lambda _{\bar{\iota }_s}(A+H))\). We first show that (10) holds. If \(\bar{d}=1\), i.e., \(\lambda _1(\overline{A})=\dots =\lambda _n(\overline{A})\), the first equation in (10) trivially holds. Next we assume that \(\bar{d}\ge 2\). From (9), we have for any \({\mathcal S}^n\ni H\rightarrow 0\),

$$\begin{aligned} \left[ \begin{array}{cccc} \varLambda _{\bar{\iota }_1\bar{\iota }_1} &{} 0 &{} \dots &{} 0\\ 0 &{} \varLambda _{\bar{\iota }_2\bar{\iota }_2} &{} \dots &{} 0\\ \vdots &{}\vdots &{}\ddots &{}\vdots \\ 0 &{} 0 &{}\dots &{} \varLambda _{\bar{\iota }_{\bar{d}}\bar{\iota }_{\bar{d}}} \end{array}\right] U+HU=U\left[ \begin{array}{cccc} \varXi _{\bar{\iota }_1\bar{\iota }_1} &{} 0 &{} \dots &{} 0\\ 0 &{} \varXi _{\bar{\iota }_2\bar{\iota }_2} &{} \dots &{} 0\\ \vdots &{}\vdots &{}\ddots &{}\vdots \\ 0 &{} 0 &{}\dots &{} \varXi _{\bar{\iota }_{\bar{d}}\bar{\iota }_{\bar{d}}} \end{array}\right] . \end{aligned}$$

It follows from \(A\in \mathbb {B}_{r}(\overline{A})\) and \(\Vert H\Vert \le \omega :=r/6\) that \(\lambda _i(A)\ne \lambda _j(A+H)\) whenever \(i\in \bar{\iota }_s\), \(j\in \bar{\iota }_t\) with \(s\ne t\). Comparing the \((i,j)\)-th entries of the two sides above, we see that for all \(i\in \bar{\iota }_s\), \(j\in \bar{\iota }_t\) with \(s\ne t\), \(U_{ij}=\frac{(HU)_{ij}}{\varXi _{jj}-\varLambda _{ii}}\). Then we have for all \(\Vert H\Vert \le \omega \),

$$\begin{aligned} \frac{\Vert U_{\bar{\iota }_{s} \bar{\iota }_{t}}\Vert ^2}{\Vert H\Vert ^2}\le \sum _{i\in \bar{\iota }_s, j\in \bar{\iota }_t}\frac{1}{(\varLambda _{ii}-\varXi _{jj})^2}\le \sum _{i\in \bar{\iota }_s, j\in \bar{\iota }_t}\frac{1}{\big (|\lambda _i(\overline{A})-\lambda _j(\overline{A})|-2r-2\omega \big )^2}:=q, \end{aligned}$$

where \(\omega \) and q are independent of A. Hence we obtain that

$$\begin{aligned} U_{\bar{\iota }_{s} \bar{\iota }_{t}}={O}(\Vert H\Vert ) \quad \forall \, 1 \le s \ne t \le \bar{d}, \end{aligned}$$

where \({O}(\Vert H\Vert )\) is uniform for all \(A\in \mathbb {B}_{r}(\overline{A})\). By using the fact that U is orthogonal, we obtain directly that the second equation in (10) holds. In order to prove (11), we consider the SVD of each \(U_{\bar{\iota }_{s} \bar{\iota }_{s}}\), \(s=1,\dots ,\bar{d}\). Fix \(s\in \{1,\dots ,\bar{d}\}\). Let W and V be in \({\mathcal O}^{|\bar{\iota }_s|}\) such that \(U_{\bar{\iota }_{s} \bar{\iota }_{s}}=W\varSigma V^T\), where \(\varSigma \) is a nonnegative diagonal matrix. From (10), we obtain that for all \(A\in \mathbb {B}_{r}(\overline{A})\),

$$\begin{aligned} W\varSigma ^2W^T=I_{|\bar{\iota }_{s}|}+{O}(\Vert H\Vert ^2), \end{aligned}$$

which is equivalent to

$$\begin{aligned} \varSigma ^2=W^TW+{O}(\Vert H\Vert ^2)=I_{|\bar{\iota }_{s}|}+{O}(\Vert H\Vert ^2). \end{aligned}$$

Since \(\varSigma \) is a nonnegative diagonal matrix, we may conclude that

$$\begin{aligned} \varSigma =\textrm{Diag}(1+{O}(\Vert H\Vert ^2),\dots ,1+{O}(\Vert H\Vert ^2)). \end{aligned}$$

Therefore, from \(U_{\bar{\iota }_{s} \bar{\iota }_{s}}=W\varSigma V^T\), we have \(U_{\bar{\iota }_{s} \bar{\iota }_{s}}=WV^T+O(\Vert H\Vert ^2)\). Since \(WV^T\in {\mathcal O}^{|\bar{\iota }_{s}|}\), we know that for all \(A\in \mathbb {B}_{r}(\overline{A})\), (11) holds. Next, we show that (12) holds. For each \(s\in \{1,\dots ,\bar{d}\}\), comparing the \(s\)-th diagonal blocks of both sides of (9) yields

$$\begin{aligned} U^T_{\bar{\iota }_{s}}(\varLambda (A)+H)U_{\bar{\iota }_{s}}=\varXi _{\bar{\iota }_s\bar{\iota }_s}. \end{aligned}$$
(78)

Fix \(s\in \{1,\dots ,\bar{d}\}\). From (10) and (78), we know that

$$\begin{aligned} U_{\bar{\iota }_{s}}^T\varLambda (A)U_{\bar{\iota }_{s}}&=\left[ \begin{array}{ccc} {O}(\Vert H\Vert )&U_{\bar{\iota }_{s} \bar{\iota }_{s}}^T&{O}(\Vert H\Vert )\end{array}\right] \left[ \begin{array}{ccc} \varLambda _1(A)&{}0&{}0\\ 0&{}\varLambda (A)_{\bar{\iota }_{s} \bar{\iota }_{s}}&{}0\\ 0&{}0&{}\varLambda _2(A) \end{array}\right] \left[ \begin{array}{c} {O}(\Vert H\Vert )\\ U_{\bar{\iota }_{s} \bar{\iota }_{s}}\\ {O}(\Vert H\Vert ) \end{array}\right] \\&={O}(\Vert H\Vert ^2)\varLambda _1(A)+U_{\bar{\iota }_{s} \bar{\iota }_{s}}^T\varLambda (A)_{\bar{\iota }_{s} \bar{\iota }_{s}}U_{\bar{\iota }_{s} \bar{\iota }_{s}}+{O}(\Vert H\Vert ^2)\varLambda _2(A). \end{aligned}$$

It follows that

$$\begin{aligned} \varXi _{\bar{\iota }_{s} \bar{\iota }_{s}}-\big ({O}(\Vert H\Vert ^2)\varLambda _1(A)+U_{\bar{\iota }_{s} \bar{\iota }_{s}}^T\varLambda (A)_{\bar{\iota }_{s} \bar{\iota }_{s}}U_{\bar{\iota }_{s} \bar{\iota }_{s}}+{O}(\Vert H\Vert ^2)\varLambda _2(A)\big )\\ =U_{\bar{\iota }_{s} \bar{\iota }_{s}}^TH_{\bar{\iota }_{s} \bar{\iota }_{s}}U_{\bar{\iota }_{s} \bar{\iota }_{s}}+{O}(\Vert H\Vert ^2). \end{aligned}$$

Since \(U_{\bar{\iota }_{s} \bar{\iota }_{s}}=Q_s+{O}(\Vert H\Vert ^2)\) and \(\Vert \varLambda (A)\Vert \le \Vert \varLambda (\overline{A})\Vert +r\), we obtain that

$$\begin{aligned} Q_s^TH_{\bar{\iota }_{s} \bar{\iota }_{s}}Q_s=\varXi _{\bar{\iota }_{s} \bar{\iota }_{s}}-Q_s^T\varLambda (A)_{\bar{\iota }_{s} \bar{\iota }_{s}}Q_s+{O}(\Vert H\Vert ^2). \end{aligned}$$

Hence (12) holds with \({O}(\Vert H\Vert ^2)\) uniform for all \(A\in \mathbb {B}_{r}(\overline{A})\). This completes the proof.
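
Although not needed for the proof, the block estimates (10)–(11) are easy to probe numerically. The following sketch (our own illustration, not part of the paper; it assumes NumPy and a hypothetical eigenvalue pattern \((2,2,0,0,-1)\)) checks that the off-diagonal eigenvector blocks scale like \(\Vert H\Vert \), while the diagonal blocks are orthogonal up to \({O}(\Vert H\Vert ^2)\):

    import numpy as np

    rng = np.random.default_rng(0)
    lam = np.array([2.0, 2.0, 0.0, 0.0, -1.0])        # eigenvalues of A, with two repeated clusters
    A = np.diag(lam)
    blocks = [slice(0, 2), slice(2, 4), slice(4, 5)]  # the index sets \bar{iota}_1, \bar{iota}_2, \bar{iota}_3

    for eps in [1e-2, 1e-4, 1e-6]:
        H = rng.standard_normal((5, 5)); H = eps * (H + H.T) / 2
        nH = np.linalg.norm(H)
        w, U = np.linalg.eigh(A + H)                  # eigh returns ascending eigenvalues
        U = U[:, ::-1]                                # reorder to the paper's descending convention
        off = max(np.linalg.norm(U[bs, bt]) for bs in blocks for bt in blocks if bs != bt)
        dev = max(np.linalg.norm(U[bs, bs].T @ U[bs, bs] - np.eye(bs.stop - bs.start))
                  for bs in blocks)
        print(f"|H| = {nH:.1e},  off-diag/|H| = {off/nH:.2f},  orth-deviation/|H|^2 = {dev/nH**2:.2f}")

Both printed ratios remain bounded as \(\Vert H\Vert \rightarrow 0\), matching the orders asserted in (10) and (11).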

Appendix B: Proof of Proposition 1

We first show that (14) holds in the case \(A=\varLambda (A)\), i.e., when A is diagonal. For any \(H\in {\mathcal S}^n\), denote \(Z=A+H\). Let \(U\in {\mathcal O}^n\) (depending on H) be such that

$$\begin{aligned} \varLambda (A)+H=U\varLambda (Z)U^T. \end{aligned}$$
(79)

Let \(\omega >0\) be a fixed number satisfying \(\omega \le \frac{{\lambda }_{|\alpha |}(\overline{A})-r}{2}\) if \(\alpha \ne \varnothing \), and an arbitrary fixed positive number otherwise. Define the following continuous scalar function

$$\begin{aligned} f(t):=\left\{ \begin{array}{ll} t &{} \text{ if } \quad t>\omega ,\\ 2 t-\omega &{} \text{ if } \quad \frac{\omega }{2}<t\le \omega ,\\ 0 &{} \text{ if } \quad t\le \frac{\omega }{2}. \end{array}\right. \end{aligned}$$

Therefore, we have

$$\begin{aligned} \left\{ \lambda _{1}(A), \ldots , \lambda _{|\alpha |}(A)\right\} \subset (\omega ,+\infty ) \quad \text{ and } \quad \left\{ \lambda _{|\alpha |+1}(A), \ldots , \lambda _{n}(A)\right\} \subset \Big (-\infty , \frac{\omega }{2}\Big ). \end{aligned}$$

For the scalar function f, let \(F:{\mathcal S}^n\rightarrow {\mathcal S}^n\) be the corresponding Löwner’s operator, i.e., for any \(W\in {\mathcal S}^n\),

$$\begin{aligned} F(W):=\sum _{i=1}^nf(\lambda _i(W))P_iP_i^T, \end{aligned}$$

where \(P\in {\mathcal O}^n(W)\). Since f is real analytic on the open set \((-\infty ,\frac{\omega }{2})\cup (\omega ,+\infty )\), it is well known that for H sufficiently close to zero,

$$\begin{aligned} F(A+H)-F(A)-F'(A)H=O(\Vert H\Vert ^2) \end{aligned}$$
(80)

and

$$\begin{aligned} F'(A)H=\left[ \begin{array}{ccc} H_{\alpha \alpha } &{} H_{\alpha \beta } &{} \varSigma _{\alpha \gamma } \circ H_{\alpha \gamma } \\ H_{\alpha \beta }^{T} &{} 0 &{} 0 \\ \varSigma _{\alpha \gamma }^{T} \circ H_{\alpha \gamma }^{T} &{} 0 &{} 0 \end{array}\right] , \end{aligned}$$

where \(O(\Vert H\Vert ^2)\) is independent of A for any \(A\in \mathbb {B}_r(\overline{A})\) and \(\varSigma \) is given by

$$\begin{aligned} \varSigma _{ij}=\frac{\max \{\lambda _i(A),0\}-\max \{\lambda _j(A),0\}}{\lambda _i(A)-\lambda _j(A)},\quad i\in \alpha ,\; j\in \gamma . \end{aligned}$$
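
As a concrete illustration of the operator F and the expansion (80), one can implement the Löwner operator directly from its spectral definition and verify that the first-order residual is \({O}(\Vert H\Vert ^2)\). The following is a sketch of our own (assuming NumPy; the diagonal test matrix and the choice \(\omega =1\) are hypothetical), not part of the paper:

    import numpy as np

    rng = np.random.default_rng(1)
    lam = np.array([3.0, 2.0, 0.0, 0.0, -1.0, -2.0])  # spectrum of A, split into alpha, beta, gamma
    A = np.diag(lam)
    a, b, g = slice(0, 2), slice(2, 4), slice(4, 6)
    om = 1.0                                          # omega, separating the positive eigenvalues

    def f(t):                                         # the continuous scalar function defined above
        return np.where(t > om, t, np.where(t > om / 2, 2 * t - om, 0.0))

    def F(W):                                         # Loewner operator: sum_i f(lam_i) p_i p_i^T
        w, P = np.linalg.eigh(W)
        return (P * f(w)) @ P.T

    Sig = np.maximum(lam[a, None], 0.0) - np.maximum(lam[None, g], 0.0)
    Sig = Sig / (lam[a, None] - lam[None, g])         # the Sigma_{alpha gamma} block

    def dF(H):                                        # F'(A)H from the displayed block formula
        D = np.zeros_like(H)
        D[a, a] = H[a, a]; D[a, b] = H[a, b]; D[b, a] = H[b, a]
        D[a, g] = Sig * H[a, g]; D[g, a] = (Sig * H[a, g]).T
        return D

    for eps in [1e-2, 1e-3, 1e-4]:
        H = rng.standard_normal((6, 6)); H = eps * (H + H.T) / 2
        res = np.linalg.norm(F(A + H) - F(A) - dF(H))
        print(f"|H| = {np.linalg.norm(H):.1e},  residual/|H|^2 = {res/np.linalg.norm(H)**2:.2f}")

The residual ratio stays bounded as \(\Vert H\Vert \rightarrow 0\), consistent with (80).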

Let \(R(\cdot ):=\varPi _{{\mathcal S}_+^n}(\cdot )-F(\cdot )\). By the definition of f, we know that \(F(A)=\varPi _{{\mathcal S}_+^n}(A)\), which implies that \(R(A)=0\). Meanwhile, the matrix-valued function R is directionally differentiable at A, and its directional derivative in any given direction \(H\in {\mathcal S}^n\) is given by

$$\begin{aligned} R^{\prime }(A; H)=\varPi _{\mathcal {S}_{+}^{n}}^{\prime }(A; H)-F^{\prime }(A) H=\left[ \begin{array}{ccc} 0 &{} 0 &{} 0 \\ 0 &{} \varPi _{\mathcal {S}_{+}^{|\beta |}}( H_{\beta \beta }) &{} 0 \\ 0 &{} 0 &{} 0 \end{array}\right] . \end{aligned}$$

By the Lipschitz continuity of \(\lambda (\cdot )\), we know that for H sufficiently close to zero,

$$\begin{aligned} \left\{ \lambda _{1}(Z), \ldots , \lambda _{|\alpha |}(Z)\right\} \subset (\omega ,+\infty ), \quad \left\{ \lambda _{|\alpha |+1}(Z), \ldots , \lambda _{|\alpha |+|\beta |}(Z)\right\} \subset \Big (-\infty , \frac{\omega }{2}\Big ) \end{aligned}$$

and

$$\begin{aligned} \{\lambda _{|\alpha |+|\beta |+1}(Z), \dots , \lambda _{n}(Z)\} \subset (-\infty ,0). \end{aligned}$$

Therefore, by the definition of F, we know that for H sufficiently close to zero,

$$\begin{aligned} R(A+H)=\varPi _{{\mathcal S}_+^n}(A+H)-F(A+H)=U\left[ \begin{array}{ccc} 0 &{} 0 &{} 0 \\ 0 &{} \varPi _{\mathcal {S}_{+}^{|\beta |}}(\varLambda (Z)_{\beta \beta }) &{} 0 \\ 0 &{} 0 &{} 0 \end{array}\right] U^T. \end{aligned}$$

Since \(U\in {\mathcal O}^n(Z)\), we know from Lemma 1 that for any \({\mathcal S}^n\ni H\rightarrow 0\), there exists an orthogonal matrix \(Q\in {\mathcal O}^{|\beta |}\) such that

$$\begin{aligned} U_{\beta }=\left[ \begin{array}{c} {O}(\Vert H\Vert )\\ U_{\beta \beta }\\ {O}(\Vert H\Vert ) \end{array}\right] \quad \textrm{and}\quad U_{\beta \beta }=Q+{O}(\Vert H\Vert ^2). \end{aligned}$$
(81)

Therefore, by noting that \(\varPi _{\mathcal {S}_{+}^{|\beta |}}(\varLambda (Z)_{\beta \beta })={O}(\Vert H\Vert )\) and \({O}(\Vert H\Vert )\) is uniform for \(A\in \mathbb {B}_r(\overline{A})\) with \(\pi (\overline{A})=\pi (A)\), we obtain from the above discussion that

$$\begin{aligned}&R(A+H)-R(A)-R'(A;H)\\&=\left[ \begin{array}{ccc} 0 &{} 0 &{} 0 \\ 0 &{} Q\varPi _{\mathcal {S}_{+}^{|\beta |}}(\varLambda (Z)_{\beta \beta })Q^T-\varPi _{\mathcal {S}_{+}^{|\beta |}}(H_{\beta \beta }) &{} 0 \\ 0 &{} 0 &{} 0 \end{array}\right] +{O}(\Vert H\Vert ^2). \end{aligned}$$

By (79) and (81), we know that

$$\begin{aligned} \varLambda (Z)_{\beta \beta }=U_{\beta }^{T}\varLambda (A)U_{\beta }+U_{\beta }^{T}HU_{\beta }\\ = U_{\beta \beta }^{T}H_{\beta \beta }U_{\beta \beta }+{O}(\Vert H\Vert ^{2})= Q^{T}H_{\beta \beta }Q+{O}(\Vert H\Vert ^{2}). \end{aligned}$$

Since \(Q\in {\mathcal O}^{|\beta |}\), we have

$$\begin{aligned} H_{\beta \beta }=Q\varLambda (Z)_{\beta \beta }Q^{T}+{O}(\Vert H\Vert ^{2}), \end{aligned}$$

where \({O}(\Vert H\Vert ^{2})\) is uniform for \(A\in \mathbb {B}_r(\overline{A})\) with \(\pi (\overline{A})=\pi (A)\). Combining this with the global Lipschitz continuity of \(\varPi _{{\mathcal S}_{+}^{|\beta |}}(\cdot )\) and the identity \(\varPi _{{\mathcal S}_{+}^{|\beta |}}(Q \varLambda (Z)_{\beta \beta }Q^{T})=Q\varPi _{{\mathcal S}_{+}^{|\beta |}}( \varLambda (Z)_{\beta \beta })Q^{T}\), we obtain that

$$\begin{aligned} Q\varPi _{{\mathcal S}_{+}^{|\beta |}}( \varLambda (Z)_{\beta \beta })Q^{T}-\varPi _{{\mathcal S}_{+}^{|\beta |}}(H_{\beta \beta }) ={O}(\Vert H\Vert ^{2}). \end{aligned}$$

Therefore,

$$\begin{aligned} R(A+H)-R(A)-R'(A;H)={O}(\Vert H\Vert ^{2}). \end{aligned}$$
(82)

By combining (80) and (82), we know that for any \({\mathcal S}^{n}\ni H\rightarrow 0\),

$$\begin{aligned} \varPi _{{\mathcal S}_{+}^{n}}(\varLambda (A)+H)-\varPi _{{\mathcal S}_{+}^{n}}(\varLambda (A))-\varPi '_{{\mathcal S}_{+}^{n}}(\varLambda (A);H)={O}(\Vert H\Vert ^{2})\, \end{aligned}$$
(83)

and \({O}(\Vert H\Vert ^2)\) is uniform for \(A\in \mathbb {B}_r(\overline{A})\) with \(\pi (\overline{A})=\pi (A)\). Next, we consider the general case \(A={P}\varLambda (A){P}^{T}\) with \(P\in {\mathcal O}^n(A)\). Rewrite (79) as

$$\begin{aligned} \varLambda (A)+{P}^{T}H{P}={P}^{T}U\varLambda (Z)U^{T}{P}. \end{aligned}$$

Let \(\widetilde{H}:={P}^{T}H{P}\). Then, we have \( \varPi _{{\mathcal S}^{n}_{+}}(A+H)={P}\,\varPi _{{\mathcal S}^{n}_{+}}(\varLambda (A)+\widetilde{H}){P}^{T}\,. \) Therefore, since \({P}\in {\mathcal O}^{n}\), we know from (83) and (5) that for any \({\mathcal S}^{n}\ni H\rightarrow 0\), (14) holds.
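
To make the conclusion tangible, here is a small numerical check of (14), again a sketch of our own assuming NumPy, with a hypothetical diagonal A whose index sets are \(\alpha =\{1,2\}\), \(\beta =\{3,4\}\), \(\gamma =\{5,6\}\). It assembles the directional derivative \(\varPi '_{{\mathcal S}_{+}^{n}}(A;H)=F'(A)H+R'(A;H)\) from the block formulas above and confirms the \({O}(\Vert H\Vert ^{2})\) residual:

    import numpy as np

    rng = np.random.default_rng(2)
    lam = np.array([3.0, 2.0, 0.0, 0.0, -1.0, -2.0])
    A = np.diag(lam)
    a, b, g = slice(0, 2), slice(2, 4), slice(4, 6)   # alpha, beta, gamma index sets

    def proj_psd(W):                                  # Pi_{S^n_+}(W) via spectral decomposition
        w, P = np.linalg.eigh(W)
        return (P * np.maximum(w, 0.0)) @ P.T

    Sig = lam[a, None] / (lam[a, None] - lam[None, g])  # Sigma_{ij} = lam_i/(lam_i - lam_j), i in alpha, j in gamma

    def dproj(H):                                     # Pi'(A;H) = F'(A)H + R'(A;H)
        D = np.zeros_like(H)
        D[a, a] = H[a, a]; D[a, b] = H[a, b]; D[b, a] = H[b, a]
        D[a, g] = Sig * H[a, g]; D[g, a] = (Sig * H[a, g]).T
        D[b, b] = proj_psd(H[b, b])                   # the R'(A;H) contribution
        return D

    for eps in [1e-2, 1e-3, 1e-4]:
        H = rng.standard_normal((6, 6)); H = eps * (H + H.T) / 2
        res = np.linalg.norm(proj_psd(A + H) - proj_psd(A) - dproj(H))
        print(f"|H| = {np.linalg.norm(H):.1e},  residual/|H|^2 = {res/np.linalg.norm(H)**2:.2f}")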

Appendix C: Proof of Lemma 2

We first consider the case where A is diagonal. For notational simplicity, let \(\varLambda =\varLambda (A)\) and \(\varXi =\varLambda ({A}+H)\). From (13), we have \(AU+HU=U\varXi \), which implies

$$\begin{aligned} \varLambda _{\bar{\iota }_s\bar{\iota }_s}U_{\bar{\iota }_s\bar{\iota }_t}+(HU)_{\bar{\iota }_s\bar{\iota }_t}=U_{\bar{\iota }_s\bar{\iota }_t}\varXi _{\bar{\iota }_t\bar{\iota }_t}. \end{aligned}$$

It follows that

$$\begin{aligned} \varLambda _{\bar{\iota }_s\bar{\iota }_s}U_{\bar{\iota }_s\bar{\iota }_t}+\sum _{j=1}^{\bar{d}}H_{\bar{\iota }_s\bar{\iota }_j}U_{\bar{\iota }_j\bar{\iota }_t}=U_{\bar{\iota }_s\bar{\iota }_t}\varXi _{\bar{\iota }_t\bar{\iota }_t}. \end{aligned}$$

This, together with Lemma 1, shows that

$$\begin{aligned} U_{\bar{\iota }_s\bar{\iota }_t}&=\varSigma ^{st}\circ \Big (\sum _{j=1}^{\bar{d}}H_{\bar{\iota }_s\bar{\iota }_j}U_{\bar{\iota }_j\bar{\iota }_t}\Big ) =\varSigma ^{st}\circ \big (H_{\bar{\iota }_s\bar{\iota }_t}Q_t\big )+{O}(\Vert H\Vert ^2), \end{aligned}$$
(84)

where \((\varSigma ^{st})_{ij}=1/\big ((\varXi _{\bar{\iota }_t\bar{\iota }_t})_j-(\varLambda _{\bar{\iota }_s\bar{\iota }_s})_i\big )\). It is easy to see that \(1/\big ((\varXi _{\bar{\iota }_t\bar{\iota }_t})_j-(\varLambda _{\bar{\iota }_s\bar{\iota }_s})_i\big )=1/\big ((\varLambda _{\bar{\iota }_t\bar{\iota }_t})_j-(\varLambda _{\bar{\iota }_s\bar{\iota }_s})_i\big )+{O}(\Vert H\Vert )\). Combining this with (84), we have

$$\begin{aligned} U_{\bar{\iota }_s\bar{\iota }_t}=\varTheta ^{st}\circ \big (H_{\bar{\iota }_s\bar{\iota }_t}Q_t\big )+{O}(\Vert H\Vert ^2), \end{aligned}$$

where \({O}(\Vert H\Vert ^2)\) is uniform for all \(A\in \mathbb {B}_r(\overline{A})\).

Next we consider the general case \(A=P\varLambda (A)P^{T}\). Rewrite (13) as

$$\begin{aligned} \varLambda (A)+{P}^{T}H{P}={P}^{T}U\varLambda (A+H)U^{T}{P}. \end{aligned}$$

Let \(\widetilde{H}:={P}^{T}H{P}\). Since P is an orthogonal matrix, the rest of the argument is identical to the diagonal case. This completes the proof.
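
A quick numerical illustration of the resulting first-order formula (again our own sketch assuming NumPy; the eigenvalue pattern is hypothetical): replacing \(Q_t\) by \(U_{\bar{\iota }_t\bar{\iota }_t}\), which equals \(Q_t+{O}(\Vert H\Vert ^2)\) by Lemma 1, the residual \(U_{\bar{\iota }_s\bar{\iota }_t}-\varTheta ^{st}\circ (H_{\bar{\iota }_s\bar{\iota }_t}Q_t)\) should scale like \(\Vert H\Vert ^2\):

    import numpy as np

    rng = np.random.default_rng(3)
    lam = np.array([2.0, 2.0, 0.0, 0.0, -1.0])
    A = np.diag(lam)
    s, t = slice(0, 2), slice(2, 4)                   # two eigenvalue clusters of A
    Theta = 1.0 / (lam[None, t] - lam[s, None])       # (Theta^{st})_{ij} = 1/(mu_t - mu_s)

    for eps in [1e-2, 1e-3, 1e-4]:
        H = rng.standard_normal((5, 5)); H = eps * (H + H.T) / 2
        w, U = np.linalg.eigh(A + H)
        U = U[:, ::-1]                                # descending order, as in the paper
        Qt = U[t, t]                                  # equals Q_t up to O(|H|^2), by Lemma 1
        res = np.linalg.norm(U[s, t] - Theta * (H[s, t] @ Qt))
        nH = np.linalg.norm(H)
        print(f"|H| = {nH:.1e},  residual/|H|^2 = {res/nH**2:.2f}")

The bounded ratio reflects that both the neglected cross terms in (84) and the substitution of \(\varLambda \) for \(\varXi \) in the denominators contribute only \({O}(\Vert H\Vert ^2)\).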

About this article

Cite this article

Wang, S., Ding, C. Local convergence analysis of augmented Lagrangian method for nonlinear semidefinite programming. Comput Optim Appl 87, 39–81 (2024). https://doi.org/10.1007/s10589-023-00520-0
