On Saddle Points in Semidefinite Optimization via Separation Scheme

Journal of Optimization Theory and Applications

Abstract

This paper investigates saddle point conditions for augmented Lagrangian functions in semidefinite optimization problems. By means of image space analysis, the existence of a saddle point is shown to be equivalent to a regular weak nonlinear separation of two suitable subsets of the image space (IS) associated with the given problem. In particular, three classes of augmented Lagrangians based on smooth spectral penalty functions can be derived, as special cases, from a nonlinear separation scheme in the IS. Without requiring strict complementarity, it is proved that, under strong second-order sufficiency conditions, all these augmented Lagrangian functions admit a local saddle point, and their Hessians are positive definite in a neighborhood of a local optimal point of the original problem. The existence of global saddle points is then obtained under additional assumptions that do not require compactness of the feasible set.

References

  1. Giannessi, F.: Constrained Optimization and Image Space Analysis. Springer, Berlin (2005)
  2. Giannessi, F.: Theorems of the alternative and optimality conditions. J. Optim. Theory Appl. 42, 331–365 (1984)
  3. Dien, P.H., Mastroeni, G., Pappalardo, M., Quang, P.H.: Regularity conditions for constrained extremum problems via image space. J. Optim. Theory Appl. 80, 19–37 (1994)
  4. Pappalardo, M.: Image space approach to penalty methods. J. Optim. Theory Appl. 64, 141–152 (1990)
  5. Rubinov, A.M., Uderzo, A.: On global optimality conditions via separation functions. J. Optim. Theory Appl. 109, 345–370 (2001)
  6. Luo, H.Z., Mastroeni, G., Wu, H.X.: Separation approach for augmented Lagrangians in constrained nonconvex optimization. J. Optim. Theory Appl. 144, 275–290 (2010)
  7. Luo, H.Z., Wu, H.X., Liu, J.Z.: Some results on augmented Lagrangians in constrained global optimization via image space analysis. J. Optim. Theory Appl. 159, 360–385 (2013)
  8. Li, S.J., Xu, Y.D., Zhu, S.K.: Nonlinear separation approach to constrained extremum problems. J. Optim. Theory Appl. 154, 842–856 (2012)
  9. Zhu, S.K., Li, S.J.: Unified duality theory for constrained extremum problems. Part I: Image space analysis. J. Optim. Theory Appl. 161(3), 738–762 (2014)
  10. Chinaie, M., Zafarani, J.: Image space analysis and scalarization of multivalued optimization. J. Optim. Theory Appl. 142, 451–467 (2009)
  11. Giannessi, F., Mastroeni, G., Yao, J.C.: On maximum and variational principles via image space analysis. Positivity 16, 405–427 (2012)
  12. Mastroeni, G.: Nonlinear separation in the image space with applications to penalty methods. Appl. Anal. 91, 1901–1914 (2012)
  13. Shapiro, A., Sun, J.: Some properties of the augmented Lagrangian in cone constrained optimization. Math. Oper. Res. 29, 479–491 (2004)
  14. Todd, M.J.: Semidefinite optimization. Acta Numer. 10, 515–560 (2001)
  15. Vandenberghe, L., Boyd, S.: Semidefinite programming. SIAM Rev. 38, 49–95 (1996)
  16. Ye, Y.: Interior Point Algorithms: Theory and Analysis. Wiley, New York (1997)
  17. Ben-Tal, A., Jarre, F., Kocvara, M., Nemirovski, A., Zowe, J.: Optimal design of trusses under a nonconvex global buckling constraint. Optim. Eng. 1, 189–213 (2000)
  18. Fares, B., Apkarian, P., Noll, D.: An augmented Lagrangian method for a class of LMI-constrained problems in robust control theory. Int. J. Control 74, 348–360 (2001)
  19. Fares, B., Noll, D., Apkarian, P.: Robust control via sequential semidefinite programming. SIAM J. Control Optim. 40, 1791–1820 (2002)
  20. Bonnans, J.F., Shapiro, A.: Perturbation Analysis of Optimization Problems. Springer, New York (2000)
  21. Chan, Z.X., Sun, D.: Constraint nondegeneracy, strong regularity and nonsingularity in semidefinite programming. SIAM J. Optim. 19, 370–396 (2008)
  22. Forsgren, A.: Optimality conditions for nonconvex semidefinite programming. Math. Program. 88, 105–128 (2000)
  23. Qi, H.D.: Local duality of nonlinear semidefinite programming. Math. Oper. Res. 34, 124–141 (2009)
  24. Shapiro, A.: First and second order analysis of nonlinear semidefinite programs. Math. Program. Ser. B 77, 301–320 (1997)
  25. Sun, D.: The strong second-order sufficient condition and constraint nondegeneracy in nonlinear semidefinite programming and their implications. Math. Oper. Res. 31, 761–776 (2006)
  26. Correa, R., Hector Ramirez, C.: A global algorithm for solving nonlinear semidefinite programming. SIAM J. Optim. 15, 303–318 (2004)
  27. Yamashita, H., Yabe, H., Harada, K.: A primal-dual interior point method for nonlinear semidefinite programming. Math. Program. 135, 89–121 (2012)
  28. Sun, J., Zhang, L.W., Wu, Y.: Properties of the augmented Lagrangian in nonlinear semidefinite optimization. J. Optim. Theory Appl. 129, 437–456 (2006)
  29. Sun, D., Sun, J., Zhang, L.W.: The rate of convergence of the augmented Lagrangian method for nonlinear semidefinite programming. Math. Program. 114, 349–391 (2008)
  30. Luo, H.Z., Wu, H.X., Chen, G.T.: On the convergence of augmented Lagrangian methods for nonlinear semidefinite programming. J. Glob. Optim. 54, 599–618 (2012)
  31. Wu, H.X., Luo, H.Z., Ding, X.D., Chen, G.T.: Global convergence of modified augmented Lagrangian methods for nonlinear semidefinite programming. Comput. Optim. Appl. 56, 531–558 (2013)
  32. Wu, H.X., Luo, H.Z., Yang, J.F.: Nonlinear separation approach for the augmented Lagrangian in nonlinear semidefinite programming. J. Glob. Optim. 59, 695–727 (2014)
  33. Noll, D.: Local convergence of an augmented Lagrangian method for matrix inequality constrained programming. Optim. Methods Softw. 22, 777–802 (2007)
  34. Stingl, M.: On the solution of nonlinear semidefinite programs by augmented Lagrangian methods. PhD thesis, Institute of Applied Mathematics, Universität Erlangen-Nürnberg (2005)
  35. Li, D., Sun, X.L.: Existence of a saddle point in nonconvex constrained optimization. J. Glob. Optim. 21, 39–50 (2001)
  36. Sun, X.L., Li, D., McKinnon, K.I.M.: On saddle points of augmented Lagrangians for constrained nonconvex optimization. SIAM J. Optim. 15, 1128–1146 (2005)
  37. Wu, H.X., Luo, H.Z.: A note on the existence of saddle points of \(p\)-th power Lagrangian for constrained nonconvex optimization. Optimization 61, 1331–1345 (2012)
  38. Wu, H.X., Luo, H.Z.: Saddle points of general augmented Lagrangians for constrained nonconvex optimization. J. Glob. Optim. 53, 683–697 (2012)
  39. Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)
  40. Birgin, E.G., Castillo, R.A., Martínez, J.M.: Numerical comparison of augmented Lagrangian algorithms for nonconvex problems. Comput. Optim. Appl. 31, 31–55 (2005)
  41. Horn, R.A., Johnson, C.R.: Topics in Matrix Analysis. Cambridge University Press, Cambridge (1991)
  42. Zarantonello, E.H.: Projections on convex sets in Hilbert space and spectral theory I and II. In: Zarantonello, E.H. (ed.) Contributions to Nonlinear Functional Analysis, pp. 237–424. Academic, New York (1971)
  43. Theobald, C.M.: An inequality for the trace of the product of two symmetric matrices. Math. Proc. Camb. Philos. Soc. 77, 265–267 (1975)

Acknowledgments

The authors would like to thank the two anonymous referees for the detailed comments and valuable suggestions, which have improved the final presentation of the paper. This work was supported by the National Natural Science Foundation of China under Grants 11371324 and 11071219, and the Zhejiang Provincial Natural Science Foundation of China under Grants LY13A010012 and LY13A010017.

Author information

Corresponding author

Correspondence to Hezhi Luo.

Appendix: Proof of Proposition 4.2

We first recall a well-known result regarding the differentiability properties of the composite matrix function. We use the following definition from [41].

Definition 6.1

Let \(G: {{\mathrm{I}\!\mathrm{R}}}^n\rightarrow {{\mathrm{I}\!\mathrm{R}}}^{m\times m}\). Let \(\lambda _1(x),\ldots ,\lambda _{\mu (x)}(x)\) be the \(\mu (x)\) increasingly ordered distinct eigenvalues of \(G(x)\). Let \(G(x)=U(x)\varLambda (x)U(x)^T\) be the spectral decomposition of \(G(x)\), with \(\varLambda (x)=\mathrm{diag}\left( \lambda _1(x), \ldots , \lambda _1(x),\ldots ,\lambda _{\mu (x)}(x),\ldots ,\lambda _{\mu (x)}(x)\right) \). Then, the Frobenius covariance matrices are defined by

$$\begin{aligned} P_i(x):= U(x)\mathrm{diag} (0,\ldots ,0,1,\ldots ,1,0,\ldots ,0)U(x)^T,~i=1,\ldots ,\mu (x), \end{aligned}$$
(90)

where the non-zeros in the diagonal matrix occur exactly in the positions of \(\lambda _i(x)\) in \(\varLambda (x)\).

Note that \(P_i(x)P_j(x)=0\) if \(i\ne j\), \(P_i(x)^2=P_i(x) \), \(i,j=1,\ldots ,\mu (x)\), and \(\sum _{i=1}^{\mu (x)}P_i(x)=I_m\) (see [41], p. 403).
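The properties just listed can be checked numerically. The following is a minimal sketch, assuming NumPy and an illustrative \(4\times 4\) matrix with eigenvalues 3, 1, 1, 0; the helper name `frobenius_covariance` and the tolerance used to group repeated eigenvalues are assumptions made for this illustration and do not come from [41]. The sketch lists the distinct eigenvalues in decreasing order, which is the ordering used in the proof of Proposition 4.2.

```python
import numpy as np

def frobenius_covariance(A, tol=1e-9):
    """Return (distinct eigenvalues, projectors P_i) of a symmetric matrix A.

    Eigenvalues that differ by at most `tol` are treated as one repeated
    eigenvalue. Distinct eigenvalues are returned in decreasing order.
    """
    w, U = np.linalg.eigh(A)        # ascending eigenvalues, orthonormal columns
    w, U = w[::-1], U[:, ::-1]      # reorder to decreasing
    groups = []                     # each group holds the column indices of one eigenvalue
    for idx, val in enumerate(w):
        if groups and abs(val - w[groups[-1][0]]) <= tol:
            groups[-1].append(idx)
        else:
            groups.append([idx])
    eigs = np.array([w[g[0]] for g in groups])
    projs = [U[:, g] @ U[:, g].T for g in groups]
    return eigs, projs

# Illustrative data: an orthogonal Q from a QR factorization of a nonsingular matrix.
Q, _ = np.linalg.qr(np.array([[1., 2., 0., 1.],
                              [0., 1., 3., 1.],
                              [2., 0., 1., 1.],
                              [1., 1., 1., 4.]]))
A = Q @ np.diag([3., 1., 1., 0.]) @ Q.T

eigs, P = frobenius_covariance(A)
identity_gap = np.max(np.abs(sum(P) - np.eye(4)))                       # sum_i P_i = I_m
rebuild_gap = np.max(np.abs(sum(e * Pi for e, Pi in zip(eigs, P)) - A))  # A = sum_i lambda_i P_i
print(eigs, identity_gap, rebuild_gap)
```

Grouping by a tolerance is a pragmatic choice for floating-point eigenvalues; an exact multiplicity structure, when known, could be supplied directly instead.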

Definition 6.2

Let \(t_1,\ldots ,t_m\) be real values and let \(f_0 : {{\mathrm{I}\!\mathrm{R}}}\rightarrow {{\mathrm{I}\!\mathrm{R}}}\) be twice continuously differentiable. Then, we define

$$\begin{aligned}&\Delta f_0(t_k,t_l):=\left\{ \begin{array}{ll} \frac{f_0(t_k)-f_0(t_l)}{t_k-t_l},&{}k\ne l,\\ f_0^\prime (t_k),&{}k=l, \end{array}\right. \\&\Delta ^2 f_0(t_k,t_l,t_q):=\left\{ \begin{array}{ll} \frac{\Delta f_0(t_k,t_q)-\Delta f_0(t_l,t_q)}{t_k-t_l},&{}k\ne l,\\ \frac{\Delta f_0(t_k,t_l)-\Delta f_0(t_q,t_l)}{t_k-t_q},&{}k=l\ne q,\\ f_0^{\prime \prime }(t_k),&{}k=l=q. \end{array}\right. \end{aligned}$$
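As an illustration, the three cases of Definition 6.2 can be transcribed directly. This is a sketch only: the function names `dd1` and `dd2` are hypothetical, coincidence of arguments is tested by exact floating-point equality (adequate for hand-picked values, not for computed eigenvalues), and the test function \(f_0(t)=t^3\) is an arbitrary choice.

```python
def dd1(f, df, a, b):
    """First divided difference: (f(a) - f(b)) / (a - b) if a != b, else f'(a)."""
    return df(a) if a == b else (f(a) - f(b)) / (a - b)

def dd2(f, df, d2f, a, b, q):
    """Second divided difference, following the three cases of Definition 6.2 as written."""
    if a != b:
        return (dd1(f, df, a, q) - dd1(f, df, b, q)) / (a - b)
    if a != q:                       # case a == b != q
        return (dd1(f, df, a, b) - dd1(f, df, q, b)) / (a - q)
    return d2f(a)                    # case a == b == q

# Arbitrary test function f0(t) = t^3 with its first two derivatives.
f = lambda t: t ** 3
df = lambda t: 3.0 * t ** 2
d2f = lambda t: 6.0 * t

first_distinct = dd1(f, df, 2.0, 1.0)          # (8 - 1) / (2 - 1)
first_equal = dd1(f, df, 2.0, 2.0)             # f0'(2)
second_distinct = dd2(f, df, d2f, 2.0, 1.0, 0.0)
second_pair = dd2(f, df, d2f, 2.0, 2.0, 1.0)
second_triple = dd2(f, df, d2f, 2.0, 2.0, 2.0)  # f0''(2), per the last case
print(first_distinct, first_equal, second_distinct, second_pair, second_triple)
```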

The following result is from [41] (Theorem 6.6.30), which will be used to prove Proposition 4.2.

Lemma 6.1

Let \(f_0 : {{\mathrm{I}\!\mathrm{R}}}\rightarrow {{\mathrm{I}\!\mathrm{R}}}\) and \(G: {{\mathrm{I}\!\mathrm{R}}}^n\rightarrow {\mathcal {S}}^m\) be twice continuously differentiable. Define \(F : {\mathcal {S}}^m\rightarrow {\mathcal {S}}^m\) by \(F(Z):= U f_1(D)U^T\), where \(Z= UD U^T\) is the spectral decomposition of \(Z\), \(D=\mathrm{diag}\left( \lambda _1, \ldots , \lambda _m\right) \) and \(f_1(D) = \mathrm{diag}\left( f_0(\lambda _1),\ldots , f_0(\lambda _m)\right) \), where \(\lambda _i\), \(i=1,\ldots ,m\) are eigenvalues of \(Z\) listed in the decreasing order. Let \(\lambda _1(x),\ldots ,\lambda _{\mu (x)}(x)\) be the \(\mu (x)\) increasingly ordered eigenvalues of the matrix \(G(x)\). Then, \(F(G(x))\) is twice continuously differentiable with

$$\begin{aligned} \frac{\partial F(G(x))}{\partial x_i}&= \sum _{k,l=1}^{\mu (x)}\Delta f_0(\lambda _k(x),\lambda _l(x)) P_k(x) G_i^\prime (x) P_l(x),\\ \frac{\partial ^2 F(G(x))}{\partial x_i\partial x_j}&= \sum _{k,l=1}^{\mu (x)}\Delta f_0(\lambda _k(x),\lambda _l(x)) P_k(x) G_{ij}^{\prime \prime }(x) P_l(x) \\&\quad +\sum _{k,l,q=1}^{\mu (x)}\Delta ^2 f_0(\lambda _k(x),\lambda _l(x), \lambda _q(x))(M_{klq} +M^T_{klq} ), \end{aligned}$$

where \(P_k(x)\), \(k=1,\ldots ,\mu (x)\) are Frobenius covariance matrices of \(G(x)\), and the matrix \(M_{klq}\) is given by \(M_{klq}:=P_k(x) G_i^{\prime }(x) P_l(x) G_j^{\prime }(x)P_q(x)\).
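The first-derivative formula of Lemma 6.1 can be compared against a central finite difference. The sketch below is an illustration under stated assumptions: \(G\) is taken to be affine, \(G(x)=A_0+x_1A_1+x_2A_2\), with arbitrarily chosen symmetric matrices, \(f_0=\exp \), and the sum over Frobenius covariance matrices is evaluated in the eigenbasis of \(G(x)\), where it becomes an entrywise product with the matrix of first divided differences. All names are hypothetical.

```python
import numpy as np

# Illustrative affine map G(x) = A0 + x1*A1 + x2*A2 (symmetric data chosen arbitrarily).
A0 = np.array([[2., 1., 0.], [1., 3., 1.], [0., 1., 1.]])
A = [np.array([[1., 0., 1.], [0., 0., 2.], [1., 2., 0.]]),
     np.array([[0., 1., 0.], [1., 1., 0.], [0., 0., -1.]])]

def G(x):
    return A0 + x[0] * A[0] + x[1] * A[1]

def F(Z, f):
    """F(Z) = U diag(f(lambda)) U^T for symmetric Z."""
    w, U = np.linalg.eigh(Z)
    return (U * f(w)) @ U.T

def dF_formula(Z, dZ, f, df, tol=1e-9):
    """sum_{k,l} Delta f(lam_k, lam_l) P_k dZ P_l, evaluated in the eigenbasis of Z."""
    w, U = np.linalg.eigh(Z)
    diff = w[:, None] - w[None, :]
    same = np.abs(diff) <= tol                       # coincident eigenvalues use f'
    quot = (f(w)[:, None] - f(w)[None, :]) / np.where(same, 1.0, diff)
    D = np.where(same, df(w)[:, None], quot)         # matrix of first divided differences
    return U @ (D * (U.T @ dZ @ U)) @ U.T

x = np.array([0.3, -0.2])
h = 1e-6
errors = []
for i in range(2):
    e = np.zeros(2)
    e[i] = h
    fd = (F(G(x + e), np.exp) - F(G(x - e), np.exp)) / (2 * h)   # central difference
    errors.append(np.max(np.abs(fd - dF_formula(G(x), A[i], np.exp, np.exp))))
print(errors)
```

For an affine \(G\), the partial derivative \(G_i^\prime (x)\) is simply \(A_i\), which keeps the comparison free of any additional differentiation error.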

Proof of Proposition 4.2. We first show conclusion (i). Let \(\lambda _i^*\ge 0\), \(i=1,\ldots ,m\), be the eigenvalues of \(G(x^*)\) listed in decreasing order, where \(G(x^*)\) has rank \(r\), so that \(\lambda _i^*>0\) for \(i=1,\ldots ,r\) and \(\lambda _i^*=0\) for \(i=r+1,\ldots ,m\). Since \(\varLambda ^*\bullet G(x^*)=0\) is equivalent to \( \lambda _i(\varLambda ^*) \lambda _i^*=0\) for \(i=1,\ldots ,m\), we obtain \(\lambda _i(\varLambda ^*)=0\) for \(i=1,\ldots ,r\). Define \(\psi _c(t):=c^{-1}\psi (ct)\) for any \(t\in {{\mathrm{I}\!\mathrm{R}}}\). Since \(\psi (0) = 0\), we have \( \psi _c(\lambda _i^*)=0\) for \(i=r+1,\ldots ,m\). Thus, \(\lambda _i(\varLambda ^*) \psi _c(\lambda _i^*)=0\) for \(i=1,\ldots ,m\), which, by the definition of \(\varPsi _c\), means that \(\varLambda ^*\bullet \varPsi _c(G(x^*))=0\). Similarly, \(\varLambda ^*\bullet \varPhi _c(G(x^*))=0\). Observe from condition (D2) that \(G(x^*)\succeq 0\) implies \(\mathrm{Tr}\left( \varXi _c(G(x^*))\right) =0\). Hence, \(L_i(x^*,\varLambda ^*,\mu ^*,c)= f(x^*)\) \((i=1,2,3)\) for any \(c > 0\).

Next, we prove conclusion (ii) for \(L_1\). The proof for \(L_2\) can be constructed by using similar arguments and the properties (C1)–(C3).

Let \(\lambda _1^*,\ldots ,\lambda ^*_{\mu (x^*)} \) be the \(\mu (x^*)\) distinct eigenvalues of \(G(x^*)\) and let \(P_i(x^*)\) be defined by (90) for \(i=1,\ldots ,\mu (x^*)\). Note that \(P_i(x^*)P_j(x^*)=0\) for all \(i\ne j\) and \(P_i(x^*)^2=P_i(x^*) \) for all \(i,j=1,\ldots ,\mu (x^*)\). Let \(P_0:=EE^T\) with \(E=[e_{r+1},\ldots ,e_m]\). We see that \(P_0=P_{\mu (x^*)}(x^*)\). Since \(\varLambda ^*\bullet G(x^*)=0\) and \(\varLambda ^*,G(x^*) \in {\mathcal {S}}^m_+ \) imply \(\varLambda ^* G(x^*)=G(x^*)\varLambda ^*\), by Theorem 1.3.12 of [41], \(\varLambda ^*\) and \(G(x^*)\) are simultaneously diagonalizable. So, \(\varLambda ^*={\bar{ E}}D_{\varLambda ^*}{\bar{E}}^T\), where \(D_{\varLambda ^*}:=\mathrm{diag}\left( 0,\ldots ,0,\lambda _{r+1}(\varLambda ^*),\ldots ,\lambda _m(\varLambda ^*)\right) \), and \({\bar{E}}:=[e_1,\ldots ,e_m]\) with \({\bar{E}}^T\bar{E}=I_m\). Let \(D_0:=\mathrm{diag}{({\underbrace{0,\ldots ,0}_{r}},\underbrace{1\ldots ,1}_{m-r})}\). Then, \(P_0={\bar{E}}D_0\bar{E}^T\) and

$$\begin{aligned} P_0\varLambda ^*P_0={\bar{E}} D_0{\bar{E}}^T\bar{E}D_{\varLambda ^*}{\bar{E}}^T\bar{E}D_0{\bar{E}}^T=\bar{E}D_0 D_{\varLambda ^*} D_0{\bar{E}}^T ={\bar{E}} D_{\varLambda ^*} {\bar{E}}^T=\varLambda ^*.\nonumber \\ \end{aligned}$$
(91)
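Identity (91) and the commutation property used to derive it can be illustrated on a small instance. The data below are assumptions chosen for the sketch: \(m=4\), \(r=2\), an orthonormal basis \({\bar{E}}\) obtained from a QR factorization, \(G(x^*)\) with eigenvalues 2, 1, 0, 0, and a multiplier \(\varLambda ^*\) with eigenvalues 0, 0, 0, 3 in the same basis, so that strict complementarity fails.

```python
import numpy as np

m, r = 4, 2
# Orthonormal basis [e_1, ..., e_m] from a QR factorization (illustrative data).
Ebar, _ = np.linalg.qr(np.array([[1., 2., 0., 1.],
                                 [0., 1., 3., 1.],
                                 [2., 0., 1., 1.],
                                 [1., 1., 1., 4.]]))
G_star = Ebar @ np.diag([2., 1., 0., 0.]) @ Ebar.T     # rank r = 2
Lam_star = Ebar @ np.diag([0., 0., 0., 3.]) @ Ebar.T   # complementary multiplier

E = Ebar[:, r:]          # E = [e_{r+1}, ..., e_m]
P0 = E @ E.T             # projector onto the null space of G(x*)

inner = np.trace(Lam_star @ G_star)                    # Lambda* . G(x*)
commutator = Lam_star @ G_star - G_star @ Lam_star     # should vanish
residual = P0 @ Lam_star @ P0 - Lam_star               # identity (91)
print(inner, np.max(np.abs(commutator)), np.max(np.abs(residual)))
```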

By Lemma 6.1 and (91), we can obtain

$$\begin{aligned}&\left[ \varLambda ^*\bullet \frac{\partial }{\partial x_i}\varPsi _c(G(x^*))\right] _{i =1}^n = \left[ (P_0\varLambda ^*P_0)\bullet \sum _{k,l=1}^{\mu (x^*)} \Delta \psi _c(\lambda ^*_k,\lambda ^*_l)P_k(x^*)G_i^\prime (x^*)P_l(x^*)\right] _{i =1}^n\\&= \left[ \varLambda ^* \bullet \sum _{k,l=1}^{\mu (x^*)} \Delta \psi _c(\lambda ^*_k,\lambda ^*_l)P_0P_k(x^*)G_i^\prime (x^*)P_l(x^*)P_0\right] _{i =1}^n\\&= \left[ \varLambda ^* \bullet \sum _{k,l=1}^{\mu (x^*)} \Delta \psi _c(\lambda ^*_k,\lambda ^*_l)P_{\mu (x^*)}(x^*)P_k(x^*)G_i^\prime (x^*)P_l(x^*)P_{\mu (x^*)}(x^*)\right] _{i =1}^n\\&= \left[ \varLambda ^* \bullet \left( \Delta \psi _c(\lambda ^*_{\mu (x^*)},\lambda ^*_{\mu (x^*)})P_{\mu (x^*)}(x^*) G_i^\prime (x^*) P_{\mu (x^*)}(x^*)\right) \right] _{i =1}^n\\&= \left[ \varLambda ^* \bullet \left( \psi _c^\prime (0)P_0 G_i^\prime (x^*) P_0\right) \right] _{i =1}^n = \psi ^\prime (0)\left[ \varLambda ^*\bullet G_i^\prime (x^*)\right] _{i =1}^n\!, \end{aligned}$$

from which, together with \(\psi ^\prime (0)=1\) and \(h(x^*)=0\), we immediately obtain

$$\begin{aligned} \nabla _x L_1(x^*,\varLambda ^*,\mu ^*,c)&= \nabla f(x^*) -\psi ^\prime (0)\left[ \varLambda ^*\bullet G_i^\prime (x^*)\right] _{i =1}^n+\nabla h(x^*)(\mu ^*+ch(x^*))\\&= \nabla _x L_0(x^*,\varLambda ^*,\mu ^*)=0. \end{aligned}$$

Moreover, we have

$$\begin{aligned}&\left[ \varLambda ^*\bullet \frac{\partial ^2}{\partial x_i\partial x_j}\varPsi _c(G(x^*))\right] _{i,j =1}^n=\left[ (P_0\varLambda ^*P_0)\bullet \frac{\partial ^2}{\partial x_i\partial x_j}\varPsi _c(G(x^*))\right] _{i,j =1}^n\nonumber \\&= \left[ \varLambda ^*\bullet \left( \sum _{k,l=1}^{\mu (x^*)} \Delta \psi _c(\lambda ^*_k,\lambda ^*_l)P_0P_k(x^*)G_{ij}^{\prime \prime }(x^*)P_l(x^*)P_0\right) \right] _{i,j =1}^n\nonumber \\&+\left[ \varLambda ^*\bullet \left( \sum _{k,l,q=1}^{\mu (x^*)} \Delta ^2 \psi _c(\lambda ^*_k,\lambda ^*_l,\lambda ^*_q)P_0P_k(x^*)G_i^{\prime }(x^*)P_l(x^*)G_j^{\prime }(x^*)P_q(x^*)P_0\right) \right] _{i,j =1}^n\nonumber \\&+\left[ \varLambda ^*\bullet \left( \sum _{k,l,q=1}^{\mu (x^*)} \Delta ^2 \psi _c(\lambda ^*_k,\lambda ^*_l,\lambda ^*_q)P_0P_q(x^*)G_j^{\prime }(x^*)P_l(x^*)G_i^{\prime }(x^*)P_k(x^*)P_0\right) \right] _{i,j =1}^n\nonumber \\&= \left[ \varLambda ^* \bullet \left( \Delta \psi _c(\lambda ^*_{\mu (x^*)},\lambda ^*_{\mu (x^*)})P_{\mu (x^*)}(x^*) G_{ij}^{\prime \prime }(x^*) P_{\mu (x^*)}(x^*)\right) \right] _{i,j =1}^n\nonumber \\&+ \left[ \varLambda ^* \bullet \left( P_{\mu (x^*)}(x^*) N_{ij}(x^*) P_{\mu (x^*)}(x^*) \right) \right] _{i,j =1}^n\nonumber \\&+\left[ \varLambda ^* \bullet \left( P_{\mu (x^*)}(x^*) N_{ji}(x^*) P_{\mu (x^*)}(x^*)\right) \right] _{i,j =1}^n\nonumber \\&= \left[ \varLambda ^* \bullet \left( \psi _c^\prime (0)P_0 G_{ij}^{\prime \prime }(x^*) P_0\right) \right] _{i,j =1}^n+2\left[ \varLambda ^* \bullet (P_0 N_{ij}(x^*)P_0)\right] _{i,j =1}^n\nonumber \\&= \psi ^\prime (0)\left[ \varLambda ^*\bullet G_{ij}^{\prime \prime }(x^*)\right] _{i,j =1}^n+2\left[ \varLambda ^* \bullet N_{ij}(x^*) \right] _{i,j =1}^n\!, \end{aligned}$$
(92)

where

$$\begin{aligned} N_{ij}(x^*) =G_i^\prime (x^*)\left( \sum _{k=1}^{\mu (x^*)} \Delta ^2 \psi _c(\lambda ^*_k,0,0)P_k(x^*)\right) G_j^{\prime }(x^*),\quad i,j=1,\ldots ,n. \end{aligned}$$

By Definition 6.2, via a straightforward computation, we obtain

$$\begin{aligned} \Delta ^2 \psi _c(\lambda ^*_k,0,0)=\left\{ \begin{array}{ll} c \psi ^{\prime \prime }(0), &{}\quad k=\mu (x^*)\\ \frac{c^{-1}\psi (c\lambda ^*_k)}{(\lambda ^*_k)^2}-\frac{\psi ^\prime (0)}{\lambda ^*_k},&{}\quad k<\mu (x^*) \end{array}\right. . \end{aligned}$$
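The two cases of this formula can be reproduced from Definition 6.2 for a concrete penalty. The sketch below assumes the hypothetical choice \(\psi (t)=1-e^{-t}\), which satisfies \(\psi (0)=0\), \(\psi ^\prime (0)=1\) and \(\psi ^{\prime \prime }(0)=-1\); conditions (B1)–(B3) are not restated in this appendix, so this \(\psi \) is an illustrative assumption rather than a function taken from the paper. The values of \(c\) and \(\lambda ^*_k\) are arbitrary.

```python
import math

def dd1(f, df, a, b):
    """First divided difference of Definition 6.2."""
    return df(a) if a == b else (f(a) - f(b)) / (a - b)

def dd2(f, df, d2f, a, b, q):
    """Second divided difference of Definition 6.2 (three cases, as written)."""
    if a != b:
        return (dd1(f, df, a, q) - dd1(f, df, b, q)) / (a - b)
    if a != q:
        return (dd1(f, df, a, b) - dd1(f, df, q, b)) / (a - q)
    return d2f(a)

def make_psi_c(c):
    """psi_c(t) = psi(c t) / c for the assumed psi(t) = 1 - exp(-t), with derivatives."""
    f = lambda t: (1.0 - math.exp(-c * t)) / c
    df = lambda t: math.exp(-c * t)
    d2f = lambda t: -c * math.exp(-c * t)
    return f, df, d2f

psi = lambda t: 1.0 - math.exp(-t)
c, lam = 5.0, 0.7                                    # arbitrary c > 0 and positive eigenvalue
f, df, d2f = make_psi_c(c)

lhs_pos = dd2(f, df, d2f, lam, 0.0, 0.0)             # case k < mu(x*)
rhs_pos = psi(c * lam) / (c * lam ** 2) - 1.0 / lam  # closed form with psi'(0) = 1
lhs_zero = dd2(f, df, d2f, 0.0, 0.0, 0.0)            # case k = mu(x*)
rhs_zero = c * (-1.0)                                # c * psi''(0)
print(lhs_pos, rhs_pos, lhs_zero, rhs_zero)
```

For this \(\psi \) the coefficient in the case \(k<\mu (x^*)\) is negative, in line with the sign argument used later for \(H_c(x^*,\varLambda ^*)\).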

Note that \(\psi ^\prime (0)=1\). It then follows from (92) and the definition of \(L_1\) that

$$\begin{aligned} \nabla _{xx}^2 L_1(x^*,\varLambda ^*,\mu ^*,c)&= \underbrace{{\nabla ^2f(x^*) -\psi ^{\prime }(0)\left[ \varLambda ^*\bullet G_{ij}^{\prime \prime }(x^*)\right] _{i,j=1}^n+\sum _{i=1}^p\mu ^*_i\nabla ^2 h_i(x^*)}}_{{\nabla _{xx}^2 L_0(x^*,\varLambda ^*,\mu ^*)}}\nonumber \\&+\,H_c(x^*,\varLambda ^*)+c M(x^*,\varLambda ^*)+c\nabla h(x^*)\nabla h(x^*)^T , \end{aligned}$$
(93)

which is exactly (38), where

$$\begin{aligned}&H_c(x^*,\varLambda ^*)=-2\nonumber \\&\quad \times \left[ \varLambda ^*\bullet \left( G_i^\prime (x^*) \sum _{k=1}^{\mu (x^*)-1}\left( \frac{c^{-1}\psi (c\lambda _k^* )}{(\lambda ^*_k)^2}-\frac{1}{\lambda _k^* }\right) P_k(x^*) G_j^\prime (x^*)\right) \right] _{i,j=1}^n \end{aligned}$$
(94)
$$\begin{aligned}&M(x^*,\varLambda ^*)=-2\psi ^{\prime \prime }(0)\left[ \varLambda ^*\bullet \left( G_i^\prime (x^*)P_{\mu (x^*)}G_j^\prime (x^*)\right) \right] _{i,j=1}^n. \end{aligned}$$
(95)

By condition (B1), \(\psi (0)=0\) and \(\psi ^\prime (0)=1\), we know that \(\psi (t)< t\) for any \(t\ne 0\), and hence \(\frac{c^{-1}\psi (c\lambda _k^* )}{(\lambda ^*_k)^2}-\frac{1}{\lambda _k^* }<0\) for \(k=1,\ldots ,\mu (x^*)-1\). Together with \(P_k(x^*)\succeq 0\) for all \(k\) and \(\varLambda ^*\succeq 0\), we then infer from (94) that \(H_c(x^*,\varLambda ^*)\) is positive semidefinite. By the condition (B3) for \(\psi \), we see that \(\lim _{c\rightarrow \infty }\frac{\psi (c\tau )}{c}=0\) for any \(\tau >0\). Note also that \(\lambda _k^*>0\) for \(k=1,\ldots ,r\) and that, by the definition of \(P_k(x)\),

$$\begin{aligned} \sum _{k=1}^{\mu (x^*)-1}\frac{1}{\lambda _k^* } P_k(x^*)=\sum _{k=1}^r \frac{1}{\lambda _k^* } e_ke_k^T. \end{aligned}$$

We then deduce from (94) that

$$\begin{aligned} \lim _{c\rightarrow \infty }H_{c}(x^{*},\varLambda ^*)&= 2\left[ \varLambda ^{*}\bullet \left( G_i^{\prime }(x^{*})\left( \sum _{k=1}^{r} \frac{1}{\lambda _{k}^{*}} e_ke_k^T\right) G_j^\prime (x^*)\right) \right] _{i,j=1}^n\nonumber \\&= 2\left[ \varLambda ^{*}\bullet \left( G_{i}^{\prime }(x^{*})[G(x^{*})]^{\dag } G_{j}^{\prime }(x^{*})\right) \right] _{i,j=1}^{n}\nonumber \\&= H(x^{*},\varLambda ^{*}), \end{aligned}$$
(96)

proving (39). Note that \(s=m-\mathrm{rank}(\varLambda ^*)\ge r\). Then, \(\lambda _i(\varLambda ^*)=0\) for \(i=1,\ldots ,s\) and

$$\begin{aligned} D_{\varLambda ^*}=\mathrm{diag}\left( 0,\ldots ,0,\lambda _{s+1}(\varLambda ^*),\ldots ,\lambda _m(\varLambda ^*)\right) \!. \end{aligned}$$
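The limit (96) can be observed numerically on a small instance. This is a sketch under assumptions: the basis \(e_1,\ldots ,e_m\) is the standard one, \(G(x^*)=\mathrm{diag}(2,1,0,0)\) (so \(r=2\)), \(\varLambda ^*=\mathrm{diag}(0,0,0,3)\) (so \(s=3>r\)), the matrices \(G_i^\prime (x^*)\) for \(n=2\) are arbitrary symmetric matrices, and \(\psi (t)=1-e^{-t}\) is a hypothetical penalty with \(\psi (0)=0\) and \(\psi ^\prime (0)=1\).

```python
import numpy as np

psi = lambda t: 1.0 - np.exp(-t)          # assumed penalty, not taken from the paper

m = 4
lam_G = np.array([2., 1., 0., 0.])        # eigenvalues of G(x*), decreasing, r = 2
G_star = np.diag(lam_G)
Lam_star = np.diag([0., 0., 0., 3.])      # multiplier with s = 3
dG = [np.array([[1., 0., 0., 1.], [0., 0., 1., 2.], [0., 1., 2., 0.], [1., 2., 0., 1.]]),
      np.array([[0., 1., 0., -1.], [1., 1., 0., 1.], [0., 0., 0., 2.], [-1., 1., 2., 0.]])]

def H_c(c):
    """H_c(x*, Lambda*) from (94); in the standard basis sum_k coef_k P_k is diagonal."""
    coef = np.zeros(m)
    pos = lam_G > 0
    coef[pos] = psi(c * lam_G[pos]) / (c * lam_G[pos] ** 2) - 1.0 / lam_G[pos]
    S = np.diag(coef)
    return -2.0 * np.array([[np.trace(Lam_star @ Gi @ S @ Gj) for Gj in dG] for Gi in dG])

# Right-hand side of (96), using the Moore-Penrose pseudoinverse of G(x*).
G_pinv = np.linalg.pinv(G_star)
H_limit = 2.0 * np.array([[np.trace(Lam_star @ Gi @ G_pinv @ Gj) for Gj in dG] for Gi in dG])

gaps = [np.max(np.abs(H_c(c) - H_limit)) for c in (1e2, 1e4, 1e8)]
min_eig = np.min(np.linalg.eigvalsh(H_c(10.0)))   # H_c should be positive semidefinite
print(H_limit, gaps, min_eig)
```

For these data the gap decays like \(1/c\), which is the rate suggested by \(\psi (c\tau )/c\rightarrow 0\) for a bounded \(\psi \).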

Note that \(\varLambda ^*={\bar{E}}D_{\varLambda ^*}{\bar{E}}^T\) with \({\bar{E}}=[e_1,\ldots ,e_m]\) and \(P_{\mu (x^*)}=P_0=EE^T\) with \(E=[e_{r+1},\ldots ,e_m]\). It then follows from (95) that

$$\begin{aligned} M(x^*,\varLambda ^*)&= -2\psi ^{\prime \prime }(0)\left[ ({\bar{E}}D_{\varLambda ^*}{\bar{E}}^T)\bullet \left( G_i^\prime (x^*)EE^TG_j^\prime (x^*)\right) \right] _{i,j=1}^n\nonumber \\&= -2\psi ^{\prime \prime }(0)\left[ \sum _{k=s+1}^m \lambda _k(\varLambda ^*)e_k^T G_i^\prime (x^*)\left( \sum _{l=r+1}^me_le^T_l\right) G_j^\prime (x^*) e_k\right] _{i,j=1}^n\nonumber \\&= -2\psi ^{\prime \prime }(0)\left[ \sum _{k=s+1}^m \sum _{l=r+1}^m\lambda _k(\varLambda ^*)\left( e_k^T G_i^\prime (x^*)e_le^T_lG_j^\prime (x^*) e_k\right) \right] _{i,j=1}^n\nonumber \\&= -2\psi ^{\prime \prime }(0)BQ^*_sB^T, \end{aligned}$$
(97)

where \(Q^*_s: =\mathrm{diag}\left( \lambda _{s+1}(\varLambda ^*),\ldots ,\lambda _m(\varLambda ^*),\ldots , \lambda _{s+1}(\varLambda ^*),\ldots ,\lambda _m(\varLambda ^*)\right) \in {{\mathrm{I}\!\mathrm{R}}}^{{\bar{m}}\times {\bar{m}}}\) with \({\bar{m}}:=(m-s)(m-r)\), and the \(i\)-th row of matrix \(B\in {{\mathrm{I}\!\mathrm{R}}}^{n\times {\bar{m}}}\) is defined by

$$\begin{aligned} b_i\!:=\!\left( e_{s+1}^TG_i^\prime (x^*)e_{r+1},\ldots ,e_{m}^TG_i^\prime (x^*)e_{r+1},\ldots ,e_{s+1}^TG_i^\prime (x^*)e_m,\ldots ,e_{m}^TG_i^\prime (x^*)e_m\right) ^T\!\!\!. \end{aligned}$$

Since \(\mathrm{rank}(\varLambda ^*)=m-s\) implies \(\lambda _i(\varLambda ^*)>0\) for \(i=s+1,\ldots ,m\), the matrix \(Q^*_s\) is positive definite. Hence, since \(\psi ^{\prime \prime }(0)<0\), we see from (97) that \(M(x^*,\varLambda ^*)\) is positive semidefinite. Now assume that, for some nonzero \({\bar{d}}\in {{\mathrm{I}\!\mathrm{R}}}^n\), we have

$$\begin{aligned} {\bar{d}}^T M(x^*,\varLambda ^*){\bar{d}} =0. \end{aligned}$$

Since \(Q^*_s\) is positive definite, it then follows from (97) that \(B^T{\bar{d}}=\sum _{i=1}^n{\bar{d}}_ib_i=0\). So,

$$\begin{aligned} e_{k}^T\left( \sum _{i=1}^n{\bar{d}}_iG_i^\prime (x^*)\right) e_j=0, ~k=s+1,\ldots ,m,~j=r+1,\ldots ,m. \end{aligned}$$

This proves that (40) holds.
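The factorization (97) can be verified on the same kind of small instance. The sketch below assumes the standard basis for \(e_1,\ldots ,e_m\), \(m=4\), \(r=2\), \(s=3\), \(\varLambda ^*=\mathrm{diag}(0,0,0,3)\), arbitrary symmetric matrices \(G_i^\prime (x^*)\) with \(n=2\), and the hypothetical value \(\psi ^{\prime \prime }(0)=-1\). The columns of \(B\) are ordered as in the definition of \(b_i\): the index \(l=r+1,\ldots ,m\) is the outer loop and \(k=s+1,\ldots ,m\) the inner loop.

```python
import numpy as np

m, r, s = 4, 2, 3
lam_L = np.array([0., 0., 0., 3.])        # eigenvalues of Lambda*, zero for i <= s
Lam_star = np.diag(lam_L)
P0 = np.diag([0., 0., 1., 1.])            # P_{mu(x*)} = sum_{l > r} e_l e_l^T
d2psi0 = -1.0                             # assumed value of psi''(0) < 0
dG = [np.array([[1., 0., 0., 1.], [0., 0., 1., 2.], [0., 1., 2., 0.], [1., 2., 0., 1.]]),
      np.array([[0., 1., 0., -1.], [1., 1., 0., 1.], [0., 0., 0., 2.], [-1., 1., 2., 0.]])]

# M(x*, Lambda*) computed directly from (95).
M_direct = -2.0 * d2psi0 * np.array(
    [[np.trace(Lam_star @ Gi @ P0 @ Gj) for Gj in dG] for Gi in dG])

# Factorized form (97): row i of B collects e_k^T G_i' e_l, outer index l, inner index k.
B = np.array([[Gi[k, l] for l in range(r, m) for k in range(s, m)] for Gi in dG])
Q = np.diag(np.tile(lam_L[s:], m - r))    # Q_s^*: positive eigenvalues repeated m - r times
M_factored = -2.0 * d2psi0 * B @ Q @ B.T

min_eig = np.min(np.linalg.eigvalsh(M_direct))
print(M_direct, M_factored, min_eig)
```

In this instance \(B\) has full row rank, so \(M(x^*,\varLambda ^*)\) is even positive definite; a direction \({\bar{d}}\) with \(B^T{\bar{d}}=0\) would be needed to see a nontrivial kernel.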

Finally, we prove conclusion (ii) for \(L_3\). Note that \(\mathrm{Tr}\left( \varXi _c(G(x))\right) =I_m\bullet \varXi _c(G(x))\). By Lemma 6.1, arguing as in the first-order computation for \(\varPsi _c\) above and as in (92), we have

$$\begin{aligned} \left[ I_m\bullet \frac{\partial }{\partial x_i}\varXi _c(G(x^*))\right] _{i =1}^n&= \left[ I_m\bullet \sum _{k,l=1}^{\mu (x^*)} \Delta \xi _c(\lambda ^*_k,\lambda ^*_l)P_k(x^*)G_i^\prime (x^*)P_l(x^*)\right] _{i =1}^n \nonumber \\&= \left[ \sum _{k,l=1}^{\mu (x^*)} \Delta \xi _c(\lambda ^*_k,\lambda ^*_l)\mathrm{Tr}\left( P_k(x^*)G_i^\prime (x^*)P_l(x^*)\right) \right] _{i =1}^n\nonumber \\&= \left[ \sum _{k,l=1}^{\mu (x^*)} \Delta \xi _c(\lambda ^*_k,\lambda ^*_l)\mathrm{Tr}\left( P_k(x^*)P_l(x^*)G_i^\prime (x^*)\right) \right] _{i =1}^n\nonumber \\&= \left[ \Delta \xi _c(\lambda ^*_{\mu (x^*)},\lambda ^*_{\mu (x^*)})\mathrm{Tr}\left( P_{\mu (x^*)}(x^*)G_i^\prime (x^*)\right) \right] _{i =1}^n\nonumber \\&= \xi ^\prime (0)\left[ P_{\mu (x^*)}(x^*)\bullet G_i^\prime (x^*)\right] _{i =1}^n\!. \end{aligned}$$
(98)

Moreover, we have

$$\begin{aligned}&\left[ I_m\bullet \frac{\partial ^2}{\partial x_i\partial x_j}\varXi _c(G(x^*))\right] _{i,j =1}^n \nonumber \\&= \left[ \sum _{k,l=1}^{\mu (x^*)} \Delta \xi _c(\lambda ^*_k,\lambda ^*_l)\mathrm{Tr}\left( P_k(x^*)G_{ij}^{\prime \prime }(x^*)P_l(x^*) \right) \right] _{i,j =1}^n\nonumber \\&+\left[ \sum _{k,l,q=1}^{\mu (x^*)} \Delta ^2 \xi _c(\lambda ^*_k,\lambda ^*_l,\lambda ^*_q)\mathrm{Tr}\left( P_k(x^*)G_i^{\prime }(x^*)P_l(x^*)G_j^{\prime }(x^*)P_q(x^*) \right) \right] _{i,j =1}^n \nonumber \\&+\left[ \sum _{k,l,q=1}^{\mu (x^*)} \Delta ^2 \xi _c(\lambda ^*_k,\lambda ^*_l,\lambda ^*_q)\mathrm{Tr}\left( P_q(x^*)G_j^{\prime }(x^*)P_l(x^*)G_i^{\prime }(x^*)P_k(x^*) \right) \right] _{i,j =1}^n\nonumber \\&= \left[ \Delta \xi _c(\lambda ^*_{\mu (x^*)},\lambda ^*_{\mu (x^*)})\mathrm{Tr}\left( P_{\mu (x^*)}(x^*) G_{ij}^{\prime \prime }(x^*) \right) \right] _{i,j =1}^n\nonumber \\&+ \left[ \mathrm{Tr}\left( P_{\mu (x^*)}(x^*) N_{ij}(x^*) \right) \right] _{i,j =1}^n+\left[ \mathrm{Tr}\left( P_{\mu (x^*)}(x^*) N_{ji}(x^*) \right) \right] _{i,j =1}^n\nonumber \\&= \xi ^\prime (0)\left[ P_{\mu (x^*)}(x^*)\bullet G_{ij}^{\prime \prime }(x^*)\right] _{i,j =1}^n+2\left[ P_{\mu (x^*)}(x^*) \bullet N_{ij}(x^*) \right] _{i,j =1}^n\!, \end{aligned}$$
(99)

where

$$\begin{aligned}&N_{ij}(x^*) =G_i^\prime (x^*)\left( \sum _{k=1}^{\mu (x^*)} \Delta ^2 \xi _c(\lambda ^*_k,0,0)P_k(x^*)\right) G_j^{\prime }(x^*),\quad i,j=1,\ldots ,n.\\&\Delta ^2 \xi _c(\lambda ^*_k,0,0)=\left\{ \begin{array}{ll} c \xi ^{\prime \prime }(0), &{}\quad k=\mu (x^*)\\ \frac{c^{-1}\xi (c\lambda ^*_k)}{(\lambda ^*_k)^2}-\frac{\xi ^\prime (0)}{\lambda ^*_k},&{}\quad k<\mu (x^*) \end{array}\right. . \end{aligned}$$

We note from conditions (D1) and (D2) that \(\xi (t) = 0\) and \(\xi ^\prime (t) = 0\) for \(t \ge 0\) and \(\xi ^{\prime \prime }(0)=0\). Note also that \(\lambda ^*_k>0\) for all \(k=1,\ldots ,r\). Therefore, from (98) and (99), we can obtain

$$\begin{aligned} \left[ I_m\bullet \frac{\partial }{\partial x_i}\varXi _c(G(x^*))\right] _{i =1}^n=0,\quad \left[ I_m\bullet \frac{\partial ^2}{\partial x_i\partial x_j}\varXi _c(G(x^*))\right] _{i,j =1}^n=0. \end{aligned}$$
(100)

By the definition of \(L_3\) and \(\psi ^\prime (0)=1\), it follows from the first-order computation for \(\varPsi _c\) above, together with (92) and (100), that

$$\begin{aligned} \nabla _{x} L_{3}(x^{*},\varLambda ^{*},\mu ^*,c)&= \nabla f(x^*) - \psi ^\prime (0) [\varLambda ^*\bullet G_i^\prime (x^*)]_{i =1}^n+ \nabla h (x^*)\mu ^* \\&= \nabla _x L_0(x^*,\varLambda ^*,\mu ^*),\\ \nabla _{xx}^2 L_3(x^*,\varLambda ^*,\mu ^*,c)&= {\underbrace{\nabla ^2f(x^*) -\psi ^\prime (0) [\varLambda ^*\bullet G_{ij}^{\prime \prime }(x^*)]_{i,j=1}^n+\sum _{i=1}^p\mu ^*_i\nabla ^2 h_i(x^*)}_{\nabla _{xx}^2 L_0(x^*,\varLambda ^*,\mu ^*)}}\\&+\,H_c(x^*,\varLambda ^*)+cM(x^*,\varLambda ^*)+c\nabla h(x^*)\nabla h(x^*)^T , \end{aligned}$$

where \( H_c(x^*,\varLambda ^*)\) and \(M(x^*,\varLambda ^*)\) are given in (94) and (95), and satisfy (39) and (40). The proof of the proposition is completed. \(\square \)

Cite this article

Luo, H., Wu, H. & Liu, J. On Saddle Points in Semidefinite Optimization via Separation Scheme. J Optim Theory Appl 165, 113–150 (2015). https://doi.org/10.1007/s10957-014-0634-3
