Local convergence of primal–dual interior point methods for nonlinear semidefinite optimization using the Monteiro–Tsuchiya family of search directions

Computational Optimization and Applications

Abstract

Algorithms for nonlinear semidefinite optimization problems (NSDPs) have advanced remarkably in recent years. Yamashita et al. first proposed a primal–dual interior point method (PDIPM) for solving NSDPs using the family of Monteiro–Zhang (MZ) search directions. Since then, various PDIPMs have been proposed for NSDPs but, as far as we know, all of them are based on the MZ family. In this paper, we present a PDIPM equipped with the family of Monteiro–Tsuchiya (MT) directions, which, like the MZ family, were originally devised for solving linear semidefinite optimization problems. We further prove local superlinear convergence to a Karush–Kuhn–Tucker point of the NSDP under certain general assumptions on the scaling matrices used to produce the MT search directions. Finally, we conduct numerical experiments to compare the efficiency of members of the MT family.

Data Availability Statement

All data in the paper are available from the corresponding author on reasonable request. There is no conflict of interest in writing the paper.

Notes

  1. We have the identity \(W=Y^{-\frac{1}{2}}(Y^{\frac{1}{2}}G(x)Y^{\frac{1}{2}})^{\frac{1}{2}}Y^{-\frac{1}{2}}\).

  2. Since \(\mathcal {J}\Xi ^P_{\mu _k}(v^l)\varDelta v=-\Xi ^P_{\mu _k}(v^l)\) is assumed to be nonsingular, there exists a nonnegative integer \(\bar{\ell }\) such that \(\varPsi _{\mu _k}^P(v^l+\beta _2(\bar{\ell })\varDelta v)\le \left( 1 - 0.25 \cdot \beta _2(\bar{\ell })\right) \varPsi _{\mu _k}^P(v^l)\), which is rewritten as (68) because of \(\varPsi _{\mu _k}^P(w)=\varPsi _{\mu _k}^I(w)\).
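The identity in Note 1 can be checked numerically. The following is a minimal sketch, assuming that \(W\) is defined as in Case (v) of the appendix, namely \(W=G^{\frac{1}{2}}(G^{\frac{1}{2}}YG^{\frac{1}{2}})^{-\frac{1}{2}}G^{\frac{1}{2}}\); the helper names `sym_power` and `random_spd` and the random test matrices are illustrative assumptions, not part of the paper.

```python
import numpy as np

def sym_power(A, p):
    """Return A**p for a symmetric positive definite A via eigen-decomposition."""
    w, V = np.linalg.eigh(A)
    return (V * w**p) @ V.T

def random_spd(m, rng):
    """Draw a random symmetric positive definite m-by-m matrix (illustrative helper)."""
    B = rng.standard_normal((m, m))
    return B @ B.T + m * np.eye(m)

rng = np.random.default_rng(0)
m = 4
G, Y = random_spd(m, rng), random_spd(m, rng)  # stand-ins for G(x) and Y

# Right-hand side of the identity in Note 1.
Y_half, Y_mhalf = sym_power(Y, 0.5), sym_power(Y, -0.5)
W_note = Y_mhalf @ sym_power(Y_half @ G @ Y_half, 0.5) @ Y_mhalf

# Expression used for W in Case (v) of the appendix (assumed definition).
G_half = sym_power(G, 0.5)
W_case_v = G_half @ sym_power(G_half @ Y @ G_half, -0.5) @ G_half

gap = np.linalg.norm(W_note - W_case_v)             # difference of the two expressions
residual = np.linalg.norm(W_note @ Y @ W_note - G)  # both satisfy W Y W = G
print(gap, residual)
```

Both expressions solve \(WYW=G\) over the positive definite matrices, which is why they coincide up to rounding error.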

References

  1. Alizadeh, F., Haeberly, J.P.A., Overton, M.L.: Primal–dual interior-point methods for semidefinite programming: convergence rates, stability and numerical results. SIAM J. Optim. 8(3), 746–768 (1998)

  2. Andreani, R., Haeser, G., Viana, D.S.: Optimality conditions and global convergence for nonlinear semidefinite programming. Math. Program. 180, 203–235 (2020)

  3. Auslender, A.: An extended sequential quadratically constrained quadratic programming algorithm for nonlinear, semidefinite, and second-order cone programming. J. Optim. Theory Appl. 156(2), 183–212 (2013)

  4. Bonnans, J.F., Shapiro, A.: Perturbation Analysis of Optimization Problems. Springer, New York (2013)

  5. Correa, R., Ramirez, C.H.: A global algorithm for nonlinear semidefinite programming. SIAM J. Optim. 15(1), 303–318 (2004)

  6. Forsgren, A.: Optimality conditions for nonconvex semidefinite programming. Math. Program. 88(1), 105–128 (2000)

  7. Forsgren, A., Gill, P.E., Wright, M.H.: Interior methods for nonlinear optimization. SIAM Rev. 44(4), 525–597 (2002)

  8. Freund, R.W., Jarre, F., Vogelbusch, C.H.: Nonlinear semidefinite programming: sensitivity, convergence, and an application in passive reduced-order modeling. Math. Program. 109(2–3), 581–611 (2007)

  9. Fukuda, E.H., Lourenço, B.F.: Exact augmented Lagrangian functions for nonlinear semidefinite programming. Comput. Optim. Appl. 71(2), 457–482 (2018)

  10. Helmberg, C., Rendl, F., Vanderbei, R.J., Wolkowicz, H.: An interior-point method for semidefinite programming. SIAM J. Optim. 6(2), 342–361 (1996)

  11. Hoi, C., Scherer, C.W., Van der Meché, E., Bosgra, O.: A nonlinear SDP approach to fixed-order controller synthesis and comparison with two other methods applied to an active suspension system. Eur. J. Control. 9(1), 13–28 (2003)

  12. Huang, X., Teo, K., Yang, X.: Approximate augmented Lagrangian functions and nonlinear semidefinite programs. Acta Math. Sin. 22(5), 1283–1296 (2006)

  13. Jarre, F.: An interior method for nonconvex semidefinite programs. Optim. Eng. 1(4), 347–372 (2000)

  14. Kakihara, S., Ohara, A., Tsuchiya, T.: Curvature integrals and iteration complexities in SDP and symmetric cone programs. Comput. Optim. Appl. 57(3), 623–665 (2014)

  15. Kanzow, C., Nagel, C., Kato, H., Fukushima, M.: Successive linearization methods for nonlinear semidefinite programs. Comput. Optim. Appl. 31(3), 251–273 (2005)

  16. Kato, A., Yabe, H., Yamashita, H.: An interior point method with a primal–dual quadratic barrier penalty function for nonlinear semidefinite programming. J. Comput. Appl. Math. 275, 148–161 (2015)

  17. Kočvara, M., Leibfritz, F., Stingl, M., Henrion, D.: A nonlinear SDP algorithm for static output feedback problems in COMPleib. IFAC Proc. Vol. 38(1), 1055–1060 (2005)

  18. Kočvara, M., Stingl, M.: Solving nonconvex SDP problems of structural optimization with stability control. Optim. Methods Softw. 19(5), 595–609 (2004)

  19. Kojima, M., Shindoh, S., Hara, S.: Interior-point methods for the monotone semidefinite linear complementarity problem in symmetric matrices. SIAM J. Optim. 7(1), 86–125 (1997)

  20. Konno, H., Kawadai, N., Tuy, H.: Cutting plane algorithms for nonlinear semi-definite programming problems with applications. J. Global Optim. 25(2), 141–155 (2003)

  21. Leibfritz, F., Maruhn, J.H.: A successive SDP-NSDP approach to a robust optimization problem in finance. Comput. Optim. Appl. 44(3), 443 (2009)

  22. Leibfritz, F., Mostafa, E.: An interior point constrained trust region method for a special class of nonlinear semidefinite programming problems. SIAM J. Optim. 12(4), 1048–1074 (2002)

  23. Leibfritz, F., Volkwein, S.: Reduced order output feedback control design for PDE systems using proper orthogonal decomposition and nonlinear semidefinite programming. Linear Algebra Appl. 415(2–3), 542–575 (2006)

  24. Lu, Z., Monteiro, R.D.: Error bounds and limiting behavior of weighted paths associated with the SDP map \(X^{{\frac{1}{2}}}SX^{{\frac{1}{2}}}\). SIAM J. Optim. 15(2), 348–374 (2005)

  25. Monteiro, R.D.: Primal-dual path-following algorithms for semidefinite programming. SIAM J. Optim. 7(3), 663–678 (1997)

  26. Monteiro, R.D.: Polynomial convergence of primal–dual algorithms for semidefinite programming based on the Monteiro and Zhang family of directions. SIAM J. Optim. 8(3), 797–812 (1998)

  27. Monteiro, R.D., Tsuchiya, T.: Polynomial convergence of a new family of primal–dual algorithms for semidefinite programming. SIAM J. Optim. 9(3), 551–577 (1999)

  28. Monteiro, R.D., Zanjacomo, P.: Implementation of primal–dual methods for semidefinite programming based on Monteiro and Tsuchiya Newton directions and their variants. Optim. Methods Softw. 11(1–4), 91–140 (1999)

  29. Monteiro, R.D., Zhang, Y.: A unified analysis for a class of long-step primal–dual path-following interior-point algorithms for semidefinite programming. Math. Program. 81(3), 281–299 (1998)

  30. Nesterov, Y.E., Todd, M.J.: Primal–dual interior-point methods for self-scaled cones. SIAM J. Optim. 8(2), 324–364 (1998)

  31. Okuno, T., Fukushima, M.: An interior point sequential quadratic programming-type method for log-determinant semi-infinite programs. J. Comput. Appl. Math. 376, 112784 (2020)

  32. Okuno, T., Fukushima, M.: Primal-dual path following method for nonlinear semi-infinite programs with semi-definite constraints. Math. Program. 199, 251–303 (2023)

  33. Qi, H.: Local duality of nonlinear semidefinite programming. Math. Oper. Res. 34(1), 124–141 (2009)

  34. Qi, H., Sun, D.: A quadratically convergent Newton method for computing the nearest correlation matrix. SIAM J. Matrix Anal. Appl. 28(2), 360–385 (2006)

  35. Scherer, C.W.: Multiobjective \(H_2\)/\(H_{\infty }\) control. IEEE Trans. Autom. Control 40(6), 1054–1062 (1995)

  36. Shapiro, A.: First and second order analysis of nonlinear semidefinite programs. Math. Program. 77(1), 301–320 (1997)

  37. Sun, D.: The strong second-order sufficient condition and constraint nondegeneracy in nonlinear semidefinite programming and their implications. Math. Oper. Res. 31(4), 761–776 (2006)

  38. Sun, D., Sun, J., Zhang, L.W.: The rate of convergence of the augmented Lagrangian method for nonlinear semidefinite programming. Math. Program. 114(2), 349–391 (2008)

  39. Sun, J., Zhang, L.W., Wu, Y.: Properties of the augmented Lagrangian in nonlinear semidefinite optimization. J. Optim. Theory Appl. 129(3), 437–456 (2006)

  40. Todd, M.J.: A study of search directions in primal–dual interior-point methods for semidefinite programming. Optim. Methods Softw. 11(1–4), 1–46 (1999)

  41. Todd, M.J., Toh, K.C., Tütüncü, R.H.: On the Nesterov–Todd direction in semidefinite programming. SIAM J. Optim. 8(3), 769–796 (1998)

  42. Vandenberghe, L., Boyd, S.: Semidefinite programming. SIAM Rev. 38(1), 49–95 (1996)

  43. Wolkowicz, H., Saigal, R., Vandenberghe, L.: Handbook of Semidefinite Programming: Theory, Algorithms, and Applications, vol. 27. Springer, New York (2012)

  44. Yamakawa, Y., Okuno, T.: A stabilized sequential quadratic semidefinite programming method for degenerate nonlinear semidefinite programs. Comput. Optim. Appl. 83(3), 1027–1064 (2022)

  45. Yamakawa, Y., Yamashita, N.: A two-step primal–dual interior point method for nonlinear semidefinite programming problems and its superlinear convergence. J. Oper. Res. Soc. Jpn. 57(3–4), 105–127 (2014)

  46. Yamakawa, Y., Yamashita, N.: A differentiable merit function for the shifted perturbed KKT conditions of the nonlinear semidefinite programming. Pac. J. Optim. 11(3), 557–579 (2015)

  47. Yamashita, H., Yabe, H.: Local and superlinear convergence of a primal–dual interior point method for nonlinear semidefinite programming. Math. Program. 132(1–2), 1–30 (2012)

  48. Yamashita, H., Yabe, H., Harada, K.: A primal–dual interior point method for nonlinear semidefinite programming. Math. Program. 135(1–2), 89–121 (2012)

  49. Yamashita, H., Yabe, H., Harada, K.: A primal-dual interior point trust-region method for nonlinear semidefinite programming. Optim. Methods Softw. 36, 569–601 (2021)

  50. Zhang, Y.: On extending some primal–dual interior-point algorithms from linear programming to semidefinite programming. SIAM J. Optim. 8(2), 365–386 (1998)

  51. Zhao, Q., Chen, Z.: On the superlinear local convergence of a penalty-free method for nonlinear semidefinite programming. J. Comput. Appl. Math. 308, 1–19 (2016)

  52. Zhao, Q., Chen, Z.: A line search exact penalty method for nonlinear semidefinite programming. Comput. Optim. Appl. 75(2), 467–491 (2020)


Acknowledgements

The author thanks Professor Yoshiko Ikebe, Professor Mirai Tanaka, and Professor Makoto Yamashita for numerous comments and suggestions. He is also sincerely grateful to the anonymous reviewers for many crucial suggestions.

Author information

Corresponding author

Correspondence to Takayuki Okuno.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This research was supported in part by Grant-in-Aid for Young Scientists 20K19748 and Grant-in-Aid for Scientific Research (B)20H04145 from JSPS KAKENHI.

Omitted Proofs

1.1 Proof of Proposition 

Before proving the proposition, we give two auxiliary propositions. Choose arbitrary sequences \(\{\tilde{w}^{\ell }\}\) and \(\{\tilde{\mu }_{\ell }\}\) satisfying (22) in Condition (P2).

Since the semidefinite complementarity condition that \(G(x^{*})\bullet Y_{*}=0\), \(G(x^{*})\in \mathbb {S}^m_+\), and \(Y_{*}\in \mathbb {S}^m_+\) holds, the matrices \(G(x^{*})\) and \(Y_{*}\) can be simultaneously diagonalized, namely, there exists an orthogonal matrix \(Q_{*}\in \mathbb {R}^{m\times m}\) such that

$$\begin{aligned} G(x^{*})=Q_{*}\begin{bmatrix}\varLambda _1&{} \quad O\\ O&{} \quad O \end{bmatrix}Q_{*}^{\top },\ Y_{*}=Q_{*} \begin{bmatrix}O&{} \quad O \\ O&{}\varLambda _2 \end{bmatrix} Q^{\top }_{*}, \end{aligned}$$
(A.1)

where \(\varLambda _1\in \mathbb {R}^{r_{*}\times r_{*}}\) with \(r_{*}:=\textrm{rank}\,G(x^{*})\) is a positive diagonal matrix and \(\varLambda _2\in \mathbb {R}^{(m-r_{*})\times (m-r_{*})}\) is a nonnegative diagonal matrix. The diagonal entries of \(\varLambda _1\) and \(\varLambda _2\) are the eigenvalues of \(G(x^{*})\) and \(Y_{*}\), respectively. The following proposition is obtained from [47, Lemma 3] under the assumption that \(\tilde{w}^{\ell }\in \mathcal {N}_{\tilde{\mu }_{\ell }}^{r_{\ell }}\) with \(r_{\ell }=\textrm{o}(\tilde{\mu }_{\ell })\).

Proposition A1

It holds that

$$\begin{aligned} G_{\ell } =\begin{bmatrix} \mathrm{\varTheta }(1)&{} \quad \textrm{O}(\tilde{\mu }_{\ell })\\ \textrm{O}(\tilde{\mu }_{\ell })&{} \quad \mathrm{\varTheta }(\tilde{\mu }_{\ell }) \end{bmatrix},\ \widetilde{Y}_{\ell } =\begin{bmatrix} \mathrm{\varTheta }(\tilde{\mu }_{\ell }) &{} \quad \textrm{O}(\tilde{\mu }_{\ell })\\ \textrm{O}(\tilde{\mu }_{\ell })&{} \quad {\mathrm{\varTheta }}(1) \end{bmatrix}, \end{aligned}$$

where both matrices are partitioned into four blocks of the same sizes as those in (A.1). The above expressions indicate the orders of magnitude of the blocks. For example, \(\Vert \text{the } (1,1)\text{-block of } G_{\ell }\Vert _{\text {F}}=\mathrm {\varTheta }(1)\). Moreover, the sequences of the inverse matrices satisfy

$$\begin{aligned} G_{\ell }^{-1} =\begin{bmatrix} \mathrm{\varTheta }(1)&{} \quad {\textrm{O}}(1)\\ \textrm{O}(1)&{} \quad \mathrm{\varTheta }(\tilde{\mu }_{\ell }^{-1}) \end{bmatrix},\ \widetilde{Y}_{\ell }^{-1}=\begin{bmatrix} \mathrm{\varTheta }(\tilde{\mu }_{\ell }^{-1}) &{} \textrm{O}(1)\\ \textrm{O}(1)&{} \quad \mathrm{\varTheta }(1) \end{bmatrix}. \end{aligned}$$
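As a sanity check on these orders, consider a hypothetical toy instance (an illustration only, not taken from the paper) with \(m=2\), \(r_{*}=1\), \(Q_{*}=I\), and iterates lying exactly on the central path, so that \(G_{\ell }\widetilde{Y}_{\ell }=\tilde{\mu }_{\ell }I\):

$$\begin{aligned} G_{\ell }=\begin{bmatrix} 1&{} \quad 0\\ 0&{} \quad \tilde{\mu }_{\ell } \end{bmatrix},\ \widetilde{Y}_{\ell }=\begin{bmatrix} \tilde{\mu }_{\ell }&{} \quad 0\\ 0&{} \quad 1 \end{bmatrix},\ G_{\ell }^{-1}=\begin{bmatrix} 1&{} \quad 0\\ 0&{} \quad \tilde{\mu }_{\ell }^{-1} \end{bmatrix},\ \widetilde{Y}_{\ell }^{-1}=\begin{bmatrix} \tilde{\mu }_{\ell }^{-1}&{} \quad 0\\ 0&{} \quad 1 \end{bmatrix}. \end{aligned}$$

In this instance the diagonal blocks attain the stated \(\mathrm{\varTheta }\) orders, the off-diagonal blocks vanish, and \(\tilde{\mu }_{\ell }\Vert G_{\ell }^{-1}\Vert _{\textrm{F}}=\sqrt{\tilde{\mu }_{\ell }^2+1}=\textrm{O}(1)\).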

The next proposition will be used to prove Case (i).

Proposition A2

Let \(U:=\mathcal {L}_{X^{\frac{1}{2}}}^{-1}(\varDelta X)\) for \(X\in \mathbb {S}^m_{++}\) and \(\varDelta X\in \mathbb {S}^m\). Then,

$$\begin{aligned} \Vert UX^{-\frac{1}{2}}\Vert _{\textrm{F}}=\Vert X^{-\frac{1}{2}}U\Vert _{\textrm{F}}\le \frac{1}{\sqrt{2}}\Vert X^{-\frac{1}{2}}\varDelta XX^{-\frac{1}{2}}\Vert _{\textrm{F}} \le \sqrt{\frac{m}{2}}\Vert \varDelta X\Vert _{\textrm{F}}\Vert X^{-1}\Vert _{\textrm{F}} \end{aligned}$$

Proof

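The first equality is obvious because \(U\) and \(X^{-\frac{1}{2}}\) are symmetric, so the two matrices are transposes of each other; we therefore prove only the two inequalities. From \(U=\mathcal {L}_{X^{\frac{1}{2}}}^{-1}(\varDelta X)\) it follows that \(UX^{\frac{1}{2}}+X^{\frac{1}{2}}U=\varDelta X\), which implies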

$$\begin{aligned} UX^{-\frac{1}{2}}+X^{-\frac{1}{2}}U=X^{-\frac{1}{2}}\varDelta XX^{-\frac{1}{2}}. \end{aligned}$$

Then, the first desired inequality follows from [27, Lemma 2.1]. The second one is obtained from

$$\begin{aligned} \frac{ \Vert X^{-\frac{1}{2}}\varDelta XX^{-\frac{1}{2}}\Vert _{\textrm{F}} }{\sqrt{2}} \le \frac{ \Vert X^{-\frac{1}{2}}\Vert _{\textrm{F}}^2\Vert \varDelta X\Vert _{\textrm{F}} }{\sqrt{2}} \le \sqrt{\frac{m}{2}} \Vert X^{-1}\Vert _{\textrm{F}}\Vert \varDelta X\Vert _{\textrm{F}}, \end{aligned}$$

where the second inequality follows from \(\Vert X^{-\frac{1}{2}}\Vert _\textrm{F}^2=\textrm{Tr}(X^{-1})\le \Vert I\Vert _{\textrm{F}}\Vert X^{-1}\Vert _\textrm{F}=\sqrt{m}\Vert X^{-1}\Vert _{\textrm{F}}\). Hence, the proof is complete. \(\square \)
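The chain of inequalities in Proposition A2 can be spot-checked on random data. The sketch below assumes, as the proof indicates, that \(\mathcal {L}_{A}^{-1}(R)\) denotes the solution \(U\) of \(UA+AU=R\); the helper `lyapunov_inverse` and the test matrices are hypothetical and serve only as an illustration.

```python
import numpy as np

def lyapunov_inverse(A, R):
    """Solve U A + A U = R for symmetric positive definite A (hypothetical helper)."""
    d, Q = np.linalg.eigh(A)
    Rq = Q.T @ R @ Q
    Uq = Rq / (d[:, None] + d[None, :])  # entrywise solution in the eigenbasis of A
    return Q @ Uq @ Q.T

rng = np.random.default_rng(1)
m = 5
B = rng.standard_normal((m, m))
X = B @ B.T + 0.1 * np.eye(m)            # X in S^m_{++}
dX = rng.standard_normal((m, m))
dX = (dX + dX.T) / 2                     # symmetric Delta X

w, V = np.linalg.eigh(X)
X_half = (V * np.sqrt(w)) @ V.T          # X^{1/2}
X_mhalf = (V / np.sqrt(w)) @ V.T         # X^{-1/2}
X_inv = (V / w) @ V.T                    # X^{-1}

U = lyapunov_inverse(X_half, dX)

fro = np.linalg.norm                     # Frobenius norm for matrices
lhs = fro(U @ X_mhalf)
mid = fro(X_mhalf @ dX @ X_mhalf) / np.sqrt(2)
rhs = np.sqrt(m / 2) * fro(dX) * fro(X_inv)
print(lhs, mid, rhs)                     # expected to be nondecreasing
```

A single random instance cannot replace the proof, but it helps catch sign or scaling slips when the bound is reused.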

Let us start proving Proposition . For each \(\ell \), let

$$\begin{aligned} \widetilde{P}_{\ell }:=\mathcal {P}(\tilde{w}^{\ell }),\ \widehat{G}_{\ell }:=\widetilde{P}_{\ell }G_{\ell }\widetilde{P}_{\ell }^{\top },\ \widehat{Y}_{\ell }:=\widetilde{P}_{\ell }^{-\top }\widetilde{Y}_{\ell }\widetilde{P}_{\ell }^{-1}. \end{aligned}$$

For other notation, such as \(\widehat{\mathcal {G}}_i\), see Condition (\(\textbf{P2}\)). Note that for \(w\in \mathcal {W}\) and \(\mu >0\),

$$\begin{aligned} \Vert \widehat{G}(x)^{\frac{1}{2}}\widehat{Y}\widehat{G}(x)^{\frac{1}{2}}-\mu I\Vert _{\textrm{F}} =\Vert G(x)^{\frac{1}{2}}YG(x)^{\frac{1}{2}}-\mu I\Vert _{\textrm{F}} \le \Vert G(x)Y-\mu I\Vert _\textrm{F}, \end{aligned}$$
(A.2)

where the equality is easily verified by comparing the squares of both sides, and the inequality follows from Proposition  with \(X=G(x)\). Combining the above relation with \(\Vert G_{\ell }\widetilde{Y}_{\ell }-\tilde{\mu }_{\ell }I\Vert _{\textrm{F}}=\textrm{O}(\tilde{\mu }_{\ell }^{1+\xi })\), we have

$$\begin{aligned} \Vert \widehat{G}_{\ell }^{\frac{1}{2}}\widehat{Y}_{\ell }\widehat{G}_{\ell }^{\frac{1}{2}}-\tilde{\mu }_{\ell }I\Vert _{\textrm{F}}=\textrm{O}(\tilde{\mu }_{\ell }^{1+\xi }). \end{aligned}$$
(A.3)
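Relation (A.2), which underlies (A.3), can be illustrated numerically. The following is a rough sketch under the assumption that the scaled matrices are \(\widehat{G}=PGP^{\top }\) and \(\widehat{Y}=P^{-\top }YP^{-1}\) for a nonsingular \(P\), as in the definitions above; the random matrices, the value of `mu`, and the helper names are illustrative choices.

```python
import numpy as np

def sym_sqrt(A):
    """Symmetric square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(A)
    return (V * np.sqrt(w)) @ V.T

def random_spd(m, rng):
    """Draw a random symmetric positive definite m-by-m matrix (illustrative helper)."""
    B = rng.standard_normal((m, m))
    return B @ B.T + m * np.eye(m)

rng = np.random.default_rng(2)
m, mu = 4, 0.3
G, Y = random_spd(m, rng), random_spd(m, rng)
P = rng.standard_normal((m, m)) + m * np.eye(m)  # generic nonsingular scaling (assumption)

P_inv = np.linalg.inv(P)
G_hat = P @ G @ P.T
G_hat = (G_hat + G_hat.T) / 2                    # remove rounding asymmetry
Y_hat = P_inv.T @ Y @ P_inv

I = np.eye(m)
fro = np.linalg.norm
G_half, G_hat_half = sym_sqrt(G), sym_sqrt(G_hat)
scaled = fro(G_hat_half @ Y_hat @ G_hat_half - mu * I)
unscaled = fro(G_half @ Y @ G_half - mu * I)
product = fro(G @ Y - mu * I)
print(scaled, unscaled, product)                 # scaled == unscaled <= product
```

The equality holds because both symmetric matrices share the eigenvalues of \(G(x)Y-\mu I\), and the Frobenius norm of a symmetric matrix depends only on its eigenvalues.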

We will repeatedly use the following estimates:

$$\begin{aligned}&\Vert G_{\ell }\Vert _{\textrm{F}}=\textrm{O}(1),\ \Vert \widetilde{Y}_{\ell }\Vert _{\textrm{F}}=\textrm{O}(1),\ \Vert \mathcal {G}_i(\tilde{x}^{\ell })\Vert _{\textrm{F}} =\textrm{O}(1),\end{aligned}$$
(A.4)
$$\begin{aligned}&\tilde{\mu }_{\ell }\Vert G_{\ell }^{-1}\Vert _{\textrm{F}}=\textrm{O}(1),\ \tilde{\mu }_{\ell }\Vert \widetilde{Y}_{\ell }^{-1}\Vert _\textrm{F}=\textrm{O}(1), \end{aligned}$$
(A.5)

where the equations in (A.4) are derived from \(\lim _{\ell \rightarrow \infty }(\tilde{x}^{\ell },\widetilde{Y}_{\ell })=(x^{*},Y_{*})\) and the continuity of \(\mathcal {G}_i\) and G, and those in (A.5) follow from Proposition .

Now, we proceed to the proof of Cases (i)–(v). Fix \(i\in \{1,2,\ldots ,n\}\) arbitrarily and write \(\mathcal {U}_{\ell }^i:=\mathcal {L}^{-1}_{\widehat{G}_{\ell }^{\frac{1}{2}}}(\widehat{\mathcal {G}}_i(\tilde{x}^{\ell })).\) We first show Case (i) with \(\widetilde{P}_{\ell }=I\) for any \(\ell \). Note \(Z_{\ell }^i={\tilde{\mu }_{\ell }}\mathcal {U}_{\ell }^iG_{\ell }^{-\frac{1}{2}}\) with \(\mathcal {U}_{\ell }^i=\mathcal {L}^{-1}_{G_{\ell }^{\frac{1}{2}}}(\mathcal {G}_i(\tilde{x}^{\ell }))\) in this case. We then have \(\tilde{\mu }_{\ell }\mathcal {U}_{\ell }^iG_{\ell }^{\frac{1}{2}}+\tilde{\mu }_{\ell }G_{\ell }^{\frac{1}{2}}\mathcal {U}_{\ell }^i=\tilde{\mu }_{\ell }\mathcal {G}_i(\tilde{x}^{\ell })\), which together with Proposition  with \((X,\varDelta X)=(G_{\ell },\mathcal {G}_i(\tilde{x}^{\ell }))\) implies

$$\begin{aligned} \Vert Z_{\ell }^i\Vert _{\textrm{F}}= \tilde{\mu }_{\ell }\Vert \mathcal {U}_{\ell }^iG_{\ell }^{-\frac{1}{2}}\Vert _{\textrm{F}}\le \tilde{\mu }_{\ell }\sqrt{\frac{m}{2}}\Vert G_{\ell }^{-1}\Vert _{\textrm{F}}\Vert \mathcal {G}_i(\tilde{x}^{\ell })\Vert _\textrm{F}=\textrm{O}(1), \end{aligned}$$

where we used (A.4) and (A.5). Hence, \(\{Z_{\ell }^i\}\) is bounded. We next show Case (ii) with \(\widetilde{P}_{\ell }=G_{\ell }^{-\frac{1}{2}}\). By \(\widehat{G}_{\ell }=I\) for each \(\ell \), we have \(\mathcal {U}_{\ell }^i=\widehat{\mathcal {G}}_i(\tilde{x}^{\ell })/2\), which together with (A.4) and (A.5) implies

$$\begin{aligned} \Vert Z_{\ell }^i\Vert _{\textrm{F}}=\tilde{\mu }_{\ell }\Vert G_{\ell }^{\frac{1}{2}}\mathcal {U}_{\ell }^iG_{\ell }^{-\frac{1}{2}}\Vert _\textrm{F}=\frac{\tilde{\mu }_{\ell }}{2} \Vert \mathcal {G}_i(\tilde{x}^{\ell })G_{\ell }^{-1}\Vert _{\textrm{F}}=\textrm{O}(1). \end{aligned}$$

Thus, \(\{Z_{\ell }^i\}\) is bounded for Case (ii).

In what follows, we show the remaining cases in a unified manner. As will be shown later, in each of Cases (iii)–(v), there exists some \(\rho _{*}>0\) such that

$$\begin{aligned} \Vert \widehat{G}_{\ell }^{\frac{1}{\rho _{*}}}-\tilde{\mu }_{\ell }I\Vert _\textrm{F}=\textrm{O}(\tilde{\mu }_{\ell }^{1+\xi }). \end{aligned}$$
(A.6)

Let \(S_{\ell }^{i}:=\frac{\tilde{\mu }_{\ell }^{-\frac{\rho _{*}}{2}}}{2}\widehat{\mathcal {G}}_i(\tilde{x}^{\ell })\) for each \(\ell \). The expression \(\tilde{\mu }_{\ell }\left\| \widetilde{P}_{\ell }^{-1}S_{\ell }^{i}\widehat{G}_{\ell }^{-\frac{1}{2}}\widetilde{P}_{\ell }\right\| _{\textrm{F}}\) is evaluated as

$$\begin{aligned} \tilde{\mu }_{\ell }\left\| \widetilde{P}_{\ell }^{-1}S_{\ell }^{i}\widehat{G}_{\ell }^{-\frac{1}{2}}\widetilde{P}_{\ell }\right\| _\textrm{F}=\frac{\tilde{\mu }_{\ell }^{1-\frac{\rho _{*}}{2}}\Vert \mathcal {G}_i(\tilde{x}^{\ell })\widetilde{P}_{\ell }^{\top }\widehat{G}_{\ell }^{-\frac{1}{2}}\widetilde{P}_{\ell }\Vert _\textrm{F}}{2} =\textrm{O}(\tilde{\mu }_{\ell }^{1-\rho _{*}}\Vert \widetilde{P}_{\ell }\Vert _{\textrm{F}}^2), \end{aligned}$$
(A.7)

where the first equality follows from \(\widehat{\mathcal {G}}_i(\tilde{x}^{\ell })=\widetilde{P}_{\ell }\mathcal {G}_i(\tilde{x}^{\ell })\widetilde{P}_{\ell }^{\top }\) and the last one from \(\Vert \widehat{G}_{\ell }^{-\frac{1}{2}}\Vert _{\textrm{F}}=\textrm{O}(\tilde{\mu }_{\ell }^{-\frac{\rho _{*}}{2}})\) by (A.6) and \(\Vert \mathcal {G}_i(\tilde{x}^{\ell })\Vert _{\textrm{F}}=\textrm{O}(1)\) as in (A.4). Furthermore, let

$$\begin{aligned} \widehat{G}_{\ell }=Q_{\ell }D_{\ell }Q_{\ell }^{\top } \end{aligned}$$
(A.8)

be an eigen-decomposition of \(\widehat{G}_{\ell }\) with an appropriate orthogonal matrix \(Q_{\ell }\in \mathbb {R}^{m\times m}\) and a diagonal matrix \(D_{\ell }\in \mathbb {R}^{m\times m}\) with the eigenvalues of \(\widehat{G}_{\ell }\) aligned on the diagonal. Notice that \(\widehat{G}_{\ell }^{\frac{1}{2}}=Q_{\ell }D_{\ell }^{\frac{1}{2}}Q_{\ell }^{\top }\). Denote \(d_{p,\ell }:=(D_{\ell })_{pp}\in \mathbb {R}\) for each \(p=1,2,\ldots ,m\) and \(\mathcal {U}_{Q_{\ell }}^i:=Q_{\ell }^{\top }\mathcal {U}_{\ell }^iQ_{\ell }\). By multiplying \(Q_{\ell }^{\top }\) and \(Q_{\ell }\) on both sides of \(\mathcal {U}_{\ell }^i\widehat{G}_{\ell }^{\frac{1}{2}}+\widehat{G}_{\ell }^{\frac{1}{2}}\mathcal {U}_{\ell }^i=\widehat{\mathcal {G}}_i(\tilde{x}^{\ell })\), it follows from (A.8) that \(\mathcal {U}_{Q_{\ell }}^iD_{\ell }^{\frac{1}{2}}+D_{\ell }^{\frac{1}{2}}\mathcal {U}_{Q_{\ell }}^i=Q_{\ell }^{\top }\widehat{\mathcal {G}}_i(\tilde{x}^{\ell })Q_{\ell }\), which together with \(d_{p,\ell }=(D_{\ell })_{pp}\) for each \(p,\ell \) yields

$$\begin{aligned} (\mathcal {U}_{Q_{\ell }}^i)_{p,q}=\frac{(Q_{\ell }^{\top }\widehat{\mathcal {G}}_i(\tilde{x}^{\ell }) Q_{\ell })_{p,q}}{d_{p,\ell }^{\frac{1}{2}}+d_{q,\ell }^{\frac{1}{2}}} \end{aligned}$$

for each \(p,q=1,2,\ldots ,m\). From (A.6), for each \(p=1,2,\ldots ,m\), there exists a sequence \(\{\delta _{p,\ell }\}\subseteq \mathbb {R}\) satisfying \(\delta _{p,\ell }=\textrm{O}(\tilde{\mu }_{\ell }^{1+\xi })\) and \(d_{p,\ell }=(\tilde{\mu }_{\ell }+\delta _{p,\ell })^{\rho _{*}}\). Since \(\tilde{\mu }_{\ell }>0\), for each \(p\), the mean-value theorem implies that, for some \(\bar{s}_{p,\ell }\in [0,1]\),

$$\begin{aligned} d_{p,\ell }^{\frac{1}{2}}-\tilde{\mu }_{\ell }^{\frac{\rho _{*}}{2}}=(\tilde{\mu }_{\ell }+\delta _{p,\ell })^{\frac{\rho _{*}}{2}}-\tilde{\mu }_{\ell }^{\frac{\rho _{*}}{2}} ={\frac{1}{2}\rho _{*}\delta _{p,\ell }(\tilde{\mu }_{\ell }+\bar{s}_{p,\ell }\delta _{p,\ell })^{\frac{\rho _{*}}{2}-1}}. \end{aligned}$$
(A.9)

Notice that \(\tilde{\mu }_{\ell }+\bar{s}_{p,\ell }\delta _{p,\ell }=\mathrm{\varTheta }(\tilde{\mu }_{\ell })\) and \(d_{p,\ell } =\mathrm{\varTheta }(\tilde{\mu }_{\ell }^{\rho _{*}})\) for all p. Then, for each \(p,q=1,2,\ldots ,m\), (A.9) yields

$$\begin{aligned} \frac{1}{2\tilde{\mu }_{\ell }^{\frac{\rho _{*}}{2}}}-\frac{1}{d_{p,\ell }^{\frac{1}{2}}+d_{q,\ell }^{\frac{1}{2}}}&=\frac{d_{p,\ell }^{\frac{1}{2}}+d_{q,\ell }^{\frac{1}{2}}-2\tilde{\mu }_{\ell }^{\frac{\rho _{*}}{2}}}{2(d_{p,\ell }^{\frac{1}{2}}+d_{q,\ell }^{\frac{1}{2}})\tilde{\mu }_{\ell }^{\frac{\rho _{*}}{2}}} \\&=\textrm{O}\left( \frac{ \rho _{*}\delta _{p,\ell }(\tilde{\mu }_{\ell }+\bar{s}_{p,\ell }\delta _{p,\ell })^{\frac{\rho _{*}}{2}-1}+ \rho _{*}\delta _{q,\ell }(\tilde{\mu }_{\ell }+\bar{s}_{q,\ell }\delta _{q,\ell })^{\frac{\rho _{*}}{2}-1} }{ 4\tilde{\mu }_{\ell }^{\rho _{*}} } \right) \\&=\textrm{O}(\tilde{\mu }_{\ell }^{\xi - \frac{\rho _{*}}{2}}), \end{aligned}$$

where the last equality follows from \(\delta _{p,\ell }=\textrm{O}(\tilde{\mu }_{\ell }^{1+\xi })\) and \(\delta _{q,\ell }=\textrm{O}(\tilde{\mu }_{\ell }^{1+\xi })\). From this fact together with \(\Vert Q_{\ell }^{\top }\widehat{\mathcal {G}}_i(\tilde{x}^{\ell })Q_{\ell }\Vert _{\textrm{F}}= \Vert Q_{\ell }^{\top }\widetilde{P}_{\ell }\mathcal {G}_i(\tilde{x}^{\ell })\widetilde{P}_{\ell }^{\top }Q_{\ell }\Vert _{\textrm{F}}= \textrm{O}(\Vert \widetilde{P}_{\ell }\Vert _\textrm{F}^2)\), we obtain, for each \(p,q\),

$$\begin{aligned} (\mathcal {U}_{Q_{\ell }}^i-Q_{\ell }^{\top }S_{\ell }^{i}Q_{\ell })_{p,q}={(Q_{\ell }^{\top }\widehat{\mathcal {G}}_i(\tilde{x}^{\ell })Q_{\ell })_{p,q}}\left( \frac{1}{d_{p,\ell }^{\frac{1}{2}}+d_{q,\ell }^{\frac{1}{2}}}-\frac{1}{2\tilde{\mu }_{\ell }^{\frac{\rho _{*}}{2}}}\right) \\ =\textrm{O}\left( \Vert \widetilde{P}_{\ell }\Vert _{\textrm{F}}^2\tilde{\mu }_{\ell }^{\xi - \frac{\rho _{*}}{2}}\right) , \end{aligned}$$

which together with \(\Vert \widehat{G}_{\ell }^{-\frac{1}{2}}\Vert _\textrm{F}=\textrm{O}(\tilde{\mu }_{\ell }^{-\frac{\rho _{*}}{2}})\) from (A.6) implies

$$\begin{aligned}&\Vert \tilde{\mu }_{\ell }\widetilde{P}_{\ell }^{-1}Q_{\ell }(\mathcal {U}_{Q_{\ell }}^i-Q_{\ell }^{\top }S_{\ell }^{i}Q_{\ell })Q_{\ell }^{\top }\widehat{G}_{\ell }^{-\frac{1}{2}}\widetilde{P}_{\ell }\Vert _{\textrm{F}}\nonumber \\&\quad \le \tilde{\mu }_{\ell }\Vert \widetilde{P}_{\ell }^{-1}\Vert _{\textrm{F}}\Vert Q_{\ell }\Vert _{\textrm{F}}^2\Vert \widetilde{P}_{\ell }\Vert _{\textrm{F}}\Vert \mathcal {U}_{Q_{\ell }}^i-Q_{\ell }^{\top }S_{\ell }^{i}Q_{\ell }\Vert _{\textrm{F}}\Vert \widehat{G}_{\ell }^{-\frac{1}{2}}\Vert _{\textrm{F}}\nonumber \\&\quad =\displaystyle {\textrm{O}(\tilde{\mu }_{\ell }^{\xi +1-\rho _{*}}\Vert \widetilde{P}_{\ell }\Vert _\textrm{F}^3\Vert \widetilde{P}_{\ell }^{-1}\Vert _{\textrm{F}})}, \end{aligned}$$
(A.10)

where we used the fact that \(\Vert Q_{\ell }\Vert _{\textrm{F}}=\sqrt{m}\) because \(Q_{\ell }\) is an orthogonal matrix. Hence, by recalling \(Z_{\ell }^i=\tilde{\mu }_{\ell }\widetilde{P}_{\ell }^{-1}\mathcal {U}_{\ell }^i\widehat{G}_{\ell }^{-\frac{1}{2}}\widetilde{P}_{\ell }\) and using (A.7) we obtain

$$\begin{aligned} \Vert Z_{\ell }^i\Vert _{\textrm{F}}&\le \Vert \tilde{\mu }_{\ell }\widetilde{P}_{\ell }^{-1}S_{\ell }^{i}\widehat{G}_{\ell }^{-\frac{1}{2}}\widetilde{P}_{\ell }\Vert _{\textrm{F}} +\Vert \tilde{\mu }_{\ell }\widetilde{P}_{\ell }^{-1}Q_{\ell }(\mathcal {U}_{Q_{\ell }}^i-Q_{\ell }^{\top }S_{\ell }^{i}Q_{\ell })Q_{\ell }^{\top }\widehat{G}_{\ell }^{-\frac{1}{2}}\widetilde{P}_{\ell }\Vert _{\textrm{F}}\nonumber \\&=\textrm{O}(\tilde{\mu }_{\ell }^{1-\rho _{*}}\Vert \widetilde{P}_{\ell }\Vert _\textrm{F}^2+\tilde{\mu }_{\ell }^{\xi +1-\rho _{*}}\Vert \widetilde{P}_{\ell }\Vert _{\textrm{F}}^3\Vert \widetilde{P}_{\ell }^{-1}\Vert _\textrm{F}). \end{aligned}$$
(A.11)
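The entrywise estimate behind (A.10), namely that \(\mathcal {U}_{Q_{\ell }}^i\) differs from \(Q_{\ell }^{\top }S_{\ell }^{i}Q_{\ell }\) by \(\textrm{O}(\tilde{\mu }_{\ell }^{\xi -\frac{\rho _{*}}{2}})\) per unit of \(\Vert Q_{\ell }^{\top }\widehat{\mathcal {G}}_i(\tilde{x}^{\ell })Q_{\ell }\Vert _{\textrm{F}}\), can be probed on synthetic data. The sketch below is only an illustration: the function `check_ratio`, the random symmetric matrix `R` standing in for \(Q_{\ell }^{\top }\widehat{\mathcal {G}}_i(\tilde{x}^{\ell })Q_{\ell }\), and the perturbations `delta` are assumptions made for the experiment.

```python
import numpy as np

def check_ratio(mu, rho, xi, rng, m=4):
    """Return max|U_Q - Q^T S Q| / (||R||_F * mu**(xi - rho/2)) for one synthetic instance."""
    # Eigenvalues d_p = (mu + delta_p)**rho with |delta_p| <= mu**(1 + xi), mimicking (A.6).
    delta = rng.uniform(-1.0, 1.0, size=m) * mu ** (1 + xi)
    sqrt_d = (mu + delta) ** (rho / 2)
    # R stands in for Q^T G_hat_i Q; any symmetric matrix serves for this check.
    R = rng.standard_normal((m, m))
    R = (R + R.T) / 2
    U_Q = R / (sqrt_d[:, None] + sqrt_d[None, :])  # entrywise Lyapunov solution
    S_Q = 0.5 * mu ** (-rho / 2) * R               # Q^T S Q
    return np.max(np.abs(U_Q - S_Q)) / (np.linalg.norm(R) * mu ** (xi - rho / 2))

rng = np.random.default_rng(3)
xi = 0.5
ratios = {
    rho: [check_ratio(mu, rho, xi, rng) for mu in (1e-2, 1e-4, 1e-6, 1e-8)]
    for rho in (1.0, 2.0, 0.5)                     # rho_* in Cases (iii), (iv), (v)
}
for rho, values in ratios.items():
    print(rho, values)
```

If the estimate holds, the printed ratios stay bounded as `mu` decreases, for each of the three values of \(\rho _{*}\).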

Hereafter, for each of Cases (iii)–(v), we evaluate \(\Vert \widetilde{P}_{\ell }\Vert _\textrm{F}\), \(\Vert \widetilde{P}_{\ell }^{-1}\Vert _{\textrm{F}}\), and \(\rho _{*}\), and prove the boundedness of \(\{Z_{\ell }^i\}\) by showing that the right-hand side of (A.11) is \(\textrm{O}(1)\).

Case (iii): Since \(\widehat{Y}_{\ell }=I\) and \(\widetilde{P}_{\ell }=\widetilde{Y}_{\ell }^{\frac{1}{2}}\) for each \(\ell \), (A.3) implies \(\Vert \widehat{G}_{\ell }-\tilde{\mu }_{\ell }I\Vert _{\textrm{F}}=\Vert \widehat{G}_{\ell }^{\frac{1}{2}}\widehat{Y}_{\ell }\widehat{G}_{\ell }^{\frac{1}{2}}-\tilde{\mu }_{\ell }I\Vert _{\textrm{F}}=\textrm{O}(\tilde{\mu }_{\ell }^{1+\xi })\), which indicates \(\rho _{*}=1\) (see (A.6)). Furthermore, \(\Vert \widetilde{P}_{\ell }\Vert _{\textrm{F}}=\Vert \widetilde{Y}_{\ell }^{\frac{1}{2}}\Vert _{\textrm{F}}=\textrm{O}(1)\) and \(\Vert \widetilde{P}_{\ell }^{-1}\Vert _{\textrm{F}}=\Vert \widetilde{Y}_{\ell }^{-\frac{1}{2}}\Vert _{\textrm{F}}=\textrm{O}(\tilde{\mu }_{\ell }^{-\frac{1}{2}})\) by (A.5). Combined with (A.11) and the assumption \(\xi \ge \frac{1}{2}\), these results yield \(\Vert Z_{\ell }^i\Vert _\textrm{F}=\displaystyle {\textrm{O}(1+\tilde{\mu }_{\ell }^{\xi -\frac{1}{2}})}=\textrm{O}(1)\).

Case (iv): Since \(\widetilde{P}_{\ell }=(\widetilde{Y}_{\ell }G_{\ell }\widetilde{Y}_{\ell })^{\frac{1}{2}}\) and \(\widehat{G}_{\ell }^{-\frac{1}{2}}=\widehat{Y}_{\ell }\), we have, from (A.3), \( \Vert \widehat{G}_{\ell }^{\frac{1}{2}}-\tilde{\mu }_{\ell }I\Vert _\textrm{F}=\textrm{O}(\tilde{\mu }_{\ell }^{1+\xi }) \) yielding \(\rho _{*}=2\). Moreover, by (A.4) and \(\Vert G_{\ell }\widetilde{Y}_{\ell }-\tilde{\mu }_{\ell }I\Vert _\textrm{F}=\textrm{O}(\tilde{\mu }_{\ell }^{1+\xi })\),

$$\begin{aligned}&\Vert \widetilde{P}_{\ell }\Vert _{\textrm{F}}^2=\textrm{Tr}(\widetilde{Y}_{\ell }G_{\ell }\widetilde{Y}_{\ell })\le \Vert \widetilde{Y}_{\ell }\Vert _\textrm{F}\Vert G_{\ell }\widetilde{Y}_{\ell }-\tilde{\mu }_{\ell }I\Vert _{\textrm{F}} +\tilde{\mu }_{\ell }\Vert \widetilde{Y}_{\ell }\Vert _{\textrm{F}}=\textrm{O}(\tilde{\mu }_{\ell }),\ \hbox {and}\\&\Vert \widetilde{P}_{\ell }^{-1}\Vert _{\textrm{F}}^2=\textrm{Tr}(\widetilde{Y}_{\ell }^{-1}G_{\ell }^{-1}\widetilde{Y}_{\ell }^{-1}) =\textrm{Tr}(G_{\ell }K_{\ell }^2) \le \Vert G_{\ell }\Vert _{\textrm{F}}\Vert K_{\ell }\Vert _{\textrm{F}}^2 =\textrm{O}(\tilde{\mu }_{\ell }^{-2}), \end{aligned}$$

where \(K_{\ell }:=G_{\ell }^{-\frac{1}{2}}\widetilde{Y}_{\ell }^{-1}G_{\ell }^{-\frac{1}{2}}\) and we used \(\Vert K_{\ell }\Vert _\textrm{F}=\textrm{O}(\tilde{\mu }_{\ell }^{-1})\) from (A.3) to derive the last equality. These results combined with (A.11), \(\rho _{*}=2\), and \(\xi \ge \frac{1}{2}\) yield \(\Vert Z_{\ell }^i\Vert _{\textrm{F}} =\displaystyle {\textrm{O}(1+\tilde{\mu }_{\ell }^{\xi -\frac{1}{2}})}=\textrm{O}(1).\)

Case (v): By \(\widehat{G}_{\ell }=\widehat{Y}_{\ell }\) and (A.3), we have \(\Vert \widehat{G}_{\ell }^2-\tilde{\mu }_{\ell }I\Vert _\textrm{F}=\textrm{O}(\tilde{\mu }_{\ell }^{1+\xi })\) yielding \(\rho _{*}=\frac{1}{2}\). Recall that the MTW scaling matrix \(W_{\ell }\) is defined by \( W_{\ell }:=G_{\ell }^{\frac{1}{2}}\left( G_{\ell }^{\frac{1}{2}}\widetilde{Y}_{\ell }G_{\ell }^{\frac{1}{2}}\right) ^{-\frac{1}{2}}G_{\ell }^{\frac{1}{2}} \) for each \(\ell \). Note that \(\Vert G_{\ell }\Vert _{\textrm{F}}=\textrm{O}(1)\) and \(\Vert G_{\ell }^{\frac{1}{2}}\widetilde{Y}_{\ell }G_{\ell }^{\frac{1}{2}}\Vert _{\textrm{F}}=\mathrm{\varTheta }(\tilde{\mu }_{\ell })\) follow from (A.4) and (A.3), respectively. The first equality in (A.2) then implies

$$\begin{aligned} \Vert W_{\ell }\Vert ^2_{\textrm{F}} =\Vert G_{\ell }^{\frac{1}{2}}\left( G_{\ell }^{-\frac{1}{2}}\widetilde{Y}_{\ell }^{-1}G_{\ell }^{-\frac{1}{2}}\right) ^{\frac{1}{2}}G_{\ell }^{\frac{1}{2}}\Vert ^2_\textrm{F} \le m\Vert G_{\ell }\Vert _{\textrm{F}}^2\Vert G_{\ell }^{-\frac{1}{2}}\widetilde{Y}_{\ell }^{-1}G_{\ell }^{-\frac{1}{2}}\Vert _{\textrm{F}} =\textrm{O}(\tilde{\mu }_{\ell }^{-1}), \end{aligned}$$

which entails \(\Vert W_{\ell }\Vert _{\textrm{F}}=\textrm{O}(\tilde{\mu }_{\ell }^{-\frac{1}{2}})\). Using \(\Vert G_{\ell }^{\frac{1}{2}}\widetilde{Y}_{\ell }G_{\ell }^{\frac{1}{2}}\Vert _{\textrm{F}}=\mathrm{\varTheta }(\tilde{\mu }_{\ell })\) again and \(\Vert G_{\ell }^{-1}\Vert _{\textrm{F}}=\textrm{O}(\tilde{\mu }_{\ell }^{-1})\) from (A.5), we have \(\Vert W_{\ell }^{-1}\Vert _{\textrm{F}} = \Vert G_{\ell }^{-\frac{1}{2}}\left( G_{\ell }^{\frac{1}{2}}\widetilde{Y}_{\ell }G_{\ell }^{\frac{1}{2}}\right) ^{\frac{1}{2}}G_{\ell }^{-\frac{1}{2}}\Vert _\textrm{F} =\textrm{O}(\tilde{\mu }_{\ell }^{-\frac{1}{2}}).\) Hence, we obtain that \( \Vert \widetilde{P}_{\ell }\Vert _\textrm{F}^2=\textrm{Tr}(W_{\ell }^{-1})=\textrm{O}(\tilde{\mu }_{\ell }^{-\frac{1}{2}}) \) and \( \Vert \widetilde{P}_{\ell }^{-1}\Vert _\textrm{F}^2=\textrm{Tr}(W_{\ell })=\textrm{O}(\tilde{\mu }_{\ell }^{-\frac{1}{2}})\). Therefore, \( \Vert \widetilde{P}_{\ell }\Vert _\textrm{F}=\textrm{O}(\tilde{\mu }_{\ell }^{-\frac{1}{4}}),\ \Vert \widetilde{P}_{\ell }^{-1}\Vert _\textrm{F}=\textrm{O}(\tilde{\mu }_{\ell }^{-\frac{1}{4}}).\) These results combined with (A.11), \(\rho _{*}=\frac{1}{2}\), and \(\xi \ge \frac{1}{2}\) yield \(\Vert Z_{\ell }^i\Vert _\textrm{F}=\textrm{O}(1+\tilde{\mu }_{\ell }^{\xi -\frac{1}{2}})=\textrm{O}(1)\). We complete the proof. \(\square \)

About this article

Cite this article

Okuno, T. Local convergence of primal–dual interior point methods for nonlinear semidefinite optimization using the Monteiro–Tsuchiya family of search directions. Comput Optim Appl (2024). https://doi.org/10.1007/s10589-024-00562-y

  • DOI: https://doi.org/10.1007/s10589-024-00562-y
