A simple proof of second-order sufficient optimality conditions in nonlinear semidefinite optimization

In this note, we present an elementary proof of a well-known second-order sufficient optimality condition in nonlinear semidefinite optimization which does not rely on the enhanced theory of second-order tangents. Our approach builds on an explicit, elementary computation of the so-called second subderivative of the indicator function associated with the semidefinite cone, and it recovers the best curvature term known in the literature.


Introduction
Second-order sufficient optimality conditions play a significant role in the theory of nonlinear optimization. Among others, their validity guarantees stability of the underlying strict local minimizer with respect to perturbations of the data, and this paves the way for proving fast local convergence of diverse types of numerical solution algorithms, including augmented Lagrangian, sequential quadratic programming, and Newton-type methods.
Geometric constraints of the type

F(x) ∈ C,  (1.1)

where F : X → Y is a twice continuously differentiable mapping between Euclidean spaces X and Y, and C ⊂ Y is a closed, convex set, provide a rather general paradigm for modeling diverse popular constraint systems in nonlinear optimization. It has been well recognized in the past that second-order optimality conditions in constrained optimization depend on the second derivative of the objective function as well as on the curvature of the feasible set. In the presence of constraints of type (1.1), the latter can be described in terms of the second derivative of F and the curvature of C. Thus, associated second-order optimality conditions do not only comprise the second derivative of a suitable Lagrangian function, but a so-called curvature term associated with C pops up as well.
In the case where C is a polyhedral set, this curvature term vanishes, and one obtains very simple second-order conditions as they are known from standard nonlinear programming, see Ben-Tal (1980); McCormick (1967). In more general situations, however, a suitable tool has to be used to keep track of the curvature of C and to formulate an appropriate curvature term.
Classically, the support function of a (local) second-order tangent approximation of C has been exploited for that purpose, see Bonnans et al. (1999); Bonnans and Shapiro (2000); for example, this led to second-order optimality conditions in nonlinear second-order cone and semidefinite optimization, see Bonnans and Ramírez (2005); Shapiro (1997). However, we would like to mention here that the proofs in these papers are far from elementary since the calculus of second-order tangents is a rather challenging task.
With the aid of a generalized notion of support functions, the approach via second-order tangents can be further extended to situations where C is no longer convex, see Gfrerer et al. (2022). Another, less popular approach to curvature terms has been promoted recently in Benko et al. (2022); Mohammadi et al. (2021); Thinh et al. (2021), where the so-called second subderivative, see Rockafellar (1989), of the indicator function of C has been used for that purpose. This tool yields promising results even in infinite-dimensional spaces, see Christof and Wachsmuth (2018); Wachsmuth and Wachsmuth (2022). The approach via second subderivatives is particularly suitable for the derivation of second-order sufficient optimality conditions due to the underlying calculus properties of second subderivatives, see Benko and Mehlitz (2023) for a recent study. Second-order sufficient conditions obtained from this approach have been shown to serve as suitable tools for the local convergence analysis of solution algorithms associated with challenging optimization problems based on variational analysis, see Hang et al. (2022); Hang and Sarabi (2021); Sarabi (2022).

In this note, we aim to popularize the approach via second subderivatives even more by presenting an application in nonlinear semidefinite optimization. Thus, let us focus on the special situation where Y := S^m equals the space of all real symmetric m × m-matrices and C := S^m_+ is the cone of all positive semidefinite matrices. The tightest second-order sufficient condition in nonlinear semidefinite optimization we are aware of has been established by Shapiro and can be found in (Shapiro, 1997, Theorem 9). Its proof heavily relies on technical arguments which exploit second-order directional differentiability of the smallest eigenvalue of a positive semidefinite matrix and calculus rules for second-order tangent sets. Later, several authors tried to recover or enhance this result using reformulations of the original problem. In Forsgren (2000), the author obtained a related second-order sufficient condition based on a localized Lagrangian and some technical arguments via Schur's complement. The authors of Lourenço et al. (2018) applied the squared slack variable technique to semidefinite optimization problems and obtained second-order sufficient conditions in the presence of so-called strict complementarity. In Jarre (2012), strict complementarity and a second-order constraint qualification are needed to recover Shapiro's original second-order sufficient condition based on a simplified technique. Further results about second-order optimality conditions in nonlinear semidefinite optimization, such as a strong second-order sufficient condition and a weak second-order necessary condition, can be found in Fukuda et al. (2020); Sun (2006). The validation of the second-order sufficient conditions in the papers Forsgren (2000); Jarre (2012); Lourenço et al. (2018) is much simpler than the strategy used in Shapiro (1997). However, these approaches either do not recover the original result from Shapiro (1997) in full generality, i.e., additional conditions are postulated to proceed, or the analysis still requires some technical preliminary considerations. Here, we simply compute the second subderivative of the indicator function associated with the positive semidefinite cone in order to recover the result from Shapiro (1997) in an elementary way. Let us note that this calculation has already been done in (Mohammadi and Sarabi, 2020, Example 3.7), but the arguments presented there are not self-contained and exploit involved variational properties of eigenvalue functions, see Torki (1999). In contrast, our calculations are completely elementary.
The remainder of this note is structured as follows. In Section 2, we summarize the notation used in this paper and recall the definitions of some variational tools which we are going to exploit. We present an abstract second-order sufficient optimality condition for nonlinear semidefinite optimization problems in Section 3 which comprises the second subderivative of the indicator function of the semidefinite cone as the curvature term and can be distilled from a much more general result recently proven in Benko et al. (2022); Benko and Mehlitz (2023). Then, by explicit computation of the appearing second subderivative, we specify this result in terms of initial problem data and recover the results from Bonnans and Shapiro (2000); Shapiro (1997). Some concluding remarks close the paper in Section 4.

Preliminaries
The notation used in this note is fairly standard and follows Bonnans and Shapiro (2000); Rockafellar and Wets (1998).

Basic notation
By R^n_+, we denote the nonnegative orthant of R^n. Let R^{m×n} be the set of all rectangular matrices with m rows and n columns, and let O be the all-zero matrix of appropriate dimensions. A Euclidean space X, i.e., a finite-dimensional Hilbert space, will be equipped with the inner product ⟨·,·⟩ : X × X → R and the associated induced norm ‖·‖. The space S^n of all real symmetric n × n-matrices is equipped with the Frobenius inner product given by

⟨A, B⟩ := trace(A B) for all A, B ∈ S^n

and the associated induced Frobenius norm.
For an arbitrary Euclidean space X and some nonempty, convex set A ⊂ X, we use

A° := {x* ∈ X : ⟨x*, x⟩ ≤ 0 for all x ∈ A}

in order to denote the polar cone of A, which is always a closed, convex cone, and

A^⊥ := {x* ∈ X : ⟨x*, x⟩ = 0 for all x ∈ A}

for the annihilator of A, which is a subspace of X. The distance function dist(·, A) : X → R is given by dist(x, A) := inf{‖y − x‖ : y ∈ A} for all x ∈ X. For x ∈ A, we make use of

T_A(x) := {u ∈ X : ∃{t_k} ⊂ (0, ∞), ∃{u_k} ⊂ X with t_k ↓ 0, u_k → u, and x + t_k u_k ∈ A for all k ∈ N}

in order to represent the tangent (or Bouligand) cone to A at x. The associated polar cone, i.e.,

N_A(x) := T_A(x)°,

is the normal cone to A at x. Note that T_A(x) and N_A(x) are closed, convex cones. For a twice continuously differentiable mapping F : X → Y between Euclidean spaces X and Y as well as some point x ∈ X, F′(x) : X → Y is the linear operator which represents the first derivative of F at x. Similarly, F′′(x) : X × X → Y is the bilinear mapping which represents the second derivative of F at x. Partial derivatives are denoted in an analogous way.
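For closed, convex sets, these cones can be tested numerically via projections: x* ∈ N_A(x) holds if and only if the Euclidean projection of x + x* onto A returns x. A minimal sketch for the nonnegative orthant A := R^2_+ (the set, point, and test vectors below are illustrative choices, not data from this note):

```python
import numpy as np

def proj_orthant(z):
    """Euclidean projection onto the nonnegative orthant R^n_+."""
    return np.maximum(z, 0.0)

x = np.array([1.0, 0.0])          # boundary point of R^2_+

# Here N_{R^2_+}(x) = {0} x R_-; the projection test detects membership.
assert np.allclose(proj_orthant(x + np.array([0.0, -2.0])), x)      # normal
assert not np.allclose(proj_orthant(x + np.array([1.0, 0.0])), x)   # not normal

# Polarity: a normal direction forms a nonpositive inner product with
# every tangent direction u in T_{R^2_+}(x) = R x R_+.
for u in [np.array([1.0, 0.0]), np.array([-3.0, 2.0])]:
    assert np.array([0.0, -2.0]) @ u <= 0.0
```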
Finally, for a lower semicontinuous function ϕ : X → R ∪ {∞}, some x ∈ X such that ϕ(x) < ∞, and some x* ∈ X, the function d²ϕ(x, x*) : X → R ∪ {±∞} given by

d²ϕ(x, x*)(u) := liminf_{t ↓ 0, u′ → u} [ϕ(x + t u′) − ϕ(x) − t ⟨x*, u′⟩] / (t²/2)

is referred to as the second subderivative of ϕ at x with x*. The recent study Benko and Mehlitz (2023) reports on the calculus of this variational tool and its usefulness for the derivation of second-order optimality conditions in nonlinear optimization, and these findings can be partially extended even to infinite-dimensional situations, see Christof and Wachsmuth (2018); Wachsmuth and Wachsmuth (2022). Here, we are particularly interested in the second subderivative of indicator functions δ_A : X → R ∪ {∞}, associated with closed, convex sets A ⊂ X, given by

∀x ∈ X:  δ_A(x) := 0 if x ∈ A, and δ_A(x) := ∞ otherwise.

For this particular function, the definition of the second subderivative yields

d²δ_A(x, x*)(u) = liminf_{t ↓ 0, u′ → u, x + t u′ ∈ A} (−2 ⟨x*, u′⟩ / t),

and one can easily check that d²δ_A(x, x*)(u) = ∞ holds whenever u ∉ T_A(x). In the case where u ∈ T_A(x) and ⟨x*, u⟩ > 0, d²δ_A(x, x*)(u) = −∞ holds. Thus, only the case u ∈ T_A(x) ∩ {x*}^⊥ is interesting. In turn, for given x ∈ A and u ∈ T_A(x), the consideration of the second subderivative is only reasonable if x* ∈ N_A(x) ∩ {u}^⊥.
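To build intuition for the definition, recall that for a twice continuously differentiable ϕ and the choice x* := ∇ϕ(x), the second subderivative reduces to the quadratic form of the Hessian. The following finite-difference sketch illustrates this on an invented smooth function and point (neither taken from this note):

```python
import numpy as np

# Illustrative smooth function phi(z) = z1^2 * z2 with gradient and Hessian.
phi = lambda z: z[0] ** 2 * z[1]
grad = lambda z: np.array([2.0 * z[0] * z[1], z[0] ** 2])
hess = lambda z: np.array([[2.0 * z[1], 2.0 * z[0]], [2.0 * z[0], 0.0]])

x = np.array([1.0, 2.0])
u = np.array([1.0, 1.0])
xstar = grad(x)

# Difference quotient from the definition (with u' fixed to u and small t):
t = 1e-4
quotient = (phi(x + t * u) - phi(x) - t * (xstar @ u)) / (0.5 * t ** 2)

# For C^2 functions, d^2 phi(x, grad phi(x))(u) = <hess(x) u, u>.
assert abs(quotient - u @ hess(x) @ u) < 1e-3
```

For indicator functions the liminf over feasible directions u′ → u replaces this Taylor-expansion picture, which is exactly what the computation for δ_{S^m_+} below has to handle.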

Matrix analysis
In order to carry out our analysis related to the cone of all positive semidefinite matrices, we first need to introduce some further notation. Fix some m ∈ N such that m ≥ 2. By S^m_+, S^m_− ⊂ S^m, we denote the cones of all positive semidefinite and negative semidefinite matrices, respectively. For each matrix Y ∈ S^m_+, there exists an orthogonal matrix P ∈ R^{m×m} such that Y = P^⊤ M P, where M ∈ R^{m×m} is the diagonal matrix whose diagonal is made of the eigenvalues of Y, ordered non-increasingly. We refer to this representation as an ordered eigenvalue decomposition of Y. Throughout the paper, we will denote the index sets of (row) indices of M associated with the positive and zero eigenvalues of Y by π and ω, respectively. For later use, let us also mention that Y† = P^⊤ M† P holds for the Moore–Penrose pseudoinverse of Y, and that M† results from M by inverting its positive diagonal elements. For arbitrary matrices A ∈ S^m and index sets I, J ⊂ {1, …, m}, we use A_IJ to denote the matrix which results from A by deleting those rows and columns whose indices do not belong to I and J, respectively. Furthermore, we set A^P := P A P^⊤ and A^P_IJ := (A^P)_IJ. In (Bonnans and Shapiro, 2000, Section 5.3.1), the formula

T_{S^m_+}(Y) = {V ∈ S^m : V^P_ωω ∈ S^|ω|_+}  (2.1)

has been established. Furthermore, (Hiriart-Urruty and Malick, 2012, Section 4.2.4) gives

N_{S^m_+}(Y) = {Y* ∈ S^m : (Y*)^P_ππ = O, (Y*)^P_πω = O, (Y*)^P_ωω ∈ S^|ω|_−}.  (2.2)

In the course of this note, we will need a criterion for semidefiniteness of block matrices. The following lemma is taken from (Boyd and Vandenberghe, 2004, Appendix A.5.5).
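Before turning to that lemma, the notation just introduced can be sanity-checked numerically. The sketch below builds an ordered eigenvalue decomposition for a hypothetical Y ∈ S^3_+ with a nontrivial kernel, verifies the pseudoinverse identity Y† = P^⊤ M† P, and exhibits a matrix from N_{S^m_+}(Y) in the sense of (2.2); all concrete matrices are invented for illustration:

```python
import numpy as np

# Hypothetical PSD matrix Y with eigenvalues 3, 1, 0 (so pi = {1, 2}, omega = {3}).
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # random orthogonal matrix
M = np.diag([3.0, 1.0, 0.0])                       # ordered non-increasingly
P = Q.T                                            # paper's convention: Y = P^T M P
Y = P.T @ M @ P

# Moore-Penrose pseudoinverse via the decomposition: Y^dagger = P^T M^dagger P,
# where M^dagger inverts only the positive diagonal entries of M.
Mdag = np.diag([1.0 / 3.0, 1.0, 0.0])
assert np.allclose(np.linalg.pinv(Y), P.T @ Mdag @ P)

# Membership in N_{S^m_+}(Y) per (2.2): only the omega-omega block of (Y*)^P
# is nonzero, and it is negative semidefinite.
Ystar = P.T @ np.diag([0.0, 0.0, -2.0]) @ P
assert np.allclose(Ystar @ Y, 0.0)                  # hence <Y*, Y> = 0
assert np.all(np.linalg.eigvalsh(Ystar) <= 1e-12)   # Y* is NSD
```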
Lemma 2.1. Let m_1, m_2 ∈ N be positive integers. Furthermore, let A ∈ S^{m_1}_+ be positive definite, and let B ∈ R^{m_1×m_2} as well as C ∈ S^{m_2} be arbitrarily chosen. For m := m_1 + m_2, we consider the block matrix

Q := [A, B; B^⊤, C] ∈ S^m.

Then Q ∈ S^m_+ holds if and only if C − B^⊤ A^{−1} B ∈ S^{m_2}_+.

Second-order sufficient optimality conditions in nonlinear semidefinite optimization

Let m ∈ N such that m ≥ 2 be fixed. Throughout the section, we consider the nonlinear semidefinite optimization problem

min_x f(x)  subject to  F(x) ∈ S^m_+,  (NSDP)

where f : X → R and F : X → S^m are twice continuously differentiable mappings and X is some Euclidean space. Let F ⊂ X be the feasible set of (NSDP). For α ≥ 0, we introduce the generalized Lagrangian function L^α : X × S^m → R given by

L^α(x, Y*) := α f(x) + ⟨Y*, F(x)⟩.

Furthermore, for x ∈ F, we exploit the critical cone associated with (NSDP) given by

C(x) := {u ∈ X : F′(x)u ∈ T_{S^m_+}(F(x)), f′(x)u ≤ 0}.

Note that, due to (2.1), this cone can be computed explicitly as soon as an ordered eigenvalue decomposition of F(x) is at hand. For u ∈ C(x) and α ≥ 0, the associated directional Lagrange multiplier set is given by

Λ^α(x, u) := {Y* ∈ N_{S^m_+}(F(x)) ∩ {F′(x)u}^⊥ : ∇_x L^α(x, Y*) = 0},

and this set can be computed via (2.2).
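To illustrate how (2.1) makes the critical cone computable, the following sketch tests membership u ∈ C(x) for a small hypothetical instance; the data f, F, and the reference point are invented for illustration and do not stem from this note:

```python
import numpy as np

# Hypothetical instance: X = R^2, f(x) = x1, F(x) = [[1 + x1, x2], [x2, x1]].
# At xbar = (0, 0), F(xbar) = diag(1, 0) is already an ordered eigenvalue
# decomposition with P = I, pi = {1}, omega = {2}.
grad_f = np.array([1.0, 0.0])

def F_prime(u):
    """Directional derivative F'(xbar)u of the hypothetical mapping F."""
    return np.array([[u[0], u[1]], [u[1], u[0]]])

def in_critical_cone(u):
    u = np.asarray(u, dtype=float)
    V = F_prime(u)
    # By (2.1), F'(xbar)u lies in the tangent cone iff its omega-omega block,
    # here the scalar V[1, 1], is PSD; additionally, f'(xbar)u <= 0 is required.
    return V[1, 1] >= 0.0 and grad_f @ u <= 0.0

assert in_critical_cone([0.0, 1.0])       # critical direction
assert not in_critical_cone([1.0, 0.0])   # violates f'(xbar)u <= 0
assert not in_critical_cone([-1.0, 0.0])  # leaves the tangent cone
```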
The following second-order sufficient optimality condition for (NSDP) can be distilled from the more general result (Benko et al., 2022, Theorem 3.3), which has been proven via a direct contradiction argument, and a proof of it which is merely based on calculus rules for the second subderivative is stated in (Benko and Mehlitz, 2023, Theorem 5.2). A slightly less general result, which clearly motivated the authors of Benko et al. (2022), can be found in (Mohammadi et al., 2021, Theorem 7.1).
Theorem 3.1. Let x ∈ F be chosen such that, for each u ∈ C(x) \ {0}, there are α ≥ 0 and Y* ∈ Λ^α(x, u) such that

∇²_xx L^α(x, Y*)[u, u] + d²δ_{S^m_+}(F(x), Y*)(F′(x)u) > 0.  (3.1)

Then x is an essential local minimizer of second order for (NSDP), i.e., there are ε > 0 and β > 0 such that

max{f(x′) − f(x), dist(F(x′), S^m_+)} ≥ β ‖x′ − x‖²  for all x′ ∈ X with ‖x′ − x‖ ≤ ε.  (3.2)

In particular, x is a strict local minimizer of (NSDP).
We also note that the growth condition (3.2) is slightly more restrictive than the requirement that there are ε > 0 and β > 0 such that

f(x′) ≥ f(x) + β ‖x′ − x‖²  for all x′ ∈ F with ‖x′ − x‖ ≤ ε,

which is referred to as the second-order growth condition associated with (NSDP) at x in the literature.
In order to turn (3.1) into a valuable second-order optimality condition, the appearing second subderivative of δ_{S^m_+} has to be evaluated or, at least, estimated from below. For example, this strategy has been used in Benko and Mehlitz (2023) in order to infer second-order sufficient conditions in nonlinear second-order cone programming, and it turned out to be much simpler than the more technical verification strategies from Bonnans and Ramírez (2005); Hang et al. (2020). Here, we present a similar analysis for nonlinear semidefinite programs. As already remarked in Benko and Mehlitz (2023), obtaining second-order necessary optimality conditions based on second subderivatives is often not reasonable since this would come along with comparatively strong regularity conditions which are necessary in order to get the calculus rules for second subderivatives working.
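The value of this second subderivative can be anticipated numerically: the curvature term recovered in this note, cf. (Mohammadi and Sarabi, 2020, Example 3.7), is −2⟨Y*, V Y† V⟩, and the difference quotients from the definition reproduce it exactly on a small example (the 2 × 2 matrices below are illustrative choices):

```python
import numpy as np

# Illustrative data: Y in S^2_+ with kernel, Y* in N_{S^2_+}(Y), and a
# direction V in T_{S^2_+}(Y) with <Y*, V> = 0.
Y = np.diag([2.0, 0.0])
Ystar = np.diag([0.0, -1.0])
V = np.array([[1.0, 3.0], [3.0, 0.0]])

closed_form = -2.0 * np.trace(Ystar @ V @ np.linalg.pinv(Y) @ V)

# Recovering sequence: correct the omega-omega block of V so that
# Y + t V_t remains positive semidefinite for every small t > 0.
for t in [1e-1, 1e-2, 1e-3]:
    V_t = V.copy()
    V_t[1, 1] = t * V[1, 0] * V[0, 1] / Y[0, 0]    # t * V_op * M_pp^{-1} * V_po
    assert np.min(np.linalg.eigvalsh(Y + t * V_t)) >= -1e-12   # feasibility
    quotient = -2.0 * np.trace(Ystar @ V_t) / t    # indicator term vanishes
    assert abs(quotient - closed_form) < 1e-9

# closed_form here equals 9.0
```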
In the subsequent lemma, an explicit formula for the second subderivative of δ_{S^m_+} is presented.

Lemma 3.2. Let Y ∈ S^m_+ and Y* ∈ N_{S^m_+}(Y) be given. Then, for each V ∈ T_{S^m_+}(Y) ∩ {Y*}^⊥, we have

d²δ_{S^m_+}(Y, Y*)(V) = −2 ⟨Y*, V Y† V⟩.

Proof. Let Y = P^⊤ M P be an ordered eigenvalue decomposition of Y with orthogonal matrix P ∈ R^{m×m} and diagonal matrix M ∈ S^m as well as the index sets π and ω as defined in Section 2.2. From Y* ∈ N_{S^m_+}(Y), we find (Y*)^P_ππ = O, (Y*)^P_πω = O, and (Y*)^P_ωω ∈ S^|ω|_−. Furthermore, V ∈ T_{S^m_+}(Y) gives V^P_ωω ∈ S^|ω|_+. From ⟨Y*, V⟩ = 0 and orthogonality of P, we have

0 = ⟨Y*, V⟩ = ⟨(Y*)^P, V^P⟩ = ⟨(Y*)^P_ωω, V^P_ωω⟩,

which gives ⟨(Y*)^P_ωω, V^P_ωω⟩ = 0. For given V′ ∈ S^m and sufficiently small t > 0, M_ππ + t (V′)^P_ππ is positive definite, and since Y + tV′ ∈ S^m_+ and M + t(V′)^P ∈ S^m_+ are equivalent by orthogonality of P, Lemma 2.1 can be used to infer that, for small enough t > 0, Y + tV′ ∈ S^m_+ is equivalent to

(V′)^P_ωω − t (V′)^P_ωπ (M_ππ + t (V′)^P_ππ)^{−1} (V′)^P_πω ∈ S^|ω|_+.

Thus, from (Y*)^P_ωω ∈ S^|ω|_−, we find

d²δ_{S^m_+}(Y, Y*)(V)
 = liminf_{t ↓ 0, V′ → V, Y + tV′ ∈ S^m_+} (−2 ⟨Y*, V′⟩ / t)
 = liminf_{t ↓ 0, V′ → V, Y + tV′ ∈ S^m_+} (−2 ⟨(Y*)^P_ωω, (V′)^P_ωω⟩ / t)
 ≥ liminf_{t ↓ 0, V′ → V} (−2 ⟨(Y*)^P_ωω, (V′)^P_ωπ (M_ππ + t (V′)^P_ππ)^{−1} (V′)^P_πω⟩)
 = −2 ⟨(Y*)^P_ωω, V^P_ωπ (M_ππ)^{−1} V^P_πω⟩
 = −2 ⟨Y*, V Y† V⟩.

Finally, we construct particular sequences {t_k} ⊂ (0, ∞) and {V_k} ⊂ S^m which show that this lower estimate is sharp. Therefore, let {t_k} ⊂ (0, ∞) be a null sequence such that M_ππ + t_k V^P_ππ is invertible for each k ∈ N. Define

V_k := P^⊤ [V^P_ππ, V^P_πω; V^P_ωπ, V^P_ωω + t_k V^P_ωπ (M_ππ + t_k V^P_ππ)^{−1} V^P_πω] P.

By construction, we also have V_k → V as k → ∞, and rearrangements lead to

(V_k)^P_ωω − t_k (V_k)^P_ωπ (M_ππ + t_k (V_k)^P_ππ)^{−1} (V_k)^P_πω = V^P_ωω ∈ S^|ω|_+.

Thus, Lemma 2.1 gives Y + t_k V_k ∈ S^m_+ for each sufficiently large k ∈ N. Reprising the above steps for the estimation of the lower limit and recalling ⟨(Y*)^P_ωω, V^P_ωω⟩ = 0, we find

−2 ⟨Y*, V_k⟩ / t_k = −2 ⟨(Y*)^P_ωω, V^P_ωπ (M_ππ + t_k V^P_ππ)^{−1} V^P_πω⟩ → −2 ⟨Y*, V Y† V⟩ as k → ∞.

This already completes the proof.

Let us note that the assertion of Lemma 3.2 has been proven in (Mohammadi and Sarabi, 2020, Example 3.7) with the aid of some deeper results from Torki (1999) addressing variational properties of eigenvalue functions. In contrast, our proof is rather elementary.

Combining this result with Theorem 3.1, we obtain fully explicit second-order sufficient optimality conditions for (NSDP).

Corollary 3.3. Let x ∈ F be chosen such that, for each u ∈ C(x) \ {0}, there are α ≥ 0 and Y* ∈ Λ^α(x, u) such that

∇²_xx L^α(x, Y*)[u, u] > 2 ⟨Y*, F′(x)u F(x)† F′(x)u⟩.  (3.3)

Then x is an essential local minimizer of second order for the associated optimization problem (NSDP).

Let us point the reader's attention to the simplicity of the above arguments which have been used to obtain this second-order optimality condition. Theorem 3.1 is proven via a standard contradiction argument. Further, the computation of the appearing second subderivative of δ_{S^m_+} is completely elementary and relies on the standard approach of working with an ordered eigenvalue decomposition. In (Shapiro, 1997, Theorem 9) and (Bonnans and Shapiro, 2000, Section 5.3.5), related second-order sufficient conditions, based on the same expression for the curvature term, i.e., the right-hand side in (3.3), but with a weaker growth condition, were obtained using the theory of second-order tangent sets. This approach is much more technical and relies on deeper mathematics such as second-order directional differentiability of the smallest eigenvalue of a positive semidefinite matrix.


Concluding remarks

In this note, we computed the second subderivative of the indicator function associated with the cone of all positive semidefinite matrices, and this finding was used to obtain second-order sufficient optimality conditions in nonlinear semidefinite optimization. This procedure recovered the findings from Shapiro (1997) in an elementary way. In the future, it needs to be studied whether this second-order sufficient condition can be employed beneficially in numerical optimization, as in Hang et al. (2022), where the local analysis of a multiplier-penalty method associated with second-order cone programs is investigated. Furthermore, it seems reasonable to check whether our approach to second-order sufficient conditions yields comprehensive results when applied to optimization problems with semidefinite cone complementarity constraints, see e.g. Ding et al. (2014); Liu and Pan (2022); Wu et al. (2014). Finally, we note that

S^m_+ = {A ∈ S^m : x^⊤ A x ≥ 0 for all x ∈ R^m}

holds, so S^m_+ is a special instance of the closed, convex cone

S^m_+(K) := {A ∈ S^m : x^⊤ A x ≥ 0 for all x ∈ K},

where K ⊂ R^m is an arbitrary closed, convex cone. In the literature, S^m_+(K) is referred to as the set-semidefinite or set-copositive cone associated with K, and for K := R^m_+, the popular copositive cone is obtained, see Bomze (2012); Burer (2015); Dür (2010); Dür and Rendl (2021) for further information about this cone and applications of copositive optimization. Following the approach of this note, it might be possible to obtain second-order sufficient conditions for nonlinear optimization problems involving S^m_+(K). However, it is well known that the variational geometry of S^m_+(K) is much more challenging for general K than for K := R^m, so the necessary computations might be much more involved than the ones from Lemma 3.2.