Eigenvalue-based algorithm and analysis for nonconvex QCQP with one constraint

A nonconvex quadratically constrained quadratic programming (QCQP) with one constraint is usually solved via a dual SDP problem, or Moré’s algorithm based on iteratively solving linear systems. In this work we introduce an algorithm for QCQP that requires finding just one eigenpair of a generalized eigenvalue problem, and involves no outer iterations other than the (usually black-box) iterations for computing the eigenpair. Numerical experiments illustrate the efficiency and accuracy of our algorithm. We also analyze the QCQP solution extensively, including difficult cases, and show that the canonical form of a matrix pair gives a complete classification of the QCQP in terms of boundedness and attainability, and explain how to obtain a global solution whenever it exists.


Introduction
A quadratically constrained quadratic programming (QCQP) is an optimization problem of the form [4,Sec. 4.4] minimize x∈R n f (x) := x Ax + 2a x, where A and B i are n × n symmetric matrices and a, b i ∈ R n , β i ∈ R. When A and B i are all positive semidefinite, QCQP (1) is a convex problem, for which efficient algorithms are available such as the interior-point method [4,Ch. 11]. By contrast, when convexity is not assumed, QCQP is generally a difficult problem, in fact NPhard in general [32]. Even when the constraints are all affine, i.e., B i = 0, the decision problem formulation is known to be NP-complete [40]. All these are evidence that nonconvex QCQP is generally computationally intractable. One exception to this difficulty is when k = 1, that is, when there is just one constraint: This class of problems, namely QCQP with one quadratic constraint, includes the trust-region subproblem (TRS) as a special case [5], [26,Ch. 4], in which B is positive definite, b = 0 and β < 0. TRS is commonly employed for nonlinear optimization, and a number of efficient algorithms for solving TRS are available, e.g. [13,25,29,31]. The QCQP (2) is sometimes called the generalized TRS [24], and has applications in double well potential problems [9] and compressed sensing for geological data, in which A is positive semidefinite and B is indefinite [18]. In this paper we refer to QCQP with one quadratic constraint (2) simply as QCQP, unless otherwise mentioned. A dual formulation for QCQP (2) can be written as an semidefinite programming (SDP) maximize λ,γ γ subject to A + λB a + λb (a + λb) λβ − γ 0, where X Y means X − Y is positive semidefinite. Remarkably, assuming Slater's condition is satisfied, the SDP (3) is a dual problem with no duality gap, that is, the solution γ to (3) is equal to the optimal value of the QCQP (2), even when (2) is nonconvex; see [4,App. B] for details and a proof, which relies on the S-lemma [27].
The solution x can be obtained via the dual variable X = x x of (3) in the non-hard case (otherwise rank(X ) is higher [29]). Nonconvex QCQP with one constraint (2) is thus a notable class of nonconvex optimization problems that can be solved in polynomial time. However, this SDP approach is not very efficient as solving the SDP (3) by the standard interior-point method involves an iterative process, each iteration requiring O(n 3 ) operations [4, § 11.8.3] with a rather large constant, which limits the practical matrix size to, say, n ≤ 1000. Alternative strategies have been proposed in the literature. Moré [24] analyzes QCQP (2) and describes an algorithm that extends the algorithm of [25] for TRS. This algorithm is of an iterative nature, and is not matrix-free (a property desirable for dealing with large-sparse problems). Many other iterative algorithms have been proposed for TRS, but as indicated in the experiments in [1] for TRS, a one-step algorithm based on eigenvalues can significantly outperform such algorithms. Another approach [9,16] is to note the Lagrange dual problem can be expressed equivalently as This is a concave maximization problem, hence can be solved by e.g. a gradient descent method or Newton's method. However, computing the gradient already involves O(n 3 ) operations, let alone the Hessian, and typically a rather large number of iterations is needed for convergence.
The main contribution of this paper is the development of an efficient algorithm for QCQP (2) that is strictly feasible and (A, B) is a definite pair with A + λB 0 for some λ ≥ 0 (which we call definite feasible), which we argue is a generic condition for QCQP to be bounded. 1 The running time is O(n 3 ) when the matrices A, B are dense, and it can be significantly faster if the matrices are sparse. The algorithm requires (i) finding aλ ≥ 0 such that A +λB is positive definite, and (ii) computing an extremal eigenpair of an (2n + 1) × (2n + 1) generalized eigenvalue problem. We emphasize that the algorithm requires just one eigenvalue problem. The algorithm is easy to implement given a routine for computing an extremal (largest or smallest) eigenpair, for which high-quality software based on shift-invert Arnoldi is publically available such as ARPACK [2,22]. We present experiments that illustrate the efficiency and accuracy of our algorithm compared with the SDP-based approach. Our algorithm is based on the framework established in [1,11,17] of formulating the KKT conditions as an eigenvalue problem.
In addition, this paper also contributes to the theoretical understanding of QCQP, treating those that are not definite feasible. Specifically, it is a nontrivial problem to decide whether a given QCQP is bounded or not, and if bounded, whether the infimum is attainable. We present a classification of QCQP in terms of feasibility/boundedness/attainability, based on the canonical form of the symmetric pair (A, B) under congruence. We shall see that the canonical form provides rich information on the properties of the associated QCQP. We thus establish a process that (in exact arithmetic) solves QCQP completely in the sense that feasibility/boundedness/attainability is checked and the optimal objective value and a global solution is computed if it exists.
Broadly speaking, this paper is a contribution in the direction of "global optimization via eigenvalues". To our knowledge the earliest reference is Gander, Golub and von Matt [11] for TRS. This algorithm was revisited and further developed recently in [1] to illustrate its high efficiency, which was extended in [34] to deal with an additional linear constraint. This paper is largely an outgrowth of [1], extending the scope from TRS to QCQP, relaxing the convexity assumption in the constraint, and fully analyzing the degenerate cases. We also note [33], which solves QCQP with an additional ball constraint (generalized CDT problem; GCDT) via a two-parameter eigenvalue problem. It is in principle possible to impose a ball constraint with sufficiently large radius to convert QCQP (2) to GCDT, then use the algorithm in [33]. However, this would be very inefficient, requiring O(n 6 ) operations: our algorithm here needs at most O(n 3 ) operations, and can be faster when sparsity structure is present.
This paper is organized as follows. In Sect. 2 we review (mostly existing) results on the optimality and boundedness of QCQP (2). Section. 3 is the heart of this paper where we derive our eigenvalue-based algorithm for definite feasible QCQP. We present numerical experiments in Sect. 4, and analyze QCQP that are not definite feasible in Sect. 5.
Notation. We denote by R(X ) the range of a matrix X , and by N (X ) the null space. X ( )0 indicates X is a positive (semi)definite matrix. I n is the n × n identity, and O n , O m×n are zero matrices of size n × n and m × n. We simply write I, O if the dimensions are clear from the context. The Moore-Penrose pseudoinverse of a matrix A is denoted by A † . x * denotes a QCQP solution with associated Lagrange multiplier λ * .

Preliminaries: optimality and boundedness of QCQP
This section collects results on QCQP that are needed for our analysis and algorithm.

QCQP with no interior feasible point
QCQP (2) has no strictly feasible point when Slater's condition is violated. Note that checking strict feasibility can be done by an unconstrained quadratic minimization problem minimize x g(x). This subsection focuses on the case min x g(x) = 0.
Since min x g(x) > −∞, the quadratic function g(x) must be convex. Since further min x g(x) = 0, we can write g(x) for some x ∈ R n as Thus, dealing with QCQP that violates Slater's condition is straightforward. In what follows we treat strictly feasible QCQP for which there exist x with g(x) < 0 (i.e., Slater's condition is satisfied).

Boundedness and attainability
We start with a necessary and sufficient condition for boundedness of a strictly feasible QCQP.
Lemma 1 (Hsia et al. [16], Thm. 5) Suppose that for QCQP (2), there exists an interior feasible point x ∈ R n such that g(x) < 0. Then f (x) is bounded below in the feasible region if and only if there exists λ ≥ 0 such that Proof This is essentially a corollary of strong duality between QCQP (2) and SDP (3), which is bounded below if and only if there exist λ ≥ 0 and γ satisfying the first constraint in (3), which is equivalent to (5).
Boundedness guarantees the existence of the optimal (infimum) value for f . On the other hand, it is worth noting that there exist QCQP that are bounded but has no solution x * . For example, consider minimize x 2 subject to − x y + 1 ≤ 0.
For any (x, y), we have x 2 ≥ 0, and by taking (x, y) = (ε, 1/ε) and ε → 0 the constraints are satisfied and the objective function approaches the infimum 0. However, no feasible (x, y) has the objective value equal to 0. Such QCQP, that is, QCQP for which the infimum cannot be attained in the feasible region, are called unattainable. A necessary and sufficient condition for unattainability is given in the following result (recall that † denotes the pseudoinverse).
A reasonable output of a numerical algorithm for such QCQP is the infimum objective value 0 with the warning that it is unattainable.

Optimality conditions
When QCQP (2) satisfies Slater's condition and has a global solution, a set of necessary and sufficient conditions is given by Moré [24]: [24]) Suppose that QCQP (2) satisfies Slater's condition. Then x * is its global solution if and only if there exist λ * ≥ 0 such that The first three conditions in (8) represent the KKT conditions, and there can be many KKT points (λ, x) satisfying these three, reflecting the nonconvexity of the problem. The final condition A + λ * B 0 specifies which of the KKT points is the solution.

Definite feasible QCQP: strictly feasible and definite
By Lemma 1, for a strictly feasible QCQP to be bounded we necessarily need A+λB 0 for someλ ≥ 0. If we further have A +λB 0, then the QCQP is clearly bounded. As we argue in Sect. 5, such cases form a "generic" class of QCQP (2) that are bounded and has a global solution. We therefore give a name for such QCQP.
Definition 1 A QCQP (2) satisfying the following two conditions is said to be definite feasible.
1. It is strictly feasible: there exists x ∈ R n such that g(x) < 0, and 2. (A, B) is definite with nonnegative shift: there existsλ ≥ 0 such that A +λB 0.
We shall treat such QCQP in detail and derive an efficient algorithm in Sect. 3. To begin with, for definite feasible QCQP there always exists a global solution x * . Theorem 2 (Moré [24]) For a definite feasible QCQP (2), there exist x * ∈ R n , λ * ≥ 0 such that the conditions (8) hold.
In the special case of TRS we have B 0, so by takingλ arbitrarily large we have A +λB 0, and since Slater's condition is trivially satisifed, it follows that TRS is a definite feasible QCQP. Similarly, if A 0, takingλ = 0 shows the pencil is definite, so such QCQP is definite feasible as long as it is strictly feasible. Indeed a number of studies focus on such cases [8,9].

Checking definite feasibility
Let us now discuss how to determine whether a given QCQP (2) is definite feasible.
In general, given a matrix pair (A, B), it is an active research area to devise algorithms for checking its definiteness (dropping the requirementλ ≥ 0), that is, checking whether there exist t ∈ R such that A sin t + B cos t 0. Such algorithms include [6,15], which also provide a value t 0 for which A sin t 0 + B cos t 0 0 if (A, B) is definite. If the pair (A, B) is determined not to be definite then QCQP is not definite feasible. If the pair (A, B) is determined to be definite, with t 0 available such that A sin t 0 + B cos t 0 0, then we compute the smallest eigenvalue λ 1 and largest eigenvalue λ 2 of the pencil Then the matrix (A cos t 0 − B sin t 0 ) + λ(A sin t 0 + B cos t 0 ) is positive semidefinite for λ ≥ λ 2 , and negative semidefinite for λ ≤ λ 1 . Hence we can rewrite the condition thus we obtain the interval t ∈ (t 1 , t 2 ) on which A sin t + B cos t 0. From this we obtain the intervalD = { 1 tan t | t ∈ (t 1 , t 2 ), sin t > 0, cos t ≥ 0} such that A + λB 0 if and only if λ ∈D. If D =D∩[0, ∞) is empty then the QCQP is not definite feasible; otherwise it is.

Eigenvalue-based algorithm for definite feasible QCQP
We now develop an eigenvalue algorithm for definite feasible QCQP. In this section we assume that a value ofλ ≥ 0 such that A +λB 0 is known, through a process such as those described in Sect. 2.4.1.
By Theorems 1 and 2, a definite feasible QCQP can be solved by solving (8) for λ * and x * . We develop an algorithm that first finds the optimal Lagrange multiplier λ * by an eigenvalue problem, then computes x * .
For λ ∈ D, define In view of the third condition in (8), the main goal is to find λ * such that γ (λ * ) = 0. This argument apparently dismisses the cases where A + λ * B is singular (then x(λ) is not well-defined) or λ * = 0 (then γ (λ) = 0 is unnecessary). Nonetheless, the analysis below will cover such cases.
Since A +λB 0, A and B are simultaneously diagonalizable, that is, there exists a nonsingular W ∈ R n×n such that W AW and W BW are both diagonal [12,Chap. 8]. Hence without loss of generality we assume that A, B are diagonal in the analysis below (our algorithm does not assume this or the knowledge of W ). Let A = diag(d 1 , . . . , d n ), B = diag(e 1 , . . . , e n ), a = [a 1 , . . . , a n ] , and b = [b 1 , . . . , b n ] (those a and b are indeed W a and W b in the original coordinate system). It is now straightforward to identify the interval D. Since when 0 < λ 1 < λ 2 < ∞ there exist i 1 and i 2 so that d i 1 +λ 1 e i 1 = 0 and d i 2 +λ 2 e i 2 = 0. γ (λ) can be explicitly expressed using x(λ) = [x 1 (λ), . . . , x n (λ)] as Therefore x i (λ) and γ (λ) are rational functions of λ. Moreover, on λ ∈ D, the function γ (λ) has the following property [24,Thm. 5.2].

Classification of definite feasible QCQP
In order to investigate the properties of λ * that satisfy (8), in particular γ (λ * ) = 0, we separate definite feasible QCQP (2) into four distinct cases, depending on the sign of γ (λ) on λ ∈ D.
(a) γ (λ) takes both nonnegative and nonpositive values. We now investigate the value of λ * for each case. First for (a), by the mean-value theorem there exists λ ∈ D such that γ (λ) = 0, and for this λ we have λ * = λ.
To deal with cases (b), (c) we use the following result.

Proposition 2
The following results hold for definite feasible QCQP (2) for which A + λB 0 on λ ∈ D.
and further e i 2 < 0 from (10), which holds even if i 2 contains multiple elements. Note also that x(λ 2 ) i are bounded constants for all i / ∈ i 2 . Hence by (12), γ (λ) is a quadratic equation in x(λ) i 2 with negative leading coefficient, and so for the assumption γ (λ 2 ) > 0 to hold, |x(λ 2 ) i 2 | cannot blow up to ∞. Hence by (11) we must have a i 2 + λ 2 b i 2 = 0, and thus x(λ 2 ) converges to the vector (11) . Now, any vector x equal to x(λ 2 ) with the i 2 th element replaced with an arbitrary number satisfies (A + λ 2 B)x = −(a + λ 2 b); we shall choose this i 2 th element of x-which we denote by y -so that g(x y ) = 0, where we made the y-dependence of x explicit. Then g(x y ) = 0 is a quadratic equation in y with negative leading coefficient e i 2 . Together with the assumption g(x −b i 2 /e i 2 ) > 0, there are two real solutions in y to g(x y ) = 0. With either root, the vector x := x y satisfies g( The the case (c) with λ 1 > 0 is similar: We need a i 1 + λ 1 b i 1 = 0, and x(λ 1 ) converges to the vector (11) with the i 1 th element x(λ 1 Define the vector x y to be equal to x(λ 1 ) except the i 1 th element y, which is set so that g(x y ) = 0. Then x := x y satisfies the two equations. We note that it is possible to prove Proposition 2 as a straightforward corollary of [9, Lemma 2].
Summarizing, the values of λ * in Theorem 2 are Figures 1, 2, 3 and 4 illustrate typical plots of γ (λ) for the four cases.

Computing the Lagrange multiplier λ *
We now consider computing λ * > 0 that satisfies (8). The material in this subsection is the key ingredient of the algorithm that we propose. We need to find λ * , x * such that Typical γ (λ) for case (a). There exists λ * such that In principle, this can be done by solving γ (λ) = 0 for λ, which is a rational equation. However, when A, B are not diagonal (as is usually the case), expressing γ (λ) explicitly in rational function form is a nontrivial task. Our approach, building upon [1,11] for the TRS, is to express γ (λ) = 0 as a generalized eigenvalue problem.
Then we show that the the optimal Lagrange multiplier λ * > 0 in (8) is an eigenvalue of M 0 + λM 1 , that is, there exists a nonzero vector z ∈ R 2n+1 such that and for λ * > 0 satisfying (8), λ = λ * satisfies Proof For λ for which det(A + λB) = 0, we have Hence if A + λ * B is nonsingular, then (18) holds with γ (λ * ) = 0, and hence we have The above proof also shows that any KKT multiplier λ satisfying the first three equations in (8) is an eigenvalue of M 0 +λM 1 . Note from (18) and the fact A+λB 0 that we either have det(M 0 +λM 1 ) = 0, indicating M 0 + λM 1 is a regular matrix pencil (thus having exactly 2n + 1 eigenvalues), or that det(M 0 +λM 1 ) = 0, in which case (λ, x(λ)) satisfies all the conditions in (8), thus is a solution for QCQP (2). It thus follows that λ * can be found (or at least a finite set containing it) via the eigenvalue problem det(M 0 + λM 1 ) = 0. Note from Proposition 1 that M 0 + λM 1 is regular unless γ (λ) is identically zero, in which case every value of λ for which A + λB 0 is an optimal Lagrange multiplier. Since this case is easy, in what follows we assume M 0 + λM 1 is regular.
We shall further show that λ * is the unique eigenvalue of M 0 + λM 1 in the interval D = (λ 1 , λ 2 ). To simplify the analysis, we apply a Möbius transformation [23] to the matrix pencil (15). Specifically, recalling thatλ ≥ 0 is known such that A +λB 0, we defineM := M 0 +λM 1 and consider the eigenvalues of The eigenvalues ξ of (19) correspond one-to-one to those λ of M 0 + λM 1 via the transformation λ =λ + ξ −1 . We consider only ξ = 0, as ξ = 0 corresponds to the eigenvalue of M 0 + λM 1 at infinity, which is irrelevant. We shall show that the largest (or smallest) real eigenvalue ξ * of (19) gives the value λ * =λ + ξ −1 * in (8). We consider separate cases depending on the sign of γ (λ), and we next show that in both cases it suffices to compute one extremal eigenpair of (19).
Theorem 4 shows that when γ (λ) > 0, the optimal Lagrange multiplier λ * can be obtained by computing the largest real eigenvalue of (19). One practical difficulty here is that by an algorithm such as shift-and-invert Arnoldi, it can be much harder to compute the largest real eigenvalue than the eigenvalue with largest real part (which can be complex). We shall now show that in fact these are the same for (19), that is, its rightmost eigenvalue is real. A similar statement was made in [1] for the special case of TRS; here we extend the result to definite feasible QCQP. Proof It suffices to prove that for every ξ = s + ti with s ≥ ξ and t = 0, we have det(M 1 + ξM) = 0, or equivalently (by (18) First consider values of λ such that det(A + λB) = 0. These are the eigenvalues Hence γ (λ + ξ −1 ) = 0, completing the proof.
The upshot is that to obtain the optimal Lagrange multiplier λ * when γ (λ) > 0, it suffices to compute the rightmost eigenpair of (19).

When γ (λ) < 0
This case can be treated in essentially the same way as above, but special treatment is necessary for the case (d).
The following is an analogue of Theorem 5.
Proof We first prove that cases (a) and (c) cannot satisfy the assumptions. Indeed by Theorem 6 we have ξ < −λ −1 < 0, so writingλ + ξ −1 = p + qi we have Since λ ,λ ∈ D, we haveλ + s −1 ∈ D. Then as in the proof of Theorem 5 we see that ξ is not a solution for (19), a contradiction. Now suppose we are in case (d). Since D is bounded below by 0, if 0 < λ ≤λ then λ ∈ D. Also, since p ≤λ always holds, if p > 0 then by 0 < p ≤λ we have p ∈ D, so as in the proof of Theorem 5 we see that ξ is not a solution for (19), again a contradiction. Thus we conclude that p = Re(λ + ξ −1 ) ≤ 0.

Pseudocode for computing λ *
We summarize the whole process for finding λ * in Algorithm 3.1.
As discussed before, we can compute the rightmost (or leftmost) eigenpair of a generalized eigenvalue problem using the Arnoldi method, which is much more efficient than computing all the eigenvalues, especially when the matrices have structure such as symmetry and/or sparsity. In Matlab the eigs command with the flag 'lr' ('sr') computes such eigenpair.

Obtaining the solution x *
Having computed the optimal Lagrange multiplier λ * , we now turn to finding the solution x * . We shall show that generically the eigenvector z obtained in Algorithm 3.1 contains the desired information on x * .
First, if the output of Algorithm 3.1 is λ * = 0, then the QCQP solution is simply −A −1 a, the solution of a linear system (see Sect. 3.4.2 for the case det(A) = 0).
For nonzero λ * , we can generically obtain the solution by computing x * = −(A + λ * B) −1 (a + λ * b), but below we show that solving such linear system is usually unnecessary.

When A + λ * B is nonsingular
If λ * > 0 and det(A + λ * B) = 0, then we can obtain x * via the eigenvector associated with λ * (which is obtained by the Arnoldi method). Suppose z = [θ y 1 y 2 ] is the computed eigenvector where θ ∈ R, y 1 , y 2 ∈ R n .
Plugging z = [θ y 1 y 2 ] into M 0 z + λ * M 1 z = 0 gives First suppose θ = 0, which holds generically. Then from the last equation we see that the solution is x * = y 1 θ . Next suppose that θ = 0. By (21), if y 1 = 0 then y 1 ∈ N (A + λ * B), and A + λ * B is singular. When y 1 = 0, by (20) we have y 2 ∈ N (A + λ * B) so again A + λ * B is singular. Thus when A + λ * B is nonsingular we necessarily have θ = 0, and thus we can obtain x * directly from the eigenvector z.

When A + λ * B is singular
When A + λ * B is singular at the λ * obtained by Algorithm 3.1, matters are more subtle. In this case we need to solve the linear system which has a singular coefficient matrix A + λ * B. A singular linear system generically does not have a solution, but (8) shows that (22) must be consistent. However, the error of a computed solution to a linear system is generally proportional to the condition number, and solving a singular linear system numerically is challenging, if not impossible.
In fact, the case where A + λ * B is singular corresponds to the well known "hard case" for the special case of TRS. For TRS, dealing with such hard cases are discussed in [25,30]. In this work we discuss dealing with the hard case for the general QCQP by forming and solving a nonsingular linear system that has the same solution as (22). The development here parallels that in [1], which is in turn based on [10].
The following theorem will be the basis for the construction of x * .
Theorem 8 For λ * satisfying (8), suppose A + λ * B is singular. Let v 1 , . . . v j be a basis for N (A + λ * B), and let w * be the solution of the linear systemÃw * = −ã, whereÃ in which α > 0 is an arbitrary positive number. Then the following hold: 1.Ã 0, in particular,Ã is nonsingular (hence w * above exists uniquely), To prove the theorem we prepare a lemma, which we will use repeatedly. (2), if x ∈ N (A + λ * B) and x Bx = 0 then x = 0.

Lemma 4 For a definite feasible QCQP
Proof We have and A +λB 0, hence x = 0.
We are now ready to prove Theorem 8.
Proof (for Theorem 8) We give a proof for each claim.

By
Since Together with the assumption FromÃu i = Bv i , we have (25). From this it follows that (A + λ * B)Ã −1 Bv i = 0, and where for the last equality we used (A +λ * B)Ã −1 Bv i = 0 and (23). Now from (8) we see that there exists x * such that a where we define L : Next, suppose that Lx = 0 and x ∈ N (A + λ * B). We show that x = 0. To this end, note that so v i BÃ −1 Bx = 0 for i = 1, . . . , j. Now since x can be written as a linear combination of v 1 , . . . , v j , it follows that x BÃ −1 Bx = 0, and byÃ −1 0 we have Bx = 0. Hence x Bx = 0, and again by Lemma 4 we conclude that x = 0. Moreover, that is, L is an idempotent matrix: indeed, it turns out that L does not depend on α. Therefore, for every v ∈ N (A + λ * B), Lv is a linear combination of v 1 , . . . , v j , so (v − Lv) ∈ N (A + λ * B), and from L(v − Lv) = Lv − L 2 v = 0 it follows from the above argument, taking x ← v − Lv, that v − Lv = 0. Hence by (26) we conclude that Let us now explain how to obtain a QCQP solution x * using Theorem 8. First when λ * > 0, by Theorem 2 A +λ * B can be singular only in cases (b) and (c). First examine case (b). For the obtained λ * we have λ * >λ, so for any nonzero so g(w * +v) < g(w * ). Moreover, there exists x * satisfying (8), so g(w * ) ≥ g(x * ) = 0.
Summarizing the above findings, we can compute an optimal solution x * by Algorithm 3.2. Note that the above argument clearly shows the solution can be non-unique; the goal here is to obtain one optimal solution.

Algorithm 3.2 Find solution x * for QCQP (2)
Take an arbitrary v ∈ N (A + λ * B) and choose x * = w * + tv so that g(x * ) = 0. end if end if

Complexity
When no structure is present and A, B are dense, the dominant cost in Algorithm 3.2 lies in finding an eigenpair and the solution of a linear system; these are both O(n 3 ). Computing N (A + λ * B) can be done by an SVD, and finding γ (λ) is mostly solving a linear system, and the other steps are all O(n 2 ). Hence the overall complexity of Algorithm 3.2 is O(n 3 ).
In comparison, the SDP-based approaches require at least O(n 3 ) in each iteration of the interior-point method [3] with a rather large constant, so we see that Algorithm 3.2 can be much more efficient.
Moreover, the dominant step of finding an extremal eigenpair can easily take advantage of the sparsity structure of A, B if present, resulting in running time much faster than O(n 3 ). This fact is illustrated in our experiments.

Numerical experiments
To illustrate the performance (speed and accuracy) of Algorithm 3.2 for solving QCQP (2), here we present Matlab experiments comparing with the SDP-based algorithm. Specifically, we compare Algorithm 3.2 with SDP solvers based on the interior-point method: SeDuMi [36], and SDPT3 [39], which we invoke via CVX [14]. We used the default values for parameters such as the stopping criterion. However, since the core of that algorithm and ours are both essentially the same as those for the TRS (excluding findingλ, which they both require), we expect that our code would outperform [24] in speed and accuracy just as in TRS.
All experiments were carried out in MATLAB version R2013a on a machine with an Intel Xeon E5-2680 processor with 64GB RAM.

Setup
We generate a "random" definite feasible QCQP with indefinite A, B as follows. First form a random positive definite K 0, formed as X X + I where X is a random n ×n matrix, obtained by MATLAB's function randn(n). Since the problem becomes illconditioned if K is close to singular, we chose K to have eigenvalues at least 1. We then setλ to be a random positive number.
We then took a random symmetric matrix B obtained by Y = randn(n); B = Y+Y', and define A via K = A +λB. We took a and b to be random vectors.
Below we report the average speed and accuracy from 50 randomly generated instances for each matrix size n.

Computingλ
In practiceλ is usually unknown in advance, and in that case our algorithm starts by computingλ. To this end we used the algorithm in [6,15] to findλ, as discussed in Sect. 2.4.1, to obtain the interval D. Although any value in D is allowed to beλ, we choseλ as the middle point of D to avoid ill-conditioning of A +λB.
In the figures we show the performance of our algorithm in two cases: (i) whenλ is known a priori, shown as "Eig", and and (ii) whenλ needs to be computed, shown as "Eigcheck". In other words, the runtime of Eigcheck is the sum of Eig and findingλ.

Newton refinement process
We use a refinement process to improve the accuracy of the solution, in particular to force the computed solution to satisfy the constraint to working precision. Suppose λ * is positive in (8). Then at the solution the constraint must hold with equality, but due to numerical errors this may not be the case with the computed solution x. Writing x = x * + δ where x * satisfies the constraint exactly, i.e., g(x * ) = 0, we have We apply Newton's method to g(x) = 0 to force x to satisfy the constraint to full accuracy. Specifically, we update x by Then We have applied this refinement process to all three algorithms. By forcing the computed x to be numerically feasible, we rule out the misleading cases where an infeasible x with a small objective function is interpreted as a "good" solution. We note that the refinement may decrease the quality of x as a numerical solution for the problem arising in the algorithm: for example, the residual in the eigenvalue equation (22) may become larger. However, for solving the original QCQP, it is more important to improve feasibility. Figure 5 shows the runtime of the three algorithms. For n ≥ 1000, SDPT3 was unable to compute a solution on our computer, so we show experiments with n ≤ 700 for this algorithm. Our algorithm and SeDuMi are able to deal with larger matrices on our machine, and we report its performance up to n = 5000.

Results
We see that whenλ is known (Eig), our algorithm is faster than SeDuMi, SDPT3 by orders of magnitude. Even if we include the time for computingλ, our algorithm (EigCheck) is still faster than SeDuMi and SDPT3. Figure 6 shows the accuracy of the computed solution. For each QCQP, let f i (i = 1, 2, 3) be the objective value of the solution computed by each of the three algorithms. We compute where f opt = f (x opt ). We report the average values i ,t i of s i , t i for each fixed matrix size (recall that we repeated 50 random examples for each n).  Figure 6 illustrates that our algorithm found solutions and objective values nearest to optimal, hence more accurate.
Sparse matrices Another strength of our method is that it can directly take advantage of the matrix sparsity structure. Specifically, for the computation of an extremal eigenpair, which is the dominant part of our algorithm, efficient eigensolvers for large-sparse matrices are widely available [2,22], and implemented for example in Matlab's eigs command.
To illustrate this we generated QCQP as before, but with A, B sparse. Here we assume thatλ is known, and skip its computation, showing the runtime of only Eig; otherwise the code spends the majority of the runtime in findingλ (unless the tridiagonal structure is fully exploited in the detection process). Similarly, SeDuMi and SDPT3 are not shown here, as their speed remained about the same as in the dense case for n ≤ 700, hence impractical for n ≥ 10 3 . Here we examine the runtime and accuracy of our algorithm Eig for varying matrix size n from 10 3 to as large as 10 6 . We test with two types of sparse matrices: tridiagonal and random sparse (generated using MATLAB's sprandsym).
In Figs. 7, 8 and 9 we verify that when the matrices A, B are highly sparse, our method runs faster than O(n 3 ); here it scaled like O(n 2 ) for the tridiagonal case, and also for the random sparse case when the number of nonzeros per row is fixed. The ill-conditioned case By taking K = X X + ε I for a very small ε > 0, K is close to singular. We took ε = 10 0 , 10 −1 , . . . , 10 −12 , n = 200 and Fig. 10 shows the results. Since the runtime did not vary significantly, we do not show the runtime. For the accuracy, we also compare with our Matlab implementation of Moré's algorithm [24]. In Figure 10, we observe that our algorithm computed the optimal value reliably even in the ill-conditioned case, unlike the SDP-based algorithms. Moré's algorithm gave even better accuracy here: recall that this algorithm is an extension of the classical Moré-Sorensen algorithm for TRS [35], which is iterative in nature (solving a linear system in each iteration), and not matrix-free.

QCQP that are not definite feasible
Thus far we have focused on the definite feasible QCQP and derived an eigenvaluebased algorithm that is fast and accurate. We now develop an analysis that accounts for "non-generic" QCQP that are not necessarily definite feasible (since the discussion in Sect. 2.1 still holds, we still focus on the strictly feasible case). The key tool for our analysis is the canonical form of a symmetric pair under congruence, which we review next.

The canonical form of ( A, B) under congruence
For a pair of symmetric matrices (A, B) the canonical form under congruence [21,37,38], shown below, is the simplest form taken by W (A + λB)W where W is nonsingular. We define the ⊕ operator as Here when m > 1, and for m = 1, and ν j = 0, δ j , η j = ±1. The form in (27), (28) (29) is the canonical form of (A, B) under congruence.
This theorem shows that by congruence transformation, a symmetric matrix pair (A, B) can be block diagonalized with three types of diagonal blocks (27), (28) and (29). Each block corresponds to an eigenvalue or singular part of the pencil A + λB as summarized below.
1. The blocks in the right-hand side of (27) correspond to a singular part; any matrix pair that possess these blocks is singular, that is, det(A + λB) = 0 for every λ. 2. The blocks (28) correspond to real finite (right term) and infinite (left term) eigenvalues. The right terms are the "natural" extensions of the Jordan block in standard eigenvalue problems. k j , l j are the size of the Jordan blocks. 3. The blocks (29) correspond to nonreal eigenvalues, which must appear in conjugate pairs. Again, m j is the Jordan block size.
The main message of this section is that the canonical form under congruence contains full information about QCQP (2). While this work appears to be the first to use the canonical form in the analysis of QCQP, related results have been presented in the literature. The paper [9] also shows that if the matrices A, B are diagonalizable by congruence and this congruence transformation is known, the dual problem can be solved by linear programming. The preprint [16] also investigates the matrix pencil and illustrates why QCQP is nontrivial when the pencil is not simultaneously diagonalizable under congruence. Here we clarify the situation, treating extensively the difficult cases that are not definite feasible, and characterizing the feasibility/boundedness/attainability with respect to the canonical form of the pair (A, B) under congruence. The manuscript [19] shows that if the canonical form is known and the QCQP is bounded, a SOCP reformulation is possible to obtain the solution.

Implication of canonical form for QCQP boundedness
Now we turn to the implications of Theorem 9 for QCQP, first focusing on the condition for QCQP to be bounded.
In Sect. 2.1 we dealt with the case where the feasible region has no interior point, and the analysis made no assumption on definite feasibility. Hence, here we assume Slater's condition, which allows us to invoke Lemma 1 and Theorem 1. The first observation is that the necessary condition A + λB 0 in (5) for QCQP to be bounded restricts the admissible canonical forms of A + λB from the general form in Theorem 9 to the following.
Proof For A + λB 0 to hold, each block in Theorem 9 needs to be positive semidefinite. We examine each of the blocks of the form (27), (28) and (29). First consider the block (29), corresponding to the nonreal eigenvalues. When m j > 1, the (2, 2), (2m j , 2), and (2m j , 2m j ) elements are respectively 0, −ν j and 0. Thus the (29) blocks cannot be positive semidefinite, regardless of the value of λ. Therefore we need m j = 1, but then (29) is and we look for conditions under which this is semidefinite. For this to happen we need the (1, 1) and (2, 2) elements to be nonnegative, which means we need ν j = 0, a contradiction. Thus the blocks (29) cannot exist. Similarly, the second term in (27) cannot exist since its (1, 1), (1, 2ε j ), (2ε j , 2ε j ) elements are respectively 0,1 and 0.
For the first term in (28), if k j > 1 the (1, k j ) and (k j , k j ) elements are respectively δ j and 0, so again we need k j = 1. Then δ j (F 1 + λG 1 ) = [δ j ] ≥ 0 so δ j = 1, and Finally, consider the second term in (28). When l j > 1 the (1, l j ) and (l j , l j ) elements are respectively η j (λ + α j ) and 0, so the only value of λ for which η j ((λ + α j )F l j + G l j ) 0 is λ = −α j . Hence we need η j G l j 0, and the only value of l j > 1 for which this holds is l j = 2, in which case η j = 1. Moreover, we need λ = −α j to hold simultaneously for all j = 1, . . . , q, so α j = θ for each j (they are all the same), and by λ ≥ 0 we have θ ≤ 0. In addition, when l j = 1 we have η j ((λ + α j )F l j + G l j ) = η j (λ + α j ).
Given A, B satisfying (30), we next examine the values of λ for which A +λB 0. (30). Then the values of λ for which A + λB 0 are the intersection of

Proposition 4 Let A, B ∈ S n by symmetric matrices satisfying
A proof is a straightforward examination of each term in (30).
The above results imply that for a bounded QCQP, the pencil A + λB cannot have nonreal eigenvalues. Furthermore, Jordan blocks must be of size at most two, and when (30) contains q 2 ≥ 1 blocks of size two J (λ; θ) (the QCQP in (6) is one such example with q 2 = 1), the corresponding eigenvalue θ need to be all the same for all the q 2 blocks, and moreover the value of λ with A + λB 0 is restricted to just one value, namely λ = −θ , corresponding to the block J (−θ ; θ) = 1 0 0 0 , and this value λ = −θ needs to satisfy q 1 j=1 η j [λ + α j ] 0 for A + λB 0 to hold.

Characterizing bounded QCQP via the canonical form
Recall from Lemma 1 that the QCQP is bounded if and only if there exists λ ≥ 0 such that A + λB 0 and a + λb ∈ R(A + λB).
If (A, B) is definite so that A + λB 0 for some λ, then both conditions in (32) are satisfied trivially. However, these conditions are not straightforward to verify when the pair (A, B) is semidefinite but not definite.
Here we show that the conditions (32) can be written explicitly using the canonical form of symmetric pencils by congruence. Essentially this specifies the types of A, B for which the QCQP is solvable.
We start by examining the first condition in (32), the semidefiniteness of the pair (A, B). As we saw in Theorem 10, this requirement restricts the canonical form to (30); here without loss of generality we assume the −α j are arranged in nondecreasing order. Recall that J (λ; θ) is a Jordan block corresponding to a real eigenvalue, whose size is here restricted to 2 × 2. The so-called sign characteristics η j ∈ {1, −1} must satisfy certain conditions. We separate into two cases depending on the presence of Jordan blocks.
-if a block J (λ; θ) is present then the θ values must be all the same, and λ = −θ is the only value for which A + λB 0. The requirement on η i is η i = 1 if α j > θ, and η j = −1 if α j < θ. For the real and semisimple eigenvalues α j = θ , the corresponding sign characteristic η j is allowed to be either 1 or −1.
We repeat that in the second case the set of λ for which A + λB 0 is a point. In the first case, it is the interval [−α j , −α j+1 ]. In the special case where the pair (A, B) is definite, the canonical form consists only of the second and third terms in (30) I r ×r ⊕ q 1 j=1 η j [λ + α j ], with η j satisfying the first of the above conditions. Next consider the second condition a + λb ∈ R(A + λB). This can be written as (A + λB)x = a + λb for some vector x, which, using the canonical form, is equivalent to Our task is to identify the condition under which the linear system (33) has a solution x with A + λB 0. We consider two cases separately: This has a solution for any value of λ ∈ (λ j , λ j+1 ) if and only if W (a + λb) is of the form where * ∈ R n−u can take any value. Crucial here is the zero pattern of the vector W (a +λb); whether such vector exists with λ ∈ (λ j , λ j+1 ) can be verified easily once W a, W b are available. -A + λB 0 only at a pointλ. In this case (33) reduces to . . .
We summarize the above findings in the next theorem.

Theorem 11 A QCQP with strict interior feasible point is bounded below if and only
if its canonical form under congruence is of the form (30), and W (a + λb) has nonzero structure (35), when a block J (λ; θ) is present or λ j = λ j+1 , and (34), other wise.
Note that the conditions in the theorem are straightforward to verify provided that the congruence transformation W for the canonical form is available. We note that in [19,Thm. 6] a necessary condition is given for QCQP to be bounded; Theorem 11 gives the necessary and sufficient conditions.
To compute W , in [19] an algorithm is presented assuming B is nonsingular and all the eigenvalues are real and the Jordan blocks are of size at most two with the same eigenvalue. For the general case, one can proceed by upper triangularizing the matrix pencil using the QZ algorithm (or the GUPTRI algorithm [7,20] to deal with singular pencils), and then solving generalized Sylvester equations [12,Sec. 7.7] to block diagonalize the matrices, detect the Jordan block sizes and find the corresponding transformations for each block. Unfortunately, currently no numerically stable algorithm appears to be available for computing the canoincal form of a general symmetric pair.

Complete solution for QCQP
We now discuss how to solve a QCQP that is not necessarily definite feasible. We describe the process in a way that avoids computing the canonical form whenever possible.

Removing common null space
For QCQP that are not definite feasible, attempting to compute λ * as in Sect. 3, we face the difficulty that the O u×u block (if it exists) forces det(M 0 + λM 1 ) = 0 for every value of λ, so the pencil is singular and hence we cannot compute λ * via the generalized eigenvalue problem. Here we discuss how to remove such O u×u blocks.
Since such block corresponds to the common null space, we first compute the null space Q such that We take Q to have orthonormal columns Q Q = I and let Q ⊥ be its orthogonal complement in R n . Write x = Q ⊥ y + Qz and define When c = d = 0, this is a QCQP of smaller size with the O u×u blocks removed: the canonical form of (A , B ) has no zero block. Since this is simply an orthogonal transformation, it preserves the essential properties of the original QCQP, including strict feasibility. First suppose that c = 0 but d = 0. Then (36) is clearly unbounded, as we can take z = −αc with α → ∞. Now suppose that d = 0. We shall show how to obtain (λ * , x * ) satisfying (8) in Theorem 1. Since we assume that QCQP is bounded, A + λ * B 0 and (a + λ * b) ∈ R(A + λ * B) both hold and we see that need to hold; otherwise it would not be a bounded QCQP. The first equation c+λ * d = 0 clearly determines the value of λ * (if it exists; otherwise the QCQP is unbounded), and if A + λ * B 0 does not hold for this λ * , the QCQP is unbounded. Then, taking y * to be an arbitrary vector satisfying (A + λ * B )y * = −(a + λ * b ), and defining we have y * B y * + 2b y * + 2d z * + β = 0, and x * = Q ⊥ y * + Qz * satisfies (8), so x * is a global QCQP solution. Note that even when A + λ * B is singular, by defining A andã as in Theorem 8 we can compute y * via a nonsingular linear system. We thus focus on QCQP without a O u×u block in what follows.

Solution process for nongeneric QCQP
Suppose that we have removed the common null space of A and B as in Sect. 5.3.1, andλ ≥ 0 is known such that A +λB 0. The canonical form of (A, B) must be in the form Let the columns of V form a basis for N (A+λB), and we separate the cases depending on the eigenvalues of V BV . Note that the blocks in the canonical form that contribute to the null space are the blocksλ + α j = 0 with eigenvalue −α j =λ, and J (λ; θ) = 1 0 0 0 , for which the vector 0 1 is a null vector. We denote J 1 = {r + j |λ + α j = 0} and J 2 = {r + q 1 + 2 j | j = 1, · · · , q 2 } if J (λ; θ) = 1 0 0 0 , and J 2 = ∅ otherwise.
Under this condition and (38), holds. Thus we see that the zero eigenvalues of V BV correspond to the terms J (λ; θ), and the nonzero eigenvalues to the terms η j [λ + α j ], and their signs are η j . First we treat the caseλ > 0. 1. When N (A +λB) = 0.
This means A +λB is nonsingular and so A +λB 0, so it belongs to the definite feasible case, for which Algorithm 3.2 suffices. 2. When V BV 0 or V BV ≺ 0.
By (39), V BV 0 is equivalent to J 2 = ∅, and η j = 1 for all j ∈ J 1 . Similarly, V BV ≺ 0 is equivalent to J 2 = ∅ and η j = −1 for j ∈ J 1 . Thus slightly perturbingλ in the positive directionλ ←λ + (when V BV 0) or the negative directionλ ←λ − (when V BV ≺ 0) for a positive , we obtain W (A +λB)W = I r ×r ⊕ For all j such thatλ + α j = 0, the signs of η j take both +1 and −1. This impliesλ = λ * , which is the only value λ for which A + λB 0. Moreover, we can take v 1 , v 2 ∈ N (A +λB) such that v 1 Bv 1 > 0, v 2 Bv 2 < 0, so we solve (A+λB)x = −(a +λb) forx (by (34) the QCQP is unbounded if no suchx exists) and then find t ∈ R such that g(x + tv i ) = 0; we choose i ∈ {1, 2} depending on the sign of g(x): i = 1 if g(x * ) < 0, and i = 2 otherwise. Then x * =x + tv i is the solution. 4. When V BV = O has a zero eigenvalue, and we have V BV 0 or V BV 0.
For definiteness suppose that V BV 0; the other case is analogous. Since a zero eigenvalue is present, this is a case where the J (λ; θ) block exists. The goal is to find x such that g(x) = 0 and (A +λB)x = −(a +λb). We first find a vector w * such that (A +λB)w * = −(a +λb), while if no such w * exists then the QCQP is unbounded by Theorem 11. Otherwise the QCQP is bounded, and we proceed to solve the unconstrained quadratic optimization problem minimize u g(w * + V u).
If the optimal objective value is 0 or below (including −∞), there must exist u 0 such that g(w * + V u 0 ) ≤ 0. We then use a vector v such that v Bv > 0, v ∈ N (A +λB) and adjust a scalar t so that g(w * + V u 0 + tv) = 0. Then we obtain a global solution w * + V u 0 + tv. Next consider the case where the optimal value of (41) is larger than 0. In this case there is no x such that g(x) = 0 and (A+λB)x = −(a +λb). Since we are dealing with the bounded case, this means we are in the unattainable case; there exists a scalar μ such that for any ε > 0, there exists a feasible point x with f (x) = μ + ε.
We can rewrite this as so it follows that the desired value of μ is μ =λβ − w * (A +λB)w * . By (39), this is the case where q 1 = 0 and q 2 > 0. We proceed as above until (40). The goal is to find u such that g(w * + V u) = 0. In this case, g(w * + V u) = g(w * ) + 2(Bw * + b) V u + u (V BV )u = g(w * ) + 2(Bw * + b) V u, so g(w * + V u) is constant if and only if (Bw * + b) V = 0. If (Bw * + b) V = 0, there exists u 0 such that g(w * + V u 0 ) = 0, which means that the global solution is x * = w * + V u 0 . Otherwise, when (Bw * + b) V = 0, we are unable to find u such that g(w * + V u) = 0 unless g(w * ) = 0. This means we are in the unattainable case, and the optimal value is as in (42).
Ifλ = 0, we either have λ * =λ = 0 or λ * > 0; the latter case (in which QCQP is definite feasible) occurs if and only if V BV 0, because if V BV 0 has a zero eigenvalue, then by (38) a zero eigenvalue of V BV implies the existence of J (λ; θ), which means D is a point. If V BV is not positive definite, we must have λ * =λ = 0. We then compute w * such that (40) holds, and solve (41), or more precisely a feasibility problem of finding u such that g(w * + V u) ≤ 0. In fact, any w * + V u such that g(w * + V u) ≤ 0 satisfies (8) and is therefore a global solution; recall from the complementarity condition in (8) that when λ * = 0 it is not necessary to satisfy g(x * ) = 0. Such u trivially exists if V BV has a negative eigenvalue. If V BV 0 and det(V BV ) = 0 (i.e., J (λ; 0) exists) then it could be that min u g(w * + V u) > 0; then by Lemma 2 this corresponds to the unattainable case, with infimum value μ = −w * Aw * as in (42).
The steps described in this section, as shown in Fig. 11, completely solves QCQP with one constraint in the following sense: 1. For any bounded QCQP, it returns the optimal (or infimum) objective value, along with its corresponding solution x if it is attainable. 2. If the QCQP is unbounded, it reports unboundedness. 3. If the QCQP is infeasible, it reports infeasibility.
The worst-case complexity corresponds to the case where a canonical form of (A, B) is required. Using the GUPTRI algorithm [7,20] for the canonical form, the worstcase complexity is O(n 4 ). We repeat that most QCQP that are solvable in practice are solved by Algorithm 3.2, which is O(n 3 ) or faster.

Conclusion and discussion
We introduced an algorithm for QCQP with one constraint, which for generic (i.e., definite feasible QCQP for whichλ is known) QCQP requires computing just one eigenpair of a generalized eigenvalue problem. The algorithm is both faster and more accurate than the SDP-based approach, and can directly take advantage of the matrix sparsity structure if present.
For QCQP that are not definite feasible, for which SDP-based methods also face difficulty, we have classified the possible canonical forms under congruence of the pair (A, B), and described an algorithm (though more expensive than Algorithm 3.2) that completely solves the QCQP.
We close with remarks on future directions. First, a recent paper [34] describes an eigenvalue-based algorithm for TRS with an additional linear constraint, and a natural direction is to examine such an extension for QCQP. Second, since our algorithm essentially also solves the SDP (3), it is worth examining the class of SDP problems that can be solved similarly by an eigenvalue problem. Also of interest would be to deal with Riemannian optimization, such as minimization of trace(X AX + C X ) over X ∈ R n×k subject to the orthogonality constraint X X = I k .