On Local Convexity of Quadratic Transformations


Abstract

In this paper, we improve Polyak’s local convexity result for quadratic transformations. Extensions and open problems are also presented.

Keywords

Convexity · Quadratic transformation · Joint numerical range

1 Introduction

Let \(x\in {\mathbb{R}}^n\) and \(f(x) = (f_1(x),\cdots , f_m(x))\), where
$$\begin{aligned} f_i(x) = \frac{1}{2}x^{\rm T}A_ix + a_i^{\rm T}x, \,i = 1,\cdots ,m \end{aligned}$$
are quadratic functions and \(A_i\in {\mathbb{R}}^{n\times n}\) (\(i = 1,\cdots ,m\)) are symmetric. One interesting question is when the following joint numerical range
$$\begin{aligned} F_m=\{f(x):x\in {\mathbb{R}}^n\}\subseteq {\mathbb{R}}^m \end{aligned}$$
is convex.

Though results on the convexity of complex quadratic forms date back to 1918, see Toeplitz [19] and Hausdorff [8], the first such result for the real case is due to Dines [4] in 1941. It states that if \(f_1, f_2\) are homogeneous quadratic functions, then the set \(F_2\) is convex. In 1971, Yakubovich [23, 24] used this basic result to prove the famous S-lemma; see [17] for a survey. Brickman [3] proved in 1961 that if \(f_1, f_2\) are homogeneous quadratic functions and \(n\geqslant 3\), then the set \(\{(f_1(x), f_2(x)) : x\in {\mathbb{R}}^n, \Vert x\Vert = 1\} \subseteq {\mathbb{R}}^2\) is convex. Fradkov [5] proved in 1973 that if the matrices \(A_1, \cdots , A_m\) commute and \(f_1,\cdots ,f_m\) are homogeneous, then \(F_m\) is convex. In 1995, it was shown by Ramana and Goldman [18] that deciding the convexity of \(F_m\) is NP-hard. In the same paper, quadratic maps under which the image of every linear subspace is convex were also investigated. Based on Brickman’s result, Polyak [14] proved in 1998 that if \(n\geqslant 3\) and \(f_1, f_2, f_3\) are homogeneous quadratic functions such that \(\mu _1A_1+\mu _2A_2+\mu _3A_3\succ 0\) (where the notation \(A\succ 0\) means that \(A\) is positive definite) for some \(\mu \in {\mathbb{R}}^3\), then the set \(F_3\) is convex. Moreover, as shown in the same paper, when \(n\geqslant 2\) and there exists \(\mu \in {\mathbb{R}}^2\) such that \(\mu _1A_1+\mu _2A_2\succ 0\), the set \(F_2\) is convex. In 2007, Beck [1] showed that if \(m\leqslant n\), \(A_1\succ 0\) and \(A_2=\cdots =A_m=0\), then \(F_m\) is convex. However, if \(A_1\succ 0\), \(A_2=\cdots =A_{n+1}=0\) and \(a_2,\cdots ,a_{n+1}\) are linearly independent, then \(F_{n+1}\) is not convex. When \(m=2\), Beck’s result reduces to a corollary of Polyak’s result. Very recently, Xia et al. [22] used the newly developed S-lemma with equality to establish a necessary and sufficient condition for the convexity of \(F_2\) when \(A_2=0\) and \(A_1\) is arbitrary.

More generally, Polyak [15, 16] succeeded in proving that the nonlinear image of a small ball in a Hilbert space is convex, provided that the derivative of the map is Lipschitz continuous on the ball, the derivative at the center of the ball is surjective, and the norm of its adjoint mapping is bounded away from zero. Later, Uderzo [21] extended the result to a certain subclass of uniformly convex Banach spaces. When specialized to quadratic transformations, Polyak’s result reads as follows:

Theorem 1.1

[17] Let \(A=[a_1\,\cdots \,a_m]\in {\mathbb{R}}^{n\times m}\) and define
$$\begin{aligned}&L:=\sqrt{\sum _{i=1}^m\Vert A_i\Vert ^2}, \\&\nu :=\sigma _{\min }(A)=\sqrt{\lambda _{\min }(A^{\rm T}A)},\nonumber \end{aligned}$$
(1.1)
where \(\Vert A_i\Vert =\sigma _{\max }(A_i)=\sqrt{\lambda _{\max }(A_i^{\rm T}A_i)}\) is the spectral norm of \(A_i\), and \(\sigma _{\min }(\cdot )\), \(\sigma _{\max }(\cdot )\), \(\lambda _{\min }(\cdot )\), \(\lambda _{\max }(\cdot )\) denote the smallest and largest singular values and eigenvalues, respectively. Assume \(\nu >0\).
If \(\varepsilon < \varepsilon ^*:= \nu /(2L)\), then the image
$$\begin{aligned} F_m(\varepsilon ) = \{f(x):\,x\in {\mathbb{R}}^n,\,\Vert x\Vert \leqslant \varepsilon \} \end{aligned}$$
(1.2)
is a convex set in \({\mathbb{R}}^m\).
Polyak [15, 16] used the following example to show that his estimate \(\varepsilon ^*\) is tight, where \(n=m=2\) and
$$\begin{aligned} f_1(x)=x_1x_2-x_1,\,f_2(x)=x_1x_2+x_2. \end{aligned}$$
Actually, in this case, \(\varepsilon ^*=1/(2\sqrt{2}) \approx 0.353\,6\). One can verify that \(F_m(\varepsilon )\) is convex for \(\varepsilon \leqslant \varepsilon ^*\) and loses convexity for \(\varepsilon >\varepsilon ^*\).
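As a quick numerical check, the constants of Theorem 1.1 for this example can be computed as follows. This is an illustrative Python sketch with NumPy, not part of the original paper; the variable names are ours.

```python
import numpy as np

# Polyak's example: f1(x) = x1*x2 - x1, f2(x) = x1*x2 + x2, i.e.,
# f_i(x) = 0.5*x^T A_i x + a_i^T x with A_1 = A_2 = [[0,1],[1,0]],
# a_1 = (-1, 0)^T and a_2 = (0, 1)^T.
A1 = np.array([[0.0, 1.0], [1.0, 0.0]])
A2 = np.array([[0.0, 1.0], [1.0, 0.0]])
A = np.array([[-1.0, 0.0],
              [0.0, 1.0]])  # A = [a_1 a_2]

L = np.sqrt(sum(np.linalg.norm(Ai, 2)**2 for Ai in (A1, A2)))  # L in (1.1)
nu = np.linalg.svd(A, compute_uv=False)[-1]                    # sigma_min(A)
eps_star = nu / (2 * L)
print(L, nu, eps_star)  # sqrt(2), 1, 1/(2*sqrt(2)) ~= 0.3536
```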

In this paper, we improve Polyak’s result above for quadratic transformations (i.e., Theorem 1.1) by strengthening the constant \(L\). Then, Theorem 1.1 is extended to the image of the ball of the same radius \(\varepsilon \) centered at any point \(a\) satisfying \(\Vert a\Vert <2(\varepsilon ^*-\varepsilon )\). Furthermore, we propose two new approaches for possible further improvement of \(L\).

The paper is organized as follows. In Sect. 2, we improve and extend Theorem 1.1. In Sect. 3, we discuss further possible improvements. In the final concluding section, we propose two open questions.

Throughout the paper, all vectors are column vectors. Let \(v(\cdot )\) denote the optimal value of problem \((\cdot )\). The notation \(A\succeq 0\) means that the matrix \(A\) is positive semidefinite. \({\rm vec}(A)\) denotes the vector obtained by stacking the columns of \(A\) one underneath the other. The trace of \(A\) is denoted by \({\rm trace}(A)=\sum _{i=1}^nA_{ii}\). \(|A|\) is the matrix with entries \(|A_{ij}|\). The Kronecker product and the inner product of the matrices \(A\) and \(B\) are denoted by \(A\otimes B\) and \(A\bullet B={\rm trace}(AB^{\rm T})=\sum _{i,j=1}^na_{ij}b_{ij}\), respectively. The identity matrix is denoted by \(I\), and \(\Vert x\Vert =\sqrt{x^{\rm T}x}\) is the \(\ell _2\)-norm of the vector \(x\).

2 Main Results

In this section, we first improve Theorem 1.1 and then extend it to the ball of the same radius centered at any point close enough to the origin.

Theorem 2.1

Define
$$\begin{aligned} L_{\rm new}:=\sqrt{\lambda _{\max }\left( \sum _{i=1}^mA_i^{\rm T}A_i\right) }. \end{aligned}$$
(2.1)
Then we have
$$\begin{aligned} L_{\rm new} \leqslant L. \end{aligned}$$
(2.2)
For any \(\varepsilon < \varepsilon ^*_{\rm new}:= \nu /(2L_{\rm new})\), the image \(F_m(\varepsilon )\) defined in (1.2) is convex.

Proof

Let \(L_b\) be any upper bound of the Lipschitz constant of \(f\), i.e.,
$$\begin{aligned} \Vert \nabla f(x)-\nabla f(z)\Vert \leqslant L_b\Vert x-z\Vert ,\,\quad \forall x, z\in {\mathbb{R}}^n. \end{aligned}$$
(2.3)
According to the proof in [15], Theorem 1.1 remains true if \(L\) defined in (1.1) is replaced by \(L_b\). Thus, it is sufficient to show that \(L_b:=L_{\rm new}\) satisfies (2.3). To this end, we have
$$\begin{aligned}\max _{\Vert x-z\Vert =1}\Vert \nabla f(x)-\nabla f(z)\Vert &= \max _{\Vert x-z\Vert =1}\Vert [A_1(x-z)\,\cdots \,A_m(x-z)]\Vert = \max _{\Vert y\Vert =1}\Vert [A_1y\,\cdots \,A_my]\Vert \\&= \sqrt{\max _{\Vert y\Vert =1} \lambda _{\max }\left( [A_1y\,\cdots \,A_my]^{\rm T}[A_1y\,\cdots \,A_my]\right) } \end{aligned}$$
(2.4)
$$\begin{aligned}&\leqslant \sqrt{\max _{\Vert y\Vert =1} {\rm trace }\left( [A_1y\,\cdots \,A_my]^{\rm T}[A_1y\,\cdots \,A_my]\right) }\\&= \sqrt{\max _{\Vert y\Vert =1} y^{\rm T}\left( \sum _{i=1}^mA_i^{\rm T}A_i\right) y} \\&= \sqrt{\lambda _{\max }\left( \sum _{i=1}^mA_i^{\rm T}A_i\right) }. \end{aligned}$$
(2.5)
The inequality (2.2) holds since
$$\begin{aligned} L_{\rm new}&= \sqrt{\lambda _{\max }\left( \sum _{i=1}^mA_i^{\rm T}A_i\right) } = \sqrt{\max _{\Vert y\Vert =1}y^{\rm T}\left( \sum _{i=1}^mA_i^{\rm T}A_i\right) y}\\&\leqslant \sqrt{ \sum _{i=1}^m \left( \max _{\Vert y\Vert =1}y^{\rm T}A_i^{\rm T}A_iy\right) } = \sqrt{\sum _{i=1}^m\lambda _{\max }\left( A_i^{\rm T}A_i\right) } = \sqrt{\sum _{i=1}^m\Vert A_i\Vert ^2}=L. \end{aligned}$$
\(\square \)
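For readers who wish to experiment, the following sketch (with hypothetical helper names of our own) computes \(L\) and \(L_{\rm new}\) and checks inequality (2.2) on random symmetric matrices.

```python
import numpy as np

def L_polyak(As):
    """L in (1.1): square root of the sum of squared spectral norms."""
    return np.sqrt(sum(np.linalg.norm(Ai, 2)**2 for Ai in As))

def L_new(As):
    """L_new in (2.1): sqrt(lambda_max(sum_i A_i^T A_i))."""
    S = sum(Ai.T @ Ai for Ai in As)
    return np.sqrt(np.linalg.eigvalsh(S)[-1])

rng = np.random.default_rng(0)
for _ in range(100):
    # random symmetric A_1, A_2, A_3 with n = 4
    As = [(B + B.T) / 2 for B in rng.standard_normal((3, 4, 4))]
    assert L_new(As) <= L_polyak(As) + 1e-12  # inequality (2.2)
```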

Theorem 2.2

For any \(0<\varepsilon < \varepsilon ^*_{\rm new}=\nu /(2L_{\rm new})\) and any \(a\in {\mathbb{R}}^n\) such that \(\Vert a\Vert <2(\varepsilon ^*_{\rm new}-\varepsilon )\), the image
$$\begin{aligned} F_m( \varepsilon ,a)=\{f(x):\,x\in {\mathbb{R}}^n,\,\Vert x-a\Vert \leqslant \varepsilon \} \end{aligned}$$
is a convex set in \({\mathbb{R}}^m\).

Proof

For any \(a\in {\mathbb{R}}^n\) such that \(\Vert a\Vert <2(\varepsilon ^*_{\rm new}-\varepsilon )\), we have
$$\begin{aligned}&\sigma _{\min }(A+[A_1a\cdots A_ma]) \\&\geqslant \sigma _{\min }(A)-\sigma _{\max }(-[A_1a\cdots A_ma]) \\&\geqslant \sigma _{\min }(A)-\sup _{\Vert a\Vert <2(\varepsilon ^*_{\rm new}-\varepsilon )}\sigma _{\max }([A_1a\cdots A_ma]) \\&= \sigma _{\min }(A)- \sqrt{\sup _{\Vert a\Vert <2(\varepsilon ^*_{\rm new}-\varepsilon )} \lambda _{\max }\left( [A_1a\,\cdots \,A_ma]^{\rm T}[A_1a\,\cdots \,A_ma]\right) } \\&\geqslant \sigma _{\min }(A)- \sqrt{\sup _{\Vert a\Vert <2(\varepsilon ^*_{\rm new}-\varepsilon )} {\rm trace }\left( [A_1a\,\cdots \,A_ma]^{\rm T}[A_1a\,\cdots \,A_ma]\right) } \\&= \sigma _{\min }(A)- \sqrt{\sup _{\Vert a\Vert <2(\varepsilon ^*_{\rm new}-\varepsilon )} a^{\rm T}\left( \sum _{i=1}^mA_i^{\rm T}A_i\right) a}\\&= \sigma _{\min }(A)-2(\varepsilon ^*_{\rm new}-\varepsilon ) \sqrt{\lambda _{\max }\left( \sum _{i=1}^mA_i^{\rm T}A_i\right) } \\&= \sigma _{\min }(A)-2(\varepsilon ^*_{\rm new}-\varepsilon )L_{\rm new} \\&= 2\varepsilon L_{\rm new}, \end{aligned}$$
(2.6)
where the first inequality is Weyl’s inequality [10] for the singular values; see also Problem III.6.5 in [2] or Theorem 3.3.16 in [11].
Since the supremum in (2.6) is not attained, the above chain of inequalities holds strictly; that is,
$$\begin{aligned} \sigma _{\min }(A+[A_1a\cdots A_ma])>2\varepsilon L_{\rm new},\,\forall a\in {\mathbb{R}}^n:\,\Vert a\Vert <2(\varepsilon ^*_{\rm new}-\varepsilon ). \end{aligned}$$
(2.7)
Notice that
$$\begin{aligned} f_i(x) =f_i(a)+(A_ia+a_i)^{\rm T}(x-a)+ \frac{1}{2}(x-a)^{\rm T}A_i(x-a), \,i = 1,\cdots ,m. \end{aligned}$$
Then, we have
$$\begin{aligned} F_m( \varepsilon ,a)-f(a) = \{g(y):\,y\in {\mathbb{R}}^n,\,\Vert y\Vert \leqslant \varepsilon \}:=G_m(\varepsilon ,a), \end{aligned}$$
where \(g(y) = ((A_1a+a_1)^{\rm T}y+ \frac{1}{2}y^{\rm T}A_1y,\cdots , (A_ma+a_m)^{\rm T}y+ \frac{1}{2}y^{\rm T}A_my)\). According to Theorem 2.1, for any
$$\begin{aligned} \varepsilon < \sigma _{\min }(A+[A_1a\cdots A_ma])/(2L_{\rm new}), \end{aligned}$$
(2.8)
the image \(G_m(\varepsilon ,a)\) is a convex set in \({\mathbb{R}}^{m}\), and hence so is its translation \(F_m(\varepsilon ,a)=G_m(\varepsilon ,a)+f(a)\). The proof is complete as (2.8) is ensured by (2.7). \(\square \)

Remark 2.3

Theorem 2.1 is the special case of Theorem 2.2 obtained by setting \(a=0\).
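The key estimate (2.7) behind Theorem 2.2 is also easy to check numerically. Below is a self-contained sketch (our own construction, hypothetical names) that samples centers \(a\) with \(\Vert a\Vert <2(\varepsilon ^*_{\rm new}-\varepsilon )\) and verifies \(\sigma _{\min }(A+[A_1a\cdots A_ma])>2\varepsilon L_{\rm new}\).

```python
import numpy as np

def check_shifted_center(As, A, eps, trials=1000, seed=0):
    """Numerically verify (2.7) for randomly sampled centers a."""
    S = sum(Ai.T @ Ai for Ai in As)
    Lnew = np.sqrt(np.linalg.eigvalsh(S)[-1])       # L_new in (2.1)
    nu = np.linalg.svd(A, compute_uv=False)[-1]     # sigma_min(A)
    eps_star = nu / (2 * Lnew)
    assert 0 < eps < eps_star
    rng = np.random.default_rng(seed)
    n = As[0].shape[0]
    for _ in range(trials):
        a = rng.standard_normal(n)
        a *= rng.uniform(0.0, 2 * (eps_star - eps)) / np.linalg.norm(a)
        shift = np.column_stack([Ai @ a for Ai in As])  # [A_1 a ... A_m a]
        sigma = np.linalg.svd(A + shift, compute_uv=False)[-1]
        assert sigma > 2 * eps * Lnew                   # inequality (2.7)
```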

3 Further Possible Improvements

The estimate in Theorem 2.1 is still not tight. Actually, \(L_{\rm new}\) defined in (2.1) can be further improved to the Lipschitz constant of \(f\) itself, denoted by \(L_f\). According to (2.4), we have
$$\begin{aligned} L_f^2=\max _{\Vert y\Vert =1} \lambda _{\max }\left( [A_1y\,\cdots \,A_my]^{\rm T}[A_1y\,\cdots \,A_my]\right) . \end{aligned}$$
(3.1)
However, this is a nonlinear eigenvalue optimization problem and is not easy to solve. As pointed out by one referee, problem (3.1) can be reduced to computing the largest singular value of a third-order tensor [13], which is NP-hard in general [9].
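Although (3.1) is hard in general, a lower bound on \(L_f\) is easy to obtain heuristically. The following sketch (our own construction, with no global-optimality guarantee) alternately maximizes \(u^{\rm T}[A_1y\,\cdots \,A_my]w\) over unit vectors: for fixed \(w\) the maximum over \((u,y)\) is \(\sigma _{\max }(\sum _iw_iA_i)\), while for fixed \(y\) the maximum over \((u,w)\) is \(\sigma _{\max }([A_1y\,\cdots \,A_my])\), so the iteration is monotone.

```python
import numpy as np

def L_f_lower_bound(As, iters=100, restarts=20, seed=0):
    """Heuristic lower bound on L_f in (3.1) by alternating maximization;
    multiple random restarts, but no guarantee of global optimality."""
    rng = np.random.default_rng(seed)
    n, m = As[0].shape[0], len(As)
    best = 0.0
    for _ in range(restarts):
        y = rng.standard_normal(n)
        y /= np.linalg.norm(y)
        for _ in range(iters):
            B = np.column_stack([Ai @ y for Ai in As])  # [A_1 y ... A_m y]
            w = np.linalg.svd(B)[2][0]                  # best w for current y
            M = sum(w[i] * As[i] for i in range(m))     # sum_i w_i A_i
            _, s, Vt = np.linalg.svd(M)
            y = Vt[0]                                   # best y for current w
        # s[0] = u^T [A_1 y ... A_m y] w <= sigma_max([A_1 y ... A_m y]) <= L_f
        best = max(best, s[0])
    return best
```

Comparing this heuristic lower bound with the upper bounds developed below gives a numerical sense of the remaining gap.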

Besides the upper bound \(L_{\rm new}\) in (2.1), we consider two further relaxations of (3.1). We first need two lemmas.

Lemma 3.1

[2] Every eigenvalue of \(B\in {\mathbb{R}}^{m\times m}\) lies within at least one of the Gershgorin disks
$$\begin{aligned} \left\{ \lambda :\,\left| \lambda -B_{ii} \right| \leqslant \sum _{j\ne i}|B_{ij}| \right\} ,\,i=1,\cdots ,m. \end{aligned}$$

Lemma 3.2

[7] For any \(m\times m\) matrix \(B\), all its eigenvalues are located in the same disk
$$\begin{aligned} \left| \lambda -\frac{{\rm trace }(B)}{m}\right| \leqslant \sqrt{\frac{m-1}{m}\left( {\rm trace }(B^{\rm T}B)-\frac{\left( {\rm trace }(B)\right) ^2}{m}\right) }. \end{aligned}$$
(3.2)

Remark 3.3

Let \(\lambda _i(B)\) be the \(i\)-th largest eigenvalue of \(B\). When \(B\succeq 0\), substituting the following inequality
$$\begin{aligned} {\rm trace }(B^{\rm T}B)=\sum _{i=1}^m \lambda _i^2(B) \leqslant \left( \sum _{i=1}^m \lambda _i(B)\right) ^2=\left( {\rm trace }(B)\right) ^2 \end{aligned}$$
into (3.2), we obtain \(\lambda _{\max }(B)\leqslant \frac{{\rm trace }(B)}{m}+\frac{m-1}{m}{\rm trace }(B)={\rm trace }(B)\); that is, Lemma 3.2 improves the inequality
$$\begin{aligned} \lambda _{\max }(B)\leqslant {\rm trace }(B), \end{aligned}$$
which is used in (2.5).
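A quick numerical sanity check of this comparison (an illustrative sketch of ours):

```python
import numpy as np

rng = np.random.default_rng(1)
B0 = rng.standard_normal((5, 5))
B = B0 @ B0.T                           # a random positive semidefinite B
m = B.shape[0]
t, t2 = np.trace(B), np.trace(B.T @ B)
disk = t / m + np.sqrt((m - 1) / m * (t2 - t**2 / m))  # bound from (3.2)
lam_max = np.linalg.eigvalsh(B)[-1]
assert lam_max <= disk <= t + 1e-9      # Lemma 3.2 refines lam_max <= trace
```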

Now, we apply Lemmas 3.1 and 3.2 to establish two new relaxations of \(L_f\) (3.1).

Firstly, according to Lemma 3.1, we have
$$\begin{aligned}&\sqrt{\max _{\Vert y\Vert =1} \lambda _{\max }\left( [A_1y\,\cdots \,A_my]^{\rm T}[A_1y\,\cdots \,A_my]\right) }\\&\leqslant \sqrt{\max _{\Vert y\Vert =1} \max _{i=1,\cdots ,m} \left\{ y^{\rm T}(A_i^{\rm T}A_i)y+\sum _{j\ne i}y^{\rm T}|A_i^{\rm T}A_j|y\right\} }\\&= \sqrt{ \max _{i=1,\cdots ,m} \max _{\Vert y\Vert =1} y^{\rm T}\left( A_i^{\rm T}A_i+\sum _{j\ne i}|A_i^{\rm T}A_j|\right) y }\\&= \sqrt{ \max _{i=1,\cdots ,m} \lambda _{\max }\left( A_i^{\rm T}A_i+\frac{1}{2}\sum _{j\ne i}\left( |A_i^{\rm T}A_j|+|A_j^{\rm T}A_i|\right) \right) }\\&:= \overline{L}_{\rm new}. \end{aligned}$$
Consequently, Theorem 1.1 holds true if we replace \(L\) with \(\overline{L}_{\rm new}\).
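In code, \(\overline{L}_{\rm new}\) amounts to an ordinary symmetric eigenvalue computation; here is a sketch (hypothetical helper name, with entrywise absolute values as in the notation of Sect. 1):

```python
import numpy as np

def L_bar_new(As):
    """Gershgorin-based bound: sqrt(max_i lambda_max(A_i^T A_i
    + 0.5 * sum_{j != i} (|A_i^T A_j| + |A_j^T A_i|)))."""
    m = len(As)
    vals = []
    for i in range(m):
        M = As[i].T @ As[i]
        for j in range(m):
            if j != i:
                P = As[i].T @ As[j]
                M = M + 0.5 * (np.abs(P) + np.abs(P).T)  # |P| + |P^T|
        vals.append(np.linalg.eigvalsh(M)[-1])
    return np.sqrt(max(vals))
```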
Secondly, according to Lemma 3.2, we have
$$\begin{aligned}&\lambda _{\max }\left( [A_1y\,\cdots \,A_my]^{\rm T}[A_1y\,\cdots \,A_my]\right) \\&\leqslant \frac{1}{m}\sqrt{ \left( y^{\rm T}\left( \sum _{i=1}^mA_i^{\rm T}A_i\right) y\right) ^2}\\&+\sqrt{\frac{m-1}{m}\left( \sum _{i,j=1}^m(y^{\rm T}A_i^{\rm T}A_jy)^2- \frac{1}{m} \left( y^{\rm T}\left( \sum _{i=1}^mA_i^{\rm T}A_i\right) y\right) ^2\right) } \\&= \frac{1}{m}\sqrt{ z^{\rm T} \left( \left( \sum _{i=1}^m A_i^{\rm T}A_i\right) \otimes \left( \sum _{i=1}^m A_i^{\rm T}A_i\right) \right) z}+ \sqrt{\frac{m-1}{m}}\cdot \\&\sqrt{z^{\rm T}\left( \sum _{i,j=1}^m (A_i^{\rm T}A_j)\otimes (A_i^{\rm T}A_j) - \frac{1}{m} \left( \sum _{i=1}^m A_i^{\rm T}A_i\right) \otimes \left( \sum _{i=1}^m A_i^{\rm T}A_i\right) \right) z}\\&= \frac{1}{m}\sqrt{ \left( \left( \sum _{i=1}^m A_i^{\rm T}A_i\right) \otimes \left( \sum _{i=1}^m A_i^{\rm T}A_i\right) \right) \bullet Z}+ \sqrt{\frac{m-1}{m}}\cdot \\&\,\,\sqrt{\left( \sum _{i,j=1}^m (A_i^{\rm T}A_j)\otimes (A_i^{\rm T}A_j) - \frac{1}{m} \left( \sum _{i=1}^m A_i^{\rm T}A_i\right) \otimes \left( \sum _{i=1}^m A_i^{\rm T}A_i\right) \right) \bullet Z}\\&:= B(Z) \end{aligned}$$
where \(z=y\otimes y\) and \(Z=zz^{\rm T}\). Since \(y^{\rm T}y=1\), we have
$$\begin{aligned}&{\rm trace}(Z)=z^{\rm T}z=(y\otimes y)^{\rm T}(y\otimes y)=(y^{\rm T}y)\otimes (y^{\rm T}y)=1\otimes 1=1,\\&{\rm vec}(I)^{\rm T}Z{\rm vec}(I)=\left( {\rm vec}(I)^{\rm T}z\right) ^2=\left( \sum _{i=1}^my_i^2\right) ^2 =1,\\&\Vert Z{\rm vec}(I)\Vert =\Vert zz^{\rm T}{\rm vec}(I)\Vert =\left| {\rm vec}(I)^{\rm T}z\right| \Vert z\Vert =\Vert z\Vert =\sqrt{z^{\rm T}z}= 1,\\&Z=zz^{\rm T}\succeq 0. \end{aligned}$$
Therefore, Theorem 1.1 remains true if \(L\) is replaced by \(\widetilde{L}_{\rm new}\), where
$$\begin{aligned} \begin{array}{lll} \widetilde{L}_{\rm new}^2=&\max &\,B(Z)\\ &{\rm s.t.}& {\rm trace}(Z)=1,\\ && {\rm vec}(I)^{\rm T}Z{\rm vec}(I)=1,\\ &&\Vert Z{\rm vec}(I)\Vert \leqslant 1,\\ &&Z\succeq 0, \end{array} \end{aligned}$$
which is a convex semidefinite programming (CSDP) problem and hence can be solved efficiently. In the following examples, the CSDP problems are modeled by CVX 1.2 [6] and solved by SDPT3 [20] within CVX.
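As an illustrative alternative to the CVX/SDPT3 setup (our own sketch, not the authors’ code; it assumes the open-source cvxpy package and its default conic solver), the relaxation can be modeled as follows. Note that \(B(Z)\) is concave in \(Z\), being a sum of square roots of affine functions, so the problem is indeed a CSDP.

```python
import numpy as np
import cvxpy as cp

def L_tilde_new(As):
    """CSDP relaxation defining L_tilde_new (a sketch using cvxpy)."""
    m, n = len(As), As[0].shape[0]
    S = sum(Ai.T @ Ai for Ai in As)                   # sum_i A_i^T A_i
    C = np.kron(S, S)
    D = sum(np.kron(As[i].T @ As[j], As[i].T @ As[j])
            for i in range(m) for j in range(m)) - C / m
    v = np.eye(n).flatten(order='F')                  # vec(I)
    Z = cp.Variable((n * n, n * n), symmetric=True)
    B = cp.sqrt(cp.sum(cp.multiply(C, Z))) / m \
        + np.sqrt((m - 1) / m) * cp.sqrt(cp.sum(cp.multiply(D, Z)))
    constraints = [cp.trace(Z) == 1,
                   v @ Z @ v == 1,                    # vec(I)^T Z vec(I) = 1
                   cp.norm(Z @ v) <= 1,               # ||Z vec(I)|| <= 1
                   Z >> 0]
    prob = cp.Problem(cp.Maximize(B), constraints)
    prob.solve()
    return np.sqrt(prob.value)                        # L_tilde_new
```

The DCP ruleset handles the two square-root terms directly; the implicit nonnegativity of the second radicand only shrinks the feasible set of the relaxation.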

Example 3.4

Let \(n =3,\, m = 2\). Consider the two examples:
$$\begin{aligned}&(E_1):&A_1=\left[ \begin{array}{ccc}2&0&6\\ 0&0&6\\ 6&6&2 \end{array}\right] ,\, A_2=\left[ \begin{array}{ccc}6&5&2\\ 5&4&0\\ 2&0&0 \end{array}\right] ,\, A=\left[ \begin{array}{cc}-1&0\\ 0&1\\ 0&0 \end{array}\right] ,\\&(E_2):&A_1=\left[ \begin{array}{ccc}0&5&3\\ 5&0&6\\ 3&6&4 \end{array}\right] ,\, A_2=\left[ \begin{array}{ccc}0&4&2\\ 4&0&4\\ 2&4&4 \end{array}\right] ,\, A=\left[ \begin{array}{cc}-1&0\\ 0&1\\ 0&0 \end{array}\right] . \end{aligned}$$
We can verify that
$$\begin{aligned}&(E_1):&L\approx 14.416\,6,\, L_{\rm new}\approx 13.909\,4,\, \overline{L}_{\rm new}\approx 12.884\,9,\,\widetilde{L}_{\rm new}\approx 12.674\,7,\\&(E_2):&L\approx 13.806\,5,\, L_{\rm new}\approx 13.804\,3,\, \overline{L}_{\rm new}\approx 14.590\,1,\,\widetilde{L}_{\rm new}\approx 13.800\,9. \end{aligned}$$
It is observed that neither of \(L_{\rm new}\) and \(\overline{L}_{\rm new}\) dominates the other, while \(\widetilde{L}_{\rm new}\) dominates both.
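Using the sketch functions defined earlier in this section (hypothetical helpers of ours), the data of \((E_1)\) can be checked as follows; the printed values should approximately reproduce the first row above.

```python
import numpy as np

# Data of (E_1); L_polyak, L_new, L_bar_new, L_tilde_new are the
# sketch functions defined earlier in this section.
A1 = np.array([[2.0, 0.0, 6.0],
               [0.0, 0.0, 6.0],
               [6.0, 6.0, 2.0]])
A2 = np.array([[6.0, 5.0, 2.0],
               [5.0, 4.0, 0.0],
               [2.0, 0.0, 0.0]])
As = [A1, A2]
print(L_polyak(As), L_new(As), L_bar_new(As), L_tilde_new(As))
# Expected (approximately): 14.4166, 13.9094, 12.8849, 12.6747
```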

Figure 1 shows the images of the \(\varepsilon \)-disks for \((E_1)\) and \((E_2)\), respectively. It indicates that \(\widetilde{L}_{\rm new}\) is still not tight and that convexity is lost once \(\varepsilon \) becomes large enough.

Fig. 1 Images of \(\varepsilon \)-disks for \((E_1)\) with \(\varepsilon =1/(2\widetilde{L}_{\rm new})\approx 0.039\,4,\,0.06,\,0.14\) (left subgraph) and for \((E_2)\) with \(\varepsilon =1/(2\widetilde{L}_{\rm new})\approx 0.036\,2,\,0.04,\,0.08\) (right subgraph)

4 Concluding Remarks

In this paper, we improve and extend Polyak’s local convexity result for quadratic transformations by providing tighter bounds for
$$\begin{aligned} \max _{\Vert y\Vert =1} \lambda _{\max }\left( [A_1y\,\cdots \,A_my]^{\rm T}[A_1y\,\cdots \,A_my]\right) . \end{aligned}$$
It remains open whether the above nonlinear eigenvalue optimization problem can be solved globally in an efficient way. Moreover, we propose a convex semidefinite programming (CSDP) relaxation, which we conjecture to be the tightest among all existing upper bounds, as we are unable to find a counterexample. Since the above nonlinear eigenvalue optimization problem can be reformulated as the following polynomial optimization problem
$$\begin{aligned} \max _{\Vert z\Vert =1,\,\Vert y\Vert =1} z^{\rm T}\left( [A_1y\,\cdots \,A_my]^{\rm T}[A_1y\,\cdots \,A_my]\right) z, \end{aligned}$$
studying upper bounds for this polynomial optimization problem, for example by applying the SOS (sum of squares) method of Lasserre’s SDP hierarchy [12], is left as future work.


Acknowledgments

The author is grateful to the anonymous referee whose comments improved this paper.

References

[1] Beck, A.: On the convexity of a class of quadratic mappings and its application to the problem of finding the smallest ball enclosing a given intersection of balls. J. Glob. Optim. 39(1), 113–126 (2007)
[2] Bhatia, R.: Matrix Analysis. Springer-Verlag, New York (1997)
[3] Brickman, L.: On the field of values of a matrix. Proc. AMS 12, 61–66 (1961)
[4] Dines, L.L.: On the mapping of quadratic forms. Bull. AMS 47, 494–498 (1941)
[5] Fradkov, A.L.: Duality theorems for certain nonconvex extremum problems. Sib. Math. J. 14, 247–264 (1973)
[6] Grant, M., Boyd, S.: CVX: Matlab software for disciplined convex programming, version 1.21 (2010). http://cvxr.com/cvx
[7] Gu, Y.: The distribution of eigenvalues of a matrix. Acta Math. Appl. Sin. 17(4), 501–511 (1994)
[8] Hausdorff, F.: Der Wertvorrat einer Bilinearform. Mathematische Zeitschrift 3, 314–316 (1919)
[9] He, S., Li, Z., Zhang, S.: Approximation algorithms for homogeneous polynomial optimization with quadratic constraints. Math. Program. Ser. B 125, 353–383 (2010)
[10] Horn, R.A., Johnson, C.R.: Matrix Analysis. Cambridge University Press, Cambridge (1985)
[11] Horn, R.A., Johnson, C.R.: Topics in Matrix Analysis. Cambridge University Press, Cambridge (1991)
[12] Lasserre, J.B.: Global optimization with polynomials and the problem of moments. SIAM J. Optim. 11, 796–817 (2001)
[13] Lim, L.-H.: Singular values and eigenvalues of tensors: a variational approach. In: Proceedings of the IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing, vol. 1, pp. 129–132 (2005)
[14] Polyak, B.T.: Convexity of quadratic transformations and its use in control and optimization. J. Optim. Theory Appl. 99, 553–583 (1998)
[15] Polyak, B.T.: Convexity of nonlinear image of a small ball with applications to optimization. Set-Valued Anal. 9, 159–168 (2001)
[16] Polyak, B.T.: The convexity principle and its applications. Bull. Braz. Math. Soc. (N.S.) 34(1), 59–75 (2003)
[17] Pólik, I., Terlaky, T.: A survey of the S-lemma. SIAM Rev. 49(3), 371–418 (2007)
[18] Ramana, M., Goldman, A.J.: Quadratic maps with convex images. Report 36-94, Rutgers Center for Operations Research, Rutgers, The State University of New Jersey (1994)
[19] Toeplitz, O.: Das algebraische Analogon zu einem Satz von Fejér. Mathematische Zeitschrift 2, 187–197 (1918)
[20] Toh, K.C., Todd, M.J., Tütüncü, R.H.: SDPT3—a Matlab software package for semidefinite programming. Optim. Methods Softw. 11, 545–581 (1999)
[21] Uderzo, A.: On the Polyak convexity principle and its application to variational analysis. Nonlinear Anal. 91, 60–71 (2013)
[22] Xia, Y., Wang, S., Sheu, R.L.: S-lemma with equality and its applications. arXiv:1403.2816v2 (2014). http://arxiv.org/abs/1403.2816
[23] Yakubovich, V.A.: S-procedure in nonlinear control theory. Vestnik Leningrad. Univ. 1, 62–77 (1971) (in Russian)
[24] Yakubovich, V.A.: S-procedure in nonlinear control theory. Vestnik Leningrad. Univ. 4, 73–93 (1977) (English translation)

Copyright information

© Operations Research Society of China, Periodicals Agency of Shanghai University, and Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

State Key Laboratory of Software Development Environment, LMIB of the Ministry of Education, School of Mathematics and System Sciences, Beihang University, Beijing, China
