1 Packing and Covering Semidefinite Programs

We denote by \(\mathbb S^n\) the set of all \(n\times n\) real symmetric matrices and by \(\mathbb S^n_+\subseteq \mathbb S^n\) the set of all \(n\times n\) positive semidefinite (psd) matrices. We consider the following pairs of packing-covering semidefinite programs (SDPs):

$$\begin{aligned} z_I^* = \max \ &C\bullet X \qquad \qquad ({\textsc {Packing-I}})\\ \text {s.t.}\ \ &A_i\bullet X\le b_i, \ \ \forall i\in [m]\\ &X\in \mathbb S^{n},~X\succeq 0 \end{aligned}$$
$$\begin{aligned} z_I^* = \min \ &b^Ty \qquad \qquad ({\textsc {Covering-I}})\\ \text {s.t.}\ \ &\sum _{i=1}^my_iA_i\succeq C\\ &y\in \mathbb R^m,~y\ge 0, \end{aligned}$$
$$\begin{aligned} z_{II}^* = \min \ &C\bullet X \qquad \qquad ({\textsc {Covering-II}})\\ \text {s.t.}\ \ &A_i\bullet X\ge b_i, \ \ \forall i\in [m]\\ &X\in \mathbb S^{n},~X\succeq 0 \end{aligned}$$
$$\begin{aligned} z_{II}^* = \max \ &b^Ty \qquad \qquad ({\textsc {Packing-II}})\\ \text {s.t.}\ \ &\sum _{i=1}^my_iA_i\preceq C\\ &y\in \mathbb R^m,~y\ge 0, \end{aligned}$$

where \(C,A_1,\ldots ,A_m \in \mathbb S_+^n\) are (non-zero) psd matrices, and \(b=(b_1,\ldots ,b_m)^T\in \mathbb R^m_+\) is a non-negative vector. In the above, \(C\bullet X:=\text {Tr}(CX)=\sum _{i=1}^n\sum _{j=1}^n c_{ij}x_{ij}\), and “\(\succeq \)” is the Löwner order on matrices: \(A\succeq B\) if and only if \(A-B\) is psd. SDPs of this type arise in many applications; see, for example, [14, 15] and the references therein.
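As a concrete illustration of this notation, here is a minimal numpy sketch (ours, for illustration only) of the inner product \(C\bullet X\) and the Löwner-order test:

```python
import numpy as np

def frob(C, X):
    """Frobenius inner product C . X = Tr(CX) = sum_ij c_ij x_ij."""
    return np.trace(C @ X)        # for symmetric C, X also equals (C * X).sum()

def is_psd(A, tol=1e-9):
    """Test A >= 0 (psd) via the smallest eigenvalue."""
    return np.linalg.eigvalsh(A).min() >= -tol

# A >= B in the Loewner order iff A - B is psd:
A = np.array([[2.0, 0.0], [0.0, 2.0]])
B = np.array([[1.0, 0.5], [0.5, 1.0]])
print(is_psd(A - B))              # True: eigenvalues of A - B are 0.5 and 1.5
```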

We assume the following throughout this chapter:

  1. (A)

    \(b_i>0\) for all \(i\in [m]\); hence, by rescaling the constraints, we may assume that \(b_i=1\) for all \(i\in [m]\).

It is known that, under assumption (A), strong duality holds for problems (Packing-I) and (Covering-I) (resp., (Packing-II) and (Covering-II)). Let \(\epsilon \in (0,1]\) be a given constant. We say that \((X,y)\) is an \(\epsilon \)-optimal primal-dual solution for (Packing-I)-(Covering-I) if \((X,y)\) is a primal-dual feasible pair such that

$$\begin{aligned} C\bullet X\ge (1-\epsilon )b^Ty\ge (1-\epsilon )z^*_I. \end{aligned}$$
(4.1)

Similarly, we say that \((X,y)\) is an \(\epsilon \)-optimal primal-dual solution for (Packing-II)-(Covering-II) if \((X,y)\) is a primal-dual feasible pair such that

$$\begin{aligned} C\bullet X\le (1+\epsilon )b^Ty\le (1+\epsilon )z^*_{II}. \end{aligned}$$
(4.2)

In this chapter, we allow the number of constraints m in (Packing-I) (resp., (Covering-II)) to be exponentially (or even infinitely) large, so we assume the availability of the following oracle:

 

Max(Y) (resp., Min(Y)): Given \(Y\in \mathbb S_+^n\), find \(i\in {\text {argmax}}_{i\in [m]}A_i\bullet Y\) (resp., \(i\in {\text {argmin}}_{i\in [m]}A_i\bullet Y\)).

Note that an approximation oracle computing the above maximum (resp., minimum) within a factor of \((1-\epsilon )\) (resp., \((1+\epsilon )\)) is also sufficient for our purposes. A primal-dual solution \((X,y)\) to (Covering-I) (resp., (Packing-II)) is said to be \(\eta \)-sparse if the size of \({\text {supp}}(y):=\{i\in [m]:y_i>0\}\) is at most \(\eta \).
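For concreteness, when the constraint matrices are given explicitly as a list, the oracle is a trivial scan; the following sketch (ours) is illustrative only, since the point of the oracle model is precisely to replace this scan by a problem-specific routine when m is huge:

```python
import numpy as np

def max_oracle(A_list, Y):
    """Max(Y): an index i maximizing A_i . Y over an explicit list of matrices."""
    return int(np.argmax([np.trace(A @ Y) for A in A_list]))

def min_oracle(A_list, Y):
    """Min(Y): an index i minimizing A_i . Y over an explicit list of matrices."""
    return int(np.argmin([np.trace(A @ Y) for A in A_list]))
```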

When \(C=I=I_n\) (which is the identity matrix in \(\mathbb R^{n\times n}\)) and \(b=\mathbf{1}_m\) (which is the vector containing all ones in \(\mathbb R^m\)), we say that the packing-covering SDPs are in normalized form. It can be shown (see, e.g., [7, 16]) that, to within a multiplicative factor of \((1+\epsilon )\) in the objective, any pair of packing-covering SDPs of the form (Packing-I)-(Covering-I) can be brought to normalized form in \(O(n^3)\) time while increasing the oracle time by only \(O(n^\omega )\), where \(\omega \) is the exponent of matrix multiplication, under the following assumption:

  1. (B-I)

    There exist r matrices, say \(A_{1},\ldots ,A_{r}\), such that \(\hat{A}:=\sum _{i=1}^rA_{i}\succ 0\). In particular, \(\text {Tr}(X)\le \tau :=\frac{r}{\lambda _{\min }(\hat{A})}\) for any optimal solution X for (Packing-I), and we may assume that \(r=1\) and \(A_1=\frac{1}{\tau }I\).

Similarly, it can be shown that, to within a multiplicative factor of \((1+\epsilon )\) in the objective, any pair of packing-covering SDPs of the form (Packing-II)-(Covering-II) can be brought to normalized form in \(O(n^3)\) time, while increasing the oracle time by only \(O(n^\omega )\). Moreover, we may assume in this normalized form that

  1. (B-II)

    \(\lambda _{\min }(A_i)=\Omega \big (\frac{\epsilon }{n}\cdot \min _{i'}\lambda _{\max }(A_{i'})\big )\) for all \(i\in [m]\),

where, for a psd matrix \(B\in \mathbb S_+^{n}\), we denote by \(\{\lambda _j(B):~j=1,\ldots ,n\}\) the eigenvalues of B, and by \(\lambda _{\min }(B)\) and \(\lambda _{\max }(B)\) the minimum and maximum eigenvalues of B, respectively. Given additional \(O(mn^2)\) time, we may also assume that

  1. (B-II’)

    \(\frac{\lambda _{\max }(A_i)}{\lambda _{\min }(A_i)}= O\big (\frac{n^2}{\epsilon ^2}\big )\) for all \(i\in [m]\).

Thus, the remainder of this chapter focuses on normalized problems.

Mixed packing and covering SDPs.

We also consider the following mixed packing-covering feasibility SDPs:

$$\begin{aligned} &A_i\bullet X\le b_i, \ \ \forall i\in [m_p] \qquad \qquad ({\textsc {Mix-Pack-Cover}})\\ &B_i\bullet X\ge d_i, \ \ \forall i\in [m_c]\\ &X\in \mathbb S^{n},~X\succeq 0, \end{aligned}$$

where \(A_1,\ldots ,A_{m_p}, B_1,\ldots ,B_{m_c} \in \mathbb S_+^{n}\) are psd matrices, and \(b=(b_1,\ldots ,b_{m_p})^T\), \(d=(d_1,\ldots ,d_{m_c})^T\) are non-negative real vectors.

A matrix \(X\in \mathbb S_+^n\) is an \(\epsilon \)-approximate solution for (Mix-Pack-Cover) if \(A_i\bullet X\le b_i\) for all \(i\in [m_p]\) and \(B_i\bullet X\ge (1-\epsilon )d_i\) for all \(i\in [m_c]\).
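The \(\epsilon \)-approximation condition is easy to check given X; a small sketch (ours, with explicit constraint lists) follows:

```python
import numpy as np

def is_eps_approximate(X, A_pack, b, B_cover, d, eps):
    """Check the eps-approximate condition for (Mix-Pack-Cover):
    A_i . X <= b_i for every packing constraint, and
    B_i . X >= (1 - eps) d_i for every covering constraint."""
    pack_ok = all(np.trace(A @ X) <= b_i for A, b_i in zip(A_pack, b))
    cover_ok = all(np.trace(B @ X) >= (1.0 - eps) * d_i for B, d_i in zip(B_cover, d))
    return pack_ok and cover_ok
```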

2 Applications

2.1 SDP Relaxation for Robust MaxCut

Given a simple undirected graph \(G=(V,E)\) on \(n=|V|\) vertices with non-negative edge weights \(w\in \mathbb R_+^E\), the objective in the well-known MaxCut problem is to find a subset of the vertices \(X\subset V\) that maximizes the weight of the cut: \( w(X,V\setminus X):=\sum _{u\in X,~v\in V\setminus X}w_{uv}\). The best-known approximation algorithm (with approximation ratio \(0.878\ldots \)) [10] for MaxCut is based on the following SDP relaxation:

$$\begin{aligned} \max \ &L(w)\bullet X \qquad \qquad ({\textsc {MaxCut-SDP}})\\ \text {s.t.}\ \ &\mathbf{1}_{i}\mathbf{1}^T_i\bullet X=1, \ \ \forall i\in [n]\\ &X\in \mathbb R^{n\times n},~X\succeq 0. \end{aligned}$$
(4.3)

By simply changing the equality in (4.3) into an inequality, this can be written in the form (Packing-I), with \(A_i:=\mathbf{1}_i\mathbf{1}_i^T\) (where \(\mathbf{1}_i\) denotes the ith standard basis vector) and \(C:=L(w)\succeq 0\) being the Laplacian matrix of G, defined as follows:

$$ L_{ij}(w)=\left\{ \begin{array}{ll} \sum _{k=1}^nw_{ik}&\text { if }i=j,\\ -w_{ij}&\text { if }\{i,j\}\in E,\\ 0&\text { otherwise.} \end{array} \right. $$
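A short numpy sketch (ours) of this definition, taking the graph as a list of weighted edges with 0-indexed vertices:

```python
import numpy as np

def laplacian(n, weighted_edges):
    """Graph Laplacian L(w) from (i, j, w_ij) triples: L_ii = sum_k w_ik,
    L_ij = -w_ij for {i, j} in E, and 0 otherwise."""
    L = np.zeros((n, n))
    for i, j, w in weighted_edges:
        L[i, i] += w
        L[j, j] += w
        L[i, j] -= w
        L[j, i] -= w
    return L  # psd, since x^T L(w) x = sum_{{i,j} in E} w_ij (x_i - x_j)^2 >= 0
```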

Based on this relaxation, the following result is obtained using the scalar multiplicative weights update (MWU) method:

Theorem 4.1

([18]) There is a randomized algorithm for finding an \(\epsilon \)-optimal solution for (MaxCut-SDP) in time \(\tilde{O}(\frac{nm}{\epsilon ^3})\), where n and m respectively denote the number of vertices and edges in a given graph.

Under the robust optimization framework, one assumes that the weights are not known precisely but instead belong to a convex uncertainty set \(\mathcal W\subseteq \mathbb R^{E}_+\), and one seeks a (near-)optimal solution under the worst-case choice of \(w\in \mathcal W\) in the uncertainty set:

$$\begin{aligned} \max \ &\min _{w\in \mathcal W}\, L(w)\bullet X \qquad \qquad ({\textsc {Robust-MaxCut-SDP}})\\ \text {s.t.}\ \ &\mathbf{1}_{i}\mathbf{1}^T_i\bullet X\le 1, \ \ \forall i\in [n]\\ &X\in \mathbb R^{n\times n},~X\succeq 0. \end{aligned}$$
(4.4)

By “guessing” the value \(\tau \) of an optimal solution (via binary search), (4.4) can be reduced to

$$\begin{aligned} \min \ &I\bullet X\\ \text {s.t.}\ \ &\mathbf{1}_{i}\mathbf{1}^T_i\bullet X\ge 1, \ \ \forall i\in [n]\\ &\frac{1}{\tau }L(w)\bullet X\ge 1, \ \ \forall w\in \mathcal W\\ &X\in \mathbb R^{n\times n},~X\succeq 0. \end{aligned}$$

Thus, we obtain a covering SDP (of type (Covering-II)) with an infinite number of constraints, given by a minimization oracle over the convex set \(\mathcal W\). We can use the matrix logarithmic-potential method to obtain the following result:

Theorem 4.2

There is a randomized algorithm that finds an \(\epsilon \)-optimal solution for (4.4) in time \(\tilde{O}\big (\frac{n^{\omega +1}}{\epsilon ^{2.5}}+\frac{n\mathcal T}{\epsilon ^2}\big )\), where \(\mathcal T\) is the time needed to optimize a linear function over \(\mathcal W\).

Note that for this reduction to remain valid, it is sufficient to find an \(\epsilon \)-optimal solution to (4.4) for any \(\epsilon =o\big (\frac{1}{n}\big )\).
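Schematically, the binary search looks as follows; covering_solver is a hypothetical routine (not from the source) that, given a guess \(\tau \), either returns a near-feasible solution X of the covering SDP above with \(I\bullet X\le n\) (to the accuracy just noted) or reports failure:

```python
def robust_maxcut_binary_search(covering_solver, lo, hi, tol):
    """Schematic sketch of the reduction: binary search for the optimal
    cut value tau of (4.4).  covering_solver(tau) is assumed to return a
    matrix X, or None when it certifies that the guess tau is too large."""
    best_X = None
    while hi - lo > tol:
        tau = 0.5 * (lo + hi)
        X = covering_solver(tau)
        if X is not None:
            lo, best_X = tau, X   # cut value tau is (near-)achievable
        else:
            hi = tau
    return lo, best_X
```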

2.2 Mahalanobis Distance Learning

Given a psd matrix \(X\in \mathbb S^{n}_+\), the X-Mahalanobis distance between two points \(a,b\in \mathbb R^n\) is defined as

$$ d_X(a,b):=\sqrt{(a-b)^TX(a-b)}. $$

The distance function \(d_X(\cdot ,\cdot )\) is a semi-metric; that is, it is symmetric (\(d_X(a,b)=d_X(b,a)\)) and satisfies the triangle inequality (\(d_X(a,c)\le d_X(a,b)+d_X(b,c)\)). It is moreover a metric if \(X\succ 0\) (as in this case, \(d_X(a,b)=0\) if and only if \(a=b\)).

The Mahalanobis distance learning problem is defined as follows [28]: Given sets \(\mathcal C_s\) and \(\mathcal C_d\) of similar and dissimilar pairs of points in \(\mathbb R^n\), respectively, a similarity parameter \(\sigma _s\in \mathbb R_+\) and a dissimilarity parameter \(\sigma _d\in \mathbb R_+\), the objective is to find a matrix X such that all the pairs in \(\mathcal C_s\) are “close” and all the pairs in \(\mathcal C_d\) are “far” with respect to the distance function \(d_X(\cdot ,\cdot )\):

$$\begin{aligned} \quad&\displaystyle (a-b)^TX(a-b)\le \sigma _s, \ \forall (a,b)\in \mathcal C_s \end{aligned}$$
(4.5)
$$\begin{aligned} \quad&\displaystyle (a-b)^TX(a-b)\ge \sigma _d, \ \forall (a,b)\in \mathcal C_d \end{aligned}$$
(4.6)
$$\begin{aligned} \qquad&X\in \mathbb S^{n},~X\succeq 0. \end{aligned}$$
(4.7)

Note that this can be written in the form (Mix-Pack-Cover), with \(|\mathcal C_s|\) packing constraints of the form \(A_{a,b}\bullet X \le \sigma _s\), where \(A_{a,b}=(a-b)(a-b)^T\) for \((a,b)\in \mathcal C_s\), and \(|\mathcal C_d|\) covering constraints of the form \(B_{a,b}\bullet X \ge \sigma _d\), where \(B_{a,b}=(a-b)(a-b)^T\) for \((a,b)\in \mathcal C_d\).
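Since \(d_X(a,b)^2=(a-b)^TX(a-b)=A_{a,b}\bullet X\), both constraint families are linear in X. A small sketch (ours) building the rank-one constraint matrices:

```python
import numpy as np

def rank_one_constraints(pairs):
    """Constraint matrices A_{a,b} = (a - b)(a - b)^T for a list of point
    pairs, as in (4.5)-(4.6); each is rank-one and psd."""
    mats = []
    for a, b in pairs:
        v = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
        mats.append(np.outer(v, v))  # d_X(a,b)^2 = A_{a,b} . X
    return mats
```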

We can use the scalar MWU method to obtain the following result:

Theorem 4.3

There is a deterministic algorithm that finds an \(\epsilon \)-feasible solution for (4.5)-(4.7) in time \(\tilde{O}(\frac{m(m+n^3)}{\epsilon ^2})\), where n is the dimension of the point sets and \(m:=|\mathcal C_s|^2+|\mathcal C_d|^2\).

We remark that it is plausible that further improvements (possibly by another factor of O(m)) are possible via rank-one tricks and the use of approximate eigenvalue computations.

2.3 Related Work

Problems (Packing-I)-(Covering-I) and (Packing-II)-(Covering-II) can be solved using general SDP solvers, such as interior-point methods. For example, the barrier method (see, e.g., [22]) can compute a solution within an additive error \(\epsilon \) of the optimum in time \(O(\sqrt{n}m(n^3+mn^2+m^2)\log \frac{1}{\epsilon })\) (see also [1, 27]). However, due to the special nature of (Packing-I)-(Covering-I) and (Packing-II)-(Covering-II), better algorithms can be obtained. Most of the improvements are obtained by using first-order methods [2, 3, 5, 6, 8, 15, 16, 17, 18, 21, 23, 24] or second-order methods [13, 14]. In general, we can classify these algorithms according to whether they are (semi) width-independent, are parallel, output sparse solutions, or are oracle-based, as follows.

  1. (I)

    (Semi) width-independent: The running time of the algorithm depends polynomially on the bit length of the input. For example, in the case of (Packing-I)-(Covering-I), the running time is \({\text {poly}}(n,m,\mathcal L,\log \tau ,\frac{1}{\epsilon })\), where \(\mathcal L\) is the maximum bit length needed to represent any number in the input. In contrast, the running time of a width-dependent algorithm depends polynomially on a “width parameter” \(\rho \), which is polynomial in \(\mathcal L\) and \(\tau \).

  2. (II)

    Parallel: The algorithm takes \({\text {polylog}}(n,m,\mathcal L,\log \tau )\cdot {\text {poly}}(\frac{1}{\epsilon })\) time on \({\text {poly}}(n,m,\mathcal L,\log \tau ,\frac{1}{\epsilon })\) processors.

  3. (III)

    Sparse: The algorithm outputs an \(\eta \)-sparse solution to (Covering-I) (resp., (Packing-II)) for \(\eta ={\text {poly}}(n,\log m,\mathcal L,\log \tau ,\frac{1}{\epsilon })\) (resp., \(\eta ={\text {poly}}(n,\log m,\mathcal L,\frac{1}{\epsilon })\)), where \(\tau \) is a parameter that bounds the trace of any optimal solution X.

  4. (IV)

    Oracle-based: The only access the algorithm has to the matrices \(A_1,\ldots ,A_m\) is via the maximization/minimization oracle, and hence the running time is independent of m.

Table 4.1 below gives a summary of the most relevant results together with their classifications according to the four criteria above. We note that almost all of these algorithms for packing/covering SDPs are generalizations of similar algorithms for packing/covering linear programs (LPs), and most of them are essentially based on an exponential potential function in the form of scalar exponentials, such as [3, 18], or matrix exponentials [2, 5, 6, 15, 17]. For instance, several of these results use the scalar or matrix versions of the MWU method (see, e.g., [4]), which are extensions of similar methods for packing/covering LPs [9, 11, 25, 29].

In [12], a different type of algorithm was given for covering LPs (indeed, more generally, for a class of concave covering inequalities) based on a logarithmic potential function. In [7], it was shown that this approach could be extended to provide sparse solutions for both versions of packing and covering SDPs.

Table 4.1 Different algorithms for packing/covering SDPs

As we can see from the table, among all the algorithms, only the matrix (MWU and logarithmic-potential) algorithms are oracle-based (and hence produce sparse solutions) in the sense described above. However, the overall running time of the matrix MWU algorithm is larger by a factor of (roughly) \(\Omega (n^{3-\omega })\) than that of the logarithmic-potential algorithm, where \(\omega \) is the exponent of matrix multiplication. Moreover, it is not known how to extend the matrix MWU algorithm to solve (Packing-I)-(Covering-I) (in particular, it seems difficult to bound the number of iterations).

3 General Framework for Packing-Covering SDPs

Given a pair of packing-covering SDPs (Packing-I)-(Covering-I) or (Covering-II)-(Packing-II), we consider the following general framework, in which each constraint is assigned a weight reflecting how well it is satisfied by the current solution:

Algorithm 1 (figure not reproduced)

We obtain different algorithms depending on how the weights are defined. We write \(a_i:=A_i\bullet X\ge 0\). Since \(a_{\max }:=\max \{a_1,\ldots ,a_m\}\) (resp., \(a_{\min }:=\min \{a_1,\ldots ,a_m\}\)) is not a smooth function of X, it is more convenient to work with a smooth approximation of it, which is provided by the weighted average formed in step 3 of the framework. There are several ways to do this, for example (both weightings below are also sketched in code after Lemma 4.2):

  • Exponential averaging: The weights are \(\overline{p}_i:=\frac{(1+\epsilon )^{a_i}}{\sum _{i'=1}^m(1+\epsilon )^{a_{i'}}}\) (resp., \(\overline{p}_i:=\frac{(1-\epsilon )^{a_i}}{\sum _{i'=1}^m(1-\epsilon )^{a_{i'}}}\)). The following claim justifies the use of these sets of weights.

Lemma 4.1

If \(a_{\max }\ge \frac{1+\epsilon }{\epsilon }\log _{1+\epsilon }\frac{m}{\epsilon }\) (resp., \(a_{\min }\ge \frac{1}{\epsilon }\log _{\frac{1}{1-\epsilon }}\left( \frac{m \cdot a_{\max }}{\epsilon \cdot a_{\min }}\right) \)), then

$$ \frac{a_{\max }}{1+\epsilon }\le \sum _{i=1}^m\overline{p}_i a_i\le a_{\max } ~~~\Bigl ( \text {resp., }\, a_{\min }\le \sum _{i=1}^m\overline{p}_i a_i\le (1+\epsilon )a_{\min }\Bigr ). $$
  • Logarithmic potential averaging: The weights are \(\overline{p}_i=\frac{\epsilon }{m}\cdot \frac{\theta ^*}{\theta ^*-a_i}\) (resp., \(\overline{p}_i=\frac{\epsilon }{m}\cdot \frac{\theta ^*}{a_i-\theta ^*}\)), where \(\theta ^*\) is the minimizer (resp., maximizer) of the potential function

    $$\begin{aligned} \Phi (\theta )=\ln \left( \theta \cdot \left( \prod _{i=1}^m\frac{1}{\theta -a_i}\right) ^{\epsilon /m}\right) \ \ \left( \text {resp., } \Phi (\theta )=\ln \left( \theta \cdot \left( \prod _{i=1}^m(a_i-\theta )\right) ^{\epsilon /m}\right) \right) . \end{aligned}$$

    (It can be easily verified that \(\sum _i\overline{p}_i=1\).) The following claim justifies the use of these sets of weights.

Lemma 4.2

$$ \frac{(1-\epsilon )a_{\max }}{1-\epsilon /m}\le \sum _{i=1}^m\overline{p}_i a_i\le a_{\max } \ \ \ \Bigl ( \text {resp., }\, a_{\min }\le \sum _{i=1}^m\overline{p}_i a_i\le \frac{(1+\epsilon )a_{\min }}{1+\epsilon /m}\Bigr ). $$
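As promised above, here is a numpy sketch (ours) of both weightings for given values \(a_i=A_i\bullet X\); the shift by \(a_{\max }\) in the exponential case is a numerical-stability device of our own and cancels in the normalization:

```python
import numpy as np

def exponential_weights(a, eps):
    """Exponential averaging (packing case): p_i proportional to (1+eps)^{a_i}."""
    p = np.power(1.0 + eps, a - a.max())  # shift cancels after normalization
    return p / p.sum()

def log_potential_weights(a, eps, theta):
    """Logarithmic-potential weights (packing case), for theta > max(a):
    p_i = (eps/m) * theta / (theta - a_i); at theta = theta* they sum to 1."""
    m = len(a)
    return (eps / m) * theta / (theta - a)
```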

4 Scalar Algorithms

4.1 Scalar MWU Algorithm for (Packing-I)-(Covering-I)

Given a normalized pair of packing-covering SDPs of type I (Packing-I)-(Covering-I), and a feasible primal solution X, we use the exponential weight \(p_i:=(1+\epsilon )^{A_i\bullet X}\), for \(i\in [m]\). Averaging the inequalities with respect to the weights \(\overline{p}_i:=\frac{p_i}{\sum _{i}p_i}\), we arrive at the following problem:

$$\begin{aligned} \max \ &I\bullet X\\ \text {s.t.}\ \ &\sum _{i=1}^m\overline{p}_iA_i\bullet X\le 1\\ &X\in \mathbb R^{n\times n},~X\succeq 0. \end{aligned}$$
(4.8)

Letting \(\overline{A}:=\sum _i\overline{p}_iA_i\) and writing \(X=\sum _{v\in B_n}\lambda _vvv^T\), where \( B_n:=\{v\in \mathbb R^n:~\Vert v\Vert =1\} \) and \(\lambda _v\ge 0\) for all \(v\in B_n\), we obtain the following (infinite-dimensional) knapsack problem

$$\begin{aligned} \max \ &\sum _{v\in B_n} \lambda _v\\ \text {s.t.}\ \ &\sum _{v\in B_n}\lambda _v\,\overline{A}\bullet vv^T\le 1\\ &\lambda _v\ge 0, \ \ \forall v\in B_n. \end{aligned}$$
(4.9)

An optimal solution is attained at a vector \(v\in B_n\) that minimizes \(v^T\overline{A}v\), i.e., at a unit eigenvector corresponding to \(\lambda _{\min }(\overline{A})\).

Thus, using this set of weights in our general framework (Algorithm 1) yields the following procedure (for a vector \(p\in \mathbb R^m\), we write \(\overline{p}_i:=\frac{p_i}{\sum _ip_i}\)):

Algorithm 2 (figure not reproduced)
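Since the algorithm figure is not reproduced here, the following is a hedged reconstruction (ours) of the scalar MWU loop from the surrounding description; the step-size normalization (largest left-hand-side increase equal to 1) and the final rescaling are our reading of the text, and the original constants may differ:

```python
import numpy as np

def scalar_mwu_packing(A_list, eps):
    """Sketch of Algorithm 2 for normalized (Packing-I): reweight the
    constraints exponentially and push X along a minimum eigenvector of
    the averaged matrix A_bar until some A_i . X reaches T."""
    m, n = len(A_list), A_list[0].shape[0]
    T = np.log(m) / eps**2                  # threshold from the text
    X, a = np.zeros((n, n)), np.zeros(m)    # a_i = A_i . X
    while a.max() < T:
        p = np.power(1.0 + eps, a - a.max()); p /= p.sum()
        A_bar = sum(pi * Ai for pi, Ai in zip(p, A_list))
        v = np.linalg.eigh(A_bar)[1][:, 0]  # eigenvector of lambda_min(A_bar)
        vals = np.array([v @ Ai @ v for Ai in A_list])
        step = 1.0 / vals.max()             # largest LHS increase equals 1
        X += step * np.outer(v, v)
        a += step * vals
    return X / a.max()                      # rescale so that A_i . X <= 1 for all i
```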

The stopping criterion is that the left-hand side (LHS) of at least one inequality in (Packing-I) reaches some threshold \(T:=\epsilon ^{-2}\ln m\), with respect to the current solution X(t). The step size (step 5) is chosen such that, in each iteration of the while-loop, at least one of these left-hand sides increases by at least 1, thus guaranteeing termination within mT iterations.

Theorem 4.4

Given a real \(\epsilon \in (0,1]\), Algorithm 2 outputs an \(\epsilon \)-optimal solution for (Packing-I)-(Covering-I) in \(O(m\log m/\epsilon ^2)\) iterations, where each iteration requires an oracle call that computes an eigenvector corresponding to the minimum eigenvalue of a psd matrix.

For a given matrix \(M\in \mathbb R^{n\times n}\), computing \(\lambda _{\min }(M)\) (almost) exactly requires \(O(n^3)\) time via a full eigenvalue decomposition of the matrix. If M is psd, a faster approximation of \(\lambda _{\min }(M)\) can be obtained (using Lanczos’ algorithm with a random start) via the following result.

Theorem 4.5

([19]) Let \(M\in \mathbb S_+^n\) be a psd matrix with N non-zeros and let \(\gamma \in (0,1)\) be a given constant. Then, there is a randomized algorithm that computes, with high (i.e., \(1-o(1)\)) probability, a unit vector \(v\in \mathbb R^n\) such that \(v^TMv\ge (1-\gamma )\lambda _{\max }(M)\). The algorithm takes \(O\big (\frac{\log n}{\sqrt{\gamma }}\big )\) iterations, each requiring O(N) arithmetic operations.

By applying this theorem to \((\overline{A})^{-1}\), we can approximate \(\lambda _{\min }(\overline{A})\) in \(\tilde{O}(n^{\omega })\) time.
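The following sketch (ours) substitutes plain randomized power iteration for Lanczos, which is slower by a \(\sqrt{\gamma }\) factor in the iteration count but simpler to state; the inversion-based reduction from \(\lambda _{\min }\) to \(\lambda _{\max }\) is the one just described:

```python
import numpy as np

def approx_lambda_max(M, gamma, seed=0):
    """Randomized power iteration on psd M: with high probability the
    returned unit vector v has v^T M v >= (1 - gamma) lambda_max(M) after
    O(log n / gamma) iterations (Lanczos needs only O(log n / sqrt(gamma)))."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(M.shape[0])
    for _ in range(int(np.ceil(np.log(M.shape[0] + 1) / gamma)) + 1):
        v = M @ v
        v /= np.linalg.norm(v)
    return v

def approx_lambda_min(M, gamma):
    """lambda_min(M) = 1 / lambda_max(M^{-1}) for M > 0: run the routine
    above on M^{-1} (in practice one uses linear solves with M instead)."""
    v = approx_lambda_max(np.linalg.inv(M), gamma)
    return float(v @ M @ v)
```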

4.2 Scalar Logarithmic Potential Algorithm For (Packing-I)–(Covering-I)

Given a normalized pair of packing-covering SDPs of type I (Packing-I)-(Covering-I) and a feasible primal solution X, we use the logarithmic-potential weights \(\overline{p}_i=\frac{\epsilon }{m}\frac{\theta ^*}{\theta ^*-A_i\bullet X}\) for \(i\in [m]\). Averaging the inequalities with respect to this set of weights, we arrive at the knapsack problem (4.9). This gives rise to the following procedure:

Algorithm 3 (figure not reproduced)

In the above, for given numbers \(x\in \mathbb R_+\) and \(\delta \in (0,1)\), we define the \(\delta \)-(upper) approximation \(x^\delta \) of x to be a number satisfying: \(x\le x^\delta <(1+\delta )x\).

Theorem 4.6

Given \(\epsilon \in (0,1]\), Algorithm 3 outputs an \(\epsilon \)-optimal solution for (Covering-I)-(Packing-I) in \(O(m\log \psi +m/\epsilon ^2)\) iterations, where \(\psi :=\frac{\lambda _{\max }(\overline{A}(0))}{\lambda _{\min }(\overline{A}(0))}\) and each iteration requires an oracle call that computes an eigenvector corresponding to the minimum eigenvalue of a psd matrix.

5 Matrix Algorithms

5.1 Matrix MWU Algorithm For (Covering-II)-(Packing-II)

Let \(F(y):=\sum _{i=1}^my_iA_i\). Then, we can rewrite the normalized version of (Packing-II) as follows:

$$\begin{aligned} z_{II}^* = \max \ &\mathbf{1}^Ty \qquad \qquad ({\textsc {Packing-II}})\\ \text {s.t.}\ \ &\lambda _j(F(y))\le 1, \ \ \forall j\in [n]\\ &y\in \mathbb R^m,~y\ge 0. \end{aligned}$$

Averaging the inequalities with respect to the weights \(\overline{p}_j:=\frac{p_j}{\sum _{j}p_j}\), where \( p_j:=(1+\epsilon )^{\lambda _j(F(y))}\), we get

$$\begin{aligned} \max \ &\mathbf{1}^Ty\\ \text {s.t.}\ \ &\sum _{j=1}^n\overline{p}_j\lambda _j(F(y))\le 1\\ &y\in \mathbb R^m,~y\ge 0. \end{aligned}$$

Using the eigenvalue decomposition: \(F(y)=U\Lambda U^T\), where \(\Lambda \) is the diagonal matrix containing the eigenvalues of F(y) and \(UU^T=I\), and letting

$$\begin{aligned} \overline{P}:=U\,{\text {diag}}(\overline{p}_1,\ldots ,\overline{p}_n)\,U^T=\frac{(1+\epsilon )^{F(y)}}{\text {Tr}\big ((1+\epsilon )^{F(y)}\big )},\end{aligned}$$

we obtain the following knapsack problem:

$$\begin{aligned} \max \ &\mathbf{1}^Ty\\ \text {s.t.}\ \ &\sum _{i=1}^m(\overline{P}\bullet A_i)\, y_i\le 1\\ &y\in \mathbb R^m,~y\ge 0. \end{aligned}$$

An optimal solution is attained at (a scalar multiple of) the basis vector \(y=\mathbf{1}_i\in \mathbb R^m_+\) that minimizes \(\overline{P}\bullet A_i\). This gives rise to the following matrix MWU algorithm:

Algorithm 4 (figure not reproduced)
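The weight matrix \(\overline{P}\) is the only nonstandard ingredient; here is a minimal sketch (ours) computing it by a full eigendecomposition, which is the O(n³) route mentioned below (Theorem 4.8 gives a faster approximation for sparse matrices):

```python
import numpy as np

def matrix_exp_weights(F, eps):
    """P_bar = (1+eps)^F / Tr((1+eps)^F) via eigendecomposition of F.
    Shifting the eigenvalues by their maximum keeps the powers bounded
    and cancels in the trace normalization."""
    lam, U = np.linalg.eigh(F)
    w = np.power(1.0 + eps, lam - lam.max())
    P = (U * w) @ U.T                 # equals U diag(w) U^T
    return P / np.trace(P)
```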

Theorem 4.7

Given a real \(\epsilon \in (0,1]\), Algorithm 4 outputs an \(\epsilon \)-optimal solution for (Covering-II)-(Packing-II) in \(O(n\log n/\epsilon ^2)\) iterations, where each iteration requires a matrix exponential computation, two calls to an oracle that computes the maximum eigenvalue of a psd matrix, and a single call to the minimization oracle in step 4.

The most demanding step in the above algorithm is the matrix exponential computation, which can be done in \(O(n^3)\) time via a complete eigenvalue decomposition. A more efficient approximation, particularly when the matrices \(A_i\) are sparse, can be obtained via the following result.

Theorem 4.8

([26]) There is an algorithm for approximating the matrix exponential \(e^{F}\) in time \(O(n^2r\log ^3\frac{1}{\epsilon })\), where r denotes the number of non-zeros in \(F\in \mathbb S^{n}\), and \(\epsilon \) is the approximation accuracy.

We remark that a matrix MWU algorithm and a theorem analogous to Algorithm 4 and Theorem 4.7 for (Packing-I)-(Covering-I) have not yet been discovered; this is left as an open problem.

5.2 Matrix Logarithmic Potential Algorithm For (Packing-I)-(Covering-I)

Let \(F(y):=\sum _{i=1}^my_iA_i\). Then, we can rewrite the normalized version of (Covering-I) as

$$\begin{aligned} z_I^* = \min \ &\mathbf{1}^Ty \qquad \qquad ({\textsc {Covering-I}})\\ \text {s.t.}\ \ &\lambda _j(F(y))\ge 1, \ \ \forall j\in [n]\\ &y\in \mathbb R^m,~y\ge 0. \end{aligned}$$

Averaging the inequalities with respect to the weights \(\overline{p}_j:=\frac{\epsilon }{n}\frac{\theta ^*}{\lambda _j(F(y))-\theta ^*}\), we get

$$\begin{aligned} \min \ &\mathbf{1}^Ty\\ \text {s.t.}\ \ &\sum _{j=1}^n\overline{p}_j\lambda _j(F(y))\ge 1\\ &y\in \mathbb R^m,~y\ge 0. \end{aligned}$$

Using the eigenvalue decomposition: \(F(y)=U\Lambda U^T\), where \(\Lambda \) is the diagonal matrix containing the eigenvalues of F(y) and \(UU^T=I\), and letting

$$\begin{aligned} \overline{P}:=U\,{\text {diag}}(\overline{p}_1,\ldots ,\overline{p}_n)\,U^T=\frac{\epsilon \theta ^*}{n}\big (F(y)-\theta ^*I\big )^{-1},\end{aligned}$$

we obtain the following knapsack problem:

$$\begin{aligned} \min \ &\mathbf{1}^Ty\\ \text {s.t.}\ \ &\sum _{i=1}^m(\overline{P}\bullet A_i)\, y_i\ge 1\\ &y\in \mathbb R^m,~y\ge 0. \end{aligned}$$

An optimal solution is attained at (a scalar multiple of) the basis vector \(y=\mathbf{1}_i\in \mathbb R^m_+\) that maximizes \(\overline{P}\bullet A_i\). This gives rise to the following matrix logarithmic-potential algorithm:

Algorithm 5 (figure not reproduced)

The most demanding steps are the computation of \(\theta (t)\) and X(t). Computing \(\theta (t)\) can be done via binary search over a region determined by repeated matrix multiplications and approximate minimum eigenvalue computations (cf. Theorem 4.5). Once \(\theta (t)\) is determined, computing X(t) requires a single matrix inversion. The overall running time per iteration is \(\tilde{O}(n^{\omega })\), plus the time needed by the maximization oracle.
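For illustration, note that \(\theta ^*\) is characterized by the weights summing to one, i.e., \(\frac{\epsilon }{n}\sum _j\frac{\theta }{\lambda _j-\theta }=1\) with \(\theta <\lambda _{\min }(F(y))\), and the left-hand side is increasing in \(\theta \) on this interval; the sketch below (ours) therefore locates it by bisection from exact eigenvalues, whereas the algorithm proper uses only approximate eigenvalue computations:

```python
import numpy as np

def theta_star(lams, eps, tol=1e-12):
    """Bisection for theta* in (0, min(lams)) solving
    (eps/n) * sum_j theta / (lam_j - theta) = 1, the point at which the
    logarithmic-potential weights p_j sum to one.  lams are the eigenvalues
    of F(y); the left-hand side is increasing in theta on this interval."""
    n = len(lams)
    g = lambda t: (eps / n) * np.sum(t / (lams - t))
    lo, hi = 0.0, float(np.min(lams))
    while hi - lo > tol * max(hi, 1.0):
        mid = 0.5 * (lo + hi)
        if g(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```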

Theorem 4.9

Given \(\epsilon \in (0,1]\), Algorithm 5 outputs an \(\epsilon \)-optimal solution for (Covering-I)-(Packing-I) in \(O(n\log \psi +\frac{n}{\epsilon ^2})\) iterations, where \(\psi := \frac{r\cdot \max _i\lambda _{\max }(A_i)}{\lambda _{\min }(\hat{A})}\) and each iteration requires \(O(\log \frac{n}{\epsilon })\) matrix multiplications and a single oracle call to the maximization in step 5.

5.3 Matrix Logarithmic Potential Algorithm For (Packing-II)-(Covering-II)

A symmetric version of Algorithm 5 for (Packing-II)-(Covering-II) can be given as follows:

Algorithm 6 (figure not reproduced)

Theorem 4.10

Given \(\epsilon \in (0,1]\), Algorithm 6 outputs an \(\epsilon \)-optimal solution for (Packing-II)-(Covering-II) in \(O(n\log \psi +\frac{n}{\epsilon ^2})\) iterations, where \(\psi := O(\log \frac{n}{\epsilon })\) and each iteration requires \(O(\log \frac{n}{\epsilon })\) matrix inversions and a single oracle call to the minimization in step 4.