Abstract
Let V be a set of n vertices, \({\mathcal M}\) a set of m labels, and let \({\textbf{R}}\) be an \(m \times n\) matrix of independent Bernoulli random variables with probability of success p; columns of \({\textbf{R}}\) are incidence vectors of label sets assigned to vertices. A random instance \(G(V, E, {\textbf{R}}^T {\textbf{R}})\) of the weighted random intersection graph model is constructed by drawing an edge with weight equal to the number of common labels (namely \([{\textbf{R}}^T {\textbf{R}}]_{v,u}\)) between any two vertices u, v for which this weight is strictly larger than 0. In this paper we study the average case analysis of Weighted Max Cut, assuming the input is a weighted random intersection graph, i.e. given \(G(V, E, {\textbf{R}}^T {\textbf{R}})\) we wish to find a partition of V into two sets so that the total weight of the edges having exactly one endpoint in each set is maximized. In particular, we initially prove that the weight of a maximum cut of \(G(V, E, {\textbf{R}}^T {\textbf{R}})\) is concentrated around its expected value, and then show that, when the number of labels is much smaller than the number of vertices (in particular, \(m=n^{\alpha }, \alpha <1\)), a random partition of the vertices achieves asymptotically optimal cut weight with high probability. Furthermore, in the case \(n=m\) and constant average degree (i.e. \(p = \frac{\Theta (1)}{n}\)), we show that, with high probability, a majority-type randomized algorithm outputs a cut with weight that is larger than the weight of a random cut by a multiplicative constant strictly larger than 1. Then, we formally prove a connection between the computational problem of finding a (weighted) maximum cut in \(G(V, E, {\textbf{R}}^T {\textbf{R}})\) and the problem of finding a 2-coloring that achieves minimum discrepancy for a set system \(\Sigma \) with incidence matrix \({\textbf{R}}\) (i.e. minimum imbalance over all sets in \(\Sigma \)). 
We exploit this connection by proposing a (weak) bipartization algorithm for the case \(m=n, p = \frac{\Theta (1)}{n}\) whose output, upon termination, can be used to find a 2-coloring with minimum discrepancy in a set system with incidence matrix \({\textbf{R}}\). In fact, with high probability, the latter 2-coloring corresponds to a bipartition with maximum cut-weight in \(G(V, E, {\textbf{R}}^T {\textbf{R}})\). Finally, we prove that our (weak) bipartization algorithm terminates in polynomial time, with high probability, at least when \(p = \frac{c}{n}, c<1\).
1 Introduction
Given an undirected graph G(V, E), the Max Cut problem asks for a partition of the vertices of G into two sets, such that the number of edges with exactly one endpoint in each set of the partition is maximized. This problem can be naturally generalized for weighted (undirected) graphs. A weighted graph is denoted by \(G (V, E, {\textbf{W}})\), where V is the set of vertices, E is the set of edges and \({\textbf{W}}\) is a weight matrix, which specifies a weight \({\textbf{W}}_{i,j}\), for each pair of vertices i, j. In particular, we assume that \({\textbf{W}}_{i,j}=0\), for each edge \(\{i,j\} \notin E\).
Definition 1
(Weighted Max Cut) Given a weighted graph \(G (V, E, {\textbf{W}})\), find a partition of V into two (disjoint) subsets A, B, so as to maximize the cumulative weight of the edges of G having one endpoint in A and the other in B.
Weighted Max Cut is fundamental in theoretical computer science and is relevant in various graph layout and embedding problems [1]. Furthermore, it also has many practical applications, including infrastructure cost and circuit layout optimization in network and VLSI design [2], minimizing the Hamiltonian of a spin glass model in statistical physics [3], and data clustering [4]. In the worst case, Max Cut (and also Weighted Max Cut) is APX-hard, meaning that there is no polynomial-time approximation scheme that finds a solution that is arbitrarily close to the optimum, unless P = NP [5].
The average case analysis of Max Cut, namely the case where the input graph is chosen at random from a probabilistic space of graphs, is also of considerable interest and is further motivated by the desire to justify and understand why various graph partitioning heuristics work well in practical applications. In most research works the input graphs are drawn from the Erdős-Rényi random graphs model \({\mathcal G}_{n, m}\), i.e. random instances are drawn equiprobably from the set of simple undirected graphs on n vertices and m edges, where m is a linear function of n (see also [6, 7] for the average case analysis of Max Cut and its generalizations with respect to other random graph models). One of the earliest results in this area is that Max Cut undergoes a phase transition on \({\mathcal G}_{n, \gamma n}\) at \(\gamma =\frac{1}{2}\) [8], in that the difference between the number of edges of the graph and the Max-Cut size is O(1), for \(\gamma <\frac{1}{2}\), while it is \(\Omega (n)\), when \(\gamma > \frac{1}{2}\). For large values of \(\gamma \), it was proved in [9] that the maximum cut size of \(G_{n, \gamma n}\) normalized by the number of vertices n reaches an absolute limit in probability as \(n \rightarrow \infty \), but it was not until recently that the latter limit was established and expressed analytically in [10], using the interpolation method; in particular, it was shown to be asymptotically equal to \((\frac{\gamma }{2}+P_* \sqrt{\frac{\gamma }{2}})n\), where \(P_* \approx 0.7632\). We note however that these results are existential, and thus do not lead to an efficient approximation scheme for finding a tight approximation of the maximum cut with large enough probability when the input graph is drawn from \({\mathcal G}_{n, \gamma n}\). 
An efficient approximation scheme in this case was designed in [8], and it was proved that, with high probability, this scheme constructs a cut with at least \(\left( \frac{\gamma }{2} + 0.37613 \sqrt{\gamma }\right) n = (1+0.75226 \frac{1}{\sqrt{\gamma }}) \frac{\gamma }{2}n\) edges, noting that \(\frac{\gamma }{2}n\) is the size of a random cut (in which each vertex is placed independently and equiprobably in one of the two sets of the partition). Whether there exists an efficient approximation scheme that can close the gap between the approximation guarantee of [8] and the limit of [10] remains an open problem.
In this paper, we study the average case analysis of Weighted Max Cut when input graphs are drawn from the weighted random intersection graphs model (the unweighted version of the model was initially defined in [11]), which is defined below. In this model, edges are formed through the intersection of label sets assigned to each vertex, and edge weights are equal to the number of common labels between endpoints.
Definition 2
(Weighted random intersection graph) Consider a universe \({\mathcal M} = \{1, 2, \ldots , m\}\) of labels and a set of n vertices V. We define the \(m \times n\) representation matrix \({\textbf{R}}\) whose entries are independent Bernoulli random variables with probability of success p. For \(\ell \in {\mathcal M}\) and \(v \in V\), we say that vertex v has chosen label \(\ell \) iff \({\textbf{R}}_{\ell , v}=1\). Furthermore, we draw an edge with weight \([{\textbf{R}}^T {\textbf{R}}]_{v,u}\) between any two vertices u, v for which this weight is strictly larger than 0. The weighted graph \(G = (V, E, {\textbf{R}}^T {\textbf{R}})\) is then a random instance of the weighted random intersection graphs model \(\overline{\mathcal G}_{n, m, p}\).
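For concreteness, a random instance of \(\overline{\mathcal G}_{n, m, p}\) can be sampled directly from this definition. The following sketch (Python with NumPy; the function name is ours) builds the representation matrix \({\textbf{R}}\) and the weight matrix \({\textbf{R}}^T {\textbf{R}}\):

```python
import numpy as np

def sample_wrig(n, m, p, seed=None):
    """Sample a weighted random intersection graph from G-bar_{n,m,p}.

    Returns the m x n representation matrix R (i.i.d. Bernoulli(p) entries)
    and the n x n weight matrix W = R^T R; the off-diagonal entry W[u, v]
    counts the labels common to vertices u and v.
    """
    rng = np.random.default_rng(seed)
    R = (rng.random((m, n)) < p).astype(int)   # R[l, v] = 1 iff v chose label l
    return R, R.T @ R
```

An edge \(\{u, v\}\) is then present exactly when the off-diagonal entry \([{\textbf{R}}^T {\textbf{R}}]_{u,v}\) is strictly positive.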
Random intersection graphs are relevant to, and capture quite nicely, social networking: vertices are the individual actors and labels correspond to specific types of interdependency. Other applications include oblivious resource sharing in a (general) distributed setting, efficient and secure communication in sensor networks [12], interactions of mobile agents traversing the web etc. (see e.g. the survey papers [13, 14] for further motivation and recent research related to random intersection graphs). In all these settings, weighted random intersection graphs, in particular, also capture the strength of connections between actors (e.g. in a social network, individuals having several characteristics in common have more intimate relationships than those that share only a few common characteristics). One of the most celebrated results in this area is the equivalence (measured in terms of total variation distance) of random intersection graphs and Erdős-Rényi random graphs when the number of labels satisfies \(m = n^{\alpha }, \alpha >6\) [15]. This bound on the number of labels was improved in [16], where it was proved that the total variation distance between the two models tends to 0 when \(m = n^{\alpha }, \alpha >4\). Furthermore, [17] proved the equivalence of sharp threshold functions among the two models for \(\alpha \ge 3\). Similarity of the two models has been proved even for smaller values of \(\alpha \) (e.g. for any \(\alpha > 1\)) in the form of various translation results (see e.g. Theorem 1 in [18]), suggesting that some algorithmic ideas developed for Erdős-Rényi random graphs also work for random intersection graphs (and also weighted random intersection graphs).
In view of this, in the present paper we study the average case analysis of Weighted Max Cut under the weighted random intersection graphs model, for the range \(m=n^{\alpha }, \alpha \le 1\) for two main reasons: First, the average case analysis of Max Cut has not been considered in the literature so far when the input is drawn from the random intersection graphs model, and thus the asymptotic behaviour of the maximum cut remains unknown especially for the range of values where random intersection graphs and Erdős-Rényi random graphs differ the most. Furthermore, studying a model where we can implicitly control its intersection number (indeed m is an obvious upper bound on the number of cliques that can cover all edges of the graph) may help understand algorithmic bottlenecks for finding maximum cuts in Erdős-Rényi random graphs.
Second, we note that the representation matrix \({\textbf{R}}\) of a weighted random intersection graph can be used to define a random set system \(\Sigma \) consisting of m sets \(\Sigma =\{L_1, \ldots , L_m\}\), where \(L_{\ell }\) is the set of vertices that have chosen label \(\ell \); we say that \({\textbf{R}}\) is the incidence matrix of \(\Sigma \). Therefore, there is a natural connection between Weighted Max Cut and the discrepancy of such random set systems, which we formalize in this paper. In particular, given a set system \(\Sigma \) with incidence matrix \({\textbf{R}}\), its discrepancy is defined as \(\text {disc}(\Sigma ) = \min _{{\textbf{x}} \in \{\pm 1\}^n} \max _{L \in \Sigma } \left|\sum _{v \in L} x_v \right|= \min _{{\textbf{x}} \in \{\pm 1\}^n}\Vert {\textbf{R}} {\textbf{x}} \Vert _{\infty }\), i.e. it is the minimum imbalance of all sets in \(\Sigma \) over all 2-colorings \({\textbf{x}}\). Recent work on the discrepancy of random rectangular matrices defined as above [19] has shown that, when the number of labels (sets) m satisfies \(n \ge 0.73 m \log {m}\), the discrepancy of \(\Sigma \) is at most 1 with high probability. The proof of the main result in [19] is based on a conditional second moment method combined with Stein’s method of exchangeable pairs, and improves upon a Fourier analytic result of [20], and also upon previous results in [21, 22]. The design of an efficient algorithm that can find a 2-coloring having discrepancy O(1) in this range still remains an open problem. Approximation algorithms for a similar model for random set systems were designed and analyzed in [23]; however, the algorithmic ideas there do not apply in our case.
1.1 Our Contribution
In this paper, we introduce the model of weighted random intersection graphs and we study the average case analysis of Weighted Max Cut through the prism of Discrepancy of random set systems. We formalize the connection between these two combinatorial problems for the case of arbitrary weighted intersection graphs in Corollary 1. We prove that, given a weighted intersection graph \(G = (V,E,{\textbf{R}}^T {\textbf{R}})\) with representation matrix \({\textbf{R}}\), and a set system \(\Sigma \) with incidence matrix \({\textbf{R}}\), such that \(\text {disc}(\Sigma ) \le 1\), a 2-coloring has maximum cut weight in G if and only if it achieves minimum discrepancy in \(\Sigma \). In particular, Corollary 1 applies in the range of values considered in [19] (i.e. \(n \ge 0.73\,m \log {m}\)), and thus any algorithm that finds a maximum cut in \(G(V,E,{\textbf{R}}^T {\textbf{R}})\) with large enough probability can also be used to find a 2-coloring with minimum discrepancy in a set system \(\Sigma \) with incidence matrix \({\textbf{R}}\), with the same probability of success.
We then consider weighted random intersection graphs in the case \(m = n^{\alpha }, \alpha \le 1\), and we prove that the maximum cut weight of a random instance \(G(V,E,{\textbf{R}}^T {\textbf{R}})\) of \(\overline{{\mathcal G}}_{n, m, p}\) concentrates around its expected value (see Theorem 2). In particular, with high probability over the choices of \({\textbf{R}}\), \(\texttt {Max-Cut}(G) \sim \mathbb {E}_{{\textbf{R}}}[\texttt {Max-Cut}(G)]\), where \(\mathbb {E}_{{\textbf{R}}}\) denotes expectation with respect to \({\textbf{R}}\). The proof is based on the Efron-Stein inequality for upper bounding the variance of the maximum cut. As a consequence of our concentration result, we prove in Theorem 3 that, in the case \(\alpha <1\), a random 2-coloring (i.e. bipartition) \({\textbf{x}}^{(rand)}\) in which each vertex chooses its color independently and equiprobably, has cut weight asymptotically equal to \(\texttt {Max-Cut}(G)\), with high probability over the choices of \({\textbf{x}}^{(rand)}\) and \({\textbf{R}}\).
The latter result on random cuts allows us to focus the analysis of our randomized algorithms of Sect. 4 on the case \(m=n\) (i.e. \(\alpha =1\)), and \(p = \frac{c}{n}\), for some constant c (see also the discussion at the end of Sect. 3.1), where the assumptions of Theorem 3 do not hold. It is worth noting that, in this range of values, the expected weight of a fixed edge in a weighted random intersection graph is equal to \(mp^2 = \Theta (1/n)\), and thus we hope that our work here will serve as an intermediate step towards understanding when algorithmic bottlenecks for Max Cut appear in sparse random graphs (especially Erdős-Rényi random graphs) with respect to the intersection number. In particular, in Sect. 4.1, we analyze the Majority Cut Algorithm that extends the algorithmic idea of [8] to weighted intersection graphs as follows: vertices are colored sequentially (each color \(+1\) or \(-1\) corresponding to a different set in the partition of the vertices), and the t-th vertex is colored opposite to the sign of \(\sum _{i \in [t-1]} [{\textbf{R}}^T {\textbf{R}}]_{i,t} x_i\), namely the total available weight of its incident edges, taking into account colors of adjacent vertices. Our average case analysis of the Majority Cut Algorithm shows that, when \(m=n\) and \(p = \frac{c}{n}\), for large constant c, with high probability over the choices of \({\textbf{R}}\), the expected weight of the constructed cut is at least \(1+\beta \) times the expected weight of a random cut, for any constant \(\beta = \beta (c) \le \sqrt{\frac{8}{27 \pi c^3}} - o(1)\). The fact that the lower bound on \(\beta \) is inversely proportional to \(c^{3/2}\) was to be expected, because, as p increases, the approximation of the maximum cut that we get from the weight of a random cut improves (see also the discussion at the end of Sect. 3.1).
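The sequential majority rule described above can be sketched as follows (Python with NumPy; the choices of the first color and of the tie-breaking at a zero sum are our own assumptions):

```python
import numpy as np

def majority_cut(W):
    """Majority-type cut on a weighted graph with weight matrix W.

    Vertices are colored sequentially; vertex t gets the color opposite to
    the sign of sum_{i < t} W[i, t] * x[i], i.e. it joins the side that cuts
    the larger share of the weight toward already-colored neighbors.
    """
    n = W.shape[0]
    x = np.zeros(n, dtype=int)
    x[0] = 1                        # first vertex colored arbitrarily (assumption)
    for t in range(1, n):
        s = W[:t, t] @ x[:t]        # signed weight toward colored neighbors
        x[t] = -1 if s > 0 else 1   # opposite sign; ties broken as +1 (assumption)
    return x
```

On the weight matrix \({\textbf{W}} = {\textbf{R}}^T {\textbf{R}}\) of a weighted intersection graph, this is exactly the rule analyzed in Sect. 4.1.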
In Sect. 4.2 we propose a framework for finding maximum cuts in weighted random intersection graphs for \(m=n\) and \(p = \frac{c}{n}\), for constant c, by exploiting the connection between Weighted Max Cut and the problem of discrepancy minimization in random set systems. In particular, we design the Weak Bipartization Algorithm, that takes as input an intersection graph with representation matrix \({\textbf{R}}\) and outputs a subgraph that is “almost” bipartite. In fact, the input intersection graph is treated as a multigraph composed of overlapping cliques formed by the label sets \(L_{\ell } = \{v: {\textbf{R}}_{\ell , v}=1\}, \ell \in {\mathcal M}\). The algorithm attempts to destroy all odd cycles of the input (except for odd cycles that are formed by labels with only two vertices) by replacing each clique induced by some label set \(L_{\ell }\) by a random maximal matching. In Theorem 5 we prove that, with high probability over the choices of \({\textbf{R}}\), if the Weak Bipartization Algorithm terminates, then its output can be used to construct a 2-coloring that has minimum discrepancy in a set system with incidence matrix \({\textbf{R}}\), which also gives a maximum cut in \(G(V,E,{\textbf{R}}^T {\textbf{R}})\). It is worth noting that this does not follow from Corollary 1, because a random set system with incidence matrix \({\textbf{R}}\) has discrepancy larger than 1 with (at least) constant probability when \(m=n\) and \(p = \frac{c}{n}\). Our proof relies on a structural property of closed 0-strong vertex-label sequences (loosely defined as closed walks of edges formed by distinct labels) in the weighted random intersection graph \(G(V, E, {\textbf{R}}^T {\textbf{R}})\) (Lemma 1). Finally, in Theorem 6, we prove that our Weak Bipartization Algorithm terminates in polynomial time, with high probability, if the constant c is strictly less than 1. 
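The clique-replacement step at the heart of this idea can be sketched as follows (Python; helper name ours). In a clique induced by a label set \(L_{\ell }\), a random maximal matching simply pairs up the vertices after a random shuffle, leaving at most one vertex unmatched when \(|L_{\ell }|\) is odd:

```python
import random

def clique_to_random_matching(label_set, rng=random):
    """Replace the clique induced by one label set L_ell by a random
    maximal matching: shuffle the vertices and pair them up in order.
    At most one vertex (when |L_ell| is odd) stays unmatched."""
    vs = list(label_set)
    rng.shuffle(vs)
    return [(vs[i], vs[i + 1]) for i in range(0, len(vs) - 1, 2)]
```

This is only the single step applied to each label; the full Weak Bipartization Algorithm of Sect. 4.2 additionally loops over labels while bad odd cycles remain.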
Therefore, there is a polynomial time algorithm for finding weighted maximum cuts, with high probability, when the input is drawn from \(\overline{{\mathcal G}}_{n, n, \frac{c}{n}}\), with \(c<1\). We believe that this part of our work may also be of interest regarding the design of efficient algorithms for finding minimum discrepancy colorings in random set systems.
A preliminary version of this paper appeared in the Proceedings of the 32nd International Symposium on Algorithms and Computation (ISAAC) [24].
2 Notation and Preliminary Results
We denote weighted undirected graphs by \(G(V, E, {\textbf{W}})\); in particular, \(V=V(G)\) (resp. \(E=E(G)\)) is the set of vertices (resp. set of edges) and \({\textbf{W}} = {\textbf{W}}(G)\) is the weight matrix, i.e. \({\textbf{W}}_{i, j}\) is the weight of (undirected) edge \(\{i,j\} \in E\). We allow \({\textbf{W}}\) to have non-zero diagonal entries, as these do not affect cut weights. We denote the number of vertices by n, and we use the notation \([n] = \{1,2,\ldots ,n\}\); we also use this notation to define parts of matrices, for example \({\textbf{W}}_{[n], 1}\) denotes the first column of the weight matrix.
A bipartition of the set of vertices is a partition of V into two nonempty sets A, B, such that \(A \cap B = \emptyset \) and \(A \cup B = V\). Bipartitions correspond to 2-colorings, which we denote by vectors \({\textbf{x}}\) such that \(x_i=+1\) if \(i \in A\) and \(x_i=-1\) if \(i \in B\).
Given a weighted graph \(G(V, E, {\textbf{W}})\), we denote by \(\texttt {Cut}(G, {\textbf{x}})\) the weight of a cut defined by a bipartition \({\textbf{x}}\), namely \(\texttt {Cut}(G, {\textbf{x}}) = \sum _{\{i, j\} \in E: i \in A, j \in B} {\textbf{W}}_{i,j} = \frac{1}{4} \sum _{\{i, j\} \in E} {\textbf{W}}_{i,j} (x_i-x_j)^2\). The maximum cut of G is \(\texttt {Max-Cut}(G) = \max _{{\textbf{x}} \in \{-1, +1\}^n} \texttt {Cut}(G, {\textbf{x}})\).
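In code, the quantity \(\texttt {Cut}(G, {\textbf{x}})\) can be evaluated directly from the weight matrix; a minimal sketch (Python with NumPy, helper name ours):

```python
import numpy as np

def cut_weight(W, x):
    """Cut(G, x) = (1/4) * sum over unordered pairs {i, j} of
    W[i, j] * (x_i - x_j)^2.  Since (x_i - x_j)^2 is 4 when x_i != x_j
    and 0 otherwise, this is the total weight of edges crossing the cut."""
    W = np.asarray(W, dtype=float)
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(len(x), k=1)   # unordered pairs i < j
    return 0.25 * np.sum(W[i, j] * (x[i] - x[j]) ** 2)
```

For tiny instances, \(\texttt {Max-Cut}(G)\) can then be recovered by exhaustive maximization of `cut_weight` over all \(2^n\) colorings \({\textbf{x}}\).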
For a weighted random intersection graph \(G(V,E, {\textbf{R}}^T {\textbf{R}})\) with representation matrix \({\textbf{R}}\), we denote by \(S_v\) the set of labels chosen by vertex \(v \in V\), i.e. \(S_v = \{\ell : {\textbf{R}}_{\ell , v}=1\}\). Furthermore, we denote by \(L_{\ell }\) the set of vertices having chosen label \(\ell \), i.e. \(L_{\ell }=\{v:{\textbf{R}}_{\ell , v}=1\}\). Using this notation, the weight of an edge \(\{v, u\} \in E\) is \(\left|S_v \cap S_u \right|\); notice also that this is equal to 0 when \(\{v, u\} \notin E\). We also note that we may think of a weighted random intersection graph as a simple weighted graph where, for any pair of vertices v, u, there are \(\left|S_v \cap S_u \right|\) simple edges between them.
A set system \(\Sigma \) defined on a set V is a family of sets \(\Sigma = \{L_1, L_2, \ldots , L_m\}\), where \(L_\ell \subseteq V, \ell \in [m]\). The incidence matrix of \(\Sigma \) is an \(m \times n\) matrix \({\textbf{R}} = {\textbf{R}}(\Sigma )\), where for any \(\ell \in [m], v \in [n]\), \({\textbf{R}}_{\ell , v} = 1\) if \(v \in L_{\ell }\) and 0 otherwise. The discrepancy of \(\Sigma \) with respect to a 2-coloring \({\textbf{x}}\) of the vertices in V is \(\text {disc}(\Sigma , {\textbf{x}}) = \max _{\ell \in [m]} \left|\sum _{v \in V} {\textbf{R}}_{\ell , v} x_v \right|= \Vert {\textbf{R}} {\textbf{x}} \Vert _{\infty }\). The discrepancy of \(\Sigma \) is \(\text {disc}(\Sigma ) = \min _{{\textbf{x}} \in \{-1, +1\}^n} \text {disc}(\Sigma , {\textbf{x}})\).
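The two discrepancy quantities above translate directly into code; a brute-force sketch (Python with NumPy, names ours, practical only for small n):

```python
import itertools
import numpy as np

def disc(R, x):
    """disc(Sigma, x) = ||R x||_inf: the largest color imbalance over all sets."""
    return int(np.max(np.abs(np.asarray(R) @ np.asarray(x))))

def min_disc(R):
    """disc(Sigma): exhaustive minimum over all 2^n colorings (tiny n only)."""
    n = np.asarray(R).shape[1]
    return min(disc(R, x) for x in itertools.product((-1, 1), repeat=n))
```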
It is well-known that the cut size of a bipartition of the set of vertices of a graph G(V, E) into sets A and B is given by \(\frac{1}{4} \sum _{\{i,j\} \in E} (x_i-x_j)^2\), where \(x_i=+1\) if \(i \in A\) and \(x_i=-1\) if \(i \in B\). This can be naturally generalized for multigraphs and also for weighted graphs. In particular, the Max-Cut size of a weighted graph \(G(V, E, {\textbf{W}})\) is given by
\[\texttt {Max-Cut}(G) = \max _{{\textbf {x}} \in \{-1, +1\}^n} \frac{1}{4} \sum _{\{i,j\} \in E} {\textbf {W}}_{i,j} (x_i-x_j)^2. \qquad (1)\]
In particular, we get the following Proposition:
Proposition 1
Let \(G(V,E, {\textbf{R}}^T {\textbf{R}})\) be a weighted intersection graph with representation matrix \({\textbf{R}}\). Then, for any \({\textbf{x}} \in \{-1, +1\}^n\),
\[\texttt {Cut}(G, {\textbf {x}}) = \frac{1}{4} \sum _{i,j \in [n]^2} \left[ {\textbf {R}}^T {\textbf {R}} \right] _{i,j} - \frac{1}{4} \left\| {\textbf {R}} {\textbf {x}} \right\| ^2, \qquad (2)\]
and so
\[\texttt {Max-Cut}(G) = \frac{1}{4} \sum _{i,j \in [n]^2} \left[ {\textbf {R}}^T {\textbf {R}} \right] _{i,j} - \frac{1}{4} \min _{{\textbf {x}} \in \{-1, +1\}^n} \left\| {\textbf {R}} {\textbf {x}} \right\| ^2, \qquad (3)\]
where \(\Vert \cdot \Vert \) denotes the 2-norm. In particular, the expectation of the size of a random cut, where each entry of \({\textbf{x}}\) is independently and equiprobably either +1 or -1 is equal to \(\mathbb {E}_{{\textbf{x}}}\left[ \texttt {Cut}(G, {\textbf{x}})\right] = \frac{1}{4} \sum _{i\ne j, i,j \in [n]} \left[ {\textbf{R}}^T {\textbf{R}} \right] _{i,j}\), where \(\mathbb {E}_{{\textbf{x}}}\) denotes expectation with respect to \({\textbf{x}}\).
Proof
We first note that, by straightforward calculation, for any weighted graph \(G(V, E, {\textbf{W}})\), where \({\textbf{W}}\) is symmetric and \({\textbf{W}}_{i,j} = 0\) if \(\{i,j\} \notin E\), and any \({\textbf{x}} \in \{-1,+1\}^n\), we have
\[\frac{1}{4} \sum _{\{i,j\} \in E} {\textbf {W}}_{i,j} (x_i-x_j)^2 = \frac{1}{4} \sum _{i,j \in [n]^2} {\textbf {W}}_{i,j} - \frac{1}{4} {\textbf {x}}^T {\textbf {W}} {\textbf {x}}. \qquad (4)\]
Noting that \(\texttt {Cut}(G, {\textbf{x}}) = \frac{1}{4} \sum _{\{i,j\} \in E} {\textbf{W}}_{i,j} (x_i-x_j)^2\), the above settles equation (2), by taking \({\textbf{W}} = {\textbf{R}}^T {\textbf{R}}\). Similarly, by Eq. (1), and since the term \(\sum _{i,j \in [n]^2} {\textbf{W}}_{i,j}\) is independent of \({\textbf{x}}\), we have
\[\texttt {Max-Cut}(G) = \max _{{\textbf {x}} \in \{-1, +1\}^n} \texttt {Cut}(G, {\textbf {x}}) = \frac{1}{4} \sum _{i,j \in [n]^2} {\textbf {W}}_{i,j} - \frac{1}{4} \min _{{\textbf {x}} \in \{-1, +1\}^n} {\textbf {x}}^T {\textbf {W}} {\textbf {x}},\]
which settles equation (3), by taking \({\textbf{W}} = {\textbf{R}}^T {\textbf{R}}\).
For the last part of the Proposition, notice that diagonal entries of the weight matrix in (4) cancel out, and so, for any \({\textbf{x}} \in \{-1, +1\}^n\), setting \({\textbf{W}} = {\textbf{R}}^T {\textbf{R}}\), we have
\[\texttt {Cut}(G, {\textbf {x}}) = \frac{1}{4} \sum _{i\ne j,\, i,j \in [n]} \left[ {\textbf {R}}^T {\textbf {R}} \right] _{i,j} - \frac{1}{4} \sum _{i\ne j,\, i,j \in [n]} \left[ {\textbf {R}}^T {\textbf {R}} \right] _{i,j} x_i x_j.\]
Taking expectations with respect to \({\textbf{x}}\), the contribution of the second sum in the above expression equals 0, which completes the proof. \(\square \)
Since \(\sum _{i,j \in [n]^2} \left[ {\textbf{R}}^T {\textbf{R}} \right] _{i,j}\) is fixed for any given representation matrix \({\textbf{R}}\), the above Proposition implies that, to find a bipartition of the vertex set V that corresponds to a maximum cut, we need to find an n-dimensional vector in \(\arg \min _{{\textbf{x}} \in \{-1, +1\}^n} \left\| {\textbf{R}} {\textbf{x}} \right\| ^2\). We thus get the following:
Corollary 1
Let \(G(V,E, {\textbf{R}}^T {\textbf{R}})\) be a weighted intersection graph with representation matrix \({\textbf{R}}\) and \(\Sigma \) a set system with incidence matrix \({\textbf{R}}\). If \(\text {disc}(\Sigma ) \le 1\), then \({\textbf{x}}^* \in \arg \min _{{\textbf{x}} \in \{-1, +1\}^n} \left\| {\textbf{R}} {\textbf{x}} \right\| ^2\) if and only if \({\textbf{x}}^* \in \arg \min _{{\textbf{x}} \in \{-1, +1\}^n} \text {disc}(\Sigma , {\textbf{x}})\). In particular, if the minimum discrepancy of \(\Sigma \) is at most 1, a bipartition corresponds to a maximum cut iff it achieves minimum discrepancy.
Proof
Since \(\text {disc}(\Sigma ) \le 1\), for any \({\textbf{x}}^* \in \arg \min _{{\textbf{x}} \in \{-1, +1\}^n} \text {disc}(\Sigma , {\textbf{x}})\), each component of \({\textbf{R}}{\textbf{x}}^*\) is either 0 or \(\pm 1\). In particular, since every element of \({\textbf{R}}\) is either 0 or 1, for any \(\ell \in [m]\), \(\left| \left[ {\textbf{R}}{\textbf{x}}^*\right] _{\ell } \right| \) will be equal to 0 if and only if the number of ones in the \(\ell \)-th row is even, and it will be equal to 1 otherwise. This is the best one can hope for, since sets with an odd number of elements can never have discrepancy less than 1. Therefore, \(\Vert {\textbf{R}} {\textbf{x}}^*\Vert \) is also the minimum possible. In particular, this implies that, in the case \(\text {disc}(\Sigma ) \le 1\), any 2-coloring that achieves minimum discrepancy gives a bipartition that corresponds to a maximum cut and vice versa. \(\square \)
Notice that the above result is not necessarily true when \(\text {disc}(\Sigma ) > 1\), since the minimum of \(\Vert {\textbf{R}} {\textbf{x}} \Vert \) could be achieved by 2-colorings with larger discrepancy than the optimal.
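For tiny instances, the correspondence of Corollary 1 (and the caveat above) can be checked exhaustively; a sketch (Python with NumPy, helper name ours):

```python
import itertools
import numpy as np

def argmin_sets(R):
    """Brute force over all 2^n colorings: return the set of colorings
    minimizing ||R x||^2 and the set minimizing disc(Sigma, x) = ||R x||_inf.
    By Corollary 1, the two sets coincide whenever disc(Sigma) <= 1."""
    R = np.asarray(R)
    xs = [np.array(x) for x in itertools.product((-1, 1), repeat=R.shape[1])]
    norm2 = [float(np.sum((R @ x) ** 2)) for x in xs]
    dinf = [int(np.max(np.abs(R @ x))) for x in xs]
    min_n, min_d = min(norm2), min(dinf)
    return ({tuple(x) for x, v in zip(xs, norm2) if v == min_n},
            {tuple(x) for x, v in zip(xs, dinf) if v == min_d})
```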
2.1 Range of Values for Selection Probability
Concerning the success probability p, we note that, when \(n,m \rightarrow \infty \), and \(p = o\left( \sqrt{\frac{1}{nm}} \right) \), direct application of the results of [25] suggests that \(G(V,E, {\textbf{R}}^T {\textbf{R}})\) is chordal with high probability, but in fact the same proofs reveal that a stronger property holds, namely that there is no closed vertex-label sequence (refer to the precise definition in Sect. 4.2) having distinct labels. Therefore, in this case, finding a bipartition with maximum cut weight is straightforward: indeed, one way to construct a maximum cut is to run our Weak Bipartization Algorithm from Sect. 4.2, and then to apply Theorem 5 (noting that the algorithm’s termination condition trivially holds, since the set \({\mathcal C}_{odd}(G^{(b)})\) defined in Sect. 4.2 is empty). In view of this, in Sect. 3, we will assume that \(p \ge C_1 \sqrt{\frac{1}{nm}}\), for arbitrary positive constant \(C_1\) that can be as small as possible; this implies that edge weights are \(\Omega \left( \sqrt{\frac{m}{n}} \right) \) on expectation. On the other hand, in view of our results in Sect. 3.1 regarding the near optimality of the weight of a random cut, in the analysis of our randomized algorithms in Sect. 4, we assume \(n=m\) and \(p = \Theta \left( \frac{1}{n} \right) \); this range of values gives sparse graph instances, but the corresponding distribution of weighted random intersection graphs is different from the distribution of sparse Erdős-Rényi random graphs, even without taking weights into account (please refer to the end of Sect. 3.1 for a more technical justification for the latter assumption).
3 Concentration of Max-Cut
In this section, we prove that the size of the maximum cut in a weighted random intersection graph concentrates around its expected value. We note, however, that the following Theorem does not provide an explicit formula for the expected value of the maximum cut.
Theorem 2
Let \(G(V,E, {\textbf{R}}^T {\textbf{R}})\) be a random instance of the \(\overline{{\mathcal G}}_{n, m, p}\) model with \(m=n^{\alpha }, \alpha \le 1\), and \(p \ge C_1 \sqrt{\frac{1}{nm}}\), for arbitrary positive constant \(C_1\), and let \({\textbf{R}}\) be its representation matrix. Then \(\texttt {Max-Cut}(G) = (1 \pm o(1)) \mathbb {E}_{{\textbf{R}}}[\texttt {Max-Cut}(G)]\) with high probability, as \(n \rightarrow \infty \), where \(\mathbb {E}_{{\textbf{R}}}\) denotes expectation with respect to \({\textbf{R}}\), i.e. \(\texttt {Max-Cut}(G)\) concentrates around its expected value.
Proof
Let \(G=G(V,E, {\textbf{R}}^T {\textbf{R}})\) be a weighted random intersection graph, and let \({\textbf{D}}\) denote the (random) diagonal matrix containing all diagonal elements of \({\textbf{R}}^T{\textbf{R}}\). In particular, Eq. (3) of Proposition 1 can be written as
\[\texttt {Max-Cut}(G) = \frac{1}{4} \sum _{i\ne j,\, i,j \in [n]} \left[ {\textbf {R}}^T {\textbf {R}} \right] _{i,j} - \frac{1}{4} \min _{{\textbf {x}} \in \{-1, +1\}^n} {\textbf {x}}^T \left( {\textbf {R}}^T {\textbf {R}} - {\textbf {D}}\right) {\textbf {x}}.\]
Furthermore, for any given \({\textbf{R}}\), notice that, if we select each element of \({\textbf{x}}\) independently and equiprobably from \(\{-1, +1\}\), then \(\mathbb {E}_{{\textbf{x}}}[{\textbf{x}}^T \left( {\textbf{R}}^T {\textbf{R}} -{\textbf{D}}\right) {\textbf{x}}]=0\), where \(\mathbb {E}_{{\textbf{x}}}\) denotes expectation with respect to \({\textbf{x}}\). By the probabilistic method, we thus have \(\min _{{\textbf{x}} \in \{-1, +1\}^n} {\textbf{x}}^T \left( {\textbf{R}}^T {\textbf{R}} -{\textbf{D}}\right) {\textbf{x}} \le 0\), implying the following bound:
\[\frac{1}{4} \sum _{i\ne j,\, i,j \in [n]} \left[ {\textbf {R}}^T {\textbf {R}} \right] _{i,j} \le \texttt {Max-Cut}(G) \le \frac{1}{2} \sum _{i\ne j,\, i,j \in [n]} \left[ {\textbf {R}}^T {\textbf {R}} \right] _{i,j}, \qquad (5)\]
where the second inequality follows trivially by observing that \(\frac{1}{2} \sum _{i\ne j, i,j \in [n]} \left[ {\textbf{R}}^T {\textbf{R}} \right] _{i,j}\) equals the sum of the weights of all edges.
By linearity of expectation, we have \(\mathbb {E}_{{\textbf{R}}}\left[ \sum _{i\ne j, i,j \in [n]} \left[ {\textbf{R}}^T {\textbf{R}} \right] _{i,j} \right] = \mathbb {E}_{{\textbf{R}}}\left[ \sum _{i\ne j, i,j \in [n]} \sum _{\ell \in [m]} {\textbf{R}}_{\ell ,i} {\textbf{R}}_{\ell , j} \right] = n(n-1)mp^2 = \Theta (n^2mp^2)\), which is \(\Omega (n)\) in the range of parameters that we consider. In particular, by (5), we have
\[\mathbb {E}_{{\textbf {R}}}[\texttt {Max-Cut}(G)] = \Theta (n^2mp^2). \qquad (6)\]
By Chebyshev’s inequality, for any \(\epsilon >0\), we have
\[\Pr \left( \left| \texttt {Max-Cut}(G) - \mathbb {E}_{{\textbf {R}}}[\texttt {Max-Cut}(G)] \right| \ge \epsilon \, \mathbb {E}_{{\textbf {R}}}[\texttt {Max-Cut}(G)] \right) \le \frac{\text {Var}_{{\textbf {R}}}\left( \texttt {Max-Cut}(G)\right) }{\epsilon ^2 \left( \mathbb {E}_{{\textbf {R}}}[\texttt {Max-Cut}(G)]\right) ^2}, \qquad (7)\]
where \(\text {Var}_{{\textbf{R}}}\) denotes variance with respect to \({\textbf{R}}\). To bound the variance on the right hand side of the above inequality, we use the Efron-Stein inequality. In particular, we write \(\texttt {Max-Cut}(G):= f({\textbf{R}})\), i.e. we view \(\texttt {Max-Cut}(G)\) as a function of the label choices. For \(\ell \in [m], i \in [n]\), we also write \({\textbf{R}}^{(\ell , i)}\) for the matrix \({\textbf{R}}\) where entry \((\ell , i)\) has been replaced by an independent, identically distributed (i.i.d.) copy of \({\textbf{R}}_{\ell , i}\), which we denote by \({\textbf{R}}_{\ell , i}'\). By the Efron-Stein inequality, we now have
\[\text {Var}_{{\textbf {R}}}\left( f({\textbf {R}})\right) \le \frac{1}{2} \sum _{\ell \in [m]} \sum _{i \in [n]} \mathbb {E}\left[ \left( f({\textbf {R}}) - f\left( {\textbf {R}}^{(\ell , i)} \right) \right) ^2 \right] . \qquad (8)\]
Notice now that, given all entries of \({\textbf{R}}\) except \({\textbf{R}}_{\ell , i}\), the probability that \(f({\textbf{R}})\) is different from \(f\left( {\textbf{R}}^{(\ell , i)} \right) \) is at most \(\Pr ({\textbf{R}}_{\ell , i} \ne {\textbf{R}}_{\ell , i}') = 2p(1-p)\). Furthermore, if \(L_{\ell } \backslash \{i\}\) is the set of vertices different from i which have selected \(\ell \), we then have that \(\left( f({\textbf{R}}) - f\left( {\textbf{R}}^{(\ell , i)} \right) \right) ^2 \le \left|L_{\ell } \backslash \{i\} \right|^2\), because the intersection graph with representation matrix \({\textbf{R}}\) differs by at most \(\left|L_{\ell } \backslash \{i\} \right|\) edges from the intersection graph with representation matrix \({\textbf{R}}^{(\ell , i)}\). Notice now that, by definition, \(\left|L_{\ell } \backslash \{i\} \right|\) follows the Binomial distribution \({\mathcal B}(n-1, p)\). In particular, \(\mathbb {E} \left[ \left|L_{\ell } \backslash \{i\} \right|^2 \right] = (n-1)p(np-2p+1)\), implying \(\mathbb {E}\left[ \left( f({\textbf{R}}) - f\left( {\textbf{R}}^{(\ell , i)} \right) \right) ^2 \right] \le 2p(1-p) (n-1)p(np-2p+1)\), for any fixed \(\ell \in [m], i \in [n]\).
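The second-moment identity used above follows from \(\mathbb {E}[X^2] = \text {Var}(X) + \left( \mathbb {E}[X]\right) ^2\) for \(X \sim {\mathcal B}(n-1, p)\); a quick numeric check (Python):

```python
# For X ~ Binomial(n-1, p):
#   E[X^2] = Var(X) + E[X]^2 = (n-1)p(1-p) + ((n-1)p)^2 = (n-1)p(np - 2p + 1),
# since (1 - p) + (n-1)p = np - 2p + 1.
n, p = 50, 0.1
mean = (n - 1) * p
lhs = (n - 1) * p * (1 - p) + mean ** 2
rhs = (n - 1) * p * (n * p - 2 * p + 1)
assert abs(lhs - rhs) < 1e-12
```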
Putting this all together, (8) becomes
\[\text {Var}_{{\textbf {R}}}\left( f({\textbf {R}})\right) \le \frac{1}{2}\, nm \cdot 2p(1-p) (n-1)p(np-2p+1) = O\left( n^3 m p^3 \right) ,\]
where the last equation comes from the fact that, in the range of values that we consider, we have \(np = \Omega (1)\). Therefore, by (7), we get
\[\Pr \left( \left| \texttt {Max-Cut}(G) - \mathbb {E}_{{\textbf {R}}}[\texttt {Max-Cut}(G)] \right| \ge \epsilon \, \mathbb {E}_{{\textbf {R}}}[\texttt {Max-Cut}(G)] \right) \le \frac{O(n^3 m p^3)}{\epsilon ^2 \, \Theta (n^4 m^2 p^4)} = O\left( \frac{1}{nmp} \right) ,\]
which goes to 0 in the range of values that we consider. Together with (6), the above bound proves that \(\texttt {Max-Cut}(G)\) is concentrated around its expected value, and the proof is completed. \(\square \)
3.1 Max-Cut for Small Number of Labels
Using Theorem 2, we can now show that, in the case \(m = n^{\alpha }, \alpha <1\), and \(p = \Omega \left( \sqrt{\frac{1}{nm}}\right) \), a random cut has asymptotically the same weight as \(\texttt {Max-Cut}(G)\), where \(G=G(V,E, {\textbf{R}}^T {\textbf{R}})\) is a random instance of \(\overline{\mathcal G}_{n, m, p}\). In particular, let \({\textbf{x}}^{(rand)}\) be constructed as follows: for each \(i \in [n]\), set \(x^{(rand)}_{i} = -1\) independently with probability \(\frac{1}{2}\), and \(x^{(rand)}_{i} = +1\) otherwise. In view of Eq. (3), the main idea for the proof of the following Theorem is to show that, with high probability over random \({\textbf{x}}\) and \({\textbf{R}}\), \(\Vert {\textbf{R}} {\textbf{x}}\Vert ^2\) is asymptotically smaller than the expectation of the weight of the cut defined by \({\textbf{x}}^{(rand)}\). The result then follows by concentration of \(\texttt {Max-Cut}(G)\) around its expected value, and straightforward bounds on \(\texttt {Max-Cut}(G)\).
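Eqs. (2) and (3) are not reproduced in this excerpt, but the identity underlying them, namely \(\texttt {Cut}(G, {\textbf{x}}) = \frac{1}{4}\big ( \sum _{i \ne j} [{\textbf{R}}^T {\textbf{R}}]_{i,j} + \sum _{\ell } |L_{\ell } |- \Vert {\textbf{R}} {\textbf{x}}\Vert ^2 \big )\), follows directly from the definitions (our own formulation; the paper's exact numbering may differ). A small deterministic check:

```python
def cut_identity_check(R, x):
    """Compare the cut weight computed edge-by-edge with the closed form
    Cut = (sum_{i != j} [R^T R]_{ij} + sum_l |L_l| - ||Rx||^2) / 4."""
    m, n = len(R), len(R[0])
    W = [[sum(R[l][i] * R[l][j] for l in range(m)) for j in range(n)] for i in range(n)]
    direct = sum(W[i][j] for i in range(n) for j in range(i + 1, n) if x[i] != x[j])
    off_diag = sum(W[i][j] for i in range(n) for j in range(n) if i != j)
    ones = sum(sum(row) for row in R)       # = sum_l |L_l| = trace(R^T R)
    Rx2 = sum(sum(R[l][i] * x[i] for i in range(n)) ** 2 for l in range(m))
    return direct, (off_diag + ones - Rx2) / 4

# toy representation matrix: 3 labels, 4 vertices
R = [[1, 1, 0, 1],
     [0, 1, 1, 0],
     [1, 0, 1, 1]]
x = [+1, -1, -1, +1]
d, f = cut_identity_check(R, x)
assert d == f == 4
```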
Theorem 3
Let \(G(V,E, {\textbf{R}}^T {\textbf{R}})\) be a random instance of the \(\overline{{\mathcal G}}_{n, m, p}\) model with \(m=n^{\alpha }, \alpha < 1\), and \(p \ge C_1 \sqrt{\frac{1}{nm}}\), for an arbitrary positive constant \(C_1\), and let \({\textbf{R}}\) be its representation matrix. Then the cut weight of the random 2-coloring \({\textbf{x}}^{(rand)}\) satisfies \(\texttt {Cut}(G, {\textbf{x}}^{(rand)}) = (1-o(1)) \texttt {Max-Cut}(G)\) with high probability over the choices of \({\textbf{x}}^{(rand)}\) and \({\textbf{R}}\).
Proof
Let \(G=G(V,E, {\textbf{R}}^T {\textbf{R}})\) be a weighted random intersection graph. By Eq. (2) of Proposition 1, for any \({\textbf{x}} \in \{-1, +1\}^n\), we have:
Taking expectations with respect to random \({\textbf{x}}\) and \({\textbf{R}}\), we get
To prove Theorem 3, we will show that, with high probability over random \({\textbf{x}}\) and \({\textbf{R}}\), we have \(\Vert {\textbf{R}} {\textbf{x}}\Vert ^2 = o\left( \mathbb {E}_{{\textbf{R}}}\left[ \frac{1}{4} \sum _{i\ne j, i,j \in [n]} \left[ {\textbf{R}}^T {\textbf{R}} \right] _{i,j} \right] \right) = o(n^2mp^2)\), in which case the theorem follows by concentration of \(\texttt {Max-Cut}(G)\) around its expected value (Theorem 2), and the fact that \(\texttt {Max-Cut}(G) \ge \frac{1}{4} \sum _{i\ne j, i,j \in [n]} \left[ {\textbf{R}}^T {\textbf{R}} \right] _{i,j}\) (see Eq. (5)). Indeed, by Eq. (10) and the lower bound on \(\texttt {Max-Cut}(G)\), we get that \(\texttt {Max-Cut}(G) - \Vert {\textbf{R}} {\textbf{x}}\Vert ^2 \le \texttt {Cut}(G, {\textbf{x}}) \le \texttt {Max-Cut}(G)\). Furthermore, by concentration of \(\texttt {Max-Cut}(G)\) around its expected value and the fact that \(\mathbb {E}_{{\textbf{R}}}[\texttt {Max-Cut}(G)] = \Theta (n^2mp^2)\) (Eq. (6)), we get that \(\texttt {Max-Cut}(G) = \Theta (n^2mp^2)\), with high probability. Therefore, having \(\Vert {\textbf{R}} {\textbf{x}}\Vert ^2 = o(n^2mp^2)\) implies \(\texttt {Max-Cut}(G) - o(\texttt {Max-Cut}(G)) \le \texttt {Cut}(G, {\textbf{x}}) \le \texttt {Max-Cut}(G)\), as needed.
To this end, fix \(\ell \in [m]\) and consider the random variable counting the number of ones in the \(\ell \)-th row of \({\textbf{R}}\), namely \(Y_{\ell } = \sum _{i \in [n]} {\textbf{R}}_{\ell , i}\). By the multiplicative Chernoff bound, for any \(\delta >0\),
Since \(np \ge C_1 \sqrt{\frac{n}{m}} = C_1 n^{\frac{1-\alpha }{2}}\), taking any \(\delta \ge 2\), we get
Therefore, by the union bound,
implying that all rows of \({\textbf{R}}\) have at most 3np non-zero elements with high probability.
Fix now \(\ell \) and consider the random variable corresponding to the \(\ell \)-th entry of \({\textbf{R}} {\textbf{x}}\), namely \(Z_{\ell } = \sum _{i \in [n]} {\textbf{R}}_{\ell , i} x_i\). In particular, given \(Y_{\ell }\), notice that \(Z_{\ell }\) is equal to the sum of \(Y_{\ell }\) independent random variables \(x_i \in \{-1, +1\}\), for i such that \({\textbf{R}}_{\ell , i}=1\). Therefore, since \(\mathbb {E}_{{\textbf{x}}}[Z_{\ell }] = \mathbb {E}_{{\textbf{x}}}[Z_{\ell } |Y_{\ell }]=0\), by Hoeffding’s inequality, for any \(\lambda \ge 0\),
Therefore, by the union bound, and taking \(\lambda \ge \sqrt{6 np \ln {n}}\),
implying that all entries of \({\textbf{R}} {\textbf{x}}\) have absolute value at most \(\sqrt{6 np \ln {n}}\) with high probability over the choices of \({\textbf{x}}\) and \({\textbf{R}}\). Consequently, with high probability over the choices of \({\textbf{x}}\) and \({\textbf{R}}\), we have \(\Vert {\textbf{R}} {\textbf{x}}\Vert ^2 \le 6mnp \ln {n}\), which is \(o(n^2mp^2)\), since \(\ln {n} = o(np)\) in the range of parameters considered in this theorem. This completes the proof. \(\square \)
We note that the same analysis also holds when \(n=m\) and p is sufficiently large (e.g. \(\ln {n} = o(np)\)). In particular, similar probability bounds hold in Eqs. (12), (13) and (14), for the analogous choices \(\delta \ge 2\) and \(\lambda \ge \sqrt{7 np \ln {n}}\), implying that \(\Vert {\textbf{R}} {\textbf{x}}\Vert ^2 \le 7mnp \ln {n} = o(n^2mp^2)\) with high probability. In view of this, in the following sections we will only assume \(m=n\) (i.e. \(\alpha =1\)) and also \(p = \frac{c}{n}\), for some positive constant c (note that we no longer have \(\ln {n} = o(np)\), as p is much smaller, and so the above proof idea does not apply in this case).
4 Algorithmic Results (Randomized Algorithms)
4.1 The Majority Cut Algorithm
In the following algorithm, the 2-coloring representing the bipartition of a cut is constructed as follows: initially, a small constant fraction \(\epsilon \) of vertices are randomly placed in the two partitions, and then in each subsequent step, one of the remaining vertices is placed in the partition that maximizes the weight of incident edges with endpoints in the opposite partition.
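The greedy rule just described can be sketched as follows; this is our own hypothetical implementation (vertex order, tie-breaking, and the handling of the initial \(\epsilon \)-fraction are illustrative choices, not fixed by the text):

```python
import random

def majority_cut(W, eps, rng):
    """W: symmetric weight matrix [R^T R] with zero diagonal.
    First max(1, eps*n) vertices are colored at random; each later vertex
    joins the side maximizing the weight cut to the opposite side."""
    n = len(W)
    k = max(1, int(eps * n))
    x = [rng.choice((-1, +1)) for _ in range(k)]
    for t in range(k, n):
        s = sum(W[i][t] * x[i] for i in range(t))  # signed weight of colored neighbors
        x.append(-1 if s > 0 else +1)              # oppose the heavier side
    return x

def cut_weight(W, x):
    n = len(W)
    return sum(W[i][j] for i in range(n) for j in range(i + 1, n) if x[i] != x[j])

# K4 with unit weights: the greedy rule recovers a maximum cut of weight 4
W = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
x = majority_cut(W, 0.1, random.Random(0))
assert cut_weight(W, x) == 4
```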
Clearly the Majority Cut Algorithm runs in polynomial time in n, m. Furthermore, the following Theorem provides a lower bound on the expected weight of the cut constructed by the algorithm in the case \(m=n\), \(p = \frac{c}{n}\), for large constant c, and \(\epsilon \rightarrow 0\). For the proof, we first express the weight increase of the constructed cut due to the coloring of the t-th vertex, in the subgraph induced by the colored vertices, as the absolute value of a random variable \(Z_t\). Then, given the colors and label choices of all previously colored vertices (namely vertices \(v_1, \ldots , v_{t-1}\)) we lower bound the conditional expectation of \(|Z_t |\) by the mean absolute difference \(\textrm{MD}(Z^B_t)\) of a certain binomial random variable \(Z_t^B\). Finally, we lower bound \(\textrm{MD}(Z^B_t)\) by using the Berry-Esseen Theorem for Gaussian approximation, which is stated below.
Theorem
(Berry-Esseen Theorem [26]) Let \(X_1, X_2, \ldots ,\) be independent, identically distributed random variables, with \(\mathbb {E}[X_i]=0, \mathbb {E}[X_i^2] = \sigma ^2>0\), and \(\mathbb {E}[|X_i|^3] = \rho < \infty \). For \(N>0\), let \(F_N(\cdot )\) be the cumulative distribution function of \(\frac{X_1+\cdots +X_N}{\sigma \sqrt{N}}\), and let \(\Phi (\cdot )\) be the cumulative distribution function of the standard normal distribution. Then, \(\sup _{x \in \mathbb {R}} |F_N(x)-\Phi (x) |\le \frac{0.4748 \rho }{\sigma ^3 \sqrt{N}}\).
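As a sanity check on the bound in the form used below (each \(X_i\) a difference of two independent Bernoulli(p) variables, so \(\sigma ^2 = \rho = 2p(1-p)\)), one can compute the exact CDF of the normalized sum by convolution and compare with \(\Phi \); a sketch with illustrative parameters:

```python
from math import erf, sqrt
from collections import defaultdict

def berry_esseen_gap(N, p):
    """Exact sup_x |F_N(x) - Phi(x)| for sums of N i.i.d. differences of
    Bernoulli(p) pairs, versus the bound 0.4748 * rho / (sigma^3 sqrt(N))."""
    q = p * (1 - p)
    step = {-1: q, 0: 1 - 2 * q, 1: q}   # law of X_i = Ber(p) - Ber'(p)
    dist = {0: 1.0}
    for _ in range(N):                    # exact convolution, N steps
        new = defaultdict(float)
        for s, ps in dist.items():
            for v, pv in step.items():
                new[s + v] += ps * pv
        dist = dict(new)
    sigma = sqrt(2 * q)                   # per-summand std deviation
    phi = lambda z: 0.5 * (1 + erf(z / sqrt(2)))
    gap, cdf = 0.0, 0.0
    for s in sorted(dist):                # sup attained at the jump points
        z = s / (sigma * sqrt(N))
        gap = max(gap, abs(cdf - phi(z)))   # left limit at the jump
        cdf += dist[s]
        gap = max(gap, abs(cdf - phi(z)))   # value at the jump
    bound = 0.4748 * (2 * q) / (sigma ** 3 * sqrt(N))
    return gap, bound

gap, bound = berry_esseen_gap(40, 0.3)
assert gap <= bound
```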
We now state and prove the main theorem in this section.
Theorem 4
Let \(G(V,E, {\textbf{R}}^T {\textbf{R}})\) be a random instance of the \(\overline{{\mathcal G}}_{n, m, p}\) model, with \(m=n\), and \(p = \frac{c}{n}\), for large positive constant c, and let \({\textbf{R}}\) be its representation matrix. Then, with high probability over the choices of \({\textbf{R}}\), the majority algorithm constructs a cut with expected weight at least \((1+\beta ) \frac{1}{4} \mathbb {E}\left[ \sum _{i\ne j, i,j \in [n]} \left[ {\textbf{R}}^T {\textbf{R}} \right] _{i,j} \right] \), where \(\beta = \beta (c) \le \sqrt{\frac{8}{27 \pi c^3}} - o(1)\) is a constant, i.e. at least \(1+\beta \) times larger than the expected weight of a random cut.
Proof
Let \(G(V,E, {\textbf{R}}^T {\textbf{R}})\) (i.e. the input to the Majority Cut Algorithm) be a random instance of the \(\overline{{\mathcal G}}_{n, m, p}\) model, with \(m=n\), and \(p = \frac{c}{n}\), for some large enough constant c. For \(t \ge \epsilon n+1\), let \(M_t\) denote the weight of the constructed cut just after the consideration of vertex \(v_t\). In particular, by Eq. (2) for \(n=t\), reasoning similarly as to get Eq. (3), and since the values \(x_1, \ldots , x_{t-1}\) are already decided in previous steps, we have
The first of the above terms is
and the second term is
By (15), (16) and (17), we have
Define now the random variable
where \({\textbf{x}}=(x_1, \ldots , x_n) \in \{-1,+1\}^n\) is the 2-coloring constructed by the Majority Cut Algorithm (in fact only the first \(t-1\) entries of \({\textbf{x}}\) are needed for \(Z_t\)), so that \(M_t = M_{t-1} + \frac{1}{2} \sum _{i \in [t-1]} \left[ {\textbf{R}}^T {\textbf{R}} \right] _{i,t} + \frac{1}{2} \left|Z_t\right|\). Observe that, in the latter recursive equation, the term \(\frac{1}{2} \sum _{i \in [t-1]} \left[ {\textbf{R}}^T {\textbf{R}} \right] _{i,t}\) corresponds to the expected increment of the constructed cut if the t-th vertex chose its color uniformly at random. Therefore, lower bounding the expectation of \(\frac{1}{2}\left|Z_t\right|\) will tell us how much better the Majority Algorithm does when considering the t-th vertex.
Towards this end, we first note that, given \({\textbf{x}}_{[t-1]} = \{x_i, i \in [t-1] \}\), and \({\textbf{R}}_{[m], [t-1]}=\{ {\textbf{R}}_{\ell , i}, \ell \in [m], i \in [t-1]\}\), \(Z_t\) is the sum of m independent random variables, since the Bernoulli random variables \({\textbf{R}}_{\ell ,t}, \ell \in [m],\) are independent, for any given t (note that the conditioning is essential for independence, otherwise the inner sums in the definition of \(Z_t\) would also depend on the \(x_i\)’s, which, for \(i \ge \epsilon n+1\), are functions of \(x_1, \ldots , x_{i-1}\), and of the entries of \({\textbf{R}}\)). Furthermore, \(\mathbb {E}[Z_t |{\textbf{x}}_{[t-1]}, {\textbf{R}}_{[m], [t-1]}] = p \sum _{\ell \in [m]} \sum _{i \in [t-1]} {\textbf{R}}_{\ell ,i} x_i\) and \(\text {Var}(Z_t |{\textbf{x}}_{[t-1]}, {\textbf{R}}_{[m], [t-1]}) = p(1-p) \sum _{\ell \in [m]} \left( \sum _{i \in [t-1]} {\textbf{R}}_{\ell ,i} x_i \right) ^2\). Given \({\textbf{x}}_{[t-1]}\) and \({\textbf{R}}_{[m], [t-1]}\), define the sets \(A^+_t = \{\ell \in [m]: \sum _{i \in [t-1]} {\textbf{R}}_{\ell ,i} x_i > 0\}\) and \(A^-_t = \{\ell \in [m]: \sum _{i \in [t-1]} {\textbf{R}}_{\ell ,i} x_i < 0\}\). In particular, given \({\textbf{x}}_{[t-1]} = \{x_i, i \in [t-1] \}\), and \({\textbf{R}}_{[m], [t-1]}=\{ {\textbf{R}}_{\ell , i}, \ell \in [m], i \in [t-1]\}\), \(Z_t\) can be written as
where \({\textbf{R}}_{\ell , t}, \ell \in A^+_t \cup A^-_t\) are independent Bernoulli random variables with success probability p.
Note that \(\mathbb {E}[|Z_t |\big |{\textbf{x}}_{[t-1]}, {\textbf{R}}_{[m], [t-1]}]\) does not increase if we replace \(\sum _{\ell \in A^+_t} {\textbf{R}}_{\ell , t} \sum _{i \in [t-1]} {\textbf{R}}_{\ell ,i} x_i\) and \(\sum _{\ell \in A^-_t} {\textbf{R}}_{\ell , t} \left|\sum _{i \in [t-1]} {\textbf{R}}_{\ell ,i} x_i \right|\) in the expression (19) for \(Z_t\) by independent binomial random variables \(Z_t^+ \sim {\mathcal B}\left( \sum _{\ell \in A^+_t} \sum _{i \in [t-1]} {\textbf{R}}_{\ell ,i} x_i, p\right) \) and \(Z_t^- \sim {\mathcal B}\left( \sum _{\ell \in A^-_t} \left|\sum _{i \in [t-1]} {\textbf{R}}_{\ell ,i} x_i \right|, p\right) \), respectively.Footnote 1 In particular, if \(Z_t^{'+}\) and \(Z_t^{'-}\) follow the same distribution as \(Z_t^+\) and \(Z_t^-\), respectively, and \(Z_t^+, Z_t^{'+}, Z_t^-, Z_t^{'-}\) are stochastically independent, then
In view of the above, if \(Z^B_t\) is a random variable which, given \({\textbf{x}}_{[t-1]} = \{x_i, i \in [t-1] \}\), and \({\textbf{R}}_{[m], [t-1]}=\{ {\textbf{R}}_{\ell , i}, \ell \in [m], i \in [t-1]\}\), follows the Binomial distribution \({\mathcal B}\left( N_t, p\right) \), where
then
where \(\textrm{MD}(\cdot )\) is the mean absolute difference of (two independent copies of) \(Z^B_t\). In particular, \(\textrm{MD}(Z^B_t) = \mathbb {E}[\left|Z^B_t - Z'^B_t \right|]\), where \(Z^B_t, Z'^B_t\) are independent random variables following \({\mathcal B}\left( N_t, p\right) \). Unfortunately, we are aware of no simple closed formula for \(\textrm{MD}(Z^B_t)\), and so we resort to Gaussian approximation through the Berry-Esseen Theorem: we write \(Z^B_t = \sum _{i=1}^{N_t} Z^B_{t,i}\), \(Z'^B_t = \sum _{i=1}^{N_t} Z'^B_{t,i}\), and set \(X_i = Z^B_{t,i} - Z'^B_{t,i}\), where \(Z^B_{t,i}, Z'^B_{t,i}\) are independent Bernoulli random variables with success probability p, for any \(i \in [N_t]\). In particular, we have \(\mathbb {E}[X_i]=0\), \(\mathbb {E}[X_i^2] = \mathbb {E}[|X_i |^3] = 2p(1-p)\). Therefore, by the Berry-Esseen Theorem, given \({\textbf{x}}_{[t-1]} = \{x_i, i \in [t-1] \}\), and \({\textbf{R}}_{[m], [t-1]}=\{ {\textbf{R}}_{\ell , i}, \ell \in [m], i \in [t-1]\}\), the distribution of \(Z^B_t - Z'^B_t\) is approximately Normal \({\mathcal N}(0, 2p(1-p)N_t)\), with approximation error \(\frac{0.4748}{\sqrt{2p(1-p) N_t}}\).
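The quality of the Gaussian approximation to \(\textrm{MD}(Z^B_t)\) can also be probed directly: the sketch below compares the exact mean absolute difference of two Binomial(N, p) copies with the folded-normal value \(\sqrt{2/\pi } \cdot \sqrt{2p(1-p)N}\) (the parameters and the 5% tolerance are illustrative choices of ours):

```python
from math import comb, sqrt, pi

def exact_md_binomial(N, p):
    """Exact E|X - Y| for independent X, Y ~ Binomial(N, p), via the pmf."""
    pmf = [comb(N, k) * p**k * (1 - p) ** (N - k) for k in range(N + 1)]
    return sum(pmf[a] * pmf[b] * abs(a - b)
               for a in range(N + 1) for b in range(N + 1))

N, p = 150, 0.4
md = exact_md_binomial(N, p)
# folded-normal mean for a centered Normal with variance 2p(1-p)N
folded_normal = sqrt(2 / pi) * sqrt(2 * p * (1 - p) * N)
assert abs(md - folded_normal) / folded_normal < 0.05
```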
Notice that the latter approximation error bound becomes o(1) if \(N_t = \Theta (n), p = \frac{c}{n}\) and c is large enough. Therefore, we next show that, with high probability over the choices of \({\textbf{R}}\), \(N_t = \Theta (n)\), for any \(t \ge \epsilon n+1\), where \(\epsilon \) is the constant used in the Majority Algorithm. In particular, even though we cannot control the variables \(x_i \in \{-1,+1\}, i \in [t-1]\), in the definition of \(N_t\), we will find a lower bound that holds with high probability, by using the random variable
and employing the following inequality
Indeed, (22) holds because, for any \(\ell \in [m]\), if \(\sum _{i \in [t-1]} {\textbf{R}}_{\ell ,i}\) is odd, then \(\left|\sum _{i \in [t-1]} {\textbf{R}}_{\ell ,i} x_i \right|\ge 1\), no matter what values the \(x_i\)’s have. Therefore, each such label \(\ell \) contributes at least 1 to the right side of (20), and thus (22) follows.
Notice now that, for any fixed \(\ell \in [m]\) and \(t \ge \epsilon n+1\), we have \(\Pr (\sum _{i \in [t-1]} {\textbf{R}}_{\ell ,i} \text { is odd}) = \sum _{j \text { odd}} \left( {\begin{array}{c}t-1\\ j\end{array}}\right) p^j (1-p)^{t-1-j} = \frac{1}{2} \left( 1 - (1-2p)^{t-1}\right) \ge \frac{1}{2} \left( 1 - e^{-2p(t-1)}\right) \ge \frac{1}{2} \left( 1 - e^{-2c\epsilon }\right) \), where in the last inequality we set \(p = \frac{c}{n}\). Taking \(c \rightarrow \infty \), the latter bound becomes \(\frac{1}{2} - o(1)\), which is at least \(\frac{1}{3}\) for large enough c. Therefore, by independence of the entries of \({\textbf{R}}\), \(Y_t\) stochastically dominates a binomial random variable \({\mathcal B}(t-1, \frac{1}{3})\). Furthermore, by the multiplicative Chernoff (upper) bound, for any \(\delta >0\),
Taking \(\delta = \frac{1}{2}\) and noting that \(t \ge \epsilon n +1\), we have
which is o(1/n), for any constant \(\epsilon >0\). By the union bound,
By inequality (22), we thus have that, with high probability over the choices of \({\textbf{R}}\), \(N_t \ge \frac{t-1}{6} \ge \frac{\epsilon n}{6}\), for all \(t \ge \epsilon n+1\), as needed.
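The closed-form parity probability used above, \(\Pr ({\mathcal B}(t-1,p) \text { odd}) = \frac{1}{2}(1-(1-2p)^{t-1})\), admits a quick exact check by direct summation (parameters below are illustrative):

```python
from math import comb

def prob_odd(N, p):
    """Pr[Binomial(N, p) is odd], by direct summation over odd outcomes."""
    return sum(comb(N, j) * p**j * (1 - p) ** (N - j) for j in range(1, N + 1, 2))

N, p = 25, 0.07
closed_form = 0.5 * (1 - (1 - 2 * p) ** N)
assert abs(prob_odd(N, p) - closed_form) < 1e-12
```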
Combining the above, by the Berry-Esseen Theorem, given \({\textbf{x}}_{[t-1]}, {\textbf{R}}_{[m], [t-1]}\), the distribution of \(Z_t^B-Z'^B_t\) is approximately Normal \({\mathcal N}(0, 2p(1-p)N_t)\) with approximation error o(1) as \(c \rightarrow \infty \), with high probability over the choices of \({\textbf{R}}\). In particular, given \({\textbf{x}}_{[t-1]}, {\textbf{R}}_{[m], [t-1]}\), \(|Z_t^B-Z'^B_t |\) follows approximately (i.e. with the same approximation error o(1)) the folded normal distribution with mean value (at least) \(\sqrt{\frac{2}{\pi } \text {Var}(Z_t^B-Z'^B_t |{\textbf{x}}_{[t-1]}, {\textbf{R}}_{[m], [t-1]})} = \sqrt{\frac{4}{\pi } p(1-p) N_t}\). Since \(N_t \ge \frac{t-1}{6} \ge \frac{\epsilon n}{6}\) with high probability, and also \(p = \frac{c}{n}\), we get that \(p(1-p)N_t \ge \frac{c (t-1)}{6n} -o(1)\), with high probability, where the o(1) includes the approximation error given by the Berry-Esseen Theorem. Consequently, by inequality (21), with high probability over the choices of \({\textbf{R}}\) (which is \(1- o(1)\)),
Summing over all \(t \ge \epsilon n+1\), we get
Using the fact that \(\sum _{t=1}^{n} \sqrt{t} = \frac{2}{3} n^{3/2} +o(n)\), we thus have that
On the other hand, we have that the expected weight of a random cut is equal to \(\frac{1}{4} n(n-1)mp^2 = \frac{c^2}{4}n + o(n)\) (see e.g. Eq. (11)). The proof is completed by taking \(\epsilon \rightarrow 0\). \(\square \)
It is worth noting that the dependency of the lower bound for \(\beta \) on the constant c is to be expected; indeed, our results in Sect. 3.1 suggest that, when the label selection probability p becomes large enough, the weight of a random cut is asymptotically optimal.
4.2 Intersection Graph (Weak) Bipartization
Notice that we can view a weighted intersection graph \(G(V, E, {\textbf{R}}^T{\textbf{R}})\) as a multigraph, composed of m (possibly) overlapping cliques corresponding to the sets of vertices having chosen a certain label, namely \(L_{\ell } = \{v: {\textbf{R}}_{\ell , v} = 1\}, \ell \in [m]\). In particular, let \(K^{(\ell )}\) denote the clique induced by label \(\ell \). Then \(G = \cup ^+_{\ell \in [m]} K^{(\ell )}\), where \(\cup ^+\) denotes union that keeps multiple edges and also retains label information for each edge (e.g., edges within clique \(K^{(\ell )}\) are formed by label \(\ell \)). In this section, we present an algorithm that takes as input an intersection graph G given as a union of overlapping cliques and outputs a subgraph that is “almost” bipartite.
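The multigraph view can be checked mechanically: the weight \([{\textbf{R}}^T {\textbf{R}}]_{u,v}\) is exactly the number of cliques \(K^{(\ell )}\) containing both u and v. A small sketch with a toy matrix of our own choosing:

```python
from itertools import combinations
from collections import Counter

def clique_union_weights(label_sets):
    """Edge multiplicities of the multigraph union of cliques K^(ell)."""
    w = Counter()
    for L in label_sets:
        for u, v in combinations(sorted(L), 2):
            w[(u, v)] += 1                # one parallel edge per shared label
    return w

R = [[1, 1, 0, 1],
     [0, 1, 1, 1],
     [1, 0, 1, 0]]
label_sets = [{v for v in range(4) if row[v]} for row in R]
w = clique_union_weights(label_sets)
m, n = len(R), len(R[0])
for u in range(n):
    for v in range(u + 1, n):
        assert w[(u, v)] == sum(R[l][u] * R[l][v] for l in range(m))
```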
To facilitate the presentation of our algorithm, we first give some useful definitions. A closed vertex-label sequence is a sequence of alternating vertices and labels starting and ending at the same vertex, namely \(\sigma := v_1, \ell _1, v_2, \ell _2, \cdots , v_k, \ell _{k}, v_{k+1}=v_1\), where \(v_i \in V\), \(\ell _i \in {\mathcal M}\), and \(\{v_i, v_{i+1}\} \subseteq L_{\ell _i}\), for all \(i \in [k]\) (i.e. \(v_i\) is connected to \(v_{i+1}\) in the intersection graph; see Fig. 1). The size of the closed vertex-label sequence, denoted by \(|\sigma |\), is the number of its labels, i.e., \(|\sigma |=k\). We will also say that label \(\ell \) is strong if \(|L_{\ell } |\ge 3\), otherwise it is weak. For a given closed vertex-label sequence \(\sigma \), and any integer \(\lambda \in [|\sigma |]\), we will say that \(\sigma \) is \(\lambda \)-strong if \(|L_{\ell _i} |\ge 3\), for \(\lambda \) indices \(i \in [|\sigma |]\). The structural Lemma below is useful for our analysis.Footnote 2
Lemma 1
Let \(G(V,E, {\textbf{R}}^T {\textbf{R}})\) be a random instance of the \(\overline{{\mathcal G}}_{n, m, p}\) model, with \(m=n\), and \(p = \frac{c}{n}\), for some constant \(c>0\). With high probability over the choices of \({\textbf{R}}\), 0-strong closed vertex-label sequences in G do not have labels in common.
Proof
We will use the first moment method and so we need to prove that the expectation of the number of pairs of distinct 0-strong closed vertex-label sequences in G that have at least one label in common goes to 0. To this end, for \(j \in [\min (k, k')-1]\), let \(A_j(k, k')\) denote the number of such sequences \(\sigma , \sigma '\), with \(k=|\sigma |, k' = |\sigma ' |\), that have j labels in common. In particular, for integers \(k, k'\), let \( \sigma :=v_1, \ell _1, v_2, \ell _2, \cdots , v_k, \ell _{k}, v_{k+1}=v_1\), and let \(\sigma ':=v'_1, \ell '_1, v'_2, \ell '_2, \cdots , v'_{k'}, \ell '_{k'}, v'_{k'+1}=v'_1\). Notice that any such fixed pair \(\sigma , \sigma '\) has the same probability to appear, namely \(p^{2(k+k'-j)} (1-p)^{(n-2)(k+k'-j)}\); indeed, \(p^{2k} (1-p)^{(n-2)k}\) is the probability that \(\sigma \) appears (recall that \(\sigma \) has k labels and it is 0-strong, i.e. each label is only selected by two vertices) and \(p^{2(k'-j)} (1-p)^{(n-2)(k'-j)}\) is the probability that \(\sigma '\) appears given that \(\sigma \) has appeared. Furthermore, the number of such pairs of sequences is dominated by the number of sequences that overlap in j consecutive labels (e.g. the first j), which is at most \(n^k m^k n^{k'-j-1} m^{k'-j}\) (notice that j common labels implies that there are at least \(j+1\) common vertices). Overall, since \(n=m\) and \(p = \frac{c}{n}\), we have
Since \(n \rightarrow \infty \) and \(p = \frac{c}{n}\), by elementary calculus we have that \(c^2 (1-p)^{n-2}\) is bounded by a constant (which depends only on c) strictly less than 1. Hence, the above expectation is at most \(e^{-\ln {n} - \Theta (1) (k+k'-j)}\). Therefore, summing over all choices of \(k, k' \in [n]\) and \(j \in [\min (k, k')-1]\), we get that the expected number of pairs of distinct 0-strong closed vertex-label sequences that have at least one label in common is at most
and the proof is completed by Markov’s inequality. \(\square \)
The following definition is essential for the presentation of our algorithm.
Definition 3
Given a weighted intersection graph \(G=G(V,E, {\textbf{R}}^T {\textbf{R}})\) and a subgraph \(G^{(b)} \subseteq G\), let \({\mathcal C}_{odd}(G^{(b)})\) be the set of odd length closed vertex-label sequences \(\sigma := v_1, \ell _1, v_2, \ell _2, \cdots , v_k, \ell _{k}, v_{k+1}=v_1\) that additionally satisfy the following:
(a) \(\sigma \) has distinct vertices (except the first and the last) and distinct labels.

(b) \(v_i\) is connected to \(v_{i+1}\) in \(G^{(b)}\), for all \(i \in [|\sigma |]\).

(c) \(\sigma \) is \(\lambda \)-strong, for some \(\lambda > 0\).
Our Weak Bipartization Algorithm initially replaces each clique \(K^{(\ell )}\) by a random maximal matching \(M^{(\ell )}\), and thus gets a subgraph \(G^{(b)} \subseteq G\) (see Fig. 1). If \({\mathcal C}_{odd}(G^{(b)})\) is not empty, then the algorithm selects \(\sigma \in {\mathcal C}_{odd}(G^{(b)})\) and a strong label \(\ell \in \sigma \), and then replaces \(M^{(\ell )}\) in \(G^{(b)}\) by a new random matching of \(K^{(\ell )}\). The algorithm repeats until all odd cycles are destroyed (or runs forever trying to do so).
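The resampling loop just described can be sketched as follows. This is an intentionally coarser version of the algorithm (our own simplification: whenever the current \(G^{(b)}\) is non-bipartite it resamples a uniformly random strong label, rather than targeting a strong label on a sequence in \({\mathcal C}_{odd}(G^{(b)})\)); all names and the toy instance are illustrative:

```python
import random
from collections import defaultdict, deque

def random_maximal_matching(vertices, rng):
    """Pair up the clique's vertices after a random shuffle."""
    vs = list(vertices)
    rng.shuffle(vs)
    return [(vs[i], vs[i + 1]) for i in range(0, len(vs) - 1, 2)]

def is_bipartite(n, edges):
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    color = {}
    for s in range(n):                      # BFS 2-coloring, component by component
        if s in color:
            continue
        color[s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]
                    q.append(v)
                elif color[v] == color[u]:
                    return False
    return True

def weak_bipartization(n, labels, rng, max_iters=10_000):
    """labels: dict ell -> vertex set L_ell. Returns dict ell -> matching edges."""
    matching = {l: random_maximal_matching(L, rng) for l, L in labels.items()}
    strong = [l for l, L in labels.items() if len(L) >= 3]
    for _ in range(max_iters):
        edges = [e for es in matching.values() for e in es]
        # stop when bipartite, or when only weak labels remain to blame
        if is_bipartite(n, edges) or not strong:
            break
        l = rng.choice(strong)              # coarse version of steps 5-6
        matching[l] = random_maximal_matching(labels[l], rng)
    return matching

# tiny instance: one strong label 'a', two weak labels closing a potential triangle
labels = {"a": {0, 1, 3}, "b": {1, 2}, "c": {0, 2}}
result = weak_bipartization(4, labels, random.Random(1))
assert is_bipartite(4, [e for es in result.values() for e in es])
```

On this instance the only bad configuration is the matching edge \(\{0,1\}\) for label a; each resampling avoids it with probability 2/3, so the loop terminates almost immediately.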
The following results are the main technical tools that justify the use of the Weak Bipartization Algorithm for Weighted Max Cut.
Lemma 2
If \({\mathcal C}_{odd}(G^{(b)})\) is empty, then \(G^{(b)}\) may only have 0-strong odd cycles.
Proof
For the sake of contradiction, assume \({\mathcal C}_{odd}(G^{(b)}) = \emptyset \), but \(G^{(b)} = \cup ^+_{\ell \in [m]} M^{(\ell )}\) has an odd cycle \(C_k\) that is not 0-strong and has minimum length. Notice that \(C_k\) corresponds to a closed vertex-label sequence, say \(\sigma := v_1, \ell _1, v_2, \ell _2, \cdots , v_k, \ell _{k}, v_{k+1}=v_1\), where \(\{v_i, v_{i+1}\} \in M^{(\ell _i)}\), for all \(i \in [k]\). Furthermore, by assumption, conditions (b) and (c) of Definition 3 are satisfied by \(\sigma \) (indeed \(\{v_i, v_{i+1}\} \in M^{(\ell _i)}\), for all \(i \in [k]\), and \(\sigma \) is \(\lambda \)-strong, for some \(\lambda >0\)). Therefore, the only reason for which \(\sigma \) does not belong to \({\mathcal C}_{odd}(G^{(b)})\) is that condition (a) of Definition 3 is not satisfied, i.e. there are distinct indices \(i > i' \in [k]\) such that \(\ell _i = \ell _{i'}\). Clearly, such indices are not consecutive (i.e. \(i \ne i'+1\)), because \(\ell _i\) is strong and step 6 of our algorithm implies that \(M^{(\ell _i)}\) is a matching of \(K^{(\ell _i)}\). But then either the vertex-label sequence \(v_1, \ldots , v_i, \ell _i, v_{i'+1}, \ell _{i'+1}, v_{i'+2}, \ldots , v_{k+1} = v_1\) or the vertex-label sequence \(v_{i+1}, \ell _{i+1}, v_{i+2}, \ldots , v_{i'}, \ell _{i}, v_{i+1}\) corresponds to a shorter odd cycle, which is a contradiction to the minimality of \(C_k\). \(\square \)
Theorem 5
Let \(G(V,E, {\textbf{R}}^T {\textbf{R}})\) be a random instance of the \(\overline{{\mathcal G}}_{n, m, p}\) model, with \(n=m\) and \(p = \frac{c}{n}\), where \(c>0\) is a constant, and let \({\textbf{R}}\) be its representation matrix. Let also \(\Sigma \) be a set system with incidence matrix \({\textbf{R}}\). With high probability over the choices of \({\textbf{R}}\), if the Weak Bipartization Algorithm terminates on input G, its output can be used to construct a 2-coloring \({\textbf{x}}^{(\text {disc})} \in \arg \min _{{\textbf{x}} \in \{\pm 1\}^n} \text {disc}(\Sigma , {\textbf{x}})\), which also gives a maximum cut in G, i.e. \({\textbf{x}}^{(\text {disc})} \in \arg \max _{{\textbf{x}} \in \{\pm 1\}^n} \text {Cut}(G, {\textbf{x}})\).
Proof
By construction, the output of the Weak Bipartization Algorithm, namely \(G^{(b)}\), has only 0-strong odd cycles. Furthermore, by Lemma 1 these cycles correspond to vertex-label sequences that are label-disjoint. Let H denote the subgraph of \(G^{(b)}\) in which we have destroyed all 0-strong odd cycles by deleting a single (arbitrary) edge \(e_C\) from each 0-strong odd cycle C (keeping all other edges intact), and notice that \(e_C\) corresponds to a weak label. In particular, H is a bipartite multi-graph and thus its vertices can be partitioned into two independent sets A, B constructed as follows: In each connected component of H, start with an arbitrary vertex v and include in A (resp. in B) the set of vertices reachable from v that are at an even (resp. odd) distance from v. Since H is bipartite, it does not have odd cycles, and thus this construction is well-defined, i.e. no vertex can be placed in both A and B.
We now define \({\textbf{x}}^{(disc)}\) by setting \(x^{(disc)}_i = +1\) if \(i \in A\) and \(x^{(disc)}_i = -1\) if \(i \in B\). Let \({\mathcal M}_0\) denote the set of weak labels corresponding to the edges removed from \(G^{(b)}\) in the construction of H. We first note that, for each \(\ell _C \in {\mathcal M}_0\) corresponding to the removal of an edge \(e_C\), we have \(\left|\sum _{i \in L_{\ell _C}} x^{(disc)}_i \right|=2\). Indeed, since \(e_C\) belongs to an odd cycle in \(G^{(b)}\), its endpoints are at even distance in H, which means that either they both belong to A or they both belong to B. Therefore, their corresponding entries of \({\textbf{x}}^{(disc)}\) have the same sign, and so (taking into account that the endpoints of \(e_C\) are the only vertices in \(L_{\ell _C}\)), we have \(\left|\sum _{i \in L_{\ell _C}} x^{(disc)}_i \right|=2\). Second, we show that, for all the other labels \(\ell \in [m] \backslash {\mathcal M}_0\), \(\left|\sum _{i \in L_{\ell }} x^{(disc)}_i \right|\) will be equal to 1 if \(|L_{\ell } |\) is odd and 0 otherwise. For any label \(\ell \in [m] \backslash {\mathcal M}_0\), let \(M^{(\ell )}\) denote the part of \(G^{(b)}\) corresponding to a maximal matching of \(K^{(\ell )}\), and note that all edges of \(M^{(\ell )}\) are contained in H. Since H is bipartite, no edge in \(M^{(\ell )}\) can have both its endpoints in either A or B. Therefore, by construction, the contribution of entries of \({\textbf{x}}^{(disc)}\) corresponding to endpoints of edges in \(M^{(\ell )}\) to the sum \(\sum _{i \in L_{\ell }} x^{(disc)}_i\) is 0. In particular, if \(|L_{\ell } |\) is even, then \(M^{(\ell )}\) is a perfect matching and \(\left|\sum _{i \in L_{\ell }} x^{(disc)}_i \right|= 0\), otherwise (i.e. if \(|L_{\ell } |\) is odd) there is a single vertex not matched in \(M^{(\ell )}\) and \(\left|\sum _{i \in L_{\ell }} x^{(disc)}_i \right|= 1\).
To complete the proof of the theorem, we need to show that \(\text {Cut}(G, {\textbf{x}}^{(disc)})\) is maximum. By Proposition 1, this is equivalent to proving that \(\Vert {\textbf{R}} {\textbf{x}}^{(disc)}\Vert \le \Vert {\textbf{R}} {\textbf{x}}\Vert \) for all \({\textbf{x}} \in \{-1,+1\}^n\). Suppose that there is some \({\textbf{x}}^{(min)} \in \{-1,+1\}^n\) such that \(\Vert {\textbf{R}} {\textbf{x}}^{(disc)}\Vert > \Vert {\textbf{R}} {\textbf{x}}^{(min)}\Vert \). As mentioned above, for all \(\ell \in [m] \backslash {\mathcal M}_0\), we have \(\left|[{\textbf{R}} {\textbf{x}}^{(disc)}]_{\ell } \right|\le 1\); since \([{\textbf{R}} {\textbf{x}}]_{\ell }\) has the same parity as \(|L_{\ell } |\) for every \({\textbf{x}} \in \{-1,+1\}^n\), this gives \(\left|[{\textbf{R}} {\textbf{x}}^{(disc)}]_{\ell } \right|\le \left|[{\textbf{R}} {\textbf{x}}^{(min)}]_{\ell } \right|\). Therefore, the only labels where \({\textbf{x}}^{(min)}\) could do better are those corresponding to edges \(e_C\) that are removed from \(G^{(b)}\) in the construction of H, i.e. \(\ell _C \in {\mathcal M}_0\), for which we have \(\left|[{\textbf{R}} {\textbf{x}}^{(disc)}]_{\ell _C} \right|=2\). However, any such edge \(e_C\) belongs to an odd cycle C, and thus any 2-coloring of the vertices of C will force at least one of the 0-strong labels corresponding to edges of C to be monochromatic. Taking into account the fact that, by Lemma 1, with high probability over the choices of \({\textbf{R}}\), all 0-strong odd cycles correspond to vertex-label sequences that are label-disjoint, we conclude that \(\Vert {\textbf{R}} {\textbf{x}}^{(disc)}\Vert \le \Vert {\textbf{R}} {\textbf{x}}^{(min)}\Vert \), which completes the proof. \(\square \)
The fact that Theorem 5 is not an immediate consequence of Corollary 1 follows from the observation that a random set system with incidence matrix \({\textbf{R}}\) has discrepancy larger than 1 with (at least) constant probability when \(m=n\) and \(p = \frac{c}{n}\). Indeed, by a straightforward counting argument, we can see that the expected number of 0-strong odd cycles is at least constant. Furthermore, in any 2-coloring of the vertices at least one of the weak labels forming edges in a 0-strong odd cycle will be monochromatic. Therefore, with at least constant probability, for any \({\textbf{x}} \in \{-1,+1\}^n\), there exists a weak label \(\ell \), such that \(x_i x_j=1\), for both \(i, j \in L_{\ell }\), implying that \(\text {disc}(L_{\ell })=2\).
We close this section by a result indicating that the conditional statement of Theorem 5 is not void, namely there is a range of values for c where the Weak Bipartization Algorithm terminates in polynomial time.
Theorem 6
Let \(G(V,E, {\textbf{R}}^T {\textbf{R}})\) be a random instance of the \(\overline{{\mathcal G}}_{n, m, p}\) model, with \(n=m\) and \(p = \frac{c}{n}\), where \(0<c<1\) is a constant, and let \({\textbf{R}}\) be its representation matrix. With high probability over the choices of \({\textbf{R}}\), the Weak Bipartization Algorithm terminates on input G in \(O\left( (n+\sum _{\ell \in [m]} |L_{\ell } |) \cdot \log {n} \right) \) time, which is polynomial in n.
Before presenting the proof of the Theorem, we first prove the following structural Lemma regarding the expected number of closed vertex-label sequences.
Lemma 3
Let \(G(V,E, {\textbf{R}}^T {\textbf{R}})\) be a random instance of the \(\overline{{\mathcal G}}_{n, m, p}\) model. Let also \(C_k\) denote the number of distinct closed vertex-label sequences of size k in G. Then
In particular, when \(m=n \rightarrow \infty \), \(p = \frac{c}{n}, c>0\), and \(k \ge 3\), we have \(\mathbb {E}[C_k] \le \frac{e}{2\pi } c^{2k}\).
Proof
Notice that there are \(\frac{1}{k} \frac{n!}{(n-k)!}\) ways to arrange k out of n vertices in a cycle. Furthermore, in each such arrangement, there are \(\frac{m!}{(m-k)!}\) ways to place k out of m labels so that there is exactly one label between each pair of consecutive vertices. Since each label in any given arrangement must be selected by both of its adjacent vertices, each arrangement appears with probability \(p^{2k}\), and (23) follows by linearity of expectation.
Setting \(m=n\) and \(p = \frac{c}{n}\), and using the inequalities \(\sqrt{2 \pi } n^{n+\frac{1}{2}}e^{-n} \le n! \le e n^{n+\frac{1}{2}}e^{-n}\),
When n goes to \(\infty \) and \(k \ge 3\), then the above is at most \(\frac{e}{2\pi } c^{2k}\) as needed. \(\square \)
We are now ready for the proof of the Theorem.
Proof of Theorem 6
We will prove that, when \(m=n \rightarrow \infty \), \(p = \frac{c}{n}, c<1\), and \(k \ge 3\), with high probability, there are no closed vertex-label sequences that have labels in common. To this end, recalling Definition 3 for \({\mathcal C}_{odd}(G^{(b)})\), we provide upper bounds on the probabilities of the following events: \(A {\mathop {=}\limits ^{\text {def}}} \{\exists k \ge \log {n}: C_k \ge 1\}\), \(B {\mathop {=}\limits ^{\text {def}}} \{|{\mathcal C}_{odd}(G^{(b)})|\ge \log {n}\}\) and \(C {\mathop {=}\limits ^{\text {def}}} \{\exists \sigma \ne \sigma ' \in {\mathcal C}_{odd}(G^{(b)}): \exists \ell \in \sigma , \ell \in \sigma '\}\).
By the union bound, Markov’s inequality and Lemma 3, we get that, with high probability, all closed vertex-label sequences have less than \(\log {n}\) labels:
$$\begin{aligned} \Pr (A) \le \sum _{k \ge \log {n}} \mathbb {E}[C_k] \le \frac{e}{2\pi } \sum _{k \ge \log {n}} c^{2k} \le \frac{e}{2\pi } \cdot \frac{c^{2 \log {n}}}{1-c^2} = o(1), \end{aligned}$$
where the last equality follows since \(c<1\) is a constant. Furthermore, by Markov’s inequality and Lemma 3, and noting that any closed vertex-label sequence in \({\mathcal C}_{odd}(G^{(b)})\) must have at least \(k \ge 3\) labels, we get that, with high probability, there are less than \(\log {n}\) closed vertex-label sequences in \({\mathcal C}_{odd}(G^{(b)})\):
$$\begin{aligned} \Pr (B) \le \frac{\mathbb {E}[|{\mathcal C}_{odd}(G^{(b)})|]}{\log {n}} \le \frac{1}{\log {n}} \sum _{k \ge 3} \frac{e}{2\pi } c^{2k} = \frac{e}{2\pi \log {n}} \cdot \frac{c^{6}}{1-c^2} = o(1). \end{aligned}$$
(24)
To bound \(\Pr (C)\), fix a closed vertex-label sequence \(\sigma \), and let \(|\sigma |\ge 3\) be the number of its labels. Notice that the existence of another closed vertex-label sequence that has labels in common with \(\sigma \) implies the existence of a vertex-label sequence \(\breve{\sigma }\) that starts with either a vertex or a label from \(\sigma \), ends with either a vertex or a label from \(\sigma \), and has at least one label or at least one vertex that does not belong to \(\sigma \). Let \(|\breve{\sigma }|\) denote the number of labels of \(\breve{\sigma }\) that do not belong to \(\sigma \). Then the number of different vertex-label sequences \(\breve{\sigma }\) that start and end in labels from \(\sigma \) is at most \(|\sigma |^2 n^{|\breve{\sigma }|+1} m^{|\breve{\sigma }|}\); indeed, \(\breve{\sigma }\) in this case has \(|\breve{\sigma }|\) labels and \(|\breve{\sigma }|+1\) vertices that do not belong to \(\sigma \). Therefore, by independence, each such sequence \(\breve{\sigma }\) appears with probability \(p^{2|\breve{\sigma }|+2}\). Similarly, the number of different vertex-label sequences \(\breve{\sigma }\) that start and end in vertices from \(\sigma \) is at most \(|\sigma |^2 n^{|\breve{\sigma }|-1} m^{|\breve{\sigma }|}\), and each one appears with probability \(p^{2|\breve{\sigma }|}\). Finally, the number of different vertex-label sequences \(\breve{\sigma }\) that start in a vertex from \(\sigma \) and end in a label from \(\sigma \) (notice that this also covers the case where \(\breve{\sigma }\) starts in a label from \(\sigma \) and ends in a vertex from \(\sigma \)) is at most \(|\sigma |^2 n^{|\breve{\sigma }|} m^{|\breve{\sigma }|}\), and each one appears with probability \(p^{2|\breve{\sigma }|+1}\). Overall, for a given sequence \(\sigma \), the expected number of sequences \(\breve{\sigma }\) described above that additionally satisfy \(|\breve{\sigma }|< \log {n}\) is at most
$$\begin{aligned} \sum _{j=1}^{\log {n}} |\sigma |^2 \left( n^{j+1} m^{j} p^{2j+2} + n^{j-1} m^{j} p^{2j} + n^{j} m^{j} p^{2j+1} \right) \le \frac{(1+c+c^2)\, c^2}{1-c^2} \cdot \frac{|\sigma |^2}{n}, \end{aligned}$$
(25)
where in the last inequality we used the fact that \(m=n, p = \frac{c}{n}\) and \(c<1\). Since the existence of a sequence \(\breve{\sigma }\) for \(\sigma \) that additionally satisfies \(|\breve{\sigma }|\ge \log {n}\) implies event A, and on the other hand the existence of more than \(\log {n}\) different sequences \(\sigma \in {\mathcal C}_{odd}(G^{(b)})\) implies event B, by Markov’s inequality and (25), we get
$$\begin{aligned} \Pr (C) \le \Pr (A) + \Pr (B) + \log {n} \cdot \frac{(1+c+c^2)\, c^2}{1-c^2} \cdot \frac{\log ^2{n}}{n} = o(1). \end{aligned}$$
We have thus proved that, with high probability over the choices of \({\textbf{R}}\), closed vertex-label sequences in \({\mathcal C}_{odd}(G^{(b)})\) are label disjoint, as needed.
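The case analysis for \(\Pr (C)\) above can be sanity-checked by direct arithmetic. The snippet below (with hypothetical values for n, c and \(|\sigma |\), all our own choices) sums the three upper bounds \(|\sigma |^2 n^{j+1} m^{j} p^{2j+2}\), \(|\sigma |^2 n^{j-1} m^{j} p^{2j}\) and \(|\sigma |^2 n^{j} m^{j} p^{2j+1}\) over \(j = |\breve{\sigma }| < \log n\), and confirms that for \(m=n\), \(p=c/n\), \(c<1\) the total is \(O(|\sigma |^2/n)\):

```python
import math

n = m = 10**6   # hypothetical instance size
c = 0.9         # any constant c < 1
p = c / n
sigma = 20      # hypothetical number of labels of the fixed sequence

total = 0.0
for j in range(1, int(math.log(n)) + 1):  # j = labels of the crossing sequence outside sigma
    label_to_label = sigma**2 * n**(j + 1) * m**j * p**(2 * j + 2)
    vertex_to_vertex = sigma**2 * n**(j - 1) * m**j * p**(2 * j)
    mixed = sigma**2 * n**j * m**j * p**(2 * j + 1)
    total += label_to_label + vertex_to_vertex + mixed

# With m = n and p = c/n, each term collapses to (sigma^2 / n) * c^(2j)
# times c^2, 1, or c respectively, so the geometric sum is at most
# sigma^2 * (1 + c + c^2) / ((1 - c^2) * n).
assert total <= sigma**2 * (1 + c + c**2) / ((1 - c**2) * n)
```

This is exactly why all three cases contribute only \(\Theta (1/n)\) per fixed \(\sigma \): the polynomial factors \(n^{\cdot } m^{\cdot }\) cancel against \(p^{\cdot }\) up to a single surviving \(1/n\).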
In view of this, the proof of the theorem follows by noting that, since closed vertex-label sequences in \({\mathcal C}_{odd}(G^{(b)})\) are label disjoint, steps 5 and 6 within the while loop of the Weak Bipartization Algorithm will be executed exactly once for each sequence in \({\mathcal C}_{odd}(G^{(b)})\), where \(G^{(b)}\) is defined in step 3 of the algorithm; indeed, once a closed vertex-label sequence \(\sigma \in {\mathcal C}_{odd}(G^{(b)})\) is destroyed in step 6, no new closed vertex-label sequence is created. In fact, once \(\sigma \) is destroyed, we can remove the corresponding labels and edges from \(G^{(b)}\), as these will no longer belong to other closed vertex-label sequences. Furthermore, to find a closed vertex-label sequence in \({\mathcal C}_{odd}(G^{(b)})\), it suffices to find an odd cycle in \(G^{(b)}\), which can be done by running DFS, requiring \(O(n+\sum _{\ell \in [m]} |L_{\ell }|)\) time, because \(G^{(b)}\) has at most \(\sum _{\ell \in [m]} |L_{\ell }|\) edges. Finally, by (24), we have \(|{\mathcal C}_{odd}(G^{(b)})|< \log {n}\) with high probability, and so the running time of the Weak Bipartization Algorithm is \(O((n+\sum _{\ell \in [m]} |L_{\ell }|) \log {n})\), which concludes the proof of Theorem 6.
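The odd-cycle subroutine referred to above can be sketched as follows. This is an illustrative implementation over a generic adjacency-list graph (the function name and graph representation are our own, not the paper's), using the standard fact that a graph is non-bipartite if and only if a BFS 2-coloring finds an edge between same-colored vertices; the search runs in time linear in the number of vertices plus edges:

```python
from collections import deque

def find_odd_cycle(adj):
    """BFS 2-coloring of an undirected graph given as a dict mapping each
    vertex to a list of its neighbours.  Returns the vertex list of an odd
    cycle, or None if the graph is bipartite."""
    color, parent = {}, {}
    for s in adj:
        if s in color:
            continue
        color[s], parent[s] = 0, None
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v], parent[v] = 1 - color[u], u
                    queue.append(v)
                elif color[v] == color[u]:
                    # An edge between same-colored vertices closes an odd
                    # cycle: walk both endpoints up to their lowest common
                    # ancestor in the BFS tree.
                    up = _path_to_root(u, parent)
                    vp = _path_to_root(v, parent)
                    on_up = set(up)
                    i = 0
                    while vp[i] not in on_up:
                        i += 1
                    lca = vp[i]
                    return up[:up.index(lca) + 1] + vp[:i][::-1]
    return None

def _path_to_root(x, parent):
    # Ancestor chain of x in the BFS tree, ending at the root.
    path = [x]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path
```

For example, on a triangle the function returns its three vertices, while on any even cycle it returns None.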
5 Discussion and Some Open Problems
In this paper, we introduced the model of weighted random intersection graphs and we studied the average case analysis of Weighted Max Cut through the prism of discrepancy of random set systems. In particular, in the first part of the paper, we proved concentration of the weight of a maximum cut of \(G(V, E, {\textbf{R}}^T {\textbf{R}})\) around its expected value, and we used it to show that, with high probability, the weight of a random cut is asymptotically equal to the maximum cut weight of the input graph, when \(m = n^{\alpha }, \alpha <1\). On the other hand, in the case where the number of labels is equal to the number of vertices (i.e. \(m=n\)), we proved that a majority algorithm gives a cut with weight larger than that of a random cut by a multiplicative constant strictly larger than 1, when \(p = \frac{c}{n}\) and c is a sufficiently large constant.
In the second part of the paper, we highlighted a connection between Weighted Max Cut of sparse weighted random intersection graphs and Discrepancy of sparse random set systems, formalized through our Weak Bipartization Algorithm and its analysis. We demonstrated how our proposed framework can be used to find optimal solutions for these problems, with high probability, in special cases of sparse inputs (\(m=n, p=\frac{c}{n}, c<1\)).
One of the main problems left open in our work concerns the termination of our Weak Bipartization Algorithm for large values of c. We conjecture the following:
Conjecture 1
Let \(G(V,E, {\textbf{R}}^T {\textbf{R}})\) be a random instance of the \(\overline{{\mathcal G}}_{n, m, p}\) model, with \(m=n\), and \(p = \frac{c}{n}\), for some constant \(c \ge 1\). With high probability over the choices of \({\textbf{R}}\), on input G, the Weak Bipartization Algorithm terminates in polynomial time.
We also leave the problem of determining whether the Weak Bipartization Algorithm terminates in polynomial time, in the case \(m=n\) and \(p = \omega (1/n)\), as an open question for future research.
Towards strengthening the connection between Weighted Max Cut under the \(\overline{{\mathcal G}}_{n, m, p}\) model, and Discrepancy in random set systems, we conjecture the following:
Conjecture 2
Let \(G(V,E, {\textbf{R}}^T {\textbf{R}})\) be a random instance of the \(\overline{{\mathcal G}}_{n, m, p}\) model, with \(m=n, p = \frac{c}{n}\), for some positive constant c, and let \({\textbf{R}}\) be its representation matrix. Let also \(\Sigma \) be a set system with incidence matrix \({\textbf{R}}\). Then, with high probability over the choices of \({\textbf{R}}\), there exists \({\textbf{x}}^{\text {disc}} \in \arg \min _{{\textbf{x}} \in \{-1, +1\}^n} \text {disc}(\Sigma , {\textbf{x}})\), such that \( \texttt {Cut}(G, {\textbf{x}}^{\text {disc}})\) is asymptotically equal to \(\texttt {Max-Cut}(G)\).
Notes
This property follows inductively, by noting that, if \(X = \sum _{i=1}^k a_i X_i - \sum _{i=k+1}^N a_i X_i\), and \(X'=\sum _{i=1}^{k-1} a_i X_i + (a_k-1)X_k+X'_k - \sum _{i=k+1}^N a_i X_i\), where \(k, N, a_i \in \mathbb {N}^+, i \in [N]\), and \(X_i, i \in [N], X'_k\) are independent, identically distributed Bernoulli random variables, then \(\mathbb {E}[|X |] \ge \mathbb {E}[|X'|]\). Indeed, notice that, the independence of \(X_k, X'_k\) implies that these random variables work against each other (with respect to the absolute value) at least half of the time.
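The inductive step in this note can be checked by exact enumeration. The snippet below (with an arbitrary choice of p and a minimal instance \(N=2\), \(k=1\), \(a_1=2\), \(a_2=1\)) compares \(\mathbb {E}|2X_1 - X_2|\) against \(\mathbb {E}|X_1 + X'_1 - X_2|\), where splitting the coefficient 2 into two independent copies can only decrease the expected absolute value:

```python
from itertools import product

def exp_abs(coeffs, p):
    """E|sum_i c_i X_i| for independent Bernoulli(p) variables X_i and
    signed integer coefficients c_i, computed by exact enumeration of
    all 2^n outcomes."""
    total = 0.0
    for bits in product((0, 1), repeat=len(coeffs)):
        prob = 1.0
        for b in bits:
            prob *= p if b == 1 else 1 - p
        total += prob * abs(sum(c * b for c, b in zip(coeffs, bits)))
    return total

p = 0.3  # arbitrary success probability
# X = 2*X1 - X2  versus  X' = X1 + X1' - X2  (coefficient 2 split in two)
assert exp_abs((2, -1), p) >= exp_abs((1, 1, -1), p)
```

Here \(\mathbb {E}|2X_1 - X_2| = 0.72\) while \(\mathbb {E}|X_1 + X'_1 - X_2| = 0.594\), consistent with the claimed monotonicity.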
We conjecture that the structural property of Lemma 1 also holds if we replace 0-strong with \(\lambda \)-strong, for any positive constant \(\lambda \), but this stronger version is not necessary for our analysis.
Funding
Open access funding provided by HEAL-Link Greece. Christoforos Raptopoulos was supported by the Hellenic Foundation for Research and Innovation (H.F.R.I.) under the “2nd Call for H.F.R.I. Research Projects to support Post-Doctoral Researchers” (Project Number: 704). Paul Spirakis was supported by the NeST initiative of the School of EEE and CS at the University of Liverpool and by the EPSRC grant EP/P02002X/1.
Contributions
All authors contributed equally to this work.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Nikoletseas, S., Raptopoulos, C. & Spirakis, P. MAX CUT in Weighted Random Intersection Graphs and Discrepancy of Sparse Random Set Systems. Algorithmica 85, 2817–2842 (2023). https://doi.org/10.1007/s00453-023-01121-3