MAX CUT in Weighted Random Intersection Graphs and Discrepancy of Sparse Random Set Systems

Let V be a set of n vertices, $\mathcal{M}$ a set of m labels, and let $\textbf{R}$ be an $m \times n$ matrix of independent Bernoulli random variables with probability of success p; the columns of $\textbf{R}$ are incidence vectors of the label sets assigned to the vertices.
A random instance $G(V, E, \textbf{R}^T \textbf{R})$ of the weighted random intersection graph model is constructed by drawing an edge with weight equal to the number of common labels (namely $[\textbf{R}^T \textbf{R}]_{v,u}$) between any two vertices u, v for which this weight is strictly larger than 0. In this paper we study the average case analysis of Weighted Max Cut, assuming the input is a weighted random intersection graph, i.e. given $G(V, E, \textbf{R}^T \textbf{R})$ we wish to find a partition of V into two sets so that the total weight of the edges having exactly one endpoint in each set is maximized.
In particular, we initially prove that the weight of a maximum cut of $G(V, E, \textbf{R}^T \textbf{R})$ concentrates around its expected value, and then show that, when the number of labels is much smaller than the number of vertices (in particular, $m = n^{\alpha}$, $\alpha < 1$), a random partition of the vertices achieves asymptotically optimal cut weight with high probability. Furthermore, in the case $n = m$ and constant average degree (i.e. $p = \frac{\Theta(1)}{n}$), we show that, with high probability, a majority type randomized algorithm outputs a cut with weight larger than the weight of a random cut by a multiplicative constant strictly greater than 1.
Then, we formally prove a connection between the computational problem of finding a (weighted) maximum cut in $G(V, E, \textbf{R}^T \textbf{R})$ and the problem of finding a 2-coloring that achieves minimum discrepancy for a set system $\Sigma$ with incidence matrix $\textbf{R}$ (i.e. minimum imbalance over all sets in $\Sigma$).
We exploit this connection by proposing a (weak) bipartization algorithm for the case $m = n$, $p = \frac{\Theta(1)}{n}$ whose output, whenever the algorithm terminates, can be used to find a 2-coloring with minimum discrepancy in a set system with incidence matrix $\textbf{R}$. In fact, with high probability, the latter 2-coloring corresponds to a bipartition with maximum cut weight in $G(V, E, \textbf{R}^T \textbf{R})$. Finally, we prove that our (weak) bipartization algorithm terminates in polynomial time, with high probability, at least when $p = \frac{c}{n}$, $c < 1$.


Introduction
Given an undirected graph G(V, E), the Max Cut problem asks for a partition of the vertices of G into two sets, such that the number of edges with exactly one endpoint in each set of the partition is maximized. This problem can be naturally generalized for weighted (undirected) graphs. A weighted graph is denoted by G(V, E, W), where V is the set of vertices, E is the set of edges and W is a weight matrix, which specifies a weight $W_{i,j} = w_{i,j}$ for each pair of vertices i, j. In particular, we assume that $W_{i,j} = 0$ for each pair $\{i, j\} \notin E$.
▶ Definition 1 (Weighted Max Cut). Given a weighted graph G(V, E, W), find a partition of V into two (disjoint) subsets A, B, so as to maximize the cumulative weight of the edges of G having one endpoint in A and the other in B.
Weighted Max Cut is fundamental in theoretical computer science and is relevant in various graph layout and embedding problems [10]. Furthermore, it also has many practical applications, including infrastructure cost and circuit layout optimization in network and VLSI design [19], minimizing the Hamiltonian of a spin glass model in statistical physics [3], and data clustering [18]. In the worst case, Max Cut (and also Weighted Max Cut) is APX-hard, meaning that there is no polynomial-time approximation scheme that finds a solution arbitrarily close to the optimum, unless P = NP [17].
The average case analysis of Max Cut, namely the case where the input graph is chosen at random from a probabilistic space of graphs, is also of considerable interest and is further motivated by the desire to justify and understand why various graph partitioning heuristics work well in practical applications. In most research works the input graphs are drawn from the Erdős–Rényi random graph model $G_{n,m}$, i.e. random instances are drawn equiprobably from the set of simple undirected graphs on n vertices and m edges, where m is a linear function of n (see also [13, 7] for the average case analysis of Max Cut and its generalizations with respect to other random graph models). One of the earliest results in this area is that Max Cut undergoes a phase transition on $G_{n,\gamma n}$ at $\gamma = \frac{1}{2}$ [8], in that the difference between the number of edges of the graph and the Max-Cut size is O(1) for $\gamma < \frac{1}{2}$, while it is $\Omega(n)$ when $\gamma > \frac{1}{2}$. For large values of $\gamma$, it was proved in [4] that the maximum cut size of $G_{n,\gamma n}$ normalized by the number of vertices n reaches an absolute limit in probability as $n \to \infty$, but it was not until recently that the latter limit was established and expressed analytically in [9], using the interpolation method; in particular, it was shown to be asymptotically equal to $\left(\frac{\gamma}{2} + P_* \sqrt{\frac{\gamma}{2}}\right)n$, where $P_* \approx 0.7632$. We note however that these results are existential, and thus do not lead to an efficient approximation scheme for finding a tight approximation of the maximum cut with large enough probability when the input graph is drawn from $G_{n,\gamma n}$. An efficient approximation scheme in this case was designed in [8], and it was proved that, with high probability, this scheme constructs a cut with at least $\left(\frac{\gamma}{2} + 0.37613\sqrt{\gamma}\right)n = \left(1 + 0.75226\frac{1}{\sqrt{\gamma}}\right)\frac{\gamma}{2}n$ edges, noting that $\frac{\gamma}{2}n$ is the size of a random cut (in which each vertex is placed independently and equiprobably in one of the two sets of the partition). Whether there exists an efficient approximation scheme that can close the gap between the approximation guarantee of [8] and the limit of [9] remains an open problem.
In this paper, we study the average case analysis of Weighted Max Cut when input graphs are drawn from a generalization of another well-established model of random graphs, namely the weighted random intersection graph model (the unweighted version of the model was initially defined in [15]). In this model, edges are formed through the intersection of the label sets assigned to the vertices, and edge weights are equal to the number of common labels between endpoints.
▶ Definition 2 (Weighted random intersection graph). Consider a universe $\mathcal{M} = \{1, 2, \ldots, m\}$ of labels and a set of n vertices V. We define the $m \times n$ representation matrix $\textbf{R}$ whose entries are independent Bernoulli random variables with probability of success p. For $\ell \in \mathcal{M}$ and $v \in V$, we say that vertex v has chosen label $\ell$ iff $R_{\ell,v} = 1$. Furthermore, we draw an edge with weight $[\textbf{R}^T \textbf{R}]_{v,u}$ between any two vertices u, v for which this weight is strictly larger than 0. The weighted graph $G = (V, E, \textbf{R}^T \textbf{R})$ is then a random instance of the weighted random intersection graph model $G_{n,m,p}$.
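As a concrete illustration of Definition 2, the model can be sampled directly; the following sketch (our own, with hypothetical helper names) builds the representation matrix $\textbf{R}$ and the weight matrix $\textbf{R}^T \textbf{R}$:

```python
import numpy as np

def sample_weighted_rig(n, m, p, seed=0):
    """Sample a weighted random intersection graph G(V, E, R^T R).

    Returns the m x n representation matrix R (i.i.d. Bernoulli(p)
    entries) and the n x n weight matrix W = R^T R, where W[u, v]
    counts the labels shared by vertices u and v.
    """
    rng = np.random.default_rng(seed)
    R = (rng.random((m, n)) < p).astype(int)
    W = R.T @ R
    return R, W

R, W = sample_weighted_rig(n=6, m=4, p=0.5)
# Edges are exactly the unordered pairs with strictly positive weight.
edges = [(u, v) for u in range(6) for v in range(u + 1, 6) if W[u, v] > 0]
```

Note that the diagonal entry $W_{v,v}$ equals the number of labels chosen by v, consistent with the convention (stated later) that diagonal entries are allowed but do not affect cut weights.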
Random intersection graphs are relevant to and capture quite nicely social networking; vertices are the individual actors and labels correspond to specific types of interdependency. Other applications include oblivious resource sharing in a (general) distributed setting, efficient and secure communication in sensor networks [20], interactions of mobile agents traversing the web, etc. (see e.g. the survey papers [6, 16] for further motivation and recent research related to random intersection graphs). In all these settings, weighted random intersection graphs, in particular, also capture the strength of connections between actors (e.g. in a social network, individuals having several characteristics in common have more intimate relationships than those that share only a few common characteristics). One of the most celebrated results in this area is the equivalence (measured in terms of total variation distance) of random intersection graphs and Erdős–Rényi random graphs when the number of labels satisfies $m = n^{\alpha}$, $\alpha > 6$ [12]. This bound on the number of labels was improved in [22], by showing equivalence of sharp threshold functions between the two models for $\alpha \geq 3$. Similarity of the two models has been proved even for smaller values of $\alpha$ (e.g. for any $\alpha > 1$) in the form of various translation results (see e.g. Theorem 1 in [21]), suggesting that some algorithmic ideas developed for Erdős–Rényi random graphs also work for random intersection graphs (and also weighted random intersection graphs).
In view of this, in the present paper we study the average case analysis of Weighted Max Cut under the weighted random intersection graph model, for the range $m = n^{\alpha}$, $\alpha \leq 1$, for two main reasons. First, the average case analysis of Max Cut has not been considered in the literature so far when the input is drawn from the random intersection graph model, and thus the asymptotic behaviour of the maximum cut remains unknown, especially for the range of values where random intersection graphs and Erdős–Rényi random graphs differ the most. Furthermore, studying a model where we can implicitly control the intersection number (indeed, m is an obvious upper bound on the number of cliques that can cover all edges of the graph) may help understand algorithmic bottlenecks for finding maximum cuts in Erdős–Rényi random graphs.
Second, we note that the representation matrix $\textbf{R}$ of a weighted random intersection graph can be used to define a random set system $\Sigma$ consisting of m sets, $\Sigma = \{L_1, \ldots, L_m\}$, where $L_\ell$ is the set of vertices that have chosen label $\ell$; we say that $\textbf{R}$ is the incidence matrix of $\Sigma$. Therefore, there is a natural connection between Weighted Max Cut and the discrepancy of such random set systems, which we formalize in this paper. In particular, given a set system $\Sigma$ with incidence matrix $\textbf{R}$, its discrepancy is defined as $\text{disc}(\Sigma) = \min_{x \in \{\pm 1\}^n} \max_{L \in \Sigma} \left| \sum_{v \in L} x_v \right| = \min_{x \in \{\pm 1\}^n} \|\textbf{R}x\|_\infty$, i.e. it is the minimum over all 2-colorings x of the maximum imbalance over all sets in $\Sigma$. Recent work on the discrepancy of random rectangular matrices defined as above [1] has shown that, when the number of labels (sets) m satisfies $n \geq 0.73 m \log m$, the discrepancy of $\Sigma$ is at most 1 with high probability. The proof of the main result in [1] is based on a conditional second moment method combined with Stein's method of exchangeable pairs, and improves upon a Fourier analytic result of [14], and also upon previous results in [11] and [20]. The design of an efficient algorithm that can find a 2-coloring having discrepancy O(1) in this range still remains an open problem. Approximation algorithms for a similar model of random set systems were designed and analyzed in [2]; however, the algorithmic ideas there do not apply in our case.
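For small instances, the quantity $\text{disc}(\Sigma) = \min_x \|\textbf{R}x\|_\infty$ can be evaluated by brute force over all $2^n$ colorings. The sketch below (our own; it is emphatically not the efficient algorithm sought in the open problem above) merely makes the definition concrete:

```python
import itertools
import numpy as np

def discrepancy(R):
    """Brute-force disc(Sigma) = min over x in {-1,+1}^n of ||R x||_inf.

    R is the m x n incidence matrix of the set system Sigma; entry ell
    of the vector R x is the signed imbalance of the set L_ell under
    the 2-coloring x.
    """
    m, n = R.shape
    best = float("inf")
    for signs in itertools.product((-1, 1), repeat=n):
        x = np.array(signs)
        best = min(best, int(np.abs(R @ x).max()))
    return best

# Two sets, both equal to {v1, v2}: any coloring with x_1 = -x_2
# balances both sets, so the discrepancy is 0.
R = np.array([[1, 1], [1, 1]])
```

A single set of odd size, by contrast, always has imbalance at least 1, so its discrepancy is exactly 1.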

Our Contribution
In this paper, we introduce the model of weighted random intersection graphs and we study the average case analysis of Weighted Max Cut through the prism of the discrepancy of random set systems. We formalize the connection between these two combinatorial problems for the case of arbitrary weighted intersection graphs in Corollary 4. We prove that, given a weighted intersection graph $G = (V, E, \textbf{R}^T \textbf{R})$ with representation matrix $\textbf{R}$, and a set system $\Sigma$ with incidence matrix $\textbf{R}$ such that $\text{disc}(\Sigma) \leq 1$, a 2-coloring has maximum cut weight in G if and only if it achieves minimum discrepancy in $\Sigma$. In particular, Corollary 4 applies in the range of values considered in [1] (i.e. $n \geq 0.73 m \log m$), and thus any algorithm that finds a maximum cut in $G(V, E, \textbf{R}^T \textbf{R})$ with large enough probability can also be used to find a 2-coloring with minimum discrepancy in a set system $\Sigma$ with incidence matrix $\textbf{R}$, with the same probability of success.
We then consider weighted random intersection graphs in the case $m = n^{\alpha}$, $\alpha \leq 1$, and we prove that the maximum cut weight of a random instance $G(V, E, \textbf{R}^T \textbf{R})$ of $G_{n,m,p}$ concentrates around its expected value (see Theorem 5). In particular, with high probability (whp, i.e. with probability tending to 1 as $n \to \infty$) over the choices of $\textbf{R}$, $\text{Max-Cut}(G) = (1 + o(1)) \mathbb{E}_{\textbf{R}}[\text{Max-Cut}(G)]$, where $\mathbb{E}_{\textbf{R}}$ denotes expectation with respect to $\textbf{R}$. The proof is based on the Efron–Stein inequality for upper bounding the variance of the maximum cut. As a consequence of our concentration result, we prove in Theorem 6 that, in the case $\alpha < 1$, a random 2-coloring (i.e. bipartition) $x^{(rand)}$, in which each vertex chooses its color independently and equiprobably, has cut weight asymptotically equal to Max-Cut(G), with high probability over the choices of $x^{(rand)}$ and $\textbf{R}$.
The latter result on random cuts allows us to focus the analysis of our randomized algorithms of Section 4 on the case m = n (i.e. $\alpha = 1$) and $p = \frac{c}{n}$, for some constant c (see also the discussion at the end of subsection 3.1), where the assumptions of Theorem 6 do not hold. It is worth noting that, in this range of values, the expected weight of a fixed edge in a weighted random intersection graph is equal to $mp^2 = \Theta(1/n)$, and thus we hope that our work here will serve as an intermediate step towards understanding when algorithmic bottlenecks for Max Cut appear in sparse random graphs (especially Erdős–Rényi random graphs) with respect to the intersection number. In particular, we analyze a Majority Cut Algorithm (Algorithm 1) that extends the algorithmic idea of [8] to weighted intersection graphs as follows: vertices are colored sequentially (each color +1 or −1 corresponding to a different set in the partition of the vertices), and the t-th vertex is colored opposite to the sign of $\sum_{i \in [t-1]} [\textbf{R}^T \textbf{R}]_{i,t} x_i$, namely the total available weight of its incident edges, taking into account the colors of adjacent vertices. Our average case analysis of the Majority Cut Algorithm shows that, when m = n and $p = \frac{c}{n}$, for large constant c, with high probability over the choices of $\textbf{R}$, the expected weight of the constructed cut is at least $1 + \beta$ times larger than the expected weight of a random cut, for some constant $\beta = \beta(c) > 0$. Finally, we study maximum cuts in weighted random intersection graphs for m = n and $p = \frac{c}{n}$, for constant c, by exploiting the connection between Weighted Max Cut and the problem of discrepancy minimization in random set systems. In particular, we design a Weak Bipartization Algorithm (Algorithm 2) that takes as input an intersection graph with representation matrix $\textbf{R}$ and outputs a subgraph that is "almost" bipartite. In fact, the input intersection graph is treated as a multigraph composed of overlapping cliques formed by the label sets $L_\ell = \{v : R_{\ell,v} = 1\}$, $\ell \in \mathcal{M}$.
The algorithm attempts to destroy all odd cycles of the input (except for odd cycles formed by labels with only two vertices) by replacing each clique induced by some label set $L_\ell$ by a random maximal matching. In Theorem 11 we prove that, with high probability over the choices of $\textbf{R}$, if the Weak Bipartization Algorithm terminates, then its output can be used to construct a 2-coloring that has minimum discrepancy in a set system with incidence matrix $\textbf{R}$, which also gives a maximum cut in $G(V, E, \textbf{R}^T \textbf{R})$. It is worth noting that this does not follow from Corollary 4, because a random set system with incidence matrix $\textbf{R}$ has discrepancy larger than 1 with (at least) constant probability when m = n and $p = \frac{c}{n}$. Our proof relies on a structural property of closed 0-strong vertex-label sequences (loosely defined as closed walks of edges formed by distinct labels) in the weighted random intersection graph $G(V, E, \textbf{R}^T \textbf{R})$ (Lemma 8). Finally, in Theorem 12, we prove that our Weak Bipartization Algorithm terminates in polynomial time, with high probability, if the constant c is strictly less than 1. Therefore, there is a polynomial time algorithm for finding weighted maximum cuts, with high probability, when the input is drawn from $G_{n,n,\frac{c}{n}}$, with c < 1. We believe that this part of our work may also be of interest regarding the design of efficient algorithms for finding minimum discrepancy colorings in random set systems.
Due to lack of space, some of the proofs are given in a clearly marked Appendix, to be read at the discretion of the program committee.

Notation and preliminary results
We denote weighted undirected graphs by G(V, E, W); in particular, V = V(G) is the set of vertices (resp. E = E(G) is the set of edges) and W = W(G) is the weight matrix, i.e. $W_{i,j} = w_{i,j}$ is the weight of (undirected) edge $\{i, j\} \in E$. We allow W to have non-zero diagonal entries, as these do not affect cut weights. We also denote the number of vertices by n, and we use the notation $[n] = \{1, 2, \ldots, n\}$. We also use this notation to refer to parts of matrices; for example, $W_{[n],1}$ denotes the first column of the weight matrix.
A bipartition of the set of vertices is a partition of V into two sets A, B such that $A \cap B = \emptyset$ and $A \cup B = V$. Bipartitions correspond to 2-colorings, which we denote by vectors $x \in \{-1, +1\}^n$, where $x_v = +1$ if vertex v belongs to A and $x_v = -1$ otherwise. Given a weighted graph G(V, E, W), we denote by Cut(G, x) the weight of the cut defined by a bipartition x, namely $\text{Cut}(G, x) = \sum_{\{i,j\} \in E: x_i \neq x_j} W_{i,j}$. For a weighted random intersection graph $G(V, E, \textbf{R}^T \textbf{R})$ with representation matrix $\textbf{R}$, we denote by $S_v$ the set of labels chosen by vertex $v \in V$, i.e. $S_v = \{\ell : R_{\ell,v} = 1\}$. Furthermore, we denote by $L_\ell$ the set of vertices having chosen label $\ell$, i.e. $L_\ell = \{v : R_{\ell,v} = 1\}$. Using this notation, the weight of an edge $\{v, u\} \in E$ is $|S_v \cap S_u|$; notice also that this is equal to 0 when $\{v, u\} \notin E$. We also note here that we may think of a weighted random intersection graph as a multigraph where, for any pair of vertices v, u, there are $|S_v \cap S_u|$ parallel simple edges between them.
A set system $\Sigma$ defined on a set V is a family of sets $\Sigma = \{L_1, L_2, \ldots, L_m\}$, where $L_\ell \subseteq V$ for each $\ell \in [m]$. It is well-known that the cut size of a bipartition x of the set of vertices of a graph G(V, E) into sets A and B is given by $\frac{1}{4} \sum_{i,j \in [n]} A_{i,j}(1 - x_i x_j)$, where A is the adjacency matrix of G. This can be naturally generalized for multigraphs and also for weighted graphs. In particular, the Max-Cut size of a weighted graph G(V, E, W) is given by
$$\text{Max-Cut}(G) = \max_{x \in \{-1,+1\}^n} \frac{1}{4} \sum_{i,j \in [n]} W_{i,j} (1 - x_i x_j).$$
In particular, for $W = \textbf{R}^T \textbf{R}$ we get the following Corollary (refer to Section A of the Appendix for the proof):
$$\text{Cut}(G, x) = \frac{1}{4} \left( \sum_{i,j \in [n]} [\textbf{R}^T \textbf{R}]_{i,j} - \|\textbf{R}x\|^2 \right), \quad (3)$$
and so
$$\text{Max-Cut}(G) = \frac{1}{4} \left( \sum_{i,j \in [n]} [\textbf{R}^T \textbf{R}]_{i,j} - \min_{x \in \{-1,+1\}^n} \|\textbf{R}x\|^2 \right),$$
where $\| \cdot \|$ denotes the 2-norm. In particular, the expectation of the size of a random cut, where each entry of x is independently and equiprobably either +1 or −1, is equal to $\mathbb{E}_x[\text{Cut}(G, x)] = \frac{1}{4} \sum_{i \neq j, i,j \in [n]} [\textbf{R}^T \textbf{R}]_{i,j}$, where $\mathbb{E}_x$ denotes expectation with respect to x.

Since $\sum_{i,j \in [n]} [\textbf{R}^T \textbf{R}]_{i,j}$ is fixed for any given representation matrix $\textbf{R}$, the above Corollary implies that, to find a bipartition of the vertex set V that corresponds to a maximum cut, we need to find an n-dimensional vector in $\arg\min_{x \in \{-1,+1\}^n} \|\textbf{R}x\|^2$. We thus get the following (refer to Section B of the Appendix for the proof): given a weighted intersection graph $G(V, E, \textbf{R}^T \textbf{R})$ and a set system $\Sigma$ with incidence matrix $\textbf{R}$ such that $\text{disc}(\Sigma) \leq 1$, a 2-coloring has maximum cut weight in G if and only if it achieves minimum discrepancy in $\Sigma$ (Corollary 4). Notice that the above result is not necessarily true when $\text{disc}(\Sigma) > 1$, since the minimum of $\|\textbf{R}x\|$ could be achieved by 2-colorings with larger discrepancy than the optimal.
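The quadratic-form identity behind this argument, namely that the cut weight of a 2-coloring x equals $\frac{1}{4}\left(\sum_{i,j} [\textbf{R}^T \textbf{R}]_{i,j} - \|\textbf{R}x\|^2\right)$ (our rendering of equation (3)), can be checked numerically on small random instances:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, p = 8, 8, 0.4
R = (rng.random((m, n)) < p).astype(int)
W = R.T @ R
x = rng.choice((-1, 1), size=n)

# Direct cut weight: total weight of edges with endpoints of opposite sign.
cut = sum(W[i, j] for i in range(n) for j in range(i + 1, n) if x[i] != x[j])

# Quadratic-form identity: Cut(G, x) = (W_total - ||Rx||^2) / 4, where
# W_total sums ALL entries of R^T R (diagonal included -- diagonal terms
# cancel in the cut because 1 - x_i^2 = 0).
identity = (W.sum() - np.linalg.norm(R @ x) ** 2) / 4
```

Minimizing $\|\textbf{R}x\|^2$ over x therefore maximizes the cut, which is exactly the bridge to discrepancy minimization.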

Range of values for p
Concerning the success probability p, we note that, when $p = o\left(\frac{1}{\sqrt{nm}}\right)$, direct application of the results of [5] suggests that $G(V, E, \textbf{R}^T \textbf{R})$ is chordal with high probability, but in fact the same proofs reveal that a stronger property holds, namely that there is no closed vertex-label sequence (refer to the precise definition in subsection 4.2) having distinct labels. Therefore, in this case, finding a bipartition with maximum cut weight is straightforward: indeed, one way to construct a maximum cut is to run our Weak Bipartization Algorithm 2 from subsection 4.2, and then to apply Theorem 11 (noting that the Weak Bipartization termination condition trivially holds, since the set $\mathcal{C}_{odd}(G^{(b)})$ defined in subsection 4.2 is empty). Furthermore, even though we consider weighted graphs, we will also assume that $mp^2 = O(1)$, noting that, otherwise, $G(V, E, \textbf{R}^T \textbf{R})$ will be almost complete with high probability (indeed, the unconditional edge existence probability is $1 - (1-p^2)^m$, which tends to 1 for $mp^2 = \omega(1)$). In particular, we will assume that $C_1 \frac{1}{\sqrt{nm}} \leq p \leq C_2 \frac{1}{\sqrt{m}}$, for arbitrary positive constants $C_1, C_2$; $C_1$ can be as small as possible, and $C_2$ can be as large as possible, provided $C_2 \frac{1}{\sqrt{m}} \leq 1$. We note that, when p is asymptotically equal to the upper bound $C_2 \frac{1}{\sqrt{m}}$, there is no constant weight upper bound that holds with high probability, whereas, when p is asymptotically equal to the lower bound $C_1 \frac{1}{\sqrt{nm}}$, all weights in the graph are bounded by a small constant with high probability. Our results in Section 3 assume this range of values for p, and thus graph instances may contain edges with large (but constant) weights. On the other hand, in the analysis of our randomized algorithms in Section 4, we assume n = m and $p = \Theta\left(\frac{1}{n}\right)$; this range of values gives sparse graph instances (even though the distribution is different from sparse Erdős–Rényi random graphs).

Concentration of Max-Cut
In this section we prove that the weight of the maximum cut in a weighted random intersection graph concentrates around its expected value. We note, however, that the following Theorem does not provide an explicit formula for the expected value of the maximum cut.

▶ Theorem 5. Let $G = G(V, E, \textbf{R}^T \textbf{R})$ be a random instance of the $G_{n,m,p}$ model, with $m = n^{\alpha}$, $\alpha \leq 1$, and p in the range of values described in subsection 2.1. Then, with high probability over the choices of $\textbf{R}$, $\text{Max-Cut}(G) = (1 + o(1)) \mathbb{E}_{\textbf{R}}[\text{Max-Cut}(G)]$.

Proof. Let $G = G(V, E, \textbf{R}^T \textbf{R})$ be a weighted random intersection graph, and let D denote the (random) diagonal matrix containing all diagonal elements of $\textbf{R}^T \textbf{R}$. In particular, equation (3) of Corollary 3 can be written as
$$\text{Max-Cut}(G) = \frac{1}{4} \left( \sum_{i \neq j, i,j \in [n]} [\textbf{R}^T \textbf{R}]_{i,j} - \min_{x \in \{-1,+1\}^n} x^T (\textbf{R}^T \textbf{R} - D) x \right).$$
Furthermore, for any given $\textbf{R}$, notice that, if we select each element of x independently and equiprobably from $\{-1, +1\}$, then $\mathbb{E}_x\left[x^T (\textbf{R}^T \textbf{R} - D) x\right] = 0$, where $\mathbb{E}_x$ denotes expectation with respect to x. Therefore, by the probabilistic method, $\min_{x \in \{-1,+1\}^n} x^T (\textbf{R}^T \textbf{R} - D) x \leq 0$, implying the following bound:
$$\frac{1}{4} \sum_{i \neq j, i,j \in [n]} [\textbf{R}^T \textbf{R}]_{i,j} \leq \text{Max-Cut}(G) \leq \frac{1}{2} \sum_{i \neq j, i,j \in [n]} [\textbf{R}^T \textbf{R}]_{i,j}, \quad (4)$$
where the second inequality follows trivially by observing that $\frac{1}{2} \sum_{i \neq j, i,j \in [n]} [\textbf{R}^T \textbf{R}]_{i,j}$ equals the sum of the weights of all edges.
By linearity, $\mathbb{E}_{\textbf{R}}\left[\sum_{i \neq j, i,j \in [n]} [\textbf{R}^T \textbf{R}]_{i,j}\right] = n(n-1)mp^2$, which goes to infinity as $n \to \infty$, because $np = \Omega\left(\sqrt{\frac{n}{m}}\right) = \Omega(1)$ in the range of parameters that we consider. In particular, by (4), we have
$$\frac{1}{4} n(n-1)mp^2 \leq \mathbb{E}_{\textbf{R}}[\text{Max-Cut}(G)] \leq \frac{1}{2} n(n-1)mp^2. \quad (5)$$
By Chebyshev's inequality, for any $\epsilon > 0$, we have
$$\Pr\left( \left|\text{Max-Cut}(G) - \mathbb{E}_{\textbf{R}}[\text{Max-Cut}(G)]\right| \geq \epsilon \, \mathbb{E}_{\textbf{R}}[\text{Max-Cut}(G)] \right) \leq \frac{\text{Var}_{\textbf{R}}(\text{Max-Cut}(G))}{\epsilon^2 \left( \mathbb{E}_{\textbf{R}}[\text{Max-Cut}(G)] \right)^2}, \quad (6)$$
where $\text{Var}_{\textbf{R}}$ denotes variance with respect to $\textbf{R}$. To bound the variance on the right hand side of the above inequality, we use the Efron–Stein inequality. In particular, we write $\text{Max-Cut}(G) := f(\textbf{R})$, i.e. we view $\text{Max-Cut}(G)$ as a function of the label choices. For $\ell \in [m]$ and $i \in [n]$, we also write $\textbf{R}^{(\ell,i)}$ for the matrix $\textbf{R}$ where entry $(\ell, i)$ has been replaced by an independent, identically distributed (i.i.d.) copy of $R_{\ell,i}$, which we denote by $R'_{\ell,i}$. By the Efron–Stein inequality, we have
$$\text{Var}_{\textbf{R}}(f(\textbf{R})) \leq \frac{1}{2} \sum_{\ell \in [m], i \in [n]} \mathbb{E}\left[ \left( f(\textbf{R}) - f(\textbf{R}^{(\ell,i)}) \right)^2 \right]. \quad (7)$$
Notice now that, given all entries of $\textbf{R}$ except $R_{\ell,i}$, the probability that $R_{\ell,i} \neq R'_{\ell,i}$ is $2p(1-p)$, and $\left| f(\textbf{R}) - f(\textbf{R}^{(\ell,i)}) \right| \leq |L_\ell \backslash \{i\}|$, because the intersection graph with representation matrix $\textbf{R}$ differs by at most $|L_\ell \backslash \{i\}|$ edges from the intersection graph with representation matrix $\textbf{R}^{(\ell,i)}$. Also note that, by definition, $|L_\ell \backslash \{i\}|$ follows the Binomial distribution $\mathcal{B}(n-1, p)$. In particular,
$$\mathbb{E}\left[ \left( f(\textbf{R}) - f(\textbf{R}^{(\ell,i)}) \right)^2 \right] \leq 2p(1-p) \, \mathbb{E}\left[ |L_\ell \backslash \{i\}|^2 \right] = 2p(1-p) \left( (n-1)p(1-p) + (n-1)^2 p^2 \right).$$
Putting this all together, (7) becomes
$$\text{Var}_{\textbf{R}}(f(\textbf{R})) \leq nm \, p(1-p) \left( (n-1)p(1-p) + (n-1)^2 p^2 \right) = O\left( n^2 m p^2 + n^3 m p^3 \right).$$
Therefore, by (6), we get
$$\Pr\left( \left|\text{Max-Cut}(G) - \mathbb{E}_{\textbf{R}}[\text{Max-Cut}(G)]\right| \geq \epsilon \, \mathbb{E}_{\textbf{R}}[\text{Max-Cut}(G)] \right) = O\left( \frac{1}{\epsilon^2 n^2 m p^2} + \frac{1}{\epsilon^2 n m p} \right),$$
which goes to 0 in the range of values that we consider. Together with (5), the above bound proves that $\text{Max-Cut}(G)$ is concentrated around its expected value. ◀

Max-Cut for small number of labels
Using Theorem 5, we can now show that, in the case $m = n^{\alpha}$, $\alpha < 1$, and $p = O\left(\frac{1}{\sqrt{m}}\right)$, a random cut has asymptotically the same weight as Max-Cut(G), where $G = G(V, E, \textbf{R}^T \textbf{R})$ is a random instance of $G_{n,m,p}$. In particular, let $x^{(rand)}$ be constructed as follows: for each $i \in [n]$, set $x^{(rand)}_i = -1$ independently with probability $\frac{1}{2}$, and $x^{(rand)}_i = +1$ otherwise. The proof details of the following Theorem can be found in Section C of the Appendix. In view of equation (3), the main idea is to prove that, with high probability over random x and $\textbf{R}$, $\|\textbf{R}x\|^2$ is asymptotically smaller than the expectation of the weight of the cut defined by $x^{(rand)}$, in which case the theorem follows by concentration of Max-Cut(G) around its expected value (Theorem 5), and straightforward bounds on Max-Cut(G).
▶ Theorem 6. Let $G(V, E, \textbf{R}^T \textbf{R})$ be a random instance of the $G_{n,m,p}$ model, with $m = n^{\alpha}$, $\alpha < 1$, and $C_1 \frac{1}{\sqrt{nm}} \leq p \leq C_2 \frac{1}{\sqrt{m}}$, for arbitrary positive constants $C_1, C_2$, and let $\textbf{R}$ be its representation matrix. Then the cut weight of the random 2-coloring $x^{(rand)}$ satisfies $\text{Cut}(G, x^{(rand)}) = (1 - o(1)) \text{Max-Cut}(G)$ with high probability over the choices of $x^{(rand)}$ and $\textbf{R}$.
We note that the same analysis also holds when n = m and p is sufficiently large (e.g. $p = \omega\left(\frac{\ln n}{n}\right)$); more details can be found at the end of Section C of the Appendix. In view of this, in the following sections we will only assume m = n (i.e. $\alpha = 1$) and also $p = \frac{c}{n}$, for some positive constant c. Besides avoiding complicated formulae for p, the reason behind this assumption is that, in this range of values, the expected weight of a fixed edge in $G(V, E, \textbf{R}^T \textbf{R})$ is equal to $mp^2 = \Theta(1/n)$, and thus we hope that our work will serve as an intermediate step towards understanding algorithmic bottlenecks for finding maximum cuts in Erdős–Rényi random graphs $G_{n,c/n}$ with respect to their intersection number.

The Majority Cut Algorithm
In the following algorithm, the 2-coloring representing the bipartition of a cut is constructed as follows: initially, a small constant fraction $\epsilon$ of the vertices are randomly placed in the two sets of the partition, and then, in each subsequent step, one of the remaining vertices is placed in the set that maximizes the weight of its incident edges with endpoints in the opposite set.

Algorithm 1 Majority Cut
Input: $G(V, E, \textbf{R}^T \textbf{R})$ and its representation matrix $\textbf{R} \in \{0, 1\}^{m \times n}$
Output: Large cut 2-coloring $x \in \{-1, +1\}^n$

Clearly, the Majority Cut Algorithm runs in polynomial time in n, m. Furthermore, the following Theorem provides a lower bound on the expected weight of the cut constructed by the algorithm in the case m = n, $p = \frac{c}{n}$, for large constant c, and $\epsilon \to 0$. The full proof details can be found in Section D of the Appendix.

▶ Theorem 7. Let $G(V, E, \textbf{R}^T \textbf{R})$ be a random instance of the $G_{n,m,p}$ model, with m = n and $p = \frac{c}{n}$, for large positive constant c, and let $\textbf{R}$ be its representation matrix. Then, with high probability over the choices of $\textbf{R}$, the majority algorithm constructs a cut with expected weight at least $1 + \beta$ times larger than the expected weight of a random cut, where $\beta = \beta(c) > 0$ is a constant.
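A minimal rendering of the majority rule described above (our own sketch; the tie-breaking rule and the exact treatment of the first $\lceil \epsilon n \rceil$ vertices follow our reading of the text, not the paper's pseudocode):

```python
import numpy as np

def majority_cut(R, eps=0.1, seed=0):
    """Majority Cut sketch: color an eps-fraction of vertices at random,
    then give each subsequent vertex t the color opposite to the sign of
    sum_{i < t} [R^T R]_{i,t} x_i (a random color on a tie)."""
    rng = np.random.default_rng(seed)
    W = R.T @ R
    n = W.shape[0]
    x = np.zeros(n, dtype=int)
    t0 = max(1, int(eps * n))          # vertices colored at random
    x[:t0] = rng.choice((-1, 1), size=t0)
    for t in range(t0, n):
        z = int(W[:t, t] @ x[:t])      # signed weight toward earlier vertices
        x[t] = -np.sign(z) if z != 0 else rng.choice((-1, 1))
    return x

def cut_weight(W, x):
    n = len(x)
    return sum(W[i, j] for i in range(n) for j in range(i + 1, n)
               if x[i] != x[j])

R = (np.random.default_rng(2).random((8, 8)) < 0.4).astype(int)
x = majority_cut(R)
```

Each greedy step gains $\frac{1}{2}\sum_{i<t}[\textbf{R}^T\textbf{R}]_{i,t} + \frac{1}{2}|Z_t|$ over the preceding partial cut, which is exactly the quantity analyzed in the proof sketch below.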
Proof sketch. Let $G(V, E, \textbf{R}^T \textbf{R})$ be a random instance of the $G_{n,m,p}$ model, with m = n and $p = \frac{c}{n}$, for some large enough constant c. For $t \in [n]$, let $M_t$ denote the constructed cut weight just after the consideration of vertex $v_t$, for some $t \geq \epsilon n + 1$. By equation (3) for n = t, and since the values $x_1, \ldots, x_{t-1}$ are already decided in previous steps, we set $Z_t = \sum_{i \in [t-1]} [\textbf{R}^T \textbf{R}]_{i,t} x_i$, and after careful calculation we get the recurrence
$$\mathbb{E}[M_t] = \mathbb{E}[M_{t-1}] + \frac{1}{2} \mathbb{E}\left[ \sum_{i \in [t-1]} [\textbf{R}^T \textbf{R}]_{i,t} \right] + \frac{1}{2} \mathbb{E}\left[ |Z_t| \right].$$
Observe that, in the latter recursive equation, the term $\frac{1}{2} \sum_{i \in [t-1]} [\textbf{R}^T \textbf{R}]_{i,t}$ corresponds to the expected increment of the constructed cut if the t-th vertex chose its color uniformly at random. Therefore, lower bounding the expectation of $\frac{1}{2} |Z_t|$ will tell us how much better the Majority Cut Algorithm does when considering the t-th vertex.
Towards this end, we note that, given $x_1, \ldots, x_{t-1}$, $Z_t$ is the sum of m independent random variables, since the Bernoulli random variables $R_{\ell,t}$, $\ell \in [m]$, are independent, for any given t (note that the conditioning is essential for independence; otherwise the inner sums in the definition of $Z_t$ would also depend on the $x_i$'s, which are not random when i is large). By using a domination argument, we can then prove a lower bound on $\mathbb{E}[|Z_t|]$ in terms of $\text{MD}(Z^B_t)$, where $Z^B_t$ is a certain Binomial random variable (formally defined in the full proof), and $\text{MD}(\cdot)$ is the mean absolute difference of (two independent copies of) its argument; the result then follows by noting that the expected weight of a random cut is equal to $\frac{1}{4} \sum_{i \neq j, i,j \in [n]} [\textbf{R}^T \textbf{R}]_{i,j}$.

Intersection graph (weak) bipartization
Notice that we can view a weighted intersection graph G(V, E, R^T R) as a multigraph composed of m (possibly) overlapping cliques corresponding to the sets of vertices having chosen a certain label, namely L_ℓ = {v ∈ V : R_{ℓ,v} = 1}, for ℓ ∈ M. In particular, let K^(ℓ) denote the clique induced by label ℓ; then G = ∪⁺_{ℓ∈[m]} K^(ℓ), where ∪⁺ denotes union that keeps multiple edges. In this section, we present an algorithm that takes as input an intersection graph G given as a union of overlapping cliques and outputs a subgraph that is "almost" bipartite.
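The multigraph view can be made concrete in a short sketch (the function name is ours): each label ℓ induces a clique on L_ℓ, and the multiplicity of edge {u, v} in the union of cliques ends up equal to [R^T R]_{u,v}, the number of labels shared by u and v.

```python
import numpy as np
from itertools import combinations
from collections import Counter

def clique_multigraph(R):
    """View G(V, E, R^T R) as a union-with-multiplicity of label cliques.

    Returns a Counter mapping vertex pairs (u, v), u < v, to their edge
    multiplicity, i.e. the number of labels chosen by both u and v."""
    m, n = R.shape
    edges = Counter()
    for l in range(m):
        L = [v for v in range(n) if R[l, v]]   # vertex set L_l of label l
        for u, v in combinations(L, 2):        # clique K^(l) on L_l
            edges[(u, v)] += 1
    return edges
```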
To facilitate the presentation of our algorithm, we first give some useful definitions. A closed vertex-label sequence is a sequence of alternating vertices and labels starting and ending at the same vertex, namely σ = v_1, ℓ_1, v_2, ℓ_2, ..., v_k, ℓ_k, v_{k+1} = v_1, where the size of the sequence k = |σ| is the number of its labels, v_i ∈ V, ℓ_i ∈ M, and {v_i, v_{i+1}} ⊆ L_{ℓ_i}, for all i ∈ [k] (i.e. v_i is connected to v_{i+1} in the intersection graph). We will also say that label ℓ is strong if |L_ℓ| ≥ 3; otherwise it is weak. For a given closed vertex-label sequence σ, and any integer λ ∈ [|σ|], we will say that σ is λ-strong if |L_{ℓ_i}| ≥ 3 for exactly λ indices i ∈ [|σ|]. The structural lemma below is useful for our analysis (see Section E of the Appendix for the proof). The following definition is essential for the presentation of our algorithm.

▶ Definition 9. Given G^(b), let C_odd(G^(b)) denote the set of closed vertex-label sequences σ of odd size whose edges belong to G^(b) and which additionally satisfy the following:
(a) σ has distinct vertices (except the first and the last) and distinct labels.
Algorithm 2 initially replaces each clique K^(ℓ) by a random maximal matching M^(ℓ), and thus gets a subgraph G^(b) = ∪⁺_{ℓ∈[m]} M^(ℓ). It then repeatedly finds a sequence σ ∈ C_odd(G^(b)) and a strong label ℓ ∈ σ, and replaces M^(ℓ) in G^(b) by a new random matching of K^(ℓ). The algorithm repeats until all odd cycles are destroyed (or runs forever trying to do so). The following results are the main technical tools that justify the use of the Weak Bipartization Algorithm for Weighted Max Cut. The proof details for Lemma 10 can be found in Section F of the Appendix.
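The matching-resampling loop can be sketched as follows. This is a simplified sketch with our own function names: where the algorithm in the paper picks a strong label lying on a detected odd cycle, the sketch re-samples the matching of a uniformly random strong label while any odd cycle remains.

```python
import random
from collections import deque

def is_bipartite(n, edges):
    """BFS 2-coloring test; `edges` is a list of (u, v) pairs on [n]."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    color = [None] * n
    for s in range(n):
        if color[s] is not None:
            continue
        color[s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for w in adj[u]:
                if color[w] is None:
                    color[w] = 1 - color[u]
                    q.append(w)
                elif color[w] == color[u]:
                    return False            # odd cycle found
    return True

def weak_bipartization(label_sets, n, rng, max_rounds=1000):
    """Sketch of Algorithm 2: replace each clique on L_l by a random maximal
    matching M^(l); while the union of matchings has an odd cycle, re-sample
    the matching of a strong label (|L_l| >= 3)."""
    def matching(L):
        L = list(L)
        rng.shuffle(L)
        return [(L[i], L[i + 1]) for i in range(0, len(L) - 1, 2)]
    M = [matching(L) for L in label_sets]
    strong = [l for l, L in enumerate(label_sets) if len(L) >= 3]
    for _ in range(max_rounds):
        edges = [e for Ml in M for e in Ml]
        if is_bipartite(n, edges) or not strong:
            break
        l = rng.choice(strong)
        M[l] = matching(label_sets[l])
    return M
```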
▶ Theorem 11. Let G(V, E, R^T R) be a random instance of the G_{n,m,p} model, with m = n and p = c/n, where c > 0 is a constant, and let R be its representation matrix. Let also Σ be a set system with incidence matrix R. With high probability over the choices of R, if Algorithm 2 for weak bipartization terminates on input G, its output can be used to construct a 2-coloring x^(disc) ∈ arg min_{x∈{±1}^n} disc(Σ, x), which also gives a maximum cut in G, i.e. x^(disc) ∈ arg max_{x∈{±1}^n} Cut(G, x).
Proof. By construction, the output of Algorithm 2, namely G^(b), has only 0-strong odd cycles. Furthermore, by Lemma 8, these cycles correspond to vertex-label sequences that are label-disjoint. Let H denote the subgraph of G^(b) in which we have destroyed all 0-strong odd cycles by deleting a single (arbitrary) edge e_C from each 0-strong odd cycle C (keeping all other edges intact), and notice that e_C corresponds to a weak label. In particular, H is a bipartite multigraph, and thus its vertices can be partitioned into two independent sets A, B constructed as follows: in each connected component of H, start with an arbitrary vertex v and include in A (resp. in B) the set of vertices reachable from v that are at an even (resp. odd) distance from v. Since H is bipartite, it does not have odd cycles, and thus this construction is well-defined, i.e. no vertex can be placed in both A and B.
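The A/B construction in the proof is a BFS by distance parity; a minimal sketch (the function name `bipartition` is ours) looks as follows.

```python
from collections import deque

def bipartition(n, edges):
    """The proof's construction of A and B: within each connected component
    of the bipartite graph H, put vertices at even BFS distance from an
    arbitrary start vertex into A, and those at odd distance into B."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    side = [None] * n                  # 0 -> A, 1 -> B
    for s in range(n):
        if side[s] is None:
            side[s] = 0                # arbitrary start vertex of a component
            q = deque([s])
            while q:
                u = q.popleft()
                for w in adj[u]:
                    if side[w] is None:
                        side[w] = 1 - side[u]
                        q.append(w)
    A = [v for v in range(n) if side[v] == 0]
    B = [v for v in range(n) if side[v] == 1]
    return A, B
```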
We now define x^(disc) by setting x_i^(disc) = +1 if i ∈ A and x_i^(disc) = −1 if i ∈ B. Let M_0 denote the set of weak labels corresponding to the edges removed from G^(b) in the construction of H. We first note that, for each ℓ_C ∈ M_0 corresponding to the removal of an edge e_C, we have |Σ_{i∈L_{ℓ_C}} x_i^(disc)| = 2. Indeed, since e_C belongs to an odd cycle in G^(b), its endpoints are at even distance in H, which means that either they both belong to A or they both belong to B. Therefore, their corresponding entries of x^(disc) have the same sign, and so (taking into account that the endpoints of e_C are the only vertices in L_{ℓ_C}) we have |[Rx^(disc)]_{ℓ_C}| = 2. Second, we show that, for all the other labels ℓ ∈ [m]\M_0, |Σ_{i∈L_ℓ} x_i^(disc)| will be equal to 1 if |L_ℓ| is odd and 0 otherwise. For any label ℓ ∈ [m]\M_0, let M^(ℓ) denote the part of G^(b) corresponding to a maximal matching of K^(ℓ), and note that all edges of M^(ℓ) are contained in H. Since H is bipartite, no edge in M^(ℓ) can have both its endpoints in either A or B. Therefore, by construction, the contribution of entries of x^(disc) corresponding to endpoints of edges in M^(ℓ) to the sum Σ_{i∈L_ℓ} x_i^(disc) is 0. In particular, if |L_ℓ| is even, every vertex of L_ℓ is matched in M^(ℓ) and the sum is 0; if |L_ℓ| is odd, there is a single vertex not matched in M^(ℓ) and |Σ_{i∈L_ℓ} x_i^(disc)| = 1.

To complete the proof of the theorem, we need to show that Cut(G, x^(disc)) is maximum. By Corollary 3, this is equivalent to proving that ∥Rx^(disc)∥ ≤ ∥Rx∥ for all x ∈ {−1, +1}^n. Suppose that there is some x^(min) ∈ {−1, +1}^n such that ∥Rx^(disc)∥ > ∥Rx^(min)∥. As mentioned above, for all ℓ ∈ [m]\M_0, we have |[Rx^(disc)]_ℓ| ≤ 1, and so, by parity, |[Rx^(disc)]_ℓ| ≤ |[Rx^(min)]_ℓ|. Therefore, the only labels where x^(min) could do better are those corresponding to edges e_C that are removed from G^(b) in the construction of H, i.e. ℓ_C ∈ M_0, for which we have |[Rx^(disc)]_{ℓ_C}| = 2. However, any such edge e_C belongs to an odd cycle C, and thus any 2-coloring of the vertices of C will force at least one of the 0-strong labels corresponding to edges of C to be monochromatic. Taking into account the fact that, by Lemma 8, with high probability over the choices of R, all 0-strong odd cycles correspond to vertex-label sequences that are label-disjoint, we conclude that ∥Rx^(disc)∥ ≤ ∥Rx^(min)∥, which completes the proof. ◀

The fact that Theorem 11 is not an immediate consequence of Corollary 4 follows from the observation that a random set system with incidence matrix R has discrepancy larger than 1 with (at least) constant probability when m = n and p = c/n. Indeed, by a straightforward counting argument, we can see that the expected number of 0-strong odd cycles is at least constant. Furthermore, in any 2-coloring of the vertices, at least one of the weak labels forming edges in a 0-strong odd cycle will be monochromatic. Therefore, with at least constant probability, for any x ∈ {−1, +1}^n, there exists a weak label ℓ such that x_i x_j = 1 for both i, j ∈ L_ℓ, implying that disc(L_ℓ) = 2.
We close this section with a result indicating that the conditional statement of Theorem 11 is not void; namely, there is a range of values of c for which the Weak Bipartization Algorithm terminates in polynomial time.
▶ Theorem 12. Let G(V, E, R^T R) be a random instance of the G_{n,m,p} model, with n = m and p = c/n, where 0 < c < 1 is a constant, and let R be its representation matrix. With high probability over the choices of R, Algorithm 2 for weak bipartization terminates on input G in polynomial time.

The proof of the above theorem uses the following structural lemma regarding the expected number of closed vertex-label sequences.
▶ Lemma 13. The expected number of closed vertex-label sequences of size k is at most (n!/(n−k)!) · (m!/(m−k)!) · p^{2k}. (9) In particular, when m = n → ∞, p = c/n, c > 0, and k ≥ 3, the bound is O(c^{2k}).

Proof. Notice that there are n!/(n−k)! ways to arrange k out of n vertices in a closed sequence. Furthermore, in each such arrangement, there are m!/(m−k)! ways to place k out of m labels so that there is exactly one label between each pair of consecutive vertices. Since each label in any given arrangement must be selected by both its adjacent vertices, each such sequence appears with probability p^{2k}, and (9) follows by linearity of expectation.
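The counting in the proof can be written out in one line; this is a reconstruction of bound (9) consistent with the argument above, with the final simplification using n!/(n−k)! ≤ n^k and the parameters m = n, p = c/n:

```latex
\mathbb{E}\bigl[\#\{\sigma : |\sigma| = k\}\bigr]
  \;\le\; \frac{n!}{(n-k)!}\cdot\frac{m!}{(m-k)!}\cdot p^{2k}
  \;\le\; n^{k}\cdot n^{k}\cdot\Bigl(\frac{c}{n}\Bigr)^{2k}
  \;=\; c^{2k}.
```

For c < 1 this is summable in k, which is what the proof of Theorem 12 exploits.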
Setting m = n and p = c/n, and using the inequalities n!/(n−k)! ≤ n^k and m!/(m−k)! ≤ m^k, the right-hand side of (9) is at most n^{2k}(c/n)^{2k} = c^{2k}. In particular, when n goes to ∞ and k ≥ 3, the above is at most 2πe·c^{2k}, as needed. ◀

We are now ready for the proof of the Theorem.
Proof of Theorem 12. We will prove that, when m = n → ∞ and p = c/n with c < 1, with high probability there are no two closed vertex-label sequences that have labels in common. To this end, recalling Definition 9 for C_odd(G^(b)), we provide upper bounds on the probabilities of the following events:
A: there exists a closed vertex-label sequence with at least log n labels;
B: there are at least log n closed vertex-label sequences in C_odd(G^(b));
C: there exist two closed vertex-label sequences with at least one label in common.

By the union bound, Markov's inequality, and Lemma 13, we get that, whp, all closed vertex-label sequences have less than log n labels: Pr(A) ≤ Σ_{k ≥ log n} O(c^{2k}) = o(1), where the last equality follows since c < 1 is a constant. Furthermore, by Markov's inequality and Lemma 13, and noting that any closed vertex-label sequence in C_odd(G^(b)) must have at least k ≥ 3 labels, we get that, whp, there are less than log n closed vertex-label sequences in C_odd(G^(b)).

To bound Pr(C), fix a closed vertex-label sequence σ, and let |σ| ≥ 3 be the number of its labels. Notice that the existence of another closed vertex-label sequence that has labels in common with σ implies the existence of a vertex-label sequence σ′ that starts with either a vertex or a label from σ, ends with either a vertex or a label from σ, and has at least one label or at least one vertex that does not belong to σ. Let |σ′| denote the number of labels of σ′ that do not belong to σ. Then the number of different vertex-label sequences σ′ that start and end in labels from σ is at most |σ|² n^{|σ′|+1} m^{|σ′|}; indeed, σ′ in this case has |σ′| labels and |σ′| + 1 vertices that do not belong to σ. Therefore, by independence, each such sequence σ′ has probability p^{2|σ′|+2} to appear. Similarly, the number of different vertex-label sequences σ′ that start and end in vertices from σ is at most |σ|² n^{|σ′|−1} m^{|σ′|}, and each one has probability p^{2|σ′|} to appear. Finally, the number of different vertex-label sequences σ′ that start in a vertex from σ and end in a label from σ (notice that this also covers the case where σ′ starts in a label from σ and ends in a vertex from σ) is at most |σ|² n^{|σ′|} m^{|σ′|}, and each one has probability p^{2|σ′|+1} to appear. Overall, for a given sequence σ, the expected number of sequences σ′ described above that additionally satisfy |σ′| < log n is O(|σ|² log n / n) = o(1), (12) where in the last bound we used the fact that m = n, p = c/n, and c < 1. Since the existence of a sequence σ′ for σ that additionally satisfies |σ′| ≥ log n implies event A, and on the other hand the existence of more than log n different sequences σ ∈ C_odd(G^(b)) implies event B, by Markov's inequality and (12) we get Pr(C) ≤ Pr(A) + Pr(B) + o(1) = o(1). We have thus proved that, with high probability over the choices of R, closed vertex-label sequences in C_odd(G^(b)) are label-disjoint, as needed.
In view of this, the proof of the theorem follows by noting that, since closed vertex-label sequences in C_odd(G^(b)) are label-disjoint, steps 5 and 6 within the while loop of the Weak Bipartization Algorithm will be executed exactly once for each sequence in C_odd(G^(b)), where G^(b) is defined in step 3 of the algorithm; indeed, once a closed vertex-label sequence σ ∈ C_odd(G^(b)) is destroyed in step 6, no new closed vertex-label sequence is created. In fact, once σ is destroyed, we can remove the corresponding labels and edges from G^(b), as these will no longer belong to other closed vertex-label sequences. Furthermore, to find a closed vertex-label sequence in C_odd(G^(b)), it suffices to find an odd cycle in G^(b). ◀

Discussion and some open problems
In this paper, we introduced the model of weighted random intersection graphs and we studied the average case analysis of Weighted Max Cut through the prism of discrepancy of random set systems. In particular, in the first part of the paper, we proved concentration of the weight of a maximum cut of G(V, E, R^T R) around its expected value, and we used it to show that, with high probability, the weight of a random cut is asymptotically equal to the maximum cut weight of the input graph, when m = n^α, α < 1. On the other hand, in the case where the number of labels is equal to the number of vertices (i.e. m = n), we proved that a majority algorithm gives a cut with weight that is larger than the weight of a random cut by at least a constant factor, when p = c/n and c is large. In the second part of the paper, we highlighted a connection between Weighted Max Cut of sparse weighted random intersection graphs and Discrepancy of sparse random set systems, formalized through our Weak Bipartization Algorithm and its analysis. We demonstrated how our proposed framework can be used to find optimal solutions for these problems, with high probability, in special cases of sparse inputs (m = n, p = c/n, c < 1). One of the main problems left open in our work concerns the termination of our Weak Bipartization Algorithm for large values of c. We conjecture the following:

A Proof of Corollary 3
We first prove the following lemma, by straightforward calculation from equation (1):

▶ Lemma 16. Let G(V, E, W) be a weighted graph such that W is symmetric. Then, for any x ∈ {−1, +1}^n, Cut(G, x) = (1/4)(Σ_{i,j} W_{i,j} − x^T W x). (13)

Proof. For any x ∈ {−1, +1}^n, we write Cut(G, x) = Σ_{i<j} W_{i,j}(1 − x_i x_j)/2 = (1/4) Σ_{i≠j} W_{i,j}(1 − x_i x_j) = (1/4)(Σ_{i,j} W_{i,j} − x^T W x), where the last equality uses the symmetry of W and the fact that x_i² = 1. By (1), this completes the proof. ◀

Proof of Corollary 3. Notice that the diagonal entries of the weight matrix in (13) cancel out, and so, for any x ∈ {−1, +1}^n, we have Cut(G, x) = (1/4) Σ_{i≠j} [R^T R]_{i,j} − (1/4) Σ_{i≠j} [R^T R]_{i,j} x_i x_j = (1/4)(Σ_{i,j} [R^T R]_{i,j} − ∥Rx∥²). Taking expectations with respect to a uniformly random x, the contribution of the second sum in the above expression equals 0, which completes the proof. ◀
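The identity behind Corollary 3, namely 4·Cut(G, x) = Σ_{i,j}[R^T R]_{i,j} − ∥Rx∥², can be checked numerically on a toy instance of our own choosing; it makes explicit why maximizing the cut is the same as minimizing ∥Rx∥.

```python
import numpy as np
from itertools import product

def cut_weight(W, x):
    """Total weight of edges cut by the 2-coloring x (diagonal ignored)."""
    n = len(x)
    return sum(W[i, j] for i in range(n) for j in range(i + 1, n)
               if x[i] != x[j])

rng = np.random.default_rng(1)
R = (rng.random((4, 5)) < 0.5).astype(int)    # small random 0/1 instance
W = R.T @ R
for bits in product([-1, 1], repeat=5):       # exhaust all 2-colorings
    x = np.array(bits)
    # 4*Cut(G, x) = sum_{i,j}[R^T R]_{i,j} - ||Rx||^2
    assert 4 * cut_weight(W, x) == W.sum() - np.dot(R @ x, R @ x)
```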

B Proof of Corollary 4
Proof. Since disc(Σ, x*) ≤ 1, each component of Rx* has absolute value either 0 or 1, for such an x* ∈ {−1, +1}^n. In particular, for any ℓ ∈ [m], |[Rx*]_ℓ| is 0 if the number of ones in the ℓ-th row is even, and it is equal to 1 otherwise. This is the best one can hope for, since sets with an odd number of elements cannot have discrepancy less than 1. Therefore, ∥Rx*∥ is also the minimum possible. In particular, this implies that, in the case disc(Σ, x*) ≤ 1, any 2-coloring that achieves minimum discrepancy gives a bipartition that corresponds to a maximum cut, and vice versa. ◀
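The parity lower bound in the proof can be verified by brute force on a toy set system (the matrix below is our own illustrative choice, not from the paper): every odd-sized set forces |[Rx]_ℓ| ≥ 1 in any 2-coloring, so ∥Rx∥² is at least the number of odd rows, and a coloring with discrepancy at most 1 attains this minimum.

```python
import numpy as np
from itertools import product

# Toy set system: rows are sets, columns are elements (illustrative data).
R = np.array([[1, 1, 0, 0],
              [0, 1, 1, 0],
              [0, 1, 1, 1]])
odd_rows = int(np.sum(R.sum(axis=1) % 2))     # number of odd-sized sets
best = min(int(np.dot(R @ np.array(x), R @ np.array(x)))
           for x in product([-1, 1], repeat=4))
# ||Rx||^2 >= #odd rows always; here a disc-1 coloring attains it exactly.
assert best == odd_rows == 1
```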

C Proof of Theorem 6
Proof. Let G = G(V, E, R^T R) be a weighted random intersection graph. By equation (2) of Corollary 3, for any x ∈ {−1, +1}^n, we have Cut(G, x) = (1/4)(Σ_{i,j} [R^T R]_{i,j} − ∥Rx∥²). Taking expectations with respect to random x and R, we get the corresponding expression for E[Cut(G, x)]. To prove the theorem, we will show that, with high probability over random x and R, we have ∥Rx∥² = o(E_R[Σ_{i,j} [R^T R]_{i,j}]). Let Y_ℓ = Σ_{i∈[n]} R_{ℓ,i} denote the number of non-zero elements in the ℓ-th row of R. By a Chernoff bound with δ ≥ 2, and a union bound over ℓ, we get (16), implying that all rows of R have at most 3np non-zero elements with high probability.
Fix now ℓ and consider the random variable corresponding to the ℓ-th entry of Rx, namely Z_ℓ = Σ_{i∈[n]} R_{ℓ,i} x_i. In particular, given Y_ℓ, notice that Z_ℓ is equal to the sum of Y_ℓ independent random variables x_i ∈ {−1, +1}, for i such that R_{ℓ,i} = 1. Therefore, by Hoeffding's inequality with λ ≥ √(6np ln n), and a union bound over ℓ, we get (17), implying that all entries of Rx have absolute value at most √(6np ln n) with high probability over the choices of x and R. Consequently, with high probability over the choices of x and R, we have ∥Rx∥² ≤ 6mnp ln n, which is o(n² m p²), since np = ω(ln n) in the range of parameters considered in this theorem. This completes the proof. ◀

We note that the same analysis also holds when n = m and p is sufficiently large (e.g. p = ω(ln n / n)). In particular, similar probability bounds hold in equations (15), (16) and (17), for the same choices of δ ≥ 2 and λ ≥ √(6np ln n), implying that ∥Rx∥² ≤ 6mnp ln n = o(n² m p²) with high probability.

D Proof of Theorem 7
Proof. Let G(V, E, R^T R) (i.e. the input to the Majority Cut Algorithm 1) be a random instance of the G_{n,m,p} model, with m = n and p = c/n, for some large enough constant c. For t ∈ [n], let M_t denote the constructed cut size just after the consideration of vertex v_t, for some t ≥ ϵn + 1. In particular, by equation (3) for n = t, and since the values x_1, ..., x_{t−1} are already decided in previous steps, M_t − M_{t−1} equals the total weight of edges from v_t to earlier vertices of the opposite color (18). Splitting this weight according to the colors of the earlier vertices, the first of the resulting terms is Σ_{i∈[t−1]: x_i = +1} [R^T R]_{i,t} (19), and the second term is Σ_{i∈[t−1]: x_i = −1} [R^T R]_{i,t} (20). By (18), (19) and (20), and since the majority rule colors v_t opposite to the heavier of the two sides, we have M_t = M_{t−1} + (1/2) Σ_{i∈[t−1]} [R^T R]_{i,t} + (1/2) |Z_t|. Define now the random variable Z_t := Σ_{i∈[t−1]} [R^T R]_{i,t} x_i = Σ_{ℓ∈[m]} R_{ℓ,t} Σ_{i∈[t−1]} R_{ℓ,i} x_i. Observe that, in the latter recursive equation, the term (1/2) Σ_{i∈[t−1]} [R^T R]_{i,t} corresponds to the expected increment of the constructed cut if the t-th vertex chose its color uniformly at random. Therefore, lower bounding the expectation of (1/2)|Z_t| will tell us how much better the Majority Algorithm does when considering the t-th vertex.
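The key step of the recurrence, that the majority choice cuts exactly (1/2)(Σ_i w_i + |Z_t|), i.e. |Z_t|/2 more than a random choice in expectation, can be checked directly on arbitrary data (the weights and colors below are randomly generated for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
w = rng.integers(0, 4, size=7)       # weights [R^T R]_{i,t} to earlier vertices
x = rng.choice([-1, 1], size=7)      # colors already assigned to them
z = int(np.dot(w, x))                # Z_t: signed weight imbalance
x_t = -1 if z > 0 else 1             # majority choice for v_t
gain = int(w[x != x_t].sum())        # cut weight contributed by v_t
# Majority cuts exactly half the total weight plus half the imbalance:
assert 2 * gain == int(w.sum()) + abs(z)
```

This holds deterministically: writing S± for the weight toward the two color classes, the majority choice cuts max(S+, S−) = (S+ + S− + |S+ − S−|)/2.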
Towards this end, we first note that, given x_{[t−1]} = {x_i, i ∈ [t−1]} and R_{[m],[t−1]} = {R_{ℓ,i}, ℓ ∈ [m], i ∈ [t−1]}, Z_t is the sum of m independent random variables, since the Bernoulli random variables R_{ℓ,t}, ℓ ∈ [m], are independent, for any given t (note that the conditioning is essential for independence; otherwise the inner sums in the definition of Z_t would also depend on the x_i's, which are not random when i is large).

▶ Theorem 6 (restated). Let G(V, E, R^T R) be a random instance of the G_{n,m,p} model with m = n^α, α ≤ 1, and C_1/√(nm) ≤ p ≤ 1, for an arbitrary positive constant C_1, and let R be its representation matrix. Then Max-Cut(G) ∼ E_R[Max-Cut(G)] with high probability, where E_R denotes expectation with respect to R, i.e. Max-Cut(G) concentrates around its expected value.

▶ Lemma 8. Let G(V, E, R^T R) be a random instance of the G_{n,m,p} model, with m = n and p = c/n, for some constant c > 0. With high probability over the choices of R, 0-strong closed vertex-label sequences in G do not have labels in common.
▶ Conjecture 14. Let G(V, E, R^T R) be a random instance of the G_{n,m,p} model, with m = n and p = c/n, for some constant c ≥ 1. With high probability over the choices of R, on input G, Algorithm 2 for weak bipartization terminates in polynomial time.

We also leave the problem of determining whether Algorithm 2 terminates in polynomial time, in the case m = n and p = ω(1/n), as an open question for future research. Towards strengthening the connection between Weighted Max Cut under the G_{n,m,p} model and Discrepancy in random set systems, we conjecture the following:

▶ Conjecture 15. Let G(V, E, R^T R) be a random instance of the G_{n,m,p} model, with m = n^α, α ≤ 1 and mp² = O(1), and let R be its representation matrix. Let also Σ be a set system with incidence matrix R. Then, with high probability over the choices of R, there exists x^(disc) ∈ arg min_{x∈{−1,+1}^n} disc(Σ, x) such that Cut(G, x^(disc)) is asymptotically equal to Max-Cut(G).