How to hide a clique?

In the well known planted clique problem, a clique (or alternatively, an independent set) of size $k$ is planted at random in an Erdos-Renyi random $G(n, p)$ graph, and the goal is to design an algorithm that finds the maximum clique (or independent set) in the resulting graph. We introduce a variation on this problem, where instead of planting the clique at random, the clique is planted by an adversary who attempts to make it difficult to find the maximum clique in the resulting graph. We show that for the standard setting of the parameters of the problem, namely, a clique of size $k = \sqrt{n}$ planted in a random $G(n, \frac{1}{2})$ graph, the known polynomial time algorithms can be extended (in a non-trivial way) to work also in the adversarial setting. In contrast, we show that for other natural settings of the parameters, such as planting an independent set of size $k=\frac{n}{2}$ in a $G(n, p)$ graph with $p = n^{-\frac{1}{2}}$, there is no polynomial time algorithm that finds an independent set of size $k$, unless NP has randomized polynomial time algorithms.


Introduction
The planted clique problem, also referred to as hidden clique, is a problem of central importance in the design of algorithms. We introduce a variation of this problem where instead of planting the clique at random, an adversary plants the clique. Our main results are that in certain regimes of the parameters of the problem, the known polynomial time algorithms can be extended to work also in the adversarial settings, whereas for other regimes, the adversarial planting version becomes NP-hard. We find the results interesting for three reasons. One is that they concern an extensively studied problem (planted clique), but from a new direction, and we find that the results lead to a better understanding of which aspects of the planted clique problem are exploited by the known algorithms. Another is that extending the known algorithms (based on semidefinite programming) to the adversarial planted setting involves some new techniques regarding how semidefinite programming can be used and analysed. Finally, the NP-hardness results are interesting as they are proven in a semi-random model in which most of the input instance is random, and the adversary controls only a relatively small aspect of the input instance. One may hope that this brings us closer to proving NP-hardness results for purely random models, a task whose achievement would be a breakthrough in complexity theory.

The random planted clique model
Our starting point is the Erdős–Rényi $G(n, p)$ random graph model, which generates graphs on $n$ vertices in which every two vertices are connected by an edge independently with probability $p$.* We start our discussion with the special case in which $p = \frac{1}{2}$; other values of $p$ will be considered later. Given a graph $G$, let $\omega(G)$ denote the size of the maximum clique in $G$, and let $\alpha(G)$ denote the size of the maximum independent set. Given a distribution $D$ over graphs, we use the notation $G \sim D$ for denoting a graph sampled at random according to $D$. The (edge) complement of a graph $G \sim G(n, \frac{1}{2})$ is by itself a graph sampled from $G(n, \frac{1}{2})$, and the complement of a clique is an independent set; hence the discussion concerning cliques in $G(n, \frac{1}{2})$ extends without change to independent sets (and vice versa). It is well known (proved by computing the expectation and variance of the number of cliques of the appropriate size) that for $G \sim G(n, \frac{1}{2})$, w.h.p. $\omega(G) \simeq 2 \log n$ (the logarithm is in base 2). However, there is no known polynomial time algorithm that can find cliques of size $2 \log n$ in such graphs. A polynomial time greedy algorithm can find a clique of size $(1 + o(1)) \log n$. The existence of $\rho > 1$ for which polynomial time algorithms can find cliques of size $\rho \log n$ is a longstanding open problem.

*Part of the work was done while the author was a visiting student in the Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, and a full-time undergraduate student in the Faculty of Computer Science, Higher School of Economics, Moscow, Russia.
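The greedy algorithm mentioned above is simple enough to sketch. The following Python fragment (ours, for illustration only; function names are not from the paper) samples $G(n, \frac{1}{2})$ and greedily grows a clique; each kept vertex roughly halves the pool of remaining candidates, so the clique found typically has size about $\log_2 n$.

```python
import random
from itertools import combinations

def sample_gnp(n, p, rng):
    """Sample an Erdos-Renyi G(n, p) graph, represented as adjacency sets."""
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def greedy_clique(adj, rng):
    """Scan the vertices in random order; keep each vertex that is
    adjacent to all vertices kept so far."""
    clique = []
    order = sorted(adj)
    rng.shuffle(order)
    for v in order:
        if all(u in adj[v] for u in clique):
            clique.append(v)
    return clique
```

On $G(n, \frac{1}{2})$ this returns a clique of size roughly $\log_2 n$, a factor of 2 below the true clique number, matching the $(1 + o(1)) \log n$ guarantee mentioned above.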
In the classical planted clique problem, one starts with a graph $G' \sim G(n, \frac{1}{2})$ and a parameter $k$. In $G'$ one chooses at random a set $K$ of $k$ vertices, and makes this set into a clique by inserting all missing edges between pairs of vertices within $K$. We refer to $K$ as the planted clique, and say that the resulting graph $G$ is distributed according to $G(n, \frac{1}{2}, k)$. Given $G \sim G(n, \frac{1}{2}, k)$, the algorithmic goal can be one of the following three: find $K$, find a clique of maximum size, or find any clique of size at least $k$. It is not difficult to show that when $k$ is sufficiently large (say, $k > 3 \log n$), then with high probability $K$ is the unique maximum size clique in $G \sim G(n, \frac{1}{2}, k)$, and hence all three goals coincide. Hence in the planted clique problem, the goal is simply to design polynomial time algorithms that (with high probability over the choice of $G \sim G(n, \frac{1}{2}, k)$) find the planted clique $K$. The question is how large $k$ should be (as a function of $n$) so as to make this task feasible.
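The generation of $G \sim G(n, \frac{1}{2}, k)$ is equally direct; a minimal sketch (ours, names illustrative):

```python
import random
from itertools import combinations

def plant_clique(adj, K):
    """Insert all missing edges between pairs of vertices within K."""
    for u, v in combinations(K, 2):
        adj[u].add(v)
        adj[v].add(u)

def sample_planted(n, k, rng):
    """Sample G(n, 1/2, k): a G(n, 1/2) graph with a clique planted on a
    uniformly random set K of k vertices.  Returns (graph, K)."""
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < 0.5:
            adj[u].add(v)
            adj[v].add(u)
    K = set(rng.sample(range(n), k))
    plant_clique(adj, K)
    return adj, K
```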
For some sufficiently large constant $c > 0$ (throughout, we use $c$ to denote a sufficiently large constant), if $k > c\sqrt{n \log n}$, with high probability the vertices of $K$ are simply the $k$ vertices of highest degree in $G$ (see [Kuc95]), and hence $K$ can easily be recovered. Alon, Krivelevich and Sudakov [AKS98] managed to shave the $\sqrt{\log n}$ factor, designing a spectral algorithm that recovers $K$ when $k > c\sqrt{n}$. They also showed that $c$ can be made an arbitrarily small constant, by increasing the running time by a factor of $n^{O(\log(1/c))}$ (this is done by "guessing" a set $K'$ of $O(\log(1/c))$ vertices of $K$, and finding the maximum clique in the subgraph induced on their common neighbors). Subsequently, additional algorithms were developed that find the planted clique when $k > c\sqrt{n}$. They include algorithms based on the Lovász theta function, which is a form of semidefinite programming [FK00], algorithms based on a "reverse-greedy" principle [FR10, DGGP14], and message passing algorithms [DM15]. There have been many attempts to find polynomial time algorithms that succeed when $k = o(\sqrt{n})$, but so far all of them have failed (see for example [Jer92, FK03, MPW15]). It is a major open problem whether there is any such polynomial time algorithm.

Planted clique with $p \neq \frac{1}{2}$ was not studied as extensively, but it is quite well understood how results from the $G(n, \frac{1}{2}, k)$ model transfer to the $G(n, p, k)$ model. For $p$ much smaller than $\frac{1}{2}$, say $p = n^{\delta-1}$ for some $0 < \delta < 1$ (hence average degree $n^{\delta}$), the problem changes completely. Even without planting, with high probability over the choice of $G \sim G(n, p)$ (with $p = n^{\delta-1}$) we have that $\omega(G) = O(\frac{1}{1-\delta})$, and the maximum clique can be found in polynomial time. This also extends to finding maximum cliques in the planted setting, regardless of the value of $k$.
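The degree-based recovery of [Kuc95] amounts to one line: take the $k$ vertices of highest degree. A small seeded experiment (ours; the parameters $n = 400$, $k = 150$ are chosen for illustration only, well inside the $c\sqrt{n \log n}$ regime):

```python
import random
from itertools import combinations

def planted_instance(n, k, rng):
    """G(n, 1/2) with a clique planted on a random k-subset K."""
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < 0.5:
            adj[u].add(v)
            adj[v].add(u)
    K = set(rng.sample(range(n), k))
    for u, v in combinations(K, 2):
        adj[u].add(v)
        adj[v].add(u)
    return adj, K

def top_k_degrees(adj, k):
    """The k highest-degree vertices.  Clique members gain roughly k/2
    extra degree from the planting, separating them from the rest when
    k exceeds c * sqrt(n log n)."""
    return set(sorted(adj, key=lambda v: len(adj[v]), reverse=True)[:k])
```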
(We are not aware of such results being previously published, but they are not difficult. See Section 2.2.) For $p > \frac{1}{2}$, it is more convenient to instead look at the equivalent problem in which $p < \frac{1}{2}$, but with the goal of finding a planted independent set instead of a planted clique. We refer to this model as $\bar{G}(n, p, k)$. For $G \sim G(n, p)$ (with $p = n^{\delta-1}$) we have that with high probability $\alpha(G) = \Theta(n^{1-\delta} \log n)$. For $G \sim \bar{G}(n, p, k)$, the known algorithms extend to finding planted independent sets of size $k = cn^{1-\delta/2}$ in polynomial time. We remark that the approach of [AKS98] for making $c$ arbitrarily small does not work for such sparse graphs.

The adversarial planted clique model
In this paper we introduce a variation on the planted clique model (and planted independent set model) that we refer to as the adversarial planted clique model. As in the random planted clique model, we start with a graph $G' \sim G(n, p)$ and a parameter $k$. However, now a computationally unbounded adversary may inspect $G'$, select within it a subset $K$ of $k$ vertices of its choice, and make this set into a clique by inserting all missing edges between pairs of vertices within $K$. We refer to this model as $AG(n, p, k)$ (and to the corresponding model for planted independent sets as $A\bar{G}(n, p, k)$). As shorthand notation, we shall use $G \sim AG(n, p, k)$ to denote a graph generated by this process. Let us clarify that $AG(n, p, k)$ is not a distribution over graphs, but rather a family of distributions, where each adversarial strategy (a strategy of an adversary being a mapping from $G'$ to a choice of $K$) gives rise to a different distribution.
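The model can be captured by a planting procedure parameterized by an arbitrary strategy, i.e., any function mapping $G'$ to a $k$-subset. In the sketch below (ours), the highest-degree strategy is merely one illustrative adversary, not one analysed in the paper.

```python
import random
from itertools import combinations

def adversarial_plant(adj, k, strategy):
    """AG(n, p, k): the adversary inspects G' and returns any k-subset K,
    which is then turned into a clique by inserting the missing edges."""
    K = strategy(adj, k)
    assert len(K) == k
    for u, v in combinations(K, 2):
        adj[u].add(v)
        adj[v].add(u)
    return set(K)

def highest_degree_strategy(adj, k):
    """One possible adversarial rule: plant on the k vertices of highest
    degree in G' (purely for illustration)."""
    return sorted(adj, key=lambda v: len(adj[v]), reverse=True)[:k]
```

Fixing `strategy` to a uniformly random $k$-subset recovers the classical model $G(n, p, k)$ as a special case.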
In the adversarial planted model, it is no longer true that the planted clique is the one of maximum size in the resulting graph G.Moreover, finding K itself may be information theoretically impossible, as K might be statistically indistinguishable from some other clique of size k (that differs from K by a small number of vertices).The three goals, that of finding K, finding a clique of maximum size, or finding any clique of size at least k, are no longer equivalent.Consequently, for our algorithmic results we shall aim at the more demanding goal of finding a clique of maximum size, whereas for our hardness results, we shall want them to hold even for the less demanding goal of finding an arbitrary clique of size k.

Our results
Our results cover a wide range of values of $0 < p < 1$, where $p$ may be a function of $n$. For simplicity of the presentation and to convey the main insights of our results, we present here the results for three representative regimes: $p = \frac{1}{2}$, $p = n^{\delta-1}$ for $0 < \delta < 1$, and $p = 1 - n^{\delta-1}$. For the latter regime, it will be more convenient to replace it by the equivalent problem of finding adversarially planted independent sets when $p = n^{\delta-1}$.
Informally, our results show the following phenomenon. We consider only the case that $p \le \frac{1}{2}$, but consider both the planted clique and the planted independent set problems, and hence the results can be translated to $p > \frac{1}{2}$ as well. For clique, we show (Theorem 1.1 and Theorem 1.2) how to extend the algorithmic results known for the random planted clique setting to the adversarial planted clique setting. However, for independent set, we show that this is no longer possible. Specifically, when $p$ is sufficiently small, we prove (Theorem 1.3) that finding an independent set of size $k$ (any independent set, not necessarily the planted one) in the adversarial planted independent set setting is NP-hard. Moreover, the NP-hardness result holds even for large values of $k$ for which finding a random planted independent set is trivial.
Theorem 1.1. For every fixed $\varepsilon > 0$ and for every $k \ge \varepsilon\sqrt{n}$, there is an (explicitly described) algorithm with running time $n^{O(\log(1/\varepsilon))}$ which almost surely finds the maximum clique in a graph $G \sim AG(n, \frac{1}{2}, k)$. The statement holds for every adversarial planting strategy (choice of $k$ vertices as a function of $G' \sim G(n, \frac{1}{2})$), and the probability of success is taken over the choice of $G' \sim G(n, \frac{1}{2})$.

Theorem 1.2. Let $p = n^{\delta-1}$ for $0 < \delta < 1$. Then for every $k$, there is an (explicitly described) algorithm running in time $n^{O(\frac{1}{1-\delta})}$ which almost surely finds the maximum clique in a graph $G \sim AG(n, p, k)$. The statement holds for every adversarial planting strategy, and the probability of success is taken over the choice of $G' \sim G(n, p)$.
Theorem 1.3. For $p = n^{\delta-1}$ with $0 < \delta < 1$, $0 < \gamma < 1$, and $cn^{1-\delta}\log n \le k \le \frac{2}{3}n$ (where $c$ is a sufficiently large constant, and the constant $\frac{2}{3}$ was chosen for concreteness; any other constant smaller than 1 will work as well) the following holds. There is no polynomial time algorithm that has probability at least $\gamma$ of finding an independent set of size $k$ in $G \sim A\bar{G}(n, p, k)$, unless NP has randomized polynomial time algorithms (NP=RP). (The algorithm is required to succeed against every adversarial planting strategy, and the probability of success is taken over the choice of $G' \sim G(n, p)$.)

Related work
Some related work was already mentioned in Section 1.1. Our algorithm for Theorem 1.1 is based on an adaptation of the algorithm of [FK00] that applied to the random planted clique setting. In turn, that algorithm is based on the theta function of Lovász [Lov79].
A work that is closely related to ours, and served as an inspiration both for the model that we study and for the techniques used in the proof of the NP-hardness result (Theorem 1.3), is the work of David and Feige [DF16] on adversarially planted 3-colorings. That work uncovers a phenomenon similar to the one displayed in the current work. Specifically, for the problem of 3-coloring (rather than clique or independent set) it shows that for certain values of $p$, algorithms that work in the random planted setting can be extended to the adversarial planted setting, and for other values of $p$, finding a 3-coloring in the adversarial planted setting becomes NP-hard. However, there are large gaps left open in the picture that emerges from the work of [DF16]. For large ranges of the values of $p$, specifically $n^{-1/2} < p < n^{-1/3}$ and $p < n^{-2/3}$, there are neither algorithmic results nor hardness results in the work of [DF16]. Unfortunately, the most interesting values of $p$ for the 3-coloring problem, which are $p \le \frac{c \log n}{n}$, lie within these gaps, and hence the results of [DF16] do not apply to them. Our work addresses a different problem (planted clique instead of planted 3-coloring), and for our problem, our analysis leaves almost no such gaps. We are able to determine for which values of $p$ the problem is polynomial time solvable, and for which values it is NP-hard. See Section 3 for more details.

Our model is an example of a semi-random model, in which part of the input is determined at random and part is determined by an adversary. There are many other semi-random models, both for the clique problem and for other problems. Describing all these models is beyond the scope of this paper, and the interested reader is referred to [Fei20] and references therein for additional information.

Overview of the proofs
In this section we provide an overview of the proofs of our three main theorems. Further details, as well as extensions of the results, appear in the appendix.
The term almost surely denotes a probability that tends to 1 as $n$ grows. The term extremely high probability denotes a probability of the form $1 - e^{-n^r}$ for some $r > 0$. By $\exp(x)$ for some expression $x$ we mean $e^x$.

Finding cliques using the theta function
In this section we provide an overview of the proof of Theorem 1.1. Our algorithm is an adaptation of the algorithm of [FK00] that finds the maximum clique in the random planted model. We shall first review that algorithm, then describe why it does not apply in our setting in which an adversary plants the clique, and finally explain how we modify that algorithm and its analysis so as to apply it in the adversarial planted setting.
The key ingredient in the algorithm of [FK00] is the theta function of Lovász, denoted by $\vartheta$. Given a graph $G$, $\vartheta(G)$ can be computed in polynomial time (up to arbitrary precision, using semidefinite programming (SDP)), and satisfies $\vartheta(G) \ge \alpha(G)$. As we are interested here in cliques and not in independent sets, we shall consider $\bar{G}$, the edge complement of $G$, and then $\vartheta(\bar{G}) \ge \omega(G)$. The theta function has several equivalent definitions, and the one that we shall use here (referred to as $\vartheta_4$ in [Lov79]) is the following.
Given a graph $G = (V, E)$, a collection of unit vectors $s_i \in \mathbb{R}^n$ (one vector for every vertex $i \in V$) is an orthonormal representation of $G$ if $s_i$ and $s_j$ are orthogonal ($s_i \cdot s_j = 0$) whenever $(i, j) \in E$. The theta function is the maximum value of the following expression, where maximization is over all orthonormal representations $\{s_i\}$ of $G$ and over all unit vectors $h$ ($h$ is referred to as the handle):

$$\vartheta(G) = \max_{\{s_i\}, h} \sum_{i \in V} (h \cdot s_i)^2. \qquad (1)$$

The optimal orthonormal representation and the associated handle that maximize the above formulation for $\vartheta$ can be found (up to arbitrary precision) in polynomial time by formulating the problem as an SDP (details omitted). Observe that for any independent set $S$ the following is a feasible solution for the SDP: choose $s_i = h$ for all $i \in S$, and choose all remaining vectors $s_j$ for $j \notin S$ to be orthogonal to $h$ and to each other. Consequently, $\vartheta(G) \ge \alpha(G)$, as claimed.
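The feasible solution described above is easy to verify explicitly without an SDP solver. The sketch below (ours) encodes the vectors as standard basis vectors: $s_i = h$ for $i \in S$, and distinct basis vectors orthogonal to $h$ for the remaining vertices. It checks the orthogonality constraints on every edge and confirms that the objective of (1) equals $|S|$.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def basis_vector(i, dim):
    return tuple(1.0 if j == i else 0.0 for j in range(dim))

def feasible_value(n, edges, S):
    """Build the orthonormal representation witnessing theta(G) >= |S|:
    s_i = h for i in S, and mutually orthogonal unit vectors (all
    orthogonal to h) for the remaining vertices.  Returns the value of
    the objective in (1) for this feasible solution."""
    dim = n + 1
    h = basis_vector(0, dim)
    vecs = {}
    free = 1
    for v in range(n):
        if v in S:
            vecs[v] = h
        else:
            vecs[v] = basis_vector(free, dim)
            free += 1
    # orthogonality must hold on every edge of G
    for u, v in edges:
        assert dot(vecs[u], vecs[v]) == 0.0, "S must be an independent set"
    return sum(dot(h, s) ** 2 for s in vecs.values())
```

For the 5-cycle with the independent set $S = \{0, 2\}$, the objective evaluates to 2, matching $|S|$.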
The main content of the algorithm of [FK00] is summarized in the following theorem. We phrased it in a way that addresses cliques rather than independent sets, implicitly using $\alpha(\bar{G}) = \omega(G)$. We also remind the reader that in the random planted model, the planted clique $K$ is almost surely the unique maximum clique.
Theorem 2.1 (Results of [FK00]). Consider $G \sim G(n, \frac{1}{2}, k)$, a graph selected in the random planted clique model, with $k \ge c\sqrt{n}$ for some sufficiently large constant $c$. Then with extremely high probability (over the choice of $G$) it holds that $\vartheta(\bar{G}) = \omega(G)$. Moreover, for every vertex $i$ that belongs to the planted clique $K$, the corresponding vector $s_i$ has inner product larger than $1 - \frac{1}{n}$ with the handle $h$, and for every other vertex, the corresponding inner product is at most $\frac{1}{n}$.
Given Theorem 2.1, the following algorithm finds the planted clique when $G \sim G(n, \frac{1}{2}, k)$ and $k \ge c\sqrt{n}$ for some sufficiently large constant $c$: solve the optimization problem (1) (on $\bar{G}$) to sufficiently high precision, and output all vertices whose corresponding inner product with $h$ is at least $\frac{1}{2}$.

The algorithm above does not apply to $G \sim AG(n, \frac{1}{2}, k)$, a graph selected in the adversarial planted clique model, for the simple reason that Theorem 2.1 is incorrect in that model. The adversary can plant $K$ in a location where some additional set $T$ of roughly $\log n$ vertices forms a clique together with $K$ (see Example 1), in which case the maximum clique has size larger than $k$; moreover, in view of known results on the theta function of random graphs (see [Juh82]), one would expect the value of $\vartheta(\bar{G})$ to be roughly $k + \sqrt{\log n}$ or larger.
Summarizing, it is not difficult to come up with strategies for planting cliques of size $k$ that result in the maximum clique having size strictly larger than $k$, and in the value of $\vartheta(\bar{G})$ being even larger. Consequently, the solution of the optimization problem (1) by itself is not expected to correspond to the maximum clique in $G$.
We now explain how we overcome the above difficulty. A relatively simple, yet important, observation is the following.

Proposition 2.1. Let $G \sim AG(n, p, k)$ with $p = \frac{1}{2}$ and $k > \sqrt{n}$, and let $K'$ be the maximum clique in $G$ (which may differ from the planted clique $K$). Then with extremely high probability over the choice of $G' \sim G(n, \frac{1}{2})$, for every possible choice of $k$ vertices by the adversary, $K'$ contains at least $k - O(\log n)$ vertices from $K$, and at most $O(\log n)$ additional vertices.
Proof. Standard probabilistic arguments show that with extremely high probability, the largest clique in $G'$ (prior to planting a clique of size $k$) is of size at most $\frac{k}{2}$. When this holds, $K'$ contains at least $\frac{k}{2}$ vertices from $K$. Each of the remaining vertices of $K'$ needs to be connected to all vertices in $K' \cap K$. Consequently, with extremely high probability, $K'$ contains at most $2\log n$ vertices not from $K$. This is because a $G' \sim G(n, \frac{1}{2})$ graph, with extremely high probability, does not contain two sets of vertices $A$ and $B$, with $|A| = 2\log n$ and $|B| = \Omega(\sqrt{n})$, such that all pairs of vertices in $A \times B$ induce edges in $G'$. As $|K'| \ge k$, we conclude that all but $O(\log n)$ vertices of $K$ must be members of $K'$.
A key theorem that we prove is the following.

Theorem 2.2. Let $G \sim AG(n, \frac{1}{2}, k)$ with $k \ge c\sqrt{n}$ for a sufficiently large constant $c$. Then $k \le \vartheta(\bar{G}) \le k + O(\log n)$, with extremely high probability over the choice of $G' \sim G(n, \frac{1}{2})$, for every possible choice of $k$ vertices by the adversary.
We now explain how Theorem 2.2 is proved. The bound $\vartheta(\bar{G}) \ge k$ was already explained above. Hence it remains to show that $\vartheta(\bar{G}) \le k + O(\log n)$. In general, to bound $\vartheta(G)$ from above for a graph $G = (V, E)$, one considers the following dual formulation of $\vartheta$ as a minimization problem:

$$\vartheta(G) = \min_{M} \lambda_1(M).$$
Here $M$ ranges over all $n$ by $n$ symmetric matrices in which $M_{ij} = 1$ whenever $i = j$ or $(i, j) \notin E$, and $\lambda_1(M)$ denotes the largest eigenvalue of $M$. (Observe that if $G$ has an independent set $S$ of size $k$, then $M$ contains a $k$ by $k$ block of 1 entries. A Rayleigh quotient argument then implies that $\lambda_1(M) \ge k$, thus verifying the inequality $\vartheta(G) \ge \alpha(G)$.) To prove Theorem 2.2 we exhibit a matrix $M$ as above (for the graph $\bar{G}$) for which we prove that $\lambda_1(M) \le k + O(\log n)$.

We first review how a matrix $M$ was chosen by [FK00] in the proof of Theorem 2.1. First, recall that we consider $\bar{G}$, and let $E$ be the set of edges of $\bar{G}$ (non-edges of $G$). We need to associate values with the entries $M_{ij}$ for $(i, j) \in E$ (as the other entries are 1). The matrix block corresponding to the planted clique $K$ (planted independent set in $\bar{G}$) is all 1 (by necessity). For every $(i, j) \in E$ where both vertices are not in $K$, one sets $M_{ij} = -1$. For every other pair $(i, j) \in E$ (say, $i \notin K$ and $j \in K$) one sets $M_{ij} = -\frac{d_{i,K}}{k - d_{i,K}}$, where $d_{i,K}$ is the number of neighbors that vertex $i$ has in the set $K$. In order to show that $\lambda_1(M) = k$, one first observes that the vector $x_K$ (with value 1 at entries that correspond to vertices of $K$, and value 0 elsewhere) is an eigenvector of $M$ with eigenvalue $k$. Then one proves that $\lambda_2(M)$, the second largest eigenvalue of $M$, has value smaller than $k$. This is done by decomposing $M$ into a sum of several matrices, bounding the second largest eigenvalue of one of these matrices, and the largest eigenvalue of the other matrices. By Weyl's inequality, the sum of these eigenvalues is an upper bound on $\lambda_2(M)$. This upper bound is not tight, but it does show that $\lambda_2(M) < k$. It follows that the eigenvalue $k$ associated with $x_K$ is indeed $\lambda_1(M)$. Further details are omitted.
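As a sanity check on this construction, the following sketch (ours) builds $M$ for a small random planted instance and verifies numerically that $x_K$ is an eigenvector with eigenvalue $k$. The value on non-edges between a clique vertex and a vertex $i$ outside $K$ is forced by requiring each such row to sum to zero over $K$: $d_{i,K} \cdot 1 + (k - d_{i,K}) \cdot M_{ij} = 0$.

```python
import random
from itertools import combinations

def planted_instance(n, k, rng):
    """G(n, 1/2) with a clique planted on a random k-subset K."""
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < 0.5:
            adj[u].add(v)
            adj[v].add(u)
    K = set(rng.sample(range(n), k))
    for u, v in combinations(K, 2):
        adj[u].add(v)
        adj[v].add(u)
    return adj, K

def fk_matrix(adj, K, n, k):
    """Dual witness for theta of the complement graph: entries are 1 on
    the diagonal and on edges of G; on non-edges of G, -1 if both
    endpoints are outside K, and -d/(k-d) if one endpoint lies outside K
    and has d neighbors in K."""
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j or j in adj[i]:
                M[i][j] = 1.0
            elif i not in K and j not in K:
                M[i][j] = -1.0
            else:
                out = i if i not in K else j    # the endpoint outside K
                d = len(adj[out] & K)
                M[i][j] = -d / (k - d)          # row sums to zero over K
    return M
```

Multiplying $M$ by the indicator vector of $K$ returns $k$ on the coordinates of $K$ and 0 elsewhere, exactly the eigenvector property used above.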
We now explain how to choose a matrix $M$ so as to prove the bound $\vartheta(\bar{G}) \le k + O(\log n)$ in Theorem 2.2. Recall (see Example 1) that we might be in a situation in which $\vartheta(\bar{G}) > \alpha(\bar{G}) > k$ (with all inequalities being strict). In this case, let $K'$ denote the largest independent set in $\bar{G}$, and note that $K'$ is larger than $K$. In $M$, the matrix block corresponding to $K'$ is all 1. One may attempt to complete the construction of $M$ as described above for the random planting case, but replacing $K$ by $K'$ everywhere in that construction. If one does so, the vector $x_{K'}$ (with value 1 at entries that correspond to vertices of $K'$, and value 0 elsewhere) is an eigenvector of $M$ with eigenvalue $\alpha(\bar{G}) > k$. However, $M$ would necessarily have another eigenvector with a larger eigenvalue, because $\vartheta(\bar{G}) > \alpha(\bar{G})$. Hence we are still left with the problem of bounding $\lambda_1(M)$, rather than bounding $\lambda_2(M)$. Having failed to identify an eigenvector for $\lambda_1(M)$, we may still obtain an upper bound on $\lambda_1(M)$ by using approaches based on Weyl's inequality (or other approaches). However, these upper bounds are not tight, and it seems difficult to limit the error that they introduce to be as small as $O(\log n)$, which is needed for proving the inequality $\vartheta(\bar{G}) \le k + O(\log n)$.

For the above reason, we choose $M$ differently. For some constant $\frac{1}{2} < \rho < 1$, we extend the clique $K$ to a possibly larger clique $Q$, by adding to it every vertex that has at least $\rho k$ neighbors in $K$. (In Example 1, the corresponding clique $Q$ will include all vertices of $K \cup T$. In contrast, if $K$ is planted at random and not adversarially, then we will simply have $Q = K$.)
Importantly, we prove that if $G' \sim G(n, \frac{1}{2})$, then with high probability $|Q| < k + O(\log n)$ (for every possible choice of planting a clique of size $k$ by the adversary). For the resulting graph $G_Q$ (the graph in which the set $Q$ is turned into a clique), we choose the corresponding matrix $M$ in the same way as it was chosen for the random planting case. Now we do manage to show that the eigenvector $x_Q$ (with eigenvalue $|Q|$) associated with this $M$ indeed has the largest eigenvalue. This part is highly technical, and significantly more difficult than the corresponding proof for the random planting case. The reason for the added level of difficulty is that, unlike the random planting case in which we are dealing with only one random graph, here the adversary can plant the clique in any one of $\binom{n}{k}$ locations, and our analysis needs to hold simultaneously for all $\binom{n}{k}$ graphs that may result from such plantings. Further details can be found in Appendix A.
Having established that $\vartheta(\bar{G}_Q) = |Q| \le k + O(\log n)$, we use monotonicity of the theta function to conclude that $\vartheta(\bar{G}) \le k + O(\log n)$. This concludes our overview of the proof of Theorem 2.2.
Given Theorem 2.2, let us now explain our algorithm for finding a maximum clique in $G \sim AG(n, \frac{1}{2}, k)$. Given such a graph $G$, the first step in our algorithm is to solve the optimization problem (1) on the complement graph $\bar{G}$. By Theorem 2.2, we will have $\vartheta(\bar{G}) \le k + c\log n$ for some constant $c > 0$. Let $\{s_i\}$ denote the orthonormal representation found by our solution, and let $h$ be the corresponding handle.
The second step of our algorithm is to extract from $G$ a set of vertices that we shall refer to as $H$, containing all those vertices $i$ for which $(h \cdot s_i)^2 \ge \frac{3}{4}$.

Lemma 2.1. For $H$ as defined above, with extremely high probability, at least $k - O(\log n)$ vertices of $K$ are in $H$, and at most $O(\log n)$ vertices not from $K$ are in $H$.
Proof. Let $T$ denote the set of those vertices in $K$ for which $(h \cdot s_i)^2 < \frac{3}{4}$. Remove $T$ from $G$, thus obtaining the graph $G_T$. This graph can be thought of as a subgraph with $n - |T|$ vertices of the random graph $G' \sim G(n, \frac{1}{2})$, in which an adversary planted a clique of size $k - |T|$. We also have that $\vartheta(\bar{G}_T) \ge \vartheta(\bar{G}) - \frac{3}{4}|T| \ge k - \frac{3}{4}|T|$, as removing the vertices of $T$ decreases the value of the solution $\{s_i\}$ by less than $\frac{3}{4}|T|$. If $|T|$ were larger than $c'\log n$ for a sufficiently large constant $c'$, this gap between the size of the planted clique (which is $k - |T|$) and the value of the theta function would contradict Theorem 2.2 for the graph $G_T$. (Technical remark: this last argument uses the fact that Theorem 2.2 holds with extremely high probability, as we take a union bound over all choices of $T$.)

Having established that $T$ is small, let $R$ be the set of vertices not in $K$ for which $(h \cdot s_i)^2 \ge \frac{3}{4}$. We claim that every such vertex $i \in R$ is a neighbor of every vertex $j \in K \setminus T$. This is because in the orthonormal representation (for $\bar{G}$), if $i$ and $j$ are not neighbors in $G$ we have that $s_i \cdot s_j = 0$, and then the fact that $s_i$, $s_j$ and $h$ are unit vectors implies that $(h \cdot s_i)^2 + (h \cdot s_j)^2 \le 1$, contradicting the fact that both inner products squared are at least $\frac{3}{4}$.

Having this claim, and using the fact that a $G(n, \frac{1}{2})$ graph, with extremely high probability, does not contain two sets of vertices $A$ and $B$, with $|A| = \Omega(\log n)$ and $|B| = \Omega(\sqrt{n})$, such that all pairs of vertices in $A \times B$ induce edges, we conclude that $|R| = O(\log n)$.

The third step of our algorithm constructs a set $F$ that contains all those vertices that have at least $\frac{3k}{4}$ neighbors in $H$.
Lemma 2.2. With extremely high probability, the set $F$ described above contains the maximum clique in $G$, and at most $O(\log n)$ additional vertices.
Proof. We may assume that $H$ satisfies the properties of Lemma 2.1. Proposition 2.1 then implies that with extremely high probability, every vertex of the maximum clique in $G$ has at least $\frac{3k}{4}$ neighbors in $H$, and hence is contained in $F$. A probabilistic argument (similar to the end of the proof of Lemma 2.1) establishes that $F$ has at most $O(\log n)$ vertices not from $K$.
As $K$ itself has at most $O(\log n)$ vertices not from the maximum clique (by Proposition 2.1), the total number of vertices in $F$ that are not members of the maximum clique is at most $O(\log n)$.

Finally, in the last step of our algorithm we find a maximum clique in $F$, and this is a maximum clique in $G$. This last step can be performed in polynomial time by a standard algorithm (used, for example, to show that vertex cover is fixed parameter tractable). For every non-edge in the subgraph induced on $F$, at least one of its end-vertices needs to be removed. Try both possibilities in parallel, and recurse on each subgraph that remains. The recursion terminates when the graph is a clique. The shortest branch in the recursion gives the maximum clique. As only $O(\log n)$ vertices need to be removed in order to obtain a clique, the depth of the recursion is at most $O(\log n)$, and consequently the running time (which is exponential in the depth) is polynomial in $n$.
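The branching step just described can be sketched as follows (ours; names illustrative). The procedure finds a non-edge inside $F$, branches on deleting either endpoint, and abandons a branch once its deletion budget is exhausted, so the recursion tree has size at most $2^{\text{budget}}$.

```python
def max_clique_near_clique(adj, F, budget):
    """Maximum clique inside F, assuming F becomes a clique after
    removing at most `budget` vertices.  Branch on a non-edge: at least
    one of its endpoints must be removed.  Time O(2^budget * poly(n))."""
    F = set(F)
    for u in sorted(F):
        for v in sorted(F):
            if u < v and v not in adj[u]:       # found a non-edge
                if budget == 0:
                    return None                 # too many deletions needed
                best = None
                for w in (u, v):
                    c = max_clique_near_clique(adj, F - {w}, budget - 1)
                    if c is not None and (best is None or len(c) > len(best)):
                        best = c
                return best
    return F                                     # F induces a clique
```

On a 7-vertex graph that is complete except for the non-edges $(0,5)$, $(1,6)$, $(5,6)$, a budget of 2 suffices and the procedure returns a clique of size 5.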
This completes our overview of the algorithm for finding a maximum clique in $G \sim AG(n, \frac{1}{2}, k)$ when $k \ge c\sqrt{n}$ for a sufficiently large constant $c > 0$. To complete the proof of Theorem 1.1 we need to also address the case that $k \ge \varepsilon\sqrt{n}$ for an arbitrarily small constant $\varepsilon$. This we do (as in [AKS98]) by guessing $t \simeq 2\log\frac{c}{\varepsilon}$ vertices from $K$ (there are $n^t$ possibilities to try, and we try all of them), and considering the subgraph of $G$ induced on their common neighbors. This subgraph corresponds to an adversarially planted instance in which the number of vertices drops (roughly) by a factor of $2^t$, while the size of the planted clique drops only by $t$, bringing us back to the regime in which the algorithm above applies.

The many details that were omitted from the above overview of the proof of Theorem 1.1 can be found in the appendix. Specifically, in Appendix A we present the proof of Theorem 2.2, generalized to values of $p$ other than $\frac{1}{2}$, and $k \ge c\sqrt{np}$. (A technical lemma that is needed for this proof appears in Appendix D.) In Appendix B we present the proof of Theorem 1.1, first addressing the case that $c$ is sufficiently large, and then extending the results to the case that $c$ can be arbitrarily small.

Finding cliques by enumeration
In this section we prove Theorem 1.2. Let $p = n^{\delta-1}$ for $0 < \delta < 1$, and consider first $G' \sim G(n, p)$ (hence $G'$ has average degree roughly $n^{\delta}$). For every size $t \ge 1$, let $N_t$ denote the number of cliques of size $t$ in $G'$. The expectation (over the choice of $G' \sim G(n, p)$) satisfies:

$$\mathbb{E}[N_t] = \binom{n}{t} p^{\binom{t}{2}} \le n^t \cdot n^{(\delta-1)\binom{t}{2}} = n^{t\left(1 - \frac{(1-\delta)(t-1)}{2}\right)}.$$

The exponent is maximized when $t = \frac{3-\delta}{2(1-\delta)}$. For the maximizing (not necessarily integer) $t$, the exponent equals $\frac{(3-\delta)^2}{8(1-\delta)}$. We denote this last expression by $e_{\delta}$, and note that $e_{\delta} = O(\frac{1}{1-\delta})$. The expected number of cliques of all sizes is then:

$$\sum_{t \ge 1} \mathbb{E}[N_t] \le n^{e_{\delta}+1}.$$

(The last inequality holds for sufficiently large $n$.) By Markov's inequality, with probability at least $1 - \frac{1}{n}$, the actual number of cliques in $G'$ is at most $n^{e_{\delta}+2}$. (Stronger concentration results can be used here, but are not needed for the proof of Theorem 1.2.)

Now, for arbitrary $1 \le k \le n$, let the adversary plant a clique $K$ of size $k$ in $G'$, thus creating the graph $G \sim AG(n, p, k)$. As every subset of $K$ is a clique, the total number of cliques in $G$ is at least $2^k$, which might be exponential in $n$ (if $k$ is large). However, the number of maximal cliques in $G$ (a clique is maximal if it is not contained in any larger clique) is much smaller. Given a maximal clique $C$ in $G$, consider $C'$, the subset of $C$ not containing any vertex from $K$. $C'$ is a clique in $G'$ (which is nonempty, except for the one special case $C = K$). $C'$ uniquely determines $C$, as the remaining vertices in $C$ are precisely the common neighbors of $C'$ in $K$ (this is because the clique $C$ is maximal). Consequently, the number of maximal cliques in $G$ is not larger than the number of cliques in $G'$.
As all maximal cliques in a graph can be enumerated in time linear in their number times some polynomial in $n$ (see e.g. [MU04] and references therein), one can list all maximal cliques in $G$ in time $n^{e_{\delta}+O(1)}$ (this holds with probability at least $1 - \frac{1}{n}$ over the choice of $G'$, regardless of where the adversary plants the clique $K$), and output the largest one.
This completes the proof of Theorem 1.2.
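The enumeration of all maximal cliques used above can be carried out, for instance, by the classical Bron–Kerbosch procedure; a minimal version (ours, without the pivoting optimizations used in practice) looks as follows.

```python
def maximal_cliques(adj):
    """Enumerate all maximal cliques of a graph given as adjacency sets.
    Basic Bron-Kerbosch recursion: R is the current clique, P the
    candidates that extend it, X the vertices already excluded."""
    out = []
    def expand(R, P, X):
        if not P and not X:
            out.append(frozenset(R))    # R cannot be extended: maximal
            return
        for v in sorted(P):
            expand(R | {v}, P & adj[v], X & adj[v])
            P = P - {v}
            X = X | {v}
    expand(set(), set(adj), set())
    return out
```

For a triangle $\{0,1,2\}$ with a pendant vertex 3 attached to 2, the maximal cliques are exactly $\{0,1,2\}$ and $\{2,3\}$.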

Proving NP-hardness results
In this section we provide an overview of the proof of Theorem 1.3. Our proof is an adaptation to our setting of a proof technique developed in [DF16].
Recall that we are considering a graph $G \sim A\bar{G}(n, p, k)$ (adversarial planted independent set) with $p = n^{\delta-1}$ and $0 < \delta < 1$. Let us first explain why the algorithm described in Section 2.1 fails when $k = cn^{1-\delta/2}$ (whereas if the independent set is planted at random, algorithms based on the theta function are known to succeed). The problem is that the bound of Theorem 2.2 is no longer true, and instead one has the much weaker bound $\vartheta(G) \le k + n^{1-\delta}\log n$. Following the steps of the algorithm of Section 2.1, in the final step we would need to remove a minimum vertex cover from $F$. However, now the upper bound on the size of this vertex cover is $O(n^{1-\delta}\log n)$ rather than $O(\log n)$, and consequently we do not know of a polynomial time algorithm that will do so. It may seem that we also do not know that no such algorithm exists: after all, $F$ is not an arbitrary worst case instance for vertex cover, but rather an instance derived from a random graph. However, our NP-hardness result shows that this obstacle is indeed insurmountable, unless NP has randomized polynomial time algorithms. We remark that using an approximation algorithm for vertex cover in the last step of the algorithm of Section 2.1 does allow one to find in $G$ an independent set of size $k - O(n^{1-\delta}\log n) = (1 - o(1))k$; the NP-hardness result applies only because we insist on finding an independent set of size at least $k$.
Let us proceed now with an overview of our NP-hardness proof. We do so for the case that $k = \frac{n}{3}$ (for which we could easily find the maximum independent set if the independent set were planted at random). Assume for the sake of contradiction that ALG is a polynomial time algorithm that, with high probability over the choice of $G' \sim G(n, p)$ and for every planted independent set of size $k = \frac{n}{3}$, finds in the resulting graph $G$ an independent set of size $k$.
We now introduce a class $\mathcal{H}$ of graphs that, in anticipation of the proofs that will follow, is required to have the following three properties. (Two of the properties are stated below in a qualitative manner, but they have precise quantitative requirements in the proofs that follow.)
1. Solving maximum independent set on graphs from this class is NP-hard.
2. Graphs in this class are very sparse.
3. The number of vertices in each graph is small.

Given the above requirements, we choose $0 < \epsilon < \min\{\frac{\delta}{2}, 1 - \delta\}$, and let $\mathcal{H}$ be the class of balanced graphs on $n^{\epsilon}$ vertices and of average degree $2 + \delta$. (A graph $H$ is balanced if no subgraph of $H$ has average degree larger than the average degree of $H$.) Given a graph $H \in \mathcal{H}$ and a parameter $k'$, it is NP-hard to determine whether $H$ has an independent set of size at least $k'$ (see Theorem C.1). We will reach a contradiction to the existence of ALG by showing how ALG could be used in order to find in $H$ an independent set of size $k'$, if one exists. For this, we use the following randomized algorithm ALGRAND.
1. Generate a random graph $G' \sim G(n, p)$.

If $H$ does not have an independent set of size $k'$, ALGRAND surely fails to output such an independent set. But if $H$ does have an independent set of size $k'$, why should ALGRAND succeed? This is because ALG (which is used in ALGRAND) is fooled into thinking that the graph $G_H$ generated by ALGRAND was generated from $A\bar{G}(n, p, k)$, and on such graphs ALG does find independent sets of size $k$. And why is ALG fooled? This is because the distribution of graphs generated by ALGRAND is statistically close to a distribution that can be created by the adversary in the $A\bar{G}(n, p, k)$ model. Specifically, consider the following distribution, which we refer to as $A_H G(n, p, k)$.

2. The computationally unbounded adversary finds within $G'$ all subsets of vertices of size $|H|$ such that the subgraph induced on them is $H$. (If there is no such subset, fail.) Choose one such copy of $H$ uniformly at random.
3. As $H$ is assumed to have an independent set of size $k'$, plant an independent set $K$ of size $k$ as follows. $k'$ of the vertices of $K$ are vertices of an independent set in the selected copy of $H$. The remaining $k - k'$ vertices of $K$ are chosen at random among the vertices of $G'$ that have no neighbor at all in the copy of $H$. (Observe that we expect there to be at least roughly $n - |H| \cdot n^{\delta} \ge \frac{n}{2}$ such vertices, and with extremely high probability the actual number will not be much smaller.)

Theorem 2.3. The two distributions, $G_H \sim G_H(n, p, k)$ generated by ALGRAND and $G \sim A_H G(n, p, k)$ generated by the adversary, are statistically similar to each other.
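To make the steps above concrete, here is a brute-force sketch of the distribution $A_H G(n, p, k)$. All helper names are hypothetical, and the exhaustive search over induced copies of $H$ is exponential; it only illustrates the definition, the adversary being computationally unbounded anyway.

```python
import itertools
import random

def erdos_renyi(n, p, rng):
    """Adjacency sets of a G(n, p) random graph."""
    adj = {v: set() for v in range(n)}
    for u, v in itertools.combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def induced_copies(adj, n, h_edges, m):
    """All ordered m-tuples of vertices whose induced subgraph is exactly H
    (position i playing vertex i of H).  Exponential; toy sizes only."""
    copies = []
    for S in itertools.permutations(range(n), m):
        if all((S[j] in adj[S[i]]) == (((i, j) in h_edges) or ((j, i) in h_edges))
               for i, j in itertools.combinations(range(m), 2)):
            copies.append(S)
    return copies

def adversarial_plant(n, p, k, h_edges, m, h_ind_set, rng):
    """Sketch of A_H G(n, p, k): pick a uniformly random induced copy of H,
    take an independent set of H inside it, extend it with k - k' random
    vertices having no neighbor in the copy, and plant (remove internal
    edges of) the resulting set K."""
    adj = erdos_renyi(n, p, rng)
    copies = induced_copies(adj, n, h_edges, m)
    if not copies:
        return None                       # the adversary fails
    S = rng.choice(copies)
    planted = {S[i] for i in h_ind_set}   # the k' vertices inside the copy of H
    non_nbrs = [v for v in range(n)
                if v not in S and all(u not in adj[v] for u in S)]
    rng.shuffle(non_nbrs)
    planted.update(non_nbrs[:k - len(h_ind_set)])
    for u, v in itertools.combinations(planted, 2):
        adj[u].discard(v)
        adj[v].discard(u)
    return adj, planted
```

For instance, with $H$ a path on 3 vertices and $\{0, 2\}$ as its independent set, `adversarial_plant(12, 0.3, 4, {(0, 1), (1, 2)}, 3, [0, 2], random.Random(0))` produces a small graph in which the planted set is independent.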
The proof of Theorem 2.3 appears in Appendix C.4. Here we explain the main ideas in the proof. A minimum requirement for the theorem to hold is that $G' \sim G(n, p)$ typically contains at least one copy of $H$ (otherwise $A_H G(n, p, k)$ fails to produce any output). But this by itself does not suffice. Intuitively, the condition we need is that $G'$ typically contains many copies of $H$. Then the fact that $G_H(n, p)$ of ALGRAND adds another copy of $H$ to $G'$ does not appear to make much of a difference, because $G'$ anyway has many copies of $H$. Hopefully, this implies that $G' \sim G(n, p)$ and $G_H \sim G_H(n, p)$ come from two distributions that are statistically close. This intuition is basically correct, though another ingredient (a concentration result) is also needed. Specifically, we need the following lemma (stated informally).
Lemma 2.3. For $G' \sim G(n, p)$ (with $p$ and $H$ as above), the expected number of copies of $H$ in $G'$ is very high ($2^{n^{\eta}}$ for some $\eta > 0$ that depends on $\delta$ and $\epsilon$). Moreover, with high probability, the actual number of copies of $H$ in $G'$ is very close to its expectation.
The proof of Lemma 2.3 is based on known techniques (first and second moment methods). It uses in an essential way the fact that the graph $H$ is sparse (average degree barely above 2) and does not have many vertices (these properties hold by definition of the class $\mathcal{H}$). See more details in Appendix C.3. Armed with Lemma 2.3, we then prove the following lemma.
Lemma 2.4. The two distributions $G(n, p)$ and $G_H(n, p)$ are statistically similar to each other.

Lemma 2.4 is proved by considering graphs $G' \sim G(n, p)$ that do contain a copy of $H$ (Lemma 2.3 establishes that this is the typical case), and comparing for each such graph the probability of it being generated by $G_H(n, p)$ with the probability of it being generated by $G(n, p)$. Conveniently, the ratio between these probabilities is the same as the ratio between the actual number of copies of $H$ in the given graph $G'$ and the expected number of copies of $H$ in a random $G' \sim G(n, p)$. By Lemma 2.3, for most graphs this ratio is close to 1. For more details, see Appendix C.4.
Theorem 2.3 follows quite easily from Lemma 2.4. Consequently, ALG's performance on the distributions $G_H(n, p, k)$ and $A_H G(n, p, k)$ is similar. By our assumption, ALG finds (with high probability) an independent set of size $k$ in $G \sim A_H G(n, p, k)$, which now implies that it also does so for $G_H \sim G_H(n, p, k)$. But as argued above, finding an independent set of size $k$ in $G_H \sim G_H(n, p, k)$ implies that ALGRAND finds an independent set of size $k'$ in $H \in \mathcal{H}$, thus solving an NP-hard problem. Hence the assumption that there is a polynomial time algorithm ALG that can find independent sets of size $k$ in $G \sim A\bar{G}(n, p, k)$ implies that NP has randomized polynomial time algorithms.

Additional results
In the main part of the paper we only described what we view as our main results. The appendix contains all missing proofs, and some additional results and extensions not described above. For example, one may ask for which value of $p \le \frac{1}{2}$ the transition occurs from being able to find the maximum independent set in $G \sim A\bar{G}(n, p, k)$ in polynomial time, to the problem becoming NP-hard. Our results show a gradual transition. For constant $p$ the problem remains polynomial time solvable, and then, as $p$ continues to decrease, the running time of our algorithms becomes super-polynomial and grows gradually towards exponential complexity. Establishing this type of behavior does not require new proof ideas, but rather only the substitution of different parameters in the existing proofs. Consequently, some theorems that were stated here only in special cases (e.g., Theorem 2.2, which was stated only for $p = \frac{1}{2}$) are restated in the appendix in a more general way (e.g., replacing $\frac{1}{2}$ by $p$), and a more general proof is provided.
Though this is not shown in the appendix, our hardness results (for finding adversarially planted independent sets) also imply a gradual transition, providing NP-hardness results when $p = n^{\delta-1}$, and as $p$ grows (e.g., into the range $p = \frac{1}{(\log n)^c}$) the NP-hardness results are replaced by hardness results under stronger assumptions, such as (a randomized version of) the exponential time hypothesis. This is because for $p = \frac{1}{(\log n)^c}$ we need to limit the size of the graphs $H \in \mathcal{H}$ to be only polylogarithmic in $n$, as for larger sizes the proofs in Section 2.3 fail.
An interesting range of parameters that remains open is that of $p = \frac{d}{n}$ for some large constant $d$. The case of a random planted independent set of size $\frac{c}{d}n$ (for some sufficiently large constant $c > 0$ independent of $d$) was addressed in [FO08]. In such sparse graphs, the planted independent set is unlikely to be the maximum independent set. The main result in [FO08] is a polynomial time algorithm that with high probability finds the maximum independent set in that range of parameters. It would be interesting to see whether the positive results extend to the case of an adversarially planted independent set. We remark that neither Theorem 1.1 nor Theorem 1.3 applies in this range of parameters.

A Bounding the theta function
In this section we will prove Theorem 2.2. Instead of proving exactly this theorem, we will prove a generalization to other values of $p$. Let $c \in (0, 1)$ be an arbitrary constant.

This theorem has a very important corollary, which follows from the Lipschitz property of the Lovász theta function [Lov79].
Corollary A.1. Let $p$ and $k$ be as in Theorem A.2, and let $K \subset V$ be the vertices belonging to the planted clique of $G \sim AG(n, p, k)$. Then, with probability at least $1 - \exp(-2k\log n)$: (i) [...], where $G \setminus T$ denotes the graph $G$ with the vertices of $T$ deleted; (ii) for every subset $S \subset V \setminus K$, if we "add" $S$ to the planted clique by drawing all edges between $S$ and $S \cup K$, then [...] for the resulting graph.

We now prove Theorem A.2. For $G \sim AG(n, p, k)$, its complement graph $\bar{G}$ contains a planted independent set of size $k$, so $\vartheta(\bar{G}) \ge \alpha(\bar{G}) \ge k$. It remains to prove the upper bound. We will use the formulation of the theta function as an eigenvalue minimization problem: $\vartheta(\bar{G}) = \min_M \lambda_1(M)$. Here $M$ ranges over all $n \times n$ symmetric matrices in which $M_{ij} = 1$ whenever $(i, j) \in E$, and $\lambda_1(M)$ denotes the largest eigenvalue of $M$.
The following proposition will be used in the proof of Theorem A.2.

Proposition A.1. Let $k$ and $p$ be as in Theorem A.2. [...]

Proof. For convenience, we will consider the size of $Q$ to be exactly $\mu k$, and consider the set of vertices that have at least $\nu k$ neighbors in $Q$; the addition of an $o(1)$ term does not affect anything in the proof. We shall also use $g(n, p, t)$ as shorthand notation for $g(n, p, \mu, \nu, t)$.

Fix some set $Q \subset V$ of size $\mu k$, and a set $I \subset V \setminus Q$ of size $m$. Let $T(I, Q)$ denote the event that every vertex in $I$ has at least $\nu k$ neighbors in $Q$. Consider a random bipartite graph with parts $I$ and $Q$ and edge probability $p$, and let $e(I, Q)$ be the number of edges between $I$ and $Q$. It is clear that $E[e(I, Q)] = m\mu kp$, and the event $T(I, Q)$ implies the event $\{e(I, Q) \ge m\nu k\}$. Hence [...]. There are $\binom{n}{m} \le \left(\frac{ne}{m}\right)^m \le \exp(2m\log n)$ possible vertex sets $I$, and $\binom{n}{k} \le \exp(2k\log n)$ possible subsets $Q$. Let $T_m$ be the event that for at least one such choice of $I$ and $Q$ the event $T(I, Q)$ holds. By the union bound, [...].

We derive an upper bound on $\vartheta(\bar{G})$ by presenting a particular matrix $M$ for which $\vartheta(\bar{G}) \le \lambda_1(M) \le k' \le k + a(n, p)$. We use $d(i, Q)$ to denote the number of edges between the vertex $i \in V$ and the set $Q$. The symmetric matrix $M$ we choose is as follows.
• The upper left $k' \times k'$ block is the all-ones matrix of order $k'$.
• The lower right block, of size $(n - k') \times (n - k')$, is [...].
• The lower left block is a matrix $B$ with entries $b_{ij}$, [...]. Observe that every row of $B$ sums up to zero.
• The upper right block is the transpose of the lower left block $B$.
We rewrite $b_{ij}$ for $(i, j) \notin E$ in the following way: [...].
The vector with 1 in its first $k'$ entries and 0 in the remaining $n - k'$ coordinates is an eigenvector of $M$ with eigenvalue $k'$. To show that $k'$ is the largest eigenvalue, it suffices to prove that $\lambda_2(M) < k'$. We represent $M$ as a sum of three symmetric matrices, $M = U + V + W$, and apply Weyl's inequality [HJ12]: $\lambda_2(M) \le \lambda_1(U) + \lambda_2(V) + \lambda_1(W)$. The matrices $U$, $V$ and $W$ are as follows.
• The matrix $U$ is derived from the adjacency matrix of the original graph $G' \sim G(n, p)$: $U_{ii} = 0$ for all $i$; $U_{ij} = 1$ if $(i, j) \in E$ (in $G'$), and $U_{ij} = -p/(1-p)$ for all other $i \ne j$.
• The matrix $V$ describes the modification that $G'$ undergoes by planting the clique $K$ and extending it to $Q$. For $i, j \le k'$ we have $V_{ij} = \frac{1}{1-p}$ if $(i, j)$ was not an edge of $G'$ (and $V_{ij} = 0$ if it was). All other entries are 0.
• The matrix $W$ is the correction matrix for having the row sums of $B$ equal to 0. In its lower left block [...]. Its upper right block is the transpose of the lower left block. All other entries are 0.

Claim A.1. With probability at least $1 - \exp(-2k\log n)$, for every possible choice of $k$ vertices by the adversary, we have [...].

To bound the eigenvalues of $U$, $V$ and $W$, we shall use upper bounds on the eigenvalues of random matrices, as they appear in [Vu07].
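The decomposition step relies on Weyl's inequality, which for three symmetric matrices gives $\lambda_2(U + V + W) \le \lambda_1(U) + \lambda_2(V) + \lambda_1(W)$. As a sanity check, here is a small numerical verification on arbitrary symmetric matrices (a sketch; these are generic random matrices, not the specific $U$, $V$, $W$ of the proof):

```python
import numpy as np

rng = np.random.default_rng(1)

def rand_sym(n):
    """A random symmetric matrix."""
    A = rng.standard_normal((n, n))
    return (A + A.T) / 2

def eig_desc(A):
    """Eigenvalues sorted in decreasing order, so index 0 is lambda_1."""
    return np.sort(np.linalg.eigvalsh(A))[::-1]

n = 50
U, V, W = rand_sym(n), rand_sym(n), rand_sym(n)

lhs = eig_desc(U + V + W)[1]                            # lambda_2(M)
rhs = eig_desc(U)[0] + eig_desc(V)[1] + eig_desc(W)[0]  # lambda_1(U)+lambda_2(V)+lambda_1(W)
assert lhs <= rhs + 1e-9
```

The inequality holds for every triple of symmetric matrices, which is why the proof is free to split $M$ into whatever three parts are most convenient to bound separately.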
Theorem A.3. There are constants $C'$ and $C''$ such that the following holds. Let $a_{ij}$, $i, j \in [n]$, be independent random variables, each of which has mean 0 and variance at most $\sigma^2$ and is bounded in absolute value by $L$, where $\sigma \ge C'' L \frac{\log^2 n}{\sqrt{n}}$. Let $A$ be the corresponding $n \times n$ matrix. Then with probability at least $1 - O(1/n^3)$, $\lambda_1(A) \le 2\sigma\sqrt{n} + C'(L\sigma)^{1/2} n^{1/4} \log n$.

The bound holds regardless of what the diagonal elements of $A$ are, since by subtracting the diagonal we may decrease the eigenvalues by at most $L$.
The matrix $U$ is a random matrix, as it is generated from the graph $G' \sim G(n, p)$. The entries of $U$ have mean zero, $|U_{ij}| = O(1)$ since $p$ is bounded by a constant $c < 1$, and the variance is $\frac{p}{1-p}$, [...] for all non-negative $t$. Hence, to show that $\lambda_1(U)$ does not exceed $\lambda_U$ by too much with extremely high probability, it suffices to show that the probability that $\lambda_1(U)$ deviates from its mean is exponentially small in $k\log n \simeq w(n)^{1/2}\log n$. The result of Alon, Krivelevich and Vu [AKV02] ensures that the eigenvalues of $U$ are well concentrated around their means.
Theorem A.4 (Concentration of eigenvalues). For $1 \le i \le j \le n$, let $a_{ij}$ be independent real random variables with absolute value at most 1. Define $a_{ji} = a_{ij}$ for all $i, j$, and let $A$ be the $n \times n$ matrix with entries $a_{ij}$. Then for all $1 \le s \le n$ and all $t = \omega(\sqrt{s})$, $\Pr\left[|\lambda_s(A) - \mathbb{E}[\lambda_s(A)]| \ge t\right] \le \exp\left(-\Omega(t^2/s^2)\right)$. The same estimate holds for $\lambda_{n-s+1}(A)$.
Taking $t = \Theta(w(n)^{1/4}\log n)$, from Theorem A.4 we get that $\lambda_1(U)$ exceeds its mean by at most $O(w(n)^{1/4}\log n)$ with probability at least $1 - \exp\left(-\Omega(k\log^2 n)\right)$. Note that the bound holds for any choice of the adversary, as the matrix $U$ does not depend on the vertices of the planted clique and is determined by the initial graph $G(n, p)$ only.
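For intuition about the size of $\lambda_1(U)$, the following sketch samples a matrix with the distribution of $U$ for $p = \frac12$ (entries 1 for edges, $-p/(1-p)$ otherwise, zero diagonal) and checks that its largest eigenvalue is on the order of $2\sigma\sqrt{n}$; the constant 3 below is illustrative slack, not a constant from the proofs:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 400, 0.5

# Symmetric matrix with the distribution of U: sample the upper triangle
# of a G(n, p) graph, put 1 on edges and -p/(1-p) on non-edges.
E = np.triu(rng.random((n, n)) < p, 1)
A = np.where(E | E.T, 1.0, -p / (1 - p))
np.fill_diagonal(A, 0.0)

lam1 = np.linalg.eigvalsh(A).max()
sigma = (p / (1 - p)) ** 0.5        # per-entry standard deviation
assert lam1 < 3 * sigma * n ** 0.5  # close to 2*sigma*sqrt(n), plus lower-order terms
```

For $p = \frac12$ the entries are $\pm 1$ with $\sigma = 1$, so $\lambda_1(U)$ comes out near $2\sqrt{n} = 40$ here, far below the eigenvalue $k'$ of the planted block.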
As for the matrix $V$, we shift it so that all its entries have mean 0. Precisely, we consider the matrix $V'$ such that for all $i, j > k'$ we have [...], and for $i < j \le k'$ we have $V'_{ij} = V_{ij} - 1$, which is either $-1$ (with probability $p$) or $p/(1-p)$ (with probability $1-p$). Basically, $V'$ is a copy of the matrix $U$ of order $k'$, so from Theorem A.3 we can obtain a bound on $\lambda_2(V)$, [...] for some constant $C' > 0$; we denote this bound by $\Lambda_{V'}$. Similarly to $\lambda_1(U)$, we have [...]. We would like these bounds to hold for any choice of the adversarial $k$-clique. There are $\binom{n}{k}$ possible choices, so by setting $t = \Theta(w(n)^{1/4}\log n)$ in the bound above and applying the union bound over all possible choices of the $k$-clique, we prove the bound for any choice of the adversary with probability at least $1 - \exp\left(-\Omega(k\log^2 n)\right)$. It remains to bound $\lambda_1(W)$. We will use the trace of $W^2$: [...].
By the definition of the set $Q$, for every [...]. It turns out that we can always bound the sum above.
Theorem A.5. With probability at least $1 - \exp(-2k\log n)$, for every possible choice of $k$ vertices by the adversary, [...].

The proof is rather technical and is presented in Appendix D. From Theorem A.5 we get [...]. Combining the bounds for $\lambda_1(U)$, $\lambda_2(V)$ and $\lambda_1(W)$, we get [...]. By choosing $C \ge \frac{5}{1-p}$ in $k = Cw(n)^{1/2}$, we guarantee that the expression above is less than $k'$. Therefore, $k'$ is indeed the largest eigenvalue of the matrix $M$, and $\vartheta(\bar{G}) \le k' \le k + a(n, p)$ for every choice of the adversarial $k$-clique with extremely high probability. This finishes the proof of Theorem A.2.

B Main algorithm
In this section we prove Theorem 1.1. The statement holds for every adversarial planting strategy (choice of $k$ vertices as a function of $G' \sim G(n, \frac{1}{2})$), and the probability of success is taken over the choice of $G' \sim G(n, \frac{1}{2})$.
As with Theorem 2.2 and Theorem A.2, we will prove a more general version of the theorem, considering $G \sim AG(n, p, k)$ for a wide range of values of $p$, and not just $p = \frac{1}{2}$. We first prove such a theorem when $k \ge C\sqrt{np}$ for a sufficiently large constant $C$. Afterwards, we shall extend the proof to the case that $C$ can be an arbitrarily small constant.

Theorem B.2. Let $c \in (0, 1)$ be an arbitrary constant. Consider an arbitrary function $w(n)$ such that $n^{2/3} \ll w(n) \le cn$. Let $G \sim AG(n, p, k)$, where $p = w(n)/n$ and $k \ge \frac{5}{1-p} w(n)^{1/2}$. There is an (explicitly described) algorithm running in time $n^{O(1)}$ which almost surely finds the maximum clique in $G$, for every adversarial planting strategy.
Proof. As described in Section 2.1, we solve the optimization problem, finding the optimal orthonormal representation $\{s_i\}$ and handle $h$, using the SDP formulation. Suppose that we solved $\vartheta(\bar{G})$ in (4) for $G \sim AG(n, p, k)$ (with $p$ and $k$ as in Theorem B.2). By Theorem A.2, $k \le \vartheta(\bar{G}) \le k + a(n, p)$. Let $G = (V, E)$, and let $K$ denote the set of vertices chosen by the adversary.
As $h$ and the $s_i$ are unit vectors, we have that for all $i \in V$, $(h \cdot s_i)^2 \le 1$. [...] it must be connected to the whole set $K_{3/4}$. The set $K_{3/4}$ has size at least $k - 4a(n, p)$, so by Corollary A.2 there are fewer than $a(n, p)$ vertices $i \in V \setminus K$ with $(h \cdot s_i)^2 \ge 3/4$. As a result, for the set $H$ of vertices $i \in V$ with $(h \cdot s_i)^2 \ge 3/4$, [...]. Let $F \subset V$ be the set of all vertices that have at least $3k/4$ neighbors in $H$. Similarly to Lemma 2.2, with extremely high probability $F$ contains the maximum clique in $G$. Moreover, by Proposition A.1 there are at most $O(a(n, p))$ vertices from $V \setminus H$ that have at least $3k/4$ neighbors in $H$, implying [...]. It follows that the maximum clique of $G[F]$, the subgraph of $G$ induced on $F$, is the maximum clique of $G$. Moreover, $K \subseteq F$, so $F$ contains a clique of size at least $k$, and $|F| \le k + O(a(n, p))$. The maximum clique in $G[F]$ can be found in polynomial time by a standard algorithm (used, for example, to show that vertex cover is fixed parameter tractable). For every non-edge in the subgraph induced on $F$, at least one of its end-vertices needs to be removed, so we try both possibilities in parallel, and recurse on each subgraph that remains. Each branch of the recursion is terminated either when the graph is a clique, or when $k$ vertices remain (whichever happens first). At least one of the branches of the recursion finds the maximum clique. The depth of the recursion is at most [...]. This running time is polynomial if $p$ is upper bounded by a constant smaller than 1. This finishes the description of the algorithm, proving Theorem B.2.
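The branching step at the end of the proof can be sketched as follows. This is a toy implementation under hypothetical inputs (`adj` an adjacency-set dictionary, `F` the candidate set, `k` the clique size); it branches on a non-edge by deleting either endpoint, and prunes branches that fall below the best clique found so far:

```python
import itertools

def max_clique_by_branching(adj, F, k):
    """For every non-edge of G[S], at least one endpoint is outside any
    clique of S, so try removing each endpoint in turn.  The recursion
    depth is bounded by |F| - k, which drives the running time."""
    def find_non_edge(S):
        for u, v in itertools.combinations(sorted(S), 2):
            if v not in adj[u]:
                return u, v
        return None                     # S induces a clique

    best = set()

    def rec(S):
        nonlocal best
        if len(S) < max(k, len(best) + 1):
            return                      # cannot beat the current best
        ne = find_non_edge(S)
        if ne is None:
            best = set(S)               # S is a clique larger than best
            return
        u, v = ne
        rec(S - {u})
        rec(S - {v})

    rec(set(F))
    return best
```

On a 6-vertex toy graph whose edges are `{(0,1),(0,2),(0,3),(1,2),(1,3),(2,3),(3,4),(4,5)}`, calling `max_clique_by_branching(adj, range(6), 3)` recovers the clique `{0, 1, 2, 3}`.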
By the above claim and our choice of $s$, we now have that $k - s > \frac{5}{1-p}\sqrt{|N(S)|p}$, where $k - s$ is the size of the clique planted in $G'_{S,K}$. Consequently, we are in a position to apply Theorem B.2 to $G[N(S)]$, and conclude that the algorithm given in the proof of the theorem finds the maximum clique in $G[N(S)]$. This indeed holds almost surely for every particular choice of $K \subset V$ and $S \subset K$, but we are not done yet, as we want this to hold for all choices of $K$ and $S$ in $G' \sim G(n, p)$. To reach such a conclusion we need to analyse the failure probability of Theorem B.2 more closely, so as to be able to take a union bound over all choices of $K$ and $S$. This union bound involves $\binom{n}{k} \cdot \binom{k}{s} \simeq \exp(k\log n)$ events (the term $\binom{k}{s}$ is negligible compared to $\binom{n}{k}$, because $s$ is a constant). Indeed, the failure probability of Theorem B.2 can withstand such a union bound. This is because the proof of Theorem B.2 is based on earlier claims whose failure probability is at most $\exp(-2k\log n)$. This upper bound on the failure probability is stated explicitly in Theorem A.2 and Corollary A.2, and can be shown to also hold in claims that do not state it explicitly (such as Proposition 2.1, Lemma 2.1 and Lemma 2.2, and versions of them generalized to arbitrary $p$), using analysis similar to that of the proof of Proposition A.1.

C.1 Maximum Independent Set in balanced graphs
Definition C.1. Given a graph $H$, denote its average degree by $\alpha$. A graph $H$ is balanced if every induced subgraph of $H$ has average degree at most $\alpha$.
Theorem C.1. For any $0 < \eta \le 1$, determining the size of the maximum independent set in a balanced graph with average degree $2 < \alpha < 2 + \eta$ is NP-hard.
Proof. It is well known that given a parameter $k$ and a 3-regular graph $H$, determining whether $H$ has an independent set of size $k$ is NP-hard. For simplicity of upcoming notation, let $2n$ denote the number of vertices in $H$. Given a positive integer parameter $t$, we describe a polynomial time reduction $R$ such that given a 3-regular graph $H$ it holds that:
• $R(H)$ is a balanced graph with average degree $2 + \frac{1}{3t+1}$.
• $R(H)$ has an independent set of size $k + 3nt$ if and only if $H$ has an independent set of size $k$.
By choosing $t > \frac{1}{3\eta}$, the theorem is proved. Let $H$ be a 3-regular graph on $2n$ vertices. The graph $R(H)$ is obtained from $H$ by replacing every edge $(u, v)$ of $H$ by a path with $2t$ intermediate vertices that connects $u$ and $v$. There are $3n$ edges in $H$, so by doing so we add $2t \cdot 3n$ vertices of degree 2. The average degree of the resulting graph $R(H)$ is $\frac{2 \cdot 3n(2t+1)}{2n + 6nt} = 2 + \frac{1}{3t+1}$.

To see that the graph $R(H)$ is balanced, consider a subset of vertices $S^* \subseteq R(H)$, and let $\alpha^* > 2$ denote the average degree of the induced subgraph $R(H)[S^*]$. W.l.o.g. we can assume that $R(H)[S^*]$ has minimum degree at least 2 (because if $R(H)[S^*]$ has a vertex of degree at most 1, removing it would result in a subgraph of higher average degree). Let $V_3$ be the set of vertices of degree 3 in $R(H)[S^*]$. All remaining vertices of $R(H)[S^*]$ have degree 2. As no two degree 3 vertices in $R(H)$ are neighbors, $R(H)[S^*]$ is composed of degree 3 vertices and non-empty disjoint paths connecting them. As no path connecting two degree 3 vertices in $R(H)$ has fewer than $2t$ vertices (it may have more than $2t$ vertices, if it goes through original vertices of $H$), the number of degree 2 vertices in $R(H)[S^*]$ is at least $\frac{3|V_3|}{2} \cdot 2t$. Hence $\alpha^* \le 2 + \frac{1}{3t+1}$, as desired.

Every independent set $I$ of size $k$ in $H$ gives rise to an independent set of size $k + 3nt$ in $R(H)$, because in $R(H)$ we can take the vertices of $I$ and $t$ vertices from each of the $3n$ paths (at least one of the two end vertices of each path is not adjacent to a vertex in $I$). Likewise, every independent set $I$ of size $k + 3nt$ in $R(H)$ gives rise to an independent set of size $k$ in $H$. Note that $I$ contains at most $t$ vertices from any single path of $R(H)$, and moreover, can be assumed to contain exactly $t$ vertices from any single path (if $I$ contains fewer than $t$ vertices from the path connecting $u$ and $v$, then by taking all even vertices of the path one gains a vertex, and this compensates for the at most one vertex that is lost from $I$ due to the possible need to remove $v$ from $I$). As $I$ contains $3nt$ path vertices, its remaining $k$ vertices are from $H$. Moreover, they form an independent set in $H$ (no two vertices $u$ and $v$ adjacent in $H$ can be in this set, because then the path connecting them in $R(H)$ could not contribute $t$ vertices to $I$).
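The subdivision reduction above is easy to implement and check. The sketch below (with hypothetical helper names) builds $R(H)$ for $K_4$, which is 3-regular, and verifies that the average degree is exactly $2 + \frac{1}{3t+1}$:

```python
import itertools

def reduce_3regular(h_edges, num_vertices, t):
    """Replace every edge (u, v) of a 3-regular graph H by a path with
    2t intermediate vertices, as in the reduction R of Theorem C.1."""
    edges = []
    nxt = num_vertices                  # fresh labels for path vertices
    for u, v in h_edges:
        path = [u] + list(range(nxt, nxt + 2 * t)) + [v]
        nxt += 2 * t
        edges.extend(zip(path, path[1:]))
    return edges, nxt                   # edge list and total vertex count

# K4 is 3-regular on 4 vertices (so 2n = 4 and there are 3n = 6 edges)
h_edges = list(itertools.combinations(range(4), 2))
t = 2
edges, n_total = reduce_3regular(h_edges, 4, t)
avg_deg = 2 * len(edges) / n_total
assert abs(avg_deg - (2 + 1 / (3 * t + 1))) < 1e-12
```

Here each of the 6 edges becomes a path with $2t+1 = 5$ edges, giving 30 edges on $4 + 24 = 28$ vertices, so the average degree is $60/28 = 2 + \frac{1}{7}$, matching the formula.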

C.2 Notation to be used in the proof of Theorem 2.3
In the coming sections we prove Theorem 2.3. For simplicity of the presentation (and without affecting the implications towards the proof of Theorem 1.3), we describe the distributions $G_H(n, p)$, $G_H(n, p, k)$ and $A_H G(n, p, k)$ in a way that differs from their description in Section 2.3. Based on these descriptions, we will present $X_H(G)$, a key random variable associated with these distributions. This random variable is easier to work with than the random variable referred to in Lemma 2.3, and hence we shall later slightly change the formulation of Lemma 2.3 (without affecting the correctness of Theorem 2.3).
It will be convenient for us to think of $G$ as an $n$-vertex graph with vertices numbered from 1 to $n$, and of $H$ as an $m$-vertex graph with vertices numbered from 1 to $m$. For simplicity, we assume that $m$ divides $n$ (this assumption can easily be removed with only negligible effect on the results). Given an $n$-vertex graph $G$, we partition the vertex set of $G$ into $m$ disjoint subsets of vertices, each of size $\frac{n}{m}$. Part $i$, for $1 \le i \le m$, contains the vertices $[(i-1)\frac{n}{m} + 1, i\frac{n}{m}]$. A vertex set $S$ of size $m$ that contains one vertex in each part is said to obey the partition.

Definition C.2. Let $H$ be an arbitrary $m$-vertex graph, let $n$ be such that $m$ divides $n$, let $k' \le m$ be a parameter (specifying the conjectured size of the maximum independent set in $H$), and let $k$ satisfy $k' \le k \le n - m$. We say that $G_H$ is distributed by $G_H(n, p)$ (for $p \in (0, 1)$) and that $G_H$ is distributed by $G_H(n, p, k)$ if they are created by the following random process.
1. Generate a random graph G ′ ∼ G(n, p), with a partition of its vertex set into m parts.
2. Choose a random subset $M$ of $m$ vertices from $G'$ that obeys the partition.

Though the description is different, it is not difficult to show that the distributions $G_H(n, p)$ and $G_H(n, p, k)$ are identical to the corresponding distributions described in Section 2.3.

We also change the description of the distribution $A_H G(n, p, k)$ from Section 2.3 in a way analogous to the above, by fixing a partition of the vertices of $G' \sim G(n, p)$ and requiring the adversary to choose in $G'$ an induced copy of $H$ that obeys the partition (vertex $i$ of $H$ must be in part $i$ of the partition, for every $1 \le i \le m$). As in Section 2.3, the adversary also plants a random independent set of size $k - k'$ among the non-neighbors of $H$. If either $G'$ does not have an induced copy of $H$ that obeys the partition, or there are too few non-neighbors of $H$, we say that the adversary fails, and we revert to the default procedure of planting a random independent set of size $k$ in $G'$.
We note that there is a (negligible) difference in the probability of failure in the above description of $A_H G(n, p, k)$ compared to that of Section 2.3, because it might be that $G'$ has an induced copy of $H$, but no induced copy of $H$ that obeys the partition.
For a graph $G$ and a given partition, $X_H(G)$ denotes the number of sets $S$ of size $m$ obeying the partition such that the subgraph of $G$ induced on $S$ is $H$ (with vertex $i$ of $H$ in part $i$ of the partition, for every $1 \le i \le m$). For a graph $G$ chosen at random from some distribution, $X_H(G)$ is a random variable.
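For toy sizes, $X_H(G)$ can be computed by brute force. The sketch below (a hypothetical helper, exponential in $m$) enumerates the sets obeying the partition and counts those that induce exactly $H$:

```python
import itertools

def count_copies_obeying_partition(n, adj, h_edges, m):
    """X_H(G): sets S with one vertex per part (vertex i of H drawn from
    part i) whose induced subgraph is exactly H.  Brute force."""
    size = n // m
    parts = [range(i * size, (i + 1) * size) for i in range(m)]
    count = 0
    for S in itertools.product(*parts):
        if all((S[j] in adj[S[i]]) == (((i, j) in h_edges) or ((j, i) in h_edges))
               for i, j in itertools.combinations(range(m), 2)):
            count += 1
    return count
```

For example, with $G = K_6$, $m = 3$ (parts of size 2) and $H$ a triangle, every one of the $2^3 = 8$ transversals induces a triangle, so the count is 8; with $H$ a path on 3 vertices, the required non-edge never occurs in $K_6$ and the count is 0.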

C.3 Proof of Lemma 2.3
As noted in Appendix C.2, we slightly change Lemma 2.3. Instead of referring to all induced copies of $H$, we refer only to induced copies of $H$ that obey the partition. The random variable $X_H(G)$ denotes their number. The main technical content of this modified Lemma 2.3 is handled by the following lemma.
Lemma C.1. Let $0 < \varepsilon < 1/7$ be a constant, and let $G \sim G(n, p)$ be a random graph with $p \in (0, 1)$. Let $H$ be a balanced graph on $m$ vertices with average degree $2 < \alpha < 3$. If $m \le \min\left\{\sqrt{\varepsilon/p},\ 2^{-1/4} p^{\alpha/4}\sqrt{\varepsilon n}\right\}$ (or equivalently, $\varepsilon \ge m^2 p$ and $\varepsilon^2 \ge \frac{2m^4}{n^2 p^{\alpha}}$), then for every $\beta \in [0, 1)$, $\Pr\left[X_H(G) \le \beta\, \mathbb{E}[X_H(G)]\right] \le \frac{4\varepsilon}{(1-\beta)^2}$.

Proof. Let $w(n) := np$, so $p = w(n)/n$. Let $Y_H(G)$ be a random variable counting the number of sets $S$ obeying the partition that have $H$ as an edge-induced subgraph of $G$, but may have additional internal edges. By definition, $X_H(G) \le Y_H(G)$, and a set counted by $Y_H(G)$ is also counted by $X_H(G)$ if it has no internal edges beyond those of $H$. This happens with probability [...]. We will now compute $E[Y_H(G)^2]$. Given an occurrence of $H$, consider another potential occurrence $H'$ that differs from it by $t$ vertices. Since $H$ is a balanced graph, [...]. Hence, the probability that $H'$ is realized, conditioned on $H$ being realized, is at most $p^{\frac{\alpha t}{2}}$. The number of ways to choose $t$ other vertices is $\binom{m}{t}\left(\frac{n}{m}\right)^t$ (first choose $t$ groups out of $m$ in the partition, then choose one vertex in each group). Hence, the expected number of such occurrences is [...]. When $w(n)^{\alpha} \ge 2m^4 n^{\alpha-2}$ the term $t = m - 1$ dominates, and hence the sum is at most roughly [...], and [...]. The last inequality holds since $\varepsilon < 1/7$. We get that [...]. By Chebyshev's inequality we conclude the lemma.

Corollary C.1. Let $\delta$, $\alpha$ and $\rho$ be such that [...]; then the following holds for large enough $n$. Let $G \sim G(n, p)$ be a random graph with $p = n^{\delta-1}$, and let $H$ be a balanced graph on $m = n^{\rho}$ vertices with average degree $\alpha$. Then $\mathbb{E}[X_H(G)] \xrightarrow{n\to\infty} +\infty$, and for every $\beta \in [0, 1)$, [...].

Proof. We first note that $\alpha < \frac{2}{1-\delta}$ implies that $2 - \alpha(1-\delta) > 0$, and hence we can take $\rho > 0$ in the above corollary. The inequality $\rho < (1-\delta)/2$ implies (for large enough $n$) that [...]. The above bounds on $m$ satisfy the requirements of Lemma C.1, and hence the corollary follows.
We now restate and prove Theorem 2.3. Recall that now $G_H(n, p, k)$ and $A_H G(n, p, k)$ refer to the distributions as defined in Appendix C.2, rather than those defined in Section 2.3.

Theorem C.2 (Theorem 2.3 restated). Let $f$ be an arbitrary function that gets as input an $n$-vertex graph and outputs either 0 or 1. Let $p_A$ denote the probability that $f(G) = 1$ when $G \sim A_H G(n, p, k)$, and let $p_H$ denote the probability that $f(G_H) = 1$ when $G_H \sim G_H(n, p, k)$. Then, with $\varepsilon$ and $\beta$ as in Lemma C.1, $p_H \ge \beta\left(p_A - \frac{4\varepsilon}{(1-\beta)^2}\right)$.

The following lemmas establish that with high probability the graph $G \sim G_H(n, p, k)$ has no independent set that has more than $k - k'$ vertices outside the induced copy of $H$. The notation used in these lemmas is as in Definition C.2.

Lemma C.4. Let
Proof. By the first moment method, the probability that there exists an independent set of size $t$ is at most $\binom{n}{t}(1-p)^{\binom{t}{2}}$. [...]

Proof. To prove this, view the process of generating $G_H$ in the following way. Initially, we have the graph $H$ and $n - m$ isolated vertices. Then, for every pair of vertices $u, v$ where $u \in M$ and $v \in V \setminus M$, draw an edge $(u, v)$ with probability $p$. By doing so, we determine the set $W \subseteq V \setminus M$ of vertices that have no neighbors in $H$. Select a random subset $I' \subset W$ of size $k - k'$. For every pair of vertices from $V \setminus M$, if at least one of them does not belong to $I'$, draw an edge with probability $p$.
There are at most $\binom{n}{t} \le n^t$ possible choices for the set $Q$. There are at most $\binom{k-k'}{t} \le \binom{k}{t} \le n^t$ possible choices for the set $Y$ of at most $t$ neighbors of $Q$ within $I'$. The probability that $Q$ has no neighbors in $I' \setminus Y$ is at most [...]. By a union bound, the probability that some subset $Q \subset V \setminus (M \cup I')$ of size $t$ has at most $t$ neighbors in $I'$ is at most $n^{-t}$. The probability of this happening for some value $t \le \frac{k-k'}{2}$ is at most $\sum_t n^{-t} \le \frac{2}{n}$, as desired.
Combining the above lemmas we have the following Corollary.

Corollary C.2. Let
Then with probability at least 1 − 4/n over the choice of graph G ∼ G H (n, p, k), every independent set of size k in G contains at least k ′ vertices in the planted copy of H.
Proof. There are three events that might cause the corollary to fail.
• $G_H(n, p, k)$ fails to produce an output. By Lemma C.3 and the upper bound on $k$, the probability of this event is smaller than $\frac{1}{n}$.
• Even before planting $I'$, there is an independent set larger than [...]. By Lemma C.4 the probability of this event is smaller than $\frac{1}{n}$.
• After planting $I'$, one can obtain an independent set larger than [...] by combining a set $Q \subset V \setminus (M \cup I')$ with some of the vertices of $I'$. As we already assume that Lemma C.4 holds, $Q$ can be of size at most $\frac{k-k'}{2}$. Lemma C.5 then implies that the probability of this event is at most $\frac{2}{n}$.
The sum of the above three failure probabilities is at most $\frac{4}{n}$.
Now we restate and prove Theorem 1.3.
Theorem C.3. For $p = n^{\delta-1}$ with $0 < \delta < 1$, $0 < \gamma < 1$, and $6n^{1-\delta}\log n \le k \le \frac{2}{3}n$, the following holds. There is no polynomial time algorithm that has probability at least $\gamma$ of finding an independent set of size $k$ in $G \sim A\bar{G}(n, p, k)$, unless NP has randomized polynomial time algorithms (NP = RP).
Proof. Suppose for the sake of contradiction that algorithm ALG has probability at least $\gamma$ of finding an independent set of size $k$ in the setting of the theorem. Choose $\rho$ and $\alpha$ as in Corollary C.1. Let $\mathcal{H}$ be the class of balanced graphs of average degree $\alpha$ on $m = n^{\rho}$ vertices. By Theorem C.1, given a graph $H \in \mathcal{H}$ and a parameter $k'$, it is NP-hard to determine whether $H$ has an independent set of size $k'$. We now show how ALG can be leveraged to design a randomized polynomial time algorithm that solves this NP-hard problem with high probability.
Repeat the following procedure $\frac{10\log n}{\gamma}$ times.
• Sample a graph $G \sim G_H(n, p, k)$ (as in Definition C.2).
• Run ALG on $G$. If ALG returns an independent set of size $k$ that has at least $k'$ vertices in the planted copy of $H$, then answer yes ($H$ has an independent set of size $k'$) and terminate.
If $\frac{10\log n}{\gamma}$ iterations are completed without answering yes, then answer no ($H$ probably does not have an independent set of size $k'$).
Clearly, the above algorithm runs in randomized polynomial time. Moreover, if it answers yes then its answer is correct, because it actually finds an independent set of size $k'$ in $H$. It remains to show that if $H$ has an independent set of size $k'$, the probability of failing to give a yes answer is small.
We now lower bound the probability that a single run of ALG on $G \sim G_H(n, p, k)$ outputs yes. Recall that ALG succeeds (finds an independent set of size $k$) with probability at least $\gamma$ over graphs with adversarially planted independent sets, and in particular over the distribution $A_H G(n, p, k)$.
In Corollary C.1, choose $\varepsilon = \frac{\gamma}{25}$ and $\beta = \frac{1}{5}$. Our choice of $m = n^{\rho}$ satisfies the conditions of Lemma C.1, and hence we can apply Theorem C.2. In Theorem C.2, use the function $f$ that has value 1 if ALG succeeds on $G$. It follows from Theorem C.2 that ALG succeeds with probability at least $\beta\left(\gamma - \frac{4\varepsilon}{(1-\beta)^2}\right) = \frac{3\gamma}{20}$ over graphs $G \sim G_H(n, p, k)$. Corollary C.2 implies that there is probability at most $\frac{4}{n}$ that there is an independent set of size $k$ in $G$ that does not contain $k'$ vertices in the induced copy of $H$. Hence a single iteration returns yes with probability at least $\frac{3\gamma}{20} - \frac{4}{n} \ge \frac{\gamma}{10}$ (for sufficiently large $n$). Finally, as we have $\frac{10\log n}{\gamma}$ iterations, the probability that none of the iterations finds an independent set of size $k$ is at most $\left(1 - \frac{\gamma}{10}\right)^{10\log n/\gamma} \le \frac{1}{n}$.
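The amplification arithmetic in the last step is easy to check numerically. The sketch below verifies that $\left(1 - \frac{\gamma}{10}\right)^{10\log n/\gamma} \le \frac{1}{n}$ for a few parameter values (using natural logarithms; the inequality follows from $1 - x \le e^{-x}$, so it holds for any base):

```python
import math

def overall_failure(n, gamma):
    """Failure probability of ceil(10*log(n)/gamma) independent iterations,
    each succeeding with probability at least gamma/10."""
    iters = math.ceil(10 * math.log(n) / gamma)
    return (1 - gamma / 10) ** iters

# (1 - g/10)^(10 ln(n)/g) <= exp(-ln n) = 1/n for every n and 0 < g < 1
for n in (10**3, 10**6):
    for gamma in (0.5, 0.1):
        assert overall_failure(n, gamma) <= 1 / n
```

The bound $1 - x \le e^{-x}$ makes the conclusion immediate, which is why the number of repetitions is set to $\frac{10\log n}{\gamma}$ rather than any smaller multiple.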

D Probabilistic bound
In this section we prove Theorem A.5. Let $c \in (0, 1)$ and $C > 0$ be arbitrary constants. Let $G \sim G(n, p)$, $G = (V, E)$, where $p = w(n)/n$ for $\log^4 n \ll w(n) < cn$, and let $k = C w(n)^{1/2}$. Let $K \subset V$ be arbitrary with $|K| = k$. We number the vertices of $G$ so that $V = [n]$, $K = [k]$ and $V \setminus K = [n] \setminus [k]$. For $k + 1 \le i \le n$, let $X_i$ be a random variable equal to the number of edges from $i$ to vertices in $K$. It is clear that $X_i \sim \mathrm{Bin}(k, p)$, so $\mathbb{E}[X_i] = kp$ and the variance is $\mathbb{V}[X_i] = \mathbb{E}\left[(X_i - kp)^2\right] = kp(1 - p)$.

In other words, the probability that there exist a choice of $k$-subset and an index $1 \le r \le R - 1$ such that for the corresponding set of vertices $M_r$ we have $|M_r| = |M'_r| > 2^{r+1} w(n)^{1/4}$ tends to zero. Earlier we assumed that $M_r = M'_r$, but in general $M_r = M'_r \sqcup M''_r$ and $m_r = m'_r + m''_r$. The opposite case is $M_r = M''_r$; the analysis transfers without any changes, and $|M''_r| \le 2^{r+1} w(n)^{1/4}$ with probability at least $1 - \exp\left(-\Omega(n w(n)^{-1/4})\right)$. Hence, with probability at least $1 - \exp\left(-\Omega(n w(n)^{-1/4})\right)$, for every choice of $K$ and every $1 \le r \le R - 1$ we have $|M_r| = |M'_r| + |M''_r| \le 2^{r+2} w(n)^{1/4}$.

Since $w(n) \gg \log^4 n$, we have $\log n \ll w(n)^{1/4}$, and we set the number of groups to $R = \log(Cn) - \log\left(w(n)^{1/2} \log n\right)$. By Lemma D.1, the above bound holds with probability at least $1 - \exp\left(-\Omega(n w(n)^{-1/4})\right)$. Now we move to the first sum, over $i \in M_R$. We need to prove that with extremely high probability, for any choice of $k$-subset,
$$\sum_{i \in M_R} U_i \le (n - k) k p (1 - p) + o\left(n k p (1 - p)\right).$$
We will do this by applying the Bernstein inequality [Ber46].
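For reference, a standard bounded-variable form of Bernstein's inequality (the variant in [Ber46] may differ in constants) is:

```latex
% Z_1, \dots, Z_N independent, \mathbb{E}[Z_i] = 0, |Z_i| \le M almost surely:
\Pr\left[\, \sum_{i=1}^{N} Z_i \ge t \,\right]
  \le \exp\!\left( - \frac{t^2/2}{\sum_{i=1}^{N} \mathbb{E}\left[Z_i^2\right] + Mt/3} \right)
```

Here one would apply it to centered versions of the summands $U_i$, whose variance is controlled through the computation $\mathbb{V}[X_i] = kp(1-p)$ above.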
and a planted clique of size $\varepsilon \sqrt{n} - t \simeq c\sqrt{n'}$. Now on this new graph $G''$ we can invoke the algorithm based on the theta function. (Technical remark: the proof that $\vartheta(\bar{G}'') \le k + O(\log n)$ uses the fact that Theorem 2.2 holds with extremely high probability. See more details in Appendix B.)
$\xrightarrow{\,n \to \infty\,} +\infty$. Recall the notation $w(n) = np$ and the following bound from the proof of Lemma C.1:

The following example illustrates what might go wrong.

Example 1. Consider a graph $G' \sim G(n, \frac{1}{2})$. In $G'$, first select a random vertex set $T$ of size slightly smaller than $\frac{1}{2} \log n$. Observe that the number of vertices in $G'$ that are in the common neighborhood of all vertices of $T$ is roughly $2^{-|T|} n > \sqrt{n}$. Plant a clique $K$ of size $k$ in the common neighborhood of $T$. In this construction, $K$ is no longer the largest clique in $G$. This is because $T$ (being a random graph) is expected to have a clique $K'$ of size $2 \log |T| \simeq 2 \log \log n$, and $K' \cup K$ forms a clique of size roughly $k + 2 \log \log n$ in $G$. Moreover, as $T$ itself is a random graph with edge probability $\frac{1}{2}$, the value of the theta function on $T$ is roughly $|T|$.

2. Plant in $G'$ a random copy of $H$ (that is, pick $|H|$ random vertices in $G'$ and replace the subgraph induced on them by $H$). We refer to the resulting distribution as $G_H(n, p)$, and to the graph sampled from this distribution as $G_H$. Observe that the number of vertices in $G_H$ that have a neighbor in $H$ is with high probability not larger than $|H| n^\delta \le \frac{n}{2}$.

3. Within the non-neighbors of $H$, plant at random an independent set of size $k - k'$. We refer to the resulting distribution as $G_H(n, p, k)$, and to the graph sampled from this distribution as $\tilde{G}_H$. Observe that with extremely high probability, $\alpha(\tilde{G}_H \setminus H) = k - k'$. Hence we may assume that this indeed holds. If furthermore $\alpha(H) \ge k'$, then $\alpha(\tilde{G}_H) \ge k$.

4. Run ALG on $\tilde{G}_H$. We say that ALGRAND succeeds if ALG outputs an independent set IS of size $k$. Observe that then at least $k'$ vertices of $H$ are in IS, and hence ALGRAND finds an independent set of size $k'$ in $H$.
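Steps 2–4 of this procedure can be sketched as follows; the edge-set graph representation and all function names are illustrative assumptions, not the paper's implementation:

```python
import random

def sample_tilde_G_H(n, p, k, h_size, h_edges, k_prime):
    """Sketch: sample G' ~ G(n, p), replace a random induced subgraph
    by H (given as h_size vertices and h_edges index pairs), then plant
    a random independent set of size k - k' among non-neighbors of H."""
    # Sample G' ~ G(n, p) as a set of edges (i < j).
    edges = {(i, j) for i in range(n) for j in range(i + 1, n)
             if random.random() < p}
    # Pick |H| random vertices M and replace their induced subgraph by H.
    M = random.sample(range(n), h_size)
    edges = {(i, j) for (i, j) in edges if not (i in M and j in M)}
    for a, b in h_edges:  # a, b index into M
        u, v = M[a], M[b]
        edges.add((min(u, v), max(u, v)))
    # Plant an independent set of size k - k' among non-neighbors of M.
    touched = {v for e in edges if e[0] in M or e[1] in M for v in e}
    candidates = [v for v in range(n) if v not in M and v not in touched]
    I = random.sample(candidates, k - k_prime)  # assumes enough candidates
    edges = {(i, j) for (i, j) in edges if not (i in I and j in I)}
    return edges, M, I
```

Any independent set of size $k$ in the resulting graph must then contain at least $k'$ vertices inside the planted copy of $H$, which is what makes a yes answer verifiable.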
Let $\mu, \nu \in (0, 1]$ be arbitrary constants. For any $t \ge 0$, for every set $Q \subset V$ of size $(\mu + o(1))k$, there are at most $g(n, p, \mu, \nu, t) := \frac{6\mu}{(\nu - p\mu)^2 p} \log n + t$ vertices from $V \setminus Q$ that have at least $(\nu - o(1))k$ neighbors in $Q$, with probability at least $1 - \exp\left(-(\nu - p\mu)^2 t k\right)$.

associate vertex $i$ of $H$ with the vertex of $M$ in the $i$th part, and replace the induced subgraph of $G'$ on $M$ by the graph $H$. This gives $G_H \sim G_H(n, p)$.

4. Within the non-neighbors of $M$, plant at random an independent set $I'$ of size $k - k'$, giving the graph $\tilde{G}_H \sim G_H(n, p, k)$. (If $M$ has fewer than $k - k'$ non-neighbors in $G_H$, an event that will happen with negligible probability for our choice of parameters, then we say that this step fails, and instead we plant a random independent set of size $k$ in $G_H$.)

Lemma 2.4 and Theorem 2.3
Lemma C.2 (Lemma 2.4 restated). Let $p(G)$ denote the probability to output $G$ according to $G(n, p)$, and let $p_H(G)$ denote the probability to output $G$ according to $G_H(n, p)$. For every constant $\beta \in [0, 1)$, with probability at least $1 - \frac{4\varepsilon}{(1-\beta)^2}$ over the choice of graph $G \sim G(n, p)$, it holds that $p_H(G) \ge \beta p(G)$.

Proof. Let $e$ be the number of edges in $G$ and consider $p_H(G)$. Out of the $\binom{n}{m}$ options to choose a subset $M$ in $G_H(n, p)$, only $X_H(G)$ options are such that the subgraph induced on $M$ is $H$, so that the resulting graph could be $G$. Since $H$ has average degree $\alpha$, it has exactly $\alpha m / 2$ edges. Note that