Correction: Sandwiching dense random regular graphs between binomial random graphs

Kim and Vu made the following conjecture (Advances in Mathematics, 2004): if $d\gg \log n$, then the random $d$-regular graph ${\mathscr {G}}(n,d)$ can asymptotically almost surely be “sandwiched” between ${\mathscr {G}}(n,p_1)$ and ${\mathscr {G}}(n,p_2)$, where $p_1$ and $p_2$ are both $(1+o(1))d/n$. They proved this conjecture for $\log n\ll d\leqslant n^{1/3-o(1)}$, with a defect in the sandwiching: ${\mathscr {G}}(n,d)$ contains ${\mathscr {G}}(n,p_1)$ perfectly, but is not completely contained in ${\mathscr {G}}(n,p_2)$. The embedding ${\mathscr {G}}(n,p_1) \subseteq {\mathscr {G}}(n,d)$ was improved by Dudek, Frieze, Ruciński and Šileikis to $d=o(n)$. In this paper, we prove Kim–Vu’s sandwich conjecture, with perfect containment on both sides, for all $d$ where $\min \{d, n-d\}\gg n/\sqrt{\log n}$. The sandwich theorem allows translation of many results from ${\mathscr {G}}(n,p)$ to ${\mathscr {G}}(n,d)$, such as Hamiltonicity, the chromatic number, the diameter, etc. It also allows translation of threshold functions of phase transitions from ${\mathscr {G}}(n,p)$ to bond percolation of ${\mathscr {G}}(n,d)$. In addition to sandwiching regular graphs, our results cover graphs whose degrees are asymptotically equal. The proofs rely on estimates for the probability of small subgraph appearances in a random factor of a pseudorandom graph, which is of independent interest.

sandwiched between two binomial random graphs ${\mathscr {G}}(n,p_1)$ and ${\mathscr {G}}(n,p_2)$, the former with average degree slightly less than $d$, and the latter with average degree slightly greater. The formal statement is as follows. Recall that a coupling of random variables $Z_1,\ldots,Z_k$ is a random variable $(\hat{Z}_1,\ldots,\hat{Z}_k)$ whose marginal distributions coincide with the distributions of $Z_1,\ldots,Z_k$, respectively. With slight abuse of notation, we use $(Z_1,\ldots,Z_k)$ as a coupling of $Z_1,\ldots,Z_k$.

Conjecture 1 (Sandwich Conjecture [14]) For $d\gg \log n$, there are probabilities $p_1 = (1-o(1))d/n$ and $p_2 = (1+o(1))d/n$ and a coupling $(G_L, G, G_U)$ such that $G_L \sim {\mathscr {G}}(n,p_1)$, $G_U \sim {\mathscr {G}}(n,p_2)$, $G \sim {\mathscr {G}}(n,d)$ and $P(G_L \subseteq G \subseteq G_U) = 1-o(1)$.
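To make the notion of a coupling concrete, the following minimal sketch builds the standard monotone coupling of two binomial random graphs from shared uniform variables. It only illustrates the definition; it is not the coupling constructed in this paper, and the function name and parameters are chosen here for illustration.

```python
import random
from itertools import combinations


def coupled_binomial_graphs(n, p1, p2, seed=0):
    """Sample (G_L, G_U) with G_L ~ G(n, p1) and G_U ~ G(n, p2).

    Both graphs read the same uniform variable for each potential edge,
    so the marginals are binomial random graphs and G_L is a subgraph
    of G_U whenever p1 <= p2.
    """
    rng = random.Random(seed)
    g_lower, g_upper = set(), set()
    for edge in combinations(range(n), 2):
        u = rng.random()  # one uniform per potential edge, shared by both graphs
        if u < p1:
            g_lower.add(edge)
        if u < p2:
            g_upper.add(edge)
    return g_lower, g_upper


g_l, g_u = coupled_binomial_graphs(30, 0.2, 0.3, seed=1)
print(g_l <= g_u)  # True, since u < p1 implies u < p2
```

The sandwich conjecture asks for a coupling of the same kind with a random regular graph placed between the two binomial graphs, which is much harder because the edges of a regular graph are not independent.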
The condition $d\gg \log n$ in the conjecture is necessary. When $p = O(\log n/n)$, there probably exist vertices in ${\mathscr {G}}(n,p)$ whose degrees differ from $pn$ by a constant factor. Therefore, Conjecture 1 cannot hold for this range of $d$. For $\log n \ll d \ll n^{1/3}/\log^2 n$, Kim and Vu proved a weakened version of the sandwich conjecture where $G \subseteq G_U$ is replaced by a bound on $\Delta(G \setminus G_U)$ (see the precise statement in [14, Theorem 2]). Note that this weakened sandwich theorem already allows direct translation of many results from ${\mathscr {G}}(n,p)$ to ${\mathscr {G}}(n,d)$, including all increasing graph properties such as Hamiltonicity.
Dudek, Frieze, Ruciński and Šileikis [6] improved one side of Kim and Vu's result, $G_L \subseteq G$, to cover all degrees $d$ such that $\log n \ll d \ll n$ and also extended it to the hypergraph setting. In particular, this new embedding theorem allows them to translate Hamiltonicity from binomial random hypergraphs to random regular hypergraphs.
In the graph case, the embedding result of [6] relies on an estimate for the probability of edge appearances in a random t-factor (spanning subgraph with degrees t) of a nearly complete graph S, where t is near-regular and sparse. This can be done using a switching argument, which has already appeared in several enumeration works; see for example [18]. Extending the results of [6] to d = Θ(n) requires new proof methods beyond switchings: one needs to consider the case when S is no longer a nearly complete graph, and components of t are all linear in n.
An immediate corollary of the sandwich conjecture, if it were true, is that one can couple two random regular graphs G 1 ∼ G (n, d 1 ) and G 2 ∼ G (n, d 2 ) such that asymptotically almost surely (a.a.s.) G 1 ⊆ G 2 , if d 2 is sufficiently greater than d 1 .
In fact we conjecture that such a coupling exists as long as $d_2 \geqslant d_1$. However, the weakened versions of the sandwich conjecture, as proved in [14] and [6], are not strong enough to imply the existence of such a coupling, even when $d_2$ is much greater than $d_1$. [...] $G_1 \sim {\mathscr {G}}(n,d_1)$, $G_2 \sim {\mathscr {G}}(n,d_2)$, and $P(G_1 \subseteq G_2) = 1-o(1)$.

Remark 3
This conjecture or some variant of it has already been the subject of speculation and discussion in the community, but we have not found any written work about it. The case when $d_1 = 1$ and $3 \leqslant d_2 \leqslant n-1$ is simple, since almost all $d_2$-regular graphs have perfect matchings, which follows from them being at least $(d_2-1)$-connected [4,15]. Generate a random $d_2$-regular graph $G_2$. If $G_2$ has any perfect matchings, select one at random; otherwise select a random 1-regular graph. By symmetry, this gives a random 1-regular graph which is a subgraph of $G_2$ with probability $1-o(1)$.
The two binomial random graphs in Conjecture 1 differ by o(d/n) in edge density. This gap gives enough room to sandwich a random graph with more relaxed degree sequences. We propose a stronger sandwich conjecture stated as Conjecture 4 below.
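Given a vector $\boldsymbol{d} = (d_1,\ldots,d_n)^T \in {\mathbb {R}}^n$, let $\mathrm{rng}(\boldsymbol{d})$ stand for the difference between the maximum and minimum components of $\boldsymbol{d}$. Denoting $\Delta(\boldsymbol{d}) := \max_j d_j$, we can also write $\mathrm{rng}(\boldsymbol{d}) := \Delta(\boldsymbol{d}) + \Delta(-\boldsymbol{d})$. If $\boldsymbol{d}(G)$ is the degree sequence of a graph $G$, we will also use the notation $\Delta(G) := \Delta(\boldsymbol{d}(G))$.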
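As a quick check of the notation, the sketch below computes the range of a degree sequence both as maximum minus minimum and through the identity with $\Delta(\boldsymbol{d}) + \Delta(-\boldsymbol{d})$. The function names are illustrative and not taken from the paper.

```python
def max_component(d):
    """Delta(d): the largest component of the vector d."""
    return max(d)


def degree_range(d):
    """rng(d) written as Delta(d) + Delta(-d)."""
    return max_component(d) + max_component([-x for x in d])


d = [5, 7, 6, 6]
print(degree_range(d), max(d) - min(d))  # 2 2
```

For a regular sequence the range is zero, so conditions on the range measure how far a sequence is from being regular.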

Conjecture 4 Assume $\boldsymbol{d} = \boldsymbol{d}(n)$ is a near-regular degree sequence such that $\Delta(\boldsymbol{d}) \gg \log n$. Then, there are $p_1 = (1-o(1))\Delta(\boldsymbol{d})/n$ and $p_2 = (1+o(1))\Delta(\boldsymbol{d})/n$ and a coupling $(G_L, G, G_U)$ such that $G_L \sim {\mathscr {G}}(n,p_1)$, $G_U \sim {\mathscr {G}}(n,p_2)$, $G \sim {\mathscr {G}}(n,\boldsymbol{d})$ and $P(G_L \subseteq G \subseteq G_U) = 1-o(1)$.

We categorise the family of near-regular degree sequences into the following three classes.

In this paper, we confirm Conjecture 4 for all dense near-regular $\boldsymbol{d}$, and we confirm Conjecture 1 for all $d$ where $\min\{d, n-d\} \gg n/\sqrt{\log n}$. This proves the sandwich conjecture by Kim and Vu for asymptotically almost all $d$. We will treat sparse and co-sparse near-regular degree sequences in a subsequent paper, as the proof techniques used for those ranges are very different from this work.

Main results
Throughout the paper we assume that $\boldsymbol{d}$ is a realisable degree sequence, i.e. ${\mathscr {G}}(n,\boldsymbol{d})$ is nonempty. This necessarily requires that $\boldsymbol{d}$ has nonnegative integer coordinates and even sum. All asymptotics in the paper refer to $n \rightarrow \infty$, and there is an implicit assumption that statements about functions of $n$ hold when $n$ is large enough. For two sequences of real numbers $a_n$ and $b_n$, we say $a_n = o(b_n)$ if $b_n \ne 0$ eventually and $\lim_{n\rightarrow \infty} a_n/b_n = 0$. We say $a_n = O(b_n)$ if there exists a constant $C > 0$ such that $|a_n| \leqslant C|b_n|$ for all (large enough) $n$. We write $a_n = \omega(b_n)$ or $a_n = \Omega(b_n)$ if $a_n > 0$ always and $b_n = o(a_n)$ or $b_n = O(a_n)$, respectively. If both $a_n$ and $b_n$ are positive sequences, we will also write $a_n \ll b_n$ if $a_n = o(b_n)$, and $a_n \gg b_n$ if $a_n = \omega(b_n)$. Our contributions towards Conjectures 1 and 4 are given by the following theorems.
This section is organised as follows. First, we discuss some important properties that immediately translate from G (n, p) to random graphs with given degrees by our sandwich theorems. Then, we show that Theorem 5 and Theorem 6 follow from a more general and more accurate theorem for embedding G (n, p) inside G (n, d) (see Theorem 8 in Sect. 2.2). The proof of the embedding theorem is long and technical so we postpone it till Sect. 4. Nevertheless, in Sect. 2.3, we state its key ingredient, which may be of independent interest: the probability estimate for a random factor in a pseudorandom graph to contain or avoid a given small set of edges.

Translation from G (n, p) to G (n, d)
Our sandwich theorem allows translation of many results from binomial random graphs to random graphs with dense near-regular degree sequences. Some of the translations can already be obtained from a one-sided sandwich, e.g. the monotone properties. For instance, we can immediately transfer properties of Hamiltonicity or containment of other subgraphs from ${\mathscr {G}}(n,p)$ to ${\mathscr {G}}(n,\boldsymbol{d})$. Other translations require sandwiching on both sides. For example, we can translate graph parameters such as chromatic number, diameter, and independence number. We refer the reader to the conference version of this work [10, Section 3] for these applications. In this section, we only give one example showing how to translate threshold functions of phase transitions from ${\mathscr {G}}(n,p)$ to those of the random graph obtained by edge percolation on ${\mathscr {G}}(n,\boldsymbol{d})$.
We say Γ has a sharp threshold f (n) in G (n, p) if for every fixed ε > 0, The concept of (sharp) threshold extends naturally to other random graph models such as G (n, m), G (n, d) and G (n, d) where d is near-regular.

Theorem 7 (Percolation on G (n, d)) Assume d is near-regular and dense. Let G ∼ G (n, d) and G p be the subgraph of G obtained by independently keeping each edge with probability p. Let Q be a monotone property and let th(Q) denote a (sharp) threshold function of Q in G(n, p). Then (n/Δ(d)) · th(Q) is a (sharp) threshold function of Q in G p .
We give one example of Theorem 7. A giant component in ${\mathscr {G}}(n,p)$ is a component of size linear in $n$. Determining the sharp threshold of the emergence of a giant component in ${\mathscr {G}}(n,p)$ is a remarkable landmark result in random graph theory. The emergence threshold of a giant component in other random graph models has also been extensively studied. For instance, the emergence threshold of a giant component in $G_p$ is known to be $1/(d-1)$ in the special case where $G \sim {\mathscr {G}}(n,d)$ with $d \geqslant 3$, following from a sequence of results [15,16,22]. Theorem 7 extends this result to near-regular degree sequences where $\Delta(\boldsymbol{d}) = \Omega(n)$.

Corollary 2 (Giant component) Assume $\boldsymbol{d}$ is near-regular and $\Delta(\boldsymbol{d}) = \Omega(n)$. The emergence of a giant component in $G_p$ has a sharp threshold $1/\Delta(\boldsymbol{d})$.
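The rescaling in Theorem 7 can be checked with exact arithmetic. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the values of $n$ and $\Delta(\boldsymbol{d})$ are arbitrary, and the only outside fact used is the classical giant-component threshold $1/n$ in ${\mathscr {G}}(n,p)$.

```python
import random
from fractions import Fraction


def percolate(edges, p, rng):
    """Return the edges of G_p: each edge of G is kept independently with probability p."""
    return [e for e in edges if rng.random() < p]


def translated_threshold(n, max_degree, th_gnp):
    """Threshold in G_p given by the rescaling (n / Delta(d)) * th(Q)."""
    return Fraction(n, max_degree) * th_gnp


n, max_degree = 1000, 400  # illustrative values only
giant_th = translated_threshold(n, max_degree, Fraction(1, n))
print(giant_th)  # 1/400, i.e. 1/Delta(d) as in Corollary 2
```

Any other monotone property is handled in the same way by replacing the ${\mathscr {G}}(n,p)$ threshold passed as the last argument.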

Embedding theorem
Our sandwich theorems are corollaries of the following more general theorem for embedding ${\mathscr {G}}(n,p)$ inside ${\mathscr {G}}(n,\boldsymbol{d})$. In fact, the embedding theorem provides better $p_1$ and $p_2$ and sharper bounds on the probability of $G_L \subseteq G \subseteq G_U$ than Theorems 5 and 6.

Theorem 8 (The embedding theorem) Let $\boldsymbol{d} = \boldsymbol{d}(n) \in {\mathbb {N}}^n$ be a degree sequence and $\xi = \xi(n) > 0$ be such that $\xi(n) = o(1)$. Denote $\Delta = \Delta(\boldsymbol{d})$. Assume $n \cdot \mathrm{rng}(\boldsymbol{d}) \leqslant \xi\Delta(n-\Delta)$ and $n-\Delta \gg \xi\Delta \gg n/\log n$. Then there exist $p = (1-O(\xi))\Delta/n$, and a coupling $(G_L, G)$ with $G_L \sim {\mathscr {G}}(n,p)$ and $G \sim {\mathscr {G}}(n,\boldsymbol{d})$

We prove Theorem 8 in Sect. 4 using the coupling procedure described in Sect. 3. The proposition below shows that the probability bound of Theorem 8 is tight up to an additional three powers of $\log n$ in the exponent.

Lemma 1
Assume $\boldsymbol{d}$ is $d$-regular, i.e. all components equal to $d$. Let $(G_L, G)$ be any coupling such that $G_L \sim {\mathscr {G}}(n,p)$ and $G \sim {\mathscr {G}}(n,\boldsymbol{d})$, where $p(n-1) \leqslant d$. Then In particular, if $p$ is defined as in Theorem 8, then The given bound follows from comparing values $\{d+1,\ldots,d+d^{1/2}\}$ of $\mathrm{Bin}(n-1,p)$ to the same values of $\mathrm{Bin}(n-1,d/(n-1))$. For the second part, observe that the assumptions imply $\xi \gg 1/\log n$.
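The comparison of binomial values can be explored numerically. If a fixed vertex of $G_L$ has degree larger than $d$, then $G_L$ cannot be a subgraph of a $d$-regular graph, so the probability that $\mathrm{Bin}(n-1,p)$ lands in $\{d+1,\ldots,d+\lfloor d^{1/2}\rfloor\}$ is a lower bound on the failure probability of any coupling. The sketch below evaluates this single-vertex quantity; the function names and parameter values are illustrative, and this is not the bound stated in the lemma.

```python
from math import comb, isqrt


def binom_pmf(m, p, k):
    """P(Bin(m, p) = k)."""
    return comb(m, k) * p**k * (1 - p) ** (m - k)


def degree_overshoot_probability(n, d, p):
    """P(d < Bin(n-1, p) <= d + floor(sqrt(d))) for one vertex of G(n, p)."""
    top = min(d + isqrt(d), n - 1)  # a degree cannot exceed n - 1
    return sum(binom_pmf(n - 1, p, k) for k in range(d + 1, top + 1))


n, d = 200, 100
q_gap = degree_overshoot_probability(n, d, 0.45)           # p(n-1) below d
q_tight = degree_overshoot_probability(n, d, d / (n - 1))  # p(n-1) equal to d
print(q_gap < q_tight)  # True: lowering p makes an overshoot less likely
```

The smaller the gap between $p(n-1)$ and $d$, the larger this probability, which is why the density $p$ cannot be pushed arbitrarily close to $d/n$ without losing control of the coupling.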
Now we prove that Theorem 5 follows from Theorem 8. We can now stitch π and π together to construct a coupling ( n, d) and G U ∼ G (n, p 2 ). First uniformly generate a graph G ∈ G (n, d). Then, conditional on G, generate G L under π and generate G U under π . This yields (G L , G, G U ) with the desired marginal distributions. Moreover, a.a.s.

Proof (of Theorem 5)
Based on Theorem 8 we also establish the following result, which covers Theorem 6.
Complementing gives an embedding of

Forced and forbidden edges in a random factor
As explained later, see Question 1 in Sect. 3, a key step towards proving Theorem 8 is to estimate, to the desired accuracy, the edge probability in a random t-factor S t of a graph S, where t = (t 1 , . . . , t n ) T is a degree sequence.
We will estimate the edge probabilities by enumerating $\boldsymbol{t}$-factors of $S$, using a complex-analytic approach which is presented in detail in Sect. 5. Here, we just give a quick overview. Given $S$, the generating function for subgraphs of $S$ with given degrees is $\prod_{jk\in S}(1+z_j z_k)$. Using Cauchy's integral formula, we find that the number $N(S,\boldsymbol{t})$ of $\boldsymbol{t}$-factors of $S$ is given by
$$N(S,\boldsymbol{t}) = \frac{1}{(2\pi i)^n} \oint \cdots \oint \frac{\prod_{jk\in S}(1+z_j z_k)}{z_1^{t_1+1}\cdots z_n^{t_n+1}}\, dz_1\cdots dz_n.$$
We will derive an asymptotic expression of $N(S,\boldsymbol{t})$ using a multidimensional variant of the saddle-point method. The integral is split into two parts. The first part corresponds to the neighbourhood of saddle points. Using the Laplace approximation, we need to estimate the moment-generating function of a polynomial with complex coefficients of an $n$-dimensional Gaussian random vector. To do this, we apply the general theory based on complex martingales developed in [11]. The second part consists of the integral over the other regions and has a negligible contribution. See Sect. 5 and Theorem 11 for these calculations.
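For very small graphs the coefficient extraction behind this approach can be carried out directly. The following sketch expands the generating function one edge factor at a time and reads off the coefficient that counts factors with the prescribed degrees, then compares it with brute-force enumeration. It is only an exact toy computation with illustrative names; the paper's contribution is the asymptotic analysis of the integral, which this does not attempt.

```python
from itertools import combinations


def count_factors_gf(n, edges, t):
    """Coefficient of z_1^{t_1}...z_n^{t_n} in the product of (1 + z_j z_k) over edges."""
    poly = {(0,) * n: 1}  # exponent vector -> coefficient
    for j, k in edges:
        new_poly = dict(poly)  # contribution of the term 1
        for expo, coeff in poly.items():
            # contribution of the term z_j z_k; drop monomials that already exceed t
            if expo[j] < t[j] and expo[k] < t[k]:
                shifted = list(expo)
                shifted[j] += 1
                shifted[k] += 1
                key = tuple(shifted)
                new_poly[key] = new_poly.get(key, 0) + coeff
        poly = new_poly
    return poly.get(tuple(t), 0)


def count_factors_brute(n, edges, t):
    """Count edge subsets whose degree sequence equals t by full enumeration."""
    count = 0
    for r in range(len(edges) + 1):
        for subset in combinations(edges, r):
            deg = [0] * n
            for j, k in subset:
                deg[j] += 1
                deg[k] += 1
            if deg == list(t):
                count += 1
    return count


k4 = list(combinations(range(4), 2))
print(count_factors_gf(4, k4, (1, 1, 1, 1)), count_factors_gf(4, k4, (2, 2, 2, 2)))  # 3 3
```

The two values are the three perfect matchings and the three Hamiltonian cycles of the complete graph on four vertices.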
Estimating both parts of the integral is highly non-trivial and this analysis was previously done in the literature only for the case when S is the complete graph K n or not far from it, see [1,11,19,20]. Extending these results to a general graph S requires significant improvements of known techniques. Our enumeration result (see Theorem 11) gives an asymptotic value of N (S, t) for S such that every pair of vertices have Θ(Δ 2 (S)/n) common neighbours and under some technical conditions on t.
We also investigate the connection between the random graph S t and the so-called β-model which belongs to the exponential family of random graphs. We show that the probability of containing/avoiding a prescribed small set of edges is asymptotically the same for both models; see Sect. 6 and Theorem 12.
We would like to note that, even for $S = K_n$, Theorem 11 and Theorem 12 extend previously known results. Here, we state a special case of Theorem 12 for the case when the degree sequence $\boldsymbol{t}$ is approximately proportional to the degree sequence of $S$. Here, and throughout the paper, we use $\Vert \cdot \Vert_p$ for $p \in \{1, 2, \infty\}$ to denote the customary vector norms and their induced matrix norms.

(A1) for any two distinct vertices $j$ and $k$ and some constant $\gamma > 0$, we have $\dfrac{\gamma \Delta^2(S)}{n} \leqslant |\{\ell : j\ell \in S \text{ and } k\ell \in S\}| \leqslant \dfrac{\Delta^2(S)}{\gamma n}$;

Let $S_{\boldsymbol{t}}$ be a random $\boldsymbol{t}$-factor of $S$. Then, for any $\varepsilon > 0$, where $m(H^+)$ and $m(H^-)$ denote the number of edges in $H^+$ and $H^-$, respectively. Theorem 10 will be sufficient for us to prove the embedding theorem. In fact, for Theorem 8, we only need to consider the case when $H^+$ is an edge and $H^-$ is empty, for which (A4) is trivial; see Sect. 4. We prove Theorem 10 in Sect. 6.3.

Embedding G (n, p) inside G (n, d)
To prove our embedding theorem, we will use a procedure called Coupling( ) which constructs a joint distribution of (G L , G) where G L ⊆ G a.a.s. and their marginal distributions follow G (n, p) and G (n, d) respectively. The procedure is given in Fig. 1.

The coupling procedure
Procedure Coupling( ) takes a graphical degree sequence d, a positive integer I and a positive real ζ < 1 as an input, and outputs three random graphs G ζ , G, G 0 , all on [n], such that G ∼ G (n, d) and G ζ ⊆ G 0 . Roughly speaking, the procedure constructs (G ζ (t) , G (t) , G 0 (t) ) by sequentially adding edges to the three graphs, and G ζ (t) ⊆ G (t) ⊆ G 0 (t) is maintained up to step I . The outputs G ζ and G 0 of Coupling( ) will be G ζ (I ) and G 0 (I ) , ignoring some technicality. The output G will be a "proper" completion of G (I ) into a graph with degree sequence d. For a careful choice of I and ζ , procedure Coupling( ) typically produces an outcome that G ζ ⊆ G and G − G 0 is "small". Moreover, if I is chosen randomly according to a suitable distribution, which we specify later in this section, then G ζ ∼ G (n, p ζ ) and G 0 ∼ G (n, p 0 ), where p ζ ≈ p 0 for small ζ > 0. (See the definition of p ζ in (2).) Even though we only need the coupling (G L , G) with G L = G ζ for our purposes, it will be convenient to include G 0 in our coupling construction in order to deduce certain properties of G required for our proofs. In rare cases, if certain parameters become too large, Coupling( ) calls another procedure IndSample( ). Procedure IndSample( ) also generates three random graphs G ζ ∼ G (n, p ζ ), G ∼ G (n, d) and G 0 ∼ G (n, p 0 ) but the relation G ζ ⊆ G is not a.a.s. guaranteed. In fact, G will be independent of (G ζ , G 0 ). The main challenge will be to show that the probability for Coupling( ) to call IndSample( ) is rather small.
If M is a multigraph, we write G M if G is the simple graph obtained by suppressing multiple edges in M into single edges. With a slight abuse of notation, we write jk ∈ G (n, d) for the event that jk is an edge in a graph randomly chosen from The details of procedures Coupling( ) and IndSample( ) are shown in Fig. 1. Note that Coupling( ) consists of two loops indexed by a contiguous sequence of values of ι. When we refer to "step ι" or "ι iterations", we refer to the point in Coupling( ) where ι has that value, regardless of which of the two loops we are in.
Our next lemma verifies that $G_\zeta$ and $G_0$ output by Coupling($\boldsymbol{d}$, $I$, $\zeta$) have the desired distributions if $I$ is an integer drawn from a Poisson random variable with a properly chosen mean. (With a slight abuse of notation, we write $I \sim \mathrm{Po}(\mu)$, but note that the argument passed to Coupling( ) is not a random variable but a single integer drawn from the distribution $\mathrm{Po}(\mu)$.) Denote by $N = \binom{n}{2}$ the number of edges in $K_n$.
Proof By the definition of Coupling( ) and IndSample( ), whether IndSample( ) is called or not, the construction for $G_\zeta$ and $G_0$ lasts exactly $I$ steps. In each step $1 \leqslant \iota \leqslant I$, a uniformly random edge $jk$ from $K_n$ is chosen. Then $jk$ is added to $M_0^{(\iota)}$ always, and $jk$ is added to $M_\zeta^{(\iota)}$ with probability $1-\zeta$. Let $e_1,\ldots,e_N$ be an enumeration of the edges of $K_n$. For $1 \leqslant z \leqslant N$, let $X_z$ denote the number of times that edge $e_z$ is chosen during these $I$ iterations. Clearly, Moreover, the probability generating function for the random vector This implies that the components of $X$ are independent. Hence, each edge of $K_n$ is included in $G_0$ independently with probability $P(X_z \geqslant 1) = 1 - e^{-\mu/N}$. This verifies that $G_0 \sim {\mathscr {G}}(n,p_0)$. Next we consider the distribution of $G_\zeta$. By the definition of Coupling($\boldsymbol{d}$, $I$, $\zeta$), for every $1 \leqslant \iota \leqslant I$, the chosen edge $e_z$ is added to $M_\zeta^{(\iota)}$ with probability $1-\zeta$. Let $Y_z$ denote the multiplicity of $e_z$ in $M$ denote the probability space of all subgraphs of $G$ containing exactly $m$ edges with the uniform distribution. In the next lemma, we verify the marginal distribution of $G^{(\iota)}$ during the coupling procedure. Define $m^{(\iota)}$ to be the number of edges in $G^{(\iota)}$.
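The Poissonisation step in this proof can be checked numerically. The sketch below simulates only the edge-selection part of the construction (it does not implement Coupling( ) or IndSample( ), whose details are in Fig. 1), and evaluates the probability that a fixed edge is chosen and kept at least once when the number of steps is Poisson. All names are illustrative. The closed form $1 - e^{-(1-\zeta)\mu/N}$ used for the thinned graph is the standard Poisson-thinning consequence and is stated here as an assumption about the lost formula for $p_\zeta$.

```python
import math
import random


def edge_process(n, steps, zeta, seed=0):
    """Run `steps` rounds of uniform edge selection and return (G_zeta, G_0) as edge sets."""
    rng = random.Random(seed)
    g_zero, g_zeta = set(), set()
    for _ in range(steps):
        j, k = rng.sample(range(n), 2)  # uniformly random edge of K_n
        edge = (min(j, k), max(j, k))
        g_zero.add(edge)  # always recorded; multiplicities are suppressed by the set
        if rng.random() < 1 - zeta:  # recorded in the thinned graph with probability 1 - zeta
            g_zeta.add(edge)
    return g_zeta, g_zero


def inclusion_probability(mu, num_pairs, keep=1.0, terms=400):
    """P(a fixed edge is chosen and kept at least once) when the step count is Po(mu)."""
    total, pmf = 0.0, math.exp(-mu)
    for i in range(terms):
        total += pmf * (1 - (1 - keep / num_pairs) ** i)
        pmf *= mu / (i + 1)  # Poisson pmf recursion
    return total


n, mu, zeta = 10, 30.0, 0.1
big_n = n * (n - 1) // 2
print(abs(inclusion_probability(mu, big_n) - (1 - math.exp(-mu / big_n))) < 1e-9)  # True
g_zeta, g_zero = edge_process(n, 25, zeta, seed=2)
print(g_zeta <= g_zero)  # True by construction
```

Averaging over a Poisson number of steps is what makes the edge multiplicities independent, which a fixed number of steps would not give.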

Lemma 3 Suppose IndSample( ) was not called during the first ι iterations of procedure Coupling( ). Then G
Proof With a slight abuse of notation, let G (ι) be the graph where edges are labelled with [m (ι) ] in the order that they are added by Coupling( ). We will prove by induction that G (ι) has the same distribution as the graph obtained by uniformly labelling edges in G (n, d, m (ι) ) with [m (ι) ]. This is obviously true for ι = 1.
Without loss of generality, assume G (ι−1) has m (ι) − 1 edges and has the claimed distribution, and assume that G (ι) contains m (ι) edges. Let L (G (ι−1) ) be the set of edge-labelled graphs with degree sequence d which contain G (ι−1) as an edge-labelled ). Hence, the random graph G (ι) also has the claimed distribution.
The above immediately implies the statement of the lemma for the non-edgelabelled G (ι) , since there are exactly m (ι) ! ways to label edges of G (ι) for any realisation of G (ι) with m (ι) edges.
Lemma 3 immediately yields the following corollary.
Thus, procedure Coupling($\boldsymbol{d}$, $I$, $\zeta$) with $I \sim \mathrm{Po}(\mu)$ always produces a random triple of graphs with suitable marginal distributions. Next, we need to choose parameters $\mu$ and $\zeta$ in such a way that $p_\zeta$ approximates the density of ${\mathscr {G}}(n,\boldsymbol{d})$ reasonably well and the probability of $G_\zeta \not\subseteq G$ is small. Note that $G_\zeta \subseteq G$ could only be violated when IndSample( ) is called, in which case $G_\zeta$ and $G$ are generated independently.
Define $I^*$ to be the value of $\iota$ when IndSample( ) is called, otherwise $I^* = I+1$. Then we have For each $0 \leqslant \iota \leqslant I^*-1$, define  (ι) ) is the set of graphs disjoint from $G^{(\iota)}$ whose union with $G^{(\iota)}$ is a graph in ${\mathscr {G}}(n,\boldsymbol{d})$. Note that in each step of the algorithm, a new edge $jk$ is added to Inductively, this implies that the set of subgraphs of $S^{(\iota)}$ with degree sequence $\boldsymbol{d} - \boldsymbol{g}^{(\iota)}$ is never empty, for every $\iota$, as stated in the following observation.
We find that where the second equation above holds since the denominators are nonzero by Observation 1. Thus, (1) and (2) motivate the following question.
Question 1 Let $S_{\boldsymbol{t}}$ be a uniform random $\boldsymbol{t}$-factor (spanning subgraph with degree sequence $\boldsymbol{t}$) of a graph $S$. Under which assumptions on $S$ and $\boldsymbol{t}$ can one guarantee that for any two edges $z, z'$ of $S$?
Having an accurate estimate of the above probability ratio is crucial in our approach towards solving the sandwich conjecture, and tightening the density gap between the two binomial random graphs that sandwich G (n, d). Theorem 10 answers Question 1 for dense pseudorandom graph S and dense near-regular t. Resolving the full sandwich conjecture requires solving Question 1 for pseudorandom S with near-regular t in all density regimes: S can be sparse or dense and t can be sparse or dense relative to S. We will address that in the subsequent paper. Some partial solutions have been presented in the conference version [10].

Proof of Theorem 8
We continue using all notations introduced in Sect. 3. In this paper, all graphs are defined on the vertex set [n]. When we do algebraic operations on graphs, we always operate on the edge sets of the graphs. In particular, for graphs G and H , In this section we show how to choose μ and ζ such that procedure Coupling( ) produces a desirable outcome. As explained before (in particular, see (1) and (2)) it is important that all edges of S (ι) = K n − G (ι) are approximately equally likely to appear in the uniform random subgraph of S (ι) with degree sequence d − g (ι) , where g (ι) denotes the degree sequence of G (ι) . We will employ Theorem 10 for this purpose.

Preliminaries
We will need the following bounds.
Bound (c) comes from approximating Po(μ) with Bin(K , μ/K ) as K → ∞ and using the upper Chernoff bound.
The next lemma will assist us in verifying assumption (A1) in Theorem 10.
Observe that the degrees of $\tilde{S}$ are distributed according to $\mathrm{Bin}(n-1,p)$. Also, the number of common neighbours of any two vertices in $\tilde{S}$ is distributed according to $\mathrm{Bin}(n-2,p^2)$. Observing that $np^2 \gg \log n$ and combining Lemma 4(a) and the union bound, we get that, with probability $1 - e^{-\Omega(p^2 n)}$, for all pairs of distinct vertices $j$ and $k$. This implies that Note that $S$ has the same distribution as $\tilde{S}$ conditioned on the event that $\tilde{S}$ has exactly $m$ edges. From Lemma 4(b), we know that

Estimates for $S^{(\iota)}$ and $\boldsymbol{g}^{(\iota)}$
Recall that Lemma 4 is sufficient to extract some information about the density of S (ι) and the sequence d − g (ι) , as described in the following lemma. Recall the definition of p 0 and p ζ from Lemma 2 and that m (ι) denotes the number of edges in G (ι) . Define Proof By the assumption that IndSample( ) was not called during the first ι steps, and using Lemma 2, we have that Since e −Ω(ξ 2 M) = e −Ω(ξ 4 Δ) we can proceed conditioned on the event that p (ι) ≥ ξ/2. Take G ∼ G(n, d) and let h = (h 1 , . . . , h n ) denote the degree sequence of the random graph G p (ι) obtained by independently keeping every edge from G with probability p (ι) . By Lemma 3, the sequence d − g (ι) has exactly the same distribution as h conditioned on the event |E(G p (ι) )| = M − m (ι) , therefore .
Observing $h_j \sim \mathrm{Bin}(d_j, p^{(\iota)})$ and using Lemma 4(a), we find that If $n-\Delta > p^{(\iota)}\Delta$ then Otherwise, if $n-\Delta \leqslant p^{(\iota)}\Delta$ then, using Lemma 4(a) and the assumption $n-\Delta \gg \xi\Delta$, we find that Finally, Lemma 4(b) gives a polynomial lower bound on $P(|E(G_{p^{(\iota)}})| = M - m^{(\iota)})$, which is absorbed by the main error term by virtue of the assumption $\Delta(\boldsymbol{d}) \gg \xi^{-4}\log n$. This completes the proof.
Alas, there is not much structural information available about the graphs S (ι) . In fact, by virtue of Lemma 3, such questions are similar in some sense to investigating the model G (n, d) that is the problem we started with. Nevertheless, it turns out that the following trivial observation will be sufficient for our purposes: If ζ = o(1) and ιζ = o(N ) then we have ζ follow directly from the definition. For the second part, it is sufficient to bound P m By the construction of G (ι) Note that m  [17], we get that Using Thus, by (4), we get completing the proof. (1))|E(S (ι) )| with high probability provided ιζ N . This enables us to derive all the necessary structural properties about S (ι) from the well-studied model G (n, m).

Specifying $\mu$ and $\zeta$
Take $I \sim \mathrm{Po}(\mu)$, where $\mu$ is the unique solution of Let $\zeta = C\xi$ for some sufficiently large constant $C > 0$ (which depends only on the implicit constant in $O(\,)$ of Theorem 10 with $\gamma = \frac{1}{9}$ and $\varepsilon = \frac{1}{4}$).

Completing the proof of Theorem 8
First, by the assumptions, observe that Thus, it is sufficient to prove the assertion with probability $1 - e^{-\Omega(\xi^4\Delta)}$. We prove that if IndSample( ) was not called during the first $\iota$ steps of Coupling($\boldsymbol{d}$, $I$, $\zeta$) then the probability that it is called in step $\iota+1$ is $e^{-\Omega(\xi^4\Delta)}$. Then our assertion holds by taking the union bound over the $I$ steps which, with probability at least $1 - e^{-\Omega(\xi^4\Delta)}$, is bounded by $n^2$. Suppose IndSample( ) was not called during the first $\iota$ steps of Coupling($\boldsymbol{d}$, $I$, $\zeta$). To bound the probability that it is called at the next iteration, we use Theorem 10 with $S := S^{(\iota)}$, $\boldsymbol{t} := \boldsymbol{d} - \boldsymbol{g}^{(\iota)}$, By Observation 1, the set of $\boldsymbol{t}$-factors of $S$ is not empty. By the assumptions of Theorem 8, for all $j$, Using Lemma 6, we get that, with probability $1 - e^{-\Omega(\xi^4\Delta)}$, Let $\boldsymbol{s}$ denote the degree sequence of $S^{(\iota)}$ and $\lambda = \frac{t_1+\cdots+t_n}{s_1+\cdots+s_n}$. Then, by (6) and the two bounds of $t_j$ in (7), Consequently, letting $\bar{t} = \Vert\boldsymbol{t}\Vert_1/n$ and $\bar{s} = \Vert\boldsymbol{s}\Vert_1/n$ we have Using the two bounds for $t_j$ in (7) and the corresponding bounds for $s_j$, we have Note that Combining the bounds above, we get that Next, we prove that For the first inequality in (10), note that where the equality in (11) holds by (8) and (9). The second equality in (10) follows by (11), the bound $p^{(\iota)} \geqslant \xi/2$ from (7) and the theorem assumption that $n-\Delta \gg \xi\Delta \gg n/\log n$. It follows from (10) that $\lambda(1-\lambda)\Delta(S^{(\iota)}) \gg \Vert\boldsymbol{t} - \lambda\boldsymbol{s}\Vert_\infty + n/\log n$, and thus assumptions (A2) and (A3) of Theorem 10 are satisfied. Assumption (A4) is also immediate since $H = H^+ \cup H^-$ consists of one edge.
Using (1) and (12), we conclude that procedure Coupling(d, I , ζ ) produces a "bad" output (G ζ , G, G 0 ) with probability To complete the proof we take (G L , G) = (G ζ , G) and p = p ζ , recall that G ∼ G (n, d) by Corollary 3, and G ζ ∼ G (n, p ζ ) by Lemma 2, where The last equation follows by (5) and the assumption that d is near-regular.

Enumeration of factors
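In this section we establish an asymptotic formula for the number of factors (subgraphs with given degree sequence) of a graph in the dense case. Let $S$ be a simple graph. We start from the observation that $\prod_{jk\in S}(1+z_j z_k)$ is the generating function for subgraphs of $S$ with powers of $z_1,\ldots,z_n$ corresponding to degrees. In particular, the number $N(S,\boldsymbol{t})$ of $\boldsymbol{t}$-factors of $S$ is given by
$$N(S,\boldsymbol{t}) = [z_1^{t_1}\cdots z_n^{t_n}] \prod_{jk\in S}(1+z_j z_k),$$
where $[\,\cdot\,]$ denotes coefficient extraction. Using Cauchy's integral formula, it follows that
$$N(S,\boldsymbol{t}) = \frac{1}{(2\pi i)^n} \oint \cdots \oint \frac{\prod_{jk\in S}(1+z_j z_k)}{z_1^{t_1+1}\cdots z_n^{t_n+1}}\, dz_1\cdots dz_n.$$
Substituting $z_j = e^{\beta_j + i\theta_j}$, we get that
$$N(S,\boldsymbol{t}) = \frac{1}{(2\pi)^n e^{\sum_{j=1}^n t_j\beta_j}} \int_{-\pi}^{\pi}\!\!\cdots\!\int_{-\pi}^{\pi} \frac{\prod_{jk\in S}\bigl(1+e^{\beta_j+\beta_k+i(\theta_j+\theta_k)}\bigr)}{e^{\sum_{j=1}^n i t_j\theta_j}}\, d\theta_1\cdots d\theta_n = \frac{\prod_{jk\in S}(1+e^{\beta_j+\beta_k})}{(2\pi)^n e^{\sum_{j=1}^n t_j\beta_j}} \int_{U_n(\pi)} F_{S,\boldsymbol{t}}(\boldsymbol{\theta})\, d\boldsymbol{\theta}, \qquad (13)$$
where
$$F_{S,\boldsymbol{t}}(\boldsymbol{\theta}) := \frac{\prod_{jk\in S}\bigl(1+\lambda_{jk}(e^{i(\theta_j+\theta_k)}-1)\bigr)}{e^{\sum_{j=1}^n i t_j\theta_j}}, \qquad \lambda_{jk} := \frac{e^{\beta_j+\beta_k}}{1+e^{\beta_j+\beta_k}}. \qquad (14)$$
The choice of parameters $\boldsymbol{\beta} = (\beta_1,\ldots,\beta_n)$ will be specified later. The values $(\lambda_{jk})$ defined in (14) have an interesting property: if we consider a random subgraph $S_{(\lambda_{jk})}$ of $S$ with independent adjacencies where, for each $jk \in S$, the probability that vertices $j$ and $k$ are adjacent in $S_{(\lambda_{jk})}$ equals $\lambda_{jk}$, then the probability of each outcome depends only on its degree sequence $\boldsymbol{t} = (t_1,\ldots,t_n)$. In other words, the conditional distribution of $S_{(\lambda_{jk})}$ with respect to given $\boldsymbol{t}$ is uniform. The random model of $S_{(\lambda_{jk})}$ is referred to as the β-model and it is a special case of the exponential family of random graphs, see [3,11] for more details. A further connection between $S_{(\lambda_{jk})}$ and $S_{\boldsymbol{t}}$ is established in Sect. 6.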
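The degree-sequence property of the β-model is easy to verify on a toy instance. The sketch below assumes the usual parametrisation in which an edge $jk$ of $S$ is present with probability $e^{\beta_j+\beta_k}/(1+e^{\beta_j+\beta_k})$; the function name and the chosen values of β are illustrative. It evaluates the probability of each of the three perfect matchings of the complete graph on four vertices, which all share the degree sequence $(1,1,1,1)$.

```python
import math
from itertools import combinations


def beta_model_probability(edges_s, beta, subgraph):
    """Probability that the beta-model on S produces exactly the given edge subset."""
    chosen = set(subgraph)
    prob = 1.0
    for j, k in edges_s:
        weight = math.exp(beta[j] + beta[k])
        lam = weight / (1 + weight)  # assumed edge probability for jk
        prob *= lam if (j, k) in chosen else 1 - lam
    return prob


k4 = list(combinations(range(4), 2))
beta = [0.3, -0.5, 1.1, 0.2]  # arbitrary parameters
matchings = [[(0, 1), (2, 3)], [(0, 2), (1, 3)], [(0, 3), (1, 2)]]
probs = [beta_model_probability(k4, beta, m) for m in matchings]
print(all(math.isclose(p, probs[0]) for p in probs))  # True
```

Each probability factors as a constant depending only on $S$ and β times $\exp(\sum_j \beta_j t_j)$, which is why the three values agree even though β is not constant.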
The exact value of the integral (13) can rarely be found. Instead, we approximate it. The complex-analytic approach consists of the following steps: (i) estimate the contribution of the critical regions around the concentration points, where the absolute value of the integrand achieves its maximum; (ii) show that the other regions give a negligible contribution.
By Taylor's theorem, for a ∈ [0, 1] and x ∈ [−π/4, π/4], we have Using this to expand the multipliers of $F_{S,t}(\theta)$, we find that where the n × n symmetric matrix Q is defined by (17) and the multivariable polynomials u and v are defined by Observe that $\theta^{T} Q\theta \ge 0$, so Q is a positive semidefinite matrix. Moreover, it is positive definite if S does not contain a bipartite component. The optimal choice for β is such that the linear part in (16) disappears, which corresponds to the case when our contours in the complex plane pass through the saddle point. Thus, we get the following system of equations: For the case $S = K_n$, the existence and uniqueness of the solution were studied in [1,3,23]: the necessary and sufficient condition is that t lies in the interior of the polytope defined by the Erdős-Gallai inequalities. When S is the complete graph, it is also known that system (19) is equivalent to (i) maximisation of the likelihood with respect to the parameters of the β-model given observations of the degrees, and (ii) finding the random model with independent adjacencies and given expected degrees that maximises the entropy. Unfortunately, analogues of these results are not available for general S, even though the methods used in the literature will certainly carry over. Since such results are not needed for our purposes here, we leave these questions for a subsequent paper.
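To make the saddle-point system concrete, the sketch below solves it numerically on a toy example. It rests on an assumption that should be checked against (19): the system is taken in its usual β-model form, $t_j = \sum_{k : jk \in S} \lambda_{jk}$ with $\lambda_{jk} = e^{\beta_j+\beta_k}/(1+e^{\beta_j+\beta_k})$, matching the "given expected degrees" description above. The function names are hypothetical, and the simple fixed-point iteration used here is not claimed to converge for general S and t.

```python
import math


def solve_beta_system(n, edges, t, iterations=200):
    """Fixed-point iteration for t_j = sum_{k: jk in S} lambda_jk.

    ASSUMPTION: this is the form of system (19); convergence is not
    guaranteed in general, so treat the output as a heuristic.
    """
    neighbours = [[] for _ in range(n)]
    for j, k in edges:
        neighbours[j].append(k)
        neighbours[k].append(j)
    beta = [0.0] * n
    for _ in range(iterations):
        # Rearranged equation:
        #   e^{beta_j} = t_j / sum_k ( e^{beta_k} / (1 + e^{beta_j + beta_k}) ).
        # The comprehension reads only the previous iterate of beta.
        beta = [
            math.log(t[j]) - math.log(sum(
                math.exp(beta[k]) / (1.0 + math.exp(beta[j] + beta[k]))
                for k in neighbours[j]))
            for j in range(n)
        ]
    return beta


def edge_probabilities(edges, beta):
    """lambda_jk = e^{beta_j + beta_k} / (1 + e^{beta_j + beta_k}) for jk in S."""
    return {(j, k): 1.0 / (1.0 + math.exp(-(beta[j] + beta[k]))) for j, k in edges}


# Toy example: S = K_4 and t = (1, 1, 1, 1).
k4 = [(j, k) for j in range(4) for k in range(j + 1, 4)]
t = [1, 1, 1, 1]
beta = solve_beta_system(4, k4, t)
lam = edge_probabilities(k4, beta)
expected_degrees = [sum(p for e, p in lam.items() if j in e) for j in range(4)]
average_lambda = sum(lam.values()) / len(k4)
```

In this symmetric example every $\lambda_{jk}$ equals 1/3, the expected degrees reproduce t, and the average of the $\lambda_{jk}$ equals $(t_1+\cdots+t_n)/(2|E(S)|)$, since summing the assumed system over j counts every edge twice.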
If system (19) holds, then we have $\lambda = \frac{t_1+\cdots+t_n}{2|E(S)|}$, which is the relative density of a t-factor in S. We are now ready to state the main result of this section.

Theorem 11 Let ε, γ and c be fixed positive constants. Suppose a graph S on n vertices and a degree sequence t satisfy the following assumptions:
(B1) for any two distinct vertices j and k, we have $\frac{\gamma\Delta^2(S)}{n} \le |\{\ell : j\ell \in S \text{ and } k\ell \in S\}| \le \frac{\Delta^2(S)}{\gamma n}$;
(B2) there exists a solution β of system (19) such that rng(β) ≤ c;
Let X be a random variable with the normal density $\pi^{-n/2}|Q|^{1/2}e^{-x^{T}Qx}$. Then, where the constant implicit in O( ) depends on γ, ε and c only.
There is a vast literature on asymptotic enumeration of dense subgraphs with given degrees in the case when S is the complete graph or not far from it, see, for example, [1,11,19,20] and references therein. An important advantage of Theorem 11 with respect to the previous results is that it allows S to be essentially different from K n and it holds for a very wide range of degrees. Theorem 11 follows immediately from Eqs. (13), (15), Lemma 9 and Corollary 4.

The integral in the critical regions
For given S and t, denote $\Lambda := \lambda(1-\lambda)$ and Δ := Δ(S).
In the following, we always assume that $\Lambda\Delta \gg n/\log n$, which is assumption (B3) of Theorem 11. We also assume that (19) is satisfied. Let ε be a fixed positive constant, required to be sufficiently small in several places of the argument. Define $\eta := \frac{n^{\varepsilon}}{(\Lambda\Delta)^{1/2}} = o(1)$.
As explained above (see (15)), the contributions of these two regions to the integral in (13) are identical, so we can focus on $B_0$. From (16), we have where $h(\theta) = O(n^{-1/2+6\varepsilon})$ uniformly for $\theta \in B_0$. A general theory for estimating such integrals was developed in [11], based on the second-order approximation of complex martingales. We apply the tools from [11] here and, for the reader's convenience, also quote them in the appendix; see Section A.2. We will need the following bounds.

Lemma 8 Let λ be defined in (20). If rng(β) ≤ c for some fixed c > 0, then (a) uniformly over all jk ∈ S, $\lambda_{jk} = \Theta(\lambda)$ and $1 - \lambda_{jk} = \Theta(1-\lambda)$.
Furthermore, suppose $\Delta = \Omega(n^{1/2})$ and assumption (B1) of Theorem 11 holds. Then Q is positive definite and the following hold.
There exists a real matrix T such that $T^{T}QT = I$ and
Proof Observe that $1 \le \frac{1+e^{y}}{1+e^{x}} \le e^{y-x}$ for any real $x \le y$. Since any two sums $\beta_j + \beta_k$ and $\beta_{j'} + \beta_{k'}$ are at most 2c apart, this implies that Recalling the definition (20), we have proved (a). Note that assumption (B1) of Theorem 11 implies that S is a connected non-bipartite graph. Thus, Q is positive definite. Parts (b) and (c) follow from Lemma 20 (see appendix) applied to the scaled matrix Q/Λ.
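The elementary inequality at the start of this proof, and the way it yields part (a), can be spelled out as follows. This is a sketch under the assumption that $\lambda_{jk} = e^{\beta_j+\beta_k}/(1+e^{\beta_j+\beta_k})$; the shorthand $a = \beta_j+\beta_k$ and $b = \beta_{j'}+\beta_{k'}$, with $b \le a \le b + 2c$, is introduced here for illustration only.

```latex
\begin{align*}
\frac{1+e^{y}}{1+e^{x}} \ge 1 &\iff e^{y} \ge e^{x},\\
\frac{1+e^{y}}{1+e^{x}} \le e^{y-x} &\iff 1+e^{y} \le e^{y-x}+e^{y} \iff e^{y-x} \ge 1,
  \qquad (x \le y),\\[4pt]
\frac{\lambda_{jk}}{\lambda_{j'k'}} &= e^{a-b}\,\frac{1+e^{b}}{1+e^{a}} \in \bigl[1,\, e^{2c}\bigr],
\qquad
\frac{1-\lambda_{jk}}{1-\lambda_{j'k'}} = \frac{1+e^{b}}{1+e^{a}} \in \bigl[e^{-2c},\, 1\bigr].
\end{align*}
```

Hence all $\lambda_{jk}$ agree up to a factor $e^{2c}$, and likewise all $1-\lambda_{jk}$, which is consistent with the $\Theta(\cdot)$ statements in part (a) once λ is an average of the $\lambda_{jk}$.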
We are ready to establish asymptotic estimates for the critical region B 0 . Note that in the next lemma we allow the components of t to be non-integers.
Next, we need to estimate some moments of u(X) and v(X). Let $\Sigma = (\sigma_{jk,\ell m})$ denote the covariance matrix of the variables $X_j + X_k$ for jk ∈ S: $\sigma_{jk,\ell m} := \mathrm{Cov}(X_j + X_k, X_\ell + X_m)$. (23) Since X is Gaussian with density $\pi^{-n/2}|Q|^{1/2}e^{-x^{T}Qx}$, the covariances $\mathrm{Cov}(X_j, X_k)$ equal the corresponding entries of $(2Q)^{-1}$. Using the bounds of Lemma 8(b), we find that The expectation of a polynomial of odd degree is zero (due to the symmetry of the distribution), so Cov(u(X), v(X)) = E v(X) = 0. The following are special cases of Isserlis' theorem (see [12]), which is also known as Wick's formula in quantum field theory: $\mathrm{E}\bigl[(X_j+X_k)^4 (X_\ell+X_m)^4\bigr] = 9\sigma_{jk,jk}^2\sigma_{\ell m,\ell m}^2 + 72\,\sigma_{jk,jk}\sigma_{\ell m,\ell m}\sigma_{jk,\ell m}^2 + 24\,\sigma_{jk,\ell m}^4$.
Recalling (18) and using (24), we obtain that Similarly as above, we derive that and $\mathrm{Var}\, u(X) = \mathrm{E}u^2(X) - (\mathrm{E}u(X))^2$ noting that the leading term containing $\sigma_{jk,jk}^2\sigma_{\ell m,\ell m}^2$ appears in both $\mathrm{E}u^2(X)$ and $(\mathrm{E}u(X))^2$ and cancels in the subtraction. Substituting these bounds into (22) and bounding $e^{\frac12 \mathrm{Var}\, v(X)} = e^{o(\log n)} = n^{o(1)}$, the proof is complete.
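The numerical coefficients in Isserlis-type formulas such as the one quoted in this proof count perfect matchings. For centred jointly Gaussian variables Y and Z, the moment $\mathrm{E}[Y^a Z^b]$ is a sum over all pairings of the a + b factors, and a pairing with m mixed (Y, Z) pairs contributes $\sigma_{YY}^{(a-m)/2}\sigma_{ZZ}^{(b-m)/2}\sigma_{YZ}^{m}$. The short Python sketch below (with the hypothetical helper names `pairings` and `isserlis_coefficients`) enumerates the pairings and tallies them by m.

```python
from collections import Counter


def pairings(items):
    """Yield every perfect matching of the list `items` as a list of pairs.

    Yields nothing when the number of items is odd.
    """
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def isserlis_coefficients(a, b):
    """Count pairings of a copies of Y and b copies of Z,
    keyed by the number of mixed (Y, Z) pairs."""
    labels = ["Y"] * a + ["Z"] * b
    counts = Counter()
    for matching in pairings(labels):
        mixed = sum(1 for u, v in matching if u != v)
        counts[mixed] += 1
    return dict(counts)
```

Calling `isserlis_coefficients(4, 4)` reproduces the coefficients 72 and 24 appearing above, together with a coefficient for the term without mixed pairs, while an odd total degree such as `isserlis_coefficients(3, 4)` gives an empty tally, in line with the remark that odd-degree polynomials have zero expectation.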

Estimates outside of the critical regions
In this section, we show that the contribution to the integral (13) of the remaining region $B := U_n(\pi) - B_0 - B_\pi$ is negligible, where the critical regions $B_0$ and $B_\pi$ are defined in Sect. 5.1. Observe that $|F_{S,t}(\theta)|$ depends on S and $(\lambda_{jk})$ only and does not depend on t. To bound the factors of $|F_{S,t}(\theta)|$, we use the following inequality, whose uninteresting proof we omit. Throughout this section, including the lemma statements, we always assume that the assumptions of Theorem 11 hold. Recall that Lemma 8(a,b) implies that all the eigenvalues of Q are Θ(ΛΔ) (by bounding the 1-norms of Q and $Q^{-1}$). From Lemma 9, we find that As a first step, we show that the domain where many components of $\theta \in U_n(\pi)$ lie sufficiently far from 0 and ±π is negligible. Define $B' := \{\theta \in U_n(\pi) : \text{more than } \tfrac12 n^{1-\varepsilon} \text{ components } \theta_j \text{ satisfy } \eta/2 \le |\theta_j|_{2\pi} \le \pi - \eta/2\}$.
The following lemma depends on a technical lemma (Lemma 18) which we present in the appendix.

Lemma 11 We have
Proof Without loss of generality, at least $\frac14 n^{1-\varepsilon}$ components $\theta_j$ lie in [η/2, π − η/2]. Denote $U = \{j : \theta_j \in [\eta/2, \pi - \eta/2]\}$. We estimate the number $N_T(U)$ of triangles {j, k, ℓ} (i.e. jk, jℓ, kℓ ∈ S) such that {j, k, ℓ} ∩ U ≠ ∅. Using Lemma 18(a), we find that the degree of any vertex of U is at least γΔ. For any jk ∈ S with {j, k} ∩ U ≠ ∅, there are at least $\gamma\Delta^2/n$ common neighbours, each of which gives rise to a triangle contributing to $N_T(U)$. Since every triangle is counted at most 3 times, we get that .
For each such triangle {j, k, ℓ} with j ∈ U, observe that Therefore, we can mark one edge j′k′ from this triangle such that $|\theta_{j'} + \theta_{k'}|_{2\pi} \ge \eta/3$. Repeating this argument for all such triangles and observing that any edge is present in at most $\frac{\Delta^2}{\gamma n}$ triangles, we show that at least $\gamma^3\Delta|U|/6$ edges were marked. Using Lemma 8(a) and Lemma 10, we get that $|F_{S,t}(\theta)| \le e^{-\Omega(\Lambda\Delta|U|\eta^2)} = e^{-\Omega(n^{1+\varepsilon})}$.
Multiplying by the volume of B′, which is less than $(2\pi)^n$, and comparing with (27), completes the proof.
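The marking step in the proof above can be made explicit. The following derivation is a sketch under the assumption that $|x|_{2\pi}$ denotes the distance from x to the nearest multiple of 2π, so that it satisfies the triangle inequality; this reading of the notation is an assumption, not a statement from the text.

```latex
\begin{align*}
2\theta_j &= (\theta_j+\theta_k) + (\theta_j+\theta_\ell) - (\theta_k+\theta_\ell),\\
\eta \;\le\; |2\theta_j|_{2\pi}
  &\le |\theta_j+\theta_k|_{2\pi} + |\theta_j+\theta_\ell|_{2\pi} + |\theta_k+\theta_\ell|_{2\pi}
  \qquad \text{for } \theta_j \in [\eta/2,\ \pi-\eta/2].
\end{align*}
```

Consequently at least one of the three edges jk, jℓ, kℓ of the triangle has $|\theta_{j'}+\theta_{k'}|_{2\pi} \ge \eta/3$, and that edge can be marked.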
If Lemma 11 does not apply, then at least $n - \frac12 n^{1-\varepsilon}$ components of θ lie in neighbourhoods of 0 and ±π. Next, we use a similar argument to show that most of these components lie in one of those two intervals (on a circle). Define $B'' := \{\theta \in U_n(\pi) \setminus B' : |\theta_j| \le \eta/2 \text{ holds for more than } n^{2\varepsilon} \text{ components } \theta_j \text{ and } |\theta_j - \pi|_{2\pi} \le \eta/2 \text{ holds for more than } n^{2\varepsilon} \text{ components } \theta_j\}$.

Lemma 12 We have
Proof Let $U_1 = \{j : |\theta_j| \le \eta/2\}$ and $U_2 = \{j : |\theta_j - \pi|_{2\pi} \le \eta/2\}$. Since θ ∉ B′, we have $|U_1| + |U_2| \ge n - \frac12 n^{1-\varepsilon}$. For $j \in U_1$, $k \in U_2$ and any ℓ such that jℓ, kℓ ∈ S, we have Thus, we can mark some j′k′ ∈ {jℓ, kℓ} such that $|\theta_{j'} + \theta_{k'}|_{2\pi} = \Omega(1)$. By the assumptions, the number of choices for (j, k, ℓ) is at least $|U_1|\,|U_2|\,\frac{\gamma\Delta^2}{n}$. Dividing by 2Δ to compensate for over-counting, we get that at least $|U_1|\,|U_2|\,\frac{\gamma\Delta}{2n}$ edges were marked. Using Lemma 8(a) and Lemma 10, we find that The proof now follows the same line as in the previous lemma.
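The Ω(1) bound used when marking an edge in this proof can be seen as follows. This is a sketch under two assumptions: that $|x|_{2\pi}$ is the distance from x to the nearest multiple of 2π, and that the vertices j and k satisfy $|\theta_j|_{2\pi} \le \eta/2$ and $|\theta_k - \pi|_{2\pi} \le \eta/2$ respectively, with ℓ a common neighbour of j and k.

```latex
\begin{align*}
\theta_j - \theta_k &= (\theta_j+\theta_\ell) - (\theta_k+\theta_\ell),\\
|\theta_j-\theta_k|_{2\pi}
  &\ge |\pi|_{2\pi} - |\theta_j|_{2\pi} - |\theta_k-\pi|_{2\pi} \;\ge\; \pi - \eta,\\
\max\bigl\{|\theta_j+\theta_\ell|_{2\pi},\ |\theta_k+\theta_\ell|_{2\pi}\bigr\}
  &\ge \tfrac12(\pi-\eta) = \Omega(1).
\end{align*}
```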
Since adding π to each component is a symmetry, see (15) Take any j ≤ m. Using Lemma 18(a), we find that there are at least $\gamma\Delta - n^{1-\varepsilon} = \Theta(\Delta)$ vertices k such that jk ∈ S and $|\theta_k|_{2\pi} \le \eta/2$. For such k, we have $|\theta_j + \theta_k|_{2\pi} \ge \eta/2$. Similarly as before, by Lemma 10, for $\theta \in B^*(m)$, Thus, we can bound where $\theta_1 \in \mathbb{R}^{n-m}$, $\theta_2 \in \mathbb{R}^{m}$ and S′ is obtained from S by deletion of the first m vertices. Recall that $|F_{S',t'}(\theta_1)|$ does not depend on t′, but we define t′ anyway by By (28), S′ and t′ satisfy all the assumptions of Lemma 9 and Lemma 20. Thus, where Q′ is the matrix of (17) Summing over m and multiplying by 2 for the symmetry of (0, . . . , 0) and (π, . . . , π), we find that Using Lemma 11 and Lemma 12, we conclude the following.

Corollary 4
Under the assumptions of Theorem 11 and for sufficiently small ε,

Random t-factors and the beta model
In this section we establish a deep relation between S t (a uniform random element of the set of t-factors of S) and the corresponding β-model: the probabilities of any forced or forbidden small structure are asymptotically the same.
For the case $S = K_n$, the estimate of Theorem 12 was previously established by Isaev and McKay in [11, Theorem 5.2] under the additional constraint that $\|h\|_\infty = O(n^{1/6})$. When $S = K_n$ and t is near-regular, a more precise formula for $\mathrm{P}(H^+ \subseteq S_t \text{ and } H^- \cap S_t = \emptyset)$ can be derived from [19, Theorem 1.3], provided that $\|h\|_1$ and $\|h\|_\infty$ are suitably bounded in terms of $n^{1+\varepsilon}$ and $n^{1/2+\varepsilon}$, respectively. The latter result shows that the error term $\|h\|_2^2/(\Lambda\Delta)$ in Theorem 12 cannot be improved in general; see [19, Corollary 2.6].
In this section, we first give some preliminary estimates for the solution of system (19). Next, we prove Theorem 12. Finally, in Sect. 6.3, we show that Theorem 10 follows from Theorem 12.

The solution of the beta system
The following lemma will be useful for investigating system (19).

Lemma 13
Let $r : \mathbb{R}^n \to \mathbb{R}^n$, $x^{(0)} \in \mathbb{R}^n$, δ > 0, and $U = \{x \in \mathbb{R}^n : \|x - x^{(0)}\| \le \delta\,\|r(x^{(0)})\|\}$, where ‖·‖ is any vector norm in $\mathbb{R}^n$. Assume that r is analytic in U and sup where J denotes the Jacobian matrix of r and ‖·‖ also stands for the induced matrix norm. Then there exists $x^* \in U$ such that $r(x^*) = 0$.

Proof of theorem 12
Let $S' = S - (H^+ \cup H^-)$ and $t' \in \mathbb{N}^n$ be such that $t - t'$ is the degree sequence of $H^+$. Then, by definition, Since h is an integer vector, we have that Using β as $\beta^{(0)}$ in Corollary 5, we find a solution β′ of system (19) for the graph S′ and the vector t′ such that Observe that rng(β′) = rng(β) + o(1) and $\|h\|_\infty \ll (\Lambda\Delta)^{1/2} \ll \Delta^2/n$. Therefore, S′ and t′ also satisfy the assumptions of Theorem 11. Applying Theorem 11 twice, we find that where Q, u, v and Q′, u′, v′ are the matrices of (17) and the polynomials of (18) for S, t and S′, t′, respectively, X and X′ are the corresponding normally distributed vectors and

Lemma 14
Under the assumptions of Theorem 12, we have
Proof Let $(\lambda_{jk})$ and $(\lambda'_{jk})$ be defined as in (14) for S, t and S′, t′. From (30), we have Applying Taylor's theorem to $\log(1 + e^{x})$ and using the symmetry $\lambda_{jk} = \lambda_{kj}$, we obtain that Then, using (30) again, we get that To complete the proof of Theorem 12, it remains to be shown that Let $Q = (q_{jk})$ and $Q' = (q'_{jk})$. Using (31), we find that if jk ∈ S − S′; 0, otherwise. (33)

Lemma 15
Under the assumptions of Theorem 12, we have

ΛΔ .
Proof If U, V are symmetric positive definite matrices of equal size, then the matrix UV has positive real eigenvalues. To see this, note that UV is similar to the matrix The same argument applies to $\mathrm{tr}((Q')^{-1}(Q - Q'))$ and thus log |Q| To show the remaining part of (32), we will use the following estimate and similarly for $|\mathrm{E}v^2(X) - \mathrm{E}v^2(X')|$.

Lemma 16
Under the assumptions of Theorem 12, we have
Proof Repeating the arguments of Lemma 9 (see (25) and (26)) and using (29), (30), (31), we derive that Observe also that (using the arguments of (26)) Applying the Cauchy-Schwarz inequality, we find that We complete the proof of (32) and of Theorem 12 with the following lemma.

Proof of theorem 10
Since Theorem 10 is trivially true for h = 0, we assume otherwise. By assumption (A3), we get that where r( ) is defined in Corollary 5. Applying that corollary, we find a solution β of system (19) such that In particular, the assumptions of Theorem 10 and (36) imply that all the assumptions of Theorem 12 hold. By Taylor's theorem, we have that Then, we get that O((β j − β (0) )h j ) .
Using (36), we find that Thus, applying Theorem 12 gives the required probability bound.
where G[U] denotes the induced subgraph, $\varepsilon_b(G[U])$ is the minimum number of edges that must be deleted from the graph G[U] to make it bipartite, and $\partial_G(U)$ is the set of edges of G with exactly one end in U. Thus, to prove (b), it is sufficient to show $\psi(G) \ge \frac{\gamma^2}{12}\Delta$. First, consider the case $|U| \le n(1-\gamma/6)$. Observe that, for any common neighbour ℓ of two vertices j ∈ U and k ∉ U, either jℓ or kℓ contributes to $\partial_G(U)$. By the assumptions, the number of choices of j, k and ℓ is at least $|U|(n - |U|)\frac{\gamma\Delta^2}{n}$. We need to divide by 2Δ to allow for over-counting. Thus, we get Now, assume $|U| > n(1-\gamma/6)$. Consider any partition $(W_1, W_2)$ of U into two disjoint sets. We may assume $|W_1| \ge |W_2|$. If $|W_2| \le \gamma n/3$ then, bounding the degrees of the vertices in $W_1$ from below by γΔ (by part (a)) and the degrees of the vertices of $W_2$ from above by Δ, we get that If $|W_2| > \gamma n/3$, observe that, for any common neighbour ℓ of two vertices $j \in W_1$ and $k \in W_2$, at least one of {jℓ, kℓ} contributes to E(G[ By the assumptions, the number of choices of j, k and ℓ is at least $|W_1|\,|W_2|\,\frac{\gamma\Delta^2}{n}$. Dividing by 2Δ to adjust for over-counting, we get Combining the above cases, we conclude that $\frac{\varepsilon_b(G[U]) + |\partial_G(U)|}{|U|} \ge \frac{\gamma^2}{12}\Delta$ always. Part (b) follows.

A.2 Integration theorem
Here, we quote the results from [11] that were used in Sect. 5. For a domain $\Omega \subseteq \mathbb{R}^n$ and a twice continuously differentiable function q : Ω → C, define
Theorem 13 (Theorem 4.4 of [11]) Let $c_1, c_2, c_3, \varepsilon, \rho_1, \rho_2, \varphi_1, \varphi_2$ be nonnegative real constants with $c_1, \varepsilon > 0$. Let Q be an n × n positive-definite symmetric real matrix and let T be a real matrix such that $T^{T}QT = I$. Let Ω be a measurable set such that $U_n(\rho_1) \subseteq T^{-1}(\Omega) \subseteq U_n(\rho_2)$, and let $f : \mathbb{R}^n \to \mathbb{C}$ and $g : \mathbb{R}^n \to \mathbb{R}$ be twice continuously differentiable and let h : Ω → C be integrable. We make the following assumptions.
(d) $|f(x)|, |g(x)| \le n^{c_3} e^{c_2 x^{T}Qx/n}$ for $x \in \mathbb{R}^n$.
In order to apply Theorem 13, we need to verify that T exists and satisfies all required conditions. The following lemma is a special case of [11, Lemma 4.9] (for trivial ker Q and $\gamma = \mu_{\min}/d_{\max}$). Recall that $\|\cdot\|_{\max}$ stands for the maximum of the absolute values of the elements of a given matrix.

Lemma 19
Let Q be an n × n real symmetric matrix with positive minimum eigenvalue $\mu_{\min}$. Let D be a diagonal matrix with diagonal entries in $[d_{\min}, d_{\max}]$ for $d_{\min} > 0$. Suppose that $\|Q - D\|_{\max} \le r\,d_{\min}/n$ for some r. Then