Delocalization of two-dimensional random surfaces with hard-core constraints

We study the fluctuations of random surfaces on a two-dimensional discrete torus. The random surfaces we consider are defined via a nearest-neighbor pair potential which we require to be twice continuously differentiable on a (possibly infinite) interval and infinity outside of this interval. No convexity assumption is made and we include the case of the so-called hammock potential, when the random surface is uniformly chosen from the set of all surfaces satisfying a Lipschitz constraint. Our main result is that these surfaces delocalize, having fluctuations whose variance is at least of order $\log n$, where $n$ is the side length of the torus. We also show that the expected maximum of such surfaces is of order at least $\log n$. The main tool in our analysis is an adaptation to the lattice setting of an algorithm of Richthammer, who developed a variant of a Mermin-Wagner-type argument applicable to hard-core constraints. We rely also on the reflection positivity of the random surface model. The result answers a question mentioned by Brascamp, Lieb and Lebowitz 1975 on the hammock potential and a question of Velenik 2006.


Introduction
In this paper we study the fluctuations of random surface models in two dimensions. We consider the following family of models. Denote by $\mathbb{T}_n^2$ the two-dimensional discrete torus in which the vertex set is $\{-n+1, -n+2, \ldots, n-1, n\}^2$ and $(a,b)$ is adjacent to $(c,d)$ if $(a,b)$ and $(c,d)$ are equal in one coordinate and differ by exactly one modulo $2n$ in the other coordinate. Let $U$ be a potential, i.e., a measurable function $U : \mathbb{R} \to (-\infty, \infty]$ satisfying $U(x) = U(-x)$. The random surface model with potential $U$, normalized at the vertex $\mathbf{0} := (0,0)$, is the probability measure $\mu_{\mathbb{T}_n^2,\mathbf{0},U}$ on functions $\varphi : V(\mathbb{T}_n^2) \to \mathbb{R}$ defined by
$$d\mu_{\mathbb{T}_n^2,\mathbf{0},U}(\varphi) := \frac{1}{Z_{\mathbb{T}_n^2,\mathbf{0},U}} \exp\Big(-\sum_{(v,w)\in E(\mathbb{T}_n^2)} U(\varphi_v - \varphi_w)\Big)\, \delta_0(d\varphi_{\mathbf{0}}) \prod_{v\in V(\mathbb{T}_n^2)\setminus\{\mathbf{0}\}} d\varphi_v, \qquad (1.1)$$
where the vertices and edges of $\mathbb{T}_n^2$ are denoted by $V(\mathbb{T}_n^2)$ and $E(\mathbb{T}_n^2)$ respectively, $d\varphi_v$ denotes Lebesgue measure on $\varphi_v$, $\delta_0$ is a Dirac delta measure at $0$ and $Z_{\mathbb{T}_n^2,\mathbf{0},U}$ is a normalization constant. For this definition to make sense the potential $U$ needs to satisfy additional requirements. It suffices, for instance (see Lemma 3.1 for additional details), that
$$\inf_x U(x) > -\infty \quad\text{and}\quad 0 < \int \exp(-U(x))\,dx < \infty. \qquad (1.2)$$
Suppose ϕ is sampled from the measure µ T 2 n ,0,U . The expectation of ϕ is zero at all vertices by symmetry. How large are the fluctuations of ϕ around zero? Let us focus on the variance of ϕ at the vertex (n, n). It is expected that this variance is of order log n under mild conditions on U .
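To make the definition concrete, here is a minimal Python sketch that builds the torus described above and evaluates the unnormalized density $\exp(-\sum_{(v,w)} U(\varphi_v - \varphi_w))$ of a configuration. The function names (`torus_vertices`, `torus_edges`, `unnormalized_density`) are illustrative choices, not notation from the paper, and the hammock potential is used only as an example of an admissible $U$.

```python
import math
from itertools import product

def torus_vertices(n):
    """Vertices {-n+1, ..., n}^2 of the discrete torus with side length 2n."""
    coords = range(-n + 1, n + 1)
    return list(product(coords, coords))

def torus_edges(n):
    """Each undirected edge once: every vertex is paired with its right and upper neighbour (mod 2n)."""
    def wrap(a):
        # Map an integer back into the coordinate range {-n+1, ..., n}.
        return (a + n - 1) % (2 * n) - n + 1
    edges = []
    for (a, b) in torus_vertices(n):
        edges.append(((a, b), (wrap(a + 1), b)))
        edges.append(((a, b), (a, wrap(b + 1))))
    return edges

def hammock(x):
    """Hammock potential: 0 on [-1, 1] and +infinity outside."""
    return 0.0 if abs(x) <= 1 else math.inf

def unnormalized_density(phi, n, U):
    """exp(-sum of U over edge differences); phi maps vertices to real heights."""
    energy = sum(U(phi[v] - phi[w]) for v, w in torus_edges(n))
    return math.exp(-energy)
```

Since $U$ is even, the orientation chosen for each edge does not affect the energy. For the hammock potential the density is the indicator of the Lipschitz constraint, so the measure is uniform on admissible surfaces with $\varphi_{\mathbf{0}} = 0$.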
This has been shown when the potential $U$ is twice continuously differentiable with $U''$ bounded away from zero and infinity, and for certain extensions of this class, as discussed in the survey paper [31, Remarks 6 and 7]. Specifically, a lower bound of order $\log n$ has been established by Brascamp, Lieb and Lebowitz [5] when $U$ is twice continuously differentiable, $\int \exp(-\alpha U(x))\,dx < +\infty$ for all $\alpha > 0$, $\lim_{|x|\to\infty} (|x| + |U(x)|) \exp(-U(x)) = 0$, and either of the following holds: The class of potentials covered by their result can be further extended by taking suitable limits, as indicated in [5]. In addition, using arguments of Ioffe, Shlosman and Velenik [15] it is possible to derive qualitatively correct lower bounds for the variance for a class of possibly discontinuous potentials satisfying $\|U - \tilde U\|_\infty < \varepsilon$ for a small enough $\varepsilon > 0$ and some twice continuously differentiable $\tilde U$ satisfying $\sup_x \tilde U''(x) < \infty$.
The case of the hammock potential, when U (x) = 0 for |x| ≤ 1 and U (x) = ∞ for |x| > 1, is explicitly mentioned as open in [5] and [31,Open problem 2]. In this paper we prove a lower bound of order log n on the variance for a wide class of potentials which includes the hammock potential. A sample from the random surface measure with the hammock potential is depicted in Figure 1.1, both in 2 and 3 dimensions.
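For readers who want to experiment with the hammock model, the following is a minimal heat-bath (Glauber) sketch in Python. It is not the coupling-from-the-past sampler used for Figure 1.1 and gives only an approximate sample after finitely many sweeps. It relies on the observation that, conditionally on its neighbours, the height at a vertex is uniform on the interval of values compatible with the Lipschitz constraint. The names `neighbours`, `heat_bath_sweeps` and `is_lipschitz` are hypothetical.

```python
import random

def neighbours(v, n):
    """The four torus neighbours of v in {-n+1, ..., n}^2 (coordinates taken mod 2n)."""
    a, b = v
    def wrap(c):
        return (c + n - 1) % (2 * n) - n + 1
    return [(wrap(a + 1), b), (wrap(a - 1), b), (a, wrap(b + 1)), (a, wrap(b - 1))]

def heat_bath_sweeps(n, sweeps, seed=0):
    """Run systematic-scan heat-bath updates for the hammock potential, keeping phi(0,0) = 0."""
    rng = random.Random(seed)
    coords = range(-n + 1, n + 1)
    phi = {(a, b): 0.0 for a in coords for b in coords}  # the flat surface is admissible
    free = [v for v in phi if v != (0, 0)]               # the origin stays pinned at height 0
    for _ in range(sweeps):
        for v in free:
            heights = [phi[w] for w in neighbours(v, n)]
            lo = max(heights) - 1.0   # lowest height within 1 of every neighbour
            hi = min(heights) + 1.0   # highest height within 1 of every neighbour
            phi[v] = rng.uniform(lo, hi)
    return phi

def is_lipschitz(phi, n, tol=1e-12):
    """Check |phi_v - phi_w| <= 1 on every edge, up to a small floating-point tolerance."""
    return all(abs(phi[v] - phi[w]) <= 1.0 + tol for v in phi for w in neighbours(v, n))
```

Each update preserves the Lipschitz property, because the interval $[lo, hi]$ always contains the current height of the vertex being resampled.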
We say that $U \in C^2(I)$ for an interval $I \subseteq \mathbb{R}$ if $U$ is twice continuously differentiable on $I$. We consider the class of potentials $U$ satisfying the following condition:

Either $U \in C^2(\mathbb{R})$, or $U \in C^2((-K, K))$ for some $0 < K < \infty$ and $U(x) = \infty$ when $|x| > K$. (1.3)

This class includes the hammock potential as well as "double well" potentials, oscillating potentials with finite support (that is, infinity outside of a bounded interval) and all smooth examples. In the case that $U \in C^2((-K, K))$ we allow the possibility of a discontinuity at the endpoints $-K$ and $K$. The following theorem is the main result of this paper. Besides proving a lower bound on the variance at the vertex $(n, n)$ we obtain estimates also for other vertices, for small ball and large deviation probabilities and for the maximum of the random surface.

Theorem 1.1. Let $n \ge 2$ and let $\varphi$ be randomly sampled from $\mu_{\mathbb{T}_n^2,\mathbf{0},U}$. There exist constants $C(U), c(U) > 0$, depending only on $U$, such that for any $v \in V(\mathbb{T}_n^2)$ with $\|v\|_1 \ge (\log n)^2$ we have
$$\mathrm{Var}(\varphi_v) \ge c(U) \log(1 + \|v\|_1).$$
In addition,
$$\mathbb{P}\Big(\max_{v \in V(\mathbb{T}_n^2)} |\varphi_v| \ge c(U) \log n\Big) \ge \frac12.$$

Figure 1.1: A sample from the random surface measure with the hammock potential. Sampled using coupling from the past [28].
We remark that condition (1.2) is mainly required in this theorem for the probability measure (1.1) to make sense. One may replace it by other conditions of a similar nature. Additional remarks may be found following Theorem 4.1 below.
Our results can be viewed in a broader context of Mermin-Wagner-type arguments. Such arguments show, roughly, that continuous translational symmetry cannot be broken in one- or two-dimensional systems. For lattice models with compact spin spaces this implies that spins are uniformly distributed in the infinite volume limit. For lattice models with non-compact spin spaces, such as the random surface models we consider, such arguments prove delocalization and consequently non-existence of infinite volume Gibbs measures. We present now a non-exhaustive list of papers studying these phenomena. Such arguments were pioneered by Mermin and Wagner [18] who worked in a quantum context and relied on the so-called Bogoliubov inequalities. These techniques were later extended and transferred to a classical context; see e.g. Hohenberg [14] and Brascamp, Lieb and Lebowitz [5]. New techniques were developed by Dobrushin and Shlosman [6,7], McBryan and Spencer [17] and Fröhlich and Pfister [26,9]. The methods in all of the above papers require the potential to satisfy certain smoothness assumptions. Ioffe, Shlosman and Velenik [15] and Gagnebin and Velenik [12] presented extensions to some classes of non-smooth potentials.
These works left open the case of potentials taking infinite values and a solution to this problem came from Richthammer [29] who studied Gibbsian point processes in R 2 . Our approach follows closely his elaborate technique introduced for proving that all Gibbs states of such point process models are translation invariant, even in the presence of hard-core constraints as in the hard sphere model. The main ingredient in Richthammer's approach is an algorithm designed for perturbing a given configuration in a prescribed manner while preserving the hard-core constraints. Our proof adapts this algorithm from the continuum to the graph setting and from the point process to the random surface context. The resulting adaptation is presented in some detail in Section 2 and we hope that it will be useful in other contexts as well.
1.1. Overview of the proof. In order to illustrate our proof we first explain how to establish a lower bound on fluctuations in the simpler case that the potential $U$ satisfies
$$U \text{ is twice continuously differentiable on } \mathbb{R} \text{ and } \sup_x U''(x) < \infty, \qquad (1.4)$$
in addition to the condition (1.2). The methods of this section are similar to those of [26]. We then provide details on the modification of this method, following the approach of Richthammer, which we use for potentials satisfying condition (1.3).
We wish to convert the inequality (1.9) into an inequality of probabilities rather than densities. To this end define Let a > 0 and define On the one hand, by (1.9), (1.10) On the other hand, the Cauchy-Schwarz inequality and a change of variables using (1.8) and the fact that τ(0) = 0 yield (1.11) Putting together (1.10) and (1.11) and recalling (1.6) we obtain P(|ϕ (n,n) − log(2n + 1)| ≤ a)P(|ϕ (n,n) + log(2n Using the symmetry of the distribution of ϕ, the arithmetic-geometric mean inequality and taking $a := \frac13 \log(2n + 1)$ in the last inequality we conclude from which we conclude $\mathbb{E}\varphi_{(n,n)}^2 \ge c(U) \log n$ for some $c(U) > 0$. The inequality (1.5) follows as $\mathbb{E}\varphi_{(n,n)} = 0$ by symmetry.

1.1.2. Modification of the argument for potentials satisfying (1.3). For simplicity, assume the potential $U$ satisfies $U \in C^2([-1, 1])$ and $U(x) = \infty$ when $|x| > 1$, as more general potentials satisfying (1.3) may be treated by similar arguments. Let us say that a configuration $\psi : V(\mathbb{T}_n^2) \to \mathbb{R}$ is Lipschitz if $|\psi_v - \psi_w| \le 1$ for every pair of adjacent vertices $v$ and $w$. The measure $\mu_{\mathbb{T}_n^2,\mathbf{0},U}$ is supported on Lipschitz configurations (satisfying $\psi(\mathbf{0}) = 0$) under our assumption on $U$. The fundamental difficulty in applying the previous argument to this case is that, although $\psi$ is a Lipschitz configuration, one of the configurations $\psi^+$ or $\psi^-$ defined by (1.8) may fail to be, in which case the inequality (1.9) will not be satisfied. The solution we use for this problem is to replace the configurations $\psi^+$ and $\psi^-$ in the previous argument by $T^+(\psi)$ and $T^-(\psi)$, where $T^+$ and $T^-$ are certain mappings, termed addition algorithms in our paper, which share many of the properties of the operations of adding and subtracting $\tau$ while preserving the class of Lipschitz configurations.
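The difficulty can be seen on a three-vertex path. The Python sketch below uses made-up values of $\psi$ and $\tau$ (they are not taken from the paper) to show a Lipschitz configuration for which the naive shift $\psi + \tau$ violates the constraint while $\psi - \tau$ does not.

```python
def is_lipschitz(psi, edges):
    """True if |psi_v - psi_w| <= 1 on every edge."""
    return all(abs(psi[v] - psi[w]) <= 1.0 for v, w in edges)

# A path 0 - 1 - 2 with an edge of maximal slope between vertices 0 and 1.
edges = [(0, 1), (1, 2)]
psi = {0: 0.0, 1: 1.0, 2: 0.5}
tau = {0: 0.0, 1: 0.5, 2: 0.5}   # illustrative shift, vanishing at vertex 0

psi_plus = {v: psi[v] + tau[v] for v in psi}    # naive addition of tau
psi_minus = {v: psi[v] - tau[v] for v in psi}   # naive subtraction of tau
```

Here `psi_plus` has slope 1.5 on the edge (0, 1), so it is not admissible for the hammock potential. Any replacement for the naive shift must add less than $\tau$ near such steep edges.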
The definitions and properties of T + and T − are adapted from the work of Richthammer [29] who showed that all Gibbs states of point process models in R 2 with hard-core constraints, such as the hard sphere model, are translation invariant. Our adaptation translates Richthammer's notions from the continuum to the graph setting and from the point process to the random surface context. The main properties of T + and T − are detailed in Section 2.1. We highlight the possibility of defining these mappings for general graphs and general addition functions τ as we believe these extensions to be useful in other contexts and as they are captured with the same definitions and proofs.
The mappings T + and T − are defined to satisfy T − (ψ) := 2ψ − T + (ψ), just as in the definitions of ψ + and ψ − in (1.8). It thus suffices to define T + (ψ). Let us remark briefly on this definition for a Lipschitz configuration ψ. Roughly speaking, a certain ψ-dependent ordering on the vertices of the graph is chosen. Then, for each vertex v in this order, an amount between 0 and τ (v) is added to ψ v in such a way that the Lipschitz property is maintained with respect to the previously treated vertices in the chosen order. The amount added at vertex v is chosen to vary continuously with the value ψ v , in such a way that the resulting operation is invertible.
Two difficulties arise when replacing ψ + and ψ − by T + (ψ) and T − (ψ) in the argument of Section 1.1.1. First, the change of variables used in inequality (1.11) relied on the fact that the mappings ψ → ψ + τ and ψ → ψ − τ preserve Lebesgue measure. When making a change of variables from T + (ψ) and T − (ψ) to ψ a Jacobian factor enters, which needs to be estimated. Second, the argument uses the fact that ψ + (n,n) and ψ − (n,n) differ significantly from ψ (n,n) , by the amount log(2n + 1). Thus we also need to show that the difference of T + (ψ) (n,n) and T − (ψ) (n,n) from ψ (n,n) is close to log(2n + 1), at least for most configurations ψ. It turns out that both these difficulties may be overcome if we can control the following percolation-like process. We say an edge e = (v, w) ∈ E(T 2 n ) has extremal slope for the configuration ψ if |ψ v − ψ w | ≥ 1 − ε, for some small ε > 0 fixed in advance. Sampling ϕ randomly from the measure µ T 2 n ,0,U , we denote by E(ϕ) the random subgraph of T 2 n consisting of all edges with extremal slope for ϕ. Both difficulties described above may be overcome by showing that with high probability, the subgraph E(ϕ) is "subcritical" in the sense that its connected components are small. Proving this turns out to be a non-trivial task, which requires us to make use of reflection positivity techniques, specifically, the chessboard estimate. We remark that here (and only here) we rely essentially on the fact that T 2 n is a torus (i.e., has periodic boundary) and that the measure µ T 2 n ,0,U is normalized at the single vertex 0. Analogous estimates were also required in Richthammer's work [29] but were provided by the underlying Poisson process structure of the problem considered there, via so-called Ruelle bounds.
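As an illustration of the percolation-like object just described, the following Python sketch extracts the edges of extremal slope and the connected components of the resulting subgraph by breadth-first search. The helper names are hypothetical, and the sketch works on an arbitrary finite graph given by an edge list.

```python
from collections import defaultdict, deque

def extremal_edges(phi, edges, eps):
    """Edges (v, w) with |phi_v - phi_w| >= 1 - eps."""
    return [(v, w) for v, w in edges if abs(phi[v] - phi[w]) >= 1.0 - eps]

def components(vertices, edges):
    """Connected components (as vertex lists) of the graph (vertices, edges)."""
    adj = defaultdict(list)
    for v, w in edges:
        adj[v].append(w)
        adj[w].append(v)
    seen, comps = set(), []
    for start in vertices:
        if start in seen:
            continue
        seen.add(start)
        comp, queue = [], deque([start])
        while queue:                      # breadth-first search from `start`
            v = queue.popleft()
            comp.append(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        comps.append(comp)
    return comps

def largest_extremal_component(phi, vertices, edges, eps):
    """Number of vertices in the largest component of the extremal-slope subgraph."""
    comps = components(vertices, extremal_edges(phi, edges, eps))
    return max(len(c) for c in comps)
```

The "subcritical" regime in the text corresponds to this largest component being small for typical samples. The sketch reports component size, which is a cruder quantity than the radius and diameter used later in the paper.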
1.1.3. Reader's guide. In Section 2 we describe the mappings T + and T − mentioned in the previous section. The section begins by listing the main properties of T + and T − , continues with a precise definition of T + and proceeds to prove that the required properties of T + indeed hold with this definition. In Section 3 we discuss reflection positivity for random surface models and prove, via the chessboard estimate, that the subgraph of edges with extremal slopes mentioned in the previous section is "subcritical" with high probability. Sections 2 and 3 address disjoint aspects of the problem and may be read independently. In Section 4 we prove our main theorem, Theorem 1.1, under alternative assumptions, by modifying the argument presented in Section 1.1.1 to make use of the mappings T + and T − and extending it to provide information also on small ball and large deviation probabilities and on the maximum of the random surface. In the short Section 5 we use the results of Section 3 to reduce Theorem 1.1 to the case discussed in Section 4. Section 6 contains a discussion of future research directions and open questions.

The addition algorithm and its properties
In this section we define the addition algorithm T + which forms a core part of our proof. The algorithm is an adaptation to the graph setting of an algorithm of Richthammer [29] used in a continuum setting. Our presentation adapts the proofs in [29] but emphasizes the applicability of the algorithm to general graphs and general addition functions τ .

2.1. Properties of the addition algorithm. Here we describe the properties of the addition algorithm which will be used by our application. The algorithm itself is defined in the next section and the fact that it satisfies the stated properties is verified in the subsequent sections.
Let $G = (V, E)$ be a finite, connected graph. We sometimes write $v \sim w$ to denote that $(v, w) \in E$. Let $\tau : V \to [0, \infty)$ and $0 < \varepsilon \le \frac12$ be given. We define a pair of measurable mappings $T^+, T^- : \mathbb{R}^V \to \mathbb{R}^V$ related by the equality
$$T^+(\varphi) - \varphi = \varphi - T^-(\varphi) \qquad (2.1)$$
and satisfying the following properties:
(1) $T^+$ and $T^-$ are one-to-one and onto.
(2) For every $\varphi \in \mathbb{R}^V$ and every $v \in V$, $0 \le T^+(\varphi)_v - \varphi_v \le \tau(v)$.
The properties stated so far do not exclude the possibility that $T^+$ is the identity mapping (implying the same for $T^-$ by (2.1)). The next property shows that $T^+(\varphi) - \varphi$ is close to $\tau$ under certain restrictions on the set of edges on which $\varphi$ changes by more than $1 - \varepsilon$. We require a few definitions. Let $d_G$ stand for the graph distance in $G$. The next two definitions concern the Lipschitz properties of $\tau$.
In the following definitions we consider the connectivity properties of the subset of edges on which ϕ changes by more than and write, for a pair of vertices where we mean in particular v Together with property (2) above this shows that T + (ϕ) − ϕ and ϕ − T − (ϕ) are approximately equal to τ when M (ϕ) ≤ L(τ, ε). A slightly stronger property is given in Proposition 2.7 below. Our final property regards the change of measure induced by the mappings T + and T − . We bound the Jacobians of these mappings when the subgraph E(ϕ) does not contain many large connected components.
Partition the vertex set V into V 0 and V 1 by letting Given a function θ : V 0 → R we write for the measure on R V given by product Lebesgue measure on the subspace where (5) There are measurable functions J + : R V → [0, ∞) and J − : R V → [0, ∞) satisfying that for every θ : V 0 → R and every g :

2.2. Description of the addition algorithm. In this section we define the mapping $T^+$ whose properties were discussed in the previous section.
Let the graph $G = (V, E)$, function $\tau$ and constant $\varepsilon$ be as above. Fix an arbitrary total order on the vertex set $V$. Define a Lipschitz "bump" function $f : \mathbb{R} \to \mathbb{R}$ by (2.11). We also define a family of shifted and rescaled versions of $f$: for a vertex $v \in V$ and $h, t \in \mathbb{R}$ let $m_{v,h,t} : \mathbb{R} \to \mathbb{R}$ be given by (2.12), which treats the cases $\tau(v) \ge t$ and $\tau(v) < t$ separately. One should have in mind the case $\tau(v) \ge t$ and think of $m_{v,h,t}$ as being the same as $f$, scaled and shifted to have maximum $\tau(v)$, minimum $t$ and to have its "center" at $h$. However, if the function just described has Lipschitz constant more than $1/2$, we lower its maximum so that its Lipschitz constant becomes $1/2$. For easy reference we record this as
$$\text{the function } m_{v,h,t} \text{ has Lipschitz constant at most } \tfrac12. \qquad (2.13)$$
The case $\tau(v) < t$ is not used in the definition of $T^+$ below. It is included here as it is technically convenient in the analysis to have $m_{v,h,t}$ defined for all values of the parameters. The definition of $T^+$ is based on the following algorithm. The algorithm takes as input a function $\varphi \in \mathbb{R}^V$. It outputs three sequences indexed by $1 \le k \le |V|$:
(1) A sequence $(P_k)$ which is an ordering of the vertices $V$, that is, $\{P_k\} = V$.
(2) A sequence $(s_k) \subseteq [0, \infty)$ with $s_k$ representing the amount to add to $\varphi$ at vertex $P_k$.
(3) A sequence $(\tau_k)$ of functions, $\tau_k : V \times \mathbb{R} \to \mathbb{R}$, which will play a role in analyzing the Jacobian of the mapping $T^+$.
The mapping $T^+$ is then defined by
$$T^+(\varphi) := \tilde\varphi \quad\text{with}\quad \tilde\varphi_{P_k} := \varphi_{P_k} + s_k, \quad 1 \le k \le |V|. \qquad (2.14)$$

Addition algorithm:
Initialization. Set $\tau_1(v, h) := \tau(v)$ for all $v \in V$ and $h \in \mathbb{R}$.
Loop. For $k$ between $1$ and $|V|$ do:
(1) Let $P_k$ be a vertex $v \in V \setminus \{P_1, \ldots, P_{k-1}\}$ minimizing $\tau_k(v, \varphi_v)$. If there are multiple vertices achieving the same minimum let $P_k$ be the smallest one with respect to the total order.
(2) Set $s_k := \tau_k(P_k, \varphi_{P_k})$.
(3) For $v \in V$ and $h \in \mathbb{R}$ set
$$\tau_{k+1}(v, h) := \begin{cases} \min\{\tau_k(v, h),\, m_{v,\varphi_{P_k},s_k}(h)\} & \text{if } v \sim P_k \text{ and } v \notin \{P_1, \ldots, P_k\}, \\ \tau_k(v, h) & \text{otherwise.} \end{cases} \qquad (2.15)$$

Table 1: An illustration of the action of the addition algorithm on a function defined on a 2x3 grid graph. The six panels show the six iterations of the loop. In each iteration the highlighted vertex (green, gray in b&w) is set to be $P_k$ and processed, it is shifted by $s_k$ (in the example, $s_2 := 0.2$ and $s_3 := 0.3$), and the requested shifts of the remaining vertices are updated. The panels illustrate the following points: the next vertex to be processed need not be adjacent to the previously processed vertices; the requested shift of a vertex may be decreased several times; for the processed vertices the requested shifts are no longer changed (e.g., $\tau_6 := \tau_5$ for all vertices except the last one); the consecutive shifts increase.

In the next sections we verify that the mapping $T^+$ defined by (2.14) satisfies the properties declared in Section 2.1.
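The loop described in this section can be sketched in a few lines of Python. The sketch rests on stated assumptions: the displays (2.11) and (2.12) are not reproduced here, so `bump` and `m` below are one concrete choice consistent with the verbal description (maximum at most $\tau(v)$, minimum $t$, center $h$, Lipschitz constant at most $1/2$), and the update of the requested shifts follows the reading that only unprocessed neighbours of the current vertex are affected. All function names are hypothetical.

```python
def bump(x, eps):
    """Assumed form of f: equal to 1 for |x| <= 1 - eps, 0 for |x| >= 1, linear in between."""
    return min(1.0, max(0.0, (1.0 - abs(x)) / eps))

def m(tau_v, h, t, x, eps):
    """Assumed form of m_{v,h,t}(x): minimum t, maximum at most tau_v, Lipschitz constant <= 1/2."""
    if tau_v < t:
        return tau_v
    return t + min(tau_v - t, eps / 2.0) * bump(x - h, eps)

def addition_algorithm(phi, tau, adj, eps):
    """Return (T_plus_phi, order, shifts).

    phi, tau: dicts vertex -> float; adj: dict vertex -> list of neighbours.
    """
    # constraints[v] stores pairs (h, t) = (height of a processed neighbour, its shift).
    # The requested shift tau_k(v, x) is the minimum of tau(v) and m_{v,h,t}(x) over these pairs.
    constraints = {v: [] for v in phi}

    def requested(v, x):
        return min([tau[v]] + [m(tau[v], h, t, x, eps) for h, t in constraints[v]])

    remaining = sorted(phi)   # the sorted order plays the role of the fixed total order
    order, shifts, out = [], [], {}
    while remaining:
        # Step (1): unprocessed vertex with the smallest requested shift.
        # min() returns the first minimizer, so ties go to the smallest vertex.
        p = min(remaining, key=lambda v: requested(v, phi[v]))
        s = requested(p, phi[p])          # step (2): the shift actually applied
        out[p] = phi[p] + s
        order.append(p)
        shifts.append(s)
        remaining.remove(p)
        for w in adj[p]:                  # step (3): update unprocessed neighbours
            if w not in out:
                constraints[w].append((phi[p], s))
    return out, order, shifts
```

On an edge with $|\varphi_v - \varphi_w| \ge 1$ the bump vanishes, so both endpoints receive the same shift and the difference across the edge is unchanged. This is the mechanism that keeps a Lipschitz configuration Lipschitz.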

2.3. Increments and Lipschitz property.
In this section we verify properties (2) and (3) from Section 2.1 for T + . Property (2) is an immediate consequence of the definition (2.14) of T + combined with (2.17) below.
Proof. Observe that, by (2.12), we have We shall prove by induction that Recall that the function τ is non-negative. It follows from (2.19), (2.21) and the initialization and step (3) of the addition algorithm that In particular, s k = τ k (P k , ϕ P k ) ≥ 0. Thus, (2.21) remains true when k is replaced by k + 1. We conclude that (2.20) holds. It now follows, in the same way that (2.22) was deduced from (2.21), that (2.16) is valid. Now (2.17) is verified upon recalling that s k = τ k (P k , ϕ P k ). It remains to verify (2.18). Let 1 ≤ k < |V |. Our choice of the point P k in step (1) of the addition algorithm ensures that (2.23) In addition, it follows from (2.19) that Thus (2.15) implies that As k is arbitrary, this establishes (2.18).
In the next lemma we investigate the gradient of T + (ϕ), establishing property (3) from Section 2.1 for T + .
Proof. Fix an edge $(v, w) \in E$. Assume without loss of generality that $v = P_k$ and $w = P_\ell$ for some $1 \le k < \ell \le |V|$. Observe that, by step (3) of the addition algorithm, Then, by the definition (2.12) of $m$, we have that Combining the last two inequalities with (2.18) shows that $s_\ell = s_k$. The equality (2.24) now follows from (2.14). Assume now that $|\varphi_v - \varphi_w| < 1$. On the one hand, by (2.18), On the other hand, by (2.26) and the definition (2.12) of $m$, Therefore, by (2.13) and our assumption that $|\varphi_v - \varphi_w| < 1$,

2.4. Bijectivity. In this section we define an inverse $(T^+)^{-1}$ to the mapping $T^+$, thereby establishing that $T^+$ is one-to-one and onto as claimed in property (1) from Section 2.1. The definition of $(T^+)^{-1}$ uses the same graph $G = (V, E)$, function $\tau$, constant $\varepsilon$, total order on $V$ and family of functions $m_{v,h,t}$ as the definition of $T^+$. It is based on the following algorithm which takes as input a function $\tilde\varphi \in \mathbb{R}^V$ and outputs four sequences indexed by $1 \le k \le |V|$:
(1) A sequence $(\tilde P_k)$ which is an ordering of the vertices $V$.
(2) A sequence $(\tilde s_k) \subseteq [0, \infty)$ with $\tilde s_k$ representing the amount to subtract from $\tilde\varphi$ at vertex $\tilde P_k$.
(3) Two auxiliary sequences of functions, $(\tilde\tau_k)$ and $(\tilde D_k)$.

Inverse addition algorithm: Loop. For $k$ between $1$ and $|V|$ do: If there are multiple vertices achieving the same minimum let $\tilde P_k$ be the smallest one with respect to the total order.
is also continuous and strictly increasing and we have

Proof. Fix $\tilde\varphi \in \mathbb{R}^V$ and $v \in V$. We prove the lemma by induction. Let $1 \le \ell \le |V|$, suppose the algorithm is well-defined and the lemma holds for all $1 \le k < \ell$ and let us prove the assertions of the lemma for $k = \ell$. Observe that $\tilde\tau_\ell(v, \cdot)$ is obtained by taking the minimum of $\tau(v)$ and the function $m_{v,h,t}(\cdot)$ with various values of $h$ and $t$. Thus, since $m_{v,h,t}(\cdot)$ has Lipschitz constant at most $\frac12$ by (2.13), it follows that $\tilde\tau_\ell(v, \cdot)$ has Lipschitz constant at most $\frac12$. Thus $h \mapsto h + \tilde\tau_\ell(v, h)$ is continuous and strictly increasing from $\mathbb{R}$ onto $\mathbb{R}$. The remaining assertions of the lemma are immediate consequences.
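The monotonicity used in this proof is what makes the inverse computable in practice: a map $h \mapsto h + \tau(h)$ with $\tau$ of Lipschitz constant at most $1/2$ has slope between $1/2$ and $3/2$, so it can be inverted by bisection. The Python sketch below illustrates this on a hypothetical example function `example_tau`. It is not the inverse addition algorithm itself, only the one-dimensional inversion step that it relies on.

```python
def example_tau(h):
    """A tent function with Lipschitz constant 1/2 (illustrative choice)."""
    return 0.5 * max(0.0, 1.0 - abs(h))

def invert_shift(tau_fn, y, iterations=200):
    """Solve h + tau_fn(h) = y by bisection, assuming tau_fn is (1/2)-Lipschitz."""
    # g(h) = h + tau_fn(h) satisfies |g(h) - g(0)| >= |h| / 2, so the solution
    # lies within distance 2 * |y - g(0)| of the origin.
    radius = 2.0 * abs(y - tau_fn(0.0)) + 1.0
    lo, hi = -radius, radius
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid + tau_fn(mid) < y:
            lo = mid        # the solution lies to the right of mid
        else:
            hi = mid        # the solution lies at or to the left of mid
    return 0.5 * (lo + hi)
```

A fixed number of halvings is used instead of a tolerance test, so the loop always terminates regardless of the magnitude of `y`.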
These assertions are proved in the next two sections.
2.4.1. Injectivity. In this section we prove (2.29), showing that T + is one-to-one.
be the sequences generated when calculating T + (ϕ) and when calculating (T + ) −1 (φ) withφ := T + (ϕ). By (2.14) and (2.27) it suffices to show thatP We prove this claim by induction. We haveτ 1 = τ 1 by the initialization steps of the algorithms. Fix 1 ≤ k ≤ |V | and assume that (2.31) We need to show thatP These sequences need not be equal. However, they satisfy certain relations as the following lemma clarifies.
Comparing the definitions of P k , s k and τ k+1 with those ofP k ,s k andτ k+1 and using (2.31) and (2.14) we deduce from the lemma that (2.32) holds, completing the inductive proof.
Proof of Lemma 2.4. Let us first show that $\tilde\Delta_{P_k} = \Delta_{P_k}$. By (2.14) and (2.31), Thus $\tilde D_k(P_k, \tilde\varphi_{P_k}) = \varphi_{P_k}$ and hence, using (2.31) again, Hence we may write (2.33) Hence, since $\tilde D_k(v, \cdot)$ is increasing by Lemma 2.3, we conclude that (2.34) Consequently, by (2.33) and Lemma 2.3, It follows that $\tilde\Delta_v \ge s_m$, whence, by (2.18), as we wanted to prove. Lastly, suppose that equality holds in (2.36). It follows that equality holds also in (2.35) and hence in (2.34). Thus, using (2.31),

2.4.2. Surjectivity. In this section we prove (2.30), showing that $T^+$ is onto. The proof is similar to the proof that $T^+$ is one-to-one as given in the previous section.
The proof requires the following lemma, an analog of Lemma 2.1 for T + .
Proof. The proof of (2.37) and (2.38) follows in exactly the same way as the proof of Lemma 2.1 with $(\tilde P_k)$, $(\tilde s_k)$ and $(\tilde\tau_k)$ replacing $(P_k)$, $(s_k)$ and $(\tau_k)$. It remains to prove (2.39). We start by showing that To verify this, observe that by Our choice of the point $\tilde P_k$ in step (2) of the inverse addition algorithm ensures that The definition (2.12) of $m$ implies that Putting together (2.41) and (2.42) and recalling (2.28) yields be the sequences generated when calculating $T^+(\varphi)$ with $\varphi := (T^+)^{-1}(\tilde\varphi)$ and when calculating $(T^+)^{-1}(\tilde\varphi)$. To show that $T^+$ is onto it suffices, by (2.14) and (2.27), to show that We prove this claim by induction. We have $\tau_1 = \tilde\tau_1$ by the initialization steps of the algorithms. Fix $1 \le k \le |V|$ and assume that
$$P_j = \tilde P_j,\ s_j = \tilde s_j \text{ for } 1 \le j < k \quad\text{and}\quad \tau_j = \tilde\tau_j \text{ for } 1 \le j \le k. \qquad (2.43)$$
We need only show that As in the previous section, these sequences satisfy certain relations as the following lemma clarifies.
Comparing the definitions of P k , s k and τ k+1 with those ofP k ,s k andτ k+1 and using (2.43) and (2.27) we deduce from the lemma that (2.44) holds, completing the inductive proof.
Proof of Lemma 2.6. Let us first show that ∆P k =∆P k . By (2.27) and Lemma 2.3, Hence we may write, using Lemma 2.3, Consequently, by (2.43), (2.37) and (2.39), 2.5. The shifts produced by the algorithm. Our goal in this section is to analyze the shifts produced by the addition algorithm of Section 2.2 and to give conditions under which T + (ϕ) v − ϕ v is approximately equal to τ (v). Corollary 2.8 verifies property (4) from Section 2.1 for T + .
Recall from Section 2.1 that E(ϕ) is the subgraph of edges on which ϕ changes by at least 1 − ε, that r(ϕ, v) is the radius of the connected component of v in E(ϕ) and M (ϕ) is the diameter of the largest connected component of E(ϕ). Recall also the definitions of τ (v, k) and L(τ, ε). Depending on the choice of τ and ε the value of L(τ, ε) may be negative, though our theorems will be meaningful only when this is not the case. The following is the main proposition of this section.
The definitions of M (ϕ) and L(τ, ε) imply the following corollary.
Proof of Proposition 2.7. Fix ϕ ∈ R V and let (P k ), (s k ) and (τ k ) be the outputs of the addition algorithm of Section 2.2 when running on the input ϕ. For v ∈ V , let k v stand for that integer for which v = P kv and let . The lower bound in Proposition 2.7 is a consequence of the following fact: For any v ∈ V , then the definition of the addition algorithm implies that σ v = τ (v) and (2.47) follows. Otherwise, let u be a neighbor of v with k u < k v and note that necessarily u ∈ E(ϕ, v) by our assumption on v. Now, the induction hypothesis (2.47), definitions (2.3), (2.4) and (2.7) and our assumption that M (ϕ) ≤ L(τ, ε) yield that This, together with |ϕ v − ϕ u | < 1 − and (2.12), imply that in step 3 of the addition algorithm, As u is an arbitrary neighbor of v with k u < k v we conclude that σ v = τ (v) as required in (2.47). Now suppose that v is not the first vertex visited in E(ϕ, v). Let u be the vertex of E(ϕ, v) with minimal k u . Clearly, k u < k v and by the induction hypothesis (2.47), σ u = τ (u). Thus, Lemma 2.1 and (2.7) yield that r(ϕ, v)), finishing the proof of (2.47).
2.6. Jacobian definition. In this section we find a formula for the Jacobian of the mapping $T^+$. We start with some smoothness properties of the functions used in defining $T^+$. We write $(P_k)$, $(s_k)$ and $(\tau_k)$ for the outputs of the addition algorithm of Section 2.2 when running on the input $\varphi$.

Lemma 2.9. For any $\varphi \in \mathbb{R}^V$, $1 \le k \le |V|$ and $v \in V$, the function $\tau_k(v, \cdot)$ is everywhere differentiable from the right and is Lipschitz continuous with Lipschitz constant at most $\frac12$.

Proof. The function $\tau_k(v, \cdot)$ is defined by taking a pointwise minimum of the constant function $\tau(v)$ and functions of the form $m_{w,h,t}(\cdot)$ for various values of the parameters $w$, $h$ and $t$. The lemma follows by noting that both $\tau(v)$ and $m_{w,h,t}(\cdot)$ are everywhere differentiable from the right and Lipschitz continuous with Lipschitz constant at most $\frac12$ (see (2.12) and (2.13)) and these properties are preserved under taking pointwise minimum (it follows, in fact, that $\tau_k(v, \cdot)$ is piecewise linear with all slopes of size at most $\frac12$).
Let $J^+ : \mathbb{R}^V \to (0, \infty)$ be defined by
$$J^+(\varphi) := \prod_{k=1}^{|V|} \big(1 + \partial_2 \tau_k(P_k, \varphi_{P_k})\big), \qquad (2.48)$$
where the notation $\partial_2 \tau_k(P_k, \varphi_{P_k})$ stands for the right derivative of $\tau_k$ with respect to its second variable (which exists by Lemma 2.9), evaluated at $(P_k, \varphi_{P_k})$. Lemma 2.9 ensures also that the factors in the product are positive.
Recall the definition of the partition V 0 , V 1 of V and the measure dµ θ from (2.8) and (2.9).
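Lemma 2.10. For any $\theta : V_0 \to \mathbb{R}$ and any function $g : \mathbb{R}^V \to \mathbb{R}$ integrable with respect to $d\mu^\theta$, the function $g(T^+(\varphi)) J^+(\varphi)$ is integrable with respect to $d\mu^\theta$ and
$$\int g(T^+(\varphi)) J^+(\varphi)\, d\mu^\theta(\varphi) = \int g(\varphi)\, d\mu^\theta(\varphi). \qquad (2.49)$$
We remark that $T^+$ is clearly Borel measurable by its definition in Section 2.2 and hence the integrand on the left-hand side of (2.49) is measurable. The rest of the section is devoted to proving this lemma.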
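The change of variables in Lemma 2.10 can be checked numerically in the simplest one-dimensional situation, where the map is $x \mapsto x + \tau(x)$ and the Jacobian is $1 + \tau'(x)$ with $\tau'$ the right derivative. The Python sketch below does this for a tent-shaped $\tau$ with slopes $\pm 1/2$ and a Gaussian test function. Both are illustrative choices and not objects from the paper, and the midpoint rule is used only as a convenient quadrature.

```python
import math

def tau(x):
    """Tent function with slopes +1/2 and -1/2, supported on [-1, 1]."""
    return 0.5 * max(0.0, 1.0 - abs(x))

def tau_right_derivative(x):
    """Right derivative of tau, defined everywhere (including the kinks)."""
    if -1.0 <= x < 0.0:
        return 0.5
    if 0.0 <= x < 1.0:
        return -0.5
    return 0.0

def midpoint_integral(fn, a, b, steps):
    """Midpoint-rule approximation of the integral of fn over [a, b]."""
    width = (b - a) / steps
    return width * sum(fn(a + (i + 0.5) * width) for i in range(steps))

def g(y):
    """Test function integrated against Lebesgue measure."""
    return math.exp(-y * y)

# Left side: integrand composed with the shifted map, weighted by the Jacobian 1 + tau'.
lhs = midpoint_integral(lambda x: g(x + tau(x)) * (1.0 + tau_right_derivative(x)), -8.0, 8.0, 16000)
# Right side: the plain integral of g.
rhs = midpoint_integral(g, -8.0, 8.0, 16000)
```

The two numbers `lhs` and `rhs` agree up to quadrature error, and both approximate $\sqrt{\pi}$. The Jacobian factor takes only the values $1/2$, $1$ and $3/2$, matching the observation that each factor in (2.48) is positive.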
We need the following basic facts about Lipschitz continuous maps.
where we have written dϕ for the Lebesgue measure on R d . Here, as remarked in [8], T −1 (ψ) is at most countable for almost every ψ. Now, let Σ stand for the set of bijections σ : {1, . . . , |V |} → V . For each σ ∈ Σ define the set Referring back to the definition of the addition algorithm in Section 2.2 we see that each A σ is measurable, possibly empty, and R V = ∪ σ∈Σ A σ . For each σ ∈ Σ we define a version of the addition algorithm in which the points are taken in the order σ. More precisely, we define an algorithm taking as input a function ϕ ∈ R V and outputting two sequences indexed by 1 ≤ k ≤ |V |: Loop. For k between 1 and |V | do: (1), . . . , σ(k)} and v ∼ σ(k) . (2.52) We then define a mapping T σ : Comparing the definitions of T + and T σ we conclude that Fix a θ : V 0 → R and let Observe that T + maps X bijectively onto X by properties (1) and (2) (see Section 2.1) and the definition of V 0 . The measure dµ θ is supported on X; identifying X with R V1 in the natural way it coincides with the Lebesgue measure on X. By (2.12), the function m v,h,t (h ) is Lipschitz continuous as a function of h, t and h , for every fixed v. In addition, the composition and pointwise minimum of Lipschitz continuous functions is also Lipschitz continuous. It follows that for every v and k, the function τ σ k (v, h) is Lipschitz continuous as a function of h and ϕ (i.e., as an implicit function of ϕ w for every w ∈ V ). We thus deduce from the definition of s σ k and (2.53) that T σ is a Lipschitz continuous map. We also note that T σ maps X into X since as follows by induction on k using the fact that m v,h,t ≥ t by (2.12). Thus we may apply the formula (2.50) (by identifying X with R V1 and dµ θ with the Lebesgue measure on R V1 ) to obtain thatˆX for every σ ∈ Σ and h : X → R integrable with respect to dµ θ . Here and below, we denote by We continue to find a formula for |det(∇ V1 T σ (ϕ))|. 
We note first that ∇ V1 T σ (ϕ) exists for dµ θalmost every ϕ ∈ X as, by the above discussion, T σ is Lipschitz continuous from X to X. By construction of T σ , ∇ V T σ has a triangular form when its rows and columns are sorted in the order of σ. Hence the definition of s σ k , (2.53) and (2.55) yield that for dµ θ -almost every ϕ ∈ X we have Now let h : R V → R be a function integrable with respect to dµ θ and define Putting together (2.48), the fact that J + ≥ 0, (2.51), (2.54), (2.57) and (2.56) we havê Finally, T + is invertible by Section 2.4 and T + = T σ on A σ by (2.54). Hence T σ restricted to A σ is one-to-one. Thus, since h σ (ϕ) = 0 when ϕ / ∈ A σ , we may continue the last equality to obtain This equality is obtained for any h : R V → R integrable with respect to dµ θ . Letting g : R V → R be integrable with respect to dµ θ , Lemma 2.10 now follows by substituting h with g(T + (ϕ)). Formally, this is done by using the above equality to approximate g(T + (ϕ)) with h which are integrable with respect to dµ θ .

2.7. Properties of $T^-$. The relation (2.1) defines a mapping $T^- : \mathbb{R}^V \to \mathbb{R}^V$ by
$$T^-(\varphi) := 2\varphi - T^+(\varphi). \qquad (2.58)$$
In this section we establish that $T^-$ satisfies similar properties to those proved for $T^+$, as claimed in Section 2.1.
In this section, to emphasize the dependence on ϕ, we write (P ϕ k ), (s ϕ k ) and (τ ϕ k ) for the outputs of the addition algorithm of Section 2.2 when running on the input ϕ. Putting together (2.14) and (2.58) we see that We claim that, due to the symmetry of the function f of (2.11), To see this observe first that the symmetry of f and (2.12) imply Thus, examining the addition algorithm of Section 2.2 we conclude that (2.61) Together with (2.14), this equality implies (2.60).
Now, the fact that T − satisfies properties (1), (2), (3) and (4) in Section 2.1 follows immediately from (2.58), (2.60) and the fact that T + satisfies these properties. We now show that T − also satisfies (2.10). Define J − : analogously to (2.48). Observe that J − (ϕ) = J + (−ϕ) by (2.58) and (2.61). Recall the definition of the measure dµ θ from (2.9). Using (2.60) and the equality (2.10) for T + we have for every θ : V 0 → R and every g : R V → [0, ∞), integrable with respect to dµ θ , We remark that the symmetry of the function f of (2.11), while essential for establishing (2.60), is not necessary for establishing the properties of T − described in Section 2.2. These properties may also be obtained without using (2.60) by repeating the proofs used for T + .
2.8. The geometric average of the Jacobians. In this section we provide an estimate for the geometric average of the Jacobians J + and J − in terms of the connectivity properties of the subgraph E(ϕ) and the Lipschitz properties of the function τ . This estimate establishes property (5) from Section 2.1.
Proof. Fix ϕ ∈ R V satisfying M (ϕ) ≤ L(τ, ε). Write (P k ), (s k ) and (τ k ) for the outputs of the addition algorithm of Section 2.2 when running on the input ϕ. Denote σ v := T + (ϕ) v − ϕ v for v ∈ V . By (2.48) and (2.62) we get where we have used that |∂ 2 τ k (P k , ϕ P k )| ≤ 1/2 for all k according to Lemma 2.9. Examination of the addition algorithm of Section 2.2 reveals that τ k (v, h) is the minimum of τ (v) and m v,ϕw,σw (h) where w ranges over a (possibly empty) subset of the neighbors of v. Observing that the Lipschitz constant of m v,h,t is at most max{(1/ε)(τ (v) − t), 0} by (2.12), we see that Now, using our assumption that M (ϕ) ≤ L(τ, ε), Proposition 2.7 yields that The lemma follows by substituting this estimate in (2.63).

Reflection positivity for random surfaces
Recall the random surface measure µ T 2 n ,0,U , defined in (1.1), corresponding to a potential U . In this section we estimate the probability that the random surface has many edges with large slopes.
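We start by explaining why the measure µ T 2 n ,0,U is well-defined under our assumptions.

Lemma 3.1. The measure µ T 2 n ,0,U is well-defined for any potential U satisfying condition (1.2). In addition, there exists a constant c(U ) > 0 for which the lower bound (3.1) on Z T 2 n ,0,U holds.

Proof. Let U be a potential satisfying condition (1.2). In order that µ T 2 n ,0,U be well-defined it suffices that the normalization constant

$$Z_{T_n^2,0,U} := \int \exp\Big(-\sum_{(v,w)\in E(T_n^2)} U(\varphi_v-\varphi_w)\Big)\,\delta_0(d\varphi_0)\prod_{v\in V(T_n^2)\setminus\{0\}} d\varphi_v \tag{3.2}$$

satisfies 0 < Z T 2 n ,0,U < ∞. We first show that Z T 2 n ,0,U < ∞. Let S be a spanning tree of T 2 n , regarded here as a subset of edges. Then

$$Z_{T_n^2,0,U} \le \exp\Big(-\big(|E(T_n^2)|-|S|\big)\inf_x U(x)\Big)\int \prod_{(v,w)\in S} \exp\big(-U(\varphi_v-\varphi_w)\big)\,\delta_0(d\varphi_0)\prod_{v\in V(T_n^2)\setminus\{0\}} d\varphi_v,$$

where the prefactor is finite since inf x U (x) > −∞ by (1.2). By integrating the vertices in V (T 2 n ) \ {0} leaf by leaf according to the spanning tree S, the integral above equals $\big(\int \exp(-U(x))\,dx\big)^{|S|}$, which is finite by (1.2).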
We now prove (3.1), implying in particular that Z T 2 n ,0,U > 0. Condition (1.2) implies the existence of some α < ∞ for which the set A := {x : U (x) ≤ α} has positive measure. The Lebesgue density theorem now yields the existence of a point a ∈ A and an ε > 0 such that where we write |B| for the Lebesgue measure of a set B ⊆ R. This implies that and, using that U (x) = U (−x), the analogous statement Denote by (V even , V odd ) a bipartition of the vertices of the bipartite graph T 2 n , with 0 ∈ V even , and define the following set of configurations, We conclude from the definition of A, (3.3) and (3.4) that the integral in (3.2), restricted to the set Ω, is at least (0.4ε exp(−α)) |V (T 2 n )\{0}| > 0. This can be seen by again fixing a spanning tree of T 2 n and integrating the vertices in V (T 2 n ) \ {0} leaf by leaf according to it. As a side note we remark that the fact that T 2 n is bipartite was essential for showing that Z T 2 n ,0,U > 0. If T 2 n is replaced by a triangle graph on 3 vertices then the analogous quantity to Z T 2 n ,0,U is zero when, say, {x : 3]. However, the above argument can be easily modified to work for all graphs if {x : U (x) < ∞} contains an interval around 0.
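The role of bipartiteness in the argument above can be checked mechanically on small instances. The following Python sketch builds T 2 n as defined in the introduction and verifies that colouring a vertex (a, b) by the parity of a + b is a valid bipartition with 0 in the even class, whereas a triangle admits no bipartition. The helper names `wrap`, `torus_vertices`, `neighbors` and `two_colouring` are illustrative choices and do not come from the paper.

```python
from collections import deque


def wrap(x, n):
    """Representative of x modulo 2n in {-n+1, ..., n}."""
    return (x + n - 1) % (2 * n) - n + 1


def torus_vertices(n):
    """Vertex set {-n+1, ..., n}^2 of the torus T_n^2."""
    rng = range(-n + 1, n + 1)
    return [(a, b) for a in rng for b in rng]


def neighbors(v, n):
    """The four nearest neighbours of v, coordinates taken modulo 2n."""
    a, b = v
    return [(wrap(a + 1, n), b), (wrap(a - 1, n), b),
            (a, wrap(b + 1, n)), (a, wrap(b - 1, n))]


def two_colouring(start, nbrs):
    """BFS 2-colouring of the component of `start`; None if an odd cycle exists."""
    colour = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in nbrs(v):
            if w not in colour:
                colour[w] = 1 - colour[v]
                queue.append(w)
            elif colour[w] == colour[v]:
                return None  # two adjacent vertices share a colour
    return colour


n = 3
colour = two_colouring((0, 0), lambda v: neighbors(v, n))
triangle = two_colouring(0, lambda v: [u for u in range(3) if u != v])
```

Here `colour` is a dictionary on all (2n)^2 vertices, while `triangle` is `None`, reflecting the odd cycle that obstructs the alternating construction of the set Ω.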
For 0 < L < ∞ and 0 < δ < 1 we say a potential U has (δ, L)-controlled gradients on T 2 n if the following holds: (1) There exists some K > L such that U (x) < ∞ for |x| < K.
This theorem is proved in the following sections, making use of reflection positivity and the chessboard estimate.
3.1. Reflection positivity. We start by reviewing the basic definitions pertaining to our use of reflection positivity and the chessboard estimate. Our treatment is based on [3, Section 5]. Let n ≥ 1. For −n + 1 ≤ j ≤ n the vertical plane of reflection P ver j (passing through vertices) is the set of vertices P ver j := {(j, k) ∈ V (T 2 n ) : − n + 1 ≤ k ≤ n}. The plane P ver j divides T 2 n into two overlapping parts, P ver,+ j and P ver,− j , according to which exchanges P ver,+ j and P ver,− j . We also define horizontal planes of reflection P hor j and their associated P hor,+ j , P hor,− j ,P hor j and θ P hor j in the same manner by switching the role of the two coordinates of vertices in T 2 n . We write simply P, P + , P − ,P and θ P when the plane of reflection P is one of the planes P ver j or P hor j which is left unspecified. Denote by F the set of all measurable functions f : Equivalently, F is the set of all measurable functions depending only on the gradient of ϕ. For a plane of reflection P we write F + P (respectively F − P ) for the set of f ∈ F for which f (ϕ) depends only on ϕ v , v ∈ P + (respectively v ∈ P − ). We extend the definition of θ P to act on R V (T 2 n ) and F by (θ P ϕ) v := ϕ θ P (v) and (θ P f )(ϕ) := f (θ P ϕ).
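As a sanity check on the geometry of reflections through vertices, the following Python sketch implements a vertical reflection on a small torus. It assumes, since the defining display is not reproduced here, that the reflection through P ver j acts by (a, b) ↦ (2j − a, b) modulo 2n and that the two overlapping halves consist of the columns j, j + 1, . . . , j + n and j, j − 1, . . . , j − n; the function names are hypothetical.

```python
def wrap(x, n):
    """Representative of x modulo 2n in {-n+1, ..., n}."""
    return (x + n - 1) % (2 * n) - n + 1


def neighbors(v, n):
    """The four nearest neighbours of v on the torus T_n^2."""
    a, b = v
    return [(wrap(a + 1, n), b), (wrap(a - 1, n), b),
            (a, wrap(b + 1, n)), (a, wrap(b - 1, n))]


def reflect_ver(v, j, n):
    """Assumed form of the reflection through column j: (a, b) -> (2j - a, b) mod 2n."""
    a, b = v
    return (wrap(2 * j - a, n), b)


def half(j, n, sign):
    """Columns j, j + sign, ..., j + sign * n (mod 2n): one of the two halves."""
    cols = {wrap(j + sign * i, n) for i in range(n + 1)}
    return {(a, b) for a in cols for b in range(-n + 1, n + 1)}


n, j = 3, 1
V = [(a, b) for a in range(-n + 1, n + 1) for b in range(-n + 1, n + 1)]
plus, minus = half(j, n, +1), half(j, n, -1)
# Vertices fixed by the reflection: columns j and j + n (mod 2n).
fixed = {v for v in V if reflect_ver(v, j, n) == v}
```

Under these assumptions the reflection is an involutive graph automorphism exchanging the two halves, and the halves intersect exactly in the two fixed columns j and j + n.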
When ϕ is randomly sampled from a probability measure on R V (T 2 n ) we will regard a function f ∈ F as a random variable (taking the value f (ϕ)) and write Ef for its expectation. Definition 3.3. Let ϕ be randomly sampled from a probability measure P on R V (T 2 n ) . We say that P is reflection positive with respect to F if for any plane of reflection P and any two bounded and We call a function f ∈ F a block function at (j, we define a reflection operator ϑ t acting on block functions as follows. If f is a block function at (j, k) ∈ V (T 2 n ) then ϑ t f is the function obtained from f by performing the reflections which map the block at (j, k) to the block at (j + t 1 , k + t 2 ).
Explicitly, if f is defined by (3.10) then ϑ t f is the block function at (j + t 1 , k + t 2 ) defined by Theorem 3.4. (Chessboard estimate) Let ϕ be randomly sampled from a probability measure P on . Suppose that P is reflection positive with respect to F. Then for any 1 ≤ m ≤ |V (T 2 n )|, any f 1 , . . . , f m , bounded block functions at (0, 0), and any distinct t 1 , . . . , t m ∈ V (T 2 n ) we have In particular, the right-hand side is non-negative.
For completeness, we provide a short proof of the chessboard estimate in Section 3.3 below. We remark that the same proof shows that if P is reflection positive with respect to all measurable functions on R V (T 2 n ) then it also satisfies the chessboard estimate with respect to this class. We restrict here to the class F in view of our application to random surface measures, see Proposition 3.5 below.

3.2. Controlled gradients property.
In this section we prove Theorem 3.2. We start by proving that our random surface measures are reflection positive.

Proposition 3.5. Let U be a potential satisfying condition (1.2). Then for any n ≥ 1 the measure µ T 2 n ,0,U is reflection positive with respect to F.

Proof. Suppose ϕ is randomly sampled from µ T 2 n ,0,U . Fix a plane of reflection P , a vertex v 0 ∈ P and suppose φ is randomly sampled from µ T 2 n ,v0,U (the measure µ T 2 n ,v0,U is obtained by replacing 0 with v 0 in (1.1)). We write E µ T 2 n ,0,U and E µ T 2 n ,v0,U for the expectation operators corresponding to ϕ and φ, respectively. Observe that since the induced measure on the gradient of ϕ is translation invariant. In addition, by symmetry, For two bounded f, g ∈ F + P the relation (3.8) now follows from (3.13) and (3.14) by To see the relation (3.9) observe that, by the domain Markov property and symmetry, conditioned on (φ v ) v∈P the configurations (φ v ) v∈P + and ((θ P φ) v ) v∈P + are independent and identically distributed.
Thus, for any f ∈ F + P we have We now prove Theorem 3.2. Fix 0 < δ < 1, n ≥ 1 and suppose ϕ is randomly sampled from µ T 2 n ,0,U . Let K be the constant from (3.6), where we write Recall the definition of the random graph E(ϕ, L) from (3.5). For an edge e = (v, w) ∈ E(T 2 n ) and 0 < L < ∞ define the function f e,L ∈ F by f e,L (ψ) := 1 (|ψv−ψw|≥L) .
We need to show that there exists some 0 < L < K, independent of n, such that $\mathbb{E}\prod_{i=1}^{k} f_{e_i,L} \le \delta^k$ for all k ≥ 1 and distinct e 1 , . . . , e k ∈ E(T 2 n ).
Fix some k ≥ 1 and distinct e 1 , . . . , e k ∈ E(T 2 n ). Define four block functions at (0, 0) by The definition (3.11) of the reflection operators (ϑ t ) implies that there exist k 1 , k 2 , k 3 , k 4 ≥ 0 with Assume, without loss of generality, that k 1 ≥ k/4 (as the cases that k j ≥ k/4 for some 2 ≤ j ≤ 4 follow analogously). Then, by the chessboard estimate, Theorem 3.4, and thus it suffices to show that there exists some 0 < L < K, independent of n, such that We note that : |ψ (j+1,k) − ψ j,k | ≥ L for all −n + 1 ≤ j ≤ n and all even −n + 1 ≤ k ≤ n}.
Thus, recalling (1.1), we have (3.17) We estimate the numerator and denominator in the last fraction separately. First, we have already shown a lower bound on Z T 2 n ,0,U in (3.1). Second, denote by H the subset of edges ((j, k), (j+1, k)) ∈ E(T 2 n ) for which k is even. Let S be a spanning tree of T 2 n , regarded here as a subset of edges, satisfying The integral above can be estimated by integrating the vertices in V (T 2 n ) \ {0} leaf by leaf according to the spanning tree S. Recalling the definition of E L , two cases arise depending on whether or not the edge connecting a leaf to the remaining tree belongs to H. Thus we obtain Condition (1.2) ensures that C 2 (U ) < ∞ and the definition of K gives that lim L↑K C 3 (U, L) = 0. Thus, using (3.18), for every ε > 0 there exists an 0 < L < K, independent of n, for which This inequality, together with (3.16), (3.17) and (3.1), implies that we may choose an 0 < L < K, independent of n, so that (3.15) holds, as we wanted to show.

3.3. Proof of the chessboard estimate. In this section we prove Theorem 3.4. Let ϕ be randomly sampled from the given measure P. Reflection positivity of P with respect to F implies that for each plane of reflection P , the bilinear form E(gθ P h) is a degenerate inner product on bounded g, h ∈ F + P . In particular, we have the Cauchy-Schwarz inequality, $|\mathbb{E}(g\,\theta_P h)|^2 \le \mathbb{E}(g\,\theta_P g)\,\mathbb{E}(h\,\theta_P h)$, for all bounded g, h ∈ F + P . (3.19) For a function f ∈ F of the form and a plane of reflection P , define two functions, the "parts of f in P − and P + ", by Define also the function ρ P f ∈ F by ρ P f := f P + θ P f P + and note that E(ρ P f ) ≥ 0 by (3.9). Observe that Thus, using the Cauchy-Schwarz inequality (3.19) with g = f P + and h = θ P f P − we have Our first goal is to show that starting with a function of the form (3.20), one may iteratively apply the operator ρ P with different planes of reflection P to reach a function of the form (3.20) with all the block functions identical.
Proof. Let s = (j, k) ∈ V (T 2 n ). Define the vertical planes of reflection (Q i ), 0 ≤ i ≤ log 2 (n) , by Q i := P ver ji for j i := j + 1 − 2 i modulo 2n. One may verify directly that for some π : V (T 2 n ) → V (T 2 n ) satisfying that π((a, b)) = (j, b) for all −n + 1 ≤ a ≤ n. In the same manner, one may now take the horizontal planes of reflection (R i ), 0 ≤ i ≤ log 2 (n) , defined by R i := P hor ki for k i := k + 1 − 2 i modulo 2n, and conclude that For a bounded block function f 0 at (0, 0) define which is well-defined and non-negative by (3.9). Let f have the form (3.20). With the above notation, the chessboard estimate (3.12) becomes the inequality where we note that in Theorem 3.4 we may assume that m = |V (T 2 n )| by taking some of the block functions to be constant.
Consider first the case that Let P 1 , . . . , P m be the planes of reflection corresponding to s as given by Proposition 3.6. By iteratively applying the Cauchy-Schwarz inequality (3.21) with the planes (P i ) we may obtain that |E(f )| is bounded by a product in which f s , raised to some positive power, is one of the factors. Thus we conclude from (3.23) that E(f ) = 0, establishing (3.22) in this case.
Second, assume that (3.23) does not hold. Define Let h ∈ F be an (arbitrary) function maximizing |E(h)| among all functions of the form ϑ t h t with each h t being one of the (g s ). (3.24) Observe that, by the Cauchy-Schwarz inequality (3.21) and the definition of h, we have Thus, for any plane of reflection P . (3.25) In particular, ρ P h also maximizes |E(·)| among functions of the form (3.24) (so that equality holds in the last inequality). Let P 1 , . . . , P m be the planes of reflection corresponding to s = 0 as given by Proposition 3.6. By iteratively applying (3.25) with these planes we obtain that since g s = 1 for all s and h has the form (3.24). Finally, the definition of h now shows that implying (3.22) and finishing the proof of Theorem 3.4.

Lower bound for random surface fluctuations in two dimensions
Recall the definition of the controlled gradients property from Section 3. Throughout the section we fix n ≥ 2 and a potential U with the following properties:
• There exists an 0 < ε ≤ 1/2 for which U has (1/8, 1 − ε)-controlled gradients on T 2 n .
• U restricted to [−1, 1] is twice continuously differentiable.
We fix ε to the value given by the first property. Write 0 := (0, 0). For the rest of the section we suppose that ϕ is a random function sampled from the probability distribution µ T 2 n ,0,U defined in (1.1). For a vertex v = (v 1 , v 2 ) of T 2 n we write ‖v‖ 1 := |v 1 | + |v 2 |.

The theorem establishes lower bounds for the variance and large deviation probabilities of ϕ v as well as upper bounds on the probability that ϕ v is atypically small. The lower bound on the variance is expected to be sharp up to the value of c(U ).
The theorem is not optimal in several ways. One expects the results to hold for all v ∈ V (T 2 n ) without the restriction on ‖v‖ 1 , that the exponent 2/3 may be replaced by 1, and that the restrictions on r and t may be relaxed. We believe that further elaboration of our methods may address some of these issues. However, since our main focus is on vertices v for which ‖v‖ 1 is of order n and on estimating the variance of ϕ v , we prefer to present simpler proofs.
Again, this estimate is expected to be sharp up to the value of c(U ).

4.1. Tools. In this section we let τ : V (T 2 n ) → [0, ∞) be an arbitrary function satisfying τ (0) = 0. We let T + , T − be the functions defined in Section 2 acting on the graph T 2 n with the given τ function and constant ε. We also recall the notation J + , J − , M (ϕ) and L(τ, ε) from Section 2.1. Our main tool for lower bounding the fluctuations of ϕ is the following lemma.
and let F 0 be the sigma-algebra generated by (ϕ v ), v ∈ V 0 . There exists a constant c(U ) > 0 such that for any a, s > 0, any u ∈ V (T 2 n ) and any event A ∈ F 0 we have for the density of the measure µ T 2 n ,0,U . Fix a function θ : V 0 → R satisfying θ(0) = 0 and denote by dλ the measure Define the event We wish to bound I from below and from above. We start with the bound from below.
Since U restricted to [−1, 1] is twice continuously differentiable there exists some 0 < c(U ) ≤ 1 such that for all x, r ∈ R for which x + r, x − r ∈ [−1, 1]. Abbreviate 1)) and observe that where we have used property (3) from Section 2.1 to justify our use of (4.5). Together with the definition of the event E this implies that To bound I from above we use the Cauchy-Schwarz inequality and the Jacobian identity in (2.10) to obtain (4.7) Comparing (4.6) and (4.7) and recalling that ϕ is sampled from the probability distribution µ T 2 n ,0,U we conclude that We continue by noting that by the definition of E, In addition, we recall from properties (2) and (4) of T + in Section 2.1 that if ψ satisfies |ψ u | ≤ a and M (ψ) ≤ L(τ, ε) then −a − ε 2 ≤ T + (ψ) u − τ (u) ≤ a, and a similar relation holds for T − by (2.1). In addition, since A ∈ F 0 , properties (1) and (2) imply that A = T + (A) = T − (A). Therefore, using that T + and T − are one-to-one, Combining the last two inequalities with (4.8) establishes the lemma.
Our next lemma bounds the error terms appearing on the right-hand side of (4.4).
Proof. Given a vertex v ∈ V (T 2 n ) and k ≥ 1 denote by P v,k the set of all simple paths in T 2 n starting at v and having length k. Here, by such a path we mean a vector (e 1 , . . . , e k ) ⊆ E(T 2 n ) of distinct edges with e i = (v i , v i+1 ) and v = v 1 . Observe that, trivially, |P v,k | ≤ 4 k for all v and k. Now note that since U has (1/8, 1 − ε)-controlled gradients on T 2 n we have for each v ∈ V (T 2 n ) and k ≥ 1, Observe that We estimate each of the terms on the right-hand side separately. First, using (4.9) we have observing that the inequality holds trivially if L(τ, ε) is zero or negative. Second, using property (5) from Section 2.1 we see that and using again (4.9) we conclude that Thus, Markov's inequality and (4.12) show that The lemma follows by combining this estimate with (4.10) and (4.11).
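The path-counting bound |P v,k | ≤ 4 k used above can be confirmed by brute force on a small torus. The following Python sketch enumerates sequences of k distinct edges started at the origin and compares the count with 4 k . It also evaluates the union bound that results from combining this count with the (1/8, 1 − ε)-controlled gradients property, which bounds the probability that k given distinct edges all lie in E(ϕ, 1 − ε) by (1/8) k . That this union bound is exactly the displayed estimate (4.9) is an assumption here, and the function names are illustrative.

```python
from fractions import Fraction


def wrap(x, n):
    """Representative of x modulo 2n in {-n+1, ..., n}."""
    return (x + n - 1) % (2 * n) - n + 1


def neighbors(v, n):
    """The four nearest neighbours of v on the torus T_n^2."""
    a, b = v
    return [(wrap(a + 1, n), b), (wrap(a - 1, n), b),
            (a, wrap(b + 1, n)), (a, wrap(b - 1, n))]


def count_paths(v, k, n, used=frozenset()):
    """Number of sequences of k distinct edges e_i = (v_i, v_{i+1}) with v_1 = v."""
    if k == 0:
        return 1
    total = 0
    for w in neighbors(v, n):
        e = frozenset((v, w))  # an edge is an unordered pair of vertices
        if e not in used:
            total += count_paths(w, k - 1, n, used | {e})
    return total


n = 3
counts = {k: count_paths((0, 0), k, n) for k in range(1, 7)}
# Union bound over paths: at most 4^k paths, each contained in E with
# probability at most (1/8)^k, giving 2^{-k} in total.
union_bound = {k: Fraction(4) ** k * Fraction(1, 8) ** k for k in counts}
```

The choice δ = 1/8 is what makes the product 4 k δ k decay geometrically in k.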

4.2. Fluctuation bounds.
In this section we prove Theorem 4.1. Fix and the function η : We aim to use the lemmas of the previous section with the τ function a constant multiple of η.
The above definition is chosen so that we may control the quantities appearing in Lemma 4.4. The first case allows us to lower bound the function L while the second and third cases ensure that η is slowly varying. The next lemma formalizes these ideas. Write, as in (2.3), (4.14) Lemma 4.5. There exists an absolute constant C > 0 such that For any α > 0 we have Proof. The fact that η(w) depends only on ‖w‖ 1 and η(w 1 ) ≥ η(w 2 ) when ‖w 1 ‖ 1 ≥ ‖w 2 ‖ 1 shows that for each w ∈ V (T 2 n ) and k ≥ 0 we have By considering separately the latter two cases in the above inequality we have where we have also used that there are at most 4m vertices w ∈ V (T 2 n ) with ‖w‖ 1 = m (strict inequality is possible when m ≥ n). Continuing the last inequality we obtain for some absolute constants C, C′ > 0. We note that for any x, s, k ≥ 0 we have that Thus, (4.15) follows from the definitions (2.4) and (4.13) of L(τ, ε) and η.
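The counting fact used in the proof, that at most 4m vertices w of T 2 n satisfy |w 1 | + |w 2 | = m with strict inequality possible when m ≥ n, is easy to verify numerically. The following Python sketch tabulates the sizes of these spheres for a small torus; the function name `l1_sphere_sizes` is an illustrative choice.

```python
from collections import Counter


def l1_sphere_sizes(n):
    """Number of vertices (a, b) in {-n+1, ..., n}^2 with |a| + |b| = m, for each m."""
    rng = range(-n + 1, n + 1)
    return Counter(abs(a) + abs(b) for a in rng for b in rng)


n = 5
sizes = l1_sphere_sizes(n)
```

For 1 ≤ m < n the sphere is the full diamond of 4m points, while for m ≥ n points such as (−n, 0) fall outside the coordinate range {−n + 1, . . . , n}, which is why the count drops below 4m.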
Proof of Theorem 4.1. Assume that v 1 ≥ (log n) 2 . It suffices to prove (4.2) and (4.3) as (4.1) is an immediate consequence of the case t = 1 of (4.3) and the fact that Eϕ v = 0 by symmetry. Let N (U ) > 0 be large enough for the following derivations. We first claim that choosing c(U ) sufficiently small and C(U ) sufficiently large the theorem holds when n ≤ N (U ). Indeed, this is clear for (4.2) as we may make the right-hand side greater than 1 by choosing C(U ) appropriately. To see this for (4.3) first note that our assumption that the potential U restricted to [−1, 1] is bounded away from infinity implies that P(|ϕ v | ≥ 0.99 v 1 ) > 0. Thus it suffices to check that 1+log n ≤ 0.99 v 1 and this follows, using our assumption that n ≥ 2, as Assume for the rest of the proof that n > N (U ). Consequently, since v 1 ≥ (log n) 2 , we have We start with the proof of (4.3). Let there is nothing to prove. Thus we suppose that P(|ϕ v | ≤ t log(1 + v 1 )) ≥ 1 2 . Pick the function τ := 8t · η so that, since ε ≤ 1 2 , we have τ (v) ≥ 2t log(1 + v 1 ) + ε 2 by (4.16). Combining the arithmetic-geometric mean inequality with Lemma 4.3, taking A to be the full event, we have where s > 0 is arbitrary. By Lemma 4.4 and Lemma 4.5 we have Furthermore, our assumption that t ≤

1+
√ v 1 log n and v 1 ≥ (log n) 2 combined with (4.15) yields that and combining the last inequalities we conclude that for some c (U ), C(U ) > 0 depending only on U .

4.3. Maximum. In this section we prove Theorem 4.2.
Let ρ(U ) > 0 be a constant to be chosen later, depending only on U and small enough for the following derivations. We may choose c(U ) sufficiently small so that the theorem holds when n ≤ exp(1/ρ(U ) 2 ) and thus we assume that Fix a collection of arbitrary vertices u 1 , . . . , u n ∈ V (T 2 n ) satisfying ‖u i ‖ 1 ≥ n/2 and d T 2 n (u i , u j ) > 2n 1/3 when i ≠ j. Define the events, for 1 ≤ i ≤ n, and we aim to use Lemma 4.3 to estimate the summands on the right-hand side. Let v 0 := ( n 1/3 , 0) and let η : V (T 2 n ) → [0, ∞) be the function defined by (4.13) with v = v 0 . Noting that η takes its maximal value at v 0 we may define η i : where w − u i is the vertex in T 2 n obtained by taking the coordinate-wise difference modulo 2n. We define also the functions τ i : In addition, if ρ(U ) is sufficiently small then L(τ i , ε) ≥ n 1/6 (4.24) and for some absolute constant C > 0.
Proof. Property (4.22) is an immediate consequence of the fact that η(v) = η(v 0 ) for all vertices v with ‖v‖ 1 ≥ ‖v 0 ‖ 1 and the definition of τ i . To see (4.23), recall (4.19) and observe that when ρ(U ) is sufficiently small, as in (4.16). Now use the definition of τ i and the fact that ε ≤ 1/2. Since (4.21) defines η i via η we may use Lemma 4.5, taking ρ(U ) sufficiently small, to obtain (4.24). Finally, (4.25) follows from a similar derivation as in the proof of Lemma 4.5.
We may now apply Lemma 4.3 with τ i playing the role of τ and A i playing the role of A, noting that by (4.22) and our choice of the u i , A i is indeed measurable with respect to the sigma algebra generated by {ϕ v : τ i (v) = 0}. Using also the arithmetic-geometric mean inequality and (4.23) we have where s > 0 is arbitrary. Combining Lemma 4.4 with (4.24) and (4.25) we have Choosing s := exp(−20Cρ(U ) 2 log n/ε 2 ), taking ρ(U ) small enough and using (4.19) yields Plugging back into (4.27) and summing over i using (4.20) gives Finally, choosing ρ(U ) sufficiently small this implies that It follows that there exists some 1 ≤ i ≤ n for which P(B i ∪ A c i ) ≥ 1 2 , whence, by the definition of A i , P(∪ n i=1 B i ) ≥ 1 2 and the theorem follows.

Discussion and open questions
In this work we prove lower bounds for the fluctuations of two-dimensional random surfaces. Specifically, we investigate random surface measures of the form (1.1) based on a potential U satisfying the conditions (1.2) and (1.3). These conditions allow for a wide range of potentials including the hammock potential, when U (x) = 0 for |x| ≤ 1 and U (x) = ∞ for |x| > 1, double well and oscillating potentials. We prove that such random surfaces delocalize, with the variance of their fluctuations being at least logarithmic in the side-length of the torus. We also establish related bounds on the maximum of the surface and on large deviation and small ball probabilities. In this section we discuss related research directions and open questions.
Upper bound on the fluctuations. It is expected that, under mild conditions on the potential, an upper bound of matching order holds for the fluctuations of the random surface. For instance, if ϕ is randomly sampled from the measure (1.1) then one expects that Var(ϕ (n,n) ) ≤ C(U ) log n for some C(U ) < ∞ and all n ≥ 2. One may well speculate that the result holds for all potentials satisfying (1.2) and (1.3), and indeed even in greater generality. Certain potentials are known to satisfy such a bound but it appears that even the case of the potential $U(x) = x^4$ has not yet been settled [31, Remark 6 and open problem 1].
Reflection positivity. Our work relies crucially on reflection positivity and the chessboard estimate to establish what we called the controlled gradients property, see the beginning of Section 3. This restricts our results in ways which are probably not essential. Specifically, we may handle only random surface measures on a torus with even side length and we must normalize such measures at a single point. It is desirable to lift these restrictions, possibly by arriving at a more illuminating proof of the controlled gradients property. This would allow one to treat random surface measures on other graphs as well as on the graph T 2 n with other boundary conditions. For instance, one would expect our results to hold for zero boundary conditions, when ϕ v is normalized to zero at all v = (v 1 , v 2 ) with max(|v 1 |, |v 2 |) = n.
In this regard we put forward that the controlled gradients property possibly holds for any finite, connected graph G and any potential U , satisfying the conditions (1.2) and (1.3), say. Precisely, let G and U be such a graph and potential. Write K := sup{x : U (x) < ∞} ∈ (0, ∞] and let ϕ be randomly sampled from the probability measure

$$d\mu_{G,v_0,U}(\varphi) := \frac{1}{Z_{G,v_0,U}} \exp\Big(-\sum_{(v,w)\in E(G)} U(\varphi_v-\varphi_w)\Big)\,\delta_0(d\varphi_{v_0})\prod_{v\in V(G)\setminus\{v_0\}} d\varphi_v, \tag{6.1}$$

for some vertex v 0 ∈ V (G). Then it may be that for any 0 < δ < 1 there exists some 0 < L < K such that L depends only on δ and U (and not on G) and if we define the random subgraph E(ϕ, L) of G by E(ϕ, L) := {(v, w) ∈ E(G) : |ϕ v − ϕ w | ≥ L} (6.2) then P(e 1 , . . . , e k ∈ E(ϕ, L)) ≤ δ k for all k ≥ 1 and distinct e 1 , . . . , e k ∈ E(G).

More general random surfaces. One may try to extend the applicability of our results in several directions. First, one may try to relax the condition (1.3) to allow for singular potentials. Ioffe, Shlosman and Velenik [15] introduced a technique for proving lower bounds on fluctuations for potentials which are small perturbations, in some sense, of smooth potentials. These ideas were also incorporated in the work of Richthammer [29] upon which our addition algorithm is based. It is a promising avenue for future research to try and combine the techniques of [15] with our technique. This may allow one to treat all continuous (not necessarily differentiable) potentials as well as certain classes of discontinuous potentials.
Second, one may try to extend the results to integer-valued random surface models. For instance, to probability measures on configurations ϕ : T 2 n → Z (rather than ϕ : T 2 n → R) with ϕ(0) = 0 for which the probability of ϕ is proportional to $\exp\big(-\sum_{(v,w)\in E(T_n^2)} U(\varphi_v-\varphi_w)\big)$. This direction seems much more challenging as our technique is based on an argument which relies crucially on the continuous nature of the model. We mention that while it is expected that many integer-valued random surface models have fluctuations with variance of logarithmic order, this has been established only in two cases: when U (x) = β|x| and U (x) = βx 2 , both with β sufficiently small. This result is by Fröhlich and Spencer [10]. It is also known that if β is large then these models become localized, having fluctuations with bounded variance, a transition which is called the roughening transition. As specific examples of surfaces for which delocalization is expected but remains unproved we mention integer-valued analogs of the hammock potential, when U (x) = 0 for x ∈ {−1, 1} and otherwise U (x) = ∞ (the graph-homomorphism or homomorphism height function model) or when U (x) = 0 for x ∈ {−M, −M + 1, . . . , M } and otherwise U (x) = ∞ (the M -Lipschitz model). The former of these models can be used as a height function representation for the square-ice or 6-vertex models and is also related to the zero temperature 3-state antiferromagnetic Potts model (i.e., uniformly chosen proper colorings of T 2 n with 3 colors). For more on these models we refer to [23] where it is proved that the homomorphism height function and 1-Lipschitz models are localized in sufficiently high dimensions.

covariance matrix of ϕ. How do the off-diagonal elements behave? How fast do the values of ϕ decorrelate? A related question is to study the decay of correlations for the gradient of ϕ.
Sufficiently fast decay of gradient correlations will lead to an upper bound on Var(ϕ v ), by writing ϕ v as the sum of the gradients of ϕ on a path leading from 0 to v and averaging over many such paths. In this regard we mention the results of Aizenman [1] and Pinson [27], following ideas of Patrascioiu and Seiler [22], who give a lower bound, in a certain sense, for the decay of correlations for the hammock potential and for the integer-valued homomorphism height function model mentioned above.
High-dimensional convex geometry. The case that the potential U is the hammock potential is natural also from a geometric point of view. In this case the measure (1.1) is the uniform measure on the high-dimensional convex polytope of Lipschitz functions defined by Lip := {ϕ : T 2 n → R : ϕ 0 = 0 and |ϕ v − ϕ w | ≤ 1 when v ∼ w}. The field of convex geometry is highly developed and we mention here the central limit theorem of Klartag [16] which states that uniform measures on high-dimensional convex bodies have many projections which are approximately Gaussian. It would be interesting to use this point of view to obtain new results for the random surface with the hammock potential.
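To make the polytope Lip concrete, the following Python sketch checks membership in Lip and runs a single-site Gibbs sampler on a small torus. The sampler is an illustration introduced here and is not a method used in the paper. It relies only on the fact that, for the uniform measure on a convex polytope, the conditional law of one coordinate given the others is uniform on an interval. No claim is made about its mixing time, and all function names are hypothetical.

```python
import random


def wrap(x, n):
    """Representative of x modulo 2n in {-n+1, ..., n}."""
    return (x + n - 1) % (2 * n) - n + 1


def torus_vertices(n):
    """Vertex set {-n+1, ..., n}^2 of the torus T_n^2."""
    rng = range(-n + 1, n + 1)
    return [(a, b) for a in rng for b in rng]


def neighbors(v, n):
    """The four nearest neighbours of v on the torus T_n^2."""
    a, b = v
    return [(wrap(a + 1, n), b), (wrap(a - 1, n), b),
            (a, wrap(b + 1, n)), (a, wrap(b - 1, n))]


def in_lip(phi, n, tol=1e-9):
    """Check phi_0 = 0 and |phi_v - phi_w| <= 1 on every edge, up to rounding."""
    return phi[(0, 0)] == 0.0 and all(
        abs(phi[v] - phi[w]) <= 1 + tol for v in phi for w in neighbors(v, n))


def gibbs_sweep(phi, n, rng):
    """Resample each non-pinned height uniformly from its allowed interval."""
    for v in phi:
        if v == (0, 0):
            continue  # the surface is normalized at the origin
        nbr = [phi[w] for w in neighbors(v, n)]
        # Allowed interval given the neighbours; it contains the current phi[v].
        lo, hi = max(nbr) - 1.0, min(nbr) + 1.0
        phi[v] = rng.uniform(lo, hi)


n = 3
rng = random.Random(0)
phi = {v: 0.0 for v in torus_vertices(n)}  # the flat surface lies in Lip
for _ in range(200):
    gibbs_sweep(phi, n, rng)
```

Each update keeps the configuration inside Lip, so `in_lip(phi, n)` holds after every sweep; empirical statistics of such samples on larger tori could be compared with the logarithmic lower bound on the variance.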