An elementary proof of phase transition in the planar XY model

Using elementary methods we obtain a power-law lower bound on the two-point function of the planar XY spin model at low temperatures. Such a bound was first obtained rigorously by Fröhlich and Spencer, and it establishes a Berezinskii-Kosterlitz-Thouless phase transition in the model. Our argument relies on a new loop representation of spin correlations, a recent result of Lammers on delocalisation of integer-valued height functions, and classical correlation inequalities.


Introduction and main result
Let G = (V, E) be a finite graph. Given a collection of nonnegative coupling constants $J = (J_e)_{e \in E}$, and an inverse temperature β > 0, the XY model (with free boundary conditions) is a random spin configuration $\sigma \in S^V$, where S = {z ∈ C : |z| = 1} is the complex unit circle, sampled according to the Gibbs distribution
\[ d\mu_{G,\beta}(\sigma) \propto \exp\Big( \tfrac{1}{2} \beta \sum_{vv' \in E} J_{vv'} \big( \sigma_v \bar\sigma_{v'} + \bar\sigma_v \sigma_{v'} \big) \Big) \prod_{v \in V} d\sigma_v, \tag{1} \]
where $vv'$ denotes the edge $\{v, v'\}$, and $d\sigma_v$ is the uniform probability measure on S. For simplicity of notation, unless stated otherwise, we will assume that $J_e = 1$ for all e. However, our results extend naturally to nonhomogeneous coupling constants. We will write $\langle \cdot \rangle_{G,\beta}$ for the expectation with respect to $\mu_{G,\beta}$. The observable of main interest for us will be the two-point function $\langle \sigma_a \bar\sigma_b \rangle_{G,\beta}$, a, b ∈ V, and its infinite volume limit
\[ \langle \sigma_a \bar\sigma_b \rangle_{\Gamma,\beta} = \lim_{G \nearrow \Gamma} \langle \sigma_a \bar\sigma_b \rangle_{G,\beta}, \]
where Γ is an infinite planar lattice. Note that if $\sigma_v = e^{i\theta_v}$, $\theta_v \in (-\pi, \pi]$, then $\sigma_v \bar\sigma_{v'} + \bar\sigma_v \sigma_{v'} = 2\cos(\theta_v - \theta_{v'})$. This means that the model is ferromagnetic, i.e., pairs of neighbouring spins that are (almost) aligned have smaller energy and hence are statistically favoured. A natural question is whether varying β leads to a ferromagnetic order-disorder phase transition in the model. The classical theorem of Mermin and Wagner [31] excludes this possibility when the underlying lattice Γ is two-dimensional. Moreover, McBryan and Spencer showed that at any finite temperature $\langle \sigma_a \bar\sigma_b \rangle_{\mathbb{Z}^2,\beta}$ decays to zero at least as fast as a power of the distance between a and b. On the other hand, it is known by the work of Fröhlich, Simon and Spencer [15] that in higher dimensions the model exhibits long-range order at low temperatures and the two-point function does not decay to zero.
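As a sanity check of the definitions above, the following minimal sketch evaluates the two-point function on the simplest possible graph, a single edge {a, b} with J = 1. It assumes the Gibbs weight is exp(β cos(θ_a − θ_b)), which is what the identity σ_v σ̄_v′ + σ̄_v σ_v′ = 2 cos(θ_v − θ_v′) gives when the exponent carries a factor β/2. Under that assumption the correlation equals the ratio I_1(β)/I_0(β) of modified Bessel functions. The function names are illustrative and not taken from the article.

```python
import math


def bessel_i(k, beta, terms=60):
    """Modified Bessel function I_k(beta), summed from its power series."""
    k = abs(k)
    return sum(
        (beta / 2.0) ** (2 * i + k) / (math.factorial(i) * math.factorial(i + k))
        for i in range(terms)
    )


def two_point_single_edge(beta, grid=2000):
    """<sigma_a conj(sigma_b)> on one edge, by quadrature in theta = theta_a - theta_b.

    Assumes the weight exp(beta * cos(theta)); the imaginary part of the
    correlation vanishes by the symmetry theta -> -theta, so only cos is averaged.
    """
    thetas = [2.0 * math.pi * j / grid for j in range(grid)]
    weights = [math.exp(beta * math.cos(t)) for t in thetas]
    numerator = sum(w * math.cos(t) for w, t in zip(weights, thetas))
    return numerator / sum(weights)


beta = 1.5
print(two_point_single_edge(beta), bessel_i(1, beta) / bessel_i(0, beta))
```

The two printed numbers should agree to high precision, since the equally spaced quadrature rule converges very quickly for smooth periodic integrands.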
Even though there is no spontaneous symmetry breaking, Berezinskii [7,8], and Kosterlitz and Thouless [21] predicted that a different type of phase transition takes place in two dimensions. It should be understood in terms of interacting topological excitations of the model, the so-called vortices and antivortices. They are those faces of the graph where the XY configuration makes a full clockwise or anticlockwise turn respectively when one traverses the edges of the face in a clockwise manner. Vortices and antivortices interact through a Coulomb interaction, and are energetically favoured to form short-distance vortex-antivortex pairs. However, such configurations clearly have much smaller entropy. The Berezinskii-Kosterlitz-Thouless (BKT) phase transition happens when, as the temperature increases, entropy wins against energy, and the vortex-antivortex pairs unbind and form a plasma of freely spaced vortices and antivortices. This regime corresponds to exponential decay, whereas the phase with bound vortex-antivortex pairs should exhibit power-law decay of the two-point function. A rigorous lower bound of this type for low temperatures, and therefore a proof of the BKT phase transition, was first obtained in the celebrated work of Fröhlich and Spencer [16], who also derived analogous results for the Villain spin model. Their proof uses a multi-scale analysis of the Coulomb gas, and the main purpose of the present article is to present an alternative and less technically involved argument for the existence of a phase transition in two dimensions.
To be more precise, we introduce a new loop representation for the two-point function in the XY model that can be used to transfer probabilistic information from the dual integer-valued height function model to the XY model. Along the way we also show that the height function possesses the crucial absolute-value-FKG property. This, together with a recent elementary delocalisation result for general height functions obtained by Lammers [22], is used to prove existence of the BKT phase transition.
Theorem 1 (Berezinskii-Kosterlitz-Thouless phase transition). There exists $\beta_c \in (0, \infty)$ such that (i) for all $\beta < \beta_c$, there exists c = c(β) > 0 such that for all $v, v' \in \mathbb{Z}^2$, (ii) for all $\beta \ge \beta_c$ and all distinct $v, v' \in \mathbb{Z}^2$,
We note that unlike in the original proof of Fröhlich and Spencer, we do not show that the rate of decay approaches zero as the temperature tends to zero. However, we establish a type of sharpness which says that there is no other behaviour than exponential and power-law decay. The short proof of sharpness is independent of the rest of the argument. In the first step we use the Lieb-Rivasseau inequality [27,34] in the classical way to establish a sharp transition between exponential decay and nonsummability of correlations (similarly to the proof for the Ising model [13]). To conclude a uniform power-law lower bound as in (ii) whenever the correlations are not summable, we use the Messager-Miracle-Sole inequality [32] on monotonicity of correlations with respect to the position of the vertex on the lattice.
We also note that our proof works (with minor modifications and a different, implicit multiplicative constant in (ii)) for other infinite graphs that in addition to being translation invariant possess reflection and rotation symmetries.
For a more detailed overview of the XY model, we refer the reader to [14,33], and for expositions of the argument of Fröhlich and Spencer, we refer to [17,20].
This article is organised as follows.
• In Section 2 we introduce the dual of the planar XY model in the form of an integer-valued height function defined on the faces of the graph. We also establish positive association of its absolute value (the absolute-value-FKG property), and recall the delocalisation result of Lammers [22].
• In Section 3 we define a random collection of loops on the graph that carries probabilistic information about both the XY spins and the dual height function. Although this is a well-known object that goes back to the works of Symanzik [37], and Brydges, Fröhlich and Spencer [9], the formula that relates the two-point function to the probability of two points being connected by a loop (Lemma 8) is new and crucial to our argument.
• In Section 4 we give an elementary argument which states that if the height function delocalises at some temperature, then the spin two-point function does not decay exponentially.
• In Section 5 we use the above ingredients to show that on any translation invariant graph, there exists a finite temperature at which the two-point function does not decay exponentially. This is not immediate as the result of Lammers [22] applies only to trivalent graphs. However, a simple graph-modification argument together with the Ginibre inequality allows us to change the setup from a general graph to a triangulation (a graph whose dual is trivalent).
• In Section 6 we finish the proof of the main theorem. We use the Lieb-Rivasseau inequality [27,34] and the inequality of Lemma 19 to show that the absence of exponential decay implies a power-law lower bound on the two-point function. The proof of sharpness relies only on this section and Lemma 19.
• Finally, independently of the rest of the article, in Appendix A we develop a new loop representation for squares and products of spin correlation functions in the XY model. As an application we present new correlation inequalities and give new combinatorial proofs of the Lieb-Rivasseau inequality [27,34] and the Messager-Miracle-Sole inequality [32]. We hope this representation will be useful in the further study of the XY model.
Acknowledgements. ML is grateful to Roland Bauerschmidt and Hugo Duminil-Copin for inspiring discussions on the XY model. We also thank Christophe Garban for useful remarks on a draft.

The dual height function
To define the dual model we assume that G is planar and we need to introduce currents. To this end, let $\vec{E}$ be the set of directed edges of G, and let N = {0, 1, . . .}. A function $n : \vec{E} \to N$ is called a current on G. For a current n, we define $\delta n : V \to \mathbb{Z}$ by
\[ \delta n_v = \sum_{v' \sim v} \big( n_{(v,v')} - n_{(v',v)} \big). \]
Hence if $\delta n_v$ is positive, then the amount of outgoing current is larger than the incoming current, and we think of v as a source. Likewise if $\delta n_v$ is negative, there is more incoming current and v is a sink. A current is sourceless if $\delta n_v = 0$ for all v ∈ V.
We define $\Omega_0$ to be the set of all (sourceless) currents. Sourceless currents naturally define a height function h on the set of faces of G, denoted by U, where the height of the outer face is set to zero, and the increment of the height between two faces u and u′ is equal to
\[ h(u') - h(u) = n_{(v,v')} - n_{(v',v)}, \]
where the primal directed edge $(v, v')$ crosses the dual directed edge $(u, u')$ from right to left. That this yields a well defined function on the faces of G follows from the fact that δn = 0. We define the XY weight of a current by
\[ w_\beta(n) = \prod_{(v,v') \in \vec{E}} \frac{1}{n_{(v,v')}!} \Big( \frac{\beta J_{vv'}}{2} \Big)^{n_{(v,v')}}. \tag{2} \]
These weights appear naturally in the expansion of the partition function of the XY model into a sum over sourceless currents after one expands the exponentials in (1) into a power series in the variables $\tfrac{1}{2} \beta J_{vv'} \sigma_v \bar\sigma_{v'}$ for each directed edge $(v, v') \in \vec{E}$, and then integrates out the σ variables. They will also appear in the analogous classical expansion for spin correlations (11).
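The definitions of currents, sources and weights can be made concrete with a small sketch. It assumes that δn_v is the outgoing minus the incoming current at v, and that the XY weight of a current is the product over directed edges of (β/2)^n / n! with all J_e = 1, as suggested by the power series expansion described above. The dictionary encoding of currents and all function names are choices made for this illustration only.

```python
from math import factorial


def divergence(current, vertices):
    """Return delta n: outgoing minus incoming current at every vertex.

    `current` maps directed edges (v, w) to nonnegative integers.
    """
    div = {v: 0 for v in vertices}
    for (v, w), amount in current.items():
        div[v] += amount  # current leaving v
        div[w] -= amount  # current entering w
    return div


def is_sourceless(current, vertices):
    """A current is sourceless if its divergence vanishes everywhere."""
    return all(d == 0 for d in divergence(current, vertices).values())


def xy_weight(current, beta):
    """Product over directed edges of (beta/2)^n / n!, assuming J_e = 1."""
    weight = 1.0
    for amount in current.values():
        weight *= (beta / 2.0) ** amount / factorial(amount)
    return weight


vertices = ["a", "b", "c"]
# One unit of current circulating around a triangle: no sources.
cycle = {("a", "b"): 1, ("b", "c"): 1, ("c", "a"): 1}
# Two units sent from a to b: a is a source and b a sink of strength two.
double_edge = {("a", "b"): 2}

print(is_sourceless(cycle, vertices), xy_weight(cycle, 2.0))
print(divergence(double_edge, vertices), xy_weight(double_edge, 2.0))
```

The second current has divergence 2(δ_a − δ_b), which is the source configuration that appears later for the two-point function of the squared spins.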
We note that using currents to define a model on the dual graph is an instance of planar duality of abelian spin systems [11], and the fact that the function is integer valued is a consequence of Z being the dual group of the unit circle.
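Clearly, the weight (2) defines a probability measure $P_{G,\beta}$ on currents and hence also on height functions. In terms of the height function it is a Gibbs measure given by
\[ P_{G,\beta}(h) \propto \exp\Big( -\sum_{uu' \in E^\dagger} V^\beta_{uu'}\big( h(u') - h(u) \big) \Big), \tag{3} \]
where $E^\dagger$ is the set of dual edges of G, and where the symmetric potentials $V^\beta_e : \mathbb{Z} \to \mathbb{R}$ are given by
\[ V^\beta_e(k) = -\log I_k(J_e \beta), \qquad I_k(x) = \sum_{i \ge 0} \frac{(x/2)^{2i+|k|}}{i!\,(i+|k|)!}, \tag{4} \]
with $I_k$ being the modified Bessel function. We again note that we will usually set all $J_e = 1$ to simplify the notation. A well-known Turán-type inequality for modified Bessel functions [38] states that for any k ≥ 0 and β > 0,
\[ I_k(\beta)^2 \ge I_{k-1}(\beta)\, I_{k+1}(\beta), \tag{5} \]
which means that $V^\beta_e$ is convex on the integers. This puts the model in the well-studied framework of height functions with a convex potential (see e.g. [36]).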
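The convexity claim can be checked numerically. The sketch below assumes the potential is V(k) = −log I_k(β) with J_e = 1, and computes I_k from its power series. Convexity on the integers then amounts to nonnegativity of the second difference V(k+1) − 2V(k) + V(k−1), which is the logarithmic form of the Turán-type inequality I_k(β)² ≥ I_{k−1}(β) I_{k+1}(β). The helper names are illustrative.

```python
import math


def bessel_i(k, beta, terms=60):
    """Modified Bessel function I_k(beta), summed from its power series."""
    k = abs(k)
    return sum(
        (beta / 2.0) ** (2 * i + k) / (math.factorial(i) * math.factorial(i + k))
        for i in range(terms)
    )


def potential(k, beta):
    """Assumed height-function potential V(k) = -log I_k(beta), with J_e = 1."""
    return -math.log(bessel_i(k, beta))


def second_difference(k, beta):
    """Discrete second derivative of the potential at k."""
    return potential(k + 1, beta) - 2.0 * potential(k, beta) + potential(k - 1, beta)


for beta in (0.3, 1.0, 4.0):
    values = [second_difference(k, beta) for k in range(-5, 6)]
    print(beta, min(values))
```

For each β the smallest second difference over the tested range should be positive, in line with convexity of the potential.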

Gibbs measures and delocalisation.
To state the delocalisation result of Lammers we will need the notion of a Gibbs measure for height functions on infinite graphs (though we will not directly work with it in the remainder of the article). Let Γ = (V, E) be an infinite planar graph and Γ † = (U, E † ) its planar dual. If ν is a measure on height functions ϕ : Z U → Z and Λ ⊂ U a finite subset, write ν Λ for the measure restricted to Λ. Let V = (V e ) e∈E † be a family of convex symmetric potentials. We call ν a Gibbs measure for the potential V if for every such Λ, it satisfies the Dobrushin-Lanford-Ruelle relation where ν ϕ Λ is the Gibbs measure on height functions h ∈ Z U given as in (3) (but with V β replaced by V) and conditioned on h being equal to ϕ on the boundary of Λ.
In what follows we will always assume that Γ is locally finite and invariant under the action of a $\mathbb{Z}^2$-isomorphic lattice. We say that ν is translation invariant if it is invariant under the same action.
In a recent beautiful work [22] Lammers gave a condition on the potential that guarantees that there are no translation invariant Gibbs measures on graphs of degree three (trivalent graphs).
This together with the dichotomy stated in Theorem 3 will be one of the key ingredients of the proof of the main theorem.

2.2. Absolute-value-FKG and dichotomy.
In this section, we prove that the height function satisfies the absolute-value-FKG property, which is known to imply the following dichotomy.
Let Γ = (V, E) be a translation invariant graph, and let 0 be a chosen face of Γ. Define B n to be the subgraph of Γ induced by the vertices in V that lie on at least one face of Γ that is contained in the graph ball of radius n on Γ † . We introduce this slightly convoluted definition to guarantee the following three properties: 0 belongs to all B n , also B n Γ as n → ∞, and finally, the weak dual graph of B n (the dual graph with the vertex corresponding to the external face of B n removed) is a subgraph of Γ † . Theorem 3. Consider the setup as above. Then for every β > 0, exactly one of the following two occurs: (i) (localisation) There exists a C < ∞ such that uniformly over all n, Proof. This is a consequence of the absolute-value-FKG property proved below (Proposition 4) and standard arguments using monotonicity in boundary conditions. See for example [10,Lemma 2.2] or [23,Theorem 2.7].
The remainder of this section is devoted to proving the following version of the absolute-value-FKG property.
Proposition 4. Let G = (V, E) be a finite graph and U the set of its faces. Then for all β > 0, and all Ψ, Φ : N U → R + increasing functions, The proposition is easiest to prove for small β. We extend this to general β afterwards.
is a nonincreasing function of k on {0, 1, . . .}, then P G,β is absolute-value-FKG. We define r k = 1 β I k (β) , and need to show that r 2 k ≤ r k−1 r k+1 for all k ≥ 0. The well known recurrence relation Hence it is enough to prove that where k = β 2 r k . Using the Turán inequality (5), it follows that 0 ≤ r k+1 ≤ r k , and therefore it is sufficient to establish that R k := (2k + k+1 )(2k + 4 + k+1 ) − (2k + 2) 2 = 4(k + 1) k+1 + 2 k+1 − 4 ≤ 0. At the same time, simply using the definition of r k+1 and comparing the Taylor expansions (4) of I k+1 and I k term by term gives k+1 ≤ β 2 /(2k + 2). Therefore, when β ≤ 1, we have R k ≤ 2 k+1 − 2 ≤ 0 for all k ≥ 0, which concludes the proof. To treat general values of β, we will use a trick which consists in replacing each edge of G by s = β consecutive edges, and reducing the parameter β by the factor s, together with the following convolution property of the modified Bessel functions.
Lemma 6. For all k, l ∈ Z and all β, β ≥ 0, Proof. This is a classical identity which follows from the fact that where Z, Z are independent Poisson random variables with mean β/2, and the fact that a sum of independent Poisson random variables is Poisson.
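The convolution property can be tested numerically. The sketch below assumes that the identity in question is the classical addition formula Σ_m I_{k−m}(β) I_{m−l}(β′) = I_{k−l}(β + β′). This is what the Poisson argument yields if e^{−β} I_k(β) is the probability that the difference of two independent Poisson variables of mean β/2 equals k. The sum over m is truncated at a cutoff, which is harmless because I_m(β) decays faster than exponentially in |m|. All names are illustrative.

```python
import math


def bessel_i(k, beta, terms=60):
    """Modified Bessel function I_k(beta), summed from its power series."""
    k = abs(k)
    return sum(
        (beta / 2.0) ** (2 * i + k) / (math.factorial(i) * math.factorial(i + k))
        for i in range(terms)
    )


def convolution(k, l, beta1, beta2, cutoff=40):
    """Truncated sum over m of I_{k-m}(beta1) * I_{m-l}(beta2)."""
    return sum(
        bessel_i(k - m, beta1) * bessel_i(m - l, beta2)
        for m in range(-cutoff, cutoff + 1)
    )


def poisson_difference_pmf(k, beta):
    """P(Z - Z' = k) for independent Poisson Z, Z' of mean beta/2 each."""
    return math.exp(-beta) * bessel_i(k, beta)


print(convolution(2, -1, 0.7, 1.8), bessel_i(3, 2.5))
print(sum(poisson_difference_pmf(k, 2.5) for k in range(-40, 41)))
```

The first line should print two matching values, and the second a number very close to 1, confirming that the rescaled Bessel functions form a probability distribution.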
With this we can prove Proposition 4.
Proof of Proposition 4. Let G s = (V s , E s ) be G with each edge replaced by s consecutive edges, and let h s be the height function on G s with law µ Gs,β/s . By Lemma 6 (and an induction argument) the restriction of h s to V has the same law as h 1 . Moreover, β/s ≤ 1 by definition of s, which by Lemma 5 implies that µ s is absolute-value-FKG. To finish the proof it is enough to notice that any increasing function on N V is also increasing on N Vs . Remark 1. An interesting consequence of the idea above (that we will not use in this article) is the following. Consider the case when s from above is independent of β and diverges to infinity. In this limit, the height function becomes well defined at every point of every dual edge. Here we think of the dual graph as the so called cable graph, i.e., every dual edge e is identified with a continuum interval of length J e β. Then the distribution of the height on an edge, when conditioned on the values at the endpoints, is one of the difference of two Poisson processes with intensity J e β/2 each, and conditioned on the value at the endpoints. One can check that the model exhibits a spatial Markov property on the full cable graph and not only on the vertices. This is in direct analogy with the cable graph representation of the discrete Gaussian free field, where the vertex-field can be extended to the edges via Brownian bridges (see e.g. [30] and the references therein).

Loop representation of currents and path reversal
The purpose of this section is mainly to develop a loop representation for the two-point function of the XY model. The important aspect of our approach is that the correlations are represented as probabilities for loop connectivities in random ensembles of closed loops. This is in contrast with most of the classical representations that write correlation functions as ratios of partition functions of loops, where in the numerator, in addition to loops, one also sums over open paths between the points of insertion in the correlator [9,37]. We note that a similar idea to ours appears in the work of Benassi and Ueltschi [6], but due to technical differences in the framework (see Remark 4), the formula for the two-point function obtained in [6] is not as transparent as ours.
Let G = (V, E) be a finite, not necessarily planar graph. We say that a multigraph M on V is a submultigraph of G if after identifying the multiple copies of the same edge in M it is a subgraph of G. We write L S for the set of all loop configurations outside S, and define a weight for ω ∈ L S by where M is the underlying multigraph, and M e is the number of copies of e in M. When S = ∅, a configuration is composed only of loops that can visit every vertex in V , and we simply call it a loop configuration.
An important feature of the weight (7) is that it depends on ω only through M. Also note, that if S ⊆ S, then there is a natural map ρ : L S → L S that consists in forgetting (or cutting) the loop connections at the vertices in S \ S . Under this map, each configuration in L S has v∈S\S (deg M (v)/2)! preimages, each of them having the same weight, and hence This consistency property will be useful later on. For now, let |n| : E → N be the amplitude of a current n, i.e.
Definition (Multigraph of a current and consistent configurations). For a current n, let M n be the submultigraph of G where each edge e ∈ E is replaced by |n| e (possibly zero) parallel copies of e. A loop configuration on M n is called consistent with n if for every edge (v, v ) ∈ E, the number of times the loops traverse a copy of vv in the direction of (v, v ) is equal to n (v,v ) . We define L S n to be the set of all loop configurations on M n outside S that are consistent with n.
and S(ϕ) = {v ∈ V : ϕ v = 0}. For a current n, with a slight abuse of notation, we also write S(n) = S(δn). Note that L S n can be nonempty only if S(n) ⊆ S. Indeed, each path and loop that enters a vertex in V \ S must also leave it, and hence the total number of incoming and outgoing arrows at each such vertex must be the same. For ϕ : V → Z, we also define Again, this is nonempty only if S(ϕ) ⊆ S. We will write L S 0 , where 0 denotes the zero function on V .
We now relate the weights of loops to those of currents. To this end, note that for each edge vv ∈ E, there are exactly ways of assigning orientations to it so that the result is consistent with n. Moreover, independently of the choices of orientations, there are exactly (deg Mn (v)/2)! possible pairings of the incoming and outgoing edges at each vertex v ∈ V \ S. Combining all this we arrive at a crucial loop representation for current weights: if S(n) ⊆ S, then An important observation here is that the left-hand side is independent of S, and hence so is the right-hand side.

3.1. Coupling with the height function.
We now apply this framework to the case of two sourceless currents and a coupling with the corresponding height function. From (9) we have where 0 denotes the zero function on V.
Remark 2. This loop representation of the partition function, though obtained via a different procedure, goes back to the work of Symanzik [37], and Brydges, Fröhlich and Spencer [9].
Moreover, in the case when G is planar we immediately get the following distributional identity. Define P G,β to be the probability measure on L 0 := L ∅ 0 induced by the weights λ β := λ ∅ β . For each face u ∈ U of G, and ω ∈ L 0 , define W ω (u) to be the total net winding of all the loops in ω around u.
Proposition 7. The law of (W (u)) u∈U under P G,β is the same as the law of the height function (h(u)) u∈U under P G,β .

3.2. The two-point function and path reversal.
We now turn to the loop representation of the two-point function. For reasons that will become apparent soon, we need to consider the two-point function of the squares, i.e., $\langle \sigma_a^2 \bar\sigma_b^2 \rangle$. We note that the more standard correlation function $\langle \sigma_a \bar\sigma_b \rangle$ (or rather its square) can be treated using our approach from Appendix A.
Since the resulting currents will have sources, we will need to consider nonempty S in the construction above. To this end, fix two vertices a, b ∈ V, and define $\varphi = 2(\delta_a - \delta_b)$, where $\delta_a(v) = 1\{a = v\}$. To lighten the notation, we will write a, b instead of {a, b} for the set S. As for the partition function, expanding the exponential in the Gibbs-Boltzmann weights (1) into a power series in $\tfrac{1}{2} \beta J_{vv'} \sigma_v \bar\sigma_{v'}$ for each directed edge $(v, v') \in \vec{E}$, and integrating out the σ variables, we classically get where the last equality is new and follows from (9).
We will write P a,b (ω) for the set of paths in ω that start at a and end at b, and define We now want to "erase the sources" at a and b from the currents underlying L a,b ϕ , and hence rewrite the numerator as a sum over L a,b 0 . We will then ultimately connect the open paths at a and b in all possible ways, and hence get a sum over L ∅ 0 (see Figure 1 for an example). To this end note that in each ω ∈ L a,b ϕ there are exactly two more paths going from a to b, than those going from b to a, i.e., m a,b (ω) = m b,a (ω) + 2. The elementary operation that we will perform on the former paths is reversal. To this end, denote by r(γ) the path γ with the orientation of all the visited edges reversed. Obviously this does not change the underlying multigraph, and hence also the weight of the loop configuration. The crucial observation now is that it maps ω ∈ L a,b ϕ to a configuration ω ∈ L a,b 0 , and hence erases the sources of the underlying currents. Indeed one can easily check that after reversing a path, the number of incoming minus the number of outgoing edges at every vertex v / ∈ {a, b} in ω is the same as in ω, whereas at a (resp. b) this number is decreased (resp. increased) by two. More precisely, our transformation maps bijectively a pair (ω, γ) where ω ∈ L a,b ϕ and γ ∈ P a,b (ω) to the pair (ω , r(γ)) where ω ∈ L a,b 0 and r(γ) ∈ P b,a (ω ). Moreover, m b,a (ω ) = m b,a (ω) + 1, which in particular means that m(ω ) > 0. Since path reversal does not change the weight of a loop configuration, we obtain where in the second equality we used path reversal, the last equality follows from (8) with S = ∅, and where, with a slight abuse of notation, for ω ∈ L ∅ 0 , m b,a (ω ) is the number of pieces of loops going from b to a and not visiting b nor a except for the start and end vertex. 
Recall that P G,β is the probability measure on L ∅ 0 induced by the weights λ ∅ β , and note that m b,a has the same distribution as m a,b under P G,β (the law on loops is invariant under a global orientation reversal). We therefore obtain from (10) and (11) the following loop representation of the two-point function.   Remark 3. We stress again that the crucial property of this loop representation is that the measure P G,β is supported on collections of closed loops, and is independent of the choice of a and b. A similar idea was used by Lees and Taggi [26] to study spin O(n) models with an external magnetic field. Moreover, by Proposition 7 and Lemma 8, the random loops under P G,β carry probabilistic information about both the spin XY model (in terms of correlation functions) and its dual height function (as an exact coupling). An analogous role for the Ising and Ashkin-Teller model is played by the (double) random current measure that encodes both an integer valued height function and the spin correlations [12,28,29]. The difference is that for the XY model, the correlations are determined by loop connectivities instead of percolation connectivities. This comparison offers an alternative explanation for the different types of phase transition in discrete and continuous spin systems.
Remark 4. The approach above is different from [6,9,26,37] in that in the loop configurations, we never make connections at vertices with sources. This leads to different combinatorics than in [6], and in particular a more transparent formula for the two-point function. See also Appendix A for a different construction where we allow such connections.
Remark 5. We call a multigraph M Eulerian if its degree is even at every vertex. Another way to sample the loop configuration that easily follows from the above definitions is the following procedure: Remark 6. Using the same argument as above one obtains the following formula for higher power two-point functions. For k ≥ 1, we have where (m) k = m(m−1) · · · (m−k+1) is the falling factorial. One can also consider multi-point functions and get more complicated loop representation formulas.
Remark 7. This representation is valid on any, not necessarily planar, graph, and it is known that the XY model exhibits long-range order in dimension greater than two [15]. The disorder-order transition should coincide with the onset of infinite loops (bi-infinite paths) on the current. The alternative heuristic for the lack of symmetry breaking in two dimensions arising from this picture is that planar simple random walk is recurrent (and hence does not produce infinite loops).
Remark 8. The isomorphism theorem of Le Jan [24] says that the discrete complex Gaussian free field can be coupled with a Poissonian collection of random walk loops, the so called random walk loop soup, in such a way that one half of the square of the absolute value of the field is equal to the total occupation time of the random walk loops. On the other hand, it is immediate that conditioned on the absolute value of the field, its complex phase is distributed like the XY model with coupling constants depending on this absolute value. With some work, e.g. using [25], one can show that under this conditioning the random walk loops have the same distribution as the loops described above.

Delocalisation implies no exponential decay
In this section we prove that if the height function delocalises, then the spin correlations are not summable along certain sets of vertices. In the next section, we will show how to apply this together with the delocalisation results of Lammers [22] to deduce a BKT-type phase transition in a wide range of periodic planar graphs.
Suppose Γ = (V, E) is a translation invariant planar graph, and write for the infinite volume two-point function, where the limit is taken along any increasing sequence of subgraphs G exhausting Γ. That this is well defined is guaranteed by the fact that the sequence is nondecreasing, i.e., σ aσb G,β ≤ σ aσb G ,β if G is a subgraph of G , which in turn is a classical consequence of the Ginibre inequality [18].
Definition. Let 0 be a distinguished face of Γ. A bi-infinite self-avoiding path in Γ that goes through at least one edge incident to 0 is called a cut (at 0). Note that a cut L naturally splits into two infinite sets of vertices L + and L − with the property that any cycle in Γ that surrounds 0 must intersect both L + and L − .
The main quantity of interest for us will be the sum of correlations along cuts. To be more precise, for ε > 0, let
Proposition 9. For every ε > 0, there exists C = C(ε, β, Γ) < ∞ such that for all finite subgraphs G of Γ containing 0, we have where the infimum is over all cuts at 0.
Before presenting the proof, let us mention that a direct corollary of this proposition is the following. A natural example of a cut is any path that stays at a constant distance from a straight line going through 0. In this case it is easy to see that χ Γ,β (L) is finite whenever there is exponential decay of spin correlations. We can now state the main conclusion of this section. Proof. We know that situation (i) from Theorem 3 does not happen. This means that sup n E Bn,β [|h(0)|] = ∞, and the claim follows directly from Proposition 9.
Remark 9. One naturally expects that the localisation-delocalisation phase transition for the height function happens at the same temperature as the BKT transition for the XY model. The remaining part of this prediction is therefore to show that if the spin correlations do not decay exponentially, then the height function delocalises. We do not do this in this article.
Recall that m a,b is the number of paths (pieces of loops) in a loop configuration that go from a to b. We will need the following lemma.
Lemma 11. For all β > 0 and p > 1, there exists a C p < ∞ such that for all finite graphs G = (V, E) and all a, b ∈ V , Proof. Fix β > 0, G = (V, E) and a, b ∈ V , and let ω ∈ L 0 be a loop configuration on G. Denote by ω e , the number of visits of all loops in ω to an undirected edge e ∈ E. If there are m ≥ 1 paths going from a to b in ω, then in particular c∼a ω {a,c} ≥ m. This implies that Applying Hölder's inequality gives where 1/p + 1/q = 1. We now notice that by definition, ω e under P G,β has the same distribution as the amplitude |n| e under P G,β . Therefore, to finish the proof it is enough to show that for all p > 1, there exists C p < ∞ depending on β but independent of G such that We postpone the proof of this bound to Lemma 13 and Lemma 14.
The last ingredient that we will need is the following inequality Lemma 12. For any a, b ∈ V , we have A version of the Ginibre inequality (see e.g. [5]) says that which after rearrangement gives the desired inequality.
We note that the constant in the inequality above can be improved to 1 using our switching techniques from Appendix A (see Remark 10).
We are now ready to prove the main theorem.
Proof of Proposition 9. Fix a finite subgraph G and a cut L. By Proposition 7 the height function h(0) under $P_{G,\beta}$ has the same law as W(0) (the total net winding around 0 of all loops in a loop configuration) drawn according to $P_{G,\beta}$. Moreover, any piece of a loop that adds to the winding (in any orientation) must intersect both $L^+$ and $L^-$ by definition of a cut. Therefore, taking p = 2/(2 − ε), we have where the third line follows from Lemma 11, the fourth one from Lemma 8, the fifth one from Lemma 12, and the last one from (12). This completes the proof.
It therefore remains to show (14), which will directly follow from Lemma 13 and Lemma 14 below. To that end, define for k ∈ N and β > 0, a random variable Y k by so that the normalizing constant is I k (β). For e = vv , let be the absolute value of the gradient of the height function across the dual edge e † . Note that the random variables (X e = X e (n)) e∈E defined through X e = |n| e − |∇h| e 2 have the same distribution as Y |∇h|e . Moreover, conditionally on |∇h|, they are an independent family. To show (14) it is enough to bound the moments of |∇h| e and X e separately, which we will now do. By the definition of the height function and currents, we therefore have where we used the obvious bounds σ l vσ l v G\e,β ≤ 1, and Z 0 G\e /Z 0 G ≤ 1, and the last inequality follows easily from the definition of I r (β). Finally, The last bound is independent of G and e which completes the proof.

Lemma 14.
For all β > 0 and all r ∈ N, there exists aC r < ∞ such that for all finite planar graphs G = (V, E) and e ∈ E, For two nonnegative integers i, r, let (i) r = i(i − 1) · · · (i − r + 1) be the falling factorial with the convention that (i) 0 = 1. Note that (i) r = 0 whenever i < r. It will be convenient to look at the falling factorial moments. First note that by definition of Y k , By the Turán inequality (5), the map k → I k+1 (β)/I k (β) is decreasing and hence Now note that (i) r ≥ |i − r| r when i ≥ r, and hence i r ≤ 2 r−1 (|i − r| r + r r ) ≤ 2 r ((i) r + r r ). Finally E β [|X e | r | |∇h| e = k] = E β [|Y k | r ] ≤ 2 r (C + r r ) :=C r , where the last bound does not depend on k. Integrating over the possible values of |∇h| e concludes the proof.
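The falling factorial moments used in this argument have a closed form that is easy to test. The sketch below assumes that Y_k takes the value i ≥ 0 with probability proportional to (β/2)^{2i+k} / (i! (i+k)!), which is consistent with the normalizing constant being I_k(β). Under this assumption, shifting the summation index gives E[(Y_k)_r] = (β/2)^r I_{k+r}(β) / I_k(β), and the monotonicity of k → I_{k+1}(β)/I_k(β) makes this largest at k = 0. The function names are illustrative.

```python
import math


def bessel_i(k, beta, terms=60):
    """Modified Bessel function I_k(beta), summed from its power series."""
    k = abs(k)
    return sum(
        (beta / 2.0) ** (2 * i + k) / (math.factorial(i) * math.factorial(i + k))
        for i in range(terms)
    )


def falling_factorial(i, r):
    """(i)_r = i (i-1) ... (i-r+1), equal to 1 for r = 0 and 0 for 0 <= i < r."""
    out = 1
    for j in range(r):
        out *= i - j
    return out


def falling_moment(k, beta, r, terms=60):
    """E[(Y_k)_r] by direct summation over the assumed law of Y_k."""
    norm = bessel_i(k, beta)
    total = 0.0
    for i in range(terms):
        pmf = (beta / 2.0) ** (2 * i + k) / (math.factorial(i) * math.factorial(i + k))
        total += falling_factorial(i, r) * pmf / norm
    return total


def closed_form(k, beta, r):
    """(beta/2)^r * I_{k+r}(beta) / I_k(beta)."""
    return (beta / 2.0) ** r * bessel_i(k + r, beta) / bessel_i(k, beta)


print(falling_moment(2, 3.0, 3), closed_form(2, 3.0, 3))
print([round(closed_form(k, 3.0, 3), 6) for k in range(5)])
```

The first line compares direct summation with the closed form, and the second shows the moments decreasing in k, which is what allows a bound that is uniform in the value of the gradient.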

Existence of phase transition in the XY model
In this section, we prove that for all translation invariant planar graphs Γ = (V, E), the XY model undergoes a non-trivial phase transition in terms of the quantity χ ε β (L). As before, let 0 denote an arbitrary distinguished face of Γ. We define β 0 = inf{β > 0 : for all ε > 0 and all cuts L at 0, χ ε β (L) = ∞}. Theorem 15. Let Γ be as above. Then β 0 < ∞.
By Corollary 10 it is enough to show that for any such Γ, there exists a finite β 0 > 0 such that the associated height function delocalises in the sense that there are no translation invariant Gibbs measures on the dual Γ † . We first implement this strategy for triangulations, where delocalisation can be shown directly using the general result of Lammers [22] (Theorem 2).
Proof of Theorem 15 for triangulations. Let Γ be a translation invariant triangulation. Note that condition (6) in our case is equivalent to $I_1(\beta)/I_0(\beta) \ge 1/2$. It is known that this fraction converges to 1 as β → ∞ (see for example [35]), and therefore in light of Theorem 2, there are no translation invariant Gibbs measures for β large enough.
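To get a feeling for how large β must be in the triangulation case, the condition I_1(β)/I_0(β) ≥ 1/2 can be solved numerically. The sketch below is illustrative only. It evaluates the Bessel functions by a truncated power series and locates the threshold by bisection, relying on the standard fact that β ↦ I_1(β)/I_0(β) is increasing. The function names, the search interval, and the tolerance are assumptions of this example.

```python
import math

def bessel_i(k, x, terms=40):
    """Modified Bessel function I_k(x) via a truncated power series (fine for x <= 10)."""
    half = x / 2.0
    return sum(half ** (2 * i + k) / (math.factorial(i) * math.factorial(i + k))
               for i in range(terms))

def bessel_ratio(beta):
    """The fraction I_1(beta) / I_0(beta) appearing in the condition."""
    return bessel_i(1, beta) / bessel_i(0, beta)

def smallest_beta(target, lo=1e-6, hi=10.0, tol=1e-10):
    """Bisection for the smallest beta in [lo, hi] with I_1/I_0 >= target.

    Assumes the ratio is increasing in beta and that the target is reached below hi.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if bessel_ratio(mid) >= target:
            hi = mid
        else:
            lo = mid
    return hi

if __name__ == "__main__":
    for beta in (1.0, 2.0, 5.0, 10.0):
        print(beta, bessel_ratio(beta))   # the ratio increases towards 1
    print(smallest_beta(0.5))             # threshold for I_1/I_0 >= 1/2
```

Under these assumptions the threshold lands between 1.1 and 1.2. It is only the point from which the stated sufficient condition holds, not an estimate of the critical inverse temperature.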
To extend beyond triangulations, we will use a different approach. We stress that in particular, we will not show delocalisation of the height function on graphs that are not triangulations. Instead, we exploit monotonicity in coupling constants to bound from below the spin correlations on an arbitrary translation invariant graph by correlations on a modified graph that is a triangulation. We explain this procedure in detail for the square lattice, and briefly mention the extension to other lattices at the end.
In what follows, we will need the following well-known monotonicity of spin correlations that is a classical consequence of the Ginibre inequality [18].
In order to use (6), we need to transform Γ into a triangulation. See Figure 2 for guidance. Fix a square, double the bottom and left edges, and put coupling constants β/2 on the doubled edges instead of β. Next, double the common vertex of the left and bottom edges and add an additional edge e, on which we set the coupling constant to infinity. This does not change the distribution of the spins. Finally, set the coupling constant on the edge e to 0, which is equivalent to removing the edge from the square, and repeat the procedure for all other squares. In this way, we obtain a new lattice Γ′, which consists of squares with a diagonal on which there is an additional vertex. Note that all coupling constants are now equal to β/2. By Lemma 16, for all pairs of vertices a, b in Γ, using the natural embedding of Γ in Γ′.
Since Γ′ is a translation invariant graph, the dichotomy statement of Theorem 3 holds. To show that there are no translation invariant Gibbs measures for the associated height function, notice that the dual (Γ′) † of Γ′ (after collapsing the doubled edges to a single edge) is trivalent. Moreover, the height function on any finite subgraph of (Γ′) † has a potential given by V e = V β/2 e for the nondiagonal edges and V e = 2V β/2 e otherwise, and the potential V satisfies Lammers' condition (6) precisely when $(I_1(\beta/2)/I_0(\beta/2))^2 \ge 1/2$. Since the fraction on the left-hand side tends to 1 as β → ∞, we can choose β large enough so that there are no translation invariant Gibbs measures for the height function on (Γ′) † . Note that every cut on Γ embeds naturally as a cut on Γ′. Therefore, by Proposition 9 together with (15), we have that for each cut L on Γ and each ε > 0, This finishes the proof.
Figure 3. The transformation of a general graph to a triangulation (after identifying the resulting multiple edges). The dashed edges are such that the coupling constant is set to infinity first, and then to zero (which is equivalent to removing the edges) and hence the spin correlations in the final graph are smaller than in the original graph.
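For the square lattice, the sufficient condition in the proof above reads (I_1(β/2)/I_0(β/2))^2 ≥ 1/2. The following Python sketch, again only an illustration, finds the smallest β satisfying it by bisection. It assumes that x ↦ I_1(x)/I_0(x) is increasing and uses a truncated power series for the Bessel functions. All function names and numerical parameters are hypothetical choices for this example.

```python
import math

def bessel_i(k, x, terms=40):
    """Modified Bessel function I_k(x) via a truncated power series (fine for x <= 10)."""
    half = x / 2.0
    return sum(half ** (2 * i + k) / (math.factorial(i) * math.factorial(i + k))
               for i in range(terms))

def square_lattice_lhs(beta):
    """Left-hand side (I_1(beta/2) / I_0(beta/2))^2 of the condition."""
    x = beta / 2.0
    return (bessel_i(1, x) / bessel_i(0, x)) ** 2

def smallest_beta_square(lo=1e-6, hi=20.0, tol=1e-10):
    """Bisection for the smallest beta in [lo, hi] with square_lattice_lhs(beta) >= 1/2."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if square_lattice_lhs(mid) >= 0.5:
            hi = mid
        else:
            lo = mid
    return hi

if __name__ == "__main__":
    print(smallest_beta_square())
```

With these choices the threshold lies a little above β = 4. The value reflects the particular triangulation used in the proof and is not claimed to be sharp.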
To extend this proof to general graphs, we make each face into a triangulation by "zigzagging" (see Figure 3).

No exponential decay implies a power-law lower bound
In this section we finish the proof of the main theorem by showing that the absence of exponential decay implies a power-law lower bound on the two-point function when Γ = Z 2 . Similar arguments can be applied to other graphs that in addition to being translation invariant possess reflection and rotation symmetries.
Proof of Theorem 1. Let 0 denote the vertex at the origin. For a finite subgraph G of Z 2 containing 0, let where ∂G is the set of vertices of G adjacent to at least one vertex outside G. Define We will show that β c satisfies the properties listed in Theorem 1. To this end first fix β < β c . By Lemma 16, there exists a finite graph G with ϕ G,β < 1. Using a standard argument that consists in iteratively applying the Lieb-Rivasseau inequality [27,34] (see Lemma 20) to translates of G, we obtain that the two-point functions decay exponentially fast, and hence (i) holds true.
To conclude (ii), note that for each finite G, ϕ G,β is a continuous function of β, and hence the set in (16) is open. This means that for every β ≥ β c , we have ϕ G,β ≥ 1 for all finite subgraphs G. Now let Λ n be the box [−n, n]^2 , and let Λ′ n be the ball in L 1 of radius 2n (see Figure 4). We write x n := (n, n) ∈ ∂Λ n ∩ ∂Λ′ n and a n = σ 0σxn Z 2 ,β . By rotation symmetry and the Messager-Miracle-Sole [32] inequality (see Lemma 21), we have For β ≥ β c , we moreover have $\sum_{w \in \partial\Lambda'_n} \langle \sigma_0 \bar\sigma_w \rangle_{\mathbb{Z}^2,\beta} \ge \varphi_{\Lambda'_n,\beta} \ge 1$.
Finally by Theorem 15 we know that there exists a finite β at which there is no exponential decay, and by classical expansions there exists a nonzero β at which there is exponential decay (see e.g. [2]). We conclude that 0 < β c < ∞.

Appendix A. Double currents, path switching and correlation inequalities
The main purpose of this section is to present a new technique that may be applied to a further study of the XY model (possibly in higher dimensions). We develop a loop representation for squares and products of correlation functions. This is a generalization of the construction from Section 3, and to the best of our knowledge has not yet been described in the literature. It is also analogous to the double random current representation of the Ising model [1,3,19] but is more subtle as one has to deal with path switching rather than connection switching in a percolation model. We stress the fact that we do not use any of the results from this section in the proof of the main theorem, except for the well known inequalities of Lieb and Rivasseau, and Messager and Miracle-Sole.
There will be two major differences in the definition of a loop configuration compared to Section 3: the edges will come in two colours, red and blue, corresponding to two currents r and b respectively, and we will allow the paths to enter vertices v at which the number of incoming and outgoing edges is not the same, i.e., δ(r + b) v ≠ 0. To be more precise, consider the following definition. We write L̃ S ϕ for the set of all coloured loop configurations outside S with sources ϕ, and define a weight on L̃ S ϕ by where M is the underlying multigraph, and M e is the number of copies of e in M.
Note that this weight no longer only depends on the multigraph M and on S, but also on |ϕ(ω)|, where ϕ(ω) are the sources of ω. Also note that in the above definition S and ϕ can be chosen independently. S denotes the set of vertices where we do not resolve any connections between paths and loops, and ϕ prescribes where the sources and sinks are (vertices with nonzero value of ϕ). At any such vertex v, we resolve as many connections as possible leaving only |ϕ v | incoming or outgoing arrows unmatched, depending on the sign of ϕ v . This is the reason why ϕ appears in the above weight, which was not the case in Section 3.
As before if S′ ⊆ S, then there is a natural map ρ : L S′ → L S that consists in forgetting (or cutting) the loop and path connections at the vertices in S \ S′, and
Definition (Coloured currents and consistent configurations). We will consider a pair of currents r, b that we think of as red and blue respectively. A coloured loop configuration ω on M r+b is called consistent with r and b if for every edge vv′ ∈ E, the number of times the loops and paths traverse a red (resp. blue) copy of vv′ in the direction of (v, v′) is equal to r (v,v′) (resp. b (v,v′) ). In particular ω has sources δ(r + b). We define L̃ S r,b to be the set of all coloured loop configurations on M r+b outside S that are consistent with r and b.
For ϕ, ψ : V → Z, we also define where the union is clearly disjoint. For brevity, we will write L̃ S 0 instead of L̃ S 0,0 , where 0 denotes the zero function on V .
We now relate the weights of loops to those of pairs of currents. To this end, note that for each edge vv′ ∈ E, there are exactly ways of assigning colour to the copies of vv′ in M r+b , and to orient them in the two possible ways so that the result is consistent with r and b. Moreover, independently of the choices of colours and orientations, there are exactly possible pairings of the incoming and outgoing edges at each vertex v ∈ V \ S such that there are exactly ϕ v 1{ϕ v > 0} outgoing and −ϕ v 1{ϕ v < 0} incoming edges unpaired. This is equivalent to choosing the possible steps that all the loops and paths in the configuration make at v. Combining all this, we get the following identity: An important observation again is that the right-hand side is independent of S, and hence so is the left-hand side. In particular, for two sourceless currents, we have Again, in the case when G is planar we get the following distributional identity. Let P̃ G,β be the probability measure on L̃ 0 := L̃ ∅ 0 induced by the weights λ̃ β := λ̃ ∅ β . For each face u ∈ U of G, and ω ∈ L̃ 0 , define W ω (u) to be the total net winding of all the loops in ω around u.
Proposition 17. The law of (W (u)) u∈U under P̃ G,β is the same as the law of the sum of two independent height functions (h(u) + h′(u)) u∈U under P G,β .
A.1. The two-point function and path switching. We now turn to the loop representation of the square of the two-point function. To this end, write ϕ = δ a − δ b . Similar to (11), we get where the last equality follows from (19).
As before, we now want to reverse some of the paths. However, this time we also need to take care of the colours of the edges visited by a path. This motivates the following definition.
Definition (Path switching). For a path γ in a coloured loop configuration ω, we define s(γ) to be the path obtained from γ by
• reversing the orientation of γ, and
• swapping the colours of the edges visited by γ.
We also define ω′ to be the configuration where γ is replaced by s(γ). This operation does not change the underlying multigraph. Moreover if γ starts at a and ends at b, then for any ϕ, ψ : V → Z, path switching maps (see Figure 5).
We note that there are two important cases in which path switching does not change the weight λ̃ S β . The first one is when {a, b} ⊂ S, and the second one is when ϕ a + ψ a = 1, and ϕ b + ψ b = −1, since then the absolute value of the sources of the configuration does not change.
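The effect of path switching on the sources can be made concrete with a small Python sketch. It is a toy illustration under assumptions made here: a configuration is encoded as a list of steps (tail, head, colour), the vertex labels are hypothetical, and the sources of a colour are computed as outgoing minus incoming steps at each vertex. In the example, both colours start with sources δ_a − δ_b, and switching one path from a to b subtracts δ_a − δ_b from the sources of each colour while leaving the underlying multigraph unchanged.

```python
from collections import Counter

SWAP = {"red": "blue", "blue": "red"}

def switch(path):
    """Path switching s(gamma): reverse the orientation and swap the colours."""
    return [(v, u, SWAP[c]) for (u, v, c) in reversed(path)]

def divergence(steps, colour):
    """Outgoing minus incoming steps of one colour at each vertex (zeros dropped)."""
    div = Counter()
    for u, v, c in steps:
        if c == colour:
            div[u] += 1
            div[v] -= 1
    return {x: d for x, d in div.items() if d != 0}

def multigraph(steps):
    """Unoriented, uncoloured edge multiset underlying a list of steps."""
    return Counter(frozenset((u, v)) for u, v, _ in steps)

# Two paths from a to b with mixed colours; together the red and the blue
# steps each have sources delta_a - delta_b.
gamma = [("a", "x", "red"), ("x", "y", "blue"), ("y", "b", "red")]
other = [("a", "x", "blue"), ("x", "y", "red"), ("y", "b", "blue")]

before = gamma + other
after = switch(gamma) + other

if __name__ == "__main__":
    print(divergence(before, "red"), divergence(before, "blue"))
    print(divergence(after, "red"), divergence(after, "blue"))
    print(multigraph(before) == multigraph(after))
```

The sketch tracks only sources and the multigraph. It does not model the pairing of edges at vertices or the weights of configurations.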
Again the crucial observation now is that switching a path going from a to b maps ω ∈ L̃ a,b ϕ,ϕ to ω′ ∈ L̃ a,b 0 , and hence erases the sources and sinks of the underlying currents. Indeed one can easily check (see Figure 5) that after reversing a path and swapping the colours, the number of incoming minus the number of outgoing red and blue edges at every vertex v ∉ {a, b} in ω′ is the same as in ω, whereas at a and b this number is decreased by one. Since we did not change the sources outside {a, b}, we do not change the weight of a loop configuration, and hence obtain in the same way as in Section 3.2 that Together with (21) this implies the following loop representation of the square of the two-point function.
Proposition 18. Let a, b ∈ V be distinct. Then
Remark 10. The constant in the inequality of Lemma 12 can be improved to 1 using the same method as above but starting from coloured loop configurations in L̃ a,b 0,2ϕ instead of L̃ a,b ϕ,ϕ .
A.2. Application to some inequalities. As a further application we now prove an inequality that is related to but independent of the Ginibre inequality.
Lemma 19. Let a, b, c ∈ V . Then $\langle \sigma_a \bar\sigma_b \rangle_{G,\beta} \ge \langle \sigma_a \bar\sigma_c \rangle_{G,\beta} \langle \sigma_c \bar\sigma_b \rangle_{G,\beta} \ge \langle \sigma_a \sigma_b \bar\sigma_c^2 \rangle_{G,\beta}$.
Proof. The two inequalities have, perhaps surprisingly, almost the same proof. We only prove the first and leave the second to the reader. We set S = {c} and will write c instead of {c} in our notation. We also define ϕ = δ a − δ c , ψ = δ b − δ c , and note that ψ − ϕ = δ b − δ a . Also note that for each ω ∈ L̃ c ϕ,ψ , the unique path starting at a must end at c. Consider the map ω ↦ ω′ that switches this path. Clearly this is a bijection between L̃ c ϕ,ψ and {ω′ ∈ L̃ c 0,ψ−ϕ : the unique path ending at a starts at c}.
Moreover, we have |ϕ v (ω)| = |ϕ v (ω′)| for all v ≠ c, and hence the weights λ̃ c β are preserved. This means that where we used (19) twice. This finishes the proof.
The purpose of the remainder of this section is to give more applications of the representation introduced above. We start with two new bijective proofs of the classical inequalities that we used in the proof of our main theorem.
Lemma 20 (Lieb-Rivasseau inequality [27,34]). Let G = (V, E) be any graph. Let a, b ∈ V be distinct, and let H be a finite subgraph of G containing a and not containing b, and let ∂H be the set of vertices of H adjacent to at least one vertex outside H. Then $\langle \sigma_a \bar\sigma_b \rangle_{G,\beta} \le \sum_{c \in \partial H} \langle \sigma_a \bar\sigma_c \rangle_{H,\beta} \langle \sigma_c \bar\sigma_b \rangle_{G,\beta}$.
Proof. It is enough to assume that G is finite, and then approximate an infinite graph by finite subgraphs. The proof is similar to the previous one. Assume a ∉ ∂H. Otherwise, there is nothing to prove. Fix c ∈ ∂H and S = {c}. We will write c instead of {c} in our notation. Let ϕ = δ c − δ a , ψ = δ c − δ b , and note that ψ − ϕ = δ a − δ b .
Write L̃ c for the collection of coloured loop configurations ω ∈ L̃ c 0,ψ−ϕ with the property that the unique path starting at a exits H \ ∂H at c, and ω has no red edges outside of H.
For ω ∈ L̃ c consider a coloured loop configuration ω′ where this path is switched. Clearly this is a bijection between L̃ c and the set of configurations ω′ ∈ L̃ c ϕ,ψ that have no red edges outside H, and for which the unique path ending at a stays within H \ ∂H until it hits c. Denote this collection of configurations by L̃ c . Moreover, we have |ϕ v (ω)| = |ϕ v (ω′)| for all v ≠ c, and hence the weights λ̃ c β are preserved. Let Ẽ c be the collection of ω ∈ L̃ 0,ψ−ϕ with the property that the unique path from a to b exits H \ ∂H at c, and ω does not have red edges outside of H. Clearly, the subset of L̃ 0,ψ−ϕ consisting of configurations with no red edges outside of H equals the disjoint union ∪ c∈∂H Ẽ c , and cutting ω ∈ Ẽ c at c gives an element of L̃ c . In light of (18), we therefore have which completes the proof.
We are also able to use the coloured loop representation to prove the Messager-Miracle-Sole inequality.
Geometrically, this in particular implies that the largest correlation with the spin at 0 on any vertical, horizontal or diagonal straight line is attained by the vertex closest to 0. This will follow from the following lemma after taking G → Z 2 . The proof is inspired by the one from [4] for the Ising model. The idea is to fold a graph across a line and think of the parts of the current coming from both sides of the line as the red and blue current in the coloured loop representation.
Proof. We only consider the easier case when L passes through vertices. This means that it is either a diagonal, or a horizontal (vertical) line at integer height. The more involved case when L passes only through the edges (this case implies Lemma 21 for horizontal and vertical lines) we leave to the interested reader.
If L is horizontal or vertical, then split the edges that lie on L into two parallel edges with coupling constants β/2, and think of the resulting graph as a new graph G. Write Z for the set of vertices on L, and G − = (V − , E − ) and G + = (V + , E + ) for the two isomorphic parts of G separated by L where G − contains a and b (each of them also containing Z).
We can decompose a current n on G into two parts: r and b on G − and G + . In what follows, we identify G − with G + under the obvious isomorphism, and all currents are considered on G − unless stated otherwise. Let C k , for k = 0, 1, be the set of functions ϕ : V − → Z such that ϕ v = 0 for v ∈ V − \ Z, and Σ v∈Z ϕ v = k. Since every current in Ω δa−δ L(b) (G) must have a total flux of +1 across L, we can write where the second inequality holds true as the weight w β is invariant under reversal of the current, and the last equality is a consequence of (19). Now, for each ω ∈ L̃ Z δ 0 −ϕ,δ b −ϕ switch the unique path γ starting at b. This transformation preserves weights and results in a configuration ω′ ∈ L̃ Z δ 0 −δ b −ϕ′ ,−ϕ′ , where ϕ′ = ϕ − δ z ∈ C 0 and z ∈ Z is the vertex at which γ ends. Reversing the order of the steps above we therefore get = σ aσb G,β Z 0 G,β , where the last equality follows since the total flux of a current in Ω δa−δ b (G) across L is zero.
A.3. Limitations of the coloured loop representation. A natural idea is to try to prove the Ginibre inequality in the form of Lemma 16 using our representation. One would like to show that the derivative of the two-point function with respect to one coupling constant J e is nonnegative. Using coloured loop configurations we can write where R e (ω) and B e (ω) are, respectively, the numbers of red and blue copies of e in the multigraph visited by the unique path from a to b in ω. Without going into too many details, to justify the second equality we make the following observations. First, taking the derivative with respect to J e is equivalent to dividing by J e and marking one of the copies of e of the right colour (here the currents in Ω δa−δ b are red and those in Ω 0 are blue). Then, if the marked edge is not on the path from a to b, we switch the corresponding loop (reverse it and swap the colours). This does not change the weight of the configuration. Such terms hence cancel out from the expression above, as the loops going through a marked blue copy of e are counted with a minus sign. The remaining terms are those whose marked edge lies on the distinguished path. This gives the final formula.
The final expression is not evidently nonnegative, and additional arguments would be needed to conclude the Ginibre inequality. On the other hand, the Ginibre inequality implies that the distinguished path visits red edges more often than blue edges on average.