1 Introduction and Main Result

Let \(G=(V,E)\) be a finite graph. Given a collection of nonnegative coupling constants \(J=(J_e)_{e\in E}\), and an inverse temperature \(\beta >0\), the XY model (with free boundary conditions) is a random spin configuration \(\sigma \in {\mathbb {S}}^V\), where \({\mathbb {S}}=\{ z\in {\mathbb {C}}: |z|=1\}\) is the complex unit circle, sampled according to the Gibbs distribution

$$\begin{aligned} d \mu _{G,\beta }(\sigma ) \propto \exp \Big (\tfrac{1}{2}\beta \sum _{vv'\in E}J_{vv'}( \sigma _v{\bar{\sigma }}_{v'}+ {\bar{\sigma }}_v\sigma _{v'}) \Big ) \prod _{v\in V} d \sigma _v, \end{aligned}$$
(1)

where \(vv'\) denotes the edge \(\{v,v'\}\), and \(d\sigma _v\) is the uniform probability measure on \({\mathbb {S}}\). For simplicity of notation, unless stated otherwise, we will assume that \(J_e = 1\) for all e. However, our results extend naturally to nonhomogeneous coupling constants. We will write \(\langle \cdot \rangle _{G,\beta }\) for the expectation with respect to \(\mu _{G,\beta }\). The observable of main interest for us will be the two-point function \(\langle \sigma _a{\bar{\sigma }}_b\rangle _{G,\beta }\), \(a,b\in V\), and its infinite volume limit (which is well defined by the Ginibre inequalities [16])

$$\begin{aligned} \langle \sigma _a{\bar{\sigma }}_b\rangle _{\Gamma ,\beta }= \lim _{G\nearrow \Gamma }\langle \sigma _a{\bar{\sigma }}_b\rangle _{G,\beta }, \end{aligned}$$

where \(\Gamma \) is an infinite planar lattice.

Note that if \(\sigma _v=e^{i \theta _v}\), \(\theta _v\in (-\pi ,\pi ]\), then \(\sigma _v{\bar{\sigma }}_{v'}+ {\bar{\sigma }}_v\sigma _{v'}=2\cos (\theta _v-\theta _{v'})\). This means that the model is ferromagnetic, i.e., pairs of neighbouring spins that are (almost) aligned have smaller energy and hence are statistically favoured. A natural question is whether varying \(\beta \) leads to a ferromagnetic order–disorder phase transition in the model. The classical theorem of Mermin and Wagner [29] excludes this possibility when the underlying lattice \(\Gamma \) is two-dimensional. Moreover, McBryan and Spencer [28] showed that at any finite temperature \(\langle \sigma _a{\bar{\sigma }}_b\rangle _{{\mathbb {Z}}^2,\beta }\) decays to zero at least as fast as a power of the distance between a and b. On the other hand, it is known by the work of Fröhlich, Simon and Spencer [13] that in higher dimensions the model exhibits long-range order at low temperatures and the two-point function does not decay to zero.

Even though there is no spontaneous symmetry breaking, Berezinskii [5, 6], and Kosterlitz and Thouless [18] predicted that a different type of phase transition takes place in two dimensions. It should be understood in terms of interacting topological excitations of the model, the so-called vortices and antivortices. These are the faces of the graph where the XY configuration makes a full clockwise or anticlockwise turn, respectively, when one traverses the edges of the face in a clockwise manner. Vortices and antivortices interact through a Coulomb-like interaction, and are energetically favoured to form short-distance vortex-antivortex pairs. The Berezinskii–Kosterlitz–Thouless (BKT) phase transition happens when, as the temperature decreases, the freely spaced vortices and antivortices (high-temperature plasma) bind together into such vortex-antivortex pairs. This regime should exhibit power-law decay of the two-point functions (in contrast to exponential decay at high temperatures). A rigorous lower bound of this type for low temperatures, and therefore a proof of the BKT phase transition, was first obtained in the celebrated work of Fröhlich and Spencer [14], who also derived analogous results for the Villain spin model. Their proof uses a multi-scale analysis of the Coulomb gas, and the main purpose of the present article is to present an alternative and less technically involved argument for the existence of a phase transition in two dimensions.

To be more precise, we introduce a new loop representation for the two-point function in the XY model that can be used to transfer probabilistic information from the dual integer-valued height function model to the XY model. Along the way we also show that the height function possesses the crucial absolute-value-FKG property. This, together with a recent elementary delocalisation result for general height functions obtained by Lammers [19], is used to prove existence of the BKT phase transition.

Theorem 1

(Berezinskii–Kosterlitz–Thouless phase transition). There exists \(\beta _c\in (0,\infty )\) such that

  (i) for all \(\beta < \beta _c\), there exists \(c =c(\beta )> 0\) such that for all \(v,v'\in {\mathbb {Z}}^2\),

    $$\begin{aligned} \langle \sigma _v \overline{\sigma }_{v'} \rangle _{{\mathbb {Z}}^2,\beta } \le e^{-c|v - v'|}, \end{aligned}$$

  (ii) for all \(\beta \ge \beta _c\) and all distinct \(v,v'\in {\mathbb {Z}}^2\),

    $$\begin{aligned} \langle \sigma _v \overline{\sigma }_{v'} \rangle _{{\mathbb {Z}}^2,\beta } \ge \frac{1}{8|v - v'|}. \end{aligned}$$

We note that unlike in the original proof of Fröhlich and Spencer, we do not show that the rate of decay approaches zero as the temperature tends to zero. However, we establish a type of sharpness which says that there is no other behaviour than exponential and power-law decay. The short proof of sharpness is independent of the rest of the argument. In the first step we use the Lieb–Rivasseau inequality [24, 32] in the classical way to establish a sharp transition between exponential decay and nonsummability of correlations (similarly to the proof for the Ising model [11]). To conclude a uniform power-law lower bound as in (ii) whenever the correlations are not summable, we use the Messager–Miracle-Sole inequality [30] on monotonicity of correlations with respect to the position of the vertex on the lattice.

We also note that our proof works (with minor modifications and a different, implicit multiplicative constant in (ii)) for other infinite planar graphs that in addition to being translation invariant possess reflection and rotation symmetries, and whose dual graph has bounded degree.

At the same time as this article appeared, an analogous result for the Villain model (without sharpness and explicit polynomial decay in the BKT phase) was given by Aizenman et al. [2]. It was later extended to also cover the XY model (including sharpness). For a more detailed overview of the XY model, we refer the reader to [12, 31], and for expositions of the argument of Fröhlich and Spencer, we refer to [15, 17].

This article is organised as follows.

  • In Sect. 2 we introduce the dual of the planar XY model in the form of an integer-valued height function defined on the faces of the graph. We also establish positive association of its absolute value (the absolute-value-FKG property), and recall the delocalisation result of Lammers [19].

  • In Sect. 3 we define a random collection of loops on the graph that carries probabilistic information about both the XY spins and the dual height function. Although this is a well known object that goes back to the works of Symanzik [35], and Brydges, Fröhlich and Spencer [7], the formula that relates the two-point function to the probability of two points being connected by a loop (Lemma 8) is new and crucial to our argument.

  • In Sect. 4 we give an elementary argument which states that if the height function delocalises at some temperature, then the spin two-point function does not decay exponentially.

  • In Sect. 5 we use the above ingredients to show that on any translation invariant graph, there exists a finite temperature at which the two-point function does not decay exponentially. This is not immediate as the result of Lammers [19] applies only to trivalent graphs. However, a simple graph-modification argument together with the Ginibre inequality allows one to change the setup from a general graph to a triangulation (a graph whose dual is trivalent).

  • In Sect. 6 we finish the proof of the main theorem. We use the Lieb–Rivasseau inequality [24, 32] and the Messager–Miracle-Sole inequality [30] to show that the absence of exponential decay implies a power-law lower bound on the two-point function.

2 The Dual Height Function

To define the dual model we assume that G is planar and finite, and we introduce the notion of currents. To this end, let \({E} =\{(v,v'): \{v,v'\}\in E \}\) be the set of directed edges of G, and let \({\mathbb {N}}=\{0,1,\ldots \}\). A function \(\textbf{n}: {E}\rightarrow {\mathbb {N}} \) is called a current on G. For a current \(\textbf{n}\), we define \(\delta \textbf{n}: V\rightarrow {\mathbb {Z}}\) by

$$\begin{aligned} \delta \textbf{n}_v= \sum _{v'\sim v} \textbf{n}_{(v,v')} - \textbf{n}_{(v',v)}. \end{aligned}$$

Hence if \(\delta \textbf{n}_v\) is positive, then the amount of outgoing current is larger than the incoming current, and we think of v as a source. Likewise if \(\delta \textbf{n}_v\) is negative, there is more incoming current and v is a sink. A current is sourceless if \(\delta \textbf{n}_v=0\) for all \(v\in V\).
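The definition of \(\delta \textbf{n}\) can be made concrete with a short sketch. The dictionary representation of a current and the function names `divergence` and `is_sourceless` are illustrative assumptions, not notation from the text.

```python
from collections import defaultdict

def divergence(current):
    """Return delta n as a dict: outgoing minus incoming current at each vertex.

    `current` maps directed edges (v, w) to nonnegative integers n_(v,w).
    This encoding is a hypothetical choice made for illustration.
    """
    delta = defaultdict(int)
    for (v, w), n in current.items():
        delta[v] += n  # n units leave v
        delta[w] -= n  # n units arrive at w
    return dict(delta)

def is_sourceless(current):
    """A current is sourceless if delta n vanishes at every vertex."""
    return all(d == 0 for d in divergence(current).values())

# One unit of current circulating around a triangle: no sources or sinks.
cycle = {("a", "b"): 1, ("b", "c"): 1, ("c", "a"): 1}
# Two units flowing from a to b along a single edge: a is a source, b a sink.
flow = {("a", "b"): 2, ("b", "a"): 0}

print(divergence(cycle), is_sourceless(cycle))
print(divergence(flow), is_sourceless(flow))
```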

We define \(\Omega _0\) to be the set of all (sourceless) currents. Sourceless currents naturally define a height function h on the set of faces of G, denoted by U, where the height of the outer face is set to zero, and the increment of the height between two faces u and \(u'\) is equal to

$$\begin{aligned} h(u)-h(u')=\textbf{n}_{(v,v')} - \textbf{n}_{(v',v)}, \end{aligned}$$

where the primal directed edge \((v,v')\) crosses the dual directed edge \((u,u')\) from right to left. That this yields a well defined function on the faces of G follows from the fact that \(\delta \textbf{n}=0\). We define the XY weight of a current by

$$\begin{aligned} w_{\beta }(\textbf{n})=\prod _{(v,v') \in E}\frac{1}{\textbf{n}_{(v,v')}!}\Big (\frac{\beta J_{vv'}}{2} \Big )^{\textbf{n}_{(v,v')}}. \end{aligned}$$
(2)

These weights appear naturally in the expansion of the partition function of the XY model into a sum over sourceless currents after one expands the exponentials in (1) into a power series in the variables \(\tfrac{1}{2}\beta J_{vv'}\sigma _v{\bar{\sigma }}_{v'}\) for each directed edge \((v,v')\in {E}\), and then integrates out the \(\sigma \) variables. They will also appear in the analogous classical expansion for spin correlations (11).
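
As a quick sanity check of this expansion (a worked example added for illustration, not part of the original argument), take G to be a single edge \(vv'\) with \(J_{vv'}=1\). A current is sourceless exactly when \(\textbf{n}_{(v,v')}=\textbf{n}_{(v',v)}=i\) for some \(i\in {\mathbb {N}}\), so summing the weights (2) over sourceless currents gives

$$\begin{aligned} \sum _{i=0}^{\infty } \frac{1}{(i!)^2}\Big (\frac{\beta }{2}\Big )^{2i} = I_0(\beta ) = \frac{1}{2\pi }\int _{-\pi }^{\pi } e^{\beta \cos \theta }\, d\theta , \end{aligned}$$

where \(I_0\) is the modified Bessel function of order zero. The right-hand side is the normalising constant of (1) for this graph, since the Gibbs weight depends only on \(\theta =\theta _v-\theta _{v'}\).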

We note that using currents to define a model on the dual graph is an instance of planar duality of abelian spin systems [9], and the fact that the height function is integer-valued is a consequence of \({\mathbb {Z}}\) being the dual group of the unit circle.

Clearly, the weight (2) defines a probability measure \(\mathbb {P}_{G,\beta }\) on currents and hence also on height functions. In terms of the height function it is a Gibbs measure given by

$$\begin{aligned} \mathbb {P}_{G,\beta }(h) \propto \exp \Big (- \sum _{uu'\in E^\dagger } {\mathcal {V}}_e^{\beta }(h(u)-h(u'))\Big ), \end{aligned}$$
(3)

where \(E^\dagger \) is the set of dual edges of G, and where the symmetric potentials \( \mathcal {V}^{\beta }_e: {\mathbb {Z}}\rightarrow {\mathbb {R}}\) are given by

$$\begin{aligned} {\mathcal {V}}^{\beta }_e(k) =- \log \Big (\sum _{i= 0}^{\infty } \frac{1}{i!(i+|k|)!}\Big (\frac{\beta J_e}{2}\Big )^{2i+|k|} \Big ) =- \log I_{k}({\beta J_e }) \end{aligned}$$
(4)

with \(I_{k}\) being the modified Bessel function. We again note that we will usually set all \(J_e=1\) to simplify the notation.

A well known Turán-type inequality for modified Bessel functions [36] states that for any \(k\ge 0\) and \(\beta >0\),

$$\begin{aligned} I_k^2(\beta )\ge I_{k-1}(\beta ) I_{k+1}(\beta ) \end{aligned}$$
(5)

which means that \({\mathcal {V}}^\beta _e\) is convex on the integers. This puts the model in the well-studied framework of height functions with a convex potential (see e.g. [34]).
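The link between the Turán inequality (5) and convexity of the potential (4) can be checked numerically. The following is a minimal sketch with \(J_e=1\), using only the power series from (4); the helper names `bessel_i`, `potential` and `second_difference` are illustrative, and the check is a numerical illustration rather than a proof.

```python
import math

def bessel_i(k, x, terms=40):
    """Modified Bessel function I_k(x) for integer k, via the power series in (4)."""
    k = abs(k)  # I_{-k} = I_k for integer k
    return sum((x / 2) ** (2 * i + k) / math.factorial(i) / math.factorial(i + k)
               for i in range(terms))

def potential(k, beta):
    """Height-function potential V^beta(k) = -log I_k(beta), with J_e = 1."""
    return -math.log(bessel_i(k, beta))

def second_difference(k, beta):
    """V(k-1) - 2 V(k) + V(k+1); nonnegative exactly when (5) holds at k."""
    return potential(k - 1, beta) - 2 * potential(k, beta) + potential(k + 1, beta)

for beta in (0.5, 1.0, 3.0):
    print(beta, [round(second_difference(k, beta), 4) for k in range(5)])
```

Nonnegative second differences at every integer k are what convexity on the integers means here.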

2.1 Gibbs measures and delocalisation

To state the delocalisation result of Lammers we will need the notion of a Gibbs measure for height functions on infinite graphs (though we will not directly work with it in the remainder of the article). Let \(\Gamma = (V, E)\) be an infinite planar graph and \(\Gamma ^{\dagger } = (U, E^{\dagger })\) its planar dual. If \(\nu \) is a measure on height functions \(\varphi \in \mathbb {Z}^{U}\) and \(\Lambda \subset U\) a finite subset, write \(\nu _{\Lambda }\) for the measure restricted to \(\Lambda \). Let \({\mathcal {V}}=({\mathcal {V}}_e)_{e\in E^{\dagger }}\) be a family of convex symmetric potentials. We call \(\nu \) a Gibbs measure for the potential \({\mathcal {V}}\) if for every such \(\Lambda \), it satisfies the Dobrushin–Lanford–Ruelle relation

$$\begin{aligned} \nu _{\Lambda }(\cdot )= \int _{\mathbb {Z}^{U}} \nu ^{\varphi }_{\Lambda }(\cdot ) d\nu (\varphi ), \end{aligned}$$

where \(\nu ^{\varphi }_{\Lambda }\) is the Gibbs measure on height functions \(h\in \mathbb {Z}^U\) given as in (3) (but with \({\mathcal {V}}^\beta \) replaced by \({\mathcal {V}}\)) and conditioned on h being equal to \(\varphi \) on the boundary of \(\Lambda \).

In what follows we will always assume that \(\Gamma \) is locally finite and invariant under the action of a \({\mathbb {Z}}^2\)-isomorphic lattice. We say that \(\nu \) is translation invariant if it is invariant under the same action.

In a recent beautiful work [19] Lammers gave a condition on the potential that guarantees that there are no translation invariant Gibbs measures on graphs of degree three (trivalent graphs).

Theorem 2

(Lammers [19]). Let \(\Gamma ^{\dagger } = (U, E^{\dagger })\) be as above and moreover trivalent. If for every \(e\in E^\dagger \),

$$\begin{aligned} {\mathcal {V}}_e(\pm 1) \le {\mathcal {V}}_e(0) + \log (2), \end{aligned}$$
(6)

then there are no translation invariant Gibbs measures for \({\mathcal {V}}\).

This together with the dichotomy stated in Theorem 4 will be one of the key ingredients of the proof of the main theorem.
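For the XY potential (4) with \(J_e=1\), condition (6) reads \(I_1(\beta )\ge I_0(\beta )/2\). The sketch below checks this numerically and locates the smallest such \(\beta \) by bisection. It concerns the potential only (Theorem 2 also requires the dual graph to be trivalent), it relies on the standard fact that \(I_1/I_0\) is increasing, and the function names are hypothetical.

```python
import math

def bessel_i(k, x, terms=40):
    """Modified Bessel function I_k(x) for integer k, via the power series in (4)."""
    k = abs(k)  # I_{-k} = I_k for integer k
    return sum((x / 2) ** (2 * i + k) / math.factorial(i) / math.factorial(i + k)
               for i in range(terms))

def satisfies_condition_6(beta):
    """V(+-1) <= V(0) + log 2 is equivalent to I_1(beta) >= I_0(beta) / 2."""
    return bessel_i(1, beta) >= bessel_i(0, beta) / 2

def threshold(lo=0.5, hi=3.0, iterations=60):
    """Bisection for the smallest beta satisfying (6).

    Assumes that I_1 / I_0 is increasing, so that (6) fails at `lo` and
    holds at `hi` with a single crossing in between.
    """
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if satisfies_condition_6(mid):
            hi = mid
        else:
            lo = mid
    return hi

print(satisfies_condition_6(1.0), satisfies_condition_6(2.0), threshold())
```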

2.2 Absolute-value-FKG and dichotomy

In this section, we prove that the height function satisfies the absolute-value-FKG property, which is known to imply the dichotomy in Theorem 4 below [8, 20]. Here we will only work with the potential \({\mathcal {V}}^{\beta }\) as defined in (4).

Proposition 3

(Absolute-value-FKG). Let \(G = (V, E)\) be a finite graph and U the set of its faces. Then for all \(\beta >0\), and all \(\Psi , \Phi : {\mathbb {N}}^{U} \rightarrow \mathbb {R}_+\) increasing functions,

$$\begin{aligned} \mathbb {E}_{G,\beta }[\Psi (|h|)\Phi (|h|)] \ge \mathbb {E}_{G,\beta }[\Psi (|h|)] \mathbb {E}_{G,\beta }[\Phi (|h|)]. \end{aligned}$$

We first explain briefly the dichotomy. Let \(\Gamma = (V, E)\) be a translation invariant graph, and let 0 be a chosen face of \(\Gamma \). Define \(B_n\) to be the subgraph of \(\Gamma \) induced by the vertices in V that lie on at least one face of \(\Gamma \) that is contained in the graph ball of radius n on \(\Gamma ^\dagger \). We introduce this slightly convoluted definition to guarantee the following three properties: 0 belongs to all \(B_n\), also \(B_n\nearrow \Gamma \) as \(n\rightarrow \infty \), and finally, the weak dual graph of \(B_n\) (the dual graph with the vertex corresponding to the external face of \(B_n\) removed) is a subgraph of \(\Gamma ^\dagger \).

Theorem 4

Consider the setup as above. Then for every \(\beta > 0\), exactly one of the following two occurs:

  (i) (localisation) There exists a \(C < \infty \) such that uniformly over all n,

    $$\begin{aligned} \mathbb {E}_{B_n,\beta }[ |h(0)|] \le C. \end{aligned}$$

  (ii) (delocalisation) There are no translation invariant Gibbs measures for the potential (4).

Proof

This is a consequence of the absolute-value-FKG property (Proposition 3) and standard arguments using monotonicity in boundary conditions. See [20, Theorem 2.7]. \(\square \)

We turn to the proof of Proposition 3. The first step (Lemma 5) consists in showing that, for \(\beta \) small enough, the potential satisfies an inequality known to imply the absolute-value-FKG property [20]. In the second step we use this to conclude the absolute-value-FKG property for general \(\beta \).

Lemma 5

The absolute-value-FKG property holds true for all \(\beta \le 1\).

Proof

We rely on a result of Lammers and Ott [20, Theorem 2.8], stating that if

$$\begin{aligned} {\mathcal {V}}_e^{\beta }(k - 1) - 2{\mathcal {V}}_e^{\beta }(k) + {\mathcal {V}}_e^{\beta }(k + 1) = -\log \Big (\frac{I_{k-1}(\beta )I_{k + 1}(\beta )}{I_k(\beta )^2}\Big ) \end{aligned}$$

is a nonincreasing function of k on \(\{0,1,\ldots \}\), then \({\mathbb {P}}_{G,\beta }\) is absolute-value-FKG. We define \(r_k = \tfrac{1}{\beta }\frac{I_{k }(\beta )}{I_{k-1}(\beta )}\), and need to show that \(r_k^2\le r_{k-1}r_{k+1}\) for all \(k\ge 1\). The well-known recurrence relation

$$\begin{aligned} I_{k - 1}(\beta ) = \tfrac{2k}{\beta } I_{k}(\beta ) + I_{k + 1}(\beta ) \qquad \text {yields} \qquad r_k = (2k + \beta ^2r_{k + 1})^{-1}. \end{aligned}$$

Hence it is enough to prove that

$$\begin{aligned} (2k + \epsilon _{k + 1})(2(k + 2) + \epsilon _{k + 3})\le (2(k + 1) + \epsilon _{k + 2})^2, \end{aligned}$$

where \(\epsilon _k = \beta ^2r_{k}\). Using the Turán inequality (5), it follows that \(0 \le r_{k + 1} \le r_{k}\), and therefore it is sufficient to establish that

$$\begin{aligned} R_k:=(2k + \epsilon _{k + 1})(2k + 4 + \epsilon _{k + 1}) - (2k + 2)^2=4(k+1)\epsilon _{k + 1} + \epsilon _{k + 1}^2 - 4 \le 0. \end{aligned}$$

At the same time, simply using the definition of \(r_{k+1}\) and comparing the Taylor expansions (4) of \(I_{k+1}\) and \(I_k\) term by term gives \(\epsilon _{k+1} \le \beta ^2/(2k+2)\). Therefore, when \(\beta \le 1\), we have \(R_k\le \epsilon _{k + 1}^2 - 2 \le 0\) for all \(k\ge 0\), which concludes the proof. \(\square \)
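The individual steps of this proof can be tested numerically. The sketch below, with hypothetical function names, checks the recurrence for \(r_k\), the bound \(\epsilon _{k+1}\le \beta ^2/(2k+2)\), the sufficient condition \(R_k\le 0\), and the target inequality \(r_{k+1}^2\le r_kr_{k+2}\) for a finite range of k. It is an illustration under floating-point arithmetic, not a substitute for the argument.

```python
import math

def bessel_i(k, x, terms=40):
    """Modified Bessel function I_k(x) for integer k, via its power series."""
    k = abs(k)  # I_{-k} = I_k for integer k
    return sum((x / 2) ** (2 * i + k) / math.factorial(i) / math.factorial(i + k)
               for i in range(terms))

def r(k, beta):
    """r_k = I_k(beta) / (beta * I_{k-1}(beta)), as in the proof of Lemma 5."""
    return bessel_i(k, beta) / (beta * bessel_i(k - 1, beta))

def check_lemma5_steps(beta, kmax=20):
    """Test every inequality of the proof for k = 0, ..., kmax - 1."""
    for k in range(kmax):
        eps = beta ** 2 * r(k + 1, beta)  # epsilon_{k+1}
        recurrence_ok = math.isclose(r(k, beta), 1 / (2 * k + eps), rel_tol=1e-9)
        bound_ok = eps <= beta ** 2 / (2 * k + 2)
        R_k = 4 * (k + 1) * eps + eps ** 2 - 4
        log_convex_ok = r(k + 1, beta) ** 2 <= r(k, beta) * r(k + 2, beta)
        if not (recurrence_ok and bound_ok and R_k <= 0 and log_convex_ok):
            return False
    return True

print(check_lemma5_steps(0.3), check_lemma5_steps(1.0), check_lemma5_steps(3.0))
```

For \(\beta =3\) the sufficient condition \(R_0\le 0\) already fails, so this particular argument does not cover large \(\beta \) directly.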

To treat general values of \(\beta \), we will use a trick which consists in replacing each edge of G by \(s=\lceil \beta \rceil \) consecutive edges, and reducing the parameter \(\beta \) by the factor s, together with the following convolution property of the modified Bessel functions.

Lemma 6

For all \(k, l \in \mathbb {Z}\) and all \(\beta ,\beta '\ge 0\),

$$\begin{aligned} \sum _{m \in \mathbb {Z}} I_{k - m}(\beta )I_{m - l}(\beta ') = I_{k - l}(\beta +\beta '). \end{aligned}$$

Proof

This is a classical identity which follows from the fact that \(I_{k }(\beta )/e^{\beta }={\mathbb {P}}(Z-Z'=k)\), where \(Z,Z'\) are independent Poisson random variables with mean \(\beta /2\), and the fact that a sum of independent Poisson random variables is Poisson. \(\square \)
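Both facts used here can be verified numerically. The following sketch truncates the sum over m in Lemma 6 and compares \(I_k(\beta )e^{-\beta }\) with the law of a difference of two independent Poisson variables; the cutoff and the function names are assumptions made for illustration.

```python
import math

def bessel_i(k, x, terms=40):
    """Modified Bessel function I_k(x) for integer k, via its power series."""
    k = abs(k)  # I_{-k} = I_k for integer k
    return sum((x / 2) ** (2 * i + k) / math.factorial(i) / math.factorial(i + k)
               for i in range(terms))

def convolution(k, l, beta, beta_prime, cutoff=40):
    """Left-hand side of Lemma 6 with the sum over m truncated to |m| <= cutoff."""
    return sum(bessel_i(k - m, beta) * bessel_i(m - l, beta_prime)
               for m in range(-cutoff, cutoff + 1))

def poisson_difference_pmf(k, beta, terms=40):
    """P(Z - Z' = k) for independent Poisson Z, Z' of mean beta / 2, summed directly."""
    mu = beta / 2
    k = abs(k)  # the law of Z - Z' is symmetric

    def poisson(j):
        return math.exp(-mu) * mu ** j / math.factorial(j)

    return sum(poisson(j + k) * poisson(j) for j in range(terms))

print(convolution(2, -1, 0.7, 1.3), bessel_i(3, 2.0))
print(bessel_i(2, 1.5) * math.exp(-1.5), poisson_difference_pmf(2, 1.5))
```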

With this we can prove Proposition 3.

Proof of Proposition 3

Let \(G_s = (V_s, E_s)\) be G with each edge replaced by s consecutive edges, and let \(h_s\) be the height function on \(G_s\) with law \({\mathbb {P}}_{G_s, \beta /s}\). Here, with a slight abuse of notation, G stands for the graph on whose vertices the height function is defined (that is, the dual graph), so that the subdivision introduces new heights along every dual edge. By Lemma 6 (and an induction argument) the restriction of \(h_s\) to V has the same law as \(h_1\). Moreover, \(\beta /s \le 1\) by definition of s, which by Lemma 5 implies that the law of \(h_s\) satisfies the absolute-value-FKG property. To finish the proof it is enough to notice that any increasing function on \({\mathbb {N}}^V\) is also increasing on \({\mathbb {N}}^{V_s}\). \(\square \)

Remark 1

An interesting consequence of the idea above (that we will not use in this article) is the following. Consider the case when s from above is independent of \(\beta \) and diverges to infinity. In this limit, the height function becomes well defined at every point of every dual edge. Here we think of the dual graph as the so-called cable graph, i.e., every dual edge e is identified with a continuum interval of length \(J_e\beta \). Then, given the values at its endpoints, the height along an edge is distributed as the difference of two Poisson processes with intensity \(J_e\beta /2\) each, conditioned to take these values at the endpoints. One can check that the model exhibits a spatial Markov property on the full cable graph and not only on the vertices. This is in direct analogy with the cable graph representation of the discrete Gaussian free field, where the vertex-field can be extended to the edges via Brownian bridges (see e.g. [27] and the references therein).

3 Loop Representation of Currents and Path Reversal

The purpose of this section is mainly to develop a loop representation for the two-point function of the XY model. The important aspect of our approach is that the correlations are represented as probabilities for loop connectivities in random ensembles of closed loops. This is in contrast with most of the classical representations that write correlation functions as ratios of partition functions of loops, where in the numerator, in addition to loops, one also sums over open paths between the points of insertion in the correlator [7, 35]. We note that a similar idea to ours appears in the work of Benassi and Ueltschi [4], but due to technical differences in the framework (see Remark 4), the formula for the two-point function obtained in [4] is not as transparent as ours.

Let \(G=(V,E)\) be a finite, not necessarily planar graph. We say that a multigraph \({\mathcal {M}}\) on V is a submultigraph of G if after identifying the multiple copies of the same edge in \({\mathcal {M}}\) it is a subgraph of G.

Definition

(Loop configurations outside S). Let \({\mathcal {M}}\) be a submultigraph of G, and let \(S\subseteq V\). A loop configuration (on \({\mathcal {M}}\)) outside S is a collection of

  • unrooted directed loops on \({\mathcal {M}}\) avoiding S, and

  • directed open paths on \( {\mathcal {M}}\) starting and ending in S (and not visiting S except at their start and end vertex),

such that every edge of \({\mathcal {M}}\) is traversed exactly once by a loop or a path.

We write \({\mathcal {L}}^S\) for the set of all loop configurations outside S, and define a weight for \(\omega \in {\mathcal {L}}^S\) by

$$\begin{aligned} \lambda ^S_\beta ( \omega ) = \prod _{v \in V \setminus S} \frac{1}{(\deg _{{\mathcal {M}}}(v) / 2)!} \prod _{e\in E} \frac{1}{{\mathcal {M}}_e!} \Big (\frac{\beta }{2}\Big )^{{\mathcal {M}}_e}, \end{aligned}$$
(7)

where \({\mathcal {M}}\) is the underlying multigraph, and \({\mathcal {M}}_e\) is the number of copies of e in \({\mathcal {M}}\). When \(S=\emptyset \), a configuration is composed only of loops that can visit every vertex in V, and we simply call it a loop configuration.

An important feature of the weight (7) is that it depends on \( \omega \) only through \({\mathcal {M}}\). Also note, that if \(S'\subseteq S\), then there is a natural map \(\rho : {\mathcal {L}}^{S'} \rightarrow {\mathcal {L}}^S\) that consists in forgetting (or cutting) the loop connections at the vertices in \(S\setminus S'\). Under this map, each configuration in \( {\mathcal {L}}^S\) has \(\prod _{v \in S \setminus S'} {(\deg _{{\mathcal {M}}}(v) / 2)!}\) preimages, each of them having the same weight, and hence

$$\begin{aligned} \sum _{\tilde{\omega } \in \rho ^{-1}[\omega ]} \lambda ^{S'}_\beta (\tilde{ \omega }) = \lambda ^S_\beta ( \omega ). \end{aligned}$$
(8)

This consistency property will be useful later on.

For now, let \(|\textbf{n}|: E\rightarrow {\mathbb {N}}\) be the amplitude of a current \(\textbf{n}\), i.e.

$$\begin{aligned} |\textbf{n}|_{vv'}:=\textbf{n}_{(v,v')}+\textbf{n}_{(v',v)}. \end{aligned}$$

Definition

(Multigraph of a current and consistent configurations). For a current \(\textbf{n}\), let \({\mathcal {M}}_{\textbf{n}}\) be the submultigraph of G where each edge \(e\in E\) is replaced by \(|\textbf{n}|_e\) (possibly zero) parallel copies of e. A loop configuration on \({\mathcal {M}}_{\textbf{n}}\) is called consistent with \(\textbf{n}\) if for every edge \((v,v')\in {E}\), the number of times the loops traverse a copy of \(vv'\) in the direction of \({(v,v')}\) is equal to \(\textbf{n}_{(v,v')}\). We define \( {{\mathcal {L}}}^S_{\textbf{n}}\) to be the set of all loop configurations on \({\mathcal {M}}_{\textbf{n}}\) outside S that are consistent with \(\textbf{n}\).

For \(\varphi : V\rightarrow {\mathbb {Z}}\), let \(\Omega _\varphi =\{\textbf{n}: \delta \textbf{n}=\varphi \}\),

$$\begin{aligned} Z^{\varphi }_{G,\beta } =\sum _{\textbf{n}\in \Omega _\varphi } w_{\beta }(\textbf{n}), \end{aligned}$$

and \({\mathcal {S}} (\varphi )=\{ v\in V: \varphi _v\ne 0\}\). For a current \(\textbf{n}\), with a slight abuse of notation, we also write \({\mathcal {S}}(\textbf{n})={\mathcal {S}}(\delta \textbf{n})\). Note that \( {{\mathcal {L}}}^S_{\textbf{n}}\) can be nonempty only if \({\mathcal {S}}(\textbf{n})\subseteq S\). Indeed, each path and loop that enters a vertex in \(V\setminus S\) must also leave it, and hence the total number of incoming and outgoing arrows at each such vertex must be the same. For \(\varphi : V\rightarrow {\mathbb {Z}}\), we also define

$$\begin{aligned} {\mathcal {L}}_{\varphi }^S=\bigcup _{\textbf{n}\in \Omega _{\varphi }} {\mathcal {L}}^S_{\textbf{n}}. \end{aligned}$$

Again, this is nonempty only if \({\mathcal {S}}(\varphi )\subseteq S\). We will write \( {\mathcal {L}}_{0}^S\), where 0 denotes the zero function on V.

We now relate the weights of loops to those of currents. To this end, note that for each edge \(vv'\in E\), there are exactly

$$\begin{aligned} \frac{|\textbf{n}|_{v v'}!}{\textbf{n}_{(v, v')}!\textbf{n}_{{(v', v)}}!} \end{aligned}$$

ways of assigning orientations to it so that the result is consistent with \(\textbf{n}\). Moreover, independently of the choices of orientations, there are exactly \((\deg _{{\mathcal {M}}_{\textbf{n}}}(v) / 2)!\) possible pairings of the incoming and outgoing edges at each vertex \(v \in V\setminus S\). Combining all this we arrive at a crucial loop representation for current weights: if \({\mathcal {S}}(\textbf{n})\subseteq S\), then

$$\begin{aligned} w_\beta (\textbf{n})= \sum _{ \omega \in {\mathcal {L}}^S_{\textbf{n}}} \lambda ^S_\beta ( \omega ). \end{aligned}$$
(9)

An important observation here is that the left-hand side is independent of S, and hence so is the right-hand side.
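
For illustration (a worked example added here, not taken from the original argument), let G consist of a single edge \(vv'\), take \(S=\emptyset \), and let \(\textbf{n}_{(v,v')}=\textbf{n}_{(v',v)}=1\). Then \({\mathcal {M}}_{\textbf{n}}\) has two parallel copies of the edge, there are \(2!/(1!\,1!)=2\) consistent assignments of orientations, and \((2/2)!=1\) pairing at each of the two vertices. Hence \({\mathcal {L}}^\emptyset _{\textbf{n}}\) contains two configurations, each consisting of a single directed loop of length two. By (7) each of them has weight \(\tfrac{1}{1!}\cdot \tfrac{1}{1!}\cdot \tfrac{1}{2!}(\beta /2)^2=\beta ^2/8\), and therefore

$$\begin{aligned} \sum _{ \omega \in {\mathcal {L}}^\emptyset _{\textbf{n}}} \lambda ^\emptyset _\beta ( \omega ) = 2\cdot \frac{\beta ^2}{8} = \frac{\beta }{2}\cdot \frac{\beta }{2} = w_\beta (\textbf{n}), \end{aligned}$$

in agreement with (9).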

3.1 Coupling with the height function

We now apply this framework to the case of two sourceless currents and a coupling with the corresponding height function. From (9) we have

$$\begin{aligned} Z^0_{G,\beta }= \sum _{ \omega \in {\mathcal {L}}^\emptyset _0} \lambda ^{\emptyset }_\beta ( \omega ) \end{aligned}$$
(10)

where 0 denotes the zero function on V.

Remark 2

This loop representation of the partition function, though obtained via a different procedure, goes back to the work of Symanzik [35], and Brydges, Fröhlich and Spencer [7].

Moreover, in the case when G is planar we immediately get the following distributional identity. Define \( {\textbf{P}}_{G,\beta }\) to be the probability measure on \( {{\mathcal {L}}}_0:={\mathcal {L}}^\emptyset _0\) induced by the weights \( \lambda _{\beta }:=\lambda ^\emptyset _{\beta }\). For each face \(u\in U\) of G, and \(\omega \in {\mathcal {L}}_0\), define \(W_{\omega }(u)\) to be the total net winding of all the loops in \(\omega \) around u.

Proposition 7

The law of \((W(u))_{u\in U}\) under \({\textbf{P}}_{G,\beta }\) is the same as the law of the height function \((h(u))_{u\in U}\) under \({{\mathbb {P}}}_{G,\beta }\).

3.2 The two point-function and path reversal

We now turn to the loop representation of the two-point function. For reasons that will become apparent soon, we need to consider the two-point function of the squares, i.e., \(\langle \sigma ^2_a {\bar{\sigma }}^2_b \rangle \).

Since the resulting currents will have sources, we will need to consider nonempty S in the construction above. To this end, fix two vertices \(a,b\in V\), and define \(\varphi =2(\delta _a-\delta _b)\), where \(\delta _a(v)=\mathbb {1}\{ a=v\}\). To lighten the notation, we will write \(a,b\) instead of \(\{a,b\}\) for the set S. As for the partition function, expanding the exponential in the Gibbs–Boltzmann weights (1) into a power series in \(\tfrac{1}{2} \beta J_{vv'}\sigma _v{\bar{\sigma }}_{v'}\) for each directed \((v,v')\in E\), and integrating out the \(\sigma \) variables, we get

$$\begin{aligned} \langle \sigma ^2_a {\bar{\sigma }}^2_b \rangle _{G,\beta }= \frac{Z^{\varphi }_{G,\beta } }{Z^0_{G,\beta }} = \frac{\sum _{ \omega \in {\mathcal {L}}^{a,b}_{\varphi }} \lambda ^{a,b}_\beta ( \omega )}{Z^0_{G,\beta }}, \end{aligned}$$
(11)

where the first equality classically follows from the high-temperature expansion of correlation functions and the second one is a consequence of (9).

We will write \({\mathcal {P}}_{a,b}( \omega )\) for the set of paths in \(\omega \) that start at a and end at b, and define

$$\begin{aligned} m_{a,b}(\omega )=|{\mathcal {P}}_{a,b}( \omega )|. \end{aligned}$$

We now want to “erase the sources” at a and b from the currents underlying \( {\mathcal {L}}^{a,b}_{\varphi }\), and hence rewrite the numerator as a sum over \( {\mathcal {L}}^{a,b}_{0}\). We will then ultimately connect the open paths at a and b in all possible ways, and hence get a sum over \( {\mathcal {L}}^\emptyset _{0}\) (see Fig. 1 for an example). To this end note that in each \( \omega \in {\mathcal {L}}^{a,b}_{\varphi }\) there are exactly two more paths going from a to b than paths going from b to a, i.e., \(m_{a,b}(\omega )=m_{b,a}(\omega )+2\). The elementary operation that we will perform on the former paths is reversal. To this end, denote by \(r(\gamma )\) the path \(\gamma \) with the orientation of all the visited edges reversed. Obviously this does not change the underlying multigraph, and hence neither the weight of the loop configuration. The crucial observation now is that reversing one path from a to b maps \( \omega \in {\mathcal {L}}^{a,b}_{\varphi }\) to a configuration \( \omega ' \in {\mathcal {L}}^{a,b}_{0}\), and hence erases the sources of the underlying currents. Indeed one can easily check that after reversing such a path, the number of outgoing minus the number of incoming edges at every vertex \(v\notin \{a,b\}\) in \( \omega '\) is the same as in \( \omega \), whereas at a (resp. b) this number is decreased (resp. increased) by two. More precisely, our transformation maps bijectively a pair \(( \omega ,\gamma )\) where \( \omega \in {\mathcal {L}}^{a,b}_{\varphi }\) and \(\gamma \in {\mathcal {P}}_{a,b}(\omega ) \) to the pair \(( \omega ', r(\gamma ))\) where \( \omega ' \in {\mathcal {L}}^{a,b}_0\) and \(r( \gamma )\in {\mathcal {P}}_{b,a}(\omega ')\). Moreover, \(m_{b,a}( \omega ')= m_{b,a}( \omega )+1\), which in particular means that \(m_{b,a}( \omega ')> 0\). Since path reversal does not change the weight of a loop configuration, we obtain

$$\begin{aligned} \sum _{ \omega \in {\mathcal {L}}^{a,b}_{\varphi }} \lambda ^{a,b}_\beta ( \omega )&=\sum _{ \omega \in {\mathcal {L}}^{a,b}_{\varphi }, \gamma \in {\mathcal {P}}_{a,b}( \omega )}\frac{1}{m_{b,a}( \omega )+2} \lambda ^{a,b}_\beta ( \omega ) \\&=\sum _{ \omega ' \in {\mathcal {L}}^{a,b}_0, \gamma ' \in {\mathcal {P}}_{b,a}( \omega ')}\frac{1}{ m_{b,a}( \omega ')+1 } \lambda ^{a,b}_\beta ( \omega ') \mathbb {1}\{ m_{b,a}( \omega ')> 0 \} \\&={\sum _{ \omega '\in {\mathcal {L}}^{a,b}_0}}\frac{m_{b,a}( \omega ') }{m_{b,a}( \omega ')+1 } \lambda ^{a,b}_\beta ( \omega ') \mathbb {1}\{ m_{b,a}( \omega ')> 0 \} \\&={\sum _{ \omega '' \in {\mathcal {L}}^\emptyset _0}}\frac{m_{b,a}( \omega '') }{m_{b,a}( \omega '')+1 } \lambda ^\emptyset _\beta ( \omega '') \mathbb {1}\{ m_{b,a}( \omega '') > 0 \} , \end{aligned}$$

where in the second equality we used path reversal, the last equality follows from (8) with \(S'=\emptyset \), and where, with a slight abuse of notation, for \( \omega ''\in {\mathcal {L}}^\emptyset _0\), \(m_{b,a}( \omega '') \) is the number of pieces of loops going from b to a and not visiting b nor a except for the start and end vertex. Recall that \( {\textbf{P}}_{G,\beta }\) is the probability measure on \( {\mathcal {L}}^\emptyset _0\) induced by the weights \( \lambda ^\emptyset _{\beta }\), and note that \(m_{b,a}\) has the same distribution as \(m_{a,b}\) under \( {\textbf{P}}_{G,\beta }\) (the law on loops is invariant under a global orientation reversal). We therefore obtain from (10) and (11) the following loop representation of the two-point function.

Lemma 8

Let \(a,b\in V\) be distinct. Then

$$\begin{aligned} \langle \sigma ^2_{a}\bar{\sigma }^2_{b} \rangle _{G,\beta } = {\textbf{E}}_{G,\beta }\Big [\frac{m_{a,b}}{m_{a,b} + 1}\Big ], \end{aligned}$$

and in particular

$$\begin{aligned} \frac{1}{2} {\textbf{P}}_{G,\beta }(m_{a,b}> 0) \le \langle \sigma ^2_{a}\bar{\sigma }^2_{b} \rangle _{G,\beta } \le {\textbf{P}}_{G,\beta }(m_{a,b} >0). \end{aligned}$$
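The two-sided bound is an elementary consequence of the first identity. As a worked step, using only that \(m_{a,b}\) takes values in the nonnegative integers: for every integer \(m\ge 0\),

$$\begin{aligned} \tfrac{1}{2}\, \mathbb {1}\{ m> 0 \} \le \frac{m}{m+1} \le \mathbb {1}\{ m > 0 \}, \end{aligned}$$

since \(m/(m+1)\) vanishes at \(m=0\) and lies in \([\tfrac{1}{2},1)\) for \(m\ge 1\). Taking expectations under \({\textbf{P}}_{G,\beta }\) with \(m=m_{a,b}\) gives the displayed inequalities.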
Fig. 1

Left to right: an Eulerian multigraph \({\mathcal {M}}\); a loop configuration \(\omega \in {\mathcal {L}}^{a,b}_{2(\delta _a-\delta _b)}\) on \({\mathcal {M}}\) (a is the top left and b the bottom right vertex) together with a path from a to b marked red; a loop configuration \(\omega '\in {\mathcal {L}}^{a,b}_{0}\) with the path reversed; and one of the final loop configurations \(\omega ''\in {\mathcal {L}}^{\emptyset }_{0}\) corresponding to \(\omega '\), i.e., such that \(\rho (\omega '')=\omega '\). Here \(m_{a,b}(\omega )=3\), \(m_{b,a}(\omega )=1\), and \(m_{a,b}(\omega ')=m_{b,a}(\omega ')=2\).

Let us finish with a number of remarks.

Remark 3

We stress again that the crucial property of this loop representation is that the measure \( {\textbf{P}}_{G,\beta }\) is supported on collections of closed loops, and is independent of the choice of a and b. A similar idea was used by Lees and Taggi [23] to study spin O(n) models with an external magnetic field. Moreover, by Proposition 7 and Lemma 8, the random loops under \({\textbf{P}}_{G,\beta }\) carry probabilistic information about both the spin XY model (in terms of correlation functions) and its dual height function (as an exact coupling). An analogous role for the Ising and Ashkin–Teller models is played by the (double) random current measure, which encodes both an integer-valued height function and the spin correlations [10, 25, 26]. The difference is that for the XY model, the correlations are determined by loop connectivities instead of percolation connectivities. This comparison offers an alternative explanation for the different types of phase transition in discrete and continuous spin systems.

Remark 4

The approach above is different from [4, 7, 23, 35] in that in the loop configurations, we never make connections at vertices with sources. This leads to different combinatorics than in [4], and in particular a more transparent formula for the two-point function.

Remark 5

We call a multigraph \({\mathcal {M}}\) Eulerian if its degree is even at every vertex. Another way to sample the loop configuration that easily follows from the above definitions is the following procedure:

  • First sample an Eulerian submultigraph \({\mathcal {M}}\) of G with probability proportional to

    $$\begin{aligned} {\mathcal {E}}({\mathcal {M}}) \prod _{e\in E} \frac{1}{{\mathcal {M}}_e!}\Big (\frac{\beta }{2}\Big )^{{\mathcal {M}}_e}, \end{aligned}$$

    where \({\mathcal {E}}({\mathcal {M}})\) is the number of Eulerian orientations of \({\mathcal {M}}\), i.e., assignments of orientations to every edge of \({\mathcal {M}}\) with an equal number of incoming and outgoing edges at every vertex.

  • Then choose uniformly at random an Eulerian orientation of \({\mathcal {M}}\).

  • Finally, at each vertex, independently of other vertices, connect the incoming edges with the outgoing edges uniformly at random.
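The last step of this procedure can be sketched in code. The following is a minimal illustration rather than an implementation taken from the paper: it assumes that an Eulerian orientation has already been produced by the first two steps (which are not implemented here) and is given as a list of directed edges, and the function name `pair_and_trace` is hypothetical.

```python
import random
from collections import defaultdict


def pair_and_trace(directed_edges, rng=None):
    """Pair incoming with outgoing edges uniformly at each vertex, then trace loops.

    directed_edges: list of (tail, head) pairs forming an Eulerian orientation,
    i.e. in-degree equals out-degree at every vertex (parallel edges allowed).
    Returns a list of loops, each a list of edge indices in traversal order.
    """
    rng = rng or random.Random()
    incoming = defaultdict(list)  # vertex -> indices of edges entering it
    outgoing = defaultdict(list)  # vertex -> indices of edges leaving it
    for idx, (tail, head) in enumerate(directed_edges):
        outgoing[tail].append(idx)
        incoming[head].append(idx)

    successor = {}  # edge index -> index of the edge that follows it
    for v in set(incoming) | set(outgoing):
        if len(incoming[v]) != len(outgoing[v]):
            raise ValueError(f"orientation is not Eulerian at vertex {v!r}")
        outs = outgoing[v][:]
        rng.shuffle(outs)  # uniform random matching of in-edges to out-edges
        for e_in, e_out in zip(incoming[v], outs):
            successor[e_in] = e_out

    # The successor map is a permutation of the edges; its cycles are the loops.
    loops, seen = [], set()
    for start in range(len(directed_edges)):
        if start in seen:
            continue
        loop, e = [], start
        while e not in seen:
            seen.add(e)
            loop.append(e)
            e = successor[e]
        loops.append(loop)
    return loops
```

Because the pairings at different vertices use independent shuffles, the resulting successor map is a uniformly chosen collection of local matchings, and every edge of the oriented multigraph is visited by exactly one loop.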

Remark 6

Using the same argument as above one obtains the following formula for higher power two-point functions. For \(k\ge 1\), we have

$$\begin{aligned} \langle \sigma ^{2k}_{a}\bar{\sigma }^{2k}_{b} \rangle _{G,\beta } = {\textbf{E}}_{G,\beta }\Big [\frac{(m_{a,b})_k}{(m_{a,b} + k)_k}\Big ], \end{aligned}$$

where \((m)_k = m(m - 1)\cdots (m - k+ 1)\) is the falling factorial. One can also consider multi-point functions and get more complicated loop representation formulas.
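As a small sanity check of the formula, the integrand can be evaluated exactly for given values of \(m_{a,b}\) and k. The sketch below is illustrative only; the helper names `falling_factorial` and `loop_weight` are hypothetical and not part of the paper.

```python
from fractions import Fraction


def falling_factorial(m, k):
    """Return (m)_k = m (m-1) ... (m-k+1), with (m)_0 = 1."""
    result = 1
    for j in range(k):
        result *= m - j
    return result


def loop_weight(m, k):
    """Return (m)_k / (m+k)_k for a given value m of m_{a,b}."""
    return Fraction(falling_factorial(m, k), falling_factorial(m + k, k))
```

For \(k=1\) this reduces to \(m/(m+1)\), the integrand of Lemma 8, and the weight vanishes whenever \(m<k\).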

Remark 7

The isomorphism theorem of Le Jan [21] says that the discrete complex Gaussian free field can be coupled with a Poissonian collection of random walk loops, the so called random walk loop soup, in such a way that one half of the square of the absolute value of the field is equal to the total occupation time of the random walk loops. On the other hand, it is immediate that conditioned on the absolute value of the field, its complex phase is distributed like the XY model with coupling constants depending on this absolute value. With some work, e.g. using [22], one can show that under this conditioning the random walk loops have the same distribution as the loops described above.

4 Delocalisation Implies No Exponential Decay

In this section we prove that if the height function delocalises, then the spin correlations are not summable along certain sets of vertices. In the next section, we will show how to apply this together with the delocalisation results of Lammers [19] to deduce a BKT-type phase transition in a wide range of periodic planar graphs.

Suppose \(\Gamma =(V,E)\) is a translation invariant planar graph, and write

$$\begin{aligned} \langle \sigma _a{\bar{\sigma }}_{b} \rangle _{\Gamma ,\beta } = \lim _{G\nearrow \Gamma }\langle \sigma _a{\bar{\sigma }}_{b} \rangle _{G,\beta } \end{aligned}$$
(12)

for the infinite volume two-point function, where the limit is taken along any increasing sequence of subgraphs G exhausting \(\Gamma \). That this is well defined is guaranteed by the fact that the sequence is nondecreasing, i.e., \(\langle \sigma _a{\bar{\sigma }}_{b} \rangle _{G,\beta }\le \langle \sigma _a{\bar{\sigma }}_{b} \rangle _{G',\beta }\) if G is a subgraph of \(G'\), which in turn is a classical consequence of the Ginibre inequality [16].

Definition

Let 0 be a distinguished face of \(\Gamma \). A bi-infinite self-avoiding path in \(\Gamma \) that goes through at least one edge incident to 0 is called a cut (at 0). Note that a cut L naturally splits into two connected infinite sets of vertices \(L_{ +}\) and \(L_{ -}\) with the property that any cycle in \(\Gamma \) that surrounds 0 must intersect both \(L_+\) and \(L_-\).

The main quantity of interest for us will be the sum of correlations along cuts. More precisely, for \(\varepsilon >0\), let

$$\begin{aligned} \chi ^{\varepsilon }_{\Gamma ,\beta }(L)=\sum _{a \in L_{ +}, b \in L_{ -}} (\langle \sigma _{a} \overline{\sigma }_{b} \rangle _{\Gamma ,\beta })^{2-\varepsilon }. \end{aligned}$$
(13)

Proposition 9

For every \(\varepsilon > 0\), there exists \(C=C(\varepsilon ,\beta ,\Gamma ) < \infty \) such that for all finite subgraphs G of \(\Gamma \) containing 0, we have

$$\begin{aligned} \mathbb {E}_{ G, \beta }[ |h(0)|] \le C\inf _{L}\chi ^{\varepsilon }_{\Gamma ,\beta }(L), \end{aligned}$$

where the infimum is over all cuts at 0.

Before presenting the proof, let us discuss its main consequence. A natural example of a cut is any path that stays at a constant distance from a straight line going through 0. In this case it is easy to see that \(\chi ^{\varepsilon }_{\Gamma ,\beta }(L)\) is finite whenever the spin correlations decay exponentially. We can now state the main conclusion of this section.

Corollary 10

If the height function delocalises in the sense of Theorem 4, then

$$\begin{aligned} \chi ^{\varepsilon }_{\Gamma ,\beta }(L)=\infty \end{aligned}$$

for all \(\varepsilon >0\) and all cuts L at 0. In particular the two-point function does not decay exponentially fast with the distance between the vertices.

Proof

We know that situation (i) from Theorem 4 does not happen. This means that \(\sup _n\mathbb {E}_{B_n,\beta }[ |h(0)|] =\infty \), and the claim follows directly from Proposition 9. \(\square \)
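To make concrete the claim that exponential decay forces the sum along a cut to be finite, here is a short illustrative computation. It rests on assumptions that are not spelled out in the text: that \(\langle \sigma _a{\bar{\sigma }}_b\rangle _{\Gamma ,\beta }\le A e^{-c|a-b|}\) for some constants \(A,c>0\), that \(\varepsilon \in (0,2)\), and that the vertices of \(L_+\) and \(L_-\) can be enumerated as \((a_j)_{j\ge 1}\) and \((b_i)_{i\ge 1}\) with \(|a_j-b_i|\ge \kappa (i+j)\) for some \(\kappa >0\) (after adjusting the constant A, this is what one expects for a path staying close to a straight line). Writing \(q=e^{-c\kappa (2-\varepsilon )}<1\), we get

$$\begin{aligned} \sum _{a \in L_{ +}, b \in L_{ -}} (\langle \sigma _{a} \overline{\sigma }_{b} \rangle _{\Gamma ,\beta })^{2-\varepsilon } \le A^{2-\varepsilon }\sum _{i,j\ge 1} q^{i+j} = A^{2-\varepsilon }\Big (\frac{q}{1-q}\Big )^2 < \infty . \end{aligned}$$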

Remark 8

One naturally expects that the localisation-delocalisation phase transition for the height function happens at the same temperature as the BKT transition for the XY model. The remaining part of this prediction is therefore to show that if the spin correlations do not decay exponentially, then the height function delocalises. We do not do this in this article.

Recall that \(m_{a,b}\) is the number of paths (pieces of loops) in a loop configuration that go from a to b. We will need the following lemma.

Lemma 11

For all \(\beta > 0\) and \(p> 1\), there exists a \(C_p < \infty \) such that for all finite graphs \(G=(V,E)\) and all \(a,b\in V\),

$$\begin{aligned} {\textbf{E}}_{G,\beta }[m_{a,b}] \le C_p \deg _{G}(a)\big (\textbf{P}_{G,\beta }(m_{a,b} > 0) \big )^{\frac{1}{p}}. \end{aligned}$$

Proof

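Fix \(\beta > 0\), \(G=(V,E)\) and \(a,b\in V\), and let \(\omega \in \mathcal {L}_0\) be a loop configuration on G. Denote by \(\omega _e\) the number of visits of all loops in \(\omega \) to an undirected edge \(e\in E\). If there are \(m \ge 1\) paths going from a to b in \(\omega \), then in particular \(\sum _{c\sim a} \omega _{\{ a, c\}} \ge m\). This implies that

$$\begin{aligned} {\textbf{E}}_{G,\beta }[m_{a,b}] \le \sum _{c\sim a} {\textbf{E}}_{G,\beta }\big [\omega _{\{ a, c\}} \mathbb {1}\{ m_{a,b}> 0 \} \big ]. \end{aligned}$$

Applying Hölder’s inequality gives

$$\begin{aligned} {\textbf{E}}_{G,\beta }\big [\omega _{\{ a, c\}} \mathbb {1}\{ m_{a,b}> 0 \} \big ] \le \big ({\textbf{E}}_{G,\beta }[\omega ^q_{\{ a, c\}}]\big )^{\frac{1}{q}} \big (\textbf{P}_{G,\beta }(m_{a,b} > 0) \big )^{\frac{1}{p}}, \end{aligned}$$

where \(1/p+1/q=1\). We now notice that by definition, \(\omega _e\) under \(\textbf{P}_{G,\beta }\) has the same distribution as the amplitude \(|\textbf{n}|_e\) under \(\mathbb {P}_{G,\beta }\). Since the sum over c has \(\deg _G(a)\) terms, to finish the proof it is enough to show that for all \(q>1\), there exists \(C_q<\infty \) depending on \(\beta \) but independent of G such that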

$$\begin{aligned} {\mathbb {E}}_{G,\beta }[|\textbf{n}|^q_e]\le C_q. \end{aligned}$$
(14)

We postpone the proof of this bound to Lemma 13 and Lemma 14. \(\square \)

The last ingredient that we will need is the following inequality.

Lemma 12

For any \(a,b\in V\), we have

$$\begin{aligned} \langle \sigma ^2_{a}\bar{\sigma }^2_{b} \rangle _{G,\beta } \le 2 \langle \sigma _{a}\bar{\sigma }_{b} \rangle ^2_{G,\beta }. \end{aligned}$$

Proof

A version of the Ginibre inequality (see e.g. [3]) says that

$$\begin{aligned} \big \langle \Im (\sigma _a) \Im (\sigma _b)\Re (\sigma _a)\Re (\sigma _b)\big \rangle _{G,\beta }&\le \big \langle \Im (\sigma _a) \Im (\sigma _b)\big \rangle _{G,\beta } \big \langle \Re (\sigma _a)\Re (\sigma _b)\big \rangle _{G,\beta }, \end{aligned}$$

which after rearrangement gives the desired inequality. \(\square \)
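For completeness, here is one way to carry out the rearrangement. Write \(\sigma _v=e^{i\theta _v}\) and use the invariance of \(\mu _{G,\beta }\) under a global rotation of all spins, which gives \(\langle \cos (\theta _a+\theta _b)\rangle _{G,\beta }=\langle \cos (2\theta _a+2\theta _b)\rangle _{G,\beta }=0\). The product-to-sum formulas then yield

$$\begin{aligned} \big \langle \Re (\sigma _a)\Re (\sigma _b)\big \rangle _{G,\beta } = \big \langle \Im (\sigma _a) \Im (\sigma _b)\big \rangle _{G,\beta }&= \tfrac{1}{2} \langle \cos (\theta _a-\theta _b)\rangle _{G,\beta } = \tfrac{1}{2}\langle \sigma _{a}\bar{\sigma }_{b} \rangle _{G,\beta }, \\ \big \langle \Im (\sigma _a) \Im (\sigma _b)\Re (\sigma _a)\Re (\sigma _b)\big \rangle _{G,\beta }&= \tfrac{1}{4}\langle \sin (2\theta _a)\sin (2\theta _b)\rangle _{G,\beta } = \tfrac{1}{8}\langle \sigma ^2_{a}\bar{\sigma }^2_{b} \rangle _{G,\beta }. \end{aligned}$$

Substituting into the Ginibre inequality gives \(\tfrac{1}{8}\langle \sigma ^2_{a}\bar{\sigma }^2_{b} \rangle _{G,\beta }\le \tfrac{1}{4}\langle \sigma _{a}\bar{\sigma }_{b} \rangle ^2_{G,\beta }\), which is the statement of the lemma.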

We are now ready to prove Proposition 9.

Proof of Proposition 9

Fix a finite subgraph G and a cut L. By Proposition 7 the height function h(0) under \(\mathbb {P}_{G,\beta }\) has the same law as W(0) – the total net winding around 0 of all loops in a loop configuration – drawn according to \(\textbf{P}_{G,\beta }\). Moreover, any piece of a loop that adds to the winding (in any orientation) must intersect both \( L_{+}\) and \( L_{ -}\) by definition of a cut. Therefore, taking \(p=2/(2-\varepsilon )\), we have

$$\begin{aligned} \mathbb {E}_{G,\beta }[|h(0)|]&= {\textbf{E}}_{G,\beta }[|W(0)|] \\&\le \sum _{a\in L_{+}, b \in L_{ -}} {\textbf{E}}_{G,\beta }[m_{a,b}] \\&\le {\tilde{C}}\sum _{a \in L_{+}, b \in L_{ -}} ( {\textbf{P}}_{G,\beta }(m_{a,b} > 0))^{1/p} \\&\le 2{\tilde{C}}\sum _{a \in L_{+}, b \in L_{ -}} (\langle \sigma ^2_{a}\bar{\sigma }^2_{b} \rangle _{G,\beta })^{1-\varepsilon /2} \\&\le 4{\tilde{C}}\sum _{a \in L_{+}, b \in L_{ -}} (\langle \sigma _{a}\bar{\sigma }_{b} \rangle _{G,\beta })^{2-\varepsilon } \\&\le C\chi ^{\varepsilon }_{\Gamma ,\beta }(L), \end{aligned}$$

where the third line follows from Lemma 11 (with the maximal degree of \(\Gamma \) absorbed into \({\tilde{C}}\)), the fourth one from Lemma 8, the fifth one from Lemma 12, and the last one from (12). This completes the proof. \(\square \)

It therefore remains to show (14), which will directly follow from Lemma 13 and Lemma 14 below. To that end, define for \(k \in \mathbb {N}\) and \(\beta > 0\), a random variable \(Y_{k}\) by

$$\begin{aligned} \mathbb {P}_{\beta }(Y_{k} = i) \propto \frac{1}{i!(i + k)!} \big (\tfrac{\beta }{2}\big )^{2i + k}, \end{aligned}$$

so that the normalizing constant is \(I_k(\beta )\). For \(e=vv'\), let

$$\begin{aligned} |\nabla h|_e= |\textbf{n}_{(v, v')} - \textbf{n}_{(v', v)}| \end{aligned}$$

be the absolute value of the gradient of the height function across the dual edge \(e^\dagger \). Note that the random variables \((X_{e} = X_e(\textbf{n}))_{e \in E}\) defined through

$$\begin{aligned} X_{e} = \frac{|\textbf{n}|_e - |\nabla h|_e}{2} \end{aligned}$$

have, conditionally on \(|\nabla h|_e\), the same distribution as \(Y_{|\nabla h|_e}\). Moreover, conditionally on \(|\nabla h|\), they form an independent family. Since \(|\textbf{n}|_e = |\nabla h|_e + 2X_e\), to show (14) it is enough to bound the moments of \(|\nabla h|_e\) and \(X_e\) separately, which we will now do.

Lemma 13

For all \(\beta > 0\) and all \(r \in {\mathbb {N}}\), there exists a \(C_r < \infty \) such that for all finite planar graphs \(G = (V, E)\) and all \(e \in E\),

$$\begin{aligned} \mathbb {E}_{G,\beta }[|\nabla h|_e^r] \le C_r. \end{aligned}$$

Proof

Fix a finite planar graph G, and let \( e=vv' \in E\). Write \(\mathbb {P}_{e, \beta }\) for the law of the height function on the graph consisting of just one edge e, say with \(h(v) = 0\). We claim first that there exists a constant \(C=C(\beta )\), not depending on G, e or r, such that

$$\begin{aligned} \mathbb {E}_{G, \beta }|\nabla h|_e^r \le C \mathbb {E}_{e, \beta } |\nabla h|_e^r. \end{aligned}$$
(15)

This implies the result because (as \(\mathcal {V}_{\beta }\) is convex and symmetric) the law of \(\nabla h_e\) is log-concave and symmetric under \(\mathbb {P}_{e, \beta }\) so that it has all moments.

Let \(G \setminus e\) be the graph without the edge e. For \(l\in {\mathbb {Z}}\), we define \(\Omega _{ l}(G) =\{ \textbf{n}\text { on } G: \delta \textbf{n}=l(\delta _v-\delta _{v'})\}\), and

$$\begin{aligned} Z^l_G = \sum _{\textbf{n}\in \Omega _{l} (G)} w_{\beta }(\textbf{n}), \end{aligned}$$

and analogously \(Z^l_{G \setminus e}\). Similarly to (11), we get from the high-temperature expansion of correlation functions that

$$\begin{aligned} \langle \sigma _v^l {\bar{\sigma }}_{v'}^{l} \rangle _{G\setminus e, \beta } = \frac{Z_{G \setminus e}^{l}}{Z^0_{G \setminus e}}. \end{aligned}$$

By the definition of the height function and currents, we therefore have

$$\begin{aligned} \mathbb {P}_{G,\beta }(|\nabla h|_e= l) = I_l(\beta )\frac{(Z^l_{G\setminus e} +Z^{-l}_{G\setminus e}) }{Z^0_G}&=2I_l(\beta )\frac{Z^l_{G\setminus e} }{Z^0_{G\setminus e} } \frac{Z^0_{G\setminus e} }{Z^0_{G} } \le 2 I_l(\beta )\\ {}&= \mathbb {P}_{e, \beta }(|\nabla h|_e = l) Z^0_{e}, \end{aligned}$$

where we used the obvious bounds \(\langle \sigma _v^l {\bar{\sigma }}_{v'}^{l} \rangle _{G\setminus e, \beta } \le 1\), and \({Z^0_{G\setminus e} }/{Z^0_{G}} \le 1\). Setting \(C = Z_{e}^0\) we establish (15). \(\square \)

Lemma 14

For all \(\beta > 0\) and all \(r \in {\mathbb {N}}\), there exists a \({\tilde{C}}_r < \infty \) such that for all finite planar graphs \(G = (V, E)\) and \(e \in E\),

$$\begin{aligned} \mathbb {E}_{G,\beta }[|X_e|^r] \le {\tilde{C}}_r. \end{aligned}$$

Proof

For two nonnegative integers i and r, let \((i)_r = i(i - 1)\cdots (i - r + 1)\) be the falling factorial with the convention that \((i)_0=1\). Note that \((i)_r=0\) whenever \( i < r\). It will be convenient to look at the falling factorial moments. First note that by definition of \(Y_k\),

$$\begin{aligned} \mathbb {E}_\beta [(Y_k)_r]&= \frac{1}{I_k(\beta )}\sum _{i \ge 0} \frac{(i)_r}{i!(i + k)!} \big ( \tfrac{\beta }{2}\big )^{2i + k} = \frac{\big (\tfrac{\beta }{2}\big )^{r}}{I_k(\beta )}\sum _{i \ge 0} \frac{1}{i!(i + k + r)!}\big ( \tfrac{\beta }{2}\big )^{2i + k + r}\\ {}&= \big (\tfrac{\beta }{2}\big )^{r} \frac{I_{k+r}(\beta )}{I_k(\beta )}. \end{aligned}$$

By the Turán inequality (5), the map \(k \mapsto I_{k+1}(\beta ) / I_k(\beta )\) is decreasing and hence

$$\begin{aligned} \mathbb {E}_\beta [(Y_k)_{r}] = \big (\tfrac{\beta }{2}\big )^{r } \frac{I_{k + r }(\beta )}{I_k(\beta )} \le \big (\tfrac{\beta }{2}\big )^{r }\frac{I_{r}(\beta )}{I_0(\beta )} =: C. \end{aligned}$$
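The inequality uses the monotonicity of the ratio one factor at a time. Spelled out as a telescoping product, and using that \(j\mapsto I_{j+1}(\beta )/I_j(\beta )\) is decreasing,

$$\begin{aligned} \frac{I_{k + r }(\beta )}{I_k(\beta )} = \prod _{j=0}^{r-1} \frac{I_{k + j+1 }(\beta )}{I_{k+j}(\beta )} \le \prod _{j=0}^{r-1} \frac{I_{j+1 }(\beta )}{I_{j}(\beta )} = \frac{I_{r}(\beta )}{I_0(\beta )}. \end{aligned}$$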

Now note that \((i)_r \ge |i-r|^r\) when \(i\ge r\), and hence \(i^r\le 2^{r-1}( |i-r|^r +r^r)\le 2^{r}( (i)_r +r^r)\). Finally

$$\begin{aligned} \mathbb {E}_{\beta }[|X_e|^r \mid |\nabla h|_e = k] = \mathbb {E}_{ \beta }[|Y_k|^r] \le 2^{r}( C +r^r) =: {\tilde{C}}_r, \end{aligned}$$

where the last bound does not depend on k. Integrating over the possible values of \(|\nabla h|_e\) concludes the proof. \(\square \)
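The factorial moment identity and the resulting uniform bound can be checked numerically. The sketch below is only an illustration: it truncates the series defining \(I_k(\beta )\) and the law of \(Y_k\) at a fixed number of terms (an assumption that is harmless for moderate \(\beta \) and k), and all function names are hypothetical.

```python
import math


def series_weights(k, beta, terms=60):
    """Unnormalised weights (beta/2)^(2i+k) / (i! (i+k)!) for i = 0, ..., terms-1."""
    return [
        (beta / 2) ** (2 * i + k) / (math.factorial(i) * math.factorial(i + k))
        for i in range(terms)
    ]


def bessel_i(k, beta, terms=60):
    """Modified Bessel function I_k(beta) via its (truncated) power series."""
    return sum(series_weights(k, beta, terms))


def falling(i, r):
    """Falling factorial (i)_r, with (i)_0 = 1."""
    return math.prod(i - j for j in range(r))


def factorial_moment(k, r, beta, terms=60):
    """E[(Y_k)_r] computed directly from the truncated law of Y_k."""
    w = series_weights(k, beta, terms)
    return sum(wi * falling(i, r) for i, wi in enumerate(w)) / sum(w)


def moment(k, r, beta, terms=60):
    """E[Y_k^r] computed directly from the truncated law of Y_k."""
    w = series_weights(k, beta, terms)
    return sum(wi * i**r for i, wi in enumerate(w)) / sum(w)
```

With these helpers one can compare `factorial_moment(k, r, beta)` with `(beta / 2) ** r * bessel_i(k + r, beta) / bessel_i(k, beta)`, and `moment(k, r, beta)` with the constant \(2^r(C+r^r)\) from the proof.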

5 Existence of Phase Transition in the XY Model

In this section, we prove that for all translation invariant planar graphs \(\Gamma = (V, E)\), the XY model undergoes a non-trivial phase transition in terms of the quantity \(\chi ^{\varepsilon }_{\Gamma ,\beta }(L)\). As before, let 0 denote an arbitrary distinguished face of \(\Gamma \). We define

$$\begin{aligned} \beta _0=\inf \{\beta>0: \text {for all}\, {\varepsilon >0}\,\text {and all cuts}\, L\,\text { at}\, 0, \chi ^{\varepsilon }_{\Gamma ,\beta }(L) =\infty \}. \end{aligned}$$

Theorem 15

Let \(\Gamma \) be as above. Then \(\beta _0<\infty \).

By Corollary 10 it is enough to show that for any such \(\Gamma \), there exists a finite \(\beta > 0\) at which the associated height function delocalises in the sense that there are no translation invariant Gibbs measures on the dual \(\Gamma ^\dagger \). We first implement this strategy for triangulations, where delocalisation can be shown directly using the general result of Lammers [19] (Theorem 2).

Proof of Theorem 15 for triangulations

Let \(\Gamma \) be a translation invariant triangulation. Note that condition (6) in our case is equivalent to \({I_{1}(\beta )}/{I_0(\beta )} \ge \frac{1}{2}\). It is known that this fraction converges to 1 as \(\beta \rightarrow \infty \) (see for example [33]), and therefore in light of Theorem 2, there are no translation invariant Gibbs measures for \(\beta \) large enough. \(\square \)
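To get a feeling for how large \(\beta \) has to be in this condition, one can locate numerically the point where \(I_1(\beta )/I_0(\beta )\) crosses \(\tfrac{1}{2}\). The sketch below is illustrative only: it evaluates the Bessel functions through a truncated power series, relies on the fact that the ratio is increasing in \(\beta \), and uses hypothetical function names and a hypothetical search interval.

```python
import math


def bessel_i(k, beta, terms=60):
    """Modified Bessel function I_k(beta) via its (truncated) power series."""
    return sum(
        (beta / 2) ** (2 * i + k) / (math.factorial(i) * math.factorial(i + k))
        for i in range(terms)
    )


def bessel_ratio(beta):
    """The ratio I_1(beta) / I_0(beta) appearing in the condition."""
    return bessel_i(1, beta) / bessel_i(0, beta)


def threshold(target=0.5, lo=1e-6, hi=10.0, tol=1e-10):
    """Bisection for the beta at which bessel_ratio crosses `target`.

    Assumes bessel_ratio is increasing on [lo, hi] and that
    bessel_ratio(lo) < target <= bessel_ratio(hi).
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if bessel_ratio(mid) >= target:
            hi = mid
        else:
            lo = mid
    return hi
```

Calling `threshold()` returns an approximation of the smallest \(\beta \) for which the inequality \(I_1(\beta )/I_0(\beta )\ge \tfrac{1}{2}\) holds.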

To extend beyond triangulations, we will use a different approach. We stress that in particular, we will not show delocalisation of the height function on graphs that are not triangulations. Instead, we exploit monotonicity in coupling constants to bound from below the spin correlations on an arbitrary translation invariant graph by correlations on a modified graph that is a triangulation. We explain this procedure in detail for the square lattice, and briefly mention the extension to other lattices at the end.

In what follows, we will need the following well-known monotonicity of spin correlations, which is a classical consequence of the Ginibre inequality [16].

Lemma 16

For each (infinite or finite) graph \(G = (V, E)\), \(\beta > 0\), \(e\in E\), and \(a,b\in V\), the function

$$\begin{aligned} J_e \mapsto \langle \sigma _a{\bar{\sigma }}_{b} \rangle _{G,\beta } \end{aligned}$$

is nondecreasing.

Fig. 2

The transformation to a triangulation. The red edge on the right is the edge with different potential.

Proof of Theorem 15 for the square lattice

Let \(\Gamma = (V, E)\) denote the square lattice.

In order to use (6), we need to transform \(\Gamma \) into a triangulation. See Fig. 2 for guidance. Fix a square, double its bottom and left edges, and put coupling constants \(\beta / 2\) on the doubled edges instead of \(\beta \). Next, double the common vertex of the left and bottom edges and add an additional edge e, on which we set the coupling constant to infinity. This does not change the distribution of the spins. Finally, set the coupling constant on the edge e to 0, which is equivalent to removing the edge from the square, and repeat the procedure for all other squares. In this way, we obtain a new lattice \(\Gamma '\), which consists of squares with a diagonal on which there is an additional vertex. Note that all coupling constants are now equal to \(\beta / 2\). By Lemma 16,

$$\begin{aligned} \langle \sigma _a \bar{\sigma }_{b} \rangle _{\Gamma , \beta } \ge \langle \sigma _a \bar{\sigma }_{b} \rangle _{\Gamma ', \beta /2} \end{aligned}$$
(16)

for all pairs of vertices a, b in \(\Gamma \), using the natural embedding of \(\Gamma \) in \(\Gamma '\).

Since \(\Gamma '\) is a translation invariant graph, the dichotomy statement of Theorem 4 holds. To show that there are no translation invariant Gibbs measures for the associated height function, notice that the dual \((\Gamma ')^\dagger \) of \(\Gamma '\) (after collapsing the doubled edges to a single edge) is trivalent. Moreover, the height function on any finite subgraph of \((\Gamma ')^\dagger \) has a potential given by \(\mathcal {V}_e' = \mathcal {V}^{\beta /2}_e\) for the nondiagonal edges and \(\mathcal {V}'_e = 2\mathcal {V}^{\beta /2}_e\) otherwise, and the potential \(\mathcal {V}'\) satisfies Lammers’ condition (6) precisely when \(\left( {I_1(\beta /2)}/{I_0(\beta / 2)}\right) ^2 \ge \frac{1}{2}\). Since the fraction on the left-hand side tends to 1 as \(\beta \rightarrow \infty \), we can choose \(\beta \) large enough so that there are no translation invariant Gibbs measures for the height function on \((\Gamma ')^\dagger \).

Note that every cut on \(\Gamma \) embeds naturally as a cut on \(\Gamma '\). Therefore, by Proposition 9 together with (16), we have that for each cut L on \(\Gamma \) and each \(\varepsilon > 0\),

$$\begin{aligned} \chi _{\Gamma ,\beta }^\varepsilon (L)\ge \chi _{\Gamma ',\beta /2}^\varepsilon (L) = \infty . \end{aligned}$$

This finishes the proof. \(\square \)

To extend this proof to general graphs, we make each face into a triangulation by “zig-zagging” (see Fig. 3).

Fig. 3

The transformation of a general graph to a triangulation (after identifying the resulting multiple edges). The dashed edges are such that the coupling constant is set to infinity first, and then to zero (which is equivalent to removing the edges), and hence the spin correlations in the final graph are smaller than in the original graph.

6 No Exponential Decay Implies a Power-Law Lower Bound

In this section we finish the proof of the main theorem by showing that the absence of exponential decay implies a power-law lower bound on the two-point function when \(\Gamma ={\mathbb {Z}}^2\). Similar arguments can be applied to other graphs that in addition to being translation invariant possess reflection and rotation symmetries.

We will use the following two classical inequalities.

Lemma 17

(Lieb–Rivasseau inequality [24, 32]). Let \(G=(V,E)\) be any graph. Let \(a,b \in V\) be distinct, and let H be a finite subgraph of G containing a and not containing b, and let \(\partial H\) be the set of vertices of H adjacent to at least one vertex outside H. Then

$$\begin{aligned} \langle \sigma _a \bar{\sigma }_{b} \rangle _{G,\beta } \le \sum _{c \in \partial H} \langle \sigma _a \bar{\sigma }_c \rangle _{H,\beta } \langle \sigma _c \bar{\sigma }_{b}\rangle _{G,\beta }. \end{aligned}$$

Lemma 18

(Messager–Miracle-Sole inequality [30]). For any \(n\in {\mathbb {Z}}\), the two sequences \(\langle \sigma _0 \bar{\sigma }_{(n,k)} \rangle _{\mathbb {Z}^2, \beta }\) and \(\langle \sigma _0 \bar{\sigma }_{(n+k,n-k)} \rangle _{\mathbb {Z}^2, \beta }\) are nonincreasing in k for \(k\ge 0\).

Proof of Theorem 1

Let 0 denote the vertex at the origin. For a finite subgraph G of \({\mathbb {Z}}^2\) containing 0, let

$$\begin{aligned} \varphi _{G,\beta } =\sum _{w\in \partial G} \langle \sigma _0{\bar{\sigma }}_w\rangle _{G,\beta }, \end{aligned}$$

where \(\partial G\) is the set of vertices of G adjacent to at least one vertex outside G. Define

$$\begin{aligned} \beta _c= \sup \{\beta : \text {there exists finite}\, G\, \text {with}\, \varphi _{G,\beta } <1\}. \end{aligned}$$
(17)

We will show that \(\beta _c\) satisfies the properties listed in Theorem 1. To this end, first fix \(\beta <\beta _c\). By Lemma 16, there exists a finite graph G with \(\varphi _{G,\beta } <1\). Take m such that \(G \subset \Lambda _m\), where \(\Lambda _m\) denotes the box \([-m,m]^2\), and let \(x \in V\). Fix n so that \((n + 1)m \ge |x|_1 \ge nm\). Iteratively applying the Lieb–Rivasseau inequality [24, 32] to translates of G gives

$$\begin{aligned} \langle \sigma _0 \bar{\sigma }_x \rangle _{\mathbb {Z}^2, \beta } \le \sum _{w \in \partial G} \ \langle \sigma _0{\bar{\sigma }}_w\rangle _{G,\beta } \sum _{w' \in \partial (G + w)} \langle \sigma _w \bar{\sigma }_{w'} \rangle _{G + w, \beta }\langle \sigma _{w'} \bar{\sigma }_x \rangle _{\mathbb {Z}^2, \beta } \le \cdots \le (\varphi _{G, \beta })^{n}, \end{aligned}$$

hence (i) holds true if \(\beta < \beta _c\).

To conclude (ii), note that for each finite G, \(\varphi _{G,\beta }\) is a continuous function of \(\beta \), and hence the set in (17) is open. This means that for every \(\beta \ge \beta _c\), we have \(\varphi _{G,\beta } \ge 1\) for all finite subgraphs G.

Fig. 4

The \([-n, n]^2\) box \(\Lambda _n\) shaded in grey and the \(L^1\) ball \(\Lambda '_n\) of radius 2n.

Now let \(\Lambda _n\) be the box \([-n, n]^2\), and let \(\Lambda '_n\) be the \(L^1\) ball of radius 2n (see Fig. 4). We write \(x_n:=(n,n)\in \partial \Lambda _n \cap \partial \Lambda '_n\) and \(a_n =\langle \sigma _0{\bar{\sigma }}_{x_n}\rangle _{{\mathbb {Z}}^2,\beta }\). By rotation symmetry and the Messager–Miracle-Sole inequality [30], we have

$$\begin{aligned} a_n =\min _{v\in \partial \Lambda _n}\langle \sigma _0{\bar{\sigma }}_{v}\rangle _{{\mathbb {Z}}^2,\beta } = \max _{v\in \partial \Lambda '_n}\langle \sigma _0{\bar{\sigma }}_{v}\rangle _{{\mathbb {Z}}^2,\beta }. \end{aligned}$$

For \(\beta \ge \beta _c\), we moreover have

$$\begin{aligned} \sum _{w \in \partial \Lambda '_n} \langle \sigma _0 \overline{\sigma }_w \rangle _{{\mathbb {Z}}^2, \beta } \ge \varphi _{\Lambda _n',\beta } \ge 1. \end{aligned}$$

These two observations together imply that for any \(v\in \partial \Lambda _n\),

$$\begin{aligned} \langle \sigma _0{\bar{\sigma }}_{v}\rangle _{{\mathbb {Z}}^2,\beta } \ge a_n \ge \frac{1}{|\partial \Lambda '_n|} = \frac{1}{8n} \ge \frac{1}{8 |v|} \end{aligned}$$

which implies (ii).
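The count \(|\partial \Lambda '_n|=8n\) used in the last display, as well as the fact that \(x_n\) lies on both boundaries, can be verified by direct enumeration. The following sketch is a simple illustration with hypothetical helper names; it identifies \(\partial \Lambda '_n\) with the set of vertices of \(L^1\) norm exactly 2n and \(\partial \Lambda _n\) with those of sup norm exactly n.

```python
def l1_sphere(radius):
    """Vertices v of Z^2 with |v|_1 = radius, i.e. the boundary of the L^1 ball."""
    pts = []
    for x in range(-radius, radius + 1):
        y = radius - abs(x)
        pts.append((x, y))
        if y != 0:  # avoid listing the two points on the horizontal axis twice
            pts.append((x, -y))
    return pts


def box_boundary(n):
    """Vertices of the box [-n, n]^2 whose sup norm is exactly n."""
    return [
        (x, y)
        for x in range(-n, n + 1)
        for y in range(-n, n + 1)
        if max(abs(x), abs(y)) == n
    ]
```

For each n, `len(l1_sphere(2 * n))` equals 8n, and the corner `(n, n)` belongs to both `box_boundary(n)` and `l1_sphere(2 * n)`.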

Finally by Theorem 15 we know that there exists a finite \(\beta \) at which there is no exponential decay, and by classical expansions there exists a nonzero \(\beta \) at which there is exponential decay (see e.g. [1]). We conclude that \(0<\beta _c<\infty \). \(\square \)