1 Introduction

A quantum XY model with spin \(S\in \tfrac{1}{2}{\mathbb {N}}\) on the square lattice \({\mathbb {Z}}^2 \) with a particular type of annealed site dilution is considered. We prefer to formulate the model in terms of a more symmetric equivalent version, with dilution represented by Ising spins instead of the site occupation numbers, with the Hamiltonian

$$\begin{aligned} H =-\frac{1}{S^2}\sum _{\{ x,y\}}\sigma _x\sigma _y\big (S_x^{(1)}S_y^{(1)}+S_x^{(3)}S_y^{(3)}-S(S+1)\big )-\kappa \sum _{\{ x,y\}}\sigma _x\sigma _y-\mu \sum _{x}\sigma _xS_x^{(3)}.\nonumber \\ \end{aligned}$$
(1.1)

Here \(S_{x}^{(\alpha )}\), \(\alpha =1,2,3,\) are the components of the standard spin-S operator acting on the site x (so in particular \(S^{(1)}\) and \(S^{(3)}\) are real matrices and \(S^{(3)}\) is a diagonal matrix) and \(\sigma _{x}\) is an Ising variable representing the presence of a particle at the site x. More concretely, the occupancy number \(n_x\in \{0,1\}\) indicating the presence/absence of a particle at x corresponds to the Ising spin via the relation \(\sigma _x=2 n_x-1 \in \{-1,1\}\). The parameters \(\mu \) and \(\kappa \) play the roles of the chemical potential and the interaction parameter for the particles.

Our main claim concerns the existence of a staggered long range occupancy order characterised by the presence of two distinct states (in the thermodynamic limit) which preferentially take Ising spin with value \(+1\) on either the even or the odd sublattice. Indeed, it will be proven that such states occur in a region of parameters \(\mu \) and \(\kappa \), at intermediate inverse temperatures, \(\beta \).

The existence of such states can be viewed as a demonstration of an “effective entropic repulsion” caused by the interaction of quantum spins, which leads to a significant restriction of the “available phase space volume”. As a result, occupation of adjacent sites may turn out to be unfavourable: there is an effective repulsion between particles occupying nearest neighbour sites, which eventually leads to a staggered order. It is easy to understand that this is the case for the annealed site diluted Potts model with a large number of spin states q [3]. Indeed, here the effect is caused by a pure entropic repulsion: two nearest neighbour occupied sites contribute the Boltzmann factor \(q+q(q-1)e^{-\beta }\) with q aligned pairs of Potts spins without energy penalty and \(q(q-1)\) nonaligned excitations. At low temperatures, this Boltzmann factor is much smaller than the factor \(q^2\) obtained from two next nearest neighbour Potts spins that are free, without energy penalty, to take entirely independently all \(q^2\) possible spin values. Actually, the same is true, even though less obvious, in the case of diluted models with classical continuous spins [4]. Our present result constitutes an extension of similar claims to a quantum situation.
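The comparison of the two Potts weights can be checked numerically. The following is a minimal sketch, not part of the original argument; the function names are illustrative, and the energy scale is the one implicit in the factor \(q+q(q-1)e^{-\beta }\).

```python
import math


def adjacent_pair_weight(q: int, beta: float) -> float:
    """Boltzmann weight q + q(q-1) e^{-beta} of two occupied neighbouring sites."""
    aligned = q                                  # q aligned pairs, no energy penalty
    nonaligned = q * (q - 1) * math.exp(-beta)   # q(q-1) nonaligned pairs
    return aligned + nonaligned


def free_pair_weight(q: int) -> int:
    """Weight q^2 of two occupied sites whose Potts spins are independent."""
    return q * q


def weight_ratio(q: int, beta: float) -> float:
    """Ratio adjacent/free, which equals 1/q + (1 - 1/q) e^{-beta}."""
    return adjacent_pair_weight(q, beta) / free_pair_weight(q)


if __name__ == "__main__":
    for q in (2, 10, 100):
        print(q, weight_ratio(q, beta=5.0))
```

The ratio tends to 1/q as \(\beta \rightarrow \infty \), so the penalty for adjacent occupation becomes pronounced only when q is large, in line with the assumption of a large number of spin states.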

To control the effective repulsion, we rely on a standard tool: the chessboard estimates, which follow from reflection positivity. The classical references on this topic are [5,6,7,8,9] with a recent review [1]. For our case the treatment in [2] is especially useful. There is a technical issue in the very formulation concerning a consistent treatment of infinite volume Gibbs states. In the classical case, the use of the notion of infinite-volume DLR states is standard. For an efficient formulation of the long range order in terms of coexistence of the corresponding infinite-volume equilibrium states in the quantum case, we use the setting from [2, Sect. 3.3] introducing infinite volume KMS states.

Note that we could have also added a term \(uS_x^{(2)}S_y^{(2)}\) to the Hamiltonian with our result concerning reflection positivity still holding for \(u\le 0\) (as \(S^{(2)}\) is a purely imaginary matrix). Thus, we could consider our case as a restriction from the general case with \(-1\le u\le 0\) to the case \(u=0\) and ask whether the full result could also be extended to the models with \(-1\le u<0\). Here, however, we ran into an obstacle; it is not clear which estimates can really be obtained in these cases (see Lemmas 3.4-3.5).

We remark that our Hamiltonian bears a resemblance to that of the Falicov–Kimball model. Roughly, in the special case of spin 1/2, if we set \(b_x=S^{(1)}_x+iS^{(3)}_x\) and \(b^*_x= S^{(1)}_x-iS^{(3)}_x\), we have

$$\begin{aligned} H=-2\sum _{\{x,y\}}\sigma _x\sigma _y(b_xb^*_y+b^*_xb_y-\tfrac{3}{2})-\kappa \sum _{\{ x,y\}}\sigma _x\sigma _y-\frac{\mu }{2i}\sum _{x}\sigma _x(b_x-b^*_x). \end{aligned}$$
(1.2)

Compare this with the Hamiltonian for the Falicov–Kimball model, as presented in [10],

$$\begin{aligned} H^{\text {FK}}=-\sum _{x,y}t_{x,y}a^*_xa_y+U\sum _xn_x\left( a^*_xa_x-\tfrac{1}{2}\right) . \end{aligned}$$
(1.3)

Here \(T=(t_{x,y})\) is a complex hermitian matrix, \(U\in {\mathbb {R}}\) is a coupling constant, \(a^*_x\) and \(a_x\) are the fermionic creation and annihilation operators acting on site x, respectively, and \(n_x\in \{0,1\}\) are occupation variables of heavy particles treated classically. If the last term \(\mu \sum _{x}\sigma _xS^{(3)}_x\) in our Hamiltonian were replaced by \(\mu \sum _{x}\sigma _x S_x^{(2)}=\mu \sum _{x}\sigma _x(b^*_xb_x-\tfrac{1}{2})\), the resulting model would appear to be even closer to the Falicov–Kimball Hamiltonian. Moreover, we can see from (3.4) and (3.42) that Theorem 2.1 would still hold in this case.

Nevertheless, we remark three crucial differences between our model and the Falicov–Kimball model. Firstly, \(b^*\) and b are bosonic operators while, in the case of the Falicov–Kimball model, \(a^*\) and a are fermionic operators (even though, if our model were considered in dimension 1, it could be transformed to fermionic operators by the Jordan–Wigner transformation). Secondly, our “hopping term” is not constant and it involves the Ising (or occupation) variables, so the Falicov–Kimball picture with itinerant electrons and fixed (classical) particles is not valid in our case. Finally, and most importantly, in our model we consider any spin (not necessarily equal to 1/2), which makes it differ even more from the Falicov–Kimball model. For the Falicov–Kimball model, the staggered order close to half filling is proven with either fermions or hard-core bosons (see [11, 12] and the review [10]). On the other hand, in our model the staggered order occurs for all spins, and is due to an “effective entropic repulsion”, rather than fermionic effects.

We introduce the model and state the main result in Sect. 2. The proof is deferred to Sect. 3.

2 Setting and Main Results

For a fixed even \(L\in {\mathbb {N}}\), we consider the torus \({\mathbb {T}}_L={\mathbb {Z}}^d/L{\mathbb {Z}}^d\) consisting of \(L^d\) sites that can be identified with the set \((-L/2,L/2]^d\cap {\mathbb {Z}}^d\). On the torus \({\mathbb {T}}_L\) we take the algebra \({\mathfrak {A}}_L\) of observables consisting of all functions \(A: \{-1,1\}^{{\mathbb {T}}_L}\rightarrow {{\mathcal {M}}}_L\) where \({{\mathcal {M}}}_L\) is the \(C^*\)-algebra of linear operators acting on the space \(\otimes _{x\in {\mathbb {T}}_L}{\mathbb {C}}^{2S+1}\) with \(S\in \frac{1}{2}{\mathbb {N}}\) (complex \((2S+1)^{{|{\mathbb {T}}_L|}}\)-dimensional matrices).

A particular example of an observable is the Hamiltonian \(H_L\in {\mathfrak {A}}_L\) of the form (1.1) with the periodic boundary conditions (on the torus \({\mathbb {T}}_L\)),

$$\begin{aligned} H_L(\sigma )=-\frac{1}{S^2}\sum _{\{ x,y\}}\sigma _x\sigma _y\big (S_x^{(1)}S_y^{(1)}+S_x^{(3)}S_y^{(3)}-S(S+1)\big )-\kappa \sum _{\{ x,y\}}\sigma _x\sigma _y-\mu \sum _{x\in {\mathbb {T}}_L}\sigma _xS_x^{(3)}.\nonumber \\ \end{aligned}$$
(2.1)

Here the sum is over pairs \(\{x,y\}\in {\mathbb {E}}_L\), the set of all edges connecting nearest neighbour sites in the torus \({\mathbb {T}}_L\), and \(S_{x}^{(\alpha )}\), \(\alpha =1,2,3,\) are the components of the standard spin-S operator acting on the site x. The Gibbs state on the torus is given by

$$\begin{aligned} \langle \cdot \rangle _{L,\,\beta }=\frac{1}{Z_L(\beta )}\sum _{\sigma }{{\text {Tr}}}\,\cdot \mathrm{e}^{-\beta H_L} \end{aligned}$$
(2.2)

with \(Z_L(\beta )=\sum _{\sigma }{{\text {Tr}}}\, \mathrm{e}^{-\beta H_L}\). Infinite volume states of a quantum spin system are formulated in terms of KMS states, an analog of DLR states for classical systems. Let us briefly recall this notion in the form to be used in our situation. Here we follow closely the treatment from [2] which can be consulted for a more detailed discussion of KMS states in a setting similar to ours. Let \({\mathfrak {A}}\) denote the \(C^*\)-algebra of quasilocal observables,

$$\begin{aligned} {\mathfrak {A}}=\overline{{\mathfrak {A}}_0},\qquad \text { where }\quad {\mathfrak {A}}_0=\bigcup _{\Lambda \subset {\mathbb {Z}}^d \text { finite}}{\mathfrak {A}}_\Lambda , \end{aligned}$$
(2.3)

where the overline denotes the norm-closure. We define the time evolution operators \(\alpha _t^{(L)}\) acting on \(A\in {\mathfrak {A}}_L\) for any \(t\in {\mathbb {R}}\) as

$$\begin{aligned} \alpha _t^{(L)}(A)=\mathrm{e}^{it H_L}A\mathrm{e}^{-itH_L}. \end{aligned}$$
(2.4)

It is well known that for a local operator \(A\in {\mathfrak {A}}_0\) we can expand \(\alpha ^{(L)}_t(A)\) as a series of commutators,

$$\begin{aligned} \alpha ^{(L)}_t(A)=\sum _{m\ge 0}\frac{(it)^m}{m!}[H_L,[H_L,\ldots ,[H_L,A]\ldots ]]. \end{aligned}$$
(2.5)

The map \(t\rightarrow \alpha ^{(L)}_t\) extends to all \(t\in {\mathbb {C}}\) and, as \(L\rightarrow \infty \), \(\alpha _t^{(L)}\) converges in norm to an operator \(\alpha _t\) on \({\mathfrak {A}}\) uniformly on compact subsets of \({\mathbb {C}}\) (one can consult the proof, for example, in [13] and see that the same proof structure works in this case). A state \(\langle \cdot \rangle _\beta \) on \({\mathfrak {A}}\) (a positive linear functional (\(\langle A\rangle _\beta \ge 0\) if \(A\ge 0\)) such that \(\langle \mathbb {1}\rangle _\beta = 1\)) is called a KMS state (or is said to satisfy the KMS condition) with a Hamiltonian H at an inverse temperature \(\beta \), if we have

$$\begin{aligned} \langle AB\rangle _\beta =\langle \alpha _{-i\beta }(B)A\rangle _\beta \end{aligned}$$
(2.6)

for the above defined family of operators \(\alpha _t\) at imaginary values \(t=-i\beta \). One can see that the Gibbs state (2.2) satisfies the KMS condition for the finite volume time evolution operator.
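The finite-volume statement can be illustrated on a toy example. The sketch below assumes a Hamiltonian that is diagonal in the chosen basis, with hypothetical energies and arbitrary test matrices `A` and `B`; it only checks the algebraic identity (2.6) for a Gibbs state \({\text {Tr}}(\cdot \,\mathrm{e}^{-\beta H})/Z\) and is not tied to the Hamiltonian (2.1).

```python
import math


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def gibbs_expectation(a, energies, beta):
    """Tr(a e^{-beta H}) / Tr(e^{-beta H}) for H = diag(energies)."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return sum(a[j][j] * w for j, w in enumerate(weights)) / z


def evolve_imaginary(b, energies, beta):
    """alpha_{-i beta}(b) = e^{beta H} b e^{-beta H} for H = diag(energies)."""
    n = len(b)
    return [[math.exp(beta * (energies[j] - energies[k])) * b[j][k]
             for k in range(n)] for j in range(n)]


def kms_defect(a, b, energies, beta):
    """|<ab> - <alpha_{-i beta}(b) a>|, which should vanish up to rounding."""
    lhs = gibbs_expectation(matmul(a, b), energies, beta)
    shifted = evolve_imaginary(b, energies, beta)
    rhs = gibbs_expectation(matmul(shifted, a), energies, beta)
    return abs(lhs - rhs)


# Hypothetical test data: three energy levels and two non-commuting matrices.
ENERGIES = [-1.0, 0.0, 0.5]
A = [[1 + 2j, 0.5, -1j], [2.0, -1.0, 0.3j], [0.0, 1 - 1j, 4.0]]
B = [[0.2, 1j, 3.0], [-1.0, 0.5 + 0.5j, 0.0], [2j, 1.0, -0.7]]

if __name__ == "__main__":
    print(kms_defect(A, B, ENERGIES, beta=1.3))
```

The identity rests only on the cyclicity of the trace, which is why no commutation between `A`, `B` and the Hamiltonian is needed.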

A special class of observables are classical events \({\mathbb {1}}_{{{\mathcal {F}}}} I\) obtained as a product of the identity \(I\in {{\mathcal {M}}}_L\) with the indicator \({\mathbb {1}}_{{{\mathcal {F}}}}\) of an Ising configuration event \({{\mathcal {F}}}\subset \{-1,1\}^{{\mathbb {T}}_L}\). Often we will consider (classical) block events depending only on the Ising configuration on the block-cube of \(2^d\) sites, \(C=\{0,1\}^d\subset {\mathbb {T}}_L\). Namely, the events of the form \({{\mathcal {E}}}\times \{-1,1\}^{{\mathbb {T}}_L\setminus C}\) where \({{\mathcal {E}}}\subset \{-1,1\}^{C}\). We will refer to these events directly as block events \({{\mathcal {E}}}\) and use a streamlined notation \(\langle {{\mathcal {E}}}\rangle _{L,\,\beta }\) (resp. \(\langle {{\mathcal {E}}}\rangle _{\beta }\)) instead of \(\langle {\mathbb {1}}_{{{\mathcal {E}}}\times \{-1,1\}^{{\mathbb {T}}_L\setminus C}} I\rangle _{L,\,\beta }\) (resp. \(\langle {\mathbb {1}}_{{{\mathcal {E}}}\times \{-1,1\}^{{\mathbb {T}}_L\setminus C}} I\rangle _{\beta }\)).

In particular, to characterise the long-range order states mentioned above, we introduce the block events \({{\mathcal {G}}}^{\text {e}}= \{\sigma ^{\text {e}}\}\) and \( {{\mathcal {G}}}^{\text {o}}= \{\sigma ^{\text {o}}\}\) where \(\sigma ^{\text {e}}\) and \(\sigma ^{\text {o}}\) are the even and the odd staggered configurations on C: \(\sigma ^{\text {e}}_{x}=1\) iff x is an even site in C and \(\sigma ^{\text {o}}_{x}=1\) iff x is an odd site in C. Notice that the sets \({{\mathcal {G}}}^{\text {e}}\) and \({{\mathcal {G}}}^{\text {o}}\) are disjoint.

The main result for the quantum system with Hamiltonian (2.1) can now be stated as follows.

Theorem 2.1

Let \(d=2\) and \(S\ge \frac{1}{2}\). Let \(\mu _0=\frac{1}{2} \frac{S+1}{S^2}\) and \(\kappa _0=\kappa _0(\mu )=\frac{S+1}{S}-2{|\mu |} S\). Then, for any \(|\mu |<\mu _0\), \(\kappa <\kappa _0(\mu )\), and any \(0<\varepsilon <\frac{1}{2}\), there exists \(\beta _0=\beta _0(\mu ,\kappa ,\varepsilon )\) such that for any \(\beta >\beta _0\) there exist two distinct KMS states, \(\langle \cdot \rangle _\beta ^{\text {e}}\) and \(\langle \cdot \rangle _\beta ^{\text {o}}\), that are staggered,

$$\begin{aligned} \langle {{\mathcal {G}}}^{\text {e}}\rangle _\beta ^{{\text {e}}}\ge 1-\varepsilon \text { and } \langle {{\mathcal {G}}}^{\text {o}}\rangle _\beta ^{{\text {o}}}\ge 1-\varepsilon . \end{aligned}$$
(2.7)

The proof of this theorem is the content of Sect. 3. For the technical estimates, we are restricting ourselves to the two-dimensional case \(d=2\). The proof of a similar claim for \(d> 2\) (with other \(\mu _0\) and \(\kappa _0\) depending on d) employing the same methods is straightforward but rather cumbersome.

Notice that for \(|\mu |<\mu _0\) we have \(\kappa _0(\mu )>0\). It is not so surprising that the claim is true for any negative \(\kappa \), since negative \(\kappa \) should trigger antiferromagnetic staggered order at low temperatures. More interesting is the case, established by the theorem, when this happens for positive \(\kappa \), where it is a demonstration of an effective entropic repulsion stemming from the quantum spin.
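The positivity of \(\kappa _0(\mu )\) follows from the identity \(\kappa _0(\mu )=2S(\mu _0-|\mu |)\), which is obtained by inserting \(\mu _0=\frac{1}{2}\frac{S+1}{S^2}\) into the definition of \(\kappa _0\). A small sketch with exact rational arithmetic, in which the function names and the parametrisation of the spin by the integer `two_s` \(=2S\) are illustrative choices, evaluates both thresholds:

```python
from fractions import Fraction


def mu0(two_s: int) -> Fraction:
    """Threshold mu_0 = (S + 1) / (2 S^2) for spin S = two_s / 2."""
    s = Fraction(two_s, 2)
    return Fraction(1, 2) * (s + 1) / s**2


def kappa0(two_s: int, mu: Fraction) -> Fraction:
    """Threshold kappa_0(mu) = (S + 1) / S - 2 |mu| S for spin S = two_s / 2."""
    s = Fraction(two_s, 2)
    return (s + 1) / s - 2 * abs(mu) * s


if __name__ == "__main__":
    for two_s in (1, 2, 3, 4):
        print(f"S = {Fraction(two_s, 2)}: mu0 = {mu0(two_s)}, "
              f"kappa0(0) = {kappa0(two_s, Fraction(0))}")
```

For spin 1/2 this gives \(\mu _0=3\) and \(\kappa _0(0)=3\), and \(\kappa _0\) decreases linearly to zero as \(|\mu |\) approaches \(\mu _0\).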

3 Proof of Theorem 2.1

3.1 Reflection Positivity for the Annealed Quantum Model

Consider now a splitting of the torus \({\mathbb {T}}_L\) into two disjoint halves, \({\mathbb {T}}_L={\mathbb {T}}_L^+\cup {\mathbb {T}}_L^-\), separated by a pair of planes, for example \(P_1=\{(-1/2,x_2,\ldots ,x_d)\}\) and \(P_2= \{(L/2-1/2,x_2,\ldots ,x_d)\}\), \(x_2,\ldots , x_d\in {\mathbb {R}}\). We introduce a reflection \(\theta :{\mathbb {T}}_L\rightarrow {\mathbb {T}}_L\) defined by \(\theta x=(-(x_1+1),x_2,\ldots ,x_d) \). Any such reflection (with parallel planes \(P_1\) and \(P_2\) at distance L/2, in arbitrary half-integer position and orthogonal to any coordinate axis) will be called a reflection through planes between the sites, or simply a reflection (we will not use the other reflections, through planes on the sites, that are useful for classical models). Notice that \(\theta \) maps \({\mathbb {T}}_L^+\) into \( {\mathbb {T}}_L^- \) and \(\theta ^2=1\).

Further, consider an algebra \({\mathfrak {A}}_L\) with two subalgebras \({\mathfrak {A}}_L^+, {\mathfrak {A}}_L^-\subset {\mathfrak {A}}_L\), \({\mathfrak {A}}_L={\mathfrak {A}}_L^+\otimes {\mathfrak {A}}_L^-\), living on the sets \({\mathbb {T}}_L^+, {\mathbb {T}}_L^- \), respectively. Namely, we define \({\mathfrak {A}}_L^+\) as a set of all operator-valued functions \(A: \{-1,1\}^{{\mathbb {T}}_L^+}\rightarrow {{\mathcal {M}}}_L^+\), where \({{\mathcal {M}}}_L^+\) is the set of all operators of the form \(I\otimes A^+\) with \(A^+\) acting on the subspace \(\otimes _{x\in {\mathbb {T}}_L^+}{\mathbb {C}}^{2S+1}\) and I is the identity on the complementary space \(\otimes _{x\in {\mathbb {T}}_L^-}{\mathbb {C}}^{2S+1}\). Similarly for \( {\mathfrak {A}}_L^-\).

The reflection \(\theta :{\mathbb {T}}_L^-\rightarrow {\mathbb {T}}_L^+\) can be naturally elevated to a morphism \(\theta : {\mathfrak {A}}_L^+\rightarrow {\mathfrak {A}}_L^-\) (cf. twisted reflections in [6, Sect. 3.4]) with \(\theta \) flipping the spin in the Ising configuration and rotating by \(\pi \) in the second coordinate direction of spins \(S_x\). More precisely, define the unitary operator

$$\begin{aligned} U=\prod _{x\in {\mathbb {T}}_L^{-}} e^{i\pi S_x^{(2)}} \end{aligned}$$
(3.1)

on the subspace \(\otimes _{x\in {\mathbb {T}}_L^-}{\mathbb {C}}^{2S+1}\) and, for \(\sigma \in \{-1,1\}^{{\mathbb {T}}_L}\), define \(\theta \sigma \) by

$$\begin{aligned} (\theta \sigma )_x=-\sigma _{\theta x}. \end{aligned}$$
(3.2)

Then for \(A\in {\mathfrak {A}}_L^+\) with \(A(\sigma ) = I \otimes A^+(\sigma )\) for any \(\sigma \in \{-1,1\}^{{\mathbb {T}}_L^+}\), we define the operator \(\theta A\in {\mathfrak {A}}_L^-\) by

$$\begin{aligned} \theta A(\sigma )=\overline{U^{-1}A^+(\theta \sigma )U}\otimes I, \qquad \sigma \in \{-1,1\}^{{\mathbb {T}}_L^-} . \end{aligned}$$
(3.3)

Here \({{\overline{A}}}\) denotes the complex conjugation of the operator A.

Note the effect of the reflection on spin operators: for any \(\alpha \in \{1,2,3\}\) and \(x\in {\mathbb {T}}_L^+\), we have \(\overline{U^{-1}S_x^{(\alpha )}U}=- S_{\theta x}^{(\alpha )}\) and thus

$$\begin{aligned} \theta S_x^{(\alpha )}= - S_{\theta x}^{(\alpha )}. \end{aligned}$$
(3.4)

Similarly, for the operator \(A(\sigma )=S^{(3)}_x\sigma _x\), we have

$$\begin{aligned} \theta A(\sigma )=(-S^{(3)}_{\theta x})(-\sigma _{\theta x})=S^{(3)}_{\theta x}\sigma _{\theta x} \end{aligned}$$
(3.5)

and for the operator \(A(\sigma )= \sigma _x i I\) with iI the multiple of a unit matrix by the imaginary unit i, we have

$$\begin{aligned} \theta A(\sigma )=(-\sigma _{\theta x})(-i I )= i \sigma _{\theta x}\, I. \end{aligned}$$
(3.6)
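The sign rule (3.4) can be checked numerically on a single site, ignoring the relabelling \(x\mapsto \theta x\). The sketch below builds the spin matrices in the standard basis in which \(S^{(3)}\) is diagonal, approximates \(U=e^{i\pi S^{(2)}}\) by a truncated Taylor series (an implementation shortcut that is adequate for small matrices), and measures how far \(\overline{U^{-1}S^{(\alpha )}U}\) is from \(-S^{(\alpha )}\). All function names are illustrative.

```python
import math


def spin_matrices(two_s):
    """Return (S1, S2, S3) for spin S = two_s / 2 in the basis m = S, S-1, ..., -S."""
    s = two_s / 2.0
    dim = two_s + 1
    s1 = [[0j] * dim for _ in range(dim)]
    s2 = [[0j] * dim for _ in range(dim)]
    s3 = [[0j] * dim for _ in range(dim)]
    for k in range(dim):
        m = s - k
        s3[k][k] = complex(m)
        if k + 1 < dim:
            # Matrix element <m| S^+ |m-1> = sqrt(S(S+1) - m(m-1)).
            c = math.sqrt(s * (s + 1) - m * (m - 1))
            s1[k][k + 1] = s1[k + 1][k] = complex(c / 2)  # S1 = (S^+ + S^-)/2
            s2[k][k + 1] = -0.5j * c                       # S2 = (S^+ - S^-)/(2i)
            s2[k + 1][k] = 0.5j * c
    return s1, s2, s3


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def exp_series(a, terms=80):
    """exp(a) by a truncated Taylor series (sufficient for these small matrices)."""
    n = len(a)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def reflection_defect(two_s):
    """Largest entry of conj(U^{-1} S U) + S over the three components."""
    s1, s2, s3 = spin_matrices(two_s)
    u = exp_series([[1j * math.pi * x for x in row] for row in s2])
    u_inv = exp_series([[-1j * math.pi * x for x in row] for row in s2])
    worst = 0.0
    for op in (s1, s2, s3):
        rotated = matmul(u_inv, matmul(op, u))
        for row_r, row_o in zip(rotated, op):
            for r, o in zip(row_r, row_o):
                worst = max(worst, abs(r.conjugate() + o))
    return worst


if __name__ == "__main__":
    for two_s in (1, 2, 3):
        print(two_s, reflection_defect(two_s))
```

The two mechanisms are visible in the construction: the rotation by \(\pi \) flips the real matrices \(S^{(1)}\) and \(S^{(3)}\) and leaves \(S^{(2)}\) unchanged, while the complex conjugation flips the purely imaginary \(S^{(2)}\).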

Finally, we say that a state \(\langle \varvec{\cdot }\rangle \) on \({\mathfrak {A}}_L \) is reflection positive with respect to \(\theta \) if for any \(A, B\in {\mathfrak {A}}_L^+\) we have

$$\begin{aligned} \left\langle A \theta B\right\rangle = \left\langle B\theta A\right\rangle \end{aligned}$$
(3.7)

and

$$\begin{aligned} \left\langle A\theta A\right\rangle \ge 0. \end{aligned}$$
(3.8)

The standard consequence of reflection positivity is the Cauchy-Schwarz inequality

$$\begin{aligned} \langle A \theta B\rangle ^2\le \langle A \theta A\rangle \langle B \theta B \rangle \end{aligned}$$
(3.9)

for any \(A, B\in {\mathfrak {A}}_L^+\).

In our situation of an annealed diluted quantum model, we are dealing with the state

$$\begin{aligned} \left\langle A\right\rangle _{L,\,\beta }= \frac{ \sum _{\sigma \in \{-1,1\}^{{\mathbb {T}}_L}} {{\text {Tr}}}\, A(\sigma )\mathrm{e}^{-\beta H_L(\sigma )}}{\sum _{\sigma \in \{-1,1\}^{{\mathbb {T}}_L}}{{\text {Tr}}}\, \mathrm{e}^{-\beta H_L(\sigma )}} \end{aligned}$$
(3.10)

for any \(A\in {\mathfrak {A}}_L\) and with the Hamiltonian \(H_L\in {\mathfrak {A}}_L\) of the form (2.1).

The standard proof of reflection positivity may be extended to this case.

Lemma 3.1

The state \( \langle \varvec{\cdot }\rangle _{L,\,\beta }\) is reflection positive for any \(\theta \) through planes between the sites and any \(\mu \in {\mathbb {R}}\), \(\kappa \le \tfrac{S+1}{S}\) and \(\beta \ge 0\).

Proof

The equality (3.7) is immediate. For (3.8) we first write the Hamiltonian \(H_L\) in the form \(H_L(\sigma ,\theta \sigma ')= J(\sigma )+ \theta J(\sigma ') -\sum _\alpha D_\alpha (\sigma )\, \theta D_\alpha (\sigma ')\) for any \(\sigma ,\sigma '\in \{-1,1\}^{{\mathbb {T}}_L^+}\) where \(J\in {\mathfrak {A}}_L^+\) consists of all terms of the Hamiltonian with (both) sites in \({\mathbb {T}}_L^+\) and \(D_\alpha \theta D_\alpha \), with \(D_\alpha \in {\mathfrak {A}}_L^+\) indexed by \(\alpha \), represent the terms corresponding to edges crossing the reflection plane.

Indeed, we define

$$\begin{aligned} J(\sigma )= & {} -\frac{1}{S^2}\sum _{\begin{array}{c} \{ x,y\}\\ x,y\in {\mathbb {T}}_L^+ \end{array}}\sigma _x\sigma _y(S_x^{(1)}S_y^{(1)}+S_x^{(3)}S_y^{(3)}\nonumber \\&-S(S+1))-\kappa \sum _{\begin{array}{c} \{ x,y\}\\ x,y\in {\mathbb {T}}_L^+ \end{array}}\sigma _x\sigma _y-\mu \sum _{x\in {\mathbb {T}}_L^+}\sigma _xS_x^{(3)} \end{aligned}$$
(3.11)

and note that, due to the definition of \(\theta \), \(\theta J(\sigma )\) is the same as \(J(\sigma )\) but with \({\mathbb {T}}_L^+\) replaced by \({\mathbb {T}}_L^-\). This is clear for the first two sums as we pick up four resp. two factors of \(-1\); for the last term note that we also pick up two factors of \(-1\), one from \(\theta S^{(3)}_x=-S^{(3)}_{\theta x}\) and one from \(\theta \sigma _x=-\sigma _{\theta x}\). If \(\{x,y\}\) is an edge crossing the reflection plane (i.e. \(x\in {\mathbb {T}}_L^+\), \(y=\theta x\in {\mathbb {T}}_L^-\)), the corresponding \(D_\alpha \)’s are

$$\begin{aligned} D^0_x=&\sqrt{\tfrac{S+1}{S}-\kappa }\, i\, \sigma _x \end{aligned}$$
(3.12)
$$\begin{aligned} D^1_x=&\frac{1}{S} \sigma _x\,S_x^{(1)} \end{aligned}$$
(3.13)
$$\begin{aligned} D^3_x=&\frac{1}{S} \sigma _x\,S_x^{(3)} \end{aligned}$$
(3.14)

If \(\kappa \le \tfrac{S+1}{S}\), we have

$$\begin{aligned} \left( \tfrac{S+1}{S}-\kappa \right) \, \sigma _x\sigma _{y}= -D_x^0\, \theta (D_x^0) \end{aligned}$$
(3.15)

since, in view of (3.2) and (3.6),

$$\begin{aligned} \sigma _x\sigma _{y}=- i \sigma _x\, i \sigma _y= - i \sigma _x\, \theta (i \sigma _x). \end{aligned}$$
(3.16)

Also \(\sigma _xS^{(\alpha )}_x\sigma _y S^{(\alpha )}_y=\sigma _xS^{(\alpha )}_x \theta (\sigma _xS^{(\alpha )}_x)\) for \(\alpha =1,3\).

For the claim (3.8) we need to show that

$$\begin{aligned} \sum _{\sigma , \sigma '\in \{-1,1\}^{{\mathbb {T}}_L^+}}{{\text {Tr}}}\, A(\sigma ) \theta A(\sigma ')\mathrm{e}^{-\beta H_L(\sigma ,\theta \sigma ')}\ge 0 \end{aligned}$$
(3.17)

for any \(A\in {\mathfrak {A}}_L^+\). Adapting the standard proof, see e.g. [8, Theorem 2.1], by Trotter’s formula we get

$$\begin{aligned} \mathrm{e}^{-\beta H_L(\sigma ,\theta \sigma ')}=\lim _{k\rightarrow \infty }\Bigl (\mathrm{e}^{-\frac{\beta }{k}J(\sigma )}\mathrm{e}^{-\frac{\beta }{k} \theta J(\sigma ')}\bigl [1+\tfrac{\beta }{k}\sum \nolimits _\alpha D_\alpha (\sigma ) \theta D_\alpha (\sigma ')\bigr ]\Bigr )^k=:\lim _{k\rightarrow \infty }F_k(\sigma , \sigma ').\nonumber \\ \end{aligned}$$
(3.18)

The needed claim will be verified once we show that

$$\begin{aligned} \sum _{\sigma ,\sigma '\in \{-1,1\}^{{\mathbb {T}}_L^+}}{{\text {Tr}}}\, \left( A(\sigma )\, \theta A(\sigma ')\,F_k(\sigma ,\sigma ') \right) \ge 0 \end{aligned}$$
(3.19)

for all k.

Indeed, proceeding exactly in the same way as in the proof of Theorem 2.1 in [8], we can conclude that for each \(\sigma ,\sigma ' \in \{-1,1\}^{{\mathbb {T}}_L^+}\) the operator \(F_k(\sigma ,\sigma ')\) can be written as a sum of terms of the form \(F_k^{(\ell )}(\sigma ) \theta F_k^{(\ell )}(\sigma ')\), where \(F_k^{(\ell )}\in {\mathfrak {A}}_L^+\). Each such term yields

$$\begin{aligned}&\sum _{\sigma ,\sigma '\in \{-1,1\}^{{\mathbb {T}}_L^+}}{{\text {Tr}}}\bigl (A(\sigma ) \theta A(\sigma ') F_k^{(\ell )}(\sigma ) \theta F_k^{(\ell )}(\sigma ')\bigr )\nonumber \\&\qquad = \sum _{\sigma ,\sigma '\in \{-1,1\}^{{\mathbb {T}}_L^+}} {{\text {Tr}}}\bigl (A(\sigma ) F_k^{(\ell )}(\sigma )\,\theta (AF_k^{(\ell )})(\sigma ')\bigr )= \left| \sum _{\sigma \in \{-1,1\}^{{\mathbb {T}}_L^+}}{{\text {Tr}}}\bigl (A(\sigma ) F_k^{(\ell )}(\sigma )\bigr )\right| ^2\ge 0\nonumber \\ \end{aligned}$$
(3.20)

thus completing the proof.\(\square \)

3.2 Chessboard Estimates

Consider \({\mathbb {T}}_L\) partitioned into \((L/2)^d\) disjoint \(2\times 2\times \cdots \times 2\) blocks \(C_{\tau }\subset {\mathbb {T}}_L\) labeled by vectors \(\tau \in {\mathbb {T}}_{L/2}\) with \(2\tau \) denoting the position of their lower left corner. Clearly, \(C_{\tau }=C+2 \tau \) with \(C_{0}=C\).
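The block decomposition can be made concrete with a short enumeration. In this sketch the sites of \({\mathbb {T}}_L\) are represented by coordinates in \(\{0,\ldots ,L-1\}^d\) rather than in \((-L/2,L/2]^d\), which is only a choice of representatives modulo L; the function name `blocks` is illustrative.

```python
from itertools import product


def blocks(L: int, d: int = 2):
    """Map each label tau in T_{L/2} to the sites of C_tau = C + 2*tau in T_L."""
    if L % 2:
        raise ValueError("L must be even")
    corner_offsets = list(product((0, 1), repeat=d))  # the basic block C = {0,1}^d
    result = {}
    for tau in product(range(L // 2), repeat=d):
        result[tau] = [
            tuple((2 * t + c) % L for t, c in zip(tau, offset))
            for offset in corner_offsets
        ]
    return result


if __name__ == "__main__":
    partition = blocks(4)
    for tau, sites in sorted(partition.items()):
        print(tau, sites)
```

Every site of the torus belongs to exactly one block, so the number of blocks is \((L/2)^d\) and each contains \(2^d\) sites.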

If \(\tau \in {\mathbb {T}}_{L/2}\) with \(|\tau |=1\), we let \(\theta _{\tau }\) be the reflection with respect to the plane between C and \(C_{\tau }\) corresponding to \(\tau \). Further, if \({{\mathcal {E}}}\) is a block event, \({{\mathcal {E}}}\subset \{-1,1\}^C\), we let \(\vartheta _{\tau }({{\mathcal {E}}})\subset \{-1,1\}^{C_{\tau }}\) be the correspondingly reflected event, \(\sigma \in {{\mathcal {E}}}\) iff \(\theta _{\tau }\sigma \in \vartheta _{\tau }({{\mathcal {E}}})\). For other \(\tau \)’s in \({\mathbb {T}}_{L/2}\) we define \(\vartheta _{\tau }({{\mathcal {E}}})\) by a sequence of reflections (note that the result does not depend on the choice of sequence leading from C to \(C_{\tau }\)). If all coordinates of \(\tau \) are even, this simply results in the translation by \(2\tau \).

Chessboard estimates are formulated in terms of a mean value of a homogenised pattern based on a block event \({{\mathcal {E}}}\) disseminated throughout the lattice,

$$\begin{aligned} {\mathfrak {q}}_{L,\,\beta }({{\mathcal {E}}}):=\left( \left\langle \prod _{\tau \in {\mathbb {T}}_{L/2}}\vartheta _{\tau }({{\mathcal {E}}})\right\rangle _{L,\,\beta }\right) ^{(2/L)^d}. \end{aligned}$$
(3.21)

If \(\kappa \le \tfrac{S+1}{S}\), \({{\mathcal {E}}}_1,\ldots ,{{\mathcal {E}}}_m\) are block events, and \(\tau _1,\ldots ,\tau _m\in {\mathbb {T}}_{L/2}\) are distinct, we get, by a standard repeated use of reflection positivity, the chessboard estimates

$$\begin{aligned} \Bigl \langle \prod _{j=1}^m\vartheta _{\tau _j}({{\mathcal {E}}}_j)\Bigr \rangle _{L,\,\beta }\le \prod _{j=1}^m\left( \left\langle \prod _{\tau \in {\mathbb {T}}_{L/2}}\vartheta _{\tau }({{\mathcal {E}}}_j)\right\rangle _{L,\,\beta }\right) ^{(2/L)^d}=\prod _{j=1}^m{\mathfrak {q}}_{L,\,\beta }({{\mathcal {E}}}_j). \end{aligned}$$
(3.22)

Note that we have chosen to split \({\mathbb {T}}_L\) into \(2\times 2\times \cdots \times 2\) blocks with the bottom left corner of the basic block C at the origin \((0,0,\ldots ,0)\). If we had instead replaced the basic block C by its shift \(C+e_1\) by the unit vector \(e_1=(1,0,\ldots ,0)\), the same estimate would hold with the new partition with all blocks shifted by \(e_1\). We will use this fact in the sequel.

The proof of the useful property of subadditivity of the function \({\mathfrak {q}}_{L,\,\beta }\) for classical systems [1, Lemma 5.9] can also be directly extended to our case.

Lemma 3.2

Suppose \(\kappa \le \tfrac{S+1}{S}\). If \({{\mathcal {E}}}, {{\mathcal {E}}}_1,{{\mathcal {E}}}_2,\ldots \) are events on C such that \({{\mathcal {E}}}\subset \cup _k {{\mathcal {E}}}_k\), then

$$\begin{aligned} {\mathfrak {q}}_{L,\,\beta }( {{\mathcal {E}}})\le \sum _k {\mathfrak {q}}_{L,\,\beta }({{\mathcal {E}}}_k). \end{aligned}$$
(3.23)

Proof

Using subadditivity of \(\langle \cdot \rangle _{L,\,\beta }\), we get

$$\begin{aligned} {\mathfrak {q}}_{L,\,\beta }({{\mathcal {E}}})^{(L/2)^d} =\bigg \langle \prod _{\tau \in {\mathbb {T}}_{L/2}}\vartheta _{\tau }({{\mathcal {E}}})\bigg \rangle _{L,\,\beta }\le \sum _{(k_{\tau })}\bigg \langle \prod _{\tau \in {\mathbb {T}}_{L/2}}\vartheta _{\tau }({{\mathcal {E}}}_{k_{\tau }})\bigg \rangle _{L,\,\beta } \end{aligned}$$
(3.24)

Using now the chessboard estimate

$$\begin{aligned} \bigg \langle \prod _{\tau \in {\mathbb {T}}_{L/2}}\vartheta _{\tau }({{\mathcal {E}}}_{k_{\tau }})\bigg \rangle _{L,\,\beta }\le \prod _{\tau \in {\mathbb {T}}_{L/2}} {\mathfrak {q}}_{L,\,\beta }({{\mathcal {E}}}_{k_{\tau }}), \end{aligned}$$
(3.25)

we get

$$\begin{aligned} {\mathfrak {q}}_{L,\,\beta }({{\mathcal {E}}})^{(L/2)^d}\le & {} \sum _{(k_{\tau })}\prod _{\tau \in {\mathbb {T}}_{L/2}} {\mathfrak {q}}_{L,\,\beta }({{\mathcal {E}}}_{k_{\tau }})\nonumber \\= & {} \prod _{\tau \in {\mathbb {T}}_{L/2}}\biggl (\sum _{k} {\mathfrak {q}}_{L,\,\beta }({{\mathcal {E}}}_{k})\biggr ) =\biggl (\sum _{k} {\mathfrak {q}}_{L,\,\beta }({{\mathcal {E}}}_{k})\biggr )^{(L/2)^d}. \end{aligned}$$
(3.26)

\(\square \)
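The last two equalities in (3.26) are a purely combinatorial identity: summing the product \(\prod _\tau {\mathfrak {q}}({{\mathcal {E}}}_{k_\tau })\) over all assignments \((k_\tau )\) of an index to every block gives the power of the sum. The sketch below verifies this by brute force with exact rationals; the values in `Q_VALUES` are hypothetical placeholders for \({\mathfrak {q}}_{L,\,\beta }({{\mathcal {E}}}_k)\), and `n_blocks` stands for \((L/2)^d\).

```python
from fractions import Fraction
from itertools import product
from math import prod


def sum_over_assignments(q_values, n_blocks):
    """Sum over all tuples (k_tau) of the product of q_values[k_tau] over the blocks."""
    indices = range(len(q_values))
    return sum(
        prod(q_values[k] for k in assignment)
        for assignment in product(indices, repeat=n_blocks)
    )


def power_of_sum(q_values, n_blocks):
    """(sum_k q_values[k]) ** n_blocks, the right-hand side of the identity."""
    return sum(q_values) ** n_blocks


# Hypothetical values standing in for q(E_1), q(E_2), q(E_3).
Q_VALUES = [Fraction(1, 10), Fraction(1, 5), Fraction(3, 10)]

if __name__ == "__main__":
    print(sum_over_assignments(Q_VALUES, 4), power_of_sum(Q_VALUES, 4))
```

Taking the \((L/2)^d\)-th root of both sides of (3.26) then gives (3.23).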

Let us introduce the set \({{\mathcal {B}}}\) of bad configurations, \({{\mathcal {B}}}=\{-1,1\}^C\setminus ( {{\mathcal {G}}}^{\text {e}}\cup {{\mathcal {G}}}^{\text {o}})\), and use \(\tau _{r}\) to denote the shift by \(r\in {\mathbb {T}}_L\). The proof of the existence of two distinct KMS states is based on the following lemma.

Lemma 3.3

There exist functions \(\mu _0,\kappa _0\) as stated in Theorem 2.1 such that, for any \(\varepsilon >0\), any \(\mu \) with \(|\mu |<\mu _0\) and any \(\kappa <\kappa _0(\mu )\), there exists \(\beta _0\) such that for any \(\beta >\beta _0\), any L sufficiently large, and any distinct \(\tau _1,\tau _2\in {\mathbb {T}}_{L}\),

$$\begin{aligned} \langle&{{\mathcal {B}}}\rangle _{L,\,\beta }<\varepsilon , \end{aligned}$$
(3.27)
$$\begin{aligned} \langle&\tau _{2\tau _1}({{\mathcal {G}}}^{\text {e}})\cap \tau _{2\tau _2}({{\mathcal {G}}}^{\text {o}})\rangle _{L,\,\beta }<\varepsilon . \end{aligned}$$
(3.28)

Deferring its proof to the next section, we show here how it implies Theorem 2.1.

Proof of Theorem 2.1 given Lemma 3.3

We closely follow the proof of Lemma 4.5 and Proposition 3.9 in [2]. Define

$$\begin{aligned} {\mathbb {T}}_L^{\text {front}}=\{x\in {\mathbb {T}}_L : -\lfloor L/4-1/2\rfloor \le x_1\le \lceil L/4-1/2\rceil \}. \end{aligned}$$
(3.29)

We denote by \({\mathfrak {A}}_L^{\text {front}}\) the algebra of observables localised in \({\mathbb {T}}_L^{\text {front}}\).

Let \(\Delta _M\subset {\mathbb {T}}_{L/2}\) be an \(M\times M\) block of sites on the “back” of \({\mathbb {T}}_{L/2}\) (\({\text {dist}}(0,\Delta _M)\ge L/4 -M\)). Then for a block event \({{\mathcal {E}}}\) depending only on the Ising configuration in C define

$$\begin{aligned} \rho _{L,M}({{\mathcal {E}}})=\frac{1}{|\Delta _M|}\sum _{\tau \in \Delta _M}\tau _{2\tau }({{\mathcal {E}}}). \end{aligned}$$
(3.30)

If \(\langle {{\mathcal {E}}}\rangle _{L,\,\beta }\ge c\) for all \(L\gg 1\) for a constant \(c>0\) then we can define a new state on \({\mathfrak {A}}^{\text {front}}_L\), by

$$\begin{aligned} \langle \cdot \rangle _{L,M;\beta }=\frac{\langle \rho _{L,M}({{\mathcal {E}}})\;\cdot \;\rangle _{L,\,\beta }}{ \langle \rho _{L,M}({{\mathcal {E}}})\rangle _{L,\,\beta }}. \end{aligned}$$
(3.31)

We claim that if \(\langle \,\,\rangle _\beta \) is a weak limit of \(\langle \,\, \rangle _{L,M;\beta }\) as first \(L\rightarrow \infty \) and then \(M\rightarrow \infty \), then \(\langle \,\,\rangle _\beta \) is a KMS state at inverse temperature \(\beta \), invariant under translations by \(2\tau \) for \(\tau \in {\mathbb {T}}_{L}\).

Indeed, translation invariance comes from the spatial averaging in \(\rho _{L,M}({{\mathcal {E}}})\). As in [2] we need to show that \(\langle \,\, \rangle _{\beta }\) satisfies the KMS condition (2.6). For an observable A on the “front” of the torus, \({\mathbb {T}}_L^{\text {front}}\), we have

$$\begin{aligned}{}[\alpha _t^{(L)}(A),\rho _{L,M}({{\mathcal {E}}})]\rightarrow 0 \text { as } L\rightarrow \infty \end{aligned}$$
(3.32)

in norm topology uniformly for t in compact subsets of \({\mathbb {C}}\). Using this and (2.6) for the finite volume Gibbs states we have that, for bounded operators A, B on the “front” of the torus,

$$\begin{aligned} \langle \rho _{L,M}({{\mathcal {E}}}) AB\rangle _{L,\,\beta }=\langle \rho _{L,M}({{\mathcal {E}}}) \alpha ^{(L)}_{-i\beta }(B)A\rangle _{L,\,\beta } +o(1) \text { as } L\rightarrow \infty . \end{aligned}$$
(3.33)

Because \(\alpha ^{(L)}_{-i\beta }(B)\rightarrow \alpha _{-i\beta }(B)\) as \(L\rightarrow \infty \) in norm we have that \(\langle \,\,\rangle _{L,M;\beta }\) converges as \(L\rightarrow \infty \) and then \(M\rightarrow \infty \) to a KMS state at inverse temperature \(\beta \).

The proof of Theorem 2.1 follows by taking \({{\mathcal {E}}}={{\mathcal {G}}}^{\text {e}}\) or \({{\mathcal {E}}}={{\mathcal {G}}}^{\text {o}}\). Since both staggered configurations have the same expectation, Lemma 3.3 implies that \(\langle \rho _{L,M}({{\mathcal {G}}}^{\text {e}})\rangle _{L,\,\beta }\) is uniformly positive. We can therefore define the state \(\langle \,\, \rangle ^{\text {e}}_{L,M;\beta }\) as in (3.31), and Lemma 3.3 then yields

$$\begin{aligned} \langle \tau _{2\tau }({{\mathcal {G}}}^{\text {e}}) \rangle ^{\text {e}}_{L,M;\beta }\ge 1-\varepsilon , \end{aligned}$$
(3.34)

for any \(\tau \in {\mathbb {T}}_L^{\text {front}}\) (if \(M\ll L/2\)) and similarly for \(\langle \,\, \rangle ^{\text {o}}_{L,M;\beta }\). If \(\varepsilon \) is small enough then the right-hand side of this inequality will be greater than 1/2, hence in the thermodynamic limit \({{\mathcal {G}}}^{\text {e}}\) will dominate. \(\square \)

To prove Lemma 3.3 we use Peierls’ argument hinging on chessboard estimates in a version inspired by the proof of Lemma 4.2 in [2].

3.3 Peierls’ Argument

For a given Ising configuration, consider the event \(\tau _{2\tau _1}({{\mathcal {G}}}^{\text {e}})\cap \tau _{2\tau _2}({{\mathcal {G}}}^{\text {o}})\) that the blocks \(C_{\tau _1}\) and \(C_{\tau _2}\) have different staggered configurations described by \({{\mathcal {G}}}^{\text {e}}\) and \({{\mathcal {G}}}^{\text {o}}\), respectively. The idea is to show the existence of a contour separating the points \(\tau _1\) and \(\tau _2\) and to use chessboard estimates to show that occurrence of such a contour is improbable (Fig. 1).

Fig. 1

A contour bordering the region \({{\overline{\Delta }}}\) separating blocks \(C_{\tau _1}\) and \(C_{\tau _2}\) with configurations \({{\mathcal {G}}}^{\text {e}}\) and \({{\mathcal {G}}}^{\text {o}}\), respectively. Spins with values \(+1\), \(-1\), are denoted by \(+\), \(\circ \). The solid line represents the boundary of the region \({{\overline{\Delta }}}\), with the minimal cutset \(\gamma \) consisting of 20 edges corresponding to pairs of blocks touching the boundary from both sides: 12 edges aligned in direction \(e_2\) (represented by horizontal pieces of the boundary) and 8 edges aligned in direction \(e_1\). The darker shaded blocks form the collection \(S(\gamma )\): at least 1/2 of the \(n/(2d)\) bad blocks, all belonging to the same partition (in our case the new partition of \({\mathbb {T}}_L\) with the basic block C shifted by a unit vector from \({\mathbb {T}}_L\) in the direction \(e_2\))

Consider the set of all blocks (labeled by) \(\tau \in {\mathbb {T}}_{L/2}\) such that a translation of the even staggered configuration \(\tau _{2\tau }({{\mathcal {G}}}^{\text {e}})\) occurs on it. Let \(\Delta \subset {\mathbb {T}}_{L/2}\) be its connected component containing \(\tau _1\). Consider the component \({{\overline{\Delta }}}\subset {\mathbb {T}}_{L/2}\) of \(\Delta ^\text {c}\) containing \(\tau _2\). The set of edges \(\gamma \) of the graph \({\mathbb {T}}_{L/2}\) between vertices of \({{\overline{\Delta }}}\) and its complement \({{\overline{\Delta }}}^\text {c}\) is a minimal cutset of \(\Delta \). Informally, \(\gamma \) is a contour between \(\Delta \) with all its holes except the one containing \(\tau _2\) filled up and the remaining component containing \(\tau _2\), that is, a contour separating \(\tau _1\) and \(\tau _2\). The standard fact is that the number of contours with a fixed number of edges \({|\gamma |}=n\) separating two vertices \(\tau _1\) and \(\tau _2\) is bounded by \(c^n\) with a suitable constant c.

Given a contour \(\gamma \) of length \({|\gamma |}=n\), there exists a coordinate direction such that there are at least \(n/d\) edges in \(\gamma \) aligned along this direction. Assuming, without loss of generality, that this is the “vertical” axis \(e_d\), precisely half of those edges have their outer endpoint (the vertex in \({{\overline{\Delta }}}\)) “below” their inner endpoint. Thus, we can assume that there are at least \(n/(2d)\) edges \(\{\tau , \tau +e_d\}\) such that \(\tau \in {{\overline{\Delta }}}\) and \(\tau +e_d\in \Delta \).

Now, the crucial claim is that with each contour we can associate at least 1/2 of the \(n/(2d)\) bad blocks (with a configuration from \(\vartheta _{2\tau }({{\mathcal {B}}})\)), all belonging to the same fixed partition: either to our original partition of \({\mathbb {T}}_L\) labelled by \({\mathbb {T}}_{L/2}\) or to a new partition of \({\mathbb {T}}_L\) with the basic block C shifted by a unit vector from \({\mathbb {T}}_L\) in direction \(e_d\). Indeed, any block corresponding to an outer vertex \(\tau \) is either bad or, if not, it has to be a translation \(\tau _{2\tau }({{\mathcal {G}}}^{\text {o}})\) of the odd staggered configuration (being the even staggered configuration would be in contradiction with the assumption that \(\Delta \) is a connected component of the set of blocks with even staggered configuration). However, then the block shifted by a unit vector in \({\mathbb {T}}_L\) in direction \(e_d\) features an odd staggered configuration on its lower half and an even staggered configuration on its upper half, i.e., a configuration that belongs to the properly shifted set \({{\mathcal {B}}}\) (here it is helpful that the set \({{\mathcal {B}}}\) is invariant with respect to the reflection through the middle plane of the block).

We use \(S(\gamma )\) to denote this collection of at least \({|\gamma |}/(4d)\) bad blocks associated with contour \(\gamma \). Given that, according to the construction above, all blocks from \(S(\gamma )\) belong to the same partition (either the original one or a shifted one), we can use the chessboard estimate based on the corresponding partition to bound the probability that all blocks of a given set \(S(\gamma )\) are bad by

$$\begin{aligned} \bigg \langle \prod _{\tau \in S(\gamma )}\vartheta _{\tau }({{\mathcal {B}}})\bigg \rangle _{L,\,\beta }\le {\mathfrak {q}}_{L,\,\beta }({{\mathcal {B}}})^{{|S(\gamma )|}}. \end{aligned}$$
(3.35)

As a result, assuming that \({\mathfrak {q}}_{L,\,\beta }({{\mathcal {B}}})\le 1\) (we will later show it can be made arbitrarily small), the expectation of the event \( \tau _{2\tau _1}({{\mathcal {G}}}^{\text {e}})\cap \tau _{2\tau _2}({{\mathcal {G}}}^{\text {o}})\) is bounded by

$$\begin{aligned} \bigg \langle \tau _{2\tau _1}({{\mathcal {G}}}^{\text {e}})\cap \tau _{2\tau _2}({{\mathcal {G}}}^{\text {o}})\bigg \rangle _{L,\,\beta }\le \sum _{\gamma \text { separating } \tau _1 \text { and } \tau _2} {\mathfrak {q}}_{L,\,\beta }({{\mathcal {B}}})^{{|\gamma |}/(4d)} 2^{{|\gamma |}/(2d)+1}. \end{aligned}$$
(3.36)

Here, \(2^{{|\gamma |}/(2d)+1} \) is the bound on the number of sets \(S(\gamma )\) associated with the contour \(\gamma \) once the direction \(e_d\) is chosen.

This leads to the final bound

$$\begin{aligned} \bigg \langle \tau _{2\tau _1}({{\mathcal {G}}}^{\text {e}})\cap \tau _{2\tau _2}({{\mathcal {G}}}^{\text {o}})\bigg \rangle _{L,\,\beta }\le \sum _{n=4}^\infty 2\bigl (4{\mathfrak {q}}_{L,\,\beta }({{\mathcal {B}}})\bigr )^{n/(4d)} c^n. \end{aligned}$$
(3.37)

We now see that Lemma 3.3 will hold if \({\mathfrak {q}}_{L,\,\beta }({{\mathcal {B}}})\) can be made arbitrarily small by tuning the parameters of the model correctly. Hence we turn our attention to this.
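To get a feel for how small \({\mathfrak {q}}_{L,\,\beta }({{\mathcal {B}}})\) has to be, the contour sum can be evaluated as a geometric series. The Python sketch below assumes the summand \(2\,(4{\mathfrak {q}})^{n/(4d)}c^n\), which is what (3.36) gives once the number of contours of length n is bounded by \(c^n\). The function names and the numerical value of c are illustrative assumptions only; the text asserts no more than the existence of a suitable constant c.

```python
def peierls_bound(q, c, d=2, n_min=4):
    """Sum over n >= n_min of 2 * (4q)^(n/(4d)) * c^n.

    Returns None when the geometric series does not converge.
    The constant c is a placeholder for the contour-counting constant.
    """
    r = c * (4.0 * q) ** (1.0 / (4 * d))  # ratio of the geometric series
    if r >= 1.0:
        return None
    return 2.0 * r ** n_min / (1.0 - r)


def q_threshold(c, d=2):
    """Value of q at which the ratio r reaches 1, i.e. q = c^(-4d) / 4."""
    return c ** (-4 * d) / 4.0
```

For instance, with the hypothetical value c = 3 in d = 2 the series converges only for q below roughly 3.8e-5, which illustrates why the bound on \({\mathfrak {q}}_{L,\,\beta }({{\mathcal {B}}})\) must be made very small rather than merely less than 1.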

For the remaining technical part of this section we restrict ourselves to the two-dimensional case.

For \(d=2\), the set \({{\mathcal {B}}}\) consists of 14 configurations that can be classified into five events according to the number of sites in C that have Ising spin \(+1\), \({{\mathcal {B}}}={{\mathcal {B}}}^{(0)}\cup {{\mathcal {B}}}^{(1)}\cup {{\mathcal {B}}}^{(2)}\cup {{\mathcal {B}}}^{(3)}\cup {{\mathcal {B}}}^{(4)}\). Here, \({{\mathcal {B}}}^{(0)}\) and \({{\mathcal {B}}}^{(4)}\) consist of a single configuration (fully \(-1\) and fully \(+1\), respectively) and \({{\mathcal {B}}}^{(1)},{{\mathcal {B}}}^{(2)},{{\mathcal {B}}}^{(3)}\) consist each of 4 configurations related by symmetries. Notice that the event \({{\mathcal {B}}}^{(2)}\) has precisely two \(+1\) spins at neighbouring positions (excluding the configurations \(\sigma ^{\text {e}}\) and \(\sigma ^{\text {o}}\)). The case of \({{\mathcal {B}}}^{(3)}\) is depicted in Fig. 2.
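The count of 14 configurations and their split into five classes can be checked by brute force. The Python sketch below assumes that \({{\mathcal {B}}}\) consists of all Ising configurations on the \(2\times 2\) block C other than the two staggered ones (consistent with \(16-2=14\)); the helper names are hypothetical.

```python
from itertools import product

SITES = [(0, 0), (1, 0), (0, 1), (1, 1)]  # the four sites of the 2x2 block C


def is_staggered(conf):
    """True if every nearest-neighbour pair inside the block has opposite signs."""
    return (conf[(0, 0)] != conf[(1, 0)] and conf[(0, 0)] != conf[(0, 1)]
            and conf[(1, 1)] != conf[(1, 0)] and conf[(1, 1)] != conf[(0, 1)])


def bad_configurations():
    """Group all non-staggered block configurations by their number of +1 spins."""
    classes = {k: [] for k in range(5)}
    for values in product((-1, 1), repeat=4):
        conf = dict(zip(SITES, values))
        if not is_staggered(conf):
            classes[sum(v == 1 for v in values)].append(conf)
    return classes


def plus_sites_adjacent(conf):
    """For a configuration with exactly two +1 spins: are they nearest neighbours?"""
    (x1, y1), (x2, y2) = [s for s in SITES if conf[s] == 1]
    return abs(x1 - x2) + abs(y1 - y2) == 1
```

The sizes of the classes come out as 1, 4, 4, 4, 1, and every configuration in the class with two \(+1\) spins has them at neighbouring sites.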

Fig. 2

Example of a disseminated pattern obtained by reflections of a configuration from \({{\mathcal {B}}}^{(3)}\) on C (inside the shaded box). Notice that the configurations on blocks shifted by \(\tau \) with both coordinates even are just translations of the original configuration on C

By subadditivity we can bound \({\mathfrak {q}}_{L,\,\beta }({{\mathcal {B}}})\) by the sum of expectations of homogenised patterns based on the fourteen configurations from \({{\mathcal {B}}}\) disseminated throughout the lattice by reflections. In view of the symmetries, we need only consider 5 configurations \(\sigma ^{(k)}, k=0,1,\ldots ,4\), one from each event \({{\mathcal {B}}}^{(k)}, k=0,1,\ldots ,4\). In fact, as reflections flip the sign of the Ising variables, we need only consider \(k=0,1,2\). Indeed, the dissemination of pattern \({{\mathcal {B}}}^{(0)}\) differs from the dissemination of pattern \({{\mathcal {B}}}^{(4)}\) by a shift by \(2e_1\), and the dissemination of pattern \({{\mathcal {B}}}^{(1)}\) differs from the dissemination of pattern \({{\mathcal {B}}}^{(3)}\) by a shift by \(2e_1\) and a rotation.

We use \(Z^{(k)}_L(\beta )\) to denote the corresponding quantities

$$\begin{aligned} Z^{(k)}_L(\beta )={\mathfrak {q}}_{L,\,\beta }(\{\sigma ^{(k)}\})^{(L/2)^2}Z_L(\beta ), \end{aligned}$$
(3.38)

for \(k\in \{0,1,\ldots ,4\}\). For notational consistency we also denote the contributions of the staggered configurations on \({\mathbb {T}}_L\) by \(Z^{(\text {e})}_L(\beta )\) and \( Z^{(\text {o})}_L(\beta )\).

Lemma 3.4

For any \(\mu \in {\mathbb {R}}\) and \(\kappa <\kappa _0(\mu )\) we have

$$\begin{aligned} Z_L^{(0)}(\beta ),Z_L^{(4)}(\beta )\le&e^{\beta L^2 |\mu |\ S } {{\text {Tr}}}\exp \left\{ \frac{\beta }{S^2}\sum _{\{ x,y\}}(S_x^{(1)}S_y^{(1)}+S_x^{(3)}S_y^{(3)})\right\} , \end{aligned}$$
(3.39)
$$\begin{aligned} Z_L^{(1)}(\beta ),Z_L^{(2)}(\beta ),Z_L^{(3)}(\beta )\le&e^{\beta L^2 \left( |\mu |\ S-\kappa +\tfrac{S+1}{S}\right) } {{\text {Tr}}}\exp \left\{ \frac{\beta }{S^2}\sum _{\{ x,y\}}(S_x^{(1)}S_y^{(1)}+S_x^{(3)}S_y^{(3)})\right\} , \end{aligned}$$
(3.40)
$$\begin{aligned} Z^{(\text {e})}_L(\beta ),Z^{(\text {o})}_L(\beta )\ge&e^{\beta L^2\left( -|\mu |S-2\kappa +2\tfrac{S+1}{S}\right) }{{\text {Tr}}}\exp \left\{ \frac{\beta }{S^2}\sum _{\{ x,y\}}(S_x^{(1)}S_y^{(1)}+S_x^{(3)}S_y^{(3)})\right\} . \end{aligned}$$
(3.41)

Proof

We begin by removing the terms associated to \(S(S+1),\kappa \) and \(\mu \) from the Hamiltonian, i.e., we need bounds on the terms \((-\tfrac{S+1}{S}+\kappa )\sum _{\{ x,y\}}\sigma _x^{(k)}\sigma _y^{(k)}\) and \(\mu \sum _{x\in {\mathbb {T}}_L}\sigma ^{(k)}_xS_x^{(3)}\) (occurring in \(-H\)), for \(\sigma ^{(k)}\), the Ising configuration corresponding to the disseminated pattern \({{\mathcal {B}}}^{(k)}\).

For the first term we use that \(\sigma _x^{(k)}\sigma _y^{(k)}=\pm 1\) for each \(\{x,y\}\). In particular, we get \(\sum _{\{ x,y\}}\sigma _x^{(k)}\sigma _y^{(k)}=0\) for \(k=0,4\), it equals \(-L^2\) for \(k=1,2,3\), and it equals \(-2L^2\) for \(k=\text {e},\text {o}\). Indeed, for \(\sigma ^{(0)}\) and \(\sigma ^{(4)}\) half of the links yield \(-1\) (they are between a plus and a minus) and the other half yield \(+1\). For \(\sigma ^{(1)}\), \(\sigma ^{(2)}\), and \(\sigma ^{(3)}\) three quarters of the links yield \(-1\) and one quarter \(+1\). Finally, for \(k=\text {e}\) and \(k=\text {o}\) all links yield \(-1\).
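These bond sums can be checked numerically. The Python sketch below rests on an assumption about the dissemination rule that is inferred from the surrounding text and not spelled out in this section: the block C is propagated by reflections through the planes between neighbouring blocks, and each reflection also flips the sign of the Ising spins. With this rule the torus side L must be a multiple of 4 for the pattern to close up. The function names are hypothetical.

```python
def disseminate(block, L):
    """Spread a 2x2 block over the L x L torus by sign-flipping reflections.

    block maps (i, j) with i, j in {0, 1} to +1 or -1.
    Assumed rule: each reflection across a block boundary mirrors the
    block and flips all Ising spins, so L must be a multiple of 4.
    """
    assert L % 4 == 0
    sigma = {}
    for x in range(L):
        for y in range(L):
            a, i = divmod(x, 2)  # block index and in-block coordinate
            b, j = divmod(y, 2)
            ii = i if a % 2 == 0 else 1 - i  # mirrored in odd blocks
            jj = j if b % 2 == 0 else 1 - j
            sigma[(x, y)] = (-1) ** (a + b) * block[(ii, jj)]
    return sigma


def bond_sum(sigma, L):
    """Sum of sigma_x * sigma_y over the 2 * L^2 nearest-neighbour bonds of the torus."""
    total = 0
    for (x, y), s in sigma.items():
        total += s * sigma[((x + 1) % L, y)]
        total += s * sigma[(x, (y + 1) % L)]
    return total


def make_block(plus_sites):
    """2x2 block with +1 on the listed sites and -1 elsewhere."""
    return {(i, j): (1 if (i, j) in plus_sites else -1)
            for i in (0, 1) for j in (0, 1)}
```

Under the assumed rule every bond crossing a block boundary contributes \(-1\), so only the four bonds inside each block distinguish the cases: their sum is \(+4\), \(0\) and \(-4\) for \(k=0,4\), for \(k=1,2,3\) and for the staggered configurations, respectively.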

For the \(\mu \)-term we use the simple bound

$$\begin{aligned} -|\mu |\ S L^2\le \mu \left\| \sum _{x\in {\mathbb {T}}_L}\sigma _xS_x^{(3)}\right\| \le |\mu | S L^2. \end{aligned}$$
(3.42)

Together this gives the factors in front of the traces in equations (3.39), (3.40), and (3.41). What remains in each case is a term of the form

$$\begin{aligned} -\frac{1}{S^2}\sum _{\{ x,y\}}\sigma ^{(k)}_x\sigma ^{(k)}_y(S_x^{(1)}S_y^{(1)}+S_x^{(3)}S_y^{(3)}) \end{aligned}$$
(3.43)

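where \(k\in \{0,1,\ldots ,4,\text {e},\text {o}\}\). By conjugating with a unitary operator acting as \(e^{i\pi S_x^{(2)}}\) on the sites x where \(\sigma ^{(k)}_x=-1\) (a rotation by \(\pi \) about the second axis, which maps \(S_x^{(1)}\mapsto -S_x^{(1)}\) and \(S_x^{(3)}\mapsto -S_x^{(3)}\)) we can turn this operator into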

$$\begin{aligned} -\frac{1}{S^2}\sum _{\{ x,y\}}(S_x^{(1)}S_y^{(1)}+S_x^{(3)}S_y^{(3)}). \end{aligned}$$
(3.44)

As we have conjugated by a unitary operator this conjugation does not affect the trace. This completes the proof. \(\square \)
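The sign flip produced by the conjugation can be verified directly in the simplest case. The Python sketch below treats only \(S=\tfrac{1}{2}\), where \(S^{(\alpha )}\) is half the corresponding Pauli matrix and the rotation by \(\pi \) about the second axis is \(e^{i\pi S^{(2)}}=i\sigma _y\); this restriction and the helper names are assumptions made for illustration, not part of the proof for general S.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def neg(a):
    """Entrywise negation."""
    return [[-entry for entry in row] for row in a]


# Spin-1/2 operators: S^(alpha) = (Pauli matrix) / 2
S1 = [[0, 0.5], [0.5, 0]]
S2 = [[0, -0.5j], [0.5j, 0]]
S3 = [[0.5, 0], [0, -0.5]]

# For spin 1/2: exp(i*pi*S2) = cos(pi/2)*I + i*sin(pi/2)*sigma_y = i*sigma_y
U = [[0, 1], [-1, 0]]


def conjugate(op):
    """Return U op U^dagger."""
    return matmul(matmul(U, op), dagger(U))
```

Since the rotation sends \(S^{(1)}\) and \(S^{(3)}\) to their negatives and leaves \(S^{(2)}\) unchanged, applying it at every site with \(\sigma _x=-1\) replaces \(\sigma _x\sigma _yS_x^{(\alpha )}S_y^{(\alpha )}\) by \(S_x^{(\alpha )}S_y^{(\alpha )}\) for \(\alpha =1,3\) on every bond.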

As a result, we get the following bounds on the expectations of the disseminated bad configurations \({\mathfrak {q}}_{L,\,\beta }(\{\sigma ^{(k)}\})\) for \(k=0,1,\ldots ,4\).

Lemma 3.5

Let \(\mu \in {\mathbb {R}}\) and \(\kappa <\kappa _0(\mu )\). We have

$$\begin{aligned} {\mathfrak {q}}_{L,\beta }(\{\sigma ^{(0)}\}),{\mathfrak {q}}_{L,\beta }(\{\sigma ^{(4)}\})\le&2^{-4/L^2}\exp \left\{ 4\beta \left( 2|\mu | S+2\kappa -2\tfrac{S+1}{S}\right) \right\} \end{aligned}$$
(3.45)
$$\begin{aligned} {\mathfrak {q}}_{L,\beta }(\{\sigma ^{(1)}\}),{\mathfrak {q}}_{L,\beta }(\{\sigma ^{(2)}\}),{\mathfrak {q}}_{L,\beta }(\{\sigma ^{(3)}\})\le&2^{-4/L^2}\exp \left\{ 4\beta \left( 2|\mu | S+\kappa -\tfrac{S+1}{S}\right) \right\} \end{aligned}$$
(3.46)

Proof

All the estimates follow from the previous lemmas using

$$\begin{aligned} {\mathfrak {q}}_{L,\,\beta }(\{\sigma ^{(k)}\})=\left( \frac{Z_L^{(k)}(\beta )}{Z_L(\beta )}\right) ^{(2/L)^2}\le \left( \frac{Z_L^{(k)}(\beta )}{2 Z^{(\text {e})}_L(\beta )}\right) ^{(2/L)^2}. \end{aligned}$$
(3.47)

\(\square \)
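As a worked instance of this step, consider \(k=0\). Assuming, as the notation suggests, that \(Z_L(\beta )\ge Z^{(\text {e})}_L(\beta )+Z^{(\text {o})}_L(\beta )=2Z^{(\text {e})}_L(\beta )\), the traces in (3.39) and (3.41) cancel and we get

$$\begin{aligned} \frac{Z_L^{(0)}(\beta )}{2Z^{(\text {e})}_L(\beta )}\le \frac{e^{\beta L^2 |\mu | S}}{2\,e^{\beta L^2\left( -|\mu |S-2\kappa +2\tfrac{S+1}{S}\right) }}=\tfrac{1}{2}\exp \left\{ \beta L^2\left( 2|\mu | S+2\kappa -2\tfrac{S+1}{S}\right) \right\} . \end{aligned}$$

Raising both sides to the power \((2/L)^2=4/L^2\) yields (3.45). Replacing (3.39) by (3.40) in the numerator gives the exponent \(\beta L^2\left( 2|\mu | S+\kappa -\tfrac{S+1}{S}\right) \) and hence (3.46) in the same way.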

Further, using subadditivity (Lemma 3.2) we have

$$\begin{aligned} {\mathfrak {q}}_{L,\,\beta }({{\mathcal {B}}})\le {\mathfrak {q}}_{L,\,\beta }(\{\sigma ^{(0)}\}) +4 \sum _{k=1}^3 {\mathfrak {q}}_{L,\,\beta }(\{\sigma ^{(k)}\}) +{\mathfrak {q}}_{L,\,\beta }(\{\sigma ^{(4)}\}). \end{aligned}$$
(3.48)

From Lemma 3.5 we can see that for \(\beta \) large this quantity will be small if

$$\begin{aligned} \kappa < \min \{1+\tfrac{1}{S}-|\mu | S,1+\tfrac{1}{S}-2|\mu |S\}=1+\tfrac{1}{S}-2|\mu | S=:\kappa _0(\mu ). \end{aligned}$$
(3.49)

This condition is compatible with the requirement \(\kappa \le 1+\tfrac{1}{S}\) in Lemma 3.2 and allows us to take \(\kappa >0\) once \(|\mu |<\tfrac{1}{2S}+\tfrac{1}{2S^2}\).
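The interplay between (3.45), (3.46), (3.48) and (3.49) can be explored numerically. The Python sketch below simply evaluates the stated bounds; the function names and the sample parameter values are illustrative assumptions.

```python
import math


def kappa0(mu, S):
    """Threshold from (3.49): 1 + 1/S - 2|mu|S."""
    return 1.0 + 1.0 / S - 2.0 * abs(mu) * S


def q_bad_bound(beta, mu, kappa, S, L):
    """Right-hand side of (3.48) with the bounds (3.45) and (3.46) inserted."""
    prefactor = 2.0 ** (-4.0 / L ** 2)
    c = (S + 1.0) / S
    # bound (3.45), valid for k = 0 and k = 4
    q04 = prefactor * math.exp(4 * beta * (2 * abs(mu) * S + 2 * kappa - 2 * c))
    # bound (3.46), valid for k = 1, 2, 3
    q123 = prefactor * math.exp(4 * beta * (2 * abs(mu) * S + kappa - c))
    # (3.48): one term each for k = 0, 4 and four terms each for k = 1, 2, 3
    return 2 * q04 + 4 * 3 * q123
```

For \(S=\tfrac{1}{2}\) the threshold is \(\kappa _0(0)=3\), and \(\kappa _0(\mu )\) reaches zero at \(|\mu |=3\), in agreement with \(\tfrac{1}{2S}+\tfrac{1}{2S^2}\). For any \(\kappa <\kappa _0(\mu )\) both exponents are negative, so the bound decays exponentially in \(\beta \).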

More precisely, we see that there exist \(\mu _0>0\) and a function \(\kappa _0\) that is positive on \((-\mu _0,\mu _0)\) such that if \(|\mu |<\mu _0\), \(\kappa <\max (\kappa _0(\mu ),0)\), and \(\varepsilon >0\), there exists \(\beta _0(\mu ,\kappa ,\varepsilon )\) such that the claims of Lemma 3.3 and thus also Theorem 2.1 are valid for any \(\beta \ge \beta _0\).