1 Introduction and Main Results

1.1 Introduction

Spin systems in equilibrium have been studied by a variety of methods which led to a very complete mathematical description of the physical phenomena occurring in the different regimes of the phase diagrams. This includes in particular a good understanding of the critical phenomena in a wide range of models. Much less is known about the Glauber dynamics of spin systems. For sufficiently high temperatures, it is well understood that the dynamics relaxes exponentially fast towards the equilibrium measure. For the Ising model, the much more difficult question of fast relaxation in the entire uniqueness regime was addressed in [22, 46, 50, 51]. In the phase transition regime, at least for scalar spins, the dynamical behaviour is governed by the interface motion and the relaxation becomes much slower. In particular, the relaxation time diverges as the system size increases, but the dynamical scaling depends strongly on the choice of the boundary conditions. We refer to [49] for a review, as well as to [21, 44] for more recent results. In the vicinity of the critical point, strong correlations develop and as a consequence the dynamic evolution slows down but is no longer driven by phase separation. Even though the critical dynamical behaviour has been well investigated in physics [36], mathematical results are scarce. The only cases for which polynomial lower bounds on the relaxation or mixing times are known are the two-dimensional Ising model exactly at the critical point [45] and the Ising model on a tree [27], both without sharp exponents, and the mean-field Ising model, which is fully understood [26, 42].

The goal of this paper is to investigate the dynamical relaxation of hierarchical models near and at the critical point by deriving the scaling of the spectral gap in terms of the temperature (or the equivalent parameter of the model) and the system size.

Since their introduction by Dyson [28] and the pioneering work of Bleher–Sinai [11], hierarchical models have been a stepping stone for developing renormalisation group arguments. At equilibrium, sharp results on the critical behaviour of a large class of models have typically been obtained first in a hierarchical framework and later been extended to the Euclidean lattice. For the equilibrium problem, the hierarchical framework results in a significant technical simplification, but the results and methods have turned out to be surprisingly parallel to the case of the Euclidean lattice \(\mathbb {Z}^d\). This point of view is discussed in detail in [9], to which we also refer for an overview of results and references. Building on the results for the hierarchical set-up for the equilibrium problem, we derive recursive relations on the spectral gap after one renormalisation step. This enables us to obtain the sharp asymptotic behaviour of the spectral gap for the large-volume Sine-Gordon model in the rough (Kosterlitz–Thouless) phase and for the \(|\varphi |^4\) model in the vicinity of the critical point. The scaling coincides in both cases with that of the hierarchical free field dynamics (with a logarithmic correction for the \(|\varphi |^4\) model), which describes the equilibrium scaling limit of these models. Renormalisation procedures have already been used to analyse spectral gaps for Glauber dynamics, see e.g., [49], but the renormalisation scheme used in this paper is different and allows us to keep sharp control from one scale to the next.

After recalling the definitions of the hierarchical models and presenting the results of this paper in Sect. 1.4, we implement, in Sect. 2, the induction procedure to control the spectral gap after one renormalisation step. We believe that our method could be extended beyond hierarchical models; the induction is therefore described in a general framework under some assumptions which can then be checked for each microscopic model. This is completed in Sect. 3 for the hierarchical \(|\varphi |^4\) model, and in Sect. 4 for the hierarchical Sine-Gordon and the Discrete Gaussian models. Proving these assumptions requires establishing stronger control on the renormalised Hamiltonians in the large field region than is needed when studying the renormalisation at equilibrium (convexity instead of probabilistic bounds). Such convexity for large fields is the main challenge in extending the method of this paper beyond hierarchical models.

1.2 Spectral gap

Let \(\Lambda \) be a finite set and M be a symmetric matrix of spin couplings acting on \(\mathbb {R}^\Lambda \). We consider possibly vector-valued spin configurations \(\varphi = (\varphi _x^i)_{x\in \Lambda , i=1,\dots , n} \in \mathbb {R}^{n\Lambda } = \{ \varphi : \Lambda \rightarrow \mathbb {R}^n\}\), with action of the form

$$\begin{aligned} H(\varphi ) = \frac{1}{2}(\varphi ,M\varphi ) + \sum _{x\in \Lambda } V(\varphi _x), \quad (\varphi \in \mathbb {R}^{n\Lambda }), \end{aligned}$$
(1.1)

for some potential \(V: \mathbb {R}^n\rightarrow \mathbb {R}\), where \((\cdot ,\cdot )\) is the standard inner product on \(\mathbb {R}^{n\Lambda }\). In the vector-valued case \(n>1\), we assume that V is O(n)-invariant and that M acts by \((M\varphi )_x^i = (M\varphi ^i)_x\) for \(i=1,\dots , n\) and \(x\in \Lambda \). The associated probability measure \(\mu \) has expectation

$$\begin{aligned} {{\mathbb {E}}} _\mu (F) = \frac{1}{Z} \int _{\mathbb {R}^{n\Lambda }} e^{-H(\varphi )} F(\varphi ) \, d\varphi , \qquad Z = \int _{\mathbb {R}^{n\Lambda }} e^{-H(\varphi )} \, d\varphi . \end{aligned}$$
(1.2)

The (continuous) Glauber dynamics associated with H is given by the system of stochastic differential equations

$$\begin{aligned} d\varphi _x = -\partial _{\varphi _x} H(\varphi ) \, dt + \sqrt{2} dB_x, \quad (x\in \Lambda ), \end{aligned}$$
(1.3)

where the \(B_x\) are independent n-dimensional standard Brownian motions. (The continuous Glauber dynamics is also referred to as overdamped Langevin dynamics; to keep the terminology concise we use the term Glauber dynamics in the continuous as well as in the discrete case.) By construction, the measure \(\mu \) defined in (1.2) is invariant with respect to this dynamics. Its relaxation time scale is controlled by the inverse of the spectral gap of the generator of the Glauber dynamics (see, for example, [2, Proposition 2.1]). By definition, the spectral gap is the largest constant \(\gamma \) such that, for all functions \(F: \mathbb {R}^{n\Lambda } \rightarrow \mathbb {R}\) with bounded derivative,

$$\begin{aligned} {{\,\mathrm{Var}\,}}_\mu (F)= {{\mathbb {E}}} _\mu (F^2) - {{\mathbb {E}}} _\mu (F)^2 \leqslant \frac{1}{\gamma } {{\mathbb {E}}} _\mu (\nabla F, \nabla F) . \end{aligned}$$
(1.4)
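For readers who find a concrete discretisation helpful, the following is a minimal Euler–Maruyama simulation sketch of the dynamics (1.3); the coupling matrix M and potential derivative dV below are illustrative assumptions, not the hierarchical models studied in this paper.

```python
# Minimal sketch (not from the paper): Euler-Maruyama discretisation of the
# Glauber dynamics (1.3) for a one-component model.  M and dV are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def glauber_step(phi, M, dV, dt):
    """One step of d(phi) = -grad H(phi) dt + sqrt(2) dB, with H as in (1.1)."""
    drift = -(M @ phi) - dV(phi)                 # -grad H = -(M phi + V'(phi_x))_x
    noise = np.sqrt(2.0 * dt) * rng.standard_normal(phi.shape)
    return phi + dt * drift + noise

# toy example: 8 sites, unit coupling matrix, non-convex double-well potential
M = np.eye(8)
dV = lambda phi: phi**3 - 0.5 * phi              # V(phi) = phi^4/4 - phi^2/4
phi = np.zeros(8)
for _ in range(10_000):
    phi = glauber_step(phi, M, dV, dt=1e-3)      # samples approach mu as t grows
```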

Our goal in this paper is to determine the order of the spectral gap \(\gamma \) for specific choices of M and V, when the size of the domain \(\Lambda \) diverges. For statistical mechanics, the setting of primary interest is a finite domain of a lattice or a torus \(\Lambda = \Lambda _N \subset \mathbb {Z}^d\) whose size tends to infinity, and a short-range spin coupling matrix M, such as the discrete Laplace operator \(-\Delta \) on \(\Lambda \). The discrete Laplace operator has a nontrivial kernel. This degeneracy must be removed through boundary conditions or an external field (mass term). For example, for a cube of side length D with Dirichlet boundary conditions, the smallest eigenvalue is of order \(D^{-2}\). In the hierarchical set-up that we consider, we impose, instead of boundary conditions, an external field whose strength is such that the smallest eigenvalue is at least of order \(D^{-2}\).

For \(V=0\), or more generally for quadratic potentials which can be absorbed in the definition of M, the spectral gap \(\gamma \) of the generator of the Langevin dynamics is equal to the minimal eigenvalue of M (assuming that it is positive) by explicit diagonalisation of (1.3). More generally, for V any strictly convex potential satisfying \(V''(\varphi ) \geqslant c > 0\) uniformly in \(\varphi \), the Bakry–Emery criterion [3] implies that

$$\begin{aligned} \gamma \geqslant \lambda + c, \end{aligned}$$
(1.5)

where \(\lambda \) is the smallest eigenvalue of M. Under these conditions, \(\mu \) actually satisfies a logarithmic Sobolev inequality with the same constant. In particular, under these assumptions, the dynamics relaxes quickly, in time of order 1.
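To spell out the diagonalisation behind the quadratic case mentioned above (a standard computation, included here for orientation): for \(V=0\), expanding \(\varphi = \sum_k a_k e_k\) in an orthonormal eigenbasis \(M e_k = \lambda_k e_k\), the system (1.3) decouples into independent Ornstein–Uhlenbeck processes

$$\begin{aligned} da_k = -\lambda _k a_k \, dt + \sqrt{2}\, dB_k, \end{aligned}$$

each relaxing at rate \(\lambda _k\), so the spectral gap equals \(\min _k \lambda _k = \lambda \).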

The situation is much more subtle when the potential V is non-convex. Indeed, as the potential becomes sufficiently non-convex, the static measure \(\mu \) typically undergoes phase transitions. In fact for unbounded spin systems on a lattice, the relaxation of the Glauber dynamics has been controlled only in the uniqueness regime under some assumptions on the decay of correlations [12, 13, 39, 41, 53] (see also [52] for conservative dynamics). By considering hierarchical models, we are able to show that the spectral gap decays polynomially in the vicinity of a phase transition. The idea is to decompose the measure into renormalised fields such that at each scale, conditioned on a block spin field, the renormalised potential remains strictly convex. By induction, we then obtain a recursion on the spectral gaps of the renormalised measures.

Before stating the results, we first turn to the definition of the hierarchical models.

1.3 Hierarchical Laplacian

The Gaussian free field (GFF) on a finite approximation to \(\mathbb {Z}^d\) is a Gaussian field whose covariance is the Green function of the Laplace operator. The Green function has decay \(|x|^{-(d-2)}\) in dimensions \(d\geqslant 3\) and has asymptotic behaviour \(-\log |x|\) in dimension \(d=2\). The hierarchical Laplace operator is an approximation to the Euclidean one in the sense that its Green function has comparable long-distance behaviour, but simpler short-distance structure. The study of hierarchical models has a long history in statistical mechanics going back to [11, 28]; recent studies and uses of hierarchical models include [1, 10, 15, 35, 54] and references therein.

Fig. 1: Blocks in \(\mathcal{B}_j\) for \(j=0,1,2,3\), where \(d=2\), \(N=3\), \(L=2\)

There is some flexibility in the choice of the hierarchical field; the precise choice is not significant. Let \(\Lambda = \Lambda _N\) be a cube of side length \(L^N\) in \(\mathbb {Z}^d\), \(d \geqslant 1\), for some fixed integer \(L>1\) and N eventually chosen large. For scale \(0\leqslant j \leqslant N\), we decompose \(\Lambda \) as the union of disjoint blocks of side length \(L^j\), denoted \(B \in {\mathcal {B}}_j\); see Fig. 1. In particular, \({\mathcal {B}}_0 = \Lambda \) and the unique block in \({\mathcal {B}}_N\) is \(\Lambda _N\) itself. The blocks have the structure of a K-ary tree with \(K=L^d\) and height N, whose leaves are indexed by the sites \(x \in \Lambda _N\).

For scale j and \(x\in \Lambda \), let \(B_{j}(x)\) be the block in \({\mathcal {B}}_j\) containing x. As in [9, Chapter 4], define the block averaging operators, which are the projections

$$\begin{aligned} (Q_jf)_x = \frac{1}{|B_{j}(x)|}\sum _{y \in B_{j}(x)} f_y, \quad \text {for }f\in \mathbb {R}^\Lambda . \end{aligned}$$
(1.6)

Let \(P_j = Q_{j-1}-Q_{j}\). Then \(P_1, \dots , P_N, Q_N\) are orthogonal projections on \(\mathbb {R}^\Lambda \) with disjoint ranges whose direct sum is the full space. An operator on \(\mathbb {R}^\Lambda \) is hierarchical if it is diagonal with respect to this decomposition. To obtain a hierarchical Green function with the scaling of the Green function of the usual Laplace operator, we choose the hierarchical Laplace operator on \(\Lambda \) to be

$$\begin{aligned} -\Delta _{H} = \sum _{j=1}^N L^{-2(j-1)} P_j . \end{aligned}$$
(1.7)

Like the usual Laplacian on the discrete torus, this choice of hierarchical Laplacian annihilates the constant functions. The definition implies that the Green function of the hierarchical Laplacian has comparable long distance behaviour to that of the nearest-neighbour Laplacian: for \(m \ll |x-y|^{-1}\),

$$\begin{aligned} (-\Delta _H+m^2)^{-1}_{xy}&\asymp |x-y|^{-(d-2)}&\qquad (d>2), \end{aligned}$$
(1.8)
$$\begin{aligned} (-\Delta _H+m^2)^{-1}_{xy}&= c_N - \sigma \log _L |x-y| + O(1)&\qquad (d=2), \end{aligned}$$
(1.9)

where \(|x-y|\) is the Euclidean distance and \(\sigma = 1-L^{-2}\) is a constant independent of N, and \(A \asymp B\) denotes that A / B and B / A are bounded by N-independent constants. On the other hand, the hierarchical Laplacian has coarser short-distance behaviour than the lattice Laplacian. For a more detailed introduction to the hierarchical Laplacian, as well as discussion of its relation to the lattice Laplacian, see [9, Chapters 3–4].
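As a concrete illustration, the following sketch constructs the block averaging operators (1.6) and the hierarchical Laplacian (1.7) for a small example; the choice \(d=1\), \(L=2\), \(N=3\) is an assumption made only for readability (the results of this paper concern \(d=2\) and \(d=4\)).

```python
# Hedged sketch: explicit matrices Q_j, P_j and -Delta_H of (1.6)-(1.7) for a
# small example (d = 1, L = 2, N = 3 are illustrative assumptions).
import numpy as np

L, N, d = 2, 3, 1
n_sites = L**(d * N)

def Q(j):
    """Block averaging operator (1.6): average over blocks of side L^j."""
    Qj = np.zeros((n_sites, n_sites))
    b = L**(d * j)                                # block size |B| = L^{dj}
    for start in range(0, n_sites, b):
        Qj[start:start + b, start:start + b] = 1.0 / b
    return Qj

P = [Q(j - 1) - Q(j) for j in range(1, N + 1)]    # P_j = Q_{j-1} - Q_j
minus_Delta_H = sum(L**(-2 * (j - 1)) * P[j - 1] for j in range(1, N + 1))

# sanity checks: P_1, ..., P_N are orthogonal projections with disjoint
# ranges, and -Delta_H annihilates the constant functions
assert np.allclose(P[0], P[0] @ P[0]) and np.allclose(P[0] @ P[1], 0)
assert np.allclose(minus_Delta_H @ np.ones(n_sites), 0)
```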

1.4 Models and results

In Sect. 2, we are going to develop a quite general multiscale strategy to estimate the spectral gap of (critical) spin systems by using a renormalisation group approach. We will then apply this method to the n-component \(|\varphi |^4\) model and the Sine-Gordon model as well as the degenerate case of the Discrete Gaussian model. These models correspond to the choices of the potential V defined below. In the setting of the hierarchical spin coupling, we study the critical region of the \(|\varphi |^4\) model and the rough phase of the Sine-Gordon and Discrete Gaussian models. These are both settings for which the renormalisation group method is well developed for the equilibrium case, and we use this as input.

1.4.1 Ginzburg–Landau–Wilson \(|\varphi |^4\) model

The n-component \(|\varphi |^4\) model is defined by the double-well potential (for \(n=1\)), respectively the Mexican-hat-shaped potential (for \(n\geqslant 2\)),

$$\begin{aligned} M=-\Delta _H, \quad V(\varphi ) = \frac{1}{4} g|\varphi |^4 + \frac{1}{2} \nu |\varphi |^2, \quad (g>0, \; \nu \in \mathbb {R}). \end{aligned}$$
(1.10)

Our interest is in the case \(\nu <0\), when this potential is non-convex. The \(|\varphi |^4\) model is a prototype for a spin model with O(n) symmetry. The spatial dimension \(d=4\) is critical for this model (see, e.g., [9]). The following theorem quantifies the decay of the spectral gap in the four-dimensional hierarchical \(|\varphi |^4\) model when approaching the critical point from the high temperature side.

Theorem 1.1

Let \(\gamma _N(g,\nu ,n)\) be the spectral gap of the hierarchical n-component \(|\varphi |^4\) model on \(\Lambda _N\) with dimension \(d=4\) (as defined above). Let \(L \geqslant L_0\), and let \(g>0\) be sufficiently small. There exists \(\nu _c = \nu _c(g,n) = -C(n+2)g + O(g^2)\) and a constant \(\delta \geqslant 1\) (independent of n) such that for \(t_0 \geqslant t \geqslant cL^{-2N}\), where \(t_0\) is a small constant,

$$\begin{aligned} c_1 t (-\log t)^{-\delta (n+2)/(n+8)} \leqslant \gamma _N(g,\nu _c+t,n) \leqslant c_2 t (-\log t)^{-(n+2)/(n+8)}, \end{aligned}$$
(1.11)

provided that N is sufficiently large. In particular, \(t\geqslant cL^{-2N}\) is allowed to depend on N.

The proof is postponed to Sect. 3. The same proof also implies easily that for \(t \geqslant t_0\) the gap is of order 1, but since we are interested in the more delicate approach to the critical point, we omit the details. Together with this, Theorem 1.1 implies that for the \(|\varphi |^4\) model, the spectral gap is of order 1 in the high temperature phase \(\nu > \nu _c\), independently of N, and as the critical point is approached the spectral gap scales like that of the free field, with a logarithmic correction. We expect that \(\gamma \sim Ct(-\log t)^{-z}\) for a universal critical exponent \(z = z(n) \geqslant \frac{n+2}{n+8}\), which our method does not determine (see also [36]). The upper bound follows easily from the estimates derived at equilibrium in [9, Theorem 4.2.1] and we also use the renormalisation group flow constructed in [9] as input to prove the lower bound (see also [33]). References for the renormalisation group analysis of the \(|\varphi |^4\) model on \(\mathbb {Z}^4\), with different approaches, include [31, 34, 37, 38] and [5,6,7,8, 17,18,19,20].

1.4.2 Sine-Gordon model

The Sine-Gordon model is defined by a \(2\pi \)-periodic potential and coupling matrix proportional to the inverse temperature \(\beta \), i.e.,

$$\begin{aligned} M=-\beta \Delta _H \quad (\beta >0), \qquad V(\varphi ) \text { is even and }2\pi \text {-periodic}. \end{aligned}$$
(1.12)

The corresponding energy \(H(\varphi )\) in (1.1) is invariant under \(\varphi \mapsto \varphi + 2\pi n \mathbf {1}\) for any \(n \in \mathbb {Z}\), where \(\mathbf {1}\) denotes the constant function on \(\Lambda \) with \(\mathbf {1}_x = 1\) for all \(x\in \Lambda \). To break this non-compact symmetry, we add the external field \(\frac{\varepsilon }{2} ({\frac{1}{\sqrt{|\Lambda |}}\sum _{x} \varphi _x})^2\) with \(\varepsilon > 0\) and consider

$$\begin{aligned} H_\varepsilon (\varphi ) = H(\varphi ) + \frac{\varepsilon }{2} \left( {\frac{1}{\sqrt{|\Lambda |}}\sum _{x} \varphi _x}\right) ^2 = \frac{\beta }{2} (\varphi ,-\Delta _H \varphi ) + \sum _x V(\varphi _x) + \frac{\varepsilon }{2} \left( {\frac{1}{\sqrt{|\Lambda |}}\sum _{x} \varphi _x}\right) ^2 . \end{aligned}$$
(1.13)

As previously, we are interested in the large volume limit \(|\Lambda |\uparrow \infty \); to avoid some uninteresting technicalities, we will make the convenient choice \(\varepsilon =\beta L^{-2N}\). If V were, e.g., the double well potential \(V(\varphi ) = \varphi ^4-\varphi ^2\) instead of a periodic potential as above, then the corresponding measure would have a uniform spectral gap for any \(\beta >0\) sufficiently small (see, e.g., [4]). The following theorem shows that this is not the case for periodic potentials: the spectral gap decreases to 0. Thus the criticality of the resulting models, in the sense of slow decay of correlations, is also reflected in their dynamics.

For the statement of the theorem, denote by \({\hat{V}}(q) = (2\pi )^{-1} \int _{-\pi }^\pi e^{iq\varphi } V(\varphi ) \, d\varphi \) the Fourier coefficient of the \(2\pi \)-periodic function V, and let \(\sigma = 1-L^{-2}\) be the constant in (1.9) with dimension \(d=2\).

Theorem 1.2

Let \(\gamma _N(\beta ,V)\) be the spectral gap of the hierarchical Sine-Gordon model on \(\Lambda _N\) with dimension \(d=2\) (as defined above). Assume \(\sum _{q\in \mathbb {Z}\setminus \{0\}} (1+q^2) |{\hat{V}}(q)|\) is small enough. Let \(0< \beta < \sigma /(4\log L)\) and let \(\varepsilon = \beta L^{-2N}\). There are \(\kappa \in (0,1)\) and \(c>0\) such that the spectral gap scales as

$$\begin{aligned} cL^{-2N} \leqslant \gamma _N(\beta , V) \leqslant L^{-2N}(1-O(\kappa ^N)) \end{aligned}$$
(1.14)

provided that N is sufficiently large.

The Sine-Gordon model is dual to a Coulomb gas model (see, e.g., [16, 32]). Under this duality, the inverse temperature of the Coulomb gas model is proportional to the temperature \(1/\beta \) of the Sine-Gordon model. We here primarily view the Sine-Gordon model as a spin model, rather than as a description of the Coulomb gas, and therefore choose \(\beta \) instead of \(1/\beta \) in (1.12). Note that the usual normalisation of the logarithm in (1.9) is \(c_N - \frac{1}{2\pi } \log |x| + O(1)\) for the Laplace operator on \(\mathbb {Z}^2\). For this normalisation of the hierarchical Laplace operator, the hierarchical critical inverse temperature becomes \(1/\beta = 8\pi \). This is only approximately true in the Euclidean model because of a field-strength (stiffness) renormalisation which is not present in the hierarchical model. For the critical inverse temperature \(\beta = \sigma /(4\log L)\), we expect that \(\gamma \sim C L^{-2N} N^{-z}\) for a universal critical exponent \(z>0\). For the presence of logarithmic corrections to the free field scaling in the static case, see [30]. Our theorem uses the set-up for the renormalisation group for this model of [16] (see also [48]). References for the Sine-Gordon model on \(\mathbb {Z}^2\) include [32] and [23,24,25, 29, 30, 47].
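To make the smallness condition of Theorem 1.2 concrete, here is a hedged numerical sketch for the standard choice \(V(\varphi ) = z\cos \varphi \) (an assumption; the theorem allows any sufficiently small even periodic V): in the convention above its only nonzero coefficients are \({\hat V}(\pm 1) = z/2\), so \(\sum _{q \neq 0}(1+q^2)|{\hat V}(q)| = 2z\).

```python
# Hedged sketch: the Fourier coefficients hat V(q) of Theorem 1.2 for the
# illustrative choice V(phi) = z*cos(phi), evaluated on a uniform grid.
import numpy as np

def fourier_coeff(V, q, n_grid=4096):
    """hat V(q) = (2 pi)^{-1} int_{-pi}^{pi} e^{i q phi} V(phi) dphi."""
    phi = np.linspace(-np.pi, np.pi, n_grid, endpoint=False)
    return np.mean(np.exp(1j * q * phi) * V(phi))

z = 0.01                                          # small amplitude (assumption)
V = lambda phi: z * np.cos(phi)
total = sum((1 + q**2) * abs(fourier_coeff(V, q))
            for q in range(-5, 6) if q != 0)
print(total)                                      # = 2z up to grid error
```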

1.4.3 Discrete Gaussian model

We conclude this section with a discrete model which is closely linked to the Sine-Gordon model. The Discrete Gaussian model is an integer-valued field with expectation given by

$$\begin{aligned} {{\mathbb {E}}} _ \mu (F) = \frac{1}{Z} \sum _{\sigma \in (2\pi \mathbb {Z})^\Lambda } F(\sigma ) e^{-\frac{\beta }{2}(\sigma ,-\Delta _H\sigma ) - \frac{\varepsilon }{2} ({\frac{1}{\sqrt{|\Lambda |}}\sum _{x}\sigma _x})^2} \quad \text {for }F: (2\pi \mathbb {Z})^\Lambda \rightarrow \mathbb {R}, \quad (\beta >0). \end{aligned}$$
(1.15)

Note that by rescaling \(\beta \) and \(\varepsilon \) by \((2\pi )^2\), this definition is equivalent to the one in which the model takes values in \(\mathbb {Z}\) rather than \(2\pi \mathbb {Z}\). The normalisation by \(2\pi \) is convenient for our proof. The model formally takes the form of a degenerate Sine-Gordon model in which \(e^{-V(\varphi )}\) is replaced by a sum of \(\delta \)-functions. As the spins take integer values, we now consider a discrete Glauber dynamics for the Discrete Gaussian model with Dirichlet form

$$\begin{aligned} \frac{1}{2(2\pi )^2} \sum _{x\in \Lambda } {{\mathbb {E}}} _\mu \Big ( (F(\sigma ^{x+})-F(\sigma ))^2 + (F(\sigma ^{x-})-F(\sigma ))^2 \Big ), \end{aligned}$$
(1.16)

where \(\sigma ^{x\pm }\) is obtained from \(\sigma \in (2\pi \mathbb {Z})^\Lambda \) by increasing/decreasing the entry at \(x\in \Lambda \) by \(2\pi \). Thus the corresponding spectral gap of this dynamics is the largest constant \(\gamma \) such that, for all functions \(F: (2\pi \mathbb {Z})^\Lambda \rightarrow \mathbb {R}\) with finite variance,

$$\begin{aligned} {{\,\mathrm{Var}\,}}_\mu (F) \leqslant \frac{1}{\gamma } \frac{1}{2(2\pi )^2} \sum _{x\in \Lambda } {{\mathbb {E}}} _\mu \Big ( (F(\sigma ^{x+})-F(\sigma ))^2 + (F(\sigma ^{x-})-F(\sigma ))^2 \Big ). \end{aligned}$$
(1.17)
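For orientation, the following sketch computes the gap defined by (1.17) exactly in a toy case, namely a single site \(|\Lambda |=1\), where (1.15) reduces to \(\mu (\sigma ) \propto e^{-a\sigma ^2/2}\) on \(2\pi \mathbb {Z}\) (the mass \(a>0\) and the state-space truncation are assumptions). Since both the variance and the Dirichlet form vanish on constants, and the variance of a mean-zero F equals \(\sum _\sigma \mu (\sigma )F(\sigma )^2\), the gap is the second smallest eigenvalue of the symmetrised Dirichlet operator.

```python
# Hedged sketch: the spectral gap (1.17) for a single-site toy version of
# (1.15), computed on a truncation of 2*pi*Z.  a and K are assumptions.
import numpy as np
from scipy.linalg import eigh

a, K = 0.05, 8
sigma = 2 * np.pi * np.arange(-K, K + 1)
w = np.exp(-0.5 * a * sigma**2)
w /= w.sum()                                  # single-site measure mu (truncated)
n = len(sigma)

# Dirichlet form (1.16) as a weighted graph Laplacian A on the chain:
# F^T A F = (1/(2(2pi)^2)) sum_i (w_i + w_{i+1}) (F_{i+1} - F_i)^2
A = np.zeros((n, n))
for i in range(n - 1):
    r = (w[i] + w[i + 1]) / (2 * (2 * np.pi)**2)
    A[i, i] += r; A[i + 1, i + 1] += r
    A[i, i + 1] -= r; A[i + 1, i] -= r

# gap = second smallest eigenvalue of D^{-1/2} A D^{-1/2}, D = diag(w)
S = A / np.sqrt(np.outer(w, w))
gap = eigh(S)[0][1]                           # eigenvalues in ascending order
print(gap)                                    # strictly positive for this toy chain
```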

The following theorem is related to Theorem 1.2. It shows that the spectral gap of the Discrete Gaussian model scales like that of the GFF.

Theorem 1.3

Let \(\gamma _N(\beta )\) be the spectral gap of the hierarchical Discrete Gaussian model on \(\Lambda _N\) in dimension \(d=2\) (as defined above). For \(\beta >0\) sufficiently small and \(\varepsilon = \beta L^{-2N}\), there are \(\kappa \in (0,1)\) and \(c>0\) such that

$$\begin{aligned} cL^{-2N} \leqslant \gamma _N(\beta ) \leqslant L^{-2N} ({ 1- O(\kappa ^N)}) \end{aligned}$$
(1.18)

provided that N is sufficiently large.

2 Induction on Renormalised Brascamp–Lieb Inequalities

The Brascamp–Lieb inequality is a generalisation of the spectral gap inequality. We here say that a measure \(\mu \) on a finite-dimensional vector space X with inner product \((\cdot ,\cdot )\) satisfies a Brascamp–Lieb inequality with quadratic form given by a symmetric operator \(D :X \rightarrow X\) if, for all smooth functions F,

$$\begin{aligned} {{\,\mathrm{Var}\,}}_\mu (F) \leqslant {{\mathbb {E}}} _\mu (\nabla F, D\nabla F). \end{aligned}$$
(2.1)

In particular, if the quadratic form satisfies \(D \leqslant \mathrm {id}/ \lambda \) for some \(\lambda >0\), then \(\mu \) satisfies a spectral gap inequality with constant \(\lambda \). In this section, we construct inductive bounds on Brascamp–Lieb inequalities between renormalised versions of a spin system. From these we deduce in particular an induction on the spectral gap. In the remainder of this paper, we will verify the generic assumptions made in this section in the specific cases of the hierarchical \(|\varphi |^4\) and the Sine-Gordon models.

2.1 Hierarchical decomposition

While the results of this section are somewhat more general, in the remainder of this paper we will apply them to hierarchical models. We therefore recall their structure which can be helpful to keep in mind throughout this section. From Sect. 1.3, first recall the orthogonal projections \(P_1, \dots , P_N, Q_N\) whose ranges span \(\mathbb {R}^\Lambda \), and the hierarchical Laplacian \(\Delta _H\) [see (1.7)]. By spectral calculus, for any \(m^2 > 0\), its Green function can be written as

$$\begin{aligned} (-\Delta _H + m^2 )^{-1} = \sum _{j=1}^N (1+m^2L^{2(j-1)})^{-1} L^{2(j-1)} P_j + m^{-2} Q_N. \end{aligned}$$
(2.2)

Using the definition \(P_j = Q_{j-1}-Q_j\) to express the right-hand side of the last equation in terms of the block averaging operators \(Q_j\), we can alternatively write

$$\begin{aligned} (-\Delta _H + m^2 )^{-1} =\sum _{j=0}^N C_j \quad \text {with} \quad C_j = \lambda _j Q_j, \end{aligned}$$
(2.3)

where

$$\begin{aligned} \lambda _0(m^2)= & {} \frac{1}{1+m^2}, \qquad \lambda _N(m^2) = \frac{1}{m^2(1+m^2L^{2(N-1)})}, \end{aligned}$$
(2.4)
$$\begin{aligned} \lambda _j(m^2)= & {} L^{2j} \frac{(1-L^{-2})}{(1+m^2L^{2j})(1+m^2L^{2(j-1)})} \quad (0<j<N). \end{aligned}$$
(2.5)

The above spin coupling matrices generalise directly to the O(n)-invariant vector-valued case, in which all operators act separately on each component, and we use the same notation in this case. Thus the Laplacian and the covariances act on the space \(X_0 = \mathbb {R}^{n\Lambda }\).

The covariances \(C_j\) are degenerate and it is convenient to introduce the subspaces of \(X_0=\mathbb {R}^{n\Lambda }\) on which they are supported. Thus define \(X_j\) to be the image of \(C_j\), i.e.,

$$\begin{aligned} X_j = \{ \varphi \in \mathbb {R}^{n\Lambda }: \varphi |_B \text { is constant for every }B \in {\mathcal {B}}_j \}, \end{aligned}$$
(2.6)

and, for \(S \subset \Lambda \),

$$\begin{aligned} X_j(S) = \{ \varphi \in X_j: \varphi _x = 0 \text { for } x \not \in S \}. \end{aligned}$$
(2.7)

Then the Gaussian field \(\zeta = \{\zeta _x\}_{x\in \Lambda }\) with values in \(X_j\) and covariance \(C_j\) can be realised as

$$\begin{aligned} \forall x \in B, \qquad \zeta _x = \zeta _B, \end{aligned}$$
(2.8)

where \(\{ \zeta _B \}_{B \in {\mathcal {B}}_j}\) are independent Gaussian variables in \(\mathbb {R}^n\) with variance \( \frac{\lambda _j}{|B_{j}(x)|}= L^{-dj} \lambda _j\).
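Concretely, the decomposition (2.3) suggests the following sampling sketch (hedged: \(n=1\), and \(d=1\), \(L=2\), \(N=3\) are assumptions made for readability): the field with covariance \((-\Delta _H+m^2)^{-1}\) is obtained by summing independent block-constant Gaussian fields over the scales.

```python
# Hedged sketch of (2.8): sampling the scale-j field zeta with covariance
# C_j = lambda_j Q_j as a block-constant Gaussian field (n = 1; d, L, N are
# illustrative assumptions), then summing over scales as in (2.3).
import numpy as np

rng = np.random.default_rng(1)
L, N, d = 2, 3, 1

def lam(j, m2):
    """lambda_j(m^2) of (2.4)-(2.5)."""
    if j == 0:
        return 1.0 / (1.0 + m2)
    if j == N:
        return 1.0 / (m2 * (1.0 + m2 * L**(2 * (N - 1))))
    return L**(2 * j) * (1 - L**-2) / ((1 + m2 * L**(2 * j)) * (1 + m2 * L**(2 * (j - 1))))

def sample_zeta(j, m2):
    """One block-constant field: zeta_x = zeta_B with Var(zeta_B) = lambda_j / |B|."""
    n_sites, b = L**(d * N), L**(d * j)
    zeta_B = rng.normal(scale=np.sqrt(lam(j, m2) / b), size=n_sites // b)
    return np.repeat(zeta_B, b)

# a sample of the hierarchical Gaussian field with covariance (-Delta_H + m^2)^{-1}
phi = sum(sample_zeta(j, m2=0.1) for j in range(N + 1))
```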

In general, one can identify \(\varphi \in X_j\) with \(\{\varphi _B\}_{B\in {\mathcal {B}}_j}\). In the following, we are going to consider functions defined only on the subspaces \(X_j\). Let F be such a function of class \(C^2\) written as

$$\begin{aligned} \{ \varphi _B \}_{B \in {\mathcal {B}}_j} \in \mathbb {R}^{n |{\mathcal {B}}_j|} \mapsto F \big ( \{ \varphi _B \} \big ). \end{aligned}$$
(2.9)

Then F can be extended as a smooth function on the whole of \(\mathbb {R}^{n\Lambda }\) by setting, for example,

$$\begin{aligned} F(\varphi ) = F\Bigl ({\Big \{\frac{1}{|B|}\sum _{x\in B}\varphi _x\Big \} }\Bigr ). \end{aligned}$$
(2.10)

For such F, we will consider the gradient and the Hessian of F only in the directions spanned by \(X_j\), so that we set

$$\begin{aligned} \forall \varphi \in X_j, \qquad \nabla _{X_j} F (\varphi ) = Q_j\nabla F(\varphi ), \quad {{\,\mathrm{Hess}\,}}_{X_j} F (\varphi ) = Q_j{{\,\mathrm{Hess}\,}}F(\varphi )Q_j . \end{aligned}$$
(2.11)

As the gradient and the Hessian are projected only onto the directions spanned by \(X_j\), their restrictions to \(X_j\) are independent of the way F has been extended to \(\mathbb {R}^{n\Lambda }\).

2.2 Renormalised measure

Let \(X_0=\mathbb {R}^{n\Lambda }\) with the standard inner product \((\cdot ,\cdot )\). From now on, we consider a Gaussian measure on \(X_0\) whose covariance \(C_{\geqslant 0}\) has a decomposition \(C_{\geqslant 0}=C_0 + \cdots + C_N\), with the \(C_i\) symmetric and positive semi-definite. We then consider the class of probability measures \(\mu \) with expectation

$$\begin{aligned} {{\mathbb {E}}} _\mu (F) = \frac{\mathbb {E}_{C_{\geqslant 0}} (e^{-V_0}F)}{\mathbb {E}_{C_{\geqslant 0}} (e^{-V_0})}, \end{aligned}$$
(2.12)

for some potential \(V_0\). In particular, the models introduced in Sect. 1 are in this class, with

$$\begin{aligned} V_0(\varphi ) = \sum _{x \in \Lambda } V(\varphi _x) \quad \text {for }\varphi \in X_0 = \mathbb {R}^{n\Lambda }, \end{aligned}$$
(2.13)

and the decomposition (2.3). Given such a decomposition \(C_0+ \cdots +C_N\) and the potential \(V_0\), we define the renormalised potentials \(V_j\) inductively by

$$\begin{aligned} e^{-V_{j+1}(\varphi )} = \mathbb {E}_{C_{j}}(e^{-V_j(\varphi +\zeta )}), \end{aligned}$$
(2.14)

where the expectation applies to \(\zeta \). (This definition includes \(j=N\), but throughout this section we will only use \(j<N\).) The associated renormalised measure \(\mu _j\) is then defined by the expectation

$$\begin{aligned} \mathbb {E}_{\mu _j}(F) = \frac{\mathbb {E}_{C_{\geqslant j}}(e^{-V_{j}} F)}{\mathbb {E}_{C_{\geqslant j}}(e^{-V_j})}, \qquad C_{\geqslant j} = C_{j}+\cdots +C_N . \end{aligned}$$
(2.15)

As is the case for the hierarchical decomposition, the covariances \(C_j\) are permitted to be degenerate and we denote by \(X_j\) the subspaces of \(X_0\) on which they are supported, i.e., \(X_j\) is the image of \(C_j\) [see (2.6) for the hierarchical decomposition].
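In the simplest setting of a single one-dimensional spin, the renormalisation map (2.14) is just a Gaussian convolution, and the following hedged sketch evaluates it numerically (the potential V0 and variance c are assumptions chosen for illustration):

```python
# Hedged numerical sketch of one renormalisation step (2.14) for a single
# one-dimensional spin: e^{-V_{j+1}(phi)} = E_{zeta ~ N(0,c)} e^{-V_j(phi+zeta)},
# evaluated by Gauss-Hermite quadrature.  V0 and c are assumptions.
import numpy as np

# nodes/weights for the weight e^{-x^2/2}; total mass is sqrt(2*pi)
nodes, weights = np.polynomial.hermite_e.hermegauss(80)

def renormalise(V, c):
    def V_next(phi):
        zeta = np.sqrt(c) * nodes
        return -np.log(np.dot(weights, np.exp(-V(phi + zeta))) / np.sqrt(2 * np.pi))
    return V_next

V0 = lambda phi: 0.25 * phi**4 - 0.5 * phi**2     # a non-convex example
V1 = renormalise(V0, c=1.0)
print(V1(0.0), V1(1.0))    # compare renormalised values at the origin and the well
```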

2.3 One step of renormalisation

For the remainder of the section, we fix a scale \(j \in \{0,1,\dots , N\}\), and consider a single renormalisation group step from scale j to scale \(j+1\) when \(j<N\), and a final estimate when \(j=N\). To simplify the notation, we usually omit the scale index j and write \(+\) in place of \(j+1\). In particular, we write \(C=C_j\), \(V=V_j\), \(\mu = \mu _{j}\), \(\mu _{+} = \mu _{j+1}\), and so on. Let \(X = X_j \subseteq X_0\) be the image of C and denote by Q the orthogonal projection from \(X_0\) onto X. We need the following assumptions.

For \(j<N\), in the assumptions below, \(D_+=D_{j+1}\) is the matrix associated with a quadratic form for a Brascamp–Lieb inequality for the measure \(\mu _+\) [see (2.19)], and we set \(D_{N+1}=0\). Throughout the paper, inequalities between operators and matrices are interpreted in the sense of quadratic forms.

A1. Non-convexity of potential There is a constant \(\varepsilon = \varepsilon _j < 1\) such that uniformly in \(\varphi \in X\),

$$\begin{aligned} E(\varphi ) := C^{1/2} ({{\,\mathrm{Hess}\,}}_X V(\varphi )) C^{1/2} \geqslant -\varepsilon Q. \end{aligned}$$
(2.16)

A2. Coupling of scales The images of C and \(C_+\) contain all directions on which \(D_+\) is nontrivial, more precisely

$$\begin{aligned} D_+ = D_+Q = D_+Q_+. \end{aligned}$$
(2.17)

A3. Symmetry For all \(\varphi \in X\),

$$\begin{aligned}{}[E(\varphi ),C]=[E(\varphi ),D_+]=[C,D_+]=[C,Q_+]=0, \end{aligned}$$
(2.18)

where \([A,B] = AB-BA\) denotes the commutator.

The most significant assumption is (2.16), which will be seen to ensure that the fluctuation field measure given the block spin field is uniformly strictly convex. The more technical assumptions (2.17) and (2.18) are very convenient (and obvious in the hierarchical setting (2.3)) but seem less fundamental. We use (2.16) in Lemma 2.7 and (2.60), (2.17) in (2.56), and (2.18) in (2.59).

Under the above assumptions, we relate the Brascamp–Lieb inequality for \(\mu _+\) to that for \(\mu \).

Theorem 2.1

Fix \(j< N\), and assume (A1)–(A3) and that \(\mu _+\) satisfies the Brascamp–Lieb inequality

$$\begin{aligned} {{\,\mathrm{Var}\,}}_{\mu _+}(F) \leqslant \mathbb {E}_{\mu _+}(\nabla F(\varphi ), D_+ \nabla F(\varphi )). \end{aligned}$$
(2.19)

Then \(\mu \) satisfies a Brascamp–Lieb inequality (2.1) with

$$\begin{aligned} D \leqslant \frac{C}{1-\varepsilon } + \frac{D_+}{(1-\varepsilon )^2} . \end{aligned}$$
(2.20)

For \(j=N\), assume only that (A1) holds. Then \(\mu \) satisfies a Brascamp–Lieb inequality (2.1) with

$$\begin{aligned} D \leqslant \frac{C}{1-\varepsilon } . \end{aligned}$$
(2.21)

Iterating this theorem starting from \(j=N\) gives the Brascamp–Lieb inequality for the original measure \(\mu _{0}\) as follows. In particular, the spectral gap of \(\mu _0\) is bounded from below by the inverse of the largest eigenvalue of the matrix \(D_0\).

Corollary 2.2

Assume that, for \(j=0,\dots , N\), the sequence of renormalised measures \((\mu _{j})\) satisfies Assumptions (A1)-(A3) where \(\varepsilon = \varepsilon _j\). Then \(\mu _0\) satisfies a Brascamp–Lieb inequality with

$$\begin{aligned} D_0 \leqslant \sum _{k=0}^{N} \delta _k C_k, \qquad \delta _k = \frac{1}{1-\varepsilon _k} \prod _{l=0}^{k-1} \frac{1}{(1-\varepsilon _{l})^2} \leqslant \exp \left( {\sum _{l=0}^{k} \left( 2\varepsilon _{l} + O(\varepsilon _l^2)\right) }\right) . \end{aligned}$$
(2.22)

Proof

By backward induction starting from \(j=N\), we will prove that the renormalised measures \(\mu _j\) satisfy the Brascamp–Lieb inequality

$$\begin{aligned} {{\,\mathrm{Var}\,}}_{\mu _j}(F) \leqslant \mathbb {E}_{\mu _j}(\nabla F(\varphi ), D_j\nabla F(\varphi )), \qquad \text {with } D_j \leqslant \sum _{k=j}^N \delta _{j,k} C_k \end{aligned}$$
(2.23)

and

$$\begin{aligned} \delta _{j,k}= \frac{1}{1-\varepsilon _k} \prod _{l=j}^{k-1} \frac{1}{(1-\varepsilon _l)^2} . \end{aligned}$$
(2.24)

The claim (2.22) is then the case \(j=0\). To start the induction, we apply (2.21) which gives (2.23) for \(j=N\). To advance the induction, suppose \(0 \leqslant j < N\) is such that the inductive assumption (2.23) holds with j replaced by \(j+1\). This means that (2.19) holds for j and Assumptions (A1)–(A3) also hold by assumption of the corollary. Theorem 2.1 and the inductive assumption imply that \(\mu _j\) satisfies the Brascamp–Lieb inequality with

$$\begin{aligned} D_j \leqslant \frac{C_j}{1-\varepsilon _j} + \frac{D_{j+1}}{(1-\varepsilon _j)^2} \leqslant \frac{C_j}{1-\varepsilon _j} + \sum _{k=j+1}^{N} \frac{\delta _{j+1,k}}{(1-\varepsilon _j)^2} C_k = \sum _{k=j}^{N} \delta _{j,k} C_k. \end{aligned}$$
(2.25)

This advances the inductive assumption, i.e., (2.23) holds for j.    \(\square \)

Corollary 2.3

Under the assumptions of the previous corollary, the measure \(\mu _0\) satisfies a spectral gap inequality with inverse spectral gap less than the largest eigenvalue of the matrix \(D_0\).

Proof

The claim is immediate from the definitions of the Brascamp–Lieb and the spectral gap inequalities. Indeed, if \(1/\lambda \) is the largest eigenvalue of \(D_0\) then

$$\begin{aligned} {{\,\mathrm{Var}\,}}_{\mu _0}(F) \leqslant \mathbb {E}_{\mu _0}(\nabla F, D_0 \nabla F) \leqslant \frac{1}{\lambda } \mathbb {E}_{\mu _0}(\nabla F, \nabla F), \end{aligned}$$
(2.26)

as claimed.    \(\square \)

In Sects. 3 and 4, Assumptions (A1)–(A3) will be checked for the different hierarchical models in order to derive the scaling of the spectral gap from the previous corollary.
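To illustrate how Corollaries 2.2 and 2.3 will be used, here is a hedged numerical sketch (the profile of the convexity defects \(\varepsilon _j\) and the values \(\lambda _j \sim L^{2j}\) are illustrative assumptions): since each \(Q_k\) has operator norm one, \(\Vert D_0\Vert \leqslant \sum _k \delta _k \lambda _k\), which yields a spectral gap lower bound dominated by the largest scale.

```python
# Hedged sketch of Corollaries 2.2-2.3: given per-scale defects eps_j and
# hierarchical covariances C_j = lambda_j Q_j, bound the gap from below by
# 1 / sum_k delta_k lambda_k.  The eps_j and lambda_j below are assumptions.
import numpy as np

def delta(eps):
    """delta_k = (1 - eps_k)^{-1} prod_{l<k} (1 - eps_l)^{-2}, as in (2.22)."""
    eps = np.asarray(eps, dtype=float)
    prods = np.concatenate(([1.0], np.cumprod((1.0 - eps[:-1])**-2)))
    return prods / (1.0 - eps)

N = 8
eps = 0.05 * 2.0**-np.arange(N + 1)       # geometrically decaying defects
lam = 4.0**np.arange(N + 1)               # lambda_j of order L^{2j} with L = 2
gap_lower_bound = 1.0 / np.dot(delta(eps), lam)
print(gap_lower_bound, 4.0**-N)           # both of order L^{-2N}
```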

Remark 2.4

More generally, in the assumptions, \(D_+ = D_+(\varphi )\) and \(\varepsilon =\varepsilon (\varphi )\) could depend on \(\varphi \in X\), with \(\varepsilon (\varphi ) < 1\) uniformly in \(\varphi \). The conclusion (2.20) is then replaced by

$$\begin{aligned} D(\varphi +\zeta ) \leqslant \frac{C}{1-\varepsilon (\varphi +\zeta )} + \frac{D_+(\varphi )}{(1-\varepsilon (\varphi +\zeta ))^2} . \end{aligned}$$
(2.27)

However, this strengthened inequality may be difficult to use. To improve the readability, we therefore do not carry the additional arguments for \(D_+\) and \(\varepsilon \) through the proof.

2.4 Proof of Theorem 2.1

We write the renormalised field at scale j as \(\zeta +\varphi \) where \(\varphi \in X_+\) is the block spin field at the next scale \(j+1\) and \(\zeta \in X\) is the fluctuation field at scale j. More precisely, recall that

$$\begin{aligned} \mathbb {E}_{\mu }(F) = \frac{\mathbb {E}_{C_{\geqslant }}(e^{-V} F)}{\mathbb {E}_{C_{\geqslant }}(e^{-V})} = \frac{{\mathbb {E}_{C_>} \, \mathbb {E}_{C}} (e^{-V(\varphi +\zeta )} F(\varphi +\zeta ))}{\mathbb {E}_{C_>} \, \mathbb {E}_C(e^{-V(\varphi +\zeta )})}, \end{aligned}$$
(2.28)

where \(C = C_j\) and \(\zeta \) denotes the corresponding random field, \(C_>\) stands for the covariance \(C_{j+1} + C_{j+2} + \cdots + C_N\) and \(\varphi \) denotes the corresponding random field, \(C_{\geqslant } = C + C_>\), and \(\mathbb {E}_C\) denotes the expectation of a Gaussian measure with covariance C.

Define the expectation conditioned on the block spin field \(\varphi \) in \(X_+\) by

$$\begin{aligned} \mathbb {E}_{\mu _\varphi }(F) = \mathbb {E}_{\mu }(F|\varphi ) = \frac{\mathbb {E}_{C}(e^{-V(\varphi +\cdot )} F)}{\mathbb {E}_{C}(e^{-V(\varphi +\cdot )})} = \frac{\mathbb {E}_{C}(e^{-V(\varphi +\cdot )} F)}{e^{-V_+(\varphi )}} \end{aligned}$$
(2.29)

where we will often write \(\mathbb {E}_{\mu _\varphi }\) for the conditional expectation \(\mathbb {E}_{\mu }(\cdot \,|\varphi )\) to keep the notation concise. Then, using (2.15),

$$\begin{aligned} {{\mathbb {E}}} _{\mu }(F) = \frac{1}{Z_{j+1}} \mathbb {E}_{C_>} \Bigl ({ e^{- V_{+} ( \varphi )} \; {{\mathbb {E}}} _{\mu } ( F | \varphi ) }\Bigr ) = {{\mathbb {E}}} _{\mu _+} \Bigl ({ {{\mathbb {E}}} _{\mu } ( F | \varphi ) }\Bigr ), \end{aligned}$$
(2.30)

where \(Z_{j+1}\) is a normalising constant.

To prove Theorem 2.1, we write using the conditional expectation,

$$\begin{aligned} \mathbb {E}_{\mu } (F^2) - \mathbb {E}_{\mu }(F)^2 = \mathbb {E}_{\mu _+} \Bigl ({ \mathbb {E}_{\mu } ({ F(\varphi +\zeta )^2 | \varphi }) }\Bigr ) - \mathbb {E}_{\mu _{+}} \Bigl ({ \mathbb {E}_{\mu } ({ F (\varphi +\zeta ) | \varphi }) }\Bigr )^2 = {{\mathbb {A}}} _1 + {{\mathbb {A}}} _2, \end{aligned}$$
(2.31)

with

$$\begin{aligned} {{\mathbb {A}}} _1&= \mathbb {E}_{\mu _{+}} \Bigl ({ \mathbb {E}_\mu ({ F ({\varphi +\zeta })^2 | \varphi }) - \mathbb {E}_\mu ({ F ({\varphi + \zeta }) | \varphi })^2 }\Bigr ) , \end{aligned}$$
(2.32)
$$\begin{aligned} {{\mathbb {A}}} _2&= \mathbb {E}_{\mu _{+}} \Bigl ({ \mathbb {E}_\mu ({ F ({ \varphi +\zeta }) | \varphi })^2 }\Bigr ) - \mathbb {E}_{\mu _{+}} \Bigl ({ \mathbb {E}_\mu ({ F ({ \varphi +\zeta }) | \varphi }) }\Bigr )^2 . \end{aligned}$$
(2.33)

In the remainder of this section, we will bound each term separately thanks to the following lemmas.

Lemma 2.5

Assume (A1). Then for any function F with gradient in \(L^2(\mu )\), one has

$$\begin{aligned} {{\mathbb {A}}} _1 \leqslant \mathbb {E}_\mu \left( \nabla F(\varphi ) \frac{C}{1 -\varepsilon } \nabla F(\varphi ) \right) . \end{aligned}$$
(2.34)

Lemma 2.6

Assume (A1)–(A3) and that \(\mu _+\) satisfies the Brascamp–Lieb inequality (2.19). Then for any function F with gradient in \(L^2(\mu )\), one has

$$\begin{aligned} {{\mathbb {A}}} _2 \leqslant \mathbb {E}_\mu \left( \nabla F(\varphi ) \frac{D_+}{(1 - \varepsilon )^2} \nabla F(\varphi ) \right) . \end{aligned}$$
(2.35)

Proof of Theorem 2.1

For \(j<N\), the proof is immediate by combining the decomposition (2.31) and the previous two lemmas. For \(j=N\), the claim follows directly from Lemma 2.5 only.    \(\square \)

2.4.1 Proof of Lemma 2.5

From now on, we freeze the block spin field \(\varphi \in X_+\). Then the conditional measure \(\mu _\varphi = \mu ( \,\cdot \, |\varphi )\) is a probability measure on the space X, the image of C [see (2.6) in the hierarchical case]. As a subspace of the Euclidean vector space \(X_0\), the space X has an induced inner product which we also denote by \((\cdot ,\cdot )\), and an induced surface measure, which agrees with the Lebesgue measure in the dimension of X. The measure \(\mu _\varphi \) has density proportional to \(e^{-H_\varphi (\zeta )}\) with respect to this measure, where

$$\begin{aligned} H_\varphi (\zeta ) = \frac{1}{2} (\zeta , C^{-1} \zeta ) + V(\varphi +\zeta ). \end{aligned}$$
(2.36)

(By definition of the subspace X we can regard C as an invertible symmetric operator \(X \rightarrow X\).) For a function \(F: X_0 \rightarrow \mathbb {R}\) and \(\varphi \in X_0\), the function \(F_\varphi : X \rightarrow \mathbb {R}\) is defined by \(F_\varphi (\zeta ) = F(\varphi +\zeta )\).

Lemma 2.7

Assume (A1). Then for all \(\varphi \in X_+\), the conditional measure \(\mu _\varphi \) satisfies the Brascamp–Lieb inequality

$$\begin{aligned} \mathbb {E}_{\mu _\varphi } (F_\varphi (\zeta )^2) - \mathbb {E}_{\mu _\varphi } ( F_\varphi (\zeta ))^2 \leqslant \mathbb {E}_{\mu _\varphi } \left( \big (\nabla _X F_\varphi (\zeta ), \frac{C}{1 -\varepsilon }\nabla _X F_\varphi (\zeta ) \big ) \right) . \end{aligned}$$
(2.37)

Proof

As a consequence of Assumption (2.16) and of the definition of the space X, the Hamiltonian \(H_\varphi \) associated with \(\mu _\varphi \) is strictly convex on X, with

$$\begin{aligned} {{\,\mathrm{Hess}\,}}_{X} H_\varphi&= C^{-1} + {{\,\mathrm{Hess}\,}}_X V_\varphi \\&= C^{-1/2}(\mathrm {id}+ C^{1/2} {{\,\mathrm{Hess}\,}}V_\varphi C^{1/2})C^{-1/2} \geqslant (1-\varepsilon ) C^{-1}, \end{aligned}$$

where we used that C is invertible on X and that \(QC = CQ= C\). The Brascamp–Lieb inequality (A.4) then implies the claimed inequality.    \(\square \)

Proof of Lemma 2.5

The term \({{\mathbb {A}}} _1\) is a variance under the conditional measure \(\mu _\varphi \). By Lemma 2.7, the measure satisfies the Brascamp–Lieb inequality (2.37). Therefore

$$\begin{aligned} {{\mathbb {A}}} _1&=\mathbb {E}_{\mu _{+}} \Big ( \mathbb {E}_{\mu _\varphi }( F_\varphi (\zeta )^2) - \mathbb {E}_{\mu _\varphi }( F_\varphi ( \zeta ))^2 \Big ) \nonumber \\&\leqslant \mathbb {E}_{\mu _{+}} \Big ( \mathbb {E}_{\mu _\varphi } \Big (\nabla _X F_\varphi (\zeta ) \frac{C}{1 -\varepsilon } \nabla _X F_\varphi (\zeta ) \Big ) \Big ) = \mathbb {E}_\mu \left( \nabla F(\varphi ) \frac{C}{1 -\varepsilon } \nabla F(\varphi ) \right) . \end{aligned}$$
(2.38)

In the last equality we used that \(CQ=C\) by definition of Q as the orthogonal projection onto the image of C so that \(\nabla _X\) can be replaced by \(\nabla \).    \(\square \)

2.4.2 Proof of Lemma 2.6

The second term \({{\mathbb {A}}} _2\) in (2.33) is a variance under \(\mu _{+}\):

$$\begin{aligned} {{\mathbb {A}}} _2 = \mathbb {E}_{\mu _{+}} \Big ( {\tilde{F}}(\varphi )^2 \Big ) - \mathbb {E}_{\mu _{+}} \Big ({\tilde{F}}(\varphi )\Big )^2, \quad {\tilde{F}}(\varphi ) = \mathbb {E}_{\mu _\varphi }( F_\varphi (\zeta )). \end{aligned}$$
(2.39)

Using Assumption (2.19) that the measure \(\mu _{+}\) satisfies a Brascamp–Lieb inequality, we have

$$\begin{aligned} {{\mathbb {A}}} _2 \leqslant \mathbb {E}_{\mu _{+}} \Big ( \Vert D_+^{1/2} \nabla \tilde{F}(\varphi )\Vert _2^2 \Big ) = \mathbb {E}_{\mu _{+}} \Big ( \Vert D_+^{1/2} \nabla _{X_+} \mathbb {E}_{\mu _\varphi }( F(\varphi + \zeta ))\Vert _2^2 \Big ) \, , \end{aligned}$$
(2.40)

where \(\nabla _{X_+}\) applies to the variable \(\varphi \) and \(\Vert f\Vert _2^2 = \sum _{x\in \Lambda } |f_x|^2\).

We first state a technical lemma.

Lemma 2.8

Assume (A3). For \({\dot{\varphi }} \in X_+\),

$$\begin{aligned} ({\dot{\varphi }}, \nabla _{X_+} {\tilde{F}}(\varphi )) = ({\dot{\varphi }}, \nabla _{X_+} \mathbb {E}_{\mu _\varphi } ( F ( \varphi + \zeta ) )) = {{\,\mathrm{Cov}\,}}_{\mu _\varphi }( F(\varphi +\zeta ) , \, ({\dot{\varphi }}, C^{-1}\zeta )). \end{aligned}$$
(2.41)

Proof

The derivative applies only to the block spin field \(\varphi \). We write \(\nabla _\varphi \) for \(\nabla _{X_+}\) with respect to the variable \(\varphi \) and \(\nabla _\zeta \) for \(\nabla _{X}\) with respect to the variable \(\zeta \). Using the notation (2.36),

$$\begin{aligned} ({\dot{\varphi }}, \nabla _{\varphi } \mathbb {E}_{\mu _\varphi } \left( F \big ( \varphi + \zeta \big ) \right) )&= \mathbb {E}_{\mu _\varphi } \left( ({\dot{\varphi }}, \nabla _{\varphi } F \big ( \varphi + \zeta \big ) ) \right) \nonumber \\&\quad - {{\,\mathrm{Cov}\,}}_{\mu _\varphi } \left( F \big ( \varphi + \zeta \big ) \, , \, ({\dot{\varphi }},\nabla _{\varphi } H_\varphi (\zeta )) \right) \nonumber \\&= \mathbb {E}_{\mu _\varphi } \left( ({\dot{\varphi }},\nabla _{\zeta } F \big ( \varphi + \zeta \big ) ) \right) \nonumber \\&\quad - {{\,\mathrm{Cov}\,}}_{\mu _\varphi } \left( F \big ( \varphi + \zeta \big ) \, , \, ({\dot{\varphi }},\nabla _\zeta V \big ( \varphi + \zeta \big )) \right) , \end{aligned}$$
(2.42)

where in the last term we used that, since \({\dot{\varphi }} \in X_+\),

$$\begin{aligned} ({\dot{\varphi }},\nabla _{\varphi } F) = ({\dot{\varphi }}, \nabla _{\zeta } F), \qquad ({\dot{\varphi }},\nabla _{\varphi } H_\varphi ) = ({\dot{\varphi }}, \nabla _{\zeta } V). \end{aligned}$$
(2.43)

By integration by parts, we get also that

$$\begin{aligned} \mathbb {E}_{\mu _\varphi } \left( \nabla _\zeta F \big ( \varphi + \zeta \big ) \right) = \mathbb {E}_{\mu _\varphi } \left( F(\varphi + \zeta ) \nabla _\zeta H_{\varphi }(\zeta ) \right) . \end{aligned}$$
(2.44)

Using this relation and (2.18), we get that for any \(\zeta \in X\),

$$\begin{aligned} ({\dot{\varphi }},\nabla _{\zeta } H_{ \varphi } (\zeta )) = ({\dot{\varphi }}, \nabla _{\zeta } \frac{1}{2} (\zeta , C^{-1} \zeta )) + ({\dot{\varphi }}, \nabla _{\zeta } V(\varphi + \zeta )) = ({\dot{\varphi }}, C^{-1} \zeta ) + ({\dot{\varphi }}, \nabla _\zeta V(\varphi +\zeta )), \end{aligned}$$
(2.45)

and therefore

$$\begin{aligned} \mathbb {E}_{\mu _\varphi } \left( ({\dot{\varphi }}, \nabla _\zeta F \big ( \varphi + \zeta \big ))\right) = \mathbb {E}_{\mu _\varphi } \left( F(\varphi + \zeta ) ({\dot{\varphi }}, C^{-1} \zeta ) \right) + \mathbb {E}_{\mu _\varphi } \left( F(\varphi + \zeta ) ({\dot{\varphi }}, \nabla _\zeta V(\varphi +\zeta )) \right) . \end{aligned}$$
(2.46)

The last equality applied to \(F=1\) implies that (as an identity between elements of \(X_+\))

$$\begin{aligned} \mathbb {E}_{\mu _\varphi } \left( ({\dot{\varphi }}, \nabla _{\zeta } V(\varphi +\zeta )) \right) = - \mathbb {E}_{\mu _\varphi }(({\dot{\varphi }},C^{-1}\zeta )) . \end{aligned}$$
(2.47)

Thus (2.42) becomes

$$\begin{aligned} ({\dot{\varphi }}, \nabla _{\varphi } \mathbb {E}_{\mu _\varphi } \left( F \big ( \varphi + \zeta \big ) \right) ) = {{\,\mathrm{Cov}\,}}_{\mu _\varphi } \left( F \big ( \varphi + \zeta \big ) , ({\dot{\varphi }}, C^{-1} \zeta ) \right) , \end{aligned}$$
(2.48)

as claimed.    \(\square \)
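For orientation, here is the one-dimensional instance of this computation (a sketch under the assumption \(n = |\Lambda | = 1\), so that \(C = c > 0\) is a scalar): with \(H_\varphi (\zeta ) = \zeta ^2/(2c) + V(\varphi +\zeta )\), differentiating under the integral and integrating by parts as in (2.44) gives

$$\begin{aligned} \partial _\varphi \, \mathbb {E}_{\mu _\varphi }(F(\varphi +\zeta )) = \mathbb {E}_{\mu _\varphi }(F'(\varphi +\zeta )) - {{\,\mathrm{Cov}\,}}_{\mu _\varphi }\big (F(\varphi +\zeta ), V'(\varphi +\zeta )\big ) = {{\,\mathrm{Cov}\,}}_{\mu _\varphi }\big (F(\varphi +\zeta ), \zeta /c\big ), \end{aligned}$$

since \(\mathbb {E}_{\mu _\varphi }(F') = \mathbb {E}_{\mu _\varphi }(F \cdot (\zeta /c + V'))\) and, taking \(F = 1\), \(\mathbb {E}_{\mu _\varphi }(\zeta /c + V') = 0\).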

Lemma 2.9

Assume (A1)–(A3). Then for \(\varphi \) in \(X_+\),

$$\begin{aligned} \Vert D_+^{1/2} \nabla _{X_+} \mathbb {E}_{\mu _\varphi } \left( F(\varphi + \zeta ) \right) \Vert _2^2\leqslant & {} \mathbb {E}_{\mu _\varphi } \left( \Vert \frac{D_+^{1/2}}{1-\varepsilon }\nabla _{X_+} F( \varphi + \zeta )\Vert _2^2 \right) \nonumber \\= & {} \mathbb {E}_{\mu _\varphi } \left( \Vert \frac{D_+^{1/2}}{1-\varepsilon }\nabla F( \varphi + \zeta )\Vert _2^2 \right) . \end{aligned}$$
(2.49)

Applying the expectation \(\mathbb {E}_{\mu _+}( \cdot )\) to both sides and substituting the result into (2.40) completes the proof of Lemma 2.6.

Proof of Lemma 2.9

The block spin field \(\varphi \in X_+\) is fixed and in the proof we study the measure \(\mu _\varphi \) on the subspace X. We define \(L_\varphi \) to be the self-adjoint generator of the Glauber dynamics for the conditional measure \(\mu _\varphi \) on X, i.e.,

$$\begin{aligned} L_\varphi F (\zeta ) = -\Delta _{X} F (\zeta ) + (\nabla _{X} H_\varphi (\zeta ),\nabla _X F (\zeta )); \end{aligned}$$
(2.50)

see also Appendix A. Moreover, we define the Witten Laplacian \({\mathcal {L}}_\varphi \) on \(L^2(\mu _\varphi ) \otimes X\) by

$$\begin{aligned} {\mathcal {L}}_\varphi = L_\varphi \otimes \mathrm {id}_X + {{\,\mathrm{Hess}\,}}_{X} H_\varphi . \end{aligned}$$
(2.51)

Using the Helffer–Sjöstrand representation (Theorem A.1), one can rewrite the correlations (2.41) under the conditional measure in terms of the operator \({\mathcal {L}}_\varphi \) as

$$\begin{aligned} ({\dot{\varphi }}, \nabla _{X_+} \mathbb {E}_{\mu _\varphi }( F ( \varphi + \zeta )))&= {{\,\mathrm{Cov}\,}}_{\mu _\varphi }( F(\varphi +\zeta ) , \, (C^{-1}\zeta , {\dot{\varphi }})) \nonumber \\&= \mathbb {E}_{\mu _\varphi }(\nabla _{X} (C^{-1} \zeta ,{\dot{\varphi }}) , {\mathcal {L}}_\varphi ^{-1} \, \nabla _{X} F ( \varphi + \zeta )) \nonumber \\&= (C^{-1} {\dot{\varphi }}, \mathbb {E}_{\mu _\varphi }({\mathcal {L}}_\varphi ^{-1} \, \nabla _{X} F ( \varphi + \zeta )))\nonumber \\&= ({\dot{\varphi }}, \mathbb {E}_{\mu _\varphi }(C^{-1}{\mathcal {L}}_\varphi ^{-1} \, \nabla _{X} F ( \varphi + \zeta ))). \end{aligned}$$
(2.52)

This is an identity in \(X_+\) which can be rewritten by using the projection \(Q_+\) as

$$\begin{aligned} \nabla _{X_+} \mathbb {E}_{\mu _\varphi }( F ( \varphi + \zeta )) = \mathbb {E}_{\mu _\varphi }(Q_+ C^{-1}{\mathcal {L}}_\varphi ^{-1} \, \nabla _{X} F ( \varphi + \zeta )). \end{aligned}$$
(2.53)

Composing by \(D_+^{1/2}\) and using that \(D_+ = D_+ Q_+\) by (2.17), we deduce that

$$\begin{aligned} D_+^{1/2} \nabla _{X_+} \mathbb {E}_{\mu _\varphi }( F ( \varphi + \zeta )) = \mathbb {E}_{\mu _\varphi }( M_\varphi \, \nabla _{X} F ( \varphi + \zeta )) \end{aligned}$$
(2.54)

where the operator \(M_\varphi \) is defined as

$$\begin{aligned} M_\varphi = D_+^{1/2}C^{-1} {\mathcal {L}}_\varphi ^{-1}. \end{aligned}$$
(2.55)

Since \(D_+\) commutes with C and with \({\mathcal {L}}_\varphi C\) by (2.18), the operator \(M_\varphi \) acts on \(L^2(\mu _\varphi ) \otimes X\) and is self-adjoint. From (2.54) and the Cauchy–Schwarz inequality, we finally obtain

$$\begin{aligned} \Vert D_+^{1/2} \nabla _{X_+} \mathbb {E}_{\mu _\varphi }(F(\varphi +\zeta ))\Vert _2^2 \leqslant \mathbb {E}_{\mu _\varphi } \Bigl ({ \Vert M_\varphi \nabla _{X} F(\varphi +\zeta )\Vert _2^2 }\Bigr ), \end{aligned}$$
(2.56)

where \(\Vert f\Vert _2^2 = (f,f)\) and \(\nabla _{X_+}\) applies to \(\varphi \) and \(\nabla _X\) applies to \(\zeta \). In the following, we will show that the operator \(M_\varphi \) obeys the following form inequality on \(L^2(\mu _\varphi )\otimes X\):

$$\begin{aligned} M_\varphi \leqslant (1-\varepsilon )^{-1} D_+^{1/2}, \end{aligned}$$
(2.57)

which then concludes the proof of the lemma. Recall that the operator \({\mathcal {L}}_\varphi \) is defined by

$$\begin{aligned} {\mathcal {L}}_\varphi = L_\varphi \otimes \mathrm {id}_{X} + {{\,\mathrm{Hess}\,}}_X H_\varphi = L_\varphi \otimes \mathrm {id}_{X} + {{\,\mathrm{Hess}\,}}_X V(\varphi +\zeta ) + C^{-1}. \end{aligned}$$
(2.58)

Under Assumption (2.18), we can write

$$\begin{aligned} ({{\,\mathrm{Hess}\,}}_X V)C = C^{1/2} ({{\,\mathrm{Hess}\,}}_X V) C^{1/2}. \end{aligned}$$
(2.59)

Using that \(L_\varphi \) and C are positive operators and using Assumption (2.16), it follows that, as operators on \(L^2(\mu _\varphi ) \otimes X\),

$$\begin{aligned} {\mathcal {L}}_\varphi C = C^{1/2} {\mathcal {L}}_\varphi C^{1/2} = L_\varphi \otimes C + \mathrm {id}_X + C^{1/2}({{\,\mathrm{Hess}\,}}_X V (\varphi +\zeta ) )C^{1/2} \geqslant (1 -\varepsilon ) Q . \end{aligned}$$
(2.60)

Finally, using that \(D_+=D_+Q\) by Assumption (2.17), and using (2.18), it follows that \(M_\varphi \) satisfies the desired form bound

$$\begin{aligned} M_\varphi \leqslant (1-\varepsilon )^{-1} D_+^{1/2}. \end{aligned}$$
(2.61)

This completes the proof.    \(\square \)

3 Hierarchical \(|\varphi |^4\) Model

In this section, we apply Corollaries 2.2 and 2.3 to the hierarchical \(|\varphi |^4\) model. Throughout this section, the dimension is fixed to be \(d=4\). Nevertheless, we sometimes write d to emphasise that a factor 4 arises from the dimension \(d=4\) rather than from the exponent of \(|\varphi |^4\).

3.1 Renormalisation group flow

For \(m^2>0\) (to be determined in Theorem 3.1 as a function of g and \(\nu \)), we decompose

$$\begin{aligned} (-\Delta _H+m^2)^{-1} = C_0 + \cdots + C_N, \end{aligned}$$
(3.1)

as in (2.3), and define the renormalised potential with respect to this decomposition as in (2.14),

$$\begin{aligned} e^{-V_{j+1}(\varphi )} = \mathbb {E}_{C_{j}} \left( { e^{-V_j(\varphi +\zeta )} }\right) . \end{aligned}$$
(3.2)

Note in particular that the sequence of renormalised potentials depends on the choice of \(m^2\), and that \(C_j \leqslant \vartheta _j^2 L^{2j} Q_j\), where we define \(\vartheta _j = 2^{-(j-j_m)_+}\) with the mass scale \(j_m = \lfloor {\log _L m^{-1}} \rfloor \). As a consequence of the hierarchical structure, the renormalised potential can be written as

$$\begin{aligned} V_j(\varphi ) = \sum _{B\in {\mathcal {B}}_j} V_j(B,\varphi ), \end{aligned}$$
(3.3)

where \(V_j(B,\varphi )\) is a function of \(\varphi \) that depends only on the restriction \(\varphi |_B\) for any block \(B \in {\mathcal {B}}_j\).

We always restrict the domain of the functions \(V_j(B)\) to the space \(X_j(B) \cong \mathbb {R}^n\) of fields that are constant on B. Explicitly, for a block \(B \in {\mathcal {B}}\), denote by \(i_B: \mathbb {R}^n \rightarrow \mathbb {R}^{nB}\) the linear map that sends \(\varphi \in \mathbb {R}^n\) to the constant field \(\varphi : B \rightarrow \mathbb {R}^n\) with \(\varphi _x = \varphi \) at every \(x \in B\). Then \(V_j(B) \circ i_B\) is a function of a single variable in \(\mathbb {R}^n\) induced by \(V_j(B)\). In particular using (2.10) one can view \(V_j(B)\) as a function in \(\mathbb {R}^{nB}\), so that for any \({{\dot{\varphi }}} \in X_j(B)\) taking the constant value \({{\dot{\varphi }}}_B \in \mathbb {R}^n\),

$$\begin{aligned} {{\dot{\varphi }}} ({{\,\mathrm{Hess}\,}}V_j(B)) {{\dot{\varphi }}} = {{\dot{\varphi }}_B} {{\,\mathrm{Hess}\,}}(V_j(B) \circ i_B){\dot{\varphi }}_B. \end{aligned}$$
(3.4)

If there is a constant \(s >0\) such that

$$\begin{aligned} \frac{1}{ |B| } {{\dot{\varphi }}_B} {{\,\mathrm{Hess}\,}}(V_j(B) \circ i_B) {\dot{\varphi }}_B \geqslant - s ({\dot{\varphi }}_B,{\dot{\varphi }}_B), \end{aligned}$$
(3.5)

then using that \(({\dot{\varphi }},{\dot{\varphi }}) =|{\dot{\varphi }}_B|^2|B|\), we deduce

$$\begin{aligned} {{\dot{\varphi }}} ({{\,\mathrm{Hess}\,}}V_j(B)) {{\dot{\varphi }}} \geqslant - s ({\dot{\varphi }} ,{\dot{\varphi }} ). \end{aligned}$$
(3.6)

With the notation (2.11), the inequalities (3.5) and \(C_j \leqslant \vartheta _j^2 L^{2j} Q_j\), it follows that

$$\begin{aligned} C_j^{1/2}({{\,\mathrm{Hess}\,}}_{X_j} V_j)C_j^{1/2} \geqslant - s \vartheta _j^2 L^{2j} Q_j. \end{aligned}$$
(3.7)

Thus, in the hierarchical model, Assumption (A1) in (2.16) with \(\varepsilon _j = s \vartheta _j^2 L^{2j}\) follows from (3.5). In the rest of this section, we therefore reduce to the study of the function \(V_j(B) \circ i_B\) in \(\mathbb {R}^n\).

The renormalisation group for the \(|\varphi |^4\) model provides precise estimates on the renormalised potential \(V_j\) when the field \(\varphi \) is not too large. The following theorem about the renormalisation group flow is proved in [9]. Note that \(V_j\) in (3.2) is the full renormalised potential (the logarithm of the density with respect to the Gaussian reference measure), not its leading contribution as in [9]. We will denote the latter instead by \({\hat{V}}_j\) as it plays a less central role in the arguments of this paper. It is determined by the coupling constants \((g_j,\nu _j) \in \mathbb {R}^2\) through

$$\begin{aligned} {\hat{V}}_j(B,\varphi ) = \sum _{x\in B} \left( { \frac{1}{4} g_j|\varphi _x|^4 + \frac{1}{2}\nu _j|\varphi _x|^2}\right) , \quad {\hat{W}}_j(B,\varphi ) = \sum _{x \in B} \left( { \frac{1}{6} \alpha _j g_j^2 |\varphi _x|^6}\right) , \end{aligned}$$
(3.8)

where \(\alpha _j = \alpha _j(m^2) = O(L^{2j}L^{-(j-j_m)_+})\) is an explicit (j-dependent) constant and \(j_m = \lfloor {\log _L m^{-1}} \rfloor \) is the mass scale. We stress the fact that if the field is constant on B then

$$\begin{aligned} {\hat{V}}_j(B) \circ i_B (\varphi ) = |B| \left( { \frac{1}{4} g_j|\varphi |^4 + \frac{1}{2}\nu _j|\varphi |^2}\right) , \quad {\hat{W}}_j(B) \circ i_B (\varphi ) = |B| \left( { \frac{1}{6} \alpha _j g_j^2 |\varphi |^6}\right) , \end{aligned}$$
(3.9)

so that in the following we will often consider the effective potential normalised by the factor 1 / |B| [see also (3.5)].
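To indicate how (3.5) will be verified, consider the small-field heuristic for \(n = 1\), using only the leading part \({\hat V}_j\) from (3.8) rather than the full potential \(V_j\) (for which the large field region is the real difficulty):

$$\begin{aligned} \frac{1}{|B|} {{\,\mathrm{Hess}\,}}({\hat{V}}_j(B) \circ i_B)(\varphi ) = 3 g_j \varphi ^2 + \nu _j \geqslant \nu _j , \end{aligned}$$

so that, insofar as \(V_j\) is well approximated by \({\hat V}_j\), the bound (3.5) holds with s of order \(|\nu _j|\), which by (3.12) is \(O(\vartheta _j g_j L^{-2j})\); the resulting \(\varepsilon _j = s \vartheta _j^2 L^{2j} = O(\vartheta _j^3 g_j)\) is then small at every scale.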

For the statement of the theorem, define the fluctuation-field scale \(\ell _j\) and the large-field scale \(h_j\) by

$$\begin{aligned} \ell _j = L^{-(d-2) j/2} = L^{-j}, \qquad h_j = L^{-dj/4}g_j^{-1/4} = L^{-j} g_j^{-1/4}. \end{aligned}$$
(3.10)

Finally, we define \({\mathcal {F}}_j\) as follows: \(F \in {\mathcal {F}}_j\) if for any \(B\in {\mathcal {B}}_j\) there is a function \(\varphi \in \mathbb {R}^{n\Lambda } \mapsto F(B,\varphi )\) such that (i) \(F(B,\varphi )\) depends only on the average of \(\varphi \) over the block B; (ii) the function \(F(B) \circ i_B\) is the same for every block B; and (iii) the function F(B) is invariant under rotations, i.e., \(F(B,\varphi ) = F(B,T\varphi )\) for any \(T \in O(n)\) acting on \(\varphi \in \mathbb {R}^{n\Lambda }\) by \((T\varphi )_x = T\varphi _x\); see [9, Definition 5.1.5].

Theorem 3.1

Let \(L \geqslant L_0\). For any \(g>0\) small enough, there exists \(\nu _c(g) = -C(n+2)g + O(g^2)\) such that for \(\nu > \nu _c(g)+cL^{-2N}\), there exists \(m^2 > 0\), a sequence of coupling constants \((g_j,\nu _j, u_j) \subset \mathbb {R}^3\), and \({\hat{K}}_j \in {\mathcal {F}}_j\) such that the following are true.

  1. 1.

    The full renormalised potential \(V_j\) defined by (3.2) satisfies: for all \(\varphi \) that are constant on B,

    $$\begin{aligned} e^{-V_j(B,\varphi )} = e^{-u_j|B|}(e^{-{\hat{V}}_j(B,\varphi )}(1+{\hat{W}}_j(B,\varphi )) + {\hat{K}}_j(B,\varphi )). \end{aligned}$$
    (3.11)
  2. 2.

    The sequence \((g_j,\nu _j)\) of coupling constants satisfies \((g_0,\nu _0)=(g,\nu -m^2)\), and

    $$\begin{aligned} g_{j+1} = g_j - \beta _j g_j^2 + O(2^{-(j-j_m)_+}g_j^3), \qquad 0 \geqslant L^{2j}\nu _j = O(2^{-(j-j_m)_+}g_j), \end{aligned}$$
    (3.12)

    where \(\beta _j = \beta _0^0(1+m^2L^{2j})^{-2}\) for an absolute constant \(\beta _0^0>0\) and \(j_m = \lfloor {\log _Lm^{-1}} \rfloor \).

  3.

    The functions \({\hat{K}}_j\) satisfy \({\hat{K}}_0=0\) and

    $$\begin{aligned} \sup _{\varphi \in \mathbb {R}^n} \max _{0\leqslant \alpha \leqslant 3} h_j^{\alpha } |\nabla ^\alpha ({\hat{K}}_j(B) \circ i_B)(\varphi )|&= O(2^{-(j-j_m)_+}g_j^{3/4}), \end{aligned}$$
    (3.13)
    $$\begin{aligned} \max _{0\leqslant \alpha \leqslant 3} \ell _j^{\alpha } |\nabla ^\alpha ({\hat{K}}_j(B) \circ i_B)(0)|&= O(2^{-(j-j_m)_+}g_j^{3}), \end{aligned}$$
    (3.14)

    where \(\ell _j = L^{-j}\) and \(h_j = L^{-j} g_j^{-1/4}\).

  4.

    The relation between \(t = \nu - \nu _c(g) >0\) and \(m^2>0\) satisfies, as \(t \downarrow 0\),

    $$\begin{aligned} m^2 \sim C_g t(\log t^{-1})^{-(n+2)/(n+8)}. \end{aligned}$$
    (3.15)

In the above theorem and everywhere else, the error terms \(O(\cdot )\) are uniform in the scale j. The theorem is mainly proved and explained in [9]. For our application to the analysis of the spectral gap of the Glauber dynamics, it is however more convenient to use a slightly different organisation: we use the decomposition (2.3) instead of the decomposition (2.2) used in [9]. We translate between the conventions of [9] and those in the statement of Theorem 3.1 in Appendix B, where we also give precise references.

We remark that the normalising constants \(u_j\) are unimportant for our purposes, and that the recursion (3.12) implies that, as \(m^2 \downarrow 0\),

$$\begin{aligned} g_j^{-1} = O(g_{j_m}^{-1}), \qquad g_{j_m}^{-1} \sim \beta _0^0 \log m^{-1}; \end{aligned}$$
(3.16)

see [9, Proposition 6.1.3].
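To make the logarithmic behaviour (3.16) concrete, the following minimal numerical sketch iterates the leading part of the recursion (3.12). The dropped higher-order terms and the values of `L`, `beta0`, `g0` are illustrative assumptions, not constants from [9].

```python
# A minimal sketch of the flow (3.12), keeping only the leading term
# g_{j+1} = g_j - beta_j g_j^2 with beta_j = beta0/(1 + m^2 L^{2j})^2.
# L, beta0, g0 are illustrative choices, not constants from [9].
import math

L, beta0, g0 = 2, 1.0, 0.02

def g_at_mass_scale(m2):
    jm = math.floor(math.log(1.0 / math.sqrt(m2)) / math.log(L))  # mass scale j_m
    g = g0
    for j in range(jm):
        g -= beta0 / (1.0 + m2 * L ** (2 * j)) ** 2 * g * g
    return jm, g

for m2 in [1e-4, 1e-8, 1e-12]:
    jm, g = g_at_mass_scale(m2)
    # 1/g_{j_m} grows linearly in j_m = O(log m^{-1}), consistent with (3.16)
    print(f"m^2={m2:.0e}  j_m={jm:3d}  1/g_jm={1 / g:7.2f}  "
          f"1/g0+beta0*j_m={1 / g0 + beta0 * jm:7.2f}")
```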

A variant of the theorem implies the following asymptotic behaviour of the susceptibility as the critical point is approached.

Corollary 3.2

Let \(F= \sum _x\varphi _x^{1}\). Then for \(t = \nu -\nu _c \geqslant c L^{-2 N}\),

$$\begin{aligned} \frac{ {{\,\mathrm{Var}\,}}_\mu (F)}{|\Lambda _N|} = \frac{1}{m^2} \left( {1+o\left( {\frac{1}{L^{2N}m^2}}\right) }\right) \sim C_g \frac{1}{t}(-\log t)^{(n+2)/(n+8)}, \end{aligned}$$
(3.17)

where the error term tends to 0 as \(L^{2N}m^2 \rightarrow \infty \), and \({{\,\mathrm{Var}\,}}_\mu \) denotes the variance under the full \(|\varphi |^4\) measure as in (1.2).

Indeed, the corollary is [9, Theorem 5.2.1 and (6.2.17)], noting that \({{\,\mathrm{Var}\,}}_\mu (F)/|\Lambda _N|\) is the finite volume susceptibility studied there. The corollary provides the upper bound in Theorem 1.1 since, with F as defined in the corollary,

$$\begin{aligned} \frac{(\nabla F,\nabla F)}{|\Lambda _N|} = 1, \end{aligned}$$
(3.18)

and \(\gamma _N(g,\nu _c(g)) \leqslant \mathbb {E}_\mu (\nabla F,\nabla F)/{{\,\mathrm{Var}\,}}_\mu (F)\) for any F by definition of the spectral gap.
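The variational principle invoked here is elementary but worth keeping in mind; the following toy sketch (a generic reversible finite-state generator, unrelated to the dynamics of this paper) illustrates that any single test function bounds the spectral gap from above by its Rayleigh quotient.

```python
# Toy illustration of the variational bound used above: for a reversible
# generator, gamma = inf over nonconstant F of E(F)/Var(F), so each test
# function F gives an upper bound on the spectral gap.
import numpy as np

rng = np.random.default_rng(2)
n = 50
W = np.abs(rng.normal(size=(n, n)))
W = (W + W.T) / 2                     # symmetric rates: reversible w.r.t.
Lgen = W - np.diag(W.sum(axis=1))     # the uniform measure

gap = np.sort(np.linalg.eigvalsh(-Lgen))[1]   # smallest nonzero eigenvalue

F = rng.normal(size=n)
F -= F.mean()                         # project out the constants
rayleigh = (F @ (-Lgen) @ F) / (F @ F)
print(gap <= rayleigh + 1e-12)        # True for every test function F
```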

3.2 Small field region

The bounds of Theorem 3.1 are effective for small fields \(|\varphi | \leqslant h_j\). For such fields \(\varphi \), the approximate effective potential \({\hat{V}}_j(\varphi )\) is a good approximation to \(V_j(\varphi )\). Indeed, then \(e^{{\hat{V}}_j(B,\varphi )} = e^{O(1)}\) and

$$\begin{aligned} V_j(B,\varphi )-{\hat{V}}_j(B,\varphi )&= -\log (1+{\hat{W}}_j(B,\varphi ) + e^{{\hat{V}}_j(B,\varphi )} {\hat{K}}_j(B,\varphi )) + u_j |B| \nonumber \\&= -{\hat{W}}_j(B,\varphi )- e^{{\hat{V}}_j(B,\varphi )} {\hat{K}}_j(B,\varphi ) + u_j |B| + O({\hat{W}}_j+e^{{\hat{V}}_j} {\hat{K}}_j)^2. \end{aligned}$$
(3.19)

Recall the abbreviation \(\vartheta _j= 2^{-(j-j_m)_+}\) where \(j_m = \lfloor \log _L m^{-1} \rfloor \) is the mass scale. By (3.12) and (3.13) and the definition of \({\hat{W}}\), uniformly in \(\varphi \in \mathbb {R}^n\) with \(|\varphi | \leqslant h_j\),

$$\begin{aligned} \max _{0\leqslant \alpha \leqslant 3} h_j^{\alpha } |\nabla ^\alpha ({\hat{W}}_j(B) \circ i_B)(\varphi )|&= O(\vartheta _jg_j^{2/4}), \end{aligned}$$
(3.20)
$$\begin{aligned} \max _{0\leqslant \alpha \leqslant 3} h_j^{\alpha } |\nabla ^\alpha (e^{{\hat{V}}_j(B)}{\hat{K}}_j(B) \circ i_B)(\varphi )|&= O(\vartheta _jg_j^{3/4}), \end{aligned}$$
(3.21)

and the remainder satisfies an analogous estimate. In particular, by (3.19),

$$\begin{aligned} {{\,\mathrm{Hess}\,}}(V_j(B) \circ i_B)(\varphi )&= {{\,\mathrm{Hess}\,}}(({\hat{V}}_j-{\hat{W}}_j)(B) \circ i_B)(\varphi ) + O(\vartheta _jh_j^{-2}g_j^{3/4})\mathrm {id}_{n} \nonumber \\&= {{\,\mathrm{Hess}\,}}(({\hat{V}}_j-{\hat{W}}_j)(B) \circ i_B)(\varphi ) + O(\vartheta _jL^{2j}g_j^{5/4}) \mathrm {id}_{n} , \end{aligned}$$
(3.22)

where \(\mathrm {id}_{n}\) is the identity matrix acting on the single-spin space \(\mathbb {R}^n\). The first term on the right-hand side can be computed explicitly from (3.8), which implies that as quadratic forms,

$$\begin{aligned} \frac{1}{|B|} {{\,\mathrm{Hess}\,}}({\hat{V}}_j(B) \circ i_B)(\varphi )&= ((g_j|\varphi |^2 +\nu _j)\mathrm {id}_n + 2g_j (\varphi ^k\varphi ^l)_{k,l}) \geqslant \big ( g_j|\varphi |^2 + \nu _j \big ) \mathrm {id}_{n}, \end{aligned}$$
(3.23)
$$\begin{aligned} \frac{1}{|B|}|{{\,\mathrm{Hess}\,}}({\hat{W}}_j(B) \circ i_B)(\varphi )|&\leqslant 5\alpha _j g_j^2(|\varphi |^4 \mathrm {id}_n + 2|\varphi |^2(\varphi ^k\varphi ^l)_{k,l}) \leqslant (15\alpha _j g_j^2|\varphi |^4) \mathrm {id}_n, \end{aligned}$$
(3.24)

where \(|B| =L^{dj}\), and where we used that the \(n\times n\) matrix \((\varphi ^k\varphi ^l)_{k,l}\) has eigenvalues 0 and \(|\varphi |^2 \geqslant 0\). Combining (3.22) with (3.23) and (3.24), we find that

$$\begin{aligned} \frac{1}{|B|} {{\,\mathrm{Hess}\,}}(V_j(B) \circ i_B)(\varphi ) \geqslant \Bigl ({g_j|\varphi |^2 + \nu _j - 15\alpha _j g_j^2 |\varphi |^4 - O(\vartheta _jL^{-2j}g_j^{5/4})}\Bigr ) \mathrm {id}_n. \end{aligned}$$
(3.25)

Using that \(\alpha _j g_j|\varphi |^2 = O(g_j^{1/2})\) for \(|\varphi | \leqslant h_j\) (since \(\alpha _j = O(L^{2j})\)), in summary, we have obtained the following corollary of Theorem 3.1.

Corollary 3.3

Suppose that \(V_0\) satisfies the conditions of Theorem 3.1. Then for all scales \(j\in \mathbb {N}\) and all \(\varphi \in \mathbb {R}^n\) with \(|\varphi |\leqslant h_j\), the effective potential satisfies the quadratic form bounds

$$\begin{aligned} \frac{1}{|B|}{{\,\mathrm{Hess}\,}}(V_j(B) \circ i_B)(\varphi ) \geqslant \Bigl ({g_j|\varphi |^2(1-O(g_j^{1/2})) + \nu _j - O(\vartheta _jL^{-2j}g_j^{5/4})}\Bigr )\mathrm {id}_n, \end{aligned}$$
(3.26)

with \(0 \leqslant -\nu _j = O(\vartheta _jL^{-2j}g_j)\), and furthermore

$$\begin{aligned} \frac{1}{|B|} \nabla (V_j(B) \circ i_B)(\varphi ) = g_j\varphi |\varphi |^2(1-O(g_j^{1/2})) + \nu _j\varphi + O(\vartheta _jL^{-3j} g_j). \end{aligned}$$
(3.27)

3.3 Large field region

Using the small field estimates as input, we are going to prove the following estimate for the large field region.

Theorem 3.4

Assume the conditions of Theorem 3.1, in particular that \(g>0\) is sufficiently small and that \(\nu > \nu _c(g) + cL^{-2N}\). Then for all \(j \in \mathbb {N}\) and all \(B \in {\mathcal {B}}_j\), the effective potential satisfies

$$\begin{aligned} L^{2j} \frac{1}{|B|} {{\,\mathrm{Hess}\,}}(V_j(B) \circ i_B) \geqslant \varepsilon _j \mathrm {id}_n \quad \text {for all }\varphi \in \mathbb {R}^n \text { with }|\varphi | \geqslant h_j, \end{aligned}$$
(3.28)

where the constants \(\varepsilon _j\) satisfy \(\varepsilon _{j+1} = {\bar{\varepsilon }}_j - O(\vartheta _j^2{\bar{\varepsilon }}_{j}^2)\) and \(\varepsilon _0= \frac{1}{5} g_0^{1/2}\) where \({\bar{\varepsilon }}_j = \varepsilon _j \wedge \frac{1}{5} g_j^{1/2}\).

To prove Theorem 1.1, we will only use the conclusion \(\varepsilon _j \geqslant 0\) from Theorem 3.4. However, in order to prove Theorem 3.4, it is convenient that the \(\varepsilon _j\) do not become too small. The elementary proof of the following estimate is given in Appendix B.

Lemma 3.5

The sequence \((\varepsilon _j)\) defined in Theorem 3.4 satisfies \(\varepsilon _j \geqslant c g_j\) for all \(j\in \mathbb {N}\).
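As a sanity check on the compatibility of Theorem 3.4 and Lemma 3.5, the two recursions can be iterated side by side numerically. In the sketch below the unspecified \(O(\cdot )\) constant is replaced by an illustrative value `c2`, and `beta0`, `g`, `m2` are likewise illustrative assumptions; the output suggests that \(\varepsilon _j/g_j\) indeed stays bounded below.

```python
# A minimal numerical sketch of the recursion of Theorem 3.4 run together
# with the flow (3.12); the O(.)-constant c2 and all parameters are
# illustrative assumptions, so this is a plausibility check, not a proof.
import math

L, beta0, g, c2, m2 = 2, 1.0, 0.02, 1.0, 1e-10
jm = math.floor(math.log(1.0 / math.sqrt(m2)) / math.log(L))
eps = 0.2 * math.sqrt(g)                      # eps_0 = g_0^{1/2}/5

for j in range(61):
    if j % 15 == 0:
        print(f"j={j:2d}  eps_j={eps:.6f}  g_j={g:.6f}  eps_j/g_j={eps / g:5.2f}")
    theta = 2.0 ** (-max(j - jm, 0))          # theta_j = 2^{-(j-j_m)_+}
    eps_bar = min(eps, 0.2 * math.sqrt(g))    # bar eps_j
    eps = eps_bar - c2 * theta ** 2 * eps_bar ** 2
    g -= beta0 / (1.0 + m2 * L ** (2 * j)) ** 2 * g * g
```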

We will prove Theorem 3.4 by induction in j. For \(j=0\), the estimate (3.28) can be checked directly from (3.23) and \(\nu \geqslant \nu _c(g) = -O(g)\), which imply that

$$\begin{aligned} \frac{1}{|B|} {{\,\mathrm{Hess}\,}}(V_0(B) \circ i_B) \geqslant (g|\varphi |^2 + \nu )\mathrm {id}_n \geqslant g(|\varphi |^2 - O(1))\mathrm {id}_n \geqslant (g^{1/2} - O(g)) \mathrm {id}_n. \end{aligned}$$
(3.29)

From the inductive assumption and Corollary 3.3, we can get the following bounds.

Lemma 3.6

Assume that (3.28) holds for some \(j\in \mathbb {N}\) and that \(\varepsilon _j \leqslant \frac{1}{4} g_j^{1/2} - O(g_j)\). Then

$$\begin{aligned} L^{2(j+1)} \frac{1}{|B|} {{\,\mathrm{Hess}\,}}(V_j(B)\circ i_B)&\geqslant \varepsilon _j \mathrm {id}_n \quad \hbox { for all}\ |\varphi | \geqslant \frac{1}{2} h_{j+1}, \end{aligned}$$
(3.30)
$$\begin{aligned} L^{2j} \frac{1}{|B|} {{\,\mathrm{Hess}\,}}(V_j(B)\circ i_B)&\geqslant -O(g_j)\mathrm {id}_n \quad \hbox { for all}\ \varphi . \end{aligned}$$
(3.31)

Proof

For \(|\varphi | \geqslant h_j\), the estimate (3.30) follows directly from the assumption (3.28) and the trivial bound \(L^2\varepsilon _j \geqslant \varepsilon _j\). Next we consider the case \(\frac{1}{2} h_{j+1} \leqslant |\varphi | \leqslant h_j\). By definition,

$$\begin{aligned} h_{j+1} = L^{-(j+1)} g_{j+1}^{-1/4} = L^{-(j+1)} g_j^{-1/4}(1+O(g_j)) = L^{-1}h_j (1+O(g_j)). \end{aligned}$$
(3.32)

Therefore (3.26) implies

$$\begin{aligned} L^{2(j+1)} \frac{1}{|B|} {{\,\mathrm{Hess}\,}}(V_j(B) \circ i_B)&\geqslant \Bigl ({g_j\bigl (\tfrac{1}{2} L^{j+1} h_{j+1}\bigr )^2 + \nu _j L^{2(j+1)} - O(g_j)}\Bigr )\mathrm {id}_n \nonumber \\&\geqslant \bigl (\tfrac{1}{4} g_j^{1/2} - O(L^2 g_j)\bigr )\mathrm {id}_n \geqslant \varepsilon _j \,\mathrm {id}_n . \end{aligned}$$
(3.33)

Similarly, using Corollary 3.3 for the small fields and the inductive assumption for the large fields, we have for all \(\varphi \) that

$$\begin{aligned} L^{2j} \frac{1}{|B|} {{\,\mathrm{Hess}\,}}(V_j(B)\circ i_B) \geqslant - O(g_j) \mathrm {id}_n, \end{aligned}$$
(3.34)

which implies (3.31). This completes the proof of Lemma 3.6.    \(\square \)

The following proposition now advances the induction and thus proves Theorem 3.4.

Proposition 3.7

Assume (3.30) and (3.31) with \(j<N\). For \(\varphi \in \mathbb {R}^n\) with \(|\varphi | \geqslant h_{j+1}\) and \(B_+ \in {\mathcal {B}}_{j+1}\),

$$\begin{aligned} L^{2(j+1)} \frac{1}{|B_+|} {{\,\mathrm{Hess}\,}}(V_{j+1}(B_+) \circ i_{B_+})(\varphi ) \geqslant (\varepsilon _j-O(\vartheta _j^2\varepsilon _j^2))\mathrm {id}_{n} . \end{aligned}$$
(3.35)

The proposition will be proved in the remainder of this section. Since the scale j will be fixed, we usually drop it and write \(+\) instead of \(j+1\). To set up notation, we fix a block \(B_+ \in {\mathcal {B}}_{+}\) and write \(V(B_+) = \sum _{B \in {\mathcal {B}}_j(B_+)} V(B)\). By the hierarchical structure, \({{\,\mathrm{Hess}\,}}V(B_+)\) is a block diagonal matrix indexed by the blocks \(B \in {\mathcal {B}}(B_+)\), and we will always restrict the domain to \(X_j(B_+)\), the space of fields constant inside the small blocks B. On this domain, \(V(B_+)\) can be identified with a function of \(L^d\) vector-valued variables while \(V_+(B_+)\) has domain \(X_{+}(B_+)\) and can be identified with a function of a single vector-valued variable. The covariance operator C and the projection Q operate naturally on \(X(B_+)=X_j(B_+)\) and can be identified with diagonal matrices indexed by blocks \(B \in {\mathcal {B}}(B_+)\); in particular, they are invertible on \(X(B_+)\). By the definition of \(V_+\) in (3.2), together with the hierarchical structure of C, it follows that

$$\begin{aligned} V_+(B_+, \varphi ) = -\log \mathbb {E}_{C}(e^{-V(B_+,\varphi +\zeta )}) = -\log \int _{X(B_+)} e^{-H_\varphi (\zeta )} \, d\zeta + \text {constant}, \end{aligned}$$
(3.36)

where (recall that here C denotes the restriction of C to \(X(B_+)\))

$$\begin{aligned} H_\varphi (\zeta ) = \frac{1}{2} (\zeta ,C^{-1}\zeta ) + V(B_+,\varphi +\zeta ) . \end{aligned}$$
(3.37)

By differentiating (3.36) we obtain, for \({\dot{\varphi }} \in X_+(B_+)\),

$$\begin{aligned} {\dot{\varphi }}{{\,\mathrm{Hess}\,}}V_+(B_+,\varphi ){\dot{\varphi }} = \langle {\dot{\varphi }}{{\,\mathrm{Hess}\,}}V(B_+,\varphi +\zeta ){\dot{\varphi }} \rangle _{H_\varphi } - {{\,\mathrm{Var}\,}}_{H_\varphi }(\nabla V(B_+,\varphi +\zeta ) \cdot {\dot{\varphi }}) \end{aligned}$$
(3.38)

where \(\langle \cdot \rangle _{H_\varphi }\) denotes the expectation of the probability measure with density \(e^{-H_\varphi }\) on \(X(B_+)\), and \(\nabla \) is the gradient in \(X(B_+)\), i.e., with respect to fields that are constants on scale-j blocks in \(B_+\).
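For orientation, (3.38) is the standard second-derivative formula for \(-\log \) of a Gaussian convolution, and it can be checked numerically in one dimension. The sketch below uses an arbitrary smooth test potential and a scalar covariance `c`; both are illustrative assumptions.

```python
# One-dimensional numerical check of the identity (3.38):
# Hess V_+ = <Hess V>_{H_phi} - Var_{H_phi}(grad V), for an arbitrary
# smooth test potential V and scalar covariance c (illustrative choices).
import numpy as np

c = 0.7
V = lambda t: 0.25 * t ** 4 - 0.3 * t ** 2
dV = lambda t: t ** 3 - 0.6 * t
d2V = lambda t: 3 * t ** 2 - 0.6

z, dz = np.linspace(-12.0, 12.0, 400001, retstep=True)
integral = lambda f: float(np.sum(f) * dz)

def Vplus(phi):                       # V_+(phi) = -log E_c e^{-V(phi+zeta)}
    return -np.log(integral(np.exp(-z ** 2 / (2 * c) - V(phi + z))))

phi, h = 0.4, 1e-4
lhs = (Vplus(phi + h) - 2 * Vplus(phi) + Vplus(phi - h)) / h ** 2

w = np.exp(-z ** 2 / (2 * c) - V(phi + z))
w /= integral(w)                      # density of <.>_{H_phi}
mean = lambda f: integral(f * w)
rhs = mean(d2V(phi + z)) - (mean(dV(phi + z) ** 2) - mean(dV(phi + z)) ** 2)
print(lhs, rhs)                       # agree up to discretisation error
```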

To estimate the right-hand side of the last equation, we need some information on the typical value of the fluctuation field \(\zeta \) under the expectation \(\langle \cdot \rangle _{H_\varphi }\). By assumption of the proposition, the bound (3.31) holds, and together with the definition of \(C = C_j\) in particular,

$$\begin{aligned} C^{1/2} {{\,\mathrm{Hess}\,}}V(B_+,\zeta )C^{1/2} \geqslant -\frac{1}{2} Q \quad \hbox { for all}\ \zeta \in X(B_+), \end{aligned}$$
(3.39)

as an operator on \(X(B_+)\), i.e., \(\zeta \) is a constant on every \(B \in {\mathcal {B}}(B_+)\). Therefore, uniformly in \(\zeta \),

$$\begin{aligned} C^{1/2}{{\,\mathrm{Hess}\,}}H_\varphi (B_+,\zeta )C^{1/2} = Q + C^{1/2} {{\,\mathrm{Hess}\,}}V(B_+,\varphi +\zeta ) C^{1/2} \geqslant \frac{1}{2} Q. \end{aligned}$$
(3.40)

For any \(\varphi \), the action \(H_\varphi \) is therefore strictly convex on \(X(B_+)\) and, in particular, it has a unique minimiser in this space. We denote this minimiser by \(\zeta ^0\). It satisfies the Euler–Lagrange equation

$$\begin{aligned} \zeta ^0 = - C \nabla V(B_+,\varphi +\zeta ^0). \end{aligned}$$
(3.41)

Here recall the definition \(V(B_+) = \sum _{B\in {\mathcal {B}}(B_+)} V(B)\), and hence that \(\nabla V(B_+)\) is a vector of blocks indexed by \(B \in {\mathcal {B}}(B_+)\), on which the covariance operator C acts diagonally.

Further recall that \(\varphi \) is constant on \(B_+\). By symmetry and uniqueness of the minimiser, we see that \(\zeta ^0\) is constant not only on each small block B, but on all of \(B_+\), i.e., \(\zeta ^0 \in X_{+}(B_+)\). In the following lemma, the block \(B_+\) is fixed and \(\varphi \) and \(\zeta ^0\) are both in \(X_{+}(B_+)\) so that we may identify them with variables in \(\mathbb {R}^n\).

Lemma 3.8

Let \(|\varphi |\geqslant h_+\). Then \(|\varphi +\zeta ^0| \geqslant h_+(1 - O(g^{1/2}))\).

Proof

As discussed above, we regard \(\nabla V\) and \(C\nabla V\) both as block vectors indexed by \(B \in {\mathcal {B}}(B_+)\). For \(\varphi '\) constant on \(B_+\), the blocks of \(\nabla V(B_+,\varphi ')\) are equal and C acts by multiplying each of these blocks by the same constant \(O(\vartheta ^2 L^{2j})\). Hence \(C\nabla V(B_+,\varphi ')\) is a block vector with all blocks equal to \(O(\vartheta ^2 L^{2j})\nabla V(B,\varphi ')\), where B is any of the blocks in \({\mathcal {B}}(B_+)\). We denote by \(|C\nabla V(B_+,\varphi ')|_\infty \) the norm of the value in any of these blocks. Now (3.27) implies that, for \(\varphi '\) constant on \(B_+\) with \(|\varphi '|\leqslant h_+\),

$$\begin{aligned} M&:= \sup _{|\varphi '|\leqslant h_+} |C \nabla V(B_+,\varphi ')|_\infty \nonumber \\&\leqslant \vartheta ^2 L^{2j} \Bigl ({ g h_+^3(1+O(g^{1/2})) + \nu h_+ + O(L^{-dj} h_+^{-1} g^{3/4})}\Bigr ) \nonumber \\&\leqslant \vartheta ^2 h_+ \left( {g L^{2j} h_+^2(1+O(g^{1/2})) + L^{2j} \nu + O(L^{-2j} h_+^{-2} g^{3/4})}\right) \leqslant O(\vartheta ^2 g^{1/2}h_+) . \end{aligned}$$
(3.42)

To prove the claim, we may assume that \(|\varphi +\zeta ^0| \leqslant h_+\), since otherwise the claim holds trivially. Then \(|\zeta ^0| \leqslant M = O(\vartheta ^2 g^{1/2} h_+)\) by (3.41) and (3.42). Thus \(|\varphi +\zeta ^0| \geqslant |\varphi | - |\zeta ^0| \geqslant h_+ - O(\vartheta ^2 g^{1/2}h_+) = h_+(1- O(\vartheta ^2 g^{1/2}))\).    \(\square \)

In the following lemma, \(\zeta \in X(B_+)\) is the fluctuation field under the measure with expectation \(\langle \cdot \rangle _{H_\varphi }\). Thus \(\zeta \) is constant in any small block B, but unlike the minimiser \(\zeta ^0\) the field \(\zeta \) is not constant in \(B_+\).

Lemma 3.9

For any \(t \geqslant 1\), with \(\ell = L^{-j}\) as in (3.10),

$$\begin{aligned} \forall x \in B_+, \qquad \mathbb {P}_{H_\varphi }(|\zeta _x-\zeta ^0| \geqslant 3\vartheta \ell t) \leqslant 2e^{-t^2/4}. \end{aligned}$$
(3.43)

Proof

By changing variables, it suffices to study the measure with action \(H(\zeta ) = H_\varphi (\zeta +\zeta ^0)\), whose unique minimiser is \(\zeta =0\), and clearly H has the same Hessian as \(H_\varphi \). From the information that the minimiser of H is 0, we obtain a bound on the random variable \(\zeta \) as follows. Using that \({{\,\mathrm{Hess}\,}}H \geqslant \frac{1}{2} C^{-1}\) as quadratic forms and that \(C_{xx} \leqslant \vartheta ^2\ell ^2\) for all \(x\in \Lambda \) by definition, the Brascamp–Lieb inequality (A.5) for the measure \(\langle \cdot \rangle _H\) with density proportional to \(e^{-H}\) implies

$$\begin{aligned} \langle e^{s(\zeta _x-\mathbb {E}_H(\zeta _x))} \rangle _H \leqslant e^{s^2 C_{xx}} \leqslant e^{s^2 \vartheta ^2 \ell ^2} . \end{aligned}$$
(3.44)

Applying Markov’s inequality to both tails and optimising over s (take \(s = t/(2\vartheta \ell )\)) therefore gives

$$\begin{aligned} \mathbb {P}_H(|\zeta _x-\langle \zeta _x \rangle _H| > \vartheta \ell t) \leqslant 2e^{-t^2/4}. \end{aligned}$$
(3.45)

To estimate the mean \(\langle \zeta \rangle _H\), we integrate by parts to get

$$\begin{aligned} |B_+| \vartheta ^2 \ell ^2 \int e^{-H}\geqslant & {} \sum _{x \in B_+}C_{xx} \int e^{-H} = \int \left( {\nabla , C\zeta }\right) \, e^{-H} \nonumber \\= & {} \int ( \zeta , C \nabla H(\zeta )) \, e^{-H} \geqslant \frac{1}{2} \int (\zeta ,\zeta ) \, e^{-H} \end{aligned}$$
(3.46)

where the integral is over \(X(B_+)\) and \(\nabla \) is the gradient on \(X(B_+)\), and where we used that, by (3.40),

$$\begin{aligned} (\zeta , C \nabla H(\zeta )) = \int _0^1 (\zeta , C^{1/2} {{\,\mathrm{Hess}\,}}H(t\zeta ) C^{1/2} \zeta ) \, dt \geqslant \frac{1}{2}(\zeta ,\zeta ). \end{aligned}$$
(3.47)

Since \(\langle (\zeta ,\zeta ) \rangle _H = | B_+| \langle \zeta _x^2 \rangle _H\) by symmetry, it follows that

$$\begin{aligned} \langle \zeta _x^2 \rangle _H \leqslant 2 \vartheta ^2 \ell ^2, \quad |\langle \zeta _x \rangle _H| \leqslant \sqrt{2} \vartheta \ell . \end{aligned}$$
(3.48)

Finally, combining (3.45) and (3.48), and using that \(|\langle \zeta _x \rangle _H| \leqslant \sqrt{2} \vartheta \ell \leqslant 2\vartheta \ell t\) for \(t \geqslant 1\),

$$\begin{aligned} \mathbb {P}_H(|\zeta _x| > 3 \vartheta \ell t) \leqslant \mathbb {P}_H(|\zeta _x-\langle \zeta _x \rangle _H| \geqslant \vartheta \ell t) \leqslant 2e^{-t^2/4}, \end{aligned}$$
(3.49)

which is the claim.    \(\square \)

Next we use the following estimate on \({{\,\mathrm{Hess}\,}}V_+(B_+)\).

Lemma 3.10

Let \(\varphi , {\dot{\varphi }} \in X_{+}(B_+)\). Then

$$\begin{aligned} {\dot{\varphi }} {{\,\mathrm{Hess}\,}}V_+(B_+, \varphi ) {\dot{\varphi }} \geqslant \left\langle {\dot{\varphi }} \frac{{{\,\mathrm{Hess}\,}}V(B_+, \varphi +\zeta )}{\mathrm {id}+ C^{1/2} {{\,\mathrm{Hess}\,}}V(B_+,\varphi +\zeta ) C^{1/2}} {\dot{\varphi }} \right\rangle _{H_\varphi } \end{aligned}$$
(3.50)

where \({{\,\mathrm{Hess}\,}}V_+(B_+)\) is taken in \(X_+(B_+)\) and \({{\,\mathrm{Hess}\,}}V(B_+)\) is taken in \(X(B_+)\).

Note that \(C\) and \({{\,\mathrm{Hess}\,}}V(B_+,\varphi +\zeta )\) are both block diagonal matrices indexed by \(B \in {\mathcal {B}}(B_+)\), with constant entries on each block B. In fact, C is proportional to the identity matrix on \(X(B_+)\).

Proof

We freeze the block spin field \(\varphi \in X_{+}(B_+)\) and recall that the fluctuation field \(\zeta \in X(B_+)\) is distributed with expectation \(\langle \cdot \rangle _{H_\varphi }\). We abbreviate \({{\,\mathrm{Hess}\,}}V = {{\,\mathrm{Hess}\,}}V(\varphi +\zeta ) = {{\,\mathrm{Hess}\,}}V(B_+,\varphi +\zeta )\) throughout the proof. Applying the Brascamp–Lieb inequality (A.4) to the measure \(\langle \cdot \rangle _{H_\varphi }\) gives

$$\begin{aligned} {{\,\mathrm{Var}\,}}_{H_\varphi }(\nabla V(\varphi +\zeta )\cdot {\dot{\varphi }}) \leqslant \langle {\dot{\varphi }} {{\,\mathrm{Hess}\,}}V(\varphi +\zeta ) (C^{-1} + {{\,\mathrm{Hess}\,}}V(\varphi +\zeta ) )^{-1} {{\,\mathrm{Hess}\,}}V(\varphi +\zeta ) {\dot{\varphi }} \rangle _{H_\varphi } . \end{aligned}$$
(3.51)

Inserting this into (3.38), the above can be written as

$$\begin{aligned} {\dot{\varphi }} {{\,\mathrm{Hess}\,}}V_+(\varphi ) {\dot{\varphi }} \geqslant \Bigl \langle {\dot{\varphi }} \Bigl ({{{\,\mathrm{Hess}\,}}V - {{\,\mathrm{Hess}\,}}V (C^{-1} + {{\,\mathrm{Hess}\,}}V )^{-1} {{\,\mathrm{Hess}\,}}V}\Bigr ) {\dot{\varphi }} \Bigr \rangle _{H_\varphi } . \end{aligned}$$
(3.52)

Since \({{\,\mathrm{Hess}\,}}V\) and C are both (block) diagonal matrices, the term inside the expectation can be written as

$$\begin{aligned} {{\,\mathrm{Hess}\,}}V (\mathrm {id}+ C^{1/2} {{\,\mathrm{Hess}\,}}V C^{1/2} )^{-1}. \end{aligned}$$
(3.53)

This completes the proof.    \(\square \)

For \(\varphi \in X(B_+)\), let \(\Lambda (\varphi )\) be the largest constant such that \(L^{2(j+1)} {{\,\mathrm{Hess}\,}}V(B_+,\varphi ) \geqslant \Lambda (\varphi )\) as quadratic forms on \(X(B_+)\). From (3.39) it follows that \(\Lambda (\varphi ) \geqslant -\frac{1}{2}\) uniformly in \(\varphi \in X(B_+)\). Then (3.50) implies that for \({\dot{\varphi }} \in X_{+}(B_+)\),

$$\begin{aligned} {\dot{\varphi }} {{\,\mathrm{Hess}\,}}V_+(B_+,\varphi ) {\dot{\varphi }}&\geqslant L^{-2(j+1)} \left\langle {\dot{\varphi }} \frac{ L^{2(j+1)} {{\,\mathrm{Hess}\,}}V(B_+, \varphi +\zeta )}{\mathrm {id}+ C^{1/2} {{\,\mathrm{Hess}\,}}V(B_+,\varphi +\zeta ) C^{1/2}} {\dot{\varphi }} \right\rangle _{H_\varphi }\nonumber \\&\geqslant L^{-2(j+1)} \left\langle \frac{\Lambda (\varphi +\zeta )}{1+ L^{-2}\vartheta ^2\Lambda (\varphi +\zeta )} \right\rangle _{H_\varphi } ({\dot{\varphi }},{\dot{\varphi }}), \end{aligned}$$
(3.54)

where the second inequality uses that \(t/(1+a t)\) is increasing in \(t>-1/a\) and that \(C \leqslant \vartheta ^2 L^{2j}Q\).

The next lemma completes the proof of Proposition 3.7.

Lemma 3.11

For \(\varphi \in X_{+}(B_+)\) with \(|\varphi | \geqslant h_+\), we have

$$\begin{aligned} \left\langle \frac{\Lambda (\varphi +\zeta )}{1+L^{-2}\vartheta ^2\Lambda (\varphi +\zeta )} \right\rangle _{H_\varphi } \geqslant \varepsilon -O(\vartheta ^2 \varepsilon ^2). \end{aligned}$$
(3.55)

Proof

On the event \(\min _x |\varphi +\zeta _x| \geqslant \frac{1}{2} h_+\) we have \(\Lambda (\varphi +\zeta ) \geqslant \varepsilon >0\) by (3.30), and since \(t/(1+at)\) is increasing for \(t > 0\) therefore

$$\begin{aligned} \frac{\Lambda (\varphi +\zeta )}{1+L^{-2}\vartheta ^2 \Lambda (\varphi +\zeta )} \geqslant \frac{\varepsilon }{1+L^{-2}\vartheta ^2 \varepsilon } \geqslant \varepsilon - O(\vartheta ^2 \varepsilon ^2). \end{aligned}$$
(3.56)

By Lemma 3.9, the probability that \(|\zeta _x-\zeta ^0| \geqslant \frac{1}{4} h_+\) is bounded by \(2e^{-(h_+/(12\vartheta \ell ))^2/4} \leqslant 2e^{-c \, (\vartheta g)^{-1/2}}\) for any point \(x\in B_+\) (since \(\vartheta \leqslant 1\)). Using that \(\zeta \) is constant on the small blocks B and taking a union bound over the \(L^d\) blocks \(B \in {\mathcal {B}}(B_+)\) we get that \(\max _x |\zeta _x-\zeta ^0| \geqslant \frac{1}{4} h_+\) with probability at most \(2L^d e^{-c (\vartheta g)^{-1/2}}\). Since \(|\varphi +\zeta ^0| \geqslant h_+(1-O(g^{1/2})) \geqslant \frac{3}{4} h_+\) by Lemma 3.8, together with the assumption \(|\varphi | \geqslant h_+\), we conclude that \(\min _x|\varphi +\zeta _x| \geqslant \frac{1}{2} h_+\) with probability at least \(1-2L^d e^{-c(\vartheta g)^{-1/2}}\). Thus (3.56) holds with at least this probability.

On the event that (3.56) does not hold, we still have the bound \(\Lambda (\varphi +\zeta ) \geqslant -\frac{1}{2}\) by (3.39). Thus the contribution of this event to the expectation (3.55) is bounded by \(-O(L^d e^{-c \, (\vartheta g)^{-1/2}}) = -O(\vartheta ^2\varepsilon ^4)\), where we used that \(\varepsilon _j \geqslant c g_j \geqslant c \vartheta _j g_j\) by Lemma 3.5 and \(\vartheta _j \leqslant 1\). In summary,

$$\begin{aligned} \left\langle \frac{\Lambda (\varphi +\zeta )}{1+L^{-2}\vartheta ^2\Lambda (\varphi +\zeta )} \right\rangle _{H_\varphi } \geqslant ({ \varepsilon - O(\vartheta ^2\varepsilon ^2) }) (1-O(\vartheta ^2\varepsilon ^4)) - O(\vartheta ^2\varepsilon ^4) \geqslant \varepsilon -O(\vartheta ^2\varepsilon ^2). \end{aligned}$$
(3.57)

This implies the claim.    \(\square \)

3.4 Proof of Theorem 1.1

We now use Corollary 3.3 and Theorem 3.4 to verify the assumptions of Corollaries 2.2 and 2.3, and in doing so deduce Theorem 1.1. By (2.3), the covariances in the decomposition of \((-\Delta _H+m^2)^{-1}\) are given by

$$\begin{aligned} C_j = \lambda _j Q_j, \quad \text {with } \lambda _j = L^{2j} {\left\{ \begin{array}{ll} O((1+m^2L^{2(j-1)})^{-2}) &{}(j<N)\\ O((1+m^2L^{2(N-1)})^{-1}) &{}(j=N). \end{array}\right. } \end{aligned}$$
(3.58)

We recall that \(\vartheta _j = 2^{-(j-j_m)_+}\). Corollary 3.3 implies

$$\begin{aligned} \frac{1}{|B|}{{\,\mathrm{Hess}\,}}(V_j(B) \circ i_B) \geqslant (\nu _j - O(\vartheta _jL^{-2j}g_j^{5/4}))\mathrm {id}_n \quad \text {uniformly in } |\varphi | \leqslant h_j. \end{aligned}$$
(3.59)

The right-hand side is nonpositive by Theorem 3.1, since \(\nu _j \leqslant 0\). Thus, by Theorem 3.4, the same estimate holds for \(|\varphi |\geqslant h_j\) and therefore for all \(\varphi \). In summary, since the above estimates hold for all blocks, and using (3.7),

$$\begin{aligned} C_j^{1/2} {{\,\mathrm{Hess}\,}}V_j(\varphi ) C_j^{1/2} \geqslant L^{2j} \; (\nu _j - O(\vartheta _jL^{-2j}g_j^{5/4})) Q_j \quad \text {uniformly in } \varphi \in X_j. \end{aligned}$$
(3.60)

Thus Assumption (A1) holds with

$$\begin{aligned} \varepsilon _j = (-L^{2j}\nu _j +O(\vartheta _jg_j^{5/4})). \end{aligned}$$
(3.61)

Lemma 3.12

There exists a constant \(\delta >0\) such that for all \(j \in \mathbb {N}\),

$$\begin{aligned} -2\sum _{k=0}^{j} L^{2k}\nu _k \leqslant \delta \frac{n+2}{n+8} \log g_{j}^{-1} + O(1), \qquad \sum _{k=0}^\infty ((L^{2k}\nu _k)^2+\vartheta _k g_k^{5/4}) = O(g_0^{1/4}). \end{aligned}$$
(3.62)

The elementary proof requires some notation from [9]; we therefore postpone it to Appendix B.

Proof of Theorem 1.1

We apply Corollary 2.2. By (3.60), Assumption (A1) holds for all \(j\leqslant N\), and Assumptions (A2) and (A3) follow automatically from the hierarchical structure. Therefore, by (2.22), the \(|\varphi |^4\) measure satisfies a Brascamp–Lieb inequality with quadratic form

$$\begin{aligned} D_0 \leqslant \sum _{j=0}^{N} \delta _j C_j , \quad \text {where } \delta _j = \exp \left( {2\sum _{k=1}^{j} (\varepsilon _k + O(\varepsilon _k^2))}\right) . \end{aligned}$$
(3.63)

We abbreviate \(\gamma =(n+2)/(n+8)\). Using \(g_j^{-1} = O(g_{j_m}^{-1})\) which holds by (3.16), and using (3.62),

$$\begin{aligned} \exp \left( {2\sum _{k=1}^{j} (\varepsilon _k + O(\varepsilon _k^2))}\right) = O(g_{j_m}^{-\delta \gamma }). \end{aligned}$$
(3.64)

We then use that \(g_{j_m}^{-1}= O(\log m^{-1})\), which holds by (3.16), to see that the right-hand side of (3.64) is a logarithmic correction of order \((\log m^{-1})^{\delta \gamma }\). Thus the dominant contribution in (3.63) is given by

$$\begin{aligned} \sum _{j=1}^{N-1} (1+m^2L^{2(j-1)})^{-2}L^{2j} + (1+m^2L^{2(N-1)})^{-1}L^{2N} = O(m^{-2}), \end{aligned}$$
(3.65)

where we recall that \(m^2 \sim Ct (-\log t)^{-\gamma }\) as \(t \downarrow 0\) by (3.15). In summary, we conclude that \(D_0\) is bounded as a quadratic form from above by

$$\begin{aligned} O(m^{-2}) (\log m^{-1})^{\delta \gamma } = O(t^{-1})(-\log t)^{(1+\delta )\gamma }. \end{aligned}$$
(3.66)

Replacing \(1+\delta \) by \(\delta \), this implies the lower bound for the spectral gap claimed in (1.11). The upper bound for the spectral gap follows immediately from (3.17).    \(\square \)

4 Hierarchical Sine-Gordon and Discrete Gaussian Models

In this section, we apply Corollaries 2.2 and 2.3 to the hierarchical versions of the Sine-Gordon and the Discrete Gaussian models. This boils down to checking that Assumption (A1) is satisfied along the renormalisation group flow of both models. Throughout this section \(d=2\).

4.1 Proof of Theorem 1.2

We start by defining the renormalisation group for the hierarchical Sine-Gordon model, essentially in the set-up of [16, Chapter 3]. By definition, with \(\varepsilon =\beta L^{-2N}\), the Sine-Gordon model has energy

$$\begin{aligned} H(\varphi ) = \frac{\beta }{2} (\varphi ,(-\Delta _H+L^{-2N} Q_N) \varphi ) + \sum _{x\in \Lambda } V_0(\varphi _x), \end{aligned}$$
(4.1)

where the potential \(V_0\) is even and \(2\pi \)-periodic. We decompose the covariance of the Gaussian part as

$$\begin{aligned} (-\beta \Delta _H + \beta L^{-2N} Q_N )^{-1} = \sum _{j=1}^N \beta ^{-1} L^{2(j-1)} P_j + \beta ^{-1}L^{2N}Q_N = \sum _{j=0}^N C_j \end{aligned}$$
(4.2)

with

$$\begin{aligned} C_j = \lambda _j(\beta ) Q_j, \qquad \lambda _0(\beta ) = \frac{1}{\beta }, \quad \lambda _j(\beta ) = \frac{\sigma }{\beta } L^{2j} \quad (0<j \leqslant N), \quad \sigma = 1-L^{-2}. \end{aligned}$$
(4.3)
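The identity behind this decomposition is that \(\{P_1,\dots ,P_N,Q_N\}\) are orthogonal projections summing to the identity, so the resolvent inverts coefficient-wise. The following self-contained sketch verifies (4.2)-(4.3) for small illustrative sizes (L = 2, N = 2).

```python
# A small self-contained check of (4.2)-(4.3): the inverse of
# beta*(-Delta_H) + beta*L^{-2N} Q_N equals sum_j lambda_j(beta) Q_j.
# L, N, beta below are illustrative sizes, d = 2 as in this section.
import numpy as np

L, N, d, beta = 2, 2, 2, 1.7
nsites = L ** (d * N)
sigma = 1 - L ** (-2)

def Q(j):
    """Projection onto fields constant on scale-j blocks; sites are
    ordered so that each block is a consecutive range of indices."""
    b = L ** (d * j)
    return np.kron(np.eye(nsites // b), np.ones((b, b)) / b)

P = [Q(j - 1) - Q(j) for j in range(1, N + 1)]              # P_j
minus_Delta = sum(L ** (-2 * (j - 1)) * P[j - 1] for j in range(1, N + 1))
A = beta * minus_Delta + beta * L ** (-2 * N) * Q(N)

lam = [1 / beta] + [sigma / beta * L ** (2 * j) for j in range(1, N + 1)]
C = sum(lam[j] * Q(j) for j in range(N + 1))
print(np.allclose(np.linalg.inv(A), C))                     # True
```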

Relative to this decomposition, the renormalised potential is defined as in Sect. 2.2. Due to the hierarchical structure of this decomposition, the renormalised potential takes the form

$$\begin{aligned} V_j(\varphi ) = \sum _{B\in {\mathcal {B}}_j} V_j(B,\varphi ), \end{aligned}$$
(4.4)

where \(V_j(B,\varphi )\) only depends on \(\varphi |_B\). As in Sect. 3.1, we restrict the domain of \(V_j(B)\) to \(X_j(B)\), i.e., the constant fields on B. The final potential obtained as \(V_{N+1}\) in (2.14) will instead be denoted by \(V_{N,N}\) since it is indexed by the final block \(\Lambda \in {\mathcal {B}}_N\), i.e., \(V_{N,N}(\varphi ) = V_{N,N}(\Lambda _N,\varphi )\), and \(\varphi \) can be seen as an external field. Then each \(V_j(B)\) can be identified with a \(2\pi \)-periodic function on \(\mathbb {R}\) (and analogously for \(V_{N,N}\)). For any such function \(F: S^1 \rightarrow \mathbb {R}\), we use the norm

$$\begin{aligned} \Vert F\Vert = \sum _{q \in \mathbb {Z}} w(q) |{\hat{F}}(q)|, \quad w(q) = (1+|q|)^2, \end{aligned}$$
(4.5)

where our convention for the Fourier coefficients of F is \({\hat{F}}(q) = (2\pi )^{-1} \int _0^{2\pi } F(\varphi ) e^{-iq\varphi } \, d\varphi \). We write

$$\begin{aligned} \Vert V_j\Vert = \Vert V_j(B)\Vert = \Vert V_j(B) \circ i_B\Vert , \qquad {\hat{V}}_j(0) = {\hat{V}}_j(B,0) \end{aligned}$$
(4.6)

for an arbitrary \(B \in {\mathcal {B}}_j\) (the definition is independent of B). Except for the weight w(q), the norm (4.5) is the one used in [16, 48].

Proposition 4.1

Let \(j<N\). Assume that \(\Vert V_j- {\hat{V}}_j(0)\Vert \) is sufficiently small. Then the renormalised potential satisfies

$$\begin{aligned} \Vert V_{j+1} - {\hat{V}}_{j+1}(0)\Vert \leqslant L^2 e^{-\sigma /(2\beta )} (\Vert V_j-{\hat{V}}_j(0)\Vert + O(\Vert V_j- {\hat{V}}_j(0)\Vert ^2)). \end{aligned}$$
(4.7)

Moreover, for the last step \(j=N\),

$$\begin{aligned} \Vert V_{N,N} - {\hat{V}}_{N,N}(0)\Vert \leqslant \Vert V_N-{\hat{V}}_N(0)\Vert + O(\Vert V_N- {\hat{V}}_N(0)\Vert ^2). \end{aligned}$$
(4.8)

The derivation of this proposition is postponed to Sect. 4.2. We now state consequences of this proposition and prove Theorem 1.2 using these.

Corollary 4.2

For every \(\beta < \sigma /(4\log L)\) and every \(\kappa \) with \(L^{2} e^{-\sigma /(2\beta )}< \kappa < 1\), for all \(V_0\) with \(\Vert V_0-{\hat{V}}_0(0)\Vert \) sufficiently small,

$$\begin{aligned} \Vert V_j-{\hat{V}}_j(0)\Vert \leqslant \kappa ^j \Vert V_0-{\hat{V}}_0(0)\Vert \quad \text {for }j \leqslant N, \end{aligned}$$
(4.9)

and

$$\begin{aligned} \Vert V_{N,N}-{\hat{V}}_{N,N}(0)\Vert \leqslant 2\kappa ^N \Vert V_{0}-{\hat{V}}_{0}(0)\Vert . \end{aligned}$$
(4.10)

Proof

Fix \(\eta >0\) small enough that \(L^2 e^{-(1-\eta )\sigma /(2\beta )} \leqslant \kappa \) (possible since \(\kappa > L^2 e^{-\sigma /(2\beta )}\)). The bound (4.7) implies that, for \(\Vert V_0-{\hat{V}}_0(0)\Vert \) sufficiently small depending on \(\eta \) and \(\beta \),

$$\begin{aligned} \Vert V_{j+1} - {\hat{V}}_{j+1}(0)\Vert \leqslant L^2 e^{-(1-\eta )\sigma /(2\beta )} \Vert V_j-{\hat{V}}_j(0)\Vert \leqslant \kappa \Vert V_j-{\hat{V}}_j(0)\Vert . \end{aligned}$$
(4.11)

Then (4.9) follows by iterating this bound, and (4.10) follows from this and (4.8).    \(\square \)

Corollary 4.3

Let \(\beta < \sigma /(4 \log L)\) and let \(\varepsilon = \beta L^{-2N}\). Then the variance of \(F = \sum _{x \in \Lambda _N} \varphi _x\) under the Gibbs measure \(\mu \) defined in (1.2) is given by

$$\begin{aligned} {{\,\mathrm{Var}\,}}_\mu (F) = \frac{|\Lambda _N|}{\varepsilon } ({1-O(\kappa ^N)}) . \end{aligned}$$
(4.12)

Proof

Throughout the following proof, we denote by \(C = (-\beta \Delta _H+\varepsilon Q_N)^{-1}\) the full covariance of the hierarchical Gaussian free field. By completion of the square, and using that \(C\mathbf {1} = \varepsilon ^{-1}\mathbf {1}\) for the constant field \(\mathbf {1}\) (so that \((\mathbf {1},C\mathbf {1}) = |\Lambda _N|/\varepsilon \)),

$$\begin{aligned} \mathbb {E}_{C}(e^{(f,\zeta )} G(\zeta )) = e^{\frac{1}{2} (f,Cf)} \, \mathbb {E}_{C}(G(\zeta + Cf)). \end{aligned}$$
(4.13)

With \(F(\varphi ) = \sum _x \varphi _x\), we get by translating the measure by \(tC\mathbf {1} = t\varepsilon ^{-1}\mathbf {1}\) that

$$\begin{aligned} \Gamma (t) := \log \mathbb {E}_\mu (e^{tF}) = \frac{t^2|\Lambda _N|}{2\varepsilon } - V_{N,N}(t/\varepsilon ) + V_{N,N}(0). \end{aligned}$$
(4.14)

By Corollary 4.2 and the fact that the norm controls the second derivatives,

$$\begin{aligned} | V_{N,N}''(0)| = |(V_{N,N}-{\hat{V}}_{N,N}(0))''(0)| \leqslant \Vert V_{N,N}-{\hat{V}}_{N,N}(0)\Vert \leqslant 2\kappa ^N \Vert V_0-{\hat{V}}_0(0)\Vert , \end{aligned}$$
(4.15)

where \(V_{N,N}''\) is the second derivative of the function \(V_{N,N}(\Lambda _N) \circ i_{\Lambda _N}: \mathbb {R}\rightarrow \mathbb {R}\). Finally, using that \(|\Lambda _N| = L^{2N}\) as well as that \(\varepsilon = \beta L^{-2N}\) (so that \(\varepsilon |\Lambda _N| = \beta \)),

$$\begin{aligned} {{\,\mathrm{Var}\,}}_\mu (F)&= \frac{\partial ^2 \Gamma (0)}{\partial t^2} = \frac{|\Lambda _N|}{\varepsilon } - \frac{V_{N,N}''(0)}{\varepsilon ^{2}} = \frac{|\Lambda _N|}{\varepsilon } \left( {1- O\left( {\frac{\kappa ^N}{\varepsilon |\Lambda _N|}}\right) }\right) \nonumber \\&= \frac{|\Lambda _N|}{\varepsilon } ({1- O(\kappa ^N)}) . \end{aligned}$$
(4.16)

This completes the proof.    \(\square \)

Proof of Theorem 1.2

We start by proving the lower bound on the spectral gap by applying Corollary 2.2. Thanks to the hierarchical structure, the spins are constant in the blocks at any given scale j, and Assumptions (A2) and (A3) always hold. Assumption (A1) follows from Corollary 4.2 which implies that for \(j \leqslant N\)

$$\begin{aligned} (V_j(B)\circ i_B)''(\varphi ) \geqslant - \sum _{q \ne 0} q^2 |{\hat{V}}_j(q)| \geqslant - \Vert V_j-{\hat{V}}_j(0)\Vert \geqslant -\kappa ^j \Vert V_0-{\hat{V}}_0(0)\Vert . \end{aligned}$$
(4.17)

This implies the bound (3.5) with

$$\begin{aligned} s = \frac{1}{ |B_j|} \; \kappa ^j \Vert V_0-{\hat{V}}_0(0)\Vert = \kappa ^j \Vert V_0-{\hat{V}}_0(0)\Vert \; L^{-2 j}. \end{aligned}$$
(4.18)

The equivalent of (3.7) is

$$\begin{aligned} C_j^{1/2}({{\,\mathrm{Hess}\,}}_{X_j} V_j)C_j^{1/2} \geqslant - s L^{2j} Q_j . \end{aligned}$$
(4.19)

Therefore Assumption (A1) in (2.16) holds with \(\varepsilon _j = s L^{2j} = \kappa ^j \Vert V_0-{\hat{V}}_0(0)\Vert \). With \(\delta _j\) defined as in (2.22), it follows that

$$\begin{aligned} \sum _{j=0}^{N} \delta _j C_j&\leqslant \exp \left( {\sum _{j=0}^{N} O(\kappa ^j)\Vert V_0-{\hat{V}}_0(0)\Vert }\right) \sum _{j=0}^{N} C_j \nonumber \\&\leqslant (1+O(\Vert V_0-{\hat{V}}_0(0)\Vert )) (-\beta \Delta _H+\varepsilon Q_N)^{-1} \leqslant \frac{O(1)}{\varepsilon } \mathrm {id}_{\Lambda _N}. \end{aligned}$$
(4.20)

Applying Corollary 2.2, we get that the measure \(\mu \) satisfies a Brascamp–Lieb inequality with matrix

$$\begin{aligned} D_0 \leqslant \frac{O(1)}{\varepsilon } \mathrm {id}_{\Lambda _N}. \end{aligned}$$
(4.21)

This implies immediately the asserted lower bound on the spectral gap, i.e., \(\gamma _N \geqslant c\varepsilon \).

Finally, the upper bound on the spectral gap follows readily from Corollary 4.3. Choosing as test function \(F = \sum _{x \in \Lambda _N} \varphi _x\), we have \({{\mathbb {E}}} _\mu (\nabla F, \nabla F) = |\Lambda _N|\) and (4.12) implies

$$\begin{aligned} \frac{{{\mathbb {E}}} _\mu (\nabla F, \nabla F) }{{{\,\mathrm{Var}\,}}_\mu (F)} = \varepsilon (1+O(\kappa ^N)) = O(\varepsilon ). \end{aligned}$$
(4.22)

This completes the proof.    \(\square \)

4.2 Proof of Proposition 4.1

The proof of Proposition 4.1 follows as in [16, Chapter 3], with small modifications. Throughout Sect. 4.2, the full covariance matrix \((-\beta \Delta _H+\varepsilon Q_N)^{-1}\) does not play a role and we write \(C = C_j\) for a fixed scale j. As in Sect. 3.3, we drop the scale index j and write \(+\) in place of \(j+1\). We write \(B_+\) for a fixed block in \({\mathcal {B}}_+\) and B for the blocks in \({\mathcal {B}}(B_+)\).

We need the following properties of the norm (4.5). Since \(w(p+q) \leqslant w(p)w(q)\), i.e.,

$$\begin{aligned} (1+|p+q|)^2&= 1+p^2+q^2 +2|p+q|+2pq \nonumber \\&\leqslant 1+p^2+q^2+2|p+q|+4|pq|+2|pq|(|p|+|q|) \leqslant (1+|p|)^2(1+|q|)^2, \end{aligned}$$
(4.23)

the norm (4.5) satisfies the product property

$$\begin{aligned} \Vert FG\Vert \leqslant \sum _{q,p} w(q) |{\hat{F}}(q-p)||{\hat{G}}(p)| \leqslant \sum _{q,p} w(q-p)w(p) |{\hat{F}}(q-p)||{\hat{G}}(p)| = \Vert F\Vert \Vert G\Vert . \end{aligned}$$
(4.24)

As a consequence, for any \(F: S^1 \rightarrow \mathbb {R}\) with \(\Vert F\Vert \) small enough,

$$\begin{aligned} \Vert e^{-F}-1\Vert&\leqslant \Vert F\Vert + O(\Vert F\Vert ^2), \end{aligned}$$
(4.25)
$$\begin{aligned} \Vert \log (1+F)\Vert&\leqslant \Vert F\Vert + O(\Vert F\Vert ^2). \end{aligned}$$
(4.26)
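The product property (4.24) can be tested directly on truncated Fourier series. The sketch below represents F and G by coefficient arrays indexed by \(q=-Q,\dots ,Q\); the truncation and the random decaying coefficients are illustrative choices.

```python
# Numerical sketch of the norm (4.5) and its product property (4.24) on
# truncated Fourier series; the truncation Q and the random coefficients
# are illustrative. Truncating the product only drops nonnegative terms,
# so the inequality is preserved.
import numpy as np

Q = 8
q = np.arange(-Q, Q + 1)
w = (1 + np.abs(q)) ** 2

norm = lambda Fhat: float(np.sum(w * np.abs(Fhat)))

def product(Fhat, Ghat):
    """Fourier coefficients of FG, truncated back to |q| <= Q."""
    return np.convolve(Fhat, Ghat)[Q:3 * Q + 1]

rng = np.random.default_rng(0)
F = rng.normal(size=2 * Q + 1) * np.exp(-np.abs(q))
G = rng.normal(size=2 * Q + 1) * np.exp(-np.abs(q))
print(norm(product(F, G)) <= norm(F) * norm(G))   # True, cf. (4.24)
```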

Lemma 4.4

For \(F:S^1 \rightarrow \mathbb {R}\) with \({\hat{F}}(0) = 0\) and \(\Vert F\Vert <\infty \), and for \(x\in \Lambda \),

$$\begin{aligned} \Vert \mathbb {E}_C \left( F(\cdot +\zeta _x) \right) \Vert \leqslant e^{-\sigma /(2\beta )}\Vert F\Vert . \end{aligned}$$
(4.27)

Proof

By (2.3), under the expectation \(\mathbb {E}_C\), each \(\zeta _x\) is a Gaussian random variable with variance \(\sigma /\beta \). Therefore

$$\begin{aligned} \mathbb {E}_C(e^{iq\zeta _x}) = e^{-\sigma q^2/(2\beta )}. \end{aligned}$$
(4.28)

This gives

$$\begin{aligned} \mathbb {E}_C (F(\varphi +\zeta _x)) = \mathbb {E}_C \biggl [{ \sum _q {\hat{F}}(q) e^{iq(\varphi +\zeta _x)} }\biggr ] = \sum _q e^{-\sigma q^2/(2\beta )} {\hat{F}}(q) e^{iq\varphi }. \end{aligned}$$
(4.29)

Since by assumption \({\hat{F}}(0) = 0\), we obtain

$$\begin{aligned} \Vert \mathbb {E}_C (F(\cdot +\zeta _x))\Vert\leqslant & {} \sum _q e^{-\sigma q^2/(2\beta )} w(q) |{\hat{F}}(q)| \nonumber \\\leqslant & {} e^{-\sigma /(2\beta )} \sum _q w(q) |{\hat{F}}(q)| = e^{-\sigma /(2\beta )} \Vert F\Vert \end{aligned}$$
(4.30)

as claimed.    \(\square \)
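In coefficient form, the content of Lemma 4.4 is simply that the expectation multiplies \({\hat{F}}(q)\) by \(e^{-\sigma q^2/(2\beta )}\), cf. (4.29), which for mean-zero F is a contraction by at least \(e^{-\sigma /(2\beta )}\). A minimal coefficient-space sketch, with illustrative parameters:

```python
# Coefficient-space sketch of the smoothing bound (4.27): multiplying
# F^(q) by e^{-sigma q^2/(2 beta)} contracts the norm (4.5) of a
# mean-zero F by at least e^{-sigma/(2 beta)}. Parameters illustrative.
import numpy as np

L, beta, Q = 2, 0.3, 12
sigma = 1 - L ** (-2)
q = np.arange(-Q, Q + 1)
w = (1 + np.abs(q)) ** 2

rng = np.random.default_rng(1)
Fhat = rng.normal(size=2 * Q + 1) * np.exp(-np.abs(q))
Fhat[Q] = 0.0                                   # F^(0) = 0 as in Lemma 4.4

smoothed = np.exp(-sigma * q ** 2 / (2 * beta)) * Fhat    # cf. (4.29)
print(np.sum(w * np.abs(smoothed))
      <= np.exp(-sigma / (2 * beta)) * np.sum(w * np.abs(Fhat)))   # True
```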

Proof of Proposition 4.1

We may assume that \({\hat{V}}(0)=0\). We fix \(B_+ \in {\mathcal {B}}_+\) and use B for the blocks in \({\mathcal {B}}(B_+)\). By definition of the hierarchical model, the Gaussian field \(\zeta \) with covariance \(C=C_j\) is constant in any block \(B \in {\mathcal {B}}_j\) and we thus write \(\zeta _B\) for \(\zeta _x\) with \(x\in B\). We then start from

$$\begin{aligned} e^{-V_+(B_+,\varphi )} = \mathbb {E}_{C} \left( \prod _{B\in {\mathcal {B}}(B_+)} e^{-V(\varphi +\zeta _B)} \right)&= \mathbb {E}_{C} \left( \prod _{B \in {\mathcal {B}}(B_+)} (1+e^{-V(\varphi +\zeta _B)}-1) \right) \nonumber \\&= \sum _{X \subset B_+} \mathbb {E}_{C} \left( \prod _{B \in {\mathcal {B}}(X)}(e^{-V(\varphi +\zeta _B)}-1) \right) , \end{aligned}$$
(4.31)

where \(X\subset B_+\) denotes that X is a union of blocks \(B\in {\mathcal {B}}(B_+)\). The term with \(|X|=0\) is simply 1. By (4.27) and (4.25), the terms with \(|X|=1\) are bounded by

$$\begin{aligned} \left\| \sum _{B\in {\mathcal {B}}(B_+)} \mathbb {E}_{C} \Big ( e^{-V(\varphi +\zeta _B)}-1 \Big ) \right\| \leqslant |{\mathcal {B}}(B_+)| e^{-\sigma /(2\beta )} (\Vert V\Vert +O(\Vert V\Vert ^2)) . \end{aligned}$$
(4.32)

By (4.27), using that the \(\zeta _B\) are independent for different blocks B and the product property of the norm, the terms with \(|X|>1\) give

$$\begin{aligned} \left\| \sum _{|X|> 1} \mathbb {E}_{C} \left( \prod _{B\in {\mathcal {B}}(X)}(e^{-V(\varphi +\zeta _B)}-1) \right) \right\|&\leqslant \sum _{|X|>1} \prod _{B \in {\mathcal {B}}(X)}e^{-\sigma /(2\beta )} \Vert (e^{-V(\varphi +\zeta _B)}-1) \Vert \nonumber \\&\leqslant \sum _{|X|>1} (e^{-\sigma /(2\beta )}(\Vert V\Vert +O(\Vert V\Vert ^2)))^{|X|} \nonumber \\&= O(e^{-\sigma /(2\beta )}\Vert V\Vert ^2). \end{aligned}$$
(4.33)

In summary, for \(\Vert V\Vert \) small enough, we get

$$\begin{aligned} \left\| \mathbb {E}_{C} \left( \prod _{B\in {\mathcal {B}}(B_+)} e^{-V(\varphi +\zeta _B)} \right) -1 \right\|&\leqslant |{\mathcal {B}}(B_+)| e^{-\sigma /(2\beta )} (\Vert V\Vert + O(\Vert V\Vert ^2)) \nonumber \\&= L^2 e^{-\sigma /(2\beta )} (\Vert V\Vert +O(\Vert V\Vert ^2)). \end{aligned}$$
(4.34)

Finally, by (4.26),

$$\begin{aligned} \Vert V_+\Vert = \left\| \log \left( 1 + \mathbb {E}_C \left( \prod _{B\in {\mathcal {B}}(B_+)} e^{-V(\varphi +\zeta _B)} \right) - 1\right) \right\| \leqslant L^2 e^{-\sigma /(2\beta )} (\Vert V\Vert +O(\Vert V\Vert ^2)), \end{aligned}$$
(4.35)

as needed.    \(\square \)

4.3 Proof of Theorem 1.3

We will now reduce the result for the Discrete Gaussian model to that for the Sine-Gordon model. For this, we carry out an initial renormalisation group step by hand, resulting in an effective Sine-Gordon potential for the Discrete Gaussian model. This strategy for the Discrete Gaussian model (and more general models) goes back to [32].

First, recall that the covariance of the hierarchical GFF can be written as

$$\begin{aligned} (-\beta \Delta _H+\varepsilon Q_N)^{-1} = C_0 + \cdots + C_N = C_0 + C_{\geqslant 1}, \end{aligned}$$
(4.36)

where \(C_0 = \frac{1}{\beta } Q_0\) and where \(Q_0\) is simply the identity matrix on \(\mathbb {R}^\Lambda \). Therefore, by the convolution property of Gaussian measures,

$$\begin{aligned} e^{-\frac{1}{2} (\sigma , (-\beta \Delta _H+\varepsilon Q_N) \sigma )} \propto \int _{\mathbb {R}^{\Lambda }} e^{-\frac{1}{2} (\varphi ,C_{\geqslant 1}^{-1}\varphi )} e^{- \frac{\beta }{2}(\varphi -\sigma ,\varphi - \sigma )} \, d\varphi \propto \mathbb {E}_{C_{\geqslant 1}}(e^{- \frac{\beta }{2} (\varphi -\sigma ,\varphi -\sigma )}), \end{aligned}$$
(4.37)

where \(A \propto B\) denotes that A / B is independent of \(\sigma \), and where the Gaussian expectation applies to the field \(\varphi \). We define the effective single-site potential \(V(\psi )\) for \(\psi \in \mathbb {R}\) by

$$\begin{aligned} e^{-V(\psi )} = \sum _{n \in 2\pi \mathbb {Z}} e^{-\beta (n-\psi )^2/2}. \end{aligned}$$
(4.38)

The potential V is \(2\pi \)-periodic as in the Sine-Gordon model; this is where the \(2\pi \)-periodicity of the Discrete Gaussian model is convenient. For \(\psi \in \mathbb {R}\), we also define a probability measure \(\mu _\psi \) on \(2\pi \mathbb {Z}\) by

$$\begin{aligned} \mu _\psi (n) = e^{V(\psi )} e^{-\beta (n-\psi )^2/2} \quad \text {for }n \in 2\pi \mathbb {Z}. \end{aligned}$$
(4.39)

For \(\varphi \in \mathbb {R}^\Lambda \), we further set \(\mu _\varphi = \prod _{x\in \Lambda } \mu _{\varphi _x}\) with \(\mu _{\varphi _x}\) as in (4.39) with \(\psi =\varphi _x\). With this notation, in summary, we have the representation

$$\begin{aligned} \sum _{\sigma \in (2\pi \mathbb {Z})^\Lambda } F(\sigma ) \, e^{-\frac{1}{2} (\sigma ,(-\beta \Delta _H+\varepsilon Q_N) \sigma )} \propto \mathbb {E}_{C_{\geqslant 1}}(e^{- V(\varphi )} \mathbb {E}_{\mu _\varphi }(F(\sigma ))) . \end{aligned}$$
(4.40)

Denote by \(\mu _r(d\varphi )\) the probability measure on \(\mathbb {R}^\Lambda \) of the Sine-Gordon model with potential \(V(\varphi )\) defined by (4.38), with \(C_{\geqslant 0}\) replaced by \(C_{\geqslant 1}\). Then (4.40) can be restated as

$$\begin{aligned} \mathbb {E}_\mu (F) = \mathbb {E}_{\mu _r}(\mathbb {E}_{\mu _\varphi }(F)). \end{aligned}$$
(4.41)

In the next two lemmas, we verify that V satisfies the conditions of Theorem 1.2 provided \(\beta \) is sufficiently small, and that the probability measure \(\mu _\psi \) satisfies a spectral gap inequality on \(2\pi \mathbb {Z}\), with constant uniform in \(\psi \). It is clear from the definition (4.38) that V is \(2\pi \)-periodic.

Lemma 4.5

For \(\beta >0\) small enough, V is smooth with \(\Vert V-{\hat{V}}(0)\Vert =O(e^{-1/(2\beta )})\).

Proof

The function \(F = e^{-V}\) is \(2\pi \)-periodic, and subtracting a constant from V, we can normalise F such that \({\hat{F}}(0)=1\). Note that subtraction of a constant does not change \(V-{\hat{V}}(0)\). The Fourier coefficients of F are then given by

$$\begin{aligned} {\hat{F}}(q) = \frac{1}{2\pi } \int _0^{2\pi } F(\psi ) e^{-iq\psi } \, d\psi = \frac{C}{2\pi } \int _\mathbb {R}e^{-\beta \psi ^2/2} e^{-iq\psi } \, d\psi = e^{-q^2/(2\beta )} , \end{aligned}$$
(4.42)

where the constant C and the last equality are due to the normalisation \({\hat{F}}(0)=1\). It follows that

$$\begin{aligned} \Vert F-1\Vert = \sum _{q\ne 0} (1+|q|)^2 e^{-q^2/(2\beta )} = O(e^{-1/(2\beta )}) . \end{aligned}$$
(4.43)

By (4.26), it then also follows that

$$\begin{aligned} \Vert V\Vert = \Vert \log F\Vert = \Vert \log (1+(F-1))\Vert = \Vert F-1\Vert + O(\Vert F-1\Vert ^2) = O(e^{-1/(2\beta )}). \end{aligned}$$
(4.44)

Since \(\Vert V-{\hat{V}}(0)\Vert \leqslant \Vert V\Vert \), this clearly implies the claim.    \(\square \)
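The Gaussian Fourier coefficients in (4.42) are easy to confirm numerically for the periodised Gaussian (4.38); the truncations and the value of \(\beta \) in the sketch below are illustrative.

```python
# Numerical sketch of (4.42)-(4.43) for the periodised Gaussian (4.38):
# after normalising F^(0) = 1, the coefficients of F = e^{-V} equal
# e^{-q^2/(2 beta)}. Truncations and beta are illustrative choices.
import numpy as np

beta = 0.25
psi = np.linspace(0.0, 2 * np.pi, 4096, endpoint=False)
ns = 2 * np.pi * np.arange(-40, 41)             # truncation of 2*pi*Z
F = np.exp(-beta * (ns[:, None] - psi[None, :]) ** 2 / 2).sum(axis=0)
F /= F.mean()                                   # normalise F^(0) = 1

for qq in range(1, 4):
    Fq = np.mean(F * np.exp(-1j * qq * psi))    # F^(q), convention of Sect. 4.1
    print(qq, abs(Fq), np.exp(-qq ** 2 / (2 * beta)))   # columns agree
```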

Corollary 4.6

For \(\beta >0\) sufficiently small, the measure \(\mu _r\) has inverse spectral gap \(O(1/\varepsilon )\).

Proof

The proof is essentially the same as that of Theorem 1.2. The only difference compared to Theorem 1.2 is that we replaced \(C_{\geqslant 0}\) by \(C_{\geqslant 1}\) which does not change the conclusion. For small \(\beta \), the assumption on V is satisfied thanks to Lemma 4.5.    \(\square \)

The following lemma can be proved, e.g., using the path method for spectral gap inequalities; we postpone the elementary proof to Appendix C.

Lemma 4.7

For any \(\beta >0\), there exists a constant \(C_\beta \) such that the measure \(\mu _\psi \) on \(2\pi \mathbb {Z}\) has a spectral gap uniformly in \(\psi \in \mathbb {R}\),

$$\begin{aligned} {{\,\mathrm{Var}\,}}_{\mu _\psi }(F(n)) \leqslant C_\beta \mathbb {E}_{\mu _\psi } \Big ( (F(n+2\pi )-F(n))^2 + (F(n-2\pi )-F(n))^2 \Big ). \end{aligned}$$
(4.45)
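Although the proof is deferred to Appendix C, the uniformity in \(\psi \) is easy to probe numerically. The sketch below truncates \(2\pi \mathbb {Z}\) and uses Metropolis rates for \(\pm 2\pi \) moves as an illustrative reversible dynamics compatible with the form (4.45); the truncation K and the value of \(\beta \) are likewise illustrative.

```python
# Numerical probe of Lemma 4.7 (illustrative, with Metropolis rates for
# +/- 2*pi moves and a truncation of 2*pi*Z): the inverse spectral gap
# of mu_psi stays bounded over psi. Not the path-method proof of App. C.
import numpy as np

beta, K = 0.5, 6
ks = np.arange(-K, K + 1)

def inv_gap(psi):
    logp = -beta * (2 * np.pi * ks - psi) ** 2 / 2   # log mu_psi + const
    d = np.diff(logp)
    n = len(ks)
    M = np.zeros((n, n))                             # symmetrised -generator
    for i in range(n - 1):
        M[i, i + 1] = M[i + 1, i] = -np.exp(-abs(d[i]) / 2)
        M[i, i] += np.exp(min(d[i], 0.0))            # rate i -> i+1
        M[i + 1, i + 1] += np.exp(min(-d[i], 0.0))   # rate i+1 -> i
    return 1.0 / np.sort(np.linalg.eigvalsh(M))[1]

# the maximum over a grid of psi stays bounded, consistent with uniformity
print(max(inv_gap(psi) for psi in np.linspace(0.0, 2 * np.pi, 9)))
```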

With the above ingredients, the proof can now be completed as follows.

Proof of Theorem 1.3

We start with the proof of the lower bound on the spectral gap. By (4.41), the variance of a function \(F: (2\pi \mathbb {Z})^\Lambda \rightarrow \mathbb {R}\) under the Discrete Gaussian measure can be written as

$$\begin{aligned} {{\,\mathrm{Var}\,}}_\mu (F) = \mathbb {E}_{\mu _r}({{\,\mathrm{Var}\,}}_{\mu _\varphi }(F)) + {{\,\mathrm{Var}\,}}_{\mu _r}(G), \quad \text {where } G(\varphi ) = \mathbb {E}_{\mu _\varphi }(F). \end{aligned}$$
(4.46)

By Corollary 4.6, the measure \(\mu _r\) has an inverse spectral gap bounded by \(O(1/\varepsilon )\). By Lemma 4.7 and the tensorisation principle for spectral gaps, the product measure \(\mu _\varphi = \prod _{x\in \Lambda } \mu _{\varphi _x}\) has a spectral gap uniformly bounded by \(C_\beta \). It follows that

$$\begin{aligned} {{\,\mathrm{Var}\,}}_\mu (F) \leqslant C_\beta {{\mathbb {D}}} (F) + O(\frac{1}{\varepsilon }) \sum _{x\in \Lambda } \mathbb {E}_{\mu _r}(|\nabla _{\varphi _x} G|^2), \end{aligned}$$
(4.47)

where the Dirichlet form introduced in (1.16) has been denoted by

$$\begin{aligned} {{\mathbb {D}}} (F) = \frac{1}{2(2\pi )^2} \sum _{x\in \Lambda } {{\mathbb {E}}} _\mu \Big ( (F(\sigma ^{x+})-F(\sigma ))^2 + (F(\sigma ^{x-})-F(\sigma ))^2 \Big ). \end{aligned}$$
(4.48)

We also set

$$\begin{aligned} {{\mathbb {D}}} _{x,\mu _{\varphi }} (F) = \frac{1}{2(2\pi )^2} {{\mathbb {E}}} _{\mu _{\varphi }} \Big ( (F(\sigma ^{x+})-F(\sigma ))^2 + (F(\sigma ^{x-})-F(\sigma ))^2 \Big ). \end{aligned}$$
(4.49)

Then the second term on the right-hand side is bounded as follows. Since, with respect to the measure \(\mu _\varphi \) for fixed \(\varphi \), the \(\sigma _x\) are independent, we have

$$\begin{aligned} |\nabla _{\varphi _x} G(\varphi )|^2 = \beta ^2 ({{\,\mathrm{Cov}\,}}_{\mu _{\varphi }}(F(\sigma ), \sigma _x))^2 \leqslant \beta ^2 \mathbb {E}_{\mu _{\varphi }}\bigl ( ({{\,\mathrm{Cov}\,}}_{\mu _{\varphi _x}}(F(\sigma ), \sigma _x))^2 \bigr ) \leqslant C_\beta ^2 {{\mathbb {D}}} _{x,\mu _{\varphi }} (F) \end{aligned}$$
(4.50)

where we used the following inequality, which follows from \({{\,\mathrm{Var}\,}}_{\mu _{\varphi _x}}(\sigma _x) \leqslant C_\beta \) and (4.45):

$$\begin{aligned} ({{\,\mathrm{Cov}\,}}_{\mu _{\varphi _x}}(F(\sigma ),\sigma _x))^2 \leqslant ({{{\,\mathrm{Var}\,}}_{\mu _{\varphi _x}}(F)}) ({{{\,\mathrm{Var}\,}}_{\mu _{\varphi _x}}(\sigma _x)}) \leqslant C_\beta ^2 {{\mathbb {D}}} _{x, \mu _{\varphi }} (F) . \end{aligned}$$
(4.51)

Using that \({{\mathbb {D}}} (F) = \sum _{x\in \Lambda } \mathbb {E}_{\mu _r}({{\mathbb {D}}} _{x,\mu _\varphi }(F))\), in summary, we conclude that

$$\begin{aligned} {{\,\mathrm{Var}\,}}_\mu (F) \leqslant C_\beta \Big (1+C_\beta O(\frac{1}{\varepsilon }) \Big ) {{\mathbb {D}}} (F) \end{aligned}$$
(4.52)

and therefore that the inverse spectral gap obeys \(1/\gamma = O(1/\varepsilon )\).

For the matching upper bound on the spectral gap, we use the test function \(F = \sum _{x\in \Lambda } \sigma _x\), analogously to the Sine-Gordon case. For any \(\psi \in \mathbb {R}\) and \(t \in \mathbb {R}\),

$$\begin{aligned} \mathbb {E}_{\mu _\psi }(e^{t\sigma }) = e^{V(\psi )} \sum _{n\in 2\pi \mathbb {Z}} e^{-\beta (n-\psi )^2/2 + n t} = e^{V(\psi )-V(\psi + t/\beta ) +t^2/(2\beta ) + t\psi }. \end{aligned}$$
(4.53)

Let \(u = \sum _{y} [C_{\geqslant 1}]_{xy}\) (which is independent of x). It follows that

$$\begin{aligned} e^{\Gamma (t)} = \mathbb {E}_{\mu }(e^{tF}) = \mathbb {E}_{\mu _r} \mathbb {E}_{\mu _\varphi }(e^{tF})&= e^{t^2 |\Lambda _N|/(2\beta )} \frac{\mathbb {E}_{C_{\geqslant 1}}(e^{-\sum _x V(\varphi _x+ t/\beta ) + t \sum _x \varphi _x})}{\mathbb {E}_{C_{\geqslant 1}}(e^{-V(\varphi )})} \nonumber \\&= e^{t^2 |\Lambda _N| (1/\beta +u)/2} \frac{\mathbb {E}_{C_{\geqslant 1}} (e^{-\sum _x V(\varphi _x+ t/\beta + t u)})}{\mathbb {E}_{C_{\geqslant 1}}(e^{-V(\varphi )})} . \end{aligned}$$
(4.54)

Since \(\sum _y [C_0]_{xy} = [C_0]_{xx}=1/\beta \), we have

$$\begin{aligned} 1 /\beta + u = \sum \nolimits _y \sum \nolimits _{j=0}^N [C_j]_{xy}= \sum \nolimits _{y} (-\beta \Delta _H+\varepsilon Q_N)^{-1}_{xy} = \varepsilon ^{-1}. \end{aligned}$$
(4.55)

As in the proof of Corollary 4.3, it follows that

$$\begin{aligned} {{\,\mathrm{Var}\,}}_{\mu }(F) = \frac{|\Lambda _N|}{\varepsilon } - \frac{V_{N,N}''(0)}{\varepsilon ^2} = \frac{|\Lambda _N|}{\varepsilon }(1+O(\frac{\kappa ^N}{\varepsilon L^{2N}})) = \frac{|\Lambda _N|}{\varepsilon }(1+O(\kappa ^N)). \end{aligned}$$
(4.56)

Since \({{\mathbb {D}}} (F) = |\Lambda _N|\), this completes the proof of \(\gamma \leqslant \varepsilon (1+O(\kappa ^N))\) and therefore the proof of the theorem.    \(\square \)