1 Introduction

On the state space \(\{0,1\}^{{\mathbb T} ^k}\), where \({\mathbb T} ^k\) is the regular rooted tree with \(k\ge 2\) children for each node, we consider a constrained spin model in which each spin, with rate one and iff all its children are zero, chooses a new value in \(\{0,1\}\) with probability \(1-p\) and \(p\) respectively. This model belongs to the class of kinetically constrained spin models which have been introduced in the physics literature to model liquid/glass transition or, more generally, glassy dynamics (see [12, 20] for physical background and [4] for related mathematical work). As for most of the kinetically constrained models, the Bernoulli(p) product measure \(\mu \) is a reversible measure for the process.

Remark 1.1

It may be useful to compare the above model to the popular heat-bath Glauber dynamics for the Ising model (see e.g. [18]). In the latter case, with rate one the spin at a vertex \(x\) is replaced by \(s\in \{0,1\}\) with probability proportional to \(\lambda ^{N_{x,s}}\), \(\lambda \ge 1\), where \(N_{x,s}\) is the number of neighbors of \(x\) having spin value \(s\). The dynamics becomes, in some sense, simpler than the one defined above (no hard constraints and with some very useful monotonicity properties) but it suffers from a much more complicated reversible stationary measure (the Ising Gibbs measure).

When \(k=1\) the model coincides with the well known East model [14] (see also [1, 4, 5, 10, 11] for rigorous analysis). As soon as \(k\ge 2\), the model shares some of the key features of another well known kinetically constrained system, namely the North East model [4, 15]. More specifically, since above the critical density \(p_c=1/k\) the occupied vertices begin to percolate (under the reversible measure \(\mu \)), blocked clusters appear and time ergodicity is lost. It is therefore particularly interesting to study the relaxation to equilibrium in e.g. finite sub-trees of \({\mathbb T} ^k\), when the density \(p\) is below, equal to, or above the critical density \(p_c=1/k\).

In [19] it was recently proved that, as long as \(p<p_c\), the process on the infinite tree is exponentially ergodic with a finite relaxation time \(T_\mathrm{rel}\). Under the same assumption, on a finite tree with suitable boundary conditions on the leaves the mixing time was also shown to be linear in the depth of the tree. When instead \(p>p_c\) the ergodicity on the infinite tree is lost and both the relaxation and the mixing times for finite trees diverge exponentially fast in the depth of the tree.

In this paper we tackle for the first time the critical case \(p=p_c\). Our main results, answering a question of Aldous-Diaconis [1], can be formulated as follows. Let \(T_\mathrm{rel}({\mathbb T} ^k_L),T_\mathrm{mix}({\mathbb T} ^k_L)\) be the relaxation and mixing time respectively of the process on a finite \(k\)-ary rooted tree of depth \(L\), with no constraints for the spins at the leaves (cf. Definitions 1.4 and 1.5).

  • Critical case Assume \(p=p_c\). Then (cf Theorem 1) \(T_\mathrm{rel}({\mathbb T} ^k_L)=\Omega (L^2)\) and \(T_\mathrm{rel}({\mathbb T} ^k_L)=O(L^{2+\beta })\) for some \(0\le \beta <\infty \).

  • Quasi-critical case Assume \(p=p_c-\epsilon \), \(0<\epsilon \ll 1\), and let \(T_\mathrm{rel}\) be the relaxation time for the process on the infinite tree \({\mathbb T} ^k\). Then (cf. Theorem 2) \(T_\mathrm{rel}=\Omega (\epsilon ^{-2})\) and \(T_\mathrm{rel}=O\bigl (\epsilon ^{-2-\alpha }\bigr )\) for some \(\alpha \ge 0\).

  • Mixing time We basically show (cf. Theorem 3) that \(T_\mathrm{mix}({\mathbb T} ^k_L) \) behaves like \(L\times T_\mathrm{rel}({\mathbb T} ^k_L)\). This behavior, which was established for \(p<p_c\) in [19], pertains to the critical and quasi-critical regime. When \(p>p_c\) it should no longer be correct.

Our results, which are identical to those proved for the Ising model on trees at the spin glass critical point [8], represent the first rigorous analysis of a kinetically constrained model at criticality. They also confirm what is believed to be a quite general phenomenon: when crossing a second order phase transition point (\(p=p_c\) in our case), the relaxation time should go from \(O(1)\) to critical power-law to exponential. Unfortunately such a scenario has been proved so far only for the Ising model on trees [8] and on \({\mathbb Z} ^2\) [16], but it should hold not only for the Ising model on \({\mathbb Z} ^d\), \(d\ge 3\), but for a much larger class of spin models. Finally, as shown in [19], our approach has a good chance to apply also to other models with an ergodicity phase transition, notably the North-East model on \({\mathbb Z} ^2\) for which the critical density \(p_c\) coincides with the oriented percolation threshold [4].

1.1 Model, notation and background

1.1.1 The graph

The model we consider is defined on the infinite rooted \(k\)-ary tree \({\mathbb T} ^k\) with root \(r\) and vertex set \(V\). For each \(x\in V\), \(\mathcal K_x\) will denote the set of its \(k\) children and \(d_x\) its depth, i.e. the graph distance between \(x\) and the root \(r\). A generic finite sub-tree of \({\mathbb T} ^k\) will be denoted by \(\mathcal T\). The finite \(k\)-ary subtree of \({\mathbb T} ^k\) with \(n\) levels is the set \({\mathbb T} ^k_n=\{x\in {\mathbb T} ^k:\ d_x\le n\}\). For \(x\in {\mathbb T} ^k_n\), \({\mathbb T} ^k_{x,n}\) will denote the \(k\)-ary sub-tree of \({\mathbb T} _n^k\) rooted at \(x\) with depth \(n-d_x\), where \(d_x\) is the depth of \(x\). In other words the leaves of \({\mathbb T} ^k_{x,n}\) are a subset of the leaves of \({\mathbb T} _n^k\). We also set \(\hat{\mathbb T} ^k_{x,n}={\mathbb T} ^k_{x,n}{\setminus }\{x\}\) (see Fig. 1 below). In the sequel, whenever no confusion arises, we will drop the superscripts \(k,n\) from \({\mathbb T} ^k_n\) and \({\mathbb T} _{x,n}^k\).

Fig. 1

For \(k=3\), the tree \({\mathbb T} \) rooted at \(r\), of depth \(L\) (i.e. with \(L\) levels below \(r\)), the set \(\Delta _x\) and the sub-set \(\hat{\mathbb T} _y\)

1.1.2 The configuration spaces

We choose as configuration space the set \(\Omega =\{0,1\}^V\) whose elements will usually be assigned Greek letters. We will often write \(\eta _x\) for the value at \(x\) of the element \(\eta \in \Omega \). We will also write \(\Omega _A\) for the set \(\{0,1\}^A\), \(A\subseteq V\). With a slight abuse of notation, for any \(A\subseteq V\) and any \(\eta ,\omega \in \Omega \), we let \(\eta _A\) be the restriction of \(\eta \) to the set \(A\) and \(\eta _A\cdot \omega _{A^c}\) be the configuration which equals \(\eta \) on \(A\) and \(\omega \) on \(V{\setminus }A\).

1.1.3 Probability measures

For any \(A\subseteq V\) we denote by \(\mu _A\) the product measure \(\otimes _{x\in A}\,\mu _x\) where each factor \(\mu _x\) is the Bernoulli measure on \(\{0,1\}\) with \(\mu _x(1)=p\) and \(\mu _x(0)=q\) with \(q=1-p\). If \(A=V\) we abbreviate \(\mu _V\) to \(\mu \). Also, with a slight abuse of notation, for any finite \(A\subset V\), we will write \(\mu (\eta _A)=\mu _A(\eta _A)\).

1.1.4 Conditional expectations and conditional variances

Given \(A\subset V\) and a function \(f :\Omega \rightarrow {\mathbb R} \) depending on finitely many variables, in the sequel referred to as local function, we define the function \(\eta _{A^c} \mapsto \mu _A(f)(\eta _{A^c})\) by the formula:

$$\begin{aligned} \mu _A(f)(\eta _{A^c}):= \sum _{\sigma \in \Omega _A} \mu _A(\sigma )f(\sigma _A\cdot \eta _{A^c}). \end{aligned}$$

Clearly \(\mu _A(f)\) coincides with the conditional expectation of \(f\) given the configuration outside \(A\). Similarly we write

$$\begin{aligned} \mathop {\mathrm{Var}}\nolimits _A(f) (\eta _{A^c})=\mu _A(f^2) (\eta _{A^c})-\left[ \mu _A(f) (\eta _{A^c})\right] ^2 \end{aligned}$$

for the conditional variance of \(f\) given \(\eta _{A^c}\). Usually we will omit writing explicitly the dependence on \(\eta _{A^c}\) whenever it will be clear from the context. Note that \( \mathop {\mathrm{Var}}\nolimits _A(f)=0\) iff \(f\) does not depend on the configuration inside \(A\). When \(A=V\), respectively \(A=\{x\}\) for some \(x \in V\), we abbreviate \( \mathop {\mathrm{Var}}\nolimits _V(f)\) to \( \mathop {\mathrm{Var}}\nolimits (f)\), respectively \( \mathop {\mathrm{Var}}\nolimits _{\{x\}}(f)\) to \( \mathop {\mathrm{Var}}\nolimits _x(f)\).
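As a concrete illustration of the two formulas above, the following minimal Python sketch evaluates \(\mu _A(f)\) and \( \mathop {\mathrm{Var}}\nolimits _A(f)\) by brute-force enumeration over \(\Omega _A\). The representation of a configuration as a dictionary from vertex labels to spins, as well as the function names `cond_mean` and `cond_var`, are assumptions made only for this example.

```python
from itertools import product


def cond_mean(f, A, eta, p):
    """mu_A(f)(eta_{A^c}): average f over the spins in A (i.i.d. Bernoulli(p)),
    keeping the spins of eta outside A fixed."""
    A = list(A)
    total = 0.0
    for sigma in product((0, 1), repeat=len(A)):
        weight = 1.0
        omega = dict(eta)          # copy eta, then overwrite the spins inside A
        for x, s in zip(A, sigma):
            omega[x] = s
            weight *= p if s == 1 else 1.0 - p
        total += weight * f(omega)
    return total


def cond_var(f, A, eta, p):
    """Var_A(f)(eta_{A^c}) = mu_A(f^2) - mu_A(f)^2."""
    m = cond_mean(f, A, eta, p)
    return cond_mean(lambda w: f(w) ** 2, A, eta, p) - m * m
```

For instance, with \(f(\eta )=\eta _0+2\eta _1\) and \(A=\{0\}\) the conditional variance equals \(pq\) whatever the value of \(\eta _1\), while for \(A=\{2\}\) it vanishes because \(f\) does not depend on \(\eta _2\).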

Definition 1.2

(OFA-kf model) The OFA-kf (Oriented Fredrickson-Andersen k-facilitated) model at density \(p\) is a continuous time Glauber type Markov process on \(\Omega \), reversible w.r.t. \(\mu \), with Markov semigroup \(P_t= e^{t\mathcal L}\) whose infinitesimal generator \(\mathcal L\) acts on local functions \(f:\Omega \rightarrow \mathbb R\) as follows:

$$\begin{aligned} \mathcal Lf(\omega )=\sum _{x\in {\mathbb T} ^k}c_{x}(\omega )\left[ \mu _x(f)(\omega )-f(\omega )\right] . \end{aligned}$$
(1.1)

The function \(c_x\), in the sequel referred to as the constraint at x, is defined by

$$\begin{aligned} c_{x}(\omega ) = \left\{ \begin{array}{l@{\quad }l} 1 &{} \text {if}\,\omega _y=0 \,\,\,\ \forall y\in \mathcal K_x\\ 0 &{} \text {otherwise}. \end{array}\right. \end{aligned}$$
(1.2)

It is easy to check by standard methods (see e.g. [17]) that the process is well defined and that its generator can be extended to a non-positive self-adjoint operator on \(L^2({\mathbb T} ^k,\mu )\).

The OFA-kf process can of course be defined also on finite rooted trees. In this case and in order to ensure irreducibility of the Markov chain the constraints \(c_x\) must be suitably modified.

Definition 1.3

(Finite volume dynamics) Let \(\mathcal T\) be a finite subtree of \({\mathbb T} ^k\) and let, for any \(\eta \in \Omega _\mathcal T\), \(\eta ^{0}\in \Omega \) denote the extension of \(\eta \) in \(\Omega \) given by

$$\begin{aligned} \eta ^0_x = \left\{ \begin{array}{l@{\quad }l} \eta _x &{} \text {if}\, x\in \mathcal T\\ 0 &{} \text {if}\, x\in {\mathbb T} ^k{\setminus }\mathcal T. \end{array}\right. \end{aligned}$$

For any \(x\in \mathcal T\) define the finite constraints \(c_{\mathcal T,x}\) by

$$\begin{aligned} c_{\mathcal T,x}(\eta )= c_x(\eta ^0). \end{aligned}$$
(1.3)

We will then consider the irreducible, continuous time Markov chains on \(\Omega _\mathcal T\) with generator

$$\begin{aligned} \mathcal L_\mathcal Tf=\sum _{x\in \mathcal T}c_{\mathcal T,x}[\mu _x(f)-f] \qquad \eta \in \Omega _\mathcal T. \end{aligned}$$
(1.4)
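The finite volume dynamics can be simulated directly from its graphical description: every vertex carries an independent rate-one Poisson clock and, when the clock at \(x\) rings, the spin at \(x\) is resampled from the Bernoulli(\(p\)) law provided \(c_{\mathcal T,x}=1\). The sketch below does this for the regular tree \({\mathbb T} ^k_L\). The heap-style labelling of the vertices (root \(0\), children of \(v\) equal to \(kv+1,\dots ,kv+k\)) and all function names are assumptions of this illustration, not notation from the text.

```python
import random


def num_vertices(k, L):
    """Number of vertices of the k-ary tree with levels 0, ..., L."""
    return (k ** (L + 1) - 1) // (k - 1)


def constraint(eta, x, k):
    """c_{T,x}(eta): True iff all children of x are vacant.
    Children outside the finite tree count as vacant (extension eta^0)."""
    n = len(eta)
    return all(eta[y] == 0 for y in range(k * x + 1, k * x + k + 1) if y < n)


def simulate(k, L, p, t_max, rng, eta=None):
    """Run the constrained dynamics up to time t_max and return the final state.
    The default initial state is the fully occupied configuration."""
    n = num_vertices(k, L)
    eta = [1] * n if eta is None else list(eta)
    t = 0.0
    while True:
        t += rng.expovariate(n)        # superposition of n rate-one clocks
        if t > t_max:
            return eta
        x = rng.randrange(n)           # the vertex whose clock rang
        if constraint(eta, x, k):
            eta[x] = 1 if rng.random() < p else 0
```

A call such as `simulate(2, 6, 0.5, 100.0, random.Random(0))` runs the critical binary case \(p=p_c=1/2\) on a tree of depth \(6\).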

Note that irreducibility of the above defined finite volume dynamics is guaranteed by the fact that starting from the vacant leaves one can empty any configuration via allowed spin flips. It is natural to define (see [4]) the critical density for the model by:

$$\begin{aligned} p_c =\sup \{p\in [0,1]:\text {0 is a simple eigenvalue of } \mathcal L\} \end{aligned}$$
(1.5)

The regime \(p<p_c\) is called the ergodic region and we say that an ergodicity breaking transition occurs at the critical density. In [19] it has been established that \(p_c\) coincides with the percolation threshold \(1/k\) and that for all \(p<p_c\) the value \(0\) is a simple eigenvalue of the generator \(\mathcal L\). Actually much more is known but first we need to introduce some relevant time scales.

Definition 1.4

(The relaxation time) Let \(\mathcal D(f) :=\mu (f,-\mathcal Lf)\) be the Dirichlet form corresponding to the generator \(\mathcal L\). We define the spectral gap of the process as

$$\begin{aligned} \mathop {\mathrm{gap}}\nolimits (\mathcal L):=\inf _{\begin{array}{c} f\in \mathrm{Dom}(\mathcal L)\\ f\ne \mathrm{const} \end{array}} \frac{\mathcal D(f)}{ \mathop {\mathrm{Var}}\nolimits (f)} \end{aligned}$$
(1.6)

We also define the relaxation time by \(T_\mathrm{rel}:= \mathop {\mathrm{gap}}\nolimits (\mathcal L)^{-1}\). Similarly, if \(\mathcal T\) is a finite rooted tree, we define \(T_\mathrm{rel}(\mathcal T):= \mathop {\mathrm{gap}}\nolimits (\mathcal L_\mathcal T)^{-1}\).
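Since the gap is an infimum of Rayleigh quotients, any explicit test function gives an upper bound on \( \mathop {\mathrm{gap}}\nolimits (\mathcal L_\mathcal T)\), i.e. a lower bound on \(T_\mathrm{rel}(\mathcal T)\). The sketch below evaluates \(\mathcal D_\mathcal T(f)/ \mathop {\mathrm{Var}}\nolimits _\mathcal T(f)\) by enumeration on a small regular tree, using the standard expression \(\mathcal D_\mathcal T(f)=\sum _{x}\mu _\mathcal T\bigl (c_{\mathcal T,x} \mathop {\mathrm{Var}}\nolimits _x(f)\bigr )\) of the Dirichlet form. The heap-style vertex labelling (root \(0\), children of \(v\) equal to \(kv+1,\dots ,kv+k\)) and the function name are assumptions made for this example, and the enumeration is only feasible for very small trees.

```python
from itertools import product


def rayleigh_quotient(f, k, L, p):
    """D_T(f) / Var_T(f) on the k-ary tree of depth L, by full enumeration.
    f takes a tuple of spins indexed in heap order and must be non-constant."""
    n = (k ** (L + 1) - 1) // (k - 1)
    q = 1.0 - p
    mean = second = dirichlet = 0.0
    for eta in product((0, 1), repeat=n):
        w = 1.0
        for s in eta:
            w *= p if s else q                 # product measure mu_T(eta)
        val = f(eta)
        mean += w * val
        second += w * val * val
        for x in range(n):
            kids = range(k * x + 1, min(k * x + k + 1, n))
            if any(eta[y] for y in kids):
                continue                       # c_{T,x}(eta) = 0
            flipped = eta[:x] + (1 - eta[x],) + eta[x + 1:]
            # Var_x(f) = p q (f(eta^x) - f(eta))^2 for a two-valued spin
            dirichlet += w * p * q * (f(flipped) - val) ** 2
    return dirichlet / (second - mean * mean)
```

For the test function \(f(\eta )=\eta _r\) the quotient equals \(q^k\), the probability that all children of the root are vacant, so that \(T_\mathrm{rel}(\mathcal T)\ge q^{-k}\) on any such tree of depth at least one.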

Definition 1.5

(Mixing times) Let \(\mathcal T\) be a finite rooted sub-tree of \({\mathbb T} ^k\). For any \(\eta \in \Omega _\mathcal T\) we denote by \(\nu _t^\eta \) the law at time \(t\) of the Markov chain with generator \(\mathcal L_{\mathcal T}\) and by \(h_t^\eta \) its relative density w.r.t. \(\mu _\mathcal T\). Following [21], we define the family of mixing times \(\{T_a(\mathcal T)\}_{a\ge 1}\) by

$$\begin{aligned} T_a(\mathcal T):= \inf \left\{ t\ge 0:\ \max _\eta \mu _{\mathcal T} \left( |h_t^\eta -1|^a\right) ^{1/a}\le 1/4\right\} . \end{aligned}$$

Notice that \(T_1(\mathcal T)\) coincides with the usual mixing time \(T_\mathrm{mix}(\mathcal T)\) of the chain (see e.g. [18]) and that, for any \(a\ge 1\), \(T_1\le T_a\).

With the above notation it was proved in [19] that

  (i) for all \(p<p_c\), \(T_\mathrm{rel}<+\infty \) and that the mixing time on a finite regular \(k\)-ary sub-tree of depth \(L\) grows linearly in \(L\);

  (ii) if \(p>p_c\), then both the relaxation time and the mixing time on a finite regular \(k\)-ary sub-tree of depth \(L\) grow exponentially fast in \(L\).

1.2 Main results

Our first contribution concerns the critical case \(p=p_c\).

Theorem 1

Fix \(k\ge 2\) and assume \(p=p_c\). Then there exist constants \(c>0\) and \(\beta \ge 0\), with \(\beta \) independent of \(k\), such that for each \(L\)

$$\begin{aligned} c^{-1} L^2\le T_\mathrm{rel}\bigl ({\mathbb T} _L^k\bigr )\le cL^{2+\beta }. \end{aligned}$$

Remark 1.6

The above result implies, in particular, that the relaxation time for the critical process on the infinite tree \({\mathbb T} ^k\) is infinite. However the process is still ergodic in the sense that \(0\) is a simple eigenvalue of the generator \(\mathcal L\). This can be proven along the same lines as [4, Proposition 2.5], using the key ingredient that, at \(p=p_c\), there is a.s. no infinite percolation of occupied vertices.

Our second main result deals with the quasi-critical regime, \(p=p_c-\epsilon \) with \(0<\epsilon \ll 1\), on the infinite tree \({\mathbb T} ^k\).

Theorem 2

Fix \(k\ge 2\) and assume \(0<p<p_c\). Then there exist constants \(a>0\) and \(\alpha \ge 0\), with \(\alpha \) independent of \(k\), such that

$$\begin{aligned} a^{-1} (p_c-p)^{-2}\le T_\mathrm{rel}\le a(p_c-p)^{-(2+\alpha )}. \end{aligned}$$

The last result concerns some consequences of the above theorems for the mixing time on a finite sub-tree.

Theorem 3

Fix \(k\ge 2\) and \(p\in (0,1)\). There exists \(c>0\) such that, for all \(L\),

$$\begin{aligned} \frac{1}{c} L\, T_\mathrm{rel}\bigl ({\mathbb T} ^k_{\lfloor L/2\rfloor }\bigr )\le T_1({\mathbb T} _L^k)\le T_2({\mathbb T} _L^k) \le cL\,T_\mathrm{rel}({\mathbb T} _L^k). \end{aligned}$$
(1.7)

In particular:

  (i) if \(p=p_c\), then

    $$\begin{aligned} c^{-1}L^3 \le T_1({\mathbb T} _L^k)\le cL^{3+\beta }; \end{aligned}$$

  (ii) if \(p<p_c\), then

    $$\begin{aligned} \frac{1}{c} (p_c-p)^{-2} L \le T_1({\mathbb T} _L^k) \le c L (p_c-p)^{-(2+\alpha )} \end{aligned}$$

for some constants \(\alpha , \beta \ge 0\) independent of \(k,L\).

1.3 Additional notation and technical preliminaries

We first introduce the natural bootstrap map for the model.

Definition 1.7

The bootstrap map \(B:\{0,1\}^{{\mathbb T} ^k}\rightarrow \{0,1\}^{{\mathbb T} ^k}\) associated to the OFA-kf model is defined by

$$\begin{aligned} B(\eta )_x=\left\{ \begin{array}{l@{\quad }l} 0 &{} \text {if either}\, \eta _x=0 \ \hbox {or} \ c_x(\eta )=1\\ 1 &{} \text {otherwise}\end{array}\right. \end{aligned}$$
(1.8)

with \(c_x\) defined in (1.2).
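A minimal sketch of the bootstrap map on a finite tree may help fix ideas. It assumes the heap-style labelling (root \(0\), children of \(v\) equal to \(kv+1,\dots ,kv+k\)) and treats vertices outside the finite tree as vacant, in the spirit of Definition 1.3; both conventions, and the function names, are choices made for this illustration only.

```python
def bootstrap(eta, k):
    """One application of B to a configuration on a finite k-ary tree.
    eta is a list of spins in heap order; missing children count as vacant."""
    n = len(eta)
    out = []
    for x in range(n):
        children = [y for y in range(k * x + 1, k * x + k + 1) if y < n]
        free = all(eta[y] == 0 for y in children)      # c_x(eta) = 1
        out.append(0 if (eta[x] == 0 or free) else 1)
    return out


def iterate_bootstrap(eta, k, n_iter):
    """Return B^{n_iter}(eta)."""
    for _ in range(n_iter):
        eta = bootstrap(eta, k)
    return eta
```

Starting from the fully occupied configuration on the binary tree of depth \(2\), one application of `bootstrap` empties the leaves, a second one empties the first level and a third one empties the root.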

Remark 1.8

Notice that: (i) if after \(n\) iterations of the bootstrap map \(c_x(B^n(\eta ))=1\) then, even if \(\eta _x=1\), the percolation cluster of \(1\)’s attached to \(x\) is contained in the first \(n\) levels below \(x\) and (ii) the bootstrap critical point, i.e. the largest value of \(p\) such that, \(\mu \)-a.s., infinitely many iterations of the bootstrap map are able to make the root vacant (see e.g. [2]), coincides with the percolation threshold \(p_c=1/k\).

Secondly we formulate two technical results which will be useful in the sequel. Let \(E^{(n)}_x=\{\eta :\ B^n(\eta )_x=1\}\) and define \(p_n:=\mu (E^{(n)}_r)\). Notice that \(p_n\) is increasing in \(p\) and that \(p_n\le p\) for all \(n\).

Lemma 1.9

  (i) If \(p\le p_c\) then \(p_n\le \frac{2}{(k-1)n}\) for all \(n\ge 1\).

  (ii) Assume \(p=p_c-\epsilon \) with \(\epsilon \in [0,1/k]\). Then \(p_n\le p(1-\epsilon k)^n\) for all \(n\ge 1\).

Proof

  (i) We start from

$$\begin{aligned} \mu \left( E^{(n+1)}_r\right) =p\mu \left( \cup _{x\in \mathcal K_r}E^{(n)}_x\right) , \end{aligned}$$
(1.9)

or, equivalently,

$$\begin{aligned} p_{n+1}=p(1-(1-p_n)^k). \end{aligned}$$

Using the monotonicity in \(p\) of the \(p_n\)’s it is enough to prove the statement for \(p=p_c\). The inclusion-exclusion (Bonferroni) inequalities applied to the union in (1.9) imply that (recall that now \(p=1/k\))

$$\begin{aligned} p_{n+1}&\le \frac{1}{k} \left[ k p_n - {k\atopwithdelims ()2}p_n^2 + {k\atopwithdelims ()3}p_n^3\right] \nonumber \\&= p_n - \frac{(k-1)}{2}p_n^2 +\frac{(k-1)(k-2)}{6} p_n^3. \end{aligned}$$
(1.10)

One readily checks that the r.h.s. of (1.10) is increasing in \(p_n\in [0,1/k]\). Thus, if we assume inductively that \(p_n \le \frac{2}{(k-1)n},\ n\ge 2\), we obtain

$$\begin{aligned} p_{n+1}\le \frac{2}{(k-1)}\left[ \frac{1}{n}-\frac{1}{n^2} + \frac{2(k-2)}{3(k-1)n^3}\right] \le \frac{2}{(k-1)(n+1)} \quad n\ge 2. \end{aligned}$$

The base case \(p_2\) follows from the trivial observation that \(p_2\le p_1\le \frac{1}{k} < \frac{1}{k-1}\).

  (ii) Taking a union bound in (1.9) gives

$$\begin{aligned} p_{n+1}\le pkp_n= (1-\epsilon k)p_n\le \cdots \le (1-\epsilon k)^np. \end{aligned}$$

\(\square \)
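Both bounds of Lemma 1.9 are easy to check numerically from the recursion \(p_{n+1}=p(1-(1-p_n)^k)\) with \(p_0=p\). The following sketch iterates the recursion; the function name is an assumption of this example, and the check is of course an illustration rather than a substitute for the proof.

```python
def survival_probabilities(p, k, n_max):
    """Return [p_0, ..., p_{n_max}] where p_0 = p and
    p_{n+1} = p * (1 - (1 - p_n)**k), as in the proof of Lemma 1.9."""
    ps = [p]
    for _ in range(n_max):
        ps.append(p * (1.0 - (1.0 - ps[-1]) ** k))
    return ps


if __name__ == "__main__":
    k = 2
    ps = survival_probabilities(1.0 / k, k, 1000)
    # At criticality n * p_n stays below 2 / (k - 1), in agreement with part (i).
    print(max(n * ps[n] for n in range(1, 1001)))
```

At \(p=p_c\) the printed quantity \(\max _n n\,p_n\) stays below \(2/(k-1)\), while for \(p=p_c-\epsilon \) the sequence decays at least as fast as \(p(1-\epsilon k)^n\).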

The second technical ingredient is the following monotonicity result for the spectral gap (see [4, Lemma 2.11] for a proof).

Lemma 1.10

Let \({\mathbb T} _1 \subset {\mathbb T} _2\) be two sub-trees of  \({\mathbb T} ^k\). Then,

$$\begin{aligned} \mathop {\mathrm{gap}}\nolimits ({\mathcal {L}}_{{\mathbb T} _1}) \ge \mathop {\mathrm{gap}}\nolimits ({\mathcal {L}}_{{\mathbb T} _2}). \end{aligned}$$

2 The critical case: proof of Theorem 1

2.1 Upper bound of the relaxation time

Let \({\mathbb T} \equiv {\mathbb T} ^k_L, {\mathbb T} _x\equiv {\mathbb T} ^k_{x,L}\) and \(\hat{\mathbb T} _x\equiv \hat{\mathbb T} ^k_{x,L}\). We divide the proof of the upper bound on \(T_\mathrm{rel}({\mathbb T} )\) into three steps.

2.1.1 First step

(Comparison with a long-range auxiliary dynamics). Motivated by [19] we introduce auxiliary long range constraints as follows.

Definition 2.1

For any integer \(\ell \ge 1\) we set

$$\begin{aligned} c_x^{(\ell )}(\eta )=\left\{ \begin{array}{l@{\quad }l} 1 &{}\text { if} \ c_x(B^{\ell -1} (\eta ))=1\\ 0 &{}\text {otherwise.}\end{array}\right. \end{aligned}$$

Remark 2.2

One can use the functions \(c_x^{(\ell )} \) to define an auxiliary long range dynamics with generator given by (1.1) with \(c_x\) replaced by \(c_x^{(\ell )}\). For this new constrained dynamics a vertex \(x\) is free to flip iff, by a sequence of at most \(\ell \) flips satisfying the original constraints (1.2) all the children of \(x\) can be made vacant.

Fix now \(\delta \in (0,1/9)\) and choose \(\ell = (1-\delta )L\) (neglecting integer parts). Let also \(c_{{\mathbb T} ,x}^{(\ell )}(\eta ):=c_{x}^{(\ell )}(\eta ^0)\), where \(\eta ^0\) is given in Definition 1.3. Notice that \(c_{{\mathbb T} ,x}^{(\ell )}(\eta )\equiv 1\) iff \(d_x> L-\ell \).
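The long range constraint can be evaluated without iterating the bootstrap map explicitly. By (1.8), \(B^{m-1}(\eta )_v=1\) iff there is a downward path of \(m\) occupied vertices starting at \(v\), so that \(c_{{\mathbb T} ,x}^{(\ell )}(\eta )=1\) iff no child of \(x\) is the top of such a path of \(\ell \) vertices. The sketch below uses this reformulation on a finite tree with heap-style labels (root \(0\), children of \(v\) equal to \(kv+1,\dots ,kv+k\)); the labelling and the function names are assumptions of this illustration.

```python
def occupied_chain(eta, v, m, k):
    """True iff an occupied downward path of m vertices starts at v,
    i.e. B^{m-1}(eta^0)_v = 1. Vertices outside the tree are vacant."""
    if v >= len(eta) or eta[v] == 0:
        return False
    if m == 1:
        return True
    return any(occupied_chain(eta, y, m - 1, k)
               for y in range(k * v + 1, k * v + k + 1))


def long_range_constraint(eta, x, ell, k):
    """c^{(ell)}_{T,x}(eta): 1 iff every child y of x has B^{ell-1}(eta^0)_y = 0."""
    blocked = any(occupied_chain(eta, y, ell, k)
                  for y in range(k * x + 1, k * x + k + 1))
    return 0 if blocked else 1
```

On the fully occupied configuration, which is the worst case, the function returns \(1\) exactly at the vertices with \(d_x>L-\ell \), in agreement with the observation made above; for \(\ell =1\) it reduces to the original constraint \(c_{{\mathbb T} ,x}\).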

We will establish the inequality

$$\begin{aligned} \mathop {\mathrm{Var}}\nolimits _{\mathbb T} (f)\le \lambda \sum _{x\in {\mathbb T} } \mu _{\mathbb T} \left( \mathop {\mathrm{Var}}\nolimits _x(\mu _{\hat{\mathbb T} _x}(c_{{\mathbb T} ,x}^{(\ell )}f))\right) \qquad \forall f \end{aligned}$$
(2.1)

with \(\lambda = 2(\frac{1-\delta }{1-9\delta })\).

Remark 2.3

Inequality (2.1) will be proven following the strategy of [19]. Notice however that here we do not perform another Cauchy-Schwarz inequality to pull out the constraint \(c_{{\mathbb T} ,x}^{(\ell )}\) and get the Dirichlet form with long range constraints.

We start from

$$\begin{aligned} \mathop {\mathrm{Var}}\nolimits _{\mathbb T} (f)\le \sum _{x\in {\mathbb T} } \mu _{\mathbb T} \left( \mathop {\mathrm{Var}}\nolimits _x(\mu _{\hat{\mathbb T} _x}(f))\right) . \end{aligned}$$
(2.2)

The above inequality follows easily from a repeated use of the formula for conditional variance and we refer to section 4.1 in [19] for a short proof. We now examine a generic term \(\mu \left( \mathop {\mathrm{Var}}\nolimits _x\left( \mu _{\hat{\mathbb T} _x}(f)\right) \right) \) in the r.h.s. of (2.2). We write

$$\begin{aligned} \mu _{\hat{\mathbb T} _x}(f)=\mu _{\hat{\mathbb T} _x}\left( c_{{\mathbb T} ,x}^{(\ell )}f\right) + \mu _{\hat{\mathbb T} _x}\left( [1-c_{{\mathbb T} ,x}^{(\ell )}]f\right) \end{aligned}$$

so that

$$\begin{aligned} \mathop {\mathrm{Var}}\nolimits _x\left( \mu _{\hat{\mathbb T} _x}(f) \right) \le 2 \mathop {\mathrm{Var}}\nolimits _x\left( \mu _{\hat{\mathbb T} _x}\left( c_{{\mathbb T} ,x}^{(\ell )}f\right) \right) + 2 \mathop {\mathrm{Var}}\nolimits _x\left( \mu _{\hat{\mathbb T} _x}\left( (1-c_{{\mathbb T} ,x}^{(\ell )})f\right) \right) . \end{aligned}$$
(2.3)

We now consider the second term \( \mathop {\mathrm{Var}}\nolimits _x\left( \mu _{\hat{\mathbb T} _x}\left( (1-c_{{\mathbb T} ,x}^{(\ell )})f\right) \right) \). Without loss of generality we can assume \(\mu _{\hat{\mathbb T} _x}(f)=0\). Recall that the constraint \(c_{{\mathbb T} ,x}^{(\ell )}\) depends only on the spin configuration in the first \(\ell \) levels below \(x\), in the sequel denoted by \(\Delta _x\) (see Fig. 1).

Thus

$$\begin{aligned} \mu _{\hat{\mathbb T} _x}\left( (1-c_{{\mathbb T} ,x}^{(\ell )})f\right) = \mu _{\hat{\mathbb T} _x}\left( (1-c_{{\mathbb T} ,x}^{(\ell )})\mu _{\hat{\mathbb T} _x{\setminus }\Delta _x}(f)\right) \end{aligned}$$

and

$$\begin{aligned} \mathop {\mathrm{Var}}\nolimits _x\left( \mu _{\hat{\mathbb T} _x}\left( (1\!-\!c_{{\mathbb T} ,x}^{(\ell )})f\right) \right)&\le \mu _{{\mathbb T} _x}\left( \mu _{\hat{\mathbb T} _x}\left( (1\!-\!c_{{\mathbb T} ,x}^{(\ell )})\mu _{\hat{\mathbb T} _x{\setminus }\Delta _x}(f)\right) ^2\right) \nonumber \\&\le \mu _{{\mathbb T} _x}(1\!-\!c_{{\mathbb T} ,x}^{(\ell )}) \mu _{{\mathbb T} _x}\left( \mu _{\hat{\mathbb T} _x{\setminus }\Delta _x}(f)^2\right) \nonumber \\&= \mu _{{\mathbb T} _x}(1\!-\!c_{{\mathbb T} ,x}^{(\ell )}) \mathop {\mathrm{Var}}\nolimits _{{\mathbb T} _x}\left( \mu _{\hat{\mathbb T} _x{\setminus }\Delta _x}(f)\right) \nonumber \\&\le \mu _{{\mathbb T} _x}(1\!-\!c_{{\mathbb T} ,x}^{(\ell )})\sum _{y\in \Delta _x\cup x}\mu _{{\mathbb T} _x}\left( \mathop {\mathrm{Var}}\nolimits _y(\mu _{\hat{\mathbb T} _y}(f))\right) \qquad \qquad \end{aligned}$$
(2.4)

where we used the Cauchy-Schwarz inequality, the fact that \(c_{{\mathbb T} ,x}^{(\ell )}\) does not depend on \(\eta _x\) and (2.2) in the last inequality. From the definition of \(c_{{\mathbb T} ,x}^{(\ell )}\) on the finite tree \({\mathbb T} \) we have

$$\begin{aligned} \mu _{{\mathbb T} _x}(1- c_{{\mathbb T} ,x}^{(\ell )})=\left\{ \begin{array}{l@{\quad }l} 0 &{} \text {if} \ d_x>\delta L\\ p_{\ell }/p&{} \text {otherwise}\end{array}\right. \end{aligned}$$
(2.5)

In conclusion, using (2.3), (2.4) and (2.5),

$$\begin{aligned}&\sum _{x\in {\mathbb T} } \!\mu _{\mathbb T} \left[ \mathop {\mathrm{Var}}\nolimits _x(\mu _{\hat{\mathbb T} _x}(f))\right] \end{aligned}$$
(2.6)
$$\begin{aligned}&\quad \le 2\sum _{x\in {\mathbb T} }\!\mu _{\mathbb T} \left[ \mathop {\mathrm{Var}}\nolimits _x(\mu _{\hat{\mathbb T} _x}(c_{{\mathbb T} ,x}^{(\ell )}f))\right] + 2\frac{p_{\ell }}{p}\sum _{\genfrac{}{}{0.0pt}{}{x:}{d_x \le \delta L}}\sum _{y\in \Delta _x\cup x}\mu _{{\mathbb T} }[ \mathop {\mathrm{Var}}\nolimits _y(\mu _{\hat{\mathbb T} _y}(f))]\nonumber \\&\quad \le 2\sum _{x\in {\mathbb T} }\!\mu _{\mathbb T} \left[ \mathop {\mathrm{Var}}\nolimits _x(\mu _{\hat{\mathbb T} _x}(c_{{\mathbb T} ,x}^{(\ell )}f))\right] + 2\frac{p_{\ell }}{p}\left[ \max _z N_z \right] \sum _{y}\mu _{{\mathbb T} }[ \mathop {\mathrm{Var}}\nolimits _y(\mu _{\hat{\mathbb T} _y}(f))]\qquad \qquad \end{aligned}$$
(2.7)

where

$$\begin{aligned} N_z:=\#\{x:\ \Delta _x \ni z,\ d_x\le \delta L\}\le \min (\delta L,\ell +1). \end{aligned}$$

Part (i) of Lemma 1.9 implies that \(p_{\ell }\le \frac{2}{(k-1)\ell }=\frac{2}{(k-1)(1-\delta )L}\) so that

$$\begin{aligned}&\sum _{x\in {\mathbb T} } \mu _{\mathbb T} \left( \mathop {\mathrm{Var}}\nolimits _x(\mu _{\hat{\mathbb T} _x}(f))\right) \nonumber \\&\quad \le 2\sum _{x\in {\mathbb T} }\!\mu _{\mathbb T} \left[ \mathop {\mathrm{Var}}\nolimits _x(\mu _{\hat{\mathbb T} _x}(c_{{\mathbb T} ,x}^{(\ell )}f))\right] \!+\! \frac{4\delta }{p(1\!-\!\delta )(k\!-\!1)}\sum _{x\in {\mathbb T} }\mu _{{\mathbb T} }[ \mathop {\mathrm{Var}}\nolimits _x(\mu _{\hat{\mathbb T} _x}(f))]\qquad \qquad \end{aligned}$$
(2.8)

Since \(p=1/k\) and \(k/(k-1)\le 2\), inequality (2.1) holds with \(\lambda = 2(1-\delta )/(1-9\delta )\) provided \(8\delta /(1-\delta )<1\).
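For the reader's convenience, the absorption step can be spelled out. The shorthand \(S\) for the left-hand side of (2.8) and \(A\) for the first sum on its right-hand side is introduced only for this computation. Using \(p=1/k\) and \(k/(k-1)\le 2\), inequality (2.8) reads

$$\begin{aligned} S\le 2A+\frac{4\delta k}{(1-\delta )(k-1)}\,S\le 2A+\frac{8\delta }{1-\delta }\,S. \end{aligned}$$

Since \(S\) is finite and \(8\delta /(1-\delta )<1\) is equivalent to \(\delta <1/9\), the last term can be moved to the left-hand side:

$$\begin{aligned} S\le \frac{2A}{1-\frac{8\delta }{1-\delta }}=\frac{2(1-\delta )}{1-9\delta }\,A=\lambda A. \end{aligned}$$

Combined with (2.2), this is exactly (2.1).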

2.1.2 Second step

(Analysis of the auxiliary dynamics). Let \(h_i= \alpha ^i\), \(\alpha >1\) to be fixed later on, and let

$$\begin{aligned} T_{i}:= T_\mathrm{rel}({\mathbb T} ^k_{h_{i}\wedge \ell }). \end{aligned}$$
(2.9)

We shall now prove that

$$\begin{aligned} \sum _{x\in {\mathbb T} } \mu _{\mathbb T} \left( \mathop {\mathrm{Var}}\nolimits _x( \mu _{\hat{\mathbb T} _x}(c_{{\mathbb T} ,x}^{(\ell )}f))\right) \le \left[ 2+ \frac{4\alpha }{p(k-1)} \left( \sum ^{n}_{i=1} \sqrt{T_{i}}\right) ^2 \right] \mathcal D_{\mathbb T} (f), \end{aligned}$$
(2.10)

with \(n\) such that \(h_{n-1}< \ell \le h_n\).

The starting point is (2.1). For any \(x\in {\mathbb T} \) we introduce a scale decomposition of the constraint \(c_{{\mathbb T} ,x}^{(\ell )}\) as follows \(c_{{\mathbb T} ,x}^{(\ell )}=\sum _{i=0}^{n-1}\chi _i +c_{{\mathbb T} ,x}\), where \(\chi _i:= c_{{\mathbb T} ,x}^{(h_{i+1}\wedge \ell )}-c_{{\mathbb T} ,x}^{(h_{i}\wedge \ell )}\). Thus

$$\begin{aligned}&\sum _{x\in {\mathbb T} } \mu _{\mathbb T} \left( \mathop {\mathrm{Var}}\nolimits _x( \mu _{\hat{\mathbb T} _x}(c_{{\mathbb T} ,x}^{(\ell )}f))\right) \\&\quad \le 2 \sum _{x\in {\mathbb T} } \mu _{\mathbb T} \left( \mathop {\mathrm{Var}}\nolimits _x( \mu _{\hat{\mathbb T} _x}(c_{{\mathbb T} ,x} f))\right) + 2 \sum _{x\in {\mathbb T} } \mu _{\mathbb T} \left( \mathop {\mathrm{Var}}\nolimits _x\left( \mu _{\hat{\mathbb T} _x}\left( \sum _{i=0}^{n-1}\chi _i f\right) \right) \right) \\&\quad \le 2\mathcal D_{{\mathbb T} }(f) + 2 \sum _{x\in {\mathbb T} } \mu _{\mathbb T} \left( \mathop {\mathrm{Var}}\nolimits _x\left( \mu _{\hat{\mathbb T} _x}\left( \sum _{i=0}^{n-1}\chi _i f\right) \right) \right) , \end{aligned}$$

where in the last inequality we used convexity to conclude that

$$\begin{aligned} \mu _{\mathbb T} \left( \mathop {\mathrm{Var}}\nolimits _x( \mu _{\hat{\mathbb T} _x}(c_{{\mathbb T} ,x} f))\right) \le \mu _{\mathbb T} \left( c_{{\mathbb T} ,x} \mathop {\mathrm{Var}}\nolimits _x(f)\right) . \end{aligned}$$

We now examine the key term \(\sum _{x\in {\mathbb T} } \mu _{\mathbb T} \left( \mathop {\mathrm{Var}}\nolimits _x( \mu _{\hat{\mathbb T} _x}(\sum _{i=0}^{n-1}\chi _i f))\right) \).

Observe first that \(\chi _i=0\) if \(h_i\ge \ell \) and that \(\chi _i=1\) implies the number of iterations of the bootstrap map necessary to make the node \(x\) flippable is at least \(h_i\) but no more than \(h_{i+1}\wedge \ell \). In particular, if \(\chi _i(\eta )=1\), there exists a “line” of zeros of \(\eta \) within \(h_{i+1}\wedge \ell \) levels below \(x\). For such an \(\eta \) we denote by \(\Gamma (\eta )\) the “lowest” such line constructed as follows.

Consider the nodes in \({\mathbb T} _x\) at distance \(h_{i+1}\wedge \ell \) from \(x\). Let us order them from left to right as \(z_1,z_2,\dots \); start from \(z_1\) and find the first vacant site on the branch leading to \(x\). Call this vertex \(y_1\) and forget about all the \(z_i\)’s having \(y_1\) as an ancestor. Say that the remaining nodes are \(z_{k_1},z_{k_1+1},\dots \); repeat the construction for \(z_{k_1}\) to get a new vacant node \(y_2\) and so forth. At the end of this procedure some of the \(y_i\) may have some other \(y_k\) as an ancestor. In this case we remove the former from our collection and we relabel accordingly. The line \(\Gamma (\eta )\) is then the final collection \((y_1,y_2,\dots )\).
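The construction of \(\Gamma (\eta )\) just described can be sketched in a few lines of Python. The sketch assumes that the sub-tree is rooted at vertex \(0\) with heap-style labels (children of \(v\) equal to \(kv+1,\dots ,kv+k\)), that the search along each branch starts at the bottom node \(z\) itself, and that `h` plays the role of \(h_{i+1}\wedge \ell \); these conventions and the function names are assumptions made for this illustration.

```python
def parent(v, k):
    return (v - 1) // k


def is_ancestor(a, v, k):
    """True iff a is a strict ancestor of v (heap labels, root 0)."""
    while v > 0:
        v = parent(v, k)
        if v == a:
            return True
    return False


def lowest_line(eta, k, h):
    """Lowest line of vacant sites within h levels below the root 0."""
    first = (k ** h - 1) // (k - 1)            # leftmost node at distance h
    last = (k ** (h + 1) - 1) // (k - 1)       # one past the rightmost one
    found = []
    for z in range(first, last):               # left-to-right scan
        if any(is_ancestor(y, z, k) for y in found):
            continue                           # z lies below an earlier y
        v = z
        while v > 0 and eta[v] == 1:           # climb to the first vacant site
            v = parent(v, k)
        if v == 0:
            raise ValueError("fully occupied branch: no line of zeros")
        found.append(v)
    # discard every y that has another selected vertex as an ancestor
    return [y for y in found if not any(is_ancestor(o, y, k) for o in found)]
```

For example, on the binary tree with `h = 2` and `eta = [1, 0, 0, 0, 1, 1, 1]` the scan first selects vertex `3`, then its parent `1` (found from the occupied node `4`), then `2`; the final pruning removes `3` and the line is `[1, 2]`.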

Fig. 2

For \(k=3\), the sub-tree \({\mathbb T} _x\) rooted at \(x\) and a configuration \(\eta \) such that \(\chi _i(\eta )=1\). The line of vacant sites corresponds to a set \(\gamma \in \mathcal G_i\)

We denote by \(\mathcal G_i\) the space of all possible realizations of \(\Gamma \). Moreover, given \(\gamma \in \mathcal G_i\), we denote by \(\hat{\mathbb T} _x^{\gamma ,+}\) all the nodes in \(\hat{\mathbb T} _x\) which have no ancestor in \(\gamma \), i.e. the part of the tree “above” \(\gamma \). Note that the above construction of \(\Gamma \) is made without looking at the configuration above \(\Gamma \) (Fig. 2). This observation together with the definition of the variance and Cauchy-Schwarz inequality gives

$$\begin{aligned}&\mathop {\mathrm{Var}}\nolimits _x\left( \mu _{\hat{\mathbb T} _x}\left( \sum _{i=0}^{n-1}\chi _i f\right) \right) = p(1-p)\left[ \,\sum _{i=0}^{n-1} \mu _{\hat{\mathbb T} _x}(\chi _i \nabla _x f)\,\right] ^2 \nonumber \\&\quad = p(1-p)\left[ \sum _{i=0}^{n-1} \sum _{\gamma \in \mathcal G_i}\mu _{\hat{\mathbb T} _x{\setminus }\hat{T}_x^{\gamma ,+}}\left( {1\!\!1} _{\Gamma =\gamma }\;\mu _{ \hat{T}_x^{\gamma ,+}}\left( \chi _i \nabla _x f \right) \right) \right] ^2 \nonumber \\&\quad \le p(1\!-\!p)\left[ \sum _{i=0}^{n-1} \sum _{\gamma \in \mathcal G_i}\mu _{\hat{\mathbb T} _x{\setminus }\hat{T}_x^{\gamma ,+}} \left( {1\!\!1} _{\Gamma =\gamma }\sqrt{\mu _{\hat{T}_x^{\gamma ,+}}(\chi _i) \mu _{\hat{T}_x^{\gamma ,+}}(|\nabla _x f|^2) }\right) \right] ^2.\qquad \qquad \quad \end{aligned}$$
(2.11)

where \(\nabla _xf(\eta )=f(\eta ^x)-f(\eta )\) with \(\eta ^x_y= \eta _y\) if \(y\ne x\) and \(\eta ^x_x=1-\eta _x\). Consider now the last factor inside the square root and multiply it by \(p(1-p)\). It satisfies

$$\begin{aligned} p(1-p) \mu _{T_x^{\gamma ,+}}(|\nabla _x f|^2)=\mu _{T_x^{\gamma ,+}}( \mathop {\mathrm{Var}}\nolimits _x(f))\le \mathop {\mathrm{Var}}\nolimits _{{\mathbb T} _x^{\gamma ,+}}(f) \le T_\mathrm{rel}({\mathbb T} _x^{\gamma ,+})\mathcal D_{{\mathbb T} _x^{\gamma ,+}}(f) \end{aligned}$$

where we used the convexity of the variance and the Poincaré inequality. Lemma 1.10 now gives \( T_\mathrm{rel}({\mathbb T} _x^{\gamma ,+})\le T_{i+1}. \) In conclusion

$$\begin{aligned} p(1-p) \mu _{T_x^{\gamma ,+}}(|\nabla _x f|^2) \le T_{i+1}\mathcal D_{{\mathbb T} _x^{\gamma ,+}}(f). \end{aligned}$$

To bound the first factor inside the square root of (2.11) we note that \({1\!\!1} _{\Gamma =\gamma }c_{{\mathbb T} ,x}^{(h_i)}={1\!\!1} _{\Gamma =\gamma }c_{{\mathbb T} _x^{\gamma ,+},x}^{(h_i)}\). Indeed the finite volume constraints \(c_{{\mathbb T} _x^{\gamma ,+},y}\) are defined with zeros on the set \(\gamma \) of the leaves of \({\mathbb T} _x^{\gamma ,+}\) [see (1.3)] and in turn \({1\!\!1} _{\Gamma (\eta )=\gamma }\) guarantees the presence of such zeros for the configuration \(\eta \). Thus, using the monotonicity in the volume of the probability that the root \(x\) is connected to the level \(h_i\),

$$\begin{aligned} {1\!\!1} _{\{\Gamma =\gamma \}}\mu _{T_x^{\gamma ,+}}(\chi _i)&\le {1\!\!1} _{\{\Gamma =\gamma \}}\mu _{T_x^{\gamma ,+}}(1-c_x^{(h_i)}) ={1\!\!1} _{\{\Gamma =\gamma \}}\mu _{T_x^{\gamma ,+}}\left( 1-c_{T_x^{\gamma ,+},x}^{(h_i)}\right) \\&\le \mu (1-c_x^{(h_i)})=p_{h_i}/p. \end{aligned}$$

In conclusion, the r.h.s. of (2.11) is bounded from above by

$$\begin{aligned}&\frac{1}{p}\left( \sum _{i=0}^{n-1} \sqrt{ T_{i+1}p_{h_i}} \; \mu _{\hat{\mathbb T} _x}\left( \sum _{\gamma \in \mathcal G_i}{1\!\!1} _{\Gamma =\gamma }\sqrt{\mathcal D_{{\mathbb T} _x^{\gamma ,+}}(f)}\right) \right) ^2 \nonumber \\&\quad \le \frac{1}{p}\left( \sum _{i=0}^{n-1} \sqrt{ T_{i+1}p_{h_i}} \; \sqrt{\mu _{\hat{\mathbb T} _x}\left( \sum _{\gamma \in \mathcal G_i}{1\!\!1} _{\Gamma =\gamma }\mathcal D_{{\mathbb T} _x^{\gamma ,+}}(f)\right) }\right) ^2 \nonumber \\&\quad \le \frac{1}{p} \left( \sum _{i=0}^{n-1} \sqrt{T_{i+1}}\right) \left( \sum _{i=0}^{n-1} \sqrt{T_{i+1}}\;p_{h_i}\;\mu _{\hat{\mathbb T} _x}\left( \sum _{\gamma \in \mathcal G_i}{1\!\!1} _{\Gamma =\gamma }\mathcal D_{{\mathbb T} _x^{\gamma ,+}}(f)\right) \right) \nonumber \\&\quad \le \frac{1}{p} \left( \sum _{i=0}^{n-1} \sqrt{T_{i+1}}\right) \left( \sum _{i=0}^{n-1}\sqrt{T_{i+1}}\, p_{h_i} \mathop {\mathop {\sum }\limits _{y\in \hat{\mathbb T} _x}}\limits _{d_y\le d_x+h_{i+1}}\mu _{\hat{\mathbb T} _x}(c_y \mathop {\mathrm{Var}}\nolimits _y(f))\right) \qquad \qquad \end{aligned}$$
(2.12)

where we used the Cauchy-Schwarz inequality in the first and second inequalities together with

$$\begin{aligned} \mu _{\hat{\mathbb T} _x}\left( \sum _{\gamma \in \mathcal G_i}{1\!\!1} _{\Gamma =\gamma }\mathcal D_{{\mathbb T} _x^{\gamma ,+}}(f)\right) \le \mathop {\mathop {\sum }_{y\in \hat{\mathbb T} _x}}_{d_y\le d_x+h_{i+1}}\mu _{\hat{\mathbb T} _x}(c_{{\mathbb T} ,y} \mathop {\mathrm{Var}}\nolimits _y(f)) \end{aligned}$$

because \({1\!\!1} _{\{\Gamma (\eta )=\gamma \}}c_{T_x^{\gamma ,+},y}(\eta )= {1\!\!1} _{\{\Gamma (\eta )=\gamma \}} c_{{\mathbb T} ,y}(\eta )\). If we now average over \(\mu _{{\mathbb T} }\) and sum over \(x\in {\mathbb T} \) in the above result, we get that

$$\begin{aligned}&\sum _{x\in {\mathbb T} } \mu _{\mathbb T} \left( \mathop {\mathrm{Var}}\nolimits _x( \mu _{\hat{\mathbb T} _x}(\sum _{i=0}^{n-1}\chi _if))\right) \\&\quad \le \frac{1}{p} \left( \sum _{i=0}^{n-1} \sqrt{T_{i+1}}\right) \left( \sum _{i=0}^{n-1} \sqrt{T_{i+1}}\, p_{h_i}\mathop {\mathop {\sum }_{x\in {\mathbb T} } \sum _{y\in \hat{\mathbb T} _x}}_{d_y\le d_x+h_{i+1}}\mu _{\mathbb T} (c_{{\mathbb T} ,y} \mathop {\mathrm{Var}}\nolimits _y(f))\right) \\&\quad \le \frac{1}{p} \left( \sum _{i=0}^{n-1} \sqrt{T_{i+1}}\right) \left( \sum _{i=0}^{n-1} \sqrt{T_{i+1}}\, p_{h_i}h_{i+1}\right) \mathcal D_{\mathbb T} (f)\\&\quad \le \frac{2\alpha }{p(k-1)} \left( \sum _{i=0}^{n-1} \sqrt{T_{i+1}}\right) ^2 \mathcal D_{\mathbb T} (f) \end{aligned}$$

and (2.10) follows. Above we used the exponential growth of the scales \(\{h_i\}_i\) together with (i) of Lemma 1.9 to obtain \(p_{h_i}h_{i+1}\le 2\alpha /(k-1)\).

2.1.3 Third step

(Recurrence). With the above notation, (2.1) and (2.10) yield the following key recursive inequality:

$$\begin{aligned} T_\mathrm{rel}({\mathbb T} )\le \lambda \left[ 2+ \frac{4\alpha }{p(k-1)} \left( \sum ^{n-1}_{i=0} \sqrt{T_{i}}\right) ^2 \right] \end{aligned}$$

with \(T_i\) given by (2.9) and \(\lambda =2\frac{1-\delta }{1-9\delta }\). Suppose now that \(L=\alpha ^{N+1}\) and \(\ell = \alpha ^{N}\) with \(\alpha = (1-\delta )^{-1}\). Then \(T_\mathrm{rel}({\mathbb T} )=T_{N+1}\) and \(n=N\). If we set \(a_i:=\sqrt{T_i}\) then we get

$$\begin{aligned} a_{N+1}\le c \sum _{i=0}^N a_i,\quad c=\lambda ^{1/2}\left( 2+\frac{4\alpha }{p(k-1)} \right) ^{1/2}, \end{aligned}$$

which implies that \(b_n:= \sum _{i=0}^n a_i\) satisfies \(b_{N+1}\le (1+c)b_N\). In conclusion

$$\begin{aligned} T_\mathrm{rel}({\mathbb T} )= a_{N+1}^2\le b^2_{N+1} \le (1+c)^{2N}b^2_1. \end{aligned}$$

The proof of the upper bound of \(T_\mathrm{rel}({\mathbb T} )\) in Theorem 1 is complete if the depth \(L\) is of the form \(\alpha ^{n}, n\in {\mathbb N} \). The extension to general values of \(L\) follows at once from Lemma 1.10.
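As a numerical illustration of the last step (not part of the proof), the sketch below iterates the worst case of the recursion, in which \(a_{m+1}=c\,b_m\) holds with equality for every \(m\ge 1\), and checks that \(b_{N+1}=(1+c)^N b_1\). The values of `c`, `a0`, `a1` and `N` are arbitrary placeholders, not constants taken from the text.

```python
def worst_case_partial_sums(c, a0, a1, N):
    """Return [b_1, ..., b_{N+1}] when a_{m+1} = c * b_m for m = 1..N."""
    b = a0 + a1            # b_1 = a_0 + a_1
    sums = [b]
    for _ in range(N):
        a_next = c * b     # saturate the bound a_{m+1} <= c * b_m
        b += a_next        # b_{m+1} = b_m + a_{m+1} = (1 + c) * b_m
        sums.append(b)
    return sums


c, a0, a1, N = 3.0, 1.0, 2.0, 10   # hypothetical values
sums = worst_case_partial_sums(c, a0, a1, N)
b1, b_last = sums[0], sums[-1]
a_last = c * sums[-2]              # a_{N+1} in the saturated case

# Chain a_{N+1}^2 <= b_{N+1}^2 <= (1+c)^{2N} b_1^2, with a small rounding margin.
print(a_last**2 <= b_last**2 <= (1 + c) ** (2 * N) * b1**2 * (1 + 1e-12))
```

Since \(L=\alpha ^{N+1}\), the number of steps \(N\) grows like \(\log L/\log \alpha \), so the factor \((1+c)^{2N}\) is polynomial in \(L\), with exponent \(2\log (1+c)/\log \alpha \).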

2.2 Lower bound on the relaxation time \(T_\mathrm{rel}\)

As a test function in the variational characterization of \(T_\mathrm{rel}({\mathbb T} )\) we take the cardinality \(N_r\) of the percolation cluster \(\mathcal C_r\) of occupied sites associated with the root \(r\). More formally

$$\begin{aligned} N_r(\eta ):= \#\{x\in {\mathbb T} : \ \eta _y=1 \ \forall y\in \gamma _x\} \end{aligned}$$

where \(\gamma _x\) is the unique path in \({\mathbb T} \) joining \(x\) to the root \(r\). Notice that \(N_r\) can be written as \( N_r(\eta )= \eta _r\bigl (\sum _{i=1}^k N_{x_i}+1\bigr )\), where \(\{x_i\}_{i=1}^k\) are the children of the root and \(N_{x_i}\) denotes the analogue of the quantity \(N_r\) with \({\mathbb T} \) replaced by the sub-tree \({\mathbb T} _{x_i}\) rooted at \(x_i\).

We now compute the variance and Dirichlet form of \(N_r\). Clearly

$$\begin{aligned} c^{-1}\sum _{x\in {\mathbb T} }\mu (x \text { is a leaf of } \mathcal C_r)\le \mathcal D_{{\mathbb T} }(N_r)\le c \sum _{x\in {\mathbb T} }\mu (x \text { is a leaf of } \mathcal C_r) \le c\mu (N_r) \end{aligned}$$

for some constant \(c=c(k)\). Moreover \(\mu (N_r)= p\left( k\mu (N_{x_1})+1\right) \) which, for \(p=p_c=1/k\), implies that \(\mu (N_r)=L/k\). To compute \(\mathrm{Var }_{{\mathbb T} }(N_r)\) we use the above expression for \(N_r\) together with the formula for conditional variance to write

$$\begin{aligned} \mathrm{Var }_{{\mathbb T} }(N_r)&= \mu \left( \mathrm{Var }_{{\mathbb T} }(N_r | \eta _r)\right) + \mathrm{Var }_{{\mathbb T} }\bigl (\mu (N_r | \eta _r)\bigr )\nonumber \\&= pk \mathrm{Var }_{{\mathbb T} _{x_1}}(N_{x_1}) + \mathrm{Var }_{{\mathbb T} }\bigl (\eta _r (k\mu (N_{x_1}) +1)\bigr )\nonumber \\&= \mathrm{Var }_{{\mathbb T} _{x_1}}\bigl (N_{x_1}\bigr ) +p(1-p)L^2. \end{aligned}$$
(2.13)

Hence \( \mathrm{Var }_{{\mathbb T} }(N_r)\ge c'L^3\) and

$$\begin{aligned} T_\mathrm{rel}({\mathbb T} )\ge \frac{\mathrm{Var }_{{\mathbb T} }(N_r)}{\mathcal D_{{\mathbb T} }(N_r)}\ge c'' L^2. \end{aligned}$$

\(\square \)
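The two recursions used above, \(\mu (N_r)=p(k\mu (N_{x_1})+1)\) and (2.13), can be checked exactly with rational arithmetic. The sketch below is only a sanity check under the assumption that a tree of depth \(L\) has \(L\) levels (so that a single vertex has depth one and \(\mu (N_r)=L/k\) at \(p=1/k\)); the function names are hypothetical. It also compares the recursion with a brute-force enumeration on a small binary tree.

```python
from fractions import Fraction
from itertools import product


def moments_by_recursion(k, depth):
    """Mean and variance of N_r at p = 1/k on a k-ary tree with `depth` levels."""
    p = Fraction(1, k)
    mean, var = Fraction(0), Fraction(0)      # empty tree
    for _ in range(depth):
        size_if_occupied = k * mean + 1       # mu(N_r | eta_r = 1)
        var = p * k * var + p * (1 - p) * size_if_occupied**2
        mean = p * size_if_occupied
    return mean, var


def moments_by_enumeration(levels):
    """Brute-force mean and variance of N_r on the binary tree at p = 1/2."""
    n = 2**levels - 1                         # vertices in heap order, root = 0
    total, total_sq = Fraction(0), Fraction(0)
    for eta in product((0, 1), repeat=n):
        path_occupied = [0] * n               # 1 iff the whole path to the root is occupied
        for x in range(n):
            parent_ok = 1 if x == 0 else path_occupied[(x - 1) // 2]
            path_occupied[x] = eta[x] * parent_ok
        size = sum(path_occupied)
        total += size
        total_sq += size * size
    mean = total / 2**n                       # all configurations are equally likely
    return mean, total_sq / 2**n - mean**2


print(moments_by_recursion(2, 3), moments_by_enumeration(3))
for L in (10, 100, 1000):
    mean, var = moments_by_recursion(2, L)
    print(L, float(mean) / L, float(var) / L**3)
```

Iterating (2.13) at \(p=1/k\) gives \(\mathrm{Var }_{{\mathbb T} }(N_r)=p(1-p)\sum _{j=1}^L j^2=p(1-p)L(L+1)(2L+1)/6\ge p(1-p)L^3/3\), which makes the bound \(\mathrm{Var }_{{\mathbb T} }(N_r)\ge c'L^3\) explicit under the depth convention assumed here.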

3 The quasi-critical case: proof of Theorem 2

Here we assume \(p=p_c-\epsilon \) with \(\epsilon >0\) and, without loss of generality, \(\epsilon k\ll 1\). Recall that we work directly on the infinite tree \({\mathbb T} ^k\).

3.1 Upper bound on the relaxation time \(T_\mathrm{rel}\)

We first claim that, for any \(\ell \) such that \( 2(\ell +1)(1-\epsilon k)^{\ell }<1\), one has

$$\begin{aligned} \mathop {\mathrm{Var}}\nolimits (f)\le \lambda \sum _{x\in {\mathbb T} ^k}\mu \left( \mathop {\mathrm{Var}}\nolimits _x(\mu _{\hat{\mathbb T} _x}(c_{{\mathbb T} ,x}^{(\ell )}f))\right) \end{aligned}$$
(3.1)

with \(\lambda =\frac{2}{1-2(\ell +1)(1-\epsilon k)^{\ell }}\). The proof of (3.1) starts from inequality (2.4), whose derivation does not depend on the value of \(p\). After that we proceed as follows. Since \(p=p_c-\epsilon \), Lemma 1.9 (ii) implies that

$$\begin{aligned} \mu _{T_x}(1-c_{{\mathbb T} ,x}^{(\ell )})=\frac{p_{\ell }}{p}\le (1-\epsilon k)^{\ell }\quad \forall x\in {\mathbb T} ^k. \end{aligned}$$

Thus

$$\begin{aligned} \mathrm{Var }(f)&\le \sum _{x\in {\mathbb T} ^k} \!\mu \left[ \mathop {\mathrm{Var}}\nolimits _x(\mu _{\hat{\mathbb T} _x}(f))\right] \\&\le 2\sum _{x\in {\mathbb T} ^k}\!\mu _{\mathbb T} \left[ \mathop {\mathrm{Var}}\nolimits _x(\mu _{\hat{\mathbb T} _x}(c_{{\mathbb T} ,x}^{(\ell )}f))\right] + 2(\ell +1)(1-\epsilon k)^{\ell }\sum _{x\in {\mathbb T} ^k}\mu [ \mathop {\mathrm{Var}}\nolimits _x(\mu _{\hat{\mathbb T} _x}(f))] \end{aligned}$$

and (3.1) follows.

Now choose \(\ell =-2\frac{\log (\epsilon k)}{\epsilon k}\), so that \(\lambda <4\) in (3.1) for any \(\epsilon \) small enough, and define, for \(x\in {\mathbb T} ^k\), \({\mathbb T} _x\) as the finite \(k\)-ary tree rooted at \(x\) of depth \(\ell \).
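The claim \(\lambda <4\) can be checked numerically. The sketch below evaluates \(\lambda =2/(1-2(\ell +1)(1-\epsilon k)^{\ell })\) for the above choice of \(\ell \), treating \(\ell \) as a real number (an assumption; the text does not say how \(\ell \) is rounded). The sample values of \(\epsilon \) and \(k\) are arbitrary.

```python
import math


def ell_and_lambda(eps, k):
    """Return (ell, lambda) for ell = -2 log(eps k) / (eps k), with 0 < eps k < 1."""
    x = eps * k
    ell = -2.0 * math.log(x) / x
    small = 2.0 * (ell + 1.0) * (1.0 - x) ** ell
    if small >= 1.0:
        return ell, math.inf       # (3.1) gives no bound at this scale
    return ell, 2.0 / (1.0 - small)


for eps in (1e-1, 1e-2, 1e-3, 1e-4):
    ell, lam = ell_and_lambda(eps, k=2)
    print(f"eps={eps:g}  ell={ell:.1f}  lambda={lam:.4f}  lambda<4: {lam < 4}")
```

Since \((1-\epsilon k)^{\ell }\le e^{-\epsilon k\ell }=(\epsilon k)^2\), the correction term is at most \(2(\ell +1)(\epsilon k)^2\), which tends to zero as \(\epsilon \rightarrow 0\). For moderately small \(\epsilon \) the bound can still fail, which is why the statement is restricted to \(\epsilon \) small enough.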

Exactly the same arguments leading to (2.12), but without the subtleties of the intermediate scales \(\{h_i\}_i\), show that

$$\begin{aligned} \mu \left( \mathop {\mathrm{Var}}\nolimits _x(\mu _{\hat{\mathbb T} _x}(c_x^{(\ell )}f))\right) \le T_\mathrm{rel}({\mathbb T} )\sum _{y\in {\mathbb T} _x} \mu \left( c_y \mathop {\mathrm{Var}}\nolimits _y(f)\right) . \end{aligned}$$
(3.2)

If we now combine (3.2) together with (3.1) we get

$$\begin{aligned} \mathop {\mathrm{Var}}\nolimits (f)\le 4\ell \ T_\mathrm{rel}({\mathbb T} ) \mathcal D(f) \end{aligned}$$
(3.3)

for all \(\epsilon \) small enough. Finally we claim that \(T_\mathrm{rel}({\mathbb T} )\le c \ell ^\beta \) for some appropriate constants \(c,\beta \).

To prove the claim it is enough to observe that, in its proof for the case \(p=p_c\) given in Sect. 2, only upper bounds on percolation probabilities played a role. By monotonicity these bounds hold for any \(p\le p_c\). Hence the claim. In conclusion

$$\begin{aligned} \mathop {\mathrm{Var}}\nolimits (f)\le c\ell ^{1+\beta }\mathcal D(f) \end{aligned}$$

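and \(T_\mathrm{rel}\le c \ell ^{1+\beta }\le c' \epsilon ^{-\beta '}\) for any \(\beta '>1+\beta \) and all \(\epsilon \) small enough, since \(\ell \) equals \(\epsilon ^{-1}\) up to a factor logarithmic in \(1/\epsilon \).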

3.2 Lower bound on the relaxation time \(T_\mathrm{rel}\)

Thanks to Lemma 1.10, \(T_\mathrm{rel}\ge T_\mathrm{rel}(\mathcal T)\) for any finite sub-tree \(\mathcal T\). We now choose \(\mathcal T\) as the \(k\)-ary tree rooted at \(r\) with depth \(\ell =\lfloor 1/\epsilon \rfloor \) and proceed exactly as in the proof of Theorem 1. Using the notation of Sect. 2.2 we have

$$\begin{aligned} \mathcal D_{{\mathbb T} }(N_r)\le c\mu (N_r)\le c' \ell \end{aligned}$$

where we used the fact that the average of \(N_r\) at \(p<p_c\) is bounded from above by the same average computed at \(p=p_c\) since \(N_r\) is increasing (w.r.t. the natural partial order in \(\Omega _{{\mathbb T} }\)). To compute \(\mathrm{Var }_{{\mathbb T} }(N_r)\) we proceed recursively starting from [cf. (2.13)]

$$\begin{aligned} \mathrm{Var }_{{\mathbb T} }(N_r)&= (1-k\epsilon )\mathrm{Var }_{{\mathbb T} _{x_1}}(N_{x_1}) + \frac{1-p}{p}\mu (N_r)^2\\ \mu (N_r)&= (1-\epsilon k)\mu (N_{x_1}) +p \end{aligned}$$

Since the number of steps of the iteration is \(\lfloor 1/\epsilon \rfloor \) one immediately concludes that \(\mu (N_r)\ge c_k \ell \) and \( \mathrm{Var }_{{\mathbb T} }(N_r)\ge c'_k \ell ^3\) for some constants \(c_k, c'_k\) depending only on \(k\). Thus

$$\begin{aligned} T_\mathrm{rel}\ge T_\mathrm{rel}(\mathcal T)\ge \frac{\mathrm{Var }_{{\mathbb T} }(N_r)}{\mathcal D_{{\mathbb T} }(N_r)}\ge c \ell ^2 = c\,\epsilon ^{-2}, \end{aligned}$$

for some constant \(c>0\).
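The two-line recursion above is easy to iterate numerically. The sketch below does so for \(\epsilon =1/\ell \), so that \(\lfloor 1/\epsilon \rfloor =\ell \) exactly, starting from the empty tree; it assumes \(\ell >k\) so that \(p>0\). Since \(\mathcal D_{{\mathbb T} }(N_r)\le c\mu (N_r)\), the ratio \(\mathrm{Var }_{{\mathbb T} }(N_r)/\mu (N_r)\) is used here as a proxy for the lower bound on \(T_\mathrm{rel}\), up to the unspecified constant \(c\). The function name and the sample values are hypothetical.

```python
def subcritical_moments(k, ell):
    """Iterate the recursion for mu(N_r) and Var(N_r) with eps = 1/ell, p = 1/k - eps."""
    eps = 1.0 / ell
    p = 1.0 / k - eps
    mean, var = 0.0, 0.0                       # empty tree
    for _ in range(ell):
        mean = (1.0 - eps * k) * mean + p      # mu(N_r) at the new level
        var = (1.0 - eps * k) * var + (1.0 - p) / p * mean**2
    return mean, var


for ell in (100, 1000, 10000):
    mean, var = subcritical_moments(2, ell)
    print(f"ell={ell}  mean/ell={mean / ell:.4f}  "
          f"var/ell^3={var / ell**3:.4f}  (var/mean)/ell^2={var / mean / ell**2:.4f}")
```

The three normalized quantities should stay bounded away from zero as \(\ell \) grows, in line with \(\mu (N_r)\ge c_k\ell \) and \(\mathrm{Var }_{{\mathbb T} }(N_r)\ge c'_k\ell ^3\).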

4 Mixing times: proof of Theorem 3

The specific statements (i) and (ii) are direct consequences of (1.7), Theorem 1 and Theorem 2. The upper bound \( T_1(\mathcal T)\le T_2(\mathcal T)\le cLT_\mathrm{rel}(\mathcal T)\) was proved in [19, Corollary 1]. It remains to prove the lower bound, and this is what we do now, following an idea of [8].

Consider two probability measures \(\pi ,\nu \) on \(\Omega _{{\mathbb T} }\) and recall their Hellinger distance

$$\begin{aligned} d_\mathcal H(\pi ,\nu ):=\sqrt{2-2I_\mathcal H(\pi ,\nu )}, \end{aligned}$$

where

$$\begin{aligned} I_\mathcal H(\pi ,\nu ):= \sum _\omega \sqrt{\pi (\omega )\nu (\omega )}. \end{aligned}$$

Clearly

$$\begin{aligned} I_\mathcal H(\pi ,\nu )\ge \sum _{\eta \in \Omega _{\mathbb T} }\pi (\eta )\wedge \nu (\eta )\ge 1- \Vert \pi -\nu \Vert _{TV}. \end{aligned}$$

If we combine the above inequality with [9, Lemma 4.2 (i)] we get

$$\begin{aligned} \frac{1}{2} d_\mathcal H(\pi ,\nu )^2\le \Vert \pi -\nu \Vert _{TV}\le d_\mathcal H(\pi ,\nu ). \end{aligned}$$

Assume now that \(\pi ,\nu \) are product measures, \(\pi =\prod _{i=1}^n \pi _i,\ \nu =\prod _{i=1}^n\nu _i\), so that

$$\begin{aligned} I_\mathcal H(\pi ,\nu )= \prod _{i=1}^n I_\mathcal H(\pi _i,\nu _i). \end{aligned}$$

Therefore

$$\begin{aligned} \Vert \pi -\nu \Vert _{TV}&\ge 1-I_\mathcal H(\pi ,\nu )=1-\prod _{i=1}^n I_\mathcal H(\pi _i,\nu _i)\nonumber \\&= 1- \prod _{i=1}^n \left( 1-\frac{1}{2} d_\mathcal H(\pi _i,\nu _i)^2\right) \nonumber \\&\ge 1-\prod _{i=1}^n \left( 1-\frac{1}{2} \Vert \pi _i-\nu _i\Vert ^2_{TV}\right) \nonumber \\&\ge 1-e^{-\sum _i \frac{1}{2} \Vert \pi _i-\nu _i\Vert ^2_{TV}}. \end{aligned}$$
(4.1)
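The inequalities leading to (4.1) can be tested on small random examples. The sketch below draws random factor distributions (the sizes, seeds and tolerances are arbitrary choices), builds the product measures explicitly, and checks the sandwich \(\frac{1}{2} d_\mathcal H^2\le \Vert \cdot \Vert _{TV}\le d_\mathcal H\), the product formula for \(I_\mathcal H\), and the final bound in (4.1). It is a numerical sanity check, not a proof.

```python
import math
import random
from itertools import product


def affinity(pi, nu):
    """Hellinger affinity I_H = sum_w sqrt(pi(w) nu(w))."""
    return sum(math.sqrt(a * b) for a, b in zip(pi, nu))


def hellinger(pi, nu):
    """d_H = sqrt(2 - 2 I_H); max() guards against tiny negative rounding errors."""
    return math.sqrt(max(0.0, 2.0 - 2.0 * affinity(pi, nu)))


def total_variation(pi, nu):
    return 0.5 * sum(abs(a - b) for a, b in zip(pi, nu))


def random_distribution(rng, size):
    weights = [rng.random() + 1e-3 for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


def product_measure(factors):
    """Flatten a product of distributions into one list over the product space."""
    return [math.prod(ws) for ws in product(*factors)]


def check_bounds(seed=0, n_factors=4, size=3, tol=1e-12):
    rng = random.Random(seed)
    pis = [random_distribution(rng, size) for _ in range(n_factors)]
    nus = [random_distribution(rng, size) for _ in range(n_factors)]
    pi, nu = product_measure(pis), product_measure(nus)
    tv, dh = total_variation(pi, nu), hellinger(pi, nu)
    sandwich = 0.5 * dh**2 <= tv + tol and tv <= dh + tol
    factorized = math.prod(affinity(a, b) for a, b in zip(pis, nus))
    multiplicative = abs(affinity(pi, nu) - factorized) < 1e-9
    exponent = sum(0.5 * total_variation(a, b) ** 2 for a, b in zip(pis, nus))
    bound_41 = tv + tol >= 1.0 - math.exp(-exponent)
    return sandwich, multiplicative, bound_41


print(check_bounds())
```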

Suppose now that, for each \(i\le n\), \(\nu _i\) is the distribution at time \(t\) of some finite, ergodic, continuous time Markov chain \(X^{(i)}\), reversible w.r.t. \(\pi _i\) and with initial state \(x_i\). In this case the measure \(\nu \) is the distribution at time \(t\) of the product chain \(X=\otimes _i X_i\) started from \(x=(x_1,\dots ,x_n)\) and \(\pi \) is the reversible measure.

Let \(\lambda _i\) be the spectral gap of the chain \(X^{(i)}\), let \(f_i\) be the corresponding eigenvector and choose the starting state \(x_i\) in such a way that \(|f_i(x_i)|=\Vert f_i\Vert _\infty \). Then

$$\begin{aligned} \Vert \pi _i-\nu _i\Vert _{TV}&\ge \frac{1}{2} \frac{1}{\Vert f_i\Vert _\infty } |\pi _i(f_i)-\nu _i(f_i)|=\frac{1}{2} \frac{|f_i(x_i)|}{\Vert f_i\Vert _\infty }e^{-\lambda _i t}\nonumber \\&= \frac{1}{2} e^{-\lambda _i t}, \end{aligned}$$
(4.2)

where we used \(\pi _i(f_i)=0\) because \(f_i\) is orthogonal to the constant functions.

In conclusion, by combining together (4.1) and (4.2), we get

$$\begin{aligned} \Vert \pi -\nu \Vert _{TV}\ge 1-e^{-\frac{1}{8}\sum _i e^{-2\lambda _i t}}. \end{aligned}$$

Therefore, if \(t=t^*\) with

$$\begin{aligned} t^*=\frac{1}{2}\left[ \frac{1}{\max _i \lambda _i}\log n - \frac{1}{\min _i \lambda _i}\log 8\right] , \end{aligned}$$

then \(\Vert \pi -\nu \Vert _{TV}\ge 1-e^{-1 }\). Thus the mixing time of the product chain \(X\) is larger than \(t^*\).
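To see why this choice of \(t^*\) works, note that \(t^*\le (\log n-\log 8)/(2\max _i\lambda _i)\), so that \(\sum _i e^{-2\lambda _i t^*}\ge 8\) whenever \(t^*\ge 0\). The sketch below evaluates the right-hand side of the previous display at \(t=t^*\) for two hypothetical families of spectral gaps; the numerical values are placeholders and are not related to the OFA-kf model.

```python
import math


def t_star(gaps):
    """t* = (1/2) [ log(n) / max_i lambda_i - log(8) / min_i lambda_i ]."""
    n = len(gaps)
    return 0.5 * (math.log(n) / max(gaps) - math.log(8.0) / min(gaps))


def tv_lower_bound(gaps, t):
    """Right-hand side 1 - exp(-(1/8) sum_i exp(-2 lambda_i t))."""
    s = sum(math.exp(-2.0 * lam * t) for lam in gaps)
    return 1.0 - math.exp(-s / 8.0)


equal_gaps = [0.05] * 1024                                   # identical factors
mixed_gaps = [0.05 + 0.001 * (i % 7) for i in range(1024)]   # small spread of gaps
for gaps in (equal_gaps, mixed_gaps):
    t = t_star(gaps)
    print(f"t*={t:.3f}  bound={tv_lower_bound(gaps, t):.6f}  "
          f"1-1/e={1 - math.exp(-1):.6f}")
```

When all gaps coincide the bound equals \(1-e^{-1}\) up to rounding, which is the case relevant for the application to identical sub-trees.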

We now apply the above strategy to prove a lower bound on \(T_1(\mathcal T)\).

Let \({\mathbb T} ^{(i)}\) be the \(i\)th (according to some arbitrary order) \(k\)-ary sub-tree of depth \(\lceil L/2\rceil \) rooted at the \(\lfloor L/2\rfloor \)-level of \({\mathbb T} \) and consider the OFA-kf model on \(\cup _i {\mathbb T} ^{(i)}\). Clearly such a chain \(X\) is a product chain, \(X=\otimes _i X_i\), where each individual chain is the OFA-kf model on \({\mathbb T} ^{(i)}\). The key observation now is that, due to the oriented character of the constraints, the projection on \(\cup _i {\mathbb T} ^{(i)}\) of the OFA-kf model on \({\mathbb T} \) coincides with the chain \(X\). Hence \(T_1(\mathcal T)\ge t_\mathrm{mix}\) if \(t_\mathrm{mix}\) denotes the mixing time of the product chain \(X\). According to the previous discussion and with \(n=k^{\lfloor L/2\rfloor }\) the number of sub-trees \({\mathbb T} ^{(i)}\) we get

$$\begin{aligned} T_1(\mathcal T)\ge t_\mathrm{mix}&\ge \frac{1}{2} \left( \log n - \log 8\right) \mathop {\mathrm{gap}}\nolimits (\mathcal L_{{\mathbb T} '})^{-1}= \frac{1}{2}\left( \log n - \log 8\right) T_\mathrm{rel}({\mathbb T} ')\\&\ge \frac{1}{c} L\, T_\mathrm{rel}({\mathbb T} ') \end{aligned}$$

for some constant \(c>0\) where we used translation invariance to conclude that the spectral gap \(\lambda _i\) of the chain \(X_i\) coincides with \(\mathop {\mathrm{gap}}\nolimits (\mathcal L_{{\mathbb T} '})\) for any \(i\), \({\mathbb T} '\) denoting a \(k\)-ary rooted tree of depth \(\lceil L/2\rceil \).

5 Concluding remarks and open problems

(i) It is a very interesting problem to determine exactly the critical exponents for the critical and quasi-critical case and in particular to verify whether the lower bounds in Theorems 1 and 3 give the correct growth of the corresponding time scales as a function of the depth of the tree.

(ii) A key ingredient of our analysis is the fact that the percolation transition on \({\mathbb T} ^k\) is continuous, i.e. with probability one there is no infinite cluster of occupied sites at \(p=p_c\) and the probability that the cluster of the root touches more than \(n\) levels decays polynomially in \(1/n\). A very challenging open problem is the extension of the approach described in this work to models with a discontinuous (or first-order) phase transition for the corresponding bootstrap percolation problem.

The first instance of the above general question occurs for a kinetically constrained model on the ternary tree \({\mathbb T} ^3\), with the local constraint \(c_x\) requiring at least \(j=2\) of the \(k=3\) children of \(x\) to be vacant. The natural obstacle for the dynamics—previously an infinite ray of \(1\)’s in the \(j=3\) case corresponding to standard percolation—now becomes a binary regular subtree of \({\mathbb T} ^3\) where the configuration is identically equal to one. By the results of Pakes and Dekking [7] (see also [2]), generalizing the work of Chayes et al. [3] for the binary tree, it is known that the associated bootstrap percolation model undergoes a first order phase transition at \(p_c=8/9\), unlike the second order phase transition of percolation. This may therefore suggest that, when \(p\) crosses the critical point \(p_c=8/9\), the relaxation time jumps from \(O(1)\) (cf [19]) to exponential (as e.g. in the Ising model in the setting of a first order phase transition). However, rather natural test functions, like e.g. the indicator of the event that the root is occupied after \(L\) iterations of the bootstrap map, do not support the above scenario and, for \(p=p_c\), still give a bound \(\Omega (L^2)\) on the relaxation time, exactly as in the percolation \(j=3\) case. Numerical simulations for a similar unoriented model [22] also suggest a poly\((L)\) relaxation time. Thus it may be possible that the nature of the phase transition at \(p_c=8/9\) is more subtle than what appears and finding the correct behavior of the relaxation time at criticality remains a very interesting and challenging problem.
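The value \(p_c=8/9\) and the discontinuity of the transition can be illustrated with a standard branching recursion, which we sketch here as an aside (it is not taken from the references above). Let \(q_n\) be the probability, under \(\mu \), that a vertex is the root of a fully occupied binary subtree of \({\mathbb T} ^3\) reaching \(n\) levels below it. Then \(q_0=p\) and \(q_{n+1}=p\,(3q_n^2-2q_n^3)\), since at least two of the three children must have the same property. The function name and the sample densities below are hypothetical.

```python
def blocked_probability(p, iterations=5000):
    """Iterate q -> p * P(Bin(3, q) >= 2) = p * (3 q^2 - 2 q^3), starting from q = 1.

    Starting from q = 1 makes the first step return q_0 = p.
    """
    q = 1.0
    for _ in range(iterations):
        q = p * (3.0 * q**2 - 2.0 * q**3)
    return q


for p in (0.85, 0.88, 8 / 9 + 1e-3, 0.90, 0.95):
    print(f"p={p:.4f}  limit={blocked_probability(p):.6f}")
```

A non-zero fixed point requires \(1=p\,q(3-2q)\), and \(q(3-2q)\) attains its maximum \(9/8\) at \(q=3/4\). Hence the limit vanishes for \(p<8/9\) and equals \((3+\sqrt{9-8/p})/4\ge 3/4\) for \(p\ge 8/9\), so it jumps from \(0\) to \(3/4\) at \(p_c=8/9\).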

(iii) Using the comparison methods of [19, Sect. 4], our results on the relaxation time can be transferred with minimal changes to the un-oriented case defined as follows. On the un-rooted regular tree with degree \(k+1\), consider the kinetically constrained model in which a vertex can be updated iff at least \(k\) of its \(k+1\) neighbors are vacant. The critical value \(p_c\) at which ergodicity breaks down coincides with the critical value of the oriented model on \({\mathbb T} ^k\) considered so far (cf. [19, Theorem 1]). Moreover the relaxation time (on the infinite tree or on finite subtrees with suitable boundary conditions) can be bounded from above and below by the corresponding relaxation time in the oriented case. The comparison between the mixing times is more indirect since one has first to compare the logarithmic Sobolev constants of the two models and then use the general bounds relating the latter to the mixing times (see e.g. [21]).