1 Introduction

We consider stationary stochastic processes on \({\mathbb {Z}}\),

$$\begin{aligned} \dots ,Z_{-2},Z_{-1},Z_0,Z_1,Z_2,\dots \end{aligned}$$

where each \(Z_t\), \(t\in {\mathbb {Z}}\), is a symbol taking values in a finite alphabet \({\mathsf {A}}\). The processes we consider are called chains with complete connections (Doeblin and Fortet [3]), due to a dependence on the past of the following form. Assume some measurable map \(g:{\mathsf {A}}\times {\mathsf {A}}^{{\mathbb {N}}}\rightarrow [0,1]\) is given a priori, called a \(g\)-function, and that for all \(t\in {\mathbb {Z}}\) and all \(z_{t}\in {\mathsf {A}}\),

$$\begin{aligned} P(Z_{t}=z_{t}|Z_{t-1}=z_{t-1},Z_{t-2}=z_{t-2},\dots ) =g(z_{t}|z_{t-1},z_{t-2},\dots )\quad \text {a.s.} \end{aligned}$$
(1)

A process \(Z=(Z_t)_{t\in {\mathbb {Z}}}\) satisfying (1) is said to be specified by \(g\). The role played by \(g\) for \(Z\) is therefore analogous to that of a transition kernel for a discrete-time Markov process, except that it allows dependence on the whole past of the process.

We will always assume that \(g\) is regular, which means that it satisfies the following two conditions.

  (1)

    It is uniformly bounded away from 0 and 1: there exists \(\eta >0\) such that \(\eta \le g(z_0|z)\le 1-\eta \) for all \(z_0\in {\mathsf {A}}\), \(z\in {\mathsf {A}}^{{\mathbb {N}}}\).

  (2)

    Define the variation of \(g\) of order \(j\) by

    $$\begin{aligned} {\mathrm {var}}_j(g):=\sup |g(z_0|z)-g(z_0|z')|, \end{aligned}$$

    where the \(\sup \) is over all \(z_0\in {\mathsf {A}}\), and over all \(z,z'\in {\mathsf {A}}^{\mathbb {N}}\) for which \(z_{i}=z'_{i}\) for all \( 1\le i \le j\). Then \(g\) is continuous in the sense that \({\mathrm {var}}_j(g)\rightarrow 0\) when \(j\rightarrow \infty \).
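
For concreteness, here is a minimal numerical sketch of a regular \(g\)-function (a toy example of ours, not taken from [3] or [9]): on the alphabet \(\{+1,-1\}\), take \(g(+1|z)=\tfrac{1}{2}+\tfrac{1}{4}\sum _{i\ge 1}2^{-i}z_i\). It satisfies condition (1) with \(\eta =\tfrac{1}{4}\), and since it is monotone, the supremum in \({\mathrm {var}}_j(g)\) is attained at the all-\((+1)\) and all-\((-1)\) tails, giving \({\mathrm {var}}_j(g)\approx 2^{-(j+1)}\): summable, hence in the Doeblin–Fortet uniqueness regime.

```python
def g_plus(z, depth=20):
    """Toy regular g-function on A = {+1,-1}: probability that the next
    symbol is +1 given the past z = (z_1, z_2, ...), truncated at `depth`.
    Bounded between 1/4 and 3/4, so condition (1) holds with eta = 1/4."""
    return 0.5 + 0.25 * sum(z[i] * 2.0 ** -(i + 1) for i in range(min(depth, len(z))))

def var_j(j, depth=20):
    """var_j(g): for this monotone g, the supremum over pasts agreeing on
    the first j symbols is attained by all-(+1) vs all-(-1) tails."""
    plus = tuple([1] * depth)
    minus = tuple([1] * j + [-1] * (depth - j))
    return g_plus(plus) - g_plus(minus)

for j in range(6):
    print(j, var_j(j))   # ~ 2^-(j+1): 0.5, 0.25, 0.125, ...
```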

When \(g\) is regular, the existence of at least one stationary process specified by \(g\) follows by a standard compactness argument (see also the explicit construction given below). Once existence is guaranteed, uniqueness can be shown under additional assumptions on the speed at which \({\mathrm {var}}_j(g)\rightarrow 0\). For instance, Doeblin and Fortet [3] showed that if

$$\begin{aligned} \sum _j{\mathrm {var}}_j(g)<\infty , \end{aligned}$$

then there exists a unique process specified by \(g\). More recently, Johansson and Öberg [9] strengthened this result, showing that uniqueness holds as soon as

$$\begin{aligned} \sum _j{\mathrm {var}}_j(g)^2<\infty . \end{aligned}$$
(2)

An interesting and natural question is to determine whether a given regular \(g\)-function can lead to a phase transition, that is, whether it specifies at least two distinct processes.

In a pioneering paper, Bramson and Kalikow [2] gave the first example of a regular \(g\)-function exhibiting a phase transition. More recently, Berger et al. [1], in a remarkable paper, introduced a model whose \(g\)-function also exhibits a phase transition, but whose variation decays at a rate that can be brought arbitrarily close to the \(\ell ^2\)-summability threshold of the Johansson–Öberg criterion (see Remark 2 below).

The \(g\)-functions constructed in [2] and [1] have common features. The main one is that they both rely on some majority rule used to fix the influence of the past on the probability distribution of the present. That is, the distribution of \(Z_{t+1}\) given \((Z_s)_{s\le t}\) is determined by the sign (and not the actual value) of the average of a subset of the variables \((Z_s)_{s\le t}\) over a large finite region. This feature is essential in the mechanisms that lead to non-uniqueness, since it allows, roughly speaking, small local fluctuations to have dramatic effects in the remote future, thus favoring the transmission of information from \(-\infty \) to \(+\infty \).

For the Bramson–Kalikow model, it had already been observed in [5] that arbitrarily small changes in the behavior of the majority rule, turning it smooth at the origin, can have important consequences for the uniqueness or non-uniqueness of the process.

In this paper, we take a closer look at a class of models based on the one of Berger, Hoffman, and Sidoravicius (hereafter simply called the BHS-model). Beyond giving a more detailed description of the original model of [1], our results show that any smoothing of the majority rule (see Fig. 1) leads, under general assumptions, to uniqueness, even for very slowly decaying variations.

Fig. 1 On the left, the pure majority rule used in [1], for which non-uniqueness holds when \(0<\alpha <\frac{1-\epsilon _*}{2}\). On the right, a smoothed version, for which the process is unique for all \(\alpha >0\) and, more generally, for every sequence \(h_k\searrow 0\)

We will present these models from scratch, without assuming any prior knowledge of [1]; since their construction is not trivial and deserves some explanation, we will state our results precisely only at the end of Sect. 2.

Before proceeding, we mention other works related to non-uniqueness. Hulse [8] gave examples of non-uniqueness based on the Bramson–Kalikow approach. Fernández and Maillard [4] constructed an example using a long-range spin system of statistical mechanics, although in a non-shift-invariant framework. Gallesco et al. [6] studied a criterion for non-uniqueness which is optimal for the class of binary attractive models.

1.1 Models considered

Although the basic structure of our model is entirely imported from that of BHS, our notation and terminology differ considerably from those of [1].

The process \(Z=(Z_t)_{t\in {\mathbb {Z}}}\) defined in [1] takes values in an alphabet with four symbols, where each symbol is actually a pair, which we denote

$$\begin{aligned} Z_t=(X_t,\omega _t), \end{aligned}$$

with \(X_t\in \{+,-\}\), \(\omega _t\in \{0,1\}\). The process can be considered as constructed in two steps. First, a doubly-infinite sequence of i.i.d. random variables \(\omega =(\omega _t)_{t\in {\mathbb {Z}}}\) is sampled, representing the environment, with distribution \(Q\):

$$\begin{aligned} Q(\omega _t=1)=1-Q(\omega _t=0)=\tfrac{1}{2} . \end{aligned}$$

Then, once the environment \(\omega \) is fixed, a process \(X=(X_t)_{t\in {\mathbb {Z}}}\) is considered, whose conditional distribution given \(\omega \) is denoted \(P_\omega \) and called the quenched distribution. We will assume that \(P_\omega \)-almost surely,

$$\begin{aligned} P_\omega (X_t=\pm \,|\,X_{t-1}=x_{t-1},X_{t-2}=x_{t-2},\dots ) =\tfrac{1}{2} \bigl \{ 1\pm \psi _t^\omega (x_{-\infty }^{t-1}) \bigr \}, \end{aligned}$$
(3)

where \(x_{-\infty }^{t-1}=(x_{t-1},x_{t-2},\dots )\in \{\pm \}^{\mathbb {N}}\). The perturbation \(\psi _t^\omega :\{\pm \}^{\mathbb {N}}\rightarrow [-1,1]\) describes how the variables of the process \(X\) differ from those of an i.i.d. symmetric sequence (which corresponds to \(\psi _t^\omega \equiv 0\)). The quenched model will always be attractive, in the sense that \(\psi _t^\omega (x_{-\infty }^{t-1})\) is non-decreasing in each of the variables \(x_s\), \(s<t\).

We assume that the functions \(\psi _t^\omega \) satisfy the following conditions:

  C1.

    For all \(x\in \{\pm \}^{\mathbb {N}}\), \(\psi _t^\omega (x)\) depends only on the environment variables \(\omega _s\), \(s\le t\).

  C2.

    The functions \(\psi _t^\omega \) are odd, \(\psi _t^\omega (-x)=-\psi _t^\omega (x)\) for all \(x\in \{\pm \}^{\mathbb {N}}\), and bounded uniformly in all their arguments:

    $$\begin{aligned} |\psi _t^\omega (x)|\le \epsilon \quad \text { for some }\epsilon \in (0,1). \end{aligned}$$
  C3.

    The maps \((x,\omega )\mapsto \psi _t^\omega (x)\) are continuous, uniformly in \(t\).

  C4.

    If \(\theta :\{0,1\}^{\mathbb {Z}}\rightarrow \{0,1\}^{\mathbb {Z}}\) denotes the shift, \((\theta \omega )_s:=\omega _{s+1}\), then

$$\begin{aligned} \psi _t^\omega =\psi _0^{\theta ^{t}\omega }. \end{aligned}$$

The probability distribution \({\mathbb {P}}\) of the joint process \(Z_t=(X_t,\omega _t)\) is defined as follows. If \(A\in {\mathcal {F}}:=\sigma (X_t,t\in {\mathbb {Z}})\), \(B\in {\mathcal {G}}:=\sigma (\omega _t,t\in {\mathbb {Z}})\), then

$$\begin{aligned} {\mathbb {P}}(A\times B):=\int _B P_\omega (A)Q(d\omega ). \end{aligned}$$
(4)

We will sometimes denote \({\mathbb {P}}\) by \(Q\otimes P_\omega \). It can then be verified that under \({\mathbb {P}}\), \(Z=(Z_t)_{t\in {\mathbb {Z}}}\) is a chain with complete connections specified by the regular \(g\)-function

$$\begin{aligned} g((\pm ,\omega _t)|(x_{t-1},\omega _{t-1}), (x_{t-2},\omega _{t-2}),\dots )&:=\tfrac{1}{4}\bigl \{1\pm \psi _t^{\omega }(x_{-\infty }^{t-1})\bigr \}. \end{aligned}$$
(5)

Conversely, the distribution of any process \(Z\) specified by this \(g\)-function can be expressed as in (4). Although the processes specified by \(g\) are of a dynamical nature [the pair \((X_t,\omega _t)\) at time \(t\) having a distribution fixed by the entire past], we will rather work with the quenched picture in mind, and think only of the variables \(X_t\) as being dynamical, evolving in a fixed environment \((\omega _t)_{t\in {\mathbb {Z}}}\).
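
The two-step construction (4) is straightforward to sketch numerically. In the code below (ours), `psi_toy` is a hypothetical bounded, odd, attractive perturbation with finite range, standing in for the genuine \(\psi _t^\omega \) of Sect. 2.1; the burn-in crudely emulates a process started in the remote past.

```python
import random

def sample_quenched(T, psi, burn_in=2_000, seed=0):
    """First sample an i.i.d. environment omega with distribution Q, then
    sample X_t sequentially from the quenched kernel (3):
    P(X_t = +1 | past) = (1 + psi(t, omega, x)) / 2."""
    rng = random.Random(seed)
    omega = [rng.randint(0, 1) for _ in range(burn_in + T)]
    x = []
    for t in range(burn_in + T):
        p_plus = 0.5 * (1.0 + psi(t, omega, x))
        x.append(+1 if rng.random() < p_plus else -1)
    return omega[burn_in:], x[burn_in:]

def psi_toy(t, omega, x, eps=0.3, width=5):
    """Hypothetical perturbation (NOT the one of Sect. 2.1): when
    omega_t = 1, lean towards the average of the last `width` spins."""
    if omega[t] == 0 or len(x) < width:
        return 0.0
    return eps * sum(x[-width:]) / width

omega, x = sample_quenched(10_000, psi_toy)
print("empirical mean of X:", sum(x) / len(x))   # ~ 0 for this toy psi
```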

The precise definition of the functions \(\psi _t^\omega \) will be given in Sect. 2.1. Before that we describe, in an informal way, the main ingredients that will appear in their construction.

1.2 Sampling a random set in the past

A natural feature of the model is that the distribution of the quenched process \(X\) at time \(t\) is determined by its values over a finite (albeit large) region in the past of \(t\). Therefore, for a given environment \(\omega \), the starting point will be to associate to each time \(t\in {\mathbb {Z}}\) a random set \(S_t=S_t^\omega \) living in the past of \(t\): \(S_t\subset (-\infty ,t)\). We will say that \(S_t\) targets the time \(t\) (see Fig. 2). Although each \(S_t\) is finite, we will always have, \(Q\)-almost surely,

$$\begin{aligned} \sup _t|S_t|=\infty \,\text { and }\, \sup _t\mathrm{{ dist }}(t,S_t)=\infty . \end{aligned}$$

In the environment \(\omega \), the distribution of \(X_t\) conditioned on its past \((X_s)_{s<t}\) [see (3)] is determined by the values of \(X\) on \(S_t\). More precisely, the distribution of \(X_t\) will depend on the average of \(X\) on the set \(S_t\):

$$\begin{aligned} \psi _t^\omega (x_{-\infty }^{t-1})= \text { odd function of } \Bigg ( \frac{1}{|S_t^\omega |}\sum _{s\in S_t^\omega }x_s \Bigg ). \end{aligned}$$

The precise dependence will be fixed by some majority rule.

Fig. 2 In a given environment \(\omega \), the distribution of \(X_t\), conditioned on its past, is determined by the variables \(X_s\), with \(s\in S_t^\omega \subset (-\infty ,t)\)

Remark 1

In general, \(S_t^\omega \) will not be an interval; as will be seen below, \(S_t^\omega \) is defined by a multiscale description of \(\omega \), and is a union of far-apart intervals. Nevertheless, we will simplify the figures by picturing \(S_t\) as if it were an interval.

The sets \(S_t\) will be constructed in such a way that the event pictured in Fig. 3 occurs with positive \(Q\)-probability. That event represents a global connectivity satisfied by the sets \((S_t)_{t\in {\mathbb {Z}}}\) in relation to the origin: \(0\) is targeted by the set \(S_0\), which we temporarily denote by \(S_0(1)\). In turn, all points \(s\in S_0(1)\) happen to be targeted by the same set, denoted \(S_0(2)\). Then, all points \(s'\in S_0(2)\) are targeted by the same set \(S_0(3)\), etc. In this way, for each \(j\ge 1\) the variables \(\{X_s,s\in S_0(j)\}\), when conditioned on the values of the process on the past of \(S_0(j)\), are independent, with a distribution fixed solely by the magnetization of \(X\) on \(S_0(j+1)\). Thus, the properties of \(X\) in a finite region of \({\mathbb {Z}}\) will be obtained via the values of \(X\) on a sequence of sets \(S_0(j)\), \(j=1,2,\dots \). This sequence will happen to be multiscale in the sense that \(S_0(j+1)\) will be orders of magnitude larger than \(S_0(j)\). Part of the mechanism will be to obtain estimates on the sizes of these sets. (Note: the notation of this paragraph will not be used later. For a precise description of the picture just described, see the definition of the event \(\{\infty \rightarrow k\}\) in Sect. 4.1.)

Fig. 3 An environment in which information is likely to travel from the remote past up to 0

Throughout the paper, \(|A|\) will denote the number of elements of \(A\). For simplicity, intervals of \({\mathbb {Z}}\) will be denoted as \(\{a,a+1,\dots ,b-1,b\}\equiv [a,b]\). The diameter of \([a,b]\) is \({\text {{d}}}([a,b]):=b-a+1\).

2 The BHS model

The construction of the random sets \(S_t\) starts by using the environment \(\omega \) to partition \({\mathbb {Z}}\) into blocks of increasing scales. Below, most objects are random and depend on \(\omega \), although this will not always be indicated in the notations.

We start by fixing two numbers:

$$\begin{aligned} \epsilon _*\in (0,1),\,\text { and }\, k_*\in {\mathbb {N}}. \end{aligned}$$

Later, \(k_*\) (the smallest scale) will be chosen large. For all \(k\ge k_*\), define

$$\begin{aligned} \ell _k:=\lceil (1+\epsilon _*)^k\rceil , \end{aligned}$$

and let \(I_k\) be the word defined as the concatenation of \(\ell _k -1\) symbols “\(1\)” followed by a symbol “\(0\)”:

$$\begin{aligned} I_k=(1,1,\ldots ,1,1,0). \end{aligned}$$
(6)

Let \(\omega \in \{0,1\}^{\mathbb {Z}}\) be an environment and \([a,b]\subset {\mathbb {Z}}\) an interval of diameter \(\ell _k\). We say that \(I_k\) is seen in \(\omega \) on \([a,b]\) if

$$\begin{aligned} (\omega _a, \omega _{a+1}, \ldots , \omega _b)=I_k. \end{aligned}$$

In a given environment, \(I_k\) is seen on infinitely many disjoint intervals (\(Q\)-a.s.). Consider two successive occurrences of \(I_k\) in \(\omega \). That is, suppose \(I_k\) is seen on two disjoint intervals \([a,b]\) and \([a',b']\), but not on any other interval contained in \([a,b']\). Then the interval \([b,b'-1]\) is called a \(k\)-block. By definition, a \(k\)-block has diameter at least \(\ell _k\), the first symbol seen on a \(k\)-block is \(0\), and the last \(\ell _k-1\) symbols are \(1\)s.

For each \(k\), the environment yields a partition of \({\mathbb {Z}}\) into \(k\)-blocks (see Fig. 4): for each \(t\in {\mathbb {Z}}\), there exists a unique \(k\)-block containing \(t\), denoted by \(B^k(t)=[a^k(t),b^k(t)]\), where \(a^k(t)\) [resp. \(b^k(t)\)] is the leftmost (resp. rightmost) point of \(B^k(t)\).

Fig. 4 A partition of \({\mathbb {Z}}\) into \(k\)-blocks, using successive occurrences of \(I_k\) in \(\omega \)
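
The block construction is easy to carry out numerically. The following sketch (ours, with a single toy scale \(\ell \), since the realistic \(\beta _k=2^{\ell _k}\) is astronomically large) scans an i.i.d. environment for occurrences of \(I_k\) and cuts it into \(k\)-blocks; the observed mean diameter is indeed of order \(\beta _k\) (see Lemma 1 below).

```python
import random

def k_blocks(omega, ell):
    """Cut omega into k-blocks.  An occurrence of I_k = (1,...,1,0) ends at
    i when omega[i-ell+1:i] are all 1 and omega[i] == 0; a block runs from
    the last symbol of one occurrence up to, but excluding, the next one."""
    ends = [i for i in range(ell - 1, len(omega))
            if omega[i] == 0 and all(omega[i - ell + 1:i])]
    return [(ends[j], ends[j + 1] - 1) for j in range(len(ends) - 1)]

rng = random.Random(1)
omega = [rng.randint(0, 1) for _ in range(300_000)]
ell = 12                   # toy scale; in the paper ell_k = ceil((1+eps_*)^k)
blocks = k_blocks(omega, ell)
diams = [b - a + 1 for a, b in blocks]
print(len(blocks), "blocks, mean diameter:", sum(diams) / len(diams),
      " beta_k =", 2 ** ell)
```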

The diameter of a typical \(k\)-block is of order (see Lemma 1)

$$\begin{aligned} \beta _k:=2^{\ell _k}. \end{aligned}$$

In a fixed environment, the partition of \({\mathbb {Z}}\) into \(k\)-blocks is coarser than the partition into \((k-1)\)-blocks: when \(k>k_*\), each \(k\)-block \(B\) is a disjoint union of one or more \((k-1)\)-blocks. If we denote the number of \((k-1)\)-blocks in \(B\) by \(N(B)\), then

$$\begin{aligned} B=b_1\cup b_2\cup \cdots \cup b_{N(B)}\equiv \bigcup ^{N(B)}_{i=1}b_i, \end{aligned}$$
(7)

where \(b_1\) [resp. \(b_{N(B)}\)] is the leftmost (resp. rightmost) \((k-1)\)-block contained in \(B\). We will verify in Lemma 2 that \(N(B)\) is of order

$$\begin{aligned} \nu _k:=\frac{\beta _k}{\beta _{k-1}}. \end{aligned}$$

When \(k>k_*\), the beginning of a \(k\)-block \(B\), decomposed as in (7), is defined as

$$\begin{aligned} {{{\mathcal {C}}}}(B):=\bigcup ^{N(B)\wedge \lfloor \nu _k^{1-\epsilon _*}\rfloor }_{i=1}b_i. \end{aligned}$$
(8)

Due to the exponent “\(1-\epsilon _*\)” in (8), the beginning of a \(k\)-block, when \(k>k_*\) is large, is typically smaller than the block itself (see Lemma 3).


For a \(k_*\)-block \(B=[a,b]\), the beginning is defined in a different manner:

$$\begin{aligned} {{{\mathcal {C}}}}(B):=\bigl \{s\in B:|s-a|\le \beta _{k_*}^{1+\epsilon _*}\bigr \}. \end{aligned}$$

Since the typical size of a \(k_*\)-block \(B\) is \(\beta _{k_*}\), we will verify later that \(B={{{\mathcal {C}}}}(B)\) with high \(Q\)-probability.

2.1 The definition of \(S_t^\omega \) and \(\psi _t^\omega \)

In order to help understand the precise definition of \(S_t\) given below, we first give a definition which is natural but not yet sufficient for our needs.

Fix \(t\in {\mathbb {Z}}\), and consider the first scale for which \(t\) is not in the beginning of its block: \(k_t:=\inf \{k\ge k_*:t\not \in {{{\mathcal {C}}}}(B^k(t))\}.\) Then, a natural way of defining \(S_t\) could be \(S_t:={{{\mathcal {C}}}}(B^{k_t}(t)).\) Unfortunately, this definition does not guarantee that an event like the one described in Fig. 3 occurs with positive probability. Namely, two distinct points \(t',t''\in S_t\) can very well be targeted by different sets \(S_{t'}\ne S_{t''}\). The definitions of \(S_t\) and \(k_t\) thus need to be modified in a somewhat subtle way.

Definition 1

Let \(k\ge k_*\). We say that \(t\in {\mathbb {Z}}\) is \(k\)-active in the environment \(\omega \) if for all \(j\in \{k_*,\ldots , k\}\),

  (1)

    \(t \in {{{\mathcal {C}}}}(B^j(t))\), where \(B^j(t)=[a^j(t),b^j(t)]\) is the \(j\)-block containing \(t\), and if

  (2)

    \(|t-a^j(t)|<\beta _{j+1}.\)

Let also \({{{\mathcal {A}}}}_{k}:=\{t\in {\mathbb {Z}}:t\text { is }k\text {-active}\}\).

Observe that

$$\begin{aligned} {{{\mathcal {A}}}}_{k_*}\supset {{{\mathcal {A}}}}_{k_*+1}\supset \dots \supset {{{\mathcal {A}}}}_{k}\supset {{{\mathcal {A}}}}_{k+1}\supset \dots \end{aligned}$$

We will see after Lemma 3 that \({{{\mathcal {A}}}}_{k}\searrow \varnothing \) as \(k\rightarrow \infty \), \(Q\)-almost surely. Therefore, it is natural to define, for \(t\in {\mathbb {Z}}\),

$$\begin{aligned} k_t=k_t^\omega :=\inf \{k\ge k_*; t\notin {{{\mathcal {A}}}}_{k}\}, \end{aligned}$$
(9)

with the convention: \(\inf \varnothing =\infty \). The set of \(k\)-active points inside a \(k\)-block \(B\) is

$$\begin{aligned} {{{\mathcal {A}}}}(B):={{{\mathcal {A}}}}_k\cap B. \end{aligned}$$

By definition, \({{{\mathcal {A}}}}(B)\subset {{{\mathcal {C}}}}(B)\). Then, let

$$\begin{aligned} S_t =S_t^\omega :={\left\{ \begin{array}{ll} {{{\mathcal {A}}}}(B^{k_t}(t)) &{} \text {if } k_*<k_t<\infty ,\\ \varnothing &{}\text {otherwise}. \end{array}\right. } \end{aligned}$$

Unlike in [1], we do not require \(S_t\) to contain an odd number of points. By definition, \(S_t\subset (-\infty ,t)\), and the following two crucial properties hold:

  P1.

    If \(t',t''\in S_t\), then \(S_{t'}=S_{t''}\).

  P2.

    If \(\omega , \omega '\) are such that \(\omega _s=\omega '_s\) for all \(s\in (-\infty , t]\), then \(k_t^\omega =k_t^{\omega '}\) and \(S_t^\omega =S_t^{\omega '}\).
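
To make Definition 1 and the definition of \(S_t\) concrete, here is a self-contained sketch (ours; the three tiny scales and all parameters are purely illustrative, since the true \(\beta _k=2^{\ell _k}\) cannot be simulated literally). It computes, for one sampled environment, the blocks at each scale, the beginnings, the active points, and finally \(S_t\).

```python
import bisect, math, random
from functools import lru_cache

# Purely illustrative toy scales (think of k_*, k_*+1, k_*+2); in the paper
# ell_k = ceil((1+eps_*)^k) and beta_k = 2**ell_k.
ELL = [4, 6, 9]
BETA = [2 ** l for l in ELL] + [float("inf")]  # beta_{j+1} capped at the top
EPS = 0.5

def k_blocks(omega, ell):
    """Blocks delimited by successive occurrences of I = 1^(ell-1) 0."""
    ends = [i for i in range(ell - 1, len(omega))
            if omega[i] == 0 and all(omega[i - ell + 1:i])]
    return [(ends[j], ends[j + 1] - 1) for j in range(len(ends) - 1)]

rng = random.Random(7)
OMEGA = [rng.randint(0, 1) for _ in range(80_000)]
BLOCKS = [k_blocks(OMEGA, l) for l in ELL]
STARTS = [[B[0] for B in bs] for bs in BLOCKS]

def block_of(t, j):
    """B^j(t): the unique scale-j block containing t."""
    return BLOCKS[j][bisect.bisect_right(STARTS[j], t) - 1]

@lru_cache(maxsize=None)
def beginning(B, j):
    """C(B): an initial stretch of length beta^(1+eps) at the lowest scale,
    the first floor(nu^(1-eps)) sub-blocks at the higher scales, as in (8)."""
    a, b = B
    if j == 0:
        return frozenset(range(a, min(b, a + int(BETA[0] ** (1 + EPS))) + 1))
    keep = math.floor((BETA[j] / BETA[j - 1]) ** (1 - EPS))
    subs = [s for s in BLOCKS[j - 1] if a <= s[0] and s[1] <= b][:keep]
    return frozenset(p for u, v in subs for p in range(u, v + 1))

def is_active(t, k):
    """Definition 1: at every scale j <= k, t lies in the beginning of its
    j-block, and not further than beta_{j+1} from the block's left end."""
    return all(t in beginning(block_of(t, j), j)
               and t - block_of(t, j)[0] < BETA[j + 1] for j in range(k + 1))

def S(t):
    """S_t: the k_t-active points of B^{k_t}(t); empty when k_t equals the
    lowest scale or when t stays active at all (simulated) scales."""
    k_t = next((k for k in range(len(ELL)) if not is_active(t, k)), None)
    if k_t in (None, 0):
        return set()
    a, b = block_of(t, k_t)
    return {s for s in range(a, b + 1) if is_active(s, k_t)}

sample = range(30_000, 30_200)
print("times t with S_t nonempty:", sum(bool(S(t)) for t in sample), "/ 200")
```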

We can now define \(\psi _t^\omega \).

Definition 2

Let \(\varphi :[-1,1]\rightarrow [-1,1]\) be non-decreasing and odd, \(\varphi (-z)=-\varphi (z)\), and \(h_{k}> 0\) be a decreasing sequence such that \(h_{k}\searrow 0\) when \(k\rightarrow \infty \). If \(S_t\ne \varnothing \) and \(|t-a^{k_t}(t)|< \beta _{k_t+ 1}\), define

$$\begin{aligned} \psi _t^\omega (x):=h_{k_t}\varphi \Bigg (\displaystyle \frac{1}{|S_t|}\sum _{s\in S_t}x_s\Bigg ). \end{aligned}$$
(10)

Otherwise,

$$\begin{aligned} \psi _t^\omega (x):=0. \end{aligned}$$
(11)

We check that \(\psi _t^\omega \) satisfies the properties C1–C4 described earlier. If \(h_{k_*}\) is small enough, say \(h_{k_*}\le \tfrac{1}{2}\), then \(\psi _t^\omega \) satisfies C\(2\). C\(3\) is guaranteed by the fact that \(h_{k}\searrow 0\) and that a cutoff was introduced so that \(\psi _t^\omega =0\) if \(|t-a^{k_t}(t)|\ge \beta _{k_t+1}\). Then, C\(4\) is clearly satisfied, and C\(1\) is a consequence of P\(2\).

We will now present some results concerning the processes \(Z\) specified by the \(g\)-function defined in (5), with \(\psi _t^\omega \) defined above. Our interest will be in observing the role played by the behavior of \(\varphi \) at the origin.

2.2 A sharper result for the pure majority rule

In [1], the function \(\varphi \) used is a pure majority rule (see Fig. 1). That is,

$$\begin{aligned} \varphi _{PMR}(z):={\left\{ \begin{array}{ll} +1&{}\text {if } z \in (0,+1],\\ 0&{} \text {if } z= 0,\\ -1&{} \text {if } z\in [-1,0).\\ \end{array}\right. } \end{aligned}$$
(12)

The behavior of the model then depends crucially on the choice of the sequence \(h_k\). As will be seen, the criterion is roughly the following:

$$\begin{aligned} \sum _{k}e^{-h_{k+1}^2\beta _k^{1-\epsilon _*}} {\left\{ \begin{array}{ll} <\!\infty &{} \Rightarrow \text {non-uniqueness},\\ =\!\infty &{} \Rightarrow \text {uniqueness}.\\ \end{array}\right. } \end{aligned}$$
(13)

The sequence \(h_{k}\) considered in [1] was therefore of the form

$$\begin{aligned} h_{k}:=\frac{1}{\beta _{k-1}^{\alpha }},\quad \alpha >0. \end{aligned}$$
(14)
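
The threshold can be read off from (13)–(14): since \(h_{k+1}=\beta _k^{-\alpha }\), the exponent in (13) is \(h_{k+1}^2\beta _k^{1-\epsilon _*}=\beta _k^{1-\epsilon _*-2\alpha }\), which diverges (making the series converge) precisely when \(\alpha <\frac{1-\epsilon _*}{2}\). A tiny numerical sketch (ours):

```python
import math

def summand(k, alpha, eps=0.1):
    """k-th term of the series (13) for the choice (14): h_{k+1} =
    beta_k^(-alpha), so the exponent is beta_k^(1 - eps - 2*alpha)."""
    beta_k = 2.0 ** math.ceil((1 + eps) ** k)
    return math.exp(-beta_k ** (1 - eps - 2 * alpha))

for alpha in (0.2, 0.7):          # threshold (1 - eps_*)/2 = 0.45 here
    print(alpha, ["%.1e" % summand(k, alpha) for k in range(12, 24, 2)])
# alpha = 0.2: terms rush to 0 (convergent sum, non-uniqueness regime);
# alpha = 0.7: terms approach 1 (divergent sum, uniqueness regime).
```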

With this particular choice, our first result completes the description given in [1]:

Theorem 1

Consider the \(g\)-function (5), with \(\psi _t^\omega \) of the form (10), with \(\varphi \) discontinuous at the origin like \(\varphi _{PMR}\), and \(h_{k}\) defined as in (14).

  (1)

    If \(\alpha < \frac{1-\epsilon _*}{2}\), then there exist two distinct stationary processes \(Z^+\ne Z^-\) specified by \(g\).

  (2)

    If \(\alpha >\frac{1-\epsilon _*}{2}\), then there exists a unique stationary process specified by \(g\).

Item (1) was the main result of [1], but the uniqueness regime was not studied there. We will actually see (in the proof of Lemma 6) that when \(\alpha >\frac{1-\epsilon _*}{2}\), uniqueness holds regardless of whether \(\varphi \) is continuous at the origin. The methods presented below do not allow us to treat the critical case \(\alpha =\frac{1-\epsilon _*}{2}\).

Remark 2

It can be shown (see [1]) that with \(h_{k}\) as in (14), the variation of \(g\) satisfies \({\mathrm {var}}_j(g)\le c\cdot j^{-\alpha /(1+\epsilon _*)^3}\). Therefore, the Johansson–Öberg criterion (2) guarantees uniqueness when \(\alpha >(1+\epsilon _*)^3/2\). Our result extends uniqueness also to values \(\alpha \in \big (\frac{1-\epsilon _*}{2},\frac{(1+\epsilon _*)^3}{2}\big ]\).

2.3 Uniqueness for continuous majority rules

The following two results show, roughly, that any attempt to turn \(\varphi \) continuous at the origin leads to uniqueness.

Theorem 2

Consider the \(g\)-function (5), with \(\psi _t^\omega \) of the form (10), and assume there exists \(\gamma >0\) such that

$$\begin{aligned} \limsup _{z\rightarrow 0^+}\frac{\varphi (z)}{z^\gamma }<\infty . \end{aligned}$$
(15)

If \(h_{k}\) is as in (14), then for all \(\alpha >0\) the stationary process specified by \(g\) is unique.

Condition (15) is of course satisfied when \(\varphi \) is differentiable at \(0\): \(\varphi '(0)<\infty \). Some examples of non-uniqueness with \(\varphi '(0)=\infty \) for the Bramson–Kalikow model were given in [5]. We will see in Remark 6 that there exist majority rules \(\varphi \) continuous at the origin which do not satisfy (15), and for which non-uniqueness holds.

Under a stronger condition on \(\varphi \), we can show that uniqueness holds for every sequence \(h_{k}\searrow 0\).

Theorem 3

Consider the \(g\)-function (5), with \(\psi _t^\omega \) of the form (10). Assume that \(\varphi \) is Lipschitz in a neighborhood of the origin: there exist \(\delta >0\) and \(\lambda >0\) such that

$$\begin{aligned} |\varphi (z_2)-\varphi (z_1)|\le \lambda |z_2-z_1|,\quad \forall z_1,z_2\in [-\delta ,\delta ]. \end{aligned}$$

Then for any sequence \(h_{k}\searrow 0\), the stationary process specified by \(g\) is unique.

An example leading to uniqueness for every sequence \(h_{k}\searrow 0\) is when \(\varphi =\varphi _{\mathrm {lin}}\) is linear at the origin: there exist \(0<\lambda <\infty \) and \(\delta >0\) such that \(\varphi _{\mathrm {lin}}(z)=\lambda z\) for all \(z\in [-\delta ,\delta ]\). (This particular example will actually play an important role in the proof.) Other natural candidates, such as \(\varphi (z):=\tanh (\beta z)\), also lead to uniqueness, even for large \(\beta >0\).
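
For reference, here are simple parametrizations (our own illustrative choices, not those of [1]) of the rules discussed so far. Near the origin, the smooth rules give a push of size \(o(1)\), whereas \(\varphi _{PMR}\) pushes at full strength:

```python
import math

def phi_pmr(z):
    """The pure majority rule (12)."""
    return (z > 0) - (z < 0)

def phi_power(z, gamma=2.0):
    """A continuous rule with the power-law behavior (15) near 0."""
    return math.copysign(abs(z) ** gamma, z)

def phi_lin(z, lam=4.0):
    """phi_lin: slope lam on [-1/lam, 1/lam], clipped at +-1 outside."""
    return max(-1.0, min(1.0, lam * z))

def phi_tanh(z, beta=10.0):
    """tanh(beta z): Lipschitz near 0, hence uniqueness by Theorem 3."""
    return math.tanh(beta * z)

for z in (0.001, 0.01, 0.1):
    print(z, phi_pmr(z), phi_power(z), phi_lin(z), round(phi_tanh(z), 4))
```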


We emphasize that our uniqueness results cannot be derived from the classical criteria found in the literature [such as (2)]. The reason is that most criteria are insensitive to the behavior of \(\varphi \) at the origin. In particular, our results allow one to build \(g\)-functions with arbitrarily slowly decaying variation that specify a unique stationary process.

The paper is organized as follows. We will first give a detailed description of the environment in Sect. 3, which closely follows the spirit of [1]. Nevertheless, our presentation includes a few differences; in particular, our Definition 3 differs from the one in [1]. For that reason, and for the sake of completeness, we will give full proofs. After that, we will describe when an environment should be considered good, and at the beginning of Sect. 4 introduce an event \(\{\infty \rightarrow k\}\), which will be used constantly in the sequel. We then prove Theorems 1 and 2, using two propositions that are proved later in Sect. 4.6. Theorem 3 is proved in Sect. 5.

3 The environment and properties of blocks

In this section, we study typical properties of a \(k\)-block: its diameter, its beginning, and finally we estimate the number of \(k\)-active points it contains.

Given a \(k\)-block \(B=[a,b]\) in some environment \(\omega \), we define \(\Pi (B)\), the word of \(B\), as the sequence of symbols of \(\omega \) seen in \(B\):

$$\begin{aligned} \Pi (B):=(\omega _{a},\omega _{a+1},\ldots ,\omega _{b}). \end{aligned}$$

Remember (check Fig. 4) that \(\omega _{a}=0\), and that \(\omega _{b-\ell _k+2}=\dots =\omega _{b}=1\).

To study properties of blocks that only depend on their word, and since the environment is stationary with respect to \(Q\), it will be enough to consider the blocks containing the origin, \(B^k(0)\), \(k\ge k_*\). The study of \(\Pi (B^k(0))\) will be simplified by first studying the block \(B^k(0)\) when its first point is fixed at the origin.

Let therefore \(\eta =(\eta _i)_{i\ge 1}\) be an i.i.d. sequence, such that \(P(\eta _i=0)=1-P(\eta _i=1)=\tfrac{1}{2}\). For each \(k\ge 1\), we consider the time of first occurrence of \(I_k\) [remember (6)] in \(\eta \), defined by

$$\begin{aligned} T^k:=\inf \bigl \{j\ge \ell _k:(\eta _{j-\ell _k+1},\ldots ,\eta _j)=I_k\bigr \}. \end{aligned}$$

Defining \(\eta _0:=0\), the random word

$$\begin{aligned} \Pi ^k:=(\eta _0,\eta _1,\ldots , \eta _{T^k-1}) \end{aligned}$$

has the same distribution as the word of a \(k\)-block. We call \(\Pi ^k\) a \(k\)-word, and denote the set of all \(k\)-words by \({\mathcal {W}}^k\). Many notions introduced for \(k\)-blocks extend naturally to the \(k\)-word \(\Pi ^k\). For instance, the diameter of \(\Pi ^k\), that is the number of symbols it contains, is

$$\begin{aligned} {\text {{d}}}(\Pi ^k)=T^k. \end{aligned}$$
(16)

We will study a few elementary properties of the \(k\)-word \(\Pi ^k\), and then extend them to \(B^k(0)\).

3.1 The diameter

A classical martingale argument (see for instance “the monkey typing Shakespeare” in [11]) allows one to compute the expected length of a \(k\)-word; since the pattern \(I_k\) has no non-trivial self-overlap (a proper prefix consists of \(1\)s only, while every proper suffix ends with a \(0\)), it gives

$$\begin{aligned} E[T^k]=2^{\ell _k}=\beta _k. \end{aligned}$$

Therefore,

$$\begin{aligned} \beta _k \le E[{\text {{d}}}(\Pi ^k)]\le 2\beta _{k}. \end{aligned}$$
(17)

We leave it as an exercise to verify that the distribution of \(\frac{T^k}{E[T^k]}\) has an exponential tail:

Lemma 1

For all \(j\ge 1\)

$$\begin{aligned} P\bigl (T^k\ge j \beta _k\bigr )\le e^{-j}. \end{aligned}$$
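
A quick Monte Carlo check (our code, toy scale): the empirical mean of \(T^k\) is close to \(\beta _k\), and the empirical tail \(P(T^k\ge j\beta _k)\) is indeed of order \(e^{-j}\).

```python
import math, random

def sample_T(ell, rng):
    """Waiting time T^k: first time an occurrence of I_k = 1^(ell-1) 0 ends
    in a fresh i.i.d. fair 0/1 sequence."""
    run, t = 0, 0            # run = number of consecutive trailing 1s
    while True:
        t += 1
        if rng.randint(0, 1) == 1:
            run += 1
        else:
            if run >= ell - 1:
                return t     # this 0 completes I_k
            run = 0

rng = random.Random(0)
ell, beta = 8, 2 ** 8
data = [sample_T(ell, rng) for _ in range(20_000)]
print("E[T]/beta ~", sum(data) / len(data) / beta)
for j in (1, 2, 3):
    tail = sum(t >= j * beta for t in data) / len(data)
    print(f"P(T >= {j} beta) ~ {tail:.4f}   (e^-{j} = {math.exp(-j):.4f})")
```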

Corollary 1

For all \(j\ge 1\),

$$\begin{aligned} Q\bigl ( {\text {{d}}}(B^k(0))\ge j\beta _k\bigr )\le 6je^{-j}. \end{aligned}$$
(18)

Proof

Write \(B^k(0)=[a^k(0),b^k(0)]\). For every finite interval \(J\subset {\mathbb {N}}\),

$$\begin{aligned} Q\bigl ({\text {{d}}}(B^k(0))\in J\bigr )&=\sum _{\begin{array}{c} \pi \in {\mathcal {W}}^k\\ {\text {{d}}}(\pi )\in J \end{array}} Q\bigl ({{\Pi }}(B^k(0))=\pi \bigr )\nonumber \\&=\sum _{\begin{array}{c} \pi \in {\mathcal {W}}^k\\ {\text {{d}}}(\pi )\in J \end{array}} \sum _{-{\text {{d}}}(\pi )<a\le 0} Q\bigl ({{\Pi }}(B^k(0))=\pi , a^k(0)=a\bigr ). \end{aligned}$$
(19)

But \(\{{{\Pi }}(B^k(0))=\pi , a^k(0)=a\}\), when not empty, is uniquely determined by the following three conditions:

  (1)

    \(\omega _j=1\) for all \(j\in \{a-\ell _k+1,\dots ,a-1\}\),

  (2)

    \((\omega _a,\omega _{a+1},\dots ,\omega _{a+{\text {{d}}}(\pi )-1})=\pi \),

  (3)

    \(\omega _{a+{\text {{d}}}(\pi )}=0\).

Therefore, by the independence of the variables \(\omega _i\),

$$\begin{aligned} Q\bigl ({{\Pi }}(B^k(0))=\pi , a^k(0)=a\bigr )=\bigl (\tfrac{1}{2}\bigr )^{\ell _k-1}\bigl (\tfrac{1}{2}\bigr )^{{\text {{d}}}(\pi )}\bigl (\tfrac{1}{2}\bigr ), \end{aligned}$$
(20)

which implies

$$\begin{aligned} Q\bigl ({\text {{d}}}(B^k(0))\in J\bigr )&=\bigl (\tfrac{1}{2}\bigr )^{\ell _k}\sum _{\begin{array}{c} \pi \in {\mathcal {W}}^k\\ {\text {{d}}}(\pi )\in J \end{array}}{\text {{d}}}(\pi )\bigl (\tfrac{1}{2}\bigr )^{{\text {{d}}}(\pi )}\\&\le \bigl (\tfrac{1}{2}\bigr )^{\ell _k}\bigl (\max _{j\in J}j\bigr )\sum _{\begin{array}{c} \pi \in {\mathcal {W}}^k\\ {\text {{d}}}(\pi )\in J \end{array}}\bigl (\tfrac{1}{2}\bigr )^{{\text {{d}}}(\pi )}. \end{aligned}$$

But for each \(\pi \in {\mathcal {W}}^k\),

$$\begin{aligned} P(\Pi ^k=\pi )=\bigl (\tfrac{1}{2}\bigr )^{{\text {{d}}}(\pi )-1} \cdot \bigl (\tfrac{1}{2}\bigr )=\bigl (\tfrac{1}{2}\bigr )^{{\text {{d}}}(\pi )}, \end{aligned}$$
(21)

where “\(\cdot \left( \tfrac{1}{2}\right) \)” appears in order to have a “\(0\)” after \(\pi \), to guarantee the occurrence of the event \(\{\Pi ^k=\pi \}\). Therefore,

$$\begin{aligned} \sum _{\begin{array}{c} \pi \in {\mathcal {W}}^k\\ {\text {{d}}}(\pi )\in J \end{array}}\bigl (\tfrac{1}{2}\bigr )^{{\text {{d}}}(\pi )} \equiv P\bigl ({\text {{d}}}(\Pi ^k)\in J\bigr )= P\bigl (T^k\in J\bigr ), \end{aligned}$$
(22)

and we have shown that

$$\begin{aligned} Q\bigl ({\text {{d}}}(B^k(0))\in J\bigr )\le \bigl (\tfrac{1}{2}\bigr )^{\ell _k}(\max J) P\bigl (T^k\in J\bigr ). \end{aligned}$$
(23)

Let \(J^k_i:=[i \beta _k,(i+1) \beta _k)\). Since \(\bigl (\tfrac{1}{2}\bigr )^{\ell _k}\beta _k=1\),

$$\begin{aligned} Q\bigl ({\text {{d}}}(B^k(0))\ge j \beta _k\bigr )&= \sum _{i\ge j}Q\bigl ({\text {{d}}}(B^k(0))\in J^k_i\bigr )\\&\le \sum _{i\ge j}(i+1) P\bigl (T^k\in J_i^k\bigr ) \le \sum _{i\ge j}(i+1) P\bigl (T^k\ge i \beta _k\bigr ). \end{aligned}$$

Then, (18) follows from Lemma 1. \(\square \)

3.2 The beginning

When \(k>k_*\), \(\Pi ^k\) can always be viewed as a concatenation of \((k-1)\)-words:

$$\begin{aligned} \Pi ^k= \Pi ^{k-1}_1\star \cdots \star \Pi ^{k-1}_{N(\Pi ^k)}, \end{aligned}$$

where \(\Pi _1^{k-1}:=\Pi ^{k-1}\) and

$$\begin{aligned} \Pi _j^{k-1}:=\Pi ^{k-1}\circ \theta ^{{\text {{d}}}(\Pi ^{k-1}_1\star \cdots \star \Pi ^{k-1}_{j-1})}. \end{aligned}$$

The number of \((k-1)\)-words contained in \(\Pi ^k\), \(N(\Pi ^k)\), is geometric:

Lemma 2

For all \(k>k_*\), there exists \(p_k\in (0,1)\), \(\nu _k^{-1}\le p_k\le 2\nu _k^{-1}\), such that

$$\begin{aligned} \forall j\ge 1,\quad P(N(\Pi ^k)=j)=(1-p_k)^{j-1}p_k. \end{aligned}$$
(24)

In particular, \(E[N(\Pi ^k)]=p_k^{-1}\).

Proof

Consider an i.i.d. sequence of \((k-1)\)-words with the same distribution as \(\Pi ^{k-1}\): \(\Pi ^{k-1}_1,\Pi ^{k-1}_2,\dots \). When sampling a \((k-1)\)-word, we say that this \((k-1)\)-word is closing if its first occurrence of \(I_{k-1}\) coincides with the first occurrence of \(I_k\), which means that the first occurrence of \(I_{k-1}\) is preceded by a sequence of \(\ell _k-\ell _{k-1}\) symbols “\(1\)”. Therefore, the concatenation of the first \(j\) \((k-1)\)-words of the sequence \(\Pi ^{k-1}_1,\Pi ^{k-1}_2,\dots \) is a \(k\)-word if and only if the first \(j-1\) are not closing and the \(j\)-th is closing. Defining

$$\begin{aligned} p_k:=P(\Pi ^{k-1}\text { is closing})= P(T^{k-1}=T^k), \end{aligned}$$

we obtain (24). If \(T^{k-1}=t\ge \ell _k\), we denote by \(r_t\) the word seen in the interval \([t-\ell _k+1,t-\ell _{k-1}]\), of diameter \(\ell _k-\ell _{k-1}\), and by \(q_t\) the word seen in the interval \([t-\ell _{k-1}+1,t]\). We have

$$\begin{aligned} P(T^{k-1}=T^k=t)&=P(r_t=(1,\dots ,1), T^{k-1}=t)\\&=P(T^{k-1}>t-\ell _k, r_t=(1,\dots ,1), q_t=I_{k-1})\\&=P(T^{k-1}>t-\ell _k) P(r_t=(1,\dots ,1)) P(q_t=I_{k-1})\\&= \beta _k^{-1}P(T^{k-1}>t-\ell _k). \end{aligned}$$

Therefore,

$$\begin{aligned} p_k=\sum _{t\ge \ell _k}P(T^{k-1}=T^k=t)= \beta _k^{-1} E[T^{k-1}]. \end{aligned}$$

Using (17), we get \(\nu _k^{-1}\le p_k\le 2\nu _k^{-1}\). \(\square \)
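
Lemma 2 is easy to test by simulation (our code, toy scales): sample fair bits until a closing occurrence, counting occurrences of \(I_{k-1}\) along the way, and compare with the geometric law (24).

```python
import random

def sample_N(ell_prev, ell, rng):
    """N(Pi^k): number of (k-1)-words concatenated in a k-word.  Count
    occurrences of I_{k-1} = 1^(ell_prev-1) 0 until one is closing, i.e.
    preceded by at least ell - 1 consecutive 1s."""
    run, n = 0, 0
    while True:
        if rng.randint(0, 1) == 1:
            run += 1
        else:
            if run >= ell_prev - 1:
                n += 1
                if run >= ell - 1:
                    return n
            run = 0

rng = random.Random(3)
ell_prev, ell = 6, 9
nu = 2 ** (ell - ell_prev)             # nu_k = beta_k / beta_{k-1} = 8
data = [sample_N(ell_prev, ell, rng) for _ in range(20_000)]
print("E[N] ~", sum(data) / len(data),
      "  (Lemma 2: between", nu / 2, "and", nu, ")")
print("P(N=1) ~", data.count(1) / len(data),
      "  (Lemma 2: p_k between", 1 / nu, "and", 2 / nu, ")")
```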

All notions previously defined for blocks, such as active points, “being good”, etc., have immediate analogs for words. Namely, any \(k\)-word \(\Pi \in {\mathcal {W}}^k\) can be identified with \(\Pi (B^k(0))\), where \(B^k(0)\) is assumed to have its first point pinned at the origin: \(a^k(0)=0.\) The beginning of \(\Pi ^k\) is the interval

$$\begin{aligned} {{{\mathcal {C}}}}(\Pi ^k) :=\left[ 0, \sum ^{N(\Pi ^k)\wedge \lfloor \nu _k^{1-\epsilon _*}\rfloor }_{i=1} {\text {{d}}}(\Pi ^{k-1}_i)\right) . \end{aligned}$$

Using (17),

$$\begin{aligned} E[{\text {{d}}}({{{\mathcal {C}}}}(\Pi ^k))]\le \nu _k^{1-\epsilon _*}E[{\text {{d}}}(\Pi ^{k-1})]\le 2\nu _{k}^{1-\epsilon _*}\beta _{k-1}. \end{aligned}$$
(25)

We can now study the position of a point relative to the beginning of each of the \(k\)-blocks in which it is contained:

Lemma 3

Let \(t\in {\mathbb {Z}}\). For \(k=k_*\),

$$\begin{aligned} Q\bigl (t \in {{{\mathcal {C}}}}(B^{k_*}(t))\bigr )\ge 1- 6\lfloor \beta _{k_*}^{\epsilon _*}\rfloor e^{-\lfloor \beta _{k_*}^{\epsilon _*}\rfloor }. \end{aligned}$$
(26)

For all \(k>k_*\),

$$\begin{aligned} Q\bigl (t\in {{{\mathcal {C}}}}(B^k(t))\bigr )\le 2\nu _k^{-\epsilon _*}. \end{aligned}$$
(27)

Remark 3

Observe that \(\nu _k\) diverges superexponentially in \(k\), and so (27) implies

$$\begin{aligned} \sum _{k>k_*} Q\bigl (t \in {{{\mathcal {C}}}}(B^{k}(t))\bigr )<\infty . \end{aligned}$$

Therefore, by the Borel–Cantelli lemma, \(t\not \in {{{\mathcal {C}}}}(B^k(t))\) for all large enough \(k\). As a consequence, \(Q(k_t<\infty )=1\).

Proof of Lemma 3

It suffices to consider \(t=0\). On the one hand, by Corollary 1,

$$\begin{aligned} Q\bigl (0\not \in {{{\mathcal {C}}}}(B^{k_*}(0))\bigr )&\le Q\bigl (B^{k_*}(0){\setminus } {{{\mathcal {C}}}}(B^{k_*}(0))\ne \varnothing \bigr )\\&\le Q\bigl ({\text {{d}}}(B^{k_*}(0))\ge \beta _{k_*}^{1+\epsilon _*}\bigr )\le 6\lfloor \beta _{k_*}^{\epsilon _*}\rfloor e^{-\lfloor \beta _{k_*}^{\epsilon _*}\rfloor }. \end{aligned}$$

On the other hand, using (20), (21), (25),

$$\begin{aligned} Q(0\in {{{\mathcal {C}}}}(B^k(0)))&=\sum _{a\le 0}Q(0\in {{{\mathcal {C}}}}(B^k(0)), a^k(0)=a)\\&=\sum _{a\le 0}\sum _{\begin{array}{c} \pi \in {\mathcal {W}}^k:\\ {\text {{d}}}({{{\mathcal {C}}}}(\pi ))>|a| \end{array}} Q(\Pi (B^k(0))=\pi , a^k(0)=a)\\&=\bigl (\tfrac{1}{2}\bigr )^{\ell _k}\sum _{\pi \in {\mathcal {W}}^k}{\text {{d}}}({{{\mathcal {C}}}}(\pi ))\bigl (\tfrac{1}{2}\bigr )^{{\text {{d}}}(\pi )}=\beta _k^{-1} E[{\text {{d}}}({{{\mathcal {C}}}}(\Pi ^k))]\le 2\nu _k^{-\epsilon _*}. \end{aligned}$$

\(\square \)

3.3 The number of active points

Definition 3

A \(k\)-block \(B\) is good if

  (1)

    \({\text {{d}}}(B)<\tfrac{1}{2} \beta _k^{1+\epsilon _*}\), and if

  (2)

    \( n_1(k):=\beta _k^{1-\epsilon _*}2^{-k}<|{{{\mathcal {A}}}}(B) |<n_2(k):=\beta _k^{1-\epsilon _*} \beta _{k-1}^{2\epsilon _*} \).

If not good, \(B\) is bad.

Proposition 1

If \(k_*\) is large enough, then for all \(k> k_*\),

$$\begin{aligned} Q(B^k(0) \text { is bad})\le 2^{-k}. \end{aligned}$$
(28)

Since the event \(\{B^k(0)\text { is good}\}\) is determined by the word of \(B^k(0)\), we first obtain a similar result for words.

The notion of “good” extends naturally to \(k\)-words. The set of good \(k\)-words is denoted \({\mathcal {W}}^k_{\text {good}}\), and \({\mathcal {W}}^k_{\text {bad}}:={\mathcal {W}}^k{\setminus }{\mathcal {W}}^k_{\text {good}}\).

Lemma 4

If \(k_*\) is large enough, then for all \(k> k_*\),

$$\begin{aligned} P\bigl (\Pi ^k\in {\mathcal {W}}^k_{\text {bad}}\bigr )\le 3\cdot 2^{-3k}+2\beta _{k-1}^{-\epsilon _*}. \end{aligned}$$
(29)

Proof

We write \({\mathcal {W}}^{k}_{\text {good}} ={\mathcal {W}}^{k,1}_{\text {good}}\cap {\mathcal {W}}^{k,2}_{\text {good}}\), where

$$\begin{aligned} {\mathcal {W}}^{k,1}_{\text {good}}&:=\bigl \{\pi \in {\mathcal {W}}^k:\, {\text {{d}}}(\pi )<\tfrac{1}{2} \beta _k^{1+\epsilon _*} \text{ and } |{{{\mathcal {A}}}}(\pi ) |> n_1(k)\bigr \},\\ {\mathcal {W}}^{k,2}_{\text {good}}&:=\bigl \{\pi \in {\mathcal {W}}^k:\,|{{{\mathcal {A}}}}(\pi ) |< n_2(k)\bigr \}. \end{aligned}$$

Let then \({\mathcal {W}}^{k,i}_{\text {bad}}:={\mathcal {W}}^k{\setminus } {\mathcal {W}}^{k,i}_{\text {good}}\). We first prove that for all \(k\ge k_*\),

$$\begin{aligned} P\bigl (\Pi ^k\in {\mathcal {W}}^{k,1}_{\text {bad}}\bigr )\le 3\cdot 2^{-3k}. \end{aligned}$$
(30)

We will proceed by induction on \(k\). Let \(k_*\) be large enough, such that for all \(k\ge k_*\),

$$\begin{aligned} e^{-\tfrac{1}{2} \beta _{k}^{\epsilon _*}} \le 2^{-3 k},\quad 2\nu _k^{-\epsilon _*}\le 2^{-3k}. \end{aligned}$$

(Observe that \(k_*\nearrow \infty \) as \(\epsilon _*\searrow 0\).) We start with the case \(k=k_*\):

$$\begin{aligned} P\bigl (\Pi ^{k_*}\in {\mathcal {W}}^{k_*,1}_{\text {bad}}\bigr )\le P\bigl ({\text {{d}}}(\Pi ^{k_*})&\ge \tfrac{1}{2} \beta _{k_*}^{1+\epsilon _*}\bigr )\\&\quad +P\bigl ({\text {{d}}}(\Pi ^{k_*})<\tfrac{1}{2} \beta _{k_*}^{1+\epsilon _*}, |{{{\mathcal {A}}}}(\Pi ^{k_*}) |\le n_1(k_*) \bigr ). \end{aligned}$$

By Lemma 1,

$$\begin{aligned} P\bigl ({\text {{d}}}(\Pi ^{k_*})\ge \tfrac{1}{2} \beta _{k_*}^{1+\epsilon _*}\bigr ) =P(T^{k_*}\ge \tfrac{1}{2} \beta _{k_*}^{1+\epsilon _*})\le e^{-\tfrac{1}{2} \beta _{k_*}^{\epsilon _*}}\le 2^{-3 k_*}. \end{aligned}$$
(31)

On the other hand, \({{{\mathcal {A}}}}(\Pi ^{k_*})\) is an interval, and \({\text {{d}}}(\Pi ^{k_*})<\tfrac{1}{2} \beta _{k_*}^{1+\epsilon _*}\) implies that \(|{{{\mathcal {A}}}}(\Pi ^{k_*}) |={\text {{d}}}(\Pi ^{k_*})=T^{k_*}\). Therefore,

$$\begin{aligned} P\bigl ({\text {{d}}}(\Pi ^{k_*})<\tfrac{1}{2} \beta _{k_*}^{1+\epsilon _*}, |{{{\mathcal {A}}}}(\Pi ^{k_*}) |\le n_1(k_*) \bigr )&\le P\bigl (T^{k_*}\le n_1(k_*) \bigr )\\&\le n_1(k_*)\bigl (\tfrac{1}{2}\bigr )^{\ell _{k_*}} =\beta _{k_*}^{-\epsilon _*}2^{-k_*} \le 2\cdot 2^{-3k_*}. \end{aligned}$$

Therefore, (30) is proved for \(k=k_*\). Suppose that (30) holds for \(k-1\). Remember that \(\Pi ^k\) is a concatenation of \((k-1)\)-words, denoted by \(\Pi _j\), \(j=1,\dots ,N(\Pi ^k)\). We define the events:

$$\begin{aligned} A_1&:=\{d(\Pi ^k)<\tfrac{1}{2} \beta _k^{1+\epsilon _*}\},\\ A_2&:=\{N(\Pi ^k)>\nu _k^{1-\epsilon _*}\},\\ A_3&:=\Big \{\begin{array}{c} \text {at least half of the }(k-1)\text {-words}\\ \text { in }{{{\mathcal {C}}}}(\Pi ^k) \text { are in }{\mathcal {W}}^{k-1,1}_{\text {good}} \end{array}\Big \}. \end{aligned}$$

We claim that \(A_1\cap A_2\cap A_3\subset \{\Pi ^k\in {\mathcal {W}}^{k,1}_{\text {good}}\}\). Indeed, \(A_1\) ensures that the first condition in \({\mathcal {W}}^{k,1}_{\text {good}}\) is satisfied. Furthermore, in \(A_1\cap A_2\cap A_3\), every \((k-1)\)-word \(\Pi _j\subset {{{\mathcal {C}}}}(\Pi ^k)\) is at distance \(\le \tfrac{1}{2} \beta _{k}^{1+\epsilon _*}\le \beta _{k+1}\) from the origin, and therefore, each active point of \(\Pi _j\) is active in \(\Pi ^k\). As a consequence,

$$\begin{aligned} |{{{\mathcal {A}}}}(\Pi ^k) |&=\sum _{\begin{array}{c} \Pi _j\subset {{{\mathcal {C}}}}(\Pi ^k) \end{array}} |{{{\mathcal {A}}}}(\Pi _j) |\\&\ge \sum _{\begin{array}{c} \Pi _j\subset {{{\mathcal {C}}}}(\Pi ^k):\\ \Pi _j\in {\mathcal {W}}^{k-1,1}_{\text {good}} \end{array}} |{{{\mathcal {A}}}}(\Pi _j) | \!>\! n_1(k\!-\!1)\sum _{\begin{array}{c} \Pi _j\subset {{{\mathcal {C}}}}(\Pi ^k):\\ \Pi _j\in {\mathcal {W}}^{k-1,1}_{\text {good}} \end{array}}1 \!\ge \! n_1(k\!-\!1)\cdot \tfrac{1}{2} \nu _{k}^{1-\epsilon _*}\equiv n_1(k). \end{aligned}$$

Therefore, \(\Pi ^k\in {\mathcal {W}}^{k,1}_{\text {good}}\). It follows that

$$\begin{aligned} P\big (\Pi ^k\in {\mathcal {W}}^{k,1}_{\text {bad}}\big ) \le P\big (A_1^c\big )+ P\big (A_2^c\big )+ P\big (A_2\cap A_3^c\big ). \end{aligned}$$

As in (31), \(P(A_1^c)\le 2^{-3k}\). By Lemma 2,

$$\begin{aligned} P(A_2^c)&=\sum _{j=1}^{\lfloor \nu _k^{1-\epsilon _*}\rfloor } P(N(\Pi ^k)=j)\\&=\sum _{j=1}^{\lfloor \nu _k^{1-\epsilon _*}\rfloor }(1-p_k)^{j-1}p_k \le \nu _k^{1-\epsilon _*}p_k\le 2\nu _k^{-\epsilon _*} \le 2^{-3k}. \end{aligned}$$

On \(A_2\), \({{{\mathcal {C}}}}(\Pi ^k)\) contains exactly \(\lfloor \nu _k^{1-\epsilon _*}\rfloor \) \((k-1)\)-words. Therefore, using the induction hypothesis (30) for \(k-1\),

$$\begin{aligned} P(A_2\cap A_3^c)&=P\Bigl (A_2\cap \Bigl \{\begin{array}{c} \text {at least half of the }(k-1)\text {-words }\\ \text {of } {{{\mathcal {C}}}}(\Pi ^k)\text { are in }{\mathcal {W}}^{k-1,1}_{\text {bad}} \end{array} \Bigr \}\Bigr )\\&\le P\Bigl (\Bigl \{\begin{array}{c} \text {at least half of the }\lfloor \nu _{k}^{1-\epsilon _*}\rfloor (k-1) \text {-words}\\ \text {of }{{{\mathcal {C}}}}(\Pi ^k) \text { are in }{\mathcal {W}}^{k-1,1}_{\text {bad}} \end{array}\Bigr \}\Bigr )\\&\le \sum _{j=\tfrac{1}{2}\lfloor \nu _{k}^{1-\epsilon _*}\rfloor }^{\lfloor \nu _{k}^{ 1-\epsilon _*}\rfloor } \left( {\begin{array}{c}\lfloor \nu _{k}^{1-\epsilon _*}\rfloor \\ j\end{array}}\right) \bigl (3\cdot 2^{-3(k-1)}\bigr )^j \!\le \! \bigl (2^{-3(k\!-\!1)\!+\!6}\bigr )^{\tfrac{1}{2}\lfloor \nu _{k}^{1-\epsilon _*}\rfloor } <2^{-3k}. \end{aligned}$$

This proves (30). It remains to prove that for all \(k> k_*\),

$$\begin{aligned} P\big (\Pi ^k\in {\mathcal {W}}^{k,2}_{\text {bad}}\big )\le 2\beta _{k-1}^{-\epsilon _*}. \end{aligned}$$
(32)

Since \({{{\mathcal {A}}}}(\Pi ^k)\subset {{{\mathcal {C}}}}(\Pi ^k)\), we estimate \({\text {{d}}}({{{\mathcal {C}}}}(\Pi ^k))\). We define \(T_1^{k-1}:=T^{k-1}\) and for \(i>1\),

$$\begin{aligned} T_i^{k-1}:=\inf \bigl \{j>T_{i-1}^{k-1}: (\eta _{j-\ell _{k-1}+1},\ldots ,\eta _{j})=I_{k-1} \bigr \}. \end{aligned}$$

The increments \(\tau _i^{k-1}:=T^{k-1}_i-T_{i-1}^{k-1}\) are i.i.d., and \(E[\tau _1^{k-1}]=E[T^{k-1}]\le 2\beta _{k-1}\). Moreover, \({\text {{d}}}({{{\mathcal {C}}}}(\Pi ^k))\le \sum _{i=1}^{\lfloor \nu _k^{1-\epsilon _*}\rfloor }\tau _i^{k-1}\). Therefore, by Markov’s inequality,

$$\begin{aligned} P\big (|{{{\mathcal {A}}}}(\Pi ^k) |\ge n_2(k)\big )&\le P\left( \sum _{i=1}^{\lfloor \nu _k^{1-\epsilon _*}\rfloor } \tau _{i}^{k-1}\ge n_2(k)\right) \le \frac{E[\tau _1^{k-1}]}{\beta _{k-1}^{1+\epsilon _*}}\le 2\beta _{k-1}^{-\epsilon _*}. \end{aligned}$$

This proves (32). Together, (30) and (32) give (29). \(\square \)

Proof of Proposition 1

Take \(k> k_*\), where \(k_*\) was defined in the proof of Lemma 4. We have

$$\begin{aligned} Q(B^k(0) \text { is bad})&\le Q\bigl ({\text {{d}}}(B^k(0))\ge k \beta _k\bigr )\nonumber \\&\quad + Q\bigl (B^k(0) \text { is bad}, {\text {{d}}}(B^k(0))\le k \beta _k\bigr ). \end{aligned}$$
(33)

By Corollary 1, \(Q\bigl ({\text {{d}}}(B^k(0))\ge k \beta _k\bigr )\le 6ke^{-k}\). Then,

$$\begin{aligned} Q\bigl (B^k(0) \text { is bad}, {\text {{d}}}(B^k(0))\le k \beta _k\bigr )&=\bigl (\tfrac{1}{2}\bigr )^{\ell _k}\sum _{\begin{array}{c} \pi \in {\mathcal {W}}^k_{\text {bad}}\\ {\text {{d}}}(\pi )\le k \beta _k \end{array}}{\text {{d}}}(\pi )\bigl (\tfrac{1}{2}\bigr )^{{\text {{d}}}(\pi )}\\&\le k\sum _{\begin{array}{c} \pi \in {\mathcal {W}}^k_{\text {bad}} \end{array}}\bigl (\tfrac{1}{2}\bigr )^{{\text {{d}}}(\pi )} \equiv k P\bigl (\Pi ^k\in {\mathcal {W}}^k_{\text {bad}}\bigr ). \end{aligned}$$

Using Lemma 4,

$$\begin{aligned} Q\bigl (B^k(0) \text { is bad}\bigr )\le 6ke^{-k}+ k(3\cdot 2^{-3k}+ 2\beta _{k-1}^{-\epsilon _*}). \end{aligned}$$
(34)

Taking \(k_*\) large enough, this proves (28). \(\square \)

4 Proofs of Theorems 1 and 2

The proofs of all the results will study the process \(X\) under the quenched measure \(P_\omega \), using environments \(\omega \) for which the influence of the remote past on the present (for example on a local event like \(\{X_0=+\}\)) can be computed and related to \(\varphi \) and to the sequence \(h_{k}\).

4.1 The event \(\{\infty \rightarrow k\}\)

To start, consider a set of variables \(\{X_s, s\in R\}\), where \(R\) is a finite subset of \({\mathbb {Z}}\). There clearly exists some \(k(1)\ge k_*\) such that \(R\subset B^{k(1)}(0)\). Furthermore, using Remark 3, we can take \(k(1)\) sufficiently large, and guarantee that \(R\subset B^{k(1)}(0){\setminus }{{{\mathcal {C}}}}(B^{k(1)}(0))\). But then, by the definition of the \(g\)-function constructed with \(\psi _t^\omega \), the only way by which the remote past influences the variables in \(R\) is through the value of the average of the variables \(\{X_t,\,t\in {{{\mathcal {A}}}}(B^{k(1)}(0))\}\).

Repeating the same procedure with \(B^{k(1)}(0)\) in place of \(R\), we deduce that the distribution of \(\{X_t,\,t\in {{{\mathcal {A}}}}(B^{k(1)}(0))\}\) is entirely determined by the values of \(\{X_t,\,t\in {{{\mathcal {A}}}}(B^{k(2)}(0))\}\) for some sufficiently large \(k(2)\), etc.

Our aim will be to ensure that the random sequence \(k(i)\) satisfies \(k(i+1)=k(i)+1\) for all large \(i\), and that the sizes of the sets \({{{\mathcal {A}}}}(B^{k(i)}(0))\) are under control. We thus define, for all \(k> k_*\),

$$\begin{aligned} \{\infty \rightarrow k\}:=\bigcap _{j\ge k} \bigl \{ 0\not \in {{{\mathcal {C}}}}(B^{j}(0)),\,B^{j}(0)\text { is good}\bigr \}. \end{aligned}$$
(35)

The notation used suggests that the event is of the type described earlier in Fig. 3. Indeed, let \(\omega \in \{\infty \rightarrow k\}\). Take \(j\ge k\), and \(t\in {{{\mathcal {A}}}}(B^j(0))\). Since \(0\notin {{{\mathcal {C}}}}(B^{j+1}(0))\) and \(B^{j+1}(0)=B^{j+1}(t)\), we have that \(t\notin {{{\mathcal {C}}}}(B^{j+1}(t))\) which implies \(t \notin {{{\mathcal {A}}}}(B^{j+1}(t))\). Therefore, \(k_t=j+1\). Moreover, since \(B^{j+1}(0)\) is good, we have that \(d(B^{j+1}(0))\le \tfrac{1}{2}\beta _{j+1}^{1+\epsilon _*}\le \beta _{j+2}\). This implies that \(S_t={{{\mathcal {A}}}}(B^{j+1}(0))\), and

$$\begin{aligned} \psi _t^\omega =h_{j+1}\varphi \left( \frac{1}{|{{{{\mathcal {A}}}}(B^{j+1}(0))}|} \sum _{s\in {{{\mathcal {A}}}}(B^{j+1}(0))}X_s\right) . \end{aligned}$$
(36)

Therefore, on \(\{\infty \rightarrow k\}\), for all \(j\ge k\), the variables \(\{X_t,t\in {{{\mathcal {A}}}}(B^j(0))\}\) are conditionally i.i.d., their common distribution being fixed by the value of the magnetization of \(\{X_t,t\in {{{\mathcal {A}}}}(B^{j+1}(0))\}\). That is, the distribution of the process \(X\) on any finite region is related to the behavior of the non-homogeneous Markov sequence

$$\begin{aligned} \xi _{j}:=\frac{1}{|{{{\mathcal {A}}}}(B^{j}(0))|}\sum _{s\in {{{\mathcal {A}}}}(B^{j}(0))} X_s,\quad j\ge k. \end{aligned}$$

The transition probability of the chain will be studied using the following relation, which holds on the event \(\{\infty \rightarrow k\}\), for all \(j\ge k\) (Fig. 5):

$$\begin{aligned} E_\omega [\xi _{j}\,\vert \,\xi _{j+1}]=h_{j+1}\varphi (\xi _{j+1}). \end{aligned}$$
(37)
Fig. 5 The quenched distribution of the process \(X\), on the event \(\{\infty \rightarrow k\}\)
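
The dichotomy behind Theorems 1–3 can already be observed on a caricature of the chain (37). In the sketch below (ours; the sizes \(n_j\) and the exponent \(\alpha \) are toy stand-ins for \(|{{{\mathcal {A}}}}(B^j(0))|\) and the choice (14), arranged so that \(h_{j+1}^2n_j=n_j^{1-2\alpha }\) and the analogue of (13) converges iff \(\alpha <\tfrac{1}{2}\)), we start from a \(+\) boundary at a large scale and iterate (37) downward: with the pure majority rule and small \(\alpha \) the bias survives down to the lowest scale, while for large \(\alpha \), or for a smooth \(\varphi \), symmetry is restored.

```python
import random

def plus_fraction(phi, alpha, M=24, trials=100, seed=0):
    """Iterate the chain (37) from scale M down to 0: given xi_{j+1}, draw
    n_j i.i.d. +-1 spins with mean h_{j+1} * phi(xi_{j+1}), where h_{j+1}
    := n_j^(-alpha), and average them.  Returns the fraction of runs whose
    final magnetization stayed positive."""
    rng = random.Random(seed)
    n = [200 + int(1.4 ** j) for j in range(M)]
    plus = 0
    for _ in range(trials):
        xi = 1.0
        for j in reversed(range(M)):
            p = (1.0 + n[j] ** (-alpha) * phi(xi)) / 2.0
            xi = sum(1 if rng.random() < p else -1 for _ in range(n[j])) / n[j]
        plus += xi > 0
    return plus / trials

sign = lambda z: (z > 0) - (z < 0)
print("PMR, alpha = 0.3:", plus_fraction(sign, 0.3))          # bias survives
print("PMR, alpha = 0.7:", plus_fraction(sign, 0.7))          # ~ 1/2
print("linear phi      :", plus_fraction(lambda z: z, 0.3))   # ~ 1/2
```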

Proposition 2

Let \(k\ge k_*\). There exists \(\lambda (\epsilon _*,k)>0\) with \(\lambda (\epsilon _*,k)\searrow 0\) when \(k\rightarrow \infty \), such that

$$\begin{aligned} Q(\infty \rightarrow k)\ge 1-\lambda (\epsilon _*,k). \end{aligned}$$
(38)

Moreover, there exists a random scale \(K=K(\omega )\), \(Q(K<\infty )=1\), such that

$$\begin{aligned} Q(\infty \rightarrow K)=1. \end{aligned}$$

Proof

If \(k_*\) is as large as in Proposition 1, then for \(k\ge k_*\)

$$\begin{aligned} Q(\{\infty \rightarrow k\}^c)&\le \sum _{{j}\ge k} \bigl \{ Q(B^{j}(0) \text { is bad})+Q(0\in {{{\mathcal {C}}}}(B^{j}(0))) \bigr \} \\&\le \sum _{{j}\ge k} \bigl \{2^{-{j}} +2\nu _{{j}}^{-\epsilon _*}\bigr \}, \end{aligned}$$
(39)

whose right-hand side tends to zero as \(k\rightarrow \infty \), which gives (38); it is moreover summable in \(k\), so the existence of \(K\) follows from the Borel–Cantelli lemma. \(\square \)

4.2 The measures \(P^+_\omega \), \(P^-_\omega \) and their maximal coupling

Our proofs will rely on the use of two particular processes specified by \(g\), \(Z^+=(X^+,\omega )\) and \(Z^-=(X^-,\omega )\), symmetric with respect to each other in the sense that

$$\begin{aligned} {\mathbb {P}}(X_s^-=+)={\mathbb {P}}(X_s^+=-). \end{aligned}$$
(40)

Each \(X^\#\) will actually be the coordinate process associated to a probability measure \(P_\omega ^\#\) on \(\{\pm \}^{\mathbb {Z}}\) constructed with a pure boundary condition \(\#\in \{+,-\}\). The construction given below is standard.

For \(x,y\in \{\pm \}^{\mathbb {Z}}\), let \(x_s^t:=(x_t,\ldots ,x_s)\) and \(x_{-\infty }^t:=(x_t,x_{t-1},\ldots )\). If \(\omega \in \{0,1\}^{{\mathbb {Z}}}\), define

$$\begin{aligned} g^\omega _t\bigl (x_t|x_{-\infty }^{t-1}\bigr ) :=\tfrac{1}{2}\{1 +x_t\psi ^\omega _t\bigl (x_{-\infty }^{t-1}\bigr )\}. \end{aligned}$$

Define also the cylinder \([x]_s^t:=\{y\in \{\pm \}^{\mathbb {Z}}:y_i =x_i,~i\in [s,t]\}\) and \(x_s^ty_{-\infty }^{s-1} :=(x_t,\ldots ,x_s,y_{s-1},y_{s-2},\ldots )\). For each \(N\in {\mathbb {N}}\), \(\eta \in \{\pm \}^{\mathbb {Z}}\), we define a probability measure on \(\{\pm \}^{(-N,\infty )}\) by setting

$$\begin{aligned} P_\omega ^{\eta ,N}\bigl ([x]_{-N+1}^{s}\bigr ):=g^\omega _{-N+1}\bigl (x_{-N+1} |\eta _{-\infty }^{-N}\bigr ) \prod _{t=-N+2}^{s} g^\omega _t\bigl (x_{t} |x_{-N+1}^{t-1}\eta _{-\infty }^{-N}\bigr ). \end{aligned}$$

When \(\eta ^1\le \eta ^2\) (pointwise), \(P_\omega ^{\eta ^1,N}\) and \(P_\omega ^{\eta ^2,N}\) can be coupled as follows. Consider an i.i.d. sequence of random variables \((U_t)_{t> -N}\), each with a uniform distribution on \([0,1]\). We construct two processes, \(X^1\) and \(X^2\), through a sequence of pairs, \(\Delta _t=\left( {\begin{array}{c}X_t^2\\ X_t^1\end{array}}\right) \), \(t>-N\), in such a way that \((X^\#_t)_{t>-N}\) has distribution \(P_\omega ^{\eta ^\#,N}\) and such that \(X_t^1\le X_t^2\) for all \(t>-N\). For \(s\le -N\), set \(x^\#_s:=\eta _s^\#\). Assume that the pairs \(\Delta _s=\left( {\begin{array}{c}x_s^2\\ x_s^1\end{array}}\right) \) have been sampled for all \(s<t\), and that these satisfy \(x_s^1\le x_s^2\). Let

$$\begin{aligned} \Delta _t =\left( {\begin{array}{c}X_t^2\\ X_t^1\end{array}}\right) :=\left( {\begin{array}{c}+\\ -\end{array}}\right) 1_{A_t}+\left( {\begin{array}{c}+\\ +\end{array}}\right) 1_{B_t}+\left( {\begin{array}{c}-\\ -\end{array}}\right) 1_{C_t}, \end{aligned}$$
(41)

where

$$\begin{aligned} A_t&:=\bigl \{0\le U_t<g_t^\omega (+\,\vert \,(x^2)_{-\infty }^{t-1} )-g_t^\omega (+\,\vert \,(x^1)_{-\infty }^{t-1})\bigr \}, \nonumber \\ B_t&:=\bigl \{g_t^\omega (+\,\vert \,(x^2)_{-\infty }^{t-1} )-g_t^\omega (+\,\vert \,(x^1)_{-\infty }^{t-1})\le U_t<g_t^\omega (+\,\vert \,(x^2)_{-\infty }^{t-1} )\bigr \},\nonumber \\ C_t&:=\bigl \{g_t^\omega (+\,\vert \,(x^2)_{-\infty }^{t-1}) \le U_t \le 1\bigr \}. \end{aligned}$$
(42)

We of course have \(\mathsf {P}(X_t^1\le X_t^2)=1\), and

$$\begin{aligned} \mathsf {P}\bigl ( X_t^2=+ \,\vert \,X_{t-1}^2=x^2_{t-1},\ldots ,X_{-N+1}^2=x_{-N+1}^2\bigr )&=\mathsf {P}\bigl (A_t\cup B_t\big )\\&= g^\omega _t\bigl (+ \,\vert \,(x^2)_{-\infty }^{t-1}\bigr ), \end{aligned}$$

and so the distribution of \((X_t^2)_{t>-N}\) is given by \(P_\omega ^{\eta ^2,N}\). Similarly,

$$\begin{aligned} \mathsf {P}\bigl ( X_t^1=+ \,\vert \,X_{t-1}^1=x^1_{t-1},\ldots ,X_{-N+1}^1=x_{-N+1}^1\bigr )&=\mathsf {P}\bigl (B_t\big )\\&= g^\omega _t\bigl (+ \,\vert \,(x^1)_{-\infty }^{t-1}\bigr ), \end{aligned}$$

and so the distribution of \((X_t^1)_{t>-N}\) is given by \(P_\omega ^{\eta ^1,N}\).
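
The coupling (41)–(42) is straightforward to implement: a single uniform variable drives both chains, with the three events cut so that the marginals are correct and the ordering is never violated. The sketch below (ours) uses a toy attractive finite-range kernel in place of \(g_t^\omega \); for it, the two chains coalesce quickly, the coupling signature of uniqueness.

```python
import random

def g_plus_toy(past, eps=0.4, width=7):
    """Toy attractive kernel standing in for g_t^omega(+|.): the chance of
    +1 increases with the average of the last `width` spins."""
    w = past[-width:]
    return 0.5 * (1.0 + eps * sum(w) / len(w))

def coupled(T=5_000, width=7, seed=0):
    """Run the coupling (41)-(42) from the boundary conditions eta^1 = all
    minus and eta^2 = all plus, using one uniform U_t per step."""
    rng = random.Random(seed)
    lo, hi = [-1] * width, [+1] * width
    disagree = 0
    for _ in range(T):
        u = rng.random()
        p_lo, p_hi = g_plus_toy(lo), g_plus_toy(hi)    # p_lo <= p_hi
        x_hi = +1 if u < p_hi else -1                  # event A_t or B_t
        x_lo = +1 if p_hi - p_lo <= u < p_hi else -1   # event B_t only
        assert x_lo <= x_hi                            # ordering preserved
        disagree += x_lo != x_hi
        lo.append(x_lo)
        hi.append(x_hi)
    return disagree / T

print("fraction of disagreeing steps:", coupled())     # small: coalescence
```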

The above coupling allows one to extract information about the measures \(P_\omega ^{\eta ^\#,N}\). First, \(x^1\le x^2\) implies \(g_t^\omega (+\,\vert \,(x^1)_{-\infty }^{t-1} )\le g_t^\omega (+\,\vert \,(x^2)_{-\infty }^{t-1})\), and so we always have

$$\begin{aligned} P_\omega ^{\eta ^1,N}(X_t=+)\le P_\omega ^{\eta ^2,N}(X_t=+),\quad t>-N. \end{aligned}$$

More generally, if \(f:\{\pm \}^{(-N+1,\infty )}\rightarrow {\mathbb {R}}\) is an increasing local function (that is: non-decreasing in each variable \(x_s\) and depending only on a finite number of coordinates), then

$$\begin{aligned} E_\omega ^{\eta ^1,N}[f]\le E_\omega ^{\eta ^2,N}[f]. \end{aligned}$$
(43)

Using the above, one can also construct two measures \(P_\omega ^{+}\) and \(P_\omega ^{-}\) by taking monotone limits. Namely, let \(P_\omega ^{+,N}\) be constructed as above using the boundary condition \(\eta _s\equiv +\) for all \(s\). Using (43), it is easy to see that for all local increasing \(f\),

$$\begin{aligned} E_\omega ^{+,N+1}[f]\le E_\omega ^{+,N}[f], \end{aligned}$$

which allows one to define \(E^{+}_\omega [f]:=\lim _{N\rightarrow \infty } E^{+,N}_\omega [f]\). Since this extends to all continuous functions, it defines a measure \(P_\omega ^+\). It can then be verified that the coordinate process \(X=(X_t)_{t\in {\mathbb {Z}}}\) defined by \( X_t(x):=x_t\) satisfies (3) (with \(P_\omega ^+\) in place of \(P_\omega \)).

4.3 Non-uniqueness when \(\alpha <\frac{1-\epsilon _*}{2}\)

Let \(\xi ^\#_k\) denote the average of \(X^\#\) over \({{{\mathcal {A}}}}(B^k(0))\) (if \({{{\mathcal {A}}}}(B^k(0))=\varnothing \), let \(\xi ^\#_k:=0\)). To obtain non-uniqueness, we will show that when \(\alpha <\frac{1-\epsilon _*}{2}\),

$$\begin{aligned} {\mathbb {P}}(\xi ^+_k>0)>\tfrac{1}{2}>{\mathbb {P}}(\xi ^-_k>0), \end{aligned}$$
(44)

for all large enough \(k\). Actually, due to the attractiveness of \(g\), the following lower bound holds for all \(\omega \):

$$\begin{aligned} P_\omega ^+(\xi _{k}\ge 0)\ge \tfrac{1}{2}. \end{aligned}$$
(45)

Proposition 3

Under the hypotheses of Theorem 1, with \(\alpha <\frac{1-\epsilon _*}{2}\), there exists for all \(k'> k_*\) some \(\epsilon (k')\), \(\epsilon (k')\searrow 0\) as \(k'\rightarrow \infty \), such that

$$\begin{aligned} \forall ~\omega \in \{\infty \rightarrow k'\},\quad P^+_\omega (\xi _{k'}>0)\ge 1-\epsilon (k'). \end{aligned}$$
(46)

Proof of Theorem 1, item (1)

Using (46),

$$\begin{aligned} {\mathbb {P}}\bigl (\xi ^+_k>0\bigr )&\ge \int _{\{\infty \rightarrow k\}}P_\omega ^+\bigl (\xi _{k}>0\bigr ) Q(d\omega )\\&\ge \bigl (1-\epsilon (k)\bigr )Q(\infty \rightarrow k). \end{aligned}$$

By taking \(k\) large, this lower bound becomes \(>\tfrac{1}{2}\); by the symmetry (40), \({\mathbb {P}}(\xi ^-_k>0)={\mathbb {P}}(\xi ^+_k<0)<\tfrac{1}{2}\). This proves (44), and thereby item (1) of Theorem 1. \(\square \)

Remark 4

As the proof of Proposition 3 will show, it is possible to distinguish \(X^+\) and \(X^-\) even at the origin. Namely, it can be shown that

$$\begin{aligned} \forall ~ \omega \in \{\infty \rightarrow k_*+1\},\quad P^{+}_\omega (X_0=+)\ge \tfrac{1}{2}+\tau , \end{aligned}$$
(47)

where \(\tau >0\) provided \(k_*\) is taken large enough (depending on \(\alpha \)). Then, by (45) and (47),

$$\begin{aligned} {\mathbb {P}}(X_0^+=+)&\ge \bigl (\tfrac{1}{2}+\tau \bigr )Q(\infty \rightarrow k_*+1)+\tfrac{1}{2} Q(\{\infty \rightarrow k_*+1\}^c)\\&=\tfrac{1}{2}+\tau Q(\infty \rightarrow k_*+1)\\&>\tfrac{1}{2}. \end{aligned}$$

Nevertheless, we prefer to avoid having \(k_*\) depend on \(\alpha \).

Remark 5

In [1], non-uniqueness was obtained by showing that when \(\alpha <\frac{1-\epsilon _*}{2}\), any process \({\mathbb {P}}\) specified by \(g\) must satisfy

$$\begin{aligned} {\mathbb {P}}\Bigl ( \Bigl \{ \lim _{k\rightarrow \infty }1_{\{\xi _{k}>0\}}=1 \Bigr \} \cup \Bigl \{ \lim _{k\rightarrow \infty }1_{\{\xi _{k}<0\}}=1 \Bigr \} \Bigr )=1. \end{aligned}$$

From this, the existence of two distinct processes can be deduced, using an argument based on symmetry and ergodic decomposition.

4.4 The signature of uniqueness

Assume the environment \(\omega \) is typical, in the sense that \(\{\infty \rightarrow k'\}\) occurs for some large enough \(k'\). As we have seen, the distribution of \(X\) on any finite region can be studied via the information contained in the sequence \(\xi _{k}\), \(k\ge k'\). On the one hand, we have seen in (44) that non-uniqueness is observed through some asymmetry in the distribution of \(\xi _{k}\) when \(k\) is large. Uniqueness, on the other hand, will essentially be characterized by showing that the variables \(\xi _{k}\) are symmetric:

$$\begin{aligned} E_\omega [\xi _{k}]=0 \quad \text { for all large }k. \end{aligned}$$
(48)

Observe that regardless of the details of \(\varphi \),

$$\begin{aligned} \lim _{k\rightarrow \infty }E_\omega [\xi _{k}]=0\, \end{aligned}$$
(49)

always holds. Namely, for all \(k\ge k'\),

$$\begin{aligned} E_\omega [\xi _{k}]=E_\omega \bigl [E_\omega [\xi _{k}\,\vert \,\xi _{k+1}]\bigr ] =h_{k+1}E_\omega [\varphi (\xi _{k+1})]=O\bigl (h_{k+1}\bigr ). \end{aligned}$$

More can be said: conditioned on \(\xi _{k+1}\), \(\xi _{k}\) is the average of i.i.d. \(\pm 1\)-valued variables \(X_s\), each with expectation \(h_{k+1}\varphi (\xi _{k+1})\). Therefore, for any fixed \(\epsilon >0\), if \(k\) is large enough so that \(h_{k+1}\le \epsilon /2\), a standard large deviation estimate yields

$$\begin{aligned} P_\omega \bigl (|\xi _{k}|> \epsilon \,\vert \,\xi _{k+1}\bigr )\le e^{-c|{{{\mathcal {A}}}}(B^k(0))|}\le e^{-cn_1(k)}, \end{aligned}$$
(50)

where \(c=c(\epsilon )>0\) (we have used the fact that \(B^k(0)\) is good). Therefore,

$$\begin{aligned} \forall \epsilon >0,\quad P_\omega \bigl ( |\xi _{k}|\le \epsilon \text { for all large enough }k\bigr )=1. \end{aligned}$$
(51)

Since the variables \(\xi _{k}\) almost surely tend to zero when \(k\rightarrow \infty \), observing some (a)symmetry in their distribution is a delicate problem.
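To make the transition mechanism concrete, here is a small numerical sketch of the conditional step behind (50). It is an illustration only: `h`, `n`, `eps` and the chosen values are stand-ins for \(h_{k+1}\), \(|{{{\mathcal {A}}}}(B^k(0))|\) and \(\epsilon \), not the paper's actual quantities.

```python
import numpy as np

rng = np.random.default_rng(0)

# One transition xi_{k+1} -> xi_k: conditioned on xi_{k+1}, the spins X_s
# are i.i.d. +/-1 with mean h*phi(xi_{k+1}); xi_k is their average.
def sample_xi(xi_next, h, n, phi):
    p_plus = 0.5 * (1.0 + h * phi(xi_next))   # P(X_s = +1)
    plus = rng.binomial(n, p_plus)            # number of + spins
    return (2 * plus - n) / n

phi_pmr = lambda z: float(np.sign(z))         # pure majority rule
eps, h, n = 0.2, 0.05, 10_000                 # h <= eps/2, as in the text

samples = [sample_xi(0.3, h, n, phi_pmr) for _ in range(10_000)]
print("P(|xi_k| > eps) ~", np.mean([abs(s) > eps for s in samples]))  # ~ 0.0
```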

Uniqueness will be obtained with the help of the following criterion, whose proof can be found in Appendix A.

Theorem 4

Assume that

$$\begin{aligned} E_\omega ^+[X_t]=0=E_\omega ^-[X_t]\,\quad \forall ~t\in {\mathbb {Z}}. \end{aligned}$$
(52)

Then, \(P_\omega ^+=P_\omega ^-\), and any measure \(P_\omega \) satisfying (3) coincides with \(P_\omega ^+\) and \(P_\omega ^-\).

4.5 Uniqueness when \(\alpha >\frac{1-\epsilon _*}{2}\)

Proposition 4

Let \({\mathbb {P}}=Q\otimes P_\omega \) be the distribution of any process specified by \(g\). Under the hypotheses of Theorem 1, with \(\alpha >\frac{1-\epsilon _*}{2}\), for \(Q\)-almost all environment \(\omega \), and for all large enough \(k\),

$$\begin{aligned} \lim _{M\rightarrow \infty }E_\omega \bigl [\xi _{k}\bigm \vert \xi _{M}\bigr ]= 0\,\quad P_\omega \text {-almost surely.} \end{aligned}$$
(53)

Proof of Theorem 1, item (2)

By Proposition 2, we can consider a fixed environment \(\omega \) for which \(K=K(\omega )<\infty \). Take \(k\ge K\) large. We will consider \(P_\omega ^+\), and show that (53) implies (52).

We know that \(\xi _{k}\) is an average of identically distributed variables \(X_s\), \(s\in {{{\mathcal {A}}}}(B^k(0))\). By (53), \(P_\omega ^+\)-almost surely, for each such \(s\),

$$\begin{aligned} \lim _{M\rightarrow \infty }P_\omega ^+\bigl (X_s=+\bigm \vert \xi _{M}\bigr )&= \lim _{M\rightarrow \infty }\tfrac{1}{2}\bigl (E_\omega ^+\bigl [X_s\bigm \vert \xi _{M}\bigr ]+1\bigr )\\&= \lim _{M\rightarrow \infty }\tfrac{1}{2}\bigl (E_\omega ^+\bigl [\xi _{k}\bigm \vert \xi _{M}\bigr ]+1\bigr )= \tfrac{1}{2}. \end{aligned}$$

This implies that for all large enough \(k\), the distribution of \(\xi _{k}\) under \(P_\omega ^+(\cdot \,\vert \,\xi _{M})\) converges when \(M\rightarrow \infty \) to a symmetric distribution. We can show that this extends to any variable \(X_t\) as follows. Take \(k\) large enough so that \(t\not \in {{{\mathcal {A}}}}(B^k(0))\), and write

$$\begin{aligned} E_\omega ^+\bigl [X_t\bigm \vert \xi _{M}\bigr ]= E^+_\omega \bigl [ E^+_\omega [X_t\,\vert \,\xi _{k}]\bigm \vert \xi _{M} \bigr ]. \end{aligned}$$
(54)

Since \(f(x):=E^+_\omega [X_t\,\vert \,\xi _{k}=x]\) is odd, \(f(-x)=-f(x)\), and since \(\xi _{k}\) converges in distribution to a symmetric variable, the right-hand side of (54) converges to zero when \(M\rightarrow \infty \). By dominated convergence, we thus get

$$\begin{aligned} E_\omega ^+[X_t]=\lim _{M\rightarrow \infty }E^+_\omega \bigl [ E_\omega ^+\bigl [X_t\bigm \vert \xi _{M}\bigr ] \bigr ]=0. \end{aligned}$$

Similarly, \(E_\omega ^-[X_t]=0\), and this finishes the proof. \(\square \)

4.6 Proofs of Propositions 3 and 4

The sequence \(\xi _{k}\in [-1,1]\) is Markovian and temporally non-homogeneous; we can nevertheless estimate its transition probabilities with sufficient precision. Since the BHS-model considers the pure majority rule \(\varphi _{PMR}\), its study can be reduced to that of the sign variables

$$\begin{aligned} \sigma _{k}:={\left\{ \begin{array}{ll} +1&{} \text {if } \xi _{k}> 0,\\ 0 &{} \text {if } \xi _{k}=0,\\ -1&{}\text {if }\xi _{k}<0. \end{array}\right. } \end{aligned}$$

Remember that if \(B^{k}(0)\) is good, then

$$\begin{aligned} n_1(k)< |{{{\mathcal {A}}}}(B^{k}(0)) |<n_2(k), \end{aligned}$$
(55)

where the leading term in each \(n_\#(k)\) is \(\beta _k^{1-\epsilon _*}\). But since \(h_{k+1}=\beta _k^{-\alpha }\),

$$\begin{aligned} h_{k+1}^2n_\#(k) {\left\{ \begin{array}{ll} \nearrow \infty &{}\text { if }\alpha <\frac{1-\epsilon _*}{2},\\ \searrow 0 &{}\text { if }\alpha >\frac{1-\epsilon _*}{2}. \end{array}\right. } \end{aligned}$$
(56)
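Indeed, (56) is a direct power count: since the leading term of \(n_\#(k)\) is \(\beta _k^{1-\epsilon _*}\),

$$\begin{aligned} h_{k+1}^2\,n_\#(k)\asymp \beta _k^{-2\alpha }\,\beta _k^{1-\epsilon _*}=\beta _k^{(1-\epsilon _*)-2\alpha }, \end{aligned}$$

which tends to \(\infty \) or to \(0\) according to the sign of \((1-\epsilon _*)-2\alpha \), since \(\beta _k\rightarrow \infty \).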

We study the sign changes of the sequence \(\xi _{k}\):

Lemma 5

Assume \(\varphi =\varphi _{PMR}\). Let \(\omega \in \{\infty \rightarrow k'\}\). For all large \(k\ge k'\), the following holds.

  1. (1)

    For all \(h_k\searrow 0\),

    $$\begin{aligned} P_\omega \bigl (\sigma _{k}\le 0\,\vert \,\sigma _{k+1}=+\bigr ) \le e^{-h_{k+1}^2 n_1(k)/16}. \end{aligned}$$
  2. (2)

    There exists \(c_0>0\) such that if \(h_{k}=\beta _{k-1}^{-\alpha }\),

    $$\begin{aligned} P_\omega \bigl (\sigma _{k}\le 0\,\vert \,\sigma _{k+1}=+\bigr ) {\left\{ \begin{array}{ll} \le e^{-h_{k+1}^2 n_1(k)/16}&{}\text { if }\alpha <\tfrac{1-\epsilon _*}{2}, \\ \ge c_0 &{}\text { if }\alpha >\tfrac{1-\epsilon _*}{2}. \end{array}\right. } \end{aligned}$$

Proof

  • (1)  On the event \(\{\infty \rightarrow k'\}\), each block \(B^{k}(0)\), \(k\ge k'\), is good. In particular, (55) holds, and by (35), under \(P_\omega \bigl (\,\cdot \,\,\vert \,\sigma _{k+1}=+\bigr )\), the variables \(\{X_s, s\in {{{\mathcal {A}}}}(B^{k}(0))\}\) are i.i.d. with

    $$\begin{aligned} E_\omega [X_s\,\vert \,\sigma _{k+1}=+]=h_{k+1}\varphi _{PMR}(\xi _{k+1})\equiv h_{k+1}. \end{aligned}$$

    Let \(X'_s:=X_s-E_\omega [X_s\,\vert \,\sigma _{k+1}=+]\). By Bernstein's inequality,

    $$\begin{aligned}&P_\omega (\sigma _{k}\le 0\,\vert \,\sigma _{k+1}=+) \\&\quad = P_\omega \Bigl ( \tfrac{1}{|{{{\mathcal {A}}}}(B^{k}(0)) |}\sum _{s\in {{{\mathcal {A}}}}(B^{k}(0))} X'_s \le -h_{k+1} \Bigm \vert \sigma _{k+1}=+\Bigr )\\&\quad \le e^{-h_{k+1}^2 |{{{\mathcal {A}}}}(B^k(0))|/16} \le e^{-h_{k+1}^2 n_1(k)/16} . \end{aligned}$$
  • (2)  To prove the lower bound we let \(A(k):=|{{{\mathcal {A}}}}(B^k(0)) |\) and let \({{\mathcal {L}}}_k\) denote the set of integers between \(0\) and \(\sqrt{A(k)}\) that have the same parity as \(A(k)\). Using Stirling’s formula:

    $$\begin{aligned}&P_\omega \bigl (\sigma _{k}\le 0\,\vert \,\sigma _{k+1}=+\bigr )\\&\quad \ge \sum _{L\in {\mathcal {L}}_k} P_\omega \Bigl (\sum _{s\in {{{\mathcal {A}}}}(B^k(0))} X_s=-L\Bigm \vert \sigma _{k+1}=+\Bigr )\\&\quad =\sum _{L\in {\mathcal {L}}_k}\left( {\begin{array}{c}A(k)\\ \frac{A(k)+L}{2}\end{array}}\right) \bigl (\tfrac{1}{2}(1+h_{k+1})\bigr )^{\frac{A(k)-L}{2}} \bigl (\tfrac{1}{2}(1-h_{k+1})\bigr )^{ \frac{A(k)+L}{2}}\\&\quad \ge \tilde{c}_0\, e^{-h_{k+1}^2 A(k)}\, e^{-h_{k+1}\sqrt{A(k)}/4}. \end{aligned}$$

When \(\alpha >\frac{1-\epsilon _*}{2}\), (56) gives \(h_{k+1}^2A(k)\rightarrow 0\), hence also \(h_{k+1}\sqrt{A(k)}=\sqrt{h_{k+1}^2A(k)}\rightarrow 0\), so the last bound tends to \(\tilde{c}_0>0\); this yields the constant \(c_0\). \(\square \)
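As a quick numerical illustration of the two regimes in Lemma 5, one can estimate the probability that the average of \(A\) i.i.d. \(\pm 1\) spins with mean \(h\) is \(\le 0\) by Monte Carlo. Here `h` and `A` are arbitrary stand-ins for \(h_{k+1}\) and \(A(k)\):

```python
import numpy as np

rng = np.random.default_rng(1)

# Probability that the average of A i.i.d. +/-1 spins with mean h is <= 0.
def p_sign_flip(h, A, trials=100_000):
    plus = rng.binomial(A, 0.5 * (1.0 + h), size=trials)
    return np.mean(2 * plus - A <= 0)

print(p_sign_flip(h=0.05, A=10_000))   # h^2*A = 25: sign flips are very rare
print(p_sign_flip(h=0.001, A=10_000))  # h^2*A = 0.01: flips occur about half the time
```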

Proof of Proposition 3

Let \(\omega \in \{\infty \rightarrow k'\}\). The probability we are interested in is defined using the \(+\) boundary condition:

$$\begin{aligned} P_\omega ^+(\xi _{k'}>0)=\lim _{N\rightarrow \infty }P_\omega ^{+,N}(\xi _{k'}>0). \end{aligned}$$

We choose \(N\) large, lying between two successive sets \({{{\mathcal {A}}}}(B^{M-1}(0))\) and \({{{\mathcal {A}}}}(B^{M}(0))\). A lower bound is obtained by forcing the sign of the boundary condition to travel all the way down to \(\xi _{k'}\):

[Figure c: the sign of the \(+\) boundary condition travelling down the scales from \(M\) to \(k'\).]

Using Lemma 5,

$$\begin{aligned} P_\omega ^{+,N}(\xi _{k'}>0)&=P_\omega ^{+,N}(\sigma _{k'}=+)\\&\ge \prod _{k=k'}^{M-1} P_\omega ^{+,N}(\sigma _{k}=+\,\vert \,\sigma _{k+1}=+)\\&\ge \prod _{k=k'}^{\infty } \left\{ 1-e^{-h_{k+1}^2n_1(k)/16}\right\} . \end{aligned}$$

When \(\alpha <\frac{1-\epsilon _*}{2}\), we have

$$\begin{aligned} \sum _{k}e^{-h_{k+1}^2n_1(k)/16}<\infty , \end{aligned}$$
(57)

and so the last product converges. Moreover, using \(1-x\ge e^{-2x}\) (valid for \(0\le x\le \tfrac{1}{2}\), which the terms eventually satisfy),

$$\begin{aligned} \prod _{k=k'}^{\infty }\bigl \{1-e^{-h_{k+1}^2n_1(k)/16}\bigr \} \ge \exp \Bigl \{-2\sum _{k=k'}^{\infty }e^{-h_{k+1}^2n_1(k)/16}\Bigr \}, \end{aligned}$$

and by (57) the sum tends to \(0\) as \(k'\rightarrow \infty \), so the product tends to \(1\). \(\square \)

Proof of Proposition 4

By Proposition 2, we can consider a fixed environment \(\omega \) for which \(K=K(\omega )<\infty \). Then for each \(k'>K\), we have that \(\omega \in \{\infty \rightarrow k'\}\).

Take \(M>k'\). If \(\xi _M=0\) then \(E_\omega [\xi _{k'}\,\vert \,\xi _{M}]=0\). If \(\xi _{M}> 0\), then by the attractiveness of the model, \(E_\omega [\xi _{k'}\,\vert \,\xi _{M}]\ge 0\). For an upper bound, we look for the largest scale \(k\) at which \(\xi _k\) is non-positive:

$$\begin{aligned} \kappa _M:=\max \{k'\le k\le M:\xi _{k}\le 0\}, \end{aligned}$$
(58)

with the convention that \(\kappa _M:=k'-1\) if \(\xi _{k}>0\) for all \(k'\le k< M\). Again by attractiveness, on \(\{\kappa _M\ge k'\}\), the Markov property gives \(E_\omega [\xi _{k'}\,\vert \,\xi _{\kappa _M}]\le 0\). Therefore, \(E_\omega [\xi _{k'}\,\vert \,\kappa _M\ge k',\xi _{M}]\le 0\), and so

$$\begin{aligned} E_\omega [\xi _{k'}\,\vert \,\xi _{M}] \le P_\omega \big (\kappa _M=k'-1\,\vert \,\xi _{M}\big ). \end{aligned}$$
(59)

It follows from Lemma 5 that when \(\alpha >\frac{1-\epsilon _*}{2}\),

$$\begin{aligned} P_\omega \big (\kappa _M=k'-1\,\vert \,\xi _{M}\big )&= P_\omega \big (\xi _{k'}>0,\dots ,\xi _{M-1}>0\,\vert \,\xi _{M}\big )\\&=\prod _{k=k'}^{M-1} P_\omega \big (\sigma _{k}=+\,\vert \,\sigma _{k+1}=+\big )\\&\le (1-c_0)^{M-k'}. \end{aligned}$$

Since \(1-c_0<1\) (use e.g. \(1-x\le e^{-x}\)), the right-hand side tends to zero as \(M\rightarrow \infty \), which proves (53). \(\square \)

4.7 Proof of Theorem 2

The proof of uniqueness when \(\varphi \) is Lipschitz at the origin will be based on the same principle used when proving item (2) of Theorem 1.

Proposition 5

Let \({\mathbb {P}}=Q\otimes P_\omega \) be the distribution of any process specified by \(g\). Under the hypotheses of Theorem 2, for \(Q\)-almost all environment \(\omega \), and for all large enough \(k'\),

$$\begin{aligned} \lim _{M\rightarrow \infty }E_\omega [\xi _{k'}\,\vert \,\xi _{M}]=0\quad P_\omega \text {-almost surely}. \end{aligned}$$

We consider an environment \(\omega \) with \(K=K(\omega )<\infty \). We take \(k'>K\), and \(M>k'\). As before, the proof is based on showing that whatever the sign of \(\xi _{M}\), the sequence \(\xi _{k}\) has a positive probability of having changed sign before reaching \(k'\).

We will look at the variables \(\xi _{k}\) along every second scale, \(\xi _{M},\xi _{M-2},\dots \), and show that the probability of \(\xi \) changing sign between two scales \(k\) and \(k-2\) is bounded away from zero.

Lemma 6

Let \(\omega \in \{\infty \rightarrow k'\}\) with \(k'\) large enough. If \(\varphi \) satisfies (15), then for all \(\alpha >0\), and all \(k\ge k'+2\),

$$\begin{aligned} P_\omega \bigl (\xi _{k-2}\le 0\,\vert \,\xi _{k}\bigr )\ge \tfrac{c_1}{2}>0, \end{aligned}$$

where \(c_1\) is a universal constant.

Proof

Let \(\gamma >0\) be such that (15) holds. If \(\xi _{k}\le 0\), then attractiveness gives \(P_\omega \bigl (\xi _{k-2}\le 0\,\vert \,\xi _{k}\bigr )\ge \tfrac{1}{2}\). If \(\xi _{k}=x>0\), let us denote \(P_\omega ^x(\cdot ):=P_\omega (\cdot \,\vert \,\xi _{k}=x)\). We have

$$\begin{aligned} P_\omega ^x\bigl ( \xi _{k-2}\le 0\bigr )&=E_\omega ^x\bigl [P_\omega ^x\bigl (\xi _{k-2}\le 0\,\vert \,\xi _{k-1}\bigr )\bigr ]\\&\ge E_\omega ^x\bigl [P_\omega ^x\bigl (\xi _{k-2}\le 0\,\vert \,\xi _{k-1}\bigr )1_{\{|\xi _{k-1}|\le h_{k-1}^m\}}\bigr ], \end{aligned}$$

where \(m>0\) is chosen such that

$$\begin{aligned} \alpha >\frac{1-\epsilon _*}{2(1+m\gamma )}. \end{aligned}$$
(60)

Again, by attractiveness, \(-h_{k-1}^m\le \xi _{k-1}\le 0\) implies

$$\begin{aligned} P_\omega ^x\bigl (\xi _{k-2}\le 0\,\vert \,\xi _{k-1}\bigr )\ge \tfrac{1}{2}. \end{aligned}$$

When \(0\le \xi _{k-1}\le h_{k-1}^m\), we let

$$\begin{aligned} \overline{X}_s:=\frac{X_s-E_\omega ^x[X_s|\xi _{k-1}]}{\sqrt{\mathrm {Var}_\omega ^x(X_s\,\vert \,\xi _{k-1})}}, \end{aligned}$$

which are centered with variance \(1\). Then

$$\begin{aligned}&P_\omega ^x\bigl ( \xi _{k-2}\le 0\,\vert \,\xi _{k-1}\bigr ) \\&\quad =P_\omega ^x\left( \tfrac{1}{\sqrt{|{{{\mathcal {A}}}}(B^{k-2}(0)) |}}\sum _{s\in {{{\mathcal {A}}}}(B^{k-2}(0))}\overline{X}_s\le -\tfrac{E_\omega ^x[X_s\,\vert \,\xi _{k-1}]\sqrt{|{{{\mathcal {A}}}}(B^{k-2}(0)) |}}{\sqrt{\mathrm {Var}_\omega ^x(X_s\,\vert \,\xi _{k-1})}} \Bigm \vert \xi _{k-1}\right) . \end{aligned}$$

By (15), there exist \(0<\lambda <\infty \) and \(\delta >0\) such that \(\varphi (y)\le \lambda y^\gamma \) for all \(0\le y\le \delta \). Therefore, if \(k\) is large enough so that \(h_{k-1}^m\le \delta \),

$$\begin{aligned} 0\le E_\omega ^x[X_s \,\vert \,\xi _{k-1}] =h_{k-1}\varphi (\xi _{k-1}) \le \lambda h_{k-1}\xi _{k-1}^\gamma \le \lambda h_{k-1}^{1+m\gamma }. \end{aligned}$$

Then, since \(B^{k-2}(0)\) is good,

$$\begin{aligned}\frac{E_\omega ^x[X_s\,\vert \,\xi _{k-1}]\sqrt{|{{{\mathcal {A}}}}(B^{k-2}(0)) |}}{\sqrt{\mathrm {Var}_\omega ^x(X_s\,\vert \,\xi _{k-1})}} \le \frac{\lambda h_{k-1}^{1+m\gamma }\sqrt{n_2(k-2)}}{\sqrt{1-h_{k-1}^2}}. \end{aligned}$$

But the dominant term in this last expression is \(\beta _{k-2}^{-(\alpha (1+m\gamma )-(1-\epsilon _*)/2)}\), which tends to zero since (60) holds. Therefore, taking \(k\) large enough,

$$\begin{aligned} P_\omega ^x\bigl ( \xi _{k-2}\le 0\,\vert \,\xi _{k-1}\bigr )&\ge P_\omega ^x\left( \tfrac{1}{\sqrt{|{{{\mathcal {A}}}}(B^{k-2}(0)) |}}\sum _{s\in {{{\mathcal {A}}}}(B^{k-2}(0))}\overline{X}_s\le -1\Bigm \vert \xi _{k-1}\right) \\&\ge \tfrac{1}{2}\cdot \tfrac{1}{\sqrt{2\pi }} \int _{-\infty }^{-1}e^{-x^2/2}\,dx\equiv c_1. \end{aligned}$$

It remains to study \(P_\omega ^x(|\xi _{k-1}|\le h_{k-1}^m)\). We have \(E^x_\omega [\xi _{k-1}]=h_{k}\varphi (x)\), so if \(k\) is such that \(h_{k}\le h_{k-1}^m/2\) then, using Chebyshev’s inequality,

$$\begin{aligned} P_\omega ^x(|\xi _{k-1}|\le h_{k-1}^m)&\ge P_\omega ^x(|\xi _{k-1}-h_{k}\varphi (x)|\le h_{k-1}^m/2)\\&\ge 1-\tfrac{4}{|{{{\mathcal {A}}}}(B^{k-1}(0)) |h_{k-1}^{2m}} \ge 1-\tfrac{4}{n_1(k-1)h_{k-1}^{2m}}, \end{aligned}$$

which is \(\ge \tfrac{1}{2}\) when \(k\) is large enough. Combining the two bounds, \(P_\omega ^x(\xi _{k-2}\le 0)\ge c_1\, P_\omega ^x(|\xi _{k-1}|\le h_{k-1}^m)\ge \tfrac{c_1}{2}\). \(\square \)

Proof of Proposition 5

The proof is the same as the one of Proposition 4. If \(\xi _{M}>0\), define \(\kappa _M\) as in (58). Assuming for simplicity that \(M-k'\) is even, Lemma 6 gives

$$\begin{aligned} P_\omega \bigl (\kappa _M=k'-1\bigm \vert \xi _{M}\bigr )&\le P_\omega \bigl (\xi _{k'}> 0,\xi _{k'+2}>0,\ldots ,\xi _{M-2}>0\,\vert \,\xi _{M}\bigr )\\&\le \bigl (1-\tfrac{c_1}{2}\bigr )^{(M-k')/2}. \end{aligned}$$

Since the right-hand side tends to zero as \(M\rightarrow \infty \), the claim follows. \(\square \)

Remark 6

Proceeding as in the proof of Proposition 3, we show that it is possible to obtain non-uniqueness with a \(\varphi \) continuous at the origin, and the sequence \(h_k\) as in (14) with \(\alpha <\frac{1-\epsilon _*}{2}\) (as mentioned after Theorem 1, \(\alpha >\frac{1-\epsilon _*}{2}\) always leads to uniqueness, for any function \(\varphi \)). Namely, fix \(\omega \in \{\infty \rightarrow k'\}\). We want to define a sequence \(\delta _k\searrow 0\) so that the probability

$$\begin{aligned} P_\omega ^{+,N}\big (\xi _{k'}> 0\big )&\ge P_\omega ^{+,N}\big (\xi _{k'}\ge \delta _{k'},\ldots ,\xi _M\ge \delta _M\big ) \\&\ge \prod _{k={k'}}^{M-1}P_\omega ^{+,N}\big (\xi _k\ge \delta _k\,\vert \,\xi _{k+1}=\delta _{k+1}\big )\, \end{aligned}$$

is close to one. The majority rule \(\varphi \) will be defined appropriately along the sequence \(\delta _k\), then extended to \([0,1]\) by linear interpolation, so as to be continuous at the origin and to satisfy \(\varphi (0)=0\). Finally, \(\varphi \) is extended to \([-1,0]\) so as to be odd. Conditioned on \(\{\xi _{k+1}=\delta _{k+1}\}\), the variables \(X_s\), \(s\in {{{\mathcal {A}}}}(B^k(0))\), are i.i.d. with expectation

$$\begin{aligned} E_\omega [X_s\,\vert \,\xi _{k+1}=\delta _{k+1}] = h_{k+1}\varphi (\delta _{k+1})\equiv \tau _k. \end{aligned}$$

If \(\delta _k < \tau _k\) for all \(k\), then

$$\begin{aligned} P_\omega ^{+,N}\big (\xi _k\ge \delta _k\,\vert \,\xi _{k+1}=\delta _{k+1}\big )&\ge P_\omega ^{+,N}\big (\delta _k \le \xi _k\le (2\tau _k-\delta _k) \,\vert \,\xi _{k+1}=\delta _{k+1}\big )\\&= P_\omega ^{+,N}\big (|\xi _k-\tau _k|\le (\tau _k-\delta _k)\bigm \vert \xi _{k+1}=\delta _{k+1}\big ). \end{aligned}$$

Using Chebyshev’s inequality,

$$\begin{aligned} P_\omega ^{+,N}\big (|\xi _k-\tau _k|\le (\tau _k-\delta _k)\bigm \vert \xi _{k+1}=\delta _{k+1}\big )&\ge 1-\frac{1}{n_1(k)(\tau _k-\delta _k)^2}. \end{aligned}$$

By taking for example

$$\begin{aligned} \delta _k:=\beta _{k+1}^{-\alpha },\quad \varphi (\beta _{k+1}^{-\alpha }):=\beta _{k-2}^{-\alpha }, \end{aligned}$$

we get, since \(\alpha <\frac{1-\epsilon _*}{2}\),

$$\begin{aligned} P_\omega ^{+,N}\big (\xi _{k'}> 0\big )&\ge \prod _{k={k'}}^{M-1} \Bigl \{ 1-2\beta _{k}^{-((1-\epsilon _*)-2\alpha )}\beta _{k-1}^{2\alpha } \Bigr \} \\&\ge \exp \Bigl \{ -\sum _{k={k'}}^{\infty } \beta _{k}^{-((1-\epsilon _*)-2\alpha )}\beta _{k-1}^{2\alpha } \Bigr \}. \end{aligned}$$
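Here is the power count behind the last display (a sketch; constants are not tracked). With the choices above, \(\tau _k=h_{k+1}\varphi (\delta _{k+1})=\beta _k^{-\alpha }\beta _{k-1}^{-\alpha }\), and assuming \(\beta _k\) grows fast enough that \(\delta _k\le \tau _k/2\),

$$\begin{aligned} \frac{1}{n_1(k)(\tau _k-\delta _k)^2}\le \frac{4}{n_1(k)\tau _k^2} \asymp \beta _k^{-((1-\epsilon _*)-2\alpha )}\beta _{k-1}^{2\alpha }, \end{aligned}$$

which, for rapidly growing \(\beta _k\), is summable in \(k\) when \(\alpha <\frac{1-\epsilon _*}{2}\).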

As a consequence,

$$\begin{aligned} \lim _{k'\rightarrow \infty }P_\omega ^{+}\big (\xi _{k'}> 0\big )=1. \end{aligned}$$

Of course, in contrast with (15), the majority rule constructed above satisfies

$$\begin{aligned} \limsup _{z\rightarrow 0^+}\frac{\varphi (z)}{z^\gamma }=\infty ,\quad \quad \text { for all }\gamma >0. \end{aligned}$$

5 Proof of Theorem 3

The proof of uniqueness when \(\varphi \) is Lipschitz, for an arbitrary sequence \(h_{k}\), will be based on the same principle used when proving item (2) of Theorem 1: showing that for all large enough \(k\), the distribution of \(\xi _{k}\) under \(P_\omega (\cdot \,\vert \,\xi _{M})\) converges to a symmetric distribution when \(M\rightarrow \infty \):

Proposition 6

Assume \(\varphi \) is Lipschitz in a neighborhood of the origin. Let \(h_{k}\searrow 0\) be an arbitrary sequence. Let \({\mathbb {P}}=Q\otimes P_\omega \) be the distribution of any process specified by \(g\). Then for \(Q\)-almost all environment \(\omega \), for all large enough \(k'\),

$$\begin{aligned} \lim _{M\rightarrow \infty }E_\omega \bigl [\xi _{k'}\bigm \vert \xi _{M}\bigr ]= 0\,\quad P_\omega \text {-almost surely.} \end{aligned}$$

To understand why Lipschitzness near the origin implies uniqueness regardless of the details of the sequence \(h_{k}\), we first consider a particular case.

Assume \(\varphi \) is globally linear with slope \(1\):

$$\begin{aligned} \varphi _{ID}(z):=z\,\quad \forall z\in [-1,1]. \end{aligned}$$

Let \(\omega \in \{\infty \rightarrow k'\}\), and take \(k> k'\). Then

$$\begin{aligned} E_\omega [\xi _{k}] =E_\omega \bigl [E_\omega [\xi _{k}\,\vert \,\xi _{k+1}]\bigr ] =E_\omega \bigl [h_{k+1}\varphi _{ID}(\xi _{k+1})\bigr ] =h_{k+1}E_\omega [\xi _{k+1}]. \end{aligned}$$

Repeating this procedure we get, for all \(L\ge 1\),

$$\begin{aligned} E_\omega [\xi _{k}]= \Bigl \{\prod _{j=k+1}^{k+L}h_{j}\Bigr \}E_\omega [\xi _{k+L}]. \end{aligned}$$
(61)

Taking \(L\rightarrow \infty \) gives \(E_\omega [\xi _{k}]=0\), since \(|E_\omega [\xi _{k+L}]|\le 1\) and the product vanishes in the limit (\(h_j\searrow 0\)).

The proof of Proposition 6 exploits this phenomenon, which clearly does not depend on the precise values of the sequence \(h_{k}\). We first consider the case where the Lipschitzness of \(\varphi \) is global:

Lemma 7

Assume that \({\widetilde{\varphi }}\) is \(1\)-Lipschitz on \([-1,1]:\)

$$\begin{aligned} |{\widetilde{\varphi }}(z_2)-{\widetilde{\varphi }}(z_1)|\le |z_2-z_1|\quad \forall z_1,z_2\in [-1,1]. \end{aligned}$$

Let \({\mathbb {P}}=Q\otimes P_\omega \) be the distribution of any process specified by the \(g\)-function associated to \({\widetilde{\varphi }}\) and to some sequence \({\tilde{h}}_{k}\searrow 0\). Then for \(Q\)-almost all environment \(\omega \), for all large enough \(k'\),

$$\begin{aligned} \lim _{M\rightarrow \infty }E_\omega \bigl [{\tilde{\xi }}_{k'}\bigm \vert {\tilde{\xi }}_{M}\bigr ]= 0\,\quad P_\omega \text {-almost surely.} \end{aligned}$$

Proof of Proposition 6

Let \(\delta >0\) and \(\lambda >0\) be such that \(0\le \varphi (z_2)-\varphi (z_1)\le \lambda (z_2-z_1)\) for all \(-\delta \le z_1\le z_2\le \delta \). We define a function \({\widetilde{\varphi }}\) that satisfies the conditions of Lemma 7:

$$\begin{aligned} {\widetilde{\varphi }}(z):=\tfrac{1}{\lambda }\phi (z), \end{aligned}$$

where

$$\begin{aligned} \phi (z):={\left\{ \begin{array}{ll} \lambda (z+\delta )+\varphi (-\delta ) &{}\text { if }z\in [-1,-\delta ],\\ \varphi (z) &{}\text { if }z\in [-\delta ,\delta ],\\ \lambda (z-\delta )+\varphi (\delta ) &{}\text { if }z\in [\delta ,1]. \end{array}\right. } \end{aligned}$$
(62)
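One checks directly that \({\widetilde{\varphi }}\) indeed satisfies the conditions of Lemma 7: outside \([-\delta ,\delta ]\) it is affine with slope exactly \(1\), while for \(-\delta \le z_1\le z_2\le \delta \),

$$\begin{aligned} 0\le {\widetilde{\varphi }}(z_2)-{\widetilde{\varphi }}(z_1) =\tfrac{1}{\lambda }\bigl (\varphi (z_2)-\varphi (z_1)\bigr )\le z_2-z_1. \end{aligned}$$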

Assume \(\omega \in \{\infty \rightarrow k'\}\) for some large \(k'\). We fix \(M>k'\) large, and to study \(E_\omega [\xi _{k'}\,\vert \,\xi _{M}]\) we construct the sequence \(\xi _{k}\) for \(k\) decreasing from \(M\) to \(k'\), coupled to another sequence \({\tilde{\xi }}_{k}\); \(\xi _{k}\) will have its transition probability fixed by \(\varphi \) and \(h_{k}\), and \({\tilde{\xi }}_k\) will have its transition probability fixed by \({\widetilde{\varphi }}\) and

$$\begin{aligned} {\tilde{h}}_{k}:=\lambda h_{k}. \end{aligned}$$

Since \(\lambda \) may be large, we may need to take \(k'\) large enough so that \({\tilde{h}}_{k}\le 1/2\) for all \(k\ge k'\). The processes \(\xi _{k}\) and \({\tilde{\xi }}_{k}\) will be constructed along with a sequence \(\gamma _{k}\in \{0,1\}\): \(\gamma _{k}=1\) means that \(\xi _{k}\) and \({\tilde{\xi }}_{k}\) are still coupled; \(\gamma _{k}=0\) means they have already decoupled.

For simplicity, we will continue denoting the coupling measure by \(P_\omega \). The construction is illustrated in the figure below. We start with some fixed \(\xi _{M}\). By (51), we can assume that \(M\) is large enough to guarantee that \(|\xi _{M}|\le \delta \). We then let \({\tilde{\xi }}_{M}:=\xi _{M}\) and \(\gamma _{M}:=1\).

Given \((\gamma _{k+1},{\tilde{\xi }}_{k+1},\xi _{k+1})\), \((\gamma _{k},{\tilde{\xi }}_{k},\xi _{k})\) is constructed as follows:

  1. (1)

    Sample \({\tilde{\xi }}_{k}\) as an average of variables \(\{\widetilde{X}_s=\pm ,s\in {{{\mathcal {A}}}}(B^{k}(0))\}\), i.i.d. with

    $$\begin{aligned} E_\omega [\widetilde{X}_s\,\vert \,{\tilde{\xi }}_{k+1}]={\tilde{h}}_{k+1}{\widetilde{\varphi }}({\tilde{\xi }}_{k+1}).\end{aligned}$$
  2. (2)

    If \(\gamma _{k+1}=1\), set \(\xi _{k}:={\tilde{\xi }}_{k}\) and \(\gamma _{k}:=1_{\{|\xi _{k}|\le \delta \}}\).

  3. (3)

    If \(\gamma _{k+1}=0\), set \(\gamma _{k}:=0\), and sample \(\xi _{k}\) as an average of variables \(\{X_s=\pm ,s\in {{{\mathcal {A}}}}(B^{k}(0))\}\), i.i.d. with

    $$\begin{aligned} E_\omega [X_s\,\vert \,\xi _{k+1}]=h_{k+1}\varphi (\xi _{k+1}). \end{aligned}$$

By construction, \((\gamma _{k},{\tilde{\xi }}_{k},\xi _{k})\) is a Markov chain, and \(\xi _{k}\) has the proper marginals. Namely, if \(\gamma _{k+1}=0\) then

$$\begin{aligned} E_\omega [\xi _{k}\,\vert \,\xi _{k+1}]=h_{k+1}\varphi (\xi _{k+1}), \end{aligned}$$

and if \(\gamma _{k+1}=1\), then

$$\begin{aligned} E_\omega [\xi _{k}\,\vert \,\xi _{k+1}]= E_\omega [{\tilde{\xi }}_{k}\,\vert \,{\tilde{\xi }}_{k+1}]&={\tilde{h}}_{k+1}{\widetilde{\varphi }}({\tilde{\xi }}_{k+1})\\&={\tilde{h}}_{k+1}{\widetilde{\varphi }}(\xi _{k+1})\\&=h_{k+1}\varphi (\xi _{k+1}), \end{aligned}$$

where in the last line we used that \(|\xi _{k+1}|\le \delta \).

[Figure d: the coupled sequences \(\xi _k\) and \({\tilde{\xi }}_k\), which coincide while \(\gamma _k=1\) and decouple once \(|\xi _k|>\delta \).]
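To make the three-step construction concrete, here is a minimal sketch in Python (an illustration, not part of the proof). The callables `n`, `h`, `ht`, `phi`, `phi_t` and the threshold `delta` are illustrative stand-ins for \(|{{{\mathcal {A}}}}(B^k(0))|\), \(h_k\), \({\tilde{h}}_k\), \(\varphi \), \({\widetilde{\varphi }}\) and \(\delta \):

```python
import numpy as np

rng = np.random.default_rng(0)

def avg_of_spins(mean, n):
    """Average of n i.i.d. +/-1 variables with the given mean."""
    plus = rng.binomial(n, 0.5 * (1.0 + mean))
    return (2 * plus - n) / n

def coupled_step(gamma, xi_t, xi, k, n, h, ht, phi, phi_t, delta):
    # Step (1): sample tilde-xi_k from tilde-xi_{k+1}.
    xi_t_new = avg_of_spins(ht(k + 1) * phi_t(xi_t), n(k))
    if gamma == 1:
        # Step (2): still coupled -> copy; decouple if |xi_k| > delta.
        xi_new = xi_t_new
        gamma_new = 1 if abs(xi_new) <= delta else 0
    else:
        # Step (3): already decoupled -> sample xi_k on its own.
        xi_new = avg_of_spins(h(k + 1) * phi(xi), n(k))
        gamma_new = 0
    return gamma_new, xi_t_new, xi_new
```

The key design point is that while \(\gamma _k=1\) the two chains share the same randomness, so they coincide exactly until the first scale at which \(|\xi _k|>\delta \).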

Let \(D\) be the scale at which decoupling occurs:

$$\begin{aligned} D:=\max \bigl \{k:M> k\ge k', \gamma _{k}=0\bigr \}. \end{aligned}$$

Clearly, \(D\) is a stopping time for the chain \((\gamma _{k},{\tilde{\xi }}_{k},\xi _{k})\), and for all \(M\ge k>D\) we have \(\xi _{k}={\tilde{\xi }}_{k}\) and \(|\xi _{k}|=|{\tilde{\xi }}_{k}|\le \delta \).

Fix \(k'<L<M\). On the one hand, proceeding as in (50),

$$\begin{aligned} \bigl |E_\omega \bigl [\xi _{k'}, D\ge L\,\vert \,\xi _{M}\bigr ] \bigr |&\le P_\omega (D\ge L\,\vert \,\xi _{M})\\&\le \sum _{k\ge L}P_\omega (|\xi _{k}|\ge \delta \,\vert \,\xi _{M})\le \sum _{k \ge L}e^{-c n_1(k)}. \end{aligned}$$

On the other hand, \(D<L\) implies \(\xi _{L}={\tilde{\xi }}_{L}\) and so

$$\begin{aligned} E_\omega \bigl [\xi _{k'}, D< L\,\vert \,\xi _{M}\bigr ]&= E_\omega \Bigl [E_\omega \bigl [\xi _{k'}\,\vert \,\xi _{L}\bigr ] 1_{\{D<L\}}\Bigm \vert \xi _{M}\Bigr ]\\&= E_\omega \Bigl [f_\omega ({\tilde{\xi }}_{L}) 1_{\{D<L\}}\Bigm \vert \xi _{M}\Bigr ]\\&= E_\omega \bigl [f_\omega ({\tilde{\xi }}_{L})\bigm \vert \xi _{M}\bigr ] +O\bigl (P_\omega (D\ge L\,\vert \,\xi _{M})\bigr ), \end{aligned}$$

where \(f_\omega (x):=E_\omega [\xi _{k'}\,\vert \,\xi _{L}=x]\). Now, the construction of \({\tilde{\xi }}\) was based on \({\widetilde{\varphi }}\), so by Lemma 7,

$$\begin{aligned} E_\omega [{\tilde{\xi }}_{L}\,\vert \,{\tilde{\xi }}_{M}]\rightarrow 0\quad \text { when }M\rightarrow \infty . \end{aligned}$$

Since \({\tilde{\xi }}_{M}:=\xi _{M}\), the law of \({\tilde{\xi }}_{L}\) under \(P_\omega (\cdot \,\vert \,\xi _{M})\) becomes symmetric in the limit \(M\rightarrow \infty \). But since \(f_\omega (-x)=-f_\omega (x)\), this implies

$$\begin{aligned} E_\omega \bigl [f_\omega ({\tilde{\xi }}_{L})\bigm \vert \xi _{M}\bigr ] \rightarrow 0\quad \text { as }M\rightarrow \infty . \end{aligned}$$

We have thus shown that for all \(L>k'\),

$$\begin{aligned} \limsup _{M\rightarrow \infty }\bigl |{E_\omega [\xi _{k'}\,\vert \,\xi _{M}]}\bigr |\le 2\sum _{k\ge L}e^{-cn_1(k)}. \end{aligned}$$

Letting \(L\rightarrow \infty \) concludes the proof. \(\square \)

Proof of Lemma 7

We work with two different \(g\)-functions that have the same sequence \({\tilde{h}}_k\) but different majority rules. The first, \({\widetilde{g}}\), is associated to \({\widetilde{\varphi }}\), which is \(1\)-Lipschitz on \([-1,1]\). The second, \({\overline{g}}\), is associated to \({\overline{\varphi }}:=\varphi _{ID}\):

[Figure e: the two majority rules \({\widetilde{\varphi }}\) (\(1\)-Lipschitz) and \({\overline{\varphi }}=\varphi _{ID}\).]

Uniformly in \(z_1<z_2\),

$$\begin{aligned} 0\le {\widetilde{\varphi }}(z_2)-{\widetilde{\varphi }}(z_1)\le z_2-z_1\equiv {\overline{\varphi }}(z_2)-{\overline{\varphi }}(z_1). \end{aligned}$$
(63)

Let \({\widetilde{X}}\) (resp. \({\overline{X}}\)) denote the process associated to \({\widetilde{g}}\) (resp. \({\overline{g}}\)). Using attractiveness and the notations of Sect. 4.2,

$$\begin{aligned} E_\omega ^{-,N}[{\tilde{\xi }}_{k'}]\le E_\omega [{\tilde{\xi }}_{k'}\,\vert \,{\tilde{\xi }}_M]\le E_\omega ^{+,N}[{\tilde{\xi }}_{k'}], \end{aligned}$$

where \(N\) is chosen appropriately as a function of \(M\). Since \({\tilde{\xi }}_{k'}\) is an average of identically distributed variables \({\widetilde{X}}_s\), our aim will be to show that when \(M\rightarrow \infty \),

$$\begin{aligned} 0\le E_\omega ^{+,N}[{\tilde{\xi }}_{k'}]-E_\omega ^{-,N}[{\tilde{\xi }}_{k'}] =E_\omega ^{+,N}[{\widetilde{X}}_s]-E_\omega ^{-,N}[{\widetilde{X}}_s]\rightarrow 0. \end{aligned}$$

To bound this last difference, consider the coupling of \(E_\omega ^{+,N}\) and \(E_\omega ^{-,N}\) described in Sect. 4.2, which we here denote by \(\mathsf {E}_\omega ^{\pm ,N}\). Since that coupling is maximal,

$$\begin{aligned} E_\omega ^{+,N}[{\widetilde{X}}_s]-E_\omega ^{-,N}[{\widetilde{X}}_s] =2\mathsf {E}_\omega ^{\pm ,N}[1_{\{{\widetilde{\Delta }}_s=\left( {\begin{array}{c}+\\ -\end{array}}\right) \}}]. \end{aligned}$$
(64)
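Identity (64) is the standard one for monotone couplings: under \(\mathsf {E}_\omega ^{\pm ,N}\) one has \({\widetilde{X}}^2_s\ge {\widetilde{X}}^1_s\) almost surely, so that \({\widetilde{X}}^2_s-{\widetilde{X}}^1_s\in \{0,2\}\) and

$$\begin{aligned} E_\omega ^{+,N}[{\widetilde{X}}_s]-E_\omega ^{-,N}[{\widetilde{X}}_s] =\mathsf {E}_\omega ^{\pm ,N}\bigl [{\widetilde{X}}^2_s-{\widetilde{X}}^1_s\bigr ] =2\,\mathsf {E}_\omega ^{\pm ,N}\bigl [1_{\{{\widetilde{\Delta }}_s=\left( {\begin{array}{c}+\\ -\end{array}}\right) \}}\bigr ]. \end{aligned}$$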

We will now use (63) to further couple the pair processes \({\widetilde{\Delta }}_s=\left( {\begin{array}{c}{\widetilde{X}}^2_s\\ {\widetilde{X}}^1_s\end{array}}\right) \) and \({\overline{\Delta }}_s=\left( {\begin{array}{c}{\overline{X}}^2_s\\ {\overline{X}}^1_s\end{array}}\right) \). This coupling will contain the four processes associated to \({\widetilde{g}}\) and \({\overline{g}}\), with boundary conditions \(+\) and \(-\). The coupling will be such that there are more discrepancies between \({\overline{X}}^2\) and \({\overline{X}}^1\) than between \({\widetilde{X}}^2\) and \({\widetilde{X}}^1\), in the following sense:

$$\begin{aligned} 1_{\{{\widetilde{\Delta }}_s=\left( {\begin{array}{c}+\\ -\end{array}}\right) \}} \le 1_{\{{\overline{\Delta }}_s=\left( {\begin{array}{c}+\\ -\end{array}}\right) \}} \quad a.s. \end{aligned}$$
(65)

By definition, when \(s\le -N\), \({\widetilde{\Delta }}_s={\overline{\Delta }}_s=\left( {\begin{array}{c}+\\ -\end{array}}\right) \). Let \(U_t\), \(t>-N\), be an i.i.d. sequence of random variables, each uniform on \([0,1]\). Assume all pairs \({\widetilde{\Delta }}_s=\left( {\begin{array}{c}{\widetilde{x}}^2\\ {\widetilde{x}}^1\end{array}}\right) \) and \({\overline{\Delta }}_s=\left( {\begin{array}{c}{\overline{x}}^2\\ {\overline{x}}^1\end{array}}\right) \) have been sampled for all \(s<t\), and that (65) holds for all \(s<t\). Let \({\widetilde{\Delta }}_t\) (resp. \({\overline{\Delta }}_t\)) be defined as in (41), in which \(A_t, B_t,C_t\) are replaced by the corresponding \({\widetilde{A}}_t, {\widetilde{B}}_t,{\widetilde{C}}_t\) (resp. \({\overline{A}}_t, {\overline{B}}_t,{\overline{C}}_t\)). (Note that the same variable \(U_t\) is used to define both \({\widetilde{\Delta }}_t\) and \({\overline{\Delta }}_t\).) Then \({\widetilde{\Delta }}_t\) and \({\overline{\Delta }}_t\) clearly have the correct distribution. To verify that (65) holds at time \(t\), first recall that

$$\begin{aligned} {\overline{A}}_t&= \bigl \{0\le U_t<{\overline{g}}_t^\omega (+\,\vert \,({\overline{x}}^2)_{-\infty }^{t-1} )-{\overline{g}}_t^\omega (+\,\vert \,({\overline{x}}^1)_{-\infty }^{t-1})\bigr \},\\ {\widetilde{A}}_t&=\bigl \{0\le U_t<{\widetilde{g}}_t^\omega (+\,\vert \,({\widetilde{x}}^2)_{-\infty }^{t-1} )-{\widetilde{g}}_t^\omega (+\,\vert \,({\widetilde{x}}^1)_{-\infty }^{t-1})\bigr \}. \end{aligned}$$

Using the fact that \({\overline{X}}\) has more discrepancies than \({\widetilde{X}}\), and (63),

$$\begin{aligned} {\overline{g}}_t^\omega (+\,\vert \,({\overline{x}}^2)_{-\infty }^{t-1})-{\overline{g}}_t^\omega (+\,\vert \,({\overline{x}}^1)_{-\infty }^{t-1})&=\tfrac{{\tilde{h}}_{k_t}}{2}\Bigl \{ {\overline{\varphi }}\Bigl ( \tfrac{1}{|S_t|}\sum _{s\in S_t}{\overline{x}}^2_s \Bigr )- {\overline{\varphi }}\Bigl ( \tfrac{1}{|S_t|}\sum _{s\in S_t}{\overline{x}}^1_s \Bigr ) \Bigr \}\\&\ge \tfrac{{\tilde{h}}_{k_t}}{2}\Bigl \{ {\overline{\varphi }}\Bigl ( \tfrac{1}{|S_t|}\sum _{s\in S_t}{\widetilde{x}}^2_s \Bigr )- {\overline{\varphi }}\Bigl ( \tfrac{1}{|S_t|}\sum _{s\in S_t}{\widetilde{x}}^1_s \Bigr ) \Bigr \}\\&\ge \tfrac{{\tilde{h}}_{k_t}}{2}\Bigl \{ {\widetilde{\varphi }}\Bigl ( \tfrac{1}{|S_t|}\sum _{s\in S_t}{\widetilde{x}}^2_s \Bigr )- {\widetilde{\varphi }}\Bigl ( \tfrac{1}{|S_t|}\sum _{s\in S_t}{\widetilde{x}}^1_s \Bigr ) \Bigr \}\\&= {\widetilde{g}}_t^\omega (+\,\vert \,({\widetilde{x}}^2)_{-\infty }^{t-1})-{\widetilde{g}}_t^\omega (+\,\vert \,({\widetilde{x}}^1)_{-\infty }^{t-1}), \end{aligned}$$

which implies \({\widetilde{A}}_t\subset {\overline{A}}_t\) almost surely.
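The mechanism behind \({\widetilde{A}}_t\subset {\overline{A}}_t\) is elementary: if two \(\pm \) spins are sampled from a common uniform \(U\) via the rule \(X=+\) iff \(U<g\), then a \(\left( {\begin{array}{c}+\\ -\end{array}}\right) \) discrepancy occurs exactly when \(U\) falls between the two \(g\)-values, so a larger gap produces more discrepancies. A minimal sketch (the numerical values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)

# Two +/-1 spins sampled from a common uniform U: X^i = + iff U < g_i.
# A (+,-) discrepancy occurs iff g_1 <= U < g_2, with probability g_2 - g_1.
def discrepancy_rate(g1, g2, trials=100_000):
    u = rng.random(trials)
    return np.mean((u >= g1) & (u < g2))

print(discrepancy_rate(0.48, 0.52))  # ~0.04 = g_2 - g_1
```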

With (65) at hand, we go back to (64):

$$\begin{aligned} E_\omega ^{+,N}[{\widetilde{X}}_s]-E_\omega ^{-,N}[{\widetilde{X}}_s]&=2\mathsf {E}_\omega ^{\pm ,N}[1_{\{{\widetilde{\Delta }}_s=\left( {\begin{array}{c}+\\ -\end{array}}\right) \}}]\\&\le 2\mathsf {E}_\omega ^{\pm ,N}[1_{\{{\overline{\Delta }}_s=\left( {\begin{array}{c}+\\ -\end{array}}\right) \}}] =E_\omega ^{+,N}[{\overline{X}}_s]-E_\omega ^{-,N}[{\overline{X}}_s]. \end{aligned}$$

But since \({\overline{\varphi }}\) is purely linear, the explicit computation made at the beginning of the section can be repeated, giving

$$\begin{aligned} E_\omega ^{\pm ,N}[{\overline{X}}_s] =\Bigl \{ \prod _{k=k'+1}^M {\tilde{h}}_k \Bigr \}(\pm 1) \rightarrow 0\quad \text { when }M\rightarrow \infty . \end{aligned}$$

\(\square \)

6 Concluding remarks

The analysis of the model was possible due to the Markovian structure of the sequence \(\xi _{k}\), in particular to the relation (valid on \(\{\infty \rightarrow k\}\))

$$\begin{aligned} E_\omega [\xi _{k}]=h_{k+1}E_\omega [\varphi (\xi _{k+1})]. \end{aligned}$$
(66)

We will give a simple heuristic argument that might shed some light on the proofs given above, and on the role played by the behavior of \(\varphi \) near the origin.

A mean field approximation consists in assuming that \(\xi _{k+1}\) can be approximated by its mean:

$$\begin{aligned} \xi _{k+1}\simeq E_\omega [\xi _{k+1}] . \end{aligned}$$

This allows one to replace \( E_\omega [\varphi (\xi _{k+1})]\) by \(\varphi \bigl (E_\omega [\xi _{k+1}]\bigr ) \). This approximation is exact in precisely one case: when \(\varphi \) is purely linear.

With the mean field approximation, one can transform (66) into a deterministic toy model, in which \(\mu _{k}:=E_\omega [\xi _{k}]\) is a sequence satisfying the relation

$$\begin{aligned} \mu _{k}=h_{k+1}\varphi (\mu _{k+1}). \end{aligned}$$
(67)

We thus take some large integer \(M\), fix some initial condition \(\mu _{M}\), and study the sequence \(\mu _{M},\mu _{M-1},\dots ,\mu _{k_*}\). Since \(\varphi (0)=0\), \(0\) is always a fixed point of the dynamics. In the case of a purely linear majority rule, \(\varphi (z)=\lambda z\), the trajectory of the map \(\mu _{k+1}\mapsto \mu _k\) is always attracted towards the origin, independently of the initial condition. For example, if \(\mu _M>0\):

[Figure f: a trajectory of \(\mu _{k+1}\mapsto \mu _k\) attracted towards the origin (linear \(\varphi \)).]

The same qualitative behavior holds if \(\varphi '(0)<\infty \). Namely, when \(\varphi \) is Lipschitz near the origin, the coupling with a straight line has shown that the same phenomenon occurs: for large enough \(k\) and \(z>0\), the curve \(z\mapsto h_{k+1}\varphi (z)\) lies strictly below the identity \(z\mapsto z\), and any initial condition is again attracted towards the origin.

In the case of a pure majority rule [or when \(\varphi '(0)=\infty \), as in Remark 6], the mechanism changes: the trajectories of \(\mu _{k+1}\mapsto \mu _k=h_{k+1}\varphi _{PMR}(\mu _{k+1})\) are repelled away from the origin, with a sign that depends on the sign of the initial condition. For example, if \(\mu _M>0\):

[Figure g: a trajectory of \(\mu _{k+1}\mapsto \mu _k\) repelled away from the origin (pure majority rule).]
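The two pictures can be reproduced with a few lines of code. The following sketch iterates (67) downwards from \(k=M\); the sequence `h` is an illustrative choice, not the paper's actual \(h_k\), but any decreasing sequence gives the same qualitative contrast:

```python
import numpy as np

# Toy dynamics (67): mu_k = h_{k+1} * phi(mu_{k+1}), k decreasing from M.
def h(k):
    return 1.0 / (k + 1)                     # assumed sequence h_k -> 0

def trajectory(phi, M=60, k_star=5, mu_M=0.3):
    mu = mu_M
    for k in range(M - 1, k_star - 1, -1):   # k = M-1, M-2, ..., k_star
        mu = h(k + 1) * phi(mu)
    return mu

phi_lin = lambda z: z                        # purely linear rule
phi_pmr = lambda z: float(np.sign(z))        # pure majority rule

print("linear:", trajectory(phi_lin))        # ~0: attracted to the origin
print("PMR:   ", trajectory(phi_pmr))        # h(6) ~ 0.14: pushed back to h_{k+1} > 0
```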

This scenario was shown to hold for the BHS model, at least when \(\alpha \) is small enough. When \(\alpha \) is large, fluctuations allow \(\mu _k\) to change sign at any time, yielding uniqueness. The role of \(\varphi '(0)\) in the model of Bramson and Kalikow is currently under investigation.