1 Introduction

In physics, critical behaviour involves systems in which correlations decay as a power law with distance. It is an important topic in many areas of physics and can also be found in stochastic processes on graphs. Often, such systems have a parameter (e.g. temperature), and when it is set to a critical value the system exhibits critical behaviour. Power-series expansion techniques have been used in the physics literature to numerically approximate critical values and associated exponents. It has often been observed that the coefficients of such power series stabilize as the system size grows, and we provide a rigorous proof of this for a large class of stochastic processes.

Self-organized criticality refers to models that exhibit critical behaviour without the need to tune a parameter. This concept has been widely studied, see for example [24]. A simple model for evolution and self-organized criticality was proposed by Bak and Sneppen [2] in 1993. In this random process there are n vertices on a cycle, each representing a species. Every vertex has a fitness value in [0, 1] and the dynamics is defined as follows. At every time step, the vertex with the lowest fitness value is chosen, and that vertex together with its two neighbors is replaced by three independent uniform random samples from [0, 1]. The model exhibits self-organized criticality, as most of the fitness values automatically become distributed uniformly in \([f_c,1]\) for some critical value \(0<f_c<1\). This process has received a lot of attention [1, 7, 20, 21], and a discrete version of the process has been introduced in [5]. The model actually appeared earlier in [18] (“model 3”), although formulated in a different way, and it was also studied in [10] (“CP 3”). In the discrete Bak–Sneppen (DBS) process, the fitness values can only be 0 or 1. At every time step, choose a uniform random vertex with value 0 and replace it and its two neighbors by three independent values, which are 0 with probability p and 1 with probability \(1-p\). The DBS process has a phase transition with associated critical value \(p_c\) [4, 22].
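As a concrete illustration, the DBS update rule is easy to simulate. The following sketch (the function name is ours) runs the process on a cycle until the all-inactive state is reached and counts the updates, assuming the discrete-time dynamics described above:

```python
import random

def dbs_run(n, p, initial_active, rng):
    """Run the discrete Bak-Sneppen process on a cycle of n vertices
    until the all-inactive state is reached; return the number of updates."""
    active = set(initial_active)
    updates = 0
    while active:
        v = rng.choice(sorted(active))           # uniform random active vertex
        updates += 1
        # resample v and its two neighbors: each becomes active (0) w.p. p
        for u in ((v - 1) % n, v, (v + 1) % n):
            if rng.random() < p:
                active.add(u)
            else:
                active.discard(u)
    return updates
```

For \(p=0\) every resample deactivates the whole neighborhood, so each update strictly shrinks the active set and the run stops after at most as many updates as there are initially active vertices.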

The Bak–Sneppen process was originally described in the context of evolutionary biology, but its study has much broader consequences; e.g., the process was rediscovered in theoretical computer science [6] as well. In the study of the limits of a randomized algorithm for solving satisfiability, the discrete Bak–Sneppen process turned out to be a natural process to analyze.

The DBS process is closely related to the so-called contact process (CP), originally introduced in [11]. Sometimes referred to as the basic contact process, this process models the spreading of an epidemic on a graph where each vertex (an individual) can be healthy or infected. Infected individuals can become healthy (probability \(1-p\)), or infect a random neighbor (probability p). The contact process has also been studied in the context of interacting particle systems and many variants of it exist, such as a parity-preserving version [14] and a contact process that only infects in one direction [25]. Depending on the particular flavor of the processes, the CP and DBS processes are closely related [4] and in certain cases have the same critical values. The processes are similar in the sense that vertices can be active (fitness 0 or infected) or inactive (fitness 1 or healthy). The dynamics only update the state in the neighborhood of active vertices with a simple local update rule. In this article we consider a wide class of processes that fit this description, and our proofs are valid in this general setting. We will, however, focus on the DBS process when we present explicit examples.

In this paper we take a power-series approach and represent several probabilities and expectation values as power series in the parameter p. There is a wealth of physics literature on series analysis in the theory of critical phenomena, see for example [3, 12, 13] for an overview. Processes typically only have a critical point when the system size is infinite, but numerical simulations often only allow for probing of finite systems. Our main theorem proves, for our general class of processes, that one can extract coefficients of the power series for an arbitrarily large system by computing quantities in only a finite system. One can then apply series-analysis techniques to these coefficients of the large system. Series expansion techniques have been extensively used for variants of the contact process as well as for closely related directed percolation models [9, 14–17, 19, 25] in order to extract information about critical values and exponents. For example, in [25] the contact process on a line is studied where infection only happens in one direction. In [14] a process is studied where the parity of the number of active vertices is preserved. In both articles, the power series of the survival probability is computed up to 12 terms and used to find estimates for the critical values and exponents. However, in all this work the stabilization of coefficients has been observed but not proven.

Our Main Contribution is a definition of a general class of processes that encapsulates most of the above processes (Definition 1) and an in-depth understanding of the stabilization phenomenon, complete with a rigorous proof (Lemma 5, Theorem 1). The results are illustrated with examples.

Road Map In Sect. 1.1 we will provide two example power series that exhibit the stabilization phenomenon. In Sect. 1.2 we will sketch our results without going into technicalities and explain the intuition behind them, something that we call the Interaction Light Cone. In Sect. 2 we define our general class of processes in more detail and provide our theorems with their proofs. In Sect. 3 we apply our result to the DBS process, and we compute power-series coefficients for several quantities. As an application, we use the method of Padé approximants to extract an estimate for \(p_c\) and we estimate a critical exponent that suggests that the DBS process is in the directed percolation universality class.

1.1 Stabilization of Coefficients

There are different ways of defining the DBS process. These definitions are essentially equivalent and only differ slightly in their notion of time, but can be mapped to each other in a straightforward way. For example, one can pick a random vertex in each time step, and only perform an update when the vertex is active, but always count it as a time step. To study infinite-sized systems, one can consider a continuous-time version with exponential clocks at every vertex. Resampling of a vertex and its neighbors happens when the clock of the vertex rings and the vertex is active. When calculating time averages, the subtle differences in these definitions can lead to incorrect estimates and should not be overlooked in simulations.

Common to all definitions is that an update is applied if and only if the picked vertex was active. In order to treat the three variants equivalently we will count the number of updates instead of time steps. That is, we count the number of times an active vertex is selected to perform a local update (we count all such occasions, even if the update ends up not changing the actual state).

Numerical simulations clearly exhibit the phase transition in the DBS process when p goes from 0 to 1. There is some critical probability \(p_c\) such that for \(p < p_c\) the active vertices quickly die out and the system is pushed toward a state with no active vertices. However, for \(p > p_c\) the active vertices have the upper hand and dominate the system. This phase transition can clearly be seen in Fig. 1 for two different quantities: (a) the expected number of updates per vertex before reaching the all-inactive state on a cycle of length n, after initially activating the vertices i.i.d. randomly with probability p; (b) the probability that the end of a (non-periodic) chain eventually gets activated when the process is started with only one active vertex on the other end.

Fig. 1

a Plot of \(R_{(n)}(p)\), see (1), the expected number of updates per vertex before the all-inactive state is reached, for the DBS process on a cycle with n vertices. The process was started in a random initial state with each vertex activated independently with probability p. b Plot of \(S_{[n]}(p)\), see (2), the probability to ‘reach’ the other side of the system: the DBS process on a non-periodic chain of size n is started with a single active vertex at position 1 (denoted by \(\text {start }\{1\}\)) and we plot the probability that vertex n ever becomes active (denoted \(\mathrm {BA}^{(n)}\)) before the all-inactive state is reached. For \(n=5000\) the result was obtained with a Monte Carlo simulation. For smaller n, the results were computed symbolically. The inset shows a zoomed-in region of the Monte Carlo data, indicating that \(p_c \approx 0.635\)

Let us write these quantities as power series in p and in \(q=1-p\), respectively.

$$\begin{aligned} R_{(n)}(p)&:= \frac{1}{n}{\mathbb {E}}(\text {total updates} \mid \text {start i.i.d.}) = \sum _{k=0}^{\infty } a^{(n)}_k p^k, \end{aligned}$$
(1)
$$\begin{aligned} S_{[n]}(q)&:= {\mathbb {P}}( \text {vertex }n\text { becomes active} \mid \text {start } \{1\}) = \sum _{k=0}^{\infty } b^{[n]}_k q^k . \end{aligned}$$
(2)

We will study these functions in more detail in Sect. 3, where we show, amongst other things, that they are rational functions for each n. For example

$$\begin{aligned} R_{(4)}(p) = \frac{p(6-12p+10p^2-3p^3)}{6(1-p)^4} = \frac{(1-q)(1+q+q^2+3q^3)}{6q^4} . \end{aligned}$$
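Such closed forms can be checked by brute force: under our reading of the dynamics (discrete-time active sampling with an update counted at every selection), the expectations \({\mathbb {E}}(\text {total updates} \mid A)\) solve a linear system over the \(2^n\) states, which is easily solved numerically for small n. A sketch (function name ours):

```python
import itertools
import numpy as np

def expected_updates_per_vertex(n, p):
    """Solve E[total updates]/n for the DBS process on a cycle of length n,
    started from an i.i.d. Bernoulli(p) initial state (states are bitmasks)."""
    size = 1 << n
    M = np.eye(size)          # rows encode E[A] - sum_A' P(A -> A') E[A'] = 1
    b = np.zeros(size)        # row 0 stays E[empty] = 0
    for state in range(1, size):
        active = [v for v in range(n) if state >> v & 1]
        b[state] = 1.0        # one update is performed in this step
        for v in active:
            nbh = [(v - 1) % n, v, (v + 1) % n]
            cleared = state
            for u in nbh:
                cleared &= ~(1 << u)
            # resample the neighborhood i.i.d.: active with probability p
            for bits in itertools.product([0, 1], repeat=3):
                prob, nxt = 1.0, cleared
                for u, bit in zip(nbh, bits):
                    prob *= p if bit else 1 - p
                    if bit:
                        nxt |= 1 << u
                M[state, nxt] -= prob / len(active)
    E = np.linalg.solve(M, b)
    # average over the i.i.d. Bernoulli(p) initial distribution
    total = sum(p ** bin(s).count("1") * (1 - p) ** (n - bin(s).count("1")) * E[s]
                for s in range(size))
    return total / n
```

At \(p=0.3\), for instance, the numerical solution agrees with the closed form for \(R_{(4)}\) above to machine precision.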
Fig. 2

Plot of the function \(\vert S_{[6]}(p) \vert \), defined in Eq. (2), over the complex plane with \(p=0\) at the origin. The poles of the function are shown as red dots. The unit circle is shown in black, and the dashed green circles have radius \(p_c\) around the origin and radius \(1-p_c\) around \(p=1\) (Color figure online)

Although these quantities only have an operational meaning for \(p\in [0,1]\), we give a plot of such a function over the complex plane, see Fig. 2. The plot shows the poles of \(S_{[6]}(p)\), which seem to approach the value \(p_c\) on the real line (for larger n see Fig. 6). Similar phenomena can be observed for partition functions in statistical physics. The partition function is usually in the denominator of observable physical quantities, so that its zeros are the poles of such quantities. A classic result on the partition function for certain gases [26] shows that when an open region around the real axis is free of (complex) zeros, then many physical quantities are analytic in that region and therefore there is no phase transition. Now known as Lee–Yang zeros, they have been widely studied and linked, for example, to large-deviation statistics [8]. In [23] the hardcore model on graphs with bounded degree is studied, and it is proven that the partition function has zeros in the complex plane arbitrarily close to the critical point.

Now we would like to highlight the behaviour of the coefficients \(a^{(n)}_k\) and \(b^{[n]}_k\). Tables 1 and 2 show numerical values of the coefficients \(a^{(n)}_k\) and \(b^{[n]}_k\), respectively.

Table 1 Table of the coefficients \(a^{(n)}_k\) of the power series defined in Eq. (1)
Table 2 Table of the coefficients \(b^{[n]}_k\) of the power series defined in Eq. (2). Although displayed with finite precision, they were computed symbolically

A quick look at the table immediately reveals the stabilization of coefficients:

$$\begin{aligned} a^{(n)}_k = a^{(k+1)}_k \quad \forall n \ge k+1 \qquad \text { and } \qquad b^{[n]}_k = b^{[k+1]}_k \quad \forall n \ge k+1 . \end{aligned}$$
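The stabilized coefficients can be read off from the closed form of \(R_{(4)}\) given above: expanding \(p(6-12p+10p^2-3p^3)/(6(1-p)^4)\) with exact rational arithmetic, using \(1/(1-p)^4 = \sum _k \binom{k+3}{3} p^k\), yields \(a^{(4)}_k\), and by the stabilization identity these equal \(a^{(n)}_k\) for all \(n\ge k+1\) as long as \(k \le 3\). A sketch:

```python
from fractions import Fraction
from math import comb

# numerator coefficients of R_(4): p(6 - 12p + 10p^2 - 3p^3)
num = [0, 6, -12, 10, -3]

def a(k):
    """Exact coefficient of p^k in num(p) / (6 (1-p)^4)."""
    return Fraction(sum(num[j] * comb(k - j + 3, 3)
                        for j in range(min(k, 4) + 1)), 6)
```

Under the closed form above this gives \(a_1 = 1\), \(a_2 = 2\) and \(a_3 = 11/3\).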

Therefore, we now know the first few terms of the power series for arbitrarily large systems and we can proceed to use methods of series analysis. By applying the method of Padé approximants, we can numerically estimate \(p_c \approx 0.6352\). More details on this can be found in Sect. 3.
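The Padé step can be sketched on a toy series: for a function whose closest positive singularity plays the role of \(p_c\), the roots of the Padé denominator recover that singularity from a handful of series coefficients. Below we use an artificial function with planted poles (the value 0.6352 is inserted by hand, not computed from DBS data):

```python
import numpy as np

a_pole, b_pole = 0.6352, -2.0                 # planted singularities of the toy f
alpha, beta = 1 / a_pole, 1 / b_pole
c = [alpha**k + beta**k for k in range(5)]    # Taylor coefficients of
                                              # f(x) = 1/(1-x/a) + 1/(1-x/b)

# [2/2] Pade approximant: the denominator Q(x) = 1 + q1 x + q2 x^2 is fixed by
# requiring the series of f*Q - P to vanish through order x^4
H = np.array([[c[2], c[1]], [c[3], c[2]]])
q1, q2 = np.linalg.solve(H, [-c[3], -c[4]])

roots = np.roots([q2, q1, 1.0])               # poles of the approximant
estimate = min(r.real for r in roots if abs(r.imag) < 1e-9 and r.real > 0)
```

Since the toy function is exactly rational of denominator degree two, the smallest positive denominator root reproduces the planted pole to floating-point accuracy; for the DBS series in Sect. 3 the analogous root only estimates \(p_c\).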

1.2 Locality of Update Rule Implies Stabilization

We rigorously prove that the coefficients stabilize, based on an observation that we call the Interaction Light Cone. Let X be a set of vertices, and let \(L_X\) be an event that is local on X, meaning that the event depends only on what happens to the vertices in X. For example, when \(X=\{v_0\}\) and \(L_X\) is the event that vertex \(v_0\) is picked at least r times, then \(L_X\) is local on X. In Sect. 2 we will give a more precise definition of local events. We now wish to compare the probability \({\mathbb {P}}(L_X)\) when the process is initialized in two different starting states, A and \(A'\). When A and \(A'\) differ only on vertices that are at least a distance d away from X, then we have

$$\begin{aligned} {\mathbb {P}}(L_X \mid \text {start in } A) - {\mathbb {P}}(L_X \mid \text {start in }A') = O( p^d ) . \end{aligned}$$

By the notation \(O(p^d)\) we mean that when this quantity is written as a power series in p, the coefficients of \(p^0,\ldots ,p^{d-1}\) are zero; it only has non-zero terms of order \(p^d\) and higher, i.e., the two probabilities agree on at least the first d terms of their power series. This is the essence of the Interaction Light Cone. A vertex at distance d from the set X can only influence probabilities and expectation values of X-local events through terms of order \(p^d\) or higher. The intuition behind this is that the probability of a single activation is \(O(p)\), and for such a vertex to influence the state of a vertex in X, a chain of d activations needs to form to reach X. This observation will also allow us to compare the process on systems of different sizes.

Lemma 1

(Informal version of Lemma 5) Let G and \(G'\) be two graphs and let X be a set of vertices present in both graphs, such that the d-neighborhood of X and the local-update process on it (where a single update may only affect a vertex and its neighbors) are the same in both graphs. Then for any event \(L_X\) that is local on X we have

$$\begin{aligned} {\mathbb {P}}_{G}(L_X) = {\mathbb {P}}_{G'}(L_X) + O(p^d) . \end{aligned}$$

This idea applies to expectation values as well. Consider the expected number of updates per vertex on a cycle. By translation invariance, we have

$$\begin{aligned} \frac{1}{n} {\mathbb {E}}(\text {total updates}) = {\mathbb {E}}(\#\text {times vertex 1 was updated}) , \end{aligned}$$

making it a \(\{1\}\)-local quantity. If we add an extra vertex to the cycle, the expectation value only changes by a term of order \(O(p^{n/2})\), since the new vertex has distance \(n/2\) to vertex 1.

2 Parametrized Local-Update Processes

The class of parametrized (discrete) local-update processes, introduced in this section, includes the DBS, the CP and many other natural processes. We prove a general ‘stabilization of the coefficients theorem’ for them, suggesting the usefulness of the power-series approach for members of the class.

Let \(G=(V,E)\) be an undirected graph with vertex set V and edge set E. We consider processes where every vertex of G is either active or inactive. A state is a configuration of active/inactive vertices, denoted by the subset of active vertices \(A\subseteq V\). For \(v\in V\) let us denote by \(\varGamma (v)\) the neighbors of v in G including v itself. A local update process in each discrete time step picks a random active vertex \(v\in A\) and resamples the state of its neighbors \(\varGamma (v)\). If the state is \(\emptyset \) (there are no active vertices) then the process stops and all vertices remain inactive afterwards.

Definition 1

(PLUP - Parametrized local-update process) We say that \(M_G\) is a parametrized local-update process on the graph \(G=(V,E)\) with parameter \(p\in [0,1]\) if it is a time-independent Markov chain on the state space \(\{\text {inactive},\text {active}\}^{V}\) that satisfies the following:

  (i) Initial State The initial value of a vertex is picked independently from the other vertices. The probability of initializing \(v\in V\) as active is a polynomial in p with constant term equal to zero.

  (ii) Selection Dynamics Each vertex \(v\in V\) has a fixed positive weight \(w_v\). A vertex \(v\in V\) is selected using one of the three rules below, and if the selected vertex was active, then its neighborhood \(\varGamma (v)\) is resampled using the parametrized local-update rule of vertex v (otherwise the state remains unchanged).

    (a) Discrete-Time Active Sampling In each discrete time step, an active vertex \(v\in A\) is selected with probability \(\frac{w_v}{\sum _{u\in A}w_u}\), where A is the current state.

    (b) Discrete-Time Random Sampling In each discrete time step, a vertex \(v\in V\) is selected with probability \(\frac{w_v}{\sum _{u \in V} w_u}\).

    (c) Continuous-Time Clocks Every vertex \(v\in V\) has an exponential clock with rate \(w_v\). When a clock rings, that vertex is selected, and a new clock is set up for the vertex.

  (iii) Update Dynamics The parametrized local-update rule of a vertex \(v\in V\) describes a (time-independent) probabilistic transition from state A to \(A'\) such that the states only differ on the neighborhood \(\varGamma (v)\), i.e., \(A\bigtriangleup A'\subseteq \varGamma (v)\). The probability \(P_R\) of obtaining active vertices \(R=A'\cap \varGamma (v)\) is independent of \(A\setminus \varGamma (v)\). The probability \(P_R\) is a polynomial in p such that for \(p = 0\) we get \(A' \subsetneq A\) with probability 1, i.e., whenever a previously inactive vertex becomes active (\(|A' \setminus A| > 0\)) or \(A'=A\), the constant term of \(P_R\) must be zero.

  (iv) Termination The process terminates when the all-inactive state \(\emptyset \) is reached.
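To make Definition 1 concrete, here is a minimal sketch of a PLUP simulator (discrete-time active sampling with unit weights; all function names are ours), with the DBS rule as the plugged-in update:

```python
import random

def run_plup(vertices, neighbors, init_prob, update_rule, p, rng):
    """Simulate a PLUP with discrete-time active sampling and unit weights.
    init_prob(v, p): activation probability of v (no constant term in p).
    update_rule(v, active_nbh, p, rng): new set of active vertices in Gamma(v)."""
    active = {v for v in vertices if rng.random() < init_prob(v, p)}
    updates = 0
    while active:
        v = rng.choice(sorted(active))       # select a uniform active vertex
        nbh = set(neighbors(v))
        active = (active - nbh) | set(update_rule(v, active & nbh, p, rng))
        updates += 1
    return updates

def dbs_on_cycle(n, p, rng):
    """DBS as a PLUP instance on the n-cycle."""
    return run_plup(
        range(n),
        lambda v: [(v - 1) % n, v, (v + 1) % n],
        lambda v, q: q,                      # initialize active w.p. p
        lambda v, a, q, r: {u for u in [(v - 1) % n, v, (v + 1) % n]
                            if r.random() < q},
        p, rng)
```

The contact process or other members of the class fit the same skeleton by swapping `init_prob` and `update_rule`.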

With a slight abuse of notation we write \({\mathbb {P}}_G\) and \({\mathbb {E}}_G\) for probabilities and expectation values associated to the PLUP \(M_G\), when \(M_G\) is clear from context.

Definition 2

(Local events) Let \(G=(V,E)\) be a (finite) graph and let \(M_G\) be a PLUP. Let \(S\subseteq V\) be any subset of vertices, and let \(v\in V\) be any vertex.

  • Let \(\mathrm {II}^{(S)}\) be the event that all vertices in S get initialized as inactive.

  • Let \(\mathrm {RI}^{(S)}\) be the event that all vertices in S remain inactive during the entire process (including initialization).

  • Define \(\mathrm {BA}^{(S)}\) as the complement of \(\mathrm {RI}^{(S)}\): the event that there exists a vertex in S that becomes active at some point during the process, including initialization.

  • Let \(\#\textsc {Asel}\left( v\right) \) be the number of times that v was selected while it was active.

  • Let \(\#\textsc {toggles}\left( v\right) \) be the number of times that the value of v was changed.

If \(S=\{v\}\) we simply use the notation \(\mathrm {II}^{(v)}\), \(\mathrm {RI}^{(v)}\), and \(\mathrm {BA}^{(v)}\) for the above events. We say an event L is local on the vertex set S if it is in the sigma algebra generated by the events

$$\begin{aligned} \mathrm {II}^{(v)}\; , \mathrm {RI}^{(v)} \; , \; \mathrm {BA}^{(v)} , \; (\#\textsc {Asel}\left( v\right) = k) \; , \; (\#\textsc {toggles}\left( v\right) = k) \quad :v\in S, 0\le k < \infty . \end{aligned}$$

Lemma 2

(Time equivalence) The three versions of the selection dynamics of a PLUP, described in property (ii) of Definition 1, are equivalent for local events. That is, for any local event L the probability \({\mathbb {P}}(L)\) is independent of the chosen selection dynamics in property (ii).

Proof

The three selection dynamics only differ in the counting of time and in the presence of self-loops in the Markov chain. The definition of local events only includes events that are independent of the way time is counted; they depend only on which active vertices are selected and on the changes to the state of the graph.

It is easy to see that (b) implements the dynamics of (a) via rejection sampling, therefore they give rise to the same probabilities. One can also see that on a finite graph the selection rule (c) induces the same selection distribution as (b). This is because the exponential clocks induce a Poisson process at each vertex. The \(|V|\) independent Poisson processes with rates \(w_v\) are together equivalent to a single Poisson process with rate \(W=\sum _{v\in V}w_v\) in which each point is of type v with probability \(w_v / W\). One can simulate (c) by sampling a time value from an exponential distribution with parameter W and then sampling a random vertex with probability \(w_v/W\) (as in (b)). Since the time is not relevant for local events we can ignore the sampled time value, and this gives rise to the same probabilities. \(\square \)
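The superposition fact used above is easy to check empirically: merging independent exponential clocks with rates \(w_v\) and asking which rings first reproduces the categorical selection with probabilities \(w_v/W\). A quick sketch (the rates are arbitrary):

```python
import random

random.seed(0)
rates = {"a": 1.0, "b": 2.0, "c": 3.0}        # arbitrary vertex weights w_v
W = sum(rates.values())

def first_ring():
    # sample each vertex's next ring time ~ Exp(w_v); the earliest wins
    times = {v: random.expovariate(w) for v, w in rates.items()}
    return min(times, key=times.get)

N = 200_000
counts = {v: 0 for v in rates}
for _ in range(N):
    counts[first_ring()] += 1

freqs = {v: counts[v] / N for v in rates}     # should approach w_v / W
```

With the rates above, the empirical frequencies come out close to 1/6, 1/3 and 1/2.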

Our lemmas and theorems only concern local events and therefore we can use any one of the three selection dynamics when proving them.

Definition 3

(Induced process) Suppose that \(V'\subseteq V\); then we define the induced process \(M_{G'}\) on the induced subgraph \(G'=(V',E')\) by running the process \(M_{G}\) on G and, after each step, deactivating all vertices in \(V\setminus V'\). We can then view this as a process on \(G'\). Let L be a local event on \(V'\). We denote the probability of L under the induced process \(M_{G'}\) by \({\mathbb {P}}_{G'}(L)\). Similarly we use the notation \({\mathbb {E}}_{G'}\) for expectation values induced by the process \(M_{G'}\).

It is easy to see that the induced process of a PLUP is also a PLUP.

Definition 4

(Graph definitions) Let \(G=(V,E)\) be a graph, \(S\subseteq V\) be any subset of vertices and \(v\in V\) be any vertex.

  • Define \(G\setminus S\) as the induced subgraph on \(V\setminus S\) and \(G\cap S\) as the induced subgraph on S.

  • Define the d-neighbourhood \(\varGamma (S,d)\) of S as the set of vertices that are connected to S by a path of length at most d. In particular \(\varGamma (\{v\},1)=\varGamma (v)\).

  • Define the distant-k boundary \({\overline{\partial }}(S,k):=\varGamma (S,k)\setminus \varGamma (S,k-1)\) as the set of vertices lying at exactly distance k from S, and let \({\overline{\partial }}S:={\overline{\partial }}(S,1)\).
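These graph notions are straightforward to compute with a breadth-first search. The following sketch (names ours) implements \(\varGamma (S,d)\) and \({\overline{\partial }}(S,k)\) on an adjacency-list graph:

```python
from collections import deque

def gamma(adj, S, d):
    """Gamma(S, d): all vertices within graph distance d of the set S (BFS)."""
    dist = {v: 0 for v in S}
    queue = deque(S)
    while queue:
        v = queue.popleft()
        if dist[v] == d:          # do not expand past distance d
            continue
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return set(dist)

def distant_boundary(adj, S, k):
    """Distant-k boundary: Gamma(S, k) minus Gamma(S, k-1)."""
    return gamma(adj, S, k) - gamma(adj, S, k - 1)
```

On the 8-cycle, for example, `gamma(adj, {0}, 1)` is \(\{7,0,1\}\) and `distant_boundary(adj, {0}, 2)` is \(\{2,6\}\).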

The following lemma says that if a set S splits the graph into two disconnected parts, then those two parts become independent under the condition that the vertices in S never become active.

Lemma 3

(Splitting lemma) Let \(M_G\) be a parametrized local-update process on the graph \(G=(V,E)\). Let \(S,X,Y\subseteq V\) be a partition of the vertices, such that X and Y are disconnected in the graph \(G\setminus S\). Furthermore, let \(L_X\) and \(L_Y\) be local events on X and Y respectively. Then we have (see Fig. 3)

$$\begin{aligned} {\mathbb {P}}_{G}(\mathrm {RI}^{(S)} \cap L_X \cap L_Y \mid \mathrm {II}^{(S)}) = {\mathbb {P}}_{G\setminus Y}(\mathrm {RI}^{(S)} \cap L_X \mid \mathrm {II}^{(S)}) \; \cdot \; {\mathbb {P}}_{G\setminus X}(\mathrm {RI}^{(S)} \cap L_Y \mid \mathrm {II}^{(S)}) . \end{aligned}$$
Fig. 3

The set S of permanently inactive vertices splits the graph into parts X and Y, rendering them effectively independent. See Lemma 3

The condition of initializing S to inactive is present only to prevent counting the initialization probabilities twice. Equivalently we could write the condition only once:

$$\begin{aligned} {\mathbb {P}}_{G}(\mathrm {RI}^{(S)} \cap L_X \cap L_Y) = {\mathbb {P}}_{G\setminus Y}(\mathrm {RI}^{(S)} \cap L_X) \; \cdot \; {\mathbb {P}}_{G\setminus X}(\mathrm {RI}^{(S)} \cap L_Y \mid \mathrm {II}^{(S)}) , \end{aligned}$$

and by Bayes' rule \(\left( {\mathbb {P}}(L \mid \mathrm {RI}^{(S)})={\mathbb {P}}(L \mid \mathrm {RI}^{(S)}\cap \mathrm {II}^{(S)})=\frac{{\mathbb {P}}(L \cap \mathrm {RI}^{(S)}\mid \mathrm {II}^{(S)})}{{\mathbb {P}}(\mathrm {RI}^{(S)}\mid \mathrm {II}^{(S)})}\right) \) we also have

$$\begin{aligned} {\mathbb {P}}_{G}(L_X \cap L_Y \mid \mathrm {RI}^{(S)}) = {\mathbb {P}}_{G\setminus Y}(L_X \mid \mathrm {RI}^{(S)}) \; \cdot \; {\mathbb {P}}_{G\setminus X}(L_Y \mid \mathrm {RI}^{(S)}) . \end{aligned}$$

Proof

We will use the ‘continuous-time clocks’ version of the selection dynamics (PLUP property (ii)-c). By Lemma 2 the statement will then hold for all versions. We proceed with a coupling argument. There are three processes: one on G and the induced ones on \(G\setminus Y\) and \(G\setminus X\). We couple them by letting all three processes use the same source of randomness. Every vertex in G has an exponential clock that is shared by all three processes, and the randomness used for the local updates of each vertex also comes from the same source. This means that when the clock of a vertex v rings and the neighborhood \(\varGamma (v)\) is equal in different processes, then the update result will also be equal. Now we simply observe that \(L_X \cap L_Y \cap \mathrm {RI}^{(S)}\) holds in the G-process if and only if \(L_X \cap \mathrm {RI}^{(S)}\) holds in the \((G\setminus Y)\)-process and \(L_Y \cap \mathrm {RI}^{(S)}\) holds in the \((G\setminus X)\)-process. This is because all vertices in S are initialized as inactive (all three probabilities are conditioned on this), so a vertex in S can only be activated by an update from a vertex in X or Y. To check whether the event \(\mathrm {RI}^{(S)}\) holds, it is sufficient to trace the process up to the first activation of a vertex in S. Before this first activation, anything that happens to the vertices in X only depends on the clocks and updates of vertices in X, and similarly for Y. Since S splits X and Y into disconnected parts, these parts cannot influence each other unless a vertex in S is activated. Because of the coupling, the evolution of the X vertices in \(G\setminus Y\) will be exactly the same as the evolution in G, and similarly for Y. Once a vertex in S does get activated, the evolution of the three processes is no longer the same, but in that case the event \(\mathrm {RI}^{(S)}\) does not hold, regardless of any further updates in any system.
The clocks and updates of each vertex are independent sources of randomness, and when \(\mathrm {RI}^{(S)}\) holds then all the randomness of the S vertices is ignored. Therefore the probability of \(\mathrm {RI}^{(S)}\) in the \((G\setminus Y)\)-process and \((G\setminus X)\)-process depends only on independent random variables corresponding to the vertices in X and Y respectively, and we get the required equality. \(\square \)

2.1 Interaction Light Cone Results

Now we present the results that exhibit the interaction light cone. The intuition is that if two vertices are at distance d in the graph, then the only way they can affect each other is that an interaction chain forms between them, meaning that every vertex between them gets activated at least once.

When we write \(f(p) = O(p^k)\) for some function f then we mean the following: f(p) is analytic in a neighborhood of 0 and when f(p) is written as a power-series in p, i.e., \(f(p) = \sum _{i=0}^{\infty } \alpha _i p^i\), then \(\alpha _i=0\) for \(0\le i \le k-1\).

Lemma 4

Let \(M_G\) be a parametrized local-update process on the graph G with vertex set V. Let \(\{X_1,\ldots ,X_k\}\) be a collection of disjoint vertex subsets \(X_i \subseteq V\) and let E be an event. If \(E \subseteq \bigcap _{i} \mathrm {BA}^{(X_i)}\), then \({\mathbb {P}}(E) = O(p^{k})\). Furthermore if \(S\subseteq V\) then also \({\mathbb {P}}(E \mid \mathrm {II}^{(S)}) = O(p^{k})\).

When the event E holds, each set \(X_i\) contains a vertex that becomes active, and by PLUP properties (i) and (iii) every activation (either during initialization or later) has probability \(O(p)\). Therefore the probability of E is of order \(p^{k}\) or higher. We give the full proof in Appendix B.

Lemma 5

(Graph surgery) Let \(M_G\) be a parametrized local-update process on the graph \(G=(V,E)\). If \(X,Y\subseteq V\), \(X\cap Y=\emptyset \) and \(L_X\) is a local event on X, then

$$\begin{aligned} {\mathbb {P}}_{G}(L_X)-{\mathbb {P}}_{G\setminus Y}(L_X)=O(p^{d(X,Y)}). \end{aligned}$$

Proof

We can assume without loss of generality that \(X\ne \emptyset \ne Y\), otherwise the statement is trivial. We can also assume without loss of generality that \(d(X,Y)<\infty \), i.e., that X and Y are in the same connected component of G; otherwise we can use Lemma 3 with \(S=\emptyset \).

The proof goes by induction on \(d(X,Y)\). For the base case, \(d(X,Y)=1\), first note that when \(p=0\), the process initializes everything to inactive by property (i). Depending on whether this atomic event is included in \(L_X\), the probability \({\mathbb {P}}(L_X)\) for \(p=0\) (i.e. the constant term) is either 0 or 1, independently of the graph.

Now we show the inductive step: assuming we know the statement for d, suppose that \(d(X,Y)=d+1\). First we assume that \(\mathrm {RI}^{(X)}\subseteq \overline{L_X}\), i.e., \(L_X\subseteq \mathrm {BA}^{(X)}\). Define

$$\begin{aligned} L_X^i&:= L_X \cap \mathrm {RI}^{({\overline{\partial }}(X,i))} \cap \bigcap _{j\in [i-1]} \mathrm {BA}^{({\overline{\partial }}(X,j))} \qquad \text {for}\quad i \in [d],\\ L_X^{d+1}&:= L_X \cap \bigcap _{j\in [d]} \mathrm {BA}^{({\overline{\partial }}(X,j))} . \end{aligned}$$

When \(L_X^i\) holds, all vertices at distance i remain inactive, but for every \(j \le i-1\) there exists a vertex at distance j that becomes active. These events form a partition \(L_X={\dot{\bigcup }}_{i\in [d+1]}L_X^{i}\). Below we depict \(L_X^{i}\) graphically:

figure a

It is easy to see that for all \(i\in [d+1]\) we have \(L_X^{i}\subseteq \mathrm {BA}^{(X)}\cap \bigcap _{j\in [i-1]}\mathrm {BA}^{({\overline{\partial }}(X,j))}\), and therefore by Lemma 4 we get

$$\begin{aligned} {\mathbb {P}}_G(L_X^{i} )=O(p^{i}) , \quad \text { and } \quad {\mathbb {P}}_G(L_X^{i} \mid \mathrm {II}^{({\overline{\partial }}(X,i))} )=O(p^{i}). \end{aligned}$$
(3)

Now we use, for each \(i \in [d]\), the Splitting Lemma 3 with \(S={\overline{\partial }}(X,i)\) to split \(\varGamma (X,i-1)\) from \(G\setminus \varGamma (X,i)\). The factor supported on \(\varGamma (X,i)\) is the same in G and in \(G\setminus Y\) and is of order \(O(p^{i})\) by Lemma 4, while the remaining factors differ by \(O(p^{d+1-i})\) by the induction hypothesis (Y has distance at least \(d+1-i\) from \({\overline{\partial }}(X,i)\)). We get

$$\begin{aligned} {\mathbb {P}}_G(L_X^{i}) = {\mathbb {P}}_{G\setminus Y}(L_X^{i}) + O\big (p^{d(X,Y)}\big ) . \end{aligned}$$
(4)

Therefore

$$\begin{aligned} {\mathbb {P}}_G(L_X)&\overset{(3)}{=}\sum _{i\in [d]}{\mathbb {P}}_G(L_X^{i})+O(p^{d(X,Y)}) \overset{(4)}{=}\sum _{i\in [d]}{\mathbb {P}}_{G\setminus Y}(L_X^{i})+O(p^{d(X,Y)}) \\&\overset{(3)}{=}{\mathbb {P}}_{G\setminus Y}(L_X)+O(p^{d(X,Y)}). \end{aligned}$$

We finish the proof by observing that \(\mathrm {RI}^{(X)}\) is an atomic event of the sigma algebra of the local events of X, so if \(\mathrm {RI}^{(X)}\nsubseteq \overline{L_X}\), then we necessarily have \(\mathrm {RI}^{(X)}\subseteq L_X\). Therefore we can use the above proof with \(C_X:=\overline{L_X}\) and use that \({\mathbb {P}}(L_X)=1-{\mathbb {P}}(C_X)\). \(\square \)

Corollary 1

(Decay of correlations) Let \(M_G\) be a parametrized local-update process on the graph \(G=(V,E)\). If \(X,Y\subseteq V\) and \(L_X, L_Y\) are local events on X and Y respectively, then

$$\begin{aligned} {\mathrm {Cov}}(L_X,L_Y)={\mathbb {P}}_{G}(L_X\cap L_Y)-{\mathbb {P}}_{G}(L_X){\mathbb {P}}_{G}(L_Y)=O(p^{d(X,Y)-1}), \end{aligned}$$
(5)

and

$$\begin{aligned} {\mathbb {P}}_{G}(\mathrm {BA}^{(X)}\cap \mathrm {BA}^{(Y)})-{\mathbb {P}}_{G}(\mathrm {BA}^{(X)}){\mathbb {P}}_{G}(\mathrm {BA}^{(Y)})=O(p^{d(X,Y)+1}). \end{aligned}$$
(6)

The proof of this corollary is analogous to the proof of Lemma 5 and can be found in Appendix A.

In order to state our general result about the stabilization of the coefficients in the power series we define a notion of isomorphism between different PLUPs.

Definition 5

(PLUP isomorphism) We say that the PLUPs \(M_G\) and \(M_{G'}\) are isomorphic with the fixed sets \(X,X'\) if there is a graph isomorphism \(i: G\rightarrow G'\) such that \(i(X)=X'\), and the probability of transitioning in one step from a state A to \(A'\) is preserved under the isomorphism:

$$\begin{aligned} {\mathbb {P}}_G(A \text { is transformed to }A')={\mathbb {P}}_{G'}(i(A) \text { is transformed to }i(A')), \end{aligned}$$

and similarly the probability of initializing to a particular state A is preserved:

$$\begin{aligned} {\mathbb {P}}_G(\text {graph state is initially }A)={\mathbb {P}}_{G'}(\text {graph state is initially }i(A)). \end{aligned}$$

We denote such an isomorphism relation by

$$\begin{aligned} M_{G} \underset{X'}{\overset{X}{\simeq }} M_{G'}. \end{aligned}$$

Now we define convergent families of PLUPs. Our requirements on such a family imply that the underlying graphs converge locally, in the neighborhood of a fixed root vertex, to a common graph limit, also called a graphing, which justifies the term “convergent”. Examples of convergent families of PLUPs include DBS and CP on tori of any dimension, where the limit graphing is simply the infinite grid. Less regular examples are also included, such as toroidal ladder graphs or discrete Möbius strips of fixed width.

Definition 6

(Convergent family of PLUPs) We say a family \(\{ (M_{G_j},v_j) :j\in {\mathbb {N}}\}\) of rooted PLUPs is convergent, if for all \(d\in {\mathbb {N}}\) and for all \(j,k \ge d\) we have \(M_{\varGamma _{G_j}\left( \{v_j\},d\right) } \underset{v_k}{\overset{v_j}{\simeq }} M_{\varGamma _{G_k}\left( \{v_k\},d\right) }\).

We are ready to state our generic result about the stabilization of coefficients.

Theorem 1

(Power series stabilization) Suppose that \(\{(M_{G_j},v_j) :j\in {\mathbb {N}}\}\) is a convergent family of rooted PLUPs. Then the coefficients of the power series of \(R_{G_i}={\mathbb {E}}_{G_i}(\#\textsc {Asel}\left( v_i\right) )\) stabilize; in particular, \(R_{G_i}(p) = R_{G_j}(p) + O(p^{\min (i,j)+1})\).

Note that for vertex-transitive graphs, this implies \(R_{G_i}=\frac{1}{|G_i|} {\mathbb {E}}_{G_i}(\text {total updates})\) stabilizes.

Proof

Let \(d = \min (i,j)\), then

$$\begin{aligned} R_{G_i}(p)&= \sum _{k=1}^{\infty } {\mathbb {P}}_{G_i}\left( \#\textsc {Asel}\left( v_i\right) \ge k\right) = \sum _{k=1}^{\infty } {\mathbb {P}}_{\varGamma _{G_i}\left( \{v_i\},d\right) }\left( \#\textsc {Asel}\left( v_i\right) \ge k\right) + O(p^{d+1}) \\&= \sum _{k=1}^{\infty } {\mathbb {P}}_{\varGamma _{G_j}\left( \{v_j\},d\right) }\left( \#\textsc {Asel}\left( v_j\right) \ge k\right) + O(p^{d+1}) = R_{G_j}(p) + O(p^{d+1}), \end{aligned}$$

where the second equality follows from Lemma 5 and the third from the isomorphism \(M_{\varGamma _{G_i}\left( \{v_i\},d\right) } \underset{v_j}{\overset{v_i}{\simeq }} M_{\varGamma _{G_j}\left( \{v_j\},d\right) }\) of Definition 6.

In Lemma 8 in Appendix B, we prove that these types of sums are absolutely convergent for small enough p. Therefore the equality holds when the left- and right-hand sides are considered as power series in p. \(\square \)

3 The Discrete Bak–Sneppen Process

In Sect. 1.1 we introduced two quantities that exhibit a phase transition in the DBS process. We saw that the coefficients of their power series stabilize. In this section we will look at them in more detail.

3.1 Notation

We denote by \(M_{G}\) the DBS process on the graph \(G=(V,E)\). With a slight abuse of notation we also denote by \(M_{G}\) the leaking transition matrix of this time-homogeneous Markov chain, in which the row and column that correspond to the all-inactive configuration are set to zero. We will index vectors (and matrices) by sets \(A\subseteq V\), where A is the set of active vertices, as in Sect. 2. We will denote probability row vectors by \(\rho \in {\mathbb {R}}^{2^n}\), so that \(\rho \cdot M_G\) is the state of the system after one time step (one update). Setting the all-inactive row and column to zero corresponds to the property that for every \(A\subseteq V\) we have \((M_G)_{\emptyset ,A} = (M_G)_{A,\emptyset } = 0\). We will use the notation \(M_{(n)}\) for the matrix of the process on the cycle of length n and \(M_{[n]}\) for the process on the chain (not periodic) of length n. In both cases we identify the vertices with \(V:=[n]=\{1,2,\ldots ,n\}\).
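For small n, the leaking matrix \(M_{(n)}\) can be assembled directly from this description. The sketch below is our own illustration (the function name dbs_cycle_matrix and the bitmask encoding of active sets are choices made here, not taken from the paper); it builds the matrix as a dense numpy array:

```python
import itertools

import numpy as np


def dbs_cycle_matrix(n, p):
    """Leaking transition matrix of the DBS process on the cycle of length n.

    States are bitmasks A in {0, ..., 2^n - 1}; bit v is set iff vertex v
    is active.  The row and column of the all-inactive state 0 are zeroed
    out, as for the leaking matrix M_(n) in the text.
    """
    N = 2 ** n
    M = np.zeros((N, N))
    for A in range(1, N):
        active = [v for v in range(n) if A >> v & 1]
        for v in active:  # chosen for the update with probability 1/|active|
            sites = sorted({(v - 1) % n, v, (v + 1) % n})  # resampled sites
            base = A
            for u in sites:
                base &= ~(1 << u)
            # each resampled site becomes active independently with prob. p
            for bits in itertools.product((0, 1), repeat=len(sites)):
                A2, w = base, 1.0 / len(active)
                for u, b in zip(sites, bits):
                    if b:
                        A2 |= 1 << u
                        w *= p
                    else:
                        w *= 1 - p
                M[A, A2] += w
    M[0, :] = 0.0
    M[:, 0] = 0.0  # drop transitions into (and out of) the all-inactive state
    return M
```

A quick consistency check: a row A of the result sums to \(1-{\mathbb {P}}(A \text { is transformed to } \emptyset )\), which is exactly the leak.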

3.2 Expected Number of Resamples Per Site

The first quantity of interest is the expected number of updates per vertex to reach the all-inactive state. Consider the DBS process on the cycle of length n. We start the process by letting each vertex be active with probability p and inactive with probability \(1-p\), independently for each vertex. Denote this initial state by \(\rho ^{(0)}\), so its components have values \(\rho ^{(0)}_A = p^{|A|}(1-p)^{n-|A|}\). Let J be the vector with all entries equal to 1, except for the entry of the all-inactive state which is zero. Then \(\rho ^{(0)} \cdot M_{(n)}^k \cdot J^T\) is the probability that after exactly k updates there is at least one active vertex, i.e. the all-inactive state is reached after at least \(k+1\) updates, starting from \(\rho ^{(0)}\). Now define \(R_{(n)}(p)\) as the expected number of updates per vertex, before reaching the all-inactive state:

$$\begin{aligned} R_{(n)}(p)&= \frac{1}{n} \sum _{k=1}^{\infty } k \cdot {\mathbb {P}}(\text {reach all-inactive in exactly }k\text { updates}) \nonumber \\&= \frac{1}{n} \sum _{k=1}^{\infty } {\mathbb {P}}(\text {reach all-inactive in } k\text { updates or more}) \nonumber \\&= \frac{1}{n} \sum _{k=1}^{\infty } \rho ^{(0)} \cdot M_{(n)}^{k-1} \cdot J^T \end{aligned}$$
(7)
$$\begin{aligned}&= \frac{1}{n} \rho ^{(0)} \cdot (\mathrm {Id}- M_{(n)})^{-1} \cdot J^T \qquad (\hbox {by the geometric series}) \nonumber \\&= \frac{P_{(n)}(p)}{P'_{(n)}(p)}, \end{aligned}$$
(8)

where \(P_{(n)},P'_{(n)}\) are polynomials as can be seen by using Cramer’s rule for matrix inversion. Therefore we can conclude that \(R_{(n)}(p)\) is a rational function. For small n we can compute \(R_{(n)}(p)\) by symbolically inverting the matrix \(\mathrm {Id}- M_{(n)}\), which is how we obtained the coefficients in Table 1. For \(n\ge 9\) we computed the matrix inverse for rational values of p exactly, and then computed the rational function using Thiele’s interpolation formula.
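The exact computation above can be cross-checked by direct simulation. The following Monte Carlo sketch is our own illustration (the parameter choices \(n=8\), \(p=0.2\) and the function name are assumptions made here); it estimates the expected number of updates per vertex from the product initial state \(\rho ^{(0)}\):

```python
import random


def dbs_run_updates(n, p, rng):
    """One DBS run on the cycle of length n, started from the product-p
    initial state; returns the number of updates until all-inactive."""
    state = [rng.random() < p for _ in range(n)]
    updates = 0
    while any(state):
        active = [v for v in range(n) if state[v]]
        v = rng.choice(active)  # uniformly random active vertex
        for u in ((v - 1) % n, v, (v + 1) % n):
            state[u] = rng.random() < p  # resample the three sites
        updates += 1
    return updates


rng = random.Random(1)
n, p, runs = 8, 0.2, 4000
estimate = sum(dbs_run_updates(n, p, rng) for _ in range(runs)) / (runs * n)
```

For p well below \(p_c\) the estimate should be close to a truncation of the power series \(\sum _k a^{(n)}_k p^k\).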

Fig. 4

Location of the poles of \(R_{(n)}(p)\) on the complex plane for different n. The black circle is the complex unit circle and the dashed circles have radius \(p_c\) around \(p=0\) and \(1-p_c\) around \(p=1\). There is always a pole at \(p=1\) because \(R_{(n)}(1)\) is always infinite

3.2.1 The Power-Series of \(R_{(n)}(p)\)

As we have seen in the previous subsection, \(R_{(n)}(p)\) is a rational function. Since a rational function is analytic away from its poles, and \(R_{(n)}(p)\) has no pole at \(p=0\) (in fact \(R_{(n)}(0)=0\)), we can write it as

$$\begin{aligned} R_{(n)}(p) = \sum _{k=0}^{\infty } a^{(n)}_k p^k, \end{aligned}$$
(9)

where the (non-zero) radius of convergence of the above power series equals the absolute value of the closest pole of \(R_{(n)}(p)\) to 0. In order to get some intuition about the radius of convergence we plotted the location of the poles of \(R_{(n)}(p)\) on the complex plane in Fig. 4. For \(n=10\) there is a pole at a point with absolute value \(\approx 0.9598\), hence \(R_{(10)}(p)\) has a radius of convergence strictly smaller than 1 even though the rational function \(R_{(n)}(p)\) is well-defined for all \(p\in [0,1)\).

Fig. 5

Estimates for \(p_c\) based on the two methods. On the horizontal axis, n is the number of power-series coefficients used for the estimate. The functions \(R_{{\mathbb {Z}}}\), \(T_{{\mathbb {N}}}\) and \(S_{{\mathbb {Z}}}\) are defined in the text below Conjecture 1. The numbers \([m,m']\) (with \(m+m'=n\)) refer to the degree of the numerator and denominator respectively of the rational functions used in the Padé approximant method. The gray shaded region shows our estimate \(p_c = 0.63523 \pm 0.00005\)

As was shown in Sect. 1.1, Table 1, the coefficients \(a^{(n)}_k\) stabilize as n grows. This is proven by Theorem 1, since the family of DBS processes on the cycles, indexed by n, is a convergent family of PLUPs. The theorem only guarantees the stabilization for \(n > 2k\), since going from a cycle of size n to \(n+1\) adds a vertex at distance roughly n/2 from any fixed vertex. In the table, however, we saw that the stabilization already holds for \(n \ge k+1\). In Appendix D we prove this more precise version of the stabilization that holds for cycles. We define the ‘stabilized’ coefficients \(a^{(\infty )}_k := a^{(k+1)}_k\). We then define \(R_{{\mathbb {Z}}}(p) = R_{(\infty )}(p) = \sum _{k=0}^\infty a^{(\infty )}_k p^k\) and make the following conjecture.

Conjecture 1

(Radius of convergence) The radius of convergence of \(R_{(\infty )}(p)\) is equal to the critical probability \(p_c\) of the DBS process.

In Appendix B we explain an alternative method to compute coefficients of the \(R_{(\infty )}(p)\) power series (see the text below Lemma 9). As an application, we can apply known methods of series analysis. For example, Fig. 5 shows estimates for \(p_c\) using the ratio method and the Padé approximant method. For details on these methods, see for example [12]. The ratio method can be used to estimate the critical value when the singularity that determines the radius of convergence is at \(p_c\), i.e. there are no other singularities closer to the origin, which is what we suggest in Conjecture 1. The figure also shows estimates based on the power-series coefficients of the functions \(T_{{\mathbb {N}}}\) and \(S_{{\mathbb {Z}}}\). The function \(T_{{\mathbb {N}}}\) is the expected number of total updates on a semi-infinite chain with one end, with a single active vertex at that end as a starting state. This series is included because we can compute more terms for it. The function \(S_{{\mathbb {Z}}}\) is the probability of survival on the infinite line with a single active vertex as a starting state. This is a series in \(q=1-p\) and it is included because other work studies the equivalent function for the contact process and this allows for comparison of critical exponents [9]. The Padé approximant method suggests that the critical value is \(p_c \approx 0.63523 \pm 0.00005\), in complete agreement with [10], and that the critical exponent for \(S_{{\mathbb {Z}}}(q) \overset{q \uparrow q_c}{\sim } (q_c-q)^\beta \) is \(\beta \approx 0.277\), which suggests that it is in the directed-percolation (DP) universality class alongside several variants of the contact process [9, 14, 25].

3.3 Reaching One End of the Chain from the Other

Another quantity we considered in Sect. 1.1 is the probability of ever activating one end point of a finite chain, when we start the process with only a single active vertex at the other end. Let us consider the length-n chain, and suppose we start the DBS process with a single active vertex at site 1. As in Eq. (2), we consider

$$\begin{aligned} S_{[n]}(p) = {\mathbb {P}}(\mathrm {BA}^{(\{n\})} \mid \text {start }\{1\}). \end{aligned}$$

Note that in order to satisfy property (i) of the PLUP definition, the initial state needs to be \(\{1\}\) with probability p and \(\emptyset \) with probability \(1-p\). To get the above definition of \(S_{[n]}(p)\) with a deterministic starting state one can then simply divide by p. The power-series coefficients of \(S_{[n]}(p)\) stabilize, which follows from Lemma 5 by letting \(X=\{n\}\) and \(Y=\{1\}\). However, as suggested by Fig. 1, the limiting power series around \(p=0\) becomes the zero function and is therefore not so interesting. Instead, we can take the power series centered around \(p=1\), and it turns out that the coefficients stabilize there as well. We prove this below. Define \(q=1-p\).

Fig. 6

Location of the poles of \(S_{[n]}\) as a function of \(p\) in the complex plane for different n. The black circle is the complex unit circle and the dashed circles have radius \(p_c\) around \(p=0\) and \(1-p_c\) around \(p=1\)

Similarly to what we did for \(R_{(n)}(p)\) we can write \(S_{[n]}(q)\) using a matrix inverse. We will start the process in the (deterministic) state with a single active vertex at location 1, denoted by the probability vector \(\delta _{\{1\}}\). Define \({\mathcal {A}}_n = \{ A \subseteq [n] \mid n \in A \}\), the set of all states where vertex n is active. Let \(M_{[n]}\) be the transition matrix for the DBS process on the chain of length n. Define the matrix \({\tilde{M}}_{[n]}\) as \(M_{[n]}\) but with some entries set to zero. Set the row and column of the all-inactive state \(\emptyset \) to zero, \(({\tilde{M}}_{[n]})_{A,\emptyset } = ({\tilde{M}}_{[n]})_{\emptyset ,A} = 0\) for all \(A\subseteq [n]\). Furthermore set all rows \(A\in {\mathcal {A}}_n\) to zero: \(({\tilde{M}}_{[n]})_{A,A'} = 0\) for all \(A'\subseteq [n]\). That is, whenever vertex n is active there is no outgoing transition. Denote by \(\upchi _{{\mathcal {A}}_n}\) the vector that is 1 for all \(A\in {\mathcal {A}}_n\) and zero everywhere else. We have

$$\begin{aligned} S_{[n]} = \sum _{k=0}^{\infty } \delta _{\{1\}} \cdot {\tilde{M}}_{[n]}^{k} \cdot \upchi _{{\mathcal {A}}_n}^T = \delta _{\{1\}} \cdot \left( \mathrm {Id}- {\tilde{M}}_{[n]}\right) ^{-1} \cdot \upchi _{{\mathcal {A}}_n}^T = \sum _{k=0}^{\infty } b^{[n]}_k q^k. \end{aligned}$$
(10)

With the same argument as before we see that \(S_{[n]}\) must be a ratio of two polynomials in p (and also in q). The poles of \(S_{[n]}\) are shown in Fig. 6, where \(S_{[n]}\) is considered a function of p to be comparable with \(R_{(n)}(p)\). The coefficients \(b^{[n]}_k\) of the q power series are shown in Table 2.
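Equation (10) can be evaluated numerically for small n. The sketch below is our own illustration (the function name s_chain and the bitmask encoding are choices made here); it builds the masked matrix \({\tilde{M}}_{[n]}\) for the chain and solves the linear system rather than inverting:

```python
import itertools

import numpy as np


def s_chain(n, p):
    """Evaluate S_[n](p) as delta_{1} (Id - Mtilde)^(-1) chi^T.

    States are bitmasks; bit v (0-indexed) set means vertex v+1 is active.
    Rows of states containing vertex n, and the row and column of the
    empty state, are zeroed, as in the construction of Mtilde in the text.
    """
    N = 2 ** n
    M = np.zeros((N, N))
    for A in range(1, N):
        if A >> (n - 1) & 1:
            continue  # vertex n active: no outgoing transitions
        active = [v for v in range(n) if A >> v & 1]
        for v in active:  # chosen with probability 1/|active|
            sites = [u for u in (v - 1, v, v + 1) if 0 <= u < n]
            base = A
            for u in sites:
                base &= ~(1 << u)
            for bits in itertools.product((0, 1), repeat=len(sites)):
                A2, w = base, 1.0 / len(active)
                for u, b in zip(sites, bits):
                    if b:
                        A2 |= 1 << u
                        w *= p
                    else:
                        w *= 1 - p
                if A2 != 0:  # column of the all-inactive state is zero
                    M[A, A2] += w
    delta = np.zeros(N)
    delta[1] = 1.0  # deterministic start: single active vertex at site 1
    chi = np.array([float(A >> (n - 1) & 1) for A in range(N)])
    return float(delta @ np.linalg.solve(np.eye(N) - M, chi))
```

For \(n=2\) one can check by hand that \(S_{[2]}(p) = p/(1-p+p^2)\), which the code reproduces.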

Lemma 6

The coefficients \(b^{[n]}_k\) of the power series of \(S_{[n]}(q)\) in Eq. (10) stabilize.

Proof

Let \(\mathrm {RI}^{(\{n\})}\) and its complement \(\mathrm {BA}^{(\{n\})}\) be as defined in Definition 2. In the following we assume that the starting state is \(\{1\}\) with probability p and \(\emptyset \) with probability \(1-p\), so the process is a PLUP. We have \(S_{[n]}(p)= \frac{1}{p} \cdot {\mathbb {P}}(\mathrm {BA}^{(\{n\})})\), since \(S_{[n]}(p)\) has a deterministic starting state. By Lemma 3 we have \({\mathbb {P}}_{[n]}(\mathrm {RI}^{(\{n-1\})}) = {\mathbb {P}}_{[n-1]}(\mathrm {RI}^{(\{n-1\})})\). Consider \(1-p S_{[n]}\), i.e. the probability that the n-th vertex is never activated. We have

$$\begin{aligned} 1-p\,S_{[n]}(p)&= {\mathbb {P}}_{[n]}(\mathrm {RI}^{(\{n\})}) = {\mathbb {P}}_{[n]}(\mathrm {RI}^{(\{n-1\})}) + {\mathbb {P}}_{[n]}(\mathrm {BA}^{(\{n-1\})} \cap \mathrm {RI}^{(\{n\})}) \\&= 1-p\,S_{[n-1]}(p) + {\mathbb {P}}_{[n]}(\mathrm {BA}^{(\{n-1\})} \cap \mathrm {RI}^{(\{n\})}), \end{aligned}$$

where the first split uses that \(\mathrm {RI}^{(\{n-1\})} \subseteq \mathrm {RI}^{(\{n\})}\) on the chain (vertex n can only be resampled when its unique neighbour \(n-1\) is updated), and the last equality uses Lemma 3.

Note that for the event \((\mathrm {BA}^{(\{n-1\})} \cap \mathrm {RI}^{(\{n\})})\) to hold, all vertices \(1,\ldots ,n-1\) must have been active at some point. Since the process terminates with probability 1, this means all those vertices must also have been deactivated at least once. In the DBS process each deactivation has probability \(O(q)\), so every terminating path of the Markov chain that lies in this event has a factor of at least \(q^{n-1}\) associated to it, hence \({\mathbb {P}}_{[n]}(\mathrm {BA}^{(\{n-1\})} \cap \mathrm {RI}^{(\{n\})}) = O(q^{n-1})\). Here we use the absolute convergence of certain power series in q, which we prove in Lemma 10 in Appendix C. We see that \(S_{[n]}(q) - S_{[n-1]}(q) = O(q^{n-1})\), so the coefficients stabilize. \(\square \)