1 Introduction

Genetic defects in DNA replication fidelity and repair result in mutator phenotypes that accelerate the adaptation of microbes and cancer cells. However, certain combinations of mutator alleles increase mutation rates to intolerable levels, resulting in error-induced extinction (EEX), where every cell eventually acquires a lethal mutation (Morrison et al. 1993; Fijalkowska and Schaaper 1996). Error-induced extinction occurs within a few generations in bacteria and haploid yeast with mutations affecting both DNA proofreading and repair (Morrison et al. 1993; Fijalkowska and Schaaper 1996; Herr et al. 2011). Tumours with mutator phenotypes reach an upper limit of mutations, which has been interpreted as evidence for a maximal mutation rate in cancer (Fox and Loeb 2010; Schumacher et al. 2019).

Defining the maximum mutation rate is important for understanding the long-term fitness of mutator alleles, with applications for studying the evolution of hyper-mutated tumours and the synthetic lethality of mutator alleles (Topatana et al. 2020). Experimentally, this amounts to producing cell lines with mutator mutations, and measuring the mutation rates when colonies are viable. This strategy has identified intolerable mutation rates for bacteria (Morrison et al. 1993), yeast (Fijalkowska and Schaaper 1996; Soriano et al. 2021) and cancer in mice (Albertson et al. 2009). However, beyond a threshold, the cell lines are not viable or acquire antimutator alleles. Thus, it is not possible to study experimentally the behaviour of populations at the limiting mutation rate and the evolutionary process underlying their extinction. Open questions include: for how many generations can populations persist, and how large can they grow, at the limiting mutation rate? What is the genetic structure of populations undergoing EEX? Can mutation drive the extinction of an exponentially growing population?

Here we propose a continuous-time multi-type critical birth–death process that allows theoretical exploration of the questions above, amongst other applications of biological interest. This models a population of cells that divide, die or mutate independent of each other, where the time between events is exponentially distributed. To mimic the behaviour of cells at the limiting mutation rate, we set the rates of division \((\alpha )\) and death \((\beta )\) or mutation \((\nu )\) to be balanced \((\alpha =\beta + \nu )\). In the simplest model for EEX, there is a single type of cells that divides \((1)\rightarrow (1)+(1)\) or acquires a lethal mutation \((1)\rightarrow \emptyset \) at the same rate. This is the case of haploid yeast and bacteria with mutation rates of one lethal mutation per cell division, which undergo EEX within a few generations (Morrison et al. 1993; Fijalkowska and Schaaper 1996; Herr et al. 2011). A more biologically interesting case is to allow multiple mutations to accumulate before a lethal mutation arrives. We model this by considering different cell types, where we call a cell type-i if it has accumulated i mutations. We assume that there is a maximal number n of mutations a cell can bear. Note that in this model mutations accumulate consecutively, therefore the lethal mutation is the nth mutation, not a particular mutation.

In the simplest multi-type critical process, each type-i cell divides independently \((i)\rightarrow (i)+(i)\) at rate one, or accumulates a new mutation \((i)\rightarrow (i+1)\) also at rate one, except type-n cells, which divide or die at rate one. We represent this birth-mutation process by the following scheme

(1)

A more general version comes from introducing death and allowing different types of cells to divide and mutate at different rates. We refer to this as the n-type birth–death process, which can be illustrated as

(2)

Note that for any given type, cells appear (via division) and disappear (via mutation or death) at the same rate, and thus all types remain critical. The overall process is critical, hence eventually all cells go extinct with probability one. This can be seen intuitively, since critical type-1 cells go extinct without a source, at which point the same can be said about type-2 cells, and so on. Figure 1 shows the evolution of the number of cells of each type in a four-type process of (1). Note that each type grows faster than the previous and takes longer to become extinct. Thus, the more mutations that can accumulate before a lethal mutation, the longer it takes for the population to undergo EEX. Interestingly, the population grows approximately exponentially during a transient, but eventually goes extinct with probability one. This behaviour cannot be captured by a supercritical process with an accumulation of deleterious mutations. In that case, if the initial types of cells have higher fitness, these will overtake any subsequent less fit types, resulting in a positive probability of survival of the population. Moreover, the n-type critical process does not require a reduction in fitness until the last lethal mutation, naturally capturing the phenomenon of EEX.
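The dynamics described above can be explored with a short stochastic simulation. The following is a minimal sketch, assuming the simplest scheme (1) with all rates equal to one and a single initial type-1 cell; the function names `simulate` and `survival_fraction` are illustrative choices and do not come from the original work.

```python
import random

def simulate(n, t_max, rng):
    """Gillespie simulation of scheme (1), started from one type-1 cell.

    Every cell divides at rate 1; a type-i cell (i < n) mutates to type i+1
    at rate 1, and a type-n cell dies at rate 1. Returns the cell counts
    [Z_1, ..., Z_n] at time t_max (all zeros if extinction happened earlier).
    """
    counts = [1] + [0] * (n - 1)
    t = 0.0
    while True:
        total = sum(counts)
        if total == 0:
            return counts                      # population is extinct
        t += rng.expovariate(2.0 * total)      # each cell has total rate 2
        if t > t_max:
            return counts
        # choose the type of the acting cell proportionally to its count
        r = rng.randrange(total)
        i = 0
        while r >= counts[i]:
            r -= counts[i]
            i += 1
        if rng.random() < 0.5:
            counts[i] += 1                     # division (i) -> (i) + (i)
        else:
            counts[i] -= 1                     # mutation or death removes the cell
            if i + 1 < n:
                counts[i + 1] += 1             # mutation (i) -> (i+1)

def survival_fraction(n, t_max, runs, seed=0):
    """Fraction of independent runs with at least one cell alive at t_max."""
    rng = random.Random(seed)
    alive = sum(1 for _ in range(runs) if sum(simulate(n, t_max, rng)) > 0)
    return alive / runs
```

For example, `survival_fraction(4, 5.0, 2000)` estimates the probability that a four-type population is non-empty at time 5. Because the total population can grow quickly during the transient, long time horizons with many types may be slow in this naive form.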

Fig. 1

An example simulation of the process of (1) run until extinction with n = 4 types, where each coloured area represents the number of cells of a type, with different types piled on top of each other. Hence the envelope is the total population size. A single initial type-1 cell resulted in a short-lived type-1 population which seeded a type-2 population before its extinction. Note how type-3 and type-4 cells were initiated multiple times. One can observe the initial fast growth of the total population before extinction

Due to their applicability to model exponentially growing populations, multi-type, super-critical processes with consecutive mutations (decomposable) have been extensively studied (Athreya and Ney 2004; Kesten and Stigum 1967; Durrett 2015; Nicholson et al. 2022). In the critical case, the asymptotic behaviour has been studied by many authors (Sevast’yanov 1959; Chistyakov 1959; Mullikin 1963). Foster and Ney (1976) derived the asymptotic survival probability for discrete-time models. Similar results were obtained simultaneously by Ogura (1975), also including continuous time processes. Later on, Foster and Ney (1978) proposed limit theorems for the generating functions of population sizes, conditioned on the survival of the first type of cells, a condition less relevant for biological applications.

In this work, we derive large-time asymptotic solutions of multi-type critical birth–death processes exploiting the fact that, after an initial stage of exponential growth, the population is dominated by the last type. More precisely, we show that, for a large time, the system is non-empty with probability proportional to \((\text {time})^{-\chi _n}\), where \(\chi _n=2^{1-n}\), establishing the relationship between the time-dependent survival probability and the maximal number of mutations that can accumulate. The exponents of the survival probability were derived by Foster and Ney (1976), Ogura (1975) through a different approach. Knowing the asymptotic survival probability allows us to find an appropriate scaling of the system, and derive the limiting distribution for the number of cells of a given type present at time t. We find asymptotic solutions for the distribution of cells of type \(k=1,\dots ,n\), as well as the total number distribution. Our methods extend the exact solution and limit results (Antal and Krapivsky 2011) for the two-type critical birth–death case, and complement the results of recent work on super-critical processes (Nicholson et al. 2022). Remarkably, we find that the distributions for the number of cells of a given type and the total number distributions have algebraic and stationary tails, described by the same exponents \(\chi _n\) as the survival probabilities. This provides interesting biological insight into the behaviour of the modeled populations, showing that they can reach a stationary growth phase before extinction. We derive further estimates of interest for studying cancer and bacteria growth, including the distributions of time of arrival and extinction of cells with an arbitrary number of mutations.

The next sections progressively build up to the main results of this work, which can be found in Sect. 5. We introduce the simplest critical processes in Sects. 2 and 3. We present exact solutions to the two-type critical process in Sect. 4 and derive asymptotic solutions to the n-type case in Sect. 5. In Appendix E, we examine the more general processes including death and arbitrary birth and mutation rates, and show that the behavior is essentially the same as in the simplest critical birth–death process. Hence in the bulk of the paper, we limit ourselves to the simplest version. Example applications of the results for studying EEX in microbes and tumours, as well as connections to experimental work, can be found in Sect. 6.

2 Single type

In the simplest case, EEX occurs because cells acquire a lethal mutation at the same rate as cell division (Morrison et al. 1993; Fijalkowska and Schaaper 1996; Herr et al. 2011). This is represented by a single type critical process. Let us recall its basic properties (Athreya and Ney 2004). We denote by \(Z_i(t)\) the number of type-i cells at time t. We start with a single type-1 cell and study \(Z_1(t)\) via the generating function

$$\begin{aligned} \mathcal {Z}_{1,1}(x_1,t)=\mathbb {E}(x_1^{Z_1(t)}|Z_1(0)=1) = \sum _{a\ge 0} P_a(t) x_1^a, \end{aligned}$$

where the first index of \(\mathcal {Z}_{1,1}\) refers to the type of the initial cell while the second to the number of types considered, and \(P_a(t)=\mathbb {P}(Z_1(t)=a| Z_1(0)=1)\) is the probability of having a cells at time t. This generating function satisfies the backward Kolmogorov equation \(\partial _t \mathcal {Z}_{1,1} = (1-\mathcal {Z}_{1,1})^2\) with \(\mathcal {Z}_{1,1}(x_1,0)=x_1\), from which

$$\begin{aligned} \mathcal {Z}_{1,1}(x_1,t) = \frac{t(1-x_1)+x_1}{t(1-x_1)+1}. \end{aligned}$$
(3)

By expanding the generating function around \(x_1=0\) we obtain the probability to have a cells at time t,

$$\begin{aligned} P_a(t) = \frac{1}{a!} \partial _{x_1}^a \mathcal {Z}_{1,1}(0,t) = \frac{1}{(1+t)^2} \left( \frac{t}{1+t} \right) ^{a-1} \end{aligned}$$
(4)

for \(a\ge 1\), and the survival probability

$$\begin{aligned} S_{1,1}(t) = 1- \mathcal {Z}_{1,1}(0,t) = 1 - P_0(t) = \frac{1}{1+t}. \end{aligned}$$
(5)

Hence the number of cells conditioned on survival has a geometric distribution

$$\begin{aligned} Z_1(t)|\{Z_1(t)>0\} \sim \textrm{Geo}\left( \frac{1}{1+t}\right) . \end{aligned}$$
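As a quick numerical sanity check of Eqs. (4) and (5), the sketch below sums the mass function directly. It assumes a truncation of the infinite sum at a large cutoff `a_max` (an illustrative parameter), which is harmless because the terms decay geometrically.

```python
def p_a(a, t):
    """P_a(t) from Eq. (4) for a >= 1, and P_0(t) = t / (1 + t)."""
    if a == 0:
        return t / (1.0 + t)
    return (t / (1.0 + t)) ** (a - 1) / (1.0 + t) ** 2

def summary(t, a_max=20000):
    """Return (normalisation, mean, survival probability) at time t."""
    probs = [p_a(a, t) for a in range(a_max + 1)]
    total = sum(probs)
    mean = sum(a * p for a, p in enumerate(probs))
    survival = 1.0 - probs[0]
    return total, mean, survival

def conditional_pmf(a, t):
    """P(Z_1(t) = a | Z_1(t) > 0) for a >= 1."""
    return p_a(a, t) / (1.0 - p_a(0, t))
```

Calling `summary(5.0)` should return a normalisation and a mean both close to one and a survival probability close to 1/6, and `conditional_pmf` should agree with the geometric mass function with success probability 1/(1+t).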

Notice that the average number of cells remains constant,

$$\begin{aligned} \mathbb {E}Z_1(t) = \partial _{x_1} \mathcal {Z}_{1,1}(1,t) = \sum _{a\ge 1}aP_a(t) = 1, \end{aligned}$$

throughout the evolution. Although the probability of extinction for \(t\rightarrow \infty \) is one, it takes a long time, since for \(T=\inf \{t:Z_1(t)=0\}\),

$$\begin{aligned} \mathbb {E}T=\int _0^\infty \mathbb {P}(T>t) {d}t = \int _0^\infty S_{1,1}(t) {d}t = \infty . \end{aligned}$$

It is well known (Athreya and Ney 2004) that conditioned on survival \(Z_1(t)/t\) converges in distribution to an exponential random variable

$$\begin{aligned} \frac{Z_1(t)}{t}|\{Z_1(t)>0\} \rightarrow Y_1 \sim \textrm{Expo}(1) \end{aligned}$$
(6)

for \(t\rightarrow \infty \). This is immediate from the properties of a geometric distribution when taking the limit \(t,a\rightarrow \infty \) with \(a/t=y\) constant

$$\begin{aligned} \mathbb {P}\left( \frac{Z_1(t)}{t}>y|Z_1(t)>0\right) = \left( \frac{t}{t+1}\right) ^{ty} \rightarrow e^{-y} = \mathbb {P}(Y_1>y). \end{aligned}$$

One can also derive this from the generating function. First note that \(\mathbb {P}(Z_1(t)=a|Z_1(t)>0) = P_a(t)/(1-P_0(t))\) for \(a\ge 1\), hence the conditional generating function

$$\begin{aligned} \mathbb {E}(x_1^{Z_1(t)}|Z_1(t)>0) = \sum _{a\ge 1} \frac{P_a(t)}{1-P_0(t)} x_1^a = \frac{\mathcal {Z}_{1,1}(x_1,t)-P_0(t)}{1-P_0(t)}. \end{aligned}$$
(7)

The large a limit is related to \(x_1\approx 1\) so, in order to get a non-trivial limit, we write \(x_1=1-p/t\) with p constant and notice that for \(a,t\rightarrow \infty \), the right-hand side of (7) converges

$$\begin{aligned} \frac{\mathcal {Z}_{1,1}(1-p/t,t)-P_0(t)}{1-P_0(t)} = \frac{1-p/t}{1+p} \rightarrow \frac{1}{1+p}. \end{aligned}$$

The convergence of (7) requires that \(tP_a(t)/(1-P_0(t))\sim t^2 P_a(t)\rightarrow f_{Y_1}(y)\) in order to get

$$\begin{aligned} \sum _{a\ge 1} \frac{P_a(t)}{1-P_0(t)} x_1^a = \sum _{a\ge 1} t \frac{P_a(t)}{1-P_0(t)} \left( 1-\frac{p}{t}\right) ^{yt} \frac{1}{t} \rightarrow \int _0^\infty f_{Y_1}(y) e^{-py} dy = \mathbb {E}e^{-pY_1} \end{aligned}$$

to converge to the Riemann integral, which is the Laplace transform of the density \(f_{Y_1}(y)\). The two limits are of course the same, hence

$$\begin{aligned} \lim _{t\rightarrow \infty } \frac{\mathcal {Z}_{1,1}(1-p/t,t)-P_0(t)}{1-P_0(t)} = \mathbb {E} e^{-pY_1} = \frac{1}{1+p}. \end{aligned}$$
(8)

By inverting (8), the Laplace transform of the density \(f_{Y_1}(y)\), we indeed obtain that \(Y_1\sim \textrm{Expo}(1)\).
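The scaling limit can also be checked numerically. The sketch below, which only assumes Eqs. (3) and (7), evaluates the conditional generating function at \(x_1=1-p/t\) and the tail probability of the geometric law for a large but finite t; the helper names are illustrative.

```python
import math

def z11(x, t):
    """Single-type generating function, Eq. (3)."""
    return (t * (1.0 - x) + x) / (t * (1.0 - x) + 1.0)

def conditional_gf(x, t):
    """Generating function of Z_1(t) conditioned on survival, Eq. (7)."""
    p0 = z11(0.0, t)
    return (z11(x, t) - p0) / (1.0 - p0)

def tail(y, t):
    """P(Z_1(t)/t > y | Z_1(t) > 0) as written in the text."""
    return (t / (t + 1.0)) ** (t * y)

p, t = 2.0, 1.0e4
finite_t = conditional_gf(1.0 - p / t, t)   # equals (1 - p/t) / (1 + p)
limit = 1.0 / (1.0 + p)                     # Laplace transform of Expo(1)
```

For these values, `finite_t` differs from `limit` by a term of order p/t, and `tail(y, t)` approaches `math.exp(-y)` as t grows.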

3 Infinite types

An interesting special case is obtained when we consider infinitely many types (\(n=\infty \)) in the process described by scheme (1). Biologically, this represents a population at a limiting mutation rate (hence every sub-population goes extinct) but in which infinitely many mutations can accumulate (hence the total population grows exponentially).

Since \(n=\infty \), there is no extinction, but up to type m the process is identical to the m-type model. What becomes simple though in the infinite type model is the total number of cells: if we disregard the types of cells then \(Z(t) = \sum _{i\ge 1} Z_i(t)\) is a Yule process with rate one (Athreya and Ney 2004). The generating function of a Yule process \(\mathcal {Z}(x,t)=\mathbb {E}(x^{Z(t)}|Z(0)=1)\) can be obtained from the backward Kolmogorov equation \(\partial _t \mathcal {Z} = \mathcal {Z}(\mathcal {Z}-1)\) with \(\mathcal {Z}(x,0)=x\), which leads to

$$\begin{aligned} \mathcal {Z}(x,t) = \frac{x}{x+(1-x)e^t}. \end{aligned}$$

By expanding this around \(x=0\), we obtain the mass function of the total number of cells in the infinite type process with a single initial type-1 cell

$$\begin{aligned} \Pi _s(t) = \mathbb {P}(Z(t)=s|Z_i(0)=\delta _{i1}) = \frac{1}{s!} \partial _x^s \mathcal {Z}(0,t) = e^{-t}\big (1-e^{-t}\big )^{s-1}. \end{aligned}$$

The mean number of cells grows exponentially as \(\mathbb {E}Z(t)=\partial _x \mathcal {Z}(1,t)=e^t.\) For finite multi-type critical branching processes, \(\Pi _s(t)\) is not known in general. This is addressed in the next sections.
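The closed form for the Yule process can be verified by integrating the backward equation numerically. The following sketch assumes a classical fourth-order Runge–Kutta scheme with a fixed step, chosen here only for illustration.

```python
import math

def yule_gf(x, t):
    """Closed-form generating function of the rate-one Yule process."""
    return x / (x + (1.0 - x) * math.exp(t))

def yule_gf_rk4(x, t, steps=2000):
    """Integrate dZ/dt = Z (Z - 1), Z(0) = x, with classical RK4."""
    def f(z):
        return z * (z - 1.0)
    h = t / steps
    z = x
    for _ in range(steps):
        k1 = f(z)
        k2 = f(z + 0.5 * h * k1)
        k3 = f(z + 0.5 * h * k2)
        k4 = f(z + h * k3)
        z += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return z

def yule_pmf(s, t):
    """Pi_s(t) = e^{-t} (1 - e^{-t})^{s-1} for s >= 1."""
    return math.exp(-t) * (1.0 - math.exp(-t)) ** (s - 1)

def yule_mean(t, s_max=5000):
    """Mean total number of cells, from a truncated sum over the mass function."""
    return sum(s * yule_pmf(s, t) for s in range(1, s_max + 1))
```

The two generating functions should agree to high accuracy, and `yule_mean(t)` should reproduce \(e^t\) as long as `s_max` is much larger than \(e^t\).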

4 Two types

Consider now the simplest two-type critical branching process, which models a population in which a maximum of two mutations can accumulate. This could be interpreted as a population of diploid cells with a single locus for a lethal mutation, in which double allelic mutation is required to acquire the lethal phenotype. The two-type critical birth–death process is represented by

(9)

where all steps occur at equal rates; we set these rates to unity for simplicity. The general case with death is considered in Appendix A. While the single type birth–death process is easily soluble, the two-type birth–death process reduces to generally unsolvable Riccati equations, which were recently shown to be solvable for certain birth–death models with general rates (Antal and Krapivsky 2010, 2011). For the critical case, the situation slightly simplifies and the results are more explicit. Although the analytic solution for the two-type critical case was published in Antal and Krapivsky (2011), here we re-derive the solution in more detail. These exact results will be useful for checking the validity of the asymptotic results we derive in Sect. 5 for general n, which is the only available approach for \(n\ge 3\).

Since the process is decomposable, the generating function for type-1 cells is the same as in the single type process given in (3). The generating function for both cell types, starting with a single initial type-i cell is

$$\begin{aligned} \mathcal {Z}_{i,2}(x_1,x_2,t) = \mathbb {E} (x_1^{Z_1(t)} x_2^{Z_2(t)} | Z_j(0) = \delta _{ij}) \end{aligned}$$
(10)

for \(i=1, 2\), and hence the initial conditions are

$$\begin{aligned} \mathcal {Z}_{1,2}(x_1,x_2, 0)&= x_1\end{aligned}$$
(11a)
$$\begin{aligned} \mathcal {Z}_{2,2}(x_1,x_2, 0)&= x_2. \end{aligned}$$
(11b)

A convenient setting for studying the two-type branching process is provided by the backward Kolmogorov equations

$$\begin{aligned} \partial _t \mathcal {Z}_{1,2}&= \mathcal {Z}_{1,2}^2+\mathcal {Z}_{2,2}-2\mathcal {Z}_{1,2} \end{aligned}$$
(12a)
$$\begin{aligned} \partial _t \mathcal {Z}_{2,2}&= \mathcal {Z}_{2,2}^2 + 1 - 2\mathcal {Z}_{2,2}. \end{aligned}$$
(12b)

Let \(\sigma \) be an operator that increases the indices of a function by one,

$$\begin{aligned} \sigma (\mathcal {F}_{i,j}(x_i,x_j))=\mathcal {F}_{i+1,j+1}(x_{i+1},x_{j+1}) . \end{aligned}$$
(13)

Note that in the general case (see Appendix E) \(\sigma \) increases the indices of the rates \(\alpha _i, \beta _i, \nu _i\) too. Since we are considering decomposable processes, one can easily see that \(\mathcal {Z}_{i+1,j+1}=\sigma (\mathcal {Z}_{i,j})\), namely the j-type process starting with a single type-i cell is the same as the \((j+1)\)-type process starting with a single type-\((i+1)\) cell, up to a change in indices. In particular, \(\mathcal {Z}_{2,2}= \sigma (\mathcal {Z}_{1,1})\), where \(\mathcal {Z}_{1,1}\) is the generating function of the single type process, given in (3).

4.1 Survival probabilities

Before solving Eqs. (12a)–(12b), we consider the simpler task of finding the survival probability of the system, where ‘survival’ refers to the situation when there are alive cells of any type at time t,

$$\begin{aligned} S_{i,2}(t) = \mathbb {P}(Z_1(t)+Z_2(t)>0| Z_k(0)=\delta _{i,k}). \end{aligned}$$

The survival probabilities are related to the corresponding generating functions for type-2 cells, \(S_{1,2}(t)=1-\mathcal {Z}_{1,2}(0,0,t)\) and \(S_{2,2}(t)=1-\mathcal {Z}_{2,2}(0,0,t)\). Hence from Eqs. (12a)–(12b) we deduce that the survival probabilities \(S_{1,2}(t)\) and \(S_{2,2}(t)\) evolve according to rate equations

$$\begin{aligned} \frac{d S_{1,2}}{dt}&=S_{2,2}-S_{1,2}^2 \end{aligned}$$
(14a)
$$\begin{aligned} \frac{d S_{2,2}}{dt}&= -S_{2,2}^2, \end{aligned}$$
(14b)

with initial conditions \(S_{1,2}(0)=S_{2,2}(0)=1\). Note that

$$\begin{aligned} S_{2,2} = \sigma (S_{1,1}) =S_{1,1} = \frac{1}{1+t} \end{aligned}$$
(15)

is the survival of the single type process, given by (5). The governing Eq. (14a) therefore becomes an initial value problem

$$\begin{aligned} \frac{d S_{1,2}}{dt} = \frac{1}{1+t} - S_{1,2}^2, \quad S_{1,2}(0)=1. \end{aligned}$$
(16)

This is a soluble Riccati equation. We first simplify the non-homogeneous term by using the new time variable \(\tau =\sqrt{1+t}\), which leads to

$$\begin{aligned} \frac{d S_{1,2}}{d\tau } = \frac{2}{\tau } - 2\tau S_{1,2}^2. \end{aligned}$$

We transform this into a second-order linear differential equation by setting \(S_{1,2}(\tau ) = \frac{1}{2\tau }\frac{B'(\tau )}{B(\tau )}\) to get

$$\begin{aligned} B''(\tau ) -\frac{B'(\tau )}{\tau } - 4B(\tau ) = 0. \end{aligned}$$

Finally, we let \(s=2\tau \) and write \(B(\tau )=\frac{s}{2}A(s)\) to obtain the standard differential equation for modified Bessel functions (NIST 2023)

$$\begin{aligned} s^2A''(s)+sA'(s) - (1+s^2)A(s) = 0. \end{aligned}$$

Hence the solution is a linear combination of these functions

$$\begin{aligned} A(s) = c_1 I_1(s) + c_2 K_1(s), \end{aligned}$$

where \(I_p\) and \(K_p\) are the modified Bessel functions of order p of the first and second kind, respectively. Making the previous substitutions backward, we arrive at the solution of (16)

$$\begin{aligned} S_{1,2} = \frac{1}{\tau }\,\frac{I_0(2\tau )-cK_0(2\tau )}{I_1(2\tau ) + cK_1(2\tau )},\quad \tau =\sqrt{1+t}. \end{aligned}$$
(17)

The initial condition \(S_{1,2}(t=0)=1\) fixes the amplitude

$$\begin{aligned} c = \frac{I_0(2)-I_1(2)}{K_0(2) + K_1(2)}. \end{aligned}$$
(18)
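The exact solution (17)–(18) can be cross-checked against a direct numerical integration of the Riccati equation (16). The sketch below is self-contained and assumes the standard integral representations \(I_n(x)=\frac{1}{\pi }\int _0^\pi e^{x\cos u}\cos (nu)\,du\) (integer n) and \(K_n(x)=\int _0^\infty e^{-x\cosh u}\cosh (nu)\,du\), evaluated with the trapezoidal rule; in practice a special-function library would normally be used instead.

```python
import math

def bessel_i(n, x, m=2000):
    """I_n(x), integer n, from (1/pi) * int_0^pi exp(x cos u) cos(n u) du."""
    h = math.pi / m
    s = 0.5 * (math.exp(x) + math.exp(-x) * math.cos(n * math.pi))
    for k in range(1, m):
        u = k * h
        s += math.exp(x * math.cos(u)) * math.cos(n * u)
    return s * h / math.pi

def bessel_k(n, x, h=0.01, u_max=12.0):
    """K_n(x), x > 0, from int_0^inf exp(-x cosh u) cosh(n u) du."""
    s = 0.5 * math.exp(-x)
    k = 1
    while k * h <= u_max:
        u = k * h
        s += math.exp(-x * math.cosh(u)) * math.cosh(n * u)
        k += 1
    return s * h

# Amplitude c from Eq. (18)
C = (bessel_i(0, 2.0) - bessel_i(1, 2.0)) / (bessel_k(0, 2.0) + bessel_k(1, 2.0))

def s12_exact(t):
    """Survival probability S_{1,2}(t) from Eq. (17)."""
    tau = math.sqrt(1.0 + t)
    num = bessel_i(0, 2 * tau) - C * bessel_k(0, 2 * tau)
    den = bessel_i(1, 2 * tau) + C * bessel_k(1, 2 * tau)
    return num / (tau * den)

def s12_rk4(t, steps=5000):
    """Integrate Eq. (16), dS/dt = 1/(1+t) - S^2 with S(0) = 1, by RK4."""
    def f(u, s):
        return 1.0 / (1.0 + u) - s * s
    h = t / steps
    u, s = 0.0, 1.0
    for _ in range(steps):
        k1 = f(u, s)
        k2 = f(u + 0.5 * h, s + 0.5 * h * k1)
        k3 = f(u + 0.5 * h, s + 0.5 * h * k2)
        k4 = f(u + h, s + h * k3)
        s += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        u += h
    return s
```

Under these assumptions, `s12_exact(t)` and `s12_rk4(t)` should agree closely for moderate t. For very large t the exponential growth of \(I_n\) limits the range of this naive implementation.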

4.2 Generating functions

We now return to the generating functions \(\mathcal {Z}_{1,2}\) and \(\mathcal {Z}_{2,2}\). A full solution of Eqs. (12a)–(12b) subject to the initial conditions (11a)–(11b) is obtained in a similar way as the above solution of Eqs. (14a)–(14b). First, notice that the solution of (12b) is just the generating function (3) of a single type but with the indices increased by one

$$\begin{aligned} \mathcal {Z}_{2,2} =\sigma (\mathcal {Z}_{1,1}) = \frac{t(1-x_2)+x_2}{t(1-x_2)+1}. \end{aligned}$$
(19)

Plugging this into (12a) we recast it into a Riccati equation

$$\begin{aligned} \frac{d (1-\mathcal {Z}_{1,2})}{dt} = \frac{1}{(1-x_2)^{-1}+t} -(1-\mathcal {Z}_{1,2})^2. \end{aligned}$$
(20)

Comparing (20) and (16) we see that \(1-\mathcal {Z}_{1,2}\) satisfies the same equation as \(S_{1,2}\), the only distinction is in the shift of the time variable in the first term on the right-hand sides. Thus we use the variable

$$\begin{aligned} \tau = \sqrt{t+(1-x_2)^{-1}}. \end{aligned}$$
(21)

and then the solution becomes

$$\begin{aligned} \mathcal {Z}_{1,2} = 1-\frac{1}{\tau }\, \frac{I_0(2\tau )-cK_0(2\tau )}{I_1(2\tau )+cK_1(2\tau )}. \end{aligned}$$
(22)

The initial condition \( \mathcal {Z}_{1,2}(x_1,x_2, 0) = x_1\) fixes the amplitude

$$\begin{aligned} c = \frac{I_0(2\tau _0)-(1-x_1)\tau _0 I_1(2\tau _0)}{K_0(2\tau _0)+(1-x_1)\tau _0 K_1(2\tau _0)}, \quad \tau _0=(1-x_2)^{-1/2}. \end{aligned}$$
(23)

For \(x_1=x_2=0\) we recover the survival probabilities (17).

Having a full solution (22)–(23) for the generating function, we can extract the probability of finding a type-1 cells and b type-2 cells, \(P_{a,b}(t)\), by using Cauchy’s integral formula:

$$\begin{aligned} P_{a,b}(t) = \frac{1}{(2\pi i)^2} \oint \frac{dx_1}{x_1^{a+1}}\oint \frac{dx_2}{x_2^{b+1}}\, \mathcal {Z}_{1,2}(x_1,x_2,t). \end{aligned}$$
(24)
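Contour integrals of this kind can be evaluated numerically with the trapezoidal rule on a circle inside the unit disc. The sketch below illustrates the idea on the one-dimensional single-type generating function (3), for which Eq. (4) provides the exact answer; it is not the reduction used in the appendices, and the radius and number of nodes are illustrative assumptions. The double integral (24) would nest two such sums.

```python
import cmath

def z11(x, t):
    """Single-type generating function, Eq. (3); also valid for complex x."""
    return (t * (1 - x) + x) / (t * (1 - x) + 1)

def coefficient(gf, a, radius=0.5, m=256):
    """Coefficient of x^a in gf(x) from the trapezoidal rule on |x| = radius.

    With x = radius * exp(i * theta), Cauchy's formula becomes the average
    of gf(x) * x**(-a) over the circle.
    """
    total = 0j
    for k in range(m):
        x = radius * cmath.exp(2j * cmath.pi * k / m)
        total += gf(x) * x ** (-a)
    return (total / m).real

def p_a_numeric(a, t):
    """P_a(t) recovered numerically from the generating function."""
    return coefficient(lambda x: z11(x, t), a)
```

For instance, `p_a_numeric(3, 2.0)` should reproduce \(\frac{1}{9}\left( \frac{2}{3}\right) ^2\) from Eq. (4). The radius must stay below the pole of the generating function at \(x=(1+t)/t\).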

Unfortunately, explicit solutions for \(P_{a,b}(t)\) are not available for arbitrary a and b, although one can reduce the number of integrations in Eq. (24) to one, and use numerical methods to obtain values for arbitrary a and b, see Appendices B and C. Fortunately, in many applications, a less detailed description suffices. For instance, one may be interested in the marginal distribution of type-2 cells,

$$\begin{aligned} P^{(2)}_{s}(t) = \mathbb {P}(Z_2(t)=s) \end{aligned}$$

or the total population size of the two-type process

$$\begin{aligned} \Pi ^{(2)}_s(t)=\mathbb {P}(Z_1(t)+Z_2(t)=s). \end{aligned}$$

These are encoded in the generating functions

$$\begin{aligned} \mathcal {Z}_{1,2}(1,x,t)&=\mathbb {E}(x^{Z_2(t)}|Z_j(0)=\delta _{1j}) = \sum _{s\ge 0} P^{(2)}_s(t)x^s \end{aligned}$$
(25)
$$\begin{aligned} \mathcal {Z}_{1,2}(x,x,t)&=\mathbb {E}(x^{Z_1(t)+Z_2(t)}|Z_j(0)=\delta _{1j}) = \sum _{s\ge 0} \Pi ^{(2)}_s(t)x^s . \end{aligned}$$
(26)

In the next section, we derive the asymptotic behaviour of both distributions.

5 n types

In this section, we study the \(n\)-type critical birth–death process. This models a population of cells that mutate at the limiting mutation rate until they reach a maximal number of mutations n. This is represented by scheme (1).

The scheme describes a decomposable critical process in which each type can only be produced by the preceding type, and all types divide or mutate at rate one, except the maximal type-n cells, which cannot mutate but die at rate one. The results are easily generalized to cases including death and multiple mutation rates; the derivations can be found in Appendix E.

The naïve approach to the problem is to study \(\mathbb {E}Z_i(t)\). The forward equations are \(\frac{d}{dt} \mathbb {E}Z_1(t)=0\) and

$$\begin{aligned} \frac{d}{dt} \mathbb {E}Z_i(t)= \mathbb {E}Z_{i-1}(t) \end{aligned}$$

for \(i=2,\dots n\), with initial condition \(\mathbb {E}Z_i(0)=\delta _{i1}\). The solution is

$$\begin{aligned} \mathbb {E}Z_i(t) = \frac{t^{i-1}}{(i-1)!}. \end{aligned}$$
(27)

For the infinite type case (\(n\rightarrow \infty \)) we recover the Yule process of Sect. 3 for the total number of cells \(\mathbb {E}Z(t) = \sum _{i\ge 1} \mathbb {E}Z_i(t) = e^t\). For a finite number of types, \(Z(t)=\sum _{i=1}^n Z_i(t)\), and the mean total number of cells grows exponentially \(\mathbb {E}Z(t)\approx e^t\) for small t but algebraically as \(t^{n-1}\) when \(t\rightarrow \infty \). Thus, even though the process is critical and dies out at a finite time with probability one, \(\mathbb {E}Z(t)\) keeps growing forever, and therefore the naïve approach fails to capture one important aspect of the system. Hence instead of working with the mean number of cells, we derive asymptotic solutions for large time limits following a similar strategy introduced for the single type process.
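The behaviour of the means can be reproduced numerically. The following sketch integrates the forward equations for \(\mathbb {E}Z_i(t)\) with a fixed-step Runge–Kutta scheme (an illustrative choice) and compares the result with Eq. (27).

```python
import math

def mean_exact(i, t):
    """E Z_i(t) = t^(i-1) / (i-1)!, Eq. (27)."""
    return t ** (i - 1) / math.factorial(i - 1)

def means_rk4(n, t, steps=2000):
    """Integrate dm_1/dt = 0, dm_i/dt = m_{i-1}, with m_i(0) = delta_{i1}."""
    def rhs(m):
        return [0.0] + m[:-1]
    h = t / steps
    m = [1.0] + [0.0] * (n - 1)
    for _ in range(steps):
        k1 = rhs(m)
        k2 = rhs([a + 0.5 * h * b for a, b in zip(m, k1)])
        k3 = rhs([a + 0.5 * h * b for a, b in zip(m, k2)])
        k4 = rhs([a + h * b for a, b in zip(m, k3)])
        m = [a + h * (b + 2 * c + 2 * d + e) / 6.0
             for a, b, c, d, e in zip(m, k1, k2, k3, k4)]
    return m

def mean_total(n, t):
    """Mean total population of the n-type process."""
    return sum(mean_exact(i, t) for i in range(1, n + 1))
```

With n = 4, `mean_total(4, 0.1)` is close to \(e^{0.1}\), whereas `mean_total(4, 100.0)` is dominated by the last term \(t^{3}/3!\).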

The generating function for the n-type cell process starting with a single initial type-i cell is

$$\begin{aligned} \mathcal {Z}_{i,n}(x_1,x_2,\dots x_n,t) = \mathbb {E} \left( \prod _{j=1}^n x_j^{Z_j(t)} | Z_k(0) = \delta _{ik}\right) \end{aligned}$$
(28)

which can be obtained by solving the backward Kolmogorov equations

$$\begin{aligned} \partial _t \mathcal {Z}_{i,n }={\left\{ \begin{array}{ll} \mathcal {Z}_{i,n}^2+\mathcal {Z}_{i+1,n}-2\mathcal {Z}_{i,n} & \text {for}\quad {i=1,\dots ,n-1}\\ \mathcal {Z}_{i,n}^2 + 1 - 2\mathcal {Z}_{i,n} & \text {for}\quad {i=n}. \end{array}\right. } \end{aligned}$$
(29)

Thus, in order to obtain the interesting generating function \(\mathcal {Z}_{1,n}\) for the n-type process starting from a single type-1 cell, we need to first solve the \(n-1\) equations for \(\mathcal {Z}_{n,n}, \mathcal {Z}_{n-1,n}, \dots , \mathcal {Z}_{2,n}\). However, since the system is decomposable, we have that \(\mathcal {Z}_{i+1,j+1}=\sigma (\mathcal {Z}_{i,j})\), where \(\sigma ()\) is the index increase operator (13). Thus, if we know the generating function for type-k cells, solving the system for \(k+1\) involves solving a single additional equation. For example, for the 3-type process, \(\mathcal {Z}_{3,3}=\sigma (\mathcal {Z}_{2,2})\) and \(\mathcal {Z}_{2,3}=\sigma (\mathcal {Z}_{1,2})\), which we solved in the 2-type process, and thus we only need to solve one more equation in order to obtain \(\mathcal {Z}_{1,3}\). The same applies to the survival probabilities, which we consider in the following section.

5.1 Survival probabilities

As seen in Sect. 4, it is simpler to solve the system for the survival probabilities

$$\begin{aligned} S_{i,n}(t) = \mathbb {P}(Z_1(t)+\dots +Z_n(t)>0| Z_k(0)=\delta _{i,k}) = 1-\mathcal {Z}_{i,n}(0,0,\dots ,0,t). \end{aligned}$$
(30)

Substituting into (29), we deduce that

$$\begin{aligned} \frac{d S_{i,n}}{dt} ={\left\{ \begin{array}{ll}S_{i+1,n}-S_{i,n}^2 \quad & \text {for}\quad {i=1,\dots ,n-1},\\ -S_{n,n}^2 \quad & \text {for}\quad {i=n},\end{array}\right. } \end{aligned}$$
(31)

with initial conditions \(S_{i,n}(0)=1\) for all \(i=1,\dots ,n\) (see Fig. 2 for exact and numerical solutions). For any n, the solutions starting with a single type-n cell are given by the solutions of the single-type system up to a shift in indices,

$$\begin{aligned}S_{n,n}=\sigma (S_{n-1,n-1})=\dots = \sigma (S_{2,2})=\sigma (S_{1,1})=1/(1+t).\end{aligned}$$

To get \(S_{n-1,n}\), we need to solve

$$\begin{aligned} \frac{d S_{n-1,n}}{dt} =\frac{1}{1+t}-{S^2_{n-1,n}}. \end{aligned}$$
(32)

We have solved this case in Sect. 4.1 exactly, where we specified \(n=2\), but the solution’s form is the same for all n. Even though analytic solutions are invaluable, the most interesting long-time asymptotic behavior can be extracted directly from the above equation using standard methods (Bender and Orszag 2013), circumventing the exact solutions. Assume that asymptotically \(S_{n-1,n}\sim (1+t)^{-\alpha }\), with \(\alpha >0\), where \(f(x)\sim g(x)\) means \(f(x)/g(x)\rightarrow 1\) as \(x\rightarrow \infty \). Substituting into (32) we get

$$\begin{aligned} -\alpha (1+t)^{-\alpha -1} + \dots = (1+t)^{-1} - (1+t)^{-2\alpha } + \dots \end{aligned}$$

where the dots refer to smaller order terms. One needs to match the coefficients of the leading order terms on the two sides, which leads to a contradiction for both \(2\alpha >1\) and \(2\alpha <1\), thus \(2\alpha =1\). That is, in the leading order, the right-hand side of (32) should vanish faster than \(1/(1+t)\). This gives

$$\begin{aligned} S_{n-1,n} \sim (1+t)^{-1/2}. \end{aligned}$$
(33)

To get to higher orders we just add new terms to (33) of the form \(b{(1+t)}^{-\beta }\) one by one with larger values of \(\beta \) each time, substitute into (32) and match the coefficients. For the first 3 terms in the \(t\rightarrow \infty \) asymptotic limit we get

$$\begin{aligned} S_{n-1,n} \sim (1+t)^{-1/2}+\tfrac{1}{4}(1+t)^{-1}+\tfrac{3}{32}(1+t)^{-3/2}. \end{aligned}$$
(34)

Here \(f(x)\sim g(x)+h(x)\) means \([f(x)-g(x)]/h(x)\rightarrow 1\) as \(x\rightarrow \infty \). Note that we performed the expansion in powers of \(1+t\) for convenience, but one could simply expand the result in powers of t instead. This expansion can also be derived from the exact formula (17) using the large argument asymptotic expansion of the Bessel functions (Appendix D).
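The coefficient matching that leads to (34) can be reproduced with exact rational arithmetic. The sketch below writes \(u=1+t\), uses the ansatz \(S=u^{-1/2}+b\,u^{-1}+c\,u^{-3/2}\), and then checks numerically that the residual of (32) is of the next order, \(u^{-5/2}\); the variable names are illustrative.

```python
from fractions import Fraction

# Substituting S = u^(-1/2) + b u^(-1) + c u^(-3/2) into dS/dt = 1/u - S^2:
#   order u^(-3/2):  -1/2 = -2 b
#   order u^(-2):    -b   = -(b^2 + 2 c)
b = Fraction(1, 2) / 2
c = (b - b * b) / 2

def s_approx(u):
    """Three-term expansion (34) as a function of u = 1 + t."""
    return u ** -0.5 + float(b) / u + float(c) * u ** -1.5

def residual(u):
    """dS/dt - (1/(1+t) - S^2) evaluated on the three-term ansatz."""
    ds = -0.5 * u ** -1.5 - float(b) * u ** -2 - 1.5 * float(c) * u ** -2.5
    return ds - (1.0 / u - s_approx(u) ** 2)
```

The values `b` and `c` come out as 1/4 and 3/32, and `residual(u) * u**2.5` tends to a constant for large u, confirming that the first three orders cancel.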

Continuing to the next type, we have the differential equation for \(S_{n-2,n}\) in (31) where \(S_{n-1,n}\) is replaced by its above expansion. This procedure gives the leading order for all types by setting the time derivative to zero. In the leading order, we get that \(S_{n-j,n}\sim (1+t)^{-\chi _{j+1}}\) with \(\chi _j=2^{1-j}\). With a little more work, one obtains the next correction

$$\begin{aligned} S_{n-j,n}\sim {(1+t)}^{-\chi _{j+1}}+ 2^{-j-1}{(1+t)}^{-\tfrac{1}{2}-\chi _{j+1}} \end{aligned}$$

for \(j=1,\dots ,n-1\). The most interesting survival probability is \(S_{1,n}\), which gives the probability that the n-type system starting with a single type-1 cell is not empty at time t. Asymptotically, this is given by

$$\begin{aligned} S_{1,n}\sim (1+t)^{-\chi _n}+\frac{\chi _n}{2}{(1+t)}^{-\tfrac{1}{2}-\chi _n} \end{aligned}$$
(35)

for \(n\ge 2\), and for \(n=1\) the first term is exact. One can consider higher-order terms by successively adding terms and matching coefficients. Figure 3 shows that the second-order asymptotic accurately describes the behaviour.
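The asymptotic formula (35) can be compared with a direct numerical solution of the system (31). The following is a minimal sketch, assuming a fixed-step fourth-order Runge–Kutta integrator; the step size `dt` is an illustrative choice rather than a tuned value.

```python
def survival_rk4(n, t_end, dt=0.01):
    """Integrate Eq. (31) from S_{i,n}(0) = 1.

    Returns the list [S_{1,n}, ..., S_{n,n}] evaluated at t_end.
    """
    def rhs(s):
        out = [s[i + 1] - s[i] ** 2 for i in range(n - 1)]
        out.append(-s[n - 1] ** 2)          # last type: dS/dt = -S^2
        return out

    def shifted(s, k, c):
        return [a + c * b for a, b in zip(s, k)]

    steps = max(1, round(t_end / dt))
    h = t_end / steps
    s = [1.0] * n
    for _ in range(steps):
        k1 = rhs(s)
        k2 = rhs(shifted(s, k1, 0.5 * h))
        k3 = rhs(shifted(s, k2, 0.5 * h))
        k4 = rhs(shifted(s, k3, h))
        s = [a + h * (b + 2 * c + 2 * d + e) / 6.0
             for a, b, c, d, e in zip(s, k1, k2, k3, k4)]
    return s

def s1n_asymptotic(n, t):
    """Two-term expansion (35), intended for n >= 2."""
    chi = 2.0 ** (1 - n)
    u = 1.0 + t
    return u ** -chi + 0.5 * chi * u ** (-0.5 - chi)
```

For example, `survival_rk4(3, 200.0)[0]` and `s1n_asymptotic(3, 200.0)` should differ by well under one percent, with the agreement improving as t increases.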

Fig. 2

Survival probabilities of the n-type critical process with initial condition \(Z_1(0)=1\). We plot the survival probability of the entire system, \(S_{1,n}(t)\) (lines) and that of just type-n cells, \(Q_{1,n}(t)\) (dashed-dots lines). Curves are for different types \(n=1,\dots ,5\). For type-1 and type-2 cells, the solutions are exact from (3) and (22), respectively. For other \(n>2\), solutions are obtained numerically from (31) with different initial conditions

Fig. 3

Survival probability of the n-type process, for \(n=1,2,3,4,8\). Second-order asymptotic solutions obtained from (35) (lines) are plotted together with results from simulations (dots)

5.2 Generating functions

We now attempt to find solutions for the generating functions for the number of cells. We rewrite the Kolmogorov equations (29) into

$$\begin{aligned} \frac{d (1 - \mathcal {Z}_{i,n})}{dt} ={\left\{ \begin{array}{ll} (1 - \mathcal {Z}_{i+1,n})-(1 - \mathcal {Z}_{i,n})^2 \quad & \text {for}\quad {i=1,\dots ,n-1},\\ -(1 - \mathcal {Z}_{i,n})^2 \quad & \text {for}\quad {i=n},\end{array}\right. } \end{aligned}$$
(36)

which for \(1-\mathcal {Z}_{{j,n}}\) are identical to equations for the survival probability (31), but with initial conditions \(1-\mathcal {Z}_{j,n}(x_1,\dots ,x_n,0) = 1-x_j\). This is not surprising since the survival probability is \(S_{i,n}(t) = 1 - \mathcal {Z}_{i,n}(0,\dots 0,t)\).

We again treat this system by first considering the generating function for the system starting with a single type-n cell, which corresponds to the single type case. Hence

$$\begin{aligned} 1 - \mathcal {Z}_{n,n} = \frac{1}{t+ (1-x_n)^{-1}}, \end{aligned}$$

which is the same as if we replace \(1+t\) in \(S_{n,n}\) by \(t+ (1-x_n)^{-1}\). Starting with a single type-\((n-1)\) cell we have

$$\begin{aligned} \frac{d (1 - \mathcal {Z}_{n-1,n})}{dt} = \frac{1}{t+ (1-x_n)^{-1}}-(1 - \mathcal {Z}_{n-1,n})^2. \end{aligned}$$
(37)

To get a nontrivial large-time behaviour we need \((1-x_n)^{-1}\propto t\). Noting the similarity between (37) and (32), we assume a power-law behaviour for \(1-\mathcal {Z}_{n-1,n}\), which fixes the exponent and leads to

$$\begin{aligned} 1-\mathcal {Z}_{n-1,n} \sim {\left( {t+ (1-x_n)^{-1}}\right) }^{-1/2}. \end{aligned}$$

The same result can be also obtained directly from the explicit solution (22) for the generating function of type-2 cells \(\mathcal {Z}_{1,2}(1,x_2,t)\) by taking the large argument asymptotic of the modified Bessel functions (Antal and Krapivsky 2011).

Continuing this procedure we get that, to the leading order,

$$\begin{aligned} 1-\mathcal {Z}_{n-2,n} \sim {\left( {t+ (1-x_n)^{-1}}\right) }^{-1/4}, \end{aligned}$$

and so on. In particular, the generating function starting with a single type-1 cell is given by

$$\begin{aligned} 1-\mathcal {Z}_{1,n} \sim {\left( {t+ (1-x_n)^{-1}}\right) }^{-\chi _n}. \end{aligned}$$
(38)

Thus we see that the same exponents, \(\chi _n=2^{1-n}\), that describe the survival probabilities give the asymptotic behaviour of the generating functions. Note that in the leading order, the generating functions only depend on the last type of cells \(x_n\). This is not surprising if we note that, asymptotically, the survival probability of type-k cells is the square root of the survival probability of type-\((k-1)\) cells. Thus the system becomes dominated by the last type. That is, the generating function for the last type is asymptotically the same as the generating function for the total number of cells,

$$\begin{aligned} \mathcal {Z}_{1,n}(1,\dots ,1,x_n,t)\sim \mathcal {Z}_{1,n}(x_n,\dots ,x_n,t). \end{aligned}$$
(39)

If we take (39) at \(x_n=0\) we see that the survival probability of just the type-n cells

$$\begin{aligned} Q_{i,n}(t) = \mathbb {P}(Z_n(t)>0 | Z_k(0)=\delta _{i,k}) = 1-\mathcal {Z}_{i,n}(1,\dots ,1,0,t) \end{aligned}$$
(40)

is asymptotically the same as the survival probability \(S_{1,n}(t)\) of the whole system, that is

$$\begin{aligned} Q_{1,n}(t) \sim S_{1,n}(t). \end{aligned}$$
(41)

This can also be seen in Fig. 2, where we plotted exact (\(n=1,2\)) and numerical (\(n>2\)) solutions for \(S_{1,n}(t)\) and \(Q_{1,n}(t)\). It is clear from (29) that \(Q_{i,n}(t)\) is governed by the same equations as the survival probability \(S_{i,n}(t)\) of the entire system (31) but with initial conditions \(Q_{i,n}(0)=0\) for \(i=1,\dots ,n-1\) and \(Q_{n,n}(0)=1\). In Fig. 2 we see that subsequent types arise fast but disappear slowly, as the system gets dominated by the last type.
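The comparison between \(S_{1,n}\) and \(Q_{1,n}\) can be reproduced numerically with a few lines of code. The sketch below assumes that the survival equations have the same form as (36), that is \(dS_{i,n}/dt = S_{i+1,n}-S_{i,n}^2\) for \(i<n\) and \(dS_{n,n}/dt=-S_{n,n}^2\), and integrates them with a fixed-step fourth-order Runge–Kutta scheme. The integrator, step size and function names are illustrative choices, not the method used to produce Fig. 2.

```python
def rhs(y):
    """Right-hand side: dy_i/dt = y_{i+1} - y_i^2 (i < n), dy_n/dt = -y_n^2."""
    n = len(y)
    return [(y[i + 1] if i + 1 < n else 0.0) - y[i] ** 2 for i in range(n)]


def rk4(y, t_end, dt=0.01):
    """Integrate the system from time 0 to t_end with classical RK4."""
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(y)
        k2 = rhs([a + 0.5 * dt * b for a, b in zip(y, k1)])
        k3 = rhs([a + 0.5 * dt * b for a, b in zip(y, k2)])
        k4 = rhs([a + dt * b for a, b in zip(y, k3)])
        y = [a + dt * (b + 2 * c + 2 * d + e) / 6
             for a, b, c, d, e in zip(y, k1, k2, k3, k4)]
    return y


def survival_all(n, t_end):
    """[S_{1,n}, ..., S_{n,n}] at t_end: every initial condition equals 1."""
    return rk4([1.0] * n, t_end)


def survival_last(n, t_end):
    """[Q_{1,n}, ..., Q_{n,n}] at t_end: Q_{i,n}(0)=0 for i<n, Q_{n,n}(0)=1."""
    return rk4([0.0] * (n - 1) + [1.0], t_end)
```

Comparing `survival_last(3, 50.0)[0]` with `survival_all(3, 50.0)[0]` illustrates how quickly the two quantities approach each other, in line with (41).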

5.3 Cell number distributions

We now derive the asymptotic distribution of the number of type-n cells

$$\begin{aligned} P^{(n)}_s(t)= \mathbb {P}(Z_n(t)=s|Z_k(0)=\delta _{k,1}) \end{aligned}$$

which is encoded in the generating function

$$\begin{aligned} \mathcal {Z}_{1,n}(1,\dots ,1,x,t) = \mathbb {E} (x^{Z_n}|Z_k(0)=\delta _{k,1}) = \sum _{s\ge 0}P^{(n)}_s(t) x^s. \end{aligned}$$
(42)

To get non-trivial large-time behaviour, we need to condition the generating function on the survival of type-n cells. Following the procedure of Sect. 2, we express the conditional generating function as

$$\begin{aligned} \mathbb {E}(x_n^{Z_n(t)}|Z_n(t)>0) = \frac{\mathcal {Z}_{1,n}(1,\dots ,1,x_n,t)-P^{(n)}_0(t)}{1-P^{(n)}_0(t)}. \end{aligned}$$
(43)

As in the single type case, we obtain a nontrivial scaling by using the scaling variables \(x_n=1-p/t\) and \(y=s/t\) when taking \(t\rightarrow \infty \) with p and y constants. Substituting the leading order asymptotic expressions for the survival probability of type-n cells from (41) and (35)

$$\begin{aligned} 1 - P^{(n)}_0(t) = Q_{1,n}(t) \sim S_{1,n}(t) \sim t^{-\chi _n} \end{aligned}$$

and that of the generating function from (38)

$$\begin{aligned} 1-\mathcal {Z}_{1,n}(1,\dots ,1,1-p/t,t) \sim (t+t/p)^{-\chi _n}, \end{aligned}$$

we get that

$$\begin{aligned} \mathbb {E}(x_n^{Z_n(t)}|Z_n(t)>0) \rightarrow 1 - {\left( \frac{p}{p+1}\right) }^{\chi _n}.\end{aligned}$$

The same limit can be obtained in a different way, which requires the following convergence to the density of a random variable \(Y_n\)

$$\begin{aligned} t\frac{P^{(n)}_s(t)}{Q_{1,n}(t)} \sim t^{1+\chi _n} P^{(n)}_{yt}(t)\rightarrow f_{Y_n}(y) \end{aligned}$$

so the generating function becomes a Riemann integral in the \(t\rightarrow \infty \) limit:

$$\begin{aligned} \begin{aligned} \mathbb {E}(x_n^{Z_n(t)}|Z_n(t)>0)&= \sum _{s\ge 1} \frac{P^{(n)}_s(t)}{Q_{1,n}(t)} x^s = \sum _{s\ge 1} t\frac{P^{(n)}_s(t)}{Q_{1,n}(t)} {(1-p/t)}^{yt} \frac{1}{t}\\&\rightarrow \int _0^\infty f_{Y_n}(y) e^{-py} dy = \mathbb {E} e^{-pY_n} = 1-{\left( \frac{p}{p+1}\right) }^{\chi _n}. \end{aligned} \end{aligned}$$

Hence we obtained a convergence in distribution

$$\begin{aligned} \frac{Z_n(t)}{t}|\{Z_n(t)>0\} \rightarrow Y_n. \end{aligned}$$

Inverting the above Laplace transform \(\mathbb {E} e^{-pY_n}\) we express the density of the limit variable via the confluent hypergeometric function (NIST 2023)

$$\begin{aligned} f_{Y_n}(y) =\chi _n F(1 + \chi _n; 2; -y). \end{aligned}$$
(44)

Therefore, the scaling form of the type-n distribution is

$$\begin{aligned} P^{(n)}_{s}(t)\approx \chi _n t^{-1-\chi _n}F(1 + \chi _n; 2; -s/t). \end{aligned}$$
(45)

For \(n=1\), since \(\chi _1=1\) and \(F(2;2;-x)=e^{-x}\), we recover the exponential limit of the single type case given in (6).
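The limit density (44) and its Laplace transform can be checked numerically. The sketch below evaluates \(F(a;b;-y)\) through Kummer's transformation \(F(a;b;-y)=e^{-y}F(b-a;b;y)\), whose series has only non-negative terms for the parameters used here, and then integrates \(f_{Y_n}(y)e^{-py}\) with Simpson's rule. The function names, the truncation point `y_max` and the grid size are illustrative assumptions.

```python
import math


def kummer_neg(a, b, y, tol=1e-16, max_terms=10000):
    """F(a; b; -y) for y >= 0, computed as exp(-y) * F(b - a; b; y)."""
    term, total = 1.0, 1.0
    for k in range(max_terms):
        term *= (b - a + k) * y / ((b + k) * (k + 1))
        total += term
        if abs(term) <= tol * abs(total):
            break
    return math.exp(-y) * total


def limit_density(n, y):
    """f_{Y_n}(y) = chi_n * F(1 + chi_n; 2; -y), Eq. (44)."""
    c = 2.0 ** (1 - n)
    return c * kummer_neg(1.0 + c, 2.0, y)


def laplace_of_density(n, p, y_max=60.0, m=6000):
    """Simpson approximation of the integral of f_{Y_n}(y) exp(-p y) on [0, y_max]."""
    h = y_max / m
    acc = 0.0
    for j in range(m + 1):
        w = 1 if j in (0, m) else (4 if j % 2 else 2)
        y = j * h
        acc += w * limit_density(n, y) * math.exp(-p * y)
    return acc * h / 3.0
```

For \(n=1\) the routine reduces to \(e^{-y}\), and for \(n=2\), \(p=1\) the integral can be compared with \(1-(1/2)^{1/2}\).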

Finally, for \(n\ge 2\), by taking the large argument asymptotic of the confluent hypergeometric function (13.7.2 in NIST (2023)), we obtain the large \(y=s/t\) tail of the distribution:

$$\begin{aligned} P^{(n)}_{s}(t) \approx \frac{\chi _n}{\Gamma (1-\chi _n)}\, s^{-1-\chi _n} \quad \text {when}\quad t\ll s\ll t^{\tfrac{n-1}{1-\chi _n}}. \end{aligned}$$
(46)

The algebraic decay for \(n\ge 2\) is in stark contrast with the exponential decay of the first type cells, \(n=1\). Surprisingly, the tail is not only algebraic but also stationary. The algebraic tail describes the behaviour for a limited range of s values. The lower bound \(s\gg t\) comes from the large \(y=s/t\) expansion. The upper bound is more subtle, and comes from noticing that the algebraic tail would imply an infinite mean, in contradiction with the finite mean we obtained in (27). To reconcile this, we set an upper bound, \(s_*\), and estimate its order of magnitude,

$$\begin{aligned} \mathbb {E} Z_n \propto t^{n-1} \propto \int _1^{s_*} s\, s^{-1-\chi _n} ds \propto s_*^{1-\chi _n} \end{aligned}$$

which gives \(s_* \propto t^{\tfrac{n-1}{1-\chi _n}}\) as we announced.
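To make the range of validity of (46) concrete, the prefactor, the cutoff exponent and the resulting window of s values can be tabulated. This is a minimal sketch with illustrative function names; the window bounds are orders of magnitude only, since (46) holds for \(t\ll s\ll t^{(n-1)/(1-\chi _n)}\).

```python
import math


def chi(n):
    """Exponent chi_n = 2^(1-n)."""
    return 2.0 ** (1 - n)


def tail_prefactor(n):
    """chi_n / Gamma(1 - chi_n), the constant in (46); requires n >= 2."""
    if n < 2:
        raise ValueError("the algebraic tail only exists for n >= 2")
    return chi(n) / math.gamma(1.0 - chi(n))


def cutoff_exponent(n):
    """Exponent (n - 1) / (1 - chi_n) of the upper cutoff s_*."""
    if n < 2:
        raise ValueError("the algebraic tail only exists for n >= 2")
    return (n - 1) / (1.0 - chi(n))


def stationary_tail(n, s):
    """Right-hand side of (46); note that it does not depend on t."""
    return tail_prefactor(n) * s ** (-1.0 - chi(n))


def tail_window(n, t):
    """Order-of-magnitude bounds (t, t^cutoff_exponent) for the tail regime."""
    return t, t ** cutoff_exponent(n)
```

For \(n=2\) the cutoff exponent is 2, so at \(t=20\) the algebraic tail is expected roughly for \(20\ll s\ll 400\).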

The validity of the scaling limit (45) is illustrated in Fig. 4 for \(n=2\) via comparison to numerical solutions. One can see how the range of the algebraic tail expands with time. In Fig. 5, the scaled number distribution \(\chi _n^{-1} t^{1+\chi _n} P^{(n)}_s(t) = F\big (1+\chi _n; 2; - s/t\big )\) as given by (45) is compared to simulations as a function of s/t, where we chose \(t=20\). For \(n>1\) the asymptotic solution matches simulations, but eventually overestimates the probability for large s. For larger t, the asymptotic solution is better in the large s limit but underestimates the probability for small s. Within the derived range, the cell number distribution is well described by the stationary tail \((\text {size})^{-1-\chi _n}\).

We have derived the asymptotic behaviour of the last type of cells in the n-type process. Since the system is decomposable, in order to get the distribution of the previous types \(k=1,2,\dots ,n-1\), we simply need to stop the process at the kth type. In biological applications, we may also be interested in the total number of cells \(Z=Z_1+\dots +Z_n\), with distribution

$$\begin{aligned} \Pi ^{(n)}_s(t)= \mathbb {P}(Z(t)=s| Z_k(0)=\delta _{1,k}). \end{aligned}$$

This is encoded in the generating function

$$\begin{aligned} \mathcal {Z}_{1,n}(x,\dots ,x,x,t) = \sum _{s\ge 0}\Pi ^{(n)}_s(t) x^s, \end{aligned}$$
(47)

hence we see by (39) that it is asymptotically the same as the distribution of the last type of cells,

$$\begin{aligned}\Pi ^{(n)}_s(t)\sim P^{(n)}_s(t).\end{aligned}$$

Thus, the above analysis gives us access to both the asymptotic behaviour of the total number of cells, as well as the behaviour of each individual cell type.

Fig. 4

Scaling of the probability \(P_s^{(2)}(t)\) of finding s type-2 cells at time t; different lines indicate different times t. Probabilities are calculated numerically from the exact generating function (22) via the Inverse Fast Fourier Transform algorithm, see Appendix C. In the double limit \(t,\ s \rightarrow \infty \), with s/t constant, the scaled distributions converge to the scaling limit given by (45), depicted by the dashed line

Fig. 5

The scaled number distributions \(\chi _n^{-1} t^{1+\chi _n} P^{(n)}_s(t)\) from simulations (dots) are compared to the theoretical prediction \(F(1+\chi _n; 2; -s/t)\), cf. Eq. (45), as a function of s/t, where we fix \(t=20\). Curves are for \(n=1,2,3,4\). Note the power law decay for \(n\ge 2\) as opposed to the exponential decay for the single type case (\(n=1\))

5.4 Arrival and exit times

Let us study when new cell types appear and disappear from the system. It is easier to work with the infinite type version of the model for this question; otherwise we need to assume that the type in question is not greater than n. For convenience, we study when type-n cells appear or disappear, but the results are of course valid for any type, not only the last.

For the pure birth-mutation process, the exit time of types is straightforward. Let

$$\begin{aligned} E_n = \sup \{ t\ge 0: Z_n(t)>0 \} \end{aligned}$$

denote the time of extinction of type-n cells. For the birth-mutation process, notice that

$$\begin{aligned} \{E_n>t\} = \{Z_1(t)+\dots +Z_n(t)>0\}. \end{aligned}$$

Hence the distribution of \(E_n\) when starting from a single type-1 cell is given by the total survival of the n-type process, which we derived asymptotically in (35).

Let us now turn to the arrival time

$$\begin{aligned} T_n = \inf \{ t\ge 0: Z_n(t)>0\} \end{aligned}$$

when the first type-n cell appears. We are interested in its distribution starting with a single type-i cell

$$\begin{aligned} h_{i,n}(t) = \mathbb {P}(T_n>t|Z_k(0)=\delta _{ik}). \end{aligned}$$

To derive an equation for this quantity, consider a modified system where type-n cells neither divide nor die, but stay alive forever. Hence their generating function, when starting with a single type-n cell, stays constant, \({\mathcal {Z}}_{n,n}=x_n\), while all other equations in (29), for \(i<n\), remain the same. A type-n cell is present in the modified system at time t exactly when a type-n cell has been produced in the original system by time t. Hence \(h_{i,n}(t)={\mathcal {Z}}_{i,n}(1,\dots ,1,0,t)\), and thus

$$\begin{aligned} \frac{dh_{i,n}}{dt} = h_{i,n}^2 - 2h_{i,n} + h_{i+1, n} \end{aligned}$$
(48)

with initial condition \(h_{i,n}(0)=1\) for \(i=1,\dots ,n-1\) and \(h_{n,n}(t)\equiv 0\). In terms of \(g_{i,n}(t)=1-h_{i,n}(t)=\mathbb {P}(T_n\le t|Z_k(0)=\delta _{ik})\), Eq. (48) takes a simpler form

$$\begin{aligned} \frac{d g_{i,n}}{dt} =-g_{i,n}^2 + g_{i+1, n} \end{aligned}$$
(49)

with initial condition \(g_{i,n}(0)=0\) for \(i<n\) and \(g_{n,n}(t)\equiv 1\) for all t.

For the arrival of the first type-2 cell, the above equation becomes

$$\begin{aligned} \frac{d g_{1,2}}{dt} =-g_{1,2}^2 + 1 \end{aligned}$$
(50)

with solution

$$\begin{aligned} g_{1,2}(t) = \tanh {t}. \end{aligned}$$
(51)

Hence the first type-2 cell arrives on average at

$$\begin{aligned} \mathbb {E}T_2 = \int _0^\infty 1-g_{1,2}(t)\, dt = \log 2.\end{aligned}$$

Note also that

$$\begin{aligned} h_{1,2}(t) = 1-g_{1,2}(t) = \mathbb {P}(T_2>t|Z_k(0)=\delta _{1k}) = \frac{2}{1+e^{2t}} \end{aligned}$$
(52)

has the same form as for the analogous supercritical process (Nicholson et al. 2022).
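The mean arrival time of the first type-2 cell can be verified by integrating the tail probability (52) numerically. The sketch below uses Simpson's rule on a truncated interval; the truncation point `t_max` and the grid size are illustrative assumptions, and the truncation error is negligible because \(h_{1,2}(t)\) decays like \(2e^{-2t}\).

```python
import math


def arrival_cdf_type2(t):
    """g_{1,2}(t) = tanh(t), Eq. (51)."""
    return math.tanh(t)


def no_arrival_type2(t):
    """h_{1,2}(t) = 2 / (1 + exp(2 t)), Eq. (52)."""
    return 2.0 / (1.0 + math.exp(2.0 * t))


def mean_arrival_type2(t_max=40.0, m=40000):
    """Simpson approximation of E T_2, the integral of h_{1,2} over [0, t_max]."""
    h = t_max / m
    acc = 0.0
    for j in range(m + 1):
        w = 1 if j in (0, m) else (4 if j % 2 else 2)
        acc += w * no_arrival_type2(j * h)
    return acc * h / 3.0
```

The value returned by `mean_arrival_type2()` can be compared with \(\log 2\approx 0.693\).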

Now we can use this solution for the arrival of type-3 cells, by noting that \(g_{i +1,n}=\sigma ( g_{i,n-1})\) and we get

$$\begin{aligned} \frac{d g_{1,3}}{dt} = -g_{1,3}^2 + \tanh {t}. \end{aligned}$$
(53)

The solution of (53) can be expressed as a lengthy combination of hypergeometric functions. For subsequent types, no exact solutions are available.

Let us study instead the more general birth–death process described by scheme (2). The method presented above for the birth-mutation process remains valid. With division rates \(\alpha _i=1\) and a constant mutation rate \(\nu \) for all types, Eq. (49) becomes

$$\begin{aligned} \frac{d g_{i,n}}{dt} = - g_{i,n}^2 + \nu g_{i+1, n} \end{aligned}$$
(54)

with initial conditions \(g_{i,n}(0)=0\) and \(g_{n,n}\equiv 1\). The simplest non-trivial property of this process is the probability that the first type-n mutant eventually arrives starting from a single type-1 cell,

$$\begin{aligned} g_{1,n}(\infty ):=\lim _{t\rightarrow \infty } g_{1,n}(t). \end{aligned}$$

For the simple birth-mutation process (\(\nu =1\)) this quantity is trivial: \(g_{1,n}(\infty )=1\) for all n, namely all types arrive eventually with probability 1. This is not the case in the more general birth–death case where, by setting the left-hand side of (54) to zero, we obtain that

$$\begin{aligned} g_{1,n}(\infty )=\nu ^{1-\chi _n}. \end{aligned}$$

If we denote by M the maximal type that ever appears, then what we found is that \(\mathbb {P}(M\ge n)=g_{1,n}(\infty )=\nu ^{1-\chi _n}\). It gets easier for larger types to appear, in the sense that \(\mathbb {P}(M\ge n+1|M\ge n) = \nu ^{2^{-n}}\rightarrow 1\) as \(n\rightarrow \infty \). This also implies that the mean number of types that ever appear is infinite, \(\mathbb {E}M=\infty \).
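The closed form for \(g_{1,n}(\infty )\) follows from back-substituting the stationary condition of (54), \(g_{i,n}(\infty )^2=\nu \,g_{i+1,n}(\infty )\), starting from \(g_{n,n}\equiv 1\). A minimal sketch of this recursion, with illustrative function names, is:

```python
import math


def arrival_probability(n, nu):
    """g_{1,n}(infinity), obtained by iterating g_i = sqrt(nu * g_{i+1}) from g_n = 1."""
    if n < 1 or not 0.0 < nu <= 1.0:
        raise ValueError("need n >= 1 and 0 < nu <= 1")
    g = 1.0
    for _ in range(n - 1):
        g = math.sqrt(nu * g)
    return g


def conditional_next_type(n, nu):
    """P(M >= n+1 | M >= n) = g_{1,n+1}(infinity) / g_{1,n}(infinity)."""
    return arrival_probability(n + 1, nu) / arrival_probability(n, nu)
```

For \(\nu =0.01\), the recursion gives \(0.1\) for \(n=2\) and \(0.01^{3/4}\approx 0.0316\) for \(n=3\), and the conditional probabilities \(\nu ^{2^{-n}}\) increase towards 1 with n.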

Since \(g_{2,2}\equiv 1\), we can get an explicit solution for the arrival of the first type-2 cell

$$\begin{aligned} g_{1,2}(t)=\sqrt{\nu }\tanh \sqrt{\nu } t, \end{aligned}$$

which generalises (51).

For the arrival of the first type-3 cell, using that \(g_{2,3}=\sigma (g_{1,2})=g_{1,2}\), we have that

$$\begin{aligned} \frac{d g_{1,3}}{dt} = - g_{1,3}^2 + \nu \sqrt{\nu }\tanh \sqrt{\nu } t. \end{aligned}$$
(55)

The initial condition is \(g_{1,3}(0)=0\). Let us normalize \(g_{1,3}\) by its limiting probability \(g_{1,3}(\infty )=\nu ^{3/4}\). That is, we introduce \({\tilde{g}}_{1,3}:=\tfrac{g_{1,3}}{g_{1,3}(\infty )}\) to get

$$\begin{aligned} \frac{1}{g_{1,3}(\infty )}\,\frac{d {\tilde{g}}_{1,3} }{dt}= - {\tilde{g}}_{1,3}^2 + \tanh {\sqrt{\nu }t}. \end{aligned}$$

Next, we re-scale time, \(t\rightarrow {\tilde{t}}=t g_{1,3}(\infty )=t \nu ^{3/4}\), and define

$$\begin{aligned} k_{1,3}({\tilde{t}})={\tilde{g}}_{1,3}\left( {\tilde{t}} \nu ^{-3/4}\right) = \nu ^{-3/4} g_{1,3}(t) \end{aligned}$$

to arrive at

$$\begin{aligned}\frac{d k_{1,3}}{d {\tilde{t}}}= - k_{1,3}^2 + \tanh {\nu ^{-1/4} {{\tilde{t}}}}.\end{aligned}$$

In the limit where \(\nu \rightarrow 0\) we get

$$\begin{aligned} \frac{d k_{1,3}}{d {\tilde{t}}}\approx - k_{1,3}^2 + 1, \end{aligned}$$

which is the same differential equation we have for \(g_{1,2}\) for \(\nu =1\) in (50) with the same initial condition \(k_{1,3}(0)=0\). Hence \(k_{1,3}({{\tilde{t}}}) = \tanh {{\tilde{t}}}\) and then

$$\begin{aligned} g_{1,3}(t) \approx \nu ^{3/4} \tanh { \nu ^{3/4} t}. \end{aligned}$$

The same procedure can be applied to obtain the arrival distribution of all types,

$$\begin{aligned} g_{1,n}\left( t \right) \approx g_{1,n}(\infty ) \tanh (g_{1,n}(\infty ) t) =\nu ^{1-\chi _n}\tanh {\nu ^{1-\chi _n} t} \end{aligned}$$
(56)

which can then be verified by induction. In Fig. 6 we observe that, for a fixed type, the asymptotic agrees with the behaviour in the \(\nu \rightarrow 0\) limit. However, for a fixed \(\nu \), the approximation becomes worse as we consider more types.
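The quality of the scaling limit (56) can be explored by integrating (54) numerically. The sketch below uses a fixed-step fourth-order Runge–Kutta scheme on the full vector \((g_{1,n},\dots ,g_{n,n})\), with \(g_{n,n}\equiv 1\) held constant. The integrator, the step sizes and the function names are illustrative assumptions and not necessarily those used for Fig. 6.

```python
import math


def rhs(g, nu):
    """Eq. (54): dg_i/dt = -g_i^2 + nu * g_{i+1} for i < n; g_n stays equal to 1."""
    n = len(g)
    return [-g[i] ** 2 + nu * g[i + 1] if i + 1 < n else 0.0 for i in range(n)]


def arrival_cdf(n, nu, t_end, dt=0.01):
    """g_{1,n}(t_end) from RK4 with g_i(0) = 0 for i < n and g_n = 1."""
    g = [0.0] * (n - 1) + [1.0]
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(g, nu)
        k2 = rhs([a + 0.5 * dt * b for a, b in zip(g, k1)], nu)
        k3 = rhs([a + 0.5 * dt * b for a, b in zip(g, k2)], nu)
        k4 = rhs([a + dt * b for a, b in zip(g, k3)], nu)
        g = [a + dt * (b + 2 * c + 2 * d + e) / 6
             for a, b, c, d, e in zip(g, k1, k2, k3, k4)]
    return g[0]


def scaling_limit(n, nu, t):
    """Right-hand side of (56): nu^(1-chi_n) * tanh(nu^(1-chi_n) * t)."""
    a = nu ** (1.0 - 2.0 ** (1 - n))
    return a * math.tanh(a * t)
```

For \(n=2\) the two functions should agree for any \(\nu \), since (56) is then exact. For \(n=3\) and small \(\nu \), for example `arrival_cdf(3, 1e-4, 3000.0, dt=0.5)` against `scaling_limit(3, 1e-4, 3000.0)`, the agreement is only approximate.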

Fig. 6

On the left, the scaled probability of arrival of the first type-3 cell, \(\nu ^{-3/4}\mathbb {P}(T_3\le t)\) in terms of \(\nu ^{3/4}t\), obtained numerically from (54) for different \(\nu \), is compared to the scaling limit (56). On the right, the normalized arrival probability \(\nu ^{\chi _n-1}\mathbb {P}(T_n\le t)\) of different types, obtained numerically from (54), is compared to the corresponding scaling limit (56) (dashed line) for \(\nu =10^{-5}\) and \(n=2,\dots ,5\)

6 Examples and applications

We now discuss example applications of the results for studying the evolution of microbes and tumour cells at the limiting mutation rate and relate them to experimental work.

6.1 Colony size and number of generations until EEX in microbes

In bacteria and in haploid and diploid yeast, numerous experimental projects have investigated EEX by crossing cell lines with different mutator alleles (Morrison et al. 1993; Fijalkowska and Schaaper 1996; Herr et al. 2011, 2014; Soriano et al. 2021). The observable quantities of such experiments are the CanR mutation rate of the cell lines (number of mutations in the CanR gene per division), and the number of viable colonies after a given number of generations. The main goal is to identify the maximum mutation rate. We have seen that defining the limiting mutation rate is conceptually straightforward: if the sum of the death and mutation rates is at least equal to the division rate, and there is a finite number of mutations that can accumulate, then EEX will occur with probability one. However, for how long and how large the colonies can grow depends on a number of factors, including the number of mutations that can accumulate before the lethal mutation. These are questions of interest for experiment design, which we can answer using the proposed model.

If we set the birth rate to 1 in the no-death model, we can interpret time in units of cell divisions or generations. The probability that a colony initiated by a single cell survives after t generations then follows immediately from Eq. (35), as a function of the maximal number n of mutations that can accumulate,

$$\begin{aligned} S_{1,n}\sim (1+t)^{-\chi _n}+\frac{\chi _n}{2}{(1+t)}^{-\tfrac{1}{2}-\chi _n} \end{aligned}$$

where \(\chi _n=2^{1-n}\). In Fig. 7a, we see that for \(n=1\), the survival probability is less than \(10\%\) after 10 generations, whilst about 100 generations are needed for the same probability if \(n=2\). For \(n=5\), a reduction in survival probability of \(50\%\) takes on the order of \(10^5\) generations. In yeast, most experiments determine the growth of colonies after 20 generations. We can compute the expected size of the colony after t generations using Eq. (47). In Fig. 7b, we see that, if only one mutation can accumulate (\(n=1\)), we do not expect to observe any growth after 20 generations, whilst if two or three mutations can accumulate (\(n=2,3\)), colonies will form, although smaller than expected. This qualitatively mirrors the comparison between haploid and diploid yeast investigated by Herr et al. (2014), who found that at the limiting mutation rate of one mutation in an essential gene per cell division, haploid yeast colonies are not viable, whilst diploid yeast form viable, but smaller colonies. For \(n>4\), the expected growth after 20 generations plateaus to the expected size in a normally exponentially growing population, suggesting that one would need to run longer experiments to observe EEX at the macroscopic level in cell lines that are more robust to mutational burden, in accordance with Fig. 7a. The survival probabilities and expected population size for the general case including cell death are available in Appendix E. Introducing death in the model effectively re-scales time, such that sub-populations with higher death rates go extinct faster.
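The number of generations needed for the survival probability to fall below a given threshold can be estimated by solving (35) for t. The sketch below does this by bisection, relying on the fact that the asymptotic expression decreases monotonically in t. It applies the asymptotic formula at all times, which is only an approximation for small t, and the function names and the search bound `t_hi` are illustrative assumptions.

```python
def survival_asymptotic(n, t):
    """Second-order asymptotic (35); for n = 1 the exact 1 / (1 + t) is used."""
    tau = 1.0 + t
    c = 2.0 ** (1 - n)
    if n == 1:
        return 1.0 / tau
    return tau ** (-c) + 0.5 * c * tau ** (-0.5 - c)


def generations_until(n, threshold, t_hi=1e12):
    """Smallest t with survival_asymptotic(n, t) <= threshold, found by bisection.

    t_hi must be large enough that the survival probability at t_hi is
    already below the threshold.
    """
    if survival_asymptotic(n, t_hi) > threshold:
        raise ValueError("increase t_hi")
    lo, hi = 0.0, t_hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if survival_asymptotic(n, mid) > threshold:
            lo = mid
        else:
            hi = mid
    return hi
```

For example, `generations_until(1, 0.1)`, `generations_until(2, 0.1)` and `generations_until(5, 0.5)` give the estimates corresponding to the three cases discussed above.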

Fig. 7

(a) Number of generations until the survival probability is less than \(10\%\) and \(50\%\), as a function of the maximal number of mutations that can accumulate, calculated from (35). (b) The expected size of colonies started with one single cell after 20 generations, as a function of the maximal number of mutations that can accumulate, calculated from (47). The dashed grey line shows the expected size of a population growing exponentially with no maximal number of mutations

A remarkable conclusion from the model is that, even if cells are at the limiting mutation rate and hence EEX will occur with probability one, this might take many generations. Thus, in the usual experimental setting, EEX is not detectable by following colony size. However, it would be detectable if sequencing or genetic information of the colonies at different time points was available. In the next Section, we consider the evolution of genetic diversity in more depth.

6.2 Genetic diversity during EEX

Multi-type branching processes are widely applied to studying the genetic structure of growing populations. In the multi-type critical process, we observe an interesting behaviour, where extreme mutation rates result in an initial increase of genetic diversity, but after a transient phase, the population becomes dominated by cells carrying the maximal number of mutations, and thus loses genetic diversity. In Fig. 8 we plot the evolution of the number of cells in different simulation runs of the 3-, 4-, and 5-type processes, next to the Shannon diversity index, which is defined as

$$\begin{aligned} H'(t)=-\sum _{i=1}^{n}p_{i}(t)\log p_i (t) \end{aligned}$$
(57)

where \(p_i(t)\) is the proportion of cells of type-i at time t. Indeed, we can see that \(H'\) increases as the first mutations accumulate, and rapidly declines when the last type of cells arrives.
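Runs of this kind can be generated with a standard Gillespie-type simulation. The sketch below assumes the simplest model, in which every cell divides at rate 1 and mutates to the next type at rate 1, with the mutation of a type-n cell being lethal. The function names, the seed handling and the stopping rule (extinction or a maximal number of events) are illustrative assumptions rather than a description of the code behind Fig. 8.

```python
import math
import random


def shannon_index(counts):
    """Shannon diversity H' = -sum_i p_i log p_i, Eq. (57); 0 for an empty population."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simulate(n, max_events=10_000, seed=0):
    """Simulate the n-type process from one type-1 cell.

    Returns a list of (time, counts) pairs, one per event, where counts is a
    tuple (Z_1, ..., Z_n).
    """
    rng = random.Random(seed)
    counts = [1] + [0] * (n - 1)
    t = 0.0
    history = [(t, tuple(counts))]
    for _ in range(max_events):
        total = sum(counts)
        if total == 0:
            break  # extinction
        # Each cell has total event rate 2 (division + mutation).
        t += rng.expovariate(2.0 * total)
        # The acting cell is uniform over cells, so its type is weighted by counts.
        i = rng.choices(range(n), weights=counts)[0]
        if rng.random() < 0.5:
            counts[i] += 1  # division: (i) -> (i) + (i)
        else:
            counts[i] -= 1  # mutation: (i) -> (i+1), lethal for the last type
            if i + 1 < n:
                counts[i + 1] += 1
        history.append((t, tuple(counts)))
    return history
```

The diversity trajectory of a run is then `[(t, shannon_index(c)) for t, c in simulate(4)]`.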

Fig. 8

Example simulation runs of the 3-type (a), 4-type (b, c) and 5-type (d) process, where each colored area represents the number of cells of a type, with different types piled on top of each other, and we ran the process until extinction or \(10^4\) steps. Next to each run, we plot the corresponding Shannon diversity index \(H'(t)\), defined in (57), as a function of time. In all examples, this is maximal when the last type arrives, and then quickly decreases as the population becomes dominated by the last type

Even though an analytical expression for \(H'(t)\) from the model is not available, we can use (35) to obtain the expected total number of types or subclones present at time t, with its large time asymptotic expression

$$\begin{aligned} K_n(t):= \mathbb {E}[\#\text { of types at time } t]= \sum _{i=1}^n Q_{1,i}(t) \sim \sum _{i=1}^n S_{1,i}(t) \sim t^{-\chi _n}, \end{aligned}$$
(58)

where \(Q_{1,i}(t)\) denotes the survival probability of the ith cell-type, and \(S_{1,i}(t)\) denotes the survival probability of the entire population (defined in (40) and (30), respectively). In Fig. 9 we see that, similarly to the Shannon diversity index, \(K_n(t)\) initially increases and then quickly decreases. As observed in the simulations, after 10–20 generations, the expected number of types present already declines, in sharp contrast with the case in which the colony grows exponentially at a high mutation rate but cells do not acquire lethal mutations. Therefore, even though the macroscopic behaviour is indistinguishable in that timescale, if genetic information on the colonies in time is available experimentally (e.g. the number of subclones after 20 generations), one can identify populations that will eventually reach extinction due to high mutation. Moreover, one can use the expressions derived from the model to infer the maximal number of mutations n, which can then be used to quantitatively predict the evolution of the colonies and the time until EEX.

Fig. 9

Mean number of types present at time t for populations with different maximal number of mutations (n), calculated from (58) (solid lines) with their large time asymptotic expressions (dashed lines). The dotted grey line shows the expected number of types in a population growing exponentially with no maximal number of mutations (\(n=\infty \))

7 Discussion

Multi-type branching processes provide a natural tool to model biological processes driven by cell division, death, and mutation. Due to their potential to describe evolutionary dynamics, extensive work has been dedicated to deriving solutions of multi-type processes, especially the super-critical and sub-critical cases (Durrett 2015; Nicholson et al. 2022). In this work, we have focused on finite-type critical processes, which mimic populations in which mutations accumulate until a maximal number of alterations is reached, resulting in extinction.

Driven by biological motivation, we have focused on deriving solutions for the survival of the population, the number of cells, and the arrival and extinction time of cells with different mutations. We found that the survival probability of the overall system, which is asymptotically equivalent to the survival probability of the last type of cells, decays as \(t^{-\chi _n}\) for the nth type. The \(\chi _n = 2^{1-n}\) exponents for the survival probability had been derived in previous work by Foster and Ney (1976) and Ogura (1975), although following different approaches. With our approach we have also derived higher-order terms, facilitating the estimation of the accuracy of the first-order term.

For two cell types, the generating function of the population sizes was expressed explicitly in terms of modified Bessel functions. This provides a numerically efficient way to extract the number distributions, as detailed in Appendix C. By conditioning on survival of the final type, we extract the distribution of the number of type-k cells in the large time limit. The survival probabilities show us that the system becomes quickly dominated by the last type of cells, and indeed the distribution of the last type coincides with that of the total number of cells. This distribution only depends on the ratio s/t of the size s of the population and time t. Interestingly, in the large s/t limit, this has an algebraic and stationary tail \((\text {size})^{-1-\chi _n}\). That is, for a fixed (large) number of cells, there is a time regime during which the probability of finding s cells remains constant in time. Since this fat tail, for each but the first cell type, would imply an infinite mean population size, we have derived an upper cutoff for the power-law tail which ensures the finiteness of the mean population sizes. These power-law tails appearing for large times for \(n\ge 2\) cell types are in sharp contrast to the purely exponential behaviour of the mass function of the first cell type. Although the solution is only valid for large time and population size, this corresponds to the range of interest in biological applications.

We provide exact or asymptotic formulas for our model of EEX in time, including the evolution of population size, and the time of arrival and extinction of sub-populations. In Sect. 6, we discussed example applications of these formulas for studying EEX in microbes, in relation to experimental work. In particular, we showed that populations at the limit mutation rate that can accumulate a high number of mutations take longer to exhibit macroscopic changes in colony size than the number of generations that experiments usually track. Thus, even though such cell lines would eventually undergo error-induced extinction, this extreme fate could not be predicted by analyzing colony growth. In order to identify colonies at the limit mutation rate, experiments should complement macroscopic measures with measures of genetic diversity such as the number of subclones.

Apart from providing tools to quantitatively analyze and guide experimental design, the model provides theoretical insight into error-induced extinction. We propose a general mechanism for EEX: populations of cells that divide and mutate or die at the same rates and have a maximal number of mutations tolerable. In the context of cancer, this maximal number might represent the amount of DNA damage that can accumulate before being detected by the immune system (Schumacher and Schreiber 2015). We find that the modeled populations become dominated by cells carrying the maximal number of mutations and thus lose genetic diversity. This is related to the idea that genetic instability results in extinction because populations cannot overcome selective barriers (Tejero et al. 2016; Andor et al. 2017; Tilk et al. 2022). Another interesting behaviour is that populations at the error threshold reach a stationary phase before going extinct. Stationary growth of cancer or bacteria populations is normally associated with having reached a carrying capacity (due to limited nutrients, etc.) (Gerlee 2013; Tjørve and Tjørve 2017). Our model shows that this macroscopic behaviour might be caused by mutational burden, in which case the fate of the population is drastically different, highlighting the importance of considering genetic structure when modeling population growth.

A phenomenon related to EEX is the so-called error catastrophe, which describes the inability of a genetic element to be maintained in a population as the fidelity of its replication machinery decreases beyond a certain threshold value, such that it cannot produce enough copies of itself (Summers and Litwin 2006). This has been invoked as a theoretical basis for the treatment of viral infection with drugs that would push the error rate for copying the viral genome beyond this threshold (Vignuzzi et al. 2005). In our model, every type of cell eventually disappears due to high mutation, and thus undergoes an error catastrophe. Error catastrophes might result in the fittest cells being replaced by a lower-fitness population, but, unlike EEX, not necessarily in extinction. This more general case can be studied using the proposed model, by either allowing for an infinite number of mutations or adjusting the birth and death rates of the last type of cells. As shown in previous work, the applications of multi-type critical processes extend beyond error catastrophe and EEX, e.g., to modeling infectious disease spread (Antal and Krapivsky 2012).

Throughout this paper, we have focused on the simplest n-type critical process, with zero death rate for all but the last cell type. The more general case including cell death, and arbitrary mutation and division rates can be found in Appendix E. This could be relevant for modelling systems in which mutations result in cells with different division, mutation, and death rates, as long as all cell types have critical growth. Including intermediate types with sub-critical and super-critical growth remains a challenge for future work. Future work should consider non-consecutive mutations, allowing for specific mutational paths to be modeled. This is particularly relevant to the study of the fitness of mutator alleles in cancer evolution, which is typically driven by the accumulation of specific mutations.