1 Introduction

Branching Brownian motion (BBM) [3, 23] can be seen as an elementary model for the evolution of a population of individuals that are subject to birth, death, and motion in space. One of the primary interests in this model has been the speed at which such a population spreads in space, as well as finer properties of the front. Indeed, BBM has been investigated from the point of view of extreme value theory over the last 40 years (see e.g. [4,5,6,7,8,9, 12,13,14, 19]).

As a model for population dynamics, BBM is somewhat unrealistic as it leads to uncontrolled exponential growth of the population size. In fact, in the standard normalisation, the population size grows like \(\exp (t)\), while the population spreads over a volume of order t, leading to an unsustainable density of the population. Several variants of the model that resolve this problem have been proposed where, according to some selection rule, offspring is selected to survive in such a way that the total population size stays controlled [11, 15, 20, 21]. Versions where competitive interactions between particles are present were considered, e.g. in [1, 2, 16, 17].

In this paper, we propose a model where the population size is controlled by penalising the fact that particles stay close to each other. Before defining the model precisely, recall that BBM is constructed as follows: start with a single particle which performs a standard Brownian motion x(t) in \({\mathbb {R}}\) with \(x(0)=0\), continued for a standard exponentially distributed holding time T, independent of x. At time T, the particle splits, independently of x and T, into k offspring with probability \(p_k\), where \(\sum _{k=1}^\infty p_k=1\), \(\sum _{k=1}^\infty k p_k=2\) and \(K=\sum _{k=1}^\infty k(k-1)p_k<\infty \). In the present paper, we choose the simplest option, \(p_2=1\), all others zero, except in Sect. 8, where we allow for \(p_0>0\). These particles continue along independent Brownian paths starting from x(T) and are subject to the same splitting rule. We let n(t) denote the number of particles at time t, label the particles at time t arbitrarily by \(1,2,3,\dots , n(t)\), and denote by \(x(t)=\{x_1(t),\dots , x_{n(t)}(t)\}\) the positions of these particles at that time. For \(s\le t\), we let \(x_i(s)\) be the position of the ancestor of particle i at time s. We denote by \({\mathbb {P}}\) the law of BBM.

Alternatively, BBM can be constructed as a Gaussian process indexed by a continuous time Galton–Watson tree with mean zero and covariances, conditioned on the Galton–Watson tree, given by

$$\begin{aligned} {\mathbb {E}}\left[ x_k(s)x_\ell (r)|{\sigma }(GW)\right] = d(x_k(t),x_\ell (t))\wedge s\wedge r, \end{aligned}$$
(1.1)

where \(d(x_k(t),x_\ell (t))\) is the time of the most recent common ancestor of the particles labeled k and \(\ell \) in the Galton–Watson tree.
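The covariance structure (1.1) is easy to check numerically at a fixed time: two particles whose most recent common ancestor sits at time d share their Brownian path up to time d and move independently afterwards, so their covariance at time t is d. A Monte Carlo sketch (the parameter values are illustrative):

```python
import numpy as np

# Monte Carlo sketch of the covariance structure (1.1) at a fixed time:
# two particles whose most recent common ancestor sits at time d share
# their Brownian path up to time d and move independently afterwards, so
# Cov(x_k(t), x_l(t)) = d. The parameter values are illustrative.
rng = np.random.default_rng(5)
t, d, n = 2.0, 0.8, 200000

shared = np.sqrt(d) * rng.standard_normal(n)            # common ancestor path up to time d
xk = shared + np.sqrt(t - d) * rng.standard_normal(n)   # independent increments after d
xl = shared + np.sqrt(t - d) * rng.standard_normal(n)
print(np.cov(xk, xl)[0, 1])   # close to d
print(np.var(xk))             # each particle has variance t
```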

For \(t<\infty \) and for some \(\epsilon >0\), we define the penalty function

$$\begin{aligned} I_t(x) \equiv \int _0^t\sum _{i\ne j=1}^{n(s)} \mathbbm {1}_{|x_i(s)-x_j(s)|\le \epsilon }\mathrm{{d}}s. \end{aligned}$$
(1.2)

(The notation here is not quite consistent, as the labelling of the n(s) particles at time s changes with s. This could be remedied by using the Ulam–Kesten–Harris labelling of the tree, but is not needed here.) We are interested in the law of x(t) under the tilted measure \(P_{t,{\lambda }}\), for \({\lambda }>0\), given by

$$\begin{aligned} P_{t,{\lambda }} (A) \equiv \frac{{\mathbb {E}}\left[ \mathbbm {1}_{x(t)\in A} {\mathrm e}^{-{\lambda }I_t(x)}\right] }{{\mathbb {E}}\left[ {\mathrm e}^{-{\lambda }I_t(x)}\right] }, \end{aligned}$$
(1.3)

for any Borel set A. The function \(I_t\) measures the total time for which any two particles stay within distance \(\epsilon \) of each other up to time t. This seems to be a reasonable measure of competitive pressure. In a typical realisation of BBM, the density of particles at time s will be of order \({\mathrm e}^s/s\), and hence the \(\epsilon \)-neighbourhood of a typical particle contains about \(\epsilon {\mathrm e}^s/s\) other particles. Thus, for a typical configuration x of BBM, \(I_t(x)\sim \epsilon {\mathrm e}^{2t}/(2t)\). This penalty is most easily avoided by reducing the particle number through the suppression of branching: the probability that a particle does not branch up to time t is \({\mathrm e}^{-t}\), which is far less costly. Reducing the particle density by making the particles move much farther apart would be far more costly. This observation suggests that a simple exactly solvable model, which we describe and analyse below, correctly reflects the main features of this model.

1.1 A Simplified Model

Analysing the measure \(P_{t,{\lambda }}\) directly seems rather difficult. We suggest an approximation that should share the qualitative features of the full measure. For this, we consider a lower bound on \(I_t\). Note that, whenever branching occurs, the offspring start at the same point and thus are initially within distance \(\epsilon \) of each other. Let us for simplicity take a branching law such that \(p_2=1\), i.e. only binary branching occurs. Then we can bound

$$\begin{aligned} I_t(x) \ge I'_t(x)\equiv \sum _{i=1}^{n(t)-1} {\tau }_{\epsilon }(i), \end{aligned}$$
(1.4)

where \({\tau }_\epsilon (i)\) is the first time the two Brownian motions that start at the i-th branching event are a distance \(\epsilon \) apart. For small \(\epsilon \), the probability that one of the two branches branches again before the time \({\tau }_\epsilon \) is of order \(\epsilon ^2\), so that it will be a good approximation to treat the \( {\tau }_{\epsilon }(i)\) as independent and having the same distribution as

$$\begin{aligned} {\tau }_\epsilon \equiv \inf \{t>0: |B_t|>\epsilon \}. \end{aligned}$$
(1.5)

Then,

$$\begin{aligned} {\mathbb {E}}\left[ {\mathrm e}^{-{\lambda }\sum _{i=1}^{n(t)-1} {\tau }_{\epsilon }(i)}\right] \approx {\mathbb {E}}\left[ {\mathbb {E}}\left[ {\mathrm e}^{-{\lambda }{\tau }_{\epsilon }}\right] ^{n(t)-1}\right] \equiv {\mathbb {E}}\left[ {\sigma }({\lambda },\epsilon )^{n(t)-1}\right] , \end{aligned}$$
(1.6)

where (as follows from Theorem 5.35 and Proposition 7.48 in [22]),

$$\begin{aligned} {\sigma }({\lambda },\epsilon )\equiv {\mathbb {E}}\left[ {\mathrm e}^{-{\lambda }{\tau }_{\epsilon }}\right] ={{\,\mathrm{sech}\,}}(\epsilon \sqrt{2{\lambda }}), \end{aligned}$$
(1.7)

which for small \({\lambda }\epsilon ^2\) behaves like \(\exp (-{\lambda }\epsilon ^2)\). Note that we also have, by Jensen’s inequality, that

$$\begin{aligned} {\mathbb {E}}\left[ {\mathrm e}^{-{\lambda }\sum _{i=1}^{n(t)-1} {\tau }_{\epsilon }(i)}\right] \ge {\mathbb {E}}\left[ {\mathrm e}^{-{\lambda }\sum _{i=1}^{n(t)-1}{\mathbb {E}}[{\tau }_{\epsilon }(i)]}\right] = {\mathbb {E}}\left[ {\mathrm e}^{-{\lambda }\epsilon ^2(n(t)-1)}\right] . \end{aligned}$$
(1.8)
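The identity (1.7) and the value \({\mathbb {E}}[{\tau }_\epsilon ]=\epsilon ^2\) can be checked by simulating the exit time of a discretised Brownian motion from \((-\epsilon ,\epsilon )\). The following Monte Carlo sketch uses illustrative step and sample sizes; the time discretisation gives \({\tau }_\epsilon \) a small upward bias.

```python
import numpy as np

# Monte Carlo sketch checking sigma(lambda, eps) = sech(eps*sqrt(2*lambda)),
# the Laplace transform of the exit time tau_eps of a standard Brownian
# motion from (-eps, eps), together with E[tau_eps] = eps^2. Step size,
# sample size and parameter values are illustrative choices.
rng = np.random.default_rng(0)
eps, lam = 0.5, 1.0
dt, n_paths, max_steps = 1e-3, 20000, 20000

x = np.zeros(n_paths)
tau = np.zeros(n_paths)
alive = np.ones(n_paths, dtype=bool)
for step in range(1, max_steps + 1):
    x[alive] += np.sqrt(dt) * rng.standard_normal(alive.sum())
    exited = alive & (np.abs(x) > eps)
    tau[exited] = step * dt
    alive &= ~exited
    if not alive.any():
        break
tau[alive] = max_steps * dt   # unlikely stragglers

sigma_mc = np.exp(-lam * tau).mean()
sigma_exact = 1.0 / np.cosh(eps * np.sqrt(2.0 * lam))
print(sigma_mc, sigma_exact)   # Laplace transform vs the sech formula (1.7)
print(tau.mean(), eps**2)      # E[tau_eps] = eps^2
```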

We define the following simplified model.

$$\begin{aligned} {\widehat{P}}_{t,{\lambda }} (A) \equiv \frac{{\mathbb {E}}\left[ \mathbbm {1}_{x(t)\in A} {\sigma }({\lambda },\epsilon )^{n(t)-1}\right] }{{\mathbb {E}}\left[ {\sigma }({\lambda },\epsilon )^{n(t)-1}\right] }. \end{aligned}$$
(1.9)

One might think that the approximate model is a poor substitute for the full model, since it ignores the repulsion of particles after the time that they first separate. However, as we will see shortly, already \(I'_t(x)\) suppresses branching so much that the total number of particles will stay finite for any time. Hence we can expect that these finitely many particles can remain separate rather easily and that the remaining effect of \(I_t\) will be relatively mild.

1.2 Outline

The remainder of this paper is organised as follows. In Sect. 2, we derive exact formulas for the partition function, the particle number, and the first branching time in the simplified model. In Sect. 3 we introduce the notion of quasi-Markovian Galton–Watson trees. In Sect. 4 we show that the branching times in the simplified model are given by such a tree. In Sect. 5, we consider the limit when \({\lambda }\downarrow 0\) and derive a universal asymptotic model, which is a specific quasi-Markovian Galton–Watson tree. In Sect. 6, we consider the position of the maximal particle and show that its distribution is governed by an F-KPP equation with time-dependent reaction term, and analyse the behaviour of its solutions. We discuss the relation of the approximate model to the full model in Sect. 7. In Sect. 8, we briefly look at the case when \(p_0>0\). In this case, the process dies out and we derive the rate at which the number of particles tends to zero.

2 Partition Function, Particle Numbers, and First Branching Time

2.1 The Partition Function

The first object we consider is the normalising factor or partition function

$$\begin{aligned} v_{\lambda }(t)\equiv {\mathbb {E}}\left[ {\sigma }({\lambda },\epsilon )^{n(t)-1}\right] . \end{aligned}$$
(2.1)

As we show below, it is connected to the following ordinary differential equation.

Lemma 2.1

Let \({\alpha }\in (0,1]\) and let \(f_{\alpha }(t)\) be the solution of the ordinary differential equation

$$\begin{aligned} \frac{d}{dt} f_{\alpha }(t) = {\alpha }f_{\alpha }(t)^2 -f_{\alpha }(t), \end{aligned}$$
(2.2)

with initial condition \(f_{\alpha }(0)=1\). Then

$$\begin{aligned} f_{\alpha }(t) = \frac{{\mathrm e}^{-t}}{ \left( 1 - {\alpha }\right) +{\alpha }{\mathrm e}^{-t}}. \end{aligned}$$
(2.3)

Remark

A first inspection of Eq. (2.2) shows why the cases \({\alpha }=1\) and \(0<{\alpha }<1\) are vastly different. Equation (2.2) has the two fixed points 0 and \(1/{\alpha }\). Here 0 is stable and \( 1/{\alpha }\) is unstable. Hence all solutions with initial condition \(0\le f_{\alpha }(0)<1/{\alpha }\) will converge to 0, while solutions with \( f_{\alpha }(0)>1/{\alpha }\) will tend to infinity. Only the special initial condition \(f_{\alpha }(0)=1/{\alpha }\) leads to the constant solution. Since we start with the initial condition \(f_{\alpha }(0)=1\), if \({\alpha }=1\), we get this special constant solution, while for \({\alpha }<1\), the solution tends to zero.

Proof

We define

$$\begin{aligned} {{\hat{f}}}_{\alpha }(t)\equiv {\mathrm e}^t f_{\alpha }(t). \end{aligned}$$
(2.4)

Then \({{\hat{f}}}_{\alpha }\) solves

$$\begin{aligned} \frac{d}{dt}{{\hat{f}}}_{\alpha }(t) = {\alpha }{{\hat{f}}}_{\alpha }(t)^2{\mathrm e}^{-t}, \end{aligned}$$
(2.5)

also with initial condition \({{\hat{f}}}_{\alpha }(0)=1\). Dividing both sides by \({{\hat{f}}}_{\alpha }(t)^2\), this can be written as

$$\begin{aligned} - \frac{d}{dt} \frac{1}{{{\hat{f}}}_{\alpha }(t)} ={\alpha }{\mathrm e}^{-t}, \end{aligned}$$
(2.6)

which can be integrated to give

$$\begin{aligned} -\frac{1}{{{\hat{f}}}_{\alpha }(t)}+\frac{1}{{{\hat{f}}}_{\alpha }(0)} = {\alpha }\left( 1- {\mathrm e}^{-t}\right) , \end{aligned}$$
(2.7)

or

$$\begin{aligned} {{{\hat{f}}}_{\alpha }(t)} =\frac{1}{ \frac{1}{{{\hat{f}}}_{\alpha }(0)} -{\alpha }\left( 1- {\mathrm e}^{-t}\right) }, \end{aligned}$$
(2.8)

and

$$\begin{aligned} { f_{\alpha }(t)} =\frac{1}{ {\mathrm e}^{t}\left( \frac{1}{f_{\alpha }(0)} -{\alpha }\right) +{\alpha }}. \end{aligned}$$
(2.9)

Using the initial condition \(f_{\alpha }(0)=1\), the claim of the lemma follows.

\(\square \)

Remark

We note that, provided \({\alpha }<1\),

$$\begin{aligned} \lim _{t\uparrow \infty } {{\hat{f}}}_{\alpha }(t)= \frac{1}{1 -{\alpha }}. \end{aligned}$$
(2.10)
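Lemma 2.1 and the limit (2.10) are easy to verify numerically; the following sketch integrates (2.2) with a standard fourth-order Runge–Kutta scheme and compares against the closed form (2.3). The values of \({\alpha }\) and t are arbitrary illustrative choices.

```python
import math

# Numerical sanity check of Lemma 2.1: integrate (2.2), f' = a*f^2 - f,
# f(0) = 1, with a standard fourth-order Runge-Kutta scheme and compare
# with the closed form (2.3). The values of a and t are illustrative.
def closed_form(a, t):
    return math.exp(-t) / ((1.0 - a) + a * math.exp(-t))

def rk4(a, t_end, h=1e-3):
    f = 1.0
    g = lambda y: a * y * y - y
    for _ in range(int(round(t_end / h))):
        k1 = g(f)
        k2 = g(f + 0.5 * h * k1)
        k3 = g(f + 0.5 * h * k2)
        k4 = g(f + h * k3)
        f += (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return f

a, t_end = 0.8, 5.0
print(abs(rk4(a, t_end) - closed_form(a, t_end)))   # tiny
# hat f_a(t) = e^t f_a(t) approaches 1/(1-a) = 5, cf. (2.10):
print(math.exp(t_end) * closed_form(a, t_end))
```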

The next lemma shows that \(v_{\lambda }\) solves Eq. (2.2) with \({\alpha }={\sigma }({\lambda },\epsilon )\).

Lemma 2.2

\( v_{\lambda }(t)\) solves Eq. (2.2) with \({\alpha }={\sigma }({\lambda },\epsilon )\) and

$$\begin{aligned} v_{\lambda }(t)=\frac{{\mathrm e}^{-t}}{ \left( 1 - {\sigma }({\lambda },\epsilon )\right) +{\sigma }({\lambda },\epsilon ){\mathrm e}^{-t}}. \end{aligned}$$
(2.11)

Proof

The derivation of the ODE is similar to that of the F-KPP equation for BBM (see [8]). Clearly, \(v_{\lambda }(0)=1\). Conditioning on the time of the first branching event, we get

$$\begin{aligned} v_{\lambda }(t) ={\mathrm e}^{-t} + \int _0^t ds {\mathrm e}^{-(t-s)} {\sigma }({\lambda },\epsilon ) v_{\lambda }(s)^2. \end{aligned}$$
(2.12)

Differentiating with respect to t gives

$$\begin{aligned} \frac{d}{dt} v_{\lambda }(t)= & {} -{\mathrm e}^{-t} +{\sigma }({\lambda },\epsilon ) v_{\lambda }(t)^2 -\int _0^t ds {\mathrm e}^{-(t-s)} {\sigma }({\lambda },\epsilon ) v_{\lambda }(s)^2 \nonumber \\= & {} {\sigma }({\lambda },\epsilon ) v_{\lambda }(t)^2 -v_{\lambda }(t). \end{aligned}$$
(2.13)

Thus \(v_{\lambda }\) solves Eq. (2.2) with \({\alpha }={\sigma }({\lambda },\epsilon )\). Now (2.11) follows from Lemma 2.1, which proves the lemma. \(\square \)

Remark

To keep the notation light, we keep the dependence of \({\sigma }\) on \({\lambda }\) and \(\epsilon \) implicit from now on, unless we want to emphasise this dependence.
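The closed form (2.11) can be checked against a direct simulation of the particle numbers of binary BBM (a Yule process: with k particles present, the next split arrives at rate k). A Monte Carlo sketch with illustrative parameters:

```python
import numpy as np

# Monte Carlo sketch of Lemma 2.2: v_lambda(t) = E[sigma^(n(t)-1)], where
# n(t) is the particle number of binary BBM (a Yule process: with k
# particles the next split comes at rate k). sigma is treated as a free
# parameter in (0,1); all numerical values are illustrative.
rng = np.random.default_rng(1)

def yule_n(t, rng):
    k, s = 1, 0.0
    while True:
        s += rng.exponential(1.0 / k)  # waiting time to the next split
        if s > t:
            return k
        k += 1

sigma, t = 0.7, 3.0
samples = np.array([yule_n(t, rng) for _ in range(100000)])
v_mc = np.mean(sigma ** (samples - 1.0))
v_exact = np.exp(-t) / ((1.0 - sigma) + sigma * np.exp(-t))
print(v_mc, v_exact)   # should agree up to Monte Carlo error
```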

2.2 Particle Numbers

From the formula for the partition function, we can readily infer the mean number of particles at time t. Writing \({{\hat{v}}}_{\lambda }(t)\equiv {\mathrm e}^t v_{\lambda }(t)\), in analogy to (2.4), we get

$$\begin{aligned} \widehat{E}_{t,{\lambda }} \left[ n(t)\right]= & {} 1+ {\sigma }\frac{d}{d{\sigma }} \ln {{\hat{v}}}_{\lambda }(t) = 1+ {\sigma }\frac{1-{\mathrm e}^{-t}}{1-{\sigma }(1-{\mathrm e}^{-t})} \nonumber \\= & {} \frac{1}{1-{\sigma }(1-{\mathrm e}^{-t})}, \end{aligned}$$
(2.14)

and for \(t\uparrow \infty \) this converges to \(1/(1-{\sigma })\). In fact, we can even compute the distribution of the number of particles at times \(s\le t\). To do so, we want to compute the Laplace (Fourier) transforms

$$\begin{aligned} \widehat{E}_{t,{\lambda }}\left[ {\mathrm e}^{{\gamma }n(s)}\right] =\frac{{\mathbb {E}}\left[ {\mathrm e}^{{\gamma }n(s)}{\sigma }^{n(t)-1}\right] }{{\mathbb {E}}\left[ {\sigma }^{n(t)-1}\right] } =\frac{{\mathbb {E}}\left[ {\mathrm e}^{{\gamma }n(s)}{\sigma }^{n(t)-1}\right] }{v_{\lambda }(t)}, \end{aligned}$$
(2.15)

where \({\gamma }>0\). The denominator has already been calculated in (2.11). For the numerator we write

$$\begin{aligned} {\mathbb {E}}\left[ {\mathrm e}^{{\gamma }n(s)}{\sigma }^{n(t)-1}\right]= & {} {\mathbb {E}}\left[ {\mathrm e}^{{\gamma }n(s)} {\mathbb {E}}\left[ {\sigma }^{n(t)-1}\big |{{{\mathcal {F}}}}_s\right] \right] \nonumber \\= & {} {\mathbb {E}}\left[ {\mathrm e}^{{\gamma }n(s)} {\mathbb {E}}\left[ {\sigma }^{ \sum _{i=1}^{n(s)} n^{(i)} (t-s)-1}\big |{{{\mathcal {F}}}}_s\right] \right] , \end{aligned}$$
(2.16)

where \(n^{(i)} (t-s)\) is the number of particles at time t that have particle i as their ancestor at time s. Using the independence properties, this equals

$$\begin{aligned}&{\mathbb {E}}\left[ {\mathrm e}^{{\gamma }n(s)} {\sigma }^{n(s)-1} \left( {\mathbb {E}}\left[ {\sigma }^{ n (t-s)-1}\right] \right) ^{n(s)}\right] \nonumber \\&\qquad = {\mathrm e}^{{\gamma }}v_{\lambda }(t-s){\mathbb {E}}\left[ \left( {\mathrm e}^{{\gamma }}{\sigma }v_{\lambda }(t-s)\right) ^{n(s)-1}\right] \end{aligned}$$
(2.17)

Observing that \({\mathbb {E}}\left[ \left( {\mathrm e}^{{\gamma }}{\sigma }v_{\lambda }(t-s)\right) ^{n(s)-1}\right] \) solves Eq. (2.2) with \({\alpha }={\mathrm e}^{{\gamma }}{\sigma }v_{\lambda }(t-s)\), it follows from Lemma 2.1 that (2.17) is equal to

$$\begin{aligned} {\mathrm e}^{{\gamma }}v_{\lambda }(t-s) \frac{{\mathrm e}^{-s}}{1-{\mathrm e}^{{\gamma }}{\sigma }v_{\lambda }(t-s)(1-{\mathrm e}^{-s})}= \frac{{\mathrm e}^{-s}}{{\mathrm e}^{-{\gamma }}v_{\lambda }(t-s)^{-1}-{\sigma }(1-{\mathrm e}^{-s})}. \nonumber \\ \end{aligned}$$
(2.18)

Inserting the explicit formula (2.11) for \(v_{\lambda }(t-s)\), we find that (2.18) is equal to

$$\begin{aligned} \frac{{\mathrm e}^{-s}}{{\mathrm e}^{-{\gamma }}\left( 1-{\sigma }(1-{\mathrm e}^{-t+s})\right) {\mathrm e}^{t-s}-{\sigma }(1-{\mathrm e}^{-s})} = \frac{{\mathrm e}^{-t}}{{\mathrm e}^{-{\gamma }}\left( 1-{\sigma }(1-{\mathrm e}^{-t+s})\right) -{\sigma }({\mathrm e}^{s-t}-{\mathrm e}^{-t})}.\nonumber \\ \end{aligned}$$
(2.19)

Dividing by \(v_{\lambda }(t)\), we arrive at

$$\begin{aligned} \widehat{E}_{t,{\lambda }}\left[ {\mathrm e}^{{\gamma }n(s)}\right] = \frac{1-{\sigma }(1-{\mathrm e}^{-t})}{{\mathrm e}^{-{\gamma }}\left( 1-{\sigma }(1-{\mathrm e}^{-t+s})\right) -{\sigma }({\mathrm e}^{s-t}-{\mathrm e}^{-t})}. \end{aligned}$$
(2.20)

From this exact formula, we can derive various special cases.

Theorem 2.3

  1. (i)

    Under the measure \(\widehat{P}_{t,{\lambda }}\), the number of particles at time t is geometrically distributed with parameter \(1-{\sigma }(1-{\mathrm e}^{-t})\). In particular, the number of particles converges, as \(t\uparrow \infty \), to a geometric random variable with parameter \(1-{\sigma }\).

  2. (ii)

    As \(t\uparrow \infty \), the number of particles at time \(s(t) = t+\ln (1-{\sigma })+\rho \), for all \(\rho \le -\ln (1-{\sigma })\), converges in distribution to a geometric random variable with parameter

    \((1+{\sigma }{\mathrm e}^{\rho })^{-1}\).

Proof

Inserting \(s=t\) into (2.20) we get that

$$\begin{aligned} \widehat{E}_{t,{\lambda }}\left[ {\mathrm e}^{{\gamma }n(t)}\right] = \frac{1-{\sigma }(1-{\mathrm e}^{-t})}{{\mathrm e}^{-{\gamma }}-{\sigma }(1-{\mathrm e}^{-t})}, \end{aligned}$$
(2.21)

which is the Laplace transform of the geometric distribution with parameter \(1-{\sigma }(1-{\mathrm e}^{-t})\). This implies (i). Similarly, with \(s=t+\ln (1-{\sigma }) +\rho \) and \(\rho \le -\ln (1-{\sigma })\), we get

$$\begin{aligned} \widehat{E}_{t,{\lambda }}\left[ {\mathrm e}^{{\gamma }n(s)}\right] = \frac{1-{\sigma }(1-{\mathrm e}^{-t})}{{\mathrm e}^{-{\gamma }}\left( 1-{\sigma }+{\sigma }(1-{\sigma }){\mathrm e}^{\rho }\right) -{\sigma }(1-{\sigma })({\mathrm e}^{\rho }-{\mathrm e}^{-t})}. \end{aligned}$$
(2.22)

If we now take \(t\uparrow \infty \), we get

$$\begin{aligned} \lim _{t\uparrow \infty } \widehat{E}_{t,{\lambda }}\left[ {\mathrm e}^{{\gamma }n(t+\ln (1-{\sigma })+\rho )}\right] = \frac{1}{{\mathrm e}^{-{\gamma }}\left( 1+{\sigma }{\mathrm e}^{\rho }\right) -{\sigma }{\mathrm e}^{\rho }} = \frac{(1+{\sigma }{\mathrm e}^{\rho })^{-1}}{{\mathrm e}^{-{\gamma }} -\frac{{\sigma }{\mathrm e}^{\rho }}{1+{\sigma }{\mathrm e}^{\rho }}}, \nonumber \\ \end{aligned}$$
(2.23)

which is the Laplace transform of the geometric distribution with parameter \((1+{\sigma }{\mathrm e}^{\rho })^{-1}\). \(\square \)

Remark

Note that, for fixed s, taking the limit \(t\uparrow \infty \) in (2.20), we unsurprisingly get \({\mathrm e}^{{\gamma }}\), indicating that there is just one particle.

We see that the mean number of particles ranges from 1 (as \(\rho \downarrow -\infty \)) through \(1+{\sigma }\) (for \(\rho =0\)) to \((1-{\sigma })^{-1}\) (for \(\rho =-\ln (1-{\sigma })\)). Note that if \({\sigma }=1\), n(t) is geometric with parameter \({\mathrm e}^{-t}\), which corresponds to BBM with binary branching.
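Theorem 2.3 (i) can be illustrated by importance sampling: draw n(t) from plain binary BBM (a Yule process) and attach the weight \({\sigma }^{n(t)-1}\). A Monte Carlo sketch with illustrative parameters:

```python
import numpy as np

# Monte Carlo sketch of Theorem 2.3 (i) via importance sampling: draw n(t)
# from plain binary BBM (a Yule process) and attach the weight
# sigma^(n(t)-1). Under the tilted measure, n(t) should be geometric with
# parameter p = 1 - sigma*(1 - exp(-t)). All numerical values are illustrative.
rng = np.random.default_rng(2)

def yule_n(t, rng):
    k, s = 1, 0.0
    while True:
        s += rng.exponential(1.0 / k)
        if s > t:
            return k
        k += 1

sigma, t = 0.7, 3.0
n = np.array([yule_n(t, rng) for _ in range(100000)])
w = sigma ** (n - 1.0)
p = 1.0 - sigma * (1.0 - np.exp(-t))
print(np.sum(n * w) / np.sum(w), 1.0 / p)   # tilted mean vs 1/p, cf. (2.14)
print(np.sum(w[n == 1]) / np.sum(w), p)     # tilted P(n(t)=1) vs p
```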

2.3 Distribution of the First Branching Time

We have seen so far that the repulsion strongly suppresses the number of branchings. The first branching time is then

$$\begin{aligned} {\tau }_1\equiv \inf \{ s>0:n(s)=2\}. \end{aligned}$$
(2.24)

Theorem 2.4

The distribution of the first branching time under \(\widehat{P}_{t,{\lambda }}\) is given by

$$\begin{aligned} \widehat{P}_{t,{\lambda }} \left( {\tau }_1\le t-r\right) ={\sigma }\frac{{\mathrm e}^{-r}-{\mathrm e}^{-t}}{1-{\sigma }(1-{\mathrm e}^{-r})}. \end{aligned}$$
(2.25)

Proof

Note that after the first branching, there will be two independent BBMs that run for the remaining time \(t-{\tau }_1\) and that are subject to the same penalty as before. In particular, given \({\tau }_1\), the total particle number n(t) is equal to the sum of the number of particles in these two branches,

$$\begin{aligned} n(t)= \tilde{n}^{(0)}(t-{\tau }_1)+ {{\tilde{n}}}^{(1)}(t-{\tau }_1), \end{aligned}$$
(2.26)

where \( {{\tilde{n}}}^{(i)}\) are the particles in the two branches that split at time \({\tau }_1\). Denote by \(v_{\lambda }(t,t-r)\) the unnormalised mass of paths that branch before time \(t-r\le t\), i.e. set

$$\begin{aligned} v_{\lambda }(t,t-r)\equiv {\mathbb {E}}\left[ {\sigma }^{n(t)-1}\mathbbm {1}_{{\tau }_1\le t- r}\right] . \end{aligned}$$
(2.27)

We get

$$\begin{aligned} v_{\lambda }(t,t-r)= & {} \int _0^{t-r} {\mathrm e}^{-s} {\mathbb {E}}\left[ {\sigma }^{{{\tilde{n}}}^{(0)}(t-s)+ {{\tilde{n}}}^{(1)}(t-s) -1}\right] {\hbox {d}}s\nonumber \\= & {} \int _0^{t-r} {\mathrm e}^{-s} {\sigma }{\mathbb {E}}\left[ {\sigma }^{ n(t-s)-1}\right] {\mathbb {E}}\left[ {\sigma }^{ n(t-s)-1}\right] {\hbox {d}}s\nonumber \\= & {} \int _0^{t-r} {\mathrm e}^{-s} {\sigma }v_{\lambda }(t-s)^2{\hbox {d}}s. \end{aligned}$$
(2.28)

Since \(v_{\lambda }(t-s)\) is known, this is an explicit formula, namely

$$\begin{aligned} v_{\lambda }(t,t-r)= & {} {\sigma }{\mathrm e}^{-2t} \int _0^{t-r} ds {\mathrm e}^{s} \frac{1}{(1-{\sigma }(1-{\mathrm e}^{-(t-s)}))^2}\nonumber \\= & {} {\mathrm e}^{-t}{\sigma }\frac{{\mathrm e}^{-r}-{\mathrm e}^{-t}}{(1-{\sigma }(1-{\mathrm e}^{-t}))(1-{\sigma }(1-{\mathrm e}^{-r}))}. \end{aligned}$$
(2.29)

Since \( \widehat{P}_{t,{\lambda }} \left( {\tau }_1\le t-r\right) =\frac{v_{\lambda }(t,t-r)}{v_{\lambda }(t)}\), (2.25) follows. \(\square \)

Remark

Note that, for r fixed, \(\widehat{P}_{t,{\lambda }} \left( {\tau }_1\le t-r\right) \) converges, as \(t\uparrow \infty \), to

$$\begin{aligned} \frac{{\sigma }{\mathrm e}^{-r}}{1-{\sigma }(1-{\mathrm e}^{-r})}. \end{aligned}$$
(2.30)

Note further that \(v_{\lambda }(t)= v_{\lambda }(t,t) +{\mathbb {E}}\left[ {\sigma }^{n(t)-1}\mathbbm {1}_{{\tau }_1>t}\right] = v_{\lambda }(t,t) +{\mathrm e}^{-t}\), since \(n(t)=1\) on the event \(\{{\tau }_1>t\}\), and therefore

$$\begin{aligned} \widehat{P}_{t,{\lambda }} \left( {\tau }_1\le t\right) =\frac{v_{\lambda }(t,t)}{v_{\lambda }(t)} =1-{\mathrm e}^{-t}/v_{\lambda }(t)<1. \end{aligned}$$
(2.31)
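Theorem 2.4 can be checked in the same way, by recording the first splitting time of the simulated tree together with the weight \({\sigma }^{n(t)-1}\). A Monte Carlo sketch with illustrative parameters:

```python
import numpy as np

# Monte Carlo sketch of Theorem 2.4: record the first splitting time tau_1
# of a simulated Yule tree together with the weight sigma^(n(t)-1) and
# compare the tilted probability P(tau_1 <= t-r) with (2.25). All
# numerical values are illustrative.
rng = np.random.default_rng(3)

def first_split_and_n(t, rng):
    k, s, tau1 = 1, 0.0, np.inf
    while True:
        s += rng.exponential(1.0 / k)
        if s > t:
            return tau1, k
        if k == 1:
            tau1 = s     # the first branching time
        k += 1

sigma, t, r = 0.7, 3.0, 1.0
draws = [first_split_and_n(t, rng) for _ in range(100000)]
tau1 = np.array([d[0] for d in draws])
n = np.array([d[1] for d in draws])
w = sigma ** (n - 1.0)
p_mc = np.sum(w[tau1 <= t - r]) / np.sum(w)
p_exact = sigma * (np.exp(-r) - np.exp(-t)) / (1.0 - sigma * (1.0 - np.exp(-r)))
print(p_mc, p_exact)
```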

3 Quasi-Markovian Time-Inhomogeneous Galton–Watson Trees

In this section, we introduce a class of models that are continuous-time versions of Galton–Watson processes, which are time-inhomogeneous and in general not Markov, but have an underlying discrete-time Markov property. These processes emerge in the models introduced above.

We start with discrete time trees and we introduce the usual Ulam–Harris labelling.

Let us define the set of (infinite) multi-indices

$$\begin{aligned} {\mathbf {I}}\equiv {\mathbb {Z}}_+^{\mathbb {N}}, \end{aligned}$$
(3.1)

and let \({\mathbf {F}}\subset {\mathbf {I}}\) denote the subset of multi-indices that contain only finitely many entries that are different from zero. Ignoring trailing zeros, we see that

$$\begin{aligned} {\mathbf {F}} = \cup _{k=0}^\infty {\mathbb {Z}}_+^k, \end{aligned}$$
(3.2)

where \({\mathbb {Z}}_+^0\) is either the empty multi-index or the multi-index containing only zeros. A discrete-time tree is then identified by a consistent sequence of sets of multi-indices, \(q(n)\) at time n as follows.

  • \(\{(0,0,\dots )\} =\{u(0)\}=q(0)\).

  • If \(u\in q(n)\) then \(u+(\underbrace{0,\dots ,0}_{n\times 0}, k,0,\dots )\in q(n+1)\) if \(0\le k\le l^u(n)-1\), where

    $$\begin{aligned} l^u(n)=\#\left\{ \text {offspring of the particle corresponding to } u \text { at time } n\right\} . \end{aligned}$$
    (3.3)

We assume here that \(l^u(n)\ge 1\) for all n and all u. We can relate the assignment of labels in a backwards consistent fashion as follows. For \(u\equiv (u_1,u_2,u_3,\dots )\in {\mathbb {Z}}_+^{\mathbb {N}}\), we define the function \(u(r), r\in {\mathbb {R}}_+\), through

$$\begin{aligned} u_\ell (r)\equiv {\left\{ \begin{array}{ll} u_\ell ,&{}\,\, \hbox {if}\,\, \ell \le r,\\ 0,&{}\,\, \hbox {if}\,\, \ell > r. \end{array}\right. } \end{aligned}$$
(3.4)

Clearly, if \(u(n)\in q(n)\) and \(r\le n\), then \(u(r)\in q(r)\). This allows us to define the boundary of the tree at infinity as follows:

$$\begin{aligned} {\partial }{\mathbf {T}} \equiv \left\{ u\in {\mathbf {I}}: \forall n<\infty , u(n)\in q(n)\right\} . \end{aligned}$$
(3.5)

We also want to be able to consider a branch of a tree as an entire new tree. For this, we use the notation \(\overleftarrow{u} =(u_1,u_2,u_3,\dots )\) if \(u=(u_0,u_1,u_2,\dots )\).

Given a discrete-time tree, we can turn it into a continuous-time tree by assigning waiting times to each vertex, resp. to each multi-index in the tree. For example, in the case of the standard continuous-time Galton–Watson tree, we simply assign iid standard exponential random variables, \(e_u(n)\), to each vertex, resp. multi-index. Note that we choose the notation in such a way that we think of u as an element of the boundary of the tree, and \(e_u(n)\) is the waiting time attached to the vertex labelled u(n) (in the n-th generation). This time represents the waiting time from the birth of this branch to its next branching. This allows us to assign a total time, \(T_u(n)\), for the branching of a multi-index at discrete time n, as

$$\begin{aligned} T_u(n)=t_0+\sum _{k=0}^n e_u(k), \end{aligned}$$
(3.6)

where \(t_0\in {\mathbb {R}}\) is an initial time associated to the root of the tree and \(e_{u}(0) \) is the time of the first branching of the root of the tree. We denote by \({{{\mathcal {F}}}}_n\) the \({\sigma }\)-algebra generated by the branching times of the first n generations of the tree, i.e.

$$\begin{aligned} {{{\mathcal {F}}}}_n\equiv {\sigma }\left( t_0, e_u(k), k\le n, u\in {\partial }{\mathbf {T}}\right) . \end{aligned}$$
(3.7)
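The Ulam–Harris bookkeeping above can be made concrete in a few lines; the following sketch (all names are illustrative, not from the text) grows a binary tree to a fixed depth, attaching to each vertex u(n) an exponential waiting time \(e_u(n)\) and the accumulated branching time \(T_u(n)\), here with \(t_0=0\).

```python
import random

# A minimal sketch of the Ulam-Harris bookkeeping: grow a binary tree to a
# fixed depth, attaching to each vertex u(n) an exponential waiting time
# e_u(n) and the accumulated branching time T_u(n) = sum_k e_u(k) (with
# t_0 = 0). All names and parameters are illustrative.
random.seed(4)

def grow(label, T, depth):
    """Return (label, T_u(n)) pairs for the subtree rooted at `label`."""
    T_next = T + random.expovariate(1.0)   # add the waiting time e_u(n)
    nodes = [(label, T_next)]
    if len(label) < depth:
        # binary branching: the children append 0 resp. 1 to the multi-index
        nodes += grow(label + (0,), T_next, depth)
        nodes += grow(label + (1,), T_next, depth)
    return nodes

tree = grow((0,), 0.0, depth=3)
for label, T in tree:
    print(label, round(T, 3))
```

Truncating a label (dropping its last entry) gives its ancestor, whose branching time is necessarily smaller, mirroring the backwards consistency of the labelling.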

We need to define further \({\sigma }\)-algebras that correspond to events that take place in sub-trees. For a given multi-index u, define the set of multi-indices that coincide with u in the first n entries,

$$\begin{aligned} {{{\mathcal {U}}}}_n(u)\equiv \left\{ v\in {\mathbf {T}}: \forall k\le n, v_k=u_k\right\} . \end{aligned}$$
(3.8)

Naturally, this is the subtree that branches off the branch u in the n-th generation. Next, we define the \({\sigma }\)-algebra generated by the times in these subtrees,

$$\begin{aligned} {{{\mathcal {G}}}}_n(u)\equiv {\sigma }\left( e_v(k), v\in {{{\mathcal {U}}}}_n(u), k\ge n\right) . \end{aligned}$$
(3.9)

For simplicity, we restrict ourselves in the remainder of this section to the case of binary branching, i.e. \(l^u(n)\equiv 2\). We define the notion of a normal set recursively.

Definition 3.1

A set \({{{\mathcal {E}}}}_n(u)\in {{{\mathcal {G}}}}_n(u)\) is called normal, if it is of the form

$$\begin{aligned} {{{\mathcal {E}}}}_n(u)= {\mathbb {R}}_+^{{\mathbf {T}}}, \end{aligned}$$
(3.10)

or if

$$\begin{aligned} {{{\mathcal {E}}}}_n(u)=\{e_u(n)\le r\}\cap {{{\mathcal {E}}}}_{n+1}((u(n),0))\cap {{{\mathcal {E}}}}_{n+1}((u(n),1)), \end{aligned}$$
(3.11)

where the two events \({{{\mathcal {E}}}}_{n+1}\) are normal. We say that a normal event \({{{\mathcal {E}}}}_n(u)\) has finite horizon, if there exists a finite \(N\ge n\) such that \({{{\mathcal {E}}}}_n(u)\in {{{\mathcal {F}}}}_N\).

Definition 3.2

We say that the assignment of branching times is quasi-Markov (with time horizon t), if there exists a family of probability measures, \(Q_{t,T}, t\in {\mathbb {R}}_+, T\le t\) on \({{{\mathcal {G}}}}_0(0)\) and a family of probability measures \(q_{t,T}, t\in {\mathbb {R}}_+, T\le t\) on \(({\mathbb {R}}_+,{{{\mathcal {B}}}}({\mathbb {R}}_+))\) that have the following property. For any event \({{{\mathcal {E}}}}_0(0)\in {{{\mathcal {G}}}}_0(0)\) which is of the form

$$\begin{aligned} {{{\mathcal {E}}}}_0(0)=\{e_0(0)\le r\}\cap {{{\mathcal {E}}}}_{1}((0,0))\cap {{{\mathcal {E}}}}_{1}((0,1)), \end{aligned}$$
(3.12)

where \({{{\mathcal {E}}}}_1(u)\in {{{\mathcal {G}}}}_1(u)\), for all \(t_0<r<t-t_0\),

$$\begin{aligned} Q_{t,t_0}({{{\mathcal {E}}}}_0(0))=\int _{t_0}^rq_{t,t_0}(ds) Q_{t,s}({{{\mathcal {E}}}}_1(\overleftarrow{00}))Q_{t,s}({{{\mathcal {E}}}}_1(\overleftarrow{01})). \end{aligned}$$
(3.13)

Remark

Heuristically, the probability measure \(q_{t,t_0} (ds)\) in (3.13) represents the distribution of the next branching time and the \( Q_{t,t_0+s}\) the law of the branches starting at time \(t_0+s\).

Lemma 3.3

The measures \(Q_{t,t_0}\) on the \({\sigma }\)-algebra generated by the normal events with finite horizon in \({{{\mathcal {G}}}}_0(u)\) are uniquely determined by the family of measures \(q_{t,s}, s\le t\). \(q_{t,s}\) is the law of \(e_u(n)\) conditioned on \(T_{u}(n-1)=s\).

Proof

From (3.13), it follows by simple iteration that the measure of any normal event of finite horizon is expressed uniquely in terms of q. Noting further that the set of finite horizon events is intersection stable, the assertion follows from Dynkin’s theorem. \(\square \)

The total tree at time t is then described as follows:

  1. (i)

    The branches of the tree alive are

    $$\begin{aligned} {{{\mathcal {A}}}}(t)\equiv \left\{ u(n): u \in {\mathbf {T}}, n\in {\mathbb {N}}_0 \;\text {s.t.}\; T_u(n-1)\le t<T_u(n)\right\} . \end{aligned}$$
    (3.14)
  2. (ii)

    The entire tree up to time t is the set

    $$\begin{aligned} {{{\mathcal {T}}}}(t)\equiv \left\{ u(k): k\le n, u(n)\in {{{\mathcal {A}}}}(t)\right\} . \end{aligned}$$
    (3.15)

Note that both sets are empty if \(t<t_0\). It is a bit cumbersome to write, but the distribution of the set \({{{\mathcal {T}}}}(t)\) together with the lengths of all branches can be written down explicitly in terms of the laws q and the branching laws of the underlying discrete-time tree.

4 The Simplified Model as Quasi-Markov Galton–Watson tree

We return to the approximate model defined in Sect. 1. For simplicity, we keep the assumption that the underlying tree is binary. We show that the branching times under the law \(\widehat{P}_{t,{\lambda }}\) define a quasi-Markov Galton–Watson tree.

Lemma 4.1

The branching times of the simplified model under the law \(\widehat{P}_{t,{\lambda }}\) are quasi-Markov, where \(Q_{t,T}\) is the marginal distribution of \(\widehat{P}_{t-T,{\lambda }}\) and \(q_{t,T}\) is absolutely continuous w.r.t. Lebesgue measure with density

$$\begin{aligned} \frac{{\mathrm e}^{-s} {\sigma }v_{\lambda }(t-s-T)^2}{v_{\lambda }(t-T)}\mathbbm {1}_{s\ge T}, \end{aligned}$$
(4.1)

namely,

$$\begin{aligned} \widehat{P}_{t-t_0,{\lambda }}({{{\mathcal {E}}}}_0(0))=\int _{t_0}^rq_{t,t_0}(ds) \widehat{P}_{t-s,{\lambda }}({{{\mathcal {E}}}}_1(\overleftarrow{00})) \widehat{P}_{t-s,{\lambda }}({{{\mathcal {E}}}}_1(\overleftarrow{01})). \end{aligned}$$
(4.2)

Proof

Let \({{{\mathcal {E}}}}_0(0)=\{e_0(0)\le r_0\}\cap {{{\mathcal {E}}}}_1(00)\cap {{{\mathcal {E}}}}_1(01)\). We now have

$$\begin{aligned} n(t)= {{\tilde{n}}}^{(00)} (t-T_1(00))+ {{\tilde{n}}}^{(01)} (t-T_1(00)), \end{aligned}$$
(4.3)

where the \({{\tilde{n}}} \) are the particle numbers in the two respective branches of the tree. In analogy to (2.28), we obtain

$$\begin{aligned}&v_{\lambda }(t-T)\widehat{P}_{t-T,{\lambda }}({{{\mathcal {E}}}}_0(0))\nonumber \\&\quad =\int _T^{r_0}{\mathrm e}^{-s+T} {\mathbb {E}}\left[ {\sigma }^{{{\tilde{n}}}^{(00)}(t-s)+\tilde{n}^{(01)}(t-s)-1} \mathbbm {1}_{{{{\mathcal {E}}}}_1(00)}\mathbbm {1}_{{{{\mathcal {E}}}}_1(01)}\right] {\hbox {d}}s\nonumber \\&\quad =\int _T^{r_0}{\mathrm e}^{-s+T} {\sigma }{\mathbb {E}}\left[ {\sigma }^{{{\tilde{n}}}^{(00)}(t-s)-1} \mathbbm {1}_{{{{\mathcal {E}}}}_1(00)}\right] {\mathbb {E}}\left[ {\sigma }^{{{\tilde{n}}}^{(01)}(t-s)-1} \mathbbm {1}_{{{{\mathcal {E}}}}_1(01)}\right] {\hbox {d}}s\nonumber \\&\quad =\int _T^{r_0}{\mathrm e}^{-s+T} {\sigma }v_{\lambda }(t-s)^2\frac{{\mathbb {E}}\left[ {\sigma }^{\tilde{n}^{(00)}(t-s)-1} \mathbbm {1}_{{{{\mathcal {E}}}}_1(00)}\right] }{v_{\lambda }(t-s)}\frac{ {\mathbb {E}}\left[ {\sigma }^{{{\tilde{n}}}^{(01)}(t-s)-1} \mathbbm {1}_{{{{\mathcal {E}}}}_1(01)}\right] }{v_{\lambda }(t-s)}{\hbox {d}}s\nonumber \\&\quad =\int _T^{r_0}{\mathrm e}^{-s+T} {\sigma }v_{\lambda }(t-s)^2 \widehat{P}_{t-s,{\lambda }}\left( {{{\mathcal {E}}}}_1(\overleftarrow{00})\right) \widehat{P}_{t-s,{\lambda }}\left( {{{\mathcal {E}}}}_1(\overleftarrow{01})\right) {\hbox {d}}s, \end{aligned}$$
(4.4)

where we used the independence of the events in the two branches under the original BBM measure \({\mathbb {P}}\), and the definition of \(\widehat{P}_{t,{\lambda }}\). This concludes the proof.

\(\square \)

5 The Limit \({\lambda }(t)\downarrow 0\)

We have seen that a penalty with fixed \({\lambda }<\infty \) and \(\epsilon >0\) enforces that only a finite number of branchings take place, even if we let t tend to infinity. To get more interesting results, we consider now the case when \({\lambda }={\lambda }(t) \) depends on t such that \({\lambda }(t)\downarrow 0\) as \(t\uparrow \infty \). In fact, we will see that a rather interesting limiting model arises in this setting. Clearly, in this case \({\sigma }({\lambda }(t),\epsilon ) \approx {\mathrm e}^{-{\lambda }(t)\epsilon ^2}\approx 1-{\lambda }(t)\epsilon ^2\) is a good approximation.

We first look at the partition function.

Lemma 5.1

Assume that \({\lambda }(t)\downarrow 0\), but \(t+\ln ({\lambda }(t)\epsilon ^2)\uparrow \infty \), as \(t\uparrow \infty \). Then

$$\begin{aligned} \lim _{t\uparrow \infty } {\mathrm e}^t {\lambda }(t) \epsilon ^2 v_{{\lambda }(t)}(t) =1. \end{aligned}$$
(5.1)

Proof

We just use the explicit form of \(v_{\lambda }(t)\) given in Lemma for \({\lambda }={\lambda }(t)\). This gives

$$\begin{aligned} {\mathrm e}^t {\lambda }(t) \epsilon ^2 v_{{\lambda }(t)}(t)= & {} \frac{{\lambda }(t) \epsilon ^2}{1-{\sigma }({\lambda }(t),\epsilon ) +{\sigma }({\lambda }(t),\epsilon ){\mathrm e}^{-t}}\nonumber \\= & {} \frac{{\lambda }(t) \epsilon ^2}{{\lambda }(t)\epsilon ^2+O({\lambda }(t)^2) +O({\mathrm e}^{-t})} =1+O({\lambda }(t)), \end{aligned}$$
(5.2)

which implies the statement of the lemma. \(\square \)
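As a quick numerical sanity check (not part of the argument), one can read off from (5.2) the closed form \(v_{\lambda }(t)={\mathrm e}^{-t}/(1-{\sigma }(1-{\mathrm e}^{-t}))\) and verify the convergence in (5.1); the schedule \({\lambda }(t)={\mathrm e}^{-t/2}\) and \(\epsilon =1\) below are arbitrary choices satisfying the hypotheses of the lemma:

```python
import math

def v(t, lam, eps):
    # closed form read off from (5.2): v_lambda(t) = e^{-t} / (1 - sigma*(1 - e^{-t}))
    sigma = math.exp(-lam * eps**2)
    return math.exp(-t) / (1.0 - sigma * (1.0 - math.exp(-t)))

eps = 1.0
for t in [10.0, 20.0, 30.0]:
    lam = math.exp(-t / 2.0)   # lambda(t) -> 0 while t + ln(lambda(t) eps^2) -> infinity
    print(t, math.exp(t) * lam * eps**2 * v(t, lam, eps))   # approaches 1
```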

From Theorem , we derive the asymptotics of the particle numbers.

Theorem 5.2

Assume that \({\lambda }(t)\downarrow 0\), but \(t+\ln ({\lambda }(t)\epsilon ^2)\uparrow \infty \), as \(t\uparrow \infty \). Then:

  1. (i)

    The number of particles at time t, multiplied by \({\lambda }(t)\epsilon ^2\), i.e. \({\lambda }(t)\epsilon ^2n(t)\), converges in distribution to an exponential random variable with parameter 1.

  2. (ii)

    For any \(\rho \in {\mathbb {R}}\), the number of particles at time \(s(t)=t+\ln ({\lambda }(t)\epsilon ^2) +\rho \) converges in distribution to a geometric random variable with parameter \(1/(1+{\mathrm e}^\rho )\).

  3. (iii)

    If \(\rho (t)\uparrow \infty \) but \(\ln ({\lambda }(t)\epsilon ^2) +\rho (t)\le 0\), the number of particles at time \(s(t)=t+\ln ({\lambda }(t)\epsilon ^2) +\rho (t)\), divided by \(1+{\mathrm e}^{\rho (t)}\), converges in distribution to an exponential random variable with parameter 1.

Proof

The proof follows easily from the explicit computations of the Laplace transforms of the particle numbers (see Eqs. (2.15) and (2.22)). \(\square \)

The next theorem gives the asymptotics of the first branching time.

Theorem 5.3

Let \({\lambda }(t)\) be as in Theorem . Then, for any \(\rho \in {\mathbb {R}}\),

$$\begin{aligned} \lim _{t\uparrow \infty } \widehat{P}_{{\lambda }(t),t} \left( {\tau }_1\le t+\ln \left( {\lambda }(t)\epsilon ^2\right) +\rho \right) = \frac{1}{{\mathrm e}^{-\rho }+1}. \end{aligned}$$
(5.3)

Proof

From the explicit formula (2.25), we get that

$$\begin{aligned} \widehat{P}_{{\lambda }(t),t} \left( {\tau }_1\le t-r\right)= & {} \frac{\left( 1-{\lambda }(t)\epsilon ^2\right) ({\mathrm e}^{-r}-{\mathrm e}^{-t})}{{\lambda }(t)\epsilon ^2+{\mathrm e}^{-r}\left( 1-{\lambda }(t)\epsilon ^2\right) }\left( 1+O\left( {\lambda }(t)^2\epsilon ^4\right) \right) \nonumber \\= & {} \frac{1-{\mathrm e}^{-(t-r)}}{{\mathrm e}^r{\lambda }(t)\epsilon ^2+1}\left( 1+O\left( {\lambda }(t)^2\epsilon ^4\right) \right) . \end{aligned}$$
(5.4)

To get something non-trivial, the first term in the denominator should be of order one. This suggests choosing \(r=r(t)=-\ln \left( {\lambda }(t)\epsilon ^2\right) -\rho \). Equation (5.3) then follows directly. \(\square \)
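The choice of r(t) in the proof can be checked numerically against the leading term of (5.4) (with the error factor dropped); \({\lambda }(t)={\mathrm e}^{-t/2}\) and \(\rho =0.7\) are arbitrary illustrative choices:

```python
import math

def p_tau1_le(t, r, lam, eps):
    # leading term of (5.4), the error factor 1 + O(lambda^2 eps^4) dropped
    le = lam * eps**2
    return (1.0 - le) * (math.exp(-r) - math.exp(-t)) / (le + math.exp(-r) * (1.0 - le))

eps, rho = 1.0, 0.7
for t in [20.0, 40.0]:
    lam = math.exp(-t / 2.0)
    r = -math.log(lam * eps**2) - rho          # the choice r(t) made in the proof
    print(p_tau1_le(t, r, lam, eps), 1.0 / (math.exp(-rho) + 1.0))   # converge to each other
```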

5.1 The limiting Quasi-Markov Galton–Watson tree

The theorem above also suggests defining

$$\begin{aligned} \tilde{\tau }_1\equiv {\tau }_1-t-\ln \left( {\lambda }(t)\epsilon ^2\right) . \end{aligned}$$
(5.5)

\({{\tilde{{\tau }}}}_1\) should be thought of as the position of the first branching seen from the standard position \(t +\ln ({\lambda }(t)\epsilon ^2)\).

To derive the asymptotics of the consecutive branching times, we just have to look at

$$\begin{aligned} \widehat{P}_{{\lambda }(t),t} (e_u(n+1)\le {\Delta }|{{{\mathcal {F}}}}_n)= \widehat{P}_{{\lambda }(t), t-T_u(n)}({\tau }_1\le {\Delta }). \end{aligned}$$
(5.6)

For this, we have from the previous computation in the proof of Theorem (see (5.4))

$$\begin{aligned} \widehat{P}_{{\lambda }(t), t-T_u(n)} \left( {\tau }_1\le {\Delta }\right) =\frac{1-{\mathrm e}^{-{\Delta }}}{{\mathrm e}^{t-T_u(n)+\ln \left( {\lambda }(t)\epsilon ^2\right) } {\mathrm e}^{-{\Delta }} +1}\left( 1+O\left( {\lambda }(t)^2\epsilon ^4\right) \right) . \nonumber \\ \end{aligned}$$
(5.7)

The asymptotic results above suggest considering the branching times of the process in the limit \(t\uparrow \infty \), \({\lambda }(t)\downarrow 0\), around the time \(t+\ln ({\lambda }(t)\epsilon ^2)\). We have seen that the time of the first branching, shifted by this value, converges in distribution to a random variable with distribution function \(1/\left( {\mathrm e}^{-\rho }+1\right) \) (which is supported on \((-\infty , \infty )\)).

In fact, we define a limiting model as a quasi-Markov Galton–Watson tree with the measures

$$\begin{aligned} q_{\infty ,T}(e\le {\Delta })=\frac{1-{\mathrm e}^{-{\Delta }}}{{\mathrm e}^{-{\Delta }-T}+1}. \end{aligned}$$
(5.8)

This gives, in particular, for the first branching time,

$$\begin{aligned} Q_{\infty , t_0}(e_0(0)\le {\Delta })=\frac{1-{\mathrm e}^{-{\Delta }}}{{\mathrm e}^{-{\Delta }-t_0}+1}. \end{aligned}$$
(5.9)

We have to choose \(t_0\) to match this with the known asymptotics of the first branching time, see (5.3). We set

$$\begin{aligned} Q_{\infty ,-\infty }(T_0(0)\le {\Delta })\equiv \lim _{t_0\downarrow -\infty } Q_{\infty , t_0}(e_0(0)\le -t_0+{\Delta }) =\lim _{t_0\downarrow -\infty } \frac{1-{\mathrm e}^{t_0 -{\Delta }}}{ {\mathrm e}^{-{\Delta }}+1}=\frac{1}{1+{\mathrm e}^{-{\Delta }}},\nonumber \\ \end{aligned}$$
(5.10)

for all \({\Delta }\in {\mathbb {R}}\). So the picture is that we start the process at \(t_0=-\infty \); the first branching is then infinitely far in the future of the starting time, but occurs at a finite random time distributed according to (5.10). This is the standard logistic distribution, with density \(\frac{1}{4} \cosh ({\Delta }/2)^{-2}\). In particular, it has mean zero and variance \(\pi ^2/3\).
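The limit (5.10) and the stated moments of the limiting law can be checked by direct quadrature of the density \(F'({\Delta })={\mathrm e}^{-{\Delta }}/(1+{\mathrm e}^{-{\Delta }})^2\) (an illustration only; the grid and cutoff are arbitrary):

```python
import math

F = lambda d: 1.0 / (1.0 + math.exp(-d))               # limiting law (5.10)
f = lambda d: math.exp(-d) / (1.0 + math.exp(-d))**2   # its density F'
q = lambda d, t0: (1.0 - math.exp(-d)) / (math.exp(-d - t0) + 1.0)   # (5.9)

# (5.9), shifted by -t0, approaches F as t0 -> -infinity, cf. (5.10)
print(q(0.5 + 30.0, -30.0), F(0.5))

# midpoint quadrature for total mass, mean and variance of the density
h = 1e-3
xs = [-40.0 + h * (i + 0.5) for i in range(int(80.0 / h))]
mass = h * sum(f(x) for x in xs)
mean = h * sum(x * f(x) for x in xs)
var = h * sum(x * x * f(x) for x in xs)
print(mass, mean, var, math.pi**2 / 3.0)   # ~1, ~0, both ~3.2899
```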

We have the following result.

Theorem 5.4

Assume that \({\lambda }(t)\downarrow 0\) and \(t+\ln ({\lambda }(t)\epsilon ^2)\uparrow \infty \), as \(t\uparrow \infty \). Then, for any \({\Delta }\in {\mathbb {R}}\) and events \({{{\mathcal {E}}}}_1(u)\in {{{\mathcal {G}}}}_1\),

$$\begin{aligned}&\lim _{t\uparrow \infty } \widehat{P}_{{\lambda }(t),t} \left( \left\{ T_0(0)\le {\Delta }+t+\ln ({\lambda }(t)\epsilon ^2)\right\} \cap {{{\mathcal {E}}}}_1((0,0))\cap {{{\mathcal {E}}}}_1((0,1)) \right) \nonumber \\&\qquad = Q_{\infty ,-\infty } \left( \{T_0(0)\le {\Delta }\}\cap {{{\mathcal {E}}}}_1((0,0))\cap {{{\mathcal {E}}}}_1((0,1))\right) , \end{aligned}$$
(5.11)

where \(Q_{\infty ,-\infty }\) is the law of the limiting model.

Proof

We need to show that the measures \(q_{t,T} \) converge to \(q_{\infty ,T}\) as \(t\rightarrow \infty \). But this follows from (5.7). \(\square \)

Fig. 1 Scaling towards the limit process

Note further that, as long as \(T_u(n)\) is negative, the conditional distribution of \(e_u(n+1)\) is concentrated around \(-T_u(n)\), while as \(T_u(n)\) becomes positive and large, it tends to a standard exponential distribution. In fact,

$$\begin{aligned} {\mathbb {E}}\left[ e_u(n+1)|{{{\mathcal {F}}}}_n\right] = \left( 1+{\mathrm e}^{T_u(n)}\right) \ln \left( 1+{\mathrm e}^{-T_u(n)}\right) , \end{aligned}$$
(5.12)

from (5.6) and by integrating (5.8). Clearly, this converges to 1 as \(T_u(n)\uparrow \infty \), and behaves like \(-T_u(n)\) as \(T_u(n)\) tends to \(-\infty \).
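Formula (5.12) can be confirmed numerically by integrating the tail of the distribution (5.8), using \({\mathbb {E}}[e]=\int _0^\infty (1-q_{\infty ,T}(e\le {\Delta })){\hbox {d}}{\Delta }\); the step size, cutoff and values of T below are arbitrary:

```python
import math

def mean_e(T, h=1e-3, cutoff=60.0):
    # E[e] = int_0^infty (1 - q_{infty,T}(e <= d)) dd, with q from (5.8)
    q = lambda d: (1.0 - math.exp(-d)) / (math.exp(-d - T) + 1.0)
    return h * sum(1.0 - q(h * (i + 0.5)) for i in range(int(cutoff / h)))

for T in [-6.0, 0.0, 6.0]:
    closed = (1.0 + math.exp(T)) * math.log1p(math.exp(-T))   # formula (5.12)
    print(T, mean_e(T), closed)   # agree; ~ -T for very negative T, ~ 1 for large T
```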

6 The Distribution of the Front

An obvious first question is the distribution of the maximum of BBM under the law \(\widehat{P}_{t,{\lambda }}\). We define, for any \(z\in {\mathbb {R}}\),

$$\begin{aligned} u_{\lambda }(t,z)={\mathbb {E}}\left[ {\sigma }^{n(t)-1}\mathbbm {1}_{\forall _{i=1}^{n(t)} x_i(t)\le z}\right] . \end{aligned}$$
(6.1)

Then

$$\begin{aligned} \widehat{P}_{t,{\lambda }}\left( \forall _{i=1}^{n(t)} x_i(t)\le z\right) =\frac{u_{\lambda }(t,z)}{v_{\lambda }(t)}\equiv 1-w_{\lambda }(t,z). \end{aligned}$$
(6.2)

Note that we work with \(1-w_{\lambda }\) in order to be closer to the usual formulation of the F-KPP equation.

Interestingly, \(w_{\lambda }\) solves a time-dependent version of the F-KPP equation.

Lemma 6.1

\(w_{\lambda }\) defined in (6.2) is the unique solution of the equation

$$\begin{aligned} {\partial }_t w_{\lambda }=\frac{1}{2}{\partial }_{xx} w_{\lambda }+ w_{\lambda }(1-w_{\lambda }) \frac{{\sigma }}{(1-{\sigma }){\mathrm e}^{t}+{\sigma }}, \end{aligned}$$
(6.3)

with initial condition \(w_{\lambda }(0,x)=\mathbbm {1}_{x\le 0}\).

Remark

Note that \(w_{\lambda }\) does not refer to a single stochastic process but to a family of processes, indexed by their final time t.

Proof

In complete analogy to the derivation of the F-KPP equation (see e.g. [8]), \(u_{\lambda }\) satisfies the recursive equation

$$\begin{aligned} u_{\lambda }(t,z)={\mathrm e}^{-t} \Phi _t(z)+\int _0^t ds {\mathrm e}^{-(t-s)} \int _{-\infty }^{\infty } {\hbox {d}}y\frac{{\mathrm e}^{-\frac{y^2}{2(t-s)}}}{\sqrt{2\pi (t-s)}}{\sigma }u_{\lambda }(s,z-y)^2,\nonumber \\ \end{aligned}$$
(6.4)

where \(\Phi _t(z)=\int _{-\infty }^z \frac{{\mathrm e}^{-\frac{x^2}{2t}}}{\sqrt{2\pi t}}{\hbox {d}}x\) is the probability that a single Brownian motion at time t is smaller than z. Letting \(H(t,x)\equiv \frac{ {\mathrm e}^{-t -\frac{x^2}{2t}}}{\sqrt{2\pi t}}\), we can write this as

$$\begin{aligned} u_{\lambda }(t,z)= \int _{-\infty }^\infty {\hbox {d}}y H(t, z-y)\mathbbm {1}_{y\le 0} +\int _0^t {\hbox {d}}s \int _{-\infty }^\infty {\hbox {d}}y H(s,y) {\sigma }u_{\lambda }(t-s,z-y)^2.\nonumber \\ \end{aligned}$$
(6.5)

Note that H is the Green function for the differential operator \({\partial }_t-\frac{1}{2} {\partial }_{xx}+1\) and so \(u_{\lambda }\) is the mild formulation of the partial differential equation

$$\begin{aligned} {\partial }_t u_{\lambda }(t,z)=\frac{1}{2} {\partial }_{zz} u_{\lambda }(t,z)-u_{\lambda }(t,z)+{\sigma }u_{{\lambda }}(t,z)^2, \end{aligned}$$
(6.6)

with initial condition \(u_{\lambda }(0,z)=\mathbbm {1}_{z\ge 0}\). This equation is the F-KPP equation if \({\sigma }({\lambda },\epsilon )=1\), i.e. if \({\lambda }=0\), and looks similar to it in general. Hence,

$$\begin{aligned} {\partial }_t w_{\lambda }= & {} -\frac{{\partial }_t u_{\lambda }}{v_{\lambda }}+\frac{u_{\lambda }{\partial }_t v_{\lambda }}{v_{\lambda }^2} =\frac{1}{2}{\partial }_{xx} w_{\lambda }+{\sigma }w_{\lambda }(1-w_{\lambda })v_{\lambda }\nonumber \\= & {} \frac{1}{2}{\partial }_{xx} w_{\lambda }+ w_{\lambda }(1-w_{\lambda }) \frac{{\sigma }}{(1-{\sigma }){\mathrm e}^{t}+{\sigma }}, \end{aligned}$$
(6.7)

where we used the explicit form of \(v_{\lambda }\) from Lemma . \(\square \)

Note that (6.3) is a time-dependent version of the F-KPP equation, where the nonlinear term is modulated down over time. Time-dependent F-KPP equations have been studied in the past (see e.g. [18, 24, 25]), but we did not find this specific example in the literature. For small \({\lambda }\), (6.3) becomes

$$\begin{aligned} {\partial }_t w_{\lambda }=\frac{1}{2}{\partial }_{xx} w_{\lambda }+ w_{\lambda }(1-w_{\lambda })\frac{1+O({\lambda }\epsilon ^2)}{{\lambda }\epsilon ^2{\mathrm e}^t+1}. \end{aligned}$$
(6.8)

For future use, note that (6.8) is a special case of a class of F-KPP equations of the form

$$\begin{aligned} {\partial }_t \psi =\frac{1}{2}{\partial }_{xx} \psi + g(t)\psi (1-\psi ), \end{aligned}$$
(6.9)

where \(g:{\mathbb {R}}\rightarrow {\mathbb {R}}_+\). We will be interested in the case when g is bounded, monotone decreasing, and integrable.

The key tool for analysing solutions of (6.3) is the Feynman–Kac representation for \(\psi \) (see Bramson [10]).

Lemma 6.2

If \(\psi \) is a solution of the equation (6.9) with initial condition \(\psi (0,x)=\rho (x) \in [0,1]\), then \(\psi \) satisfies

$$\begin{aligned} \psi (t,x)={\mathbb {E}}_x\left[ \exp \left( \int _0^t g(t-s)(1-\psi (t-s,B_s))ds\right) \rho (B_t)\right] , \end{aligned}$$
(6.10)

where B is Brownian motion starting in x.

The strategy used by Bramson to exploit this representation is to insert a priori bounds on \(\psi \) into the right-hand side of the equation in order to get sharp upper and lower bounds. Here we want to do the same, but we need to take into account the specifics of the function g. Going back to the specific case (6.8), g remains close to 1 for a fairly long time (of order \(-\ln ({\lambda }\epsilon ^2)\)) and then decays exponentially to zero with rate 1. Therefore, we expect that, initially, the solution behaves like that of the F-KPP equation and approaches a travelling wave solution. As time goes on, the wave slows down and essentially comes to a halt. Finally, we see a pure diffusion. We will deal with these three regimes differently. We begin with the initial phase, when \(g(t)\sim 1\).

Lemma 6.3

Assume that g is non-increasing, bounded above by one and below by zero. Define \(G(t)=\int _0^tg(s) ds\). Then

$$\begin{aligned} {\mathrm e}^{G(t)-t} \psi _0(t,x)\le \psi (t,x) \le \psi _0(t,x), \quad \forall x\in {\mathbb {R}}, t\in {\mathbb {R}}_+, \end{aligned}$$
(6.11)

where \(\psi _0\) is the solution of (6.9) with \(g(t)\equiv 1\) and initial condition \(\psi _0(0,x)=\rho (x)\in [0,1]\).

Proof

The upper bound follows from the maximum principle since \(g(t-s)\le 1\). For the lower bound, starting from (6.10), we see that

$$\begin{aligned} \psi (t,x)= & {} {\mathbb {E}}_x\left[ \exp \left( \int _0^t \left( 1-\psi (t-s,B_s)\right) ds + \int _0^t (g(t-s)-1)(1-\psi (t-s,B_s))ds\right) \rho (B_t)\right] \nonumber \\\ge & {} \exp \left( \int _0^t (g(t-s)-1)ds\right) {\mathbb {E}}_x\left[ \exp \left( \int _0^t \left( 1-\psi (t-s,B_s)\right) ds\right) \rho (B_t)\right] \nonumber \\\ge & {} \exp \left( \int _0^t (g(t-s)-1)ds\right) {\mathbb {E}}_x\left[ \exp \left( \int _0^t \left( 1-\psi _0(t-s,B_s)\right) ds\right) \rho (B_t)\right] \nonumber \\= & {} \exp (G(t)-t) \psi _0(t,x), \end{aligned}$$
(6.12)

where the last inequality follows from the already proven upper bound \(\psi (t,x)\le \psi _0(t,x)\). \(\square \)

G can be computed explicitly for \(g(t)={\sigma }/((1-{\sigma }) {\mathrm e}^t+{\sigma })\), namely

$$\begin{aligned} G(t)= t-\ln \left( 1+(1/{\sigma }-1){\mathrm e}^t\right) +\ln \left( 1/{\sigma }\right) = t-\ln \left( {\sigma }+(1-{\sigma }) {\mathrm e}^{t}\right) . \nonumber \\ \end{aligned}$$
(6.13)

Notice that

$$\begin{aligned} \lim _{t\uparrow \infty } G(t)=-\ln (1-{\sigma }). \end{aligned}$$
(6.14)
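The closed form (6.13) and the limit (6.14) can be checked against direct quadrature of g (an illustration; the value of \({\lambda }\epsilon ^2\) is arbitrary):

```python
import math

lam_eps2 = 0.01                      # arbitrary small lambda * eps^2
sigma = math.exp(-lam_eps2)
g = lambda t: sigma / ((1.0 - sigma) * math.exp(t) + sigma)
G = lambda t: t - math.log(sigma + (1.0 - sigma) * math.exp(t))   # closed form (6.13)

def G_num(t, n=200000):
    # midpoint quadrature of int_0^t g(s) ds
    h = t / n
    return h * sum(g(h * (i + 0.5)) for i in range(n))

print(G(5.0), G_num(5.0))                    # agree
print(G(50.0), -math.log(1.0 - sigma))       # limit (6.14)
```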

Define, for \({\delta }>0\),

$$\begin{aligned} {\tau }_{\delta }\equiv \sup \left\{ t>0: 1-g(t)\le {\delta }\right\} . \end{aligned}$$
(6.15)

Obviously,

$$\begin{aligned} \frac{{\sigma }}{(1-{\sigma }) {\mathrm e}^{{\tau }_{\delta }}+{\sigma }} =1-{\delta }, \end{aligned}$$
(6.16)

so

$$\begin{aligned} {\tau }_{\delta }=-\ln (1/{\sigma }-1)-\ln (1/{\delta }-1). \end{aligned}$$
(6.17)

Finally,

$$\begin{aligned} G({\tau }_{\delta }) ={\tau }_{\delta }+\ln (1-{\delta })+\ln (1/{\sigma }). \end{aligned}$$
(6.18)

In the limit \({\lambda }\downarrow 0\), we get

$$\begin{aligned} {\tau }_{\delta }\sim -\ln ({\lambda }\epsilon ^2)-\ln (1/{\delta }-1), \end{aligned}$$
(6.19)
$$\begin{aligned} G({\tau }_{\delta })-{\tau }_{\delta }\sim \ln (1-{\delta }), \end{aligned}$$
(6.20)

and

$$\begin{aligned} G(\infty )\equiv \lim _{t\uparrow \infty }G(t)\sim -\ln ({\lambda }\epsilon ^2). \end{aligned}$$
(6.21)
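The identities (6.16)-(6.19) for \({\tau }_{\delta }\) can be verified numerically (a sketch; the values of \({\lambda }\epsilon ^2\) and \({\delta }\) are arbitrary):

```python
import math

lam_eps2, delta = 1e-4, 0.3          # arbitrary choices
sigma = math.exp(-lam_eps2)
g = lambda t: sigma / ((1.0 - sigma) * math.exp(t) + sigma)
G = lambda t: t - math.log(sigma + (1.0 - sigma) * math.exp(t))

tau = -math.log(1.0 / sigma - 1.0) - math.log(1.0 / delta - 1.0)       # (6.17)
print(g(tau), 1.0 - delta)                                             # (6.16): equal
print(G(tau), tau + math.log(1.0 - delta) + math.log(1.0 / sigma))     # (6.18): equal
print(tau, -math.log(lam_eps2) - math.log(1.0 / delta - 1.0))          # (6.19): close
```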

We see that, as \({\lambda }\downarrow 0\), \({\tau }_{\delta }\uparrow \infty \). This allows us to deduce the precise behaviour of the solution at this time via Bramson’s results.

Lemma 6.4

Let \(w_{\lambda }\) satisfy (6.3) with Heaviside initial condition. Then, for each \({\delta }>0\), there exists \({\lambda }_0\) such that, for all \({\lambda }<{\lambda }_0\),

$$\begin{aligned} (1-{\delta }) v_0(x)\le w_{\lambda }\left( {\tau }_{\delta }, x+ m({\tau }_{\delta })\right) \le v_0(x), \end{aligned}$$
(6.22)

where \(v_0\) is a travelling wave of the F-KPP equation with speed \(\sqrt{2}\)

$$\begin{aligned} \frac{1}{2}{\partial }_{xx} v_0 +\sqrt{2} {\partial }_x v_0 +v_0(1-v_0)=0, \end{aligned}$$
(6.23)

and \(m(t)\equiv \sqrt{2} t-\frac{3}{2\sqrt{2}} \ln t\).

Proof

This follows immediately from Bramson’s theorems A and B in [10], Lemma , and (6.13). \(\square \)

Next, we look at the behaviour of the solution for times when \(g(t)\ll 1\).

Lemma 6.5

Let \(\psi \) solve (6.9) with initial condition \(\psi (0,x)=\rho (x)\in [0,1]\). Define, for \({\Delta }>0\), \(T_{\Delta }\) by

$$\begin{aligned} T_{\Delta }=\inf \left\{ t>0: \int _t^\infty g(s){\hbox {d}}s\le {\Delta }\right\} . \end{aligned}$$
(6.24)

Then, for \(t>T_{\Delta }\),

$$\begin{aligned} {\mathbb {E}}_x \left[ \psi (T_{\Delta },B_{t-T_{\Delta }})\right] \le \psi (t,x)\le {\mathrm e}^{\Delta }{\mathbb {E}}_x \left[ \psi (T_{\Delta },B_{t-T_{\Delta }})\right] , \end{aligned}$$
(6.25)

where B is Brownian motion started in x.

Proof

We have that, for \(t\ge T_{\Delta }\),

$$\begin{aligned} \psi (t,x) ={\mathbb {E}}_x\left[ \exp \left( \int _0^{t-T_{\Delta }} g(t-s) (1-\psi (t-s,B_s)){\hbox {d}}s\right) \psi (T_{\Delta },B_{t-T_{\Delta }})\right] .\nonumber \\ \end{aligned}$$
(6.26)

The exponent is trivially bounded by

$$\begin{aligned} 0\le \int _0^{t-T_{\Delta }} g(t-s)(1-\psi (t-s,B_s)){\hbox {d}}s\le \int _0^{t-T_{\Delta }} g(t-s){\hbox {d}}s = \int _{T_{\Delta }}^t g(s){\hbox {d}}s \le {\Delta }.\nonumber \\ \end{aligned}$$
(6.27)

Inserting these bounds into (6.26) gives (6.25). \(\square \)

In the case when G is given by (6.13), \(T_{\Delta }\) is determined by

$$\begin{aligned} G(\infty )-G(T_{\Delta })= {\Delta }. \end{aligned}$$
(6.28)

But

$$\begin{aligned} G(\infty )-G(T_{\Delta })= & {} -\ln (1-{\sigma }) -T_{\Delta }+\ln \left( {\sigma }+(1-{\sigma }){\mathrm e}^{T_{\Delta }}\right) \nonumber \\= & {} -\ln (1-{\sigma }) +\ln \left( {\sigma }{\mathrm e}^{-T_{\Delta }}+(1-{\sigma })\right) = \ln \left( {\sigma }{\mathrm e}^{-T_{\Delta }}/(1-{\sigma })+1\right) . \nonumber \\ \end{aligned}$$
(6.29)

We make the ansatz \(T_{\Delta }=-\ln (1-{\sigma })+z\) and determine z. Then

$$\begin{aligned} G(\infty )-G(T_{\Delta })=\ln \left( {\sigma }{\mathrm e}^{-z}+1\right) . \end{aligned}$$
(6.30)

Setting this equal to \({\Delta }\) and solving for z gives

$$\begin{aligned} T_{\Delta }=-\ln (1/{\sigma }-1) -\ln \left( {\mathrm e}^{\Delta }-1\right) \sim -\ln (1/{\sigma }-1) +\ln (1/{\Delta }), \end{aligned}$$
(6.31)

for small \({\Delta }\). In particular, we have that

$$\begin{aligned} T_{\Delta }-{\tau }_{\delta }= -\ln \left( {\mathrm e}^{\Delta }-1\right) +\ln (1/{\delta }-1) \sim \ln (1/{\Delta })+\ln (1/{\delta }), \end{aligned}$$
(6.32)

for small \({\Delta }\) and \({\delta }\).
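One can check numerically that \(T_{\Delta }\) from (6.31) satisfies the defining relation (6.28) exactly, and matches the small-\({\Delta }\) asymptotics only roughly for moderate \({\Delta }\) (parameter values arbitrary):

```python
import math

lam_eps2, Delta = 1e-5, 0.2          # arbitrary choices
sigma = math.exp(-lam_eps2)
G = lambda t: t - math.log(sigma + (1.0 - sigma) * math.exp(t))
G_inf = -math.log(1.0 - sigma)

T = -math.log(1.0 / sigma - 1.0) - math.log(math.expm1(Delta))   # (6.31)
print(G_inf - G(T), Delta)                                       # defining relation (6.28): equal
print(T, -math.log(lam_eps2) + math.log(1.0 / Delta))            # asymptotics: roughly equal
```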

What is left to do is to control the evolution of the solution between the times \({\tau }_{\delta }\) and \(T_{\Delta }\).

Lemma 6.6

With the notation above, for \({\tau }_{\delta }\le t\le T_{\Delta }\),

$$\begin{aligned} {\mathbb {E}}_x\left[ w_{\lambda }({\tau }_{\delta },B_{t-{\tau }_{\delta }})\right] \le w_{\lambda }(t,x) \le {\mathrm e}^{G(t)-G({\tau }_{\delta })} {\mathbb {E}}_x\left[ w_{\lambda }({\tau }_{\delta },B_{t-{\tau }_{\delta }})\right] \wedge 1, \end{aligned}$$
(6.33)

where

$$\begin{aligned} G(t)-G({\tau }_{\delta })=-\ln \left( (1-{\delta }){\mathrm e}^{-(t-{\tau }_{\delta })}+{\delta }\right) , \end{aligned}$$
(6.34)

and

$$\begin{aligned} {\mathrm e}^{G(T_{\Delta })-G({\tau }_{\delta })} \sim \frac{1}{{\delta }}{\mathrm e}^{-{\Delta }}. \end{aligned}$$
(6.35)

Proof

The Feynman–Kac representation gives, for \({\tau }_{\delta }\le t\le T_{\Delta }\),

$$\begin{aligned} w_{\lambda }(t,x) ={\mathbb {E}}_x\left[ \exp \left( \int _0^{t-{\tau }_{\delta }} g(t-s)(1-w_{\lambda }(t-s, B_s)){\hbox {d}}s\right) w_{\lambda }({\tau }_{\delta },B_{t-{\tau }_{\delta }})\right] .\nonumber \\ \end{aligned}$$
(6.36)

The bounds (6.33) follow from (6.36) together with the fact that \(w_{\lambda }\in [0,1]\), so that the exponent lies between 0 and \(G(t)-G({\tau }_{\delta })\). Equation (6.34) follows from (6.13) and (6.18). \(\square \)
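The identity (6.34) can be verified numerically for a few values of t (an illustration; parameters arbitrary):

```python
import math

lam_eps2, delta = 1e-4, 0.25         # arbitrary choices
sigma = math.exp(-lam_eps2)
G = lambda t: t - math.log(sigma + (1.0 - sigma) * math.exp(t))
tau = -math.log(1.0 / sigma - 1.0) - math.log(1.0 / delta - 1.0)   # (6.17)

for t in [tau + 0.5, tau + 2.0, tau + 5.0]:
    lhs = G(t) - G(tau)
    rhs = -math.log((1.0 - delta) * math.exp(-(t - tau)) + delta)  # (6.34)
    print(lhs, rhs)   # equal
```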

Remark

We expect the upper bound to be closer to the correct answer.

We now combine all estimates. This gives, for \(t\ge T_{\Delta }\),

$$\begin{aligned} (1-{\delta }) {\mathbb {E}}_0\left[ v_0(x+B_{t-{\tau }_{\delta }})\right] \le w_{\lambda }(t,x+m({\tau }_{\delta }))\le (1/{\delta }) {\mathbb {E}}_0\left[ v_0(x+B_{t-{\tau }_{\delta }})\right] \wedge 1.\nonumber \\ \end{aligned}$$
(6.37)

If we choose, e.g. \({\delta }=1/2\), the upper and lower bounds differ only by a factor of 4.

The expectation over \(v_0\) can be bounded using the known tail estimates (see [10] and [14]),

$$\begin{aligned} v_0(x)\le & {} Cx{\mathrm e}^{-\sqrt{2} x}, \;\hbox {if}\, x>1, \end{aligned}$$
(6.38)
$$\begin{aligned} v_0(x)\ge & {} 1-C{\mathrm e}^{(2-\sqrt{2})x}, \;\hbox {if}\, x<-1. \end{aligned}$$
(6.39)

However, the resulting expressions are neither particularly nice nor precise, so we leave their computation to the interested reader.

We conclude this chapter by summarising the behaviour of the solution as a function of t when \({\lambda }\downarrow 0\).

Theorem 6.7

Assume that \({\lambda }\downarrow 0\). Let \(0<{\delta }<1\) and \({\tau }_{\delta }\) defined as in (6.15). Then

$$\begin{aligned} {\tau }_{\delta }= -\ln (1/{\sigma }-1)-\ln (1/{\delta }-1)\sim -\ln ({\lambda }\epsilon ^2)-\ln (1/{\delta }-1). \end{aligned}$$
(6.40)

Moreover, for \({\Delta }>0\), let \(T_{\Delta }\) be defined in (6.24). Then,

$$\begin{aligned} T_{\Delta }=-\ln (1/{\sigma }-1) -\ln \left( {\mathrm e}^{\Delta }-1\right) \sim -\ln ({\lambda }\epsilon ^2) +\ln (1/{\Delta }), \end{aligned}$$
(6.41)

for \({\lambda }\downarrow 0\) and \({\Delta }\) small. Then the solution \(w_{\lambda }\) of (6.3) can be described as follows.

  1. (i)

    For \(0\ll t\le {\tau }_{\delta }\)

    $$\begin{aligned} w_{\lambda }(t,x+m(t)) \sim v_0(x), \end{aligned}$$
    (6.42)

    where \(v_0\) is the solution of (6.23).

  2. (ii)

    For \({\tau }_{\delta }<t<T_{\Delta }\), we have

    $$\begin{aligned} (1-{\delta }) {\mathbb {E}}_x\left[ v_0(B_{t-{\tau }_{\delta }})\right] \le w_{\lambda }(t,x+m({\tau }_{\delta })) \le {\mathrm e}^{G(t)-G({\tau }_{\delta })} {\mathbb {E}}_x\left[ v_0(B_{t-{\tau }_{\delta }})\right] \wedge 1,\nonumber \\ \end{aligned}$$
    (6.43)
  3. (iii)

    For \(t \ge T_{\Delta }\), we have

    $$\begin{aligned} (1-{\delta }) {\mathbb {E}}_x\left[ v_0(B_{t-{\tau }_{\delta }})\right] \le w_{\lambda }(t,x+m({\tau }_{\delta })) \le \frac{1}{{\delta }} {\mathbb {E}}_x\left[ v_0(B_{t-{\tau }_{\delta }})\right] \wedge 1. \end{aligned}$$
    (6.44)

This picture corresponds, in a time-reversed way, to the geometric picture established in the preceding sections (recall the remark after Lemma ): the diffusive behaviour at large times corresponds to the Brownian motion up to the time of the first branching \((\sim - \ln ({\lambda }\epsilon ^2))\); the travelling wave behaviour at times up to \({\tau }_{\delta }\) corresponds to the almost free branching at the late times after \(t +\ln ({\lambda }\epsilon ^2)\); and the finite time interval between \({\tau }_{\delta }\) and \(T_{\Delta }\), when the travelling wave comes to a halt, corresponds to the first branching steps, which are asymptotically described by the limiting quasi-Markov Galton–Watson tree of Sect. 5.

7 Comparison to the Full Model

We show that the original model, with the interaction given by \(I_t\) (see Eq. (1.2)), behaves similarly to the simplified model. In particular, the first branching happens at least as late as in the simplified model.

Lemma 7.1

Let \({\tau }_1\) be the first branching time. Then

$$\begin{aligned} P_{t,{\lambda }} \left( {\tau }_1\le t- r\right) \le \frac{{\mathrm e}^{-{\lambda }\epsilon ^2}({\mathrm e}^{-r}-{\mathrm e}^{-t})}{\left( 1-{\mathrm e}^{-{\lambda }\epsilon ^2} (1-{\mathrm e}^{-t})\right) \left( 1-{\mathrm e}^{-{\lambda }\epsilon ^2}(1-{\mathrm e}^{-r})\right) }. \end{aligned}$$
(7.1)

For \({\lambda }(t)\downarrow 0\) and \(t\uparrow \infty \), this behaves as

$$\begin{aligned} P_{t,{\lambda }(t)} \left( {\tau }_1\le t- r\right) \le \frac{{\mathrm e}^{-r}}{{\lambda }(t)\epsilon ^2 \left( {\lambda }(t)\epsilon ^2+{\mathrm e}^{-r}\right) }=\frac{1}{{\lambda }(t)^2\epsilon ^4{\mathrm e}^{r} +{\lambda }(t)\epsilon ^2}. \end{aligned}$$
(7.2)

And so

$$\begin{aligned} P_{t,{\lambda }(t)} \left( {\tau }_1\le t +2\ln \left( {\lambda }(t)\epsilon ^2\right) -\rho \right) \le {\mathrm e}^{-\rho }. \end{aligned}$$
(7.3)

Proof

Set

$$\begin{aligned} V(t,r)\equiv {\mathbb {E}}\left[ {\mathrm e}^{-{\lambda }I_t(x)}\mathbbm {1}_{{\tau }_1\le r}\right] . \end{aligned}$$
(7.4)

Then

$$\begin{aligned} P_{t,{\lambda }} \left( {\tau }_1\le t- r\right) = \frac{V(t,t-r)}{{\mathbb {E}}\left[ {\mathrm e}^{-{\lambda }I_t(x)}\right] } \le {\mathrm e}^{t}{{\tilde{v}}}_{\lambda }(t,t-r), \end{aligned}$$
(7.5)

where \(\tilde{v}_{\lambda }\) is defined as in (2.27) but with \({\sigma }({\lambda },\epsilon )\) replaced by \({\mathrm e}^{-{\lambda }\epsilon ^2}\). In the last inequality, we bounded the numerator from above using (1.8) and the denominator from below by the probability that there is no branching up to time t. Inserting the explicit form of \({{\tilde{v}}}_{\lambda }(t,t-r)\) gives (7.1). The asymptotic formulae for small \({\lambda }\) are straightforward. \(\square \)
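The substitution leading from (7.2) to (7.3) can be checked numerically (a sketch; \({\lambda }\), \(\epsilon \) and \(\rho \) are arbitrary):

```python
import math

def bound(r, lam, eps):
    # right-hand side of (7.2)
    return 1.0 / (lam**2 * eps**4 * math.exp(r) + lam * eps**2)

eps, rho, lam = 1.0, 1.3, 1e-6            # arbitrary choices
r = -2.0 * math.log(lam * eps**2) + rho   # i.e. t - r = t + 2 ln(lambda eps^2) - rho
print(bound(r, lam, eps), math.exp(-rho)) # bound = 1/(e^rho + lam eps^2) <= e^{-rho}, cf. (7.3)
```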

One can improve the bound above as follows. Instead of bounding the denominator just by the probability that there is no branching up to time t, we can bound it by the probability that there is no branching up to time \(t-q\), and then bound the interaction of the remaining piece uniformly by

$$\begin{aligned} I_q(x)\le \int _0^q n(s)(n(s)-1){\hbox {d}}s. \end{aligned}$$
(7.6)

Hence the denominator becomes

$$\begin{aligned} {\mathbb {E}}\left[ {\mathrm e}^{-{\lambda }I_t(x)}\right] \ge {\mathbb {E}}\left[ {\mathrm e}^{-{\lambda }I_t(x)}\mathbbm {1}_{{\tau }_1>t-q}\right] = {\mathrm e}^{-t+q} {\mathbb {E}}\left[ {\mathrm e}^{-{\lambda }I_q(x)}\right] . \end{aligned}$$
(7.7)

Now

$$\begin{aligned} {\mathbb {E}}\left[ {\mathrm e}^{-{\lambda }I_q(x)}\right]\ge & {} {\mathbb {E}}\left[ {\mathrm e}^{-{\lambda }\int _0^q n(s)(n(s)-1)ds} \mathbbm {1}_{n(s)\le c{\mathbb {E}}n(s) \forall _{s\le q}}\right] \nonumber \\ {}\ge & {} {\mathrm e}^{-{\lambda }\int _0^q c{\mathbb {E}}n(s)( c{\mathbb {E}}n(s)-1)ds} {\mathbb {E}}\left[ \mathbbm {1}_{n(s)\le c{\mathbb {E}}n(s) \forall _{s\le q}}\right] . \end{aligned}$$
(7.8)

Since \(n(s)/{\mathbb {E}}n(s)\) is a positive martingale, by Doob’s maximum inequality we have that, for \(c>1\),

$$\begin{aligned} {\mathbb {E}}\left[ \mathbbm {1}_{ \exists _{s\le q} n(s)> c{\mathbb {E}}n(s)}\right] \le 1/c. \end{aligned}$$
(7.9)

Moreover, \({\mathbb {E}}[n(s)]={\mathrm e}^s\), and hence

$$\begin{aligned} {\mathbb {E}}\left[ {\mathrm e}^{-{\lambda }I_t(x)}\right] \ge {\mathrm e}^{-t+q} (1-1/c) {\mathrm e}^{-{\lambda }c^2 {\mathrm e}^{2q}/2}. \end{aligned}$$
(7.10)

Finally, we make the close to optimal choice \(q= \frac{1}{2} \ln (1/(c^2{\lambda }))\), which yields

$$\begin{aligned} {\mathbb {E}}\left[ {\mathrm e}^{-{\lambda }I_t(x)}\right] \ge {\mathrm e}^{-t+ \frac{1}{2} \ln (1/c^2{\lambda })} (1-1/c) {\mathrm e}^{-1} = {\mathrm e}^{-t-1} \left( {\lambda }c^2\right) ^{-1/2} (1-1/c).\nonumber \\ \end{aligned}$$
(7.11)

Thus, choosing \(c=2\), (7.3) improves to

$$\begin{aligned} P_{t,{\lambda }} \left( {\tau }_1\le t- r\right) \le 4{\mathrm e}\frac{\sqrt{{\lambda }}}{{\lambda }^2\epsilon ^4{\mathrm e}^{r}+{\lambda }\epsilon ^2 \;} . \end{aligned}$$
(7.12)

Hence,

$$\begin{aligned} P_{t,{\lambda }} \left( {\tau }_1\le t+\ln ({\lambda }^{3/2} \epsilon ^4) -\rho \right) \le \frac{4{\mathrm e}}{{\mathrm e}^{\rho }+\sqrt{{\lambda }}\epsilon ^2}\sim 4{\mathrm e}^{-\rho +1}. \end{aligned}$$
(7.13)
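Similarly, the substitution leading from (7.12) to (7.13) can be checked (parameter choices arbitrary):

```python
import math

def improved(r, lam, eps):
    # right-hand side of (7.12)
    return 4.0 * math.e * math.sqrt(lam) / (lam**2 * eps**4 * math.exp(r) + lam * eps**2)

eps, rho, lam = 1.0, 2.0, 1e-8       # arbitrary choices
r = -math.log(lam**1.5 * eps**4) + rho
lhs = improved(r, lam, eps)
rhs = 4.0 * math.e / (math.exp(rho) + math.sqrt(lam) * eps**2)   # (7.13)
print(lhs, rhs, 4.0 * math.exp(-rho + 1.0))   # first two equal; third close for small lambda
```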

This is still not perfect for small \({\lambda }\), as the shifts of the first branching time in the full model and in the approximate model do not coincide, but it seems very hard to improve the bound on the denominator much further. An improvement would need to come from a matching bound on the numerator.

8 The Case \(p_0>0\)

The behaviour of the model is very different if particles are allowed to die. In that case, the process dies out almost surely. However, it is still interesting to see how exactly this happens. To simplify things, we assume in the sequel that \(p_0>0\) and \(p_2=1-p_0\). Note first that the approximate penalty function changes slightly, since now the number of branching events is no longer equal to the number of particles. Let us introduce the two numbers \(m(t)\) and \(d(t)\) as the numbers of (binary) branchings and of deaths, respectively, that occurred up to time t. Clearly, \(n(t)=1+m(t)-d(t)\). We then have

$$\begin{aligned} I_t(x) \ge \sum _{i=1}^{m(t)} {\tau }_{\epsilon }(i), \end{aligned}$$
(8.1)

and we define the approximate model via

$$\begin{aligned} {\widehat{P}}_{t,{\lambda }} (A) \equiv \frac{{\mathbb {E}}\left[ \mathbbm {1}_{x(t)\in A} {\sigma }({\lambda },\epsilon )^{m(t)}\right] }{{\mathbb {E}}\left[ {\sigma }({\lambda },\epsilon )^{m(t)}\right] }. \end{aligned}$$
(8.2)

We consider first the partition function \(v_{\lambda }(t)={\mathbb {E}}\left[ {\sigma }({\lambda },\epsilon )^{m(t)}\right] \). The analog of Lemma  is as follows. As before, to lighten notation, we drop the arguments of \({\sigma }({\lambda },\epsilon )\) and simply write \({\sigma }\) in the remainder of this section.

Lemma 8.1

Let \({{\tilde{v}}}_{\lambda }(t)\) be the solution of the ordinary differential equation

$$\begin{aligned} \frac{d}{dt} {{\tilde{v}}}_{\lambda }(t) = {\sigma }p_2 \tilde{v}_{\lambda }(t)^2 -{{\tilde{v}}}_{\lambda }(t) +p_0, \end{aligned}$$
(8.3)

with initial condition \(\tilde{v}_{\lambda }(0)=1\). Then \( v_{\lambda }(t)={{\tilde{v}}}_{{\lambda }}(t)\).

Proof

We proceed as in the proof of Lemma . Since now the first event could be either a branching (with probability \(p_2\)) or a death (with probability \(p_0\)), we get the recursion

$$\begin{aligned} v_{\lambda }(t) ={\mathrm e}^{-t} + \int _0^t ds {\mathrm e}^{-(t-s)} \left( p_2{\sigma }v_{\lambda }(s)^2 +p_0\right) . \end{aligned}$$
(8.4)

Differentiating yields the asserted claim. \(\square \)

The presence of the term \(p_0>0\) eliminates the fixpoint 0 in equation (8.3). In fact, (8.3) has the two fixpoints

$$\begin{aligned} v^\pm _{\lambda }\equiv \frac{1}{2{\sigma }p_2}\left( 1\pm \sqrt{1-4{\sigma }p_2+4{\sigma }p_2^2}\right) =\frac{1}{2{\sigma }p_2}\left( 1\pm \sqrt{1-4{\sigma }p_2p_0}\right) . \end{aligned}$$
(8.5)

Note that for \({\sigma }=1\), this simplifies to

$$\begin{aligned} v^\pm _0=\frac{1}{2p_2}\left( 1\pm \sqrt{(1-2p_2)^2}\right) , \end{aligned}$$
(8.6)

which is 1 and \(p_0/p_2\). In that case, we clearly have \(v_0(t)=1\) for all t.

If \({\lambda }>0\), but \({\lambda }\ll 1\) (i.e. \({\sigma }<1\), but \(1-{\sigma }\) small), we can expand

$$\begin{aligned} v_{\lambda }^\pm = {\left\{ \begin{array}{ll} \frac{1}{2{\sigma }p_2}\left( 1\pm (2{\sigma }p_2-1) \sqrt{1+\frac{4p_2^2{\sigma }(1-{\sigma })}{(2p_2{\sigma }-1)^2}}\right) , &{}\text {if} \;\; p_2> 1/2,\\ \frac{1}{2{\sigma }p_2}\left( 1\pm (1-2{\sigma }p_2) \sqrt{1+\frac{4p_2^2{\sigma }(1-{\sigma })}{(1-2p_2{\sigma })^2}}\right) , &{}\text {if} \;\; p_2\le 1/2. \end{array}\right. } \end{aligned}$$
(8.7)

In particular, the smaller fixpoint is

$$\begin{aligned} v_{\lambda }^- \approx {\left\{ \begin{array}{ll} \frac{p_0}{p_2} +O((1-{\sigma })^2), &{}\text {if} \;\; p_2> 1/2,\\ 1 -\frac{p_2(1-{\sigma })}{1-2p_2{\sigma }}, &{}\text {if} \;\; p_2\le 1/2. \end{array}\right. } \end{aligned}$$
(8.8)
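The leading behaviour of the smaller fixpoint for \(1-{\sigma }\) small can be checked against the exact root (8.5); in the sketch below (names and parameter values are ours, and \(p_0=1-p_2\) is assumed) the discrepancy is small in both branches:

```python
import math

def v_minus_exact(p2, sigma):
    """Exact smaller fixpoint (8.5), with p0 = 1 - p2."""
    p0 = 1.0 - p2
    return (1.0 - math.sqrt(1.0 - 4.0 * sigma * p2 * p0)) / (2.0 * sigma * p2)

def v_minus_approx(p2, sigma):
    """Leading behaviour of the smaller fixpoint for 1 - sigma small, as in (8.8)."""
    p0 = 1.0 - p2
    if p2 > 0.5:
        return p0 / p2
    return 1.0 - p2 * (1.0 - sigma) / (1.0 - 2.0 * p2 * sigma)
```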

Now set \(f_{\lambda }(t)\equiv v_{\lambda }(t)-v^-_{\lambda }\). Then \(f_{\lambda }\) satisfies the differential equation

$$\begin{aligned} {\partial }_t f_{\lambda }(t)={\sigma }p_2f_{\lambda }(t)^2+(2{\sigma }p_2v^-_{\lambda }-1)f_{\lambda }(t), \end{aligned}$$
(8.9)

with initial condition \(f_{\lambda }(0)=1-v^-_{\lambda }\). We can solve this equation as in the case \(p_0=0\). Define

$$\begin{aligned} {{\hat{f}}}_{\lambda }(t)\equiv {\mathrm e}^{-(2{\sigma }p_2v^-_{\lambda }-1)t}f_{\lambda }(t). \end{aligned}$$
(8.10)

Then

$$\begin{aligned} {\partial }_t {{\hat{f}}}_{\lambda }(t)={\sigma }p_2{{\hat{f}}}_{\lambda }(t)^2{\mathrm e}^{(2{\sigma }p_2v^-_{\lambda }-1)t}, \end{aligned}$$
(8.11)

which has the solution

$$\begin{aligned} {{\hat{f}}}_{\lambda }(t)=\frac{1}{\frac{1}{1-v^-_{\lambda }}-\frac{{\sigma }p_2}{1-2{\sigma }p_2v^-_{\lambda }}\left( 1-{\mathrm e}^{(2{\sigma }p_2v^-_{\lambda }-1)t}\right) }. \end{aligned}$$
(8.12)

Hence

$$\begin{aligned} f_{\lambda }(t)=\frac{{\mathrm e}^{(2{\sigma }p_2v^-_{\lambda }-1)t}}{\frac{1}{1-v^-_{\lambda }}-\frac{{\sigma }p_2}{1-2{\sigma }p_2v^-_{\lambda }}\left( 1-{\mathrm e}^{(2{\sigma }p_2v^-_{\lambda }-1)t}\right) }. \end{aligned}$$
(8.13)

Note that for \({\lambda }=0\) this is constant and equal to \(1-v^-_0\), consistent with \(v_0(t)=1\), while for \({\lambda }>0\) it decays exponentially to zero, so that \(v_{\lambda }(t)\rightarrow v_{\lambda }^->0\), indicating that the number of branchings in the process remains finite.
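The closed form (8.13) can be compared with a direct numerical integration of (8.3), writing \(v_{\lambda }(t)=v^-_{\lambda }+f_{\lambda }(t)\). The sketch below uses illustrative parameters (names and values are ours) and assumes \({\sigma }<1\):

```python
import math

def f_closed(t, p2, p0, sigma):
    """f_lambda(t) from (8.13); vm is the smaller fixpoint, a = 2*sigma*p2*vm - 1 < 0."""
    vm = (1.0 - math.sqrt(1.0 - 4.0 * sigma * p2 * p0)) / (2.0 * sigma * p2)
    a = 2.0 * sigma * p2 * vm - 1.0
    e = math.exp(a * t)
    return e / (1.0 / (1.0 - vm) - sigma * p2 / (-a) * (1.0 - e))

def v_euler(t, p2, p0, sigma, steps=400_000):
    """Direct Euler integration of (8.3) with v(0) = 1, for comparison."""
    v, dt = 1.0, t / steps
    for _ in range(steps):
        v += dt * (sigma * p2 * v * v - v + p0)
    return v
```

Note that \(f_{\lambda }(0)=1-v^-_{\lambda }\), so the initial condition can be read off directly from the closed form.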

Next let us consider the generating function of the particle number,

$$\begin{aligned} w_{{\lambda },{\gamma }}(t)\equiv {\mathbb {E}}\left[ {\sigma }^{m(t)} {\mathrm e}^{-{\gamma }n(t)}\right] . \end{aligned}$$
(8.14)

We readily see that this function satisfies the equation

$$\begin{aligned} w_{{\lambda },{\gamma }}(t) = {\mathrm e}^{-{\gamma }} {\mathrm e}^{-t}+ p_2{\sigma }\int _0^t {\mathrm e}^{-(t-s)} w_{{\lambda },{\gamma }}(s)^2 ds +p_0 \left( 1-{\mathrm e}^{-t}\right) . \end{aligned}$$
(8.15)

This implies the differential equation

$$\begin{aligned} {\partial }_t w_{{\lambda },{\gamma }}(t) =p_2{\sigma }w_{{\lambda },{\gamma }}(t)^2-w_{{\lambda },{\gamma }}(t)+p_0, \end{aligned}$$
(8.16)

with initial condition \(w_{{\lambda },{\gamma }}(0) ={\mathrm e}^{-{\gamma }}\). Thus, w and v differ only in the initial conditions. It is therefore easy to see that

$$\begin{aligned} w_{{\lambda },{\gamma }}(t) = v_{\lambda }^{-} +\frac{{\mathrm e}^{\left( 2{\sigma }p_2v^-_{\lambda }-1\right) t}}{\frac{1}{{\mathrm e}^{-{\gamma }}-v^-_{\lambda }}-\frac{{\sigma }p_2}{1-2{\sigma }p_2v^-_{\lambda }}\left( 1-{\mathrm e}^{\left( 2{\sigma }p_2v^-_{\lambda }-1\right) t}\right) }. \end{aligned}$$
(8.17)
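One can check numerically that (8.17) indeed satisfies (8.16) with initial condition \(w_{{\lambda },{\gamma }}(0)={\mathrm e}^{-{\gamma }}\); the finite-difference sketch below uses illustrative parameters (names and values are ours) with \({\mathrm e}^{-{\gamma }}>v^-_{\lambda }\):

```python
import math

def w_closed(t, gamma, p2, p0, sigma):
    """Closed-form w_{lambda,gamma}(t) from (8.17); assumes exp(-gamma) > v^-."""
    vm = (1.0 - math.sqrt(1.0 - 4.0 * sigma * p2 * p0)) / (2.0 * sigma * p2)
    a = 2.0 * sigma * p2 * vm - 1.0
    e = math.exp(a * t)
    denom = 1.0 / (math.exp(-gamma) - vm) - sigma * p2 / (-a) * (1.0 - e)
    return vm + e / denom
```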

From this expression, we can compute, for example, the expected number of particles at time t under the measure \({{\hat{P}}}_{t,{\lambda }}\),

$$\begin{aligned} {{\hat{E}}}_{t,{\lambda }} [n(t)] =- \frac{{\partial }}{{\partial }{\gamma }} \ln \left( w_{{\lambda },{\gamma }}(t)\right) \big |_{{\gamma }=0}, \end{aligned}$$
(8.18)

which reads

$$\begin{aligned} {{\hat{E}}}_{t,{\lambda }} [n(t)] =\frac{1}{v_{\lambda }(t)} \frac{{\mathrm e}^{\left( 2{\sigma }p_2v^-_{\lambda }-1\right) t} }{\left( 1-\frac{{\sigma }p_2(1-v^-_{\lambda })}{1-2{\sigma }p_2v_{\lambda }^-}\left( 1-{\mathrm e}^{\left( 2{\sigma }p_2v^-_{\lambda }-1\right) t}\right) \right) ^2}. \end{aligned}$$
(8.19)

For \(t\uparrow \infty \), this behaves to leading order, provided \(v_{\lambda }^->0\), like

$$\begin{aligned} \frac{1}{v_{\lambda }^-} \frac{{\mathrm e}^{(2{\sigma }p_2v^-_{\lambda }-1)t} }{\left( 1-\frac{{\sigma }p_2(1-v^-_{\lambda })}{1-2{\sigma }p_2v_{\lambda }^-}\right) ^2}. \end{aligned}$$
(8.20)

This implies that the process dies out exponentially fast unless the death rate is zero, which is, of course, no surprise.

Alternative computation of \({{\hat{E}}}_{t,{\lambda }}[n(t)]\).

Instead of passing through the generating function for n(t), we can also proceed by deriving a direct recursion for \({{\hat{E}}}_{t,{\lambda }}[ n(t)]\). To do so, define the un-normalised expectation

$$\begin{aligned} u_{\lambda }(t) ={\mathbb {E}}\left[ n(t) {\sigma }^{m(t)}\right] . \end{aligned}$$
(8.21)

Clearly we have

$$\begin{aligned} u_{\lambda }(t)= {\mathrm e}^{-t} + p_2 \int _{0}^t {\mathrm e}^{-(t-s)} 2{\sigma }u_{\lambda }(s) v_{\lambda }(s){\hbox {d}}s, \end{aligned}$$
(8.22)

where we used that if the first event is a death, then \(n(t)=0\), while if it is a branching at time \(t-s\), then \(n(t) = n_1(s)+n_2(s)\), where the \(n_i\) are independent copies. This implies the differential equation

$$\begin{aligned} {\partial }_t u_{\lambda }(t)= 2p_2{\sigma }u_{\lambda }(t)v_{\lambda }(t)-u_{\lambda }(t). \end{aligned}$$
(8.23)

With initial condition \(u_{\lambda }(0)=1\), the solution can be written directly as

$$\begin{aligned} u_{\lambda }(t) =\exp \left( \int _0^t \left( 2p_2{\sigma }v_{\lambda }(s)-1\right) {\hbox {d}}s\right) . \end{aligned}$$
(8.24)

Since \({{\hat{E}}}_{t,{\lambda }}[n(t)]=u_{\lambda }(t)/v_{\lambda }(t)\) and \(v_{\lambda }\) is explicit, one can verify that this gives the same answer as (8.19).
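This verification can be carried out numerically, using that the normalised expectation is the ratio \(u_{\lambda }(t)/v_{\lambda }(t)\). The sketch below (names and parameter values are ours; \({\sigma }<1\) assumed) compares (8.19) with the expression obtained from (8.24), evaluating the integral by the trapezoidal rule:

```python
import math

def _vm_a(p2, p0, sigma):
    """Smaller fixpoint v^- from (8.5) and the rate a = 2*sigma*p2*v^- - 1 < 0."""
    vm = (1.0 - math.sqrt(1.0 - 4.0 * sigma * p2 * p0)) / (2.0 * sigma * p2)
    return vm, 2.0 * sigma * p2 * vm - 1.0

def v_closed(s, p2, p0, sigma):
    """v_lambda(s) = v^- + f_lambda(s), with f from (8.13)."""
    vm, a = _vm_a(p2, p0, sigma)
    e = math.exp(a * s)
    return vm + e / (1.0 / (1.0 - vm) - sigma * p2 / (-a) * (1.0 - e))

def hat_E_n(t, p2, p0, sigma):
    """Expected particle number under the tilted measure, formula (8.19)."""
    vm, a = _vm_a(p2, p0, sigma)
    e = math.exp(a * t)
    denom = 1.0 - sigma * p2 * (1.0 - vm) / (-a) * (1.0 - e)
    return e / (v_closed(t, p2, p0, sigma) * denom ** 2)

def u_over_v(t, p2, p0, sigma, steps=100_000):
    """u(t)/v(t), with u from (8.24); the integral is done by the trapezoidal rule."""
    dt = t / steps
    g = [2.0 * p2 * sigma * v_closed(i * dt, p2, p0, sigma) - 1.0
         for i in range(steps + 1)]
    integral = dt * (sum(g) - 0.5 * (g[0] + g[-1]))
    return math.exp(integral) / v_closed(t, p2, p0, sigma)
```

The two routes agree to high precision, as the general argument predicts.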