1 Introduction

It has been known since the work of McKean [22] that the time evolution of a one-dimensional branching Brownian motion is intimately connected with the behavior of solutions of the Fisher-KPP equation

$$\begin{aligned} \frac{\partial u}{\partial t}=\frac{1}{2} \frac{\partial ^{2}u}{\partial x^{2}} +f (u). \end{aligned}$$
(1)

McKean observed that the cumulative distribution function of the position \(R_{t}\) of the rightmost particle at time \(t\) in a one-dimensional branching Brownian motion obeys Eq. (1). In the simplest case, where particles move independently along Brownian trajectories and undergo simple binary fission following independent, exponentially distributed gestation times, the function \(f\) is given by \(f (u)=u^{2}-u\), the case originally studied by Fisher [15]; more general branching mechanisms lead to more general functional forms. In general, when the underlying branching process is supercritical, the solution of (1) with Heaviside initial data approaches a traveling wave with a positive asymptotic velocity, and thus, in particular, the distribution of \(R_{t}\), when centered at its median \(m_{t}\), converges weakly to the distribution described by the wave. There is now a considerable literature surrounding problems connected with this phenomenon: see, e.g., [10] for the precise asymptotic behavior of the function \(t\mapsto m_{t}\); [19] for a proof that the traveling wave is a mixture of extreme value distributions; and [1, 4] and [3] for proofs that the distribution of the entire point process of particle locations, when centered at \(m_{t}\), converges in law as \(t \rightarrow \infty \). Generalizations of some of these results to supercritical branching random walks are given in [6] and [9].

When the branching mechanism is critical (that is, when the mean number of offspring particles at a particle death is \(1\)), the nature of the process \(R_{t}\) is entirely different, because in this case the process must ultimately die out, with probability one. Hence, it is more natural in the critical case to ask about the distribution of

$$\begin{aligned} M:=\max _{t<\infty }R_{t}, \end{aligned}$$
(2)

the rightmost point ever reached by a particle of the branching process. Interest in the distribution of \(M\) stems in part from its relevance to evolutionary biology, where critical branching Brownian motion has been used as a model for the spatial diffusion of alleles with no selective advantage or disadvantage: see, for instance, [11, 16], and the references therein. For critical (or subcritical) branching Brownian motion the distribution function \(\omega (x)=P\{M\le x \}\) of \(M\) satisfies the ordinary differential equation

$$\begin{aligned} \frac{1}{2} \omega '' (x)=-\psi (\omega (x)), \end{aligned}$$
(3)

where \(\psi (\omega )=f (\omega )-\omega \) and \(f\) is the probability generating function of the offspring distribution of the branching process. This differential equation is similar to that satisfied by the traveling wave(s) for the Fisher-KPP equation, but differs in the nonlinear term \(\psi (\omega )\); this leads to solutions of a very different character, reflecting the qualitative differences between critical and supercritical branching. By explicitly integrating Eq. (3), Fleischman and Sawyer [16], Sect. 3, obtained precise asymptotic estimates for the tail of the distribution of \(M\) for critical branching Brownian motion: in particular, they proved that if the offspring distribution has mean \(1\), positive variance \(\sigma ^{2}\), and finite third moment then

$$\begin{aligned} P\{M\ge x \}\sim \frac{6}{\sigma ^{2}x^{2}} \quad \text {as}\;\; x \rightarrow \infty . \end{aligned}$$
(4)

(In the case of “double-or-nothing” branching, where particles produce either \(0\) or \(2\) offspring with probability \(1/2\) each, the solution to the differential equation with the appropriate boundary conditions has the closed form \(1-\omega (x)=6/ (x+\sqrt{6})^{2}\), as is easily checked.)
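For the skeptical reader this is easily verified symbolically. A minimal sketch in Python/sympy, with \(f (s)= (1+s^{2})/2\) the double-or-nothing generating function and \(\psi (s)=f (s)-s\) as above:

```python
import sympy as sp

x = sp.symbols('x', nonnegative=True)
f = lambda s: (1 + s**2) / 2             # pgf of the double-or-nothing offspring law
psi = lambda s: f(s) - s
omega = 1 - 6 / (x + sp.sqrt(6))**2      # candidate distribution function; omega(0) = 0

# Eq. (3): (1/2) omega'' = -psi(omega), as an identity in x
assert sp.simplify(sp.diff(omega, x, 2) / 2 + psi(omega)) == 0
```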

Our primary objective in this paper is to show that under suitable hypotheses the asymptotic formula (4) holds generally for critical, driftless branching random walks. For branching random walks, unlike branching Brownian motion, there are no ordinary differential equations governing the law of \(M\), so one cannot write explicit integral formulas for the distribution of \(M\), as one can for branching Brownian motion.

For ease of exposition we will limit our study to discrete-time branching random walks on the integer lattice \(\mathbb Z\) in which the reproduction and dispersal mechanisms are independent, with dispersal preceding reproduction. Thus, in each generation, particles

(A) jump (independently of one another) to new sites, with jumps following a mean-zero, finite-variance jump distribution \(F_{RW}:=\{a_{x} \}_{x\in \mathbb Z}\);

(B) then reproduce as in a simple Galton-Watson process, according to a fixed offspring distribution \(F_{GW}:=\{p_{k} \}_{k\ge 0}\) with mean \(1\) and finite variance.

We will generally assume that the branching random walk is initiated by a single particle located at the origin \(0\in \mathbb Z\). For a formal construction of a branching random walk following rules (A)–(B), see, for instance, [18]. The locations of the particles in generation \(n\) will be denoted by \(X_{n,i}\), where \(i\le N_{n}\) and \(N_{n}\) is the total number of particles in the \(n\)th generation. Our interest is in the distribution of the maximal displacement

$$\begin{aligned} M=\max _{n \ge 0} M_{n} \quad \text {where} \;\; M_{n}= \max _{i=1,2,\ldots ,N_n} X_{n,i}. \end{aligned}$$
(5)

Remark 1

Observe that only the particle locations at the end of each reproduction step are taken into account in the definition of \(M\): thus, for instance, if the initial particle at \(X_{0,1}=0\) were to jump to site \(x=1\) and then produce no offspring, the maximal displacement would be \(M=0\), not \(M=1\).
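To fix the conventions, here is a minimal simulation sketch in Python; the double-or-nothing offspring law and the lazy nearest-neighbor jump law (so that \(\sigma ^{2}=1\) and \(\eta ^{2}=1/2\)) are our illustrative choices, not requirements of the model:

```python
import random

def maximal_displacement(max_gen=10**6):
    """One run of the critical BRW of rules (A)-(B), started from a single
    particle at 0; returns M = max_n M_n."""
    particles = [0]
    M = 0                                      # M_0 = 0: the initial particle counts
    for _ in range(max_gen):
        offspring = []
        for z in particles:
            z += random.choice((-1, 0, 0, 1))  # (A) jump: -1,0,+1 w.p. 1/4,1/2,1/4
            if random.random() < 0.5:          # (B) reproduce: 0 or 2 children
                offspring += [z, z]
        if not offspring:
            return M                           # extinction: M is final
        M = max(M, max(offspring))             # only post-reproduction locations count
        particles = offspring
    return M                                   # the cap is a safeguard; extinction is a.s.
```

Averaging the indicator of \(\{M\ge x \}\) over many runs gives a Monte Carlo estimate of the tail appearing in Theorem 1 below; for these parameters \(6\eta ^{2}/\sigma ^{2}=3\), so \(x^{2}P\{M\ge x \}\) should hover near \(3\) for large \(x\).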

Remark 2

Many authors (e.g., [7, 18]) consider branching random walks in which the order of the reproduction and jump steps is opposite to that specified above. For our purposes it is more convenient to have the jump step precede the reproduction step, as this leads to more compact formulations of the main result and the nonlinear convolution Eq. (11) below. It should be clear that analogous results can be deduced for branching random walks of the type considered in [7, 18], by conditioning on the first jump step. See Remark 3 in Sect. 2.1 below. Although it is less obvious, our results (and their proofs) also generalize to branching random walks in which the reproduction and particle motion are not independent (as, for instance, in the branching random walks discussed in [8]), and to branching random walks in which the step distribution is supported by \(\mathbb R\) rather than \(\mathbb Z\). Thus, in particular, our results extend to critical branching Brownian motion, since branching Brownian motion observed at integer multiples of a fixed time \(\delta >0\) is a branching random walk. In the interest of clarity, we have chosen not to formulate and prove our results in the greatest possible generality.

Theorem 1

Assume that the step distribution \(F_{RW}=\{a_k\}_{k \in \mathbb {Z}}\) has mean \(0\), positive variance \(\eta ^{2}\), and finite \(r\)-th moment for some \(r>4\). Assume also that the offspring distribution \(F_{GW}=\{p_{k} \}_{k\ge 0}\) has mean \(1\), positive variance \(\sigma ^{2}\), and finite third moment. Then

$$\begin{aligned} P\{M\ge x \}\sim \frac{6\eta ^{2}}{\sigma ^{2}x^{2}} \quad \text {as}\;\; x \rightarrow \infty . \end{aligned}$$
(6)

This will be proved in Sect. 2. The result can also be reformulated as a statement about the distribution of the maximal displacement of a branching random walk initiated by a large number \(n\) of particles at the origin. Such a branching random walk is just the superposition (sum) of \(n\) independent copies of the branching random walk in Theorem 1, so the event that its maximal displacement is \(\le \!\sqrt{n}x\) is the intersection of the events that each of the \(n\) constituent branching random walks has maximal displacement \(\le \!\sqrt{n}x\). For fixed \(x>0\) the target points \(\sqrt{n}x \rightarrow \infty \) as \(n \rightarrow \infty \), so the asymptotic formula (6) yields the following corollary.

Corollary 2

Let \(M^{n}\) be the maximal displacement of a branching random walk initiated by \(n\) particles at the origin at time \(0\). Then for any \(x>0\),

$$\begin{aligned} \lim _{n \rightarrow \infty }P\{M^{n} \ge \sqrt{n}x\}= 1-\exp \{-C/x^{2} \} \quad \text {where} \quad C=\frac{6\eta ^{2}}{\sigma ^{2}}. \end{aligned}$$
(7)
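For the reader's convenience, the one-line computation behind (7): by independence and the asymptotic formula (6),

$$\begin{aligned} P\{M^{n}<\sqrt{n}x \}=\bigl (1-P\{M\ge \sqrt{n}x \}\bigr )^{n} =\left( 1-\frac{6\eta ^{2}}{\sigma ^{2}nx^{2}} (1+o (1))\right) ^{n} \longrightarrow e^{-C/x^{2}} \quad \text {as}\;\; n \rightarrow \infty . \end{aligned}$$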

Theorem 1 is closely related to the main results of Kesten [18], who considers critical branching random walk conditioned to survive for a large number of generations. (Zheng [25] has considered the case where the random walk is assumed to have a “small” drift, but this leads to entirely different asymptotics; Aidekon [2] considers branching random walks that are rendered critical by killing at the origin, where again the maximum behaves quite differently.) Kesten shows that under the same hypotheses as in Theorem 1, for any fixed \(\beta >0\), given that the branching process survives for \(\beta n\) generations, the conditional distribution of \(\max _{k\le n}M_{k}/\sqrt{n}\) converges as \(n \rightarrow \infty \). He does not, however, identify the limit distribution. He remarks on the result of Sawyer and Fleischman:

We have not proved (Eq. (6)) in our setting. It is not clear at the moment how the methods of Sawyer and Fleischman, which rely on differential equations, can be carried over to the discrete setting; differential equations will have to be replaced by recurrence relations.

Our main technical innovation will be to show how to relate these “recurrence relations” to the differential equation (3). This will be accomplished by exploiting Feynman-Kac formulas. Our approach applies also to time-dependent Feynman-Kac problems, and leads in particular to information about the distribution of the random variable \(M_{n}\). As an illustration, we will prove in Sect. 3 the following conditional limit theorem.

Theorem 3

Under the hypotheses of Theorem 1, the conditional distribution of \(M_{n}/\sqrt{n}\), given that the branching process survives for \(n\) generations, converges weakly as \(n \rightarrow \infty \) to a nontrivial limit distribution \(G\) that depends only on the variances \(\sigma ^{2}\) and \(\eta ^{2}\) of the offspring and step distributions.

The scaling in this theorem is the same as that in the Dawson-Watanabe theorem (see, for instance, [12], ch. 1), which can be stated as follows. Suppose that \(n\) independent copies of the branching random walk are initiated at the origin \(0\in \mathbb Z\). If particles are assigned mass \(1/n\), and if time and space are re-scaled by factors \(1/n\) and \(1/\sqrt{n}\), respectively, then the corresponding measure-valued processes \(BRW_{n}(t)\) converge weakly to super-Brownian motion \(X_{t}\). A similar theorem holds for the measure-valued processes \(BRW^{*}_{n} (t)\) attached to branching random walk initiated by a single particle at the origin, but conditioned to survive for \(n\) generations: under the same re-scaling of mass, time, and space as in the Dawson-Watanabe theorem, the measure-valued processes \(BRW^{*}_{n} (t)\) converge weakly as \(n \rightarrow \infty \) to a measure-valued process \(Y_{t}\). The law of this process \(Y_{t}\) is related to that of the super-Brownian motion by the Poisson cluster representation: if \(N\) is a Poisson random variable with mean \(2/\sigma ^{2}\) and \(Y^{1}_{t},Y^{2}_{t},\ldots \) are independent copies of \(Y _{t}\) then \(\sum _{i=1}^{N}Y^{i} _{t}\) is a super-Brownian motion.

The weak convergence of the measure-valued processes \(BRW^{*}_{n} (t)\) by itself does not imply Theorem 3, because the location \(M_{n}/\sqrt{n}\) is not a continuous function (relative to the weak topology on measures) of \(BRW^{*}_{n} (1)\). (Lalley [20] shows that in dimension \(1\) rescaled branching random walks converge to super-Brownian motion in a stronger topology than the weak topology implicit in the Dawson-Watanabe theorem; however, even this topology is too weak to make the normalized rightmost particle location a continuous functional.) Nevertheless, it is natural to wonder how the limit distribution \(G\) of Theorem 3 is related to the limiting measure-valued process \(Y_{t}\). The proof of Theorem 3 will establish that \(G\) is the distribution of the rightmost support point of the random measure \(Y_{1}\).

Corollary 4

Under the hypotheses of Theorem 1 (in particular, under the assumption that the step distribution \(F_{RW}\) has finite \(r\)th moment for some \(r>4\)),

$$\begin{aligned} G (x) = P\{ Y_{1} [x,\infty )=0 \}. \end{aligned}$$
(8)

Are \(4+\varepsilon \) moments on the step distribution really necessary for the validity of our theorems (and Kesten’s)? The following simple heuristic argument (which with a bit of work can be made rigorous) shows that \(4-\varepsilon \) moments are not enough. Consider, for instance, the case where \(F_{RW}\) is the symmetric distribution on the nonzero integers with discrete density

$$\begin{aligned} f_{RW} (x)=\frac{1}{2\zeta (5-\varepsilon )|x|^{5-\varepsilon }}, \end{aligned}$$

which has infinite \(4\)th moment. Conditional on the event that the branching random walk survives for at least \(n\) generations it will produce on the order of \(n^{2}\) particles. Each of these has conditional probability \(\sim C/n^{2-\alpha }\) of placing an offspring at distance \(n^{(1+\delta )/2}\) to the right, where \(\alpha =\varepsilon /2+\varepsilon \delta /2-4\delta /2\), as the computation below shows. Consequently, if \(\alpha >0\) (which holds for all sufficiently small \(\delta \)) then for large \(n\) the probability that all \(n^{2}\) particles are located in an interval \([-A\sqrt{n},A\sqrt{n}]\) is vanishingly small.
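Indeed, the tail computation behind the exponent \(\alpha \) is

$$\begin{aligned} \sum _{z\ge n^{ (1+\delta )/2}}\frac{1}{2\zeta (5-\varepsilon )z^{5-\varepsilon }} \asymp n^{- (4-\varepsilon ) (1+\delta )/2}=n^{- (2-\alpha )}, \end{aligned}$$

so among the \(\asymp n^{2}\) particles the expected number of such jumps is of order \(n^{\alpha }\), which diverges whenever \(\delta <\varepsilon / (4-\varepsilon )\).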

2 Maximal displacement: proof of Theorem 1

2.1 A nonlinear convolution equation

For the remainder of the paper we shall assume that the offspring distribution \(F_{GW}=\{p_{k}\}_{k\ge 0}\) and the jump distribution \(F_{RW}=\{a_{x} \}_{x\in \mathbb Z}\) satisfy the hypotheses of Theorem 1: in particular, \(F_{GW}\) has mean \(1\), positive variance \(\sigma ^{2}\), and finite third moment, and \(F_{RW}\) has mean \(0\), positive variance \(\eta ^{2}\), and finite \(4+\varepsilon \) moment. The maximal displacement \(M\) of the branching random walk is defined by Eq. (5), and its (tail) distribution function will be denoted by

$$\begin{aligned} u(x) = P\{M\ge x \}. \end{aligned}$$
(9)

Clearly, \(u\) is non-increasing in \(x\), with jump discontinuities at integer arguments. Also, \(u (x)=1\) for all \(x\le 0\), and \(\lim _{x \rightarrow \infty }u (x)=0\).

We begin by showing that \(u\) satisfies a nonlinear convolution equation analogous to the Fleischman-Sawyer Eq. (3). This is obtained in the conventional manner, by conditioning on the first generation of the branching process. Each particle of the first generation gives rise to its own descendant branching random walk, independent of its siblings (conditional on their locations), and in order that \(M<x\) the maximal displacements of all of the descendant BRWs, adjusted by their starting points, must be \(<x\). This implies that for all \(x\ge 1\),

$$\begin{aligned} 1-u (x) = \sum _{y\in \mathbb Z}a_{y} \sum _{k=0}^{\infty }p_{k} (1-u (x-y))^{k} . \end{aligned}$$
(10)

(Note that this is consistent with our conventions regarding the sequencing of the dispersal and reproduction steps—see Remarks 1 and 2 above.) Rewriting this equation in terms of \(u\) leads immediately to the following proposition.

Proposition 5

\(u(x)\) satisfies the nonlinear convolution equation

$$\begin{aligned} u(x) = \sum _{y \in \mathbb {Z}} a_y Q(u(x-y)), \end{aligned}$$
(11)

where \(1-Q (1-s)\) is the probability generating function of the offspring distribution \(F_{GW}\), that is,

$$\begin{aligned} Q(s) = 1-\sum _{i=0}^{\infty } p_i (1-s)^i, \qquad for \;\; 0 \le s \le 1. \end{aligned}$$
(12)
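Equation (11), together with the boundary condition \(u (x)=1\) for \(x\le 0\), determines \(u\) as the monotone increasing limit of the iterates \(u_{k} (x)=P\{\max _{n\le k}M_{n}\ge x \}\), with \(u_{0} (x)=\mathbf {1}\{x\le 0 \}\). The following Python sketch (again with the illustrative double-or-nothing offspring law and lazy walk \(a_{\pm 1}=1/4\), \(a_{0}=1/2\), so \(\sigma ^{2}=1\) and \(\eta ^{2}=1/2\)) computes \(u\) this way on a truncated lattice and exhibits the tail predicted by Theorem 1:

```python
import numpy as np

X, iters = 200, 200_000
Q = lambda s: s - s**2 / 2               # Q of Eq. (12) for double-or-nothing
u = np.zeros(X + 2)
u[0] = 1.0                               # stands in for u(x) = 1, x <= 0
for _ in range(iters):                   # monotone iteration of Eq. (11)
    u[1:X+1] = 0.25*Q(u[0:X]) + 0.5*Q(u[1:X+1]) + 0.25*Q(u[2:X+2])
    # u[X+1] stays 0: a harmless truncation of the lattice at x = X+1

for x in (25, 50, 100):
    print(x, x**2 * u[x])                # should creep toward 6*eta^2/sigma^2 = 3
```

(The convergence is slow, in keeping with the polynomial tail.)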

Remark 3

If the branching random walk were constructed using the alternative rule discussed in Remark 2 (that is, particles first reproduce and then disperse), then Eq. (11) would change as follows. Writing \(\tilde{M}\) for the maximal displacement of this branching random walk, and \(\tilde{u}(x) = P \{ \tilde{M} \ge x \}\), we would have

$$\begin{aligned} \tilde{u}(x) = Q \left( \sum _{k \in \mathbb {Z}} a_k \tilde{u}(x-k) \right) . \end{aligned}$$

Comparing this with Eq. (11), we see that

$$\begin{aligned} \tilde{u}(x) = Q \bigl ( u(x) \bigr ). \end{aligned}$$

Since the Taylor expansion of \(Q\) is \(Q(s) = s -{\sigma ^2}s^2/{2} + O(s^3)\), it follows that \(\tilde{u} (x)\) and \(u(x)\) go to \(0\) as \(x \rightarrow \infty \) at the same rate.

2.2 A discrete Feynman-Kac formula

Our goal now is to analyze the asymptotic behavior of solutions to the nonlinear convolution Eq. (11) as \(x \rightarrow \infty \). To accomplish this, we will show that solutions of (11) can be represented by formulas of “Feynman-Kac” type. Henceforth, we shall denote by \(W_{n}\) the random walk on \(\mathbb Z\) whose step distribution is the reflection of the step distribution \(F_{RW}\) in the underlying BRW, that is,

$$\begin{aligned} P (W_{n+1}-W_{n}=y\,|\, W_{n},W_{n-1},\ldots )=a_{-y}. \end{aligned}$$
(13)

We shall use superscripts \(P^{x}\) and \(E^{x}\) to denote the initial point \(W_{0}=x\) of the random walk \(W_{n}\).

Define

$$\begin{aligned} h(s)&= s - Q(s) = {\sigma ^2 s^{2}/2}+ O(s^3) \quad \text {and}\\ \nonumber H(s)&= {h(s)/s} = {\sigma ^2 s/2} + O(s^2). \end{aligned}$$
(14)

It is easily checked that \(H (s)\) is increasing for \(s\in [0,1]\), and satisfies \(H (0)=0\) and \(H (1)=p_{0}\).
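For instance, for the double-or-nothing law (a sketch; any critical offspring law with finite third moment behaves the same way to this order):

```python
import sympy as sp

s = sp.symbols('s')
p = {0: sp.Rational(1, 2), 2: sp.Rational(1, 2)}        # double-or-nothing, sigma^2 = 1
Q = 1 - sum(pk * (1 - s)**k for k, pk in p.items())     # Eq. (12)
h = sp.expand(s - Q)
H = sp.cancel(h / s)
assert h == s**2 / 2          # h(s) = sigma^2 s^2/2 exactly; no O(s^3) term here
assert H == s / 2 and H.subs(s, 1) == p[0]              # H increasing, H(1) = p_0
```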

Proposition 6

Under \(P^{x}\), the process

$$\begin{aligned} Y_n = \left( \prod _{j=1}^n \bigl (1-H(u(W_j)) \bigr ) \right) \cdot u(W_n) \end{aligned}$$
(15)

is a bounded martingale with respect to the natural filtration generated by the random walk \(\{ W_n \}\). Here we adopt the convention that the empty product \(\prod _{j=1}^0\) is equal to \(1\).

Proof

The random variables \(Y_n \) are uniformly bounded, in particular, \(0 \le Y_n \le 1\). To prove that the sequence is a martingale we appeal to the nonlinear convolution Eq. (11), which can be rewritten in terms of the function \(h\) as

$$\begin{aligned} \left( \sum _{k \in \mathbb {Z}} a_k u(x-k) \right) - u(x) = \sum _{k \in \mathbb {Z}} a_k h(u(x-k)). \end{aligned}$$
(16)

Using this, we compute

$$\begin{aligned} \begin{aligned}&E^x \left( Y_{n+1} \mid \{ W_j \} _{j=1}^n \right) \\&\quad = \; \left( \prod _{j=1}^n \bigl (1-H(u(W_j)) \bigr ) \right) \cdot E^x \left( \left( 1-H(u(W_{n+1})) \right) \cdot u(W_{n+1}) \mid \{ W_j \} _{j=1}^n \right) \\&\quad = \; \left( \prod _{j=1}^n \bigl (1-H(u(W_j)) \bigr ) \right) \cdot E^x \left( \left( u(W_{n+1}) - h(u(W_{n+1})) \right) \mid \{ W_j \} _{j=1}^n \right) \\&\quad = \; \left( \prod _{j=1}^n \left( 1-H(u(W_j)) \right) \right) \cdot \left( \sum _{k \in \mathbb {Z}} a_k u(W_n - k) - \sum _{k \in \mathbb {Z}} a_k h(u(W_n - k)) \right) \\&\quad = \; \left( \prod _{j=1}^n \left( 1-H(u(W_j)) \right) \right) \cdot u(W_n) = \; Y_n, \end{aligned} \end{aligned}$$

where the second to last equality uses Eq. (16). \(\square \)

Corollary 7

For each \(y\in \mathbb Z\), define

$$\begin{aligned} \tau _{y}=\min \{ n \ge 0 \mid W_n \le y \}. \end{aligned}$$
(17)

Then for all \(x,y\in \mathbb Z\),

$$\begin{aligned} u(x) = E^x \left( \prod _{j=1}^{\tau _{y}} \bigl (1-H(u(W_j)) \bigr ) \right) u (W_{\tau _{y}}). \end{aligned}$$
(18)

Proof

Since the random walk \(W_{n}\) is driftless it must be recurrent, and hence \(\tau _{y}\) is finite. Since the martingale \(Y_{n}\) of Proposition 6 is bounded, Doob’s optional sampling identity applies, yielding (18). Note that if \(y\le 0\) then \(u (W_{\tau _{y}})=1\), since \(W_{\tau _{y}}\le 0\).      \(\square \)
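As a numerical sanity check, one may average the stopped martingale of Proposition 6 over simulated walk paths and compare with \(u\); a sketch, with \(u\) precomputed by iterating (11) as in Sect. 2.1 (same illustrative offspring and step laws as before):

```python
import random
import numpy as np

X, iters = 400, 100_000
Q = lambda s: s - s**2 / 2
H = lambda s: s / 2                      # H(s) = h(s)/s for double-or-nothing
u = np.zeros(X + 2); u[0] = 1.0          # u(x) = 1 for x <= 0
for _ in range(iters):
    u[1:X+1] = 0.25*Q(u[0:X]) + 0.5*Q(u[1:X+1]) + 0.25*Q(u[2:X+2])

def stopped_martingale(x, cap=10**5):
    """Y at time min(tau_0, cap), for one walk path from W_0 = x; its mean is
    u(x) by optional stopping at the bounded time min(tau_0, cap).  (Clipping
    the index far to the right introduces a tiny bias.)"""
    w, prod = x, 1.0
    for _ in range(cap):
        w += random.choice((-1, 0, 0, 1))    # symmetric steps equal their reflection
        prod *= 1.0 - H(u[min(max(w, 0), X + 1)])
        if w <= 0:
            break
    return prod * u[min(max(w, 0), X + 1)]   # u(W) = 1 once w <= 0

x0 = 10
print(np.mean([stopped_martingale(x0) for _ in range(4000)]), u[x0])
```

The two printed numbers should nearly agree, in accordance with (18).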

2.3 Scaling limits

Using the Feynman-Kac representation (18) we will show that the function \(u\), properly re-normalized, converges to a function that satisfies the Fleischman-Sawyer equation (3). Because we do not know a priori that the function \(u\) has a proper scaling limit we must work with subsequential limits. Since \(u\) is monotone and satisfies \(0<u\le 1\) it follows that for any \(y\ge 0\) there exist sequences \(x_{k} \rightarrow \infty \) such that

$$\begin{aligned} \phi (y):= \lim _{ k \rightarrow \infty } \frac{u (x_{k}+{y}/{\sqrt{u(x_{k})}} )}{u(x_{k})} \end{aligned}$$
(19)

exists. Clearly, any such limit must satisfy \(0\le \phi (y)\le 1\), and if \(y=0\) the limit is \(\phi (0)=1\). By Cantor’s diagonalization argument, any such sequence \(x_{k}\) must have a subsequence, which we also denote by \(x_{k}\), such that the convergence (19) holds for all rational \(y\ge 0\).

Proposition 8

For any sequence \(x_{k}\rightarrow \infty \) such that (19) holds for all rational \(y\ge 0\), the limit function \(\phi (y)\) extends to a continuous, non-increasing, positive function of \(y\in [0,\infty )\). Hence, the convergence (19) holds uniformly for \(y\) in any compact interval \([0,A]\).

Proof

Fix \(0\le y_1<y_2\), both rational, and for ease of notation write \(z_i={y_i}/{\sqrt{u(x)}}\) for \(i=1,2\) and \(x=x_{k}\) (the dependence on \(k\) will be suppressed). Fix a sequence \(x_{k}\rightarrow \infty \) along which (19) holds for all rational \(y\). To avoid a proliferation of subscripts, we shall write \(\lim _{x \rightarrow \infty }\) to mean convergence along the subsequence \(x_{k}\). By Proposition 6 and Doob’s optional sampling theorem,

$$\begin{aligned} \begin{aligned} \phi (y_{2})&= \lim _{x \rightarrow \infty } \frac{u (x+z_{2})}{u(x)}\\&= \lim _{x \rightarrow \infty } E^{x+z_2}\left( \frac{u(W_{\tau (x+z_{1})})}{u(x)} \prod _{j=1}^{\tau ( x+z_1)} (1-H(u(W_j)) )\right) , \end{aligned} \end{aligned}$$

where \(\tau _{z} = \tau (z)=\min \{ j \ge 0 \mid W_j \le z \}\). Using the expansion \(H (u)\sim \sigma ^{2}u/2\) as \(u \rightarrow 0\) we obtain

$$\begin{aligned} \phi (y_{2})&= \lim _{x \rightarrow \infty } E^{x+z_2-z_1} \left( \prod _{j=1}^{\tau (x)} \left( 1-H(u(W_j+z_1)) \right) \frac{u(W_{\tau (x)}+z_1)}{u(x)} \right) \\&= \lim _{x \rightarrow \infty } E^{x+z_2-z_1} \left( \exp \left\{ -\frac{\sigma ^2}{2} \sum _{j=1}^{\tau (x)} u(W_j+z_1) \right\} \frac{u(W_{\tau (x)}+z_1)}{u(x)} \right) \\&\ge \lim _{x \rightarrow \infty } E^{x+z_2-z_1} \left( \exp \left\{ -\frac{\sigma ^2}{2} \tau (x) u(x) \right\} \frac{u(x+z_1)}{u(x)} \right) . \end{aligned}$$

The last inequality holds because \(W_j > x\) for all \(j < \tau _x\) and \(W_{\tau _x} \le x\).

By the invariance principle (here we use the assumption that the step distribution of the random walk \(W_{n}\) has mean \(0\) and finite variance), as \(x \rightarrow \infty \),

$$\begin{aligned} \begin{aligned} \mathcal {D} (\tau _x u(x) \mid W_0 = x+z_{2}-z_{1})&= \mathcal {D} (\tau _0 u(x) \mid W_0 = z_{2}-z_{1}) \\&\Longrightarrow \mathcal {D} (\tau _0^{BM} \mid B_0 = y_2-y_1) \end{aligned} \end{aligned}$$

where \(B_t\) is a standard Brownian motion started at \(y_{2}-y_{1}\) and \(\tau _0^{BM}\) is the first hitting time of \(0\) by \(B_t\). (Here \(\mathcal {D}\) denotes conditional distribution.) Hence, for any \(\epsilon >0\), when \(y_2-y_1\) is sufficiently small and \(x\) is sufficiently large, \(\exp \{-\sigma ^{2}\tau _{x}u (x)/2 \} \ge 1-\epsilon \) with probability at least \(1-\epsilon \). Consequently,

$$\begin{aligned} \begin{aligned} \lim _{x \rightarrow \infty }&E^{x+z_2-z_1}\left( \exp \bigl ( -\frac{\sigma ^2}{2} \tau _x u(x) \bigr ) \frac{u(x+z_1)}{u(x)} \right) \\&\ge \; (1-\epsilon )^2 \lim _{x \rightarrow \infty } \frac{u(x+z_1)}{u(x)}\\ = \;&(1-\epsilon )^2 \phi (y_1). \end{aligned} \end{aligned}$$

This proves that for every \(\epsilon >0\), if \(y_2-y_1\) is sufficiently close to zero then \(\phi (y_1) \ge \phi (y_2) \ge (1-\epsilon )^2 \phi (y_1)\). Therefore, \(\phi \) is continuous. \(\square \)

Our aim now is to show that there is only one possible subsequential limit function \(\phi \), and that it satisfies the Fleischman-Sawyer differential equation. To accomplish this we will use the discrete Feynman-Kac formula (18) and the invariance principle to show that any subsequential limit \(\phi \) satisfies the following Feynman-Kac formula.

Proposition 9

Assume that the step distribution \(\{a_k\}_{k \in \mathbb {Z}}\) of the random walk \(\{W_n\}\) has finite \(r\)-th moment for some \(r>4\). Then any subsequential limit \(\phi (y)\) specified by (19) satisfies

$$\begin{aligned} \phi (y) = E^{y/\eta } \exp \left( -\frac{\sigma ^2}{2} \int \limits _0^{\tau _0^{BM}} \phi (\eta B_t) \,\mathrm {d}t \right) \quad for \; y\ge 0, \end{aligned}$$
(20)

where under \(P^{y/\eta }\) the process \(B_{t}\) is a standard Brownian motion started at \(B_{0}=y/\eta \) and \(\tau _0^{BM}\) is the first hitting time of \(0\) by \(B_t\).

The need for a finite \(4+\varepsilon \) moment stems from the fact that in general the random walk \(W_{n}\) will overshoot \(0\) at the first passage time \(\tau (0)\). The renewal theorem (cf. [14] or [24]) implies that as the initial point \(W_{0}=x \rightarrow \infty \), the distribution of the overshoot \(W_{\tau (0)}\) converges weakly provided the step distribution of the random walk has mean zero and finite variance. The number of finite moments of the limiting overshoot distribution is determined by the number of moments of the step distribution, as follows.

Lemma 10

If the step distribution of \(\{W_n\}\) has finite \(r\)-th moment, then the limiting overshoot distribution has finite \((r-2)\)th moment.

Proof

Consider the ladder variables

$$\begin{aligned} T_1&= \min \{ n > 0 \mid W_n <W_{0} \},\quad Z_1=W_{T_1}-W_{0},\\ T_2&= \min \{ n > T_1 \mid W_n < W_{T_{1}} \},\quad Z_2=W_{T_2} -W_{T_{1}}, \end{aligned}$$

and so on. Clearly, the first passage time \(\tau (0)\) must be one of the ladder times \(T_{i}\). Moreover, the ladder steps \(Z_{1},Z_{2},\ldots \) are i.i.d., and by exercise 6, p. 232 of [24], the random variable \(Z_{1}\) has finite absolute \(r\)th moment if the step distribution \(F_{RW}\) has finite absolute \((r+1)\)th moment. The key observation is that for any \(a \le 0\) and any \(x\ge 1\),

$$\begin{aligned} P^{x}(W_{\tau (0)} \le a)&= \sum _{k=1}^{x} G (x;k) P^{k}\{Z_{1}\le a-k\}\\ &\le \sum _{k=1}^{x} P^{k}\{Z_{1}\le a-k\}\\ &=\sum _{k=1}^{x} P^{0}\{Z_{1}\le a-k\} \end{aligned}$$

where \(G (x;k)\) is the probability under \(P^{x}\) that the random walk \(W_{n}\) will visit the site \(k\) at one of the ladder times \(T_{i}\). Thus, if \(S\) has the limiting overshoot distribution, then for any \(a\le 0\),

$$\begin{aligned} P(S \le a) \le \sum _{k=1}^{\infty } P(Z_1 \le a-k). \end{aligned}$$

This inequality, together with the earlier observation about moments of the ladder variable \(Z_{1}\), implies that if \(E^{0}|W_{1}|^{r}<\infty \) then \(Z_{1}\) has finite \((r-1)\)th moment, and hence, summing the tail bound over \(k\), that \(S\) has finite \((r-2)\)th moment. \(\square \)

Lemma 11

If the step distribution \(F_{RW}\) satisfies the hypotheses of Theorem 1 then along any sequence \(x=x_{k}\rightarrow \infty \) such that (19) holds uniformly on compact sets,

$$\begin{aligned} \lim _{k \rightarrow \infty } E^{y/\sqrt{u(x_{k})}}\left( \frac{u(W_{\tau _0}+x_{k})}{u(x_{k})}\right) = 1. \end{aligned}$$
(21)

Proof

As in the proof of Proposition 8 we will omit the subscript \(k\) on \(x_{k}\) and write \(\lim _{x \rightarrow \infty }\) to mean convergence along the subsequence \(x_{k}\). We also write \(z=y/\sqrt{u (x)}\). By Proposition 8,

$$\begin{aligned} \lim _{y \rightarrow 0} \lim _{x \rightarrow \infty } \frac{u (x+z )}{u(x)} = 1, \end{aligned}$$

which implies that for any \(\alpha >0\),

$$\begin{aligned} \lim _{x \rightarrow \infty } \frac{u (x+u (x)^{-\frac{1}{2}+\alpha }) }{u(x)} = 1. \end{aligned}$$

By the monotonicity of \(u\),

$$\begin{aligned} 1&\le E^{z} \left( \frac{u(W_{\tau _0}+x)}{u(x)} \right) \\&= E^{z} \left( \frac{u(W_{\tau _0}+x)}{u(x)} \right) \mathbf {1}_{A} +E^{z} \left( \frac{u(W_{\tau _0}+x)}{u(x)} \right) \mathbf {1}_{A^{c}} \\&=I+II, \end{aligned}$$

where

$$\begin{aligned} A= A (x)=\{W_{\tau (0)}\ge -u (x)^{-1/2+\alpha } \}. \end{aligned}$$

By Chebyshev’s inequality, for any \(r>2\),

$$\begin{aligned} P^{z} (A^{c})\le E|W_{\tau (0)}|^{r-2}u (x)^{(r-2) (\frac{1}{2} -\alpha )}, \end{aligned}$$

and by Lemma 10, if the step distribution \(F_{RW}\) has finite \(r\)th moment then \(E|W_{\tau (0)}|^{r-2}<\infty \). Since \(u (W_{\tau _0}+x)/u (x)\le u (x)^{-1}\), it follows that

$$\begin{aligned} II\le C (u(x))^{(\frac{1}{2}-\alpha )(r-2)-1}. \end{aligned}$$

By hypothesis the step distribution \(F_{RW}\) has finite \(r\)th moment for some \(r>4\), so the constant \(\alpha >0\) can be chosen so that the exponent in the last displayed inequality is positive. Consequently, quantity \(II\) converges to \(0\) as \(x \rightarrow \infty \). On the other hand, \(\lim _{x \rightarrow \infty }P^{z} (A)=1\), and on the event \(A\) the integrand in quantity \(I\) is bounded by

$$\begin{aligned} \frac{u (x-u (x)^{-1/2+\alpha })}{u (x)} \longrightarrow 1, \end{aligned}$$

and so quantity \(I\) converges to \(1\) as \(x \rightarrow \infty \). \(\square \)

Proof of Proposition 9

Once again write \(z=y/\sqrt{u (x)}\). According to Corollary 7 and the Optional Stopping Theorem,

$$\begin{aligned}&\frac{u \bigl (x+{y}/{\sqrt{u(x)}} \bigr )}{u(x)}\nonumber \\ = \;&E^{x+z} \prod _{j=1}^{\tau _{x}} \bigl (1-H(u(W_j)) \bigr ) \frac{u(W_{\tau _x})}{u(x)} \nonumber \\ = \;&E^{z} \prod _{j=1}^{\tau _{0}} \bigl (1-H(u(W_j+x)) \bigr ) \frac{u(W_{\tau _0}+x)}{u(x)} \nonumber \\ = \;&E^{z} \exp \left\{ \sum _{j=1}^{\tau _{0}} \log \bigl (1-H(u(W_j+x)) \bigr ) \right\} \frac{u(W_{\tau _0}+x)}{u(x)} \nonumber \\ = \;&E^{z} \exp \left\{ \sum _{j=1}^{\tau _{0}} \Bigl ( -\frac{\sigma ^2}{2} u(W_j+x) + O(u(x)^2) \Bigr ) \right\} \frac{u(W_{\tau _0}+x)}{u(x)} \nonumber \\ = \;&E^{z} \exp \left\{ -\frac{\sigma ^2}{2} \sum _{j=1}^{\tau _{0}} u(W_j+x) + \tau _0 O(u(x)^2) \right\} \frac{u(W_{\tau _0}+x)}{u(x)} \nonumber \\ \end{aligned}$$
(22)

The error term \(O(u(x)^2)\) is bounded in magnitude by \(Cu (x)^{2}\) for some finite constant \(C\) not depending on \(x\), by virtue of our standing hypothesis that the offspring distribution \(F_{GW}\) has finite third moment, which ensures that \(h (s)\) has a bounded third derivative near \(s=0\), and hence that \(H (s)={\sigma ^{2}s/2}+O (s^{2})\).

The invariance principle implies that as \(x \rightarrow \infty \) the distribution of the process \(\sqrt{u(x)} W_{t/u(x)}/\eta \) under \(P^{z}\) converges weakly to that of a standard Brownian motion \(B_{t}\) started at \(B_{0}=y/\eta \). Consequently, the distribution of the renormalized first passage time \(u (x)\tau _{0}\) converges weakly to that of \(\tau ^{BM}_{0}\), and hence the error term \(\tau _{0}O (u (x)^{2})\) converges in distribution to \(0\). Moreover, along any sequence \(x=x_{k}\rightarrow \infty \) such that the convergence (19) holds uniformly for \(y\) in compact intervals,

$$\begin{aligned} \sum _{j=1}^{\tau _{0}}u (W_{j}+x)=u (x)\sum _{j=1}^{\tau _{0}}u (W_{j}+x)/u (x) \mathop {\longrightarrow }\limits ^{\mathcal {D}} \int \limits _{0}^{\tau ^{BM}_{0}} \phi (\eta B_{t})\,dt. \end{aligned}$$

Therefore, by Lemma 11,

$$\begin{aligned}&\lim \limits _{k \rightarrow \infty } E^{z} \exp \left\{ -\frac{\sigma ^2}{2} \sum \limits _{j=1}^{\tau _{0}} u(W_j+x_{k}) + \tau _0 O(u(x_{k})^2) \right\} \frac{u(W_{\tau _0}+x_{k})}{u(x_{k})}\\&\quad = E^{y/\eta } \exp \left\{ -\frac{\sigma ^2}{2}\int \limits _0^{\tau _0^{BM}} \phi (\eta B_t) \,\mathrm {d}t \right\} . \end{aligned}$$

This together with the convergence (19) and the chain of equalities (22) proves that \(\phi \) must satisfy the Feynman-Kac formula (20). \(\square \)

Corollary 12

Under the hypotheses of Proposition 9,

$$\begin{aligned} \lim _{x \rightarrow \infty } \frac{u (x+y/\sqrt{u (x)})}{u (x)} = \left( \frac{\sigma y}{\sqrt{6}\eta }+1 \right) ^{-2}=:\phi (y) \end{aligned}$$
(23)

uniformly for \(y\ge 0\).

Proof

Since \(u\) and \(\phi \) are monotone and bounded, it suffices to prove that there is only one possible subsequential limit function (19), and that this limit is the solution of the differential equation

$$\begin{aligned} \phi '' (y)= {\sigma ^{2}\phi (y)^{2}}/{\eta ^{2}} \end{aligned}$$
(24)

that satisfies \(\phi (0)=1\) and \(\lim _{y \rightarrow \infty }\phi (y)=0\). But this follows from the Feynman-Kac representation (20) of subsequential limits and Kac’s theorem (cf., for instance, [17], Sect.  2.6), which implies that for any positive, bounded, continuous function \(V:[0,\infty ) \rightarrow \mathbb R\) the function

$$\begin{aligned} \psi (y)=E^{y/\eta }\exp \left\{ -\frac{\sigma ^{2}}{2}\int \limits _{0}^{\tau ^{BM}_{0}}V (\eta B_{t})\,dt \right\} \end{aligned}$$

is the unique bounded solution of the differential equation \(\psi ''=\sigma ^{2}V\psi /\eta ^{2}\) satisfying \(\psi (0)=1\). \(\square \)
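The limit function in (23) is elementary to check against (24); a symbolic sketch in Python/sympy:

```python
import sympy as sp

y = sp.symbols('y', nonnegative=True)
sigma, eta = sp.symbols('sigma eta', positive=True)
phi = (sigma * y / (sp.sqrt(6) * eta) + 1)**(-2)
# Eq. (24) with the boundary conditions phi(0) = 1 and phi(infinity) = 0
assert sp.simplify(sp.diff(phi, y, 2) - sigma**2 * phi**2 / eta**2) == 0
assert phi.subs(y, 0) == 1 and sp.limit(phi, y, sp.oo) == 0
```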

2.4 Proof of Theorem 1

To complete the proof of Theorem 1 we must show that

$$\begin{aligned} \lim _{x \rightarrow \infty }w (x) =\frac{6\eta ^{2}}{\sigma ^{2}}=:\frac{1}{\beta ^{2}} \quad \text {where} \quad w (x):=x^{2}u (x). \end{aligned}$$
(25)

We will deduce (25) from the asymptotic scaling law (23), which in terms of the function \(w\) may be rewritten as

$$\begin{aligned} \lim _{x \rightarrow \infty }\frac{(1+\beta y)^{2}}{(1+y/\sqrt{w (x)})^{2}}\cdot \frac{w (x (1+y/\sqrt{w (x)}))}{w (x)}=1. \end{aligned}$$
(26)

This relation holds uniformly for \(y\in [0,A]\), for any \(A<\infty \), by Corollary 12, since (26) is obtained from (23) by the substitution \(w (x)=x^{2}u (x)\) and multiplication of both sides by \((1+\beta y)^{2}\).

The function \(w\) is not continuous, since \(u\) has jump discontinuities at the positive integers. However, it is “asymptotically continuous” in the sense that

$$\begin{aligned} \lim _{x \rightarrow \infty } \sup _{0\le y\le 1} \left| \frac{w (x+y)}{w (x)} -1\right| =0. \end{aligned}$$

This follows directly from the corresponding assertion for the function \(u\), which in turn follows from the monotonicity of \(u\) and Corollary 12. Consequently, \(w\) obeys the following weak form of the intermediate value theorem.

Lemma 13

Suppose there exist constants \(0\le A < B\le \infty \) and positive integers \(x_{1}<z_{1}<x_{2}<z_{2}<\cdots \) such that \(\lim w(x_{n})=A\) and \(\lim w (z_{n})=B\). Then for every \(C\in [A,B]\) there exist integers \(y_{n}\in [x_{n},z_{n}]\) and \(y_{n}'\in [z_{n},x_{n+1}]\) such that

$$\begin{aligned} \lim w (y_{n})= \lim w (y_{n}')=C. \end{aligned}$$

Proof of Theorem 1

Denote by \(0\le A\le B\le \infty \) the liminf and limsup of \(w (x)\) as \(x \rightarrow \infty \). We must show that \(A=B=1/\beta ^{2}\).

First we show that \(A<\infty \) and \(B>0\). Assume to the contrary that \(A=\infty \); then \(w (x) \rightarrow \infty \) as \(x \rightarrow \infty \), and so there must be a sequence \(x_{k} \rightarrow \infty \) of running minima of \(w\), that is, such that \(w (x)\ge w (x_{k})\) for all \(x\ge x_{k}\). The asymptotic scaling relation (26), with \(x=x_{k}\), implies that for any \(y>0\),

$$\begin{aligned} \lim _{k \rightarrow \infty }\frac{(1+\beta y)^{2}}{(1+y/\sqrt{w (x_{k})})^{2}}\cdot \frac{w (x_{k} (1+y/\sqrt{w (x_{k})}))}{w (x_{k})}=1. \end{aligned}$$

Since \(w (x_{k})\rightarrow \infty \), the first ratio converges to \((1+\beta y)^{2}>1\); but the second ratio is also at least \(1\), by the choice of \(x_{k}\), so we have a contradiction. Therefore, \(A<\infty \). A similar argument shows that \(B>0\).

Suppose next that \(\infty >A>1/\beta ^{2}\). Since \(A=\liminf w (x)\) there exists a sequence \(x_{k}\rightarrow \infty \) such that \(w(x_{k})\rightarrow A\). For any \(y>0\) the scaling relation (26), with \(x=x_{k}\), implies

$$\begin{aligned} \lim _{k \rightarrow \infty }\frac{(1+\beta y)^{2}}{(1+y/\sqrt{A})^{2}}\cdot \frac{w (x_{k} (1+y/\sqrt{A}))}{A}=1. \end{aligned}$$

But since \(y>0\), it follows that

$$\begin{aligned} \lim _{k \rightarrow \infty } w (x_{k} (1+y/\sqrt{A})) = A \frac{(1+y/\sqrt{A})^{2}}{(1+\beta y)^{2}}<A, \end{aligned}$$

contradicting the supposition that \(A=\liminf w (x)\). Thus, \(A\le 1/\beta ^{2}\). A similar argument proves that \(B\ge 1/\beta ^{2}\). Thus, \(\beta ^{-2}\in [A,B]\).

It remains to prove that \(A=B\). If not, then it must be the case that \(\beta ^{-2}<B\) or \(\beta ^{-2}>A\). Suppose that \(A<\beta ^{-2}\), and let \(A^{*}\in (A,\beta ^{-2})\); then the weak intermediate value property (Lemma 13) implies that there exist sequences \(z_{n}\rightarrow \infty \) and \(x_{n}\rightarrow \infty \) such that

$$\begin{aligned} \lim _{n \rightarrow \infty } w (z_{n})&=A^{*};\\ \lim _{n \rightarrow \infty } w (x_{n})&=A; \quad \text {and}\\ w (x)&\le w (z_{n}) \quad \text {for all}\; x\in [z_{n},x_{n}]. \end{aligned}$$

Relation (26), this time with \(x=z_{n}\), implies that uniformly for \(y\in [0,C]\), for any \(C<\infty \),

$$\begin{aligned} \lim _{n \rightarrow \infty } \frac{(1+\beta y)^{2}}{(1+y/\sqrt{A^{*}})^{2}}\cdot \frac{w (z_{n} (1+y/\sqrt{A^{*}}))}{w (z_{n})}=1. \end{aligned}$$

But since \(A^{*}<\beta ^{-2}\),

$$\begin{aligned} \frac{(1+\beta y)^{2}}{(1+y/\sqrt{A^{*}})^{2}}\le 1 \end{aligned}$$

for all \(y\ge 0\), with strict inequality except at \(y=0\), and so

$$\begin{aligned} \liminf \frac{w (z_{n} (1+y/\sqrt{A^{*}}))}{w (z_{n})}\ge 1 \end{aligned}$$

uniformly on any interval \(y\in [0,C]\), with strict inequality on any sub-interval \(y\in [\varepsilon ,C]\). This contradicts the hypothesis that \(w (x)\le w (z_{n})\) for all \(x\in [z_{n},x_{n}]\). Therefore, \(A=\beta ^{-2}\). A similar argument shows that \(B=\beta ^{-2}\). \(\square \)

3 Conditional limit theorem

3.1 Space-time Feynman-Kac formula

Assume throughout this section that \(M_{n}\) is the rightmost particle location in the \(n\)th generation of a branching random walk satisfying the hypotheses of Theorem 1. (On the event that there are no particles in the \(n\)th generation, set \(M_{n}=-\infty \).) The unconditional distribution function of the random variable \(M_{n}\) satisfies a time-dependent nonlinear convolution equation similar to the time-independent Eq. (11) satisfied by the distribution of the maximal displacement random variable \(M\). Specifically, if

$$\begin{aligned} v_{n} (x)=v (n,x):=P\{M_{n} > x\}, \end{aligned}$$
(27)

then for every \(n\ge 1\),

$$\begin{aligned} v_{n} (x) =\sum _{y\in \mathbb Z} a_{y}Q (v_{n-1} (x-y)), \end{aligned}$$
(28)

where \(1-Q (1-s)\) is the probability generating function of the offspring distribution (see Eq. (12)). The objective of this section is to analyze the asymptotic behavior of \(v\), and in particular to show that \(nv (n,[x\sqrt{n}])\) converges as \(n \rightarrow \infty \), for any \(x\in \mathbb R\), to a limit function that depends only on the variances of the offspring and step distributions of the branching random walk. The strategy will once again be to represent the solution of (28) by a discrete Feynman-Kac formula, and then to show that after an appropriate rescaling the Feynman-Kac expectations converge to the corresponding Feynman-Kac expectations for Brownian motion. As in Sect. 2, denote by \(W_{n}\) a random walk with step distribution (13), and by \(P^{x}\) the law of the random walk with initial point \(W_{0}=x\). Then essentially the same arguments as in the time-independent case prove the following assertion.

Proposition 14

For each \(n\) the process

$$\begin{aligned} Z^{(n)}_{k}= v_{n-k} (W_{k})\prod _{j=1}^{k} (1-H (v_{n-j}(W_{j}))) , \quad \text {for}\; k=0,1,2,\ldots ,n , \end{aligned}$$
(29)

where \(H\) is defined by (14), is a martingale under \(P^{x}\). Consequently, for any \(n\le m\),

$$\begin{aligned} v_{m} (x)=E^{x}v_{n} (W_{m-n})\prod _{j=1}^{m-n} (1-H (v_{m-j} (W_{j}))). \end{aligned}$$
(30)

3.2 Monotonicity, tightness, and scaling limits

For each \(n\ge 0\) the function \(v_{n} (x)\) is non-increasing in \(x\), with limit \(0\) as \(x \rightarrow \infty \) and limit \(v_{n} (-\infty )=P\{\zeta \ge n \}\) as \(x \rightarrow -\infty \), where \(\zeta \) is the extinction time of the branching random walk. By Kolmogorov’s theorem on the lifetime of a critical Galton-Watson process (see [5], ch. 1), as \(n \rightarrow \infty \),

$$\begin{aligned} v_{n} (-\infty )=P\{\zeta \ge n \}\sim \frac{2}{\sigma ^{2}n}. \end{aligned}$$
(31)

Hence, the functions \(nv_{n} (x)\) are uniformly bounded.
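These facts are easy to observe numerically. The sketch below iterates (28) (same illustrative offspring and step laws as in Sect. 2, so \(\sigma ^{2}=1\) and \(\eta ^{2}=1/2\)), starting from \(v_{0} (x)=\mathbf {1}\{x<0 \}\), and prints \(nv_{n}\) at the far left (Kolmogorov's asymptotics (31)) and along the diffusive scale:

```python
import numpy as np

L, N = 400, 2000                         # spatial window [-L, L]; N generations
Q = lambda s: s - s**2 / 2
v = (np.arange(-L, L + 1) < 0).astype(float)   # v_0(x) = 1{x < 0}, since M_0 = 0
for _ in range(N):                       # iterate the recursion (28)
    Qv = Q(v)
    v = (0.25 * np.concatenate(([Qv[0]], Qv[:-1]))   # flat extension far left
         + 0.5 * Qv
         + 0.25 * np.concatenate((Qv[1:], [0.0])))   # v ~ 0 far to the right
print(N * v[0])                          # ~ 2/sigma^2 = 2, by (31)
for x in (0.5, 1.0, 2.0):
    print(x, N * v[L + int(x * N**0.5)]) # n*v_n(x*sqrt(n)): the limit profile
```

(The printed profile is a finite-\(n\) approximation to the limit identified in Sect. 3.5.)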

Lemma 15

Under the hypotheses of Theorem 1, the family of rescaled distribution functions \( nv_{n} (x\sqrt{n})\) is tight, that is,

$$\begin{aligned} \lim _{ x \rightarrow \infty } \sup _{n\ge 1} nP\{M_{n} \ge x\sqrt{n}\}=0 \quad \text {and} \quad \lim _{ x \rightarrow -\infty } \sup _{n\ge 1} nP\{-\infty <M_{n} \le x\sqrt{n}\}=0. \end{aligned}$$
(32)

Proof

The first of these follows directly from Theorem 1, because \(M_{n}\le M\). The second follows from Theorem 1 by reflection of the branching random walk in the origin. \(\square \)

Remark 4

The hypothesis that the step distribution \(F_{RW}\) of the branching random walk has finite \(4+\varepsilon \) moment is used here in an essential way. If \(F_{RW}\) has infinite \(4-\varepsilon \) moment for some \(\varepsilon >0\) then Lemma 15 need not be true: see the discussion at the end of Sect. 2.

Lemma 16

For any \(\varepsilon >0\) there exists \(\delta >0\) such that if \(|x-y|\le \delta \sqrt{n}\) and \(n\le m\le n (1+\delta )\) then

$$\begin{aligned} \left| \frac{v_{m} (y)}{v_{n} (x)}-1\right| \le \varepsilon . \end{aligned}$$
(33)

Proof

This follows by an argument similar to the proof of Proposition 8, using the Feynman-Kac formula (30) and Donsker’s invariance principle, since \(\sup _{x}v_{n} (x) \rightarrow 0\) as \(n \rightarrow \infty \) and \(H (u)\sim \sigma ^{2}u/2\) as \(u \rightarrow 0\). \(\square \)

Corollary 17

Any sequence of positive integers has a subsequence \(n_{k}\rightarrow \infty \) along which the functions \((t,x)\mapsto nv_{[nt]} (x\sqrt{n})\) converge uniformly for \(1\le t\le A\) and \(x\in [-\infty ,\infty ]\), for any \(A<\infty \). The set of possible limit functions \(\varphi (t,x)\) is compact in \(C ([1,A]\times \bar{\mathbb R})\), and for each \(t\) the function \(\varphi (t,x)\) is non-increasing in \(x\), with

$$\begin{aligned} \lim _{x \rightarrow \infty }\varphi (t,x)=0 \quad \text {and} \quad \lim _{x \rightarrow -\infty } t\varphi (t,x)=2/\sigma ^{2}. \end{aligned}$$
(34)

Proof

All of the assertions except the limits (34) follow from Lemma 16. The limit relations (34) follow from the tightness of the family \(nv_{n} (x\sqrt{n})\) and Kolmogorov’s theorem (31). \(\square \)

Corollary 18

Let \(\varphi (t,x)\) be any subsequential limit of the functions \((t,x)\mapsto nv_{[nt]} (x\sqrt{n})\), for \(1\le t\) and \(x\in [-\infty ,\infty ]\). Then \(\varphi \) satisfies the identity

$$\begin{aligned} \varphi (t+1,x)=E^{x/\eta } \varphi (1,\eta B_{t})\exp \left\{ -\frac{\sigma ^{2}}{2}\int \limits _{0}^{t} \varphi (t+1-s,\eta B_{s}) \,ds \right\} , \end{aligned}$$
(35)

where under \(P^{y}\) the process \(B_{t}\) is a standard Brownian motion started at \(y\). Consequently, \(\varphi \) satisfies the partial differential equation

$$\begin{aligned} \frac{\partial \varphi }{\partial t}=\frac{\eta ^{2}}{2}\frac{\partial ^{2}\varphi }{\partial x^{2}} -\frac{\sigma ^{2}}{2}\varphi ^{2} \quad for \;\; t>1 \quad and\;\; x\in \mathbb R. \end{aligned}$$
(36)

Proof

The integral representation (35) follows from the discrete Feynman-Kac formula (30) and the invariance principle by virtually the same argument as in the proof of Proposition 9. The differential Eq. (36) follows from the integral representation by Kac’s theorem. \(\square \)

It should be noted that the only a priori bounds on the functions \(v_{n}\) needed to deduce the existence of subsequential limits \(\varphi (t,x)\) are those in Lemmas 15–16 and Corollary 17. These use only the crude estimate \(M_{n}\le M:=\max _{n\ge 0}M_{n}\) and the results concerning the tail behavior of the distribution of \(M\) proved in Sect. 2. To prove that there is only one possible subsequential limit function \(\varphi (t,x)\) we will need the following stronger a priori bounds on the functions \(v_{n}\).

Lemma 19

$$\begin{aligned} \lim _{x \rightarrow \infty } \sup _{n\ge 1}\,&nx^{2}v_{n} (x\sqrt{n})=0 \quad and \end{aligned}$$
(37)
$$\begin{aligned} \lim _{x \rightarrow -\infty } \sup _{n\ge 1}\,&nx^{2} \bigl (v_{n} (-\infty )- v_{n} (x\sqrt{n})\bigr )=0. \end{aligned}$$
(38)

The proof is deferred to Sect. 3.6 below.

3.3 Super-Brownian motion and solutions of (36)

At first sight it might appear that Corollary 18 sheds no light at all on the question of uniqueness of scaling limits, because in general the solution to the partial differential Eq. (36) will depend on the initial condition \(\varphi (1,x)\). However, we will show, using the a priori bounds in Lemma 19, that in fact Corollary 18 implies that there can be only one scaling limit, and thereby complete the proof of Theorem 3. The key is the fact that solutions of (36) determine—and are determined by—the law of super-Brownian motion, by the following duality formula. For ease of exposition, assume henceforth that \(\eta ^{2}=1\). (There is no loss of generality in this, because solutions of (36) can be rescaled.)

Proposition 20

Let \(X^{\delta _{x}}_{t}\) be a super-Brownian motion with branching parameter \(\sigma ^{2}\) and initial mass distribution \(X^{\delta _{x}}_{0}=\delta _{x}\), a unit point mass at location \(x\in \mathbb R\). Let \(\varphi (t+1,x)\) be the solution of the evolution Eq. (36) with initial condition \(\varphi (1,x)=\varphi (x)\). Then

$$\begin{aligned} \varphi (t+1,x)=-\log E \exp \{- \langle X^{\delta _{x}}_{t},\varphi \rangle \}. \end{aligned}$$
(39)

Here \(\langle \mu ,\varphi \rangle \) denotes the integral of \(\varphi \) against the measure \(\mu \).

Proof

See, for instance, [12], Corollary 1.25. \(\square \)

Exploitation of the duality formula (39) will require some elementary properties of super-Brownian motion. First, super-Brownian motion is equivariant under spatial translation: \(X^{\delta _{x}}_{t}\) can be obtained from \(X^{\delta _{0}}_{t}\) by translating each of the random measures \(X^{\delta _{0}}_{t}\) by \(x\) to the right. Second, super-Brownian motion satisfies a scaling law: if \(\tilde{X}_{t}\) is a super-Brownian motion with initial mass distribution \(\tilde{X}_{0}=A\delta _{x\sqrt{A}}\) then the rescaled measure-valued process \(X_{t}\) defined by

$$\begin{aligned} \langle X_{t},f \rangle =\langle A^{-1}\tilde{X}_{At}, f (\cdot /\sqrt{A}) \rangle \end{aligned}$$

is a super-Brownian motion with initial mass distribution \(X_{0}=\delta _{x}\). Third, super-Brownian motion is infinitely divisible, in the following sense: the superposition of \(m\) independent super-Brownian motions with initial mass distributions \(\nu _{i}\) is a super-Brownian motion with initial mass distribution \(\sum _{i=1}^{m}\nu _{i}\). This implies that the total mass \(|X_{t}|\) at time \(t\) evolves as Feller diffusion, and so \(P\{X_{t}\not =0 \}\sim C/t\) as \(t \rightarrow \infty \) with \(C=2/\sigma ^{2}\), where \(\sigma ^{2}\) is the branching parameter (cf. [21], Sect. 2.1, Theorem 1 and Example (iii)). It follows, by (39), that any solution \(\varphi (t+1,x)\) of (36) whose initial condition \(\varphi (x)=\varphi (1,x)\) is non-increasing in \(x\) and satisfies the boundary conditions (34) retains these properties for every \(t>0\). Finally, the scaling and infinite divisibility properties imply that super-Brownian motion conditioned to live for time at least \(t\) has a limit law.
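For the reader's convenience, here is the standard computation behind the survival asymptotics cited above: the total mass satisfies \(E\exp \{-\lambda |X_{t}| \}=\exp \{-u_{\lambda } (t) \}\), where \(u_{\lambda }'=- (\sigma ^{2}/2)u_{\lambda }^{2}\) and \(u_{\lambda } (0)=\lambda \), so

$$\begin{aligned} u_{\lambda } (t)=\frac{\lambda }{1+\sigma ^{2}\lambda t/2} \quad \Longrightarrow \quad P\{X_{t}=0 \}=\lim _{\lambda \rightarrow \infty }e^{-u_{\lambda } (t)}=e^{-2/ (\sigma ^{2}t)}, \end{aligned}$$

whence \(P\{X_{t}\not =0 \}=1-e^{-2/ (\sigma ^{2}t)}\sim 2/ (\sigma ^{2}t)\).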

Proposition 21

If \(X_{t}\) is a super-Brownian motion with initial state \(X_{0}=\delta _{0}\) then for any bounded, continuous test function \(f:\mathbb R \rightarrow \mathbb R\),

$$\begin{aligned} \lim _{t \rightarrow \infty }\mathcal {D} (t^{-1}\langle {X_{t},f (\cdot /\sqrt{t})}\rangle \,|\, X_{t}\not =0) =\mathcal {D} (\langle {Y_{1},f}\rangle ) \end{aligned}$$
(40)

where \(Y_{t}=Y^0_{t}\) is the super-process specified in Corollary 4.

Proof

See [23] and [13]. \(\square \)

Construction of the limit process: The existence of the weak limit (40) is implicit in the “Poisson cluster” representation of super-Brownian motion (cf. [12], ch. 1) but in fact weak convergence will not by itself suffice for the arguments to follow, so we now sketch a construction of super-Brownian motion in which the random measure \(Y^{x}_{1}\) arises naturally as an almost sure limit. First, observe that infinite divisibility implies that a super-Brownian motion \(X_{t}\) with initial state \(X_{0}=\delta _{x}\) can be decomposed as

$$\begin{aligned} X_{t}=X'_{t}+X''_{t}, \end{aligned}$$

where \(X'_{t}\) and \(X''_{t}\) are independent super-Brownian motions started from initial measures \(X'_{0}=X''_{0}=\delta _{x}/2\). This decomposition process can be iterated, so by standard arguments there exist, on some probability space, countably many independent super-Brownian motions \(X_t^{n,m}\), where \(m=0,1,2,\cdots \) and \(n=1,2,\cdots ,2^m\), such that for every pair \((n,m)\) the process \(X_t^{n,m}\) has initial state \(X^{n,m}_{0}=\delta _{x}/2^{m}\) and

$$\begin{aligned} X_t^{n,m} = X_t^{n,m+1} + X_t^{n+2^m,m+1} \quad \text {for} \;\; t\ge 0. \end{aligned}$$

By the scaling property, the probability that any one of the \(m\)th generation processes \(\{X_t^{n,m} \}_{t\ge 0}\) survives to time \(1\) is the same as the probability that the \(0\)th-generation super-Brownian motion \(\{X_t^{1,0} \}_{t\ge 0}\) survives to time \(2^{m}\), which is \(\sim C/2^{m}\), where \(C=2/\sigma ^{2}\). Hence, if \(F_m = \{ n \mid X_1^{n,m} \ne 0 \}\) then \(|F_{m}|\) converges in law to the Poisson distribution with mean \(C\). By construction, the random variables \(|F_{m}|\) are nondecreasing in \(m\), so it follows that in fact the sequence \(\{|F_{m}| \}_{m\ge 0}\) is eventually constant, with Poisson limit \(N\). By re-indexing the super-Brownian motions in each generation \(m\), we can arrange that for all sufficiently large \(m\) the processes \(\{X^{n,m}_{t} \}_{t\ge 0}\) satisfy

$$\begin{aligned} X^{n,m}_{1}=X^{n,m+1}_{1} \quad \text {for all}\;\; 1\le n\le N, \end{aligned}$$

and only those processes \(\{X^{n,m}_{t} \}_{t\ge 0}\) for which \(n\le N\) survive to time \(t=1\). Conditional on the value of \(N\), each of the random measures

$$\begin{aligned} X^{n,\infty }_{1}:= \lim _{m \rightarrow \infty } X^{n,m}_{1} \end{aligned}$$

is an independent version of \(Y^{x}_{1}\). \(\square \)

3.4 Asymptotic behavior of scaling limits

Corollary 22

Let \(\varphi (t,x)\) be any subsequential limit of the functions \((t,x)\mapsto nv_{[nt]} (x\sqrt{n})\), for \(1\le t\) and \(x\in [-\infty ,\infty ]\). Then for any \(x\in \mathbb R\),

$$\begin{aligned} \lim _{m \rightarrow \infty } 2^{m} \varphi (2^{m}+1,x2^{m/2})=\frac{2}{\sigma ^{2}} P\{Y^{x}_{1} (-\infty ,0]\not = 0 \}=\frac{2}{\sigma ^{2}}\bigl (1-G (x)\bigr ). \end{aligned}$$
(41)

Furthermore, for each \(x\in \mathbb R\) the convergence (41) holds uniformly over the set of all possible subsequential limits \(\varphi (t,x)\). (Here \(G\) is the distribution function of Corollary 4; the second identity in (41) holds because, by translation equivariance and the reflection symmetry of \(Y_{1}\), \(P\{Y^{x}_{1} (-\infty ,0]\not =0 \}=P\{Y_{1} (-\infty ,-x]\not =0 \}=P\{Y_{1}[x,\infty )\not =0 \}=1-G (x)\).)

Proof

Write \(y=\sqrt{t}x\), and abbreviate \(\varphi (1,x)=\varphi (x)\). Since \(P\{X^{\delta _y}_{t}\not =0 \}\sim 2/ (\sigma ^{2}t)\) as \(t \rightarrow \infty \), the duality formula (39) implies that

$$\begin{aligned} t\varphi (t+1,y)&=-t\log E\exp \{-\langle {X^{\delta _y}_{t},\varphi }\rangle \}\\&\sim tE \bigl (1-\exp \{-\langle {X^{\delta _y}_{t},\varphi }\rangle \}\bigr )\\&=tE \bigl (1-\exp \{-\langle {X^{\delta _y}_{t},\varphi }\rangle \}\bigr )\mathbf {1}\{X^{\delta _y}_{t}\not =0 \} \\&\sim \frac{2}{\sigma ^{2}}E \bigl ( 1-\exp \{-\langle {X^{\delta _y}_{t},\varphi }\rangle \} \,\big |\, X^{\delta _y}_{t}\not =0 \bigr ). \end{aligned}$$

The last expectation can be rewritten using the scaling property of super-Brownian motion:

$$\begin{aligned}&E \bigl ( 1-\exp \{-\langle X^{\delta _y}_{t},\varphi \rangle \} \,\big |\, X^{\delta _y}_{t}\not =0 \bigr )\\&\quad = \; 1- E \left( \exp \left\{ -\int \limits _{-\infty }^{\infty } \varphi (z) \,\mathrm {d}X^{\delta _y}_{t}(z) \right\} \,\Big |\, X^{\delta _y}_{t}\not =0 \right) \\&\quad = \; 1- E \left( \exp \left\{ -t \int \limits _{-\infty }^{\infty } \varphi (\sqrt{t}z) \,\mathrm {d}\bigl (t^{-1} X^{\delta _y}_{t}\bigr )(\sqrt{t}z) \right\} \,\Big |\, X^{\delta _y}_{t}\not =0 \right) \\&\quad = \; 1- E \left( \exp \left\{ -t \int \limits _{-\infty }^{\infty } \varphi (\sqrt{t}z) \,\mathrm {d}X^{\delta _x/t}_{1}(z) \right\} \,\Big |\, X^{\delta _x/t}_{1}\not =0 \right) . \end{aligned}$$

By the construction sketched in the preceding subsection, versions of the super-Brownian motions \(X^{\delta _{x}/t}\), for \(t=2^{m}\), conditioned to survive to time \(1\), can be constructed on a common probability space along with a version of the random measure \(Y^{x}_{1}\) in such a way that \(X^{\delta _{x}/2^{m}}_{1}=Y^{x}_{1}\) for all large \(m\). Hence, as \(t \rightarrow \infty \) through powers of \(2\),

$$\begin{aligned}&\lim _{t \rightarrow \infty } E \left( \exp \left\{ -t \int \limits _{-\infty }^{\infty } \varphi (\sqrt{t}z) \,\mathrm {d}(X^{\delta _x/t}_{1}(z)) \right\} \,|\, X^{\delta _x/t}_{1}\not =0 \right) \\&\quad = \; \lim _{t \rightarrow \infty } E \exp \left\{ -t \int \limits _{-\infty }^{\infty } \varphi (\sqrt{t}z) \,\mathrm {d}(Y^x_{1}(z)) \right\} . \end{aligned}$$

The result now follows from Lemma 19 and the dominated convergence theorem. To see this, observe that the exponential in the last expectation is bounded above by \(1\), because the function \(\varphi \) is nonnegative. By Lemma 19, for each \(z>0\)

$$\begin{aligned} \lim _{t \rightarrow \infty } t\varphi (\sqrt{t}z)&=0 \quad \text {and}\\ \lim _{t \rightarrow \infty } t\varphi (-\sqrt{t}z)&=\infty \end{aligned}$$

uniformly over the set of all possible subsequential limits \(\varphi (y)=\lim nv_{n} (\sqrt{n}y)\). Therefore, by dominated convergence, as \(t \rightarrow \infty \) through powers of \(2\),

$$\begin{aligned} \lim _{t \rightarrow \infty } E \exp \left\{ -t \int \limits _{-\infty }^{\infty } \varphi (\sqrt{t}z) \,\mathrm {d}Y^x_{1}(z) \right\} = P\{\text {supp} (Y^{x}_{1})\subset (0,\infty ) \}. \end{aligned}$$

\(\square \)

3.5 Proofs of Theorem 3 and Corollary 4

It suffices to show that

$$\begin{aligned} \lim _{n \rightarrow \infty }nv_{n} (x\sqrt{n})=V (x):=\frac{2}{\sigma ^{2}}\bigl (1-G (x)\bigr ), \end{aligned}$$

where \(G (x)\) is defined by (8); Theorem 3 then follows, since \(P\{M_{n}>x\sqrt{n}\,|\,\zeta >n \}=nv_{n} (x\sqrt{n})/ (nP\{\zeta >n \})\rightarrow \sigma ^{2}V (x)/2=1-G (x)\). By Corollary 17, subsequential limits exist, and by Corollary 18 subsequential limits must satisfy the partial differential Eq. (36). What must be shown is that the only possible limit is the function \(V\).

Suppose then that there is a subsequence \(n_{k}\rightarrow \infty \) along which \(nv_{n} (x\sqrt{n})\rightarrow U (x)\) for some function \(U\). By Corollary 17 the sequence \(n_{k}\) has a subsequence \(n_{j} \rightarrow \infty \) along which the functions \((t,x)\mapsto nv_{[nt]} (x\sqrt{n})\) converge uniformly for \(1\le t\le A\) and \(x\in [-\infty ,\infty ]\), for any \(A<\infty \). By rescaling time, we can extract yet another subsequence \(n_{i}\) along which the functions \((t,x)\mapsto nv_{[nt]} (x\sqrt{n})\) converge uniformly for \(t\in [2^{-m},2^{m}]\) and \(x\in [-\infty ,\infty ]\), for any \(m\ge 1\). Denote the limit function by \(\varphi (t,x)\).

By construction, \(\varphi (1,x)=U (x)\). Moreover, for each \(m\) the rescaled function \(\varphi _{m} (t,x):=2^{-m}\varphi (2^{-m}t,x2^{-m/2})\) is again a subsequential limit of the functions \((t,x)\mapsto nv_{[nt]} (x\sqrt{n})\), and \(2^{m}\varphi _{m} (2^{m}+1,x2^{m/2})=\varphi (1+2^{-m},x)\). Consequently, Corollary 22, together with the continuity of \(\varphi \) at \(t=1\), implies that \(U=V\). \(\square \)

3.6 Proof of Lemma 19

The proof will use Theorem 1 and the Dawson-Watanabe theorem. Denote by \(\xi ^{1}_{t},\xi ^{2}_{t},\ldots \) the (counting) measure-valued processes associated with independent branching random walks each started by a single particle at the origin at time \(0\), and each governed by the same offspring distribution and step distribution, with variances \(\sigma ^{2}\) and \(\eta ^{2}\), respectively. Set

$$\begin{aligned} S^{n}_{t}=\sum _{i=1}^{n}\xi ^{i}_{t}. \end{aligned}$$
(42)

Thus, the process \(\{S^{n}_{t} \}_{t\ge 0}\) is a branching random walk initiated by \(n\) particles all located at the origin. (We view the branching random walks as continuous-time processes that are constant on time intervals \([m,m+1)\), with jumps at integer times \(m\)). The Dawson-Watanabe theorem asserts that the process \(\{S^{n}_{t} \}_{t\ge 0}\), after rescaling, converges in law as \(n \rightarrow \infty \) to super-Brownian motion \(X_{t}\) with initial mass distribution \(X_{0}=\delta _{0}\). In particular,

$$\begin{aligned} \frac{1}{n}S^{n}_{nt} (\sqrt{n}\cdot ) \Longrightarrow X_{t} \end{aligned}$$
(43)

where the weak convergence is in the Skorohod topology on the space of cadlag measure-valued processes. The limiting super-Brownian motion has local branching rate \(\sigma ^{2}\) and diffusion coefficient \(\eta ^{2}\).

Super-Brownian motion \(X_{t}\) in one dimension has the property that with probability one, for each \(t>0\) the random measure \(X_{t}\) is absolutely continuous relative to Lebesgue measure. The density \(X (t,x)\) is jointly continuous except at \(t=0\), and for each \(t>0\) has compact support in \(x\); as \(t \rightarrow 0\) the support contracts to the point \(0\). Super-Brownian motion dies out in finite time, so

$$\begin{aligned} M^{X}:=\sup \left\{ x\in \mathbb R\,:\, \int \limits _{0}^{\infty } X_{t}[x,\infty )\, dt>0 \right\} \end{aligned}$$

is well-defined, measurable, and finite. Denote by \(\tau _{x}\) the infimal time \(t\) at which \(X_{t}[x,\infty )>0\); then the event \(\{M^{X}\ge x \}\) coincides a.s. with \(\{\tau _{x}<\infty \}\). The path-continuity properties of the density \(X (t,x)\) imply that for any \(\varepsilon >0\) and any compact interval \([x_{1},x_{2}]\) not containing \(0\) there exist \(0<\delta <\Delta <\infty \) such that for all \(x\in [x_{1},x_{2}]\),

$$\begin{aligned} P (\delta <\tau _{x}<\Delta \,|\, \tau _{x}<\infty )>1-\varepsilon . \end{aligned}$$
(44)

Proposition 23

For any \(x>0\),

$$\begin{aligned} P\{M^{X}\ge x \} =P\{\tau _{x}<\infty \}=1-\exp \{-C/x^{2} \}\quad \text {where} \;\; C= \frac{6\eta ^{2}}{\sigma ^{2}}. \end{aligned}$$
(45)

Proof

By a theorem of Dynkin (see, e.g., [12], Ch. 8) the function

$$\begin{aligned} u(x)=-\log P\{M^{X}<x \} \end{aligned}$$

is the unique solution of the differential equation \(u''=\sigma ^{2}u^{2}/\eta ^{2}\) with boundary conditions \(u (0)=\infty \) and \(u (\infty )=0\). The function \(u (x)=C/x^{2}\) satisfies the equation precisely when \(6C=\sigma ^{2}C^{2}/\eta ^{2}\), that is, when \(C=6\eta ^{2}/\sigma ^{2}\), and it clearly satisfies the boundary conditions; this proves (45). \(\square \)
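
The verification in the last step is elementary; the following minimal symbolic check (a sketch, taking the differential equation in the form displayed above) confirms that \(u (x)=6\eta ^{2}/(\sigma ^{2}x^{2})\) satisfies both the equation and the boundary conditions.

```python
import sympy as sp

# Verify that u(x) = C/x^2 with C = 6*eta^2/sigma^2 solves u'' = sigma^2 u^2/eta^2
# with u(0+) = infinity and u(infinity) = 0, as asserted in the proof.
x, eta, sigma = sp.symbols('x eta sigma', positive=True)
u = (6 * eta**2 / sigma**2) / x**2

print(sp.simplify(sp.diff(u, x, 2) - sigma**2 * u**2 / eta**2))  # 0
print(sp.limit(u, x, 0, '+'), sp.limit(u, x, sp.oo))             # oo 0
```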

The crucial point of Proposition 23 is that the distribution (45) coincides with the limit distribution in Corollary 2. As noted earlier, the weak convergence (43) does not by itself imply that the maximal displacement \(M^{n}\) of the branching random walk \(S^{n}_{t}\) converges weakly (after rescaling) to \(M^{X}\), because in the Dawson-Watanabe scaling (43) individual particles receive vanishingly small mass as \(n \rightarrow \infty \); it does, however, imply that

$$\begin{aligned} \liminf _{n \rightarrow \infty } P\{M^{n}\ge \sqrt{n}x \} \ge P \{M^{X}\ge x \}. \end{aligned}$$

Thus, Proposition 23 and Corollary 2 together imply that the event \(\{M^{n}\ge \sqrt{n}x \}\) is almost entirely accounted for by sample evolutions in which a large number (order \(n\)) of particles reach \(\sqrt{n}x\). Furthermore, by (44), they imply that for large \(n\) the event \(\{M^{n}\ge \sqrt{n}x \}\) is mostly composed of sample evolutions in which particles of the branching random walk reach \([\sqrt{n}x,\infty )\) during the time interval \([n\delta ,n\Delta ]\), where \(0<\delta <\Delta <\infty \) are as in (44). Following is a formal statement of this observation.
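
As a sanity check on this picture, one can compare the limit distribution of Proposition 23 and Corollary 2 against a direct simulation of a critical branching random walk. The following Monte Carlo sketch is illustrative only, not part of the argument: it uses toy parameters (binary fission with offspring variance \(\sigma ^{2}=1\), steps \(\pm 1\) with variance \(\eta ^{2}=1\), hence \(C=6\)), a finite time cap in place of the maximum over all time, and modest sample sizes, so only rough agreement should be expected.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy parameters: critical binary fission (0 or 2 offspring, prob 1/2 each)
# gives offspring variance sigma^2 = 1; steps are +/-1, so step variance
# eta^2 = 1, hence C = 6*eta^2/sigma^2 = 6.
C, n, x, T, trials = 6.0, 50, 3.0, 1000, 200
level = x * np.sqrt(n)

hits = 0
for _ in range(trials):
    pos = np.zeros(n)                      # n initial particles at the origin
    for _ in range(T):                     # time cap T approximates "for all time"
        if pos.size == 0 or pos.max() >= level:
            break
        # each particle leaves 0 or 2 children; each child takes a +/-1 step
        kids = np.repeat(pos, 2 * rng.integers(0, 2, size=pos.size))
        pos = kids + rng.choice([-1.0, 1.0], size=kids.size)
    hits += pos.size > 0 and pos.max() >= level

print("empirical P{M^n >= sqrt(n) x}:", hits / trials)
print("limit 1 - exp(-C/x^2):        ", 1 - np.exp(-C / x**2))
```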

Corollary 24

Denote by \(M^{n}_{t}\) the location of the rightmost particle of the branching random walk \(S^{n}_{t}\) at time \(t\). Then for any \(\varepsilon >0\) and any compact \(K\subset (0,\infty )\) there exist constants \(0<\delta <\Delta <\infty \) such that for all \(x\in K\) and all sufficiently large \(n\),

$$\begin{aligned} P \{M^{n}_{tn}\ge \sqrt{n}x \;\; \text {for some}\;\; t\in [\delta ,\Delta ] \} \ge (1-\varepsilon ) (1-\exp \{-C /x^{2}\}). \end{aligned}$$
(46)

Proof of Lemma 19

Recall that \(v_{n} (y)\) is the probability that the branching random walk \(\xi ^{1}\) has at least one particle to the right of \(y\) at time \(n\). Since the reflection of this branching random walk is again a driftless, critical branching random walk, the relation (37) implies relation (38), so it suffices to prove (37), that is,

$$\begin{aligned} \lim _{x \rightarrow \infty }\sup _{n\ge 1}nx^{2}P \{\xi ^{1}_{n}[\sqrt{n}x,\infty )\ge 1 \}=0 \end{aligned}$$
(47)

We argue by contradiction. Suppose that there exist \(\gamma >0\) and sequences \(n_{k},x_{k}\rightarrow \infty \) along which \(n_{k}x_{k}^{2}P \{\xi ^{1}_{n_{k}}[\sqrt{n_{k}}x_{k},\infty )\ge 1 \}\) remains bounded below by \(\gamma \). Let \(m=m_{k}=[n_{k}x_{k}^{2}]\) and \(\theta =\theta _{k}=1/x_{k}^{2}\), and consider the branching random walk \(S^{m}_{t}\) initiated by \(m\) particles at the origin. The \(m\) constituent single-particle branching random walks are independent, and (ignoring the integer-part rounding) each sends a particle into \([\sqrt{m},\infty )=[\sqrt{n_{k}}x_{k},\infty )\) at time \(m\theta =n_{k}\) with probability at least \(\gamma /m\); hence our hypothesis would imply

$$\begin{aligned} P\{M^{m}_{m\theta }\ge \sqrt{m} \}\ge \gamma '=\frac{1}{2} (1-\exp \{-\gamma \}) >0 \end{aligned}$$
(48)

along the subsequence \(m=m_{k}\), for all sufficiently large \(k\), since \(1- (1-\gamma /m)^{m}\rightarrow 1-e^{-\gamma }\). We will use Corollary 24 to show that (48) is impossible.

Denote by \(A^{m}_{\theta }\) the event that \(M^{m}_{m\theta }\ge \sqrt{m}\). We begin by observing that at time \(m\theta \) the total number of particles in the interval \([\beta \sqrt{m},\infty )\), for any fixed \(\beta >0\), must with high probability be small relative to \(m\). This follows from the Dawson-Watanabe theorem: since \(\theta =\theta _{k} \rightarrow 0\), the chance that the limiting super-Brownian motion puts any mass on the interval \([\beta ,\infty )\) at some time \(t\le \theta \) is vanishingly small, and so for any \(\alpha ,\varepsilon >0\), if \(k\) is sufficiently large then

$$\begin{aligned} P\{S^{m}_{m\theta }[\beta \sqrt{m},\infty )\ge \alpha {m}\}<\varepsilon . \end{aligned}$$
(49)

By reflection, it follows that

$$\begin{aligned} P\{S^{m}_{m\theta } (-\infty ,-\beta \sqrt{m}]\ge \alpha {m}\}<\varepsilon . \end{aligned}$$
(50)

Now fix \(\varepsilon >0\) small, let \(K=[1/2,2]\), and let \(0<\delta <\Delta <\infty \) be as in Corollary 24. Since \(\theta _{k}=x_{k}^{-2} \rightarrow 0\) as \(k \rightarrow \infty \), eventually \(\theta _{k}<\delta /2\). By (49)–(50) (with \(\beta =1/2\)), with probability at least \(1-2\varepsilon \) there will be fewer than \(2\alpha m\) particles outside the interval \([- \sqrt{m}/2, \sqrt{m}/2]\) at time \(\theta m\). Consider the component branching random walks initiated by these particles at time \(\theta m\): if \(\alpha \ll \delta \), then by Kolmogorov’s theorem on the extinction time of a critical Galton-Watson process, the (conditional) probability (given the history of the branching random walk up to generation \([\theta m]\)) that one of these branching random walks survives to time \(\delta m\) is vanishingly small as \(\alpha \rightarrow 0\): each survives for the remaining \(\ge \delta m/2\) generations with probability \(O (1/(\sigma ^{2}\delta m))\), so the expected number of survivors is \(O (\alpha /(\sigma ^{2}\delta ))\). In particular, for suitable \(\alpha \) this probability will be \(<\varepsilon \). Thus, except with probability not exceeding \(3\varepsilon \), on the event \(A^{m}_{\theta }\) the only particles that will survive to time \(\delta m\) are descendants of particles located in \([- \sqrt{m}/2, \sqrt{m}/2]\) at time \(\theta m\).
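
The quantitative content of Kolmogorov’s theorem used here is the asymptotic \(P \{Z_{n}>0 \}\sim 2/(\sigma ^{2}n)\) for a critical Galton-Watson process \(Z_{n}\) with offspring variance \(\sigma ^{2}\). A quick Monte Carlo check, with a toy offspring law chosen for illustration (not from the paper):

```python
import numpy as np

rng = np.random.default_rng(2)

# Kolmogorov's estimate: for a critical Galton-Watson process with offspring
# variance sigma^2, P{Z_n > 0} ~ 2/(sigma^2 n).  Toy check with binary fission
# (0 or 2 children, prob 1/2 each), so sigma^2 = 1.
def survives(n):
    z = 1
    for _ in range(n):
        if z == 0:
            return False
        z = 2 * rng.binomial(z, 0.5)   # each of z individuals has 0 or 2 children
    return z > 0

n, trials = 200, 20000
p_hat = sum(survives(n) for _ in range(trials)) / trials
print("n * P{Z_n > 0} approx:", n * p_hat, " (Kolmogorov: 2/sigma^2 = 2)")
```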

Next, consider the total number \(N^{m}_{m\theta }\) of particles in the branching random walk at time \(\theta m\). By Feller’s theorem, the processes \(N^{m}_{mt}/m\) converge in law to a Feller diffusion \(F_{t}\) started at \(F_{0}=1\). Since diffusion processes have continuous paths, and since \(\theta =\theta _{k} \rightarrow 0\), it follows that

$$\begin{aligned} N^{m}_{m\theta }/m \mathop { \longrightarrow }\limits ^{P}1 \end{aligned}$$

as \(k \rightarrow \infty \). In particular, the probability that \(N^{m}_{m\theta }\ge (1+\varepsilon )m\) will eventually be smaller than \(\varepsilon \).
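
For the same toy offspring law as above, one can watch the Feller approximation at work: started from \(m\) particles, the rescaled population \(N^{m}_{m\theta }/m\) concentrates near \(1\) when \(\theta \) is small, with fluctuations of order \(\sqrt{\sigma ^{2}\theta }\). A hedged simulation sketch (toy parameters):

```python
import numpy as np

rng = np.random.default_rng(3)

# Feller approximation: N^m_{mt}/m is close to 1 for small t.  Toy check with
# critical binary fission started from m particles and theta small.
m, theta, trials = 2000, 0.05, 200
steps = int(m * theta)

ratios = []
for _ in range(trials):
    z = m
    for _ in range(steps):
        z = 2 * rng.binomial(z, 0.5)   # one generation of 0-or-2 branching
    ratios.append(z / m)

print("mean of N/m:", np.mean(ratios), " std:", np.std(ratios))
# The standard deviation is of order sqrt(sigma^2 * theta), hence small with theta.
```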

Finally, consider the event \(F^{m}_{\delta ,\Delta }\) that \(M^{m}_{tm}\ge \sqrt{m}\) for some time \(t\in [\delta ,\Delta ]\). For \(F^{m}_{\delta ,\Delta }\) to occur, one of the particles alive at time \(m\theta \) must have a descendant that reaches the interval \([\sqrt{m},\infty )\) after time \(\delta m\). The probability that such a particle is not located in \([-\sqrt{m}/2,\sqrt{m}/2]\) is \(<3\varepsilon \). Moreover, in order that a particle located in \([-\sqrt{m}/2,\sqrt{m}/2]\) at time \(\theta m\) have a descendant in \([\sqrt{m},\infty )\), the intervening trajectory must travel a distance at least \(\sqrt{m}/2\), and by Theorem 1 the (asymptotic) probability of this is \(<4C/m\), where \(C=6\eta ^{2}/\sigma ^{2}\). But the probability that the number of particles in \([-\sqrt{m}/2,\sqrt{m}/2]\) at time \(\theta m\) exceeds \((1+\varepsilon )m\) is \(<\varepsilon \), and hence, for large \(k\),

$$\begin{aligned} P (F^{m}_{\delta ,\Delta }\,|\,A^{m}_{\theta })\le 4\varepsilon + 1- (1-4C/m)^{(1+\varepsilon )m} \approx 4\varepsilon +1-\exp \{-4C (1+\varepsilon ) \}. \end{aligned}$$

It follows that, provided \(\varepsilon >0\) is small enough that \(\exp \{-4C (1+\varepsilon ) \}-4\varepsilon \ge e^{-8C}\),

$$\begin{aligned} P ((F^{m}_{\delta ,\Delta })^{c}\cap A^{m}_{\theta }) \ge e^{-8C} P (A^{m}_{\theta }). \end{aligned}$$

Since the event \(M^{m}\ge \sqrt{m}\) contains \(F^{m}_{\delta ,\Delta }\cup A^{m}_{\theta }\), we now have, by Corollary 2,

$$\begin{aligned} P\{M^{m}\ge \sqrt{m} \}&\sim 1-e^{-C} \\&\ge P(F^{m}_{\delta ,\Delta })+P ((F^{m}_{\delta ,\Delta })^{c}\cap A^{m}_{\theta }) \\&\ge (1-\varepsilon ) (1-e^{-C})+ e^{-8C}\gamma '. \end{aligned}$$

Since \(\varepsilon >0\) can be chosen smaller than \(e^{-8C}\gamma '/(1-e^{-C})\), the last line then exceeds \(1-e^{-C}\), and we have arrived at a contradiction. \(\square \)