1 Introduction and main results

A fundamental question concerning general spatial branching processes, both superprocesses and branching Markov processes, pertains to their moments. Whilst first and second moments have received quite some attention, relatively little is known about higher moments, in particular their asymptotic behaviour in time. Relevant references that touch upon this topic include [12, 17, 18, 21, 24]. In this paper, we give a single general result that applies to both superprocesses and spatial branching Markov processes and which provides a very precise and somewhat remarkable description of moment growth.

We show that, under the assumption that the first moment semigroup of the process exhibits a natural Perron–Frobenius-type behaviour, the k-th moment functional of either a superprocess or a branching Markov process, when appropriately normalised, limits to a precise constant. The setting in which we work is remarkably general, even allowing for non-local branching; that is, where mass is created at a different point in space to the position of the parent. Moreover, the methodology we use appears to be extremely robust and we show that the asymptotic k-th moments of the running occupation measure are equally accessible using essentially the same approach. Our results thus expand on what is known for branching diffusions and superdiffusions, e.g. in [10, 23], as well as giving precise growth rates for the moments of occupations. In future work we hope to use the ideas in this paper to develop general central limit theorems for the aforesaid class of processes.

To this end, let us spend some time providing the general setting in which we wish to work. Let E be a Lusin space. Throughout, we will write B(E) for the Banach space of bounded measurable functions on E with norm \(\Vert \cdot \Vert \), \(B^{+}(E)\) for non-negative bounded measurable functions on E and \(B^{+}_1(E)\) for the subset of functions in \(B^{+}(E)\) which are uniformly bounded by unity. We are interested in spatial branching processes that are defined in terms of a Markov process and a branching operator. The former can be characterised by a semigroup on E, denoted by \(\texttt {P}=(\texttt {P}_t, t\ge 0)\). We do not need \(\texttt {P}\) to have the Feller property, and it is not necessary that \(\texttt {P}\) is conservative. That said, if so desired, we can append a cemetery state \(\{\dagger \}\) to E, which is to be treated as an absorbing state, and regard \(\texttt {P}\) as conservative on the extended space \(E\cup \{\dagger \}\), which can also be treated as a Lusin space. Equally, we can extend the branching operator to \(E\cup \{\dagger \}\) by defining it to be zero on \(\{\dagger \}\), i.e. no branching activity on the cemetery state.

1.1 Branching Markov processes

Consider now a spatial branching process in which, given their point of creation, particles evolve independently according to a \(\texttt {P}\)-Markov process. In an event which we refer to as ‘branching’, particles positioned at x die at rate \(\beta (x)\), where \(\beta \in B^+(E)\), and instantaneously, new particles are created in E according to a point process. The configurations of these offspring are described by the random counting measure

$$\begin{aligned} {\mathcal {Z}}(A) = \sum _{i = 1}^N \delta _{x_i}( A), \end{aligned}$$

for Borel A in E. The law of the aforementioned point process depends on x, the point of death of the parent, and we denote it by \({\mathcal {P}}_x\), \(x\in E\), with associated expectation operator given by \({\mathcal {E}}_x\), \(x\in E\). This information is captured in the so-called branching mechanism

$$\begin{aligned} \texttt {G}[f](x) := \beta (x){\mathcal {E}}_x\left[ \prod _{i = 1}^N f(x_i) - f(x)\right] , \qquad x\in E, \end{aligned}$$
(1)

where we recall \( f\in B^+_1(E): = \{f\in B^+(E):\sup _{x\in E}f(x)\le 1\}\). Without loss of generality we can assume that \({\mathcal {P}}_x(N =1) = 0\) for all \(x\in E\) by viewing a branching event with one offspring as an extra jump in the motion. On the other hand, we do allow for the possibility that \({\mathcal {P}}_x(N =0)>0\) for some or all \(x\in E\).

Henceforth we refer to this spatial branching process as a \((\texttt {P}, \texttt {G})\)-branching Markov process. It is well known that if the configuration of particles at time t is denoted by \(\{x_1(t), \ldots , x_{N_t}(t)\}\), then, on the event that the process has not become extinct or exploded, the branching Markov process can be described as the co-ordinate process \(X= (X_t, t\ge 0)\) in the space of atomic measures on E with non-negative integer total mass, denoted by N(E), where

$$\begin{aligned} X_t (\cdot ) = \sum _{i =1}^{N_t}\delta _{x_i(t)}(\cdot ), \qquad t\ge 0. \end{aligned}$$

In particular, X is Markovian in N(E). Its probabilities will be denoted \({\mathbb {P}}: = ({\mathbb {P}}_\mu , \mu \in N(E))\). With this notation in hand, it is worth noting that the independence that is manifest in the definition of branching events and movement implies that if we define,

$$\begin{aligned} \texttt {v}_t[f](x) = {\mathbb {E}}_{\delta _x}\left[ \prod _{i = 1}^{N_t} f(x_i(t))\right] , \qquad f\in B^+_1(E),t\ge 0, \end{aligned}$$
(2)

then for \(\mu \in N(E)\) given by \(\mu = \sum _{i =1}^n\delta _{y_i}\), we have

$$\begin{aligned} {\mathbb {E}}_{\mu }\left[ \prod _{i = 1}^{N_{t}} f(x_i(t))\right] = \prod _{i = 1}^{n}\texttt {v}_t[f](y_i), \qquad t\ge 0. \end{aligned}$$
(3)

Moreover, for \(f\in B^+(E)\) and \(x\in E\),

$$\begin{aligned} \texttt {v}_t[f](x) = {{{\hat{\texttt {P}}}}_t[f](x)} + \int _0^t \texttt {P}_s\left[ \texttt {G}[\texttt {v}_{t-s}[f]]\right] (x)\mathrm{d}s, \qquad t\ge 0, \end{aligned}$$
(4)

where \({{\hat{\texttt {P}}}}_t\) is defined similarly to \(\texttt {P}_t\) but returns a value of 1 on the event of killing. The above equation describes the evolution of the semigroup \(\texttt {v}_t\): either the initial particle has not branched (and has possibly been absorbed) by time t, or at some time \(s \le t\) the initial particle has branched, producing offspring according to \(\texttt {G}\). We refer the reader to [19, 22] for a proof.

Branching Markov processes enjoy a very long history in the literature, dating back as far as [30,31,32], with a broad base of literature that is arguably too voluminous to give a fair summary of here. Most literature focuses on the setting of local branching. This corresponds to the setting that all offspring are positioned at their parent’s point of death (i.e. \(x_i = x\) in the definition of \(\texttt {G}\)). In that case, the branching mechanism reduces to

$$\begin{aligned} \texttt {G}[s](x) = \beta (x)\left[ \sum _{k =0}^\infty p_{k}(x)s^k - s\right] , \qquad x\in E, \end{aligned}$$

where \(s\in [0,1]\) and \((p_k(x), k\ge 0)\) is the offspring distribution when a parent branches at site \(x\in E\). The branching mechanism \(\texttt {G}\) may otherwise be seen in general as a mixture of local and non-local branching.

1.2 Superprocesses

Superprocesses can be thought of as high-density limits of sequences of branching Markov processes, resulting in a new family of measure-valued Markov processes; see e.g. [6, 7, 13, 26, 34]. Just as branching Markov processes are Markovian in N(E), superprocesses are Markovian in the space of finite Borel measures on E topologised by the weak convergence topology, denoted by M(E). There is a broad literature base for superprocesses, e.g. [6, 15, 17, 26, 34], with so-called local branching mechanisms, later broadened to the more general setting of non-local branching mechanisms in [7, 26]. Let us now introduce these concepts with an autonomous definition of what we mean by a superprocess.

A Markov process \(X : = (X_t:t\ge 0)\) with state space M(E) and probabilities \({\mathbb {P}} := ({\mathbb {P}}_\mu , \mu \in M(E))\) is called a \((\texttt {P},\psi ,\phi )\)-superprocess if it has transition semigroup \((\hat{{\mathbf {E}}}_t, t\ge 0)\) on M(E) satisfying

$$\begin{aligned} {\mathbb {E}}_\mu \left[ \mathrm{e}^{-\left\langle f,X_t\right\rangle }\right] = \int _{M(E)}\mathrm{e}^{-\langle f,\nu \rangle }\hat{{\mathbf {E}}}_t(\mu ,\mathrm{d}\nu )=\mathrm{e}^{-\langle {\texttt {V}}_t[f], \mu \rangle }, \qquad \mu \in M(E), f\in B^{+}(E). \end{aligned}$$
(5)

Here, we work with the inner product on \(B^+(E)\times M(E)\) defined by \(\langle f, \mu \rangle = \int _Ef(x)\mu (\mathrm{d}x)\) and \(({\texttt {V}}_t, t\ge 0)\) is a semigroup evolution that is characterised via the unique bounded positive solution to the evolution equation

$$\begin{aligned} {\texttt {V}}_t[f](x)=\texttt {P}_t[f](x)-\int _{0}^{t}\texttt {P}_s\left[ \psi (\cdot ,{\texttt {V}}_{t-s}[f](\cdot ))+\phi (\cdot ,{\texttt {V}}_{t-s}[f])\right] (x)\mathrm{d}s. \end{aligned}$$
(6)

Here \(\psi \) denotes the local branching mechanism

$$\begin{aligned} \psi (x,\lambda )=-b(x) \lambda + c(x)\lambda ^2 +\int _{(0,\infty )}\left( \mathrm{e}^{-\lambda y}-1+\lambda y\right) \nu (x,\mathrm{d}y),\;\;\;\lambda \ge 0, \end{aligned}$$
(7)

where \(b\in B(E)\), \(c\in B^+(E)\) and \((y\wedge y^2)\nu (x, \mathrm{d}y)\) is a bounded kernel from E to \((0,\infty )\), and \(\phi \) is the non-local branching mechanism

$$\begin{aligned} \phi (x,f)=\beta (x)\left( f(x)-\zeta (x,f)\right) , \end{aligned}$$
(8)

where \(\beta \in B^+(E)\) and \(\zeta \) has representation

$$\begin{aligned} \zeta (x,f)=\gamma (x,f)+\int _{M(E)^{\circ }}(1-\mathrm{e}^{-\left\langle f,\nu \right\rangle })\Gamma (x,\mathrm{d}\nu ), \end{aligned}$$
(9)

such that \(\gamma (x,f)\) is a bounded function on \(E\times B^+(E)\) and \(\nu (1)\Gamma (x,\mathrm{d}\nu )\) is a bounded kernel from E to \(M(E)^{\circ }:=M(E){\setminus }\left\{ 0\right\} \) with

$$\begin{aligned} \gamma (x,f)+\int _{M(E)^{\circ }}\left\langle 1,\nu \right\rangle \Gamma (x,\mathrm{d}\nu )\le 1. \end{aligned}$$
(10)

We refer the reader to [7, 27] for more details regarding the above formulae. Lemma 3.1 in [7] tells us that the functional \(\zeta (x,f)\) has the following equivalent representation

$$\begin{aligned} \zeta (x,f)=\int _{M_0(E)}\left[ \gamma (x,\pi )\left\langle f,\pi \right\rangle +\int _{0}^{\infty }\left( 1-\mathrm{e}^{-u\left\langle f,\pi \right\rangle }\right) n(x,\pi ,\mathrm{d}u)\right] G(x,\mathrm{d}\pi ), \end{aligned}$$
(11)

where \(M_0(E)\) denotes the set of probability measures on E, \(\gamma \ge 0\) is a bounded function on \(E\times M_0(E)\), \(un(x,\pi ,\mathrm{d}u)\) is a bounded kernel from \(E\times M_0(E)\) to \((0,\infty )\) and \(G(x,\mathrm{d}\pi )\) is a probability kernel from E to \(M_0(E)\) with

$$\begin{aligned} \gamma (x,\pi )+\int _{0}^{\infty }un(x,\pi ,\mathrm{d}u)\le 1. \end{aligned}$$
(12)

The reader will note that we have deliberately used some of the same notation for both branching Markov processes and superprocesses. In the sequel there should be no confusion and the motivation for this choice of repeated notation is that our main result is indifferent to which of the two processes we are talking about.

1.3 Main results: k-th moments

As alluded to above, in what follows, \((X, {\mathbb {P}})\) is taken as either a branching Markov process or a superprocess as defined in the previous section. Our main results concern understanding the growth of the k-th moment functional in time

$$\begin{aligned} \texttt {T}_t^{(k)}[f](x):={\mathbb {E}}_{\delta _x}\left[ \left\langle f,X_t\right\rangle ^k\right] , \qquad x\in E, f\in B^+(E), k\ge 1, t \ge 0. \end{aligned}$$
(13)

For convenience, we will write \(\texttt {T}\) in preference to \(\texttt {T}^{(1)}\) throughout.

Before stating our main theorem, we first introduce some assumptions that will be crucial in analysing the moments defined above. First, we have a Perron–Frobenius-type assumption.

(H1): There exists an eigenvalue \(\lambda \in {\mathbb {R}}\) and a corresponding right eigenfunction \(\varphi \in B^+(E)\) and finite left eigenmeasure \({\tilde{\varphi }}\) such that, for \(f\in B^+(E)\),

$$\begin{aligned} \langle \texttt {T}_t[\varphi ] , \mu \rangle = \mathrm{e}^{\lambda t}\langle {\varphi },{\mu }\rangle \text { and } \langle {\texttt {T}_t[f] },{{\tilde{\varphi }}} \rangle = \mathrm{e}^{\lambda t}\left\langle f,{\tilde{\varphi }}\right\rangle , \end{aligned}$$

for all \(\mu \in N(E)\) (resp. M(E)) if \((X, {\mathbb {P}})\) is a branching Markov process (resp. a superprocess). Further let us define

$$\begin{aligned} \Delta _t = \sup _{x\in E, f\in B^+_1(E)}|\varphi (x)^{-1}\mathrm{e}^{-\lambda t}\texttt {T}_t\left[ f\right] (x)-\left\langle \tilde{\varphi },f\right\rangle | , \qquad t\ge 0. \end{aligned}$$

We suppose that

$$\begin{aligned} \sup _{t\ge 0}\Delta _t <\infty \text { and }\lim _{t\rightarrow \infty } \Delta _t=0. \end{aligned}$$
(14)
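Though (H1) is posed on a general Lusin space, its content is easy to visualise in finite state spaces, where it reduces to classical Perron–Frobenius theory for the matrix exponential of the mean generator. The following minimal sketch (illustrative only; the two-point model with a rate-1 flip chain, the branching rates and the offspring means are all arbitrary choices, and it uses numpy/scipy rather than anything from this paper) computes an eigentriple \((\lambda , \varphi , {\tilde{\varphi }})\) numerically and tracks the decay in (14) for the indicator test functions.

```python
# Numerical illustration of (H1)/(14) on a two-point space E = {0, 1}:
# the motion is a rate-1 flip chain and branching is local, so the mean
# semigroup is the matrix exponential T_t = exp(t(Q + diag(B))) with
# B(x) = beta(x)(m1(x) - 1).  All parameter values are arbitrary.
import numpy as np
from scipy.linalg import expm

Q = np.array([[-1.0, 1.0], [1.0, -1.0]])             # flip-chain generator
B = np.diag([1.0 * (1.0 - 1.0), 2.0 * (1.4 - 1.0)])  # beta(x)(m1(x) - 1)
A = Q + B

vals, R = np.linalg.eig(A)
i = int(np.argmax(vals.real))
lam = vals.real[i]
phi = np.abs(R[:, i].real)                           # right eigenfunction
valsL, Lv = np.linalg.eig(A.T)
j = int(np.argmax(valsL.real))
tilde = np.abs(Lv[:, j].real)
tilde /= tilde @ phi                                 # normalise <phi, tilde> = 1

for t in (1.0, 5.0, 20.0):
    Tt = expm(t * A)                                 # mean semigroup as a matrix
    err = max(
        np.max(np.abs(np.exp(-lam * t) * (Tt @ f) / phi - tilde @ f))
        for f in np.eye(2)                           # indicator test functions
    )
    print(t, err)                                    # err decreases towards 0
```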

Before explaining the heuristics behind (H1), we state our second assumption, which is a moment condition on the offspring distribution.

(H2): Suppose \(k \ge 1\). If \((X, {\mathbb {P}})\) is a branching Markov process,

$$\begin{aligned} \sup _{x\in E}{\mathcal {E}}_x(\langle 1, {\mathcal {Z}}\rangle ^k) <\infty \end{aligned}$$
(15)

and if \((X, {\mathbb {P}})\) is a superprocess,

$$\begin{aligned} \sup _{x\in E} \left( \int _0^\infty |y|^k \nu (x,\mathrm{d}y) + \int _{M(E)^\circ } \langle 1, \nu \rangle ^k \Gamma (x,\mathrm{d}\nu )\right) <\infty . \end{aligned}$$
(16)

Let us spend a little time considering these two assumptions in more detail. Much of the literature surrounding spatial branching processes emphasises results in which an underlying assumption of ergodic exponential growth of the first moment, as in (H1), is present; see e.g. [1, 16, 19, 22, 27, 29]. Accordingly, we may characterise the process as supercritical if \(\lambda > 0\), critical if \(\lambda = 0\) and subcritical if \(\lambda < 0\).

One way to understand (14) is through the martingale that comes hand-in-hand with the eigenpair \((\lambda , \varphi )\), i.e.

$$\begin{aligned} M_t^\varphi : = \mathrm{e}^{-\lambda t}\langle \varphi , X_t\rangle , \qquad t\ge 0. \end{aligned}$$
(17)

Normalising this martingale and using it as a change of measure results in the ubiquitous spine decomposition; cf. [21, 22, 29]. Roughly speaking, under the change of measure, the process is equal in law to a copy of the original process with a superimposed process of immigration, which occurs both in space and time along the path of a single particle trajectory in E, the spine. Moreover, the assumption (14) implies that the spine has a stationary limit with stationary measure \(\varphi {\tilde{\varphi }}\).

The assumptions (15) and (16) of (H2) are natural to ensure that the k-th moments are well defined for all \(t\ge 0\). Even where not explicitly stated in the literature, the fact that they are needed to ensure that the functional moments \(\texttt {T}^{(k)}_t[f](x)\) are finite for all \(t\ge 0\), \(f\in B^+(E)\) and \(x\in E\) is certainly folklore. The two conditions (15) and (16) are clearly natural analogues of one another. Indeed, whereas for superprocesses it is usual to separate out the non-diffusive local branching behaviour from the non-local behaviour, i.e. via the measures \(\nu (x,\mathrm{d}y)\) and \(\Gamma (x, \mathrm{d}\nu )\), the analogous behaviour is captured in the single point process \({\mathcal {Z}}\) for branching Markov processes.

In terms of the eigenvalue in (H1), the following suite of results gives us the precise growth rates for k-th moments in each of the critical (\(\lambda = 0\)), supercritical (\(\lambda >0\)) and subcritical (\(\lambda <0\)) settings. In all three results, \((X, {\mathbb {P}})\) is either a \((\texttt {P},\texttt {G})\)-branching particle system or a \((\texttt {P},\psi ,\phi )\)-superprocess on E.

Theorem 1

(Critical, \(\lambda = 0\)) Suppose that (H1) holds along with (H2) for some \(k \ge 2\) and \(\lambda = 0\). Define

$$\begin{aligned} \Delta _t^{(\ell )} = \sup _{x\in E, f\in B^+_1(E)}\left| t^{-(\ell -1)} \varphi (x)^{-1}\texttt {T}^{(\ell )}_t[f](x) - 2^{-(\ell -1)} \ell ! \,\left\langle f,\tilde{\varphi }\right\rangle ^\ell \langle {\mathbb {V}}[\varphi ], {\tilde{\varphi }}\rangle ^{\ell -1}\right| , \end{aligned}$$

where

$$\begin{aligned} {\mathbb {V}}[\varphi ](x) = \beta (x){\mathcal {E}}_{x}\left( \langle \varphi , {\mathcal {Z}}\rangle ^2 - \langle \varphi ^2, {\mathcal {Z}}\rangle \right) , \end{aligned}$$

if \((X, {\mathbb {P}})\) is a branching Markov process or

$$\begin{aligned} {\mathbb {V}}[\varphi ](x) = \psi ''(x,0+)\varphi (x)^2+ \beta (x)\int _{M(E)^\circ } \langle \varphi , \nu \rangle ^2 \Gamma (x,\mathrm{d}\nu ) \end{aligned}$$

if \((X, {\mathbb {P}})\) is a superprocess. Then, for all \(\ell \le k\)

$$\begin{aligned} \sup _{t\ge 0} \Delta _t^{(\ell )}<\infty \text { and } \lim _{t\rightarrow \infty } \Delta _t^{(\ell )} =0. \end{aligned}$$
(18)

The novel contribution of Theorem 1 is the fact that no such result currently exists in the literature at this level of generality. The only comparable results appear in [11], which uses similar methods to derive the convergence for the critical Crump–Mode–Jagers process, and in [20], which inspired this paper but deals only with general critical branching particle processes in the special case where the test function f is taken to be the eigenfunction \(\varphi \). We will discuss these examples in more detail after stating our main results. We have been unable to find comparable results for superprocesses.

There are two facts that stand out in Theorem 1. The first is the polynomial scaling, which is quite a delicate conclusion given that there is no exponential growth to rely on. The second is that, for \(k\ge 3\), the scaled moment limit is expressed not in terms of the k-th moments in (15) and (16), but rather the second order moments.

In some sense, however, both the polynomial growth and the nature of the limiting constant are not entirely surprising given the folklore for the critical setting. More precisely, in at least some settings (see e.g. [20]), one would expect to see a Yaglom-type result at criticality. The latter would classically see, conditional on survival, convergence in law of \(t^{-1}\langle f, X_t\rangle \) to an exponentially distributed random variable as \(t\rightarrow \infty \), whose parameter is entirely determined by the second moment of X. This implies that, conditional on survival, the limit of the k-th moment of \(\langle f, X_t\rangle \) behaves like \(k! t^k c^k\) (i.e. the k-th moment of the aforesaid exponential distribution) as \(t \rightarrow \infty \), where \(c>0\) is written in terms of a second moment functional of the spatial offspring distribution. One can also expect a Kolmogorov-type result for the survival probability, which says that

$$\begin{aligned} {\mathbb {P}}_{\delta _x}(N_t > 0) \sim c\varphi (x)t^{-1},\qquad \text { as }t\rightarrow \infty , \end{aligned}$$
(19)

where we recall that \(N_t = \langle 1, X_t\rangle \) is the total mass of the process at time t; see for example [20] for the particle system setting and [29] in the superprocess setting. Combining this with the decomposition

$$\begin{aligned} \texttt {T}_t^{(k)}[f](x) = {\mathbb {E}}_{\delta _x}[\langle f, X_t\rangle ^{k} | N_t> 0]{\mathbb {P}}_{\delta _x}(N_t > 0) \end{aligned}$$
(20)

implies that the k-th moment behaves like \(\varphi (x) t^{k-1}k! c^k\), which is precisely the result given in Theorem 1.
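As a quick symbolic check of the arithmetic behind this heuristic (illustrative only, using sympy): the k-th moment of an exponentially distributed random variable with mean m is \(k!\,m^k\), which is the source of the \(k! t^k c^k\) behaviour above once the Yaglom scaling by t is taken into account.

```python
# Check that the k-th moment of an exponential with mean m is k! m^k.
import sympy as sp

y, m = sp.symbols('y m', positive=True)
for k in range(1, 5):
    moment = sp.integrate(y**k * sp.exp(-y / m) / m, (y, 0, sp.oo))
    print(k, sp.simplify(moment - sp.factorial(k) * m**k))  # prints 0
```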

It is also curious that, at criticality, all higher moments can be expressed in terms of the constant c, which is written in terms of a second moment functional of the spatial offspring distribution. We can give some intuition here, at least in the branching Markov setting, as to why this is the case. Let us recall the classical folklore that conditioning on survival is equivalent to studying the process under the change of measure

$$\begin{aligned} \left. \frac{\mathrm{d} {\mathbb {P}}^\varphi _{\delta _x}}{\mathrm{d} {\mathbb {P}}_{\delta _x}} \right| _{\sigma (X_s, s\le t)}= \frac{\langle \varphi , X_t\rangle }{\varphi (x)},\qquad x\in E, t\ge 0. \end{aligned}$$
(21)

The above change of measure (21) yields the classical spine decomposition, cf. [20]. In particular, this means that we can write, for \(f\in B^+(E)\),

$$\begin{aligned} \langle f, X_t\rangle = f(\xi _t) + \sum _{i = 1}^{n_t}\sum _{j = 1}^{N_i} \langle f, X_{t-T_i}^{i, j}\rangle , \end{aligned}$$
(22)

where \(\xi = (\xi _t, t \ge 0)\) is the motion of an immortal particle, called the spine, whose semigroup is conservative, \(n_t\) is the number of branching events along the spine, which arrive at a rate that depends on the motion of \(\xi \), and \(N_i\) is the number of offspring produced at the i-th such branching event, which occurs at time \(T_i\le t\); the \(X_{t-T_i}^{i, j}\), \(j = 1, \ldots , N_i\), are i.i.d. copies of the original branching Markov process under \({\mathbb {P}}_{\xi _{T_i}}\), which provide mass at time t. In other words, this means that under \({\mathbb {P}}^\varphi \), the process X can be decomposed into a single immortal trajectory, off which copies of the original process \((X, {\mathbb {P}})\) immigrate simultaneously in groups of siblings.

With this in mind, let us consider genealogical lines of descent that contribute to the bulk of the mass of the k-th moment at large times t. For each copy of \((X, {\mathbb {P}})\) that immigrates onto the spine at time \(s>0\), the probability that the process survives to time \(t \ge s\), thus contributing to the bulk of the k-th moment at time t, is \(O(1/(t-s))\approx O(1/t)\); cf. (19). If there are multiple offspring at an immigration event at time s, then the chance that at least two of these offspring contribute to the bulk of the k-th moment at time t is \(O(1/t^2)\). Moreover, the semigroup of the spine \(\xi \) is given by \(\texttt {P}^\varphi _t[f](x) =\texttt {P}_t[\varphi f](x)/\varphi (x)\), which, under (H1), limits to a stationary distribution \(\varphi (x){\tilde{\varphi }}(\mathrm{d}x)\), \(x\in E\). This has the effect that the arrival of branching events along the spine begins to look increasingly like a Poisson process as \(t\rightarrow \infty \). Hence, for large t, \(n_t \approx O(t)\).

Putting these pieces together, as \(t\rightarrow \infty \), there are approximately O(t) branch points along the spine, at each of which the most likely contribution to the bulk of the k-th moment at time t comes from a single offspring among the immigrating siblings, with probability of order O(1/t). Thus, it is clear that we only expect to see one member of each sibling group of immigrants along the spine contributing to the mass of the k-th moment at time t. Now let \(\beta ^\varphi \) denote the spatial rate at which offspring immigrate onto the spine and let \(\{x_1, \ldots , x_N\}\) denote their positions at the point of branching, including the position of the spine at this instant. Let \({\mathcal {P}}^\varphi \) denote the law of this offspring distribution, and suppose that \(i^*\) is the (random) index of the offspring that continues the evolution of the spine. The rate at which a single offspring that is not the spine is ‘uniformly selected’ at a branching event (seen through the function \(f\in B^+(E)\)) is then given by

$$\begin{aligned}&\beta ^\varphi (x) {\mathcal {E}}^\varphi _x\left[ \sum _{i =1}^N f(x_i){\mathbf {1}}_{(i\ne i^*)}\right] \nonumber \\&\quad = \beta (x)\frac{{\mathcal {E}}_{x}\left[ \langle \varphi , {\mathcal {Z}}\rangle \right] }{\varphi (x)}{\mathcal {E}}_{x}\left[ \frac{\langle \varphi , {\mathcal {Z}}\rangle }{ {\mathcal {E}}_{x}\left[ \langle \varphi , {\mathcal {Z}}\rangle \right] } \sum _{i = 1}^N \frac{\varphi (x_i)}{\langle \varphi , {\mathcal {Z}}\rangle } \sum _{\begin{array}{c} j = 1 \\ j\ne i \end{array}}^N f(x_j) \right] \nonumber \\&\quad =\frac{\beta (x)}{\varphi (x)}{\mathcal {E}}_{x}\left[ \sum _{i = 1}^N \varphi (x_i) \sum _{\begin{array}{c} j = 1 \\ j\ne i \end{array}}^N f(x_j) \right] \nonumber \\&\quad = \frac{\beta (x)}{\varphi (x)}{\mathcal {E}}_{x}\left[ \langle f, {\mathcal {Z}}\rangle \langle \varphi , {\mathcal {Z}}\rangle - \langle \varphi f, {\mathcal {Z}}\rangle \right] , \end{aligned}$$
(23)

where we have used, from [20], that \(\beta ^\varphi (x) = \beta (x){{\mathcal {E}}_{x}\left[ \langle \varphi , {\mathcal {Z}}\rangle \right] }/{\varphi (x)}\), \({\mathcal {P}}^\varphi _x\) is absolutely continuous with respect to \({\mathcal {P}}_x\) with density \({\langle \varphi , {\mathcal {Z}}\rangle }/{ {\mathcal {E}}_{x}\left[ \langle \varphi , {\mathcal {Z}}\rangle \right] }\) and, given \(\{x_1, \ldots , x_N\}\), \(i^* = i\) is empirically selected with probability proportional to \(\varphi (x_i)\).

We know from the setting of first moments that it is the projection of \(\langle f, X_t \rangle \) onto \(\langle \varphi , X_t\rangle \), with coefficient \(\langle f, {\tilde{\varphi }}\rangle \), which dominates the mean growth; indeed, a strong law of large numbers exists to this effect as well, see e.g. [19]. In this spirit, let us take \(f = \varphi \) for simplicity, in which case (23) gives precisely \({\mathbb {V}}[\varphi ](x)/\varphi (x)\) on the right-hand side. Hence, finally, we conclude our heuristic by observing that the rate at which immigration off the spine contributes to the bulk of the k-th moment limit of \(\langle \varphi , X_t\rangle \) is determined by the second moment functional \({\mathbb {V}}[\varphi ]\); together with (20) and the associated remarks above, this goes some way to explaining the appearance of the limit in Theorem 1.

The next results present a significantly different picture for the supercritical and subcritical cases. For those settings, the exponential behaviour of the first moment semigroup becomes a dominant feature of the higher moments.

Theorem 2

(Supercritical, \(\lambda > 0\)) Suppose that (H1) holds along with (H2) for some \(k \ge 2\) and \(\lambda > 0\). Redefine

$$\begin{aligned} \Delta _t^{(\ell )} = \sup _{x\in E, f\in B^+_1(E)}\left| \varphi (x)^{-1}\mathrm{e}^{-\ell \lambda t}\texttt {T}^{(\ell )}_t[f](x) - \,\ell !\left\langle f,\tilde{\varphi }\right\rangle ^\ell L_\ell \right| , \end{aligned}$$

where \(L_1 = 1\) and we define iteratively for \(k \ge 2\)

$$\begin{aligned} L_k = \frac{1}{\lambda (k - 1)}\left\langle \beta {\mathcal {E}}_{\cdot }\bigg [ \sum _{[k_1, \ldots , k_N]_k^{2}}\prod _{\begin{array}{c} j = 1 \\ j: k_j > 0 \end{array}}^N\varphi (x_j)L_{k_j}\bigg ], {{\tilde{\varphi }}}\right\rangle , \end{aligned}$$

where \([k_1, \ldots , k_N]_k^2\) is the set of all non-negative N-tuples \((k_1, \ldots , k_N)\) such that \(\sum _{i = 1}^N k_i = k\) and at least two of the \(k_i\) are strictly positive, if \((X, {\mathbb {P}})\) is a branching Markov process, or

$$\begin{aligned} L_k(x)&=\sum _{\left\{ m_1,\ldots ,m_{k-1}\right\} _{k}}\frac{1}{m_1!\ldots m_{k-1}!}(m_1+\cdots +m_{k-1}-1)!\varphi (x)^{m_1+\cdots +m_{k-1}-1}\\&\quad \quad \prod _{j=1}^{k-1}\left( -L_j(x)\right) ^{m_j}+\frac{1}{\lambda (k-1)}\left\langle {\mathbb {V}}_k\left[ \varphi \right] ,\tilde{\varphi }\right\rangle \end{aligned}$$

and iteratively with \({\mathbb {V}}_2\left[ \varphi \right] (x)={\mathbb {V}}\left[ \varphi \right] (x)\) (defined in the previous theorem) and for \(k\ge 3\)

$$\begin{aligned} {\mathbb {V}}_k[\varphi ](x)&= \sum _{\left\{ m_1,\ldots ,m_{k-1}\right\} _k}\frac{1}{m_1!\ldots m_{k-1}!}\prod _{j=2}^{k-1}\left( \frac{\left\langle {\mathbb {V}}_j\left[ \varphi \right] ,\tilde{\varphi }\right\rangle }{\lambda (j-1)}\right) ^{m_j}\\&\qquad \times \left[ \psi ^{(m_1+\cdots +m_{k-1})}(x,0+)(-\varphi (x))^{m_1+\cdots +m_{k-1}}+\beta (x)\right. \\&\left. \qquad \int _{M(E)^{\circ }}\left\langle \varphi ,\nu \right\rangle ^{m_1+\cdots +m_{k-1}}\Gamma (x,d\nu )\right] \end{aligned}$$

if \((X, {\mathbb {P}})\) is a superprocess. Here the sums run over the set \(\left\{ m_1,\ldots ,m_{k-1}\right\} _k\) of non-negative integers such that \(m_1+2m_2+\cdots +(k-1)m_{k-1}=k\). Then, for all \(\ell \le k\)

$$\begin{aligned} \sup _{t\ge 0} \Delta _t^{(\ell )}<\infty \text { and } \lim _{t\rightarrow \infty } \Delta _t^{(\ell )} =0. \end{aligned}$$
(24)

Although we have not been able to find existing results of this kind in such generality for supercritical processes, we note that the asymptotics of branching diffusions with either constant or compactly supported branching potentials were studied in [25]. Moreover, the asymptotics for the first and second moments of age-dependent branching processes were considered in [4] and [5], respectively. While Jensen’s inequality easily shows that \(\mathrm{e}^{k\lambda t}\) is the minimal rate of growth, it turns out to be the exact rate of growth, implying that the k-th moment grows as the k-th power of the first moment, i.e. using the terminology of [35], there is no intermittency. If we again appeal to folklore, this is not entirely surprising. In a number of settings, we would expect X to obey a strong law of large numbers (cf. [1, 16, 19, 27]) in the sense that

$$\begin{aligned} \lim _{t\rightarrow \infty } \mathrm{e}^{-\lambda t}\langle f, X_t\rangle = \langle {\tilde{\varphi }}, f\rangle M^\varphi _\infty , \end{aligned}$$

where \((M^\varphi _t, t\ge 0)\) was defined in (17) and the limit holds either almost surely or in the sense of \(L^p\) moments, for \(p>1\). Moreover, returning to the heuristic involving the spine decomposition discussed after Theorem 1, in this case the copies of the original process that branch off the spine are all supercritical, so we expect them all to survive to time t; this explains why we see the sum over tuples with at least two positive entries in the recursion for \(L_k\) (the argument for not seeing precisely one still holds).

It is also worth commenting on the difference in the dependence on x in the limit for the two processes. In the case of the branching Markov process, the dependence on x appears through the normalisation by \(\varphi (x)\), which arises from the assumption (H1). For superprocesses, this dependence is much more complicated. Thinking in terms of what is known as the skeletal decomposition (see e.g. [2, 14]), the superprocess issued from a unit mass at x can be seen at time \(t>0\) as the aggregation of a Poisson point process of ‘superprocess excursions’ conditioned to survive beyond time t, together with a copy of the superprocess conditioned to become extinct by time t. In the supercritical setting, a finite Poisson number of these excursions will contribute to the overall growth of the process as \(t\rightarrow \infty \), with a rate proportional to \(p(x)\delta _x\), where p(x) is the probability of survival of an excursion issued from \(x\in E\). Moreover, in terms of mass, each of the excursions is of the same order of magnitude as an analogous branching particle system. Taking k-th moments of the x-dependent Poisson sum of such excursions introduces an additional layer of complexity to its asymptotic behaviour and, specifically, to the dependency on x. This may go part way to explaining the dependency of \(L_k\) on x in that setting.

Finally we turn to the growth of moments in the subcritical setting, which offers the heuristically appealing result that the k-th moment decays at the same rate as the first moment, rather than at its k-th power.

Theorem 3

(Subcritical, \(\lambda < 0\)) Suppose that (H1) holds along with (H2) for some \(k \ge 2\) and \(\lambda < 0\). Redefine

$$\begin{aligned} \Delta _t^{(\ell )} = \sup _{x\in E, f\in B^+_1(E)}\left| \varphi (x)^{-1}\mathrm{e}^{-\lambda t}\texttt {T}^{(\ell )}_t[f](x)- \ell !\left\langle f,\tilde{\varphi }\right\rangle ^\ell L_\ell \right| , \end{aligned}$$

where we define iteratively \(L_1 = 1\) and for \(k \ge 2\),

$$\begin{aligned} L_k = \frac{\langle f^k, {\tilde{\varphi }}\rangle }{\left\langle f,\tilde{\varphi }\right\rangle ^k k!} + \left\langle \beta {\mathcal {E}}_{\cdot }\bigg [\sum _{n = 2}^k \frac{1}{|\lambda | (n - 1)} \sum _{[k_1, \ldots , k_N]^n_k}\prod _{\begin{array}{c} j = 1 \\ j : k_j > 0 \end{array}}^N \varphi (x_j)L_{k_j}\bigg ], {\tilde{\varphi }} \right\rangle , \end{aligned}$$

where \([k_1, \ldots , k_N]_k^n\) is the set of all non-negative N-tuples \((k_1, \ldots , k_N)\) such that \(\sum _{i = 1}^N k_i = k\) and exactly \(2\le n\le k\) of the \(k_i\) are strictly positive if \((X, {\mathbb {P}})\) is a branching Markov process, or \(L_k = \left\langle {\mathbb {V}}_k[\varphi ],\tilde{\varphi }\right\rangle \), where \({\mathbb {V}}_1[\varphi ](x) = \varphi (x)\) and for \(k\ge 2\),

$$\begin{aligned} {\mathbb {V}}_k[\varphi ](x)&= \sum _{\left\{ m_1,\ldots ,m_{k-1}\right\} _k}\frac{1}{m_1!\ldots m_{k-1}!}\frac{1}{\lambda (1-m_1-\cdots -m_{k-1})}\prod _{j=1}^{k-1}\left\langle {\mathbb {V}}_j\left[ \varphi \right] ,\tilde{\varphi }\right\rangle ^{m_j}\\&\quad \times \left[ \psi ^{(m_1+\cdots +m_{k-1})}(x,0+)(-\varphi (x))^{m_1+\cdots +m_{k-1}}\right. \\&\left. \quad +\beta (x)\int _{M(E)^{\circ }}\left\langle \varphi ,\nu \right\rangle ^{m_1+\cdots +m_{k-1}}\Gamma (x,d\nu )\right] \end{aligned}$$

if \((X, {\mathbb {P}})\) is a superprocess. Here the sums run over the set \(\left\{ m_1,\ldots ,m_{k-1}\right\} _k\) of non-negative integers such that \(m_1+2m_2+\cdots +(k-1)m_{k-1}=k\). Then, for all \(\ell \le k\)

$$\begin{aligned} \sup _{t\ge 0} \Delta _t^{(\ell )}<\infty \text { and } \lim _{t\rightarrow \infty } \Delta _t^{(\ell )} =0. \end{aligned}$$
(25)

As alluded to above, it is heuristically appealing that the k-th moment does not decay at the rate \(\exp (k \lambda t)\). On the other hand, the actual decay rate \(\exp (\lambda t)\) is perhaps less obvious, but is nonetheless the natural candidate: since the mass of the branching system decays to zero, the k-th moment should also decay to zero, but no slower than the first moment.

1.4 Main result: k-th occupation moments

It also transpires that our method is remarkably robust. Indeed, again taking an agnostic position on whether X is a branching Markov process or a superprocess, as we will show, careful consideration of the proofs of Theorems 1, 2 and 3 demonstrates that we can also conclude results for the quantities

$$\begin{aligned} \texttt {M}_t^{(k)}[g](x):={\mathbb {E}}_{\delta _x}\left[ \left( \int _0^t\left\langle g,X_s\right\rangle \mathrm{d}s\right) ^k\right] , \qquad x\in E, g\in B^+(E), k\ge 1, t \ge 0. \end{aligned}$$

We can think of \(\int _0^t\left\langle g,X_s\right\rangle \mathrm{d}s\) as characterising the running occupation measure \(\int _0^t X_s(\cdot ){\mathrm{d}s}\) of the process X and hence we refer to \(\texttt {M}_t^{(k)}[g](x)\) as the k-th moment of the running occupation. The following results also emerge from our calculations, mirroring Theorems 1, 2 and 3, respectively.

Theorem 4

(Critical, \(\lambda = 0\)) Suppose that (H1) holds along with (H2) for \(k \ge 2\) and \(\lambda = 0\). Define

$$\begin{aligned} \Delta _t^{(\ell )} = \sup _{x\in E, g\in B^+_1(E)}\left| t^{-(2\ell -1)} \varphi (x)^{-1}\texttt {M}^{(\ell )}_t[g](x) - 2^{-(\ell -1)} \ell ! \,\left\langle g,\tilde{\varphi }\right\rangle ^\ell \langle {\mathbb {V}}[\varphi ], {\tilde{\varphi }}\rangle ^{\ell -1}L_\ell \right| , \end{aligned}$$

where \(L_1 = 1\) and \(L_k\) is defined through the recursion \(L_k =( \sum _{i = 1}^{k-1}L_i L_{k-i})/(2k-1)\) if \((X,{\mathbb {P}})\) is a branching Markov process or \(L_k=(\sum _{\left\{ k_1,k_2\right\} ^{+}}L_{k_1}L_{k_2})/(2k-1)\) where \(\left\{ k_1,k_2\right\} ^{+}\) is the set of non-negative integers \(k_1,k_2\) such that \(k_1+k_2=k\) if \((X,{\mathbb {P}})\) is a superprocess. Then, for all \(\ell \le k\)

$$\begin{aligned} \sup _{t\ge 0} \Delta _t^{(\ell )}<\infty \text { and } \lim _{t\rightarrow \infty } \Delta _t^{(\ell )} =0. \end{aligned}$$
(26)
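In the branching Markov process case, the recursion for the constants \(L_k\) in Theorem 4 is completely explicit and easily evaluated; a minimal sketch (illustrative only):

```python
# The recursion of Theorem 4 (branching Markov process case):
# L_1 = 1 and L_k = (sum_{i=1}^{k-1} L_i L_{k-i}) / (2k - 1).
L = {1: 1.0}
for k in range(2, 7):
    L[k] = sum(L[i] * L[k - i] for i in range(1, k)) / (2 * k - 1)
print(L)  # L_2 = 1/3, L_3 = 2/15, L_4 = 17/315, ...
```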

Theorem 5

(Supercritical, \(\lambda > 0\)) Suppose that (H1) holds along with (H2) for some \(k \ge 2\) and \(\lambda > 0\). Redefine

$$\begin{aligned} \Delta _t^{(\ell )} = \sup _{x\in E, g\in B^+_1(E)}\left| \varphi (x)^{-1}\mathrm{e}^{-\ell \lambda t}\texttt {M}^{(\ell )}_t[g](x) - \,\ell !\left\langle g,\tilde{\varphi }\right\rangle ^\ell L_\ell \right| , \end{aligned}$$

where \(L_k\) was defined in Theorem 2, albeit with \(L_1 = 1/\lambda \).

Then, for all \(\ell \le k\)

$$\begin{aligned} \sup _{t\ge 0} \Delta _t^{(\ell )}<\infty \text { and } \lim _{t\rightarrow \infty } \Delta _t^{(\ell )} =0. \end{aligned}$$
(27)

Theorem 6

(Subcritical, \(\lambda < 0\)) Suppose that (H1) holds along with (H2) for some \(k \ge 2\) and \(\lambda < 0\). Redefine

$$\begin{aligned} \Delta _t^{(\ell )} = \sup _{x\in E, g\in B^+_1(E)}\left| \varphi (x)^{-1}\texttt {M}^{(\ell )}_t[g](x) - \,\ell !\left\langle g,\tilde{\varphi }\right\rangle ^\ell L_\ell \right| , \end{aligned}$$

where \(L_1 = 1/|\lambda |\) and for \(k\ge 2\), the constants \(L_k\) are defined recursively via

$$\begin{aligned} L_k =\frac{1}{|\lambda |} \left\langle \beta {\mathcal {E}}\Bigg [\sum _{[k_1, \ldots , k_N]_k^2} \prod _{\begin{array}{c} j = 1 \\ j : k_j > 0 \end{array}}^N\varphi (x_j)L_{k_j}\Bigg ], {\tilde{\varphi }} \right\rangle - \frac{\langle g\varphi , {\tilde{\varphi }}\rangle }{|\lambda |\langle g, {\tilde{\varphi }}\rangle } L_{k-1}, \end{aligned}$$

if X is a branching Markov process and

$$\begin{aligned} L_k (x)= (-1)^k \left\langle {\mathbb {U}}_k\left[ \varphi \right] ,\tilde{\varphi }\right\rangle -{\mathcal {R}}_k(x), \end{aligned}$$

where

$$\begin{aligned}&{\mathcal {R}}_k(x)=\sum _{\left\{ m_1,\ldots ,m_{k-1}\right\} _k}\frac{(-1)^{k}}{m_1!\ldots m_{k-1}!}(-\varphi (x))^{m_1+\cdots + m_{k-1}-1}(m_1+\cdots +m_{k-1}-1)!\\&\quad \prod _{j =1}^{k-1}L_j(x)^{m_j}, \end{aligned}$$

and where we define recursively, \({\mathbb {U}}_1\left[ \varphi \right] (x)=-\varphi (x)/|\lambda |\) and for \(k\ge 2\),

$$\begin{aligned} {\mathbb {U}}_k\left[ \varphi \right] (x)= & {} \frac{1}{|\lambda |}\left( {\mathbb {U}}_{k-1}\left[ \varphi \right] (x)+\sum _{\left\{ m_1,\ldots ,m_{k-1}\right\} _k}\frac{1}{m_1!\ldots m_{k-1}!}\prod _{j=1}^{k-1}\left\langle {\mathbb {U}}_j\left[ \varphi \right] ,\tilde{\varphi }\right\rangle ^{m_j}\right. \\&\quad \left. \times \left[ \psi ^{(m_1+\cdots +m_{k-1})}(x,0+)(-\varphi (x))^{m_1+\cdots +m_{k-1}}+\beta (x)\right. \right. \\&\left. \left. \quad \int _{M(E)^{\circ }}\left\langle \varphi ,\nu \right\rangle ^{m_1+\cdots +m_{k-1}}\Gamma (x,d\nu )\right] \right) \end{aligned}$$

if X is a superprocess. Then, for all \(\ell \le k\)

$$\begin{aligned} \sup _{t\ge 0} \Delta _t^{(\ell )}<\infty \text { and } \lim _{t\rightarrow \infty } \Delta _t^{(\ell )} =0. \end{aligned}$$
(28)

The results in Theorems 4, 5 and 6 are slightly less predictable. Let us discuss this point a little further in the particle setting for convenience. For the supercritical case, the extra “linear” term arising from the time integral does not affect the exponential growth of the process, and hence the leading order behaviour is still dominated by \(\mathrm{e}^{k\lambda t}\). In the critical case, let us momentarily assume that a Yaglom limit holds (see for example [20] in the branching Markov process setting). In that setting, we know that, conditional on \(N_t > 0\), \(\langle f, X_t\rangle \sim O(t)\) as \(t\rightarrow \infty \). Conditional on survival, we thus have (up to a constant)

$$\begin{aligned} \int _0^t \langle f, X_s\rangle \mathrm{d}s \sim \int _0^t s\, \mathrm{d}s \sim t^2. \end{aligned}$$

This implies that (still conditional on survival) the k-th moment of the occupation measure behaves like \(t^{2k}\). Recalling that, at criticality, the survival probability behaves like 1/t, we obtain the scaling \(t^{2k-1}\). Finally, in the subcritical case, we know that the total occupation \(\int _0^\zeta \langle g , X_s\rangle \mathrm{d}s\), where \(\zeta = \inf \{t>0: \langle 1, X_t\rangle = 0\}\), is finite, behaving like an average spatial distribution of mass, i.e. \(\langle g, {\tilde{\varphi }}\rangle \), multiplied by \(\zeta \), meaning that no normalisation is required to control the “growth” of the running occupation moments in this case.

1.5 Examples for specific branching processes

We now give some examples to illustrate our results and the generality of this setting.

Continuous-time Galton–Watson process and CSBPs: Let us now consider the simplest branching particle setting, in which the process has no spatial dependence. In effect, we can take \(E = \{0\}\), \(\texttt {P}\) to be the Markov process which remains at \(\{0\}\) and the branching mechanism to have no spatial dependence. This is the setting of a continuous-time Galton–Watson process. Its branching rate \(\beta \) is constant, and the first and second moments of the offspring distribution, \(m_1 = {\mathcal {E}}[N]\) and \(m_2 = {\mathcal {E}}[N^2]\), where N is the number of offspring produced at a branching event, are key to characterising the moment limits. When the process is independent of space, we have \(\lambda = \beta (m_1-1)\), \(\varphi =1\), \({\tilde{\varphi }}\) can be taken as \(\delta _{\{0\}}\) and (H1) trivially holds. Theorem 1 now tells us that, at criticality, i.e. \(m_1 = 1\), the limit for the \(\ell \)-th moment of the population size \(N_t\) at time t satisfies

$$\begin{aligned} t^{-(\ell -1)}{\mathbb {E}}[N_t^\ell ]\sim 2^{-(\ell -1)}\ell ! \left( \beta (m_2-1)\right) ^{\ell -1},\qquad \text { as } t\rightarrow \infty , \end{aligned}$$
(29)

when \({\mathcal {E}}[N^\ell ]<\infty \) and \(\ell \ge 1\).
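The limit (29) is straightforward to probe by simulation. Below is a minimal Monte Carlo sketch (illustrative only; the choice of critical binary branching with \(\beta = 1\) and \({\mathcal {P}}(N=0)={\mathcal {P}}(N=2)=1/2\) is arbitrary), for which \(m_1 = 1\), \(m_2 = 2\) and (29) predicts \(t^{-1}{\mathbb {E}}[N_t^2]\rightarrow 2^{-1}\,2!\,\beta (m_2-1) = 1\).

```python
# Gillespie-style simulation of a critical binary continuous-time
# Galton-Watson process: each individual branches at rate beta and
# leaves 0 or 2 children with equal probability.
import random

def simulate_Nt(t, beta=1.0):
    n, s = 1, 0.0
    while n > 0:
        s += random.expovariate(beta * n)        # next branching time
        if s > t:
            break
        n += 1 if random.random() < 0.5 else -1  # two offspring or none
    return n

t, reps = 20.0, 100_000
second_moment = sum(simulate_Nt(t) ** 2 for _ in range(reps)) / reps
print(second_moment / t)  # close to 1 for large t, as (29) predicts
```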

In the supercritical case, i.e. \(m_1>1\), the limit in Theorem 2 simplifies to

$$\begin{aligned} \mathrm{e}^{-\beta (m_1-1)\ell t}{\mathbb {E}}[N_t^\ell ]\sim \ell ! L_\ell ,\qquad \text { as } t\rightarrow \infty , \end{aligned}$$
(30)

where the iteration

$$\begin{aligned} L_\ell = \frac{1}{(m_1-1)(\ell -1)} {\mathcal {E}}\bigg [ \sum _{[k_1, \ldots , k_N]_\ell ^2} \prod _{\begin{array}{c} j = 1 \\ j : k_j > 0 \end{array}}^NL_{k_j}\bigg ], \qquad \ell \ge 2, \end{aligned}$$

holds. Here, although the simplified formula is still a little complicated, it demonstrates more clearly that the moments in Theorem 2 grow according to the leading order terms of the offspring distribution. Indeed, in the case \(\ell = 2\), we have

$$\begin{aligned} L_2 = \frac{1}{m_1-1}{\mathcal {E}}[\text {card}\{[k_1, \ldots , k_N]_2^2\}] = \frac{1}{m_1-1}\frac{{\mathcal {E}}[N(N-1)]}{2} = \frac{m_2-m_1}{2(m_1-1)}. \end{aligned}$$

The constant \(L_3\) can now be computed explicitly in terms of \(L_2\) and \(L_1 = 1\), and so on.
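To illustrate further, the recursion can be evaluated by brute force by enumerating the tuples \([k_1, \ldots , k_N]_\ell ^2\) directly. The sketch below (illustrative only; the offspring law \({\mathcal {P}}(N=0)=0.4\), \({\mathcal {P}}(N=2)=0.6\) is an arbitrary supercritical choice, with \(m_1 = 1.2\) and \(m_2 = 2.4\)) recovers \(L_2 = (m_2-m_1)/(2(m_1-1)) = 3\) and then the subsequent constants.

```python
# Brute-force evaluation of the recursion for L_ell in the
# (spatially independent) supercritical Galton-Watson example.
from itertools import product
from math import prod

offspring = {0: 0.4, 2: 0.6}                 # P(N = n)
m1 = sum(n * p for n, p in offspring.items())

L = {1: 1.0}
for ell in range(2, 5):
    expect = 0.0
    for n, p in offspring.items():
        # non-negative n-tuples summing to ell with >= 2 positive entries
        for ks in product(range(ell + 1), repeat=n):
            if sum(ks) == ell and sum(k > 0 for k in ks) >= 2:
                expect += p * prod(L[k] for k in ks if k > 0)
    L[ell] = expect / ((m1 - 1) * (ell - 1))
print(L)  # L_2 = 3.0, L_3 = 9.0, ...
```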

The limits in the subcritical case can be detailed similarly and only offer minor simplifications of the constants \(L_k\), \(k\ge 1\) presented in the statement of Theorem 3. Hence we leave the details for the reader to check.

The analogue of the continuous-time Galton–Watson process in the superprocess setting is that of a continuous-state branching process (CSBP). In this setting there is no associated movement, \(\psi \) in (7) is not spatially dependent and the non-local branching mechanism (8) satisfies \(\phi \equiv 0\). The right eigenvector and left eigenmeasure can be structured in the same way as in the Galton–Watson setting, with eigenvalue given by \(-\psi '(0+) = b\).

Similarly to (29), Theorem 1 tells us at criticality, i.e. \(-\psi '(0+)=0\), that the CSBP \((Z_t, t\ge 0)\) satisfies

$$\begin{aligned} t^{-(\ell -1)}{\mathbb {E}}[Z_t^\ell ]\sim 2^{-(\ell -1)}\ell ! \psi ''(0+)^{\ell -1},\qquad \text { as } t\rightarrow \infty , \end{aligned}$$

when \(\int _0^\infty |y|^\ell \nu (\mathrm{d}y)<\infty \) and \(\ell \ge 1\). The situations at super- and subcriticality again do not offer much more insight than the natural analogue of (30) with an induction on the \(L_k\) constants. Hence we end our discussion of the CSBP example here.
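We note in passing, though, that in the special case of the Feller diffusion, i.e. \(\psi (\lambda ) = c\lambda ^2\), the cumulant semigroup is explicit, \({\texttt {V}}_t[\theta ] = \theta /(1+ct\theta )\), and the critical prediction \(t^{-(k-1)}{\mathbb {E}}[Z_t^k \,|\, Z_0 = x]\sim k!\,c^{k-1}x\) (here \(\psi ''(0+) = 2c\)) can be verified symbolically; a minimal sketch (illustrative only, using sympy):

```python
# Symbolic check of the critical CSBP moment asymptotics for the
# Feller diffusion psi(lambda) = c*lambda^2, whose cumulant semigroup
# is V_t[theta] = theta / (1 + c*t*theta); the predicted limit of
# E[Z_t^k | Z_0 = x] / (t^{k-1} x) is k! c^{k-1}.
import sympy as sp

theta, c, t, x = sp.symbols('theta c t x', positive=True)
V = theta / (1 + c * t * theta)
laplace = sp.exp(-x * V)
for k in range(1, 5):
    moment = (-1)**k * sp.diff(laplace, theta, k).subs(theta, 0)
    print(k, sp.simplify(sp.limit(moment / (t**(k - 1) * x), t, sp.oo)))
```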

Crump–Mode–Jagers (CMJ) processes: Consider a branching process in which particles live for a random amount of time \(\zeta \) and, during their lifetime, give birth to a (possibly random) number of offspring at random times; in essence, the age of a parent at the birth times forms a point process, say \(\eta (\mathrm{d}t)\), on \([0,\zeta ]\). We denote the law of the latter by \({\mathcal {P}}\). The offspring reproduce and die as independent copies of the parent particle and the law of the process is denoted by \({\mathbb {P}}\) when initiated from a single individual. Although this model, i.e. a CMJ process, is not covered in the present article, it appears to be the only context in which comparable asymptotic moment results can be found in the literature.

First let us consider the case where the number of offspring, \(N = \eta [0,\zeta ]\), born to the initial individual during its lifetime satisfies \({\mathbb {E}}[N] = 1\), which corresponds to the critical case. (Note, criticality is usually described in terms of the Malthusian parameter, however, the setting \({\mathbb {E}}[N] = 1\) is equivalent.) Further, let \(Z_t\) denote the number of individuals in the population at time \(t\ge 0\). Under the moment assumption \({\mathbb {E}}[N^k]< \infty \) for some \(k \ge 1\), [11] showed that the factorial moments \(m_k(t) := {\mathbb {E}}[Z_t(Z_t - 1) \cdots (Z_t - k + 1)]\) satisfy

$$\begin{aligned} \lim _{t \rightarrow \infty }\frac{m_k(t)}{t^{k-1}} = k! \frac{{\mathcal {E}}[\zeta ]^k}{b^{2k-1}}(m_2-1)^{k-1}, \end{aligned}$$

where \(m_2 = {\mathcal {E}}[N^2]\) and \(b = {\mathcal {E}}[\int _0^\zeta t\eta (\mathrm{d}t)]\). We encourage the reader to compare this to the spatially independent example considered above.

The proof in [11] echoes a similar approach to the one presented in the current article. The author first develops a non-linear integral equation that describes the evolution of \(m_k(t)\) in terms of the lower order moments, cf. [11, Theorem 1]. An inductive argument along with this evolution equation is then used to prove the above asymptotics.

Branching Brownian motion in a bounded domain: In [28], the Yaglom limit for branching Brownian motion (BBM) in a bounded domain was proved. In this setting, the semigroup \(\texttt {P}\) corresponds to that of a Brownian motion killed on exiting a \(C^1\) domain, E. The branching rate is taken as the constant \(\beta >0\) and the offspring distribution is not spatially dependent. Moreover, the first and second moments, \(m_1:={\mathcal {E}}[N]\) and \(m_2= {\mathcal {E}}[N^2]\), are finite. In this setting, the right eigenfunction \(\varphi \) exists on E, satisfying Dirichlet boundary conditions, and is accompanied by the left eigenmeasure \(\varphi (x)\mathrm{d}x\) on E. The associated eigenvalue is identified explicitly as \(\lambda = \beta (m_1-1) -\lambda _E\), where \(\lambda _E\) is the ground state eigenvalue of the Laplacian on E. The critical regime thus occurs when \(\beta (m_1-1) =\lambda _E\).

Among the main results of [28] are the Kolmogorov limit,

$$\begin{aligned} {\mathbb {P}}_{\delta _x}(N_t >0) \sim \frac{1}{t}\frac{2(m_1-1)\varphi (x)}{\lambda _E(m_2-m_1)\int _E \varphi (x)^3\mathrm{d}x}=: \frac{2\varphi (x)}{t\Sigma }, \qquad x \in E, \end{aligned}$$

as \(t\rightarrow \infty \) and the Yaglom distributional limit,

$$\begin{aligned} \text {Law}\left( \left. \frac{\langle f, X_t\rangle }{t} \right| N_t>0\right) \rightarrow \text {Law}({\mathbf {e}}_{2/\langle \varphi , f\rangle \Sigma }), \qquad \text { as }t\rightarrow \infty , \end{aligned}$$

where \({\mathbf {e}}_{2/\langle \varphi , f\rangle \Sigma }\) is an exponentially distributed random variable with rate \(2/\langle \varphi , f\rangle \Sigma \), i.e. with mean \(\langle \varphi , f\rangle \Sigma /2\). (Note, we understand \(\langle f, \varphi \rangle = \int _E \varphi (x)f(x)\mathrm{d}x\) in this context.) In particular, these two results allude to the limit of moments (albeit further moment assumptions would be needed on N), which, in the spirit of (20), can be heuristically read as

$$\begin{aligned} \lim _{t\rightarrow \infty }\frac{1}{t^{k-1}}{\mathbb {E}}_{\delta _x}[\langle f, X_t\rangle ^k]&= \lim _{t\rightarrow \infty } t {\mathbb {P}}_{\delta _x}(N_t>0){\mathbb {E}}_{\delta _x}\left[ \left. \frac{\langle f, X_t\rangle ^k}{t^k}\right| N_t >0\right] \nonumber \\&= k! 2^{-(k-1)}\langle f,\varphi \rangle ^k\Sigma ^{k-1}\varphi (x), \qquad x\in E, \end{aligned}$$
(31)

for \(k\ge 1\). Taking into account the fact that \({\tilde{\varphi }}(\mathrm{d}x) = \varphi (x) \mathrm{d}x\) and \(\beta =\lambda _E/(m_1-1)\), we see that

$$\begin{aligned} \langle {\mathbb {V}}[\varphi ], {\tilde{\varphi }}\rangle = \langle \beta \varphi ^2 (m_2-m_1),\varphi \rangle =\lambda _E\frac{(m_2-m_1)}{(m_1-1)}\int _E\varphi (x)^3 \mathrm{d}x = \Sigma . \end{aligned}$$

Hence (31) agrees precisely with Theorem 1.

Neutron branching processes: The neutron branching process (NBP), as introduced in [3, 22], is our final example. Neutrons evolve in the configuration space \(E = D \times V\), where \(D \subset {\mathbb {R}}^3\) is a bounded, open set denoting the set of particle locations and \(V:= \{\upsilon \in {\mathbb {R}}^3 : \texttt {v}_{\min } \le |\upsilon | \le \texttt {v}_{\max }\}\) with \(0< \texttt {v}_{\min } \le \texttt {v}_{\max } < \infty \), denotes the set of velocities. From an initial space-velocity configuration \((r, \upsilon )\), particles move according to piecewise deterministic Markov processes characterised by \(\sigma _\texttt {s}\pi _\texttt {s}\), where \(\sigma _\texttt {s}(r, \upsilon )\) denotes the rate at which particles change velocity (also called scattering events) at \((r, \upsilon )\), and \(\pi _\texttt {s}(r, \upsilon , \upsilon ')\mathrm{d}\upsilon '\) denotes the probability that such a scattering event results in a new outgoing velocity \(\upsilon '\). When at \((r, \upsilon ) \in D \times V\), at rate \(\sigma _\texttt {f}(r, \upsilon )\), a branching (or fission) event occurs, resulting in the release of several new neutrons with configurations \((r, \upsilon _1), \ldots , (r, \upsilon _N)\), say. The quantity \(\pi _\texttt {f}(r, \upsilon , \upsilon ')\mathrm{d}\upsilon '\) gives the average number of neutrons produced with outgoing velocity \(\upsilon '\) from a fission event at \((r, \upsilon )\). Thus, the NBP is an example of a branching Markov process with non-local branching, where the motion is a piecewise deterministic Markov process and the non-locality at branching events appears in the velocity.

As previously mentioned, Theorem 1 was established for the NBP in [20] using a spine decomposition approach but under more restrictive assumptions and only when the test function f is taken to be the right eigenfunction \(\varphi \). In [22], under the assumptions

  1. (A1)

    \(\sigma _\texttt {s}\), \(\pi _\texttt {s}\), \(\sigma _\texttt {f}\), \(\pi _\texttt {f}\) are uniformly bounded from above.

  2. (A2)

\(\inf _{r \in D, \upsilon , \upsilon ' \in V}(\sigma _\texttt {s}(r, \upsilon )\pi _\texttt {s}(r, \upsilon , \upsilon ') + \sigma _\texttt {f}(r, \upsilon )\pi _\texttt {f}(r, \upsilon , \upsilon ')) > 0\),

it was shown that (H1) holds. Moreover, since only a finite number of neutrons can be produced at a fission event, the number of offspring is uniformly bounded from above and thus (H2) holds for all \(k\ge 1\). Hence, the results obtained in this paper hold for the NBP.

Although there is no particular simplification of the limiting constants in Theorems 1, 2, 3, 4, 5 and 6 in this setting, of particular interest is the notion of particle clustering that appears in Monte Carlo criticality calculations [8, 9, 33]. This phenomenon occurs in critical reactors, where particles exhibit strong spatial correlations. Studying the moments \(\texttt {M}^{(k)}\) and other correlation structures in this setting will shed further light on this phenomenon.

1.6 Outline of the paper and strategy for proofs

The remainder of the paper consists of two sections: one handling the proofs for the branching Markov process setting, and the other handling the proofs for the superprocess setting. The proofs of the six main theorems follow a similar pattern, so we briefly describe the strategy behind them here, together with how it is laid out in the remainder of the paper. Roughly speaking, each of the proofs follows the following fundamental steps.

Step 1: The first step in all of the proofs is to establish a non-linear semigroup equation in the spirit of (4) for branching Markov processes and (6) for superprocesses, but for the joint Markovian pair \(\langle f, X_t\rangle \) and \(\int _0^t \langle g, X_s\rangle \mathrm{d}s\), \(t\ge 0\). Moreover, we want this semigroup equation to be written as an integral equation in terms of \((\texttt {T}_t, t\ge 0)\), rather than \((\texttt {P}_t, t\ge 0)\); recall that \(\texttt {T}= \texttt {T}^{(1)}\) is the mean semigroup, cf. (13). This is done for branching Markov processes in detail in Sect. 2.1, whereas we simply state the relationship for superprocesses in Sect. 3.1 as the proof is very similar.

Step 2: Next, we work with the simple observation that differentiating the non-linear semigroup k times gives us access to the k-th moment process \((\texttt {T}^{(k)}_t, t\ge 0)\). That is,

$$\begin{aligned} \texttt {T}_t^{(k)}[f](x) = (-1)^k\frac{\partial ^k}{\partial \theta ^k}{\mathbb {E}}_{\delta _x}[\mathrm{e}^{-\theta \langle f, X_t\rangle }] \bigg |_{\theta = 0}, \end{aligned}$$

with a similar formula holding for \(\texttt {M}^{(k)}\). However, given the recursion for the non-linear semigroup derived in the previous step, this turns out to give us a new recursion for \(\texttt {T}^{(k)}\) in terms of \(\texttt {T}^{(k-1)}, \texttt {T}^{(k-2)}, \ldots , \texttt {T}^{(1)}\), and similarly for \(\texttt {M}^{(k)}\) in terms of \(\texttt {M}^{(k-1)}, \texttt {M}^{(k-2)}, \ldots , \texttt {M}^{(1)}\); the two recursions are extremely close to one another, but subtly different nonetheless.
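For orientation, this moment-extraction identity can be seen in the simplest scalar setting: the following sketch (illustrative only; a Poisson toy example rather than a branching process, using sympy) recovers moments by repeated differentiation of a Laplace transform at \(\theta = 0\).

```python
# If X is Poisson with mean mu, then E[e^{-theta X}] = exp(mu(e^{-theta}-1)),
# and (-1)^k d^k/dtheta^k at theta = 0 returns E[X^k] (Touchard polynomials).
import sympy as sp

theta, mu = sp.symbols('theta mu')
laplace = sp.exp(mu * (sp.exp(-theta) - 1))
for k in range(1, 5):
    moment = (-1)**k * sp.diff(laplace, theta, k).subs(theta, 0)
    print(k, sp.expand(moment))
# 1: mu,  2: mu**2 + mu,  3: mu**3 + 3*mu**2 + mu,  ...
```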

Here we see a fundamental difference between the approaches we take for superprocesses and branching Markov processes. This is due to the way in which we differentiate the branching mechanisms in the recursion for the non-linear semigroup derived in Step 1. In the branching Markov process setting, the branching mechanism \(\texttt {G}\) that appears in the aforesaid recursion is written in terms of a product over particles at a birth event. This allows us to use the Leibniz rule when differentiating across the product; this is done in Sect. 2.2. In the case of superprocesses, the branching mechanism is written in terms of an analytically smoother object, namely a Lévy–Khintchine-type formula. With no representation in terms of particles, we must turn instead to Faà di Bruno’s formula in order to differentiate this branching mechanism; cf. Sect. 3.2.
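To make the contrast concrete, recall that Faà di Bruno's formula expresses \(\frac{\mathrm{d}^n}{\mathrm{d}x^n}f(g(x))\) as \(\sum _{k=1}^n f^{(k)}(g(x))B_{n,k}\big (g'(x),\ldots ,g^{(n-k+1)}(x)\big )\), where the \(B_{n,k}\) are the incomplete Bell polynomials. A small symbolic check (illustrative only; the choices \(f=\exp \) and \(g=\sin \) are arbitrary, and we assume sympy's bell implements \(B_{n,k}\)):

```python
# Verify Faa di Bruno's formula for f = exp, g = sin and n = 4.
import sympy as sp

x = sp.symbols('x')
g, n = sp.sin(x), 4
lhs = sp.diff(sp.exp(g), x, n)            # direct differentiation
rhs = sum(                                # Bell-polynomial expansion;
    sp.exp(g)                             # here f^{(k)} = exp for all k
    * sp.bell(n, k, [sp.diff(g, x, j) for j in range(1, n - k + 2)])
    for k in range(1, n + 1)
)
print(sp.simplify(lhs - rhs))             # 0
```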

Step 3: The differentiation in Step 2 yields two fundamentally different sets of equations for the k-th moment evolutions and the k-th occupation moment evolutions for each of the two classes of branching processes. In turn, this results in slightly different combinatorial formulae to work with when completing the proofs of the theorems. In essence, the proof of all of the main results now pertains to identifying the leading order terms in the aforesaid combinatorial formulae and appealing to an inductive argument; for example, using the statement of Theorem 1 for \(\ell \le k\) to prove the statement of that theorem for \(\ell = k+1\).

In Sect. 2.3 we give the full argument for the proof of Theorem 1 in the branching Markov process setting. This sets the scene for all the other proofs for this class of branching processes. Indeed, based on this proof, we then complete the proof of Theorem 4 in Sect. 2.4, and of Theorems 2, 3, 5 and 6 in Sect. 2.5, highlighting only the main differences relative to the key arguments in the proof of Theorem 1.

In the same spirit, we give a complete proof of Theorem 1 for superprocesses in Sect. 3.3, which gives the roadmap for all other proofs for this class of processes. Thereafter, in Sect. 3.4, we highlight only the differences for the proofs of Theorems 2 and 3 for superprocesses. Finally, with even greater brevity (because of familiarity), we sketch the main differences for the proofs of Theorems 4, 5 and 6.

2 Proofs for branching Markov processes

The proofs are a mixture of analytical and combinatorial computations based around the behaviour of the linear and non-linear semigroups of X.

2.1 Linear and non-linear semigroup equations

For \(f\in B^+(E)\), it is well known that the mean semigroup evolution satisfies

$$\begin{aligned} \texttt {T}_t[f](x) =\texttt {P}_t[f](x) + \int _0^t \texttt {P}_s\left[ \texttt {F}\texttt {T}_{t-s}[f]\right] (x)\mathrm{d}s, \qquad t\ge 0, x\in E, \end{aligned}$$
(32)

where

$$\begin{aligned} \texttt {F}[f](x) = \beta (x){\mathcal {E}}_x\left[ \sum _{i =1}^N f(x_i) - f(x)\right] =: \beta (x)(\texttt {m}[f](x)-f(x)), \qquad x\in E. \end{aligned}$$

See for example the calculations in [22]. Associated with every linear semigroup of a branching process is a so-called many-to-one formula. Many-to-one formulae are not necessarily unique and the one we will develop here is slightly different from the usual construction because of non-locality.

Suppose that \(\xi = (\xi _t, t\ge 0)\), with probabilities \({\mathbf {P}} = ({\mathbf {P}}_x, x\in E)\), is the Markov process corresponding to the semigroup \(\texttt {P}\). Let us introduce a new Markov process \({\hat{\xi }} = ({\hat{\xi }}_t, t\ge 0)\) which evolves as the process \(\xi \) but at rate \(\beta (x)\texttt {m}[1](x)\) the process is sent to a new position in E, such that for all Borel \(A\subset E\), the new position is in A with probability \(\texttt {m}[{\mathbf {1}}_A](x)/\texttt {m}[1](x)\). We will refer to the latter as extra jumps. Note that the law of the extra jumps is well defined thanks to the assumption (15), which ensures that \(\sup _{x \in E}\texttt {m}[1](x) =\sup _{x\in E}{\mathcal {E}}_x(\langle 1, {\mathcal {Z}}\rangle ) <\infty \). Accordingly we denote the probabilities of \({{\hat{\xi }}}\) by \((\hat{{\mathbf {P}}}_x, x\in E)\). We can now state our many-to-one formula.

Lemma 1

Write \(\texttt {B}(x) = \beta (x)(\texttt {m}[1](x) - 1)\), \(x\in E\). For \(f\in B^+(E)\) and \(t\ge 0\), we have

$$\begin{aligned} \texttt {T}_t [f](x) = \hat{{\mathbf {E}}}_x\left[ \exp \left( \int _0^t B({{\hat{\xi }}}_s)\mathrm{d}s\right) f({{\hat{\xi }}}_t)\right] . \end{aligned}$$
(33)

The proof is classical and follows standard reasoning for semigroup integral equations e.g. as in [19, 22]: First conditioning the right-hand side of (33) on the time of the first extra jump, then using the principle of transferring between multiplicative and additive potentials in the resulting integral equation (cf. Lemma 1.2, Chapter 4 in [13]) shows that (32) holds. Grönwall’s Lemma, the fact that \(\beta \in B^+(E)\) and (15) for \(k = 1\) ensure that the relevant integral equations have unique solutions.
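Although no numerics appear in the arguments of this paper, the many-to-one formula (33) is easily cross-checked on a toy model. In the sketch below (our construction; the generator, branching rate and mean offspring matrix are all hypothetical choices), \(E = \{0, 1\}\), the unique solution of (32) is the matrix exponential of \(\texttt {Q} + \mathrm{diag}(\beta )(\texttt {m} - I)\), and we compare it with a Monte Carlo average of the Feynman–Kac weight over simulated paths of \({\hat{\xi }}\).

```python
# Toy numerical check (illustration only) of the many-to-one formula (33)
# on the two-point space E = {0, 1}; all rates below are hypothetical.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
Q = np.array([[-1.0, 1.0], [2.0, -2.0]])   # generator of xi on {0, 1}
beta = np.array([0.5, 1.5])                # branching rates beta(x)
M = np.array([[0.8, 0.6], [0.4, 1.0]])     # M[x, y] = m[1_{y}](x), mean offspring at y
m1 = M.sum(axis=1)                         # m[1](x)
B = beta * (m1 - 1.0)                      # the potential B(x) of Lemma 1
f = np.array([1.0, 2.0])
t = 1.0

# mean semigroup from (32): generator Q + diag(beta) (M - I)
exact = expm(t * (Q + np.diag(beta) @ (M - np.eye(2)))) @ f

def sample_path(x):
    """Run xi_hat to time t; return exp(int_0^t B(xi_hat_s) ds) * f(xi_hat_t)."""
    s, w = 0.0, 1.0
    while True:
        rate = -Q[x, x] + beta[x] * m1[x]  # motion rate + extra-jump rate at x
        hold = rng.exponential(1.0 / rate)
        if s + hold > t:
            return w * np.exp(B[x] * (t - s)) * f[x]
        w *= np.exp(B[x] * hold)
        s += hold
        if rng.random() < -Q[x, x] / rate:
            x = 1 - x                      # motion jump (only one other state)
        else:
            x = int(rng.random() < M[x, 1] / m1[x])  # extra jump, law m[1_A](x)/m[1](x)

mc = [np.mean([sample_path(x) for _ in range(20000)]) for x in (0, 1)]
print("matrix exponential:", exact, " Monte Carlo:", np.round(mc, 3))
```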

We now define a variant of the non-linear evolution equation (2) associated with X via

$$\begin{aligned} \texttt {u}_t[f, g](x) = {\mathbb {E}}_{\delta _x}\left[ 1 - \mathrm{e}^{-\langle f, X_t\rangle - \int _0^t \langle g, X_s\rangle \mathrm {d}s}\right] , \qquad t \ge 0, \, x \in E, \, f, g \in B^+(E). \qquad \end{aligned}$$
(34)

For \(f\in B^+_1(E)\), define

$$\begin{aligned} \texttt {A}[f](x)= \beta (x) {\mathcal {E}}_x\left[ \prod _{i =1}^N (1- f(x_i) )- 1+ \sum _{i =1}^N f(x_i) \right] , \qquad x\in E. \end{aligned}$$

Our first preparatory result relates the two semigroups \((\texttt {u}_t, t\ge 0)\) and \((\texttt {T}_t, t\ge 0)\).

Lemma 2

For all \(f, g\in B^+(E)\), \(x \in E\) and \(t\ge 0\), the non-linear semigroup \(\texttt {u}_t[f, g](x)\) satisfies

$$\begin{aligned} \texttt {u}_t[f, g](x)= \texttt {T}_{t}[1-\mathrm{e}^{-f}](x) - \int _{0}^{t}\texttt {T}_{s}\left[ \texttt {A}[\texttt {u}_{t-s}[f, g]] - g(1-\texttt {u}_{t-s}[f, g]) \right] (x)\mathrm{d}s. \nonumber \\ \end{aligned}$$
(35)

Proof

Again, the proof uses standard techniques for integral evolution equations, so we only sketch it. Instead of considering \(\texttt {u}_t[f,g]\) directly, we first work with

$$\begin{aligned} \texttt {v}_t[f, g](x) = {\mathbb {E}}_{\delta _x}\left[ \mathrm{e}^{-\langle f, X_t\rangle - \int _0^t \langle g, X_s\rangle \mathrm {d}s}\right] , \qquad t \ge 0, \, x \in E, \, f, g \in B^+(E), \end{aligned}$$
(36)

which will turn out to be more convenient for technical reasons.

By splitting the expectation in (36) on the first branching event and appealing to the Markov property, we get, for \(f, g\in B^+(E)\), \(t\ge 0\) and \(x\in E\),

$$\begin{aligned} \texttt {v}_t[f, g](x)&= {\mathbf {E}}_x\left[ \mathrm{e}^{-\int _0^t\beta (\xi _s)\mathrm{d}s} \mathrm{e}^{-f(\xi _t) -\int _0^t g(\xi _s)\mathrm{d}s }\right] \\&\quad + {\mathbf {E}}_x\left[ \int _0^t \beta (\xi _s) \mathrm{e}^{-\int _0^s(\beta (\xi _u)+g(\xi _u))\mathrm{d}u} \texttt {H}[\texttt {v}_{t-s}[f, g]](\xi _s) \mathrm{d}s\right] , \end{aligned}$$

where

$$\begin{aligned} \texttt {H}[g](x) = {\mathcal {E}}_x\left[ \prod _{i = 1}^N g(x_i)\right] , \qquad g\in B^+(E), x\in E. \end{aligned}$$

Using similar reasoning to Lemma 1.2, Chapter 4 in [13] we can move the multiplicative potential with rate \(\beta +g\) to an additive potential in the above evolution equation to obtain

$$\begin{aligned} \texttt {v}_t[f, g](x)&={{\hat{\texttt {P}}}}_t[\mathrm{e}^{-f}](x) + \int _0^t \texttt {P}_s\left[ \texttt {G}[ \texttt {v}_{t-s}[f, g]] - g \texttt {v}_{t-s}[f, g]\right] (x)\mathrm{d}s. \end{aligned}$$
(37)

Now define

$$\begin{aligned} \texttt {D}[f] (x)= & {} \beta (x){\mathcal {E}}_x\left[ \prod _{i =1}^N f(x_i) -\sum _{i = 1}^N f(x_i)\right] \\= & {} \beta (x)\left( \texttt {H}[f](x)- \texttt {m}[f](x)\right) , \quad f\in B^+_1(E), x\in E \end{aligned}$$

and \(({\tilde{\texttt {v}}}_t, t\ge 0)\) via

$$\begin{aligned} {\tilde{\texttt {v}}}_{t}[f, g](x)&= \texttt {T}_t[\mathrm{e}^{-f}](x) + \int _0^t \texttt {T}_s\left[ \texttt {D}\big [{\tilde{\texttt {v}}}_{t-s}[f, g]\big ] -g {\tilde{\texttt {v}}}_{t-s}[f, g] \right] (x) \mathrm{d}s\nonumber \\&=\,\hat{{\mathbf {E}}}_{x}\Big [\mathrm{e}^{\int _{0}^{t}\texttt {B}({{\hat{\xi }}}_{s})\mathrm{d}s}\mathrm{e}^{- f({{\hat{\xi }}}_t)} \Big ] \nonumber \\&\qquad +\hat{{\mathbf {E}}}_{x}\left[ \int _{0}^{t }\mathrm{e}^{\int _{0}^{s}\texttt {B}({{\hat{\xi }}}_{u})\mathrm{d}u} \left( \texttt {D}\big [{\tilde{\texttt {v}}}_{t-s}[f, g]\big ]({{\hat{\xi }}}_{s}) -g({{\hat{\xi }}}_s){\tilde{\texttt {v}}}_{t-s}[f, g]({{\hat{\xi }}}_{s})\right) \mathrm{d}s\right] , \end{aligned}$$
(38)

for \(x \in E, t\ge 0\) and \(f, g\in B^+(E)\). Note that for the moment we don’t claim a solution to (38) exists.

For convenience, we will define

$$\begin{aligned} \texttt {K}_t[f, g](x) = \hat{{\mathbf {E}}}_{x}\left[ \int _{0}^{t }\mathrm{e}^{\int _{0}^{s}\texttt {B}({{\hat{\xi }}}_{u})\mathrm{d}u} \left( \texttt {D}\big [{\tilde{\texttt {v}}}_{t-s}[f, g]\big ]({{\hat{\xi }}}_{s}) -g({{\hat{\xi }}}_s){\tilde{\texttt {v}}}_{t-s}[f, g]({{\hat{\xi }}}_{s})\right) \mathrm{d}s\right] , \end{aligned}$$

so that \({\tilde{\texttt {v}}}_{t}[f, g](x) = \texttt {T}_t[\mathrm{e}^{-f}](x)+\texttt {K}_t[f, g](x) \). By conditioning the right-hand side of (38) on the first jump of \({{\hat{\xi }}}\) (bearing in mind the dynamics of \({{\hat{\xi }}}\) given just before Lemma 1) with the help of the Markov property (recalling that \(\beta (x)\texttt {m}[1](x) - \texttt {B}(x) = \beta (x)\)), we get

$$\begin{aligned}&{\tilde{\texttt {v}}}_{t}[f, g](x) \nonumber \\&\quad = {\mathbf {E}}_x\left[ \mathrm{e}^{-\int _0^t\beta ( {\xi }_s)\mathrm{d}s} \mathrm{e}^{-f( {\xi }_t)}\right] +{\mathbf {E}}_x\left[ \int _0^t \beta ( {\xi }_\ell )\texttt {m}[1]( {\xi }_\ell )\mathrm{e}^{-\int _0^\ell \beta ( {\xi }_s)\mathrm{d}s}\frac{\texttt {m}[\texttt {T}_{t-\ell }[\mathrm{e}^{-f}]]( {\xi }_\ell )}{\texttt {m}[1]( {\xi }_\ell )}\mathrm{d}\ell \right] \nonumber \\&\qquad +{\mathbf {E}}_{x}\left[ \mathrm{e}^{-\int _0^t \beta ( {\xi }_u)\texttt {m}[1]( {\xi }_u)\mathrm{d}u}\int _{0}^{t }\mathrm{e}^{\int _{0}^{s}\texttt {B}( {\xi }_{u})\mathrm{d}u} \left( \texttt {D}\big [{\tilde{\texttt {v}}}_{t-s}[f, g]\big ]( {\xi }_{s}) - g(\xi _s) {\tilde{\texttt {v}}}_{t-s}[f, g](\xi _s)\right) \mathrm{d}s\right] \nonumber \\&\qquad +{\mathbf {E}}_{x}\Bigg [ \int _0^t \beta ( {\xi }_\ell )\texttt {m}[1]( {\xi }_\ell ) \mathrm{e}^{-\int _0^\ell \beta ( {\xi }_u)\texttt {m}[1]( {\xi }_u)\mathrm{d}u}\nonumber \\&\qquad \Bigg (\int _0^\ell \mathrm{e}^{\int _{0}^{s}\texttt {B}( {\xi }_{u})\mathrm{d}u}\left( \texttt {D}\big [{\tilde{\texttt {v}}}_{t-s}[f, g]\big ]( \xi _{s}) - g(\xi _s) {\tilde{\texttt {v}}}_{t-s}[f, g](\xi _s)\right) \mathrm{d}s\nonumber \\&\qquad + \mathrm{e}^{\int _{0}^\ell \texttt {B}( \xi _{u})\mathrm{d}u} \frac{\texttt {m}[\texttt {K}_{t-\ell }[f, g]]( \xi _\ell )}{\texttt {m}[1]( \xi _\ell )} \Bigg )\mathrm{d}\ell \Bigg ]. \end{aligned}$$

Gathering terms and exchanging the order of integration in the double integral, this simplifies to

$$\begin{aligned}&{\tilde{\texttt {v}}}_{t}[f, g](x)\nonumber \\&\quad ={\mathbf {E}}_x\left[ \mathrm{e}^{-\int _0^t\beta ( \xi _s)\mathrm{d}s} \mathrm{e}^{-f( \xi _t)}\right] +{\mathbf {E}}_x\left[ \int _0^t \beta ( \xi _\ell )\mathrm{e}^{-\int _0^\ell \beta ( \xi _s)\mathrm{d}s} \texttt {m}[{\tilde{\texttt {v}}}_{t-\ell }[f, g]]( \xi _\ell ) \mathrm{d}\ell \right] \\&\qquad +{\mathbf {E}}_{x}\left[ \mathrm{e}^{-\int _0^t \beta ( {\xi }_u)\texttt {m}[1]( {\xi }_u)\mathrm{d}u}\int _{0}^{t }\mathrm{e}^{\int _{0}^{s}\texttt {B}( {\xi }_{u})\mathrm{d}u} \left( \texttt {D}\big [{\tilde{\texttt {v}}}_{t-s}[f, g]\big ]( {\xi }_{s}) - g(\xi _s){\tilde{\texttt {v}}}_{t-s}[f, g](\xi _s)\right) \mathrm{d}s\right] \\&\qquad +{\mathbf {E}}_{x}\Bigg [ \int _0^t\int _0^t {\mathbf {1}}_{(s\le \ell )} \beta ( \xi _\ell )\texttt {m}[1]( \xi _\ell ) \mathrm{e}^{-\int _0^\ell \beta ( \xi _u)\texttt {m}[1]( \xi _u)\mathrm{d}u} \mathrm{e}^{\int _{0}^{s}\texttt {B}( \xi _{u})\mathrm{d}u}\left( \texttt {D}\big [{\tilde{\texttt {v}}}_{t-s}[f, g]\big ]( \xi _{s}) - g( \xi _{s}){\tilde{\texttt {v}}}_{t-s}[f, g]( \xi _{s}) \right) \mathrm{d}\ell \,\mathrm{d}s\Bigg ]\\&\quad ={\mathbf {E}}_x\left[ \mathrm{e}^{-\int _0^t\beta ( \xi _s)\mathrm{d}s} \mathrm{e}^{-f(\xi _t)}\right] +{\mathbf {E}}_x\left[ \int _0^t \beta ( \xi _\ell )\mathrm{e}^{-\int _0^\ell \beta ( \xi _s)\mathrm{d}s} \texttt {m}[{\tilde{\texttt {v}}}_{t-\ell }[f, g]]( \xi _\ell ) \mathrm{d}\ell \right] \\&\qquad +{\mathbf {E}}_{x}\left[ \int _{0}^{t }\mathrm{e}^{-\int _{0}^{s}\beta ( \xi _{u})\mathrm{d}u}\left( \texttt {D}\big [{\tilde{\texttt {v}}}_{t-s}[f, g]\big ]( \xi _{s}) - g( \xi _{s}){\tilde{\texttt {v}}}_{t-s}[f, g]( \xi _{s}) \right) \mathrm{d}s\right] . \end{aligned}$$

Finally, appealing to the change of multiplicative potential to additive potential in the spirit of e.g. Lemma 1.2, Chapter 4 of [13], we get

$$\begin{aligned} {\tilde{\texttt {v}}}_{t}[f, g](x) =&{{\hat{\texttt {P}}}}_t[\mathrm{e}^{-f}](x) +\int _{0}^{t }\texttt {P}_s\left[ \texttt {G}\big [{\tilde{\texttt {v}}}_{t-s}[f, g]\big ] - g{\tilde{\texttt {v}}}_{t-s}[f, g]\right] (x)\mathrm{d}s \end{aligned}$$

and hence \(({\tilde{\texttt {v}}}_{t}, t\ge 0)\) is a solution to (37). Reversing these arguments also shows that solutions to (37) solve (38). As such, a standard argument using \(\beta \in B^+(E)\), the assumption (15) for \(k=1\) and Grönwall’s Lemma tells us that all of the integral equations thus far have unique solutions. In conclusion, \((\texttt {v}_t[f, g],t\ge 0)\) and \((\tilde{\texttt {v}}_t[f, g], t\ge 0)\) agree.

To complete the lemma, note that

$$\begin{aligned} 1- \texttt {T}_t[\mathrm{e}^{-f}](x)= \texttt {T}_t[1- \mathrm{e}^{-f}](x) + 1- \texttt {T}_t[1](x) \end{aligned}$$

moreover, since \(\mathrm{e}^{\int _0^t \texttt {B}({{\hat{\xi }}}_u)\mathrm{d}u} = 1 + \int _0^t \texttt {B}({{\hat{\xi }}}_s)\mathrm{e}^{\int _0^s \texttt {B}({{\hat{\xi }}}_u)\mathrm{d}u}\mathrm{d}s\), Lemma 1 gives

$$\begin{aligned} 1-\texttt {T}_t[1](x) = -\hat{{\mathbf {E}}}_x\left[ \int _0^t \texttt {B}({{\hat{\xi }}}_s) \mathrm{e}^{\int _0^s \texttt {B}({{\hat{\xi }}}_u)\mathrm{d}u}\mathrm{d}s\right] = -\int _0^t \texttt {T}_s[\texttt {B}](x)\mathrm{d}s. \end{aligned}$$

Hence, working from (38) and the definitions of \(\texttt {D}\) and \(\texttt {A}\), which are related via

$$\begin{aligned} \texttt {D}[1-f](x)= & {} \beta (x) {\mathcal {E}}_x\left[ \prod _{i=1}^N(1-f(x_i)) - \sum _{i = 1}^N(1-f(x_i))\right] \\= & {} \texttt {A}[f](x) -\texttt {B}(x), \qquad x\in E, f\in B^+_1(E), \end{aligned}$$

we get

$$\begin{aligned} {\texttt {u}}_t[f, g](x)&= 1- \texttt {v}_{t}[f, g](x)\\&=1- \texttt {T}_t[\mathrm{e}^{-f}](x)\\&\qquad - \int _0^t \texttt {T}_s\left[ \texttt {D}\big [1- \texttt {u}_{t-s}[f, g]\big ] -g (1- \texttt {u}_{t-s}[f, g]) \right] (x) \mathrm{d}s\\&=\texttt {T}_{t}[1-\mathrm{e}^{-f}](x) - \int _0^t \texttt {T}_s\left[ \texttt {A}\big [ \texttt {u}_{t-s}[f, g]\big ] -g (1- \texttt {u}_{t-s}[f, g]) \right] (x) \mathrm{d}s, \end{aligned}$$

as required. \(\square \)
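In the same illustrative spirit, the non-linear semigroup itself can be cross-checked numerically. The sketch below (ours; the two-state data and the binary local branching mechanism are hypothetical choices made only for the example) compares \(\texttt {v}_t[f,0](x) = {\mathbb {E}}_{\delta _x}[\mathrm{e}^{-\langle f, X_t\rangle }]\) computed from the backward equation, which on a finite state space takes the ODE form \(\partial _t \texttt {v} = \texttt {Q}\texttt {v} + \beta (\texttt {H}[\texttt {v}]-\texttt {v})\), with a direct simulation of the branching process.

```python
# Toy cross-check (illustration only): on E = {0, 1} with binary local
# branching (two children with probability p(x), none otherwise),
# v_t[f](x) = E_{delta_x}[exp(-<f, X_t>)] solves the backward ODE
# dv/dt = Q v + beta (H[v] - v), where H[v] = p v^2 + (1 - p).
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(7)
Q = np.array([[-1.0, 1.0], [2.0, -2.0]])
beta = np.array([1.0, 0.8])
p = np.array([0.5, 0.5])        # p = 1/2: critical binary branching
f = np.array([0.4, 1.1])
t_end = 1.0

def rhs(_, v):
    return Q @ v + beta * (p * v ** 2 + (1.0 - p) - v)

v_ode = solve_ivp(rhs, (0.0, t_end), np.exp(-f), rtol=1e-9, atol=1e-12).y[:, -1]

def particles_at(x, remaining):
    """States of the descendants of one particle at x, time `remaining` later."""
    rate = -Q[x, x] + beta[x]
    tau = rng.exponential(1.0 / rate)
    if tau > remaining:
        return [x]
    if rng.random() < -Q[x, x] / rate:              # motion jump
        return particles_at(1 - x, remaining - tau)
    if rng.random() < p[x]:                          # branch into two children
        return particles_at(x, remaining - tau) + particles_at(x, remaining - tau)
    return []                                        # death with no offspring

v_mc = [np.mean([np.exp(-sum(f[y] for y in particles_at(x, t_end)))
                 for _ in range(20000)]) for x in (0, 1)]
print("ODE:", np.round(v_ode, 4), " Monte Carlo:", np.round(v_mc, 4))
```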

2.2 Evolution equations for the k-th moment of branching Markov processes

Next we turn our attention to the evolution equation generated by the k-th moment functional \(\texttt {T}^{(k)}_t\), \(t\ge 0\). To this end, we start by observing that

$$\begin{aligned} \texttt {T}^{(k)}_t[f](x) = (-1)^{k+1}\frac{\partial ^k}{\partial \theta ^k}\texttt {u}_t[\theta f, 0](x) \bigg |_{\theta = 0}. \end{aligned}$$
(39)

The following result gives us an iterative approach to writing the k-th moment functional in terms of lower order moment functionals.

Proposition 1

Fix \(k\ge 2\). Under the assumptions of Theorem 1, with the additional assumption that

$$\begin{aligned} \sup _{x\in E, s\le t}\texttt {T}^{(\ell )}_s[f](x)<\infty ,\qquad \ell \le k-1, f\in B^+(E), t\ge 0, \end{aligned}$$
(40)

it holds that

$$\begin{aligned} \texttt {T}^{(k)}_t[f](x) = \texttt {T}_t[f^{k}](x) + \int _0^t \texttt {T}_s\left[ \beta \eta _{t-s}^{(k-1)}[f]\right] \!(x) \, \mathrm{d}s,\qquad t\ge 0, \end{aligned}$$
(41)

where

$$\begin{aligned} \eta _{t-s}^{(k-1)}[f](x) = {\mathcal {E}}_x\left[ \sum _{[k_1, \ldots , k_N]_k^2}{k \atopwithdelims ()k_1, \ldots , k_N}\prod _{j = 1}^N \texttt {T}^{(k_j)}_{t-s}[f](x_j) \right] , \end{aligned}$$

and \([k_1, \ldots , k_N]_k^2\) is the set of all non-negative N-tuples \((k_1, \ldots , k_N)\) such that \(\sum _{i = 1}^N k_i = k\) and at least two of the \(k_i\) are strictly positive.

Proof

Recall from (35) that

$$\begin{aligned} { \texttt {u}_t[\theta f, 0](x)= \texttt {T}_{t}[1-\mathrm{e}^{-\theta f}](x) - \int _{0}^{t}\texttt {T}_{s}\left[ \texttt {A}[\texttt {u}_{t-s}[ \theta f, 0]]\right] (x)\mathrm{d}s }, \qquad t\ge 0. \end{aligned}$$
(42)

It is clear that differentiating the first term k times and setting \(\theta = 0\) on the right-hand side of (42) yields

$$\begin{aligned} \frac{\partial ^{{k}}}{\partial \theta ^{k}}\texttt {T}_{t}[1-\mathrm{e}^{-\theta f}](x)\bigg |_{\theta = 0} = (-1)^{{k}+1}\texttt {T}_t[f^{{k}}](x). \end{aligned}$$
(43)

Thus it remains to differentiate the second term on the right-hand side of (42) k times. To this end, without concern for passing derivatives through expectations, using the Leibniz rule in Lemma A.2 of the Appendix, we have

$$\begin{aligned}&-\frac{\partial ^{{k}}}{\partial \theta ^{k}}\texttt {A}[\texttt {u}_{t}[\theta f,0]](x) \bigg |_{\theta = 0} \nonumber \\&\quad = \frac{\partial ^{{k}}}{\partial \theta ^{k}}\beta (x){\mathcal {E}}_x\left[ 1- \prod _{i = 1}^N {\mathbb {E}}_{\delta _{x_i}}[\mathrm{e}^{-\theta \langle f, X_t\rangle }]- \sum _{i = 1}^N{\mathbb {E}}_{\delta _{x_i}}[1- \mathrm{e}^{-\theta \langle f, X_t\rangle }] \right] \bigg |_{\theta = 0}\nonumber \\&\quad = -\beta (x){\mathcal {E}}_x\left[ \sum _{k_1 + \cdots + k_N = {k}} {{k} \atopwithdelims ()k_1, \ldots , k_N} \prod _{j = 1}^N (-1)^{k_j}\texttt {T}^{(k_j)}_t[f](x_j) + (-1)^{{k} + 1}\sum _{i = 1}^N \texttt {T}^{({k})}_t[f](x_i)\right] \nonumber \\&\quad = \beta (x){\mathcal {E}}_x\left[ (-1)^{{k}+1}\sum _{k_1 + \cdots + k_N = {k}} {{k} \atopwithdelims ()k_1, \ldots , k_N} \prod _{j = 1}^N \texttt {T}^{(k_j)}_t[f](x_j) + (-1)^{{k}}\sum _{i = 1}^N \texttt {T}_{t}^{({k})}[f](x_i)\right] , \end{aligned}$$
(44)

where the sum is taken over all non-negative integers \(k_1,\ldots , k_N\) such that \(\sum _{i = 1}^N k_i = {k}\).

Next let us look in more detail at the sum/product term on the right-hand side of (44). Consider the terms where only one of the \(k_i\) in the sum is positive, in which case \(k_i = {k}\) and

$$\begin{aligned} {{k} \atopwithdelims ()k_1, \ldots , k_N} = 1. \end{aligned}$$

There are N ways this can happen in the sum-product term and hence

$$\begin{aligned}&\sum _{k_1 + \cdots + k_N = {k}} {{k} \atopwithdelims ()k_1, \ldots , k_N} \prod _{j = 1}^N\texttt {T}^{(k_j)}_t[f](x_j)\\&\quad = \sum _{i = 1}^N \texttt {T}^{({k})}_t[f](x_i) + \sum _{[k_1, \ldots , k_N]_k^2}{{k} \atopwithdelims ()k_1, \ldots , k_N} \prod _{j = 1}^N\texttt {T}^{(k_j)}_t[f](x_j), \end{aligned}$$

where \([k_1, \ldots , k_N]_k^2\) is as defined in the statement of the proposition. Substituting this back into (44) yields

$$\begin{aligned}&-\frac{\partial ^{{k}}}{\partial \theta ^{k}} \texttt {A}[\texttt {u}_t[\theta f, 0]](x) \bigg |_{\theta = 0} \\&\quad = (-1)^{{k} + 1}\beta (x){\mathcal {E}}_x\left[ \sum _{[k_1, \ldots , k_N]_k^2}{{k} \atopwithdelims ()k_1, \ldots , k_N}\prod _{j = 1}^N \texttt {T}^{(k_j)}_t[f](x_j) \right] . \end{aligned}$$

Now let us return to the justification that we can pass the derivatives through the expectation in the above calculation. We first note that derivatives are limits, and so an ‘epsilon-delta’ argument will ultimately require dominated convergence. This is where the assumptions (15) and (40) come in. On the right-hand side of (44), each of the \(\texttt {T}_{t}^{(k_j)}[f](x_j)\) in the sum term is uniformly bounded by the assumption (40), and the collection \([k_1, \ldots , k_N]_k^2\) ensures that \(0\le k_j\le k-1\) for each \(j = 1,\ldots , N\). Moreover, there can be at most k factors in the sum/product. Noting that

$$\begin{aligned} \sum _{k_1 + \cdots + k_N = {k}} {{k} \atopwithdelims ()k_1, \ldots , k_N} = N^{k}, \end{aligned}$$
(45)

the assumption (15) allows us to use a domination argument with the k-th order moment.

Combining this with (43) and (42), using an easy dominated convergence argument to pull the k derivatives through the integral in t, then dividing by \((-1)^{{k} +1}\), we get (41), as required. \(\square \)
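The combinatorial bookkeeping in this proof is easy to verify by brute force. The following enumeration (illustration only; N and k are arbitrary small choices) confirms the identity (45) and the decomposition of the index set \(\{k_1 + \cdots + k_N = k\}\) into the N tuples with exactly one positive entry plus the set \([k_1, \ldots , k_N]_k^2\).

```python
# Brute-force check (illustration only) of identity (45) and the split of
# the index set used in the proof of Proposition 1.
from itertools import product
from math import factorial

def multinomial(ks):
    out = factorial(sum(ks))
    for kj in ks:
        out //= factorial(kj)
    return out

N, k = 3, 4
tuples = [ks for ks in product(range(k + 1), repeat=N) if sum(ks) == k]
assert sum(multinomial(ks) for ks in tuples) == N ** k      # identity (45)

one_pos = [ks for ks in tuples if sum(v > 0 for v in ks) == 1]
two_plus = [ks for ks in tuples if sum(v > 0 for v in ks) >= 2]
assert len(one_pos) == N                                    # the N 'diagonal' terms
assert len(one_pos) + len(two_plus) == len(tuples)
print(f"N^k = {N**k}, |[k_1,...,k_N]_k^2| = {len(two_plus)}")
```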

2.3 Completing the proof of Theorem 1: critical case

We will prove Theorem 1 by induction, starting with the case \(k = 1\). In this case, (18) reads

$$\begin{aligned} \sup _{t\ge 0}\Delta _t <\infty \text { and } \lim _{t\rightarrow \infty }\Delta _t = 0, \end{aligned}$$

which holds due to (14).

We now assume that the theorem holds true in the branching Markov process setting for some \(k \ge 1\) and proceed to show that (18) holds for all \(\ell \le k +1\). As alluded to in the introduction, the strategy for the proof is to use equation (41), since this allows us to write the \((k+1)\)-th moment in terms of the lower order moments. To prove that the right-hand side of this equation converges to the limit appearing in the statement of the theorem, we use Theorem A.1 stated in the appendix along with the induction hypothesis.

To this end, first note that the induction hypothesis implies that (40) holds. Hence Proposition 1 tells us that

$$\begin{aligned}&\varphi (x)^{-1}t^{-k}\texttt {T}^{(k+1)}_t[f](x) \nonumber \\&\quad = \varphi (x)^{-1}t^{-k}\texttt {T}_t[f^{k + 1}](x) \nonumber \\&\qquad + \varphi (x)^{-1}t^{-k}\int _0^t \texttt {T}_s\left[ \beta {\mathcal {E}}_{\cdot } \left[ \sum _{[k_1, \ldots , k_N]_{k+1}^2}{k + 1 \atopwithdelims ()k_1, \ldots , k_N}\prod _{j = 1}^N\texttt {T}^{(k_j)}_{t-s} [f](x_j) \right] \right] (x)\mathrm{d}s \nonumber \\&\quad = \varphi (x)^{-1}t^{-k}\texttt {T}_t[f^{k + 1}](x) + \varphi (x)^{-1}t^{-(k-1)}\nonumber \\&\qquad \int _0^1 \texttt {T}_{ut}\left[ \beta {\mathcal {E}}_{\cdot } \left[ \sum _{[k_1, \ldots , k_N]_{k+1}^2}{k + 1 \atopwithdelims ()k_1, \ldots , k_N} \prod _{j = 1}^N \texttt {T}^{(k_j)}_{t(1-u)} [f](x_j) \right] \right] (x)\mathrm{d}u, \end{aligned}$$
(46)

where we have used the change of variables \(s = ut\) in the final equality.

We now make some observations that will simplify the expression on the right-hand side of (46) as \(t \rightarrow \infty \). First note that due to (14), the first term on the right-hand side of (46) will vanish as \(t \rightarrow \infty \). Next, note that, if more than two of the \(k_i\) in the sum are strictly positive, then the normalisation by \(t^{k - 1}\) will cause the associated summand to vanish as well. For example, suppose without loss of generality that \(k_1\) and \(k_2\) are both strictly positive; we can write \(t^{k - 1} = t^{(k + 1) - 2} = t^{k_1 - 1}t^{k_2 - 1}t^{k_3} \ldots t^{k_N}\). The induction hypothesis tells us that the correct normalisation of each of the terms in the product is \(t^{k_j - 1}\), which means that the factor \(\texttt {T}^{(k_j)}_{t(1-u)}\) corresponding to a third \(k_j>0\) will be ‘over-normalised to zero’ in the limit.

To make this heuristic rigorous, we can employ Theorem A.1 from the Appendix. To this end, let us set

$$\begin{aligned} {F[f](x,u,t)}&:=\frac{\beta (x)}{\varphi (x)t^{k-1}}{\mathcal {E}}_{x} \left[ \sum _{[k_1, \ldots , k_N]_{k+1}^3}{k + 1 \atopwithdelims ()k_1, \ldots , k_N}\prod _{j = 1}^N\texttt {T}_{t(1-u)}^{(k_j)}[f](x_j) \right] \end{aligned}$$
(47)

where \([k_1, \ldots , k_N]_{k+1}^3\) is the subset of \([k_1, \ldots , k_N]_{k+1}^2\), for which at least three of the \(k_i\) are strictly positive (which can be an empty set). We will show that conditions (A.1) and (A.2) are satisfied via

$$\begin{aligned}&{\sup _{x\in E, f\in B^+_1(E), u\in [0,1]} \varphi (x)F[f](x,u,t)} < \infty \text { and }\nonumber \\&\quad {\lim _{t\rightarrow \infty }\sup _{u\in [0,1], f\in B^+_1(E), x\in E} \varphi (x)F[f](x,u,t)} = 0. \end{aligned}$$
(48)

First note that no more than \(k + 1\) of the \(k_i\) in the product in (47) are strictly positive. This follows from the fact that it is not possible to partition the set \(\{1, \ldots , k+1\}\) into more than \(k + 1\) non-empty blocks. Next note that

$$\begin{aligned} \frac{1}{t^{k - 1}}\prod _{\begin{array}{c} j = 1\\ j : k_j> 0 \end{array}}^N\texttt {T}^{(k_j)}_{t(1-u)}[f](x_j)= & {} \frac{(t(1-u))^{k+1 - \#\{j : k_j> 0\}}}{t^{k - 1}}\\&\prod _{\begin{array}{c} j = 1\\ j : k_j > 0 \end{array}}^N\varphi (x_j) \cdot \frac{1}{\varphi (x_j)}\frac{\texttt {T}^{(k_j)}_{t(1-u)}[f](x_j)}{(t(1-u))^{k_j - 1}}. \end{aligned}$$

The product term on the right-hand side is uniformly bounded in \(x_j\) and \(t(1-u)\) on compact intervals due to the boundedness of \(\varphi \) and the fact that (18) is assumed to hold for all \(\ell \le k\) by induction. Moreover, if \(\#\{j : k_j > 0\} \le 1\), the set \([k_1, \ldots , k_N]_{k+1}^3\) is empty; otherwise, the term \(\textstyle {(t(1-u))^{k+1 - \#\{j : k_j > 0\}}}/{t^{k - 1}}\) is uniformly bounded for \(t \ge 1\), say. From (45) and (15), we also observe that

$$\begin{aligned} \sup _{x\in E}{\mathcal {E}}_x\left[ \sum _{[k_1, \ldots , k_N]_{k+1}^3}{k + 1 \atopwithdelims ()k_1, \ldots , k_N}\right] \le \sup _{x\in E}{\mathcal {E}}_x\left[ \langle 1, {\mathcal {Z}}\rangle ^{k+1}\right] < \infty . \end{aligned}$$

Taking these facts into account, it is now straightforward to see that the earlier given heuristic can be made rigorous and (48) holds. In particular, we can use dominated convergence to pass the limit in t through the expectation in (47) to achieve the second statement in (48).

As F belongs to the class of functions \( {\mathcal {C}}\), defined just before Theorem A.1 in the Appendix, the aforesaid theorem tells us that

$$\begin{aligned} { \lim _{t \rightarrow \infty }\sup _{x \in E, f\in B^+_1(E)}\left| \frac{1}{\varphi (x)}\int _0^1\texttt {T}_{ut}[\varphi F[f](\cdot , u, t)](x)\mathrm{d}u\right| = 0.} \end{aligned}$$
(49)

Returning to (46), since the sum there requires that at least two of the \(k_i\) are positive, the only terms surviving in the limit are those carrying exactly two strictly positive indices \(k_i\) and \(k_j\) such that \(i \ne j\) and \(k_i + k_j = k+1\). This can be thought of as choosing \(i, j \in \{1, \ldots, N\}\) with \(i \ne j\), choosing \(k_i \in \{1, \ldots , k\}\) and then setting \(k_j = k+1 - k_i\). One should take care however to avoid double counting each pair \((k_i, k_j)\). Thus, up to terms that vanish uniformly as \(t\rightarrow \infty \), we have

$$\begin{aligned} \frac{1}{t^k\varphi (x)}\texttt {T}^{(k+1)}_t[f](x)&=\frac{1}{\varphi (x)}\int _0^1\texttt {T}_{ut}\Bigg [\frac{\beta (\cdot )}{2t^{k - 1}}{\mathcal {E}}_{\cdot } \Bigg [ \sum _{i = 1}^N\sum _{\begin{array}{c} j = 1\\ j\ne i \end{array}}^N\sum _{k_i = 1}^{k}{k + 1 \atopwithdelims ()k_i, \, k+1 - k_i}\nonumber \\&\quad \times \texttt {T}^{(k_i)}_{t(1-u)}[f](x_i)\texttt {T}^{(k+ 1 - k_i)}_{t(1-u)}[f](x_j)\Bigg ]\Bigg ](x) \mathrm{d}u, \end{aligned}$$
(50)

where the factor of 1/2 appears to compensate for the aforementioned double counting.
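The factor of 1/2 can also be confirmed by enumeration. In the sketch below (again illustration only), the ordered sum over choices \((i, j, k_i)\) is exactly twice the unordered sum over tuples in \([k_1, \ldots , k_N]_{k+1}^2\) with precisely two positive entries.

```python
# Enumeration sketch (illustration only) confirming the double-counting
# factor 1/2 in (50) for small hypothetical values of N and k.
from itertools import product
from math import factorial

def multinomial(ks):
    out = factorial(sum(ks))
    for kj in ks:
        out //= factorial(kj)
    return out

N, k = 3, 3
exactly_two = [ks for ks in product(range(k + 2), repeat=N)
               if sum(ks) == k + 1 and sum(v > 0 for v in ks) == 2]
lhs = sum(multinomial(ks) for ks in exactly_two)

# ordered enumeration: pick i != j, then k_i in {1,...,k}, set k_j = k+1-k_i
rhs = sum(factorial(k + 1) // (factorial(ki) * factorial(k + 1 - ki))
          for i in range(N) for j in range(N) if j != i
          for ki in range(1, k + 1))
assert lhs == rhs // 2
print("ordered sum / 2 =", rhs // 2, "= unordered sum =", lhs)
```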

In order to show that the right-hand side above delivers the required finiteness and limit (18), we again turn to Theorem A.1. For \(x \in E\), \(t \ge 0\) and \(0\le u \le 1\), in anticipation of using this theorem, we now re-define

$$\begin{aligned} {F[f](x,u,t)}:= & {} \frac{\beta (x)}{2\varphi (x)t^{k - 1}}\\&{\mathcal {E}}_{x} \Bigg [ \sum _{i = 1}^N\sum _{\begin{array}{c} j = 1\\ j\ne i \end{array}}^N\sum _{k_i = 1}^{k}{k + 1 \atopwithdelims ()k_i, \, k+1 - k_i}\texttt {T}^{(k_i)}_{t(1-u)}[f](x_i)\texttt {T}^{(k+ 1 - k_i)}_{t(1-u)}[f](x_j)\Bigg ]. \end{aligned}$$

After some rearrangement, we have

$$\begin{aligned}&{F[f](x,u,t)}\nonumber \\&\quad = \frac{\beta (x)(1-u)^{k - 1}}{2\varphi (x)} {\mathcal {E}}_{x} \Bigg [ \sum _{i = 1}^N\sum _{\begin{array}{c} j = 1\\ j\ne i \end{array}}^N\sum _{k_i = 1}^{k}{k + 1 \atopwithdelims ()k_i, \, k+1 - k_i}\nonumber \\&\qquad \times \varphi (x_i)\varphi (x_j) \frac{\texttt {T}^{(k_i)}_{t(1-u)}[f](x_i)}{\varphi (x_i)(t(1-u))^{k_i-1}} \frac{\texttt {T}^{(k+ 1 - k_i)}_{t(1-u)}[f](x_j)}{\varphi (x_j)(t(1-u))^{k - k_i} } \Bigg ]. \end{aligned}$$
(51)

Using similar arguments to those given previously in the proof of (49), we may again combine the induction hypothesis, simple combinatorics and dominated convergence to pass the limit as \(t\rightarrow \infty \) through the expectation and show that

$$\begin{aligned} {F[f](x,u)}&:= \lim _{t \rightarrow \infty }{F[f](x,u,t)} \nonumber \\&= (k+1)! (\langle {\tilde{\varphi }}, {\mathbb {V}}[\varphi ]\rangle /2)^{k - 1}\langle {\tilde{\varphi }}, f\rangle ^{k + 1} k \frac{(1-u)^{k-1}}{2\varphi (x)} {\mathbb {V}}[\varphi ](x), \end{aligned}$$
(52)

for which one uses that

$$\begin{aligned}&(k+1)! (\langle {\tilde{\varphi }}, {\mathbb {V}}[\varphi ]\rangle /2)^{k - 1}\langle {\tilde{\varphi }}, f\rangle ^{k + 1} k {\mathbb {V}}[\varphi ](x)\\&\quad = \beta (x){\mathcal {E}}_{x} \Bigg [ \sum _{i = 1}^N\sum _{\begin{array}{c} j = 1\\ j\ne i \end{array}}^N\sum _{k_i = 1}^{k}{k + 1 \atopwithdelims ()k_i, \, k+1 - k_i}\varphi (x_i)\varphi (x_j)\\&\qquad \times \frac{ k_i! \,\left\langle f,\tilde{\varphi }\right\rangle ^{k_i} \langle {\mathbb {V}}[\varphi ], {\tilde{\varphi }}\rangle ^{k_i-1} }{2^{(k_i-1)} } \frac{(k+ 1 - k_i)! \,\left\langle f,\tilde{\varphi }\right\rangle ^{k+ 1 - k_i} \langle {\mathbb {V}}[\varphi ], {\tilde{\varphi }}\rangle ^{k-k_i} }{2^{(k-k_i)} } \Bigg ]. \end{aligned}$$

Note that, thanks to the assumption (H2), the expression for \(F[f](x,u)\) clearly satisfies (A.1).

Subtracting the right-hand side of (52) from the right-hand side of (51), again appealing to the induction hypotheses, specifically the second statement in (18), it is not difficult to show that, for each \(\varepsilon \in (0,1)\),

$$\begin{aligned} {\lim _{t\rightarrow \infty }\sup _{x\in E, u\in [0,\varepsilon ), f\in B^+_1(E)}|\varphi (x)F[f](x,u,t) - \varphi (x)F[f](x,u)| = 0.} \end{aligned}$$

On the other hand, the first statement in the induction hypothesis (18) also implies that there exists a constant \(C_k>0\) (which depends on k but not \(\varepsilon \)) such that

$$\begin{aligned} {\lim _{t\rightarrow \infty }\sup _{x\in E, u\in [\varepsilon ,1], f\in B^+_1(E)}|\varphi (x)F[f](x,u,t) - \varphi (x)F[f](x,u)| \le C_k(1-\varepsilon )^{k-1} .} \end{aligned}$$

Since we may take \(\varepsilon \) arbitrarily close to 1, we conclude that (A.2) holds.

In conclusion, since the conditions of Theorem A.1 are now met, we get the two statements of (18) as a consequence. \(\square \)

2.4 Proof of Theorem 4

Next we turn our attention to the evolution equation generated by the k-th moment functional \(\texttt {M}^{(k)}_t\), \(t\ge 0\). To this end, we start by observing that

$$\begin{aligned} \texttt {M}^{(k)}_t[g](x) = (-1)^{k+1}\frac{\partial ^k}{\partial \theta ^k}\texttt {u}_t[0,\theta g](x) \bigg |_{\theta = 0}. \end{aligned}$$
(53)

Taking account of (35), we see that

$$\begin{aligned} \texttt {u}_t[0,\theta g](x)= - \int _{0}^{t}\texttt {T}_{s}\left[ \texttt {A}[\texttt {u}_{t-s}[0,\theta g]] - \theta g(1-\texttt {u}_{t-s}[0,\theta g]) \right] (x)\mathrm{d}s. \end{aligned}$$
(54)

Given the proximity of (54) to (42), it is easy to see that we can apply the same reasoning that we used for \(\texttt {T}^{(k)}_t[f](x)\) to \(\texttt {M}^{(k)}_t[g](x)\) and conclude that, for \(k\ge 2\),

$$\begin{aligned} \texttt {M}^{(k)}_t[g](x)&=\int _0^t \left( \texttt {T}_s\left[ \beta {{\hat{\eta }}}_{t-s}^{(k-1)}[g]\right] \!(x) - k\texttt {T}_s[g \texttt {M}^{(k-1)}_{t-s}[g]](x)\right) \mathrm{d}s, \end{aligned}$$
(55)

where \({{\hat{\eta }}}^{(k-1)}\) plays the role of \(\eta ^{(k-1)}\), albeit replacing the moment operators \(\texttt {T}^{(j)}\) by the moment operators \(\texttt {M}^{(j)}\).

We now proceed to prove Theorem 4, also by induction. First we consider the setting \(k = 1\). In that case,

$$\begin{aligned} \frac{1}{t}\texttt {M}^{(1)}_t[g](x) = \frac{1}{t}{\mathbb {E}}_{\delta _x}\left[ \int _0^t \langle g, X_s\rangle \mathrm{d}s \right] = \frac{1}{t}\int _0^t \texttt {T}_s[g](x)\mathrm{d}s = \int _0^1 \texttt {T}_{ut}[g](x)\mathrm{d}u. \end{aligned}$$

Referring now to Theorem A.1 in the Appendix, we can take \(F(x, u, t) = g(x)/\varphi (x)\); since \(g\in B^+(E)\), the conditions of the theorem are trivially met and hence

$$\begin{aligned} \lim _{t\rightarrow \infty } \sup _{x\in E, g\in B^+_1(E)}\left| \frac{1}{t\varphi (x)}\texttt {M}^{(1)}_t[g](x) - \langle g, {\tilde{\varphi }}\rangle \right| = 0. \end{aligned}$$

Note that this limit sets the scene for the polynomial growth in \(t^{n(k)}\) of the higher moments for some function n(k). If we are to argue by induction, whatever the choice of n(k), it must satisfy \(n(1)=1\).
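For a concrete feel for this \(k=1\) computation, the following numerical sketch uses a hypothetical two-state mean generator A with Perron root \(\lambda = 0\), so that \(\texttt {T}_t = \mathrm{e}^{tA}\) plays the role of the critical mean semigroup (none of the data comes from the paper), and shows \(t^{-1}\texttt {M}^{(1)}_t[g](x)\) approaching \(\langle g, \tilde{\varphi }\rangle \varphi (x)\).

```python
# Illustration (hypothetical finite-state example): the k = 1 occupation
# moment grows linearly, t^{-1} M^(1)_t[g](x) -> <g, phi_tilde> phi(x).
import numpy as np
from scipy.linalg import expm
from scipy.integrate import trapezoid

A = np.array([[-1.0, 1.0], [2.0, -2.0]])   # critical: Perron root 0
phi = np.array([1.0, 1.0])                  # right eigenvector, A @ phi = 0
phi_tilde = np.array([2.0, 1.0]) / 3.0      # left eigenvector, <phi, phi_tilde> = 1
g = np.array([0.3, 0.9])

for t in (10.0, 100.0, 1000.0):
    s = np.linspace(0.0, t, 2001)
    Ts_g = np.array([expm(si * A) @ g for si in s])  # T_s[g] on a grid in s
    M1 = trapezoid(Ts_g, s, axis=0)                  # M^(1)_t[g] = int_0^t T_s[g] ds
    print(f"t = {t:6.0f}:", M1 / t, "-> limit", (phi_tilde @ g) * phi)
```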

Next suppose that Theorem 4 holds for all integer moments up to and including \(k-1\). To make the inductive step, we follow similar arguments to the proof of Theorem 1. That is, we use Theorem A.1 and the induction hypothesis to show that the limit of the right-hand side of (55) is precisely the limit that appears in the statement of the theorem.

We have from (55) that

$$\begin{aligned} \frac{1}{t^{2k-1}} \texttt {M}^{(k)}_t[g](x)= & {} \frac{1}{t^{2k-1}} \int _0^t \texttt {T}_s\left[ \beta {{\hat{\eta }}}_{t-s}^{(k-1)}[g]\right] \!(x) \mathrm{d}s \nonumber \\&- \frac{1}{t^{2k-1}} \int _0^t k\texttt {T}_s[g \texttt {M}^{(k-1)}_{t-s}[g]](x) \mathrm{d}s. \end{aligned}$$
(56)

Let us first deal with the rightmost integral in (56). It can be written as

$$\begin{aligned}&\frac{1}{t^{2k-1}} \int _0^t k\texttt {T}_{s}\left[ g \texttt {M}^{(k-1)}_{t-s}[g]\right] (x)\mathrm{d}s \\&\quad = \int _0^1(1-u)^{2k-2}k\texttt {T}_{ut}\left[ g\frac{1}{(t(1-u))^{2k-2}} \texttt {M}^{(k-1)}_{t(1-u)}[g]\right] (x)\mathrm{d}u. \end{aligned}$$

Arguing as in the spirit of the proof of Theorem 1, our induction hypothesis ensures that

$$\begin{aligned} \lim _{t\rightarrow \infty } F[g](x, u, t) = \lim _{t\rightarrow \infty } \frac{k g(x)(1-u)^{2k-2}}{(t(1-u))^{2k-2}} \frac{\texttt {M}^{(k-1)}_{t(1-u)}[g](x)}{\varphi (x)} = 0 =: F[g](x, u) \end{aligned}$$

satisfies (A.1) and (A.2). Theorem A.1 thus tells us that, uniformly in \(x\in E\) and \(g\in B^+_1(E)\),

$$\begin{aligned} \lim _{t\rightarrow \infty }\frac{1}{t^{ 2k-1}} \int _0^t k\texttt {T}_s[g \texttt {M}^{(k-1)}_{t-s}[g]](x)\mathrm{d}s = 0. \end{aligned}$$
(57)

Now turning our attention to the first integral on the right-hand side of (56) and again following the style of the reasoning in the proof of Theorem 1, we can pull out the leading order terms, uniformly for \(x\in E\) and \(g\in B^+_1(E)\),

$$\begin{aligned}&\lim _{t\rightarrow \infty }\frac{1}{t^{2k-1}}\int _0^t \texttt {T}_s\left[ \beta {{\hat{\eta }}}_{t-s}^{(k-1)}[g]\right] \!(x)\mathrm{d}s \nonumber \\&\quad = \lim _{t\rightarrow \infty } \int _0^1\texttt {T}_{ut}\Bigg [\frac{\beta (\cdot )}{2 }(1-u)^{2k-2}{\mathcal {E}}_{\cdot } \Bigg [ \sum _{i = 1}^N\sum _{\begin{array}{c} j = 1\\ j\ne i \end{array}}^N\sum _{k_i = 1}^{k-1}{k \atopwithdelims ()k_i, \, k - k_i} \varphi (x_i)\varphi (x_j)\nonumber \\&\qquad \times \frac{\texttt {M}^{(k_i)}_{t(1-u)}[g](x_i)}{\varphi (x_i)( t(1-u) )^{2k_i-1}} \frac{\texttt {M}^{(k - k_i)}_{t(1-u)}[g](x_j)}{\varphi (x_j)( t(1-u) )^{2k-2k_i-1} } \Bigg ]\Bigg ](x) \mathrm{d}u. \end{aligned}$$
(58)

It is again worth noting here that, if the polynomial growth is to take the form \(t^{n(k)}\) with n(k) linear, then respecting \(n(1)=1\) and the correct distribution of the indices across (58) forces \(n(k) = 2k-1\).

Identifying

$$\begin{aligned} F[g](x, u, t)&= \frac{\beta (x)}{2\varphi (x)} (1-u)^{2k-2}{\mathcal {E}}_{x} \Bigg [ \sum _{i = 1}^N\sum _{\begin{array}{c} j = 1\\ j\ne i \end{array}}^N\sum _{k_i = 1}^{k-1}{k \atopwithdelims ()k_i, \, k - k_i} \varphi (x_i)\varphi (x_j)\nonumber \\&\qquad \times \frac{\texttt {M}^{(k_i)}_{t(1-u)}[g](x_i)}{\varphi (x_i)( t(1-u) )^{2k_i-1}} \frac{\texttt {M}^{(k - k_i)}_{t(1-u)}[g](x_j)}{\varphi (x_j)( t(1-u) )^{2k-2k_i-1} } \Bigg ], \end{aligned}$$

our induction hypothesis allows us to conclude that \(F[g](x,u): = \lim _{t\rightarrow \infty }F[g](x,u, t)\) exists and

$$\begin{aligned} \varphi (x)F[g](x,u)&= (1-u)^{2k-2}k!\frac{{\mathbb {V}}[\varphi ](x)}{2^{k-1}} \langle g, {\tilde{\varphi }}\rangle ^k \langle {\mathbb {V}}[\varphi ], {\tilde{\varphi }}\rangle ^{k-2}\sum _{\ell = 1}^{k-1}L_{\ell }L_{k-\ell }. \end{aligned}$$

Thanks to our induction hypothesis, we can also easily verify (A.1) and (A.2). Theorem A.1 now gives us the required uniform (in \(x\in E\) and \(g\in B^+_1(E)\)) limit

$$\begin{aligned} \lim _{t\rightarrow \infty }\frac{1}{t^{2k-1}}\int _0^t \texttt {T}_s\left[ \beta {{\hat{\eta }}}_{t-s}^{(k-1)}[g]\right] \!(x)\mathrm{d}s = \frac{k! \langle {\mathbb {V}}[\varphi ] , {\tilde{\varphi }} \rangle ^{k-1}\langle g, {\tilde{\varphi }}\rangle ^k}{2^{k-1}}L_k. \end{aligned}$$
(59)

Putting (59) together with (57) we get the statement of Theorem 4. \(\square \)

2.5 Remaining proofs in the non-critical cases

We now give an outline of the main steps in the proof of Theorem 1 for the sub- and supercritical cases. As previously mentioned, the ideas used in this section closely follow those presented in the previous section for the proof of the critical case and so we leave the details to the reader. We first note that the Perron–Frobenius behaviour in (H1) ensures the base case for the induction argument, regardless of the value of \(\lambda \). We thus turn to the inductive step, assuming the result holds for \(k -1\).

Proof of Theorem 2 (supercritical case)

From the evolution equation (41), noting that the term \(\mathrm{e}^{-\lambda k t}\texttt {T}_t[f^k](x)\) is of order \(\mathrm{e}^{-(k-1)\lambda t}\) and hence vanishes as \(t\rightarrow \infty \), we have

$$\begin{aligned}&\lim _{t \rightarrow \infty }\frac{\mathrm{e}^{-\lambda k t}}{\varphi (x)}\texttt {T}_t^{(k)}[f](x)\nonumber \\&\quad = \lim _{t \rightarrow \infty }\frac{\mathrm{e}^{-\lambda k t}}{\varphi (x)} \int _0^t \texttt {T}_s\left[ \beta {\mathcal {E}}_\cdot \left[ \sum _{[k_1, \ldots , k_N]_k^2}{k \atopwithdelims ()k_1, \ldots , k_N}\prod _{j = 1}^N \texttt {T}^{(k_j)}_{t-s}[f](x_j) \right] \right] (x) \mathrm{d}s. \end{aligned}$$
(60)

It then follows that

$$\begin{aligned}&\lim _{t \rightarrow \infty }\sup _{x\in E, f\in B^+_1(E)}\left| \varphi (x)^{-1}\mathrm{e}^{-k \lambda t}\texttt {T}^{(k)}_t[f](x) - \,\left\langle f,\tilde{\varphi }\right\rangle ^k { L_k}\right| \nonumber \\&\quad = \lim _{t \rightarrow \infty }\sup _{x\in E, f\in B^+_1(E)}\Bigg |\varphi (x)^{-1}\mathrm{e}^{-k \lambda t}\int _0^t \texttt {T}_s\left[ \beta {\mathcal {E}}_\cdot \left[ \sum _{[k_1, \ldots , k_N]_k^2}{k \atopwithdelims ()k_1, \ldots , k_N}\prod _{j = 1}^N \texttt {T}^{(k_j)}_{t-s}[f](x_j) \right] \right] (x) \mathrm{d}s \nonumber \\&\qquad - \int _0^t \mathrm{e}^{-(k-1) \lambda s}\mathrm{d}s \,\langle f, \tilde{\varphi }\rangle ^k\lambda (k-1)L_k \Bigg |. \end{aligned}$$
(61)

Noting that \(\textstyle \sum _{j=1}^N k_j =k\), we may again share the exponential term across the product in the right-hand side above as follows,

$$\begin{aligned}&\frac{\mathrm{e}^{-\lambda k t}}{\varphi (x)} \int _0^t \texttt {T}_s\left[ \beta {\mathcal {E}}_\cdot \left[ \sum _{[k_1, \ldots , k_N]_k^2}{k \atopwithdelims ()k_1, \ldots , k_N}\prod _{j = 1}^N \texttt {T}^{(k_j)}_{t-s}[f](x_j) \right] \right] (x) \mathrm{d}s \\&\quad = t\int _0^1 \mathrm{e}^{-\lambda (k-1) ut}\frac{\mathrm{e}^{-\lambda ut}}{\varphi (x)} \texttt {T}_{ut}\\&\qquad \Bigg [k!\beta {\mathcal {E}}_\cdot \Bigg (\sum _{[k_1, \ldots , k_N]_k^2}\prod _{j = 1}^N \varphi (x_j)\frac{\mathrm{e}^{-\lambda k_j t(1-u)}\texttt {T}_{t(1-u)}^{(k_j)}[f](x_j)}{k_j!\varphi (x_j)}\Bigg )\Bigg ](x) \mathrm{d}u. \end{aligned}$$

Combining this with (61) and changing variables in the final integral of the latter, we have

$$\begin{aligned}&\lim _{t \rightarrow \infty }\sup _{x\in E, f\in B^+_1(E)}\left| \varphi (x)^{-1}\mathrm{e}^{-k \lambda t}\texttt {T}^{(k)}_t[f](x) - \,\left\langle f,\tilde{\varphi }\right\rangle ^k L_k\right| \nonumber \\&\quad \le \lim _{t \rightarrow \infty } \sup _{x\in E, f\in B^+_1(E)} t\left| \int _0^1 \mathrm{e}^{-\lambda (k-1) ut} \Big (\frac{\mathrm{e}^{-\lambda ut}}{\varphi (x)} \texttt {T}_{ut}\left[ \varphi F[f](\cdot , u, t)\right] - \langle f, \tilde{\varphi }\rangle ^k \lambda (k - 1)L_k\Big ) \mathrm{d}u\right| , \end{aligned}$$
(62)

where we have defined

$$\begin{aligned} {F[f](x,u,t)}:= k!\frac{\beta (x)}{\varphi (x)}{\mathcal {E}}_x \left[ \sum _{[k_1, \ldots , k_N]_k^2}\prod _{j = 1}^N \varphi (x_j)\frac{\mathrm{e}^{-\lambda k_j t(1-u)}\texttt {T}_{t(1-u)}^{(k_j)}[f](x_j)}{k_j!\varphi (x_j)}\right] . \end{aligned}$$

It is easy to see that, pointwise in \(x\in E\) and \(u\in [0,1]\), using the induction hypothesis and (H2),

$$\begin{aligned} {F[f](x,u)}&: = \lim _{t\rightarrow \infty } {F[f](x,u,t)} \\&= \frac{\beta (x)}{\varphi (x)}{\mathcal {E}}_{x}\left[ \sum _{[k_1, \ldots , k_N]_k^{2}}{k \atopwithdelims ()k_1, \ldots , k_N} \prod _{\begin{array}{c} j = 1 \\ j : k_j > 0 \end{array}}^N\varphi (x_j)L_{k_j}\right] \langle f, {\tilde{\varphi }}\rangle ^k, \end{aligned}$$

where we have again used the fact that the \(k_j\)s sum to k to extract the \(\langle f, {\tilde{\varphi }}\rangle ^k\) term. Similarly to the critical setting we can also verify using the induction hypothesis and (H2) that (A.1) and (A.2) hold.

This is sufficient to note that, by using a triangle inequality in a similar spirit to the one found in (A.4) and appealing to (14) of the assumption (H1), we have that

$$\begin{aligned} \sup _{x\in E, u\in [0,1], {f\in B^+_1(E)}, t\ge 0 }\left| \frac{\mathrm{e}^{-\lambda ut}}{\varphi (x)} \texttt {T}_{ut}\left[ \varphi {F[f](\cdot ,u,t)}\right] - \langle f, {\tilde{\varphi }}\rangle ^k \lambda (k-1) L_k\right| <\infty . \end{aligned}$$

This means that, for t sufficiently large, we can control the modulus in the integral on the right-hand side of (62) by a global constant. The remainder of the integral yields a bound of \(\varepsilon (1-\mathrm{e}^{-\lambda (k - 1) t})/\lambda (k - 1)\le \varepsilon /\lambda (k - 1) \); since \(\varepsilon \) may be taken arbitrarily small, the claim follows. \(\square \)

Proof of Theorem 3 (subcritical case)

We now outline the subcritical case. First note that since we only compensate by \(\mathrm{e}^{-\lambda t}\), the term \(\texttt {T}_{t}[f^k](x)\) that appears in equation (41) does not vanish after the normalisation. Due to assumption (H1), we have

$$\begin{aligned} \lim _{t \rightarrow \infty }\varphi ^{-1}(x)\mathrm{e}^{-\lambda t}\texttt {T}_{t}[f^k](x) = \langle f^k, {\tilde{\varphi }}\rangle . \end{aligned}$$

Next we turn to the integral term in (41). Define \([k_1, \ldots , k_N]^n_k\), for \(2 \le n \le k\) to be the set of tuples \((k_1, \ldots , k_N)\) with exactly n positive terms and whose sum is equal to k. Similar calculations to those given above yield

$$\begin{aligned}&\frac{\mathrm{e}^{-\lambda t}}{\varphi (x)} \int _0^t \texttt {T}_s\left[ \beta {\mathcal {E}}_x\left[ \sum _{[k_1, \ldots , k_N]_k^2}{k \atopwithdelims ()k_1, \ldots , k_N}\prod _{j = 1}^N \texttt {T}^{(k_j)}_{t-s}[f](x_j) \right] \right] (x) \mathrm{d}s \nonumber \\&\quad = t\sum _{n = 2}^k\int _0^1{\mathrm{e}^{\lambda (n-1) u t}}\frac{\mathrm{e}^{-\lambda t(1-u)}}{\varphi (x)} \texttt {T}_{t(1-u)}\Bigg [ k!\beta {\mathcal {E}}_\cdot \nonumber \\&\quad \Bigg [\sum _{[k_1, \ldots , k_N]^n_k} \prod _{j = 1}^N \varphi (x_j) \frac{\mathrm{e}^{-\lambda ut}\texttt {T}^{(k_j)}_{ut}[f](x_j)}{k_j!\varphi (x_j)} \Bigg ] \Bigg ](x) \mathrm{d}u \end{aligned}$$
(63)

Again, we leave the details to the reader but the idea is that the induction hypothesis will take care of the product of the lower order moments and the second part of (14) in assumption (H1) will then take care of the asymptotic behaviour of the semigroup \(\texttt {T}_{t(1-u)}\). The second part of (14) allows one to control the difference between this term and its limit. In a similar manner to the final step in the proof of Theorem 2, the difference between (41) and its limit can be reduced to the limit as \(t\rightarrow \infty \) of \(\varepsilon ({1-\mathrm{e}^{-|\lambda | (n-1) t} })/|\lambda |(n - 1)\), which is bounded above by a constant multiple of \(\varepsilon \) and can thus be made arbitrarily small. \(\square \)

Proof of Theorem 5

For the case \(k = 1\), we have

$$\begin{aligned}&\left| \mathrm{e}^{-\lambda t}\varphi (x)^{-1}\int _0^t\right. \texttt {T}_s[g](x)\mathrm{d}s - \left. \frac{\left\langle g,{\tilde{\varphi }}\right\rangle }{\lambda } \right| \nonumber \\&\quad = \left| \mathrm{e}^{-\lambda t}t\int _0^1 \mathrm{e}^{\lambda ut} \left( \mathrm{e}^{-\lambda ut}\varphi (x)^{-1}\texttt {T}_{ut}[g](x) - \left\langle g,{\tilde{\varphi }}\right\rangle \right) \mathrm{d}u - \mathrm{e}^{-\lambda t}\frac{\left\langle g,{\tilde{\varphi }}\right\rangle }{\lambda }\right| \nonumber \\&\quad \le \mathrm{e}^{-\lambda t}t\int _0^1 \mathrm{e}^{\lambda ut}\left| \mathrm{e}^{-\lambda ut}\varphi (x)^{-1}\texttt {T}_{ut}[g](x) - \left\langle g,{\tilde{\varphi }}\right\rangle \right| \mathrm{d}u + \mathrm{e}^{-\lambda t}\frac{\left\langle g,{\tilde{\varphi }}\right\rangle }{\lambda }. \end{aligned}$$
(64)

Thanks to (H1) and similar arguments to those used in the proof of Theorem 2, we may choose t sufficiently large such that the modulus in the integral is bounded above by \(\varepsilon > 0\), uniformly in \(g \in B^+_1(E)\) and \(x \in E\). Then, the right-hand side of (64) is bounded above by \(\varepsilon \lambda ^{-1} (1-\mathrm{e}^{-\lambda t}) + \mathrm{e}^{-\lambda t}{\left\langle g,{\tilde{\varphi }}\right\rangle }/{\lambda }\). Since \(\varepsilon \) can be taken arbitrarily small, this gives the desired result and also pins down the initial value \(L_1 = 1/\lambda \).
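A companion numerical sketch for the supercritical case (again a hypothetical two-state example, now with Perron root \(\lambda = 1\)) illustrates the limit just proved, namely \(\mathrm{e}^{-\lambda t}\varphi (x)^{-1}\int _0^t \texttt {T}_s[g](x)\mathrm{d}s \rightarrow \langle g, \tilde{\varphi }\rangle /\lambda \), pinning down \(L_1 = 1/\lambda \).

```python
# Illustration (hypothetical supercritical two-state example) of the
# k = 1 limit in Theorem 5 with L_1 = 1/lambda.
import numpy as np
from scipy.linalg import expm
from scipy.integrate import trapezoid

A = np.array([[0.0, 1.0], [2.0, -1.0]])   # Perron root lambda = 1, phi = (1, 1)
lam = 1.0
phi_tilde = np.array([2.0, 1.0]) / 3.0    # left eigenvector, <phi, phi_tilde> = 1
g = np.array([0.3, 0.9])

for t in (5.0, 10.0, 20.0):
    s = np.linspace(0.0, t, 4001)
    Ts_g = np.array([expm(si * A) @ g for si in s])
    occ = trapezoid(Ts_g, s, axis=0)      # int_0^t T_s[g] ds
    print(f"t = {t:4.0f}:", np.exp(-lam * t) * occ, "->", phi_tilde @ g / lam)
```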

Now assume the result holds for all \(\ell \le k-1\). Reflecting on the proof of Theorem 2, we note that in this setting the starting point is almost identical, except that in place of (60) we must now evaluate the following limit, derived from (55):

$$\begin{aligned}&\lim _{t \rightarrow \infty }\frac{\mathrm{e}^{-\lambda k t}}{\varphi (x)}\texttt {M}_t^{(k)}[g](x) = \lim _{t \rightarrow \infty }\frac{\mathrm{e}^{-\lambda k t}}{\varphi (x)} \int _0^t \texttt {T}_s\nonumber \\&\quad \left[ \beta {\mathcal {E}}_\cdot \left[ \sum _{[k_1, \ldots , k_N]_k^2}{k \atopwithdelims ()k_1, \ldots , k_N}\prod _{j = 1}^N \texttt {M}^{(k_j)}_{t-s}[g](x_j) \right] \right] (x) \mathrm{d}s\nonumber \\&\quad - k \lim _{t \rightarrow \infty } \frac{\mathrm{e}^{-\lambda k t}}{\varphi (x)}\int _0^t \texttt {T}_s[g \texttt {M}^{(k-1)}_{t-s}[g]] (x)\mathrm{d}s. \end{aligned}$$
(65)

The first term on the right-hand side of (65) can be handled in essentially the same way as in the proof of Theorem 2. The second term on the right-hand side of (65) can easily be dealt with along now-familiar lines from earlier proofs, using the induction hypothesis; in particular, its limit is zero. Hence, combined with the first term on the right-hand side of (65), we recover the same recursion equation for \(L_k\). \(\square \)

Proof of Theorem 6

The case \(k = 1\) is relatively straightforward and, again in the interest of keeping things brief, we point the reader to the fact that, as \(t \rightarrow \infty \), we have

$$\begin{aligned} \frac{1}{\varphi (x)}\texttt {M}^{(1)}_t[g](x)= & {} \int _0^t \frac{\texttt {T}_s[g](x)}{\varphi (x)}\mathrm{d}s \nonumber \\= & {} \int _0^t \mathrm{e}^{\lambda s} \mathrm{e}^{-\lambda s}\frac{\texttt {T}_s[g](x)}{\varphi (x)}\mathrm{d}s \sim t\langle g, {\tilde{\varphi }}\rangle \int _0^1 \mathrm{e}^{\lambda ut }\mathrm{d}u \sim \frac{\langle g, {\tilde{\varphi }}\rangle }{|\lambda |}. \end{aligned}$$
(66)

Now suppose the result holds for all \(\ell \le k-1\). We again refer to (55), which means we are interested in handling a limit which is very similar to (65), now taking the form

$$\begin{aligned}&\frac{ \texttt {M}_t^{(k)}[g](x) }{\varphi (x)}\nonumber \\&\quad = \frac{ t}{\varphi (x)} \int _0^1\mathrm{e}^{\lambda ut} \mathrm{e}^{-\lambda ut} \texttt {T}_{ut}\nonumber \\&\qquad \left[ \beta {\mathcal {E}}_\cdot \left[ \sum _{[k_1, \ldots , k_N]_k^2}{k \atopwithdelims ()k_1, \ldots , k_N}\prod _{j = 1}^N\varphi (x_j)\frac{ \texttt {M}^{(k_j)}_{t(1-u)}[g](x_j)}{\varphi (x_j)} \right] \right] (x) \mathrm{d}u\nonumber \\&\qquad - k \frac{t}{\varphi (x)}\int _0^1\mathrm{e}^{\lambda ut}\mathrm{e}^{-\lambda ut} \texttt {T}_{ut}\left[ g\varphi \frac{\texttt {M}^{(k-1)}_{t(1-u)}[g]}{\varphi }\right] (x)\mathrm{d}u. \end{aligned}$$
(67)

Again skipping the details, combining (67) with the argument in (66) and the induction hypothesis gives us

$$\begin{aligned} \frac{ \texttt {M}_t^{(k)}[g](x) }{\varphi (x)}\sim & {} k!\frac{\langle g, {\tilde{\varphi }}\rangle ^k}{|\lambda |} \left\langle \beta {\mathcal {E}}\Bigg [\sum _{[k_1, \ldots , k_N]_k^2} \prod _{\begin{array}{c} j = 1 \\ j : k_j > 0 \end{array}}^N\varphi (x_j)L_{k_j}\Bigg ], {\tilde{\varphi }} \right\rangle \nonumber \\&-k !\frac{\langle g\varphi , {\tilde{\varphi }}\rangle \langle g, {\tilde{\varphi }}\rangle ^{k-1}}{|\lambda |} L_{k-1}, \end{aligned}$$
(68)

which gives us the required recursion for \(L_k\). \(\square \)

3 Proofs for superprocesses

For the proof of Theorems 1, 2 and 3 in the setting of superprocesses we follow a similar approach. One difference is that we cannot work with the k-th moment as a product of an almost surely finite sum. As such, the use of the Leibniz formula as in the previous section is no longer helpful. Instead, we use the Faà di Bruno formula (see Lemma A.1) to assist with multiple derivatives of the non-linear evolution equation (5).

3.1 Linear and non-linear semigroup equations

The evolution equation for the expectation semigroup \((\texttt {T}_t, t\ge 0)\) is well known and satisfies

$$\begin{aligned} \texttt {T}_t\left[ f\right] (x)=\texttt {P}_t[f](x) +\int _{0}^{t}\texttt {P}_s\left[ \beta (\texttt {m}[\texttt {T}_{t-s}[f]]-\texttt {T}_{t-s}[f])+b\, \texttt {T}_{t-s}[f] \right] (x)\mathrm{d}s, \end{aligned}$$
(69)

for \(t\ge 0\), \(x\in E\) and \(f\in B^+(E)\), where, with a meaningful abuse of our branching Markov process notation, we now define

$$\begin{aligned} \texttt {m}[f](x)&=\int _{M_0(E)}\left[ \gamma (x,\pi )\left\langle f,\pi \right\rangle +\int _{0}^{\infty }u\left\langle f,\pi \right\rangle n(x,\pi ,\mathrm{d}u)\right] G(x,\mathrm{d}\pi )\nonumber \\&=\gamma (x,f) + \int _{M(E)^\circ } \langle f, \nu \rangle \Gamma (x,\mathrm{d}\nu ). \end{aligned}$$
(70)

See for example equation (3.24) of [7].

In the spirit of Lemma 1 we can give a second representation of \(\texttt {T}_t[f]\) in terms of an auxiliary process, the so-called many-to-one formula. To this end, if, as before, we work with the process \((\xi , {\mathbf {P}})\) to represent the Markov process associated to the semigroup \((\texttt {P}_t, t\ge 0)\), then, although we have redefined the quantity \(\texttt {m}[f](x)\), we can still meaningfully work with the process \(({{\hat{\xi }}}, \hat{{\mathbf {P}}})\) as defined just before Lemma 1.

Lemma 3

Let \(\vartheta (x) = \texttt {B}(x)+b(x) = \beta (x)(\texttt {m}[1](x) - 1)+b(x)\). Then, for \(t\ge 0\) and \(f\in B^+(E)\),

$$\begin{aligned} \texttt {T}_t\left[ f\right] (x)=\hat{{\mathbf {E}}}_x\left[ \exp \left( \int _{0}^{t}\vartheta ({\hat{\xi }}_s)\mathrm{d}s\right) f({\hat{\xi }}_t)\right] . \end{aligned}$$
(71)

As with Lemma 1, the proof is classical, requiring only that we take the right-hand side of (71) and condition on the first extra jump of \(({{\hat{\xi }}}, \hat{{\mathbf {P}}})\) to show that it also solves (69). It is a straightforward application of Grönwall’s inequality to show that (69) has a unique solution and hence (71) holds. The reader will note that, because we have separated out the local and non-local branching mechanisms of the superprocess, the deliberate repeat definition of \(\texttt {m}[f]\) for superprocesses is only the analogue of its counterpart for branching Markov processes in the sense of non-local activity. The mean local branching rate has otherwise been singled out as the term b.

Similarly to the branching Markov process setting, let us re-write an extended version of the non-linear semigroup evolution \(({\texttt {V}}_t, t\ge 0)\), defined in (6), i.e. the natural analogue of (35), in terms of the linear semigroup \((\texttt {T}_t, t\ge 0)\). To this end, define

$$\begin{aligned} {\texttt {V}}_t\left[ f,g\right] (x)=-\log {\mathbb {E}}_{\delta _x}\left[ \mathrm{e}^{-\left\langle f,X_t\right\rangle -\int _{0}^{t}\left\langle g,X_s\right\rangle \mathrm{d}s}\right] , \qquad t \ge 0, \, x \in E, \, f, g \in B^+(E). \end{aligned}$$

Analogously to Lemma 2, we have the following result.

Lemma 4

For all \(f, g\in B^+(E)\), \(x \in E\) and \(t\ge 0\), the non-linear semigroup \(\texttt {V}_t[f, g](x)\) satisfies

$$\begin{aligned} {\texttt {V}}_t[f,g](x)= \texttt {T}_{t}[f](x) - \int _{0}^{t}\texttt {T}_{s}\left[ \texttt {J}[{\texttt {V}}_{t-s}[f,g]]-g{\texttt {V}}_{t-s}\left[ f,g\right] \right] (x)\mathrm{d}s, \end{aligned}$$
(72)

where, for \(h\in B^+(E)\) and \(x\in E\),

$$\begin{aligned} \texttt {J}[h](x) = \psi (x, h(x)) + \phi (x, h) + \beta (x)(\texttt {m}[h](x)-h(x)) + b(x)h(x). \end{aligned}$$

The proof is essentially the same as the proof of Lemma 2 and hence we leave the details to the reader.

3.2 Evolution equations for the k-th moment of a superprocess

Recall that we defined \(\texttt {T}_t^{(k)}\left[ f\right] (x):={\mathbb {E}}_{\delta _x}[\langle f,X_t\rangle ^k]\), \(t\ge 0\), \(f\in B^+(E)\), \(k\ge 1\). As with the setting of branching Markov processes, we want to establish an evolution equation for \((\texttt {T}_t^{(k)}, t\ge 0)\), from which we can establish the desired asymptotics. To this end, let us introduce the following notation.

For \(x\in E\), \(k\ge 2\) and \(t\ge 0\), define

$$\begin{aligned} R_k(x,t)&=\sum _{\left\{ m_1,\ldots ,m_{k-1}\right\} _k}\frac{k!}{m_1!\ldots m_{k-1}!}(-1)^{m_1+\cdots +m_{k-1}-1} \nonumber \\&\qquad (m_1+\cdots +m_{k-1}-1)! \prod _{j=1}^{k-1}\left( \frac{(-1)^j \texttt {T}_t^{(j)}\left[ f\right] (x)}{j!}\right) ^{m_j}, \end{aligned}$$
(73)

and

$$\begin{aligned} K_k(x,t)&= \sum _{\left\{ m_1,\ldots ,m_{k-1}\right\} _k}\frac{k!}{m_1!\ldots m_{k-1}!}\psi ^{(m_1+\cdots +m_{k-1})}(x,0+)\nonumber \\&\qquad \prod _{j=1}^{k-1}\left( \frac{(-1)^{j+1}\texttt {T}_t^{(j)}\left[ f\right] (x)-R_j(x,t)}{j!}\right) ^{m_j}, \end{aligned}$$
(74)

and finally

$$\begin{aligned} S_k(x,t)&=\int _{M(E)^{\circ }}\sum _{\left\{ m_1,\ldots ,m_{k-1}\right\} _k}\frac{k!}{m_1!\ldots m_{k-1}!}(-1)^{m_1+\cdots +m_{k-1}}\nonumber \\&\qquad \prod _{j=1}^{k-1}\left( \frac{\left\langle (-1)^{j+1}\texttt {T}_{t}^{(j)}\left[ f\right] -R_j(\cdot ,t),\nu \right\rangle }{j!}\right) ^{m_j}\Gamma (x,d\nu ), \end{aligned}$$
(75)

and the sums run over the set of non-negative integers \(\left\{ m_1,\ldots ,m_{k-1}\right\} \) such that \(m_1+2m_2+\cdots +(k-1) m_{k-1}=k\).
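The index sets \(\{m_1, \ldots , m_{k-1}\}_k\) are nothing more than integer partitions of k with no part of size k, recorded through their multiplicities. The short enumeration below (illustration only) makes this explicit.

```python
# Enumeration sketch (illustration only) of the Faa di Bruno index sets
# {m_1,...,m_{k-1}}_k in (73)-(75): non-negative integers satisfying
# m_1 + 2 m_2 + ... + (k-1) m_{k-1} = k.
from itertools import product

def faa_di_bruno_indices(k):
    ranges = [range(k // j + 1) for j in range(1, k)]   # m_j <= k // j
    return [m for m in product(*ranges)
            if sum((j + 1) * mj for j, mj in enumerate(m)) == k]

for k in range(2, 7):
    idx = faa_di_bruno_indices(k)
    # count = number of partitions of k minus the single partition (k) itself
    print(k, len(idx), idx if k <= 4 else '...')
```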

Theorem 7

Fix \(k\ge 2\). Suppose that (H1) and (H2) hold, with the additional assumption that

$$\begin{aligned} \sup _{x\in E, s\le t}\texttt {T}^{(\ell )}_s[f](x)<\infty ,\qquad \ell \le k-1, f\in B^+(E), t\ge 0. \end{aligned}$$
(76)

Then,

$$\begin{aligned} \texttt {T}_t^{(k)}\left[ f\right] (x)=(-1)^{k+1}R_k(x,t)+(-1)^{k}\int _{0}^{t}\texttt {T}_s\left[ U_k(\cdot ,t-s)\right] (x)\mathrm{d}s, \end{aligned}$$
(77)

where

$$\begin{aligned} U_k(x,t)=K_k(x,t)+\beta (x)S_k(x,t). \end{aligned}$$
(78)

Proof

First note that, similarly to the branching Markov process case, defining

$$\begin{aligned} {\mathsf {e}}_t[f](x):={\mathbb {E}}_{\delta _x}\left[ \mathrm{e}^{-\left\langle f,X_t\right\rangle }\right] ,\qquad t\ge 0, f\in B^+(E), \end{aligned}$$

we have

$$\begin{aligned} {\mathsf {e}}^{(k)}_t[\theta f](x): = \frac{\partial ^k}{\partial \theta ^k} {\mathsf {e}}_t[\theta f](x) = (-1)^k{\mathbb {E}}_{\delta _x}\left[ \left\langle f,X_t\right\rangle ^k\mathrm{e}^{-\theta \left\langle f,X_t\right\rangle }\right] \end{aligned}$$

and

$$\begin{aligned} {\mathsf {e}}^{(k)}_t[\theta f](x)|_{\theta = 0} =(-1)^k\texttt {T}_t^{(k)}\left[ f\right] (x). \end{aligned}$$
(79)

To prove (77), recall the definition (5) and let

$$\begin{aligned} \texttt {v}_t^{(k)}[f](x):= & {} \frac{\partial ^k}{\partial \theta ^k}{\texttt {V}}_t [\theta f, 0](x)\bigg |_{\theta =0} \\= & {} -\frac{\partial ^k}{\partial \theta ^k}\log {\mathsf {e}}_t[\theta f](x)\bigg |_{\theta =0}, \qquad t\ge 0, f\in B^+(E), k\ge 1. \end{aligned}$$

The idea is to use Lemma A.1 to obtain two equivalent expressions for \( \texttt {v}_t^{(k)}[f](x)\) that, when equated, yield (77).

To this end, note that due to Faà di Bruno’s Lemma A.1, we have

$$\begin{aligned}&\texttt {v}_t^{(k)}[f](x)\\&\quad = -\left. \frac{\partial ^k}{\partial \theta ^k}\log {\mathsf {e}}_t[\theta f](x)\right| _{\theta =0}\\&\quad =-\sum _{\left\{ m_1,\ldots ,m_k\right\} _k}\frac{k!}{m_1!\ldots m_k!}\frac{(-1)^{m_1+\cdots +m_k-1}(m_1+\cdots +m_k-1)!}{{\mathsf {e}}_t[\theta f](x)^{m_1+\cdots +m_k}}\left. \prod _{j=1}^{k}\left( \frac{{\mathsf {e}}^{(j)}_t[\theta f](x)}{j!}\right) ^{m_j}\right| _{\theta =0}\\&\quad =-\sum _{\left\{ m_1,\ldots ,m_k\right\} _k}\frac{k!}{m_1!\ldots m_k!}(-1)^{m_1+\cdots +m_k-1}(m_1+\cdots +m_k-1)!\prod _{j=1}^{k}\left( \frac{(-1)^j \texttt {T}_t^{(j)}\left[ f\right] (x)}{j!}\right) ^{m_j}, \end{aligned}$$

where the sum runs over the set of non-negative integers \(\left\{ m_1,\ldots ,m_k\right\} _k\) such that

$$\begin{aligned} m_1+2m_2+\cdots +km_k=k. \end{aligned}$$

Note that \(m_k>0\) if and only if \(m_k=1\) and \(m_1=m_2=\cdots =m_{k-1}=0\), so the k-th moment term \(\texttt {T}^{(k)}_t[f]\) appears only once and with a factor \((-1)^{k+1}\), that is,

$$\begin{aligned} {\texttt {v}_t^{(k)}[f]}(x)=(-1)^{k+1}\texttt {T}_t^{(k)}\left[ f\right] (x)-R_k(x,t), \end{aligned}$$
(80)

where all the terms in \(R_k(x,t)\) are products of two or more lower order moments. Thus it remains to show that

$$\begin{aligned} \texttt {v}_t^{(k)}[f](x) = -\int _{0}^{t}\texttt {T}_s\left[ U_k(\cdot ,t-s)\right] (x)\mathrm{d}s. \end{aligned}$$

Differentiating the evolution equation (72) k times at \(\theta =0\), momentarily not worrying about passing derivatives through integrals, we get

$$\begin{aligned} {\texttt {v}_{t}^{(k)}[f]}(x)&=-\int _{0}^{t}\texttt {T}_s\left[ \frac{\partial ^{k}}{\partial \theta ^k}\Big (\psi (\cdot ,{\texttt {V}}_{t-s} \left[ \theta f,0\right] (\cdot ))\right. \\&\qquad \left. \left. +\phi (\cdot ,{\texttt {V}}_{t-s} \left[ \theta f,0\right] ) +\texttt {F}[\texttt {V}_{t-s}[\theta f,0]] \Big ) \right| _{\theta =0}\right] (x) \mathrm{d}s , \end{aligned}$$

where

$$\begin{aligned} \texttt {F}[g](x) = \beta (x)(\texttt {m}[g](x)-g(x))+b(x)g(x), \qquad x\in E, g\in B^+(E). \end{aligned}$$

We first deal with the k-th derivative of the term involving \(\psi \) in the above integral. For this, we again use Lemma A.1 to get

$$\begin{aligned}&\left. \frac{\partial ^{k}}{\partial \theta ^k}\psi \left( x,{\texttt {V}}_t\left[ \theta f,0\right] (x)\right) \right| _{\theta =0}\\&\quad =\sum _{\left\{ m_1,\ldots ,m_k\right\} _k}\frac{k!}{m_1!\ldots m_k!}\psi ^{(m_1+\cdots +m_k)}(x,{\texttt {V}}_t\left[ \theta f,0\right] )\\&\qquad \quad \left. \prod _{j=1}^{k}\left( \frac{\tfrac{\partial ^j}{\partial \theta ^j}{\texttt {V}}_t\left[ \theta f,0\right] (x)}{j!}\right) ^{m_j}\right| _{\theta =0}\\&\quad =\sum _{\left\{ m_1,\ldots ,m_k\right\} _k}\frac{k!}{m_1!\ldots m_k!}\psi ^{(m_1+\cdots +m_k)}(x,0+)\prod _{j=1}^{k}\left( \frac{\texttt {v}_t^{(j)}{[f](x)}}{j!}\right) ^{m_j}\\&\\&\quad =-b(x)\texttt {v}_t^{(k)}{\left[ f\right] }(x)+K_k(x,t), \end{aligned}$$

where the last equality holds because \(m_k>0\) forces \(m_k=1\) and \(m_1=\cdots =m_{k-1}=0\), and \(\psi '(x,0+)=-b(x)\).

Similarly, for the k-th derivative of the remaining terms, recalling (8), (9) and (70), we have

$$\begin{aligned}&\frac{\partial ^k}{\partial \theta ^k}\Big (\phi (x,{\texttt {V}}_t \left[ \theta f,0\right] ) +\texttt {F}[{\texttt {V}}_t \left[ \theta f,0\right] ] \Big )\\&\quad =b(x)\frac{\partial ^k}{\partial \theta ^k}{\texttt {V}}_t \left[ \theta f,0\right] \\&\qquad -\beta (x)\int _{M_0(E)}\int _{0}^{\infty }\frac{\partial ^k}{\partial \theta ^k}\left( 1-\mathrm{e}^{-u\left\langle {\texttt {V}}_t \left[ \theta f,0\right] ,\pi \right\rangle } -u\left\langle {\texttt {V}}_t \left[ \theta f,0\right] ,\pi \right\rangle \right) \\&\qquad \quad n(x,\pi ,\mathrm{d}u)G(x,\mathrm{d}\pi ). \end{aligned}$$

Using Lemma A.1 yields

$$\begin{aligned}&\frac{\partial ^k}{\partial \theta ^k}\left( 1-\mathrm{e}^{-u\left\langle {\texttt {V}}_t \left[ \theta f,0\right] ,\pi \right\rangle } -u\left\langle {\texttt {V}}_t \left[ \theta f,0\right] ,\pi \right\rangle \right) \\&\quad =\sum _{\left\{ m_1,\ldots ,m_{k-1}\right\} _k}\frac{k!}{m_1!\ldots m_{k-1}!} (-1)^{m_1+\cdots +m_{k-1}+1}\mathrm{e}^{-u\left\langle {\texttt {V}}_t \left[ \theta f,0\right] ,\pi \right\rangle }\\&\qquad \prod _{j=1}^{k-1}\left( \frac{u\left\langle \frac{\partial ^j}{\partial \theta ^j }{\texttt {V}}_t \left[ \theta f,0\right] ,\pi \right\rangle }{j!}\right) ^{m_j}\\&\qquad + \left( \mathrm{e}^{-u\left\langle {\texttt {V}}_t \left[ \theta f,0\right] ,\pi \right\rangle } -1\right) u\left\langle \frac{\partial ^k}{\partial \theta ^k }{\texttt {V}}_t \left[ \theta f,0\right] ,\pi \right\rangle , \end{aligned}$$

where we have singled out the term in the Faà di Bruno formula corresponding to \(m_k=1\) and \(m_1=\cdots =m_{k-1}= 0\). Using the definition of \(\texttt {m}[f](x)\) in (70) and the same observation as above about the \(m_j\)'s, we get

$$\begin{aligned} \left. \frac{\partial ^k}{\partial \theta ^k}\Big (\phi (x,{\texttt {V}}_t \left[ \theta f,0\right] ) +\texttt {F}[{\texttt {V}}_t \left[ \theta f,0\right] ]\Big )\right| _{\theta =0} =b(x)\texttt {v}_{t}^{(k)}{[f]}(x) +\beta (x)S_k(x,t).\qquad \end{aligned}$$
(81)

Putting the pieces together, we obtain

$$\begin{aligned} \texttt {v}_t^{(k)}{[f]}(x)=-\int _{0}^{t}\texttt {T}_s\left[ U_k(\cdot ,t-s)\right] (x)\mathrm{d}s. \end{aligned}$$
(82)

Combining this with equation (80) yields

$$\begin{aligned} (-1)^{k+1}\texttt {T}_t^{(k)}\left[ f\right] (x)=R_k(x,t)-\int _{0}^{t}\texttt {T}_s\left[ U_k(\cdot ,t-s)\right] (x)\mathrm{d}s, \end{aligned}$$

which is the desired result.

There is one final matter we must attend to, which is the ability to move derivatives through integrals. In this setting, this follows from the assumption (76), (H2) and the Lévy-Khintchine-type formulae for \(\psi \) and \(\phi \). \(\square \)

3.3 Completing the proof of Theorem 1: critical case

We will prove Theorem 1 for superprocesses using induction, similarly to the setting of branching Markov processes. The case \(k=1\) follows from assumption (H1).

Now assume that the statement of Theorem 1 holds in the superprocess setting for all \(\ell \le k\). Our aim is to prove that the result holds for \(k+1\). Using Theorem 7 and a change of variables, we have that

$$\begin{aligned} \frac{1}{\varphi (x)t^{k}}\texttt {T}^{(k+1)}_t\left[ f\right] (x)= & {} \frac{(-1)^{k}}{\varphi (x)t^{k}}R_{k+1}(x,t)\nonumber \\&+\frac{(-1)^{k+1}}{\varphi (x)t^{k-1}}\int _{0}^{1}\texttt {T}_{st}\left[ U_{k+1}(\cdot ,t(1-s))\right] (x)\mathrm{d}s, \end{aligned}$$
(83)

where R and U were defined in equations (73) and (78), respectively. As with the particle system, we first aim to simplify the right-hand side before showing that its limit agrees with the expression given in the statement of the theorem. In particular, we will first show that, for each \(x\in E\), the limit of the right-hand side of (83) equals

$$\begin{aligned} \lim _{t\rightarrow \infty }\frac{1}{2\varphi (x)t^{k-1}}\int _{0}^{1}\texttt {T}_{st}\left[ K_{k+1}^{(2)}(\cdot ,t(1-s))+\beta (\cdot )S^{(2)}_{k+1}(\cdot ,t(1-s))\right] (x)\mathrm{d}s, \end{aligned}$$
(84)

where

$$\begin{aligned} K_{k+1}^{(2)}(x,t):=\sum _{\left\{ k_1,k_2\right\} ^{+}}\frac{(k+1)!}{k_1!k_2!}\psi ''(x,0+)\texttt {T}_t^{(k_1)}\left[ f\right] (x)\texttt {T}_t^{(k_2)}\left[ f\right] (x) \end{aligned}$$
(85)

and

$$\begin{aligned} S_{k+1}^{(2)}(x,t):=\int _{M(E)^{\circ }}\sum _{\left\{ k_1,k_2\right\} ^{+}}\frac{(k+1)!}{k_1!k_2!}\langle \texttt {T}_t^{(k_1)}\left[ f\right] , \nu \rangle \langle \texttt {T}_t^{(k_2)}\left[ f\right] , \nu \rangle \Gamma (x,d\nu ), \end{aligned}$$
(86)

where \(\left\{ k_1,k_2\right\} ^{+}\) is defined to be the set of ordered pairs of positive integers \((k_1,k_2)\) such that \(k_1+k_2=k+1\).
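For example, when \(k+1=4\),

$$\begin{aligned} \left\{ k_1,k_2\right\} ^{+}=\left\{ (1,3),\,(2,2),\,(3,1)\right\} . \end{aligned}$$

In general, \(\left\{ k_1,k_2\right\} ^{+}\) contains exactly k pairs, a count that will be used in the limit computations below.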

To this end, writing \(c(m_1,\ldots ,m_{k})\) for the constants preceding the product summands in (73), observe that

$$\begin{aligned}&\lim _{t\rightarrow \infty }\frac{1}{t^{k}}R_{k+1}(x,t)\\&\quad =\lim _{t\rightarrow \infty }\frac{(k+1)!}{t^{k}}\sum _{\left\{ m_1,\ldots ,m_{k}\right\} _{k+1} }c(m_1,\ldots ,m_{k})\prod _{j=1}^{k}\left( \frac{(-1)^j \texttt {T}_t^{(j)}\left[ f\right] (x)}{j!}\right) ^{m_j}\\&\quad =(-1)^{k+1}(k+1)!\lim _{t\rightarrow \infty }\sum _{\left\{ m_1,\ldots ,m_{k}\right\} _{k+1} }\frac{c(m_1,\ldots ,m_{k})}{t^{m_1+\cdots +m_{k}-1}}\prod _{j=1}^{k}\left( \frac{1}{j!}\frac{\texttt {T}_t^{(j)}\left[ f\right] (x)}{t^{j-1}}\right) ^{m_j}=0, \end{aligned}$$

where the final equality is due to the induction hypothesis and the fact that \(m_1+\cdots +m_{k}>1\), which follows from the constraint \(m_1+2m_2+\cdots +km_k=k+1\) (if \(m_1+\cdots +m_k=1\), a single \(m_j=1\) would require \(j=k+1\), which is impossible since \(j\le k\); the regrouping of the powers of t uses \(\sum _{j=1}^{k}(j-1)m_j=(k+1)-(m_1+\cdots +m_k)\)). Note, moreover, that the induction hypothesis ensures that the limit is uniform in \(x\in E\) and, in fact, that

$$\begin{aligned} \sup _{t\ge 0, x\in E} \frac{1}{t^{\ell -1}}R_{\ell }(x,t)<\infty \text { and }\lim _{t\rightarrow \infty } \sup _{x\in E} \frac{1}{t^{\ell -1}}R_{\ell }(x,t)=0 \qquad \ell = 1,\ldots , k+1.\nonumber \\ \end{aligned}$$
(87)

We now return to (83) to deal with the term involving \(U_{k+1}\), which we recall is a linear combination of \({ K}_{k+1}\) and \(S_{k+1}\), defined in (73) and (75), respectively. Note that if the index of a summand in either \({K}_{k+1}\) or \(S_{k+1}\) satisfies \(m_1+\cdots +m_k\ge 3\), the limit of that summand, when renormalised by \(1/t^{k-1}\), will be zero. In essence, the argument here is analogous to those that led to (49) in the branching Markov process setting. This implies that the only terms in the sums of (73) and (75) that remain in the limit of (83) are those for which \(m_{k_1}=m_{k_2}=1\) and \(m_j=0\) for all \(j\ne k_1,k_2\), with \(k_1<k_2\) such that \(k_1+k_2=k+1\), and, if \(k+1\) is even, the terms in which \(m_{(k+1)/2}=2\) and \(m_j=0\) for all \(j\ne (k+1)/2\).
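For illustration, when \(k+1=4\), the surviving indices are \((m_1,m_3)=(1,1)\) and \(m_2=2\), corresponding (up to constants) to the products

$$\begin{aligned} \texttt {T}_t^{(1)}\left[ f\right] (x)\,\texttt {T}_t^{(3)}\left[ f\right] (x)\qquad \text {and}\qquad \left( \texttt {T}_t^{(2)}\left[ f\right] (x)\right) ^{2}, \end{aligned}$$

while the remaining indices in \(\left\{ m_1,m_2,m_3\right\} _4\), namely \(m_1=4\) and \((m_1,m_2)=(2,1)\), have \(m_1+m_2+m_3\ge 3\) and are washed out by the normalisation \(1/t^{k-1}\).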

Let us now convert all of the above heuristics into rigorous computation, for which we appeal to Theorem A.1. We write

$$\begin{aligned} {F[f](x,s,t)}:=\frac{1}{\varphi (x)t^{k-1}}\left( K_{k+1}^{(3+)}(x,t(1-s))+\beta (x)S_{k+1}^{(3+)}(x,t(1-s))\right) , \end{aligned}$$
(88)

where \(K_{k+1}^{(3+)}\) and \(S_{k+1}^{(3+)}\) contain the terms in \(K_{k+1}\) and \(S_{k+1}\), respectively, for which the sum \(m_1+\cdots +m_k\) is greater than or equal to 3. We will prove that \(\lim _{t\rightarrow \infty }{F[f](x,s,t)}=0\) and that (A.1) and (A.2) hold.

Due to (16) and the boundedness of \(\varphi \), dominated convergence implies that

$$\begin{aligned}&\lim _{t\rightarrow \infty }\frac{1}{\varphi (x)t^{k-1}}K_{k+1}^{(3+)}(x,t(1-s))\\&\quad =\frac{(k+1)!}{\varphi (x)}\sum _{\left\{ m_1,\ldots ,m_k\right\} ^{3}_{k+1}}\frac{\psi ^{(m_1+\cdots +m_k)}(x,0+)}{m_1!\ldots m_k!}\\&\quad \lim _{t\rightarrow \infty }\frac{1}{t^{m_1+\cdots +m_k-2}}\prod _{j=1}^{k}\left( \frac{(-1)^{j+1}\texttt {T}_{t(1-s)}^{(j)}\left[ f\right] (x)-R_j(x,t(1-s))}{j!t^{j-1}}\right) ^{m_j}, \end{aligned}$$

where the set \(\left\{ m_1,\ldots ,m_k\right\} ^{3}_{k+1}\) is the subset of \(\left\{ m_1,\ldots ,m_k\right\} _{k+1}\) for which \(m_1+\cdots +m_k\ge 3\). Using the induction hypothesis and (87), we get that the right-hand side above is zero. The same arguments also imply that the limit of \(S_{k+1}^{(3+)}\) is zero. Thus \({F[f](x,s)}:= \lim _{t\rightarrow \infty } {F[f](x,s,t)} = 0\). The condition (A.1) trivially holds. For (A.2), the required uniformity follows from the induction hypothesis and (87).

Using Theorem A.1 in the Appendix, we conclude that

$$\begin{aligned} \lim _{t\rightarrow \infty }\sup _{x\in E}\left| \frac{1}{\varphi (x) t^{k-1}}\int _{0}^{1}\texttt {T}_{st}\left[ \varphi {F[f](\cdot ,s,t)}\right] (x)\mathrm{d}s\right| =0. \end{aligned}$$

Let us now define \(\{k_1<k_2\} \) to be the set of indices in \(K_{k+1}\) for which \(m_{k_1}=m_{k_2}=1\), with \(k_1<k_2\) such that \(k_1+k_2=k+1\), and \(m_j=0\) for all other indices; in the case where \(k+1\) is even, the remaining surviving index, with \(m_{(k+1)/2}=2\) and \(m_j=0\) for all \(j\ne (k+1)/2\), is accounted for separately below. Restricting the sum in \(K_{k+1}\) to these surviving indices, we get

$$\begin{aligned}&\sum _{\{k_1<k_2\}}\frac{(k+1)!}{k_1!k_2!}\psi ''(x,0+)(-1)^{k+1}\texttt {T}_t^{(k_1)}\left[ f \right] (x)\texttt {T}_t^{(k_2)}\left[ f\right] (x)\\&\qquad +{\mathbf {1}}_{(k+1\text { is even})} \frac{1}{2}\left( {\begin{array}{c}k+1\\ (k+1)/2\end{array}}\right) \psi ''(x,0+)(-1)^{k+1}\left( \texttt {T}_t^{((k+1)/2)}\left[ f\right] (x)\right) ^2\\&\quad =\frac{(-1)^{k+1}}{2}\sum _{\left\{ k_1,k_2\right\} ^{+}}\frac{(k+1)!}{k_1!k_2!}\psi ''(x,0+)\texttt {T}_t^{(k_1)}\left[ f\right] (x)\texttt {T}_t^{(k_2)}\left[ f\right] (x)=\frac{(-1)^{k+1}}{2}K_{k+1}^{(2)}(x,t), \end{aligned}$$

where we recall \(\left\{ k_1,k_2\right\} ^{+}\) is the set of ordered pairs of positive integers \((k_1,k_2)\) such that \(k_1+k_2=k+1\). Similarly, restricting the sum in \(S_{k+1}\) to the same set of indices, we obtain

$$\begin{aligned}&\int _{M(E)^{\circ }}\sum _{\{k_1<k_2\}}\frac{(k+1)!}{k_1!k_2!}(-1)^{k+1} \langle \texttt {T}_t^{(k_1)}\left[ f \right] , \nu \rangle \langle \texttt {T}_t^{(k_2)}\left[ f\right] , \nu \rangle \Gamma (x,d\nu )\\&\qquad +{\mathbf {1}}_{(k+1\text { is even})}\int _{M(E)^{\circ }}\left( {\begin{array}{c}k+1\\ (k+1)/2\end{array}}\right) \frac{(-1)^{k+1}}{2} \langle \texttt {T}_t^{((k+1)/2)}\left[ f\right] , \nu \rangle ^{2}\Gamma (x,d\nu )\\&\quad =\frac{(-1)^{k+1}}{2}\int _{M(E)^{\circ }}\sum _{\left\{ k_1,k_2\right\} ^{+}}\frac{(k+1)!}{k_1!k_2!} \langle \texttt {T}_t^{(k_1)}\left[ f\right] , \nu \rangle \langle \texttt {T}_t^{(k_2)}\left[ f\right] ,\nu \rangle \Gamma (x,d\nu )=\frac{(-1)^{k+1}}{2}S_{k+1}^{(2)}(x,t). \end{aligned}$$

This shows that the right-hand side of (83) has the same limit as (84).

To conclude the proof, we will again use Theorem A.1 to show that the limit of (84) is precisely the expression involving \({\mathbb {V}}\) given in the statement of the theorem. To this end, define

$$\begin{aligned} {F[f](x,s,t)}:=\frac{1}{2\varphi (x)t^{k-1}}\left( K_{k+1}^{(2)}(x,t(1-s))+\beta (x)S_{k+1}^{(2)}(x,t(1-s))\right) . \end{aligned}$$
(89)

Due to (16) and the induction hypothesis,

$$\begin{aligned}&\lim _{t\rightarrow \infty }\frac{1}{2\varphi (x)t^{k-1}}K_{k+1}^{(2)}(x,t(1-s))\\&\quad =\frac{(1-s)^{k-1}}{2\varphi (x)}\sum _{\left\{ k_1,k_2\right\} ^{+}}\frac{(k+1)!}{k_1!k_2!}\psi ''(x,0+)\lim _{t\rightarrow \infty }\frac{\texttt {T}_{t(1-s)}^{(k_1)}\left[ f\right] (x)}{(t(1-s))^{k_1-1}}\frac{\texttt {T}_{t(1-s)}^{(k_2)}\left[ f\right] (x)}{(t(1-s))^{k_2-1}}\\&\quad =(1-s)^{k-1}\varphi (x)\sum _{\left\{ k_1,k_2\right\} ^{+}}(k+1)!2^{-k}\psi ''(x,0+)\left\langle f,\tilde{\varphi }\right\rangle ^{k+1}\left\langle {\mathbb {V}}\left[ \varphi \right] ,\tilde{\varphi }\right\rangle ^{k-1}\\&\quad =k(1-s)^{k-1}\varphi (x)(k+1)!2^{-k}\psi ''(x,0+)\left\langle f,\tilde{\varphi }\right\rangle ^{k+1}\left\langle {\mathbb {V}}\left[ \varphi \right] ,\tilde{\varphi }\right\rangle ^{k-1}, \end{aligned}$$

where the last equality holds because the number of ordered pairs of positive integers summing to \(k+1\), that is, the number of elements of \(\left\{ k_1,k_2\right\} ^{+}\), is equal to k. To obtain the limit for \(S_{k+1}^{(2)}\), we use (16), the induction hypothesis, dominated convergence and linearity to obtain

$$\begin{aligned}&\lim _{t\rightarrow \infty }\frac{S_{k+1}^{(2)}(x,t(1-s))}{2\varphi (x)t^{k-1}}\\&\quad =\frac{(1-s)^{k-1}}{2\varphi (x)}\int _{M(E)^{\circ }}\sum _{\left\{ k_1,k_2\right\} ^{+}}\frac{(k+1)!}{k_1!k_2!} \lim _{t\rightarrow \infty }\frac{\langle \texttt {T}_{t(1-s)}^{(k_1)}\left[ f\right] , \nu \rangle }{(t(1-s))^{k_1-1}} \frac{\langle \texttt {T}_{t(1-s)}^{(k_2)}\left[ f\right] , \nu \rangle }{(t(1-s))^{k_2-1}}\Gamma (x,d\nu )\\&\quad =\frac{(1-s)^{k-1}}{2^{k}\varphi (x)}\int _{M(E)^{\circ }}\sum _{\left\{ k_1,k_2\right\} ^{+}}(k+1)!\left\langle f,\tilde{\varphi }\right\rangle ^{k+1}\left\langle {\mathbb {V}}\left[ \varphi \right] ,\tilde{\varphi }\right\rangle ^{k-1}\left\langle \varphi ,\nu \right\rangle ^2\Gamma (x,d\nu )\\&\quad =\frac{k(1-s)^{k-1}}{2^{k}\varphi (x)}(k+1)!\left\langle f,\tilde{\varphi }\right\rangle ^{k+1}\left\langle {\mathbb {V}}\left[ \varphi \right] ,\tilde{\varphi }\right\rangle ^{k-1}\int _{M(E)^{\circ }}\left\langle \varphi ,\nu \right\rangle ^2\Gamma (x,d\nu ). \end{aligned}$$

Combining these two limits, we get that

$$\begin{aligned} {F[f](x,s)}:= & {} \lim _{t\rightarrow \infty }{F[f](x,s,t)}\\= & {} \frac{k(1-s)^{k-1}}{\varphi (x)}\frac{(k+1)!}{2^k}\left\langle f,\tilde{\varphi }\right\rangle ^{k+1}\left\langle {\mathbb {V}}\left[ \varphi \right] ,\tilde{\varphi }\right\rangle ^{k-1}{\mathbb {V}}\left[ \varphi \right] (x). \end{aligned}$$

To complete the proof, it remains to verify assumptions (A.1) and (A.2) in order to apply Theorem A.1 to (84). By now the reader will be familiar with the arguments required to check these assumptions and thus we omit the details. Hence, it follows that

$$\begin{aligned} \lim _{t\rightarrow \infty }\frac{1}{\varphi (x)t^{k}}\texttt {T}^{(k+1)}_t\left[ f\right] (x)&=\frac{(k+1)!}{2^k}\left\langle f,\tilde{\varphi }\right\rangle ^{k+1}\left\langle {\mathbb {V}}\left[ \varphi \right] ,\tilde{\varphi }\right\rangle ^{k-1}\\&\qquad \int _{0}^{1}k(1-s)^{k-1}\left\langle {\mathbb {V}}\left[ \varphi \right] ,\tilde{\varphi }\right\rangle \mathrm{d}s\\&=\frac{(k+1)!}{2^k}\left\langle f,\tilde{\varphi }\right\rangle ^{k+1}\left\langle {\mathbb {V}}\left[ \varphi \right] ,\tilde{\varphi }\right\rangle ^{k}, \end{aligned}$$

where the limit is uniform in \(x\in E\). Moreover, \(\sup _{t\ge 0, x\in E}\texttt {T}^{(k+1)}_t\left[ f\right] (x)/(\varphi (x)t^{k})<\infty \).

3.4 Proofs for moments in the non-critical cases

In this section we present the main ideas behind the proofs of Theorems 2 and 3. The methods follow a similar reasoning to the critical case and the details are left to the reader. The base case is given by the Perron Frobenius behaviour in (H1) for both the sub- and supercritical cases. Thus, we assume the result for \(k-1\) and proceed to give the outline of the inductive step of the argument.

Proof of Theorem 2 (supercritical case)

The main difference here, compared to the critical case, is that all the terms in \(R_{k}(x,t)\) survive the normalisation \(\mathrm{e}^{-\lambda k t}\), since the exponential factor distributes across the product. From the evolution equation (77) and the definition of \(L_k\), we have that

$$\begin{aligned}&|\varphi (x)^{-1}\mathrm{e}^{-\lambda k t}\texttt {T}_t^{(k)}\left[ f\right] (x)-k!\left\langle f,\tilde{\varphi }\right\rangle ^{k}L_k(x) |\\&\quad \le \left| \varphi (x)^{-1}\mathrm{e}^{-\lambda k t}(-1)^{k+1}R_k(x,t)\right. \\&\qquad \left. -\left\langle f,\tilde{\varphi }\right\rangle ^{k}\sum _{\left\{ m_1,\ldots , m_{k-1}\right\} _k}\frac{k!}{m_1!\ldots m_{k-1}!}(m_1+\cdots +m_{k-1}-1)!\varphi (x)^{m_1+\cdots +m_{k-1}-1}\right. \\&\left. \qquad \prod _{j=1}^{k-1}\left( -L_j(x)\right) ^{m_j}\right| \\&\quad +\left| \varphi (x)^{-1}\mathrm{e}^{-\lambda kt}(-1)^{k}\int _{0}^{t}\texttt {T}_s\left[ U_k(\cdot ,t-s)\right] (x)\mathrm{d}s - \frac{k!\left\langle f,\tilde{\varphi }\right\rangle ^{k}}{\lambda (k-1)}\left\langle {\mathbb {V}}_k\left[ \varphi \right] ,\tilde{\varphi }\right\rangle \right| . \end{aligned}$$

The first term on the right-hand side goes to zero uniformly since

$$\begin{aligned}&\frac{\mathrm{e}^{-\lambda k t}}{\varphi (x)}(-1)^{k+1}R_{k}(x,t)=\sum _{\left\{ m_1,\ldots ,m_{k-1}\right\} _k}\frac{k!}{m_1!\ldots m_{k-1}!}(-1)^{m_1+\cdots +m_{k-1}}\\&\quad \times (m_1+\cdots +m_{k-1}-1)!\prod _{j=1}^{k-1}\left( \frac{\mathrm{e}^{-\lambda j t}\texttt {T}_t^{(j)}\left[ f\right] (x)}{\varphi (x)j!}\right) ^{m_j}\varphi (x)^{m_1+\cdots +m_{k-1}-1}, \end{aligned}$$

and the induction hypothesis implies that

$$\begin{aligned}&\lim _{t\rightarrow \infty }\frac{\mathrm{e}^{-\lambda k t}}{\varphi (x)}(-1)^{k+1}R_k(x,t)=\left\langle f,\tilde{\varphi }\right\rangle ^{k}\sum _{\left\{ m_1,\ldots ,m_{k-1}\right\} _k}\frac{k!}{m_1!\ldots m_{k-1}!}\nonumber \\&\qquad \times (m_1+\cdots +m_{k-1}-1)!\varphi (x)^{m_1+\cdots +m_{k-1}-1}\prod _{j=1}^{k-1}\left( -L_j{(x)}\right) ^{m_j}. \end{aligned}$$
(90)

For the second term, define for \(k\ge 2\)

$$\begin{aligned} I_k(x,t):=\int _{0}^{t}\texttt {T}_s\left[ U_k(\cdot ,t-s)\right] (x)\mathrm{d}s=(-1)^{{k}}\texttt {T}_t^{({k})}\left[ f\right] (x)+R_{{k}}(x,t). \end{aligned}$$

We will use induction to prove that for \(k\ge 2\)

$$\begin{aligned} \lim _{t\rightarrow \infty }\sup _{x\in E,f\in B^{+}_1(E)}\left| \frac{\mathrm{e}^{-\lambda k t}}{\varphi (x)}(-1)^{k}I_k{(x,t)}-\frac{k!\left\langle f,\tilde{\varphi }\right\rangle ^{k}}{\lambda (k-1)} \left\langle {\mathbb {V}}_{k}\left[ \varphi \right] ,\tilde{\varphi }\right\rangle \right| =0, \end{aligned}$$
(91)

which will complete the proof of the theorem.

First notice that, for any \(k\ge 2\), due to a change of variables, we have that

$$\begin{aligned}&\frac{\mathrm{e}^{-\lambda kt}}{\varphi (x)}(-1)^{k}I_k(x,t)\\&\qquad =t\int _{0}^{1}\mathrm{e}^{-\lambda (k-1)ut}\frac{\mathrm{e}^{-\lambda ut}}{\varphi (x)}\texttt {T}_{ut}\left[ (-1)^{k}\mathrm{e}^{-\lambda kt(1-u)}U_k(\cdot ,t(1-u))\right] (x)du, \end{aligned}$$

where

$$\begin{aligned} (-1)^k\mathrm{e}^{-\lambda kt}U_k(x,t)=(-1)^k\mathrm{e}^{-\lambda kt}K_k(x,t)+\beta (x)(-1)^k\mathrm{e}^{-\lambda kt}S_k(x,t). \end{aligned}$$
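Here the change of variables is \(s=ut\), and the splitting of the normalisation across the semigroup and the integrand rests on the elementary factorisation

$$\begin{aligned} \mathrm{e}^{-\lambda kt}=\mathrm{e}^{-\lambda (k-1)ut}\,\mathrm{e}^{-\lambda ut}\,\mathrm{e}^{-\lambda kt(1-u)}. \end{aligned}$$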

Recalling the definitions of \(K_k\) and \(S_k\) given in (74) and (75) respectively, and using the fact that \(I_j(x,t)=-((-1)^{j+1}\texttt {T}_t^{(j)}\left[ f\right] (x)-R_j(x,t))\) for \(j=2,\ldots ,k-1\), after distributing the exponential term across the products, we obtain

$$\begin{aligned} (-1)^k\mathrm{e}^{-\lambda kt}K_k(x,t)=&\sum _{\left\{ m_1,\ldots ,m_{k-1}\right\} _k}\frac{k!}{m_1!\ldots m_{k-1}!}\\&\psi ^{(m_1+\cdots +m_{k-1})}(x,0+)\left( -\mathrm{e}^{-\lambda t}\texttt {T}_t\left[ f\right] (x)\right) ^{m_1}\\&\quad \prod _{j=2}^{k-1}\left( -\frac{1}{j!}(-1)^j\mathrm{e}^{-\lambda jt}I_j(x,t)\right) ^{m_j} \end{aligned}$$

and

$$\begin{aligned}&(-1)^k\mathrm{e}^{-\lambda kt}S_k(x,t)\\&\quad = \int _{M(E)^{\circ }}\sum _{\left\{ m_1,\ldots ,m_{k-1}\right\} _k}\frac{k!}{m_1!\ldots m_{k-1}!}(-1)^{m_1+\cdots +m_{k-1}}\left\langle -\mathrm{e}^{-\lambda t}\texttt {T}_t\left[ f\right] ,\nu \right\rangle ^{m_1}\\&\qquad \prod _{j=2}^{k-1}\left( \frac{1}{j!}\left\langle -(-1)^{j}\mathrm{e}^{-\lambda jt}I_j(\cdot ,t),\nu \right\rangle \right) ^{m_j}\Gamma (x,d\nu ). \end{aligned}$$

From these expressions and the definition of \({\mathbb {V}}_2\), the case \(k=2\) follows easily. Now assume that (91) holds for \(\ell =2,\ldots , k-1\); then, similarly to the branching Markov process case, it follows that

$$\begin{aligned}&\lim _{t\rightarrow \infty }\sup _{x\in E,f\in B^{+}_1(E)}\left| \frac{\mathrm{e}^{-\lambda k t}}{\varphi (x)}(-1)^{k}I_k(x,t)-\frac{k!\left\langle f,\tilde{\varphi }\right\rangle ^{k}}{\lambda (k-1)} \left\langle {\mathbb {V}}_{k}\left[ \varphi \right] ,\tilde{\varphi }\right\rangle \right| \nonumber \\&\quad \le \lim _{t\rightarrow \infty }\sup _{x\in E,f\in B_1^{+}(E)}\nonumber \\&\qquad \quad \,t\left| \int _{0}^{1}\mathrm{e}^{-\lambda (k-1)ut}\left( \frac{\mathrm{e}^{-\lambda ut}}{\varphi (x)}\texttt {T}_{ut}\left[ \varphi F[f](\cdot ,u,t)\right] -k!\left\langle f,\tilde{\varphi }\right\rangle ^{k}\left\langle {\mathbb {V}}_{k}\left[ \varphi \right] ,\tilde{\varphi }\right\rangle \right) du \right| , \end{aligned}$$
(92)

where we have defined

$$\begin{aligned}&{F[f](x,u,t)}\\&\quad = \varphi (x)^{-1}\left( (-1)^k\mathrm{e}^{-\lambda kt(1-u)}K_k(x,t(1-u))+\beta (x)(-1)^k\mathrm{e}^{-\lambda kt(1-u)}S_k(x,t(1-u))\right) \\&\quad =\frac{1}{\varphi (x)}\sum _{\left\{ m_1,\ldots ,m_{k-1}\right\} _k}\frac{k!}{m_1!\ldots m_{k-1}!}\\&\qquad \times \left[ \psi ^{(m_1+\cdots +m_{k-1})}(x,0+)(-\varphi (x))^{m_1+\cdots +m_{k-1}}\left( \frac{\mathrm{e}^{-\lambda t(1-u)}\texttt {T}_{t(1-u)}\left[ f\right] (x)}{\varphi (x)}\right) ^{m_1}\right. \\&\qquad \left. \prod _{j=2}^{k-1}\left( \frac{(-1)^j\mathrm{e}^{-\lambda jt(1-u)}I_j(x,t(1-u))}{\varphi (x)j!}\right) ^{m_j}\right. \\&\qquad \left. +\beta (x)\int _{M(E)^{\circ }}\left\langle \frac{\mathrm{e}^{-\lambda t(1-u)}\texttt {T}_{t(1-u)}\left[ f\right] }{\varphi }\varphi ,\nu \right\rangle ^{m_1}\prod _{j=2}^{k-1}\left( \frac{1}{j!}\right. \right. \\&\left. \left. \qquad \left\langle \frac{(-1)^{j}\mathrm{e}^{-\lambda jt(1-u)}I_j(\cdot ,t(1-u))}{\varphi }\varphi ,\nu \right\rangle \right) ^{m_j}\Gamma (x,d\nu )\right] . \end{aligned}$$

It is easy to see that, pointwise in \(x\in E\) and for \(u\in (0,1)\), using the induction hypothesis for the \(I_j\), \(j\le k-1\), and the assumed Perron Frobenius behaviour (H1) for \(\texttt {T}_{t(1-u)}\left[ f\right] (x)\), we have

$$\begin{aligned} {F[f](x,u)}&:=\lim _{t\rightarrow \infty }{F[f](x,u,t)}\\&=\frac{1}{\varphi (x)}\sum _{\left\{ m_1,\ldots ,m_{k-1}\right\} _k}\frac{k!}{m_1!\ldots m_{k-1}!}\left\langle f,\tilde{\varphi }\right\rangle ^{m_1}\prod _{j=2}^{k-1}\left( \frac{\left\langle f,\tilde{\varphi }\right\rangle ^{j}\left\langle {\mathbb {V}}_j\left[ \varphi \right] ,\tilde{\varphi }\right\rangle }{\lambda (j-1)}\right) ^{m_j}\\&\quad \times \left[ \psi ^{(m_1+\cdots +m_{k-1})}(x,0+)(-\varphi (x))^{m_1+\cdots +m_{k-1}}\right. \\&\left. \qquad +\beta (x)\int _{M(E)^{\circ }}\left\langle \varphi ,\nu \right\rangle ^{m_1+\cdots +m_{k-1}}\Gamma (x,d\nu )\right] \\&=\frac{k!}{\varphi (x)}\left\langle f,\tilde{\varphi }\right\rangle ^{k}{\mathbb {V}}_k[\varphi ](x). \end{aligned}$$

We again verify that the conditions (A.1) and (A.2) hold using the induction hypothesis and (H2). To complete the proof of (91), we again proceed along the same lines as in the branching Markov process setting. Indeed, arguments similar to those given in the proof of Theorem A.1 show that

$$\begin{aligned} \sup _{\begin{array}{c} x\in E, u\in \left[ 0,1\right] ,\\ {f\in B^+_1(E)}, t\ge 0 \end{array}}\left| \frac{\mathrm{e}^{-\lambda ut}}{\varphi (x)}\texttt {T}_{ut}\left[ \varphi {F[f](\cdot ,u,t)}\right] (x)-k!\left\langle f,\tilde{\varphi }\right\rangle ^{k}\left\langle {\mathbb {V}}_k\left[ \varphi \right] ,\tilde{\varphi }\right\rangle \right| <\infty , \end{aligned}$$

from which it follows that the remainder of the integral in (92) can be bounded by \(\varepsilon (1-\mathrm{e}^{-\lambda (k-1)t})/\lambda (k-1)\), which is in turn bounded by a constant multiple of \(\varepsilon \). Combining this with (90), we get the desired result. \(\square \)

Proof of Theorem 3 (subcritical case)

We now outline the proof for the subcritical case. Again we use an inductive argument. The case \(k=1\) follows from (H1) and the fact that \(\left\langle \varphi ,\tilde{\varphi }\right\rangle =1\). Now assume the result to be true for \(\ell =1,\ldots ,k-1\). We first note that the term \(R_k(x,t)\) in (77) vanishes in the limit after the normalisation \(\mathrm{e}^{-\lambda t}\). To see this, note from (73) that

$$\begin{aligned}&\left| \frac{\mathrm{e}^{-\lambda t}}{\varphi (x)}R_{k}(x,t)\right| \le \sum _{\left\{ m_1,\ldots ,m_{k-1}\right\} _k}c(m_1,\ldots ,m_{k-1})\prod _{j=1}^{k-1}\left| \frac{\mathrm{e}^{-\lambda t}\texttt {T}_t^{(j)}\left[ f\right] (x)}{\varphi (x)j!}\right| ^{m_j}\\&\quad \times \varphi (x)^{m_1+\cdots +m_{k-1}}\mathrm{e}^{\lambda (m_1+\cdots +m_{k-1}-1){t}}, \end{aligned}$$

where \(c(m_1,\ldots ,m_{k-1})\) is a constant depending only on \(m_1, \ldots ,m_{k-1}\). Since each of the terms in the product is bounded, \(\lambda <0\), and \(m_1+\cdots +m_{k-1}>1\) for any partition, the limit of the right-hand side above is zero. Using this, along with the induction hypothesis, assumption (H1) and the evolution equation (77) we see that

$$\begin{aligned}&\lim _{t\rightarrow \infty }\sup _{x\in E,f\in B_1^{+}(E)}\left| \varphi (x)^{-1}\mathrm{e}^{-\lambda t}\texttt {T}_{t}^{(k)}\left[ f\right] (x)-k!\left\langle f,\tilde{\varphi }\right\rangle ^{k}\left\langle {\mathbb {V}}_k\left[ \varphi \right] ,\tilde{\varphi }\right\rangle \right| \nonumber \\&\quad =\lim _{t\rightarrow \infty }\sup _{x\in E,f\in B_1^{+}(E)}\left| \frac{\mathrm{e}^{-\lambda t}}{\varphi (x)}\int _{0}^{t}\texttt {T}_s\left[ U_k^{*}(\cdot ,t-s)\right] (x)\mathrm{d}s-k!\left\langle f,\tilde{\varphi }\right\rangle ^{k}\left\langle {\mathbb {V}}_k\left[ \varphi \right] ,\tilde{\varphi }\right\rangle \right| , \end{aligned}$$
(93)

where

$$\begin{aligned} U_k^{*}(x,t)&=\sum _{\left\{ m_1\ldots ,m_{k-1}\right\} _k}\frac{k!}{m_1!\ldots m_{k-1}!}\left[ \psi ^{(m_1+\cdots +m_{k-1})}(x,0+)\prod _{j=1}^{k-1}\left( -\frac{\texttt {T}_t^{(j)}\left[ f\right] (x)}{j!}\right) ^{m_j}\right. \\&\quad \left. +\beta (x)\int _{M(E)^{\circ }}\prod _{j=1}^{k-1}\left( \frac{1}{j!}\left\langle \texttt {T}_t^{(j)}\left[ f\right] ,\nu \right\rangle \right) ^{m_j}\Gamma (x,d\nu )\right] . \end{aligned}$$

Then, similar calculations to those above yield

$$\begin{aligned}&\frac{\mathrm{e}^{-\lambda t}}{\varphi (x)}\int _{0}^{t}\texttt {T}_s\left[ U_k^{*}(\cdot ,t-s)\right] (x)\mathrm{d}s\\&\quad = t \sum _{\left\{ m_1,\ldots ,m_{k-1}\right\} _k}\frac{k!}{m_1!\ldots m_{k-1}!}\int _{0}^{1}\mathrm{e}^{-\lambda t(1-u)(1-m_1-\cdots -m_{k-1})}\\&\quad \quad \frac{\mathrm{e}^{-\lambda tu}}{\varphi (x)}\texttt {T}_{tu}\left[ \psi ^{(m_1+\cdots +m_{k-1})}(\cdot ,0+)(-\varphi (\cdot ))^{m_1+\cdots +m_{k-1}}\prod _{j=1}^{k-1}\left( \frac{\mathrm{e}^{-\lambda t(1-u)}\texttt {T}_{t(1-u)}^{(j)}\left[ f\right] (\cdot )}{\varphi (\cdot ) j!}\right) ^{m_j}\right. \\&\quad \quad \left. +\beta (\cdot )\int _{M(E)^{\circ }}\prod _{j=1}^{k-1}\left\langle \varphi \frac{\mathrm{e}^{-\lambda t(1-u)}}{\varphi j!}\texttt {T}_{t(1-u)}^{(j)}\left[ f\right] ,\nu \right\rangle ^{m_j}\Gamma (\cdot ,d\nu )\right] (x)\mathrm{d}u. \end{aligned}$$

To finish the proof, we use the induction hypothesis to deal with the lower order moments, and dominated convergence together with (H1) to deal with the limit of \(\texttt {T}_{t(1-u)}\). Similarly to the last part of the proof of Theorem 2, we get that (93) is bounded as \(t\rightarrow \infty \) by \(\varepsilon (1-\mathrm{e}^{-\lambda t (1-m_1-\cdots -m_{k-1})})/\lambda (1-m_1-\cdots -m_{k-1})\), which is bounded by a constant multiple of \(\varepsilon \) since \(\lambda <0\) and \(m_1+\cdots +m_{k-1}>1\). We once more leave the details of the rest of the proof to the reader. \(\square \)

3.5 Proofs for occupation moments, Theorems 4, 5 and 6

Given the proofs we have now seen for the branching particle setting for these three theorems, as well as the proofs of Theorems 1, 2 and 3, we mention only that a similar calculation to the one presented in Theorem 7 tells us that

$$\begin{aligned} \texttt {M}_t^{(k)}\left[ g\right] (x)&=\;(-1)^{k+1}{\tilde{R}}_k(x,t)+(-1)^{k}\int _{0}^{t}\texttt {T}_s\left[ {\tilde{U}}_k(\cdot ,t-s)\right] (x)\mathrm{d}s\\&\quad -k\int _{0}^{t}\texttt {T}_s\left[ g[ \texttt {M}_{t-s}^{(k-1)}\left[ g\right] +(-1)^{k-1}{\tilde{R}}_{k-1}(\cdot ,t-s) ]\right] (x)\mathrm{d}s, \end{aligned}$$

where

$$\begin{aligned} {\tilde{U}}_k(x,t)={\tilde{K}}_k(x,t)+\beta (x){\tilde{S}}_k(x,t) \end{aligned}$$

and \({\tilde{R}}\), \({\tilde{K}}\) and \({\tilde{S}}\) are defined as R, K and S in (73), (74) and (75), respectively, albeit replacing \(\texttt {T}^{(j)}\) by \(\texttt {M}^{(j)}\). From here we can consider the claimed asymptotics using the inductive reasoning in the proofs of Theorems 1, 2 and 3, respectively. \(\square \)
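For orientation, we note that the base case of this induction is the standard first moment identity (a consequence of Fubini's theorem rather than of the recursion above),

$$\begin{aligned} \texttt {M}_t^{(1)}\left[ g\right] (x)=\int _{0}^{t}\texttt {T}_s\left[ g\right] (x)\,\mathrm{d}s, \end{aligned}$$

the expected running occupation, whose asymptotics are dictated directly by (H1).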