Spatial Central Limit Theorem for Supercritical Superprocesses

We consider a measure-valued diffusion (i.e., a superprocess). It is determined by a pair (L, ψ), where L is the infinitesimal generator of a strongly recurrent diffusion in R^d and ψ is a branching mechanism assumed to be supercritical. Such processes are known, see, for example, (Engländer and Winter in Ann Inst Henri Poincaré 42(2):171–185, 2006), to fulfill a law of large numbers for the spatial distribution of the mass. In this paper, we prove the corresponding central limit theorem. The limit and the CLT normalization fall into three qualitatively different classes arising from the "competition" between the local growth induced by branching and the global smoothing due to the strong recurrence of L. We also prove that the spatial fluctuations are asymptotically independent of the fluctuations of the total mass of the process.


Model
Let {P_t}_{t≥0} be the semigroup of a strongly recurrent diffusion on R^d with infinitesimal generator L. We also introduce the so-called branching mechanism ψ : R_+ → R_+. It is represented as ψ(λ) = −αλ + βλ² + ∫_{R_+} (e^{−λx} − 1 + λx) Π(dx), where α, β ∈ R, β ≥ 0, and Π is a measure concentrated on R_+ such that ∫_{R_+} min(x², x) Π(dx) < +∞. In this paper, we will study the behavior of a superprocess {X_t}_{t≥0} with the infinitesimal operator L (or, equivalently, with the semigroup P) and branching mechanism ψ. It is a time-homogeneous, measure-valued Markov process. As such, it is characterized by a transition kernel, which in our case is expressed in terms of its Laplace transform. For the technical details of this construction, we refer the reader to [6,7]. The above definition may appear quite abstract, but in fact any superprocess has a natural interpretation as the short-lifetime and high-density limit of branching particle systems (see, for example, the Introduction of [8] and Sect. 1.3). There is a vast body of literature concerning various aspects of superprocesses, e.g., [6,7,9,10].
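The displayed formulas lost here are the standard ones; as a sketch, with notation consistent with the surrounding text ((1.1) for ψ, (1.3) for the log-Laplace equation; the exact displays in the published version may differ in layout):

```latex
% Branching mechanism (1.1): alpha in R, beta >= 0, Pi a measure on R_+
% with \int min(x^2, x) Pi(dx) < infinity:
\psi(\lambda) \;=\; -\alpha\lambda \;+\; \beta\lambda^{2}
  \;+\; \int_{\mathbb{R}_{+}} \big( e^{-\lambda x} - 1 + \lambda x \big)\, \Pi(\mathrm{d}x).

% Laplace transform of the transition kernel:
\mathbb{E}_{\mu}\, e^{-\langle X_{t},\, f \rangle}
  \;=\; e^{-\langle \mu,\, u_{f}(\cdot,\, t) \rangle},
\qquad f \in b^{+}(\mathbb{R}^{d}),

% where u_f solves the log-Laplace (evolution) equation (1.3):
u_{f}(x,t) \;=\; P_{t} f(x)
  \;-\; \int_{0}^{t} P_{s}\big[ \psi\big( u_{f}(\cdot,\, t-s) \big) \big](x)\, \mathrm{d}s.
```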

Results: Outline
We postpone a formal description of our assumptions and results to Sects. 3 and 4, providing here only intuitions.
In this paper, we are interested in the supercritical case, in which the system grows exponentially (on the event of survival). The rate of growth is given by −ψ'(0+) = α, which, in this paper, is assumed to be strictly positive. It is standard to prove that the limit V_∞ := lim_{t→+∞} e^{−αt}|X_t|, where |X_t| := ⟨X_t, 1⟩ is the total mass of the system, exists and is a non-trivial random variable. The semigroup P corresponds to a strongly recurrent diffusion with its unique invariant measure denoted by ϕ.
Superprocesses of this type fulfill a spatial law of large numbers. In a nutshell, and without specifying detailed assumptions (recall [8, Theorem 1]), this means that for any bounded continuous function f, we have lim_{t→+∞} e^{−αt} ⟨X_t, f⟩ = ⟨f, ϕ⟩ V_∞, in probability.
The goal of our paper is to prove the corresponding central limit theorem. This will be achieved by studying the spatial fluctuations (1.4), where N_t is some norming, not necessarily deterministic. Before further discussion, we need to quantify the recurrence of P. For the sake of discussion, not being quite precise, we assume that there exists μ > 0 such that for a bounded continuous function f, the quantity P_t f − ⟨f, ϕ⟩ decays exponentially fast at rate μ. The behavior of (1.4) depends qualitatively on the sign of α − 2μ. Roughly speaking, it reflects the interplay of two antagonistic forces: the growth, which is local and makes the system coarser, and the smoothing induced by the spatial evolution corresponding to P. The results split into three qualitatively different classes:
Small growth rate α < 2μ (see Theorem 4). In this case, "the smoothing" prevails and the formulation of the result resembles the standard CLT. The normalization is N_t = |X_t|^{1/2} (which is of order e^{(α/2)t}), and the limit is Gaussian, though its variance is given by a complicated formula. Moreover, the limit does not depend on X_0.
Critical growth rate α = 2μ (see Theorem 6). In this case, we are in a situation of delicate balance between "the growth" and "the smoothing," with the growth being "somewhat stronger." The normalization is slightly bigger compared to the classical case: N_t = t^{1/2}|X_t|^{1/2}. The limit still does not depend on X_0.
Large growth rate α > 2μ (see Theorem 8). In this case, "the growth" prevails. The normalization is even bigger: N_t = e^{(α−μ)t} (we have α − μ > α/2 and therefore N_t ≫ √|X_t|). What is perhaps most surprising, the limit holds in probability. In addition, the growth is so fast that the limit depends on the starting configuration X_0. Moreover, we suspect that the limit is non-Gaussian.
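The comparison of normalizations quoted in the large-growth case is elementary; since |X_t| ≈ e^{αt} V_∞ on survival, the claim N_t ≫ √|X_t| reduces to:

```latex
\alpha > 2\mu \;\Longrightarrow\; \alpha - \mu > \tfrac{\alpha}{2},
\qquad\text{hence}\qquad
N_t \;=\; e^{(\alpha-\mu)t} \;\gg\; e^{(\alpha/2)t} \;\asymp\; \sqrt{|X_t|}
\quad\text{as } t \to +\infty \text{ on } \mathrm{Ext}^{c}.
```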
In each case, we prove that the spatial fluctuations (1.4) become asymptotically independent of the fluctuations of the total mass as time increases.

Related Results
In [2], the authors established central limit theorem results for the branching particle system in which particles move according to the Ornstein–Uhlenbeck process (i.e., the one with infinitesimal generator L f = (1/2)σ² Δf − μ(x · ∇f)) and branch after an exponential time into two particles. Such a system is closely related to the superprocess with L and ψ(λ) = −αλ + βλ². In fact, the superprocess can be defined as the weak limit of branching particle systems. In the nth approximation, the system starts from a particle configuration distributed according to a Poisson point process with intensity nν (ν is the starting distribution of the superprocess). Each particle carries mass 1/n and lives for an exponential time with parameter n (so the mean lifetime is 1/n). During this time, it executes a random movement according to an Ornstein–Uhlenbeck process. When it dies, the particle is replaced by a random number of offspring. The mean of this number is supposed to be 1 + α/n, while the variance is 2β. Each particle evolves independently of the others. We note that this construction can be extended to general L and ψ (see, for example, [8]).
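The approximation described above can be sketched in code. The following toy simulation is illustrative only and makes simplifying assumptions not in the text: it starts from a single ancestor instead of a Poisson(nν) configuration, and it uses a binary offspring law with mean 1 + α/n whose variance is close to 1 rather than exactly 2β; all parameter values and function names are our own.

```python
import math
import random

def simulate_branching_ou(n, t_max, alpha=0.5, sigma=1.0, gamma=1.0, seed=0):
    """Toy nth approximation of the superprocess: particles of mass 1/n,
    Exp(n) lifetimes (mean 1/n), OU motion, and on death a random number
    of offspring with mean 1 + alpha/n. Returns particle positions at t_max."""
    rng = random.Random(seed)
    # binary offspring law: 2 children with prob p2, else 0; mean = 1 + alpha/n
    # (its variance is ~1, not the 2*beta of the text -- a simplification)
    p2 = (1 + alpha / n) / 2.0
    stack = [(0.0, 0.0)]          # (birth_time, position); one ancestor at 0
    alive = []
    while stack:
        t0, x = stack.pop()
        life = rng.expovariate(n)  # exponential lifetime, mean 1/n
        dt = min(life, t_max - t0)
        # exact OU transition over time dt: mean x*e^{-gamma*dt},
        # variance sigma^2 (1 - e^{-2*gamma*dt}) / (2*gamma)
        m = x * math.exp(-gamma * dt)
        s = sigma * math.sqrt((1 - math.exp(-2 * gamma * dt)) / (2 * gamma))
        x_new = rng.gauss(m, s)
        if t0 + life >= t_max:
            alive.append(x_new)    # particle survives to the horizon
        else:
            k = 2 if rng.random() < p2 else 0
            stack.extend((t0 + life, x_new) for _ in range(k))
    return alive

positions = simulate_branching_ou(n=50, t_max=1.0)
total_mass = len(positions) / 50   # each particle carries mass 1/n
```

Since the offspring mean is 1 + α/n and the branching rate is n, the expected total mass grows like e^{αt}, matching the growth rate of the superprocess.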
In [2], the authors studied fluctuations akin to (1.4), discovering three regimes similar to the list above. The particle point of view arguably gives more compelling intuitions. With this picture in mind, it might be easier to understand the discussion above; moreover, some further heuristics are given in [2, Remarks 3.4, 3.9, 3.13].
Although [2] was the inspiration for this paper, it must be stressed that the approximation, insightful as it is, cannot easily be used as a proof method in the superprocess setting, nor can the proofs of [2] be transferred directly. The main difficulty compared to branching systems is that a superprocess is not a discrete object. This was overcome using the backbone construction developed in [5]. It represents a supercritical superprocess as a subcritical superprocess (called the dressing) immigrating continuously on top of a branching diffusion. Controlling the aggregate behavior of the dressing was the main technical issue to be resolved in this paper. This was achieved using analytical estimates of the behavior of P, which is a different approach than the coupling techniques applied in [2]. It is noteworthy that these analytical methods proved to be much more robust and allowed us to obtain results for a quite general class of L. Moreover, in this paper we work with a general branching mechanism ψ, assuming only a finite fourth moment. Related problems for branching particle systems were also considered in [1,4].

Organization
The next section presents notation and basic facts required further on. Section 3 contains the formulation of the assumptions. Section 4 is devoted to the presentation of our results. The proofs are deferred to Sects. 5, 6, 7 and 8 and the "Appendix."

Preliminaries and Notation
Let us first recall the notions which appeared in the introduction. P is the semigroup of the diffusion process with infinitesimal operator L. To shorten notation, for α ∈ R we define a semigroup {P_t^α}_{t≥0} by P_t^α f := e^{αt} P_t f. (2.1) M_F(R^d) is the space of finite, compactly supported measures, and b^+(R^d) is the space of bounded, positive, measurable functions on R^d. By c_1, c_2, ..., we will denote generic constants which might vary from line to line.
We will use C_0 to denote the space of continuous functions which grow at most polynomially. We will use R_1, R_2, ... to denote generic functions in C_0; these may vary from line to line.
For x, y ∈ R^n, by x · y we denote the usual scalar product. By →_d, we denote convergence in law.
The parameter α in (1.1) is the rate of growth of the model. By Ext, we denote the event that the process becomes extinguished, i.e., that its total mass tends to 0 (see (2.2)). Clearly, in the supercritical case we have λ* > 0, where λ* is the largest root of ψ(λ) = 0 (introduced formally in Sect. 5).

Assumptions
In this section, we state precisely the assumptions on the branching mechanism ψ and the diffusion semigroup P. We will discuss them and give an example in Sect. 4.4.
B1 The branching mechanism ψ given by (1.1) is non-trivial, precisely, either β ≠ 0 or Π ≠ 0. It is supercritical, i.e., α > 0. Moreover, Π fulfills a fourth-moment condition, ∫_{(1,+∞)} x⁴ Π(dx) < +∞. These conditions imply that ψ is four times differentiable at 0+ and ψ(0) = 0.
Further, we formulate the assumptions on the semigroup P. Note that our formulation, although not the most compact, is chosen so that it is easy to verify and to apply in proofs. Such a presentation also highlights which properties are essential for the proofs.

S1
The semigroup P has a unique invariant probability measure ϕ. We require that any f ∈ C_0(R^d) is integrable with respect to ϕ. We will use f̃ to denote the centering of f with respect to ϕ, i.e., f̃ := f − ⟨f, ϕ⟩. (3.1) We note that P_t f̃ = P_t f − ⟨f, ϕ⟩ and that for f = const we have f̃ = 0.
S2 There exists μ > 0 such that for any function f ∈ C_0(R^d) there is R ∈ C_0(R^d) with |P_t f̃(x)| ≤ e^{−μt} R(x) for all t ≥ 0. (3.2)
S3 There exist μ > 0 and h : R^d → R^k (for some k ≥ 1) such that ⟨h, ϕ⟩ = 0, for any i ∈ {1, ..., k} we have h_i ∈ C_0(R^d), and for any t ≥ 0, P_t h = e^{−μt} h. Moreover, for any function f ∈ C_0(R^d), there are R ∈ C_0(R^d) and a bounded function r : R_+ → R_+ such that r(t) ↘ 0 and |e^{μt} P_t f̃(x) − h(x) · ⟨f h, ϕ⟩| ≤ r(t) R(x). (3.3)
Note that for any t ≥ 0, we have ⟨P_t h, ϕ⟩ = 0 (indeed, by the fact that ϕ is invariant, we have ⟨P_t h, ϕ⟩ = ⟨h, ϕ⟩ = 0; moreover, directly, ⟨P_t h, ϕ⟩ = e^{−μt}⟨h, ϕ⟩).
We note that (S3) implies (S2); indeed, one can obtain (3.2) easily by dividing (3.3) by e^{μt}. We note also that (S1) and (S2) imply the following fact: for any f ∈ C_0(R^d) there exists R ∈ C_0(R^d) such that sup_{t≥0} |P_t f(x)| ≤ R(x). (3.4)
Remark 1 Conditions (S1), (S2) and (S3) state, roughly speaking, that the diffusion associated with P is strongly recurrent with spectral gap μ. It might be possible to verify these conditions using a Bakry–Émery-type condition or Foster–Lyapunov criteria; we refer to the classical work [13], whose treatment of the so-called exponential ergodicity might be useful for checking (S1) and (S2). Property (S3) seems harder to check in general; one can use the asymptotics of the transition density (as in the subsequent example). Other methods include tools of functional analysis as, for example, in [14, Sect. 3].
Example 2 Let us consider a superprocess with L f(x) = (1/2)σ² Δf(x) − γ(x · ∇f(x)), i.e., the infinitesimal operator of an Ornstein–Uhlenbeck process, where σ > 0 and γ > 0, and ψ(λ) = −αλ + βλ² for α, β > 0. It is obvious that (B1) holds. It is well known that the unique invariant distribution ϕ of L is Gaussian, with density proportional to exp(−γ|x|²/σ²). Moreover, for any f ∈ C_0 we have the representation P_t f(x) = E f(e^{−γt} x + ou(t) G), where ou(t) := √(1 − e^{−2γt}) and G is distributed according to ϕ. Using this representation, conditions (S1), (S2) and (S3) can be verified quite easily (we refer to [1, Section 6]). Let us just mention that the function h in (S3) is h(x) = x and μ = γ. The limit objects V_∞ and H_∞ can be given a more explicit representation: V_∞ is distributed according to Exp(|X_0|^{−1}) and H_∞ is non-Gaussian. More information about the joint distribution of (V_∞, H_∞) is contained in the forthcoming Conjecture 14, which can be proved in this particular case.
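The representation in Example 2 can be checked in closed form on low-degree polynomials. The sketch below (dimension d = 1, parameter values chosen arbitrarily) verifies that h(x) = x satisfies P_t h = e^{−γt} h, i.e., it is the eigenfunction of (S3) with μ = γ, and that the centered square f̃(x) = x² − ⟨x², ϕ⟩ decays at the faster rate 2γ, consistent with (S3), since ⟨f h, ϕ⟩ is then a third Gaussian moment and vanishes.

```python
import math

# OU semigroup on f(x) = x and f(x) = x^2, computed in closed form from
# P_t f(x) = E f(x e^{-gamma t} + ou(t) G), ou(t) = sqrt(1 - e^{-2 gamma t}),
# with G ~ phi = N(0, sigma^2 / (2 gamma)).
sigma, gamma = 1.3, 0.7
var_phi = sigma ** 2 / (2 * gamma)      # variance of the invariant measure phi

def P_id(t, x):                          # P_t applied to f(x) = x
    return x * math.exp(-gamma * t)      # E[G] = 0 kills the noise term

def P_sq(t, x):                          # P_t applied to f(x) = x^2
    e = math.exp(-gamma * t)
    return (x * e) ** 2 + (1 - e ** 2) * var_phi

t, x = 0.9, 2.0
# h(x) = x is an eigenfunction with eigenvalue e^{-gamma t}  (so mu = gamma)
eig = P_id(t, x) / x
# centered square: P_t (x^2 - var_phi) = e^{-2 gamma t} (x^2 - var_phi)
rate2 = (P_sq(t, x) - var_phi) / (x ** 2 - var_phi)
```

The identity P_t f̃ = e^{−2γt} f̃ for f(x) = x² illustrates why the leading term of (S3) can vanish for particular f, leaving a faster decay.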

Results
We start with a brief discussion of the behavior of the total mass of the superprocess, i.e., {|X_t|}_{t≥0}. Let {V_t}_{t≥0} be defined by V_t := e^{−αt}|X_t|. (4.1) It is a positive, L²-bounded martingale and thus converges a.s. and in L² to a limit V_∞. Therefore, V_∞ is non-trivial (e.g., EV_∞ = V_0). We also have the convergence (4.2). The proof of the martingale property and of (4.2) is analogous to the proof of the forthcoming Fact 13 and is left to the reader. We recall that α > 0 is the growth rate of the system (see (1.1)) and that μ > 0 is the constant introduced in (S2)–(S3). Analogously to the presentation in the introduction, we split this section into three parts depending on the sign of α − 2μ.
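The martingale property of V rests on the first-moment formula for superprocesses; a sketch of the standard computation, under the convention −ψ′(0+) = α from Sect. 1 (the first-moment identity E[⟨X_{t+s}, f⟩ | X_s] = e^{αt}⟨X_s, P_t f⟩ is assumed here in the form usual for this setting):

```latex
\mathbb{E}\big[\, |X_{t+s}| \;\big|\; X_s \,\big]
  \;=\; e^{\alpha t}\,\langle X_s,\, P_t 1 \rangle
  \;=\; e^{\alpha t}\, |X_s| ,
\qquad\text{hence}\qquad
\mathbb{E}\big[\, V_{t+s} \;\big|\; X_s \,\big]
  \;=\; e^{-\alpha(t+s)}\, e^{\alpha t}\, |X_s| \;=\; V_s .
```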

Slow Growth α < 2μ
We recall (2.1) and (3.1) and define the limiting variance σ_f² (see (4.4)). Let us also recall the event Ext in (2.2) and σ_V given in (4.3). The main result of this section is
Theorem 4 Let {X_t}_{t≥0} be the superprocess starting from X_0 ∈ M_F(R^d). Let us assume that (B1), (S1), (S2) and α < 2μ hold. Then, for any f ∈ C_0(R^d), we have σ_f < +∞, and conditionally on the event Ext^c, the convergence (4.5) holds.
Remark 5 The law of the first coordinate of the limit depends on X_0 only through its total mass |X_0| (see Fact 3). The second and third coordinates do not depend on X_0 at all.
The proof is given in Sect. 6.

Critical Growth α = 2μ
We recall the function h from (S3) and define the corresponding limiting variance σ_f². Using (S1) and (S3), one easily checks that for f ∈ C_0 we have σ_f² < +∞. Let us recall the event Ext in (2.2) and σ_V given by (4.3). The main result of this section is
Theorem 6 Let {X_t}_{t≥0} be the superprocess starting from X_0 ∈ M_F(R^d). Let us assume that (B1), (S1), (S2), (S3) and α = 2μ hold. Then, for any f ∈ C_0(R^d), conditionally on the event Ext^c, the stated convergence holds.
The proof is given in Sect. 8.
Large Growth α > 2μ

We recall the function h from (S3) and the martingale H given by (4.7). From the fact that, for α > 2μ, H is an L²-bounded martingale, it follows that, in the setting of this section, the limit H_∞ exists both a.s. and in L². Let us recall the event Ext in (2.2) and σ_V given by (4.3).
The main result of this section is
Theorem 8 Let us assume that (B1), (S1), (S2), (S3) and α > 2μ hold. Then, for any f ∈ C_0(R^d), conditionally on the event Ext^c, the convergences (4.9) and (4.10) hold.

Remark 9
The law of H_∞ exhibits non-trivial dependence on the starting condition X_0, and V_∞, H_∞ are not independent. We expect that H_∞ is non-Gaussian. We make these observations precise in Conjecture 14, which is illustrated using the Ornstein–Uhlenbeck superprocess from Example 2. We notice that, being the limit of infinitely divisible processes, the pair (V_∞, H_∞) is also infinitely divisible. Determining its Lévy exponent would be an interesting result, though it seems unlikely to be obtainable in a general setting. The convergence of the second coordinate in (4.10) is closer to a law of large numbers than to a central limit theorem. Intuitively speaking, the system grows so fast that the fluctuations become localized. This also manifests itself in the fact that the normalization is much bigger than the classical one. Writing exp((α − μ)t) = exp(αt) exp(−μt), we can decompose the normalization into exp(αt) and exp(−μt). The first term corresponds to the standard law of large numbers, and the second one reflects the fact that the mass of the system, roughly speaking, is distributed according to P*_t (the adjoint of P_t). More precisely, by (S3), we have e^{μt} P_t f̃ ≈ h · ⟨f h, ϕ⟩. Following these observations, we also conjecture that the convergence above holds almost surely.
The proofs are given in Sect. 7.

Discussion and Remarks
Remark 10 In our paper, we assume (B1), which states that the branching mechanism admits a fourth moment. We use this assumption to verify Lyapunov's condition in the proofs of the central limit theorems. It seems that the existence of a (2 + ε)-moment for some ε > 0 should be sufficient, but we do not have the formulas necessary to calculate the moments of the superprocess in such a case. Further, it is not unlikely that the existence of the second moment is enough for the results to hold.
An interesting question would be to go beyond this assumption, namely, to study branching laws with heavy tails. It is natural to expect a different normalization and convergence to stable laws.

Proof Preliminaries
In this section, we gather necessary prerequisites for the proofs in Sects. 6, 7 and 8.

Backbone Construction
Supercritical superprocesses admit a beautiful and insightful description known as the backbone construction (or decomposition). According to this construction, a supercritical superprocess consists of subcritical superprocesses (the so-called dressing) immigrating along the so-called (prolific) backbone, which is a supercritical branching particle system. This allows one to transfer many results concerning supercritical branching systems to superprocesses. On the conceptual level, this paper follows the strategy of [2], which presents CLTs for some branching particle systems. The main issue is to control the behavior of the dressing; we will comment on this once again after presenting the decomposition (5.7). Now we briefly discuss some aspects of the backbone construction, referring the reader to [5, Sect. 2.4] for more details. Let us recall the branching mechanism ψ given by (1.1); we assume that it is supercritical, i.e., α > 0. Let λ* be the largest root of ψ(λ) = 0. We denote ψ*(λ) := ψ(λ + λ*).
This happens to be a valid branching mechanism, and thus we may consider a superprocess with this branching mechanism; it will be referred to as X*. It is subcritical, i.e., its total mass decays exponentially fast with rate α* := −ψ'(λ*) < 0. (5.2) The inequality follows from the fact that ψ is strictly convex. Next, we define the generating function F of the branching law of the backbone process {Z_t}_{t≥0}. More precisely, Z is a Markov process consisting of a finite number of individuals. Each of them, from the moment of birth, lives for an independent and exponentially distributed period of time with parameter ψ'(λ*), during which it executes an L-diffusion started from its position of birth; at death, it gives birth, at the same position, to an independent number of offspring with distribution described by F. The configuration of particles can be naturally identified with an atomic measure. The space of such measures is denoted by M_a(R^d).
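For the quadratic mechanism ψ(λ) = −αλ + βλ² of Example 2, the quantities entering the backbone construction are explicit. The following sketch (parameter values chosen for illustration) checks that λ* = α/β is the largest root of ψ, that ψ*(λ) = ψ(λ + λ*) again satisfies ψ*(0) = 0, and that the decay rate α* = −ψ'(λ*) is negative, i.e., X* is subcritical.

```python
alpha, beta = 0.8, 0.5             # illustrative supercritical parameters

def psi(lam):                       # quadratic branching mechanism of Example 2
    return -alpha * lam + beta * lam ** 2

lam_star = alpha / beta             # largest root of psi(lambda) = 0

def psi_star(lam):                  # shifted mechanism psi*(lambda) = psi(lambda + lambda*)
    return psi(lam + lam_star)

def dpsi(lam):                      # psi'(lambda) = -alpha + 2*beta*lambda
    return -alpha + 2 * beta * lam

alpha_star = -dpsi(lam_star)        # subcritical decay rate of X*; here = -alpha
```

For the quadratic case, ψ'(λ*) = −α + 2β(α/β) = α, so α* = −α, reflecting a known symmetry of quadratic mechanisms.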
Let Z be a branching particle diffusion (i.e., a backbone) with initial configuration γ, and let X^{0,*} be an independent copy of X* (i.e., with the subcritical branching mechanism (5.1)) starting from ν. We define Λ_t := X_t^{0,*} + I_t, (5.4) where the process {I_t}_{t≥0} is independent of X^{0,*}. The process I has a certain pathwise description: namely, it consists of subcritical superprocesses immigrating along the backbone process. The full description is presented in [5]. The joint process {(Λ_t, Z_t)}_{t≥0} is Markovian; we denote its law by P_{ν×γ}. The transition kernel of this process is characterized by equation (5.5), where u*_f is the solution of (1.3) with the subcritical branching mechanism ψ* given by (5.1).
We now present the main result concerning the backbone construction. First we randomize the law of P ν×γ for ν ∈ M F (R d ) by replacing the deterministic choice of γ with a Poisson random measure having intensity λ * ν. We denote the resulting law by P ν .
Theorem 12 For any ν ∈ M_F(R^d), under the measure P_ν the process {Λ_t}_{t≥0} is Markovian and has the same law as X starting from X_0 = ν.
For any 0 ≤ s < t, we decompose the immigration process I (see (5.4)) into a part {D_t^s}_{t≥s}, describing the evolution of the dressing which appeared in the system before time s, and processes Γ^{i,s}, where Γ^{i,s} describes the mass which immigrated along the subtree stemming from the ith prolific individual at time s, located at Z_s(i) (we choose any enumeration of the particles of Z). We thus have the decomposition (5.7) of Λ. We have Y_0^s = X_s, and Y^s evolves according to the subcritical branching mechanism ψ*. Subcriticality is fundamental for our proof because this process is negligible when t ≫ s. The third term of (5.7) is a sum of random variables indexed by the branching process Z, to which techniques similar to [2] can be applied. Each of the processes Γ^{i,s} performs a Markovian evolution described by (5.5) with the starting conditions ν = 0 and γ = δ_{Z_s(i)}.

Martingales and Their Limits
We recall V and H (given by (4.1) and (4.7)). We define their analogues {W_t}_{t≥0}, {I_t}_{t≥0} associated with the backbone process Z, namely W_t := e^{−αt}|Z_t| and I_t := e^{−(α−μ)t}⟨Z_t, h⟩, where h is the eigenfunction introduced in (S3). From now on, we assume that V, H, W and I are defined for the backbone construction.

Fact 13
Let us assume that (B1), (S1) and (S2) hold. Then, W is a positive, L²-bounded martingale. We denote its limit by W_∞; moreover, (5.11) holds. If, in addition, (S3) holds, then I is a martingale, which for α > 2μ is L²-bounded. In this case, the limit I_∞ exists a.s. and in L²; moreover, (5.12) holds. The proof uses some facts which are presented later and thus is postponed to Sect. 7. By Theorem 12, the backbone Z starts with a random number of particles. The definitions of W and I and the convergences remain valid under the assumption Z_0 = δ_0 (i.e., one particle located at 0). We denote the joint limit in this case by (Ǐ_∞, W̌_∞). We conjecture the following behavior of the law of (H_∞, V_∞).

Conjecture 14
Let us assume that (B1), (S1), (S3) and α > 2μ hold. Let {(Ǐ_∞^i, W̌_∞^i)}_{i≥1} be independent copies of (Ǐ_∞, W̌_∞), and let N be a Poisson point process with intensity λ*ν, independent of this sequence. We define (Ĥ_∞, V̂_∞) by summing the contributions of the points of N, where |N| is the number of points in N and (x_1, ..., x_{|N|}) are their positions.
Let (H_∞, V_∞) be the limits of the martingales (4.2) and (4.8) for the superprocess starting from X_0 = ν; then (H_∞, V_∞) has the same law as (Ĥ_∞, V̂_∞). The conjecture is supported by the fact that it holds in the case of the superprocess from Example 2. In this case, it follows simply from Fact 13, Theorem 12 and an analogous decomposition for the Ornstein–Uhlenbeck branching process given in [2, Proposition 3.11]. In [2, Remark 3.14], it is also proven that in this case H_∞ is not Gaussian.

Moments
This section is devoted to the presentation of the moment formulas for the processes appearing in the proofs. In the paper, we utilize moments up to order 4. We recall the branching mechanisms ψ and ψ* given in (1.1) and (5.1). We define the functions u_f^k and u_f^{*,k} by the formulas (5.13), (5.14) and (5.16); they involve constants c_m^1, c_m^2, c_m^3 to be specified later. The usefulness of these formulas follows from
Lemma 15 (1) For any X_0 ∈ M_F(R^d), the moments of ⟨X_t, f⟩ up to order 4 are given by (5.20); similar formulas, (5.21), hold for the subcritical superprocess with the branching mechanism ψ*. (2) There is a choice of the constants c_m^1, c_m^2 and c_m^3 such that the corresponding bounds hold for x ∈ R^d and k ≤ 4. Moreover, we note that the constants c_m^1, c_m^2 and c_m^3 can be given explicitly, though their values are not relevant to our proofs.
Using these formulas, we analyze the process Y^s defined in (5.8). Let f ∈ C_0(R^d); using the strong Markov property, (5.20) and (5.21), we obtain (5.22), where under the measure E_{X_s} the process X* is a subcritical superprocess starting from X_s. In the last transformation, we used the fact that P is a semigroup. Now, since α* < 0, we see that indeed Y_t^s is negligible for t ≫ s. It will be useful to have the following bounds.
Lemma 17 Assume (B1), (S1) and (S2).

Then, the bounds (5.24) and (5.25) hold. The proofs of Lemmas 15 and 17 are technical and thus postponed to the "Appendix." We will also need moment formulas for the backbone process. We skip the proofs, referring the reader to [11] and the derivation in [2, Sect. 4.1].
Lemma 18 Let us assume (B1) and (S1). Let Z be the backbone process as in Theorem 12. Then, there exists C > 0 such that for any f ∈ C_0(R^d) the moment bounds (5.26) and (5.27) hold.

Proof of Theorem 4
In this section, we fix f ∈ C_0(R^d) and make the standing assumption that (B1), (S1), (S2) and α < 2μ hold. Let us first outline the proof. We use the decomposition of Λ given in (5.7). We recall that V_∞ is the limit of the martingale V (see (4.1) and Fact 3), that f̃ = f − ⟨f, ϕ⟩ and, finally, (2.3). We start with the random vectors K_1(t), ..., K_4(t) defined in (6.1) (k and further details of the definitions will be specified later).
We will show that these vectors are asymptotically close to each other. Next, we will consider a random vector related to K_5, defined by (6.3). We will show that, conditionally on the set of non-extinction Ext^c (see (2.2)), the convergence (6.4) holds, where the limit is as in (4.5). From these results, Theorem 4 follows by standard arguments. Before going to the proofs, we recall (3.1) and σ_f² given by (4.4), and we state the following technical lemma.

Lemma 19
We have σ_f² < +∞, and the convergences (6.5) and (6.6) hold. The proof is deferred to the end of this section.
We will now concentrate on the second coordinate. The process of the total mass {|Λ_t|}_{t≥0} is a continuous-state branching process (CSBP); see [12, Sect. 10]. As such, it enjoys the branching property (see [12, 10.1]). Thus, for s ≥ kt, we may decompose |Λ_s| as in (6.9), where {F_s^i}_{s≥0} are independent CSBPs having initial mass 1 and {F̄_s}_{s≥0} is a CSBP with initial mass |Λ_kt| − ⌊|Λ_kt|⌋. Analogously to (4.1), the processes V_s^i := e^{−αs}F_s^i and V̄_s := e^{−αs}F̄_s are positive martingales with respective limits V_∞^i and V̄_∞, as described in Fact 3. Passing to the limit in (6.9), we get (6.10); one easily checks that the convergence holds in probability.
We pass to the analysis of the third coordinate of (6.8). We recall that M_t^i = e^{−((k−1)α/2)t} ⟨Γ^{i,t}_{(k−1)t}, f̃⟩ and observe that by (5.7) and (5.22) we have (6.11); this follows easily by (5.23) and (6.7) (the second proviso). To recapitulate, by (6.10) and (6.11) we have |K_2(t) − K_3(t)| → 0, with Z_t(i) being the location of the ith particle of Z (in some ordering). We define K_4(t) analogously. By (5.22) and assumption (S2), we obtain the required decay; further, by (3.4), (5.27), the fact that X_0 ∈ M_F(R^d) and the first proviso of (6.7), we obtain |K_3(t) − K_4(t)| → 0. Finally, we deal with |K_5(t) − K_4(t)|. We introduce a truncation in order to be able to control moments in the next section and in Lemma 19; the choice of log t is somewhat arbitrary. We define I(t) and use the conditional expectation to calculate (6.14). We recall (5.18) and (5.22); for the first term, by (S2), α < 2μ and (3.4), we obtain the required bound, and the other terms are easier and left to the reader. By (5.27) and the Cauchy–Schwarz inequality, we conclude that |K_5(t) − K_4(t)| → 0. For any x ∈ R^d, using (S1), we get lim sup_{t→+∞} P_t 1_{|·|≥log t}(x) = 0.

Proof of (6.4)
We will use characteristic functions. It will be convenient to work conditionally on the event E_t := {|Λ_kt| ≥ t} ∩ {|Z_t| ≥ t} (we denote the corresponding expectation by E_t). We define the characteristic functions χ_1 and χ_3 and shall show that for any θ_1, θ_2, θ_3 we have the convergence (6.16). Secondly, we notice that P(0 < |Λ_kt| ≤ t) → 0 and P(0 < |Z_t| ≤ t) → 0; thus, 1_{E_t} → 1_{Ext^c} a.s., and Fact 3 implies (6.17). Using 1_{E_t} → 1_{Ext^c} a.s., it is a standard task to conclude (6.4) from (6.16) and (6.17).
To get (6.16), we will introduce an intermediate function χ_2 and show that |χ_1 − χ_2| → 0 and |χ_2 − χ_3| → 0. Let h̄ be the characteristic function of (1 − V_∞^i). One checks that all the random variables in the definition of χ_1, except for V_∞^i, are measurable with respect to the σ-field F generated by {Λ_ks, Z_s}_{s≤t}; moreover, conditionally on F, the variables V_∞^i are i.i.d. Using conditional expectation, we obtain a product formula for χ_1. The central limit theorem yields h̄(θ_2/√n)^n → e^{−(θ_2 σ_V)²/2}; this motivates the definition of χ_2. Lebesgue's dominated convergence theorem and the assumption on the event E_t yield |χ_1 − χ_2| → 0. Similarly, we deal with the other sum. We work conditionally on Z_t and, for notational simplicity, with integer times. We introduce sequences {a_n}_{n≥0}, {p_n}_{n≥0} such that a_n ∈ N, p_n ∈ R^{d a_n} (intuitively, a_n is the number of particles at time n and p_n are their positions). We assume that a_n e^{−αn} → a > 0 and |p_n(i)| ≤ log n. We denote by S_n the sum defined below (6.1); we set also m̄_n^i = E M_n^i. We are going to use the CLT to analyze S_n. Firstly, we calculate its variance; a proof analogous to (6.13) gives the limit λ* a^{−1} σ_f², where σ_f² is given by (4.4). Secondly, we check the Lyapunov condition using Hölder's inequality and (6.6). Therefore, the CLT implies the asymptotic normality of S_n. Using the dominated convergence theorem in a similar manner as in the case of χ_1 − χ_2, one can show that |χ_2(θ_1, θ_2, θ_3; t) − χ_3(θ_1, θ_2, θ_3; t)| → 0 as t → +∞.

Proof of Lemma 19
In order to prove (6.5), we will show the two convergences in (6.20). To get the first one, we use (5.18) and write the relevant quantity as an integral I_1(t) (see (6.21)). Using (S2), the integrand in the last expression can be estimated by a function L; using (3.4), we get L(0) ≤ c_1 e^{(α−2μ)s}, which, by the assumption α < 2μ, is integrable with respect to s. By (S1), for any fixed s ≥ 0, the integrand converges. Recalling (4.4) and appealing to Lebesgue's dominated convergence theorem, we conclude I_1(t) → σ_f²/λ* < +∞. An analogous argument, using (5.13) and (5.24), settles the remaining terms. By (S1), ϕ is an invariant measure, thus ⟨ϕ, P_u f⟩ = ⟨ϕ, f⟩; using (5.13), (5.14) and Fubini's theorem, we identify the limit. Now we pass to the second statement of (6.20). We analyze the first term of (5.18), which is the hardest, and leave the other terms to the reader; namely, we will prove that the supremum in (6.22) converges to 0. We recall (6.21) and the function f(s,t) from (6.23). The convergence in (6.22) then follows by Lebesgue's dominated convergence theorem, and we conclude the proof of (6.20). In order to prove (6.6), we apply the triangle inequality to (5.19) and Lemma 17. For simplicity, we skip all the terms u_f^{*,k} (which, by (5.25) and α* < 0, are easy to control). We will thus consider the sums S_k(x,t). For k = 2, calculations similar to (6.21) lead easily to S_2(x,t) ≤ e^{αt} R_1(x); thus, we conclude (6.25). For k = 3, we recall (5.15) and use (3.2), (5.17) and (6.25); then, using the assumption α < 2μ and (3.4), we estimate |V_f^3(x,t)| ≤ e^{(3α/2)t} R_6(x). (6.26) Finally, we pass to k = 4. We recall (5.15) and use (3.2), (5.17), (6.25) and (6.26); using the assumption α < 2μ and (3.4), we obtain the analogous bound for k = 4. This is enough to conclude (6.6).

Proof of Theorem 8
In this section, we fix f ∈ C_0(R^d) and make the standing assumption that (B1), (S1), (S2), (S3) and α > 2μ hold. Proving the convergence of the whole vectors (4.9) and (4.10) would be notationally cumbersome; as it follows lines similar to the proof of Theorem 4, it is left to the reader. We focus on the most important part, which is the convergence of the second coordinate of (4.10). Recalling (3.1) and the backbone construction given in Definition 11, we denote the corresponding fluctuation process for {Λ_t}_{t≥0}. We shall prove the convergence (7.1), where, slightly abusing notation, we use H_∞ to denote the limit of the martingale (4.7) defined for {Λ_t}_{t≥0}. By Theorem 12, the processes X and Λ have the same law, and thus (7.1) implies the convergence in law and hence also in probability. This establishes the convergence of the second coordinate in (4.10). Before the proof, we formulate a technical lemma.
Lemma 20 There exists R ∈ C_0(R^d) such that the stated bound holds.
We will define intermediate processes Y_2, Y_3, Y_4. The convergence (7.1) will follow immediately once we show the chain of approximations (7.3), where the convergences hold in probability and j : R_+ → R_+ is a continuous function. Recall (5.7); we choose j to be any continuous function fulfilling the two provisos of (7.4), the second being r(j(t)) e^{μt} → 0, where r is the function introduced in (S3) and α* is defined in (5.2). Using (3.4), (5.8) and (5.23), we obtain the first convergence of (7.3), where we used the first proviso in (7.4). We proceed to the second convergence of (7.3). By (5.22) and (5.26), and then using (5.27), α > 2μ and (3.4), we obtain (7.7), which establishes the second convergence in (7.3).
We recall (5.17) and (5.22) to get the relevant expression, where l(t) = 1 − e^{(α*−α) j(t)}. Following (S3), we decompose m_t^i = m̄_t^i + m̃_t^i. We recall (5.10) and write the corresponding sum; applying (5.27), the second proviso of (7.4) and (3.4), we obtain that the error term vanishes, and thus the third convergence of (7.3) holds. Finally, noticing that l(t) → 1 and using Fact 13, we get the last statement of (7.3), and thus the proof is concluded.

Proof of Fact 13
The fact that W is a martingale is well known (see, for example, [11]). For I, we use a decomposition I(t) = I_1(t) + I_2(t) + I_3(t), where M_t^i denotes the contribution (normalized by e^{−l(α−μ)t}) of the ith immigration process tested against h_j, and m_t^i := E(M_t^i | Z_t) = E(M_t^i | Z_t(i)). We use (5.23), (3.4) and (S3) to calculate E|I_1(t)|; by (5.2), we can choose l such that E|I_1(t)| → 0. The proof of E(I_2(t))² → 0 is the same as that of (7.7). Finally, for I_3, we use (5.22) and (S3); the convergence I_3(t) → 0 follows by the convergence of the martingale I. Putting these together, we obtain (5.12). Relation (5.11) can be proven in a similar but simpler way. Details are left to the reader.

Proof of Theorem 6
In this section, we fix f ∈ C_0(R^d) and make the standing assumption that (B1), (S1), (S2), (S3) and α = 2μ hold. Let us first present the outline of the proof. We start with the random vector K_1(t; k). We will define K_2, K_3, K_4 which fulfill the following relations: for any k > −α/α*, consecutive differences converge to 0 in probability as t → +∞ and, moreover, the lim sup relation (8.3) holds. By Lemma 18, we have the first estimate; using (S2), we estimate further, and applying (3.4), recalling α = 2μ, we obtain a bound with some C > 0. Let us now concentrate on the third coordinate of K_3(t; k). We introduce a truncation: we recall I(t) defined in (6.14); one can follow the proof there to show I(t) → 0, the only change being to show the analogue of (6.15), which is left to the reader. Therefore, we have the required approximation. The final step listed in (8.3) is to show K_4(t; k) → L_k. We proceed along the lines of the proof in Sect. 6.2; the definitions and arguments are analogous. The only significant change is the proof of convergence of v_n defined by (6.19). In our case, to treat I_1, we use α = 2μ and decompose it, following the notation of (S3), into terms I_2, ..., I_5. Recalling (S1), we check the convergence of the main term; to I_4, we apply (S3) and (3.4); similarly, one can prove |I_5(t)| → 0. Putting these results together, we conclude I_1(t) → σ_f²/λ* and, consequently, the first convergence in (8.9) holds. Let us pass to the second statement. We analyze the first term of (5.18), which is the hardest, and leave the others to the reader; namely, we will show that the relevant supremum converges to 0, with f(s,t) given in (6.23). We recall the notation of (S3) and write the corresponding decomposition. Using this decomposition, α = 2μ and the triangle inequality, we obtain an estimate whose first two terms are treated by (S3); the third one can be analyzed similarly. By the fact that r(s) ↘ 0, we can write f(s,t) ≤ c_1 r(s) + e^{−μ(t−s)} R_6(log t).
Clearly, it is enough to establish (8.10).

By (3.4), we obtain:
Acknowledgements Parts of this paper were written while the author enjoyed the kind hospitality of the Probability Laboratory at Bath. The author wishes to thank Simon Harris and Andreas Kyprianou for stimulating discussions. The author also thanks the reviewer for suggesting changes which greatly improved the paper.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Appendix
This section contains the proofs of Lemmas 15 and 17. First, we recall Faà di Bruno's formula, which states that for sufficiently smooth functions $g : \mathbb{R} \to \mathbb{R}$ and $h : \mathbb{R} \to \mathbb{R}$ we have
$$
\frac{\mathrm{d}^k}{\mathrm{d}x^k}\, h(g(x)) = \sum_{\mathbf{m} \in A_k} a_{\mathbf{m}}\; h^{(m_1+\cdots+m_k)}(g(x)) \prod_{j=1}^{k} \big(g^{(j)}(x)\big)^{m_j},
\tag{9.1}
$$
where
$$
a_{m_1,\ldots,m_k} := \frac{k!}{m_1!\,(1!)^{m_1}\, m_2!\,(2!)^{m_2} \cdots m_k!\,(k!)^{m_k}}
$$
and the sum is taken over the set $A_k$ of all $k$-tuples of nonnegative integers $\mathbf{m} = (m_1,\ldots,m_k)$ satisfying the constraint $\sum_{j=1}^{k} j\,m_j = k$. Fix $f \in b^+(\mathbb{R}^d)$ and recall (1.3). We introduce an additional parameter $\theta \geq 0$ and denote by $u^\theta_f$ the solution of
$$
u^\theta_f(x,t) = P_t(\theta f)(x) - \int_0^t P_{t-s}\, \psi\big(u^\theta_f(\cdot, s)\big)(x)\,\mathrm{d}s.
$$
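Formula (9.1) can be checked numerically; below is a minimal sketch (the function names are illustrative, not from the paper) that enumerates $A_k$ with the coefficients $a_{\mathbf m}$ and verifies the third derivative of $e^{x^2}$:

```python
import itertools
import math

def faa_di_bruno_terms(k):
    """Enumerate A_k (tuples m with sum_j j*m_j = k) and the coefficient a_m."""
    for m in itertools.product(range(k + 1), repeat=k):
        if sum((j + 1) * mj for j, mj in enumerate(m)) == k:
            denom = 1
            for j, mj in enumerate(m, start=1):
                denom *= math.factorial(mj) * math.factorial(j) ** mj
            yield m, math.factorial(k) // denom

def dk_h_of_g(k, h_derivs, g_derivs):
    """k-th derivative of h(g(x)) at a point, given h_derivs[i] = h^(i)(g(x))
    and g_derivs[j] = g^(j)(x) (index 0 of g_derivs is unused)."""
    total = 0.0
    for m, a in faa_di_bruno_terms(k):
        term = a * h_derivs[sum(m)]
        for j, mj in enumerate(m, start=1):
            term *= g_derivs[j] ** mj
        total += term
    return total

# Sanity check with h = exp, g(x) = x^2 at x = 0.5:
# d^3/dx^3 exp(x^2) = (12x + 8x^3) exp(x^2).
x = 0.5
e = math.exp(x ** 2)
val = dk_h_of_g(3, h_derivs=[e] * 4, g_derivs=[0.0, 2 * x, 2.0, 0.0])
expected = (12 * x + 8 * x ** 3) * e
assert abs(val - expected) < 1e-12
```

For $k = 3$ the enumeration produces exactly the three classical terms $h''' (g')^3 + 3 h'' g' g'' + h' g'''$.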
Formal calculations using (9.1) yield corresponding equations for $k \geq 2$. It is standard to check that the above formulas are valid for $\theta > 0$. Passing to the limit $\theta \searrow 0$, we conclude that they remain true as long as $\psi^{(k)}(0)$ is finite. We denote $u^k_f(x,t) := \frac{\partial^k}{\partial \theta^k} u^\theta_f(x,t)\big|_{\theta=0}$. The same reasoning holds for the branching mechanism $\psi^*$ given by (5.1); the respective quantities are denoted with the superscript $*$ (e.g., $u^{*,k}_f$). We will prove that, under assumption (B1), for $k \leq 4$ the quantities $u^k_f$ and $u^{*,k}_f$ given here are the same as those of (5.13), (5.14) and (5.16). One checks that $u^{*,0}_f(x,t) = u^*_{0 \cdot f} = 0$. Recalling $(\psi^*)'(0) = -\alpha^*$, we get: It is straightforward to verify that this equation is solved by the first formula of (5.13) (we recall the notation (2.1)). The second formula of (5.13) holds analogously. To treat the case $k \geq 2$, we denote: Analogously as before, it is solved by: These are the same as (5.14) and (5.16).
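For concreteness, the $k = 1$ step can be sketched as follows (assuming the standard form of the log-Laplace equation; this is a sketch, not a quotation of the paper's display): differentiating in $\theta$ at $\theta = 0$ and using $(\psi^*)'(0) = -\alpha^*$ gives the linear Duhamel equation

```latex
u^{*,1}_f(x,t) \;=\; P_t f(x) \;+\; \alpha^* \int_0^t P_{t-s}\, u^{*,1}_f(\cdot,s)(x)\,\mathrm{d}s ,
```

whose solution is $u^{*,1}_f(x,t) = e^{\alpha^* t} P_t f(x)$: substituting it back, the semigroup property yields $P_t f + \alpha^* P_t f \int_0^t e^{\alpha^* s}\,\mathrm{d}s = e^{\alpha^* t} P_t f$, the familiar first-moment formula for a supercritical superprocess.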
The validity of the above expressions for $f \in C_0$ follows by a standard integral-theoretic exercise. The formulas (5.20) and (5.21) are standard properties of the Laplace transform (recall (1.2)).
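The Laplace-transform property behind (5.20) and (5.21) amounts to the standard identity $(-1)^k L^{(k)}(0) = \mathbb{E}X^k$. As a purely illustrative check (the distribution and helper names are not from the paper), one can recover the moments of an exponential law by finite differences:

```python
import math

# For X ~ Exp(rate): L(theta) = E[e^{-theta X}] = rate / (rate + theta),
# and the moments satisfy (-1)^k L^{(k)}(0) = E[X^k] = k! / rate^k.

def laplace_exp(theta, rate):
    """Laplace transform of the exponential distribution (valid for theta > -rate)."""
    return rate / (rate + theta)

def kth_derivative_at_zero(f, k, h=1e-2):
    """Central finite-difference estimate of f^(k)(0); O(h^2) accurate."""
    s = sum((-1) ** i * math.comb(k, i) * f((k / 2 - i) * h) for i in range(k + 1))
    return s / h ** k

rate = 2.0
m1 = -kth_derivative_at_zero(lambda t: laplace_exp(t, rate), 1)  # estimates E[X]
m2 = kth_derivative_at_zero(lambda t: laplace_exp(t, rate), 2)   # estimates E[X^2]
assert abs(m1 - 1 / rate) < 1e-3          # E[X]   = 1/rate
assert abs(m2 - 2 / rate ** 2) < 1e-3     # E[X^2] = 2/rate^2
```

The same differentiate-at-zero mechanism is what justifies exchanging $\partial^k/\partial\theta^k$ and the expectation in the derivations above, provided the relevant derivative of $\psi$ at $0$ is finite.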
Similar derivations hold for $V^k_f$. We fix $f \in b^+(\mathbb{R}^d)$, recall (5.5), and put $\nu = 0$, $\gamma = \delta_x$, $f = \theta f$, $h = 0$, $\theta \geq 0$, and denote: By (5.6), we know that $V^\theta_f(x,t)$ is the unique $[0,1]$-valued solution of the integral equation. Let $k \geq 1$. Using (9.1), we obtain (we skip some arguments to make the expressions clearer): As before, we pass to the limit $\theta \searrow 0$; again, this is possible when $\psi^{(k)}(0) < +\infty$.
Under assumption (B1), for $k \leq 4$ we denote $V^k_f := \frac{\partial^k}{\partial \theta^k} V^\theta_f \big|_{\theta=0}$, and we will prove that these are the same as those given by (5.17), (5.18) and (5.19). One checks that $V^0_f = 1$ and $u^{*,0}_f = 0$, and thus: