On nested infinite occupancy scheme in random environment

We consider an infinite balls-in-boxes occupancy scheme with boxes organised in a nested hierarchy, and with random probabilities of boxes defined in terms of an iterated fragmentation of a unit mass. We obtain a multivariate functional limit theorem for the cumulative occupancy counts as the number of balls approaches infinity. In the case of fragmentation driven by a homogeneous residual allocation model, our result generalises the functional central limit theorem for the block counts in Ewens’ and more general regenerative partitions.


Introduction
In the infinite multinomial occupancy scheme balls are thrown independently into a series of boxes, so that each ball hits box k = 1, 2, ... with probability p_k, where p_k > 0 and ∑_{k∈ℕ} p_k = 1. This classical model is sometimes named after Karlin due to his seminal contribution [32]. Features of the occupancy pattern emerging after the first n balls are thrown have been intensely studied; see [6,20,28] for surveys and references and [7,13,14,16] for recent advances. The statistics in the focus of most of the previous work, which are also relevant to the subject of this paper, are not sensitive to the labelling of boxes but depend only on the integer partition of n comprised of the nonzero occupancy numbers.
In the infinite occupancy scheme in a random environment the (hitting) probabilities of boxes are positive random variables (P_k)_{k∈ℕ} with an arbitrary joint distribution satisfying ∑_{k∈ℕ} P_k = 1 almost surely (a.s.). Conditionally on (P_k)_{k∈ℕ}, balls are thrown independently, with probability P_k of hitting box k. Instances of this general setup have received considerable attention within the circle of questions around exchangeable partitions, discrete random measures and their applications to population genetics, Bayesian statistics and computer science. In the most studied and analytically best tractable case the probabilities of boxes are representable as the residual allocation (or stick-breaking) model

P_k = U_1 U_2 ⋯ U_{k−1}(1 − U_k), k ∈ ℕ, (1)

where the U_i's are independent with the beta(θ, 1) distribution on (0, 1) and θ > 0. In this case the distribution of the sequence (P_k)_{k∈ℕ} is known as the Griffiths–Engen–McCloskey (GEM) distribution with parameter θ. The sequence of the P_k's arranged in decreasing order has the Poisson–Dirichlet (PD) distribution with parameter θ, and the induced exchangeable partition on the set of n balls follows the celebrated Ewens sampling formula [3,35,37,38]. Generalisations have been proposed in various directions. The two-parameter extension due to Pitman and Yor [35] involves probabilities of the form (1) with independent but not identically distributed U_i's, where the distribution of U_i is beta(θ + αi, 1 − α) (with 0 < α < 1 and θ > −α). Residual allocation models with other choices of beta distributions for the U_i's are found in [30,39]. Much effort has been devoted to the occupancy scheme, known as the Bernoulli sieve, which is based on a homogeneous residual allocation model (1), that is, with independent and identically distributed (iid) factors U_i having an arbitrary distribution on (0, 1); see [2,15,22,28,29,36].
The homogeneous model has a multiplicative regenerative property, also inherited by the partition of the set of balls.
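As a quick illustration of the environment just described, the stick-breaking probabilities and the conditional throwing of balls are straightforward to simulate. The following Python sketch is our own (the function names are invented, and the stick is truncated after k_max factors, so a negligible amount of mass is discarded); it uses beta(θ, 1) factors as in the GEM case:

```python
import random

def gem_probabilities(theta, k_max, rng):
    """Residual allocation (stick-breaking): box k receives the mass broken
    off after k-1 surviving factors, with U_i i.i.d. beta(theta, 1).
    Truncated at k_max boxes, so sum(probs) is slightly below 1."""
    probs, stick = [], 1.0
    for _ in range(k_max):
        u = rng.betavariate(theta, 1.0)
        probs.append(stick * (1.0 - u))
        stick *= u
    return probs

def throw_balls(probs, n, rng):
    """Occupancy numbers after n balls, thrown i.i.d. given the environment."""
    counts = [0] * len(probs)
    for _ in range(n):
        r, k = rng.random(), 0
        while k < len(probs) - 1 and r > probs[k]:
            r -= probs[k]
            k += 1
        counts[k] += 1
    return counts

rng = random.Random(1)
probs = gem_probabilities(theta=1.0, k_max=200, rng=rng)
counts = throw_balls(probs, n=1000, rng=rng)
occupied = sum(c > 0 for c in counts)
```

For θ = 1 and n = 1000 the number of occupied boxes is typically of order log n, in line with the Ewens case.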
In more sophisticated constructions of random environments the probabilities (P_k)_{k∈ℕ} are identified with some arrangement in a sequence of the masses of a purely atomic random probability measure. A widely explored possibility is to define a random cumulative distribution function F by transforming the path of an increasing drift-free Lévy process (subordinator) (X(t))_{t≥0}. In particular, in the regenerative model F is defined by F(t) = 1 − e^{−X(t)} for t ≥ 0; see [5,21,24,25] and also Sect. 5. In the statistical literature such an F is called a neutral-to-the-right prior [18]. In the Poisson–Kingman model F is given by F(t) = X(t)/X(1) for t ∈ [0,1]; see [18,35] and also Sect. 6.
Following [8,12,31] we shall study a nested infinite occupancy scheme in a random environment. In this context we regard (P_k)_{k∈ℕ} as a random fragmentation law (with P_k > 0 and ∑_{k∈ℕ} P_k = 1 a.s.). To introduce a hierarchy of boxes, for each j ∈ ℕ_0 let I_j be the set of words of length j over ℕ, where I_0 := {∅}. The set I = ⋃_{j∈ℕ_0} I_j of all finite words has the natural structure of an infinite-storey tree with root ∅ and ∞-ary branching at every node, where v1, v2, ... ∈ I_{j+1} are the immediate followers of v ∈ I_j. Let {(P_k^{(v)})_{k∈ℕ}, v ∈ I} be a family of independent copies of (P_k)_{k∈ℕ}. With each v ∈ I we associate a box divided into sub-boxes v1, v2, ... of the next level. The probabilities of boxes are defined recursively by

P(∅) := 1, P(vk) := P(v) P_k^{(v)}, v ∈ I, k ∈ ℕ, (2)

(note that the factors P(v) and P_k^{(v)} are independent). Given (P(v))_{v∈I}, balls are thrown independently, with probability P(v) of hitting box v. Since ∑_{v∈I_j} P(v) = 1, the allocation of balls in the boxes of level j occurs according to the ordinary Karlin occupancy scheme.
Recursion (2) defines a discrete-time mass-fragmentation process, in which a generic mass splits in proportions governed by the same fragmentation law, independently of the history and of the masses of the co-existing fragments. The nested occupancy scheme can be seen as a combinatorial version of this fragmentation process. Initially all balls are placed in box ∅, and at each consecutive step j + 1 each ball in box v ∈ I_j is placed in sub-box vk with probability P_k^{(v)}. The inclusion relation on the hierarchy of boxes induces a combinatorial structure on the (labelled) set of balls called a total partition, that is, a sequence of refinements from the trivial one-block partition down to the partition into singletons. The paper [17] highlights the role of exchangeability and gives the general de Finetti-style connection between mass-fragmentations and total partitions.
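A toy simulation of the nested scheme may clarify the recursive construction: independent copies of the fragmentation law are attached lazily to the nodes of the tree, and each ball is routed from the root down through j levels. Everything in this Python sketch (the particular fragmentation law, the truncation, the names) is our own illustrative choice:

```python
import random
from collections import Counter

def fragmentation_law(rng, k_max=50):
    # Any fragmentation law would do; here, stick-breaking with
    # i.i.d. uniform factors (a homogeneous residual allocation).
    probs, stick = [], 1.0
    for _ in range(k_max):
        u = rng.random()
        probs.append(stick * (1.0 - u))
        stick *= u
    return probs

def drop_ball(levels, laws, rng):
    """Route one ball down the tree: at node v, an independent copy of the
    fragmentation law (attached lazily) selects the sub-box v + (k,)."""
    v = ()
    for _ in range(levels):
        if v not in laws:
            laws[v] = fragmentation_law(rng)
        p = laws[v]
        r, k = rng.random(), 0
        while k < len(p) - 1 and r > p[k]:
            r -= p[k]
            k += 1
        v = v + (k,)
    return v

rng, laws = random.Random(7), {}
occupancy = Counter(drop_ball(3, laws, rng) for _ in range(500))
```

The keys of `occupancy` are the occupied level-3 boxes (words of length 3), and its values are the occupancy numbers.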
We consider the random probabilities of the hierarchy of boxes and the outcome of throwing infinitely many balls all defined on the same underlying probability space. For j, r ∈ ℕ, denote by K_{n,j,r} the number of boxes v ∈ I_j of the jth level that contain exactly r out of the n first balls, and let

K_{n,j}(s) := ∑_{r ≥ ⌈n^{1−s}⌉} K_{n,j,r}, s ∈ [0,1], (3)

be a cumulative count of occupied boxes, where ⌈·⌉ is the integer ceiling function. With probability one the random function s ↦ K_{n,j}(s) is nondecreasing and right-continuous, hence belongs to the Skorokhod space D[0,1]. Also observe that K_{n,j}(0) = K_{n,j,n} is zero unless all balls fall in the same box, and that K_{n,j}(1) is the number of occupied boxes in the jth level. In [8] a central limit theorem with random centering was proved for K_{n,j}(1) with j growing with n at a certain rate. Our focus is different. We are interested in the joint weak convergence of (K_{n,j}(s))_{j∈ℕ, s∈[0,1]}, properly normalised and centered, as the number of balls n tends to ∞. As far as we know, this question has not been addressed so far. We prove a multivariate functional limit theorem (Theorem 2.1) applicable to the fragmentation laws representable by homogeneous residual allocation models (including the GEM/PD distribution) and to some other models in which the sequence of P_k's arranged in decreasing order approaches zero sufficiently fast. A univariate functional limit for (K_{n,1}(s))_{s∈[0,1]} in the case of the Bernoulli sieve was previously obtained in [2].
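In code, the cumulative count is a thresholded tally of the occupancy numbers of one level: reading K_{n,j}(s) as the number of level-j boxes containing at least ⌈n^{1−s}⌉ of the n balls (consistent with the box-counting in Sect. 3), a minimal sketch with toy data is:

```python
import math

def cumulative_count(occupancy_counts, n, s):
    """Number of boxes of one level holding at least ceil(n^(1-s))
    of the n balls; s in [0, 1]."""
    threshold = math.ceil(n ** (1.0 - s))
    return sum(1 for c in occupancy_counts if c >= threshold)

# occupancy numbers of the occupied boxes in some level (toy data)
counts = [500, 120, 40, 9, 3, 1, 1]
n = sum(counts)                              # 674 balls in total
k_all  = cumulative_count(counts, n, 1.0)    # all occupied boxes
k_big  = cumulative_count(counts, n, 0.5)    # boxes with >= ceil(sqrt(n)) balls
k_full = cumulative_count(counts, n, 0.0)    # boxes holding all n balls
```

Note that s ↦ K_{n,j}(s) is nondecreasing, and s = 1 recovers the total number of occupied boxes.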

Main result
For a given fragmentation law (P_k)_{k∈ℕ}, let ρ(s) := #{k ∈ ℕ : P_k ≥ 1/s} for s > 0, and N(t) := ρ(e^t), V(t) := E N(t) for t ∈ ℝ. The joint distribution of the K_{n,j,r}'s is completely determined by the probability law of the random function ρ(·), which captures the fragmentation law up to a rearrangement of the P_k's. For our purposes, therefore, we need not distinguish between fragmentation laws with the same ρ(·).
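For a deterministic fragmentation law the functions ρ and N are elementary to evaluate. The following small sketch is our own example with geometric probabilities p_k = (1 − p)p^{k−1}, for which N(t) grows linearly in t at rate 1/log(1/p):

```python
import math

def rho(probs, s):
    """rho(s) = #{k : p_k >= 1/s}; determines the law up to rearrangement."""
    return sum(1 for p in probs if p >= 1.0 / s)

def N(probs, t):
    return rho(probs, math.exp(t))   # N(t) = rho(e^t)

p = 0.5
probs = [(1.0 - p) * p ** (k - 1) for k in range(1, 60)]  # p_k = (1-p) p^(k-1)
values = [N(probs, t) for t in (1.0, 5.0, 10.0)]
# here p_k = 2^(-k), so N(t) = floor(t / log 2) at these t
```

For p = 1/2 this gives N(1) = 1, N(5) = 7 and N(10) = 14, matching t/log 2 up to rounding.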
Similarly, using the probabilities of boxes in level j ∈ ℕ define ρ_j(s) := #{v ∈ I_j : P(v) ≥ 1/s} for s > 0, and N_j(t) := ρ_j(e^t), V_j(t) := E N_j(t) for t ∈ ℝ. Let T_k := −log P_k for k ∈ ℕ. Here is a basic decomposition of principal importance for what follows:

N_j(t) = ∑_{k∈ℕ} N_{j−1}^{(k)}(t − T_k), t ≥ 0, (4)

where the (N_{j−1}^{(k)}(t))_{t≥0} for k ∈ ℕ are independent copies of (N_{j−1}(t))_{t≥0} which are also independent of T_1, T_2, .... A consequence of (4) is a recursion for the expectations,

V_j(t) = ∫_{[0,t]} V_{j−1}(t − x) dV(x), t ≥ 0, (5)

which shows that V_j(·) is the jth convolution power of V(·). The assumptions on the fragmentation law and the functional limit will involve a centered Gaussian process W := (W(s))_{s≥0} which is a.s. locally Hölder continuous with exponent β > 0 and satisfies W(0) = 0. In particular, for any T > 0,

|W(x) − W(y)| ≤ M_T |x − y|^β, x, y ∈ [0, T], (6)

for some a.s. finite random variable M_T. For each u > 0, we set further R_1^{(u)} := W and, for j ≥ 2,

R_j^{(u)}(s) := ∫_{[0,s]} (s − y)^{u(j−1)} dW(y), s ≥ 0.

For j ≥ 2, the process R_j^{(u)} is understood as the result of integration by parts,

R_j^{(u)}(s) = u(j−1) ∫_0^s (s − y)^{u(j−1)−1} W(y) dy, s ≥ 0.

In particular, when a := u(j − 1) is a positive integer,

R_j^{(u)}(s) = a! ∫_0^{s_1} ds_2 ∫_0^{s_2} ds_3 ⋯ ∫_0^{s_a} W(s_{a+1}) ds_{a+1},

where s_1 = s, which can be seen with the help of repeated integration by parts. Throughout the paper D := D[0, ∞) and D[0,1] denote the standard Skorokhod spaces. Here is our main result.

Theorem 2.1 Assume the following conditions hold: (7) and (8), for all t ≥ 0 and some constants c, ω, a_0, and (9), in the J_1-topology on D, for some a > 0 and the same γ as in (8).
with Γ(·) denoting the gamma function.

Remark 2.2
Observe that the limit processes in (10) are the restrictions of the R_j to [0,1]. One could work with the interval [0,1] throughout and assume that (9) holds on D[0,1] rather than on D. However, we do not think that such an assumption would be more natural than the present one.

Remark 2.3
The assumption 0 < ε_1, ε_2 ≤ ω ensures that γ > 0. Furthermore, in view of (7) and the choice of γ, relation (9) is equivalent to its analogue with simplified centering, in the J_1-topology on D. Similarly, in view of (13) given below, relation (10) is equivalent to its analogue with modified normalisation. Condition (7) ensures that, for j ∈ ℕ and t ≥ 0, estimate (17) holds.
Assuming that (17) holds for j = i − 1, we intend to show that it also holds for j = i. Recalling (4), write, for i ≥ 2 and t ≥ 0, the corresponding decomposition. An integration by parts yields, for s ≥ 0, the required bound, and hence the claim follows from (8) and (14).
Passing to the analysis of X_i we obtain, for s ≥ 0, the analogous bound, which is negligible as t → ∞ by the induction assumption and the stated asymptotics. Our main result, Theorem 2.1, is an immediate consequence of Proposition 3.7 given in Sect. 3.2, and of Theorem 3.2 given next together with its corollary.

Theorem 3.2 Suppose (7), (8) and (9). Then (20) holds in the J_1-topology on D^ℕ.

Corollary 3.3
In the setting of Theorem 3.2, for j ∈ ℕ and h > 0, relation (21) holds. It is convenient to prove Corollary 3.3 at this early stage.
Theorem 3.2 follows, in its turn, from Propositions 3.4 and 3.5. Below we use the processes X_j and Y_j as defined in (19).

Proposition 3.4 Suppose (7) and (9). Then (22) holds in the J_1-topology on D^ℕ.

Proposition 3.5 Suppose (7), (8) and (9). Then, for each integer j ≥ 2, relation (23) holds.

Connecting two ways of box-counting
We digress for a while from our main theme to focus on Karlin's occupancy scheme with deterministic probabilities (p_k)_{k∈ℕ}. By the law of large numbers, a box of probability p gets occupied by about np balls, provided np is large enough. This suggests relating the count of boxes occupied by at least n^{1−s} balls to the number of boxes with probability at least n^{−s}. Let ρ̂(t) := #{k ∈ ℕ : p_k ≥ 1/t} for t > 0, and let K̂_{n,r} be the number of boxes containing exactly r out of n balls, with K̂_n(s) := ∑_{r ≥ ⌈n^{1−s}⌉} K̂_{n,r} the corresponding cumulative count. We shall estimate uniformly the difference between (K̂_n(s))_{s∈[0,1]} and (ρ̂(n^s))_{s∈[0,1]}. The following result is very close to Proposition 4.1 in [2]. However, we did not succeed in applying the cited proposition directly and will combine the estimates obtained in its proof.
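The law-of-large-numbers heuristic can be checked numerically. In the sketch below (a hypothetical geometric environment with n = 10 000 balls, entirely our own choice), the gap between the cumulative occupancy count and ρ̂(n^s) stays within a few units at the sampled points:

```python
import math
import random

rng = random.Random(3)
p = 0.5
probs = [(1.0 - p) * p ** k for k in range(40)]   # geometric environment

n = 10_000
counts = [0] * len(probs)
for _ in range(n):
    r, k = rng.random(), 0
    while k < len(probs) - 1 and r > probs[k]:
        r -= probs[k]
        k += 1
    counts[k] += 1

def K_hat(s):    # boxes holding at least n^(1-s) balls
    thr = math.ceil(n ** (1.0 - s))
    return sum(1 for c in counts if c >= thr)

def rho_hat(t):  # boxes with probability at least 1/t
    return sum(1 for q in probs if q >= 1.0 / t)

gap = max(abs(K_hat(s) - rho_hat(n ** s)) for s in (0.25, 0.5, 0.75))
```

The residual discrepancy comes from boxes whose expected occupancy sits near the threshold n^{1−s}, which is exactly what Proposition 3.6 controls uniformly in s.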

Proposition 3.6
For each n ∈ ℕ the universal estimate (24) holds, where y_0 ∈ (0, 1) is a constant which depends neither on n nor on (p_k)_{k∈ℕ}.
Proof For k ∈ ℕ, denote by Ẑ_{n,k} the number of balls falling in the kth box, so that ∑_{k∈ℕ} Ẑ_{n,k} = n. Then, for n ∈ ℕ and s ∈ [0, 1], the difference in question splits into several terms. In [2] it was shown that, for n ∈ ℕ, the first two of them admit the required bounds (see [2], pp. 1004–1005 and p. 1006). Finally, for n ∈ ℕ, the remaining term is estimated directly. Combining the estimates we arrive at (24).

We apply next Proposition 3.6 to the setting of Theorem 2.1. The following result shows that (10) is equivalent to the analogous limit relation with ρ_j(n^t) = N_j(t log n) replacing K_{n,j}(t).

Proposition 3.7 Suppose (7) and (9). Then, for each j ∈ ℕ, (25) holds.
Proof Fix any j ∈ ℕ. By Proposition 3.6, for n ∈ ℕ, we obtain the bound (26). Recall the notation introduced above. In view of (14), the first term is under control. The next step is to show (28). As a preparation for the proof of (28), we first note that, according to (15), the required asymptotics hold, and (28) follows.
An appeal to (13) enables us to conclude that, for large enough n, the same bound holds by the same reasoning as above. Finally, the remaining term is controlled by Corollary 3.3. Using (27)–(30) in combination with Markov's inequality [applied to the first three terms on the right-hand side of (26)] shows that the left-hand side of (26), divided by (log n)^{γ+ω(j−1)}, converges to zero in probability as n → ∞. Now (25) follows by another application of Markov's inequality and the dominated convergence theorem.

Proof of Proposition 3.4
We shall use an integral representation which has already appeared in the proof of Lemma 3.1(c), valid for j ≥ 2 and t ≥ 0; here, the last equality is obtained with the help of integration by parts.
In view of (12), Skorokhod's representation theorem ensures that there exist versions converging uniformly on [0, T] a.s. for all T > 0. This implies (22), with the relevant processes defined for j ≥ 2 and t, x ≥ 0. As far as the first coordinate is concerned, the equivalence is an immediate consequence of (32). As for the other coordinates, note that, for each t > 0, the process (Y_j(t·))_{j≥2} has the same distribution as its counterpart built from these versions; then write, for s > 0 fixed and j ≥ 2, the corresponding decomposition. Denoting by L(t, s) the first term on the right-hand side, we infer, for all T > 0, in view of (14), that it vanishes in the limit, which, together with (32), yields the claim.
because the left-hand sides of (33) and (35) have the same distribution. It remains to check two properties: (a) weak convergence of finite-dimensional distributions, i.e. that (36) holds for all n ∈ ℕ, all 0 ≤ s_1 < s_2 < ⋯ < s_n < ∞ and all integers ℓ ≥ 2, as t → ∞; (b) tightness of the distributions of the coordinates in (35), excluding the first one.

Proof of (36)
In what follows we consider the case s_1 > 0; only trivial changes are needed when s_1 = 0. Both the limit and the converging vectors in (36) are Gaussian. In view of this it suffices to prove that (37) holds for k, j ∈ ℕ, k + j ≥ 3 and s, u > 0, where we set Z_1(t, ·) = W(·) and r(x, y) := E[W(x)W(y)] for x, y ≥ 0. We only consider the case where k, j ≥ 2, the complementary case being similar and simpler.
To prove (37) we need some preparation. For each t > 0, denote by θ_{k,t} and θ_{j,t} independent random variables with the distribution functions specified below. Further, let θ_k and θ_j denote independent random variables with the distribution functions P{θ_k ≤ y} = (y/s)^{ω(k−1)} on [0, s] and P{θ_j ≤ y} = (y/u)^{ω(j−1)} on [0, u], respectively. According to (14), the corresponding expectations converge for every T > 0. This follows from the assumed a.s. continuity of W and the dominated convergence theorem, in combination with E[sup_{z∈[0,T]} W(z)]^2 < ∞ for every T > 0 (for the latter, see Theorem 3.2 on p. 63 in [1]). As a result, (37) follows by the dominated convergence theorem.

Proof of Tightness
Choose j ≥ 2. We intend to prove tightness of (t^{−ω(j−1)} Z_j(t, u))_{u≥0} on D[0, T] for all T > 0. Since the function t ↦ t^{−ω(j−1)} is regularly varying at ∞, it is enough to investigate the case T = 1 only. By Theorem 15.5 in [9] it suffices to show that for any κ_1 > 0 and κ_2 > 0 there exist t_0 > 0 and δ > 0 such that (38) holds for all t ≥ t_0. We only analyze the case where 0 ≤ v < u ≤ 1, the complementary case being analogous. Set W(x) = 0 for x < 0. The basic observation for the subsequent proof is that (6) extends to

|W(x) − W(y)| ≤ M_T |x − y|^β (39)

whenever −∞ < x, y ≤ T, for the same positive random variable M_T as in (6). This is trivial when x ∨ y ≤ 0 and a consequence of (6) when x ∧ y ≥ 0. Assume now that x ∧ y < 0 ≤ x ∨ y. Then

|W(x) − W(y)| = |W(x ∨ y)| ≤ M_T (x ∨ y)^β ≤ M_T |x − y|^β,

where the first inequality follows from (6) with y = 0. Let 0 ≤ v < u ≤ 1 and u − v ≤ δ for some δ ∈ (0, 1]. Using (39) and (14) we obtain the required bound for large enough t and a positive constant λ. This proves (38).

Proof of Proposition 3.5
Relation (23) will be proved by induction in three steps.
Step 1 To prove (23) with j = 2, use (42) below with k = 1, which is nothing else but (9), and repeat verbatim the proof of Step 3.
Step 2 Assume that (23) holds for j = 2, ..., k. We claim that then (40) holds in the J_1-topology on D^k. Indeed, in view of (19) and the induction hypothesis, relation (40) is equivalent to (41). The latter holds by Proposition 3.4.
Step 3 Using the convergence (42) in the J_1-topology on D, which is a consequence of (40), we shall prove that (23) holds with j = k + 1.
In view of (42) and the fact that R_k^{(ω)} is a.s. continuous, denote, for each t > 0, by R_k^{(t,ω)} a version of the process on the left-hand side of (42) for which (42) holds locally uniformly a.s. We can assume that the probability space on which these versions are defined is rich enough to accommodate the required couplings for all T > 0 and r ∈ ℕ.
in view of (17) and the well-known fact that the supremum over [0, T] of an a.s. continuous Gaussian process has an exponential tail. Since lim_{t→∞} η_1(t) = 0 a.s., inequality (46) ensures that lim_{t→∞} Eη_1(t) = 0. The right-hand side of (45) multiplied by t^{−ω} is dominated by a sum of two terms, where N̂(t) := #{r ∈ ℕ : T_r ≤ t}. Using the last limit relation and (16), we conclude that the second summand converges to 0 a.s. as t → ∞. The first summand converges to zero in probability as t → ∞, by Markov's inequality in combination with (16). The process (Ẑ_2(t, y)) has the same distribution as the process (Z_2(t, y)) in which the random variables involved do not carry hats, and similarly for the versions of R_k^{(ω)}. In what follows we write E^{(T_r)}(·) for E(·|(T_r)) and P^{(T_r)}(·) for P(·|(T_r)). Note that the conditional second moment of Ẑ_2(t, y) is E([R_k^{(ω)}(1)]^2) y^{2(γ+ω(k−1))} N_1(ty). Using now the Cramér–Wold device and Markov's inequality in combination with (16), we infer that, given (T_r), with probability one the finite-dimensional distributions of (t^{−ω} Ẑ_2(t, y))_{y≥0} converge weakly to the zero vector as t → ∞. Thus, (47) follows if we can show that the family of P^{(T_r)}-distributions of (t^{−ω} Ẑ_2(t, y))_{y≥0} is tight. As a preparation, we observe that the process R_k^{(ω)} inherits the local Hölder continuity of W. Indeed, recalling (39) we obtain, for x, y ∈ [0, T] and k ≥ 2, the bound (48). It is also important that the random variable M_T has finite moments of all positive orders; see Theorem 1 in [4]. Pick now an integer n ≥ 2 such that 2nβ > 1. By Rosenthal's inequality (Theorem 3 in [40]), for x, y ∈ [0, T] and a positive constant C_n which depends neither on (T_r) nor on t, the corresponding moment bound holds, having utilized (48) for the second inequality. In view of (16), this entails that a classical sufficient condition for tightness (formula (12.51) on p. 95 in [9]) holds for a positive random variable θ_n and large enough t. Thus, we have proved that (47) holds conditionally on (T_r), hence also unconditionally.

The case of homogeneous residual allocation model
In this section we apply Theorem 2.1 to the case of a fragmentation law given by the homogeneous residual allocation model (1) with iid factors. The process B_q := (B_q(s))_{s≥0} defined by

B_q(s) := ∫_{[0,s]} (s − x)^q dB(x), s ≥ 0,

with B a Brownian motion (BM), is a centered Gaussian process called the fractionally integrated BM or the Riemann–Liouville process. Clearly B = B_0, and for q ∈ ℕ the process can be obtained as a repeated integral of the BM. It is known that B_q is locally Hölder continuous with any exponent β < q + 1/2 [27].
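A discretized path of B_q can be produced from the integral representation B_q(s) = ∫_0^s (s − x)^q dB(x) by a Riemann sum over Gaussian increments. This crude Euler-type sketch is our own approximation, not an exact simulation:

```python
import math
import random

def riemann_liouville(q, grid, rng):
    """Discretized B_q(s) = integral_0^s (s - x)^q dB(x) on a uniform grid,
    with one centered Gaussian increment of B per grid cell."""
    dt = grid[1] - grid[0]
    dB = [rng.gauss(0.0, math.sqrt(dt)) for _ in grid[:-1]]
    path = []
    for s in grid:
        path.append(sum((s - x) ** q * db
                        for x, db in zip(grid[:-1], dB) if x < s))
    return path   # path[0] = B_q(0) = 0

rng = random.Random(11)
grid = [i / 200.0 for i in range(201)]
path = riemann_liouville(q=1.0, grid=grid, rng=rng)
# Var B_q(1) = 1/(2q + 1) in the continuum limit, i.e. 1/3 for q = 1
```

For integer q one can compare the output against repeated trapezoidal integration of a simulated BM path, in line with the repeated-integral description above.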

Lemma 4.2
Assume that m := Eξ < ∞, s^2 := Var ξ ∈ (0, ∞) and Eη < ∞. Then (51) holds for some constants b_1 < 0 and a_0 > 0. Also, the weak convergence stated below holds.

Proof (a) A standard result of renewal theory tells us that (52) holds, where a_0 is a known positive constant. The second inequality in (52), in combination with V(t) ≤ U(t), proves the second inequality in (51). Using the first inequality in (52) yields the first inequality in (51). For a proof of the weak convergence, see Theorem 3.2 in [2].
(b) We shall use a decomposition of Ṽ in which ν(x) := #{r ∈ ℕ_0 : S_r ≤ x} for x ≥ 0, so that U(x) = Eν(x). It suffices to prove two limit relations, treated separately below.

Proof of (53) For each j ∈ ℕ, we write the first split, and similarly the second. Thus, (53) is a consequence of (55) and (56). The second moment in (55) is equal to the expression bounded next. In view of Eη < ∞, the function x ↦ 1 − G(x) is directly Riemann integrable on [0, ∞). According to Lemma 6.2.8 in [28], this implies that the right-hand side of the last inequality is O(1) as j → ∞, thereby proving (55).
where the last inequality is a consequence of the distributional subadditivity of ν, that is, of the fact that ν(x + y) is stochastically dominated by ν(x) + ν′(y) with ν′ an independent copy of ν. Here, the second inequality is implied by convexity of x ↦ x^2 and Jensen's inequality in the form (∑_k p_{j,k} x_k)^2 ≤ ∑_k p_{j,k} x_k^2, with x_k := ν(k + 1) − ν(k). Note that the p_{j,k} satisfy ∑_{k=0}^{j−1} p_{j,k} = 1. Combining the obtained estimates together, we arrive at (56).
Passing to the proof of (57), we first observe that, in view of (52), relation (57) is equivalent to (58). Since s ↦ ν(s) − m^{−1}s is a (random) piecewise linear function with slope −m^{−1} having unit jumps at the times S_0, S_1, ..., we conclude that the supremum is attained at these points. Applying Doob's inequality to the martingale (S_{ν(t)∧n} − m(ν(t) ∧ n))_{n∈ℕ_0} (this is a martingale with respect to the filtration generated by the ξ_k because ν(t) is a stopping time with respect to the same filtration), we obtain the corresponding bound for each n ∈ ℕ. Here, the last equality is nothing else but Wald's identity. An application of the monotone convergence theorem then yields the unrestricted bound. In view of (52), the right-hand side is O(t) as t → ∞, and (58) follows.
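The renewal quantities appearing above are easy to probe numerically. In the following toy sketch (our own choice of exponential steps, for which ν(t) − 1 is Poisson with mean t/m), a Monte Carlo estimate of U(t) = Eν(t) is close to t/m, as the elementary renewal theorem predicts:

```python
import random

def nu(t, rng, lam):
    """nu(t) = #{r >= 0 : S_r <= t} for a random walk S_r with i.i.d.
    exponential(lam) steps; S_0 = 0 is always counted."""
    count, s = 0, 0.0
    while s <= t:
        count += 1
        s += rng.expovariate(lam)
    return count

rng = random.Random(5)
t, lam, reps = 50.0, 2.0, 2000          # mean step m = 1/lam = 0.5
est = sum(nu(t, rng, lam) for _ in range(reps)) / reps
# elementary renewal theorem: U(t)/t -> 1/m; here U(t) = 1 + t/m = 101 exactly
```

The exponential case is special in that U(t) − t/m is constant; for general step distributions with finite variance the constant limit a_0 of (52) depends on the first two moments.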
Recall that (P_k)_{k∈ℕ} follows the GEM distribution with parameter θ > 0 when the U_i's in (1) are beta distributed with parameters θ and 1, in which case |log U_1| has the exponential distribution with parameter θ and μ = E|log U_1| = θ^{−1}.

Corollary 4.3
For θ > 0 let (P_k)_{k∈ℕ} be GEM-distributed with parameter θ, or any random sequence such that the sequence of the P_k's arranged in decreasing order follows the PD distribution with parameter θ. Then the corresponding limit relation holds in the product J_1-topology on D[0,1]^ℕ.

Some regenerative models
For (X(t))_{t≥0} a drift-free subordinator with X(0) = 0 and a nonzero Lévy measure ν supported by (0, ∞), let (ΔX(t))_{t≥0}, where ΔX(t) := X(t) − X(t−), be the associated process of jumps. The process ΔX(·) assumes nonzero values on a countable set, which is dense in case ν(0, ∞) = ∞. The transformed process (multiplicative subordinator) F(t) = 1 − e^{−X(t)}, t ≥ 0, has the associated process of jumps ΔF(t) = e^{−X(t−)}(1 − e^{−ΔX(t)}). In this section we identify the fragmentation law (P_k)_{k∈ℕ} with the nonzero jumps of F(·) arranged in some order (for instance, in decreasing order). Note that multiplying the Lévy measure by a positive factor corresponds to a time change for F, hence does not affect the derived fragmentation law.
Theorem 5.1 applies to the gamma subordinator, and to the subordinator with Lévy measure (61), where θ, λ > 0. In both cases s^2 < ∞ and (60) holds with c_0 = θ and q = r_1 = r_2 = 1. Let X(·) be a subordinator with Lévy measure (61). We note in passing that ∫_0^∞ exp(−X(t)) dt is the weak limit of the total tree length, properly normalized, of a beta(2, λ) coalescent; see Section 5 in [33] or Table 3 in the survey [23]. Also, the image of ν given in (61) under the transformation x ↦ 1 − e^{−x} yields a particular instance of the driving measure for a beta process; see formula (4) in [11].
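The fragmentation law derived from F can be approximated by discretizing the subordinator. In this Python sketch (our own discretization: gamma increments over a fine grid stand in for the a.s. dense set of jumps, so very small jumps are merged or lost), the jump sizes of F sum to 1 − e^{−X(T)}, which is numerically indistinguishable from 1 for a long horizon:

```python
import math
import random

def multiplicative_jumps(theta, dt, steps, rng):
    """Approximate jumps of F(t) = 1 - exp(-X(t)) for a gamma subordinator X,
    using Gamma(theta*dt, 1) increments of X on a grid of mesh dt.  Each cell
    contributes the jump e^{-X(t-)} (1 - e^{-dX}); underflowed cells are skipped."""
    x, jumps = 0.0, []
    for _ in range(steps):
        dx = rng.gammavariate(theta * dt, 1.0)
        jump = math.exp(-x) * (1.0 - math.exp(-dx))
        if jump > 0.0:
            jumps.append(jump)
        x += dx
    return jumps

rng = random.Random(2)
probs = multiplicative_jumps(theta=1.0, dt=0.01, steps=5000, rng=rng)
total = sum(probs)   # telescopes to 1 - exp(-X(50)), i.e. very nearly 1
```

Sorting `probs` in decreasing order produces the arrangement used for the fragmentation law in this section.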
Theorem 5.1 is a consequence of Theorem 2.1, the easily checked formula which we use for α = (q + 1)( j − 1), and the next lemma.

Lemma 5.2
Assume that (60) holds and s^2 < ∞. Then the following is true: (62) holds for some constants a_0, a_1 > 0, together with its analogue valid for all x > 0 and some constants α_0, α_1, β_0 and β_1 which are not necessarily the same as in (60). Since N admits a representation as a sum over jump times, with the summation extending over all s > 0 with ΔX(s) > 0, we conclude that its expectation is expressed through the renewal function and the first-passage times T(x) := inf{t > 0 : X(t) > x}, x ≥ 0.
Similarly to (52), we have an expansion in which a_0^* is a known positive constant. Using this and (63), we infer a bound which proves the second inequality in (62). Arguing analogously, we obtain the bound proving the first inequality in (62).

(b) Write
As a preparation for the proof of part (b) we intend to show that Proof of (66) To reduce technicalities to a minimum we only consider the case q > 1.
by the Borel–Cantelli lemma. For each t ≥ 0, there exists ℓ ∈ ℕ_0 such that t ∈ [ℓ, ℓ + 1). Now we use the a.s. monotonicity of N(t) and N_2(t) to obtain a sandwich bound. Thus, it remains to check the limit relation along integers. In view of (63), f satisfies a counterpart of (15), whence the claim follows a.s. as ℓ → ∞. For the last equality we have used the strong law of large numbers for T(y).
We are ready to prove part (b). We shall use representation (65). Relation (66) entails the required negligibility for each T > 0. Thus, we are left with showing the convergence in the J_1-topology on D. The proof of this is similar to that of the weak convergence of the jth coordinate, j ≥ 2, in (22). The only difference is that, instead of (12), we use the corresponding functional limit theorem with a Brownian motion B; see Theorem 2a in [10].
(c) Since the proof is analogous to that of Lemma 4.2(b), we only give a sketch. In view of (65) it suffices to show that, as t → ∞, relations (69) and (70) hold.

Proof of (69) Arguing as in the proof of Lemma 4.2(b), we conclude that (69) is a consequence of (71) and (72). The second moment in (71) can be handled along the same lines as before.

The Poisson-Kingman model
Let (X(t))_{t≥0} be a subordinator as in Sect. 5, with the only differences that the parameters in (60) satisfy q ∈ (0, 2), q/2 < r_1, r_2 ≤ q, and that we additionally assume (73), where s = 2q when q ∈ (0, 3/2) and s = ε + q/(2 − q) for some ε > 0 when q ∈ [3/2, 2). The ranked sequence of jumps of the process (X(t)/X(1))_{t∈[0,1]} can be represented as P_j := L_j/L > 0, where L_1 ≥ L_2 ≥ ⋯ is the sequence of atoms of a non-homogeneous Poisson random measure with mean measure ν, and L := ∑_{j∈ℕ} L_j. This is the Poisson–Kingman construction [34, Section 3] of the probabilities (P_j)_{j∈ℕ}, which we regard as the fragmentation law underlying a nested occupancy scheme.

Theorem 6.1 Assume that the function x ↦ ν((x, ∞)) is strictly decreasing and continuous on (0, ∞). For the fragmentation law as described above, limit relation (10) holds with ω = q, γ = q/2, c = c_0, a = c_0^{1/2} and W(s) := B(s^q) for s ≥ 0 a time-changed Brownian motion.

Theorem 6.1 is a consequence of Theorem 2.1 and Lemma 6.2 given next.
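When the tail is exactly ν((x, ∞)) = c_0 x^{−q}, the ranked atoms can be generated through the inverse tail evaluated at the arrival times of a unit-rate Poisson process, a standard way to realise the Poisson–Kingman construction. The toy generator below is our own; note that summability of the atoms (hence a finite L) requires q ∈ (0, 1) here, and the series is truncated at n_atoms:

```python
import random

def poisson_kingman(c0, q, n_atoms, rng):
    """Ranked atoms of a Poisson random measure with nu((x, inf)) = c0 * x^(-q):
    L_j = (Gamma_j / c0)^(-1/q) with Gamma_j the arrival times of a unit-rate
    Poisson process; returns the normalised P_j = L_j / sum(L).  Truncation at
    n_atoms and the restriction q in (0, 1) are artefacts of this sketch."""
    gamma, atoms = 0.0, []
    for _ in range(n_atoms):
        gamma += rng.expovariate(1.0)
        atoms.append((gamma / c0) ** (-1.0 / q))
    total = sum(atoms)
    return [a / total for a in atoms]

rng = random.Random(9)
P = poisson_kingman(c0=1.0, q=0.5, n_atoms=2000, rng=rng)
```

Since the arrival times Γ_1 < Γ_2 < ⋯ are increasing and the inverse tail is decreasing, the atoms come out ranked, P_1 ≥ P_2 ≥ ⋯, as required of the representation above.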

Lemma 6.2
Under the assumptions of Theorem 6.1 the following is true: (74) holds for some constants α_2, α_3 > 0, q ∈ (0, 2), q/2 < r_3, r_4 ≤ q and β_2, β_3. Note that N(t) = 0 for t < 0. Further, put m(t) := ν((e^{−t}, ∞)) for t ∈ ℝ and note that m is a strictly increasing and continuous function with m(−∞) = 0. In view of (60), relation (75) holds for t ≥ 0, where α_0, α_1 > 0, q ∈ (0, 2), q/2 < r_1, r_2 ≤ q and β_0, β_1 < 0. Later, we shall need the following consequences of (75): (76) and (77), the latter holding uniformly on [0, s_0] for all s_0 > 0. For the latter we have also used Dini's theorem. The random process (N̂(t))_{t∈ℝ} is non-homogeneous Poisson. In particular, N̂(t) has a Poisson distribution with mean m(t). Let P := (P(t))_{t≥0} denote a homogeneous Poisson process of unit intensity. Throughout the proof we use the representation (N̂(t))_{t∈ℝ} = (P(m(t)))_{t∈ℝ}, which gives us a transition from P to N̂. The converse transition, namely that the arrival times of P are m(−log L_1), m(−log L_2), ..., is secured by our assumption that m is strictly increasing and continuous (this assumption is not needed to guarantee the direct transition).
(b) Having written we intend to show that each of the three terms on the right-hand side is O(t q ).
1st summand Recall that (P(t) − t)_{t≥0} is a martingale with respect to the natural filtration; this yields the first bound. Note that (73) entails E[log_+ L]^{2q} < ∞. Thus, the first term on the right-hand side of (84) is O(1) by (76) and Markov's inequality. Using (80) in combination with E[log_+ L]^{2q} < ∞, we conclude that the second term on the right-hand side of (84) is O(t^q).

3rd summand Appealing to (74), we already know from the previous part of the proof that the second and the third summands on the right-hand side are O(t^q). As for the first summand, we use (78) to bound it via the expression (P(m(s − log L)) − P(m(s)))^2 (1 + P(m(s − log L_1)) − P(m(s)))^2. The last expression is O(t^q), which can be seen by mimicking the arguments used in the previous part of the proof.
(c) A specialization of the functional limit theorem for renewal processes with finite variance (see, for instance, Theorem 3.1 on p. 162 in [26]) yields

((P(tu) − tu)/t^{1/2})_{u≥0} ⟹ (B(u))_{u≥0}, t → ∞, (85)

in the J_1-topology on D, where B is a Brownian motion.
It is well known (see, for instance, Lemma 2.3 on p. 159 in [26]) that the composition mapping (x, ϕ) ↦ x ∘ ϕ is continuous at continuous functions x : ℝ_+ → ℝ and nondecreasing continuous functions ϕ : ℝ_+ → ℝ_+. This observation taken together with (85) and (77) enables us to conclude that (86) holds in the J_1-topology on D. Noting that, for all s_0 > 0, sup_{s∈[0,s_0]} |(s − t^{−1} log L) − s| = t^{−1}|log L| → 0 a.s. as t → ∞, and applying the aforementioned result on compositions to (86), we infer (87) in the J_1-topology on D.
It remains to prove that in (87) we can replace m(t · − log L) by c 0 (t·) q . To this end, it is enough to show that, for all s 0 > 0,