Superdiffusive limits for deterministic fast–slow dynamical systems

We consider deterministic fast–slow dynamical systems on $\mathbb{R}^m\times Y$ of the form
$$\begin{aligned} {\left\{ \begin{array}{ll} x_{k+1}^{(n)} = x_k^{(n)} + n^{-1} a\big(x_k^{(n)}\big) + n^{-1/\alpha} b\big(x_k^{(n)}\big) v(y_k), \\ y_{k+1} = f(y_k), \end{array}\right.} \end{aligned}$$
where $\alpha \in (1,2)$.
Under certain assumptions we prove convergence of the $m$-dimensional process $X_n(t) = x_{\lfloor nt \rfloor}^{(n)}$ to the solution of the stochastic differential equation
$$\begin{aligned} \mathrm{d}X = a(X)\,\mathrm{d}t + b(X) \diamond \mathrm{d}L_\alpha, \end{aligned}$$
where $L_\alpha$ is an $\alpha$-stable Lévy process and $\diamond$ indicates that the stochastic integral is in the Marcus sense. In addition, we show that our assumptions are satisfied for intermittent maps $f$ of Pomeau–Manneville type.


Introduction
Averaging and homogenisation for systems with multiple timescales is a longstanding and very active area of research [34]. We focus particularly on homogenisation, where the limiting equation is a stochastic differential equation (SDE). Recently there has been considerable interest in the case where the underlying multiscale system is deterministic; see [9–11,16,20,21,24,32,35] as well as our survey paper [8]. Almost all of this previous research has been concerned with the case where the limiting SDE is driven by Brownian motion. Here, we consider the case where the limiting SDE is driven by a superdiffusive $\alpha$-stable Lévy process.
Let $\alpha \in (1,2)$. The multiscale equations that we are interested in have the form
$$\begin{aligned} {\left\{ \begin{array}{ll} x_{k+1}^{(n)} = x_k^{(n)} + n^{-1} a\big(x_k^{(n)}\big) + n^{-1/\alpha} b\big(x_k^{(n)}\big) v(y_k), \\ y_{k+1} = f(y_k). \end{array}\right.} \end{aligned} \quad (1.1)$$
It is assumed that the fast dynamical system $f : Y \to Y$ has an ergodic invariant probability measure $\mu$ and exhibits superdiffusive behaviour; specific examples of such $f$ are described below. Let $v : Y \to \mathbb{R}^d$ be Hölder with $\int_Y v \, \mathrm{d}\mu = 0$. Define, for $n \ge 1$,
$$W_n(t) = n^{-1/\alpha} \sum_{j=0}^{\lfloor nt \rfloor - 1} v(y_j). \quad (1.2)$$
Then $W_n$ belongs to $D([0,1], \mathbb{R}^d)$, the Skorokhod space of càdlàg functions, and can be viewed as a random process on the probability space $(Y, \mu)$ depending on the initial condition $y_0 \in Y$. As $n \to \infty$, the sequence of random variables $W_n(1)$ converges weakly in $\mathbb{R}^d$ to an $\alpha$-stable law, and the process $W_n$ converges weakly in $D([0,1], \mathbb{R}^d)$ to the corresponding $\alpha$-stable Lévy process $L_\alpha$.

Now consider $x^{(n)}_0 = \xi_n \in \mathbb{R}^m$, and solve (1.1) to obtain $(x^{(n)}_k, y_k)_{k \ge 0}$ depending on the initial condition $y_0 \in (Y, \mu)$. Define the càdlàg process $X_n \in D([0,1], \mathbb{R}^m)$ given by $X_n(t) = x^{(n)}_{\lfloor nt \rfloor}$; again we view this as a process on $(Y, \mu)$. Our aim is to show, under mild regularity assumptions on the functions $a : \mathbb{R}^m \to \mathbb{R}^m$ and $b : \mathbb{R}^m \to \mathbb{R}^{m \times d}$, that $X_n \to_w X$ where $X$ is the solution of the SDE
$$\mathrm{d}X = a(X)\,\mathrm{d}t + b(X) \diamond \mathrm{d}L_\alpha, \quad X(0) = \xi, \quad (1.3)$$
and $\xi = \lim_{n\to\infty} \xi_n$. Here, $\diamond$ indicates that the SDE is in the Marcus sense [29] (see [2,5,25] for the general theory of Marcus SDEs and their applications).

Previously such a result was shown by Gottwald and Melbourne [16, Section 5] in the special case $d = m = 1$. Generally, the method in [16] works provided the noise is exact, that is $d = m$ and $b = (Dr)^{-1}$ for some diffeomorphism $r : \mathbb{R}^m \to \mathbb{R}^m$, but it cannot handle the general situation considered here, where the noise term is typically not exact. There are three main complications:

(1) In the case of exact noise, it is possible to reduce to the case $b \equiv \mathrm{id}$ by a change of coordinates, similar to Wong–Zakai [45]. The general situation necessitates the use of alternative tools such as rough paths.
In particular, weak convergence of $W_n$ is no longer sufficient and we require in addition that $W_n$ is tight in $p$-variation. This is shown in Theorem 1.3 below for specific examples, and in Sect. 6 for a large class of deterministic dynamical systems $f : Y \to Y$.

(2) Since the results for exact noise are achieved by a change of coordinates, the sense of convergence for $W_n$ is inherited by $X_n$. However, in general, even if $W_n \to_w L_\alpha$ in one of the standard Skorokhod topologies [40], this need not be the case for $X_n$. This phenomenon already appears in the simplest situations, as illustrated in Example 1.4. Hence we have to consider convergence of $X_n$ in generalised Skorokhod topologies as introduced recently in Chevyrev and Friz [7].

(3) Rigorous results on convergence to $d$-dimensional stable Lévy processes in deterministic dynamical systems are only available for $d = 1$; see [1,22,33,42]. Hence one of the aims of this paper is to extend the dynamical systems theory to cover the case $d \ge 2$. See Theorem 1.1 below for instances of this, and Sect. 6 for a general treatment.
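For concreteness, the recursion (1.1) is straightforward to iterate numerically. The following is a minimal sketch in the scalar case $d = m = 1$; the particular choices of $a$, $b$, $v$ and the fast map $f$ (a doubling map) are illustrative placeholders, not taken from the paper.

```python
def fast_slow(n, a, b, v, f, x0=0.0, y0=0.1, alpha=2.0):
    """Iterate the fast-slow recursion (1.1) in the scalar case d = m = 1:
    x_{k+1} = x_k + n^{-1} a(x_k) + n^{-1/alpha} b(x_k) v(y_k),
    y_{k+1} = f(y_k), for k = 0, ..., n-1; returns x_n = X_n(1)."""
    x, y = x0, y0
    eps = n ** (-1.0 / alpha)
    for _ in range(n):
        x = x + a(x) / n + eps * b(x) * v(y)
        y = f(y)
    return x

# Illustrative placeholders (not from the paper): doubling map as the fast
# dynamics, zero drift a, constant b, and a mean-zero observable v.
doubling = lambda y: (2.0 * y) % 1.0
x_final = fast_slow(4, a=lambda x: 0.0, b=lambda x: 1.0,
                    v=lambda y: y - 0.5, f=doubling)
```

In practice one would replace the doubling map by a superdiffusive fast map such as the intermittent maps discussed below.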
In the remainder of the introduction, we discuss some of the issues associated with these three complications. We also mention some examples of fast dynamical systems that lead to superdiffusive behaviour. The archetypal such dynamical systems are the intermittent maps introduced by Pomeau and Manneville [37]. Perhaps the simplest example [27] is the map $f : Y \to Y$, $Y = [0,1]$, with a neutral fixed point at $0$:
$$f(y) = {\left\{ \begin{array}{ll} y\big(1 + 2^{1/\alpha} y^{1/\alpha}\big), & y \in [0, 1/2), \\ 2y - 1, & y \in [1/2, 1]. \end{array}\right.} \quad (1.4)$$
See Fig. 1a. Here, $\alpha > 0$ is a real parameter and there is a unique absolutely continuous invariant probability measure $\mu$ for $\alpha > 1$. Let $v : Y \to \mathbb{R}$ be Hölder with $\int_Y v \, \mathrm{d}\mu = 0$ and $v(0) \ne 0$, and define $W_n$ as in (1.2). For $\alpha \in (1,2)$ it was shown by Gouëzel [17] (see also [46]) that $W_n(1)$ converges in distribution to an $\alpha$-stable law. By Melbourne and Zweimüller [33], the process $W_n$ converges weakly to the corresponding Lévy process $L_\alpha$ in the $M_1$ Skorokhod topology on $D([0,1], \mathbb{R})$.

Now let $d \ge 2$. There are two versions of the $M_1$ topology on $D([0,1], \mathbb{R}^d)$; see [43, Chapter 3.3]. In this paper we use the strong topology $SM_1$. For $v : Y \to \mathbb{R}^d$ Hölder with $\int_Y v \, \mathrm{d}\mu = 0$ and $v(0) \ne 0$, we prove convergence of $W_n$ to a $d$-dimensional Lévy process $L_\alpha$ in the $SM_1$ topology.
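A direct implementation of this map, and of the Birkhoff sums defining $W_n(1)$, may be useful for experimentation. The branch formula below is the standard Liverani–Saussol–Vaienti form with exponent $1/\alpha$, which is our reading of (1.4); treat it as an assumption.

```python
def intermittent_map(y, alpha):
    """Pomeau-Manneville intermittent map with a neutral fixed point at 0;
    the branch formula is the standard Liverani-Saussol-Vaienti form
    y (1 + (2y)^{1/alpha}) on [0, 1/2), and 2y - 1 on [1/2, 1]."""
    if y < 0.5:
        return y * (1.0 + (2.0 * y) ** (1.0 / alpha))
    return 2.0 * y - 1.0

def birkhoff_sum(y0, n, v, alpha):
    """W_n(1) = n^{-1/alpha} * sum_{j < n} v(y_j) along the orbit of y0."""
    total, y = 0.0, y0
    for _ in range(n):
        total += v(y)
        y = intermittent_map(y, alpha)
    return n ** (-1.0 / alpha) * total
```

Note the neutral fixed point: orbits starting near $0$ escape only slowly, which is the mechanism behind the heavy-tailed (superdiffusive) behaviour of $W_n(1)$.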
The example (1.4) is somewhat oversimplified for our purposes since $L_\alpha$ is essentially one-dimensional, being supported on the line $\{c\, v(0) : c \in \mathbb{R}\}$. This structure can be exploited in proving that $W_n \to_w L_\alpha$, though it is not clear if this simplifies the homogenisation result $X_n \to_w X$. To illustrate that we do not rely on one-dimensionality of the limiting process in any way, we consider an example with two neutral fixed points. (It is straightforward to extend to maps with a larger number of neutral fixed points.) Accordingly, our main example is the intermittent map $f : Y \to Y$, $Y = [0,1]$, with two symmetric neutral fixed points at $0$ and $1$:
$$f(y) = {\left\{ \begin{array}{ll} y\big(1 + 2^{1/\alpha} y^{1/\alpha}\big), & y \in [0, 1/2), \\ 1 - (1-y)\big(1 + 2^{1/\alpha} (1-y)^{1/\alpha}\big), & y \in [1/2, 1]. \end{array}\right.} \quad (1.5)$$
See Fig. 1b. Again $\alpha > 0$ is a real parameter, there is a unique absolutely continuous invariant probability measure $\mu$ for $\alpha > 1$, and we restrict to the range $\alpha \in (1,2)$. As part of a result for a general class of nonuniformly expanding maps (Sect. 6) we prove:

Theorem 1.1 Consider the intermittent map (1.4) or (1.5) with $\alpha \in (1,2)$ and let $v : Y \to \mathbb{R}^d$ be Hölder with $\int_Y v \, \mathrm{d}\mu = 0$ and $v(0) \ne 0$, also $v(1) \ne 0$ in the case of (1.5). Let $P$ be any probability measure on $Y$ that is absolutely continuous with respect to Lebesgue, and regard $W_n$ as a process on $(Y, P)$. Then $W_n \to_w L_\alpha$ in $D([0,1], \mathbb{R}^d)$ with the $SM_1$ topology, where $L_\alpha$ is a $d$-dimensional $\alpha$-stable Lévy process.
Remark 1.2 The limiting process $L_\alpha$ is explicitly identified in Sect. 6.2.
In the context of [16], the conclusion $W_n \to_w L_\alpha$ was sufficient to prove the homogenisation result $X_n \to_w X$. This is not the case for general noise, and we require tightness in $p$-variation. For $1 \le p < \infty$, recall that the $p$-variation of $u : [0,1] \to \mathbb{R}^d$ is given by
$$\|u\|_{p\text{-var}} = \sup_{0 = t_0 < t_1 < \cdots < t_k = 1} \Big( \sum_{j=0}^{k-1} \big| u(t_{j+1}) - u(t_j) \big|^p \Big)^{1/p},$$
where $|\cdot|$ denotes the Euclidean norm on $\mathbb{R}^d$. The main abstract result in this paper states that the properties established in Theorems 1.1 and 1.3 are the key ingredients required to solve the homogenisation problem. Informally: Consider the fast–slow system (1.1) and define $W_n$ as in (1.2) and $X_n(t) = x^{(n)}_{\lfloor nt \rfloor}$ with $x^{(n)}_0 = \xi_n$. Suppose that $\lim_{n\to\infty} \xi_n = \xi$, that $W_n \to_w L_\alpha$, an $\alpha$-stable Lévy process, in $D([0,1], \mathbb{R}^d)$ with the $SM_1$ topology, and that $\|W_n\|_{p\text{-var}}$ is tight for all $p > \alpha$.
If $v$ is bounded and $a$, $b$ are sufficiently smooth, then $X_n \to_w X$, the solution of (1.3). We give a rigorous formulation of this result in Theorem 2.6. (In the above statement we assume that the limiting process is Lévy only for convenience; the result holds true for an arbitrary limiting process, as seen from Theorem 2.6.) To complete the statement, it is necessary to describe the topology on $D([0,1], \mathbb{R}^m)$ in which $X_n$ converges. As already indicated, the $SM_1$ topology is too strong in general. The next example illustrates where the problem lies.
Example 1.4 It is easy to see that $W_n$ converges to $\theta\, 1_{[1/2,1]}$ in the $M_1$ topology as $n \to \infty$, and that $(X_n^1, X_n^2) = (\cos W_n, \sin W_n)$. The process $X_n$ converges pointwise to
$$X = (1, 0)\, 1_{[0,1/2)} + (\cos\theta, \sin\theta)\, 1_{[1/2,1]}.$$
In particular, if $\theta = 2\pi$, then $X \equiv (1,0)$ is continuous. At the same time, $X_n$ fails to converge in any of the Skorokhod topologies.
The problem outlined in Example 1.4 arises naturally in the fast–slow system (1.1). Figure 3 illustrates a realisation of $W_n$ and $X_n$ for $d = m = 2$ and the map (1.5). Note that, although $W_n$ appears to converge in $SM_1$ in accordance with Theorem 1.1, $X_n$ moves along the integral curves of a vector field, and thus does not approximate its limit in $SM_1$.

Topologies naturally suited for convergence in Example 1.4 were recently introduced in [7]. These topologies are a generalisation of the Skorokhod $SM_1$ topology which allows for convenient control of differential equations. Briefly, jumps of a càdlàg process are interpreted as instantaneous travel along prescribed continuous paths which depend only on the start and end points of the jump. The full "path space" thus becomes the set of pairs $(X, \phi)$, where $X : [0,1] \to \mathbb{R}^d$ is a càdlàg path and $\phi$ is a so-called path function [6] which maps each jump $(X(t-), X(t))$ to a continuous path from $X(t-)$ to $X(t)$. It is often convenient to fix $\phi$, which in turn determines a topology on this path space; we give details in Sects. 2 and 3.

The paper is organised as follows. In Sect. 2, we introduce the necessary prerequisites on generalised Skorokhod topologies and Marcus differential equations in order to state rigorously our main abstract result, Theorem 2.6. The proof is given at the end of Sect. 3, after introducing the necessary results from rough path theory. In Sects. 4 to 6, we show that a class of nonuniformly expanding dynamical systems, including (1.4) and (1.5), satisfies the conclusions of Theorems 1.1 and 1.3, which are in turn the main hypotheses of Theorem 2.6. Section 4 deals with a class of uniformly expanding maps known as Gibbs–Markov maps, and Sect. 5 provides the inducing step to pass from uniformly expanding maps to nonuniformly expanding maps. In Sect. 6, we apply the results of Sects. 4 and 5 to the intermittent maps (1.4) and (1.5).
The precise result on homogenisation of the system (1.1) with fast dynamics given by either (1.4) or (1.5) is stated in Corollary 6.4.

Notation
We use "big O" and notation interchangeably, writing a n = O(b n ) or a n b n if there is a constant C > 0 such that a n ≤ Cb n for all sufficiently large n. As usual, a n = o(b n ) means that lim n→∞ a n /b n = 0 and a n ∼ b n means that lim n→∞ a n /b n = 1.

Setup and result
In this section, we collect the material necessary to formulate our main abstract result Theorem 2.6.
Let $\Lambda$ denote the set of all increasing bijections $\lambda : [0,1] \to [0,1]$ and let $\mathrm{id} \in \Lambda$ denote the identity map $\mathrm{id}(t) = t$. For $X_1, X_2 \in D$, let $\sigma_\infty(X_1, X_2)$ be the Skorokhod distance
$$\sigma_\infty(X_1, X_2) = \inf_{\lambda \in \Lambda} \Big( \sup_{t \in [0,1]} |\lambda(t) - t| \ \vee\ \sup_{t \in [0,1]} \big| X_1(\lambda(t)) - X_2(t) \big| \Big).$$
The topology on $D$ induced by $\sigma_\infty$ is known as the strong $J_1$, or $SJ_1$, topology.
Another important topology on $D$ is the strong $M_1$, or $SM_1$, topology, defined as follows. For $X \in D$, consider the "completed" graph of $X$, i.e. the graph of $X$ together with the vertical segments joining $(t, X(t-))$ and $(t, X(t))$ at each jump time $t$.

Remark 2.2
We often keep implicit the interval $[0,1]$ and $\mathbb{R}^d$, as well as $J$, when they are clear from the context. We allow $J$ to be a strict subset of $\mathbb{R}^d \times \mathbb{R}^d$ since this case arises naturally when considering driver–solution pairs for canonical differential equations; see the final discussion in Sect. 2.3.

Definition 2.3 The linear path function $\ell^d$ on $\mathbb{R}^d$ is given by $\ell^d(x, y)(t) = (1-t)\,x + t\,y$ for $t \in [0,1]$.
Fix a sequence $r_1, r_2, \ldots > 0$ with $\sum_j r_j < \infty$. Given $(X, \phi)$ and $\delta > 0$, let $X^{\phi,\delta} \in C([0,1], \mathbb{R}^d)$ denote the continuous version of $X$, where the $k$-th largest jump is made continuous using $\phi$ on a fictitious time interval of length $\delta r_k$. More precisely:
• Let $m \ge 0$ be the number of jumps (possibly infinite) of $X$; we order the jumps by decreasing magnitude.
• Let $r = \sum_{j=1}^m r_j$ and define the corresponding time change.
Note that $\|(X, \phi)\|_{p\text{-var}}$ is well defined since $\|X^{\phi,1}\|_{p\text{-var}}$ depends on neither the parametrisation of $\phi$ nor the sequence $\{r_k\}$. This yields a metric on $D^{p\text{-var}}$ [7, Remark 3.8].
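The construction of the continuous version can be made concrete for real-valued step paths. The helper below is a hypothetical sketch (the names and conventions are ours, not from [7]): it opens up each jump into a fictitious time interval of length $\delta r_k$ with $r_k = 2^{-k}$, bridges it with the linear path function, and rescales time back to $[0,1]$.

```python
def continuous_version(jump_times, jump_values, x0=0.0, delta=1.0):
    """Sketch (our own helper, not from [7]) of the continuous version
    X^{phi, delta} of a real-valued piecewise-constant cadlag path on [0, 1],
    with phi the linear path function.  The k-th largest jump is opened up
    into a fictitious time interval of length delta * r_k, r_k = 2^{-k};
    afterwards the extended time axis is rescaled back to [0, 1].  Returns
    the knots (t_i, x_i) of the resulting piecewise-linear continuous path."""
    # rank the jumps by magnitude: the largest jump gets r_1 = 1/2
    sizes, x = [], x0
    for xv in jump_values:
        sizes.append(abs(xv - x))
        x = xv
    order = sorted(range(len(sizes)), key=lambda i: -sizes[i])
    r = {i: 2.0 ** (-(k + 1)) for k, i in enumerate(order)}

    knots, x, shift = [(0.0, x0)], x0, 0.0
    for i, (t, xv) in enumerate(zip(jump_times, jump_values)):
        knots.append((t + shift, x))   # constant piece up to the jump time
        shift += delta * r[i]          # open a fictitious interval
        knots.append((t + shift, xv))  # linear bridge across the jump
        x = xv
    knots.append((1.0 + shift, x))     # final constant piece
    total = 1.0 + shift
    return [(t / total, v) for t, v in knots]
```

For a single unit jump at time $1/2$ this produces a path that is constant, then climbs linearly over the (rescaled) fictitious interval, then is constant again.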

Marcus differential equations
Note that our notation is slightly non-standard since $b \in C^N$ for $N \in \mathbb{N}$ implies only that the $(N-1)$-th derivative of $b$ is Lipschitz rather than continuous.
Under these conditions, we can define and solve (in a purely deterministic way) a Marcus-type differential equation
$$\mathrm{d}X = a(X)\,\mathrm{d}t + b(X) \diamond \mathrm{d}W, \quad X(0) = \xi. \quad (2.2)$$
The solution is obtained as follows from the theory of continuous rough differential equations (RDEs) in the Young regime [12,14,28]. Consider the càdlàg path $t \mapsto (t, W(t))$ in $\mathbb{R}^{1+d}$, bridge each of its jumps linearly over a fictitious time interval, and solve the continuous RDE driven by the resulting path. Removing the fictitious time intervals then yields the solution $X$ of (2.2). We discuss a more general interpretation of this equation in Sect. 3.2.
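The three-step construction just described can be summarised schematically as follows; the symbols $Z$ and $\hat X$ for the space-time driver and the solution of the bridged continuous RDE are our notation, not the paper's.

```latex
% Sketch of the solution scheme for the Marcus-type equation (2.2).
% 1. Form the space-time driver and bridge its jumps linearly in
%    fictitious time, obtaining a continuous path:
\[
  Z(t) = \bigl(t,\, W(t)\bigr) \in \mathbb{R}^{1+d},
  \qquad
  Z^{\ell^{1+d},\,1} \in C\bigl([0,1], \mathbb{R}^{1+d}\bigr).
\]
% 2. Solve the continuous Young RDE driven by the bridged path:
\[
  \mathrm{d}\hat{X} = \bigl(a(\hat{X})\;\; b(\hat{X})\bigr)\,
  \mathrm{d}Z^{\ell^{1+d},\,1},
  \qquad \hat{X}(0) = \xi.
\]
% 3. Remove the fictitious time intervals from \hat{X} to recover the
%    cadlag solution X of (2.2).
```

Here $\bigl(a(\hat X)\;\; b(\hat X)\bigr)$ denotes the $m \times (1+d)$ coefficient matrix acting on the increments of the space-time driver.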

Remark 2.4
In the case that W is a semimartingale, one can verify that X is the solution to the classical Marcus SDE (see [7,Proposition 4.16] for the general case p > 2 but with stronger regularity assumptions on a, b; the proof carries over to our setting without change).
To properly describe solutions of (2.2) and the regularity of the solution map $W \mapsto X$, it is not enough to look at $X$ as an element of $D([0,1], \mathbb{R}^m)$. As in Example 1.4, one may have $X \equiv 0$, say, but with sizeable jumps in fictitious time.

Following [7], we consider the driver–solution space $D([0,1], \mathbb{R}^{d+m})$, made to contain the pairs $(W, X)$, and introduce a new path function on $\mathbb{R}^{d+m}$.
We define the path function $\phi_b$ on $\mathbb{R}^{d+m}$ by
$$\phi_b\big((w, x), (w', x')\big)(t) = \big(w + t(w' - w),\, z(t)\big), \qquad \dot z = b(z)(w' - w),\quad z(0) = x,$$
which is defined on the set $J$ of pairs $\big((w, x), (w', x')\big)$ with $x' = z(1)$. The path function $\phi_b$ describes how the discontinuities of $(W, X)$ are traversed in fictitious time.

Main abstract result
Now we are ready for a rigorous formulation of the main abstract result.

Theorem 2.6 Consider the fast–slow system (1.1) with initial condition $x^{(n)}_0 = \xi_n$. Suppose that $\lim_{n\to\infty} \xi_n = \xi$, that $W_n \to_w L$ in $D([0,1], \mathbb{R}^d)$ with the $SM_1$ topology, and that $\|W_n\|_{p\text{-var}}$ is tight for all $p > \alpha'$. Then, for all $p > \alpha'$, it holds that $\|L\|_{p\text{-var}} < \infty$ a.s. and $X_n \to_w X$, the solution of the Marcus equation (2.4), in the generalised sense described below.

The proof of Theorem 2.6 is given at the end of Sect. 3.

Remark 2.7 (a)
The property $\|L\|_{p\text{-var}} < \infty$ a.s. together with $\gamma > \alpha'$ guarantees that the Marcus equation (2.4) admits a unique solution for a.e. realisation of $L$. In our applications, $L$ is an $\alpha$-stable Lévy process, for which the finiteness of $\|L\|_{p\text{-var}}$ is classical, and we take $\alpha' = \alpha$. We introduce the parameter $\alpha'$ to highlight that the threshold for the value of $p$ in the second condition of Theorem 2.6 need not be the same $\alpha$ as in (1.2). (b) The drift vector field $a$ plays no role in the definition of $\phi_b$. This is expected since the driver $V_n(t) = n^{-1} \lfloor tn \rfloor$ corresponding to $a$ in the RDE solved by $X_n$ (see the proof of Theorem 2.6 below) converges in $q$-variation for every $q > 1$ to a process with no jumps. (c) Since the limiting process $L$ in general has jumps, it is crucial that we pair $(L, X)$ with the path function $\phi_b$. In contrast, the jumps of $(W_n, X_n)$ are of magnitude at most of order $n^{-1/\alpha}$, so $(W_n, X_n)$ is almost a continuous path for large $n$; we make the reference to $\ell^{d+m}$ only for convenience (cf. (3.10) below).
Recall that a stochastic process $(L_t)_{t \in [0,1]}$ is called stochastically continuous if, for all $t \in [0,1]$, $L_s \to L_t$ in probability as $s \to t$. Note that Lévy processes are stochastically continuous by definition.

Corollary 2.8
In the setting of Theorem 2.6, suppose further that the process $L$ is stochastically continuous. Then $X_n \to X$ in the sense of finite-dimensional distributions.

Remark 2.9
As in Example 1.4, we do not expect that $X_n \to_w X$ in any of the Skorokhod topologies, or that $f(X_n) \to_w f(X)$ for certain standard functionals $f : D \to \mathbb{R}$ that are continuous with respect to the Skorokhod topologies, such as $f(X) = \|X\|_\infty$. Instead we have, for example, that $\|\tilde X_n\|_\infty \to_w \|\tilde X\|_\infty$, where $\tilde X_n$ and $\tilde X$ are the corresponding components of the continuous paths $(W_n, X_n)^{\ell^{d+m},1}$ and $(W, X)^{\phi_b,1}$.

Rough path formulation
In this section we expand the material in Sect. 2 in order to formulate and prove an abstract convergence result, Theorem 3.4, from which Theorem 2.6 follows.

Generalised SM 1 topologies with mixed variation
We use a modified version of the topologies from [7] suitable for handling differential equations with drift. We continue using notation from Sect. 2.
Let $C^{(q,p)\text{-var}}$ be the set of $u \in D^{(q,p)\text{-var}}$ which are continuous. We furthermore denote $|||u|||_{(q,p)\text{-var}} = |u(0)| + \|u\|_{(q,p)\text{-var}}$. Given $(X_1, \phi_1)$ and $(X_2, \phi_2)$, let
$$\alpha_\infty\big((X_1, \phi_1), (X_2, \phi_2)\big) = \lim_{\delta \to 0} \sigma_\infty\big(X_1^{\phi_1,\delta},\, X_2^{\phi_2,\delta}\big).$$
Following [7, Lemma 2.7], the limit exists, is independent of the choice of the sequence $r_k$, and is invariant under reparametrisation of the path functions. In particular, $\alpha_\infty$ induces a pseudometric on $D$.
As before, note that $\|(X, \phi)\|_{(q,p)\text{-var}}$ is well defined since $\|X^{\phi,1}\|_{(q,p)\text{-var}}$ depends on neither the parametrisation of $\phi$ nor the sequence $\{r_k\}$. This yields a well-defined metric on $D^{(q,p)\text{-var}}$ (cf. [7, Remark 3.8]).
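For a path observed at finitely many times, the supremum defining the $p$-variation is attained on a subsequence of the sample points, so it can be computed exactly by an $O(n^2)$ dynamic programme. A sketch, not part of the paper's argument:

```python
def p_variation(xs, p):
    """Exact p-variation of a discretely sampled path xs (floats, or tuples
    compared in the Euclidean norm): the supremum over partitions of
    (sum of |increments|^p)^(1/p).  An optimal partition only uses sample
    points, so best[j] -- the maximal sum of p-th powers of increments over
    subsequences ending at index j -- satisfies a simple recursion."""
    def dist(u, w):
        if isinstance(u, tuple):
            return sum((a - b) ** 2 for a, b in zip(u, w)) ** 0.5
        return abs(u - w)

    n = len(xs)
    best = [0.0] * n
    for j in range(1, n):
        best[j] = max(best[i] + dist(xs[i], xs[j]) ** p for i in range(j))
    return best[n - 1] ** (1.0 / p)
```

For $p = 1$ this is the total variation; for oscillating paths and larger $p$, coarser partitions can dominate, which is exactly why tightness in $p$-variation is a nontrivial requirement.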

Differential equations with càdlàg drivers
Suppose $1 \le q \le p < 2$ and that $b \in C^{\beta,\gamma}$ with $\beta > q$ and $\gamma > p$ such that condition (3.1) holds; see [14, Remark 12.7] for a discussion of condition (3.1). In our applications, we will consider $\beta > 1$ and $\gamma > p$ as fixed, and $q = 1 + \kappa$ for $\kappa > 0$ arbitrarily small. In this case condition (3.1) is always attained by taking $\kappa$ sufficiently small, which explains why it does not appear in Theorem 2.6.
Recall that under these conditions one can solve the equation $\mathrm{d}X = b(X) * \mathrm{d}W$. Here, $*$ stands for one of the different ways to interpret a differential equation in the presence of discontinuities, which in general result in different solutions $X$. Two common choices (considered in the case $q = p$ by Williams [44] and studied further in [6,7,13,15]) are:

• Geometric (Marcus) RDE. The solution is completely analogous to that of (2.2): we solve the continuous RDE $\mathrm{d}\hat X = b(\hat X)\,\mathrm{d}W^{\phi,1}$, where $\phi = \ell^{1+d}$ is the linear path function on $\mathbb{R}^{1+d}$, and then remove the fictitious time intervals (note that the RDE is well-posed since $\|W^{\phi,1}\|_{(q,p)\text{-var}} \lesssim \|W\|_{(q,p)\text{-var}}$ by Chevyrev [6, Corollary A.6]). For geometric RDEs we use the notation $\mathrm{d}X = b(X) \diamond \mathrm{d}W$.

• Forward RDE. The solution satisfies the integral equation
$$X(t) = X(0) + \int_0^t b(X(s-))\,\mathrm{d}W(s), \quad (3.4)$$
where the integral is understood as a limit of Riemann–Stieltjes sums with $b(X(s-))$ evaluated at the left limit points of the partition intervals:
$$\int_0^t b(X(s-))\,\mathrm{d}W(s) = \lim_{|\mathcal{P}| \to 0} \sum_{[s,s'] \in \mathcal{P}} b(X(s-))\big(W(s') - W(s)\big).$$
Here, $\mathcal{P}$ are partitions of $[0,t]$ into intervals, and $|\mathcal{P}|$ is the size of the longest interval. For forward RDEs we use the notation $\mathrm{d}X = b(X)\,{}^-\mathrm{d}W$.

Remark 3.2 Geometric RDEs use linear paths to connect the endpoints of each jump. As mentioned in the introduction, this has been generalised in [7], allowing one to solve
$$\mathrm{d}X = b(X) \diamond_\phi \mathrm{d}W \quad (3.5)$$
for a general path function $\phi$. The interpretation is as for geometric RDEs: we construct a continuous path, solve the canonical RDE $\mathrm{d}\hat X = b(\hat X)\,\mathrm{d}W^{\phi,1}$, and then remove fictitious time intervals. Then $((W, X), \phi_b)$ is well defined, where $\phi_b$ is the path function on $\mathbb{R}^{1+d+m}$ as in Definition 2.5 with the linear path function replaced by $\phi$, and the solution map of (3.5) is locally Lipschitz continuous. (These results were shown in [7, Theorem 3.13] for $q = p$, but the same proof applies mutatis mutandis to the general case upon using the RDE-with-drift estimates [14, Theorem 12.10]. In fact one can allow rough path drivers in $\mathbb{R}^{d'+d}$ with finite $(q,p)$-variation for arbitrary $p, q \ge 1$ satisfying $p^{-1} + q^{-1} > 1$. We consider only $d' = 1$ and $1 \le q \le p < 2$ since this suffices for our purposes.)

Convergence of forward RDEs to geometric RDEs
For the remainder of this section, fix $1 \le q \le p < 2$ and $\beta > q$, $\gamma > p$ such that (3.1) holds. Suppose that $W \in D^{(q,p)\text{-var}}([0,1], \mathbb{R}^{1+d})$ and $b \in C^{\beta,\gamma}$. Then for every $\xi \in \mathbb{R}^m$, the geometric RDE $\mathrm{d}X = b(X) \diamond \mathrm{d}W$, $X(0) = \xi$, admits a unique solution. Suppose now that $W$ has finitely many jumps at times $0 < t_1 < \cdots < t_n \le 1$. Then the solution $X$ of the forward RDE can be obtained by solving the canonical RDE on each of the intervals $[0, t_1), [t_1, t_2), \ldots, [t_n, 1)$ (on which $W$ is continuous), and requiring that at the jump times
$$X(t_j) = X(t_j-) + b\big(X(t_j-)\big)\,\Delta W(t_j), \qquad \Delta W(t_j) = W(t_j) - W(t_j-). \quad (3.6)$$
Hence, in the case that $W$ has finitely many jumps, it is straightforward to construct the solution $X$ first on $[0, t_1)$, then at $t_1$, then on $[t_1, t_2)$ and so on. As we shall see, this construction furthermore allows for an easy extension of stability results of continuous RDEs to the setting with jumps.
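In summary, the two interpretations differ only in how a jump $\Delta W(t_j) = W(t_j) - W(t_j-)$ is transmitted to the solution; schematically (the flow notation $\varphi$ is ours):

```latex
% Jump rules for the two interpretations:
\[
  \text{forward:}\qquad
  X(t_j) = X(t_j-) + b\bigl(X(t_j-)\bigr)\,\Delta W(t_j),
\]
\[
  \text{geometric (Marcus):}\qquad
  X(t_j) = \varphi^{1}\bigl(X(t_j-)\bigr),
  \qquad
  \tfrac{\mathrm{d}}{\mathrm{d}s}\varphi^{s} = b(\varphi^{s})\,\Delta W(t_j),
  \quad \varphi^{0} = X(t_j-).
\]
% Heuristically, the two rules agree to first order in \Delta W(t_j), with
% a discrepancy of order |\Delta W(t_j)|^2 for smooth b.
```

This first-order agreement is the heuristic reason why forward solutions closely track geometric solutions when all jumps are small.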

Remark 3.3
The construction of the forward solution for processes with infinitely many discontinuities is more involved, and can be achieved by solving directly the integral equation (3.4). This is done in [15] but is not required here.
Recall that $\phi_b$ is the path function on $\mathbb{R}^{1+d+m}$ as in Definition 2.5 with $d$ replaced by $1+d$.

Theorem 3.4 Suppose that $\{W_n\}_{n \ge 1}$ is a sequence of $D^{(q,p)\text{-var}}([0,1], \mathbb{R}^{1+d})$-valued random elements with almost surely finitely many jumps, and that $b \in C^{\beta,\gamma}$. Let $X_n$ be the solution of the forward RDE $\mathrm{d}X_n = b(X_n)\,{}^-\mathrm{d}W_n$, $X_n(0) = \xi_n$. Suppose that:
(a) $\lim_{n\to\infty} \xi_n = \xi$ for some $\xi \in \mathbb{R}^m$,
(b) $W_n \to_w W$ in $D([0,1], \mathbb{R}^{1+d})$ with the $SM_1$ topology as $n \to \infty$ (we allow the limit process $W$ to have infinitely many jumps),
(c) the family of random variables $\|W_n\|_{(q,p)\text{-var}}$ is tight.
Let $X$ be the solution of the geometric RDE $\mathrm{d}X = b(X) \diamond \mathrm{d}W$, $X(0) = \xi$. (The RDE is well-posed because $\|W\|_{(q,p)\text{-var}} < \infty$.) Then for each $q' > q$ and $p' > p$, $((W_n, X_n), \ell^{1+d+m}) \to_w ((W, X), \phi_b)$ in the corresponding generalised topology.

We give the proof after several preliminary results. We will see that if $X_n$ solved the geometric RDE $\mathrm{d}X_n = b(X_n) \diamond \mathrm{d}W_n$ instead of the forward RDE, then Theorem 3.4 would readily follow from [7] (and assumption (d) would not be needed). In Lemma 3.6, we verify that under assumption (d) the solution of the forward RDE $\mathrm{d}X_n = b(X_n)\,{}^-\mathrm{d}W_n$ closely approximates the solution of the geometric RDE $\mathrm{d}X_n = b(X_n) \diamond \mathrm{d}W_n$ (generalising a result of [44]). First we show how a single jump of a geometric solution relates to a "forward" jump (cf. [44, Lemma 1.1, Eq. (11)]). We then quantify the error in moving from forward to geometric solutions.

Lemma 3.6 Suppose that $W \in D^{(q,p)\text{-var}}([0,1], \mathbb{R}^{1+d})$ has finitely many jumps. Let $b \in C^{\beta,\gamma}$ and let $X, \tilde X \in D([0,1], \mathbb{R}^m)$ be the forward and the geometric solutions, respectively, with a common initial condition. Then $\sup_t |X(t) - \tilde X(t)|$ is bounded by $K$ multiplied by a sum over all jump times $t$ of $W$, where $K$ depends only on $\|b\|_{C^{\beta,\gamma}}$, $\|W\|_{(q,p)\text{-var}}$, $\gamma$, $\beta$, $p$, and $q$.
Proof Let $t_1 < \cdots < t_n$ be the jump times of $W$; let $t_0 = 0$. For $j \le n$, define $X_j$ as the solution of the forward RDE $\mathrm{d}X_j = b(X_j)\,{}^-\mathrm{d}W$, $X_j(0) = \xi$, on $[0, t_j]$, and as the solution of the geometric RDE $\mathrm{d}X_j = b(X_j) \diamond \mathrm{d}W$ on $[t_j, 1]$ with the initial condition taken from the solution on $[0, t_j]$.
For each $j$, the processes $X_{j-1}$ and $X_j$ coincide on $[0, t_j)$ but possibly differ at $t_j$. By Lemma 3.5 and the identity (3.6), the difference at $t_j$ is bounded as in (3.7). On $[t_j, 1]$, both $X_{j-1}$ and $X_j$ solve the geometric RDE $\mathrm{d}X = b(X) \diamond \mathrm{d}W$, although with possibly different initial conditions. Recall that solutions of geometric RDEs are obtained from RDEs driven by continuous paths by inserting fictitious time intervals and linearly bridging the jumps. As such, they enjoy Lipschitz dependence on the initial condition (see [14, Theorem 12.10]), with a constant $K$ depending only on $\|b\|_{C^{\beta,\gamma}}$, $\|W\|_{(q,p)\text{-var}}$, $\gamma$, $\beta$, $p$, and $q$. It follows from (3.7) and (3.8) that the contributions of the individual jumps accumulate additively. Observing that $X_0 = \tilde X$ and $X_n = X$, and taking the sum over $j$, we obtain the result. □

Proof of Theorem 3.4 Fix $1 \le q' \le p' < 2$ with $p' \in (p, \gamma)$, $q' \in (q, \beta)$, and such that (3.1) holds with $q, p$ replaced by $q', p'$. By Chevyrev and Friz [7, Proposition 2.9], convergence in $SM_1$ is equivalent to convergence in $(D, \alpha_\infty)$. By the Skorokhod representation theorem, we can thus suppose that a.s. $\lim_{n\to\infty} \alpha_\infty(W_n, W) = 0$. Tightness of $\{\|W_n\|_{(q,p)\text{-var}}\}$ implies that a.s. there is a subsequence $n_k$ such that $\limsup_{k\to\infty} \|W_{n_k}\|_{(q,p)\text{-var}} < \infty$, and thus $\|W\|_{(q,p)\text{-var}} < \infty$ a.s. by lower semicontinuity of $(q,p)$-variation. In addition, by a standard interpolation argument (cf. [7, Lemma 3.11]), it holds that $\alpha_{(q',p')\text{-var}}(W_n, W) \to 0$ in probability, and therefore $W_n \to_w W$ in $(D^{0,(q',p')\text{-var}}, \alpha_{(q',p')\text{-var}})$.
An application of the continuity of the solution map for generalised geometric RDEs (the proof of [7, Theorem 3.13] combined with [14, Theorem 12.10]; see Remark 3.2) shows that $((W_n, \tilde X_n), \ell^{1+d+m}) \to_w ((W, X), \phi_b)$, where $\tilde X_n$ denotes the solution of the geometric RDE $\mathrm{d}\tilde X_n = b(\tilde X_n) \diamond \mathrm{d}W_n$, $\tilde X_n(0) = \xi_n$. Furthermore, since $W_n$ a.s. has finitely many jumps, Lemma 3.6 applies, and it follows that $\lim_{n\to\infty} \|(W_n, X_n) - (W_n, \tilde X_n)\|_{p'\text{-var}} = 0$, and in particular that $\sigma_\infty((W_n, X_n), (W_n, \tilde X_n)) \to 0$. By virtue of interpolation, for each $q'' > q'$ and $p'' > p'$, the identity map into the $(q'', p'')$-variation space is continuous on this sequence. Since $q'' > q' > q$ and $p'' > p' > p$ are arbitrary, the conclusion follows. □
We are now ready for the proof of Theorem 2.6.

Results for Gibbs-Markov maps
In this section, we prove results on weak convergence to a Lévy process, and tightness in $p$-variation, for a class of uniformly expanding maps known as Gibbs–Markov maps.

Gibbs-Markov maps
Let $(Z, d)$ be a bounded metric space with Borel sigma-algebra $\mathcal{B}$ and finite Borel measure $\nu$, and let $\mathcal{P}$ be an at most countable partition of $Z$ (up to a zero measure set) with $\nu(a) > 0$ for each $a \in \mathcal{P}$. Let $F : Z \to Z$ be a nonsingular ergodic measurable transformation. We assume that $F$ is a Gibbs–Markov map. That is, there are constants $\lambda > 1$, $K > 0$ and $\theta \in (0,1]$ such that for all $z, z' \in a$ and $a \in \mathcal{P}$:
• $Fa$ is a union of partition elements and $F$ restricts to a (measure-theoretic) bijection from $a$ to $Fa$; moreover $\inf_{a \in \mathcal{P}} \nu(Fa) > 0$.

Recall that an $\alpha$-stable random variable $X$ in $\mathbb{R}^d$ with $\alpha \in (1,2)$ and $\mathbb{E}X = 0$ has characteristic function
$$\mathbb{E}\, e^{i \langle t, X \rangle} = \exp\Big( - \int_{S^{d-1}} |\langle t, s \rangle|^\alpha \big( 1 - i \operatorname{sgn}\langle t, s \rangle \tan\tfrac{\pi\alpha}{2} \big)\, \mathrm{d}\Sigma(s) \Big), \qquad t \in \mathbb{R}^d.$$
Here $\Sigma$ is a finite nonnegative Borel measure on $S^{d-1}$ with $\Sigma(S^{d-1}) > 0$, known as the spectral measure [39, Section 2.3]. It is a direct verification that $\gamma X$, with $\gamma \ge 0$, has spectral measure $\gamma^\alpha \Sigma$. We say that an $\alpha$-stable Lévy process $L_\alpha$ has spectral measure $\Sigma$ if $L_\alpha(1)$ has spectral measure $\Sigma$.
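For numerical experiments, one-dimensional symmetric $\alpha$-stable increments can be generated by the classical Chambers–Mallows–Stuck method; this algorithm is standard background, not part of the paper. A sketch:

```python
import math
import random

def cms_transform(u, e, alpha):
    """Deterministic Chambers-Mallows-Stuck transform (symmetric case,
    skewness beta = 0): maps a uniform angle u in (-pi/2, pi/2) and a
    unit-mean exponential e to a symmetric alpha-stable sample."""
    return (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
            * (math.cos((1.0 - alpha) * u) / e) ** ((1.0 - alpha) / alpha))

def symmetric_stable(alpha, rng=random):
    """Draw one sample from a symmetric alpha-stable law, alpha in (0, 2]."""
    u = rng.uniform(-math.pi / 2, math.pi / 2)
    e = rng.expovariate(1.0)
    return cms_transform(u, e, alpha)
```

Summing $n$ such increments scaled by $n^{-1/\alpha}$ gives a discretised $\alpha$-stable Lévy path, useful for visualising the heavy-tailed jumps that the limiting process $L_\alpha$ exhibits.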
Fix a function $\tau : Z \to \{1, 2, \ldots\}$ that is constant on each $a \in \mathcal{P}$ with value $\tau(a)$, such that $\int_Z \tau \, \mathrm{d}\mu_Z < \infty$. Let $V : Z \to \mathbb{R}^d$ be integrable with $\int_Z V \, \mathrm{d}\mu_Z = 0$. Assume that there exists $C_0 > 0$ such that the Hölder-type bound (4.2) holds for all $z, z' \in a$ and $a \in \mathcal{P}$. Suppose that $b_n$ is a sequence of positive numbers and define the càdlàg process
$$W_n(t) = b_n^{-1} \sum_{j=0}^{\lfloor nt \rfloor - 1} V \circ F^j.$$
We consider $W_n$ as a random element on the probability space $(Z, \mu_Z)$. Throughout this section, $\|\cdot\|_p$ denotes the $L^p$ norm on $(Z, \mu_Z)$ for $1 \le p \le \infty$ and $\mathbb{E}$ denotes expectation with respect to $\mu_Z$. We now state the main results of this section.

Preliminaries about Gibbs-Markov maps
We recall the following standard result.

Lemma 4.5
Let $V : Z \to \mathbb{R}^d$ be integrable with $\int_Z V \, \mathrm{d}\mu_Z = 0$ and satisfying (4.2). Then there is a decomposition $V = m + \chi \circ F - \chi$ such that:

(a) $m, \chi : Z \to \mathbb{R}^d$ with $\chi$ bounded and $Pm = 0$;

(b) for each $p \in (1, 2]$ there is a constant $C(p)$, depending only on $p$, such that
$$\Big\| \max_{1 \le k \le n} \Big| \sum_{j=0}^{k-1} m \circ F^j \Big| \Big\|_p \le C(p)\, n^{1/p}\, \|m\|_p.$$
(We do not exclude the case $\|V\|_p = \infty$.)

Proof For $z, z' \in Z$, let $s(z, z')$ be the separation time, i.e. the minimal nonnegative integer $n$ such that $F^n z$ and $F^n z'$ belong to different elements of $\mathcal{P}$. Let $d_\theta$ be the separation metric on $Z$: $d_\theta(z, z') = \theta^{s(z, z')}$. Let $P : L^1(\mu_Z) \to L^1(\mu_Z)$ be the transfer operator corresponding to $F$ and $\mu_Z$, i.e. $\int_Z P\phi\, w \, \mathrm{d}\mu_Z = \int_Z \phi\, (w \circ F) \, \mathrm{d}\mu_Z$ for all $\phi \in L^1$, $w \in L^\infty$.

By Melbourne and Nicol [30, Lemma 2.2], there is a constant $C_2 > 0$, independent of $V$, such that $\|PV\| \le C_0 C_2$ (in the Lipschitz norm associated with $d_\theta$) for all $V$ satisfying the stated conditions. Hence part (a) follows. For part (b), we proceed as in the proof of [33, Proposition 4.3]. Fix $n > 0$ and let $M^n_k = \sum_{j=n-k}^{n-1} m \circ F^j$. By (a), $M^n_k$ is a martingale on $0 \le k \le n$. By Burkholder's inequality, there is a constant $C(p)$, depending only on $p$, such that part (b) holds, and part (b) follows.

For $0 \le n \le k$, let $\mathcal{P}^k_n$ be the smallest sigma-algebra which contains $F^{-j}\mathcal{P}$ for $j = n, \ldots, k$. A standard property of mixing Gibbs–Markov maps (see for example [1, Section 1]) is that there exist $\gamma \in (0,1)$ and $C > 0$ such that
$$\psi\big(\mathcal{P}^k_0,\, \mathcal{P}^\infty_{k+n}\big) \le C \gamma^n \quad \text{for all } k \ge 0,\ n \ge 1,$$
where the probability measure in the definition of $\psi$ is $\mu_Z$.

Weak convergence to a Lévy process
In this subsection, we prove Theorem 4.2. We use the following result due to Tyran-Kamińska [41].
Theorem 4.6 Let $X_0, X_1, \ldots$ be a strictly stationary sequence of integrable $\mathbb{R}^d$-valued random variables with $\mathbb{E}X_0 = 0$. For $0 \le n \le k$, let $\mathcal{F}^k_n$ denote the sigma-algebra generated by $\{X_n, \ldots, X_k\}$. Suppose that: (a) $X_0$ is regularly varying with index $\alpha \in [1, 2)$ and measure $\sigma$ as in Definition 4.1.
Then, as $n \to \infty$, the random process $W_n$ given by $W_n(t) = b_n^{-1} \sum_{j=0}^{\lfloor nt \rfloor - 1} X_j$ converges to an $\alpha$-stable Lévy process $L_\alpha$ in $D([0,1], \mathbb{R}^d)$ in the $SJ_1$ topology.

Remark 4.7
It is implicit in [41] that $L_\alpha$ has spectral measure $\cos\big(\tfrac{\pi\alpha}{2}\big)(1-\alpha)\,\sigma$, where $\sigma$ is the measure on $S^{d-1}$ for $X_0$ as in Definition 4.1.
Proof of Theorem 4.6 We verify the hypotheses of [41, Theorem 1.1]. In the notation of [41], observe that (b) and [41, Lemma 4.8], together with $\rho \le \psi$, imply that [41, Eq. (1.6)] holds. Moreover, (c) and [41, Corollary 1.3], together with $\varphi \le \psi$, imply that [41, LD($\phi_0$)] holds (for inequalities concerning $\rho$, $\psi$, and $\varphi$, see [4]). □

Proof of Proposition 4.8 To prove part (i), we verify the hypotheses of Theorem 4.6 with $X_k = V' \circ F^k$. Since $\mu_Z$ is $F$-invariant, $\{V' \circ F^k\}_{k \ge 0}$ is a strictly stationary sequence of $\mathbb{R}^d$-valued random variables. The remaining hypotheses are verified as follows:
(a) The observable $V$ is regularly varying with index $\alpha$ and measure $\sigma$, and $V'' \in L^p$ with $p > \alpha$, so $V' = V - V''$ is regularly varying with the same $\alpha$ and $\sigma$.
(b) This is a consequence of (4.3).
(c) This follows from (4.3) and the invariance of $\mu_Z$ under $F$.
Now we prove part (ii). By the assumptions of Theorem 4.2, $V'' \in L^p$ for some $p \in (\alpha, 2)$. Note that $|V''| \lesssim \tau$, $\mathbb{E}V'' = 0$, and that $V''$ satisfies the bound (4.2) for each $z, z' \in a$, $a \in \mathcal{P}$. Hence part (ii) follows from Lemma 4.5. □

Proof of Theorem 4.2 By Proposition 4.8, $W_n = W'_n + W''_n \to_w L_\alpha$. □

Tightness in p-variation
In this subsection we prove Theorem 4.4. First we record the following elementary properties of $\tau$. (The Gibbs–Markov structure is not required here; the proof only uses that $\tau$ is regularly varying with values in $\{1, 2, \ldots\}$ and that $\mu_Z$ is $F$-invariant.)

Proposition 4.9 Let $p > \alpha$. Then part (a) follows by the definition of $b_n$, and a similar calculation proves part (b). Next, by Jensen's inequality, the invariance of $\mu_Z$, and parts (a) and (b), part (c) follows.
Proof By Proposition 4.9.

Inducing weak convergence and tightness in p-variation
A general principle in smooth ergodic theory is that limit laws for dynamical systems are often inherited from the corresponding laws for a suitable induced system [18,20,31,33,38]. In this section, we show that this principle applies to weak convergence in $D([0,1], \mathbb{R}^d)$ with the $SM_1$ topology and to tightness in $p$-variation. The results hold in a purely probabilistic setting.
Let $Y$ be a measurable space and $f:Y\to Y$ a measurable transformation. Suppose that $Z\subset Y$ is a measurable subset with a measurable return time $\tau:Z\to\{1,2,\dots\}$, i.e. $f^{\tau(z)}(z)\in Z$ for each $z\in Z$. (It is not assumed that $\tau$ is the first return time.) Define the induced map $F:Z\to Z$, $F(z)=f^{\tau(z)}(z)$. Suppose that $\mu_Z$ is an ergodic $F$-invariant probability measure and that $\bar\tau=\int_Z\tau\,d\mu_Z<\infty$.
It is convenient to identify $Z$ with $Z\times\{0\}\subset\Delta$, where $\Delta$ denotes the tower. Then, on the tower, $\tau$ is the first return time to $Z$.
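For completeness, we record the standard tower construction (usual conventions; we take the map in (5.1) to be of this form):

```latex
% Tower over (Z, F, \tau): phase space, dynamics and invariant measure.
\Delta = \{(z,k) : z\in Z,\ 0\le k<\tau(z)\},
\qquad
f_\Delta(z,k) =
  \begin{cases}
    (z,k+1), & k<\tau(z)-1,\\
    (Fz,0),  & k=\tau(z)-1,
  \end{cases}
\qquad
\mu_\Delta = \frac{\mu_Z\times\mathrm{counting}}{\bar\tau},
```

with the projection $\pi:\Delta\to Y$, $\pi(z,k)=f^k z$, satisfying $\pi\circ f_\Delta=f\circ\pi$.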
Let $v:Y\to\mathbb{R}^d$ be measurable and define the corresponding induced observable
$$V(z)=\sum_{k=0}^{\tau(z)-1}v(f^kz). \qquad(5.2)$$
Write $v_k(z)=\sum_{j=0}^{k-1}v(f^jz)$ for $0\le k\le\tau(z)$, so that $v_0(z)=0$ and $v_{\tau(z)}(z)=V(z)$. To measure how well the excursion $\{v_k(z)\}_{0\le k\le\tau(z)}$ approximates the straight and monotone path from $0$ to $V(z)$, we define $V^*:Z\to\mathbb{R}^d$. Note that $V^*(z)=0$ if and only if there exist $0=s_0\le s_1\le\cdots\le s_{\tau(z)}=1$ such that $v_k(z)=s_kV(z)$ for $0\le k\le\tau(z)$. Let $b_n$ be a sequence of positive numbers, bounded away from $0$, and define
$$W_n(t)=b_n^{-1}\sum_{j=0}^{\lfloor nt\rfloor-1}v\circ f^j,
\qquad
\widetilde W_n(t)=b_n^{-1}\sum_{j=0}^{\lfloor nt\rfloor-1}V\circ F^j. \qquad(5.4)$$
In this section, the notation $\to_\mu$ and $\to_{\mu_Z}$ denotes weak convergence for random variables defined on the probability spaces $(Y,\mu)$ and $(Z,\mu_Z)$ respectively. We prove:

Theorem 5.1
Suppose that $\widetilde W_n\to_{\mu_Z}\widetilde W$ in the $SM_1$ topology and that $b_n^{-1}\max_{0\le j\le n}V^*\circ F^j\to_{\mu_Z}0$. Then $W_n\to_\mu W$ in the $SM_1$ topology, where $W(t)=\widetilde W(t/\bar\tau)$.
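As a simple illustration of the role of $V^*$ (our own example, for $d=1$): if all increments during an excursion have the same sign, the excursion is already monotone along the segment from $0$ to $V(z)$ and $V^*$ vanishes.

```latex
% d = 1, with v_k(z) = \sum_{j<k} v(f^j z) the partial sums along the excursion.
v(f^k z)\ge 0 \ \text{for } 0\le k<\tau(z)
\quad\Longrightarrow\quad
0 = v_0(z)\le v_1(z)\le\cdots\le v_{\tau(z)}(z) = V(z),
```

so, when $V(z)\neq0$, taking $s_k=v_k(z)/V(z)$ gives $0=s_0\le\cdots\le s_{\tau(z)}=1$ with $v_k(z)=s_kV(z)$, whence $V^*(z)=0$. Excursions that backtrack, by contrast, make $V^*$ large, and it is exactly such backtracking that obstructs convergence in the $SM_1$ topology.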

Theorem 5.2
Suppose that $\tau$ is regularly varying with index $\alpha>1$ on $(Z,\mu_Z)$, and that $b_n$ satisfies $\lim_{n\to\infty}n\,\mu_Z(\tau>b_n)=1$. Let $v\in L^\infty$. Suppose that the family of random variables $\|\widetilde W_n\|_{p\text{-var}}$ is tight on $(Z,\mu_Z)$ for some $p>\alpha$. Then the family $\|W_n\|_{p\text{-var}}$ is tight on $(Y,\mu)$.
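For orientation (model computation, not from the text): the normalisation $\lim_{n\to\infty}n\,\mu_Z(\tau>b_n)=1$ determines $b_n$ up to slowly varying corrections. In the pure power-tail case,

```latex
% If the tail is exactly polynomial, b_n solves n * mu_Z(tau > b_n) = 1 explicitly.
\mu_Z(\tau>t)=t^{-\alpha}
\quad\Longrightarrow\quad
n\,b_n^{-\alpha}=1
\quad\Longrightarrow\quad
b_n=n^{1/\alpha};
```

in general, regular variation of $\tau$ with index $\alpha$ gives $b_n=n^{1/\alpha}\ell(n)$ for some slowly varying function $\ell$.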

Inducing convergence in SM 1 topology
In this subsection, we prove Theorem 5.1. Our proof closely follows the analogous proof in [33], with the difference that we work in R d instead of R.
Since $\pi:\Delta\to Y$ is a measure-preserving semiconjugacy, we may suppose without loss of generality that $Y=\Delta$ and $f=f_\Delta$ as in (5.1). In particular, we may suppose that $\tau$ is the first return time. Define $U_n(t)=b_n^{-1}\sum_{i=0}^{j-1}V\circ F^i$ for $t\in[\tau_j/n,\tau_{j+1}/n)$, $j\ge0$. Thus defined, the restriction of $U_n$ to $Z$ corresponds to $U_n$ in [33].

Lemma 5.4 $U_n\to_{\mu_Z}W$ in the $SM_1$ topology.
Proof For the case $d=1$, see [33, Lemma 3.4]. The proof goes through unchanged for all $d\ge1$.
Next we control excursions: we estimate the distance between U n and W n in the SM 1 topology.

Corollary 5.6
For each $n$ and $k$, on $Z$, the $SM_1$ distance between $U_n$ and $W_n$ on $[0,\tau_k/n]$ is bounded by $b_n^{-1}\max_{0\le j<k}V^*\circ F^j$.

Proof Denote $T_j=\tau_j/n$. Since we restrict to $Z$, each interval $[T_j,T_{j+1}]$, including with $j=0$, corresponds to a complete excursion with $U_n(T_j)=W_n(T_j)$ and $U_n(T_{j+1})=W_n(T_{j+1})$. Fix $j$ and let $\phi:[T_j,T_{j+1}]\to\mathbb{R}^d$ be the linear path such that $\phi(T_j)=U_n(T_j)$ and $\phi(T_{j+1})=U_n(T_{j+1})$. Recall that $U_n$ is constant on $[T_j,T_{j+1})$. By Proposition 5.5, the excursion of $W_n$ on $[T_j,T_{j+1}]$ lies within distance $b_n^{-1}\,V^*\circ F^j$ of the segment traced by $\phi$. Hence the $SM_1$ distance between $U_n$ and $W_n$ on $[T_j,T_{j+1}]$ admits the same bound, and the result follows.
Proof of Theorem 5.1 Fix $T>0$ and define the random variables $k=k(n)=\max\{j\ge0:\tau_j/n\le T\}$ on $Z$. Consider the processes $U_n$, $W_n$ on $Z$, where the time interval $[0,\tau_k/n]$ corresponds to $k$ complete excursions, while $[\tau_k/n,T]$ is the final incomplete excursion. By Corollary 5.6 and the assumptions of Theorem 5.1, the required convergence follows.

Results for nonuniformly expanding maps
In this section, we prove results on weak convergence to a Lévy process, and tightness in p-variation, for a class of nonuniformly expanding maps. The weak convergence result extends work of [33] from scalar-valued observables to R d -valued observables. The result on tightness in p-variation is again new even for d = 1. We show that intermittent maps such as (1.4) and (1.5) fit our setting in Sect. 6.2.

Nonuniformly expanding maps
Let f : Y → Y be a measurable transformation on a bounded metric space (Y , d) and let ν be a finite Borel measure on Y . Suppose that there exists a Borel subset Z ⊂ Y with ν(Z ) > 0 and an at most countable partition P of Z (up to a zero measure set) with ν(a) > 0 for each a ∈ P. Suppose also that there is an integrable return time function τ : Z → {1, 2, . . .} which is constant on each a ∈ P with value τ (a), such that f τ (a) (z) ∈ Z for all z ∈ a, a ∈ P.
Define the induced map $F:Z\to Z$, $F(z)=f^{\tau(z)}(z)$. We assume that $f$ is nonuniformly expanding. That is, $F$ is Gibbs–Markov as in Sect. 4 and in addition there is a constant $C>0$ such that
$$d(f^kz,f^kz')\le C\,d(Fz,Fz')\quad\text{for all } 0\le k\le\tau(a),\ z,z'\in a,\ a\in\mathcal P. \qquad(6.1)$$
Let $\mu_Z$ be the unique $F$-invariant probability measure absolutely continuous with respect to $\nu$. Define the ergodic $f$-invariant probability measure $\mu=\pi_*\mu_\Delta$ as in Sect. 5. Set $\bar\tau=\int_Z\tau\,d\mu_Z$.
Let $v:Y\to\mathbb{R}^d$ be a Hölder observable with $\int_Y v\,d\mu=0$, and define $V,V^*:Z\to\mathbb{R}^d$ as in (5.2) and (5.3).
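Explicitly (standard definition, recorded because the constants $C_0$ and $\theta$ below refer to it): $v$ Hölder with exponent $\theta\in(0,1]$ means

```latex
% Hoelder norm of v with respect to the metric d on Y: sup norm plus seminorm.
|v|_\theta \;=\; |v|_\infty
  \;+\; \sup_{y\neq y'}\frac{|v(y)-v(y')|}{d(y,y')^{\theta}} \;<\;\infty .
```

Here $C_0$ denotes the Hölder seminorm (the supremum above) and $\theta$ the exponent.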
Let b n be a sequence of positive numbers and define W n as in (5.4). Let P be any probability measure on Y that is absolutely continuous with respect to ν, and regard W n as a process with paths in D([0, 1], R d ), defined on the probability space (Y , P).
We can now state and prove the main results of this subsection.

Theorem 6.1
Suppose that $V$ is regularly varying with index $\alpha\in(1,2)$ and measure $\sigma$ on $(Z,\mu_Z)$, and that $b_n$ satisfies $\lim_{n\to\infty}n\,\mu_Z(|V|>b_n)=1$. Then $W_n\to_w L_\alpha$ on $(Y,P)$ in the $SM_1$ topology, where $L_\alpha$ is the $\alpha$-stable Lévy process with spectral measure $\cos\frac{\pi\alpha}{2}\,\Gamma(1-\alpha)\,\sigma/\bar\tau$.

Proof
Note that $|V|\le|v|_\infty\,\tau$. Let $z,z'\in a$, $a\in\mathcal P$. Then
$$|V(z)-V(z')|\le\sum_{k=0}^{\tau(a)-1}\big|v(f^kz)-v(f^kz')\big|\le C_0\,\tau(a)\,\big(C\,d(Fz,Fz')\big)^{\theta},$$
where $C_0$ is the Hölder constant for $v$ and $\theta$ is the Hölder exponent, and we used condition (6.1) in the definition of nonuniformly expanding map. Hence condition (4.2) is satisfied. Define $\widetilde W_n$ as in (5.4). By Theorem 4.2, $\widetilde W_n\to_w\widetilde L_\alpha$ on $(Z,\mu_Z)$ in the $SJ_1$ topology, where $\widetilde L_\alpha$ is an $\alpha$-stable Lévy process with spectral measure $\cos\frac{\pi\alpha}{2}\,\Gamma(1-\alpha)\,\sigma$. By Theorem 5.1, $W_n\to_w L_\alpha$ on $(Y,\mu)$ in the $SM_1$ topology, where $L_\alpha(t)=\widetilde L_\alpha(t/\bar\tau)$. This proves the result when $P=\mu$.
By Zweimüller [47, Theorem 1 and Corollary 3] (see also [33, Proposition 2.8]), the convergence holds not only on $(Y,\mu)$ but also on $(Y,P)$ for any probability measure $P$ that is absolutely continuous with respect to $\nu$. This completes the proof.

Theorem 6.2 Suppose that $\tau$ is regularly varying with index $\alpha>1$ on $(Z,\mu_Z)$, and that $b_n$ satisfies $\lim_{n\to\infty}n\,\mu_Z(\tau>b_n)=1$. Then $\{\|W_n\|_{p\text{-var}}\}$ is tight on $(Y,P)$ for each $p>\alpha$.
Proof Condition (4.2) was established in the proof of Theorem 6.1. Tightness on (Y , μ) follows from Theorems 5.2 and 4.4. Tightness on (Y , P) holds by the same argument used in the proof of Lemma 5.9.
We choose $Z=[\tfrac12,1]$ for the map (1.4), and $Z=[\tfrac13,\tfrac23]$ for (1.5). Let $\tau$ be the first return time to $Z$. The reference measure $\nu$ is Lebesgue and the partition $\mathcal P$ consists of maximal intervals on which the return time is constant. It is standard that the first return map $F=f^\tau$ is Gibbs–Markov, and since $f'>1$, condition (6.1) holds. Thus both maps are nonuniformly expanding. The measure $\sigma$ is given by the formula in part (b). By (6.2), $V$ is regularly varying with index $\alpha$ and the same $\sigma$, proving part (b). Moreover, $\mu_Z(|\tau v|>n)\sim cn^{-\alpha}$ with $c$ as in part (c), so $\mu_Z(|V|>n)\sim cn^{-\alpha}$ by (6.2). Part (c) follows by Remark 4.3(a).
Finally, it follows from (6.2) that $V^*\lesssim\tau^\beta$, so $V^*\in L^q(\mu_Z)$ for some $q>\alpha$. This proves (e) and completes the proof of the lemma.
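For context, the regular variation of $\tau$ for maps of this type follows from the classical escape-time computation near the neutral fixed point. We sketch it under the usual LSV form of the map, assuming (1.4) is $f(y)=y(1+2^\gamma y^\gamma)$ on $[0,\tfrac12)$, $f(y)=2y-1$ on $[\tfrac12,1]$, with $\gamma=1/\alpha$:

```latex
% Near y = 0 the orbit obeys y_{k+1} - y_k \approx 2^\gamma y_k^{1+\gamma};
% comparing with the ODE dy/dk = 2^\gamma y^{1+\gamma} gives the escape time
% from an initial point y_0 close to the neutral fixed point:
\int_{y_0}^{1/2}\frac{dy}{2^{\gamma}\,y^{1+\gamma}}
  \;\approx\; \frac{y_0^{-\gamma}}{\gamma\,2^{\gamma}},
\qquad\text{so}\qquad
\{\tau>n\}\ \text{corresponds to}\ y_0\lesssim n^{-1/\gamma},
```

whence $\mu_Z(\tau>n)\asymp n^{-1/\gamma}=n^{-\alpha}$; that is, the return time is regularly varying with index $\alpha=1/\gamma$.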
Finally, as a consequence of these results combined with Theorem 2.6, we can record the desired conclusion for homogenisation of fast–slow systems with fast dynamics given by one of the intermittent maps in Sect. 1. Consider the fast–slow system (1.1) with initial condition $x_0^{(n)}=\xi_n$ such that $\lim_{n\to\infty}\xi_n=\xi$. Suppose that $a\in C^\beta(\mathbb{R}^m,\mathbb{R}^m)$ and $b\in C^\gamma(\mathbb{R}^m,\mathbb{R}^{m\times d})$ for some $\beta>1$, $\gamma>\alpha$. Define $W_n$ as in (1.2) and $X_n(t)=x^{(n)}_{\lfloor nt\rfloor}$. Let $P$ be any probability measure on $Y$ that is absolutely continuous with respect to Lebesgue, and regard $W_n$ and $X_n$ as processes on $(Y,P)$.
Let $\ell_k$ denote the linear path function on $\mathbb{R}^k$ and let $\phi_b$ be the path function on $\mathbb{R}^{d+m}$ as in Definition 2.5. Fix $p>\alpha$. Then $\big((W_n,X_n),\ell_{d+m}\big)\to_w\big((L_\alpha,X),\phi_b\big)$ as $n\to\infty$ in $\big(D^{p\text{-var}}([0,1],\mathbb{R}^{d+m}),\alpha_{p\text{-var}}\big)$, where $L_\alpha$ is the $\alpha$-stable Lévy process with spectral measure $c\cos\frac{\pi\alpha}{2}\,\Gamma(1-\alpha)\,\sigma/\bar\tau$ with $c$ and $\sigma$ as in Lemma 6.3, and $X$ is the solution of the Marcus differential equation (2.4).
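We close by recalling (the standard Marcus convention, recorded for the reader's convenience) how the Marcus solution $X$ treats the jumps of $L_\alpha$: at a jump time $t$, the solution is transported along the flow of $b$ in the direction of the jump.

```latex
% Marcus jump rule: at a jump time t with jump \Delta L_\alpha(t), flow for
% unit time along the vector field x \mapsto b(x)\,\Delta L_\alpha(t).
X(t) = \varphi(1), \qquad
\frac{d\varphi}{ds} = b(\varphi)\,\Delta L_\alpha(t), \quad \varphi(0)=X(t-).
```

This convention makes the solution map behave naturally under smooth changes of coordinates (an ordinary chain rule, without Itô-type correction terms), which is why it is the interpretation arising in homogenisation limits of this kind.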