The order of convergence in the averaging principle for slow-fast systems of stochastic evolution equations in Hilbert spaces

In this work we study the strong order of convergence in the averaging principle for slow-fast systems of stochastic evolution equations in Hilbert spaces with additive noise. In particular, the stochastic perturbations are general Wiener processes, i.e. their covariance operators are not required to be trace class. We prove that the slow component converges strongly to the averaged one with order of convergence $1/2$, which is known to be optimal. Moreover, we apply this result to a slow-fast stochastic reaction-diffusion system where the stochastic perturbation is given by a white noise both in time and space.


Introduction
Consider the following slow-fast system of abstract stochastic evolution equations with additive noise where $\epsilon > 0$ is a small parameter representing the ratio of time scales between the slow component of the system $U^\epsilon$ and the fast one $V^\epsilon$. Here $H$, $K$ are Hilbert spaces, $A_1$, $A_2$ are unbounded linear operators on $H$, $K$ respectively and $W^{Q_1}$, $W^{Q_2}$ are Wiener processes on $H$, $K$ respectively. Slow-fast systems are widely used in applications, since real-world systems naturally exhibit very different time scales. We refer the reader for example to [23] for applications to physics, [32] to chemistry, [34] to neurophysiology, [1], [13], [21], [22] to mathematical finance (see also [12] for a slightly different financial model) and the references therein.
A natural idea is then to study the behaviour of the system when $\epsilon \to 0$. In particular, under certain hypotheses it is known that the slow component $U^\epsilon$ converges to the solution $\bar U$ of the so-called averaged equation where and $\mu$ is the invariant measure related to the fast motion, i.e.
Note that the equation for $\bar U$ is uncoupled from $V^\epsilon$. This fact is known as the averaging principle and it is fundamental in applications, since $\bar U$ captures the effective dynamics of $U^\epsilon$ (which is usually the most interesting variable in applications) and it therefore provides a rigorous dimensionality reduction of the original system. The first general result for the averaging principle for finite-dimensional stochastic differential equations can be found in [27]. For generalizations and improvements see [13], [17], [18], [23], [31], [40], [41] and the references therein. It is important to mention that the drift of the fast equation is allowed to depend also on the slow component (i.e. fully coupled system) and the stochastic perturbations of the slow and fast equations are allowed to be multiplicative, i.e. the diffusion coefficient of the slow equation can depend on both the slow and fast variables. Moreover, when the diffusion coefficient of the slow equation is independent of the fast variable, strong convergence in probability is obtained. Otherwise only weak convergence can be proved.
However, for numerical applications it is very important to know the speed at which $U^\epsilon \to \bar U$, e.g. see [3], [29]. For the study of the order of convergence for finite-dimensional systems we refer to [16], [28], [30], [35], [25] and the references therein. It is important to mention that the order of convergence can be studied in two ways: in the strong sense and in the weak sense. Moreover, the optimal orders for the strong and weak convergence are known to be $1/2$ and $1$ respectively.
Recently the problem of estimating the order of convergence in the averaging principle for infinite-dimensional systems has been addressed by several authors. In [4] the author (generalizing his previous work [2]) considers a slow-fast stochastic reaction-diffusion system with additive noise. Both the weak and strong orders of convergence are obtained: in particular, under strong regularity of the noise (it is for example assumed that the covariance operator is trace class, but for the precise statement see [4]) it is proved that the strong order of convergence is $1/2$ and the weak order is $1$, with both orders being optimal. Instead, under more general assumptions on the noise only weaker orders of convergence are obtained for both the strong and weak convergence. In [34] a 1-dimensional fully coupled reaction-diffusion system is considered and the strong order of convergence is proved to be $1/2$ under very strong assumptions on the covariance operators of the noises, i.e. $\mathrm{Tr}(\Delta^{1/2} Q_i) < \infty$, where $\Delta$ is the Laplacian. In [36] the strong order of convergence for a fully coupled slow-fast stochastic system is studied.
Here it is assumed that the covariance operators of the noises are trace class and moreover that $\mathrm{Tr}(-A_1 Q_1) < \infty$. See also [26], where the weak order of convergence for a stochastic wave equation with fast oscillation given by a fast reaction-diffusion stochastic system is proved to be $1$. Also here it is assumed that $\mathrm{Tr}(Q_i) < \infty$. Indeed, none of these papers can treat the case $\mathrm{Tr}(Q_i) = \infty$, which is very important for applications, as it arises very naturally for example when the stochastic perturbation is a white noise, i.e. $Q_i = I$.
In this manuscript we are then interested in studying the strong order of convergence for the slow-fast infinite-dimensional system of stochastic evolution equations (1) where Under some hypotheses, see Hypotheses 1, 2, 3, 4, 5 below, we prove that the strong order of convergence is $1/2$, which is known to be optimal. In particular we show in Theorem 1 that where $\bar U_t$ is the solution of the averaged equation. Notice that this result is much stronger than [4], [36], where $\sup_{t \in [0,T]}$ is outside the expectation.
The key tool in the proof of Theorem 1 is Proposition 3. The proof of this proposition is based on a technical result, i.e. Lemma 6.3, which is inspired by [7], and it is a consequence of the mixing properties of the fast motion, i.e. Lemmas 4.5, 4.6, 4.7. We recall that [7] studies the normal deviations, i.e. the weak convergence of $Z^\epsilon := (U^\epsilon - \bar U)/\sqrt{\epsilon}$, when the equation for the slow component has no stochastic perturbation ($Q_1 = 0$).
Finally, we discuss an application of our theory to a 1-dimensional slow-fast stochastic reaction-diffusion system where the stochastic perturbation is given by a white noise both in time and space which, to the best of our knowledge and as said before, cannot be treated by the existing literature.
The paper is organized as follows: in Section 2 we introduce the problem in a formal way and we state the assumptions that we will use. In Section 3 we prove some a priori estimates. In Section 4 we prove some results related to the fast motion. In Section 5 we study the well-posedness of the averaged equation. In Section 6 we prove some preliminary results. In Section 7 we prove that the order of convergence is $1/2$ and we give an application of our theory.

Setup and assumptions
In this section we define the notation and the assumptions for the rest of the paper.
$H$, $K$ will be Hilbert spaces with scalar products
$B_B(H)$ will denote the space of bounded functions $\varphi : H \to \mathbb{R}$.
$\mathrm{Lip}(H)$ will denote the set of Lipschitz functions $\varphi : H \to \mathbb{R}$ and set
$\mathcal{L}(H)$ will denote the space of bounded linear operators from $H$ to $H$, endowed with the operator norm
Next denote by $\mathcal{L}_2(H)$ the space of Hilbert-Schmidt operators endowed with the norm
The analogous spaces $B_B(K)$, $\mathrm{Lip}(K)$, $\mathcal{L}(K)$, $\mathcal{L}_2(K)$ are defined for the Hilbert space $K$ with the corresponding norms. In order to simplify the notation we will omit the subscripts $K$ and $H$ in the various norms when no confusion is possible. $\mathcal{B}(H)$ and $\mathcal{B}(K)$ will denote the Borel sigma-algebras in $H$ and $K$ respectively.
Consider now the following infinite-dimensional system for where independent cylindrical Wiener processes on $H$, $K$ respectively with covariance operators $Q_1$, $Q_2$ respectively, and they are defined on some probability space $(\Omega, \mathcal{F}, P)$ with a normal filtration $\mathcal{F}_t$, $t \ge 0$.
We now state the assumptions that we will use throughout the work: Moreover we assume that there exist $\zeta > 0$, $n \ge 2$ integer and $1/(2n) < \beta < 1/3$
Moreover we can define the fractional powers of $-A_1$, denoted by $(-A_1)^\theta$ for $\theta \ge 0$, with domain $D((-A_1)^\theta)$. We will denote by The following standard results hold, e.g. see [33, Chapter 2, Theorem 6.13], for some $\nu > 0$ and
Hypothesis 2. $A_2 : D(A_2) \subset K \to K$ is a linear operator generator of a $C_0$-semigroup $e^{A_2 t}$ on $K$, $t \ge 0$. Moreover there exists $\lambda > 0$ such that for every $t \ge 0$.
Moreover we assume that $L_G < \lambda$.
Hypothesis 5. Assume that $Q_2$ is invertible (with inverse $Q_2^{-1} \in \mathcal{L}(K)$).
Remark 2.2. Hypotheses 4, 5 hold for example choosing $A_1 = A_2 = A$ to be the Laplacian on $[0, L]$ and Indeed Hypothesis 5 is immediately satisfied. Moreover, by setting $H = K = L^2([0, L])$, we have that $A e_k = -C k^2 e_k$ for some orthonormal basis $\{e_k\}$ of eigenvectors of $A$. Thus, by the spectral representation (3) with $\alpha_k = C k^2$, we have: where the inequality follows since for every $t > 0$ the function $h \mapsto e^{-C h^2 t}$ is non-increasing. This shows that Hypothesis 4 holds with $\gamma = 1/4$.
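The comparison between a series and an integral used in Remark 2.2 can be checked numerically. The following is a minimal sketch under assumptions that are ours and not stated in the surviving text: the noise is white ($Q = I$), the eigenvalues are $\alpha_k = c k^2$, and Hypothesis 4 is read as a bound of the form $\|e^{At}\|_{\mathcal{L}_2} \le C t^{-\gamma}$. Under this reading the squared Hilbert-Schmidt norm is $\sum_{k \ge 1} e^{-2 c k^2 t}$, which is dominated by $\int_0^\infty e^{-2 c h^2 t}\,dh = \sqrt{\pi/(8ct)}$, so the norm itself grows like $t^{-1/4}$. The function names are hypothetical.

```python
import math

def hs_norm_squared(t, c=1.0, tol=1e-16):
    """Squared Hilbert-Schmidt norm of e^{At} when A e_k = -c k^2 e_k and Q = I
    (assumption): sum_{k>=1} exp(-2 c k^2 t), truncated once terms fall below tol."""
    total, k = 0.0, 1
    while True:
        term = math.exp(-2.0 * c * k * k * t)
        if term < tol:
            return total
        total += term
        k += 1

def integral_bound(t, c=1.0):
    """Closed form of int_0^inf exp(-2 c h^2 t) dh = sqrt(pi / (8 c t))."""
    return math.sqrt(math.pi / (8.0 * c * t))

# The series is dominated by the integral because h -> exp(-2 c h^2 t) is
# non-increasing; multiplying the sum by sqrt(t) shows the t^{-1/2} scaling.
for t in (1e-4, 1e-3, 1e-2, 1e-1, 1.0):
    s, b = hs_norm_squared(t), integral_bound(t)
    print(f"t={t:g}  sum={s:.4f}  bound={b:.4f}  sum*sqrt(t)={s * math.sqrt(t):.4f}")
```

The product `sum*sqrt(t)` stays bounded as $t \to 0$, which is consistent with $\gamma = 1/4$ for the norm (exponent $1/2$ for its square), even though $\mathrm{Tr}(Q) = \infty$ in this setting.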
In the sequel we will always assume that Hypotheses 1, 2, 3, 4, 5 hold. Moreover $C > 0$ will denote a generic constant independent of $\epsilon$ which may change from line to line.

A priori estimates
In this section we prove some classical a priori estimates for the slow and fast components. In the following for every $t \ge 0$ denote by and the stochastic convolutions. First we prove some estimates related to $\Gamma^{1}_t$ and $\Gamma^{2\epsilon}_t$.
Lemma 3.1.
Now we estimate $E|Y_\rho|^p_\theta$; since $Y_\rho$ is a Gaussian random variable, by Itô's isometry, (4) and Hypothesis 4 we have: for every $\rho \le T$ and $\theta$, $\eta$ such that Next, inserting the last inequality in (10) and recalling that $p > 1/\eta$, which yields $p > (1/2 - \gamma - \theta)^{-1}$, we obtain the thesis of the Lemma for Finally by Hölder's inequality we have the thesis of the Lemma.
Proof. For $t > 0$, by [10, Theorem 4.36] and by our hypotheses we have $C$, so that the thesis is proved.
and sup for every $\epsilon > 0$.
Proof. Define . By Young's inequality and Hypotheses 1, 3 we have: for every $t \le T$.
Then by the comparison theorem we have for every $t \le T$.
Then by the definition of $\Lambda^{1\epsilon}$ and this last inequality it follows for every $t \le T$. Now by Lemma 3.1 we have: for every $\tau \le T$. Now we proceed in a similar way to [19, proof of Lemma 3.10], i.e. set $x + \theta$ and use the previous inequality, then: Now by dominated convergence for $\theta \to 0$ we have: so that by recalling the definition of $\Lambda^{2\epsilon}_t$ we have: Then, by Hölder's inequality and Lemma 3.2, we have: This proves (12). Finally, inserting (12) into (15) we have (11).
Proof. Consider for $t \le T$ First, as $u \in D((-A_1)^\alpha)$ we have: Moreover by (4) and by Lemma 3.3 we have: Finally by Lemma 3.1 we have: By considering (16), calculating $E \sup_{t \in [0,T]} |U^\epsilon_t|^2_\alpha$ and using the last three inequalities we have the thesis.
Proof. For $0 \le t \le T$, $h \ge 0$ such that $t + h \le T$ we have Consider the first term on the right-hand side: as $u \in D((-A_1)^\alpha)$, by (5) and by Lemma 3.4 we have: $C|h|^{2\alpha}(1 + |u|^2_\alpha + |v|^2)$. Consider now the second term on the right-hand side, then by Lemma 3.3 we have: Finally, for the third term on the right-hand side, by Itô's isometry and Hypothesis 4 we have: As by assumption $2\alpha \wedge 2 \wedge (1 - 2\gamma) = 2\alpha$, we have the thesis.

Fast motion
In this section we study some classical properties of the fast motion. Consider for every $s \ge 0$ and for some $Q_2$-Wiener process $w_s$. First define the semigroup related to the fast motion by for every $\varphi \in B_B(K)$, $s \ge 0$.
Next recall that $\delta$ was defined by (7), then we have: $s$, then by (8) we have: Then by taking the expectation and applying the comparison theorem we have the thesis.
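The contraction mechanism behind this estimate can be illustrated on a scalar toy model. The sketch below makes several assumptions that are ours: the fast equation is replaced by the one-dimensional equation $dv = (-\lambda v + g(v))\,dt + dw$ with a hypothetical drift $g(v) = L_G \sin v$, the constant is taken to be $\delta = \lambda - L_G$ (our reading of the definition referred to as (7)), and the time discretization is an explicit Euler-Maruyama scheme. Two paths started from different initial data but driven by the same noise increments then satisfy $|v^{v_1}_s - v^{v_2}_s| \le e^{-\delta s}|v_1 - v_2|$ path by path, since the noise cancels in the difference.

```python
import math
import random

def coupled_fast_paths(v1, v2, lam=2.0, lip_g=0.5, dt=1e-3, n_steps=5000, seed=0):
    """Euler-Maruyama for dv = (-lam*v + g(v)) dt + dw from two initial data,
    driven by the SAME Brownian increments. Returns |v1 - v2| along the path.
    The drift g is a hypothetical example with Lipschitz constant lip_g < lam."""
    rng = random.Random(seed)
    g = lambda v: lip_g * math.sin(v)
    gaps = [abs(v1 - v2)]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # shared noise increment
        v1 = v1 + (-lam * v1 + g(v1)) * dt + dw
        v2 = v2 + (-lam * v2 + g(v2)) * dt + dw
        gaps.append(abs(v1 - v2))
    return gaps

gaps = coupled_fast_paths(3.0, -1.0)
delta = 2.0 - 0.5  # assumed rate: lam - lip_g
for n in (0, 1000, 3000, 5000):
    t = n * 1e-3
    print(f"t={t:.1f}  gap={gaps[n]:.3e}  bound={gaps[0] * math.exp(-delta * t):.3e}")
```

Because the additive noise drops out of the difference of the two paths, the bound holds for every realization and not only in expectation, which is what makes the dissipativity condition $L_G < \lambda$ so effective for the mixing estimates of this section.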
Next we can show:
Lemma 4.2. Let $p \ge 1$, then there exists $C = C(p) > 0$ such that: for every $s \ge 0$, $v \in K$.
Proof. Define First, by Burkholder's inequality and Hypotheses 2 and 4, similarly to what is done for Lemma 3.2, we have: for every $s \ge 0$.
Then for $p \ge 2$ we have: Then by the comparison theorem and (20) we have: Now by [11, Theorem 6.3.3] there exists a unique invariant measure $\mu$ for the semigroup $P_t$. Moreover we have:
Lemma 4.3. We have:
Proof. Fix $N > 0$, then by definition of invariant measure and Lemma 4.2 we have for every $s > 0$ where we have used the fact that $(a + b) \wedge c \le a \wedge c + b \wedge c$ for every $a, b, c \ge 0$. By choosing $s > 0$ large enough we have for some $C > 0$ independent of $N$. Letting $N \to \infty$, by the monotone convergence theorem we have the result.
Next we study the convergence to equilibrium of the semigroup of the fast motion, i.e. we prove:
Lemma 4.4. There exists $C > 0$ such that for every $s \ge 0$, $v \in K$, $\varphi \in \mathrm{Lip}(K)$. Moreover there exists $C > 0$ such that
Proof. First, for every $\varphi \in \mathrm{Lip}(K)$, by Lemma 4.1 we have: for every $s \ge 0$, $v_1, v_2 \in K$. Now let $s > 0$; by definition of invariant measure, (22) and Lemma 4.3 we have: for every $v \in K$, $\varphi \in \mathrm{Lip}(K)$, so that we have the first inequality. Now, thanks to Hypothesis 5, we can apply [10, Theorem 9.32] to obtain the Bismut-Elworthy formula: for every $s > 0$, $\varphi \in B_B(K)$. Now by the semigroup property, the regularizing property of the semigroup (24) and by (22) we have: for every $s > 0$, $v_1, v_2 \in K$.
Finally, similarly to before, by (25) for $s > 0$ we have:
Now we study the mixing properties of the semigroup of the fast motion. To this purpose define for $0 \le s \le t \le \infty$, $v \in K$, $\mathcal{H}^t_s(v) = \sigma(v^v_r,\ s \le r \le t)$. Then a classical consequence of Lemmas 4.2, 4.4 is the following mixing result, whose proof is the same as that of [7, Lemma 3.2] and is reported in the appendix for completeness.
Lemma 4.5. There exists $C > 0$ such that for every $0 \le s \le t$, $v \in K$.
Now Lemma 4.5 implies the following classical result, e.g. see [37] (see also [7, Proposition 3.3]). The proof can be found in the appendix for completeness.
Lemma 4.6. There exists $C > 0$ such that for every 0
Since in our case $|\xi_i|$ will not be bounded by $1$, we need the following result, which is similar to [7, Proposition 3.3]. Also in this case we postpone the proof to the appendix.
Lemma 4.7. Let $\rho \in (0, 1)$. Then there exists $C = C(\rho) > 0$ such that for every $0 \le s_1 \le t_1 < s_2 \le t_2$ and $\xi_i$ $\mathcal{H}^{t_i}_{s_i}(v)$-measurable, $i = 1, 2$, satisfying for some then:

Averaged equation
In this section we introduce the averaged equation and we prove its well-posedness.
for every $u \in H$ and consider the so-called averaged equation: for every $t \le T$. Now we prove the well-posedness of the averaged equation:
Proposition 2. Equation (28) admits a unique mild solution given by: Moreover for every $p > 0$ there exists $C = C(T, p) > 0$ such that
Proof. In order to prove the first part of the Proposition it is sufficient to prove that $\bar F$ is Lipschitz (e.g. see [10]). But this follows from the Lipschitz continuity of $F$, indeed: Hence we obtain the Lipschitz continuity of $\bar F$ and the first claim of the Proposition. The proof of the second claim of the Proposition follows by a standard application of Gronwall's Lemma.
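The fact that averaging against the invariant measure preserves the Lipschitz constant can be checked on a scalar toy example. The sketch below rests on assumptions that are ours: the averaged coefficient has the usual form $\bar F(u) = \int F(u, v)\,\mu(dv)$, the slow drift is the hypothetical choice $F(u, v) = \sin(u + v)$ (Lipschitz constant $1$ in $u$), and the fast motion is the scalar Ornstein-Uhlenbeck process $dv = -\lambda v\,dt + dw$, whose invariant measure is $\mu = N(0, 1/(2\lambda))$. In this case $\bar F(u) = \sin(u)\,e^{-1/(4\lambda)}$ in closed form, which gives a reference value for the quadrature.

```python
import math

def averaged_drift(u, lam=1.0, n_grid=4001, width=10.0):
    """Approximate F_bar(u) = int F(u, v) mu(dv) by the trapezoidal rule, for the
    toy choice F(u, v) = sin(u + v) and mu = N(0, 1/(2*lam)) (both assumptions)."""
    sigma = math.sqrt(1.0 / (2.0 * lam))
    a, b = -width * sigma, width * sigma  # truncate the Gaussian at 10 std devs
    h = (b - a) / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        v = a + i * h
        density = math.exp(-v * v / (2.0 * sigma**2)) / (sigma * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, n_grid - 1) else 1.0  # trapezoid end weights
        total += weight * math.sin(u + v) * density
    return total * h

exact = math.sin(0.3) * math.exp(-0.25)  # closed form for lam = 1
print(averaged_drift(0.3), exact)

# |F_bar(u1) - F_bar(u2)| <= int |F(u1, v) - F(u2, v)| mu(dv) <= |u1 - u2|
u1, u2 = -1.0, 2.0
print(abs(averaged_drift(u1) - averaged_drift(u2)), "<=", abs(u1 - u2))
```

The inequality in the last comment is the one-line argument behind the Lipschitz continuity of $\bar F$: since $\mu$ is a probability measure, integrating a pointwise Lipschitz bound does not enlarge the constant.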

Preliminary results
In this section we prove a technical result, i.e. Lemma 6.3.
Moreover let $n \in \mathbb{N}$ and set: for every $1 \le j \le n$, $0 \le r_1 \le \dots \le r_{2n} \le T/\epsilon$. First we show the following result: for every $u \in H$, $v \in K$, $1 \le j_1 \le j_2 \le n$, $0 \le r_1 \le \dots \le r_{2n} \le T/\epsilon$. Moreover there exist $C = C(T, \rho, n) > 0$ and $\eta = \eta(\rho, n) > 0$ such that where rj = $\max\{r_{2n} - r_{2n-1},\ r_{2j+1} - r_{2j}\}$, for every $1 \le j \le n$, $0 \le r_1 \le \dots \le r_{2n} \le T/\epsilon$.
Remark 6.2. Note that the dependence of the exponents on $j_2 - j_1$ and $n$ is not restrictive: once $n$ has been fixed (together with $\rho$ and $T$) we are allowed to take any $0 \le j_1 \le j_2 \le n$.
In this sense $\eta = \eta(\rho, j_2 - j_1)$. Of course, since $j_2 - j_1 \le n$, we could choose $\eta' = \eta'(\rho, n) \le \eta$ and replace $\eta$ with $\eta'$ in the estimates of the Lemma. However the estimate with $\eta$ is more precise.
Now we can state and prove the main result of this section.
Lemma 6.3. Let $n \in \mathbb{N}$ and $0 \le \beta < 1/3$. Then there exists a constant $C = C(T, n, \beta) > 0$ and
Proof. Recall the definition of $\Psi_h(r)$, then by a change of variable we have: where we have defined: By symmetry we have: We proceed by induction on $n$ and to this purpose fix some $\rho \in (0, 1)$. Consider $n = 1$, then by the definition of $\theta_{\alpha,\beta} = e^{-r\alpha} r^{-\beta}$, (30) and some changes of variables we have so that by (37) we have the thesis for $n = 1$. Now assume that , for every even $j < 2n$ where η j = j + ρ(j−1) (ρ+2). We prove that then it holds for $j = 2n$. Set for $r = (r_1, \dots, r_{2n}) \in (s, t)^{2n}$ with $s \le r_1 \le \dots \le r_{2n} \le t$ the integer $j(r)$ such that $\max_{j=1,\dots,n-1}$ and consider $H_\epsilon(s, t)$ given by (38). Recalling the definition of $J_j(r_1, \dots, r_{2n})$ given by (29) we have: Note that by definition of $j(r)$, for every $s \le r_1 \le \dots \le r_{2n} \le t$, we have It follows: Now by (31) and the definition of $j(r)$ we have: where ρ = ρ 2(2+ρ). Recall that $r_{i+1} \ge r_i$ for every $i$ and note that max Hence, since g(t) = 1/t ρ, $f(t) = e^{-\alpha t}$ for $\alpha > 0$ are decreasing, we have (recall that $C$ may change from line to line): where in the last equality we have defined $\delta_n = \frac{\delta \rho}{n(2+\rho)}$ and we have used the following identity: We can apply this last inequality in order to estimate $I_{1,\epsilon}(s, t)$, i.e.
Now consider $i = 2, 4, \dots, 2n$, we have: Now by (40) and (41) we have: Now consider the remaining term in the inequality for $I_{1,\epsilon}(s, t)$: by (40) and (41) we have Applying now (42) and (43) to the inequality for $I_{1,\epsilon}(s, t)$ we have: In addition, by the inductive hypothesis we have: Finally, applying the last two inequalities to (39) and then going back to (37) we have the thesis.

Order of convergence
In this section we finally investigate the order of convergence. We first prove the following proposition, which will be crucial in the derivation of the order of convergence.
Proof. We proceed with similar techniques to the ones in the proof of [7, Theorem 4.1]. Define We proceed using the factorization method, e.g. see [10, Chapter 5, Section 3]: fix $\zeta > 0$, $n \ge 2$ integer and $1/(2n) < \beta < 1/3$ as in Hypothesis 1, then: By Hölder's inequality we have: We now claim that there exist $C = C(T) > 0$ and $\eta > 0$ such that for every $0 \le t_0 \le s \le T$, $\epsilon > 0$, $u \in H$, $v \in K$. Indeed, first recall the spectral representation (3). Then by Parseval's identity, Hölder's inequality, Hypothesis 1 and the properties of conditional expectations we have: where the last inequality follows by Lemma 6.3. Hence we have: .
Inserting this inequality in the one for $E|Y^\epsilon_s|^{2n}$ we have: .
The series on the right-hand side is convergent by Hypothesis 1 and we have (47), so that the claim is proved. Inserting (47) into (46) we have: Finally by Hölder's inequality we have the thesis:
We can now state and prove the main Theorem of this work: for every $\epsilon > 0$.
Proof. For $t \in [0, T]$ we have: so that: For the first term on the right-hand side, by the Lipschitz continuity of $F$ we have: For the second term on the right-hand side, by Proposition 3 we have: $C\epsilon$.
Putting everything together we have: for every $\tau \le T$.
Then by Gronwall's Lemma we have the thesis of the Theorem:
Finally, we provide an example to which our theory applies and which is not covered by the existing literature.
Example 1. Consider the following fully coupled slow-fast stochastic reaction-diffusion system: where
• $\epsilon \in (0, 1]$ is a small parameter representing the ratio of time scales between the two variables of the system $u^\epsilon$ and $v^\epsilon$,
• $u^\epsilon$ and $v^\epsilon$ are the slow and fast components respectively,
• $u, v \in H = L^2([0, L])$ are the initial conditions,
• $\lambda > 0$,
• $f, g : [0, L] \times \mathbb{R} \to \mathbb{R}$ are Lipschitz functions, uniformly with respect to $\xi$, with Lipschitz constants $L_f$, $L_G$ respectively and $L_G < \lambda$,
• $w_1$, $w_2$ are independent white noises both in time and space.
Then it is well known [6] that (49) can be rewritten in the abstract form (2) where In this setting the hypotheses of Theorem 1 are satisfied (recall Remarks 2.1, 2.2), so that the result can be applied.
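The strong rate $\epsilon^{1/2}$ can be observed numerically on a finite-dimensional toy system. This is only an illustration under assumptions of ours and not the reaction-diffusion system of Example 1: the slow equation is $dU = V\,dt$ (no noise, for simplicity), the fast equation is the scalar Ornstein-Uhlenbeck process $dV = -\frac{1}{\epsilon} V\,dt + \frac{1}{\sqrt{\epsilon}}\,dw$, both started at $0$, and the scheme is an explicit Euler-Maruyama discretization with a step proportional to $\epsilon$. The invariant measure of the fast motion is centered, so the averaged slow component is $\bar U \equiv 0$ and the error reduces to $U^\epsilon_T$ itself.

```python
import math
import random

def rms_error(eps, T=1.0, n_paths=400, steps_per_eps=10, seed=0):
    """Root-mean-square of U^eps_T - U_bar_T for the toy system (an assumption)
        dU = V dt,   dV = -(1/eps) V dt + (1/sqrt(eps)) dw,   U_0 = V_0 = 0,
    whose averaged slow component is U_bar = 0."""
    rng = random.Random(seed)
    dt = eps / steps_per_eps            # resolve the fast time scale
    n_steps = int(round(T / dt))
    noise_scale = math.sqrt(dt / eps)   # std of (1/sqrt(eps)) * Brownian increment
    acc = 0.0
    for _ in range(n_paths):
        u, v = 0.0, 0.0
        for _ in range(n_steps):
            u += v * dt
            v += -(v / eps) * dt + noise_scale * rng.gauss(0.0, 1.0)
        acc += u * u
    return math.sqrt(acc / n_paths)

e_coarse = rms_error(0.04)
e_fine = rms_error(0.01)
# Dividing eps by 4 should roughly halve the error if the strong order is 1/2.
print(e_coarse, e_fine, e_coarse / e_fine)
```

For this linear toy model the variance of $U^\epsilon_T$ can be computed by hand and equals $\epsilon T$ up to terms of order $\epsilon^2$, so the observed ratio close to $2$ is expected; the Monte Carlo estimate carries a statistical error of a few percent with the sample size used here.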
A Proof of Lemma 4.5
Proof. First observe that is the family of cylindrical sets. Consider $B_1 \in \mathcal{H}^t_0(v)$ and $B_2 \in \mathcal{H}^\infty_{s+t}(v)$, i.e.
First by the tower property of conditional expectations we have: where By iteration we have: so that by (51) we have: In a similar way we have: It follows

B Proof of Lemma 4.6
Proof. As $|\xi_1| \le 1$ a.s. we have: where we have defined Similarly, as $|\xi_2| \le 1$ a.s., we have: where we have defined Then it follows: Then we have: Consider $T_{1,R}$, then we have:
For $T_{2,R}$, by Hölder's and Markov's inequalities we have: and then by (26) we have: For the first term of $T_{3,R}$ we have: The other terms can be treated in an analogous way and then, similarly to before, we have: Now by inserting the inequalities for $T_{i,R}$ into the first equation we have: By minimizing over $R > 0$ the right-hand side of the previous inequality we have: .

Remark 2.1. Hypothesis 1 is necessary in the proof of Proposition 3. It holds for example when $A_1 = \Delta$ is the Laplacian on $[0, L]$. Indeed, in this case it is well known that $\alpha_k = C k^2$ and then the two series converge for example by choosing $\zeta = 3/5$, $n = 3$, $\beta = 1/5$.
From Hypothesis 1 we have the following spectral representation:
$$e^{A_1 t} x = \sum_{k=1}^{\infty} e^{-\alpha_k t} \langle x, e_k \rangle e_k, \quad \forall t > 0. \tag{3}$$
Hypothesis 1. $A_1 : D(A_1) \subset H \to H$ is a linear operator generator of an analytic semigroup $e^{A_1 t}$ on $H$, $t \ge 0$. Moreover there exist an orthonormal basis $\{e_k\}_{k \in \mathbb{N}}$ of $H$ and $\{\alpha_k\}_{k \in \mathbb{N}} \subset (0, +\infty)$ such that $A_1 e_k = -\alpha_k e_k$ for every $k \in \mathbb{N}$.
Lemma 6.3 is inspired by [7, Lemma 4.2] and follows by the mixing properties of the fast motion studied in Section 4, in particular Lemma 4.7. In order to prove it we proceed with similar techniques to the ones of the proof of [7, Lemma 4.2]. However, note that Lemma 6.3 is a slightly stronger result than [7, Lemma 4.2]: indeed in Lemma 6.3 the absolute value is inside the integral, while in [7, Lemma 4.2] it is outside.