Large deviation principle for fractional Brownian motion with respect to capacity

We show that fractional Brownian motion (fBM) defined via the Volterra integral representation with Hurst parameter $H\geq\frac{1}{2}$ is a quasi-surely defined Wiener functional on the classical Wiener space, and we establish the large deviation principle (LDP) for such fBM with respect to the $(p,r)$-capacity on the classical Wiener space in Malliavin's sense.


Introduction
Quasi-sure analysis on the (infinite-dimensional) Wiener space was initiated by Malliavin (see e.g. [34,35], and [36] for a definitive account) and by Fukushima in the setting of Dirichlet forms (see e.g. [15]). According to Malliavin, it was created as a regularity theory for studying functions on the Wiener space. In the past, the typical Wiener functionals considered in stochastic analysis were solutions of Itô's stochastic differential equations, and therefore many results in quasi-sure analysis were established for Brownian motion and diffusion processes. Important quasi-sure results have been obtained (see e.g. [1,4,7,10,14,16,17,19,25,31,43,44,46] and the literature therein). A regularity theory for Wiener functionals is all the more important because, in the infinite-dimensional case, the laws of Wiener functionals are typically singular, for example the laws of diffusions with different diffusion coefficients. Malliavin's original idea of quasi-sure analysis was based on the simple observation that most interesting Wiener functionals are smooth in the sense of Malliavin differentiation, so it is possible to create a uniform measure and setting in which Wiener functionals can be studied as in (finite-dimensional) real analysis. The capacities $c_{p,r}$ on the Wiener space were therefore introduced in terms of Malliavin derivatives and have been studied by various authors (see e.g. [37,38,36,14,25,26,27] and the literature therein).
On the other hand, the analysis of rough paths, initiated by Lyons (see e.g. [33,32,13,11] for details), was developed in order to study solutions of differential equations driven by semi-martingales and other interesting rough signals. It turns out that rough path analysis is a useful tool for studying the quasi-sure properties of Wiener functionals. Many techniques developed in rough path analysis can in fact be used in quasi-sure analysis, as demonstrated in [4] and in the work of various researchers (see e.g. [21,23,22,24,29,40]).
Large deviations (see e.g. [8,9,45,6]) for various distributions were developed because of their significance in statistics and in statistical mechanics. The classical theory of the large deviation principle (LDP for short) complements the central limit theorem by telling us that the convergence of tail distributions is exponentially fast. The power of this theory is revealed in Cramér's theorem, which provides the rate function.

Preliminaries and the main result
In this section, we introduce basic definitions and state the main result.

Malliavin differentiation and capacities
We mainly follow the notation used in Ikeda and Watanabe's book (see Chapter V, Section 8, [20]). Although our presentation applies to the multi-dimensional case as well, we consider the one-dimensional Wiener space for simplicity. Let $W$ be the space of all continuous paths valued in $\mathbb{R}$ over the time interval $[0,1]$ starting at the origin, equipped with the uniform norm $\|\omega\| = \sup_{t\in[0,1]}|\omega(t)|$ and the corresponding Borel $\sigma$-algebra $\mathcal{B}(W)$. The functions which send each $\omega \in W$ to its coordinates $\omega(t)$ (where $t \geq 0$) are denoted by $\omega(t)$ too, or by $\omega_t$. The Wiener measure $P$ is the distribution of standard Brownian motion, that is, the unique probability measure on $(W, \mathcal{B}(W))$ such that $\{\omega(t) : t \geq 0\}$ is a standard Brownian motion. Let $\mathcal{F}$ be the completion of $\mathcal{B}(W)$ under $P$. Wiener functionals, by convention in the literature, are $\mathcal{F}$-measurable functions on $W$.
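Let $\mathcal{H}$ denote the Cameron-Martin space, which is a Hilbert space of all square integrable functions $h$ on $[0,1]$ such that $h(0)=0$ and whose generalized derivative $\dot{h}$ is square integrable too. The inner product on $\mathcal{H}$ is given by $\langle h, g\rangle_{\mathcal{H}} = \int_0^1 \dot{h}(s)\dot{g}(s)\,ds$. For $h \in \mathcal{H}$, we write $[h]$ for the Wiener integral $\int_0^1 \dot{h}(s)\,d\omega(s)$. A Wiener functional $F$ is called a smooth random variable if $F = f([h_1],\dots,[h_n])$ for some $h_1,\dots,h_n \in \mathcal{H}$ and $f \in C_p^\infty(\mathbb{R}^n)$ (a smooth function such that $f$ and all of its partial derivatives have polynomial growth). The collection of all smooth random variables is denoted by $\mathcal{S}$. The Malliavin derivative $DF$ of $F$ is defined as an $\mathcal{H}$-valued random variable: $DF = \sum_{i=1}^n \partial_i f([h_1],\dots,[h_n])\,h_i$, where $\partial_i f$ denotes the partial derivative of $f$ in the $i$-th component. The higher-order derivatives $D^lF$, $l \geq 1$, are defined inductively. The Sobolev norm $\|F\|_{D^p_r}$ of a smooth random variable $F$ is defined to be $\|F\|_{D^p_r} = \big(\sum_{l=0}^r E\big[\|D^lF\|^p_{\mathcal{H}^{\otimes l}}\big]\big)^{1/p}$, where $r = 0,1,2,\dots$ and $1 \leq p < \infty$. The completion of $\mathcal{S}$ with respect to this norm is denoted by $D^p_r$. The concept of capacities on the Wiener space plays the central role in what follows. For given $p \geq 1$ and $r = 0,1,2,\dots$, the capacity $c_{p,r}$ is a sub-additive set function on the Wiener space, which can be defined in two steps. First, for an open subset $O \subset W$, $c_{p,r}(O) = \inf\{\|\varphi\|_{D^p_r} : \varphi \in D^p_r,\ \varphi \geq 1 \text{ a.e. on } O,\ \varphi \geq 0 \text{ a.e. on } W\}$. Second, for an arbitrary subset $A \subset W$, $c_{p,r}(A) = \inf\{c_{p,r}(O) : A \subset O,\ O \text{ open}\}$.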
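Let us recall a few elementary properties of the capacities $c_{p,r}$ (see e.g. Section 1.2, Chapter IV, [36]). By definition every capacity is an outer measure, so that $c_{p,r}(A) \leq c_{p,r}(B)$ for $A \subset B \subset W$, and $c_{p,r}$ is sub-additive: $c_{p,r}\big(\bigcup_{n=1}^\infty A_n\big) \leq \sum_{n=1}^\infty c_{p,r}(A_n)$. The capacities $c_{p,r}$ are comparable in the sense that for $A \subset W$, $1 < p < q < \infty$ and $r < s$, we have $c_{p,r}(A) \leq c_{q,r}(A)$ and $c_{p,r}(A) \leq c_{p,s}(A)$. In particular, by definition, $(P(A))^{1/p} \leq c_{p,r}(A)$. The first Borel-Cantelli lemma holds true for a capacity $c_{p,r}$: if $\{A_n\}_{n=1}^\infty$ is a sequence of subsets of $W$ such that $\sum_{n=1}^\infty c_{p,r}(A_n) < \infty$, then $c_{p,r}(\limsup_{n\to\infty} A_n) = 0$. Another important tool used in this paper is the capacity version of the Chebyshev inequality, which says that $c_{p,r}(\{\omega : |\varphi(\omega)| > \lambda\}) \leq C_{p,r}\,\lambda^{-1}\|\varphi\|_{D^p_r}$ for $\varphi \in D^p_r$ and $\lambda > 0$, where $C_{p,r}$ is a constant depending only on $p$ and $r$.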
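The first Borel-Cantelli lemma for capacities uses nothing beyond monotonicity and countable sub-additivity. The following short derivation is a sketch under that assumption only, and is not taken from [36]:

```latex
\begin{align*}
c_{p,r}\Big(\limsup_{n\to\infty} A_n\Big)
  &= c_{p,r}\Big(\bigcap_{N\ge 1}\,\bigcup_{n\ge N} A_n\Big) \\
  &\le c_{p,r}\Big(\bigcup_{n\ge N} A_n\Big)
   \le \sum_{n\ge N} c_{p,r}(A_n)
   \qquad \text{for every } N \ge 1 .
\end{align*}
% If \sum_n c_{p,r}(A_n) < \infty, the tail sum tends to 0 as N \to \infty,
% which forces c_{p,r}(\limsup_n A_n) = 0.
```

Note that, unlike the probabilistic version, no additivity on disjoint sets is needed, which is why the lemma carries over to capacities unchanged.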

The main result
We are now ready to introduce the definition of a large deviation principle (LDP) with respect to capacities, and to state the main result.
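Recall that an fBM $(B_t)_{t\geq 0}$ with Hurst parameter $H \in (0,1)$ is a self-similar Gaussian process with stationary increments whose covariance function is $R(t,s) = E[B_tB_s] = \frac{1}{2}\big(t^{2H} + s^{2H} - |t-s|^{2H}\big)$. The fBM can be realised as a Wiener functional by $B_t(\omega) = \int_0^t K(t,s)\,d\omega(s)$ (2.1), where $\{\omega(s) : s \geq 0\}$ is the coordinate process on $W$ (hence a Brownian motion). $B_t$ is defined almost surely; the integral on the right-hand side is understood in the Itô sense. $K$ is a singular kernel given by $K(t,s) = c_H\, s^{\frac{1}{2}-H}\int_s^t (u-s)^{H-\frac{3}{2}}u^{H-\frac{1}{2}}\,du$ for $s < t$ when $H > \frac{1}{2}$, with $c_H$ some constant depending only on $H$. To state the large deviation principle for fBMs, we need to identify their rate functions, which must be the same as the rate functions in the context of probabilities. Let $Q = P \circ B^{-1}$ be the push-forward of the Wiener measure, which is a Gaussian measure on $(W, \mathcal{B}(W))$, whose Cameron-Martin space is denoted by $\hat{\mathcal{H}}$. Let $\{Q_\varepsilon\}$ be the family of scaled measures, the laws of $\{\varepsilon\omega\}$ under $Q$. By definition $Q_\varepsilon(A) = Q(\varepsilon^{-1}A)$ for each $A \in \mathcal{B}(W)$. According to Theorem 3.4.12 on page 88 in [8], $\{Q_\varepsilon\}$ satisfies the LDP with good rate function $I$ given by $I(\omega) = \frac{1}{2}\|\omega\|^2_{\hat{\mathcal{H}}}$ if $\omega \in \hat{\mathcal{H}}$, and $I(\omega) = +\infty$ otherwise.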
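The defining properties of fBM recalled above can be checked numerically from the covariance function alone. The sketch below assumes the standard covariance $R(t,s) = \frac{1}{2}(t^{2H} + s^{2H} - |t-s|^{2H})$; the function names `fbm_covariance` and `increment_variance` are illustrative and do not come from the paper.

```python
import math


def fbm_covariance(t: float, s: float, hurst: float) -> float:
    """Covariance E[B_t B_s] = (t^{2H} + s^{2H} - |t - s|^{2H}) / 2 of an fBM."""
    two_h = 2.0 * hurst
    return 0.5 * (t ** two_h + s ** two_h - abs(t - s) ** two_h)


def increment_variance(t: float, s: float, hurst: float) -> float:
    """Variance of B_t - B_s, computed only from the covariance function."""
    return (fbm_covariance(t, t, hurst)
            + fbm_covariance(s, s, hurst)
            - 2.0 * fbm_covariance(t, s, hurst))


if __name__ == "__main__":
    hurst = 0.75
    # Stationary increments: Var(B_t - B_s) depends on |t - s| only.
    print(increment_variance(0.9, 0.4, hurst), 0.5 ** (2 * hurst))
    # Self-similarity: R(at, as) = a^{2H} R(t, s).
    print(fbm_covariance(0.6, 1.4, hurst),
          2.0 ** (2 * hurst) * fbm_covariance(0.3, 0.7, hurst))
```

For $H = \frac{1}{2}$ the covariance reduces to $\min(t,s)$, the covariance of Brownian motion, which is the boundary case covered by the main result.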
Now we are in a position to state the main result of the present paper.
Theorem 2.2. Let $r \in \mathbb{N}$, $p > 1$ and $\frac{1}{2} \leq H < 1$. Then we have the following conclusions.
1) There is a modification of $B_t$ (for $t \geq 0$) given by the integral representation (2.1) of the fractional Brownian motion with Hurst parameter $H$, which is defined $(p,r)$-quasi-surely.
2) Let $X^\varepsilon_t(\omega) = B_t(\varepsilon\omega)$ for all $\omega$ except for a $c_{p,r}$-zero subset (for all $t \geq 0$, $\varepsilon > 0$). Then $\{X^\varepsilon : \varepsilon > 0\}$ (which are scaled fBMs with Hurst parameter $H \in [\frac{1}{2}, 1)$) satisfies the $c_{p,r}$-LDP with good rate function $I(\omega) = \frac{1}{2}\|\omega\|^2_{\hat{\mathcal{H}}}$ if $\omega \in \hat{\mathcal{H}}$, and $I(\omega) = +\infty$ otherwise. (2.3)
The remainder of the paper is devoted to the proof of the main result, Theorem 2.2. The first part of Theorem 2.2 follows directly from Theorem 4.2 in Section 4. The proof of the second part of Theorem 2.2 will be presented in Section 5 and Section 6.
Our proof strategy can be described as follows. For each fixed $t$, we define approximations $B^{(m)}_t$ of $B_t$ and show that they converge quasi-surely (with respect to the Brownian motion capacity). The main difficulty here is that the kernel $K(t,s)$ is singular in $s$, so it is very difficult to control its increment in $s$ and to estimate the integral of $K$ over a small time interval near $s = 0$. However, we notice that $K$, as a function of $t$, behaves more regularly. Therefore, we control the difference $K(t,s+\alpha) - K(t,s)$ by the difference of $K$ when $t$ varies. Then we may define mappings $X^{(n)}$ on the Wiener space by linear interpolation of the values of $B$ at dyadic times. For each $n$, $X^{(n)}$ is quasi-surely defined, and we show that the sequence $X^{(n)}$ converges to some mapping $X$ quasi-surely. As any countable union of capacity zero sets still has zero capacity, we conclude that the limit $X$ is quasi-surely defined, and we shall also see in the proof that this convergence is exponentially fast. Since the large deviation principle may be established for the $X^{(n)}$'s, we deduce the LDP for the limit mapping $X$ by using the result on exponentially good approximations from LDP theory.

Some technical facts
In this section we collect a few technical facts which will be used to prove the quasi sure version of the large deviation principle for fBMs.

Wiener's chaos decomposition
The $n$-th Wiener chaos $\mathcal{H}_n$ (where $n = 0,1,2,\dots$) is the closed subspace of $L^2(W)$ generated by all random variables $H_n([h])$ (where $h \in \mathcal{H}$ with $\|h\|_{\mathcal{H}} = 1$), where $H_n(x)$ is the $n$-th Hermite polynomial. $\mathcal{H}_n$ and $\mathcal{H}_m$ are orthogonal subspaces of $L^2(W)$ when $m \neq n$, and the Wiener chaos decomposition states that $L^2(W, \mathcal{F}, P) = \bigoplus_{n=0}^\infty \mathcal{H}_n$ (see e.g. Theorem 1.1.1, Section 1.1 in [42] and its proof). The projection from $L^2(W)$ to $\mathcal{H}_n$ is denoted by $J_n$. Let $\mathcal{P}^0_n$ denote the space of polynomial random variables of the form $F = p([h_1],\dots,[h_k])$, where $h_1,\dots,h_k \in \mathcal{H}$ and $p$ is a polynomial with degree less than or equal to $n$, and let $\mathcal{P}_n$ be its closure in $L^2(W)$. Let $(T_t)_{t\geq 0}$ be the Ornstein-Uhlenbeck semigroup on $L^2(W)$. Then $(T_t)_{t\geq 0}$ extends to a semigroup of contractions on $L^p(W)$ for every $p > 1$, where for each $t \geq 0$, $T_tF = \sum_{n=0}^\infty e^{-nt}J_nF$ for $F \in L^2(W)$. For $p > 1$, $t > 0$ and $q(t) = e^{2t}(p-1) + 1 > p$, we have $\|T_tF\|_{q(t)} \leq \|F\|_p$ for any $F \in L^p(W)$. This property is called hypercontractivity; for a proof, see e.g. Theorem 1.4.1, Section 1.4, [42].
for any q > 2 and l ≤ n.
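Proof. The first inequality (3.2) is a consequence of hypercontractivity. Take $F \in \mathcal{H}_n$. Then $T_tF = e^{-nt}F$. Set $p = 2$ and $q = q(t) = 1 + e^{2t}$, so $t = \frac{1}{2}\log(q-1)$, and hence by hypercontractivity, $\|F\|_q = e^{nt}\|T_tF\|_q \leq e^{nt}\|F\|_2 = (q-1)^{n/2}\|F\|_2$. If $F = \sum_{m=0}^n J_mF \in \mathcal{P}_n$, then $\|F\|_q \leq \sum_{m=0}^n \|J_mF\|_q \leq \sum_{m=0}^n (q-1)^{m/2}\|J_mF\|_2 \leq (n+1)(q-1)^{n/2}\|F\|_2$. To prove (3.3), we apply (3.1) to $F \in \mathcal{P}_n$, where $F \in D^2_l$ for $l \leq n$ and the $J_mF$'s vanish when $m > n$. Notice that $\|F\|_2^2 = \sum_{m=0}^n \|J_mF\|_2^2$, so that (3.3) follows, and the proof is complete.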
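The hypercontractive estimate used in the proof can be tested on a concrete element of the second chaos. The sketch below assumes the standard consequence $\|F\|_q \leq (q-1)^{n/2}\|F\|_2$ for $F$ in the $n$-th chaos, and takes $F = \xi^2 - 1$ with $\xi$ a standard Gaussian (so $n = 2$); the helper names are illustrative only.

```python
import math


def gaussian_moment(k: int) -> int:
    """E[xi^k] for a standard Gaussian xi: 0 for odd k, (k - 1)!! for even k."""
    if k % 2 == 1:
        return 0
    result = 1
    for j in range(1, k, 2):
        result *= j
    return result


def second_chaos_norm(q: int) -> float:
    """L^q norm of F = xi^2 - 1 for an even integer q, via the binomial theorem."""
    moment = sum(
        math.comb(q, j) * (-1) ** (q - j) * gaussian_moment(2 * j)
        for j in range(q + 1)
    )
    return moment ** (1.0 / q)


def hypercontractive_bound(q: int, chaos_order: int, l2_norm: float) -> float:
    """Right-hand side (q - 1)^{n/2} * ||F||_2 of the moment comparison."""
    return (q - 1) ** (chaos_order / 2) * l2_norm


if __name__ == "__main__":
    l2_norm = second_chaos_norm(2)
    for q in (4, 6, 8):
        print(q, second_chaos_norm(q), hypercontractive_bound(q, 2, l2_norm))
```

The point of the estimate is that all $L^q$ norms on a fixed chaos are controlled by the $L^2$ norm, which is what allows capacity estimates to be reduced to second-moment computations for Gaussian quantities.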

Exponential tightness
Most conclusions in the theory of large deviations (see [8,6] for details) are still valid in the context of capacity LDP. Let us state those which will be used in this paper; their proofs are routine and will be omitted.
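Theorem 3.2. Suppose that the family $\{X^\varepsilon : \varepsilon > 0\}$ of $c_{p,r}$-quasi-surely defined maps satisfies the $c_{p,r}$-LDP with good rate function $I$, and that $F$ is a continuous map. Then the family $\{F \circ X^\varepsilon : \varepsilon > 0\}$ of $c_{p,r}$-quasi-surely defined maps satisfies the $c_{p,r}$-LDP with good rate function $I'(y') = \inf\{I(y) : F(y) = y'\}$.
In stochastic analysis, the Wiener functionals we are interested in are not continuous, but are smooth in the sense of Malliavin differentiation. To deal with measurable functions, the concept of exponential tightness is a useful technique in proving large deviation principles. The natural modification of this notion can be formulated as follows.
Definition 3.3. Let $\{X^{\varepsilon,(m)} : \varepsilon > 0\}$ (where $m = 1,2,\dots$) and $\{X^\varepsilon : \varepsilon > 0\}$ be two families of quasi-surely defined mappings from $W$ to $(Y,d)$. $\{X^{\varepsilon,(m)} : \varepsilon > 0\}$ is said to be a family of exponentially good approximations of $\{X^\varepsilon : \varepsilon > 0\}$ under the $(p,r)$-capacity if for all $\lambda > 0$, $\lim_{m\to\infty}\limsup_{\varepsilon\to 0}\varepsilon^2\log c_{p,r}\big(\{\omega : d(X^{\varepsilon,(m)}(\omega), X^\varepsilon(\omega)) > \lambda\}\big) = -\infty$.
The following version of the contraction principle, which was discovered in dealing with Wiener functionals, will be useful as well in our context.
Proposition 3.4. Suppose that for each $m = 1,2,\dots$, the family $\{X^{\varepsilon,(m)} : \varepsilon > 0\}$, consisting of quasi-surely defined mappings from $W$ to $(Y,d)$, satisfies the $c_{p,r}$-LDP with good rate function $I_m$, and that $\{X^{\varepsilon,(m)} : \varepsilon > 0\}$ are exponentially good approximations of quasi-surely defined mappings $\{X^\varepsilon : \varepsilon > 0\}$. Define the function $J(y) = \sup_{\lambda > 0}\liminf_{m\to\infty}\inf_{z \in B(y,\lambda)} I_m(z)$, (3.5) where $B(y,\lambda)$ denotes the open ball in $(Y,d)$ with centre $y$ and radius $\lambda$. If $J$ is a good rate function and for every closed set $C \subset Y$, $\inf_{y\in C} J(y) \leq \limsup_{m\to\infty}\inf_{y\in C} I_m(y)$, then the family $\{X^\varepsilon : \varepsilon > 0\}$ satisfies the $c_{p,r}$-LDP with good rate function $J$.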

FBMs as Wiener functionals
Let $B = (B_t)_{t\geq 0}$ be the fBM defined by (2.1) as Wiener functionals. According to the transfer principle in [42] (see Proposition 5.2.1, page 288, [42]), $B_t \in D^p_r$, and its Malliavin derivative may be computed explicitly as in the following lemma.
Lemma 4.1. Let $H \in (0,1)$, $r \in \mathbb{N}$ and $p \in (1,\infty)$. Then $B_t \in D^p_r$ (for every $t > 0$) and its first order Malliavin derivative is given by $DB_t(u) = \int_0^{u\wedge t} K(t,s)\,ds$ for $u \in [0,1]$. The higher-order derivatives of $B_t$ vanish.
Let us first recall the definition of B and B (m) Its higher order Malliavin derivatives vanish identically. The first part of the main result Theorem 2.2 is a consequence of the following converges (p, r)-quasi-surely to some limit denoted by B t too, which is also the limit of B Proof. The proof is quite technical and will be divided into several steps. Let us begin with the simple fact that Therefore, by the first Borel-Cantelli lemma for capacity, we only need to show that for all p ∈ (1, ∞) and r ∈ N. Since c p,r is increasing in p and r, it suffices to prove that the above infinite sum is finite for p > 2 and all r ∈ N. Therefore, we shall assume that p > 2 in the sequel.
Step 1. In this step, we convert our problem from estimating capacity to estimating L 2 -norm of Gaussian random variables. By Chebyshev's inequality we have is a polynomial functional of degree 2, and for all p > 2 and 0 ≤ l ≤ r, and therefore where C r,p = 36(r + 1)(p − 1)2 r 2 depends only on r and p. The L 2 -norm on the right-hand side of the previous inequality can be handled as the following. By definition (4.1),

The integral term in $B^{(m)}_t(\omega)$ may be split into two parts, i.e. for each $i \in \{0,1,\dots,2^m-1\}$, Therefore, we deduce that (4.5) Since Brownian motion has independent increments, we have (4.6) Step 2. In this step, we further simplify our problem using a rather simple observation. By change of variables, for each $i \in \{0,\dots,2^m-1\}$, Using the definition of $K$, we observe that for all $\alpha \in (0,t-s)$, and hence On the other hand, for every $\alpha \in (0,r)$, By setting $r = s+\alpha$, we deduce that (4.7) and (4.8), and for all $s < t - \frac{t}{2^{m+1}}$, let Then $L_i \leq M_i \leq U_i$ for each $i$, and it thus follows that Step 3. In this step, we find upper bounds for $L_i^2$ and $U_i^2$. We first find a control of For all $r$ between $\frac{t}{2^{m+1}}$ and $t$, consider the function Then $f_r(0) = K(t,r)$ and Therefore, We may compute the derivative of $f_r$, which is where $\partial_1K(t,s)$ denotes the partial derivative of $K$ with respect to the first variable. Denote Then for all $x < r$, $g_r(x) \geq 0$ and $h_r(x) \geq 0$. By Hölder's inequality, we obtain that (4.10) We control the integral of $g_r$ first. When $H > \frac{1}{2}$, since $t \leq 1$, Consequently, where $C_1$ is a constant depending only on $H$. As for $h_r$, due to change of variables, with $C_2$ some constant depending only on the value of $H$. Using (4.10), we get that where $c_1$ and $c_2$ are two positive constants. Next, we move on to the estimate for the $U_i$'s. By the definition of the $U_i$'s and Hölder's inequality, we have that (4.14) Step 4. We complete our proof using the above estimates in this step. It follows from (4.9), (4.13) and (4.14) that where $c_3$ and $c_4$ are some constants.
Therefore, for any λ > 0, it holds that where C r,p,H is a suitable constant depending on r, p and H only. Hence, if we choose δ small enough such that 0 and applying the first Borel-Cantelli lemma as before. If for ω ∈ W , there are infinitely many k's such that B (ω)) k≥0 is not Cauchy. Therefore, by Chebyshev's inequality, and hence by the first Borel-Cantelli lemma, As a consequence, (B (m k ) t ) k≥0 converges to B t apart from on a slim set, and the uniqueness of limit forces its limit to beB t , which implies B t =B t q.s.
From now on, we work with one modification of B which is defined as a quasi-sure limit of the approximations B (m) .

Exponential tightness of the approximation sequence
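For each fixed $t$, $B_t$ is quasi-surely defined (with $B_0(\omega) = 0$ for all $\omega \in W$), so we may define a map $X^{(m)} : W \to W$ by $X^{(m)}_t(\omega) = B_{t^m_{k-1}}(\omega) + 2^m(t - t^m_{k-1})\big(B_{t^m_k}(\omega) - B_{t^m_{k-1}}(\omega)\big)$ for $t \in [t^m_{k-1}, t^m_k]$ and $k = 1,\dots,2^m$, where $t^m_k = \frac{k}{2^m}$. Then $X^{(m)}$ is quasi-surely defined as it is a linear interpolation of finitely many $B_{t^m_k}$'s. For each $m$, let $X^{\varepsilon,(m)}$ be the scaled map, which is defined by $X^{\varepsilon,(m)}(\omega) = X^{(m)}(\varepsilon\omega)$. As $B_t$ is the limit of linear combinations of $\omega_t$'s, it follows that $X^{\varepsilon,(m)}(\omega) = \varepsilon X^{(m)}(\omega)$.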
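The dyadic linear interpolation behind $X^{(m)}$, and the scaling identity $X^{\varepsilon,(m)} = \varepsilon X^{(m)}$, can be illustrated on a finite vector of values. The sketch below assumes the interpolation takes the values of the path at the times $k/2^m$ and joins them linearly; the function name `dyadic_interpolation` and the sample values are hypothetical.

```python
def dyadic_interpolation(values: list[float], t: float) -> float:
    """Evaluate at t in [0, 1] the piecewise-linear path through values[k] at time k / 2^m.

    `values` must have 2^m + 1 entries, one for each dyadic time k / 2^m.
    """
    n_intervals = len(values) - 1  # equals 2^m
    if n_intervals < 1 or not 0.0 <= t <= 1.0:
        raise ValueError("need at least two values and t in [0, 1]")
    # Index of the left endpoint of the dyadic interval containing t.
    k = min(int(t * n_intervals), n_intervals - 1)
    weight = (t - k / n_intervals) * n_intervals  # position inside the interval
    return values[k] + weight * (values[k + 1] - values[k])


if __name__ == "__main__":
    samples = [0.0, 1.0, 0.5, 2.0, 1.0]  # illustrative values at times k / 4
    eps = 0.1
    scaled = [eps * v for v in samples]
    for t in (0.125, 0.375, 1.0):
        # Interpolation is linear in the data, so scaling commutes with it.
        print(t, dyadic_interpolation(scaled, t), eps * dyadic_interpolation(samples, t))
```

Because the interpolation is a continuous linear map of finitely many Gaussian values, it is the kind of map to which a contraction principle can be applied.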
Our goal is to show that the sequence $(X^{(m)})_{m\in\mathbb{N}}$ converges to some $X$ quasi-surely, which implies that $(X^{\varepsilon,(m)})_{m\in\mathbb{N}}$ converges to $X^\varepsilon$ quasi-surely, where the scaled map $X^\varepsilon$ is given by $X^\varepsilon(\omega) = X(\varepsilon\omega) = \varepsilon X(\omega)$. Moreover, the fact that the scaled maps $(X^{\varepsilon,(m)})_{m\in\mathbb{N}}$ converge exponentially fast will be revealed in the proof as well. Since $X^\varepsilon$ is quasi-surely defined with exponentially good approximations $(X^{\varepsilon,(m)})_{m\in\mathbb{N}}$, we may apply the result from LDP theory to conclude the final result. We will need the following estimate from rough path analysis, which is contained in [33] (see Proposition 4.1.1 on page 62 or equation (4.15) on page 64). Here, we adapt the result to our case and state it as follows. Proposition 5.1. Let $u$ and $w$ be two continuous paths in a Banach space. Then for any $q > 1$ and $\gamma > q-1$, there exists a constant $C_{q,\gamma}$ depending only on $q$ and $\gamma$ such that $\sup_D\sum_l |u_{t_{l-1},t_l} - w_{t_{l-1},t_l}|^q \leq C_{q,\gamma}\sum_{n=1}^\infty n^\gamma\sum_{k=1}^{2^n}|u_{t^n_{k-1},t^n_k} - w_{t^n_{k-1},t^n_k}|^q$, where the supremum is taken over all finite partitions $D$ of $[0,1]$, $t^n_l = \frac{l}{2^n}$ for $n = 1,2,\dots$, $l = 0,\dots,2^n$, and $u_{s,t} = u_t - u_s$ is the increment of the path $u$.
Together with Proposition 3.1, the above estimate allows us to convert our problem from controlling capacity to controlling the L 2 -norm of Gaussian processes, which is much easier to work with.
Theorem 5.2. For $r \in \mathbb{N}$ and $1 < p < \infty$, $(X^{(m)})_{m\in\mathbb{N}}$ converges $(p,r)$-quasi-surely to some limit $X$, and the scaled maps $(X^{\varepsilon,(m)})_{m\in\mathbb{N}}$ form a family of exponentially good approximations of $X^\varepsilon$ under the capacity $c_{p,r}$.
Proof. Here we use a technique in the theory of rough paths to control the tails of X (m) 's which are Gaussian. Let us first prove that the sequence X (m) : m ≥ 1 converges uniformly quasi-surely. By using the elementary fact that for any u, w ∈ W and for any q > 1, where the supremum is taken over all possible finite partition of [0, 1], and u s,t = u t − u s , together with Proposition 5.1, we obtain that where C q,γ is a constant depending on q and γ, and t n k = k 2 n . We will apply the above estimate to X (n) to estimate where λ > 0. Since (p, r)-capacity is increasing in p, we shall assume that p > 2. By monotonicity and sub-additivity properties of capacity, we obtain that for θ > 0, We introduce a new parameter N, whose value is to be determined at the end of our proof, and consider Notice that when n ≤ m, X (m) t n k = B t m 2 m−n k . Since t m 2 m−n k = t m+1 2 m+1−n k , we have that X (m+1) t n k−1 ,t n k = X (m) t n k−1 ,t n k . By Chebyshev's inequality, we obtain that a polynomial functional of degree 2N, and N ≥ r 2 , we have that It thus implies that H) is a constant depending only on N and H. We may conclude from (5.5) and (5.6) that for n > m, where C r,p,N,H = (r + 1)(2N + 1)(p − 1) . (5.9) Therefore, according to (5.2), (5.3) and (5.9), .

(5.11)
Applying the same argument as in the previous theorem, we see that the problem may be reduced to proving (5.12) for a suitable $\delta > 0$. Then by the first Borel-Cantelli lemma for capacity, we obtain the quasi-sure convergence of $X^{(m)}$. The series in (5.12) converges as long as we choose $\delta$ such that $\delta < H - \frac{1+\theta}{q} - \frac{1}{2N}$; such a $\delta$ exists because we have chosen $q$ and $\theta$ such that $2N\big(H - \frac{1+\theta}{q}\big) - 1 > 0$ for some $N \in \mathbb{N}$. Convergence of the series in (5.12) implies the convergence of $(X^{(m)})_{m\in\mathbb{N}}$. Denote its limit by $X$; then $X$ is defined quasi-surely on $W$.

The proof of the main result
This section is devoted to the proof of the large deviation principle stated in the second part of the main result, Theorem 2.2.
Notice that for each m, X (m) , which is a Wiener functional on W defined quasi surely, is a linear interpolation of some Gaussian random variables, so we may consider F m : which maps a (2 m + 1)-dimensional vector to its linear interpolation. Let us apply Varadhan's contraction principle to the maps above. As the rate function for the vector-valued Gaussian random variable B 0 , B 1 2 m , · · · , B k 2 m , · · · , B 1 is computable, the quasi sure version of LDP may be established easily for X (m) . attains its minimum, which happens when λ has the same direction as a since the first two terms only depends on the magnitude of λ , so we may write λ = a|λ | Σ |a| −1 Σ . Then the function becomes which is a quadratic function of |λ | Σ , we thus deduce that it reaches its minimum when Therefore, the minimum is attained at where B(K, δ ) = {x ∈ R n : inf y∈K |x − y| Σ < δ }. Let δ → 0, then the upper bound for compact sets is established. Now for any F ⊂ R n closed under Euclidean metric, as all norms on R n are equivalent, F is also closed in (R n , |·| Σ ). For ρ > 0, let H ρ = {x = (x 1 , · · · , x n ) : |x i | ≤ ρ, ∀1 ≤ i ≤ n} be a hypercube in R n . Then by sub-additivity property, for all ρ > 0. The proof is complete by letting ρ → ∞.
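Now we may conclude our proof of the large deviation principle, part 2) of Theorem 2.2. Since $F_m : \mathbb{R}^{2^m+1} \to W$ defined in (6.1) is continuous and $F_m \circ T_\varepsilon = X^{\varepsilon,(m)}$, by the contraction principle (Theorem 3.2) the family $\{X^{\varepsilon,(m)}\}$ satisfies the $c_{p,r}$-LDP with good rate function $J_m(\omega) = \inf_{x : F_m(x) = \omega} I_{2^m+1}(x)$, $\omega \in W$, where we define $\inf\emptyset = \infty$. When $p = 1$ and $r = 0$, the capacity $c_{p,r}$ coincides with the Wiener measure $P$, and we would expect that the classical LDP for fBMs defined on the classical Wiener space holds. Now define $\hat{F}_m : W \to W$ by $\hat{F}_m(\omega) = F_m(\omega_{t^m_0}, \omega_{t^m_1}, \dots, \omega_{t^m_{2^m}})$, where $t^m_k = \frac{k}{2^m}$. Since $\hat{F}_m(\varepsilon B) = X^{\varepsilon,(m)}$ $P$-a.s. on $W$, we obtain that $Q_\varepsilon \circ \hat{F}_m^{-1}(A) = P\{\omega \in W : X^{\varepsilon,(m)}(\omega) \in A\}$.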
Therefore, by the uniqueness of rate functions (see Lemma 4.1.4, Section 4.1.1 in [6]), $J_m$ and $\hat{J}_m$ coincide. As shown in Theorem 5.2, the $X^{\varepsilon,(m)}$ are exponentially good approximations of $\{X^\varepsilon\}$, so it suffices to verify that the function $I$ defined above coincides with the function $J$ given in (3.5) and satisfies all conditions in Proposition 3.4. Let us first check that $I$ satisfies all conditions. We observe that $I$ given in (2.3) is a good rate function by definition.