A variational formula on the Cram\'er function of series of independent random variables

In [11] a variational formula was proved for the Legendre-Fenchel transform of the cumulant generating function (the Cram\'er function) of Rademacher series with coefficients in the space $\ell^1$. In this paper we generalize this formula to series of a larger class of independent random variables with coefficients in the space $\ell^2$.


Introduction
For series $\sum t_i g_i$ of independent standard normal r.v.'s, where $\sum t_i^2 < \infty$ and $g_i \in N(0,1)$, tail estimates of the form
$$P\Big(\sum t_i g_i \ge \alpha\Big) \le \exp\Big(-\frac{\alpha^2}{2\sum t_i^2}\Big), \qquad \alpha > 0,$$
are well known; see for instance [6]. In Section 2 we present a derivation of this estimate and show a variational formula for the function $\alpha^2/\sum t_i^2$. It will turn out that it is given by
$$\frac{\alpha^2}{\sum t_i^2} = \inf\Big\{\sum b_i^2 :\ (b_i) \in \ell^2,\ \sum t_i b_i = \alpha\Big\}.$$
In Section 3 we prove variational formulas for a wider class of series of independent random variables. Let us stress that they mostly do not possess simple forms as in the above Gaussian case. The function $\alpha^2/(2\sum t_i^2)$ is the Cramér transform of the random series $\sum t_i g_i$. The Cramér transform is the Legendre-Fenchel transform of the cumulant generating function of a r.v.
For our purposes we will need the general notion of the Legendre-Fenchel transform in topological spaces (see [3] or [1]). Let $V$ be a real locally convex Hausdorff space and $V^*$ its dual space. By $\langle\cdot,\cdot\rangle$ we denote the canonical pairing between $V$ and $V^*$. Let $f: V \to \mathbb{R}\cup\{\infty\}$ be a function not identically $\infty$. By $D(f)$ we denote the effective domain of $f$, i.e. $D(f) = \{u \in V : f(u) < \infty\}$. The function $f^*: V^* \to \mathbb{R}\cup\{\infty\}$ defined by
$$f^*(u^*) = \sup_{u\in V}\{\langle u, u^*\rangle - f(u)\} \qquad (u^* \in V^*)$$
is called the Legendre-Fenchel transform (convex conjugate) of $f$, and the function $f^{**}: V \to \mathbb{R}\cup\{\infty\}$ defined by
$$f^{**}(u) = \sup_{u^*\in V^*}\{\langle u, u^*\rangle - f^*(u^*)\} \qquad (u \in V)$$
is called the convex biconjugate of $f$.
The functions $f^*$ and $f^{**}$ are convex and lower semicontinuous in the weak* and weak topology on $V^*$ and $V$, respectively. Moreover, the biconjugate theorem states that a function $f: V \to \mathbb{R}\cup\{\infty\}$ not identically equal to $+\infty$ is convex and lower semicontinuous if and only if $f = f^{**}$.
Let us mention additional properties of convex conjugates; see 4.3 Examples in [3]. Let $V$ be a normed space. We denote by $\|\cdot\|$ the norm of $V$ and by $\|\cdot\|_*$ the norm of $V^*$. For conjugate exponents $p, q \in (1,\infty)$ ($\frac1p + \frac1q = 1$), the function $\frac1q\|u^*\|_*^q$ is the convex conjugate of $\frac1p\|u\|^p$.
Remark 1.1. Let us emphasize that in Hilbert spaces the function $\frac12\|u\|^2$ can be treated as invariant with respect to the Legendre-Fenchel transform.
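The conjugacy of $\frac1p|u|^p$ and $\frac1q|v|^q$ can be checked numerically in the one-dimensional case $V = \mathbb{R}$. The following is a minimal sketch, not part of the original argument: the helper `conjugate_on_grid`, the exponent $p = 3$ and the grid are illustrative assumptions, and the supremum is only approximated by a maximum over a finite grid.

```python
import numpy as np

def conjugate_on_grid(f, v, u_grid):
    """Approximate f*(v) = sup_u { u*v - f(u) } by a maximum over a finite grid."""
    return np.max(u_grid * v - f(u_grid))

p = 3.0                      # illustrative exponent (assumption)
q = p / (p - 1.0)            # conjugate exponent: 1/p + 1/q = 1
u_grid = np.linspace(-10.0, 10.0, 200001)   # grid step 1e-4

def f(u):
    """The function |u|^p / p on the real line."""
    return np.abs(u) ** p / p

for v in (-2.0, -0.5, 0.0, 1.0, 3.0):
    approx = conjugate_on_grid(f, v, u_grid)
    exact = abs(v) ** q / q
    print(f"v={v:5.1f}  grid sup={approx:.6f}  |v|^q/q={exact:.6f}")
```

For $p = q = 2$ the same check illustrates Remark 1.1: the grid conjugate of $u^2/2$ agrees with $v^2/2$.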
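Let us list further properties and one lemma. The convex conjugation is order-reversing:
$$\text{if } f \le g \text{ then } f^* \ge g^*,$$
and, for a scalar $a \neq 0$,
$$\big(f(a\,\cdot)\big)^*(u^*) = f^*\Big(\frac{u^*}{a}\Big). \tag{2}$$
Let $V$ and $W$ be normed spaces. The dual space of their product $(V\times W)^*$ is isomorphic to the direct sum of their dual spaces $V^*\oplus W^*$ in the sense that the canonical pairing between $V\times W$ and $(V\times W)^*$ is given by the sum $\langle u, u^*\rangle + \langle w, w^*\rangle$.
Lemma 1.2. Let $f: V \to \mathbb{R}\cup\{\infty\}$ and $g: W \to \mathbb{R}\cup\{\infty\}$ be functions not identically $\infty$ and let $\varphi(u,w) = f(u) + g(w)$. Then $\varphi^*(u^*,w^*) = f^*(u^*) + g^*(w^*)$.
Proof. The convex conjugate of $\varphi$ equals
$$\varphi^*(u^*,w^*) = \sup_{(u,w)\in V\times W}\{\langle u,u^*\rangle + \langle w,w^*\rangle - f(u) - g(w)\} = \sup_{u\in V}\{\langle u,u^*\rangle - f(u)\} + \sup_{w\in W}\{\langle w,w^*\rangle - g(w)\} = f^*(u^*) + g^*(w^*).$$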

A model example
The moment generating function of a standard normal r.v. $g$ equals $Ee^{sg} = e^{s^2/2}$, so its cumulant generating function is $\psi_g(s) = s^2/2$. The function $s^2/2$ is invariant with respect to the Legendre transform (see Remark 1.1), that is, the Cramér transform of $g$ equals $\psi_g^*(\alpha) = \alpha^2/2$.
Let $I \subset \mathbb{N}$ and let $(g_i)_{i\in I}$ be a sequence of independent standard normal r.v.'s. For $t = (t_i)_{i\in I} \in \ell^2(I) \equiv \ell^2$ consider $X_t = \sum_{i\in I} t_i g_i$ (convergence in $L^2$ and a.s.). Observe that $EX_t^2 = \sum_{i\in I} t_i^2 = \|t\|^2$. Thus the subset $\{X_t : t \in \ell^2\}$ of $L^2$ is isomorphic to $\ell^2$. The cumulant generating function of $X_t$ will be denoted by $\psi_t$, i.e. $\psi_t = \psi_{X_t}$. Notice that for fixed $s$ we can consider $\psi_t(s)$ as a functional of the variable $t$ in $\ell^2$. We will denote it by $\psi_s$. Let us emphasize that $\psi_s(t) = \psi_t(s)$.
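Let $I_N$ denote the set $\{1, 2, \ldots, N\} \cap I$. For $t \in \ell^2(I)$ let $t^N$ be defined as follows: $t^N_i = t_i$ for $i \in I_N$ and $t^N_i = 0$ otherwise. By independence, $\psi_s(t^N) = \sum_{i\in I_N} \frac{s^2t_i^2}{2} = \frac{s^2}{2}\|t^N\|^2$, and since $X_t$ is a centered Gaussian r.v. with variance $\|t\|^2$, we have $\psi_s(t) = \frac{s^2}{2}\|t\|^2$.
The function $\frac12\|t\|^2$ is invariant with respect to the convex conjugate on $\ell^2$ (see Remark 1.1). Taking in (2) $f(t) = \frac12\|t\|^2$ and $a = s$ we have $\psi_s(t) = f(st)$ and get
$$(\psi_s)^*(a) = \frac12\Big\|\frac{a}{s}\Big\|^2 = \frac{\|a\|^2}{2s^2}$$
for $s \neq 0$. The function $\psi_s(t) = \frac{s^2}{2}\|t\|^2$ is convex and continuous on $\ell^2$. By the biconjugate theorem we have
$$\psi_s(t) = \sup_{a\in\ell^2}\Big\{\langle a,t\rangle - \frac{\|a\|^2}{2s^2}\Big\}.$$
Substituting $a = sb$ into the right-hand side we may rewrite the above as follows:
$$\psi_s(t) = \sup_{b\in\ell^2}\Big\{s\langle b,t\rangle - \frac12\|b\|^2\Big\}.$$
If we split the supremum into two parts, over $\mathbb{R}$ and over the hyperplanes $\{b \in \ell^2 : \langle b,t\rangle = \text{constant}\}$, then we get
$$\psi_s(t) = \sup_{\alpha\in\mathbb{R}}\Big\{s\alpha - \inf_{b\in\ell^2:\ \langle b,t\rangle=\alpha}\frac12\|b\|^2\Big\}. \tag{3}$$
We should remember that $\psi_s(t) = \psi_t(s)$. Using (2) again, but now for $f(x) = \frac12x^2$ and $a = \|t\|$, we obtain that
$$\psi_t^*(\alpha) = \frac{\alpha^2}{2\|t\|^2}.$$
This is the exponent in the above-mentioned tail estimate for the separable Gaussian process given by the series $\sum t_i g_i$. The function $\psi_t(s) = \frac12(\|t\|s)^2$ is convex and continuous on $\mathbb{R}$. By the biconjugate theorem we have that
$$\psi_t(s) = \sup_{\alpha\in\mathbb{R}}\Big\{s\alpha - \frac{\alpha^2}{2\|t\|^2}\Big\}. \tag{4}$$
Comparing (3) and (4) we see (for the general case see Th.3.3) that
$$\frac{\alpha^2}{2\|t\|^2} = \inf_{b\in\ell^2:\ \langle b,t\rangle=\alpha}\frac12\|b\|^2,$$
which we could also check by using the Lagrange multipliers technique.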
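The Gaussian variational formula can be illustrated numerically on a finite truncation. The sketch below assumes a hypothetical coefficient vector `t` with four entries and a hypothetical level `alpha`; it takes the candidate minimizer $b^* = \alpha t/\|t\|^2$ suggested by the Lagrange multipliers technique and compares $\frac12\|b^*\|^2$ with $\alpha^2/(2\|t\|^2)$ and with the values at randomly perturbed points of the same hyperplane.

```python
import numpy as np

rng = np.random.default_rng(0)

t = np.array([1.0, -0.5, 0.25, 2.0])   # hypothetical coefficients (finite truncation)
alpha = 1.5                            # hypothetical level

# Candidate minimizer from Lagrange multipliers: b = alpha * t / ||t||^2
b_star = alpha * t / np.dot(t, t)
value_star = 0.5 * np.dot(b_star, b_star)
closed_form = alpha**2 / (2.0 * np.dot(t, t))

def random_feasible_point():
    """Return b_star + h with <t, h> = 0, so that <t, b> = alpha still holds."""
    h = rng.normal(size=t.shape)
    h -= np.dot(h, t) / np.dot(t, t) * t   # remove the component of h along t
    return b_star + h

others = [0.5 * np.dot(b, b) for b in (random_feasible_point() for _ in range(1000))]

print("value at b_star          :", value_star)
print("alpha^2 / (2 ||t||^2)    :", closed_form)
print("smallest perturbed value :", min(others))
```

Every perturbed point stays on the hyperplane $\langle b,t\rangle = \alpha$, so none of the sampled values should fall below the value at $b^*$.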

Main Theorem
The cumulant generating function $\psi_X(s) = \ln Ee^{sX}$ of any random variable $X$ is convex and lower semicontinuous on $\mathbb{R}$ (analytic on $\mathrm{int}\,D(\psi_X)$). It maps $\mathbb{R}$ into $\mathbb{R}\cup\{\infty\}$ and takes value zero at zero, but it is possible that $\psi_X(s) = \infty$ for every $s \neq 0$. It is reasonable to assume that it is finite on some neighborhood of zero, i.e. that $X$ satisfies the condition: there exists $\lambda > 0$ such that $Ee^{\lambda|X|} < \infty$. Let us emphasize that if $EX = 0$ then $\psi_X \ge 0$, whereas the Cramér transform $\psi_X^*$ is always nonnegative and attains $0$ at the value $EX$. Before proving our main theorem, we show the forms of the cumulant generating function and the Cramér transform of series of independent random variables.
Proposition 3.1. Let $(X_i)_{i\in I}$ be a sequence of zero-mean independent random variables with uniformly bounded second moments. For each $i \in I$ let the cumulant generating function $\psi_i(s) := \psi_{X_i}(s) = \ln Ee^{sX_i}$ be finite on some neighborhood of zero. Then for each $t = (t_i)_{i\in I} \in \ell^2$ the cumulant generating function of the random series $X_t = \sum_{i\in I} t_i X_i$ is given by the following equality:
$$\psi_t(s) = \sum_{i\in I}\psi_i(st_i).$$
Proof. Because the $X_i$ are independent, centered and have uniformly bounded second moments, for every $t \in \ell^2$ the series $X_t = \sum_{i\in I} t_i X_i$ converges in $L^2$ and a.s. Let us emphasize that convergence of a sequence $t^n \to t$ in $\ell^2$ implies convergence of $X_{t^n} \to X_t$ in $L^2$. For fixed $s$ we can consider the cumulant generating function $\psi_t(s)$ as a functional of the variable $t$ in $\ell^2$. We will denote it by $\psi_s(t)$, i.e. $\psi_s(t) = \psi_t(s)$. We show that for every $s \in \mathbb{R}$ the functional $\psi_s$ is convex and lower semicontinuous on $\ell^2$.
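Convexity may be checked by using the Hölder inequality. Let $t, u \in \ell^2$ and $\lambda \in (0,1)$. Then
$$\psi_s(\lambda t + (1-\lambda)u) = \ln Ee^{s\sum_{i\in I}(\lambda t_i + (1-\lambda)u_i)X_i} = \ln E\Big[\big(e^{sX_t}\big)^{\lambda}\big(e^{sX_u}\big)^{1-\lambda}\Big] \le \ln\Big[\big(Ee^{sX_t}\big)^{\lambda}\big(Ee^{sX_u}\big)^{1-\lambda}\Big] = \lambda\psi_s(t) + (1-\lambda)\psi_s(u).$$
Let now $t^n \to t$ in $\ell^2$. Then $X_{t^n} \to X_t$ in $L^2$ and hence a.s. along a subsequence, so by the Fatou lemma $Ee^{sX_t} \le \liminf_{n\to\infty} Ee^{sX_{t^n}}$. It means that $\psi_s$ is lower semicontinuous on $\ell^2$.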
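Let $\ell_0$ denote the space of sequences with finite supports. Observe that $\ell_0$ is a dense subset of $\ell^2$. For $t \in \ell_0$ we have, by independence,
$$\psi_s(t) = \ln E\prod_{i\in I}e^{st_iX_i} = \sum_{i\in I}\psi_i(st_i).$$
For $t \in \ell^2$ consider the series $\sum_{i\in I}\psi_i(st_i)$. Since $EX_i = 0$, we have $\psi_i \ge 0$. It follows that $\sum_{i\in I}\psi_i(st_i)$ is convergent or divergent to plus infinity. Since the $\psi_i$ are convex, this series defines a convex function on the whole of $\ell^2$. Let $t^n \to t^0$ in $\ell^2$. Hence $t^n_i \to t^0_i$ for every $i \in I$. By superadditivity of the limit inferior and, next, by lower semicontinuity of each $\psi_i$, we get
$$\liminf_{n\to\infty}\sum_{i\in I}\psi_i(st^n_i) \ge \sum_{i\in I}\liminf_{n\to\infty}\psi_i(st^n_i) \ge \sum_{i\in I}\psi_i(st^0_i).$$
Notice that both functions, $\psi_s(t)$ and the series $\sum_{i\in I}\psi_i(st_i)$, are convex and lower semicontinuous on $\ell^2$ and moreover coincide on $\ell_0$ (a dense subset of $\ell^2$). It follows that these functions are equal on the whole of $\ell^2$, i.e.
$$\psi_s(t) = \sum_{i\in I}\psi_i(st_i).$$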
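For finitely supported $t$ the additivity of the cumulant generating function can be verified exactly in a simple case. The sketch below assumes Rademacher variables (symmetric signs $\pm1$, for which $\psi_i(x) = \ln\cosh x$) and a hypothetical coefficient vector `t`; the expectation is computed by enumerating all sign patterns, so no sampling error is involved.

```python
import itertools
import math

def cgf_by_enumeration(s, t):
    """ln E exp(s * sum_i t_i * eps_i), averaging over all 2^n sign patterns."""
    n = len(t)
    total = 0.0
    for signs in itertools.product((-1.0, 1.0), repeat=n):
        total += math.exp(s * sum(ti * e for ti, e in zip(t, signs)))
    return math.log(total / 2**n)

def cgf_by_sum(s, t):
    """sum_i psi_i(s * t_i) with psi_i(x) = ln cosh(x) for a Rademacher variable."""
    return sum(math.log(math.cosh(s * ti)) for ti in t)

t = [1.0, -0.5, 0.25, 0.125, 2.0]   # hypothetical finitely supported coefficients
for s in (-1.5, 0.3, 2.0):
    print(s, cgf_by_enumeration(s, t), cgf_by_sum(s, t))
```

The two columns should agree up to rounding, which is the finite-support case of the equality in Proposition 3.1.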
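Let us observe that for $s = 0$ we have $\psi_0 \equiv 0$, and its convex conjugate $(\psi_0)^*(a)$ equals $0$ for $a = 0$ and $\infty$ otherwise. From now on we assume that $s \neq 0$. The form of $(\psi_s)^*$ for $s \neq 0$ is described in the following:
$$(\psi_s)^*(a) = \sum_{i\in I}\psi_i^*\Big(\frac{a_i}{s}\Big)$$
for $a \in \ell^2$, where the $\psi_i^*$'s are the Cramér transforms of the $X_i$'s.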
Proof. The convex conjugate $(\psi_s)^*$ is convex and lower semicontinuous on $(\ell^2)^* \simeq \ell^2$. Assume first that $I$ is a finite set. By virtue of the form of $\psi_s$, Lemma 1.2 and the property (2), for $a$ in $\ell^2(I)$ we get
$$(\psi_s)^*(a) = \Big(\sum_{i\in I}\psi_i(s\,\cdot)\Big)^*(a) = \sum_{i\in I}\big(\psi_i(s\,\cdot)\big)^*(a_i) = \sum_{i\in I}\psi_i^*\Big(\frac{a_i}{s}\Big).$$
Define now the functional $\sum_{i\in I}\psi_i^*(\frac{a_i}{s})$ on the whole space $\ell^2$. Since the $\psi_i^*$'s are convex and lower semicontinuous, this functional is convex and, as in the case of $\sum_{i\in I}\psi_i$, one can show that it is also lower semicontinuous on $\ell^2$. Because this functional coincides with $(\psi_s)^*$ on the dense subspace $\ell_0$, both functionals are equal on $\ell^2$.
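Let us emphasize that the functions $(\psi_s)^*$ are nonnegative and lower semicontinuous. In large deviation theory such functions are called rate functions (good rate functions when the level sets are not only closed but also compact). In the main theorem below we show that the contraction principle applied to the function $(\psi_1)^*$ by using the functional $\langle t,\cdot\rangle$ over $\ell^2$ gives the Cramér transform of $X_t$.
Theorem 3.3. Let a sequence of r.v.'s $(X_i)_{i\in I}$ satisfy the assumptions of Proposition 3.1. Then for every $t = (t_i)_{i\in I} \in \ell^2$ the Cramér transform $\psi^*_{X_t} = \psi^*_t$ of the random series $X_t = \sum_{i\in I} t_i X_i$ is given by the following variational formula:
$$\psi_t^*(\alpha) = \inf_{b\in\ell^2:\ \langle t,b\rangle=\alpha}\ \sum_{i\in I}\psi_i^*(b_i)$$
for $\alpha \in \mathrm{int}\,D(\psi_t^*)$, where the $\psi_i^*$'s are the Cramér transforms of the $X_i$'s.
Proof. The functional $\psi_s$ is convex and lower semicontinuous on $\ell^2$. By virtue of the biconjugate theorem we have
$$\psi_s(t) = \sup_{a\in\ell^2}\{\langle t,a\rangle - (\psi_s)^*(a)\},$$
where $(\psi_s)^*(a) = \sum_{i\in I}\psi_i^*(\frac{a_i}{s})$ ($s \neq 0$). Substituting $a = sb$ we get
$$(\psi_s)^*(sb) = \sum_{i\in I}\psi_i^*(b_i) = (\psi_1)^*(b),$$
and we can rewrite the above as follows:
$$\psi_s(t) = \sup_{b\in\ell^2}\{s\langle t,b\rangle - (\psi_1)^*(b)\}. \tag{5}$$
Let us return to the function $\psi_t$, which is convex and lower semicontinuous on $\mathbb{R}$. By the biconjugate theorem we have
$$\psi_t(s) = \sup_{\alpha\in\mathbb{R}}\{s\alpha - \psi_t^*(\alpha)\}.$$
Let us recall that $\psi_t(s) = \psi_s(t)$. If we split the supremum of (5), as in the Model Example, into two parts then we get
$$\psi_t(s) = \sup_{\alpha\in\mathbb{R}}\Big\{s\alpha - \inf_{b\in\ell^2:\ \langle t,b\rangle=\alpha}(\psi_1)^*(b)\Big\}.$$
Let $\varphi_t(\alpha)$ denote the function $\inf_{b\in\ell^2:\ \langle t,b\rangle=\alpha}(\psi_1)^*(b)$. The functional $(\psi_1)^*$ is convex on $\ell^2$. Convexity is preserved under contraction by a linear transformation (see [4, Th.III.32]), so $\varphi_t$ is convex. By the last display $\psi_t = \varphi_t^*$ and hence $\psi_t^* = \varphi_t^{**}$, which may differ from the convex function $\varphi_t$ only at boundary points of its effective domain. It follows that $\varphi_t$ and $\psi_t^*$ coincide on $\mathrm{int}\,D(\psi_t^*)$ (both take the value $\infty$ on the complement of $\mathrm{cl}\,D(\psi_t^*)$), that is
$$\psi_t^*(\alpha) = \inf_{b\in\ell^2:\ \langle t,b\rangle=\alpha}\ \sum_{i\in I}\psi_i^*(b_i) \qquad \text{for } \alpha \in \mathrm{int}\,D(\psi_t^*).$$
Remark 3.4. We cannot prove in general that $\varphi_t$ is lower semicontinuous. Sometimes it is obvious, for instance when the effective domain of $\psi_t^*$ is an open subset of $\mathbb{R}$ (or even the whole of $\mathbb{R}$; see the Model Example and Example 3.5). Under the assumption that $(\psi_1)^*$ is a good rate function with respect to the weak* topology we can state lower semicontinuity of $\varphi_t$ (see Example 3.6).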
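The variational formula of Theorem 3.3 can be checked numerically in a small non-Gaussian case. The sketch below is an illustration under stated assumptions, not part of the proof: it takes two independent variables with the Laplace density $\frac12e^{-|x|}$ (the distribution treated in Example 3.5), hypothetical coefficients `t1`, `t2` and level `alpha`, and uses the closed form $\psi^*(a) = a^2/(1+r) - \ln((1+r)/2)$ with $r = \sqrt{1+a^2}$, which was derived for this sketch from the stationarity condition of $\sup_{|s|<1}\{as + \ln(1-s^2)\}$. Both sides of the formula are approximated on finite grids.

```python
import numpy as np

def cramer_laplace(a):
    """Cramér transform of a Laplace(0, 1) variable (closed form derived for this sketch)."""
    r = np.sqrt(1.0 + np.square(a))
    return np.square(a) / (1.0 + r) - np.log((1.0 + r) / 2.0)

t1, t2 = 1.0, 0.5     # hypothetical coefficients; all other t_i are taken to be zero
alpha = 1.5           # hypothetical level

# Left-hand side: psi_t^*(alpha) = sup_s { alpha*s - psi_t(s) }, where
# psi_t(s) = -ln(1 - (s*t1)^2) - ln(1 - (s*t2)^2) is finite for |s| < 1/max|t_i|.
s_max = 1.0 / max(abs(t1), abs(t2))
s = np.linspace(-s_max, s_max, 2_000_001)[1:-1]      # open interval
lhs = np.max(alpha * s + np.log1p(-(s * t1) ** 2) + np.log1p(-(s * t2) ** 2))

# Right-hand side: infimum over the line t1*b1 + t2*b2 = alpha, parameterized by b1.
b1 = np.linspace(-20.0, 20.0, 400_001)
b2 = (alpha - t1 * b1) / t2
rhs = np.min(cramer_laplace(b1) + cramer_laplace(b2))

print("sup over s        :", lhs)
print("inf over the line :", rhs)
```

The two printed values should agree up to the grid resolution. The one-parameter description of the constraint set is specific to two nonzero coefficients; longer sequences would call for a constrained optimizer instead of a grid.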
Example 3.5. Let $X$ be a r.v. with the Laplace density $\frac12e^{-|x|}$. Its moment generating function is $Ee^{sX} = \frac{1}{1-s^2}$ for $|s| < 1$ and $\infty$ otherwise. Let us observe that
$$Ee^{sX} \ge e^{s^2/2} = Ee^{sg},$$
where $g$ is standard normal distributed. Thus $\psi_X \ge \psi_g$ and, since the Cramér transform is order-reversing, we have
$$\psi_X^*(\alpha) \le \psi_g^*(\alpha) = \frac{\alpha^2}{2}. \tag{6}$$
Moreover, by using the classical Legendre transform we can calculate an explicit form of $\psi_X^*$ and get
$$\psi_X^*(\alpha) = \sqrt{1+\alpha^2} - 1 - \ln\frac{1+\sqrt{1+\alpha^2}}{2}.$$
Consider a sequence $(X_i)_{i\in I}$ of independent r.v.'s with the same Laplace distribution. By (6) we have
$$(\psi_1)^*(b) = \sum_{i\in I}\psi_X^*(b_i) \le \frac12\sum_{i\in I}b_i^2 = \frac12\|b\|^2 < \infty.$$
It means that $(\psi_1)^*$ takes finite values on the whole space $\ell^2$ and for every $\alpha \in \mathbb{R}$ there is a finite infimum
$$\inf_{b\in\ell^2:\ \langle t,b\rangle=\alpha}\ \sum_{i\in I}\psi_X^*(b_i),$$
that is, the finite value of $\psi_t^*$ at $\alpha$.
In the paper [7] one can find an example of variational formulas for the Cramér transform of series of weighted symmetric Bernoulli random variables, but with coefficients belonging to the space $\ell^1$. In the context of our Theorem 3.3 we recall the main result of that paper, but now with coefficients in the bigger space $\ell^2$.
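For the Laplace case of Example 3.5 the Cramér transform can be compared with a direct numerical Legendre transform. In the sketch below the stationarity condition $\alpha = 2s/(1-s^2)$ gives $s = (\sqrt{1+\alpha^2}-1)/\alpha$ and hence the closed form $\sqrt{1+\alpha^2} - 1 - \ln\frac{1+\sqrt{1+\alpha^2}}{2}$; this expression is a derivation made for the sketch and should be read as an assumption that the code checks, together with the Gaussian bound (6). The function names and the grid size are illustrative.

```python
import numpy as np

def cramer_laplace(alpha):
    """Closed form of sup_{|s|<1} { alpha*s + ln(1 - s^2) } (derived for this sketch)."""
    r = np.sqrt(1.0 + alpha**2)
    # alpha^2 / (1 + r) equals r - 1 but avoids cancellation for small alpha
    return alpha**2 / (1.0 + r) - np.log((1.0 + r) / 2.0)

def cramer_laplace_grid(alpha, n=2_000_001):
    """Approximate the same supremum by a maximum over a grid in the open interval (-1, 1)."""
    s = np.linspace(-1.0, 1.0, n)[1:-1]
    return np.max(alpha * s + np.log1p(-s**2))

for alpha in (0.0, 0.5, 2.0, 10.0):
    print(f"alpha={alpha:5.1f}  closed={cramer_laplace(alpha):.8f}  "
          f"grid={cramer_laplace_grid(alpha):.8f}  gaussian bound={alpha**2 / 2:.8f}")
```

The closed form grows only linearly in $|\alpha|$, which is consistent with the exponential tails of the Laplace distribution, while the bound in (6) grows quadratically.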
Let us emphasize that if we know that $(\psi_1)^*$ is a good rate function then we can prove lower semicontinuity of $\varphi_t$.
Remark 3.7. Finally, let us emphasize once again that only in Gaussian cases can we solve the variational formula for the exponential rate functions in tail estimates of series of independent r.v.'s. For non-Gaussian processes only these variational principles remain at our disposal.