# A variational formula on the Cramér function of series of independent random variables

## Abstract

In (Zajkowski, Positivity 19:529–537, 2015) a variational formula was proved for the Legendre–Fenchel transform of the cumulant generating function (the Cramér function) of Rademacher series with coefficients in the space \(\ell ^1\). In this paper we generalize this formula to series of a larger class of independent random variables with coefficients that belong to the space \(\ell ^2\).

## Keywords

Cramér function · Legendre–Fenchel transform · Contraction principle (large deviations theory) · Rate function · Infimal convolution

## Mathematics Subject Classification

44A15

## 1 Introduction

The *Legendre–Fenchel transforms* of *cumulant generating functions* of given random variables are at the core of the *large deviations theory* (see e.g. [3, 4]). The Cramér function gives the rate of exponential decay of the tails of the distributions of empirical means of sequences of i.i.d. random variables. It provides a connection between convex analysis and statistics.

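The Cramér function \(\psi _X^*\) of a random variable *X*, i.e. the Legendre–Fenchel transform of its cumulant generating function \(\psi _X(s)=\ln Ee^{sX}\), satisfies the following variational principle

\[
\psi _X^*(a)=\inf \Big \{D(m\Vert \mu _X):\;\int _{\mathbb {R}}x\,m(dx)=a\Big \},
\]

where \(D(m\Vert \mu _X)\) denotes the *relative entropy* of a probability distribution *m* with respect to the distribution \(\mu _X\) of *X*.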

The aim of this paper is to prove a variational formula for the Cramér functions of series of independent random variables. The formula depends on the coefficients and on the Cramér functions of the summands of a given series; see Theorem 2.3.

Let *V* be a real locally convex Hausdorff space and \(V^*\) its dual space. By \(\left\langle \cdot ,\cdot \right\rangle \) we denote the canonical pairing between *V* and \(V^*\). Let \(f:V\mapsto \mathbb {R}\cup \{\infty \}\) be a function not identically equal to \(\infty \). By \(\mathcal {D}(f)\) we denote the *effective domain* of *f*, i.e. \(\mathcal {D}(f)=\{u\in V:\;f(u)<\infty \}\). A function \(f^*:V^*\mapsto \mathbb {R}\cup \{\infty \}\) defined by

\[
f^*(u^*)=\sup _{u\in V}\{\left\langle u,u^*\right\rangle -f(u)\}\quad (u^*\in V^*)
\]

is called the *Legendre–Fenchel transform* (*convex conjugate*) of *f*, and a function \(f^{**}:V\mapsto \mathbb {R}\cup \{\infty \}\) defined by

\[
f^{**}(u)=\sup _{u^*\in V^*}\{\left\langle u,u^*\right\rangle -f^*(u^*)\}\quad (u\in V)
\]

is called the *convex biconjugate* of *f*.

The functions \(f^*\) and \(f^{**}\) are convex and lower semicontinuous in the weak* and weak topology on \(V^*\) and *V*, respectively. Moreover, the *biconjugate theorem* states that the function \(f:V\mapsto \mathbb {R}\cup \{\infty \}\) not identically equal to \(+\infty \) is convex and lower semicontinuous if and only if \(f=f^{**}\).

Let us mention additional properties of convex conjugates; see 4.3 Examples in [6]. Let *V* be a normed space. We denote by \(\Vert \cdot \Vert \) the norm of *V* and by \(\Vert \cdot \Vert _*\) the norm of \(V^*\). For conjugate exponents \(p,q\in (1,\infty )\) (\(\frac{1}{p}+\frac{1}{q}=1\)), the function \(\frac{1}{q}\Vert u^*\Vert _*^q\) is the convex conjugate of \(\frac{1}{p}\Vert u \Vert ^p\).

*Remark 1.1*

Let us emphasize that in a Hilbert space, identified with its dual, the function \(\frac{1}{2}\Vert u \Vert ^2\) is invariant with respect to the Legendre–Fenchel transform.
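The invariance can be checked by completing the square. The following short verification is a sketch that assumes *V* is a real Hilbert space identified with its dual through the inner product, so that \(u^*\) is treated as an element of *V*.

```latex
\[
\langle u,u^*\rangle-\tfrac12\|u\|^2
  =\tfrac12\|u^*\|^2-\tfrac12\|u-u^*\|^2
  \le \tfrac12\|u^*\|^2 ,
\]
% equality holds exactly when u = u^*, hence
\[
\sup_{u\in V}\Big\{\langle u,u^*\rangle-\tfrac12\|u\|^2\Big\}
  =\tfrac12\|u^*\|^2 .
\]
```

This is the case \(p=q=2\) of the conjugacy of \(\frac{1}{p}\Vert u \Vert ^p\) and \(\frac{1}{q}\Vert u^*\Vert _*^q\).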

It will turn out (see Remark 2.8) that the variational formula for the Cramér function of series of independent random variables is an application of a generalization of the infimal convolution to infinitely many summands. Generalizing formulas from finitely many parameters (variables) to infinitely many is often not obvious. Other examples of such generalizations, concerning the convex conjugates of the logarithm of series of analytic functions with applications to the convex conjugates of the spectral radius of functions of weighted composition operators, can be found in [8, 12].

## 2 Main theorem

The cumulant generating function \(\psi _X(s)=\ln Ee^{sX}\) of any random variable *X* is convex and lower semicontinuous on \(\mathbb {R}\) (and analytic on \(\operatorname {int} \mathcal {D}(\psi _X)\)). It maps \(\mathbb {R}\) into \(\mathbb {R}\cup \{\infty \}\) and takes the value zero at zero, but it is possible that \(\psi _X(s)=\infty \) for \(s\ne 0\). We will assume that it is finite on some neighborhood of zero, i.e. *X* satisfies the condition: \(\exists _{\lambda >0}\) s.t. \(Ee^{\lambda |X|}<\infty \). Let us emphasize that \(\psi _X\ge 0\) if \(EX=0\), whereas the Cramér function \(\psi _X^*\) is always nonnegative and attains the value 0 at *EX*.
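The last two claims follow from Jensen's inequality. The derivation below is a brief sketch; it only uses that *EX* is finite, which is guaranteed by the exponential integrability assumed above.

```latex
% Jensen's inequality for the convex function x -> e^{sx}:
\[
Ee^{sX}\ge e^{sEX}
\quad\Longrightarrow\quad
\psi_X(s)\ge s\,EX \quad (s\in\mathbb{R}),
\]
% so psi_X >= 0 whenever EX = 0. For the Cramér function:
\[
\psi_X^*(a)=\sup_{s\in\mathbb{R}}\{sa-\psi_X(s)\}
  \ge 0\cdot a-\psi_X(0)=0,
\qquad
\psi_X^*(EX)=\sup_{s\in\mathbb{R}}\{s\,EX-\psi_X(s)\}\le 0 .
\]
% Combining both bounds gives psi_X^*(EX) = 0.
```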

Let \(I\subset \mathbb {N}\) and let \((X_i)_{i\in I}\) be a sequence of independent r.v.s. For \(\mathbf{t}=(t_i)_{i\in I}\in \ell ^2(I)\equiv \ell ^2\) consider \(X_\mathbf{t}=\sum _{i\in I}t_i X_i\) (convergence in \(L^2\) and almost surely). We denote the cumulant generating function of \(X_\mathbf{t}\) by \(\psi _\mathbf{t}\), i.e. \(\psi _\mathbf{t}=\psi _{X_\mathbf{t}}\). Notice that for fixed *s* we can consider \(\psi _\mathbf{t}(s)\) as a functional of the variable \(\mathbf{t}\) in \(\ell ^2\). We denote it by \(\psi ^s\), so that \(\psi ^s(\mathbf{t})=\psi _\mathbf{t}(s)\).
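For a finite index set the functional \(\psi ^s\) has an explicit separable form. The computation below is a sketch restricted to finite *I*, where no convergence question arises; it uses only the independence of the \(X_i\).

```latex
\[
Ee^{sX_{\mathbf t}}
  =E\prod_{i\in I}e^{s t_i X_i}
  =\prod_{i\in I}Ee^{s t_i X_i}
\quad\Longrightarrow\quad
\psi^s(\mathbf t)=\psi_{\mathbf t}(s)
  =\sum_{i\in I}\psi_{X_i}(s t_i).
\]
```

Each summand depends on a single coordinate \(t_i\), which is what makes the conjugate of \(\psi ^s\) computable coordinate by coordinate.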

Before proving our main theorem, we show forms of the cumulant generating function and the Legendre–Fenchel transform of series of independent random variables.

**Proposition 2.1**

*Proof*

Because the \(X_i\) are independent, centered and have uniformly bounded second moments, for every \(\mathbf{t}\in \ell ^2\) the series \(X_\mathbf{t}=\sum _{i\in I}t_iX_i\) converges in \(L^2\) and a.s. Let us emphasize that the convergence of the series \(X_\mathbf{t}\) in \(L^2\) is equivalent to the convergence of the sequences \(\mathbf{t}\) in \(\ell ^2\). Fixing *s*, we can consider the cumulant generating function \(\psi _\mathbf{t}(s)\) as a functional of the variable \(\mathbf{t}\) in \(\ell ^2\). We denote it by \(\psi ^s(\mathbf{t})\), i.e. \(\psi ^s(\mathbf{t})=\psi _\mathbf{t}(s)\). We show that for every \(s\in \mathbb {R}\) the functional \(\psi ^s\) is convex and lower semicontinuous on \(\ell ^2\).

Let us observe that for \(s=0\) we have \(\psi ^0\equiv 0\) and its convex conjugate \((\psi ^0)^*(\mathbf{a})=0\) for \(\mathbf{a}=\mathbf{0}\) and \(\infty \) otherwise. From now on we assume that \(s\ne 0\). A form of \((\psi ^s)^*\) for \(s\ne 0\) is described in the following:

**Proposition 2.2**

*Proof*

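Assume first that *I* is a finite set. By virtue of the form of \(\psi ^s\), the formula for the convex conjugate of a separated sum (see e.g. [2, Prop. 13.27]) and the property (3), for \(\mathbf{a}\) in \(\ell ^2(I)\), we get

\[
(\psi ^s)^*(\mathbf{a})=\sum _{i\in I}\psi _{X_i}^*\Big (\frac{a_i}{s}\Big ).
\]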

Let us emphasize that the functions \((\psi ^s)^*\) are nonnegative and lower semicontinuous. In large deviation theory such functions are called rate functions (good rate functions when sublevel sets are not only closed but also compact). In the main theorem below we show that the contraction principle applied to the function \((\psi ^1)^*\) by using a functional \(\left\langle \mathbf{t}, \cdot \right\rangle \) over \(\ell ^2\) gives the Cramér function of \(X_\mathbf{t}\).
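The contraction described above can be carried out by hand in the Gaussian case. The following sketch assumes, purely for illustration, that the \(X_i\) are independent standard normal random variables and that \(\mathbf{t}\ne \mathbf{0}\); then \(X_\mathbf{t}\) is centered normal with variance \(\Vert \mathbf{t}\Vert ^2\), so \(\psi ^1(\mathbf{t})=\frac{1}{2}\Vert \mathbf{t}\Vert ^2\) and \((\psi ^1)^*(\mathbf{b})=\frac{1}{2}\Vert \mathbf{b}\Vert ^2\).

```latex
% Contraction of (psi^1)^* along the functional <t, .>:
% by the Cauchy-Schwarz inequality |a| = |<t,b>| <= ||t|| ||b||,
% with equality for b = a t / ||t||^2, hence
\[
\inf\Big\{\tfrac12\|\mathbf b\|^2:\ \mathbf b\in\ell^2,\
          \langle\mathbf t,\mathbf b\rangle=a\Big\}
  =\frac{a^2}{2\|\mathbf t\|^2}.
\]
% Direct computation of the Cramér function of X_t:
\[
\psi_{\mathbf t}(s)=\tfrac12 s^2\|\mathbf t\|^2
\quad\Longrightarrow\quad
\psi_{\mathbf t}^*(a)
  =\sup_{s\in\mathbb R}\Big\{sa-\tfrac12 s^2\|\mathbf t\|^2\Big\}
  =\frac{a^2}{2\|\mathbf t\|^2}.
\]
```

Both expressions coincide, and in this case the infimum is attained and the contracted function is continuous on \(\mathbb {R}\).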

**Theorem 2.3**

*Proof*

*Remark 2.4*

We cannot prove in general that \(\varphi _\mathbf{t}\) is lower semicontinuous. Sometimes this is obvious, for instance when the effective domain of \(\psi _\mathbf{t}^*\) is an open subset of \(\mathbb {R}\) (or even the whole of \(\mathbb {R}\); see Examples 2.5 and 2.6). Under the assumption that \((\psi ^1)^*\) is a good rate function with respect to the weak* topology, we can establish the lower semicontinuity of \(\varphi _\mathbf{t}\) (see Example 2.7).

*Example 2.5*

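The cumulant generating function of a standard normal random variable *g* is given by \(\psi _g(s)=\frac{s^2}{2}\).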

*Example 2.6*

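Let *X* be a r.v. with the Laplace density \(\frac{1}{2}e^{-|x|}\). Its moment generating function is \(Ee^{sX}=\frac{1}{1-s^2}\) for \(|s|<1\) and \(\infty \) otherwise. Let us observe that

\[
\psi _X(s)=-\ln (1-s^2)\ge s^2\ge \frac{s^2}{2}=\psi _g(s)\quad \text{for } |s|<1,
\]

where *g* is a standard normal random variable. Thus \(\psi _X\ge \psi _g\) and, since the Legendre–Fenchel transform is order-reversing, we have

\[
\psi _X^*(a)\le \psi _g^*(a)=\frac{a^2}{2}\quad (a\in \mathbb {R}).
\]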

In [11] one can find a variational formula for the Cramér function of series of weighted symmetric Bernoulli random variables, but with coefficients belonging to the space \(\ell ^1\). In the context of our Theorem 2.3, we recall the main result of that paper, now with coefficients in the larger space \(\ell ^2\).

*Example 2.7*

If *X* is a symmetric Bernoulli r.v., i.e. \(Pr(X=\pm 1)=\frac{1}{2}\), then \(Ee^{sX}=\cosh s\). By power series expansions one has \(\cosh s\le \exp (\frac{s^2}{2})\). In this example, in contrast to the previous one, we get that \(\psi _X(s)\le \psi _g(s)=\frac{s^2}{2}\) and, consequently,

\[
\psi _X^*(a)\ge \psi _g^*(a)=\frac{a^2}{2}.
\]

Let us emphasize that if we know that \((\psi ^1)^*\) is a good rate function then we can prove lower-semicontinuity of \(\varphi _\mathbf{t}\).
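For a single symmetric Bernoulli variable the Cramér function is available in closed form. The calculation below is a standard one added for illustration and is not taken from the text; it uses the convention \(0\ln 0=0\).

```latex
% Stationarity of s -> sa - ln cosh s for |a| < 1:
\[
a=\tanh s
\quad\Longleftrightarrow\quad
s=\tfrac12\ln\frac{1+a}{1-a},
\qquad
\ln\cosh s=-\tfrac12\ln(1-a^2).
\]
% Substituting into sa - ln cosh s:
\[
\psi_X^*(a)=
\begin{cases}
\tfrac12\big[(1+a)\ln(1+a)+(1-a)\ln(1-a)\big], & |a|\le 1,\\[2pt]
\infty, & |a|>1.
\end{cases}
\]
```

In particular \(\psi _X^*(\pm 1)=\ln 2\), and the effective domain is the closed interval \([-1,1]\), so the openness argument mentioned in Remark 2.4 is not available for this summand.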

*Remark 2.8*

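If *I* is a finite set and \(Y_i=t_iX_i\), then \(X_\mathbf{t}=\sum _{i\in I} Y_i\). Note that, by independence, \(\psi _\mathbf{t}=\sum _{i\in I}\psi _{Y_i}\), so \(\psi _\mathbf{t}^*\) is the convex conjugate of a finite sum of convex functions, which is the classical setting of the infimal convolution.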
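The link with the infimal convolution can be made explicit by a change of variables. The sketch below assumes that *I* is finite, that all \(t_i\ne 0\), and that \((\psi ^1)^*\) has the separable form \((\psi ^1)^*(\mathbf{b})=\sum _{i\in I}\psi _{X_i}^*(b_i)\) given by the conjugate of a separated sum; the symbol \(\square \) for the infimal convolution is notation introduced here.

```latex
% Conjugate of a rescaled variable Y_i = t_i X_i (substitute r = t_i s):
\[
\psi_{Y_i}^*(c)
  =\sup_{s}\{sc-\psi_{X_i}(t_i s)\}
  =\sup_{r}\Big\{r\,\tfrac{c}{t_i}-\psi_{X_i}(r)\Big\}
  =\psi_{X_i}^*\Big(\frac{c}{t_i}\Big).
\]
% Substituting c_i = t_i b_i in the contraction along <t, .>:
\[
\inf\Big\{\sum_{i\in I}\psi_{X_i}^*(b_i):\ \sum_{i\in I}t_i b_i=a\Big\}
  =\inf\Big\{\sum_{i\in I}\psi_{Y_i}^*(c_i):\ \sum_{i\in I}c_i=a\Big\}
  =\Big(\mathop{\square}_{i\in I}\psi_{Y_i}^*\Big)(a).
\]
```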

## References

- 1. Barbu, V., Precupanu, T.: Convexity and Optimization in Banach Spaces, 4th edn. Springer Monographs in Mathematics. Springer, Dordrecht (2012)
- 2. Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer Science+Business Media (2011)
- 3. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications. Corrected reprint of the second (1998) edition, Stochastic Modeling and Applied Probability, vol. 38. Springer, Berlin (2010)
- 4. den Hollander, F.: Large Deviations, Fields Institute Monographs, vol. 14. American Mathematical Society, Providence (2000)
- 5. Donsker, M.D., Varadhan, S.R.S.: Asymptotic evaluation of certain Markov process expectations for large time III. Commun. Pure Appl. Math. **29**, 389–461 (1976)
- 6. Ekeland, I., Témam, R.: Convex Analysis and Variational Problems, Translated from French. Corrected reprint of the 1976 English edition. Classics in Applied Mathematics, vol. 28. Society for Industrial and Applied Mathematics (SIAM), Philadelphia (1999)
- 7. Hiriart-Urruty, J.-B.: A note on the Legendre–Fenchel Transform of Convex Composite Functions, Nonsmooth Mechanics and Analysis, pp. 35–46, Adv. Mech. Math., vol. 12. Springer, New York (2006)
- 8. Ostaszewska, U., Zajkowski, K.: Legendre–Fenchel transform of the spectral exponent of analytic functions of weighted composition operators. J. Convex Anal. **18**(2), 367–377 (2011)
- 9. Rudin, W.: Functional Analysis, 2nd edn. International Series in Pure and Applied Mathematics (1991)
- 10. Talagrand, M.: Majorizing measures: the generic chaining. Ann. Probab. **24**, 1049–1103 (1996)
- 11. Zajkowski, K.: Cramér transform of Rademacher series. Positivity **19**, 529–537 (2015)
- 12. Zajkowski, K.: Convex conjugates of analytic functions of logarithmically convex functional. J. Convex Anal. **20**(1), 243–252 (2013)

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.