1 Introduction

The Cramér transform defines the rate function of the large deviations of empirical means of a sequence of i.i.d. random variables (see [2]). The literature on large deviation principles in much more general contexts is vast (see for instance the monographs [3, 4]). The goal of this paper is only to establish a variational formula for the Cramér transform of random variables that are series of weighted, independent symmetric Bernoulli random variables.

The Cramér transform is the Legendre–Fenchel transform of the cumulant generating function of a random variable (r.v.). We will need the general notion of the Legendre–Fenchel transform in topological spaces (see [5] or [1]). Let \(X\) be a real locally convex Hausdorff space and \(X^*\) its dual space. By \(\left\langle \cdot ,\cdot \right\rangle \) we denote the canonical pairing between \(X\) and \(X^*\). Let \(f:X\mapsto \mathbb {R}\cup \{\infty \}\) be a function not identically equal to \(\infty \). By \(\mathcal {D}(f)\) we denote the effective domain of \(f\), i.e. \(\mathcal {D}(f)=\{x\in X:\;f(x)<\infty \}\). A function \(f^*:X^*\mapsto \mathbb {R}\cup \{\infty \}\) defined by

$$\begin{aligned} f^*(x^*)=\sup _{x\in X}\left\{ \left\langle x,x^*\right\rangle -f(x)\right\} =\sup _{x\in \mathcal {D}(f)}\left\{ \left\langle x,x^*\right\rangle -f(x)\right\} \;\;\;\;\;(x^*\in X^*) \end{aligned}$$

is called the Legendre–Fenchel transform (convex conjugate) of \(f\) and a function \(f^{**}:X\mapsto \mathbb {R}\cup \{\infty \}\) defined by

$$\begin{aligned} f^{**}(x)=\sup _{x^*\in X^*}\left\{ \left\langle x,x^*\right\rangle -f^*(x^*)\right\} =\sup _{x^*\in \mathcal {D}(f^*)}\left\{ \left\langle x,x^*\right\rangle -f^*(x^*)\right\} \;\;\;\;\;(x\in X) \end{aligned}$$

is called the convex biconjugate of \(f\).

The functions \(f^*\) and \(f^{**}\) are convex and lower semicontinuous in the weak* and weak topology on \(X^*\) and \(X\), respectively. Moreover, the biconjugate theorem states that the function \(f:X\mapsto \mathbb {R}\cup \{\infty \}\) not identically equal to \(+\infty \) is convex and lower semicontinuous if and only if \(f=f^{**}\).

Let \(I\) be a countable set and \((\epsilon _i)_{i\in I}\) be a Bernoulli sequence, i.e. a sequence of i.i.d. symmetric r.v.'s taking values \(\pm 1\). For \(\mathbf{t}=(t_i)_{i\in I}\in \ell ^2(I)\equiv \ell ^2\) the series

$$\begin{aligned} X_\mathbf{t}:=\sum _{i\in I}t_i\epsilon _i \end{aligned}$$

converges a.s. Notice that for \(\mathbf{t}\in \ell ^1\)

$$\begin{aligned} \vert X_\mathbf{t}\vert \le \sum _{i\in I}\vert t_i\vert =\Vert \mathbf{t}\Vert _1, \end{aligned}$$

i.e. \(X_\mathbf{t}\) is a bounded r.v. and we can define its cumulant generating function on the whole of \(\mathbb {R}\), that is,

$$\begin{aligned} \psi _\mathbf{t}(s)=\ln Ee^{sX_\mathbf{t}} \end{aligned}$$

for every \(s\in \mathbb {R}\). Since \((\epsilon _i)_{i\in I}\) is an i.i.d. Bernoulli sequence, we have

$$\begin{aligned} \psi _\mathbf{t}(s)&= \ln \prod _{i\in I} Ee^{st_i\epsilon _i}\\ \;&= \ln \prod _{i\in I}\frac{e^{st_i}+e^{-st_i}}{2}=\sum _{i\in I}\ln \cosh (st_i). \end{aligned}$$

Observe that

$$\begin{aligned} \psi ^\prime _\mathbf{t}(s)=\sum _{i\in I}t_i\tanh (st_i). \end{aligned}$$
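The two formulas above can be checked numerically for a finitely supported \(\mathbf{t}\). The following Python sketch compares \(\sum _i\ln \cosh (st_i)\) with a brute-force evaluation of \(\ln Ee^{sX_\mathbf{t}}\) over all sign patterns, and the closed form of \(\psi _\mathbf{t}^\prime \) with a central finite difference. The weights, the value of \(s\), the step \(h\) and all function names are illustrative assumptions and do not come from the paper.

```python
import itertools
import math


def psi(s, t):
    """Closed form: psi_t(s) = sum_i ln cosh(s * t_i)."""
    return sum(math.log(math.cosh(s * ti)) for ti in t)


def psi_bruteforce(s, t):
    """ln E exp(s * X_t), averaging over all 2^n equally likely sign patterns."""
    n = len(t)
    total = 0.0
    for signs in itertools.product((-1.0, 1.0), repeat=n):
        x = sum(ti * ei for ti, ei in zip(t, signs))
        total += math.exp(s * x)
    return math.log(total / 2 ** n)


def dpsi(s, t):
    """Closed form of the derivative: sum_i t_i * tanh(s * t_i)."""
    return sum(ti * math.tanh(s * ti) for ti in t)


t = [1.0, 0.5, 0.25, 0.125]  # hypothetical finitely supported weights
s = 0.7                      # hypothetical argument
h = 1e-6                     # finite-difference step (an arbitrary choice)
finite_diff = (psi(s + h, t) - psi(s - h, t)) / (2 * h)

print(psi(s, t), psi_bruteforce(s, t))
print(dpsi(s, t), finite_diff)
```

Both printed pairs should agree up to rounding and finite-difference error.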

We cannot derive an explicit form of \(\psi _\mathbf{t}^*\) by means of the classical Legendre transform, because in general we cannot invert the derivative \(\psi _\mathbf{t}^\prime \), that is, solve the equation

$$\begin{aligned} \psi _\mathbf{t}^\prime (s)=\alpha \end{aligned}\tag{1}$$

and find

$$\begin{aligned} \psi _\mathbf{t}^*(\alpha )=\alpha s_\alpha -\psi _\mathbf{t}(s_\alpha ), \end{aligned}$$

where \(s_\alpha \) is a solution of Eq. (1).
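Although the equation \(\psi _\mathbf{t}^\prime (s)=\alpha \) has no closed-form solution in general, it can be solved numerically, because \(\psi _\mathbf{t}^\prime \) is increasing. The sketch below is one possible approach (bisection), not a method used in the paper; the function names, the bracketing strategy and the iteration count are assumptions made for illustration, and \(\mathbf{t}\) is taken to be finitely supported.

```python
import math


def log_cosh(x):
    """Numerically stable ln cosh(x) = |x| + ln(1 + exp(-2|x|)) - ln 2."""
    x = abs(x)
    return x + math.log1p(math.exp(-2.0 * x)) - math.log(2.0)


def psi(s, t):
    return sum(log_cosh(s * ti) for ti in t)


def dpsi(s, t):
    return sum(ti * math.tanh(s * ti) for ti in t)


def solve_s(alpha, t, iterations=200):
    """Bisection for psi_t'(s) = alpha; requires |alpha| < ||t||_1."""
    norm1 = sum(abs(ti) for ti in t)
    if not abs(alpha) < norm1:
        raise ValueError("alpha must lie in (-||t||_1, ||t||_1)")
    lo, hi = -1.0, 1.0
    while dpsi(lo, t) > alpha:   # widen the bracket to the left
        lo *= 2.0
    while dpsi(hi, t) < alpha:   # widen the bracket to the right
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if dpsi(mid, t) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def cramer(alpha, t):
    """psi_t^*(alpha) = alpha * s_alpha - psi_t(s_alpha)."""
    s_alpha = solve_s(alpha, t)
    return alpha * s_alpha - psi(s_alpha, t)


print(cramer(0.6, [1.0, 0.5]))  # hypothetical weights and level alpha
```

For a single weight \(t_1=1\) the result can be compared with the known conjugate of \(\ln \cosh \), namely \(\frac{1}{2}[(1+\alpha )\ln (1+\alpha )+(1-\alpha )\ln (1-\alpha )]\).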

The following theorem gives a variational expression for \(\psi _\mathbf{t}^*\).

Theorem 1.1

Let \((\epsilon _i)_{i\in I}\) be a Bernoulli sequence and \(\mathbf{t}=(t_i)_{i\in I}\in \ell ^1(I)\). The Cramér transform of a variable \(X_{\mathbf{t}}=\sum _{i\in I}t_i\epsilon _i\) is given by the following variational formula

$$\begin{aligned} \psi _\mathbf{t}^*(\alpha )=\min _{\begin{array}{c} \mathbf{b}\in \mathcal {D}(\psi _1^*) \\ \sum _{i\in I}t_ib_i=\alpha \end{array}}\psi _1^*(\mathbf{b}) \end{aligned}$$

for \(\alpha \in (-\Vert \mathbf{t}\Vert _1,\Vert \mathbf{t}\Vert _1)\) and \(+\infty \) otherwise, where

$$\begin{aligned} \psi _1^*(\mathbf{b})=\frac{1}{2}\sum _{i\in I}\Big [\big (1+b_i\big )\ln \big (1+b_i\big ) +\big (1-b_i\big )\ln \big (1-b_i\big )\Big ] \end{aligned}$$

is the convex conjugate of a functional \(\psi _1:\ell ^1\mapsto \mathbb {R}\) of the form \(\psi _1(\mathbf{t})=\ln Ee^{X_\mathbf{t}}\) and \(\mathcal {D}(\psi _1^*)\subset \ell _\infty (I)\) denotes its effective domain.
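For two weights the variational formula can be tested directly. The following Python sketch compares a grid search for the constrained minimum of \(\psi _1^*\) with the value \(\alpha s_\alpha -\psi _\mathbf{t}(s_\alpha )\) obtained by numerically inverting \(\psi _\mathbf{t}^\prime \). It is only a finite-dimensional illustration under stated assumptions: the weights, the grid size, the bisection bracket \([-50,50]\) and all names are hypothetical choices, not part of the paper.

```python
import math


def entropy_term(x):
    """0.5*[(1+x)ln(1+x) + (1-x)ln(1-x)], with 0*ln(0) = 0 and +inf for |x| > 1."""
    if abs(x) > 1.0:
        return math.inf
    total = 0.0
    for y in (1.0 + x, 1.0 - x):
        if y > 0.0:
            total += y * math.log(y)
    return 0.5 * total


def psi1_star(b):
    """psi_1^*(b) for a finitely supported b."""
    return sum(entropy_term(bi) for bi in b)


def cramer_by_legendre(alpha, t):
    """alpha*s - psi_t(s), with s found by bisection (assumes s lies in [-50, 50])."""
    lo, hi = -50.0, 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if sum(ti * math.tanh(mid * ti) for ti in t) < alpha:
            lo = mid
        else:
            hi = mid
    s = 0.5 * (lo + hi)
    return alpha * s - sum(math.log(math.cosh(s * ti)) for ti in t)


def cramer_by_minimization(alpha, t, n=20001):
    """Grid search for min psi_1^*(b) subject to t1*b1 + t2*b2 = alpha."""
    t1, t2 = t
    best = math.inf
    for k in range(n):
        b1 = -1.0 + 2.0 * k / (n - 1)
        b2 = (alpha - t1 * b1) / t2      # enforce the hyperplane constraint
        best = min(best, psi1_star((b1, b2)))
    return best


t = (1.0, 0.5)  # hypothetical weights, ||t||_1 = 1.5
for alpha in (-1.2, 0.0, 0.6):
    print(alpha, cramer_by_legendre(alpha, t), cramer_by_minimization(alpha, t))
```

The two columns should agree up to the resolution of the grid.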

Remark 1.1

The proof techniques presented in the next section are similar, but not identical, to the methods used by Ostaszewska and Zajkowski in [6, 7].

2 Proof of Theorem 1.1

We begin with an observation on the absolute value of the cumulant generating function: \(\vert \psi _\mathbf{t}(s)\vert \le \vert s\vert \Vert \mathbf{t}\Vert _1\). A parameter \(\mathbf{t}\) may be an arbitrary element of \(\ell ^1\). Formally we can define a function \(\psi \) of two variables:

$$\begin{aligned} \psi (s,\mathbf{t})=\psi _\mathbf{t}(s)=\ln Ee^{sX_\mathbf{t}}\quad \mathrm{for}\quad (s,\mathbf{t})\in \mathbb {R}\times \ell ^1. \end{aligned}$$

Fixing \(\mathbf{t}\) or \(s\) we write \(\psi (s,\mathbf{t})=\psi _\mathbf{t}(s)\) or \(\psi (s,\mathbf{t})=\psi _s(\mathbf{t})\), respectively. First we derive \(\psi _s^*\) and next we show how \(\psi _\mathbf{t}^*\) is expressed by \(\psi _s^*\).

The convexity of \(\psi _s\) for every \(s\in \mathbb {R}\) can be checked in a standard way. Let \(\mathbf{t},\mathbf{u}\in \ell ^1\) and \(\lambda \in (0,1)\). Then

$$\begin{aligned} \psi _s(\lambda \mathbf{t}+(1-\lambda )\mathbf{u})&= \ln Ee^{s\sum _{i\in I}(\lambda t_i+(1-\lambda ) u_i)\epsilon _i}\\ \;&= \ln E\big [\big (e^{s\sum _{i\in I}t_i\epsilon _i}\big )^\lambda \big (e^{s\sum _{i\in I}u_i\epsilon _i}\big )^{1-\lambda }\big ]. \end{aligned}$$

Using the Hölder inequality for exponents \(1/\lambda \) and \(1/(1-\lambda )\) we get

$$\begin{aligned} E\big [\big (e^{s\sum _{i\in I}t_i\epsilon _i}\big )^\lambda \big (e^{s\sum _{i\in I}u_i\epsilon _i}\big )^{1-\lambda }\big ] \le \big (Ee^{s\sum _{i\in I}t_i\epsilon _i}\big )^\lambda \big (Ee^{s\sum _{i\in I}u_i\epsilon _i}\big )^{1-\lambda } \end{aligned}$$

and, in consequence,

$$\begin{aligned} \psi _s(\lambda \mathbf{t}+(1-\lambda )\mathbf{u})&\le \lambda \ln Ee^{s\sum _{i\in I}t_i\epsilon _i} + (1-\lambda )\ln Ee^{s\sum _{i\in I}u_i\epsilon _i}\\ \;&= \lambda \psi _s(\mathbf{t})+(1-\lambda )\psi _s(\mathbf{u}). \end{aligned}$$
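The convexity inequality just derived can also be observed numerically. The sketch below samples random pairs \(\mathbf{t},\mathbf{u}\) of finitely supported vectors and random \(\lambda \), and records the smallest value of \(\lambda \psi _s(\mathbf{t})+(1-\lambda )\psi _s(\mathbf{u})-\psi _s(\lambda \mathbf{t}+(1-\lambda )\mathbf{u})\). The dimension, the sampling ranges, the seed and the value of \(s\) are arbitrary assumptions.

```python
import math
import random


def psi_s(t, s):
    """psi_s(t) = sum_i ln cosh(s * t_i) for a finitely supported t."""
    return sum(math.log(math.cosh(s * ti)) for ti in t)


random.seed(0)       # fixed seed so that the run is reproducible
s = 2.0              # hypothetical value of s
smallest_gap = math.inf
for _ in range(1000):
    t = [random.uniform(-1.0, 1.0) for _ in range(5)]
    u = [random.uniform(-1.0, 1.0) for _ in range(5)]
    lam = random.random()
    mix = [lam * ti + (1.0 - lam) * ui for ti, ui in zip(t, u)]
    gap = lam * psi_s(t, s) + (1.0 - lam) * psi_s(u, s) - psi_s(mix, s)
    smallest_gap = min(smallest_gap, gap)

# Convexity predicts a nonnegative gap, up to rounding error.
print(smallest_gap)
```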

Since \(\psi _s:\ell ^1\mapsto \mathbb {R}\) and \((\ell ^1)^* \simeq \ell _\infty \), we have

$$\begin{aligned} \psi _s^*:\ell _\infty \mapsto \mathbb {R}\cup \{+\infty \}. \end{aligned}$$

Let \(\mathbf{a}=(a_i)_{i\in I}\in \ell _\infty \). By the definition of the convex conjugate we have

$$\begin{aligned} \psi _s^*(\mathbf{a})=\sup _{\mathbf{t}\in \ell ^1}\Big \{\left\langle \mathbf{t},\mathbf{a}\right\rangle -\sum _{i\in I}\ln \cosh (st_i)\Big \}, \end{aligned}\tag{2}$$

where \(\left\langle \mathbf{t},\mathbf{a}\right\rangle =\sum _{i\in I}t_ia_i\).

Note that for \(s=0\) we have

$$\begin{aligned} \psi _0^*(\mathbf{a})=\left\{ \begin{array}{ll} 0 &{}\quad \mathrm{if}\, \mathbf{a}=\mathbf{0},\\ +\infty &{}\quad \mathrm{otherwise}. \end{array} \right. \end{aligned}$$

Assume now that \(s\ne 0\). The expression in the curly brackets of (2), which we denote by \(w\), is concave, and its partial derivatives along the basis vectors \(e_i=(\delta _{ij})_{j\in I}\) of \(\ell ^1\) (\(\delta _{ij}\) is the Kronecker delta) are equal to

$$\begin{aligned} \frac{\partial }{\partial t_i}w(\mathbf{t})=\frac{\partial }{\partial t_i}\left( \sum _{j\in I}t_ja_j-\sum _{j\in I}\ln \cosh (st_j)\right) =a_i-s\tanh (st_i). \end{aligned}$$

The expression \(w\) is a sum of functions with separated variables \((t_i)_{i\in I}\). Concavity of each of these functions implies that the gradient \(\nabla w(\mathbf{t})=(a_i-s\tanh (st_i))_{i\in I}\) belongs to the superdifferential \(\partial w(\mathbf{t})\) (the concave analogue of the subdifferential), since

$$\begin{aligned} \forall _{\mathbf{u}\in \ell ^1}\quad w(\mathbf{t})-w(\mathbf{u})\ge \sum _{i\in I}(t_i-u_i)[a_i-s\tanh (st_i)]=\left\langle \mathbf{t}-\mathbf{u},\nabla w(\mathbf{t})\right\rangle . \end{aligned}$$

The concave function \(w\) attains its global maximum at the point \(\mathbf{t}\) if and only if \(\mathbf{0}\in \partial w(\mathbf{t})\). It therefore suffices that

$$\begin{aligned} \forall _{i\in I}\quad a_i-s\tanh (st_i)=0. \end{aligned}$$

Since \(\mathrm{artanh}(x)=\frac{1}{2}\ln \frac{1+x}{1-x}\) for \(\vert x \vert <1\), the partial derivatives vanish when

$$\begin{aligned} t_i=\frac{1}{2s}\ln \frac{1+\frac{a_i}{s}}{1-\frac{a_i}{s}}\quad \mathrm{for}\quad \Big \vert \frac{a_i}{s} \Big \vert <1. \end{aligned}$$

Substituting the above values of \(t_i\)’s into (2) we get

$$\begin{aligned} \psi _s^*(\mathbf{a})=\frac{1}{2}\sum _{i\in I}\Big [\big (1+\frac{a_i}{s}\big )\ln \big (1+\frac{a_i}{s}\big ) +\big (1-\frac{a_i}{s}\big )\ln \big (1-\frac{a_i}{s}\big )\Big ]\quad \mathrm{for}\;\Big \vert \frac{a_i}{s} \Big \vert <1. \end{aligned}$$
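Since \(w\) separates into one-dimensional terms, the substitution can be verified coordinate by coordinate. The following Python sketch evaluates a single term \(ta-\ln \cosh (st)\) at the critical point \(t=\frac{1}{2s}\ln \frac{1+a/s}{1-a/s}\) and compares it with the corresponding summand of the closed form. The values of \(a\) and \(s\) and the function names are hypothetical.

```python
import math


def w_term(t, a, s):
    """One summand of w: t*a - ln cosh(s*t)."""
    return t * a - math.log(math.cosh(s * t))


def critical_point(a, s):
    """Solution of a - s*tanh(s*t) = 0; requires |a/s| < 1."""
    x = a / s
    return math.log((1.0 + x) / (1.0 - x)) / (2.0 * s)


def conjugate_term(a, s):
    """One summand of psi_s^*: 0.5*[(1+x)ln(1+x) + (1-x)ln(1-x)] with x = a/s."""
    x = a / s
    return 0.5 * ((1.0 + x) * math.log(1.0 + x) + (1.0 - x) * math.log(1.0 - x))


a, s = 0.6, -1.5                 # hypothetical values with |a/s| < 1
t_star = critical_point(a, s)

print(a - s * math.tanh(s * t_star))               # derivative at t_star
print(w_term(t_star, a, s), conjugate_term(a, s))  # the two values to compare
```

The first printed number should vanish up to rounding, and the last two should coincide; a negative \(s\) is used on purpose to show that the sign of \(s\) does not matter.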

Let us look more closely at the effective domain of \(\psi _s^*\), that is, at the set

$$\begin{aligned} \mathcal {D}(\psi _s^*)=\Big \{\mathbf{a}\in \ell _\infty :\;\psi _s^*(\mathbf{a})<\infty \;\Big \}. \end{aligned}$$

The function \(f(x)=(1+x)\ln (1+x)+(1-x)\ln (1-x)\) is even and \(f(0)=0\). Since \(\lim _{|x|\rightarrow 1^-}f(x)=2\ln 2\), we can extend its domain to the interval \([-1,1]\). One can check that \(f(x)\ge x^2\) on this interval. It follows that

$$\begin{aligned} \sum _{i\in I}\Big [\big (1+\frac{a_i}{s}\big )\ln \big (1+\frac{a_i}{s}\big ) +\big (1-\frac{a_i}{s}\big )\ln \big (1-\frac{a_i}{s}\big )\Big ]\ge \frac{1}{s^2}\sum _{i\in I} a_i^2 \end{aligned}$$

and \(|a_i|\le |s|\) for every \(i\in I\). Let \(\overline{B}_{\infty }(\mathbf{0};r)\) denote the closed ball with center \(\mathbf{0}\) and radius \(r\) in the space \(\ell _\infty \). The properties of \(f\) give

$$\begin{aligned} \mathcal {D}(\psi _s^*)\subset \overline{B}_{\infty }(\mathbf{0};|s|)\cap \ell ^2. \end{aligned}$$
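The elementary facts about \(f\) used above, namely \(f(x)\ge x^2\) on \([-1,1]\) and the boundary value \(2\ln 2\), can be checked on a grid. The sketch below is a numerical sanity check rather than a proof; the grid resolution is an arbitrary assumption.

```python
import math


def f(x):
    """f(x) = (1+x)ln(1+x) + (1-x)ln(1-x) on [-1, 1], with 0*ln(0) = 0."""
    total = 0.0
    for y in (1.0 + x, 1.0 - x):
        if y > 0.0:
            total += y * math.log(y)
    return total


grid = [k / 1000.0 for k in range(-1000, 1001)]   # points of [-1, 1]
smallest_gap = min(f(x) - x * x for x in grid)

print(smallest_gap)                  # should not be negative beyond rounding
print(f(1.0), 2.0 * math.log(2.0))   # boundary value of the extension
```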

Let us note that \(\mathcal {D}(\psi _s^*)\) is a symmetric set, that is, \(\mathbf{a}\in \mathcal {D}(\psi _s^*)\) if and only if \(-\mathbf{a}\in \mathcal {D}(\psi _s^*)\). Moreover, it is symmetric with respect to each coordinate \(a_i\) of \(\mathbf{a}\).

We now return to the function \(\psi _\mathbf{t}\). Let us observe that

$$\begin{aligned} \vert \psi _\mathbf{t}^\prime (s)\vert =\Big \vert \sum _{i\in I} t_i\tanh (st_i)\Big \vert < \Vert \mathbf{t}\Vert _1 \end{aligned}$$

and \(\lim _{s\rightarrow \pm \infty }\psi _\mathbf{t}^\prime (s)=\pm \Vert \mathbf{t}\Vert _1\). It follows that \(\mathcal {D}(\psi _\mathbf{t}^*)=\psi _\mathbf{t}^\prime (\mathbb {R})=(-\Vert \mathbf{t}\Vert _1,\Vert \mathbf{t}\Vert _1)\). Since \(\psi _\mathbf{t}\) is convex and continuous on \(\mathbb {R}\), the biconjugate theorem gives

$$\begin{aligned} \psi _\mathbf{t}(s)=\psi _\mathbf{t}^{**}(s)=\sup _{\alpha \in (-\Vert \mathbf{t}\Vert _1,\Vert \mathbf{t}\Vert _1)}\big \{\alpha s-\psi _\mathbf{t}^*(\alpha )\big \}. \end{aligned}$$

On the other hand

$$\begin{aligned} \psi _\mathbf{t}(s)&= \psi _s(\mathbf{t})\\&= \sup _{\mathbf{a}\in \mathcal {D}(\psi _s^*)}\left\{ \left\langle \mathbf{t},\mathbf{a}\right\rangle \!-\!\frac{1}{2}\sum _{i\in I}\left[ \left( 1+\frac{a_i}{s}\right) \ln \left( 1+\frac{a_i}{s}\right) \!+\!\left( 1-\frac{a_i}{s}\right) \ln \left( 1-\frac{a_i}{s}\right) \right] \right\} . \end{aligned}$$

If we take \(\mathbf{a}=s\mathbf{b}\) then \(\psi _s^*(s\mathbf{b})=\psi _1^*(\mathbf{b})\) with \(\mathbf{b}\in \mathcal {D}(\psi _1^*)\). This means that we can rewrite the above variational principle as follows

$$\begin{aligned} \psi _\mathbf{t}(s)=\sup _{\mathbf{b}\in \mathcal {D}(\psi _1^*)}\Big \{s\left\langle \mathbf{t},\mathbf{b}\right\rangle -\frac{1}{2}\sum _{i\in I}\Big [\big (1+b_i\big )\ln \big (1+b_i\big ) +\big (1-b_i\big )\ln \big (1-b_i\big )\Big ]\Big \}. \end{aligned}\tag{3}$$
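This variational principle can be illustrated numerically for a finitely supported \(\mathbf{t}\). Because the expression under the supremum separates over coordinates, the sketch below maximizes each coordinate on a grid and also evaluates the expression at the candidate maximizer \(b_i=\tanh (st_i)\) (obtained by setting the derivative in \(b_i\) to zero). The weights, the value of \(s\), the grid and the names are illustrative assumptions.

```python
import math


def entropy_term(x):
    """0.5*[(1+x)ln(1+x) + (1-x)ln(1-x)] on [-1, 1], with 0*ln(0) = 0."""
    total = 0.0
    for y in (1.0 + x, 1.0 - x):
        if y > 0.0:
            total += y * math.log(y)
    return 0.5 * total


def psi(s, t):
    return sum(math.log(math.cosh(s * ti)) for ti in t)


def objective(b, s, t):
    """s * <t, b> - psi_1^*(b) for finitely supported t and b."""
    return sum(s * ti * bi - entropy_term(bi) for ti, bi in zip(t, b))


t = [1.0, 0.5, 0.25]   # hypothetical weights
s = 1.3                # hypothetical argument
b_star = [math.tanh(s * ti) for ti in t]          # candidate maximizer

grid = [k / 2000.0 for k in range(-2000, 2001)]   # points of [-1, 1]
grid_sup = sum(max(s * ti * x - entropy_term(x) for x in grid) for ti in t)

print(psi(s, t), objective(b_star, s, t), grid_sup)
```

The first two numbers should coincide, and the grid value should approach them from below.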

Take now \(\alpha =\left\langle \mathbf{t},\mathbf{b}\right\rangle \). Recall that

$$\begin{aligned} \sup _{\mathbf{b}\in \overline{B}_{\infty }(\mathbf{0};1)}\left\langle \mathbf{t},\mathbf{b}\right\rangle =\Vert \mathbf{t}\Vert _1. \end{aligned}$$

We show that every number in \((-\Vert \mathbf{t}\Vert _1,\Vert \mathbf{t}\Vert _1)\) is taken by the inner product \(\left\langle \mathbf{t},\mathbf{b}\right\rangle \) over the set \(\mathcal {D}(\psi _1^*)\). Observe that a vector \(\mathbf{b}=\sum _{i\in J}r(\mathrm{sgn}\;t_i)e_i\), where \(J\) is some finite subset of \(I\) and \(r\in [-1,1]\), belongs to \(\mathcal {D}(\psi _1^*)\) (it has only a finite number of nonzero terms). For this vector we have

$$\begin{aligned} \left\langle \mathbf{t},\mathbf{b}\right\rangle =r\sum _{i\in J}|t_i|. \end{aligned}$$

It follows that the inner product \(\left\langle \mathbf{t},\mathbf{b}\right\rangle \) attains over the set \(\mathcal {D}(\psi _1^*)\) any number belonging to the interval \((-\Vert \mathbf{t}\Vert _1,\Vert \mathbf{t}\Vert _1)\).
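The construction can be made explicit: given \(\alpha \) with \(|\alpha |<\Vert \mathbf{t}\Vert _1\), enlarge the finite set \(J\) until \(\sum _{i\in J}|t_i|>|\alpha |\) and then set \(r=\alpha /\sum _{i\in J}|t_i|\). The Python sketch below does this for the hypothetical weights \(t_i=(-1)^i2^{-i}\), \(i=1,2,\ldots \), for which \(\Vert \mathbf{t}\Vert _1=1\); the weights and the function names are assumptions chosen only for illustration.

```python
import math


def weight(i):
    """Hypothetical weights t_i = (-1)^i * 2^(-i), i = 1, 2, ...; ||t||_1 = 1."""
    return (-1.0) ** i * 2.0 ** (-i)


def finite_support_b(alpha):
    """Return (J, r) with r * sum_{i in J} |t_i| = alpha and |r| < 1."""
    if not abs(alpha) < 1.0:
        raise ValueError("alpha must lie in (-||t||_1, ||t||_1) = (-1, 1)")
    partial, n = 0.0, 0
    while partial <= abs(alpha):     # enlarge J = {1, ..., n}
        n += 1
        partial += abs(weight(n))
    return list(range(1, n + 1)), alpha / partial


def pairing(J, r):
    """<t, b> for b = sum_{i in J} r * sgn(t_i) * e_i."""
    return sum(weight(i) * r * math.copysign(1.0, weight(i)) for i in J)


J, r = finite_support_b(0.9)
print(J, r, pairing(J, r))
```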

For a fixed \(\mathbf{t}\in \ell ^1\), intersect \(\mathcal {D}(\psi _1^*)\subset \ell _\infty \) with the family of hyperplanes

$$\begin{aligned} \Big \{\mathbf{b}\in \ell _\infty :\;\left\langle \mathbf{t},\mathbf{b}\right\rangle =\alpha \Big \}_{\alpha \in (-\Vert \mathbf{t}\Vert _1,\Vert \mathbf{t}\Vert _1)}. \end{aligned}$$

Now we can divide the supremum of (3) into two parts and get

$$\begin{aligned} \psi _\mathbf{t}(s)&= \sup _{\alpha \in (-\Vert \mathbf{t}\Vert _1,\Vert \mathbf{t}\Vert _1)} \sup _{\begin{array}{c} \mathbf{b}\in \mathcal {D}(\psi _1^*) \\ \left\langle \mathbf{t},\mathbf{b}\right\rangle =\alpha \end{array}} \Big \{s\left\langle \mathbf{t},\mathbf{b}\right\rangle -\psi _1^*(\mathbf{b})\Big \}\\ \;&= \sup _{\alpha \in (-\Vert \mathbf{t}\Vert _1,\Vert \mathbf{t}\Vert _1)}\Big \{s\alpha - \inf _{\begin{array}{c} \mathbf{b}\in \mathcal {D}(\psi _1^*) \\ \left\langle \mathbf{t},\mathbf{b}\right\rangle =\alpha \end{array}}\psi _1^*(\mathbf{b})\Big \}. \end{aligned}\tag{4}$$

Define a function

$$\begin{aligned} \varphi _\mathbf{t}(\alpha )=\inf _{\begin{array}{c} \mathbf{b}\in \mathcal {D}(\psi _1^*) \\ \left\langle \mathbf{t},\mathbf{b}\right\rangle =\alpha \end{array}}\psi _1^*(\mathbf{b}). \end{aligned}$$

We prove that the infimum over the set \(\mathcal {D}(\psi _1^*)\cap \{\mathbf{b}\in \ell _\infty :\;\left\langle \mathbf{t},\mathbf{b}\right\rangle =\alpha \}\) in the above definition of \(\varphi _\mathbf{t}\) is attained, so that it can be replaced by a minimum over this set; that is, we prove

$$\begin{aligned} \varphi _\mathbf{t}(\alpha )=\min _{\begin{array}{c} \mathbf{b}\in \mathcal {D}(\psi _1^*) \\ \left\langle \mathbf{t},\mathbf{b}\right\rangle =\alpha \end{array}}\psi _1^*(\mathbf{b}) \end{aligned}\tag{5}$$

for \(\alpha \in (-\Vert \mathbf{t}\Vert _1,\Vert \mathbf{t}\Vert _1)\) and \(+\infty \) otherwise.

By the Banach–Alaoglu theorem the closed unit ball \(\overline{B}_{\infty }(\mathbf{0};1)\subset \ell _\infty \simeq (\ell ^1)^*\) is weak* compact, and for each \(\mathbf{t}\) and \(\alpha \in (-\Vert \mathbf{t}\Vert _1,\Vert \mathbf{t}\Vert _1)\) the hyperplane \(H_{\mathbf{t},\alpha }=\{\mathbf{b}\in \ell _\infty :\;\left\langle \mathbf{t},\mathbf{b}\right\rangle =\alpha \}\) is closed in this topology. Hence the intersection \(\overline{B}_{\infty }(\mathbf{0};1)\cap H_{\mathbf{t},\alpha }\) is weak* compact. Let \(\ell _0\) be the space of sequences with finite support. Obviously \(\ell _0\cap \overline{B}_{\infty }(\mathbf{0};1)\subset \mathcal {D}(\psi _1^*)\) and \(H_{\mathbf{t},\alpha }\cap \ell _0\ne \emptyset \). We have

$$\begin{aligned} \forall _{\mathbf{t}\in \ell ^1}\forall _{\alpha \in (-\Vert \mathbf{t}\Vert _1,\Vert \mathbf{t}\Vert _1)}\quad \mathcal {D}(\psi _1^*)\cap H_{\mathbf{t},\alpha }\supset \overline{B}_{\infty }(\mathbf{0};1)\cap H_{\mathbf{t},\alpha }\cap \ell _0\ne \emptyset . \end{aligned}$$

Recall that the function \(\psi _1^*\) is nonnegative and lower semicontinuous in the weak* topology. By the Weierstrass theorem \(\psi _1^*\) attains its minimum on the compact set \(\overline{B}_{\infty }(\mathbf{0};1)\cap H_{\mathbf{t},\alpha }\). Since the intersection of this set with the effective domain of \(\psi _1^*\) is nonempty, the nonnegative infimum is finite and is attained at some element of \(\mathcal {D}(\psi _1^*)\). It follows that in the definition of \(\varphi _\mathbf{t}\) we can replace the infimum by a minimum and the formula (5) holds.

The formula (4) means that \(\psi _\mathbf{t}\) is the convex conjugate of \(\varphi _\mathbf{t}\). To prove the equality \(\varphi _\mathbf{t}=\psi _\mathbf{t}^*\) we need to show that \(\varphi _\mathbf{t}\) is convex and lower semicontinuous.

First we check the convexity of \(\varphi _\mathbf{t}\). Take \(\alpha _1,\;\alpha _2\in \mathbb {R}\). If \(\alpha _1\) or \(\alpha _2\) does not belong to the interval \((-\Vert \mathbf{t}\Vert _1,\Vert \mathbf{t}\Vert _1)\), then the value of \(\varphi _\mathbf{t}\) at such \(\alpha _k\) equals \(\infty \) and the convexity condition is trivially satisfied. Otherwise, let \(\mathbf{b}_k\) \((k=1,2)\) be vectors in \(\mathcal {D}(\psi _1^*)\cap H_{\mathbf{t},\alpha _k}\) such that

$$\begin{aligned} \varphi _\mathbf{t}(\alpha _k)=\min _{\begin{array}{c} \mathbf{b}\in \mathcal {D}(\psi _1^*) \\ \left\langle \mathbf{t},\mathbf{b}\right\rangle =\alpha _k \end{array}}\psi _1^*(\mathbf{b}) =\psi _1^*(\mathbf{b}_k). \end{aligned}$$

Observe that for \(\lambda \in (0,1)\)

$$\begin{aligned} \left\langle \mathbf{t},\lambda \mathbf{b}_1+(1-\lambda )\mathbf{b}_2\right\rangle =\lambda \left\langle \mathbf{t},\mathbf{b}_1\right\rangle + (1-\lambda )\left\langle \mathbf{t},\mathbf{b}_2\right\rangle =\lambda \alpha _1+(1-\lambda )\alpha _2, \end{aligned}$$

that is, \(\lambda \mathbf{b}_1+(1-\lambda )\mathbf{b}_2\in H_{\mathbf{t},\lambda \alpha _1+(1-\lambda )\alpha _2}\). This, together with the convexity of \(\psi _1^*\), gives

$$\begin{aligned} \varphi _{\mathbf{t}}(\lambda \alpha _1+(1-\lambda )\alpha _2)&\le \psi _1^*(\lambda \mathbf{b}_1+(1-\lambda )\mathbf{b}_2)\\&\le \lambda \psi _1^*( \mathbf{b}_1)+(1-\lambda )\psi _1^*(\mathbf{b}_2) =\lambda \varphi _{\mathbf{t}}(\alpha _1)+(1-\lambda )\varphi _{\mathbf{t}}(\alpha _2). \end{aligned}$$

Now we prove the lower semicontinuity of \(\varphi _\mathbf{t}\). Recall that \(\psi _1^*\) is convex and lower semicontinuous in the weak* topology on \(\ell _\infty \). It means that for any \(c\in \mathbb {R}\) the set

$$\begin{aligned} \{\mathbf{b}\in \ell _\infty :\;\psi _1^*(\mathbf{b})\le c\} \end{aligned}\tag{6}$$

is weak* closed. Since \(\psi _1^*\ge 0\) we can assume that \(c\ge 0\). Because the above set is contained in the weak* compact unit ball \(\overline{B}_{\infty }(\mathbf{0};1)\supset \mathcal {D}(\psi _1^*)\), it is also compact in this topology. Consider the image of the set (6) under the functional \(l_{\mathbf{t}}:=\left\langle \mathbf{t},\cdot \right\rangle \), i.e.

$$\begin{aligned} l_\mathbf{t}\Big (\big \{\psi _1^*(\mathbf{b})\le c\big \}\Big ). \end{aligned}\tag{7}$$

Since for each \(\mathbf{t}\in \ell ^1\) the linear functional \(l_\mathbf{t}\) is continuous on \(\ell _\infty \) (also in the weak* topology), by the intermediate and extreme value theorems we get that the set (7) is a closed interval. By symmetry of the set (6) and linearity of the functional \(l_\mathbf{t}\) we get the existence of a real number \(\alpha \) such that

$$\begin{aligned} l_\mathbf{t}\Big (\big \{\psi _1^*(\mathbf{b})\le c\big \}\Big )=[-\alpha ,\alpha ]. \end{aligned}$$

We show that

$$\begin{aligned} \varphi _\mathbf{t}^{-1}((-\infty ,c])=[-\alpha ,\alpha ]. \end{aligned}$$

Let \(\beta \in \varphi _\mathbf{t}^{-1}((-\infty ,c])\). Since \(\psi _1^*\) is lower semicontinuous, there exists \(\mathbf{b}_\beta \) such that

$$\begin{aligned} c\ge \varphi _\mathbf{t}(\beta )= \min _{\begin{array}{c} \mathbf{b}\in \mathcal {D}(\psi _1^*) \\ \left\langle \mathbf{t},\mathbf{b}\right\rangle =\beta \end{array}}\psi _1^*(\mathbf{b})= \psi _1^*(\mathbf{b}_\beta ). \end{aligned}$$

That is \(\left\langle \mathbf{t},\mathbf{b}_\beta \right\rangle =\beta \in [-\alpha ,\alpha ]\). Conversely, let \(\beta \in [-\alpha ,\alpha ]\). Since \(l_{\mathbf{t}}=\left\langle \mathbf{t},\cdot \right\rangle \) is continuous on the connected set \(\{\psi _1^*(\mathbf{b})\le c\}\), there is \(\mathbf{b}_\beta ^\prime \in \{\psi _1^*(\mathbf{b})\le c\}\) such that

$$\begin{aligned} \left\langle \mathbf{t},\mathbf{b}_\beta ^\prime \right\rangle =\beta . \end{aligned}$$

Note that

$$\begin{aligned} \varphi _\mathbf{t}(\beta )= \min _{\begin{array}{c} \mathbf{b}\in \mathcal {D}(\psi _1^*) \\ \left\langle \mathbf{t},\mathbf{b}\right\rangle =\beta \end{array}}\psi _1^*(\mathbf{b})\le \psi _1^*(\mathbf{b}_\beta ^\prime )\le c, \end{aligned}$$

that is \(\beta \in \varphi _\mathbf{t}^{-1}((-\infty ,c])\).

Since \(\varphi _\mathbf{t}\) is convex and lower semicontinuous, we get \(\psi _\mathbf{t}^*=\varphi _\mathbf{t}\), which completes the proof.

Remark 2.1

The result of Theorem 1.1 is similar to those obtained by the contraction principle (see for instance [3]) but let us emphasize that we used the space of parameters \(\ell ^1\) to generate the convex conjugate of the investigated function and we did not consider any probability distribution on it.

Remark 2.2

Let us stress that the proof of Theorem 1.1 contains a scheme which allows us to generate, under suitable assumptions, variational formulas for the Cramér transform of other series of random variables.