Cramér transform and t-entropy

Ostaszewska, Urszula; Zajkowski, Krzysztof

doi:10.1007/s11117-013-0247-3


Open access
Published: 14 June 2013

Volume 18, pages 347–358, (2014)

Positivity


Urszula Ostaszewska¹ &
Krzysztof Zajkowski¹


Abstract

t-entropy is the convex conjugate of the logarithm of the spectral radius of a weighted composition operator (WCO). Let $X$ be a nonnegative random variable. We show how the Cramér transform with respect to the spectral radius of WCO is expressed by the t-entropy and the Cramér transform of the given random variable $X$.


1 Introduction

Let $M_X$ denote the moment-generating function of a random variable $X$, that is, $M_X(t)=Ee^{tX}$. A random variable $X$ satisfies the Cramér condition if there exists $c>0$ such that $Ee^{c|X|}<\infty $. If $X$ satisfies the Cramér condition with a constant $c>0$, then $M_X$ is well defined (it takes finite values) on a connected neighborhood of zero containing the interval $[-c,c]$, and moreover it possesses the following expansion

$$\begin{aligned} M_X(t)= \sum _{n=0}^{\infty }\frac{EX^n}{n!}t^n \quad \mathrm{for}\; |t|< t_0, \end{aligned}$$

where $t_0\ge c$, compare [3].

The Cramér transform of a random variable $X$ satisfying the Cramér condition is the Legendre–Fenchel transform of the cumulant generating function of $X$, i.e.

$$\begin{aligned} (\ln M_X)^*(a)=\sup _{t\in \mathbb R }\{at-\ln M_X(t)\}. \end{aligned}$$

It was proved in [4] that the following contraction principle holds

$$\begin{aligned} (\ln M_X)^*(a)=\inf _{m\ll \mu _X,\;\int x dm=a}D(m\Vert \mu _X), \end{aligned}$$

(1)

where $D(m\Vert \mu _X)=\int \ln \frac{dm}{d\mu _X}dm$ is the relative entropy of a probability distribution $m$ with respect to the distribution $\mu _X$ of $X$.
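The contraction principle (1) can be checked numerically in the simplest case. The sketch below assumes, purely for illustration, that $X$ is a Bernoulli variable with $P(X=1)=q$ (this choice is not taken from the source). On $\{0,1\}$ the only probability distribution with mean $a$ is Bernoulli($a$), so the infimum in (1) reduces to a single relative entropy. The helper `maximize_concave` and the bounded search window are assumptions of the sketch.

```python
import math

def maximize_concave(f, lo, hi, iters=200):
    """Ternary search for the maximum value of a concave function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return f(0.5 * (lo + hi))

def cumulant_bernoulli(t, q):
    """ln M_X(t) for P(X=1)=q, P(X=0)=1-q."""
    return math.log(1.0 - q + q * math.exp(t))

def cramer_numeric(a, q):
    """sup_t { a t - ln M_X(t) }, searched on a bounded window (an assumption)."""
    return maximize_concave(lambda t: a * t - cumulant_bernoulli(t, q), -40.0, 40.0)

def relative_entropy_bernoulli(a, q):
    """D(m || mu_X) for m = Bernoulli(a), the only law on {0,1} with mean a."""
    return a * math.log(a / q) + (1.0 - a) * math.log((1.0 - a) / (1.0 - q))

q = 0.3
for a in (0.1, 0.3, 0.6, 0.9):
    # The two columns should agree up to numerical error.
    print(a, cramer_numeric(a, q), relative_entropy_bernoulli(a, q))
```

Both sides vanish at $a=q=EX$, in line with the fact that the Cramér transform attains its minimum at the mean.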

Recall now the general notion of the Legendre–Fenchel transform. Let $f$ be a functional on a real locally convex Hausdorff space $L$ with the values in the extended system of real numbers $\bar{\mathbb{R }}=[-\infty ,+\infty ]$. The set $\mathcal D (f)=\{\varphi \in L:\;f(\varphi )<+\infty \}$ is called the effective domain of the functional $f$. The functional $f^*:L^*\mapsto \bar{\mathbb{R }}$ that is defined on the dual space by the equality

$$\begin{aligned} f^*(\mu )=\sup _{\varphi \in L}\{\left\langle \mu ,\varphi \right\rangle -f(\varphi )\}=\sup _{\varphi \in \mathcal D (f)}\{\left\langle \mu ,\varphi \right\rangle -f(\varphi )\}\quad (\mu \in L^*) \end{aligned}$$

is called the Legendre–Fenchel transform of the functional $f$ (or the convex conjugate of $f$). For a functional $g$ on the dual space $L^*$ the Legendre–Fenchel transform is defined as the functional on the initial space given by the similar formula:

$$\begin{aligned} g^{*}(\varphi )=\sup _{\mu \in L^*}\{\left\langle \mu ,\varphi \right\rangle -g(\mu )\}=\sup _{\mu \in \mathcal D (g)}\{\left\langle \mu ,\varphi \right\rangle -g(\mu )\}\quad (\varphi \in L). \end{aligned}$$

Let us emphasize that the dual functional $f^*$ is convex and lower semicontinuous with respect to the weak-$*$ topology on the dual space. Moreover, if $f:L\mapsto (-\infty ,+\infty ]$ is convex and lower semicontinuous then $(f^*)^*=f$ (the Legendre–Fenchel transform is involutory).

Now we present a general result obtained for the spectral radius of weighted composition operators. Let $\mathcal X $ be a Hausdorff compact space with Borel measure $\mu $, $\alpha :\mathcal X \mapsto \mathcal X $ a continuous mapping preserving $\mu $ (i.e. $\mu \circ \alpha ^{-1}=\mu $) and $g$ be a continuous function on $\mathcal X $. Antonevich, Bakhtin and Lebedev constructed a functional $\tau _\alpha $ depending upon $\mu $, called $\mathbf{t}$-entropy (see [1, 2]), on the set of probability and $\alpha $-invariant measures $\mathcal M ^1_\alpha $ with values in $[0,+\infty ]$ such that for the spectral radius of the weighted composition operator $ (gC_\alpha )u(x)=g(x)u(\alpha (x))$ acting in spaces $L^p(\mathcal X ,\mu ),\ 1 \le p < \infty $, the following variational principle holds

$$\begin{aligned} \ln r(gC_\alpha )=\max _{\nu \in \mathcal M ^1_\alpha } \left\{ \int _\mathcal X \ln |g|d\nu -\frac{\tau _\alpha (\nu )}{p}\right\} . \end{aligned}$$

(2)

It turned out that $\tau _\alpha $ is nonnegative (not necessarily finite-valued), convex and lower semicontinuous on $\mathcal M ^1_\alpha $.

For $\varphi \in C(\mathcal X )$ let $\lambda (\varphi )=\ln r( e^{\varphi }C_\alpha )$. The functional $\lambda $ is convex and continuous on $C(\mathcal X )$ and the formula (2) states that $\lambda $ is the Legendre–Fenchel transform of the function $\frac{\tau _\alpha }{p}$, i.e.

$$\begin{aligned} \lambda (\varphi )=\max _{\nu \in \mathcal M ^1_\alpha } \left\{ \int _\mathcal X \varphi d\nu -\lambda ^*(\nu )\right\} , \end{aligned}$$

(3)

where

$$\begin{aligned} \lambda ^{*}(\nu )=\left\{ \begin{array}{l@{\quad }l} \frac{\tau _\alpha (\nu )}{p} &{} \mathrm{for} \; \nu \in \mathcal M _\alpha ^1\; \mathrm{and}\; \tau _\alpha (\nu )<+\infty , \\ +\infty &{} \mathrm{otherwise}. \end{array} \right. \\ \end{aligned}$$

It means that the effective domain $\mathcal D (\lambda ^{*})$ is contained in $\mathcal M ^1_{\alpha }$.

It turns out that studying the spectral exponent (the logarithm of the spectral radius) of certain functions of a WCO leads naturally to expressions similar to the cumulant generating functions of random variables (see [6]). This suggests defining operators that are moment-generating functions of a WCO and then investigating their spectral exponents with tools related to the Cramér transform of the given random variables.

This approach brings together questions about the spectral radius of certain operators and forms of the Cramér transform of random variables.

2 Spectral radius of moment-generating functions of WCO

A weighted composition operator $e^\varphi C_\alpha $, considered in $L^p$-spaces (Banach lattices), is an example of a positive operator. The spectral radius of any positive operator $A$ belongs to its spectrum (see Prop. 4.1 in Ch. V of [7]), i.e. $r(A) \in \sigma (A)$. Recall that if $r(A)$ is less than the convergence radius of an analytic function $f$ then the operator $f(A)$ is well defined by the power series of $f$. If the coefficients of $f$ are nonnegative then $f(A)$ is positive for any positive operator $A$, and one has $r(f(A)) \in \sigma (f(A))$. The following proposition shows that $r(f(A)) = f(r(A))$.

Proposition 2.1

Let $A$ be a positive operator acting in a Banach lattice. Then for any analytic function $f$, with nonnegative coefficients, such that its convergence radius is greater than the spectral radius of $A$ the following holds

$$\begin{aligned} r(f(A)) = f(r(A)). \end{aligned}$$

Proof

If the spectrum $\sigma (A)$ of an operator $A$ is contained in the disc of convergence of an analytic function $f$ then one can correctly define the operator $f(A)$ and moreover by the spectral mapping theorem (see for instance [8]) we have

$$\begin{aligned} \sigma (f(A))=f(\sigma (A)). \end{aligned}$$

(4)

Since $r(A)\in \sigma (A)$, $f(r(A))\in f(\sigma (A))=\sigma (f(A))$. Thus we obtain the following inequality

$$\begin{aligned} f(r(A)) \le r(f(A)). \end{aligned}$$

To obtain the converse inequality, consider an arbitrary element $\omega \in \sigma (f(A))$. By (4) there exists $\lambda \in \sigma (A)$ such that $\omega = f(\lambda )$. Since the coefficients of $f$ are nonnegative, $|f(\lambda )|\le f(|\lambda |)$ and $f$ is nondecreasing on the nonnegative part of its interval of convergence. Obviously $|\lambda | \le r(A)$, so $f(|\lambda |) \le f(r(A))$ and consequently

$$\begin{aligned} |\omega |\le f(|\lambda |) \le f(r(A)). \end{aligned}$$

Recall that $r(f(A)) \in \sigma (f(A))$ and substituting in the above $\omega = r(f(A))$ we have

$$\begin{aligned} r(f(A)) \le f(r(A)). \end{aligned}$$

$\square $
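A finite-dimensional sanity check of Proposition 2.1 may be helpful. The sketch below assumes a $2\times 2$ matrix with nonnegative entries as a stand-in for the positive operator $A$ and takes $f=\exp $, whose coefficients $1/n!$ are nonnegative. The matrix, the truncation at 40 terms and all helper names are illustrative assumptions, not part of the source.

```python
import math

def mat_mul(P, Q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def power_series_of_matrix(coeffs, A):
    """Evaluate sum_n coeffs[n] * A^n for a 2x2 matrix A (truncated series)."""
    result = [[0.0, 0.0], [0.0, 0.0]]
    power = [[1.0, 0.0], [0.0, 1.0]]  # A^0
    for c in coeffs:
        for i in range(2):
            for j in range(2):
                result[i][j] += c * power[i][j]
        power = mat_mul(power, A)
    return result

def spectral_radius_nonneg_2x2(M):
    """Largest eigenvalue of a 2x2 matrix with nonnegative entries.

    For such matrices the discriminant (a-d)^2 + 4bc is nonnegative and the
    trace is nonnegative, so the spectral radius is the larger real root.
    """
    (a, b), (c, d) = M
    disc = (a - d) ** 2 + 4.0 * b * c
    return 0.5 * ((a + d) + math.sqrt(disc))

A = [[0.5, 1.0], [0.25, 0.3]]                            # illustrative positive operator
exp_coeffs = [1.0 / math.factorial(n) for n in range(40)]  # f = exp, truncated
f_of_A = power_series_of_matrix(exp_coeffs, A)

lhs = spectral_radius_nonneg_2x2(f_of_A)      # r(f(A))
rhs = math.exp(spectral_radius_nonneg_2x2(A))  # f(r(A))
print(lhs, rhs)
```

The two printed numbers should coincide up to rounding, as the proposition predicts.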

For a weighted composition operator $e^\varphi C_\alpha $, if $r(e^\varphi C_\alpha )$ is less than the radius of convergence of $M_X(t)= \sum _{n=0}^{\infty }\frac{EX^n}{n!}t^n$ then one can correctly define an operator

$$\begin{aligned} M_X(e^\varphi C_\alpha )=\sum _{n=0}^{\infty }\frac{EX^n}{n!}(e^\varphi C_\alpha )^n. \end{aligned}$$

(5)

Assuming $X\ge 0$ we have that $EX^n\ge 0$ and by Proposition 2.1 we obtain

$$\begin{aligned} r(M_X(e^\varphi C_\alpha ))=M_X(r(e^\varphi C_\alpha ))=M_X(e^{\lambda (\varphi )}). \end{aligned}$$

Define now a functional

$$\begin{aligned} \widetilde{\lambda }_X(\varphi )= \left\{ \begin{array}{l@{\quad }l} (\ln M_X\circ \exp )(\lambda (\varphi )) &{} \mathrm{if}\; \varphi \in \lambda ^{-1}(\mathcal D (\ln M_X\circ \exp )), \\ +\infty &{} \text{ if } \text{ not }. \end{array} \right. \end{aligned}$$

(6)

Let us emphasize that, because $\mathcal D (\ln M_X\circ \exp )$ is a left half-line or the whole of $\mathbb{R }$ and $\lambda $ is a convex functional on $C(\mathcal X )$, the set $\lambda ^{-1}(\mathcal D (\ln M_X\circ \exp ))$ is a convex subset of $C(\mathcal X )$. For a nonnegative random variable $X$ satisfying the Cramér condition the cumulant generating function $\ln M_X$ is convex, lower semicontinuous and increasing on $\mathbb{R }$. Therefore the composition $\ln M_X\circ \exp $ is also convex, lower semicontinuous and increasing. Recall that the functional $\lambda $ is convex and continuous on $C(\mathcal X )$. Hence the functional $\widetilde{\lambda }_X$, as the composition of $\ln M_X \circ \exp $ with $\lambda $, is also convex and lower semicontinuous on $C(\mathcal X )$.

Before presenting the form of the convex conjugate of $\widetilde{\lambda }_X$ in Theorem 2.4, we prove a proposition that characterizes the convex conjugate of the composition of a convex function with the exponential function.

We start with some observations. If $f$ is a convex and increasing function then its effective domain $\mathcal D (f)$ is a left half-line or the whole of $\mathbb R $; moreover $\mathcal D (f^*)\subset [0,+\infty )$.

Proposition 2.2

Let $f$ be a convex, increasing and lower semicontinuous function on $\mathbb R $ such that $\mathcal D (f)$ is some neighborhood of zero. Then

$$\begin{aligned} (f\circ \exp )^*(a)=\min _{\alpha \ge 0}\{f^*(\alpha )-a\ln \alpha \}+(\exp )^*(a) \end{aligned}$$

(7)

for $a\in \mathcal D ((f\circ \exp )^*)$.

Proof

Recall that the convex conjugate of the exponential function, $\exp ^{*}(c)$, takes the value $c\ln c - c$ if $c>0$, $\exp ^{*}(0)=0$ and $\exp ^{*}(c)= +\infty $ if $c<0$. Because the effective domain of $\exp $ is the whole of $\mathbb{R }$, its support function satisfies $\sigma _{\mathcal D (\exp )}(a)= +\infty $ for $a\ne 0$ and $\sigma _{\mathcal D (\exp )}(0)=0$.

Observe now that $f\circ \exp $ is convex, increasing and lower semicontinuous. For this reason $\mathcal D ((f\circ \exp )^*)\subset [0,+\infty )$. Using the formula on the convex conjugate of composite functions (see Th. 2.5.1 in [5]), for $a\in \mathcal D ((f\circ \exp )^*)$, we get

$$\begin{aligned} (f\circ \exp )^*(a)=\min _{\alpha \ge 0}\{f^*(\alpha )+(\alpha \exp )^*(a)\}. \end{aligned}$$

(8)

Assume first that $a$ is a positive number belonging to $\mathcal D ((f\circ \exp )^*)$. If $\alpha =0$ then $(0\exp )^*(a)=\sigma _{\mathcal D (\exp )}(a)= +\infty $. It follows that the above minimum can be sought over $\alpha >0$. For $\alpha >0$ we have $(\alpha \exp )^*(a)= \alpha \exp ^*(\frac{a}{\alpha })=a\ln a-a\ln \alpha -a=\exp ^*(a)-a\ln \alpha $. Substituting this into (8), for $a > 0$, we obtain

$$\begin{aligned} (f\circ \exp )^*(a) = \min _{\alpha \ge 0}\{f^*(\alpha )-a\ln \alpha \}+(\exp )^*(a) \end{aligned}$$

(9)

Consider now the possible case when $0\in \mathcal D ((f\circ \exp )^*)$. Notice that then $(\alpha \exp )^*(0)=0$ for each $\alpha \ge 0$ and the formula (8) takes the form

$$\begin{aligned} (f\circ \exp )^*(0)=\min _{\alpha \ge 0}f^*(\alpha ) \end{aligned}$$

which coincides with (7) for $a=0$. Finally, let us emphasize that if $f^*$ is known to attain its minimum at a positive number then the minimum in (9) can be sought over $\alpha > 0$. $\square $

Observe that if $X$ is a nonnegative random variable that is not identically zero (a.e.) and satisfies the Cramér condition, then its cumulant generating function $\ln M_X$ is convex, increasing and lower semicontinuous. Note that $\ln M_X(0)=0$. Moreover $(\ln M_X)^{*}$ attains its minimum, which equals zero, at $a=EX$. Thus for the cumulant generating function we can formulate the following

Corollary 2.3

Let $X$ be a nonnegative and not identically zero (a.e.) random variable satisfying the Cramér condition. The convex conjugate of the composite function $\ln M_X\circ \exp $ can be expressed by the Cramér transform of $X$ as follows

$$\begin{aligned} (\ln M_X\circ \exp )^*(a)=\min _{\alpha > 0}\{(\ln M_X)^*(\alpha )-a\ln \alpha \}+(\exp )^*(a) \end{aligned}$$

(10)

for $a\in \mathcal D ((\ln M_X \circ \exp )^*)$. Moreover $0\in \mathcal D ((\ln M_X \circ \exp )^*)$ and $(\ln M_X\circ \exp )^*(0)=0$.
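Formula (10) can be tested numerically. The sketch below assumes, for illustration only, a Bernoulli variable with $P(X=1)=q$, for which $(\ln M_X)^*(\alpha )$ is the binary relative entropy on $[0,1]$ and $+\infty $ elsewhere. It compares the right-hand side of (10) with a direct evaluation of $\sup _t\{at-\ln M_X(e^t)\}$. The search windows and the helper `maximize_concave` are assumptions of the sketch.

```python
import math

def maximize_concave(f, lo, hi, iters=200):
    """Ternary search for the maximum value of a concave function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return f(0.5 * (lo + hi))

def exp_conjugate(a):
    """exp^*(a) = a ln a - a for a > 0, and 0 at a = 0."""
    return a * math.log(a) - a if a > 0 else 0.0

def cramer_bernoulli(alpha, q):
    """(ln M_X)^*(alpha) for Bernoulli(q); finite only for alpha in [0, 1]."""
    def xlog(x, y):
        return x * math.log(x / y) if x > 0 else 0.0
    return xlog(alpha, q) + xlog(1.0 - alpha, 1.0 - q)

def conjugate_direct(a, q):
    """sup_t { a t - ln M_X(e^t) } by direct search over t."""
    f = lambda t: a * t - math.log(1.0 - q + q * math.exp(math.exp(t)))
    return maximize_concave(f, -30.0, 5.0)

def conjugate_via_corollary(a, q):
    """min_{alpha>0} { (ln M_X)^*(alpha) - a ln alpha } + exp^*(a), as in (10)."""
    g = lambda alpha: -(cramer_bernoulli(alpha, q) - a * math.log(alpha))
    return -maximize_concave(g, 1e-12, 1.0) + exp_conjugate(a)

q = 0.3
for a in (0.5, 1.0, 3.0):
    print(a, conjugate_direct(a, q), conjugate_via_corollary(a, q))
```

The minimisation runs over $(0,1]$ only because the Cramér transform of a Bernoulli variable is infinite outside $[0,1]$; for other distributions the range of $\alpha $ has to be adapted.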

Theorem 2.4

The convex conjugate of the functional $\widetilde{\lambda }_X$ defined by (6) is of the form

$$\begin{aligned} \widetilde{\lambda }_X^{*}(\widetilde{\nu })=\frac{1}{p}\widetilde{\nu }(\mathcal X )\tau _\alpha \left( \frac{\widetilde{\nu }}{\widetilde{\nu }(\mathcal X )}\right) + (\ln M_X\circ \exp )^*(\widetilde{\nu }(\mathcal X )). \end{aligned}$$

(11)

If $\widetilde{\nu }(\mathcal X ) =0$ then $\widetilde{\lambda }_X^{*}(\mathbf{0})=0$. Moreover, the effective domain of $\widetilde{\lambda }_X^{*}$ is contained in the set $ \widetilde{\mathcal{M }}=\{\widetilde{\nu }=a\nu : \nu \in \mathcal M ^1_{\alpha }\ \mathrm{and} \ a \in \mathcal D ((\ln M_X\circ \exp )^{*}) \}$.

Proof

The composition $\ln M_X\circ \exp $ is convex and lower semicontinuous. Since the Legendre–Fenchel transform is involutory on such functions, we get

$$\begin{aligned} (\ln M_X\circ \exp )(t) = \sup _{a\in \mathcal D ((\ln M_X\circ \exp )^{*})}\Big \{t a - (\ln M_X\circ \exp )^*(a)\Big \}. \end{aligned}$$

(12)

Since for $a=0$ the expression on the right-hand side equals zero, whereas $(\ln M_X\circ \exp )(t)>0$ when $X$ is not identically zero, the supremum can be taken over the set $\mathcal D ((\ln M_X\circ \exp )^{*}){\setminus } \{0\}$.

Substituting $t=\lambda (\varphi )$ into (12) and using the variational principle (3) we get

$$\begin{aligned} \widetilde{\lambda }_X(\varphi )&= \sup _{a\in \mathcal D ((\ln M_X\circ \exp )^{*}){\setminus } \{0\}}\Big \{\lambda (\varphi ) a - (\ln M_X\circ \exp )^*(a)\Big \}\\&= \sup _{a\in \mathcal D ((\ln M_X\circ \exp )^{*}){\setminus } \{0\}}\sup _{\nu \in \mathcal M ^1_\alpha }\left\{ \int _\mathcal X \varphi d(a\nu )-a\frac{\tau _\alpha (\nu )}{p}- (\ln M_X\circ \exp )^*(a)\right\} .\\ \end{aligned}$$

Denoting $a\nu $ by $\widetilde{\nu }$ we have that $\widetilde{\nu }(\mathcal X )=a$ and $\nu =\frac{\widetilde{\nu }}{\widetilde{\nu }(\mathcal X )}$ for $\widetilde{\nu }(\mathcal X )\ne 0$. Let us define $\widetilde{\mathcal{M }}_{+}=\{a\nu : \nu \in \mathcal M ^1_\alpha \;\mathrm{and}\;a\in \mathcal D ((\ln M_X\circ \exp )^{*}){\setminus } \{0\}\}$. Note that $\widetilde{\mathcal{M }}_{+}=\widetilde{\mathcal{M }}{\setminus } \{ \mathbf{0 } \}$. With this notation we can rewrite the above as follows

$$\begin{aligned} \widetilde{\lambda }_X(\varphi )=\sup _{\widetilde{\nu }\in \widetilde{\mathcal{M }}_{+}}\left\{ \int _\mathcal X \varphi d\widetilde{\nu }-\frac{1}{p}\widetilde{\nu }(\mathcal X )\tau _\alpha \left( \frac{\widetilde{\nu }}{\widetilde{\nu }(\mathcal X )}\right) - (\ln M_X\circ \exp )^*(\widetilde{\nu }(\mathcal X ))\right\} . \end{aligned}$$

Let us note that the above equation has the form of the Legendre–Fenchel transform. Thus we immediately obtain convexity and lower semicontinuity of the functional $\widetilde{\lambda }_X$ on $C(\mathcal X )$.

It remains to prove that the expression

$$\begin{aligned} \frac{1}{p}\widetilde{\nu }(\mathcal X )\tau _\alpha \left( \frac{\widetilde{\nu }}{\widetilde{\nu }(\mathcal X )}\right) + (\ln M_X\circ \exp )^*(\widetilde{\nu }(\mathcal X )) \end{aligned}$$

(13)

is convex and lower semicontinuous on $\widetilde{\mathcal{M }}_{+}$. Notice that $\widetilde{\mathcal{M }}$ is a convex subset of $C(\mathcal X )^*$ and that, on $\widetilde{\mathcal{M }}$, $\widetilde{\nu }(\mathcal X )$ is the total variation of $\widetilde{\nu }$, which is a norm on $C(\mathcal X )^*$. For this reason the functions $\widetilde{\nu }\mapsto \widetilde{\nu }(\mathcal X )$ and $\widetilde{\nu }\mapsto \frac{\widetilde{\nu }}{\widetilde{\nu }(\mathcal X )}$ are continuous on $\widetilde{\mathcal{M }}_{+}$. The $\mathbf{t}$-entropy and $(\ln M_X\circ \exp )^*$ are lower semicontinuous on $\mathcal M ^1_\alpha $ and $\mathbb R $, respectively. Thus the expression (13) is lower semicontinuous on $\widetilde{\mathcal{M }}_{+}$.

Convexity of $(\ln M_X\circ \exp )^*$ on $\mathbb R $, together with additivity and positive homogeneity of the total variation on $\widetilde{\mathcal{M }}$, gives convexity of $(\ln M_X\circ \exp )^*(\widetilde{\nu }(\mathcal X ))$ on $\widetilde{\mathcal{M }}$. Moreover by convexity of $\tau _{\alpha }$, for $s \in [0,1]$, we get

$$\begin{aligned}&[s\widetilde{\nu }_1(\mathcal X )+ (1-s)\widetilde{\nu }_2(\mathcal X )] \tau _{\alpha }\left( \frac{s\widetilde{\nu }_1+ (1-s)\widetilde{\nu }_2}{s\widetilde{\nu }_1(\mathcal X )+ (1-s)\widetilde{\nu }_2(\mathcal X )}\right) \\&\quad = [s\widetilde{\nu }_1(\mathcal X )+ (1-s)\widetilde{\nu }_2(\mathcal X )]\tau _{\alpha }\\&\qquad \cdot \left( \frac{s\widetilde{\nu }_1(\mathcal X )}{s\widetilde{\nu }_1(\mathcal X )+ (1-s)\widetilde{\nu }_2(\mathcal X )}\cdot \frac{\widetilde{\nu }_1}{\widetilde{\nu }_1(\mathcal X ) } + \frac{(1-s)\widetilde{\nu }_2(\mathcal X )}{s\widetilde{\nu }_1(\mathcal X )+ (1-s)\widetilde{\nu }_2(\mathcal X )}\cdot \frac{\widetilde{\nu }_2}{\widetilde{\nu }_2(\mathcal X )}\right) \\&\quad \le s\widetilde{\nu }_1(\mathcal X )\tau _{\alpha }\left( \frac{\widetilde{\nu }_1}{\widetilde{\nu }_1(\mathcal X ) }\right) + (1-s)\widetilde{\nu }_2(\mathcal X )\tau _{\alpha }\left( \frac{\widetilde{\nu }_2}{\widetilde{\nu }_2(\mathcal X ) }\right) . \end{aligned}$$

For this reason the expression (13) is convex and lower semicontinuous on $\widetilde{\mathcal{M }}_{+}$. It means that the formula (13) is equal to $\widetilde{\lambda }_X^{*}$ on this set.

To calculate the value of $\widetilde{\lambda }_X^{*}$ at $\widetilde{\nu }\equiv \mathbf{0}$ we use the definition of the Legendre–Fenchel transform, i.e.

$$\begin{aligned} \widetilde{\lambda }_X^{*}(\mathbf{0}) =\sup _{\varphi \in \mathcal D (\widetilde{\lambda }_X)}\{-(\ln M_X\circ \exp )(\lambda (\varphi ))\}=-\inf _{\varphi \in \mathcal D (\widetilde{\lambda }_X)}(\ln M_X)(r(e^\varphi C_\alpha )). \end{aligned}$$

The cumulant generating function $\ln M_X$ is continuous at $0$, where its value equals $0$. Because the spectral radius $r(e^\varphi C_\alpha )$ can be an arbitrarily small positive number, we obtain that $\widetilde{\lambda }_X^*(\mathbf{0})=0$. $\square $

Example 2.5

Let a random variable $X$ be exponentially distributed with a positive parameter $\mu $, i.e. with the density function $ f(x)=\mu e ^{-\mu x}\mathbf{1}_{(0,\infty )}(x). $ Its cumulant generating function is $ \ln M_X(t)= \ln \frac{\mu }{\mu - t} $ for $t<\mu $ and $+\infty $ otherwise. It is a convex and increasing function. The classical Legendre–Fenchel transform gives that

$$\begin{aligned} (\ln M_X)^*(a)=a[(\ln M_X)^{\prime }]^{-1}(a)-(\ln M_X)([(\ln M_X)^{\prime }]^{-1}(a)), \end{aligned}$$

where $[(\ln M_X)^{\prime }]^{-1}$ is the inverse function to the derivative of $\ln M_X$. By direct calculations we get that

$$\begin{aligned} (\ln M_X)^*(a)=\mu a-\ln (\mu a)-1 \end{aligned}$$

for $a>0$. In the same manner we can obtain the formula on

$$\begin{aligned} (\ln M_X\circ \exp )^*(a)=a\ln (\mu a)-(a+1)\ln (a+1)\quad (a>0). \end{aligned}$$

(14)
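The two closed forms of this example can be compared with a direct numerical evaluation of the suprema that define them. The sketch below fixes $\mu =2$ as an arbitrary illustrative value; the search windows and the helper `maximize_concave` are assumptions of the sketch, not part of the source.

```python
import math

def maximize_concave(f, lo, hi, iters=200):
    """Ternary search for the maximum value of a concave function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return f(0.5 * (lo + hi))

mu = 2.0  # illustrative rate of the exponential distribution

def cumulant_exponential(t):
    """ln M_X(t) = ln(mu / (mu - t)), finite for t < mu."""
    return math.log(mu / (mu - t))

def cramer_closed(a):
    """Closed form of (ln M_X)^*(a) for a > 0."""
    return mu * a - math.log(mu * a) - 1.0

def composed_conjugate_closed(a):
    """Closed form (14) of (ln M_X o exp)^*(a) for a > 0."""
    return a * math.log(mu * a) - (a + 1.0) * math.log(a + 1.0)

def cramer_numeric(a):
    """sup over t < mu of a t - ln M_X(t)."""
    return maximize_concave(lambda t: a * t - cumulant_exponential(t),
                            -200.0, mu - 1e-12)

def composed_conjugate_numeric(a):
    """sup over t < ln(mu) of a t - ln M_X(e^t)."""
    return maximize_concave(lambda t: a * t - cumulant_exponential(math.exp(t)),
                            -50.0, math.log(mu) - 1e-12)

for a in (0.1, 1.0, 5.0):
    print(a, cramer_numeric(a), cramer_closed(a),
          composed_conjugate_numeric(a), composed_conjugate_closed(a))
```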

We consider the operator of the form $M_X(A)= \mu (\mu I- A)^{-1}$, which is well defined if $r(A)<\mu $. For the weighted composition operator $e^\varphi C_\alpha $, the set $\{\varphi \in C(\mathcal X ):\ r(e^\varphi C_\alpha )<\mu \}$ is the effective domain of the functional $\widetilde{\lambda }_X$. Substituting the formula for $(\ln M_X\circ \exp )^*$ into (11) we get the explicit form of the convex conjugate of $\widetilde{\lambda }_X$ for the exponentially distributed random variable.
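For the resolvent-type operator $M_X(A)=\mu (\mu I-A)^{-1}$, Proposition 2.1 gives $r(M_X(A))=\mu /(\mu -r(A))$. A small matrix experiment illustrates this. The $2\times 2$ nonnegative matrix and the value $\mu =2$ below are illustrative assumptions standing in for a positive operator with $r(A)<\mu $.

```python
import math

def spectral_radius_nonneg_2x2(M):
    """Largest eigenvalue of a 2x2 matrix with nonnegative entries."""
    (a, b), (c, d) = M
    disc = (a - d) ** 2 + 4.0 * b * c
    return 0.5 * ((a + d) + math.sqrt(disc))

def inverse_2x2(M):
    """Inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

mu = 2.0
A = [[0.5, 1.0], [0.25, 0.3]]          # illustrative; r(A) is about 0.91 < mu
r_A = spectral_radius_nonneg_2x2(A)

# M_X(A) = mu * (mu I - A)^{-1}
mu_I_minus_A = [[mu - A[0][0], -A[0][1]], [-A[1][0], mu - A[1][1]]]
M_X_of_A = [[mu * x for x in row] for row in inverse_2x2(mu_I_minus_A)]

lhs = spectral_radius_nonneg_2x2(M_X_of_A)  # r(M_X(A))
rhs = mu / (mu - r_A)                        # M_X(r(A))
print(lhs, rhs)
```

Since $r(A)<\mu $, the inverse equals the Neumann series $\sum _n A^n/\mu ^{n+1}$ and therefore has nonnegative entries, which is what justifies using the nonnegative-matrix formula for its spectral radius.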

Remark 2.6

Using the Legendre–Fenchel transform we can also obtain the formula

$$\begin{aligned} ((\ln M_X)^*\circ \exp )^*(a)=(a+1)\ln (a+1)-a\ln \mu -a. \end{aligned}$$

In this case $\mathcal D ((\ln M_X\circ \exp )^*)=[0,\infty )$.

Recall that if $X$ satisfies the Cramér condition with some $c>0$ then for $t\in (-c,c)$ one has

$$\begin{aligned} M_X(t)= \sum _{n=0}^{\infty }\frac{EX^n}{n!}t^n \end{aligned}$$

and it is known that this power series has convergence radius $R$ not less than $c$. Moreover, if the moment-generating function $M_X$ of a nonnegative random variable additionally satisfies the condition $\lim _{t\rightarrow R^{-}} M_X(t)=+\infty $ then, by Theorem 2.5 in [6], we obtain the following formula for the convex conjugate of the composition $\ln M_X\circ \exp $ in terms of the moments of the random variable $X$

$$\begin{aligned} (\ln M_X \circ \exp )^{*}(a)=\left\{ \begin{array}{l@{\quad }l} \min \limits _{(t_k)\in S_a} \liminf \limits _{N\rightarrow \infty } \sum \nolimits _{k=0}^{N} t_k\ln \frac{t_k k!}{EX^k} &{} a>0, \\ 0 &{} a=0,\\ +\infty &{} a<0, \end{array} \right. \end{aligned}$$

(15)

where $S_a=\{ (t_n): t_n\ge 0,\ \sum _{n=0}^{\infty } t_n=1 \ \mathrm{and} \ \sum _{n=0}^{\infty } n t_n =a\}$. Let us emphasize that this is another formula for the convex conjugate of the composition $\ln M_X\circ \exp $.

Example 2.7

The moments of the exponentially distributed random variable $X$ equal $E X^n = \frac{n!}{\mu ^n}$ for any $n$. Notice that the moment-generating function of $X$ satisfies the assumptions of Theorem 2.5 in [6] and for $a>0$, by the formula (15), we obtain

$$\begin{aligned} (\ln M_X \circ \exp )^{*}(a)&= \min \limits _{(t_n)\in S_a} \liminf \limits _{N\rightarrow \infty } \sum _{k=0}^{N} (k t_k \ln \mu +t_k\ln t_k) \nonumber \\&= \min \limits _{(t_n)\in S_a}\left\{ \left( \,\,\sum _{n=0}^\infty nt_n\right) \ln \mu + \sum _{n=0}^\infty t_n\ln t_n\right\} \nonumber \\&= a\ln \mu + \min _{(t_n)\in S_a} \sum _{n=0}^\infty t_n\ln t_n . \end{aligned}$$

(16)

Let us emphasize that, as was proved in [9, Prop. 2.3], the entropy function of infinitely many variables $\sum _{n=0}^\infty t_n\ln t_n$ takes finite values on the set $\{ (t_n): t_n\ge 0,\ \sum _{n=0}^{\infty } t_n=1 \ \mathrm{and} \ \sum _{n=0}^{\infty } n t_n <\infty \}$, and therefore the above series are convergent. Moreover, comparing (14) and (16) we get

$$\begin{aligned} a \ln a - (a+1)\ln (a+1)= \min _{(t_n)\in S_a} \sum _{n=0}^\infty t_n\ln t_n . \end{aligned}$$

On the left-hand side there is the Legendre–Fenchel transform of $\ln M_X \circ \exp $ for the parameter $\mu =1$.
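The identity obtained by comparing (14) and (16) can be illustrated numerically. The sketch below relies on a standard maximum-entropy fact that is not stated in the source: among sequences in $S_a$ the geometric weights $t_n=\frac{1}{1+a}\left( \frac{a}{1+a}\right) ^n$ minimise $\sum t_n\ln t_n$. The truncation at 400 terms and the Poisson weights used for comparison are further assumptions of the sketch.

```python
import math

def neg_entropy(weights):
    """sum_n t_n ln t_n with the convention 0 ln 0 = 0."""
    return sum(t * math.log(t) for t in weights if t > 0.0)

def geometric_weights(a, n_terms=400):
    """t_n = (1/(1+a)) (a/(1+a))^n: total mass 1 and mean a (up to truncation)."""
    p = 1.0 / (1.0 + a)
    rho = a / (1.0 + a)
    return [p * rho ** n for n in range(n_terms)]

def poisson_weights(a, n_terms=400):
    """Another element of S_a, used only as a comparison point."""
    return [math.exp(-a + n * math.log(a) - math.lgamma(n + 1))
            for n in range(n_terms)]

def closed_form(a):
    """a ln a - (a+1) ln(a+1), the left-hand side of the identity."""
    return a * math.log(a) - (a + 1.0) * math.log(a + 1.0)

a = 2.0
print(closed_form(a))                      # value predicted by the identity
print(neg_entropy(geometric_weights(a)))   # attained by the geometric weights
print(neg_entropy(poisson_weights(a)))     # a feasible point with a larger value
```

The Poisson weights have the same mean $a$ but give a strictly larger value, which is consistent with the minimum being attained elsewhere in $S_a$.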

Consider a discrete random variable $X$ taking values in $\mathbb{N }\cup \{0\}$, with $P(X=n)=p_n$. This case offers another application of Theorem 2.5 in [6]. The probability-generating function of $X$ has the form

$$\begin{aligned} g_X(s)= \sum _{n=0}^{\infty } p_n s^n. \end{aligned}$$

Since $g_X(1)=1$, the convergence radius $R$ of $g_X$ is not less than $1$. Let $A$ be a positive operator with spectral radius less than $R$ and greater than zero. We can now consider the operator $g_X(A)$. Let us emphasize that this is a different kind of operator from those considered above. By Proposition 2.1, its spectral radius can be rewritten as follows

$$\begin{aligned} r(g_X(A))&= \sum _{n=0}^{\infty } p_n r(A)^n = \sum _{n=0}^{\infty } p_n e^{n \ln r(A)}\\&= M_X(\ln r(A)). \end{aligned}$$

If the operator $A$ is a weighted composition operator then we obtain that the logarithm of the spectral radius of

$$\begin{aligned} g_X(e^{\varphi }C_{\alpha })=\sum _{n=0}^{\infty } p_n (e^{\varphi }C_{\alpha })^n \end{aligned}$$

(17)

is the composition of the cumulant generating function with the functional $\lambda $, i.e.

$$\begin{aligned} \ln r(g_X(e^{\varphi }C_{\alpha }))= (\ln M_X \circ \lambda )(\varphi ). \end{aligned}$$

Define now a functional $\widehat{\lambda }_X$ by the following formula

$$\begin{aligned} \widehat{\lambda }_X(\varphi )= (\ln M_X \circ \lambda )(\varphi ) \end{aligned}$$

(18)

for $\varphi \in \lambda ^{-1}(\mathcal D (\ln M_X))$ and $+\infty $ otherwise.

Because the cumulant generating function is convex and lower semicontinuous on $\mathbb R $ then the following equality is satisfied

$$\begin{aligned} (\ln M_X)(t) = \sup _{a\in \mathbb R }\Big \{t a - (\ln M_X)^*(a)\Big \}. \end{aligned}$$

(19)

Observe that for the discrete random variable with values in the set of nonnegative integers

$$\begin{aligned} (\ln M_X)(t)=(\ln g_X\circ \exp )(t)=\ln \sum _{n=0}^\infty p_ne^{nt} \end{aligned}$$

and we can once again use Theorem 2.5 in [6]. Taking $a_n$ in Theorem 2.5 equal to the probability $p_n$, and assuming that $p_n>0$, we obtain the following formula

$$\begin{aligned} (\ln M_X)^*(a)=\min _{(t_k)\in S_a} \liminf _{N\rightarrow \infty } \sum _{k=0}^{N} t_k\ln \frac{t_k}{p_k}\quad (a\in \mathrm{int}\,\mathcal D ((\ln M_X)^*)) \end{aligned}$$

which is an example (for a discrete random variable) of the contraction principle (1).

Substituting $t=\lambda (\varphi )$ into (19) and using the formula (3) we get

$$\begin{aligned} \widehat{\lambda }_X(\varphi )=\sup _{a\in \mathcal D ((\ln M_X)^*)}\sup _{\nu \in \mathcal M ^1_\alpha }\left\{ \int _\mathcal X \varphi d(a\nu )-a\frac{\tau _\alpha (\nu )}{p}- (\ln M_X)^*(a)\right\} . \end{aligned}$$

Defining now the set $\widehat{\mathcal{M }}_+=\{a\nu : \nu \in \mathcal M ^1_\alpha \;\mathrm{and}\;a\in \mathcal D ((\ln M_X)^*)\}{\setminus } \{\mathbf{0}\}$ and substituting it into the above formula we get

$$\begin{aligned} \widehat{\lambda }_X(\varphi )= \sup _{\widehat{\nu }\in \widehat{\mathcal{M }}_+}\left\{ \int _\mathcal X \varphi d\widehat{\nu } - \frac{1}{p}\widehat{\nu }(\mathcal X )\tau _\alpha \left( \frac{\widehat{\nu }}{\widehat{\nu }(\mathcal X )}\right) - (\ln M_X)^*(\widehat{\nu }(\mathcal X ))\right\} . \end{aligned}$$

The expression

$$\begin{aligned} \frac{1}{p}\widehat{\nu }(\mathcal X )\tau _\alpha \left( \frac{\widehat{\nu }}{\widehat{\nu }(\mathcal X )}\right) + (\ln M_X)^*(\widehat{\nu }(\mathcal X )) \end{aligned}$$

is convex and lower semicontinuous on $\widehat{\mathcal{M }}_+$ and

$$\begin{aligned} \widehat{\lambda }_X^{*}(\mathbf{0})&= -\inf _{\varphi \in C(\mathcal X )}(\ln M_X)(\lambda (\varphi ))\\&= -\inf _{\varphi \in C(\mathcal X )}\ln \sum _{n=0}^\infty p_n r(e^\varphi C_\alpha )^n = -\ln p_0. \end{aligned}$$

In this way we obtained the following

Proposition 2.8

For the functional $\widehat{\lambda }_X$ given by (18) the following variational principle holds

$$\begin{aligned} \widehat{\lambda }_X(\varphi )= \sup _{\widehat{\nu }\in \widehat{\mathcal{M }}}\left\{ \int _\mathcal X \varphi d\,\widehat{\nu } - \widehat{\lambda }_X^{*}(\,\widehat{\nu }\,)\right\} , \end{aligned}$$

where $ \widehat{\mathcal{M }}=\{\widehat{\nu }=a \nu : \nu \in \mathcal M ^1_{\alpha }\ \mathrm{and} \ a\in \mathcal D ((\ln M_X)^{*}) \}$ and

$$\begin{aligned} \widehat{\lambda }_X^{*}(\,\widehat{\nu }\,)=\frac{1}{p}\widehat{\nu }(\mathcal X )\tau _\alpha \left( \frac{\widehat{\nu }}{\widehat{\nu }(\mathcal X )}\right) + (\ln M_X)^*(\,\widehat{\nu }(\mathcal X )) \quad \mathrm{for} \; \widehat{\nu }(\mathcal X )>0. \end{aligned}$$

If $\widehat{\nu }(\mathcal X )=0$ then $\widehat{\lambda }^{*}_X(\mathbf{0})=-\ln p_0$.

Remark 2.9

Let us stress once again that Theorem 2.4 and Proposition 2.8 deal with two different classes of operators. In the first we consider operators that can be symbolically written as $\int _0^\infty e^{sA}\mu _X(ds)$, where $A=e^\varphi C_\alpha $ and the integral is understood in the sense of the power series (5). In the second we investigate the spectral exponent of operators of the form $\int _0^\infty A^s\mu _X(ds)$, where the integral is defined by the series (17). Therefore Proposition 2.8 is not a simple special case of Theorem 2.4. For discrete random variables with values in $\mathbb N \cup \{0\}$ we can thus define a new type of operator.

Example 2.10

For Poisson distributed $X$ with parameter $\mu $ the probability-generating function is of the form

$$\begin{aligned} g_X(s)=e^{\mu (s-1)}. \end{aligned}$$

Then the operator $g_X(A)=e^{-\mu }e^{\mu A}$ and

$$\begin{aligned} \ln r(g_X(A))= \ln (e^{-\mu }e^{\mu r(A)})= \mu r(A) -\mu = \ln M_X(\ln r(A)). \end{aligned}$$

(20)

The cumulant generating function of $X$ is equal to $\ln M_X(t)= \mu e^t - \mu $ and its Cramér transform has the form

$$\begin{aligned} (\ln M_X )^{*}(a)=\left\{ \begin{array}{l@{\quad }l} \mu - a + a \ln \frac{a}{\mu } &{} a>0, \\ \mu &{} a=0,\\ +\infty &{} a<0. \end{array} \right. \end{aligned}$$
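The piecewise formula for the Cramér transform of the Poisson distribution can be checked against a direct numerical supremum. The sketch below fixes $\mu =1.5$ as an arbitrary illustrative value; the bounded search window means the case $a=0$, where the supremum is only approached as $t\rightarrow -\infty $, is recovered up to a negligible error. The helper `maximize_concave` is an assumption of the sketch.

```python
import math

def maximize_concave(f, lo, hi, iters=200):
    """Ternary search for the maximum value of a concave function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return f(0.5 * (lo + hi))

mu = 1.5  # illustrative Poisson parameter

def cumulant_poisson(t):
    """ln M_X(t) = mu e^t - mu."""
    return mu * (math.exp(t) - 1.0)

def cramer_closed(a):
    """Piecewise closed form of (ln M_X)^*(a)."""
    if a < 0:
        return math.inf
    if a == 0:
        return mu
    return mu - a + a * math.log(a / mu)

def cramer_numeric(a):
    """sup_t { a t - ln M_X(t) } on a bounded window (an assumption)."""
    return maximize_concave(lambda t: a * t - cumulant_poisson(t), -60.0, 20.0)

for a in (0.0, 0.5, 1.5, 4.0):
    print(a, cramer_numeric(a), cramer_closed(a))
```

At $a=\mu =EX$ both values vanish, as expected for a Cramér transform evaluated at the mean.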

Taking $A=e^{\varphi }C_{\alpha }$ in (20), by Proposition 2.8 we obtain for $\widehat{\nu }(\mathcal X )>0$ that

$$\begin{aligned} \widehat{\lambda }_X^{*}(\,\widehat{\nu }\,)=\frac{1}{p}\widehat{\nu }(\mathcal X )\tau _\alpha \left( \frac{\widehat{\nu }}{\widehat{\nu }(\mathcal X )}\right) + \mu - \widehat{\nu }(\mathcal X ) + \widehat{\nu }(\mathcal X ) \ln \frac{\widehat{\nu }(\mathcal X )}{\mu }. \end{aligned}$$

If we consider now the operator $M_X(A)= e^{-\mu } e^{\mu e^A}$ then for $A=e^{\varphi }C_{\alpha }$ by Theorem 2.4 for $\widetilde{\nu }(\mathcal X )>0$ we get

$$\begin{aligned} \widetilde{\lambda }_X^{*}(\,\widetilde{\nu }\,)=\frac{1}{p}\widetilde{\nu }(\mathcal X )\tau _\alpha \left( \frac{\widetilde{\nu }}{\widetilde{\nu }(\mathcal X )}\right) + (\ln M_X\circ \exp )^*(\widetilde{\nu }(\mathcal X )), \end{aligned}$$

where

$$\begin{aligned} (\ln M_X\circ \exp )^*(\,\widetilde{\nu }(\mathcal X ))= \min _{\alpha > 0} \Big \{ \mu - \alpha + \alpha \ln \frac{\alpha }{\mu } - \widetilde{\nu }(\mathcal X ) \ln \alpha \Big \} + (\exp )^{*} (\,\widetilde{\nu }(\mathcal X )). \end{aligned}$$
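The minimum over $\alpha $ in the last formula has no elementary closed form, but it is easy to evaluate numerically. The sketch below writes $a$ for $\widetilde{\nu }(\mathcal X )$, fixes $\mu =1.5$ as an illustrative value, and compares the minimum formula with a direct evaluation of $\sup _t\{at-\ln M_X(e^t)\}$, where $\ln M_X(e^t)=\mu e^{e^t}-\mu $. The search windows and the helper `maximize_concave` are assumptions of the sketch.

```python
import math

def maximize_concave(f, lo, hi, iters=200):
    """Ternary search for the maximum value of a concave function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return f(0.5 * (lo + hi))

mu = 1.5  # illustrative Poisson parameter

def exp_conjugate(a):
    """exp^*(a) = a ln a - a for a > 0, and 0 at a = 0."""
    return a * math.log(a) - a if a > 0 else 0.0

def cramer_poisson(alpha):
    """(ln M_X)^*(alpha) = mu - alpha + alpha ln(alpha / mu) for alpha > 0."""
    return mu - alpha + alpha * math.log(alpha / mu)

def conjugate_via_min(a):
    """min_{alpha>0} { (ln M_X)^*(alpha) - a ln alpha } + exp^*(a)."""
    g = lambda alpha: -(cramer_poisson(alpha) - a * math.log(alpha))
    return -maximize_concave(g, 1e-12, 1e3) + exp_conjugate(a)

def conjugate_direct(a):
    """sup_t { a t - (mu exp(e^t) - mu) } on a bounded window."""
    f = lambda t: a * t - mu * (math.exp(math.exp(t)) - 1.0)
    return maximize_concave(f, -30.0, 4.0)

for a in (0.5, 2.0, 6.0):
    print(a, conjugate_via_min(a), conjugate_direct(a))
```

The first-order condition for the minimum is $\alpha \ln (\alpha /\mu )=a$, which is why a numerical search is used here instead of an explicit expression.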

References

[1] Antonevich, A., Bakhtin, V., Lebedev, A.: Thermodynamics and spectral radius. Nonlinear Phenom. Complex Syst. 4(4), 318–321 (2001)
[2] Antonevich, A., Bakhtin, V., Lebedev, A.: On t-entropy and variational principle for the spectral radii of transfer and weighted shift operators. Ergodic Theory Dyn. Syst. 31(4), 995–1042 (2011)
[3] Billingsley, P.: Probability and Measure, 3rd edn. Wiley, New York (1995)
[4] Donsker, M.D., Varadhan, S.R.S.: Asymptotic evaluation of certain Markov process expectations for large time III. Commun. Pure Appl. Math. 29, 389–461 (1976)
[5] Hiriart-Urruty, J.-B., Lemaréchal, C.: Convex Analysis and Minimization Algorithms, II. Springer, Berlin (1993)
[6] Ostaszewska, U., Zajkowski, K.: Legendre–Fenchel transform of the spectral exponent of analytic functions of weighted composition operators. J. Convex Anal. 18(2), 367–377 (2011)
[7] Schaefer, H.H.: Banach Lattices and Positive Operators. Springer, Berlin (1974)
[8] Yosida, K.: Functional Analysis, 5th edn. Springer, Berlin (1978)
[9] Zajkowski, K.: Convex conjugates of analytic functions of logarithmically convex functional. J. Convex Anal. 20(1), 243–252 (2013)


Acknowledgments

We would like to thank the reviewer for his very precise improvements and inquiring remarks and comments.

Author information

Authors and Affiliations

Institute of Mathematics, University of Bialystok, Akademicka 2, 15-267 Bialystok, Poland
Urszula Ostaszewska & Krzysztof Zajkowski

Authors

Urszula Ostaszewska
Krzysztof Zajkowski

Corresponding author

Correspondence to Krzysztof Zajkowski.

Additional information

The authors are supported by the Polish National Science Center, Grant No. DEC-2011/01/B/ST1/03838.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.



Ostaszewska, U., Zajkowski, K. Cramér transform and t-entropy. Positivity 18, 347–358 (2014). https://doi.org/10.1007/s11117-013-0247-3


Received: 25 November 2012
Accepted: 05 June 2013
Published: 14 June 2013
Issue Date: June 2014
DOI: https://doi.org/10.1007/s11117-013-0247-3
