Estimation of different entropies via Lidstone polynomial using Jensen-type functionals

Khan, Khuram Ali; Niaz, Tasadduq; Pečarić, Đilda; Pečarić, Josip

doi:10.1007/s40065-020-00277-y

Estimation of different entropies via Lidstone polynomial using Jensen-type functionals

Open access
Published: 18 February 2020

Volume 9, pages 613–631, (2020)
Cite this article

Download PDF

You have full access to this open access article

Arabian Journal of Mathematics Aims and scope Submit manuscript

Estimation of different entropies via Lidstone polynomial using Jensen-type functionals

Download PDF

Khuram Ali Khan¹,
Tasadduq Niaz ORCID: orcid.org/0000-0002-2397-9608^1,2,
Đilda Pečarić³ &
…
Josip Pečarić⁴

1872 Accesses
Explore all metrics

Abstract

In this work, some new functional of Jensen-type inequalities are constructed using Shannon entropy, f-divergence, and Rényi divergence, and some estimates are obtained for these new functionals. Also using the Zipf–Mandelbrot law and hybrid Zipf–Mandelbrot law, we investigate some bounds for these new functionals. Furthermore, we generalize these new functionals for m-convex function using Lidstone polynomial.

Estimation of different entropies via Hermite interpolating polynomial using Jensen type functionals

Article 12 June 2020

Estimation of f-divergence and Shannon entropy by using Levinson type inequalities for higher order convex functions via Hermite interpolating polynomial

Article Open access 12 May 2020

Estimation of f-divergence and Shannon entropy by Levinson type inequalities via new Green’s functions and Lidstone polynomial

Article Open access 14 January 2020

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction and preliminary results

The most commonly used words, population ranks of cities in various countries, corporation sizes, income rankings can be described in terms of Zipf’s law. The f-divergence measures the difference between two probability distributions by making an average value, which is weighted by a specified function. There are other probability distributions like Csiszar f-divergence [10, 11], a special case of which is Kullback–Leibler-divergence which is used to find the appropriate distance between the probability distributions (see [18, 19]). The notion of distance is stronger than divergence, because it gives the properties of symmetry and triangle inequalities. Probability theory has application in many fields and the divergence between probability distributions has many applications in these fields.

Many natural phenomena such as distributions of wealth and income in a society, Facebook likes, football goals, and city sizes follow power-law distributions (Zipf’s Law). Auerbach [2] was the first to explore the idea that the distribution of city size can be well approximated with the help of Pareto distribution (power-law distribution). This idea was well refined by many researchers, but Zipf [27] worked significantly in this field. The distribution of city sizes is investigated by many scholars of the urban economics, like Rosen and Resnick [24], Black and Henderson [3], Ioannides and Overman [17], Soo [25], Anderson and Ge [1], and Bosker et al. [4]. Zipf’s law states that: “The rank of cities with a certain number of inhabitants varies proportional to the city sizes with some negative exponent, say that is close to unit". In other words, Zipf’s Law states that the product of city sizes and their ranks appears roughly constant. This indicates that the population of the second largest city is one half of the population of the largest city and the third largest city equal to one-third of the population of the largest city and the population of n-th city is $\frac{1}{n}$ of the largest city population. This rule is called rank, size rule, and also named as Zipf’s Law. Hence, Zip’s law shows that the city size distribution follows the Pareto distribution and, in addition, that the estimated value of the shape parameter is equal to unity.

Horváth et al. [16] introduced and obtained some estimates for new functionals based on the f-divergence functionals. and obtained some estimates for the new functionals. They obtained f-divergence and Rényi divergence by applying a cyclic refinement of Jensen’s inequality. They also constructed new inequalities for Rényi and Shannon entropies and used Zipf–Madelbrot law to illustrate the results.

The inequalities involving higher order convexity are used by many physicists in higher dimension problems since the founding of higher order convexity by Popoviciu (see [22, p. 15]). It is quite interesting that some results are true for convex functions, but in higher order convexity, they are not valid anymore.

In [22, p. 16], the following criteria are given to check the m-convexity of the function.

If $f^{(m)}$ exists, then f is m-convex if and only if $f^{(m)} \ge 0$.

In recent years, many researchers have generalized the inequalities for m-convex functions. For instance, Butt et al. generalized the Popoviciu’s inequality for m-convex function using Taylor’s formula, Lidstone polynomial, Montgomery identity, Fink’s identity, Abel–Goncharov interpolation, and Hermite interpolating polynomial (see [5,6,7,8,9]).

For many years, Jensen’s inequality has been of great interest. It was refined by defining some new functions (see [14, 15]). Horváth and Pečarić ([12, 15], see also [13, p. 26]) gave a refinement of Jensen’s inequality for convex function. They defined some essential notions to prove the refinement given as follows:

Let X be a set, and: $P(X):=$ Power set of X, |X|:= Number of elements of X, ${\mathbb {N}}$:= Set of natural numbers with 0. Consider $q \ge 1$ and $r \ge 2$ be fixed integers. Define the functions:

$$\begin{aligned}&F_{r, s}:\{1, \ldots , q\}^{r}\rightarrow \{1, \ldots , q\}^{r-1} \,\ \,\ \,\ 1 \le s \le r,\\&\qquad F_{r}: \{1, \ldots , q \}^{r} \rightarrow P\left( \{1, \ldots , q \}^{r-1}\right) , \end{aligned}$$

and

$$\begin{aligned} T_{r}: P\left( \{1, \ldots , q\}^{r} \right) \rightarrow P\left( \{1, \ldots , q\}^{r-1}\right) , \end{aligned}$$

by

$$\begin{aligned} F_{r, s}(i_1, \ldots , i_{r}):= & {} (i_1, i_2, \ldots , i_{s-1}, i_{s+1}, \ldots , i_r) \,\ \,\ \,\ 1\le s \le r,\\ F_{r}(i_1, \ldots , i_r):= & {} \bigcup \limits _{s=1}^{r}\{F_{r, s}(i_1, \ldots , i_r)\}, \end{aligned}$$

and

$$\begin{aligned} T_{r}(I)=\left\{ \begin{array}{ll} \phi , &{} {I = \phi ;} \\ \bigcup \limits _{(i_1, \ldots , i_r)\in I}F_{r}(i_1, \ldots , i_r), &{} {I \ne \phi .} \end{array} \right. \end{aligned}$$

Next, the function

$$\begin{aligned} \alpha _{r, i}: \{1, \ldots , q\}^{r} \rightarrow {\mathbb {N}} \,\ \,\ \,\ 1\le i \le q \end{aligned}$$

is defined by:

$$\begin{aligned} \alpha _{r, i}(i_1, \ldots , i_r) \,\ \text {is the number of occurrences of } i \text { in the sequence}\,\ (i_1, \ldots , i_r). \end{aligned}$$

For each $I \in P(\{1, \ldots , q\}^r)$, let

$$\begin{aligned} \alpha _{I, i}:=\sum \limits _{(i_1, \ldots , i_r)\in I}\alpha _{r, i}(i_1, \ldots , i_r) \,\ \,\ \,\ 1\le i \le q. \end{aligned}$$

$\left( H_1\right) $ Let n, m be fixed positive integers, such that $n\ge 1$, $m\ge 2$, and let $I_m$ be a subset of $\{1, \ldots , n \}^m$, such that:

$$\begin{aligned} \alpha _{I_m, i} \ge 1 \,\ \,\ \,\ 1 \le i \le n. \end{aligned}$$

Introduce the sets $I_{l}\subset \{1, \ldots , n\}^{l} (m-1 \ge l \ge 1)$ inductively by:

$$\begin{aligned} I_{l-1}:= T_{l}(I_l) \,\ \,\ \,\ m \ge l \ge 2. \end{aligned}$$

Obviously, the sets $I_1= \{1, \ldots , n\}$, by $(H_1)$, and this insures that $\alpha _{I_1, i}=1(1 \le i \le n)$. From $(H_1)$, we have $\alpha _{I_l, i} \ge 1(m-1 \ge l \ge 1, 1 \le i \le n)$.

For $m \ge l \ge 2$, and for any $(j_1, \ldots , j_{l-1})\in I_{l-1}$, let:

$$\begin{aligned} {\mathscr {H}}_{I_l}(j_1, \ldots , j_{l-1}):=\{((i_1, \ldots , i_l), k) \times \{1, \ldots , l\}|F_{l, k}(i_1, \ldots , i_l)=(j_1, \ldots , j_{l-1})\}. \end{aligned}$$

With the help of these sets, they define the functions $\eta _{I_m, l}: I_l \rightarrow {\mathbb {N}}(m \ge l \ge 1)$ inductively by:

$$\begin{aligned}&\eta _{I_m, m}(i_1, \ldots , i_m):=1\,\ \,\ \,\ (i_1, \ldots , i_m)\in I_m;\\&\quad \eta _{I_m, l-1}(j_1, \ldots , j_{l-1}):=\sum \limits _{((i_1, \ldots , i_l),k)\in {\mathscr {H}}_{I_l}(j_1, \ldots , j_{l-1})}\eta _{I_m, l}(i_1, \ldots ,i_l). \end{aligned}$$

They define some special expressions for $1 \le l \le m$, as follows:

$$\begin{aligned} {\mathscr {A}}_{m,l}= & {} {\mathscr {A}}_{m,l}(I_m, x_1, \ldots , x_n, p_1, \ldots , p_n ; f):= \frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l)\in I_l}\eta _{I_m, l}(i_1, \ldots , i_l) \\&\times \left( \sum \limits _{j=1}^{l}\frac{p_{i_j}}{\alpha _{I_m, i_j}}\right) f\left( \frac{\sum \nolimits _{j=1}^{l}\frac{p_{i_j}}{\alpha _{I_m, i_j}}x_{i_j}}{\sum \nolimits _{j=1}^{l}\frac{p_{i_j}}{\alpha _{I_m, i_j}}}\right) \end{aligned}$$

and prove the following theorem.

Theorem 1.1

Assume $(\mathrm{H}_1)$, and let $f: I \rightarrow {\mathbb {R}}$ be a convex function where $I \subset {\mathbb {R}}$ is an interval. If $x_1, \ldots , x_n \in I$, and $p_1, \ldots , p_n$ are positive real numbers, such that $\sum \nolimits _{i=1}^{n}p_i=1$, then

$$\begin{aligned} f\left( \sum \limits _{s=1}^{n}p_{s}x_s\right) \le {\mathscr {A}}_{m, m} \le {\mathscr {A}}_{m, m-1} \le \cdots \le {\mathscr {A}}_{m, 2} \le {\mathscr {A}}_{m, 1} = \sum \limits _{s=1}^{n}p_{s}f\left( x_s\right) . \end{aligned}$$

(1)

We define the following functionals by taking the differences of refinement of Jensen’s inequality given in (1):

$$\begin{aligned} \Theta _{1}(f)= & {} {\mathscr {A}}_{m, r} - f\left( \sum \limits _{s=1}^{n}p_s x_s\right) , \,\ \,\ \,\ r=1, \ldots , m, \end{aligned}$$

(2)

$$\begin{aligned} \Theta _{2}(f)= & {} {\mathscr {A}}_{m, r} - {\mathscr {A}}_{m, k}, \,\ \,\ \,\ 1\le r <k \le m. \end{aligned}$$

(3)

Under the assumptions of Theorem 1.1, we have:

$$\begin{aligned} \Theta _{i}(f) \ge 0, \,\ \,\ \,\ i=1,2. \end{aligned}$$

(4)

Inequalities (4) are reversed if f is concave on I.

1.1 Lidstone polynomial

We generalize the refinement of Jensen’s inequality for higher order convex function using Lidstone interpolating polynomial. In [26], Widder gives the following result.

Lemma A

If $g \in C^{\infty }([0, 1])$, then:

$$\begin{aligned} g(u)= \sum _{l=0}^{m-1}\left[ g^{(2l)}(0){\mathfrak {F}}_{l}(1 - u) + g^{(2l)}(0){\mathfrak {F}}_{l}(t) \right] + \int _{0}^{1}G_{m}(u,s)g^{(2m)}(s)\mathrm{d}s, \end{aligned}$$

where ${\mathfrak {F}}_{l}$ is a polynomial of degree $2l+1$ defined by the relation:

$$\begin{aligned} {\mathfrak {F}}_{0}(u)=u, \,\ {\mathfrak {F}}_{m}''(u)={\mathfrak {F}}_{m-1}(u), \,\ {\mathfrak {F}}_{m}(0)={\mathfrak {F}}_{m}(1)=0, \,\ m\ge 1, \end{aligned}$$

(5)

and

$$\begin{aligned} G_{1}(u, s)= G(u, s)=\left\{ \begin{array}{ll} (u-1)s, &{} \quad {\alpha _1 \le s \le u \le \alpha _2;} \\ (s-1)u, &{} \quad {\alpha _1 \le u \le s \le \alpha _2,} \end{array} \right. \end{aligned}$$

is a homogeneous Green’s function of the differential operator $\frac{\mathrm{d}^2}{\mathrm{d}^2s}$ on [0, 1], and with the successive iterates of G(u, s):

$$\begin{aligned} G_{m}(u,s)=\int _{0}^{1}G_{1}(u,p)G_{m-1}(p,s)\mathrm{d}p, \,\ \,\ m \ge 2. \end{aligned}$$

The Lidstone polynomial can be expressed in terms of $G_{m}(u, s)$ as:

$$\begin{aligned} {\mathfrak {F}}_{m}(u)=\int _{0}^{1}G_{m}(u,s)s\mathrm{d}s. \end{aligned}$$

Lidstone series representation of $g \in C^{2m}[\alpha _1, \alpha _2]$ is given by:

$$\begin{aligned} g(u)= & {} \sum _{l=0}^{m-1}(\alpha _2 - \alpha _1)^{2l}g^{(2l)}(\alpha _1){\mathfrak {F}}_{l}\left( \frac{\alpha _2 - u}{\alpha _2 - \alpha _1}\right) + \sum _{l=0}^{m-1}(\alpha _2 - \alpha _1)^{2l}g^{(2l)}(\alpha _2){\mathfrak {F}}_{l}\left( \frac{u - \alpha _1}{\alpha _2 - \alpha _1}\right) \nonumber \\&+ \, (\alpha _2 - \alpha _1)^{2l-1}\int _{\alpha _1}^{\alpha _2}G_{m}\left( \frac{u - \alpha _1}{\alpha _2 - \alpha _1}, \frac{t - \alpha _1}{\alpha _2 - \alpha _1}\right) g^{(2l)}(t)\mathrm{d}t. \end{aligned}$$

(6)

2 Inequalities for Csiszár divergence

In [10, 11], Csiszár introduced the following notion.

Definition 2.1

Let $f : {\mathbb {R}}^{+} \rightarrow {\mathbb {R}}^{+}$ be a convex function, let ${\mathbf {r}}=\left( r_1, \ldots , r_n\right) $, and ${\mathbf {q}}=\left( q_1, \ldots , q_n\right) $ be positive probability distributions. Then, f-divergence functional is defined by:

$$\begin{aligned} I_{f}({\mathbf {r}}, {\mathbf {q}}) := \sum _{i=1}^{n}q_{i}f\left( \frac{r_i}{q_i}\right) . \end{aligned}$$

(7)

And he stated that by defining:

$$\begin{aligned} f(0) := \lim \limits _{x \rightarrow 0^{+}}f(x); \,\ \,\ \,\ 0f\left( \frac{0}{0}\right) :=0; \,\ \,\ \,\ 0f\left( \frac{a}{0}\right) := \lim \limits _{x \rightarrow 0^{+}}xf\left( \frac{a}{0}\right) , \,\ \,\ a>0, \end{aligned}$$

(8)

we can also use the non-negative probability distributions, as well.

Horv́ath et al. [16] gave the following functional based on the previous definition.

Definition 2.2

Let $I \subset {\mathbb {R}}$ be an interval and let $f: I \rightarrow {\mathbb {R}}$ be a function. Let ${\mathbf {r}}=(r_1, \ldots , r_n)\in {\mathbb {R}}^n$ and ${\mathbf {q}}=(q_1, \ldots , q_n)\in (0, \infty )^{n}$, such that:

$$\begin{aligned} \frac{r_s}{q_s} \in I, \,\ \,\ \,\ s= 1, \ldots , n. \end{aligned}$$

Then, they define the sum ${\hat{I}}_{f}({\mathbf {r}}, {\mathbf {q}})$ as:

$$\begin{aligned} {\hat{I}}_{f}({\mathbf {r}}, {\mathbf {q}}) : = \sum _{s=1}^{n}q_{s}f\left( \frac{r_s}{q_s}\right) . \end{aligned}$$

(9)

We apply Theorem 1.1 to ${\hat{I}}_{f}({\mathbf {r}}, {\mathbf {q}})$.

Theorem 2.3

Assume $(H_1)$, let $I \subset {\mathbb {R}}$ be an interval and let ${\mathbf {r}}=\left( r_1, \ldots , r_n\right) $ and ${\mathbf {q}}=\left( q_1, \ldots , q_n\right) $ be in $(0, \infty )^{n}$, such that

$$\begin{aligned} \frac{r_s}{q_s} \in I, \,\ \,\ \,\ s = 1, \ldots , n. \end{aligned}$$

(i) If $f: I \rightarrow {\mathbb {R}}$ is a convex function, then:

$$\begin{aligned} {\hat{I}}_{f}({\mathbf {r}}, {\mathbf {q}})= & {} \sum _{s=1}^{n}q_{s}f\left( \frac{r_s}{q_s}\right) =A_{m,1}^{[1]}\ge A_{m,2}^{[1]} \ge \ldots \ge A_{m,m-1}^{[1]} \ge A_{m,m}^{[1]} \ge f\left( \frac{\sum _{s=1}^{n}r_s}{\sum _{s=1}^{n}q_s} \right) \sum _{s=1}^{n}q_s, \end{aligned}$$

(10)

where

$$\begin{aligned} A_{m,l}^{[1]}= & {} \frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l) \in I_l}\eta _{I_m, l}(i_1, \ldots , i_l) \left( \sum \limits _{j=1}^{l}\frac{q_{i_j}}{\alpha _{I_m, i_j}}\right) f\left( \frac{\sum _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}}{\sum \nolimits _{j=1}^{l}\frac{q_{i_j}}{\alpha _{I_m, i_j}}}\right) . \end{aligned}$$

(11)

If f is a concave function, then inequality signs in (10) are reversed.

(ii) If $f: I \rightarrow {\mathbb {R}}$ is a function, such that $x \rightarrow xf(x) (x \in I)$ is convex, then:

$$\begin{aligned}&\left( \sum _{s=1}^{n}r_s\right) f\left( \sum _{s=1}^{n}\frac{r_s}{\sum _{s=1}^{n}q_s}\right) \le A_{m, m}^{[2]} \le A_{m,m-1}^{[2]} \le \cdots \le A_{m,2}^{[2]} \le A_{m,1}^{[2]} = \sum _{s=1}^{n}r_sf\left( \frac{r_s}{q_S}\right) = {\hat{I}}_{id f}({\mathbf {r}}, {\mathbf {q}}),\nonumber \\ \end{aligned}$$

(12)

where

$$\begin{aligned} A_{m, l}^{[2]}= & {} \frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l)\in I_l}\eta _{I_m, l}(i_1, \ldots , i_l) \left( \sum \limits _{j=1}^{l}\frac{{q_{i_j}}}{\alpha _{I_m, i_j}}\right) \left( \frac{\sum _{j=1}^{l} \frac{r_{i_j}}{\alpha _{I_{m}, i_j}}}{\sum _{j=1}^{l}\frac{q_{i_j}}{\alpha _{I_{m}, i_j}}} \right) f\left( \frac{\sum _{j=1}^{l} \frac{r_{i_j}}{\alpha _{I_{m}, i_j}}}{\sum _{j=1}^{l}\frac{q_{i_j}}{\alpha _{I_{m}, i_j}}} \right) . \end{aligned}$$

Proof

(i) Consider $p_{s} = \frac{q_{s}}{\sum _{s=1}^{n}q_s}$ and $x_{s} = \frac{r_s}{q_s}$ in Theorem 1.1, we have:

$$\begin{aligned}&f\left( \sum _{s=1}^{n}\frac{q_s}{\sum _{s=1}^{n}q_s}\frac{r_s}{q_s}\right) \le \cdots \le \frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l) \in I_l}\eta _{I_m, l}(i_1, \ldots , i_l)\nonumber \\&\quad \times \, \left( \sum \limits _{j=1}^{l}\frac{\frac{q_{i_j}}{\sum _{s=1}^{n}q_{s}}}{\alpha _{I_m, i_j}}\right) f\left( \frac{\sum \nolimits _{j=1}^{l}\frac{ \frac{q_{i_j}}{\sum _{i=1}^{n}q_{i}} }{\alpha _{I_m, i_j}}\frac{r_{i_j}}{q_{i_j}}}{\sum \nolimits _{j=1}^{l}\frac{ \frac{q_{i_j}}{\sum _{i=1}^{n}q_{i}} }{\alpha _{I_m, i_j}}}\right) \le \ldots \le \sum _{s=1}^{n}\frac{q_s}{\sum _{i=1}^{n}q_{s}}f\left( \frac{r_s}{q_s} \right) . \end{aligned}$$

(13)

And taking the sum $\sum _{s=1}^{n}q_{i}$, we have (10).

(ii) Using $f:=id f$ (where “id” is the identity function) in Theorem 1.1, we have:

$$\begin{aligned}&\sum _{s=1}^{n}p_{s}x_{s}f\left( \sum _{s=1}^{n}p_s x_s\right) \le \ldots \le \frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l)\in I_l}\eta _{I_m, l}(i_1, \ldots , i_l)\nonumber \\&\quad \times \left( \sum \limits _{j=1}^{l}\frac{p_{i_j}}{\alpha _{I_m, i_j}}\right) \left( \frac{\sum \nolimits _{j=1}^{l}\frac{p_{i_j}}{\alpha _{I_m, i_j}}x_{i_j}}{\sum \nolimits _{j=1}^{l}\frac{p_{i_j}}{\alpha _{I_m, i_j}}}\right) f\left( \frac{\sum \nolimits _{j=1}^{l}\frac{p_{i_j}}{\alpha _{I_m, i_j}}x_{i_j}}{\sum \nolimits _{j=1}^{l}\frac{p_{i_j}}{\alpha _{I_m, i_j}}}\right) \le \cdots \le \sum _{s=1}^{n}p_{s}x_{s}f(x_s). \end{aligned}$$

(14)

Now, using $p_s = \frac{q_s}{\sum _{s=1}^{n}q_s}$ and $x_s = \frac{r_s}{q_s}, \,\ s = 1, \ldots , n$, we get:

$$\begin{aligned}&\sum _{s=1}^{n}\frac{q_s}{\sum _{s=1}^{n}q_s}\frac{r_s}{q_s}f\left( \sum _{s=1}^{n}\frac{q_s}{\sum _{s=1}^{n}q_s}\frac{r_s}{q_s}\right) \le \cdots \le \frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l)\in I_l}\eta _{I_m, l}(i_1, \ldots , i_l)\nonumber \\&\quad \times \left( \sum \limits _{j=1}^{l}\frac{\frac{q_{i_j}}{\sum _{s=1}^{n}q_s}}{\alpha _{I_m, i_j}}\right) \left( \frac{\sum _{j=1}^{l} \frac{\frac{q_{i_j}}{\sum _{s=1}^{n}q_s}}{\alpha _{I_{m}, i_j}}\frac{r_{i_j}}{q_{i_j}}}{\sum _{j=1}^{l} \frac{\frac{q_{i_j}}{\sum _{s=1}^{n}q_s}}{\alpha _{I_{m}, i_j}}} \right) f\left( \frac{\sum _{j=1}^{l} \frac{\frac{q_{i_j}}{\sum _{s=1}^{n}q_s}}{\alpha _{I_{m}, i_j}}\frac{r_{i_j}}{q_{i_j}}}{\sum _{j=1}^{l} \frac{\frac{q_{i_j}}{\sum _{s=1}^{n}q_s}}{\alpha _{I_{m}, i_j}}} \right) \le \cdots \le \sum _{s=1}^{n}\frac{q_s}{\sum _{s=1}^{n}q_s} \frac{r_s}{q_s}f\left( \frac{r_s}{q_S}\right) .\nonumber \\ \end{aligned}$$

(15)

By taking sum $\sum _{s=1}^{n}q_s$ on both sides, we get (12). $\square $

3 Inequalities for Shannon entropy

Definition 3.1

(See [16]) The Shannon entropy of positive probability distribution ${\mathbf {r}}=(r_1, \ldots , r_n)$ is defined by:

$$\begin{aligned} S : = - \sum _{s=1}^{n}r_{s}\log (r_s). \end{aligned}$$

(16)

Corollary 3.2

Assume $(\mathrm{H}_1)$.

(i)
If ${\mathbf {q}}=(q_1, \ldots , q_n) \in (0, \infty )^{n}$, and the base of $\log $ is greater than 1, then:
$$\begin{aligned} S \le A_{m,m}^{[3]} \le A_{m,m-1}^{[3]} \le \cdots \le A_{m,2}^{[3]} \le A_{m,1}^{[3]} = \log \left( \frac{n}{\sum _{s=1}^{n}q_s}\right) \sum _{s=1}^{n}q_s, \end{aligned}$$
(17)
where
$$\begin{aligned}&A_{m,l}^{[3]} = - \frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l) \in I_l}\eta _{I_m, l}(i_1, \ldots , i_l)\left( \sum \limits _{j=1}^{l}\frac{q_{i_j}}{\alpha _{I_m, i_j}}\right) \log \left( \sum \limits _{j=1}^{l}\frac{q_{i_j}}{\alpha _{I_m, i_j}}\right) . \end{aligned}$$
(18)
If the base of $\log $ is between 0 and 1, then inequality signs in (17) are reversed.
(ii)
If ${\mathbf {q}}= (q_1, \ldots , q_n)$ is a positive probability distribution and the base of $\log $ is greater than 1, then we have the estimates for the Shannon entropy of ${\mathbf {q}}$:
$$\begin{aligned}&S \le A_{m,m}^{[4]} \le A_{m,m-1}^{[4]} \le \cdots \le A_{m,2}^{[4]} \le A_{m,1}^{[4]} = \log (n), \end{aligned}$$
(19)
where
$$\begin{aligned} A_{m,l}^{[4]} = - \frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l) \in I_l}\eta _{I_m, l}(i_1, \ldots , i_l) \left( \sum \limits _{j=1}^{l}\frac{q_{i_j}}{\alpha _{I_m, i_j}}\right) \log \left( \sum \limits _{j=1}^{l} \frac{q_{i_j}}{\alpha _{I_m, i_j}}\right) . \end{aligned}$$

Proof

(i) Using $f:= \log $ and ${\mathbf {r}} = (1, \ldots , 1)$ in Theorem 2.3 (i), we get (17).

(ii) It is a special case of (i). $\square $

Definition 3.3

(See [16]) The Kullback–Leibler divergence between the positive probability distribution ${\mathbf {r}}=(r_1, \ldots , r_n)$ and ${\mathbf {q}}= (q_1, \ldots , q_n)$ is defined by:

$$\begin{aligned} D({\mathbf {r}}, {\mathbf {q}}) : = \sum _{s=1}^{n}r_{i} \log \left( \frac{r_i}{q_i}\right) . \end{aligned}$$

(20)

Corollary 3.4

Assume $(\mathrm{H}_1)$.

(i)
Let ${\mathbf {r}} = (r_1 , \ldots , r_n) \in (0, \infty )^{n}$ and ${\mathbf {q}} : = (q_1, \ldots , q_n) \in (0, \infty )^{n}$. If the base of $\log $ is greater than 1, then:
$$\begin{aligned}&\sum _{s=1}^{n}r_s \log \left( \sum _{s=1}^{n}\frac{r_s}{\sum _{s=1}^{n}q_s}\right) \le A_{m, m}^{[5]} \le A_{m, m-1}^{[5]} \le \cdots \le A_{m, 2}^{[5]} \le A_{m, 1}^{[5]} = \sum _{s=1}^{n}r_s \log \left( \frac{r_s}{q_s}\right) = D({\mathbf {r}}, {\mathbf {q}}),\nonumber \\ \end{aligned}$$
(21)
where
$$\begin{aligned}&A_{m, l}^{[5]} = \frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l)\in I_l}\eta _{I_m, l}(i_1, \ldots , i_l) \left( \sum \limits _{j=1}^{l}\frac{{q_{i_j}}}{\alpha _{I_m, i_j}}\right) \left( \frac{\sum _{j=1}^{l} \frac{r_{i_j}}{\alpha _{I_{m}, i_j}}}{\sum _{j=1}^{l}\frac{q_{i_j}}{\alpha _{I_{m}, i_j}}} \right) \log \left( \frac{\sum _{j=1}^{l} \frac{r_{i_j}}{\alpha _{I_{m}, i_j}}}{\sum _{j=1}^{l}\frac{q_{i_j}}{\alpha _{I_{m}, i_j}}} \right) . \end{aligned}$$
If the base of $\log $ is between 0 and 1, then inequality in (21) is reversed.
(ii)
If $\mathbf{r }$ and $\mathbf{q }$ are positive probability distributions, and the base of $\log $ is greater than 1, then we have:
$$\begin{aligned}&D(\mathbf{r }, \mathbf{q }) = A_{m, 1}^{[6]} \ge A_{m, 2}^{[6]} \ge \ldots \ge A_{m, m-1}^{[6]} \ge A_{m, m}^{[6]} \ge 0, \end{aligned}$$
(22)
where
$$\begin{aligned} A_{m, l}^{[6]}= & {} \frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l)\in I_l}\eta _{I_m, l}(i_1, \ldots , i_l) \left( \sum \limits _{j=1}^{l}\frac{{q_{i_j}}}{\alpha _{I_m, i_j}}\right) \left( \frac{\sum _{j=1}^{l} \frac{r_{i_j}}{\alpha _{I_{m}, i_j}}}{\sum _{j=1}^{l}\frac{q_{i_j}}{\alpha _{I_{m}, i_j}}} \right) \log \left( \frac{\sum _{j=1}^{l} \frac{r_{i_j}}{\alpha _{I_{m}, i_j}}}{\sum _{j=1}^{l}\frac{q_{i_j}}{\alpha _{I_{m}, i_j}}} \right) . \end{aligned}$$
If the base of $\log $ is between 0 and 1, then inequality signs in (22) are reversed.

Proof

(i) On taking $f: = \log $ in Theorem 2.3 (ii), we get (21).

(ii) It is a special case of (i). $\square $

4 Inequalities for Rényi divergence and entropy

The Rényi divergence and entropy come from [23].

Definition 4.1

Let $\mathbf{r } := (r_1, \ldots , r_n)$ and $\mathbf{q } : = (q_1, \ldots , q_n)$ be positive probability distributions, and let $\lambda \ge 0$, $\lambda \ne 1$.

(a):: The Rényi divergence of order $\lambda $ is defined by:
$$\begin{aligned} D_{\lambda }(\mathbf{r }, \mathbf{q }) : = \frac{1}{\lambda - 1} \log \left( \sum _{i=1}^{n}q_{i}\left( \frac{r_i}{q_i}\right) ^{\lambda } \right) . \end{aligned}$$
(23)
(b):: The Rényi entropy of order $\lambda $ of $\mathbf{r }$ is defined by:
$$\begin{aligned} H_{\lambda }(\mathbf{r }) : = \frac{1}{1 - \lambda } \log \left( \sum _{i=1}^{n} r_{i}^{\lambda }\right) . \end{aligned}$$
(24)

The Rényi divergence and the Rényi entropy can also be extended to non-negative probability distributions. If $\lambda \rightarrow 1$ in (23), we have the Kullback–Leibler divergence, and if $\lambda \rightarrow 1$ in (24), then we have the Shannon entropy. In the next two results, inequalities can be found for the Rényi divergence.

Theorem 4.2

Assume $(\mathrm{H}_{1})$, let $\mathbf{r } = (r_1, \ldots , r_n)$ be $\mathbf{q } = (q_1, \ldots , q_n)$ be probability distributions.

(i)
If $ 0 \le \lambda \le \mu $, such that $ \lambda , \mu \ne 1$, and the base of $\log $ is greater than 1, then:
$$\begin{aligned}&D_{\lambda } (\mathbf{r }, \mathbf{q }) \le A_{m, m}^{[7]} \le A_{m, m-1}^{[7]} \le \cdots \le A_{m, 2}^{[7]} \le A_{m, 1}^{[7]} = D_{\mu } (\mathbf{r }, \mathbf{q }), \end{aligned}$$
(25)
where
$$\begin{aligned} A_{m, l}^{[7]}= & {} \frac{1}{\mu -1}\log \left( \frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l) \in I_l}\eta _{I_m, l}(i_1, \ldots , i_l) \left( \sum \limits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}\right) \right. \\&\times \left. \left( \frac{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}\left( \frac{r_{i_j}}{q_{i_j}}\right) ^{\lambda - 1}}{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}}\right) ^{\frac{\mu - 1}{\lambda - 1}}\right) . \end{aligned}$$
The reverse inequalities hold in (25) if the base of $\log $ is between 0 and 1.
(ii)
If $1 < \mu $ and the base of $\log $ is greater than 1, then:
$$\begin{aligned}&D_{1} (\mathbf{r }, \mathbf{q }) = D (\mathbf{r }, \mathbf{q }) = \sum _{s=1}^{n}r_s\log \left( \frac{r_s}{q_s}\right) \le A_{m, m}^{[8]} \le A_{m, m-1}^{[8]} \le \cdots \le A_{m, 2}^{[8]} \le A_{m, 1}^{[8]} = D_{\mu } (\mathbf{r }, \mathbf{q }),\nonumber \\ \end{aligned}$$
(26)
where
$$\begin{aligned} A_{m, l}^{[8]}= & {} \le \frac{1}{\mu -1}\log \left( \frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l) \in I_l}\eta _{I_m, l}(i_1, \ldots , i_l) \left( \sum \limits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}\right) \exp \right. \\&\times \left. \left( \frac{(\mu -1)\sum \nolimits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}} \log \left( \frac{r_{i_j}}{q_{i_j}}\right) }{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}} \right) \right) ; \end{aligned}$$
here the base of $\exp $ is same as the base of $\log $, and the reverse inequalities hold if the base of $\log $ is between 0 and 1.
(iii)
If $0 \le \lambda < 1$, and the base of $\log $ is greater than 1, then:
$$\begin{aligned}&D_{\lambda } (\mathbf{r }, \mathbf{q }) \le A_{m, m}^{[9]} \le A_{m, m-1}^{[9]} \le \cdots \le A_{m, 2}^{[9]} \le A_{m, 1}^{[9]} = D_{1} (\mathbf{r }, \mathbf{q }), \end{aligned}$$
(27)
where
$$\begin{aligned}&A_{m, l}^{[9]} = \frac{1}{\lambda -1}\frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l) \in I_l}\eta _{I_m, l}(i_1, \ldots , i_l) \left( \sum \limits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}\right) \log \left( \frac{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}\left( \frac{r_{i_j}}{q_{i_j}}\right) ^{\lambda - 1}}{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}}\right) .\nonumber \\ \end{aligned}$$
(28)

Proof

By applying Theorem 1.1 with $I=(0, \infty )$, $f: (0, \infty ) \rightarrow {\mathbb {R}}$, $f(t):= t^{\frac{\mu - 1}{\lambda -1}}$:

$$\begin{aligned} p_s : = r_s, \,\ \,\ \,\ x_s : = \left( \frac{r_s}{q_s} \right) ^{\lambda - 1}, \,\ s = 1, \ldots , n, \end{aligned}$$

we have:

$$\begin{aligned}&\left( \sum _{s=1}^{n}q_s \left( \frac{r_s}{q_s} \right) ^{\lambda }\right) ^{\frac{\mu - 1}{\lambda - 1}} = \left( \sum _{s=1}^{n}r_s \left( \frac{r_s}{q_s}\right) ^{\lambda - 1} \right) ^{\frac{\mu - 1}{\lambda - 1}} \nonumber \\&\quad \ldots \le \frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l) \in I_l}\eta _{I_m, l}(i_1, \ldots , i_l) \left( \sum \limits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}\right) \left( \frac{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}\left( \frac{r_{i_j}}{q_{i_j}}\right) ^{\lambda - 1}}{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}}\right) ^{\frac{\mu - 1}{\lambda - 1}} \nonumber \\&\quad \le \ldots \le \sum _{s=1}^{n}r_s \left( \left( \frac{r_s}{q_s}\right) ^{\lambda - 1}\right) ^{\frac{\mu - 1}{\lambda - 1}}, \end{aligned}$$

(29)

if either $0 \le \lambda< 1 < \beta $ or $1 < \lambda \le \mu $, and the reverse inequality in (29) holds if $0 \le \lambda \le \beta < 1$. By raising to power $\frac{1}{\mu - 1}$, we have from all:

$$\begin{aligned}&\left( \sum _{s=1}^{n}q_s \left( \frac{r_s}{q_s}\right) ^{\lambda }\right) ^{\frac{1}{\lambda - 1}} \nonumber \\&\quad \le \ldots \le \left( \frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l) \in I_l}\eta _{I_m, l}(i_1, \ldots , i_l) \left( \sum \limits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}\right) \left( \frac{\sum \limits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}\left( \frac{r_{i_j}}{q_{i_j}}\right) ^{\lambda - 1}}{\sum \limits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}}\right) ^{\frac{\mu - 1}{\lambda - 1}}\right) ^{\frac{1}{\mu - 1}} \nonumber \\&\quad \le \cdots \le \left( \sum _{s=1}^{n}r_s \left( \left( \frac{r_s}{q_s}\right) ^{\lambda - 1}\right) ^{\frac{\mu - 1}{\lambda - 1}}\right) ^{\frac{1}{\mu - 1}} = \left( \sum _{s=1}^{n}q_{s}\left( \frac{r_s}{q_s} \right) ^{\mu } \right) ^{\frac{1}{\mu - 1}}. \end{aligned}$$

(30)

Since $\log $ is increasing if the base of $\log $ is greater than 1, it now follows (25). If the base of log is between 0 and 1, then $\log $ is decreasing and, therefore, inequality in (25) is reversed. If $\lambda = 1$ and $\beta = 1$, we have (ii) and (iii), respectively, by taking limit. $\square $

Theorem 4.3

Assume $(\mathrm{H}_{1})$; let $\mathbf{r } = (r_1, \ldots , r_n)$ and $\mathbf{q } = (q_1, \ldots , q_n)$ be probability distributions. If either $0 \le \lambda < 1$ and the base of $\log $ is greater than 1, or $1 < \lambda $ and the base of $\log $ is between 0 and 1, then:

$$\begin{aligned} \frac{1}{\sum _{s=1}^{n}q_s \left( \frac{r_s}{q_s}\right) ^{\lambda }} \sum _{s=1}^{n}q_s \left( \frac{r_s}{q_s}\right) ^{\lambda } \log \left( \frac{r_s}{q_s}\right)= & {} A_{m, 1}^{[10]} \le A_{m, 2}^{[10]} \le \cdots \le A_{m, m-1}^{[10]} \le A_{m, m}^{[10]} \le D_{\lambda } (\mathbf{r }, \mathbf{q }) \le A_{m, m}^{[11]} \nonumber \\\le & {} A_{m, m}^{[11]} \le \cdots \le A_{m, 2}^{[11]} \le A_{m, 1}^{[11]} = D_{1} (\mathbf{r }, \mathbf{q }), \end{aligned}$$

(31)

where

$$\begin{aligned} A_{m, m}^{[10]}= & {} \frac{1}{(\lambda - 1)\sum _{s=1}^{n}q_{s}\left( \frac{r_{s}}{q_s}\right) ^{\lambda }} \frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l) \in I_l}\eta _{I_m, l}(i_1, \ldots , i_l)\\&\times \left( {\sum \limits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}\left( \frac{r_{i_j}}{q_{i_j}}\right) ^{\lambda - 1}}\right) \log \left( \frac{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}\left( \frac{r_{i_j}}{q_{i_j}}\right) ^{\lambda - 1}}{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}}\right) \end{aligned}$$

and

$$\begin{aligned} A_{m, m}^{[11]}&= \frac{1}{\lambda - 1}\frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l) \in I_l}\eta _{I_m, l}(i_1, \ldots , i_l) \left( \sum \limits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}\right) \log \left( \frac{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}\left( \frac{r_{i_j}}{q_{i_j}}\right) ^{\lambda - 1}}{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}}\right) . \end{aligned}$$

The inequalities in (31) are reversed if either $0 \le \lambda < 1$ and the base of $\log $ is between 0 and 1, or $1 < \lambda $ and the base of $\log $ is greater than 1.

Proof

We prove only the case when $0 \le \lambda < 1$ and the base of $\log $ is greater than 1 and the other cases can be proved similarly. Since $\frac{1}{\lambda - 1} < 0$ and the function $\log $ is concave and then choose $I = (0, \infty )$, $f : = \log $, $p_{s} = r_{s}$, $x_{s}: = \left( \frac{r_s}{q_s}\right) ^{\lambda - 1}$ in Theorem 1.1, we have:

$$\begin{aligned}&D_{\lambda } (\mathbf{r }, \mathbf{q }) = \frac{1}{\lambda -1}\log \left( \sum _{s=1}^{n}q_{s}\left( \frac{r_{s}}{q_s}\right) ^{\lambda }\right) =\frac{1}{\lambda -1}\log \left( \sum _{s=1}^{n}r_{s} \left( \frac{r_{s}}{q_s}\right) ^{\lambda - 1}\right) \nonumber \\&\quad \le \cdots \le \frac{1}{\lambda - 1}\frac{(m-1)!}{(l-1)!} \sum \limits _{(i_1, \ldots , i_l) \in I_l}\eta _{I_m, l} (i_1, \ldots , i_l) \left( \sum \limits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}\right) \log \left( \frac{\sum \nolimits _{j=1}^{l} \frac{r_{i_j}}{\alpha _{I_m, i_j}}\left( \frac{r_{i_j}}{q_{i_j}}\right) ^{\lambda - 1}}{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}}\right) \nonumber \\&\quad \le \cdots \le \frac{1}{\lambda - 1}\sum _{s=1}^{n}r_{s}\log \left( \left( \frac{r_s}{q_s}\right) ^{\lambda - 1}\right) = \sum _{s=1}^{n}r_{s}\log \left( \frac{r_s}{q_s}\right) = D_{1} (\mathbf{r }, \mathbf{q }), \end{aligned}$$

(32)

and this gives the upper bound for $D_{\lambda } (\mathbf{r }, \mathbf{q })$.

Since the base of $\log $ is greater than 1, the function $x \mapsto xf(x)$ $(x > 0)$ is convex; therefore, $\frac{1}{1 - \lambda } < 0$, and Theorem 1.1 gives:

$$\begin{aligned} D_{\lambda } (\mathbf{r }, \mathbf{q })= & {} \frac{1}{\lambda -1}\log \left( \sum _{s=1}^{n}q_{s}\left( \frac{r_{s}}{q_s}\right) ^{\lambda }\right) = \frac{1}{\lambda -1\left( \sum _{s=1}^{n}q_{s} \left( \frac{r_{s}}{q_s}\right) ^{\lambda }\right) } \left( \sum _{s=1}^{n}q_{s}\left( \frac{r_{s}}{q_s}\right) ^{\lambda }\right) \log \left( \sum _{s=1}^{n}q_{s}\left( \frac{r_{s}}{q_s}\right) ^{\lambda }\right) \nonumber \\\ge & {} \ldots \ge \frac{1}{\lambda - 1\left( \sum _{s=1}^{n}q_{s}\left( \frac{r_{s}}{q_s}\right) ^{\lambda }\right) } \frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l) \in I_l}\eta _{I_m, l}(i_1, \ldots , i_l) \left( \sum \limits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}\right) \nonumber \\&\times \left( \frac{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}\left( \frac{r_{i_j}}{q_{i_j}}\right) ^{\lambda - 1}}{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}}\right) \log \left( \frac{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}\left( \frac{r_{i_j}}{q_{i_j}}\right) ^{\lambda - 1}}{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}}\right) \nonumber \\= & {} \frac{1}{\lambda - 1\left( \sum _{s=1}^{n}q_{s} \left( \frac{r_{s}}{q_s}\right) ^{\lambda }\right) }\frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l) \in I_l}\eta _{I_m, l}(i_1, \ldots , i_l) \nonumber \\&\times \left( {\sum \nolimits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}\left( \frac{r_{i_j}}{q_{i_j}}\right) ^{\lambda - 1}}\right) \log \left( \frac{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}\left( \frac{r_{i_j}}{q_{i_j}}\right) ^{\lambda - 1}}{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}}\right) \nonumber \\\ge & {} \ldots \ge \nonumber \\&\times \frac{1}{\lambda - 1} \sum _{s=1}^{n}r_s \left( \frac{r_s}{q_s}\right) ^{\lambda - 1} \log \left( \frac{r_s}{q_s}\right) ^{\lambda - 1}\frac{1}{\sum _{s=1}^{n}r_s \left( \frac{r_s}{q_s}\right) ^{\lambda - 1}} \nonumber \\= & {} \frac{1}{\sum _{s=1}^{n}q_s \left( \frac{r_s}{q_s}\right) ^{\lambda }} \sum _{s=1}^{n}q_s \left( \frac{r_s}{q_s}\right) ^{\lambda } \log \left( \frac{r_s}{q_s}\right) , \end{aligned}$$

(33)

which give the lower bound of $D_{\lambda } (\mathbf{r }, \mathbf{q })$. $\square $

Using the previous results, some inequalities of Rényi entropy are obtained. Let $\frac{\mathbf{1 }}{\mathbf{n }} = (\frac{1}{n}, \ldots , \frac{1}{n})$ be a discrete probability distribution.

Corollary 4.4

Assume $(\mathrm{H}_1)$; let $\mathbf{r }= (r_1, \ldots , r_n)$ and $\mathbf{q }= (q_1, \ldots , q_n)$ be positive probability distributions.

(i)
If $0 \le \lambda \le \mu $, $\lambda , \mu \ne 1$, and the base of $\log $ is greater than 1, then:
$$\begin{aligned}&H_{\lambda }(\mathbf{r }) = \log (n) - D_{\lambda }\left( \mathbf{r }, \frac{1}{\mathbf{n }}\right) \ge A_{m, m}^{[12]} \ge A_{m, m}^{[12]} \ge \cdots A_{m, 2}^{[12]} \ge A_{m, 1}^{[12]} = H_{\mu }(\mathbf{r }), \end{aligned}$$
(34)
where
$$\begin{aligned} A_{m, l}^{[12]}= & {} \frac{1}{1 - \mu }\log \left( \frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l) \in I_l}\eta _{I_m, l}(i_1, \ldots , i_l) \nonumber \right. \\&\times \left. \left( \sum \limits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}\right) \left( \frac{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}^{\lambda }}{\alpha _{I_m, i_j}}}{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}}\right) ^{\frac{\mu - 1}{\lambda - 1}} \right) . \end{aligned}$$
The reverse inequalities hold in (34) if the base of $\log $ is between 0 and 1.
(ii)
If $1 < \mu $ and base of $\log $ is greater than 1, then:
$$\begin{aligned} S= -\sum _{s=1}^{n}p_i\log (p_i) \ge A_{m, m}^{[13]} \ge A_{m, m-1}^{[13]} \ge \ldots \ge A_{m, 2}^{[13]} \ge A_{m, 1}^{[13]} = H_{\mu }(\mathbf{r }), \end{aligned}$$
(35)
where
$$\begin{aligned} A_{m, l}^{[13]}= & {} \log (n) + \frac{1}{1 -\mu }\log \left( \frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l) \in I_l}\eta _{I_m, l}(i_1, \ldots , i_l) \left( \sum \limits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}\right) \exp \right. \\&\times \left. \left( \frac{(\mu -1)\sum \nolimits _{j=1}^{l} \frac{r_{i_j}}{\alpha _{I_m, i_j}}\log \left( nr_{i_j}\right) }{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}}\right) \right) ; \end{aligned}$$
the base of $\exp $ is the same as the base of $\log $. The inequalities in (35) are reversed if the base of $\log $ is between 0 and 1.
(iii)
If $0 \le \lambda < 1$, and the base of $\log $ is greater than 1, then:
$$\begin{aligned} H_{\lambda }(\mathbf{r }) \ge A_{m, m}^{[14]} \ge A_{m, m-1}^{[14]} \ge \cdots \ge A_{m, 2}^{[14]} \le A_{m, 1}^{[14]} = S, \end{aligned}$$
(36)
where
$$\begin{aligned} A_{m, m}^{[14]} = \frac{1}{1 - \lambda } \frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l) \in I_l}\eta _{I_m, l}(i_1, \ldots , i_l) \left( \sum \limits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}\right) \log \left( \frac{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}^{\lambda }}{\alpha _{I_m, i_j}}}{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}}\right) . \end{aligned}$$
(37)
The inequalities in (36) are reversed if the base of $\log $ is between 0 and 1.

Proof

(i) Suppose $\mathbf{q }= \frac{\mathbf{1 }}{\mathbf{n }}$; then from (23), we have:

$$\begin{aligned} D_{\lambda } (\mathbf{r }, \mathbf{q }) = \frac{1}{\lambda - 1} \log \left( \sum _{s=1}^{n}n^{\lambda - 1}r_{s}^{\lambda } \right) = \log (n) + \frac{1}{\lambda - 1}\log \left( \sum _{s=1}^{n}r_{s}^{\lambda } \right) ; \end{aligned}$$

(38)

therefore, we have:

$$\begin{aligned} H_{\lambda }(\mathbf{r }) = \log (n) - D_{\lambda } (\mathbf{r }, \frac{\mathbf{1 }}{\mathbf{n }}). \end{aligned}$$

(39)

Now, using Theorem 4.2 (i) and (39), we get:

$$\begin{aligned} H_{\lambda }(\mathbf{r })= & {} \log (n) - D_{\lambda } \left( \mathbf{r }, \frac{\mathbf{1 }}{\mathbf{n }}\right) \ge \cdots \ge \log (n) - \frac{1}{\mu -1}\log \left( n^{\mu - 1}\frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l) \in I_l}\eta _{I_m, l}(i_1, \ldots , i_l) \right. \nonumber \\&\left. \times \left( \sum \limits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}\right) \left( \frac{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}^{\lambda }}{\alpha _{I_m, i_j}}}{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}}\right) ^{\frac{\mu - 1}{\lambda - 1}}\right) \ge \cdots \ge \log (n) - D_{\mu } (\mathbf{r }, \mathbf{q }) = H_{\mu }(\mathbf{r }); \end{aligned}$$

(40)

(ii) and (iii) can be proved similarly. $\square $

Corollary 4.5

Assume $(H_1)$, and let $\mathbf{r }= (r_1, \ldots , r_n)$ and $\mathbf{q }= (q_1, \ldots , q_n)$ be positive probability distributions.

If either $0 \le \lambda < 1$ and the base of $\log $ is greater than 1, or $1 < \lambda $ and the base of $\log $ is between 0 and 1, then:

$$\begin{aligned}&- \frac{1}{\sum _{s=1}^{n}r_s^{\lambda } } \sum _{s=1}^{n}r_s^{\lambda } \log (r_s) = A_{m, 1}^{[15]} \ge A_{m, 2}^{[15]} \ge \ldots \ge A_{m, m-1}^{[15]} \ge A_{m, m}^{[15]} \ge H_{\lambda }(\mathbf{r }) \ge A_{m, m}^{[16]} \ge A_{m, m-1}^{[16]} \nonumber \\&\quad \ge \ldots A_{m, 2}^{[16]} \ge A_{m, 1}^{[16]} = H\left( \mathbf{r } \right) , \end{aligned}$$

(41)

where

$$\begin{aligned} A_{m, l}^{[15]}= & {} \frac{1}{(\lambda - 1)\sum _{s=1}^{n}r_s^{\lambda }}\frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l) \in I_l}\eta _{I_m, l}(i_1, \ldots , i_l) \left( {\sum \limits _{j=1}^{l}\frac{r_{i_j}^{\lambda }}{\alpha _{I_m, i_j}}}\right) \log \left( n^{\lambda - 1}\frac{\sum \nolimits _{j=1}^{l} \frac{r_{i_j}^{\lambda }}{\alpha _{I_m, i_j}}}{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}}\right) \\&\times \, { \hbox {and}} \\ A_{m, 1}^{[16]}= & {} \frac{1}{1 - \lambda }\frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l) \in I_l}\eta _{I_m, l}(i_1, \ldots , i_l) \left( \sum \limits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}\right) \log \left( \frac{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}^{\lambda }}{\alpha _{I_m, i_j}}}{\sum \nolimits _{j=1}^{l}\frac{r_{i_j}}{\alpha _{I_m, i_j}}}\right) . \end{aligned}$$

The inequalities in (41) are reversed if either $0 \le \lambda < 1$ and the base of $\log $ is between 0 and 1, or $1 < \lambda $ and the base of $\log $ is greater than 1.

Proof

The proof is similar to Corollary 4.4 using Theorem 4.3. $\square $

5 Inequalities using Zipf–Mandelbrot law

The Zipf–Mandelbrot law is defined as follows (see [20]).

Definition 5.1

Zipf–Mandelbrot law is a discrete probability distribution depending on three parameters $N \in \{1, 2, \ldots , \}, q \in [0, \infty )$ and $t > 0$, and is defined by:

$$\begin{aligned} f(s; N, q, t) : = \frac{1}{(s + q)^{t}H_{N, q, t}}, \,\ \,\ \,\ s = 1, \ldots , N, \end{aligned}$$

(42)

where

$$\begin{aligned} H_{N, q, t} = \sum _{j=1}^{N}\frac{1}{(j + q)^{t}}. \end{aligned}$$

(43)

If the total mass of the law is taken over all ${\mathbb {N}}$, then for $q \ge 0$, $t > 1$, $s \in {\mathbb {N}}$, density function of Zipf–Mandelbrot law becomes:

$$\begin{aligned} f(s; q, t) = \frac{1}{(s + q)^{t}H_{q, t}}, \end{aligned}$$

(44)

where

$$\begin{aligned} H_{q, t} = \sum _{j=1}^{\infty } \frac{1}{(j + q)^{t}}. \end{aligned}$$

(45)

For $q = 0$, the Zipf–Mandelbrot law. By Corollary 4.4 (iii), we get:

Conclusion 5.2

Assume $(H_1)$; let $\mathbf{r }$ be a Zipf–Mandelbrot law, by Corollary 4.4 (iii), we get. If $0 \le \lambda < 1$, and the base of $\log $ is greater than 1, then:

$$\begin{aligned} H_{\lambda }(\mathbf{r })= & {} \frac{1}{1 - \lambda }\log \left( \frac{1}{H_{N, q, t}^{\lambda }}\sum _{s=1}^{n}\frac{1}{(s + q)^{\lambda s}} \right) \nonumber \\\ge & {} \ldots \ge \frac{1}{1 - \lambda } \frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l) \in I_l}\eta _{I_m, l}(i_1, \ldots , i_l) \left( \sum \limits _{j=1}^{l}\frac{1}{\alpha _{I_m, i_j}(i_j + q)H_{N. q, t}} \right) \log \nonumber \\&\times \left( \frac{1}{H_{N, q, t}^{\lambda - 1}}\frac{\sum \nolimits _{j=1}^{l}\frac{1}{\alpha _{I_m, i_j}(i_j - q)^{\lambda s}}}{\sum \nolimits _{j=1}^{l}\frac{1}{\alpha _{I_m, i_j}(i_j - q)^{s}}}\right) \nonumber \\\ge & {} \ldots \ge \frac{t}{H_{N, q, t}}\sum _{s=1}^{N} \frac{\log (s + q)}{(s + q)^{t}} + \log (H_{N, q, t}) = S. \end{aligned}$$

(46)

The inequalities in (46) are reversed if the base of $\log $ is between 0 and 1.

Conclusion 5.3

Assume $(H_1)$; let $\mathbf{r }_{1}$ and $\mathbf{r }_2$ be the Zipf–Mandelbort law with parameters $N \in \{1, 2, \ldots \}$, $q_1, q_2 \in [0, \infty )$ and $s_1, s_2 > 0$, respectively. Then, from Corollary 3.4 (ii), we have: if the base of $\log $ is greater than 1, then:

$$\begin{aligned} {\bar{D}}(\mathbf{r }_{1}, \mathbf{r }_{2})= & {} \sum _{s=1}^{n}\frac{1}{(s + q_1)^{t_1} H_{N, q_1, t_1}} \log \left( \frac{(s + q_2)^{t_2}H_{N, q_2, t_2}}{(s + q_1)^{t_1}H_{N, q_2, t_1}} \right) \ge \ldots \ge \frac{(m-1)!}{(l-1)!}\sum \limits _{(i_1, \ldots , i_l)\in I_l}\eta _{I_m, l}(i_1, \ldots , i_l)\nonumber \\&\times \left( \sum \limits _{j=1}^{l}\frac{{\frac{1}{(i_j + q_2)^{t_2}H_{N, q_2, t_2}}}}{\alpha _{I_m, i_j}}\right) \left( \frac{\sum _{j=1}^{l} \frac{\frac{1}{(i_j + q_1)^{t_1}H_{N, q_1, t_1}}}{\alpha _{I_{m}, i_j}}}{\sum _{j=1}^{l}\frac{\frac{1}{(i_j + q_2)^{t_2}H_{N, q_2, t_2}}}{\alpha _{I_{m}, i_j}}} \right) \log \left( \frac{\sum _{j=1}^{l} \frac{\frac{1}{(i_j + q_1)^{t_1}H_{N, q_1, t_1}}}{\alpha _{I_{m}, i_j}}}{\sum _{j=1}^{l}\frac{\frac{1}{(i_j + q_2)^{t_2}H_{N, q_2, t_2}}}{\alpha _{I_{m}, i_j}}}\right) \ge \ldots \ge 0.\nonumber \\ \end{aligned}$$

(47)

The inequalities in (47) are reversed if base of $\log $ is between 0 and 1.

6 Shannon entropy, Zipf–Mandelbrot law, and hybrid Zipf–Mandelbrot law

Here, we maximize the Shannon entropy using the method of Lagrange multiplier under some equations constraints and get the Zipf–Mandelbrot law.

Theorem 6.1

If $J = \{1, 2, \ldots , N \}$, for a given $q \ge 0$, a probability distribution that maximizes the Shannon entropy under the constraints:

$$\begin{aligned} \sum \limits _{s\in J}r_s = 1, \,\ \,\ \,\ \sum \limits _{s\in J} r_s \left( \ln (s + q) \right) := \Psi , \end{aligned}$$

is Zipf–Mandelbrot law.

Proof

If $J = \{1, 2, \ldots , N \}$. We set the Lagrange multipliers $\lambda $ and t and consider the expression:

$$\begin{aligned} {\widetilde{S}} = - \sum _{s=1}^{N}r_s \ln r_s - \lambda \left( \sum _{s=1}^{N}r_s - 1\right) - t\left( \sum _{s=1}^{N}r_{s}\ln (s + q) - \Psi \right) . \end{aligned}$$

Just for the sake of convenience, replace $\lambda $ by $\ln \lambda -1$, and thus, the last expression gives:

$$\begin{aligned} {\widetilde{S}} = - \sum _{s=1}^{N}r_s \ln r_s - \left( \ln \lambda - 1\right) \left( \sum _{s=1}^{N}r_s - 1\right) - t\left( \sum _{s=1}^{N}r_{s}\ln (s + q) - \Psi \right) . \end{aligned}$$

From ${\widetilde{S}}_{r_s} = 0$, for $s =1, 2, \ldots , N$, we get:

$$\begin{aligned} r_s = \frac{1}{\lambda \left( s + q\right) ^t}, \end{aligned}$$

and on using the constraint $\sum _{s = 1}^Nr_s = 1 $, we have:

$$\begin{aligned} \lambda = \sum _{s=1}^{N} \left( \frac{1}{(s+1)^t} \right) , \end{aligned}$$

where $t > 0$, concluding that:

$$\begin{aligned} r_s = \frac{1}{(s + q)^t H_{N, q, t}}, \,\ \,\ \,\ s = 1, 2, \ldots , N. \end{aligned}$$

$\square $

Remark 6.2

Observe that the Zipf–Mandelbrot law and Shannon entropy can be bounded from above (see [21]):

$$\begin{aligned} S = - \sum _{s=1}^{N} f\left( s, N, q, t\right) \ln f(s, N, q, t) \le - \sum _{s=1}^{N}f(s, N, q, t) \ln q_s, \end{aligned}$$

where $\left( q_1, \ldots , q_N\right) $ is a positive N-tuple, such that $\sum _{s=1}^{N}q_s = 1$.

Theorem 6.3

If $J = \{1, \ldots , N\}$, then probability distribution that maximizes Shannon entropy under constraints

$$\begin{aligned} \sum \limits _{s \in J}r_s = 1, \,\ \,\ \,\ \sum \limits _{s \in J} r_s \ln (s + q) := \Psi , \,\ \,\ \,\ \sum \limits _{s \in J}sr_s := \eta \end{aligned}$$

is hybrid Zipf–Mandelbrot law given as:

$$\begin{aligned} r_s = \frac{w^s}{\left( s+q\right) ^k \Phi ^{*}(k, q, w)}, \,\ \,\ \,\ s \in J, \end{aligned}$$

where

$$\begin{aligned} \Phi _{J}(k, q, w) = \sum \limits _{ s \in J} \frac{w^s}{(s + q)^k}. \end{aligned}$$

Proof

First, consider $J = \{1, \ldots , N\}$; we set the Lagrange multiplier and consider the expression:

$$\begin{aligned} {\tilde{S}} = - \sum _{s=1}^{N}r_s \ln r_s + \ln w \left( \sum _{s=1}^{N}sr_s - \eta \right) - \left( \ln \lambda - 1\right) \left( \sum _{s=1}^{N}r_s - 1 \right) - k \left( \sum _{s=1}^{N}r_s \ln (s + q) - \Psi \right) . \end{aligned}$$

On setting ${\tilde{S}}_{r_s} = 0$, for $s= 1, \ldots , N$, we get:

$$\begin{aligned} - \ln r_s + s \ln w - \ln \lambda - k \ln (s+q) = 0; \end{aligned}$$

after solving for $r_s$, we get:

$$\begin{aligned} \lambda = \sum _{s=1}^{N}\frac{w^s}{\left( s+q\right) ^k}; \end{aligned}$$

and we recognize this as the partial sum of Lerch’s transcendent that we will denote with:

$$\begin{aligned} \Phi _{N}^{*}\left( k, q, w\right) = \sum _{s=1}^{N}\frac{w^s}{(s+q)^k} \end{aligned}$$

with $w \ge 0, k > 0$. $\square $

Remark 6.4

Observe that for Zipf–Mandelbrot law, Shannon entropy can be bounded from above (see [21]):

$$\begin{aligned} S = -\sum _{s=1}^{N} f_{h}\left( s, N, q, k\right) \ln f_{h}\left( s, N, q, k\right) \le - \sum _{s=1}^{N}f_{h}\left( s, N, q, k\right) \ln q_s, \end{aligned}$$

where $\left( q_1, \ldots , q_N\right) $ is any positive N-tuple, such that $\sum _{s=1}^{N}q_s = 1$.

Under the assumption of Theorem 2.3 (i), define the non-negative functionals as follows:

$$\begin{aligned} \Theta _{3}(f)= & {} {\mathscr {A}}_{m, r}^{[1]} - f\left( \frac{\sum _{s=1}^{n}r_s}{\sum _{s=1}^{n}q_s}\right) \sum _{s=1}^{n}q_{s}, \,\ \,\ \,\ r=1, \ldots , m, \end{aligned}$$

(48)

$$\begin{aligned} \Theta _{4}(f)= & {} {\mathscr {A}}_{m, r}^{[1]} - {\mathscr {A}}_{m, k}^{[1]}, \,\ \,\ \,\ 1\le r <k \le m. \end{aligned}$$

(49)

Under the assumption of Theorem 2.3 (ii), define the non-negative functionals as follows:

$$\begin{aligned} \Theta _{5}(f)= & {} {\mathscr {A}}_{m, r}^{[2]} - \left( \sum _{s=1}^{n}r_s\right) f\left( \frac{\sum _{s=1}^{n}r_s}{\sum _{s=1}^{n}q_s}\right) , \,\ \,\ \,\ r=1, \ldots , m, \end{aligned}$$

(50)

$$\begin{aligned} \Theta _{6}(f)= & {} {\mathscr {A}}_{m, r}^{[2]} - {\mathscr {A}}_{m, k}^{[2]}, \,\ \,\ \,\ 1\le r <k \le m. \end{aligned}$$

(51)

Under the assumption of Corollary 3.2 (i), define the following non-negative functionals:

$$\begin{aligned} \Theta _{7}(f)= & {} A_{m,r}^{[3]}+\sum _{i=1}^{n}q_{i}\log (q_{i}), \,\ r=1,\ldots , n \end{aligned}$$

(52)

$$\begin{aligned} \Theta _{8}(f)= & {} A_{m,r}^{[3]}-A_{m,k}^{[3]},\,\ 1 \le r <k \le m. \end{aligned}$$

(53)

Under the assumption of Corollary 3.2 (ii), define the following non-negative functionals are given as:

$$\begin{aligned} \Theta _{9}(f)= & {} A_{m,r}^{[4]}- S, \,\ r=1,\ldots ,m \end{aligned}$$

(54)

$$\begin{aligned} \Theta _{10}(f)= & {} A_{m,r}^{[4]}-A_{m,k}^{[4]},\,\ 1 \le r <k \le m. \end{aligned}$$

(55)

Under the assumption of Corollary 3.4 (i), let us define the non-negative functionals as follows:

$$\begin{aligned} \Theta _{11}(f)= & {} A_{m,r}^{[5]}-\sum _{s=1}^{n}r_{s}\log \left( \sum _{s=1}^{n} \log \frac{r_{n}}{\sum _{s=1}^{n}q_{s}}\right) , \,\ r=1,\ldots ,m \end{aligned}$$

(56)

$$\begin{aligned} \Theta _{12}(f)= & {} A_{m,r}^{[5]}-A_{m,k}^{[5]},\,\ 1 \le r <k \le m. \end{aligned}$$

(57)

Under the assumption of Corollary 3.4 (ii), define the non-negative functionals as follows:

$$\begin{aligned} \Theta _{13}(f) = A_{m,r}^{[6]}-A_{m,k}^{[6]},\,\ 1 \le r <k \le m. \end{aligned}$$

(58)

Under the assumption of Theorem 4.2 (i), consider the following functionals:

$$\begin{aligned} \Theta _{14}(f)= & {} A_{m,r}^{[7]}- D_{\lambda }({\mathbf {r}},{\mathbf {q}}), \,\ r=1, \ldots ,m \end{aligned}$$

(59)

$$\begin{aligned} \Theta _{15}(f)= & {} A_{m,r}^{[7]}-A_{m,k}^{[7]},\,\ 1 \le r <k \le m. \end{aligned}$$

(60)

Under the assumption of Theorem 4.2 (ii), consider the following functionals:

$$\begin{aligned} \Theta _{16}(f) = A_{m,r}^{[8]}-D_{1}({\mathbf {r}},{\mathbf {q}}), \,\ r=1, \ldots ,m \end{aligned}$$

(61)

$$\begin{aligned} \Theta _{17}(f)=A_{m,r}^{[8]}-A_{m,k}^{[8]},\,\ 1 \le r <k \le m. \end{aligned}$$

(62)

Under the assumption of Theorem 4.2 (iii), consider the following functionals:

$$\begin{aligned} \Theta _{18}(f) = A_{m,r}^{[9]}-D_{\lambda }({\mathbf {r}},{\mathbf {q}}), \,\ r=1, \ldots ,m \end{aligned}$$

(63)

$$\begin{aligned} \Theta _{19}(f) = A_{m,r}^{[9]}-A_{m,k}^{[9]},\,\ 1 \le r <k \le m. \end{aligned}$$

(64)

Under the assumption of Theorem 4.3, consider the following non-negative functionals:

$$\begin{aligned} \Theta _{20}(f)= & {} D_{\lambda } (\mathbf{r }, \mathbf{q }) - A_{m,r}^{[10]}, \,\ \,\ \,\ r = 1, \ldots , m \end{aligned}$$

(65)

$$\begin{aligned} \Theta _{21}(f)= & {} A_{m,k}^{[10]}-A_{m,r}^{[10]},\,\ 1 \le r <k \le m. \end{aligned}$$

(66)

$$\begin{aligned} \Theta _{22}(f)= & {} A_{m,r}^{[11]} - D_{\lambda } (\mathbf{r }, \mathbf{q }), \,\ r = 1, \ldots , m \end{aligned}$$

(67)

$$\begin{aligned} \Theta _{23}(f)= & {} A_{m,r}^{[11]} - A_{m,r}^{[11]} ,\,\ 1 \le r <k \le m. \end{aligned}$$

(68)

$$\begin{aligned} \Theta _{24}(f)= & {} A_{m,r}^{[11]} - A_{m,k}^{[10]} ,\,\ r = 1, \ldots , m, \,\ k = 1, \ldots , m. \end{aligned}$$

(69)

Under the assumption of Corollary 4.4 (i), consider the following non-negative functionals:

$$\begin{aligned} \Theta _{25}(f)= & {} H_{\lambda }(\mathbf{r }) - A_{m,r}^{[12]}, \,\ r=1, \ldots ,m \end{aligned}$$

(70)

$$\begin{aligned} \Theta _{26}(f)= & {} A_{m,k}^{[12]}-A_{m,r}^{[12]}, \,\ 1 \le r <k \le m. \end{aligned}$$

(71)

Under the assumption of Corollary 4.4 (ii), consider the following functionals:

$$\begin{aligned} \Theta _{27}(f)= & {} S - A_{m,r}^{[13]}, \,\ r=1, \ldots ,m \end{aligned}$$

(72)

$$\begin{aligned} \Theta _{28}(f)= & {} A_{m,k}^{[13]}-A_{m,r}^{[13]}, \,\ 1 \le r <k \le m. \end{aligned}$$

(73)

Under the assumption of Corollary 4.4 (iii), consider the following functionals:

$$\begin{aligned} \Theta _{29}(f)= & {} H_{\lambda }(\mathbf{r }) - A_{m,r}^{[14]}, \,\ r=1, \ldots ,m \end{aligned}$$

(74)

$$\begin{aligned} \Theta _{30}(f)= & {} A_{m,k}^{[14]}-A_{m,r}^{[14]}, \,\ 1 \le r <k \le m. \end{aligned}$$

(75)

Under the assumption of Corollary 4.5, defined the following functionals.

$$\begin{aligned} \Theta _{31}= & {} A_{m,r}^{[15]} - H_{\lambda }(\mathbf{r }), \,\ r = 1, \ldots , m \end{aligned}$$

(76)

$$\begin{aligned} \Theta _{32}= & {} A_{m,r}^{[15]} - A_{m,k}^{[15]}, \,\ 1 \le r <k \le m. \end{aligned}$$

(77)

$$\begin{aligned} \Theta _{33}= & {} H_{\lambda }(\mathbf{r }) - A_{m,r}^{[16]}, \,\ r =1, \ldots , m \end{aligned}$$

(78)

$$\begin{aligned} \Theta _{34}= & {} A_{m,k}^{[16]} - A_{m,r}^{[16]}, \,\ 1 \le r <k \le m. \end{aligned}$$

(79)

$$\begin{aligned} \Theta _{35}= & {} A_{m,r}^{[15]} - A_{m,k}^{[16]} , \,\ r = 1, \ldots , m, \,\ k = 1, \ldots , m. \end{aligned}$$

(80)

7 Generalization of refinement of Jensen-, Rényi-, and Shannon-type inequalities via Lidstone polynomial

We construct some new identities with the help of generalized Lidstone polynomial (6).

Theorem 7.1

Assume $(H_1)$; let $f: [\alpha _1, \alpha _2] \rightarrow {\mathbb {R}}$ be a function, where $[\alpha _1, \alpha _2] \subset {\mathbb {R}}$, such that be an interval, such that $f\in C^{2m}[\alpha _1, \alpha _2]$ for $m \ge 1$. Also let $x_1, \ldots , x_n \in [\alpha _1, \alpha _2]$ and $p_1, \ldots , p_n$ be positive real numbers, such that $\sum \nolimits _{i=1}^{n}p_i=1$, and ${\mathfrak {F}}_m(t)$ are the same as defined in (5), and then:

$$\begin{aligned} \Theta _{i}(f)= & {} \sum _{k=1}^{m-1}(\alpha _2 - \alpha _1)^{2k}f^{(2k)}(\alpha _1)\Theta _{i}\left( {\mathfrak {F}}_l\left( \frac{\alpha _2 -x}{\alpha _2 - \alpha _1}\right) \right) \nonumber \\&+ \sum _{k=1}^{m-1}(\alpha _2 - \alpha _1)^{2k}f^{(2k)}(\alpha _2) \Theta _{i}\left( {\mathfrak {F}}_l\left( \frac{x - \alpha _1}{\alpha _2 - \alpha _1}\right) \right) \nonumber \\&+ (\alpha _2 -\alpha _1)^{2k-1}\int _{\alpha _1}^{\alpha _2} \Theta _{i}\left( G_{m}\left( \frac{x - \alpha _1}{\alpha _2 - \alpha _1}, \frac{t - \alpha _1}{\alpha _2 - \alpha _1}\right) \right) f^{(2m)}(t)\mathrm{d}t, \,\ \,\ i = 1, 2, \ldots , 35. \end{aligned}$$

(81)

Proof

Using (6) in place of f in $\Theta _{i}(f),$ $i = 1, 2, \ldots , 35$, we get (81). $\square $

Theorem 7.2

Assume $(H_1)$; let $f: [\alpha _1, \alpha _2] \rightarrow {\mathbb {R}}$ be a function, where $[\alpha _1, \alpha _2] \subset {\mathbb {R}}$, such that be an interval, such that $f \in C^{2m}[\alpha _1, \alpha _2]$ for $ m \ge 1$. Also let $x_1, \ldots , x_n \in [\alpha _1, \alpha _2]$ and $p_1, \ldots , p_n$ be positive real numbers, such that $\sum \nolimits _{i=1}^{n}p_i=1$, and ${\mathfrak {F}}_m(t)$ are the same as defined in (5); let for $m \ge 1$:

$$\begin{aligned} \Theta _{i}\left( G_{m}\left( \frac{x - \alpha _1}{\alpha _2 - \alpha _1}, \frac{t - \alpha _1}{\alpha _2 - \alpha _1}\right) \right) \ge 0, \,\ for all \,\ t \in [\alpha _1, \alpha _2]. \end{aligned}$$

(82)

If f is 2m-convex function, then we have:

$$\begin{aligned} \Theta _{i}(f)\ge & {} \sum _{k=1}^{m-1}(\alpha _2 - \alpha _1)^{2k}f^{(2k)}(\alpha _1)\Theta _{i}\left( {\mathfrak {F}}_{l}\left( \frac{\alpha _2 - x}{\alpha _2 - \alpha _1}\right) \right) \nonumber \\&+ \sum _{k=1}^{m-1}(\alpha _2 - \alpha _1)^{2k}f^{(2k)}(\alpha _2)\Theta _{i}\left( {\mathfrak {F}}_{l}\left( \frac{x - \alpha _1}{\alpha _2 - \alpha _1}\right) \right) , \,\ \,\ i = 1, 2, \ldots , 35. \end{aligned}$$

(83)

Proof

Since f is 2m-convex; therefore, $f^{(2m)} \ge 0$ for all $x \in [\alpha _1, \alpha _2]$, then using (82) in (81), we get the required result. $\square $

Theorem 7.3

Assume $(H_1)$; let $f: [\alpha _1, \alpha _2] \rightarrow {\mathbb {R}}$ be a function, where $[\alpha _1, \alpha _2] \subset {\mathbb {R}}$ be an interval. Also let $x_1, \ldots , x_n \in [\alpha _1, \alpha _2]$ and $p_1, \ldots , p_n$ be positive real numbers, such that $\sum \nolimits _{i=1}^{n}p_i=1$, and also suppose that $f:[\alpha _1, \alpha _2] \rightarrow {\mathbb {R}}$ is 2m-convex. Then, the following results are valid.

(i)
If m is odd integer, then for every 2m-convex function, (83) holds. (ii) Suppose that (83) holds. If the function
$$\begin{aligned} \lambda (u) = \sum _{l=0}^{m-1}(\alpha _2 - \alpha _1)^{2l}g^{(2l)}(\alpha _1){\mathfrak {F}}_{l}\left( \frac{\alpha _2 - u}{\alpha _2 - \alpha _1}\right) + \sum _{l=0}^{m-1}(\alpha _2 - \alpha _1)^{2l}g^{(2l)}(\alpha _2){\mathfrak {F}}_{l}\left( \frac{u - \alpha _1}{\alpha _2 - \alpha _1}\right) \end{aligned}$$
is convex, then the right-hand side of (83) is non-negative and we have:
$$\begin{aligned} \Theta _{i}(f) \ge 0, \,\ \,\ i = 1, 2, \ldots , 35. \end{aligned}$$
(84)

Proof

(i) Note that $G_{1}(u, s) \le 0$ for $1 \le u, s, \le 1$ and also note that $G_{m}(u, s) \le 0$ for odd integer m and $G_{m}(u, s) \ge 0$ for even integer m. As $G_1$ is convex function and $G_{m-1}$ is positive for odd integer m, therefore:

$$\begin{aligned} \frac{\mathrm{d}^2}{\mathrm{d}^{2}u}\left( G_{m}(u,s)\right) = \int _{0}^{1}\frac{\mathrm{d}^2}{\mathrm{d}^2u}G_{1}(u,p)G_{m-1}(p,s)\mathrm{d}p \ge 0, \,\ m\ge 2. \end{aligned}$$

This shows that $G_m$ is convex in the first variable u if m is convex. Similarly, $G_m$ is concave in the first variable if m is even. Hence, if m is odd, then:

$$\begin{aligned} \Theta _{i}\left( G_{m}\left( \frac{x - \alpha _1}{\alpha _2 - \alpha _1}, \frac{t - \alpha _1}{\alpha _2 - \alpha _1}\right) \right) \ge 0; \end{aligned}$$

therefore, (84) is valid.

(ii) Using the linearity of $\Theta _{i}(f)$, we can write the right-hand side of (83) in the form $\Theta _{i}(\lambda )$. As $\lambda $ is supposed to be convex, therefore, the right-hand side of (83) is non-negative, and so $\Theta _{i}(f) \ge 0$. $\square $

Remark A

We can investigate the bounds for the identities related to the generalization of refinement of Jensen inequality using inequalities for the C̆ebys̆ev functional and some results relating to the Gr̈uss and Ostrowski-type inequalities can be constructed as given in Section 3 of [5]. Also we can construct the non-negative functionals from inequality (83) and give related mean value theorems and we can construct the new families of m-exponentially convex functions and Cauchy means related to these functionals as given in Section 4 of [5].

References

Anderson, G.; Ge, Y.: The size distribution of Chinese cities. Reg. Sci. Urban Econ. 35(6), 756–776 (2005)
Article Google Scholar
Auerbach, F.: Das Gesetz der Bevölkerungskonzentration. Petermanns Geogr. Mitt. 59, 74–76 (1913)
Google Scholar
Black, D.; Henderson, V.: Urban evolution in the USA. J. Econ. Geogr. 3(4), 343–372 (2003)
Article Google Scholar
Bosker, M.; Brakman, S.; Garrestsen, H.; Schramm, M.: A century of shocks: the evolution of the German city size distribution 1925–1999. Reg. Sci. Urban Econ. 38(4), 330–347 (2008)
Article Google Scholar
Butt, S.I.; Khan, K.A.; Pečarić, J.: Generaliztion of Popoviciu inequality for higher order convex function via Tayor’s polynomial. Acta Univ. Apulensis Math. Inform 42, 181–200 (2015)
MATH Google Scholar
Butt, S.I.; Mehmood, N.; Pečarić, J.: New generalizations of Popoviciu type inequalities via new green functions and Fink’s identity. Trans. A Razmadze Math. Inst. 171(3), 293–303 (2017)
Article MathSciNet Google Scholar
Butt, S.I.; Pečarić, J.: Popoviciu’s Inequality For $N$-Convex Functions. Lap Lambert Academic Publishing, Saarbrücken (2016)
Google Scholar
Butt, S.I.; Pečarić, J.: Weighted Popoviciu type inequalities via generalized Montgomery identities. Hrvatske akademije znanosti i umjetnosti: Matematicke znanosti 69–89 (2015)
Butt, S.I.; Khan, K.A.; Pečarić, J.: Popoviciu type inequalities via Hermite’s polynomial. Math. Inequal. Appl. 19(4), 1309–1318 (2016)
MathSciNet MATH Google Scholar
Csiszár, I.: Information measures: a critical survey. In: Tans. 7th Prague Conf. on Info. Th., Statist. Decis. Funct., Random Process and 8th European Meeting of Statist., vol. B, pp. 73–86. Academia Prague (1978)
Csiszár, I.: Information-type measures of difference of probability distributions and indirect observations. Stud. Sci. Math. Hung. 2, 299–318 (1967)
MathSciNet MATH Google Scholar
Horváth, L.: A method to refine the discrete Jensen’s inequality for convex and mid-convex functions. Math. Comput. Model. 54(9–10), 2451–2459 (2011)
Article MathSciNet Google Scholar
Horváth, L.; Khan, K.A.; Pečarić, J.: J. Combinatorial Improvements of Jensens Inequality/Classical and New Refinements of Jensens Inequality with Applications, Monographs in Inequalities 8, Element, Zagreb (2014)
Horváth, L.; Khan, K.A.; Pečarić, J.: Refinement of Jensen’s inequality for operator convex functions. Adv. Inequal. Appl. (2014)
Horváth, L.; Pečarić, J.: A refinement of discrete Jensen’s inequality. Math. Inequal. Appl. 14, 777–791 (2011)
MathSciNet MATH Google Scholar
Horváth, L.; Pečarić, Đ.; Pečarić, J.: Estimations of f-and Rényi divergences by using a cyclic refinement of the Jensens inequality. Bull. Malays. Math. Sci. Soc. 1–14 (2017)
Ioannides, Y.M.; Overman, H.G.: Zipf’s law for cities: an empirical examination. Reg. Sci. Urban Econ. 33(2), 127–137 (2003)
Article Google Scholar
Kullback, S.: Information Theory and Statistics. Courier Corporation, Chelmsford (1997)
MATH Google Scholar
Kullback, S.; Leibler, R.A.: On information and sufficiency. Ann. Math. Stat. 22(1), 79–86 (1951)
Article MathSciNet Google Scholar
Lovričević, N.; Pečarić, Đ.; Pečarić, J.: Zipf–Mandelbrot law, f-divergences and the Jensen-type interpolating inequalities. J. Inequal. Appl. 2018(1), 36 (2018)
Matic, M.; Pearce, C.E.; Pečarić, J.: Shannon’s and related inequalities in information theory. In: Survey on Classical Inequalities, pp. 127–164. Springer, Dordrecht (2000)
Pečarić, J.; Proschan, F.; Tong, Y.L.: Convex Functions, Partial Orderings and Statistical Applications. Academic Press, New York (1992)
MATH Google Scholar
Rényi, A.: On measure of information and entropy. In: Proceeding of the Fourth Berkely Symposium on Mathematics, Statistics and Probability, pp. 547–561 (1960)
Rosen, K.T.; Resnick, M.: The size distribution of cities: an examination of the Pareto law and primacy. J. Urban Econ. 8(2), 165–186 (1980)
Article Google Scholar
Soo, K.T.: Zipf’s Law for cities: a cross-country investigation. Reg. Sci. Urban Econ. 35(3), 239–263 (2005)
Article Google Scholar
Widder, D.V.: Completely convex function and Lidstone series. Trans. Am. Math. Soc. 51(1942), 387–398 (1942)
Article MathSciNet Google Scholar
Zipf, G.K.: Human behaviour and the principle of least-effort. In: Cambridge MA edn. Addison-Wesley, Reading (1949)

Download references

Acknowledgements

The authors wish to thank the anonymous referees for their very careful reading of the manuscript and fruitful comments and suggestions.

Author information

Authors and Affiliations

Department of Mathematics, University of Sargodha, Sargodha, 40100, Pakistan
Khuram Ali Khan & Tasadduq Niaz
Department of Mathematics, The University of Lahore, Sargodha-Campus, Sargodha, 40100, Pakistan
Tasadduq Niaz
Catholic University of Croatia, Ilica, 242, Zagreb, Croatia
Đilda Pečarić
RUDN University, Miklukho-Maklaya str. 6, 117198, Moscow, Russia
Josip Pečarić

Authors

Khuram Ali Khan
View author publications
You can also search for this author in PubMed Google Scholar
Tasadduq Niaz
View author publications
You can also search for this author in PubMed Google Scholar
Đilda Pečarić
View author publications
You can also search for this author in PubMed Google Scholar
Josip Pečarić
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Tasadduq Niaz.

Ethics declarations

Author contributions

All authors jointly worked on the results and they read and approved the final manuscript.

Conflict of interest

The authors declares that there is no conflict of interests regarding the publication of this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The research of fourth author was supported by the Ministry of Education and Science of the Russian Federation (the Agreement number No. 02.a03.21.0008).

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Khan, K.A., Niaz, T., Pečarić, Đ. et al. Estimation of different entropies via Lidstone polynomial using Jensen-type functionals. Arab. J. Math. 9, 613–631 (2020). https://doi.org/10.1007/s40065-020-00277-y

Download citation

Received: 05 November 2018
Accepted: 01 February 2020
Published: 18 February 2020
Issue Date: December 2020
DOI: https://doi.org/10.1007/s40065-020-00277-y

Mathematics Subject Classification

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Estimation of different entropies via Lidstone polynomial using Jensen-type functionals

Abstract

Similar content being viewed by others

Estimation of different entropies via Hermite interpolating polynomial using Jensen type functionals

Estimation of f-divergence and Shannon entropy by using Levinson type inequalities for higher order convex functions via Hermite interpolating polynomial

Estimation of f-divergence and Shannon entropy by Levinson type inequalities via new Green’s functions and Lidstone polynomial

1 Introduction and preliminary results

Theorem 1.1

1.1 Lidstone polynomial

Lemma A

2 Inequalities for Csiszár divergence

Definition 2.1

Definition 2.2

Theorem 2.3

Proof

3 Inequalities for Shannon entropy

Definition 3.1

Corollary 3.2

Proof

Definition 3.3

Corollary 3.4

Proof

4 Inequalities for Rényi divergence and entropy

Definition 4.1

Theorem 4.2

Proof

Theorem 4.3

Proof

Corollary 4.4

Proof

Corollary 4.5

Proof

5 Inequalities using Zipf–Mandelbrot law

Definition 5.1

Conclusion 5.2

Conclusion 5.3

6 Shannon entropy, Zipf–Mandelbrot law, and hybrid Zipf–Mandelbrot law

Theorem 6.1

Proof

Remark 6.2

Theorem 6.3

Proof

Remark 6.4

7 Generalization of refinement of Jensen-, Rényi-, and Shannon-type inequalities via Lidstone polynomial

Theorem 7.1

Proof

Theorem 7.2

Proof

Theorem 7.3

Proof

Remark A

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Author contributions

Conflict of interest

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Share this article

Mathematics Subject Classification

Search

Navigation