1 Introduction

The celebrated Jensen inequality states the following. Let I be an interval in \(\mathbb{R}\) and let \(g,p:[a,b]\rightarrow \mathbb{R}\) be integrable functions such that \(g(\varrho )\in I\) and \(p(\varrho )>0\) for all \(\varrho \in [a,b]\). If \(\psi:I\rightarrow \mathbb{R}\) is a convex function and \((\psi \circ g)p\) is integrable on \([a,b]\), then

$$ \psi \biggl( \frac{\int _{a}^{b}g(\varrho )p(\varrho )\,d\varrho }{\int _{a}^{b}p(\varrho )\,d\varrho } \biggr) \leq \frac{\int _{a}^{b}p(\varrho )(\psi \circ g)(\varrho )\,d\varrho }{\int _{a}^{b}p(\varrho )\,d\varrho }. $$
(1)

Jensen’s inequality is one of the fundamental inequalities in mathematics, and it underlies many statistical concepts and proofs. Important applications include the derivation of the AM-GM inequality, estimates for the Zipf–Mandelbrot and Shannon entropies, the convergence of the expectation maximization algorithm, and the positivity of the Kullback–Leibler divergence [17]. The inequality has also been used to solve problems in many areas of science and technology, e.g. physics, engineering, financial economics and computer science.

Several important classical inequalities can be deduced from (1), for example the Hölder, Levinson, Ky Fan and Young inequalities. Because of its importance, many researchers have derived improvements, refinements and extensions of the Jensen inequality. It has also been established for generalized convex functions such as s-convex, preinvex, h-convex and η-convex functions. For some recent results concerning the Jensen inequality see [1–3, 5, 8–20].

In this article we first establish a refinement of the Jensen inequality associated with two functions whose sum is equal to unity. Using this refinement, we derive refinements of the Hölder, power mean, quasi-arithmetic mean and Hermite–Hadamard inequalities. We then deduce bounds for the Csiszár divergence, the Kullback–Leibler divergence, the Shannon entropy, the variational distance and related quantities. Finally, we present a more general refinement of the Jensen inequality concerning n functions whose sum is equal to unity.

2 Main results

We begin by deriving a new refinement of the Jensen inequality associated with two functions whose sum is equal to unity.

Theorem 1

Let \(\psi: I \rightarrow \mathbb{R}\) be a convex function defined on the interval I. Let \(p,u,v,g:[a,b]\rightarrow \mathbb{R}\) be integrable functions such that \(g(\varrho )\in I, u(\varrho ), v(\varrho ), p(\varrho )\in \mathbb{R}^{+}\) for all \(\varrho \in [a,b]\) and \(v(\varrho )+u(\varrho )=1\), \(P=\int _{a}^{b}p(\varrho )\,d\varrho \). Then

$$\begin{aligned} & \frac{1}{P} \int _{a}^{b}p(\varrho )\psi \bigl(g(\varrho )\bigr) \,d\varrho \\ &\quad\geq \frac{1}{P} \int _{a}^{b}u(\varrho )p(\varrho )\,d\varrho \psi \biggl( \frac{\int _{a}^{b}p(\varrho )u(\varrho )g(\varrho )\,d\varrho }{\int _{a}^{b}p(\varrho )u(\varrho )\,d\varrho } \biggr) \\ &\qquad{}+ \frac{1}{P} \int _{a}^{b}p(\varrho )v(\varrho )\,d\varrho \psi \biggl( \frac{\int _{a}^{b}p(\varrho )v(\varrho )g(\varrho )\,d\varrho }{\int _{a}^{b}p(\varrho )v(\varrho )\,d\varrho } \biggr) \\ &\quad \geq \psi \biggl(\frac{1}{P} \int _{a}^{b}p(\varrho )g( \varrho )\,d\varrho \biggr). \end{aligned}$$
(2)

If the function ψ is concave then the reverse inequalities hold in (2).

Proof

Since \(u(\varrho )+v(\varrho )=1\), we have

$$ \int _{a}^{b}p(\varrho )\psi \bigl(g(\varrho )\bigr) \,d\varrho = \int _{a}^{b}u( \varrho )p(\varrho )\psi \bigl(g( \varrho )\bigr)\,d\varrho + \int _{a}^{b}v( \varrho )p(\varrho )\psi \bigl(g( \varrho )\bigr)\,d\varrho. $$
(3)

Applying the integral Jensen inequality to both terms on the right-hand side of (3), we obtain

$$\begin{aligned} &\frac{1}{P} \int _{a}^{b}p(\varrho )\psi \bigl(g(\varrho )\bigr) \,d\varrho \\ &\quad\geq \frac{1}{P} \int _{a}^{b}u(\varrho )p(\varrho )\,d\varrho \psi \biggl( \frac{\int _{a}^{b}u(\varrho )p(\varrho )g(\varrho )\,d\varrho }{\int _{a}^{b}u(\varrho ) p(\varrho )\,d\varrho } \biggr) \\ &\qquad{}+ \frac{1}{P} \int _{a}^{b}v(\varrho )p(\varrho )\,d\varrho \psi \biggl( \frac{\int _{a}^{b}v(\varrho )p(\varrho )g(\varrho )\,d\varrho }{\int _{a}^{b}v(\varrho )p(\varrho )\,d\varrho } \biggr) \\ & \quad\geq \psi \biggl(\frac{1}{P} \int _{a}^{b}u(\varrho )p(\varrho )g( \varrho )\,d \varrho + \frac{1}{P} \int _{a}^{b}v(\varrho )p(\varrho )g( \varrho )\,d \varrho \biggr) \\ &\qquad\text{(by the convexity of $\psi $)} \\ &\quad=\psi \biggl(\frac{1}{P} \int _{a}^{b}p(\varrho )g(\varrho )\,d \varrho \biggr). \end{aligned}$$
(4)

 □
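The chain (2) can be checked numerically. The following is a minimal sketch, not part of the original argument: the helper names `integrate` and `jensen_refinement`, the midpoint quadrature, and the particular choices of \(\psi \), g, p and u are all illustrative assumptions.

```python
import math

def integrate(f, a, b, n=2000):
    """Midpoint rule for the integral of f over [a, b] (illustrative helper)."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def jensen_refinement(psi, g, p, u, a, b):
    """Return the three sides of (2) as (left, middle, right); v is taken as 1 - u."""
    v = lambda x: 1.0 - u(x)
    P = integrate(p, a, b)
    left = integrate(lambda x: p(x) * psi(g(x)), a, b) / P
    middle = 0.0
    for w in (u, v):
        mass = integrate(lambda x: w(x) * p(x), a, b)              # integral of w*p
        mean = integrate(lambda x: w(x) * p(x) * g(x), a, b) / mass
        middle += mass / P * psi(mean)
    right = psi(integrate(lambda x: p(x) * g(x), a, b) / P)
    return left, middle, right

a, b = 0.0, 2.0
left, middle, right = jensen_refinement(
    psi=math.exp,                     # a convex function (assumed example)
    g=lambda x: x * x,                # assumed example
    p=lambda x: 1.0 + x,              # positive weight (assumed example)
    u=lambda x: (b - x) / (b - a),    # so that v(x) = (x - a) / (b - a)
    a=a, b=b,
)
print(left, middle, right)
```

Because the midpoint rule uses equal positive weights, the discretized quantities satisfy the discrete analogue of (2), so the ordering left ≥ middle ≥ right is expected to hold up to rounding.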

As a consequence of the above theorem we deduce the following refinement of the Hölder inequality.

Corollary 1

Let \(r_{1},r_{2}>1\) be such that \(\frac{1}{r_{1}}+\frac{1}{r_{2}}=1\). If \(u,v,\tau,g_{1}\) and \(g_{2}\) are non-negative functions defined on \([a,b]\) such that \(\tau g_{1}^{r_{1}},\tau g_{2}^{r_{2}},u\tau g_{2}^{r_{2}}, v\tau g_{2}^{r_{2}}, u\tau g_{1}g_{2},v\tau g_{1}g_{2},\tau g_{1}g_{2}\in L^{1}([a,b])\) and \(u(\varrho )+v(\varrho )=1\) for all \(\varrho \in [a,b]\), then

$$\begin{aligned} & \biggl( \int _{a}^{b}\tau (\varrho )g_{1}^{r_{1}}( \varrho )\,d\varrho \biggr)^{\frac{1}{r_{1}}} \biggl( \int _{a}^{b}\tau (\varrho )g^{r_{2}}_{2}( \varrho )\,d\varrho \biggr)^{\frac{1}{r_{2}}} \\ &\quad\geq \biggl( \int _{a}^{b}\tau (\varrho )g^{r_{2}}_{2}( \varrho )\,d \varrho \biggr)^{\frac{1}{r_{2}}} \\ &\qquad{}\times \biggl\{ \biggl( \int _{a}^{b}u( \varrho )\tau (\varrho )g^{r_{2}}_{2}(\varrho )\,d\varrho \biggr)^{1-r_{1}} \biggl( \int _{a}^{b}u(\varrho )\tau (\varrho )g_{1}(\varrho )g_{2}( \varrho )\,d\varrho \biggr)^{r_{1}} \\ &\qquad{}+ \biggl( \int _{a}^{b}v(\varrho )\tau (\varrho )g^{r_{2}}_{2}( \varrho )\,d\varrho \biggr)^{1-r_{1}} \biggl( \int _{a}^{b}v(\varrho ) \tau (\varrho )g_{1}(\varrho )g_{2}(\varrho )\,d\varrho \biggr)^{r_{1}} \biggr\} ^{\frac{1}{r_{1}}} \\ &\quad \geq \int _{a}^{b}\tau (\varrho )g_{1}(\varrho )g_{2}(\varrho )\,d \varrho. \end{aligned}$$
(5)

In the case when \(0< r_{1}<1\) and \(r_{2}=\frac{r_{1}}{r_{1}-1}\) with \(\int _{a}^{b}\tau (\varrho )g^{r_{2}}_{2}(\varrho )\,d\varrho >0\), or \(r_{1}<0\) and \(\int _{a}^{b}\tau (\varrho )g^{r_{1}}_{1}(\varrho )\,d\varrho >0\), we have

$$\begin{aligned} & \int _{a}^{b}\tau (\varrho )g_{1}(\varrho )g_{2}(\varrho )\,d\varrho \\ &\quad\geq \biggl( \int _{a}^{b}u(\varrho )\tau (\varrho )g^{r_{2}}_{2}( \varrho )\,d\varrho \biggr)^{\frac{1}{{r_{2}}}} \biggl( \int _{a}^{b}u( \varrho )\tau (\varrho )g^{r_{1}}_{1}(\varrho )\,d\varrho \biggr)^{ \frac{1}{r_{1}}} \\ &\qquad{}+ \biggl( \int _{a}^{b}v(\varrho )\tau (\varrho )g^{r_{2}}_{2}( \varrho )\,d\varrho \biggr)^{\frac{1}{{r_{2}}}} \biggl( \int _{a}^{b}v( \varrho )\tau (\varrho )g^{r_{1}}_{1}(\varrho )\,d\varrho \biggr)^{ \frac{1}{r_{1}}} \\ &\quad \geq \biggl( \int _{a}^{b}\tau (\varrho )g^{r_{1}}_{1}( \varrho )\,d \varrho \biggr)^{\frac{1}{r_{1}}} \biggl( \int _{a}^{b}\tau (\varrho )g^{r_{2}}_{2}( \varrho )\,d\varrho \biggr)^{\frac{1}{r_{2}}}. \end{aligned}$$
(6)

Proof

If \(\int _{a}^{b}\tau (\varrho )g^{r_{2}}_{2}(\varrho )\,d\varrho >0\), then by using Theorem 1 for \(\psi (\varrho )=\varrho ^{r_{1}}, \varrho >0, r_{1}>1\), \(p(\varrho )=\tau (\varrho )g_{2}^{r_{2}}(\varrho ), g(\varrho )=g_{1}( \varrho )g_{2}^{\frac{-r_{2}}{r_{1}}}(\varrho )\), we obtain (5). If \(\int _{a}^{b}\tau (\varrho )g^{r_{1}}_{1}(\varrho )\,d\varrho >0\), then applying the same procedure but taking \(r_{1}, r_{2}, g_{1},g_{2}\) instead of \(r_{2}, r_{1}, g_{2},g_{1}\), we obtain (5).

Now suppose that \(\int _{a}^{b}\tau (\varrho )g^{r_{2}}_{2}(\varrho )\,d\varrho =0\) and \(\int _{a}^{b}\tau (\varrho )g^{r_{1}}_{1}(\varrho )\,d\varrho =0\). We know that

$$ 0\leq \tau (\varrho )g_{1}(\varrho )g_{2}(\varrho )\leq \frac{1}{r_{1}}\tau (\varrho )g_{1}^{r_{1}}(\varrho )+ \frac{1}{r_{2}}\tau (\varrho )g_{2}^{r_{2}}(\varrho ). $$
(7)

Therefore, integrating and using the given conditions, we have \(\int _{a}^{b}\tau (\varrho )g_{1}(\varrho )g_{2}(\varrho )\,d\varrho =0\).

This completes the proof for the case \(r_{1}>1\).

In the case when \(0< r_{1}<1\), we have \(M=\frac{1}{r_{1}}>1\). Applying (5) with M and \(N=(1-r_{1})^{-1}, \overline{g}_{1}= (g_{1}g_{2})^{r_{1}}, \overline{g}_{2}=g^{-r_{1}}_{2}\) instead of \(r_{1},r_{2},g_{1},g_{2}\), we get (6).

Finally, if \(r_{1}<0\) then \(0< r_{2}<1\) and we may apply similar arguments with \(r_{1}, r_{2}, g_{1},g_{2}\) replaced by \(r_{2}, r_{1}, g_{2},g_{1} \) provided that \(\int _{a}^{b}\tau (\varrho )g^{r_{1}}_{1}(\varrho )\,d\varrho >0\). □
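The substitution used for (5) can be verified directly; the following is a sketch of that computation, using only the choices stated in the proof. With \(p=\tau g_{2}^{r_{2}}\) and \(g=g_{1}g_{2}^{-\frac{r_{2}}{r_{1}}}\), and since \(r_{2}-\frac{r_{2}}{r_{1}}=r_{2} (1-\frac{1}{r_{1}} )=1\), we have

$$ pg=\tau g_{1}g_{2}^{r_{2}-\frac{r_{2}}{r_{1}}}=\tau g_{1}g_{2}, \qquad pg^{r_{1}}=\tau g_{1}^{r_{1}}g_{2}^{r_{2}-r_{2}}=\tau g_{1}^{r_{1}}. $$

Hence, with \(P=\int _{a}^{b}\tau (\varrho )g_{2}^{r_{2}}(\varrho )\,d\varrho \) and \(\psi (\varrho )=\varrho ^{r_{1}}\), inequality (2) reads

$$ \frac{1}{P} \int _{a}^{b}\tau g_{1}^{r_{1}}\,d\varrho \geq \frac{1}{P}\sum_{w\in \{u,v\}} \biggl( \int _{a}^{b}w\tau g_{2}^{r_{2}}\,d\varrho \biggr)^{1-r_{1}} \biggl( \int _{a}^{b}w\tau g_{1}g_{2}\,d\varrho \biggr)^{r_{1}} \geq \biggl(\frac{1}{P} \int _{a}^{b}\tau g_{1}g_{2}\,d\varrho \biggr)^{r_{1}}. $$

Multiplying by P, raising each side to the power \(\frac{1}{r_{1}}\) and then multiplying by \(P^{\frac{1}{r_{2}}}\) gives (5), because \(\frac{1-r_{1}}{r_{1}}=-\frac{1}{r_{2}}\).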

Another refinement of the Hölder inequality is presented in the following corollary.

Corollary 2

Let \(r_{1}>1, r_{2}=\frac{r_{1}}{r_{1}-1}\). If \(u,v,\tau,g_{1}\) and \(g_{2}\) are non-negative functions defined on \([a,b]\) such that \(\tau g_{1}^{r_{1}},\tau g_{2}^{r_{2}},u\tau g_{2}^{r_{2}}, v\tau g_{2}^{r_{2}}, \tau g_{1}g_{2}\in L^{1}([a,b])\) and \(u(\varrho )+v(\varrho )=1\) for all \(\varrho \in [a,b]\), and if \(\int _{a}^{b}\tau (\varrho )g^{r_{2}}_{2}(\varrho )\,d\varrho >0\), then

$$\begin{aligned} & \biggl( \int _{a}^{b}\tau (\varrho )g^{r_{1}}_{1}( \varrho )\,d\varrho \biggr)^{\frac{1}{r_{1}}} \biggl( \int _{a}^{b}\tau (\varrho )g^{r_{2}}_{2}( \varrho )\,d\varrho \biggr)^{\frac{1}{r_{2}}} \\ &\quad\geq \biggl( \int _{a}^{b}u(\varrho )\tau (\varrho )g^{r_{1}}_{1}( \varrho )\,d\varrho \biggr)^{\frac{1}{r_{1}}} \biggl( \int _{a}^{b}u( \varrho )\tau (\varrho )g^{r_{2}}_{2}(\varrho )\,d\varrho \biggr)^{ \frac{1}{r_{2}}} \\ &\qquad{}+ \biggl( \int _{a}^{b}v(\varrho )\tau (\varrho )g^{r_{1}}_{1}( \varrho )\,d\varrho \biggr)^{\frac{1}{r_{1}}} \biggl( \int _{a}^{b}v( \varrho )\tau (\varrho )g^{r_{2}}_{2}(\varrho )\,d\varrho \biggr)^{ \frac{1}{r_{2}}} \\ &\quad \geq \int _{a}^{b}\tau (\varrho )g_{1}(\varrho )g_{2}(\varrho )\,d \varrho. \end{aligned}$$
(8)

In the case when \(0< r_{1}<1\) and \(r_{2}=\frac{r_{1}}{r_{1}-1}\) with \(\int _{a}^{b}\tau (\varrho )g^{r_{2}}_{2}(\varrho )\,d\varrho >0\), or \(r_{1}<0\) and \(\int _{a}^{b}\tau (\varrho )g^{r_{1}}_{1}(\varrho )\,d\varrho >0\), we have

$$\begin{aligned} & \biggl( \int _{a}^{b}\tau (\varrho )g_{1}^{r_{1}}( \varrho )\,d\varrho \biggr)^{\frac{1}{r_{1}}} \biggl( \int _{a}^{b}\tau (\varrho )g^{r_{2}}_{2}( \varrho )\,d\varrho \biggr)^{\frac{1}{r_{2}}} \\ &\quad\leq \biggl( \int _{a}^{b}\tau (\varrho )g^{r_{2}}_{2}( \varrho )\,d \varrho \biggr)^{\frac{1}{r_{2}}} \\ &\qquad{}\times \biggl\{ \biggl( \int _{a}^{b}u( \varrho )\tau (\varrho )g^{r_{2}}_{2}(\varrho )\,d\varrho \biggr)^{1-r_{1}} \biggl( \int _{a}^{b}u(\varrho )\tau (\varrho )g_{1}(\varrho )g_{2}( \varrho )\,d\varrho \biggr)^{r_{1}} \\ &\qquad{}+ \biggl( \int _{a}^{b}v(\varrho )\tau (\varrho )g^{r_{2}}_{2}( \varrho )\,d\varrho \biggr)^{1-r_{1}} \biggl( \int _{a}^{b}v(\varrho ) \tau (\varrho )g_{1}(\varrho )g_{2}(\varrho )\,d\varrho \biggr)^{r_{1}} \biggr\} ^{\frac{1}{r_{1}}} \\ &\quad \leq \int _{a}^{b}\tau (\varrho )g_{1}(\varrho )g_{2}(\varrho )\,d \varrho. \end{aligned}$$
(9)

Proof

Assume that \(\int _{a}^{b}\tau (\varrho )g^{r_{2}}_{2}(\varrho )\,d\varrho >0\). Let \(\psi (\varrho )=\varrho ^{\frac{1}{r_{1}}}\), \(\varrho >0,r_{1}>1 \). Then clearly the function ψ is concave. Therefore applying Theorem 1 for \(\psi (\varrho )=\varrho ^{\frac{1}{r_{1}}}, p=\tau g^{r_{2}}_{2}, g=g^{r_{1}}_{1}g^{-r_{2}}_{2}\), we obtain (8). If \(\int _{a}^{b}\tau (\varrho )g^{r_{1}}_{1}(\varrho )\,d\varrho >0\), then applying the same procedure but taking \(r_{1}, r_{2}, g_{1},g_{2}\) instead of \(r_{2}, r_{1}, g_{2},g_{1}\), we obtain (8).

Now suppose that \(\int _{a}^{b}\tau (\varrho )g^{r_{2}}_{2}(\varrho )\,d\varrho =0\) and \(\int _{a}^{b}\tau (\varrho )g^{r_{1}}_{1}(\varrho )\,d\varrho =0\). We know that

$$ 0\leq \tau (\varrho )g_{1}(\varrho )g_{2}(\varrho )\leq \frac{1}{r_{1}}\tau (\varrho )g_{1}^{r_{1}}(\varrho )+ \frac{1}{r_{2}}\tau (\varrho )g_{2}^{r_{2}}(\varrho ). $$
(10)

Therefore, integrating and using the given conditions, we have \(\int _{a}^{b}\tau (\varrho )g_{1}(\varrho )g_{2}(\varrho )\,d\varrho =0\).

In the case when \(0< r_{1}<1\), we have \(M=\frac{1}{r_{1}}>1\). Applying (8) with M and \(N=(1-r_{1})^{-1}, \overline{g}_{1}= (g_{1}g_{2})^{r_{1}}, \overline{g}_{2}=g^{-r_{1}}_{2}\) instead of \(r_{1},r_{2},g_{1},g_{2}\), we get (9).

Finally, if \(r_{1}<0\) then \(0< r_{2}<1\) and we may apply similar arguments with \(r_{1}, r_{2}, g_{1},g_{2}\) replaced by \(r_{2}, r_{1}, g_{2},g_{1} \) provided that \(\int _{a}^{b}\tau (\varrho )g^{r_{1}}_{1}(\varrho )\,d\varrho >0\). □
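Inequality (8) lends itself to a quick numerical sanity check. The sketch below is illustrative only: the helper names, the midpoint quadrature and the sample functions \(\tau \), \(g_{1}\), \(g_{2}\), u and the exponent \(r_{1}=3\) are assumptions, not taken from the paper.

```python
import math

def integrate(f, a, b, n=2000):
    """Midpoint rule for the integral of f over [a, b] (illustrative helper)."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def holder_chain(tau, g1, g2, u, r1, a, b):
    """Return the three sides of (8) as (left, middle, right) for r1 > 1."""
    r2 = r1 / (r1 - 1.0)
    v = lambda x: 1.0 - u(x)
    one = lambda x: 1.0

    def norm(w, f, r):
        # (integral of w * tau * f^r)^(1/r)
        return integrate(lambda x: w(x) * tau(x) * f(x) ** r, a, b) ** (1.0 / r)

    left = norm(one, g1, r1) * norm(one, g2, r2)
    middle = sum(norm(w, g1, r1) * norm(w, g2, r2) for w in (u, v))
    right = integrate(lambda x: tau(x) * g1(x) * g2(x), a, b)
    return left, middle, right

a, b = 0.0, 1.0
left, middle, right = holder_chain(
    tau=lambda x: 1.0,                # assumed example weight
    g1=lambda x: 1.0 + x,             # assumed example
    g2=lambda x: math.exp(-x),        # assumed example
    u=lambda x: (b - x) / (b - a),    # the choice appearing in Remark 1
    r1=3.0, a=a, b=b,
)
print(left, middle, right)
```

The middle value sits between the Hölder product and the integral of \(\tau g_{1}g_{2}\), which is exactly the refinement claimed in (8).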

Remark 1

If we put \(u(\varrho )=\frac{b-\varrho }{b-a},v(\varrho )= \frac{\varrho -a}{b-a}\) in (8), then we deduce the inequalities which have been obtained by Işcan in [21].

Let p and g be positive integrable functions defined on \([a,b]\). Then the integral power means of order \(r\in \mathbb{R}\) are defined as follows:

$$ M_{r}(p;g)= \textstyle\begin{cases} (\frac{1}{\int _{a}^{b}p(\varrho )\,d\varrho }\int _{a}^{b}p( \varrho )g^{r}(\varrho )\,d\varrho )^{\frac{1}{r}},& \text{if $r\neq 0$}, \\ \exp ( \frac{\int _{a}^{b}p(\varrho )\log g(\varrho )\,d\varrho }{\int _{a}^{b}p(\varrho )\,d\varrho } ),& \text{if $r= 0$}. \end{cases} $$
(11)

In the following corollary we deduce inequalities for power means.

Corollary 3

Let \(p,u,v\) and g be positive integrable functions defined on \([a,b]\) with \(u(\varrho )+v(\varrho )=1\) for all \(\varrho \in [a,b]\). Let \(s,t\in \mathbb{R}\) be such that \(s\leq t\). Then

$$\begin{aligned} &M_{t}(p;g)\geq \bigl[ M_{1}(u;p) M_{s}^{t}(u.p;g)+ M_{1}(v;p) M_{s}^{t}(v.p;g) \bigr]^{\frac{1}{t}}\geq M_{s}(p;g), \quad t\neq 0, \end{aligned}$$
(12)
$$\begin{aligned} & M_{t}(p;g)\geq M_{1}(u;p) \log M_{s}(u.p;g)+ M_{1}(v;p) \log M_{s}(v.p;g) \geq M_{s}(p;g),\quad t= 0, \end{aligned}$$
(13)
$$\begin{aligned} & M_{s}(p;g)\leq \bigl[ M_{1}(u;p) M_{t}^{s}(u.p;g)+ M_{1}(v;p) M_{t}^{s}(v.p;g) \bigr]^{\frac{1}{s}}\leq M_{t}(p;g), \quad s\neq 0, \end{aligned}$$
(14)
$$\begin{aligned} & M_{s}(p;g)\leq M_{1}(u;p) \log M_{t}(u.p;g)+ M_{1}(v;p) \log M_{t}(v.p;g) \leq M_{t}(p;g),\quad s= 0. \end{aligned}$$
(15)

Proof

If \(s,t\in \mathbb{R}\) and \(s,t\neq 0\), then using (2) for \(\psi (\varrho )=\varrho ^{\frac{t}{s}}\), \(\varrho >0\), \(g\rightarrow g^{s}\) and then taking the power \(\frac{1}{t}\) we get (12). For the case \(t=0\), taking the limit \(t\rightarrow 0 \) in (12) we obtain (13). The case \(s=0\) follows in the same way by taking the limit \(s\rightarrow 0\).

Similarly taking (2) for \(\psi (\varrho )=\varrho ^{\frac{s}{t}}\), \(\varrho >0,s,t\neq 0\), \(g\rightarrow g^{t}\) and then taking the power \(\frac{1}{s}\) we get (14). For \(s=0\) or \(t=0\) we take the limit as above. □
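A numerical illustration of (12) may help fix the notation. This is a sketch under stated assumptions: the coefficient in front of each mean is taken to be \(\int _{a}^{b}w(\varrho )p(\varrho )\,d\varrho /\int _{a}^{b}p(\varrho )\,d\varrho \) for \(w\in \{u,v\}\), which is the weight that Theorem 1 produces, and the helper names and sample functions are hypothetical.

```python
import math

def integrate(f, a, b, n=2000):
    """Midpoint rule for the integral of f over [a, b] (illustrative helper)."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def power_mean(w, g, r, a, b):
    """Integral power mean M_r(w; g) as in (11)."""
    W = integrate(w, a, b)
    if r == 0:
        return math.exp(integrate(lambda x: w(x) * math.log(g(x)), a, b) / W)
    return (integrate(lambda x: w(x) * g(x) ** r, a, b) / W) ** (1.0 / r)

def refined_mean(p, g, u, s, t, a, b):
    """Middle term of (12) for t != 0, with weights (integral of w*p) / (integral of p)."""
    v = lambda x: 1.0 - u(x)
    P = integrate(p, a, b)
    total = 0.0
    for w in (u, v):
        wp = lambda x, w=w: w(x) * p(x)
        total += integrate(wp, a, b) / P * power_mean(wp, g, s, a, b) ** t
    return total ** (1.0 / t)

a, b = 0.0, 1.0
p = lambda x: 1.0 + x * x            # assumed example weight
g = lambda x: 1.0 + x                # assumed positive example
u = lambda x: (b - x) / (b - a)      # assumed example, v = 1 - u

s, t = 1.0, 3.0
upper = power_mean(p, g, t, a, b)
middle = refined_mean(p, g, u, s, t, a, b)
lower = power_mean(p, g, s, a, b)
print(upper, middle, lower)
```

The same functions can be called with other pairs \(s\leq t\), \(t\neq 0\), for instance \(s=-1\), \(t=2\).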

Let p be a positive integrable function defined on \([a,b]\) and g be any integrable function defined on \([a,b]\). Then, for a strictly monotone continuous function h whose domain contains the image of g, the quasi-arithmetic mean is defined as follows:

$$ M_{h}(p;g)= h^{-1} \biggl(\frac{1}{\int _{a}^{b}p(\varrho )\,d\varrho } \int _{a}^{b}p(\varrho )h\bigl(g(\varrho )\bigr)\,d \varrho \biggr). $$
(16)

We give inequalities for the quasi-arithmetic mean.

Corollary 4

Let \(u,v,p\) be positive integrable functions defined on \([a,b]\) such that \(u(\varrho )+v(\varrho )=1\) for all \(\varrho \in [a,b]\) and let g be any integrable function defined on \([a,b]\). Also assume that h is a strictly monotone continuous function whose domain contains the image of g. If \(f\circ h^{-1}\) is a convex function, then

$$\begin{aligned} & \frac{1}{\int _{a}^{b}p(\varrho )\,d\varrho } \int _{a}^{b}p(\varrho )f\bigl(g( \varrho )\bigr)\,d \varrho \\ &\quad \geq M_{1}(u;p)f\bigl(M_{h}(p.u;g)\bigr) + M_{1}(v;p)f\bigl(M_{h}(p.v;g)\bigr) \geq f \bigl(M_{h}(p;g) \bigr). \end{aligned}$$
(17)

If the function \(f\circ h^{-1}\) is concave then the reverse inequalities hold in (17).

Proof

The required inequalities may be deduced by using (2) for \(g\rightarrow h\circ g\) and \(\psi \rightarrow f\circ h^{-1}\). □

The following refinement of the Hermite–Hadamard inequality may be given.

Corollary 5

Let \(\psi: [a,b] \rightarrow \mathbb{R}\) be a convex function defined on the interval \([a,b]\). Let \(u,v:[a,b]\rightarrow \mathbb{R}\) be integrable functions such that \(u(\varrho ), v(\varrho )\in \mathbb{R}^{+}\) for all \(\varrho \in [a,b]\) and \(u(\varrho )+v(\varrho )=1\). Then

$$\begin{aligned} \frac{1}{b-a} \int _{a}^{b}\psi (\varrho )\,d\varrho \geq{}& \frac{1}{b-a} \int _{a}^{b}u(\varrho )\,d\varrho \psi \biggl( \frac{\int _{a}^{b}\varrho u(\varrho )\,d\varrho }{\int _{a}^{b}u(\varrho )\,d\varrho } \biggr) \\ &{}+ \frac{1}{b-a} \int _{a}^{b}v(\varrho )\,d\varrho \psi \biggl( \frac{\int _{a}^{b}\varrho v(\varrho )\,d\varrho }{\int _{a}^{b}v(\varrho )\,d\varrho } \biggr) \geq \psi \biggl(\frac{a+b}{2} \biggr). \end{aligned}$$
(18)

For a concave function ψ the reverse inequalities hold in (18).

Proof

Using Theorem 1 for \(p(\varrho )=1, g(\varrho )=\varrho \) for all \(\varrho \in [a,b]\), we obtain (18). □
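As an illustration, (18) can be made explicit for one particular pair of weights; the choice below is only an example (it is the pair appearing in Remark 1). Take \(u(\varrho )=\frac{b-\varrho }{b-a}\) and \(v(\varrho )=\frac{\varrho -a}{b-a}\). A direct computation gives

$$ \int _{a}^{b}u(\varrho )\,d\varrho = \int _{a}^{b}v(\varrho )\,d\varrho =\frac{b-a}{2}, \qquad \frac{\int _{a}^{b}\varrho u(\varrho )\,d\varrho }{\int _{a}^{b}u(\varrho )\,d\varrho }=\frac{2a+b}{3}, \qquad \frac{\int _{a}^{b}\varrho v(\varrho )\,d\varrho }{\int _{a}^{b}v(\varrho )\,d\varrho }=\frac{a+2b}{3}, $$

so that (18) becomes

$$ \frac{1}{b-a} \int _{a}^{b}\psi (\varrho )\,d\varrho \geq \frac{1}{2} \biggl[\psi \biggl(\frac{2a+b}{3} \biggr)+\psi \biggl(\frac{a+2b}{3} \biggr) \biggr] \geq \psi \biggl(\frac{a+b}{2} \biggr). $$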

3 Applications in information theory

In this section, we present some applications of our main result to different divergences and distances in information theory [22].

Definition 1

(Csiszár divergence)

Let \(T:I\rightarrow \mathbb{R}\) be a function defined on the positive interval I. Also let \(u_{1},v_{1}:[a,b]\rightarrow (0,\infty )\) be two integrable functions such that \(\frac{u_{1}(\varrho )}{v_{1}(\varrho )}\in I\) for all \(\varrho \in [a,b]\), then the Csiszár divergence is defined as

$$ \mathtt{C}_{\mathtt{d}}(u_{1},v_{1})= \int _{a}^{b}v_{1}(\varrho )T \biggl( \frac{u_{1}(\varrho )}{v_{1}(\varrho )} \biggr)\,d\varrho. $$

Theorem 2

Let \(T: I \rightarrow \mathbb{R}\) be a convex function defined on the positive interval I. Let \(u,v,u_{1},v_{1}:[a,b]\rightarrow \mathbb{R}^{+}\) be integrable functions such that \(\frac{u_{1}(\varrho )}{v_{1}(\varrho )}\in I\) and \(u(\varrho )+v(\varrho )=1\) for all \(\varrho \in [a,b]\). Then

$$\begin{aligned} \mathtt{C}_{\mathtt{d}}\geq{}& \int _{a}^{b}u(\varrho )v_{1}(\varrho )\,d \varrho T \biggl(\frac{\int _{a}^{b}u(\varrho )u_{1}(\varrho )\,d\varrho }{\int _{a}^{b}u(\varrho )v_{1}(\varrho )\,d\varrho } \biggr) \\ &{} + \int _{a}^{b}v(\varrho )v_{1}(\varrho ) \,d\varrho T \biggl( \frac{\int _{a}^{b}v(\varrho )u_{1}(\varrho )\,d\varrho }{\int _{a}^{b}v(\varrho )v_{1}(\varrho )\,d\varrho } \biggr) \geq T \biggl( \frac{\int _{a}^{b}u_{1}(\varrho )\,d\varrho }{\int _{a}^{b}v_{1}(\varrho )\,d\varrho } \biggr) \int _{a}^{b}v_{1}(\varrho )\,d\varrho. \end{aligned}$$
(19)

Proof

Using Theorem 1 for \(\psi =T\), \(g=\frac{u_{1}}{v_{1}}\) and \(p=v_{1}\), we obtain (19). □
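The three sides of (19) can be computed for any convex T. The following sketch does this numerically; the function names, the midpoint quadrature and the sample functions \(u_{1}\), \(v_{1}\), u are hypothetical choices made for illustration, with \(T(\varrho )=-\log \varrho \) as the convex function.

```python
import math

def integrate(f, a, b, n=2000):
    """Midpoint rule for the integral of f over [a, b] (illustrative helper)."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def csiszar_chain(T, u1, v1, u, a, b):
    """Return the three sides of (19) as (left, middle, right); v is taken as 1 - u."""
    v = lambda x: 1.0 - u(x)
    # Csiszar divergence: integral of v1 * T(u1 / v1)
    left = integrate(lambda x: v1(x) * T(u1(x) / v1(x)), a, b)
    middle = 0.0
    for w in (u, v):
        num = integrate(lambda x: w(x) * u1(x), a, b)
        den = integrate(lambda x: w(x) * v1(x), a, b)
        middle += den * T(num / den)
    U = integrate(u1, a, b)
    V = integrate(v1, a, b)
    right = V * T(U / V)
    return left, middle, right

a, b = 0.0, 1.0
left, middle, right = csiszar_chain(
    T=lambda y: -math.log(y),                   # convex on (0, infinity)
    u1=lambda x: 1.0 + x,                       # assumed positive example
    v1=lambda x: math.exp(x),                   # assumed positive example
    u=lambda x: 0.5 + 0.4 * math.sin(6.0 * x),  # values in (0, 1), so v = 1 - u > 0
    a=a, b=b,
)
print(left, middle, right)
```

Swapping in another convex T, such as \(\varrho \log \varrho \) or \((\sqrt{\varrho }-1)^{2}\), gives the corresponding bounds of the corollaries that follow.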

Definition 2

(Shannon entropy)

If \(v_{1}(\varrho )\) is a positive probability density function defined on \([a,b]\), then the Shannon entropy is defined by

$$ \mathrm{SE}(v_{1})=- \int _{a}^{b}v_{1}(\varrho )\log v_{1}(\varrho )\,d\varrho. $$

Corollary 6

Let \(u,v,u_{1},v_{1}:[a,b]\rightarrow \mathbb{R}^{+}\) be integrable functions such that \(v_{1}\) is a probability density function and \(u(\varrho )+v(\varrho )=1\) for all \(\varrho \in [a,b]\). Then

$$\begin{aligned} & \int _{a}^{b}v_{1}(\varrho )\log \bigl(u_{1}(\varrho )\bigr)\,d\varrho +\mathrm{SE}(v_{1}) \\ &\quad \leq \int _{a}^{b}u(\varrho )v_{1}(\varrho )\,d \varrho \log \biggl(\frac{\int _{a}^{b}u(\varrho )u_{1}(\varrho )\,d\varrho }{\int _{a}^{b}u(\varrho )v_{1}(\varrho )\,d\varrho } \biggr) \\ &\qquad{} + \int _{a}^{b}v(\varrho )v_{1}(\varrho )\,d \varrho \log \biggl( \frac{\int _{a}^{b}v(\varrho )u_{1}(\varrho )\,d\varrho }{\int _{a}^{b}v(\varrho )v_{1}(\varrho )\,d\varrho } \biggr) \leq \log \biggl( \int _{a}^{b}u_{1}(\varrho )\,d\varrho \biggr). \end{aligned}$$
(20)

Proof

Taking \(T(\varrho )=-\log \varrho, \varrho \in \mathbb{R}^{+}\), in (19), we obtain (20). □

Definition 3

(Kullback–Leibler divergence)

If \(u_{1}\) and \(v_{1}\) are two positive probability densities defined on \([a,b]\), the Kullback–Leibler divergence is defined by

$$ \mathtt{KL}_{\mathtt{d}}(u_{1},v_{1})= \int _{a}^{b}u_{1}(\varrho ) \log \biggl( \frac{u_{1}(\varrho )}{v_{1}(\varrho )} \biggr)\,d\varrho. $$

Corollary 7

Let \(u,v,u_{1},v_{1}:[a,b]\rightarrow \mathbb{R}^{+}\) be integrable functions such that \(u_{1}\) and \(v_{1}\) are probability density functions and \(u(\varrho )+v(\varrho )=1\) for all \(\varrho \in [a,b]\). Then

$$\begin{aligned} \mathtt{KL}_{\mathtt{d}}(u_{1},v_{1})\geq{}& \int _{a}^{b}u(\varrho )u_{1}(\varrho )\,d \varrho \log \biggl(\frac{\int _{a}^{b}u(\varrho )u_{1}(\varrho )\,d\varrho }{\int _{a}^{b}u(\varrho )v_{1}(\varrho )\,d\varrho } \biggr) \\ &{} + \int _{a}^{b}v(\varrho )u_{1}(\varrho )\,d \varrho \log \biggl( \frac{\int _{a}^{b}v(\varrho )u_{1}(\varrho )\,d\varrho }{\int _{a}^{b}v(\varrho )v_{1}(\varrho )\,d\varrho } \biggr) \geq 0. \end{aligned}$$
(21)

Proof

Taking \(T(\varrho )=\varrho \log \varrho, \varrho \in \mathbb{R}^{+}\), in (19), we obtain (21). □
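A small worked instance of (21) shows the size of the improvement over the bound \(\mathtt{KL}_{\mathtt{d}}\geq 0\). The densities and weights here are chosen purely for illustration: on \([0,1]\) take \(u_{1}(\varrho )=\varrho +\frac{1}{2}\), \(v_{1}(\varrho )=1\), \(u(\varrho )=1-\varrho \) and \(v(\varrho )=\varrho \). Then

$$ \int _{0}^{1}u u_{1}\,d\varrho =\frac{5}{12}, \qquad \int _{0}^{1}v u_{1}\,d\varrho =\frac{7}{12}, \qquad \int _{0}^{1}u v_{1}\,d\varrho = \int _{0}^{1}v v_{1}\,d\varrho =\frac{1}{2}, $$

and

$$ \mathtt{KL}_{\mathtt{d}}(u_{1},v_{1})=\frac{9}{8}\log \frac{3}{2}+\frac{1}{8}\log 2-\frac{1}{2}\approx 0.0428 \geq \frac{5}{12}\log \frac{5}{6}+\frac{7}{12}\log \frac{7}{6}\approx 0.0140 \geq 0. $$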

Definition 4

(Variational distance)

If \(u_{1}\) and \(v_{1}\) are positive probability density functions defined on \([a,b]\), then the variational distance is defined by

$$ \mathtt{V}_{\mathtt{d}}(u_{1},v_{1})= \int _{a}^{b} \bigl\vert u_{1}( \varrho )-v_{1}(\varrho ) \bigr\vert \,d\varrho. $$

Corollary 8

Let \(u,v,u_{1},v_{1}\) be as stated in Corollary 7. Then

$$\begin{aligned} \mathtt{V}_{\mathtt{d}}(u_{1},v_{1}) \geq{}& \biggl\vert \int _{a}^{b}u( \varrho ) \bigl(u_{1}( \varrho )-v_{1}(\varrho )\bigr)\,d\varrho \biggr\vert \\ &{}+ \biggl\vert \int _{a}^{b}v(\varrho ) \bigl(u_{1}( \varrho )-v_{1}(\varrho )\bigr)\,d \varrho \biggr\vert . \end{aligned}$$
(22)

Proof

Using the function \(T(\varrho )=| \varrho -1|, \varrho \in \mathbb{R}^{+}\), in (19), we obtain (22). □

Definition 5

(Jeffrey’s distance)

If \(u_{1}\) and \(v_{1}\) are two positive probability density functions defined on \([a,b]\), then the Jeffrey distance is defined by

$$ \mathtt{J}_{\mathtt{d}}(u_{1},v_{1})= \int _{a}^{b} \bigl(u_{1}( \varrho )-v_{1}(\varrho ) \bigr)\log \biggl( \frac{u_{1}(\varrho )}{v_{1}(\varrho )} \biggr) \,d \varrho. $$

Corollary 9

Let \(u,v,u_{1},v_{1}\) be as stated in Corollary 7. Then

$$\begin{aligned} \mathtt{J}_{\mathtt{d}}(u_{1},v_{1})\geq{}& \int _{a}^{b}u(\varrho ) \bigl(u_{1}( \varrho )-v_{1}(\varrho )\bigr)\,d\varrho \log \biggl(\frac{\int _{a}^{b}u(\varrho )u_{1}(\varrho )\,d\varrho }{\int _{a}^{b}u(\varrho )v_{1}(\varrho )\,d\varrho } \biggr) \\ & {}+ \int _{a}^{b}v(\varrho ) \bigl(u_{1}( \varrho )-v_{1}(\varrho )\bigr)\,d \varrho \log \biggl( \frac{\int _{a}^{b}v(\varrho )u_{1}(\varrho )\,d\varrho }{\int _{a}^{b}v(\varrho )v_{1}(\varrho )\,d\varrho } \biggr) \geq 0. \end{aligned}$$
(23)

Proof

Using the function \(T(\varrho )=(\varrho -1)\log \varrho, \varrho \in \mathbb{R}^{+}\), in (19), we obtain (23). □

Definition 6

(Bhattacharyya coefficient)

If \(u_{1}\) and \(v_{1}\) are two positive probability density functions defined on \([a,b]\), then the Bhattacharyya coefficient is defined by

$$ \mathtt{B}_{\mathtt{d}}(u_{1},v_{1})= \int _{a}^{b}\sqrt{u_{1}( \varrho )v_{1}(\varrho )} \,d\varrho. $$

Corollary 10

Let \(u,v,u_{1},v_{1}\) be as stated in Corollary 7. Then

$$\begin{aligned} \mathtt{B}_{\mathtt{d}}(u_{1},v_{1})\leq{}& \sqrt{ \int _{a}^{b}u(\varrho )v_{1}(\varrho )\,d \varrho \int _{a}^{b}u(\varrho )u_{1}(\varrho )\,d \varrho } \\ &{} + \sqrt{ \int _{a}^{b}v(\varrho )v_{1}(\varrho )\,d \varrho \int _{a}^{b}v( \varrho )u_{1}(\varrho ) \,d\varrho }. \end{aligned}$$
(24)

Proof

Using the function \(T(\varrho )=-\sqrt{\varrho }, \varrho \in \mathbb{R}^{+}\), in (19), we obtain (24). □

Definition 7

(Hellinger distance)

If \(u_{1}\) and \(v_{1}\) are two positive probability density functions defined on \([a,b]\), then the Hellinger distance is defined by

$$ \mathtt{H}_{\mathtt{d}}(u_{1},v_{1})= \int _{a}^{b} \bigl(\sqrt{u_{1}( \varrho )}-\sqrt{v_{1}(\varrho )} \bigr)^{2} \,d\varrho. $$

Corollary 11

Let \(u,v,u_{1},v_{1}\) be as stated in Corollary 7. Then

$$\begin{aligned} \mathtt{H}_{\mathtt{d}}(u_{1},v_{1})\geq{}& \biggl(\sqrt{ \int _{a}^{b}u(\varrho )u_{1}(\varrho )\,d \varrho }- \sqrt{ \int _{a}^{b}u(\varrho )v_{1}(\varrho )\,d \varrho } \biggr)^{2} \\ & {}+ \biggl(\sqrt{ \int _{a}^{b}v(\varrho )u_{1}(\varrho )\,d \varrho }- \sqrt{ \int _{a}^{b}v(\varrho )v_{1}(\varrho )\,d \varrho } \biggr)^{2} \geq 0. \end{aligned}$$
(25)

Proof

Using the function \(T(\varrho )=(\sqrt{\varrho }-1)^{2}, \varrho \in \mathbb{R}^{+}\), in (19), we obtain (25). □

Definition 8

(Triangular discrimination)

If \(u_{1}\) and \(v_{1}\) are two positive probability density functions defined on \([a,b]\), then the triangular discrimination between \(u_{1}\) and \(v_{1}\) is defined by

$$ \mathtt{T}_{\mathtt{d}}(u_{1},v_{1})= \int _{a}^{b} \frac{ (u_{1}(\varrho )-v_{1}(\varrho ) )^{2}}{u_{1}(\varrho )+v_{1}(\varrho )} \,d\varrho. $$

Corollary 12

Let \(u,v,u_{1},v_{1}\) be as stated in Corollary 7. Then

$$\begin{aligned} \mathtt{T}_{\mathtt{d}}(u_{1},v_{1})\geq{}& \frac{ (\int _{a}^{b}u(\varrho )(u_{1}(\varrho )-v_{1}(\varrho ))\,d\varrho )^{2}}{\int _{a}^{b}u(\varrho )(u_{1}(\varrho )+v_{1}(\varrho ))\,d\varrho } \\ & {}+ \frac{ (\int _{a}^{b}(u_{1}(\varrho )-v_{1}(\varrho ))v(\varrho )\,d\varrho )^{2}}{\int _{a}^{b}(u_{1}(\varrho )+v_{1}(\varrho ))v(\varrho )\,d\varrho } \geq 0. \end{aligned}$$
(26)

Proof

Since the function \(\phi (\varrho )= \frac{(\varrho -1)^{2}}{\varrho +1}, \varrho \in \mathbb{R}^{+}\), is convex, using the function \(T(\varrho )=\phi (\varrho )\), in (19), we obtain (26). □
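The convexity of \(\phi \) used in this proof can be seen from a short computation, sketched here for completeness. Writing \((\varrho -1)^{2}=(\varrho +1)^{2}-4(\varrho +1)+4\), we get

$$ \phi (\varrho )=(\varrho +1)-4+\frac{4}{\varrho +1}, \qquad \phi ''(\varrho )=\frac{8}{(\varrho +1)^{3}}>0 \quad \text{for } \varrho >0. $$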

4 Further generalization

In the following theorem we present a further refinement of the Jensen inequality concerning n functions whose sum is equal to unity.

Theorem 3

Let \(\psi: \mathtt{G} \rightarrow \mathbb{R}\) be a convex function defined on the interval G. Let \(p,g,u_{l}\in L[a,b]\) be such that \(g(\varrho )\in \mathtt{G}, p(\varrho ), u_{l}(\varrho )\in \mathbb{R}^{+}\) for all \(\varrho \in [a,b]\) \((l=1,2,\ldots,n)\) and \(\sum_{l=1}^{n}u_{l}(\varrho )=1\), \(P=\int _{a}^{b}p(\varrho )\,d\varrho \). Assume that \(L_{1}\) and \(L_{2}\) are non-empty disjoint subsets of \(\{1,2,\ldots,n\}\) such that \(L_{1}\cup L_{2}=\{1,2,\ldots,n\}\). Then

$$\begin{aligned} &\frac{1}{P} \int _{a}^{b}p(\varrho )\psi \bigl(g(\varrho )\bigr) \,d\varrho \\ &\quad\geq \frac{1}{P} \int _{a}^{b}\sum_{l\in L_{1}}u_{l}( \varrho )p( \varrho )\,d\varrho \psi \biggl( \frac{\int _{a}^{b}\sum_{l\in L_{1}}u_{l}(\varrho )p(\varrho )g(\varrho )\,d\varrho }{\int _{a}^{b}\sum_{l\in L_{1}}u_{l}(\varrho )p(\varrho )\,d\varrho } \biggr) \\ & \qquad{}+ \frac{1}{P} \int _{a}^{b}\sum_{l\in L_{2}}u_{l}( \varrho )p( \varrho )\,d\varrho \psi \biggl( \frac{\int _{a}^{b}\sum_{l\in L_{2}}u_{l}(\varrho )p(\varrho )g(\varrho )\,d\varrho }{\int _{a}^{b}\sum_{l\in L_{2}}u_{l}(\varrho )p(\varrho )\,d\varrho } \biggr) \\ &\quad \geq \psi \biggl(\frac{1}{P} \int _{a}^{b}p(\varrho )g(\varrho )\,d \varrho \biggr). \end{aligned}$$
(27)

If the function ψ is concave then the reverse inequalities hold in (27).

Proof

Since \(\sum_{l=1}^{n}u_{l}(\varrho )=1\), we may write

$$\begin{aligned} &\int _{a}^{b}p(\varrho )\psi \bigl(g(\varrho )\bigr) \,d\varrho \\ &\quad = \int _{a}^{b} \sum_{l\in L_{1}}u_{l}( \varrho )p(\varrho )\psi \bigl(g(\varrho )\bigr)\,d \varrho + \int _{a}^{b}\sum_{l\in L_{2}}u_{l}( \varrho )p(\varrho ) \psi \bigl(g(\varrho )\bigr)\,d\varrho. \end{aligned}$$
(28)

Applying the integral Jensen inequality to both terms on the right-hand side of (28), we obtain

$$\begin{aligned} &\frac{1}{P} \int _{a}^{b}p(\varrho )\psi \bigl(g(\varrho )\bigr) \,d\varrho \\ &\quad\geq \frac{1}{P} \int _{a}^{b}\sum_{l\in L_{1}}u_{l}( \varrho )p( \varrho )\,d\varrho \psi \biggl( \frac{\int _{a}^{b}\sum_{l\in L_{1}}u_{l}(\varrho )p(\varrho )g(\varrho )\,d\varrho }{\int _{a}^{b}\sum_{l\in L_{1}}u_{l}(\varrho )p(\varrho )\,d\varrho } \biggr) \\ &\qquad{} + \frac{1}{P} \int _{a}^{b}\sum_{l\in L_{2}}u_{l}( \varrho )p( \varrho )\,d\varrho \psi \biggl( \frac{\int _{a}^{b}\sum_{l\in L_{2}}u_{l}(\varrho )p(\varrho )g(\varrho )\,d\varrho }{\int _{a}^{b}\sum_{l\in L_{2}}u_{l}(\varrho )p(\varrho )\,d\varrho } \biggr) \\ &\quad \geq \psi \biggl(\frac{1}{P} \int _{a}^{b}\sum_{l\in L_{1}}u_{l}( \varrho )p(\varrho )g(\varrho )\,d\varrho + \frac{1}{P} \int _{a}^{b} \sum_{l\in L_{2}}u_{l}( \varrho )p(\varrho )g(\varrho )\,d\varrho \biggr) \\ & \qquad\text{(by the convexity of $\psi $)} \\ &\quad=\psi \biggl(\frac{1}{P} \int _{a}^{b}p(\varrho )g(\varrho )\,d \varrho \biggr). \end{aligned}$$
(29)

 □
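Theorem 3 can be explored numerically by looping over every split of the index set. The sketch below takes \(n=3\); the helper names, the 0-based indexing, the midpoint quadrature and the sample functions are illustrative assumptions rather than anything prescribed by the theorem.

```python
import math
from itertools import combinations

def integrate(f, a, b, n=1000):
    """Midpoint rule for the integral of f over [a, b] (illustrative helper)."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def grouped_middle(psi, g, p, us, L1, a, b):
    """Middle term of (27) for the split (L1, complement); indices are 0-based."""
    P = integrate(p, a, b)
    L2 = [l for l in range(len(us)) if l not in L1]
    total = 0.0
    for group in (L1, L2):
        w = lambda x, group=group: sum(us[l](x) for l in group)
        mass = integrate(lambda x: w(x) * p(x), a, b)
        mean = integrate(lambda x: w(x) * p(x) * g(x), a, b) / mass
        total += mass / P * psi(mean)
    return total

a, b = 0.5, 2.0
psi = math.exp                        # convex (assumed example)
g = math.sin                          # assumed example
p = lambda x: 1.0 / x                 # positive weight (assumed example)

# Three positive functions normalized so that they sum to 1 at every point.
raw = [lambda x: 1.0, lambda x: x, lambda x: x * x]
us = [lambda x, r=r: r(x) / sum(q(x) for q in raw) for r in raw]

P = integrate(p, a, b)
left = integrate(lambda x: p(x) * psi(g(x)), a, b) / P
right = psi(integrate(lambda x: p(x) * g(x), a, b) / P)

middles = {}
for size in range(1, len(us)):
    for L1 in combinations(range(len(us)), size):
        middles[L1] = grouped_middle(psi, g, p, us, L1, a, b)

for L1, m in middles.items():
    print(L1, left, m, right)
```

Complementary splits such as `(0,)` and `(1, 2)` give the same middle value, since (27) is symmetric in \(L_{1}\) and \(L_{2}\).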

Remark 2

If we take \(n=2\) in Theorem 3, we deduce Theorem 1. Also, analogously to the previous sections, we may give applications of Theorem 3 to different means, the Hölder inequality and information theory.