1 Introduction

In the literature, several kinds of information divergence measures comparing two probability distributions are discussed and widely applied in engineering, information theory, and statistics. Classifying these measures as nonparametric, parametric, or entropy-type measures of information provides a simple way to distinguish them [31]. Parametric measures of information, which are functions of an unknown parameter θ, quantify the amount of information about θ supplied by the data; Fisher’s measure of information is the best-known example of this type [32]. Nonparametric measures quantify the amount of information provided by data for discriminating in favor of one probability distribution \(\textbf{p}_{1}\) over another \(\textbf{p}_{2}\), or for measuring the affinity or distance between \(\textbf{p}_{1}\) and \(\textbf{p}_{2}\); the best-known measure of this class is the Kullback–Leibler measure [42]. Measures of entropy express the amount of information contained in a distribution, that is, the amount of uncertainty associated with the outcome of an experiment; Rényi’s [62] and Shannon’s measures [65] are the best-known examples of this type. Shannon and Zipf–Mandelbrot entropies are very useful in various applied sciences, for example, in economics, information theory, and biology.

The theory of time scales plays a vital role in unifying difference calculus, differential calculus, and quantum calculus. The subject has been developed rapidly by several mathematicians, who have added various useful results to the literature by utilizing dynamic equations and integral inequalities on time scales; see, e.g., [8, 9, 19, 20, 56, 63, 69]. Besides mathematics, inequalities and dynamic equations are also used in other disciplines, e.g., in physical problems, population dynamics, finance problems, and optical problems [22, 37, 51, 72].

In the early twentieth century, the idea of q-calculus was initiated by the work of Jackson, who defined the q-Jackson integral [36]. The q-calculus has many applications in various disciplines of physics and mathematics [35, 57, 76], and its development has recently received a rapid boost; see [11, 21, 34, 40, 50, 52, 55, 70, 71, 77] and the references therein. Several researchers have established q-analogues of various integral inequalities, see [18, 27, 30, 37, 43, 46, 47, 68], and inequalities involving convex functions have received considerable attention (see [7, 12, 33, 61]). Various papers related to inequalities for entropies and divergence measures exist in the literature, see, e.g., [1, 3, 4, 6, 13–16, 28, 64] and the references cited therein. Jensen’s inequality plays an important role in information theory: it is useful for computing upper bounds for various divergence measures, and it is a vital tool for deriving bounds for conditional entropy, joint entropy, and mutual information. It also yields various counterpart inequalities for the Shannon entropy, which is applied throughout information theory and used to solve various problems in statistics, economics, ecology, psychology, accounting, computer science, etc.

It is important to note that results for convex functions may not be valid for n-convex functions. Recently, several inequalities for n-convex functions have been generalized by numerous researchers, see e.g. [23, 25, 39, 44, 54, 59, 60, 66]. Further, in [58], the authors used Hermite interpolation and obtained Popoviciu-type inequalities for n-convex function. In [24, 26], Butt et al. established the identities concerning Popoviciu-type inequalities for n-convex functions by using Green’s function and Hermite’s interpolation. In [5], Khan et al. generalized Sherman’s inequality for n-convex functions by using Hermite’s interpolation. In [2], Adeel et al. used Hermite’s interpolation and generalized Levinson-type inequalities for higher order convex functions. They also applied their results to information theory by finding certain estimates for f-divergence. In [49], Mehmood et al. used Hermite’s interpolation and new Green’s functions and generalized the continuous and discrete cyclic refinements of Jensen’s inequality for n-convex functions. The established results were used to obtain new bounds for relative, Shannon, and Zipf–Mandelbrot entropies. Further, in [38], the authors utilized Hermite’s interpolation and Green’s functions and established new general linear inequalities and identities containing n-convex functions.

Motivated by the above discussion, we use Hermite’s interpolating polynomial and generalize the Csiszár-type inequality on time scales for n-convex functions. Moreover, we compute bounds of differential entropy, Kullback–Leibler divergence, triangular discrimination, and Jeffreys distance on time scales, q-calculus, and h-discrete calculus. Some estimates for Zipf–Mandelbrot entropy are also given.

2 Preliminaries

Details on time scale calculus can be found in [19, 20].

A function \(\mathfrak{f} : \mathbb{T} \rightarrow\mathbb{R}\) is rd-continuous if its left-sided limit is finite at left-dense points of \(\mathbb{T}\) and it is continuous at right-dense points of \(\mathbb{T}\). In this paper, \(C_{rd}\) denotes the set of rd-continuous functions.

Theorem A

Every rd-continuous function has an antiderivative. If \(x_{0} \in \mathbb{T}\), then F given by

$$F(x):= \int_{x_{0}}^{x}\mathfrak{f}(\xi)\Delta\xi \quad\textit{for } x \in\mathbb{T} $$

is an antiderivative of \(\mathfrak{f}\).

Consider the following set:

$$ \Omega:= \biggl\{ \mathbf{p}~|~ \mathbf{p} : \mathbb{T}\rightarrow[0, \infty), \int_{a}^{b}\mathbf{p}(\xi) \Delta\xi= 1 \biggr\} . $$

In [14], Ansari et al. proved the following result.

Theorem B

Suppose that \(\Theta: [0, \infty) \rightarrow\mathbb{R}\) is a convex function on \([\varsigma_{1}, \varsigma_{2}] \subset[0, \infty)\) with \(\varsigma _{1}\leq1 \leq\varsigma_{2}\). If \(\mathbf{p}_{1}, \mathbf{p}_{2} \in \Omega\) with \(\varsigma_{1}\leq\frac{\mathbf{p}_{1}(\xi)}{\mathbf{p}_{2}(\xi)} \leq\varsigma_{2}\) for every \(\xi\in\mathbb{T}\), then

$$ \int_{a}^{b}\mathbf{p}_{2}(\xi) \Theta\biggl(\frac{\mathbf{p}_{1}(\xi)}{\mathbf{p}_{2}(\xi)} \biggr)\Delta\xi\leq\frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \Theta(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma _{1}} \Theta( \varsigma_{2}). $$
(1)
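
For the special case \(\mathbb{T} = \mathbb{R}\), inequality (1) can be verified numerically. The following minimal Python sketch uses the illustrative (hypothetical) choices \(\mathbf{p}_{2}(\xi) \equiv 1\), \(\mathbf{p}_{1}(\xi) = \xi + \frac{1}{2}\) on \([a, b] = [0, 1]\), and the convex function \(\Theta(x) = x\ln x\), for which \(\varsigma_{1} = \frac{1}{2} \leq \frac{\mathbf{p}_{1}(\xi)}{\mathbf{p}_{2}(\xi)} \leq \frac{3}{2} = \varsigma_{2}\):

```python
import numpy as np
from scipy.integrate import quad

# Numerical check of inequality (1) for T = R (illustrative, hypothetical data).
a, b = 0.0, 1.0
p1 = lambda x: x + 0.5           # density: integrates to 1 on [0, 1]
p2 = lambda x: 1.0               # uniform density on [0, 1]
s1, s2 = 0.5, 1.5                # varsigma_1 <= p1/p2 <= varsigma_2, with s1 <= 1 <= s2

Theta = lambda x: x * np.log(x)  # convex on (0, infinity)

# Left-hand side of (1): integral of p2 * Theta(p1/p2).
lhs, _ = quad(lambda x: p2(x) * Theta(p1(x) / p2(x)), a, b)
# Right-hand side of (1): convex combination of Theta at the endpoints.
rhs = (s2 - 1) / (s2 - s1) * Theta(s1) + (1 - s1) / (s2 - s1) * Theta(s2)
print(f"LHS = {lhs:.6f}, RHS = {rhs:.6f}, inequality (1) holds: {lhs <= rhs}")
```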

Under the assumptions of Theorem B, we define the following linear functional:

$$ \breve{J}\bigl(\Theta(x)\bigr) = \frac{\varsigma _{2}-1}{\varsigma_{2}-\varsigma_{1}} \Theta( \varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \Theta(\varsigma_{2})- \int_{a}^{b}\mathbf{p}_{2}(\xi) \Theta\biggl(\frac{\mathbf{p}_{1}(\xi)}{\mathbf{p}_{2}(\xi)} \biggr)\Delta\xi. $$
(2)

Remark 1

From Theorem B, if Θ is continuous and convex, then \(\breve {J}(\Theta) \geq0\) and \(\breve{J}(\Theta) = 0\) for \(\Theta(x) = x\) or Θ is a constant function.

The following formula of Hermite’s interpolation is given in [10].

Let \(\varsigma_{1}, \varsigma_{2} \in\mathbb{R}\) with \(\varsigma_{1} < \varsigma_{2}\), and let \(\varsigma_{1} = d_{1} < d_{2} < \cdots< d_{s}=\varsigma_{2}\) \((s \geq2)\) be the given points. For \(\Theta\in C^{n}[\varsigma_{1}, \varsigma_{2}]\), there exists a unique polynomial \(\sigma _{\mathcal{H}}(z_{1})\) of degree \(n-1\) satisfying the following Hermite conditions:

Hermite conditions

$$\sigma_{\mathcal{H}}^{(i)}(d_{j}) = \Theta^{(i)}(d_{j});\quad0 \leq i \leq k_{j}, 1 \leq j \leq s, \sum_{j=1}^{s}k_{j} + s = n. $$

Some important special cases are the following.

Lagrange conditions (\(s=n\), \(k_{j}=0\) for all j)

$$\sigma_{L}(d_{j}) = \Theta(d_{j}), \quad 1 \leq j \leq n. $$

Type \((\vartheta, n-\vartheta)\) conditions (\(s = 2\), \(1 \leq\vartheta\leq n-1\), \(k_{1} = \vartheta-1\), \(k_{2} = n - \vartheta- 1\))

$$\begin{aligned}& \sigma_{(\vartheta, n)}^{(i)}(\varsigma_{1}) = \Theta^{(i)}(\varsigma _{1}), \quad 0 \leq i \leq\vartheta-1, \\& \sigma_{(\vartheta, n)}^{(i)}(\varsigma_{2}) = \Theta^{(i)}(\varsigma _{2}), \quad 0 \leq i \leq n-\vartheta-1. \end{aligned}$$

Two-point Taylor conditions (\(n=2\vartheta\), \(s=2\), \(k_{1} = k_{2} = \vartheta-1\))

$$\sigma_{2T}^{(i)}(\varsigma_{1}) = \Theta^{(i)}(\varsigma_{1}), \qquad \sigma_{2T}^{(i)}(\varsigma_{2}) =\Theta^{(i)}(\varsigma_{2}), \quad 0 \leq i \leq\vartheta-1. $$
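
The conditions above prescribe the values of Θ and some of its derivatives at the nodes, and any routine that constructs a polynomial from such data realizes them. As a minimal numerical sketch (the function \(e^{y}\), the interval \([0, 1]\), and the use of SciPy's BPoly.from_derivatives routine are illustrative choices, not part of the construction in [10]), the following Python code builds the two-point Taylor interpolant \(\sigma_{2T}\) for \(\vartheta = 2\) and a mixed Hermite interpolant:

```python
import numpy as np
from scipy.interpolate import BPoly

# Target function and its derivative on [varsigma_1, varsigma_2] = [0, 1] (illustrative).
f  = np.exp
df = np.exp

x0, x1 = 0.0, 1.0   # the two interpolation points varsigma_1, varsigma_2

# Two-point Taylor conditions with theta = 2 (so n = 2*theta = 4): match f and f'
# at both endpoints; the result is the cubic Hermite interpolant sigma_2T.
sigma_2T = BPoly.from_derivatives([x0, x1], [[f(x0), df(x0)], [f(x1), df(x1)]])

# General Hermite conditions may mix the number of derivatives per node,
# e.g. value and first derivative at x0 but only the value at x1.
sigma_H = BPoly.from_derivatives([x0, x1], [[f(x0), df(x0)], [f(x1)]])

y = np.linspace(x0, x1, 5)
print("f        :", f(y))
print("sigma_2T :", sigma_2T(y))   # matches f and f' at both endpoints
print("sigma_H  :", sigma_H(y))
```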

The next result is stated in [10].

Theorem C

Let \(-\infty< \varsigma_{1} < \varsigma_{2} < \infty\), let \(\Theta \in C^{n}([\varsigma_{1}, \varsigma_{2}])\), and let \(\varsigma_{1} \leq d_{1} < d_{2} < \cdots< d_{s} \leq\varsigma_{2}\) \((s \geq2)\) be the given points. Then we have

$$\begin{aligned} \Theta(y)= \sigma_{\mathcal{H}}(y) + R_{\mathcal{H}}( \Theta, y), \end{aligned}$$
(3)

where \(\sigma_{\mathcal{H}}(y)\) represents Hermite’s polynomial i.e.

$$\begin{aligned} \sigma_{\mathcal{H}}(y) = \sum_{j=1}^{s} \sum_{i=0}^{k_{j}}\mathcal{H}_{i_{j}}(y) \Theta^{(i)}(d_{j}); \end{aligned}$$

and the Hermite basis \(\mathcal{H}_{i_{j}}\) is given by

$$\begin{aligned} \mathcal{H}_{i_{j}}(y)=\frac{1}{i!} \frac{\omega(y)}{(y - d_{j})^{k_{j} + 1 - i}}\sum_{k=0}^{k_{j} - i} \frac{1}{k!}\frac{d^{k}}{dy^{k}} \biggl(\frac{(y -d_{j})^{k_{j} + 1}}{\omega(y)} \biggr)\biggm|_{y = d_{j}}(y - d_{j})^{k}, \end{aligned}$$
(4)

with

$$\begin{aligned} \omega(y) = \Pi_{j=1}^{s}(y - d_{j})^{k_{j} + 1}, \end{aligned}$$

and the remainder is given by

$$\begin{aligned} R_{\mathcal{H}}(\Theta, y) = \int_{\varsigma_{1}}^{\varsigma_{2}}\mathcal{G}_{\mathcal{H}, n}(y, z_{1})\Theta^{(n)}(z_{1})\,dz_{1}, \end{aligned}$$

where \(\mathcal{G}_{\mathcal{H}, n}(y, z_{1})\) is given as

$$\begin{aligned} \mathcal{G}_{\mathcal{H}, n}(y, z_{1})= \textstyle\begin{cases} \sum_{j=1}^{r}\sum_{i=0}^{k_{j}}\frac{(d_{j} - z_{1})^{n - i -1}}{(n - i - 1)!}\mathcal{H}_{i_{j}}(y) , & \hbox{$z_{1} \leq y$;} \\ -\sum_{j=r+1}^{s}\sum_{i=0}^{k_{j}}\frac{(d_{j} - z_{1})^{n - i -1}}{(n - i - 1)!}\mathcal{H}_{i_{j}}(y) , & \hbox{$ z_{1} \geq y$,} \end{cases}\displaystyle \end{aligned}$$
(5)

for every \(d_{r} \leq z_{1} \leq d_{r+1}\), \(r = 0, 1, \ldots, s\), where \(d_{0} = \varsigma_{1}\) and \(d_{s+1}= \varsigma_{2}\).

Remark A

We now consider specific cases of the Hermite conditions. For the Lagrange conditions, one gets

$$\begin{aligned} \Theta(y) = \sigma_{L}(y) + R_{L}(\Theta, y), \end{aligned}$$

where \(\sigma_{L}(y)\) represents Lagrange’s polynomial i.e.

$$\begin{aligned} \sigma_{L}(y) = \sum_{j=1}^{n} \prod_{k=1, k \neq j}^{n} \biggl( \frac{y - d_{k}}{d_{j} - d_{k}} \biggr)\Theta(d_{j}), \end{aligned}$$

and the remainder \(R_{L}(\Theta, y)\) is given by

$$\begin{aligned} R_{L}(\Theta, y) = \int_{\varsigma_{1}}^{\varsigma_{2}}\mathcal{G}_{L}(y, z_{1})\Theta^{(n)}(z_{1})\,dz_{1}, \end{aligned}$$

with

$$\begin{aligned} \mathcal{G}_{L}(y, z_{1}) = \frac{1}{(n-1)!} \textstyle\begin{cases} \sum_{j=1}^{r}(d_{j} - z_{1})^{n-1}\prod_{k=1, k\neq j}^{n} (\frac{y - d_{k}}{d_{j} - d_{k}} ), & \hbox{$z_{1} \leq y$;} \\ - \sum_{j=r+1}^{n}(d_{j} - z_{1})^{n-1}\prod_{k=1, k\neq j}^{n} ( \frac{y - d_{k}}{d_{j} - d_{k}} ), & \hbox{$z_{1} \geq y$,} \end{cases}\displaystyle \end{aligned}$$
(6)

for \(d_{r} \leq z_{1} \leq d_{r+1}\), \(r=1, 2, \ldots, n-1\), with \(d_{1} = \varsigma _{1}\) and \(d_{n} =\varsigma_{2}\). From Theorem C, considering type \((\vartheta, n-\vartheta)\) conditions, one gets

$$\begin{aligned} \Theta(y) = \sigma_{(\vartheta, n)}(y) + R_{\vartheta, n}(\Theta, y), \end{aligned}$$

where \(\sigma_{(\vartheta, n)}(y)\) is \((\vartheta, n-\vartheta)\) interpolating polynomial i.e.

$$\begin{aligned} \sigma_{(\vartheta, n)}(y) = \sum_{i=0}^{\vartheta-1} \tau_{i}(y)\Theta^{(i)}(\varsigma_{1}) + \sum _{i=0}^{n-\vartheta-1}\eta_{i}(y) \Theta^{(i)}(\varsigma_{2}), \end{aligned}$$

with

$$\begin{aligned} \tau_{i}(y) = \frac{1}{i!}(y - \varsigma_{1})^{i} \biggl(\frac{y - \varsigma_{2}}{\varsigma_{1} - \varsigma_{2}} \biggr)^{n-\vartheta}\sum_{k=0}^{\vartheta- 1 -i} {{n-\vartheta+k-1}\choose {k}} \biggl(\frac{y - \varsigma_{1}}{\varsigma _{2} - \varsigma_{1}} \biggr)^{k} \end{aligned}$$
(7)

and

$$\begin{aligned} \eta_{i}(y) = \frac{1}{i!}(y - \varsigma_{2})^{i} \biggl(\frac{y - \varsigma_{1}}{\varsigma_{2} - \varsigma_{1}} \biggr)^{\vartheta}\sum_{k=0}^{n-\vartheta-1-i} {{\vartheta+k-1}\choose {k}} \biggl(\frac{y - \varsigma_{2}}{\varsigma _{1} - \varsigma_{2}} \biggr)^{k}, \end{aligned}$$
(8)

and the remainder \(R_{(\vartheta, n)}(\Theta, y)\) is defined as

$$\begin{aligned} R_{(\vartheta, n)}(\Theta, y) = \int_{\varsigma_{1}}^{\varsigma_{2}}\mathcal{G}_{\vartheta, n}(y, z_{1})\Theta^{(n)}(z_{1})\,dz_{1}, \end{aligned}$$

with

$$\begin{aligned} \mathcal{G}_{(\vartheta, n)}(y, z_{1}) = \textstyle\begin{cases} \sum_{j=0}^{\vartheta-1} [\sum_{p=0}^{\vartheta-1-j}{{n-\vartheta +p-1}\choose {p}} (\frac{y - \varsigma_{1}}{\varsigma_{2} - \varsigma _{1}} )^{p} ]\\ \quad{}\times\frac{(y - \varsigma_{1})^{j}(\varsigma_{1} - z_{1})^{n-j-1}}{j!(n - j - 1)!} (\frac{\varsigma_{2} - y}{\varsigma_{2} - \varsigma_{1}} )^{n - \vartheta}, & \hbox{$\varsigma_{1} \leq z_{1} \leq y \leq\varsigma_{2}$;} \\ - \sum_{j=0}^{n-\vartheta-1} [\sum_{\lambda=0}^{n-\vartheta -j-1}{{\vartheta+\lambda-1}\choose {\lambda}} (\frac{\varsigma_{2} - y}{\varsigma_{2} - \varsigma_{1}} )^{\lambda} ]\\ \quad{}\times\frac{(y - \varsigma_{2})^{j}(\varsigma_{2} - z_{1})^{n-j-1}}{j!(n - j - 1)!} (\frac{y - \varsigma_{1}}{\varsigma_{2} - \varsigma_{1}} )^{\vartheta}, & \hbox{$\varsigma_{1} \leq y \leq z_{1} \leq\varsigma_{2}$.} \end{cases}\displaystyle \end{aligned}$$
(9)

From Theorem C, considering the two-point Taylor conditions, one gets

$$\begin{aligned} \Theta(y) = \sigma_{2T}(y) + R_{2T}(\Theta, y), \end{aligned}$$

where

$$\begin{aligned} \sigma_{2T}(y) =& \sum_{i=0}^{\vartheta-1} \sum_{k=0}^{\vartheta- 1 - i}{{\vartheta+ k - 1}\choose {k}} \biggl[\frac{(y - \varsigma_{1})^{i}}{i!} \biggl(\frac{y - \varsigma _{2}}{\varsigma_{1} - \varsigma_{2}} \biggr)^{\vartheta} \biggl(\frac{y - \varsigma_{1}}{\varsigma_{2} - \varsigma_{1}} \biggr)^{k}\Theta^{(i)}( \varsigma_{1}) \\ &{}+ \frac{(y - \varsigma_{2})^{i}}{i!} \biggl(\frac{y - \varsigma _{1}}{\varsigma_{2} - \varsigma_{1}} \biggr)^{\vartheta} \biggl(\frac{y - \varsigma_{2}}{\varsigma_{1} - \varsigma_{2}} \biggr)^{k}\Theta^{(i)}( \varsigma_{2}) \biggr], \end{aligned}$$

and the remainder \(R_{2T}(\Theta, y)\) is given by

$$\begin{aligned} R_{2T}(\Theta, y) = \int_{\varsigma_{1}}^{\varsigma_{2}}\mathcal{G}_{2T}(y, z_{1})\Theta^{(n)}(z_{1})\,dz_{1} \end{aligned}$$

with

$$\begin{aligned} \mathcal{G}_{2T}(y, z_{1}) = \textstyle\begin{cases} \frac{(-1)^{\vartheta}}{(2\vartheta- 1)!}p^{\vartheta}(y, z_{1})\sum _{j=0}^{\vartheta-1}{{\vartheta-1+j}\choose {j}}(y - z_{1})^{\vartheta -1-j}\delta^{j}(y, z_{1}), & \hbox{$\varsigma_{1} \leq z_{1} \leq y \leq \varsigma_{2}$;} \\ \frac{(-1)^{\vartheta}}{(2\vartheta- 1)!}\delta^{\vartheta}(y, z_{1})\sum _{j=0}^{\vartheta-1}{{\vartheta-1+j}\choose {j}}(z_{1} - y)^{\vartheta -1-j}p^{j}(y, z_{1}), & \hbox{$\varsigma_{1} \leq y \leq z_{1} \leq \varsigma_{2} $,} \end{cases}\displaystyle \end{aligned}$$
(10)

where \(p(y, z_{1}) = \frac{(z_{1} - \varsigma_{1})(\varsigma_{2} - y)}{\varsigma_{2} - \varsigma_{1}}\) and \(\delta(y, z_{1}) = p(z_{1}, y)\) for every \(y, z_{1} \in[\varsigma_{1}, \varsigma_{2}]\).

The nonnegativity of Green’s functions is characterized by Beesack [17] and Levin [45].

Lemma A

  1. (i)

    \(\frac{\mathcal{G}_{\mathcal{H}, n}(y, z_{1})}{\omega(y)} > 0\) for \(d_{1} \leq y \leq d_{s}\), \(d_{1} \leq z_{1} \leq d_{s}\);

  2. (ii)

    \(\mathcal{G}_{\mathcal{H}, n}(y, z_{1}) \leq\frac {1}{(n-1)!(\varsigma_{2} - \varsigma_{1})} \vert \omega(y) \vert \);

  3. (iii)

    \(\int_{\varsigma_{1}}^{\varsigma_{2}}\mathcal{G}_{\mathcal {H}, n}(y, z_{1})\,dz_{1} = \frac{\omega(y)}{n!}\).

3 Csiszár-type inequality on time scales via Hermite interpolation

Let us start with the following main identity.

Theorem 1

Assume the conditions of Theorem B, and consider the points \(\varsigma _{1} = d_{1} < d_{2} < \cdots< d_{s} = \varsigma_{2}\) \((s \geq2)\) with \(\Theta\in C^{n}[\varsigma_{1}, \varsigma_{2}]\). Also let \(\mathcal {H}_{i_{j}}\), \(\mathcal{G}_{\mathcal{H}, n}\) be as defined in (4) and (5). Then

$$\begin{aligned} I_{\Theta}(\mathbf{p}_{1}, \mathbf{p}_{2}) =& \int_{a}^{b}\mathbf{p}_{2}(\xi) \Theta\biggl(\frac{\mathbf{p}_{1}(\xi)}{\mathbf{p}_{2}(\xi)} \biggr)\Delta\xi \\ =& \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \Theta(\varsigma _{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \Theta(\varsigma _{2})-\sum _{j=1}^{s}\sum_{i=0}^{k_{j}} \boldsymbol{\Theta}^{(i)} (d_{j} ) \breve{J} \bigl( \mathcal{H}_{i_{j}}(t) \bigr) \\ &{}- \int_{\varsigma_{1}}^{\varsigma_{2}}\breve{J} \bigl( \mathcal{G}_{\mathcal{H}, n}(t, z_{1}) \bigr)\boldsymbol{ \Theta}^{(n)}(z_{1})\,dz_{1}, \end{aligned}$$
(11)

where

$$ \breve{J}\bigl(\mathcal{H}_{i_{j}}(x)\bigr)= \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \mathcal {H}_{i_{j}}(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \mathcal {H}_{i_{j}}(\varsigma_{2})- \int_{a}^{b}\mathbf{p}_{2}(\xi) \mathcal{H}_{i_{j}} \biggl(\frac{\mathbf{p}_{1}(\xi)}{\mathbf{p}_{2}(\xi )} \biggr)\Delta\xi $$
(12)

and

$$\begin{aligned} \breve{J} \bigl(\mathcal{G}_{\mathcal{H}, n}(x, z_{1}) \bigr) =& \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \mathcal{G}_{\mathcal{H}, n}( \varsigma_{1}, z_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \mathcal{G}_{\mathcal{H}, n}(\varsigma_{2}, z_{1}) \\ &{}- \int_{a}^{b}\mathbf{p}_{2}(\xi) \mathcal{G}_{\mathcal{H}, n} \biggl(\frac{\mathbf{p}_{1}(\xi)}{\mathbf {p}_{2}(\xi)}, z_{1} \biggr)\Delta\xi. \end{aligned}$$
(13)

Proof

Use (3) in (2) together with the linearity of \(\breve{J}\) to obtain (11). □

The following result generalizes identity (11) to n-convex functions.

Theorem 2

Assume all the conditions of Theorem 1, let Θ be an n-convex function, and suppose that

$$ \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \mathcal{G}_{\mathcal{H}, n}( \varsigma_{1}, z_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \mathcal{G}_{\mathcal{H}, n}(\varsigma_{2}, z_{1})- \int_{a}^{b}\mathbf{p}_{2}(\xi) \mathcal{G}_{\mathcal{H}, n} \biggl(\frac{\mathbf{p}_{1}(\xi)}{\mathbf {p}_{2}(\xi)}, z_{1} \biggr)\Delta\xi\geq0, $$
(14)

for every \(z_{1} \in[\varsigma_{1}, \varsigma_{2}]\). Then

$$\begin{aligned}& \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \Theta(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \Theta(\varsigma_{2})- \int_{a}^{b}\mathbf{p}_{2}(\xi) \Theta\biggl(\frac{\mathbf{p}_{1}(\xi)}{\mathbf{p}_{2}(\xi)} \biggr)\Delta\xi \\& \quad \geq\sum_{j=1}^{s}\sum _{i=0}^{k_{j}}\Theta^{(i)} (d_{j} ) \breve{J} \bigl(\mathcal{H}_{i_{j}}(x) \bigr). \end{aligned}$$
(15)

One can write (15) in terms of Csiszár divergence on time scales as follows:

$$\begin{aligned}& \int_{a}^{b}\mathbf{p}_{2}(\xi) \Theta\biggl(\frac{\mathbf{p}_{1}(\xi)}{\mathbf{p}_{2}(\xi)} \biggr)\Delta\xi \\& \quad \leq \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \Theta(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma _{1}} \Theta( \varsigma_{2})- \sum_{j=1}^{s} \sum_{i=0}^{k_{j}}\Theta^{(i)} (d_{j} ) \\& \qquad {}\times\biggl(\frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}( \varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}( \varsigma_{2})- \int_{a}^{b}\mathbf{p}_{2}(\xi) \mathcal{H}_{i_{j}} \biggl(\frac{\mathbf{p}_{1}(\xi)}{\mathbf{p}_{2}(\xi )} \biggr)\Delta\xi \biggr). \end{aligned}$$
(16)

Proof

Since \(\Theta \in C^{n}[\varsigma_{1}, \varsigma_{2}]\) is n-convex, we have \(\Theta^{(n)}(x) \geq0\) for each \(x \in[\varsigma_{1}, \varsigma_{2}]\). Now use Theorem 1 to obtain (15). □

Use Lagrange conditions in (11) to obtain the following result.

Corollary 1

Let the assumptions of Theorem 1 hold, let Θ be an n-convex function, and let \(\mathcal{G}_{L}\) be given in (6). If

$$ \breve{J} \bigl(\mathcal{G}_{L}(x, z_{1}) \bigr)\geq0 \quad z_{1} \in[ \varsigma_{1}, \varsigma_{2}], $$
(17)

then

$$ \breve{J} \bigl(\Theta(\cdot) \bigr) \geq\sum _{j=1}^{n}\Theta(d_{j} ) \breve{J} \Biggl(\prod_{k=1, k \neq j}^{n} \biggl( \frac{y - d_{k}}{d_{j} - d_{k}} \biggr) \Biggr). $$
(18)

Use type \((\vartheta, n-\vartheta)\) conditions in (11) to obtain the following result.

Corollary 2

Let the assumptions of Theorem 1 hold, let Θ be an n-convex function, and let \(\mathcal{G}_{\vartheta,n}\) be defined in (9). If

$$ \breve{J} \bigl(\mathcal{G}_{\vartheta,n}(x, z_{1}) \bigr)\geq0 \quad z_{1} \in[ \varsigma_{1}, \varsigma_{2}], $$
(19)

then

$$ \breve{J} \bigl(\Theta(\cdot) \bigr) \geq\sum _{i=0}^{\vartheta-1}\breve{J} \bigl(\tau_{i}(y) \bigr)\Theta^{(i)}(\varsigma_{1}) + \sum _{i=0}^{n-\vartheta-1}\breve{J} \bigl(\eta_{i}(y) \bigr)\Theta^{(i)}(\varsigma_{2}). $$
(20)

Use the two-point Taylor condition in (11) to get the following corollary.

Corollary 3

Assume the hypotheses of Theorem 1, let Θ be an n-convex function, and let \(\mathcal{G}_{2T}\) be defined in (10). If

$$ \breve{J} \bigl(\mathcal{G}_{2T}(x, z_{1}) \bigr)\geq0 \quad z_{1} \in[ \varsigma_{1}, \varsigma_{2}], $$
(21)

then

$$\begin{aligned} \breve{J} \bigl(\Theta(\cdot) \bigr) \geq&\sum _{i=0}^{\vartheta-1}\sum_{k=0}^{\vartheta- 1 - i} {{\vartheta+ k - 1}\choose {k}} \biggl[\breve{J} \biggl(\frac{(y - \varsigma_{1})^{i}}{i!} \biggl( \frac{y - \varsigma_{2}}{\varsigma_{1} - \varsigma_{2}} \biggr)^{\vartheta} \biggl(\frac{y - \varsigma_{1}}{\varsigma_{2} - \varsigma_{1}} \biggr)^{k} \biggr)\Theta^{(i)}(\varsigma_{1}) \\ &{}+\breve{J} \biggl( \frac{(y - \varsigma_{2})^{i}}{i!} \biggl(\frac {y - \varsigma_{1}}{\varsigma_{2} - \varsigma_{1}} \biggr)^{\vartheta} \biggl(\frac{y - \varsigma_{2}}{\varsigma_{1} - \varsigma_{2}} \biggr)^{k} \biggr)\Theta^{(i)}(\varsigma_{2}) \biggr]. \end{aligned}$$
(22)

Theorem 3

Assume the hypotheses of Theorem 1, let \(\mathbf{p} \in C([a, b]_{\mathbb{T}}, \mathbb{R})\) be positive with \(\int_{a}^{b}\mathbf{p}(t)\Delta t = 1\), and let \(\Theta\in C^{n}[\varsigma_{1}, \varsigma_{2}]\) be an n-convex function.

  1. (i)

    For each \(j = 2, \ldots, s\), if \(k_{j}\) is odd, then (15) is valid.

  2. (ii)

    Let (15) be valid, and the function

    $$\begin{aligned} F(t) = \sum_{j=1}^{s} \sum_{i=0}^{k_{j}}\Theta^{(i)}(d_{j}) \mathcal{H}_{i_{j}}(t) \end{aligned}$$
    (23)

    is convex, the right-hand side of (15) is nonnegative, and

    $$\begin{aligned} \breve{J}\bigl(\Theta(\cdot)\bigr) \geq0. \end{aligned}$$
    (24)

Proof

  1. (i)

    If \(k_{j}\) is odd for each \(j = 2, \ldots, s\), then \(\omega(x) \geq0\), and Lemma A gives \(\mathcal{G}_{\mathcal{H}, n-2}(\cdot, z_{1}) \geq0\). Thus \(\mathcal {G}_{\mathcal{H}, n}(\cdot, z_{1})\) is convex, and \(\breve{J} (\mathcal {G}_{\mathcal{H}, n}(\cdot, z_{1}) ) \geq0\) by Remark 1. Now use Theorem 2 to get (15).

  2. (ii)

    Since (15) holds, its right-hand side can be written in the functional form \(\breve{J}(F)\), and the nonnegativity of this term follows from Remark 1 because F is convex. Using (23) in (15) yields (24).

 □

Remark 2

It is also possible to compute Grüss-, Čebyšev-, and Ostrowski-type bounds corresponding to identity (11).

4 Bounds of divergence measures

In the sequel, X denotes a continuous random variable and \(\bar{b} > 1\) is the base of the logarithm.

Consider a positive probability density function \(p: \mathbb{T}\rightarrow(0, \infty)\) of X on time scales with \(\int_{a}^{b} p(x)\Delta x = 1\), provided the integral exists.

On time scales, differential entropy was introduced by Ansari et al. [13]:

$$ h_{\bar{b}}(X) := \int_{a}^{b} p(x) \log\frac{1}{p(x)} \Delta x. $$
(25)
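
For \(\mathbb{T} = \mathbb{R}\), (25) reduces to an ordinary integral and can be evaluated by quadrature. A minimal Python sketch with an illustrative density and base \(\bar{b} = 2\) (hypothetical choices, not taken from [13]):

```python
import numpy as np
from scipy.integrate import quad

# Differential entropy (25) on T = R, evaluated by quadrature.
# Illustrative choices: base 2 and a linear density on [0, 1].
a, b = 0.0, 1.0
base = 2.0                      # \bar{b} > 1
p = lambda x: x + 0.5           # positive density; integrates to 1 on [0, 1]

h, _ = quad(lambda x: p(x) * np.log(1.0 / p(x)) / np.log(base), a, b)
print(f"h_2(X) = {h:.6f} bits")
# For the uniform density on [0, 1] the same integral gives log_2(b - a) = 0;
# any other density on [0, 1] has smaller differential entropy, as seen here.
```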

Theorem 4

Assume the hypotheses of Theorem 1 with \(\Theta\in C^{n}[\varsigma_{1}, \varsigma_{2}]\) an n-convex function. If n is even, then

$$\begin{aligned} h_{\bar{b}}(X) \geq& \frac{\varsigma_{2}-1}{\varsigma _{2}-\varsigma_{1}} \log( \varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \log(\varsigma_{2})- \int_{a}^{b}\mathbf{p}_{2}(\xi)\log \bigl(\mathbf{p}_{1}(\xi)\bigr)\Delta\xi \\ &{}+ \sum_{j=1}^{s}\sum _{i=0}^{k_{j}}\frac{(-1)^{i-2}(i-1)!}{ (d_{j} )^{i}} \biggl( \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}}\mathcal {H}_{i_{j}}(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \mathcal {H}_{i_{j}}(\varsigma_{2}) \\ &{}- \int_{a}^{b}\mathbf{p}_{2}(\xi) \mathcal{H}_{i_{j}} \biggl(\frac{\mathbf{p}_{1}(\xi)}{\mathbf{p}_{2}(\xi )} \biggr)\Delta\xi \biggr), \end{aligned}$$
(26)

where \(h_{\bar{b}}(X)\) is given in (25).

Proof

Use \(\Theta(x) = -\log x\) in Theorem 2 to get (26). □

The Kullback–Leibler divergence is one of the best-known information divergences; it is widely used in information theory, mathematical statistics, and signal processing (see [75]). On time scales, the Kullback–Leibler divergence is defined by Ansari et al. [14] as

$$ D(\mathbf{p}_{1}, \mathbf{p}_{2}) = \int_{a}^{b} \mathbf{p}_{1}(\xi) \ln\biggl[\frac{\mathbf{p}_{1}(\xi)}{\mathbf{p}_{2}(\xi)} \biggr]\Delta \xi. $$
(27)

Theorem 5

Assume the conditions of Theorem 1 with \(\Theta\in C^{n}[\varsigma_{1}, \varsigma_{2}]\) an n-convex function. If n is even, then

$$\begin{aligned} D(\mathbf{p}_{1}, \mathbf{p}_{2}) \leq& \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \varsigma _{1}\ln(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \varsigma_{2}\ln (\varsigma_{2}) -\sum_{j=1}^{s}\sum _{i=0}^{k_{j}}\frac{(-1)^{i-1}(i-2)!}{ (d_{j} )^{i-1}} \\ & {}\times\biggl(\frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}( \varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}( \varsigma_{2})- \int_{a}^{b}\mathbf{p}_{2}(\xi) \mathcal{H}_{i_{j}} \biggl(\frac{\mathbf{p}_{1}(\xi)}{\mathbf{p}_{2}(\xi )} \biggr)\Delta\xi \biggr), \end{aligned}$$
(28)

where \(D(\mathbf{p}_{1}, \mathbf{p}_{2})\) is given in (27).

Proof

Use \(\Theta(\xi) = \xi\ln\xi\) in Theorem 2 to get (28). □

The Jeffreys distance has many applications in statistics and pattern recognition (see [41, 74]). On time scales, the Jeffreys distance is defined by Ansari et al. [14] as

$$ D_{J}(\mathbf{p}_{1}, \mathbf{p}_{2}) := \int_{a}^{b} \bigl(\mathbf{p}_{1}( \xi) - \mathbf{p}_{2}(\xi)\bigr) \ln\biggl[\frac{\mathbf{p}_{1}(\xi )}{\mathbf{p}_{2}(\xi)} \biggr]\Delta\xi. $$
(29)

Theorem 6

Assume the conditions of Theorem 1 with \(\Theta\in C^{n}[\varsigma_{1}, \varsigma_{2}]\) an n-convex function. If n is even, then

$$\begin{aligned}& D_{J}(\mathbf{p}_{1}, \mathbf{p}_{2}) \\& \quad \leq \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} (\varsigma_{1} -1)\ln(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} (\varsigma_{2} -1)\ln(\varsigma_{2}) -\sum_{j=1}^{s}\sum_{i=0}^{k_{j}} \frac{(-1)^{i}(i-2)!}{ (d_{j} )^{i-1}} \\& \qquad {}\times\biggl(\frac{i-1}{d_{j}} + 1 \biggr) \biggl(\frac{\varsigma _{2}-1}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma _{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}(\varsigma_{2})- \int_{a}^{b}\mathbf{p}_{2}(\xi) \mathcal{H}_{i_{j}} \biggl(\frac{\mathbf{p}_{1}(\xi)}{\mathbf{p}_{2}(\xi )} \biggr)\Delta\xi \biggr), \end{aligned}$$
(30)

where \(D_{J}(\mathbf{p}_{1}, \mathbf{p}_{2})\) is given in (29).

Proof

Use \(\Theta(\xi) = (\xi-1)\ln\xi\) in Theorem 2 to get (30). □

Triangular discrimination has many applications in statistics and information theory (see [41, 73]). On time scales, triangular discrimination is defined by Ansari et al. [14] as

$$ D_{\Delta}(\mathbf{p}_{1}, \mathbf{p}_{2}) = \int_{a}^{b}\frac{[\mathbf{p}_{2}(\xi) - \mathbf{p}_{1}(\xi)]^{2}}{\mathbf{p}_{2}(\xi) + \mathbf{p}_{1}(\xi )}\Delta\xi. $$
(31)

Theorem 7

Assume the conditions of Theorem 1 with \(\Theta\in C^{n}[\varsigma_{1}, \varsigma_{2}]\) an n-convex function. If n is even, then

$$\begin{aligned} D_{\Delta}(\mathbf{p}_{1}, \mathbf{p}_{2}) \leq& \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma _{1}} \frac{(\varsigma_{1} - 1)^{2}}{\varsigma_{1} + 1} + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \frac{(\varsigma _{2} - 1)^{2}}{\varsigma_{2} + 1}-4 \sum_{j=1}^{s} \sum_{i=0}^{k_{j}}\frac{(-1)^{i} (i)!}{ (d_{j} +1 )^{i+1}} \\ &{}\times\biggl(\frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma _{1}}\mathcal{H}_{i_{j}}( \varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}( \varsigma_{2})- \int_{a}^{b}\mathbf{p}_{2}(\xi) \mathcal{H}_{i_{j}} \biggl(\frac{\mathbf{p}_{1}(\xi)}{\mathbf{p}_{2}(\xi )} \biggr)\Delta\xi \biggr), \end{aligned}$$
(32)

where \(D_{\Delta}(\mathbf{p}_{1}, \mathbf{p}_{2})\) is given in (31).

Proof

Use \(\Theta(\xi) = \frac{(\xi- 1)^{2}}{\xi+ 1} \) in Theorem 2 to get (32). □

4.1 Inequalities in classical calculus (continuous case)

In this subsection, new bounds for the Csiszár divergence, differential entropy, Kullback–Leibler divergence, Jeffreys distance, and triangular discrimination are given in the continuous case.

If \(\mathbb{T} = \mathbb{R}\) in Theorem 2, inequality (16) has the following form and gives a new bound for Csiszár divergence:

$$\begin{aligned}& \int_{a}^{b}\mathbf{p}_{2}(\xi) \Theta\biggl(\frac{\mathbf{p}_{1}(\xi)}{\mathbf{p}_{2}(\xi)} \biggr)\,d\xi\\& \quad \leq \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \Theta(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma _{1}} \Theta( \varsigma_{2})- \sum_{j=1}^{s} \sum_{i=0}^{k_{j}}\Theta^{(i)} (d_{j} ) \\& \qquad {}\times\biggl(\frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}( \varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}( \varsigma_{2})- \int_{a}^{b}\mathbf{p}_{2}(\xi) \mathcal{H}_{i_{j}} \biggl(\frac{\mathbf{p}_{1}(\xi)}{\mathbf{p}_{2}(\xi )} \biggr)\,d\xi\biggr). \end{aligned}$$

If \(\mathbb{T} = \mathbb{R}\) in Theorems 4–7, inequalities (26), (28), (30), and (32) take the following new forms, respectively:

$$\begin{aligned}& \begin{gathered} \int_{a}^{b} \mathbf{p}_{2}(\xi) \log\frac{1}{\mathbf{p}_{2}(\xi)} \,d\xi\\ \quad \geq \frac{\varsigma _{2}-1}{\varsigma_{2}-\varsigma_{1}} \log( \varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \log(\varsigma_{2})- \int_{a}^{b}\mathbf{p}_{2}(\xi)\log \bigl(\mathbf{p}_{1}(\xi)\bigr)\,d\xi \\ \qquad {}+\sum_{j=1}^{s}\sum _{i=0}^{k_{j}}\frac{(-1)^{i-2}(i-1)!}{ (d_{j} )^{i}} \biggl( \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}}\mathcal {H}_{i_{j}}(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \mathcal {H}_{i_{j}}(\varsigma_{2})\\ \qquad {}- \int_{a}^{b}\mathbf{p}_{2}(\xi) \mathcal{H}_{i_{j}} \biggl(\frac{\mathbf{p}_{1}(\xi)}{\mathbf{p}_{2}(\xi )} \biggr)\,d\xi\biggr), \end{gathered} \\& \begin{gathered} \int_{a}^{b} \mathbf{p}_{1}(\xi) \ln\frac{\mathbf{p}_{1}(\xi)}{\mathbf{p}_{2}(\xi)} \,d\xi\\ \quad \leq \frac {\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \varsigma_{1} \ln(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma _{1}} \varsigma_{2} \ln(\varsigma_{2}) -\sum_{j=1}^{s} \sum_{i=0}^{k_{j}}(-1)^{i-2} \\ \qquad {}\times\frac{(i-2)!}{ (d_{j} )^{i-1}} \biggl(\frac{\varsigma _{2}-1}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma _{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}(\varsigma_{2})- \int_{a}^{b}\mathbf{p}_{2}(\xi) \mathcal{H}_{i_{j}} \biggl(\frac{\mathbf{p}_{1}(\xi)}{\mathbf{p}_{2}(\xi )} \biggr)\,d\xi\biggr), \end{gathered} \\& \begin{gathered} \int_{a}^{b} \bigl[\mathbf{p}_{1}( \xi)-\mathbf{p}_{2}(\xi)\bigr] \ln\frac{\mathbf{p}_{1}(\xi)}{\mathbf {p}_{2}(\xi)} \,d\xi\\ \quad \leq \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} (\varsigma_{1} -1)\ln(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} (\varsigma_{2} -1)\ln(\varsigma_{2}) -\sum_{i=0}^{k_{j}}(-1)^{i} \\ \qquad {}\times\frac{(i-2)!}{ (d_{j} )^{i-1}} \biggl(\frac{i-1}{d_{j}} + 1 \biggr) \biggl( \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}}\mathcal {H}_{i_{j}}(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \mathcal {H}_{i_{j}}(\varsigma_{2})\\ \qquad {}- \int_{a}^{b}\mathbf{p}_{2}(\xi) \mathcal{H}_{i_{j}} \biggl(\frac{\mathbf{p}_{1}(\xi)}{\mathbf{p}_{2}(\xi )} \biggr)\,d\xi\biggr), \end{gathered} \end{aligned}$$

and

$$\begin{aligned} \int_{a}^{b} \frac{[\mathbf{p}_{2}(\xi)-\mathbf{p}_{1}(\xi )]^{2}}{\mathbf{p}_{1}(\xi) + \mathbf{p}_{2}(\xi)} \,d\xi \leq& \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \frac{(\varsigma _{1} - 1)^{2}}{\varsigma_{1} + 1} +\frac{1-\varsigma_{1}}{\varsigma _{2}-\varsigma_{1}} \frac{(\varsigma_{2} - 1)^{2}}{\varsigma_{2} + 1}-4 \sum_{j=1}^{s} \sum_{i=0}^{k_{j}}(-1)^{i} \\ &{}\times\frac{(i)!}{ (d_{j} +1 )^{i+1}} \biggl(\frac{\varsigma _{2}-1}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma _{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}(\varsigma_{2}) \\ &{}- \int_{a}^{b}\mathbf{p}_{2}(\xi) \mathcal{H}_{i_{j}} \biggl(\frac{\mathbf{p}_{1}(\xi)}{\mathbf{p}_{2}(\xi )} \biggr)\,d\xi\biggr). \end{aligned}$$
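
The divergence measures on the left-hand sides of the above inequalities are ordinary integrals and can be evaluated by quadrature. A minimal Python sketch (the two densities on \([0, 1]\) are illustrative, hypothetical choices) computes the Kullback–Leibler divergence, the Jeffreys distance, and the triangular discrimination, and checks that the Jeffreys distance is the symmetrized Kullback–Leibler divergence:

```python
import numpy as np
from scipy.integrate import quad

# Kullback-Leibler divergence (27), Jeffreys distance (29), and triangular
# discrimination (31) for T = R, evaluated by quadrature (illustrative densities).
a, b = 0.0, 1.0
p1 = lambda x: x + 0.5          # integrates to 1 on [0, 1]
p2 = lambda x: 1.5 - x          # integrates to 1 on [0, 1]

D_KL, _ = quad(lambda x: p1(x) * np.log(p1(x) / p2(x)), a, b)
D_J,  _ = quad(lambda x: (p1(x) - p2(x)) * np.log(p1(x) / p2(x)), a, b)
D_T,  _ = quad(lambda x: (p2(x) - p1(x)) ** 2 / (p2(x) + p1(x)), a, b)

# Jeffreys distance equals the symmetrized Kullback-Leibler divergence.
D_KL_rev, _ = quad(lambda x: p2(x) * np.log(p2(x) / p1(x)), a, b)
print(f"D  = {D_KL:.6f}")
print(f"DJ = {D_J:.6f}  (= D(p1,p2) + D(p2,p1) = {D_KL + D_KL_rev:.6f})")
print(f"DT = {D_T:.6f}")
```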

4.2 Inequalities in h-discrete calculus

The following inequalities give new bounds for the Csiszár divergence, Shannon entropy, Kullback–Leibler divergence, Jeffreys distance, and triangular discrimination in h-discrete calculus. The ordinary discrete cases of these divergence measures are also given in this subsection.

Using \(\mathbb{T} = h\mathbb{Z}\), \(h>0\), in Theorem 2, inequality (16) takes the following form:

$$\begin{aligned}& \sum_{v=\frac{a}{h}}^{\frac{b}{h} - 1} \mathbf{p}_{2}(vh)h \Theta\biggl(\frac{\mathbf{p}_{1}(vh)}{\mathbf {p}_{2}(vh)} \biggr) \\ & \quad \leq \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \Theta (\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \Theta(\varsigma _{2})- \sum _{j=1}^{s}\sum_{i=0}^{k_{j}} \Theta^{(i)} (d_{j} ) \\ & \qquad {}\times\Biggl(\frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}( \varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}( \varsigma_{2})-\sum_{v=\frac{a}{h}}^{\frac{b}{h} - 1} \mathbf{p}_{2}(vh)h\mathcal{H}_{i_{j}} \biggl( \frac{\mathbf{p}_{1}(vh)}{\mathbf{p}_{2}(vh)} \biggr) \Biggr). \end{aligned}$$
(33)

Using \(\mathbb{T} = h\mathbb{Z}\), \(h>0\), in Theorems 4–7, inequalities (26), (28), (30), and (32) take the following new forms in h-discrete calculus, respectively:

$$\begin{aligned}& \begin{gathered}[b] \sum_{v=\frac{a}{h}}^{\frac{b}{h} - 1} \mathbf{p}_{2}(vh)h \log\frac{1}{\mathbf{p}_{2}(vh)h} \\ \quad \geq \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \log(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \log(\varsigma_{2})- \sum_{v=\frac{a}{h}}^{\frac{b}{h} - 1} \mathbf{p}_{2}(vh)h \\ \qquad {}\times\log\bigl(\mathbf{p}_{1}(vh)h\bigr) + \sum _{j=1}^{s}\sum_{i=0}^{k_{j}} \frac{(-1)^{i-2}(i-1)!}{ (d_{j} )^{i}} \Biggl(\frac{\varsigma _{2}-1}{\varsigma_{2}-\varsigma_{1}}\mathcal{H}_{i_{j}}( \varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}( \varsigma_{2}) \\ \qquad {}-\sum_{v=\frac{a}{h}}^{\frac{b}{h} - 1} \mathbf{p}_{2}(vh)h\mathcal{H}_{i_{j}} \biggl( \frac{\mathbf{p}_{1}(vh)}{\mathbf{p}_{2}(vh)} \biggr) \Biggr), \end{gathered} \end{aligned}$$
(34)
$$\begin{aligned}& \begin{gathered}[b] \sum_{v=\frac{a}{h}}^{\frac{b}{h} - 1} \mathbf{p}_{1}(vh)h \ln\biggl[\frac{\mathbf{p}_{1}(vh)}{\mathbf {p}_{2}(vh)} \biggr] \\ \quad \leq \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \varsigma_{1}\ln (\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \varsigma_{2}\ln (\varsigma_{2}) -\sum_{j=1}^{s}\sum _{i=0}^{k_{j}}(-1)^{i-2} \\ \qquad {}\times\frac{(i-2)!}{ (d_{j} )^{i-1}} \Biggl(\frac{\varsigma _{2}-1}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma _{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}(\varsigma_{2})- \sum _{v=\frac{a}{h}}^{\frac{b}{h} - 1} \mathbf{p}_{2}(vh)h \mathcal{H}_{i_{j}} \biggl(\frac{\mathbf{p}_{1}(vh)}{\mathbf{p}_{2}(vh)} \biggr) \Biggr), \end{gathered} \end{aligned}$$
(35)
$$\begin{aligned}& \begin{gathered}[b] \sum_{v=\frac{a}{h}}^{\frac{b}{h} - 1} (\mathbf{p}_{1}- \mathbf{p}_{2}) (vh)h \ln \frac{\mathbf{p}_{1}(vh)}{\mathbf{p}_{2}(vh)} \\ \quad \leq \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} (\varsigma_{1}- 1)\ln(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} (\varsigma_{2}- 1) \ln(\varsigma_{2}) \\ \qquad {}-\sum_{j=1}^{s}\sum_{i=0}^{k_{j}}\frac{(-1)^{i}(i-2)!}{ (d_{j} )^{i-1}} \biggl(\frac{i-1}{d_{j}} + 1 \biggr) \Biggl(\frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}(\varsigma_{2})\\ \qquad {}- \sum_{v=\frac{a}{h}}^{\frac{b}{h} - 1} \mathbf{p}_{2}(vh)h \times\mathcal{H}_{i_{j}} \biggl(\frac{\mathbf{p}_{1}(vh)}{\mathbf{p}_{2}(vh)} \biggr) \Biggr), \end{gathered} \end{aligned}$$
(36)

and

$$\begin{aligned}& \sum_{v=\frac{a}{h}}^{\frac{b}{h} - 1} h \frac{[\mathbf{p}_{2}(vh)-\mathbf{p}_{1}(vh)]^{2}}{\mathbf{p}_{1}(vh) + \mathbf{p}_{2}(vh)} \\& \quad \leq \frac{\varsigma_{2}-1}{\varsigma _{2}-\varsigma_{1}} \frac{(\varsigma_{1} - 1)^{2}}{\varsigma_{1} + 1} + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \frac{(\varsigma _{2} - 1)^{2}}{\varsigma_{2} + 1}-4 \sum_{j=1}^{s} \sum_{i=0}^{k_{j}}(-1)^{i} \\& \qquad {}\times\frac{ (i)!}{ (d_{j} +1 )^{i+1}} \Biggl(\frac{\varsigma _{2}-1}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma _{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}(\varsigma_{2}) \\& \qquad {}-\sum _{v=\frac{a}{h}}^{\frac{b}{h} - 1} \mathbf{p}_{2}(vh)h \mathcal{H}_{i_{j}} \biggl(\frac{\mathbf{p}_{1}(vh)}{\mathbf{p}_{2}(vh)} \biggr) \Biggr). \end{aligned}$$
(37)
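
On \(\mathbb{T} = h\mathbb{Z}\) the Δ-integral over \([a, b)\) is the finite sum \(\sum_{v=a/h}^{b/h-1} f(vh)h\), so the left-hand sides above are Riemann-type sums that approach the corresponding integrals of Sect. 4.1 as \(h \rightarrow 0\). The following minimal Python sketch (using the same illustrative densities as before, hypothetical data) shows this for the Kullback–Leibler sum appearing in (35):

```python
import numpy as np

# h-discrete Kullback-Leibler sum: sum_{v=a/h}^{b/h-1} p1(vh) h ln(p1(vh)/p2(vh)),
# which approaches the continuous Kullback-Leibler integral as h -> 0.
a, b = 0.0, 1.0
p1 = lambda x: x + 0.5
p2 = lambda x: 1.5 - x

def kl_h(h):
    v = np.arange(round(a / h), round(b / h))   # v = a/h, ..., b/h - 1
    x = v * h
    return np.sum(p1(x) * h * np.log(p1(x) / p2(x)))

for h in (0.1, 0.01, 0.001):
    print(f"h = {h:>5}:  KL_h = {kl_h(h):.6f}")
```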

Remark 3

If \(h = 1\), \(a = 0\), \(b = m\), \(\mathbf{p}_{1}(v) = (\mathbf{p}_{1})_{v}\), and \(\mathbf{p}_{2}(v) = (\mathbf{p}_{2})_{v}\), inequality (33) takes the following new form and gives a new bound for discrete Csiszár divergence:

$$\begin{aligned} \sum_{v = 1}^{m} (\mathbf{p}_{2})_{v} \Theta\biggl(\frac{(\mathbf{p}_{1})_{v}}{(\mathbf{p}_{2})_{v}} \biggr) \leq& \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \Theta( \varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \Theta(\varsigma_{2})- \sum_{j=1}^{s}\sum _{i=0}^{k_{j}}\Theta^{(i)} (d_{j} ) \\ &{}\times\Biggl(\frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}( \varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}( \varsigma_{2})- \sum_{v = 1}^{m} (\mathbf{p}_{2})_{v}\mathcal{H}_{i_{j}} \biggl(\frac{(\mathbf{p}_{1})_{v}}{(\mathbf{p}_{2})_{v}} \biggr) \Biggr). \end{aligned}$$

Remark 4

Put \(h = 1 \), \(a = 0\), \(b = m\), \(\mathbf{p}_{1}(v) = (\mathbf {p}_{1})_{v}\), and \(\mathbf{p}_{2}(v) = (\mathbf{p}_{2})_{v}\), inequality (34) takes the following form and gives a new bound for discrete Shannon entropy:

$$\begin{aligned} S =&\sum_{v = 1}^{m} ( \mathbf{p}_{2})_{v} \log\frac{1}{(\mathbf{p}_{2})_{v}} \\ \geq& \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \log(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \log(\varsigma_{2})- \sum_{v = 1}^{m} (\mathbf{p}_{2})_{v}\log\bigl((\mathbf{p}_{1})_{v} \bigr) \\ & {}+\sum_{j=1}^{s}\sum _{i=0}^{k_{j}}\frac{(-1)^{i-2}(i-1)!}{ (d_{j} )^{i}} \Biggl( \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}}\mathcal {H}_{i_{j}}(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \mathcal {H}_{i_{j}}(\varsigma_{2}) \\ & {}- \sum_{v = 1}^{m} (\mathbf{p}_{2})_{v} \mathcal{H}_{i_{j}} \biggl(\frac{(\mathbf{p}_{1})_{v}}{(\mathbf{p}_{2})_{v}} \biggr) \Biggr). \end{aligned}$$
(38)

Remark 5

Consider \(h = 1\), \(a = 0\), \(b = m\), \(\mathbf{p}_{1}(v) = (\mathbf {p}_{1})_{v}\), and \(\mathbf{p}_{2}(v) = (\mathbf{p}_{2})_{v}\), inequality (35) takes the following form and gives a new bound for discrete Kullback–Leibler divergence:

$$\begin{aligned} KL(\mathbf{p}_{1}, \mathbf{p}_{2}) =&\sum_{v = 1}^{m} (\mathbf{p}_{1})_{v} \ln\frac{(\mathbf{p}_{1})_{v}}{(\mathbf{p}_{2})_{v}} \\ \leq&\frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \varsigma_{1} \ln(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \varsigma_{2} \ln(\varsigma_{2}) \\ &{}-\sum_{j=1}^{s}\sum_{i=0}^{k_{j}}(-1)^{i-2} \frac{(i-2)!}{ (d_{j} )^{i-1}} \Biggl(\frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}( \varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}( \varsigma_{2}) \\ &{}- \sum_{v = 1}^{m} (\mathbf{p}_{2})_{v}\mathcal{H}_{i_{j}} \biggl(\frac{(\mathbf{p}_{1})_{v}}{(\mathbf{p}_{2})_{v}} \biggr) \Biggr). \end{aligned}$$
(39)

Remark 6

Put \(h = 1\), \(a = 0\), \(b = m\), \(\mathbf{p}_{1}(v) = (\mathbf {p}_{1})_{v}\), and \(\mathbf{p}_{2}(v) = (\mathbf{p}_{2})_{v}\), inequality (36) takes the following new form and gives a new bound for discrete Jeffreys distance:

$$\begin{aligned} J_{a}(\mathbf{p}_{1}, \mathbf{p}_{2}) =&\sum_{v = 1}^{m} (\mathbf{p}_{1} - \mathbf{p}_{2})_{v} \ln \frac{(\mathbf{p}_{1})_{v}}{(\mathbf{p}_{2})_{v}} \\ \leq& \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} (\varsigma_{1}- 1) \ln(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} (\varsigma_{2}- 1)\ln(\varsigma_{2}) \\ &{}-\sum_{j=1}^{s}\sum_{i=0}^{k_{j}} \frac{(-1)^{i}(i-2)!}{ (d_{j} )^{i-1}} \biggl(\frac{i-1}{d_{j}} + 1 \biggr) \Biggl( \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}}\mathcal{H}_{i_{j}}(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}(\varsigma_{2}) \\ &{}- \sum_{v = 1}^{m} ( \mathbf{p}_{2})_{v}\mathcal{H}_{i_{j}} \biggl( \frac{(\mathbf{p}_{1})_{v}}{(\mathbf{p}_{2})_{v}} \biggr) \Biggr). \end{aligned}$$
(40)

Remark 7

Take \(h = 1\), \(a = 0\), \(b = m\), \(\mathbf{p}_{1}(v) = (\mathbf {p}_{1})_{v}\), and \(\mathbf{p}_{2}(v) = (\mathbf{p}_{2})_{v}\), inequality (37) takes the following new form and gives a new bound for discrete triangular discrimination:

$$\begin{aligned}& \sum_{v = 1}^{m} \frac{[(\mathbf{p}_{2})_{v}-(\mathbf {p}_{1})_{v}]^{2}}{(\mathbf{p}_{1})_{v} + (\mathbf{p}_{2})_{v}} \\& \quad \leq \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \frac {(\varsigma_{1} - 1)^{2}}{\varsigma_{1} + 1} +\frac{1-\varsigma _{1}}{\varsigma_{2}-\varsigma_{1}} \frac{(\varsigma_{2} - 1)^{2}}{\varsigma_{2} + 1}-4 \sum_{j=1}^{s} \sum_{i=0}^{k_{j}}(-1)^{i} \\& \qquad {}\times\frac{ (i)!}{ (d_{j} +1 )^{i+1}} \Biggl(\frac{\varsigma _{2}-1}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma _{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}(\varsigma_{2})- \sum _{v = 1}^{m} (\mathbf{p}_{2})_{v} \mathcal{H}_{i_{j}} \biggl(\frac{(\mathbf{p}_{1})_{v}}{(\mathbf{p}_{2})_{v}} \biggr) \Biggr). \end{aligned}$$
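
The discrete measures appearing in Remarks 3–7 are finite sums and are straightforward to evaluate. A minimal Python sketch with two illustrative (hypothetical) probability vectors:

```python
import numpy as np

# Discrete divergence measures from Remarks 3-7 (illustrative data, m = 4).
p1 = np.array([0.1, 0.2, 0.3, 0.4])
p2 = np.array([0.25, 0.25, 0.25, 0.25])

shannon  = np.sum(p2 * np.log2(1.0 / p2))          # discrete Shannon entropy of p2
kl       = np.sum(p1 * np.log(p1 / p2))            # discrete Kullback-Leibler divergence
jeffreys = np.sum((p1 - p2) * np.log(p1 / p2))     # discrete Jeffreys distance
triang   = np.sum((p2 - p1) ** 2 / (p1 + p2))      # discrete triangular discrimination

print(f"S = {shannon:.4f}, KL = {kl:.4f}, J = {jeffreys:.4f}, Delta = {triang:.4f}")
# KL, J, and Delta are nonnegative and vanish exactly when p1 = p2.
```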

4.3 Inequalities in q-calculus

The following inequalities give new bounds in q-calculus for Csiszár divergence, Shannon entropy, Kullback–Leibler divergence, triangular discrimination, and Jeffreys distance.

Put \(\mathbb{T} = q^{\mathbb{N}_{0}}\) (\(q > 1\)), \(b = q^{m}\), and \(a = q^{k}\) (\(k < m\)), inequality (16) becomes

$$\begin{aligned}& \sum_{v=k}^{m - 1} q^{v+1} \mathbf{p}_{2}\bigl(q^{v}\bigr)\Theta\biggl( \frac{\mathbf{p}_{1}(q^{v})}{\mathbf{p}_{2}(q^{v})} \biggr) \\& \quad \leq\frac {\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \Theta( \varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \Theta(\varsigma_{2})- \sum_{j=1}^{s}\sum _{i=0}^{k_{j}}\Theta^{(i)} (d_{j} ) \\& \qquad {} \times\Biggl(\frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}( \varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}( \varsigma_{2})- \sum_{v = k}^{m} q^{v+1}\mathbf{p}_{2}\bigl(q^{v}\bigr) \mathcal{H}_{i_{j}} \biggl(\frac{\mathbf{p}_{1}(q^{v})}{\mathbf {p}_{2}(q^{v})} \biggr) \Biggr). \end{aligned}$$

Use \(\mathbb{T} = q^{\mathbb{N}_{0}}\), \(q > 1\), \(a = q^{k}\), and \(b = q^{m}\) with \(k < m\) in Theorems 4–7; then inequalities (26), (28), (30), and (32) take the following new forms in quantum calculus, respectively:

$$\begin{aligned}& \begin{aligned}&\sum_{v=k}^{m - 1} q^{v+1}\mathbf{p}_{2}\bigl(q^{v}\bigr) \log \frac{1}{\mathbf{p}_{2}(q^{v})} \\ &\quad \geq \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \log(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \log(\varsigma_{2})- \sum_{v=k}^{m - 1} q^{v+1} \mathbf{p}_{2}\bigl(q^{v}\bigr) \\ &\qquad {}\times\log\bigl(\mathbf{p}_{1}\bigl(q^{v}\bigr) \bigr) + \sum_{j=1}^{s}\sum_{i=0}^{k_{j}}\frac{(-1)^{i-2}(i-1)!}{ (d_{j} )^{i}} \Biggl( \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}}\mathcal{H}_{i_{j}}(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}(\varsigma_{2}) \\ &\qquad {}-\sum_{v = k}^{m} q^{v+1} \mathbf{p}_{2}\bigl(q^{v}\bigr)\mathcal{H}_{i_{j}} \biggl(\frac{\mathbf{p}_{1}(q^{v})}{\mathbf{p}_{2}(q^{v})} \biggr) \Biggr), \end{aligned} \\& \begin{aligned} &\sum_{v = k}^{m - 1} q^{v+1} \mathbf{p}_{1}\bigl(q^{v}\bigr) \ln \frac{\mathbf{p}_{1}(q^{v})}{\mathbf{p}_{2}(q^{v})}\\ &\quad \leq \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \varsigma_{1}\ln( \varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \varsigma_{2}\ln( \varsigma_{2}) -\sum_{j=1}^{s} \sum_{i=0}^{k_{j}}(-1)^{i-2} \\ &\qquad {}\times\frac{(i-2)!}{ (d_{j} )^{i-1}} \Biggl(\frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}(\varsigma_{2})- \sum_{v = k}^{m} q^{v+1} \mathbf{p}_{2}\bigl(q^{v}\bigr)\mathcal{H}_{i_{j}} \biggl(\frac{\mathbf{p}_{1}(q^{v})}{\mathbf{p}_{2}(q^{v})} \biggr) \Biggr), \end{aligned} \\& \begin{aligned} &\sum_{v = k}^{m - 1} q^{v+1}\bigl[\mathbf{p}_{1}\bigl(q^{v} \bigr)-\mathbf{p}_{2}\bigl(q^{v}\bigr)\bigr] \ln \frac{\mathbf{p}_{1}(q^{v})}{\mathbf{p}_{2}(q^{v})} \\ &\quad \leq \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} (\varsigma_{1}- 1)\ln(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} (\varsigma_{2}- 1)\ln(\varsigma_{2}) \\ &\qquad {} -\sum_{j=1}^{s}\sum_{i=0}^{k_{j}} \frac{(-1)^{i}(i-2)!}{ (d_{j} )^{i-1}} \biggl(\frac{i-1}{d_{j}} + 1 \biggr) \Biggl( \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}}\mathcal{H}_{i_{j}}(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}(\varsigma_{2}) \\ &\qquad {}-\sum_{v = k}^{m} q^{v+1} \mathbf{p}_{2}\bigl(q^{v}\bigr)\mathcal{H}_{i_{j}} \biggl(\frac{\mathbf{p}_{1}(q^{v})}{\mathbf{p}_{2}(q^{v})} \biggr) \Biggr), \end{aligned} \end{aligned}$$

and

$$\begin{aligned}& \sum_{v = k}^{m - 1} q^{v+1} \frac{[\mathbf{p}_{2}(q^{v})-\mathbf{p}_{1}(q^{v})]^{2}}{\mathbf {p}_{1}(q^{v}) + \mathbf{p}_{2}(q^{v})} \\& \quad \leq \frac{\varsigma _{2}-1}{\varsigma_{2}-\varsigma_{1}} \frac{(\varsigma_{1} - 1)^{2}}{\varsigma_{1} + 1} + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \frac{(\varsigma _{2} - 1)^{2}}{\varsigma_{2} + 1}-4 \sum_{j=1}^{s} \sum_{i=0}^{k_{j}}(-1)^{i} \\& \qquad {}\times\frac{ (i)!}{ (d_{j} +1 )^{i+1}} \Biggl(\frac{\varsigma _{2}-1}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma _{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}(\varsigma_{2})-\sum _{v = k}^{m} q^{v+1} \mathbf{p}_{2}\bigl(q^{v}\bigr)\mathcal{H}_{i_{j}} \biggl(\frac{\mathbf{p}_{1}(q^{v})}{\mathbf{p}_{2}(q^{v})} \biggr) \Biggr). \end{aligned}$$

5 Zipf–Mandelbrot law

In the field of information sciences, Zipf’s law is used for indexing [29, 67] and in ecological field studies [53], and it plays an important role in art, for example in identifying aesthetic criteria in music [48].

For \(m\in\{1, 2, \dots\}, c \geq0\), and \(l >0\), the Zipf–Mandelbrot law (probability mass function) is defined as

$$ f(v; m, c, l)=\frac{1}{(v+c)^{l}H_{m, c, l}}, \quad v=1, \ldots, m, $$
(41)

where

$$ H_{m, c, l}=\sum_{u=1}^{m} \frac{1}{(u+c)^{l}} $$
(42)

is a generalization of the harmonic number.

For \(m\in\{1, 2, \dots\}\), \(c \geq0\), and \(l >0\), the Zipf–Mandelbrot entropy is defined as

$$ Z(H; c, l)=\frac{l}{H_{m, c, l}}\sum _{v=1}^{m}\frac{\ln(v+c)}{(v+c)^{l}} + \ln(H_{m, c, l}). $$
(43)

Assume

$$ q_{v} = f(v; m, c, l)=\frac{1}{(v+c)^{l}H_{m, c, l}}. $$
(44)
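
A minimal Python sketch (with illustrative parameter values) computes the pmf (41), the generalized harmonic number (42), and the entropy (43), and verifies numerically that (43) coincides with the Shannon entropy (taken with the natural logarithm) of the pmf (41):

```python
import numpy as np

# Zipf-Mandelbrot law (41)-(42) and entropy (43); parameter values are illustrative.
m, c, l = 10, 2.0, 1.5

v = np.arange(1, m + 1)
H = np.sum(1.0 / (v + c) ** l)          # generalized harmonic number (42)
q = 1.0 / ((v + c) ** l * H)            # probability mass function (41); sums to 1

Z = (l / H) * np.sum(np.log(v + c) / (v + c) ** l) + np.log(H)   # entropy (43)

# (43) equals the Shannon entropy (natural logarithm) of the pmf (41).
shannon = -np.sum(q * np.log(q))
print(f"sum(q) = {q.sum():.6f}, Z = {Z:.6f}, Shannon entropy = {shannon:.6f}")
```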

Use \((\mathbf{p}_{2})_{v} = \frac{1}{(v+c)^{l}H_{m, c, l}}\) in (38) to get the following result, which links the Zipf–Mandelbrot entropy (43) with the discrete Shannon entropy:

$$\begin{aligned} Z(H; c, l) \geq& \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \log(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \log(\varsigma _{2})- \sum _{v = 1}^{m} \biggl(\frac{1}{(v+c)^{l}H_{m, c, l}} \biggr)~ \log\bigl((\mathbf{p}_{1})_{v}\bigr) \\ &{}+\sum_{j=1}^{s}\sum _{i=0}^{k_{j}}\frac{(-1)^{i-2}(i-1)!}{ (d_{j} )^{i}} \Biggl( \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}}\mathcal {H}_{i_{j}}(\varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \mathcal {H}_{i_{j}}(\varsigma_{2}) \\ &{}-\sum_{v = 1}^{m} \biggl( \frac{1}{(v+c)^{l}H_{m, c, l}} \biggr)\times\mathcal{H}_{i_{j}} \bigl(( \mathbf{p}_{1})_{v} (v+c)^{l}H_{m, c, l} \bigr) \Biggr). \end{aligned}$$

Use \((\mathbf{p}_{1})_{v} = \frac{1}{(v+c_{1})^{l_{1}}H_{m, c_{1}, l_{1}}}\) and \((\mathbf{p}_{2})_{v} = \frac{1}{(v+c_{2})^{l_{2}}H_{m, c_{2}, l_{2}}}\) in (39) to get the following result, which links the Zipf–Mandelbrot entropy (43) with the Kullback–Leibler divergence:

$$\begin{aligned} Z(H; c_{1}, l_{1}) \geq& \frac{l_{2}}{H_{m, c_{1}, l_{1}}}\sum_{v=1}^{m}\frac{\ln(v+c_{2})}{(v+c_{1})^{l_{1}}} + \ln(H_{m, c_{2}, l_{2}}) - \frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \varsigma_{1}\ln( \varsigma_{1}) - \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \varsigma_{2}\ln( \varsigma_{2}) \\ &{}+\sum_{j=1}^{s}\sum_{i=0}^{k_{j}}(-1)^{i-1} \frac{(i-2)!}{ (d_{j} )^{i-1}} \Biggl(\frac{\varsigma_{2}-1}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}( \varsigma_{1}) + \frac{1-\varsigma_{1}}{\varsigma_{2}-\varsigma_{1}} \mathcal{H}_{i_{j}}( \varsigma_{2}) \\ &{}- \sum_{v = 1}^{m} \biggl( \frac{1}{(v+c_{2})^{l_{2}}H_{m, c_{2}, l_{2}}} \biggr)\times\mathcal{H}_{i_{j}} \biggl( \frac{(v+c_{2})^{l_{2}}H_{m, c_{2}, l_{2}}}{(v+c_{1})^{l_{1}}H_{m, c_{1}, l_{1}}} \biggr) \Biggr), \end{aligned}$$

where, as in (42), \(H_{m, c_{1}, l_{1}} = \sum_{u=1}^{m}\frac{1}{(u+c_{1})^{l_{1}}}\) and \(H_{m, c_{2}, l_{2}} = \sum_{u=1}^{m}\frac{1}{(u+c_{2})^{l_{2}}}\).

Remark 8

Similarly, use \((\mathbf{p}_{1})_{v} = \frac{1}{(v+c_{1})^{l_{1}}H_{m, c_{1}, l_{1}}}\) and \((\mathbf{p}_{2})_{v} = \frac{1}{(v+c_{2})^{l_{2}}H_{m, c_{2}, l_{2}}}\) in (40) to obtain the relationship between the Jeffreys distance \(J(\mathbf{p}_{1}, \mathbf{p}_{2})\) and the Zipf–Mandelbrot entropy (43).