1 Introduction

Shannon’s entropy [9] is defined as

$$\begin{aligned} H(P)\equiv -\sum _{i=1}^{n}p_{i}\log p_{i}, \end{aligned}$$

where \(P=(p_{1},\ldots ,p_{n})\) is a finite probability distribution. (Here and elsewhere in this paper, \(\log \) denotes the natural logarithm.) It is nonnegative and its maximum value is \(H(U)=\log n\), where \(U=(1/n,\ldots ,1/n)\). Throughout the paper we use the convention \(0\log 0=0\).
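The definition above is easy to check numerically. Below is a minimal Python sketch (the paper's own computations used MATLAB; the function name `shannon_entropy` is ours) implementing \(H(P)\) with the convention \(0\log 0=0\) and verifying that the uniform distribution attains \(\log n\):

```python
import math

def shannon_entropy(probs):
    """Shannon entropy H(P) = -sum p_i log p_i (natural log),
    with the convention 0 * log 0 = 0 (zero terms are skipped)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

n = 4
U = [1.0 / n] * n
# H(U) = log n is the maximum value of H over n-point distributions
print(shannon_entropy(U), math.log(n))
```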

The known recursivity (grouping) property of Shannon’s entropy (see for instance [1, 2]) states that

$$\begin{aligned} H\left( p_{1},p_{2},\ldots ,p_{n}\right) =H\left( p_{1}+p_{2},\ldots ,p_{n}\right) +\left( p_{1}+p_{2}\right) H\left( \frac{p_{1}}{p_{1}+p_{2}},\frac{p_{2}}{p_{1}+p_{2}}\right) . \end{aligned}$$
(1.1)
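The grouping identity (1.1) can be verified numerically on any concrete distribution; the short Python check below (our naming) confirms it on a sample four-point distribution:

```python
import math

def H(probs):
    # Shannon entropy, natural log, with 0 * log 0 = 0
    return -sum(p * math.log(p) for p in probs if p > 0)

P = [0.1, 0.2, 0.3, 0.4]
s = P[0] + P[1]
lhs = H(P)                                        # H(p1, p2, ..., pn)
rhs = H([s] + P[2:]) + s * H([P[0] / s, P[1] / s])  # right side of (1.1)
print(abs(lhs - rhs))  # ~0: the two sides agree
```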

The seemingly simple question "how do slight modifications of the probabilities affect the entropy?" appears to have few answers in the literature, and we stated the following open problem in [6].

Open Problem. Find the lower bound (threshold) \(a\left( k\right) \ge 0\) such that, if the probability distribution \(P=\left( p_{1},\ldots ,p_{n}\right) \) has at least k nonzero and equal components \(\ge a\left( k\right) \), then the Shannon entropy \(H\left( P\right) \) attains its minimum when \(n-k\) components of P are zero. In other words, find the best (smallest) \(a\left( k\right) \) such that

$$\begin{aligned} H\left( p_{1},\ldots ,p_{n-k},p,\ldots ,p\right) \ge H\left( 0,\ldots ,0,1/k,\ldots ,1/k\right) \end{aligned}$$

for all probability distributions \(P=\left( p_{1},\ldots ,p_{n-k},p,\ldots ,p\right) \in \mathbb {R} _{+}^{n}\) such that \(p>0\) and \(a\left( k\right) \le p\le 1/k\ (k\le n-1)\). Obviously \(a\left( k\right) \le 1/k.\)

2 Main results

Our starting point is the following answer given in [3], which is useful for computer-assisted analysis of experimental data.

Proposition 1

Let the probability distribution \(P=\left( p_{1},\ldots ,p_{n}\right) \) be such that it has at least k nonzero and equal components \(p_{n-k+1}= \cdots =p_{n}=p\). The best (smallest) \(a\left( k\right) \ge 0\) such that

$$\begin{aligned} H\left( p_{1},\ldots ,p_{n-k},p,\ldots ,p\right) \ge H\left( 0,\ldots ,0,1/k,\ldots ,1/k\right) , \end{aligned}$$
(2.1)

for all P for which additionally \(a\left( k\right) \le p\le 1/k\) holds, is the abscissa of the first intersection of the horizontal line \(y=\log (k)\) with the graph of the function

$$\begin{aligned} f_{k}\left( p\right) =-kp\log \left( p\right) -\left( 1-kp\right) \log \left( 1-kp\right) ,\text { }0\le p\le 1/k. \end{aligned}$$

Figure 1 shows these intersections for \(k=1,\ldots ,5.\)
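Since \(f_{k}\) increases from \(f_{k}(0)=0\) to its maximum \(\log (k+1)\) on \([0,1/(k+1)]\), the first crossing with \(y=\log (k)\) can be located by bisection on that interval. The following Python sketch (our naming; the paper's own figures were produced with MATLAB) computes \(a(k)\) this way:

```python
import math

def f(k, p):
    """f_k(p) = -k p log p - (1 - k p) log(1 - k p), with 0 log 0 = 0."""
    t = 0.0
    if p > 0:
        t -= k * p * math.log(p)
    if 1 - k * p > 0:
        t -= (1 - k * p) * math.log(1 - k * p)
    return t

def a(k, tol=1e-12):
    """First solution of f_k(p) = log k, found by bisection.
    f_k is increasing on [0, 1/(k+1)] with f_k(0) = 0 < log k
    and f_k(1/(k+1)) = log(k+1) > log k, so the first crossing
    lies inside that interval."""
    lo, hi = 0.0, 1.0 / (k + 1)
    target = math.log(k)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(k, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for k in range(2, 6):
    print(k, a(k))
```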

In [3], the proof of this result was reduced to the fact that \(a\left( k\right) \) is given as the smallest solution p of the equation

$$\begin{aligned} -kp\log \left( p\right) -\left( 1-kp\right) \log \left( 1-kp\right) =\log (k). \end{aligned}$$
(2.2)

The maximum of the function \(f_{k}\left( p\right) \) is \(\log \left( k+1\right) .\) Therefore, we are interested in the part of the graph lying between the horizontal lines \(y=\log (k)\) and \(y=\log (k+1).\) The line \( y=\log (k)\) meets the graph of \(f_{k}\left( p\right) \) twice: one intersection point has abscissa equal to the required bound \(a\left( k\right) ,\) the other is situated at the right endpoint of the domain of \(f_{k}\left( p\right) ,\) namely \(p=1/k.\)

In [3] we also provided some particular estimates of \(a\left( k\right) \) of practical interest, computed with the software package MATLAB, as in Fig. 1.

Fig. 1

The plot of the function \(f_{k}\left( p\right) \ \) for \(k=1,\ldots ,5\)

In what follows, we look for a nicer (though still implicit) formula for the first solution \(a\left( k\right) \) of equation (2.2). As a by-product, equation (2.2) is also solved in the case when k is not an integer, a fact we consider of some theoretical importance.

If \(x:=kp\), equation (2.2) takes the form

$$\begin{aligned} F(k,x):=\log k+x\log \frac{x}{k}+(1-x)\log (1-x)=0. \end{aligned}$$
(2.3)

Since \(F(k,x)=(1-x)\log k+x\log x+(1-x)\log (1-x)\), (2.3) can be solved for k, and the solution is

$$\begin{aligned} k=\frac{x^{\frac{x}{x-1}}}{1-x},\qquad 0<x<1. \end{aligned}$$

As a result we obtain \(p=a(k)\) as a function of \(x=kp\):

$$\begin{aligned} p=p(x)=\frac{x}{k}=(1-x)x^{\frac{1}{1-x}}. \end{aligned}$$
(2.4)
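The parametrization can be spot-checked numerically: for any \(x\in (0,1)\), the pair \(k=x^{x/(x-1)}/(1-x)\), \(p=(1-x)x^{1/(1-x)}\) should satisfy (2.3) and \(p=x/k\). A short Python sketch (our naming):

```python
import math

def F(k, x):
    # F(k, x) = log k + x log(x/k) + (1 - x) log(1 - x), cf. (2.3)
    return math.log(k) + x * math.log(x / k) + (1 - x) * math.log(1 - x)

for x in [0.1, 0.3, 0.5, 0.7, 0.9]:
    k = x ** (x / (x - 1)) / (1 - x)      # solution of (2.3) in k
    p = (1 - x) * x ** (1 / (1 - x))      # p = a(k) as in (2.4)
    assert abs(F(k, x)) < 1e-12           # (2.3) holds
    assert abs(p - x / k) < 1e-12         # p = x/k holds
    print(x, k, p)
```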

In Fig. 2 (also generated with MATLAB) we plot the function \((1-x)x^{\frac{1}{1-x}}\) and the straight lines \(\frac{x}{k}\) for \(k=1,\ldots ,5.\) The intersections correspond to \( a\left( k\right) \) for \(k=1,\ldots ,5.\)

Fig. 2

The plot of the function \((1-x)x^{\frac{1}{1-x}}\) and the straight lines \(\frac{x}{k}\) for \(k=1,\ldots ,5\)

Proposition 2

With the above notation, it holds that

$$\begin{aligned} 0<a\left( k\right) \le 1/(k+1), \end{aligned}$$

for \(k\ge 2.\)

Proof

It is straightforward to observe, as an immediate consequence of the recursivity of Shannon’s entropy (1.1), that

$$\begin{aligned} H\left( p_{1},p_{2},\ldots ,p_{k+2}\right) \ge H\left( p_{1}+p_{2},\ldots ,p_{k+2}\right) . \end{aligned}$$

In the case \(p_{1}+p_{2}=1/(k+1),\) \(p_{3}=\cdots =p_{k+2}=1/(k+1)\) this yields

$$\begin{aligned} H\big (p_{1},p_{2},\underbrace{1/(k+1),\ldots ,1/(k+1)}_{k\text { terms}}\big )\ge H\left( 1/(k+1),\ldots ,1/(k+1)\right) =\log (k+1), \end{aligned}$$

and we can infer that for all \(n>k\) it holds that

$$\begin{aligned} H\big (p_{1},\ldots ,p_{n-k},\underbrace{1/(k+1),\ldots ,1/(k+1)}_{k\text { terms}}\big )\ge \log (k+1)\ge H\left( 0,\ldots ,0,1/k,\ldots ,1/k\right) \end{aligned}$$

for all positive \(p_{1},\ldots ,p_{n-k}\) such that \(p_{1}+\cdots +p_{n-k}=1/(k+1).\)

Then inequality (2.1) holds true for \(p=1/(k+1)\); therefore \( a\left( k\right) \le 1/(k+1).\) \(\square \)

Geometrically speaking, this means that the intersection of the graph of the function \((1-x)x^{\frac{1}{1-x}}\) with the straight line \(\frac{x}{k}\) has a lower ordinate than the intersection of the straight line \(\frac{x}{k+1}\) with the vertical line \(x=1.\)
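Proposition 2 can also be confirmed numerically: with \(f_{k}\) as defined above, \(f_{k}(1/(k+1))=\log (k+1)\ge \log k\), so the first crossing of \(f_{k}\) with \(y=\log k\) occurs no later than \(p=1/(k+1)\). A small Python check (our naming):

```python
import math

def f(k, p):
    # f_k(p) = -k p log p - (1 - k p) log(1 - k p), for 0 < p < 1/k
    return -k * p * math.log(p) - (1 - k * p) * math.log(1 - k * p)

for k in range(2, 8):
    p = 1.0 / (k + 1)
    # f_k attains its maximum log(k+1) exactly at p = 1/(k+1),
    # which exceeds the target level log k, so a(k) <= 1/(k+1)
    print(k, f(k, p), math.log(k + 1))
```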

Remark 1

Note that, according to Corollary 2 in [3], one also has

$$\begin{aligned} -kp\log p-(1-kp)\log (1-kp)&\le H\left( p_{1},\ldots ,p_{n-k},p,\ldots ,p\right) \\ &\le -kp\log p-(1-kp)\log \frac{1-kp}{n-k}\le \log n. \end{aligned}$$

The first equality holds true for \(p_{1}=\cdots =p_{n-k-1}=0,\ p_{n-k}=1-kp,\) the second equality is valid for \(p_{1}=\cdots =p_{n-k}=\frac{1-kp}{n-k}.\) The last equality holds true for \(p=1/n.\) In this paper we studied an alternative way to determine the domain of p such that

$$\begin{aligned} \log k\le -kp\log p-(1-kp)\log (1-kp). \end{aligned}$$

Such studies become of practical interest when one uses redistributing algorithms to analyze time series, as in the papers [4,5,6,7,8].
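The sandwich bounds of Remark 1 lend themselves to a quick numerical spot-check. In the Python sketch below (the sizes n, k and the value p are arbitrary choices of ours, with \(kp<1\)), the remaining mass \(1-kp\) is split randomly over the first \(n-k\) components:

```python
import math
import random

def H(probs):
    # Shannon entropy, natural log, zero terms skipped
    return -sum(q * math.log(q) for q in probs if q > 0)

random.seed(0)
n, k, p = 8, 3, 0.15          # hypothetical example sizes, k*p < 1
rest = 1 - k * p
# random split of the remaining mass over the first n-k components
w = [random.random() for _ in range(n - k)]
s = sum(w)
tail = [rest * wi / s for wi in w]
P = tail + [p] * k            # distribution (p_1, ..., p_{n-k}, p, ..., p)

lower = -k * p * math.log(p) - rest * math.log(rest)
upper = -k * p * math.log(p) - rest * math.log(rest / (n - k))
# the chain of inequalities from Corollary 2 in [3]
print(lower, H(P), upper, math.log(n))
```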