Statistical Estimation of the Shannon Entropy


This paper studies the behavior of the Kozachenko–Leonenko estimates of the (differential) Shannon entropy as the number of i.i.d. vector-valued observations tends to infinity. Asymptotic unbiasedness and L²-consistency of the estimates are established. The conditions employed involve analogues of the Hardy–Littlewood maximal function. In particular, the results are shown to hold for entropy estimation of any nondegenerate Gaussian vector.
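As a rough illustration of the kind of estimate discussed here, the sketch below implements the commonly cited nearest-neighbour (k = 1) form of the Kozachenko–Leonenko estimate, H_N = (d/N) Σ log ρ_i + log V_d + γ + log(N − 1), where ρ_i is the Euclidean distance from the i-th observation to its nearest neighbour in the sample, V_d is the volume of the unit ball in R^d, and γ is Euler's constant. This form is an assumption, since the abstract does not state the formula, and the function names `kl_entropy` and `gaussian_entropy` are hypothetical. The closed-form entropy of a nondegenerate Gaussian vector, (1/2) log((2πe)^d det Σ), gives a reference value to compare against.

```python
import math

import numpy as np


def kl_entropy(sample):
    """Nearest-neighbour (k = 1) entropy estimate in nats.

    Assumes the rows of `sample` are i.i.d. draws from a law with a density,
    so that ties (zero nearest-neighbour distances) do not occur.
    """
    x = np.asarray(sample, dtype=float)
    n, d = x.shape
    # Brute-force pairwise squared distances: O(n^2 * d) memory, fine for small n.
    diff = x[:, None, :] - x[None, :, :]
    sq_dist = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(sq_dist, np.inf)  # exclude each point from its own search
    rho = np.sqrt(sq_dist.min(axis=1))  # nearest-neighbour distances
    # log of the unit-ball volume V_d = pi^(d/2) / Gamma(d/2 + 1)
    log_unit_ball = 0.5 * d * math.log(math.pi) - math.lgamma(0.5 * d + 1.0)
    return float(
        d * np.mean(np.log(rho)) + log_unit_ball + np.euler_gamma + math.log(n - 1)
    )


def gaussian_entropy(cov):
    """Differential entropy (nats) of a Gaussian vector with covariance `cov`."""
    cov = np.atleast_2d(np.asarray(cov, dtype=float))
    d = cov.shape[0]
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("covariance matrix must be nondegenerate")
    return 0.5 * (d * math.log(2.0 * math.pi * math.e) + logdet)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cov = np.array([[2.0, 0.6], [0.6, 1.0]])  # illustrative covariance
    for n in (100, 400, 1600):
        x = rng.multivariate_normal(np.zeros(2), cov, size=n)
        print(n, kl_entropy(x), gaussian_entropy(cov))
```

Running the script prints the estimate next to the exact Gaussian entropy for a few sample sizes. This is only a numerical sanity check under the stated assumptions and says nothing about the conditions under which the paper's results are proved.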





The work is supported by the Russian Science Foundation under grant 14-21-00162 and was performed at the Steklov Mathematical Institute of the Russian Academy of Sciences. The authors are grateful to Professor E. Spodarev for drawing their attention to entropy estimation problems.

Author information



Corresponding author

Correspondence to Alexander Bulinski.



About this article


Cite this article

Bulinski, A., Dimitrov, D. Statistical Estimation of the Shannon Entropy. Acta. Math. Sin.-English Ser. 35, 17–46 (2019).



Keywords

  • Shannon differential entropy
  • Kozachenko–Leonenko estimates
  • Hardy–Littlewood maximal function analogues
  • asymptotic unbiasedness and L²-consistency
  • Gaussian vectors

MR(2010) Subject Classification

  • 60F25
  • 62G20
  • 62H12