Entropy and Thinning of Discrete Random Variables

Conference paper
Part of The IMA Volumes in Mathematics and its Applications book series (IMA, volume 161)

Abstract

We describe five types of results concerning information and concentration of discrete random variables, and relationships between them, motivated by their counterparts in the continuous case. The results we consider are information-theoretic approaches to Poisson approximation, the maximum entropy property of the Poisson distribution, discrete concentration (Poincaré and logarithmic Sobolev) inequalities, monotonicity of entropy, and concavity of entropy in the Shepp–Olkin regime.
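For orientation, the two objects named in the title and abstract can be made precise as follows. This is a minimal LaTeX sketch of the standard definitions, assuming Rényi's thinning operation and the (now proved) Shepp–Olkin concavity statement; the symbols T_alpha, B_i, Z_i and p_i are notation chosen here for illustration, not taken from the paper itself.

    \documentclass{article}
    \usepackage{amsmath,amssymb}
    \begin{document}
    % Renyi's thinning: each of the X points is kept independently with
    % probability alpha, so T_alpha X interpolates between X (alpha = 1)
    % and the point mass at 0 (alpha = 0).
    \[
      T_\alpha X := \sum_{i=1}^{X} B_i , \qquad
      B_1, B_2, \ldots \ \text{i.i.d. } \mathrm{Bernoulli}(\alpha),
      \ \text{independent of } X .
    \]
    % Shepp--Olkin regime: S is a sum of independent Bernoulli variables.
    % The concavity statement is that the Shannon entropy H(S) is a
    % concave function of the parameter vector (p_1, ..., p_n).
    \[
      S = \sum_{i=1}^{n} Z_i , \quad
      Z_i \sim \mathrm{Bernoulli}(p_i) \ \text{independent} ,
      \qquad
      (p_1, \ldots, p_n) \mapsto H(S) \ \text{is concave on } [0,1]^n .
    \]
    \end{document}

With this notation, the law of thin numbers studied in the Poisson-approximation strand states that thinning an n-fold convolution of a fixed mass function by 1/n converges to a Poisson limit, a discrete analogue of the entropic central limit theorem.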

Keywords

Mass function; Fisher information; Relative entropy; Probability mass function; Discrete random variable

Acknowledgements

The author thanks the Institute for Mathematics and its Applications for the invitation and funding to speak at the workshop ‘Information Theory and Concentration Phenomena’ in Minneapolis in April 2015. In addition, he would like to thank the organizers and participants of this workshop for stimulating discussions. The author thanks Fraser Daly, Mokshay Madiman and an anonymous referee for helpful comments on earlier drafts of this paper.


Copyright information

© Springer Science+Business Media LLC 2017

Authors and Affiliations

School of Mathematics, University of Bristol, Bristol, UK
