Convexity and Concentration, pp. 33–53
Entropy and Thinning of Discrete Random Variables
Abstract
We describe five types of results concerning information and concentration of discrete random variables, and relationships between them, motivated by their counterparts in the continuous case. The results we consider are information-theoretic approaches to Poisson approximation, the maximum entropy property of the Poisson distribution, discrete concentration (Poincaré and logarithmic Sobolev) inequalities, monotonicity of entropy, and concavity of entropy in the Shepp–Olkin regime.
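As a concrete anchor for the thinning operation named in the title, here is a minimal numerical sketch: binomial thinning T_α replaces a count X by a Binomial(X, α) variable, and the law of thin numbers (Harremoës, Johnson and Kontoyiannis) states that thinning the n-fold convolution of a law with mean λ by α = 1/n converges to the Poisson(λ) distribution. The geometric input law and all parameter values below are illustrative assumptions.

```python
import numpy as np
from math import exp, factorial

rng = np.random.default_rng(0)

def thin(x, alpha, rng):
    """Binomial thinning T_alpha: each of the x points survives
    independently with probability alpha."""
    return rng.binomial(x, alpha)

# Law of thin numbers, empirically: thin the n-fold convolution of a
# mean-lam law by alpha = 1/n and compare with Poisson(lam).
n, lam, samples = 200, 2.0, 100_000

# Geometric law on {0, 1, ...} with mean lam (an assumed input distribution).
x = rng.geometric(1.0 / (1.0 + lam), size=(samples, n)) - 1
s = x.sum(axis=1)            # samples from the n-fold convolution
t = thin(s, 1.0 / n, rng)    # thinned by alpha = 1/n

# Compare the empirical mass function of the thinned variable with Poisson(lam).
vals, counts = np.unique(t, return_counts=True)
emp = counts / samples
pois = np.array([exp(-lam) * lam ** int(k) / factorial(int(k)) for k in vals])
print("max |empirical - Poisson pmf|:", np.abs(emp - pois).max())
```

The printed discrepancy is small (on the order of the Monte Carlo error), consistent with the thinned convolution converging to the Poisson law.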
Keywords
Mass function · Fisher information · Relative entropy · Probability mass function · Discrete random variable
Acknowledgements
The author thanks the Institute for Mathematics and Its Applications for the invitation and funding to speak at the workshop ‘Information Theory and Concentration Phenomena’ in Minneapolis in April 2015. In addition, he would like to thank the organizers and participants of this workshop for stimulating discussions. The author thanks Fraser Daly, Mokshay Madiman and an anonymous referee for helpful comments on earlier drafts of this paper.