
The Kullback–Leibler Divergence Between Lattice Gaussian Distributions

Abstract

A lattice Gaussian distribution of given mean and covariance matrix is the discrete distribution supported on a lattice that maximizes Shannon's entropy under these mean and covariance constraints. Lattice Gaussian distributions find applications in cryptography and in machine learning. The set of Gaussian distributions on a given lattice can be handled as a discrete exponential family whose partition function is related to the Riemann theta function. In this paper, we first report a formula for the Kullback–Leibler divergence between two lattice Gaussian distributions and then show how to approximate it efficiently numerically, either via Rényi's \(\alpha \)-divergences or via the projective \(\gamma \)-divergences. We illustrate how to use the Kullback–Leibler divergence to calculate the Chernoff information on the dually flat structure of the manifold of lattice Gaussian distributions.
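As a concrete illustration of the quantities named above, the following self-contained Python sketch builds two discrete Gaussians on a truncated one-dimensional lattice and compares the exact (truncated) KL divergence with its Rényi \(\alpha \)-divergence approximation near \(\alpha =1\), together with the Chernoff information obtained by a grid search over \(\alpha \). The function names, the truncation window, and the restriction to \(\mathbb {Z}\) (rather than a general lattice, whose partition function would involve the Riemann theta function) are illustrative choices and not the paper's implementation.

```python
import math

def lattice_gaussian_pmf(mu, sigma, support):
    # Discrete Gaussian on a truncated 1D lattice: p(x) is proportional to
    # exp(-(x - mu)^2 / (2 sigma^2)); the normalizer Z is a truncated
    # theta-like sum standing in for the exact lattice partition function.
    weights = [math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) for x in support]
    Z = sum(weights)
    return [w / Z for w in weights]

def kl_divergence(p, q):
    # KL(p : q) = sum_x p(x) log(p(x) / q(x))
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def renyi_divergence(p, q, alpha):
    # Renyi alpha-divergence; tends to the KL divergence as alpha -> 1.
    s = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in zip(p, q))
    return math.log(s) / (alpha - 1.0)

def chernoff_information(p, q, alphas):
    # Chernoff information: -log min_alpha sum_x p(x)^alpha q(x)^(1 - alpha),
    # approximated here by a simple grid search over alpha in (0, 1).
    return -min(math.log(sum(pi ** a * qi ** (1.0 - a) for pi, qi in zip(p, q)))
                for a in alphas)

support = range(-50, 51)  # truncation of the lattice Z
p = lattice_gaussian_pmf(0.0, 1.0, support)
q = lattice_gaussian_pmf(0.5, 1.5, support)

print(kl_divergence(p, q))                # exact KL on the truncated support
print(renyi_divergence(p, q, 0.999))      # Renyi approximation near alpha = 1
print(chernoff_information(p, q, [i / 100.0 for i in range(1, 100)]))
```

The Rényi value at \(\alpha \) close to 1 is close to the KL divergence, matching the limiting behavior exploited in the paper; the grid search over \(\alpha \) is a crude stand-in for the geodesic-based characterization of Chernoff information on the dually flat manifold.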



Notes

  1. Definition: n univariate functions \(f_1(x),\ldots , f_n(x)\) are said to be linearly dependent if there exist n constants \(c_1,\ldots , c_n\), not all zero, such that \(\sum _{i=1}^n c_i f_i(x)=0\) for all x in an interval \(I\subset \mathbb {R}\). Otherwise, the functions are said to be linearly independent. For example, \(\sin ^2(x)\), \(\cos ^2(x)\), and the constant function 1 are linearly dependent on \(\mathbb {R}\) since \(\sin ^2(x)+\cos ^2(x)-1=0\).


Acknowledgements

We thank the reviewers for the constructive and helpful suggestions on this paper.

Author information

Correspondence to Frank Nielsen.


Cite this article

Nielsen, F. The Kullback–Leibler Divergence Between Lattice Gaussian Distributions. J Indian Inst Sci (2022). https://doi.org/10.1007/s41745-021-00279-5


Keywords

  • Lattice Gaussian distribution
  • Discrete exponential family
  • Riemann theta function
  • Statistical divergence
  • Information geometry