Abstract
In this paper, we first extend a result of Ali and Silvey [J. R. Stat. Soc. Ser. B 28(1), 131–142 (1966)], who proved that any f-divergence between two isotropic multivariate Gaussian distributions amounts to a strictly increasing scalar function of their Mahalanobis distance. We give sufficient conditions on the standard probability density function generating a multivariate location family and on the generator f for this result to generalize. This property is useful in practice because it allows one to compare f-divergences between densities of these location families exactly via their Mahalanobis distances, even when the f-divergences are not available in closed form, as is the case, for example, for the Jensen–Shannon divergence or the total variation distance between densities of a normal location family. Second, we consider f-divergences between densities of multivariate scale families: we recall Ali and Silvey's result that for normal scale families these yield matrix spectral divergences, and we extend this result to densities of a generic scale family.
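The property stated in the abstract can be illustrated numerically. The sketch below (function names are hypothetical, not from the paper) checks, for a one-dimensional normal location family with fixed scale, that the total variation distance, an f-divergence with no simple closed form in general, is ordered consistently with the Mahalanobis distance |μ₁ − μ₂|/σ, so two pairs of densities can be ranked without evaluating the divergence in closed form:

```python
import math

def normal_pdf(x, mu, sigma):
    # Density of N(mu, sigma^2).
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def total_variation(mu1, mu2, sigma, lo=-20.0, hi=20.0, n=200_000):
    # TV(p, q) = (1/2) * integral |p(x) - q(x)| dx, via a Riemann sum.
    h = (hi - lo) / n
    return 0.5 * sum(
        abs(normal_pdf(lo + i * h, mu1, sigma) - normal_pdf(lo + i * h, mu2, sigma)) * h
        for i in range(n)
    )

def mahalanobis(mu1, mu2, sigma):
    # 1-D Mahalanobis distance for a common scale sigma.
    return abs(mu1 - mu2) / sigma

# Two pairs of densities from the same location family N(mu, 2^2):
# the pair with the larger Mahalanobis distance must have the larger TV distance.
d_small = mahalanobis(0.0, 1.0, 2.0)   # 0.5
d_large = mahalanobis(0.0, 3.0, 2.0)   # 1.5
tv_small = total_variation(0.0, 1.0, 2.0)
tv_large = total_variation(0.0, 3.0, 2.0)
assert d_small < d_large and tv_small < tv_large
```

For normal location families the exact relation TV = erf(Δ/(2√2)) with Δ the Mahalanobis distance is known, which the numerical integral recovers; the point of the paper is that such a monotone relation holds for a much wider class of location families and generators f.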
References
Ali, S.M., Silvey, S.D.: A general class of coefficients of divergence of one distribution from another. J. R. Stat. Soc. Ser. B 28(1), 131–142 (1966)
Amari, S.-I.: Information Geometry and Its Applications. Applied Mathematical Sciences, Springer, Tokyo, Japan (2016)
Banerjee, A., Merugu, S., Dhillon, I.S., Ghosh, J., Lafferty, J.: Clustering with Bregman divergences. J. Mach. Learn. Res. 6(10), 1705–1749 (2005)
Calvo, M., Oller, J.M.: An explicit solution of information geodesic equations for the multivariate normal model. Stat Risk Model 9(1–2), 119–138 (1991)
Cover, T.M.: Elements of Information Theory. John Wiley & Sons, Hoboken (1999)
Csiszár, I.: Eine informationstheoretische Ungleichung und ihre Anwendung auf den Beweis der Ergodizität von Markoffschen Ketten. Magyar Tud. Akad. Mat. Kutató Int. Közl. 8, 85–108 (1964)
Davis, J., Dhillon, I.: Differential entropic clustering of multivariate Gaussians. Adv. Neural Inf. Process. Syst. 19 (2006)
Devroye, L., Mehrabian, A., Reddad, T.: The total variation distance between high-dimensional Gaussians with the same mean. arXiv preprint arXiv:1810.08693 (2018)
Fuglede, B., Topsøe, F.: Jensen-Shannon divergence and Hilbert space embedding. In: International Symposium on Information Theory (ISIT), p. 31. IEEE (2004)
Globke, W., Quiroga-Barranco, R.: Information geometry and asymptotic geodesics on the space of normal distributions. Inf. Geomet. 4(1), 131–153 (2021)
Khosravifard, M., Fooladivanda, D., Gulliver, T.A.: Confliction of the convexity and metric properties in \(f\)-divergences. IEICE Trans. Fund. Electr. Commun. Comput. Sci. 90(9), 1848–1853 (2007)
Kollo, T.: Advanced Multivariate Statistics with Matrices. Springer, Berlin/Heidelberg (2005)
Kulis, B., Sustik, M.A., Dhillon, I.S.: Low-rank Kernel learning with Bregman matrix divergences. J. Mach. Learn. Res. 10, 2 (2009)
Lin, J.: Divergence measures based on the Shannon entropy. IEEE Trans. Inf. Theory 37(1), 145–151 (1991)
Mahalanobis, P.C.: On the generalized distance in statistics. Proc. Natl. Inst. Sci. 2, 49–55 (1936)
Michalowicz, J.V., Nichols, J.M., Bucholtz, F.: Calculation of differential entropy for a mixed Gaussian distribution. Entropy 10(3), 200–206 (2008)
Molenberghs, G., Lesaffre, E.: Non-linear integral equations to approximate bivariate densities with given marginals and dependence function. Stat. Sin. 7, 713–738 (1997)
Nielsen, F.: On Information Projections Between Multivariate Elliptical and Location-Scale Families. arXiv preprint arXiv:2101.03839 (2021)
Nielsen, F., Boltz, S.: The Burbea-Rao and Bhattacharyya centroids. IEEE Trans. Inf. Theory 57(8), 5455–5466 (2011)
Nielsen, F., Nock, R.: On the chi square and higher-order chi distances for approximating \(f\)-divergences. IEEE Signal Process. Lett. 21(1), 10–13 (2013)
Nielsen, F., Okamura, K.: On \(f\)-divergences between Cauchy distributions. IEEE Trans. Inf. Theory 69(5), 3150–3171 (2023)
Ollila, E., Tyler, D.E., Koivunen, V., Poor, H.V.: Complex elliptically symmetric distributions: survey, new results and applications. IEEE Trans. Signal Process. 60(11), 5597–5625 (2012)
Pardo, L.: Statistical Inference Based on Divergence Measures. Chapman and Hall/CRC, Boca Raton, Florida (2018)
Qiao, Y., Minematsu, N.: A study on invariance of \(f\)-divergence and its application to speech recognition. IEEE Trans. Signal Process. 58(7), 3884–3890 (2010)
Rohban, M.H., Ishwar, P., Orten, B., Karl, W.C., Saligrama, V.: An impossibility result for high dimensional supervised learning. In: 2013 IEEE Information Theory Workshop (ITW), pp. 1–5. IEEE (2013)
Steerneman, A., Van Perlo-Ten Kleij, F.: Spherical distributions: Schoenberg (1938) revisited. Expo. Math. 23(3), 281–287 (2005)
Touboul, J.: Projection pursuit through \(\phi \)-divergence minimisation. Entropy 12(6), 1581–1611 (2010)
Watanabe, S., Yamazaki, K., Aoyagi, M.: Kullback information of normal mixture is not an analytic function. IEICE Tech. Rep. 2004, 41–46 (2004)
Author information
Contributions
FN and KO wrote the manuscript. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Nielsen, F., Okamura, K.: On the f-divergences between densities of a multivariate location or scale family. Stat. Comput. 34, 60 (2024). https://doi.org/10.1007/s11222-023-10373-6