Abstract
We study the geometry of probability distributions with respect to a generalized family of Csiszár f-divergences. A member of this family is the relative \(\alpha \)-entropy, which is also a Rényi analog of relative entropy in information theory and is known as the logarithmic or projective power divergence in statistics. We apply Eguchi’s theory to derive the Fisher information metric and the dual affine connections arising from these generalized divergence functions. This enables us to arrive at a more widely applicable version of the Cramér–Rao inequality, which provides a lower bound on the variance of an estimator for an escort of the underlying parametric probability distribution. We then extend the Amari–Nagaoka dually flat structure of the exponential and mixture models to other distributions with respect to the aforementioned generalized metric. We show that these formulations lead us to unbiased and efficient estimators for the escort model. Finally, we compare our work with prior results on generalized Cramér–Rao inequalities that were derived from non-information-geometric frameworks.
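As a concrete illustration of the escort construction mentioned above, the \(\alpha \)-escort of a finite distribution \(p\) is \(p(x)^{\alpha }/\sum _y p(y)^{\alpha }\) (the standard definition in nonextensive statistics; the function name and the sample values below are ours, for illustration only):

```python
import numpy as np

def escort(p, alpha):
    """Return the alpha-escort of a finite probability vector p:
    the entries p_i**alpha, renormalized to sum to one."""
    p = np.asarray(p, dtype=float)
    w = p ** alpha
    return w / w.sum()

# alpha = 1 recovers the original distribution;
# alpha < 1 flattens it, alpha > 1 sharpens it.
p = np.array([0.7, 0.2, 0.1])
print(escort(p, 1.0))
print(escort(p, 0.5))
print(escort(p, 2.0))
```

Note that the escort is itself a probability distribution for every \(\alpha > 0\), which is what allows the generalized Cramér–Rao bound to be stated for estimators of the escort model.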
Notes
A divergence function on \(S\times S\) is a function \(D\) satisfying \(D(p,q) \ge 0\), with equality iff \(p=q\).
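A standard example satisfying this definition, included here for illustration, is the Kullback–Leibler divergence on the probability simplex:

```latex
D_{\mathrm{KL}}(p \,\|\, q) \;=\; \sum_{x} p(x)\log\frac{p(x)}{q(x)} \;\ge\; 0,
\qquad \text{with equality iff } p = q,
```

where non-negativity follows from Jensen's inequality applied to the convex function \(t \mapsto t\log t\).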
References
Amari, S.: Information geometry and its applications. Springer, New York (2016)
Amari, S., Cichocki, A.: Information geometry of divergence functions. Bull. Polish Acad. Sci. Tech. Sci. 58(1), 183–195 (2010)
Amari, S., Nagaoka, H.: Methods of information geometry. Oxford University Press, Oxford (2000)
Arıkan, E.: An inequality on guessing and its application to sequential decoding. IEEE Trans. Inf. Theory 42(1), 99–105 (1996)
Ay, N., Jost, J., Lê, H.V., Schwachhöfer, L.: Information geometry. Springer, New York (2017)
Basu, A., Shioya, H., Park, C.: Statistical inference: The minimum distance approach. In: Monographs on Statistics and Applied Probability. Chapman & Hall/CRC Press, London (2011)
Bercher, J.F.: On a (\(\beta \), q)-generalized Fisher information and inequalities involving q-Gaussian distributions. J. Math. Phys. 53(063303), 1–12 (2012)
Bercher, J.F.: On generalized Cramér-Rao inequalities, generalized Fisher information and characterizations of generalized q-Gaussian distributions. J. Phys. A Math. Theor. 45(25), 255303 (2012)
Blumer, A.C., McEliece, R.J.: The Rényi redundancy of generalized Huffman codes. IEEE Trans. Inf. Theory 34(5), 1242–1249 (1988)
Bunte, C., Lapidoth, A.: Codes for tasks and Rényi entropy. IEEE Trans. Inf. Theory 60(9), 5065–5076 (2014)
Campbell, L.L.: A coding theorem and Rényi’s entropy. Inf. Control 8, 423–429 (1965)
Cichocki, A., Amari, S.: Families of alpha-, beta- and gamma-divergences: Flexible and robust measures of similarities. Entropy 12, 1532–1568 (2010)
Cover, T.M., Thomas, J.A.: Elements of information theory. Wiley, Hoboken (2012)
Csiszár, I.: Why least squares and maximum entropy? An axiomatic approach to inference for linear inverse problems. Ann. Stat. 19(4), 2032–2066 (1991)
Eguchi, S.: Geometry of minimum contrast. Hiroshima Math. J. 22(3), 631–647 (1992)
Eguchi, S., Kato, S.: Entropy and divergence associated with power function and the statistical application. Entropy 12(2), 262–274 (2010)
Eguchi, S., Komori, O., Kato, S.: Projective power entropy and maximum Tsallis entropy distributions. Entropy 13(10), 1746–1764 (2011)
van Erven, T., Harremoës, P.: Rényi divergence and Kullback–Leibler divergence. IEEE Trans. Inf. Theory 60(7), 3797–3820 (2014)
Fujisawa, H., Eguchi, S.: Robust parameter estimation with a small bias against heavy contamination. J. Multivar. Anal. 99, 2053–2081 (2008)
Furuichi, S.: On the maximum entropy principle and the minimization of the Fisher information in Tsallis statistics. J. Math. Phys. 50(013303), 1–12 (2009)
Huleihel, W., Salamatian, S., Médard, M.: Guessing with limited memory. In: IEEE International Symposium on Information Theory, pp. 2253–2257 (2017)
Jones, M.C., Hjort, N.L., Harris, I.R., Basu, A.: A comparison of related density based minimum divergence estimators. Biometrika 88(3), 865–873 (2001)
Karthik, P.N., Sundaresan, R.: On the equivalence of projections in relative \(\alpha \)-entropy and Rényi divergence. In: National Conference on Communication, pp. 1–6 (2018)
Kumar, M.A., Mishra, K.V.: Information geometric approach to Bayesian lower error bounds. In: IEEE International Symposium on Information Theory, pp. 746–750 (2018)
Kumar, M.A., Sason, I.: Projection theorems for the Rényi divergence on alpha-convex sets. IEEE Trans. Inf. Theory 62(9), 4924–4935 (2016)
Kumar, M.A., Sundaresan, R.: Minimization problems based on relative \(\alpha \)-entropy I: Forward projection. IEEE Trans. Inf. Theory 61(9), 5063–5080 (2015)
Kumar, M.A., Sundaresan, R.: Minimization problems based on relative \(\alpha \)-entropy II: Reverse projection. IEEE Trans. Inf. Theory 61(9), 5081–5095 (2015)
Lutwak, E., Yang, D., Lv, S., Zhang, G.: Extensions of Fisher information and Stam’s inequality. IEEE Trans. Inf. Theory 58(3), 1319–1327 (2012)
Lutwak, E., Yang, D., Zhang, G.: Cramér-Rao and moment-entropy inequalities for Rényi entropy and generalized Fisher information. IEEE Trans. Inf. Theory 51(1), 473–478 (2005)
Mishra, K.V., Kumar, M.A.: Generalized Bayesian Cramér-Rao inequality via information geometry of relative \(\alpha \)-entropy. In: IEEE Annual Conference on Information Science and Systems, pp. 1–6 (2020)
Naudts, J.: Estimators, escort probabilities, and \(\phi \)-exponential families in statistical physics. J. Inequal. Pure Appl. Math. 5(4), 1–15 (2004)
Naudts, J.: Generalised thermostatistics. Springer, New York (2011)
Notsu, A., Komori, O., Eguchi, S.: Spontaneous clustering via minimum gamma-divergence. Neural Comput. 26(2), 421–448 (2014)
Rényi, A.: On measures of entropy and information. In: Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Contributions to the Theory of Statistics, pp. 547–561 (1961)
Sundaresan, R.: Guessing under source uncertainty. IEEE Trans. Inf. Theory 53(1), 269–287 (2007)
Tsallis, C., Mendes, R.S., Plastino, A.R.: The role of constraints within generalized nonextensive statistics. Phys. A 261, 534–554 (1998)
Zhang, J.: Divergence function, duality, and convex analysis. Neural Comput. 16, 159–195 (2004)
Acknowledgements
The authors are indebted to Prof. Rajesh Sundaresan of the Indian Institute of Science, Bengaluru for his helpful suggestions and discussions that improved the presentation of this material substantially. We sincerely thank the anonymous reviewers for their constructive suggestions that significantly improved the presentation of the manuscript.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
A Proof of Theorem 3
Taking the logarithm on both sides of (48),
Partial derivative produces
or
Taking expectations on both sides of (64), we obtain
Since the expected value of the score function vanishes (the left-hand side of (65)), we have
Substituting (66) into (64), we get
where
Moreover, (66) implies that \(\log M(\theta )\) should be the potential (if it exists).
The Riemannian metric becomes
This further strengthens our expectation that the \(\eta _i\)’s are the dual parameters of the \(\theta _i\)’s. Surprisingly, however, this turns out not to be the case, as we now show. We have
Let \(R_{\theta }(x) = q(x)^{\alpha -1} + \sum \limits _{j=1}^k\theta _j f_j(x)\). Partial differentiation produces
Substituting (71) into (70) gives
This shows that the \(\eta _i\)’s cannot be the dual parameters of the \(\theta _i\)’s for the statistical model \(\mathbb {M}^{(\alpha )}\). This completes the proof.
Cite this article
Ashok Kumar, M., Vijay Mishra, K. Cramér–Rao lower bounds arising from generalized Csiszár divergences. Info. Geo. 3, 33–59 (2020). https://doi.org/10.1007/s41884-020-00029-z