Annals of the Institute of Statistical Mathematics, Volume 55, Issue 3, pp 639–653

A new class of metric divergences on probability spaces and its applicability in statistics

  • Ferdinand Österreicher
  • Igor Vajda


The class \(I_{f_\beta}\), \(\beta \in (0, \infty]\), of \(f\)-divergences investigated in this paper is defined in terms of a class of entropies introduced by Arimoto (1971, Information and Control, 19, 181–194). It contains the squared Hellinger distance (for \(\beta = 1/2\)), the sum \(I(Q_1 \| (Q_1 + Q_2)/2) + I(Q_2 \| (Q_1 + Q_2)/2)\) of Kullback-Leibler divergences (for \(\beta = 1\)) and half of the variation distance (for \(\beta = \infty\)), and continuously extends the class of squared perimeter-type distances introduced by Österreicher (1996, Kybernetika, 32, 389–393) (for \(\beta \in (1, \infty]\)). It is shown that \((I_{f_\beta}(Q_1, Q_2))^{\min(\beta, 1/2)}\) are distances of probability distributions \(Q_1, Q_2\) for \(\beta \in (0, \infty)\). The applicability of \(I_{f_\beta}\)-divergences in statistics is also considered. In particular, it is shown that the \(I_{f_\beta}\)-projections of appropriate empirical distributions to regular families define distribution estimates which are consistent in the case of an i.i.d. sample of size \(n\). The order of consistency is investigated as well.
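The abstract does not state the generating function \(f_\beta\) explicitly. As a minimal sketch for discrete distributions, the Python code below assumes the form \(f_\beta(u) = \frac{1}{1 - 1/\beta}\left[(1 + u^\beta)^{1/\beta} - 2^{1/\beta - 1}(1 + u)\right]\) for \(\beta \neq 1\), with the limits \(f_1(u) = (1 + u)\ln 2 + u \ln u - (1 + u)\ln(1 + u)\) and \(f_\infty(u) = |1 - u|/2\). This form is an assumption, chosen because it reproduces the three special cases named above. The function names `divergence` and `distance` are illustrative and do not come from the paper.

```python
import math


def _xlogx(x):
    """Return x * ln(x) with the convention 0 * ln(0) = 0."""
    return x * math.log(x) if x > 0.0 else 0.0


def divergence(q1, q2, beta):
    """I_{f_beta}(Q1, Q2) for discrete distributions given as sequences.

    Uses the homogeneous form q2 * f_beta(q1 / q2), which stays finite
    when either probability is zero. The generator f_beta is an assumed
    form (see the lead-in); beta must lie in (0, inf].
    """
    if len(q1) != len(q2):
        raise ValueError("distributions must have the same support size")
    if not beta > 0.0:
        raise ValueError("beta must lie in (0, inf]")
    total = 0.0
    for p, q in zip(q1, q2):
        if math.isinf(beta):
            # beta = inf: half of the variation distance
            total += abs(p - q) / 2.0
        elif beta == 1.0:
            # beta = 1: sum of two Kullback-Leibler divergences to the midpoint
            total += _xlogx(p) + _xlogx(q) + (p + q) * math.log(2.0) - _xlogx(p + q)
        else:
            c = 1.0 / (1.0 - 1.0 / beta)
            total += c * ((p ** beta + q ** beta) ** (1.0 / beta)
                          - 2.0 ** (1.0 / beta - 1.0) * (p + q))
    return total


def distance(q1, q2, beta):
    """The power (I_{f_beta})^{min(beta, 1/2)} that the abstract states is a distance."""
    # Clamp tiny negative rounding errors before taking a fractional power.
    return max(0.0, divergence(q1, q2, beta)) ** min(beta, 0.5)


# Usage: compare two distributions on three points for several beta values.
q_a = [0.5, 0.5, 0.0]
q_b = [0.1, 0.4, 0.5]
for b in (0.5, 1.0, 2.0, math.inf):
    print(b, divergence(q_a, q_b, b), distance(q_a, q_b, b))
```

Writing each term as \(q_2 f_\beta(q_1/q_2)\) in expanded form avoids a division by zero on points where one distribution has no mass, which is why the sketch never evaluates \(f_\beta\) directly.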

Key words and phrases

Dissimilarities, metric divergences, minimum distance estimators



  1. Ali, S. M. and Silvey, S. D. (1966). A general class of coefficients of divergence of one distribution from another, J. Roy. Statist. Soc. Ser. B, 28, 131–142.
  2. Arimoto, S. (1971). Information-theoretical considerations on estimation problems, Information and Control, 19, 181–194.
  3. Barron, A. R., Győrfi, L. and van der Meulen, E. (1992). Distribution estimates consistent in total variation and in two types of information divergence, IEEE Trans. Inform. Theory, 38, 1437–1454.
  4. Basu, A. and Lindsay, B. G. (1994). Minimum disparity estimation for continuous models: Efficiency, distribution and robustness, Ann. Inst. Statist. Math., 46, 683–705.
  5. Beran, R. (1977). Minimum Hellinger distance estimates for parametric models, Ann. Statist., 5, 445–463.
  6. Beran, R. (1978). An efficient and robust adaptive estimator of location, Ann. Statist., 6, 292–313.
  7. Berlinet, A., Devroye, L. and Győrfi, L. (1995). Asymptotic normality of \(L_1\)-error in density estimation, Statistics, 26, 329–343.
  8. Berlinet, A., Vajda, I. and van der Meulen, E. (1997). About the asymptotic accuracy of Barron density estimator, IEEE Trans. Inform. Theory (submitted).
  9. Boekee, D. E. (1977). A Generalization of the Fisher Information Measure, Delft University Press, Delft.
  10. Csiszár, I. (1963). Eine informationstheoretische Ungleichung und ihre Anwendung auf den Beweis der Ergodizität von Markoffschen Ketten, Publ. Math. Inst. Hungar. Acad. Sci., 8, 85–107.
  11. Csiszár, I. (1975). I-divergence geometry of probability distributions and minimization problems, Ann. Probab., 3(1), 146–158.
  12. Csiszár, I. and Fischer, J. (1962). Informationsentfernungen im Raum der Wahrscheinlichkeitsverteilungen, Magyar Tud. Akad. Mat. Kutató Int. Közl., 7, 159–180.
  13. Cutler, A. and Cordero-Brana, O. I. (1996). Minimum Hellinger distance estimation for finite mixture models, J. Amer. Statist. Assoc., 91, 1716–1723.
  14. Devroye, L. and Győrfi, L. (1990). No empirical measure can converge in total variation sense for all distributions, Ann. Statist., 18, 1496–1499.
  15. Feldman, D. and Österreicher, F. (1989). A note on \(f\)-divergence, Studia Sci. Math. Hungar., 24, 191–200.
  16. Győrfi, L., Vajda, I. and van der Meulen, E. (1994). Minimum Hellinger distance point estimates consistent under weak family regularity, Math. Methods Statist., 3, 25–45.
  17. Győrfi, L., Vajda, I. and van der Meulen, E. (1996). Minimum Kolmogorov distance estimates of parameters and parametrized distributions, Metrika, 43, 237–255.
  18. Hall, P. and Patil, P. (1995). Formulae for mean integrated squared error of nonlinear wavelet-based density estimators, Ann. Statist., 23(3), 905–928.
  19. Kafka, P., Österreicher, F. and Vincze, I. (1991). On powers of \(f\)-divergences defining a distance, Studia Sci. Math. Hungar., 26, 415–422.
  20. Liese, F. and Vajda, I. (1987). Convex Statistical Distances, Teubner-Texte zur Mathematik, Band 95, Teubner, Leipzig.
  21. Lin, J. (1991). Divergence measures based on the Shannon entropy, IEEE Trans. Inform. Theory, 37, 145–151.
  22. Lindsay, B. G. (1994). Efficiency versus robustness: The case for minimum Hellinger and related methods, Ann. Statist., 22(2), 1081–1114.
  23. Matusita, K. (1955). Decision rules based on the distance for problems of fit, two samples and estimation, Ann. Math. Statist., 26, 631–640.
  24. Matusita, K. (1964). Distances and decision rules, Ann. Inst. Statist. Math., 16, 305–320.
  25. Morales, D., Pardo, L. and Vajda, I. (1995). Asymptotic divergence of estimates of discrete distributions, J. Statist. Plann. Inference, 47, 347–369.
  26. Morales, D., Pardo, L. and Vajda, I. (1996). Uncertainty of discrete stochastic systems: General theory and statistical inference, IEEE Trans. Systems, Man and Cybernetics, 26, 681–697.
  27. Österreicher, F. (1982). The construction of least favourable distributions is traceable to a minimal perimeter problem, Studia Sci. Math. Hungar., 17, 341–351.
  28. Österreicher, F. (1992). The risk set of a testing problem: A vivid statistical tool, Transactions of the Eleventh Prague Conference, Vol. A, 175–188, Academia, Prague.
  29. Österreicher, F. (1996). On a class of perimeter-type distances of probability distributions, Kybernetika, 32, 389–393.
  30. Österreicher, F. and Vajda, I. (1993). Statistical information and discrimination, IEEE Trans. Inform. Theory, 39(3), 1036–1039.
  31. Pak, R. J. (1996). Minimum Hellinger distance estimation in simple linear regression models; distribution and efficiency, Statist. Probab. Lett., 26, 263–269.
  32. Read, T. C. R. and Cressie, N. A. (1988). Goodness-of-Fit Statistics for Discrete Multivariate Data, Springer, New York.
  33. Reschenhofer, E. and Bomze, I. M. (1991). Length tests for goodness of fit, Biometrika, 78, 207–216.
  34. Tamura, R. D. and Boos, D. D. (1986). Minimum Hellinger distance estimation for multivariate location and covariance, J. Amer. Statist. Assoc., 81, 223–229.
  35. Vincze, I. (1981). On the concept and measure of information contained in an observation, Contributions to Probability (eds. J. Gani and V. F. Rohatgi), 207–214, Academic Press, New York.

Copyright information

© The Institute of Statistical Mathematics 2003

Authors and Affiliations

  • Ferdinand Österreicher (1)
  • Igor Vajda (2)

  1. Institute of Mathematics, University of Salzburg, Salzburg, Austria
  2. Institute of Information Theory and Automation, Academy of Sciences, Prague, Czech Republic
