Annals of the Institute of Statistical Mathematics, Volume 55, Issue 3, pp. 639–653

A new class of metric divergences on probability spaces and its applicability in statistics

  • Ferdinand Österreicher
  • Igor Vajda

Abstract

The class \(I_{f_\beta}\), \(\beta \in (0, \infty]\), of \(f\)-divergences investigated in this paper is defined in terms of a class of entropies introduced by Arimoto (1971, Information and Control, 19, 181–194). It contains the squared Hellinger distance (for \(\beta = 1/2\)), the sum \(I(Q_1 \| (Q_1+Q_2)/2) + I(Q_2 \| (Q_1+Q_2)/2)\) of Kullback-Leibler divergences (for \(\beta = 1\)) and half of the variation distance (for \(\beta = \infty\)), and it continuously extends the class of squared perimeter-type distances introduced by Österreicher (1996, Kybernetika, 32, 389–393) (for \(\beta \in (1, \infty]\)). It is shown that \((I_{f_\beta}(Q_1, Q_2))^{\min(\beta, 1/2)}\) are distances of probability distributions \(Q_1, Q_2\) for \(\beta \in (0, \infty)\). The applicability of \(I_{f_\beta}\)-divergences in statistics is also considered. In particular, it is shown that the \(I_{f_\beta}\)-projections of appropriate empirical distributions onto regular families define distribution estimates which are consistent in the case of an i.i.d. sample of size \(n\). The order of consistency is investigated as well.
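The three members of the class named above are standard divergences that can be computed directly for finite discrete distributions. The following Python sketch is an illustration of mine, not code from the paper (whose general formula for \(f_\beta\) is not reproduced on this page); the helper names are hypothetical. It evaluates the three special cases and numerically spot-checks the metric claim: since \(\min(\beta, 1/2) = 1/2\) for each of these three values of \(\beta\), the square roots of the divergences should satisfy the triangle inequality.

import numpy as np

def squared_hellinger(q1, q2):
    # beta = 1/2: sum over the support of (sqrt(q1) - sqrt(q2))^2
    return np.sum((np.sqrt(q1) - np.sqrt(q2)) ** 2)

def kl(p, q):
    # Kullback-Leibler divergence I(P || Q); terms with p = 0 contribute 0
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

def sum_kl_to_midpoint(q1, q2):
    # beta = 1: I(Q1 || M) + I(Q2 || M) with the midpoint M = (Q1 + Q2)/2
    m = (q1 + q2) / 2
    return kl(q1, m) + kl(q2, m)

def half_variation(q1, q2):
    # beta = infinity: half of the variation distance, (1/2) sum |q1 - q2|
    return 0.5 * np.sum(np.abs(q1 - q2))

# Spot-check the triangle inequality for the square-rooted divergences on
# random triples of distributions drawn from a Dirichlet distribution.
rng = np.random.default_rng(0)
for divergence in (squared_hellinger, sum_kl_to_midpoint, half_variation):
    for _ in range(1000):
        p, q, r = (rng.dirichlet(np.ones(5)) for _ in range(3))
        d = lambda a, b: divergence(a, b) ** 0.5
        assert d(p, r) <= d(p, q) + d(q, r) + 1e-12
print("triangle inequality held for all sampled triples")

Such a check is of course only a numerical sanity test of the theorem on a finite sample of triples, not a substitute for the proof given in the paper.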

Key words and phrases

Dissimilarities, metric divergences, minimum distance estimators


References

  1. Ali, S. M. and Silvey, S. D. (1966). A general class of coefficients of divergence of one distribution from another, J. Roy. Statist. Soc. Ser. B, 28, 131–142.
  2. Arimoto, S. (1971). Information-theoretical considerations on estimation problems, Information and Control, 19, 181–194.
  3. Barron, A. R., Győrfi, L. and van der Meulen, E. (1992). Distribution estimates consistent in total variation and in two types of information divergence, IEEE Trans. Inform. Theory, 38, 1437–1454.
  4. Basu, A. and Lindsay, B. G. (1994). Minimum disparity estimation for continuous models: Efficiency, distribution and robustness, Ann. Inst. Statist. Math., 46, 683–705.
  5. Beran, R. (1977). Minimum Hellinger distance estimates for parametric models, Ann. Statist., 5, 445–463.
  6. Beran, R. (1978). An efficient and robust adaptive estimator of location, Ann. Statist., 6, 292–313.
  7. Berlinet, A., Devroye, L. and Győrfi, L. (1995). Asymptotic normality of L1-error in density estimation, Statistics, 26, 329–343.
  8. Berlinet, A., Vajda, I. and van der Meulen, E. (1997). About the asymptotic accuracy of Barron density estimator, IEEE Trans. Inform. Theory (submitted).
  9. Boekee, D. E. (1977). A Generalization of the Fisher Information Measure, Delft University Press, Delft.
  10. Csiszár, I. (1963). Eine informationstheoretische Ungleichung und ihre Anwendung auf den Beweis der Ergodizität von Markoffschen Ketten, Publ. Math. Inst. Hungar. Acad. Sci., 8, 85–107.
  11. Csiszár, I. (1975). I-divergence geometry of probability distributions and minimization problems, Ann. Probab., 3(1), 146–158.
  12. Csiszár, I. and Fischer, J. (1962). Informationsentfernungen im Raum der Wahrscheinlichkeitsverteilungen, Magyar Tud. Akad. Mat. Kutató Int. Közl., 7, 159–180.
  13. Cutler, A. and Cordero-Brana, O. I. (1996). Minimum Hellinger distance estimation for finite mixture models, J. Amer. Statist. Assoc., 91, 1716–1723.
  14. Devroye, L. and Győrfi, L. (1990). No empirical measure can converge in total variation sense for all distributions, Ann. Statist., 18, 1496–1499.
  15. Feldman, D. and Österreicher, F. (1989). A note on f-divergence, Studia Sci. Math. Hungar., 24, 191–200.
  16. Győrfi, L., Vajda, I. and van der Meulen, E. (1994). Minimum Hellinger distance point estimates consistent under weak family regularity, Math. Methods Statist., 3, 25–45.
  17. Győrfi, L., Vajda, I. and van der Meulen, E. (1996). Minimum Kolmogorov distance estimates of parameters and parametrized distributions, Metrika, 43, 237–255.
  18. Hall, P. and Patil, P. (1995). Formulae for mean integrated squared error of nonlinear wavelet-based density estimators, Ann. Statist., 23(3), 905–928.
  19. Kafka, P., Österreicher, F. and Vincze, I. (1991). On powers of f-divergences defining a distance, Studia Sci. Math. Hungar., 26, 415–422.
  20. Liese, F. and Vajda, I. (1987). Convex Statistical Distances, Teubner-Texte zur Mathematik, Band 95, Teubner, Leipzig.
  21. Lin, J. (1991). Divergence measures based on the Shannon entropy, IEEE Trans. Inform. Theory, 37, 145–151.
  22. Lindsay, B. G. (1994). Efficiency versus robustness: The case for minimum Hellinger distance and related methods, Ann. Statist., 22(2), 1081–1114.
  23. Matusita, K. (1955). Decision rules based on the distance for problems of fit, two samples and estimation, Ann. Math. Statist., 26, 631–640.
  24. Matusita, K. (1964). Distances and decision rules, Ann. Inst. Statist. Math., 16, 305–320.
  25. Morales, D., Pardo, L. and Vajda, I. (1995). Asymptotic divergence of estimates of discrete distributions, J. Statist. Plann. Inference, 47, 347–369.
  26. Morales, D., Pardo, L. and Vajda, I. (1996). Uncertainty of discrete stochastic systems: General theory and statistical inference, IEEE Trans. Systems, Man and Cybernetics, 26, 681–697.
  27. Österreicher, F. (1982). The construction of least favourable distributions is traceable to a minimal perimeter problem, Studia Sci. Math. Hungar., 17, 341–351.
  28. Österreicher, F. (1992). The risk set of a testing problem—A vivid statistical tool, Transactions of the Eleventh Prague Conference, Vol. A, 175–188, Academia, Prague.
  29. Österreicher, F. (1996). On a class of perimeter-type distances of probability distributions, Kybernetika, 32, 389–393.
  30. Österreicher, F. and Vajda, I. (1993). Statistical information and discrimination, IEEE Trans. Inform. Theory, 39(3), 1036–1039.
  31. Pak, R. J. (1996). Minimum Hellinger distance estimation in simple linear regression models; distribution and efficiency, Statist. Probab. Lett., 26, 263–269.
  32. Read, T. C. R. and Cressie, N. A. (1988). Goodness-of-Fit Statistics for Discrete Multivariate Data, Springer, New York.
  33. Reschenhofer, E. and Bomze, I. M. (1991). Length tests for goodness of fit, Biometrika, 78, 207–216.
  34. Tamura, R. D. and Boos, D. D. (1986). Minimum Hellinger distance estimation for multivariate location and covariance, J. Amer. Statist. Assoc., 81, 223–229.
  35. Vincze, I. (1981). On the concept and measure of information contained in an observation, Contributions to Probability (eds. J. Gani and V. F. Rohatgi), 207–214, Academic Press, New York.

Copyright information

© The Institute of Statistical Mathematics 2003

Authors and Affiliations

  • Ferdinand Österreicher, Institute of Mathematics, University of Salzburg, Salzburg, Austria
  • Igor Vajda, Institute of Information Theory and Automation, Academy of Sciences, Prague, Czech Republic
