Abstract
The class \(I_{f_\beta}\), \(\beta \in (0, \infty]\), of \(f\)-divergences investigated in this paper is defined in terms of a class of entropies introduced by Arimoto (1971, Information and Control, 19, 181–194). It contains the squared Hellinger distance (for \(\beta = 1/2\)), the sum \(I(Q_1 \| (Q_1+Q_2)/2) + I(Q_2 \| (Q_1+Q_2)/2)\) of Kullback-Leibler divergences (for \(\beta = 1\)) and half of the variation distance (for \(\beta = \infty\)), and it continuously extends the class of squared perimeter-type distances introduced by Österreicher (1996, Kybernetika, 32, 389–393) (for \(\beta \in (1, \infty]\)). It is shown that \((I_{f_\beta}(Q_1, Q_2))^{\min(\beta, 1/2)}\) are distances of probability distributions \(Q_1, Q_2\) for \(\beta \in (0, \infty)\). The applicability of \(I_{f_\beta}\)-divergences in statistics is also considered. In particular, it is shown that the \(I_{f_\beta}\)-projections of appropriate empirical distributions to regular families define distribution estimates which are consistent in the case of an i.i.d. sample of size \(n\). The order of consistency is investigated as well.
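The special cases named in the abstract can be checked numerically for finite distributions. The sketch below is illustrative only and rests on two assumptions that the abstract does not state: the generator is taken to be \(f_\beta(u) = \frac{1}{1 - 1/\beta}\left[(1+u^\beta)^{1/\beta} - 2^{1/\beta - 1}(1+u)\right]\) for \(\beta \ne 1\) (with the limits \(f_1(u) = (1+u)\ln 2 + u\ln u - (1+u)\ln(1+u)\) and \(f_\infty(u) = |1-u|/2\)), and the divergence is taken in the convention \(I_f(Q_1, Q_2) = \sum_x q_2(x)\, f(q_1(x)/q_2(x))\). The function names `f_beta`, `divergence` and `distance` are hypothetical.

```python
import math


def f_beta(u, beta):
    """Assumed generator f_beta(u), u >= 0 (form not given in the abstract)."""
    if math.isinf(beta):
        # beta = infinity: generator of half the variation distance
        return abs(1.0 - u) / 2.0
    if beta == 1.0:
        # limit beta -> 1, with the convention 0 * ln 0 = 0
        u_log_u = u * math.log(u) if u > 0 else 0.0
        return (1.0 + u) * math.log(2.0) + u_log_u - (1.0 + u) * math.log(1.0 + u)
    bracket = (1.0 + u ** beta) ** (1.0 / beta) - 2.0 ** (1.0 / beta - 1.0) * (1.0 + u)
    return bracket / (1.0 - 1.0 / beta)


def divergence(q1, q2, beta):
    """I_{f_beta}(Q1, Q2) = sum_x q2(x) * f_beta(q1(x) / q2(x)) on a finite space."""
    total = 0.0
    for a, b in zip(q1, q2):
        if b > 0:
            total += b * f_beta(a / b, beta)
        elif a > 0:
            # q2(x) = 0 < q1(x): the term is a * lim f_beta(u)/u as u -> infinity,
            # which equals a * f_beta(0) because u * f_beta(1/u) = f_beta(u).
            total += a * f_beta(0.0, beta)
    return total


def distance(q1, q2, beta):
    """The power (I_{f_beta})^{min(beta, 1/2)} for finite beta, as in the abstract."""
    if math.isinf(beta) or beta <= 0:
        raise ValueError("beta must lie in (0, infinity)")
    # Clamp tiny negative values caused by floating-point cancellation.
    return max(0.0, divergence(q1, q2, beta)) ** min(beta, 0.5)
```

With these assumptions, `divergence(q1, q2, 0.5)` should agree with the squared Hellinger distance \(\sum_x (\sqrt{q_1(x)} - \sqrt{q_2(x)})^2\), `divergence(q1, q2, 1.0)` with the sum of the two Kullback-Leibler divergences to the midpoint \((Q_1+Q_2)/2\), and `divergence(q1, q2, math.inf)` with half of \(\sum_x |q_1(x) - q_2(x)|\). Very large finite values of `beta` may overflow in `u ** beta`; the sketch does not guard against that.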
References
Ali, S. M. and Silvey, S. D. (1966). A general class of coefficients of divergence of one distribution from another, J. Roy. Statist. Soc. Ser. B, 28, 131–142.
Arimoto, S. (1971). Information-theoretical considerations on estimation problems, Information and Control, 19, 181–194.
Barron, A. R., Győrfi, L. and van der Meulen, E. (1992). Distribution estimates consistent in total variation and in two types of information divergence, IEEE Trans. Inform. Theory, 38, 1437–1454.
Basu, A. and Lindsay, B. G. (1994). Minimum disparity estimation for continuous models: Efficiency, distribution and robustness, Ann. Inst. Statist. Math., 46, 683–705.
Beran, R. (1977). Minimum Hellinger distance estimates for parametric models, Ann. Statist., 5, 445–463.
Beran, R. (1978). An efficient and robust adaptive estimator of location, Ann. Statist., 6, 292–313.
Berlinet, A., Devroye, L. and Győrfi, L. (1995). Asymptotic normality of \(L_1\)-error in density estimation, Statistics, 26, 329–343.
Berlinet, A., Vajda, I. and van der Meulen, E. (1997). About the asymptotic accuracy of Barron density estimator, IEEE Trans. Inform. Theory (submitted).
Boekee, D. E. (1977). A Generalization of the Fisher Information Measure, Delft University Press, Delft.
Csiszár, I. (1963). Eine informationstheoretische Ungleichung und ihre Anwendung auf den Beweis der Ergodizität von Markoffschen Ketten, Publ. Math. Inst. Hungar. Acad. Sci., 8, 85–107.
Csiszár, I. (1975). I-divergence geometry of probability distributions and minimization problems, Ann. Probab., 3(1), 146–158.
Csiszár, I. and Fischer, J. (1962). Informationsentfernungen im Raum der Wahrscheinlichkeitsverteilungen, Magyar Tud. Akad. Mat. Kutató Int. Közl., 7, 159–180.
Cutler, A. and Cordero-Brana, O. I. (1996). Minimum Hellinger distance estimation for finite mixture models, J. Amer. Statist. Assoc., 91, 1716–1723.
Devroye, L. and Győrfi, L. (1990). No empirical measure can converge in total variation sense for all distributions, Ann. Statist., 18, 1496–1499.
Feldman, D. and Österreicher, F. (1989). A note on \(f\)-divergence, Studia Sci. Math. Hungar., 24, 191–200.
Győrfi, L., Vajda, I. and van der Meulen, E. (1994). Minimum Hellinger distance point estimates consistent under weak family regularity, Math. Methods Statist., 3, 25–45.
Győrfi, L., Vajda, I. and van der Meulen, E. (1996). Minimum Kolmogorov distance estimates of parameters and parametrized distributions, Metrika, 43, 237–255.
Hall, P. and Patil, P. (1995). Formulae for mean integrated squared error of nonlinear wavelet-based density estimators, Ann. Statist., 23(3), 905–928.
Kafka, P., Österreicher, F. and Vincze, I. (1991). On powers of \(f\)-divergences defining a distance, Studia Sci. Math. Hungar., 26, 415–422.
Liese, F. and Vajda, I. (1987). Convex Statistical Distances, Teubner-Texte zur Mathematik, Band 95, Teubner, Leipzig.
Lin, J. (1991). Divergence measures based on the Shannon entropy, IEEE Trans. Inform. Theory, 37, 145–151.
Lindsay, B. G. (1994). Efficiency versus robustness: The case for minimum Hellinger and related methods, Ann. Statist., 22(2), 1081–1114.
Matusita, K. (1955). Decision rules based on the distance for problems of fit, two samples and estimation, Ann. Math. Statist., 26, 631–640.
Matusita, K. (1964). Distances and decision rules, Ann. Inst. Statist. Math., 16, 305–320.
Morales, D., Pardo, L. and Vajda, I. (1995). Asymptotic divergence of estimates of discrete distributions, J. Statist. Plann. Inference, 47, 347–369.
Morales, D., Pardo, L. and Vajda, I. (1996). Uncertainty of discrete stochastic systems: General theory and statistical inference, IEEE Trans. Systems, Man and Cybernetics, 26, 681–697.
Österreicher, F. (1982). The construction of least favourable distributions is traceable to a minimal perimeter problem, Studia Sci. Math. Hungar., 17, 341–351.
Österreicher, F. (1992). The risk set of a testing problem—A vivid statistical tool, Transactions of the Eleventh Prague Conference, Vol. A, 175–188, Academia, Prague.
Österreicher, F. (1996). On a class of perimeter-type distances of probability distributions, Kybernetika, 32, 389–393.
Österreicher, F. and Vajda, I. (1993). Statistical information and discrimination, IEEE Trans. Inform. Theory, 39(3), 1036–1039.
Pak, R. J. (1996). Minimum Hellinger distance estimation in simple linear regression models; distribution and efficiency, Statist. Probab. Lett., 26, 263–269.
Read, T. C. R. and Cressie, N. A. (1988). Goodness-of-Fit Statistics for Discrete Multivariate Data, Springer, New York.
Reschenhofer, E. and Bomze, I. M. (1991). Length tests for goodness of fit, Biometrika, 78, 207–216.
Tamura, R. D. and Boos, D. D. (1986). Minimum Hellinger distance estimation for multivariate location and covariance, J. Amer. Statist. Assoc., 81, 223–229.
Vincze, I. (1981). On the concept and measure of information contained in an observation, Contributions to Probability (eds. J. Gani and V. F. Rohatgi), 207–214, Academic Press, New York.
Additional information
Supported by the EC grant Copernicus 579.
About this article
Cite this article
Österreicher, F., Vajda, I. A new class of metric divergences on probability spaces and its applicability in statistics. Ann Inst Stat Math 55, 639–653 (2003). https://doi.org/10.1007/BF02517812
DOI: https://doi.org/10.1007/BF02517812