Abstract
Entropy-type functionals that generalize Rényi entropy are widely used in mathematical statistics, physics, information theory, and signal processing to characterize uncertainty in probability distributions and in distribution identification problems. We consider estimators of some entropy (integral) functionals for discrete and continuous distributions, based on the number of ε-close vector records in the corresponding independent and identically distributed samples from two distributions. The proposed estimators are generalized U-statistics. We establish their asymptotic properties (e.g., consistency and asymptotic normality). The results can be applied to various problems in computer science and mathematical statistics (e.g., approximate matching for random databases, record linkage, image matching).
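To make the counting idea concrete, the following is a minimal sketch, not the chapter's general estimator. It assumes Euclidean distance and treats only the simplest bilinear functional, the integral of p_X(x) p_Y(x) (or the sum of p_X(k) p_Y(k) in the discrete case with ε = 0). The function names are hypothetical and chosen for illustration only.

```python
import math
import random


def count_close_pairs(xs, ys, eps):
    """Count pairs (x, y), x from xs and y from ys, with Euclidean distance <= eps."""
    return sum(1 for x in xs for y in ys if math.dist(x, y) <= eps)


def ball_volume(d, eps):
    """Volume of a d-dimensional Euclidean ball of radius eps."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1) * eps ** d


def cross_functional_estimate(xs, ys, eps):
    """Two-sample U-statistic sketch for the bilinear functional.

    The fraction of eps-close pairs estimates P(|X - Y| <= eps).
    Assumption: for eps > 0 (continuous case) we divide by the ball volume
    to approximate the integral of p_X * p_Y; for eps == 0 (discrete case)
    the fraction itself estimates the sum of p_X(k) * p_Y(k).
    """
    frac = count_close_pairs(xs, ys, eps) / (len(xs) * len(ys))
    if eps == 0:
        return frac
    return frac / ball_volume(len(xs[0]), eps)


if __name__ == "__main__":
    rng = random.Random(0)
    # Two independent samples from the uniform distribution on the unit square.
    xs = [(rng.random(), rng.random()) for _ in range(400)]
    ys = [(rng.random(), rng.random()) for _ in range(400)]
    print(cross_functional_estimate(xs, ys, eps=0.05))
```

For two uniform samples on the unit square the target value is 1; the printed estimate falls somewhat below it because ε-balls near the boundary extend outside the support. The brute-force pair count costs O(n1·n2) distance evaluations, which is adequate for a sketch but would normally be replaced by a spatial index for large samples.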
AMS 2000 subject classification: 94A15, 62G20
Keywords
- entropy estimation
- Rényi entropy
- U-statistics
- approximate matching
- asymptotic normality
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Källberg, D., Leonenko, N., Seleznjev, O. (2012). Statistical Inference for Rényi Entropy Functionals. In: Düsterhöft, A., Klettke, M., Schewe, KD. (eds) Conceptual Modelling and Its Theoretical Foundations. Lecture Notes in Computer Science, vol 7260. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28279-9_5
DOI: https://doi.org/10.1007/978-3-642-28279-9_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28278-2
Online ISBN: 978-3-642-28279-9
eBook Packages: Computer Science (R0)
