Abstract
Three simple and explicit procedures for testing the independence of two multi-dimensional random variables are described. Two of the associated test statistics (L1, log-likelihood) are defined when the empirical distribution of the variables is restricted to finite partitions. A third test statistic is defined as a kernel-based independence measure. Each test rejects the null hypothesis of independence when its test statistic becomes large. The large deviation and limit distribution properties of all three test statistics are given. Following from these results, distribution-free strongly consistent tests of independence are derived, as are asymptotically α-level tests. The performance of the tests is evaluated experimentally on benchmark data.
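To make the three statistics concrete, the following is a minimal sketch, not the authors' implementation. It assumes an equal-width grid as the finite partition and a Gaussian kernel for the kernel-based measure, which is computed here as a biased Hilbert-Schmidt-type estimate. The function names (`cell_labels`, `partition_statistics`, `hsic_biased`) and the parameters `bins` and `sigma` are hypothetical choices made for illustration. The abstract does not state rejection thresholds, so none are shown.

```python
import numpy as np

def cell_labels(z, bins):
    """Map each row of z (n x d) to an integer cell of an equal-width grid.
    The grid is an illustrative choice of finite partition (assumption)."""
    z = np.asarray(z, dtype=float).reshape(len(z), -1)
    lo, hi = z.min(axis=0), z.max(axis=0)
    width = np.where(hi > lo, hi - lo, 1.0)
    idx = np.minimum(((z - lo) / width * bins).astype(int), bins - 1)
    # Collapse the d per-coordinate indices into one label per sample.
    _, labels = np.unique(idx, axis=0, return_inverse=True)
    return labels.ravel()

def partition_statistics(x, y, bins=4):
    """Return (L1 statistic, log-likelihood statistic) on the partition."""
    a, b = cell_labels(x, bins), cell_labels(y, bins)
    n = len(a)
    # Empirical joint distribution over the product cells.
    joint = np.zeros((a.max() + 1, b.max() + 1))
    np.add.at(joint, (a, b), 1.0 / n)
    # Product of the two empirical marginals.
    product = np.outer(joint.sum(axis=1), joint.sum(axis=0))
    l1 = np.abs(joint - product).sum()
    mask = joint > 0  # empty cells contribute zero to the log-likelihood sum
    loglik = 2.0 * np.sum(joint[mask] * np.log(joint[mask] / product[mask]))
    return l1, loglik

def hsic_biased(x, y, sigma=1.0):
    """Biased kernel dependence estimate trace(K H L H) / n^2,
    with Gaussian kernels of bandwidth sigma (assumption)."""
    def gram(z):
        z = np.asarray(z, dtype=float).reshape(len(z), -1)
        sq = np.sum((z[:, None, :] - z[None, :, :]) ** 2, axis=-1)
        return np.exp(-sq / (2.0 * sigma ** 2))
    n = len(x)
    h = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(gram(x) @ h @ gram(y) @ h) / n ** 2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(500, 2))
    y = x + 0.1 * rng.normal(size=(500, 2))  # strongly dependent on x
    print(partition_statistics(x, y), hsic_biased(x, y))
```

All three quantities are close to zero when the samples look independent and grow with dependence, which matches the rule that each test rejects for large values of its statistic.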
© 2008 Springer-Verlag Berlin Heidelberg
Cite this paper
Gretton, A., Györfi, L. (2008). Nonparametric Independence Tests: Space Partitioning and Kernel Approaches. In: Freund, Y., Györfi, L., Turán, G., Zeugmann, T. (eds) Algorithmic Learning Theory. ALT 2008. Lecture Notes in Computer Science, vol 5254. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87987-9_18
DOI: https://doi.org/10.1007/978-3-540-87987-9_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87986-2
Online ISBN: 978-3-540-87987-9
eBook Packages: Computer Science (R0)