Abstract
We propose an independence criterion based on the eigenspectrum of covariance operators in reproducing kernel Hilbert spaces (RKHSs), consisting of an empirical estimate of the Hilbert-Schmidt norm of the cross-covariance operator (we term this a Hilbert-Schmidt Independence Criterion, or HSIC). This approach has several advantages over previous kernel-based independence criteria. First, the empirical estimate is simpler than any other kernel dependence test, and requires no user-defined regularisation. Second, there is a clearly defined population quantity which the empirical estimate approaches in the large sample limit, with exponential convergence guaranteed between the two: this ensures that independence tests based on HSIC do not suffer from slow learning rates. Finally, we show in the context of independent component analysis (ICA) that the performance of HSIC is competitive with that of previously published kernel-based criteria, and of other recently published ICA methods.
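To make the "empirical estimate" concrete, here is a minimal sketch. The abstract does not spell out the estimator, so the sketch assumes the commonly used biased form trace(K H L H) / (m - 1)^2, where K and L are Gram matrices of the two samples and H = I - (1/m) 1 1^T is the centering matrix. The Gaussian kernel, the bandwidth `sigma`, the restriction to scalar samples, and all function names are illustrative assumptions, not taken from the paper.

```python
import math
import random


def gaussian_gram(xs, sigma=1.0):
    """Gram matrix K[i][j] = exp(-(x_i - x_j)^2 / (2 sigma^2)) for scalar samples."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) for b in xs] for a in xs]


def center(K):
    """Return H K H with H = I - (1/m) 1 1^T (double centering of a Gram matrix)."""
    m = len(K)
    row_means = [sum(row) / m for row in K]
    col_means = [sum(K[i][j] for i in range(m)) / m for j in range(m)]
    total_mean = sum(row_means) / m
    return [
        [K[i][j] - row_means[i] - col_means[j] + total_mean for j in range(m)]
        for i in range(m)
    ]


def hsic_biased(xs, ys, sigma=1.0):
    """Biased empirical HSIC (assumed form): trace(K H L H) / (m - 1)^2."""
    m = len(xs)
    Kc = center(gaussian_gram(xs, sigma))  # H K H
    L = gaussian_gram(ys, sigma)
    # By cyclicity of the trace, trace(K H L H) = trace((H K H) L).
    trace = sum(Kc[i][j] * L[j][i] for i in range(m) for j in range(m))
    return trace / (m - 1) ** 2


# Hypothetical toy data: one independent pair and one nonlinearly dependent pair.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y_indep = [rng.gauss(0.0, 1.0) for _ in range(200)]
y_dep = [a ** 2 + 0.1 * rng.gauss(0.0, 1.0) for a in x]

hsic_indep = hsic_biased(x, y_indep)
hsic_dep = hsic_biased(x, y_dep)
```

On this toy data the dependent pair should give a noticeably larger value than the independent pair, even though `x` and `y_dep` are nearly uncorrelated in the linear sense. Note that no regularisation parameter appears anywhere in the computation; the only free choice is the kernel and its bandwidth.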
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gretton, A., Bousquet, O., Smola, A., Schölkopf, B. (2005). Measuring Statistical Dependence with Hilbert-Schmidt Norms. In: Jain, S., Simon, H.U., Tomita, E. (eds) Algorithmic Learning Theory. ALT 2005. Lecture Notes in Computer Science, vol 3734. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11564089_7
DOI: https://doi.org/10.1007/11564089_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29242-5
Online ISBN: 978-3-540-31696-1
eBook Packages: Computer Science, Computer Science (R0)