Semi-supervised Laplacian Regularization of Kernel Canonical Correlation Analysis

  • Matthew B. Blaschko
  • Christoph H. Lampert
  • Arthur Gretton
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5211)

Abstract

Kernel canonical correlation analysis (KCCA) is a fundamental dimensionality reduction technique for paired data. By finding directions that maximize correlation in the feature space implied by the kernel, KCCA learns representations that are tied more closely to the underlying semantics of the data than to high-variance directions, which PCA finds but which may be the result of noise. However, meaningful directions are not only those that have high correlation with another modality, but also those that capture the manifold structure of the data. We propose a method that finds highly correlated directions that also lie along high-variance directions of the data manifold. This is achieved by adding semi-supervised Laplacian regularization to the formulation of KCCA, which has the further benefit that data for which the correspondence between modalities is unknown can be used to estimate the structure of the data manifold more robustly. We show experimentally on datasets of images and text that Laplacian regularized training improves class separation over KCCA with only Tikhonov regularization, while causing no degradation in the correlation between modalities. We also propose a model selection criterion based on the Hilbert-Schmidt norm of the semi-supervised Laplacian regularized cross-covariance operator, which can be computed in closed form.
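
As a rough illustration of the kind of objective the abstract describes, one way to write a KCCA problem with both Tikhonov and graph-Laplacian regularizers is sketched below. All notation is an assumption introduced here for illustration and is not taken from the text above: n paired samples; m_x and m_y total samples (paired plus unpaired) in each modality; centered kernel matrices K_{\hat{x}\hat{x}} (m_x × m_x, all x samples) and K_{\hat{x}x} = K_{x\hat{x}}^{\top} (m_x × n, all x samples against paired ones), and likewise for y; graph Laplacians L_{\hat{x}}, L_{\hat{y}} built on all samples of each modality; and regularization weights \varepsilon, \gamma.

```latex
\[
\max_{\alpha \in \mathbb{R}^{m_x},\; \beta \in \mathbb{R}^{m_y}}
\frac{\alpha^{\top} K_{\hat{x}x} K_{y\hat{y}}\, \beta}
     {\sqrt{\alpha^{\top}\bigl(K_{\hat{x}x} K_{x\hat{x}} + R_{\hat{x}}\bigr)\alpha}\;
      \sqrt{\beta^{\top}\bigl(K_{\hat{y}y} K_{y\hat{y}} + R_{\hat{y}}\bigr)\beta}}
\]
\[
R_{\hat{x}} = \varepsilon_x K_{\hat{x}\hat{x}}
            + \frac{\gamma_x}{m_x^{2}}\, K_{\hat{x}\hat{x}} L_{\hat{x}} K_{\hat{x}\hat{x}},
\qquad
R_{\hat{y}} = \varepsilon_y K_{\hat{y}\hat{y}}
            + \frac{\gamma_y}{m_y^{2}}\, K_{\hat{y}\hat{y}} L_{\hat{y}} K_{\hat{y}\hat{y}}
\]
```

In this sketch the \varepsilon terms are the Tikhonov regularizers, and the \gamma terms penalize projections that change quickly between neighboring points of the graph, which is where unpaired samples can contribute. With \gamma_x = \gamma_y = 0 and no unpaired data (m_x = m_y = n), the expression reduces to KCCA with only Tikhonov regularization.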

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Matthew B. Blaschko (1)
  • Christoph H. Lampert (1)
  • Arthur Gretton (1)
  1. Department of Empirical Inference, Max Planck Institute for Biological Cybernetics, Tübingen, Germany
