
Regularized Generalized Canonical Correlation Analysis

Abstract

Regularized generalized canonical correlation analysis (RGCCA) is a generalization of regularized canonical correlation analysis to three or more sets of variables. It constitutes a general framework for many multi-block data analysis methods, combining the power of those methods (maximization of well-identified criteria) with the flexibility of PLS path modeling (the researcher decides which blocks are connected and which are not). Searching for a fixed point of the stationary equations related to RGCCA yields a new monotonically convergent algorithm, very similar to the PLS algorithm proposed by Herman Wold. Finally, a practical example is discussed.
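
For orientation, the optimization problem behind RGCCA can be sketched as follows. The notation is not given in the abstract and is introduced here for illustration only: $X_1,\dots,X_J$ are centered blocks observed on the same individuals, $a_j$ is the weight vector of block $j$, $c_{jk}\in\{0,1\}$ records whether blocks $j$ and $k$ are connected, $g$ is the identity, the absolute value, or the square function, and $\tau_j\in[0,1]$ is a shrinkage constant.

```latex
\max_{a_1,\dots,a_J} \; \sum_{j \neq k} c_{jk}\, g\bigl(\operatorname{cov}(X_j a_j,\, X_k a_k)\bigr)
\quad \text{subject to} \quad
\tau_j \lVert a_j \rVert^2 + (1-\tau_j)\operatorname{var}(X_j a_j) = 1,
\qquad j = 1,\dots,J.
```

Under this reading, $\tau_j = 0$ fixes the variance of each block component (the correlation-based end of the family), $\tau_j = 1$ fixes the norm of the weight vector (the covariance-based end), and intermediate values shrink between the two.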



Author information

Corresponding author

Correspondence to Arthur Tenenhaus.

About this article

Cite this article

Tenenhaus, A., Tenenhaus, M. Regularized Generalized Canonical Correlation Analysis. Psychometrika 76, 257–284 (2011). https://doi.org/10.1007/s11336-011-9206-8

  • DOI: https://doi.org/10.1007/s11336-011-9206-8

Keywords

  • generalized canonical correlation analysis
  • multi-block data analysis
  • PLS path modeling
  • regularized canonical correlation analysis