Abstract
In canonical correlation analysis (CCA), the substantive interpretation of the canonical variates is of primary interest to applied researchers. Two interpretive approaches are in use: the weight-based approach and the loading-based approach, of which the latter is favored by the majority of researchers in practice. Researchers who choose the loading-based approach and apply CCA simultaneously to multiple samples may wish to test the invariance of the canonical loadings. In this paper, three covariance structure models are defined for CCA. In particular, the first model (the CCA-W model) corresponds directly to regular CCA and has the canonical correlations and canonical weights as its parameters, while the third model (the CCA-L model) aligns with the loading-based interpretive approach and has the canonical correlations and canonical loadings as its parameters. The CCA-L model is further extended to the unrestricted and restricted SCCA-L models, of which the latter allows one to test the invariance of the canonical loadings. A real example drawn from the sociological literature illustrates the restricted SCCA-L model, and some strategies for calculating good starting values for this model are discussed.
Notes
When the CCA-L model is fitted to a sample covariance matrix, the estimated scaling parameters in D are identical to the usual sample standard deviations. When a correlation matrix is analyzed instead, D becomes an identity matrix.
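The scaling relation behind this note can be illustrated numerically. The sketch below does not fit the CCA-L model; it only assumes that D enters as a diagonal scaling of a correlation-type structure, so that a covariance matrix factors as S = D R D. The data and variable names are hypothetical.

```python
import numpy as np

# Hypothetical data: 200 cases on 4 variables with unequal scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4)) * np.array([1.0, 2.0, 0.5, 3.0])

S = np.cov(X, rowvar=False)      # sample covariance matrix (divisor n - 1)
sd = np.sqrt(np.diag(S))         # usual sample standard deviations
D = np.diag(sd)                  # diagonal scaling matrix
R = S / np.outer(sd, sd)         # sample correlation matrix

# Covariance input: the scaling matrix carries the standard deviations.
assert np.allclose(D @ R @ D, S)

# Correlation input: every variable has unit variance, so the scaling is I.
D_from_R = np.diag(np.sqrt(np.diag(R)))
print(np.round(sd, 3))
print(D_from_R)
```

Because R has a unit diagonal, the same construction applied to R returns an identity scaling matrix, which matches the statement about correlation-matrix input.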
References
Alpert MI, Peterson RA (1972) On the interpretation of canonical analysis. J Mark Res 9:187–192
Amemiya Y, Anderson TW (1990) Asymptotic chi-square tests for a large class of factor analysis models. Ann Stat 18:1453–1463
Anderson TW (2003) An introduction to multivariate statistical analysis, 3rd edn. Wiley, New York
Barcikowski RS, Stevens JP (1975) A Monte Carlo study of the stability of canonical correlations, canonical weights and canonical variate-variable correlations. Multivar Behav Res 10:353–364
Bentler PM, Huba GJ (1982) Symmetric and asymmetric rotations in canonical correlation analysis: new methods with drug variable examples. In: Hirschberg N, Humphreys LG (eds) Multivariate applications in the social sciences. Erlbaum, Hillsdale, pp 21–46
Bishop GD, Tong EMW, Diong SM, Enkelmann HC, Why YP (2001) The relationship between coping and personality among police officers in Singapore. J Res Pers 35:353–374
Bolton B (1986) Canonical relationships between vocational interests and personality of adult handicapped persons. Rehabil Psychol 31:169–182
Browne MW (1982) Covariance structures. In: Hawkins DM (ed) Topics in applied multivariate analysis. Cambridge University Press, Cambridge, pp 72–141
Browne MW (1984) Asymptotically distribution-free methods for the analysis of covariance structures. Br J Math Stat Psychol 37:62–83
Browne MW (1987) Robustness of statistical inference in factor analysis and related models. Biometrika 74:375–384
Browne MW, Shapiro A (1988) Robustness of normal theory methods in the analysis of linear latent variate models. Br J Math Stat Psychol 41:193–208
Buchanan T (1983) Toward an understanding of variability in satisfactions within activities. J Leis Res 15:39–51
Christensen JE (1983) An exposition of canonical correlation in leisure research. J Leis Res 15:311–322
Cohen P, Gaughran E, Cohen J (1979) Age patterns of childbearing: a canonical analysis. Multivar Behav Res 14:75–89
Collins KM, Killough LN (1992) An empirical examination of stress in public accounting. Account Organ Soc 17:535–547
Cooley WW (1967) Interactions among interests, abilities, and career plans. J Appl Psychol 51:1–16
Cooley WW, Lohnes PR (1971) Multivariate data analysis. Wiley, New York
Cudeck R (1989) Analysis of correlation matrices using covariance structure models. Psychol Bull 105:317–327
Darlington RB, Weinberg SL, Walberg HJ (1973) Canonical variate analysis and related techniques. Rev Educ Res 43:433–454
Das S, Sen PK (1994) Restricted canonical correlations. Linear Algebr Appl 210:29–47
DeSarbo WS, Hausman RE, Lin S, Thompson W (1982) Constrained canonical correlation. Psychometrika 47:489–516
Dunham RB (1977) Reactions to job characteristics: moderating effects of the organization. Acad Manag J 20:42–65
Dunham RB, Kravetz DJ (1975) Canonical correlation analysis in a predictive system. J Exp Educ 43:35–42
Dunteman GH, Bailey JP Jr (1967) A canonical correlational analysis of the strong vocational interest blank and the Minnesota multiphasic personality inventory for a female college population. Educ Psychol Measur 27:631–642
Elkins J (1973) Relationships between the WISC and the ITPA—a multivariate analysis. Slow Learn Child 20:147–153
Estabrook GE (1984) A canonical correlation analysis of the Wechsler Intelligence Scale for Children-Revised and the Woodcock-Johnson Tests of Cognitive Ability in a sample referred for suspected learning disabilities. J Educ Psychol 76:1170–1177
Fornell C, Larcker DF (1980) The use of canonical correlation analysis in accounting research. J Bus Financ Acc 7:455–473
Fouladi RT (2000) Performance of modified test statistics in covariance and correlation structure analysis under conditions of multivariate nonnormality. Struct Equ Model 7:356–410
Gittins R (1985) Canonical analysis: a review with applications in ecology. Springer, Berlin
Goria MN, Flury BD (1996) Common canonical variates in k independent groups. J Am Stat Assoc 91:1735–1742
Green PE, Halbert MH, Robinson PJ (1966) Canonical analysis: an exposition and illustrative application. J Mark Res 3:32–39
Greenacre MJ (1984) Theory and applications of correspondence analysis. Academic Press, London
Gu F (2016) Analysis of correlation matrices using scale-invariant common principal component models and a hierarchy of relationships between correlation matrices. Struct Equ Model 23:819–826
Harris RJ (1989) A canonical cautionary. Multivar Behav Res 24:17–39
Harris RJ (2001) A primer of multivariate statistics, 3rd edn. Lawrence Erlbaum Associates Inc., Mahwah
Hodge RW, Treiman DJ (1968) Social participation and social status. Am Sociol Rev 33:722–740
Hotelling H (1936) Relations between two sets of variates. Biometrika 28:321–377
Huba GJ, Wingard JA, Bentler PM (1979) Beginning adolescent drug use and peer and adult interaction patterns. J Consult Clin Psychol 47:265–276
Huba GJ, Wingard JA, Bentler PM (1981) Intentions to use drugs among adolescents: a longitudinal analysis. Int J Addict 16:331–339
Humphries-Wadsworth TM (1998) Features of published analyses of canonical results. Paper presented at the annual meeting of the American Educational Research Association, San Diego, CA (ERIC Document Reproduction Service No. ED 418125)
Johnson RA, Wichern DW (2007) Applied multivariate statistical analysis, 6th edn. Pearson Education Inc., Upper Saddle River
Jolliffe IT (2002) Principal component analysis, 2nd edn. Springer, New York
Kiers HAL, ten Berge JMF (1994) Hierarchical relations between methods for simultaneous component analysis and a technique for rotation to simple simultaneous structure. Br J Math Stat Psychol 47:109–126
Krane WR, McDonald RP (1978) Scale invariance and the factor analysis of correlation matrices. Br J Math Stat Psychol 31:218–228
Kuylen AAA, Verhallen TMM (1981) The use of canonical analysis. J Econ Psychol 1:217–237
Lambert ZV, Durand RM (1975) Some precautions in using canonical analysis. J Mark Res 12:468–475
Lee S-Y, Fong W-K (1983) A scale invariant model for three-mode factor analysis. Br J Math Stat Psychol 36:217–223
Lee R, McCabe DJ, Graham WK (1983) Multivariate relationships between job characteristics and job satisfaction in the public sector: a triple cross-validation study. Multivar Behav Res 18:47–62
Levine MS (1977) Canonical analysis and factor comparison. Sage Publications Inc, Beverly Hills
Liu J, Drane W, Liu X, Wu T (2009) Examination of the relationships between environmental exposures to volatile organic compounds and biochemical liver tests: application of canonical correlation analysis. Environ Res 109:193–199
Marcoulides GA, Chin W (2013) You write, but others read: common methodological misunderstandings in PLS and related methods. In: Abdi H et al (eds) New perspectives in partial least squares and related methods, Springer proceedings in mathematics & statistics 56. Springer, New York, pp 31–64
McDonald RP, Parker PM, Ishizuka T (1993) A scale-invariant treatment for recursive path models. Psychometrika 58:431–443
McLeskey J, Kandaswamy S, Colarusso R (1980) A canonical correlation analysis of the WISC and ITPA for a group of learning-disabled children. J Spec Educ 14:253–259
Meredith W (1964) Canonical correlations with fallible data. Psychometrika 29:55–65
Meredith W, Tisak J (1982) Canonical analysis of longitudinal and repeated measures data with stationary weights. Psychometrika 47:47–67
Neuenschwander BE, Flury BD (1995) Common canonical variates. Biometrika 82:553–560
Oh HC, Uysal M, Weaver PA (1995) Product bundles and market segments based on travel motivations: a canonical correlation approach. Int J Hosp Manag 14:123–137
Olsson UH, Foss T, Troye SV, Howell RD (2000) The performance of ML, GLS, and WLS estimation in structural equation modeling under conditions of misspecification and nonnormality. Struct Equ Model 7:557–595
Patel S, Long TE, McCammon SL, Wuensch KL (1995) Personality and emotional correlates of self-reported antigay behaviors. J Interpers Violence 10:354–366
Perreault WD, Spiro RL (1978) An approach for improved interpretation of multivariate analysis. Decis Sci 9:402–413
Pyo S, Mihalik BJ, Uysal M (1989) Attraction attributes and motivations: a canonical correlation analysis. Ann Tour Res 16:277–282
Raykov T, Marcoulides GA (2008) An introduction to applied multivariate analysis. Routledge/Taylor & Francis Group, New York
Razavi AR, Gill H, Stal O, Sundquist M, Thorstenson S, Ahlfeldt H et al (2005) Exploring cancer register data to find risk factors for recurrence of breast cancer—application of canonical correlation analysis. BMC Med Inform Decis Mak 5:1–7
Rencher AC (1988) On the use of correlations to interpret canonical functions. Biometrika 75:363–365
Rencher AC (1992) Interpretation of canonical discriminant functions, canonical variates, and principal components. Am Stat 46:217–225
Rencher AC, Christensen WF (2012) Methods of multivariate analysis, 3rd edn. Wiley, Hoboken
Satomura H, Adachi K (2013) Oblique rotation in canonical correlation analysis reformulated as maximizing the generalized coefficient of determination. Psychometrika 78:526–537
Satorra A, Bentler PM (1994) Corrections to test statistics and standard errors in covariance structure analysis. In: von Eye A, Clogg CC (eds) Latent variables analysis: applications for developmental research. Sage, Thousand Oaks, pp 399–419
Sims HP, Szilagyi AD (1976) Job characteristic relationships: individual and structural moderators. Organ Behav Hum Perform 17:211–230
Tatsuoka MM (1971) Multivariate analysis: techniques for educational and psychological research. Wiley, New York
Tatsuoka MM (1973) Multivariate analysis in educational research. In: Kerlinger F (ed) Review of research in education. American Educational Research Association/Sage Publications Inc, Beverly Hills, pp 273–319
Thompson B (1980) Canonical correlation: recent extensions for modelling educational processes. Paper presented at the annual meeting of the American Educational Research Association, Boston, MA (ERIC Document Reproduction Service No. ED 199269)
Thompson B (1984) Canonical correlation analysis: use and interpretations. Sage Publications Inc, Beverly Hills
Thompson B (1987) Fundamentals of canonical correlation analysis: basics and three common fallacies in interpretation. Paper presented at the annual meeting of the Society for Multivariate Experimental Psychology, Southwestern Division, New Orleans, LA (ERIC Document Reproduction Service No. ED 282904)
Thompson B (1988) Canonical correlation analysis: an explanation with comments on correct practice. Paper presented at the annual meeting of the American Educational Research Association, New Orleans, LA (ERIC Document Reproduction Service No. ED 295957)
Thompson B (1990) Variable importance in multiple regression and canonical correlation. Paper presented at the annual meeting of the American Educational Research Association, Boston, MA (ERIC Document Reproduction Service No. ED 317615)
Thompson B (1991a) A primer on the logic and use of canonical correlation analysis. Measur Eval Couns Dev 24:80–95
Thompson B (1991b) Invariance of multivariate results: a Monte Carlo study of canonical function and structure coefficients. J Exp Educ 59:367–382
Thompson B (2005) Canonical correlation analysis. In: Everitt BS, Howell DC (eds) Encyclopedia of statistics in behavioral science. Wiley, Chichester, pp 192–196
Thompson B, Rucker R (1980) Two-year college students’ goals and programs preferences. J Coll Stud Pers 21:393–398
Thorndike RM (1976) Studying canonical analysis: comments on Barcikowski and Stevens. Multivar Behav Res 11:249–253
Thorndike RM (2000) Canonical correlation analysis. In: Tinsley HEA, Brown SD (eds) Handbook of applied multivariate statistics and mathematical modeling. Academic Press, San Diego, pp 237–263
Thorndike RM, Weiss DJ (1973) A study of the stability of canonical correlations and canonical components. Educ Psychol Measur 33:123–134
Thorndike RM, Weiss DJ, Dawis RV (1968a) Canonical correlation of vocational interests and vocational needs. J Counsel Psychol 15:101–106
Thorndike RM, Weiss DJ, Dawis RV (1968b) Multivariate relationships between a measure of vocational interests and a measure of vocational needs. J Appl Psychol 52:491–496
Veldman DJ (1967) Fortran programming for the behavioral sciences. Holt, Rinehart & Winston, New York
Weiss DJ (1972) Canonical correlation analysis in counseling psychology research. J Counsel Psychol 19:241–252
Wingard JA, Huba GJ, Bentler PM (1979) The relationship of personality structure to patterns of adolescent substance use. Multivar Behav Res 14:131–143
Wood DA, Erskine JA (1976) Strategies in canonical correlation with application to behavioral data. Educ Psychol Measur 36:861–878
Wu H (in press) Approximations to the distribution of test statistic in covariance structure analysis: a comprehensive study. Br J Math Stat Psychol
Yuan K-H, Bentler PM (1999a) F tests for mean and covariance structure analysis. J Educ Behav Stat 24:225–243
Yuan K-H, Bentler PM (1999b) On normal theory and associated test statistics in covariance structure analysis under two classes of nonnormal distributions. Stat Sin 9:831–853
Yuan K-H, Hayashi K (2010) Fitting data to model: structural equation modeling diagnosis using two scatter plots. Psychol Methods 15:335–351
Yuan K-H, Zhang Z (2012) Robust structural equation modeling with missing data and auxiliary variables. Psychometrika 77:803–826
Yuan K-H, Zhong X (2008) Outliers, leverage observations and influential cases in factor analysis: minimizing their effect using robust procedures. Sociol Methodol 38:329–368
Yuan K-H, Tian Y, Yanagihara H (2015) Empirical correction to the likelihood ratio statistic for structural equation modeling with many variables. Psychometrika 80:379–405
Ethics declarations
Funding
On behalf of all the authors, the corresponding author states that there is no conflict of interest.
Additional information
Communicated by Haruhiko Ogasawara.
Appendix
In the SCCA-L model, we denote the pair of canonical variates in the kth group as \( {\mathbf{Y}}_{1}^{(k)} = \left( {{\mathbf{A}}_{1}^{(k)} } \right)^{\prime } {\mathbf{X}}_{1}^{(k)} \) and \( {\mathbf{Y}}_{2}^{(k)} = \left( {{\mathbf{A}}_{2}^{(k)} } \right)^{\prime } {\mathbf{X}}_{2}^{(k)} \), with individual elements \( y_{1i} \) and \( y_{2i} \), i = 1, 2, …, p. We also denote \( {\mathbf{Y}}_{3}^{(k)} = \left( {{\mathbf{A}}_{3}^{(k)} } \right)^{\prime } {\mathbf{X}}_{2}^{(k)} \), with individual elements \( y_{3j} \), j = 1, 2, …, d. Define \( {\mathbf{Y}}^{(k)} = \left[ {\begin{array}{*{20}c} {\left( {{\mathbf{Y}}_{1}^{(k)} } \right)^{\prime } } & {\left( {{\mathbf{Y}}_{2}^{(k)} } \right)^{\prime } } & {\left( {{\mathbf{Y}}_{3}^{(k)} } \right)^{\prime } } \\ \end{array} } \right]^{\prime } \). The observed variables \( {\mathbf{X}}^{(k)} \) can then be written as
\[ {\mathbf{X}}^{(k)} = {\mathbf{B}}^{(k)} {\mathbf{Y}}^{(k)} . \]
Note that to simplify the notation, we allow \( {\mathbf{B}}^{(k)} \) to absorb the scaling diagonal matrix \( {\mathbf{D}}^{(k)} \) in Eq. (7).
Proposition 1
If the canonical variates are allowed to have free variance parameters (which may vary across groups) in the restricted SCCA-L model, and the model is identified by fixing one loading in each column of the loading matrix, then the test statistic \( T = (N - K)\hat{F} \) has the same \( \chi^{2} \) distribution as under the normality assumption, as long as different pairs of canonical variates are independent of each other and are independent of \( {\mathbf{Y}}_{3}^{(k)} \).
Proof
We change the identification condition of \( {\mathbf{Y}}_{3}^{(k)} \) so that \( {\mathbf{Y}}_{3}^{(k)} \) has a saturated covariance matrix, while \( {\mathbf{B}}_{3}^{(k)} \) contains an identity-matrix block. We denote the Jacobian matrix of the covariance structure by
where \( {\varvec{\Delta}}_{k}^{(k)} = \frac{{\partial \text{vec} {\varvec{\Sigma}}^{(k)} }}{{\partial {\varvec{\upvarphi }}^{(k)\prime } }} \), k = 1,2,…,K, \( {\varvec{\upvarphi }}^{(k)} \) is a \( (3p + d(d + 1)/2) \times 1 \) vector that involves the 2p variances and p covariances of the p pairs of canonical variates in the kth group and the d(d + 1)/2 non-duplicated elements of the covariance matrix of \( {\mathbf{Y}}_{3}^{(k)} \), and \( {\varvec{\Delta}}_{0}^{(k)} \) is the derivative w.r.t. all other parameters in the model.
The derivative of the discrepancy function in (8) is
where \( {\mathbf{s}}^{(k)} = \text{vec} {\mathbf{S}}^{(k)} \), \( {\varvec{\upsigma}}^{(k)} = \text{vec} {\varvec{\Sigma}}^{(k)} \), s and σ are long vectors of length \( K(2p + d)^{2} \) that stack \( {\mathbf{s}}^{(k)} \) and \( {\varvec{\upsigma}}^{(k)} \) across the K groups, V is a block-diagonal matrix with blocks \( \frac{{n_{k} - 1}}{{2\left( {N - K} \right)}}\left( {{\varvec{\Sigma}}^{(k)} \otimes {\varvec{\Sigma}}^{(k)} } \right)^{ - 1} \), and \( {\varvec{\Delta}} \) is defined as
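Similar to the single-group situation, standard asymptotic derivations give the expansion of the test statistic
\[ T = (N - K)\left( {{\mathbf{s}} - {\varvec{\upsigma}}_{0} } \right)^{\prime } {\mathbf{U}}\left( {{\mathbf{s}} - {\varvec{\upsigma}}_{0} } \right) + o_{p} (1), \]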
where the asymptotic framework allows \( N \to \infty \) but fixes the proportions \( \frac{{n_{k} - 1}}{N - K} \) in each of the groups. The matrix U is defined as \( {\mathbf{U}} = {\mathbf{V}} - {\mathbf{V}}{\varvec{\Delta}}\left( {{\varvec{\Delta}}^{\prime } {\mathbf{V}}{\varvec{\Delta}}} \right)^{ - 1} {\varvec{\Delta}}^{\prime } {\mathbf{V}} \), evaluated at its population value. The vector \( {\varvec{\upsigma}}_{0} \) is σ evaluated at the population values under the correctly specified model.
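When data are nonnormally distributed but have finite fourth-order moments, Browne and Shapiro (1988, Eq. 2.7) have shown that, for a single group, the asymptotic covariance matrix of \( \sqrt {n_{k} - 1} \left( {{\mathbf{s}}^{(k)} - {\varvec{\upsigma}}_{0}^{(k)} } \right) \) is given by
\[ {\varvec{\Gamma}}^{(k)} = {\varvec{\Gamma}}_{N}^{(k)} + \sum\limits_{i = 1}^{p} {\left( {{\varvec{\Lambda}}_{i}^{(k)} \otimes {\varvec{\Lambda}}_{i}^{(k)} } \right){\mathbf{C}}_{i}^{(k)} \left( {{\varvec{\Lambda}}_{i}^{(k)} \otimes {\varvec{\Lambda}}_{i}^{(k)} } \right)^{\prime } } + \left( {{\varvec{\Lambda}}_{0}^{(k)} \otimes {\varvec{\Lambda}}_{0}^{(k)} } \right){\mathbf{C}}_{0}^{(k)} \left( {{\varvec{\Lambda}}_{0}^{(k)} \otimes {\varvec{\Lambda}}_{0}^{(k)} } \right)^{\prime } , \]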
where \( {\varvec{\Gamma}}_{N}^{(k)} \) is the asymptotic covariance matrix under the normal assumption, \( {\mathbf{C}}_{i}^{(k)} \) (i = 1, 2, …, p) is the \( 4 \times 4 \) cumulant matrix of the ith pair of canonical variates, \( {\varvec{\Lambda}}_{i}^{(k)} \) (i = 1, 2, …, p) is the (2p + d) × 2 loading matrix that relates the manifest variables to the ith pair of canonical variates, \( {\mathbf{C}}_{0}^{(k)} \) is the \( d^{2} \times d^{2} \) cumulant matrix of \( {\mathbf{Y}}_{3}^{(k)} \), and \( {\varvec{\Lambda}}_{0}^{(k)} \) is the (2p + d) × d loading matrix on \( {\mathbf{Y}}_{3}^{(k)} \). For K groups, the asymptotic covariance matrix \( {\varvec{\Gamma}} \) of the long vector \( \sqrt {N - K} \left( {{\mathbf{s}} - {\varvec{\upsigma}}_{0} } \right) \) is a block-diagonal matrix with \( {\varvec{\Gamma}}^{(k)} \) as blocks: \( {\varvec{\Gamma}} = {\varvec{\Gamma}}_{N} + \sum\limits_{i = 1}^{p} {{\mathbf{L}}_{i} {\mathbf{C}}_{i} {\mathbf{L^{\prime}}}_{i} } + {\mathbf{L}}_{0} {\mathbf{C}}_{0} {\mathbf{L^{\prime}}}_{0} \), where \( {\varvec{\Gamma}}_{N} \) and \( {\mathbf{C}}_{i} \) are block-diagonal matrices with \( {\varvec{\Gamma}}_{N}^{(k)} \) and \( {\mathbf{C}}_{i}^{(k)} \) as blocks and \( {\mathbf{L}}_{i} \) (i = 0, 1, …, p) is a block-diagonal matrix with \( {\varvec{\Lambda}}_{i}^{(k)} \otimes {\varvec{\Lambda}}_{i}^{(k)} \) as blocks. Asymptotically, the test statistic is distributed as a weighted sum of \( \chi^{2} \) variates, each with one degree of freedom, where the weights are given by the eigenvalues of the matrix \( {\mathbf{U}}{\varvec{\Gamma}} \). To prove asymptotic robustness, we only need to show \( {\mathbf{U}}{\varvec{\Gamma }} = {\mathbf{U}}{\varvec{\Gamma }}_{N} \), for which it suffices to show \( {\mathbf{UL}}_{i} = {\mathbf{0}} \) for i = 0, 1, …, p.
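The role of the eigenvalues of \( {\mathbf{U}}{\varvec{\Gamma}} \) can be checked numerically in a much simpler setting than the SCCA-L model. The sketch below assumes a single group, a hypothetical toy model in which the covariance matrix is diagonal with free variances, and the standard normal-theory forms \( {\varvec{\Gamma}}_{N} = 2{\mathbf{N}}_{p} ({\varvec{\Sigma}} \otimes {\varvec{\Sigma}}) \) and \( {\mathbf{V}} = \tfrac{1}{2}({\varvec{\Sigma}} \otimes {\varvec{\Sigma}})^{-1} \), where \( {\mathbf{N}}_{p} \) is the symmetrizer matrix. None of these choices comes from the paper's example.

```python
import numpy as np

def commutation(p):
    """Commutation matrix Kpp with Kpp @ vec(A) = vec(A.T), column-major vec."""
    Kpp = np.zeros((p * p, p * p))
    for i in range(p):
        for j in range(p):
            Kpp[i + j * p, j + i * p] = 1.0
    return Kpp

p = 3
Sigma = np.diag([1.0, 2.0, 0.5])                 # hypothetical population covariance
N_p = 0.5 * (np.eye(p * p) + commutation(p))     # symmetrizer
SxS = np.kron(Sigma, Sigma)
Gamma_N = 2.0 * N_p @ SxS                        # normal-theory covariance of sqrt(n) vec(S)
V = 0.5 * np.linalg.inv(SxS)                     # normal-theory weight matrix

# Jacobian of vec(Sigma) with respect to the p free variances of the toy model.
Delta = np.zeros((p * p, p))
for i in range(p):
    E = np.zeros((p, p))
    E[i, i] = 1.0
    Delta[:, i] = E.reshape(-1, order="F")

U = V - V @ Delta @ np.linalg.solve(Delta.T @ V @ Delta, Delta.T @ V)

# Weights of the chi-square mixture: eigenvalues of U @ Gamma.
weights = np.sort(np.linalg.eigvals(U @ Gamma_N).real)[::-1]
df = p * (p + 1) // 2 - p                        # distinct moments minus free parameters
print(np.round(weights, 6), df)
```

Under normality the nonzero weights all equal one and their number equals the degrees of freedom, so the weighted sum reduces to an ordinary \( \chi^{2} \) variate. Showing \( {\mathbf{U}}{\varvec{\Gamma}} = {\mathbf{U}}{\varvec{\Gamma}}_{N} \) therefore carries the same reference distribution over to the nonnormal case.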
Note that the derivative of \( {\varvec{\upsigma}}^{(k)} \) with respect to the half-vectorized covariance matrix of the ith pair of canonical variates (or with respect to that of \( {\mathbf{Y}}_{3}^{(k)} \)) is \( \left( {{\varvec{\Lambda}}_{i}^{(k)} \otimes {\varvec{\Lambda}}_{i}^{(k)} } \right){\mathbf{K}}\left( {{\mathbf{K^{\prime}K}}} \right)^{ - 1} \), where \( {\mathbf{K}} \) is the \( 4 \times 3 \) transition matrix for i > 0 and the \( d^{2} \times d(d + 1)/2 \) transition matrix for i = 0 (Browne and Shapiro 1988). This means that \( \left( {{\varvec{\Lambda}}_{i}^{(k)} \otimes {\varvec{\Lambda}}_{i}^{(k)} } \right){\mathbf{K}}\left( {{\mathbf{K^{\prime}K}}} \right)^{ - 1} \) contributes three columns to \( {\varvec{\Delta}}_{k}^{(k)} \) for i > 0 and d(d + 1)/2 columns to \( {\varvec{\Delta}}_{k}^{(k)} \) for i = 0. The relationship \( \left( {{\varvec{\Lambda}}_{i}^{(k)} \otimes {\varvec{\Lambda}}_{i}^{(k)} } \right){\mathbf{K}}\left( {{\mathbf{K^{\prime}K}}} \right)^{ - 1} {\mathbf{K^{\prime}}} = {\varvec{\Lambda}}_{i}^{(k)} \otimes {\varvec{\Lambda}}_{i}^{(k)} \) shows that \( ({\varvec{\Lambda}}_{i}^{(k)} \otimes {\varvec{\Lambda}}_{i}^{(k)} ) \) is in the column space of \( {\varvec{\Delta}}_{k}^{(k)} \). Given the structure of the block matrices \( {\varvec{\Delta}} \) and \( {\mathbf{L}}_{i} \), we see that the latter matrix is in the column space of the former matrix, which implies \( {\mathbf{UL}}_{i} = {\mathbf{0}} \). This proves Proposition 1.
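The final step of the proof rests on a purely algebraic fact: U annihilates every matrix whose columns lie in the column space of \( {\varvec{\Delta}} \). A minimal numerical sketch of that fact follows, with randomly generated stand-ins for V, \( {\varvec{\Delta}} \), \( {\varvec{\Gamma}}_{N} \), and the cumulant matrix. All dimensions and matrices here are hypothetical and are not the SCCA-L quantities.

```python
import numpy as np

rng = np.random.default_rng(1)
m, q = 9, 4                                   # hypothetical dimensions

A = rng.normal(size=(m, m))
V = A @ A.T + m * np.eye(m)                   # symmetric positive definite weight matrix
Delta = rng.normal(size=(m, q))               # stand-in Jacobian with full column rank

U = V - V @ Delta @ np.linalg.solve(Delta.T @ V @ Delta, Delta.T @ V)

# Any L whose columns are linear combinations of the columns of Delta.
M = rng.normal(size=(q, 2))
L = Delta @ M

# Stand-ins for the normal-theory part and an excess (cumulant) part.
G = rng.normal(size=(m, m))
Gamma_N = G @ G.T
C = rng.normal(size=(2, 2))
C = C + C.T
Gamma = Gamma_N + L @ C @ L.T

print(np.abs(U @ Delta).max())                # numerically zero
print(np.abs(U @ L).max())                    # numerically zero
print(np.abs(U @ Gamma - U @ Gamma_N).max())  # numerically zero
```

Since \( {\mathbf{U}}{\varvec{\Delta}} = {\mathbf{0}} \) holds by construction, the nonnormal contribution \( {\mathbf{L}}{\mathbf{C}}{\mathbf{L}}^{\prime} \) drops out of \( {\mathbf{U}}{\varvec{\Gamma}} \) whatever the value of C, which is the mechanism behind the asymptotic robustness result.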
About this article
Cite this article
Gu, F., Wu, H. Simultaneous canonical correlation analysis with invariant canonical loadings. Behaviormetrika 45, 111–132 (2018). https://doi.org/10.1007/s41237-017-0042-8
DOI: https://doi.org/10.1007/s41237-017-0042-8