Summary
Several approaches to robust canonical correlation analysis are presented and discussed. The first method is based on the definition of canonical correlation analysis as the search for linear combinations of two sets of variables having maximal (robust) correlation. The second method is based on alternating robust regressions. These methods are discussed in detail and compared with the more traditional approach to robust canonical correlation via covariance matrix estimates. A simulation study compares the performance of the different estimators under several kinds of sampling schemes. Robustness is also studied by means of breakdown plots.
References
Becker, C. and Gather, U. (2001). The largest nonidentifiable outlier: A comparison of multivariate simultaneous outlier identification rules. Computational Statistics and Data Analysis, 36, 119–127.
Croux, C. and Dehon, C. (2002). Analyse canonique basée sur des estimateurs robustes de la matrice de covariance. La Revue de Statistique Appliquée, 2, 5–26.
Croux, C., Filzmoser, P., Pison, G., and Rousseeuw, P.J. (2003). Fitting multiplicative models by robust alternating regressions. Statistics and Computing, 13, 23–36.
Croux, C. and Ruiz-Gazen, A. (1996). A fast algorithm for robust principal components based on projection pursuit. In: A. Prat (ed.), COMPSTAT: Proceedings in Computational Statistics, Physica-Verlag, Heidelberg, pp. 211–216.
Das, S. and Sen, P.K. (1998). Canonical correlations. In: P. Armitage and T. Colton (eds.), Encyclopedia of Biostatistics, Vol. 1, Wiley, New York, pp. 468–482.
Filzmoser, P., Dehon, C., and Croux, C. (2000). Outlier resistant estimators for canonical correlation analysis. In: J.G. Bethlehem and P.G.M. van der Heijden (eds.), COMPSTAT: Proceedings in Computational Statistics, Physica-Verlag, Heidelberg, pp. 301–306.
Hotelling, H. (1936). Relations between two sets of variates. Biometrika, 28, 321–377.
Huber, P.J. (1981). Robust Statistics. Wiley, New York.
Huber, P.J. (1985). Projection pursuit. The Annals of Statistics, 13, 435–525.
Johnson, R.A. and Wichern, D.W. (1998). Applied Multivariate Statistical Analysis. Prentice-Hall, London.
Karnel, G. (1991). Robust canonical correlation and correspondence analysis. In: The Frontiers of Statistical Scientific and Industrial Applications (Volume II of the proceedings of ICOSCO-I, The First International Conference on Statistical Computing), American Sciences Press, Strassbourg, pp. 335–354.
Lyttkens, E. (1972). Regression aspects of canonical correlation. Journal of Multivariate Analysis, 2, 418–439.
Maronna, R.A. (1976). Robust M-estimators of multivariate location and scatter. The Annals of Statistics, 4, 51–67.
Oliveira, M.R. and Branco, J.A. (2000). Projection pursuit approach to robust canonical correlation analysis. In: J.G. Bethlehem and P.G.M. van der Heijden (eds.), COMPSTAT: Proceedings in Computational Statistics, Physica-Verlag, Heidelberg, pp. 415–420.
Rencher, A.C. (1998). Multivariate Statistical Inference and Applications. John Wiley, New York.
Romanazzi, M. (1992). Influence in canonical correlation analysis. Psychometrika, 57, 237–259.
Rousseeuw, P.J. (1984). Least median of squares regression. Journal of the American Statistical Association, 79, 871–880.
Rousseeuw, P.J. (1985). Multivariate estimation with high breakdown point. In: W. Grossmann, G. Pflug, I. Vincze, and W. Wertz (eds.), Mathematical Statistics and Applications, Vol. B, Reidel, Dordrecht, pp. 283–297.
Rousseeuw, P.J. and Van Driessen, K. (1999). A fast algorithm for the minimum covariance determinant estimator. Technometrics, 41, 212–223.
Rousseeuw, P.J. and Van Zomeren, B. (1990). Unmasking multivariate outliers and leverage points. Journal of the American Statistical Association, 85, 633–651.
Taskinen, S., Croux, C., Kankainen, A., Ollila, E., and Oja, H. (2003). Canonical analysis based on scatter matrices. Manuscript.
Taskinen, S., Oja, H., and Randles, R.H. (2004). Multivariate nonparametric tests of independence. Manuscript, conditionally accepted.
Tenenhaus, M. (1998). La Régression PLS. Théorie et pratique. Éditions Technip, Paris.
Wold, H. (1966). Nonlinear estimation by iterative least squares procedures. In: F.N. David (ed.), A Festschrift for J. Neyman, Wiley, New York, pp. 411–444.
Acknowledgements
The authors are grateful for helpful comments of two anonymous referees.
6 Appendix
Least Squares Alternating Regression Scheme (using the notations of Section 3):
Step 1: \(\boldsymbol{X}_{0}=\boldsymbol{X}-\mathbf{1} \overline{\boldsymbol{x}}^{t}, \boldsymbol{Y}_{0}=\boldsymbol{Y}-\mathbf{1} \overline{\boldsymbol{y}}^{t}\)
Step 2: For l = 1, …, k:
Step 2.1: Residual spaces (only if l > 1):
$$\begin{array}{l}{\boldsymbol{X}_{l-1}=\left(\boldsymbol{I}_{n}-\frac{\boldsymbol{u}_{l-1} \boldsymbol{u}_{l-1}^{t}}{\boldsymbol{u}_{l-1}^{t} \boldsymbol{u}_{l-1}}\right) \boldsymbol{X}_{l-2}} \\ {\boldsymbol{Y}_{l-1}=\left(\boldsymbol{I}_{n}-\frac{\boldsymbol{v}_{l-1} \boldsymbol{v}_{l-1}^{t}}{\boldsymbol{v}_{l-1}^{t} \boldsymbol{v}_{l-1}}\right) \boldsymbol{Y}_{l-2}}\end{array}$$
Step 2.2: Starting values (using the first principal component \(\boldsymbol{z}_{1}^{l-1}\) of \(\boldsymbol{X}_{l-1}\)):
$$\begin{array}{l}{\hat{\boldsymbol{b}}_{l}^{(0)}=\left(\boldsymbol{Y}_{l-1}^{t} \boldsymbol{Y}_{l-1}\right)^{-} \boldsymbol{Y}_{l-1}^{t} \boldsymbol{z}_{1}^{l-1}} \\ {\boldsymbol{\beta}_{l}^{(0)}=\frac{\hat{\boldsymbol{b}}_{l}^{(0)}}{\left\|\hat{\boldsymbol{b}}_{l}^{(0)}\right\|}} \\ {\boldsymbol{v}_{l}^{(0)}=\boldsymbol{Y}_{l-1} \boldsymbol{\beta}_{l}^{(0)}}\end{array}$$
Step 2.3: From iteration s = 1 to convergence:
$$\begin{aligned} \hat{\boldsymbol{a}}_{l}^{(s)} &=\left(\boldsymbol{X}_{l-1}^{t} \boldsymbol{X}_{l-1}\right)^{-} \boldsymbol{X}_{l-1}^{t} \boldsymbol{v}_{l}^{(s-1)} \\ \boldsymbol{\alpha}_{l}^{(s)} &=\frac{\hat{\boldsymbol{a}}_{l}^{(s)}}{\left\|\hat{\boldsymbol{a}}_{l}^{(s)}\right\|} \\ \boldsymbol{u}_{l}^{(s)} &=\boldsymbol{X}_{l-1} \boldsymbol{\alpha}_{l}^{(s)} \\ \hat{\boldsymbol{b}}_{l}^{(s)} &=\left(\boldsymbol{Y}_{l-1}^{t} \boldsymbol{Y}_{l-1}\right)^{-} \boldsymbol{Y}_{l-1}^{t} \boldsymbol{u}_{l}^{(s)}\\ \boldsymbol{\beta}_{l}^{(s)} &=\frac{\hat{\boldsymbol{b}}_{l}^{(s)}}{\left\|\hat{\boldsymbol{b}}_{l}^{(s)}\right\|} \\ \boldsymbol{v}_{l}^{(s)} &= \boldsymbol{Y}_{l-1}\boldsymbol{\beta}_{l}^{(s)}\end{aligned}$$
Step 2.4: After convergence, resulting in \(\boldsymbol{u}_{l}^{*}, \boldsymbol{v}_{l}^{*}, \boldsymbol{\alpha}_{l}^{*}, \boldsymbol{\beta}_{l}^{*}\):
$$r_{l}=\left|\operatorname{Corr}\left(\boldsymbol{u}_{l}^{*}, \boldsymbol{v}_{l}^{*}\right)\right|$$
Step 2.4.1: If l = 1: \(\boldsymbol{u}_{1}=\boldsymbol{u}_{1}^{*}, \boldsymbol{v}_{1}=\boldsymbol{v}_{1}^{*}, \hat{\boldsymbol{\alpha}}_{1}=\boldsymbol{\alpha}_{1}^{*}, \hat{\boldsymbol{\beta}}_{1}=\boldsymbol{\beta}_{1}^{*}\)
Step 2.4.2: If l > 1: \(\begin{array}{l}{\boldsymbol{u}_{l}=\boldsymbol{u}_{l}^{*}} \\ {\hat{\boldsymbol{\alpha}}_{l}=\left(\boldsymbol{X}_{0}^{t} \boldsymbol{X}_{0}\right)^{-1} \boldsymbol{X}_{0}^{t} \boldsymbol{u}_{l}} \\ {\boldsymbol{v}_{l}=\boldsymbol{v}_{l}^{*}} \\ {\hat{\boldsymbol{\beta}}_{l}=\left(\boldsymbol{Y}_{0}^{t} \boldsymbol{Y}_{0}\right)^{-1} \boldsymbol{Y}_{0}^{t} \boldsymbol{v}_{l}}\end{array}\)
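The least squares scheme above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `ls_alternating_cca`, the convergence tolerance, the iteration cap and the use of `np.linalg.pinv` (with an explicit `rcond`) for the generalized inverse are all choices made here, and the sample correlation is returned in absolute value.

```python
import numpy as np


def ls_alternating_cca(X, Y, k=1, tol=1e-12, max_iter=1000):
    """Sketch of the least squares alternating regression scheme.

    Returns a list of tuples (r_l, alpha_hat_l, beta_hat_l, u_l, v_l).
    """
    X0 = X - X.mean(axis=0)                 # Step 1: centre by column means
    Y0 = Y - Y.mean(axis=0)
    Xr, Yr = X0.copy(), Y0.copy()           # current residual spaces
    results = []
    for l in range(k):
        if l > 0:                           # Step 2.1: project out u_{l-1}, v_{l-1}
            u, v = results[-1][3], results[-1][4]
            Xr = Xr - np.outer(u, u @ Xr) / (u @ u)
            Yr = Yr - np.outer(v, v @ Yr) / (v @ v)
        # Generalized inverses (M^t M)^- M^t; rcond guards the rank deficiency
        Xp = np.linalg.pinv(Xr, rcond=1e-10)
        Yp = np.linalg.pinv(Yr, rcond=1e-10)
        # Step 2.2: start from the first principal component scores of Xr
        z = Xr @ np.linalg.svd(Xr, full_matrices=False)[2][0]
        b = Yp @ z
        v = Yr @ (b / np.linalg.norm(b))
        for _ in range(max_iter):           # Step 2.3: alternate the regressions
            a = Xp @ v
            u = Xr @ (a / np.linalg.norm(a))
            b = Yp @ u
            v_new = Yr @ (b / np.linalg.norm(b))
            converged = np.linalg.norm(v_new - v) <= tol * np.linalg.norm(v_new)
            v = v_new
            if converged:
                break
        r = abs(np.corrcoef(u, v)[0, 1])    # Step 2.4
        # Steps 2.4.1 and 2.4.2: coefficients with respect to X0 and Y0
        # (for l = 1 this regression simply reproduces alpha* and beta*)
        alpha_hat = np.linalg.lstsq(X0, u, rcond=None)[0]
        beta_hat = np.linalg.lstsq(Y0, v, rcond=None)[0]
        results.append((r, alpha_hat, beta_hat, u, v))
    return results
```

On clean data the values `r` returned by this sketch should agree with the classical canonical correlations, for instance those obtained as singular values of \(\boldsymbol{Q}_{X}^{t} \boldsymbol{Q}_{Y}\), where \(\boldsymbol{Q}_{X}\) and \(\boldsymbol{Q}_{Y}\) are orthonormal bases of the centred data matrices.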
Robust Alternating Regression Scheme (using the notations of Section 3):
Step 1: \(\boldsymbol{X}_{0}=\boldsymbol{X}-\mathbf{1} \tilde{\boldsymbol{x}}^{t}, \boldsymbol{Y}_{0}=\boldsymbol{Y}-\mathbf{1} \tilde{\boldsymbol{y}}^{t}\), where \(\tilde{\boldsymbol{x}}\) and \(\tilde{\boldsymbol{y}}\) are the column-wise medians of \(\boldsymbol{X}\) and \(\boldsymbol{Y}\), respectively.
Step 2: For l = 1, …, k:
Step 2.1: Residual spaces (only if l > 1):
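\(\boldsymbol{X}_{l-1}\) are the estimated residuals of \(\boldsymbol{X}_{l-2}=\boldsymbol{u}_{l-1} \boldsymbol{c}^{t}+\boldsymbol{\varepsilon}_{1}\) using weighted L1 regressions with weights \(w_{i}\left(\boldsymbol{u}_{l-1}\right)\)
\(\boldsymbol{Y}_{l-1}\) are the estimated residuals of \(\boldsymbol{Y}_{l-2}=\boldsymbol{v}_{l-1} \boldsymbol{d}^{t}+\boldsymbol{\varepsilon}_{2}\) using weighted L1 regressions with weights \(w_{i}\left(\boldsymbol{v}_{l-1}\right)\)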
Step 2.2: Starting values:
Compute the first robust principal component \(\boldsymbol{z}_{1}^{l-1}\) of \(\boldsymbol{X}_{l-1}\) using the algorithm of Croux and Ruiz-Gazen (1996)
\(\hat{\boldsymbol{b}}_{l}^{(0)}\) are the estimated coefficients of \(\boldsymbol{z}_{1}^{l-1}=\boldsymbol{Y}_{l-1} \boldsymbol{b}_{l}^{(0)}+\boldsymbol{\varepsilon}_{3}\) using weighted L1 regression with weights \(w_{i}\left(\boldsymbol{Y}_{l-1}^{*}\right)\)
$$\begin{array}{l}{\boldsymbol{\beta}_{l}^{(0)}=\frac{\hat{\boldsymbol{b}}_{l}^{(0)}}{\left\|\hat{\boldsymbol{b}}_{l}^{(0)}\right\|}} \\ {\boldsymbol{v}_{l}^{(0)}=\boldsymbol{Y}_{l-1} \boldsymbol{\beta}_{l}^{(0)}}\end{array}$$
Step 2.3: From iteration s = 1 to convergence:
\(\hat{\boldsymbol{a}}_{l}^{(s)}\) are the estimated coefficients of \(\boldsymbol{v}_{l}^{(s-1)}=\boldsymbol{X}_{l-1} \boldsymbol{a}_{l}^{(s)}+\boldsymbol{\varepsilon}_{4}\) using weighted L1 regression with weights \(w_{i}\left(\boldsymbol{X}_{l-1}^{*}\right)\)
$$\begin{aligned} \boldsymbol{\alpha}_{l}^{(s)} &=\frac{\hat{\boldsymbol{a}}_{l}^{(s)}}{\left\|\hat{\boldsymbol{a}}_{l}^{(s)}\right\|} \\ \boldsymbol{u}_{l}^{(s)} &=\boldsymbol{X}_{l-1} \boldsymbol{\alpha}_{l}^{(s)} \end{aligned}$$
\(\hat{\boldsymbol{b}}_{l}^{(s)}\) are the estimated coefficients of \(\boldsymbol{u}_{l}^{(s)}=\boldsymbol{Y}_{l-1} \boldsymbol{b}_{l}^{(s)}+\boldsymbol{\varepsilon}_{5}\) using weighted L1 regression with weights \(w_{i}\left(\boldsymbol{Y}_{l-1}^{*}\right)\)
$$\begin{array}{l}{\boldsymbol{\beta}_{l}^{(s)}=\frac{\hat{\boldsymbol{b}}_{l}^{(s)}}{\left\|\hat{\boldsymbol{b}}_{l}^{(s)}\right\|}} \\ {\boldsymbol{v}_{l}^{(s)}=\boldsymbol{Y}_{l-1} \boldsymbol{\beta}_{l}^{(s)}}\end{array}$$
Step 2.4: After convergence, resulting in \(\boldsymbol{u}_{l}^{*}, \boldsymbol{v}_{l}^{*}, \boldsymbol{\alpha}_{l}^{*}, \boldsymbol{\beta}_{l}^{*}\): \(r_{l}=\operatorname{Corr}\left(\boldsymbol{u}_{l}^{*}, \boldsymbol{v}_{l}^{*}\right)\), where Corr is a robust correlation measure such as the bivariate MCD correlation discussed in Section 2
Step 2.4.1: If l = 1: \(\boldsymbol{u}_{1}=\boldsymbol{u}_{1}^{*}, \boldsymbol{v}_{1}=\boldsymbol{v}_{1}^{*}, \hat{\boldsymbol{\alpha}}_{1}=\boldsymbol{\alpha}_{1}^{*}, \hat{\boldsymbol{\beta}}_{1}=\boldsymbol{\beta}_{1}^{*}\)
Step 2.4.2: If l > 1:
\(\boldsymbol{U}_{l-1}=\left[\boldsymbol{u}_{1}, \ldots, \boldsymbol{u}_{l-1}\right]\)
\(\tilde{\boldsymbol{u}}_{l}\) are the estimated residuals of \(\boldsymbol{u}_{l}^{*}=\boldsymbol{U}_{l-1}\boldsymbol{e}+\boldsymbol{\varepsilon}_{6}\) using robust LTS regression
\(\hat{\boldsymbol{\alpha}}_{l}\) are the estimated coefficients of \(\tilde{\boldsymbol{u}}_{l}=\boldsymbol{X}_{0} \boldsymbol{f}+\boldsymbol{\varepsilon}_{7}\) using robust LTS regression
\(\boldsymbol{u}_{l}=\boldsymbol{X}_{0} \hat{\boldsymbol{\alpha}}_{l}\)
\(\boldsymbol{V}_{l-1}=\left[\boldsymbol{v}_{1}, \ldots, \boldsymbol{v}_{l-1}\right]\)
\(\tilde{\boldsymbol{v}}_{l}\) are the estimated residuals of \(\boldsymbol{v}_{l}^{*}=\boldsymbol{V}_{l-1} \boldsymbol{g}+\boldsymbol{\varepsilon}_{8}\) using robust LTS regression
\(\hat{\boldsymbol{\beta}}_{l}\) are the estimated coefficients of \(\tilde{\boldsymbol{v}}_{l}=\boldsymbol{Y}_{0} \boldsymbol{h}+\boldsymbol{\varepsilon}_{9}\) using robust LTS regression
\(\boldsymbol{v}_{l}=\boldsymbol{Y}_{0} \hat{\boldsymbol{\beta}}_{l}\)
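Two regression estimators carry the robust scheme above: weighted L1 regression (Steps 2.1 to 2.3) and LTS regression (Step 2.4.2). The sketch below shows one simple way to approximate both with NumPy only; it is not the authors' implementation. The weighted L1 fit is approximated by iteratively reweighted least squares, and the LTS fit by random elemental starts followed by concentration steps, so both are swapped-in approximations rather than exact solvers. The weights \(w_{i}\) are defined in Section 3 and are simply passed in as the argument `w`; all function names, the subset size `h` and the tuning constants are assumptions made for illustration.

```python
import numpy as np


def weighted_l1_fit(A, y, w, n_iter=200, eps=1e-8):
    """Approximate argmin_b sum_i w_i * |y_i - A_i b| by IRLS (no intercept)."""
    A = np.asarray(A, dtype=float).reshape(len(y), -1)
    b = np.linalg.lstsq(A, y, rcond=None)[0]          # least squares start
    for _ in range(n_iter):
        r = np.abs(y - A @ b)
        sw = np.sqrt(w / np.maximum(r, eps))          # sqrt of the IRLS weights
        b_new = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)[0]
        if np.linalg.norm(b_new - b) <= 1e-10 * (1.0 + np.linalg.norm(b)):
            return b_new
        b = b_new
    return b


def deflate(M, u, w):
    """Step 2.1: residuals of M = u c^t + eps, one weighted L1 fit per column."""
    c = np.array([weighted_l1_fit(u, M[:, j], w)[0] for j in range(M.shape[1])])
    return M - np.outer(u, c)


def lts_fit(A, y, h=None, n_starts=100, n_csteps=50, seed=0):
    """Approximate least trimmed squares fit (no intercept) via C-steps."""
    rng = np.random.default_rng(seed)
    n, p = A.shape
    h = h if h is not None else (n + p + 1) // 2      # assumed default subset size
    best_b, best_obj = None, np.inf
    for _ in range(n_starts):
        idx = rng.choice(n, size=p, replace=False)    # elemental start
        for _ in range(n_csteps):
            b = np.linalg.lstsq(A[idx], y[idx], rcond=None)[0]
            r2 = (y - A @ b) ** 2
            new_idx = np.sort(np.argsort(r2)[:h])     # keep the h best-fitted points
            if np.array_equal(new_idx, idx):
                break
            idx = new_idx
        obj = np.sort(r2)[:h].sum()                   # trimmed sum of squares
        if obj < best_obj:
            best_b, best_obj = b, obj
    return best_b


def step_2_4_2(u_star, U_prev, X0):
    """Step 2.4.2 for one block: LTS residuals, then LTS coefficients on X0."""
    u_tilde = u_star - U_prev @ lts_fit(U_prev, u_star)
    alpha_hat = lts_fit(X0, u_tilde)
    return alpha_hat, X0 @ alpha_hat
```

The same `step_2_4_2` helper would apply to the second block by passing \(\boldsymbol{v}_{l}^{*}\), \(\boldsymbol{V}_{l-1}\) and \(\boldsymbol{Y}_{0}\). An exact L1 fit could instead be obtained by linear programming, at the cost of an extra dependency.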
Branco, J.A., Croux, C., Filzmoser, P. et al. Robust canonical correlations: A comparative study. Computational Statistics 20, 203–229 (2005). https://doi.org/10.1007/BF02789700
DOI: https://doi.org/10.1007/BF02789700