Robust canonical correlations: A comparative study

Branco, J. A.; Croux, C.; Filzmoser, P.; Oliveira, M. R.

doi:10.1007/BF02789700

Published: 01 June 2005

Computational Statistics, Volume 20, pages 203–229 (2005)

Summary

Several approaches to robust canonical correlation analysis are presented and discussed. A first method is based on the definition of canonical correlation analysis as the search for linear combinations of two sets of variables having maximal (robust) correlation. A second method is based on alternating robust regressions. These methods are discussed in detail and compared with the more traditional approach to robust canonical correlation via covariance matrix estimates. A simulation study compares the performance of the different estimators under several kinds of sampling schemes. Robustness is also studied by means of breakdown plots.
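
As a rough illustration of the covariance-based approach mentioned in the summary, the sketch below computes canonical correlations from a joint scatter matrix of the stacked variables. The function name `cca_from_scatter` and its arguments are hypothetical. The classical `np.cov` estimate in the usage lines is only a placeholder: a robust version would pass a robust scatter estimate (for example an MCD estimate) in its place.

```python
import numpy as np

def cca_from_scatter(S, p):
    """Canonical correlations and vectors from a (p+q) x (p+q) scatter matrix S.

    S is any positive definite scatter estimate of the stacked variables (x, y);
    the first p rows/columns belong to x. Hypothetical helper for illustration.
    """
    Sxx, Sxy, Syy = S[:p, :p], S[:p, p:], S[p:, p:]

    def inv_sqrt(M):
        # Symmetric inverse square root via the eigen-decomposition of M.
        w, V = np.linalg.eigh(M)
        return V @ np.diag(w ** -0.5) @ V.T

    Kx, Ky = inv_sqrt(Sxx), inv_sqrt(Syy)
    # Singular values of the whitened cross-scatter are the canonical correlations.
    U, rho, Vt = np.linalg.svd(Kx @ Sxy @ Ky)
    k = rho.size
    A = Kx @ U[:, :k]       # canonical vectors for x (columns)
    B = Ky @ Vt.T[:, :k]    # canonical vectors for y (columns)
    return rho, A, B

# Usage with the classical sample covariance as a stand-in for a robust estimate.
rng = np.random.default_rng(0)
z = rng.standard_normal((200, 1))
X = np.hstack([z + 0.5 * rng.standard_normal((200, 1)), rng.standard_normal((200, 2))])
Y = np.hstack([z + 0.5 * rng.standard_normal((200, 1)), rng.standard_normal((200, 1))])
S = np.cov(np.hstack([X, Y]), rowvar=False)
rho, A, B = cca_from_scatter(S, p=3)
```

Keeping the scatter estimate as an input separates the choice of estimator from the canonical analysis itself, which is the sense in which this approach works "via covariance matrix estimates".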

References

Becker, C. and Gather, U. (2001). The largest nonidentifiable outlier: A comparison of multivariate simultaneous outlier identification rules. Computational Statistics and Data Analysis, 36, 119–127.
Croux, C. and Dehon, C. (2002). Analyse canonique basée sur des estimateurs robustes de la matrice de covariance. La Revue de Statistique Appliquée, 2, 5–26.
Croux, C., Filzmoser, P., Pison, G., and Rousseeuw, P.J. (2003). Fitting multiplicative models by robust alternating regressions. Statistics and Computing, 13, 23–36.
Croux, C. and Ruiz-Gazen, A. (1996). A fast algorithm for robust principal components based on projection pursuit. In: A. Prat (ed.), COMPSTAT: Proceedings in Computational Statistics, Physica-Verlag, Heidelberg, pp. 211–216.
Das, S. and Sen, P.K. (1998). Canonical correlations. In: P. Armitage and T. Colton (eds.), Encyclopedia of Biostatistics, Vol. 1, Wiley, New York, pp. 468–482.
Filzmoser, P., Dehon, C., and Croux, C. (2000). Outlier resistant estimators for canonical correlation analysis. In: J.G. Bethlehem and P.G.M. van der Heijden (eds.), COMPSTAT: Proceedings in Computational Statistics, Physica-Verlag, Heidelberg, pp. 301–306.
Hotelling, H. (1936). Relations between two sets of variates. Biometrika, 28, 321–377.
Huber, P.J. (1981). Robust Statistics. Wiley, New York.
Huber, P.J. (1985). Projection pursuit. The Annals of Statistics, 13, 435–525.
Johnson, R.A. and Wichern, D.W. (1998). Applied Multivariate Statistical Analysis. Prentice-Hall, London.
Karnel, G. (1991). Robust canonical correlation and correspondence analysis. In: The Frontiers of Statistical Scientific and Industrial Applications (Volume II of the proceedings of ICOSCO-I, The First International Conference on Statistical Computing), American Sciences Press, Strassbourg, pp. 335–354.
Lyttkens, E. (1972). Regression aspects of canonical correlation. Journal of Multivariate Analysis, 2, 418–439.
Maronna, R.A. (1976). Robust M-estimators of multivariate location and scatter. The Annals of Statistics, 4, 51–67.
Oliveira, M.R. and Branco, J.A. (2000). Projection pursuit approach to robust canonical correlation analysis. In: J.G. Bethlehem and P.G.M. van der Heijden (eds.), COMPSTAT: Proceedings in Computational Statistics, Physica-Verlag, Heidelberg, pp. 415–420.
Rencher, A.C. (1998). Multivariate Statistical Inference and Applications. John Wiley, New York.
Romanazzi, M. (1992). Influence in canonical correlation analysis. Psychometrika, 57, 237–259.
Rousseeuw, P.J. (1984). Least median of squares regression. Journal of the American Statistical Association, 79, 871–880.
Rousseeuw, P.J. (1985). Multivariate estimation with high breakdown point. In: W. Grossmann, G. Pflug, I. Vincze, and W. Wertz (eds.), Mathematical Statistics and Applications, Vol. B, Reidel, Dordrecht, pp. 283–297.
Rousseeuw, P.J. and Van Driessen, K. (1999). A fast algorithm for the minimum covariance determinant estimator. Technometrics, 41, 212–223.
Rousseeuw, P.J. and Van Zomeren, B. (1990). Unmasking multivariate outliers and leverage points. Journal of the American Statistical Association, 85, 633–651.
Taskinen, S., Croux, C., Kankainen, A., Ollila, E., and Oja, H. (2003). Canonical Analysis based on Scatter Matrices. Manuscript.
Taskinen, S., Oja, H., and Randles, R.H. (2004). Multivariate nonparametric tests of independence. Manuscript, conditionally accepted.
Tenenhaus, M. (1998). La Régression PLS. Théorie et pratique. Éditions Technip, Paris.
Wold, H. (1966). Nonlinear estimation by iterative least squares procedures. In: F.N. David (ed.), A Festschrift for J. Neyman, Wiley, New York, pp. 411–444.

Acknowledgements

The authors are grateful to two anonymous referees for their helpful comments.

Author information

Authors and Affiliations

Department of Mathematics and Center for Mathematics and its Applications, Instituto Superior Técnico, Av. Rovisco Pais, 1049-001, Lisboa, Portugal
J. A. Branco & M. R. Oliveira
Department of Applied Economics, K.U.Leuven, Naamsestraat 69, B-3000, Leuven, Belgium
C. Croux
Department of Statistics and Probability Theory, Vienna University of Technology, Wiedner Hauptstraße 8-10, A-1040, Vienna, Austria
P. Filzmoser

6 Appendix

Least Squares Alternating Regression Scheme (using the notations of Section 3):

Step 1: $\boldsymbol{X}_{0}=\boldsymbol{X}-\mathbf{1} \overline{\boldsymbol{x}}^{t}, \boldsymbol{Y}_{0}=\boldsymbol{Y}-\mathbf{1} \overline{\boldsymbol{y}}^{t}$
Step 2: For l = 1, …, k:
- Step 2.1: Residual spaces (only if l > 1):
  $$\begin{array}{l}{\boldsymbol{X}_{l-1}=\left(\boldsymbol{I}_{n}-\frac{\boldsymbol{u}_{l-1} \boldsymbol{u}_{l-1}^{t}}{\boldsymbol{u}_{l-1}^{t} \boldsymbol{u}_{l-1}}\right) \boldsymbol{X}_{l-2}} \\ {\boldsymbol{Y}_{l-1}=\left(\boldsymbol{I}_{n}-\frac{\boldsymbol{v}_{l-1} \boldsymbol{v}_{l-1}^{t}}{\boldsymbol{v}_{l-1}^{t} \boldsymbol{v}_{l-1}}\right) \boldsymbol{Y}_{l-2}}\end{array}$$
- Step 2.2: Starting values (using the first principal component $\boldsymbol{z}_{1}^{l-1}$ of $\boldsymbol{X}_{l-1}$):
  $$\begin{array}{l}{\hat{\boldsymbol{b}}_{l}^{(0)}=\left(\boldsymbol{Y}_{l-1}^{t} \boldsymbol{Y}_{l-1}\right)^{-} \boldsymbol{Y}_{l-1}^{t} \boldsymbol{z}_{1}^{l-1}} \\ {\boldsymbol{\beta}_{l}^{(0)}=\frac{\hat{\boldsymbol{b}}_{l}^{(0)}}{\left\|\hat{\boldsymbol{b}}_{l}^{(0)}\right\|}} \\ {\boldsymbol{v}_{l}^{(0)}=\boldsymbol{Y}_{l-1} \boldsymbol{\beta}_{l}^{(0)}}\end{array}$$
- Step 2.3: From iteration s = 1 to convergence:
  $$\begin{aligned} \hat{\boldsymbol{a}}_{l}^{(s)} &=\left(\boldsymbol{X}_{l-1}^{t} \boldsymbol{X}_{l-1}\right)^{-} \boldsymbol{X}_{l-1}^{t} \boldsymbol{v}_{l}^{(s-1)} \\ \boldsymbol{\alpha}_{l}^{(s)} &=\frac{\hat{\boldsymbol{a}}_{l}^{(s)}}{\left\|\hat{\boldsymbol{a}}_{l}^{(s)}\right\|} \\ \boldsymbol{u}_{l}^{(s)} &=\boldsymbol{X}_{l-1} \boldsymbol{\alpha}_{l}^{(s)} \\ \hat{\boldsymbol{b}}_{l}^{(s)} &=\left(\boldsymbol{Y}_{l-1}^{t} \boldsymbol{Y}_{l-1}\right)^{-} \boldsymbol{Y}_{l-1}^{t} \boldsymbol{u}_{l}^{(s)}\\ \boldsymbol{\beta}_{l}^{(s)} &=\frac{\hat{\boldsymbol{b}}_{l}^{(s)}}{\left\|\hat{\boldsymbol{b}}_{l}^{(s)}\right\|} \\ \boldsymbol{v}_{l}^{(s)} &= \boldsymbol{Y}_{l-1}\boldsymbol{\beta}_{l}^{(s)}\end{aligned}$$
- Step 2.4: After convergence, resulting in $\boldsymbol{u}_{l}^{*}, \boldsymbol{v}_{l}^{*}, \boldsymbol{\alpha}_{l}^{*}, \boldsymbol{\beta}_{l}^{*}$:
  $$r_{l}=\operatorname{Corr}\left(\boldsymbol{u}_{l}^{*}, \boldsymbol{v}_{l}^{*}\right)$$
  - Step 2.4.1: If l = 1: $\boldsymbol{u}_{1}=\boldsymbol{u}_{1}^{*}, \boldsymbol{v}_{1}=\boldsymbol{v}_{1}^{*}, \hat{\boldsymbol{\alpha}}_{1}=\boldsymbol{\alpha}_{1}^{*}, \hat{\boldsymbol{\beta}}_{1}=\boldsymbol{\beta}_{1}^{*}$
  - Step 2.4.2: If l > 1: $\begin{array}{l}{\boldsymbol{u}_{l}=\boldsymbol{u}_{l}^{*}} \\ {\hat{\boldsymbol{\alpha}}_{l}=\left(\boldsymbol{X}_{0}^{t} \boldsymbol{X}_{0}\right)^{-1} \boldsymbol{X}_{0}^{t} \boldsymbol{u}_{l}} \\ {\boldsymbol{v}_{l}=\boldsymbol{v}_{l}^{*}} \\ {\hat{\boldsymbol{\beta}}_{l}=\left(\boldsymbol{Y}_{0}^{t} \boldsymbol{Y}_{0}\right)^{-1} \boldsymbol{Y}_{0}^{t} \boldsymbol{v}_{l}}\end{array}$
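
The least squares scheme above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' code: the function names, the stopping rule (change in the correlation below `tol`) and the use of `np.linalg.lstsq` as the generalized inverse $(\cdot)^{-}$ are assumptions made for the example.

```python
import numpy as np

def ls_coef(A, y):
    # (A^t A)^- A^t y, computed as the minimum-norm least squares solution.
    return np.linalg.lstsq(A, y, rcond=None)[0]

def first_pc_scores(A):
    # Scores of the rows of A on the first principal component.
    _, _, vt = np.linalg.svd(A, full_matrices=False)
    return A @ vt[0]

def ls_alternating_cca(X, Y, k=1, tol=1e-10, max_iter=500):
    # Step 1: center both data matrices by their column means.
    X0 = X - X.mean(axis=0)
    Y0 = Y - Y.mean(axis=0)
    Xl, Yl = X0.copy(), Y0.copy()
    r, A, B = [], [], []
    u = v = None
    for l in range(k):
        if l > 0:
            # Step 2.1: project out the previous canonical variates.
            Xl = Xl - np.outer(u, u @ Xl) / (u @ u)
            Yl = Yl - np.outer(v, v @ Yl) / (v @ v)
        # Step 2.2: starting values from the first principal component of Xl.
        b = ls_coef(Yl, first_pc_scores(Xl))
        v_s = Yl @ (b / np.linalg.norm(b))
        r_old = np.inf
        # Step 2.3: alternate the two regressions until the correlation settles.
        for _ in range(max_iter):
            a = ls_coef(Xl, v_s)
            u_s = Xl @ (a / np.linalg.norm(a))
            b = ls_coef(Yl, u_s)
            v_s = Yl @ (b / np.linalg.norm(b))
            r_new = np.corrcoef(u_s, v_s)[0, 1]
            if abs(r_new - r_old) < tol:
                break
            r_old = r_new
        # Step 2.4: store the correlation and express the variates
        # through the original centered data.
        u, v = u_s, v_s
        r.append(r_new)
        A.append(ls_coef(X0, u))
        B.append(ls_coef(Y0, v))
    return np.array(r), np.column_stack(A), np.column_stack(B)

# Usage on simulated data with two shared latent factors.
rng = np.random.default_rng(0)
n = 300
Z = rng.standard_normal((n, 2))
X = np.column_stack([Z[:, 0] + 0.3 * rng.standard_normal(n),
                     Z[:, 1] + 1.0 * rng.standard_normal(n),
                     rng.standard_normal(n)])
Y = np.column_stack([Z[:, 0] + 0.3 * rng.standard_normal(n),
                     Z[:, 1] + 1.0 * rng.standard_normal(n)])
r, A, B = ls_alternating_cca(X, Y, k=2)
```

For l = 1 the final regression on $\boldsymbol{X}_{0}$ simply returns $\boldsymbol{\alpha}_{1}^{*}$ when $\boldsymbol{X}_{0}$ has full column rank, so the sketch treats both cases of Step 2.4 with the same line.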

Robust Alternating Regression Scheme (using the notations of Section 3):

Step 1: $\boldsymbol{X}_{0}=\boldsymbol{X}-\mathbf{1} \tilde{\boldsymbol{x}}^{t}, \boldsymbol{Y}_{0}=\boldsymbol{Y}-\mathbf{1} \tilde{\boldsymbol{y}}^{t}$, where $\tilde{\boldsymbol{x}}$ and $\tilde{\boldsymbol{y}}$ are the column-wise medians of $\boldsymbol{X}$ and $\boldsymbol{Y}$, respectively.
Step 2: For l = 1, …, k:
- Step 2.1: Residual spaces (only if l > 1):
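  - $\boldsymbol{X}_{l-1}$ are the estimated residuals of $\boldsymbol{X}_{l-2}=\boldsymbol{u}_{l-1} \boldsymbol{c}^{t}+\boldsymbol{\varepsilon}_{1}$ using weighted L₁ regressions with weights $w_{i}\left(\boldsymbol{u}_{l-1}\right)$
  - $\boldsymbol{Y}_{l-1}$ are the estimated residuals of $\boldsymbol{Y}_{l-2}=\boldsymbol{v}_{l-1} \boldsymbol{d}^{t}+\boldsymbol{\varepsilon}_{2}$ using weighted L₁ regressions with weights $w_{i}\left(\boldsymbol{v}_{l-1}\right)$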
- Step 2.2: Starting values:
  - Compute the first robust principal component $\boldsymbol{z}_{1}^{l-1}$ of $\boldsymbol{X}_{l-1}$ using the algorithm of Croux and Ruiz-Gazen (1996)
  - $\hat{\boldsymbol{b}}_{l}^{(0)}$ are the estimated coefficients of $\boldsymbol{z}_{1}^{l-1}=\boldsymbol{Y}_{l-1} \boldsymbol{b}_{l}^{(0)}+\boldsymbol{\varepsilon}_{3}$ using weighted L₁ regression with weights $w_{i}\left(\boldsymbol{Y}_{l-1}^{*}\right)$
  - $$\begin{array}{l}{\boldsymbol{\beta}_{l}^{(0)}=\frac{\hat{\boldsymbol{b}}_{l}^{(0)}}{\left\|\hat{\boldsymbol{b}}_{l}^{(0)}\right\|}} \\ {\boldsymbol{v}_{l}^{(0)}=\boldsymbol{Y}_{l-1} \boldsymbol{\beta}_{l}^{(0)}}\end{array}$$
- Step 2.3: From iteration s = 1 to convergence:
  - $\hat{\boldsymbol{a}}_{l}^{(s)}$ are the estimated coefficients of $\boldsymbol{v}_{l}^{(s-1)}=\boldsymbol{X}_{l-1} \boldsymbol{a}_{l}^{(s)}+\boldsymbol{\varepsilon}_{4}$ using weighted L₁ regression with weights $w_{i}\left(\boldsymbol{X}_{l-1}^{*}\right)$
  - $$\begin{aligned} \boldsymbol{\alpha}_{l}^{(s)} &=\frac{\hat{\boldsymbol{a}}_{l}^{(s)}}{\left\|\hat{\boldsymbol{a}}_{l}^{(s)}\right\|} \\ \boldsymbol{u}_{l}^{(s)} &=\boldsymbol{X}_{l-1} \boldsymbol{\alpha}_{l}^{(s)} \end{aligned}$$
  - $\hat{\boldsymbol{b}}_{l}^{(s)}$ are the estimated coefficients of $\boldsymbol{u}_{l}^{(s)}=\boldsymbol{Y}_{l-1} \boldsymbol{b}_{l}^{(s)}+\boldsymbol{\varepsilon}_{5}$ using weighted L₁ regression with weights $w_{i}\left(\boldsymbol{Y}_{l-1}^{*}\right)$
  - $$\begin{array}{l}{\boldsymbol{\beta}_{l}^{(s)}=\frac{\hat{\boldsymbol{b}}_{l}^{(s)}}{\left\|\hat{\boldsymbol{b}}_{l}^{(s)}\right\|}} \\ {\boldsymbol{v}_{l}^{(s)}=\boldsymbol{Y}_{l-1} \boldsymbol{\beta}_{l}^{(s)}}\end{array}$$
- Step 2.4: After convergence, resulting in $\boldsymbol{u}_{l}^{*}, \boldsymbol{v}_{l}^{*}, \boldsymbol{\alpha}_{l}^{*}, \boldsymbol{\beta}_{l}^{*}$: $r_{l}=\operatorname{Corr}\left(\boldsymbol{u}_{l}^{*}, \boldsymbol{v}_{l}^{*}\right)$, where Corr is a robust correlation measure such as the bivariate MCD correlation discussed in Section 2
  - Step 2.4.1: If l = 1: $\boldsymbol{u}_{1}=\boldsymbol{u}_{1}^{*}, \boldsymbol{v}_{1}=\boldsymbol{v}_{1}^{*}, \hat{\boldsymbol{\alpha}}_{1}=\boldsymbol{\alpha}_{1}^{*}, \hat{\boldsymbol{\beta}}_{1}=\boldsymbol{\beta}_{1}^{*}$
  - Step 2.4.2: If l > 1:
    - $\boldsymbol{U}_{l-1}=\left[\boldsymbol{u}_{1}, \ldots, \boldsymbol{u}_{l-1}\right]$
    - $\tilde{\boldsymbol{u}}_{l}$ are the estimated residuals of $\boldsymbol{u}_{l}^{*}=\boldsymbol{U}_{l-1}\boldsymbol{e}+\boldsymbol{\varepsilon}_{6}$ using robust LTS regression
    - $\hat{\boldsymbol{\alpha}}_{l}$ are the estimated coefficients of $\tilde{\boldsymbol{u}}_{l}=\boldsymbol{X}_{0} \boldsymbol{f}+\boldsymbol{\varepsilon}_{7}$ using robust LTS regression
    - $\boldsymbol{u}_{l}=\boldsymbol{X}_{0} \hat{\boldsymbol{\alpha}}_{l}$
    - $\boldsymbol{V}_{l-1}=\left[\boldsymbol{v}_{1}, \ldots, \boldsymbol{v}_{l-1}\right]$
    - $\tilde{\boldsymbol{v}}_{l}$ are the estimated residuals of $\boldsymbol{v}_{l}^{*}=\boldsymbol{V}_{l-1} \boldsymbol{g}+\boldsymbol{\varepsilon}_{8}$ using robust LTS regression
    - $\hat{\boldsymbol{\beta}}_{l}$ are the estimated coefficients of $\tilde{\boldsymbol{v}}_{l}=\boldsymbol{Y}_{0} \boldsymbol{h}+\boldsymbol{\varepsilon}_{9}$ using robust LTS regression
    - $\boldsymbol{v}_{l}=\boldsymbol{Y}_{0} \hat{\boldsymbol{\beta}}_{l}$
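
A heavily simplified sketch of the robust scheme, restricted to the first canonical pair (l = 1), might look as follows. Several pieces are stand-ins and should be read as assumptions: the weight function `downweights` is hypothetical (the actual weights $w_{i}(\cdot)$ are defined in Section 3, which is not reproduced here), weighted L₁ regression is approximated by iteratively reweighted least squares, the starting component only follows the spirit of Croux and Ruiz-Gazen (1996) by searching directions through the data points with the MAD as projection index, and a Spearman rank correlation replaces the bivariate MCD correlation. The residual-space and LTS steps for l > 1 are omitted.

```python
import numpy as np

def downweights(A, c=2.5):
    # Hypothetical leverage weights: shrink rows far from the coordinate-wise
    # median, with distances scaled by the MAD of each column.
    med = np.median(A, axis=0)
    mad = 1.4826 * np.median(np.abs(A - med), axis=0)
    d = np.sqrt((((A - med) / mad) ** 2).mean(axis=1))
    return np.minimum(1.0, (c / np.maximum(d, 1e-12)) ** 2)

def weighted_l1(A, y, w, n_iter=30, eps=1e-8):
    # Approximate argmin_b sum_i w_i |y_i - a_i^t b| by reweighted least squares.
    sw = np.sqrt(w)
    b = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)[0]
    for _ in range(n_iter):
        res = np.maximum(np.abs(y - A @ b), eps)
        sw = np.sqrt(w / res)
        b = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)[0]
    return b

def pp_first_component(A):
    # Candidate directions pass through the (centered) observations;
    # keep the one whose projections have the largest MAD.
    norms = np.linalg.norm(A, axis=1)
    dirs = A[norms > 0] / norms[norms > 0, None]
    proj = A @ dirs.T
    mads = np.median(np.abs(proj - np.median(proj, axis=0)), axis=0)
    return A @ dirs[np.argmax(mads)]

def spearman(a, b):
    # Rank correlation, used here as a simple stand-in for a robust Corr.
    ra, rb = np.argsort(np.argsort(a)), np.argsort(np.argsort(b))
    return np.corrcoef(ra, rb)[0, 1]

def robust_first_pair(X, Y, max_iter=50, tol=1e-6):
    # Step 1: center by column-wise medians.
    X0 = X - np.median(X, axis=0)
    Y0 = Y - np.median(Y, axis=0)
    wx, wy = downweights(X0), downweights(Y0)
    # Step 2.2: starting values from a robust first component of X0.
    b = weighted_l1(Y0, pp_first_component(X0), wy)
    beta = b / np.linalg.norm(b)
    v = Y0 @ beta
    # Step 2.3: alternate the two weighted L1 regressions.
    for _ in range(max_iter):
        a = weighted_l1(X0, v, wx)
        alpha = a / np.linalg.norm(a)
        u = X0 @ alpha
        b = weighted_l1(Y0, u, wy)
        beta_new = b / np.linalg.norm(b)
        v = Y0 @ beta_new
        converged = np.linalg.norm(beta_new - beta) < tol
        beta = beta_new
        if converged:
            break
    # Step 2.4 (l = 1): robust correlation of the final variates.
    return spearman(u, v), alpha, beta

# Usage: one shared factor, with ten gross outliers planted in the first rows.
rng = np.random.default_rng(1)
n = 200
z = rng.standard_normal(n)
X = np.column_stack([z + 0.2 * rng.standard_normal(n), rng.standard_normal(n)])
Y = np.column_stack([z + 0.2 * rng.standard_normal(n), rng.standard_normal(n)])
Xc, Yc = X.copy(), Y.copy()
Xc[:10, 0], Yc[:10, 0] = 8.0, -8.0
r_clean, alpha_clean, beta_clean = robust_first_pair(X, Y)
r_cont, alpha_cont, beta_cont = robust_first_pair(Xc, Yc)
```

The point of the sketch is the structure of the iteration: the same alternation as in the least squares scheme, with each regression replaced by a bounded-influence fit and the means replaced by medians.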

About this article

Cite this article

Branco, J.A., Croux, C., Filzmoser, P. et al. Robust canonical correlations: A comparative study. Computational Statistics 20, 203–229 (2005). https://doi.org/10.1007/BF02789700

Published: 01 June 2005
Issue Date: June 2005
DOI: https://doi.org/10.1007/BF02789700
