The Gaussian rank correlation estimator: robustness properties

Abstract

The Gaussian rank correlation equals the usual correlation coefficient computed from the normal scores of the data. Although its influence function is unbounded, it still has attractive robustness properties. In particular, its breakdown point is above 12%. Moreover, the estimator is consistent and asymptotically efficient at the normal distribution. The correlation matrix obtained from pairwise Gaussian rank correlations is always positive semidefinite and is easy to compute, even in high dimensions. We compare the properties of the Gaussian rank correlation with those of the popular Kendall and Spearman correlation measures. A simulation study confirms the good efficiency and robustness properties of the Gaussian rank correlation. In the empirical application, we show how it can be used for multivariate outlier detection based on robust principal component analysis.
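As an illustration of the definition in the abstract, the following is a minimal sketch of a Gaussian rank correlation in Python, using only the standard library. It is not the authors' code. It assumes the normal scores are the van der Waerden scores, that is, the standard normal quantile function applied to rank/(n + 1); the division by n + 1, the averaging of tied ranks, and the function names average_ranks, normal_scores, pearson and gaussian_rank_correlation are assumptions made for this example.

```python
from math import sqrt
from statistics import NormalDist


def average_ranks(values):
    """Return ranks 1..n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the block
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def normal_scores(values):
    """Map each observation to the normal quantile of rank / (n + 1) (assumed scaling)."""
    n = len(values)
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    return [std_normal.inv_cdf(r / (n + 1)) for r in average_ranks(values)]


def pearson(u, v):
    """Ordinary (Pearson) correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / sqrt(s_uu * s_vv)


def gaussian_rank_correlation(x, y):
    """Pearson correlation computed on the normal scores of x and y."""
    return pearson(normal_scores(x), normal_scores(y))


if __name__ == "__main__":
    x = [1.2, 0.4, 3.1, 2.2, 5.0, 4.4]
    y = [0.9, 0.1, 2.5, 2.9, 4.1, 3.8]
    print(gaussian_rank_correlation(x, y))
```

Because only the ranks enter the computation, the value is unchanged under strictly increasing transformations of either variable, and replacing the largest observation by an arbitrarily large one leaves it unchanged as well.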

Author information

Correspondence to Christophe Croux.

About this article

Cite this article

Boudt, K., Cornelissen, J. & Croux, C. The Gaussian rank correlation estimator: robustness properties. Stat Comput 22, 471–483 (2012). https://doi.org/10.1007/s11222-011-9237-0

Keywords

  • Breakdown
  • Correlation
  • Efficiency
  • Robustness
  • Van der Waerden