Robust and Sparse Estimation of the Inverse Covariance Matrix Using Rank Correlation Measures

  • Conference paper
Recent Advances in Robust Statistics: Theory and Applications

Abstract

Spearman’s rank correlation is a robust alternative to the standard correlation coefficient. Because it uses the ranks of the observations instead of their actual values, the impact of outliers remains limited. In this paper, we study an estimator based on this rank correlation measure for estimating covariance matrices and their inverses. The resulting estimator is robust and consistent at the normal distribution. By applying the graphical lasso, the inverse covariance matrix estimator is positive definite even if more variables than observations are available in the data set. Moreover, it will contain many zeros, and is therefore said to be sparse. Instead of Spearman’s rank correlation, one can use the Kendall correlation, the Quadrant correlation or Gaussian rank scores. A simulation study compares the different estimators. This type of estimator is particularly useful for estimating (inverse) covariance matrices in high dimensions, when the data may contain outliers in many cells of the data matrix. More traditional robust estimators are not well defined or computable in this setting. An important feature of the proposed estimators is their simplicity and ease of computation using existing software.
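
To illustrate the rank-based step described in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes the standard result that, at the bivariate normal distribution, Spearman's correlation \(\rho_S\) and the Pearson correlation \(\rho\) are linked by \(\rho = 2\sin(\pi\rho_S/6)\), which is one way to obtain consistency at the normal model. All function names are hypothetical, and the scale estimation and graphical lasso steps mentioned in the abstract are not shown.

```python
import math


def average_ranks(values):
    """Return 1-based ranks of `values`; tied entries share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend `end` over the run of entries tied with position `start`.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Classical (non-robust) Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_consistent(x, y):
    """Spearman correlation, transformed to be consistent at the normal model.

    Assumption (not taken from the abstract): rho = 2 * sin(pi * rho_S / 6).
    """
    rho_s = pearson(average_ranks(x), average_ranks(y))
    return 2.0 * math.sin(math.pi * rho_s / 6.0)


def rank_correlation_matrix(columns):
    """Pairwise transformed Spearman correlations for a list of data columns."""
    p = len(columns)
    matrix = [[1.0] * p for _ in range(p)]
    for j in range(p):
        for k in range(j + 1, p):
            r = spearman_consistent(columns[j], columns[k])
            matrix[j][k] = r
            matrix[k][j] = r
    return matrix


# A single contaminated cell: y follows x except for one gross outlier.
x = [float(i) for i in range(1, 21)]
y = list(x)
y[0] = 1e6

print("Pearson:             ", pearson(x, y))
print("Transformed Spearman:", spearman_consistent(x, y))
```

In this toy data set one wild value is enough to turn the Pearson correlation negative, while the rank-based estimate stays clearly positive. A matrix built pairwise in this way is not guaranteed to be positive semidefinite, which is one reason a regularized step such as the graphical lasso is applied afterwards.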

Acknowledgments

The authors wish to acknowledge support from the GOA/12/014 project of the Research Fund KU Leuven. We would also like to thank the referees for their constructive comments, which improved the paper considerably.

Author information

Corresponding author

Correspondence to Christophe Croux.

Copyright information

© 2016 Springer India

About this paper

Cite this paper

Croux, C., Öllerer, V. (2016). Robust and Sparse Estimation of the Inverse Covariance Matrix Using Rank Correlation Measures. In: Agostinelli, C., Basu, A., Filzmoser, P., Mukherjee, D. (eds) Recent Advances in Robust Statistics: Theory and Applications. Springer, New Delhi. https://doi.org/10.1007/978-81-322-3643-6_3