Abstract
Spearman’s rank correlation is a robust alternative to the standard correlation coefficient. Because it uses the ranks of the observations instead of their actual values, the impact of outliers remains limited. In this paper, we study an estimator based on this rank correlation measure for estimating covariance matrices and their inverses. The resulting estimator is robust and consistent at the normal distribution. By applying the graphical lasso, the inverse covariance matrix estimator is positive definite even if more variables than observations are available in the data set. Moreover, it will contain many zeros, and is therefore said to be sparse. Instead of Spearman’s rank correlation, one can use Kendall correlation, Quadrant correlation or Gaussian rank scores. A simulation study compares the different estimators. This type of estimator is particularly useful for estimating (inverse) covariance matrices in high dimensions, when the data may contain several outliers in many cells of the data matrix. More traditional robust estimators are not well defined or computable in this setting. An important feature of the proposed estimators is their simplicity and ease of computation using existing software.
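As an illustration of the first step described in the abstract, the following is a minimal sketch of a pairwise covariance estimate built from Spearman's rank correlation. It is not taken from the chapter and rests on several assumptions: the function names (`average_ranks`, `spearman_consistent`, `mad_scale`, `rank_covariance`) are hypothetical, the scales are estimated with the median absolute deviation (the abstract does not say which robust scale is used), and the correlation is passed through the standard transformation 2 sin(πr/6), which makes Spearman's coefficient consistent for the correlation under bivariate normality. The graphical lasso step, which would take this matrix as input in place of the sample covariance, is not shown.

```python
import math
import statistics


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Standard (non-robust) correlation coefficient."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_consistent(x, y):
    """Spearman correlation with the 2*sin(pi*r/6) consistency transformation."""
    r_s = pearson(average_ranks(x), average_ranks(y))
    return 2.0 * math.sin(math.pi * r_s / 6.0)


def mad_scale(x):
    """Median absolute deviation, scaled by 1.4826 (assumed scale estimator)."""
    med = statistics.median(x)
    return 1.4826 * statistics.median([abs(v - med) for v in x])


def rank_covariance(columns):
    """Pairwise covariance matrix: robust scales times rank-based correlations."""
    p = len(columns)
    scales = [mad_scale(c) for c in columns]
    cov = [[0.0] * p for _ in range(p)]
    for j in range(p):
        cov[j][j] = scales[j] ** 2
        for k in range(j + 1, p):
            r = spearman_consistent(columns[j], columns[k])
            cov[j][k] = cov[k][j] = scales[j] * scales[k] * r
    return cov


# Toy data: one outlying cell (1000) in x that leaves the ordering unchanged.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1000]
y = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
cov = rank_covariance([x, y])
```

On this toy data the single outlying cell pulls the standard correlation `pearson(x, y)` well below one, while the rank-based value is unaffected because the ordering of the observations is unchanged.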
Acknowledgments
The authors wish to acknowledge the support from the GOA/12/014 project of the Research Fund KU Leuven. We also would like to thank the referees for their constructive comments that improved the paper considerably.
Copyright information
© 2016 Springer India
About this paper
Cite this paper
Croux, C., Öllerer, V. (2016). Robust and Sparse Estimation of the Inverse Covariance Matrix Using Rank Correlation Measures. In: Agostinelli, C., Basu, A., Filzmoser, P., Mukherjee, D. (eds) Recent Advances in Robust Statistics: Theory and Applications. Springer, New Delhi. https://doi.org/10.1007/978-81-322-3643-6_3
DOI: https://doi.org/10.1007/978-81-322-3643-6_3
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-3641-2
Online ISBN: 978-81-322-3643-6
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)