Shape bias of robust covariance estimators: an empirical study

  • Regular Article
  • Statistical Papers

Abstract

Detecting outliers in a multivariate point cloud is not trivial, especially when dealing with a sizable fraction of contamination. Over time, it has increasingly been recognized that the safest and most feasible approach to exposing outliers starts by computing a highly robust estimator of location and scatter that can withstand a large proportion of contamination. Many such estimators have been proposed in recent years. We will compare the worst-case bias of several prominent robust multivariate estimators by means of simulation. We also propose a new tool to compare robust estimators on real data sets, and illustrate it.

Author information

Corresponding author

Correspondence to M. Hubert.

About this article

Cite this article

Hubert, M., Rousseeuw, P. & Vakili, K. Shape bias of robust covariance estimators: an empirical study. Stat Papers 55, 15–28 (2014). https://doi.org/10.1007/s00362-013-0544-8

  • DOI: https://doi.org/10.1007/s00362-013-0544-8