The shooting S-estimator for robust regression

  • Original Paper
  • Computational Statistics

Abstract

To perform multiple regression, the least squares estimator is commonly used. However, this estimator is not robust to outliers. Therefore, robust methods such as S-estimation have been proposed. These estimators flag any observation with a large residual as an outlier and downweight it in the further procedure. However, a large residual may be caused by an outlier in only one single predictor variable, and downweighting the complete observation results in a loss of information. Therefore, we propose the shooting S-estimator, a regression estimator that is especially designed for situations where a large number of observations suffer from contamination in a small number of predictor variables. The shooting S-estimator combines the ideas of the coordinate descent algorithm with simple S-regression, which makes it robust against componentwise contamination, at the cost of failing the regression equivariance property.
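
As a rough illustration of the coordinate-wise idea described in the abstract, the following Python sketch cycles through the predictors and refits one slope at a time on the partial residuals. It is not the authors' algorithm: the function names `theil_sen_slope` and `shooting_sketch` are hypothetical, the Theil-Sen median of pairwise slopes is swapped in as a simple stand-in for simple S-regression, and no cellwise weighting of the predictors is applied.

```python
import statistics


def theil_sen_slope(x, r):
    """Median of pairwise slopes of r on x (Theil-Sen).

    Assumption: used here only as a stand-in for simple S-regression.
    """
    slopes = [
        (r[j] - r[i]) / (x[j] - x[i])
        for i in range(len(x))
        for j in range(i + 1, len(x))
        if x[j] != x[i]
    ]
    return statistics.median(slopes) if slopes else 0.0


def shooting_sketch(X, y, n_iter=20):
    """Coordinate-wise robust fit of y on the columns of X (hypothetical helper).

    X is a list of rows, y a list of responses. Returns (intercept, slopes).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    alpha = statistics.median(y)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residuals: remove the fit of all predictors except j.
            partial = [
                y[i] - alpha - sum(beta[k] * X[i][k] for k in range(p) if k != j)
                for i in range(n)
            ]
            xj = [X[i][j] for i in range(n)]
            # Update only coordinate j with a robust simple regression slope.
            beta[j] = theil_sen_slope(xj, partial)
        # Robust intercept: median of the residuals without intercept.
        alpha = statistics.median(
            [y[i] - sum(beta[k] * X[i][k] for k in range(p)) for i in range(n)]
        )
    return alpha, beta
```

On noise-free data from a full grid design, for example y = 1 + 2 x1 - 3 x2 with x1 and x2 each taking the values 0 to 3, this loop should recover the generating coefficients. How it behaves under contamination depends entirely on the univariate estimator plugged into the coordinate update.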

Notes

  1. The signal-to-noise ratio equals \(\frac{\sqrt{\boldsymbol{\beta}'\Sigma\boldsymbol{\beta}}}{\sigma}\).

  2. The expected value of the number of contaminated rows is \(n(1-(1-\epsilon )^p)\) for a cellwise contamination level \(\epsilon \).
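
The two quantities in these notes are easy to evaluate numerically. The short Python sketch below does so; the function names `signal_to_noise` and `expected_contaminated_rows` are hypothetical, and the second function assumes, as the formula in note 2 implies, that each of the \(p\) cells in a row is contaminated independently with probability \(\epsilon\).

```python
import math


def signal_to_noise(beta, Sigma, sigma):
    """sqrt(beta' Sigma beta) / sigma, the ratio in note 1.

    beta is a list of coefficients, Sigma a list of rows (covariance matrix).
    """
    p = len(beta)
    quad = sum(beta[i] * Sigma[i][j] * beta[j] for i in range(p) for j in range(p))
    return math.sqrt(quad) / sigma


def expected_contaminated_rows(n, p, eps):
    """n * (1 - (1 - eps)^p), the expectation in note 2.

    Assumes independent contamination of each cell with probability eps.
    """
    return n * (1.0 - (1.0 - eps) ** p)


# Example with illustrative values (not taken from the paper):
# n = 100 rows, p = 10 predictors, cellwise contamination level 5%.
rows = expected_contaminated_rows(100, 10, 0.05)  # about 40.1 rows
```

With these illustrative values, a cellwise contamination level of only 5% is expected to affect about 40 of 100 rows, which shows how quickly a rowwise downweighting scheme can lose clean information as \(p\) grows.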

Acknowledgments

We gratefully acknowledge support from the GOA/12/014 Project of the Research Fund KU Leuven. We thank the referees for their constructive comments, and in particular the third anonymous referee, who corrected some flaws in the first version of the paper and made many suggestions for improving its write-up.

Author information

Corresponding author

Correspondence to Viktoria Öllerer.

Appendix: Description of variables for real data examples

See Tables 6, 7 and 8.

Table 6 Variables of the Cars93 data
Table 7 Variables of the Auto data
Table 8 Variables of the Boston data

Cite this article

Öllerer, V., Alfons, A. & Croux, C. The shooting S-estimator for robust regression. Comput Stat 31, 829–844 (2016). https://doi.org/10.1007/s00180-015-0593-7

  • DOI: https://doi.org/10.1007/s00180-015-0593-7
