Robust Regression

Brown, Jonathon D.

doi:10.1007/978-3-319-93549-2_7

Abstract

In Chap. 6 we learned how to detect and manage violations of the Gauss-Markov theorem. In this chapter, we consider a related problem—how to accommodate errors that are not normally distributed. Normally distributed errors are not demanded by the Gauss-Markov theorem, but the errors need to be at least approximately normal if we wish to use the normal distribution to test hypotheses about the regression coefficients or construct confidence intervals around them. Fortunately, the central limit theorem tells us that if our criterion is normally distributed, the errors will also be normally distributed with large samples. Normality is less certain with small samples, however, so it is important to examine the residuals to be sure that they are, at least, approximately normal and to take appropriate action if they are found not to be so.

Notes

1. In fact, no distribution is ever “perfectly” normal, so our concern is a relative one.
2. See Chap. 2 for a discussion of the hat matrix and its diagonal elements, called hat values.
3. The term “resistant” is sometimes used to refer to an estimator that retains its value in the face of extreme observations, with a robust estimator being one that is resistant and efficient. The two terms are now used more or less interchangeably, and I will distinguish them only when it is important to do so.
4. Details regarding another resistant estimator, Least Trimmed Squares, can be found in Rousseeuw and Leroy (1987).
5. Least Absolute Regression is also known as Least Absolute Deviation Regression, L₁ Norm Regression, and Quantile Regression (when using the median).
6. The open brackets in the calculation of h indicate that we are to round down to the nearest integer (i.e., take the floor).
7. The number of possible combinations is n! /[(n − p)! ∗ p!], so with large samples, combinations need to be randomly sampled from the data.
8. The value of 1.4826 in Eq. (7.12) is chosen so that when n is large and the errors are normally distributed, s closely approximates the standard deviation of the residuals from an OLS regression.
9. The tuning constants k in Eqs. (7.13) and (7.14) are used because they have been shown to produce estimates that possess 95% efficiency.
10. Bisquare weights perform even better in our example, producing a regression slope that is virtually identical to the one found with the final observation omitted (b = .2438).
11. The bootstrap samples are formed randomly, so your results will not exactly match the ones in the text. Additionally, because our sample size is so small, the estimation might fail to converge.
12. These observations provide the best scale value.
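
Notes 8 and 9 can be illustrated with a minimal Python sketch. Equations (7.12) to (7.14) are not reproduced on this page, so the forms below are assumptions rather than the chapter's own formulas: the scale is taken to be 1.4826 times the median absolute deviation of the residuals from their median, and the tuning constants are the values commonly cited for 95% efficiency under normal errors (1.345 for Huber weights, 4.685 for bisquare weights). The function names are hypothetical.

```python
from statistics import median

MAD_CONSTANT = 1.4826  # constant mentioned in note 8
HUBER_K = 1.345        # assumed value: commonly cited 95%-efficiency constant
BISQUARE_K = 4.685     # assumed value: commonly cited 95%-efficiency constant


def mad_scale(residuals):
    """Robust scale: 1.4826 * median absolute deviation from the median.

    Assumed form; the chapter's Eq. (7.12) may center the residuals differently.
    """
    center = median(residuals)
    return MAD_CONSTANT * median(abs(e - center) for e in residuals)


def huber_weight(u, k=HUBER_K):
    """Weight 1 for small scaled residuals, k/|u| beyond the cutoff k."""
    return 1.0 if abs(u) <= k else k / abs(u)


def bisquare_weight(u, k=BISQUARE_K):
    """Smoothly decreasing weight that reaches 0 beyond the cutoff k."""
    return (1.0 - (u / k) ** 2) ** 2 if abs(u) <= k else 0.0


def robust_weights(residuals, weight_fn):
    """Scale each residual by the MAD-based scale, then apply weight_fn."""
    s = mad_scale(residuals)  # note: s is 0 if more than half the residuals are identical
    return [weight_fn(e / s) for e in residuals]


# Hypothetical residuals with one extreme value in the last position.
example_residuals = [-1.0, 0.0, 1.0, 0.5, -0.5, 10.0]
huber = robust_weights(example_residuals, huber_weight)
bisquare = robust_weights(example_residuals, bisquare_weight)
```

With these made-up residuals, the extreme observation is downweighted under Huber weights and receives a weight of zero under bisquare weights, while the unremarkable residuals keep weights at or near 1.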

References

Andersen, R. (2008). Modern methods for robust regression. Los Angeles: Sage.
Efron, B., & Tibshirani, R. (1994). An introduction to the bootstrap. New York: Chapman & Hall.
Rousseeuw, P. J., & Leroy, A. M. (1987). Robust regression and outlier detection. New York: Wiley.
Rousseeuw, P. J., & Yohai, V. J. (1984). Robust regression by means of S estimators. In J. Franke, W. Härdle, & R. D. Martin (Eds.), Robust and nonlinear time series: Lecture notes in statistics, 26 (pp. 256–272). New York: Springer-Verlag.
Salibian-Barrera, M., & Yohai, V. (2006). A fast algorithm for S-regression estimates. Journal of Computational and Graphical Statistics, 15, 414–427.
Stephens, M. A. (1986). Tests based on EDF statistics. In R. B. D’Agostino & M. A. Stephens (Eds.), Goodness-of-fit techniques (pp. 97–193). New York: Marcel Dekker.

Author information

Authors and Affiliations

Department of Psychology, University of Washington, Seattle, WA, USA
Jonathon D. Brown

About this chapter

Cite this chapter

Brown, J.D. (2018). Robust Regression. In: Advanced Statistics for the Behavioral Sciences. Springer, Cham. https://doi.org/10.1007/978-3-319-93549-2_7

DOI: https://doi.org/10.1007/978-3-319-93549-2_7
Published: 01 May 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93547-8
Online ISBN: 978-3-319-93549-2
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
