A comparative study on detection of influential observations in linear regression

Statistical Papers

Abstract

Many statistics have been proposed in the literature for detecting outliers and influential observations in the linear regression model. This paper compares them to determine which perform best, through (i) a detailed simulation study and (ii) analyses of several data sets studied by different authors. Different choices of the design matrix of the regression model are considered. Design A examines how well the various statistics detect scale-shift outliers, while designs B and C assess how well they identify influential observations. For each statistic, cutoff points are obtained from its exact distribution together with Bonferroni's inequality. The results show that the studentized residual, which is used for detecting mean-shift outliers, is also appropriate for detecting scale-shift outliers, and that Welsch's statistic and Cook's distance are appropriate for detecting influential observations.
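To make the statistics named in the abstract concrete, the following is a minimal sketch of how leverage, studentized residuals, Cook's distance, and a Welsch-Kuh type statistic (DFFITS) might be computed for a simple linear regression. It uses the standard textbook formulas, not necessarily the exact definitions compared in the paper, which this preview does not show. The function name `simple_regression_diagnostics`, the dictionary keys, and the toy data are assumptions made for illustration, and no cutoff points are computed.

```python
import math


def simple_regression_diagnostics(x, y):
    """Per-observation diagnostics for the fit y = b0 + b1 * x + error.

    Textbook formulas only; an illustrative sketch, not the paper's code.
    """
    n = len(x)
    p = 2  # number of coefficients: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - p)  # residual mean square

    rows = []
    for xi, e in zip(x, resid):
        # Leverage: diagonal element of the hat matrix (closed form for one predictor).
        h = 1.0 / n + (xi - x_bar) ** 2 / sxx
        # Internally studentized residual.
        r = e / math.sqrt(s2 * (1.0 - h))
        # Externally studentized residual (variance estimated without case i).
        t = r * math.sqrt((n - p - 1) / (n - p - r * r))
        # Cook's distance.
        cooks_d = r * r * h / (p * (1.0 - h))
        # Welsch-Kuh statistic (DFFITS); assumed here, the paper may use a variant.
        dffits = t * math.sqrt(h / (1.0 - h))
        rows.append({"leverage": h, "r_int": r, "t_ext": t,
                     "cooks_d": cooks_d, "dffits": dffits})
    return rows


# Hypothetical toy data: the last case has an extreme x and lies off the trend.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
y = [3.1, 4.9, 7.2, 9.0, 10.8, 13.1, 15.0, 17.2, 18.9, 25.0]
diag = simple_regression_diagnostics(x, y)
worst = max(range(len(diag)), key=lambda i: diag[i]["cooks_d"])
print("case with largest Cook's distance:", worst)
```

In practice each statistic would be compared against a cutoff; the abstract states that the paper derives these from exact distributions and Bonferroni's inequality, which this sketch leaves out.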

References

  • Aitchison, J. and Dunsmore, I. R. (1975). Statistical Prediction Analysis. Cambridge University Press.

  • Atkinson, A. C. (1981). Two graphical displays for outlying and influential observations in regression. Biometrika, 68: 13–20.

  • Atkinson, A. C. (1985). Plots, Transformations, and Regression. University Press, Oxford.

  • Balasooriya, U. and Tse, Y. K. (1986). Outlier detection in linear models: A comparative study in simple linear regression. Communications in Statistics: Theory and Methods, 15(12): 3589–3597.

  • Balasooriya, U., Tse, Y. K. and Liew, Y. S. (1987). An empirical comparison of some statistics for identifying outliers and influential observations in linear regression models. Journal of Applied Statistics, 14: 177–184.

  • Beckman, R. J. and Cook, R. D. (1983). Outlier …s. Technometrics, 25: 119–149.

  • Belsley, D. A., Kuh, E. and Welsch, R. E. (1980). Regression Diagnostics: Identifying influential data and sources of collinearity. John Wiley, New York.

  • Brownlee, K. A. (1965). Statistical Theory and Methodology in Science and Engineering. 2nd edn. John Wiley, New York.

  • Chatterjee, S. and Hadi, A. S. (1986). Influential observations, high leverage points, and outliers in linear regression. Statistical Science, 1: 379–416.

  • Cook, R. D. (1977). Detection of influential observations in linear regression. Technometrics, 19: 15–18.

  • Cook, R. D. and Weisberg, S. (1980). Characterization of an empirical influence function for detecting influential cases in regression. Technometrics, 22: 495–508.

  • Cook, R. D. and Weisberg, S. (1982). Residuals and Influence in Regression. Chapman and Hall, London.

  • Cook, R. D., Holschuh, N. and Weisberg, S. (1982). A note on an alternative outlier model. Journal of Royal Statist. Soc. B, 44: 370–376.

  • Ellenberg, J. H. (1973). The joint distribution of the standardized least squares residuals from a general linear regression. Journal of the Amer. Statist. Assoc., 68: 941–943.

  • Gibbons, D. G. (1981). A simulation study of some ridge estimators. Journal of the Amer. Statist. Assoc., 76: 131–139.

  • Hoaglin, D. C. and Kempthorne, P. J. (1986). Comment on Chatterjee and Hadi's paper. Statistical Science, 1: 408–412.

  • Hoaglin, D. C. and Welsch, R. E. (1978). The hat matrix in regression and ANOVA. The American Statistician, 32: 117–122.

  • Hossain, A. (1989). Detection of outliers and influential observations in regression models. Unpublished Dissertation, Old Dominion University.

  • Hossain, A. and Naik, D. N. (1989). Detection of influential observations in multivariate regression. Journal of Applied Statistics, 16: 25–37.

  • Mickey, M. R., Dunn, O. J. and Clark, V. (1967). Note on the use of stepwise regression in detecting outliers. Computers and Biomedical Research, 1: 105–111.

  • Moore, J. (1975). Total biochemical oxygen demand of dairy manures. Ph.D thesis, University of Minnesota.

  • Naik, D. N. (1989). Detection of outliers in the multivariate linear regression model. Communications in Statistics: Theory and Methods, 16(6): 2225–2232.

  • Srikantan, K. S. (1961). Testing for the single outlier in a regression model. Sankhya, Series A, 23: 251–260.

  • Weisberg, S. (1980). Applied linear regression. John Wiley, New York.

  • Welsch, R. E. and Kuh, E. (1977). Linear regression diagnostics. Massachusetts Institute of Technology. Technical report 923–77.

  • Welsch, R. E. (1982). Influence function and regression diagnostics. In Modern data analysis. R. L. Launer and A. F. Siegel, Eds. Academic, New York.

Cite this article

Hossain, A., Naik, D.N. A comparative study on detection of influential observations in linear regression. Statistical Papers 32, 55–69 (1991). https://doi.org/10.1007/BF02925479
