Outliers

  • Gideon J. Mellenbergh
Chapter

Abstract

An outlier is a value of a variable that is inconsistent with the other values. The discussion is restricted to univariate outliers, that is, outliers that are inconsistent with the other values of the same variable. A naive strategy is to detect outliers with the Z-score method, remove them, and analyze the remaining data as usual. The Z-score method is incorrect and should be replaced by other methods, such as the MAD-score method. Researchers have to check whether outliers are caused by mistakes. If mistakes are detected and the correct values are known, the outliers are corrected. If mistakes are detected but the correct values are not known, the outliers are treated as missing data, and the data are analyzed with model-based statistical methods that assume that the data are missing completely at random (MCAR) or missing at random (MAR). If no mistakes are found, two strategies are suitable. The first is to study the robustness of the substantive conclusions against outliers: the data are analyzed with and without the outliers using the same statistical methods, the results of the two analyses are compared, and either the results of both analyses or the results of the analysis that gives the weakest support to the substantive hypothesis are reported. The second is to analyze the data with robust statistical methods, such as bootstrap methods. Finally, whatever method researchers use, they should always report the frequency and handling of the outliers.
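
The contrast between the two detection methods, and the with-and-without comparison, can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: the MAD-score is taken to be the deviation from the median divided by the median absolute deviation, scaled by the conventional constant 1.4826; the cutoff of 3 is arbitrary; and the data, function names, and choice of the mean as the "analysis" are hypothetical. The chapter's own definitions may differ.

```python
import statistics


def z_scores(values):
    """Classical Z-scores: deviation from the mean in sample-SD units."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def mad_scores(values, scale=1.4826):
    """MAD-scores: deviation from the median in scaled-MAD units.

    The scale constant 1.4826 is an assumed convention (it makes the MAD
    comparable to the SD under normality), not taken from the chapter.
    """
    med = statistics.median(values)
    mad = scale * statistics.median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; MAD-scores are undefined for these data")
    return [(v - med) / mad for v in values]


def flag(scores, cutoff):
    """Mark a value as a candidate outlier when |score| exceeds the cutoff."""
    return [abs(s) > cutoff for s in scores]


# Hypothetical data: eight ordinary values and two extreme ones.
data = [2, 3, 3, 4, 4, 5, 5, 6, 50, 60]
CUTOFF = 3.0  # assumed cutoff, chosen for illustration only

z_flags = flag(z_scores(data), CUTOFF)
mad_flags = flag(mad_scores(data), CUTOFF)

# Robustness check: run the same analysis (here simply the mean)
# with and without the values flagged by the MAD-score method.
kept = [v for v, is_outlier in zip(data, mad_flags) if not is_outlier]
mean_with = statistics.fmean(data)
mean_without = statistics.fmean(kept)

print("Flagged by Z-score:  ", [v for v, f in zip(data, z_flags) if f])
print("Flagged by MAD-score:", [v for v, f in zip(data, mad_flags) if f])
print("Mean with outliers:   ", mean_with)
print("Mean without outliers:", mean_without)
print("Number of flagged values:", sum(mad_flags))
```

In this toy data set the two extreme values inflate the mean and standard deviation so much that neither exceeds the Z-score cutoff, whereas the median and MAD are barely affected and both values are flagged. The final lines mirror the reporting advice: both results and the number of flagged values are shown rather than silently discarding data.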

Keywords

Bootstrap methods; Content robustness against outliers; Kendall’s tau; MAD-score method; Univariate outliers

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Emeritus Professor Psychological Methods, Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
