Statistical Papers

Volume 52, Issue 1, pp 219–231

The two-sample t test: pre-testing its assumptions does not pay off

  • Dieter Rasch
  • Klaus D. Kubinger
  • Karl Moder
Regular Article


Traditionally, applying the two-sample t test is preceded by some pre-testing. That is, in the applied sciences the theoretical assumptions of normally distributed data and of homogeneous variances are often tested before the intended t test is carried out. This paper shows, however, that such pre-testing leads to unknown final type-I and type-II risks if the respective statistical tests are performed on the same set of observations. To gauge the extent of the resulting misinterpreted risks, some theoretical deductions are given and, in particular, a systematic simulation study is carried out. As a result, we propose applying neither pre-tests nor the t test itself, and instead using the Welch-test as the standard test: its power comes close to that of the t test when the variances are homogeneous, and for unequal variances and skewness values |γ₁| < 3 it maintains the so-called 20% robustness, whereas the t test as well as Wilcoxon’s U test cannot be recommended for most cases.
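
For readers unfamiliar with the Welch-test recommended above, the following is a minimal sketch of its test statistic and the Welch-Satterthwaite approximation to the degrees of freedom. It is not taken from the paper; the function name `welch_statistic` and the sample data are illustrative assumptions, and only the Python standard library is used.

```python
from math import sqrt
from statistics import mean, variance


def welch_statistic(x, y):
    """Return (t, df) for Welch's two-sample test (hypothetical helper)."""
    n1, n2 = len(x), len(y)
    # Estimated variance of each sample mean (unbiased sample variance / n).
    v1 = variance(x) / n1
    v2 = variance(y) / n2
    # Unlike the pooled t test, the two variances are not averaged.
    t = (mean(x) - mean(y)) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Illustrative data only: equal sample sizes and equal sample variances.
t, df = welch_statistic([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
```

In this example the sample sizes and sample variances are equal, so the approximate degrees of freedom reduce to n1 + n2 − 2, the value used by the pooled t test. This is consistent with the abstract's remark that the Welch-test loses little when the variances are homogeneous.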


Pre-tests · Two-sample t test · Welch-test · Wilcoxon-U test





Copyright information

© Springer-Verlag 2009

Authors and Affiliations

  1. Department of Landscape, Spatial and Infrastructure Sciences, Institute of Applied Statistics and Computing, University of Natural Resources and Applied Life Sciences, Vienna, Austria
  2. Division of Psychological Assessment and Applied Psychometrics, Faculty of Psychology, University of Vienna, Vienna, Austria
