# Methods to test for equality of two normal distributions

## Abstract

Statistical tests for two independent samples under the assumption of normality are applied routinely by most practitioners of statistics. Likewise, presumably every introductory course in statistics treats some statistical procedures for two independent normal samples. Often, the classical two-sample model with equal variances is introduced, with the emphasis that a test for equality of the expected values is then also a test for equality of both distributions, which is the actual goal. In a second step, the assumption of equal variances is usually discarded, and the two-sample *t* test with Welch correction and the *F* test for equality of variances are introduced. The first is treated solely as a test for equality of central location, the second solely as a test for equality of scatter. Typically, there is no discussion of whether and to what extent testing for equality of the underlying normal distributions is possible, which is quite unsatisfactory given the motivation and treatment of the situation with equal variances. The aim of this article is to investigate the problem of testing for equality of two normal distributions, using knowledge and methods accessible to statistical practitioners as well as to students in an introductory statistics course. The power of the different tests discussed in the article is examined empirically. Finally, we apply the tests to several real data sets to illustrate their performance. In particular, we consider several data sets arising from intelligence tests, since there is a large body of research supporting the existence of sex differences in mean scores or in variability in specific cognitive abilities.
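
As a rough illustration of how a test for location and a test for scatter might be merged into one test for equality of distributions, the sketch below combines two p-values with the standard Fisher and minimum (Tippett) combination rules. It is a minimal sketch, not the article's implementation: the function names and the example p-values are hypothetical, and both rules assume that the two p-values are independent under the null hypothesis.

```python
import math


def fisher_combination(p1: float, p2: float) -> float:
    """Fisher's method for two independent p-values (illustrative helper).

    The statistic -2 * (ln p1 + ln p2) follows a chi-square distribution
    with 4 degrees of freedom under the null hypothesis. Its survival
    function has the closed form exp(-x / 2) * (1 + x / 2), which reduces
    to the expression returned below.
    """
    product = p1 * p2
    if product == 0.0:
        return 0.0  # a zero p-value gives a zero combined p-value
    return product * (1.0 - math.log(product))


def minimum_combination(p1: float, p2: float) -> float:
    """Minimum (Tippett) method for two independent p-values.

    Under the null hypothesis the smaller of two independent uniform
    p-values m satisfies P(min <= m) = 1 - (1 - m)^2.
    """
    return 1.0 - (1.0 - min(p1, p2)) ** 2


# Hypothetical p-values, e.g. from a test on means and a test on variances.
p_location, p_scale = 0.04, 0.20
print(fisher_combination(p_location, p_scale))
print(minimum_combination(p_location, p_scale))
```

Both functions return a single p-value for the joint hypothesis. The Fisher rule rewards moderate evidence in both components, whereas the minimum rule reacts mainly to the stronger of the two.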

## Keywords

- Fisher combination method
- Minimum combination method
- Likelihood ratio test
- Two-sample model

## Notes

### Acknowledgments

The authors thank the Editor and two anonymous referees for their valuable comments on the original version of the manuscript.
