Abstract
This paper reviews recently proposed exact tests based on the Baumgartner-Weiß-Schindler statistic and its modification. Except for the generalized Behrens-Fisher problem, these tests are broadly applicable, and they can be used to compare two groups whether or not ties occur. In addition, a nonparametric trend test and a trend test for binomial proportions can be constructed. These exact tests are preferable to commonly applied tests, such as the Wilcoxon rank sum test, in terms of both type I error rate and power.
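To make the idea of an exact test based on this statistic concrete, the following is a minimal sketch of a two-sample permutation test. It rests on assumptions not stated in the abstract: the statistic is written in the form commonly attributed to Baumgartner, Weiß and Schindler (1998), ties are handled with midranks, and the function names (`midranks`, `bws_part`, `bws_statistic`, `exact_bws_pvalue`) are hypothetical. It enumerates every split of the pooled sample, so it is only practical for small groups, and it should be checked against the original papers before any real use.

```python
from itertools import combinations


def midranks(values):
    """Ranks of `values` in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def bws_part(group_ranks, n_other):
    """One half of the statistic (assumed form): weighted squared rank deviations."""
    n = len(group_ranks)
    total_n = n + n_other
    acc = 0.0
    for i, r in enumerate(sorted(group_ranks), start=1):
        expected = total_n / n * i
        u = i / (n + 1)
        weight = u * (1 - u) * n_other * total_n / n
        acc += (r - expected) ** 2 / weight
    return acc / n


def bws_statistic(ranks_x, ranks_y):
    """Average of the two group-wise parts."""
    return 0.5 * (bws_part(ranks_x, len(ranks_y)) + bws_part(ranks_y, len(ranks_x)))


def exact_bws_pvalue(x, y):
    """Observed statistic and exact permutation p-value by full enumeration."""
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    n, total_n = len(x), len(pooled)
    observed = bws_statistic(ranks[:n], ranks[n:])
    at_least_as_extreme = 0
    n_splits = 0
    for idx in combinations(range(total_n), n):
        chosen = set(idx)
        rx = [ranks[i] for i in idx]
        ry = [ranks[i] for i in range(total_n) if i not in chosen]
        n_splits += 1
        # Small tolerance so that splits with the same statistic count as equal.
        if bws_statistic(rx, ry) >= observed - 1e-12:
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / n_splits


if __name__ == "__main__":
    stat, p = exact_bws_pvalue([1.2, 2.3, 3.1], [4.8, 5.0, 6.7])
    print(f"B = {stat:.4f}, exact p = {p:.4f}")
```

Because the p-value comes from the permutation distribution of the observed ranks, the same routine applies unchanged when ties are present; only the set of attainable statistic values changes.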
Cite this article
Neuhäuser, M. Exact tests based on the Baumgartner-Weiß-Schindler statistic—A survey. Statistical Papers 46, 1–29 (2005). https://doi.org/10.1007/BF02762032
DOI: https://doi.org/10.1007/BF02762032