Exact tests based on the Baumgartner-Weiß-Schindler statistic—A survey

  • Survey Article
Statistical Papers

Abstract

This paper reviews recently proposed exact tests based on the Baumgartner-Weiß-Schindler statistic and its modification. Except for the generalized Behrens-Fisher problem, these tests are broadly applicable: they can be used to compare two groups whether or not ties occur. In addition, a nonparametric trend test and a trend test for binomial proportions are available. These exact tests are preferable to commonly applied tests, such as the Wilcoxon rank sum test, in terms of both type I error rate and power.
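
To make the idea of an exact two-sample test concrete, the following is a minimal Python sketch. It is not taken from the paper. It assumes the usual two-sided form of the Baumgartner-Weiß-Schindler statistic, B = (B_X + B_Y)/2, computed from the ordered combined-sample ranks, and it obtains the exact p-value by enumerating every reassignment of the pooled ranks to the two groups. The function names (`midranks`, `bws_statistic`, `exact_bws_test`) are hypothetical, and the use of midranks for tied observations is an assumption of this sketch.

```python
from itertools import combinations


def midranks(values):
    """Return 1-based ranks of `values`, averaging ranks within tied groups."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1 .. end+1
        for k in order[pos:end + 1]:
            ranks[k] = avg
        pos = end + 1
    return ranks


def bws_statistic(x_ranks, y_ranks):
    """Two-sided BWS statistic B = (B_X + B_Y) / 2 (assumed standard form)."""
    n, m = len(x_ranks), len(y_ranks)

    def half(ranks, size, other):
        total = 0.0
        for i, r in enumerate(sorted(ranks), start=1):
            expected = (n + m) / size * i          # centring term for the i-th ordered rank
            frac = i / (size + 1)
            weight = frac * (1 - frac) * other * (n + m) / size
            total += (r - expected) ** 2 / weight
        return total / size

    return (half(x_ranks, n, m) + half(y_ranks, m, n)) / 2


def exact_bws_test(x, y):
    """Return (observed B, exact permutation p-value) by complete enumeration."""
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    n, total = len(x), len(pooled)
    observed = bws_statistic(ranks[:n], ranks[n:])
    count = hits = 0
    for idx in combinations(range(total), n):
        chosen = set(idx)
        xr = [ranks[k] for k in idx]
        yr = [ranks[k] for k in range(total) if k not in chosen]
        count += 1
        # Small tolerance guards against floating-point noise in equal statistics.
        if bws_statistic(xr, yr) >= observed - 1e-12:
            hits += 1
    return observed, hits / count


if __name__ == "__main__":
    b, p = exact_bws_test([1, 2, 3, 4], [5, 6, 7, 8])
    print(b, p)
```

Complete enumeration visits every possible split of the pooled sample, so this sketch is practical only for small groups. For larger samples a random subset of permutations is commonly used instead, which gives an approximate rather than an exact p-value.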

Author information

Correspondence to Markus Neuhäuser.

About this article

Cite this article

Neuhäuser, M. Exact tests based on the Baumgartner-Weiß-Schindler statistic—A survey. Statistical Papers 46, 1–29 (2005). https://doi.org/10.1007/BF02762032

  • DOI: https://doi.org/10.1007/BF02762032
