Exact tests based on the Baumgartner-Weiß-Schindler statistic—A survey

Abstract

This paper reviews recently proposed exact tests based on the Baumgartner-Weiß-Schindler statistic and its modification. Except for the generalized Behrens-Fisher problem, these tests are broadly applicable, and they can be used to compare two groups whether or not ties occur. In addition, a nonparametric trend test and a trend test for binomial proportions are possible. These exact tests are preferable to commonly applied tests, such as the Wilcoxon rank sum test, in terms of both type I error rate and power.
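
To make the idea of an exact two-sample test concrete, the following is a minimal sketch in Python, not the implementation used in the paper. The abstract does not state the statistic, so the formula below is the form commonly attributed to Baumgartner, Weiß and Schindler (1998) and should be checked against that paper. Handling ties through midranks is an assumption of this sketch, and the function names (midranks, bws_from_ranks, exact_bws_test) are hypothetical. The p-value is obtained by enumerating every split of the pooled ranks into two groups, which is feasible only for small samples.

```python
from itertools import combinations


def midranks(values):
    """Return 1-based ranks of values, averaging ranks within tied groups."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # average of the 1-based positions pos+1..end+1
        for k in order[pos:end + 1]:
            ranks[k] = avg
        pos = end + 1
    return ranks


def bws_from_ranks(rx, ry):
    """BWS statistic B computed from the pooled-sample ranks of each group.

    Assumed form: B = (B_X + B_Y) / 2, where each part averages squared
    deviations of the ordered ranks from their expected positions, weighted
    by a variance-like term.
    """
    n, m = len(rx), len(ry)
    big_n = n + m

    def part(group_ranks, a, b):
        # a = size of this group, b = size of the other group
        total = 0.0
        for i, rank in enumerate(sorted(group_ranks), start=1):
            p = i / (a + 1)
            total += (rank - big_n / a * i) ** 2 / (p * (1 - p) * b * big_n / a)
        return total / a

    return (part(rx, n, m) + part(ry, m, n)) / 2


def exact_bws_test(x, y):
    """Return (observed B, exact permutation p-value) for two small samples."""
    ranks = midranks(list(x) + list(y))
    n, big_n = len(x), len(x) + len(y)
    observed = bws_from_ranks(ranks[:n], ranks[n:])
    count = total = 0
    # Enumerate every way of assigning n of the pooled ranks to the first group.
    for idx in combinations(range(big_n), n):
        chosen = set(idx)
        rx = [ranks[k] for k in idx]
        ry = [ranks[k] for k in range(big_n) if k not in chosen]
        total += 1
        # Small tolerance so floating-point noise does not break exact ties.
        if bws_from_ranks(rx, ry) >= observed - 1e-12:
            count += 1
    return observed, count / total
```

For two completely separated samples of size three, such as exact_bws_test([1, 2, 3], [4, 5, 6]), only 2 of the 20 possible splits reach the observed value, so the smallest attainable p-value is 0.1. This illustrates why discreteness matters for exact tests with small samples.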

Author information

Corresponding author

Correspondence to Markus Neuhäuser.

About this article

Cite this article

Neuhäuser, M. Exact tests based on the Baumgartner-Weiß-Schindler statistic—A survey. Statistical Papers 46, 1–29 (2005). https://doi.org/10.1007/BF02762032

Key words

  • Cochran-Armitage test
  • conditional test
  • conservatism
  • Jonckheere-Terpstra test
  • Kolmogorov-Smirnov test
  • nonparametrics
  • ties
  • Wilcoxon rank sum test