Abstract
The International Conference on Harmonization (ICH) E9 guideline recommends using a significance level of α/2 for one-sided tests in regulatory settings. Two arguments are presented to demonstrate that this approach may not be universally sensible. First, a two-sided p-value is not always twice the minimum of the two tail probabilities, that is, the two possible one-sided p-values. Based on Fisher’s exact test, examples are presented in which the one-sided p-value is larger than α/2 although the corresponding two-sided p-value is smaller than α. Second, the choice between one- and two-sided tests is an artificial dichotomy since there is a continuum of choices when using asymmetrical critical regions. Such an unequal split of α is implicitly used when Fisher’s exact test is applied two-sided. Furthermore, a test intermediate to one- and two-sided tests is sometimes appropriate in group sequential designs.
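The first argument can be illustrated with a small calculation. The sketch below is not taken from the article: the 2 x 2 table (4 of 4 responders in one group, 5 of 16 in the other) is a hypothetical example chosen for illustration, and the helper name `fisher_p_values` is an assumption. It computes the two one-sided p-values of Fisher's exact test from the hypergeometric distribution and, for the two-sided p-value, assumes the common convention of summing the probabilities of all tables that are no more probable than the observed one.

```python
from fractions import Fraction
from math import comb


def fisher_p_values(a, b, c, d):
    """Exact Fisher p-values (lower, upper, two-sided) for [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Hypergeometric probability of each possible top-left cell, margins fixed.
    prob = {x: Fraction(comb(row1, x) * comb(row2, col1 - x), total)
            for x in range(lo, hi + 1)}
    lower = sum(p for x, p in prob.items() if x <= a)  # P(X <= a)
    upper = sum(p for x, p in prob.items() if x >= a)  # P(X >= a)
    # Two-sided (assumed convention): sum over tables no more probable
    # than the observed one, in either tail.
    two_sided = sum(p for p in prob.values() if p <= prob[a])
    return lower, upper, two_sided


# Hypothetical table, not from the article: 4/4 responders vs. 5/16.
alpha = Fraction(5, 100)
lower, upper, two_sided = fisher_p_values(4, 0, 5, 11)
one_sided = min(lower, upper)
print("one-sided p:", float(one_sided), "exceeds alpha/2:", one_sided > alpha / 2)
print("two-sided p:", float(two_sided), "below alpha:", two_sided < alpha)
```

For this table both p-values equal 42/1615 (about 0.026), because no table in the opposite tail is as improbable as the observed one. With α = 0.05, the one-sided test at α/2 = 0.025 therefore fails to reject although the two-sided test at α rejects, and the two-sided rejection region lies entirely in one tail, which is an unequal split of α.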
Cite this article
Neuhäuser, M. The Choice of α for One-Sided Tests. Ther Innov Regul Sci 38, 57–60 (2004). https://doi.org/10.1177/009286150403800108
DOI: https://doi.org/10.1177/009286150403800108