Abstract
Environmental scientists face the reality that many of their journals’ editors and referees routinely insist that results be accompanied by statements of statistical significance, obtained from two-sided tests of point-null hypotheses. Many in these three groups of people appear only vaguely aware of the arbitrariness often invoked by this procedure and of the information sterility in a single p-value. The interpretation to be made of the failure of a test to attain such significance is not clear. For such reasons, some colleagues (and senior statisticians) have called current usage of the procedures into serious question. Some reasons for this dislocation and some of the more dramatic consequences for environmental science and management are presented. Interval and Bayesian approaches can offer remedies.
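As a hedged illustration of the "information sterility" of a single p-value, the sketch below compares two hypothetical studies that give essentially the same two-sided p-value against a point null of zero difference, yet have very different confidence intervals. The numbers, the known-sigma z-test, and the function name `z_test_summary` are illustrative assumptions and are not taken from the article.

```python
from math import sqrt
from statistics import NormalDist


def z_test_summary(mean_diff, sigma, n, level=0.95):
    """Two-sided z-test of H0: difference = 0, plus a confidence interval.

    Assumes a known standard deviation `sigma` (a simplification made
    only to keep the example short).
    """
    std_normal = NormalDist()
    se = sigma / sqrt(n)                      # standard error of the mean difference
    z = mean_diff / se                        # test statistic under the point null
    p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    half_width = std_normal.inv_cdf(0.5 + level / 2.0) * se
    return p_value, (mean_diff - half_width, mean_diff + half_width)


# Hypothetical study A: small effect, large sample.
p_small, ci_small = z_test_summary(mean_diff=0.2, sigma=1.0, n=400)
# Hypothetical study B: large effect, small sample.
p_large, ci_large = z_test_summary(mean_diff=2.0, sigma=1.0, n=4)

print(f"Study A: p = {p_small:.2e}, 95% CI = ({ci_small[0]:.3f}, {ci_small[1]:.3f})")
print(f"Study B: p = {p_large:.2e}, 95% CI = ({ci_large[0]:.3f}, {ci_large[1]:.3f})")
```

Both studies would be reported as equally "significant", but only the intervals show that the plausible effect sizes do not even overlap, which is the kind of information an interval approach retains and a lone p-value discards.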
McBride, G.B. Statistical methods helping and hindering environmental science and management. JABES 7, 300–305 (2002). https://doi.org/10.1198/108571102258
DOI: https://doi.org/10.1198/108571102258