Statistical methods helping and hindering environmental science and management

Abstract

Environmental scientists face the reality that many of their journals’ editors and referees routinely insist that results be accompanied by statements of statistical significance, obtained from two-sided tests of point-null hypotheses. Many in these three groups of people appear only vaguely aware of the arbitrariness often invoked by this procedure and of the information sterility in a single p-value. The interpretation to be made of the failure of a test to attain such significance is not clear. For such reasons, some colleagues (and senior statisticians) have called current usage of the procedures into serious question. Some reasons for this dislocation and some of the more dramatic consequences for environmental science and management are presented. Interval and Bayesian approaches can offer remedies.
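
As a rough illustration of why a single p-value can be uninformative, the sketch below compares two hypothetical monitoring results under a normal approximation. All numbers, the function name `summarize`, and the equivalence half-width `delta` are assumptions made up for this example, and the two one-sided tests (TOST) shown are one common equivalence procedure, not necessarily the one the article uses.

```python
from statistics import NormalDist

def summarize(diff, se, delta, alpha=0.05):
    """Normal-approximation summary of an estimated difference.

    diff  : estimated difference (hypothetical, e.g. downstream minus upstream)
    se    : standard error of that estimate
    delta : hypothetical half-width of the equivalence interval (-delta, +delta)
    """
    z = NormalDist()
    # Two-sided test of the point null "true difference = 0".
    p_point = 2 * (1 - z.cdf(abs(diff) / se))
    # (1 - alpha) confidence interval for the difference.
    half = z.inv_cdf(1 - alpha / 2) * se
    ci = (diff - half, diff + half)
    # Two one-sided tests of the interval null "|true difference| >= delta".
    p_lower = 1 - z.cdf((diff + delta) / se)  # H0: difference <= -delta
    p_upper = z.cdf((diff - delta) / se)      # H0: difference >= +delta
    p_equiv = max(p_lower, p_upper)
    return {"p_point": p_point, "ci": ci, "p_equiv": p_equiv}

# Two made-up studies with the same z-ratio, hence the same point-null p-value.
precise = summarize(diff=0.5, se=0.4, delta=2.0)
vague = summarize(diff=2.5, se=2.0, delta=2.0)

for name, result in (("precise", precise), ("vague", vague)):
    print(name, result)
```

Both made-up studies fail to reach significance against the point null with an identical p-value, yet the interval for the first lies wholly inside (-delta, +delta) while the interval for the second is far too wide to say anything. Under these assumptions, only the interval and the equivalence test separate "no material difference" from "not enough data".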

Key Words

Burden of proof; Compliance rules; Credibility and confidence intervals; Equivalence; Interval hypothesis; Null hypothesis

Copyright information

© International Biometric Society 2002

Authors and Affiliations

  1. NIWA (National Institute of Water and Atmospheric Research), Hamilton, New Zealand