Routine application of significance tests does not extract the maximum information from environmental data and can lead to misleading conclusions. There are three reasons: a significant result can often be reached merely by collecting enough samples; a statistically significant result is not necessarily practically significant; and reports of the presence or absence of statistically significant differences for multiple tests are not comparable unless identical sample sizes are used. These problems are demonstrated by application to pH data for grazed and retired fields, and by discussion of significance tests used in recent US regulations for groundwater quality. The advantages of equivalence tests, where the tester must state the size of difference that is of practical importance, are discussed and applied to the field pH problem. We recommend that environmental managers and scientists pay more attention to statistical power and decide what constitutes a practical difference. Confidence intervals for the size of the differences, accompanied where necessary by equivalence tests, are the preferred means of addressing the question: “is there a difference of practical significance?”
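The contrast drawn above between a significance test, a confidence interval, and an equivalence test can be sketched in a few lines of Python. This is a minimal illustration and not the authors' analysis: it assumes two groups of equal size with a known common standard deviation (so a normal approximation applies), and every number in it (the observed difference of 0.05 pH units, the standard deviation of 0.3, the practical margin `delta` of 0.2) is a hypothetical value chosen for the example, not data from the article. The function names are likewise invented here.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def standard_error(sd, n):
    """Standard error of a difference in means, n samples per group, common sd."""
    return sd * sqrt(2.0 / n)


def significance_p(diff, sd, n):
    """Two-sided p-value for the null hypothesis 'true difference is zero'."""
    z = diff / standard_error(sd, n)
    return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))


def confidence_interval(diff, sd, n, level=0.95):
    """Normal-theory confidence interval for the true difference."""
    half_width = STD_NORMAL.inv_cdf(0.5 + level / 2.0) * standard_error(sd, n)
    return diff - half_width, diff + half_width


def equivalence_p(diff, sd, n, delta):
    """Two one-sided tests: the null is 'true difference lies outside +/- delta'.

    A small p-value supports the claim that the difference is smaller than
    the practical margin delta, which the tester must state in advance.
    """
    se = standard_error(sd, n)
    p_lower = 1.0 - STD_NORMAL.cdf((diff + delta) / se)  # null: diff <= -delta
    p_upper = STD_NORMAL.cdf((diff - delta) / se)        # null: diff >= +delta
    return max(p_lower, p_upper)


# Hypothetical inputs (assumptions for illustration only).
observed_diff = 0.05   # difference in mean pH between two field types
common_sd = 0.3        # assumed known standard deviation within each group
delta = 0.2            # smallest difference regarded as practically important

for n in (10, 100, 1000):
    low, high = confidence_interval(observed_diff, common_sd, n)
    print(
        f"n={n:5d}  significance p={significance_p(observed_diff, common_sd, n):.4f}  "
        f"95% CI=({low:+.3f}, {high:+.3f})  "
        f"equivalence p={equivalence_p(observed_diff, common_sd, n, delta):.4f}"
    )
```

Under these assumed inputs, the same small observed difference moves from non-significant to significant as the sample size grows, while the confidence interval narrows around a value well inside the stated practical margin. That is the situation in which a significance test alone would mislead and an interval or equivalence test is more informative.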
About this article
Cite this article
McBride, G.B., Loftis, J.C. & Adkins, N.C. What do significance tests really tell us about the environment? Environmental Management 17, 423–432 (1993). https://doi.org/10.1007/BF02394658
Keywords
- Hypothesis tests
- Statistical significance
- Statistical power
- Equivalence test