Abstract
Ecotoxicologists often encounter count and proportion data that are rarely normally distributed. To meet the assumptions of the linear model, such data are usually transformed; if the transformed data still violate the assumptions, non-parametric methods are used. Generalized linear models (GLMs) allow such data to be modeled directly, without the need for transformation. Here, we compare the performance of three approaches: two parametric methods, namely (1) the linear model (assuming normality of transformed data) and (2) GLMs (assuming a Poisson, negative binomial, or binomially distributed response), and (3) non-parametric methods. We simulated typical data mimicking ecotoxicological experiments with low replication for two common data types (counts and proportions from counts). We compared the performance of the different methods in terms of statistical power and Type I error for detecting a general treatment effect and for determining the lowest observed effect concentration (LOEC). In addition, we illustrated the differences between methods using a real-world mesocosm data set. For count data, we found that the quasi-Poisson model yielded the highest power. The negative binomial GLM resulted in inflated Type I error rates, which could be corrected using a parametric bootstrap. For proportions, binomial GLMs performed better than the linear model, except when determining the LOEC at extremely low sample sizes. The non-parametric methods compared generally had lower power. We recommend that counts in one-factorial experiments be analyzed using quasi-Poisson models and proportions from counts using binomial GLMs. These methods should become standard in ecotoxicology.
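To make the recommended quasi-Poisson analysis of counts more concrete, the following is a minimal sketch in plain Python of an F-test for a general treatment effect in a one-factorial design. It is not the authors' code: the function names (`poisson_deviance`, `quasi_poisson_f`) and the example counts are hypothetical, and the sketch relies on the fact that, in a one-factor Poisson GLM with an intercept, the fitted values equal the group means. It assumes every treatment group contains at least one non-zero count.

```python
import math


def poisson_deviance(y, mu):
    """Poisson deviance 2 * sum(y * log(y / mu) - (y - mu)), taking 0 * log(0) as 0."""
    total = 0.0
    for yi, mi in zip(y, mu):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += log_term - (yi - mi)
    return 2.0 * total


def quasi_poisson_f(groups):
    """F statistic for a general treatment effect in a one-factor quasi-Poisson model.

    groups: list of lists of counts, one inner list per treatment.
    Returns (dispersion, f_stat, df_num, df_den).
    """
    y = [count for group in groups for count in group]
    n, k = len(y), len(groups)
    # Null model: a single common mean for all observations.
    mu_null = [sum(y) / n] * n
    # Full model: fitted values are the treatment group means.
    mu_full = [sum(group) / len(group) for group in groups for _ in group]
    # Pearson estimate of the dispersion parameter from the full model.
    pearson = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu_full))
    dispersion = pearson / (n - k)
    # Drop in deviance between null and full model, scaled by the dispersion.
    deviance_drop = poisson_deviance(y, mu_null) - poisson_deviance(y, mu_full)
    f_stat = deviance_drop / (k - 1) / dispersion
    return dispersion, f_stat, k - 1, n - k


# Hypothetical counts: control, low, and high concentration, three replicates each.
counts = [[12, 18, 9], [14, 10, 20], [2, 5, 1]]
dispersion, f_stat, df_num, df_den = quasi_poisson_f(counts)
print(f"dispersion = {dispersion:.2f}, F({df_num}, {df_den}) = {f_stat:.2f}")
```

A p-value would follow from comparing the statistic with an F distribution on the returned degrees of freedom, for example with `scipy.stats.f.sf(f_stat, df_num, df_den)` if SciPy is available. A dispersion estimate well above 1 indicates overdispersion relative to the Poisson model, which is the situation the quasi-Poisson correction addresses.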
Additional information
Responsible Editor: Marcus Schulz
Conflict of interests
The authors declare that they have no conflict of interest.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Cite this article
Szöcs, E., Schäfer, R.B. Ecotoxicology is not normal. Environ Sci Pollut Res 22, 13990–13999 (2015). https://doi.org/10.1007/s11356-015-4579-3
DOI: https://doi.org/10.1007/s11356-015-4579-3