Statistical evaluation of toxicological assays: Dunnett or Williams test—take both
The US National Toxicology Program recommends the parametric multiple comparison procedures of Dunnett and Williams for the evaluation of repeated toxicity studies. For endpoints where either increasing or decreasing effects are of toxicological relevance, we recommend using the two-sided Dunnett test exclusively. For the many other endpoints where, a priori, only one direction is of toxicological relevance, we instead recommend combining the Dunnett and Williams tests. In particular, we recommend the so-called umbrella-protected Williams test, which offers insight into all relevant monotone and non-monotone alternatives while suffering only a marginal loss in power compared to the Dunnett test. We illustrate the power difference analytically and, using three real data examples, compare the approach with available alternative tests for different endpoint types. Nonparametric tests, which are suitable for evaluating skewed or score data, are also considered. Particular attention is given to the different interpretations of the findings revealed by the different tests. R programs used for the analyses are provided.
Keywords: Dunnett test; Multiple comparisons; R program; Repeated toxicity studies; Umbrella alternative; Williams test
This work was supported in part by the German Science Foundation grant DfG-HO1687 and the EC FP7 program project ESNATS for the last author (LAH).
Conflict of interest
The authors declare that there is no conflict of interest.
- Bretz F, Hothorn L (2003) Statistical analysis of monotone or non-monotone dose-response data from in vitro toxicological assays. ATLA-Altern Lab Anim 31(Suppl 1):81–96
- Bretz F, Hothorn T, Westfall P (2002) On multiple comparisons in R. R News 2:14–17
- Dilba G, Bretz F, Guiard V, Hothorn LA (2004) Simultaneous confidence intervals for ratios with applications to the comparison of several treatments with a control. Methods Inf Med 43(5):465–469
- Dilba G, Schaarschmidt F, Hothorn L (2007) Inferences for ratios of normal means. R News 7:20–23
- Konietschke F (2013) nparcomp: an R software package for nonparametric multiple comparisons and simultaneous confidence intervals (submitted)
- Kuiper RM, Gerhard D, Hothorn LA (2013) Identification of the minimum effective dose for normal distributed endpoints using a model selection approach (submitted)
- R Development Core Team (2013) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria, http://www.R-project.org, ISBN 3-900051-07-0
- Schaarschmidt F, Gerhard D, Sill M (2012) MCPAN: multiple comparisons using normal approximation. http://CRAN.R-project.org/package=MCPAN, R package version 1.1-14
- Swain A, Turton J, Scudamore C, Maguire D, Pereira I, Freitas S, Smyth R, Munday M, Stamp C, Gandhi M, Sondh S, Ashall H, Francis I, Woodfine J, Bowles J, York M (2012) Nephrotoxicity of hexachloro-1:3-butadiene in the male Hanover Wistar rat; correlation of minimal histopathological changes with biomarkers of renal injury. J Appl Toxicol 32(6):417–428. doi: 10.1002/jat.1727
- US-NTP (2000) Toxicology and carcinogenesis studies of methyleugenol in F344/N rats and B6C3F1 mice. Technical report 491. National Toxicology Program, US Department of Health and Human Services, National Institutes of Health, Washington DC
- US-NTP (2012) Testing information, statistical procedures, expanded overview. Tech. rep., National Toxicology Program, Department of Health and Human Services. http://ntp.niehs.nih.gov/?objectid=72015E2C-BDB7-CEBA-F17F9ACA7AE5346D