Abstract
Multivariate analysis of variance based on randomization (permutation) tests has become an important tool for ecological data analysis. However, a comprehensive evaluation of the accuracy and power of the available methods is still lacking. This paper examines randomization tests for differences among multivariate group means. Using simulated data, the accuracy and power of randomization tests were evaluated for different test statistics in one-factor multivariate analysis of variance (MANOVA). The evaluations span a wide spectrum of data types, including data with specified and unspecified (field data) distributional properties, different correlation structures, homogeneous to very heterogeneous variances, and balanced and unbalanced group sizes. The choice of test statistic strongly affected the results. Sums of squares between groups (Qb) computed on Euclidean distances (Qb-EUD) gave better accuracy than the other statistics. Qb on Bray-Curtis, Manhattan or Chord distances, the multiresponse permutation procedure (MRPP) and the sum of univariate ANOVA F statistics produced severely inflated type I error rates under increasing variance heterogeneity among groups, a common scenario in ecological data. Despite pervasive claims to the contrary in the ecological literature, the evidence thus suggests caution when using test statistics other than Qb-EUD.
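To make the procedure concrete, the following is a minimal sketch of a one-factor randomization test that uses Qb on Euclidean distances as the test statistic. It assumes the usual partition of the total sum of squares, Qt = Qb + Qw, computed from squared pairwise distances. The function names, the default of 999 permutations and the simulated data are illustrative assumptions, not the software or settings used in the article.

```python
import numpy as np


def squared_euclidean(X):
    """Matrix of squared Euclidean distances between all rows of X."""
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    return (diff ** 2).sum(axis=2)


def qb_from_d2(d2, groups):
    """Between-groups sum of squares (Qb = Qt - Qw) from squared distances."""
    groups = np.asarray(groups)
    n = d2.shape[0]
    # The full symmetric matrix counts every pair twice, hence the factor 2.
    qt = d2.sum() / (2.0 * n)
    qw = 0.0
    for g in np.unique(groups):
        idx = np.flatnonzero(groups == g)
        qw += d2[np.ix_(idx, idx)].sum() / (2.0 * idx.size)
    return qt - qw


def randomization_test(X, groups, n_perm=999, seed=0):
    """Return observed Qb and the permutation probability P(Qb_random >= Qb)."""
    rng = np.random.default_rng(seed)
    groups = np.asarray(groups)
    d2 = squared_euclidean(X)  # distances do not change under permutation
    observed = qb_from_d2(d2, groups)
    count = 1  # the observed arrangement is counted as one permutation
    for _ in range(n_perm):
        permuted = rng.permutation(groups)  # reassign units to groups at random
        if qb_from_d2(d2, permuted) >= observed:
            count += 1
    return observed, count / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical data: two groups of 10 units, 3 variables, shifted means.
    rng = np.random.default_rng(42)
    X = np.vstack([rng.normal(0.0, 1.0, (10, 3)), rng.normal(5.0, 1.0, (10, 3))])
    groups = np.repeat([0, 1], 10)
    qb, p = randomization_test(X, groups, n_perm=199)
    print(f"Qb = {qb:.3f}, P = {p:.3f}")
```

With Euclidean distances, Qb computed this way equals the classical between-groups sum of squares summed over variables. Substituting another dissimilarity matrix for `d2` would give the other Qb variants named in the abstract.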
Abbreviations
- ANOSIM: Analysis of Similarity
- ANOVA: Analysis of variance
- CHOR: Chord distance
- EUD: Euclidean distance
- LR-IND: Likelihood-ratio test assuming independence of variables
- MAN: Manhattan distance
- MANOVA: Multivariate analysis of variance
- MRPP: Multiresponse permutation procedure
- PERMANOVA: Permutational multivariate analysis of variance
- Qb: Sums of squares between groups
- Qw: Within-groups sum of squares
- SUM-F: Univariate ANOVA F statistic summed over all variables
Rights and permissions
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Pillar, V.D. How accurate and powerful are randomization tests in multivariate analysis of variance? Community Ecology 14, 153–163 (2013). https://doi.org/10.1556/ComEc.14.2013.2.5
DOI: https://doi.org/10.1556/ComEc.14.2013.2.5