Community Ecology, Volume 14, Issue 2, pp 153–163

How accurate and powerful are randomization tests in multivariate analysis of variance?

  • V. D. Pillar
Open Access
Article

Abstract

Multivariate analysis of variance, based on randomization (permutation) tests, has become an important tool for ecological data analyses. However, a comprehensive evaluation of the accuracy and power of available methods is still lacking. This is a thorough examination of randomization tests for multivariate group mean differences. With simulated data, the accuracy and power of randomization tests were evaluated using different test statistics in one-factor multivariate analysis of variance (MANOVA). The evaluations span a wide spectrum of data types, including specified and unspecified (field data) distributional properties, correlation structures, homogeneous to very heterogeneous variances, and balanced and unbalanced group sizes. The choice of test statistic strongly affected the results. Sums of squares between groups (Qb) computed on Euclidean distances (Qb-EUD) gave better accuracy. Qb on Bray-Curtis, Manhattan or Chord distances, the multiresponse permutation procedure (MRPP) and the sum of univariate ANOVA F produced severely inflated type I errors under increasing variance heterogeneity among groups, a common scenario in ecological data. Despite pervasive claims in the ecological literature, the evidence thus suggests caution when using test statistics other than Qb-EUD.
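As an illustration of the kind of test evaluated here, the following is a minimal sketch of a one-factor randomization test using Qb on Euclidean distances. It is not the author's implementation: the function names `qb_euclidean` and `randomization_test`, the default of 999 permutations, and the floating-point tolerance are all assumptions made for this example. It relies on the fact that, for Euclidean distances, the distance-based sums of squares equal the sums of squared deviations from centroids, so Qb can be obtained as the total sum of squares minus the within-groups sum of squares (Qw).

```python
import numpy as np


def qb_euclidean(X, groups):
    """Between-groups sum of squares (Qb) for Euclidean distances.

    For Euclidean distances, the distance-based sums of squares equal
    the sums of squared deviations from centroids, so Qb = Qt - Qw.
    """
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    # Total sum of squares: deviations from the overall centroid.
    qt = ((X - X.mean(axis=0)) ** 2).sum()
    # Within-groups sum of squares: deviations from each group centroid.
    qw = 0.0
    for g in np.unique(groups):
        Xg = X[groups == g]
        qw += ((Xg - Xg.mean(axis=0)) ** 2).sum()
    return qt - qw


def randomization_test(X, groups, n_perm=999, seed=0):
    """Permutation P-value for Qb (hypothetical helper, not from the paper).

    Group labels are shuffled among sampling units; the P-value is the
    proportion of permutations (observed one included) whose Qb is at
    least as large as the observed Qb.
    """
    rng = np.random.default_rng(seed)
    groups = np.asarray(groups)
    observed = qb_euclidean(X, groups)
    tol = 1e-12 * max(1.0, abs(observed))  # guard against rounding ties
    count = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        permuted = rng.permutation(groups)
        if qb_euclidean(X, permuted) >= observed - tol:
            count += 1
    return observed, count / (n_perm + 1)
```

A call such as `randomization_test(X, groups)` returns the observed Qb and its permutation P-value, where `X` is a units-by-variables array and `groups` holds one label per unit. Substituting another test statistic in place of `qb_euclidean` is how the alternatives compared in the abstract would enter such a sketch.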

Keywords

Count data; Distance-based MANOVA; Distribution free; MRPP; Neyman-Pearson lemma; Permutation tests; Type I error; Type II error

Abbreviations

ANOSIM

Analysis of Similarity

ANOVA

Analysis of variance

CHOR

Chord distance

EUD

Euclidean distance

LR-IND

Likelihood-ratio test assuming independence of variables

MAN

Manhattan distance

MANOVA

Multivariate analysis of variance

MRPP

Multiresponse permutation procedure

PERMANOVA

Permutational multivariate analysis of variance

Qb

Sums of squares between groups

Qw

Within-groups sum of squares

SUM-F

Univariate ANOVA F statistic summed over all variables

Supplementary material

42974_2013_14020153_MOESM1_ESM.pdf (366 kb)
Supplementary material, approximately 374 KB.


Copyright information

© Akadémiai Kiadó, Budapest 2013

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors and Affiliations

  1. Departamento de Ecologia, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
