Beals smoothing revisited

Abstract

Beals smoothing is a multivariate transformation designed for species presence/absence community data that are noisy or contain many zeros. The transformation replaces the observed values of a target species with predictions of its occurrence, based on its co-occurrences with the remaining species. In many applications, the transformed values are used as input for multivariate analyses. Because Beals smoothing values convey a sense of “probability of occurrence”, they have also been used for inference. However, the transformation can produce spurious results and must be used with caution. Here we study the statistical and ecological bases underlying the Beals smoothing function, and we use simulated data sets to explore the factors that may affect the reliability of the transformed values. Our simulations demonstrate that Beals predictions are unreliable for target species that are not related to the overall ecological structure. Furthermore, the presence of these “random” species may diminish the quality of Beals smoothing values for the remaining species. We propose a statistical test to determine when observed values can be replaced with Beals smoothing predictions. Two applications to real data illustrate the potentially false predictions of Beals smoothing and the necessary checking step performed by the new test.
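As an illustration of the transformation described above, the following Python sketch computes Beals smoothing values for a small presence/absence table. It assumes the commonly cited form of the function, in which the value for target species j at site i is the average, over the species k present at site i, of the proportion of the sites containing k that also contain j. The abstract itself does not state the formula, so this form, the function name `beals_smoothing`, and the `include_target` switch are assumptions made for illustration, not the authors' implementation.

```python
def beals_smoothing(x, include_target=True):
    """Beals smoothing of a binary sites-by-species table (a sketch).

    x is a list of sites; each site is a list of 0/1 values, one per species.
    include_target is a hypothetical switch: when False, the target species
    itself is left out of the set of species used to predict it.
    """
    n_species = len(x[0])
    # occ[k]: number of sites in which species k occurs
    occ = [sum(row[k] for row in x) for k in range(n_species)]
    # joint[j][k]: number of sites in which species j and k occur together
    joint = [[sum(row[j] * row[k] for row in x) for k in range(n_species)]
             for j in range(n_species)]

    smoothed = []
    for row in x:
        present = [k for k in range(n_species) if row[k]]
        values = []
        for j in range(n_species):
            predictors = present if include_target else [k for k in present if k != j]
            if not predictors:
                # No species left to predict from; 0.0 is an arbitrary choice here
                values.append(0.0)
                continue
            # Average conditional frequency of j given each predictor species k
            total = sum(joint[j][k] / occ[k] for k in predictors)
            values.append(total / len(predictors))
        smoothed.append(values)
    return smoothed


# Three sites, three species
sites = [[1, 1, 0],
         [1, 0, 0],
         [0, 1, 1]]
smoothed = beals_smoothing(sites)
```

In this toy table, the second site contains only the first species, yet the second species receives a non-zero smoothed value there because the two species co-occur at the first site. This is the sense in which observed zeros are replaced by predictions of occurrence.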

Acknowledgments

This work benefitted from comments by Pedro Peres-Neto on randomization methods and by Daniel Borcard and Artur Lluent on the ecological interpretation of the Beals smoothing function. The authors are especially grateful to Jari Oksanen, who suggested interesting real-data applications and provided several suggestions, and to Bruce McCune and David Roberts for their comments on previous versions of the manuscript. This research was funded by NSERC grant no. OGP0007738 to P. Legendre. The BCI forest dynamics research project is part of the Center for Tropical Forest Science, a global network of large-scale demographic tree plots. All experiments comply with the current laws of the country in which the experiments were performed.

Author information

Corresponding author

Correspondence to Miquel De Cáceres.

Additional information

Communicated by Scott Collins.

Electronic supplementary material

Below is the link to the electronic supplementary material.

S1. Expected value of Beals smoothing for a “random” species (doc 54 kb)

About this article

Cite this article

De Cáceres, M., Legendre, P. Beals smoothing revisited. Oecologia 156, 657–669 (2008). https://doi.org/10.1007/s00442-008-1017-y

Keywords

  • Barro Colorado Island
  • Beals smoothing
  • Binary data
  • Community ecology
  • Randomization model