Volume 162, Issue 2, pp 463–477

The empirical Bayes approach as a tool to identify non-random species associations

Community ecology - Original Paper


A statistical challenge in community ecology is to identify segregated and aggregated pairs of species from a binary presence–absence matrix, which often contains hundreds or thousands of such potential pairs. A similar challenge is found in genomics and proteomics, where the expression of thousands of genes in microarrays must be statistically analyzed. Here we adapt the empirical Bayes method to identify statistically significant species pairs in a binary presence–absence matrix. We evaluated the performance of a simple confidence interval, a sequential Bonferroni test, and two tests based on the mean and the confidence interval of an empirical Bayes method. Observed patterns were compared to patterns generated from null model randomizations that preserved matrix row and column totals. We evaluated these four methods with random matrices and also with random matrices that had been seeded with an additional segregated or aggregated species pair. The Bayes methods and Bonferroni corrections reduced the frequency of false-positive tests (type I error) in random matrices, but did not always correctly identify the non-random pair in a seeded matrix (type II error). All of the methods were vulnerable to identifying spurious secondary associations in the seeded matrices. When applied to a set of 272 published presence–absence matrices, even the most conservative tests indicated a fourfold increase in the frequency of perfectly segregated “checkerboard” species pairs compared to the null expectation, and a greater predominance of segregated versus aggregated species pairs. The tests did not reveal a large number of significant species pairs in the Vanuatu bird matrix, but in the much smaller Galapagos bird matrix they correctly identified a concentration of segregated species pairs in the genus Geospiza. 
The Bayesian methods provide for increased selectivity in identifying non-random species pairs, but the analyses will be most powerful if investigators can use a priori biological criteria to identify potential sets of interacting species.
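The pairwise workflow summarized above can be illustrated with a minimal sketch. It is not the authors' implementation, and it makes several assumptions. The C score for a pair is taken as the checkerboard-unit count of Stone and Roberts (1990), (r_a − S)(r_b − S), where r_a and r_b are the species' occurrence totals and S is the number of shared sites. The null model is a simple checkerboard swap that keeps row and column totals fixed. The test is one-tailed, for segregation only. The sequential Bonferroni step is the Holm step-down procedure. The function names (`c_score`, `swap_randomize`, `pair_p_values`, `holm`) and all default parameters are hypothetical, and the empirical Bayes step is not sketched because the abstract does not specify it.

```python
import random
from itertools import combinations

def c_score(row_a, row_b):
    """Checkerboard units for one species pair: (r_a - S) * (r_b - S)."""
    shared = sum(1 for a, b in zip(row_a, row_b) if a and b)
    return (sum(row_a) - shared) * (sum(row_b) - shared)

def swap_randomize(matrix, n_attempts, rng):
    """Return a shuffled copy; every swap preserves row and column totals."""
    m = [row[:] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    for _ in range(n_attempts):  # attempts, not guaranteed successful swaps
        r1, r2 = rng.sample(range(n_rows), 2)
        c1, c2 = rng.sample(range(n_cols), 2)
        # Swap only a 2x2 checkerboard: [[1,0],[0,1]] or [[0,1],[1,0]].
        if (m[r1][c1] == m[r2][c2] and m[r1][c2] == m[r2][c1]
                and m[r1][c1] != m[r1][c2]):
            m[r1][c1], m[r1][c2] = m[r1][c2], m[r1][c1]
            m[r2][c1], m[r2][c2] = m[r2][c2], m[r2][c1]
    return m

def pair_p_values(matrix, n_null=200, n_attempts=1000, seed=0):
    """Upper-tail p-value per species pair (assumed test for segregation)."""
    rng = random.Random(seed)
    pairs = list(combinations(range(len(matrix)), 2))
    observed = {p: c_score(matrix[p[0]], matrix[p[1]]) for p in pairs}
    exceed = {p: 0 for p in pairs}
    for _ in range(n_null):
        null = swap_randomize(matrix, n_attempts, rng)
        for i, j in pairs:
            if c_score(null[i], null[j]) >= observed[(i, j)]:
                exceed[(i, j)] += 1
    # Add one to numerator and denominator so p is never exactly zero.
    return {p: (exceed[p] + 1) / (n_null + 1) for p in pairs}

def holm(p_values, alpha=0.05):
    """Sequential Bonferroni (Holm): return keys rejected at level alpha."""
    ordered = sorted(p_values.items(), key=lambda kv: kv[1])
    m = len(ordered)
    significant = []
    for rank, (key, p) in enumerate(ordered):
        if p > alpha / (m - rank):  # thresholds alpha/m, alpha/(m-1), ...
            break
        significant.append(key)
    return significant
```

Given a list-of-lists matrix with species as rows and sites as columns, `holm(pair_p_values(matrix))` returns the species pairs that remain significant after the sequential correction. With hundreds or thousands of pairs the thresholds become very strict, which is the loss of power that motivates the less conservative alternatives evaluated in the paper.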


Biogeography · Null model · C score · Presence–absence matrix · Statistical test

Supplementary material

442_2009_1474_MOESM1_ESM.xls (558 kb)



Copyright information

© Springer-Verlag 2009

Authors and Affiliations

  1. Department of Biology, University of Vermont, Burlington, USA
  2. Department of Animal Ecology, Nicolaus Copernicus University in Toruń, Toruń, Poland
