Modeling Taxa-Abundance Distributions in Microbial Communities using Environmental Sequence Data

Sloan, William T.; Woodcock, Stephen; Lunn, Mary; Head, Ian M.; Curtis, Thomas P.

doi:10.1007/s00248-006-9141-x


Published: 13 December 2006

Microbial Ecology, Volume 53, pages 443–455 (2007)

William T. Sloan¹,
Stephen Woodcock¹,
Mary Lunn²,
Ian M. Head³ &
Thomas P. Curtis³


Abstract

We show that inferring the taxa-abundance distribution of a microbial community from small environmental samples alone is difficult. The difficulty stems from the disparity in scale between the number of genetic sequences that can be characterized and the number of individuals in communities that microbial ecologists aspire to describe. One solution is to calibrate and validate a mathematical model of microbial community assembly using the small samples and use the model to extrapolate to the taxa-abundance distribution for the population that is deemed to constitute a community. We demonstrate this approach by using a simple neutral community assembly model in which random immigrations, births, and deaths determine the relative abundance of taxa in a community. In doing so, we further develop a neutral theory to produce a taxa-abundance distribution for large communities that are typical of microbial communities. In addition, we highlight that the sampling uncertainties conspire to make the immigration rate calibrated on the basis of small samples very much higher than the true immigration rate. This scale dependence of model parameters is not unique to neutral theories; it is a generic problem in ecology that is particularly acute in microbial ecology. We argue that to overcome this, so that microbial ecologists can characterize large microbial communities from small samples, mathematical models that encapsulate sampling effects are required.


References

1. Bell, G (2000) The distribution of abundance in neutral communities. Am Nat 155: 606–617
2. Bell, T, Agar, D, Song, J, Newman, JA, Thompson, IP, Lilley, AK, van der Gast, CJ (2005) Larger islands house more bacterial taxa. Science 308: 1884
3. Coskuner, G, Ballinger, SJ, Davenport, RJ, Pickering, RL, Solera, R, Head, IM, Curtis, TP (2005) Agreement between theory and measurement in quantification of ammonia-oxidizing bacteria. Appl Environ Microbiol 71: 6325–6334
4. Cox, DR, Miller, HD (1965) The Theory of Stochastic Processes. Methuen, London
5. Curtis, T, Sloan, WT, Scannell, J (2002) Modelling prokaryotic diversity and its limits. Proc Natl Acad Sci 99: 10494–10499
6. Curtis, TP, Sloan, WT (2005) Exploring microbial diversity—a vast below. Science 309: 1331–1333
7. Enquist, BJ, Sanderson, J, Weiser, MD (2002) Modeling macroscopic patterns in ecology. Science 295: 1835–1836
8. Fenchel, T, Finlay, BJ (2005) Bacteria and Island Biogeography. Science 309: 1997–1999
9. Finlay, BJ, Clarke, KJ (1999) Ubiquitous dispersal of microbial species. Nature 400: 828
10. Green, JL, Holmes, AJ, Westoby, M, Oliver, I, Briscoe, D, Dangerfield, M, et al. (2004) Spatial scaling of microbial eukaryote diversity. Nature 432: 747–750
11. Harris, LD (1984) The Fragmented Forest. University of Chicago Press
12. Horner-Devine, MC, Lage, M, Hughes, JB, Bohannan, BJM (2004) A taxa-area relationship for bacteria. Nature 432: 750–753
13. Houchmandzadeh, B, Vallade, M (2003) Clustering in neutral ecology. Phys Rev E 68: art. no. 061912
14. Hubbell, SP (2001) The Unified Neutral Theory of Biodiversity and Biogeography. Princeton University Press, Princeton
15. Kimura, M, Ohta, T (1971) Theoretical Aspects of Population Genetics. Princeton University Press, Princeton
16. Linacre, CH (2004) Diversity and the quantification of ammonia oxidising bacteria and denitrification from turbidity maximum of estuaries. PhD thesis, Civil Engineering and Geosciences, University of Newcastle upon Tyne
17. MacArthur, RH, Wilson, EO (Eds.) (1967) The Theory of Island Biogeography. Princeton University Press, Princeton
18. May, RM (1975) Patterns of species abundance and diversity. In: Cody, ML, Diamond, JM (Eds.), Ecology and Evolution of Communities. Harvard University Press, Harvard, MA, pp 81–120
19. McGill, BJ (2003) A test of the unified neutral theory of biodiversity. Nature 422: 881–885
20. McKane, AJ, Alonso, D, Sole, RV (2004) Analytic solution of Hubbell’s model of local community dynamics. Theor Popul Biol 65: 67–73
21. Purkhold, U, Pommerening-Roser, A, Juretschko, S, Schmid, MC, Koops, HP, Wagner, M (2000) Phylogeny of all recognized species of ammonia oxidizers based on comparative 16S rRNA and amoA sequence analysis: implications for molecular diversity surveys. Appl Environ Microbiol 66: 5368–5382
22. Sloan, WT, Woodcock, S, Lunn, M, Head, IM, Nee, S, Curtis, TP (2005) The roles of immigration and chance in shaping prokaryote community structure. Environ Microbiol, Early Online 28 Nov
23. Sloan, WT, Lunn, M, Woodcock, S, Head, IM, Nee, S, Curtis, TP (2006) Quantifying the roles of immigration and chance in shaping prokaryote community structure. Environ Microbiol 8: 732–740
24. Vallade, M, Houchmandzadeh, B (2003) Analytical solution of a neutral model of biodiversity. Phys Rev E 68: art. no. 061902
25. Volkov, I, Banavar, JR, Hubbell, SP, Maritan, A (2003) Neutral theory and relative species abundance in ecology. Nature 424: 1035–1037
26. Wagner, M, Loy, A (2002) Bacterial community composition and function in sewage treatment systems. Curr Opin Biotechnol 13: 218–227
27. Whitman, WB, Coleman, DC, Wiebe, WJ (1998) Prokaryotes: the unseen majority. Proc Natl Acad Sci USA 95: 6578–6583
28. Woodcock, S, Lunn, M, Curtis, TP, Head, IM, Sloan, WT (2006) Taxa area relationships for microbes: the unsampled and the unseen. Ecol Lett 9: 805–812
29. Zwart, G, van Hannen, EJ, Kamst-van Agterveld, MP, van der Gucht, K, Lindstrom, ES, van Wichelen, J, et al. (2003) Rapid screening for freshwater bacterial groups by using reverse line blot hybridization. Appl Environ Microbiol 69: 5875–5883

Author information

Authors and Affiliations

Department of Civil Engineering, University of Glasgow, Oakfield Avenue, Glasgow, G12 8LT, UK
William T. Sloan & Stephen Woodcock
Department of Statistics, University of Oxford, 1 South Parks Road, Oxford, OX1 3TG, UK
Mary Lunn
School of Civil Engineering and Geosciences, University of Newcastle upon Tyne, Newcastle, NE1 7RU, UK
Ian M. Head & Thomas P. Curtis


Corresponding author

Correspondence to William T. Sloan.

Appendix: Mathematical Appendix

Kolmogorov Backward Equation for the neutral community model

The basis of the model is Hubbell’s neutral community model (NCM), in which the community is saturated with a total of $ N_{\text{T}} $ individuals; for an assemblage to change, an individual must die or leave the system. This occurs at a taxon-independent rate δ. The dead individual is immediately replaced by an immigrant from a source community, with probability m, or by reproduction of a member of the local community, with probability 1−m. Thus, the community forms and develops through a continuous cycle of immigration, reproduction, and death. Assuming that deaths are uniformly distributed in time, one death is expected during a period of time 1/δ, and the ith species, with initial absolute abundance $ N_{i} $, will either increase by 1, stay the same, or decrease by 1, with probabilities given by the following three expressions, respectively:

$$ \Pr\left( N_{i} + 1 \mid N_{i} \right) = \left( \frac{N_{\text{T}} - N_{i}}{N_{\text{T}}} \right)\left[ m p_{i} + \left( 1 - m \right)\left( \frac{N_{i}}{N_{\text{T}} - 1} \right) \right] $$

(8)

$$ \Pr\left( N_{i} \mid N_{i} \right) = \frac{N_{i}}{N_{\text{T}}}\left[ m p_{i} + \left( 1 - m \right)\left( \frac{N_{i} - 1}{N_{\text{T}} - 1} \right) \right] + \left( \frac{N_{\text{T}} - N_{i}}{N_{\text{T}}} \right)\left[ m\left( 1 - p_{i} \right) + \left( 1 - m \right)\left( \frac{N_{\text{T}} - N_{i} - 1}{N_{\text{T}} - 1} \right) \right] $$

(9)

$$ \Pr\left( N_{i} - 1 \mid N_{i} \right) = \frac{N_{i}}{N_{\text{T}}}\left[ m\left( 1 - p_{i} \right) + \left( 1 - m \right)\left( \frac{N_{\text{T}} - N_{i}}{N_{\text{T}} - 1} \right) \right] $$

(10)

where $ p_{i} $ is the relative abundance of the ith species in the source community. Hubbell used these transition probabilities for relatively small populations to form a finite Markov chain model with which the community dynamics can be investigated and the stationary probability distribution for $ N_{i} $ can be calculated. The computational expense [19] of this discrete Markov chain formulation makes it impossible to apply to the very large diverse populations that typify the microbial world [27]. Here, we employ Kimura and Ohta’s [15] methods to recast the model for large populations.
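As an illustration only, the transition probabilities in Eqs. (8)–(10) can be evaluated exactly with rational arithmetic. The sketch below is not the authors' code; the function name `transition_probabilities` and the parameter values are hypothetical. It checks that the three probabilities sum to one and that the expected change in $ N_{i} $ per death equals $ m(p_{i} - x_{i}) $.

```python
from fractions import Fraction


def transition_probabilities(n_i, n_t, m, p_i):
    """Return (P_up, P_stay, P_down) for species i after one death, Eqs. (8)-(10).

    n_i : current absolute abundance of species i
    n_t : total community size N_T
    m   : immigration probability
    p_i : relative abundance of species i in the source community
    """
    other_dies = Fraction(n_t - n_i, n_t)   # the dead individual is not species i
    i_dies = Fraction(n_i, n_t)             # the dead individual is species i
    # Eq. (8): another species dies and is replaced by species i
    p_up = other_dies * (m * p_i + (1 - m) * Fraction(n_i, n_t - 1))
    # Eq. (9): the replacement belongs to the same group as the dead individual
    p_stay = (i_dies * (m * p_i + (1 - m) * Fraction(n_i - 1, n_t - 1))
              + other_dies * (m * (1 - p_i)
                              + (1 - m) * Fraction(n_t - n_i - 1, n_t - 1)))
    # Eq. (10): species i dies and is replaced by another species
    p_down = i_dies * (m * (1 - p_i) + (1 - m) * Fraction(n_t - n_i, n_t - 1))
    return p_up, p_stay, p_down


# Hypothetical values chosen only for illustration.
m, p_i = Fraction(1, 10), Fraction(1, 5)
p_up, p_stay, p_down = transition_probabilities(30, 100, m, p_i)
print(p_up + p_stay + p_down)   # the three outcomes are exhaustive
print(p_up - p_down)            # expected change in N_i per death
```

Using `Fraction` avoids rounding error, so both identities can be checked exactly rather than to a tolerance.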

Let $ x_{i} = \frac{N_{i}}{N_{\text{T}}} $ be the relative abundance of the ith species, and assume that $ N_{\text{T}} $, the local community size, is large enough that $ x_{i} $ can be considered continuous. Also, let $ \phi\left( x_{1}, x_{2}, \ldots, x_{n}; t \right) $ be the joint pdf that the relative abundances of species 1,..., n at time t are $ x_{1}, \ldots, x_{n} $, respectively. The continuous model comes from considering the expected change in ϕ that will occur in a small time interval δt. To do this, we define $ g\left( x_{1}, \delta x_{1}, \ldots, x_{n}, \delta x_{n}; t, \delta t \right) $ to be the pdf for the relative abundance of species 1 changing from $ x_{1} $ to $ x_{1} + \delta x_{1} $, the relative abundance of species 2 changing from $ x_{2} $ to $ x_{2} + \delta x_{2} $, ..., and the relative abundance of species n changing from $ x_{n} $ to $ x_{n} + \delta x_{n} $ during the time period between t and t + δt.

Then,

$$ \phi\left( x_{1}, \ldots, x_{n}; t + \delta t \right) = \int \phi\left( x_{1} - \delta x_{1}, \ldots, x_{n} - \delta x_{n}; t \right) g\left( x_{1} - \delta x_{1}, \delta x_{1}, \ldots, x_{n} - \delta x_{n}, \delta x_{n}; t, \delta t \right) \text{d}\left( \delta x_{1} \right) \cdots \text{d}\left( \delta x_{n} \right) $$

Expanding this as an n-dimensional Taylor series about the point $ x_{1}, \ldots, x_{n} $ and neglecting terms of order 3 and above gives

$$ \phi\left( x_{1}, \ldots, x_{n}; t + \delta t \right) = \int \left[ \phi g - \sum\limits_{i = 1}^n \delta x_{i} \frac{\partial}{\partial x_{i}}\left( \phi g \right) + \sum\limits_{i = 1}^n \frac{\left( \delta x_{i} \right)^{2}}{2}\frac{\partial^{2}}{\partial x_{i}^{2}}\left( \phi g \right) + \frac{1}{2}\sum\limits_{i = 1}^n \sum\limits_{j \ne i} \delta x_{i}\, \delta x_{j} \frac{\partial^{2}}{\partial x_{i} \partial x_{j}}\left( \phi g \right) \right] \text{d}\left( \delta x_{1} \right) \cdots \text{d}\left( \delta x_{n} \right) $$

(11)

where ϕg denotes $ \phi\left( x_{1}, x_{2}, \ldots, x_{n}; t \right) g\left( x_{1}, \delta x_{1}, \ldots, x_{n}, \delta x_{n}; t, \delta t \right) $. Because $ \int g\, \text{d}\left( \delta x_{i} \right) = 1 $,

$$ \begin{aligned} & \phi\left( x_{1}, x_{2}, \ldots, x_{n}; t + \delta t \right) - \phi\left( x_{1}, x_{2}, \ldots, x_{n}; t \right) \\ & \quad = - \sum\limits_{i = 1}^n \frac{\partial}{\partial x_{i}}\left( \phi \int \left( \delta x_{i} \right) g\, \text{d}\left( \delta x_{i} \right) \right) \\ & \quad \quad + \frac{1}{2}\sum\limits_{i = 1}^n \frac{\partial^{2}}{\partial x_{i}^{2}}\left( \phi \int \left( \delta x_{i} \right)^{2} g\, \text{d}\left( \delta x_{i} \right) \right) \\ & \quad \quad + \frac{1}{2}\sum\limits_{i = 1}^n \sum\limits_{j \ne i} \frac{\partial^{2}}{\partial x_{i} \partial x_{j}}\left( \phi \int \left( \delta x_{i} \right)\left( \delta x_{j} \right) g\, \text{d}\left( \delta x_{i} \right) \text{d}\left( \delta x_{j} \right) \right) \end{aligned} $$

(12)

therefore,

$$ \frac{\partial \phi}{\partial t} = \sum\limits_{i = 1}^n \left[ - \frac{\partial \left( M_{\delta x_{i}} \phi \right)}{\partial x_{i}} + \frac{1}{2}\frac{\partial^{2} \left( V_{\delta x_{i}} \phi \right)}{\partial x_{i}^{2}} \right] + \frac{1}{2}\sum\limits_{i = 1}^n \sum\limits_{j \ne i} \frac{\partial^{2} \left( C_{\delta x_{i} \delta x_{j}} \phi \right)}{\partial x_{i} \partial x_{j}} $$

(13)

where $ M_{\delta x_{i}} $ and $ V_{\delta x_{i}} $ are the first and second moments of the change in $ x_{i} $ per unit of time and $ C_{\delta x_{i} \delta x_{j}} $ is the expected product of changes in $ x_{i} $ and $ x_{j} $. This is the n-dimensional version of the Kolmogorov equation. By considering the expected changes in relative abundance in the discrete time interval 1/δ given by Eqs. (8)–(10), $ M_{\delta x_{i}} $, $ V_{\delta x_{i}} $, and $ C_{\delta x_{i} \delta x_{j}} $ can be approximated by

$$ M_{{\delta x_{i} }} = \frac{{m{\left( {p_{i} - x_{i} } \right)}}} {{N_{{\text{T}}} }} $$

(14)

$$ V_{{\delta x_{i} }} = \frac{{2x_{i} {\left( {1 - x_{i} } \right)} + m{\left( {p_{i} - x_{i} } \right)}{\left( {1 - 2x_{i} } \right)}}} {{N^{2}_{{\text{T}}} }} $$

(15)

$$ C_{\delta x_{i} \delta x_{j}} = - \left[ \frac{2x_{i} x_{j} + m\left[ x_{i}\left( p_{j} - x_{j} \right) + x_{j}\left( p_{i} - x_{i} \right) \right]}{N_{\text{T}}^{2}} \right]. $$

(16)

Reasoning that typically either m is small or $ p_{i} $ rapidly converges on $ x_{i} $, we can neglect all but the first term of both $ C_{\delta x_{i} \delta x_{j}} $ and $ V_{\delta x_{i}} $. Equations (13)–(16) then define the NCM for large populations by describing the change in the joint probability of the relative abundances of the n different taxa in the local community.
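The single-species moments in Eqs. (14) and (15) can be checked numerically against the discrete transition probabilities of Eqs. (8) and (10). The following is a minimal sketch under stated assumptions: each death changes $ x_{i} $ by at most $ \pm 1/N_{\text{T}} $, and the function names and parameter values are hypothetical rather than taken from the paper.

```python
import math


def step_probabilities(n_i, n_t, m, p_i):
    """P(increase) and P(decrease) for species i after one death, Eqs. (8) and (10)."""
    p_up = (n_t - n_i) / n_t * (m * p_i + (1 - m) * n_i / (n_t - 1))
    p_down = n_i / n_t * (m * (1 - p_i) + (1 - m) * (n_t - n_i) / (n_t - 1))
    return p_up, p_down


def exact_moments(n_i, n_t, m, p_i):
    """First and second moments of the change in x_i per death (step size 1/N_T)."""
    p_up, p_down = step_probabilities(n_i, n_t, m, p_i)
    return (p_up - p_down) / n_t, (p_up + p_down) / n_t ** 2


def approximate_moments(x_i, n_t, m, p_i):
    """Large-N_T expressions for the same moments, Eqs. (14) and (15)."""
    mean = m * (p_i - x_i) / n_t
    second = (2 * x_i * (1 - x_i) + m * (p_i - x_i) * (1 - 2 * x_i)) / n_t ** 2
    return mean, second


# Hypothetical values chosen only for illustration.
n_t, n_i, m, p_i = 10 ** 6, 250_000, 0.01, 0.3
x_i = n_i / n_t
exact = exact_moments(n_i, n_t, m, p_i)
approx = approximate_moments(x_i, n_t, m, p_i)
leading_term = 2 * x_i * (1 - x_i) / n_t ** 2   # first term of Eq. (15) alone

print(math.isclose(exact[0], approx[0], rel_tol=1e-9))
print(math.isclose(exact[1], approx[1], rel_tol=1e-4))
print(leading_term / approx[1])
```

With a small m, the ratio printed last is close to one, which is the numerical counterpart of keeping only the first term of the second moment.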

Stationary probability density function

The solution to the diffusion equation [Eq. (13)] with $ \frac{\partial \phi}{\partial t} = 0 $ and reflecting boundaries, where $ x_{i} = 0 $ or $ x_{i} = 1 $, gives the stationary (long-term equilibrium) joint probability density function (pdf) for the relative abundance of the n taxa in the local community, $ \left\{ x_{i} \right\}_{i = 1}^{n} $. Here, we show that the joint pdf for a Dirichlet distribution,

$$ \phi = \left[ \frac{\Gamma\left( N_{\text{T}} m \right)}{\Gamma\left( N_{\text{T}} m p_{1} \right) \cdots \Gamma\left( N_{\text{T}} m p_{n} \right)} \right] x_{1}^{N_{\text{T}} m p_{1} - 1} x_{2}^{N_{\text{T}} m p_{2} - 1} \cdots x_{n}^{N_{\text{T}} m p_{n} - 1} $$

(17)

where $ x_{n} = 1 - x_{1} - \cdots - x_{{n - 1}} $ and $ p_{n} = 1 - p_{1} - \cdots - p_{{n - 1}} $ is a solution.

Note that if

$$ \left[ - \left( M_{\delta x_{i}} \phi \right) + \frac{1}{2}\frac{\partial \left( V_{\delta x_{i}} \phi \right)}{\partial x_{i}} \right] + \frac{1}{2}\sum\limits_{j \ne i} \frac{\partial \left( C_{\delta x_{i} \delta x_{j}} \phi \right)}{\partial x_{j}} = 0 \quad \text{for } i = 1, \ldots, n $$

(18)

then $ \frac{{\partial \phi }} {{\partial t}} = 0 $. Therefore, substituting in Eqs. (14)–(16), we require

$$ \frac{m\left( p_{i} - x_{i} \right)}{N_{\text{T}}}\phi - \frac{1}{2}\frac{\partial}{\partial x_{i}}\left( \frac{2x_{i}\left( 1 - x_{i} \right)}{N_{\text{T}}^{2}}\phi \right) = \frac{1}{2}\sum\limits_{j \ne i} \frac{\partial}{\partial x_{j}}\left( \frac{- 2x_{i} x_{j}}{N_{\text{T}}^{2}}\phi \right) $$

(19)

Substituting ϕ into the left-hand side of Eq. (19) gives

$$ \begin{aligned} & \frac{m\left( p_{i} - x_{i} \right)}{N_{\text{T}}}\phi - \frac{\partial}{\partial x_{i}}\left( \frac{x_{i}\left( 1 - x_{i} \right)}{N_{\text{T}}^{2}}\phi \right) \\ & \quad = \frac{m\left( p_{i} - x_{i} \right)}{N_{\text{T}}}\phi - \left[ \frac{\phi}{N_{\text{T}}^{2}} \right]\left[ N_{\text{T}} m p_{i} - x_{i}\left( N_{\text{T}} m p_{i} + 1 \right) - \frac{x_{i}\left( 1 - x_{i} \right)\left( N_{\text{T}} m p_{n} - 1 \right)}{x_{n}} \right] \\ & \quad = \left[ \frac{\phi}{N_{\text{T}}^{2}} \right]\left[ N_{\text{T}} m\left( p_{i} - x_{i} \right) - \left( N_{\text{T}} m p_{i} - x_{i}\left( N_{\text{T}} m p_{i} + 1 \right) - \frac{x_{i}\left( 1 - x_{i} \right)\left( N_{\text{T}} m p_{n} - 1 \right)}{x_{n}} \right) \right] \\ & \quad = \left[ - \frac{\phi x_{i}}{N_{\text{T}}^{2}} \right]\left[ N_{\text{T}} m\left( 1 - p_{i} \right) - 1 - \frac{\left( 1 - x_{i} \right)}{x_{n}}\left( N_{\text{T}} m p_{n} - 1 \right) \right] \end{aligned} $$

(20)

Similarly, substituting ϕ into the right-hand side of (19) gives

$$ \begin{aligned} \sum\limits_{j \ne i} \frac{\partial}{\partial x_{j}}\left( \frac{- x_{i} x_{j}}{N_{\text{T}}^{2}}\phi \right) & = - \frac{x_{i} \phi}{N_{\text{T}}^{2}}\sum\limits_{j \ne i} \left[ N_{\text{T}} m p_{j} - \frac{x_{j}}{x_{n}}\left( N_{\text{T}} m p_{n} - 1 \right) \right] \\ & = - \frac{x_{i} \phi}{N_{\text{T}}^{2}}\left[ N_{\text{T}} m\left( 1 - p_{i} - p_{n} \right) - \frac{\left( 1 - x_{i} - x_{n} \right)}{x_{n}}\left( N_{\text{T}} m p_{n} - 1 \right) \right] \\ & = - \frac{x_{i} \phi}{N_{\text{T}}^{2}}\left[ N_{\text{T}} m\left( 1 - p_{i} - p_{n} \right) + \left( N_{\text{T}} m p_{n} - 1 \right) - \frac{\left( 1 - x_{i} \right)}{x_{n}}\left( N_{\text{T}} m p_{n} - 1 \right) \right] \\ & = - \frac{x_{i} \phi}{N_{\text{T}}^{2}}\left[ N_{\text{T}} m\left( 1 - p_{i} \right) - 1 - \frac{\left( 1 - x_{i} \right)}{x_{n}}\left( N_{\text{T}} m p_{n} - 1 \right) \right] \end{aligned} $$

(21)

Now, because (20) and (21) are equal, ϕ is a solution to the diffusion equation [Eq. (13)] with $ \frac{{\partial \phi }} {{\partial t}} = 0 $ and the reflecting boundary conditions are met.

Algorithm for generating the stationary probability density function

Given the relative abundances of n taxa in the source community $ \left\{ p_{i} \right\}_{i = 1}^{n} $, a realization of the Dirichlet distributed local abundances can be generated by sampling from a set of gamma distributions. Let $ \left\{ Y_{i} \right\}_{i = 1}^{n} $ be random variables such that $ Y_{i} \sim \text{gamma}\left( N_{\text{T}} m p_{i} \right) $ and let $ \left\{ y_{i} \right\}_{i = 1}^{n} $ be realizations of these variables sampled at random, then

$$ x_{i} = \frac{{y_{i} }} {{{\sum\limits_{j = 1}^n {y_{j} } }}}\quad i = 1, \ldots , n $$

(22)

will represent a random sample from the Dirichlet joint probability distribution for a local neutral community [Eq. (17)].
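The gamma-normalization step in Eq. (22) can be sketched in a few lines using only the Python standard library. This is an illustrative sketch rather than the authors' implementation: the function name `sample_local_abundances` and the example values are hypothetical, and the gamma variables are assumed to share a unit scale parameter, so that only the shape $ N_{\text{T}} m p_{i} $ varies between taxa.

```python
import random


def sample_local_abundances(p, n_t, m, rng=None):
    """Draw one realization of local relative abundances via Eq. (22).

    p   : source-community relative abundances (each > 0, summing to 1)
    n_t : local community size N_T
    m   : immigration probability
    rng : optional random.Random instance, for reproducibility
    """
    rng = rng if rng is not None else random.Random()
    # One gamma variate per taxon, shape N_T * m * p_i, unit scale (assumption).
    y = [rng.gammavariate(n_t * m * p_i, 1.0) for p_i in p]
    total = sum(y)
    # Normalizing independent gamma variates gives a Dirichlet sample.
    return [y_i / total for y_i in y]


# Hypothetical source community and parameters, for illustration only.
source = [0.5, 0.3, 0.2]
x = sample_local_abundances(source, n_t=10_000, m=0.005, rng=random.Random(1))
print(x, sum(x))
```

`random.gammavariate` requires a strictly positive shape, so taxa with $ p_{i} = 0 $ would need to be dropped before sampling. For very small shape parameters a variate can underflow to zero in floating point, which is worth guarding against when $ N_{\text{T}} m p_{i} $ is tiny.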

Sampling a neutral community

We have already shown that for the continuous variant of the NCM, the steady-state joint pdf for all species is Dirichlet $ \text{Dir}\left( N_{\text{T}} m p_{1}, \ldots, N_{\text{T}} m p_{n} \right) $, where $ p_{1}, \ldots, p_{n} $ are the relative abundances of the species in the metacommunity.

We can repeat exactly the same argument to derive the joint distribution of the relative abundances within a sample of size $ N_{\text{S}} $ from such a community. Strictly speaking, selecting a subsample of size $ N_{\text{S}} $ from a local community is achieved by simply sampling $ N_{\text{S}} $ individuals without replacement from the community of size $ N_{\text{T}} $. However, since for almost all microbial samples $ N_{\text{S}} \ll N_{\text{T}} $, the problem can be approximated by one of sampling with replacement.

Regard the sampling exercise as a continuous process through time. Individuals are selected from the source community one by one until a sample of size $ N_{\text{S}} $ has been collected. Once this sample size has been reached, the process of selecting individuals continues at regular intervals in time (generations), but now the selected individual replaces one randomly chosen individual currently in the sample population. This is analogous to the argument used for deriving the joint distribution for the local abundances, except that we have a pure immigration–death process, with immigrants into the sample from the local community. Setting m = 1 and regarding our local abundances as the metacommunity from which immigrants are drawn, it is clear that, conditional on knowledge of local abundances $ x_{1}, \ldots, x_{n} $, the joint distribution of relative abundances $ y_{1}, \ldots, y_{n} $ within a sample is Dirichlet $ \text{Dir}\left( N_{\text{S}} x_{1}, \ldots, N_{\text{S}} x_{n} \right) $. That is,

$$ f\left( Y \mid X \right) = \Gamma\left( N_{\text{S}} \right)\prod\limits_{i = 1}^n \frac{y_{i}^{N_{\text{S}} x_{i} - 1}}{\Gamma\left( N_{\text{S}} x_{i} \right)} $$

(23)

where $ X = \left( x_{1}, \ldots, x_{n} \right) $ and $ Y = \left( y_{1}, \ldots, y_{n} \right) $ for notational convenience. This allows us to calculate the first and second moments of the sample distribution because we know that the marginal densities of a Dirichlet distribution are beta distributed. Therefore,

$$ E{\left( {y_{i} \left| {x_{i} } \right.} \right)} = x_{i} $$

(24)

and

$$ E{\left( {y_{i} ^{2} \left| {x_{i} } \right.} \right)} = \frac{{x_{i} {\left( {N_{{\text{S}}} x_{i} + 1} \right)}}} {{N_{{\text{S}}} + 1}} $$

(25)

Now, since $ x_{i} \sim \text{Beta}\left( N_{\text{T}} m p_{i}, N_{\text{T}} m\left( 1 - p_{i} \right) \right) $, we have that

$$ E{\left( {y_{i} } \right)} = p_{i} $$

(26)

and

$$ E\left( y_{i}^{2} \right) = \left[ \frac{1}{N_{\text{S}} + 1} \right]\left[ N_{\text{S}} \frac{p_{i}\left( N_{\text{T}} m p_{i} + 1 \right)}{N_{\text{T}} m + 1} + p_{i} \right] = \frac{N_{\text{S}} N_{\text{T}} m p_{i}^{2} + \left( N_{\text{S}} + N_{\text{T}} m + 1 \right)p_{i}}{N_{\text{S}} N_{\text{T}} m + N_{\text{T}} m + N_{\text{S}} + 1} = \frac{\left( \frac{N_{\text{S}} N_{\text{T}} m}{N_{\text{T}} m + N_{\text{S}} + 1} \right)p_{i}^{2} + p_{i}}{\left( \frac{N_{\text{S}} N_{\text{T}} m}{N_{\text{T}} m + N_{\text{S}} + 1} \right) + 1} $$

(27)

letting

$$ \tilde{m} = \frac{N_{\text{T}} m}{N_{\text{T}} m + N_{\text{S}} + 1} $$

(28)

then

$$ E\left( y_{i}^{2} \right) = \frac{N_{\text{S}} \tilde{m} p_{i}^{2} + p_{i}}{N_{\text{S}} \tilde{m} + 1} $$

(29)

We were unable to derive a neat analytical solution for the marginal pdfs of abundance in the sample. However, repeated sampling from neutrally assembled synthetic communities confirmed that the marginals were very closely approximated by beta distributions. If we assume that the sample marginal distributions are exactly beta, then, as their first and second moments are given by Eqs. (26) and (29), respectively, the sample distribution is given by

$$ y_{i} \sim \text{beta}\left( N_{\text{S}} \tilde{m} p_{i}, N_{\text{S}} \tilde{m}\left( 1 - p_{i} \right) \right) $$

(30)
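To make the scale dependence concrete, the sample-scale immigration parameter of Eq. (28) can be computed for a plausible order of magnitude. The sketch below is illustrative only: the function names and the parameter values are hypothetical and are not taken from the paper. It also checks that the two-stage second moment (the first form in Eq. (27)) agrees with the compact form in Eq. (29).

```python
import math


def sample_scale_immigration(n_t, m, n_s):
    """Sample-scale immigration parameter defined in Eq. (28)."""
    return n_t * m / (n_t * m + n_s + 1)


def second_moment_two_stage(p_i, n_t, m, n_s):
    """E(y_i^2) from the law of total expectation, first form of Eq. (27)."""
    # Second moment of the local abundance x_i ~ Beta(N_T m p_i, N_T m (1 - p_i)).
    e_x2 = p_i * (n_t * m * p_i + 1) / (n_t * m + 1)
    # Average Eq. (25) over x_i, using E(x_i) = p_i.
    return (n_s * e_x2 + p_i) / (n_s + 1)


def second_moment_compact(p_i, n_t, m, n_s):
    """E(y_i^2) written with the sample-scale parameter, Eq. (29)."""
    m_s = sample_scale_immigration(n_t, m, n_s)
    return (n_s * m_s * p_i ** 2 + p_i) / (n_s * m_s + 1)


# Hypothetical values: a large community, a small clone-library-sized sample.
n_t, m, n_s, p_i = 1e12, 1e-8, 1000, 0.05
print(sample_scale_immigration(n_t, m, n_s))
print(math.isclose(second_moment_two_stage(p_i, n_t, m, n_s),
                   second_moment_compact(p_i, n_t, m, n_s)))
```

For these illustrative values the sample-scale parameter is about 0.91, many orders of magnitude larger than the assumed community-scale m of 1e-8, which is the inflation of the calibrated immigration rate described in the abstract.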


About this article

Cite this article

Sloan, W.T., Woodcock, S., Lunn, M. et al. Modeling Taxa-Abundance Distributions in Microbial Communities using Environmental Sequence Data. Microb Ecol 53, 443–455 (2007). https://doi.org/10.1007/s00248-006-9141-x


Received: 22 June 2006
Revised: 22 June 2006
Accepted: 10 July 2006
Published: 13 December 2006
Issue Date: April 2007
DOI: https://doi.org/10.1007/s00248-006-9141-x
