Statistical Estimation of Uncultivated Microbial Diversity

Bunge, J.

doi:10.1007/978-3-540-85465-4_3

Part of the book series: Microbiology Monographs (MICROMONO, volume 10)

Abstract

The full microbial richness of a community, or even of an environmental sample, usually cannot be observed completely, but only estimated statistically. This estimation is typically based on observed count data, that is, the counts of the representatives of each species (or other taxonomic units) appearing in the sample or samples. “Abundance” data consists of counts of the numbers of individuals from various species in a single sample, while “incidence” (or multiple recapture) data consists of lists of species appearing in several or many samples. In this chapter we consider statistical estimation of the total richness, i.e., the total number of species, observed + unobserved, based on abundance or on incidence data. We discuss parametric and nonparametric methods, their underlying assumptions, and their advantages and disadvantages; computational implementations and software; and larger scientific issues such as the scope of applicability of the results of a given analysis. Some real-world examples from microbial studies are presented. Our discussion is intended to serve as an overview and an introduction to the literature and available software.
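
As a rough illustration of the two data types described above, the following Python sketch reduces abundance data and incidence data to frequency counts, then applies the bias-corrected Chao1 estimator to the abundance counts. The abstract does not name a specific estimator, so the choice of Chao1 (a widely used nonparametric lower-bound estimator), the function names, and the toy data are all assumptions made for illustration only.

```python
from collections import Counter


def frequency_counts(abundances):
    """Abundance data: map j -> f_j, the number of species seen exactly j times."""
    return Counter(n for n in abundances if n > 0)


def incidence_frequencies(samples):
    """Incidence data: map j -> number of species appearing in exactly j samples."""
    per_species = Counter(sp for sample in samples for sp in set(sample))
    return Counter(per_species.values())


def chao1(abundances):
    """Bias-corrected Chao1 estimate of total richness (observed + unobserved).

    Illustrative choice of estimator; not taken from the chapter itself.
    """
    f = frequency_counts(abundances)
    s_obs = sum(f.values())              # species actually observed
    f1, f2 = f.get(1, 0), f.get(2, 0)    # singletons and doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical single sample: individuals counted per observed species.
abundance_sample = [12, 7, 5, 3, 2, 2, 1, 1, 1, 1]
print(chao1(abundance_sample))

# Hypothetical incidence data: species lists from three samples.
incidence_samples = [{"A", "B", "C"}, {"A", "C"}, {"A", "D"}]
print(incidence_frequencies(incidence_samples))
```

The estimate exceeds the observed richness only when singletons are present, which reflects the intuition that many rarely seen species suggest many unseen ones. Parametric alternatives instead fit a distribution to the frequency counts.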

References

Behnke A, Bunge J, Barger KJ, Stoeck T (2008) Impact of the time dimension on our perception of microbial molecular diversity and its patterns. Submitted for publication
Borchers DL, Buckland ST, Zucchini W (2002) Estimating animal abundance: closed populations. Springer, New York
Bunge J, Barger K (2008) Parametric models for estimating the number of classes. Biometrical Journal 50(5)
Chao A (2005) Species estimation and applications. In: Balakrishnan N, Read CB, Vidakovic B (eds) Encyclopedia of statistical sciences, 2nd edn, vol 12. Wiley, New York, pp 7907–7916
Chao A, Bunge JA (2002) Estimating the number of species in a stochastic abundance model. Biometrics 58:531–539
Chao A, Huggins RM (2005) Classical closed population models. In: Manly B, McDonald T, Amstrup S (eds) The handbook of capture–recapture methods. Princeton University Press, Princeton, pp 22–35
Chao A, Lee S-M (1992) Estimating the number of classes via sample coverage. J Am Stat Assoc 87:210–217
Chao A, Yip PSF, Lee S-M, Chu W (2001) Population size estimation based on estimating functions for closed capture–recapture models. J Stat Plan Inference 92:213–232
Choquet R, Reboulet A-M, Pradel R, Gimenez O, Lebreton J-D (2004) M-SURGE: new software specifically designed for multistate capture–recapture models. Anim Biodivers Conserv 27:207–215
Colwell RK (2005) EstimateS: Statistical estimation of species richness and shared species from samples. Version 7.5. User’s Guide and application published at: http://purl.oclc.org/estimates
Efford MG, Dawson DK, Robbins CS (2004) DENSITY: Software for analysing capture–recapture data from passive detector arrays. Anim Biodivers Conserv 27:217–228
Epstein SS, Bunge J (2006) Estimation of microbial diversity from GenBank data. Appl Environ Microbiol 72(10):6578–6583
Epstein SS, Bunge J (2008) Estimation of microbial diversity from GenBank data. In preparation.
Fienberg SE, Johnson MS, Junker BW (1999) Classical multilevel and Bayesian approaches to population size estimation using multiple lists. J R Stat Soc: Ser A 162:383–405
Hong S-H, Bunge J, Jeon S-O, Epstein SS (2006) Predicting microbial species richness. Proc Natl Acad Sci USA 103:117–122
Huber JA, Mark Welch DB, Morrison HG, Huse SM, Neal PR, Butterfield DA, Sogin ML (2007) Microbial population structures in the deep marine biosphere. Science 318:97–100
Huggins RM, Yip PSF (2001) A note on nonparametric inference for capture–recapture experiments with heterogeneous capture probabilities. Statistica Sinica 11:843–853
Lee S-M, Chao A (1994) Estimating population size via sample coverage for closed capture–recapture models. Biometrics 50:88–97
Magurran AE (2004) Measuring biological diversity. Blackwell, Oxford
Mao CX (2004) Predicting the conditional probability of discovering a new class. J Am Stat Assoc 99:1108–1118
Mao CX, Lindsay BG (2007) Estimating the number of classes. Ann Stat 35:917–930
Norris JL III, Pollock KH (1996) Nonparametric MLE under two closed capture–recapture models with heterogeneity. Biometrics 52:639–649
Pledger S (2005) The performance of mixture models in heterogeneous closed population capture–recapture. Biometrics 61:868–876
Rexstad E, Burnham KP (1991) User’s Guide for Interactive Program CAPTURE. Colorado Cooperative Fish and Wildlife Research Unit, Fort Collins, CO, USA, 29
Shen TJ, Chao A, Lin CF (2003) Predicting the number of new species in further taxonomic sampling. Ecology 84:798–804
Stackebrandt E, Goebel BM (1994) Taxonomic note: A place for DNA:DNA reassociation and 16S rRNA sequence analysis in the present species definition in bacteria. Int J Syst Bacteriol 44:846–849
Tardella L (2002) A new Bayesian method for nonparametric capture–recapture models in presence of heterogeneity. Biometrika 89:807–817
Wang J-PZ, Lindsay BG (2005) A penalized nonparametric maximum likelihood approach to species richness estimation. J Am Stat Assoc 100:942–959
Williamson M, Gaston KJ (2005) The lognormal distribution is not an appropriate null hypothesis for the species-abundance distribution. J Anim Ecol 74:409–422
Zwane E, van der Heijden P (2005) Population estimation using the multiple system estimator in the presence of continuous covariates. Stat Modelling 5:39–52

Author information

Authors and Affiliations

Department of Statistical Science, Cornell University, Ithaca, NY, 14853, USA
J. Bunge

Corresponding author

Correspondence to J. Bunge.

Editor information

Editors and Affiliations

Dept. Biology, Northeastern University, Boston, 02115, U.S.A.
Slava S. Epstein

APPENDIX: Software

Here is a list of some computer software that is available as of this writing. Documentation is available at the websites listed. The collection of such software continues to expand, and for new applications and data analyses the reader should check recent developments via expert advice and Internet searching.

Abundance data
- Code written in MAPLE (www.maplesoft.com) for various parametric models, http://www.stat.cornell.edu/~bunge/ (Hong et al. 2006)

Abundance or incidence data
- SPADE, http://chao.stat.nthu.edu.tw/softwareCE.html (Shen et al. 2003)
- EstimateS, http://viceroy.eeb.uconn.edu/EstimateS (Colwell 2005)

Incidence data
- DENSITY, http://www.landcareresearch.co.nz/ (Efford et al. 2004)
- CARE-2, http://chao.stat.nthu.edu.tw/softwareCE.html (Chao et al. 2001)
- CAPTURE, http://www.mbr-pwrc.usgs.gov/software.html (Rexstad and Burnham 1991)
- M-SURGE, http://www.cefe.cnrs.fr/ (Choquet et al. 2004)

About this chapter

Cite this chapter

Bunge, J. (2009). Statistical Estimation of Uncultivated Microbial Diversity. In: Epstein, S. (ed) Uncultivated Microorganisms. Microbiology Monographs, vol 10. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85465-4_3

DOI: https://doi.org/10.1007/978-3-540-85465-4_3
Published: 28 February 2009
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85464-7
Online ISBN: 978-3-540-85465-4
eBook Packages: Biomedical and Life Sciences, Biomedical and Life Sciences (R0)
