Abstract
The full microbial richness of a community, or even of an environmental sample, usually cannot be observed completely; it can only be estimated statistically. This estimation is typically based on observed count data, that is, the number of representatives of each species (or other taxonomic unit) appearing in the sample or samples. “Abundance” data consist of counts of individuals from each species in a single sample, while “incidence” (or multiple recapture) data consist of lists of the species appearing in each of several or many samples. In this chapter we consider statistical estimation of the total richness, i.e., the total number of species, observed plus unobserved, based on abundance or incidence data. We discuss parametric and nonparametric methods, their underlying assumptions, and their advantages and disadvantages; computational implementations and software; and larger scientific issues such as the scope of applicability of the results of a given analysis. Some real-world examples from microbial studies are presented. Our discussion is intended to serve as an overview and an introduction to the literature and available software.
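As a rough illustration of the two data types described in the abstract, the following Python sketch summarizes abundance data as frequency counts and incidence data as per-species sample counts. It then applies the bias-corrected Chao1 formula, a widely used nonparametric lower-bound estimator. Chao1 is chosen here only because it is simple; it is not necessarily the method recommended in the chapter. The function names and the toy data are hypothetical.

```python
from collections import Counter


def frequency_counts(abundances):
    """Return a map j -> number of species observed exactly j times."""
    return Counter(n for n in abundances if n > 0)


def chao1(abundances):
    """Bias-corrected Chao1 estimate of total richness (observed plus unobserved).

    Assumption: `abundances` holds one count of individuals per species
    from a single sample (abundance data).
    """
    f = frequency_counts(abundances)
    observed = sum(f.values())   # number of species actually seen
    f1 = f.get(1, 0)             # singletons: species seen exactly once
    f2 = f.get(2, 0)             # doubletons: species seen exactly twice
    return observed + f1 * (f1 - 1) / (2 * (f2 + 1))


def incidence_counts(samples):
    """Return a map species -> number of samples in which it appears.

    Assumption: `samples` is a list of sets of species labels
    (incidence data).
    """
    return Counter(species for sample in samples for species in set(sample))


# Toy abundance data: 8 observed species, 4 singletons, 2 doubletons.
toy_abundances = [10, 5, 2, 2, 1, 1, 1, 1]
estimate = chao1(toy_abundances)  # 8 + 4*3 / (2*3) = 10.0

# Toy incidence data: three samples, each a list of species present.
toy_samples = [{"a", "b"}, {"a", "c"}, {"a"}]
counts = incidence_counts(toy_samples)  # a: 3, b: 1, c: 1
```

The estimate exceeds the observed richness only when singletons are present, which reflects the idea that rarely seen species carry the information about unseen ones.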
References
Behnke A, Bunge J, Barger KJ, Stoeck T (2008) Impact of the time dimension on our perception of microbial molecular diversity and its patterns. Submitted for publication
Borchers DL, Buckland ST, Zucchini W (2002) Estimating animal abundance: closed populations. Springer New York
Bunge J, Barger K (2008) Parametric models for estimating the number of classes. Biometrical Journal 50(5)
Chao A (2005) Species estimation and applications. In: Balakrishnan N, Read CB, Vidakovic B (eds) Encyclopedia of statistical sciences, 2nd edn, vol 12. Wiley, New York, pp 7907–7916
Chao A, Bunge JA (2002) Estimating the number of species in a stochastic abundance model. Biometrics 58:531–539
Chao A, Huggins RM (2005) Classical closed population models. In: Manly B, McDonald T, Amstrup S (eds) The handbook of capture–recapture methods. Princeton University Press, Princeton, pp 22–35
Chao A, Lee S-M (1992) Estimating the number of classes via sample coverage. J Am Statist Assn 87:210–217
Chao A, Yip PSF, Lee S-M, Chu W (2001) Population size estimation based on estimating functions for closed capture–recapture models. J Statist Plan Inference 92:213–232
Choquet R, Reboulet A-M, Pradel R, Gimenez O, Lebreton J-D (2004) M-SURGE: new software specifically designed for multistate capture–recapture models. Anim Biodivers Conserv 27:207–215
Colwell RK (2005) EstimateS: Statistical estimation of species richness and shared species from samples. Version 7.5. User’s Guide and application published at: http://purl.oclc.org/estimates
Efford MG, Dawson DK, Robbins CS (2004) DENSITY: Software for analysing capture–recapture data from passive detector arrays. Anim Biodivers Conserv 27:217–228
Epstein SS, Bunge J (2006) Estimation of microbial diversity from GenBank data. Appl Environ Microbiol 72(10):6578–6583
Epstein SS, Bunge J (2008) Estimation of microbial diversity from GenBank data. In preparation
Fienberg SE, Johnson MS, Junker BW (1999) Classical multilevel and Bayesian approaches to population size estimation using multiple lists. J R Stat Soc: Ser A 162:383–405
Hong S-H, Bunge J, Jeon S-O, Epstein SS (2006) Predicting microbial species richness. Proc Natl Acad Sci USA 103:117–122
Huber JA, Mark Welch DB, Morrison HG, Huse SM, Neal PR, Butterfield DA, Sogin ML (2007) Microbial population structures in the deep marine biosphere. Science 318:97–100
Huggins RM, Yip PSF (2001) A note on nonparametric inference for capture–recapture experiments with heterogeneous capture probabilities. Statistica Sinica 11:843–853
Lee S-M, Chao A (1994) Estimating population size via sample coverage for closed capture–recapture models. Biometrics 50:88–97
Magurran AE (2004) Measuring biological diversity. Blackwell, Oxford
Mao CX (2004) Predicting the conditional probability of discovering a new class. J Am Stat Assoc 99:1108–1118
Mao CX, Lindsay BG (2007) Estimating the number of classes. Ann Stat 35:917–930
Norris JL III, Pollock KH (1996) Nonparametric MLE under two closed capture–recapture models with heterogeneity. Biometrics 52:639–649
Pledger S (2005) The performance of mixture models in heterogeneous closed population capture–recapture. Biometrics 61:868–876
Rexstad E, Burnham KP (1991) User’s Guide for Interactive Program CAPTURE. Colorado Cooperative Fish and Wildlife Research Unit, Fort Collins, CO, USA, 29
Shen TJ, Chao A, Lin CF (2003) Predicting the number of new species in further taxonomic sampling. Ecology 84:798–804
Stackebrandt E, Goebel BM (1994) Taxonomic note: A place for DNA:DNA reassociation and 16S rRNA sequence analysis in the present species definition in bacteria. Int J Syst Bacteriol 44:846–849
Tardella L (2002) A new Bayesian method for nonparametric capture–recapture models in presence of heterogeneity. Biometrika 89:807–817
Wang J-PZ, Lindsay BG (2005) A penalized nonparametric maximum likelihood approach to species richness estimation. J Am Stat Assoc 100:942–959
Williamson M, Gaston KJ (2005) The lognormal distribution is not an appropriate null hypothesis for the species-abundance distribution. J Anim Ecol 74:409–422
Zwane E, van der Heijden P (2005) Population estimation using the multiple system estimator in the presence of continuous covariates. Stat Modelling 5:39–52
APPENDIX: Software
The following software is available as of this writing; documentation can be found at the websites listed. The collection of such software continues to expand, so for new applications and data analyses the reader should check recent developments through expert advice and Internet searches.
- Abundance data
  - Code written in MAPLE (www.maplesoft.com) for various parametric models, http://www.stat.cornell.edu/~bunge/ (Hong et al. 2006)
- Abundance or incidence data
  - SPADE, http://chao.stat.nthu.edu.tw/softwareCE.html (Shen et al. 2003)
  - EstimateS, http://viceroy.eeb.uconn.edu/EstimateS (Colwell 2005)
- Incidence data
  - DENSITY, http://www.landcareresearch.co.nz/ (Efford et al. 2004)
  - CARE-2, http://chao.stat.nthu.edu.tw/softwareCE.html (Chao et al. 2001)
  - CAPTURE, http://www.mbr-pwrc.usgs.gov/software.html (Rexstad and Burnham 1991)
  - M-SURGE, http://www.cefe.cnrs.fr/ (Choquet et al. 2004)
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Bunge, J. (2009). Statistical Estimation of Uncultivated Microbial Diversity. In: Epstein, S. (ed) Uncultivated Microorganisms. Microbiology Monographs, vol 10. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85465-4_3
DOI: https://doi.org/10.1007/978-3-540-85465-4_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85464-7
Online ISBN: 978-3-540-85465-4
eBook Packages: Biomedical and Life Sciences (R0)