Estimating Bacterial Diversity from Environmental DNA: A Maximum Likelihood Approach

  • Frederick Cohan
  • Danny Krizanc
  • Yun Lu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4463)


The ability to measure bacterial diversity is a prerequisite for the systematic study of bacterial biogeography and ecology. In this paper we describe a method of estimating diversity from an environmental sample of DNA and apply it to data from Sargasso Sea samples. Our approach combines the coverage depth method of Venter et al. [2] and the contig spectrum approach of Angly et al. [3], but uses maximum likelihood to recover the diversity rather than hand-fit models as in [2]. For each of four assumed species abundance distributions, we maximize the likelihood of the observed coverage depth at different positions of the consensus sequence provided in the Sargasso Sea sample. The resulting estimates match well with those obtained using less mathematically rigorous approaches.
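To make the idea of fitting coverage depth by maximum likelihood concrete, here is a minimal sketch. It is not the authors' implementation. The abstract does not name the four abundance distributions, so the sketch assumes the simplest possible stand-in: an even community in which all species are equally abundant and share one genome length. It also assumes that depth at a covered position follows a zero-truncated Poisson distribution. The function names, parameters, and grid search are all hypothetical.

```python
import math


def log_likelihood(depths, n_species, n_reads, read_len, genome_len):
    """Log-likelihood of observed coverage depths under an even community.

    Assumption (not from the paper): reads are spread evenly over
    n_species genomes of equal length, so depth is Poisson with mean c.
    """
    # Expected per-base coverage of each genome.
    c = n_reads * read_len / (n_species * genome_len)
    # Consensus positions are covered at least once, so condition on
    # depth >= 1: log(1 - exp(-c)), computed stably with expm1.
    log_norm = math.log(-math.expm1(-c))
    return sum(
        d * math.log(c) - c - math.lgamma(d + 1) - log_norm
        for d in depths
    )


def estimate_species(depths, n_reads, read_len, genome_len, max_species=1000):
    """Grid search for the species count that maximizes the likelihood."""
    return max(
        range(1, max_species + 1),
        key=lambda s: log_likelihood(depths, s, n_reads, read_len, genome_len),
    )
```

Calling `estimate_species(depths, n_reads, read_len, genome_len)` with a list of integer depths (each at least 1) returns the species count whose implied coverage best explains the data. A skewed abundance distribution such as the lognormal would replace the single Poisson mean with a mixture over abundance classes, which this sketch does not attempt.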


Keywords (added by machine, not by the authors): Lognormal Distribution; Bacterial Diversity; Coverage Depth; Abundance Level; Abundance Distribution




References

  1. Curtis, T.P., Sloan, W.T.: Exploring Microbial Diversity - A Vast Below. Science 309, 1331–1333 (2005)
  2. Venter, J.C., et al.: Environmental Genome Shotgun Sequencing of the Sargasso Sea. Science 304, 66–74 (2004), Supporting Online Material
  3. Angly, F., et al.: PHACCS, an Online Tool for Estimating the Structure and Diversity of Uncultured Viral Communities Using Metagenomic Information. BMC Bioinformatics 6(41) (2005)
  4. Bohannan, B.J.M., Hughes, J.: New Approaches to Analyzing Microbial Biodiversity Data. Current Opinion in Microbiology 6, 282–287 (2003)
  5. Myers, G.: Whole-Genome DNA Sequencing. Computing in Science and Engineering, 33–43 (May-June 1999)
  6. Preston, F.W.: The Commonness and Rarity of Species. Ecology 29, 254–283 (1948)
  7. Bulmer, M.G.: On Fitting the Poisson Lognormal Distribution to Species Abundance Data. Biometrics 30, 101–110 (1974)
  8. Hubbell, S.: The Unified Neutral Theory of Biodiversity and Biogeography. Princeton University Press, Princeton (2001)
  9. Curtis, T.P., Sloan, W.T., Scannell, J.W.: Estimating Prokaryotic Diversity and Its Limits. Proc. Natl. Acad. Sci. USA 99, 10494–10499 (2002)
  10. Dunbar, J., et al.: Empirical and Theoretical Bacterial Diversity in Four Arizona Soils. Appl. Environ. Microbiol. 68, 3035–3045 (2002)
  11. Zhou, J., et al.: Spatial and Resource Factors Influencing High Microbial Diversity in Soil. Appl. Environ. Microbiol. 68, 326–334 (2002)
  12. Kroes, I., Lepp, P.W., Relman, D.: Bacterial Diversity Within the Human Subgingival Crevice. Proc. Natl. Acad. Sci. USA 96, 14547–14552 (1999)
  13. Hughes, J.B., et al.: Counting the Uncountable: Statistical Approaches to Estimating Microbial Diversity. Appl. Environ. Microbiol. 67, 4399–4406 (2001)
  14. Seber, G.: The Estimation of Animal Abundance and Related Parameters. Griffin, London (1973)
  15. Krebs, C.: Ecological Methodology. Harper and Row, New York (1989)
  16. Chao, A.: Estimating the Population Size for Capture-recapture Data with Unequal Catchability. Biometrics 43, 783–791 (1987)
  17. Breitbart, M., et al.: Genomic Analysis of Uncultured Marine Viral Communities. Proc. Natl. Acad. Sci. USA 99, 14250–14255 (2002)
  18. Reysenbach, A., et al.: Differential Amplification of rRNA Genes by Polymerase Chain Reaction. Appl. Environ. Microbiol. 58, 3417–3418 (1992)
  19. Suzuki, M., Giovannoni, S.: Bias Caused by Template Annealing in the Amplification of Mixtures of 16S rRNA Genes by PCR. Appl. Environ. Microbiol. 62, 625–630 (1996)
  20. Speksnijder, A., et al.: Microvariation Artefacts Introduced by PCR and Cloning of Closely Related 16S rRNA Gene Sequences. Appl. Environ. Microbiol. 67, 469–472 (2001)
  21. Gans, J., Wolinsky, M., Dunbar, J.: Computational Improvements Reveal Great Bacterial Diversity and High Metal Toxicity in Soil. Science 309, 1387–1390 (2005)
  22. Falkowski, P.G., de Vargas, C.: Shotgun Sequencing in the Sea: A Blast from the Past? Science 304, 58–60 (2004)
  23. Travis, J.M., Larsen, D.R.: Measures of Diversity. Natural Resource Biometrics (1995)

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Frederick Cohan (1)
  • Danny Krizanc (2)
  • Yun Lu (2)
  1. Department of Biology, Wesleyan University, Middletown, CT 06459
  2. Department of Mathematics and Computer Science, Wesleyan University, Middletown, CT 06459
