Journal of Computer Science and Technology, Volume 25, Issue 1, pp 71–81

Metagenomics: Facts and Artifacts, and Computational Challenges


DOI: 10.1007/s11390-010-9306-4

Cite this article as:
Wooley, J.C. & Ye, Y. J. Comput. Sci. Technol. (2010) 25: 71. doi:10.1007/s11390-010-9306-4


Metagenomics is the study of microbial communities sampled directly from their natural environment, without prior culturing. By enabling an analysis of populations including many (so-far) unculturable and often unknown microbes, metagenomics is revolutionizing the field of microbiology, and has excited researchers in many disciplines that could benefit from the study of environmental microbes, including those in ecology, environmental sciences, and biomedicine. Specific computational and statistical tools have been developed for metagenomic data analysis and comparison. New studies, however, have revealed various kinds of artifacts present in metagenomics data caused by limitations in the experimental protocols and/or inadequate data analysis procedures, which often lead to incorrect conclusions about a microbial community. Here, we review some of the artifacts, such as overestimation of species diversity and incorrect estimation of gene family frequencies, and discuss emerging computational approaches to address them. We also review potential challenges that metagenomics may encounter with the extensive application of next-generation sequencing (NGS) techniques.


metagenomics, next-generation sequencing (NGS), taxonomic/functional profiling, statistical approaches, comparative metagenomics

Copyright information

© Springer 2010

Authors and Affiliations

  1. Center for Research on BioSystems, Calit2, University of California San Diego, La Jolla, U.S.A.
  2. School of Informatics and Computing, Indiana University, Bloomington, U.S.A.
