Encyclopedia of Metagenomics

2015 Edition | Editors: Karen E. Nelson

Protein-Coding Genes as Alternative Markers in Microbial Diversity Studies

Reference work entry
DOI: https://doi.org/10.1007/978-1-4899-7478-5_734


Automated Phylogenomic Inference Application (AMPHORA)


The small subunit ribosomal RNA (SSU rRNA or 16S rRNA) has been widely used in microbial systematics and diversity studies. The 16S rRNA gene has several appealing properties as a marker gene. First, it is present in every cellular organism. Secondly, because regions of the 16S rRNA sequence are highly conserved, the gene can be PCR amplified from a wide diversity of taxa using “universal” primers and sequenced, bypassing the need to isolate and culture the organisms in question. Consequently, millions of 16S rRNA reference sequences are available for microbial classification and identification (Cole et al. 2009).

Although 16S rRNA has been the “gold standard” in microbial diversity studies, it has several shortcomings. First, because the 16S rRNA gene makes up only a tiny fraction of a genome (~0.1%), its usefulness as a marker gene for classifying metagenomic sequences is seriously limited. Secondly, the widely...


  1. Acinas SG, Sarma-Rupavtarm R, Klepac-Ceraj V, et al. PCR-induced sequence artifacts and bias: insights from comparison of two 16S rRNA clone libraries constructed from the same sample. Appl Environ Microbiol. 2005;71:8966–9.
  2. Berger SA, Krompass D, Stamatakis A. Performance, accuracy, and Web server for evolutionary placement of short sequence reads under maximum likelihood. Syst Biol. 2011;60:291–302.
  3. Cammarano P, Creti R, Sanangelantoni AM, et al. The archaea monophyly issue: a phylogeny of translational elongation factor G(2) sequences inferred from an optimized selection of alignment positions. J Mol Evol. 1999;49:524–37.
  4. Castresana J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol Biol Evol. 2000;17:540–52.
  5. Cole JR, Wang Q, Cardenas E, et al. The ribosomal database project: improved alignments and new tools for rRNA analysis. Nucleic Acids Res. 2009;37:D141–5.
  6. Eddy SR. Accelerated profile HMM searches. PLoS Comput Biol. 2011;7:e1002195.
  7. Grundy WN, Naylor GJ. Phylogenetic inference from conserved sites alignments. J Exp Zool. 1999;285:128–39.
  8. Huson DH, Auch AF, Qi J, et al. MEGAN analysis of metagenomic data. Genome Res. 2007;17:377–86.
  9. Hwang UW, Kim W, Tautz D, et al. Molecular phylogenetics at the Felsenstein zone: approaching the Strepsiptera problem using 5.8S and 28S rDNA sequences. Mol Phylogenet Evol. 1998;9:470–80.
  10. Jain R. Horizontal gene transfer among genomes: the complexity hypothesis. Proc Natl Acad Sci. 1999;96:3801–6.
  11. Kembel SW, Wu M, Eisen JA, et al. Incorporating 16S gene copy number information improves estimates of microbial diversity and abundance. PLoS Comput Biol. 2012;8:e1002743.
  12. Koski LB, Golding GB. The closest BLAST hit is often not the nearest neighbor. J Mol Evol. 2001;52:540–2.
  13. Lake JA. The order of sequence alignment can bias the selection of tree topology. Mol Biol Evol. 1991;8:378–85.
  14. Landan G, Graur D. Heads or tails: a simple reliability check for multiple sequence alignments. Mol Biol Evol. 2007;24:1380–3.
  15. Loytynoja A, Goldman N. Phylogeny-aware gap placement prevents errors in sequence alignment and evolutionary analysis. Science. 2008;320:1632–5.
  16. Ludwig W, Klenk H-P. Overview: a phylogenetic backbone and taxonomic framework for procaryotic systematics. In: Boone DR, Castenholz RW, Garrity GM, editors. Bergey’s manual of systematic bacteriology, vol. 1. New York: Springer-Verlag; 2000. p. 49–65.
  17. Matsen FA, Kodner RB, Armbrust EV. pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree. BMC Bioinforma. 2010;11.
  18. Morris RM, Rappe MS, Connon SA, et al. SAR11 clade dominates ocean surface bacterioplankton communities. Nature. 2002;420:806–10.
  19. Morrison DA, Ellis JT. Effects of nucleotide sequence alignment on phylogeny estimation: a case study of 18S rDNAs of apicomplexa. Mol Biol Evol. 1997;14:428–41.
  20. Pagani I, Liolios K, Jansson J, et al. The Genomes OnLine Database (GOLD) v.4: status of genomic and metagenomic projects and their associated metadata. Nucleic Acids Res. 2012;40:D571–9.
  21. Rusch DB, Halpern AL, Sutton G, et al. The Sorcerer II global ocean sampling expedition: northwest Atlantic through eastern tropical Pacific. PLoS Biol. 2007;5:e77.
  22. Santos SR, Ochman H. Identification and phylogenetic sorting of bacterial lineages with universally conserved genes and proteins. Environ Microbiol. 2004;6:754–9.
  23. Sorek R, Zhu Y, Creevey CJ, et al. Genome-wide experimental determination of barriers to horizontal gene transfer. Science. 2007;318:1449–52.
  24. Wu M, Eisen JA. A simple, fast, and accurate method of phylogenomic inference. Genome Biol. 2008;9:R151.
  25. Wu M, Scott AJ. Phylogenomic analysis of bacterial and archaeal sequences with AMPHORA2. Bioinformatics. 2012;28:1033–4.
  26. Wu M, Chatterji S, Eisen JA. Accounting for alignment uncertainty in phylogenomics. PLoS ONE. 2012;7(1):e30288.

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Department of Biology, University of Virginia, Charlottesville, USA