In recent years, thanks to Next-Generation Sequencing (NGS) technologies, metagenomics has enabled direct access to the taxonomic and functional composition of mixed microbial communities living in any environmental niche, without the need to isolate or culture individual organisms. This approach has already been applied successfully to many habitats, including natural water and soil environments (some characterized by extreme physical and chemical conditions), food supply chains, and animal hosts, including humans. A shotgun sequencing approach makes it possible to investigate the diversity of both organisms and genes. However, if the aim is limited to exploring taxonomic complexity, an amplicon-based approach, relying on PCR-targeted sequencing of selected species marker genes, commonly named “meta-barcodes”, is preferable. Among the genomic regions most widely used to discriminate bacterial organisms, in some cases down to the species level, some hypervariable domains of the gene coding for 16S rRNA occupy a prominent place.
The first task after extraction of total DNA is the amplification of a given meta-barcode from the microbial community, using PCR primers designed to work across the entire taxonomic group under consideration. This step is generally followed by high-throughput sequencing of the resulting amplicon libraries on a selected NGS platform. Finally, interpreting the large amount of data produced requires appropriate bioinformatics tools and know-how, as well as efficient computational resources.
Here, a computational methodology suitable for the taxonomic characterization of 454 meta-barcode sequences is described in detail. In particular, a dataset covering the V1–V3 region of the bacterial 16S rRNA gene, produced by the Human Microbiome Project (HMP) from a palatine tonsils sample, is analyzed. The proposed exercise includes the basic steps needed to manage raw sequencing data, remove amplification and pyrosequencing errors, and finally map the sequences onto the taxonomy.
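As a rough illustration of what "managing raw sequencing data" can involve, the following minimal Python sketch reads amplicon reads in FASTA format, discards reads that are too short or contain ambiguous bases, and collapses identical reads into counts. It is not the pipeline described in this chapter: the function names (`read_fasta`, `passes_filter`, `dereplicate`) and the thresholds (`min_len`, `max_ambiguous`) are hypothetical choices made for the example, and the sketch does not attempt pyrosequencing denoising, chimera removal, or taxonomic assignment, which require dedicated tools.

```python
from collections import Counter
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def passes_filter(seq, min_len=200, max_ambiguous=0):
    """Keep reads that are long enough and have few non-ACGT bases.

    The default thresholds are illustrative assumptions, not values
    taken from the chapter.
    """
    ambiguous = sum(1 for base in seq if base not in "ACGT")
    return len(seq) >= min_len and ambiguous <= max_ambiguous


def dereplicate(records, **filter_kwargs):
    """Collapse identical reads that pass the filter into sequence -> count."""
    counts = Counter()
    for _, seq in records:
        if passes_filter(seq, **filter_kwargs):
            counts[seq] += 1
    return counts


if __name__ == "__main__":
    # Toy input: two identical reads, one with an ambiguous base, one too short.
    toy = StringIO(">r1\nACGTACGT\n>r2\nACGTACGT\n>r3\nACGTNCGT\n>r4\nACG\n")
    unique = dereplicate(read_fasta(toy), min_len=5, max_ambiguous=0)
    for seq, n in unique.most_common():
        print(seq, n)
```

In practice, the length and ambiguity thresholds would be chosen according to the amplicon (here, the V1–V3 region) and the sequencing platform, and the dereplicated reads would then be passed to error-correction and taxonomic classification steps.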