e-DNA Meta-Barcoding: From NGS Raw Data to Taxonomic Profiling

RNA Bioinformatics

Part of the book series: Methods in Molecular Biology (MIMB, volume 1269)

Abstract

In recent years, Next-Generation Sequencing (NGS) technologies have enabled metagenomics to give direct access to the taxonomic and functional composition of mixed microbial communities living in any environmental niche, without the need to isolate or culture individual organisms. This approach has already been applied successfully to many habitats: natural water and soil environments, including those characterized by extreme physical and chemical conditions; food supply chains; and animal hosts, including humans. A shotgun sequencing approach allows both organism and gene diversity to be investigated. However, if the aim is limited to exploring taxonomic complexity, an amplicon-based approach, relying on PCR-targeted sequencing of selected genetic species markers commonly called “meta-barcodes”, is preferable. Among the genomic regions most widely used to discriminate bacterial organisms, in some cases down to the species level, some hypervariable domains of the gene coding for 16S rRNA occupy a prominent place.

After extraction of the total DNA, the first task is to amplify a given meta-barcode from the microbial community using PCR primers that work across the entire taxonomic group under consideration. This step is generally followed by high-throughput sequencing of the resulting amplicon libraries on a selected NGS platform. Finally, interpreting the large amount of data produced requires appropriate bioinformatics tools and know-how, as well as efficient computational resources.

Here, a computational methodology suitable for the taxonomic characterization of 454 meta-barcode sequences is described in detail. In particular, a dataset covering the V1–V3 region of the bacterial 16S rRNA gene, produced by the Human Microbiome Project (HMP) from a palatine tonsils sample, is analyzed. The proposed exercise covers the basic steps needed to manage raw sequencing data, remove amplification and pyrosequencing errors, and finally map the sequences onto the taxonomy.
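The overall shape of such a workflow (read the sequences, discard low-quality reads, summarize taxonomic assignments) can be illustrated with a minimal Python sketch. This is not the chapter's pipeline: the function names, the length and ambiguity thresholds, and the read-to-taxon dictionary are all hypothetical, and the simple filter shown here merely stands in for the dedicated denoising and classification tools a real analysis would use.

```python
from collections import Counter


def parse_fasta(text):
    """Yield (read_id, sequence) pairs from FASTA-formatted text."""
    read_id, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if read_id is not None:
                yield read_id, "".join(chunks)
            header = line[1:].split()
            read_id, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line.upper())
    if read_id is not None:
        yield read_id, "".join(chunks)


def passes_filter(seq, min_len=200, max_ambiguous=0):
    """Keep reads that are long enough and have few ambiguous (N) bases.

    Both thresholds are illustrative assumptions, not values from the chapter.
    """
    return len(seq) >= min_len and seq.count("N") <= max_ambiguous


def taxonomic_profile(assignments):
    """Turn a {read_id: taxon} mapping into relative abundances per taxon."""
    counts = Counter(assignments.values())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {taxon: n / total for taxon, n in counts.items()}


if __name__ == "__main__":
    toy_fasta = ">r1 sample\nACGT\nACGT\n>r2\nACNT\n>r3\nAC\n"
    reads = dict(parse_fasta(toy_fasta))
    kept = [rid for rid, seq in reads.items() if passes_filter(seq, min_len=4)]
    print("reads kept:", kept)

    # Hypothetical assignments, as a classifier might return them.
    toy_assignments = {"a": "Streptococcus", "b": "Streptococcus",
                       "c": "Prevotella", "d": "Veillonella"}
    print(taxonomic_profile(toy_assignments))
```

In practice the `assignments` mapping would come from a taxonomic classifier run on the denoised reads; here it is filled in by hand only to show how a profile is tallied.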

Corresponding author

Correspondence to Monica Santamaria.

Copyright information

© 2015 Springer Science+Business Media New York

About this protocol

Cite this protocol

Bruno, F., Marinella, M., Santamaria, M. (2015). e-DNA Meta-Barcoding: From NGS Raw Data to Taxonomic Profiling. In: Picardi, E. (eds) RNA Bioinformatics. Methods in Molecular Biology, vol 1269. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-2291-8_16

  • DOI: https://doi.org/10.1007/978-1-4939-2291-8_16

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-2290-1

  • Online ISBN: 978-1-4939-2291-8

  • eBook Packages: Springer Protocols
