Abstract
Metagenomics is the study of microbial organisms by applying sequencing directly to environmental samples. Similarly, metatranscriptomics and metaproteomics study the RNA and protein sequences of such samples. The analysis of these kinds of data often starts with three questions: “who is out there?”, “what are they doing?”, and “how do they compare?”. In this chapter, we describe how these computational questions can be addressed using MEGAN, the MEtaGenome ANalyzer program. We first show how to analyze the taxonomic and functional content of a single dataset, and then how to perform such analyses comparatively across datasets. We demonstrate how to compare different datasets using ecological indices and other distance measures. The discussion is based on a number of published marine datasets comprising metagenomic, metatranscriptomic, metaproteomic, and 16S rRNA data.
Copyright information
© 2012 Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Huson, D.H., Mitra, S. (2012). Introduction to the Analysis of Environmental Sequences: Metagenomics with MEGAN. In: Anisimova, M. (eds) Evolutionary Genomics. Methods in Molecular Biology, vol 856. Humana Press. https://doi.org/10.1007/978-1-61779-585-5_17
DOI: https://doi.org/10.1007/978-1-61779-585-5_17
Publisher Name: Humana Press
Print ISBN: 978-1-61779-584-8
Online ISBN: 978-1-61779-585-5
eBook Packages: Springer Protocols