DIAMOND + MEGAN Microbiome Analysis

Gautam, Anupam; Zeng, Wenhuan; Huson, Daniel H.

doi:10.1007/978-1-0716-3072-3_6

Part of the book series: Methods in Molecular Biology (MIMB, volume 2649)

Abstract

Metagenomics is the study of microbiomes using DNA sequencing technologies. Basic computational tasks are to determine the taxonomic composition (who is out there?), the functional composition (what can they do?), and also to correlate changes of composition to changes in external parameters (how do they compare?). One approach to these tasks is first to align all sequences against a protein reference database such as NCBI-nr and then to perform taxonomic and functional binning of all sequences based on their alignments. The resulting classifications can then be interactively analyzed and compared. Here we illustrate how to pursue this approach using the DIAMOND+MEGAN pipeline on two different publicly available datasets, one containing short-read samples and the other containing long-read samples.
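The binning step described above can be illustrated with a minimal sketch. It assumes a lowest-common-ancestor (LCA) rule, a common choice for alignment-based taxonomic binning; the abstract itself does not state which rule is used. The taxonomy, read names, and alignments below are made up for illustration, and the function names are hypothetical rather than part of DIAMOND or MEGAN.

```python
from collections import defaultdict

# Hypothetical toy taxonomy: child -> parent (the root has parent None).
PARENT = {
    "root": None,
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Escherichia": "Proteobacteria",
    "Salmonella": "Proteobacteria",
    "Firmicutes": "Bacteria",
    "Bacillus": "Firmicutes",
}


def lineage(taxon):
    """Return the path from the root down to the given taxon."""
    path = []
    while taxon is not None:
        path.append(taxon)
        taxon = PARENT[taxon]
    return path[::-1]


def lowest_common_ancestor(taxa):
    """Return the deepest node shared by the lineages of all given taxa."""
    paths = [lineage(t) for t in taxa]
    lca = "root"
    for nodes in zip(*paths):
        if len(set(nodes)) != 1:
            break
        lca = nodes[0]
    return lca


def bin_reads(alignments):
    """Assign each read to the LCA of the taxa of its aligned references.

    alignments: iterable of (read_id, taxon) pairs, one pair per alignment.
    """
    hits = defaultdict(set)
    for read_id, taxon in alignments:
        hits[read_id].add(taxon)
    return {read_id: lowest_common_ancestor(taxa) for read_id, taxa in hits.items()}


# Made-up alignments, for illustration only.
example = [
    ("read1", "Escherichia"),
    ("read1", "Salmonella"),
    ("read2", "Bacillus"),
    ("read3", "Escherichia"),
    ("read3", "Bacillus"),
]
assignments = bin_reads(example)
```

In this toy setup a read whose alignments all point to one genus is placed on that genus, while a read with conflicting alignments is placed higher in the taxonomy, on the deepest node that covers all of its hits.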

Anupam Gautam and Wenhuan Zeng contributed equally.

Acknowledgements

The authors acknowledge hardware support by the High Performance and Cloud Computing Group at the Zentrum für Datenverarbeitung of the University of Tübingen, the state of Baden-Württemberg through bwHPC, and the German Research Foundation (DFG) through grant no. INST 37/935-1 FUGG. We would also like to acknowledge Marius Eisele for helping us with the long-read datasets.

Author information

Authors and Affiliations

Institute for Bioinformatics and Medical Informatics, University of Tübingen, Tübingen, Germany
Anupam Gautam
International Max Planck Research School “From Molecules to Organisms”, Max Planck Institute for Biology Tübingen, Tübingen, Germany
Anupam Gautam
Institute for Bioinformatics and Medical Informatics, University of Tübingen, Tübingen, Germany
Wenhuan Zeng & Daniel H. Huson

Corresponding author

Correspondence to Daniel H. Huson.

Editor information

Editors and Affiliations

Leeds Institute of Medical Research, University of Leeds, Leeds, UK
Suparna Mitra

About this protocol

Cite this protocol

Gautam, A., Zeng, W., Huson, D.H. (2023). DIAMOND + MEGAN Microbiome Analysis. In: Mitra, S. (eds) Metagenomic Data Analysis. Methods in Molecular Biology, vol 2649. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-3072-3_6

DOI: https://doi.org/10.1007/978-1-0716-3072-3_6
Published: 03 June 2023
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-3071-6
Online ISBN: 978-1-0716-3072-3
eBook Packages: Springer Protocols
