DIAMOND + MEGAN Microbiome Analysis

  • Protocol
Metagenomic Data Analysis

Part of the book series: Methods in Molecular Biology (MIMB, volume 2649)

Abstract

Metagenomics is the study of microbiomes using DNA sequencing technologies. Basic computational tasks are to determine the taxonomic composition (who is out there?), the functional composition (what can they do?), and to correlate changes in composition with changes in external parameters (how do they compare?). One approach to these tasks is to first align all sequences against a protein reference database such as NCBI-nr and then to perform taxonomic and functional binning of all sequences based on their alignments. The resulting classifications can then be interactively analyzed and compared. Here we illustrate this approach using the DIAMOND+MEGAN pipeline on two publicly available datasets, one containing short-read samples and the other containing long-read samples.
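The alignment-then-binning idea can be illustrated with a small sketch. The abstract does not say how binning is done, so the lowest-common-ancestor (LCA) rule used here is an assumption for illustration, and the toy taxonomy, read names, and function names are all hypothetical. This is not the pipeline's actual implementation.

```python
from collections import Counter

# Hypothetical toy taxonomy: child -> parent ("root" has no parent).
PARENT = {
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Escherichia coli": "Proteobacteria",
    "Salmonella enterica": "Proteobacteria",
    "Firmicutes": "Bacteria",
    "Bacillus subtilis": "Firmicutes",
}


def lineage(taxon):
    """Return the path from the root down to the given taxon."""
    path = [taxon]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path[::-1]


def lowest_common_ancestor(taxa):
    """Return the deepest node shared by the lineages of all given taxa."""
    lineages = [lineage(t) for t in taxa]
    lca = "root"
    for level in zip(*lineages):
        if len(set(level)) != 1:
            break
        lca = level[0]
    return lca


def bin_reads(alignments):
    """Assign each read to the LCA of the taxa of its aligned references."""
    return {read: lowest_common_ancestor(taxa) for read, taxa in alignments.items()}


# Hypothetical alignment results: read -> taxa of the matched reference proteins.
alignments = {
    "read1": ["Escherichia coli"],
    "read2": ["Escherichia coli", "Salmonella enterica"],
    "read3": ["Escherichia coli", "Bacillus subtilis"],
}

bins = bin_reads(alignments)
counts = Counter(bins.values())  # reads per taxon: a minimal taxonomic profile
```

A read whose alignments all point to one species is binned at that species, while a read with hits in several taxa is placed on their shared ancestor. Comparing such per-taxon counts across samples corresponds to the "how do they compare?" task.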

Anupam Gautam and Wenhuan Zeng contributed equally.

Acknowledgements

The authors acknowledge hardware support by the High Performance and Cloud Computing Group at the Zentrum für Datenverarbeitung of the University of Tübingen, the state of Baden-Württemberg through bwHPC, and the German Research Foundation (DFG) through grant no. INST 37/935-1 FUGG. We would also like to acknowledge Marius Eisele for helping us with the long-read datasets.

Corresponding author

Correspondence to Daniel H. Huson.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Gautam, A., Zeng, W., Huson, D.H. (2023). DIAMOND + MEGAN Microbiome Analysis. In: Mitra, S. (eds) Metagenomic Data Analysis. Methods in Molecular Biology, vol 2649. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-3072-3_6

  • DOI: https://doi.org/10.1007/978-1-0716-3072-3_6

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-3071-6

  • Online ISBN: 978-1-0716-3072-3

  • eBook Packages: Springer Protocols
