An Agile Functional Analysis of Metagenomic Data Using SUPER-FOCUS

  • Protocol
Protein Function Prediction

Part of the book series: Methods in Molecular Biology (MIMB, volume 1611)

Abstract

One of the main goals in metagenomics is to identify the functional profile of a microbial community from unannotated shotgun sequencing reads. Functional annotation is important in biological research because it enables researchers to determine the abundance of functional genes in the organisms present in a sample, answering the question, “What can the organisms in the sample do?” Most currently available approaches do not scale with increasing data volumes, a significant limitation because both the number and the length of the reads produced by sequencing platforms keep increasing. Here, we present SUPER-FOCUS (SUbsystems Profile by databasE Reduction using FOCUS), an agile homology-based approach that uses a reduced reference database to report the subsystems present in metagenomic datasets and profile their abundances. SUPER-FOCUS was tested with real metagenomes, and the results show that it accurately predicts the subsystems present in the profiled microbial communities, is computationally efficient, and is up to 1000 times faster than other tools. SUPER-FOCUS is freely available at http://edwards.sdsu.edu/SUPERFOCUS.

Notes

  1. This option can be one of rapsearch, blast, or diamond.

  2. Aligner choice [rapsearch (default), blast, or diamond]

Acknowledgments

We thank the SEED curators Dr Ross Overbeek, Dr Veronika Vonstein, and Dr Ramy Aziz for their amazing work on the annotation of subsystems since 2004. GGZS was supported by NSF grants (CNS-1305112, MCB-1330800, and DUE-132809 to RAE), and FACL was supported by a Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES/Brazil) fellowship.

Author information

Corresponding author

Correspondence to Robert A. Edwards.

Copyright information

© 2017 Springer Science+Business Media LLC

About this protocol

Cite this protocol

Silva, G.G.Z., Lopes, F.A.C., Edwards, R.A. (2017). An Agile Functional Analysis of Metagenomic Data Using SUPER-FOCUS. In: Kihara, D. (ed) Protein Function Prediction. Methods in Molecular Biology, vol 1611. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7015-5_4

  • DOI: https://doi.org/10.1007/978-1-4939-7015-5_4

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-7013-1

  • Online ISBN: 978-1-4939-7015-5

  • eBook Packages: Springer Protocols
