Abstract
One of the main goals in metagenomics is to identify the functional profile of a microbial community from unannotated shotgun sequencing reads. Functional annotation is important in biological research because it enables researchers to identify the abundance of functional genes of the organisms present in the sample, answering the question, “What can the organisms in the sample do?” Most currently available approaches do not scale with increasing data volumes, a serious limitation because both the number and the length of the reads produced by sequencing platforms keep increasing. Here, we present SUPER-FOCUS (SUbsystems Profile by databasE Reduction using FOCUS), an agile homology-based approach that uses a reduced reference database to report the subsystems present in metagenomic datasets and profile their abundances. SUPER-FOCUS was tested with real metagenomes, and the results show that it accurately predicts the subsystems present in the profiled microbial communities, is computationally efficient, and is up to 1000 times faster than other tools. SUPER-FOCUS is freely available at http://edwards.sdsu.edu/SUPERFOCUS.
Notes
1. This option can be one of rapsearch, blast, or diamond.
2. Aligner choice [rapsearch (default), blast, or diamond]
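The aligner choice described in these notes can be illustrated with a minimal sketch. It assumes a command-line option that accepts exactly the three aligners named above and falls back to rapsearch when none is given. The flag name `--aligner` and the helper `parse_aligner` are hypothetical and are used for illustration only; they are not presented as the tool's actual interface.

```python
import argparse

# Aligner names as listed in the notes; rapsearch is the stated default.
ALIGNERS = ("rapsearch", "blast", "diamond")


def build_parser():
    """Build a parser that restricts the aligner to the documented choices."""
    parser = argparse.ArgumentParser(description="Aligner choice sketch")
    # Hypothetical flag name, chosen for illustration only.
    parser.add_argument(
        "--aligner",
        choices=ALIGNERS,
        default="rapsearch",
        help="aligner choice [rapsearch (default), blast, or diamond]",
    )
    return parser


def parse_aligner(argv):
    """Return the selected aligner name for a list of command-line tokens."""
    return build_parser().parse_args(argv).aligner
```

With this sketch, an empty argument list selects rapsearch, and any value outside the three listed aligners is rejected by argparse before any alignment would start.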
Acknowledgments
We thank the SEED curators Dr Ross Overbeek, Dr Veronika Vonstein, and Dr Ramy Aziz for their amazing work on the annotation of subsystems since 2004. GGZS was supported by NSF grants (CNS-1305112, MCB-1330800, and DUE-132809 to RAE), and FACL was supported by a Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES/Brazil) fellowship.
Copyright information
© 2017 Springer Science+Business Media LLC
About this protocol
Cite this protocol
Silva, G.G.Z., Lopes, F.A.C., Edwards, R.A. (2017). An Agile Functional Analysis of Metagenomic Data Using SUPER-FOCUS. In: Kihara, D. (eds) Protein Function Prediction. Methods in Molecular Biology, vol 1611. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7015-5_4
DOI: https://doi.org/10.1007/978-1-4939-7015-5_4
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-7013-1
Online ISBN: 978-1-4939-7015-5
eBook Packages: Springer Protocols