An Agile Functional Analysis of Metagenomic Data Using SUPER-FOCUS

  • Genivaldo Gueiros Z. Silva
  • Fabyano A. C. Lopes
  • Robert A. Edwards (corresponding author)
Part of the Methods in Molecular Biology book series (MIMB, volume 1611)


One of the main goals in metagenomics is to identify the functional profile of a microbial community from unannotated shotgun sequencing reads. Functional annotation is important in biological research because it enables researchers to identify the abundance of functional genes of the organisms present in the sample, answering the question, “What can the organisms in the sample do?” Most currently available approaches do not scale with increasing data volumes, a serious limitation because both the number and the lengths of the reads produced by sequencing platforms keep increasing. Here, we present SUPER-FOCUS (SUbsystems Profile by databasE Reduction using FOCUS), an agile homology-based approach that uses a reduced reference database to report the subsystems present in metagenomic datasets and profile their abundances. SUPER-FOCUS was tested with real metagenomes, and the results show that it accurately predicts the subsystems present in the profiled microbial communities, is computationally efficient, and runs up to 1000 times faster than other tools. SUPER-FOCUS is freely available at


Keywords: Bioinformatics, Metagenomics, Functional profiling, Agile tool, Sensitive, SEED



We thank the SEED curators Dr Ross Overbeek, Dr Veronika Vonstein, and Dr Ramy Aziz for their amazing work on the annotation of subsystems since 2004. GGZS was supported by NSF grants (CNS-1305112, MCB-1330800, and DUE-132809 to RAE), and FACL was supported by a Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES/Brazil) fellowship.



Copyright information

© Springer Science+Business Media LLC 2017

Authors and Affiliations

  • Genivaldo Gueiros Z. Silva (1)
  • Fabyano A. C. Lopes (2)
  • Robert A. Edwards (1, 3, 4)
  1. Computational Science Research Center, San Diego State University, San Diego, USA
  2. Cellular Biology Department, Universidade de Brasília (UnB), Brasília, Brazil
  3. Department of Biology, San Diego State University, San Diego, USA
  4. Department of Computer Science, San Diego State University, San Diego, USA
