Automated Genome Annotation and Metabolic Model Reconstruction in the SEED and Model SEED

  • Scott Devoid
  • Ross Overbeek
  • Matthew DeJongh
  • Veronika Vonstein
  • Aaron A. Best
  • Christopher Henry
Part of the Methods in Molecular Biology book series (MIMB, volume 985)


Over the past decade, genome-scale metabolic models have proven to be a crucial resource for predicting organism phenotypes from genotypes. These models provide a means of rapidly translating detailed knowledge of thousands of enzymatic processes into quantitative predictions of whole-cell behavior. Until recently, new metabolic models were developed far more slowly than new genomes were sequenced. To address this problem, the RAST and Model SEED frameworks were developed as a means of automatically producing annotations and draft genome-scale metabolic models. In this chapter, we describe the automated model reconstruction process in detail, starting from a new genome sequence and finishing with a functioning genome-scale metabolic model. We break the model reconstruction process down into eight steps: submitting a genome sequence to RAST, annotating the genome, curating the annotation, submitting the annotation to Model SEED, reconstructing the core model, generating the draft biomass reaction, auto-completing the model, and curating the model. Each of these eight steps is documented in detail.
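
As a rough illustration of why the auto-completion step is needed, the sketch below checks which biomass precursors a draft network can produce from a growth medium, before and after one missing reaction is added. It is a toy network-expansion check, not the procedure used by Model SEED; the reaction identifiers, metabolites, and function names are all hypothetical.

```python
# Toy illustration only: the reactions, metabolites, and the expansion
# heuristic are hypothetical and are not taken from RAST or Model SEED.

def producible(reactions, media):
    """Return the set of metabolites reachable from the media.

    reactions maps a reaction id to (substrates, products), both sets.
    A reaction can fire once all of its substrates are available.
    """
    available = set(media)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions.values():
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return available


def missing_precursors(reactions, media, biomass_precursors):
    """List biomass precursors that the network cannot produce."""
    return sorted(set(biomass_precursors) - producible(reactions, media))


# Hypothetical draft model: the step B -> C has no annotated gene.
draft = {
    "R1": ({"glucose"}, {"A"}),
    "R2": ({"A"}, {"B"}),
    "R4": ({"C"}, {"D"}),
}
media = {"glucose"}
biomass = {"B", "D"}

gaps_before = missing_precursors(draft, media, biomass)

# Adding one candidate reaction (assumed to come from a reference
# reaction database) closes the gap.
completed = dict(draft, R3=({"B"}, {"C"}))
gaps_after = missing_precursors(completed, media, biomass)

print(gaps_before)  # ['D']
print(gaps_after)   # []
```

In this toy case the draft network cannot make precursor D until reaction R3 is added, which is the kind of gap that auto-completion is meant to close.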

Key words

Model SEED; RAST; Automated metabolic model reconstruction; Flux balance analysis; Gap filling; Microbial metabolism; Systems metabolic engineering



We acknowledge the entire SEED, Model SEED, and CytoSEED teams at Argonne National Laboratory, the Fellowship for Interpretation of Genomes, Hope College, and the University of Chicago for their efforts on the frameworks described in this chapter. This work was supported by the US Department of Energy under contract DE-AC02-06CH11357 (SD, CH), the National Institute of Allergy and Infectious Diseases under contract HHSN266200400042C (RO), and the National Science Foundation under grants MCB-0745100 and DBI-0850546 (MD, AB, VV, RO).


Copyright information

© Springer Science+Business Media, LLC 2013

Authors and Affiliations

  • Scott Devoid (1)
  • Ross Overbeek (2)
  • Matthew DeJongh (3)
  • Veronika Vonstein (2)
  • Aaron A. Best (4)
  • Christopher Henry (1), corresponding author
  1. MCS Division, Argonne National Laboratory, Argonne, USA
  2. Fellowship for Interpretation of Genomes, Burr Ridge, USA
  3. Department of Computer Science, Hope College, Holland, USA
  4. Department of Biology, Hope College, Holland, USA
