A survey of metabolic databases emphasizing the MetaCyc family

  • Review Article
  • Archives of Toxicology

Abstract

Thanks to the confluence of genome sequencing and bioinformatics, the number of metabolic databases has expanded from a handful in the mid-1990s to several thousand today. These databases lie within distinct families that have common ancestry and common attributes. The main families are the MetaCyc, KEGG, Reactome, Model SEED, and BiGG families. We survey these database families, as well as important individual metabolic databases, including multiple human metabolic databases. The MetaCyc family is described in particular detail. It contains well over 1,000 databases, including highly curated databases for Escherichia coli, Saccharomyces cerevisiae, Mus musculus, and Arabidopsis thaliana. These databases are available through a number of web sites that offer a range of software tools for querying and visualizing metabolic networks. These web sites also provide multiple tools for analysis of gene expression and metabolomics data, including visualization of those datasets on metabolic network diagrams and over-representation analysis of gene sets and metabolite sets.
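The over-representation analysis mentioned above can be illustrated with a generic hypergeometric test. The sketch below is only an illustration under stated assumptions: it is not taken from the article and is not claimed to be the method implemented by any of the surveyed web sites. The function names, pathway names, and gene identifiers are all hypothetical toy values.

```python
from math import comb


def enrichment_p_value(population_size, pathway_size, sample_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    X counts how many of `sample_size` genes drawn without replacement
    from `population_size` genes fall in a pathway of `pathway_size` genes.
    """
    max_overlap = min(pathway_size, sample_size)
    total = comb(population_size, sample_size)
    tail = sum(
        comb(pathway_size, k) * comb(population_size - pathway_size, sample_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


def over_representation(gene_set, pathways, background):
    """Return {pathway name: p-value} for a gene set against each pathway.

    Genes outside `background` are ignored on both sides of the comparison.
    """
    hits = gene_set & background
    results = {}
    for name, members in pathways.items():
        members = members & background
        overlap = len(hits & members)
        results[name] = enrichment_p_value(
            len(background), len(members), len(hits), overlap
        )
    return results


# Hypothetical toy data: 10 background genes and two made-up pathways.
background = {f"g{i}" for i in range(10)}
pathways = {
    "toy pathway A": {"g0", "g1", "g2"},
    "toy pathway B": {"g7", "g8", "g9"},
}
gene_set = {"g0", "g1", "g2"}
p_values = over_representation(gene_set, pathways, background)
```

In this toy case all three genes of interest belong to "toy pathway A", so its p-value is small, while "toy pathway B" shares no genes with the set and gets a p-value of 1. A real analysis would also correct for testing many pathways at once, which this sketch omits.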

Acknowledgments

The projects described were supported by award numbers GM75742 and GM080746 from the National Institute of General Medical Sciences. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of General Medical Sciences or the National Institutes of Health.

Author information

Corresponding author

Correspondence to Peter D. Karp.

About this article

Cite this article

Karp, P.D., Caspi, R. A survey of metabolic databases emphasizing the MetaCyc family. Arch Toxicol 85, 1015–1033 (2011). https://doi.org/10.1007/s00204-011-0705-2

  • DOI: https://doi.org/10.1007/s00204-011-0705-2