Archives of Toxicology, Volume 85, Issue 9, pp 1015–1033

A survey of metabolic databases emphasizing the MetaCyc family

  • Peter D. Karp
  • Ron Caspi

Review Article

Abstract

Thanks to the confluence of genome sequencing and bioinformatics, the number of metabolic databases has expanded from a handful in the mid-1990s to several thousand today. These databases lie within distinct families that have common ancestry and common attributes. The main families are the MetaCyc, KEGG, Reactome, Model SEED, and BiGG families. We survey these database families, as well as important individual metabolic databases, including multiple human metabolic databases. The MetaCyc family is described in particular detail. It contains well over 1,000 databases, including highly curated databases for Escherichia coli, Saccharomyces cerevisiae, Mus musculus, and Arabidopsis thaliana. These databases are available through a number of web sites that offer a range of software tools for querying and visualizing metabolic networks. These web sites also provide multiple tools for analysis of gene expression and metabolomics data, including visualization of those datasets on metabolic network diagrams and over-representation analysis of gene sets and metabolite sets.

Keywords

Metabolic databases, Bioinformatics, Metabolic pathways, Databases, Genome databases

Notes

Acknowledgments

The projects described were supported by award numbers GM75742 and GM080746 from the National Institute of General Medical Sciences. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of General Medical Sciences or the National Institutes of Health.

Copyright information

© Springer-Verlag 2011

Authors and Affiliations

  1. Bioinformatics Research Group, SRI International, Menlo Park, USA