Machine Learning and Hybrid Methods for Metabolic Pathway Modeling

  • Protocol
Computational Biology and Machine Learning for Metabolic Engineering and Synthetic Biology

Part of the book series: Methods in Molecular Biology (MIMB, volume 2553)

Abstract

Computational models of cell metabolism seek to explain cell behavior under different conditions or following genetic alterations, to help optimize in vitro cell growth environments, or to predict cellular behavior in vivo and in vitro. At the extremes, mechanistic models range from highly detailed descriptions of a small number of metabolic reactions to approximate representations of an entire metabolic network. To date, all mechanistic models have required details of individual metabolic reactions, either kinetic parameters or metabolic fluxes, as well as information about extracellular and intracellular metabolite concentrations. Despite extensive efforts and the increasing availability of high-quality data, the required in vivo data are not available for the majority of known metabolic reactions; thus, mechanistic models are based primarily on ex vivo kinetic measurements and limited flux information. Machine learning approaches provide an alternative for deriving functional dependencies from existing data. The increasing availability of metabolomic and lipidomic data, with growing feature coverage and sample set size, is expected to provide the data needed to derive machine learning models of cell metabolic processes. Moreover, machine learning analysis of longitudinal data can lead to predictive models of cell behavior over time. In contrast, machine learning models trained on steady-state data can provide descriptive models for comparing metabolic states in different environments or disease conditions. Additionally, including metabolic network knowledge in these analyses can further help in developing models from limited data.

This chapter explores the application of machine learning to the modeling of cell metabolism. We first provide a theoretical explanation of several machine learning and hybrid mechanistic/machine learning methods currently being explored to model metabolism. Next, we introduce several avenues for improving these models with machine learning. Finally, we provide protocols for specific examples of the use of machine learning in developing predictive cell metabolism models from metabolomic data. We describe data preprocessing, approaches for training both descriptive and predictive machine learning models, and the use of these models in synthetic and systems biology. The detailed protocols include a list of the software tools and libraries used for these applications, step-by-step modeling procedures, troubleshooting guidance, and an overview of the existing limitations of these approaches.

Acknowledgments

Work was supported in part by operating grants AI-4D-102-3 to SALB and MCC from the National Research Council AI for Design Challenge Program, RGPIN-2019-06796 to SALB from the Natural Sciences and Engineering Research Council of Canada (NSERC), as well as an NSERC CREATE Matrix Metabolomics Training grant to SALB. TTN received an NSERC CREATE Matrix Metabolomics Graduate Scholarship.

Author information

Corresponding author

Correspondence to Miroslava Cuperlovic-Culf.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Cuperlovic-Culf, M., Nguyen-Tran, T., Bennett, S.A.L. (2023). Machine Learning and Hybrid Methods for Metabolic Pathway Modeling. In: Selvarajoo, K. (eds) Computational Biology and Machine Learning for Metabolic Engineering and Synthetic Biology. Methods in Molecular Biology, vol 2553. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2617-7_18

  • DOI: https://doi.org/10.1007/978-1-0716-2617-7_18

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-2616-0

  • Online ISBN: 978-1-0716-2617-7

  • eBook Packages: Springer Protocols
