Robust Approaches to Generating Reliable Predictive Models in Systems Biology

  • Kiri Choi
Part of the RNA Technologies book series (RNATECHN)


A computational technique is described that reduces the model search space and constructs an ensemble of models for systems biology using perturbation data. Along the way, an effective way of representing a network model for computing purposes is developed using adjacency matrix-like data structures. This representation allows models to include Uni-Uni to Bi-Bi reactions in addition to enzymatic activation and inhibition. The technique is demonstrated to be effective and fast, which suggests it can be used as an initial filtering step in conjunction with other computational techniques. Finally, other potential methods for constructing a set of reliable network models using time-course data are explored.
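To make the idea of an adjacency matrix-like representation concrete, the following is a minimal sketch in Python. It is an illustration under stated assumptions, not the chapter's actual data structure: the `Reaction` class, the `encode` function, the two species-by-reaction matrices, and the four-species example network are all hypothetical. One matrix records which species are consumed or produced by each reaction, and a second records which species activate (+1) or inhibit (-1) each reaction.

```python
from dataclasses import dataclass


@dataclass
class Reaction:
    """Hypothetical container for one reaction (assumption, not from the chapter)."""
    reactants: tuple          # one or two species indices (Uni or Bi)
    products: tuple           # one or two species indices (Uni or Bi)
    activators: tuple = ()    # species that activate this reaction
    inhibitors: tuple = ()    # species that inhibit this reaction


def encode(n_species, reactions):
    """Return (stoich, regulation) as species-by-reaction integer matrices.

    stoich[s][j] is negative if species s is consumed by reaction j and
    positive if it is produced; regulation[s][j] is +1 for activation and
    -1 for inhibition of reaction j by species s.
    """
    stoich = [[0] * len(reactions) for _ in range(n_species)]
    regulation = [[0] * len(reactions) for _ in range(n_species)]
    for j, rxn in enumerate(reactions):
        # Restrict to the Uni-Uni, Uni-Bi, Bi-Uni, and Bi-Bi cases.
        if not (1 <= len(rxn.reactants) <= 2 and 1 <= len(rxn.products) <= 2):
            raise ValueError("only Uni-Uni to Bi-Bi reactions are supported")
        for s in rxn.reactants:
            stoich[s][j] -= 1
        for s in rxn.products:
            stoich[s][j] += 1
        for s in rxn.activators:
            regulation[s][j] = 1
        for s in rxn.inhibitors:
            regulation[s][j] = -1
    return stoich, regulation


# Hypothetical four-species network used only for illustration.
reactions = [
    Reaction(reactants=(0,), products=(1,)),                      # Uni-Uni: S0 -> S1
    Reaction(reactants=(1, 2), products=(3,), inhibitors=(0,)),   # Bi-Uni: S1 + S2 -> S3, inhibited by S0
]
stoich, regulation = encode(4, reactions)
```

Because every candidate network maps to a pair of small integer matrices, two candidates can be compared or deduplicated with a plain equality check, which is one plausible reason such an encoding would suit a search over many models.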


Systems biology; Biochemical networks; Network reduction; Machine learning; Ensemble modeling



KC is supported by NIH grant GM123032-01A1. The content is solely the responsibility of the author and does not necessarily represent the views of the National Institutes of Health. KC wishes to thank Herbert Sauro and Joseph Hellerstein for their help and guidance in completing this chapter.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Bioengineering, University of Washington, Seattle, USA
