Inferring Gene Regulatory Networks from Multiple Datasets

  • Christopher A. Penfold
  • Iulia Gherman
  • Anastasiya Sybirna
  • David L. Wild
Part of the Methods in Molecular Biology book series (MIMB, volume 1883)


Gaussian process dynamical systems (GPDS) represent Bayesian nonparametric approaches to inference of nonlinear dynamical systems, and provide a principled framework for the learning of biological networks from multiple perturbed time series measurements of gene or protein expression. Such approaches are able to capture the full richness of complex ODE models, and can be scaled for inference in moderately large systems containing hundreds of genes. Related hierarchical approaches allow for inference from multiple datasets in which the underlying generative networks are assumed to have been rewired, either by context-dependent changes in network structure, evolutionary processes, or synthetic manipulation. These approaches can also be used to leverage experimentally determined network structures from one species into another where the network structure is unknown. Collectively, these methods provide a comprehensive and flexible platform for inference from a diverse range of data, with applications in systems and synthetic biology, as well as spatiotemporal modelling of embryo development. In this chapter we provide an overview of GPDS approaches and highlight their applications in the biological sciences, with accompanying tutorials available as a Jupyter notebook from

Key words

  • Nonlinear dynamical systems
  • Gaussian process dynamical systems
  • Causal structure identification
  • Learning from multiple data sources
  • Spatiotemporal models



CAP is supported by the Wellcome Trust (grant 083089/Z/07/Z). IG is supported by EPSRC/BBSRC research grant EP/L016494/1. AS is supported by a 4-year Wellcome Trust PhD Scholarship and Cambridge International Trust Scholarship. DLW acknowledges support from the Engineering and Physical Sciences Research Council (grant EP/R014337/1).

CAP, IG, and AS acknowledge support from the BBSRC-EPSRC funded OpenPlant Synthetic Biology Research Centre (BB/L014130/1) through the OpenPlant Fund scheme. CAP and AS also thank M. Azim Surani for his support.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Christopher A. Penfold (1)
  • Iulia Gherman (2)
  • Anastasiya Sybirna (1, 3, 4)
  • David L. Wild (5)

  1. Wellcome/CRUK Gurdon Institute, University of Cambridge, Cambridge, UK
  2. Warwick Integrative Synthetic Biology Centre, School of Engineering, University of Warwick, Coventry, UK
  3. Wellcome/MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
  4. Physiology, Development and Neuroscience Department, University of Cambridge, Cambridge, UK
  5. Department of Statistics and Systems Biology Centre, University of Warwick, Coventry, UK
