Probably Approximately Correct Learning of Regulatory Networks from Time-Series Data

  • Arthur Carcano
  • François Fages
  • Sylvain Soliman
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10545)


Automating model building from experimental data is a highly desirable goal, since many applications suffer from a shortage of modellers. However, despite the spectacular progress of machine learning techniques in data analytics, classification, clustering and prediction, learning dynamical models from time-series data remains challenging. In this paper we investigate the use of Leslie Valiant's Probably Approximately Correct (PAC) learning framework as a method for the automated discovery of influence models of biochemical processes from Boolean and stochastic traces. We show that Thomas’ Boolean influence systems can be naturally represented by k-CNF formulae and learned from time-series data with a number of Boolean activation samples per species that is quasi-linear in the precision of the learned model, and that positive Boolean influence systems can be represented by monotone DNF formulae and learned actively with both activation samples and oracle calls. We consider Boolean traces and Boolean abstractions of stochastic simulation traces, and study the space-time tradeoff between the diversity of initial states and the length of the time horizon, as well as its impact on the error bounds provided by the PAC learning algorithms. We evaluate the performance of this approach on a model of T-lymphocyte differentiation, with and without prior knowledge, and discuss its merits and limitations with respect to realistic experiments.
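
To make the k-CNF learning idea concrete, here is a minimal sketch of the classical elimination algorithm for k-CNF formulae in Valiant's framework: start from the conjunction of every clause with at most k literals, then discard each clause falsified by an observed positive sample. This is an illustration only, not the authors' implementation; the function names and the toy target (a species activated when variable 0 is on and variable 2 is off) are hypothetical.

```python
from itertools import combinations, product

def all_clauses(n, k):
    """All clauses of 1..k literals over n variables.
    A literal is (variable index, required truth value)."""
    clauses = set()
    for size in range(1, k + 1):
        for idxs in combinations(range(n), size):
            for signs in product((True, False), repeat=size):
                clauses.add(frozenset(zip(idxs, signs)))
    return clauses

def clause_holds(clause, state):
    """A clause (disjunction) holds if at least one literal matches the state."""
    return any(state[i] == sign for i, sign in clause)

def learn_kcnf(n, k, positive_samples):
    """Keep only the clauses satisfied by every positive sample."""
    hypothesis = all_clauses(n, k)
    for state in positive_samples:
        hypothesis = {c for c in hypothesis if clause_holds(c, state)}
    return hypothesis

def evaluate(hypothesis, state):
    """The hypothesis is the conjunction of its remaining clauses."""
    return all(clause_holds(c, state) for c in hypothesis)

# Toy target (an assumption for illustration): x0 AND NOT x2.
def target(s):
    return s[0] and not s[2]

states = list(product((False, True), repeat=3))
positives = [s for s in states if target(s)]
hypothesis = learn_kcnf(3, 2, positives)
agrees = all(evaluate(hypothesis, s) == target(s) for s in states)
```

Because only falsified clauses are removed, the learned formula never accepts a state that the target rejects when the target is itself a k-CNF formula. With fewer samples the hypothesis is simply more restrictive, which is the one-sided error that PAC bounds quantify.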



This work is partly supported by the ANR project Hyclock.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Arthur Carcano (1)
  • François Fages (2), email author
  • Sylvain Soliman (2)

  1. Ecole Normale Supérieure, Paris, France
  2. Inria, University Paris-Saclay, Lifeware Group, Palaiseau, France
