A Bayesian Active Learning Experimental Design for Inferring Signaling Networks

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10229)


Machine learning methods for learning network structure, applied to quantitative proteomics experiments, reverse-engineer intracellular signal transduction networks. They provide insight into the rewiring of signaling within the context of a disease or a phenotype. To learn the causal patterns of influence between proteins in the network, the methods require experiments that include targeted interventions that fix the activity of specific proteins. However, the interventions are costly and add experimental complexity.

We describe an active learning strategy for selecting optimal interventions. Our approach takes pathway databases and historic datasets as inputs, expresses them in the form of prior probability distributions on network structures, and selects the interventions that maximize the expected contribution to structure learning. Evaluations on simulated and real data show that, compared to an unguided choice of interventions, the strategy reduces the detection error of validated edges and avoids redundant interventions, thereby increasing the effectiveness of the experiment.
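To make the selection step concrete, the following is a minimal sketch of choosing an intervention by expected information gain. It is not the procedure used in the paper. It assumes a small, explicitly enumerated set of candidate network structures with prior weights, and it assumes an idealized, noiseless experiment in which intervening on a protein reveals the direction of every edge touching that protein. Under those assumptions the expected information gain of an intervention equals the entropy of the edge pattern it would reveal. The protein names, the prior weights, and the function names `outcome_entropy` and `pick_intervention` are all illustrative.

```python
import math
from collections import defaultdict


def outcome_entropy(graphs, target):
    """Entropy in bits of the edge pattern around `target` under the prior.

    `graphs` is a list of (edges, probability) pairs, where `edges` is a
    frozenset of directed (parent, child) tuples. Assumption: an ideal
    intervention on `target` reveals exactly the directed edges touching it.
    """
    mass = defaultdict(float)
    for edges, prob in graphs:
        # Structures that agree on every edge at `target` cannot be told
        # apart by this intervention, so their prior mass is pooled.
        signature = frozenset(e for e in edges if target in e)
        mass[signature] += prob
    return -sum(p * math.log2(p) for p in mass.values() if p > 0)


def pick_intervention(graphs, candidates):
    """Return the candidate with the highest expected gain, plus all scores."""
    scores = {node: outcome_entropy(graphs, node) for node in candidates}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical prior over three candidate structures on three proteins.
prior = [
    (frozenset({("Raf", "Mek"), ("Mek", "Erk")}), 0.5),
    (frozenset({("Mek", "Raf"), ("Mek", "Erk")}), 0.3),
    (frozenset({("Mek", "Raf"), ("Erk", "Mek")}), 0.2),
]

best, scores = pick_intervention(prior, ["Raf", "Mek", "Erk"])
print(best, scores)
```

In this toy prior, an intervention on Mek separates all three candidate structures, whereas an intervention on Raf or Erk leaves two of them indistinguishable, so Mek receives the highest score. A realistic application would replace the enumerated prior with one derived from pathway databases and historic data, and would account for measurement noise.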


Machine learning, Active learning, Causal inference, Bayesian network, Probabilistic graphical models, Biological networks



We thank M. Scutari for guidance in using the R package bnlearn. This work was supported in part by the NSF CAREER award DBI-1054826, and by the Sy and Laurie Sternberg award to OV.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Department of Statistics, Purdue University, West Lafayette, USA
  2. College of Science, College of Computer and Information Science, Northeastern University, Boston, USA
  3. School of Medicine, Stanford University, Palo Alto, USA
