Statistics and Computing, Volume 26, Issue 4, pp 797–811

Exact estimation of multiple directed acyclic graphs

  • Chris J. Oates
  • Jim Q. Smith
  • Sach Mukherjee
  • James Cussens


This paper considers structure learning for multiple related directed acyclic graph (DAG) models. Building on recent developments in exact estimation of DAGs using integer linear programming (ILP), we present an ILP approach for joint estimation over multiple DAGs. Unlike previous work, we do not require that the vertices in each DAG share a common ordering. Furthermore, we allow for (potentially unknown) dependency structure between the DAGs. Results are presented on both simulated data and fMRI data obtained from multiple subjects.


Keywords: Hierarchical model; Bayesian network; Multiregression dynamical model; Integer linear programming; Joint estimation



The authors are grateful to Dr. Ricardo Silva and two anonymous reviewers, whose feedback helped to improve the paper. CJO was supported by the Centre for Research in Statistical Methodology (CRiSM), EPSRC EP/D002060/1. JC was supported by the Medical Research Council (Project Grant G1002312). SM was supported by the UK Medical Research Council and is a recipient of a Royal Society Wolfson Research Merit Award. The authors are grateful to Lilia Carneiro da Costa and Tom Nichols, who collaborated in the analysis of fMRI data, and to Mark Bartlett, who provided technical support with GOBNILP. The authors also thank Diane Oyen and several other colleagues who provided feedback on an earlier draft.

Supplementary material

11222_2015_9570_MOESM1_ESM.pdf (358 kb)
Supplementary material 1 (pdf 357 KB)



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Chris J. Oates (1, 4)
  • Jim Q. Smith (1)
  • Sach Mukherjee (2, 5)
  • James Cussens (3)

  1. Department of Statistics, University of Warwick, Coventry, UK
  2. MRC Biostatistics Unit and CRUK Cambridge Institute, University of Cambridge, Cambridge, UK
  3. Department of Computer Science and York Centre for Complex Systems Analysis, University of York, York, UK
  4. School of Mathematical and Physical Sciences, University of Technology Sydney, Sydney, Australia
  5. German Center for Neurodegenerative Diseases (DZNE), Bonn, Germany
