Mathematical Programming, Volume 164, Issue 1–2, pp 285–324

Polyhedral aspects of score equivalence in Bayesian network structure learning

  • James Cussens
  • David Haws
  • Milan Studený
Full Length Paper Series A


This paper deals with faces and facets of the family-variable polytope and the characteristic-imset polytope, which are special polytopes used in integer linear programming approaches to statistically learn Bayesian network structure. A common form of linear objectives to be maximized in this area leads to the concept of score equivalence (SE), both for linear objectives and for faces of the family-variable polytope. We characterize the linear space of SE objectives and establish a one-to-one correspondence between SE faces of the family-variable polytope, the faces of the characteristic-imset polytope, and standardized supermodular functions. The characterization of SE facets in terms of extremality of the corresponding supermodular function gives an elegant method to verify whether an inequality is SE-facet-defining for the family-variable polytope. We also show that when maximizing an SE objective one can eliminate linear constraints of the family-variable polytope that correspond to non-SE facets. However, a counter-example shows that considering SE facets alone is not enough; one also has to consider the linear inequality constraints that correspond to facets of the characteristic-imset polytope, even though they may not define facets in the family-variable mode.
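To make the two vector encodings concrete, here is a minimal Python sketch. It is not taken from the paper. It assumes the usual definitions from the related literature: the family-variable vector of a directed acyclic graph has one 0/1 component per pair (node, candidate parent set), and the characteristic imset has one 0/1 component per set S of at least two nodes, equal to 1 exactly when some node in S has all other members of S among its parents. The function names, the dictionary representation, and the choice to include empty parent sets as components are illustrative assumptions.

```python
from itertools import combinations


def family_variable_vector(nodes, parents):
    """Map (node, candidate parent set) -> 1 if that set is the node's
    actual parent set, else 0. Empty parent sets are included here,
    which is a convention chosen for this sketch."""
    vec = {}
    for i in nodes:
        others = [j for j in nodes if j != i]
        for k in range(len(others) + 1):
            for cand in combinations(others, k):
                cand = frozenset(cand)
                vec[(i, cand)] = int(cand == parents[i])
    return vec


def characteristic_imset(nodes, parents):
    """Map each set S with |S| >= 2 -> 1 if some i in S has every other
    element of S among its parents, else 0."""
    imset = {}
    for k in range(2, len(nodes) + 1):
        for subset in combinations(nodes, k):
            s = frozenset(subset)
            imset[s] = int(any(s - {i} <= parents[i] for i in s))
    return imset


nodes = ["a", "b", "c"]
# Three hypothetical graphs over the same nodes, given as parent sets.
chain = {"a": frozenset(), "b": frozenset({"a"}), "c": frozenset({"b"})}     # a -> b -> c
fork = {"a": frozenset({"b"}), "b": frozenset(), "c": frozenset({"b"})}      # a <- b -> c
collider = {"a": frozenset(), "b": frozenset({"a", "c"}), "c": frozenset()}  # a -> b <- c

# chain and fork are Markov equivalent: different family-variable vectors,
# identical characteristic imsets. The collider is not equivalent to either.
print(family_variable_vector(nodes, chain) == family_variable_vector(nodes, fork))
print(characteristic_imset(nodes, chain) == characteristic_imset(nodes, fork))
print(characteristic_imset(nodes, chain) == characteristic_imset(nodes, collider))
```

In this toy case the chain and the fork are distinct points in the family-variable encoding but coincide in the characteristic-imset encoding, which is the situation a score equivalent objective is meant to respect: it assigns the same value to both graphs.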


Family-variable polytope; Characteristic-imset polytope; Score equivalent face/facet; Supermodular set function

Mathematics Subject Classification

52B12 90C27 68Q32 



The research of Milan Studený has been supported by the grants GAČR n. 13-20012S and 16-12010S. James Cussens was supported by the UK Medical Research Council, Grant G1002312 and senior postdoctoral fellowship SF/14/008 from KU Leuven. Our special thanks go to Fero Matúš, who helped us to find an easy proof of the combinatorial identity from Lemma 13. We also express our gratitude to the reviewer for valuable comments.



Copyright information

© Springer-Verlag Berlin Heidelberg and Mathematical Optimization Society 2016

Authors and Affiliations

  1. Department of Computer Science and York Centre for Complex Systems Analysis, University of York, Deramore Lane, UK
  2. Thomas J. Watson Research Center, Yorktown Heights, USA
  3. Institute of Information Theory and Automation of the CAS, Prague, Czech Republic
