A survey on Bayesian network structure learning from data

  • Review
  • Journal: Progress in Artificial Intelligence

Abstract

A necessary step in the development of artificial intelligence is to enable a machine to represent how the world works by building an internal structure from data. This structure should strike a good trade-off between expressive power and querying efficiency. Bayesian networks have proven to be an effective and versatile tool for this task. They have been applied to modeling knowledge in a variety of fields, ranging from bioinformatics to law and from image processing to economic risk analysis. A crucial aspect is learning the dependency graph of a Bayesian network from data. This task, called structure learning, is NP-hard and is the subject of intense research. In short, it can be thought of as choosing one graph among many candidates on the basis of a collection of samples drawn from the distribution that generated the data. The number of possible graphs grows super-exponentially with the number of variables, so searching this space and selecting one graph over the others quickly becomes burdensome. In this survey, we review the most relevant structure learning algorithms that have been proposed in the literature. We classify them according to the approach they follow for solving the problem, and we also show alternatives for handling missing data and continuous variables. An extensive review of existing software tools is also given.
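
To give a sense of how quickly the space of candidate graphs grows, the following minimal sketch counts the labeled directed acyclic graphs on n nodes using Robinson's recurrence, a(n) = Σ_{k=1..n} (−1)^(k+1) C(n, k) 2^(k(n−k)) a(n−k), with a(0) = 1. The function name `count_dags` is an illustrative choice and is not taken from the survey.

```python
from functools import lru_cache
from math import comb  # requires Python 3.8+


@lru_cache(maxsize=None)
def count_dags(n: int) -> int:
    """Return the number of labeled DAGs on n nodes (Robinson's recurrence)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return 1  # the empty graph
    # Inclusion-exclusion over the k nodes that have no incoming edges:
    # each of them may point to any of the remaining n - k nodes.
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


if __name__ == "__main__":
    for n in range(1, 11):
        print(f"{n:2d} variables: {count_dags(n)} candidate DAGs")
```

With only 5 variables there are already 29281 candidate graphs, and the count passes 10^18 at 10 variables, which is why exhaustive enumeration is impractical beyond very small problems.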

Acknowledgements

This work has been partly supported by the Spanish Ministry of Science, Innovation and Universities, grant TIN2016-77902-C3-3-P and by ERDF (FEDER) funds.

Author information

Corresponding author

Correspondence to Mauro Scanagatta.

About this article

Cite this article

Scanagatta, M., Salmerón, A. & Stella, F. A survey on Bayesian network structure learning from data. Prog Artif Intell 8, 425–439 (2019). https://doi.org/10.1007/s13748-019-00194-y

  • DOI: https://doi.org/10.1007/s13748-019-00194-y
