Min-BDeu and Max-BDeu Scores for Learning Bayesian Networks

  • Mauro Scanagatta
  • Cassio P. de Campos
  • Marco Zaffalon
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8754)

Abstract

This work presents two new score functions based on the Bayesian Dirichlet equivalent uniform (BDeu) score for learning Bayesian network structures. They account for the sensitivity of BDeu to the parameters of the Dirichlet prior: the scores adopt, respectively, the most adversarial and the most beneficial prior among those within a contamination set around the symmetric one. We build these scores in such a way that they are decomposable and can be computed efficiently. As a result, they can be integrated into any state-of-the-art structure learning method that explores the space of directed acyclic graphs and accepts decomposable scores. Empirical results suggest that our scores outperform the standard BDeu score both in the likelihood of unseen data and in edge discovery with respect to the true network, at least when the training sample is small. We discuss the relation between these new scores and the accuracy of inferred models. Moreover, our new criteria can be used to identify the amount of data after which learning is saturated, that is, the point beyond which additional data do little to improve the resulting model.
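
To make the idea concrete, the following is a minimal Python sketch rather than the paper's implementation. It computes the standard BDeu local score for one variable from a table of counts, and then a hypothetical adversarial variant. The contamination set used here is an assumption, since the abstract does not define it: for each parent configuration, a fraction `eps` of the prior mass is moved from the symmetric prior onto an arbitrary distribution over the variable's states. Under that assumption each term of the score is concave in the hyperparameters, so the minimum is attained by putting all contaminating mass on a single state, which is what `min_bd_local` enumerates. The function names, `eps`, and the example counts are all illustrative.

```python
from math import lgamma


def bd_config_term(counts_j, alphas_j):
    """Log Bayesian Dirichlet contribution of one parent configuration."""
    a_total = sum(alphas_j)
    n_total = sum(counts_j)
    term = lgamma(a_total) - lgamma(a_total + n_total)
    for n, a in zip(counts_j, alphas_j):
        term += lgamma(a + n) - lgamma(a)
    return term


def bdeu_local(counts, ess=1.0):
    """Standard BDeu local score: symmetric prior ess / (q * r) per cell.

    counts[j][k] is the number of samples with parent configuration j
    and child state k; ess is the equivalent sample size.
    """
    q, r = len(counts), len(counts[0])
    symmetric = [ess / (q * r)] * r
    return sum(bd_config_term(row, symmetric) for row in counts)


def min_bd_local(counts, ess=1.0, eps=0.5):
    """Hypothetical adversarial local score (assumed contamination set).

    For each parent configuration, tries every prior that keeps a share
    (1 - eps) of the symmetric prior and puts the remaining mass on one
    child state, and keeps the lowest resulting term. Requires eps < 1.
    """
    q, r = len(counts), len(counts[0])
    base = (1.0 - eps) * ess / (q * r)
    total = 0.0
    for row in counts:
        candidates = []
        for k in range(r):
            alphas = [base] * r
            alphas[k] += eps * ess / q
            candidates.append(bd_config_term(row, alphas))
        total += min(candidates)
    return total


def network_score(local_scores):
    """Decomposability: the network score is the sum of local scores."""
    return sum(local_scores)


if __name__ == "__main__":
    # Illustrative counts for a binary child with one binary parent.
    counts = [[8, 2], [1, 9]]
    print("BDeu local score:       ", bdeu_local(counts))
    print("Adversarial local score:", min_bd_local(counts, eps=0.5))
```

Because both functions score one variable and its parent set in isolation, a structure learning method that accepts decomposable scores could call either one wherever it currently calls BDeu. The most beneficial counterpart would require maximizing a concave function over the same set and is not shown here.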

Keywords

Bayesian networks, structure learning, Bayesian Dirichlet score

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Mauro Scanagatta (1)
  • Cassio P. de Campos (1)
  • Marco Zaffalon (1)
  1. Istituto Dalle Molle di Studi sull’Intelligenza Artificiale (IDSIA), Switzerland
