
Bayesian network hybrid learning using an elite-guided genetic algorithm

Abstract

Bayesian networks (BNs) constitute a powerful framework for probabilistic reasoning and have been used extensively across research domains. This paper presents an improved hybrid learning strategy that uses parameterized genetic algorithms (GAs) to learn the structure of a BN underlying a set of data samples. Because the performance of a GA depends on the choice of several initial parameters, this work designs a series of parameter-less hybrid methods on a build-up basis: first, the standard implementation is refined with previously developed data-informed evolutionary strategies; then, two novel knowledge-driven parent-control enhancements are presented. The first improvement targets the parent-limitation setting. BN structure learning algorithms typically bound the maximum number of parents a node can possess to keep the learning process computationally feasible; the proposed method instead selects the parents to rule out according to a knowledge-driven strategy. The second enhancement reduces sensitivity to the parent-control setting by dynamically adjusting the maximum number of parents each node can hold. The experimental section shows that the adopted baseline outperforms the competitor algorithms included in the benchmark: thanks to its global search capabilities, the genetic methodology can efficiently prevail over other state-of-the-art structural learners on large networks. The experiments also show that the proposed methods improve algorithmic efficiency, reduce sensitivity to parameter setting, and address the problem of data fragmentation with respect to the baseline, in some cases with higher performance.
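The abstract does not spell out the knowledge-driven selection criterion used when a node exceeds the parent cap. As a minimal illustration only, the sketch below assumes a mutual-information ranking: when a node's candidate parent set exceeds the cap, the parents least informative about the child are ruled out rather than arbitrary ones. The names `prune_parents` and `mutual_information`, and the toy data, are hypothetical and not taken from the paper.

```python
import math
from collections import Counter

def mutual_information(x, y):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in pxy.items():
        p_ab = c / n
        mi += p_ab * math.log(p_ab / ((px[a] / n) * (py[b] / n)))
    return mi

def prune_parents(child, parents, max_parents):
    """If a node exceeds the parent cap, drop the candidate parents that are
    least informative about the child (lowest empirical mutual information)
    instead of ruling parents out arbitrarily."""
    if len(parents) <= max_parents:
        return list(parents)
    ranked = sorted(parents,
                    key=lambda p: mutual_information(parents[p], child),
                    reverse=True)
    return ranked[:max_parents]

# Toy data: candidate parent "A" determines the child; "B" is independent of it.
child = [0, 0, 1, 1, 0, 0, 1, 1]
candidates = {"A": [0, 0, 1, 1, 0, 0, 1, 1],
              "B": [0, 1, 0, 1, 0, 1, 0, 1]}
print(prune_parents(child, candidates, max_parents=1))  # -> ['A']
```

In the paper's second enhancement the cap itself is adjusted dynamically per node; in this sketch `max_parents` is simply a fixed argument.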



Author information

Correspondence to Fatemeh Vafaee.

Ethics declarations

Conflicts of interest

This study was conducted in accordance with ethical standards. No funding was specifically associated with this study, and the authors declare that they have no conflict of interest.

Electronic supplementary material


Supplementary material 1 (pdf 544 KB)


About this article


Cite this article

Contaldi, C., Vafaee, F. & Nelson, P.C. Bayesian network hybrid learning using an elite-guided genetic algorithm. Artif Intell Rev 52, 245–272 (2019). https://doi.org/10.1007/s10462-018-9615-5


Keywords

  • Bayesian networks
  • Structure learning
  • Genetic algorithms
  • Parent control