Adaptive Bayesian Network Structure Learning from Big Datasets

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10179)

Abstract

Since big data contain more comprehensive probability distributions and richer causal relationships than conventional small datasets, discovering Bayesian network (BN) structure from big datasets is becoming increasingly valuable for modeling and reasoning under uncertainty in many areas. Faced with big data, most current BN structure learning algorithms have limitations. First, learning BN structure from big datasets is computationally expensive and often ends in failure. Second, given an arbitrary dataset as input, it is very difficult to choose, from the numerous candidates, one algorithm that consistently achieves good learning accuracy. To address these issues, we introduce a novel approach called Adaptive Bayesian network Learning (ABNL). ABNL begins with an adaptive sampling process that extracts a sufficiently large data partition from any big dataset for fast structure learning. Then, ABNL feeds the data partition to different learning algorithms to obtain a collection of BN structures. Lastly, ABNL adaptively chooses among the structures and merges them into a final network structure using an ensemble method. Experimental results on four big datasets show that ABNL performs significantly better than learning from the whole dataset and produces more accurate results than baseline algorithms.
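The sampling and merging stages named in the abstract can be pictured with a minimal Python sketch. The abstract does not give the actual criteria, so everything here is an assumption made for illustration: the function names `adaptive_sample` and `merge_structures`, the stopping rule (stop growing the sample once marginal frequencies change by less than `tol`), and the merge rule (keep edges proposed by a strict majority of structures). The learning algorithms themselves are left out; each structure is simply a set of directed edges.

```python
import random
from collections import Counter


def _marginals(rows):
    """Relative frequency of each (column index, value) pair in rows."""
    counts = Counter((i, v) for row in rows for i, v in enumerate(row))
    return {key: c / len(rows) for key, c in counts.items()}


def adaptive_sample(rows, start=100, growth=2, tol=0.02, seed=0):
    """Grow a random sample until the marginal frequencies stabilise.

    Assumed stopping rule (not taken from the paper): stop when no
    marginal frequency moves by more than `tol` after the sample grows.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    size = min(start, len(shuffled))
    prev = _marginals(shuffled[:size])
    while size < len(shuffled):
        size = min(size * growth, len(shuffled))
        cur = _marginals(shuffled[:size])
        keys = set(prev) | set(cur)
        change = max(
            (abs(cur.get(k, 0.0) - prev.get(k, 0.0)) for k in keys),
            default=0.0,
        )
        if change < tol:
            break
        prev = cur
    return shuffled[:size]


def merge_structures(structures, min_votes=None):
    """Keep the directed edges proposed by at least `min_votes` structures.

    Assumed ensemble rule (not taken from the paper): strict majority vote.
    """
    if min_votes is None:
        min_votes = len(structures) // 2 + 1
    votes = Counter(edge for s in structures for edge in set(s))
    return {edge for edge, n in votes.items() if n >= min_votes}


if __name__ == "__main__":
    data = [(i % 2, i % 3) for i in range(10_000)]
    print(len(adaptive_sample(data)))
    learned = [
        {("A", "B"), ("B", "C")},
        {("A", "B")},
        {("A", "B"), ("C", "B")},
    ]
    print(merge_structures(learned))
```

Edge voting alone does not guarantee an acyclic result, so a fuller version would need a cycle check before treating the merged edge set as a BN structure.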

Acknowledgments

This work was supported by the Natural Science Foundation of Jiangsu Province, China (Grant No. BK20141420 and Grant No. BK20140857).

Author information

Correspondence to Yan Tang.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Tang, Y., Zhang, Q., Liu, H., Wang, W. (2017). Adaptive Bayesian Network Structure Learning from Big Datasets. In: Bao, Z., Trajcevski, G., Chang, L., Hua, W. (eds) Database Systems for Advanced Applications. DASFAA 2017. Lecture Notes in Computer Science, vol 10179. Springer, Cham. https://doi.org/10.1007/978-3-319-55705-2_12

  • DOI: https://doi.org/10.1007/978-3-319-55705-2_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-55704-5

  • Online ISBN: 978-3-319-55705-2

  • eBook Packages: Computer Science, Computer Science (R0)
