
A stochastic variance-reduced coordinate descent algorithm for learning sparse Bayesian network from discrete high-dimensional data

  • Original Article
  • Published:
International Journal of Machine Learning and Cybernetics

Abstract

This paper addresses the problem of learning a sparse-structured Bayesian network from high-dimensional discrete data. Compared with continuous Bayesian networks, learning a discrete Bayesian network is challenging because of its large parameter space. Although many approaches have been developed for learning continuous Bayesian networks, few have been proposed for discrete ones. In this paper, we formulate Bayesian network learning as an optimization problem and propose a score function that guarantees the learnt structure to be a sparse directed acyclic graph. We then develop a block-wise stochastic coordinate descent algorithm to optimize the score function. Specifically, we use a variance-reduction method in the optimization algorithm so that it remains efficient on high-dimensional data. The proposed approach is evaluated on synthetic data generated from well-known benchmark networks, measuring the quality, scalability, and robustness of the constructed networks. The results show that our algorithm outperforms several well-known competing methods.
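The abstract describes the core optimization loop: stochastic coordinate descent combined with SVRG-style variance reduction, where a periodically recomputed full gradient anchors the noisy stochastic updates. The paper's actual score function is not reproduced here; as a minimal sketch of the general technique, the snippet below applies the same idea to an L1-penalized least-squares objective, with blocks of size one. All function names and hyperparameters are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def svr_coordinate_descent(X, y, lam=0.1, n_epochs=20, lr=0.01, seed=0):
    """Illustrative SVRG-style stochastic variance-reduced coordinate
    descent on min_beta (1/2n)||X beta - y||^2 + lam * ||beta||_1."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_epochs):
        # Snapshot: full gradient at an anchor point (variance-reduction term).
        anchor = beta.copy()
        full_grad = X.T @ (X @ anchor - y) / n
        for _ in range(n):
            i = rng.integers(n)   # random sample
            j = rng.integers(p)   # random coordinate (a "block" of size one)
            r_new = X[i] @ beta - y[i]
            r_old = X[i] @ anchor - y[i]
            # Variance-reduced stochastic partial derivative for coordinate j:
            # stochastic grad at beta, minus stochastic grad at the anchor,
            # plus the full anchor gradient (unbiased, shrinking variance).
            g_j = X[i, j] * r_new - X[i, j] * r_old + full_grad[j]
            beta[j] -= lr * g_j
            # Soft-threshold to enforce sparsity (L1 proximal step).
            beta[j] = np.sign(beta[j]) * max(abs(beta[j]) - lr * lam, 0.0)
    return beta
```

The design point the abstract relies on is that the correction term keeps the per-step gradient estimate unbiased while its variance shrinks as the iterate approaches the anchor, so a constant step size can be used even in high dimensions.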

Fig. 1
Fig. 2
Fig. 3


Notes

  1. https://github.com/nshajoon/SVRCD-Algorithm.git.



Funding

No funds, grants, or other support was received.

Author information


Corresponding author

Correspondence to Nazanin Shajoonnezhad.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare.

Data availability

The data generation method is available online https://github.com/nshajoon/SVRCD-Algorithm.git.

Code availability

The code is available online at https://github.com/nshajoon/SVRCD-Algorithm.git.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Shajoonnezhad, N., Nikanjam, A. A stochastic variance-reduced coordinate descent algorithm for learning sparse Bayesian network from discrete high-dimensional data. Int. J. Mach. Learn. & Cyber. 14, 947–958 (2023). https://doi.org/10.1007/s13042-022-01674-9

