Abstract
This paper addresses the problem of learning a sparse-structured Bayesian network from high-dimensional discrete data. Compared with continuous Bayesian networks, learning a discrete Bayesian network is challenging because of its much larger parameter space, and while many approaches have been developed for the continuous case, few have been proposed for the discrete one. We formulate structure learning as an optimization problem and propose a score function that guarantees the learned structure to be a sparse directed acyclic graph. To optimize the score function, we implement a block-wise stochastic coordinate descent algorithm that incorporates a variance-reduction method, allowing it to work efficiently on high-dimensional data. The proposed approach is applied to synthetic data generated from well-known benchmark networks, and the quality, scalability, and robustness of the constructed networks are measured. The results show that our algorithm outperforms several well-known competing methods.
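To illustrate the kind of optimization the abstract describes, below is a minimal sketch of a variance-reduced (SVRG-style) stochastic coordinate descent update with a soft-thresholding step for sparsity. It is applied here to a generic L1-penalized least-squares loss, not the paper's actual Bayesian-network score function; all names, the step size, and the loss are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def svrg_coordinate_descent(X, y, lam=0.01, step=0.05, epochs=30, seed=0):
    """SVRG-style stochastic coordinate descent for L1-penalized least squares.

    Minimizes (1/2n)||y - Xw||^2 + lam*||w||_1. Each inner step updates one
    randomly chosen coordinate using a stochastic gradient corrected by a
    full-gradient snapshot (the variance-reduction step), followed by
    soft-thresholding (which produces sparse solutions).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(epochs):
        # Snapshot: full gradient at the current iterate (variance anchor).
        w_snap = w.copy()
        full_grad = X.T @ (X @ w_snap - y) / n
        for _ in range(n):
            i = rng.integers(n)   # random sample
            j = rng.integers(p)   # random coordinate (block of size 1)
            # Variance-reduced gradient estimate for coordinate j:
            # per-sample gradient minus its snapshot value plus the full gradient.
            g_i = X[i, j] * (X[i] @ w - y[i])
            g_snap = X[i, j] * (X[i] @ w_snap - y[i])
            g = g_i - g_snap + full_grad[j]
            # Proximal (soft-threshold) step enforces sparsity.
            z = w[j] - step * g
            w[j] = np.sign(z) * max(abs(z) - step * lam, 0.0)
    return w
```

The correction term `g_i - g_snap + full_grad[j]` keeps the gradient estimate unbiased while shrinking its variance as the iterate approaches the snapshot, which is what lets a constant step size converge where plain SGD would need a decaying one.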
Funding
No funds, grants, or other support was received.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare.
Data availability
The data generation method is available online at https://github.com/nshajoon/SVRCD-Algorithm.git.
Code availability
The code is available online.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Shajoonnezhad, N., Nikanjam, A. A stochastic variance-reduced coordinate descent algorithm for learning sparse Bayesian network from discrete high-dimensional data. Int. J. Mach. Learn. & Cyber. 14, 947–958 (2023). https://doi.org/10.1007/s13042-022-01674-9