Abstract
This paper addresses the problem of learning a sparse-structured Bayesian network from high-dimensional discrete data. Compared with continuous Bayesian networks, learning a discrete Bayesian network is challenging because of its much larger parameter space, and while many approaches have been developed for the continuous case, few have been proposed for the discrete one. We formulate structure learning as an optimization problem and propose a score function that guarantees the learned structure to be a sparse directed acyclic graph. To optimize the score function, we implement a block-wise stochastic coordinate descent algorithm that incorporates a variance-reduction method, allowing it to work efficiently on high-dimensional data. The proposed approach is applied to synthetic data generated from well-known benchmark networks, and the quality, scalability, and robustness of the constructed networks are measured. The results show that our algorithm outperforms several well-known competing methods.
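To illustrate the kind of optimization the abstract describes, below is a minimal sketch of a variance-reduced (SVRG-style) stochastic coordinate descent update with a soft-thresholding step for sparsity. It is applied here to a generic L1-penalized least-squares loss, not the paper's actual Bayesian-network score function; all names, the step size, and the loss are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def svrg_coordinate_descent(X, y, lam=0.01, step=0.05, epochs=30, seed=0):
    """SVRG-style stochastic coordinate descent for L1-penalized least squares.

    Minimizes (1/2n)||y - Xw||^2 + lam*||w||_1. Each inner step updates one
    randomly chosen coordinate using a stochastic gradient corrected by a
    full-gradient snapshot (the variance-reduction step), followed by
    soft-thresholding (which produces sparse solutions).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(epochs):
        # Snapshot: full gradient at the current iterate (variance anchor).
        w_snap = w.copy()
        full_grad = X.T @ (X @ w_snap - y) / n
        for _ in range(n):
            i = rng.integers(n)   # random sample
            j = rng.integers(p)   # random coordinate (block of size 1)
            # Variance-reduced gradient estimate for coordinate j:
            # per-sample gradient minus its snapshot value plus the full gradient.
            g_i = X[i, j] * (X[i] @ w - y[i])
            g_snap = X[i, j] * (X[i] @ w_snap - y[i])
            g = g_i - g_snap + full_grad[j]
            # Proximal (soft-threshold) step enforces sparsity.
            z = w[j] - step * g
            w[j] = np.sign(z) * max(abs(z) - step * lam, 0.0)
    return w
```

The correction term `g_i - g_snap + full_grad[j]` keeps the gradient estimate unbiased while shrinking its variance as the iterate approaches the snapshot, which is what lets a constant step size converge where plain SGD would need a decaying one.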
Funding
No funds, grants, or other support was received.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare.
Data availability
The data generation method is available online at https://github.com/nshajoon/SVRCD-Algorithm.git.
Code availability
The code is available online.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Shajoonnezhad, N., Nikanjam, A. A stochastic variance-reduced coordinate descent algorithm for learning sparse Bayesian network from discrete high-dimensional data. Int. J. Mach. Learn. & Cyber. 14, 947–958 (2023). https://doi.org/10.1007/s13042-022-01674-9