Parallel Computing for Bayesian Networks

Nagarajan, Radhakrishnan; Scutari, Marco; Lèbre, Sophie

doi:10.1007/978-1-4614-6446-4_5

Part of the book series: Use R! (USE R, volume 48)

Abstract

Most problems in Bayesian network theory have a computational complexity that, in the worst case, scales exponentially with the number of variables; even for sparse networks it is still polynomial. Although newer algorithms are designed to improve scalability, it remains infeasible to analyze data containing more than a few hundred variables. Parallel computing provides a way to address this problem by making better use of modern hardware.

In this chapter we will provide a brief overview of the history and the fundamental concepts of parallel computing, and we will examine their applications to Bayesian network learning and inference using the bnlearn package.
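
The growth described in the abstract can be made concrete with a small illustration that is not taken from the chapter itself: counting how many directed acyclic graphs exist on n labeled nodes, which is the space a structure learning algorithm would in principle have to search. The sketch below uses Robinson's recurrence; the function name `count_dags` is a hypothetical helper introduced here, and Python is used only for illustration (the chapter itself works in R with bnlearn).

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n: int) -> int:
    """Number of labeled directed acyclic graphs on n nodes.

    Illustrative helper (not from the chapter), based on Robinson's
    recurrence:
        a(0) = 1
        a(n) = sum_{k=1..n} (-1)^(k+1) * C(n, k) * 2^(k*(n-k)) * a(n-k)
    """
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


if __name__ == "__main__":
    # Print the size of the search space for small networks.
    for n in range(1, 11):
        print(n, count_dags(n))
```

Already at ten nodes the count is on the order of 10^18, which gives some intuition for why exhaustive approaches do not scale and why heuristics and parallel execution are of interest.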

Notes

1. Since version 2.14, the R base distribution includes a revised copy of snow in the parallel package.

Author information

Authors and Affiliations

Division of Biomedical Informatics, Department of Biostatistics, University of Kentucky, Lexington, Kentucky, USA
Radhakrishnan Nagarajan
Genetics Institute, University College London, London, UK
Marco Scutari
ICube, Université de Strasbourg, Strasbourg, France
Sophie Lèbre

About this chapter

Cite this chapter

Nagarajan, R., Scutari, M., Lèbre, S. (2013). Parallel Computing for Bayesian Networks. In: Bayesian Networks in R. Use R!, vol 48. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-6446-4_5

DOI: https://doi.org/10.1007/978-1-4614-6446-4_5
Published: 19 March 2013
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-6445-7
Online ISBN: 978-1-4614-6446-4
eBook Packages: Mathematics and Statistics (R0)