An Efficient and Scalable Algorithm for Local Bayesian Network Structure Discovery

de Morais, Sérgio Rodrigues; Aussem, Alex

doi:10.1007/978-3-642-15939-8_11


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6323)

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases


Abstract

We present an efficient and scalable constraint-based (CB) algorithm, called Hybrid Parents and Children (HPC), to learn the parents and children of a target variable in a Bayesian network. Finding those variables is an important first step in many applications, including Bayesian network structure learning, dimensionality reduction and feature selection. The algorithm combines ideas from incremental and divide-and-conquer methods in a principled and effective way, while still being sound in the sample limit. Extensive empirical experiments are provided on public synthetic and real-world data sets of various sample sizes. The most noteworthy feature of HPC is its ability to handle large neighborhoods, contrary to current CB algorithm proposals. The number of calls to the statistical test, and hence the run-time, is empirically on the order of O(n^1.09), where n is the number of variables, on the five benchmarks that we considered, and O(n^1.21) on a real drug design data set characterized by 138,351 features.
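An empirical order such as O(n^1.09) is typically read off as the slope of a straight-line fit in log-log space. The sketch below illustrates that general technique only; it is not taken from the paper, and the function name `fit_power_law_exponent`, the exponent 1.5 and the synthetic counts are assumptions chosen for illustration, not the authors' measurements.

```python
import math


def fit_power_law_exponent(ns, counts):
    """Return the least-squares slope of log(counts) against log(ns).

    If counts grow roughly like c * n**k, the slope estimates k.
    """
    xs = [math.log(n) for n in ns]
    ys = [math.log(c) for c in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Ordinary least squares: slope = cov(x, y) / var(x)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic illustration only: counts generated from an exact power law
# with an arbitrary exponent of 1.5 (hypothetical, not data from the paper).
ns = [10, 100, 1000, 10000]
counts = [5.0 * n ** 1.5 for n in ns]
exponent = fit_power_law_exponent(ns, counts)
print(f"estimated exponent: {exponent:.3f}")
```

With real measurements, `ns` would hold the numbers of variables and `counts` the observed numbers of statistical test calls; noise in the counts makes the fitted slope an estimate rather than an exact value.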

Author information

Authors and Affiliations

University of Lyon, F-69000, Lyon, University of Lyon 1, LIESP Laboratory, 69622, Villeurbanne, France
Sérgio Rodrigues de Morais & Alex Aussem


Editor information

Editors and Affiliations

Departamento de Matemáticas, Estadística y Computación, Universidad de Cantabria, Avenida de los Castros, s/n, 39071, Santander, Spain
José Luis Balcázar
Yahoo! Research Barcelona, Avinguda Diagonal 177, 08018, Barcelona, Spain
Francesco Bonchi
Yahoo! Research Barcelona, Avinguda Diagonal 177, 08018, Barcelona, Spain
Aristides Gionis
TAO, CNRS-INRIA-LRI, Université Paris-Sud, 91405, Orsay, France
Michèle Sebag


About this paper

Cite this paper

de Morais, S.R., Aussem, A. (2010). An Efficient and Scalable Algorithm for Local Bayesian Network Structure Discovery. In: Balcázar, J.L., Bonchi, F., Gionis, A., Sebag, M. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2010. Lecture Notes in Computer Science, vol 6323. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15939-8_11


DOI: https://doi.org/10.1007/978-3-642-15939-8_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15938-1
Online ISBN: 978-3-642-15939-8
eBook Packages: Computer Science, Computer Science (R0)
