Mining Bayesian Network Structure for Large Sets of Variables

Kłopotek, Mieczysław A.

doi:10.1007/3-540-48050-1_14

Conference paper
First Online: 01 January 2002

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2366)

Abstract

A well-known problem with Bayesian networks (BN) is the practical limit on the number of variables for which a network can be learned in reasonable time. Even the complexity of the simplest tree-like BN learning algorithms is prohibitive for large sets of variables. This paper presents a novel algorithm that overcomes this limitation for the tree-like class of Bayesian networks. The new algorithm's space consumption grows linearly with the number of variables n, while its execution time is proportional to n ln(n), outperforming any known algorithm. This opens new perspectives for constructing Bayesian networks from data containing tens of thousands of variables or more, e.g. in automatic text categorization.
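
To make the scaling concern concrete, here is a minimal sketch of the classic Chow-Liu dependence-tree procedure, the kind of simple tree-like baseline whose cost the abstract calls prohibitive. It is not the algorithm proposed in the paper, which the abstract does not describe. The function names `mutual_information` and `chow_liu_edges` are hypothetical, and the use of Kruskal's algorithm for the spanning tree is an assumption made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete columns."""
    m = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    # Sum of p(x, y) * log(p(x, y) / (p(x) * p(y))) over observed pairs.
    return sum(
        (c / m) * math.log(c * m / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def chow_liu_edges(columns):
    """Return (tree_edges, pair_evaluations) for the baseline procedure.

    columns: list of equally long sequences, one per variable (assumed input
    layout, chosen for this sketch).
    """
    n = len(columns)
    # Quadratic step: one mutual-information estimate per unordered pair.
    weighted = [
        (mutual_information(columns[i], columns[j]), i, j)
        for i, j in combinations(range(n), 2)
    ]
    weighted.sort(reverse=True)

    # Kruskal's algorithm with a small union-find to build the
    # maximum-weight spanning tree over the variables.
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    edges = []
    for _, i, j in weighted:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            edges.append((i, j))
    return edges, len(weighted)
```

The baseline evaluates n(n-1)/2 variable pairs and, in this sketch, holds all of them in memory at once. For n = 10,000 variables that is 49,995,000 mutual-information estimates, which illustrates why growth proportional to n ln(n) in time and linear in space matters at that scale.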

Author information

Authors and Affiliations

Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland
Mieczysław A. Kłopotek
Institute of Computer Science, University of Podlasie, Siedlce, Poland
Mieczysław A. Kłopotek

Editor information

Editors and Affiliations

UFR d’Informatique, Université Claude Bernard Lyon I, 8, boulevard Niels Bohr, 69622, Villeurbanne Cedex, France
Mohand-Saïd Hacid
Dept. of Computer Science College of IT, University of North Carolina, Charlotte, NC, 28223, USA
Zbigniew W. Raś
Bât. L. Equipe de Recherche en Ingénierie des Connaissances, Université Lumière Lyon 2, 5, avenue Pierre Mendes-France, 69676, Bron Cedex, France
Djamel A. Zighed
LRI, Université Paris Sud, Bât. 490, 91405, Orsay Cedex, France
Yves Kodratoff

About this paper

Cite this paper

Kłopotek, M.A. (2002). Mining Bayesian Network Structure for Large Sets of Variables. In: Hacid, MS., Raś, Z.W., Zighed, D.A., Kodratoff, Y. (eds) Foundations of Intelligent Systems. ISMIS 2002. Lecture Notes in Computer Science, vol 2366. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48050-1_14

DOI: https://doi.org/10.1007/3-540-48050-1_14
Published: 21 June 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43785-7
Online ISBN: 978-3-540-48050-1
eBook Packages: Springer Book Archive
