
Mining Bayesian Network Structure for Large Sets of Variables

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2366)

Abstract

A well-known problem with Bayesian networks (BN) is the practical limit on the number of variables for which a network can be learned in reasonable time. Even the simplest algorithms for learning tree-like BNs have a complexity that is prohibitive for large sets of variables. This paper presents a novel algorithm that overcomes this limitation for the tree-like class of Bayesian networks. Its space consumption grows linearly with the number of variables n, while its execution time is proportional to n ln(n), outperforming any known algorithm. This opens new perspectives for constructing Bayesian networks from data containing tens of thousands of variables or more, e.g. in automatic text categorization.
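For context, the classical baseline for tree-like networks is the Chow-Liu procedure: score every pair of variables by empirical mutual information, then keep a maximum-weight spanning tree. The sketch below illustrates that baseline only; it is not the algorithm introduced in this paper, whose details are not given in the abstract. The function names and the row-of-tuples data layout are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(data, i, j):
    """Empirical mutual information (in nats) between columns i and j."""
    n = len(data)
    count_i = Counter(row[i] for row in data)
    count_j = Counter(row[j] for row in data)
    count_ij = Counter((row[i], row[j]) for row in data)
    mi = 0.0
    for (a, b), c in count_ij.items():
        # p(a,b) * log(p(a,b) / (p(a) * p(b))), written with raw counts
        mi += (c / n) * math.log(c * n / (count_i[a] * count_j[b]))
    return mi


def chow_liu_tree(data):
    """Return the edges of a maximum-weight spanning tree over the columns.

    Every one of the n*(n-1)/2 variable pairs is scored, which is the
    pairwise cost that becomes prohibitive when n is very large.
    """
    n_vars = len(data[0])
    scored = sorted(
        ((mutual_information(data, i, j), i, j)
         for i, j in combinations(range(n_vars), 2)),
        reverse=True,
    )
    parent = list(range(n_vars))  # union-find forest for Kruskal's algorithm

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    edges = []
    for _, i, j in scored:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # the edge joins two components, so no cycle
            parent[root_i] = root_j
            edges.append((i, j))
    return edges
```

Calling `chow_liu_tree(rows)` on a list of equal-length tuples of discrete values returns n - 1 undirected edges; picking any root and directing edges away from it yields a tree-like network. With tens of thousands of variables the number of scored pairs runs into the hundreds of millions, which is the kind of cost the abstract refers to.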




Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kłopotek, M.A. (2002). Mining Bayesian Network Structure for Large Sets of Variables. In: Hacid, MS., Raś, Z.W., Zighed, D.A., Kodratoff, Y. (eds) Foundations of Intelligent Systems. ISMIS 2002. Lecture Notes in Computer Science, vol 2366. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48050-1_14


  • DOI: https://doi.org/10.1007/3-540-48050-1_14


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43785-7

  • Online ISBN: 978-3-540-48050-1

  • eBook Packages: Springer Book Archive
