Abstract
A well-known problem with Bayesian networks (BN) is the practical limit on the number of variables for which a network can be learned in reasonable time. Even the simplest tree-like BN learning algorithms have a complexity that is prohibitive for large sets of variables. This paper presents a novel algorithm that overcomes this limitation for the tree-like class of Bayesian networks. The new algorithm's space consumption grows linearly with the number of variables n, while its execution time is proportional to n ln(n), outperforming any known algorithm. This opens new perspectives for constructing Bayesian networks from data containing tens of thousands of variables or more, e.g. in automatic text categorization.
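For context, the cost of the simplest tree-like learners can be seen in the classical dependence-tree construction of Chow and Liu, which scores every pair of variables before building a maximum-weight spanning tree. The sketch below is a minimal illustration of that quadratic baseline and is not the algorithm proposed in this paper; the function names `mutual_information` and `chow_liu_tree` are illustrative assumptions, and the data are assumed to be discrete columns of equal length.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete columns."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    # p(x,y) * log(p(x,y) / (p(x) p(y))), written with raw counts.
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def chow_liu_tree(columns):
    """Return the edges of a maximum-weight spanning tree over the variables.

    `columns` is a list of equally long sequences, one per variable.
    All n*(n-1)/2 pairs are scored, which is the quadratic bottleneck.
    """
    n = len(columns)
    # Score every pair of variables, then sort from strongest to weakest.
    scored = sorted(
        ((mutual_information(columns[i], columns[j]), i, j)
         for i, j in combinations(range(n), 2)),
        reverse=True,
    )
    parent = list(range(n))  # union-find forest for Kruskal's algorithm

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    edges = []
    for _, i, j in scored:
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two components, so keep it
            parent[ri] = rj
            edges.append((i, j))
    return edges
```

With n variables this baseline evaluates n(n-1)/2 pairwise scores, so at tens of thousands of variables the pair scoring alone runs into hundreds of millions of evaluations, each requiring a pass over the data.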
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kłopotek, M.A. (2002). Mining Bayesian Network Structure for Large Sets of Variables. In: Hacid, MS., Raś, Z.W., Zighed, D.A., Kodratoff, Y. (eds) Foundations of Intelligent Systems. ISMIS 2002. Lecture Notes in Computer Science, vol 2366. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48050-1_14
DOI: https://doi.org/10.1007/3-540-48050-1_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43785-7
Online ISBN: 978-3-540-48050-1
eBook Packages: Springer Book Archive