An Improved Approach for the Discovery of Causal Models via MML

Dai, Honghua; Li, Gang

doi:10.1007/3-540-47887-6_30

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2336)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

Discovering a precise causal structure that accurately reflects the given data is one of the most essential tasks in data mining and machine learning. One successful approach to causal discovery is the information-theoretic approach based on the Minimum Message Length (MML) principle [19]. This paper presents an improved version of the MML discovery algorithm together with further experimental results. We introduce a new encoding scheme for measuring the cost of describing the causal structure. Stirling's approximation is also applied to reduce the computational cost, so the algorithm runs more efficiently. The experimental results show that (1) the current version can discover everything the previous system discovered; (2) it can discover more complicated causal models with a large number of variables; and (3) it runs more efficiently than the previous version in terms of time complexity.
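
The abstract does not say where in the message-length calculation the Stirling approximation is applied. As a general illustration only, and not the authors' implementation, the sketch below shows how Stirling's formula replaces a log-factorial term with a closed-form expression. The function names are hypothetical and chosen for this example.

```python
import math


def log_factorial_exact(n: int) -> float:
    """Natural log of n!, computed via the log-gamma function."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return math.lgamma(n + 1)


def log_factorial_stirling(n: int) -> float:
    """Stirling's approximation: ln n! ~ n ln n - n + 0.5 ln(2 pi n).

    Illustrative only; the paper's actual use of the approximation
    is not described in the abstract.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return 0.0  # ln 0! = 0; the formula itself is undefined at n = 0
    return n * math.log(n) - n + 0.5 * math.log(2.0 * math.pi * n)


if __name__ == "__main__":
    # Compare the approximation with the exact value for a few sizes.
    for n in (5, 50, 500):
        exact = log_factorial_exact(n)
        approx = log_factorial_stirling(n)
        print(f"n={n}: exact={exact:.6f} approx={approx:.6f} diff={exact - approx:.6f}")
```

The gap between the two values is roughly 1/(12n), so the approximation tightens as the count grows, which is the regime where avoiding an explicit factorial is most useful.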

References

[1] H. L. Chin and G. F. Cooper. Stochastic simulation of Bayesian belief networks. In Proc. of 3rd Workshop on Uncertainty in AI, Seattle, 1987.
[2] G. F. Cooper and E. Herskovits. A Bayesian method for constructing Bayesian belief networks from databases. In Proc. of 7th Conference on Uncertainty in AI. Morgan Kaufmann, 1991.
[3] Honghua Dai, Kevin Korb, Chris Wallace, and Xindong Wu. A study of causal discovery with small samples and weak links. In Proceedings of the 15th International Joint Conference On Artificial Intelligence IJCAI’97, pages 1304–1309. Morgan Kaufmann Publishers, Inc., 1997.
[4] Clark Glymour, Richard Scheines, Peter Spirtes, and Kevin Kelly. Discovering Causal Structure: Artificial Intelligence, Philosophy of Science, and Statistical Modeling. Academic Press, San Diego, 1987.
[5] David Heckerman, Dan Geiger, and David M. Chickering. Learning Bayesian networks: The combination of knowledge and statistical data. Machine Learning, 20(3):197–243, 1995.
[6] Wai Lam and Fahiem Bacchus. Learning Bayesian belief networks: An approach based on the MDL principle. Computational Intelligence, 10:269–292, 1994.
[7] John C. Loehlin. Latent Variable Models: An Introduction to Factor, Path and Structural Analysis. Lawrence Erlbaum Associates, Hillsdale, New Jersey, second edition, 1992.
[8] Richard Neapolitan. Probabilistic Reasoning in Expert Systems. Wiley, New York, 1990.
[9] J.J. Oliver and R.A. Baxter. MML and Bayesianism: Similarities and differences. Tech Report 206, Computer Science, Monash University, 1994.
[10] Judea Pearl. Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann, San Mateo, California, 1988.
[11] R.W. Robinson. Counting unlabelled acyclic digraphs. In C.H.C. Little, editor, Lecture Notes in Mathematics: Combinatorial Mathematics V, pages 28–43. Springer-Verlag, 1977.
[12] Peter Spirtes, Clark Glymour, and Richard Scheines. Causality from probability. In J.E. Tiles, G.T. McKee, and G.C. Dean, editors, Evolving Knowledge in Natural Science and Artificial Intelligence, London, 1990. Pitman.
[13] Peter Spirtes, Clark Glymour, and Richard Scheines. Causation, Prediction, and Search. Springer-Verlag, New York, Berlin, Heidelberg, 1993.
[14] Peter Spirtes, Clark Glymour, and Richard Scheines. Causation, Prediction, and Search. MIT Press, New York, 2000.
[15] Peter Spirtes, Clark Glymour, Richard Scheines, and C. Meek. TETRAD II: tools for causal modeling. Lawrence Erlbaum, Hillsdale, New Jersey, 1994.
[16] Chris Wallace and David Boulton. An information measure for classification. Computer Journal, 11:185–194, 1968.
[17] Chris Wallace and P.R. Freeman. Estimation and inference by compact coding. Journal of the Royal Statistical Society, B, 49:240–252, 1987.
[18] Chris Wallace and Michael Georgeff. A general selection criterion for inductive inference. ECAI 84, Advances in Artificial Intelligence, pages 1–18, 1984.
[19] Chris Wallace, Kevin Korb, and Honghua Dai. Causal discovery via MML. In Proceedings of the 13th International Conference on Machine Learning (ICML’96), pages 516–524, 1996.

Author information

Authors and Affiliations

School of Computing and Mathematics, Deakin University, Melbourne Campus, Burwood, Vic, 3125, Australia
Honghua Dai & Gang Li

Editor information

Editors and Affiliations

EE Department, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei, Taiwan, ROC
Ming-Syan Chen
IBM Thomas J. Watson Research Center, 30 Sawmill River Road, Hawthorne, NY, 10532, USA
Philip S. Yu
School of Computing, National University of Singapore, Lower Kent Ridge Road, Singapore, 119260
Bing Liu

About this paper

Cite this paper

Dai, H., Li, G. (2002). An Improved Approach for the Discovery of Causal Models via MML. In: Chen, MS., Yu, P.S., Liu, B. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2002. Lecture Notes in Computer Science, vol 2336. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47887-6_30

DOI: https://doi.org/10.1007/3-540-47887-6_30
Published: 29 April 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43704-8
Online ISBN: 978-3-540-47887-4
eBook Packages: Springer Book Archive
