Abstract
Occam’s razor directs us to adopt the simplest hypothesis consistent with the evidence. Learning theory provides a precise definition of the inductive simplicity of a hypothesis for a given learning problem. This definition specifies a learning method that implements an inductive version of Occam’s razor. As a case study, we apply Occam’s inductive razor to causal learning. We consider two causal learning problems: learning a causal graph structure that represents global causal connections among a set of domain variables, and learning context-sensitive causal relationships that hold not globally, but only relative to a context. For causal graph learning, Occam’s inductive razor directs us to adopt the model that explains the observed correlations with a minimum number of direct causal connections. For expanding a causal graph structure to include context-sensitive relationships, Occam’s inductive razor directs us to adopt the expansion that explains the observed correlations with a minimum number of free parameters. This is equivalent to explaining the correlations with a minimum number of probabilistic logical rules. The paper provides a gentle introduction to the learning-theoretic definition of inductive simplicity and the application of Occam’s razor to causal learning.
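As a rough illustration of the minimum-edge and minimum-parameter ideas in the abstract, the following sketch counts the free parameters of a discrete Bayesian network using the standard formula (each node contributes one parameter fewer than its cardinality for every joint configuration of its parents). The variable names and the two candidate structures are hypothetical examples, not taken from the paper.

```python
# Counting free parameters of a discrete Bayesian network: a structure
# with fewer direct causal connections typically needs fewer parameters,
# so the minimum-edge and minimum-parameter razors often agree.

from math import prod

def num_free_params(parents, card):
    """Free parameters of a discrete Bayesian network.

    parents: dict mapping each node to the list of its parents.
    card: dict mapping each node to its number of possible values.
    Each node contributes (card[v] - 1) parameters per joint
    configuration of its parents (prod over an empty list is 1).
    """
    return sum(
        (card[v] - 1) * prod(card[p] for p in ps)
        for v, ps in parents.items()
    )

card = {"X": 2, "Y": 2, "Z": 2}  # three binary variables (hypothetical)

# Chain X -> Y -> Z: two direct connections.
chain = {"X": [], "Y": ["X"], "Z": ["Y"]}
# Complete DAG X -> Y, X -> Z, Y -> Z: three direct connections.
complete = {"X": [], "Y": ["X"], "Z": ["X", "Y"]}

print(num_free_params(chain, card))     # 1 + 2 + 2 = 5
print(num_free_params(complete, card))  # 1 + 2 + 4 = 7
```

If both structures explain the observed correlations, the razor described above prefers the chain: fewer direct connections and fewer free parameters. Context-sensitive independencies would reduce the count further by collapsing parent configurations that share a distribution.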
Acknowledgements
This research was supported by an NSERC discovery grant to the author. Preliminary results were presented at the Center for Formal Epistemology at Carnegie Mellon University. The author is grateful to the audience at the Center for helpful comments.
Presented by Jacek Malinowski
Schulte, O. Causal Learning with Occam’s Razor. Stud Logica 107, 991–1023 (2019). https://doi.org/10.1007/s11225-018-9829-1