Evaluation of Rules for Coping with Insufficient Data in Constraint-Based Search Algorithms

de Jongh, Martijn; Druzdzel, Marek J.

doi:10.1007/978-3-319-11433-0_13

Martijn de Jongh & Marek J. Druzdzel

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8754)

Included in the following conference series:

European Workshop on Probabilistic Graphical Models

Abstract

A fundamental step in the PC causal discovery algorithm is testing for (conditional) independence. When the number of data records is very small, a classical statistical independence test is typically unable to reject the (null) hypothesis of independence. In this paper, we compare two conflicting pieces of advice from the literature for the case of too few data records: (1) assume dependence and (2) assume independence. Our results show that assuming independence is the safer strategy for minimizing the structural distance between the causal structure that generated the data and the discovered structure. We also propose a simple improvement to the PC algorithm that we call blacklisting. We demonstrate that blacklisting can save orders of magnitude in computation by avoiding unnecessary independence tests.
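
The two rules compared in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `ci_decision`, the `run_test` callback, and the sufficiency criterion (a hypothetical minimum of 5 records per contingency-table cell) are all assumptions made for illustration, since the abstract does not state how "too few data records" is determined.

```python
from typing import Callable, Sequence


def ci_decision(
    n_records: int,
    cardinalities: Sequence[int],
    run_test: Callable[[], bool],
    assume_independent_when_scarce: bool = True,
    min_records_per_cell: int = 5,  # hypothetical threshold, not from the paper
) -> bool:
    """Return True if the tested variables are to be treated as independent.

    cardinalities lists the number of states of every variable involved in
    the test (the two tested variables plus the conditioning set).
    run_test performs the actual statistical test and returns True when
    the independence hypothesis is not rejected.
    """
    # Number of cells in the contingency table over all involved variables.
    cells = 1
    for states in cardinalities:
        cells *= states

    # Too few records for a reliable test: apply the chosen fallback rule
    # instead of running the test at all.
    if n_records < min_records_per_cell * cells:
        return assume_independent_when_scarce

    return run_test()
```

With 20 records and three variables of 2, 2, and 3 states (12 cells), the sketch skips the test and returns whatever the selected rule dictates; with 1000 records it defers to `run_test`. Passing `assume_independent_when_scarce=False` switches from rule (2) to rule (1).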

Author information

Authors and Affiliations

Decision System Laboratory, School of Information Sciences and Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA 15260, USA
Martijn de Jongh & Marek J. Druzdzel
Faculty of Computer Science, Białystok University of Technology, Wiejska 45A, 15-351, Białystok, Poland
Marek J. Druzdzel

Editor information

Editors and Affiliations

Utrecht University, Faculty of Science, Department of Information and Computing Sciences, Princetonplein 5, 3584 CC Utrecht, The Netherlands
Linda C. van der Gaag
Utrecht University, Faculty of Science, Department of Information and Computing Sciences, Princetonplein 5, 3584 CC Utrecht, The Netherlands
Ad J. Feelders

Cite this paper

de Jongh, M., Druzdzel, M.J. (2014). Evaluation of Rules for Coping with Insufficient Data in Constraint-Based Search Algorithms. In: van der Gaag, L.C., Feelders, A.J. (eds) Probabilistic Graphical Models. PGM 2014. Lecture Notes in Computer Science, vol 8754. Springer, Cham. https://doi.org/10.1007/978-3-319-11433-0_13

DOI: https://doi.org/10.1007/978-3-319-11433-0_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11432-3
Online ISBN: 978-3-319-11433-0
eBook Packages: Computer Science (R0)