Evaluation of Rules for Coping with Insufficient Data in Constraint-Based Search Algorithms

  • Conference paper
Probabilistic Graphical Models (PGM 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8754)

Abstract

A fundamental step in the PC causal discovery algorithm is testing for (conditional) independence. When the number of data records is very small, a classical statistical independence test is typically unable to reject the (null) independence hypothesis. In this paper, we compare two conflicting pieces of advice from the literature for the case of too few data records: (1) assume dependence and (2) assume independence. Our results show that assuming independence is the safer strategy for minimizing the structural distance between the causal structure that generated the data and the discovered structure. We also propose a simple improvement to the PC algorithm that we call blacklisting. We demonstrate that blacklisting can save orders of magnitude in computation by avoiding unnecessary independence tests.
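
The two rules compared in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `fallback` parameter, and the "at least five records per contingency-table cell" threshold are all assumptions made for illustration, and the actual statistical test is passed in as a callable rather than implemented here.

```python
from math import prod
from typing import Callable, Sequence


def has_enough_records(n_records: int,
                       cardinalities: Sequence[int],
                       min_per_cell: int = 5) -> bool:
    """Heuristic sufficiency check (an assumption, not taken from the paper).

    cardinalities lists the number of states of every variable involved in
    the test (X, Y, and each conditioning variable), so their product is the
    number of cells in the contingency table.
    """
    n_cells = prod(cardinalities)
    return n_records >= min_per_cell * n_cells


def decide_independence(n_records: int,
                        cardinalities: Sequence[int],
                        run_test: Callable[[], bool],
                        fallback: str = "independent") -> bool:
    """Return True if X and Y are treated as (conditionally) independent.

    run_test is a hypothetical callable that performs the real statistical
    test and returns True when independence is not rejected. It is called
    only when the data are judged sufficient; otherwise the chosen fallback
    rule decides the outcome.
    """
    if fallback not in ("independent", "dependent"):
        raise ValueError("fallback must be 'independent' or 'dependent'")
    if not has_enough_records(n_records, cardinalities):
        # Too few records: apply rule (2) or rule (1) from the abstract.
        return fallback == "independent"
    return run_test()
```

With three binary variables (8 cells) and 10 records, the sketch skips the test and returns whatever the `fallback` rule dictates; with 100 records it defers to `run_test`. Switching `fallback` between the two values reproduces the two strategies the paper evaluates.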

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

de Jongh, M., Druzdzel, M.J. (2014). Evaluation of Rules for Coping with Insufficient Data in Constraint-Based Search Algorithms. In: van der Gaag, L.C., Feelders, A.J. (eds) Probabilistic Graphical Models. PGM 2014. Lecture Notes in Computer Science, vol 8754. Springer, Cham. https://doi.org/10.1007/978-3-319-11433-0_13

  • DOI: https://doi.org/10.1007/978-3-319-11433-0_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11432-3

  • Online ISBN: 978-3-319-11433-0

  • eBook Packages: Computer Science, Computer Science (R0)
