Discovery of Causal Models that Contain Latent Variables Through Bayesian Scoring of Independence Constraints

  • Fattaneh Jabbari
  • Joseph Ramsey
  • Peter Spirtes
  • Gregory Cooper
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10535)


Discovering causal structure from observational data in the presence of latent variables remains an active research area. Constraint-based causal discovery algorithms are relatively efficient at discovering such causal models from data using independence tests. Typically, however, they derive and output only one such model. In contrast, Bayesian methods can generate and probabilistically score multiple models, outputting the most probable one; however, they are often computationally infeasible to apply when modeling latent variables. We introduce a hybrid method that derives a Bayesian probability that the set of independence tests associated with a given causal model is jointly correct. Using this constraint-based scoring method, we are able to score multiple causal models, which possibly contain latent variables, and output the most probable one. The structure-discovery performance of the proposed method is compared to that of an existing constraint-based method (RFCI) using data generated from several previously published Bayesian networks. The structural Hamming distances of the output models improved when using the proposed method rather than RFCI, especially for small sample sizes.
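The scoring idea described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each candidate model comes with a list of posterior probabilities, one per independence constraint the model implies, and that these constraints can be treated as mutually independent so that the joint probability is a simple product. Both the independence simplification and the names `log_joint_score`, `most_probable_model`, `model_a`, and `model_b` are hypothetical, introduced only for illustration, and the probabilities are made-up numbers.

```python
import math


def log_joint_score(constraint_probs):
    """Log-probability that all constraints hold jointly.

    Assumption (not from the paper): constraints are treated as mutually
    independent, so the joint probability is the product of the
    per-constraint probabilities. Logs are summed to avoid underflow.
    """
    if any(p <= 0.0 for p in constraint_probs):
        return float("-inf")  # an impossible constraint rules the model out
    return sum(math.log(p) for p in constraint_probs)


def most_probable_model(candidates):
    """Return (name, log_score) of the highest-scoring candidate model."""
    scored = {name: log_joint_score(probs) for name, probs in candidates.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]


# Hypothetical per-constraint posterior probabilities for two candidate models.
candidates = {
    "model_a": [0.9, 0.8, 0.95],
    "model_b": [0.6, 0.99, 0.7],
}

best, score = most_probable_model(candidates)
print(best, math.exp(score))
```

Working in log space keeps the comparison stable when a model implies many constraints, since a product of many probabilities can underflow.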


Keywords: Observational data; Latent (hidden) variable; Constraint-based and Bayesian causal discovery; Posterior probability



Research reported in this publication was supported by grant U54HG008540 awarded by the National Human Genome Research Institute through funds provided by the trans-NIH Big Data to Knowledge initiative. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Fattaneh Jabbari (1), email author
  • Joseph Ramsey (2)
  • Peter Spirtes (2)
  • Gregory Cooper (1)
  1. Intelligent Systems Program, University of Pittsburgh, Pittsburgh, USA
  2. Department of Philosophy, Carnegie Mellon University, Pittsburgh, USA
