A Community-Driven Graph Partitioning Method for Constraint-Based Causal Discovery

  • Mandar S. Chaudhary
  • Stephen Ranshous
  • Nagiza F. Samatova
Conference paper
Part of the Studies in Computational Intelligence book series (SCI, volume 689)

Abstract

Constraint-based (CB) methods are widely used for discovering causal relationships in observational data. The PC-stable algorithm is a prominent example. A key step of PC-stable is to find d-separators and perform conditional independence (CI) tests that eliminate spurious causal relationships. Although these pairwise CI tests are necessary for identifying causal relationships, the rate at which true causal relationships are erroneously removed increases with the number of tests performed. Efficiently searching for the true d-separator set is therefore critical to the accuracy of the causal graph. To this end, we propose a novel recursive algorithm for constructing causal graphs, based on a two-phase divide-and-conquer strategy. In phase one, we recursively partition the undirected graph using community detection and then construct a partial skeleton from each partition. Phase two merges the subgraph skeletons bottom-up, ultimately yielding the full causal graph. Simulations on several real-world data sets show that our approach finds the d-separators effectively, leading to a significant improvement in the quality of the causal graphs.
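To make the role of a CI test concrete, the following is a minimal sketch of one common choice for Gaussian data, the Fisher z-test on partial correlations. It is not taken from the paper: the function names `partial_corr` and `fisher_z_pvalue`, the correlation values, and the sample size are all assumptions chosen for illustration. The example uses the population correlation matrix of a chain X → Z → Y, in which Z d-separates X and Y.

```python
import math
from statistics import NormalDist


def partial_corr(corr, i, j, cond):
    """Partial correlation of variables i and j given the indices in cond.

    Uses the standard recursive formula on a correlation matrix
    (hypothetical helper, written for this illustration).
    """
    if not cond:
        return corr[i][j]
    k, rest = cond[0], cond[1:]
    r_ij = partial_corr(corr, i, j, rest)
    r_ik = partial_corr(corr, i, k, rest)
    r_jk = partial_corr(corr, j, k, rest)
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


def fisher_z_pvalue(r, n, k):
    """Two-sided p-value for H0: partial correlation is zero.

    r: (partial) correlation, n: sample size, k: size of the conditioning set.
    """
    r = max(min(r, 1 - 1e-12), -1 + 1e-12)  # keep the log finite
    z = 0.5 * math.log((1 + r) / (1 - r))
    stat = math.sqrt(n - k - 3) * abs(z)
    return 2 * (1 - NormalDist().cdf(stat))


# Assumed correlations for a chain X -> Z -> Y (indices: X=0, Y=1, Z=2).
corr = [
    [1.00, 0.64, 0.80],
    [0.64, 1.00, 0.80],
    [0.80, 0.80, 1.00],
]
n = 100  # assumed sample size

p_marginal = fisher_z_pvalue(partial_corr(corr, 0, 1, ()), n, 0)
p_given_z = fisher_z_pvalue(partial_corr(corr, 0, 1, (2,)), n, 1)
print(p_marginal, p_given_z)
```

In this setup the marginal test rejects independence of X and Y, while conditioning on Z does not, so a CB method would remove the X–Y edge and record {Z} as the separating set. With real data the correlations are estimated, and each additional test is another chance to remove a true edge by mistake, which is the motivation the abstract gives for narrowing the search for d-separators.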

Notes

Acknowledgements

This material is based upon work supported by the NSF grant 1029711. In addition, this material is based on work supported in part by the DOE SDAVI Institute and the U.S. National Science Foundation (Expeditions in Computing program).

Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  • Mandar S. Chaudhary (1)
  • Stephen Ranshous (1)
  • Nagiza F. Samatova (1)

  1. North Carolina State University, Raleigh, USA
