Causal order identification to address confounding: binary variables

  • Original Paper
  • Published in: Behaviormetrika

Abstract

This paper considers an extension of the linear non-Gaussian acyclic model (LiNGAM), which determines the causal order among variables from a dataset when the variables are expressed by a set of linear equations that include noise. In particular, we assume that the variables are binary. The existing LiNGAM assumes that no confounding is present, which is restrictive in practice. Based on the concept of independent component analysis (ICA), this paper proposes an extended framework in which the mutual information among the noises is minimized. Another significant contribution is to reduce the realization to a shortest path problem, in which the distance between each pair of nodes expresses an associated mutual information value, and the path with the minimum sum (KL divergence) is sought. Although p! mutual information values would otherwise have to be compared for p variables, this paper dramatically reduces the computation when no confounding is present. The proposed algorithm finds the globally optimal solution, whereas the existing approaches seek the order greedily and locally, based on hypothesis testing. For mutual information estimation, we use the best estimator in the sense of Bayes/MDL, which correctly detects independence. Experiments using artificial and actual data show that the proposed version of LiNGAM achieves significantly better performance, particularly when confounding is present.
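
The shortest-path reduction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the cost of appending variable j depends only on the set of variables already placed, so that every causal order corresponds to a path through the lattice of subsets. Under that assumption, a dynamic program over the lattice finds a minimum-cost order from roughly p·2^p edge costs instead of scoring all p! orders. The names `shortest_order`, `brute_force`, and `toy_cost` are hypothetical, and `toy_cost` is an arbitrary stand-in for an estimated mutual information value.

```python
import itertools
import math


def shortest_order(p, edge_cost):
    """Minimum-cost variable order via a shortest path on the subset lattice.

    Nodes are bitmasks of already placed variables; the edge from `mask`
    to `mask | (1 << j)` costs edge_cost(placed_set, j).
    """
    full = (1 << p) - 1
    dist = [math.inf] * (full + 1)
    prev = [None] * (full + 1)
    dist[0] = 0.0
    # Increasing mask order is a topological order, since mask | bit > mask.
    for mask in range(full + 1):
        placed = frozenset(i for i in range(p) if (mask >> i) & 1)
        for j in range(p):
            if (mask >> j) & 1:
                continue  # variable j is already placed
            nxt = mask | (1 << j)
            d = dist[mask] + edge_cost(placed, j)
            if d < dist[nxt]:
                dist[nxt] = d
                prev[nxt] = (mask, j)
    # Walk the predecessor links back from the full set to recover the order.
    order = []
    mask = full
    while mask:
        mask, j = prev[mask]
        order.append(j)
    order.reverse()
    return order, dist[full]


def brute_force(p, edge_cost):
    """Reference answer: score every one of the p! orders directly."""
    best = math.inf
    for perm in itertools.permutations(range(p)):
        total = sum(edge_cost(frozenset(perm[:k]), perm[k]) for k in range(p))
        best = min(best, total)
    return best


def toy_cost(placed, j):
    """Hypothetical non-negative cost; NOT a real mutual information estimate."""
    return ((3 * j + 7 * sum(placed) + len(placed) * j) % 5) / 10.0


if __name__ == "__main__":
    order, total = shortest_order(4, toy_cost)
    print("order:", order, "total cost:", total)
    print("brute-force minimum:", brute_force(4, toy_cost))
```

In practice, `edge_cost` would be replaced by a mutual information estimate computed from data, such as the Bayes/MDL estimator mentioned in the abstract. The sketch only requires that the cost depend on the placed set and the next variable, not on the order within the set.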

Notes

  1. We write \(X\perp \!\!\!\perp Y\) when \(X\) and \(Y\) are independent.

Author information

Corresponding author

Correspondence to Joe Suzuki.

Additional information

Communicated by Shohei Shimizu.

About this article

Cite this article

Suzuki, J., Inaoka, Y. Causal order identification to address confounding: binary variables. Behaviormetrika 49, 5–21 (2022). https://doi.org/10.1007/s41237-021-00149-5

  • DOI: https://doi.org/10.1007/s41237-021-00149-5
