Counterfactual Inference with Hidden Confounders Using Implicit Generative Models

  • Fujin Zhu
  • Adi Lin
  • Guangquan Zhang
  • Jie Lu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11320)


In observational studies, a key problem is to estimate the causal effect of a treatment on some outcome. Counterfactual inference addresses this problem by directly learning the treatment exposure surfaces. One of the biggest challenges in counterfactual inference is the existence of unobserved confounders: latent variables that affect both the treatment and the outcome. Building on recent advances in latent variable modelling and efficient Bayesian inference, deep latent variable models such as variational auto-encoders (VAEs) have been used to ease this challenge by learning the latent confounders from the observations. However, for the sake of tractability, existing methods assume the posterior over the latent variables to be Gaussian with a diagonal covariance matrix. This specification is restrictive, and may even contradict the underlying truth, limiting the quality of the resulting generative models and of the causal effect estimates. In this paper, we propose to circumvent this limitation by using implicit generative models with black-box inference models. To perform inference for the implicit generative model, whose likelihood is intractable, we adopt recent implicit variational inference based on adversarial training to obtain a close approximation to the true posterior. Experiments on simulated and real data show the proposed method matches the state of the art.
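To illustrate why the diagonal-Gaussian posterior criticised above is restrictive, the sketch below (not from the paper; `implicit_sampler` is a hypothetical stand-in for a learned black-box inference network) contrasts samples from a diagonal Gaussian, which are always axis-aligned, with samples from an implicit posterior formed by pushing noise through a nonlinear map, which can be correlated and non-Gaussian even though no tractable density is available:

```python
import numpy as np

rng = np.random.default_rng(0)

def diagonal_gaussian_sampler(mu, sigma, n):
    """Samples from q(z|x) = N(mu(x), diag(sigma(x)^2)): axis-aligned
    and unimodal, whatever mu(x) and sigma(x) are."""
    return mu + sigma * rng.standard_normal((n, mu.shape[0]))

def implicit_sampler(n):
    """Stand-in for a black-box inference network z = f(x, eps):
    Gaussian noise pushed through a nonlinear map. Only samples are
    available (no tractable density), but they can exhibit the
    correlations a diagonal posterior rules out."""
    eps = rng.standard_normal((n, 2))
    z1 = eps[:, 0]
    z2 = 0.8 * z1 + 0.3 * eps[:, 1] + np.sin(2.0 * z1)
    return np.stack([z1, z2], axis=1)

diag_z = diagonal_gaussian_sampler(np.zeros(2), np.ones(2), 20000)
impl_z = implicit_sampler(20000)

# Cross-dimension correlation: near zero for the diagonal Gaussian
# by construction, strongly positive for the implicit posterior.
corr_diag = np.corrcoef(diag_z.T)[0, 1]
corr_impl = np.corrcoef(impl_z.T)[0, 1]
```

Because the implicit posterior admits only sampling, its density ratio against the prior must be estimated, which is what the adversarially trained discriminator in implicit variational inference provides.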


Keywords: Causal effect · Counterfactual inference · Latent variable models



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Centre for Artificial Intelligence, School of Software, FEIT, University of Technology Sydney, Sydney, Australia
  2. School of Management and Economics, Beijing Institute of Technology, Beijing, China
