Abstract
The individual treatment effect (ITE) represents the expected improvement in outcome from applying a particular action to a particular target, and it plays an important role in decision making across various domains. However, estimating the ITE is difficult because intervention studies that collect information on the applied treatments (i.e., actions) and their outcomes are often expensive in both time and money. In this study, we consider a semi-supervised ITE estimation problem that exploits more easily available unlabeled instances to improve estimation performance when labeled data are scarce. We combine two ideas, matching from causal inference and label propagation from semi-supervised learning, to propose counterfactual propagation, the first semi-supervised ITE estimation method. Experiments on semi-real datasets demonstrate that the proposed method successfully mitigates the data scarcity problem in ITE estimation.
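The combination of matching-style similarity and label propagation described above can be illustrated with a minimal graph-based sketch: factual outcomes are spread over a covariate-similarity graph to impute each unit's unobserved counterfactual outcome, and the ITE is the difference of the two imputed potential-outcome surfaces. This is only a toy illustration in the spirit of classic label propagation (Zhu et al.), not the paper's actual counterfactual propagation algorithm; the RBF bandwidth, the synthetic data, and the `propagate` helper are all assumptions made for the example.

```python
import numpy as np

def propagate(W, y_obs, mask, alpha=0.9, iters=200):
    """Iterative label propagation over a row-normalized similarity graph.
    Observed entries (mask=True) are clamped to their known values."""
    P = W / W.sum(axis=1, keepdims=True)          # transition matrix
    f = np.where(mask, y_obs, 0.0)                # initialize with known labels
    for _ in range(iters):
        f = alpha * (P @ f) + (1 - alpha) * np.where(mask, y_obs, 0.0)
        f[mask] = y_obs[mask]                     # clamp observed outcomes
    return f

# Toy data: 6 units with one covariate, a binary treatment, and factual outcomes.
X = np.array([[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]])
t = np.array([1, 0, 1, 0, 1, 0])                  # treatment assignment
y = np.array([1.0, 0.2, 1.1, 0.4, 1.4, 0.5])      # observed (factual) outcomes

# RBF similarity graph over covariates (a simple stand-in for matching).
d2 = (X - X.T) ** 2
W = np.exp(-d2 / 0.1)
np.fill_diagonal(W, 0.0)

# Impute each potential-outcome surface separately, then take the difference.
y1 = propagate(W, y, t == 1)   # imputed outcome under treatment for all units
y0 = propagate(W, y, t == 0)   # imputed outcome under control for all units
ite = y1 - y0                  # estimated individual treatment effects
```

In this toy setting treated outcomes are uniformly higher than control outcomes within each covariate cluster, so the propagated estimates yield a positive ITE for every unit; the paper's method goes further by learning outcomes and similarities jointly rather than fixing a similarity graph in advance.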
Acknowledgments
This work was partially supported by JSPS KAKENHI Grant Number 20H04244.
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Harada, S., Kashima, H. (2021). Counterfactual Propagation for Semi-supervised Individual Treatment Effect Estimation. In: Hutter, F., Kersting, K., Lijffijt, J., Valera, I. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2020. Lecture Notes in Computer Science(), vol 12457. Springer, Cham. https://doi.org/10.1007/978-3-030-67658-2_31
Print ISBN: 978-3-030-67657-5
Online ISBN: 978-3-030-67658-2