Abstract
Estimating treatment effects from observational data is of great importance in many biomedical applications, and interpretability of the estimated effects is particularly valuable to biomedical researchers. In this paper, we first provide a theoretical analysis and derive an upper bound on the bias of average treatment effect (ATE) estimation under the strong ignorability assumption. By leveraging appealing properties of the weighted energy distance, we obtain an upper bound that is tighter than those previously reported in the literature. Motivated by this theoretical analysis, we propose a novel objective function for ATE estimation that uses the energy distance balancing score and hence does not require correct specification of a propensity score model. We also leverage recently developed neural additive models to improve the interpretability of deep learning models used for potential outcome prediction, and we further enhance the proposed model with an energy distance balancing score weighted regularization. The superiority of our proposed model over current state-of-the-art methods is demonstrated in semi-synthetic experiments using two benchmark datasets, IHDP and ACIC, and is further examined in a study of the effect of smoking on blood cadmium levels using NHANES data.
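The energy distance central to the abstract measures the discrepancy between the covariate distributions of treated and control groups; it is zero exactly when the two distributions coincide, which makes it a natural balancing criterion. As a minimal illustrative sketch (not the authors' implementation, which additionally uses a weighted variant inside a neural objective), the sample energy distance between two covariate samples can be computed as follows:

```python
import numpy as np

def energy_distance(X, Y):
    """Sample energy distance between covariate samples X (n, d) and Y (m, d).

    E(X, Y) = 2 * E||x - y|| - E||x - x'|| - E||y - y'||,
    where the expectations are replaced by sample averages over all pairs.
    The statistic is nonnegative and equals zero iff the empirical
    distributions of X and Y coincide.
    """
    # Pairwise Euclidean distances via broadcasting.
    d_xy = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)
    d_xx = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    d_yy = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)
    return 2.0 * d_xy.mean() - d_xx.mean() - d_yy.mean()
```

In a balancing context, one would minimize this quantity (or a unit-weighted version of it) between the treated and control covariate samples, so that outcome models are fit on comparable groups without estimating a propensity score.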
Acknowledgements
This research was partially supported by NIH grant RF1AG063481.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Chen, K., Yin, Q. & Long, Q. Covariate-Balancing-Aware Interpretable Deep Learning Models for Treatment Effect Estimation. Stat Biosci (2023). https://doi.org/10.1007/s12561-023-09394-6