Using entropy balancing to strengthen an observational cohort study design: lessons learned from an evaluation of a complex multi-state federal demonstration

Parish, William J.; Keyes, Vincent; Beadles, Christopher; Kandilov, Amy

doi:10.1007/s10742-017-0174-z

Published: 29 November 2017

Volume 18, pages 17–46, (2018)

Health Services and Outcomes Research Methodology

William J. Parish¹ (ORCID: orcid.org/0000-0001-5697-1250),
Vincent Keyes¹,
Christopher Beadles¹ &
Amy Kandilov¹

Abstract

We conducted an evaluation of a patient-centered medical home demonstration sponsored by the Centers for Medicare & Medicaid Services. We implemented a quasi-experimental pre-post design with a comparison group. Traditional propensity score weighting failed to achieve balance (exchangeability) between the two groups on several critical characteristics. In response, we incorporated a relatively new alternative known as entropy balancing. Our objective is to share lessons learned from using entropy balancing in a quasi-experimental study design. We document the advantages and challenges of using entropy balancing. We also describe a set of best practices, and we present a series of illustrative analyses that empirically demonstrate the performance of entropy balancing relative to traditional propensity score weighting. We compare alternative approaches based on: (i) covariate balance (e.g., standardized differences); (ii) overlap in conditional treatment probabilities; and (iii) the distribution of weights. Our comparison of overlap is based on a novel approach we developed that uses entropy balancing weights to calculate a pseudo-propensity score. In many situations, entropy balancing provides remarkably superior covariate balance compared to traditional propensity score weighting methods. Entropy balancing is also preferred because it does not require extensive iterative manual searching for an optimal propensity score specification. However, we demonstrate that there are some situations where entropy balancing “fails”. Specifically, there are instances where entropy balancing achieves adequate covariate balance only by using a distribution of weights that dramatically up-weights a small set of observations, giving them a disproportionately large and undesirable influence.
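To make the idea concrete, the following is a minimal sketch of entropy balancing in the spirit of Hainmueller (2012). It is not the implementation used in the evaluation. It assumes balance on covariate means only, uniform base weights, and a target that the comparison data can actually reach. The function names (`solve`, `entropy_balance`, `effective_sample_size`) and the toy data are hypothetical. The sketch also reports two simple summaries of the weight distribution (Kish's effective sample size and the largest single weight share), since the failure mode described above is a small set of observations carrying most of the weight.

```python
import math

def solve(A, b):
    """Solve the small linear system A x = b by Gaussian elimination."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def entropy_balance(X_comp, target, max_iter=100, tol=1e-8):
    """Weights for comparison units whose weighted covariate means equal `target`.

    Minimizes the dual objective log(sum_i exp(lam . z_i)), z_i = x_i - target,
    with damped Newton steps; weights are proportional to exp(lam . z_i).
    """
    n, k = len(X_comp), len(target)
    Z = [[x[j] - target[j] for j in range(k)] for x in X_comp]

    def weights_and_objective(lam):
        s = [sum(l * z for l, z in zip(lam, row)) for row in Z]
        top = max(s)  # subtract the max for numerical stability
        e = [math.exp(v - top) for v in s]
        total = sum(e)
        return [v / total for v in e], top + math.log(total)

    lam = [0.0] * k
    w, obj = weights_and_objective(lam)
    for _ in range(max_iter):
        # Gradient of the dual = weighted mean of z = remaining imbalance.
        g = [sum(w[i] * Z[i][j] for i in range(n)) for j in range(k)]
        if max(abs(v) for v in g) < tol:
            break
        # Hessian of the dual = weighted covariance matrix of z.
        H = [[sum(w[i] * Z[i][a] * Z[i][b] for i in range(n)) - g[a] * g[b]
              for b in range(k)] for a in range(k)]
        step = solve(H, g)
        t = 1.0
        while True:  # halve the step until the objective no longer increases
            trial = [l - t * s for l, s in zip(lam, step)]
            w_new, obj_new = weights_and_objective(trial)
            if obj_new <= obj + 1e-12 or t < 1e-8:
                break
            t /= 2
        lam, w, obj = trial, w_new, obj_new
    return w

def effective_sample_size(w):
    """Kish's effective sample size: (sum w)^2 / sum w^2."""
    return sum(w) ** 2 / sum(v * v for v in w)

# Hypothetical toy data: (age, number of chronic conditions) per comparison unit.
X_comp = [(60, 0), (65, 1), (70, 1), (72, 2), (75, 3), (80, 2), (68, 0), (77, 4)]
target = [73.0, 2.0]  # assumed covariate means in the treatment group

w = entropy_balance(X_comp, target)
balanced_means = [sum(wi * x[j] for wi, x in zip(w, X_comp)) for j in range(2)]
ess = effective_sample_size(w)
max_share = max(w)  # share of total weight held by the heaviest unit

print("weighted means:", balanced_means)
print("effective sample size:", ess, "of", len(w))
print("largest weight share:", max_share)
```

An effective sample size far below the number of comparison units, or a very large `max_share`, would be a warning sign of the kind of weight concentration described in the abstract, even when the weighted means match the target exactly.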

Notes

Exchangeability refers to a technical condition where treatment and comparison groups are equivalent in terms of all observed and unobserved dimensions that are related to the receipt of treatment and the outcome of interest. Covariate balance refers to an assessment of the similarity of all observed dimensions across treatment and comparison observations. Thus, assessments of covariate balance help determine whether exchangeability is a reasonable assumption to make.
For an overview of alternative propensity score methods, we refer the reader to Austin (2011).
Brookhart et al. (2006) recommend including anything correlated with the outcome, and excluding anything that is only correlated with treatment assignment. However, beyond this there is little guidance.
In the MAPCP Demonstration group we could perfectly identify practices. However, in the comparison groups, we were only able to identify groups of providers within a tax identification number. Thus, it was difficult to compare practice size calculations across the comparison group and MAPCP Demonstration group.
R has some packages that can be used to do this (twang and optmatch).
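The covariate balance assessment mentioned in the first note is commonly summarized with a standardized difference. The sketch below is one possible version for a continuous covariate, assuming weights are applied only to the comparison group and that the denominator uses the unweighted pooled standard deviation, so it does not change as weights change. Other conventions exist, and the function names and data here are hypothetical.

```python
import math

def weighted_mean(x, w):
    """Weighted mean of x with non-negative weights w."""
    return sum(wi * xi for wi, xi in zip(w, x)) / sum(w)

def sample_variance(x):
    """Unweighted sample variance (n - 1 denominator)."""
    m = sum(x) / len(x)
    return sum((xi - m) ** 2 for xi in x) / (len(x) - 1)

def standardized_difference(x_treat, x_comp, w_comp=None):
    """(treated mean - weighted comparison mean) / unweighted pooled SD."""
    if w_comp is None:
        w_comp = [1.0] * len(x_comp)  # no weighting: plain comparison mean
    mean_t = sum(x_treat) / len(x_treat)
    mean_c = weighted_mean(x_comp, w_comp)
    pooled_sd = math.sqrt((sample_variance(x_treat) + sample_variance(x_comp)) / 2)
    return (mean_t - mean_c) / pooled_sd

# Hypothetical ages in a treatment group and a comparison group.
age_treat = [70, 74, 76, 72]
age_comp = [60, 65, 70, 75]
toy_weights = [0.0, 0.0, 0.4, 0.6]  # illustrative weights, not estimated

d_before = standardized_difference(age_treat, age_comp)
d_after = standardized_difference(age_treat, age_comp, toy_weights)
print("standardized difference before weighting:", d_before)
print("standardized difference after weighting:", d_after)
```

An absolute standardized difference below about 0.1 is a frequently used rule of thumb for adequate balance. The toy weights also show why balance alone is not sufficient: the difference shrinks to zero only because two of the four comparison units carry all of the weight.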

References

Austin, P.C.: The relative ability of different propensity score methods to balance measured covariates between treated and untreated subjects in observational studies. Med. Decis. Mak 29, 661–667 (2009a)
Austin, P.C.: Using the standardized difference to compare the prevalence of a binary variable between two groups in observational research. Commun. Stat.-Simul. Comput. 38, 1228–1234 (2009b)
Austin, P.C.: Balance diagnostics for comparing the distribution of baseline covariates between treatment groups in propensity-score matched samples. Stat. Med. 28, 3083–3107 (2009c)
Austin, P.C.: The performance of different propensity-score methods for estimating differences in proportions (risk differences or absolute reductions) in observational studies. Stat. Med. 29, 2137–2148 (2010)
Austin, P.C.: An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivar. Behav. Res. 46, 399–424 (2011)
Austin, P.C., Mamdani, M.M., Stukel, T.A., Anderson, G.M., Tu, J.V.: The use of the propensity score for estimating treatment effects: administrative versus clinical data. Stat. Med. 24, 1563–1578 (2005)
Austin, P.C., Mamdani, M.M.: A comparison of propensity score methods: a case-study estimating the effectiveness of post-AMI statin use. Stat. Med. 25, 2084–2106 (2006)
Austin, P.C., Stuart, E.A.: Moving towards best practice when using inverse probability of treatment weighting (IPTW) using the propensity score to estimate causal treatment effects in observational studies. Stat. Med. 34, 3661–3679 (2015)
Brookhart, M.A., Schneeweiss, S., Rothman, K.J., Glynn, R.J., Avorn, J., Sturmer, T.: Variable selection for propensity score models. Am. J. Epidemiol. 163, 1149–1156 (2006)
Diamond, A., Sekhon, J.S.: Genetic matching for estimating causal effects: a general multivariate matching method for achieving balance in observational studies. Rev. Econ. Stat. 95, 932–945 (2013)
Flury, B.K., Riedwyl, H.: Standard distance in univariate and multivariate analysis. Am. Stat. 40, 249–251 (1986)
Hainmueller, J.: Entropy balancing for causal effects: a multivariate reweighting method to produce balanced samples in observational studies. Polit. Anal. 20, 25–46 (2012)
Hainmueller, J., Xu, Y.: Ebalance: a Stata package for entropy balancing. J. Stat. Softw. 54, 1–18 (2013)
Hill, J.L.: Bayesian nonparametric modeling for causal inference. J. Comput. Gr. Stat. 20, 217–240 (2011)
Ho, D.E., Imai, K., King, G., Stuart, E.A.: Matching as nonparametric preprocessing for reducing model dependence in parametric causal inference. Polit Anal. 15, 199–236 (2007)
Imai, K., Ratkovic, M.: Covariate balancing propensity score. J. R. Stat. Soc. 76, 243–263 (2014)
Imbens, G.W.: Nonparametric estimation of average treatment effects under exogeneity: a review. Rev. Econ. Stat. 86, 4–29 (2004)
McCaffrey, D.F., Ridgeway, G., Morral, A.R.: Propensity score estimation with boosted regression for evaluating causal effects in observational studies. Psychol. Methods 9, 403–425 (2004)
Robins, J., Ritov, Y.: Towards a curse of dimensionality appropriate (CODA) asymptotic theory for semi-parametric models. Stat. Med. 16, 285–319 (1997)
Rosenbaum, P.R., Rubin, D.B.: The central role of the propensity score in observational studies for causal effects. Biometrika 70, 41–55 (1983)
Rosenbaum, P.R., Rubin, D.B.: Reducing bias in observational studies using subclassification on the propensity score. J. Am. Stat. Assoc. 79, 516–524 (1984)
Rubin, D.B.: Estimating the causal effects of treatments in randomized and non-randomized studies. J. Educ. Psychol. 66, 688–701 (1974)
Rubin, D.B.: Assignment to a treatment group on the basis of a covariate. J. Educ. Stat. 2, 1–26 (1977)
Rubin, D.B.: Using propensity scores to help design observational studies: application to the tobacco litigation. Health Serv. Outcomes Res. Method. 2, 169–188 (2001)
Sturmer, T., Rothman, K., Avorn, J., Glynn, R.: Treatment effects in the presence of unmeasured confounding: dealing with observations in the tails of the propensity score distribution–A simulation study. Am. J. Epidemiol. 172, 843–854 (2010)
StataCorp.: Stata Statistical Software: Release 14. College Station, TX: StataCorp LP (2015)
Zhao, Q., Percival, D.: Entropy balancing is doubly robust. J Causal Inference 5, 1–23 (2016)

Funding

This research was supported by the Centers for Medicare & Medicaid Services under contract number HHSM-500-2010-00021I. The contents of this publication are solely the responsibility of the authors and do not necessarily represent the official views of the U.S. Department of Health and Human Services or any of its agencies.

Author information

Authors and Affiliations

RTI International, 3040 E. Cornwallis Road, P.O. Box 12194, Research Triangle Park, NC, 27709, USA
William J. Parish, Vincent Keyes, Christopher Beadles & Amy Kandilov

Corresponding author

Correspondence to William J. Parish.

Ethics declarations

Conflicts of interest

None.

Appendix

See Figs. 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17 and 18.

About this article

Cite this article

Parish, W.J., Keyes, V., Beadles, C. et al. Using entropy balancing to strengthen an observational cohort study design: lessons learned from an evaluation of a complex multi-state federal demonstration. Health Serv Outcomes Res Method 18, 17–46 (2018). https://doi.org/10.1007/s10742-017-0174-z

Received: 11 May 2017
Revised: 04 November 2017
Accepted: 22 November 2017
Published: 29 November 2017
Issue Date: March 2018
DOI: https://doi.org/10.1007/s10742-017-0174-z
