Abstract
We evaluated a patient-centered medical home demonstration sponsored by the Centers for Medicare & Medicaid Services using a quasi-experimental pre-post design with a comparison group. Traditional propensity score weighting failed to achieve balance (exchangeability) between the two groups on several critical characteristics. In response, we incorporated a relatively new alternative known as entropy balancing. Our objective is to share lessons learned from using entropy balancing in a quasi-experimental study design. We document the advantages and challenges of using entropy balancing. We also describe a set of best practices and present a series of illustrative analyses that empirically demonstrate the performance of entropy balancing relative to traditional propensity score weighting. We compare the alternative approaches on three criteria: (i) covariate balance (e.g., standardized differences); (ii) overlap in conditional treatment probabilities; and (iii) the distribution of weights. Our comparison of overlap is based on a novel approach we developed that uses entropy balancing weights to calculate a pseudo-propensity score. In many situations, entropy balancing provides substantially better covariate balance than traditional propensity score weighting methods. Entropy balancing is also preferred because it does not require an extensive, iterative manual search for an optimal propensity score specification. However, we demonstrate that there are some situations where entropy balancing “fails”. Specifically, there are instances where entropy balancing achieves adequate covariate balance only by using a distribution of weights that dramatically up-weights a small set of observations, giving them a disproportionately large and undesirable influence.
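As a rough illustration of the ideas summarized above, the following Python sketch solves a first-moment entropy balancing problem on simulated data and then checks two of the diagnostics mentioned in the abstract: standardized differences and the distribution of weights. It is a minimal sketch under stated assumptions, not the implementation used in the evaluation: the data are simulated, all function names are hypothetical, only covariate means are balanced, base weights are uniform, and the pseudo-propensity score is not reproduced.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp


def entropy_balance(X_comp, target_means):
    """Weights (summing to 1) on comparison rows whose weighted covariate
    means match target_means. Assumes uniform base weights and balances
    first moments only."""
    Z = X_comp - target_means  # center comparison covariates on the target

    def dual(lam):
        # Dual objective log(sum_i exp(Z_i . lam)) and its gradient,
        # which equals the weighted mean of Z under the implied weights.
        s = Z @ lam
        lse = logsumexp(s)
        w = np.exp(s - lse)
        return lse, Z.T @ w

    res = minimize(dual, np.zeros(Z.shape[1]), jac=True, method="BFGS")
    s = Z @ res.x
    return np.exp(s - logsumexp(s))


def standardized_difference(x_t, x_c, w_c):
    """Treated mean minus weighted comparison mean, in pooled-SD units.
    w_c is assumed to sum to 1; the weighted variance is uncorrected."""
    m_t, m_c = x_t.mean(), np.sum(w_c * x_c)
    v_t = x_t.var(ddof=1)
    v_c = np.sum(w_c * (x_c - m_c) ** 2)
    return (m_t - m_c) / np.sqrt((v_t + v_c) / 2)


def effective_sample_size(w):
    """Kish effective sample size: small values flag a few dominant weights."""
    return w.sum() ** 2 / np.sum(w ** 2)


# Simulated example: the treated group is shifted by 0.5 SD on each covariate.
rng = np.random.default_rng(0)
X_t = rng.normal(0.5, 1.0, size=(200, 3))
X_c = rng.normal(0.0, 1.0, size=(1000, 3))

w = entropy_balance(X_c, X_t.mean(axis=0))
std_diffs = [standardized_difference(X_t[:, j], X_c[:, j], w) for j in range(3)]
ess = effective_sample_size(w)

print("standardized differences:", np.round(std_diffs, 4))
print("effective sample size:", round(ess, 1), "of", len(w))
print("largest single weight share:", round(w.max(), 4))
```

In a sketch like this, a very small effective sample size or a large maximum weight share would correspond to the situation the abstract describes as entropy balancing "failing": balance is reached, but only by concentrating weight on a handful of comparison observations.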
Notes
Exchangeability refers to a technical condition in which the treatment and comparison groups are equivalent on all observed and unobserved dimensions that are related to the receipt of treatment and the outcome of interest. Covariate balance refers to an assessment of the similarity of all observed dimensions across treatment and comparison observations. Thus, assessments of covariate balance indicate whether exchangeability is a reasonable assumption to make.
For an overview of alternative propensity score methods, we refer the reader to Austin (2011).
Brookhart et al. (2006) recommend including anything correlated with the outcome, and excluding anything that is only correlated with treatment assignment. However, beyond this there is little guidance.
In the MAPCP Demonstration group we could perfectly identify practices. However, in the comparison groups, we were only able to identify groups of providers within a tax identification number. Thus, it was difficult to compare practice size calculations across the comparison group and MAPCP Demonstration group.
Some R packages can be used to do this (twang and optmatch).
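The covariate-balance assessment described in the first note is commonly summarized with the standardized difference (see Austin 2009c). A widely used form for a continuous covariate, given here as a general convention and not necessarily the exact definition used in the evaluation, is:

```latex
d = \frac{\bar{x}_{T} - \bar{x}_{C}}{\sqrt{\left(s_{T}^{2} + s_{C}^{2}\right)/2}}
```

Here \bar{x}_{T} and \bar{x}_{C} are the covariate means in the treatment and comparison groups, and s_{T}^{2} and s_{C}^{2} are the corresponding variances (symbols introduced for illustration). When weights are applied, the comparison-group mean and variance are typically replaced by their weighted counterparts.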
References
Austin, P.C.: The relative ability of different propensity score methods to balance measured covariates between treated and untreated subjects in observational studies. Med. Decis. Mak. 29, 661–667 (2009a)
Austin, P.C.: Using the standardized difference to compare the prevalence of a binary variable between two groups in observational research. Commun. Stat.-Simul. Comput. 38, 1228–1234 (2009b)
Austin, P.C.: Balance diagnostics for comparing the distribution of baseline covariates between treatment groups in propensity-score matched samples. Stat. Med. 28, 3083–3107 (2009c)
Austin, P.C.: The performance of different propensity-score methods for estimating differences in proportions (risk differences or absolute reductions) in observational studies. Stat. Med. 29, 2137–2148 (2010)
Austin, P.C.: An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivar. Behav. Res. 46, 399–424 (2011)
Austin, P.C., Mamdani, M.M., Stukel, T.A., Anderson, G.M., Tu, J.V.: The use of the propensity score for estimating treatment effects: administrative versus clinical data. Stat. Med. 24, 1563–1578 (2005)
Austin, P.C., Mamdani, M.M.: A comparison of propensity score methods: a case-study estimating the effectiveness of post-AMI statin use. Stat. Med. 25, 2084–2106 (2006)
Austin, P.C., Stuart, E.A.: Moving towards best practice when using inverse probability of treatment weighting (IPTW) using the propensity score to estimate causal treatment effects in observational studies. Stat. Med. 34, 3661–3679 (2015)
Brookhart, M.A., Schneeweiss, S., Rothman, K.J., Glynn, R.J., Avorn, J., Sturmer, T.: Variable selection for propensity score models. Am. J. Epidemiol. 163, 1149–1156 (2006)
Diamond, A., Sekhon, J.S.: Genetic matching for estimating causal effects: a general multivariate matching method for achieving balance in observational studies. Rev. Econ. Stat. 95, 932–945 (2013)
Flury, B.K., Riedwyl, H.: Standard distance in univariate and multivariate analysis. Am. Stat. 40, 249–251 (1986)
Hainmueller, J.: Entropy balancing for causal effects: a multivariate reweighting method to produce balanced samples in observational studies. Polit. Anal. 20, 25–46 (2012)
Hainmueller, J., Xu, Y.: Ebalance: a Stata package for entropy balancing. J. Stat. Softw. 54, 1–18 (2013)
Hill, J.L.: Bayesian nonparametric modeling for causal inference. J. Comput. Gr. Stat. 20, 217–240 (2011)
Ho, D.E., Imai, K., King, G., Stuart, E.A.: Matching as nonparametric preprocessing for reducing model dependence in parametric causal inference. Polit. Anal. 15, 199–236 (2007)
Imai, K., Ratkovic, M.: Covariate balancing propensity score. J. R. Stat. Soc. Ser. B 76, 243–263 (2014)
Imbens, G.W.: Nonparametric estimation of average treatment effects under exogeneity: a review. Rev. Econ. Stat. 86, 4–29 (2004)
McCaffrey, D.F., Ridgeway, G., Morral, A.R.: Propensity score estimation with boosted regression for evaluating causal effects in observational studies. Psychol. Methods 9, 403–425 (2004)
Robins, J., Ritov, Y.: Towards a curse of dimensionality appropriate (CODA) asymptotic theory for semi-parametric models. Stat. Med. 16, 285–319 (1997)
Rosenbaum, P.R., Rubin, D.B.: The central role of the propensity score in observational studies for causal effects. Biometrika 70, 41–55 (1983)
Rosenbaum, P.R., Rubin, D.B.: Reducing bias in observational studies using subclassification on the propensity score. J. Am. Stat. Assoc. 79, 516–524 (1984)
Rubin, D.B.: Estimating the causal effects of treatments in randomized and non-randomized studies. J. Educ. Psychol. 66, 688–701 (1974)
Rubin, D.B.: Assignment to a treatment group on the basis of a covariate. J. Educ. Stat. 2, 1–26 (1977)
Rubin, D.B.: Using propensity scores to help design observational studies: application to the tobacco litigation. Health Serv. Outcomes Res. Method. 2, 169–188 (2001)
Sturmer, T., Rothman, K., Avorn, J., Glynn, R.: Treatment effects in the presence of unmeasured confounding: dealing with observations in the tails of the propensity score distribution–A simulation study. Am. J. Epidemiol. 172, 843–854 (2010)
StataCorp.: Stata Statistical Software: Release 14. College Station, TX: StataCorp LP (2015)
Zhao, Q., Percival, D.: Entropy balancing is doubly robust. J. Causal Inference 5, 1–23 (2016)
Funding
This research was supported by the Centers for Medicare & Medicaid Services under contract number HHSM-500-2010-00021I. The contents of this publication are solely the responsibility of the authors and do not necessarily represent the official views of the U.S. Department of Health and Human Services or any of its agencies.
Ethics declarations
Conflicts of interest
None.
About this article
Cite this article
Parish, W.J., Keyes, V., Beadles, C. et al. Using entropy balancing to strengthen an observational cohort study design: lessons learned from an evaluation of a complex multi-state federal demonstration. Health Serv Outcomes Res Method 18, 17–46 (2018). https://doi.org/10.1007/s10742-017-0174-z
DOI: https://doi.org/10.1007/s10742-017-0174-z