Propensity score weighting for a continuous exposure with multilevel data
Propensity score methods (e.g., matching, weighting, subclassification) provide a statistical approach for balancing dissimilar exposure groups on baseline covariates. These methods were developed in the context of data with no hierarchical structure or clustering. Yet in many applications the data have a clustered structure that is of substantive importance, such as when individuals are nested within healthcare providers or within schools. Recent work has extended propensity score methods to a multilevel setting, primarily focusing on binary exposures. In this paper, we focus on propensity score weighting for a continuous, rather than binary, exposure in a multilevel setting. Using simulations, we compare several specifications of the propensity score: a random effects model, a fixed effects model, and a single-level model. Additionally, our simulations compare the performance of marginal versus cluster-mean stabilized propensity score weights. In our results, regression specifications that accounted for the multilevel structure reduced bias, particularly when cluster-level confounders were omitted. Furthermore, cluster-mean weights outperformed marginal weights.
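As a rough illustration of the weights compared above, the following Python sketch computes stabilized weights for a continuous exposure under a single-level and a fixed effects propensity score model, with either a marginal or a cluster-mean numerator. It is a minimal sketch under stated assumptions, not the authors' implementation: the exposure is assumed to be conditionally normal, the propensity score model is fit by ordinary least squares, and the function names (`normal_pdf`, `stabilized_weights`), the simulated data, and all parameter values are hypothetical. The random effects specification is omitted because it requires a mixed-model fitting routine.

```python
import numpy as np


def normal_pdf(x, mean, sd):
    """Normal density of x given a mean (scalar or array) and a common sd."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def stabilized_weights(exposure, covariates, cluster,
                       ps_model="fixed", numerator="cluster_mean"):
    """Stabilized weights f(A) / f(A | X) for a continuous exposure A.

    Assumption: both densities are normal. The denominator mean comes from
    an OLS fit; the numerator mean is the overall or the cluster mean.
    """
    n = exposure.shape[0]
    _, idx = np.unique(cluster, return_inverse=True)
    n_clusters = idx.max() + 1

    if ps_model == "single":
        # Single-level model: intercept plus covariates, clustering ignored.
        design = np.column_stack([np.ones(n), covariates])
    elif ps_model == "fixed":
        # Fixed effects model: one indicator per cluster plus covariates.
        design = np.column_stack([np.eye(n_clusters)[idx], covariates])
    else:
        raise ValueError("ps_model must be 'single' or 'fixed'")

    coef, *_ = np.linalg.lstsq(design, exposure, rcond=None)
    fitted = design @ coef
    resid = exposure - fitted
    dof = n - np.linalg.matrix_rank(design)
    denom = normal_pdf(exposure, fitted, np.sqrt(resid @ resid / dof))

    if numerator == "marginal":
        num_mean = np.full(n, exposure.mean())
    elif numerator == "cluster_mean":
        cluster_means = np.bincount(idx, weights=exposure) / np.bincount(idx)
        num_mean = cluster_means[idx]
    else:
        raise ValueError("numerator must be 'marginal' or 'cluster_mean'")

    num_sd = np.sqrt(np.mean((exposure - num_mean) ** 2))
    return normal_pdf(exposure, num_mean, num_sd) / denom


# Hypothetical simulated data: 20 clusters of 50 units each, with an
# unobserved cluster-level confounder u that shifts the exposure.
rng = np.random.default_rng(0)
cluster = np.repeat(np.arange(20), 50)
u = rng.normal(size=20)[cluster]
x = rng.normal(size=cluster.size)
a = 0.5 * x + u + rng.normal(size=cluster.size)

w_fixed = stabilized_weights(a, x, cluster, "fixed", "cluster_mean")
w_single = stabilized_weights(a, x, cluster, "single", "marginal")
```

The resulting weights would then be supplied to a weighted outcome model. In practice, extreme weights are often inspected or trimmed before that step.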
Keywords: Propensity score; Continuous exposure; Multilevel data; Observational study
This work was conducted while Megan Schuler was a postdoctoral fellow and Wanghuan Chu was a doctoral student at the Pennsylvania State University. This work was funded by awards P50 DA010075, P50 DA039838, and T32 DA017629 from the National Institute on Drug Abuse; K01 ES025437 from the National Institutes of Health Big Data to Knowledge initiative; and IGERT award DGE-1144860 from the National Science Foundation. The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding agencies.
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.
This article does not contain any studies with human participants or animals performed by any of the authors.