Arkhangelsky, D., & Imbens, G. (2019). The role of the propensity score in fixed effect models. arXiv. Retrieved from arXiv:1807.02099. https://doi.org/10.3386/w24814
Arpino, B., & Cannas, M. (2016). Propensity score matching with clustered data. An application to the estimation of the impact of caesarean section on the Apgar score. Statistics in Medicine, 35(12), 2074–2091. https://doi.org/10.1002/sim.6880
Arpino, B., & Mealli, F. (2011). The specification of the propensity score in multilevel observational studies. Computational Statistics & Data Analysis, 55(4), 1770–1780. https://doi.org/10.1016/j.csda.2010.11.008
Athey, S., & Imbens, G. (2016). Recursive partitioning for heterogeneous causal effects. Proceedings of the National Academy of Sciences, 113(27), 7353–7360. https://doi.org/10.1073/pnas.1510489113
Austin, P. C. (2011). An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivariate Behavioral Research, 46, 399–424. https://doi.org/10.1080/00273171.2011.568786
Bang, H., & Robins, J. M. (2005). Doubly robust estimation in missing data and causal inference models. Biometrics, 61(4), 962–973. https://doi.org/10.1111/j.1541-0420.2005.00377.x
Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. https://doi.org/10.18637/jss.v067.i01
Carvalho, C., Feller, A., Murray, J., Woody, S., & Yeager, D. (2019). Assessing treatment effect variation in observational studies: Results from a data challenge. Observational Studies, 5, 21–35. https://doi.org/10.1353/obs.2019.0000
Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., & Robins, J. (2018). Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal, 21(1), C1–C68. https://doi.org/10.1111/ectj.12097
Ding, P., Feller, A., & Miratrix, L. (2019). Decomposing treatment effect variation. Journal of the American Statistical Association, 114(525), 304–317. https://doi.org/10.1080/01621459.2017.1407322
Dorie, V., & Hill, J. (2019). bartCause: Causal inference using Bayesian additive regression trees [Computer software manual]. Retrieved from https://github.com/vdorie/bartCause (R package version 1.0-0)
Dorie, V., Hill, J., Shalit, U., Scott, M., & Cervone, D. (2019). Automated versus do-it-yourself methods for causal inference: Lessons learned from a data analysis competition. Statistical Science, 34(1), 43–68. https://doi.org/10.1214/18-STS667
Evdokimov, K. (2010). Identification and estimation of a nonparametric panel data model with unobserved heterogeneity. Working paper, Princeton University.
Firebaugh, G., Warner, C., & Massoglia, M. (2013). Fixed effects, random effects, and hybrid models for causal analysis. In S. L. Morgan (Ed.), Handbook of causal analysis for social research (pp. 113–132). Springer. https://doi.org/10.1007/978-94-007-6094-3_7
Glynn, A. N., & Quinn, K. M. (2010). An introduction to the augmented inverse propensity weighted estimator. Political Analysis, 18(1), 36–56. https://doi.org/10.1093/pan/mpp036
Gruber, S., & van der Laan, M. J. (2012). tmle: An R package for targeted maximum likelihood estimation. Journal of Statistical Software, 51(13), 1–35. Retrieved from http://www.jstatsoft.org/v51/i13/. https://doi.org/10.18637/jss.v051.i13
Hansen, L. P. (1982). Large sample properties of generalized method of moments estimators. Econometrica, 50(4), 1029–1054. https://doi.org/10.2307/1912775
He, Z. (2018). Inverse conditional probability weighting with clustered data in causal inference. arXiv. Retrieved from arXiv:1808.01647
Henderson, D. J., Carroll, R. J., & Li, Q. (2008). Nonparametric estimation and testing of fixed effects panel data models. Journal of Econometrics, 144(1), 257–275. https://doi.org/10.1016/j.jeconom.2008.01.005
Hernán, M. A., & Robins, J. M. (2020). Causal inference: What if. Boca Raton: Chapman & Hall/CRC.
Hill, J. L. (2011). Bayesian nonparametric modeling for causal inference. Journal of Computational and Graphical Statistics, 20(1), 217–240. https://doi.org/10.1198/jcgs.2010.08162
Holloway, J. H. (2004). Closing the minority achievement gap in math. Educational Leadership, 61(5), 84.
Hong, G., & Hong, Y. (2009). Reading instruction time and homogeneous grouping in kindergarten: An application of marginal mean weighting through stratification. Educational Evaluation and Policy Analysis, 31(1), 54–81. https://doi.org/10.3102/0162373708328259
Hong, G., & Raudenbush, S. W. (2006). Evaluating kindergarten retention policy: A case study of causal inference for multilevel observational data. Journal of the American Statistical Association, 101, 901–910. https://doi.org/10.1198/016214506000000447
Hong, G., & Raudenbush, S. W. (2013). Heterogeneous agents, social interactions, and causal inference. In S. L. Morgan (Ed.), Handbook of causal analysis for social research (pp. 331–352). Springer. https://doi.org/10.1007/978-94-007-6094-3_16
Imai, K., & Kim, I. S. (2019). When should we use unit fixed effects regression models for causal inference with longitudinal data? American Journal of Political Science, 63(2), 467–490. https://doi.org/10.1111/ajps.12417
Kim, J.-S., & Frees, E. W. (2006). Omitted variables in multilevel models. Psychometrika, 71(4), 659. https://doi.org/10.1007/s11336-005-1283-0
Kim, J.-S., & Frees, E. W. (2007). Multilevel modeling with correlated effects. Psychometrika, 72(4), 505–533. https://doi.org/10.1007/s11336-007-9008-1
Künzel, S. R., Sekhon, J. S., Bickel, P. J., & Yu, B. (2019). Metalearners for estimating heterogeneous treatment effects using machine learning. Proceedings of the National Academy of Sciences, 116(10), 4156–4165. https://doi.org/10.1073/pnas.1804597116
LeDell, E., Gill, N., Aiello, S., Fu, A., Candel, A., Click, C., . . . Malohlava, M. (2020). h2o: R interface for the ’h2o’ scalable machine learning platform [Computer software manual]. Retrieved from https://github.com/h2oai/h2o-3 (R package version 3.30.1.1)
Lee, Y., Nguyen, T. Q., & Stuart, E. A. (2019). Partially pooled propensity score models for average treatment effect estimation with multilevel data. arXiv. Retrieved from arXiv:1910.05600
Li, F., Zaslavsky, A. M., & Landrum, M. B. (2013). Propensity score weighting with multilevel data. Statistics in Medicine, 32(19), 3373–3387. https://doi.org/10.1002/sim.5786
Li, Y., Lee, Y., Port, F. K., & Robinson, B. M. (2020). The impact of unmeasured within-and between-cluster confounding on the bias of effect estimators of a continuous exposure. Statistical Methods in Medical Research, 29(8), 2119–2139. https://doi.org/10.1177/0962280219883323
Lin, Z., Li, Q., & Sun, Y. (2014). A consistent nonparametric test of parametric regression functional form in fixed effects panel data models. Journal of Econometrics, 178, 167–179. https://doi.org/10.1016/j.jeconom.2013.08.014
McCaffrey, D. F., Ridgeway, G., & Morral, A. R. (2004). Propensity score estimation with boosted regression for evaluating causal effects in observational studies. Psychological Methods, 9(4), 403–425. https://doi.org/10.1037/1082-989X.9.4.403
Meyers, J. L., & Beretvas, S. N. (2006). The impact of inappropriate modeling of cross-classified data structures. Multivariate Behavioral Research, 41(4), 473–497. https://doi.org/10.1207/s15327906mbr4104_3
Mullen, K. M., & van Stokkum, I. H. M. (2012). nnls: The Lawson-Hanson algorithm for non-negative least squares (NNLS) [Computer software manual]. Retrieved from https://CRAN.R-project.org/package=nnls (R package version 1.4)
Neyman, J. S. (1923). On the application of probability theory to agricultural experiments: Essay on principles. Section 9 (with discussion). Statistical Science, 4, 465–480.
Noguera, P. A., & Wing, J. Y. (2008). Unfinished business: Closing the racial achievement gap in our schools. Wiley.
Polley, E. C., & van der Laan, M. J. (2010). Super learner in prediction. U.C. Berkeley Division of Biostatistics Working Paper Series. Paper 226. https://doi.org/10.1007/978-1-4419-9782-1_3
R Core Team. (2020). R: A language and environment for statistical computing [Computer software manual]. Vienna, Austria. Retrieved from https://www.R-project.org/
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (Vol. 1). Sage.
Rickles, J. H. (2013). Examining heterogeneity in the effect of taking algebra in eighth grade. The Journal of Educational Research, 106(4), 251–268. https://doi.org/10.1080/00220671.2012.692731
Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70, 41–55. https://doi.org/10.1093/biomet/70.1.41
Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66(5), 688–701. https://doi.org/10.1037/h0037350
Rubin, D. B. (1986). Comment: Which ifs have causal answers. Journal of the American Statistical Association, 81(396), 961–962. https://doi.org/10.2307/2289065
Rubin, D. B. (2001). Using propensity scores to help design observational studies: Application to the tobacco litigation. Health Services and Outcomes Research Methodology, 2(3–4), 169–188.
Schafer, J. L., & Kang, J. (2008). Average causal effects from nonrandomized studies: A practical guide and simulated example. Psychological Methods, 13(4), 279–313. https://doi.org/10.1037/a0014268
Semenova, V., & Chernozhukov, V. (2020). Estimation and inference about conditional average treatment effect and other structural functions. arXiv. Retrieved from arXiv:1702.06240
Shadish, W. R., Clark, M. H., & Steiner, P. M. (2008). Can nonrandomized experiments yield accurate answers? A randomized experiment comparing random and nonrandom assignments. Journal of the American Statistical Association, 103(484), 1334–1344. https://doi.org/10.1198/016214508000000733
Steiner, P. M., & Cook, D. (2013). Matching and propensity scores. In T. Little (Ed.), The Oxford handbook of quantitative methods (pp. 236–258). New York, NY: Oxford University Press. https://doi.org/10.1093/oxfordhb/9780199934874.013.0013
Steiner, P. M., Cook, T. D., Shadish, W. R., & Clark, M. H. (2010). The importance of covariate selection in controlling for selection bias in observational studies. Psychological Methods, 15(3), 250. https://doi.org/10.1037/a0018719
Su, X., Tsai, C.-L., Wang, H., Nickerson, D. M., & Li, B. (2009). Subgroup analysis via recursive partitioning. Journal of Machine Learning Research, 10(2), 141–158. https://doi.org/10.2139/ssrn.1341380
Suk, Y., Kang, H., & Kim, J.-S. (2020). Random forests approach for causal inference with clustered observational data. Multivariate Behavioral Research. https://doi.org/10.1080/00273171.2020.1808437
Sun, Y., Carroll, R. J., & Li, D. (2009). Semiparametric estimation of fixed-effects panel data varying coefficient models. In Q. Li & J. S. Racine (Eds.), Nonparametric econometric methods (pp. 101–130). Emerald Group Publishing Limited. https://doi.org/10.1108/S0731-9053(2009)0000025006
van Buuren, S., & Groothuis-Oudshoorn, K. (2011). mice: Multivariate imputation by chained equations in R. Journal of Statistical Software, 45(3), 1–67. https://doi.org/10.18637/jss.v045.i03
van der Laan, M. J., Polley, E. C., & Hubbard, A. E. (2007). Super learner. Statistical Applications in Genetics and Molecular Biology, 6(1). https://doi.org/10.2202/1544-6115.1309
van der Laan, M. J., & Rose, S. (2011). Targeted learning: Causal inference for observational and experimental data. Springer.
Wager, S., & Athey, S. (2018). Estimation and inference of heterogeneous treatment effects using random forests. Journal of the American Statistical Association, 113(523), 1228–1242. https://doi.org/10.1080/01621459.2017.1319839
Walston, J., & McCarroll, J. C. (2010). Eighth-grade algebra: Findings from the eighth-grade round of the Early Childhood Longitudinal Study, Kindergarten Class of 1998–99 (ECLS-K). Statistics in Brief. NCES 2010-016. National Center for Education Statistics.
Wenglinsky, H. (2004). Closing the racial achievement gap: The role of reforming instructional practices. Education Policy Analysis Archives, 12, 64. https://doi.org/10.14507/epaa.v12n64.2004
Westreich, D., Lessler, J., & Funk, M. J. (2010). Propensity score estimation: Neural networks, support vector machines, decision trees (CART), and meta-classifiers as alternatives to logistic regression. Journal of Clinical Epidemiology, 63(8), 826–833. https://doi.org/10.1016/j.jclinepi.2009.11.020
White, I. R., Royston, P., & Wood, A. M. (2011). Multiple imputation using chained equations: Issues and guidance for practice. Statistics in Medicine, 30(4), 377–399. https://doi.org/10.1002/sim.4067
Wooldridge, J. M. (2010). Econometric analysis of cross section and panel data. MIT Press.
Wooldridge, J. M. (2012). Introductory econometrics: A modern approach. South-Western Cengage Learning.
Yang, S. (2018). Propensity score weighting for causal inference with clustered data. Journal of Causal Inference, 6(2). https://doi.org/10.1515/jci-2017-0027
Zetterqvist, J., & Sjölander, A. (2015). Doubly robust estimation with the R package drgee. Epidemiologic Methods, 4(1), 69–86. https://doi.org/10.1515/em-2014-0021
Zetterqvist, J., Vansteelandt, S., Pawitan, Y., & Sjölander, A. (2016). Doubly robust methods for handling confounding by cluster. Biostatistics, 17(2), 264–276. https://doi.org/10.1093/biostatistics/kxv041