Bias reduction methods for propensity scores estimated from error-prone EHR-derived covariates


As the use of electronic health records (EHR) to estimate treatment effects has become widespread, concern about bias introduced by error in EHR-derived covariates has also grown. While methods exist to address measurement error in individual covariates, little prior research has investigated the implications of using propensity scores for confounder control when the propensity scores are constructed from a combination of accurate and error-prone covariates. We reviewed approaches to account for error in propensity scores and used simulation studies to compare their performance. These comparisons were conducted across a range of scenarios featuring variation in outcome type, validation sample size, main sample size, strength of confounding, and structure of the error in the mismeasured covariate. We then applied these approaches to a real-world EHR-based comparative effectiveness study of alternative treatments for metastatic bladder cancer. This head-to-head comparison of measurement error correction methods in the context of a propensity score-adjusted analysis demonstrated that multiple imputation for propensity scores performs best when the outcome is continuous and regression calibration-based methods perform best when the outcome is binary.





The authors would like to thank Flatiron Health for providing the data for patients with metastatic bladder cancer.


Research reported in this publication was supported by the National Cancer Institute of the National Institutes of Health under Award Numbers R21CA227613 and K23CA187185. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Author information



Corresponding author

Correspondence to Joanna Harton.

Ethics declarations

Conflict of interest

Dr. Mamtani reports having served as a consultant for Seattle Genetics/Astellas. The authors declared no other potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Additional information

Nandita Mitra and Rebecca Hubbard: Co-senior authors.

Electronic supplementary material

Supplementary material 1 (pdf 818 KB)

About this article

Cite this article

Harton, J., Mamtani, R., Mitra, N. et al. Bias reduction methods for propensity scores estimated from error-prone EHR-derived covariates. Health Serv Outcomes Res Method 21, 169–187 (2021).

Keywords

  • Electronic health record (EHR) data
  • Missingness
  • Bias
  • Mismeasurement
  • Regression calibration
  • Propensity score