An intuitive review of methods for observational studies of comparative effectiveness

  • Steven D. Pizer


I use diagrams to illustrate the sources of potential selection bias in observational studies of comparative effectiveness. I adapt these diagrams for three hypothetical scenarios that clarify the strengths and weaknesses of two prominent methods used to account for potential selection bias: propensity scores and instrumental variables. After reviewing the fundamentals of how to apply each method, including new developments that make implementation easier, I refer to some recent studies that illustrate how choice of method can affect estimates. I conclude by emphasizing that many studies with apparently rich sources of data are nevertheless unlikely to produce unbiased estimates and that conceptual modeling can help identify these problems in advance.


Comparative effectiveness, Observational studies, Selection bias, Propensity scores, Instrumental variables



This research was supported by Grant Number IAD 06-112 from the Health Services Research and Development Service of the U.S. Department of Veterans Affairs. All opinions expressed in this paper are those of the author and do not necessarily reflect the official position of the U.S. Department of Veterans Affairs or of Boston University. The author wishes to thank Matt Maciejewski, Paul Hebert, Ann Hendricks, Austin Frakt, and an anonymous reviewer for helpful comments.



Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. Health Care Financing & Economics, U.S. Department of Veterans Affairs, Boston, USA
  2. Boston University School of Public Health, Boston, USA
