Using Propensity Scores to Help Design Observational Studies: Application to the Tobacco Litigation

  • Donald B. Rubin
Article

Abstract

Propensity score methodology can be used to help design observational studies in a way analogous to the way randomized experiments are designed: without seeing any answers involving outcome variables. The typical models used to analyze observational data (e.g., least squares regressions, difference of difference methods) involve outcomes, and so cannot be used for design in this sense. Because the propensity score is a function only of covariates, not outcomes, repeated analyses attempting to balance covariate distributions across treatment groups do not bias estimates of the treatment effect on outcome variables. This theme will be the primary focus of this article: how to use the techniques of matching, subclassification and/or weighting to help design observational studies. The article also proposes a new diagnostic table to aid in this endeavor, which is especially useful when there are many covariates under consideration. The conclusion of the initial design phase may be that the treatment and control groups are too far apart to produce reliable effect estimates without heroic modeling assumptions. In such cases, it may be wisest to abandon the intended observational study and search for a more acceptable data set where such heroic modeling assumptions are not necessary. The ideas and techniques will be illustrated using the initial design of an observational study for use in the tobacco litigation based on the NMES data set.
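
As a minimal sketch of the design-stage idea, the Python snippet below checks covariate balance before and after subclassification on the propensity score, using covariates and treatment indicators only and no outcome variable. It assumes the propensity scores have already been estimated. The toy data, the function names (`subclass_labels`, `mean_difference`, `subclassified_difference`) and the choice of five equal-sized subclasses are illustrative assumptions, not taken from the article. It is a generic balance check, not the diagnostic table the article proposes.

```python
from statistics import mean


def subclass_labels(scores, n_subclasses=5):
    """Assign each unit to an equal-sized subclass by rank of its propensity score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    labels = [0] * len(scores)
    for rank, i in enumerate(order):
        labels[i] = rank * n_subclasses // len(scores)
    return labels


def mean_difference(x, z):
    """Treated-minus-control difference in the mean of covariate x."""
    treated = [xi for xi, zi in zip(x, z) if zi == 1]
    control = [xi for xi, zi in zip(x, z) if zi == 0]
    return mean(treated) - mean(control)


def subclassified_difference(x, z, labels):
    """Size-weighted average of within-subclass mean differences.

    Subclasses that lack either treated or control units are skipped.
    """
    total, n_used = 0.0, 0
    for s in sorted(set(labels)):
        xs = [xi for xi, li in zip(x, labels) if li == s]
        zs = [zi for zi, li in zip(z, labels) if li == s]
        if 0 < sum(zs) < len(zs):
            total += len(xs) * mean_difference(xs, zs)
            n_used += len(xs)
    return total / n_used


# Hypothetical toy data: one covariate, with treatment more common at high values.
x = list(range(1, 21))
scores = [xi / 21 for xi in x]  # stand-in for estimated propensity scores
treated_x = {2, 6, 10, 11, 13, 15, 16, 17, 19, 20}
z = [1 if xi in treated_x else 0 for xi in x]

labels = subclass_labels(scores)
raw = mean_difference(x, z)
adjusted = subclassified_difference(x, z, labels)
print("raw difference in covariate means:", raw)
print("difference after subclassification:", adjusted)
```

Because no outcome enters the calculation, a check of this kind can be rerun with different subclass counts or score models without touching the eventual treatment effect estimate.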

Keywords: balance, matching, subclassification

Copyright information

© Kluwer Academic Publishers 2001

Authors and Affiliations

  • Donald B. Rubin (1)
  1. Department of Statistics, Harvard University, Cambridge
