Propensity score methodology can be used to help design observational studies in a way analogous to the way randomized experiments are designed: without seeing any answers involving outcome variables. The typical models used to analyze observational data (e.g., least squares regressions, difference-in-differences methods) involve outcomes, and so cannot be used for design in this sense. Because the propensity score is a function only of covariates, not outcomes, repeated analyses attempting to balance covariate distributions across treatment groups do not bias estimates of the treatment effect on outcome variables. This theme is the primary focus of this article: how to use the techniques of matching, subclassification and/or weighting to help design observational studies. The article also proposes a new diagnostic table to aid in this endeavor, which is especially useful when there are many covariates under consideration. The conclusion of the initial design phase may be that the treatment and control groups are too far apart to produce reliable effect estimates without heroic modeling assumptions. In such cases, it may be wisest to abandon the intended observational study and search for a more acceptable data set where such heroic modeling assumptions are not necessary. The ideas and techniques are illustrated using the initial design of an observational study for use in the tobacco litigation, based on the NMES data set.
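The idea of designing without outcomes can be illustrated with a minimal sketch, under assumptions that are not taken from the article: simulated data with a single covariate `x`, a logistic propensity model fitted by plain gradient ascent, and five equal-sized subclasses. The helper names (`fit_logistic`, `pooled_sd`, `subclass_mean_diff`) are hypothetical, and the balance summary shown is a generic standardized difference in means, not the article's diagnostic table. Note that no outcome variable appears anywhere in the code.

```python
import math
import random
import statistics


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ts, steps=300, lr=0.5):
    """Fit P(T=1 | x) = sigmoid(a + b*x) by gradient ascent on the mean log-likelihood."""
    a = b = 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, t in zip(xs, ts):
            resid = t - sigmoid(a + b * x)
            grad_a += resid
            grad_b += resid * x
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b


def split_by_treatment(xs, ts, idx):
    """Return covariate values of treated and control units among the given indices."""
    treated = [xs[i] for i in idx if ts[i] == 1]
    control = [xs[i] for i in idx if ts[i] == 0]
    return treated, control


def pooled_sd(treated, control):
    """Square root of the average of the two group variances."""
    return math.sqrt((statistics.variance(treated) + statistics.variance(control)) / 2)


def subclass_mean_diff(xs, ts, scores, k=5):
    """Size-weighted average of within-subclass differences in covariate means."""
    order = sorted(range(len(xs)), key=lambda i: scores[i])
    size = len(order) // k
    total = 0.0
    for j in range(k):
        block = order[j * size:] if j == k - 1 else order[j * size:(j + 1) * size]
        treated, control = split_by_treatment(xs, ts, block)
        if not treated or not control:
            # No overlap in this subclass: the groups may be too far apart to compare.
            raise ValueError(f"subclass {j} lacks treated or control units")
        diff = statistics.mean(treated) - statistics.mean(control)
        total += diff * len(block) / len(order)
    return total


# Simulated design data: covariate and treatment indicator only, no outcomes.
random.seed(0)
n = 1000
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
ts = [1 if random.random() < sigmoid(-0.5 + x) else 0 for x in xs]

a, b = fit_logistic(xs, ts)
scores = [sigmoid(a + b * x) for x in xs]  # estimated propensity scores

treated_all, control_all = split_by_treatment(xs, ts, range(n))
sd = pooled_sd(treated_all, control_all)
raw = (statistics.mean(treated_all) - statistics.mean(control_all)) / sd
adjusted = subclass_mean_diff(xs, ts, scores) / sd  # same denominator for comparability

print(f"standardized difference before subclassification: {raw:.3f}")
print(f"standardized difference after subclassification:  {adjusted:.3f}")
```

Because only `xs` and `ts` enter the computation, the propensity model and the number of subclasses can be revised repeatedly until balance is acceptable without any risk of tuning toward a desired effect estimate. The `ValueError` branch mirrors the situation described above in which a subclass contains no comparable units, one sign that the groups may be too far apart.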
About this article
Cite this article
Rubin, D.B. Using Propensity Scores to Help Design Observational Studies: Application to the Tobacco Litigation. Health Services & Outcomes Research Methodology 2, 169–188 (2001). https://doi.org/10.1023/A:1020363010465