Matching in R

  • Paul R. Rosenbaum
Part of the Springer Series in Statistics book series (SSS)


The statistical package R is used to construct several matched samples from one data set. The focus is on the mechanics of using R, not on the design of observational studies. The process is made tangible by describing it in detail and closely inspecting intermediate results. In essence, however, three steps are involved: (i) creating a distance matrix, (ii) adding a propensity score caliper to the distance matrix, and (iii) finding an optimal match. One appendix contains a short introduction to R. A second appendix contains short R functions for creating the distance matrices used in matching.
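The three steps can be sketched in base R as follows. This is only an illustrative sketch, not the chapter's own code: the toy data, the variable names (dat, dmat, dcal, matched), the plain Mahalanobis distance, the caliper width of 0.2 standard deviations of the logit propensity score, and the penalty of 1000 are all assumptions made for the example. Step (iii) is done here by brute-force enumeration, which is feasible only for a handful of units; real applications would use a dedicated assignment or network-flow solver such as the one in the optmatch package.

```r
# Toy data: 3 treated units and 5 potential controls (all values are made up).
dat <- data.frame(
  z   = c(1, 1, 1, 0, 0, 0, 0, 0),          # 1 = treated, 0 = control
  age = c(30, 45, 52, 28, 33, 44, 50, 61),
  edu = c(12, 16, 10, 12, 14, 16, 11, 9)
)
tr <- which(dat$z == 1)                      # row numbers of treated units
ct <- which(dat$z == 0)                      # row numbers of controls

# (i) Distance matrix: squared Mahalanobis distance between each treated
#     unit (rows) and each control (columns).
X <- as.matrix(dat[, c("age", "edu")])
S <- cov(X)
dmat <- t(sapply(tr, function(i) mahalanobis(X[ct, , drop = FALSE], X[i, ], S)))
dimnames(dmat) <- list(tr, ct)

# (ii) Propensity score caliper: estimate the score by logistic regression,
#      then penalize pairs whose logit scores differ by more than the caliper.
fit     <- glm(z ~ age + edu, family = binomial, data = dat)
lp      <- predict(fit)                      # logit of the propensity score
caliper <- 0.2 * sd(lp)                      # assumed caliper width
gap     <- abs(outer(lp[tr], lp[ct], "-"))
dcal    <- dmat + 1000 * pmax(gap - caliper, 0)

# (iii) Optimal pair match: enumerate every assignment of distinct controls
#       to the treated units and keep the one with the smallest total distance.
cand <- as.matrix(expand.grid(rep(list(seq_along(ct)), length(tr))))
cand <- cand[apply(cand, 1, function(r) !anyDuplicated(r)), , drop = FALSE]
cost <- apply(cand, 1, function(r) sum(dcal[cbind(seq_along(tr), r)]))
best <- cand[which.min(cost), ]
matched <- data.frame(treated = tr, control = ct[best])
print(matched)
```

Inspecting dmat, dcal, and matched in turn shows what each step contributes: the caliper never lowers a distance, it only makes pairs with dissimilar propensity scores expensive, so the optimal match avoids them when it can.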


Keywords: Propensity Score; Distance Matrix; Statist Assoc; High School Senior; Optimal Match
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag New York 2010

Authors and Affiliations

  1. Statistics Department, Wharton School, University of Pennsylvania, Philadelphia, USA
