Radius matching on the propensity score with bias adjustment: tuning parameters and finite sample behaviour


Using a simulation design that is based on empirical data, a recent study by Huber et al. (J Econom 175:1–21, 2013) finds that distance-weighted radius matching with bias adjustment as proposed in Lechner et al. (J Eur Econ Assoc 9:742–784, 2011) is competitive among a broad range of propensity score-based estimators used to correct for mean differences due to observable covariates. In this companion paper, we further investigate the finite sample behaviour of radius matching with respect to various tuning parameters. The results are intended to help practitioners choose suitable values of these parameters when using this method, which has been implemented in the software packages GAUSS, STATA and R.


  1. See for example the recent surveys by Blundell and Costa Dias (2009), Imbens (2004), and Imbens and Wooldridge (2009) for a discussion of the properties of such estimators as well as a list of recent applications.

  2. It has also been used in Wunsch and Lechner (2008), Lechner (2009), Lechner and Wunsch (2009a, b), Behncke et al. (2010a, b), and Huber et al. (2011).

  3. Note that HLW13 combine the radius multiplier with the maximum distance between matched observations, rather than with a particular quantile.

  4. The latest version of the GAUSS codes is available from The latest version of the STATA code is available from the SSC archive.

  5. We focus on the ATET for reasons of computational costs. Note that estimating the average treatment effect on the non-treated (ATENT) is symmetric to the problem we consider (just recode \(D\) as \(1-D\)) and thus not interesting in its own right. The ATE is obtained as a weighted average of the ATET and the ATENT, where the weight for the ATET is the share of treated and the weight of ATENT is one minus this share.

  6. In contrast, the Euclidean distance metric - defined as \(\sqrt{\left( {\tilde{x}_i^{D=1} -\tilde{x}_j^{D=0} } \right) I\left( {\tilde{x}_i^{D=1} -\tilde{x}_j^{D=0} } \right) ^{\prime }}=\sqrt{\sum _{k=1}^K {\left( {\tilde{x}_{i,k}^{D=1} -\tilde{x}_{j,k}^{D=0} } \right) } ^{2}}\), with \(I\) denoting the \(K\)-dimensional identity matrix and \(\tilde{x}_{i,k}^{D=1} ,\tilde{x}_{j,k}^{D=0} \) being the \(k^{th}\) elements in \(\tilde{x}_i^{D=1} ,\tilde{x}_j^{D=0} \)- would assign equal weights to all differences, irrespective of how much they differ in terms of standard deviations and covariances.

  7. Note that this estimator satisfies the so-called 'double robustness property': it is consistent if either the matching step is based on a correctly specified propensity score model or the bias-adjustment step is based on a correctly specified regression model (see for instance Joffe et al. 2004, and Rubin 1979). However, in our implementation the propensity score and the variables included in the Mahalanobis metric are used as regressors in the local adjustment. Therefore, the relevance of the double robustness property in our context is not clear.

  8. We acknowledge that cross-validation might be an alternative data-driven approach worth considering. See Frölich (2005), whose simulations suggest that cross-validation performs rather well for bandwidth selection in kernel matching (and in particular better than a selection method based on an asymptotic approximation of the estimator’s mean squared error), even though it does not provide the asymptotically optimal bandwidth. Similar arguments could carry over to radius matching as considered in this paper.

  9. If both procedures are used at the same time, the common support restriction of Dehejia and Wahba (1999) is enforced prior to trimming the weights of the remaining observations.

  10. \(\hat{{\sigma }}_i^2\) may also be obtained by other methods, for instance the Abadie and Imbens (2006) variance estimator based on matching within the same treatment group.

  11. Papers with related approaches include Abadie and Imbens (2002), Bertrand et al. (2004), Diamond and Sekhon (2008), Lee and Whang (2009), Khwaja et al. (2010) and Huber (2012).

  12. This covers 85 % of the German workforce. It excludes the self-employed as well as civil servants.

  13. Further details regarding the data can be found in Appendix 2.

  14. The programmes we consider correspond to general training in Wunsch and Lechner (2008) and to short and long training in LMW11.

  15. Note that the descriptive statistics in Table 2 seemingly differ from those in Table 1 of HLW13, even though they refer to the same data. The reason is that in HLW13, the non-treated covariate means are incorrectly displayed in the column which claims to provide the standard deviations of the covariates of the treated, while the latter are given in the column which claims to show the non-treated covariate means. Therefore, Table 2 is correct, while the statistics in Table 1 of HLW13 are partially misplaced.

  16. Note that the simulations are not conditional on \(D\). Thus, the share of treated in each sample is random.

  17. The standardized differences as well as the pseudo-\(R^{2}\)s are based on a re-estimated propensity score in the population with simulated treated (114,349 obs.). However, reassigning controls to act as simulated treated changes the control population. This effect, together with the fact that the share of treated differs from the original share, leads to different values of those statistics even in the case that mimics selection in the original population.

    Table 3 Summary statistic of DGP’s
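
Footnote 6 contrasts the Euclidean metric with the Mahalanobis metric used in the matching step. The following is a minimal sketch of that contrast, not the authors' implementation: the function names, the covariance matrix `cov` and the covariate values are illustrative assumptions, and the source does not state how the covariance matrix entering the Mahalanobis metric is estimated.

```python
import numpy as np


def euclidean_distance(x_treated, x_control):
    """Euclidean distance: identity weighting, as in footnote 6."""
    diff = np.asarray(x_treated, dtype=float) - np.asarray(x_control, dtype=float)
    return float(np.sqrt(diff @ diff))


def mahalanobis_distance(x_treated, x_control, cov):
    """Mahalanobis distance: differences weighted by the inverse covariance matrix."""
    diff = np.asarray(x_treated, dtype=float) - np.asarray(x_control, dtype=float)
    # Solve cov @ z = diff instead of forming the inverse explicitly.
    return float(np.sqrt(diff @ np.linalg.solve(np.asarray(cov, dtype=float), diff)))


# Hypothetical example: the second covariate has a variance of 100,
# so a raw difference of 10 in it is only one standard deviation.
cov = np.array([[1.0, 0.0],
                [0.0, 100.0]])
x_i = np.array([0.0, 0.0])   # a treated observation (illustrative values)
x_j = np.array([1.0, 10.0])  # a control observation (illustrative values)

d_euclid = euclidean_distance(x_i, x_j)
d_mahal = mahalanobis_distance(x_i, x_j, cov)
```

In this example both covariates differ by exactly one standard deviation, so the Mahalanobis metric treats them alike, whereas the Euclidean metric is dominated by the high-variance covariate.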


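The weighting described in footnote 5 can be written as a one-line computation. This is a sketch under assumed inputs: the function name and the numerical values are hypothetical and do not come from the paper.

```python
def ate_from_atet_atent(atet, atent, share_treated):
    """Combine ATET and ATENT into the ATE.

    The weight of the ATET is the share of treated; the weight of the
    ATENT is one minus this share (footnote 5).
    """
    if not 0.0 <= share_treated <= 1.0:
        raise ValueError("share_treated must lie in [0, 1]")
    return share_treated * atet + (1.0 - share_treated) * atent


# Illustrative numbers only: 25 % treated, ATET = 2.0, ATENT = 1.0.
ate = ate_from_atet_atent(atet=2.0, atent=1.0, share_treated=0.25)
```

The ATENT itself can be obtained with the same estimator as the ATET after recoding the treatment indicator \(D\) as \(1-D\), as the footnote notes.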
  • Abadie A, Imbens GW (2006) Large sample properties of matching estimators for average treatment effects. Econometrica 74:235–267

  • Abadie A, Imbens GW (2008) On the failure of the bootstrap for matching estimators. Econometrica 76:1537–1557

  • Behncke S, Frölich M, Lechner M (2010a) Unemployed and their case workers: should they be friends or foes? J R Stat Soc Ser A 173:67–92

  • Behncke S, Frölich M, Lechner M (2010b) A caseworker like me: does the similarity between unemployed and caseworker increase job placements? Econ J 120:1430–1459

  • Blundell R, Costa Dias M (2009) Alternative approaches to evaluation in empirical microeconomics. J Hum Resour 44:565–640

  • Busso M, DiNardo J, McCrary J (2009a) Finite sample properties of semiparametric estimators of average treatment effects. J Bus Econ Stat 27:397–415

  • Busso M, DiNardo J, McCrary J (2009b) New evidence on the finite sample properties of propensity score matching and reweighting estimators, IZA Discussion Paper, 3998

  • Crump RK, Hotz VJ, Imbens GW, Mitnik OA (2009) Dealing with limited overlap in estimation of average treatment effects. Biometrika 96:187–199

  • Dehejia RH, Wahba S (1999) Causal effects in non-experimental studies: reevaluating the evaluation of training programmes. J Am Stat Assoc 94:1053–1062

  • Dehejia RH, Wahba S (2002) Propensity score: matching methods for nonexperimental causal studies. Rev Econ Stat 84:151–161

  • Diamond A, Sekhon JS (2008) Genetic matching for estimating causal effects: a general multivariate matching method for achieving balance in observational studies. Mimeo, Berkeley

  • Efron B (1979) Bootstrap methods: another look at the Jackknife. Ann Stat 7:1–26

  • Frölich M (2004) Finite-sample properties of propensity-score matching and weighting estimators. Rev Econ Stat 86:77–90

  • Frölich M (2005) Matching estimators and optimal bandwidth choice. Stat Comput 15:197–215

  • Frölich M (2007) Nonparametric IV estimation of local average treatment effects with covariates. J Econom 139:35–75

  • Graham BS, Pinto C, Egel D (2012) Inverse probability tilting for moment condition models with missing data. Rev Econ Stud 79:1053–1079

  • Heckman JJ, Ichimura H, Todd P (1998) Matching as an econometric evaluation estimator. Rev Econ Stud 65:261–294

  • Heckman JJ, Ichimura H, Smith J, Todd P (1998) Characterizing selection bias using experimental data. Econometrica 66:1017–1098

  • Heckman JJ, LaLonde R, Smith J (1999) The economics and econometrics of active labor market programs. In: Ashenfelter O, Card D (eds) Handbook of labour economics. Elsevier Science, Amsterdam, pp 1865–2097

  • Hirano K, Imbens GW, Ridder G (2003) Efficient estimation of average treatment effects using the estimated propensity score. Econometrica 71:1161–1189

  • Ho D, Imai K, King G, Stuart E (2007) Matching as nonparametric preprocessing for reducing model dependence in parametric causal inference. Polit Anal 15:199–236

  • Horowitz JL (2001) The bootstrap. In: Heckman JJ, Leamer E (eds) Handbook of econometrics. Elsevier Science, Amsterdam, pp 3159–3228

  • Horvitz D, Thompson D (1952) A generalization of sampling without replacement from a finite population. J Am Stat Assoc 47:663–685

  • Huber M (2012) Identification of average treatment effects in social experiments under alternative forms of attrition. J Educ Behav Stat 37:443–474

  • Huber M, Lechner M, Wunsch C (2013) The performance of estimators based on the propensity score. J Econom 175:1–21

  • Huber M, Lechner M, Wunsch C (2011) Does leaving welfare improve health? Evidence for Germany. Health Econ 20:484–504

  • Imbens GW (2004) Nonparametric estimation of average treatment effects under exogeneity: a review. Rev Econ Stat 86:4–29

  • Imbens GW, Wooldridge JM (2009) Recent developments in the econometrics of program evaluation. J Econ Lit 47:5–86

  • Joffe MM, Ten Have TR, Feldman HI, Kimmel SE (2004) Model selection, confounder control, and marginal structural models. Am Stat 58:272–279

  • Khwaja A, Salm GPM, Trogdon JG (2010) A comparison of treatment effects estimators using a structural model of AMI treatment choices and severity of illness information from hospital charts. J Appl Econom. doi:10.1002/Jae.1181

  • Lechner M (2009) Long-run labour market and health effects of individual sports activities. J Health Econ 28:839–854

  • Lechner M, Wunsch C (2009a) Active labour market policy in East Germany: waiting for the economy to take off. Econ Trans 17:661–702

  • Lechner M, Wunsch C (2009b) Are training programs more effective when unemployment is high? J Lab Econ 27:653–692

  • Lechner M, Miquel R, Wunsch C (2011) Long-run effects of public sector sponsored training in West Germany. J Eur Econ Assoc 9:742–784

  • Lechner M, Strittmatter A (2014) Practical procedures to deal with common support problems in matching estimation. Mimeo,

  • Lee S, Whang Y-J (2009) Nonparametric tests of conditional treatment effects, Cowles Foundation Discussion Paper 1740

  • MacKinnon JG (2006) Bootstrap methods in econometrics. Econ Rec 82:2–18

  • Robins JM, Mark SD, Newey WK (1992) Estimating exposure effects by modelling the expectation of exposure conditional on confounders. Biometrics 48:479–495

  • Rosenbaum PR, Rubin DB (1983) The central role of the propensity score in observational studies for causal effects. Biometrika 70:41–55

  • Rosenbaum PR, Rubin DB (1985) Constructing a control group using multivariate matched sampling methods that incorporate the propensity score. Am Stat 39:33–38

  • Rubin DB (1974) Estimating causal effects of treatments in randomized and nonrandomized studies. J Educ Psychol 66:688–701

  • Rubin DB (1979) Using multivariate matched sampling and regression adjustment to control bias in observational studies. J Am Stat Assoc 74:318–328

  • Silverman BW (1986) Density estimation for statistics and data analysis. Chapman and Hall, London

  • Wunsch C, Lechner M (2008) What did all the money do? On the general ineffectiveness of recent West German Labour Market Programmes. Kyklos 61:134–174


Martin Huber gratefully acknowledges financial support from the Swiss National Science Foundation grant PBSGP1_138770. We would like to thank Conny Wunsch (SEW) for her help in the early stages of the paper.

Author information

Corresponding author

Correspondence to Michael Lechner.

Additional information

Michael Lechner is a Research Fellow of CEPR and PSI, London, CES-Ifo, Munich, IAB, Nuremberg, and IZA, Bonn.


Appendix 1: More details on the features of the DGP and the estimator

Appendix 2: Dataset description

The data comprise all aspects of an individual’s employment, earnings and unemployment insurance history since 1990 (e.g. type of employment such as full/part-time and high/low-skilled, occupation, earnings, type and amount of unemployment insurance benefits and remaining claims), participation in major labour market programmes from 2000 onwards (including the exact start date, end date, planned end date and type of programme), individual characteristics (e.g. date of birth, gender, educational attainment, marital status, number of children, age of youngest child, nationality, occupation, the presence of health impairments and disability status) and job search activities (the type of job looked for such as full/part-time, high/low-skilled and the occupation, mobility within Germany and health impairments affecting employability). Furthermore, a variety of regional variables has been matched to the data, including information about migration and commuting, average earnings, unemployment rate, long-term unemployment, welfare dependency rates, urbanisation codes, and measures of industry structure and public transport facilities.

The sample used for the simulations covers all entries into unemployment in the period 2000–2003, excluding East Germany and Berlin, which are still affected by the aftermath of reunification. Furthermore, unemployment entries in January–March 2000 are discarded: because programme information starts only in January 2000, entries from employment programmes (which we would consider as unemployed) could otherwise be accidentally classified as entries from unsubsidized employment due to missing information regarding the accompanying programme spell. Entries after 2003 are not considered, so that the outcome variables, employment and earnings, are observed for at least three years after entering unemployment. Moreover, the analysis is restricted to the prime-age population aged 20–59 in order to limit the impact of schooling and (early) retirement decisions, and to individuals who were not unemployed or in any labour market programme in the 12 months before becoming unemployed, to make the sample more homogeneous. Finally, the very few cases whose last employment was a non-standard form of employment, such as an internship, were excluded.

Cite this article

Huber, M., Lechner, M. & Steinmayr, A. Radius matching on the propensity score with bias adjustment: tuning parameters and finite sample behaviour. Empir Econ 49, 1–31 (2015).


Keywords

  • Propensity score matching
  • Radius matching
  • Selection on observables
  • Empirical Monte Carlo study
  • Finite sample properties

JEL Classification

  • C21