Abstract
In treatment effect analysis, the treatment often takes a particular structure: ‘on’ if an underlying continuous variable crosses a threshold, and ‘off’ otherwise. Such a treatment occurs in various institutional settings, such as a test score crossing a threshold to graduate, or income falling below a threshold to qualify for aid. The study design for such cases is called ‘regression discontinuity (RD)’, which is popular in analyzing observational data whenever the treatment takes the required form. This paper reviews RD to convey its essentials, and provides some extensions. First, the main RD idea based on local randomization due to an institutional/legal break is introduced. Second, treatment effects identified by RD are explored. Third, popular RD estimators are reviewed. Fourth, the main specification tests are examined. Fifth, special RD topics are reviewed. An empirical illustration is also provided.
References
Abadie A (2002) Bootstrap tests for distributional treatment effects in instrumental variable models. J Am Stat Assoc 97:284–292
Almond D, Doyle JJ Jr, Kowalski AE, Williams H (2010) Estimating marginal returns to medical care: evidence from at-risk newborns. Q J Econ 125:591–634
Almond D, Doyle JJ Jr, Kowalski AE, Williams H (2011) The role of hospital heterogeneity in measuring marginal returns to medical care: a reply to Barreca, Guldi, Lindo, and Waddell. Q J Econ 126:2125–2131
Angrist JD, Lavy V (1999) Using Maimonides’ rule to estimate the effect of class size on scholastic achievement. Q J Econ 114:533–575
Angrist JD, Rokkanen M (2015) Wanna get away? Regression discontinuity estimation of exam school effects away from the cutoff. J Am Stat Assoc 110:1331–1344
Barreca AI, Guldi M, Lindo JM, Waddell GR (2011) Saving babies? Revisiting the effect of very low birth weight classification. Q J Econ 126:1–7
Barreca AI, Lindo JM, Waddell GR (2016) Heaping-induced bias in regression-discontinuity designs. Econ Inq 54:268–293
Battistin E, Brugiavini A, Rettore E, Weber G (2009) The retirement consumption puzzle: evidence from a regression discontinuity approach. Am Econ Rev 99:2209–2226
Battistin E, Rettore E (2002) Testing for programme effects in a regression discontinuity design with imperfect compliance. J R Stat Soc (Ser A) 165:39–57
Berk RA, de Leeuw J (1999) An evaluation of California’s inmate classification system using a generalized regression discontinuity design. J Am Stat Assoc 94:1045–1052
Berk RA, Rauma D (1983) Capitalizing on nonrandom assignment to treatments: a regression-discontinuity evaluation of a crime control program. J Am Stat Assoc 78:21–27
Bertanha M (2015) Regression discontinuity design with many thresholds. Unpublished paper
Bertanha M, Imbens GW (2014) External validity in fuzzy regression discontinuity designs. NBER working paper 20773
Breitung J, Kruse R (2013) When bubbles burst: econometric tests based on structural breaks. Stat Pap 54:911–930
Caliendo M, Tatsiramos K, Uhlendorff A (2013) Benefit duration, unemployment duration and job match quality: a regression-discontinuity approach. J Appl Econ 28:604–627
Calonico S, Cattaneo MD, Titiunik R (2014) Robust nonparametric confidence intervals for regression-discontinuity designs. Econometrica 82:2295–2326
Calonico S, Cattaneo MD, Titiunik R (2015) Optimal data-driven regression discontinuity plots. J Am Stat Assoc 110:1753–1769
Cappelleri JC, Trochim WMK, Stanley TD, Reichardt CS (1991) Random measurement error does not bias the treatment effect estimate in the regression-discontinuity design: I. the case of no interaction. Eval Rev 15:395–419
Card D, Lee DS, Pei Z, Weber A (2012) Nonlinear policy rules and the identification and estimation of causal effects in a generalized regression kink design. NBER working paper 18564
Card D, Lee DS, Pei Z, Weber A (2015) Inference on causal effects in a generalized regression kink design. Econometrica 83:2453–2483
Cattaneo MD, Frandsen B, Titiunik R (2015) Randomization inference in the regression discontinuity design: an application to party advantages in the U.S. Senate. J Causal Inference 3(1):1–24
Choi JY, Lee MJ (2015) Regression discontinuity with multiple running variables allowing partial effects, presented at the World Congress Meeting of the Econometric Society at Montreal
Ciuperca G (2014) Model selection by LASSO methods in a change-point model. Stat Pap 55:349–374
Clark D, Martorell P (2014) The signaling value of a high school diploma. J Polit Econ 122:282–318
Cook TD (2008) “Waiting for life to arrive”: a history of the regression-discontinuity design in psychology, statistics and economics. J Econ 142:636–654
Crawford C, Dearden L, Greaves E (2014) The drivers of month-of-birth differences in children’s cognitive and non-cognitive skills. J R Stat Soc (Ser A) 177:829–860
Dickens R, Riley R, Wilkinson D (2014) The UK minimum wage at 22 years of age: a regression discontinuity approach. J R Stat Soc (Ser A) 177:95–114
Dong Y (2014) Jump or kink? Identification of binary treatment regression discontinuity design without the discontinuity. R&R J Political Econ forthcoming
Dong Y (2015) Regression discontinuity applications with rounding errors in the running variable. J Appl Econ 30:422–446
Dong Y, Lewbel A (2015) Identifying the effect of changing the policy threshold in regression discontinuity models. Rev Econ Stat 97:1081–1092
Feir D, Lemieux T, Marmer V (2015) Weak identification in fuzzy regression discontinuity designs. J Bus Econ Stat forthcoming
Frandsen BR, Frölich M, Melly B (2012) Quantile treatment effects in the regression discontinuity design. J Econ 168:382–395
Gerard F, Rokkanen M, Rothe C (2015) Partial identification in regression discontinuity designs with manipulated running variables. Unpublished paper
Hahn J, Todd P, Van der Klaauw W (2001) Identification and estimation of treatment effects with a regression-discontinuity design. Econometrica 69:201–209
Hansen BB (2008) The prognostic analogue of the propensity score. Biometrika 95:481–488
Hullegie P, Klein TJ (2010) The effect of private health insurance on medical care utilization and self-assessed health in Germany. Health Econ 19:1048–1062
Imbens GW (2000) The role of the propensity score in estimating dose-response functions. Biometrika 87:706–710
Imbens GW, Angrist JD (1994) Identification and estimation of local average treatment effects. Econometrica 62:467–475
Imbens GW, Kalyanaraman K (2012) Optimal bandwidth choice for the regression discontinuity estimator. Rev Econ Stud 79:933–959
Imbens GW, Lemieux T (2008) Regression discontinuity designs: a guide to practice. J Econ 142:615–635
Imbens GW, Zajonc T (2009) Regression discontinuity design with vector-argument assignment rules. Unpublished paper
Jacob BA, Lefgren L (2004) Remedial education and student achievement: a regression discontinuity analysis. Rev Econ Stat 86:226–244
Keele LJ, Titiunik R (2015) Geographic boundaries as regression discontinuities. Polit Anal 23:127–155
Keele LJ, Titiunik R, Zubizarreta JR (2015) Enhancing a geographic regression discontinuity design through matching to estimate the effect of ballot initiatives on voter turnout. J R Stat Soc (Ser A) 178:223–239
Kim YS, Lee MJ (2016) Regression-kink approach for wage effect on male work hours. Oxf Bull Econ Stat forthcoming
Lalive R (2008) How do extended benefits affect unemployment duration? Regression discontinuity approach. J Econ 142:785–806
Lechner M (2001) Identification and estimation of causal effects of multiple treatments under the conditional independence assumption. In: Lechner M, Pfeiffer F (eds) Econometric evaluation of labor market policies. Physica-Verlag, New York, pp 43–58
Lee DS, Card D (2008) Regression discontinuity inference with specification error. J Econ 142:655–674
Lee DS, Lemieux T (2010) Regression discontinuity designs in economics. J Econ Lit 48:281–355
Lee MJ (2000) Median treatment effect in randomized trials. J R Stat Soc (Ser B) 62:595–604
Lee MJ (2005) Micro-econometrics for policy, program, and treatment effects. Oxford University Press, Oxford
Lee MJ (2016) Regression discontinuity with errors in the running variable: effect on truthful margin. J Econ Methods forthcoming
Leuven E, Lindahl M, Oosterbeek H, Webbink D (2007) The effect of extra funding for disadvantaged pupils on achievement. Rev Econ Stat 89:721–736
Ludwig J, Miller D (2007) Does Head Start improve children’s life chances? Evidence from a regression discontinuity design. Q J Econ 122:159–208
Matsudaira JD (2008) Mandatory summer school and student achievement. J Econ 142:829–850
McCrary J (2008) Manipulation of the running variable in the regression discontinuity design: a density test. J Econ 142:698–714
MacDonald JM, Klick J, Grunwald B (2016) The effect of private police on crime: evidence from a geographic regression discontinuity design. J R Stat Soc (Ser A) forthcoming
Mealli F, Rampichini C (2012) Evaluating the effects of university grants by using regression discontinuity designs. J R Stat Soc (Ser A) 175:775–798
Nielsen HS, Sorensen T, Taber CR (2010) Estimating the effect of student aid on college enrollment: evidence from a government grant policy reform. Am Econ J Econ Policy 2(2):185–215
Otsu T, Xu KL, Matsushita Y (2013) Estimation and inference of discontinuity in density. J Bus Econ Stat 31:507–524
Otsu T, Xu KL, Matsushita Y (2015) Empirical likelihood for regression discontinuity design. J Econ 186:94–112
Papay JP, Murnane RJ, Willett JB (2011) Extending the regression discontinuity approach to multiple assignment variables. J Econ 161:203–207
Pei Z (2011) Regression discontinuity design with measurement error in the assignment variable. Unpublished paper
Porter J (2003) Estimation in the regression discontinuity model. Unpublished paper, Department of Economics, University of Wisconsin, Madison
Porter J, Yu P (2015) Regression discontinuity designs with unknown discontinuity points: testing and estimation. J Econ 189:132–147
Qiu P (2005) Image processing and jump regression analysis. Wiley, Hoboken
Robbins H, Zhang CH (1991) Estimating a multiplicative treatment effect under biased allocation. Biometrika 78:349–354
Schanzenbach DW (2009) Do school lunches contribute to childhood obesity? J Hum Resour 44:684–709
Schmieder JF, von Wachter T, Bender S (2012) The effects of extended unemployment insurance over the business cycle: evidence from regression discontinuity estimates over 20 years. Q J Econ 127:701–752
Simonsen M, Skipper L, Skipper N (2015) Price sensitivity of demand for prescription drugs: exploiting a regression kink design. J Appl Econ forthcoming
Thistlethwaite D, Campbell D (1960) Regression-discontinuity analysis: an alternative to the ex post facto experiment. J Educ Psychol 51:309–317
Urquiola M, Verhoogen E (2009) Class-size caps, sorting, and the regression-discontinuity design. Am Econ Rev 99:179–215
Uysal SD (2015) Doubly robust estimation of causal effects with multivalued treatments: an application to the returns to schooling. J Appl Econ 30:763–786
Van der Klaauw W (2002) Estimating the effect of financial aid offers on college enrollment: a regression discontinuity approach. Int Econ Rev 43:1249–1287
Van der Klaauw W (2008) Regression-discontinuity analysis: a survey of recent developments in economics. Labour 22:219–245
Wong VC, Steiner PM, Cook TD (2013) Analyzing regression discontinuity designs with multiple assignment variables: a comparative study of four estimation methods. J Educ Behav Stat 38:107–141
Yanagi T (2014) The effect of measurement error in the sharp regression discontinuity design. KIER Discussion Paper, No 910
Yu P (2012) Identification in regression discontinuity designs with measurement error. Unpublished paper
Yu P (2015) Understanding estimators of treatment effects in regression discontinuity designs. Econ Rev forthcoming
Acknowledgments
The authors are grateful to the Editor and two reviewers for their helpful comments. The research of Myoung-jae Lee has been supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (NRF-2015S1A5A2A01009718).
Appendix
1.1 Identified RD effect when continuity fails
Rewrite (2.1) as
It holds that
Then \(E\{\mathring{m}(S)|S\}\) is continuous at \(S=0\) because
1.2 Optimal plug-in bandwidth
The optimal bandwidth \(h_{opt}\) proposed by Imbens and Kalyanaraman (2012) for SRD is
where \(C_{k}\) is a kernel-dependent constant (5.40 for the uniform kernel and 3.44 for the triangular kernel \(K(t)=\max (0,1-|t|)=(1-|t|)1[|t|<1]\)), \(\hat{f}_{S}(0)\) is an estimator for \(f_{S}(0)\), \(\hat{\sigma }^{2}(s)\) is an estimator for V(Y|s), \(\hat{m}^{(2)}(s)\) is an estimator for the second derivative of E(Y|s), and \(\hat{r}_{-}\) and \(\hat{r}_{+}\) are functions of \(\hat{\sigma }^{2}(0^{-})\) and \(\hat{\sigma }^{2}(0^{+})\). Obtain \(h_{opt}\) with the following three steps.
First, using a pilot rule-of-thumb bandwidth \(h_{1}=1.84\cdot SD(S)\cdot N^{-1/5}\) for the uniform kernel, let \(Q_{1i}^{-}\equiv 1[-h_{1}<S_{i}<0]\) and \(Q_{1i}^{+}\equiv 1[0<S_{i}<h_{1}]\) and
Second, set the second pilot bandwidths:
where \(\hat{m}^{(3)}(0)\) for the third derivative of E(Y|0) is to be obtained as \(6\hat{\gamma }_{4}\) from the LSE to
Let \(Q_{2i}^{-}\equiv 1[-h_{2}^{-}<S_{i}<0]\) and \(Q_{2i}^{+}\equiv 1[0<S_{i}<h_{2}^{+}]\). Apply LSE to
to obtain \(\hat{m}^{(2)}(0^{-})=2\hat{\lambda }_{2}^{-}\) and \(\hat{m} ^{(2)}(0^{+})=2\hat{\lambda }_{2}^{+}\).
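The first two steps can be sketched in Python as follows. The displayed estimators are not reproduced above, so two choices here are assumptions rather than the authors' formulas: the density at the cutoff is taken as the share of observations within \(h_{1}\) of 0 divided by \(2h_{1}\), and the one-sided variances are sample variances of Y inside each window. The second function assumes the LSE is a quadratic in S fitted separately on each side, which is consistent with \(\hat{m}^{(2)}=2\hat{\lambda }_{2}\); it takes the second pilot bandwidths as given inputs. The function names are hypothetical.

```python
import numpy as np


def ik_step_one(s, y):
    """Pilot bandwidth, density and one-sided variances at the cutoff 0."""
    s = np.asarray(s, dtype=float)
    y = np.asarray(y, dtype=float)
    n = s.size
    # Rule-of-thumb pilot bandwidth for the uniform kernel, as stated in the text.
    h1 = 1.84 * s.std(ddof=1) * n ** (-1 / 5)
    q_minus = (-h1 < s) & (s < 0)  # Q_{1i}^-
    q_plus = (0 < s) & (s < h1)    # Q_{1i}^+
    # ASSUMPTION: density at 0 = share of observations within h1 of 0, over width 2*h1.
    f0 = (q_minus.sum() + q_plus.sum()) / (2 * n * h1)
    # ASSUMPTION: one-sided variances = sample variances of Y inside each window.
    var_minus = y[q_minus].var(ddof=1)
    var_plus = y[q_plus].var(ddof=1)
    return h1, f0, var_minus, var_plus


def second_derivatives(s, y, h2_minus, h2_plus):
    """One-sided second derivatives of E(Y|S) at 0 from a local quadratic LSE."""
    s = np.asarray(s, dtype=float)
    y = np.asarray(y, dtype=float)

    def curvature(mask):
        x = s[mask]
        # Regress Y on (1, S, S^2) using only the observations in the window.
        design = np.column_stack([np.ones_like(x), x, x ** 2])
        coef, *_ = np.linalg.lstsq(design, y[mask], rcond=None)
        return 2.0 * coef[2]  # m^(2)(0) = 2 * lambda_2

    left = curvature((-h2_minus < s) & (s < 0))   # Q_{2i}^-
    right = curvature((0 < s) & (s < h2_plus))    # Q_{2i}^+
    return left, right
```

Each window needs at least three distinct score values for the quadratic fit to be determined; with very narrow second pilot bandwidths the curvature estimates become unstable.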
Finally, obtain
and then \(h_{opt}\). Imbens and Kalyanaraman (2012) proposed an optimal bandwidth for FRD in their Eq. (2.4), but then noted that the above \( h_{opt}\) for SRD would be adequate even for FRD, as it is based on the numerator estimation in FRD.
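For reference, the plug-in bandwidth in Imbens and Kalyanaraman (2012), written in the notation of this appendix, appears to take the following form. This is a reconstruction from the definitions given above and should be checked against the original paper; the expressions for \(\hat{r}_{-}\) and \(\hat{r}_{+}\) are not reproduced here.

\[ h_{opt}=C_{k}\left[ \frac{\hat{\sigma }^{2}(0^{-})+\hat{\sigma }^{2}(0^{+})}{\hat{f}_{S}(0)\left\{ \left( \hat{m}^{(2)}(0^{+})-\hat{m}^{(2)}(0^{-})\right) ^{2}+\hat{r}_{-}+\hat{r}_{+}\right\} }\right] ^{1/5}N^{-1/5}. \]

The numerator grows with the outcome variance near the cutoff, and the denominator grows with the score density and with the difference in curvature across the cutoff, so noisier data call for a wider window and a sharper change in curvature calls for a narrower one.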
1.3 LLR-type score-density continuity test
Let \(h_{1}\) be the interval size for the first-stage histogram of the McCrary (2008) test. For an even number n, construct n/2 intervals on each side of 0. Consider left and right intervals of width \(h_{1}\) around 0:
The midpoints of these intervals are
‘G’ stands for grid points and there are n midpoints. Let \(R_{j}\) denote the histogram height at \(G_{j}\); \((R_{j},G_{j})\), \(j=1,\ldots ,n\), are to be taken as “observations” in the second stage.
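The first-stage histogram can be sketched as below. The interval and midpoint displays are not reproduced above, so the construction is an assumption: bin edges are placed at integer multiples of \(h_{1}\) so that 0 is always an edge and no bin straddles the cutoff, and heights are scaled by \(Nh_{1}\) so that they estimate the density. The function name is hypothetical.

```python
import numpy as np


def first_stage_histogram(s, h1, n_bins):
    """Return (G, R): bin midpoints and normalized heights, n_bins/2 bins per side."""
    if n_bins % 2:
        raise ValueError("n_bins must be even")
    s = np.asarray(s, dtype=float)
    half = n_bins // 2
    # Edges at multiples of h1; 0 is an edge, so each bin lies on one side of the cutoff.
    edges = h1 * np.arange(-half, half + 1)
    counts, _ = np.histogram(s, bins=edges)  # scores outside the grid are ignored
    midpoints = (edges[:-1] + edges[1:]) / 2
    # ASSUMPTION: heights are counts divided by N * h1, i.e. density estimates.
    heights = counts / (s.size * h1)
    return midpoints, heights
```

The pairs returned here play the role of \((G_{j},R_{j})\) in the second stage.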
The test is based on \(\ln \hat{\varphi }_{0}-\ln \hat{\psi }_{0}\), where \(\hat{\varphi }_{0}\) and \(\hat{\psi }_{0}\) are the LLR intercept estimates in the following, with \(h_{2}\) a second-stage bandwidth:
Specifically, the test statistic is
As for the asymptotic distribution, the proposition in McCrary (2008, p. 702) is that, for the triangular kernel, under \(h_{2}\rightarrow 0\), \(Nh_{2}\rightarrow \infty \), \(h_{1}/h_{2}\rightarrow 0\), and \(h_{2}^{2}\sqrt{Nh_{2}}\rightarrow B\in [0,\infty )\),
where \(f^{+}\equiv f_{S}(0^{+})\), \(f^{-}\equiv f_{S}(0^{-})\), \(\theta \equiv \ln f^{+}-\ln f^{-}\) and ‘\(^{\prime \prime }\)’ denotes the second derivative. With under-smoothing, \(h_{2}^{2}\sqrt{Nh_{2}}\rightarrow 0\) and the asymptotic bias can be ignored, which is what is done in practice. Choosing the two bandwidths \(h_{1}\) and \(h_{2}\) is a problem. McCrary (2008) stated that the choice of \(h_{1}\) does not matter much whereas the choice of \(h_{2}\) does, although this is questionable. McCrary (2008, p. 705) suggested \(h_{1}=2SD(S)N^{-1/2}\); \(h_{2}\) may then be chosen by visual inspection.
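A minimal sketch of the bin-width rule and the second stage follows. The LLR and test-statistic displays are not reproduced above, so the details are assumptions: each intercept comes from a triangular-kernel weighted least squares fit of \(R_{j}\) on \((1,G_{j})\) on one side of 0, and the standard error uses the factor 24/5 as recalled from McCrary (2008) for the triangular kernel, ignoring the bias term under under-smoothing. Both should be checked against the original paper. The function names are hypothetical.

```python
import numpy as np


def mccrary_bin_width(s):
    """First-stage bin width h1 = 2 * SD(S) * N^(-1/2), as suggested in the text."""
    s = np.asarray(s, dtype=float)
    return 2.0 * s.std(ddof=1) * s.size ** (-0.5)


def mccrary_second_stage(G, R, n_obs, h2):
    """Log density difference at 0 and its standard error (illustrative sketch)."""
    G = np.asarray(G, dtype=float)
    R = np.asarray(R, dtype=float)

    def intercept(mask):
        g, r = G[mask], R[mask]
        w = np.maximum(0.0, 1.0 - np.abs(g) / h2)  # triangular kernel weights
        keep = w > 0
        design = np.column_stack([np.ones(keep.sum()), g[keep]])
        root_w = np.sqrt(w[keep])
        # Weighted least squares via rescaling rows by the square root of the weights.
        coef, *_ = np.linalg.lstsq(design * root_w[:, None], r[keep] * root_w, rcond=None)
        return coef[0]

    f_minus = intercept(G < 0)
    f_plus = intercept(G > 0)
    theta_hat = np.log(f_plus) - np.log(f_minus)
    # ASSUMPTION: variance factor 24/5 for the triangular kernel, bias term ignored.
    se = np.sqrt(24 / 5 / (n_obs * h2) * (1 / f_plus + 1 / f_minus))
    return theta_hat, se
```

The ratio of the two returned values can be compared with standard normal critical values; a large absolute ratio suggests a discontinuity in the score density at the cutoff.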
Choi, Jy., Lee, Mj. Regression discontinuity: review with extensions. Stat Papers 58, 1217–1246 (2017). https://doi.org/10.1007/s00362-016-0745-z