Sorting Around the Discontinuity Threshold: The Case of a Neighbourhood Investment Programme

This paper investigates the empirical validity of the setup of a large-scale government neighbourhood investment programme in the Netherlands. Selection of neighbourhoods into the programme was determined by their score on a predetermined index. At first sight this is a textbook example for the application of a regression discontinuity (RD) design to estimate the causal effect of the programme on neighbourhood outcomes. However, at the discontinuity threshold we find a large gap in the share of non-Western immigrants. In addition, the pattern of non-compliance with the assignment rule is consistent with investing in neighbourhoods with a high share of non-Western immigrants. Finally, the way of selecting neighbourhoods into the programme could be a likely explanation for the imbalance at the discontinuity threshold. This case illustrates that RD designs can become invalid even when treatment and control groups have no influence on the assignment.


NON-TECHNICAL SUMMARY
In the Netherlands, a neighbourhood investment programme was implemented in 2008. It consisted of large scale neighbourhood investments in social and physical infrastructure aimed at improving the living conditions in 40 disadvantaged neighbourhoods (Vogelaarwijken). In the period 2008-2011 the Dutch government invested 216 million Euros, while an additional amount of one billion Euros was invested by housing corporations.
Such investment programmes have been evaluated using several different econometric techniques. A popular way to estimate treatment effects is by making use of regression discontinuity (RD) designs. One of the main reasons for this is that variation around the cutoff value, which determines assignment to the treatment, can be considered as good as random. The reason for this is that those who take part in the programme have no control over the assignment. However, knowledge about the assignment rule might influence the assignment to the treatment and thereby invalidate the key assumption that individuals on either side of the discontinuity threshold are similar.
This research documents a case of sorting disadvantaged areas into the neighbourhood investment programme. Policymakers at the national level, who designed and implemented the assignment rules for the policy in disadvantaged neighbourhoods, sorted areas into and out of the programme in such a way that there exists a large discontinuity in the share of non-Western immigrants at the discontinuity threshold. At the threshold value for the assignment to the treatment we find a large and statistically significant gap in the proportion of non-Western immigrants of between 11 and 21 percentage points. Moreover, there exists noncompliance with a bias toward removing areas with lower shares of non-Western immigrants from the treatment group.
The violation of a continuous distribution around the discontinuity threshold of this important characteristic could be due to the way the selection process of neighbourhoods has been carried out. Politicians at the national level demanded that there had to be a list of 40 eligible neighbourhoods. To determine the 40 neighbourhoods, a two-step procedure has been used. In the first step, a preliminary list of 40 neighbourhoods was created based on the most disadvantaged postal code areas (PCAs) according to the 'quality' index. Because most neighbourhoods consist of multiple adjacent PCAs, policymakers sometimes merged PCAs with different rank numbers to create a neighbourhood. This opens possibilities of adding lower-ranked PCAs to an already identified neighbourhood. When we move down the list of PCAs, it is possible to add more PCAs beyond the point at which 40 geographical areas have been identified as neighbourhoods. This process continues until a PCA from a different geographical area is next on the list and would become neighbourhood number 41. We show that neighbourhood 41 is indeed in another city. In the second step, a number of PCAs were removed from and added to this list to obtain a final list of 40 eligible neighbourhoods. We show that the added neighbourhoods are not close to the discontinuity threshold.
We illustrate the bias of the RD estimates when using the official cut-off. We find that the estimates from RD models that do not take account of the endogenous sorting differ from the estimates from RD models that do account for the endogenous sorting. We also show that a different selection process of 40 neighbourhoods does not lead to a discontinuity in the share of non-Western immigrants. Finally, we cannot rule out that the result of selecting 40 neighbourhoods in this way is a case of bad luck. Using the same procedure to select 30 neighbourhoods does not yield the same discontinuities.

Introduction
Neighbourhood investment programmes target government transfers toward particular geographic areas rather than individuals (e.g., Glaeser and Gottlieb, 2008). These investment programmes have been evaluated using several different econometric techniques. A series of recent studies in this area have used regression discontinuity (RD) designs to estimate treatment effects. For example, Busso et al. (2013) evaluate the employment effects of the U.S. federal urban Empowerment Zone programme; Freedman (2015) studies the labour market effects of the New Markets Tax Credit programme in the United States; and Horn (2015) investigates the relationship between school quality and capital investments in the housing stock using a boundary discontinuity identification strategy.
RD designs are increasingly used by economists. One of the main reasons for this is that variation around the cut-off value, which determines assignment to the treatment, can be considered as good as random because those who take part in the programme have no control over the assignment (e.g., Lee, 2008). This inability to control or influence the assignment to the treatment suggests that the identifying assumptions required for a valid design are relatively weak (e.g., Hahn et al., 2001). However, public knowledge about the assignment rule might influence the assignment to the treatment and thereby invalidate the key assumption that individuals on either side of the discontinuity threshold are similar. Recent studies have considered the possibility of such "endogenous sorting" around the discontinuity threshold and have developed tools to examine its presence and consequences (e.g., Lee, 2008 and McCrary, 2008). In addition, a number of studies offer examples of sorting around the discontinuity threshold. In these examples, sorting appears to be driven by incentives for potential receivers of the treatment, such as home owners, parents/schools, tax payers or traders on financial markets, to select themselves into the treatment (e.g., Bayer et al., 2007, Urquiola and Verhoogen, 2009, Saez, 2010, Bubb and Kaufman, 2014 and Vogl, 2014).
This research adds a novel case to this relatively new literature, which urges caution in applying RD designs when there are opportunities for influencing the discontinuity threshold, by documenting a case of sorting disadvantaged areas into a large scale neighbourhood investment programme. The unique feature of our research is that sorting into the treatment group was impossible for units that were entitled to receive the treatment.
Policymakers at the national level, who designed and implemented the assignment rules for the policy in disadvantaged neighbourhoods, sorted areas into and out of the programme in such a way that there exists a large discontinuity in the share of non-Western immigrants at the discontinuity threshold.
The neighbourhood investment programme was implemented in 2008 and consisted of large scale neighbourhood investments in social and physical infrastructure aimed at improving the living conditions in disadvantaged neighbourhoods in the Netherlands. Approximately 4,000 postal code areas (PCAs) were ranked based on a 'quality' index, which was constructed by making use of eighteen different items. PCAs with the worst outcomes on the index were selected into the programme and received additional funds. In the end, 83 PCAs received funding from the programme. Together these 83 PCAs form 40 neighbourhoods. In the period 2008-2011 the Dutch government invested 216 million Euros in these 40 neighbourhoods, while an additional amount of one billion Euros was invested by housing corporations.
The assignment of PCAs to the programme based on the 'quality' index score is a textbook example for the application of a RD design for estimating the causal effect of the programme.
However, at the threshold value for the assignment to the treatment we find a large and statistically significant gap in the proportion of non-Western immigrants of between 11 and 21 percentage points depending on the specification. Moreover, there is non-compliance because twelve eligible PCAs have been excluded from the programme, whereas two others have been added to the treatment group. The observed pattern of non-compliance with the assignment rule shows a similar difference in the share of non-Western immigrants. These differences cannot be explained by endogenous sorting induced by local authorities, as they had no control over the assignment to the treatment. It also seems unlikely that a random threshold produces such large differences in the proportion of non-Western immigrants at the discontinuity threshold.
The violation of a continuous distribution around the discontinuity threshold of such an important baseline characteristic could be due to the way the selection process of neighbourhoods has been carried out. Politicians at the national level demanded that there had to be a list of 40 eligible neighbourhoods. To determine the 40 neighbourhoods, a two-step procedure has been used. In the first step, a preliminary list of 40 neighbourhoods was created based on the most disadvantaged PCAs according to the PCA 'quality' index. Because neighbourhoods can consist of multiple adjacent PCAs, policymakers sometimes merged PCAs with different rank numbers to create a neighbourhood. This opens possibilities of adding lower-ranked PCAs to an already identified neighbourhood. When we move down the list of PCAs, it is possible to add more PCAs beyond the point at which 40 geographical areas have been identified as neighbourhoods. This process continues until a PCA from a different geographical area is next on the list and would become neighbourhood number 41. We show that neighbourhood 41 is indeed in another city. In the second step, a number of PCAs were removed from and added to this list to obtain a final list of 40 eligible neighbourhoods. The added neighbourhoods are not close to the discontinuity threshold as we will show below.
We illustrate the bias of the RD estimates when using the official cut-off. We find that the estimates from RD models that do not take account of the endogenous sorting differ from the estimates from RD models that do account for the endogenous sorting. We also show that a different selection process of 40 neighbourhoods does not lead to a discontinuity in the share of non-Western immigrants. Finally, we cannot rule out that the result of selecting 40 neighbourhoods in this way is a case of bad luck. Using the same procedure to select 30 neighbourhoods does not yield the same discontinuities. Nevertheless, this set of estimates and our investigation of the selection process provide a new case of sorting around a discontinuity threshold in a situation where the units that might receive treatment have no control over their assignment to treatment.
We view our findings as a cautionary note regarding the use of RD designs. This conclusion does not only apply to the area of urban economics but applies in general to situations in which policymakers have control over the assignment to the treatment.

Background of the neighbourhood investment programme
In 2008 the Dutch government introduced a programme to improve the quality of life in disadvantaged neighbourhoods. Until 2011 the national government invested 216 million Euros in the programme, while housing corporations added about one billion Euros. The aim of the programme was to invest these resources in the most disadvantaged neighbourhoods in the country. The programme was an important part of the agenda of the newly appointed government and was instigated by the Labour Party (Partij van de Arbeid).
When the programme was announced in 2007, it received a great deal of media attention as it was one of the main spearheads of the newly established political coalition. A new ministry (the Ministry of Housing, Neighbourhoods and Integration) was established to, among other things, manage and monitor this programme. Statistics Netherlands was asked to deliver a range of statistics on the outcomes of treated neighbourhoods in an annual outcome monitor. In addition, government research organisations were asked to evaluate the effects of the policy and the Court of Audit monitored whether the funds were appropriately invested in the targeted areas.

Defining and ranking neighbourhoods
The neighbourhoods were created from PCAs that were ranked according to a 'quality' index.
For each of the selected neighbourhoods a tailor-made investment plan was developed. Some neighbourhoods invested in physical infrastructure, others spent more on reducing social problems. The Dutch government's Court of Audit made an elaborate overview and has assessed the expenditures (e.g., Court of Audit, 2008).
The PCA 'quality' index was constructed by making use of eighteen different items. These items cover socioeconomic disadvantages, physical disadvantages, and a range of social problems, such as nuisance, vandalism or insecurity, but also social problems in terms of poor housing, environmental pollution, heavy traffic, noise pollution and a lack of safety. The items were based both on measured socioeconomic variables and information about housing quality, and on surveys about nuisance and feelings of insecurity among residents (see Table A.1 in the Appendix). The scores on this index were collected at the PCA level. The ranking of PCAs was used to construct and thereafter select the most disadvantaged neighbourhoods. There are approximately 4,000 PCAs in the Netherlands.
The area of a single PCA is not always considered to define a neighbourhood. In many cases multiple, geographically adjacent PCAs form neighbourhoods. Together the selected PCAs formed 40 neighbourhoods that consist of 83 PCAs. This number of 40 was, according to the responsible politicians, a sound number of neighbourhoods to be able to guarantee a sufficiently large monetary investment, to carefully monitor progress and to pay regular visits. Table 1 shows the list of the 40 disadvantaged neighbourhoods and the 83 PCAs they consist of. Figure 1 shows a map of the Netherlands in which the 83 treated PCAs are highlighted in red. In most cases, disadvantaged neighbourhoods (PCAs) are located in the largest cities of the country. The vast majority of the neighbourhoods is concentrated in the four largest cities in the Randstad (i.e., Amsterdam, Rotterdam, The Hague and Utrecht). The PCAs in blue and green are control and non-compliance areas, respectively. We explain them below.

The process of selecting neighbourhoods
The consequence of the political decision to merge 83 PCAs to arrive at a number of 40 neighbourhoods is that PCAs with consecutive rank numbers (on the 'quality' index) are not necessarily geographically adjacent to each other. In most cases a neighbourhood consists of multiple PCAs with different rank numbers. Moreover, the geographical boundaries of (a collection of) PCAs often yield neighbourhoods that do not correspond to the official classification of neighbourhoods as defined by Statistics Netherlands (CBS). Figure 2 shows an example. It displays the neighbourhood Schilderswijk in The Hague, which, according to Table 1
The process to construct 40 neighbourhoods involved two steps. First, 40 neighbourhoods were constructed by moving down the list of PCAs. Since these neighbourhoods do not necessarily coincide with the official classifications of Statistics Netherlands but consist of adjacent PCAs, it is difficult to precisely reconstruct the exact scope of these initial 40 neighbourhoods. In the second step, policymakers removed and added PCAs to the list to arrive at a final list of 40 neighbourhoods. Table 2 shows the results. The table documents the worst 187 PCAs in the Netherlands according to the 'quality' index (we discuss the most salient details of the index in Section 3).
The first two columns display the rank number and PCA (the higher the rank, the worse the score on the 'quality' index). The third column shows the number of the neighbourhood the PCA has been assigned to. The fourth column displays the neighbourhood's name. The fifth column marks whether the PCA has been removed in the second step of the selection process.
We link these PCAs to a neighbourhood just as the policymakers linked the non-removed PCAs to neighbourhoods. That is, we reconstruct the preliminary list from the first step. If we move down Table 2, at least four observations stand out. First, and consistent with Figure 1, a number of PCAs have been put together to form one neighbourhood. For instance, PCAs 3086 (rank 2) and 3085 (rank 31) in Rotterdam form one neighbourhood (Zuidelijke Tuinsteden). This selection rule to define neighbourhoods leads to putting together PCAs into neighbourhoods until the 41st neighbourhood needs to be defined.
Second, the official cut-off is set at rank 93. Policymakers arrived at this point after removing 12 and adding 2 PCAs to the list in the second step of the selection process. The 12 removed PCAs are coloured green in Table 2. These areas are mostly touristic centres in which there is nuisance in terms of traffic and environmental pollution. We linked these PCAs to a neighbourhood. PCAs 7533 and 1024 have been added to the list. As can be seen, the cut-off lies at the point where 39 neighbourhoods have been identified. Including 7533 (Enschede Velve-Lindenhof) yields the 40th neighbourhood. PCA 1024 belongs to Amsterdam Noord, which was already defined. This shows the tendency of adding PCAs to already existing neighbourhoods.
Third, if the selection rule to define neighbourhoods was such that each single PCA would have been considered a neighbourhood, the point at which we can identify 40 'neighbourhoods', would have been at rank 40 (just after 2533 Den Haag Zuid-West).
Fourth, if we allow for the combination of adjacent PCAs into a single neighbourhood, and do not remove the twelve PCAs as the policymakers did in the second step, we arrive for the first time at 40 neighbourhoods at rank 80 (just after including 4827 Breda Geeren-Noord). Both 'reconstructed' cut-offs are different from the official cut-off. We analyse the outcomes of using different selection rules in Section 5. Finally, Figure 3 shows the relationship between the (scaled) 'quality' index of PCAs and the actual participation in the programme using the official cut-off (at row number 93). PCAs with scores above 0 are eligible to participate in the treatment, while PCAs with scores below 0 are not. Compliance and non-compliance with this assignment rule can be observed from
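The counting rules behind the two 'reconstructed' cut-offs can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure implemented by the policymakers: the list `toy_labels` is a made-up ranking (it is not taken from Table 2), and the function names `first_rank_reaching` and `last_rank_before_next` are hypothetical.

```python
from typing import Sequence


def first_rank_reaching(labels: Sequence[str], target: int) -> int:
    """Rank (1-based) at which `target` distinct neighbourhoods are reached for the first time."""
    seen = set()
    for rank, label in enumerate(labels, start=1):
        seen.add(label)
        if len(seen) == target:
            return rank
    raise ValueError("the list contains fewer than `target` neighbourhoods")


def last_rank_before_next(labels: Sequence[str], target: int) -> int:
    """Last rank that can be included before neighbourhood number `target` + 1 would be opened."""
    seen = set()
    last = 0
    for rank, label in enumerate(labels, start=1):
        if label not in seen and len(seen) == target:
            return last  # the next PCA would start a new neighbourhood
        seen.add(label)
        last = rank
    return last


# Toy ranking (hypothetical): each entry is the neighbourhood of the PCA at that rank.
toy_labels = ["A", "B", "A", "C", "B", "C", "D"]

print(first_rank_reaching(toy_labels, 3))    # three neighbourhoods first reached
print(last_rank_before_next(toy_labels, 3))  # keep adding PCAs until a fourth would be needed
```

Applied to a ranking such as the one in Table 2 with a target of 40, the first function mirrors the cut-off at which 40 neighbourhoods are identified for the first time, while the second mirrors the rule of adding PCAs to existing neighbourhoods until neighbourhood 41 would have to be defined.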

Data
The data for our empirical analysis are obtained from various sources. First, the ranking of PCAs and the score on the 'quality' index were obtained from ABF Research, the organisation that was asked by the government to construct the index. The 'quality' index will be used as the forcing variable for the assignment of PCAs to the programme in the RD model. We rescaled this variable in such a way that neighbourhoods with scores above 0 are eligible, while neighbourhoods with scores below 0 are not.
Table 3 compares the means of the outcomes and covariates for all 93 eligible PCAs to the right of the cut-off and the same number of ineligible PCAs to the left of the cut-off. We observe that in 2006, a year before the start of the programme, the eligible PCAs on average do worse on nearly all outcome measures. Moreover, these PCAs have much higher proportions of (non-Western) immigrants. In 2012, four years after the start of the programme, we observe a similar pattern for the differences on the outcome variables.

Empirical strategy
The selection of PCAs based on the 'quality' index is at first sight an opportunity for applying a RD design to evaluate the effects of the programme. The cut-off for assignment to the treatment generates variation that is expected to be exogenous because it is beyond the control of the treatment and control PCAs. As the central government decided on the construction of the 'quality' index and because this index was not announced or available beforehand, it can be expected that PCAs at both sides of the cut-off will be very similar. A comparison of the outcomes of PCAs close to the cut-off will then yield the causal effect of the neighbourhood programme. The basic assumption in this model is that the potential outcomes and characteristics of the PCAs are smooth around the cut-off.
This basic assumption can be investigated by performing balancing tests for the similarity of covariates or outcome variables before the start of the programme across the cut-off. These tests can be carried out by using a reduced form model as specified in equation (1):
$$Y_i = \gamma_0 + \gamma_1 Z_i + f(Q_i) + \varepsilon_i \tag{1}$$
where $Y_i$ is an outcome or covariate before the start of the programme of PCA $i$, $Q_i$ is the rescaled 'quality' index, $Z_i$ is a dummy variable that equals 1 if the 'quality' index is >0 and 0 if the 'quality' index is <0, and $\varepsilon_i$ are unobserved factors. $f(\cdot)$ is a smooth function of the 'quality' index, which is allowed to be different at either side of the cut-off ($f_l$ and $f_r$), as suggested by Lee and Lemieux (2010). The parameter $\gamma_1$ reveals whether or not the outcomes and covariates before the start of the programme are balanced across the cut-off. Statistically insignificant estimates of this parameter can be considered as support for the main assumption of the RD model.
If this main assumption holds, the causal effect of the programme can be estimated by making use of specifications that are very similar to equation (1). In case of full compliance with the assignment rule, which means that all PCAs with a 'quality' index score above (below) the cut-off (do not) enrol into the programme, the effect of the programme can be estimated using the following specification:
$$Y_i = \delta_0 + \delta_1 Z_i + f(Q_i) + \varepsilon_i \tag{2}$$
where $Y_i$ is now an outcome measured after the start of the programme. Identification of $\delta_1$ is based on the non-linear relationship between the 'quality' index and the allocation of resources around the cut-off.
However, the selection of PCAs into the programme did not fully comply with the assignment rule. This non-compliance can be dealt with in an instrumental variable (IV) approach. The causal effect of the programme can be estimated by using the dummy for the assignment rule ($Z_i$) as an instrument for participation in the programme ($P_i$) in a two-stage least squares (2SLS) approach. The first and second stage equations in this approach are
$$P_i = \alpha_0 + \alpha_1 Z_i + f(Q_i) + u_i \tag{3}$$
$$Y_i = \beta_0 + \beta_1 \hat{P}_i + f(Q_i) + \varepsilon_i \tag{4}$$
where $\hat{P}_i$ is the predicted probability of equation (3). Estimates of the parameter $\beta_1$ yield the causal effect of the treatment for PCAs that comply with the assignment rule.
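The two-stage procedure described above can be illustrated with a small simulation. The sketch below is not the estimation code behind the tables in this paper: the data are simulated, the take-up rates (0.85 and 0.05) and the true effect (2.0) are arbitrary assumptions, a quadratic in the forcing variable on each side of the cut-off stands in for the smooth function of the 'quality' index, and the helper names `ols`, `design` and `fuzzy_rd_2sls` are hypothetical.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def design(q, d):
    """Constant, regressor d, and a quadratic in q fitted separately on each side of 0."""
    right = (q > 0).astype(float)
    left = 1.0 - right
    return np.column_stack(
        [np.ones_like(q), d, q * left, q**2 * left, q * right, q**2 * right]
    )


def fuzzy_rd_2sls(y, p, q):
    """Return (first-stage jump in participation, 2SLS estimate of the effect)."""
    z = (q > 0).astype(float)                  # assignment dummy: index above the cut-off
    first = ols(design(q, z), p)               # first stage: participation on assignment
    p_hat = design(q, z) @ first               # predicted participation
    second = ols(design(q, p_hat), y)          # second stage: outcome on predicted participation
    return first[1], second[1]


# Simulated data (assumption): imperfect compliance with the assignment rule.
rng = np.random.default_rng(0)
n = 5000
q = rng.uniform(-1.0, 1.0, n)                  # rescaled forcing variable, cut-off at 0
take_up = np.where(q > 0, 0.85, 0.05)          # hypothetical participation probabilities
p = (rng.uniform(size=n) < take_up).astype(float)
y = 2.0 * p + 1.5 * q + rng.normal(0.0, 1.0, n)

first_stage_jump, effect = fuzzy_rd_2sls(y, p, q)
print(f"first-stage jump: {first_stage_jump:.2f}, 2SLS effect: {effect:.2f}")
```

The point estimate from the manual second stage equals the 2SLS estimate, but standard errors computed naively from that second-stage regression would be incorrect; a dedicated IV routine would be needed for inference.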

Sorting around the threshold
The empirical strategy outlined in the previous section can be applied to estimate the causal effect of the programme when the potential outcomes behave smoothly around the cut-off for the assignment of the treatment.
To investigate this assumption we perform balancing tests for seven outcome variables measured a year before the start of the programme and for three covariates. For the balancing test, we estimate the reduced form model (equation (1)). To estimate the causal effects of the programme, we apply the 2SLS approach outlined in equations (3) and (4). Table 4 and Figure 4 show the results of the balancing tests for the seven main outcome variables that have been used to build the 'quality' index. We use a sample of 187 PCAs that includes all 93 PCAs to the right of the discontinuity threshold and 94 PCAs to the left of the cut-off. For each outcome in Table 4 we use a specification with a linear and a squared term of the forcing variable. We find that all reduced-form estimates are statistically insignificant. Similar results are found when we focus on a discontinuity sample closer to the cut-off (50 PCAs to the right and 50 PCAs to the left of the cut-off). The results for the seventh outcome variable 'quality of life', which is based on the six outcomes used in Table 4, are also statistically insignificant (see last column in Table 4 and Figure 5). These findings suggest that the allocation of PCAs around the threshold is random, which supports the possibility and usefulness of applying a RD design.
In all our estimations we use the most conservative (i.e., largest) standard errors. In most cases these were obtained by only correcting for heteroskedasticity. We also experimented with clustered standard errors at the municipality level (39). This yields in most cases lower standard errors. The notes below each table with regression coefficients document which standard errors apply.
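A balancing test of this kind can be sketched as follows. The example is a hedged illustration on simulated data rather than a replication of Table 4: the two covariates `smooth` and `jumpy` are invented, the jump of 0.15 is an arbitrary assumption, the polynomial is allowed to differ on each side of the cut-off, and the heteroskedasticity-robust standard error uses the simple HC0 sandwich formula, which may differ from the correction used for the tables. The function name `balancing_test` is hypothetical.

```python
import numpy as np


def balancing_test(x, q):
    """Estimate the jump in covariate x at q = 0 and its HC0 robust standard error."""
    z = (q > 0).astype(float)
    # Constant, assignment dummy, linear and squared terms, and their interactions with z.
    X = np.column_stack([np.ones_like(q), z, q, q**2, z * q, z * q**2])
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ x
    resid = x - X @ beta
    meat = X.T @ (X * (resid**2)[:, None])     # sum of e_i^2 * x_i x_i'
    cov = xtx_inv @ meat @ xtx_inv             # HC0 sandwich covariance
    return beta[1], np.sqrt(cov[1, 1])


# Simulated data (assumption): one balanced covariate and one with a jump at the cut-off.
rng = np.random.default_rng(1)
n = 2000
q = rng.uniform(-1.0, 1.0, n)
z = (q > 0).astype(float)
smooth = 0.30 + 0.10 * q + rng.normal(0.0, 0.05, n)
jumpy = 0.30 + 0.10 * q + 0.15 * z + rng.normal(0.0, 0.05, n)

for name, x in [("smooth", smooth), ("jumpy", jumpy)]:
    jump, se = balancing_test(x, q)
    print(f"{name}: jump = {jump:.3f}, robust se = {se:.3f}, t = {jump / se:.1f}")
```

A covariate that is balanced across the cut-off should produce a jump estimate that is small relative to its standard error, whereas a covariate with a discontinuity should not.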

Balancing tests
Next to the indicators that should reveal information about the 'quality' of the neighbourhood, the composition of the population seems a natural indicator to investigate. Many of the PCAs that are selected into the treatment are located in the larger cities in the Randstad. It is well known that the population composition in these cities is different from cities outside this area. This does not have to be a problem if the comparison in the RD framework is between PCAs with similar characteristics, something we expect if the variation around the cut-off is as good as random. However, inspection of indicators of the composition of the population suggests a remarkable difference between the treatment and control PCAs at the cut-off. Table 5 shows balancing tests for three indicators of the composition of the population, which have somewhat surprisingly not been included in the 'quality' index. Depending on the specification, we observe that in 2006 the share of non-Western immigrants is between 11 and 21 percentage points higher in PCAs in the treatment group than in PCAs in the control group. For the smaller discontinuity sample of 100 PCAs we observe similar differences in the composition of the population. This gap in the proportion of non-Western immigrants implies a large increase of this proportion at the cut-off, as shown in Figure 6. The observed difference in the composition of the population implies that the basic assumption about smoothness around the discontinuity is unlikely to hold.

Non-compliance with the assignment rule
We next look at non-compliance of PCAs with the assignment rules. Twelve PCAs were eligible for participation but were excluded; two PCAs were ineligible but did receive the treatment. Table 6 shows descriptive statistics for these two groups. The first row shows that the two PCAs that were ineligible do better on the 'quality' index. It should also be noted that one of these two PCAs ranked as PCA number 210 in the original ranking. The second row in Table 6 shows, however, that the 'quality of the composition of the population' differs statistically significantly between the PCAs that did receive funds and the PCAs that were eligible but did not receive funds. Two of the other population indicators, 'percentage immigrants' and 'percentage non-Western immigrants', show the same picture. This pattern of non-compliance is similar to the previous findings from the balancing tests.

Balancing tests with alternative neighbourhood definitions/cut-offs
We next examine what happens to our balancing tests for non-Western immigrants when we choose different neighbourhood definitions and different cut-offs. We investigate what happens to the tests if we use (i) our reconstructed cut-off at the point at which we obtain 40 neighbourhoods for the first time (rank 80 in Table 2), (ii) the cut-off at which we obtain 40 PCAs for the first time (rank 40 in Table 2), (iii) the same strategy as the policymakers used, applied to a selection of 30 neighbourhoods (rank 63 in Table 2), (iv) the 'reconstructed' cut-off for 30 neighbourhoods (rank 55 in Table 2), and (v) the cut-off at which we obtain 30 PCAs for the first time (rank 30 in Table 2). Table 7 presents the results of this analysis. We draw two conclusions from the coefficients documented in this table. First, the coefficients of the balancing test of selecting 40 neighbourhoods in a different way show no discontinuity in the percentage non-Western immigrants. This suggests that removing PCAs that were eligible and adding PCAs until the point at which the 41st neighbourhood has to be selected yields a discontinuity. The reason for this is that the PCA which forms the 41st neighbourhood has to be different from the PCAs that together yield the first 40 neighbourhoods. If it had been similar, policymakers would have added the PCA to one of the existing 40 neighbourhoods. Second, when using the same procedure and our alternative procedures to select 30 neighbourhoods, we do not find discontinuities. This also holds for the case in which we keep on adding PCAs to neighbourhoods until we are forced to define neighbourhood 31. This suggests two things. First, we cannot rule out that the discontinuity is the result of a coincidence. Second, the difference between the treatment and control PCAs around the cut-off of 30 neighbourhoods seems to be absent because we are able to compare neighbourhoods from similar cities, mainly in the Randstad (e.g. around the cut-off at rank 55 or 63 a number of PCAs pertain to the largest four cities in the Randstad). By contrast, with the cut-off set at 40 neighbourhoods, not one of the first six PCAs after the cut-off pertains to the Randstad. This seems to be a major reason for the discontinuity we observe at the cut-off.

Illustration of 'invalid' RD
Endogenous sorting around the discontinuity threshold invalidates the application of a RD design because the assignment of the treatment to PCAs just below or above the threshold value no longer can be considered to be (conditionally) independent. We conduct two types of analysis. First, we show the potential bias in outcomes of the RD model when we use the official cut-off and the discontinuity in the share of non-Western immigrants is not taken into account. Second, we examine what happens to the RD estimates when we control for the share of non-Western immigrants. Table 8 investigates this. The first RD model does not take into account the proportion of non-Western immigrants (columns (1), (3) and (5)), the second model controls for this variable (columns (2), (4) and (6)). However, non-Western immigrants are more likely to vote for the Labour Party, and we find that the estimated effect reduces towards zero after taking account of this population difference. In the last two columns we also find a large difference between the two estimates, varying between an increase of Labour Party voters in 2012 with 5.4 percentage points and a decrease of 4.3 percentage points.
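The mechanism behind this comparison can be made concrete with a stylised simulation. The sketch below does not reproduce Table 8: all numbers are assumptions chosen for illustration (a jump of 0.15 in a population share at the cut-off, an outcome that depends on that share, and a true programme effect of zero), and the function name `rd_jump` is hypothetical. It shows how a discontinuity in an omitted covariate can appear as a treatment effect, and how the estimate moves towards zero once the covariate is controlled for.

```python
import numpy as np


def rd_jump(y, q, controls=None):
    """Jump in y at q = 0 from a linear specification with side-specific slopes."""
    z = (q > 0).astype(float)
    cols = [np.ones_like(q), z, q, z * q]
    if controls is not None:
        cols.append(controls)                  # optionally add a covariate as a control
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]


# Simulated data (assumption): the programme itself has no effect on the outcome.
rng = np.random.default_rng(2)
n = 4000
q = rng.uniform(-1.0, 1.0, n)
z = (q > 0).astype(float)
share = 0.30 + 0.10 * q + 0.15 * z + rng.normal(0.0, 0.05, n)   # covariate jumps at cut-off
outcome = 0.20 + 0.50 * share + 0.05 * q + rng.normal(0.0, 0.02, n)

naive = rd_jump(outcome, q)
controlled = rd_jump(outcome, q, controls=share)
print(f"without control: {naive:.3f}, with control: {controlled:.3f}")
```

In this stylised setting the uncontrolled estimate picks up the covariate's jump multiplied by its effect on the outcome, which is the type of bias the comparison of the two RD models is meant to reveal.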

Lessons
This paper documents a case of endogenous sorting around the discontinuity threshold for assigning neighbourhoods to a large-scale investment programme. Selection of neighbourhoods into the programme was determined by their score on a 'quality' index. At first sight this seems to be a textbook example for the application of a RD model aimed at estimating the causal effect of the programme. The forcing variable was constructed from eighteen indicators on socioeconomic or housing disadvantages, social problems and safety issues, and neighbourhoods themselves had no control over the assignment to the treatment. However, at the cut-off for assignment to the programme, we find a remarkably large difference in the proportion of non-Western immigrants, a variable not taken into account in the 'quality' index. We also find that the pattern of non-compliance with the assignment rule seems consistent with investing in neighbourhoods with a high share of non-Western immigrants. These remarkable differences cannot be explained by endogenous sorting induced by PCAs themselves, as they had no control over the assignment to the treatment. It also seems highly unlikely that random sorting of neighbourhoods will produce such large differences in the proportion of non-Western immigrants at the cut-off.
We find that this non-random sorting may bias the RD estimates. Despite the differences in the proportion of non-Western immigrants at the discontinuity threshold, both policymakers and researchers have used the cut-off to analyse the effects of the neighbourhood investment programme. The Ministry of Housing, Spatial Planning and the Environment (currently the Ministry of the Interior), under whose supervision the neighbourhood investment programme was launched, has initiated several ways to review the progress of the programme. Several more descriptive reports on improvements in outcomes are also available. These reports aim to inform members of parliament about the progress of the programme (e.g., CBS, 2012). None of these reports has noticed or taken into account the difference in the proportion of non-Western immigrants at the discontinuity threshold. Researchers have likewise not taken this difference at the threshold into account. For example, Wittebrood and Permentier (2011) conclude that the share of non-Western immigrants is not increasing in treatment PCAs that focussed on the restructuring of housing.
Such a finding has been regarded as a positive signal of improvement, but our observation that the share of non-Western immigrants was already higher in the treatment PCAs before the programme started sheds a different light on this perceived success. In addition, a recent study by Permentier et al. (2013) uses the discontinuity threshold in an RD setting to evaluate the effects of the programme. This study does not take into account the difference in the share of non-Western immigrants, nor does it account for non-compliance with the assignment rule.
Based on our empirical analysis we have to be careful in concluding whether or not policymakers' preferences or political forces at the national level have contributed to the sorting patterns observed in the data. The simplest explanation for the observed sorting pattern is that the large discontinuity in the share of non-Western immigrants at the threshold is a coincidence. Indeed, several indicators were constructed to decide which PCAs would be eligible for treatment, and by coincidence there could be a discontinuity in the share of non-Western immigrants at exactly this threshold. Our analysis of the alternative of selecting 30 neighbourhoods with the same criterion does not rule out this possibility.
However, some observations suggest otherwise. First, the pattern of non-compliance with the assignment rule is consistent with selecting PCAs with more non-Western immigrants into the treatment and PCAs with fewer non-Western immigrants out of it. Second, the size of the difference at the threshold points to the selection of neighbourhoods in the Randstad relative to neighbourhoods in large cities in other parts of the country. Non-Western immigrants are concentrated in the Randstad. This selection seems to be the result of the selection rule to keep on adding PCAs to neighbourhoods until the threshold of 40 neighbourhoods set by the Minister was reached.
Overall, our results provide a new case of endogenous sorting around a threshold in a situation where the units that might receive treatment have no control over their assignment to the treatment. We view our findings as a cautionary note regarding the use of RD designs in situations in which policymakers are able to influence the assignment to the treatment.

Note: Each column is an IV-regression. The dummy for being a treated neighbourhood is instrumented by the dummy that equals 1 if the neighbourhood index >= 0. In columns (1) and (2) standard errors are corrected for heteroskedasticity. In columns (3)-(6) standard errors are clustered at the PCA-level, and % voted labour is weighted by number of votes cast at the ballot box. Quadratic polynomial fitted in forcing variable, based on Akaike Information Criterion in reduced form. *** p<0.01, ** p<0.05, * p<0.1
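
The IV specification described in the table note, estimated with and without the covariate, can be illustrated with a minimal sketch. It is not the estimation code used in this paper. The data are simulated, and every name and parameter value (iv_2sls, the sample size, the compliance rates, the size of the covariate jump) is a hypothetical choice made for illustration. The sketch also omits the weighting and the clustered standard errors mentioned in the note. By construction the true treatment effect is zero and the outcome depends only on a covariate that jumps at the cut-off, so the two estimates should differ in the way the text describes.

```python
import numpy as np

def iv_2sls(y, d, z, controls):
    """Just-identified IV estimate of the coefficient on d, instrumented by z.

    controls is a list of exogenous regressors that enter both the
    regressor matrix and the instrument matrix.
    """
    const = np.ones_like(y)
    X = np.column_stack([d, const] + controls)  # d is the endogenous treatment dummy
    Z = np.column_stack([z, const] + controls)  # z replaces d as its instrument
    beta = np.linalg.solve(Z.T @ X, Z.T @ y)    # (Z'X)^{-1} Z'y
    return beta[0]

# Simulated data (all values are illustrative assumptions, not the paper's data)
rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(-1, 1, n)                 # forcing variable, cut-off at 0
z = (x >= 0).astype(float)                # assignment rule: index >= 0

# Fuzzy design: imperfect compliance with the assignment rule
d = (rng.uniform(size=n) < 0.1 + 0.8 * z).astype(float)

# Covariate that jumps at the threshold (stands in for the sorting on the
# share of non-Western immigrants)
w = 0.2 + 0.1 * x + 0.15 * z + rng.normal(0, 0.05, n)

# Outcome depends on the covariate only; the true treatment effect is zero
y = 0.3 + 0.5 * w + 0.05 * x + rng.normal(0, 0.05, n)

poly = [x, x**2]                          # quadratic polynomial in the forcing variable
effect_without = iv_2sls(y, d, z, poly)         # covariate ignored
effect_with = iv_2sls(y, d, z, poly + [w])      # covariate controlled for

print("RD estimate without covariate:", effect_without)
print("RD estimate with covariate:   ", effect_with)
```

In this simulation the estimate that ignores the covariate attributes the jump in the covariate to the treatment, while the estimate that controls for it is close to zero. Controlling for an imbalanced covariate only addresses sorting on that observed variable; it does not restore the validity of the design if other, unobserved characteristics also differ at the threshold.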