Abstract
The Moving to Opportunity (MTO) experiment randomly assigned housing vouchers that could be used in low-poverty neighborhoods. Consistent with the literature, I find that receiving an MTO voucher had no effect on outcomes like earnings, employment, and test scores. However, after studying the assumptions identifying neighborhood effects with MTO data, this paper reaches a very different interpretation of these results than found in the literature. I first specify a model in which the absence of effects from the MTO program implies an absence of neighborhood effects. I present theory and evidence against two key assumptions of this model: that poverty is the only determinant of neighborhood quality and that outcomes only change across one threshold of neighborhood quality. I then show that in a more realistic model of neighborhood effects that relaxes these assumptions, the absence of effects from the MTO program is perfectly compatible with the presence of neighborhood effects. This analysis illustrates why the implicit identification strategies used in the literature on MTO can be misleading.
Notes
My measure of quality is a normalization of the first principal component of these variables, or the one-dimensional vector explaining the most variation in these variables.
Section 8 vouchers pay part of a tenant’s private market rent. Project-based assistance gives the option of a reduced-rent unit tied to a specific structure.
This is the author’s current interpretation of the literature, most prominently represented by Kling et al. (2007a) and Ludwig et al. (2008). However, the distinction between program and neighborhood effect parameters has not always been made clearly. Some studies do seem to equate program effects with neighborhood effects, even when using this indirect logic. Early examples where this distinction is unclear are Ludwig et al. (2001) and Kling et al. (2005), and more recent examples include Ludwig et al. (2013), Sanbonmatsu et al. (2012), and Gennetian et al. (2012).
This interpretation of the results from MTO can be found in Kling et al. (2007a), Ludwig et al. (2013, pp. 228–229), Angrist (2014, p. 106), Angrist and Pischke (2010, p. 4). Some preliminary instrumental variable analysis can be found in Ludwig et al. (2008), and recent papers like Aliprantis and Richter (2016) and Pinto (2014) that have estimated neighborhood effects models using the MTO data have found evidence of neighborhood effects on adult employment.
State 18 describes a state of the world in which an individual will be employed regardless of the neighborhood in which they reside, yet receiving an MTO voucher will cause them to become employed. State 19 implies that an individual will be employed regardless of the neighborhood in which they reside, yet receiving an MTO voucher will cause them to become unemployed. Finally, State 20 describes a state of the world in which the individual is both always employed (columns 3 and 4) and never employed (columns 5 and 6), which simply cannot happen in our model as structured.
Aliprantis and Richter (2016) is one example of neighborhood effects estimated under weaker assumptions than NQB and NQP in which the estimated effects contradict conclusion (\(^\star \)).
While using an MTO voucher did initially require moving to a neighborhood with particular poverty characteristics (<10%), this requirement only had to be met for 1 year. Since subsequent moves were frequent, often involuntary, and tended to be to low-quality neighborhoods (de Souza Briggs et al. 2010; Sampson 2008), the initial MTO move does not capture the entire sequence of neighborhood characteristics, even when measured by poverty alone. Here I measure mobility using residence at the time of the interim evaluation, but other ways of dealing with dynamics, whether within the static models discussed here or within an expanded dynamic model, could also be appropriate.
A discussion related to Assumption NQB can also be found in Angrist and Imbens (1995).
An alternative and complementary approach is to use an unordered choice model as in Pinto (2014).
To be precise, the model in Kling et al. (2007a) is the limit of this model as \(J \rightarrow \infty \). Ludwig and Kling (2007) estimate a similar model with poverty replaced by beat crime rate. Effects in these analyses are constant in U under the specification in Eq. 3 since they assume \(U_j=U\) for all \(j \in \{1, \ldots , J\}\), so \(U_{j+1, i} - U_{j, i} = U_i - U_i = 0\).
Weights are used for two reasons. First, random assignment ratios varied both from site to site and over different time periods of sample recruitment. Randomization ratio weights are used to create samples representing the same number of people across groups within each site-period. This ensures neighborhood effects are not conflated with time trends. Second, sampling weights must be used to account for the subsampling procedures used during the interim evaluation data collection.
Nevertheless, race will be correlated with the neighborhood characteristics causally affecting outcomes due to the history of racial discrimination in the USA. Aliprantis and Kolliner (2015) study race and neighborhood characteristics in the context of MTO.
It is worth noting that the same general conclusion also holds in models assuming NQP. For example, Quigley and Raphael (2008) point out that “The effect of treatment under the MTO program was, on average, to move households in the five MTO metropolitan areas from neighborhoods at roughly the 96th percentile of the neighborhood poverty distribution to neighborhoods at the 88th percentile” (p. 3).
DeLuca and Rosenbaum (2003) find that 66% of the suburban group and 13% of the city group lived in the suburbs of Chicago 14 years after original placement through Gautreaux. DeLuca and Rosenbaum (2003) cite limited availability of housing, rather than the choice to not move through the program, as the reason only 20% of eligible applicants moved through Gautreaux. This claim is based on evidence that 95% of participating households accepted the first unit offered to them. Furthermore, it is likely that Gautreaux induced larger changes in school quality than MTO (Rubinowitz and Rosenbaum 2000, p. 162). Taken together, this evidence suggests that Gautreaux induced more households into high-quality neighborhoods than MTO.
Note that NQK need not be adopted only in conjunction with NQJ. A version of Assumption NQB-NQK is adopted in Sampson et al. (2008) using a similar index of neighborhood quality to that used in this analysis.
Although this model of neighborhood effects has additional mechanisms relative to those typically included in models of social interaction, such models are still useful to consider in this context. For example, Manski (1993) and Brock and Durlauf (2007) specify models relaxing SUTVA (a) and Manski (2013a) specifies a model relaxing SUTVA (b).
References
Aliprantis D (2015a) Covariates and causal effects: the problem of context. Federal Reserve Bank of Cleveland Working Paper 1310R
Aliprantis D (2015b) A distinction between causal effects in structural and rubin causal models. Federal Reserve Bank of Cleveland Working Paper 1505
Aliprantis D, Kolliner D (2015) Neighborhood poverty and neighborhood quality in the Moving to Opportunity experiment. Federal Reserve Bank of Cleveland Economic Commentary Number 201504. https://clevelandfed.org/~/media/content/newsroom%20and%20events/publications/economic%20commentary/2015/ec%20201504%20neighborhood%20poverty/ec%20201504%20neighborhood%20poverty%20and%20quality%20in%20the%20moving%20to%20opportunity%20experiment%20pdf.pdf?la=en
Aliprantis D, Richter FGC (2016) Evidence of neighborhood effects from Moving to Opportunity: LATEs of neighborhood quality. Federal Reserve Bank of Cleveland Working Paper no. 1208R. http://dionissialiprantis.com/pdfs/LATEs_of_nbd_quality_REStat1.pdf
Angrist JD (2014) The perils of peer effects. Labour Econ. doi:10.1016/j.labeco.2014.05.008
Angrist JD, Imbens GW (1995) Two-stage least squares estimation of average causal effects in models with variable treatment intensity. J Am Stat Assoc 90(430):431–442
Angrist JD, Pischke JS (2010) The credibility revolution in empirical economics: how better research design is taking the con out of econometrics. J Econ Perspect 24(2):3–30
Brock W, Durlauf S (2007) Identification of binary choice models with social interactions. J Econom 140(1):52–75
Carneiro P, Heckman JJ, Vytlacil EJ (2011) Estimating marginal returns to education. Am Econ Rev 101(6):2754–2781
Clampet-Lundquist S, Massey DS (2008) Neighborhood effects on economic self-sufficiency: a reconsideration of the Moving to Opportunity experiment. Am J Sociol 114(1):107–143
de Souza Briggs X, Popkin SJ, Goering J (2010) Moving to Opportunity: the story of an American experiment to fight ghetto poverty. Oxford University Press, Oxford
DeLuca S, Rosenbaum JE (2003) If low-income blacks are given a chance to live in white neighborhoods, will they stay? Examining mobility patterns in a quasi-experimental program with administrative data. Hous Policy Debate 14(3):305–345
Friedman M (1955) The role of government in education. In: Solo R (ed) Economics and the public interest. Rutgers University Press, New Brunswick
Gennetian LA, Sciandra M, Sanbonmatsu L, Ludwig J, Katz LF, Duncan GJ, Kling JR, Kessler RC (2012) The long-term effects of Moving to Opportunity on youth outcomes. Cityscape 14(2):137–167
Goering J (2003) The impacts of new neighborhoods on poor families: evaluating the policy implications of the Moving to Opportunity demonstration. Econ Policy Rev 9(2):113–140
Heckman JJ (2010) Building bridges between structural and program evaluation approaches to evaluating policy. J Econ Lit 48(2):356–398
Heckman JJ, Urzúa S, Vytlacil E (2006) Understanding instrumental variables in models with essential heterogeneity. Rev Econ Stat 88(3):389–432
Heckman JJ, Vytlacil E (2005) Structural equations, treatment effects, and econometric policy evaluation. Econometrica 73(3):669–738
Imbens G, Rubin D (2015) Causal inference for statistics, social, and biomedical sciences: an introduction. Cambridge University Press, Cambridge
Imbens GW, Angrist JD (1994) Identification and estimation of local average treatment effects. Econometrica 62(2):467–475
Keels M, Duncan GJ, Deluca S, Mendenhall R, Rosenbaum J (2005) Fifteen years later: can residential mobility programs provide a long-term escape from neighborhood segregation, crime, and poverty? Demography 42(1):51–73
Kling JR, Liebman JB, Katz LF (2007a) Experimental analysis of neighborhood effects. Econometrica 75(1):83–119
Kling JR, Liebman JB, Katz LF (2007b) Supplement to “Experimental analysis of neighborhood effects”: web appendix. Econometrica 75(1):83–119
Kling JR, Ludwig J, Katz LF (2005) Neighborhood effects on crime for female and male youths: evidence from a randomized housing voucher experiment. Q J Econ 120(1):87–130
Ludwig J, Duncan GJ, Gennetian LA, Katz LF, Kessler RC, Kling JR, Sanbonmatsu L (2013) Long-term neighborhood effects on low-income families: evidence from Moving to Opportunity. Am Econ Rev 103(3):226–231
Ludwig J, Duncan GJ, Hirschfield P (2001) Urban poverty and juvenile crime: evidence from a randomized housing-mobility experiment. Q J Econ 116(2):655–679
Ludwig J, Kling JR (2007) Is crime contagious? J Law Econ 50(3):491–518
Ludwig J, Liebman JB, Kling JR, Duncan GJ, Katz LF, Kessler RC, Sanbonmatsu L (2008) What can we learn about neighborhood effects from the Moving to Opportunity experiment? Am J Sociol 114(1):144–188
Manski CF (1993) Identification of endogenous social effects: the reflection problem. Rev Econ Stud 60(3):531–542
Manski CF (2011) Choosing treatment policies under ambiguity. Annu Rev Econ 3:25–49
Manski CF (2013a) Identification of treatment response with social interactions. Econom J 16(1):S1–S23
Manski CF (2013b) Public policy in an uncertain world: analysis and decisions. Harvard University Press, Cambridge
Mendenhall R, DeLuca S, Duncan G (2006) Neighborhood resources, racial segregation, and economic mobility: results from the Gautreaux program. Soc Sci Res 35(4):892–923
Minnesota Population Center (2004) National historical geographic information system (pre-release version 0.1 ed.). University of Minnesota, Minneapolis. http://www.nhgis.org
Orr LL, Feins JD, Jacob R, Beecroft E, Sanbonmatsu L, Katz LF, Liebman JB, Kling JR (2003) Moving to Opportunity: interim impacts evaluation. US Department of Housing and Urban Development, Office of Policy Development and Research, Washington, DC
Pearl J (2009) Causality: models, reasoning and inference, 2nd edn. Cambridge University Press, Cambridge
Pinto R (2014) Selection bias in a controlled experiment: the case of Moving to Opportunity. University of Chicago, Chicago
Polikoff A (2006) Waiting for Gautreaux. Northwestern University Press, Evanston
Quigley JM, Raphael S (2008) Neighborhoods, economic self-sufficiency, and the MTO program. Brook Whart Pap Urban Aff 8(1):1–46
Rosenbaum JE (1995) Changing the geography of opportunity by expanding residential choice: lessons from the Gautreaux program. Hous Policy Debate 6(1):231–269
Rubin DB (1978) Bayesian inference for causal effects: the role of randomization. Ann Stat 6(1):34–58
Rubinowitz LS, Rosenbaum JE (2000) Crossing the class and color lines: from public housing to white suburbia. University of Chicago Press, Chicago
Sampson RJ (2008) Moving to inequality: neighborhood effects and experiments meet social structure. Am J Sociol 114(1):189–231
Sampson RJ (2012) Great American city: Chicago and the enduring neighborhood effect. The University of Chicago Press, Chicago
Sampson RJ, Sharkey P, Raudenbush SW (2008) Durable effects of concentrated disadvantage on verbal ability among African-American children. Proc Natl Acad Sci USA 105(3):845–852
Sanbonmatsu L, Kling JR, Duncan GJ, Brooks-Gunn J (2006) Neighborhoods and academic achievement: results from the Moving to Opportunity experiment. J Hum Resour 41(4):649–691
Sanbonmatsu L, Marvakov J, Potter NA, Yang F, Adam E, Congdon WJ, Duncan GJ, Gennetian LA, Katz LF, Kling JR, Kessler RC, Lindau ST, Ludwig J, McDade TW (2012) The long-term effects of moving to opportunity on adult health and economic self-sufficiency. Cityscape 14(2):109–136
Sobel ME (2006) What do randomized studies of housing mobility demonstrate? Causal inference in the face of interference. J Am Stat Assoc 101(476):1398–1407
Votruba ME, Kling JR (2009) Effects of neighborhood characteristics on the mortality of black male youth: evidence from Gautreaux, Chicago. Soc Sci Med 68(5):814–823
Wilson WJ (1987) The truly disadvantaged: the inner city, the underclass, and public policy. University of Chicago, Chicago
Acknowledgements
I thank Francisca G.C. Richter, Jeffrey Kling, my Math Corps students, and several seminar participants and anonymous referees for contributing to this paper. I am also grateful to Mary Zenker for research assistance and Paul Joice at HUD for help accessing the data. The research reported here was supported in part by the Institute of Education Sciences, US Department of Education, through Grant R305C05004105 to the University of Pennsylvania. The views stated herein are those of the author and are not necessarily those of the Federal Reserve Bank of Cleveland, the Board of Governors of the Federal Reserve System, or the US Department of Education.
Appendices
Appendix 1: Full contingency table for states of world
See Table 7.
Appendix 2: The neighborhood effects identified by MTO
Effects from moving to high-quality neighborhoods are not identified by MTO. Given the evidence in Sect. 5.2.2, any definition of treatment of the form D2 would have to restrict measures of quality to the lower half of the national distribution of neighborhood quality to satisfy assumption A5.
Once the focus on quality is restricted to accommodate A5, we can see that A5 appears more reasonable than A5\(^*\), as it is likely that some households will move to a relatively high-quality neighborhood regardless of whether they receive a voucher through MTO or not. Under assumptions (A1–A6, EH, D2-NQB) the Wald estimator identifies the LATE:
If we believe assumption A2 will fail to hold when treatment is defined under D2-NQB for the reasons discussed in Sect. 5.2.1, we could alternatively define treatment under D2-NQJ to generate a transition-specific analogue to Eq. 14:
Versions of the model have been estimated in Kling et al. (2007a) and Ludwig et al. (2008) under (A1–A6, SI, and D2-NQJ-NQP). A dose–response analysis is used in Kling et al. (2007a) to determine whether parameters are constant across all j to \(j+1\) transitions in \(\{1, \ldots , J \}\). Aliprantis and Richter (2016) estimate the model under (A1–A6, EH, D2-NQJ-NQK). That analysis makes A2 more plausible by relaxing D2-NQJ-NQP to D2-NQJ-NQK and allows for the identification and estimation of LATEs that are heterogeneous over unobservables by relaxing SI to EH.^{Footnote 18}
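As a rough illustration of the Wald estimator referred to in this appendix, the following Python sketch computes the ratio of the reduced-form difference in mean outcomes to the first-stage difference in treatment rates, optionally with weights. It assumes a binary instrument Z (voucher receipt), a binary treatment D (residence in a high-quality neighborhood), and a binary outcome Y (employment), as in M1–M3. The function name `wald_estimate`, the weight argument `w`, and the toy data are hypothetical and are not taken from the MTO analysis.

```python
import numpy as np

def wald_estimate(y, d, z, w=None):
    """Weighted Wald ratio: (E[Y|Z=1] - E[Y|Z=0]) / (E[D|Z=1] - E[D|Z=0]).

    y, d, z are equal-length arrays; w holds optional nonnegative weights
    (hypothetical stand-ins for randomization-ratio or sampling weights).
    """
    y, d, z = (np.asarray(a, dtype=float) for a in (y, d, z))
    w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)
    treated, control = z == 1, z == 0

    def wmean(x, mask):
        # Weighted mean of x within one assignment group
        return np.average(x[mask], weights=w[mask])

    first_stage = wmean(d, treated) - wmean(d, control)
    reduced_form = wmean(y, treated) - wmean(y, control)
    if np.isclose(first_stage, 0.0):
        raise ValueError("no first stage: Z does not shift D")
    return reduced_form / first_stage

# Toy data, purely illustrative (not MTO data)
z = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 1, 0, 1, 0, 0, 0]
y = [1, 1, 0, 1, 1, 0, 0, 1]
estimate = wald_estimate(y, d, z)
print(estimate)
```

The ratio is only interpretable as a LATE under the assumptions stated in the text. The code itself says nothing about whether those assumptions hold.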
Appendix 3: Assumptions about the distribution of unobservables
The interpretation of the treatment effect parameters will be very different depending on the assumptions we make about the relationship between the unobservables in the model. Ignorability is a standard assumption made in the statistics and econometrics literature about the relationship between the unobservable component determining selection into treatment and those determining potential outcomes. Ignorability is fundamentally an assumption about what the econometrician is able to observe; it is that the econometrician can observe all characteristics connecting selection into treatment with treatment effect heterogeneity. Although this assumption may be unrealistic in many applications, it is adopted frequently because it is helpful for identification for reasons that will be discussed shortly.
An implication of Ignorability is that conditional on observables, selection into treatment is not related to treatment effect heterogeneity. Formally, Ignorability can be written in our model as

Ig \((U_0, U_1) \perp \!\!\! \perp V \mid X\).
Imbens and Angrist (1994) showed it is possible to identify an interpretable parameter, the LATE, even if Ignorability fails. Recent work in Heckman and Vytlacil (2005), Heckman et al. (2006), and Carneiro et al. (2011) has further defined and estimated treatment effect parameters when relaxing the assumption of Ignorability by assuming that unobservable treatment effect heterogeneity is related to the unobservable determinants of selection into treatment. Formally, the assumption of Essential Heterogeneity is that

EH \(\hbox {COV}(U_1 - U_0, V) \mid X \ne 0\).
Figure 5 helps to illustrate the implications of Ig and EH. The top panel in the figure shows that average treatment effects are allowed to vary across observable characteristics. Ig and EH characterize different scenarios once we select a particular value of observable characteristics, \(x^*\). In the middle panel of the figure we see a scenario of Ig. The distributions of the potential outcomes must be independent of V given \(x^*\), so the levels of the potential outcomes must be constant across V given \(x^*\). The differences between these levels given \(x^*\) and \(U_D = F_V(V)\), the marginal treatment effects (MTEs), are thus constant for all \(U_D\) given \(x^*\).
The bottom panel of Fig. 5 shows a contrasting scenario of EH. In this scenario the difference \(U_1 - U_0\) is correlated with V, resulting in MTEs that vary across \(U_D\). In the example displayed the effect of treatment is large for low levels of V, while for large values of V the effect of treatment decreases. Given our latent index model, this implies that for the given observable characteristics \(x^*\), treatment effects are large for those who would be most likely to select into treatment in the absence of the program. Finally, Fig. 6 shows that while Ig and EH are mutually exclusive, they are not exhaustive since individuals might select on the level while not selecting on the gain.
The contrast in the role of instrumental variables under Ig versus EH is shown clearly in Fig. 5. Under Ig it does not matter who is induced into treatment by the instrument since all variation from Z identifies the same homogeneous parameter. Unlike EH, one might assume Ig and estimate parameters without the existence of an instrument, perhaps implemented with propensity score matching. In fact, it may appear to be superfluous to use an instrument in conjunction with the Ig assumption. This is not necessarily the case, though, as adding a valid instrument Z to the latent index in Eq. 5 can make Ig a more plausible assumption.
In contrast to Ig, under EH the selection into treatment induced by the instrument is of central interest for interpreting parameters. Since MTEs vary over the support of \(U_D\), the subinterval induced into treatment by the instrument will determine the parameter(s) identified by the instrument. Different instruments that induce different intervals of \(U_D\) into treatment will identify different parameters.
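The point that different instruments identify different parameters under EH, but the same parameter under Ig, can be sketched numerically. The Python example below treats a LATE as the average of the MTE over the interval of \(U_D\) that an instrument induces into treatment. The two MTE curves (`mte_ig`, constant, and `mte_eh`, linearly decreasing in \(U_D\)) and the two intervals are hypothetical choices made only for illustration; they are not estimates from the paper.

```python
import numpy as np

def late_from_mte(mte, lo, hi, n_grid=10_000):
    """Average a marginal treatment effect curve over the interval (lo, hi] of U_D."""
    if not 0.0 <= lo < hi <= 1.0:
        raise ValueError("need 0 <= lo < hi <= 1")
    # Midpoint rule on an evenly spaced grid
    edges = np.linspace(lo, hi, n_grid + 1)
    mids = 0.5 * (edges[:-1] + edges[1:])
    return float(np.mean(mte(mids)))

def mte_ig(u):
    # Hypothetical Ig-style curve: the MTE is constant in U_D
    return np.full_like(u, 0.5, dtype=float)

def mte_eh(u):
    # Hypothetical EH-style curve: the MTE falls as U_D rises
    return 1.0 - u

for name, mte in [("Ig", mte_ig), ("EH", mte_eh)]:
    # Two hypothetical instruments shifting different intervals of U_D into treatment
    late_a = late_from_mte(mte, 0.2, 0.4)
    late_b = late_from_mte(mte, 0.6, 0.8)
    print(name, round(late_a, 3), round(late_b, 3))
```

Under the constant curve both instruments recover the same number. Under the decreasing curve the instrument acting on low values of \(U_D\) recovers a larger effect than the one acting on high values.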
Appendix 4: External validity
Although external validity is the motivation for studying causal effects, and there is no clear reason for prioritizing internal validity over external validity (Manski 2013b), the literature has focused most formal attention on internal validity (Aliprantis 2015a). The text has adopted these priorities for the sake of publication, but here we also consider why estimated parameters will not be experiment invariant unless an assumption also holds that restricts the permissible types of peer effects (Sobel 2006). Interested readers are also directed to the careful discussions of these issues in Sobel (2006) and Ludwig et al. (2008).
Assumptions across and within individuals
The parameters in Sect. 4.1 are all defined conditional on the joint distribution (U, V) where we define \(U \equiv (U_0, U_1)\). Assumptions about how these random variables interact across individuals have implications for the joint distribution (U, V) and will change the interpretation of the parameters we have defined.
One possibility satisfying A6 is for X to be a bundle of individual-level characteristics including baseline neighborhood characteristics, with one element captured in the unobservables V being peer effects on the selection decision.^{Footnote 19} We now take some terminology from Sobel (2006) to consider the implications of changes to the distribution of V. We suppose the MTO experiment involves N individuals, that there are \(k_1\) people assigned to \(Z=1\), and that \(k_0 = N - k_1\) are assigned to \(Z=0\), here again abstracting from the Section 8 group for the sake of exposition. Let \(R(k_0,k_1)\) denote the set of possible realizations of such a randomization, with \(r \in R(k_0,k_1)\) denoting one possible realization. If peer effects determining selection into treatment are a part of V, then different realizations r may result in different distributions of V, which we write as \(F_{V \mid r}\). Returning to the fact that all of the parameters defined in Sect. 4.1 are defined assuming some distribution of (U, V), this implies that these parameters might be very different for some realization r compared to another realization \(r^{\prime }\) (Sobel 2006).
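To make the set \(R(k_0,k_1)\) concrete, the following Python sketch enumerates every assignment vector with exactly \(k_1\) individuals assigned to \(Z=1\) out of N. The function name `realizations` and the tiny values N = 4, \(k_1\) = 2 are hypothetical and chosen only so the full set can be listed.

```python
from itertools import combinations
from math import comb

def realizations(n, k1):
    """Yield each assignment vector r with exactly k1 ones among n individuals."""
    for treated in combinations(range(n), k1):
        r = [0] * n
        for i in treated:
            r[i] = 1
        yield tuple(r)

# Tiny illustrative case: N = 4 individuals, k1 = 2 assigned to Z = 1
all_r = list(realizations(4, 2))
for r in all_r:
    print(r)
print(len(all_r), comb(4, 2))
```

The number of realizations is the binomial coefficient of N and \(k_1\), so enumeration is only feasible for toy values. The argument in the text concerns how the distribution of V may differ across these realizations, not the enumeration itself.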
A standard assumption on the nature of peer effects resolves this problem by ensuring the effects defined in Sect. 4.1 are the same for all realized random assignments r. This assumption simply assumes there are no peer effects at all. In the context of our model, Angrist and Imbens (1995) state the stable unit treatment value assumption (SUTVA) from Rubin (1978) as

SUTVA (a) for all \(j \ne i\)

SUTVA (b) and for all \(j \ne i\)
Note that SUTVA is an assumption across different individuals. Under SUTVA, Ig and EH are primarily assumptions within individuals. In this case, unobservables are primarily thought to represent individual-level causal variables. Although (U, V) can represent social interactions under SUTVA, these social interactions cannot be related to treatment or assigned treatment.^{Footnote 20} When SUTVA is relaxed, however, Ig and EH become assumptions not only about individual-level causal variables, but also about social interactions.
A less restrictive assumption on peer effects that still keeps the effects in Sect. 4.1 identical across realizations of the randomization is that the distribution of peer effects will be identical under all realizations r. I label this as the stable peer effects assumption (SPEA):

SPEA
Note that neither SUTVA nor SPEA is necessary to define and estimate the parameters in Sect. 4.1. However, the model illustrates how the lack of such an assumption dramatically changes their interpretation. Since the distribution of peer effects included in V might change in different contexts, this could have very important consequences, both in terms of whether the parameters in the model are invariant to the realization of randomized voucher assignment (Sobel 2006) and in terms of parameter invariance to classes of policy interventions. Importantly, this discussion illustrates that, just like Ig or EH, parameter invariance is an assumption about the unobserved variables in the model.
Appendix 5: List of assumptions
Given the joint model of potential outcomes and selection into treatment:
with
there are several assumptions about the model considered throughout the paper. I list them here for the reader’s reference:

A1 \(\gamma _i = \gamma \) for all i and \(\gamma \ne 0\)

A2

A3 The distribution of V is absolutely continuous

A4 \(E[ Y(j) \mid X ] <\infty \) for all j

A5 \(0< Pr(D=j \mid X) < 1\) for all X, j

A6 \(X = X_j = X_k\) almost everywhere for all \(j \ne k\)

D1 Treatment is moving with the aid of the program (i.e., using an MTO voucher).

D2 Treatment is moving to a high-quality neighborhood.

M1 \(D_i \equiv \mathbf {1}\{\text {individual}\,i\,\text {lives in a high-quality neighborhood}\}\)

M2 \(Z_i \equiv \mathbf {1}\{\text {individual}\,i\,\text {received an MTO voucher}\}\)

M3 \(Y_i \equiv \mathbf {1}\{\text {individual}\,i\,\text {is employed}\}\)

NQB Neighborhood quality D is a binary function of a latent index of neighborhood quality q: \(D \equiv \mathbf {1}\{ q \ge q^* \}\)

NQJ Neighborhood quality D is a multivalued function of a latent index of neighborhood quality q: \(D \equiv j \times \mathbf {1}\{C_{j-1} < q \le C_{j} \}\)

NQP Neighborhood quality q is a one-dimensional vector that is a scalar function of neighborhood poverty p: \(q = \alpha p\)

NQK Neighborhood quality q is a one-dimensional vector that is a linear combination of K observable neighborhood characteristics: \(q = \alpha _1 X_1 + \cdots + \alpha _K X_K\)

SUTVA (a) for all \(j \ne i\)

SUTVA (b) and for all \(j \ne i\)

SPEA for randomization R
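The four neighborhood-quality definitions in this list (NQP, NQK, NQB, NQJ) can be sketched in a few lines of Python. The function names, the sign and size of the coefficients, and the convention that the lowest and highest bins are open-ended (\(C_0 = -\infty \), \(C_J = +\infty \)) are assumptions made for illustration and are not specified in the list above.

```python
import numpy as np

def quality_nqp(poverty, alpha=-1.0):
    """NQP: q = alpha * p. The default alpha = -1 is an illustrative choice."""
    return alpha * np.asarray(poverty, dtype=float)

def quality_nqk(X, alphas):
    """NQK: q = alpha_1 X_1 + ... + alpha_K X_K, one row of X per neighborhood."""
    return np.asarray(X, dtype=float) @ np.asarray(alphas, dtype=float)

def treatment_nqb(q, q_star):
    """NQB: D = 1{q >= q_star}."""
    return (np.asarray(q, dtype=float) >= q_star).astype(int)

def treatment_nqj(q, cutoffs):
    """NQJ: D = j when C_{j-1} < q <= C_j.

    cutoffs holds the interior thresholds C_1 < ... < C_{J-1}; the outer
    bins are assumed open-ended (C_0 = -inf, C_J = +inf).
    """
    cutoffs = np.asarray(cutoffs, dtype=float)
    # side="left" returns the index i with cutoffs[i-1] < q <= cutoffs[i]
    return np.searchsorted(cutoffs, np.asarray(q, dtype=float), side="left") + 1

# Illustrative values only
q = np.array([-0.3, 0.1, 0.5, 0.9])
print(treatment_nqb(q, q_star=0.5))
print(treatment_nqj(q, cutoffs=[0.0, 0.5]))
```

Swapping `quality_nqp` for `quality_nqk` changes which neighborhoods count as high quality without changing the treatment definitions, which is the sense in which NQP and NQK can each be combined with NQB or NQJ.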
Aliprantis, D. Assessing the evidence on neighborhood effects from Moving to Opportunity. Empir Econ 52, 925–954 (2017). https://doi.org/10.1007/s0018101611861
DOI: https://doi.org/10.1007/s0018101611861
Keywords
 Moving to Opportunity
 Neighborhood effect
 Program effect
JEL Classification
 C30
 H50
 I38
 J10
 R00