Introduction

Many societies wonder about the best policies to integrate immigrants. One important question is the regional allocation of immigrants. To prevent excessive ethnic concentration, many European countries have dispersal policies that assign refugees to regions (Dustmann et al. 2017). Existing evidence tends to suggest, though, that enclaves may in fact facilitate the labor market integration of immigrants (Schüller 2016), presumably through positive network effects within ethnic groups (Dustmann et al. 2016). The successful integration of immigrants into host country societies in the long run may depend on the intergenerational effects of ethnic concentration on the immigrants’ children. Immigrant children’s proficiency in the host country language and their educational attainment play a particularly important role in long-term employment opportunities and in cultural and social integration (e.g., Schnepf 2007; Dustmann and Glitz 2011; Chiswick and Miller 2015). On the one hand, children’s language acquisition as well as educational and labor market integration may benefit from ethnic enclaves that provide useful information, reduced discrimination, and positive role models, e.g., through ethnic entrepreneurs (Andersson et al. 2021). On the other hand, immigrant children may also potentially be hindered by limited exposure to native children, reduced options for language acquisition, and lower socioeconomic opportunities for their families. In this paper, we study the effect of regional ethnic concentration on the language proficiency and educational attainment of immigrant children.

Our analysis exploits the regional distribution of immigrants in 1975/1985 based on the placement policy of the German guest worker program. Between 1955 and 1973, the German government actively recruited foreign workers to fill low-skilled labor shortages. The guest workers were enlisted in various countries of origin and then quasi-exogenously placed across West German firms. Workers could not choose specific firms and firms could not choose specific workers, and firms had no prior information about workers’ specific skills or other characteristics. Once recruited, the program implied a guaranteed immigration permit and immediate placement and job start. Location preferences of guest workers were not considered, and they were not free to change location during a lock-in period. The guest worker population was mainly low-skilled: Only 7% had completed over 12 years of education (compared to 27% of other immigrants in Germany at the time), and more than one-third had left school without any degree. The skill level of guest workers was stable over the entire placement period (Marplan 1982).

The German Socio-Economic Panel (SOEP) allows us to extract a sample of roughly 1000 children whose parents immigrated into Germany from five different countries of origin during the period of the guest worker program. In contrast to administrative datasets, the SOEP household panel provides information on these children’s host country language proficiency, as well as their educational attainment. In addition, the SOEP contains rich information on parents’ speaking and writing abilities, friendships with Germans, and indicators for parents’ social and labor market integration that allows us to analyze factors that may mediate the effect of ethnic concentration on child outcomes. We merge the SOEP data on individual immigrant children with administrative data on the regional concentration of different ethnicities.

The initial regional assignment of guest workers gives rise to what we view as plausibly exogenous variation in ethnic concentration across regions, reducing the concern of potential bias from endogenous sorting of immigrants into enclaves of co-ethnics. A limitation of our data is that we do not observe guest workers’ initially assigned location but only their location in 1985, requiring the identifying assumption that guest workers did not self-select across regions between their arrival in the 1960s/1970s and 1985 in ways that are systematically related to our relationship of interest. We argue that this assumption is warranted in our setting by specific features of the German guest worker program: The work permits required guest workers to stay in the initially assigned region for a lock-in period of several years, and immediate integration into the labor market upon arrival minimized incentives to move to other regions. Two pieces of evidence further corroborate the assumption. First, we show that in 1985, demographics of guest worker parents and their children are balanced across regions with low and high ethnic concentration. Second, we show that in the 10 years after 1985, the movement of guest workers across regions was orthogonal to ethnic concentration and to our outcome measures.

To additionally account for any type of region-specific or ethnicity-specific differences, our models include region and ethnicity fixed effects. Region fixed effects ensure that any region-specific peculiarities are accounted for to the extent that they are common across guest worker ethnicities. Ethnicity (country-of-origin) fixed effects ensure that any ethnicity-specific differentials in integration are accounted for to the extent that they are common across regions. Thus, we identify the effect of ethnic concentration by observing immigrants who are exposed to differential concentrations of co-ethnics within the same region, thereby alleviating concerns of potential bias from endogenous location choices and from unobserved factors of different ethnic groups such as differing baseline willingness or disposition to integrate.

Our results indicate that growing up in ethnic enclaves significantly reduces immigrant children’s proficiency in the host country language and their educational attainment. In particular, a one log-point increase in the size of the own ethnic group in the region (equivalent, e.g., to increasing an ethnicity’s share in the regional population from 1.0% to 2.7%) leads to a reduction in the German speaking proficiency of the children of the guest worker generation by 19% of a standard deviation, and a reduction in the German writing proficiency by 17% of a standard deviation. In addition, a one log-point increase in exposure to own ethnic concentration increases the likelihood that the immigrant child drops out of school without any degree by 5.6 percentage points, a large effect given the average of 7.1%. Although less robust, there is some indication that ethnic enclaves also reduce the probability of obtaining an intermediate or higher school degree. Effects tend to be larger for those immigrant children who were born abroad, but they do not differ significantly by gender.
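The arithmetic behind these magnitudes can be checked in a few lines. The sketch below only reproduces the conversions implied by the text: a one log-point increase multiplies group size by e (about 2.72), and the dropout effect is set against the reported sample mean. The starting share of 1.0% is the illustrative value from the text, and the regional population is assumed to stay fixed.

```python
import math

# One log point multiplies the size of the ethnic group by e.
factor = math.exp(1.0)

# Illustrative starting share of 1.0% of the regional population,
# assuming the regional population itself does not change.
share_before = 1.0
share_after = share_before * factor

# Dropout effect relative to the sample mean reported in the text.
effect_pp = 5.6      # percentage points
mean_dropout = 7.1   # percent
relative_effect = effect_pp / mean_dropout

print(f"multiplicative factor: {factor:.3f}")
print(f"share after one log point: {share_after:.2f}%")
print(f"dropout effect relative to mean: {relative_effect:.0%}")
```

Read this way, the dropout estimate corresponds to roughly four-fifths of the average dropout rate in the sample.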

The rich background information on children and parents contained in the SOEP allows us to analyze several mediating factors. Potential mechanisms underlying the negative effect of growing up in ethnic enclaves include parents’ lower host country language proficiency, reduced interactions with natives, and lower wages and employment opportunities of immigrant parents. We find that differences in parents’ ability to speak the German language—which is strongly related to their children’s German language proficiency—can in fact account for much of the effect of growing up in ethnic enclaves. Once parental German speaking abilities are controlled for, the estimated effect of ethnic concentration on children’s language proficiency is reduced to close to zero. In this analysis, it proves essential to address measurement error in the self-reported parental language measure by implementing an instrumental variable approach that uses parents’ responses on the same survey item from consecutive years (leads and lags) as instruments (Dustmann and van Soest 2002). While measures of parental writing abilities, friendships with German children, visits from Germans at home, parental unemployment, and household income are also significantly related to immigrant children’s language proficiency, they do not account for the negative effect of ethnic concentration. None of the investigated mechanisms can account for the negative enclave effect on school dropout.
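The logic of instrumenting one noisy self-report with another report of the same item can be illustrated with a small simulation. This is a stylized sketch, not the estimation used in the paper: it assumes a single regressor, classical measurement error that is independent across survey years, and a hypothetical true coefficient `beta`. All variable names are illustrative.

```python
import random

def cov(a, b):
    """Sample covariance of two equally long lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

rng = random.Random(0)
n = 20000
beta = 0.5  # hypothetical true effect of parental proficiency

# Latent (unobserved) parental proficiency.
true_x = [rng.gauss(0, 1) for _ in range(n)]
# Two self-reports of the same item in adjacent years, each with its own noise.
report_t = [x + rng.gauss(0, 1) for x in true_x]
report_t1 = [x + rng.gauss(0, 1) for x in true_x]
# Child outcome depends on the latent proficiency, not on the noisy report.
y = [beta * x + rng.gauss(0, 1) for x in true_x]

# OLS on the noisy report is attenuated toward zero.
ols = cov(y, report_t) / cov(report_t, report_t)
# IV: the adjacent-year report instruments the current-year report.
iv = cov(y, report_t1) / cov(report_t, report_t1)

print(f"OLS slope: {ols:.3f}")
print(f"IV slope:  {iv:.3f}")
```

With equal variances of signal and noise, the OLS slope is pulled toward about half the true value, while the IV slope stays close to `beta`. The correction relies on the reporting errors in the two years being uncorrelated with each other.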

Our results are robust to a number of sensitivity analyses. We use alternative functional forms for the measure of ethnic concentration, instrument ethnic concentration at the time of observation (1985) by the ethnic concentration observed a decade earlier, use social-security and census data to construct the ethnic concentration measure, measure ethnic concentration at different levels of regional aggregation, and account for interview mode (which may influence self-assessed German language proficiency). We also show that after 1985, neither return migration nor regional migration within Germany was selective with respect to ethnic concentration and that ethnic concentration did not affect family size.

Our paper contributes to several strands of literature. A number of papers address the link between ethnic enclaves and the human capital acquisition of immigrant children. Grönqvist (2006) finds lower university attainment among refugees in enclaves in Sweden. Cortes (2006) detects no test score disadvantages for enclave schools in the USA. Jensen and Rasmussen (2011) find lower test scores among immigrant children in enclaves in Denmark. In the one previous study directly addressing bias from self-selection into ethnic enclaves, Åslund et al. (2011) use a refugee placement policy in Sweden and find that the concentration of highly educated co-ethnics positively affects the achievement of immigrant students in school. We contribute to this literature by providing novel evidence for a different group of immigrants (low-educated guest workers in Germany), by studying effects on language skills, and by finding that in a different setting, ethnic concentration can have a negative effect on immigrant children’s language proficiency and educational attainment.

Given the shared emphasis on identification, it is useful to provide a comparison of our study with the Swedish study by Åslund et al. (2011). The main differences relate to the target population and to the results. Guest workers in Germany arrived with a signed labor contract and were quasi-exogenously distributed across regions, whereas Sweden assigned refugees according to municipal integration capacities, which were related to labor market and educational opportunities. Since all guest workers in Germany were employed upon arrival, our study can switch off one potential channel of how enclaves affect human capital outcomes of children. While our setting focuses on children of a rather homogeneous group of low-skilled labor migrants, an important aspect of Åslund et al. (2011) is to focus on a heterogeneous group of partly well-educated refugees: The sample share who completed over 12 years of education in their setting (39%) is more than five times as large as in ours. Most obviously, our main finding also differs strongly from the Swedish study, indicating that existing evidence from humanitarian immigrants may not generalize to other settings and that any effect of ethnic concentration may strongly depend on the skill and employment levels of co-ethnics in the enclave.Footnote 1

A vast literature studies the effects of ethnic enclaves on the economic integration of adult immigrants (see Schüller 2016, for an overview). Using dispersal policies in Sweden and Denmark, respectively, Edin et al. (2003) and Damm (2009) find positive network effects of ethnic concentration on immigrants’ labor market outcomes. By contrast, studying the same setting as in our paper, Danzer and Yaman (2016) and Constant, Schüller, and Zimmermann (2013) find negative effects of ethnic concentration on adult immigrants’ proficiency in the host country language and their cultural integration, respectively (see Chiswick 1998, for similar results for Israel). In a different German setting, Battisti et al. (2016) find positive short-term but negative long-term effects of ethnic concentration on labor-market outcomes, with the negative effect being related to lower human capital investments and larger job mismatch. Generally, there is growing awareness that immigrants tend to assimilate as communities, emphasizing the importance of ethnic links (Hatton and Leigh 2011).

Beyond immigrant integration, another large literature studies the effect of spatial segregation and concentration on the economic success of racial minorities, usually finding negative effects (e.g., Cutler and Glaeser 1997; Fryer 2011). More generally, a growing literature studies the effect of exposure to different quality neighborhoods during childhood on children’s outcomes in the short and long run (e.g., Chetty et al. 2016; Chetty and Hendren 2018; Gibbons et al. 2013, 2017).

We contribute to this literature by estimating well-identified effects of growing up in low-skilled ethnic enclaves on the language proficiency and educational attainment of immigrant children and by providing a rich analysis of mediating factors. Our findings indicate that parents’ limited proficiency in speaking the host country language is a key mediating factor of the negative impact of ethnic enclaves on immigrant children’s language proficiency. By contrast, limited interaction with natives and parental economic conditions do not seem to be leading mechanisms. Overall, the opportunity to benefit from large social networks of co-ethnics may be particularly relevant for newly arriving immigrants, but less so for the long-term integration of the children of settled immigrants. In fact, most of the arguments in favor of ethnic enclaves tend to relate to the labor-market integration of adult immigrants but bear less relevance for integration beyond the labor market. Regarding the cultural and educational integration of the immigrants’ second generation, our results suggest that the dispersal policies of several European countries may indeed facilitate integration.

In what follows, Section 2 provides institutional background on the German guest worker program. Section 3 describes the SOEP household data and the administrative data used to compute ethnic concentrations. Section 4 introduces our empirical model and shows balancing of demographic characteristics across regions with low and high ethnic concentration. Section 5 presents our main results on the effect of ethnic concentration on immigrant children’s outcomes. Section 6 investigates the relevance of several potential mediating factors. Section 7 provides a number of robustness analyses. Section 8 concludes.

Institutional background on the German guest worker program

The German guest worker program was one of the largest guest worker programs worldwide. West Germany (hereafter, Germany) signed bilateral guest worker treaties with Italy in 1955, Greece and Spain in 1960, Turkey in 1961, and Yugoslavia in 1968. During a period of rapid economic growth in the 1960s and early 1970s, increasing demand for low-skilled workers induced a massive inflow of labor migrants to fill the numerous open positions in the economy. Given that all treaties were designed to attract low-skilled and mainly young workers, the guest workers constitute a rather homogeneous immigrant population that is, on average, less educated than the German workers. Due to the severe economic recession triggered by the oil crisis, Germany stopped the recruitment of guest workers in 1973. By that time, 2.6 million foreign workers were employed in Germany, implying that 12% of the labor force were foreigners (Federal Employment Agency 1974).

To take up employment, guest workers were required to hold a valid work permit (Arbeitserlaubnisbescheinigung). The formal process of obtaining this permit, which was similar for all source countries, was initiated at the foreign branches of the German Federal Employment Agency in the guest worker countries.Footnote 2 Having been recruited through advertisements, e.g., in cinemas and on TV in their home countries, potential guest workers had to show up at these branches. They were screened for basic literacy and underwent medical check-ups.Footnote 3 Then, guest workers were matched with German employers. Because they enrolled in their home countries, the guest workers never met a placement officer in Germany, could not pick a region or firm, and could not turn down a location offer without dropping out of the program.

The German employers submitted recruitment requests together with blank work contracts to their local labor offices, which forwarded them to the foreign branches after initial approval.Footnote 4 The firms received almost no information about their requested workers before arrival and in practice generally could not select workers based on job skills or country of origin (Feuser 1961; Fassbender 1966; and Voelker 1976). As recruited workers were meant to be manual workers, specific skills were not even recorded at application, so that placement in a specific firm was independent of any potential work experience or other skills of the worker.

Successful applicants got a work contract from a specific German company and a 1-year work permit that was only valid for employment at the specific firm (Feuser 1961). Recruited workers were then transferred to Germany in groups.Footnote 5 After having stayed with their initial employer for at least 2 years and in the same occupation (and, in practice, in the same region for most guest workers) for at least 5 years, guest workers could receive an upgrade of their work permit (Erweiterte Arbeitserlaubnisbescheinigung) that included free job choice (Dahnen and Kozlowicz 1963).Footnote 6

Given that the initial location in Germany depended on current labor demand, it was exogenous from the perspective of an individual guest worker. Unlike in settings where immigrants can potentially self-select into regions and ethnic enclaves (e.g., Bauer et al. 2005), the guest worker recruitment process generated exogenous variation in the concentration of ethnicities that allows us to estimate the effect of ethnic concentration on immigrant children’s outcomes. The regional variation in ethnic concentration differed across ethnicities for at least three reasons. First, regional labor demand fluctuated over time, and guest worker recruitment started in the different countries of origin at different points in time. For instance, the relative size of the guest worker population in North Rhine-Westphalia declined between 1962 and 1973, whereas the share of guest workers in Lower Saxony and Bremen increased (Federal Employment Agency 1964). Second, labor supply in the countries of origin varied over time. More guest workers were recruited from countries that had temporarily abundant labor supply, like Turkey in 1964 (Federal Employment Agency 1965). Third, while guest workers from Spain and Portugal arrived by train in Cologne in the West, the port of entry for the remaining guest worker groups (Greek, Italian, Turkish, and Yugoslav) was Munich in the South, generating additional variation in the ethnic concentration across nationalities.

In 1973, the guest worker recruitment was officially stopped. However, immigration of family members within the family reunification framework ensured high levels of inflows from guest worker countries also afterwards. Those family members immigrated on the basis of the Aliens Act of 1965 and were granted a residence permit when joining a guest worker family member.

Data

Our analysis uses individual-level information on guest workers and their children from the German Socio-Economic Panel (Section 3.1). We construct our main measure of ethnic concentration from a large employee sample of the Federal Employment Agency’s Research Institute (Section 3.2).

Survey data on guest workers and their children

We use information on guest workers and their children from the German Socio-Economic Panel (SOEP, version 30), a large annual household survey that is representative of the resident population in Germany (Goebel et al. 2019). The first SOEP wave in 1984 strongly oversampled guest workers (by a factor of four). As a consequence, 1393 of the 5921 SOEP households originated from the five guest worker countries, which comprised the largest foreigner populations in Germany at the time (Sample B). For each ethnicity, an independent random sample was drawn to allow for stand-alone analyses (Haisken-DeNew and Frick 2005). The SOEP contains detailed information on individual characteristics, including educational attainment and, for foreigners, self-reported German speaking and writing proficiencies.Footnote 7 The 1985 survey is the first wave that provides sufficient geographic information on the region of residence at the county level. Hence, we identify guest workers and their region of residence based on information in the 1985 wave. Using information from mothers’ birth biography and pointers to their partners in 1985, we link parents to their children.Footnote 8 While the SOEP does not contain a direct indicator of guest workers, we identify guest workers by their country of origin, year of immigration, and age at migration. The guest worker immigration differed from previous immigration experiences in Germany in that guest workers were predominantly male, young, and low-educated.Footnote 9

Our analysis sample consists of 1065 guest worker children with Greek, Italian, Spanish, Turkish, or Yugoslav background. To be included in the sample, children must have at least one parent who was aged 18 or older at immigration and who arrived in Germany during the period when the guest worker program with her/his home country was in place. We restrict the sample to children aged 13 or younger at migration since the focus of our study is to investigate the impact of the region where children grow up.Footnote 10 We keep only children with at least one observation for self-reported German language proficiency or one observation for educational attainment.Footnote 11

We measure children’s German language proficiency by two distinct outcomes: speaking proficiency and writing proficiency. Both language outcomes are self-reported and based on the following question: “In your opinion, how well do you speak and write German?” Answers are provided on a five-point scale: very well, well, fairly, poorly, and not at all. Children report their German language proficiency for the first time at the age of 17 or 18, i.e., when they are personally interviewed in the SOEP for the first time (see Appendix Figure 2 for the timing of variable measurements). An advantage of the panel data is that we have multiple observations of self-reported language proficiency for each child (five observations per child on average), resulting in a large sample of language proficiency observations. An additional advantage of the panel data is that we can address measurement error in parents’ language proficiency by instrumenting the self-reported language proficiency in a given year with their self-assessments in previous or succeeding years (see Section 6.1). In our sample of language proficiency, each observation is at the child-year level. This sample is based on the SOEP waves 1984–1987, 1989, 1991, 1993, and every 2 years from 1997 to 2005, including about 4900 child-year observations.Footnote 12 We standardize each outcome of children’s language proficiency to have mean 0 and standard deviation 1.
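The standardization step can be sketched as follows. The numeric coding of the five answer categories (0 to 4) and the use of the population standard deviation are assumptions made for illustration; the text only states that each outcome is standardized to mean 0 and standard deviation 1. The sample answers are invented.

```python
from statistics import mean, pstdev

# Assumed numeric coding of the five-point scale (not specified in the text).
SCALE = {"not at all": 0, "poorly": 1, "fairly": 2, "well": 3, "very well": 4}

def standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]

# Invented child-year answers, for illustration only.
answers = ["very well", "well", "well", "fairly", "poorly", "very well", "not at all"]
scores = [SCALE[a] for a in answers]
z_scores = standardize(scores)

for answer, z in zip(answers, z_scores):
    print(f"{answer:>10}: {z:+.2f}")
```

After this transformation, a coefficient of -0.19 on a standardized outcome reads directly as 19% of a standard deviation.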

Children’s educational attainment is also measured by two variables. The binary indicator “any school degree” equals 1 if the child obtained any type of school degree and 0 if the child dropped out of school without any degree. The binary indicator “at least intermediate school degree” equals 1 if the child obtained an intermediate school degree (Realschulabschluss) or an advanced degree (Abitur) and 0 otherwise.Footnote 13 Children’s educational attainment is based on the most recent available information in the SOEP.Footnote 14
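The two indicators are nested: every child with at least an intermediate degree also has "any school degree". A minimal sketch of the coding, using hypothetical category labels (`none`, `basic`, `intermediate`, `advanced`) that do not correspond to actual SOEP variable codes:

```python
# Hypothetical ordering of school-leaving outcomes; labels are illustrative.
DEGREE_RANK = {"none": 0, "basic": 1, "intermediate": 2, "advanced": 3}

def any_school_degree(degree):
    """1 if the child obtained any school degree, 0 if the child dropped out."""
    return int(DEGREE_RANK[degree] >= DEGREE_RANK["basic"])

def at_least_intermediate(degree):
    """1 for an intermediate (Realschulabschluss) or advanced (Abitur) degree."""
    return int(DEGREE_RANK[degree] >= DEGREE_RANK["intermediate"])

for degree in DEGREE_RANK:
    print(degree, any_school_degree(degree), at_least_intermediate(degree))
```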

Table 1 reports descriptive statistics of children’s outcomes and demographic characteristics of children and their parents, separately for regions with low and high ethnic concentration (split at the ethnicity-specific median of the share of ethnic concentration in 1985). Immigrant children living in regions with a high co-ethnic concentration report lower German speaking proficiency (statistically significant at the 12% level) and lower writing proficiency (significant at the 10% level) than immigrant children living in low co-ethnic concentration regions.Footnote 15 Consistent with this finding, immigrant children in regions with high co-ethnic concentration are significantly less likely to obtain a school degree and slightly (and statistically insignificantly) less likely to obtain at least an intermediate school degree.

Table 1 Descriptive statistics by degree of ethnic concentration

In terms of ethnicities, 37% of immigrant children in our sample are Turkish, 19% each are Italian and Yugoslav, 15% are Greek, and 10% are Spanish. We identify the ethnicity of the immigrant children primarily based on their first citizenship (94.2% of the children in our sample). In the case of a German citizenship or missing citizenship information, ethnicity is based on the children’s country of birth or their parents’ nationality (see Appendix Table 10 for definitions of all individual-level variables).Footnote 16 A slight majority of immigrant children in the sample (57.1%) were born in Germany. The average year of birth is 1971, and the average age at migration is 2.8 years.

The SOEP also contains a rich set of additional individual characteristics, including the immigration history, educational attainment, and labor-market outcomes of adults.Footnote 17 This wealth of information allows us to investigate several potential mediating factors that may drive the effects of ethnic concentration. As potential mediating factors, we investigate parents’ speaking and writing proficiencies in German, parents’ employment status, household income, visits from Germans at home, and whether the child’s first friend is German. Parents’ mediating factors are based on the average of mothers’ and fathers’ information.Footnote 18

Ethnic concentration

We compute measures of the concentration of co-ethnics in the region separately for the five guest worker nationalities (Greek, Italian, Spanish, Turkish, and Yugoslav) at the regional level of the so-called Anpassungsschichten (hereafter “areas”). Typically comprising several counties, these regions constitute regional labor markets. In West Germany, there were 103 areas in 1985 with an average population of about half a million people. While smaller geographic units may better reflect small-scale ethnic neighborhoods, a higher level of regional aggregation produces more conservative estimates and circumvents potential bias from the typical sorting of immigrants into close-by cities or across city districts. Since the degree of ethnic and social segregation in Germany is low compared to the USA (Musterd 2005), the area level is the most suitable level of analysis (Danzer and Yaman, 2016, discuss the trade-off between small and large units). In any case, our findings are fully confirmed in robustness analysis at the more fine-grained level of German counties.Footnote 19

For the measurement of ethnic concentration, we use the Sample of Integrated Labor Market Biographies (Stichprobe der Integrierten Arbeitsmarktbiografien, SIAB) of the Research Institute of the Federal Employment Agency (Institut für Arbeitsmarkt- und Berufsforschung, IAB). The SIAB is a 2% random sample of all individuals in Germany who are employed subject to social security, job seeking, or benefit recipients as contained in the Integrated Employment Biographies of the German social security system (Dorner et al. 2011). We use data from 1985, the year when guest workers’ region of residence is first observed in the SOEP data.

Ethnic concentration, our key explanatory variable, is measured by the logarithm of the size of the ethnic community in the region of residence in 1985 (see Appendix Table 11 for definitions of regional variables). In our regression analyses, region fixed effects control for the size of the overall population in a region. While it is common to measure ethnic concentration as the log size of the own ethnicity (e.g., Edin et al. 2003; Damm 2009; Åslund et al. 2011), we also show robustness of our results to using the share of the own ethnicity in the total regional population as an alternative measure (e.g., Chiswick 2009; Danzer and Yaman 2013, 2016). We match our measures of ethnic concentration to the individual-level SOEP data at the level of areas and ethnicities.
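A small sketch may help show why log size combined with region fixed effects already accounts for regional population. The counts and area names below are invented. The log of the share equals the log of the group size minus the log of the regional population, and the latter term is constant within a region, so a region fixed effect absorbs it. Note that the log share is used here only to illustrate this point; the alternative measure described in the text is the share itself.

```python
import math

# Invented counts of co-ethnics and total population by area.
group_size = {
    ("area_A", "Turkish"): 120,
    ("area_A", "Italian"): 45,
    ("area_B", "Turkish"): 30,
    ("area_B", "Italian"): 90,
}
population = {"area_A": 10000, "area_B": 6000}

def log_size(area, ethnicity):
    """Main measure: log of the size of the ethnic group in the area."""
    return math.log(group_size[(area, ethnicity)])

def share(area, ethnicity):
    """Alternative measure: the group's share of the area's population."""
    return group_size[(area, ethnicity)] / population[area]

for (area, ethnicity) in group_size:
    gap = log_size(area, ethnicity) - math.log(share(area, ethnicity))
    # The gap equals log(population) and is identical for all groups in an area.
    print(area, ethnicity, round(log_size(area, ethnicity), 3), round(gap, 3))
```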

The extensive demand-driven recruitment of guest workers generated substantial variation in ethnic concentrations across regions. Figure 1 shows the distribution of ethnic concentrations separately for each of the five ethnicities across the 103 West German areas in 1985 (see Appendix Table 12 for descriptive statistics). There are clear differences in the settlement structures between the guest worker ethnicities. For example, while Spanish guest workers tend to be concentrated in central Germany, Italians and Yugoslavs are more concentrated in the southern regions. In our analysis, we exploit the differential concentrations of ethnicities within regions.

Fig. 1 Ethnic concentrations across West Germany, 1985. Ethnic group’s share of the total population in the area of residence, 1985. Source: Institut für Arbeitsmarkt- und Berufsforschung (IAB). Own calculations of ethnic concentrations for 103 areas. Figures based on a historical GIS data file of the Federal Republic of Germany from the Max Planck Institute for Demographic Research and the Chair for Geodesy and Geoinformatics, University of Rostock (2011) and Bundesamt für Kartographie und Geodäsie (2011)

Being based on a 2% employee random sample, the SIAB measure of ethnic concentration may contain classical measurement error, biasing our estimates toward zero. In addition, if the regional share of co-ethnics in the employee sample does not reflect the ethnic concentration in the overall population (for example, because of differential labor-market participation rates), there may be non-classical measurement error. In robustness analyses, we therefore also use an alternative measure of ethnic concentration based on the 1987 German Census, which includes the entire population in Germany. The depth of the Census data also allows us to perform robustness analyses that define ethnic enclaves at the level of the 328 West German counties. A disadvantage of the 1987 Census is that it does not allow us to compute ethnic concentrations for Spanish guest workers (who are included in a residual “other citizenship” category), which reduces the sample size and excludes one of the five guest worker ethnicities. In addition, the ethnicity measure in the Census is based on citizenship information (as country of birth is not observed in the Census), and the 1987 Census measures ethnic concentrations 2 years later than the 1985 SIAB data. Appendix Figs. 3 and 4 depict the distribution of the Census-based measures of ethnic concentration separately for the four ethnicities at the level of areas and counties, respectively.

Empirical model

In this section, we discuss the basic setup of our empirical model (Section 4.1) and show the balancing of demographic characteristics of guest workers and their children across regions with low and high concentrations of co-ethnics (Section 4.2).

Model setup with region and ethnicity fixed effects

We aim to estimate the effect of ethnic enclaves on the language proficiency and educational attainment of immigrant children. Exploiting the regional distribution following the quasi-exogenous placement of guest workers, our basic model setup expresses immigrant children’s outcomes as a function of the concentration of their ethnicity in their region. Conditioning on fixed effects for ethnicities and regions, the model is identified from the concentration of an ethnicity in a particular region compared to the concentration of other guest worker ethnicities in the same region.
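The identification idea can be illustrated with a stylized simulation on a balanced grid of regions and ethnicities. This is a sketch under strong assumptions, not the paper's estimator: one observation per region-by-ethnicity cell, no covariates, a hypothetical true effect `beta`, and ethnic concentration that is deliberately correlated with the region effect. Subtracting region means and ethnicity means (and adding back the grand mean) removes both sets of fixed effects, so the slope is estimated only from how an ethnicity's concentration in a region deviates from that of other ethnicities in the same region.

```python
import random

rng = random.Random(1)
regions = range(30)
ethnicities = range(5)
beta = -0.2  # hypothetical true effect of ethnic concentration

region_fe = {r: rng.gauss(0, 1) for r in regions}
ethnic_fe = {c: rng.gauss(0, 1) for c in ethnicities}

ec, outcome = {}, {}
for r in regions:
    for c in ethnicities:
        # Concentration is correlated with the region effect (confounding).
        ec[(r, c)] = region_fe[r] + rng.gauss(0, 1)
        outcome[(r, c)] = (ethnic_fe[c] + region_fe[r]
                           + beta * ec[(r, c)] + rng.gauss(0, 0.1))

def double_demean(v):
    """Remove region and ethnicity means (two-way within transformation)."""
    grand = sum(v.values()) / len(v)
    r_mean = {r: sum(v[(r, c)] for c in ethnicities) / len(ethnicities)
              for r in regions}
    c_mean = {c: sum(v[(r, c)] for r in regions) / len(regions)
              for c in ethnicities}
    return {k: v[k] - r_mean[k[0]] - c_mean[k[1]] + grand for k in v}

def slope(y, x):
    """Bivariate least-squares slope of y on x."""
    keys = list(x)
    mx = sum(x[k] for k in keys) / len(keys)
    my = sum(y[k] for k in keys) / len(keys)
    num = sum((x[k] - mx) * (y[k] - my) for k in keys)
    den = sum((x[k] - mx) ** 2 for k in keys)
    return num / den

naive = slope(outcome, ec)
beta_fe = slope(double_demean(outcome), double_demean(ec))
print(f"pooled slope without fixed effects: {naive:.3f}")
print(f"slope with region and ethnicity fixed effects: {beta_fe:.3f}")
```

In this setup the pooled slope picks up the region effect, whereas the double-demeaned slope stays close to `beta`. The exact within transformation shown here relies on the grid being balanced; with unbalanced data one would include dummy variables or use an iterative procedure instead.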

When estimating the effect of ethnic enclaves on immigrant children’s host country language proficiency, we make use of the panel structure of the SOEP where immigrant children report their German language proficiency in multiple consecutive years. This allows estimating the following random effects model:Footnote 20

$${lang}_{icrt}=\sigma_{c}+\delta_{r}+\tau_{t}+\beta_{1}{EC}_{cr}+\mathbf{C}_{icr}^{\prime}\beta_{2}+\mathbf{P}_{icr}^{\prime}\beta_{3}+\mu_{icr}+\epsilon_{icrt}$$
(1)

where \({lang}_{icrt}\) is the German speaking or writing proficiency of child \(i\) of ethnicity (country of origin) \(c\) living in region \(r\) in year \(t\). The key explanatory variable is the concentration of child \(i\)’s ethnicity in her region, \({EC}_{cr}\).Footnote 21 \(\mathbf{C}_{icr}\) is a vector of child characteristics, including gender, year of birth, and age at migration. \(\mathbf{P}_{icr}\) is a vector of parent characteristics, including year of birth, year of arrival in Germany, education in country of origin, years of schooling, a migration indicator (which equals 0 for a few spouses who have no migration background),Footnote 22 and the number of children for mothers. All models include fixed effects for ethnicities, \(\sigma_{c}\); fixed effects for regions, \(\delta_{r}\); and fixed effects for the year when the child reported her language proficiency, \(\tau_{t}\). The individual-specific effects, \(\mu_{icr}\), are assumed to be i.i.d. random variables, and \(\epsilon_{icrt}\) is an idiosyncratic error term. Throughout, we cluster standard errors at the region-by-ethnicity level, the level at which our measure of ethnic concentration varies.Footnote 23

To estimate the effect of ethnic concentration on immigrant children’s educational attainment, we estimate the following OLS model using a cross-section of children:

$${educ}_{icr}=\sigma_{c}+\delta_{r}+\theta_{1}{EC}_{cr}+\mathbf{C}_{icr}^{\prime}\theta_{2}+\mathbf{P}_{icr}^{\prime}\theta_{3}+\varepsilon_{icr}$$
(2)

where \({educ}_{icr}\) is the educational attainment of child i, measured either by a binary indicator for having obtained any school degree or by a binary indicator for having obtained at least an intermediate school degree. As in Eq. (1), we include controls for child and parent characteristics as well as ethnicity and region fixed effects.

By including ethnicity fixed effects, we account for any differences between ethnicities, such as linguistic distance to the German language, cultural distance, school quality in the country of origin, and general willingness or disposition to integrate into the host country. By including region fixed effects, we exploit only variation in ethnic concentrations within the same region but do not use systematic differences in ethnic concentrations across regions. Thus, we control for any differences across regions, such as unemployment rates, wage levels, overall share of migrants, school quality, and attitudes of the native population. Our model therefore identifies the effect of ethnic concentration on immigrant children’s outcomes from the presence of several immigrant groups with differing community sizes within the same region.
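The identification argument can be illustrated with a minimal sketch. It is not the estimation code behind our results: it uses hypothetical, noise-free cell-level data, a balanced design in which every ethnicity appears in every region, and no control variables. Under these assumptions, removing ethnicity and region means from both the outcome and the concentration measure (two-way demeaning) leaves only the within-region, across-ethnicity variation, and the slope on that variation recovers the assumed coefficient. All names and numbers below are made up for illustration.

```python
# Hypothetical, noise-free cell-level data; none of these numbers come from the paper.
ethnicities = ["Greek", "Italian", "Turkish"]
regions = ["A", "B", "C", "D"]

sigma = {"Greek": 0.3, "Italian": 0.1, "Turkish": -0.2}  # assumed ethnicity effects
delta = {"A": 0.5, "B": -0.4, "C": 0.0, "D": 0.2}         # assumed region effects
true_beta = -0.19                                          # assumed enclave effect

# Assumed log size of each ethnicity in regions A, B, C, D.
ec_values = {
    "Greek":   [5.1, 6.0, 4.2, 5.5],
    "Italian": [6.3, 5.2, 5.9, 4.8],
    "Turkish": [7.0, 6.1, 7.4, 6.6],
}
ec = {(c, r): ec_values[c][j] for c in ethnicities for j, r in enumerate(regions)}

# Outcome generated from ethnicity effect + region effect + beta * concentration.
y = {(c, r): sigma[c] + delta[r] + true_beta * ec[(c, r)] for (c, r) in ec}


def two_way_demean(values):
    """Subtract ethnicity and region means and add back the grand mean.

    In a balanced design this removes any additive ethnicity and region effects.
    """
    grand = sum(values.values()) / len(values)
    eth_mean = {c: sum(values[(c, r)] for r in regions) / len(regions)
                for c in ethnicities}
    reg_mean = {r: sum(values[(c, r)] for c in ethnicities) / len(ethnicities)
                for r in regions}
    return {(c, r): values[(c, r)] - eth_mean[c] - reg_mean[r] + grand
            for (c, r) in values}


ec_within = two_way_demean(ec)
y_within = two_way_demean(y)

# Slope of the demeaned outcome on the demeaned concentration measure.
beta_hat = (sum(ec_within[k] * y_within[k] for k in ec_within)
            / sum(ec_within[k] ** 2 for k in ec_within))
print(round(beta_hat, 4))
```

Because the fixed effects absorb everything that is common to a region or common to an ethnicity, a region in which all ethnicities are equally large contributes nothing to the estimate in this sketch.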

Test of balancing by ethnic concentration

As argued above, the placement policy of the German guest worker program led to quasi-exogenous variation in the regional placement of guest workers. While we observe guest workers only several years after their placement, we can test the plausibility of the exogeneity assumption by comparing observable characteristics of the immigrant children and their parents between regions with low and high ethnic concentration of the respective ethnicity. To do so, we split the sample at the ethnicity-specific median of the share of ethnic concentration in the child’s region of residence in 1985. As Table 1 shows, none of the demographic characteristics of immigrant children differs significantly (individually or jointly) across regions with low and high co-ethnic concentration. The same is true for the demographic characteristics of mothers and fathers. Similarly, using the specification of our outcome model, there is no significant relationship between ethnic concentration and background characteristics when regressing the background characteristics on the share of ethnic concentration as well as ethnicity and region fixed effects (Appendix Table 14). Additionally, we perform regression-based balancing tests on the sample of fathers instead of children as in Damm and Dustmann (2014), yielding the same qualitative result (results available upon request). These balancing tests support our assumption that there was no systematic self-selection of guest workers into regions of differing ethnic concentration. Of course, as with any balancing test, we cannot test whether unobserved characteristics, such as parents’ willingness to integrate into the German society, are balanced as well. Nonetheless, we run a regression-based balancing test as in Pei et al. (2019) with the candidate confounder—proxied by father’s education—as outcome variable, again finding no evidence for selection (results available upon request).
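The sample split used for the balancing comparison can be sketched as follows. This is an illustrative sketch only: the records, the variable names, and the single background characteristic (father's years of schooling) are hypothetical, and the sketch compares raw means without the significance tests reported in Table 1.

```python
from statistics import mean, median

# Hypothetical records: (ethnicity, concentration in region of residence,
# father's years of schooling). Values are invented for illustration.
records = [
    ("Turkish", 7.4, 8), ("Turkish", 6.1, 9), ("Turkish", 7.0, 7), ("Turkish", 6.6, 8),
    ("Italian", 6.3, 9), ("Italian", 5.2, 8), ("Italian", 5.9, 10), ("Italian", 4.8, 9),
]


def split_at_ethnicity_median(rows):
    """Split rows into low/high concentration at each ethnicity's own median."""
    by_ethnicity = {}
    for ethnicity, concentration, _ in rows:
        by_ethnicity.setdefault(ethnicity, []).append(concentration)
    cutoffs = {eth: median(vals) for eth, vals in by_ethnicity.items()}
    low = [row for row in rows if row[1] <= cutoffs[row[0]]]
    high = [row for row in rows if row[1] > cutoffs[row[0]]]
    return cutoffs, low, high


cutoffs, low, high = split_at_ethnicity_median(records)

# Difference in mean schooling between high- and low-concentration groups.
schooling_gap = mean(row[2] for row in high) - mean(row[2] for row in low)
print(cutoffs, schooling_gap)
```

Using an ethnicity-specific cutoff means that each ethnicity contributes observations to both groups, so the comparison is not driven by differences in the overall size of the ethnic groups.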

Beyond demographic backgrounds, the only exceptions where we find a significant difference between regions with low and high ethnic concentration are fathers’ unemployment rates and household income. Interestingly, guest workers are better off in terms of employment and income in regions with high shares of co-ethnic concentration. If anything, this difference should work against finding any negative effect of ethnic concentration on children’s outcomes. The unemployment difference observed for guest worker fathers in the SOEP sample is qualitatively in line with the overall unemployment rates in 1985 from the Federal Employment Agency (see bottom of Table 1). Thus, the unemployment difference likely reflects the fact that guest workers were in particularly high demand in regions with booming industries, which were still characterized by lower unemployment levels in 1985. Of course, the region fixed effects in our regression models account for any general difference across regions, exploiting only within-regional variation across different ethnicities. Furthermore, as we show below, differences in unemployment and household income cannot account for the effect of ethnic concentration on children’s outcomes.

The balancing of guest workers’ demographic characteristics across regions with low and high ethnic concentration is particularly reassuring as we observe the location of guest workers in 1985 for the first time. As we do not observe the initial location to which guest workers had been assigned, our analysis is based on the assumption that any movement of guest workers across regions between their arrival in the 1960s/1970s and 1985 is orthogonal to our relationship of interest. Thus, the estimated coefficient on ethnic concentration would be biased downward (upward) if parents with adverse (advantageous) characteristics related to their child’s outcomes moved to regions with high ethnic concentrations. The balancing results support our identifying assumption that guest workers in Germany did not systematically self-select into regions between their arrival and 1985.Footnote 24 Further supporting this assumption, we show in Section 7.7 below that the movement of guest workers across regions in the 10 years after 1985, when we can observe them, was in fact orthogonal to ethnic concentration and to our outcome measures.

This is in line with existing work investigating the German guest worker program. Previous studies also did not find any evidence of significant differences in demographic characteristics between guest workers living in regions with high concentrations of co-ethnics and those living in regions with low concentrations (Constant et al. 2013; Danzer and Yaman 2013, 2016). The evidence against endogenous sorting of immigrants into ethnic enclaves in our specific setting (although not necessarily in any immigrant setting) is perfectly consistent with two specific features of the German guest worker program.

First, as discussed above, guest workers were restricted in their residential choice as their work permit required them to stay in the initially assigned region for several years (Dahnen and Kozlowicz 1963). Thus, the formal rules of the guest worker program made it nearly impossible for guest workers to move across regions during the initial years after their arrival.

Second, guest workers in Germany were well integrated into the labor market immediately upon arrival as they had been recruited specifically for the purpose of filling open positions in the German economy. As a result, the unemployment rate of foreigners in Germany was less than 1.5% in every year between 1968 and 1973 and was even lower than that of natives (Federal Employment Agency 1974). Since guest workers, who migrated to Germany with the aim of working, had been employed immediately upon arrival, the incentive to move to other regions was very low. Accordingly, the current settlement structures of immigrants in Germany have been shown to still reflect the demand for labor in the 1960s and 1970s (Schönwalder and Söhn 2009). Quite generally, ethnic segregation has been reasonably stable across workplaces and residential locations over the entire period from 1975 to 2008 (Glitz 2014).

In sum, the demographic characteristics of guest workers and their children are very similar across regions with low and high ethnic concentration. This finding supports our identification strategy of exploiting the quasi-exogenous placement of guest workers across German regions to estimate the effect of ethnic enclaves on immigrant children’s outcomes.Footnote 25

Results

This section presents our main results (Section 5.1) and subgroup analyses (Section 5.2). The subsequent sections investigate mediating factors and provide robustness analyses.

Main results

Table 2 shows our main results on the effect of ethnic concentration on the host country language proficiency of immigrant children. The results indicate that an increase in co-ethnic concentration significantly reduces immigrant children’s proficiency in speaking and writing German. An increase in the size of the own ethnicity by one log-point is related to a decline of 19% of a standard deviation in speaking skills and 17% of a standard deviation in writing skills. The magnitudes of the estimated coefficients barely change when we include controls for children’s and parents’ characteristics.

Table 2 Effect of ethnic concentration on host-country language proficiency

To facilitate interpretation of magnitudes, ethnic concentration would increase by one log-point, for example, if a Turkish child moved from the city of Bonn (with a share of Turks of about 1%) to the city of Munich (with a share of about 2.8%).Footnote 26 This change in the region of residence would, ceteris paribus, reduce the child’s German speaking proficiency by 19% and her writing proficiency by 17% of a standard deviation, respectively. This is a modest effect, given that the difference between “poor” and “fair” German language proficiency is 1.39 standard deviations for speaking and 1.12 standard deviations for writing.
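The magnitudes quoted above can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the text. One caveat is an assumption of the sketch, not a statement about our measure: it computes the log ratio of the two quoted population shares, which equals the change in the log size of the ethnic group only if the two regions have the same total population.

```python
import math

# Shares of Turks quoted in the text (approximate values).
share_bonn = 0.01
share_munich = 0.028

# Log ratio of the two shares; close to one log-point.
log_change = math.log(share_munich / share_bonn)

# Estimated effects of a one log-point increase, in standard deviations.
effect_speaking = 0.19
effect_writing = 0.17

# Gap between "poor" and "fair" proficiency, in standard deviations.
gap_speaking = 1.39
gap_writing = 1.12

# Effect expressed as a fraction of the "poor" to "fair" gap.
speaking_fraction_of_gap = effect_speaking / gap_speaking
writing_fraction_of_gap = effect_writing / gap_writing

print(round(log_change, 2))                # about 1.03
print(round(speaking_fraction_of_gap, 3))  # about 0.137
print(round(writing_fraction_of_gap, 3))   # about 0.152
```

In other words, a one log-point increase in concentration corresponds to roughly one-seventh of the distance between “poor” and “fair” proficiency.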

In line with the negative impact on host-country language proficiency, we also find a negative effect of ethnic concentration on immigrant children’s educational attainment (Table 3). Living in an ethnic enclave substantially increases the likelihood that the child drops out of school without any degree (columns 1 and 2). A one log-point increase in co-ethnic concentration increases the probability of dropping out of school by 5.6 percentage points. Given that the overall dropout rate among immigrant children in our sample is only 7.1%, this is a very large effect. There are several reasons why the impact on school dropout may be larger than the impact on children’s German language proficiency. Strong German language skills are a prerequisite for understanding teachers’ instructions (which requires verbal proficiency) as well as for passing written exams (which requires writing proficiency) in all school subjects. Hence, the two negative effects on speaking proficiency and writing proficiency likely combine and accumulate across all subjects taught in school, leading to worse grades in multiple subjects. This, in turn, may lead to the large negative impact on school dropout. While results also point toward a negative impact on the probability of obtaining at least an intermediate school degree, the coefficient is much less precisely estimated and becomes zero when controlling for child and parent characteristics (columns 3 and 4).Footnote 27
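As a back-of-the-envelope illustration that uses only the two numbers quoted above, the estimated effect can be set against the sample dropout rate:

$$\frac{5.6\ \text{percentage points}}{7.1\%}\approx 0.79$$

That is, a one log-point increase in co-ethnic concentration corresponds to an increase of roughly 79% of the baseline dropout rate.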

Table 3 Effect of ethnic concentration on educational attainment

The negative effects on host country language proficiency and on obtaining any school degree suggest that immigrant children who grew up in regions with high shares of (low-educated) co-ethnics suffer long-term disadvantages in human capital acquisition. The joint effects on language proficiency and educational attainment are in line with the existing evidence of complementarities between language and other forms of human capital (e.g., Chiswick and Miller 2002).

Subgroup analyses

Beyond the average effects, we investigate effect heterogeneity by country of birth, gender, and ethnicity.Footnote 28 We start by investigating whether the negative effects of ethnic concentration on children’s outcomes differ between children born abroad and children born in Germany. About 42% of the immigrant children in our sample were born abroad, entering Germany through a family reunification scheme. The first two columns of Table 4 suggest that the negative enclave effects on German speaking and writing proficiencies are roughly 30 percent smaller for children who were born in Germany rather than abroad. As children born in Germany start learning the German language already in nursery school and school, co-ethnic concentration may be less important for them compared to children born abroad who typically start learning the German language at an older age. Still, the ethnic-concentration impact is also significant for guest worker children who were born in Germany. Furthermore, the smaller impact on language proficiency does not translate into a smaller disadvantage of children born in Germany in terms of dropping out of school (column 3).

Table 4 Subgroup analysis

The right panel of Table 4 investigates effect heterogeneity by child gender. Results indicate that the impact of ethnic concentration on children’s language proficiency and educational attainment does not differ significantly between boys and girls, although the negative effect on school dropout may be slightly smaller (in absolute terms) for girls.

Subgroup analyses by ethnicity indicate little heterogeneity (Appendix Table 15). Results suggest that the effect of ethnic concentration on German speaking and writing proficiencies and on school dropout does not differ significantly for Greek, Italian, Spanish, Turkish, or Yugoslav guest worker children. There is some indication, however, that ethnic concentration may have a stronger negative effect on the probability of obtaining at least an intermediate school degree for Italian and Turkish children, and a more positive one for Greek and Spanish children.

Mediating factors

The effect of ethnic enclaves on immigrant children’s outcomes may be mediated through numerous different channels, including parents’ language skills, inter-ethnic contacts with natives, and economic conditions. Existing studies that rely on administrative data are usually restricted to looking at the enclave effect as a black box. By contrast, the rich SOEP survey data allow us to investigate several potential mediating factors at the child and parent level.

Parental proficiency in the host-country language

A first candidate for a mediating factor is parents’ host country language skills, as children’s human capital accumulation may critically depend on the language proficiency of their parents (e.g., van Ours and Veenman 2003). In fact, Danzer and Yaman (2016) find a strong negative effect of ethnic enclaves on the language skills of first-generation guest workers in Germany. In the SOEP, adult guest workers (i.e., the parents of our children) report their own German language proficiency in speaking and writing. Using the same random effects specification (without child controls) and the same definitions for language proficiency and ethnic concentration as in our main model, we find an effect of ethnic enclaves on the speaking proficiency of parents of −0.351 (standard error 0.081), but no significant effect on parents’ writing proficiency (−0.072, standard error 0.091).

In a standard descriptive analysis of potential mechanisms, we add different potential mediating factors as control variables to our main models.Footnote 29 As indicated in column 2 of Table 5, parents’ German speaking proficiency is significantly positively related to their children’s German speaking proficiency.Footnote 30 Controlling for parents’ German speaking proficiency reduces the effect of ethnic concentration and renders it statistically insignificant, although the negative point estimate remains quite sizeable. However, self-assessed language proficiency is likely measured with error. To circumvent downward bias in the estimated effect of parents’ language proficiency, we follow the approach of Dustmann and van Soest (2002) and exploit the panel dimension of the SOEP to instrument parents’ speaking proficiency reported in a given year with their speaking proficiency reported in preceding (lag) and subsequent (lead) years.Footnote 31

Table 5 Mediating factors—effect of ethnic concentration on host-country speaking proficiency
Table 6 Mediating factors—effect of ethnic concentration on host-country writing proficiency
Table 7 Mediating factors—effect of ethnic concentration on obtaining any school degree

When accounting for random measurement error in this instrumental variable (IV) approach, parents’ German speaking proficiency can fully account for the effect of ethnic concentration on children’s speaking proficiency. The IV estimate on parents’ speaking proficiency (column 3) is three times as large as the OLS estimate, indicating that the latter suffers from substantial attenuation bias.Footnote 32 Intriguingly, once the independent-over-time measurement error is accounted for, the point estimate of the effect of ethnic concentration on guest worker children’s German speaking proficiency is reduced to close to zero. This suggests that poor parental host country language skills in ethnic enclaves are a main driver of the enclave effect on children’s host-country language proficiency.
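The logic of this correction can be illustrated with a small simulation. It is a sketch under stated assumptions, not a replication of our estimates: the data are simulated, there are no controls or fixed effects, a single lagged report serves as the instrument (rather than both lags and leads), and the reporting error is assumed to be independent across years with a variance chosen so that the OLS coefficient is attenuated to about one-third of the true value. All variable names are hypothetical.

```python
import random

random.seed(0)
n = 20000
true_beta = 0.6        # assumed effect of true parental proficiency
noise_sd = 2 ** 0.5    # assumed reporting-error s.d. (reliability of one-third)

# True (unobserved) parental proficiency and two noisy self-reports of it.
true_skill = [random.gauss(0, 1) for _ in range(n)]
report_now = [s + random.gauss(0, noise_sd) for s in true_skill]
report_lag = [s + random.gauss(0, noise_sd) for s in true_skill]

# Child outcome depends on true parental proficiency, not on the report.
child = [true_beta * s + random.gauss(0, 0.5) for s in true_skill]


def cov(a, b):
    """Sample covariance of two equally long lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


# OLS slope on the noisy report: attenuated toward zero.
ols = cov(report_now, child) / cov(report_now, report_now)

# IV slope using the lagged report as instrument for the current report.
iv = cov(report_lag, child) / cov(report_lag, report_now)

print(round(ols, 2), round(iv, 2))
```

The instrument works here because the lagged report shares the true proficiency with the current report but not its reporting error, which is the independence-over-time assumption discussed above.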

Columns 4 and 5 present equivalent analyses for parents’ writing proficiency in German. While parents’ German writing skills are also significantly related to their children’s German speaking proficiency, controlling for them does not reduce the estimated effect of ethnic concentration by much.

Table 6 shows the same analyses for children’s writing rather than speaking proficiency. We find similar associations of parents’ German language proficiency with their children’s writing proficiency as we found for children’s speaking proficiency. Intriguingly, it is again only parents’ speaking proficiency (column 3), rather than their writing proficiency (column 5), that reduces the estimated enclave effect on children’s writing proficiency to close to zero. Thus, it appears that reduced speaking proficiency in the host-country language (and therefore likely reduced speaking of the host-country language at home), rather than limited writing proficiency in the host-country language, is a leading mechanism by which ethnic enclaves inhibit the language proficiency of immigrant children. One potential reason for the greater importance of parents’ speaking proficiency is the oral communication of guest workers and their children at home.Footnote 33

Inter-ethnic contacts with natives and economic conditions

Limited contact with German natives may constitute a further mediating factor of the negative effect of co-ethnic concentration on children’s host country language proficiency. Prior research shows that guest workers in Germany who were placed in ethnic enclaves tend to interact less with natives (Danzer and Yaman 2013), and reduced contact with natives may in turn affect the human capital acquisition of their children. As columns 6 and 7 of Tables 5 and 6 show, having personal contact with natives, measured either by whether the child’s first friend is German or by whether parents regularly receive visits from Germans, is indeed significantly positively associated with the child’s German speaking and writing proficiencies.Footnote 34 Yet, controlling for the reduced contact with natives does not significantly change the negative estimate of ethnic enclaves on children’s host-country language skills.

Differences in economic conditions such as parental unemployment or household income might also explain the negative effect of ethnic enclaves on immigrant children’s language proficiency. As column 8 of Tables 5 and 6 shows, parents’ unemployment status is significantly associated with their children’s host country language proficiency in the expected way, but controlling for parental unemployment and household income does not affect the estimated effect of ethnic concentration on children’s language proficiency at all.

Table 7 shows that none of the mediating factors analyzed here can account for the effect of ethnic enclaves on children’s schooling outcomes. Parents’ speaking ability is the only factor that is significantly associated with their children’s probability to obtain a school degree. Still, controlling for parents’ speaking ability does not reduce the estimated effect of ethnic concentration on whether children obtain a school degree.Footnote 35

While parents’ German language proficiency mediates the effect of ethnic enclaves on children’s language proficiency, it does not account for the effect on children’s school dropout. This indicates that parents’ German language proficiency differs in its importance for these outcomes. In line with prior evidence on the intergenerational transmission of language skills (e.g., Bleakley and Chin 2008; Casey and Dustmann 2008), parents’ (instrumented) German speaking and writing proficiencies are strongly correlated with their children’s German language proficiency (columns 3 and 5 in Tables 5 and 6). In contrast, parents’ German language proficiency is only weakly correlated with the probability that their child leaves school without any degree (Table 7). This weaker correlation implies a much lower potential to mediate the effect of ethnic enclaves. A potential explanation for this finding is that children’s host-country language proficiency is strongly affected by their parents’ host-country language skills due to frequent interactions at home. This is particularly true for immigrant children, who often do not attend kindergarten, which provides opportunities to learn the host-country language. While language skills can be strongly mediated in the parental home, obtaining a school degree depends on many factors outside the family, such as the quality of school peers or teachers, which affect, among other things, children’s motivation to learn. Also, school dropout rates in Germany in the observation period were modest (5%), possibly reducing the scope of parental language skills as a mediating factor.

Some additional potential mediating factors are hard to explicitly address in our setup. For example, some schools may have fewer and lower-achieving native Germans due to “native flight” at the disaggregated level of school catchment areas (Cascio and Lewis 2012; De la Croix and Doepke 2009).Footnote 36 In most parts of Germany, children normally attend the only primary school of their residential school district. The extent of school segregation at the primary school level is, hence, mostly determined by residential patterns, since school choice is limited unlike, for instance, in the USA, UK, or Netherlands (Kristen 2005). Private primary schools play a negligible role with only 1.3% of enrolled primary school students (Statistisches Bundesamt 2002). School segregation in secondary education is mostly determined by early ability tracking in Germany.Footnote 37

Another potential mediating factor is the existence of network effects combined with self-selection that becomes increasingly negative over time. Migration networks reduce migration costs and will normally increase a low-skilled person’s propensity to move, find a job, or find accommodation (McKenzie and Rapoport 2010). In our setting, networks did not play a role in the initial placement of guest workers who received jobs and accommodation upon arrival and were unable to influence their placement. Networks may have influenced individual decisions to sign up for migration in the first place, not least since educational attainment in the home countries increased somewhat during the 1950s to 1970s (Lee and Lee 2016). Post-migration personal networks may have influenced the propensity of relocation after the end of the lock-in period, potentially influencing the incentives of parents to learn German, the nature of native flight, and the quality of children’s peers at school.

In sum, the negative effect of ethnic enclaves on immigrant children’s host country language proficiency can be fully accounted for by parents’ lower host-country speaking proficiency. Parents’ writing proficiency accounts for only a small part of the negative enclave effect. Limited contact with natives and economic factors do not appear to be relevant mediating factors of the negative enclave effects. None of the investigated mediating factors (parents’ language skills, inter-ethnic contact, and economic conditions) can account for the detrimental effect of ethnic enclaves on the schooling success of immigrant children.Footnote 38

Robustness analyses

In this section, we show that our results are robust to measuring ethnic concentration by ethnic shares (Section 7.1), instrumenting ethnic concentration in 1985 by its 1975 value (Section 7.2), measuring ethnic concentration with Census data (Section 7.3), measuring ethnic concentration at the county level (Section 7.4), and accounting for the interview mode (Section 7.5). We also investigate return migration (Section 7.6), regional migration within Germany (Section 7.7), and family size (Section 7.8).

Measuring ethnic concentration by ethnic shares

There is no strong a priori argument for any specific functional form of the ethnic concentration measure. At least two different specific measures of ethnic concentration have been used in the literature. In our analyses so far, we followed Edin et al. (2003), Damm (2009), and Åslund et al. (2011) in using the logarithm of the size of the own ethnicity. In contrast, Chiswick (2009) and Danzer and Yaman (2013, 2016) measure ethnic concentration as the share of the own ethnicity in the total regional population.
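The difference between the two measures can be made concrete with a short sketch. The regions and counts below are hypothetical and chosen only to show that the two measures can rank regions differently; they are not taken from the SIAB or the Census.

```python
import math

# Hypothetical regions: number of co-ethnics and total regional population.
regions = {
    "Region A": {"own_ethnicity": 12000, "population": 1200000},
    "Region B": {"own_ethnicity": 12000, "population": 300000},
    "Region C": {"own_ethnicity": 33000, "population": 1200000},
}

measures = {
    name: {
        # Log size of the own ethnicity (the measure used in our main models).
        "log_size": math.log(data["own_ethnicity"]),
        # Share of the own ethnicity in the regional population, in percent.
        "share_pct": 100.0 * data["own_ethnicity"] / data["population"],
    }
    for name, data in regions.items()
}

for name, m in measures.items():
    print(name, round(m["log_size"], 2), m["share_pct"])
```

In this example, Regions A and B are identical on the log-size measure but differ by a factor of four on the share measure, whereas Region C has the largest community in absolute terms but a smaller share than Region B.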

When using the share of the own ethnicity in the regional population as an alternative measure of ethnic concentration, results on guest worker children’s German speaking and writing proficiencies and on school dropout are qualitatively similar to our main models (Table 8). Interestingly, the alternative concentration measure also produces significant results on the probability that guest worker children obtain at least an intermediate school degree. Specifically, the point estimate suggests that a one percentage point increase in the share of own ethnics in the regional population reduces the likelihood of obtaining at least an intermediate school degree by 5.1%.

Table 8 Measuring ethnic concentration by share of own ethnicity in regional population

Instrumenting ethnic concentration in 1985 by ethnic concentration in 1975

In another robustness check, we restrict the variation in our treatment variable to variation in ethnic concentration that already existed toward the end of the guest worker program. The measure of ethnic concentration in our main analysis refers to 1985, the year in which we first observe guest workers and their region of residence (see Section 4.2). While the balancing tests indicate no evidence of self-selection of guest workers across regions with different ethnic concentrations, the extent of ethnic concentration may have changed between the end of the German guest worker program in 1973 and the observed ethnic concentration in 1985. To account for potential endogeneity of our main explanatory variable, we can instrument a region’s ethnic concentration in 1985 by the region’s ethnic concentration in 1975, i.e., toward the end of the German guest worker recruitment program (Danzer and Yaman 2013). The year 1975 is the first year of the SIAB data. This IV model rules out any bias from changes in ethnic concentrations in a given region during the decade before we first observe guest workers’ region of residence. For instance, if economic conditions improved between immigration and 1985 in the initial placement region of a guest worker relative to other regions, we may expect an increase in ethnic concentration in this region owing to economically motivated in-migration. In this case, the ethnic concentration observed in 1985 differs from the one at quasi-exogenous placement.

Ethnic concentration in 1975 is a very strong instrument for ethnic concentration in 1985. The F statistic on the excluded instrument in the first stage is 236 in the regressions for language outcomes and 321 in the regressions for schooling outcomes.Footnote 39 In line with Schönwalder and Söhn (2009), this suggests that there is strong persistence in the settlement structures of guest workers between the end of the guest worker program and 1985.Footnote 40
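The mechanics of this IV approach, including the first-stage F statistic, can be sketched with simulated data. The sketch makes several simplifying assumptions that do not hold in our actual models: there is one observation per unit, no fixed effects or controls, and no clustering, so the resulting F statistic is not comparable to the values reported above. The unobserved post-1975 shock, the coefficients, and all variable names are hypothetical.

```python
import random

random.seed(1)
n = 5000
true_beta = -0.2   # assumed effect of 1985 concentration on the outcome

ec75, ec85, outcome = [], [], []
for _ in range(n):
    z = random.gauss(6, 1)        # concentration in 1975 (instrument)
    shock = random.gauss(0, 1)    # unobserved post-1975 regional shock
    x = 0.9 * z + 0.3 * shock + random.gauss(0, 0.2)   # concentration in 1985
    # The shock also affects the outcome directly, which biases OLS.
    y = true_beta * x + 0.5 * shock + random.gauss(0, 0.5)
    ec75.append(z)
    ec85.append(x)
    outcome.append(y)


def cov(a, b):
    """Sample covariance of two equally long lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (len(a) - 1)


# OLS of the outcome on 1985 concentration (contaminated by the shock).
ols = cov(ec85, outcome) / cov(ec85, ec85)

# IV estimate: only variation in 1985 concentration predicted by 1975 is used.
iv = cov(ec75, outcome) / cov(ec75, ec85)

# First-stage F statistic for a single instrument with an intercept,
# computed from the first-stage R-squared.
r_squared = cov(ec75, ec85) ** 2 / (cov(ec75, ec75) * cov(ec85, ec85))
first_stage_f = r_squared / (1 - r_squared) * (n - 2)

print(round(ols, 2), round(iv, 2), round(first_stage_f))
```

In this simulated setting the OLS estimate is pulled toward zero by the post-1975 shock, while the IV estimate stays close to the assumed coefficient because the 1975 concentration is unrelated to that shock by construction.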

Table 9 presents the results of the IV model that uses only that part of the variation in ethnic concentration in 1985 that can be traced back to variation in ethnic concentration that already existed in 1975. For both speaking and writing proficiencies, the enclave effect is somewhat stronger when instrumenting 1985 with 1975 ethnic concentration compared to the baseline model. The effect on school dropout does not change, and the coefficient for obtaining at least an intermediate school degree remains insignificant. Similarly, all results on mediating factors are very similar in the IV model compared to the baseline model (not shown).

Table 9 Instrumental-variable estimates using ethnic concentration in 1975

In sum, our baseline estimates are not biased by any change in ethnic concentration that occurred between 1975 and 1985. If anything, restricting the analysis to variation in ethnic concentration that already existed in 1975 leads to slightly larger estimates of the detrimental effect of ethnic enclaves on immigrant children’s outcomes.

Measuring ethnic concentration with census data

Measuring the size of the immigrant population based on a 2% random sample of employees like the SIAB can lead to attenuation bias in estimating effects of immigration measures (Aydemir and Borjas 2011). To address potential measurement error in our preferred measure of ethnic concentration, we use data from the 1987 German Census, which includes the entire population in Germany. As the 1987 Census data do not allow identifying Spanish citizens, the Census analysis is restricted to the other four ethnicities. For each ethnicity, the correlation coefficient between our preferred 1985 SIAB measure and the 1987 Census measures of the (log) size of the ethnic community exceeds 0.96.

Replacing the 1985 SIAB measure of ethnic concentration with the 1987 Census measure yields very similar results to our main specifications (results available upon request). Furthermore, IV models that instrument the 1987 Census measure of ethnic concentration with the concentration of guest workers in the mid-1970s using the SIAB 1975 data—which simultaneously account for measurement error and changes in regional ethnic concentration after the end of the guest worker program—are also quite similar to the baseline results. Again, the IV estimates are somewhat larger than the non-instrumented estimates. The results on mediating factors are also unaffected when using the 1987 Census data to compute measures of ethnic concentration, both in the non-instrumented and in the instrumented models (not shown). In sum, we do not find evidence that measurement error in our ethnic concentration measure has a substantial effect on our results.

Measuring ethnic concentration at the county level

Our preferred regional units for measuring ethnic concentration are the areas (Anpassungsschichten), as they are large enough to circumvent bias from commuting within regional labor markets. While the much smaller counties may measure immigrant children’s exposure to co-ethnics more precisely, they also raise concerns about bias due to commuting and moving across county borders. Still, using the 1987 Census, which includes the entire population, we can test the robustness of our results to measuring ethnic concentration at the level of 328 counties rather than 103 areas.Footnote 41

When measuring ethnic concentration and including fixed effects at the county level, the effects of ethnic concentration on children’s speaking and writing proficiencies are very similar to the estimates when measuring ethnic concentration at the area level (results available upon request). By contrast, the effect on obtaining any school degree becomes smaller and loses statistical significance. Besides the fact that Spanish guest worker children are missing in the analysis, statistical power in the county-level analysis may be impaired by the fact that enclave effects are identified from fewer guest worker children observed within the same region in the SOEP data. This likely affects in particular the analysis of school dropout, which on average is already rather low (7.1%). In fact, incidents of school dropout by guest worker children are observed in only 42 of the 114 counties with guest worker children in the SOEP. This suggests that models with county fixed effects exploit only very limited variation in school dropout.

Accounting for interview mode

We also show that immigrants’ self-reported language proficiency is not affected by the specific interview mode used in the SOEP, such as oral face-to-face interview or written interview by mail. Adding a control for the interview mode used when guest worker children report their levels of German language proficiency does not affect the estimated enclave effects on children’s proficiency in speaking or writing German (results available upon request).

Investigating return migration

Acquiring host country language skills and education is an investment decision that may depend on whether immigrants intend to stay in the host country or return to their home country (Dustmann and Glitz 2011). To account for this possibility, we can include a binary indicator that equals 1 if guest worker parents see their future in Germany (0 otherwise).Footnote 42 Adding this control variable does not affect our baseline estimates (results available upon request). Parents’ intention to stay in Germany is positively associated with the children’s outcomes, albeit statistically significantly only in the case of obtaining a school degree.

We also investigate to what extent immigrant children in our sample actually returned to their home country during the first 10 years of our analysis, and in particular, whether return migration was related to ethnic concentration. Following Dustmann (2003), we use the information that individuals provide when leaving the SOEP survey. We construct a binary indicator, \({return}_{icr}\), which equals 1 if an immigrant child left the SOEP between 1985 and 1995 reporting a move abroad, and 0 otherwise.Footnote 43 To predict return migration, we use the same explanatory variables as in our main model, in particular ethnic concentration in 1985:

$${return}_{icr}={\sigma }_{c}+{\delta }_{r}+{\gamma }_{1}{EC}_{cr}+{\boldsymbol{C}}_{icr}^{\prime}{\gamma }_{2}+{\varepsilon }_{icr}$$
(3)
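Equation (3) is a linear probability model with two sets of fixed effects. A minimal sketch of how its coefficient on ethnic concentration could be computed is given below, assuming that the subscripts c and r index ethnicity and region. The sketch omits the control vector, uses purely synthetic data, and removes the fixed effects by alternating group demeaning. All names (`demean_by`, `twoway_within`, `fe_slope`, `left_abroad`) and all numbers (the cell sizes, the 9% exit probability) are hypothetical and not taken from the analysis.

```python
import random
from collections import defaultdict


def demean_by(values, groups):
    """Subtract the group mean from each value."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def twoway_within(values, ethnicity, region, sweeps=50):
    """Remove ethnicity and region fixed effects by alternating demeaning."""
    out = list(values)
    for _ in range(sweeps):
        out = demean_by(out, ethnicity)
        out = demean_by(out, region)
    return out


def fe_slope(y, x, ethnicity, region):
    """Coefficient on x after partialling out both sets of fixed effects."""
    y_t = twoway_within(y, ethnicity, region)
    x_t = twoway_within(x, ethnicity, region)
    return sum(a * b for a, b in zip(x_t, y_t)) / sum(a * a for a in x_t)


# Synthetic sample: 5 ethnicities x 10 regions x 6 children per cell.
random.seed(1)
eth, reg, ec, left_abroad = [], [], [], []
for c in range(5):
    for r in range(10):
        ec_cr = random.uniform(0, 5)  # hypothetical concentration measure
        for _ in range(6):
            eth.append(c)
            reg.append(r)
            ec.append(ec_cr)
            # Exit abroad drawn independently of concentration.
            left_abroad.append(1.0 if random.random() < 0.09 else 0.0)

gamma1_hat = fe_slope(left_abroad, ec, eth, reg)
print(f"estimated gamma_1: {gamma1_hat:.4f}")
```

Because concentration varies only at the ethnicity-by-region level, the coefficient is identified from variation that remains after both sets of fixed effects are removed. Inference would in practice need standard errors that account for this grouping, which the sketch does not compute.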

During the observation period 1985–1995, 94 immigrant children (8.8% of our sample) left the SOEP and moved abroad. However, moving abroad is unrelated to ethnic concentration (results available upon request). The absence of selective return migration with respect to ethnic concentration strengthens our identifying assumptions. Furthermore, if return migrants are negatively selected on host-country skills, which seems plausible, attrition due to return migration would generate an upward bias. In that case, our results would underestimate the true effect of ethnic enclaves.

Investigating regional migration within Germany

Our analysis rests on the assumption that there was no systematic sorting of guest workers across regions between their initial placement in Germany and 1985, when we observe them for the first time in the SOEP data. While lack of data prevents us from investigating cross-regional mobility before 1985, we can analyze moving patterns after 1985, a period when overall regional mobility was higher. We find that only 6.9% of the 749 immigrant children who remained in the SOEP sample moved across regions between 1985 and 1995. Similarly, the moving rate was modest during the 20-year period between 1985 and 2005 (15.6%). This rather low mobility is consistent with Glitz (2014), who shows that regional ethnic concentrations were stable in Germany between 1975 and 2008.

Most importantly, ethnic concentration in 1985 does not predict whether guest worker families moved across regions over the next 10 years (results available upon request). In sum, the low mobility and the unsystematic moving patterns with respect to ethnic concentration support our identifying assumption that guest workers did not systematically sort across regions before 1985.

Investigating family size

If ethnic concentration affected the number of children in a household, our analysis sample could be subject to selection. For example, ethnic concentration might affect fathers’ labor market success, which in turn might affect their decision to have children or to bring their existing families from their home countries to Germany. However, we find no evidence of this: ethnic concentration is related neither to the probability of having children in the household nor to the number of children living in the household (results available upon request). These results are also consistent with the fact that mothers in regions with low and high ethnic concentration do not have significantly different numbers of children in 1985 (Table 1). Therefore, our analysis seems to be unaffected by endogenous fertility and family reunification.

Conclusion

This paper exploits the quasi-exogenous placement of guest workers across Germany during the 1960s and 1970s to estimate the effect of growing up in ethnic enclaves on the language proficiency and educational outcomes of immigrant children. We find that growing up in regions with higher own ethnic concentration significantly reduces immigrant children’s proficiency in the host country language and their educational attainment. For schooling outcomes, the effect is concentrated at the lower end of the educational distribution, although there is some indication that more academic school degrees may be affected as well. The enclave effects tend to be larger for immigrant children who were born abroad.

The rich information contained in the German Socio-Economic Panel, most importantly on parents’ host country language proficiency, allows us to investigate several factors that might mediate the effect of ethnic concentration on child outcomes. We find that parents’ German speaking proficiency can completely account for the negative effect of ethnic enclaves on their children’s German language proficiency. Parents’ writing abilities account for only a small part of the effect, and contacts with natives and parents’ economic conditions cannot account for the negative effect of ethnic enclaves on immigrant children’s outcomes at all.

These findings imply that even children of immigrants who are well integrated into the labor market may suffer from worse human capital outcomes—host country language proficiency and educational attainment—when growing up in regions with many, mainly low-educated immigrants of their own ethnicity. The important role of parents’ host country language skills in the mediation analysis suggests that host country language training for adult immigrants may have important positive spillover effects on their children. More generally, the results indicate that the long-run cultural and social integration of immigrants, particularly for the next generation, may be more successful when ethnic concentration is limited.

Two caveats of our analysis are worth mentioning. First, we observe guest workers’ location only in 1985, not their initially assigned place of residence. The results therefore rely on the identifying assumption that guest workers did not systematically move across regions between their arrival in the 1960s/1970s and 1985. While we argue, in line with the prior literature, that this assumption is plausible owing to specific features of the German guest worker program, we cannot conclusively rule out some cross-regional or return migration based on unobservables. Second, concerning external validity, our results may not carry over to all types of migration. Recruited guest workers had an education level substantially below that of the native population. Such migration of less-skilled workers to relatively richer countries takes place around the globe. However, the guest workers had no difficulties in finding employment. Given that workplace integration may support other forms of integration, we expect that findings based on guest workers are more conservative than those from settings without employment guarantees.