Immigrants assimilate as communities, not just as individuals


The literature on the economic assimilation of immigrants generally treats them as atomistic individuals assimilating in a largely anonymous labour market. Here, we argue that immigrants assimilate as communities, not only as individuals. The longer the immigrant community has been established, the better adjusted it becomes, and the more the host society comes to accept that ethnic group. Using data from a 5% sample of the 1980, 1990 and 2000 US censuses, we find that the stronger is the tradition of immigration from a given source region, the better are the economic outcomes for subsequent immigrants from that source.

Fig. 1


  1. Major contributions include Lieberson and Waters (1990), Portes and Rumbaut (1996) and Rodriguez (1999).

  2. A number of studies have addressed the endogeneity issue that arises from migration across localities. Those who choose to migrate away from the ghetto may have superior characteristics and hence the measured effect of ethnic concentration may be partly due to self-selection. For various treatments of this issue, see for example Bertrand et al. (2000), Borjas (2006), Cutler and Glaeser (1997), Cutler et al. (2005) and Edin et al. (2003).

  3. Some studies estimate the intergenerational education correlation between second-generation immigrants and their ‘synthetic fathers’ (men from the same source country in an earlier census). Looking at fathers in the 1940 census and children in the 1970 census, Card et al. (2000) estimate an intergenerational education elasticity of 0.4–0.5. Analysing fathers in the 1980 census and sons in the 1995–2002 CPS, Card (2005) estimates the elasticity to be 0.3.

  4. Between the 1860s and the 1900s, the source-country GDP per capita of the average immigrant fell from 95% to 49% of US GDP per capita. As in the post-1950 period, this was driven largely by the shift in origin-country composition, and it was accompanied by a decline in the average literacy rate of immigrants (despite rising literacy in source countries) and by a decline in their relative wage, which fell by five percentage points between 1873 and 1913 (Hatton 2000, p. 520).

  5. This is despite the increase over the period in the return to skills, which would tend to reduce the earnings of immigrants relative to the native-born (Smith 2006, p. 213). In addition, the average years since migration, a variable normally associated with higher relative earnings, declined slightly between 1970 and 2000 for males with positive annual earnings.

  6. Using data on immigrants and non-immigrants working in the US labour market, Akee and Yuksel (2008) find very similar wage returns to a year of US education and a year of foreign education.

  7. We opt not to use the samples for 1970 and earlier because (a) they are only a 1% sample, and (b) the hours worked per week variable is defined differently than in the 1980–2000 samples, raising possible issues of consistency.

  8. The Gini coefficient is used as a proxy for the return to skills as there is no consistent series for wage differentials across countries and over time. However, the Gini gives similar results to a measure based on wage differentials when explaining immigrant selection by education in a cross-section of source countries for the year 2000 (Belot and Hatton 2008).

  9. Results are similar if a separate dummy is included for each education group × age group × year combination.

  10. Thus for an observation from 1980, the variable measures the average share of the immigrant group in the population at census years from 1860 to 1970, excluding 1890 for which the census records were destroyed, and 1930, which was missing when the paper was drafted. For a year 2000 observation, the variable measures the average stock over the census years 1880 to 1990.

  11. The main education effect is captured by the dummy for the highest education group (not shown) and hence the negative interaction does not imply lower relative earnings for immigrants in the highest education group. The coefficient on the main past immigrant stock effect is quite similar if the education interaction is omitted from the regression.

  12. Since distance varies only by origin region it is eliminated from these regressions.

  13. Borjas and Friedberg (2007) identify an upturn in the relative earnings of immigrants in the 1990s, part of which is attributed to the improved performance of Mexicans. They note that this would be consistent with improvements in information and job networks available to Mexicans—although it would be equally consistent with the argument put forward here.

  14. The US Census Bureau defines the nine census divisions as the New England Division (CT, ME, MA, NH, RI, VT), the Middle Atlantic Division (NJ, NY, PA), the East North Central Division (IL, IN, MI, OH, WI), the West North Central Division (IA, KS, MN, MO, NE, ND, SD), the South Atlantic Division (DE, DC, FL, GA, MD, NC, SC, VA, WV), the East South Central Division (AL, KY, MS, TN), the West South Central Division (AR, LA, OK, TX), the Mountain Division (AZ, CO, ID, MT, NV, NM, UT, WY) and the Pacific Division (AK, CA, HI, OR, WA).

  15. Note that if immigrants’ location decisions depend on their relative wage, then the endogeneity relates principally to the current immigrant stock M rather than to the past stock Z, which can reasonably be assumed to be exogenous. While Z might be a natural instrument for M, we cannot use it here because it appears directly in the structural equation.

  16. The intermarriage literature, motivated by concerns about the assimilation of new ethnic groups, dates back at least to Drachsler (1920).

  17. We use census data from 1980 because this is the most recent year for which we can calculate both the date at which the individual first married and the period when he or she arrived in the USA.

  18. On the links between religion, ethnicity and cultural assimilation see Bisin et al. (2004). Empirical studies of intermarriage include Kantarevic (2004) and Meng and Gregory (2005).

  19. The question was phrased as follows: “Since the beginning of our country, people of many different religions, races and nationalities have come here and settled. Here is a list of some different groups. Would you read down the list and, thinking of what they have contributed to this country and what they have gotten from this country, for each one tell me whether you think on balance, they have been a good thing or a bad thing for this country.” (Lynch and Simon 2003, pp. 44–45).

  20. Between 1970 and 2000, the ratio of past average stock to current immigrant stock, taking an unweighted average across the 16 origin groups, increased from 1.1 to 2.6. If, instead, we weight these ratios by the current immigrant stock, then the average falls from 2.3 in 1970 to 0.8 in 2000.


We are grateful for comments from Michele Belot, Sara Lemos and Jeff Williamson, seminar participants at the University of Leicester and the University of Nottingham, participants at the conference on The Economics of Migration, Diversity and Culture at Bologna, September 2006, participants at the ASSA Meetings in New Orleans, January 2008, editor Klaus Zimmermann and two anonymous referees. Susanne Schmidt provided outstanding research assistance.

Census variables

The following variables are taken from the 5% samples of the 1980, 1990 and 2000 US Censuses, available at the IPUMS database (Ruggles et al. 2008).

Sample restriction

Men aged between 25 and 64

Individual-level variables

Annual wage and salary income (INCWAGE)

Annual hours worked: weeks worked last year (WKSWORK1) × usual hours worked per week (UHRSWORK)

Hourly wages: annual earnings/annual hours worked
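The hourly wage construction described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `hourly_wage` is hypothetical, and returning `None` when annual hours are zero is an assumption, since the appendix does not say how such cases are handled.

```python
from typing import Optional


def hourly_wage(incwage: float, wkswork1: int, uhrswork: int) -> Optional[float]:
    """Annual wage and salary income divided by annual hours worked.

    Annual hours = weeks worked last year (WKSWORK1)
                   x usual hours worked per week (UHRSWORK).
    """
    annual_hours = wkswork1 * uhrswork
    if annual_hours <= 0:
        # Assumption: the wage is left undefined when no hours are reported.
        return None
    return incwage / annual_hours
```

For example, annual earnings of 40,000 with 50 weeks worked at 40 hours per week correspond to 2,000 annual hours.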

Education groups: (EDUCREC) coded into four categories: less than high school (codes 1–6), high school (code 7), 1–3 years of college (code 8) and 4+ years of college (code 9).
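The recoding of EDUCREC into four groups might look like the sketch below. The function name `education_group` is hypothetical, and raising an error for codes outside 1–9 is an assumption; the appendix lists only the four categories.

```python
def education_group(educrec: int) -> str:
    """Map an EDUCREC code to one of the four education groups."""
    if 1 <= educrec <= 6:
        return "less than high school"
    if educrec == 7:
        return "high school"
    if educrec == 8:
        return "1-3 years of college"
    if educrec == 9:
        return "4+ years of college"
    # Assumption: any other code (e.g. not applicable) is rejected here.
    raise ValueError(f"EDUCREC code {educrec} is outside the range 1-9")
```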

Years since migration: from year of immigration (YRIMMIG, which is recorded in intervals) and census year. Origin-group years since migration are the average of all immigrants from a given origin excluding the age/education group to which it is applied.

Migrant stock variables

Derived from birthplace (BPL) codes according to the following classification:

Mexico (200), Central America (210), Caribbean (250–260), South America (300), Scandinavia (400–405), UK and Ireland (410–414), Western Europe (420–429), Southern Europe (430–440), Central/Eastern Europe (450–459), Russian Empire (460–465), East Asia (500–509), Southeast Asia (510–519), India/Southwest Asia (520–524, 548), Middle East/Asia Minor (530–547, 549), Africa (600) and Australia and New Zealand (700). For each census, the origin-region immigrant stock is expressed as a percentage of the total US population.
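The birthplace classification can be written as a lookup over inclusive code ranges. This is a sketch only: the names `ORIGIN_REGIONS` and `origin_region` are hypothetical, the UK and Ireland range is entered as 410–414 (an assumption to be checked against the IPUMS BPL codebook), and codes outside the listed ranges are mapped to `None`.

```python
from typing import List, Optional, Tuple

# Each origin region with its inclusive BPL code ranges, as listed above.
ORIGIN_REGIONS: List[Tuple[str, List[Tuple[int, int]]]] = [
    ("Mexico", [(200, 200)]),
    ("Central America", [(210, 210)]),
    ("Caribbean", [(250, 260)]),
    ("South America", [(300, 300)]),
    ("Scandinavia", [(400, 405)]),
    ("UK and Ireland", [(410, 414)]),  # assumption: upper bound read as 414
    ("Western Europe", [(420, 429)]),
    ("Southern Europe", [(430, 440)]),
    ("Central/Eastern Europe", [(450, 459)]),
    ("Russian Empire", [(460, 465)]),
    ("East Asia", [(500, 509)]),
    ("Southeast Asia", [(510, 519)]),
    ("India/Southwest Asia", [(520, 524), (548, 548)]),
    ("Middle East/Asia Minor", [(530, 547), (549, 549)]),
    ("Africa", [(600, 600)]),
    ("Australia and New Zealand", [(700, 700)]),
]


def origin_region(bpl: int) -> Optional[str]:
    """Return the origin region for a BPL code, or None if unclassified."""
    for region, ranges in ORIGIN_REGIONS:
        if any(low <= bpl <= high for low, high in ranges):
            return region
    return None
```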

The past stock variable is the average of these percentages in the 120 years before the date of observation, excluding the census years 1890 and 1930 for which the data were missing at the time the paper was drafted. To be precise, when looking at immigrant outcomes in the 1980 census, we use the past stock from the 1860–1970 censuses; when looking at immigrant outcomes in the 1990 census, we use the past stock from the 1870–1980 censuses; and when looking at immigrant outcomes in the 2000 census, we use the past stock from the 1880–1990 censuses. Because of the missing data for 1890 and 1930, we are averaging in each case across ten censuses conducted over a 120-year period.
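The averaging window can be made concrete with a short sketch. The names `past_stock_years` and `past_stock` are hypothetical, and the input is assumed to be a mapping from census year to the origin-region stock as a percentage of the US population.

```python
from typing import Dict, List

# Census years excluded because the data were missing, as stated above.
MISSING_CENSUSES = {1890, 1930}


def past_stock_years(observation_year: int) -> List[int]:
    """Census years in the 120 years before the observation year."""
    first_year = observation_year - 120
    return [
        year
        for year in range(first_year, observation_year, 10)
        if year not in MISSING_CENSUSES
    ]


def past_stock(observation_year: int, stock_pct_by_year: Dict[int, float]) -> float:
    """Simple average of the stock percentages over the window."""
    years = past_stock_years(observation_year)
    return sum(stock_pct_by_year[year] for year in years) / len(years)
```

For each of the three observation years (1980, 1990 and 2000) the window contains twelve census years, of which two are dropped, leaving the ten censuses mentioned in the text.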

We also explore various alternative measures of past immigrant stock for the time periods of 30, 70 and 120 years ago. Because this measure was unavailable in 1930 at the time the paper was drafted, the immigrant stock 70 years ago in 2000 is the average of the percentages in 1920 and 1940.

Ancestry is the first-mentioned ancestry (ANCESTR1), which contains a diverse range of origins that have been classified into the same groups as birthplace. We calculate the proportion aged 15 and over claiming ancestry from these origin regions, excluding the foreign-born and non-responses but including in the base those who claimed ancestry from North America.

Source country variables

GDP per capita: calculated from Maddison (2001), Appendix C, pp. 267–333. Origin region GDP per capita calculated from countries and regional residuals, weighted by population.

Education years: Average years of education for the population aged 15 years and over for 80 countries, weighted by country populations within each of the 16 regions. Data from Barro and Lee (2001), available from the website of the Center for International Development at Harvard University.

Inequality: Gini coefficient of household income for 80 countries, weighted by population in each of the 16 origin regions. These data were originally assembled by Deininger and Squire, and have now been augmented and made available at the website of the World Institute for Development Economics Research of the United Nations University (UNU-WIDER). The observations selected are (almost) exclusively those labelled as ‘high quality’, with adjustments according to whether the underlying data were for income/expenditure, for gross/net income or for individuals/households. Census year observations are obtained from linear interpolation, where appropriate.
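The interpolation step for census years could be sketched as below. The function name `interpolate_linear` is hypothetical, and the appendix does not say which neighbouring observations are used; the sketch simply assumes one observation on each side of the census year.

```python
def interpolate_linear(
    year: float, year0: float, value0: float, year1: float, value1: float
) -> float:
    """Linearly interpolate a value at `year` from two dated observations."""
    if year1 == year0:
        raise ValueError("the two observation years must differ")
    weight = (year - year0) / (year1 - year0)
    return value0 + weight * (value1 - value0)
```

With purely hypothetical Gini observations of 40.0 in 1975 and 50.0 in 1985, the interpolated value for the 1980 census year lies halfway between the two.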

Distance from Chicago: Distances in km are calculated using a Java routine available on the website of John A. Byers. Origin-region capitals are Mexico City, Panama City, Kingston (Jamaica), Brasilia, Stockholm, London, Paris, Rome, Berlin, Moscow, Beijing, Jakarta, Mumbai, Jerusalem, Johannesburg and Sydney. Distances from the nine US divisions (Tables 6 and 7) are measured from: Boston, New York, Chicago, Kansas City, Baltimore, Memphis, Houston, Denver and Los Angeles.
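The Java routine itself is not reproduced here. As a stand-in, the sketch below uses the haversine formula, assuming that great-circle distance on a sphere is the quantity intended; the function name `great_circle_km`, the mean Earth radius of 6,371 km and the approximate city coordinates in the usage note are all assumptions rather than values taken from the paper.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: mean Earth radius


def great_circle_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Haversine great-circle distance in km between two points in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    delta_phi = phi2 - phi1
    delta_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(delta_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(delta_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Usage with approximate coordinates (assumed): Chicago to London.
chicago_to_london_km = great_circle_km(41.88, -87.63, 51.51, -0.13)
```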

All data and code are available from the authors upon request.

