The following variables are taken from the 5% samples of the 1980, 1990 and 2000 US Censuses, available from the IPUMS database (Ruggles et al. 2008).
Men aged between 25 and 64
Annual wage and salary income (INCWAGE)
Annual hours worked: weeks worked last year (WKSWORK1) × usual hours worked per week (UHRSWORK)
Hourly wages: annual earnings/annual hours worked
Education groups: EDUCREC, coded into four categories: less than high school (codes 1–6), high school (code 7), 1–3 years of college (code 8) and 4+ years of college (code 9).
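The wage and education definitions above can be sketched as follows. This is an illustration only: the function names are hypothetical, and returning no value when annual hours are zero is an assumption of the sketch rather than a rule stated in the text.

```python
from typing import Optional

# Education categories built from the EDUCREC codes listed above.
EDUC_GROUPS = {
    "less than high school": range(1, 7),   # codes 1-6
    "high school": range(7, 8),             # code 7
    "1-3 years of college": range(8, 9),    # code 8
    "4+ years of college": range(9, 10),    # code 9
}


def education_group(educrec: int) -> Optional[str]:
    """Map an EDUCREC code to one of the four groups; None if outside 1-9."""
    for label, codes in EDUC_GROUPS.items():
        if educrec in codes:
            return label
    return None


def hourly_wage(incwage: float, wkswork1: float, uhrswork: float) -> Optional[float]:
    """Annual wage and salary income divided by annual hours worked.

    Annual hours are weeks worked last year times usual hours per week.
    Returning None for zero annual hours is an assumption of this sketch.
    """
    annual_hours = wkswork1 * uhrswork
    if annual_hours <= 0:
        return None
    return incwage / annual_hours
```

For example, a worker with INCWAGE of 40,000, 50 weeks worked and 40 usual hours per week has 2,000 annual hours and an hourly wage of 20.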
Years since migration: calculated from the year of immigration (YRIMMIG, which is reported in intervals) and the census year. Origin-group years since migration are the average over all immigrants from a given origin, excluding the age/education group to which the average is applied.
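The origin-group measure is a leave-out average: for each origin and age/education cell, it averages over immigrants from that origin who are not in the cell. A minimal sketch, assuming each immigrant is represented by a hypothetical (origin, cell, years since migration) tuple and ignoring census person weights:

```python
from collections import defaultdict


def leave_out_mean_ysm(records):
    """Leave-out mean of years since migration by origin and cell.

    records: iterable of (origin, cell, years_since_migration) tuples.
    Returns {(origin, cell): mean over immigrants from `origin` who are
    NOT in `cell`}. Pairs with no remaining immigrants are omitted.
    Person weights are ignored (an assumption of this sketch).
    """
    origin_sum = defaultdict(float)
    origin_n = defaultdict(int)
    cell_sum = defaultdict(float)
    cell_n = defaultdict(int)

    for origin, cell, ysm in records:
        origin_sum[origin] += ysm
        origin_n[origin] += 1
        cell_sum[(origin, cell)] += ysm
        cell_n[(origin, cell)] += 1

    result = {}
    for (origin, cell), own_sum in cell_sum.items():
        rest_n = origin_n[origin] - cell_n[(origin, cell)]
        if rest_n > 0:
            # Subtract the cell's own contribution from the origin total.
            result[(origin, cell)] = (origin_sum[origin] - own_sum) / rest_n
    return result
```

Subtracting each cell's own total from the origin total avoids recomputing the average separately for every cell.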
Migrant stock variables
Derived from birthplace (BPL) codes according to the following classification:
Mexico (200), Central America (210), Caribbean (250–260), South America (300), Scandinavia (400–405), UK and Ireland (410–414), Western Europe (420–429), Southern Europe (430–440), Central/Eastern Europe (450–459), Russian Empire (460–465), East Asia (500–509), Southeast Asia (510–519), India/Southwest Asia (520–524, 548), Middle East/Asia Minor (530–547, 549), Africa (600) and Australia and New Zealand (700). For each census, the origin-region immigrant stock is expressed as a percentage of the total US population.
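The classification above amounts to a lookup from BPL code to origin region, followed by a share calculation. The sketch below is illustrative: the names are hypothetical, person weights are ignored, and the upper bound of 414 for UK and Ireland is an assumed reading of the printed range.

```python
from collections import Counter

# Inclusive (low, high) BPL code ranges for the 16 origin regions listed above.
ORIGIN_REGIONS = {
    "Mexico": [(200, 200)],
    "Central America": [(210, 210)],
    "Caribbean": [(250, 260)],
    "South America": [(300, 300)],
    "Scandinavia": [(400, 405)],
    "UK and Ireland": [(410, 414)],  # upper bound 414 is an assumption
    "Western Europe": [(420, 429)],
    "Southern Europe": [(430, 440)],
    "Central/Eastern Europe": [(450, 459)],
    "Russian Empire": [(460, 465)],
    "East Asia": [(500, 509)],
    "Southeast Asia": [(510, 519)],
    "India/Southwest Asia": [(520, 524), (548, 548)],
    "Middle East/Asia Minor": [(530, 547), (549, 549)],
    "Africa": [(600, 600)],
    "Australia and New Zealand": [(700, 700)],
}


def origin_region(bpl):
    """Return the origin region for a BPL code, or None if unclassified."""
    for region, ranges in ORIGIN_REGIONS.items():
        if any(low <= bpl <= high for low, high in ranges):
            return region
    return None


def stock_percentages(bpl_codes):
    """Origin-region stocks as a percentage of all persons in `bpl_codes`.

    The denominator is the whole population, including the native-born.
    Person weights are ignored (an assumption of this sketch).
    """
    total = len(bpl_codes)
    counts = Counter(origin_region(b) for b in bpl_codes)
    counts.pop(None, None)  # drop persons outside the 16 regions
    return {region: 100.0 * n / total for region, n in counts.items()}
```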
The past stock variable is the average of these percentages in the 120 years before the date of observation, excluding the census years 1890 and 1930 for which the data were missing at the time the paper was drafted. To be precise, when looking at immigrant outcomes in the 1980 census, we use the past stock from the 1860–1970 censuses; when looking at immigrant outcomes in the 1990 census, we use the past stock from the 1870–1980 censuses; and when looking at immigrant outcomes in the 2000 census, we use the past stock from the 1880–1990 censuses. Because of the missing data for 1890 and 1930, we are averaging in each case across ten censuses conducted over a 120-year period.
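The averaging window can be made explicit with a short sketch. It assumes a hypothetical dictionary mapping census year to the origin-region stock percentage; the function names are illustrative.

```python
# Census years for which stock data were missing, as stated in the text.
MISSING_CENSUSES = frozenset({1890, 1930})


def past_stock_years(census_year, window=120, missing=MISSING_CENSUSES):
    """Decennial census years in the `window` years before `census_year`,
    excluding those with missing data."""
    candidates = range(census_year - window, census_year, 10)
    return [year for year in candidates if year not in missing]


def past_stock(stock_by_year, census_year):
    """Average stock percentage over the available past censuses."""
    years = past_stock_years(census_year)
    return sum(stock_by_year[year] for year in years) / len(years)
```

For the 1980 census, `past_stock_years(1980)` runs from 1860 to 1970 and contains ten census years once 1890 and 1930 are dropped, matching the description above.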
We also explore various alternative measures of past immigrant stock for the time periods of 30, 70 and 120 years ago. Because this measure was unavailable in 1930 at the time the paper was drafted, the immigrant stock 70 years ago in 2000 is the average of the percentages in 1920 and 1940.
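A sketch of the lagged measure follows, again assuming a hypothetical dictionary of stock percentages by census year. The text states the averaging rule only for the 70-year lag in 2000; applying the same fallback to any missing year is an assumption of this sketch.

```python
def lagged_stock(stock_by_year, census_year, lag):
    """Stock percentage `lag` years before `census_year`.

    If the target census is missing, fall back to the mean of the
    adjacent decennial censuses (assumed to generalize the 1930 rule).
    """
    target = census_year - lag
    if target in stock_by_year:
        return stock_by_year[target]
    return (stock_by_year[target - 10] + stock_by_year[target + 10]) / 2
```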
Ancestry is the first-mentioned ancestry (ANCESTR1), which contains a diverse range of origins that have been classified into the same groups as birthplace. We calculate the proportion aged 15 and over claiming ancestry from these origin regions, excluding the foreign-born and non-responses but including in the base those who claimed ancestry from North America.
Source country variables
GDP per capita: calculated from Maddison (2001), Appendix C, pp. 267–333. Origin region GDP per capita calculated from countries and regional residuals, weighted by population.
Education years: Average years of education for the population aged 15 years and over for 80 countries, weighted by country populations within each of the 16 regions. Data from Barro and Lee (2001), available from the website of the Center for International Development at Harvard University: http://www.cid.harvard.edu/
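The regional aggregation used for the source country variables is a population-weighted mean across the countries in each region. A minimal sketch, in which the function name is hypothetical and dropping countries without an observation is an assumption:

```python
def weighted_mean(values, weights):
    """Population-weighted mean of country values within a region.

    values:  country-level observations (None where unavailable).
    weights: country populations, in the same order.
    Countries with no observation are dropped (an assumption).
    """
    pairs = [(v, w) for v, w in zip(values, weights) if v is not None]
    total_weight = sum(w for _, w in pairs)
    if total_weight == 0:
        raise ValueError("no observations with positive weight")
    return sum(v * w for v, w in pairs) / total_weight
```

For example, two countries with 4 and 8 average years of education and populations in the ratio 3:1 give a regional value of 5.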
Inequality: Gini coefficient of household income for 80 countries, weighted by population in each of the 16 origin regions. These data were originally assembled by Deininger and Squire, and have now been augmented and made available at the website of the World Institute for Development Economics Research of the United Nations University (UNU-WIDER) at: http://www.wider.unu.edu/. The observations selected are (almost) exclusively those labelled as ‘high quality’, with adjustments according to whether the underlying data were for income/expenditure, for gross/net income or for individuals/households. Census year observations are obtained from linear interpolation, where appropriate.
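The interpolation step can be illustrated as below. The sketch assumes a hypothetical dictionary of Gini observations by year and returns no value outside the observed range; the text does not say how such cases were handled.

```python
def interpolate(observations, year):
    """Linearly interpolate a value for `year` from {year: value} data.

    Uses the nearest observations on either side of `year`. Returns None
    when `year` is not bracketed by observations (no extrapolation, which
    is an assumption of this sketch).
    """
    if year in observations:
        return observations[year]
    earlier = [y for y in observations if y < year]
    later = [y for y in observations if y > year]
    if not earlier or not later:
        return None
    y0, y1 = max(earlier), min(later)
    v0, v1 = observations[y0], observations[y1]
    return v0 + (v1 - v0) * (year - y0) / (y1 - y0)
```

With observations of 40 in 1975 and 50 in 1985, the interpolated value for the 1980 census year is 45.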
Distance from Chicago: Distances in km are calculated using a Java routine available on the website of John A. Byers at: http://www.chemical-ecology.net/java/lat-long.htm. Origin-region capitals are Mexico City, Panama City, Kingston (Jamaica), Brasilia, Stockholm, London, Paris, Rome, Berlin, Moscow, Beijing, Jakarta, Mumbai, Jerusalem, Johannesburg and Sydney. Distances from the nine US divisions (Tables 6 and 7) are measured from: Boston, New York, Chicago, Kansas City, Baltimore, Memphis, Houston, Denver and Los Angeles.
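The formula used by the routine cited above is not documented here. As an illustration only, great-circle distances of this kind can be approximated with the standard haversine formula. The mean Earth radius of 6,371 km and the rounded city coordinates in the usage comment are assumptions, not values taken from the source.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (an assumption of this sketch)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Usage with approximate coordinates for Chicago and London:
# haversine_km(41.88, -87.63, 51.51, -0.13)
```

Results from this formula may differ slightly from those of the original routine, depending on the Earth model it uses.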
All data and code are available from the authors upon request.