Understanding international migration: evidence from a new dataset of bilateral stocks (1960–2000)

In this paper I present a new database of bilateral migrant stocks and I provide new evidence on the determinants of international migration. The new Census-based data are obtained from National Statistical Offices of 24 OECD countries, and they cover the total stock of immigrants in each destination country for 1960–2000, including 188 countries of origin, sometimes in grouped categories. For each census, I keep grouped categories in a raw manner, without making imputations to specific origin countries. In the empirical analysis, I give an explicit treatment to these grouped categories. Results present strong evidence of heterogeneous effects of income gains on migration prospects depending on distance. For example, a 1000$ increase in US income per capita increases the stock of Mexican immigrants in the country by a percentage 2.6 times larger than the percentage increase in the stock of Chinese (8 vs. 3.1 %).


Introduction
International migration has increased dramatically in recent decades. Understanding the determinants of the movement of workers across international borders is crucial for immigration policy design. This paper aims to enhance our knowledge about these determinants by presenting new data on bilateral migrant stocks, a new treatment of those data in the empirical analysis, and new empirical evidence on the determinants of international migration.
To create the new database, I collected data on international migrant stocks by country of origin from National Statistical Offices of the 24 richest OECD countries. This dataset includes bilateral stocks of immigrants from 188 countries of origin into these 24 destination countries over the period 1960 to 2000. The data come from Census records in these destination countries and therefore cover the total number of immigrants living in each country. Importantly, because the data sometimes appear in grouped categories, I keep these groups in raw form, without making imputations to specific countries of origin. 1 This is important because imputations, dropping grouped observations, and/or counting grouped observations as zeros may lead to important biases in the estimates.
Empirically, this paper makes two contributions. First, it gives explicit treatment to these grouped data in standard gravity regressions. Second, it presents evidence on the existence of heterogeneous effects of income gains on migration prospects depending on distance. According to a static model (the approach mostly followed by the literature), when individuals decide whether to migrate to another country, they base their decision on net income gains from migration, i.e. the differential in expected wages between the two countries net of (one time) moving costs. 2 From a dynamic point of view, however, individuals may care about moving costs (distance in particular) even after having migrated. Large moving costs may reduce their flexibility to move back and forth to their home country as a consequence of income shocks; 3 and, if individuals dislike living far away from home, they may require a compensating wage differential for living abroad that might be increasing in distance. Forward-looking individuals will take these two factors into account when deciding whether to migrate in the first place. As a result, the effect of income gains on moving prospects (net of the initial moving cost) may be heterogeneous depending on distance: individuals from more distant countries would be less reactive to income fluctuations than individuals from closer countries. Results suggest that these heterogeneities are indeed very important. For example, a 1000$ increase in US income per capita increases the stock of Mexican immigrants in the US by a percentage that is 2.6 times larger than the percentage increase in the stock of Chinese immigrants. In other words, the effect of income on log migrant stocks is 2.6 times larger for Mexico than for China (8 vs. 3.1 %), given that Beijing is around 2.6 times as far from Washington DC as Mexico City is.
This differs from the standard gravity equation, which would predict linear effects of income gains on log migrant stocks (Beine et al. 2015). This result is relevant for immigration policy design. For example, a pull-driven immigration shock (i.e. positive income shock) may imply significant changes in the composition of immigrant population in terms of nationalities. Similarly, a negative shock to a developing country may have a much larger effect for neighboring countries than previous estimates in the literature suggest; this larger effect suggests that destination countries may want to favor neighboring countries in development assistance policies if they are interested in reducing immigrant inflows.
Collecting data on bilateral migration is, in general, a difficult task. Reliability of statistics from origin countries is low because it is difficult to keep track of the people who leave the country. Data from destination countries are more accurate. The lack of comparable cross-destination country bilateral data led many papers in the literature to follow a single destination country over time (e.g. Borjas and Bratsberg 1996; Karemera et al. 2000; Clark et al. 2007). More recently, researchers and institutions have put some effort into gathering comparable bilateral migration data across destination countries. Pedersen et al. (2008) and Mayda (2010) are the first papers using cross-destination country panel data on bilateral inflows to analyze the effect of income gains and moving costs on international migration. Mayda (2010) uses a database from OECD on annual legal inflows of workers by country of origin; she uses these data to investigate the determinants of migration inflows into 14 OECD countries between 1980 and 1995. Pedersen et al. (2008) produce a similar database collecting data on issues of residence and work permits from National Statistical Offices from 1989 to 2000. They use these data to look at the effects of networks and welfare benefits on international migration. These two databases have recently been expanded by Ortega and Peri (2013) and Adserà and Pytliková (2012) respectively. 4 The four databases contain information on inflows of immigrants and, with a lower accuracy, net flows. They are based on the number of issues of residence and work permits, which is likely to produce a severe underestimation of the real numbers due to illegal migration. And, as acknowledged by the authors, they have an important amount of missing data and incorrect zero values (for countries with relatively small flows), covering, as a result, a limited fraction of total inflows (Mayda 2010, pp. 1258-1259).
Similarly to what I do in this paper, Docquier and Marfouk (2006) and Docquier et al. (2009) collect Census-based data. The aim of their databases is to gather information on stocks of immigrants by educational level, and, for this reason, they only cover two census dates, 1990 and 2000. Two papers use these data to analyze the determinants of international migration. Grogger and Hanson (2011) use them to analyze the determinants of the scale and composition of migration flows. Ortega and Peri (2014b) combine these two years of data on stocks with the OECD database on annual legal inflows used in Mayda (2010) to extrapolate stocks back to 1980 and analyze the determinants of migration flows.
Contemporaneously to this paper, a few additional datasets appeared. Özden et al. (2011) is the most similar. These authors collect bilateral stock data for the same period and from similar sources. The key difference with the current dataset is the treatment of data when bilateral information is not available. When this happens, which is often the result of grouping of data (residual categories, aggregations of countries,…), these authors try to recover the bilateral information by means of an array of different imputations. Conversely, I keep these groupings in raw form, giving them a specific treatment in the empirical analysis. Given the similarity, I draw some comparisons with this dataset below. The other three datasets are: United Nations (2013), which provides similar information for years 1990, 2000, and 2010; Brücker et al. (2013), who add the educational and gender dimension for the period 1980 to 2010; and Abel and Sander (2014), who estimate inflows and outflows out of the stock data for 1990-2010.
The rest of the paper is organized as follows. Section 2 presents the database. Section 3 introduces the econometric model and explains the implications of grouped data in terms of identification of fixed effects. Section 4 shows estimation results. And Sect. 5 concludes.

A new database on bilateral migrant stocks (1960-2000)
In this paper, I collect data from National Statistical Offices of 24 OECD countries. 5 The dataset contains stocks of immigrants by country of origin from 1960 to 2000. Data are based on destination countries' Censuses. 6 From each Census, I collect data on the stock of immigrants by country of birth or country of nationality. The dataset contains information on stocks of immigrants from 188 countries of origin (sometimes in grouped categories) into each of the destination countries. 7 Although some destination countries carry out a census every five years, most of them do it every 10 years, so data are presented at a 10-year frequency. Hence, the database is well suited for looking at long-run effects.
There are some comparability issues that are worth mentioning. First, similar to existing datasets in the literature, the definition of immigrant differs across countries. Some countries define immigrants on the basis of place of birth whereas others do so based on nationality. This might affect the comparability of stocks across destination countries, but changes over time are reasonably comparable. 8 Second, census dates vary across destination countries: roughly half of them are carried out in years ending in 0 (1960, 1970,…) and the other half in years ending in 1 (1961, 1971,…). 9 Dates are generally consistent for each country, so the difference between two census dates is usually ten years.
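Footnote 9 states that French censuses, which fall on irregular dates, are interpolated linearly to the common decade dates. A minimal sketch of that step follows; the function name and the stock values are hypothetical and serve only to illustrate the arithmetic, not to reproduce the procedure actually coded for the dataset.

```python
def interpolate_stock(year, census):
    """Linearly interpolate a migrant stock at `year`.

    `census` maps census years to observed stocks. The target year must lie
    within the range of available census years (no extrapolation here).
    """
    if year in census:
        return float(census[year])
    years = sorted(census)
    for lo, hi in zip(years, years[1:]):
        if lo < year < hi:
            weight = (year - lo) / (hi - lo)  # share of the gap already elapsed
            return census[lo] + weight * (census[hi] - census[lo])
    raise ValueError("year outside the range of available censuses")


# Hypothetical stocks (not taken from the dataset) at two French census dates
census = {1954: 1000.0, 1962: 1800.0}
stock_1961 = interpolate_stock(1961, census)
print(stock_1961)
```

With these made-up values, 1961 sits seven eighths of the way between the 1954 and 1962 censuses, so the interpolated stock is close to the 1962 figure.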
Footnote 7 (continued): Bermuda, The Netherlands Antilles, and Puerto Rico. Montenegro and Serbia are considered as a single country.
Footnote 8: Destination country fixed effects, and especially destination-time fixed effects, are likely to account for most if not all of these differences, given the log specification of stocks in the equations estimated below. A caveat would still remain if policies introduced by a destination country affect different origin countries differently at different points in time.
Footnote 9: The only exception is France, whose Censuses were carried out in 1954, 1962, 1968, 1975, 1982, 1990, 1999 and 2006. I interpolate them linearly to fit census dates to 1961, 1971, 1981, 1991, and 2001. Some years for three additional countries have to be extrapolated as well. Disaggregated information for Denmark and Finland circa 1960 and 1970 was not available, so I exploit information on residence permits for Denmark and on main language used for Finland. For Germany, pre-unification censuses are not available, so data on legal flows into West Germany are used to extrapolate. Robustness analysis to the exclusion of 1960 and 1970 is presented in the Appendix. Finally, data for the United Kingdom include only immigrants living in England and Wales; for year 2000 they represent 95 % of the total stock of immigrants in the UK, a percentage that was uniformly distributed across origin countries.
Data may be grouped for several reasons. One of them is that Statistical Offices decide to group several countries into one or some residual categories (usually labeled as "Other countries in region X"). In some other cases, they report the stock of immigrants born in a former country that later was split into smaller countries: USSR, Czechoslovakia, Yugoslavia, Ethiopia/Eritrea, Rhodesia, and the West Indies Federation are good examples. Finally, in some cases all origin countries are grouped, either because I only observe the total stock of immigrants in the destination country (a single group), or because the data are presented in big aggregate categories (e.g. data by continent or subcontinent of origin). Table 1 summarizes the importance of grouped data. There are several aspects to highlight from the Table. First, data are more disaggregated in recent years: the average number of countries in grouped categories decreases from 167 to 87, and the share of the total stock that they represent decreases from more than one third in 1960 to less than 10 % in year 2000. Second, even though in 1960 and 1970 the coverage of total migrant stocks by bilateral data is only around two thirds of the stock, this coverage increases to 80 % if we exclude the destination countries for which we only observe the total migrant stock. And third, even considering only disaggregated bilateral observations, the coverage of the total stock of immigrants is fairly large. For example, it is larger than in the OECD database used in Mayda (2010). Indeed, Mayda (2010) states that the coverage of total inflows in her database ranges from 45 % (Belgium) to 84 % (US). For the equivalent time period, the average coverage by bilateral observations here ranges from 80 to 91 %. Regarding the number of countries with disaggregate bilateral observations, Mayda (2010) and Ortega and Peri (2013) use a sample of 79 and 120 origin countries respectively, including zero flows that "are likely to correspond to very small flows rather than zero flows" (Mayda 2010); Pedersen et al. (2008) report a substantial portion of missing values among their sample of 129 countries of origin. The country coverage for these years is similar on average in Table 1, but it is much larger both if we restrict to the sample of 15 destination countries considered in Mayda (2010) and Ortega and Peri (2013), or if we consider federations of countries that were single countries at the time as ungrouped countries (e.g., Former Yugoslavia accounted for almost half of the stock of immigrants in Austria in years between 1970 and 1990, one quarter of the stock in Switzerland in year 2000, and around 10 % of the Swedish stock in years between 1980 and 2000, and the USSR represented between 5 and 8 % of US and Canadian stocks in years 1960 and 1970, and around 3 % in several other destination countries).
Note to Table 1: The first column for each year represents the number of countries that are in grouped categories in that period. The total number of possible origin countries is 187. The second column for each year is the % of the total stock of immigrants that is in grouped categories. Each destination country may have several grouped categories. The last two rows are averages across destination countries.
Table 2 presents averages, standard deviations, and extreme values for each destination country, and the number of available observations.
The left panel refers to the baseline sample, which includes all disaggregated bilateral observations plus one observation for each set of grouped countries. To compute these statistics, grouped observations are weighted by the number of countries included in the group. The right panel restricts the sample to disaggregated bilateral observations. The baseline sample includes 6,804 bilateral observations plus 625 grouped observations. These observations are not uniformly distributed across destination countries, ranging from the 55 single bilateral observations for Luxembourg (plus 26 grouped observations) to the 744 for Switzerland (plus 28 groups). The comparison of averages across the two samples suggests that grouped observations tend to include countries with smaller stocks of migrants, which is not surprising given that some grouping occurs due to labeling like "Other countries in region X". The difference in average stock size between the two samples, however, may be exaggerated by the fact that data are more grouped in earlier years of the sample, when immigrant stocks are smaller. The fact that grouping does not occur at random highlights the importance of including grouped observations in the analysis (as opposed to dropping them from the sample). Table 2 shows substantial variation in average migrant stocks, ranging from 46 immigrants per origin country in Iceland to 99,276 individuals per country in the United States. There is also a large variation across origin countries, as appreciated from the size of standard deviations. The extreme case is the US, with a standard deviation of 395,483 individuals, and stocks of immigrants that range from the 11 immigrants from Djibouti in 1990 to the 9,325,452 Mexicans in year 2000, but it is not the only one: Canada, Germany, France, and Japan also have sizeable standard deviations, and they are also quite large in Greece and Ireland compared to averages. 
Overall, the standard deviation in the whole sample is 90,729 individuals, roughly ten times the sample average. 10 Table 2 does not provide a sense of time series variation. Figure 1 draws the evolution of immigrant shares (i.e. stock of immigrants over population) across destination countries over the sample period. Different patterns are observed across countries: stable low-immigration countries (Korea and Japan), stable high-immigration countries (Australia, Canada and New Zealand), old immigration countries with a strong increasing trend (US, Luxembourg, Switzerland, and the UK), old immigration countries with a slight decrease (Belgium and France), and new immigration countries (Spain, Italy, Austria, Greece, Portugal, and Nordic Countries). Figure 2 adds the country of origin layer. In particular, I plot the evolution of the stock of immigrants and of the bilateral migration rate for a selected group of country pairs. 11 The figure shows substantial variation across countries and over time. The top panel includes a sample of country pairs with low migration rates, which include some pairs with flat trend (e.g. North American in Ireland or Japan, Chinese in Korea), and others with important increases over the sample period (Somali in Italy, Polish in Austria, Swedish in Finland). The bottom panel includes country pairs with high migrant rates, including pairs with decreasing rates (Korean in Japan, Irish in the UK, Spanish in France), roughly constant rates (Australian and British/Irish in New Zealand, British in Australia), and sharply increasing rates (Ecuadorian in Spain, Albanian in Greece, and, most extremely, Filipino and Mexican in the US).
Fig. 1 Each plot presents the destination country's share of immigrants (immigrants over population). Left axes have a common scale, ranging from 0 to 25 %, which is compressed for Luxembourg due to its exceptionally large fraction of immigrants (36.4 % in year 2000).
Fig. 2 Bilateral migrant stocks (1000s) and share of population who migrated (‰) for selected country pairs. a Some country pairs with low migrant rates. b Some country pairs with high migrant rates. Solid lines are bilateral migrant stocks (in 1000s, left axis) from the origin to the destination countries listed in each title. Dashed lines are migrant rates (in ‰, right axis), i.e. stock of migrants from country "X" in country "Y" over total population of country "X" (origin). Left axis scale is common to all country pairs, ranging from 0 to 1000, and is compressed for MEX and PHI to USA (9.3 and 4.4 million respectively in year 2000). Right axes in the top panel also have a common scale (0 to 1 ‰); in the bottom panel, it ranges from 0 to 5 ‰ in the first rows, from 0 to 20 ‰ in the second one, and from 0 to 120 ‰ and 0 to 240 ‰ in the last row.
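The weighting used for the baseline panel of Table 2, in which each grouped observation counts as many times as the countries it contains, can be sketched as follows. The sketch assumes that a group's total stock is spread evenly over its member countries when computing the per-country average; that reading, the function name, and the numbers are illustrative assumptions rather than the paper's exact procedure.

```python
def mean_stock_per_origin(bilateral, grouped):
    """Average stock per origin country.

    bilateral: list of stocks, one per origin country observed on its own.
    grouped:   list of (total_stock, n_countries) pairs, one per group.
    Each group enters with its per-country stock (total / n) and a weight
    equal to the number of countries it contains.
    """
    weighted_sum = sum(bilateral)
    total_weight = len(bilateral)
    for total_stock, n_countries in grouped:
        weighted_sum += (total_stock / n_countries) * n_countries
        total_weight += n_countries
    return weighted_sum / total_weight


# Hypothetical destination: two bilateral stocks and one residual group
bilateral = [500.0, 300.0]
grouped = [(200.0, 4)]  # "Other countries in region X": 4 countries, 200 migrants

baseline_mean = mean_stock_per_origin(bilateral, grouped)
bilateral_only_mean = mean_stock_per_origin(bilateral, [])
print(baseline_mean, bilateral_only_mean)
```

In this made-up case the bilateral-only average is well above the baseline average, which mirrors the pattern described in the text: grouped categories tend to collect origins with small stocks.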

Comparison with Özden, Parsons, Schiff and Walmsley (2011)
Contemporaneous work by Özden et al. (2011) provides a similar dataset. These authors' approach is to impute grouped observations to specific origin countries. They do so based on the propensity of destination countries to accept migrants from a particular origin in subsequent years, and based on the propensity of a given origin country to send them abroad. These imputations may be particularly harmful when one is interested in estimating, precisely, the determinants of international migration. This method may generate measurement error correlated, almost by construction, with the regressors of interest. Table 3 reproduces Table 2 using the Özden et al. (2011) dataset (generating artificially grouped observations). The comparison between the two tables is interesting. On average, their data report about 2,000 extra immigrants per origin country, and almost 4,000 when only origin countries with bilateral observations (in the current dataset) are considered. This gap is not homogeneous across destination countries. Some countries (Australia, Belgium, Canada, Denmark, Finland, Greece, Iceland, Ireland, Luxembourg, New Zealand, Norway, Switzerland, and United Kingdom) present very similar stocks. Others (Austria, France, Germany, Italy, Japan, The Netherlands, Portugal, Spain, Sweden, and United States) present substantially different averages both in grouped and in bilateral observations. Finally, one country (Korea) is similar in the bilateral observations and differs substantially in grouped observations. Data in Özden et al. (2011) also have a larger cross-origin country variance.
To elaborate further on these differences, Fig. 3 presents histograms of the distribution of discrepancies for those observations with bilateral information available in both datasets. The majority of the observations (4,861 out of 6,804, or 71.4 %) present no discrepancies or discrepancies of less than 1,000 migrants (central lines of Fig. 3c). The remaining 1,943 observations (28.6 %) are distributed as follows: 1,338 (19.7 %) have discrepancies between 1,000 and 10,000; 534 (7.8 %) have discrepancies between 10,000 and 100,000 migrants, and 69 (1 %) have discrepancies above 100,000 migrants. Most of these extreme discrepancies are given by the defi-
Fig. 3 Discrepancies with Özden et al. (2011) for available bilateral observations. a >100,000, b ∈ (100,000; 10,000), c <10,000. The histograms present the number of origin countries/periods with each level of discrepancies. The left histogram omits (absolute) discrepancies smaller than 100,000 migrants. The center histogram presents observations with absolute deviations between 100,000 and 10,000 migrants. The right panel plots observations with absolute deviations smaller than 10,000 migrants. A positive number indicates that this dataset reports more immigrants than Özden et al. (2011).
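The size classes used to summarize the discrepancies can be reproduced with a small tabulation. The counts below are those quoted in the text; the bin labels, the function name, and the treatment of values falling exactly on a boundary are assumptions made for this sketch.

```python
def discrepancy_bin(difference):
    """Assign an (absolute) discrepancy in migrants to a size class."""
    size = abs(difference)
    if size < 1_000:
        return "below 1,000"
    if size < 10_000:
        return "1,000 to 10,000"
    if size <= 100_000:
        return "10,000 to 100,000"
    return "above 100,000"


# Counts per size class as quoted in the text, out of 6,804 observations
counts = {
    "below 1,000": 4861,
    "1,000 to 10,000": 1338,
    "10,000 to 100,000": 534,
    "above 100,000": 69,
}
TOTAL = 6804
shares = {label: round(100 * n / TOTAL, 1) for label, n in counts.items()}
for label, share in shares.items():
    print(f"{label}: {share} %")
```

The shares match the percentages reported in the text. The three upper classes as quoted add up to 1,941 rather than 1,943 observations; the text does not say where the remaining two fall.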

Other variables
The remaining variables used in the regression analysis below come from different sources (descriptive statistics provided in Table 4). All variables are averages over years t − 10 to t − 1. GDP per capita, population, and government share of GDP come from Penn World Tables (versions 6.2 and 7.0). In order to minimize the number of missing values for GDP per capita, I use the Total Economy Database (Conference Board) to extrapolate backwards discontinuous Penn World Tables series. Both origin and destination countries' series are in constant international dollars of 2005 (chain). Population in origin and destination countries is in millions. Government share is public sector consumption over real GDP. Age dependence ratio at destination country (individuals older than 65 years over population of working age) is from World Development Indicators. Unemployment rate (in %) is obtained from the OECD. Geographic variables include physical distance (great circle distance between the two capitals) and dummies for having a common language, a past colonial relationship and a common border. The distance variable is based on Rose (2004) data, extended to cover the whole sample. The common language dummy is constructed using data from Alesina et al. (2003) and The World Factbook from the CIA; a pair of countries is considered to have a common language if there is a particular language that is spoken by at least 10 % of the population in each of the two countries. Colonial relationship and common border dummies are also based on The World Factbook. War and Polity IV autocracy-democracy index are constructed with data from the Polity IV Project. The war variable measures the fraction of months over the preceding decade that the country was in any type of war.
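Two of the geographic variables follow simple rules that can be written down directly: the great circle distance between capitals and the 10 % threshold for a common language. The sketch below uses the haversine formula with a mean Earth radius of 6,371 km, which is an assumption (the text does not state which formula or radius underlies the Rose (2004) distances), and hypothetical language shares.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; assumed, not taken from the paper


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def common_language(shares_a, shares_b, threshold=0.10):
    """Return 1 if some language is spoken by at least `threshold` of the
    population in both countries, 0 otherwise.

    shares_a, shares_b: dicts mapping language -> population share (0 to 1).
    """
    return int(any(
        share >= threshold and shares_b.get(language, 0.0) >= threshold
        for language, share in shares_a.items()
    ))


# Hypothetical language shares for two countries
country_a = {"spanish": 0.90, "english": 0.05}
country_b = {"english": 0.80, "spanish": 0.12}
print(common_language(country_a, country_b))

# A quarter of the equator, as a sanity check on the distance formula
print(great_circle_km(0.0, 0.0, 0.0, 90.0))
```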
The Polity IV index ranges from −10, indicating autocracy, to 10, which indicates democracy, through values around 0, which indicate anocracy (a situation of instability emerged from the absence of a strong power and of the rule of law). The young population variable is constructed using data of total

Standard gravity model
In the remainder of the paper, I use the new data presented above to analyze the determinants of international migration. In particular, the data are used to estimate different types of "gravity equations" (see Beine et al. 2015 for a review of this literature).
The simplest gravity equations can be derived from random utility models in which the utility of moving from home country j to country k at time t is of the form:
$$U_{ijkt} = w_{kt} - c_{jk} + \varepsilon_{ijkt}, \qquad (1)$$
where i indicates an individual, $w_{kt}$ is the wage at country k and time t, $c_{jk}$ is the moving cost from j to k (where $c_{jj}$ is typically normalized to 0), and $\varepsilon_{ijkt}$ is a random term that is Type-I extreme value distributed. Given the distributional assumption of the random term, the (log) relative odds of moving from country j to country k vs. staying in country j are equivalent to:
$$\ln \frac{M_{jkt}}{Pop_{jt}} = w_{kt} - w_{jt} - c_{jk}, \qquad (2)$$
where $M_{jkt}$ is the stock of migrants from country j to country k in year t, and $Pop_{jt}$ is the (ex-post) population in country j at time t.
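The step from the random utility model to the odds expression relies on a standard property of the Type-I extreme value distribution: choice probabilities take the multinomial logit form, so the log odds of any two options equal the difference in their deterministic utilities. A minimal numerical check follows, assuming utility equal to wage minus moving cost with a unit scale parameter; the wages and costs are hypothetical.

```python
import math


def choice_probabilities(utilities):
    """Multinomial logit probabilities from deterministic utilities.

    utilities: dict mapping option -> deterministic utility V.
    The maximum is subtracted before exponentiating for numerical stability.
    """
    v_max = max(utilities.values())
    exp_v = {option: math.exp(v - v_max) for option, v in utilities.items()}
    denominator = sum(exp_v.values())
    return {option: e / denominator for option, e in exp_v.items()}


# Hypothetical wages and moving costs faced by someone born in country "j"
wages = {"j": 10.0, "k": 14.0, "m": 12.0}
costs = {"j": 0.0, "k": 3.0, "m": 1.5}  # cost of staying normalized to 0

V = {country: wages[country] - costs[country] for country in wages}
P = choice_probabilities(V)

# Log odds of moving to k versus staying in j
log_odds_k = math.log(P["k"] / P["j"])
print(log_odds_k, wages["k"] - wages["j"] - costs["k"])
```

The two printed numbers coincide up to floating point error, and the result does not depend on the third option "m", which is the independence of irrelevant alternatives property that richer fixed-effects structures are meant to relax.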
All the variables included in Eq. (3) are described in Sect. 2.4. Different specifications include different combinations of fixed effects, depending on the assumptions underlying the distribution of $\varepsilon_{ijkt}$. These include country of origin, destination country, year, origin × year, destination × year, and/or country pair fixed effects. Migration is expected to be positively affected by income gains (hence, $\alpha_1$ is expected to be positive and $\alpha_2$ negative), by having a common language, a colonial relation, and a common border, and by the population in the origin country, and negatively affected by physical distance; the expected sign of the effect of population in the destination country is ambiguous a priori. A similar micro-foundation for this regression can be found in the model by Grogger and Hanson (2011), or in the survey by Beine et al. (2015), and it is comparable to the previous studies in the literature (Mayda 2010; Grogger and Hanson 2011; Ortega and Peri 2013). Bertoli and Fernández-Huertas Moraga (2013) highlight the importance of origin-time dummies combined with country pair dummies due to the "Multilateral Resistance to Migration". Beine et al. (2015) go a step further and suggest adding origin × time × nest dummies on top. As noted below, with the number of observations left due to the grouping of the data, these models are too demanding in terms of degrees of freedom to be credibly estimated. 12,13 While the inclusion of GDP per capita in levels to approximate origin and destination country wages in Eq. (3) seems very closely connected to the underlying theoretical model described by Eqs. (1) and (2) (as noted by Grogger and Hanson 2011), there are many papers in the literature that estimate equations like (3) including GDP per capita in logs, as highlighted in Beine et al. (2015). For comparability with these studies, I run some specifications of Eq. (3) in which GDP per capita is introduced in logs.
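Since Eq. (3) is not reproduced in this text, it may help to see what a specification consistent with the description above could look like. The following is only a sketch: the regressor names, the use of logs for population and distance, the residual term, and the single symbol $\lambda$ standing in for whichever fixed effects a given column includes are all assumptions, not the exact estimating equation.

```latex
\ln M_{jkt} = \alpha_1\, GDPpc_{kt} + \alpha_2\, GDPpc_{jt}
  + \beta_1 \ln Pop_{kt} + \beta_2 \ln Pop_{jt}
  + \beta_3 \ln d_{jk} + \beta_4\, Lang_{jk} + \beta_5\, Col_{jk} + \beta_6\, Border_{jk}
  + x_{jkt}'\delta + \lambda + \nu_{jkt}
```

Under this reading, $\alpha_1$ and $\alpha_2$ are semi-elasticities when GDP per capita enters in levels, and elasticities in the specifications that introduce it in logs.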

Heterogeneous effects of income gains
An important implication of the model described above is that an increase in GDP per capita of a destination country would increase the stock of migrants from all origin countries by the same percentage (i.e. linear effect of GDP per capita on migrant stocks in logs). Likewise, an increase in the GDP per capita of a given country of origin would change the stock of migrants from that country into all destinations by the same relative amount (linear effect on log migrant stocks).
However, the effect of income shocks on moving prospects might be more marked for closer countries compared to countries that are farther apart. For example, large moving costs (distance) reduce the flexibility of individuals to move back and forth to their home country when income changes. As a result, in the migration decision, individuals from farther away countries may give more weight to long run income (as opposed to income shocks), whereas individuals from neighboring countries will be more prone to go back and forth to take advantage of income fluctuations. Similarly, if individuals dislike living far away from home, they might require a compensating wage differential to offset the unpleasantness of living abroad. If the disutility of being far from home increases with distance, they will require an increasing wage premium to take the decision to migrate. Hence, these compensating wage differentials would also introduce a heterogeneous effect of income gains on moving prospects depending on distance, which would make migration more reactive to income at closer distances.
As a way to micro-found these heterogeneous effects, consider the following modification of Eq. (1):
$$U_{ijkt} = u(w_{kt}, d_{jk}) - c_{jk} + \varepsilon_{ijkt}, \qquad (4)$$
where $d_{jk}$ is the distance between home country j and destination k, and $u(\cdot, \cdot)$ is a utility function with an elasticity of substitution between income and distance to be identified. The case in which wage and proximity (negative distance) are perfect substitutes is observationally equivalent to the standard utility model. Following a similar procedure to the one used to derive Eq. (3), and approximating $u(w_{kt}, d_{jk})$ by a first order expansion around the mean, we obtain Eq. (5), where $\tilde{x} \equiv x - \bar{x}$ indicates that variables are in deviations with respect to sample means. Parameters $\gamma_3$ and $\gamma_4$ are reduced forms of the cross-partial derivative of $u(w_{kt}, d_{jk})$ evaluated at sample means. Hence, the presence of a heterogeneous response of migration to shocks to destination and/or origin country's income as a function of distance is an indication of a complementarity (or, potentially, substitution) between income gains and proximity.
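To see why an interaction between income and distance picks up the cross-partial derivative, it helps to write the expansion of $u$ around sample means while keeping the cross term. The sketch below is an illustration under that assumption; the shorthand $u_w$, $u_d$ and $u_{wd}$ for derivatives evaluated at $(\bar{w}, \bar{d})$ is introduced here and is not the paper's notation.

```latex
u(w_{kt}, d_{jk}) \approx u(\bar{w}, \bar{d})
  + u_w\, \tilde{w}_{kt} + u_d\, \tilde{d}_{jk}
  + u_{wd}\, \tilde{w}_{kt}\, \tilde{d}_{jk},
\qquad
\frac{\partial u}{\partial w_{kt}} \approx u_w + u_{wd}\, \tilde{d}_{jk}.
```

With $u_{wd} < 0$, the response of migration to destination income falls as distance rises above its sample mean, which is the pattern behind the Mexico and China comparison; with $u_{wd} = 0$ the response is the same for all origins and the model collapses to the standard linear case.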

Identification of fixed effects with grouped data
A potential limitation of working with grouped data is in the identification of fixed effects in the estimation of Eqs. (3) and (5). In the simplest specifications estimated below, I introduce origin and destination country fixed effects, and year dummies. Additionally, in several specifications I introduce country-pair or country of origin × year dummies. Destination country, time, and destination × time fixed effects are identified in all cases, as grouping only affects origin countries. To identify a dummy for an origin country, we need to observe at least one bilateral observation from that country, or the country must appear in a unique combination of grouped observations. 14 To identify a country of origin × year dummy, this bilateral observation or unique combination of groups has to be observed in each year. And the identification of a country-pair dummy requires the bilateral observation to be observed at least once for each destination country. When one of these conditions is not satisfied, a single dummy for each unique combination of groups is identified. Figure 4 summarizes the availability of this variation in the data. The left histogram shows the number of origin countries with 0, 1, . . . , 120 (=24 × 5) country of destination × year observations. All countries of origin have between 4 and 99 destination × year observations, which is enough to identify all origin country fixed effects; in most of the cases (105 out of 188 countries, 55 % of them) we have between 20 and 40 observations. The central histogram shows the number of countries of origin × years with bilateral data for the 0, 1, . . . , 24 destination countries. The figure shows that we cannot separately identify country of origin × year dummies in 160 out of 188×5 = 940 origin × year combinations (17 %); in most of the cases, this is due to federations of countries (USSR, Yugoslavia,…) that were still federated in the given period.
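The identification rule for origin dummies can be illustrated with a small bookkeeping exercise. The sketch assumes that a grouped observation loads on the dummies of all the countries it contains, so that two countries that always appear together in exactly the same observations have identical dummy columns and share a single identified dummy. Having distinct columns is treated here as the criterion, which is necessary but not in general sufficient for separate identification; the function name and example data are hypothetical.

```python
from collections import defaultdict


def identified_dummies(observations, origins):
    """Partition origin countries into sets that share one identified dummy.

    observations: list of sets; each set holds the origin countries covered
                  by one observation (a singleton is a bilateral observation).
    origins:      iterable with all origin countries.
    Countries with the same membership pattern across observations have
    identical dummy columns and end up in the same set.
    """
    membership = defaultdict(set)
    for index, members in enumerate(observations):
        for country in members:
            membership[country].add(index)

    classes = defaultdict(set)
    for country in origins:
        classes[frozenset(membership[country])].add(country)
    return sorted((frozenset(c) for c in classes.values()), key=sorted)


# Hypothetical data: A is observed bilaterally; B and C only ever appear
# together; D appears in one group that also contains B and C.
origins = ["A", "B", "C", "D"]
observations = [{"A"}, {"B", "C"}, {"B", "C", "D"}, {"A"}]

result = identified_dummies(observations, origins)
for dummy in result:
    print(sorted(dummy))
```

In this example A has its own dummy through bilateral observations, D has its own dummy because it appears in a unique combination of groups, and B and C share a single dummy.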
Finally, the right histogram shows the number of country pairs with bilateral data for the 0, 1, . . . , 5 periods. According to the figure, we cannot identify a country pair dummy for 1854 out of 24 × 188 = 4512 country pairs (41 %). 15 This limitation does not affect consistency of the estimates below, but it affects the precision of the estimation of the most demanding models. Richer models that include the combination of country-pair and origin × year dummies, as suggested by Bertoli and Fernández-Huertas Moraga (2013), or the even richer ones that add origin × year × nest on top, as in Beine et al. (2015), would absorb too many degrees of freedom to allow us to draw any relevant conclusion from them.

Linear effects: standard gravity model
Table 5 presents the results for the estimation of different versions of Eq. (3). All regressions include at least origin and destination country fixed effects, and year dummies. The first column is the baseline specification. The stock of migrants is positively associated with the GDP per capita of the destination country. This result suggests that better economic opportunities in the destination country encourage migration. In particular, everything else constant, a 1000$ increase in GDP per capita of the destination country increases the immigrant stock by 5.2 %. This magnitude is in line, for example, with Ortega and Peri (2013), who find a positive effect of 5-6 %. According to the results in Table 5, a 10 % increase in GDP per capita of the average country of destination (which is 17,848$, see Table 4) would increase the immigrant stock by 9.3 %. 16,17 GDP per capita in OECD countries averaged 9,101$ in 1960, and 27,341$ in year 2000. According to the results in Table 5, this 200 % increase would have increased the stock of immigrants by 95 % (25 million immigrants across the OECD), more than half of the actual increase (45 million).
[Note to Table 5, truncated: … (1) is 0.000, and the corresponding p-value for Column (7) is 0.264]
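The back-of-the-envelope magnitudes quoted for the baseline column of Table 5 can be reproduced with a short calculation. The sketch below assumes that the reported 5.2 % effect per 1000$ is applied linearly to the change in GDP per capita; that is an assumption about how the figures were obtained, and the function and variable names are hypothetical.

```python
# Reported baseline effect: % change in the immigrant stock per 1000$ of
# destination GDP per capita (Table 5, Column 1).
EFFECT_PCT_PER_1000_DOLLARS = 5.2


def stock_change_pct(gdp_change_dollars, effect=EFFECT_PCT_PER_1000_DOLLARS):
    """Linear approximation of the percentage change in the immigrant stock."""
    return effect * gdp_change_dollars / 1000.0


average_gdp = 17_848                 # average destination GDP per capita
gdp_1960, gdp_2000 = 9_101, 27_341   # OECD averages quoted in the text

ten_percent_case = stock_change_pct(0.10 * average_gdp)
long_run_case = stock_change_pct(gdp_2000 - gdp_1960)
gdp_growth_pct = 100.0 * (gdp_2000 - gdp_1960) / gdp_1960

print(f"10% GDP increase -> {ten_percent_case:.1f}% larger stock")
print(f"1960-2000 GDP growth of {gdp_growth_pct:.0f}% -> {long_run_case:.0f}% larger stock")
```

Under the linear assumption the two cases come out at roughly 9.3 % and 95 %, matching the figures in the text.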
Theoretical predictions from models like the ones in Grogger and Hanson (2011) or in Mayda (2010) suggest that α_1 and α_2 should be similar in magnitude and of opposite sign. However, Table 5 shows a much smaller effect of origin country GDP per capita. Although it is negative (consistently in all specifications), the coefficient is far from being significantly different from zero, and point estimates are one order of magnitude smaller than their destination country counterparts. This result is not new; Mayda (2010) also finds a non-significant effect, although her point estimates are in fact positive. This finding could result from an additional positive effect of origin country GDP per capita on migration prospects. Borrowing constraints could be a plausible explanation: if individuals from poorer countries (lower GDP per capita) are financially constrained, then, other things equal, their chances to migrate are lower; therefore, the larger the GDP per capita, the less constrained they are, and the larger the probability that they migrate. If that were the case, one would expect this effect to be homogeneous across all destination countries, which is in line with the findings discussed in Sect. 4.2. Several papers in the literature explore this possibility, and conclude that this is likely the case (Beine et al. 2015).
Physical and cultural distance play an important role in explaining moving costs. The elasticity of the migrant stock with respect to physical distance is about 0.9. Having a common language or a colonial relationship substantially increases the stock of immigrants. A common border, however, seems less important. These results are, again, qualitatively similar to Mayda (2010), Grogger and Hanson (2011), and Peri (2013). Finally, we can reject neither that the coefficient of log population in the origin country is equal to one, nor that the coefficient of log population in the destination country is zero, which are the values predicted by the model outlined above.
The remaining columns of Table 5 check the stability of the estimates across different versions of the same equation. In order to obtain estimates which are fully comparable to Grogger and Hanson (2011), in Column (2) I impose the same coefficient (of opposite sign) for origin and destination countries' GDP per capita. The coefficient of the income gap is 0.023 (s.e. 0.011), very close to their estimate of 0.018 (s.e. 0.029) and much more precisely estimated, given the larger coverage of the dataset presented in this paper. Additionally, the coefficients for the variables associated with moving costs are extremely similar.
The fact that these estimates are comparable to Grogger and Hanson (2011) is useful to assess the validity of the way in which grouped data are treated in this paper. Grogger and Hanson (2011) use data from Docquier and Marfouk (2006), which are census data collected in a similar manner to the data in this paper for years 1990 and 2000. Grogger and Hanson (2011) estimate their regressions using data for year 2000. The similarity in the coefficients with respect to their paper indicates that the treatment I give to the grouped data produces consistent estimates of the relevant coefficients.
In order to analyze the importance of including 100 % of migrant stocks, I drop grouped observations in Column (3). Although qualitative results hold, point estimates are somewhat different. In particular, four out of the eight coefficients are statistically different from the point estimates in Column (1), and a Wald test of the null hypothesis that all eight coefficients are equal to their counterparts in Column (1) clearly rejects it (see p-value in the note of Table 5). These differences arise because grouped observations (which are eliminated in previous studies) do not come from a random sample of countries of origin. Therefore, we can conclude that including grouped observations, so that we cover 100 % of total migrant stocks, is very important to obtain unbiased estimates.
In Columns (4) through (6), I change the specification of fixed effects. On top of origin, destination, and time fixed effects that are included in Columns (1) through (3), I enrich the analysis by adding destination × time, origin × time, and country-pair dummies respectively. These specifications are more demanding in terms of degrees of freedom (see discussion on Fig. 4). Estimates are very stable across specifications. This stability of the coefficients is very interesting, as each specification controls to some extent for different versions of migration policies that may affect the results. Ortega and Peri (2013) show that a specification like the one in Column (4), which includes country of origin × year dummies, emerges from a version of the random utility model in Grogger and Hanson (2011) extended to allow for individual-specific time-invariant random effects in the specification of the idiosyncratic utility function.
A problem of having some observations aggregated in grouped categories is that, since we only observe the stock of immigrants for the group, the dependent variable is measured with error whenever the log of the average stock of the group is not equal to the average of logs of bilateral stocks. The problem with this measurement error is that it is obviously correlated with the covariates. In order to check to what extent this could be a relevant issue, in Column (7) I include as controls the standard deviations of the regressors within the grouped observations (zero for bilateral observations). Given that the measurement error increases as the countries in the grouped observation differ in the stock of immigrants, these standard deviations are good proxies for the measurement error. 18 Results are again robust; none of the coefficients of the regressors of interest is statistically different from its counterpart in Column (1), and the test of the joint difference cannot reject the null hypothesis that all coefficients are (jointly) equal to the point estimates in Column (1) (the p-value of the test is reported in the note of Table 5).
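The source of this measurement error is the gap between the log of an average and the average of logs, which a small numerical sketch makes concrete. The stocks below are made-up numbers chosen only for illustration, and the function name is hypothetical.

```python
import math


def grouping_log_gap(stocks):
    """log(mean stock) minus mean(log stock) for one grouped category."""
    mean_stock = sum(stocks) / len(stocks)
    mean_log = sum(math.log(s) for s in stocks) / len(stocks)
    return math.log(mean_stock) - mean_log


# Hypothetical bilateral stocks hidden inside two grouped categories.
homogeneous_gap = grouping_log_gap([1000, 1000, 1000])
heterogeneous_gap = grouping_log_gap([100, 1000, 10000])

print(f"homogeneous group gap:   {homogeneous_gap:.4f}")
print(f"heterogeneous group gap: {heterogeneous_gap:.4f}")
```

The gap is zero when all countries in the group have the same stock and grows as the stocks become more dispersed, which is consistent with using within-group dispersion measures as proxies for the size of the error.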
As noted in Sect. 3.1, several studies in the literature estimate equations similar to (3) including GDP per capita in logs. For comparability with these studies, I run the same specifications of Table 5 using log GDP per capita instead of the levels. Results are presented in Table 8 in Appendix 1. Point estimates are in line with results in Table 5 and those in the literature. Estimated elasticities of GDP per capita in destination countries are around 0.6, slightly smaller but not very different from the average elasticity implied by the coefficients in Table 5. Those for GDP per capita at origin are rather small. The coefficients of other variables are virtually unchanged. However, the precision of estimated elasticities for GDP per capita at origin and destination is substantially lower.
To continue the comparison between the database presented here and that by Özden et al. (2011), Table 9 in Appendix 1 replicates the regressions presented in Table 5 using the dataset produced by these authors. Sample sizes are obviously larger, given that observations for grouped countries are imputed to specific countries. While qualitatively similar, point estimates are somewhat different from those in Table 5. The estimated coefficient for GDP per capita at destination, for example, is 0.015 (s.e. 0.005) instead of 0.052 (s.e. 0.023) in Table 5, and the elasticity of distance is around −1.1 instead of 0.9. These differences are more likely attributable to differences in the data collected than to the grouping itself. The motivation for this belief is that the coefficients of Column (3) in each table, which are only estimated for the subsample of observations with bilateral information in both datasets, are also quite different.
As a final robustness check, I also estimate the regressions in Table 5 excluding observations for 1960, and for 1960 plus 1970 (results are available upon request from the author). We have seen in Sect. 2 that data are particularly grouped in these years (especially the first), and that data reliability is slightly lower in those years (see Footnote 9). Results are again robust.

Heterogeneous effects of income gains depending on distance
Estimates for Eq. (5) are presented in Table 6. As the interacted terms in Eq. (5) are expressed in differences from sample means, the linear terms can be interpreted as effects for the average country pair (and they are comparable to estimates in Sect. 4.1). Again, all regressions include at least origin and destination country fixed effects, and year dummies.
Column (1) in Table 6 is the baseline specification. The effect of destination country GDP per capita for the average country is exactly the same as in Table 5. The effect of GDP per capita at origin country is slightly more negative (−0.014 vs. −0.008), but still not statistically different from zero. The coefficients of all other regressors that are included in Table 5 are virtually unchanged (except for the point estimate of common border, which now becomes large and significant).
As the coefficient of the interaction of destination country GDP per capita and distance suggests, the effect of income gains on moving prospects is not homogeneous across all origin countries. These coefficients are interpreted as follows: the effect of a 1000$ increase in GDP per capita of the destination country is 0.21 percentage points smaller if the distance from the origin country is 10 percent larger than the average. To give a sense of these numbers, note that the distance between Washington DC (US) and Dublin (Ireland) is 5,448 km, roughly the average distance in the sample. On the other hand, the distance between Washington DC and Beijing (China) is 11,159 km, roughly twice as large. Therefore, a 1000$ increase in GDP per capita in the US would increase the stock of Irish living in the US by around 5.2 %, whereas the stock of Chinese-born would only increase by approximately 3.1 %. As an extreme example, a 1000$ increase in GDP per capita in the US would increase the stock of Mexicans by 8 %, but the stock of Taiwanese would increase by only 2.8 %.
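To see what these differential responses imply for the composition of the immigrant population, the sketch below applies the percentage effects reported in the text to a set of initial stocks. The initial stocks (1000 immigrants from each origin) are purely hypothetical and chosen so that any change in shares comes only from the differential effects; the function name is also hypothetical.

```python
# Percentage effects of a 1000$ increase in US GDP per capita, as reported
# in the text.
REPORTED_EFFECT_PCT = {"Mexico": 8.0, "Ireland": 5.2, "China": 3.1}


def shares_after_shock(stocks, effect_pct):
    """Origin shares in the total stock after applying origin-specific effects."""
    new_stocks = {o: s * (1 + effect_pct[o] / 100.0) for o, s in stocks.items()}
    total = sum(new_stocks.values())
    return {o: s / total for o, s in new_stocks.items()}


# Hypothetical equal initial stocks, not actual data.
hypothetical_stocks = {"Mexico": 1000.0, "Ireland": 1000.0, "China": 1000.0}
shares = shares_after_shock(hypothetical_stocks, REPORTED_EFFECT_PCT)
mexico_to_china_ratio = REPORTED_EFFECT_PCT["Mexico"] / REPORTED_EFFECT_PCT["China"]

for origin, share in shares.items():
    print(f"{origin}: {100 * share:.2f}% of the total stock")
print(f"Mexico/China effect ratio: {mexico_to_china_ratio:.1f}")
```

Starting from equal shares, the Mexican share rises above one third and the Chinese share falls below it, and the ratio of the two percentage effects is about 2.6. With a homogeneous effect, all shares would remain unchanged.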
This is the main empirical result of this paper. Previous literature assumes that an income shock in a destination country increases the stock of immigrants from all origin countries by the same percentage. If that were the case, then income shocks would not affect the composition of the immigrant population. But the finding described above indicates that income shocks in a destination country do have very important compositional effects. This result is very important for shaping immigration policy. For example, if the policy maker wishes to preserve the ethnic mix (this was one of the goals of US immigration policy from the 1920s to the mid-1960s), countermeasures will be required to offset market forces. Additionally, if the skill composition of immigrants from a particular country of origin were not affected by changes in the size of the flow, income shocks would affect the skill composition of the immigrant workforce by changing the weight of each origin country in the total stock.
[Note to Table 6, truncated: … (2) is 0.000, and the p-value for a test of the null hypothesis that the interaction coefficients in Column (1) are equal in magnitude and opposite sign is 0.135]
A similar story can be told for origin countries' GDP per capita. Although the linear effects are small and statistically insignificant, the interaction with distance is very important. Interestingly, the coefficient of this interaction is very similar (with the opposite sign) to the one for the interaction of GDP per capita of the destination country and distance; indeed, we cannot statistically reject that their magnitudes are the same (the p-value is reported in the table notes). This result, together with the small estimated coefficient for the linear term, is again suggestive of the presence of an additional effect of origin country GDP per capita on migration prospects.
Continuing with the argument of borrowing constraints, imperfect access to credit markets in poorer countries would prevent migrants from these countries from affording the migration cost, even though they would have gained from moving had they been able to borrow the resources to pay for it; if that were the case, credit market imperfections would increase the coefficient of the linear term (making it less negative), but would not affect the interaction term. Similarly, another positive direct effect of origin country GDP per capita could arise through immigration policies, if destination countries are more willing to accept immigrants from richer countries (which again would not affect the interaction term). 19
The remaining columns of Table 6 check the stability of the estimates across different versions of the same equation. In Column (2) I extend Eq. (5) by including interactions of origin and destination country GDP per capita with all other measures of distance. Results are virtually unchanged. Only interactions with colonial relationship are significant. Surprisingly, both of them have a negative sign. This result, however, may be driven by policy issues, as one would expect that (after controlling for having a common language) a past colonial relationship only affects migration through a special treatment by destination countries in terms of immigration policy. For example, a negative income shock would reduce the stock of immigrants from non-former colonies by more than the stock from former colonies, which would receive a special treatment.
In Column (3), I check the importance of including 100 % of migrant stocks by dropping grouped observations. As in Table 5, qualitative results hold, but point estimates are different. In particular, seven coefficients are statistically different from their counterparts in Column (2), and a Wald test of the hypothesis that all coefficients are equal to their counterparts in Column (2) clearly rejects it (p-value in the table notes).
As in Table 5, in Columns (4) to (6) I change the specification of fixed effects. Again, on top of origin, destination, and time fixed effects (as in Columns (1) to (3)), I introduce destination × time, origin × time, and country-pair dummies respectively. Once again, results are virtually unchanged. Table 10 in Appendix 2 reproduces the regressions in Table 6 introducing GDP per capita in logs instead of levels. Again, linear coefficients and the coefficients of the cost proxies are virtually unchanged with respect to their counterparts in Table 8 (except, as in Table 6 vs. Table 5, for the coefficient of common border, and, in this case, also Column (6)). And again, interaction terms tell a similar story to the one in Table 6: they have a similar magnitude relative to the linear term, and they are similar in size for origin and destination, with opposite sign. The interaction with colonial relationship is also significant, of a similar size relative to the linear term, and of the same sign when interacted with GDP per capita at origin and at destination. Interactions with the other variables are not statistically significant.
I also estimate the same regressions again using the data from Özden et al. (2011). Results are presented in Table 11 in Appendix 2. As in the previous section, results are somewhat different from those in Table 6. The key difference is for destination country GDP per capita, which is small and insignificant not only in the linear term, but now also in the interaction term. Results for GDP per capita at origin, in contrast, are qualitatively in line with those in Table 6, but with very different magnitudes. And, as with Table 9, results are still very different even when only observations with bilateral information in both datasets are included.
As a final robustness check, I estimate the same regressions excluding 1960, and excluding 1960 and 1970 (results are available upon request from the author). Results are robust.

Additional results for other push and pull factors
In Table 7, I extend Eq. (5) to control for other push and pull determinants of migration in more detail. Specifically, I add the unemployment rate, the age dependency ratio (population older than 65 over working-age population), and the government consumption share of GDP (pull factors), and wars, political regimes, and young population at origin (push factors). Finally, I also add two specifications that are very demanding because of the grouping of the data: I control for networks (stock of immigrants from a given origin in a given destination in the preceding census), and I estimate a regression in flows, computed as the difference in stocks. Overall, the main results from previous sections are generally stable across specifications.
Aside from income gains, individuals value their probability of finding a job in the destination country. For this reason, higher unemployment in the country of destination reduces migration. Column (1) shows this empirically by including the unemployment rate in the regression. Its effect is estimated to be negative, as expected, and very significant. Column (2) includes the age dependency ratio as a regressor. Countries with older populations are more willing to admit immigrants to increase social security revenues and sustain increasingly unbalanced pay-as-you-go systems. Additionally, an older population brings in additional work opportunities for immigrants, both in terms of elderly care services and because of lower competition in the labor market. The coefficient of this variable has the expected positive sign, although its effect is small and not statistically different from zero. In Column (3) I include government consumption as a share of GDP. Governments with more generous welfare states will spend more, and will attract more immigrants. However, larger government expenditure implies higher tax rates, and this may discourage migration. If all countries were equally efficient in their spending, the sign of the effect should depend on whether immigrants are net contributors or receivers. In that case, South-North migration should be affected positively by expenditure. However, larger expenditures in some countries may be due to lower efficiency, which might imply that everyone becomes a net contributor, making the effect unambiguously negative. Results in Table 7 suggest that the effect is negative. Column (4) includes a warfare measure for the origin country. This variable measures the share of months over the last decade that the country was involved in a war of any type. Armed conflicts displace many people who flee the tragedy. This fact is reflected in the estimates: a decade of war in an origin country increases the stock of immigrants from that country by 76 %. The political regime may also be important for migration. People may be less willing to leave a good democracy (everything else constant); moreover, in a dictatorship, they are usually not allowed to leave the country. Instead, weak central authorities (known as anocracies) may provide an encouraging environment for migration. In Column (5), I introduce the Polity IV index, which ranges from −10 (autocracy) to 10 (democracy). Intermediate values (with small absolute values) indicate the presence of an anocracy. For this reason, I include a quadratic in the indicator. The quadratic term is negative and significantly different from zero. The linear term is negative but small and clearly insignificant, indicating that migration from autocracies and from democracies is lower than from anocracies by a similar amount. In particular, the stock of migrants is around 30 % lower if the origin country is a democracy or an autocracy relative to an anocracy. Column (6) introduces the log of the young population at the origin country. Countries with larger young populations (relative to the total population) tend to send more migrants abroad. Specifically, holding total population constant, an extra 1 % of population of those ages increases migration by 4.6 %. Column (7) introduces all push and pull factors together without any significant change.
[Note to Table 7, truncated: … (9), which is log flows. Unit of observation: origin-destination-time. Regressions include the indicated fixed effects]
The remaining two columns estimate, respectively, a regression that includes the lagged bilateral stock as a control, and one that uses log flows as the dependent variable. The fundamental problem in estimating these equations is that, given that the grouping affects each census differently, observations need to be artificially grouped further so that groups coincide over two consecutive censuses. This reduces the number of observations substantially, and increases the incidence of grouping. Despite that, results in Column (8) are very similar to the estimates presented above. Additionally, an extra 1 % in the stock of migrants in a country-pair in the preceding decade is associated with a 0.4 % extra stock of immigrants in the given census. Column (9), which is estimated with only 1,765 observations, delivers results that are qualitatively (and, with exceptions, quantitatively) similar to previous specifications, even though precision is affected substantially.
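The further grouping needed to make categories coincide across two consecutive censuses can be sketched as a connected-components problem: two origins must end up in the same harmonized group whenever they share a category in either census. The union-find implementation below is one possible way to do this and is not taken from the paper; the function names, data layout (a mapping from frozensets of origin codes to stocks) and numbers are hypothetical, and it assumes both censuses cover the same set of origins for a given destination.

```python
def common_grouping(groups_a, groups_b):
    """Smallest merged groups such that both censuses share the same categories."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(x, y):
        root_x, root_y = find(x), find(y)
        if root_x != root_y:
            parent[root_y] = root_x

    # Link all origins that share a category in either census.
    for group in list(groups_a) + list(groups_b):
        members = sorted(group)
        find(members[0])
        for other in members[1:]:
            union(members[0], other)

    components = {}
    for origin in list(parent):
        components.setdefault(find(origin), set()).add(origin)
    return sorted(sorted(c) for c in components.values())


def harmonized_flows(stocks_prev, stocks_curr):
    """Change in stock between two censuses for each harmonized group."""
    flows = {}
    for members in common_grouping(stocks_prev, stocks_curr):
        member_set = set(members)
        prev = sum(v for g, v in stocks_prev.items() if g <= member_set)
        curr = sum(v for g, v in stocks_curr.items() if g <= member_set)
        flows[tuple(members)] = curr - prev
    return flows


# Hypothetical stocks for one destination in two consecutive censuses.
census_prev = {frozenset({"MEX"}): 100, frozenset({"RUS", "UKR"}): 50,
               frozenset({"IRL"}): 20, frozenset({"GBR"}): 30}
census_curr = {frozenset({"MEX"}): 150, frozenset({"RUS"}): 40,
               frozenset({"UKR"}): 25, frozenset({"IRL", "GBR"}): 60}
flows = harmonized_flows(census_prev, census_curr)
```

In the toy example, four categories per census collapse into three harmonized groups, which illustrates why this step reduces the number of observations and increases the incidence of grouping.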

Conclusions
In this paper I present a new database of bilateral migrant stocks, and I provide new evidence on the determinants of bilateral migration. The database introduced in this paper was collected from the National Statistical Offices from 24 OECD countries based on population censuses. For each destination country and census date, it covers 188 countries of origin (sometimes in a grouped category) for the period 1960 to 2000. The database fully covers the total stock of immigrants, keeping track of the residual categories reported by Statistical Offices instead of making imputations to specific countries of origin. I handle these grouped data in a raw manner in the estimation.

Empirically, I test for the existence of non-linear effects of income gains on migration prospects depending on distance. The motivation for such heterogeneity can be cost-based (individuals from closer countries can move back and forth as a consequence of income fluctuations, whereas doing so is more costly for individuals from countries farther away), or it can operate through a compensating wage differential (individuals dislike living far away from home, and require a compensating wage differential to move, which would increase with distance). Results suggest that this heterogeneity is indeed very marked. For example, a 1000$ increase in US income per capita would increase the stock of Mexican immigrants in the US by 8 %, the stock of Irish immigrants by 5.2 %, and the stock of Chinese-born by only 3.1 %. This result is very robust across many different specifications.
Empirical findings in this paper suggest that income shocks have significant compositional effects, which are important for shaping immigration policy. For example, if a policy maker wishes to preserve the ethnic mix (this was one of the goals of US immigration policy from the 1920s to the mid-1960s), countermeasures will be required to offset market forces. If country of origin is a good proxy for skills of immigrants, this result would also have implications for the skill composition of migrants. Additionally, destination countries should be more concerned about income shocks in neighboring countries than the literature suggests, and may want to trade off development assistance and migration policies as a result.
A few remarks need to be made on the conclusions of this paper. The first one concerns the grouping of the data. There are 1,800 country pairs (out of 24 × 188 = 4512) for which I observe data in grouped categories for all years (which only allows me to identify a fixed effect for each group). Also, for a similar reason, there are 160 origin country × time dummies that cannot be individually identified (out of 188 × 5 = 940). Data grouping also complicates the incorporation of the role of networks in determining international migration, and the estimation of models in flows (as shown in Table 7). A second remark is that the database does not include information on educational attainment of immigrants. Such information would be useful to test whether the compositional effects that I observe with respect to nationality have important implications for the skill composition of immigrants. To the best of my knowledge, Docquier and Marfouk (2006), Docquier et al. (2009), and Brücker et al. (2013) are the only databases in the literature that include such information, but they only cover a shorter period. Third, the regressions estimated in this paper abstract from the role of trade. Part of migration flows can be equilibrium adjustments to trade (as in di Giovanni et al. 2014). And fourth, the results in this paper can be seen as an additional explanation to those covered in Clemens (2014) as to why the elasticity of migration with respect to GDP per capita at origin is not homogeneous across countries (he finds evidence of an inverse U-shape).
The paper also opens avenues for future research. It would be interesting to investigate how the heterogeneous effects found in this paper affect skill composition and self-selection of migrants. Likewise, the database presented in this paper can be used for a variety of cross-country migration analyses (e.g., to produce instrumental variables as in Llull 2011 or Ortega and Peri 2014a).
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.