1 Introduction

Countries have followed divergent paths of economic development since European colonization. Some former colonies, such as the Congo, Guinea-Bissau, Malawi, and Tanzania, have experienced little economic development over the last few centuries, with current per capita Gross Domestic Product (GDP) of about $2 per day. Others are among the richest countries in the world today, including Australia, Canada, and the United States, with per capita GDP levels of about $140 per day. Others fall along the spectrum between these extremes.

To explain these divergent paths, many researchers emphasize that the European share of the population during colonization shaped national rates of economic growth through several mechanisms. Engerman and Sokoloff (1997) (ES) and Acemoglu et al. (2001, 2002) (AJR) stress that European colonization had enduring effects on political institutions. They argue that when Europeans encountered natural resources with lucrative international markets and did not find the lands, climate, and disease environment suitable for large-scale settlement, only a few Europeans settled and created authoritarian political institutions to extract resources. The institutions created by Europeans in these “extractive colonies” impeded long-run development. But, when Europeans found land, climate, and disease environments that were suitable for smaller-scale agriculture, they settled, forming “settler colonies” with political institutions that fostered development. This perspective has two testable implications: (1) former settler colonies with a large proportion of Europeans during colonization will create more inclusive political institutions that foster greater economic development than former extractive colonies with a small proportion of Europeans, and (2) colonial European settlement will have a stronger association with development today than current European settlement (the proportion of the population that is of European descent today) because of the enduring effect of institutions created during the colonial period.

As an additional, potentially complementary mechanism, ES and Glaeser et al. (2004) (GLLS) note that the European share of the population during colonization influenced the rate of human capital accumulation. They argue that Europeans brought human capital and human capital creating institutions that shape long-run economic growth, as emphasized by Galor (2011). According to this human capital view, European settlers directly and immediately added human capital skills to the colonies and also had long-run effects on human capital accumulation.

These long-run effects emerge because human capital disseminates throughout the population over generations and it takes time to create, expand, and improve schools. Furthermore, this human capital view suggests that a larger share of Europeans during colonization could facilitate human capital accumulation across the entire population both because it would increase interactions among people of European and non-European descent and because it might accelerate expanded access to schools, as emphasized in ES. This human capital view also yields two testable implications: (1) the proportion of Europeans during colonization will be positively related to human capital development and hence economic development today, and (2) the proportion of Europeans during colonization will matter more for economic development than the proportion of the population of European descent today because of the slow dissemination of human capital and creation of well-functioning schools. Although the political institutions and human capital views emphasize different mechanisms, they provide closely aligned predictions about the impact of colonial European settlement on current economic development.

Other researchers, either explicitly or implicitly, highlight additional mechanisms through which European migration had positive or negative effects on development. North (1990) argues that the British brought comparatively strong political and legal institutions that were more conducive to economic development than the institutions brought by other European nations. This view stresses the need for a sufficiently strong European presence to instill those institutions, but does not necessarily suggest that the proportion of Europeans during colonization will affect economic development today beyond some initial threshold level. More recently, Spolaore and Wacziarg (2009) stress that the degree to which the genetic heritage of a colonial population was similar to that of the economies at the technological frontier positively affected the diffusion of technology and thus economic development, where European migration materially affected the genetic composition of economies. Putterman and Weil (2010) and Chanda et al. (2014) emphasize that the experiences with statehood and agriculture of the ancestors of people currently living within countries help explain cross-country differences in economic success. And, Comin et al. (2010) likewise find that the ancient technologies of the ancestors of populations today help predict per capita income of those populations. In all of these papers, the ancestral nature of a population helps account for cross-country differences in economic development, where European colonization materially shaped the composition of national populations.Footnote 1 As in the human capital view, the emphasis is on things that Europeans brought with them, such as technology.

Although this considerable body of research emphasizes the role of European settlement during colonization in shaping subsequent rates of economic development, what has been missing in the empirical literature is the key intermediating variable: colonial European settlement. While researchers, including AJR, have examined the European share of the population in 1900, this is well after the colonial period in several countries, including virtually all of the Western Hemisphere. To the best of our knowledge, researchers have not directly measured colonial European settlement and examined its association with current economic development.

In this paper, we construct a new database on the European share of the population during colonization and use it to examine the historical determinants of colonial European settlement and the relation between colonial European settlement and current economic development. Although we do not isolate the specific mechanisms linking colonial European settlement with current levels of economic development as emphasized in each of the individual theories discussed above, we do assess the core empirical predictions emerging from the literature on the relationship between European settlement and economic development. In particular, we assess whether the proportion of Europeans during colonization is positively related to economic development today and whether the proportion of Europeans during colonization is more important in accounting for cross-country differences in current economic development than the proportion of the population of European descent today.

We begin by compiling a new database on the European share of the population during colonization. For each country, we gather data from an assortment of primary and secondary sources for as many years as possible going back to the sixteenth century. From these data, we construct several measures of European settlement that differ with respect to the date used to measure colonial European settlement. We first construct a country-specific measure based on the colonial history of each country. To do this, we use information on when Europeans first arrived in the country and when the colony became independent to select a date on which to measure colonial European settlement. For this country-specific measure, we seek a date, subject to data limitations, that is early in its colonial period but sufficiently after Europeans first arrived to allow for the formation of colonial institutions. We next construct measures that use a common date. For each country, we average the annual observations on the European share of the population over 1500–1800, 1801–1900, and 1500–1900. We obtain consistent results using these different methods for dating and measuring colonial European settlement.
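The common-date measures described above can be illustrated with a minimal sketch. It assumes a hypothetical list of (year, European share) observations for a single country; the function name `period_average` and all numbers are invented for illustration and are not taken from the database.

```python
from statistics import mean

# Hypothetical observations for one country: (year, European share in percent).
# These numbers are invented for illustration only.
observations = [(1570, 1.0), (1650, 3.0), (1790, 5.0), (1850, 9.0)]

def period_average(obs, start, end):
    """Average the available observations with start <= year <= end.

    Returns None when no observation falls inside the window.
    """
    values = [share for year, share in obs if start <= year <= end]
    return mean(values) if values else None

# The three uniform windows named in the text.
avg_1500_1800 = period_average(observations, 1500, 1800)
avg_1801_1900 = period_average(observations, 1801, 1900)
avg_1500_1900 = period_average(observations, 1500, 1900)
```

Because observations are sparse and irregular, a window average of this kind weights each recorded year equally; whether the underlying database interpolates between observations is not something this sketch assumes.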

We then examine the historical determinants of colonial European settlement both to check the credibility of our new data and to examine differing views about the factors shaping European colonization. As a guide, we employ a very simple model of the costs and benefits of European settlement. Some determinants have already been discussed in the literature, such as pre-colonial population density, latitude, and the disease environment facing Europeans. Pre-colonial population density raises the costs to Europeans of obtaining and securing land for new settlers, and might also raise the benefits since Europeans often exploited and enslaved the indigenous population. Latitude raises the benefits of simply transferring European technologies (such as for agriculture) to the newly settled areas. A harsh disease environment facing Europeans raises the expected costs of settlement.

We add to this list of common determinants of European settlement by constructing and examining an indicator of whether a large proportion of the indigenous population died from European-borne diseases during the colonization period. Indigenous mortality from European diseases is a tragic natural experiment that might help account for European settlement because it removed or weakened indigenous resistance to Europeans invading new lands, and made fertile land more readily available to settlers. The phenomenon is limited to lands that had essentially zero contact with Eurasia for thousands of years, since even a small amount of previous contact was enough to share diseases and develop some resistance to them. For example, trans-Sahara and trans-Indian Ocean contacts were enough to make Africa part of the Eurasian disease pool (McNeil 1976; Karlen 1995; Oldstone 1998). Historical studies and population figures show that only the New World (the Americas and Caribbean) and Oceania (including Australia and New Zealand) suffered large-scale indigenous mortality due to a lack of resistance to European diseases (McEvedy and Jones 1978). Thus, our measure of large-scale indigenous mortality from European-borne diseases is captured by a dummy variable for the New World and Oceania.

Our examination of the historical determinants of colonial European settlement yields three findings. First, we find that colonial European settlement tends to be smaller (as a share of total population) in areas where there was a highly concentrated population of indigenous people and where there was not large-scale indigenous mortality during colonization. This finding provides the first direct empirical support for AJR’s (2002) hypothesis that in areas with a high concentration of indigenous people, Europeans did not settle in large numbers and instead established extractive regimes. This finding is a key building block in AJR’s (2002) theory of a “reversal of fortunes,” in which formerly successful areas, i.e., areas with a high concentration of indigenous people, became comparatively poorer due to the enduring effects of extractive political regimes. Second, Europeans tended to settle in large concentrations in lands further from the equator. Third, although biogeography—a measure of the degree to which an area is conducive to the domestication of animals and plants—explains human population density before the era of European colonization (Ashraf and Galor 2011), it does not account for colonial European settlement after accounting for indigenous population density.

We next assess the two key predictions emerging from the political institutions and human capital views concerning colonial European settlement and current economic development and discover the following. First, colonial European settlement is strongly, positively associated with economic development today. This relationship holds after controlling for British legal heritage, the percentage of years the country has been independent since 1776, and the ethnic diversity of the current population. The strong, positive association between European settlement and economic development today is also robust to controlling for the mortality of the indigenous population during colonization, latitude, availability of precious metals, distance from London, ability to cultivate storable plants and domesticate animals, malaria ecology, and European mortality during colonization, as well as soil quality, access to navigable waterways, and continent dummy variables. However, the relationship between economic development today and the proportion of Europeans during colonization weakens markedly when controlling for either current educational attainment or government quality, which is consistent with the views that human capital and political institutions are intermediating mechanisms through which European settlement shaped current economic development.

Second, the European share of the population during colonization is more strongly associated with economic development today than the percentage of the population today that is of European descent. Europeans during the colonization era seem to matter more for economic development today than Europeans today. This finding is consistent with the view that Europeans brought growth-promoting characteristics—such as institutions, human capital, technology, connections with international markets, and cultural norms—that had enduring effects on economic development. This result de-emphasizes the importance of Europeans per se and instead emphasizes the impact of what Europeans brought to economies during colonization.

The estimated positive relation between colonial European settlement and current development is economically large. Based on our parameter estimates, we compute for each country the projected level of income in 2000 if colonial European settlement had been zero. We then compare this counterfactual level of current income to actual current income and compute the share of current income attributable to colonial European settlement. The data and estimates indicate that 40 % of current development outside of Europe is associated with the share of Europeans during the colonial era. Though such projections must be treated cautiously, they suggest that the European origins of economic development deserve considerable attention.
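The logic of this counterfactual can be sketched in a few lines. The sketch below assumes a log-linear specification in which log income depends linearly on Euro share with coefficient `beta`, and it aggregates across countries using population weights; the functional form, the weighting, the coefficient value, and the two example countries are all assumptions made for illustration, not the estimates reported in the paper.

```python
import math

# Hypothetical coefficient on Euro share in a log-income regression (invented).
beta = 2.0

# Hypothetical countries: (income per capita, population, Euro share as a fraction).
countries = [
    (30000.0, 10.0, 0.50),
    (2000.0, 40.0, 0.00),
]

def counterfactual_income(income, euro_share, beta):
    """Income implied by setting Euro share to zero in a log-linear model."""
    return income * math.exp(-beta * euro_share)

# Total actual income versus total income with Euro share set to zero everywhere.
actual_total = sum(inc * pop for inc, pop, _ in countries)
counterfactual_total = sum(
    counterfactual_income(inc, share, beta) * pop for inc, pop, share in countries
)

# Share of total income associated with colonial European settlement.
share_attributable = 1.0 - counterfactual_total / actual_total
```

A country with zero colonial European settlement is unchanged in the counterfactual, so the aggregate share is driven entirely by countries with positive Euro share and by their weight in total income.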

We check the robustness of the positive relation between colonial European settlement and current economic development in several ways. First, we were concerned that the relation between colonial European settlement and current development might break down when eliminating “neo-Europes,” such as Australia, Canada, New Zealand, and the United States, or other countries that had a comparatively high proportion of Europeans during colonization. Thus, we redid the analyses while omitting countries with colonial European settlement greater than 12.5 %, which represents a natural break in the data that omits the neo-Europes, Argentina, and a few small Latin American countries.

Rather than the results breaking down when the sample is restricted to countries with a small proportion of Europeans during colonization, the estimates become larger. When examining only those former colonies with colonial European settlement less than 12.5 %, we find that the estimated positive relation between current income and colonial European settlement is more than double the estimate from examining the full sample of non-European economies.

We were also concerned that the inclusion of many countries for which our data indicate zero colonial European settlement might affect the parameter estimates. Beyond statistical robustness, these countries raise a substantive question, since the previous literature has not explicitly addressed whether colonial institutions with small European settlement during colonization are better or worse than those with no settlement. Even if small colonial European settlements created worse institutions than those created in areas with no Europeans during colonization, the positive things that Europeans brought with them, such as human capital or technology, could offset the negative development effects of worse institutions. Our data allow us to provide a first evaluation of these issues and to assess the few non-European countries that escaped colonization altogether.

We address concerns about countries with “zero” Europeans during colonization in two ways. First, we omit all of these “zero” countries from the analyses and confirm the results. Second, we include a dummy variable for these “zero” countries and continue to find that colonial European settlement enters positively and significantly in the economic development regressions. We also find that the dummy variable for zero European settlement often enters with a positive coefficient, which provides some empirical support for the political institutions view that small European settlements that create extractive institutions are worse for current economic development than countries with essentially no European settlers during the colonial era. Combining the coefficient estimates on colonial European settlement and the dummy variable, the findings indicate that once colonial European settlement is above 4.8 %, any adverse effects from extractive institutions associated with small colonial European settlements were more than offset by other things that Europeans brought during colonization, such as human capital, technology, familiarity with global markets, and institutions, that had enduring, positive effects on economic development.
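One way to read this threshold is as a break-even point. The following is a sketch under an assumed specification: the symbols \(\alpha\), \(\beta\), \(\delta\), \(\gamma\), \(E_i\), and \(X_i\) are introduced here for illustration, and the exact regression in the paper may differ.

```latex
\[
\ln y_i = \alpha + \beta\, E_i + \delta\, \mathbf{1}[E_i = 0] + \gamma' X_i + \varepsilon_i,
\qquad \beta > 0,\ \delta > 0 ,
\]
where $y_i$ is current income, $E_i$ is Euro share, and $X_i$ collects the controls.
A colony with $E_i > 0$ has higher predicted income than an otherwise identical
colony with $E_i = 0$ exactly when
\[
\beta E_i > \delta
\quad\Longleftrightarrow\quad
E_i > E^{*} = \frac{\delta}{\beta}.
\]
```

Under this reading, the 4.8 % figure corresponds to \(E^{*}\), the ratio of the estimated coefficient on the zero-settlement dummy to the estimated coefficient on Euro share.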

Ample qualifications temper our conclusions. First, we do not assess the welfare implications of European colonization. Europeans often cruelly oppressed, enslaved, murdered, and even committed genocide against indigenous populations, as well as the people that they brought as slaves (see Acemoglu and Robinson 2012 for compelling examples). Thus, GDP per capita today does not measure the welfare effects of European colonization; it only provides a measure of economic activity today within a particular geographical area. Although there is no question about European oppression and cruelty, there are questions about the net effect of European colonization on economic development today. Second, we do not separately identify each potential channel through which the European share of the population during colonization shaped long-run economic development. Rather, we provide the first assessment of the relationship between colonial European settlement and comparative economic development and thereby inform debates about the sources of the divergent paths of economic development taken by countries around the world since the colonial period.

The remainder of the paper is organized as follows. Section 2 defines and discusses the data, while Sect. 3 provides preliminary evidence on the determinants of human settlement prior to European colonization and the factors shaping European settlement. Section 4 presents the paper’s core results on the relationship between colonial European settlement and current economic development. Section 5 reports an exercise in development accounting to calculate what share of global development can be attributed to Europeans. Section 6 concludes.

2 Data

This section describes the two data series that we construct: (1) the European share of the population during colonization and (2) the degree to which a region experienced large-scale indigenous mortality due to the diseases brought by European explorers in the fifteenth and sixteenth centuries. The other data that we employ are taken from readily available sources, and we define those variables when we present the analyses below.

2.1 Euro share

We compile data on the European share of the population during colonization (Euro share) from several sources. Since colonial administrators were concerned about documenting the size and composition of colonial populations, there are abundant—albeit disparate—sources of data. Of course, there was hardly anything like a modern statistical service in colonial times, so that different administrators across different colonies in different time periods used different and often undocumented methods for assembling population statistics. Thus, we use a large variety of primary and secondary sources on colonial history to piece together data on the European share of the population.

Although the Data Appendix (available online) provides detailed information on our sources, the years for which we compiled data on each country, and discussions about the quality of the data, it is worth emphasizing a few points here. First, we face the challenge of choosing a date to measure European share. We would like a date as early as possible after initial European contact to use European settlement as an initial condition affecting subsequent developments. At the same time, we do not want to pick a date that is too early after European contact since it is only after some process of conquest, disease control, and building of a rudimentary colonial infrastructure that it became possible to speak of a European settlement. Given these considerations, we try to choose a date at least a century after initial European contact, but at least 50 years before independence. This means that for conceptual reasons we do not seek to use a uniform date across all colonies. For example, Europeans were colonizing and settling Latin America long before colonizing Africa. We also lack a continuous time series for each country; rather, the data reflect dates when colonial administrators in particular locales happened to measure or estimate populations.Footnote 2 Given these data limitations, we cannot always adhere to our own guidelines for choosing the date on which to measure Euro share. In sensitivity analyses discussed below, we show that the results are robust to measuring Euro share as the average value over three different uniform periods: (i) 1500–1800, (ii) 1801–1900, or (iii) 1500–1900.
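The dating guideline can be made concrete with a small sketch. The window (at least 100 years after contact, at least 50 years before independence, as early as possible) follows the text; the fallback used when no observation falls in the window is purely an assumption for illustration, since the text only says the guideline cannot always be followed. The function name and the example years are hypothetical.

```python
def choose_measurement_year(contact, independence, observed_years):
    """Pick a year for measuring Euro share under the stated guideline.

    Guideline from the text: at least 100 years after first contact and
    at least 50 years before independence, as early as possible.
    The fallback (closest year to the window) is an assumption.
    """
    lo, hi = contact + 100, independence - 50
    in_window = [y for y in observed_years if lo <= y <= hi]
    if in_window:
        return min(in_window)

    # Assumed fallback: pick the observation that violates the guideline
    # by the fewest years (distance is zero only inside the window).
    def distance(y):
        return max(lo - y, y - hi, 0)

    return min(observed_years, key=distance)

# Hypothetical early colony: several observations fall inside the window.
early = choose_measurement_year(1500, 1820, [1570, 1650, 1744, 1800])

# Hypothetical late colony: the window is empty, so the fallback applies.
late = choose_measurement_year(1880, 1960, [1900, 1930, 1950])
```

The second case shows why a uniform rule cannot hold everywhere: for colonies with a short colonial period, no date can be both a century after contact and 50 years before independence.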

Second, we adopt a “dog did not bark” strategy for recording zero European settlement. If we find no historical sources documenting any European settlement in a particular colony, we assume that there were no such settlers. This procedure runs the risk of biasing downward European settlement. However, we believe colonial histories (which are virtually all written by European historians) are extremely unlikely to fail to mention significant European settlements. We checked and confirmed the validity of this procedure of setting colonial European settlement to zero using the Acemoglu et al. (2001) Online Data Appendix, which gives the share of Europeans in the population in 1900. Furthermore, as presented below, the results hold when eliminating all countries with zero colonial European settlement or when including a fixed effect for these countries in the regression analyses.

2.2 Indigenous mortality

We examine several predetermined factors that potentially influenced European settlement—including the degree to which Europeans brought diseases that wiped out the indigenous population. Others have carefully documented this tragic experience, but we believe that we are the first to use it to explain the nature of colonization and its effect on subsequent economic development.

Although Europeans established at least a minimal level of contact with virtually all populations in the world during the colonial period, this contact had truly devastating effects on indigenous populations in some regions of the world but not in others. Some regions had been completely isolated from Eurasia for thousands of years, and thus had no previous exposure or resistance to Eurasian diseases. When Europeans then made contact with these populations (which typically occurred during the initial stages of global European exploration and hence long before anything resembling “European settlements”), European diseases such as smallpox and measles spread quickly through and decimated the indigenous population. For example, when the Pilgrims arrived in New England in 1620, they found the indigenous population already very sparse because European fishermen had occasionally landed along the coast of New England in the previous decades. Similarly, De Soto’s expedition through the American South in 1542 spread smallpox and wiped out large numbers of indigenous people long before British settlers arrived.

Thus, we construct a dummy variable, Indigenous mortality, which equals one when a region experienced large-scale indigenous mortality due to the spread of European diseases during the initial stages of European exploration. To identify where Europeans brought diseases that caused widespread fatalities, we use the population data of McEvedy and Jones (1978) and three epidemiological world histories (McNeil 1976; Karlen 1995; Oldstone 1998). Diseases had circulated enough across Eurasia, Africa and the sub-continent, so that indigenous mortality did not shoot up with increased exposure to European explorers, traders, and slavers during European colonization. The New World (Americas and Caribbean) and Oceania (the Pacific Islands, Australia, and New Zealand) were different. When European explorers and traders arrived, the microbes that they brought triggered extremely high mortality rates, which accords with their previous isolation from European diseases. The evidence suggests that mortality rates of 90 % of the indigenous population after European contact were not unusual.

Although we compiled a country-by-country indicator of whether the country experienced large-scale indigenous mortality during colonization, our data indicate little measurable variation within the New World and Oceania, so that Indigenous mortality wound up being a simple dummy for countries in the New World and Oceania. (As discussed below, our findings on the associations of current development and colonial European settlement are robust to including continent fixed effects.) This dummy variable measure suggests caution in interpreting the results on Indigenous mortality. Although the data indicate that large-scale indigenous mortality occurred in the New World and Oceania but not elsewhere (McEvedy and Jones 1978; McNeil 1976; Karlen 1995; Oldstone 1998), Indigenous mortality is ultimately a dummy variable for these regions of the world and might proxy for other features of these regions, such as geographic isolation, rather than European-induced mortality. These same areas that were isolated from Europeans prior to colonization—and hence more susceptible to European-borne diseases—also had lower population densities in 1500 AD. This may be related to Spolaore and Wacziarg’s (2009) result on diffusion of technology as a function of when different branches of humanity became separated. Populations in Oceania and the Western Hemisphere had been isolated from the rest for a very long time, and hence they did not get either (1) the more advanced technology originating in the Old World that would have helped support a larger population or (2) the exposure to European diseases before colonization that would have helped them become more resistant to European diseases and hence to European settlement. We will see that this combination of low indigenous population density and vulnerability to European diseases plays a large role in accounting for where Europeans settled.

3 Preliminaries: where did Europeans settle?

Table 2 provides regression results concerning which factors shaped European settlement during colonization. The dependent variable is the proportion of Europeans in the colonial population (Euro share).

The regressors are as follows. First, we include Population density 1500. Note that this is a measure of the pre-Columbian population for the New World (i.e., before 1492) even though the date is conventionally rounded off to 1500. Hence, this number does NOT include any initial population decrease due to indigenous mortality from European-borne diseases. Since the regressions control for the attractiveness of the land for settlement, a plausible interpretation of the impact of Population density 1500 on Euro share is that it gauges the ability of the indigenous population to resist European settlement.Footnote 3

Second, we include Indigenous mortality, which we designed to provide additional information on the ability of the indigenous population to resist European settlers. If European diseases eliminated much of the indigenous population, this would reduce their ability to oppose European settlement. However, since Indigenous mortality in practice equals a dummy variable for the New World and Oceania, we cannot separately identify the impact of Indigenous mortality and continent fixed effects on Euro share.

Third, Latitude might have special relevance for European settlers to the extent that they are attracted to lands with the same temperate climate as in Europe. Latitude measures the absolute value of the distance of the colony from the equator.

Fourth, Precious Metals is an indicator of whether the region has valuable minerals since this might have affected European settlement. Fifth, one cost of settling in a particular country might be its distance from Europe, so we use the distance from London to assess this view (London). Finally, we examine other possible determinants of the attractiveness of the land for settlement, including Biogeography, Malaria ecology, and Settler mortality. Biogeography is an index of the prehistoric (about 12,000 years ago) availability of storable crops and domesticable animals, where large values signify more mammalian herbivores and omnivores weighing more than 45 kg and more storable annual or perennial wild grasses, which are the ancestors of staple cereals (e.g., wheat, rice, corn, and barley).Footnote 4 Malaria ecology is an ecologically-based spatial index of the stability of malaria transmission in a region, where larger values signify a greater propensity for malaria transmission.Footnote 5 Settler mortality equals historical deaths per annum per 1000 European settlers (generally soldiers, or bishops in Latin America) and is taken from AJR (2001).
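The structure of such a regression can be sketched with ordinary least squares on toy data. Everything numeric below is invented: the six rows, the coefficients used to generate the outcome, and the restriction to three regressors are assumptions for illustration only, and the sketch does not compute standard errors or reproduce Table 2. The helper `ols` is a hypothetical name for a plain normal-equations solver.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    X is a list of rows (include a leading 1.0 for the intercept).
    Solved with Gauss-Jordan elimination and partial pivoting.
    """
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        # Eliminate this column from every other row.
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Invented rows: (Population density 1500, Indigenous mortality dummy, Latitude).
rows = [
    (0.5, 1, 0.40),
    (3.0, 0, 0.10),
    (1.0, 1, 0.05),
    (2.0, 0, 0.30),
    (0.2, 0, 0.35),
    (2.5, 1, 0.15),
]

# Invented coefficients (intercept, density, mortality, latitude) used only to
# generate the toy outcome: density lowers Euro share, the other two raise it.
true_coef = [5.0, -2.0, 10.0, 20.0]

X = [[1.0, d, m, lat] for d, m, lat in rows]
y = [sum(c * x for c, x in zip(true_coef, row)) for row in X]

coef = ols(X, y)  # recovers true_coef up to floating-point error
```

Because the toy outcome is generated without noise, the fit is exact; with real data the same design matrix would yield estimates with sampling error, which is why the significance of each regressor matters in the discussion that follows.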

The results show that three factors account for the bulk of cross-country variation in European settlement. First, the density of the indigenous population matters. In regions with a high concentration of indigenous people who could resist European occupation, Europeans comprised a much smaller fraction of the colonial population than in other lands. Second, in countries where the indigenous population fell drastically because of European diseases, i.e., in the New World and Oceania, Europeans were more likely to settle. Third, there is a positive relationship between Euro share and Latitude, even when conditioning on Population density 1500 and Indigenous mortality. Europeans were a larger proportion of the colonial population in higher (more temperate) latitudes, plausibly because of the similarity with the climate conditions in their home region.Footnote 6, Footnote 7

These three variables, Population density 1500, Indigenous mortality, and Latitude, help explain in a simple way the big picture associated with European settlements, or the lack thereof, in regions around the world. Where all three factors were favorable for European settlement, such as Australia, Canada, New Zealand, and the United States, the European share of the colonial population was very high. When only some of the three factors were favorable, there tended to be a small share of European settlers. Latin America suffered large-scale indigenous mortality, but only some regions were temperate, and most regions had relatively high pre-Columbian population density (which is why more people of indigenous origin survived in Latin America compared to North America, even though both regions experienced high indigenous mortality rates when exposed to European diseases). Southern Africa was temperate and had low population density, but did not experience large-scale indigenous mortality. These factors can also explain where Europeans did not settle. The rest of sub-Saharan Africa was tropical and again did not experience much indigenous mortality from exposure to the microbes brought by Europeans during colonization. And, most of Asia had high population density, did not suffer much indigenous mortality from European-borne diseases, and is in or near the tropics, all of which combine to explain the low values of Euro share across much of Asia.

None of the other possible determinants that we consider is significant after controlling for these three determinants. Indeed, European colonial settlement, unlike pre-Columbian population (the latter as verified by Ashraf and Galor (2011, 2013)), was not associated with the intrinsic, long-run potential of the land, as measured by Biogeography.

One of the most famous variables in the literature on explaining European settlement is Settler mortality. Our data on colonial settlement allows for the first assessment of the ability of this variable to explain European settlement during colonization. The results are mixed. Settler mortality has a negative and significant simple correlation with colonial European settlement (not shown), confirming the prediction in AJR. It becomes insignificant when including the three variables that we found most robust in accounting for colonial European settlement, and does not materially alter the statistical significance of the other variables. Yet, when we include all RHS variables simultaneously (in column 8 of Table 2), Settler mortality returns to significance. In sum, the relation between Euro share and Settler mortality is highly sensitive to changes in the sample and the control variables.

4 Results: Europeans during colonization and current economic development

4.1 Simple graphical analyses

To assess the relationship between the European share of the population during colonization and the current level of economic development, we begin with simple graphs. We measure the current level of economic development as the average of the log of real per capita GDP over the decade from 1995 to 2005 (Current income). Using data averaged over a decade reduces the influences of business cycle fluctuations on the measure of current economic development.

Fig. 1 Distribution of colonial European settlement and median current income. This figure shows the number of countries classified in groups according to their European shares at colonization (left axis). The median current income (in logs) for each group is also reported (right axis).

Figure 1 shows (1) the number of countries with values of Euro share within particular ranges, (2) the actual countries with these values of Euro share, and (3) the corresponding median level of Current income for countries with values of Euro share within the particular ranges. Two key patterns emerge. First, median Current income is positively associated with Euro share. Second, very few countries have Euro share greater than 0.125. While ES and AJR do not provide an empirical definition of a “settler colony,” we use 12.5 % as a useful benchmark.

Figure 2a and b illustrate the relationship between Current income and Euro share using Lowess, which is a nonparametric regression method that fits simple models to localized subsets of the data and then smooths these localized estimates into the curves provided in Fig. 2a and b. Figure 2a illustrates the relationship for the full sample of non-European countries. Figure 2b provides the curve for the sub-sample of countries with measured values of Euro share less than 12.5 %. Figure 2c omits zero observations from Fig. 2b. As shown, the relationship between Euro share and Current income is positive throughout. There is no apparent region in which an increase in Euro share is associated with a reduction in Current income, although 2c shows a steeper relationship at very low values of Euro share than 2b. We examine this more formally below.
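The local-fitting idea behind Lowess can be sketched in a few lines of Python. This is a simplified illustration, not the routine used to produce Fig. 2: it fits one tricube-weighted linear regression around each observation and omits the robustness iterations of the full procedure. The function names, the `frac` value, and the data are all hypothetical.

```python
def tricube(u):
    """Tricube kernel: weight for a scaled distance u, zero once u >= 1."""
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def lowess(xs, ys, frac=0.8):
    """Smoothed value at each x from a locally weighted linear fit.

    frac is the share of observations used in each local fit; 0.8 is an
    assumed default here, not a setting reported in the paper.
    """
    n = len(xs)
    k = max(2, min(n, round(frac * n)))  # points in each local window
    smoothed = []
    for x0 in xs:
        # Bandwidth: distance from x0 to its k-th nearest observation.
        h = sorted(abs(x - x0) for x in xs)[k - 1] or 1e-12
        w = [tricube(abs(x - x0) / h) for x in xs]
        sw = sum(w)
        xbar = sum(wi * x for wi, x in zip(w, xs)) / sw
        ybar = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - xbar) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - xbar) * (y - ybar) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 1e-12 else 0.0  # flat fit if x does not vary
        smoothed.append(ybar + slope * (x0 - xbar))
    return smoothed


# Hypothetical (Euro share, Current income) pairs, for illustration only.
euro_share = [0.0, 0.02, 0.05, 0.08, 0.10, 0.30]
income = [1.0 + 2.0 * s for s in euro_share]  # exactly linear toy data
curve = lowess(euro_share, income)
```

Because the toy data lie exactly on a line, each local fit recovers that line; with real data the curve bends wherever the local slope changes, which is what allows the figures to reveal any region of negative association.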

4.2 Euro share and economic development today

In this section, we use regressions to condition on a range of national characteristics and assess the independent relationship between Current income and Euro share, where we use subscript i to represent an individual country.

We consider the following cross-country regression:

$$\begin{aligned} Current\,\,income_{i}=\alpha + \beta \, Euro\,\, share_{i}+\gamma ^{\prime }{\varvec{X}}_{\varvec{i}}+u_{i}, \end{aligned}$$
(1)

where \({\varvec{X}}_{\varvec{i}}\) is a vector of the characteristics of country i that we define below and \(u_{i}\) is an error term, reflecting economic growth factors that are idiosyncratic to country i, as well as omitted variables and mis-specification of the functional form. Different theories provide distinct predictions about (a) the coefficient on the share of Europeans in country i \((\beta )\), (b) whether \(\beta \) changes when conditioning on particular national characteristics, and (c) how \(\beta \) changes across sub-samples of countries.

We get some insight into the channels connecting Euro share and Current income by examining how \(\beta \) changes when controlling for the different potential channels discussed above: political institutions and human capital. If Euro share is related to current levels of economic development through the formation of enduring political institutions, then Euro share may not enjoy an association with economic development today conditioning on political institutions. And, if Euro share is related to economic development today through the spread of human capital, then Euro share may not have an association with development today conditioning on educational attainment today. Of course, both current political institutions and educational attainment are endogenous to current economic development, so these findings must be interpreted cautiously.
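The logic of watching \(\beta \) change when a channel variable is added can be illustrated with a small Python sketch. It is not the paper's estimation code: it uses the Frisch-Waugh-Lovell result to obtain the coefficient on one regressor conditional on a single control, and all names and numbers are hypothetical. In the toy data, income depends only on the channel variable, so the coefficient on Euro share vanishes once that channel is controlled for.

```python
def mean(v):
    return sum(v) / len(v)


def slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    return sxy / sxx


def residuals(y, z):
    """Residuals from regressing y on z (with an intercept)."""
    b = slope(z, y)
    a = mean(y) - b * mean(z)
    return [yi - a - b * zi for yi, zi in zip(y, z)]


def conditional_slope(x, y, z):
    """Coefficient on x when y is regressed on x and z (Frisch-Waugh-Lovell)."""
    return slope(residuals(x, z), residuals(y, z))


# Hypothetical data: income depends only on the channel variable z,
# and Euro share x is correlated with z.
z = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]        # toy channel, e.g. an education index
x = [0.01, 0.09, 0.22, 0.28, 0.41, 0.49]  # toy "Euro share" values
y = [1.0 + 2.0 * zi for zi in z]          # toy "Current income"

beta_unconditional = slope(x, y)               # large and positive
beta_conditional = conditional_slope(x, y, z)  # essentially zero
```

A drop of this kind is consistent with the channel interpretation but does not prove it, for the endogeneity reasons noted above.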

Fig. 2 Current income and colonial European settlement. These three figures plot Current income (measured by average log of GDP per capita from 1995 to 2005) against Euro share (the proportion of Europeans in the colonial population). The figures also include the locally weighted scatterplot smoothing curve (lowess), in which simple regressions are fitted to localized subsets of data to produce the nonlinear curve. Figure 2a uses the full sample; Fig. 2b uses the sample of countries with Euro share \(<\) 0.125, and Fig. 2c uses the sample of countries with 0 \(<\) Euro share \(<\) 0.125.

We begin by evaluating Eq. (1) while conditioning on an array of national characteristics \(({\varvec{X}})\). Legal origin is a dummy variable that equals one if the country has a common law (British) legal tradition. This dummy variable captures both the argument by North (1990) that the United Kingdom instilled better growth-promoting institutions than other European powers and the view advanced by La Porta et al. (1998) that the British legal tradition was more conducive to the development of growth-enhancing financial systems than other legal origins, such as the Napoleonic Code passed on by French and other European colonizers. Further differentiating across different civil law traditions, as in La Porta et al. (1998), does not alter the results. Education equals the average gross rate of secondary school enrollment from 1995 to 2005 and is taken from the World Development Indicators. Independence equals the fraction of years since 1776 that a country has been independent. As in Beck et al. (2003) and Easterly and Levine (2003), we use this to measure the degree to which a country has had the time to develop its own economic institutions. It could also be interpreted as a measure of the duration (and hence, perhaps, intensity) of recent colonialism across countries. Government quality is an index of the current level of government accountability and effectiveness and is taken from Kaufman et al. (2002). Ethnicity is from Easterly and Levine (1997) and measures each country’s degree of ethnic diversity. In particular, it measures the probability that two randomly selected individuals from a country are from different ethnolinguistic groups. Since the purpose of our research is to examine the impact of European settlement outside of Europe, all of the regressions exclude European countries.
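For intuition on the Ethnicity measure, the probability that two randomly selected individuals belong to different groups can be computed as one minus the sum of squared group shares. The sketch below assumes this standard fractionalization formula with draws made with replacement; the function name and the example shares are hypothetical and are not taken from the underlying data.

```python
def fractionalization(group_shares):
    """Probability that two random draws (with replacement) are from different groups.

    group_shares: population shares or counts for each ethnolinguistic group.
    Inputs are normalized, so counts work as well as shares.
    """
    total = sum(group_shares)
    return 1.0 - sum((s / total) ** 2 for s in group_shares)


# Hypothetical examples, for illustration only.
homogeneous = fractionalization([1.0])              # a single group
two_equal_groups = fractionalization([0.5, 0.5])    # two equally sized groups
four_equal_groups = fractionalization([1, 1, 1, 1]) # counts are also accepted
```

The index is zero for a fully homogeneous population and approaches one as the population splits into many small groups.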

Table 1 Descriptive statistics
Table 2 Determinants of colonial European settlement
Table 3 Current income and colonial European settlement

Using ordinary least squares (OLS), Table 3 shows that there is—with two notable exceptions—a positive and statistically significant relation between Current income and Euro share. For example, regression (1) indicates that an increase in Euro share of 0.1 (where the mean value of Euro share is 0.07 and the standard deviation is 0.17) is associated with an increase in Current income of 0.36 (where the mean value of Current income is 8.2 and the standard deviation is 1.3). Table 1 lists descriptive statistics for the main variables used in the analyses. Below, we provide more detailed illustrations of the magnitude of the relation between the European share of the population during colonization and the current level of economic development. The strong positive link between the European share of the population during colonization and current economic development holds when conditioning on different national characteristics. Indeed, when simultaneously conditioning on Legal origin, Independence, and Ethnicity, the results hold.
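The arithmetic behind this example can be laid out explicitly, assuming the regression (1) coefficient of 3.623 that is reported in Section 5. Because Current income is measured in logs, the change of 0.36 is in log points; the conversion to a proportional change below is our own illustrative gloss rather than a figure stated in the tables.

$$\begin{aligned} \Delta \, Current\,\,income&=\beta \times \Delta \, Euro\,\,share = 3.623 \times 0.1 \approx 0.36, \\ e^{0.36}&\approx 1.43. \end{aligned}$$

Under these assumptions, an increase in Euro share of 0.1 is associated with per capita GDP that is roughly 43 % higher, or about 0.28 standard deviations of Current income (0.36/1.3).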

Table 4 Current income and colonial European settlement, Euro share \(<\) 12.5 %
Fig. 3 Colonial European share and European share today. This figure shows a simple scatter plot comparing the proportion of Europeans in the colonial population with the same proportion in 2000.

The exceptions are that the coefficient on Euro share falls materially and becomes insignificant when conditioning on either Education or Government quality. These findings are consistent with—though by no means a definitive demonstration of—the view that the share of Europeans in the population during colonization shaped long-run economic development by affecting political institutions and human capital accumulation.

These results could be driven by a few former colonies in which Europeans were a large fraction of the population during colonization and that just happen to be well-developed former colonies today. Thus, we conduct the analyses for a sample of countries in which Euro share was less than 12.5 %. The goal of restricting the sample to only those countries where Europeans accounted for a small proportion of the population is to assess whether the relation between Euro share and Current income holds when there is only a small minority of Europeans. While there is no formal definition of what constitutes a “small European colony” that is conducive to extractive institutions, we use less than 12.5 % European as a conservative benchmark of a small colonial European settlement and because there is a natural break in the distribution of Euro share across countries at this level.

Table 5 Current income and colonial European settlement, sensitivity analyses

As shown in Table 4, however, the coefficient on Euro share actually becomes larger when restricting the sample to those countries in which Euro share is less than 12.5 % (in the regressions that do not condition on either Education or Government quality). The increase in the coefficient on Euro share when restricting the sample to former colonies with Euro share less than 12.5 % suggests that the relationship between the European share of the population during colonization and the level of economic development does not simply represent the economic success of “settler colonies.” Rather, a marginal increase in Euro share is associated with a bigger increase in subsequent economic development in colonies with only a few Europeans; one might characterize this as the diminishing marginal long-run development product of Euro share.Footnote 8 Table 4 also shows that the relationship between Current income and Euro share remains sensitive to controlling for political institutions and human capital accumulation. The association between Current income and Euro share shrinks and becomes insignificant when conditioning on Education or Government quality.

The coefficient on the British legal origin dummy variable is never significant (nor will it be in the rest of the paper). It is also of interest that many of the colonies with Euro share \(<\) 0.125 were Spanish colonies. Hence we find no evidence for the popular view that British colonization or legal origin led to more development than Spanish colonization or legal origin.

As robustness tests, we next expand the conditioning information set \(({\varvec{X}})\) and report the results in Table 5. In particular, we repeat the analyses in Tables 3 and 4 except that in all of the regressions we include the control variables from Table 2 (Indigenous mortality, Latitude, Precious metals, London, Biogeography, Malaria ecology, and Settler mortality) and continent fixed effects. We use the UN coding and definitions of populated continents: Europe, Africa, Asia, the Americas, and Oceania. The UN coding of continents seems to us the least susceptible to later, possibly endogenous, splits of regions such as North Africa and Sub-Saharan Africa, or North and South America.Footnote 9 Since our sample excludes European countries, we do not include a dummy variable for the continent of Europe, and since Indigenous mortality is the summation of the Americas and Oceania dummy variables, it drops from the analyses.

Although expanding the conditioning information set reduces the sample by about 50 %, the results hold. In no case does expanding the conditioning information set cause the coefficient on Euro share to become statistically insignificant compared to the results reported and already discussed in the Table 3 and 4 analyses.Footnote 10

4.3 Is it Europeans during colonization or Europeans today?

If Euro share proxies for the proportion of the population today that is of European descent, then it would be inappropriate to interpret the results in Tables 3 and 4 on Euro share as reflecting the enduring impact of Europeans during the colonization period on economic development today. Indeed, Fig. 3 shows that there is a positive association between colonial Euro share and European share in 2000, which we call Euro 2000 P-W and is taken from Putterman and Weil (2010). To assess the strength of the independent relationship between the level of economic development today and the European share of the population during the colonial era, we therefore control for the proportion of the population today that is of European descent.

Table 6 Current income, colonial European settlement, and current European descendants

In Tables 6 and 7, we find that all of the results on the positive relationship between Current income and Euro share hold even when controlling for the current proportion of the population of European descent. Tables 6 and 7 are the same as Tables 3 and 4 except that the regressions also condition on the proportion of the population of European descent today. As shown, across the different regression specifications, samples, and control variables, the earlier results hold.

Figure 3 helps in understanding that the proportion of Europeans during the colonization period is more strongly associated with current economic development than the proportion of the population today that is of European descent. Examining the scatter plot in Fig. 3, consider three groups of countries: (1) countries in which Euro share was high both in colonial times and today (e.g., North America), (2) countries in which Euro share was low both in colonial times and today (e.g., South Africa), and (3) countries in which Euro share today is much higher than it was in colonial times (e.g., some Central and South American countries). If colonial Euro share did not have an independent link with incomes today, then we would expect group (3)’s income to be more like group (1)’s income. But this is not what we find. In contrast, if colonial Euro share does matter independently for income today, then we would expect group (3) to have lower income than group (1) and similar income to group (2). This is what we observe. The proportion of Europeans during the colonization period is independently associated with economic development today.

Table 7 Current income, colonial European settlement, and current European descendants, Euro share \(<\) 12.5 %

In Table 8, we add the same conditioning variables to the Tables 6 and 7 regressions that were used in Table 5. That is, in Table 8, we not only control for the proportion of the population of European descent in 2000, we also include the continent fixed effects as well as the control variables from Table 2: Latitude, Precious metals, London, Biogeography, Malaria ecology, and Settler mortality.

As shown, the results are even stronger when controlling for the continent dummy variables and variables used to assess the determinants of Euro share. When examining the full sample of countries in Panel 1 of Table 8 or the subset of countries with Euro share less than 12.5 %, we find that Euro 2000 P-W never enters significantly. However, Euro share enters positively and significantly, with coefficient estimates similar to those reported in Table 2. Thus, the core results on the relationship between Current income and the European share of the population during the colonization period hold after materially expanding the set of control variables.Footnote 11

Table 8 Current income, colonial European settlement, and current European descendants, sensitivity analyses

4.4 Explaining the reversal of fortune

In a widely cited article, Acemoglu et al. (2002, p. 1231) document a reversal of fortune: “Among countries colonized by European powers during the past 500 years, those that were relatively rich in 1500 are now relatively poor.” They proxy for the degree to which a country was relatively rich in 1500 with two indicators: urbanization rates and population density. Using both indicators, they find a strong negative correlation between the economic success of a region before colonization and current levels of per capita income.

Acemoglu et al. (2002) argued that densely populated areas in 1500 were more likely to induce Europeans to adopt extractive institutions, and these extractive institutions stymied economic development, leading to the reversal of fortune. In particular, they hypothesized that successful areas before European colonization, as measured by population density, would attract only a few European settlers, who would establish extractive political institutions that would retard long-run growth. Acemoglu et al. (2002) also suggested a direct positive effect of indigenous population density on the productivity of extractive institutions: there was more prosperity for Europeans to tax away for themselves, and there was a large labor force to exploit in European-owned plantations and mines.

Our new data on colonial European settlement contribute to the study of the reversal of fortune in two ways. First, using actual data on colonial European settlement, we provide the first confirmation of AJR’s prediction that indigenous population density was inversely associated with the share of European settlers during colonization (Table 2). Consistent with Acemoglu et al. (2002), we find that Europeans settled where there were not many other people. Second, as demonstrated in Tables 3, 4, 7, and 8, Euro share has a strong positive association with per capita income today. This finding is consistent with the explanation that colonial European settlement is a key intermediating variable in explaining the reversal of fortunes:Footnote 12 lower pre-colonial population density facilitated more European settlers (as a share of total population) and these settlers brought human capital, political institutions, and other factors that fostered economic development. This explanation does not require that political institutions are the principal channel through which European settlement shaped the reversal of fortunes. It simply requires, as documented above, that Euro share is positively associated with subsequent economic development.

4.5 Additional robustness tests

These results are robust to several sensitivity analyses. First, there might be concerns about the dating of Euro share. As discussed above, we attempt to use a date in each country’s history that is at least a century after initial European contact, but at least 50 years before independence. This is both a bit arbitrary and often infeasible due to data availability. Thus, we re-do all of the analyses using three alternative ways of computing Euro share. We compute the average value of the share of European settlement in our data over three uniform periods: (1) 1500–1800, (2) 1801–1900, and (3) 1500–1900.Footnote 13 Table 9 provides the regression results using these alternative methods for dating the share of colonial European settlement. As shown, all of the earlier results hold.

Second, there might also be concerns that (1) we assign a value of zero to Euro share if we find no historical sources documenting European settlement in a particular colony and (2) countries in which Euro share equals zero are special cases. Thus, we conducted two sensitivity tests. First, we re-did the analyses while eliminating all countries with Euro share equal to zero. All of the results hold (Table 10).

Furthermore, we re-did the analyses while including a dummy variable called Dummy for euro share equal to zero that equals one if Euro share equals zero and zero otherwise. These results are reported in Table 11. As shown, all of the results on Euro share hold when including this dummy variable. The finding that Dummy for euro share equal to zero generally enters with a positive and significant coefficient in the Tables 4 and 7 specifications (those that restrict the sample to Euro share \(<\) 0.125) provides some suggestive evidence for the view that small European settlements had harmful effects on long-run economic development compared to no European settlement. The estimates, however, indicate that the positive effect of Euro share on Current income quickly surpasses the positive effect of the Dummy for euro share equal to zero on Current income. For instance, the estimates from column (1) of Table 11, Panel 2 indicate that a Euro share of greater than 0.048 is better for Current income than a Euro share of zero.
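The break-even value of Euro share follows from comparing predicted income with and without European settlement. As a sketch, assume the Table 11 specification is Eq. (1) plus the dummy, write \(\delta \) for the coefficient on Dummy for euro share equal to zero, and let \(s^{*}\) be the break-even share (both symbols are ours, introduced only for illustration). Holding the other controls fixed, a country with Euro share \(s>0\) has predicted income \(\alpha +\beta s\), while a country with Euro share of zero has \(\alpha +\delta \). Setting the two equal gives

$$\begin{aligned} \alpha +\beta s^{*}=\alpha +\delta \quad \Longrightarrow \quad s^{*}=\frac{\delta }{\beta }. \end{aligned}$$

Any Euro share above \(s^{*}\) is then associated with higher Current income than a Euro share of zero; the value of 0.048 cited in the text corresponds to this ratio for the column (1) estimates.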

Table 9 Current income and colonial European settlement: alternative time periods for computing colonial European settlement
Table 10 Current income and colonial European settlement: eliminating countries with Euro share = 0
Table 11 Current income and colonial European settlement: including a fixed effect for Euro share \(=\) 0

Third, we also assess whether other geographic endowments account for the findings on the relation between Euro share and Current income. In particular, we control for soil quality and access to navigable waterways, using variables from Ashraf and Galor (2013). For soil quality, they gauge the suitability of the soil for agriculture through measures of soil carbon density and soil pH. For navigable waterways, they use the average distance from grid cells throughout a country to the nearest ice-free coastline or sea-navigable river. When we include these measures of geographic endowments in our analyses, all of the results hold.

Fourth, there might be concerns that we do not measure the effect of European colonization compared to no colonization, and hence might not sufficiently distinguish European colonization from European settlement. This is a valid issue but difficult to study empirically, since few non-European countries completely escaped colonization or some intermediate form of European control, and there is ambiguity and disagreement as to which ones they are. For example, AJR identify the following countries as not former colonies: China, Iran, Japan, Korea, Liberia, Mongolia, Thailand, Turkey, and Uzbekistan. There are, however, disagreements because, for example, China had semi-colonial enclaves, Iran was a “sphere of interest” for Russia and Britain, Uzbekistan partially belonged to the Russian empire, Korea was a Japanese colony, and descendants of American slaves governed Liberia. In the other direction, Ethiopia is often considered a non-colony but is classified as a colony by AJR. Nevertheless, when including a dummy variable that equals one if the country is a former colony as classified by AJR and zero otherwise, all of the paper’s results hold.

5 How much development is attributable to Europeans?

In this section, we conduct a simple global development accounting exercise to assess how much of development today might be associated with European settlers during colonization. This is purely illustrative and the results should be treated as such. This exercise uses the estimated equation for Euro share with no controls:

$$\begin{aligned} \ln \left( {CurrentIncome_i } \right) =\alpha +\beta Euroshare_i +\varepsilon _i \end{aligned}$$
(2)

Next, define the counterfactual CurrentIncome \(^{CF}\) for every country outside of Europe by removing the European effect:

$$\begin{aligned} { CurrentIncome}_i^{CF} ={ CurrentIncome}_i \cdot e^{-\beta \, Euroshare_i} \end{aligned}$$
(3)

Of course, \({ CurrentIncome}_i ={ CurrentIncome}_i^{CF}\) for any country i where \(Euroshare_{i} =0\).

The counterfactual population-weighted global mean is then simply the weighted mean across all non-European countries of \(CurrentIncome_i^{CF}, \) where \(P_{i}\) is population in country i, and P is total global population:

$$\begin{aligned} \tilde{y}^{CF}=\sum \nolimits _i \left( {\frac{P_i }{P}} \right) { CurrentIncome}_i^{CF} . \end{aligned}$$
(4)

The global population-weighted per capita income \(\tilde{y}\) is

$$\begin{aligned} \tilde{y} =\sum \nolimits _i \left( {\frac{P_i }{P}} \right) { CurrentIncome}_i . \end{aligned}$$
(5)

The share of development attributed to European settlement is then \(s_e =\left( {\frac{\tilde{y} -\tilde{y} ^{CF}}{\tilde{y} }} \right) \).

As an illustrative exercise, we use the sample and the coefficient from regression (1) of Table 3, which is the simplest regression for the full sample of all countries outside of Europe. The coefficient estimate is \(\beta = 3.623\).

Using the 2000 population weights, the data and estimated coefficients indicate that 40 % of the development outside of Europe is associated with the share of European settlers during colonization \(\left( {\frac{\tilde{y} -\tilde{y} ^{CF}}{\tilde{y} }} \right) \). We repeat our frequent caveat that global per capita income is not a welfare measure, especially in light of the history of European exploitation of non-Europeans.
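The accounting exercise in Eqs. (2) through (5) can be sketched in Python as follows. Only the coefficient 3.623 comes from the text; the function name and the country tuples are hypothetical, so the printed share will not reproduce the 40 % figure. The sketch normalizes the weights by the total population of the countries passed in, which does not affect the ratio because the normalizing total cancels.

```python
import math

BETA = 3.623  # coefficient on Euro share reported for regression (1) of Table 3


def european_share_of_development(countries, beta=BETA):
    """Share of population-weighted income associated with Euro share.

    countries: list of (income_level, euro_share, population) tuples, where
    income_level is per capita income in levels (not logs).
    """
    total_pop = sum(pop for _, _, pop in countries)
    # Eq. (5): actual population-weighted mean income.
    y = sum(pop / total_pop * inc for inc, _, pop in countries)
    # Eqs. (3)-(4): counterfactual income with the Euro share effect removed.
    y_cf = sum(pop / total_pop * inc * math.exp(-beta * share)
               for inc, share, pop in countries)
    return (y - y_cf) / y


# Hypothetical countries: (income per capita, Euro share, population in millions).
toy_world = [
    (1500.0, 0.00, 900.0),
    (6000.0, 0.05, 150.0),
    (30000.0, 0.60, 40.0),
]
share = european_share_of_development(toy_world)
print(f"Share associated with colonial European settlement: {share:.2f}")
```

For a country with Euro share of zero the counterfactual equals actual income, so such countries affect the result only through their weight in the global mean.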

As an illustrative exercise in positive analysis, however, it is striking how much of global development today could be associated with the migration and settlement of Europeans during the colonial era (not even considering the development of Europe itself). It is even more striking that this large average income outcome in a non-European world today of over five billion is associated with the migration of only six million European settlers in colonial times.

6 Conclusions

The previous literature was correct to focus on colonial settlement by Europeans as one of the pivotal events in the history of economic development. In this paper, we provide the first direct evidence that the proportion of Europeans during colonization is strongly and positively associated with the level of economic development today. These findings are robust to using different subsamples of countries, controlling for an array of country characteristics, and conditioning on the current proportion of the population of European descent.

These results relate to theories of the origins of the divergent paths of economic development followed since European colonization. ES and AJR stress that when endowments led to the formation of settler colonies, this produced more egalitarian, enduring political institutions that fostered long-run economic development. And ES and GLLS emphasize that Europeans brought human capital that slowly disseminated to the population at large and boosted economic development. The results presented in this paper are consistent with both of these effects: former colonies with larger colonial European settlements have much higher levels of economic development today than former colonies that had a smaller proportion of Europeans during the colonial period. Our results also paint a positive picture of minority colonial European settlements about which the previous literature was ambiguous. Specifically, the estimates indicate that once European settlement is above 4.8 %, the small colonial European settlements have a positive effect on development today compared to no colonial European settlement. This suggests that any adverse effects arising from the extractive institutions created by small colonial European settlements were more than offset by other things that Europeans brought during colonization, such as human capital, technology, familiarity with global markets, and institutions, which had lasting, positive effects on economic development.