Introduction and Motivation

The removal of barriers to the free movement of labour, capital, goods and services within the borders of the European Union (EU) was called for by the Treaty establishing the European Economic Community in 1957. Despite member states’ growing economic integration, intra-EU labour mobility remained very low for decades and received comparatively little attention in the policy debate until Europe decided to move to a single currency. Labour mobility between member states of a currency area could work as an effective shock absorption mechanism.Footnote 1 Yet the free movement of labour in Europe appeared to be a mere notion rather than an economic stabiliser – in 2000, only 0.1% of the total EU15 population changed official residence between two member states (European Commission, 2002), and a mere 1% resided in an EU country other than that of their citizenship (Eurostat, 2021b, c).

To support cross-border labour mobility, the EU undertook a number of initiatives.Footnote 2 However, it was not until after the eastern enlargement rounds and the Great Recession that the dynamics of intra-EU labour mobility changed markedly.Footnote 3 The share of EU citizens of working age residing in an EU member state other than that of their citizenship made up 2.4% in 2010 and increased further to 3.3% by 2020 (Eurostat, 2021a). Regardless of whether one believes that there is too much or too little labour mobility, the real-life migration pattern in the EU is extremely uneven (see Table A1 in the Appendix).

The recent financial crisis and the subsequent economic downturn have given a fresh impetus to political, economic and academic debates on labour mobility and its potential contribution to growth and employment in the euro area (e.g. Arpaia et al., 2016; Barslund & Busse, 2014; Elsner & Zimmermann, 2016; Galgóczi & Leschke, 2016; Kaczmarczyk & Stanek, 2016). There is extensive literature on the volume and composition of migrants from accession countries as well as on the impact of labour mobility on both sending and receiving countries (e.g. Alcidi & Gros, 2019; Baas & Brüecker, 2010; Brüecker et al., 2009; Kahanec & Zimmermann, 2010). The understanding of the forces driving intra-EU mobility is nevertheless still limited.

Understanding the factors shaping migration flows within Europe is crucial for the development of policies aimed at removing unnecessary barriers to intra-EU mobility. This article contributes to the existing literature by identifying some of the key determinants of international migration flows within the EU and specifically examining the role of cultural and linguistic differences in explaining the size of these flows. The empirical analysis uses data from 28 EU member states over the period 1998–2018. A series of indicators of cultural distance are controlled for along with economic, demographic, geographical, political and network variables. The indicators measuring the extent of cultural barriers between countries are linguistic distance based upon the linguistic proximity measure constructed by Dyen et al. (1992) from the matrix of lexicostatistical percentages, an indicator calculated on the basis of cultural dimensions created by Hofstede as well as a new index based on interpersonal distance preferences in different countries as measured by Sorokowska et al. (2017).

The results reveal that economic incentives, geographical proximity and the size of the network already settled in the destination country have a significant and positive effect on intra-EU migration flows. Cultural distance does not seem to prevent Europeans from moving to another member state, whereas linguistic distance has a significant and strong negative effect on the size of migration flows. These results show that open borders alone do not imply that EU citizens enjoy full freedom of movement. The cost of learning a new language is an important factor preventing Europeans from moving freely across the EU.

The remainder of the paper is organised as follows. Theoretical and Empirical Approaches to International Migration provides an overview of related literature. Conceptual Framework and Methodology describes the conceptual framework, presents the data used and describes the construction of the cultural and linguistic distance measures employed in the article. Econometric Specification outlines the empirical approach while Discussion of Findings discusses the results. Summary and Conclusions concludes.

Theoretical and Empirical Approaches to International Migration

The decision to migrate abroad is affected by numerous determinants of economic as well as non-economic nature and may be shaped by various unmeasured or immeasurable factors. ‘[The] laws of population, and economic laws generally, have not the rigidity of physical laws, as they are continually being interfered with by human agency’, Ravenstein observed in 1889 (p. 241).

Despite this early observation, for many years, a central role in shaping the views and strategies of academics and policymakers has been played by the traditional neoclassical approach to international migration, which suggests that migration takes place because there are variations in wages and unemployment rates across labour markets in different countries that individuals respond to (Hicks, 1932; Harris & Todaro, 1970; Todaro, 1969). Neoclassical individuals from low-wage countries thus follow their adding-machine brains and inevitably choose to migrate in order to enjoy the highest income possible, hence maximising their utility.

It has been previously suggested that in the European case, wage and unemployment differentials may not be the central factor explaining international migration. Bentivogli and Pagano (1999) consider the migration responsiveness to wage and unemployment differentials in the United States and the euro area.Footnote 4 The authors find the sensitivity of net immigration flows to regional disparities in both unemployment rates and income to be much lower in Europe than in the United States; moreover, there is no response of migration flows to shocks in the regional relative unemployment rate in Europe. Braunerhjelm et al. (2000) show that despite a considerable fall in wage differentials between some European countries – for example, between France and Spain – since the 1970s, there has been an even larger increase in unemployment differentials (p. 51). Consequently, when weighted by the probability of being employed, wage differentials have in fact increased. Braunerhjelm et al. (2000) argue that levels of income in the sending country rather than income differentials influence the propensity to migrate, considering that in developed countries, households are generally not forced to migrate due to poverty and deprivation in the home country. Belot and Ederveen (2012), on the other hand, find that mobility between European countries does respond to economic differentials. Furthermore, Ortega and Peri (2013) suggest that intra-EU migration is highly sensitive to economic conditions at the destination, once time-varying factors at the origin are controlled for.

The relationship between welfare systems and international migration flows has received much attention in the policy debate in the EU. The welfare magnet hypothesis puts forward that individuals base their migration decision on the generosity of the welfare system in the country of destination (Borjas, 1999). The concern is that immigrants move to countries with generous welfare systems in order to receive social benefits rather than work. Some studies show that countries with higher social expenditure attracted more migrants, albeit the economic impact is small (Warin & Svaton, 2008; De Giorgi & Pellizzari, 2009). Other studies find no evidence that welfare generosity influences migration decisions (Giulietti, 2014; Guild et al., 2013; Kahanec & Guzi, 2020; Ponce, 2019). In spite of the lack of conclusive evidence for ‘welfare tourism’ in the EU, concern that freedom of movement could be used to profit from the generosity of the welfare system in the country of destination has been brought onto the EU agenda.

In an attempt to model migration flows more realistically, the human capital migration theory takes the heterogeneity of immigrants into account (e.g. Borjas, 1987, 1989; Hatton & Williamson, 2002; Sjaastad, 1962). It suggests that the probability of becoming employed and receiving higher wages at the destination relative to the origin, and thus to migrate, depends on individual human capital characteristics. This is why individuals from the same country of origin may have different costs of migration and consequently different inclinations to move.

An examination of the population composition can therefore shed light on the mobility attitudes of particular groups. For example, young people are likely to face lower costs of moving abroad and expect to derive the highest benefits from investment in their human capital. Burda (1993), analysing migration patterns in Germany after reunification, found that age is negatively and strongly associated with the inclination to migrate. Belot and Ederveen (2012) find a positive correlation between the share of the young population in the country of origin and migration flows within the OECD. Mayda’s (2010) study also confirms that the share of the young population is one of the most important drivers of migration flows, albeit the analysis includes both developing and developed countries.

Workers with higher skill levels are likely to gain more from moving abroad, and it has been shown that high-skill migration is indeed becoming a dominant pattern of international migration (Bernard & Bell, 2018; Brüecker et al., 2012; Docquier & Rapoport, 2012; Grogger & Hanson, 2011). The argument that highly skilled workers are more likely to emigrate has been found to be relevant for developed countries (see e.g. Giannetti, 2001; Mauro & Spilimbergo, 1999). However, data on high-skilled intra-EU migrants is scarce since these workers are not captured through any dedicated immigration programme (Weinar & Klekowski von Koppenfels, 2020).

Migrant networks have also been shown to shape population movements to a substantial extent (e.g. Beine et al., 2011, 2015; Beine et al., 2017; Munshi, 2003). The presence of a national community in the destination country could reduce the private costs and risks of migrating abroad, as the first migrant faces the highest migration costs, while an established migrant network in the country of destination may increase the welfare of new migrants by, for example, providing information on employment opportunities or local housing markets. Gross and Schmitt (2005) show that the existence of cultural communities is more beneficial to immigrants from developing countries than from developed countries. The authors argue that migration flows between OECD countries as well as between the EU member states show no reaction to the presence of cultural clusters. In contrast, van Wissen and Visser (1998) show that the variables indicating past migratory movements are important for predicting intra-EEA migration flows.

Differences in Language and Culture

Once incomes reach a socially acceptable level, the non-monetary costs of migration become more relevant for potential emigrants. Factors determining migration flows between advanced economies are different from those explaining migration from developing to developed countries. Braunerhjelm et al. (2000) argue that ‘cultural and linguistic factors can play a role in discouraging migration, provided however that home income is sufficiently high and households are willing to substitute home amenities for a further rise in wages through migration’ (p. 53). For a long time, migration research paid limited attention to the potential influence of cultural determinants on international migration flows and did not go beyond including a control for sharing a common language or using broad linguistic groups as a proxy (e.g. Mayda, 2010; van Wissen & Visser, 1998). Measures that proxy cultural ties, such as linguistic and cultural proximity, were first added as control variables to study trade flows (Boisso & Ferrantino, 1997; Melitz, 2008; Felbermayr & Toubal, 2010). This idea was later extended to model international migration flows. Recent migration literature emphasises the potential influence of linguistic and cultural proximity in determining migration flows (e.g. Adsera & Pytlikova, 2015; Belot & Ederveen, 2012; Belot & Hatton, 2012; Bredtmann et al., 2017; Caragliu et al., 2013; Sprenger, 2013; White & Yamasaki, 2014). However, most studies include both developing and developed countries.

Adsera and Pytlikova (2015) investigate the importance of language in shaping international migration flows from 223 source countries to 30 member countries of the Organisation for Economic Co-operation and Development (OECD) during the period 1980–2010. The authors apply several measures of linguistic proximity and find that migration rates increase with linguistic proximity between first official languages. Belot and Ederveen (2012) obtain similar results when analysing the role of linguistic distance on migration flows between 22 OECD countries over the period 1990–2003. Bredtmann, Nowotny and Otten (2017) show that linguistic distance has a negative effect on the location decisions of migrants and that this negative effect decreases when the network in the host region is larger. Wong (2023) investigates the relationship between linguistic proximity and labour market outcomes of the asylum population in Switzerland and shows employment increases with proximity, particularly among the earlier arrival cohorts. The negative effect of linguistic distance has been shown to hold even within one nation. Falck et al. (2014), using linguistic micro-data for Germany collected between 1879 and 1888, show that cross-regional migration flows during the period 2000–2006 are positively affected by historical dialect similarity.

Cultural proximity is a more intangible concept. Several measures of cultural orientation have been used, for example, by Belot and Ederveen (2012), who find that variables describing religious distance and survey-based measures of cultural distance are important when analysing bilateral migration flows between OECD member states, albeit less so when studying the ‘European immobility puzzle’. Lanati and Venturini (2021) analyse migration flows from 185 source countries to 30 OECD countries over the period 2004–2013 using bilateral exports in cultural goods as a proxy for cultural proximity. The authors find that a stronger cultural affinity positively affects migration even beyond the effects of pre-existing cultural and historical ties.

Conceptual Framework and Methodology

Approaches to Measuring Culture

Culture is a complex phenomenon; its various aspects are hard to describe, and even harder to measure. For the sake of simplicity, language has often been used as a best-guess proxy for culture in economic research.Footnote 5 However, what is language actually a proxy for? To some extent, language certainly is a carrier of culture, but does it adequately reflect cultural identity or preference similarity, or is it primarily related to communication? ‘The relation between language and behavior is far from being settled...The question of whether language is or is not one of the facets of culture has obviously not lost its attractiveness even today’ (Ginsburgh & Weber, 2020, p. 357). Culture and language are both part of a nation’s values; however, the link between identity and linguistic distance is not straightforward, especially as regards migration costs.

The acquisition of proficiency in the dominant language of the destination country has been shown to considerably improve immigrants’ labour market outcomes (Chiswick & Miller, 2015). Greater linguistic proximity in turn plays a decisive role in foreign language acquisition (Chiswick & Miller, 2005) and, to a large extent, explains language skill heterogeneity among immigrants (Isphording & Otten, 2017). We therefore expect greater linguistic distance to be associated with higher language acquisition costs and thus higher migration costs. Furthermore, this article questions whether cultural distance, i.e. diversity in attitudes, also translates into higher costs for EU movers.

A challenging issue inherent in dominant survey-based approaches to measuring culture is related to latent culture (Caprar et al., 2015). Just like people’s actions reveal their underlying preferences, revealed culture potentially reveals latent culture. Most survey-based cultural distance measures, however, reflect reported, or stated, culture rather than revealed culture. Similar to reported preference as observed when people are simply asked how they would behave, reported culture is merely a proxy for latent culture as revealed by a survey (Maseland & Hoorn, 2010).Footnote 6 In order to proxy latent culture more directly than is done by measures based on surveys on national cultural values, this study proposes a cultural distance measure that relies on observable behaviour reflecting differences in underlying cultural values. For that purpose, we use objective values of preferred interpersonal distance, or interpersonal space, in different regions.

According to proxemics, cultural norms and expectations influence people’s comfort levels with physical proximity and are thus the most important factors to describe the preferred interpersonal distance, that is, a distance individuals maintain in interpersonal interactions (Hall, 1966; Hayduk, 1983). The study of proxemics refers to ‘the interrelated observations and theories of man’s use of space as a specialized elaboration of culture’ and sees people from different cultures as not only speaking different languages but living in ‘different sensory worlds’ (Hall, 1966, pp. 1–2). Spatial needs, defined in terms of interpersonal distance zones–intimate, personal, social and public–vary by both personal preferences and culture: what is an accepted personal or even social distance in one culture may be intimate in another.Footnote 7

The Gravity Model of Migration

The use of gravity models in migration research has only recently gained momentum because of an increased availability of bilateral migration data. Gravity models explain spatial relations between two countries as a function of the respective ‘mass’ of goods, labour or other factors of production and distance between these countries. The use of country-pair data also makes it possible to identify other important determinants of international migration such as the existence of network effects, the role of linguistic distance or the impact of cultural links between countries (Beine et al., 2016). Ravenstein (1885, 1889) was the first scholar who drew upon the theoretical foundations of gravity in an attempt to explain and predict currents of migration within and between countries, and is considered to have pioneered the use of the gravity model long before gravity regressions became popular in analysing international trade (Anderson, 2011).

As discussed above, to examine migration decisions based on countries’ macroeconomic conditions, the attracting mass is generally approximated by income and unemployment differentials, the share of the young population and the share of individuals with tertiary education in the country of origin. The distance can be represented by geographical distance.

To identify the factors encouraging and impeding international migration in the EU, this article analyses economic, demographic, geographical, political and network determinants as well as a set of cultural distance measures. In line with the theoretical ideas presented above, costs associated with migration are expected to be larger with physical, cultural and linguistic distance and to fall with the size of existing networks and with the right to free movement of workers.

Data Construction

Data on migration flows between the 28 member states of the EU for the years 1998–2018 are collected from different sources (Eurostat, OECD and national statistical offices) to provide a complete overview.Footnote 8 The analysis uses a set of indicators of cultural distance along with economic, demographic, geographical, political and network variables. Table A2 in the Appendix provides definitions, sources and summary statistics of all variables. Four variables are included to measure the extent to which the country of destination differs linguistically and culturally and thus necessitates making an effort to adapt oneself.

Common Language Dummy

A dummy variable is defined with the value of 1 if two countries have the same official language and 0 if not. This indicator takes only official languages into account and not officially recognised minority languages such as, for example, Finnish in Sweden, French in the Aosta Valley region in Italy or German in the district of North Schleswig in Denmark.

Linguistic Distance

The index of linguistic distance is constructed based on the linguistic proximity measure created by Dyen et al. (1992) from the matrix of lexicostatistical percentages for the Indo-European languages. Lexicostatistics assesses degrees of relatedness between languages and uses lexicostatistical percentages to classify the varieties of speech. The lexicostatistic method uses a list of basic meanings that are present in almost every culture, i.e. culture-independent core vocabulary that includes pronouns, simple adjectives, simple verbs, names of body parts and names of natural phenomena, for example, ‘mother’, ‘I’, ‘all’, ‘to breathe’, ‘to kill’, ‘snow’, ‘blood’, ‘child’ and numerals from one to five. The phonetic representations of the words with these basic meanings are collected for all languages belonging to a language family. They are then considered for each meaning to determine whether some of all the forms are cognate. This method makes it possible to exclude words borrowed from one language into another. For example, English ‘flower’ is not cognate to French ‘fleur’, because it is borrowed from French. However, English ‘blossom’ is (Dyen et al., 1992, p. 95). The lexicostatistical percentage is the percentage of all meanings for which the forms are cognate. For instance, French and English are connected by 23.6%, and German and English are connected by 57.8% (Dyen et al., 1992, pp. 102–118). Based on Dyen et al. (1992), the indicator of linguistic distance is defined as

$$\begin{aligned} 1-\max _{\forall i\in A, \forall j\in B}\{proximity\{i,j\}\}, \end{aligned}$$

where i and j are the official languages of countries A and B respectively. proximity is the lexicostatistical percentage as described above. One maximises the proximity between languages by taking the highest value of linguistic proximity of all possible pairs of languages for the countries with several official languages. The indicator can range from 0, when countries have the same official language and thus no distance, to 1, when countries’ official languages belong to different language families as in the case of the distance between the languages of the Uralic language family and the Indo-European languages (for more details, see Table A4 in the Appendix).Footnote 9 Uralic languages are not part of the Indo-European family and are thus not discussed in Dyen et al. (1992). To fill this gap, the linguistic distance index for Finnish, Hungarian and Estonian is constructed as proposed by Adsera and Pytlikova (2015, p. F53).
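The construction above can be illustrated with a minimal Python sketch. It assumes that proximities are stored as shares between 0 and 1 and uses only the two lexicostatistical percentages quoted in the text (French and English 23.6%, German and English 57.8%). The function names, the lookup-table layout and the two-language country A are hypothetical and chosen purely for illustration.

```python
# Lexicostatistical proximities quoted in the text (Dyen et al., 1992),
# expressed as shares. The table layout is an assumption of this sketch.
PROXIMITY = {
    frozenset({"French", "English"}): 0.236,
    frozenset({"German", "English"}): 0.578,
}


def proximity(lang_i, lang_j):
    """Proximity of two languages; identical languages have proximity 1."""
    if lang_i == lang_j:
        return 1.0
    # Pairs missing from the table are treated as unrelated (proximity 0),
    # which is a simplification made only for this illustration.
    return PROXIMITY.get(frozenset({lang_i, lang_j}), 0.0)


def linguistic_distance(languages_a, languages_b):
    """1 minus the highest proximity over all pairs of official languages."""
    best = max(proximity(i, j) for i in languages_a for j in languages_b)
    return 1.0 - best


# Hypothetical country A with two official languages versus an
# English-speaking country B: the closer pair (German, English) is used.
distance_ab = linguistic_distance({"French", "German"}, {"English"})

# Two countries sharing an official language have zero distance.
distance_same = linguistic_distance({"German"}, {"German", "French"})
```

In this sketch, taking the maximum over language pairs means that a multilingual country is treated as being as close to its partner as its closest official language allows.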

Cultural Distance Based on Hofstede Dimensions

Perhaps the most widely used construct to examine cultural distance is based on the cultural dimensions of Hofstede and Minkov (2010) and is computed as described by Kogut and Singh (1988) in their analysis of the choice of market entry mode in the United States:

$$\begin{aligned} CD_{i,j}= \frac{1}{6} \sum _{k=1}^6 \frac{(I_{i,k}-I_{j,k})^2}{V_k}, \end{aligned}$$

where \(CD_{i,j}\) denotes the cultural difference or distance between country i and country j, \(I_{i,k}\) is the Hofstede index for country i and dimension k, and \(V_k\) indicates the variance of the index of the kth dimension. The dimensions are based on Hofstede’s original survey of IBM employees in over 40 countries and reflect six anthropological topics that are handled differently in different nations: power distance, individualism versus collectivism, masculinity versus femininity, uncertainty avoidance, long-term orientation versus short-term normative orientation and indulgence versus restraint (Hofstede & Minkov, 2010). Data are available for all dimensions and all countries except Cyprus.
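As a rough illustration of the Kogut and Singh index, the following Python sketch averages the variance-scaled squared differences over all dimensions. The country labels and scores are hypothetical and are not actual Hofstede values. The use of the population variance across the sample countries is also an assumption of this sketch, since the text does not state which variance estimator is used.

```python
from statistics import pvariance


def kogut_singh_distance(scores, country_i, country_j):
    """Average variance-scaled squared difference across all dimensions."""
    n_dims = len(scores[country_i])
    total = 0.0
    for k in range(n_dims):
        # Variance of dimension k across every country in the sample
        # (population variance; an assumption of this sketch).
        v_k = pvariance([values[k] for values in scores.values()])
        diff = scores[country_i][k] - scores[country_j][k]
        total += diff ** 2 / v_k
    return total / n_dims


# Hypothetical scores on six dimensions for three made-up countries;
# these are NOT actual Hofstede values.
scores = {
    "X": [10, 20, 30, 40, 50, 60],
    "Y": [20, 30, 40, 50, 60, 70],
    "Z": [30, 10, 50, 30, 70, 50],
}

cd_xy = kogut_singh_distance(scores, "X", "Y")
cd_yx = kogut_singh_distance(scores, "Y", "X")
cd_xx = kogut_singh_distance(scores, "X", "X")
```

Dividing by the variance of each dimension keeps a dimension with widely dispersed scores from dominating the composite index.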

Cultural Distance Based on Preferred Interpersonal Distance

Sorokowska et al. (2017) compare preferred interpersonal distances across 42 countries, analysing three types of interpersonal distance: social distance (when approaching a stranger, 122–210 cm), personal distance (when approaching an acquaintance, 46–122 cm) and intimate distance (maintained in close relationships, 0–46 cm). Fifteen EU member states are included in the study by Sorokowska et al. (2017): Austria, Bulgaria, Croatia, Czechia, Estonia, Germany, Greece, Hungary, Italy, Poland, Portugal, Romania, Slovakia, Spain and the United Kingdom (represented by England).Footnote 10 The three countries from the full sample where participants’ preferred distance from a stranger was largest were Romania (139.64 cm), Hungary (130.72 cm) and Saudi Arabia (126.87 cm), whereas the three countries where participants required the least personal space when approaching a stranger were Argentina (76.52 cm), Peru (79.61 cm) and Bulgaria (81.37 cm). In Estonia, Hungary and Romania people stand farther from their acquaintances than Austrians and Slovaks do with strangers (see Fig. 1).

Fig. 1 Preferred interpersonal distance in European countries (cm). Source: Author’s illustration based on Sorokowska et al. (2017)

We propose an indicator of cultural distance based on objective values of preferred interpersonal distances in different regions measured by Sorokowska et al. (2017). The measure is constructed as follows with the Euclidean distance formula used to calculate a composite distance index on a set of dimensions:

$$\begin{aligned} Space_{i,j} = \sqrt{(Socialdist_i - Socialdist_j)^2 + (Personaldist_i - Personaldist_j)^2}, \end{aligned}$$

where i and j are countries’ indices. For the purpose of this study, we focus on preferred interpersonal distance with strangers and acquaintances, i.e. social distance and personal distance.
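The composite index can be sketched in a few lines of Python. The two countries and their preferred distances below are hypothetical and are not values reported by Sorokowska et al. (2017); the dictionary layout and the function name are likewise assumptions made for illustration.

```python
import math


def space_distance(country_i, country_j):
    """Euclidean distance over social and personal distance (both in cm)."""
    return math.hypot(
        country_i["social"] - country_j["social"],
        country_i["personal"] - country_j["personal"],
    )


# Hypothetical preferred distances in cm, NOT values from Sorokowska et al.
country_p = {"social": 120.0, "personal": 100.0}
country_q = {"social": 90.0, "personal": 60.0}

# Differences of 30 cm (social) and 40 cm (personal) combine into one index.
space_pq = space_distance(country_p, country_q)
space_qp = space_distance(country_q, country_p)
```

Because both components are measured in centimetres, the sketch combines them without rescaling.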

The correlation coefficients between the analysed distance variables (physical, linguistic, Hofstede and interpersonal) are low and even negative, suggesting that the measures capture different aspects of cultural distance (see Table A3 in the Appendix).

Econometric Specification

To structure the ideas discussed above, the econometric model is given by:

$$\begin{aligned} m_{ijt}&= \beta _1 + \beta _2\text {Y}_{it-1} + \beta _3\text {Y}_{jt-1} + \beta _4\text {p}_{it} + \beta _5\text {p}_{jt} + \beta _6\text {S}_{it} + \beta _7\text {D}_{ij}\nonumber \\&\quad + \beta _{8}\text {o}_{ijt} + \beta _{9}\text {spr}_{jt} + \beta _{10}\text {n}_{ijt-1} + \beta _{11}\text {L}_{ij} + \beta _{12}\text {CD}_{ij} + \delta _j + \varepsilon _{ijt} , \end{aligned}$$
(1)

where \(m_{ijt}\) is the gross migration flow from country i to country j at time t, where i = 1, ..., 28; j = 1, ..., 28; and t = 1998, ..., 2018. \(Y_{it-1}\) and \(Y_{jt-1}\) are country-specific economic push and pull factors, measured by purchasing power adjusted GDP per capita and unemployment rates at the origin and destination. To reduce the risk of reverse causality in the model (migration flows having an impact on earnings and employment), the economic variables are lagged by one period. This is also useful to account for the information available at the time the migration decision is taken.
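The one-period lag can be illustrated with a small Python sketch on a country-year panel. The data layout (a dictionary keyed by country and year), the function name and the GDP figures are hypothetical; the sketch only shows the mechanics of pairing each year with the value observed one year earlier.

```python
def lag_one_year(series):
    """Map (country, year) to the value observed in the previous year."""
    lagged = {}
    for country, year in series:
        previous = series.get((country, year - 1))
        if previous is not None:
            # Years without a predecessor in the data are dropped.
            lagged[(country, year)] = previous
    return lagged


# Hypothetical GDP per capita (thousand PPS) for a made-up country "X".
gdp = {("X", 1998): 20.0, ("X", 1999): 21.0, ("X", 2000): 22.5}
gdp_lagged = lag_one_year(gdp)
```

In this layout the first year of each country has no lagged value, so lagging a regressor shortens the usable sample by one period per country.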

The size of the population at the origin, \(p_{it}\), indicates the magnitude of potential migration while the size of the population at the destination, \(p_{jt}\), captures possible gravity effects.

Matrix \(S_{it}\) includes aggregate measures of individual-level characteristics in the sending country: the share of tertiary educated people is included as an indication of workers’ skill level, and the share of young people (aged 20–34) in the total population is intended to capture the age structure of the population.

To control for the effect of physical distance, matrix \(D_{ij}\) includes the distance in kilometres between the capital city of country i and that of country j as well as a dummy variable that takes the value of 1 if the two countries have a common border. Physical distance is expected to capture the monetary cost of migration involved and the information the potential migrant has about the possible destination and its labour market.

Migration policies are represented by a dummy variable \(o_{ijt}\) with the value of 1 if country j allows the free movement of workers from country i. This measure is relevant for the EU in light of the transitional arrangements concerning the free movement of workers. The citizens of Bulgaria, Croatia, Czechia, Estonia, Hungary, Latvia, Lithuania, Poland, Romania, Slovakia and Slovenia were subject to a transitional period that imposed restrictions on the free movement of labour (European Union, 2003; European Commission, 2008, 2015). A maximum of seven years (2+3+2) of postponement enabled the member states to regulate the opening of their labour markets. Not only did most of the EU15 member states keep restrictions during that period, but several accession countries also used reciprocal measures to restrict access to their labour markets for nationals from those member states that restricted labour market access for their nationals. In addition, Spain liberalised access to its labour market for Romanian workers on 1 January 2009 but invoked the safeguard clause in 2011, temporarily suspending the law on the free movement of workers (European Commission, 2011).

In order to test the welfare magnet hypothesis, expenditure on social protection benefits as a percentage of GDP, \(spr_{jt}\), is included among explanatory variables.

To capture the existence of network effects, the number of citizens of the sending country residing in the receiving country is included; \(n_{ijt-1}\) is lagged by one period so that it can be treated as predetermined in relation to current migration flows.

Matrix \(L_{ij}\) includes a measure of linguistic distance between the countries as well as a dummy variable that takes a value of 1 if country j has the same official language as country i.

To account for cultural distances, matrix \(CD_{ij}\) includes a composite index of cultural distance based on Hofstede dimensions and a cultural distance index based on preferred interpersonal distance. Some explanatory variables are time-invariant.

Estimation

The dependent variable under analysis is the total inflow of citizens of the sending country i in the receiving country j. It is an example of a count variable, which is discrete and non-negative. To model this type of data, we use the pooled Poisson model with cluster-robust Huber–White standard errors, clustered at the country-pair level. Thus, standard errors allow for intragroup correlation, relaxing the requirement that the observations be independent within groups. Furthermore, fixed effects for the country of destination are introduced to control for unobserved country-specific characteristics and, in this way, correct for the correlation between panels. The non-linear Poisson maximum likelihood estimator is an instance of pseudo maximum likelihood estimation and has been shown to be fully robust, relying only on a correctly specified mean function, implying that the parameter estimators are consistent even if the assumption for the distribution is incorrect (Winkelmann, 2015, 2008; Wooldridge, 1999). It is essentially the Poisson pseudo maximum likelihood (PPML) estimator proposed by Santos Silva and Tenreyro (2006). Alternative methods for analysing count data include the negative binomial regression model (see e.g. Belot & Ederveen, 2012) or log-linearising the dependent variable. Both alternative estimation methods were performed as robustness tests.
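The robustness property described above, namely that the estimator relies only on a correctly specified mean function, can be illustrated with a toy Python sketch. It is not the specification estimated in this article: it has a single regressor, no destination fixed effects and no cluster-robust standard errors, and the data are hypothetical. The function name and the hand-written Newton iteration are assumptions of this sketch; applied work would normally rely on an established statistical package.

```python
import math


def fit_poisson(x, y, iterations=50, tol=1e-12):
    """Fit E[y | x] = exp(a + b * x) by Newton's method on the Poisson
    pseudo-likelihood. Returns the estimates (a, b)."""
    a, b = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(iterations):
        mu = [math.exp(a + b * xi) for xi in x]
        # Score vector: sum of (y - mu) times each regressor.
        g_a = sum(yi - mi for yi, mi in zip(y, mu))
        g_b = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Information matrix: sum of mu times the regressor cross-products.
        h_aa = sum(mu)
        h_ab = sum(mi * xi for xi, mi in zip(x, mu))
        h_bb = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h_aa * h_bb - h_ab * h_ab
        # Newton step: solve the 2x2 linear system by hand.
        step_a = (h_bb * g_a - h_ab * g_b) / det
        step_b = (h_aa * g_b - h_ab * g_a) / det
        a, b = a + step_a, b + step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    return a, b


# Hypothetical data: non-integer outcomes that follow the mean function
# exp(0.5 + 0.3 * x) exactly, so the fit should recover 0.5 and 0.3.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [math.exp(0.5 + 0.3 * xi) for xi in x]
a_hat, b_hat = fit_poisson(x, y)
```

The outcomes here are not counts and are not Poisson distributed, yet the estimates match the parameters of the mean function, which is the sense in which the pseudo maximum likelihood estimator does not depend on the distributional assumption.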

Discussion of Findings

The first column of Table 1 presents estimation results including economic, demographic, geographical and political variables. Columns (2)–(6) successively include further explanatory variables. The specification presented in column (2) adds expenditure on social protection benefits as a percentage of GDP in the destination country. Column (3) introduces the number of foreigners of the citizenship of the sending country in the receiving country. Column (4) introduces a common language dummy and the indicator of linguistic distance among explanatory variables. Cultural variables are introduced in columns (5) and (6). Because the data on preferred interpersonal distances are available for only 15 countries under consideration, the sample size drops substantially in column (6). Alternative estimation methods are presented in columns (7) and (8).

Table 1 Estimation results

The coefficients of the Poisson model can be interpreted as semi-elasticities since the model is specified with a log-linear conditional expectation function (Winkelmann, 2008). For example, taking the point estimate on lagged GDP per capita in the receiving country in column (1), the effect is [exp(0.140) − 1] × 100 = 15.03%. That is, an increase in GDP per capita of 1,000 PPS in the destination country would increase immigration flows by 15.03%, ceteris paribus.
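The conversion from a Poisson coefficient to a percentage effect can be reproduced in a few lines. The helper name `pct_effect` is hypothetical; the only input taken from the article is the coefficient 0.140 on lagged GDP per capita in the receiving country.

```python
import math


def pct_effect(coef, delta=1.0):
    """Percentage change in the expected count when a regressor rises by `delta` units."""
    return (math.exp(coef * delta) - 1.0) * 100.0


# Coefficient on lagged destination GDP per capita (in 1,000 PPS), column (1).
print(round(pct_effect(0.140), 2))  # 15.03
```

For small coefficients the shortcut 100 × coefficient gives a similar figure (14.0 here), but the exact expression is preferable because the gap widens as the coefficient grows.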

Many studies have documented the role of economic factors in determining migration flows (see e.g. Adsera & Pytlikova, 2015; Hirschle & Kleiner, 2014; Lanati & Venturini, 2021). Ortega and Peri (2013), for example, suggest that, within Europe, migration is sensitive to economic conditions at the destination, once income per capita and other time-varying factors at the origin are controlled for. As shown in column (1), an increase in GDP per capita at the origin, on the other hand, discourages migration. Indeed, higher incomes are associated with smaller emigration rates from advanced economies compared to less developed middle-income countries since the propensity to migrate decreases with the level of contentment with the current location (Clemens, 2014; Dustmann & Okatenko, 2014). An increase of one percentage point in the lagged unemployment rate in the destination country decreases migration flows by 4.74%, ceteris paribus. This confirms previous literature showing that favourable labour market conditions in destination countries in the EU attract migrants whereas high unemployment levels in potential destinations discourage migration (see e.g. European Commission and Joint Research Centre et al., 2018). The effect of an increase in the unemployment rate at the origin is statistically insignificant, which is in line with findings by Belot and Ederveen (2012).

As expected, the effect of the population size variables is positive and significant. Turning to the socio-demographic variables, we find that a higher share of tertiary educated people in the total population of the sending country is associated with lower migration flows. One explanation is that intra-EU mobility is concentrated among workers without a tertiary degree: according to the European Commission (2021, p. 14), only about one-third of EU movers had a tertiary level of education in 2019, and the contribution of EU mobile workers to total employment is highest for occupations requiring low-to-medium skills (European Commission, 2023). Another reason could be that highly skilled Europeans move to countries outside the EU. For example, looking at high-skilled emigrants from Germany, Parey et al. (2017) find that migrants to countries with a higher level of earnings inequality (e.g. the United States) are positively selected, whereas migrants to more equal countries (e.g. Scandinavian countries) are negatively selected and benefit from a more compressed wage distribution. The share of young people in the country of origin shows no statistically significant effect on migration flows.

The effect of physical distance is large, negative and significant; and sharing a border has a strong positive and statistically significant effect on migration flows. The free movement of workers has a significant positive effect on migration in the first specification (column (1)), and yet, the magnitude of its effect is not larger than that of geographical variables. Moreover, the coefficient is no longer significant when additional explanatory variables are included, as shown in columns (2)–(6). Windzio et al. (2021) also find that the opening of the labour market of destination countries has only a moderate effect on intra-EU migration flows when other factors (economic and geographical) are taken into account.

The effect of an increase in social protection benefits as a percentage of GDP in the destination country is positive, albeit statistically insignificant, and it is negative in the smaller sample analysed in column (6). This suggests that migration within the EU is not driven by a welfare magnet effect, and that the concern that immigrants move to countries with generous welfare systems in order to receive social benefits rather than to work is not supported by the data. This is in line with the findings of a number of previous empirical studies (e.g. Giulietti, 2014; Guild et al., 2013; Ponce, 2019).

Similar to previous studies (Beine et al., 2015, 2011; Lanati & Venturini, 2021), the results of the estimation including the number of foreigners of the citizenship of the sending country in the receiving country suggest network effects are an important driver of subsequent migration. The size of the ethnic network has a positive and significant effect on the size of subsequent migration flows.

The indicator of linguistic distance is highly significant as a determinant of migration flows within the EU. As expected, its effect is negative and large. These results are in line with previous studies (Adsera & Pytlikova, 2015; Belot & Hatton, 2012; Bredtmann et al., 2017; Chiswick & Miller, 2015). Belot and Ederveen (2012) find that the linguistic proximity between the source and the destination country is an important factor explaining migration flows between OECD countries, but does not seem to play a significant role when analysing migration between members of the EU or the European Economic Area. It should be noted, however, that the number of official languages of the EU has doubled since 2003, the last year considered in the Belot and Ederveen (2012) study. The simple dummy for sharing a common language has an insignificant effect on migration flows. Van Wissen and Visser (1998), whose analysis also involved very few multilingual countries and countries with the same official language, find a comparable effect of the simple language dummy. This outcome suggests that a more refined measure is advantageous in a multilingual setting.

Finally, cultural variables are introduced in columns (5) and (6). Hofstede scores are available for all countries in the sample, except Cyprus, whereas the data on preferred interpersonal distances are available for 15 countries in the sample. Both measures of cultural distance have a positive and statistically significant effect on migration between EU member states, although the effect of the distance index based on interpersonal distance preferences is smaller. This is a surprising result. One explanation could be related to skill level-dependent cultural sorting. Rapoport et al. (2021) find a negative relationship between low-skill migration and cultural similarity, whereas the relationship between high-skill migration and cultural similarity is positive. When analysing the role cultural barriers play in the subsample of countries that are either members of the European Union or European Economic Area, Belot and Ederveen (2012) do not come to a conclusive answer: while the measure of religious distance is negative and significant, the cultural distance variable based on the Hofstede dimensions is positive but does not have a statistically significant effect on mobility. However, as mentioned above, these results are hardly comparable, as the number of EU member states has almost doubled since 2003.

In order to shed light on the potential influence of individual dimensions on international migration, White and Buehler (2018) propose decomposing composite cultural distance measures. The authors find that differences in dimensions that reflect individualism, uncertainty avoidance and perceived gender roles negatively affect migration flows. Table 2 shows results from a parsimonious specification that only includes the linguistic and cultural measures as control variables. As shown in column (2), similar to the findings of White and Buehler (2018), the distance in the dimension that assesses social differentiation between the sexes has in fact a negative effect on migration flows. The difference in the individualism dimension is also significant in our case, albeit with an opposite sign. Finally, the greater the distance in the dimension reflecting the degree to which the less powerful members of a society accept and expect that power is distributed unequally, the more migration takes place between the member states of the EU, ceteris paribus.

Table 2 Decomposing Hofstede cultural distance

The data suggest that cultural distance is not an obstacle to intra-EU mobility. The opposite is the case: cultural distance (albeit not linguistic distance) stimulates mobility in the setting analysed in this article. The unexpected positive effect of the cultural distance variables on migration flows within the EU raises new questions and calls for more research in this area. The analysis presented in this article may not offer a conclusive evaluation, but it could serve as a useful starting point for future research. Given the complexity of the phenomenon of culture and its relatedness (and yet not equivalence) to language, it highlights the need for a more nuanced examination of how cultural distance influences migration patterns within the EU.

Columns (7) and (8) of Table 1 show that the effects identified in this article hold across a range of econometric specifications.

Summary and Conclusions

This article investigates the forces driving intra-EU mobility. We use data on migration flows between 28 member states of the EU for the period 1998–2018 to analyse the role of economic, demographic, geographical, political as well as network variables while paying particular attention to the cultural and linguistic distance between the EU member states. The indicators measuring cultural barriers between countries are a linguistic distance measure constructed using lexicostatistical percentages, an indicator based on Hofstede’s cultural dimensions and a new index based on interpersonal distance preferences in different countries.

The results suggest that intra-EU mobility is driven by economic conditions, employment opportunities, geographical proximity as well as network ties. Cultural distance between countries does not seem to prevent Europeans from moving to another member state; rather, the opposite is true. The coefficient of linguistic distance, on the other hand, is negative and highly significant in all samples and specifications. Thus, migration flows between two countries are smaller the less related their languages are, ceteris paribus.

The main conceptual implication of this study concerns quantifying cultural distance. In economic research, language has often been used as a best-guess proxy for culture. The present paper shows that in the context of the European Union, linguistic distance is a misleading proxy for cultural distance.

Migration selectivity patterns seem to go beyond institutional factors, and open borders do not necessarily imply that EU citizens enjoy full freedom of movement. Other obstacles to migration – from physically moving to a new country to learning a new language – seem to prevent Europeans from moving freely across the EU. Even though the recent COVID-19 pandemic has accelerated the ongoing digital transformation of the European economy by promoting teleworking and the use of digital technology, making the physical distance less important, the language barrier will likely remain a challenge for the European labour market. Policies aimed at promoting the instruction of foreign languages could encourage international labour mobility. The advantages of foreign language proficiency are manifold. Adequate proficiency in the host country language may affect immigrants’ marginal productivity, facilitate social integration and increase the potential to accumulate human capital. Furthermore, language proficiency can expand the choice of destination countries.

Learning a foreign language is obviously easier if it is closer to the mother tongue than if it is more dissimilar. However, other factors play an important role as well. There are considerable differences in compulsory learning of foreign languages across European education systems, such as the age at which children begin learning a foreign language, the choice of languages taught as well as the number of foreign languages learned (an extreme example being Ireland, where, although most pupils include a foreign language in their choices, no foreign language is compulsory in the curriculum) (Bruen, 2023; Council of Europe, 2005). In this regard, the 2019 Council of the European Union recommendation for a comprehensive approach to the teaching and learning of languages addresses some important issues such as continuity in language education and supporting teachers in their training. However, recommendations are not binding on member states. Ginsburgh and Moreno-Ternero (2018) go as far as proposing that the European Union could subsidise each member state to stimulate the learning of one common language in order to facilitate communication among Europeans. Perhaps this is wishful thinking, but the use of a uniform second language could lower the cost of communication and expand economic opportunities.