In what follows, indicators and the corresponding data sources are assigned to the model’s five dimensions. In each of the econometric specifications the dependent variable is \(s_{i,t}\), measured either by the level or by the yearly growth rate of total schooling years accumulated by the population aged 25–64. The observation area consists of the 51 NUTS2 regions as introduced in Sect. 2. The full list of regions and the respective capital-city centres can be found in Appendix B.
The five dimensions, indicators, and variables
In Table 1, each variable as applied in the following estimations is assigned to the indicators and dimensions discussed in the previous section. In addition, Table 1 provides information on variable definitions and data sources. The respective variables are as follows:
Table 1 Explanatory variables applied in the econometric specifications
First, the “income” dimension refers to current wealth and income prospects, which are captured by current GRP per inhabitant (measured in euros) and its real growth rate (measured in per cent), respectively. Although purchasing power parities (PPP) may be preferred to account for actual living standards in the EU, such data are available for countries only. This results in identical relative GRP levels across regions within a country, making them inadequate for the study’s purposes. A third variable, “share7”, captures the total observation area’s share of the EU’s absolute GDP to account for income differences at the country level and hence for incentives to migrate to the core regions. Thus, while the level and growth of GRP per inhabitant control for differences within the observation area and therefore distinguish the periphery from the semi-periphery, share7 controls for differences between the periphery and semi-periphery on the one hand and the core on the other.
Second, the “jobs” dimension takes the presumed higher mobility of young people into account, by considering the youth unemployment rate in addition to the general unemployment rate in each region. The expected effect is ambiguous due to the various impacts it may have, as discussed above.
The third dimension, “future”, is represented by gross fixed capital formation and the age structure (Footnote 11). The share of the population that has most likely completed its education and is of working age, i.e. 25–64, is included to account for the hypothesis that this group is most likely to benefit from migration (Sjaastad 1962). Its impact may be positive, making a region more attractive, perhaps via network effects, but it may also increase migration, as people in this age group are more likely to migrate.
Fourth, “interregional relationships” are accounted for by two variables. The first, modified from Comin et al. (2013), captures a region’s economic distance to its respective country’s capital city region divided by the geographic distance between them:
$$\delta _{i,t}=\frac{\ln q_{i,t}-\ln q_{c\left(i\right),t}}{\ln d_{i,c\left(i\right)}}$$
(3)
where \(q\) refers to GRP per inhabitant, \(c(i)\) refers to the capital city region of \(i\)’s country and \(d\) symbolises road distance. The variable \(\delta\) is intended to control for the interplay between the attractiveness of moving from the periphery to the semi-periphery and the cost of doing so. A greater distance increases the cost of migration and should therefore have a negative impact on \(\delta\), as it is found in the denominator. A lower GRP in the periphery relative to the capital city region should have a negative impact, too, as it decreases the numerator. Therefore, a larger \(\delta _{i}\) should have a positive impact on \(i\)’s human capital, especially if it increases over time.
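As an illustration of Eq. 3, the following minimal sketch computes \(\delta\) for two hypothetical regions that share the same road distance to their capital city region but differ in GRP per inhabitant. The function name, the distance unit (kilometres) and all numbers are assumptions made for this example only; the treatment of capital city regions themselves, for which the distance is zero, is not specified here and is therefore rejected by the sketch.

```python
import math

def economic_distance(grp_region, grp_capital, road_distance):
    """Eq. 3: (ln q_i - ln q_c(i)) / ln d_{i,c(i)}.

    Assumption of this sketch: road_distance is in kilometres and
    exceeds 1, so that ln(d) is strictly positive.
    """
    if road_distance <= 1:
        raise ValueError("road_distance must exceed 1 so that ln(d) > 0")
    return (math.log(grp_region) - math.log(grp_capital)) / math.log(road_distance)

# Hypothetical values: two regions, both 100 km from a capital city
# region with a GRP per inhabitant of 16,000 euros.
delta_poorer = economic_distance(8000.0, 16000.0, 100.0)
delta_richer = economic_distance(12000.0, 16000.0, 100.0)

# The region with the lower relative GRP has the smaller (more negative) delta.
print(delta_poorer, delta_richer)
```

In this example both values are negative because both regions are poorer than their capital city region, and the region with the larger income gap displays the lower \(\delta\).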
The second variable capturing interregional relationships, \(E\), equals one for years in which the respective region was a member of the EU and zero otherwise. The impact of \(E\) may intuitively be expected to be negative, as migration was eased with EU accession. However, due to the special role of the semi-periphery, the actual effect is ambiguous and may change over time.
Finally, regional net-migration rates capture the impact of the fifth dimension, “migration”. Note that this variable serves as the key variable in the following estimations, as its impact on human capital is ambiguous: since net-migration may take on positive or negative values, with most of the sample’s regions displaying negative values for most periods (see Fig. 4), the interpretation depends on whether human capital stocks or growth rates are considered. If the dependent variable is the current stock of human capital, a positive coefficient simply means that human capital is positively correlated with net-migration, and vice versa. If the dependent variable is human capital growth, the interpretation becomes more complex:
- A negative impact means that regions with negative net-migration rates actually benefit from migration: A decrease in net-migration (i.e. an increase of the absolute value) has a positive impact on human capital, and vice versa for regions with positive net-migration rates.
- Consequently, a positive impact means that regions with negative rates would lose from an increase, while regions with positive rates would benefit.
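The sign logic of the two cases above can be traced with a small sketch. It uses a linear approximation, multiplying a hypothetical coefficient by a change in the net-migration rate, and returns only the direction of the predicted change in human capital growth. Since the logarithm is monotonically increasing, the direction carries over to the log specification. The function name and all numbers are assumptions for illustration and are not estimates.

```python
def growth_response(beta_m, change_in_net_migration):
    """Direction (+1, 0 or -1) of the predicted change in human capital
    growth for a given coefficient and change in the net-migration rate."""
    effect = beta_m * change_in_net_migration
    return (effect > 0) - (effect < 0)

# Hypothetical negative coefficient.
beta_negative = -0.5

# Region with negative net-migration: the rate falls from -2 to -3,
# i.e. out-migration intensifies (change = -1).
out_region = growth_response(beta_negative, -3.0 - (-2.0))

# Region with positive net-migration: the rate rises from 2 to 3 (change = +1).
in_region = growth_response(beta_negative, 3.0 - 2.0)

print(out_region, in_region)  # 1 -1
```

With a negative coefficient, intensified out-migration is associated with higher human capital growth, whereas intensified in-migration is associated with lower growth; a positive coefficient reverses both signs.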
To account for international migrants’ contribution to skills, an additional variable is added, equalling the number of working-age foreign-born inhabitants with “advanced skills” (= tertiary education), as defined by the International Labour Organization, divided by the total number of working-age foreign-born people. The expected impact of this variable is ambiguous. On the one hand, skilled immigrants may crowd out skilled internal migrants and hence reduce their otherwise positive impact on human capital. On the other hand, a higher share of tertiary educated immigrants may induce or reflect agglomeration effects leading to ever-increasing (or, if absent, decreasing) skill levels (McCann 2013). Due to data limitations this variable is available at the national level only, which means that each region within a country displays the same value for each year.
Level estimations
In the first set of estimations the correlations with levels of schooling years are estimated; the observation period covers the years 2000–2009:
$$\begin{aligned} \ln s_{i,t} =&\,\alpha _{i}+\beta _{1}\ln x_{1,i,t-1}+\beta _{2}\ln x_{2,i,t-1}+\beta _{3}x_{3,i,t-1}+\beta _{4}\ln x_{4,i,t-1}\\ &+\beta _{5}\ln x_{5,i,t-1}+\beta _{6}\ln x_{6,i,t-1}+\beta _{7}\ln x_{7,i,t-1}+\beta _{8}\delta _{i,t-1}\\ &+\beta _{9}E _{i,t-1}+\beta _{10}\ln m_{i,t-1}+\beta _{11}\ln \varsigma _{i,t-1}+\varepsilon _{i,t} \end{aligned}$$
(4)
where \(\alpha\) and the \(\beta\)s symbolise the regression coefficients, the dependent variable is defined as in Eq. 2, the explanatory variables are as defined in Table 1 and \(\varepsilon _{i,t}\) captures the regression residuals. The explanatory variables are lagged by one period to reduce endogeneity. Real GRP growth and net-migration rates may take on negative values, which is why each series is shifted by subtracting its lowest value and adding a small constant to guarantee defined logarithms (Footnote 12). A variant of Eq. 4 includes an interaction term to test whether EU membership has an impact on the effect of migration:
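The two data preparation steps mentioned above, shifting a series before taking logarithms and lagging it by one period, can be sketched as follows. This is one possible reading only: the size of the small constant (`epsilon`), the helper names and the sample values are assumptions of this example, and the minimum is taken over the values passed in, whereas in an actual panel it would presumably be taken over the whole sample.

```python
import math

def shifted_log(values, epsilon=1e-3):
    """Return ln(x - min(x) + epsilon) for each x, so that every
    argument of the logarithm is strictly positive."""
    lowest = min(values)
    return [math.log(v - lowest + epsilon) for v in values]

def lag_one(series):
    """Lag one region's yearly series by one period; the first year
    has no predecessor and is marked as None."""
    return [None] + list(series[:-1])

# Hypothetical net-migration rates of one region over three years.
net_migration = [-4.0, -1.5, 2.0]

log_m = shifted_log(net_migration)   # defined although two inputs are negative
log_m_lagged = lag_one(log_m)        # value of year t-1 aligned with year t

print(log_m)
print(log_m_lagged)
```

The shift preserves the ordering of the observations, so a higher net-migration rate still maps to a higher transformed value.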
$$\begin{aligned} \ln s_{i,t} =&\,\alpha _{i}+\beta _{1}\ln x_{1,i,t-1}+\beta _{2}\ln x_{2,i,t-1}+\beta _{3}x_{3,i,t-1}+\beta _{4}\ln x_{4,i,t-1}\\ &+\beta _{5}\ln x_{5,i,t-1}+\beta _{6}\ln x_{6,i,t-1}+\beta _{7}\ln x_{7,i,t-1}+\beta _{8}\delta _{i,t-1}\\ &+\beta _{9}E _{i,t-1}+\beta _{10}\ln m_{i,t-1}+\beta _{11}\left(E _{i,t-1}\ln m_{i,t-1}\right)+\beta _{12}\ln \varsigma _{i,t-1}+\varepsilon_{i,t} \end{aligned}$$
(5)
Note that the inclusion of an interaction variable as in Eq. 5 implies that
$$\frac{\partial \ln s_{i,t}}{\partial \ln m_{i,t-1}}=\beta _{10}+\beta _{11}E _{i,t-1}$$
(6)
which means that the total effect of net-migration is the sum of net-migration’s coefficient and the interaction variable’s coefficient multiplied by the EU dummy, i.e. the interaction variable’s coefficient must be added for the years in which a region was part of the EU. A further variant replaces \(E _{i,t-1}\) by year dummies.
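The total effect in Eq. 6 can be evaluated with a few lines of code. The sketch below uses hypothetical coefficient values chosen purely for illustration; they are not estimation results, and the function name is an assumption of this example.

```python
def migration_effect(beta_10, beta_11, eu_member):
    """Eq. 6: total effect of ln(m) on ln(s), i.e. beta_10 + beta_11 * E,
    where E is 1 in EU-membership years and 0 otherwise."""
    eu_dummy = 1 if eu_member else 0
    return beta_10 + beta_11 * eu_dummy

# Hypothetical coefficients, for illustration only.
beta_10, beta_11 = 0.25, -0.5

before_accession = migration_effect(beta_10, beta_11, eu_member=False)
after_accession = migration_effect(beta_10, beta_11, eu_member=True)

print(before_accession, after_accession)  # 0.25 -0.25
```

With these illustrative numbers the effect changes sign after accession, which shows why the coefficient of net-migration alone does not suffice for interpretation once the interaction term is included.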
Growth rates estimations
The second set of estimations focuses on the growth rates of schooling years. Except for the dependent variable, the specification is identical to Eq. 4:
$$\begin{aligned} \ln s_{i,t}-\ln s_{i,t-1} =&\,\alpha _{i}+\beta _{1}\ln x_{1,i,t-1}+\beta _{2}\ln x_{2,i,t-1}+\beta _{3}x_{3,i,t-1}+\beta _{4}\ln x_{4,i,t-1}\\ &+\beta _{5}\ln x_{5,i,t-1}+\beta _{6}\ln x_{6,i,t-1}+\beta _{7}\ln x_{7,i,t-1}+\beta _{8}\delta _{i,t-1}\\ &+\beta _{9}E _{i,t-1}+\beta _{10}\ln m_{i,t-1}+\beta _{11}\ln \varsigma _{i,t-1}+\varepsilon _{i,t,t-1} \end{aligned}$$
(7)
and may, in analogy to Eqs. 5 and 6, include interaction terms.
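The dependent variable of Eq. 7 is a log difference, which for small changes is close to the yearly percentage growth rate. A minimal sketch of its construction for a single region follows; the function name and the schooling values are hypothetical.

```python
import math

def log_growth(schooling_years):
    """Return ln(s_t) - ln(s_{t-1}) for each pair of consecutive years
    in one region's series of average schooling years."""
    return [
        math.log(current) - math.log(previous)
        for previous, current in zip(schooling_years, schooling_years[1:])
    ]

# Hypothetical schooling years of one region in three consecutive years.
schooling = [10.0, 10.2, 10.5]
growth = log_growth(schooling)

print(growth)
```

The resulting series is one observation shorter than the input, because the first year has no predecessor from which growth could be computed.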