1 Introduction

The pattern of economic development in industrialized countries is characterized by steadily rising per capita GDP, a strong increase in life expectancy at birth, and an increasing share of the population living in cities. This is illustrated for the United States in Fig. 1, which displays the evolution of per capita GDP (solid line, left axis), life expectancy at birth (dotted line, right axis), and the urbanization rate (dashed line, right axis) over the past 50 years. We observe that all of these variables have been trending upwards since 1960.

Fig. 1 Per capita GDP (solid line, left axis), life expectancy at birth (dotted line, right axis), and urbanization rate (dashed line, right axis) in the United States from 1960 to 2013

The evolution of these variables is usually analyzed separately: the New Economic Geography literature focuses on agglomeration and regional migration patterns, innovation-based growth theory analyzes the determinants of long-run economic development from the perspective of a single aggregate economy, and the demographic-economic life-cycle literature focuses on the effects of demographic changes on savings decisions. However, as the Commission on Growth and Development (2008) forcefully argues, these processes are intertwined.

The aim of our paper is to provide a unified framework for the joint analysis of demographic change, migration, and economic growth to explain the trends depicted in Fig. 1 and the continuously ongoing expansion of large agglomerations in knowledge-based economies. We show that increasing life expectancy, migration, and innovation-driven economic growth all constitute strong agglomeration forces that outweigh the dispersion forces of congestion, migration costs, and the anti-agglomerative economic force of the turnover of generations.

As far as the related literature is concerned, there are three largely disconnected strands that we aim to integrate. The New Economic Geography literature analyzes processes that lead to the endogenous agglomeration of productive factors between initially symmetric regions.Footnote 1 A central determinant of the relative strength of agglomeration versus dispersion forces is represented by trade costs. For high trade costs, firms have an incentive to locate close to the customers such that economic activity is dispersed. There are many regional suppliers to minimize trade costs but no large agglomerations. For low trade costs, by contrast, positive agglomeration externalities outweigh the costs of transporting goods between regions and a geographic concentration of economic activity starts to emerge. Declining trade costs thus foster agglomeration over the course of economic development. While elements of economic growth models with learning-by-doing spillovers à la Romer (1986) as an additional agglomerative force have been introduced to the New Economic Geography (see, for example, Martin and Ottaviano 1999; Baldwin and Forslid 2000), the effects of innovation-driven growth on the spatial concentration of economic activities have not yet been thoroughly analyzed. Furthermore, demographic aspects have been largely neglected within this strand of the literature. An exception is the work of Grafeneder-Weissteiner and Prettner (2013) who integrate an overlapping generations structure into a neoclassical New Economic Geography model of the Baldwin (1999) type. They show that demographic changes are a central determinant of agglomeration processes and that, in line with the empirical evidence, aging acts as an agglomerative force, whereas population growth represents a dispersion force. However, Grafeneder-Weissteiner and Prettner (2013) do not consider an innovation sector and therefore economic growth is not endogenous in their setting.Footnote 2

Innovation-driven growth models explain the long-run evolution of technological progress and economic growth within one aggregate economy by introducing a research sector that sells newly developed blueprints. These blueprints are a necessary input for the firms in the intermediate goods sector.Footnote 3 Each firm in the intermediate goods sector produces a differentiated variety (which can be interpreted as a differentiated machine) that is a necessary input in final goods production. The operating profits of the intermediate goods sector are siphoned by the firms in the research sector and used to compensate the scientists who develop new blueprints. The higher the operating profits in the intermediate goods sector, the stronger is the incentive to invest in innovation and the faster is technological progress and therefore economic growth. Furthermore, the larger the population is, the more scientists are available to design new blueprints, which, in turn, raises the pace of technological progress and economic growth. All these effects are present along a balanced growth path of endogenous growth models of the Romer (1990) type and during the transition phase toward the long-run balanced growth path in semi-endogenous growth models of the Jones (1995) type. Migration decisions and the inter-regional consequences of innovation-driven growth are typically not analyzed within this literature, although Rivera-Batiz and Romer (1991) show that growth can be enhanced by the integration of two separate (labor) markets.

Finally, the life-cycle savings literature analyzes the effects of changes in longevity and in the age structure of a population on capital accumulation and medium-run economic growth.Footnote 4 These frameworks show that increasing life expectancy leads to higher savings, which in turn raises physical capital accumulation and therefore speeds up growth in the medium run. However, these models are typically not concerned with innovation-driven economic growth. Exceptions are the contributions of Prettner (2013) and Kuhn and Prettner (2016) who investigate the impact of increasing longevity on the incentives to invest in innovation and find that the increase in physical capital due to a longer life expectancy reduces the equilibrium market interest rate and therefore the rate at which the proceeds of new innovations are discounted. This in turn raises the compensation for research firms and the wages of the scientists they employ. Consequently, increasing life expectancy speeds up technological progress and long-run economic growth. Overall, this literature is silent on aspects related to migration and agglomeration.

This overview makes clear that (1) innovation drives economic growth, (2) population aging affects innovation, (3) the number of workers and, thus, migration, affect innovation, and (4) agglomeration externalities contribute to capital accumulation and might be a catalyst for migration. Our aim is therefore to bring the separated strands of the literature together to explain the interrelations among these developments and thereby the emergence of large agglomerations and their continuously ongoing expansion. We are not aware of any framework that combines these strands of literature to analyze the joint effects of demographic change, migration, and innovation on long-run economic growth and agglomeration.

The paper is organized as follows: Sect. 2 contains the description of the model and its basic assumptions, Sect. 3 is devoted to the derivation of the equilibrium dynamics, Sect. 4 describes the symmetric equilibrium and its stability properties, and in Sect. 5 we draw our conclusions.

2 The Model

Consider a country that consists of two initially symmetric regions. To be consistent with the literature, we refer to them as Home (H) and Foreign (F). To avoid notational clutter, we only use the superscript F explicitly to refer to the foreign region. If there is no superscript, the corresponding variable belongs to the home region. Both regions have three production sectors (final goods production, intermediate goods production, and innovation) and access to two production factors, capital (K) and labor (L). Labor in the form of workers and differentiated machines are used to assemble consumption goods in the final goods sector; capital and blueprints are used in the intermediate goods sector to produce differentiated machines; and labor in the form of scientists is required to produce the blueprints in the innovation sector. Labor is mobile between regions but subject to quadratic migration costs. There is perfect competition in the final goods sector and in the innovation sector, whereas the machine-producing intermediate goods sector is monopolistically competitive in the sense of Dixit and Stiglitz (1977). For simplicity, we abstract from trade in consumption goods and in machines.

Following Blanchard (1985) and Heijdra (2017, chapter 15), individuals face lifetime uncertainty, which we parameterize by the risk of death, \(\mu\). Consequently, the economy consists of different cohorts that can be distinguished by their date of birth, \(t_0\). We denote the size of each cohort at a certain point in time \(\tau >t_0\) by \(N(t_0,\tau )\). The law of large numbers implies that the mortality rate \(\mu\) equals the fraction of individuals who are dying at each instant. In line with Blanchard (1985), we assume that the birth rate also equals \(\mu\) such that the population of one region, \(N(\tau )\equiv \int _{-\infty }^{\tau }N(t_0,\tau )dt_0\), is constant.Footnote 5 As in Yaari (1965), a life-insurance company sells fair actuarial notes which are bought by each individual and canceled upon the individual’s death. For simplicity, we abstract from imperfect annuity markets and accidental bequests (see, for example, Heijdra and Mierau 2012; Heijdra et al. 2014; Kuhn and Prettner 2018, for an analysis of these aspects in models of economic growth based on capital accumulation).Footnote 6 Finally, each individual, irrespective of her age, is endowed with \({\bar{L}}/{\bar{N}}\) units of labor, where we denote country-wide variables with an overbar. This means that \({\bar{L}}=L + L^F\) refers to the country-wide supply of labor and \({\bar{N}} = N+N^F\) refers to the country-wide population size. The share of individuals in the home region is denoted by \(s_N \equiv N/{\bar{N}}=1/2\). The decision of each individual on her inter-regional distribution of labor supply is based on the wage differences between H and F and the labor supply decision in turn determines the share of labor employed at home, \(s_L \equiv L/{\bar{L}}\). From now on, we normalize the country-wide labor supply \({\bar{L}}\) to 1 without loss of generality.
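As a brief illustration of these demographic assumptions (a sketch that is not part of the original exposition): with a constant regional population N and a birth rate equal to the mortality rate \(\mu\), a cohort born at \(t_0\) starts with \(\mu N\) members and shrinks exponentially at rate \(\mu\), so that cohort sizes integrate to N at every instant:

$$\begin{aligned} N(t_0,\tau ) = \mu N e^{-\mu (\tau -t_0)}, \qquad \int _{-\infty }^{\tau } \mu N e^{-\mu (\tau -t_0)} dt_0 = N. \end{aligned}$$

Under the constant hazard \(\mu\), expected remaining lifetime is \(1/\mu\) at every age, which is why a decrease in \(\mu\) is read as an increase in longevity.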

2.1 Consumption Side

Within both regions, the representative individual of cohort \(t_0\) maximizes her discounted stream of lifetime utility that is determined by the flow of consumption at each instant. For analytical tractability, we assume a logarithmic instantaneous utility function \(u=\log c(t_0,\tau )\), where \(c(t_0,\tau )\) denotes the consumption level of final goods of the individuals aged \(\tau -t_0\). Denoting the time preference rate by \(\rho >0\) implies that lifetime utility is given by

$$\begin{aligned} U(t_0,t_0) = \int _{t_0}^\infty e^{-(\rho +\mu )(\tau - t_0)} \log c(t_0,\tau ) d\tau . \end{aligned}$$
(1)

Note that the mortality rate augments the time preference rate because individuals who face the risk of death discount the future by more than the rate of pure time discounting. The wealth constraint of each individual has the form of a standard flow budget constraint and reads

$$\begin{aligned} {\dot{k}}(t_0,\tau ) = [r(\tau )-\delta ]k(t_0,\tau ) + w(\tau ) + d(\tau ) - c(t_0,\tau ), \end{aligned}$$
(2)

where \(k(t_0,\tau )\) refers to capital holdings, \(r(\tau )\) refers to the rate of return on capital, \(\delta\) denotes the rate of depreciation, \(w(\tau )\) refers to wage income, and \(d(\tau )\) are country-wide lump-sum dividend payments from holding shares of intermediate goods producers. The labor supply decision of individuals—and hence their tendency to migrate—is analyzed in Sect. 2.2. Optimization yields a standard individual Euler equation that can be aggregated according to our demographic assumptions. This yields the region-wide “aggregate” Euler equation (see Grafeneder-Weissteiner and Prettner 2013; Heijdra 2017, for details of the calculations):

$$\begin{aligned} \frac{{\dot{C}}(\tau )}{C(\tau )} = [r(\tau )-\rho -\delta ] - \mu \Omega . \end{aligned}$$
(3)

Uppercase letters refer to aggregate quantities and \(\Omega \in [0,1]\) is defined as

$$\begin{aligned} \Omega \equiv \frac{C(\tau ) - C(\tau ,\tau )}{C(\tau )}, \end{aligned}$$

where \(C(\tau ,\tau )\) are the consumption expenditures of newborns. As discussed in detail in Grafeneder-Weissteiner and Prettner (2013) and in Heijdra (2017, chapter 15), the difference between the individual and the aggregate savings behavior is captured by the generational turnover correction term \(\mu \Omega\). Optimal consumption growth is the same for all generations but optimal consumption levels differ. Older individuals are wealthier because of their accumulated capital holdings such that they can afford higher consumption expenditures than younger individuals. Since dying older generations are replaced by newborns with no capital holdings at each instant, aggregate consumption growth is dragged down by the generational turnover and is thus smaller than individual consumption growth.

2.2 Migration Decision

Each individual maximizes earnings by choosing her migration rate based on the wage differences between the regions (see, for example, Baldwin and Forslid 2000). We express wages relative to the country’s technology level \({\bar{A}}=A+A^F\). This normalization does not change the sign of the wage differential, and it is analytically more convenient because the normalized expressions remain constant along a balanced growth path. To conceptualize the migration behavior, the foreign representative individual of cohort \(t_0\) chooses her migration rate \(m^F(\tau ) = {\dot{l}}^F(\tau )\) so as to solve the following maximization problem

$$\begin{aligned} \max _{m^F(\tau )}&\int _{t_0}^\infty e^{-(\rho +\mu^F)(\tau - t_0)} \left\{ l^F(\tau ) {\hat{w}}(\tau ) + \left[ \frac{1}{{\bar{N}}} - l^F(\tau ) \right] {\hat{w}}(\tau )^F - \frac{\gamma m^F(\tau )^2}{2} \right\} \,\,\, d\tau \nonumber \\&s.t. \qquad m^F(\tau ) = {\dot{l}}^F(\tau ), \end{aligned}$$
(4)

where \(l^F(\tau )\) is labor supply of the foreign individual in the home region. Labor supply of the foreign individual in the foreign region is then given by \(1/{\bar{N}} - l^F(\tau )\) because of our normalization \({\bar{L}}=1\). The first two terms in the expression within curly brackets refer to the wage income of the individual for her endogenous choice of labor allocation between the two regions, while the third term refers to the quadratic costs of reallocating labor from one region to the other, i.e., it captures the migration costs. Note that \({\hat{w}}(\tau ) = w (\tau )/{\bar{A}}\) and \({\hat{w}}(\tau )^F = w (\tau )^F/{\bar{A}}\) denote normalized wages at home and abroad with w and \(w^F\) referring to the wage rates in the corresponding region. The parameter \(\gamma\) allows migration costs to be varied.

By the same token, the domestic representative individual of cohort \(t_0\) chooses her migration rate \(m(\tau ) = {\dot{l}}(\tau )\) so as to solve the following maximization problem

$$\begin{aligned} \max _{m(\tau )}&\int _{t_0}^\infty e^{-(\rho +\mu)(\tau - t_0)} \left\{ \left[ \frac{1}{{\bar{N}}} - l(\tau ) \right] {\hat{w}}(\tau ) + l(\tau ) {\hat{w}}(\tau )^F - \frac{\gamma m(\tau )^2}{2} \right\} \,\,\, d\tau \nonumber \\&s.t. \qquad m(\tau ) = {\dot{l}}(\tau ), \end{aligned}$$
(5)

where \(l(\tau )\) is labor supply of the home individual in the foreign region and labor supply of the home individual in the home region is thus given by \(1/{\bar{N}} - l(\tau )\).

We suppress time arguments from now on whenever this does not impair the clarity of the exposition. Solving the two maximization problems (see the section on “Optimal Migration” in the “Appendix”) results in the following system of differential equations that fully describe the economy-wide migration decisions

$$\begin{aligned} {\dot{W}} &= (\rho + \mu ) W - ({\hat{w}}^F-{\hat{w}}), \end{aligned}$$
(6)
$$\begin{aligned} {\dot{l}}& = \frac{W}{\gamma }, \end{aligned}$$
(7)
$$\begin{aligned} {\dot{W}}^F& = (\rho + \mu ^F) W^F - ({\hat{w}}-{\hat{w}}^F), \end{aligned}$$
(8)
$$\begin{aligned} {\dot{l}}^F& = \frac{W^F}{\gamma }. \end{aligned}$$
(9)

In this system, W and \(W^F\) are the costate variables representing the shadow value of migration for the home and foreign individuals. At the steady state, these shadow values equal the present value of the wage differential between regions. Consequently, at the symmetric equilibrium in which both regions exhibit the same fraction of workers and share the same parameter values, we have that \(W=W^F=0\). Since the share of labor in the home region is given by

$$\begin{aligned} s_L\equiv \frac{L}{{\bar{L}}}=L=\left( \frac{1}{{\bar{N}}}-l \right) \frac{{\bar{N}}}{2}+l^F\frac{{\bar{N}}}{2}=\frac{1}{2}+\frac{{\bar{N}}}{2}(l^F-l), \end{aligned}$$
(10)

the aggregate migration equation can be derived from Eqs. (7) and (9) as

$$\begin{aligned} {\dot{s}}_L& = \frac{{\bar{N}}}{2 \gamma } (W^F-W) . \end{aligned}$$
(11)

We observe that migration between the two regions prevails as long as there is a wage differential. A greater wage differential acts as catalyst of migration, whereas higher costs of migration, as represented by a higher \(\gamma\), slow down the inter-regional flow of labor.
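The origin of Eqs. (6) and (7) can be sketched with the current-value Hamiltonian of problem (5). The symbol \(\mathcal{H}\) is notation introduced only for this sketch, and the Appendix may organize the steps differently. With state l, control m, and costate W, the standard optimal-control conditions read

$$\begin{aligned} \mathcal{H}&= \left[ \frac{1}{{\bar{N}}} - l \right] {\hat{w}} + l \, {\hat{w}}^F - \frac{\gamma m^2}{2} + W m, \\ \frac{\partial \mathcal{H}}{\partial m}&= 0 \;\Rightarrow \; W = \gamma m \;\Rightarrow \; {\dot{l}} = \frac{W}{\gamma }, \\ {\dot{W}}&= (\rho +\mu ) W - \frac{\partial \mathcal{H}}{\partial l} = (\rho +\mu ) W - ({\hat{w}}^F - {\hat{w}}). \end{aligned}$$

The foreign individual's problem (4) can be treated analogously, with \(l^F\), \(m^F\), \(W^F\), and \(\mu ^F\) in place of their home counterparts, which gives Eqs. (8) and (9).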

2.3 Production Technology and Profit Maximization

The production side of the economy in the two regions follows the one described in Prettner (2013), which represents a simplified version of the production side of Romer (1990). The final goods sector in the home region (analogous expressions hold in the foreign region) produces a consumption good with workers and machines as inputs according to the production function

$$\begin{aligned} Y = L_Y^{1-\alpha } \int _0^{A} x_{i}^{\alpha } di , \end{aligned}$$
(12)

where Y denotes output of the consumption good (the numéraire), \(L_Y\) refers to labor used in final goods production, A is the level of technology, \(x_i\) is the amount of a specific machine i used in final goods production, and \(\alpha\) is the elasticity of output with respect to the machines of type i. Profit maximization and perfect competition in the final goods sector imply that the production factors are employed up to the point at which they earn their marginal product. The factor rewards are thus given by

$$\begin{aligned} w_Y& = (1-\alpha ) \frac{Y}{L_Y}, \end{aligned}$$
(13)
$$\begin{aligned} p_{i}& = \alpha L_Y^{1-\alpha } x_{i}^{\alpha -1}, \end{aligned}$$
(14)

where \(w_Y\) refers to the wage rate in the final goods sector and \(p_i\) to the price of intermediate inputs.

The intermediate goods sector follows Dixit and Stiglitz (1977) and is monopolistically competitive because each firm in the sector produces a differentiated machine. Producing a machine requires purchasing the corresponding machine-specific blueprint from the innovation sector as a fixed up-front investment before production can start. After this fixed cost has been incurred, firms can transform one unit of physical capital into one unit of the specific machine for which they own the blueprint. Profit maximization then yields the following standard optimal pricing policy for intermediate goods producers

$$\begin{aligned} p_i& = \frac{r}{\alpha } , \end{aligned}$$
(15)

where \(1/\alpha\) is the markup. Note that there is symmetry between firms in the sense that the right-hand side of Eq. (15) is the same for all firms in the sector. Free entry into the intermediate goods sector ensures that the discounted stream of operating profits equals the fixed cost for the up-front investment of purchasing the blueprint from the innovation sector. The results so far imply that the capital stock in the home region is given by \(K=Ak\) because there are A different intermediate goods producers, each of which employs k units of capital. Consequently, the regional production function becomes

$$\begin{aligned} Y=K^{\alpha }\left( AL_{Y}\right) ^{1-\alpha }, \end{aligned}$$
(16)

in which the stock of blueprints appears as labor-augmenting.
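To see how Eq. (16) follows, note that symmetry implies \(x_i = k = K/A\) for every machine, so that the production function in Eq. (12) becomes

$$\begin{aligned} Y = L_Y^{1-\alpha } \int _0^{A} \left( \frac{K}{A}\right) ^{\alpha } di = L_Y^{1-\alpha } A^{1-\alpha } K^{\alpha } = K^{\alpha }\left( AL_{Y}\right) ^{1-\alpha }. \end{aligned}$$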

Following Romer (1990), the innovation sector employs \(L_A\) scientists to discover the new blueprints for the intermediate goods producers that aim to enter the market. Depending on the productivity of scientists (\(\lambda\)), the employment level of scientists (\(L_A\)), and intertemporal knowledge spillovers (represented by the stock of technology A), the stock of blueprints evolves according to

$$\begin{aligned} {\dot{A}} = \lambda A L_A . \end{aligned}$$
(17)

Due to perfect competition in the innovation sector, profit maximization leads to the following relation between the price that innovation firms charge intermediate goods producers for blueprints (\(p_A\)) and the wage rate of scientists (\(w_A\)):

$$\begin{aligned} w_A = p_A \lambda A. \end{aligned}$$
(18)

The wage rate of scientists increases if innovation firms charge a higher price for their blueprints. Since labor is homogenous, this leads to a flow of labor from the final goods sector into the innovation sector until wages are again equalized. The corresponding rise of \(L_A\) implies that innovation speeds up [see Eq. (17)]. This, in turn, leads to a higher rate of firm entry in the corresponding region such that regional economic growth gains momentum.Footnote 7

2.4 Market Clearing and Balanced Growth

Since workers in the final goods sector and in the innovation sector are homogenous, the labor market equilibrium is characterized by inter-sectoral wage equalization, i.e., \(w_A=w_Y=w\). Furthermore, the equilibrium price of blueprints is equal to the discounted stream of operating profits in the intermediate goods sector because otherwise intermediate goods producing firms would either make losses (if \(p_A\) were higher) or extra profits (if \(p_A\) were lower). In equilibrium, we therefore have that

$$\begin{aligned} p_A& = \frac{\pi }{r - \delta }. \end{aligned}$$
(19)

By combining Eqs. (14), (15), the production technology in the intermediate goods sector (\(x_i=k_i\) for all firms i), the fact that the region-wide capital stock is given by \(K=Ak\), and that the profits of intermediate goods producers amount to \(p_i x_i - r k_i\) for all firms i, we get equilibrium operating profits in the intermediate goods sector as

$$\begin{aligned} \pi& = (1-\alpha ) \alpha \frac{Y}{A} . \end{aligned}$$
(20)
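The algebra behind Eq. (20) can be checked numerically. The following minimal sketch uses arbitrary illustrative values for \(\alpha\), \(L_Y\), A, and K (these numbers are assumptions, not taken from the paper) and confirms that the operating profit \(p_i x_i - r k_i\) coincides with \((1-\alpha )\alpha Y/A\). As a by-product it also shows that \(rK=\alpha ^2 Y\).

```python
# Illustrative values (assumptions for this sketch, not taken from the paper).
alpha, labor_y, tech, capital = 0.3, 0.4, 2.0, 3.0

x = capital / tech                                       # machines per firm: x_i = k_i = K / A
output = labor_y ** (1 - alpha) * tech * x ** alpha      # Eq. (12) under symmetry
price = alpha * labor_y ** (1 - alpha) * x ** (alpha - 1)  # Eq. (14)
rate = alpha * price                                     # Eq. (15): p_i = r / alpha
profit = price * x - rate * x                            # operating profit p_i x_i - r k_i

# Both sides of Eq. (20), followed by both sides of r K = alpha^2 Y.
print(profit, (1 - alpha) * alpha * output / tech)
print(rate * capital, alpha ** 2 * output)
```

The two numbers printed on each line should agree up to floating-point rounding for any admissible choice of the four inputs.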

Finally, labor market clearing requires that employment in the innovation sector and employment in the final goods sector add up to the regional supply of labor such that \(L = L_A + L_Y\). Combining these results yields endogenous labor supply in the innovation sector as

$$\begin{aligned} L_A& = \max \left\{ L - \frac{r-\delta }{\alpha \lambda },0 \right\} . \end{aligned}$$
(21)

If the interest rate were very high and the productivity of scientists were very low, the endogenous amount of labor employed in the innovation sector could become negative from a mathematical point of view. However, since this is not a meaningful economic result, the corresponding region would then just end up in the corner solution of \(L_A=0\) as reflected by the formulation of the right-hand-side in Eq. (21).

If the region’s labor market is in the interior equilibrium, an increase in the market interest rate reduces the employment level of scientists because the future operating profits in the intermediate goods sector are discounted more heavily. This higher discounting reduces the price that research firms can charge for blueprints, which reduces the wages in the innovation sector and leads to an outflow of labor toward the final goods sector to restore the labor market equilibrium. This, in turn, reduces technological progress, intermediate firm entry, and economic growth in the corresponding region.

We observe that the size of the region also matters for growth, i.e., a scale effect exists, because L appears on the right-hand-side of Eq. (21). A greater supply of labor in a region implies that more scientists are available to work on new ideas. This raises technological progress and hence economic growth and represents an agglomerative force between regions in knowledge-based economies.
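The interest-rate effect, the scale effect, and the corner solution in Eq. (21) can be illustrated with a few lines of code. This is a minimal sketch: the function names are hypothetical, and the labor endowments and interest rates in the loop are illustrative assumptions, whereas \(\alpha\), \(\lambda\), and \(\delta\) follow the parameter values used for the stability analysis in Sect. 4.

```python
def research_employment(labor, r, delta, alpha, lam):
    """Scientists employed in a region according to Eq. (21), corner solution included."""
    return max(labor - (r - delta) / (alpha * lam), 0.0)


def blueprint_growth(labor, r, delta, alpha, lam):
    """Growth rate of the blueprint stock implied by Eq. (17): A_dot / A = lam * L_A."""
    return lam * research_employment(labor, r, delta, alpha, lam)


# (regional labor supply, interest rate) pairs chosen for illustration only.
for labor, r in [(0.5, 0.09), (0.6, 0.09), (0.5, 0.10), (0.5, 0.20)]:
    rate = blueprint_growth(labor, r, delta=0.05, alpha=0.3, lam=0.35)
    print(f"L={labor:.1f}  r={r:.2f}  growth of A={rate:.4f}")
```

A larger regional labor supply raises the growth rate of blueprints, a higher interest rate lowers it, and a sufficiently high interest rate drives research employment to the corner \(L_A=0\).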

Along a balanced growth path, all aggregate variables grow at the constant rate \(g\equiv {\dot{C}}/C={\dot{A}}/A={\dot{Y}}/Y\). The aggregate Euler equation (3) pins down the interest rate along such a balanced growth path as

$$\begin{aligned} r& = g + \mu \Omega + \rho +\delta . \end{aligned}$$
(22)

Plugging the interior solution of (21) into (17) and substituting for the interest rate from Eq. (22) provides an equation in the growth rate g and the generational turnover term \(\Omega\). Noting that \(\Omega =(\rho +\mu ) K / C\) and using the economy’s resource constraint

$$\begin{aligned} {\dot{K}} = Y - C - \delta K = \frac{rK}{\alpha ^2} - C - \delta K, \end{aligned}$$
(23)

where \(Y=rK/\alpha ^2\) follows from Eqs. (14), (15), and (16), allows us to solve for the constant growth rates and interest rates in the home and foreign region along a balanced growth path. Recalling that \(s_L = L\) and \(1-s_L = L^F\), these expressions are given by

$$\begin{aligned} g& = \frac{\alpha \lambda s_L - \alpha ^2 \delta -\alpha \rho +\delta - \Phi +\alpha ^2 \lambda s_L}{2 \alpha (1+\alpha )}, \end{aligned}$$
(24)
$$\begin{aligned} g^F& = \frac{\alpha \lambda (1-s_L) - \alpha ^2 \delta -\alpha \rho +\delta - \Xi +\alpha ^2 \lambda (1-s_L)}{2 \alpha (1+\alpha )}, \end{aligned}$$
(25)
$$\begin{aligned} r& = \frac{(1+\alpha )^2 \delta +\Phi +\alpha [\rho +(1+\alpha ) \lambda s_L]}{2 (1+\alpha )}, \end{aligned}$$
(26)
$$\begin{aligned} r^F& = \frac{(1+\alpha )^2 \delta +\Xi +\alpha [\rho +(1+\alpha ) \lambda (1-s_L)]}{2 (1+\alpha )}, \end{aligned}$$
(27)

where

$$\begin{aligned} \Phi& = \sqrt{4 \alpha ^3 \mu (\mu +\rho )+[(\alpha -1) (\alpha \delta +\delta +\alpha \lambda s_L)-\alpha \rho ]^2}, \end{aligned}$$
(28)
$$\begin{aligned} \Xi& = \sqrt{4 \alpha ^3 \mu ^F (\mu ^F +\rho )+\{(\alpha -1) [\alpha \delta +\delta +\alpha \lambda (1-s_L)]-\alpha \rho \}^2}. \end{aligned}$$
(29)

As shown by Prettner (2013) for a single closed economy, an increase in longevity (a decrease in \(\mu\)) affects the long-run growth rate positively because an expanded planning horizon raises savings. This in turn leads to a higher capital stock and a lower equilibrium interest rate. The lower interest rate implies that operating profits in the intermediate goods sector are discounted less heavily such that the price that research firms can charge for blueprints increases. As mentioned above, this raises the wage rate for scientists such that labor flows into the innovation sector to ensure wage equalization across sectors. This re-allocation of labor spurs technological progress and economic growth. In addition, a higher research productivity (\(\lambda\)), which increases the employment of scientists, has a positive impact on economic growth. By contrast, higher impatience (\(\rho\)), which reduces savings and hence raises the equilibrium interest rate, has a negative effect on economic growth.
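These comparative statics can be reproduced directly from Eqs. (24), (26), and (28). The sketch below assumes the interior solution \(L_A>0\) and uses the parameter values from Sect. 4 as defaults. The function name `bgp_rates` and the grid of mortality rates are illustrative choices.

```python
import math


def bgp_rates(s_l, mu, alpha=0.3, lam=0.35, delta=0.05, rho=0.03):
    """Return (g, r) from Eqs. (24), (26), and (28); valid for the interior solution only."""
    inner = (alpha - 1) * (alpha * delta + delta + alpha * lam * s_l) - alpha * rho
    phi = math.sqrt(4 * alpha ** 3 * mu * (mu + rho) + inner ** 2)
    g = (alpha * lam * s_l - alpha ** 2 * delta - alpha * rho + delta - phi
         + alpha ** 2 * lam * s_l) / (2 * alpha * (1 + alpha))
    r = ((1 + alpha) ** 2 * delta + phi
         + alpha * (rho + (1 + alpha) * lam * s_l)) / (2 * (1 + alpha))
    return g, r


# Symmetric labor share s_L = 0.5 and a few illustrative mortality rates.
for mu in (0.0, 0.01, 0.02, 0.05):
    g, r = bgp_rates(0.5, mu)
    print(f"mu={mu:.2f}  g={g:.4f}  r={r:.4f}")
```

For \(\mu =0\) the generational turnover term vanishes and Eq. (24) collapses to \(g=(\alpha \lambda s_L-\rho )/(1+\alpha )\), which is roughly 0.017 for these parameter values. Raising \(\mu\) increases r and lowers g, in line with the longevity effect described above.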

3 Equilibrium Dynamics

The equilibrium dynamics of the two regions are captured by a four-dimensional system in the variables \(s_L\), W, \(W^F\), and \(s_A\), where the home share of technology is given by \(s_A\equiv A/(A+A^F)\). In the section “Home Technology Share” of the “Appendix” we derive the law of motion for \(s_A\) as

$$\begin{aligned} {\dot{s}}_A = s_A(1-s_A) \left( g - g^F\right) , \end{aligned}$$
(30)

where g and \(g^F\) are determined according to Eqs. (24) and (25). Equation (30), the law of motion for the share of labor in the home region (\(s_L\)) given by Eq. (11), and the laws of motion for the shadow values of migration (W and \(W^F\))—as represented by Eqs. (6) and (8)—constitute the following dynamic system

$$\begin{aligned} {\dot{s}}_L&= \frac{{\bar{N}}}{2 \gamma } (W^F-W), \\ {\dot{W}}&= (\rho + \mu ) W - ({\hat{w}}^F-{\hat{w}}), \\ {\dot{W}}^F&= (\rho + \mu ^F) W^F - ({\hat{w}}-{\hat{w}}^F), \\ {\dot{s}}_A&= s_A(1-s_A) \left( g - g^F\right) . \end{aligned}$$
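The law of motion for \(s_A\) in Eq. (30) can be verified in one line by differentiating \(s_A = A/(A+A^F)\) with respect to time and using \({\dot{A}}/A=g\) and \({\dot{A}}^F/A^F=g^F\) (a sketch only; the Appendix contains the full derivation):

$$\begin{aligned} \frac{{\dot{s}}_A}{s_A} = \frac{{\dot{A}}}{A} - \frac{{\dot{A}}+{\dot{A}}^F}{A+A^F} = g - \left[ s_A g + (1-s_A) g^F \right] = (1-s_A)\left( g-g^F\right) . \end{aligned}$$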

In the section “Wages” of the “Appendix”, we derive the normalized wages (\({\hat{w}}\) and \({\hat{w}}^F\)) and show that they are given by

$$\begin{aligned} {\hat{w}}& = s_A \frac{Y(0)}{A(0)} \frac{(1-\alpha )\alpha \lambda }{r-\delta }, \end{aligned}$$
(31)
$$\begin{aligned} {\hat{w}}^F& = (1-s_A) \frac{Y^F(0)}{A^F(0)} \frac{(1-\alpha )\alpha \lambda }{r^F-\delta }. \end{aligned}$$
(32)

Note that r and \(r^F\) are determined according to Eqs. (26) and (27), while Y(0) and \(Y^F(0)\) denote the initial levels of output and A(0) and \(A^F(0)\) are the initial levels of technology.

4 Migration, Demography, and Agglomeration

We analyze the dynamics of agglomeration between two initially symmetric regions. This implies that there are no regional differences with respect to the parameters and starting values such that economic activity is initially spread out and there is full dispersion. The dynamic system is then given by

$$\begin{aligned} {\dot{s}}_L& = \frac{{\bar{N}} (W^F-W)}{2 \gamma }, \end{aligned}$$
(33)
$$\begin{aligned} {\dot{W}}& = (\rho + \mu ) W - ({\hat{w}}^F-{\hat{w}}), \end{aligned}$$
(34)
$$\begin{aligned} {\dot{W}}^F& = (\rho + \mu ) W^F - ({\hat{w}}-{\hat{w}}^F), \end{aligned}$$
(35)
$$\begin{aligned} {\dot{s}}_A& = (1-s_A) s_A (g-g^F), \end{aligned}$$
(36)

where equilibrium wages and growth rates are given by Eqs. (24), (25), (31), and (32) with \(\mu ^F=\mu\), \(Y^F(0)=Y(0)\), and \(A^F(0)=A(0)\).

From Eq. (33), we immediately see that a steady state of the system requires \(W=W^F\). Together with the resulting equations from solving \({\dot{W}}={\dot{W}}^F=0\), this implies that normalized wages must equalize. This is certainly true for the symmetric outcome with an equal division of labor and technology across regions, i.e., for \(s_L=0.5\) and \(s_A = 0.5\). Moreover, in such a symmetric situation, the regional growth rates are the same such that \(g=g^F\) and thus \({\dot{s}}_A=0\). As a result, the symmetric outcome with \(s_L=0.5\), \(W=W^F=0\), and \(s_A = 0.5\) represents a steady-state equilibrium.

Checking the stability properties of this steady state yields important insights on the possibility of agglomeration in such a two-region innovation-based growth framework with migration. If the symmetric steady state is unstable, any slight perturbation leads to agglomeration processes with one region becoming the core and the other the periphery. To get an intuition for these dynamics, consider a situation in which wages are the same in both regions such that nobody has an incentive to migrate to the other region. Then a technological innovation occurs in one region but not in the other such that wages rise slightly in the former, while they stay constant in the latter. Consequently, individuals from the region with the lower wage have an incentive to migrate to the region with the higher wage. If this movement of labor leads to a rise of wages in the labor-sending region and to a decline of wages in the labor-receiving region (to the extent that wages equalize again), the symmetric equilibrium is stable. By contrast, if the movement of labor leads to a rise of wages in the labor-receiving region and to a fall of wages in the labor-sending region, the symmetric equilibrium is unstable and further migration occurs. This is also the case if a rise of wages in the labor-sending region and a decline of wages in the labor-receiving region occurs but to an extent that is insufficient to restore inter-regional wage equalization.

We analyze the stability properties of the steady state by following the classical approach (see Barro and Sala-i-Martin 2004) of linearizing the non-linear dynamic system given by Eqs. (33)–(36) around the symmetric equilibrium and then by evaluating the eigenvalues of the corresponding \(4\times 4\) Jacobian matrix

$$\begin{aligned} J_{sym}=\left( \begin{array}{cccc} J_{11} &{} \quad J_{12} &{}\quad J_{13} &{} \quad J_{14} \\ J_{21} &{} \quad J_{22} &{} \quad J_{23} &{}\quad J_{24} \\ J_{31} &{} \quad J_{32} &{} \quad J_{33} &{} \quad J_{34} \\ J_{41} &{} \quad J_{42} &{} \quad J_{43} &{} \quad J_{44} \end{array}\right) . \end{aligned}$$
(37)

Solving the characteristic equation yields four eigenvalues whose signs and nature fully characterize the system’s local dynamics around the symmetric steady-state equilibrium. Since there are two predetermined variables and two jump variables, saddle path stability prevails if two eigenvalues of the Jacobian matrix are positive and two eigenvalues are negative. The corresponding eigenvalues are shown in Fig. 2 for the parameter values \(\alpha =0.3\), \(\lambda =0.35\), \(\delta =0.05\), \(\rho =0.03\), \(\gamma =1\), \({\bar{N}}=1\), \(Y(0)=1\), \(A(0)=1\), and a mortality rate ranging from 0 to 1. These parameter values are chosen in line with the literature on economic growth (see, for example, Jones 1995; Acemoglu 2009; Grossmann et al. 2013) and such that the growth rate of the two regions at the symmetric equilibrium would amount to 1.7%. The figure reveals that the system is always unstable for the given parameter values such that there would always be a tendency for a clustering of economic activity. This explains the natural tendency for core–periphery structures and therefore cities to emerge in knowledge-based economies.
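The eigenvalue-counting criterion can be sketched numerically as follows. Because the right-hand sides of Eqs. (33)–(36) are not restated here, the sketch uses a hypothetical four-dimensional toy system (`make_toy_system`) whose Jacobian at the steady state is upper triangular, so that its eigenvalues are known by construction. The helper names, the finite-difference step, and the steady-state vector `X_STAR` are assumptions for illustration and are not taken from the model.

```python
import numpy as np

X_STAR = np.array([0.5, 0.5, 1.0, 1.0])  # hypothetical steady state of the toy system


def numerical_jacobian(f, x_star, h=1e-6):
    """Approximate the Jacobian of f at x_star by central differences."""
    x_star = np.asarray(x_star, dtype=float)
    n = x_star.size
    jac = np.empty((n, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = h
        jac[:, j] = (f(x_star + step) - f(x_star - step)) / (2.0 * h)
    return jac


def classify_steady_state(jac, n_jump):
    """Compare the number of eigenvalues with positive real part to n_jump."""
    eigenvalues = np.linalg.eigvals(jac)
    n_unstable = int(np.sum(eigenvalues.real > 0))
    if n_unstable == n_jump:
        verdict = "saddle-path stable"
    elif n_unstable > n_jump:
        verdict = "unstable"
    else:
        verdict = "indeterminate"
    return n_unstable, verdict


def make_toy_system(diagonal):
    """Toy dynamics whose linearization at X_STAR has the given eigenvalues."""
    a = np.triu(np.full((4, 4), 0.05), k=1) + np.diag(diagonal)

    def f(x):
        d = np.asarray(x, dtype=float) - X_STAR
        return a @ d + 0.1 * d**2  # quadratic term vanishes to first order at X_STAR

    return f


if __name__ == "__main__":
    for diagonal in ([-0.5, -0.2, 0.3, 0.4], [-0.5, 0.1, 0.3, 0.4]):
        jac = numerical_jacobian(make_toy_system(diagonal), X_STAR)
        print(diagonal, classify_steady_state(jac, n_jump=2))
```

With two jump variables, the first toy configuration (two positive eigenvalues) is classified as saddle-path stable, while the second (three positive eigenvalues) is classified as unstable.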

Fig. 2

Eigenvalues of the Jacobian matrix defined in Eq. (37) for the parameter values \(\alpha =0.3\), \(\lambda =0.35\), \(\delta =0.05\), \(\rho =0.03\), \(\gamma =1\), \({\bar{N}}=1\), \(Y(0)=1\), \(A(0)=1\), and a varying mortality rate \(\mu \in [0,1]\)

Fig. 3

Eigenvalues of the Jacobian matrix defined in Eq. (37) for the parameter values \(\alpha =0.3\), \(\delta =0.05\), \(\rho =0.03\), \(\mu =0.0125\), \(\gamma =1\), \({\bar{N}}=1\), \(Y(0)=1\), \(A(0)=1\), and a varying productivity of scientists \(\lambda \in [0,0.5]\)

Intuitively, if \(s_L=0.5\) is perturbed slightly, this affects the equilibrium wage differential via the following channels: First, the wage differential increases further if \(s_L\) rises because the labor-receiving region is able to sustain a higher rate of technological progress and therefore faster economic growth. This is a pro-agglomerative force. Second, the wage differential decreases due to the generational turnover effect described in detail by Grafeneder-Weissteiner and Prettner (2013). Regions with a younger population structure accumulate less capital. This implies that their wealth and expenditure levels are lower, while the interest rate is higher. Because the future profits of new innovations are discounted by the market interest rate, this effect reduces technological progress and thus economic growth. This is an anti-agglomerative force. Third, the migration costs as determined by \(\gamma\) are also an anti-agglomerative force. Altogether, the pro-agglomerative force of faster economic growth due to the scale effect is stronger than the anti-agglomerative forces, which offset only part of it. This reasoning suggests that the agglomerative force would be weaker if the productivity of scientists, and hence technological progress and economic growth, were lower. We investigate this in Fig. 3 for a mortality rate \(\mu =0.0125\), giving rise to a life expectancy of 80 years, which is close to the average life expectancy in rich countries. Now we let the productivity of scientists (\(\lambda\)) vary from 0 to 0.5.
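As a quick check of the life expectancy quoted above, and assuming that the mortality rate \(\mu\) is constant across ages, the time of death is exponentially distributed, so that life expectancy is the reciprocal of the mortality rate:

$$\begin{aligned} \int _0^{\infty } t\, \mu \, e^{-\mu t}\, dt = \frac{1}{\mu } = \frac{1}{0.0125} = 80. \end{aligned}$$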

In case of \(\lambda =0\), the number of scientists in a region has no effect on economic growth such that immigration does not induce a rise in the gap between the wages of the labor-receiving region and the labor-sending region. The symmetric outcome is stable and dispersion of economic activity prevails. In this case, which corresponds to poorer countries without innovation-driven growth (and not to a modern knowledge-based economy), economic activity would spread out and no core–periphery structure would emerge. However, as soon as \(\lambda\) becomes positive, we have a positive growth effect of agglomeration that is confronted with two weaker dispersion forces. Again, one eigenvalue of the Jacobian matrix is negative in this case and the other three are positive. Thus, the symmetric equilibrium becomes unstable and agglomeration sets in as soon as innovation-driven growth gains momentum. Overall, and consistent with empirical observations, our model implies that modern knowledge-based economies are characterized by high urbanization rates and a considerable wage gap between cities and the countryside.

5 Conclusions

We explain the joint evolution of rising per capita GDP, increasing life expectancy, and the ongoing process of urbanization in industrialized countries within a two-region innovation-driven economic growth model with inter-regional labor migration. In a first optimization step, individuals choose their optimal consumption growth path and the inter-sectoral labor allocation between final goods production and innovation. In a second optimization step, they compare the wage levels that they can attain in both regions and base their migration decision upon the wage differential. If migrating to the other region would reduce their wages, they stay put and the current pattern of dispersion remains. Otherwise, if wages rise further after migration, ever more individuals migrate, which changes the pattern of agglomeration and dispersion in favor of high urbanization rates. Overall, we find that the symmetric allocation between the two regions is an equilibrium. However, this equilibrium becomes unstable as soon as the productivity of innovation becomes positive, which would be the case in a modern knowledge-based economy. Thus, our model helps to explain the natural tendency for core–periphery structures to emerge in rich countries.

To analyze the interrelations between migration, demographic change, endogenous growth, and urbanization in a coherent and analytically tractable way, we had to abstract from many aspects that might be relevant in a more realistic setting such as (1) trade in consumption goods and machines, (2) age-specific mortality, and (3) heterogeneities with respect to education between workers and scientists. However, we do not find a compelling reason why the relaxation of these assumptions should invalidate our central results and leave these aspects for future research.