1 Introduction

The academic and public interest in the shape and changing patterns of income distributions has been growing steadily over the past decades. The rising top income share in the USA, for example, has inspired many discussions on everyone’s equal opportunity to prosperity through hard work in the formerly known “land of opportunity.” In a recent paper, Chetty et al. (2014a) emphasize the importance of regional differences in income mobility and describe the USA as being, instead of a land of opportunity, a collection of societies some of which are lands of opportunity with high rates of mobility across generations, and others in which few children escape poverty.Footnote 1

This is the first paper employing high quality register data to study the state of income mobility across regions in Sweden. My data set allows me to analyze national and regional mobility measures very precisely for the Swedish population born between 1968 and 1976. I compute, in addition to the traditional intergenerational elasticity (IGE), national and regional measures of intergenerational income mobility based on income ranks. The basis for these measures, called the rank-rank slope, is obtained by regressing the position (expressed in percentile ranks) of each member of the child generation on the parents’ position in their income distribution. Income ranks are considered more stable over the life cycle compared to income in levels, and no adjustments have to be made in order to accommodate zero income observations (Dahl and DeLeire 2008; Chetty et al. 2014b; Nybom and Stuhler 2016b).

I use two different measures to describe income mobility on the regional level, based on Chetty et al. (2014a). The first measure is called “relative mobility” and is computed by scaling up the estimated rank-rank slope by a factor of 100. Relative mobility shows the strength of the association between child and parent income rank by region. In addition, since all incomes are expressed over 100 percentiles, relative mobility measures the difference in mean income rank between children with parents at the top and children with parents at the bottom of the parent income distribution. In other words, relative mobility tells us about the size of the wedge (in terms of percentile ranks) between average incomes for children from high- and low-income families in each region and is thus also a measure of outcome inequality by region.

The second measure informs us about the average income rank a child who grew up in a certain region attains as an adult, given that her parents are located at a specific point in the parent income distribution. This measure is called “absolute mobility at percentile p”. The average child outcome can be calculated for any given parent income rank p, using the estimated intercept and rank-rank slope. I choose to focus on absolute mobility at parent percentile 25 when comparing Swedish regions, given the general interest in how children with disadvantaged background fare as adults. The absolute outcomes for any other parent percentiles can, however, easily be constructed from the reported measures.

The geographical unit that I focus on in the regional analysis is “local labor market,” which is an aggregation of municipalities defined by commuting patterns. The local labor market unit is similar to the commuting zone used by Chetty et al. (2014a). However, in comparison to the commuting zones in the USA, there is much more variation between different Swedish local labor markets in terms of population size (and thereby the number of observations). As I show below, this aspect of the data results in imprecise estimates. To remedy this problem, I propose a joint estimation technique using maximum likelihood, referred to as a multilevel (or hierarchical) model. In contrast to the approach taken in Chetty et al. (2014a), where they essentially run a set of distinct regressions, the multilevel model allows me to make a comparison between the different regional mobility measures in a statistically rigorous way. For example, I can test if the mobility estimate of one particular region is statistically significantly different from the national average. For completeness, I also report and discuss results based on separate OLS regressions by region.

Even though this paper follows Chetty et al. (2014a) closely in the mobility measures used, my study has several important advantages. For instance, Chetty et al. (2014a) assign childhood location based solely upon where individuals lived at the age of sixteen, whereas I define childhood location as the region where a child lived for at least six years between the ages of six and fifteen. Thus, I make sure that children are assigned to a region in which they in fact spent a major part of their childhood.

Furthermore, when approximating average parent lifetime income, Chetty et al. (2014a) use a 5-year average of both parents’ pretax income between 1996 and 2000. One caveat with this (which is not discussed in their paper) is that parents have their children at different ages and therefore also have different ages in the 1996 to 2000 interval. According to their data description, parents in the core sample are actually between age 29 and 60 when their income is measured. This means that incomes are measured (and subsequently ranked) at very different points during the parents’ life cycles. This can potentially lead to very large measurement error. In this paper, in comparison to Chetty et al., I measure parents’ income over 17 years instead of 5. Moreover, I also measure income during the same age span for all parents. This eliminates the life cycle problem. In order to account for changing economic conditions beyond inflation, I rank parents along two dimensions: by average income and both parents’ birth cohort.

My results can be summarized as follows. I find that relative mobility (the scaled rank-rank slope) is relatively homogeneous across Sweden. The outcome inequality in mean rank is 18.2 percentile ranks in most local labor markets. Only 10 areas out of 112 show significantly lower or higher relative mobility, i.e., a larger or smaller difference between children from families with highest and lowest incomes, respectively. Stockholm ranks in the bottom with the lowest relative mobility, and the Varberg region south of Gothenburg at the Swedish west coast shows the highest relative mobility.

Absolute mobility at percentile rank 25, the expected outcome for children from low income families, varies considerably more across Swedish local labor market areas, between 40.90 percentile ranks in the Årjäng region close to the Norwegian border and 48.61 in the Värnamo region in the center of southern Sweden. This corresponds to a small but highly statistically significant income difference of approximately 20,000 SEK per year (≈2,210 USD).

For Sweden as a whole, the association between parent and child income, measured by the relationship between income ranks, has been approximately constant for the cohorts born between 1968 and 1976. Looking separately at daughters and sons, the trends run in opposite directions, with a decreasing association for sons and an increasing association for daughters, reaching equal levels for the last cohort observed. The IGE shows a different development, with decreasing mobility between 1971 and 1976, and is misleading here: in addition to the parent-child income association, the IGE also reflects the considerable increase in the ratio of the standard deviations of child over parent income that took place from 1971 onward.

The remainder of the paper is organized as follows. Section 2 gives a short background on the IGE and on mobility measures based on income ranks, together with a description of the multilevel model. The data and variables used are described in Section 3. In Section 4, results for intergenerational mobility on the national level and over time are reported. The regional results are the focus of Section 5, including a comparison of the multilevel model to OLS regressions. Section 6 concludes.

2 Measuring intergenerational mobility

The first part of this section comprises a short review of the estimation of the intergenerational income elasticity, with a focus on how to handle attenuation bias and life cycle bias. In the second part, I explain the concepts relative and absolute mobility which are used to compare the Swedish local labor markets. A brief introduction to multilevel modelling and the specific model used in this study are given in the last part of this section.

2.1 The IGE, attenuation bias, and life cycle bias

Income mobility refers broadly to the extent that (some measure of) child income varies with (some measure of) parent income. By far the most commonly employed mobility measure in the literature is the intergenerational elasticity (IGE). This is typically the slope parameter of a regression of log lifetime income of generation t on log lifetime income of generation t − 1. The closer the IGE is to zero, the more mobile the sample under consideration is said to be. Estimates of the IGE in the literature center around 0.4 with higher estimates for the USA, and usually smaller estimates for the European and especially the Nordic countries (see Björklund and Jäntti 1997; Solon 1992, 1999, 2004; or Mazumder 2005). Recent summaries of economic research in intergenerational mobility are provided by Björklund and Jäntti (2009) and Black and Devereux (2011). Extensions include the study of more than two generations such as Lindahl et al. (2015). One should bear in mind that knowing the intergenerational elasticity does not tell us, for example, how many and which of the children improve or worsen their economic status compared to their parents, i.e., the actual moving patterns of income status between generations. Mobility in this sense can be captured, for example, using transition matrices.

The IGE is typically estimated using the following benchmark equation:

$$ {y_{f}^{C}}=\alpha+\beta {y_{f}^{P}}+{\varepsilon_{f}^{C}} $$
(1)

where β is the parameter of interest, the elasticity between parent and child income, \({y_{f}^{C}}\) and \({y_{f}^{P}}\) are the logs of child and parent lifetime earnings in family f, respectively, and \({\varepsilon _{f}^{C}}\) is assumed to be an iid error term representing all other influences on child earnings not correlated with parental income. I will use the terms income and earnings interchangeably in this section due to the range of different income/earning concepts used in this literature.

What complicates the estimation of the IGE is the need for lifetime income data for the two generations. Approximations made in lack of sufficient data lead to at least two well-known measurement problems: attenuation bias and life cycle bias. Attenuation bias occurs due to measurement error of the regressor, most clearly seen when single year income observations are used to estimate the IGE. This was typical in early studies such as Solon (1992).

Assuming a classic errors-in-variables model, measured income \(y_{f}\) then equals the true income \(y_{f}^{\star }\), plus an error:

$$ y_{f}=y_{f}^{\star}+\nu_{f}\;. $$
(2)

The known implication (Hausman 2001) is a downward inconsistent IGE estimate. The bias can be reduced using an average of T income observations to approximate the average of true lifetime income:

$$ {y_{f}^{P}}=\frac{1}{T}\underset{t=1}{\overset{T}{\sum}}\left( y_{f,t}^{P\star}+\nu_{f,t}^{P}\right). $$
(3)

Björklund and Jäntti (1997) showed that in this case, the inconsistency is diminishing in the number of observed years T (assuming the measurement errors/transitory fluctuations are not serially correlated). Mazumder (2005) used simulations to show that using a 5-year average (a number of typical magnitude in the literature) to measure father lifetime income still results in a downward bias of around 30%.
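As a rough illustration of why averaging helps, consider the textbook case in which the transitory errors have a common variance and are uncorrelated over time and with true income. The OLS slope then converges to β multiplied by the reliability ratio \(\sigma_{\star}^{2}/(\sigma_{\star}^{2}+\sigma_{\nu}^{2}/T)\). The sketch below simply evaluates this ratio. The function name and the two variance values are assumptions made for illustration; they are not estimates from this paper or from Mazumder (2005).

```python
def attenuation_factor(var_true: float, var_noise: float, n_years: int) -> float:
    """Reliability ratio var_true / (var_true + var_noise / n_years).

    Assumes classical measurement error: transitory shocks with a common
    variance that are uncorrelated over time and with true income.
    """
    if n_years < 1:
        raise ValueError("n_years must be at least 1")
    return var_true / (var_true + var_noise / n_years)


# Hypothetical variances of permanent and transitory log income.
VAR_TRUE = 0.20
VAR_NOISE = 0.20

factors = {t: attenuation_factor(VAR_TRUE, VAR_NOISE, t) for t in (1, 5, 17)}
for t, lam in factors.items():
    print(f"T = {t:2d}: estimated slope is about {lam:.2f} times the true slope")
```

When transitory shocks are positively serially correlated, averaging removes less noise than this formula suggests, so the sketch should be read as a best case for a given T.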

I address attenuation bias by averaging over a very large number of annual income observations where T is 17 for most parents in the sample (see Section 3.1 for more details). Importantly, income is observed for all individuals during the same age span, in the middle of their working lives.

Life cycle bias arises when single-year income observations of the child systematically deviate from the average of annual lifetime income (left hand-side measurement error). One can think of a parameter in front of \(y_{t}^{\star }\) in Eq. 2 that is time variable. In this case, the inconsistency of the OLS coefficient varies as a function of the age at which annual income is measured.

Since there are fewer years of income data available for the child generation, I handle life cycle bias by averaging over three income years in the early thirties. During these years, Swedish men have been shown to earn approximately as much as the yearly average over a whole lifetime (Bhuller et al. 2011; Nybom and Stuhler 2016b). However, there exist no similar studies focusing on women. In general, women have been excluded from most studies on intergenerational mobility. One potential reason for this could be their lower labor market participation and greater frequency of work absences related to childbearing.

It does not seem too much of a stretch to interpret childbearing in terms of life cycle bias: the income trajectories of women over the life cycle differ systematically depending on having children (the so-called “family gap,” see, for example, Waldfogel 1998 or Budig and England 2001). In particular, motherhood, as well as the timing of motherhood, has been shown to affect wages, both directly and indirectly through motherhood-related choices such as lower labor market participation and working to a larger extent in the public sector (Simonsen and Skipper 2006; Miller 2011).

However, these aspects pose similar problems for the approximation of lifetime income as those caused, for example, by heterogeneity in schooling decisions. Nybom and Stuhler (2016b) have shown that the shape of earnings over the life cycle for men (and thus the relationship between average lifetime income and annual incomes) varies systematically with education levels and other background variables. Thus, life cycle bias is presumably a problem for both genders and there is no strong reason to exclude daughters in particular. In addition, the results of this study will be more comparable to Chetty et al. (2014a) who also studied all children, sons and daughters, as one group.

There are two additional problems associated with the IGE measurement. Chetty et al. (2014a) showed for US data that the relationship between log incomes of children and their parents is not well represented by a linear regression model. This point has also been raised by Couch and Lillard (2004) and Bratsberg et al. (2007). One suggested remedy is to use income ranks instead of the log of incomes. A second problem is that zero-income observations have to be dropped or transformed for the analysis in log incomes. Dropping individuals with zero income will overstate mobility if children with zero incomes are over-represented in low-income families. Recoding all zeros, on the other hand, leads to highly variable results depending on the replacement values chosen. A detailed analysis of this issue for my data can be found in Appendix A: Ranks versus logged incomes. Income ranks are found to be the preferred choice and are thus used exclusively in the regional analysis.

2.2 The relationship between income ranks

Instead of using log incomes, income ranks can be constructed to measure intergenerational income mobility. Importantly, observations with zero income do not need any special treatment here (Dahl and DeLeire 2008). As shown by Nybom and Stuhler (2016a), income ranks for Swedish men are found to be significantly more stable over the life cycle than log incomes, especially when measured above the age of 30. I rank children based on their approximated average lifetime incomes relative to other children in the same birth cohort. Parents are ranked similarly, by income and birth cohort relative to other parents. The ordered income levels are transformed into percentile ranks, i.e., normalized fractional ranks.Footnote 2 The following equation is then estimated by OLS:

$$ {R_{f}^{c}}=\alpha+\beta\,{R_{f}^{p}}+{\varepsilon_{f}^{c}} $$
(4)

where \({R_{f}^{c}}\) and \({R_{f}^{p}}\) are the rank of the child and parents in family f, respectively. The coefficient β (the rank-rank slope) is equal to the correlation coefficient between the ranks since, by construction, the ranks are approximately uniformly distributed. Both the IGE and the rank-rank slope show the persistence of income between parent and child generation. The measures differ conceptually when income inequality is larger in the child generation compared to the parent generation: with growing inequality, moving one rank down will correspond to a larger income loss in absolute terms since the distance between ranks increases.
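The ranking and regression steps described above can be sketched on simulated data as follows. This is a minimal illustration rather than the procedure used for the register data: the helper names, the tie-handling rule (average rank), the cohort labels, and all simulated incomes are assumptions. Note that zero incomes enter the ranking without any special treatment.

```python
import random


def percentile_ranks(values):
    """Fractional ranks scaled to (0, 100]; tied values share their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = 100.0 * avg_rank / n
        i = j + 1
    return ranks


def rank_within_groups(values, groups):
    """Rank each observation relative to others with the same group label."""
    ranks = [0.0] * len(values)
    for g in set(groups):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        for i, r in zip(idx, percentile_ranks([values[i] for i in idx])):
            ranks[i] = r
    return ranks


def ols(x, y):
    """Return (intercept, slope) from a simple regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


# Simulated families (all numbers hypothetical).
random.seed(0)
n_families = 2000
child_cohort = [random.choice([1968, 1969, 1970]) for _ in range(n_families)]
parent_cohort = [random.choice([1940, 1945]) for _ in range(n_families)]
parent_income = [random.lognormvariate(12.0, 0.5) for _ in range(n_families)]
child_income = [
    0.0 if random.random() < 0.03  # a few children with zero income
    else (p ** 0.3) * random.lognormvariate(8.0, 0.6)
    for p in parent_income
]

child_rank = rank_within_groups(child_income, child_cohort)
parent_rank = rank_within_groups(parent_income, parent_cohort)
alpha, beta = ols(parent_rank, child_rank)
print(f"intercept = {alpha:.2f}, rank-rank slope = {beta:.3f}")
```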

When estimating rank-rank relationships on the regional level below, the national ranks assigned to each individual remain the same following Chetty et al. (2014a). If we were to use regional ranks instead, i.e., order individuals within each region, we would have a hard time interpreting the results: what does it mean that sons from low-income families in Stockholm reach on average the 38th percentile rank (within Stockholm), while sons from low-income families in Gothenburg reach on average the 35th percentile rank (within Gothenburg)? Is the income level at the 38th percentile within Stockholm higher or lower than the 35th percentile within Gothenburg? Using national ranks, we create a common scale that makes a regional comparison meaningful.Footnote 3

I analyze two mobility measures on the regional level, relative and absolute mobility. Relative mobility is computed according to the following equation:

$$ \bar{R}_{100,r}^{c}-\bar{R}_{0,r}^{c}=100\times\beta_{r} $$
(5)

where \(\bar {R}_{p,r}^{c}\) is the average child rank at parent percentile p in region r and \(\beta_{r}\) is the rank-rank slope parameter from region r. Relative mobility can be viewed simply as a measure of the slope and thus the number of ranks a child on average rises in the income distribution given an increase in the parent income rank. Since all income ranks are distributed between 0 and 100, the scaled rank-rank slope can also be viewed as a measure of maximum outcome inequality in a region. As seen from the left-hand side of Eq. 5, relative mobility equals the difference in mean child rank between children from the families with the highest and the lowest parent income. A higher value of this measure in one region implies a larger spread in child outcomes, given parent incomes.

Relative mobility of 43 in region A, for example, means that the adult long run incomes of all children from that particular region differ by at most 43 ranks. In terms of the slope, we can also say that, compared to a region B where relative mobility is 38, the association between child and parent income is stronger in region A. It is important to keep in mind that both the IGE and relative mobility are relative measures and therefore do not reveal if higher relative mobility, i.e., a lower rank-rank slope, is driven by better outcomes of some poorer families, or solely by worse outcomes of richer families. Therefore, a measure of absolute mobility is necessary to obtain a more comprehensive picture of income mobility.

Absolute mobility is defined as the mean adult rank of children with parents located at a certain percentile p in the parent distribution. It is a prediction based on both the intercept and the slope estimates for the regions. I choose to compare the regions in terms of absolute mobility at percentile 25 in order to learn about the prospects for children from low income families as well as to facilitate comparisons to the US study. Outcomes at other percentiles can easily be constructed using the relative and absolute mobility results in Table 7. Absolute mobility at p = 25 is calculated according to the following formula:

$$\begin{array}{@{}rcl@{}} \bar{R}_{25,r}^{c} =\alpha_{r}+\beta_{r}\times25\:. \end{array} $$
(6)
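Equations 5 and 6 translate directly into two one-line functions. The sketch below applies them to three made-up regions whose intercepts and slopes are chosen only to mimic the qualitative pattern of a common slope in two regions and a common outcome at p = 25 in two regions; none of the numbers are estimates from this paper.

```python
def relative_mobility(beta: float) -> float:
    """Eq. 5: gap in mean child rank between parent percentiles 100 and 0."""
    return 100.0 * beta


def absolute_mobility(alpha: float, beta: float, p: float = 25.0) -> float:
    """Eq. 6: expected child rank for parents at percentile p."""
    return alpha + beta * p


# Hypothetical (intercept, slope) pairs per region.
regions = {
    "region 1": (35.0, 0.30),
    "region 2": (32.5, 0.40),
    "region 3": (40.0, 0.30),
}

results = {
    name: (relative_mobility(b), absolute_mobility(a, b))
    for name, (a, b) in regions.items()
}
for name, (rel, abs25) in results.items():
    print(f"{name}: relative mobility = {rel:.1f}, absolute mobility at p=25 = {abs25:.1f}")
```

Passing a different `p` to `absolute_mobility` gives the expected outcome at any other parent percentile from the same intercept and slope.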

The left panel in Fig. 1 illustrates relative and absolute mobility. The former is given by the difference in mean child rank (Y-axis) between parents with the highest and lowest income rank (X-axis), alternatively the rank-rank slope multiplied by 100. The latter is measured by the mean child rank given parents at the 25th percentile. The right panel shows three example regions for clarification. Region 1 and region 3 share the same level of relative mobility, i.e., the outcome inequality measured in ranks for children in those regions is the same. However, mobility differs in absolute terms: for every parent percentile, the mean child rank is higher in region 3. Region 1 and region 2 have the same level of absolute mobility at parent percentile 25. However, relative mobility is lower in region 2 which can be seen by the steeper rank-rank slope indicating a larger variance of ranks children obtain in this region. Children with parents in the top of the income distribution reach significantly higher outcomes in region 2 compared to region 1. Note that a steeper rank-rank slope means a larger wedge between children from top and bottom ranked parents and thus a lower level of relative mobility.

Fig. 1
Relative and absolute mobility. The left figure illustrates relative and absolute mobility. Relative mobility is a measure of outcome inequality, namely the difference between the expected outcome of a child with parents in the top of the income distribution and a child with parents at the bottom of the income distribution. Alternatively, relative mobility can be seen as a measure of the rank-rank slope and thus informs about the strength of the association between child and parent income rank. Absolute mobility at p = 25 is the expected income rank of a child with parents located at the 25th percentile. The right figure shows the association between child and parent income rank for three different regions. Regions 1 and 3 exhibit the same relative mobility, while regions 1 and 2 share the same level of absolute mobility at p = 25. Regions 1 and 3 would be indistinguishable from each other when using purely relative measures such as the IGE

It is important to be aware of which aspects the mobility measures above can and cannot capture. The IGE, the slope coefficient of a regression of log incomes, takes into account both the correlation between log incomes and the spread of the child and parent income distribution, since it is equal to

$$ \beta=\frac{Cov\left( {y_{f}^{C}},{y_{f}^{P}}\right)}{Var\left( {y_{f}^{P}}\right)}=\frac{Cov\left( {y_{f}^{C}}, {y_{f}^{P}}\right)}{\sigma_{P}\sigma_{C}}\frac{\sigma_{C}}{\sigma_{P}}=corr\left( {y_{f}^{C}},{y_{f}^{P}}\right) \frac{\sigma_{C}}{\sigma_{P}}, $$
(7)

where \(\sigma_{C}\) (\(\sigma_{P}\)) is the standard deviation of the child (parent) distribution. The rank-rank slope on the other hand is just equal to the correlation coefficient between the income ranks since, after transforming income levels into percentile ranks, incomes in all generations are approximately uniformly distributed between 0 and 100 and the ratio of standard deviations cancels out.

If income inequality grows from one generation to the next, everything else equal (i.e., an increase in \(\sigma_{C}\) only), the IGE becomes larger while the rank-rank slope does not change. A change in the mean of the income distribution (a shift of the complete distribution to the left or right), however, will show up in neither the IGE nor the rank-rank slope since covariances, standard deviations, and ranks are not affected by such a shift, ceteris paribus.
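A small simulation can make the contrast between the two slopes concrete. In the sketch below, child log incomes are stretched around their mean by a factor `K`, which raises the child standard deviation while leaving the ordering of children, and hence the correlation, untouched. All parameter values and names are illustrative assumptions.

```python
import random


def slope(x, y):
    """OLS slope from a simple regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def pct_ranks(values):
    """Percentile ranks for data without ties (continuous simulated incomes)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for position, i in enumerate(order, start=1):
        ranks[i] = 100.0 * position / len(values)
    return ranks


random.seed(1)
log_parent = [random.gauss(12.0, 0.5) for _ in range(5000)]
log_child = [8.0 + 0.3 * yp + random.gauss(0.0, 0.5) for yp in log_parent]

# Stretch child log incomes around their mean: sigma_C rises by the factor K.
K = 1.5
mean_child = sum(log_child) / len(log_child)
log_child_wide = [mean_child + K * (yc - mean_child) for yc in log_child]

ige_before = slope(log_parent, log_child)
ige_after = slope(log_parent, log_child_wide)
rank_before = slope(pct_ranks(log_parent), pct_ranks(log_child))
rank_after = slope(pct_ranks(log_parent), pct_ranks(log_child_wide))

print(f"IGE:             {ige_before:.3f} -> {ige_after:.3f}")
print(f"rank-rank slope: {rank_before:.3f} -> {rank_after:.3f}")
```

The IGE scales with `K`, as Eq. 7 implies, whereas the rank-rank slope is computed from the same ranks before and after the stretch.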

2.3 Regional estimation

The estimation of rank-rank slopes and intercepts by region can be implemented in a variety of ways. The simplest one would be to estimate R different equations as in Eq. 4 for regions r = 1, ... , R by OLS, resulting in R different slopes and intercepts (as done in Chetty et al. 2014a). Let us call this the no-pooling case. Ignoring the regional information completely and estimating the equation for the whole sample as one group would give us one slope estimate and one intercept, i.e., the overall national estimates. We can call this the complete pooling case, for further reference below.

A third and potentially better alternative is to recognize not only the grouped nature of the problem at hand (individuals are sorted into different regions), but to explicitly model this relationship by taking into account both the within- and the between-region variances using a multilevel (or hierarchical) model. Multilevel models are widely used in political sciences (modelling for instance election turnouts or state-level public opinion, see for example, Lax and Phillips 2009, Galbraith and Hale 2008, Shor et al. 2007, or Steenbergen and Jones 2002 for an overview) and in the context of education (students are grouped into class rooms and class rooms into schools and school districts, see for example, Koth et al. 2008). The terminology and notation below follow Gelman and Hill (2006).

The multilevel model is characterized by a level-1 equation for the smallest units (8), in this case modeling the relationship between child income rank and parent income rank for family f in region r, and a set of level-2 equations for the larger units, here the regions. The level-2 equations (9–10) model explicitly the intercepts and slope coefficients across regions:

$$\begin{array}{@{}rcl@{}} {R_{f}^{c}} & =&\alpha_{r}+\beta_{r}{R_{f}^{p}}+{\varepsilon_{f}^{c}} \end{array} $$
(8)
$$\begin{array}{@{}rcl@{}} \alpha_{r} & =&\gamma^{\alpha}+\eta_{r}^{\alpha} \end{array} $$
(9)
$$\begin{array}{@{}rcl@{}} \beta_{r} & =&\gamma^{\beta}+\eta_{r}^{\beta} \end{array} $$
(10)

where \({\varepsilon _{f}^{c}}\), \(\eta _{r}^{\alpha }\), and \(\eta _{r}^{\beta }\) are random errors centered around zero with variances \({\sigma _{R}^{2}}\), \(\sigma _{\alpha }^{2}\), and \(\sigma _{\beta }^{2}\), respectively. Another common and equivalent way to write this model is

$$\begin{array}{@{}rcl@{}} {R_{f}^{c}} & \sim & N\left( \alpha_{r}+\beta_{r}{R_{f}^{p}}\:,\:{\sigma_{R}^{2}}\right),\text{ for }f=1,...,F \end{array} $$
(11)
$$\begin{array}{@{}rcl@{}} \left( \begin{array}{c} \alpha_{r}\\ \beta_{r} \end{array}\right) & \sim & N\left( \left( \begin{array}{c} \gamma^{\alpha}\\ \gamma^{\beta} \end{array}\right),\left( \begin{array}{cc} \sigma_{\alpha}^{2} & \rho\sigma_{\alpha}\sigma_{\beta}\\ \rho\sigma_{\alpha}\sigma_{\beta} & \sigma_{\beta}^{2} \end{array}\right)\right),\text{ for }r=1,...,R \end{array} $$
(12)

which emphasizes the fact that the coefficients \(\alpha_{r}\) and \(\beta_{r}\) are given a probability distribution with means and variances estimated from the data. Substituting Eqs. 9 and 10 into Eq. 8, the model can be re-expressed as a mixed model

$$ {R_{f}^{c}}=\gamma^{\alpha}+\eta_{r}^{\alpha}+\gamma^{\beta}{R_{f}^{p}}+\eta_{r}^{\beta}{R_{f}^{p}}+{\varepsilon_{f}^{c}} $$
(13)

where in multilevel terminology, the γ’s are “fixed effects” (= averages across all regions) and the η’s are “random effects” (= draws from the estimated distributions).Footnote 4

The multilevel model appears similar to a random or fixed effects model often used in economics, but there are some important differences. We could for instance estimate a fixed effects model by simply adding 2 × (R − 1) regional dummies to Eq. 4, for regional intercepts and slopes. This approach would basically control away all between-region differences. In a multilevel model, the between-region variance is explicitly estimated from the data and used to predict the regional effects. Also, if there are only few observations in some regions, the estimates using regional dummies will be inefficient. The multilevel model on the other hand makes use of all observations when estimating the variance components and leads therefore to more precise estimates when there is little within-region variance. Importantly, it is thus not necessary to have observations over the whole parent percentile distribution in each of the regions in order to efficiently estimate the model parameters.

Note also that ordinary least squares is just a special case of multilevel models: The variance of the regionally varying parameters is zero in the limit in the complete-pooling case (national OLS) and infinity in the no-pooling model (distinct OLS regressions by region). With multilevel data, however, we can explicitly estimate this variance and do not need to assume it to be either zero or infinity.

Again, in the no-pooling case, the \(\alpha_{r}\)’s and \(\beta_{r}\)’s in Eq. 8 are the OLS estimates from separate regressions, varying completely freely from each other. In the complete pooling case, the \(\alpha_{r}\)’s and \(\beta_{r}\)’s are constrained to one common α and β. Here, in the multilevel model, where Eqs. 8–10 are fitted simultaneously by maximum likelihood estimation, the \(\alpha_{r}\)’s and \(\beta_{r}\)’s are given a “soft constraint”: they are assigned a probability distribution given in Eq. 12, with mean and standard deviation estimated from the data, which actually pulls the coefficient estimates partially towards their mean.

The amount of pooling depends on the number of observations in each group as well as the between-regions variance of the parameters. In fact, an estimate of a regional intercept, for example, can be expressed as a weighted average between the mean across all regions, \(\gamma^{\alpha}\) (complete pooling), and the average of the \({R_{f}^{c}}\)’s within the region, \(\bar {R}_{r}^{c}\) (no pooling):

$$\begin{array}{@{}rcl@{}} \hat{\alpha_{r}}^{multilevel} & = & \omega_{r}\hat{\alpha}^{complete-pooling}+\left( 1-\omega_{r}\right)\hat{\alpha_{r}}^{no-pooling}. \end{array} $$
(14)
$$\begin{array}{@{}rcl@{}} \hat{\alpha_{r}}^{multilevel} & = & \omega_{r}\gamma^{\alpha}+\left( 1-\omega_{r}\right)\bar{R}_{r}^{c} \end{array} $$
(15)

where the pooling factor \(\omega_{r}\) is calculated according to

$$ \omega_{r}=1-\frac{\sigma_{\alpha}^{2}}{\sigma_{\alpha}^{2}+\frac{{\sigma_{R}^{2}}}{n_{r}}}. $$
(16)

Thus, the intercept in a region with few observations is deemed less reliable and pulled towards the average value of all regions. The estimates for a region with many observations on the other hand will usually coincide with those from a separate OLS regression.
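To see how strongly region size drives the weighting, Eqs. 15 and 16 can be evaluated for a small and a large region. The variance components, region sizes, and mean ranks below are invented for illustration, and the function names are likewise assumptions.

```python
def pooling_factor(var_alpha: float, var_resid: float, n_r: int) -> float:
    """Eq. 16: weight placed on the mean across all regions."""
    return 1.0 - var_alpha / (var_alpha + var_resid / n_r)


def partially_pooled_intercept(national_mean, regional_mean, var_alpha, var_resid, n_r):
    """Eq. 15: weighted average of the overall mean and the regional mean."""
    w = pooling_factor(var_alpha, var_resid, n_r)
    return w * national_mean + (1.0 - w) * regional_mean


# Hypothetical variance components (in squared percentile ranks).
VAR_ALPHA = 4.0    # between-region variance of the intercepts
VAR_RESID = 800.0  # within-region residual variance

NATIONAL_MEAN = 45.0
REGIONAL_MEAN = 55.0  # the same raw regional mean in both examples

for n_r in (200, 80_000):
    w = pooling_factor(VAR_ALPHA, VAR_RESID, n_r)
    est = partially_pooled_intercept(NATIONAL_MEAN, REGIONAL_MEAN, VAR_ALPHA, VAR_RESID, n_r)
    print(f"n_r = {n_r:6d}: pooling factor = {w:.4f}, multilevel estimate = {est:.2f}")
```

As the between-region variance goes to zero the factor approaches one (complete pooling), and as it grows without bound the factor approaches zero (no pooling).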

This is the main argument for using multilevel modelling in this particular study: there are many regions in Sweden with relatively few observations. The large regions have more than 400 times as many observations as the small regions. A separate regression for those small regions leads to extreme mobility estimates with large standard errors. In other words, we would not trust those estimates (even though they might seem appealing since we could report some exceptionally low and high levels of intergenerational mobility). Another useful aspect of multilevel models is that it is possible to include regional-level indicators along with regional-level predictors, which would lead to collinearity in OLS.

In a second model, I add five regional types (as described in Section 3.2 below) as a regional level predictor in the form of dummies to Eqs. 9 and 10:

$$\begin{array}{@{}rcl@{}} \alpha_{r} & = & \gamma_{1}^{\alpha}+\sum\limits_{i=2}^{6}\gamma_{i}^{\alpha}T_{i}+\eta_{r}^{\alpha} \end{array} $$
(17)
$$\begin{array}{@{}rcl@{}} \beta_{r} & = & \gamma_{1}^{\beta}+\sum\limits_{i=2}^{6}\gamma_{i}^{\beta}T_{i}+\eta_{r}^{\beta}. \end{array} $$
(18)

This gives the following mixed model:

$$ {R_{f}^{c}}=\gamma_{1}^{\alpha}+\eta_{r}^{\alpha}+\sum\limits_{i=2}^{6}\gamma_{i}^{\alpha}T_{i}+\gamma_{1}^{\beta} {R_{f}^{p}}+\sum\limits_{i=2}^{6}\gamma_{i}^{\beta}T_{i}\,{R_{f}^{p}}+\eta_{r}^{\beta}{R_{f}^{p}}+{\varepsilon_{f}^{c}} $$
(19)

which allows the type of region during childhood to have an effect on both regional intercepts and slopes via \(\sum \limits _{i=2}^{6}\gamma _{i}^{\alpha }T_{i}\) and \(\sum \limits _{i=2}^{6}\gamma _{i}^{\beta }T_{i}\).

The model is built stepwise, starting with a random intercept per region and then adding random slopes and predictors. After each step, a likelihood ratio test is used to assess whether the model fits the data better than classical regression (first model) or better than the model from the previous step.
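
A generic version of such a test can be sketched as follows. The log-likelihood values are hypothetical, and the helper names `chi2_sf` and `lr_test` are illustrative, not taken from any particular software package. The sketch uses the standard chi-square reference distribution; when the added parameters are variance components, this reference distribution is known to be conservative because the null value lies on the boundary of the parameter space.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    chi = math.sqrt(x)
    z = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)  # normal pdf at chi
    if df % 2 == 1:
        # Odd df: normal tail plus a finite series.
        total = math.erfc(chi / math.sqrt(2.0))
        term = chi
        for r in range(1, (df - 1) // 2 + 1):
            if r > 1:
                term *= x / (2 * r - 1)
            total += 2.0 * z * term
        return total
    # Even df: exp(-x/2) times a finite series.
    series, term = 1.0, 1.0
    for r in range(1, (df - 2) // 2 + 1):
        term *= x / (2 * r)
        series += term
    return math.sqrt(2.0 * math.pi) * z * series


def lr_test(loglik_restricted, loglik_full, df):
    """Likelihood ratio statistic and p-value for two nested models."""
    stat = 2.0 * (loglik_full - loglik_restricted)
    return stat, chi2_sf(stat, df)


# Hypothetical example: adding a random slope adds two parameters
# (its variance and its covariance with the random intercept).
stat, p_value = lr_test(loglik_restricted=-1050.0, loglik_full=-1046.5, df=2)
print(f"LR statistic = {stat:.2f}, p-value = {p_value:.4f}")
```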

Maximum likelihood estimation is used to fit the model. The “fixed effects” (regional average) parameters of intercept and slope given by the gammas in Eq. 12 are analogous to standard regression coefficients and are directly estimated. The regional effects given by \(\eta _{r}^{\alpha }\) and \(\eta _{r}^{\beta }\) are not directly estimated but summarized in terms of their estimated variances and covariances. The best linear unbiased predictors (BLUPs) of the regional effects and their standard errors are computed based upon those estimated variance components as well as the “fixed effects” estimates.Footnote 5

3 Data and variable descriptions

The data in this study come from the SIMSAM database at Umeå University (Swedish Initiative for Research on Microdata in the Social And Medical Sciences). SIMSAM combines several different Swedish micro data registers; the population, geographic, and income registers used in this study are provided by Statistics Sweden. A detailed description of the sample, the income variable used, as well as the geographical unit used for the regional analysis is given below.

3.1 Sample selection and income

My population sample consists of all individuals born in Sweden between 1968 and 1976, in the following termed children (927,008 observations before applying any restrictions). Due to the Swedish centralized registration system, 99.5 percent of those children can be linked to their fathers and mothers. The age of the parents at their child’s birth is restricted to the interval 16–40. This age interval is the result of a trade-off between including older parents and being able to observe parent income for everyone from their early thirties onward. With the chosen values, I make use of more than 95% of the sample.

The income variable used here is the sum of taxable income from employment, self-employment, and transfers from the Swedish Social Insurance Agency (“Sammanräknad förvärvsinkomst”). The taxable transfers include parental benefits, pension payments, and sick pay and are labor market and income related.

There are several possibilities as to which intergenerational family member combination to focus on (child income and father income, child income and mother income, or child income and some combination of mother and father income). Each choice leads to slightly different interpretations of the mobility measure. I choose to study the relationship between child income and the sum of mother and father income in order to facilitate comparison to the US study, as well as due to the cultural context: From the second half of the 1960s and onward, Swedish women increased their labor supply significantly due to a combination of an expanding public sector, increasing demand for labor, and women’s desire for (financial) independence. A tax reform in 1971 abolished joint taxation of spouses, and public child care was expanded considerably (Gustafsson and Jacobsson 1985; Gustafsson 1992; Gustafsson and Stafford 1992). Mothers have therefore been important contributors to Swedish families’ household income for cohorts in this study. In addition, changes in the amount of time parents spend at home with their children and changes in the intra-household division of market- and household work, have likely affected children’s adult incomes. These are very interesting issues that are beyond the scope of this paper and left for future research.

Chetty et al. (2014a) also use the total parent income (total pretax income at the household level); however, they use child family income as opposed to child individual income in their main analysis. This might be problematic since this measure is more affected by assortative mating. What one might be measuring in this case is the relationship between parent income and a child’s ability to find a high income partner.Footnote 6 In Section IV.B.3 of Chetty et al. (2014a), using child individual income instead of child family income is indeed shown to change the estimated rank-rank slopes by −6 and −26 percent for sons and daughters, respectively. We should keep in mind the different child income measure used when comparing the results to the US study.

Annual earned income can in principle be observed for each individual (children and parents) over the time period 1968 to 2010 in my data. All income observations are expressed in 2010 SEK. Income and earned income are used interchangeably in the following. I follow the literature discussed in Section 2 and approximate average parent lifetime income by averaging over a large number of annual incomes. For over 96% of the parents, I have 17 consecutive income observations available from when they were 34 to 50 years old. Parents missing too many income observations are dropped from the sample.Footnote 7

The great advantage here compared to earlier studies is that I measure parental income at approximately the same age for each parent, as well as over a very long time span. Averaging instead over the same calendar years for everyone (e.g., 2010–2012), as done in many other studies, would give a biased measure: we would underestimate average income for young parents and overestimate average income for old parents, and even include some parents who are already retired. In order to make the parent incomes even more comparable over time, I rank parents by 5-year birth cohort groups. For example, parents where the mother is born between 1941 and 1945 and the father is born between 1936 and 1940 comprise one category and are ranked only relative to other parents in just this group.
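
The within-group ranking can be sketched as follows. This is an illustration under stated assumptions rather than the procedure applied to the register data: the record layout, the group labels, and the income values are hypothetical, and the treatment of ties (tied incomes share the average of their positions) is an assumption.

```python
from collections import defaultdict


def percentile_ranks_by_group(records):
    """Return {id: percentile rank}, with ranks computed within each group.

    records: iterable of (id, group, income) tuples.
    """
    groups = defaultdict(list)
    for rec_id, group, income in records:
        groups[group].append((income, rec_id))

    ranks = {}
    for members in groups.values():
        members.sort()
        n = len(members)
        i = 0
        while i < n:
            # Find the block of positions i..j that share the same income.
            j = i
            while j + 1 < n and members[j + 1][0] == members[i][0]:
                j += 1
            avg_position = (i + j) / 2 + 1  # 1-based average position
            for k in range(i, j + 1):
                ranks[members[k][1]] = 100.0 * avg_position / n
            i = j + 1
    return ranks


# Hypothetical parent pairs; group = (mother cohort, father cohort).
older = ("1941-1945", "1936-1940")
younger = ("1946-1950", "1941-1945")
parents = [
    ("a", older, 300_000), ("b", older, 500_000),
    ("c", younger, 300_000), ("d", younger, 100_000),
]
ranks = percentile_ranks_by_group(parents)
print(ranks)
```

In this toy example, the same income of 300,000 SEK receives a different percentile rank in each cohort group, because each parent pair is compared only with other pairs in its own group.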

For the children, naturally fewer income observations are available. Following the results by Bhuller et al. (2011) and Nybom and Stuhler (2016b), I choose to approximate child lifetime income by taking the average over three years when 32 to 34 years old.Footnote 8 As discussed in Section 2.1, almost none of the relevant studies has analyzed the relation between income trajectories over the life cycle and average lifetime income for women. One exception is Böhlmark and Lindquist (2006), who found that women’s income trajectories follow a different pattern compared to men’s, and that women’s relationship between annual and approximated average lifetime income has also changed strongly over time. Unfortunately, the youngest women in their study are born 26 years earlier than the oldest daughters in my sample, which strongly reduces the applicability of their findings, given the development of female labor market participation during the missing decades. Since we do not know if women’s lifetime earnings are best approximated by annual earnings at an earlier or later age compared to men in my sample, I use the same age span for daughters as for sons. I rank children by income and child birth cohort; all children missing more than one income observation are dropped from the sample (3.7%).
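
A minimal sketch of this income proxy is given below. The function name and data layout are hypothetical. How a single missing year is handled is an assumption: here the average is taken over the remaining observed years, and zero incomes count as observed.

```python
def child_income_proxy(incomes_by_age, ages=(32, 33, 34), max_missing=1):
    """Average annual income over the given ages, or None if the child
    is missing more than max_missing observations."""
    observed = [incomes_by_age[a] for a in ages
                if incomes_by_age.get(a) is not None]
    if len(ages) - len(observed) > max_missing:
        return None  # dropped from the sample
    return sum(observed) / len(observed)


# Hypothetical children (incomes in SEK).
print(child_income_proxy({32: 100_000, 33: 200_000, 34: 300_000}))
print(child_income_proxy({32: 100_000, 34: 300_000}))   # one year missing
print(child_income_proxy({33: 200_000}))                # two years missing
```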

Table 4 in the Appendix summarizes the sample. The average age at child birth (26 for mothers and 28 for fathers) has increased slowly but steadily over the observed time horizon. There are roughly between 80,000 and 90,000 children in each cohort and 789,300 children in total, before assigning childhood regions in the next section.

Table 1 shows an income summary. There is a clear difference between female and male incomes in terms of levels and variances in both generations. Mothers have on average about 60% of fathers’ incomes (but only 36% in terms of the highest income). Income inequality as measured by the 90th income percentile divided by the 10th percentile is much larger for mothers than for fathers, but very similar within the child generation.

Table 1 Income distributions

3.2 Geographic unit

The geographic unit I choose to work with is the local labor market region, or LLM. An LLM is a self-sufficient area in terms of labor within which individuals live and work, and thus spend most of their time. The aggregation of municipalities into LLMs is taken from Statistics Sweden which measures commuting flows between municipalities. The aggregation into local labor markets corresponds most closely to the commuting zones which are used by Chetty et al. (2014a) for the USA.

Studying local labor markets is a first step towards measuring the effect of immediate conditions (family, neighborhood), the local community (school quality, for example), and the larger metro area, which picks up, for example, labor market conditions. With smaller geographical units such as municipalities, there is a larger risk of selection bias due to residential segregation, i.e., that families sort themselves into certain residential areas and municipalities. A local labor market area contains several municipalities and probably several different residential areas, with different types of families. There are currently 75 LLMs in Sweden (down from 112 in 1990 due to increasing commuting patterns), containing on average 4 municipalities and a population of 90,000. In contrast, there are 741 commuting zones in the USA containing on average 4 counties and a population of 380,000.

In addition, I use five different regional types, based upon the “regional families” classification of local labor markets by The Swedish Agency for Economic and Regional Growth. The five regional types (T1–T5) are large cities (such as Stockholm), large regional centers (university cities, for example), small regional centers (small cities employing a large share of the population in the surrounding rural areas), sparsely populated regions (less than six people per square kilometer), and other small regions (ranking in between small regional centers and sparsely populated regions). A complete list of local labor market regions and their type classifications can be found in Table 5 in the Appendix.

Research by Cunha and Heckman (2007), Cunha et al. (2010), and Heckman (2007) indicates that the early environment is important in the human capital formation of children. Early investments generate not only human capital directly but also lead to higher returns to later investments. Other potentially important factors influencing the accumulation of human capital and life time income are the school environment and peers (Lavy et al. 2012), the home and neighborhood environment (Chetty et al. 2016), and probably also the availability of adult role models and guidance when choosing higher education or career paths during teenage years.

I therefore assign children to the local labor market region in which they lived for at least six years between the age of 6 and 15 (ignoring moves within a local labor market), in order to capture both some influences during earlier as well as some teenage years. Using the strict assignment rule of a minimum of 6 years in the same region, we can be sure that a child was actually exposed to this location for a significant portion of her childhood and that studying regional differences in mobility is meaningful.Footnote 9 Chetty et al. (2014a) instead assign children to a region based upon their parents’ residence in 1996. The sample now includes 778,484 individuals; 1.4% moved too often to determine a childhood region.
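
The assignment rule can be sketched as follows. The function name and the yearly data layout are hypothetical, and treating the age window as inclusive of both ages 6 and 15 (ten years in total) is an assumption. Under that assumption, at most one region can reach the six-year minimum, so the assignment is unique whenever it exists.

```python
from collections import Counter


def childhood_region(region_by_age, first_age=6, last_age=15, min_years=6):
    """Region in which the child lived at least min_years within the age
    window, or None if no region qualifies."""
    counts = Counter(
        region_by_age[age]
        for age in range(first_age, last_age + 1)
        if region_by_age.get(age) is not None
    )
    if not counts:
        return None
    region, years = counts.most_common(1)[0]
    return region if years >= min_years else None


# Hypothetical residence histories (age -> local labor market).
stayer = {age: "A" if age <= 11 else "B" for age in range(6, 16)}  # 6 years in A
mover = {age: "A" if age <= 10 else "B" for age in range(6, 16)}   # 5 years each
print(childhood_region(stayer))
print(childhood_region(mover))
```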

4 Mobility on the national level

In this section, I summarize the national mobility estimates based on both log incomes and income ranks. A non-parametric description of mobility on the national level (including a transition matrix and quintile mobility over time) can be found in Appendix B (Fig. 13).

The national mobility results for different family member combinations are shown in Table 2. Both the IGE and the rank-rank slope show the weakest dependence between the incomes of mothers and their children. Both the IGE and the rank-rank slope estimates indicate that the relation between son and parent income is the least mobile (remember that the larger the IGE or rank-rank slope, the less mobility). A ten percentile points increase in parent income rank implies on average a 2.36 percentiles increase in the son’s income rank.

The estimated IGE for sons and fathers, 0.252, is in line with previous results. Nybom and Stuhler (2016b) got an estimate of 0.27, based on a sample of 3,504 Swedish sons born between 1955 and 1957. Two main differences to their study are that their income measure is total pre-tax income which includes capital realizations, and fathers older than 28 years at their son’s birth are excluded from the sample. The effects of those two differences might however work in opposite directions which could explain the similarity to this study’s result.

Björklund and Jäntti (1997) estimated the IGE to be 0.216 between fathers and sons. Their sample was quite different from the one used here: no actual father and son pairs were observed but instead two independent samples for both groups were combined. Their income measure was earnings, a 5-year average for the fathers and one single observation for the sons.

Österberg (2000) also presented results for daughters and mothers. Her sample differed from mine in several ways. Incomes were observed during three calendar years only, when parents were up to 65 years old and thus possibly already retired. Many children in the sample were under 33 years old when their income was measured. Her estimates of the IGE between sons/daughters and fathers (0.13/0.071, respectively) as well as between sons/daughters and mothers (0.022/0.036, respectively) are substantially smaller in magnitude than my estimates, which might be caused by attenuation and life cycle bias.

Björklund et al. (2006) studied how pre- and postbirth factors contribute to intergenerational earnings and education transmission by analyzing Swedish families with adoptive versus biological children born in the sixties. They use earnings in 1999 to approximate lifetime average earnings. Their estimate of the IGE between children and their fathers in biological families (no adoptive children) is 0.235, which is quite close to the IGE of 0.216 in my data.

The IGE and rank-rank slopes for parents and their children (first row in Table 2) are larger than the estimates for father and mother separately (second and third rows). For the IGE, the association with the sum of parent income is larger than the sum of the associations with mother and father income in all three cases. This suggests a potentially important role of the parent income combination, or parent income matching, for income transmission between generations. Investigating this finding further is an interesting direction for future research.

Table 2 Mobility estimates for the pooled sample

Figure 2a, b shows the development of the rank-rank slopes and IGE for children, sons, daughters, and their parents, respectively, by cohort. The error bars show 95% confidence intervals. The rank-rank slope for children and their parents is close to 0.2 for all observed cohorts with no significant trend. The association between sons and their parents’ income ranks has slightly decreased from around 0.26 to 0.22 between 1968 and 1976. The association for daughters starts at 0.19 and increases to reach the same level as that of sons at the end of the observed time period.

Fig. 2

Intergenerational mobility over time. a Rank-rank slope estimates separately by cohort, for the three combinations son, daughter, and child rank with parent rank, respectively. b Estimates of the intergenerational elasticity by cohort. The error bars indicate 95% confidence intervals

Note that the separate estimations for sons and daughters involve assigning new ranks compared to the child group: in each estimation sample, both the dependent and independent variable always consist of percentile ranks between 0 and 100. In particular, if the daughters are located more heavily along the lower ranks within the child distribution (due to a lower average income compared to sons), they are still approximately uniformly distributed between 0 and 100 in the pure daughter sample. The order among girls and boys, respectively, stays the same, however. The rank-based estimates of the children are therefore not a simple weighted average of the estimates by gender.Footnote 10

As shown in Fig. 2b, the association between the log income of parents and sons declines until 1971 and then returns almost to its starting value of 0.34. The daughter-parent log income association, on the other hand, starts as low as 0.22 and increases until it reaches a level similar to that of sons in 1976 (0.31). The child-parent log income association is a weighted average of the estimates by gender and is thus relatively constant until the later years, which show a small upward trend.

Equation 7 in Section 2.2 can help to explain the different trends in the later half of the observed time period between ranks and log incomes: The rank-rank slope is simply the correlation coefficient between the percentile ranks of children and parents, while the IGE is the product of (i) the correlation coefficient between log child income and log parent income and (ii) the ratio of their standard deviations. The data shows that the increasing IGE from 1971 onward is purely driven by an increase in the relative variance of the child log income distributions, and not by an increase in the linear dependence between child and parent log income.
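
The decomposition can be checked on a toy example. The log incomes below are hypothetical and serve only to illustrate the mechanics: the OLS slope of log child income on log parent income equals the correlation coefficient times the ratio of standard deviations, and widening the child distribution around its mean raises the slope while leaving the correlation unchanged.

```python
import math


def slope_corr_ratio(x, y):
    """Return (OLS slope of y on x, correlation, sd(y) / sd(x))."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx, sxy / math.sqrt(sxx * syy), math.sqrt(syy / sxx)


# Hypothetical log incomes of five parent-child pairs.
log_parent = [12.0, 12.3, 12.5, 12.8, 13.1]
log_child = [12.2, 12.1, 12.6, 12.5, 12.9]

ige, corr, sd_ratio = slope_corr_ratio(log_parent, log_child)

# Spread the child distribution by a factor of 1.5 around its mean.
mean_child = sum(log_child) / len(log_child)
log_child_wide = [mean_child + 1.5 * (v - mean_child) for v in log_child]
ige_wide, corr_wide, _ = slope_corr_ratio(log_parent, log_child_wide)

print(f"IGE = {ige:.3f}, correlation = {corr:.3f}, sd ratio = {sd_ratio:.3f}")
print(f"wider child distribution: IGE = {ige_wide:.3f}, "
      f"correlation = {corr_wide:.3f}")
```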

5 Mobility across regions

The multilevel analysis reveals some interesting facts about intergenerational mobility across Sweden. The first part in this section discusses the multilevel model output. The second part focuses on the two measures relative mobility and absolute mobility at p = 25 on the regional level as described in Section 2.2. In the last part, I discuss alternative results obtained by using separate OLS regressions by region.

5.1 Results from the multilevel model

The results from the multilevel model (1) from Section 2.3 can be summarized by plotting the deviations of the predicted slope- and intercept-random effects (the \(\eta _{r}^{\alpha }\) and \(\eta _{r}^{\beta }\)) from the estimated average values (\(\gamma ^{\alpha }\) and \(\gamma ^{\beta }\)) for each region. See Table 6 in the Appendix for the detailed estimation output. The slopes and intercepts obtained by the fixed- and random effects are used in the next section to compute relative- and absolute mobility according to the formulas in Section 2.2.

As shown by the black dots in Fig. 3a, the estimated slope-random effects appear at first glance to vary greatly across Sweden. The regional slopes to the left, with data points below the horizontal line, are smaller than the average, and the regional slopes located above the line to the right are larger. However, most estimates are not significantly different from the average: most of the 95% confidence intervals (shown as error bars) include the horizontal line at zero, which indicates the average slope. Of all 112 regions, only 3 show a significantly flatter slope (weaker association between parent and son income rank), and 7 regions show significantly steeper slopes (stronger association). If we had used separate OLS regressions for each region, we would probably have overstated the differences in rank-rank slopes over regions since there is no easy way to compare the estimates of many disjoint regressions based on different observations.

Fig. 3

Regional effects. a Deviations of the 112 regional random slopes from the slope fixed effect, i.e., the average slope across all regions, sorted in ascending order from left to right. b A similar graph for the deviations of the regional random intercepts from the estimated average intercept. The error bars indicate 95% confidence intervals. Regions that include the horizontal line (zero) in their confidence interval do not differ statistically from the Swedish average in terms of intercept or slope

In contrast to the slopes, a large fraction of the regional intercepts (shown in Fig. 3b) differ significantly from the average: 22 regions have smaller than average intercepts, while 33 have larger than average intercepts. Thus, we know already that mobility measures based on absolute outcomes (computed based on both intercept and slope) will show larger differences between regions than purely relative measures (based on the few statistically significant regional slope estimates only).

The average slope estimate across all regions is 0.182 with a standard error of 0.002, implying that a ten percentile increase of the parent income rank is associated with an increase of 1.82 ranks for the child. The average intercept is 40.3. The correlation coefficient between the regional slopes and regional intercepts is −0.64, which means that regions with steeper rank-rank slopes on average have lower intercepts.

The relationship between the multilevel model, separate OLS regressions for each region, and the completely-pooled, national estimates is illustrated in Fig. 4. The top panel shows Dorotea, the smallest local labor market region (291 observations) located in the north of Sweden. The dotted line shows the mobility estimates from a separate OLS regression: the line is almost completely flat and would indicate extremely high levels of relative income mobility. However, the large spread of the underlying binned scatter plot in gray shows the inefficiency of the estimation and thus how unreliable this result is. The child-parent income rank association based upon the best linear unbiased predictor (BLUP) from the multilevel model (given by the black solid line) deviates from this extreme result and is pulled towards the solid gray line above, which shows the average association across all regions. The bottom panel in Fig. 4 displays a similar figure for Stockholm. For readability, only the average child rank by parent rank is displayed (binned scatter plot). The multilevel estimates (BLUPs) coincide here completely with the estimates from a regression run exclusively for children who grew up in Stockholm (the solid black line and the dotted line are indistinguishable from each other). With 132,749 observations, the estimates are not pulled at all towards the pooling result.

Fig. 4

Comparison of estimation strategies. a A binned scatter plot of son and parent income ranks for Dorotea, with three different fitted lines from (1) a separate OLS regression, (2) the national OLS regression, and (3) the Best Linear Unbiased Predictors from the multilevel model. The multilevel estimates are close to the national average and give less weight to the within-LLM information. The lower panel shows a similar figure for Stockholm. The results from (1) and (3) are here indistinguishable from each other

When adding the five regional types (from large city to sparsely populated regions) to the model with large cities as the reference category, we find that none of the average intercepts for regional types 2 to 5 differs significantly from the base category. However, the rank-rank slopes differ on average across regional types: the rank-rank slope is steepest in type 1 regions (large cities), and flattest in type 4 regions (sparsely populated regions).

5.2 Relative mobility and absolute mobility across regions

Relative mobility and absolute mobility at p = 25 for each region are calculated according to formulas (5) and (6) in Section 2.2. The slopes and intercepts plugged into these formulas are computed according to Eqs. 9 and 10. More specifically, I compute the regional slopes as the sum of the slope fixed effect (regional average slope \(\gamma ^{\beta }\)) and the region specific random slopes (\(\eta _{r}^{\beta }\)), where the region specific random slopes are set to zero whenever they are not statistically significantly different from zero. Similarly, the total regional intercepts are the sum of the intercept fixed effect (\(\gamma ^{\alpha }\)) and the region specific intercepts (\(\eta _{r}^{\alpha }\)), where the region specific intercepts are set to zero whenever they are not statistically significantly different from zero. The results obtained this way can be interpreted as a lower bound of the existing regional differences in mobility. The complete list of results by local labor market can be found in Table 7 in the Appendix.
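
As a sketch of this computation, assume that formula (5) scales the regional slope by 100 (as described in the Introduction) and that formula (6) evaluates the regional regression line at parent rank p = 25. Both readings, the two-sided 95% cutoff of 1.96 standard errors, and all numerical inputs below are assumptions made for illustration and are not taken from the estimation output.

```python
def regional_mobility(gamma_alpha, gamma_beta,
                      eta_alpha, se_alpha, eta_beta, se_beta,
                      p=25, z=1.96):
    """Return (relative mobility, absolute mobility at p) for one region.

    Random effects that are not significantly different from zero
    (|eta| <= z * se) are set to zero before they are added.
    """
    def keep_if_significant(eta, se):
        return eta if abs(eta) > z * se else 0.0

    intercept = gamma_alpha + keep_if_significant(eta_alpha, se_alpha)
    slope = gamma_beta + keep_if_significant(eta_beta, se_beta)
    relative = 100.0 * slope          # assumed form of formula (5)
    absolute = intercept + p * slope  # assumed form of formula (6)
    return relative, absolute


# Hypothetical region: significant intercept effect, insignificant slope effect.
rel, ab = regional_mobility(gamma_alpha=40.0, gamma_beta=0.18,
                            eta_alpha=1.0, se_alpha=0.2,
                            eta_beta=0.01, se_beta=0.02)
print(f"relative mobility = {rel:.2f}, absolute mobility at p = 25: {ab:.2f}")
```

In this hypothetical region, only the intercept deviation survives the significance filter, so relative mobility equals the all-region value while absolute mobility is shifted upward.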

Relative mobility is 18.16 in most regions. Relative mobility is higher only in the three regions Varberg, Växjö, and Skövde.Footnote 11 The average outcome difference between children from top and bottom income families in Varberg is just 15.58 percentile ranks. Seven regions show less than average relative mobility, with Stockholm ranking lowest. Here, the inequality of outcomes is largest with a maximal outcome difference between children of 22.21 percentile ranks.

Absolute mobility at p = 25 varies from 40.90 in Årjäng to 48.61 in Värnamo,Footnote 12 with an average of 43.69 across all regions (standard deviation 1.63). Converting the percentile ranks back to income levels, we find that children with parents located at the 25th percentile who grow up in Årjäng can expect nearly 20,000 SEK (≈2,210 USD) less income per year than comparable children who grow up in Värnamo. This corresponds to 90 percent of the average monthly salary of a worker in Sweden in 2010 (Swedish Trade Union Confederation 2011).

Figure 5 shows relative mobility and absolute mobility at p = 25 for all regions. The crossed lines through the center of the plot indicate the average levels of relative and absolute mobility, respectively. The arrow-tips indicate the direction in which mobility is increasing (note that high values of relative mobility indicate less mobility, since steeper slopes imply stronger associations between parent and child income). The quadrant marked with a large plus (minus) sign indicates regions with both above (below) average relative and absolute mobility. The data point right in the center represents not one but 57 regions which all have average levels of both mobility measures.

Fig. 5

Relative and absolute mobility by region. For each region, relative mobility is plotted against absolute mobility at p = 25. The lines of the crosshair indicate the average levels of the measures, 18.16 and 43.69, respectively. The data point right in the center is actually an overlay of 57 regions, all with average mobility. The quadrant marked with a plus (minus) sign indicates areas with statistically significant above average (below average) mobility levels according to both measures

All regions with extremely high or low levels of upward mobility (the data points on the very left and the very right) show just average levels of relative mobility. Thus, even though the relative difference between sons from the highest and lowest income families in, for example, Torsby and Hylte, is the same, children from families with the same income rank in those regions will end up with very different levels of income as adults. Using the IGE or the rank-rank slope as the only measure for mobility, this difference would go completely unnoticed.

The estimates from this study can be compared to the results from Chetty et al. (2014a) for the USA. Remember that, in addition to population size, our estimates are not fully commensurate due to important differences in sample selection, income measurement, ranking procedure and childhood region assignment discussed in the Introduction and Section 3. Still, we can, for example, compare the middle 80% of the distribution of US commuting zones and Swedish LLMs in terms of absolute mobility at p = 25 and relative mobility (Chetty et al. use “upward mobility” instead, which is calculated exactly as absolute mobility at p = 25 but differs in its interpretation as the average outcome for all children with parents located between income ranks 0 and 50). The outcome difference in child mean rank given parent rank 25 between regions at the 90th percentile and regions at the 10th percentile amounts to 14.6 in the USA and just 4.6 in Sweden. The reported mean is 43.3 in the USA and 45.0 in Sweden, using the arithmetic average of the regional, statistically significant, values. The distributions of absolute (upward) mobility across regions in the USA and Sweden therefore have quite similar means, but there is a larger variance in the USA even when not looking at the tails of the distribution.

Relative mobility also varies considerably more in the USA, where the maximum outcome difference measured in percentile ranks takes on values between 6.8 and 50.8 across regions. In Sweden, relative mobility across LLMs varies only between 15.6 and 22.2 percentile ranks.

Furthermore, since the income distribution is much more compressed in Sweden compared to the USA, the distance between two ranks in terms of income levels is considerably smaller in Sweden. The monetary difference between the top and bottom 10% of US commuting zones in terms of absolute mobility is 12,600 USD (including labor and capital income), while the same difference between Swedish LLMs amounts to just 11,578 SEK (≈1,300 USD) (labor income only).

It is important to keep in mind that Chetty et al. do not discuss how their individual regional estimates relate to each other and to what extent they differ significantly from each other (or from the US average). My results are much more conservative both in the sense that my estimation method accounts for the number of observations (which gives less extreme results), and because I choose to compute the mobility measures using the regional predictions solely if they differ significantly from the Swedish average.

There are three local labor markets that stand out with less mobility according to both measures (Eskilstuna, Karlstad, Linköping), and three that show significantly higher mobility according to both measures (Varberg, Växjö, Skövde). Even though an in-depth analysis of the underlying forces driving this result is beyond the scope of this paper and left for future research, we can look at one known factor correlated with mobility, namely income inequality. Countries with more income inequality have been shown to have less intergenerational income mobility. This relationship has become known as the Great Gatsby Curve; see, for instance, Corak (2013).

A simple indicator of income inequality is the ratio of median income to mean income, which informs us about the skewness of the income distribution. Across all municipalities in Sweden in 1991, weighted by population size, this measure is 0.9586, i.e., the median income level amounts to 95.86% of the mean income and the distribution is thus, as expected, right skewed. Looking at this indicator separately for the local labor markets Varberg (96.12), Växjö (96.85), and Skövde (98.16), as well as Eskilstuna (96.79), Karlstad (95.63), and Linköping (96.05), we find that the regions that do particularly well in the two mobility measures used in this study have income distributions that are less skewed than the Swedish average. This fits well with the Great Gatsby hypothesis. The three regions that perform badly in both measures, on the other hand, do not show particularly high income inequality, at least not according to this very simple measure. Looking at relative mobility only, the Stockholm region has both the most right-skewed income distribution (the median income level is just 92.65% of the mean income level) and the lowest levels of relative mobility among all Swedish LLMs.
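
How this indicator responds to skewness can be seen in a small sketch. The income values are hypothetical, and the sketch computes the ratio for a single unweighted distribution, not the population-weighted measure across municipalities used above.

```python
import statistics


def median_to_mean(incomes):
    """Ratio of median to mean income; values below 1 indicate right skew."""
    return statistics.median(incomes) / statistics.mean(incomes)


# Hypothetical income distributions (in thousands of SEK).
symmetric = [100, 200, 300]
right_skewed = [100, 200, 300, 400, 2000]

print(median_to_mean(symmetric))
print(median_to_mean(right_skewed))
```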

All regions that are doing particularly well in lifting children from lower income families (located to the very right in Fig. 5: Värnamo, Hylte, Ljungby, Hofors, Gnosjö) have some common characteristics: they are located in the south of Sweden, are small to medium sized (the number of observations ranks between the 20th and 50th percentile of all regions), and they all have a historically large manufacturing sector which today still employs a large fraction of the population. Regions that show the lowest outcomes for children with low income parents (on the very left of Fig. 5: Årjäng, Torsby, Malung, Vansbro, Jokkmokk) are also quite similar to each other. They are located in the Swedish inland close to the Norwegian border, they have a small and mostly decreasing population, and the local economy is characterized by a decreasing forestry sector, some agriculture, and outdoor tourism.

5.3 A comparison to OLS

Figure 6 visualizes relative and absolute mobility at p = 25 just as in Fig. 5 in Section 5.2, but here based on 112 separate OLS regressions by region. As expected, the regions differ much more in terms of both mobility measures compared to the multilevel approach. Especially relative mobility (the difference in mean outcome for sons from the families with the highest and lowest income, respectively) varies considerably more: from a 7.3 percentile ranks difference in Åsele to 24.9 percentile ranks in Torsby. Absolute mobility varies here between 40.1 in Torsby and 49.3 in Hylte. However, as I emphasize in this paper, it is not obvious how to interpret these differences.
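As a rough sketch of how the two measures follow from a single regional OLS regression, the code below fits a rank-rank line and derives relative mobility (100 times the slope) and absolute mobility at p = 25 (the fitted child rank for parents at the 25th percentile). It assumes ranks are measured on a 0 to 100 percentile scale; the data and function names are hypothetical.

```python
def ols_fit(x, y):
    """Return (intercept, slope) of a simple least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def mobility_measures(parent_ranks, child_ranks, p=25):
    """Return (relative mobility, absolute mobility at percentile p) for one region."""
    intercept, slope = ols_fit(parent_ranks, child_ranks)
    relative = 100 * slope            # rank gap between top and bottom families
    absolute = intercept + slope * p  # expected child rank for parents at p
    return relative, absolute

# Hypothetical region in which child rank = 40 + 0.2 * parent rank exactly.
parent_ranks = [10, 30, 50, 70, 90]
child_ranks = [42, 46, 50, 54, 58]
relative, absolute = mobility_measures(parent_ranks, child_ranks)
```

For this made-up region the sketch yields a relative mobility of 20 percentile ranks and an absolute mobility of 45 at p = 25.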

Fig. 6

Relative and absolute mobility by region using separate OLS regressions. For each region, relative mobility is plotted against absolute mobility at p = 25. The lines of the cross hair indicate the mean of each measure across the 112 regions. The gray line shows the fitted values from an OLS regression of the 112 relative mobility results on upward mobility, weighted by the number of observations in each region. Estimated slope: −0.476 (0.004)

Figure 7 illustrates the rank-rank slope estimates underlying the mobility measures obtained by separate OLS regressions in Fig. 6, including 95% confidence intervals. The regions are sorted in ascending order by the number of observations from left to right. It is clear that the lowest and highest slope estimates are found on the left side of the graph, together with the largest standard errors. In addition, most regional slopes are statistically indistinguishable from each other, and there is no obvious way to compare the estimates of the 112 regressions. Thus, an interpretation of regional differences in mobility based solely on those regressions is not very convincing in the Swedish context.
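The link between region size and the width of the confidence interval can be sketched as follows. The code uses the textbook standard error of a simple OLS slope and a normal critical value of 1.96 (an approximation; for very small regions a t critical value would be wider). The two toy regions and all names are hypothetical and are not the paper's data.

```python
import math

def slope_with_ci(x, y, z=1.96):
    """Return the OLS slope and an approximate 95% confidence interval."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(ssr / (n - 2) / sxx)  # standard error of the slope
    return slope, (slope - z * se, slope + z * se)

def intervals_overlap(ci_a, ci_b):
    """True if two confidence intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

def toy_region(n):
    """Hypothetical region: ranks spread over 0-100, alternating +/-5 noise."""
    x = [100 * (i + 0.5) / n for i in range(n)]
    y = [40 + 0.2 * xi + (5 if i % 2 == 0 else -5) for i, xi in enumerate(x)]
    return x, y

small_slope, small_ci = slope_with_ci(*toy_region(10))
large_slope, large_ci = slope_with_ci(*toy_region(1000))
```

With the same underlying relationship and noise level, the region with 10 observations has a much wider interval than the region with 1000, and the two intervals overlap, so the slopes would not be judged different on this basis.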

Fig. 7

OLS rank-rank slopes and their 95% confidence intervals. Every black dot represents a point estimate of the rank-rank slope for one region (112 in total). The regions are sorted by the number of observations in ascending order from left to right. In general, the fewer inhabitants a region has (the further to the left in the graph), the less precise and the more extreme the slope estimate

6 Discussion and concluding remarks

In this paper, I have used detailed population-wide register data on nine Swedish cohorts and their parents to draw a picture of intergenerational mobility in and across Sweden. In line with previous literature, I have focused on income measurements at ages where annual income is most likely to equal average life-time income. These measures were constructed by averaging over 17 consecutive annual income observations for parents (when they were 34 to 50 years old) and three annual income observations for children (when they were 32 to 34 years old). Income ranks are highly comparable since I take into account parents’ and children’s birth cohorts.

For Sweden as a whole, the estimated IGE between parents and their children is 0.3. This implies that 30% of the deviation of a family’s parent income from the average parent income is transmitted to the child. The strongest association between the log incomes of two generations’ family members is found between parents and their sons (0.32) and the weakest association is measured between mothers and their sons (0.06). Using income ranks, the patterns across family members look very similar. I found that a 10 percentile rank increase in parent income implies a 2 percentile rank increase in child income.
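The two national estimates can be read through a small worked example. The sketch below treats the IGE as the slope of a log-log relationship and the rank-rank slope as 0.2, as reported above; the intercepts are ignored, so the functions describe deviations from the average rather than levels, and the function names are illustrative only.

```python
import math

def expected_child_income_ratio(parent_income_ratio, ige=0.3):
    """Child income relative to average implied by a log-log slope (the IGE)."""
    return math.exp(ige * math.log(parent_income_ratio))

def expected_child_rank_gain(parent_rank_gain, rank_slope=0.2):
    """Expected change in child rank for a given change in parent rank."""
    return rank_slope * parent_rank_gain

# Parents earning twice the average: the child is expected to earn 2 ** 0.3
# times the average, i.e. roughly 23% above it.
child_ratio = expected_child_income_ratio(2.0)

# A 10 percentile rank increase for parents maps to 2 ranks for the child.
child_rank_gain = expected_child_rank_gain(10)
```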

Interestingly, child income is more strongly associated with total parent income than with father income alone. This supports the choice of using both parents’ income in this study as opposed to ignoring mothers. In the case of the IGE, the child-parent association even exceeds the sum of the individual elasticities between child and father and between child and mother income. This suggests an important role of parent matching for income transmission and is an interesting direction for future research. Focusing on each cohort separately from 1968 to 1976, I found a convergence of income associations by gender over time.

My primary tool for measuring regional differences has been a multilevel model. In order to facilitate comparisons, I also discussed the results from separate regional OLS regressions. The multilevel analysis revealed that relative mobility, i.e., the strength of the intergenerational association of income ranks (or the maximum outcome difference) in a region, is 18.16 percentile ranks in most local labor markets. The strongest association (lowest relative mobility) between child and parent income rank was measured in Stockholm, where the relative outcome difference is more than 22 percentile ranks. For children with parents located at the 25th percentile of the parent income distribution (absolute mobility at p = 25), growing up in different regions leads to income differences of up to 20,000 SEK (≈ 2,210 USD) per year. When using only the IGE or rank-rank slopes to study mobility, these differences would be completely invisible. In comparison to the USA, the regional differences in both mobility measures are smaller in terms of ranks and in monetary terms, due to the more compressed income distribution in Sweden. Since the estimation method in this paper leads to more conservative results, however, the estimates are not fully comparable to those in the study by Chetty et al. (2014a).
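One way to build intuition for why a multilevel approach yields more conservative regional estimates than separate OLS regressions is the standard partial-pooling (shrinkage) formula, sketched below. This is a generic illustration and not the model estimated in this paper: the function name, the between-region standard deviation, and all numbers are hypothetical.

```python
def shrink_toward_mean(estimate, std_error, grand_mean, between_sd):
    """Pull a noisy regional estimate toward the overall mean.

    The weight on the region's own estimate is the share of total variance
    that reflects true between-region variation rather than sampling noise.
    """
    reliability = between_sd ** 2 / (between_sd ** 2 + std_error ** 2)
    return reliability * estimate + (1 - reliability) * grand_mean

# Hypothetical values: overall relative mobility of 18 percentile ranks and a
# true between-region standard deviation of 2 percentile ranks.
grand_mean, between_sd = 18.0, 2.0

# Small region: extreme estimate with a large standard error.
small_region = shrink_toward_mean(25.0, 6.0, grand_mean, between_sd)

# Large region: moderately high estimate with a small standard error.
large_region = shrink_toward_mean(22.0, 0.5, grand_mean, between_sd)
```

In this toy setting the small region's estimate of 25 is pulled almost all the way to the overall mean, while the large region's estimate of 22 barely moves, which mirrors the pattern of extreme OLS estimates in small regions being tempered under partial pooling.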

Sweden is considered to be a country with exemplarily high levels of intergenerational income mobility. My results show that there are differences in mobility across Sweden and that location matters. The evidence provided here indicates that there are significant differences in the expected outcomes for children from low income families depending on childhood region. Regions that are particularly successful and particularly unsuccessful in producing high outcomes for children from low income families differ clearly in several characteristics, such as their location within Sweden, population size, and regional economic composition. An important direction for future research is to analyze further the underlying factors and mechanisms driving those regional differences.

A general lesson of this study is that country-wide measures of income mobility potentially say little about the state of mobility at a particular location within the country. Cross-country comparisons of income mobility should therefore be interpreted with some caution if the distribution of mobility within the countries is not known. For example, higher relative mobility in one country might be accompanied by a very large dispersion of relative mobility across regions, and could thus be less desirable than a slightly lower level of relative mobility in another country where fewer extreme mobility measures are found across regions. Finally, relative mobility measures should, if possible, be supplemented with absolute mobility measures in order to detect important differences in outcome levels that might otherwise go undetected.