Estimation of WLE
We estimate WLE using period working life tables. We specify a discrete-time Markov chain (Hoem 1977; Millimet et al. 2003), which models the year-to-year transitions of individuals between labor force states by means of transition probabilities. An alternative to the Markov chain approach is Sullivan’s method. In contrast to the incidence-based Markov chain approach, it is prevalence-based. As we are interested in trends over time, incidence-based methods are preferred. Prevalence-based approaches yield biased results when there are abrupt changes in transition probabilities over time (Dudel et al. 2018; Mathers and Robine 1997), as surely was the case in Italy over the analysis period.
From age 15 (Footnote 6) to age 99, individuals can be in one of the following states: employed, unemployed, inactive, retired, or the absorbing state dead. This means that the state space is given by all of the pairs obtained by combining ages and states, together with the absorbing state dead. The state space is thus represented by elements categorized as “aged 15 and inactive”, “aged 15 and employed”, and so forth up to “aged 99 and retired”, and “dead”. As a consequence, individuals can move through transient states only in 1-year steps, for instance from “aged 25 and inactive” to “aged 26 and employed”.
We assume that at age 99 no one in the sample is still alive, and that the individuals who are not in employment at ages 65 and older are retired (Footnote 7). Individuals receiving a pension are considered employed if they have the required amount of contributions. Individuals older than 80 are all classified as retired, since employment at older ages is quantitatively negligible.
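The state space described above can be enumerated directly. The following is a minimal Python sketch for illustration only (it is not the code used for the analysis); the names `AGES`, `TRANSIENT_STATES`, `state_space`, and `index` are hypothetical, and the reclassification rules for ages 65 and older are deliberately not applied.

```python
# Illustrative enumeration of the state space: every (age, state) pair
# for ages 15..99, plus a single absorbing state "dead".
# Assumption: no age-specific restrictions (e.g. at ages 65+) are applied.
AGES = range(15, 100)  # 15, 16, ..., 99
TRANSIENT_STATES = ("employed", "unemployed", "inactive", "retired")

state_space = [(age, s) for age in AGES for s in TRANSIENT_STATES]
state_space.append("dead")

# Position of each state, useful for placing probabilities in a matrix.
index = {state: k for k, state in enumerate(state_space)}

print(len(state_space))  # 85 ages * 4 states + 1 absorbing state = 341
```

Under these assumptions, the number of transient states is 340, which would be the dimension m of the matrices used in the Markov chain.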
Then, if \(X_t\) is the labor force state at time t,
$$P(X_{t+1} = i \mid X_t = j) = p_{ij} \tag{1}$$
is the transition probability from state j to state i in a single time interval; in this case, 1 year. The transition probabilities govern the movements of individuals across labor market states.
Individuals aged 15 begin inactive, and those who survive up to age 16 can stay inactive or move to any other transient state; the process continues at each successive age. At each age, transition probabilities sum up to the corresponding survival probability. For instance:
$$p(15, \text{I}) = p(\text{I} \mid 15, \text{I}) + p(\text{E} \mid 15, \text{I}) + p(\text{U} \mid 15, \text{I}) + p(\text{R} \mid 15, \text{I})$$
is the probability that an individual who is inactive at age 15 survives to age 16. That is, the survival probability is the sum of the probabilities of being inactive, employed, unemployed, or retired at age 16, given that the individual was inactive at age 15. From age 15 to age 19, persistence in the inactive state is high, so in this age group the probability of remaining inactive nearly equals the survival probability.
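As a small numerical illustration of this identity, the sketch below uses invented probabilities (they are not estimates from the data) and the hypothetical name `p_from_15_inactive`. It shows that the probability of moving to the absorbing state is whatever remains after summing over the transient destinations.

```python
# Hypothetical one-year transition probabilities for an individual who is
# "aged 15 and inactive" (values invented for illustration only).
p_from_15_inactive = {
    "inactive": 0.9700,
    "employed": 0.0250,
    "unemployed": 0.0047,
    "retired": 0.0000,
}

# Survival probability = sum over the transient destination states.
survival = sum(p_from_15_inactive.values())

# The remainder is the probability of moving to the absorbing state "dead".
p_dead = 1.0 - survival

print(round(survival, 4), round(p_dead, 4))
```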
To estimate the time spent in the different states using the Markov model, we calculate the fundamental matrix (Kemeny and Snell 1983; Taylor and Karlin 1998), with entries capturing WLE, time spent in retirement, and so on. To calculate this matrix, the transition probabilities are organized in the transition matrix \(P = [p_{ij}]\), separately by gender, region, and occupation. As the state space includes all ages and movements are possible only in 1-year steps, the matrix P has a sparse block structure: the only non-zero blocks are those linking each age to the next, and all other blocks are zero matrices.
Using the transition matrix, the fundamental matrix is obtained by:
$$N = (I_s - U)^{-1} \tag{2}$$
where \(I_s\) is an m-by-m identity matrix, and U is the m-by-m matrix of transition probabilities among the transient states (m = n − 1, the number of states excluding the absorbing state dead). The superscript −1 denotes the matrix inverse. The entry of N in the ith row and jth column gives the expected time spent in state i starting from state j. Given our model, this includes, for instance, the time spent in employment starting from the inactive state.
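To make the calculation in Eq. 2 concrete, the following Python sketch builds a toy version of U with only three ages and two transient states (E and I) and derives N from it. All probabilities are invented, and the names `U`, `N`, `pos`, and `wle` are hypothetical. Instead of a general matrix inversion, the sketch uses the finite series I + U + U² + …, which equals the inverse in Eq. 2 here because nobody remains in the model beyond the last age, so higher powers of U vanish.

```python
# Toy fundamental matrix: 3 ages x 2 transient states (E = employed, I = inactive).
AGES = [15, 16, 17]
STATES = ["E", "I"]
labels = [(a, s) for a in AGES for s in STATES]
m = len(labels)
pos = {lab: k for k, lab in enumerate(labels)}

# Hypothetical, age-invariant probabilities keyed by (destination, origin).
# Each origin's probabilities sum to 0.99, i.e. a 1% chance of death.
p = {("E", "E"): 0.80, ("I", "E"): 0.19, ("E", "I"): 0.30, ("I", "I"): 0.69}

# U[i][j] = probability of moving from state j to state i (columns are origins).
# Transitions only go from age a to age a + 1; the last age has no exits to
# transient states, mirroring the assumption that no one survives past it.
U = [[0.0] * m for _ in range(m)]
for a in AGES[:-1]:
    for (dest, orig), prob in p.items():
        U[pos[(a + 1, dest)]][pos[(a, orig)]] = prob

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# N = I + U + U^2; powers of U beyond len(AGES) - 1 are zero in this toy model.
N = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
power = [row[:] for row in N]
for _ in range(len(AGES) - 1):
    power = matmul(power, U)
    N = [[N[i][j] + power[i][j] for j in range(m)] for i in range(m)]

# Expected years in employment for someone starting "aged 15 and inactive":
# sum the entries of N in the employed rows of that starting column.
start = pos[(15, "I")]
wle = sum(N[pos[(a, "E")]][start] for a in AGES)
total = sum(N[i][start] for i in range(m))
print(round(wle, 4), round(total, 4))
```

In this toy example the column of N for the starting state sums to the total expected time spent in the model, which plays the role of remaining life expectancy over the three ages considered.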
We provide estimates of the expected number of years spent in employment (WLE), unemployment, inactivity, and retirement from age 15 onward. This means, for instance, that our estimates of WLE show the years of life a 15-year-old can, on average, expect to spend in employment. The expected years of life spent in the different states sum up to remaining life expectancy. Furthermore, we decompose WLE by broad age groups (19 or younger; 20–29; 30–39; 40–49; 50–59; 60 and older). This allows us to assess how changes in WLE are distributed across age groups, and whether specific groups, such as older or younger individuals, experienced greater changes in WLE than other groups.
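The decomposition by age group amounts to summing the age-specific expected years in employment within each age band. The sketch below illustrates this with an invented age profile; `expected_employed` stands in for the employed-row entries of the fundamental matrix for a person starting at age 15, and all names and values are hypothetical.

```python
# Hypothetical expected years in employment at each single age (15..99),
# as would be read from the relevant column of the fundamental matrix.
expected_employed = {
    age: (0.7 if 25 <= age <= 59 else 0.2 if age < 65 else 0.0)
    for age in range(15, 100)
}

# Age groups used for the decomposition (inclusive bounds).
AGE_GROUPS = {
    "19 or younger": (15, 19),
    "20-29": (20, 29),
    "30-39": (30, 39),
    "40-49": (40, 49),
    "50-59": (50, 59),
    "60 and older": (60, 99),
}

wle_by_group = {
    name: sum(expected_employed[a] for a in range(lo, hi + 1))
    for name, (lo, hi) in AGE_GROUPS.items()
}

# The group contributions add up to total WLE at age 15.
wle_total = sum(expected_employed.values())
print({k: round(v, 2) for k, v in wle_by_group.items()}, round(wle_total, 2))
```

Because the groups partition the age range, a change in total WLE between two periods can be attributed to the groups by differencing the group-specific sums.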
Transition Probabilities
We estimate the transition probabilities by means of multinomial regressions (Agresti 2003; Allison 1982). We model the probability of being in state i at time t + 1 as a function of the state j at time t, age, sex, occupational category, and macro area of residence (Footnote 8). Moreover, as the data used in the analysis span the years 2003–2013, a dummy variable for each period from 2004 to 2012 is included in the models to control for the time trend; 2003 is used as the reference period, while 2013 is not included, as we model transitions from t to t + 1. In each of the regressions, age is modelled as a cubic spline to account for non-linearity in the relationship between age and the probability of being in state i.
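One way to write such a model, assuming a baseline-category logit parametrization (the exact specification is not stated here, so the symbols below are illustrative), is

$$P(X_{t+1} = i \mid X_t = j, a, \mathbf{z}, t) = \frac{\exp(\eta_i)}{\sum_{k} \exp(\eta_k)}, \qquad \eta_i = \alpha_i + \beta_{ij} + f_i(a) + \boldsymbol{\gamma}_i^{\top}\mathbf{z} + \delta_{i,t},$$

where \(\eta_i\) is fixed at zero for a reference destination state, \(\beta_{ij}\) captures the effect of the origin state j, \(f_i(a)\) is a cubic spline in age a, \(\mathbf{z}\) collects the remaining covariates, and \(\delta_{i,t}\) are the period effects with \(\delta_{i,2003} = 0\). When models are fitted separately on subsets of the data, the covariates that define the subsets drop out of \(\mathbf{z}\).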
The sample size allows us to estimate regression models separately by gender and age group, as well as by occupational category and by macro area of residence. This approach introduces implicit interactions between the variables that were used to split the data, and helps to keep the computations feasible, as they would otherwise be very time intensive. The age groups used to split the data are broader than those chosen to decompose WLE: from 15 to 29, from 30 to 64, and from 65 to 99. These groups roughly correspond to ages at entry into the labor market, working ages, and retirement ages through the end of life.
Estimating the transition probabilities as described above yields estimates for transitions from t = 2003 to t + 1 = 2004, t = 2004 to t + 1 = 2005, and so on. Each of these sets of transition probabilities is plugged into the Markov chain approach to derive estimates of WLE. This means that, for instance, the transitions from 2003 to 2004 are used to derive an estimate of WLE for the same period. The result follows a so-called period perspective, which is commonly used for WLE (e.g., Hayward and Lichter 1998). It shows how long the working life of a hypothetical cohort would be if the conditions captured by the transitions from t to t + 1 prevailed over a lifetime. Thus, period WLE provides a summary of the labor market, and the time series of estimates that we calculate in this paper allow us to assess how the labor market changes over time. The other measures we provide in this paper, such as the years of life spent in retirement, also follow a period perspective.
We classify the years in which the employment rate was declining and the unemployment rate was increasing as periods of recession. It is important to note, however, that this definition of recession differs from the definition of recession commonly used in economics: i.e., two consecutive quarters of negative economic growth. Labor Force Statistics (ISTAT 2017) show that in Italy, the decreasing trend in the total employment rate started after the second quarter of 2008. Following a period of stability between 2010 and 2011, the decreasing trend resumed starting with the second quarter of 2012, and continued until the end of the period covered by our data (2013). This implies that in our analysis, the beginning of the recession is covered by the three 2-year periods of 2007–2008, 2008–2009, and 2009–2010.
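The classification rule can be expressed as a simple comparison of consecutive years. The sketch below is illustrative only: the rates in `rates` are invented placeholders and are not ISTAT figures, and the names `rates` and `recession` are hypothetical.

```python
# Hypothetical annual (employment rate, unemployment rate) in percent.
# These numbers are placeholders for illustration, NOT actual ISTAT data.
rates = {
    2006: (58.0, 6.8),
    2007: (58.5, 6.1),
    2008: (58.2, 6.7),
    2009: (57.0, 7.8),
    2010: (56.8, 8.4),
}

years = sorted(rates)
recession = {}
for prev, curr in zip(years, years[1:]):
    emp_falls = rates[curr][0] < rates[prev][0]
    unemp_rises = rates[curr][1] > rates[prev][1]
    # A period t -> t + 1 is flagged only if both conditions hold.
    recession[(prev, curr)] = emp_falls and unemp_rises

print(recession)
```

With real data, the same rule would be applied to the observed rates for each pair of consecutive years covered by the transitions.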
All of the analyses were performed by means of the statistical language R (R Core Team 2016). We used the package readr (Wickham et al. 2016) to import the raw data, data.table (Dowle and Srinivasan 2017) for data management, VGAM (Yee 2010) to estimate the multinomial models, and ggplot2 (Wickham 2009) to create the figures.
Adjustments of Transition Probabilities
Since the sample is based on register data, it collects information on individuals with contribution histories in the occupational pension schemes covered by INPS. This approach has two major implications that need to be addressed. First, the sample includes only individuals with at least one contribution record. This means that individuals who have never worked or who made contributions to funds not covered by the sample are not included in the data. Second, individuals previously included in the sample are lost to the follow-up if they either moved abroad or started contributing to funds not covered by the data. We address these problems in the following way.
First, including only individuals with contribution records implies that the transition probability estimates from inactivity to employment are likely to be biased upward. Furthermore, it is reasonable to assume that, especially as individuals grow older, those who already have work experience are more likely than those who do not to transition from inactivity to employment. This problem does not affect the estimates for transitions from all other states, such as from unemployment to employment or from employment to retirement. Thus, we address this issue by estimating the transition probabilities from inactivity to employment by means of a nonparametric approach. We use the LoSaI sample to estimate the transitions from inactivity to employment by age and sex, weighting our observations by the inverse of the inclusion probability in the sample, 365/24. Then, we estimate the whole inactive population, by age and sex, by means of the publicly available data of the Italian Institute of Statistics (ISTAT), “Aspects of daily life”. The resulting estimate of the inactive population may be assumed to be the correct number of individuals who have the potential to transition from inactivity to other states.
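The logic of this adjustment can be sketched as follows for a single age-sex cell. This is one possible reading of the procedure, stated as an assumption: the weighted number of sample transitions is divided by the external estimate of the inactive population. The counts are invented, and the names `sample_transitions` and `istat_inactive` are hypothetical.

```python
# Inverse-probability weight implied by the sample design described in the text.
INCLUSION_PROB = 24 / 365
WEIGHT = 1 / INCLUSION_PROB  # 365/24, roughly 15.2

# Hypothetical inputs for one age-sex cell (invented for illustration).
sample_transitions = 120   # inactivity -> employment moves seen in the sample
istat_inactive = 40_000    # inactive population in the cell from external data

# Assumption: scale sample transitions up to the population level, then
# divide by the full inactive population at risk of making the transition.
weighted_transitions = sample_transitions * WEIGHT
p_inactive_to_employed = weighted_transitions / istat_inactive

print(round(weighted_transitions, 1), round(p_inactive_to_employed, 4))
```

Using the external population as the denominator lowers the estimated probability relative to a denominator based only on sample members, which is the direction of correction the text calls for.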
A second adjustment is applied to the survival probabilities resulting from the multinomial models. This adjustment is needed because the survival probabilities estimated within the regression models can result in unrealistically high life expectancies. We use the life tables provided by ISTAT at both the whole population and the macro area levels to replace the parametric survival estimates with adjusted estimates. These are obtained by decomposing the survival probabilities provided in the life tables according to the survival distribution by occupational status observed in the sample. The detailed procedure is described in Dudel and Myrskylä (2017).