1 Introduction

Pension reforms are a subject of intense controversy in many countries. In Europe, where the ratio of retirees to workers will continue to increase until 2050, financial unsustainability threatens public pay-as-you-go (PAYG) systems. However, this is not the only concern of policy makers. Several studies and advisory commissions point out the dangers of declining social welfare and increasing inequality between different groups in the economy after the implementation of pension reform.Footnote 1 They worry about the “social sustainability” of reformed pension systems and aim to prevent reforms from affecting these groups unequally.Footnote 2

According to a report from the OECD (2017), inequality emerges along the life cycle within and across generations as a composite of inequalities arising from birth and through the life course of individuals (see, e.g., Huggett et al. (2011)). This inequality may be reinforced by population aging and by policy reforms that were precipitated by it. Individuals from different generations and with different health characteristics, productivities and preferences are not only differentially affected by the policy reforms but also react differently to the same incentives produced by the reforms. For example, they do not profit the same way from the beneficial macroeconomic changes induced by the reforms.

The paper uses a unified dynamic modelling framework to analyze pension reform, behavioral reactions, their economic effects, and policy makers’ and voters’ choices in an environment in which individuals are heterogeneous in many dimensions.Footnote 3 We calibrate this model to a benchmark situation that is typical for continental Europe with its strong aging process, namely, a weighted mix of the situation in the three largest European economies: France, Germany, and Italy. We then take as examples PAYG pension reforms that have been implemented or proposed in the last decade in different countries. We compare these reforms with respect to four criteria: financial sustainability of the pension system, social welfare, intra-generational equality, and inter-generational balance. Finally, we let a social planner and the voters decide on an optimal policy mix.

We show that the trade-offs among the welfare of individuals, income inequality between and within generations, and the financial stability of the system are multifaceted. The evaluation of these trade-offs is further complicated by the fact that there is no universally accepted measure of aggregate welfare to define a socially optimal reform. In addition, the outcome of social welfare functions tends to differ from the outcome of voting processes. We show that the policy makers’ decision processes concerning the merits of a policy reform are far from being clear-cut. All this speaks in favor of broad policy mixes rather than single-specific policies.

Our theoretical framework is based on a rich overlapping generations (OLG) model, which describes the transition from our baseline scenario to several reform scenarios. The model’s set-up follows the existing literature (Sánchez-Martín 2010; Catalan et al. 2010; Fehr et al. 2012; Kitao 2014) and more recently Schön (2023) and Tamai (2023), but we provide a much broader evaluative analysis that is missing in the literature so far. The main novelty of our paper is to use this OLG framework to shed qualitative and quantitative light on the trade-offs between financial sustainability, social welfare, and equality between and within generations in a society that has several dimensions of heterogeneity. Such heterogeneity within each cohort is modeled by individuals who differ in their productivity, their health and life expectancy, their fixed costs of working, and their preferences for consumption and leisure. We take into account key endogenous individual decisions, equilibrium macroeconomic dynamics, and projected changes in the demographic structure. Endogenizing household decisions such as saving and labor supply—both on the extensive and intensive margins—is essential for policy comparisons because pension reforms trigger different reactions from heterogeneous individuals, which typically dampen the intended objectives of the reform (“backlash effects”, Börsch-Supan et al. (2014)).

Our benchmark is a defined benefit (DB) PAYG pension system, which has a strong link connecting each individual’s lifetime contributions to the pension system and the benefits emanating from it when retired. Examples of such systems are those of France, Germany, Italy, Portugal, and Norway (Social Security Administration 2014). These systems typically have a full pensionable age (FPA) of 65 years and adjust benefits to the actual retirement age at less than actuarially neutral rates.Footnote 4

The choice of policies to be analyzed is motivated by their differential effects between and within generations.Footnote 5 Two reforms address the effective retirement age: (1) increasing the FPA by partially indexing it to life expectancy and (2) increasing the actuarial adjustment rates for retiring earlier or later than the FPA from the currently low to the actuarially neutral level. Another two reforms change the replacement rate of the pension system: (3) introducing a sustainability factor that links the replacement rate to changes in the population structure and to employment growth, and (4) making the PAYG system redistribute in favor of individuals with low income. Finally, we analyze two combinations of these reforms, namely, (5) a combination of the first three reforms, which only indirectly change the redistributive character of the benchmark pension system, and (6) a combination of all four specific reforms.Footnote 6

Our paper brings three strands of literature together in a unified framework. The first strand comprises papers that explore the effects of pension reforms on financial sustainability and welfare (Fehr et al. 2012; Kitao 2014; Sánchez-Martín 2010; Kotlikoff et al. 2007). A second strand of the literature digs into the inequality and redistributive effects of pension systems (Hurd and Shoven 1985; Weizsaecker 1995; Etgeton 2018; Lee et al. 2019; Sanchez-Romero and Prskawetz 2017; Huggett and Ventura 1999; Hairault and Langot 2008). A third strand of papers addresses the political economy and political feasibility of pension reforms (Persson and Tabellini 2002; Galasso 2007; Casamatta and Batté 2017). So far, these three strands of the literature do not overlap sufficiently well for a comprehensive analysis of pension reform. Our paper fills this gap by combining a primary focus on the inequality and redistributive impacts of reforms, and the trade-offs they involve, with the traditional analysis of the impact of reforms on sustainability and overall welfare, taking account of political feasibility.

Our paper also relates to several fields in the large literature on pension reform objectives. By paying close attention to pension reforms that affect intensive and extensive labor force participation, we speak to the literature that studies reforms promoting more active aging and a longer working life (Graf et al. 2011; Börsch-Supan 2007; Huber et al. 2013; Sonnet et al. 2014; World Bank 1994; OECD 2017; Börsch-Supan and Schnabel 1998). Specifically, the reforms modeled in our paper change the labor supply incentives inherent in pension systems (Gustman and Steinmeier 2005; Duggan et al. 2007; Kotlikoff et al. 2007; Gruber and Wise 1999; Börsch-Supan et al. 2018a; Börsch-Supan et al. 2014).Footnote 7 We relate to reforms increasing the full pensionable age in Germany or Italy (Börsch-Supan 2007; Boeri et al. 2016) and the introduction of flexible retirement mechanisms (Börsch-Supan et al. 2018b; Gustman and Steinmeier 2005). The main elements of the first three reforms considered in this paper, taken together, are key components of more profound changes of the pension system such as the introduction of notional defined contribution (NDC) systems in Sweden and Italy (Palmer 2000; Moscarola and Fornero 2009).

We find that making benefit adjustments to retirement age actuarially neutral improves intra-generational equity. It is also the most popular reform for the rational individuals in our model. However, it does not improve inter-generational equity. The introduction of a sustainability factor secures financial sustainability in the long run and reduces inter-generational imbalances but produces a large negative impact on lifetime utility for older generations as it only increases lifetime utility for cohorts entering the labor market in the future. Intra-generational imbalances are only slightly affected. As expected, this reform is unpopular among the more shortsighted voters. Even less popular is an increase of the retirement age. In contrast, introducing a more redistributive scheme leads to a substantial reduction of intra-generational income inequalities in terms of pension and labor income. However, due to feedback effects on saving behavior, the effect of this policy on total income is much smaller than with respect to earnings-related income. Detecting these feedback effects shows the value of our modeling approach. Making pensions more redistributive does not solve the sustainability problems; it actually magnifies the budgetary deficit of the pension system due to backlash effects on retirement decisions. We conclude that none of the four single reforms satisfies all the criteria of financial and social sustainability. We show that a combination of policies is not only effective in bringing up a compromise between the different goals but also maximizes social welfare and voters’ approval.

The remainder of the paper is structured as follows: Section 2 introduces the model and its components. The numerical solution of the model and the calibration procedures are described in Section 3. The benchmark scenario is presented in Section 4. Section 5 compares the impact of each reform in terms of the four criteria: financial sustainability, intra-generational and inter-generational equity, and social welfare. Section 6 concludes.

2 The model

We extend the OLG model of the Auerbach and Kotlikoff (1987) type in several dimensions. We add a model of a detailed earnings-related DB-PAYG public pension scheme. We include monetary incentives for early or late retirement through the adjustment of pension benefits. We allow for a discrete endogenous choice on retirement in addition to the continuous leisure/work and consumption/saving trade-offs. We introduce heterogeneity among household types along four dimensions: productivity, life expectancy, health-related fixed costs of working, and preferences for consumption and leisure. This detailed setting allows for analyses of the differential effects of various reforms on the four outcome criteria within the same modelling framework.

The household problem

There are K different types of perfectly foresighted households at every point in time t with age j.Footnote 8 The household types differ by productivity, resulting in four lifetime-income classes; by their consumption/leisure preferences, resulting in three different consumption profiles over the life course; by their survival probabilities, resulting in three categories of life expectancy; and by their costs of working, parametrized by three categories of health.Footnote 9 The resulting 108 different household types are displayed in Table 1 and specified in more detail in Section 3.1.

Table 1 Household types

Life in our model starts with entering the labor market, which is set at age 20. We index cohorts by the year of labor market entry. Households have uncertainty about the time of death and have their life expectancy determined by the prevailing survival rates. For computational convenience, we set a maximum age of J years, measured from age 20 onwards.

Households have preferences over consumption and leisure. Household k receives utility from consumption and leisure as given by the following per-period utility function

$$u\left({c}_{t,j}^k,{l}_{t,j}^k\right)=\frac{1}{1-\theta }{\left[{\left({c}_{t,j}^k\right)}^{\phi_j^k}{\left({l}_{t,j}^k\right)}^{1-{\phi}_j^k}\right]}^{1-\theta }.$$
(1)

where u is twice continuously differentiable, strictly increasing in consumption and leisure, and strictly concave. \({\phi}_j^k\) is the intra-temporal elasticity parameter between consumption and leisure, i.e., the weight on consumption, and is household type- and age-dependent. Risk aversion is described by the parameter θ. The time endowment is normalized to one. Leisure is equal to time endowment less hours worked \({h}_{t,j}^k\) and less the costs of participating in the labor market, which are associated with age-related health deterioration and the burden of work for health itself (Kitao 2014):

$${l}_{t,j}^k=1-{h}_{t,j}^k-\uppsi\ {\upchi}_j^k.$$
(2)

where \({\upchi}_j^k\) replicates the physiological aging process as in Dalgaard and Strulik (2014). We define ψ as the intensity parameter by which this aging process is transformed into the fixed costs of working.
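As a minimal illustrative sketch (not the code used for the simulations), Eqs. 1 and 2 can be evaluated as follows. The function names and the example values for hours, ψ, and χ are assumptions made for illustration; θ = 2 and the starting value 0.665 for the consumption weight are the values reported in Section 3.2.

```python
def leisure(hours, psi, chi):
    """Eq. 2: unit time endowment less hours worked and the fixed health cost."""
    return 1.0 - hours - psi * chi


def period_utility(c, l, phi, theta):
    """Eq. 1: CRRA transform of a Cobb-Douglas aggregate of consumption and leisure."""
    if c <= 0 or l <= 0:
        raise ValueError("consumption and leisure must be positive")
    if theta == 1.0:
        raise ValueError("Eq. 1 is not defined at theta = 1")
    composite = c ** phi * l ** (1.0 - phi)
    return composite ** (1.0 - theta) / (1.0 - theta)


# Hypothetical inputs: hours, psi and chi are made up for illustration only.
l = leisure(hours=0.4, psi=0.5, chi=0.1)
u = period_utility(c=1.0, l=l, phi=0.665, theta=2.0)
```

With θ > 1, utility levels are negative, so comparisons across scenarios should rely on differences or consumption-equivalent measures rather than on the sign of u.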

Given these conditions, a household of type k entering the labor market at time t maximizes lifetime utility

$$\mathit{\max}\ \sum\nolimits_{j=0}^J{\beta}^j{\pi}_{t+j,j}^k{u}^k\left({c}_{t+j,j}^k,{l}_{t+j,j}^k\right),$$
(3)

where \({\beta}^j=1/{\left(1+\rho \right)}^j\) is the discount factor for discount rate ρ, and

$${\pi}_{t,j}^k={\prod}_{i=0}^j{\varphi}_{t+i,i}^k$$
(4)

is the type-dependent unconditional survival probability at time t and \({\varphi}_{t,j}^k\) is the corresponding conditional survival probability. We do not include intended bequests in our model and assume that accidental bequests resulting from premature death are taxed away by the government at a confiscatory rate and are used for otherwise neutral government consumption.
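The following sketch illustrates how Eqs. 3 and 4 fit together, under simplifying assumptions: the discount factor is taken to be β = 1/(1+ρ) raised to the power j, the conditional survival probabilities of one household type are passed as a plain list ordered by age, and the function names are hypothetical.

```python
from itertools import accumulate
from operator import mul


def unconditional_survival(conditional):
    """Eq. 4: running product of conditional survival probabilities."""
    return list(accumulate(conditional, mul))


def lifetime_utility(utilities, conditional, rho):
    """Eq. 3: period utilities weighted by beta^j and by unconditional survival."""
    beta = 1.0 / (1.0 + rho)
    pi = unconditional_survival(conditional)
    return sum(beta ** j * p * u for j, (p, u) in enumerate(zip(pi, utilities)))


# Hypothetical three-period example with no discounting.
pi = unconditional_survival([1.0, 0.9, 0.5])
value = lifetime_utility([1.0, 1.0, 1.0], [1.0, 0.9, 0.5], rho=0.0)
```

In the example, survival weights alone reduce the value of three identical periods from 3 to about 2.35, which is why differences in life expectancy across household types matter for the welfare comparison.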

Wages depend on age and household type

$${w}_{t,j}^k={w}_t{\varepsilon}_j^k,$$
(5)

where \({\varepsilon}_j^k\) generates age- and type-specific wage profiles. The dynamic budget constraint is given by

$${a}_{t+1,j+1}^k={a}_{t,j}^k\left(1+{r}_t\right)+{h}_{t,j}^k{w}_{t,j}^k\left(1-{\tau}_t\right)+{p}_{t,j}^k-{c}_{t,j}^k$$
(6)

with

$$0\le {h}_{t,j}^k\le 1-\uppsi\ {\upchi}_j^k\ \textrm{and}\ {c}_{t,j}^k>0.$$
(7)

where \({a}_{t,j}^k\) denotes assets, \({p}_{t,j}^k\) is pension benefits, and \({\tau}_t\) is the contribution rate of the public pension system described in Section 2.1.
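A one-period step of the budget constraint in Eq. 6, together with the feasibility conditions in Eq. 7, can be sketched as below. The function names and all numerical inputs are illustrative assumptions.

```python
def next_assets(a, r, h, w, tau, p, c):
    """Eq. 6: assets carried forward after returns, net labor income, pension, and consumption."""
    return a * (1.0 + r) + h * w * (1.0 - tau) + p - c


def feasible(h, c, psi, chi):
    """Eq. 7: hours cannot exceed the time left after fixed costs; consumption is positive."""
    return 0.0 <= h <= 1.0 - psi * chi and c > 0.0


# Hypothetical worker: no pension income yet, 20% contribution rate.
a_next = next_assets(a=10.0, r=0.03, h=0.5, w=2.0, tau=0.2, p=0.0, c=1.0)
ok = feasible(h=0.5, c=1.0, psi=0.5, chi=0.1)
```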

An important feature of our model is the endogenous retirement decision. Households choose to retire within a “window of retirement” between an earliest eligibility age RE and a latest retirement age RL. We denote the retirement age that individuals of type k have chosen at time t by \({R}_t^k\), \({R}_E\le {R}_t^k\le {R}_L\). We abstract from partial retirement, bridge jobs, return from retirement, and disability insurance. We also assume that the labor market exit coincides with claiming pension benefits.Footnote 10 The retirement age chosen by the household is a by-product of the main optimization routine as explained in Appendix C. Footnote 11

2.1 The public pension system

Our benchmark pension system is a DB-PAYG pension system. It includes relevant characteristics of different pension systems in continental Europe allowing for a generalization of our results and most closely resembles the French and German point systems with their strong links connecting each individual’s lifetime contributions to the pension system with the benefits emanating from it when retired (“career average plan”).

DB means that a cohort of retirees is promised a pension benefit \({p}_{t,j}^k\), which is defined by a replacement rate bt that is set by the pension policy and not necessarily dependent on the demographic and macroeconomic environment. The contribution rate to the system is then adjusted to keep the PAYG system balanced. This set-up puts the highest financial burden of population aging on the young generation.

We abstract from a reserve fund such that the budget equation is assumed to be balanced in each year:

$${\tau}_t{w}_t{\sum}_{k=1}^K{\sum}_{j=1}^{R_t^k}{\varepsilon}_j^k{h}_{t,j}^k{N}_{t,j}^k={\sum}_{k=1}^K{\sum}_{j={R}_t^k+1}^J{p}_{t,j}^k{N}_{t,j}^k$$
(8)

where \({N}_{t,j}^k\) represents the number of people aged j at time t and in household type k.
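Solving Eq. 8 for the contribution rate makes the burden-shifting mechanism explicit. In the sketch below, each list holds one entry per (household type, age) cell; the function name and the numbers are hypothetical.

```python
def balanced_contribution_rate(wage, eff_hours, contributors, pensions, pensioners):
    """Solve Eq. 8 for tau_t: total pension benefits divided by the contribution base."""
    base = wage * sum(eh * n for eh, n in zip(eff_hours, contributors))
    benefits = sum(p * n for p, n in zip(pensions, pensioners))
    return benefits / base


# Hypothetical economy: two working cells, one retired cell.
tau_before = balanced_contribution_rate(1.0, [0.5, 0.5], [100, 100], [0.4], [50])
# Same benefits per head, but 50% more pensioners.
tau_after = balanced_contribution_rate(1.0, [0.5, 0.5], [100, 100], [0.4], [75])
```

With benefits per pensioner held fixed, the contribution rate rises one-for-one with the number of pensioners per unit of the contribution base (from 20% to 30% in this example).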

Individual pension benefits \({p}_{t,j}^k\) are computed by multiplying four elements:

$${p}_{t,j}^k={\gamma}_{R_t^k}^k\cdot \kern0.5em {b}_t\cdot \kern0.5em {w}_t{\overline{h}}_t\cdot \kern0.5em \frac{s_{t,{R}_t^k}^k}{R_t^k}$$
(9)

Two elements are economy-wide averages:

  (a) \({b}_t\) is the economy-wide replacement rate set by the pension policy.

  (b) \({w}_t{\overline{h}}_t\) denotes average earnings.

The other two elements are individual specific and depend on the retirement age \({R}_t^k\) that individuals of type k have chosen at time t:

  (c) \(\frac{s_{t,{R}_t^k}^k}{R_t^k}\) is the number of pension points at retirement age \({R}_t^k\), averaged over the working life.Footnote 12

  (d) \({\gamma}_{R_t^k}^k\) adjusts pension benefits to the chosen retirement age.

The earnings points \({s}_{t,j}^k\) represent the pension claims that are accumulated in a career average plan. They evolve over the life course according to

$${s}_{t+1,j+1}^k={s}_{t,j}^k+\frac{\varepsilon_j^k{h}_{t,j}^k}{{\overline{h}}_t}$$
(10)

such that individuals receive one pension point if they receive exactly the average earnings in a given year t. Since average productivity εt is normalized to one, average earnings are given by

$${\overline{h}}_t=\frac{\sum_{k=1}^K{\sum}_{j=1}^{R_t^k}{\varepsilon}_j^k{h}_{t,j}^k{N}_{t,j}^k}{\sum_{k=1}^K{\sum}_{j=1}^{R_t^k}{N}_{t,j}^k}$$
(11)

Upon retirement at age \({R}_t^k\), the number of accumulated pension points determines the contribution-related component of pension benefits, which is given by

$${s}_{t,{R}_t^k}^k={\sum}_{m=0}^j\frac{\varepsilon_m^k{h}_{t-j+m,m}^k}{{\overline{h}}_{t-j+m}}\kern0.5em \textrm{with}\ j={R}_t^k.$$
(12)
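The accumulation of pension points can be sketched as a simple sum over the working life. This is an illustrative reading of Eqs. 10 and 12 in which points start at zero on labor market entry; the function name and the three-year career are assumptions.

```python
def pension_points(eff_hours_by_year, avg_hours_by_year):
    """Sum of yearly points: own efficiency-weighted hours relative to the economy-wide average."""
    return sum(own / avg for own, avg in zip(eff_hours_by_year, avg_hours_by_year))


# Hypothetical career: average, above-average, and below-average earnings years.
points = pension_points([0.5, 0.6, 0.25], [0.5, 0.5, 0.5])
```

A year with exactly average earnings adds one point, so the example career yields 1 + 1.2 + 0.5 = 2.7 points.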

The adjustment factors \({\gamma}_{R_t^k}^k\) counterbalance a longer or shorter duration of receiving pension benefits if households retire before or after the full pensionable age, which we denote by \({\overline{R}}_t\), \({R}_E\le {\overline{R}}_t\le {R}_L\). \({\gamma}_{R_t^k}^k\) is determined at retirement age and remains fixed during retirement. For simplicity, we assume a linear and symmetric schedule. \({\gamma}_{R_t^k}^k\) equals 1 if the household retires at the full pensionable age. If the household decides to retire earlier, there is a deduction of ωt percent (“adjustment rate”) of pension benefits for every year of earlier retirement. For each year of delayed retirement, there is a premium of ωt percent. Hence, \({\gamma}_{R_t^k}^k\) is given by

$${\gamma}_{R_t^k}^k=1+\left({R}_t^k-{\overline{R}}_t\right){\omega}_t\kern0.5em \textrm{for}\ {R}_t^k\ge {R}_E.$$
(13)

Occasionally, the adjustment factors \({\gamma}_{R_t^k}^k\) are referred to as “actuarial adjustments”, although the term “actuarial” only applies in a literal sense if ωt is calculated such that the present discounted value PDVt of participating in the pension system is, for all households, independent of the retirement age (“actuarially neutral”). Pension systems with benefits independent of the individual retirement age (i.e., ωt = 0) are not actuarially neutral since they redistribute income from late retirees to early retirees who receive the same benefits over a longer period of time. The same argument applies when adjustment rates are lower than the actuarially neutral value. This is the case in our benchmark countries France, Germany, and Italy (see Table 1 in Appendix B). Lower than actuarially neutral adjustment rates create strong incentives to retire early (see, e.g., Gruber and Wise 1999, 2005; Desmet and Jousten 2003; Fisher and Keuschnigg 2010 and Börsch-Supan et al. 2018a).
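To illustrate how Eqs. 9 and 13 combine, the sketch below computes the adjustment factor and the resulting benefit. The adjustment rate of 3.2% is the benchmark value reported in Section 3.2; the replacement rate, average earnings, points, and the argument name `years_worked` (standing in for the divisor in Eq. 9) are assumptions for illustration.

```python
def adjustment_factor(retire_age, full_age, omega):
    """Eq. 13: linear, symmetric deduction or premium around the full pensionable age."""
    return 1.0 + (retire_age - full_age) * omega


def pension_benefit(retire_age, full_age, omega, b, avg_earnings, points, years_worked):
    """Eq. 9: adjustment factor x replacement rate x average earnings x average points."""
    gamma = adjustment_factor(retire_age, full_age, omega)
    return gamma * b * avg_earnings * points / years_worked


# Retiring two years before a full pensionable age of 65 at omega = 3.2%.
gamma_early = adjustment_factor(63, 65, 0.032)
# Hypothetical average earner: one point per year over 43 years, b = 0.5.
benefit = pension_benefit(63, 65, 0.032, b=0.5, avg_earnings=1.0, points=43.0, years_worked=43)
```

Retiring two years early reduces benefits by 6.4% in this schedule; whether that deduction is actuarially neutral depends on survival rates and the interest rate, which this sketch does not model.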

2.2 Production sector

The production sector consists of a representative firm. Production is given by a Cobb-Douglas production function using capital stock, Kt, and aggregate labor, Lt, as inputs.

$${Y}_t={K}_t^{\alpha }{\left({A}_t{L}_t\right)}^{1-\alpha },$$
(14)

where At is technology (growing at the time-varying rate gt) and α is the capital share in the economy. Since factors earn their marginal product, wage and interest rate are given by

$${w}_t={A}_t\left(1-\alpha \right){k}_t^{\alpha },$$
(15)
$${r}_t=\alpha {k}_t^{\alpha -1}-\delta,$$
(16)

where kt denotes the capital stock per efficient unit of labor, kt = Kt/(AtLt), and δ is the depreciation rate of capital. We also introduce a wedge between the interest rate perceived by households and the market interest rate, i.e., marginal product of capital.
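The factor prices in Eqs. 15 and 16 follow directly from the production function in Eq. 14. The sketch below uses α = 0.33 and δ = 0.062 as reported in Section 3.2; the input levels and the function name are hypothetical, and the wedge between household and market interest rates is not modeled.

```python
def factor_prices(K, L, A, alpha, delta):
    """Eqs. 15 and 16: wage and net interest rate from capital per efficient unit of labor."""
    k = K / (A * L)
    wage = A * (1.0 - alpha) * k ** alpha
    interest = alpha * k ** (alpha - 1.0) - delta
    return wage, interest


# Hypothetical input levels.
K, L, A = 3.0, 1.0, 1.0
w, r = factor_prices(K, L, A, alpha=0.33, delta=0.062)
# Eq. 14: output, used to check that factor payments exhaust production.
Y = K ** 0.33 * (A * L) ** (1.0 - 0.33)
```

Because the technology has constant returns to scale, labor income plus gross capital income equals output, which is a convenient consistency check on any implementation.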

2.3 Social welfare

Social welfare is computed by aggregating the utility of two groups of cohorts. The first group are workers and retirees of type k who started their working life at time t ≤ T0 before the reform, which is supposed to be implemented at time T0. Their remaining lifetime utility after T0 is:

$${\displaystyle \begin{array}{cc}{U}_t^k=\sum_{j={T}_0-t}^{J-\left({T}_0-t\right)}{\beta}^j{\pi}_{t+j,j}^k{u}^k\left({c}_{t+j,j}^k,{l}_{t+j,j}^k\right),& t\le {T}_0\end{array}}$$
(17)

The second group are households of type k who have not yet entered the labor market but will at time t. Their lifetime utility is:

$${\displaystyle \begin{array}{cc}{U}_t^k=\sum_{j=0}^J{\beta}^j{\pi}_{t+j,j}^k{u}^k\left({c}_{t+j,j}^k,{l}_{t+j,j}^k\right),& t>{T}_0\end{array}}$$
(18)

Aggregation is done using a social welfare function with weights corresponding to the respective population shares. We acknowledge the discussion concerning the many different welfare measures proposed in the literature.Footnote 13 We will follow the standard social welfare measure in the spirit of Samuelson and Bentham. Given the framework of our model, this social welfare function (SWF) in Samuelsonian style is given by:

$${SWF}_t\left({K}_1,{K}_2,{T}_1,{T}_2\right)=\frac{1}{\left({K}_2-{K}_1+1\right)}\sum\nolimits_{k={K}_1}^{K_2}\ \sum\nolimits_{i={T}_1-{T}_0}^{T_2-{T}_0}{\beta}^i\ {\alpha}_{t+i}^k\ {U}_{t+i}^k$$
(19)

where \({U}_t^k\) is the (remaining) lifetime utility of household type k at period t, \({\alpha}_{t+i}^k\) is the corresponding population share, and \({\beta}^i\) is the discount factor.

The parameters K1, K2, T1, and T2 describe how the social welfare function puts more, or less, weight on the utility of different groups in the economy.Footnote 14 For instance, a policy maker can give preference to adults only who are working or retired. The policy maker may also include children or future generations in the welfare calculation. Furthermore, the policy maker may opt for a weighted welfare measure that accounts for all household types. Alternatively, the policy maker may choose a Rawlsian welfare measure that takes only the welfare of low-income households into account.

Specifically, K1 and K2 denote the range of household types that the policy maker takes into consideration, sorted by income. If K1 = K2 = 1, only the lowest productivity type is considered; this corresponds to a Rawlsian welfare function. Similarly, T1 and T2 denote the range of cohorts that the policy maker takes into consideration. If T1 = T0 and T2 = T0 + J, then only the remaining lifetime utilities of current workers and retirees are taken into account in determining social welfare. If T2 = T0 + J + 20, then the lifetime utilities of all those currently alive are included. If T1 > T0, then social welfare excludes the transition time of a reform between T0 and T1.
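The role of the four selection parameters in Eq. 19 can be illustrated with a small sketch. Utilities and population shares are stored as nested lists indexed by household type (sorted by income, zero-based) and by cohort offset i, which is assumed to be non-negative; the function name and the numbers are hypothetical.

```python
def social_welfare(U, share, beta, k1, k2, i_start, i_end):
    """Eq. 19: discounted, population-weighted utilities averaged over the selected types."""
    total = 0.0
    for k in range(k1, k2 + 1):
        for i in range(i_start, i_end + 1):
            total += beta ** i * share[k][i] * U[k][i]
    return total / (k2 - k1 + 1)


# Hypothetical economy: two types (low income first), two cohorts, no discounting.
U = [[-2.0, -2.0], [-1.0, -1.0]]
share = [[0.5, 0.5], [0.5, 0.5]]
utilitarian = social_welfare(U, share, beta=1.0, k1=0, k2=1, i_start=0, i_end=1)
rawlsian = social_welfare(U, share, beta=1.0, k1=0, k2=0, i_start=0, i_end=1)
```

Restricting the type range to the lowest-income type yields the Rawlsian variant, which here is lower than the measure covering both types because it ignores the better-off type.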

3 Calibration

The structural parameters of the household model are calibrated to match the most important simulated moments of our model to their empirical counterparts for the year 2017. We consider a prototypical country, which is a synthetic aggregation of the population data from the three largest continental European countries (France, Germany, and Italy, called EU3). As calibration targets, we calculate the weighted average moments for the capital-output ratio, the consumption-output ratio, average hours worked, the retirement age, and the pension system’s expenditures on pension payments as a percentage of GDP.

3.1 Household-specific age profiles

We define the maximum life span of households to be 100 years. Households enter the labor market at age 20. We distinguish K = 108 different household types.

A first dimension of heterogeneity is the level and the life course profile of productivity, resulting in the four lifetime income classes shown in Table 1. Figure 1 in Appendix A depicts the productivity profiles for each income group. There is some discussion of how these profiles evolve over the life cycle. Often, studies claim a hump-shaped profile; i.e., individual productivity first increases when young, later reaches a peak in middle age, and decreases again as a consequence of the aging process (see Altig et al. (2001), French (2005), and Huggett et al. (2011)). Some find, on the contrary, that there is no decreasing labor productivity at later ages of workers (see Göbel and Zwick 2009; Börsch-Supan and Weiss 2016; Börsch-Supan et al. 2021).Footnote 15 As an approximation for productivity, we use SHARE data and the job episodes panel to calculate cohort-corrected wage profiles (Börsch-Supan et al. 2013; Brugiavini et al. 2019). As shown in Appendix A, Fig. 1, the resulting productivity profiles increase with age at a steeper rate for higher income groups and decrease slightly after the peak.

As a second dimension of heterogeneity, we include mortality risk, which increases with age. We calculate three variants of cohort- and individual-specific unconditional survival rates \({\pi}_{t,j}^k\) using the Human Mortality Database (2016). We use these estimated unconditional survival rates to determine the conditional survival rates for the three household types. Figure 2 in Appendix A displays the estimated differences between the three groups.

A third dimension of heterogeneity is represented by the preference for consumption \({\phi}_j^k\). Figure 3 in Appendix A shows three different preference profiles. They represent the aging process, during which the preference for leisure increases, thereby reducing labor supply and eventually inducing retirement. We take a parametric approach and assume the same starting value for all household types. The first profile assumes that there is no decline. The other profiles are calibrated by level and slope over the life cycle such that both the average retirement age and the expenditures in pension payments are matched with the data.

As a fourth dimension of heterogeneity, we estimate how the health costs of participating in the labor market change with age. We use questions on physical health and cognitive functioning in waves 1, 2, and 4–7 of SHARE to create a health deficiency index as a proxy measure of the physiological aging process, which increases monotonically with health deficits (see Mitnitski et al. (2001) and Dalgaard and Strulik (2014)).Footnote 16 This index is similar to the one in Abeliansky and Strulik (2019) and in Börsch-Supan et al. (2020). Individuals who suffer from a faster health deterioration have their costs of working increase faster than individuals with better health. Figure 4 in Appendix A shows the three profiles. They are used as a proxy for the increasing fixed costs of working with age, thus reducing labor supply at the intensive margin.

Finally, we use the SHARE data to calculate weights for each household type. The corresponding sample shares are displayed in Table 2. Many sample shares are very small, especially those that are off the diagonal that runs from low income and poor health to high income and excellent health. This reflects the well-known “socio-economic gradient of health” (Marmot and Siegrist 2004) with its strong correlation between health and income. According to the Danish register data (Kallestrup-Lamb and Rosenskjold 2017), there is a gap in life expectancy of about two years between the intermediate and highest income groups, while there is a larger gap of four to four and a half years between the lowest and the intermediate income groups.

Table 2 Sample shares of the 108 household types

Solving the model is computationally intensive. To reduce running time, we compute the solution of the model in parallel for the three consumption/leisure preferences (“triples”). Furthermore, we do not include household types with small sample shares by selecting the ten triples with the largest sample shares. They are shaded in Table 2. These 30 household types represent 58% of the underlying SHARE sample.
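The type counts used above follow from the four dimensions of heterogeneity. The short enumeration below is only a bookkeeping sketch; the integer labels and variable names are hypothetical placeholders for the categories in Table 1.

```python
from itertools import product

# Number of categories per dimension, as described in Section 2.
N_INCOME, N_PREF, N_SURVIVAL, N_HEALTH = 4, 3, 3, 3

household_types = list(
    product(range(N_INCOME), range(N_PREF), range(N_SURVIVAL), range(N_HEALTH))
)

# A "triple" groups the three preference profiles that share the same
# income class, survival category, and health category.
triples = {(inc, surv, health) for inc, _pref, surv, health in household_types}

# Ten triples are retained in the simulations, each with three preference profiles.
n_selected_types = 10 * N_PREF
```

The enumeration gives 108 household types in 36 triples, of which the ten retained triples account for the 30 types used in the simulations.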

3.2 Parameters

Table 3 gives an overview of the parameters in the model.Footnote 17 The risk preference parameter θ is set to 2, which makes the household slightly risk averse and lies in the middle of estimates in the literature (see overview by Bansal and Yaron (2004) and Conesa et al. (2009)). We set the discount rate ρ to 0.0132 (see overview by Frederick et al. (2002)). It is calibrated to match the consumption-output ratio. The intra-temporal elasticity parameter between consumption and leisure, which defines the preferences for consumption \({\phi}_j^k\), is calibrated to 0.665 for all three types of individuals. The decline of the preferences for consumption of individuals belonging to the bottom and intermediate income groups is calibrated to 0.03 and 0.015, respectively, as described above. The parameter measuring the intensity of the physiological aging process aχ is calibrated to 4.5 by matching the average hours worked for all cohorts in the year 2017.

Table 3 Parameter calibration and age profiles

The capital share α in the economy is assumed to be 0.33. This is within the range found in several studies (King and Rebelo 1999). The depreciation rate of capital is calibrated at 6.2% per year to match the capital-output ratio (see, e.g., Christiano et al. (2005)). Annual productivity growth is set to its actual average values before 2017 using data from the Penn World Table and set to 1.5% after 2017 (Feenstra et al. 2015).

We choose a retirement window from RE = 60 until RL = 72. Age 60 is the earliest legal retirement age for women in several European countries (Social Security Administration 2014; Deutsche Rentenversicherung Bund 2015; OECD 2019a). While there is no legal upper bound for late retirement, we assume age 72 as the latest retirement age for computational ease, in accordance with US Social Security regulations. We assume ω = 3.2% in Eq. 13. This is the weighted average value of current adjustment rates in the EU3 countries (Appendix B, Table 1).

Demography is described by the size of each cohort, the survival of that cohort, and additions through net migration. We treat these demographic forces as exogenous. The size of the population aged j in period t is given recursively by

$${N}_{t+1,j+1}={N}_{t,j}{\varphi}_{t,j},$$
(20)

where \({\varphi}_{t,j}\) denotes the age-specific conditional survival rate. The original cohort size for cohort c depends on the fertility of women aged k at time c = t-j (here k indexes the age of the mother, not the household type):

$${N}_{c,0}={\sum}_{k=0}^{\infty }{f}_{c,k}{N}_{c,k}.$$
(21)
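The population recursion in Eqs. 20 and 21 can be sketched as a one-year update of an age vector. Net migration is left out for brevity, the last age group is assumed to die out, and the function names and numbers are illustrative assumptions.

```python
def newborns(fertility, women):
    """Eq. 21: size of the new cohort as fertility-weighted sum over women by age."""
    return sum(f * n for f, n in zip(fertility, women))


def age_population(pop, survival, births):
    """Eq. 20: move every age group one year forward, applying conditional survival."""
    survivors = [n * s for n, s in zip(pop[:-1], survival[:-1])]
    return [births] + survivors


# Hypothetical three-age population.
births = newborns([0.0, 0.1], [50.0, 40.0])
pop_next = age_population([100.0, 90.0, 50.0], [0.99, 0.9, 0.0], births)
```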

Our model is also very rich in describing population aging, which has three demographic components: past and future increases in longevity, expressed by \({\varphi}_{t,j}\); the historical transition from baby boom to baby bust, expressed by past changes of \({f}_{c,k}\); and fertility below replacement in many countries, expressed by current and future low levels of \({f}_{c,k}\). Population data, age distributions, and assumptions on projections for fertility, mortality, and migration rates are taken from the Human Mortality Database (2016). Life expectancies are also computed from life tables provided by this source.

3.3 Calibration results

Table 4 shows how well the model matches the main moments of the data. Our calibration year is 2017. We obtain an average capital-output ratio of 3.12 in the EU3 countries, close to the 3.10 observed in the data (Feenstra et al. 2015). As for the consumption-output ratio, we obtain a value of 0.81, compared to 0.75 observed in the data (Feenstra et al. 2015).Footnote 18 Average hours worked in the EU3 economy are 0.57, compared to 0.64 in the data (European Commission 2018). In addition, parameter values are chosen such that the average retirement age is close to 62.6 in the year 2017, which corresponds to the actual average exit age of 62.7 (for men) in these three European countries (OECD 2019a; Börsch-Supan et al. 2018b).

Table 4 Main calibration outcomes: macroeconomy and pension system in EU3

A main result of this exercise is that both intensive and extensive labor supply (given by average hours worked and average retirement age) are accurately matched by the model. This is important since many of the effects of policy reforms work through these two behaviors. Calibrated pension expenditures, at 14.4% of EU3 GDP, are also close to the actual 13.2% reported in OECD (2019a). We are therefore confident about the validity of the simulations that will be described in the following sections.

4 The benchmark scenario

Table 5 shows the benchmark outcomes for the four dimensions that need to be traded off in pension reform. Financial sustainability is indicated by the fictitious deficit of the pension system if contributions were counterfactually held constant in spite of population aging. Intra-generational equity is measured by the Gini coefficient between the three household types.Footnote 19 In order to quantify inter-generational equity, we calculate the average implicit tax from participating in the pension system. Finally, social welfare is computed according to the social welfare function in Eq. 19.

Table 5 Benchmarks of the four criteria

The fictitious pension deficit is defined by assuming a constant contribution rate after the year when policy reforms can be implemented, thus disabling the balancing mechanism of the pension system’s budget in Eq. 8.Footnote 20 Table 5 shows that this counterfactual deficit increases from 0% in 2020 to almost 8.3% in a time period of 30 years, while actual equilibrium contribution rates grow from 21.0% in 2020 to 34.9% in 2050 (Fig. 3 in Appendix E). This lack of financial sustainability is not only caused by the decrease of the number of contributors but also by the early average retirement ages in the lowest income class. As Fig. 5 in Appendix A shows, average retirement age among the different groups is around 63. Such early retirement and the lower intensity of hours worked, which is due to high costs of working and preferences for leisure, result in low contributions by older workers and increase the expenditures of the benchmark pension system.
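As a rough accounting illustration of this definition, the sketch below assumes that the budget in Eq. 8 (not reproduced in this section) balances contributions, i.e., the contribution rate times the wage bill, against pension expenditures in every year. Under that assumption, the fictitious deficit is the gap that opens when the rate is frozen at its 2020 value. The function names and all numbers are hypothetical, and the sketch ignores the behavioral responses that the full model captures.

```python
def balancing_rate(expenditure, wage_bill):
    """Contribution rate that balances a PAYG budget (assumed form of Eq. 8)."""
    return expenditure / wage_bill


def fictitious_deficit(expenditure, wage_bill, gdp, frozen_rate):
    """Deficit in % of GDP if the contribution rate stays at frozen_rate."""
    return 100.0 * (expenditure - frozen_rate * wage_bill) / gdp


# Made-up aggregates for a base year and a later year.
base = {"expenditure": 12.6, "wage_bill": 60.0, "gdp": 100.0}
later = {"expenditure": 20.0, "wage_bill": 62.0, "gdp": 105.0}

frozen = balancing_rate(base["expenditure"], base["wage_bill"])
deficit_base = fictitious_deficit(frozen_rate=frozen, **base)
deficit_later = fictitious_deficit(frozen_rate=frozen, **later)
rate_later = balancing_rate(later["expenditure"], later["wage_bill"])
```

By construction the deficit is zero in the base year. It turns positive once expenditures grow faster than the wage bill, which is the same pressure that pushes up the equilibrium contribution rate (`rate_later`) when the balancing mechanism is active.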

The trends of labor force participation, retirement, and contribution rates also determine income inequality within and between generations. Starting with the intra-generational dimension, Fig. 1 shows two versions of the intra-cohort Gini coefficient. Households’ labor and pension income, \({h}_{t,j}^k{w}_{t,j}^k\left(1-{\tau}_t\right)+{p}_{t,j}^k\), jointly with asset income (\({a}_{t,j}^k{r}_t\)), will be used to describe how income is distributed between household types within a specific cohort. We calculate a Gini coefficient using current income as a measure of intra-generational income inequality. A reform-driven increase of this measure would mean that households in the high-income percentile profit more from a reform than other households.

Fig. 1 Intra-cohort Gini coefficients with and without asset income

In Fig. 1, the upper line depicts labor and public pension income only, while the lower line represents the Gini index for total income, i.e., including asset income. The inequality of labor and public pension income increases only slightly over time: it rises from 0.109 to 0.114 in a century. However, income inequality including income from assets exhibits a marked increase for the cohorts entering the labor market between 2000 and 2020. These are the cohorts hardest hit by population aging. When aging reduces public pension generosity, richer households are able to save more than poorer households, which results in substantially higher interest income later and thereby increases intra-generational inequity.
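The two versions of the intra-cohort Gini coefficient can be sketched as follows. This is a minimal illustration, not the authors' procedure: it uses the standard population-weighted Gini formula based on mean absolute differences, and the household data, population shares, and function names are hypothetical.

```python
def current_income(h, w, tau, p, a, r, include_assets):
    """Net labor income plus pension, optionally plus asset income a*r."""
    income = h * w * (1.0 - tau) + p
    if include_assets:
        income += a * r
    return income


def weighted_gini(incomes, shares):
    """Gini coefficient for groups with population shares."""
    total = sum(shares)
    weights = [s / total for s in shares]
    mean = sum(wi * xi for wi, xi in zip(weights, incomes))
    # Weighted mean absolute difference over all pairs of groups.
    mad = sum(
        wi * wj * abs(xi - xj)
        for wi, xi in zip(weights, incomes)
        for wj, xj in zip(weights, incomes)
    )
    return mad / (2.0 * mean)


# Three hypothetical household types within one cohort.
households = [
    {"h": 0.50, "w": 1.0, "p": 0.0, "a": 0.5},
    {"h": 0.55, "w": 1.5, "p": 0.0, "a": 2.0},
    {"h": 0.60, "w": 2.5, "p": 0.0, "a": 6.0},
]
shares = [0.3, 0.4, 0.3]
tau, r = 0.21, 0.03

gini_without = weighted_gini(
    [current_income(tau=tau, r=r, include_assets=False, **hh) for hh in households],
    shares,
)
gini_with = weighted_gini(
    [current_income(tau=tau, r=r, include_assets=True, **hh) for hh in households],
    shares,
)
```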

Regarding inter-generational inequality, we calculate the implicit tax from participating in the pension system for each cohort at the year of entrance in the labor market. Negative numbers represent a gain from participating in the pension system. This implicit tax or gain is defined as the difference between the discounted net present value of an individual’s lifetime contributions paid during his working life and the discounted net present value of pension benefits accruing during retirement, relative to the discounted net present value of lifetime income earned:

$${\iota}_t^k=\frac{\sum_{j=t}^{R-1}\kern0.5em {\pi}_{t,j\kern0.5em }\frac{\tau_t{w}_{t,j}}{\prod_{i=t+1}^j\left(1+{r}_i\right)}\kern0.75em -\kern0.75em \sum_{j=R}^J\kern0.5em {\pi}_{t,j}\kern0.5em \frac{p_{t,j}}{\prod_{i=t+1}^j\left(1+{r}_i\right)}}{\sum_{j=t}^{R-1}\kern0.5em {\pi}_{t,j}\kern0.5em \frac{w_{t,j}}{\prod_{i=t+1}^j\left(1+{r}_i\right)}}$$
(22)

where \({\iota}_t^k\) is the implicit tax rate for the members of the cohort entering the labor market at year t, belonging to household type k. For better readability, we suppress the index k on the right-hand side of Eq. 22. R is the household type-specific retirement age \({R}_t^k\), \({\pi}_{t,j}^k\) is the unconditional survival probability, and \({r}_i\) is the market interest rate used for discounting all amounts to period t values.Footnote 21 The implicit tax \({\iota}_t^k\) is larger than zero if discounted pension benefits fall short of the discounted lifetime contributions. We weigh \({\iota}_t^k\) by the share of individuals of household type k to obtain the average implicit tax per cohort.
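Eq. 22 can be made concrete with a short Python sketch. It is an illustration under stated assumptions rather than the authors' code: all sequences are indexed by years since labor-market entry, the contribution rate is passed as a per-period sequence (a constant rate, as printed in Eq. 22, is a special case), and the names `implicit_tax` and `cohort_average` are hypothetical.

```python
def implicit_tax(wages, pensions, tau, survival, rates, retire_idx):
    """Implicit tax rate of one household type (cf. Eq. 22).

    Index s = 0 is the year of labor-market entry.
    rates[s] is the interest rate applied between s-1 and s (rates[0] unused).
    retire_idx is the first index at which a pension is received.
    """
    discount = 1.0
    contributions = benefits = earnings = 0.0
    for s in range(len(wages)):
        if s > 0:
            discount *= 1.0 + rates[s]  # cumulative product of (1 + r_i)
        if s < retire_idx:
            contributions += survival[s] * tau[s] * wages[s] / discount
            earnings += survival[s] * wages[s] / discount
        else:
            benefits += survival[s] * pensions[s] / discount
    return (contributions - benefits) / earnings


def cohort_average(taxes, shares):
    """Average implicit tax, weighted by household-type shares."""
    return sum(t * s for t, s in zip(taxes, shares)) / sum(shares)


# Toy three-period life: two working years, one retirement year.
balanced = implicit_tax(
    wages=[1.0, 1.0, 0.0], pensions=[0.0, 0.0, 0.4], tau=[0.2, 0.2, 0.0],
    survival=[1.0, 1.0, 1.0], rates=[0.0, 0.0, 0.0], retire_idx=2,
)
taxed = implicit_tax(
    wages=[1.0, 1.0, 0.0], pensions=[0.0, 0.0, 0.2], tau=[0.2, 0.2, 0.0],
    survival=[1.0, 1.0, 1.0], rates=[0.0, 0.0, 0.0], retire_idx=2,
)
```

In the first toy case benefits equal contributions, so the implicit tax is zero. In the second, half of the contributions are not returned as benefits, which gives a positive implicit tax.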

Figure 2 and Table 5 show a well-known result (see, e.g., Fenge and Werding (2004)). Current retirees, born in the 1930s, 1940s, and 1950s, have gained from participating in the DB-PAYG system. Thereafter, the average implicit tax rate increases over cohorts, starting with cohorts entering the labor market in the 1970s, reflecting that the public pension system begins to suffer from the impact of an aging population. This may motivate policy makers to flatten the implicit tax curve such that the costs of the system are better spread between generations.

Fig. 2 Implicit tax

In Appendix E, Fig. 2, we complement this indicator with the trend of the relative income position of different age groups. In the decades to come, the share of net income of older individuals will rise relative to that of the younger generations (age 20 to 55) which decreases substantially due to increasing contributions to the pension system.

Our fourth policy target is social welfare. Table 5 shows the value of the social welfare function (Eq. 19) where we include all household types k and all current workers and retirees (T1 = T0 and T2 = T0 + J). Social welfare declines by almost 6 percentage points between 2020 and 2050. This result may be interpreted as the summary costs of population aging in this benchmark case of no reforms.

5 Policy reforms

We analyze four prototypical reform proposals that have been suggested or are already implemented in European pension systems. We then add two combinations of these single reforms. The year 2020 is our starting year for the implementation of these scenarios. All individuals alive and future cohorts are aware of these policy reforms and will act according to their characteristics and stages in the life cycle.

(1) “Increase FPA by 2:1 rule”: One of the most widespread policy proposals to keep public pension systems sustainable is the increase of the full pensionable age at which people can retire without any deductions. One-off increases of the FPA are successful in reducing the fiscal imbalances of pension systems. However, increases in life expectancy are expected to continue in the future. A possible solution is an automatic mechanism that links increases of the FPA to increases in life expectancy (see, e.g., Börsch-Supan 2007; OECD 2019a, p. 42, and OECD 2021). As a rule of thumb, since an individual works approximately two-thirds of his life, an increase of 3 years in life expectancy should be divided into an increase of the FPA by 2 years and one more year spent in retirement. We call this the 2:1 rule.Footnote 22 The corresponding increases of the FPA are shown in Table 2 of Appendix B.

(2) “Actuarially neutral”: Across many OECD countries, the current adjustment rates in pension systems are below the actuarially neutral adjustment rates (cf. Queisser and Whitehouse (2006); see Table 1 in Appendix B). Since low adjustment rates create early retirement incentives and therefore threaten the sustainability of pension systems, we analyze a reform that increases the adjustment rates from their current value of 3.2% in the EU3 countries to a value closer to the actuarially neutral value of 5.3% for earlier and later retirement. We assume that adjustment rates rise linearly from their current level in 2017 to reach their final values in 2032. The aim of such a mechanism is to reduce incentives for earlier retirement and to promote exits from the labor force after the FPA, leading to an increase in the working age population and thus the volume of contributions to the pension system.

(3) “Sustainability factor”: This reform introduces a hybrid DB/DC-PAYG system, which adjusts pension benefits to demographic trends and the evolution of the wage bill. Such a mechanism was implemented or proposed in several countries (see, e.g., Börsch-Supan and Wilke (2005) for Germany and OECD (2019b) for Spain). We apply one of the possible designs for such a mechanism, where the replacement rate parameter \({b}_t\) scales the pension benefits in Eq. 9 up or down according to developments in wages and demographics.Footnote 23 The replacement rate will evolve according to

$${b}_t={b}_{t-1}\ast \frac{w_{t-1}\left(1-{\tau}_{t-1}\right)}{w_{t-2}\left(1-{\tau}_{t-2}\right)}\ast {\left(\frac{RQ_{t-2}}{RQ_{t-1}}\right)}^{\mu }.$$
(23)

where \({RQ}_t\) is the ratio of the number of retirees to the number of contributors to the pension system at time t. Accordingly, pension benefits are scaled down (up) when net wages decrease (increase) and when the quotient \({RQ}_t\) increases (decreases) over time, which is the case in times of population aging. The term \({\left(\frac{RQ_{t-2}}{RQ_{t-1}}\right)}^{\mu }\) is called the sustainability factor. This factor works similarly to the notional interest rate, a key element of notional defined contribution (NDC) pension systems that were introduced in Sweden and Italy in the 1990s. As a result, the contribution rate to the pension system has to adjust less in times of population aging since the factor automatically scales down individual pension payments.

As Börsch-Supan et al. (2017) argue, the parameter μ can be set as a political compromise between current voters’ preferences and the financial sustainability of the pension system. The parameter captures the inter-generational distribution of the demographic risk generated by population aging. Setting μ = 0 stabilizes the replacement rate of pension benefits to the older generation, while μ = 1 stabilizes the contribution rate of the younger generation. Hence, the introduction of a sustainability factor with μ > 0 makes the PAYG pension system more sustainable than the benchmark DB system by sharing the burden of an aging population between generations. We use μ = 0.25 in accordance with the political choice made in the German public pension system.
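The update rule in Eq. 23 and the role of μ can be illustrated with a small Python sketch. The function name, argument names, and all numbers are hypothetical; only the formula itself is taken from Eq. 23.

```python
def next_replacement_rate(b_prev, net_wage_lag1, net_wage_lag2,
                          rq_lag1, rq_lag2, mu=0.25):
    """Replacement rate b_t according to Eq. 23.

    net_wage_lag1 = w_{t-1} (1 - tau_{t-1}), net_wage_lag2 = w_{t-2} (1 - tau_{t-2})
    rq_lag1 = RQ_{t-1}, rq_lag2 = RQ_{t-2} (retirees per contributor)
    """
    wage_growth = net_wage_lag1 / net_wage_lag2
    sustainability_factor = (rq_lag2 / rq_lag1) ** mu
    return b_prev * wage_growth * sustainability_factor


# Made-up aging episode: net wages grow by 2%, RQ rises from 0.50 to 0.55.
args = dict(b_prev=0.60, net_wage_lag1=1.02, net_wage_lag2=1.00,
            rq_lag1=0.55, rq_lag2=0.50)
b_mu0 = next_replacement_rate(mu=0.0, **args)    # pure wage indexation
b_mu025 = next_replacement_rate(mu=0.25, **args)  # value used in the paper
b_mu1 = next_replacement_rate(mu=1.0, **args)    # full demographic adjustment
```

With a rising retiree-to-contributor ratio, a larger μ yields a lower replacement rate, so more of the demographic burden is shifted from contributors to pensioners.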

(4) “Progressive scheme”: This reform introduces a redistributive pension benefit system. It is inspired by the US system (Sanchez-Romero and Prskawetz 2017). Diakite and Devolder (2021) justify this approach and provide bounds and coefficients. The main goal of such a system is to introduce redistribution among different income groups by determining an individual-specific replacement rate according to the pensionable earnings. The calculation of pension benefits is based on the earnings position relative to a certain threshold. To make this scenario comparable to the others, we define this threshold such that the replacement rate for an individual with an average earnings history is 60%, the value in the benchmark scenario:

$${b}_t=\left\{\begin{array}{c}0.65,\kern0.5em if\kern0.75em {p}_t<\frac{{\overline{p}}_t}{1.35}\\ {}0.50+\frac{0.13}{1.25}\frac{{\overline{p}}_t}{p_t},\kern0.5em if\kern0.75em \frac{{\overline{p}}_t}{1.35}<{p}_t<{\overline{p}}_t\\ {}0.45+\frac{0.19}{1.25}\frac{{\overline{p}}_t}{p_t},\kern0.5em if\kern0.75em {\overline{p}}_t<{p}_t<2{\overline{p}}_t\\ {}\frac{1.31}{1.25}\frac{{\overline{p}}_t}{p_t},\kern0.5em if\kern0.75em 2{\overline{p}}_t<{p}_t\end{array}\right.$$
(24)

where \({p}_t={w}_t{\overline{h}}_t\frac{s_{t,j}^k}{{\overline{R}}_t}\) is the actual pensionable earnings and \(\overline{p_t}={w}_t{\overline{h}}_t{\overline{s}}_t\) is the average pensionable earnings in the economy at year t.
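The piecewise schedule in Eq. 24 can be written as a short Python function. The coefficients are copied from Eq. 24 as printed. Since the equation uses strict inequalities, the treatment of the boundary points is an assumption of this sketch (each boundary is assigned to the higher-earnings bracket), and the function name is hypothetical.

```python
def progressive_replacement_rate(p, p_bar):
    """Replacement rate b_t of Eq. 24 for pensionable earnings p.

    p_bar is the average pensionable earnings in the economy.
    Boundary values are assigned to the upper bracket (assumption).
    """
    if p <= 0 or p_bar <= 0:
        raise ValueError("earnings must be positive")
    ratio = p_bar / p
    if p < p_bar / 1.35:
        return 0.65
    if p < p_bar:
        return 0.50 + (0.13 / 1.25) * ratio
    if p < 2.0 * p_bar:
        return 0.45 + (0.19 / 1.25) * ratio
    return (1.31 / 1.25) * ratio


# Replacement rates at half, average, and four times average earnings.
b_low = progressive_replacement_rate(0.5, 1.0)
b_avg = progressive_replacement_rate(1.0, 1.0)
b_high = progressive_replacement_rate(4.0, 1.0)
```

The schedule declines with earnings: low earners receive the 65% cap, an average earner receives roughly the 60% benchmark rate, and the rate falls further in the top bracket.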

Finally, we analyze two combinations of these single reforms:

(5) “All not directly redistributive reforms”: This reform combines the automatic mechanisms included in reforms (1) and (3) with the increased incentives for later retirement created by the actuarially neutral adjustment rates implemented in reform (2). This combination essentially models a move from a DB-PAYG system to an NDC system of the Swedish type.

(6) “All reforms”: The final scenario entails the simultaneous introduction of all policy reforms including the redistributive assignment of points as described in reform (4). This will increase the equality between income groups of the same generation.

We discuss the implications of these reforms in three steps. First, we analyze the effects on sustainability and intra- and inter-generational equity, evaluated by the indicators fictitious pension deficit, Gini coefficient, and implicit tax, respectively. We then evaluate several variants of social welfare. Finally, we compute how strongly voters would support each of the six pension reforms.

5.1 Financial sustainability

Figure 3 depicts the projected trends in the fictitious pension deficit, our measure of financial sustainability (i.e., the counterfactual deficit that would arise if the contribution rate remained constant at its 2020 value), for seven scenarios: the benchmark described in Section 4, the four single reforms, and the two combinations.Footnote 24 All figures in this section follow the same scheme.

Fig. 3 Fictitious pension deficit (in % of GDP)

The differences among the scenarios are large. In the long run, the introduction of a sustainability factor has the largest effect on sustainability, while the progressive reform first increases the deficit and then remains neutral. The latter reflects the assumption of an unchanged contribution rate for the average individual. The redistributive nature of the reform collides with the aim of restoring the sustainability of the system. Adapting the retirement age to life expectancy reduces the deficit only in the long run when life expectancy is expected to be substantially higher than today. Later in time (around the 2040s), the increase of the FPA leads to an acceleration of the deficit-reducing effect, with the fictitious deficit approaching the level that would also be achieved by introducing a sustainability factor.

Increasing the actuarial adjustments to neutrality has a very strong short-run effect and reduces the expected rise in the contribution rate by roughly 4 to 5 percentage points until 2030 (Appendix E, Fig. 4). This occurs because individuals face higher deductions when retiring earlier than the FPA, incentivizing them to retire substantially later. This reduces the average number of years during which a pension is received. Moreover, it positively affects labor supply at the extensive margin, which, at the aggregate level, increases total contributions to the pension system’s budget. However, this policy comprises only a level change of adjustment rates. As time goes by, individuals retire later, but the temporary effect of later retirement and longer contributions is eroded by higher premia for people retiring after the FPA, and by lower deductions for earlier retirees.

Hence, short-run and long-run effects of the single reforms are very different. This calls for a combination of reforms. If the policy maker cares about sustainability improvements with most weight on the next decade, she would choose the scenario combining all not directly redistributive reforms. In the short run, this has a slightly larger dampening effect on the fictitious deficit than introducing all reforms. However, it is less effective in the long run, since the combination of all reforms has the largest dampening effect on the fictitious pension deficit after 2030. The deficit declines from roughly 8.2% of GDP in the benchmark scenario in 2050 to 0.9%, a decline of 7.3 percentage points. Compared to the benchmark scenario, there is a redistribution of the aging burden as pensioners receive lower pension benefits (via lower replacement rates), and younger generations work until later in life. Appendix E, Fig. 4 (lower figure), shows the decline of the replacement rate, reflecting population aging. The effect is not as strong as in a reform that only introduces a sustainability factor because the other measures prevent a stronger rise of the contribution rate, which reduces the need for a downward correction of the replacement rate. On the expenditure side, lower pension benefits are paid out, and individuals retire much later, which is reflected in a significant decline of 7 percentage points in the fictitious deficit from roughly 9.4% in the benchmark scenario in 2050 to 2.4% in this counterfactual scenario. In other words, there is a strong redistribution of the aging burden between pensioners, who receive lower pension benefits, and younger generations, who work until late in their lives. The progressivity of replacement rates decreases the average replacement rate for high-income individuals, who lose in relative terms since their pensions are adjusted downwards. Given the setting of the progressivity scheme to match the 60% replacement rate threshold for the average individual, introducing this reform together with the other three has a short-run negative effect, but it then becomes slightly positive over time since the savings on benefits paid to high-income individuals are compensated by higher benefits paid to low-income individuals.

5.2 Intra-generational equity

The impact of each reform on the balance within generations as measured by the Gini index is shown in Fig. 4. It turns out to be more complex than may be expected. This is due to several behavioral mechanisms affecting the extensive and intensive margins of labor supply as well as saving and dissaving over the life cycle.

Fig. 4 Intra-generational Gini index

We first focus on the inequality of labor and public pension income (Fig. 4, upper graph). As expected, the redistributive policy reform reduces intra-generational inequality substantially, especially in the long run. This is caused by lower net earnings of high-income individuals, driven simultaneously by lower replacement rates and lower marginal gains on hours worked, both due to high contribution rates, which remain close to the benchmark. In contrast, individuals at the bottom 20% benefit from higher replacement rates. They keep their average hours worked similar to the benchmark scenario, and have approximately the same average retirement age as in the benchmark, which implies comparatively higher pension benefits. As these effects become stronger over time, the decline of intra-generational inequality intensifies.

All this is to be expected. However, focusing on earnings-related income only misses reactions of saving behavior. The lower graph of Fig. 4 depicts total income including asset income. It shows a very different picture. If we account for asset income, intra-generational equity does not present significant differences from the benchmark scenario in spite of the progressive reform because high-income individuals save more than low-income individuals in order to compensate for the redistributive effects of the reform. Savings accumulated during working life produce substantial interest income at older ages in spite of a flat development of interest rates after population aging has peaked (Appendix E, Fig. 3).

Increasing the adjustment rates to actuarially neutral ones increases intra-generational inequality of labor and public pension income in the long term relative to the benchmark. As with the previous reform, the effect is reversed when accounting for asset income: intra-generational equity improves in the medium run. This occurs because the increase of the adjustment rates strengthens the incentives to retire later (see Fig. 9 in Appendix E). All individuals retire later, but household types with lower income and poorer health (lower capacity to work) still retire earlier than the FPA, thereby incurring penalties that reduce their pension benefits. These penalties increase their incentives to save more relative to high-income/healthier individuals (see Fig. 3 in Appendix E). Therefore, at the margin, low-income/poor-health individuals increase their saving relative to high-income/excellent-health groups. In the medium run, the flow of asset income more than compensates for the losses in other sources of income, and it is sufficiently large in the long run to offset the increasing income inequality observed when accounting only for labor and pension income.

All other single reforms have negligible effects on intra-generational inequality, and the combination of reforms shows the expected pattern.

The results document the value of a rich model with heterogeneity in several dimensions and a variety of behavioral reactions. In particular, they show that second-round effects may have a similar order of magnitude as first-round effects.

5.3 Inter-generational equity

Figure 5 depicts the reform impacts on inter-generational equity as measured by the counterfactual implicit tax rate for each cohort. Unlike the results of the previous subsection, they are straightforward.

Fig. 5 Implicit tax ratio

Examining first the single reforms, the most effective reform in counteracting inter-generational imbalances in the long run is to introduce a sustainability factor into the benefit formula (Eq. 23). This lowers the replacement rate, which hurts pensioners by decreasing their benefits. All working age individuals and future generations, in contrast, profit from higher net wages due to lower contribution rates. Additionally, the lower burden on their labor income increases incentives to work both at the intensive and extensive margin, which further strengthens the position of workers relative to pensioners. In the age group analysis depicted in Fig. 13 of Appendix E, we can confirm that the gains of retirees relative to workers in the benchmark scenario are inverted, which contributes to a decline in inter-generational inequity.

The same patterns are observed when increasing the FPA. Individuals respond by increasing hours worked (due to higher net wages) and by working longer (because of the incentives not to retire too much earlier than the FPA). Since workers and younger cohorts profit substantially more from lower contribution rates, the implicit tax is lower for younger cohorts than in the benchmark scenario. These effects become clearly visible in later years when the reform unfolds its full effects. The age group analysis in Fig. 7 of Appendix E shows that the relative income position adjusts later than in the reform that introduces a sustainability factor.

Increasing the adjustment rates to actuarially neutral values and making the pension system more redistributive have negligible effects on the inter-generational balance as they are designed to improve intra-generational equity.

While the introduction of a sustainability factor has the largest effect on flattening the implicit tax curve, a policy maker can improve on it by adding to this reform an increase of the FPA. This is why the two policy combinations generate even more inter-generational equity than introducing a sustainability factor alone.

5.4 Social welfare

So far, reforms had very different effects on financial sustainability and inter- and intra-generational equity. A social welfare function of the type described in Section 2.4 serves to aggregate these effects into a single measure. The usual narrative in economics is to imagine a social planner who makes policy choices by maximizing social welfare.

We distinguish two types of social planners. The first type maximizes the social welfare function (Eq. 19) for current workers and retirees, treating the lifetime utility of all individuals within these cohorts equally. This corresponds to T1 = T0, T2 = T0 + J, K = 3, K1 = 1, and K2 = 3. The second type of social planner includes children (in our model: living but not yet working individuals). This corresponds to setting T2 = T0 + J + 20. All other parameters are the same as in the first case.

Table 6 depicts social welfare calculated for a social planner of the first type. It is measured as the percentage difference to the benchmark case. Since reforms have long-run implications and social welfare is aggregated from the lifetime utility of individuals, reforms affect social welfare immediately. In the short run between 2020 and 2030, only the reform that makes the adjustments actuarially neutral would be appealing to a social planner. All other reforms produce losses in social welfare right after their implementation. Even 20 years after the implementation, all other single reforms produce close to zero or negative changes. One of the reasons behind the positive impact of the actuarially neutral reform is the immediate reaction of cohorts that are not yet retired. They postpone retirement to avoid higher penalties for early retirement (see Figs. 8 and 9 in Appendix E). At the same time, there is a strong reduction in contribution rates, which benefits cohorts at working age, and does not affect the cohorts already retired. Furthermore, there is a substantial drop in the average number of hours worked, which is a result of a higher net wage, leading to higher utility via more leisure.

Table 6 Social welfare relative to benchmark: only cohorts working and retired

The same does not happen with the other policy reforms. For instance, the introduction of a sustainability factor affects all cohorts immediately and in the following decades. Already retired cohorts will see their pension benefits decrease significantly. At the same time, it improves the net income of working age individuals via lower contributions. The rate of improvement is slower than in the case of the actuarially neutral reform, since it leads to lower increases in retirement age. Individuals postpone retirement to benefit from the higher net wages, but this channel is mostly for younger cohorts that will benefit from a larger fall in contribution rates in the long run, and less for individuals at prime ages at the moment when the reform takes place. The combined effect of lower benefits for retired individuals plus slower income increases for younger cohorts leads to a loss in social welfare of 1.4% already in 2025, and only has a neutral effect almost two decades later.

The same patterns hold for increasing the FPA by the 2:1 rule. The effects are smaller since the 2:1 rule takes effect much later, when life expectancy increases.

As for the progressive reform, the average effects are close to zero because the majority of workers would still benefit from a 60% replacement rate. Since the gains of one income group compensate for the losses of another, mainly via replacement rates and pension benefits, the implementation of this reform is close to neutral on average, as represented by an equally weighted social welfare function. We discuss below how this policy measure has significant effects on the welfare of different income groups.

An improvement of social welfare by 2040 is only possible by combining reform policies. The key is to combine the conversion of adjustment rates to actuarially neutral ones with a redistributive policy on replacement rates. This generates a strong and positive response of retirement ages of all groups of individuals, together with a decline of contribution rates, while the decline of the replacement rates due to the sustainability factor is attenuated due to the latter channels. Furthermore, the expectation of lower penalties in pension benefits and a very late retirement age increases consumption for all individuals, which contributes to a welfare increase early on, despite the loss in leisure due to later retirement.

Table 7 shows how social welfare improves if all persons alive are included in the welfare function, i.e., not only retirees and workers but also their children. Including children generates positive reform effects much earlier in time since most of the reforms have positive effects in the long run as we have seen in Table 6. The long-term smaller effect of the actuarially neutral reform is a result of the nature of the policy reform that introduces only a one-time change. The impacts of this reform thus vanish over time. The most effective single reform is the introduction of a sustainability factor. It increases the social welfare function by 1.8 to 2.3%, in the medium and long run. While the replacement rate is decreasing, which hurts pensioners, working age individuals and young non-working cohorts profit from future higher net wages due to lower contribution rates. The same pattern holds when increasing the FPA. Individuals respond to it by increasing work hours due to higher net wages. Moreover, they plan longer working lives to avoid retiring too much in advance of the FPA. The improvements of the social welfare function are milder in the short run since the indexation of the FPA to life expectancy evolves slowly. Therefore, later retirement ages and lower leisure time are not compensated by financial gains and higher consumption in the short run.

Table 7 Social welfare relative to benchmark: all generations alive

Again, combinations of reforms yield the best outcomes in terms of social welfare. Social welfare gains are close to 2% already in 2030 and keep increasing over time to more than 4% in 2050.

5.5 Voting for reforms

A different way to aggregate individual preferences for pension reforms is voting. Majority voting for reforms is the subject of a large literature on the political economy of pension design when populations are aging (e.g., Boeri et al. 2001; Boeri et al. 2002; Casamatta and Batté 2017; Galasso 2007; Persson and Tabellini 2002). In this paper, we consider only a very simple voting mechanism and are only interested in the aggregate outcome. A more refined analysis of different reforms for different population groups in order to forge winning coalitions is left for future research. We consider two types of voting behavior: (a) voters consider only their own utility and (b) voters consider their own utility plus the utility of their children.Footnote 25 Individuals are assumed to vote in favor of implementing a reform if their (and possibly their children’s) lifetime utility increases due to the reform. Table 8 presents the percentages of pro-votes for two different years in which the reforms would be implemented, 2020 and 2040.
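The voting rule described above can be sketched in a few lines of Python. This is an illustration only: it assumes that altruistic voters simply add their children's utility change to their own, which is one possible reading of case (b), and the group data, field names, and function name are hypothetical.

```python
def approval_share(groups, altruistic=False):
    """Percentage of voters whose lifetime utility rises with the reform.

    Each group is a dict with:
      'weight'      -- population weight of the voter group
      'du_own'      -- change in own lifetime utility due to the reform
      'du_children' -- change in the children's lifetime utility
    """
    total = sum(g["weight"] for g in groups)
    in_favor = 0.0
    for g in groups:
        gain = g["du_own"]
        if altruistic:
            # Assumption: own and children's utility changes are summed.
            gain += g["du_children"]
        if gain > 0:
            in_favor += g["weight"]
    return 100.0 * in_favor / total


# Made-up electorate: retirees, prime-age workers, young workers.
groups = [
    {"weight": 30.0, "du_own": -1.0, "du_children": 2.0},
    {"weight": 50.0, "du_own": -0.5, "du_children": 0.2},
    {"weight": 20.0, "du_own": 1.0, "du_children": 0.0},
]
egoistic_share = approval_share(groups, altruistic=False)
altruistic_share = approval_share(groups, altruistic=True)
```

In this toy electorate the reform fails under purely egoistic voting, while altruism toward children raises approval because older voters internalize their children's gains.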

Table 8 Voting for reforms (percentages)

As a first result, column 1 shows that only the actuarially neutral reform would pass if voters considered only their own lifetime utility and if the reform were implemented immediately. This echoes the results of Table 6.

Delaying implementation (column 2) protects the baby-boom cohorts from initial benefit cuts. It therefore strengthens the approval for reform in all cases except one. The actuarially neutral reform and both combinations of reforms would reach a majority after a delay of 20 years. However, the later introduction of the progressive scheme reform would be seen even more negatively by the voters, which corresponds to the welfare loss that happens in all time periods after 2020.

It is likely that voters who have children internalize the lifetime utility of their children. This is reflected in columns 3 and 4. Approval rates for increasing the FPA by the 2:1 rule and for introducing a sustainability factor jump by 35.6 and 39.5 percentage points, respectively. All reforms except increasing the FPA by the 2:1 rule and introducing a progressive scheme would find a majority even if they were implemented immediately. Shifting the implementation date to 2040 would strengthen approval, with one exception to be discussed below.

Some of the effects are as expected. For instance, comparing columns 1 and 3, voters with altruistic preferences for their children’s welfare vote substantially more often for pension reforms that have a strongly positive impact on inter-generational equity, such as the introduction of a sustainability factor or the increase of the FPA by the 2:1 rule. Individuals are less in favor of the progressive reform, which has a generally negative impact on almost all groups except those ranking lower in all individual characteristics, and children will suffer more from this reform given their high contribution rates.

The same holds for the combinations of reforms. Both combinations would be rejected by a large margin if voters were egoistic, but they would pass if voters had altruistic preferences. The similarity of the voting outcomes between the two combinations results from the small effect of the progressive reform on average lifetime utility; combined with the other reforms, it ends up having very little impact on how most individuals vote.

While the protective effect of postponing reform is as expected, Table 8 also carries some surprises. For example, when children are taken into account, the reform that makes the adjustments to the chosen retirement age actuarially neutral displays a voting pattern that does not occur for the other reforms: it would receive slightly more favorable votes if implemented in 2020 rather than in 2040. This follows mainly from the asymmetric impact of the reform on different age groups. The reform has a large positive effect for younger age groups but a negative effect for groups at prime age, for ages close to the time of retirement, and for groups at older ages. In 2020, the largest age groups are at prime ages and close to retirement. They have children at relatively young ages. In 2040, these large groups have already transitioned to retirement and have children at prime ages or close to retirement. If individuals care only about their own welfare, a majority of these large age groups will vote to reject this reform in 2020, and similarly for an implementation 20 years later. However, when altruistic, these age groups reverse their negative opinion of the reform because their children will benefit significantly from it, leading to a positive voting outcome of 88.6% pro-reform. The same happens when the reform is introduced in 2040, but then, this age group is no longer the largest one. Instead, they will then belong to the older age groups whose children are at prime ages or close to retirement. This reinforces the negative opinion about the reform, thus leading to a slightly smaller approval rate than if the reform were implemented in 2020. This result shows how strongly reform approval depends on which age groups have the largest share in the population.

An important final result of this section is that both combinations of reforms enjoy a large approval rate unless they are implemented immediately and children’s utility is ignored (column 1). Hence, if a policy maker wants to achieve long-term financial sustainability and improve inter-generational equity, e.g., by introducing a sustainability factor and/or increasing the retirement age as life expectancy increases, she needs to wrap these unpopular reforms into a package with popular changes that improve intra-generational equity, such as a progressive benefit scheme and/or actuarially neutral adjustment rates.

6 Summary and conclusions

Our paper has juxtaposed financial sustainability and several aspects of social sustainability. The emphasis on social sustainability stems not least from the need to gain political support for financially stabilizing pension reform, support which appears to have dwindled in Europe. Hence, policy needs to prevent reforming unequally.

In order to study this, we have developed a modelling framework that allows us to broaden our focus from pursuing only one of the possible goals of a policy reform to comparing the multidimensional effects of pension reform in a unified framework. The model quantifies the trade-offs that policy makers face when introducing new reforms in pension systems and permits a holistic analysis of the impact of several pension reform measures on social welfare, intra- and inter-generational equity, and financial sustainability of the pension system. Our model endogenizes important aspects of household behavior in order to detect feedback effects over the life cycle, in particular backlash effects on labor supply, endogenous retirement decisions, and adjustments of saving behavior. It is grounded in realistic life cycle patterns of productivity, health, longevity, and consumption preferences based on SHARE data, which differ strongly across income groups.

Table 9 summarizes our results. The columns represent the objectives of pension reform, and the rows represent the four single reforms and two combinations. It is worth noting that the combination of the first three reforms (“all not directly redistributive”) represents a move from the baseline DB-PAYG system to an NDC system of the Swedish type. Effect direction and strength are indicated by a five-point scale from ++ to --. The slashes distinguish short-run and long-run effects.

Table 9 Summary of main results

The table clearly shows that the options faced by policy makers are controversial in many dimensions. Sharing the burden of keeping the pension system sustainable between generations, e.g., by the introduction of a sustainability factor, yields the most positive effects in terms of sustainability. However, while welfare gains are high for younger cohorts and future generations, they are negative for older cohorts. These types of sustainability reforms are therefore highly unpopular unless voters internalize the welfare of their offspring. Similarly, increasing the retirement age fails to attract voters, even in the long run. A particularly salient example is France.

These two reforms fall short of improving intra-generational equity. Although they yield positive welfare gains for both rich and poor when future generations are included in the social welfare evaluation or voters’ considerations, the gap between income groups actually increases. Seen from this angle, policies with clear redistributive objectives become attractive. This holds obviously for a reform that makes the benefit formula more redistributive but also for making the benefit adjustments to early or late retirement actuarially neutral. Both reforms have a clear positive impact on correcting possible imbalances within generations. Unfortunately, however, they fall short in terms of inter-generational equity and have no or even negative effects on the financial sustainability of the pension system. Even worse, reform combinations including strong redistributive elements may fail, as the second-to-last column in Table 9 shows and as actually happened in Switzerland.

These results have several implications for pension reform. First, politically feasible reforms (in the sense of gaining a majority of voters or maximizing a social welfare function) require a combination of reforms that provides different channels to counterbalance the different imbalances in the system: the lack of financial as well as social sustainability. Second, static reforms have only small long-run effects. Hence, reforms should be dynamic, such as the mechanism of a sustainability factor and the indexation of the retirement age to life expectancy. These self-correcting mechanisms evolve according to the demographic structure and balance the financial burden between beneficiaries from, and contributors to, the pension system. Third, pension reform resembles the struggle against climate change: it is necessary to convince voters that they need to take the welfare of their offspring into account. As Table 9 shows, pension reform survives the voting process only if voters look ahead to the next generation.