1 Introduction

Nowadays, many people invest their retirement savings in a defined contribution pension scheme. In such a scheme, the contributions are agreed upon and are, e.g., a percentage of one’s salary. The pension, however, is uncertain as it depends on the returns on investment. At retirement, the accumulated wealth is converted to a pension income that intends to replace a proportion of the investor’s income, typically about 70%, which is referred to as the replacement ratio. In this paper, we propose a dynamic strategy that optimally steers the investor towards a replacement ratio target.

Our dynamic strategy will reduce risk after several years of good returns on investment. It presumes that upward potential concurs with downside risk. Our pension investor is only interested in reaching her replacement ratio target, i.e., not making the target is considered downside risk and she feels indifferent about any two values above the target. We will show that, in this sense, the designed dynamic strategy outperforms static life cycle strategies. By decreasing risk after several good years, our dynamic strategy prevents unnecessary risk taking.

A well-known static life cycle strategy is known as Bogle’s rule [5], which prescribes to invest a percentage of 100 minus one’s age in risky assets. Decreasing risk in the course of the life cycle in such a way is called a glide path. When the glide path is known in advance up to retirement, the strategy is static and does not adjust as events unfold. Therefore, static strategies may take unnecessary risk when returns on investment are better than anticipated, see [1, 12] for a discussion of drawbacks of static life cycle strategies. The strategy we propose is also rule-based, but it is dynamic as the prescribed rule depends on events that still have to unfold.

In the literature, dynamic strategies are often studied in the context of dynamic programming [3]. Dynamic programming optimizes the investment strategy backwards in time by optimizing decisions for the coming period given that consecutive decisions are already taken optimally. Merton [17] was the first to apply dynamic programming to an asset allocation problem with two assets, a risky and a risk-free asset, also allowing for consumption during the investment period. Optimal decisions were based on the constant relative risk aversion utility function. He showed in [17] that the optimal strategy continuously rebalances, i.e., the optimal allocation is constant.

The literature on optimal asset allocation is rich, and we cite here some contributions that influenced our work. Li and Ng [14] introduced mean-variance strategies with respect to a wealth target. The wealth target then allows the investor to identify a surplus: wealth up to the target may be invested in stocks, any remainder is invested in the risk-free rate. Zhang et al. [20] solved a similar problem, although utility-based, and combined dynamic programming with the least squares Monte Carlo method. Upper and lower bounds for the wealth were prescribed in that paper, showing that upward potential comes with downside risk. Terminal wealth is steered towards a desired range by investing the difference between a risk-free-discounted upper bound value and the current wealth in the risk-free asset. Forsyth and Vetzal [11] also applied dynamic programming and used a PDE solver to solve a so-called time-consistent mean-variance problem, meaning that similar mean-variance problems were solved at future times. In addition to mean-variance that balances the mean and variance of returns, they studied a problem with a fixed wealth target. To reduce risk, both [11] and [20] proposed to invest excess wealth in a risk-free asset. Similarly, the rule-based strategies introduced in this paper will invest excess wealth into a so-called matching portfolio. Compared with static strategies, distributions of outcomes are more centered around the target value and the area below the target value will become smaller.

Besides many positive aspects, dynamic programming and its resulting strategies have several drawbacks including the following. Firstly, dynamic programming is computationally rather intensive. Secondly, the corresponding investment decisions can be sensitive to small changes in parameters and underlying assumptions. Because of this, the allocation may fluctuate over time resulting in large turnovers, of which, from a practical perspective, it is hard to explain why they are required. Intuitively defined rules, typically, do not suffer from these drawbacks. Finally, it is not straightforward to apply dynamic programming to the pension settings as an investor’s replacement ratio target often depends on inflation influencing the future, which in turn also influences the future contributions.

Rule-based dynamic strategies fall in between the static and dynamic programming paradigms, when well constructed they aim for the best of both worlds. As shown by Basu et. al. [2], even simple rule-based strategies that reduce risk half way in the life cycle can outperform static life cycle strategies. Compared with the work in [2], our rule-based strategies can reduce risk annually, and consider the market price of future pension payments instead of a wealth target. In addition to the rule-based strategies, we will study an integrated approach in which we combine a rule-based strategy with dynamic programming.

2 The optimal asset allocation problem

2.1 The Dutch pension system

To demonstrate the rule-based strategy’s practical value, we consider a Dutch pension investor, and use typical retirement data from the Netherlands. In the Netherlands, the pension system can roughly be described as follows. The system consists of three pillars. The first pillar is formed by government allowances for old age equal to about 70% of minimum wage. It is funded out of current payroll taxes. People earning more than a so-called franchise, roughly equal to minimum wage and amounting to 13,123 Euro ultimo 2016, can take part in the second or third pillar. The second pillar consists of participation in collective pension schemes. Contributions are equal to a percentage of one’s pensionable salary, i.e., the salary above the franchise, and employers typically pay part of the contributions which makes opting out highly unfavourable. The third pillar consists of private pension products. At retirement, pension benefits and savings are converted to a period income, e.g. facilitated through the collective pension scheme, a life insurance product of an insurance company, or through periodic withdrawals from an investment account. The overall aim is to replace about 70% of one’s income by a pension. In the second and third pillars, Dutch pension law facilitates tax free contributions to work towards this aim.

2.2 Model setting

In this paper, the Dutch pension investor doesn’t take part in a collective pension scheme, but is assumed to have full control over her pension savings, either by taking part in an individually defined contribution pension scheme in the second pillar, or by means of an appropriate private pension product in the third pillar. At \(t=0\), the 26 year old investor will start saving annually the maximum allowed tax free amount (according to Dutch pension law) up to retirement at time \(t=T\), coinciding here with a retirement age of 67 years. She intends to replace 70% of her salary by her pension (including government allowances for old age). As the government allowance for old age roughly equals 70% of the franchise, her pension savings intend to replace 70% of her pensionable salary, i.e., her salary in excess of the franchise. Although, in practice, an investor might be interested in insuring longevity risk or be interested in employing advanced withdrawal strategies, Blanchett et. al. [4] illustrates that simple withdrawal strategies can perform well, e.g., based on an annuity with a maturity roughly equal to an investor’s life expectancy. Therefore, as we focus on accumulating wealth before retirement, we simply assume the investor buys an annuity that indexes with the expected inflation, i.e., a bond which, apart from indexation for expected inflation, equals annual payoffs, for a period of N, say 20, years after retirement. Whichever withdrawal strategy an investor might follow, the assumption here is that this annuity gives a good estimate of, at least, the investor’s income in her first year after retirement, and, thereby, to what extend she can replace her pensionable salary for 70% with a pension.

The investor can invest her wealth \(W_t\) in a risky, equity-like, asset, which is called the return portfolio, or in a safe, bond-like, asset with annual payoffs during retirement, the matching portfolio. In our setting, the strategy will use the matching portfolio to protect the current gains, and it grows with inflation. Therefore, the matching portfolio also carries risk. Put differently, we assume the investor doesn’t hedge inflation risk with inflation protected securities as the market for inflation products is illiquid and strategies that hedge against inflation are not straightforward to follow in practice [16]. Finally, we assume there is no risk-free rate to invest money in.

The investor annually manages her portfolio, i.e., decisions, contributions and pension payments are made in discrete time, which runs up to retirement, from \(t=0\) to \(t=T\). The pension payments start at \(t=T\) and run up to \(t=T+N-1\). At time \(t\le T\) before retirement, she invests a fraction \(\alpha _t\) of her wealth \(W_t\) in the return portfolio. The investor is not allowed to short-sell assets or borrow money, so that

$$\begin{aligned} 0\le \alpha _t\le 1. \end{aligned}$$
(1)

In the dynamic programming literature, \(\alpha _t\) is referred to as the control (as decisions intend to give the investor control over the outcome). A strategy maps information \(Z_t\) available at time t, e.g., past returns and current wealth \(W_t\), to the desired allocation:

$$\begin{aligned} \alpha _t:{\mathbb {R}}^K\ni Z_t \mapsto \alpha _t(Z_t) \in [0,1]. \end{aligned}$$
(2)

Here \(Z_t\) is adapted to a filtration \({\mathscr {F}}_t\), governing the underlying stochastic processes. Before time t, the information \(Z_t\) is not yet available, and \(\alpha _t\) is thus a stochastic quantity. In a static strategy, such as Bogle’s rule, \(\alpha _t\) only depends on time and is known, i.e., not stochastic, even when the information \(Z_t\) is not yet available. In practice, risk is reduced towards retirement, meaning that \(\alpha _t\) typically decreases over time.

Just before rebalancing, the investor makes a contribution \(c_t\) to the portfolio equal to the maximum allowed tax free amount under Dutch pension law. These contributions resemble an age-dependent percentage \(p_t\), see Table 3, of the investor’s pensionable salary \(s_t\) which she earned in the period \(t-1\) up to t. We assume that the investor’s salary follows a deterministic career path, i.e., it increases with age. The investor’s salary also increases stochastically with the wage inflation \(w_t\). Altogether, Table 4 shows the expectation and standard deviation of the annual contribution to the portfolio for our investor, see also Appendix A. Note that the contributions merely depend on career path assumptions and inflation, and do not depend on investment decisions neither on return on investment.

The investor’s objective is to achieve a 70% replacement ratio target at retirement without incurring too much downside risk. The replacement ratio for the pensionable salary at retirement, \(RR_T\), is given by

$$\begin{aligned} RR_T= \frac{W_T}{M_T} \cdot \frac{T+1}{\sum \nolimits _{t = 0}^{T} s_t\prod \nolimits _{\tau =t+1}^{T} (1+\pi _\tau )}. \end{aligned}$$
(3)

Here, the second term divides by the investor’s average wage in nominal amounts indexed with inflation \(\pi _t\) to retirement at \(t=T\), and \(M_t\) is the market value factor. The market value factor \(M_t\) converts a nominal amount at time t, that is supposed to have similar purchasing power to the pension payments, to N pension payments at times \(\tau \ge T\) after retirement by indexing with currently expected inflation, and then discounting those pension payments back to time \(t \le T\):

$$\begin{aligned} M_t=\sum \limits _{\tau =T}^{T+N-1} (1+r^{\tau -t}_t)^{t-\tau }\;{\mathbb {E}}_t\left[ \prod \limits _{\tau '=t+1}^\tau (1+\pi _{\tau '})\right] , \end{aligned}$$
(4)

where \({\mathbb {E}}_t\) is the expectation, conditional on \({\mathscr {F}}_t\) (i.e., the information available at time t), and \(r^{\tau -t}_t\) represents the market rates that discount payments from \(\tau -t\) years into the future back to the present time. Put differently, dividing current wealth by the market value factor gives an amount that currently has similar purchasing power as the future pension. Using the market value factor \(M_T\) at retirement, the first term in (3) converts the accumulated wealth \(W_T\) into N annual income payments indexed for expected future inflation.

To measure whether a strategy achieves the investor’s objective, we use a utility function, U, which, whenever decisions are to be taken, intends to maximize the following expression in expectation:

$$\begin{aligned} \max _{\alpha _t,\ldots , \alpha _{T-1}}~{\mathbb {E}}\left[ {U(Z_T)}\left| \right. {\mathscr {F}}_t\right] , \end{aligned}$$
(5)

where \({\mathscr {F}}_t\) again represents current market information, \(\alpha _t\) is as in (2) and \(Z_T\) is a vector with outcomes including the terminal replacement ratio. Although other choices are possible, we choose U(.) to be the shortfall below the investor’s target replacement ratio of 70%:

$$\begin{aligned} U(Z_T)=\min (RR_T-70\%,0), \end{aligned}$$
(6)

where there is no shortfall in replacement ratio if it ends above \(70\%\). Note that this measure is not conditional on the shortfall. So, additionally, we will also evaluate a strategy’s performance using the 10% conditional value at risk \({\mathrm {CVaR}}_{0.1}(RR_T)\) of the replacement ratio, i.e., the expectation of the \(10\%\) worst case outcomes as defined by

$$\begin{aligned} {\mathrm {CVaR}}_\alpha \; (RR_T) = {\mathbb {E}}\left[ RR_T\left| RR_T\le F_{RR_T}^{-1}(\alpha )\right. \right] , \end{aligned}$$
(7)

where \(F_{RR_T}^{-1}(\alpha )\) is the inverse cumulative distribution function of the terminal replacement ratio \(RR_T\) and represents the \(\alpha\)-th quantile below which are the worst case outcomes.

2.3 Governing stochastic model

For general applicability, we require the designed strategies are not defined in terms of the governing stochastic model parameters. That is, the strategies can be applied when different governing stochastic models would be used. We merely assume that the governing stochastic model can be simulated by means of a Monte Carlo simulation. To make this explicit, we choose to use a standard model developed to make risk analyses comparable between Dutch pension funds, see [13]. The model and its calibration are well documented [10]. Calibration to recent market data and a Monte Carlo simulation of the model are publicly available at the website of the Dutch Central Bank [9]. In this paper, we use the data set of 2017 (quarter 1), which is calibrated to data up to ultimo 2016 and start simulating from there.

In discrete time, the model is a VAR(1) model with normally distributed increments, see [18] for a short summary of the model specification. In the calibration, some structure is imposed to achieve realistic market dynamics. Based on the model, sample paths are generated for the following variables:

  • Equity returns \(x_t\), which are used for the return portfolio;

  • Inflation \(\pi _{t}\);

  • Wage inflation \(w_t\), which equals inflation \(\pi _{t}\) plus \(0.5\%\);

  • A yield curve with interest rates \(r_t^m\) containing rates for each maturity m.

The matching portfolio is tailored to the investor’s retirement age. Its returns \(m_t\) equal the rate of change in the market value factor:

$$\begin{aligned} m_t=\frac{M_t}{M_{t-1}}-1, \end{aligned}$$
(8)

where \(M_t\) is defined in (4). Note that the matching portfolio protects the investor against expected future inflation. To determine the expected future inflation, we use the least squares Monte Carlo technique, as presented in Sect. B.3.

Table 1 gives the annual return statistics of the variables. Due to the fluctuating market price of future pension payments, the standard deviation of the matching returns is very similar to the one of the equity returns. Although the matching portfolio follows these fluctuations, it is considered less risky, in terms of the investor’s goals. By investing in the matching portfolio, the pensioner will receive the corresponding amount from the annuity, no matter the future market prices.

Table 1 Annual statistics of the underlying stochastic model calculated on a sample that combines all sample paths

3 Rule-based strategies

In this section, we define three rule-based strategies: a cumulative target strategy that decreases risk once it reaches a cumulative target for the contributions paid so far, an individual target strategy that tracks the investments of the contributions separately and decreases risk once it reaches the target for that contribution, and a combination strategy that combines the two with dynamic programming. The strategies all intend to steer towards a replacement ratio of 70%, and decrease risk when return on investment develops well. The strategies differ in their views on when return on investment has been developing well enough to decrease risk.

3.1 Cumulative target strategy

The cumulative target strategy that we consider here has similarities with the strategies studied in [20] and [11]: risk is reduced once wealth exceeds a pre-defined wealth target. Contrary to [20] and [11], however, our investor saves for retirement and we relate the wealth target to the price of a bond with payoff equal to the desired pension.

Given a density forecast for the matching and return portfolios, see Sect. 2.3, the strategy depends on two parameters: a required real rate of return r (before retirement) and a discount rate \(\delta\) (after retirement) to discount pension payments after retirement to the retirement date. At time t before retirement, i.e., \(t= 0,\ldots , T\), the investor contributes \(c_t\) to her pension savings, see Tables 3 and 4. The contributions \(c_\tau\) up to time t, i.e., \(\tau =0,\ldots ,t\), are supposed to grow with inflation \(\pi _\tau\), plus the real rate of return r, to a target wealth \(c_\tau \; {\mathbb {E}}_t[F_\tau ]\) at retirement, where \(F_\tau\) is given by

$$\begin{aligned} F_\tau = \prod \limits _{\tau '=\tau +1}^T(1+r+\pi _{\tau '}), \end{aligned}$$
(9)

and the conditional expectation, \({\mathbb {E}}_t\), enforces that the realized inflation is used before time t and the expected inflation is used beyond time t. The wealth targets at retirement for all contributions \(c_\tau\) up to time t are combined and converted into a target pension using a discount factor \({{\tilde{M}}}_T\) with discount rate \(\delta\):

$$\begin{aligned} {{\tilde{M}}}_T=\sum \limits _{\tau =T}^{T+N-1} \frac{1}{(1+\delta )^{\tau -T}}. \end{aligned}$$
(10)

Note that, contrary to the market value factor \(M_t\) as defined in (4), \({{\tilde{M}}}_T\) is merely an annuity with rate \(\delta\) used to approximate the first pension given wealth at retirement. Using the market value factor \(M_t\), this gives us the following current target wealth \({\tilde{W}}_t\):

$$\begin{aligned} {\tilde{W}}_t=\frac{M_t}{{{\tilde{M}}}_T}\sum \limits _{\tau =0}^{t} c_\tau \; {\mathbb {E}}_t[F_\tau ], \end{aligned}$$
(11)

where the summation represents the combined wealth targets at retirement for all contributions \(c_\tau\) up to time t. In short, (11) can be interpreted as follows: the summation converts contributions \(c_\tau\) up to time t into a target wealth at retirement, \({{\tilde{M}}}_T\) uses an annuity with rate \(\delta\) to approximate the first pension, and, finally, the market value factor \(M_t\) indexes the first pension with expected inflation and discounts all future pensions back to time t. Altogether, this gives the current target wealth \({{\tilde{W}}}_t\).

The cumulative target strategy starts by investing new contributions \(c_t\) in the risky asset. If the current wealth \(W_t\), including the current contribution \(c_t\), exceeds the target wealth \({\tilde{W}}_t\), risk is reduced and \(W_t\) is transferred to the matching portfolio. For the matching portfolio, the investor follows a buy and hold strategy. New contributions invested in the risky asset, will also be transferred to the matching portfolio if the current wealth \(W_t\), which consists of the current contribution \(c_t\), the value of the matching portfolio and the value of the return portfolio, exceeds the target wealth \({\tilde{W}}_t\). In other words, at \(t=0\), the control \(\alpha _0\) is given by

$$\begin{aligned} \alpha _0=\left\{ \begin{array}{lll} 0 &{}\quad {\text {if}}\ W_0\ge {{\tilde{W}}}_0,&{}\qquad {\text {(12a)}}\\ 1 &{}\quad {\text {otherwise}}, &{}\qquad {\text {(12b)}} \end{array}\right. \end{aligned}$$

and, for \(t=1\ldots T\), the control \(\alpha _t\) is given by

$$\begin{aligned} \alpha _t= \left\{ \begin{array}{lll} 0 &{}\quad {\text {if}}\ W_t\ge {{\tilde{W}}}_t,&{}\qquad {\text {(13a)}}\\ \frac{\alpha _{t-1}(1+x_t)}{\alpha _{t-1}(1+x_t)+(1-\alpha _{t-1})(1+m_t)} &{}\quad {\text {otherwise}}. &{} \qquad {\text {(13b)}} \end{array}\right. \end{aligned}$$

3.2 Individual target strategy

Contrary to the cumulative target strategy, the individual target strategy, which is the second strategy we will analyze here, defines a wealth target per contribution and invests each contribution separately, i.e., the wealth \(W_t\) is seen a sum of the individual wealth components resulting from investing the contributions separately:

$$\begin{aligned} W_t=\sum \limits _{\tau =0}^t W_{t,\tau }, \end{aligned}$$
(14)

where \(W_{t,\tau }\) is the wealth component from investing the contribution \(c_\tau\). As in (11), a wealth target \({\tilde{W}}_{t,\tau }\), at time t for a contribution invested at time \(\tau \le t\), is given by

$$\begin{aligned} {\tilde{W}}_{t,\tau }=\frac{M_t}{{{\tilde{M}}}_T} c_\tau \; {\mathbb {E}}_t[F_\tau ]. \end{aligned}$$
(15)

Apart from this, the strategy works similarly: the individual contributions are invested in the risky asset until the invested amount exceeds the wealth target for that contribution, in which case they are transferred to the matching portfolio until retirement. Thus, the control \(\alpha _{t,\tau }\) for investing contribution \(c_{t,\tau }\) is given by

$$\begin{aligned} \alpha _{t,\tau }= \left\{ \begin{array}{lll} 0 &{} \quad {\text {if for any}}\ \tau '=\tau \ldots t\ {\text {we have}}\ W_{\tau ',t} \ge {{\tilde{W}}}_{\tau ',\tau },&{} \qquad {\text {(16a)}}\\ 1 &{}\quad {\text {otherwise}}.&{} \qquad {\text {(16b)}} \end{array}\right. \end{aligned}$$

At the aggregated level, the control \(\alpha _t\) is now given by

$$\begin{aligned} \alpha _t=\frac{1}{W_t}\sum \limits _{\tau =0}^t W_{t,\tau }\alpha _{t,\tau }. \end{aligned}$$
(17)

Conceptually, the difference between the cumulative target strategy and the individual target strategy is what triggers the risk reduction. Contrary to the individual target strategy, in the cumulative target strategy new investments have to make up for insufficient past returns before a transfer to the matching portfolio can take place. On the other hand, in the cumulative target strategy good past returns may cause new contributions to be transferred immediately to the matching portfolio. With the individual target strategy, each contribution has to generate sufficient return on investment before such a transfer takes place.

3.3 Combination strategy

Both the cumulative and the individual target strategy either reduce risk by switching completely to the matching portfolio or don’t reduce risk at all. Instead of completely switching or not switching at all, the combination strategy, which is the third strategy considered, combines the individual target strategy with dynamic programming to dynamically steer the wealth \(W_{t,\tau }\) resulting from the contribution \(c_\tau\) above its wealth target \(\tilde{W}_{t,\tau }\). For this, we define the following wealth to target ratio,

$$\begin{aligned} Z_{t,\tau } := \frac{W_{t,\tau }}{{{\tilde{W}}}_{t,\tau }}, \end{aligned}$$
(18)

and solve

$$\begin{aligned} V(z,t,\tau ) = \sup _{{\mathscr {A}}_{t,\tau }} {\mathbb {E}}\left[ {\check{U}}(Z_{T,\tau }) | Z_{t,\tau } = z \right] , \end{aligned}$$
(19)

where \({\check{U}}\) is a utility function, \(V(z,t,\tau )\) is the value function in the dynamic programming problem and the control \({\mathscr {A}}_{t,\tau }\) consists of the future investment decisions:

$$\begin{aligned} {\mathscr {A}}_{t,\tau } = \{\alpha _{t,\tau }, \ldots ,\alpha _{T,\tau }\}. \end{aligned}$$
(20)

Using the dynamic programming principle, it follows that the optimal control, \({\mathscr {A}}_{t,\tau }^*\), satisfies

$$\begin{aligned} {\mathscr {A}}_{t,\tau }^* = \{\alpha _{t,\tau }^*, {\mathscr {A}}_{t+1,\tau }^*\}, \end{aligned}$$
(21)

which allows us to solve for the optimal control problem for \({\mathscr {A}}_{t,\tau }^*\), backwards in time.

In this context, we choose a utility function that steers the ratio \(Z_{T,\tau }\) in between the bounds \(z^*_{\mathrm {min}}\) and \(z^*_{\mathrm {max}}\). This is in line with the investor’s goal of minimizing downside risk, and with our assumption that upward potential comes with downside risk. The utility function should be positive concave and takes here the following functional form:

$$\begin{aligned} {\check{U}}(z)&= \frac{-\left( z-\beta \right) ^2 - \left( z - z^*_{\mathrm {min}} \right) ^2}{z}, \end{aligned}$$
(22)

where

$$\begin{aligned} \beta&= \sqrt{2\left( z^*_{\mathrm {max}} \right) ^2 - \left( z^*_{\mathrm {min}} \right) ^2}, \end{aligned}$$

see Fig. 1; note that this utility function differs from \(U(\cdot )\) as given by (6). Utility function \({\check{U}}(\cdot )\) is clearly concave and continuous on the domain \({\mathbb {R}}_{>0}\). We set \(z^*_{\mathrm {min}}=1\) and \(z^*_{\mathrm {max}}=3\), as this choice fits well with the investor’s replacement ratio target and, as we will show in Sect. 4.1, is sufficient to demonstrate the strategy’s added value.

Fig. 1
figure 1

Plot of (22) with \(z^*_{\mathrm {min}}=1\) and \(z^*_{\mathrm {max}}=3\)

Now, we will show that the ratio \(Z_{t,\tau }\), between the current wealth \(W_{t,\tau }\) and its target \({{\tilde{W}}}_{t,\tau }\), evolves in time by making returns on investment in the nominator and updating the inflation expectation in the denominator. Since this time evolution is independent of \(\tau\), we can show that the optimal control \(\alpha _{t,\tau }^*\) is independent of \(\tau\), i.e., once the optimal control is found, it can be applied to all contributions.

Lemma 1

If the optimal control \(\alpha _{t,\tau }^*\) of dynamic programming problem (19) exists and is unique, then it is independent of the contribution \(c_\tau\) and the time \(\tau\) at which the contribution is made.

Proof

The portfolio wealth \(W_{t,\tau }\), accumulated by investing contribution \(c_\tau\), increases with the return on investment and, therefore, satisfies

$$\begin{aligned} W_{t,\tau }=\left[ (1+x_t)\alpha _{t-1,\tau }+(1+m_t)(1-\alpha _{t-1,\tau })\right] W_{t-1,\tau }. \end{aligned}$$
(23)

From (15), (9) and (8), it follows that the wealth target, \({{\tilde{W}}}_{t,\tau }\), satisfies

$$\begin{aligned} {{\tilde{W}}}_{t,\tau }=\frac{{\mathbb {E}}_t[F_{t-1}]}{{\mathbb {E}}_{t-1}[F_{t-1}]} (1+m_t) {{\tilde{W}}}_{t-1,\tau }. \end{aligned}$$
(24)

Substitution of (23) and (24) in (19) yields that the optimal controls \(\alpha _{t,\tau }^*\) solve

$$\begin{aligned} \sup _{{\mathscr {A}}_{t,\tau }} {\mathbb {E}}\left[ \left. {\check{U}}\left( z \prod \limits _{\tau '=t+1}^T \frac{(1+x_{\tau '+1})\alpha _{\tau ',\tau }+(1+m_{\tau '+1})(1-\alpha _{\tau ',\tau })}{\frac{\displaystyle {\mathbb {E}}_{\tau '+1}[F_{\tau '}]}{\displaystyle {\mathbb {E}}_{\tau '}[ F_{\tau '}]} (1+m_{\tau '+1}) } \right) \right| Z_{t,\tau } = z \right] . \end{aligned}$$
(25)

As the argument of \({\check{U}}\) merely depends on the control \(\alpha _{\tau ',\tau }\), future returns on equity \(x_{\tau '+1}\), and future returns of the matching portfolio \(m_{\tau '+1}\), all optimal controls \(\alpha _{t,\tau }^*\) solve, as a function of the state z, the same sequence of optimization problems, backwards in time. Consequently, both the value function \(V(z,t,\tau )\) and the optimal control \(\alpha _{t,\tau }^*\) are independent of \(\tau\). A similar argument shows the independence of the size of the contribution \(c_\tau\). \(\square\)

Lemma 1 implies that, theoretically, the dynamic programming problem has to be solved only once, i.e., the optimal control (as a function of the state) is the same for all contributions. Therefore, the optimal control of the first contribution \(c_0\) can be used for all other contributions.

For the practical implementation for the dynamic programming algorithm, readers may refer to Appendix B.

3.4 Expected and target replacement ratio

The variable r, used in the construction of the wealth targets by means of Eq. (9), can be interpreted in several ways. First, it serves as a discount rate, which is used to compute the present value of contributions that are made in the future. It can also be interpreted as an annual return requirement: each contribution is required to have an average annual return of r. And third, it can be interpreted as a future expected annual return.

The third interpretation allows us to measure approximately a strategy’s progress towards the expected replacement ratio \({\mathbb {E}}_t[RR_T]\) by the quantity \(R_t\):

$$\begin{aligned} R_t&:= \frac{(T+1){\mathbb {E}}_t[W_T]}{{\mathbb {E}}_t\left[ M_T\right] \; {\mathbb {E}}_t\left[ \sum \nolimits _{t = 0}^{T} s_t\prod \nolimits _{\tau =t+1}^{T} (1+\pi _\tau )\right] }, \end{aligned}$$
(26)

where \({\mathbb {E}}_t[W_T]\) is approximated using:

$$\begin{aligned} {\mathbb {E}}_t[W_T] \approx \left[ 1+r+ I_{T,t} \right] ^{T-t}W_t + \sum _{k=t+1}^{T-1} \left[ 1+r+I_{T,t} \right] ^{T-k} {\mathbb {E}}_t[c_k]. \end{aligned}$$
(27)

Note that in the definition of \(R_t\), the expectation of the denominator and nominator are, contrary to how \({\mathbb {E}}_t[RR_T]\) is defined, taken separately for computational reasons. Altogether, the computation of \(R_t\) requires four different estimators. The discount rate r is used as an estimator for the future expected annual return. The estimator for the annualized future inflation \(I_{T,t}\), has to be estimated by regression between the future and the past cumulative inflation, as shown in Eq. (29). Future salaries are based on the information from Table 3. Lastly, the estimator for the market value factor at the end of the investment horizon, \({\mathbb {E}}_t[M_T]\), is based on regression between \(M_t\) and \(M_T\), with \(\varPhi = \{1,x\}\). See Appendix B.3 for details of the regression method used.

Alternatively, one could use these approximations to try to steer towards a replacement ratio target \(R^*_t\) directly. Using (26) and (27), the replacement ratio target can be converted into a current wealth target:

$$\begin{aligned} \frac{R^*_t{\mathbb {E}}_t\left[ M_T\right] \; {\mathbb {E}}_t\left[ \sum \nolimits _{t = 0}^{T} s_t\prod \nolimits _{\tau =t+1}^{T} (1+\pi _\tau )\right] -(T+1)\sum _{k=t+1}^{T-1} \left[ 1+r+I_{T,t} \right] ^{T-k} {\mathbb {E}}_t[c_k]}{(T+1)\left[ 1+r+ I_{T,t} \right] ^{T-t}}. \end{aligned}$$

We found, however, that steering towards this approximate target in a dynamic programming framework didn’t yield satisfactory results. Therefore, we postpone exploring such a direction deeper to future research.

4 Numerical evaluation

In this section, we apply the rule-based strategies described in Sect. 3 to the pension investor introduced in Sect. 2.2, using the governing stochastic model described in Sect. 2.3.

4.1 Rule-based strategies

To illustrate the dynamics of the rule-based strategies, Fig. 2 shows one of the 2000 sample paths for the investor’s portfolio dynamics. In particular, the top left figure shows the investor’s wealth \(W_t\) when following the cumulative target strategy (orange) and when the investor’s wealth exceeds the target \({{\tilde{W}}}_t\) (yellow). Note that when this occurs, the investments are transferred to the matching portfolio (orange line, bottom right figure). The individual target strategy (in green) works similarly, but, as discussed, uses a target per contribution, so that, typically, only part of the wealth is transferred to the matching portfolio (green line, bottom right figure). The bottom left figure illustrates that, in this sample path, the rule-based strategies outperform the optimal static strategy in terms of expected replacement ratio—although only in the first 10 years of the investment the rule-based strategies take substantially more risk, i.e., have a substantially higher allocation to the return portfolio. Therefore, in this particular sample path, one could argue that the better performance comes from the rule-based strategies and not from increased exposure to risk.

Fig. 2
figure 2

Sample paths with cumulative target strategy (orange), individual target strategy (green) and optimal static strategy (blue) of: wealth \(W_t\) (top left) with the wealth target \(\tilde{W}_t\) of the cumulative target strategy (yellow), and cumulative contribution (dark blue); return (top right) of the matching (dark blue) and return portfolio (yellow), expected replacement ratio \(RR_T\) (bottom left) together with the \(70\%\) target (yellow); and (bottom right) allocation \(1-\alpha _t\) to the matching portfolio

The combination strategy is best illustrated by means of the resulting investment decisions, i.e., the optimal control \(\alpha _{t,\tau }\) as defined by Eq. (25). Figure 3 illustrates the optimal allocation to the matching portfolio, \(1-\alpha _{t,0}\), for the first contribution \(c_0\) as a function of time t and the wealth to target ratio, as defined by Eq. (18). In this example, allocations are restricted to multiples of \(20\%\). Note that, contrary to the rule-based strategies, in the combination strategy, risk can be increased and investments can be transferred from the matching to the return portfolio. All together, this makes the combination strategy more refined than the rule-based strategies, which follow a “risk on” or “risk off” approach, in terms of their allocations.

Fig. 3
figure 3

Optimal allocation \(1-\alpha _{t,0}\) to the matching portfolio, as a function of wealth to target ratio \(\nicefrac {W_{t,0}}{{{\tilde{W}}}_{t,0}}\) (y-axis), as defined by Eq. (18), and time t (x-axis) for the first contribution of an investor following the combination strategy, discussed in Sect. 3.3, and using the stochastic model of Sect. 2.3. Allocations are restricted to multiples of \(20\%\)

Figure 4 compares the distribution of the terminal replacement ratio \(RR_T\) for the following best performing strategies in terms of the expected shortfall below the investor’s \(70\%\) replacement ratio target: two rule-based strategies, a combination strategy and a static strategy. The figure illustrates that, as intended, the dynamic strategies reduce downward risk at the expense of upward potential, i.e., the dynamic strategies are centered more around the target replacement ratio of \(70\%\).

Fig. 4
figure 4

Distribution of the replacement ratio \(RR_T\) for, respectively, the cumulative target strategy with \(r=3.06\%\) (orange), individual target strategy \(r=2.99\%\) (green), combination strategy with \(r=1\%\) (yellow) and a static strategy with \(46.02\%\) constant allocation to the return portfolio

In the Dutch pension system, maximum allowed tax free contributions are set by law, and few people deviate from this. Nevertheless, as a sensitivity analysis, we investigated that with \(40\%\) higher contributions all strategies achieve the \(70\%\) replacement ratio target with about \(90\%\) certainty, see Fig. 5. The figure illustrates that the designed strategies perform especially well when one tries to steer towards a target that is challenging to achieve. Also, intuitively, for such a target, reducing risk after several years with better than anticipated return on investment makes sense, which is inherent in the design of the strategies. That being said, still the dynamic strategies reduce downward risk at the expense of upward potential.

Fig. 5
figure 5

Distribution of the replacement ratio \(RR_T\) with \(40\%\) higher contributions for, respectively, the cumulative target strategy with \(r=3.06\%\) (orange), individual target strategy \(r=2.99\%\) (green), combination strategy with \(r=1\%\) (yellow) and a static strategy with \(46.02\%\) constant allocation to the return portfolio.

A comparison of all strategies is best made by comparing the strategy successes, i.e., whether a strategy achieves the intended \(70\%\) replacement ratio target, versus its downside risk, and parametrize the strategies by the parameters that control the strategy’s risk appetite, see Fig. 6. From this figure, we conclude that all the dynamic strategies clearly outperform the traditional static strategies. Together with the intuitive rationale to reduce risk after several good years, we believe this sufficiently demonstrates the added value of these dynamic strategies. We do, however, find these simulations insufficient to rank the dynamic strategies based on their effectiveness. It is well-known that the relative performance of dynamic strategies can be sensitive to the characteristics of the underlying stochastic model. As such, the characteristics are not completely objective, and we believe that use of the strategies in practice is an appropriate way to test the strategies further (which lies beyond the scope of this research).

Fig. 6
figure 6

Expected shortfall below the investor’s \(70\%\) replacement ratio target, see Eq. (6), versus \(10\%\) CVaR of the terminal replacement ratio \(RR_T\), as defined by (7), for: the cumulative target strategy (orange), the individual target strategy (green), the combination strategy (yellow), all parametrized by the real rate of return r (yellow), and, also, several annually rebalanced allocations (blue), and several default life cycles reducing risk with the investor’s age

4.2 Discussion

One of the intended advantages of a static life cycle strategy is the reduced risk close to the retirement, meaning that one can provide the investor with an accurate estimate of her retirement income in the years before retirement. Table 2 provides a comparison of the dynamic strategies and traditional life cycle strategies. In particular, the table lists the standard deviations of the difference between the expected replacement ratio 5 years before retirement and the replacement ratio at retirement. We conclude that when following the rule-based strategies the investor can be provided with a similarly accurate estimate of the replacement ratio before retirement

Table 2 Statistics for different investment strategies. Values in this table were calculated using \(r = 3.06\%\) for the cumulative strategy, \(r = 2.99\%\) for the individual strategy, and \(r = 1\%\) for the combination strategy

Although the rule-based strategies outperform other strategies in our examples, we wish to point out that there are also disadvantages to the all-or-nothing approach, e.g, the portfolio remains \(100\%\) invested in the more risky return portfolio when targets are not reached. Such truly worst case scenarios appear to have a minor influence, but are, e.g., illustrated in the far left lower tail in Fig. 4. The individual target strategy presented in Sect. 3.2 suffers less from the all-or-nothing disadvantages, as it defines a target per contribution. As a result, inferior past performance does not influence the required performance of current and future contributions.

Compared with the rule-based strategies, the combination strategy does not exploit the fact that the matching portfolio can grow an investment securely to its intended target (indexed by expected inflation) until retirement. As the rule-based strategies explicitly made use of this, the combination strategy could be further improved.

One advantage of the combination strategy in practical use is that the corresponding asset allocation is much more smooth than for the rule-based strategies. The necessity of large turnovers is difficult to explain and investors might be uncomfortable to follow such a drastic strategy to the end.

5 Conclusion

In this paper, we discussed several dynamic strategies, suitable for pension investors that aim to replace a proportion of their salary with a retirement income. The strategies reduce risk after several good years and steer the investor to her target. By having the allocation depend on return on investment, the approaches exploit a freedom which is typically not used by traditional static approaches. We have shown that the dynamic approaches may outperform some traditional static approaches and prevent unnecessary risk taking.

Two simple and intuitive rule-based strategies were introduced that secure investments in a cash flow matching portfolio once they yielded sufficient return. Although both rule-based strategies can straightforwardly be implemented in practice, we recommend to also investigate alternatives where the investor, e.g., switches between an aggressive traditional life cycle and a matching portfolio to rule out very aggressive portfolios close to retirement.

The rule-based strategies were further refined into a combination strategy based on dynamic programming. In the current setup, the combination strategy may not be superior and we even found that the rule-based strategies outperform the combination strategy in a numerical example. We certainly believe that dynamic strategies based on dynamic programming can be further improved, as also this research clearly demonstrates their added value to pension investors.

A most suitable dynamic strategy is hard to determine objectively as its performance depends on the governing stochastic model. Also, such a dynamic strategy should fit well with practical requirements, such as whether an investor will follow through on the strategy or will feel the need to combine such a strategy with her own judgement, and whether such strategies comply with regulations. This research, however, demonstrates the added value of dynamic strategies to pension investors. In summary, such strategies exploit freedom that is not used by traditional approaches, can steer a pension investor to her target and prevent unnecessary risk taking.