1 Introduction

The Lithuanian pension system, consisting of three pillars, became operational in 2004 and introduced the ability for residents to accumulate retirement savings in private funds managed by pension accumulation companies. As participation in IInd pillar pension funds was quasi-mandatory, more than 90% of those who receive insurable income have already become members of a funded pension scheme. By 2019, pension accumulation companies had been offering pension funds of different risk levels based on the share of investments in equities, ranging from 0% (conservative) to 100% (risky) (Bank of Lithuania, 2017). However, empirical evidence (Medaiskis et al., 2018b) showed that the majority of pension system participants made irrational decisions when choosing a pension fund in terms of its risk profile and when changing it during the accumulation period. Moreover, among those who changed funds, some participants switched to a fund that was inappropriate for the economic cycle observed in the financial market. These are examples of reasons why pension accumulation companies have been obliged to establish life-cycle pension funds, to which participants are assigned according to their age. This means that fund managers take full responsibility for asset allocation, diversification, rebalancing, and securing the wealth accrued over a lifetime once the participant approaches retirement age. Life-cycle funds are also referred to as target date funds: their assets are invested in accordance with a predefined investment strategy that changes over time with the participants’ age, also known as a glide path.

Frequent reforms of the Lithuanian IInd pillar pension system over the past few decades have created concern and distrust amongst its participants. The COVID-19 pandemic, the war in Ukraine and high inflation have raised concerns about the resilience of the IInd pillar pension funds to external shocks. In Lithuania, there is little information on pension fund stress tests, and for those that have been done, the reports are not public. These conditions limit the ability of pension fund participants to comprehend the risks they are exposed to, which could lead to adverse outcomes if an event such as a financial, economic, or political crisis occurs. Within this topic, the most relevant paper that focuses on IInd pillar pension funds (PFs) in Lithuania was published by Medaiskis et al. (2018a). In their paper, the authors perform a sensitivity analysis of the optimal life-cycle investment strategy of IInd pillar PFs. The benchmark case was modified by changing the parameters of the participant’s contribution rates, preferences, labour income process and investment performance.

Specifically, to assess the resilience of a country’s financial system to shocks, stress testing is a tool widely used by supervisory authorities such as central banks to quantify the impact of potential risks on the financial sector as well as to test the overall stability of the financial system (Acharya et al., 2018; Sahin et al., 2020; Jobst et al., 2017; García & Steele, 2022; European Court of Auditors, 2019). In the US, stress tests were applied to require banks to build up reserves that they could use to continue lending when the economy entered a deep recession (Kohn & Liang, 2019). Two testing techniques are usually used: bottom-up testing, which is designed to assess the resilience of a specific financial institution to economic shocks, and top-down testing, which is designed to compare the results obtained after performing bottom-up testing, thereby identifying inconsistencies in the testing performed by financial institutions (ECB, 2013; Bank of Lithuania, 2022). Bottom-up testing is performed by commercial banks using available data and their own models. The central bank or other supervisory authority can set certain modelling constraints and, applying its own analysis tools, check and evaluate the results obtained by commercial banks. Meanwhile, top-down testing is performed without involving the tested institutions themselves, usually commercial banks, in the assessment process, as this procedure aims to assess the resistance of the entire banking system to adverse economic shocks (Butkus & Narusevicius, 2015). When performing bottom-up testing, financial institutions use different methods, ask different questions and therefore address different risks. Sahin et al. (2020) examined several macroeconomic scenarios, taking into account the exposures of the banks and their business models. 
The authors confirm that the disclosure of stress test information has had an impact on the movement of both the stock and credit market, thus demonstrating that stress tests have an impact on systemic risk and bank behaviour. García and Steele (2022), Acharya et al. (2018) evaluate the effects of transparency disclosure on bank behaviour and demonstrate that when regulatory attention is focused on improving the quantity and quality of bank capital, it also improves financial stability, transparency, and market discipline. Mennis et al. (2018) demonstrated the effect of lower investment returns and high volatility on pension systems. To protect the pension fund against shocks, Moriggia et al. (2019) included hedging financial contracts in the form of put options in stress testing. Unlike other authors, Joshi and Pitt (2010) used stress tests to assess the impact of actuarial calculations on pricing and capital level management in life and non-life insurance. Lithuanian Bank Supervision Department (2017) ensures that the stress testing carried out by insurance companies is properly organised, prepares stress testing guidelines, but there is still a lack of scientific research in conducting stress tests for insurance companies in Lithuania. Moriggia et al. (2019) noted that pension funds during recent years have been stressed, both equity and alternative investment markets have exceeded their usual levels of riskiness. Therefore, it is necessary to manage pension funds efficiently, satisfying such targets as liquidity, returns, exposure to individual assets, and funding gap.

Further research shows that stress tests can be applied not only in the financial sector but also in assessing the resistance of various companies and markets. Zapletal et al. (2020) performed stress testing to determine the optimal production and emissions coverage for an industrial company. Hong et al. (2022) investigated the impact of financial stress on crude oil prices. Similarly, Scarcioffolo and Etienne (2021) used stress tests to identify oil and natural gas price volatility patterns in the US. Sarafrazi et al. (2015) identified the linkages between high volatility in the stock markets and financial markets in the US. Singh and Singh (2017) studied the linkages between financial stress indices of the US and Brazil, Russia, India, and China, demonstrating the dependence of economic policy initiatives. Several authors examined different evaluation methods for stress testing. For instance, Moriggia et al. (2019) developed a new stress testing technique suitable for multistage scenario trees, namely nodal contamination, and proposed an asset-liability model (ALM) structured as a multistage stochastic programming problem. Stochastic programming for stress tests was also used by Zapletal et al. (2020) and Kopa et al. (2018). In particular, Kopa and Rusý (2023) emphasise the importance of stochastic dominance and point to the growing importance of, and interest in, stochastic programming as a rapidly developing area of mathematical optimisation, as also demonstrated by Mennis et al. (2018). By comparison, some researchers, e.g., Scarcioffolo and Etienne (2021), Sarafrazi et al. (2015), Hong et al. (2022), Alexander and Kaeck (2008), and Dionne et al. (2011), tested Markov-switching GARCH models, while Dionne and Saissi Hassani (2017) used Markov regime-switching to consider business cycles when calculating reserve capital. 
Ben Soltane and Naoui (2021) using a Markov switching methodology tested the responses of the time-varying stock returns to low market liquidity (both expected and unexpected). Liu et al. (2021) studied the volatility in regime-switching models, which were based on the geometric Brownian motion with Markov chains used to randomise its drift and volatility factors. As liquidity is prone to suddenly dry up, Acharya et al. (2013), Bouveret et al. (2015) and Flood et al. (2015) applied Markov models in the US corporate bond market to characterise a regime-switching behaviour of market liquidity. Han and Leika (2019) used Markov regime-switching models to demonstrate that aggregate market liquidity tends to switch from a high-liquidity regime to a low-liquidity regime. Bouveret et al. (2015) confirmed findings that structural and cyclical factors, such as regulation and market volatility, potentially have an impact on the probability of being in a low-liquidity regime. Han and Leika (2019) argued that the framework can be extended further to estimate price impact measures, asset sale haircuts for asset and pension fund managers.

In this paper, we introduce a methodology for stress testing (at various levels) the returns of pension funds using a Hidden Markov Model for regime detection. Moreover, we identify the best strategies of pension fund managers for various stress levels. First, the regimes are detected from the performance of underlying stock and bond indices. Having the regimes and the estimated transition matrix, we stress this matrix and generate scenarios of the future evolution of returns from the stressed model. Then we suggest possible strategies for pension managers to react to a regime switch and compare the performance of the strategies with each other for all considered stress levels. This methodology is applied to Lithuanian IInd pillar life-cycle pension funds managed by Swedbank.

The rest of this paper is organised as follows. Section 2 presents the methodology, which covers the Hidden Markov Model for regime detection, a technique used for scenario generation, and the proposed stress testing using a Markov regime switching model, and a description of the strategies to be employed. The computational results are provided in Sect. 3, which presents an overview of the pension funds considered and demonstrates the results of the stress tests for the different strategies taken. Concluding remarks, managerial insights, and future research directions are summarised in the last section.

2 Methodology

In this section, we present the concept of stress testing applied to IInd pillar pension funds. First, we recall the basics of the Hidden Markov Model (HMM) for regime detection and introduce the notation. We consider four regimes to distinguish between non-crisis periods and three kinds of crisis periods (a shock on the bond market, on the stock market, or on both markets). Second, the notion of stressing the HMM is presented. It is based on a stress level k, which determines the changes in the transition matrix. The higher the stress level, the more probable a switch from the non-crisis regime to a crisis regime becomes. Third, the scenario generation technique is briefly described. Finally, the considered strategies are introduced and their evaluation is discussed. The goal is to analyse the sensitivity of the strategies and their results to the choice of the stress level.

2.1 Hidden Markov model for regime switching

Suppose that the dynamics of the returns of pension funds is governed by the Hidden Markov Model (HMM) (Hamilton, 1989; Lindgren, 1978). Specifically, the use of HMM to detect regimes enables incorporation of non-linear dynamics by determining a set of model parameters to be attributed to a particular time depending on the state. In particular, a regime-switching property provides a good fit to empirical data when considering a long-term perspective in the study. In general, the regime itself is unobserved, which means that one should make the inference about the number of regimes, their change point, and probabilities observed in the past (Piger, 2009).

Conceptually, HMM is a probabilistic model in which a sequence of observations \(X=(x_1, \ldots , x_t, \ldots , x_T)\), \(x_t \in \mathcal {R}^d\) is generated by a finite-state Markov chain with hidden states \(S=(s_1, \ldots , s_t, \ldots , s_T)\), \(s_t \in \{1, \ldots , N\}\) where N is the number of states which is fixed in time. The HMM is then specified by the initial probability vector \(\pi _i=\text {Pr}(s_1=i), i=\{1,\ldots N\}\), a transition probability matrix \(P_t=(p_{ij})_t=Pr(s_{t+1}=j|s_t=i)\), and the emission probabilities B, which can be any distribution conditioned on the current hidden state. More specifically, we consider a dependent mixture model, where the observations are distributed as a mixture with N states and the time dependencies between the observations are due to the time dependencies between the mixture states that follow a first-order Markov process. In this case, the joint likelihood of observations X and hidden states S with model parameters \(\theta \) can be defined as

$$\begin{aligned} { f(X,S|\theta ) = \pi \cdot b_{s_1}(x_1) \prod _{t=1}^{T-1} P_t \cdot b_{s_{t+1}}(x_{t+1}),} \end{aligned}$$

where \(b_{s_t}\) is a vector of observation densities \(b^j_{s_t}(x_t)=\text {Pr}(x_t|s_t=j)\) that provide the conditional densities of observations \(x_t\) associated with the hidden state j, \(j=1,\ldots ,N\) at time \(t=1 \ldots T\). The parameters \(\pi \), P and B must be estimated from the observed sequence X.
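Read along a single state path, the joint likelihood multiplies the initial probability of \(s_1\), the transition probabilities between consecutive states, and the density of each observation in its own state. The following minimal sketch evaluates the logarithm of this product for one path. It assumes univariate Gaussian emission densities purely for illustration (the text allows any conditional distribution), and all parameter values, as well as the helper names `gaussian_pdf` and `joint_log_likelihood`, are hypothetical.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution (assumed emission law)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def joint_log_likelihood(obs, states, pi, P, means, stds):
    """log f(X, S | theta) for one observation sequence and one state path.

    States are 0-based indices into pi, P, means and stds.
    """
    s0 = states[0]
    ll = math.log(pi[s0]) + math.log(gaussian_pdf(obs[0], means[s0], stds[s0]))
    for t in range(1, len(obs)):
        prev, cur = states[t - 1], states[t]
        ll += math.log(P[prev][cur])                                  # transition
        ll += math.log(gaussian_pdf(obs[t], means[cur], stds[cur]))   # emission
    return ll

# Hypothetical two-state example: a calm state 0 and a turbulent state 1.
pi = [0.9, 0.1]
P = [[0.98, 0.02], [0.05, 0.95]]
means, stds = [0.0005, -0.001], [0.005, 0.02]
obs = [0.001, -0.002, -0.03]

ll_calm = joint_log_likelihood(obs, [0, 0, 0], pi, P, means, stds)
ll_switch = joint_log_likelihood(obs, [0, 0, 1], pi, P, means, stds)
```

In this toy example the path that switches to the turbulent state on the last day has the higher joint likelihood, because the large negative return is very unlikely under the calm state.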

To determine the time periods of different market states observed historically, we used the multiple change-point detection model proposed by Lavielle and Lebarbier (2001). Their approach employs a Markov chain Monte Carlo (MCMC) algorithm to estimate the posterior distribution of the change-point process. The numerical experiments in their paper show that the proposed procedure is much faster than the Reversible Jump algorithm. Moreover, this technique additionally provides the probability \(p_j\) that at a particular time t the corresponding regime \(s_t=j, j \in \{1, \ldots , N\}\) is detected, together with the expected number N of regimes.

In the paper we assume that the key drivers which have an impact on PF performance arise from a global market, which is represented by two well-known indices, namely the MSCI World index (MSCI) and the Bloomberg Barclays Euro 1–5 year Bond index (BB EURO). Two indices are chosen because they are used by most pension fund managers as benchmarks. Moreover, they represent a typical investment strategy of pension funds: world stock funds and bond funds. Finally, usual regulations for pension fund strategies are expressed in terms of limits on stock or bond assets.

The separate behaviour of the stock and bond markets is later combined into four regimes of the financial markets. More specifically, the market regimes are defined by the following four states:

  • Regime 1 is identified if no shock is observed in either index;

  • Regime 2 is said to be detected if a non-typical behaviour is observed in the stock market index;

  • Regime 3 is said to be detected if a non-typical behaviour is observed in the European bond market index;

  • Regime 4 is identified if both the stock market index and the bond market index are in shock status.

Combining hidden states from stock and bond markets into four regimes is performed as follows:

  1. detect one of two hidden states (no-crisis, crisis) of stock \(S^S\) and bond funds \(S^B\) markets separately (MSCI and BB EURO correspondingly) using algorithm by Lavielle and Lebarbier (2001);

  2. extract probabilities of hidden states detected (Viterbi algorithm implementation in R, see Visser & Speekenbrink, 2010);

  3. combine \(S^S\) and \(S^B\) into four regimes of the financial markets. The procedure for every observation moment t (\(t=1,\ldots ,T\)) is following:

    • if probabilities \(P(S^S_t =\text {'no-crisis'})> 1/2\) and \(P(S^B_t=\text {'no-crisis'})> 1/2\) (no-crisis in stock and bond markets simultaneously) then regime 1 (no-crisis) is identified at time t;

    • if probabilities \(P(S^S_t =\text {'crisis'})> 1/2\) and \(P(S^B_t=\text {'no-crisis'})> 1/2\) then at time t regime 2 (stock crisis) is identified;

    • if probabilities \(P(S^S_t =\text {'no-crisis'})> 1/2\) and \(P(S^B_t=\text {'crisis'})> 1/2\) then at time t regime 3 (bond crisis) is identified;

    • if probabilities \(P(S^S_t =\text {'crisis'})> 1/2\) and \(P(S^B_t=\text {'crisis'})> 1/2\) then at time t regime 4 (global financial crisis) is identified;

  4. check if regimes identified are coherent with real historical situation in the financial markets.

In step 3, the threshold 1/2 can be increased if stronger evidence of being in a hidden state is needed. A probability greater than 1/2 means that it is more probable to be in the particular state than in the other one.
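The combination rule of step 3 can be sketched as follows. This is an illustrative implementation only: the function names, the input format (one crisis probability per market and per day) and the choice to return `None` for days on which neither state of a market exceeds the threshold are assumptions, not part of the original procedure.

```python
def market_in_crisis(p_crisis, threshold):
    """True / False if the crisis / no-crisis probability exceeds the threshold, else None."""
    if p_crisis > threshold:
        return True
    if 1.0 - p_crisis > threshold:
        return False
    return None  # undecided: possible only for thresholds above 1/2 or ties

def combine_regimes(p_stock_crisis, p_bond_crisis, threshold=0.5):
    """Map daily crisis probabilities of the two markets to regimes 1-4."""
    regimes = []
    for ps, pb in zip(p_stock_crisis, p_bond_crisis):
        stock = market_in_crisis(ps, threshold)
        bond = market_in_crisis(pb, threshold)
        if stock is None or bond is None:
            regimes.append(None)
        else:
            # 1: no crisis, 2: stock crisis, 3: bond crisis, 4: both markets
            regimes.append(1 + (1 if stock else 0) + (2 if bond else 0))
    return regimes

# Hypothetical probabilities for four days.
example = combine_regimes([0.1, 0.9, 0.2, 0.8], [0.1, 0.2, 0.7, 0.9])
```

With a threshold above 1/2, some days may remain unclassified, so the treatment of such days would have to be decided separately.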

Then, transitions between regimes are given by a transition matrix

$$\begin{aligned} P=\begin{pmatrix} p_{11} & p_{12} & p_{13} & p_{14} \\ p_{21} & p_{22} & p_{23} & p_{24} \\ p_{31} & p_{32} & p_{33} & p_{34} \\ p_{41} & p_{42} & p_{43} & p_{44} \\ \end{pmatrix}. \end{aligned} \tag{1}$$

As the hidden states \(s_t\) are governed by a time-independent transition matrix P with rows satisfying \(\sum _{j=1}^4 p_{ij}=1\), a vector \(\pi \) of steady-state probabilities is determined from \(\pi =\pi P\) subject to \(\sum _{i=1}^4 \pi _i=1\).

These probabilities describe the possibility of eventually appearing in a particular state. However, this can only be done if the Markov chain is stationary or follows the so-called Markovianity property (see Bickenbach & Bode, 2001). If the stationary distribution \(\pi \) exists, then the “long-term” probability of being in state i is given by \(\pi _i\). These probabilities can be interpreted as the average proportion of time that the system spends in each state (see Billingsley, 1995).
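As a minimal numerical sketch, the steady-state vector can be approximated by repeatedly applying \(\pi \leftarrow \pi P\) until it stops changing (power iteration). The example uses the transition matrix reported in Sect. 3.3; because its printed entries are rounded, the rows are renormalised first. The function names and the tolerance are illustrative assumptions, and the result is not claimed to reproduce Table 2.

```python
def normalise_rows(P):
    """Rescale each row to sum to one (absorbs rounding in published entries)."""
    return [[p / sum(row) for p in row] for row in P]

def stationary_distribution(P, tol=1e-12, max_iter=100_000):
    """Approximate pi = pi P by power iteration from the uniform vector."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("power iteration did not converge")

# Transition matrix as printed in Sect. 3.3 (rounded to four decimals).
P = normalise_rows([
    [0.9829, 0.0083, 0.0079, 0.0008],
    [0.0479, 0.9368, 0.0000, 0.0153],
    [0.0410, 0.0023, 0.9408, 0.0159],
    [0.0000, 0.0104, 0.0091, 0.9804],
])
pi = stationary_distribution(P)
```

Power iteration is chosen here only because it needs no linear algebra library; solving the linear system \(\pi (P - I) = 0\) with the normalisation constraint would be an equivalent alternative.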

Let \(T_j = \min \{n \ge 1: s_n = j\}\) denote the time of the first switch to state j. This is a random variable, with the special case \(T_j = \infty \) if the visit never occurs. The mean first-passage time (MFPT) is then the expected value of \(T_j\), that is, the average time until the chain first reaches state j (see Sheskin, 1995; Polizzi et al., 2016).
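For a chain started in state i different from j, the MFPT \(m_{ij}\) satisfies the standard first-step relation \(m_{ij} = 1 + \sum _{k \ne j} p_{ik} m_{kj}\). The sketch below solves this relation by fixed-point iteration. It is an illustration under stated assumptions: the function name, the convention of reporting 0 for the target state itself, and the two-state example matrix are hypothetical.

```python
def mean_first_passage_times(P, target, tol=1e-10, max_iter=1_000_000):
    """Expected number of steps to first reach `target` from each other state.

    Iterates m_i = 1 + sum_{k != target} P[i][k] * m_k; the entry for the
    target state itself is reported as 0 by convention.
    """
    n = len(P)
    m = [0.0] * n
    for _ in range(max_iter):
        nxt = [
            0.0 if i == target
            else 1.0 + sum(P[i][k] * m[k] for k in range(n) if k != target)
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(nxt, m)) < tol:
            return nxt
        m = nxt
    raise RuntimeError("iteration did not converge; is the target reachable?")

# Hypothetical two-state chain: leaving state 0 has probability 0.1 per step,
# so the expected waiting time to reach state 1 is 1 / 0.1 = 10 steps.
mfpt = mean_first_passage_times([[0.9, 0.1], [0.3, 0.7]], target=1)
```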

Once we have transition probabilities, we can generate sequences of future scenarios. Mixing them with separately generated returns in the corresponding state would imitate the potential performance of a pension fund or benchmark index.
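A regime path can be drawn from a transition matrix one step at a time, as in the following sketch. The function name, the 0-based regime labels, the starting regime and the random seed are illustrative assumptions; the matrix is the one printed in Sect. 3.3 and the path length matches the 1100 working days used later in Sect. 2.3.

```python
import random

def simulate_regimes(P, start, n_steps, rng):
    """Draw a regime path of length n_steps (0-based labels) from matrix P."""
    path = [start]
    states = range(len(P))
    for _ in range(n_steps - 1):
        # Row of the current regime gives the weights of the next regime.
        path.append(rng.choices(states, weights=P[path[-1]])[0])
    return path

# Transition matrix as printed in Sect. 3.3 (rounded to four decimals).
P = [
    [0.9829, 0.0083, 0.0079, 0.0008],
    [0.0479, 0.9368, 0.0000, 0.0153],
    [0.0410, 0.0023, 0.9408, 0.0159],
    [0.0000, 0.0104, 0.0091, 0.9804],
]
rng = random.Random(0)  # fixed seed only to make the sketch reproducible
path = simulate_regimes(P, start=0, n_steps=1100, rng=rng)
```

`random.choices` rescales the weights internally, so rows that sum to 0.9999 because of rounding need no separate normalisation here.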

2.2 Stress testing for Markov regime-switching model

The considered regimes are based on the behaviour observed on the stock market and the bond market. While the first regime is the most favourable for the participants, the other three describe the evolution of the indices during crises or non-typical circumstances. The worst regime is the last one. Therefore, the stress test is performed by increasing the transition probabilities from a better regime to a worse regime (state). In particular, for stress level k we multiply the probabilities \(p_{12}, p_{13}, p_{14}, p_{24}, p_{34}\) by k. The other nondiagonal probabilities are not changed because they correspond to a switch to a better or an indifferent state. Finally, the diagonal probabilities are decreased so that the sum of probabilities in each row equals 1. Assuming the same stress level k, all the probabilities of switching from a better to a worse regime are changed proportionally. For example, if the probability of transition from the no-crisis regime to the bond crisis regime is the same as to the stock crisis regime in the original case, then this remains true in the stressed case as well. Summarising, if the original probability transition matrix is \(\{p_{ij}\}_{i,j=1}^4\), then the stressed modification at level k is defined as follows:

$$\begin{aligned} p_{\times k, 12}= & {} p_{12}k \\ p_{\times k, 13}= & {} p_{13}k \\ p_{\times k, 14}= & {} p_{14}k \\ p_{\times k, 24}= & {} p_{24}k \\ p_{\times k, 34}= & {} p_{34}k \\ p_{\times k, 23}= & {} p_{23} \\ p_{\times k, ij}= & {} p_{ij},\quad i>j, \quad i,j = 1,2,3,4 \\ p_{\times k, ii}= & {} 1-\sum _{j\ne i} p_{\times k, ij}, \quad i=1,2,3,4 \end{aligned}$$

provided that all \(p_{\times k, ij}\), \(i,j=1,2,3,4\) are between 0 and 1, that is, stress levels for which some stressed probability exceeds 1 or some diagonal entry becomes negative are not feasible. Since it might be difficult to set the stress level in real-world applications, we suggest considering several stress levels and analysing the sensitivity of the results with respect to increasing stress levels, as we demonstrate in Sects. 3.3 and 3.4. When only one stress level has to be chosen for a particular break (for example, the start of a pandemic), we can estimate the transition matrix separately for the periods before and after the break. Comparing the two transition matrices, we can estimate the stress level k.
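The stressed matrix can be built mechanically from these rules, as the following sketch illustrates. The function name, the 0-based index pairs and the choice to raise an error for infeasible stress levels are assumptions made for the example; the matrix is the one printed in Sect. 3.3.

```python
# 0-based (i, j) pairs whose probabilities are multiplied by k:
# p12, p13, p14, p24 and p34 in the notation of the text.
STRESSED_PAIRS = {(0, 1), (0, 2), (0, 3), (1, 3), (2, 3)}

def stress_transition_matrix(P, k):
    """Return the stressed matrix at level k; raise ValueError if infeasible."""
    n = len(P)
    stressed = [row[:] for row in P]
    for i in range(n):
        for j in range(n):
            if (i, j) in STRESSED_PAIRS:
                stressed[i][j] = k * P[i][j]
        off_diagonal = sum(stressed[i][j] for j in range(n) if j != i)
        if off_diagonal > 1.0:
            raise ValueError(f"stress level {k} is not feasible for row {i + 1}")
        # The diagonal absorbs the change so that the row sums to one.
        stressed[i][i] = 1.0 - off_diagonal
    return stressed

# Transition matrix as printed in Sect. 3.3 (rounded to four decimals).
P = [
    [0.9829, 0.0083, 0.0079, 0.0008],
    [0.0479, 0.9368, 0.0000, 0.0153],
    [0.0410, 0.0023, 0.9408, 0.0159],
    [0.0000, 0.0104, 0.0091, 0.9804],
]
P2 = stress_transition_matrix(P, 2)
```

Because the diagonal is recomputed from the off-diagonal entries, the sketch also absorbs the small rounding error in the printed rows. For this rounded matrix, a level such as k = 60 is rejected because the stressed off-diagonal entries of the first row would sum to more than 1.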

In general, one can consider a stress matrix \(K = \{k_{ij}\}_{i,j=1}^{4}\) instead of a single stress level k, and the stressed transition probabilities could be defined as \(p_{\times K, ij} = p_{ij}k_{ij}\) for \(i\ne j\). However, it might be difficult to set all \(k_{ij}\) in a reasonable way. Therefore, we consider only the special choice described above, which is easy to interpret.

Having the stressed probability transition matrix, we generate future scenarios in the same way as for the original one.

2.3 Strategy analysis under different stress levels

We generate 5000 replications of Markovian processes (1100 working days length) with different transition probabilities. The transition probabilities (see a transition matrix (1)) are chosen as a base for the simulation.

First, we simulate trajectories with historical transition probabilities; this process serves as a benchmark because at time T the random variable behaves nearly asymptotically, as if it had evolved as before. Second, we apply a stress to the transition matrix by increasing the chances of moving to a worse market state, that is, we increase the transition probabilities on the upper-right side of the main diagonal (except \(p_{23}\)) and decrease the probabilities on the main diagonal correspondingly (see Sect. 2.2 for more details). We increase the probabilities of worsening by the rate \(k= 2, 3, 4, 10, 20\) and 50. The rate of 2 represents a light stress in the market, as the probabilities of worsening (e.g., moving to any type of crisis state from the non-crisis state) are doubled (\(2\times p_{ij}\) for the stressed pairs \(i<j\)) and the probability of remaining in state i is decreased by the sum of the original stressed probabilities in row i. Depending on the rate of worsening, we get different levels of stress. The higher the rate, the greater the probabilities of a crisis that we should expect in the future. In this way, we get new transition probabilities. Third, we simulate trajectories of the regime process (in the way described at the beginning of this section) with the stressed transition probabilities.

Next, having a process of regimes, we can simulate the stressed future behaviour of the returns of each fund. In this research, we use the historical simulation technique. The standard historical simulation is modified in the following way:

  1. Depending on the regime (\(i=1,\ldots ,4\)) expected, we resample historical returns from the corresponding regime only;

  2. We analyse three strategies (A, B, and C) of resampling:

    (a) Strategy A: the fund manager keeps the allocations of the fund unchanged (a passive reaction to a crisis in the markets);

    (b) Strategy B: during a stock market crisis (regimes 2 and 4) the fund manager changes the allocations of the fund to those of the slightly more conservative accumulation fund he manages (a weak reaction);

    (c) Strategy C: during a stock market crisis (regimes 2 and 4) the fund manager changes the allocations of the fund to those of the most conservative accumulation fund he manages (a panic reaction).
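The modified historical simulation can be sketched as follows. The sketch makes several labelled assumptions: a change of allocation is represented by drawing the daily return from the history of the more conservative fund, daily log-returns are summed along the path, and the fund names, the data layout `history[fund][regime]` and all numbers are hypothetical.

```python
import random

STOCK_CRISIS_REGIMES = {2, 4}  # regimes with a shock on the stock market

def fund_for_day(strategy, regime, own, next_conservative, most_conservative):
    """Fund whose historical returns are resampled on a day in `regime`."""
    if strategy == "A" or regime not in STOCK_CRISIS_REGIMES:
        return own
    return next_conservative if strategy == "B" else most_conservative

def simulate_cumulative_return(path, history, strategy, own,
                               next_conservative, most_conservative, rng):
    """Sum of daily log-returns resampled regime by regime along `path`."""
    total = 0.0
    for regime in path:
        fund = fund_for_day(strategy, regime, own,
                            next_conservative, most_conservative)
        total += rng.choice(history[fund][regime])
    return total

# Hypothetical history with a single return per fund and regime, so that the
# outcome of the toy example does not depend on the random draw.
history = {
    "risky":  {1: [0.0010], 2: [-0.020], 3: [0.0005],  4: [-0.030]},
    "medium": {1: [0.0006], 2: [-0.010], 3: [0.0000],  4: [-0.015]},
    "safe":   {1: [0.0002], 2: [-0.001], 3: [-0.0005], 4: [-0.002]},
}
path = [1, 2, 4, 3, 1]
rng = random.Random(0)
results = {
    s: simulate_cumulative_return(path, history, s, "risky", "medium", "safe", rng)
    for s in ("A", "B", "C")
}
```

In a full experiment, `path` would be one of the simulated regime trajectories and each regime-specific list would hold all historical daily log-returns of the fund observed in that regime.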

Finally, we take simulated values \(v(\tau , k, \varPsi )\) of the process at time T from all trajectories \(\tau = 1,\ldots ,5000\) for all stress levels \(k = 2,3,4,10,20,50\) and for all strategies \(\varPsi = A, B, C\) and get the potential distribution of random variable (fund return) at time T for each stress level and strategy. The distribution obtained can be used for scenario generation and for strategy comparison. By performing an experiment using such a scheme, we can answer the important question: for which stress level is it wise to change investment strategy of pension funds during shocks and crisis in the stock markets.

From an optimisation point of view, we need to solve the following discrete choice multiobjective problem for each \(k = 2,3,4,10,20,50\):

$$\begin{aligned} \min _{\varPsi } "F_1(v(\tau , k, \varPsi )),\ldots ,F_m(v(\tau , k, \varPsi ))" \end{aligned}$$

where \(F_1(\cdot ),\ldots ,F_m(\cdot )\) are the criteria considered, for example, mean loss, standard deviation, a risk measure, a deviation measure, etc.
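Since only three strategies are compared, the multiobjective problem can be handled by evaluating the criteria for each strategy and discarding the dominated ones. The sketch below is one possible reading, not the procedure of the paper: it assumes four criteria to be minimised (mean loss, standard deviation, VaR and CVaR of the loss), a simple empirical tail definition of VaR and CVaR, and hypothetical function names and numbers.

```python
import math

def criteria(values, alpha=0.05):
    """(mean loss, standard deviation, VaR, CVaR) of simulated returns.

    Loss is the negative return; VaR and CVaR use the ceil(alpha * n)
    largest losses (an assumed empirical convention).
    """
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    losses = sorted((-v for v in values), reverse=True)  # largest loss first
    n_tail = max(1, math.ceil(alpha * n))
    tail = losses[:n_tail]
    return (-mean, std, tail[-1], sum(tail) / n_tail)

def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated(results):
    """Strategies whose criteria vector is not dominated by another strategy."""
    return [s for s, c in results.items()
            if not any(dominates(o, c) for t, o in results.items() if t != s)]

# Hypothetical criteria vectors for strategies A, B and C at one stress level.
results = {
    "A": (0.05, 0.03, 0.10, 0.12),
    "B": (0.03, 0.02, 0.06, 0.08),
    "C": (0.04, 0.01, 0.03, 0.04),
}
front = non_dominated(results)
```

In this toy example strategy A is dominated by B, while B and C remain as a trade-off between mean loss and risk.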

3 Numerical results

This section begins with a brief overview of the daily log-returns of the Lithuanian pension funds. Second, the hidden Markov model technique is used to detect crisis/no-crisis regimes in global financial markets (2007–2022). Third, future scenarios of pension funds are generated using stressed transition matrices and historical daily log-returns. Finally, different strategies of response to a crisis are compared, and the trade-off is discussed.

3.1 Quick overview of pension fund results

Since 2019, there have been 40 IInd pillar pension funds in Lithuania: 35 life-cycle funds that allocate assets based on the participant’s age and 5 asset preservation funds (herein denoted by index T) that are designed for participants who have reached retirement age and/or have chosen to receive periodic payments from the fund. In particular, life-cycle pension funds are arranged into the following birth-year groups: 1954–1960, 1961–1967, 1968–1974, 1975–1981, 1982–1988, 1989–1995 and 1996–2002 (herein, the last two digits of each year will be used to denote the birth-year group). All these funds are managed by five pension accumulation companies, specifically ’Allianz Lietuva gyvybės draudimas’ (Allianz), ’INVL Asset Management’ (INVL), ’Luminor investiciju valdymas’ (Luminor), ’SEB investiciju valdymas’ (SEB), and ’Swedbank investiciju valdymas’ (Swedbank). In the study, daily returns have been chosen as the main variable that represents the performance of the pension fund for the period between January 2019 and the end of August 2022 (see Fig. 1).

Fig. 1 Distribution of returns of IInd pillar PFs. Manager is given in the legend, while the participant’s birth-year is attributed to the title of image panel. The special case “T” stands for the asset preservation PFs

Figure 1 draws our attention to the many extreme returns observed for all pension funds, with a heavier tail on the left side of the distribution. This has been induced by COVID-19 and the Russian invasion of Ukraine, which sent a sudden shockwave through financial markets around the world. The largest deviations, observed for the pension funds of young participants, namely from the 68–74 to the 96–02 birth-year groups, could be explained by a higher allocation to equities in the investment portfolio. However, some differences could be observed between managers. Specifically, LMNR funds demonstrated the lowest variation, except for the 54–60 birth-year group, while the highest uncertainty was observed for INVL and SWED funds. Surprisingly, the conservative funds of the 54–60 birth-year group and the asset-preserving funds (denoted with index T) also experienced a sharp drop, which is a result of the extraordinary volatility observed in the bond market. Table 1 summarises the pension fund market in terms of mean, standard deviation (StDev), minimum, maximum, skewness, kurtosis, value-at-risk (VaR), conditional value-at-risk (CVaR), maximum drawdown (MDD), average recovery time (ART), correlation between fund and MSCI (CorMSCI), and correlation between fund and BB EURO (CorBB). Additionally, the same empirical characteristics are given for the indices.

Table 1 Descriptive statistics of pension funds over January 2019–August 2022

From Table 1, as expected, the funds for young participants exhibited a larger mean return comparatively, with the largest one observed for the Swedbank funds. The risk quantified using StDev, VaR and CVaR had a tendency to increase in line with a growing portion of equities in the portfolio, with the largest estimates determined for Swedbank as well. As such, among PACs, balanced risk-return performance could be observed for Luminor funds. However, the funds of older participants suffered high losses determined by negative skewness and kurtosis, which holds for all PACs. Maximum drawdown, MDD, is one of those characteristics that historically measures the maximum loss observed from a peak. As can be seen in the table, MDD for young participants ranges from 28 to 32%, while for older participants—from 10 to 22%. Correspondingly, the average time needed to recover from the drawdown, named ART, shows that it typically requires, on average, 9 days for the funds of young participants to revert to the long-run mean. Surprisingly, a different tendency is observed for older participants’ funds, where comparatively a long ART is observed for Luminor funds, while other funds in this group have been able to handle it more rapidly. The last two columns of Table 1 imply that the strongest correlation with the MSCI index is observed for the Swedbank and Luminor funds, and the weakest correlation has been determined for Allianz. For the index of BB EURO, the strongest relation is found, in general, for Luminor funds, where comparatively high correlation was determined for all asset-preserving funds. Visually, all possible correlations are shown in Fig. 13.

Furthermore, the correlations between various performance measures of all pension funds analysed and market indices are provided in Fig. 14.

3.2 Detection of regimes in main world indices

The sample period used in the analysis starts in January 2007 and ends in September 2022, covering the global financial crisis, the COVID-19 outbreak, the Russian invasion of Ukraine, and other disturbances that caused abrupt price changes (see Fig. 2).

Fig. 2 Returns of MSCI and BB EURO

As evidenced in Fig. 2, both MSCI and BB EURO historically exhibited volatility clustering: volatility changes over time and tends to persist, producing periods of low and high volatility. In particular, there exist subperiods when high/low volatility is observed for each index separately, and subperiods with differing volatility between the indices. As such, the global market represented by MSCI and BB EURO is described by the 4-state Markov regime-switching model introduced in Sect. 2.1. Specifically, Fig. 3 illustrates the regimes detected over the sample period.

Fig. 3 Market regimes detected

Figure 3 shows that the switch to Regime 3 is much more often observed in the first half of the observed period, except for the period of global financial crisis, while the other half of the period has been mainly observed in Regime 2 or 4, with a comparatively longer duration in Regime 1. Consequently, Fig. 4 presents the empirical characteristics of a particular regime.

Fig. 4 Descriptive statistics of regimes for MSCI World and BB EURO indices

In Fig. 4, the empirical characteristics for Regime 1 summarise the normal market situation. The switch to Regime 2 particularly worsens the characteristics of the stock market represented by MSCI. It can be seen that the mean return becomes negative, which is accompanied by increased uncertainty in terms of StDev, Range, and Drawdown. By comparison, the bond market represented by BB EURO is not directly exposed to a short shock observed in the stock market. Similarly, the switch to Regime 3 corresponds to a shock observed in the bond market, which is reflected in a lower mean return and increased risk characteristics. In particular, the stress observed in the bond market slightly pushed stock prices up. Finally, Regime 4 represents a financial crisis observed in both the stock and bond markets. The black histogram represents the distribution of the historical data for each index. The coloured curve shows the normal probability distribution under each regime.

Once a certain regime is determined in the market, the performance results of the pension funds are conditioned by the regime, as outlined in Fig. 5. Here, each empirical distribution obtained under a particular regime has been used to generate a conditional density \(b_j^k\) introduced in Sect. 2.1.

Fig. 5 Distribution of fund returns for a particular regime (from top: regime 1, regime 2, regime 3, regime 4)

A comparison of the plots in Fig. 5 reveals that the largest variation in returns, with heavy tails, is observed under Regime 4, as expected. Regime 3 differs from the others by the smallest variation overall and, more interestingly, by a central tendency and spread that differ among pension accumulation companies, especially for the birth-year groups 61–67 and 96–02. Under Regime 2, comparatively large positive and negative returns are to be expected, with a heavier left tail.

Correlations between the log-returns of funds in different regimes are shown in Fig. 16, and correlations between various performance measures in different regimes are shown in Fig. 15.

3.3 Stress simulation

The matrix of transition probabilities of different regimes observed in the market was determined over a historical period of 01/01/2007–13/09/2022. In particular, we obtain

$$\begin{aligned} P=\begin{pmatrix} 0.9829 &{} 0.0083 &{} 0.0079 &{} 0.0008 \\ 0.0479 &{} 0.9368 &{} 0 &{} 0.0153 \\ 0.0410 &{} 0.0023 &{} 0.9408 &{} 0.0159 \\ 0 &{} 0.0104 &{} 0.0091 &{} 0.9804 \\ \end{pmatrix} \end{aligned}$$

Fig. 6 Transition graph

Consequently, the probabilities of steady states estimated from historical data are given in Table 2.

Table 2 Probabilities of steady states \(\pi _i, i=1,\ldots ,4\) observed historically

Once the transition probabilities have been found, we can check whether the sequence of regimes satisfies the Markov property by performing a \(\chi ^2\) test. With a p-value of 0.651, we conclude that the sequence follows the Markov property, and we can proceed with the steady states of the ergodic system (the hypothesis of irreducibility was not rejected).
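As an illustration of how steady-state probabilities follow from a transition matrix, the sketch below solves \(\pi P=\pi\) with \(\sum_i \pi_i=1\) for the matrix \(P\) printed above. It is a minimal sketch, not the authors' estimation code: the function name `stationary_distribution` is hypothetical, and because the printed entries are rounded to four decimals the rows are renormalised first, so the result can differ slightly from Table 2.

```python
import numpy as np

# Transition matrix P as printed above (entries rounded to four decimals).
P = np.array([
    [0.9829, 0.0083, 0.0079, 0.0008],
    [0.0479, 0.9368, 0.0000, 0.0153],
    [0.0410, 0.0023, 0.9408, 0.0159],
    [0.0000, 0.0104, 0.0091, 0.9804],
])

def stationary_distribution(P):
    """Solve pi P = pi together with sum(pi) = 1 by least squares."""
    P = np.asarray(P, dtype=float)
    P = P / P.sum(axis=1, keepdims=True)      # absorb rounding error in the printed rows
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.append(np.zeros(n), 1.0)
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

pi = stationary_distribution(P)
print(np.round(pi, 4))   # steady-state probabilities of Regimes 1 to 4
```

The normalisation constraint is appended as an extra row, so the otherwise singular system \((P^{\top}-I)\pi=0\) has a unique solution for an irreducible chain.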

The mean first passage times (MFPT) of the identified regimes are as follows.

$$\begin{aligned} m=\begin{pmatrix} 0.00 &{} 126.86 &{}140.42&{} 251.41\\ 37.88 &{} 0.00 &{}154.59&{} 206.55\\ 42.94&{} 135.50 &{} 0.00 &{}198.88\\ 91.38 &{} 114.37 &{}133.58 &{} 0.00 \end{pmatrix}. \end{aligned}$$

The elements of this matrix can be interpreted as the average time needed to reach the column state from the row state. For example, \(m_{12}=126.9\) means that, starting from the no-crisis regime, the process is expected to enter the stock market crisis regime after 126.9 days, while a global crisis is expected after \(m_{14}=251.4\) days. Moreover, if there is currently a global crisis, one needs to wait \(m_{41}=91\) days on average to return to the "normal" regime.
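The mean first passage times can be obtained from the transition matrix by solving, for each target state \(j\), the linear system \(m_{ij}=1+\sum_{l\ne j}p_{il}\,m_{lj}\). The sketch below is a minimal illustration under that standard definition; the function name `mean_first_passage_times` is hypothetical, and since the entries of \(P\) are used as printed (rounded), the output should only be expected to agree approximately with the matrix \(m\) above.

```python
import numpy as np

# Transition matrix P as printed above (entries rounded to four decimals).
P = np.array([
    [0.9829, 0.0083, 0.0079, 0.0008],
    [0.0479, 0.9368, 0.0000, 0.0153],
    [0.0410, 0.0023, 0.9408, 0.0159],
    [0.0000, 0.0104, 0.0091, 0.9804],
])

def mean_first_passage_times(P):
    """m[i, j]: expected number of steps to first reach state j from state i.

    The diagonal is left at 0, matching the convention of the matrix above.
    """
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    m = np.zeros((n, n))
    for j in range(n):
        keep = [i for i in range(n) if i != j]
        Q = P[np.ix_(keep, keep)]             # transitions that avoid the target state j
        m[keep, j] = np.linalg.solve(np.eye(n - 1) - Q, np.ones(n - 1))
    return m

m = mean_first_passage_times(P)
print(np.round(m, 2))
```

Entries that depend on very small probabilities (such as \(p_{14}\)) are the most sensitive to the rounding of \(P\).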

The simulation of stress follows the methodology given in Sect. 2.2. The stress levels \(k=2, 3, 4, 10, 20\) and 50 have been selected for demonstration purposes. For instance, \(\times 2\) means that a worse regime is expected twice as often as historically. As such, we get new transition probability matrices that describe the market behaviour under different levels of stress.

$$\begin{aligned} P_{\times 2}= & {} \begin{pmatrix} 0.9658&{} 0.0167&{} 0.0158&{} 0.0017\\ 0.0479&{} 0.9216&{} 0&{} 0.0305\\ 0.0410&{} 0.0023&{} 0.9248&{} 0.0319\\ 0&{} 0.0104&{} 0.0091 &{}0.9804\\ \end{pmatrix}, \quad P_{\times 3}=\begin{pmatrix} 0.9487&{} 0.0250&{} 0.0238&{} 0.0025\\ 0.0479&{} 0.9063&{} 0&{} 0.0458\\ 0.0410&{} 0.0023&{} 0.9089&{} 0.0478\\ 0 &{}0.0104 &{}0.0091 &{}0.9804\\ \end{pmatrix},\\ P_{\times 4}= & {} \begin{pmatrix} 0.9316&{} 0.0334&{} 0.0317&{} 0.0033\\ 0.0479&{} 0.8911&{} 0 &{} 0.0610\\ 0.0410&{} 0.0023&{} 0.8929&{} 0.0638\\ 0 &{} 0.0104&{} 0.0091&{} 0.9804\\ \end{pmatrix}, \quad P_{\times 10}=\begin{pmatrix} 0.8290&{} 0.0834 &{}0.0792 &{}0.0083\\ 0.0479 &{}0.7996 &{}0 &{}0.1525\\ 0.0410 &{}0.0023 &{}0.7973 &{}0.1595\\ 0 &{}0.0104 &{}0.0091 &{}0.9804\\ \end{pmatrix},\\ P_{\times 20}= & {} \begin{pmatrix} 0.6581 &{}0.1668&{} 0.1585&{} 0.0167\\ 0.0479&{} 0.6471&{} 0&{} 0.3050\\ 0.0410&{} 0.0023 &{}0.6378 &{}0.3189\\ 0 &{}0.0104 &{}0.0091 &{}0.9804\\ \end{pmatrix}, \quad P_{\times 50}=\begin{pmatrix} 0.1451 &{}0.4170 &{}0.3962 &{}0.0417\\ 0.0479 &{}0.1895 &{}0 &{}0.7625\\ 0.0410 &{}0.0023 &{}0.1595 &{}0.7973\\ 0 &{}0.0104 &{}0.0091 &{}0.9804\\ \end{pmatrix}. \end{aligned}$$
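The stressed matrices above appear to follow a simple pattern, which the sketch below reproduces approximately. The rule is inferred from the printed numbers and is an assumption; the methodology of Sect. 2.2 remains authoritative. Under this reading, every transition into a more severe regime is multiplied by \(k\), with Regime 1 least severe, Regimes 2 and 3 equally severe and Regime 4 most severe, while the diagonal is reduced so that each row still sums to one. The function name `stress_matrix` and the `severity` ranking are hypothetical.

```python
import numpy as np

# Historical transition matrix P as printed earlier (rounded to four decimals).
P = np.array([
    [0.9829, 0.0083, 0.0079, 0.0008],
    [0.0479, 0.9368, 0.0000, 0.0153],
    [0.0410, 0.0023, 0.9408, 0.0159],
    [0.0000, 0.0104, 0.0091, 0.9804],
])

def stress_matrix(P, k, severity=(0, 1, 1, 2)):
    """Scale transitions into more severe regimes by k and rebalance the diagonal.

    severity[i] is an assumed ranking of regime i+1 (higher means worse).
    """
    P = np.asarray(P, dtype=float)
    sev = np.asarray(severity)
    worse = sev[None, :] > sev[:, None]       # worse[i, j]: regime j is more severe than i
    S = np.where(worse, k * P, P)
    np.fill_diagonal(S, 0.0)
    diag = 1.0 - S.sum(axis=1)                # probability left for staying in the regime
    if np.any(diag < 0):
        raise ValueError("stress level too high: a row would exceed probability 1")
    np.fill_diagonal(S, diag)
    return S

print(np.round(stress_matrix(P, 2), 4))
```

Because the inputs are rounded, the multiplication by \(k\) amplifies the rounding error, so agreement with the printed matrices is closest for small \(k\).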

Graph representations of the transition matrices under different stress levels are provided in Fig. 17.

The transition matrices above are used to generate sequences of regime changes in the market, as already presented in Fig. 4. The probabilities of steady states under different stress levels are provided in Table 3.

Table 3 Probabilities of steady states \(\pi _i, i=1,\ldots ,4\) under different stress levels \(\times k\)

Comparing the values given in Tables 2 and 3, we can see that the probability of being in Regime 4 increases quite drastically as the stress level increases. Furthermore, when the stress level is above 10, Regime 4 is highly probable (\(\pi >0.9\)) and behaves like an absorbing state, that is, a permanent global financial crisis is observed in the market.

Furthermore, from the matrices of mean first passage times (MFPT) and the duration of a financial crisis (Regime 4), it is possible to deduce the stress level of a particular crisis. For example, if \(k=7\), then

$$\begin{aligned} m_{\times 7}=\begin{pmatrix} 0 &{} 52.81 &{} 59.89 &{} 20.09 \\ 143.20 &{} 0 &{} 95.44 &{} 12.69 \\ 151.33 &{} 86.30 &{} 0 &{} 11.96 \\ 198.13 &{} 91.41 &{} 102.03 &{} 0 \\ \end{pmatrix}. \end{aligned}$$

Assuming that a global financial crisis (Regime 4) started on 31 January 2022 (before Russia invaded Ukraine) and had not ended by December 2022 (had not switched to Regime 1), which is exactly 198 working days or 9 months, we can deduce that the stress level of this crisis is above 7, as \(m_{41}=198.13\). Moreover, if the crisis does not end within 250 days (one year), the stress level will exceed 10. Similarly, we can estimate the stress level of past crises; for example, the financial crisis of 2008–2009 lasted for approximately two years, so its stress level could exceed 25.
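The reasoning above, matching an observed crisis duration to the stress level at which \(m_{41}\) reaches that duration, can be sketched as a scan over \(k\). This is a minimal sketch under two assumptions: the stressing rule is the one inferred from the printed matrices (transitions into more severe regimes are multiplied by \(k\), with the diagonal rebalanced), and the rounded entries of \(P\) are used, so values agree with the text only approximately. The helper names `stress`, `mfpt_to` and `m41` are hypothetical.

```python
import numpy as np

# Historical transition matrix P as printed earlier (rounded to four decimals).
P = np.array([
    [0.9829, 0.0083, 0.0079, 0.0008],
    [0.0479, 0.9368, 0.0000, 0.0153],
    [0.0410, 0.0023, 0.9408, 0.0159],
    [0.0000, 0.0104, 0.0091, 0.9804],
])

def stress(P, k, severity=(0, 1, 1, 2)):
    """Assumed rule: multiply transitions into more severe regimes by k."""
    sev = np.asarray(severity)
    S = np.where(sev[None, :] > sev[:, None], k * P, P)
    np.fill_diagonal(S, 0.0)
    np.fill_diagonal(S, 1.0 - S.sum(axis=1))  # rows sum to one again
    return S

def mfpt_to(P, target):
    """Mean first passage times into `target` from every other state."""
    keep = [i for i in range(len(P)) if i != target]
    Q = P[np.ix_(keep, keep)]
    t = np.linalg.solve(np.eye(len(keep)) - Q, np.ones(len(keep)))
    return dict(zip(keep, t))

def m41(k):
    """Expected days to return from Regime 4 to Regime 1 at stress level k."""
    return mfpt_to(stress(P, k), target=0)[3]

for k in (1, 2, 3, 4, 7, 10, 20, 50):
    print(f"k = {k:2d}: m41 = {m41(k):7.1f} days")
```

Reading the printed values against an observed duration gives the implied stress level: the smallest \(k\) whose \(m_{41}\) exceeds the number of days the crisis has lasted.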

3.4 Strategy analysis

In this subsection, the strategies listed in Sect. 2.3 are examined for the pension funds of Swedbank. This pension accumulation company has been selected for demonstration purposes for two main reasons. First, Swedbank holds the largest share (40%) of the Lithuanian IInd pillar PF market. Second, the historical correlation of Swedbank funds with the MSCI World and BB EURO indices is among the largest. The simulation of regimes with different stress levels would work in the same way for other pension accumulation companies.

3.4.1 Case A: Swedbank keeps the strategy unchanged

Suppose Swedbank does not change the portfolio allocations of its funds regardless of the expected stress level. The following figures show the simulated trajectories of the pension funds for 5 years ahead under different stress levels.

Fig. 7 Simulated trajectories of cumulative log-returns of Swedbank 54–60 in case of strategy A
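Trajectories of this kind can be generated by first simulating a regime path from a (stressed) transition matrix and then resampling, for each day, a historical fund return observed under the simulated regime. The sketch below illustrates that idea only: the function names are hypothetical, and the `placeholder` return samples are made-up numbers standing in for the regime-conditional historical returns, which are not reproduced here. It uses 200 paths to stay fast; the paper reports 5000 trajectories of 1100 days.

```python
import numpy as np

def simulate_regime_paths(P, n_steps, n_paths, start, rng):
    """Simulate n_paths Markov chains of length n_steps, vectorised across paths."""
    P = np.asarray(P, dtype=float)
    cum = np.cumsum(P / P.sum(axis=1, keepdims=True), axis=1)
    cum[:, -1] = 1.0                          # guard against floating-point shortfall
    states = np.full(n_paths, start)
    paths = np.empty((n_paths, n_steps), dtype=int)
    for t in range(n_steps):
        u = rng.random(n_paths)
        states = (u[:, None] >= cum[states]).sum(axis=1)   # inverse-CDF draw per path
        paths[:, t] = states
    return paths

def simulate_cumulative_log_returns(P, returns_by_regime, n_steps=1100,
                                    n_paths=5000, start=0, seed=0):
    """Historical simulation: resample regime-conditional returns along each path."""
    rng = np.random.default_rng(seed)
    paths = simulate_regime_paths(P, n_steps, n_paths, start, rng)
    r = np.empty(paths.shape)
    for s, sample in enumerate(returns_by_regime):
        mask = paths == s
        r[mask] = rng.choice(np.asarray(sample, dtype=float),
                             size=int(mask.sum()), replace=True)
    return np.cumsum(r, axis=1)

# Historical transition matrix as printed earlier (rounded).
P = np.array([
    [0.9829, 0.0083, 0.0079, 0.0008],
    [0.0479, 0.9368, 0.0000, 0.0153],
    [0.0410, 0.0023, 0.9408, 0.0159],
    [0.0000, 0.0104, 0.0091, 0.9804],
])

# Placeholder samples (NOT the paper's data): daily log-returns per regime.
toy = np.random.default_rng(1)
placeholder = [toy.normal(mu, sd, 500) for mu, sd in
               [(0.0004, 0.004), (-0.0010, 0.010), (0.0002, 0.006), (-0.0020, 0.015)]]

trajectories = simulate_cumulative_log_returns(P, placeholder, n_paths=200)
print(trajectories.shape)
```

Replacing `P` with a stressed matrix and `placeholder` with the empirical regime-conditional returns of a fund gives the stressed counterpart of the same experiment.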

From Fig. 7 and Figs. 18–24 in the Appendix, we can see that the trend of the trajectories decreases as the stress level increases. This is not a surprise, because the probability of negative log-returns is then much higher than that of positive ones. It is interesting to note that, for all funds analysed, the trend remains positive if the stress level does not exceed 2. However, when the stress level is above 3, Swedbank funds tend to have negative cumulative log-returns. This applies primarily to the most conservative funds (Swedbank 54–60 and Swedbank_T). In particular, the trend becomes negative for the fund Swedbank 61–67 when the stress level is above 10, and for the funds Swedbank 68–74, Swedbank 89–95 and Swedbank 96–02 when the stress level exceeds 50. Furthermore, the trend is always positive only for two funds, Swedbank 75–81 and Swedbank 82–88. These two funds could be treated as resilient to any stress. Additionally, Fig. 25 in the Appendix shows the empirical densities of cumulative log-returns of Swedbank funds under different stress levels at time \(T=1100\).

The statements above are supported by descriptive statistics of cumulative log-returns (see Table 4) at the end of the simulated period.

Table 4 Influence of stress on the statistics of simulated returns from 5000 trajectories at time \(T=1100\) for Strategy A

From Table 4 we can draw several conclusions. First, funds with a lower share in equities exhibit negative log-returns sooner. The standard deviation values show that a larger deviation is observed for the equity funds, as expected. However, within the same pension fund the standard deviation tends to increase when the stress level doubles or triples, yet under severe stress, i.e., at \(\times 10\) or higher, it falls even below the no-stress value. Comparatively low values of skewness and kurtosis indicate a symmetric distribution of log-returns without heavy tails, which is observed in all cases. Risk measures such as VaR and CVaR indicate the expected loss, which becomes more severe at higher stress levels. In particular, for funds with a larger share in equities, for example the birth-year groups from 68–74 to 96–02, no loss is expected in terms of VaR and CVaR. Finally, where the Sharpe ratio is greater than 1, we can conclude that the risk was comparatively well managed even though some stress was present. This does not hold for the conservative funds, which demonstrates once again that investing only in bonds over the long term, with market crashes expected, would not ensure a positive return.
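The statistics discussed above can be computed from the simulated terminal cumulative log-returns as in the following sketch. The exact definitions used in the paper are given in its methodology section and are not repeated here, so the conventions below are assumptions: VaR is the empirical \(\alpha\)-quantile, CVaR is the mean of outcomes at or below VaR, the Sharpe ratio is the mean divided by the standard deviation (zero risk-free rate), and the downside deviation is measured from 0. The function name `summarise` is hypothetical.

```python
import numpy as np

def summarise(x, alpha=0.05):
    """Descriptive and risk statistics of terminal cumulative log-returns."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    std = x.std(ddof=1)
    z = (x - mean) / x.std()                  # standardised with the population std
    var = np.quantile(x, alpha)               # empirical alpha-quantile
    cvar = x[x <= var].mean()                 # average outcome in the lower tail
    downside = np.sqrt(np.mean(np.minimum(x, 0.0) ** 2))
    return {
        "mean": mean,
        "std": std,
        "skewness": np.mean(z ** 3),
        "excess_kurtosis": np.mean(z ** 4) - 3.0,
        "VaR": var,
        "CVaR": cvar,
        "downside_dev": downside,
        "sharpe": mean / std,
    }

# Usage on a tiny illustrative sample (not simulation output).
stats = summarise([-2.0, -1.0, 0.0, 1.0, 2.0], alpha=0.2)
for name, value in stats.items():
    print(f"{name:16s} {value: .4f}")
```

With these sign conventions a negative VaR or CVaR denotes a loss, so "no loss expected" corresponds to non-negative values.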

3.4.2 Case B: Swedbank changes its strategy to less conservative

Here, we simulate regimes and change the allocations of funds from their current group to a slightly older group if a crisis in stock markets is forecast (Regime 2 or 4).

From Figs. 26–33 we can see that the trend of the trajectories decreases as the stress level increases. This is not a surprise, because the probability of negative log-returns is then much higher than that of positive ones. As with Strategy A, the trend remains positive if the stress level does not exceed 2; when the stress level is above 3, the trend of the most conservative funds (Swedbank 54–60 and Swedbank_T) becomes negative, as does that of Swedbank 61–67 when the stress level is above 10. Unlike with Strategy A, the trend becomes negative for the Swedbank 68–74 fund when the stress level is 20, and for Swedbank 75–81 when the stress level reaches 50. Moreover, the trend is always positive only for three Swedbank funds: 82–88, 89–95, and 96–02. These funds could be treated as resilient to any stress.

The statements above are supported by statistics of cumulative log-returns (see Table 5) at the end of 1100 days (5 years) and histograms (Fig. 34).

3.4.3 Case C: Swedbank changes its strategy to very conservative

Here, we simulate regimes and change the allocations of funds from their current group to the oldest group if a crisis in stock markets is forecast (Regime 2 or 4).
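The switching rules of Cases A, B and C can be summarised as a mapping from the current regime to the fund whose allocation is applied on that day. The sketch below is illustrative only: the function name `fund_in_use` is hypothetical, the list covers the birth-year groups named in the text and leaves out Swedbank_T, and it assumes that "a slightly older group" means the adjacent older group and that "the oldest group" is 54–60.

```python
# Birth-year groups named in the text, ordered from oldest (most conservative).
FUNDS = ["54-60", "61-67", "68-74", "75-81", "82-88", "89-95", "96-02"]
STOCK_CRISIS_REGIMES = (2, 4)     # regimes with a crisis in stock markets

def fund_in_use(own, regime, strategy):
    """Index into FUNDS of the allocation applied in `regime` (1 to 4).

    own      -- index of the participant's own fund
    strategy -- "A" (no reaction), "B" (adjacent older group), "C" (oldest group)
    """
    if strategy not in ("A", "B", "C"):
        raise ValueError(f"unknown strategy: {strategy!r}")
    if strategy == "A" or regime not in STOCK_CRISIS_REGIMES:
        return own
    if strategy == "B":
        return max(own - 1, 0)    # the oldest group has nowhere older to move
    return 0                      # Strategy C: assumed to be the oldest group

# Example: a participant in the 75-81 fund along a short regime path.
path = [1, 1, 2, 2, 4, 3, 1]
for s in ("A", "B", "C"):
    print(s, [FUNDS[fund_in_use(3, r, s)] for r in path])
```

In a simulation, the daily return would then be resampled from the regime-conditional returns of the fund selected by this rule rather than from the participant's own fund.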

From Figs. 35–42 we can see that the trend of the trajectories decreases as the stress level increases. As with Strategies A and B, the trend remains positive if the stress level does not exceed 2; when the stress level is above 3, the trend of the most conservative funds (Swedbank 54–60 and Swedbank_T) becomes negative. Unlike with Strategies A and B, the trend becomes negative for Swedbank funds 61–67, 68–74, 75–81, and 82–88 when the stress level is above 10. Furthermore, the trend becomes negative even for the Swedbank 89–95 and 96–02 funds (at stress levels \(k \ge 20\)). Under Strategy C, none of the funds could be treated as resilient.

The statements above are supported by statistics of cumulative log-returns (see Table 6) at the end of 1100 days (5 years) and histograms (Fig. 43).

3.4.4 Comparison of Strategies A, B and C

In this subsection, we show how the choice of strategy during the shock period influences the expected results (simulated) of Swedbank pension funds under different levels of stress. Figure 8 shows the distributions of the returns of Swedbank pension funds.

Fig. 8 Box-plots of simulated log-returns of Swedbank funds under different stress levels (comparison of Strategies A, B and C)

From Fig. 8 we can clearly see that, regardless of the birth-year group of the participant, Strategies B and C reduce the volatility of the returns. However, with some exceptions, expected returns also decrease. What is really surprising (counterintuitive) in this figure is that, for most of the funds, under small stress (\(k\le 3\)) or no stress at all, reallocation to a more conservative fund during crisis periods increases the expected return. To better explain this and to show the balance between expected return and variability after 5 years, Fig. 9 provides the Sharpe ratio values.

Fig. 9 Sharpe ratio of simulated Swedbank log-returns under different stress levels (comparison of Strategies A, B and C)

Figure 9 allows us to compare the Sharpe ratio of simulated log-returns of Swedbank pension funds under different stress levels and the different reaction Strategies A, B and C. The Sharpe ratio clearly varies widely (from nearly 3 to nearly \(-5\)) depending on the stress level and the birth-year group of the participant. For the oldest participants (Swedbank 54–60 and T), the Sharpe ratio does not depend on the choice of strategy, as the allocations are always the same. For these funds, performance is quite poor compared to the other age groups regardless of the stress level, and if the stress level is higher than 2 (\(k>2\)) the Sharpe ratio becomes negative. Therefore, we can state that very conservative investments during stressed periods are not the best choice. Furthermore, the younger the participant, the better the performance of the pension fund, which is not surprising when there is no crisis in the market. In the face of a crisis, however, performance decreases as the stress level increases. What is counterintuitive is that if the pension fund reallocates investments to slightly more conservative ones during stress periods (Strategies B and C), performance increases significantly. Moreover, until some stress level (\(k\ge 10\)) the performance can increase even if the allocations are switched to the most conservative level. Such behaviour can be explained as follows: the variability of more conservative investment strategies is lower than that of less conservative funds, while the expectation is similar, and therefore the Sharpe ratio can increase. Furthermore, Fig. 8 showed that when there is no stress or only a small one (\(k\le 2\)), the expectation under Strategies B and C can be even higher than under Strategy A (doing nothing during a crisis), which is why the greater Sharpe ratio under Strategies B and C is not so surprising.

The following figures (Figs. 10, 11 and 12) show, across stress levels, the trade-off in three measures (mean, downside deviation (from 0) and Sharpe ratio) when the pension fund manager switches from Strategy A to Strategy B or C.

Fig. 10 Trade-off of mean compared to Strategy A

Figure 10 shows how the mean of cumulative log-returns changes when, under different stress levels, existing allocations are switched to more conservative funds. It is difficult to generalise about switching from A to B, as the trade-off differs between the groups Swedbank 61–81 and Swedbank 82–02. For the first group, the mean increases when the stress level is low (\(k=0, 2, 3\)), while under higher stress (above 3) it decreases significantly. For the second group, the mean remains stable or even increases regardless of the stress level. When switching from Strategy A to C, however, the following generalisation can be made: if the stress level is low, the expectation can increase (even more than in the A to B case), while for higher stress levels the mean decreases drastically.

Fig. 11 Trade-off of downside deviation (from 0) compared to Strategy A

Figure 11 shows how the downside deviation (from 0) of the cumulative log-returns changes when, under different stress levels, existing allocations are switched to more conservative funds. Unlike for the mean, a negative trade-off in downside deviation is desirable, while a positive one signals poor investment performance. Interestingly, for most funds the downside deviation decreases quite significantly at small stress levels regardless of the strategy chosen, with the exception of Swedbank 75–95 when switching from A to B. Switching from A to B and from A to C mostly has a different impact on the trade-off (it is greater for A to C). Moreover, as the stress level increases the downside deviation can increase too, especially when \(k\) is above 20.

A clearer trade-off is observed in the case of the Sharpe ratio (Fig. 12).

Fig. 12 Trade-off of Sharpe ratio compared to Strategy A

Figure 12 shows how the Sharpe ratio of cumulative log-returns changes when, under different stress levels, existing allocations are switched to more conservative funds. When switching from Strategy A to B, the Sharpe ratio decreases for the more conservative funds as the stress level increases. However, for Swedbank 82–02, the Sharpe ratio remains stable or even increases (insignificantly) regardless of the stress level. When switching from A to C, the change in the Sharpe ratio is positive if the stress level is low, but for higher stress levels the decline is very large for all funds. Additionally, the Appendix provides figures showing the trade-off in standard deviation (Fig. 44) and CVaR (Fig. 45) when switching from Strategy A to B and C.

In summary, when a crisis with a low stress level is expected, it is wise to switch to the most conservative funds (Strategy C), as this will temporarily increase the performance of investments. However, if a crisis with a high stress level is expected, it is wise to reduce volatility only slightly (Strategy B), which keeps efficiency at a similar (or even higher) level.

4 Conclusion

Recent evidence shows that the emerging financial crisis negatively affects the value of pension fund assets worldwide. In this paper, we addressed the IInd pillar life-cycle pension funds operating in Lithuania, with a particular interest in whether fund managers should stick to the life-cycle investment strategy despite the disturbances and crises that could be foreseen during the accumulation period, or instead reduce the share in equities during market crashes. Such an analysis can be carried out by stress testing, but unfortunately this technique is not well investigated in the literature for pension funds with their possibly specific investment strategies. Therefore, under the assumption that investors’ risk preferences and beliefs are primarily observed in the main world indices and then transmitted to other markets such as pension funds, we propose a stress testing technique that combines a hidden Markov regime-switching model with historical simulation for scenario generation.

The sequence of detected regimes satisfies the Markov property. Furthermore, Regime 1 (no crisis of any type) has the highest probability, while the probability of a large-scale crisis (Regime 4) is less than half as large. An interesting finding from this type of analysis is that if there is currently no crisis (Regime 1), one should expect large deviations in the stock markets in less than 127 working days. Furthermore, a large-scale crisis is expected after 252 days (which is equal to 1 year). According to our results, by the end of 2022 the financial crisis caused by the Russian–Ukrainian war had already reached the stress level \(k=7\).

In this paper, three response strategies to a crisis (A, B and C) are considered. Strategy A represents the passive (no) reaction of the pension fund manager and is used for comparison. Strategy B is treated as a weak response to the crisis, as the fund manager only slightly reduces the share of volatile assets (less than 10%) in the portfolio. When the fund manager significantly increases the share of conservative investments (to 90–100%), the reaction is denoted as Strategy C (panic reaction). The list of strategies could easily be extended to better reflect the options that fund managers have within the regulatory framework.

According to the findings, it might come as a surprise that opting for Strategy C, which involves shifting to very conservative allocations, is advisable only when facing low stress levels (\(k \le 3\)). In such cases, large-scale crises (Regime 4) are relatively brief, allowing a temporary shift to Strategy C to shield against immediate shocks without missing out on the positive market corrections that may follow. However, if the stress level is high, the best decision is either no reaction or only the weak reaction (Strategy B). This is because a large-scale crisis then lasts longer, and during it markets experience both significant downturns and recoveries. Since the latest crisis has already reached a stress level above 5, it is wise to reduce volatility only a little and choose Strategy B. Such a decision, for higher levels of stress, would sustain the investment’s efficiency at a commendable level.

The paper presents a general approach to stress testing using an HMM. It could easily be applied to other pension funds as well as to different financial assets. It could be extended by considering more regimes (states) if one wants to distinguish among different types of crisis. Similarly, more strategies can be considered if needed. However, this would probably require a deeper analysis of the causes of the crises, which was not the goal of this paper. Finally, for future research, one could consider a non-homogeneous hidden Markov model with time-dependent transition probabilities. However, such probabilities would be difficult to estimate, and some additional assumptions would be needed. Moreover, the stressed transition probabilities would need to be defined in a different way.