1 Introduction

The rising in longevity is one of the most astonishing achievements of modern society, it is a result of continuous progress in medicine, nutrition, and technology (Riley 2001). This leads to a looming and rapid population aging, and a consequently increased prevalence of chronic diseases, especially among the elderly. Among bio-demographic theories, we found that morbidity was generally associated with diseases, and longevity with biological senescence (Manton 1982). In its seminal work, Fries (1980) theorized that the age at the onset of chronic degenerative diseases could be postponed compressing morbidity into a smaller portion of the life span, thus without significantly changing average life expectancy.

The rapid decrease in mortality at advanced ages suggested that a biological limit was not close, without indication of a deterioration in the health status of older people despite significant increases in life expectancy. By aiming at investigating the nature of these scenarios, it is useful to monitor both mortality and the incidence/ prevalence of morbidity over time. To address these issues, relevant research avenues have evolved recently, promoting the creation of health expectancy indicators (Sanders 1964; Sullivan 1965, 1971) that measure the number of years that individuals at a given age are expected to live in good health under prevailing mortality and morbidity conditions. More precisely, HLE represents the expected number of years of remaining disability-free life a member of the life table cohort would experience if cohort age-specific rates of mortality and disability prevailed throughout his/her lifetime. The basic idea is to merge the period life table, with the age-specific disability prevalence obtained from cross-sectional survey data. In particular, Sullivan’s method simply partitions the total number of person-years lived into the disability and disability-free life expectancy.

According to Imai and Soneju (2007), similarly to life expectancy at birth, we are able to define HLE, denoted by h(at), which represents the expected remaining disability-free (DF) life of an individual age a born at time t. Formally, given the survival function l(at), and let \(\pi (a,t)\) be the proportion disabled at age a for the cohort born at time t, HLE is given by:

$$\begin{aligned} h(a,t)=\frac{1}{l(a,t)} \int _{a}^{\infty }[1-\pi (r, t)] l(r,t) dr \end{aligned}$$

Since the theoretical definition is given within the continuous-time framework and the data are typically recorded in a discrete form, we exploit the period life table model in order to approximate the continuous-time mortality process and use Sullivan’s method. We then obtain the discrete counterpart of h(at), denoted by \(h_{a,t}\). A reliable estimation of HLE is fundamental for understanding whether additional years of life are spent in good health and whether life expectancy is increasing faster than the decline of disability. Besides Levantesi et al. (2022) who contributed to underlined heterogeneity in longevity indicators from a global perspective, it is a common practice in life expectancy literature to model each population independently. Indeed, significant studies have been proposed on life expectancy forecasting, see, e.g., Torri (2011) that bring forward a Geometric Brownian motion-based model. Another relevant contribution comes from Raftery et al. (2013) who propose a hierarchical Bayesian model, finally Nigri et al. (2021a) bring forward a neural network approach to forecast longevity measures accounting for long and short term dynamics.

Nevertheless, after a long period of increasing life expectancy, researchers recorded some diverging trends in longevity, whit a situation of stagnation and deceleration (Nigri et al. 2022, 2021b; Ho and Hendi 2018). Regularities have been observed also in HLE (Permanyer et al. 2022a) with a less optimistic picture obtained looking at life expectancy trends. Therefore, it seems appropriate to identify clusters of countries having common characteristics detected by life expectancy before conducting the forecast by a multi-population model. Aiming at providing for the first time an HLE forecasting, we analyze the patterns of HLE at birth time series by identifying different longevity phases and transitions that allow clustering analysis of countries in line with their longevity dynamics. Leveraging a functional cluster analysis, which allows us to analyze curves rather than scalar data we try to track similarities in HLE among countries. Indeed, if some countries share macrolevel common factors, such as similar improvements in public health or economic circumstances, joint forecasting can be appropriate. Therefore, using the obtained cluster information, we then perform a simultaneous HLE forecasting inside each specific cluster by implementing a multivariate random walk with drift, which is an econometric model able to provide multivariate simultaneous forecasting.

This work is structured as follows. Data and healthy measures are depicted in Sect. 2. The methodology is described in Sect. 3. The numerical application is offered in Sect. 4. Food for thought and Conclusions follow.

2 Data and healthy measures

HLE is a measure used by the World Health Organization (WHO) for assessing the health of a population in a country. According to the definition of the Global Burden of Disease Study (GBD), HLEFootnote 1 is the number of years that a person at a given age can expect to live in good health, if the rates of all-cause mortality and all-cause disability would be constant into the future.

HLE is calculated by subtracting the Disability Adjusted Life Years (DALY) from life expectancy. DALYs are the sum of years of life lost (YLLs) and years lived with disability (YLDs). DALYs are the lost years of healthy life due to an early death caused by an illness or disability while still alive. YLLs are years lost due to premature mortality and are calculated by subtracting the age at death due to a given illness from the life expectancy at that age. Finally, YLDs are years lived with disability. This latter indicator is calculated as the product of the prevalence of a given condition that causes disability and the disability weight reflecting that condition’s severity.

If the comorbidity is ignored, the sum of YLDs across all causes may overestimate the total loss of health, especially at older ages. For this reason, in 2010, the Global Burden of Diseases, Injuries, and Risk Factors Study 2010 (Horton 2012) implemented adjustments for comorbidity so that the sum of YLDs across causes was equal to the sum of the overall lost health at a given age.

At first, the GBD calculated the YLDs following an incidence perspective, so that the number of incident cases in a given period was multiplied by the average duration of the disease and a weight reflecting the severity of the disease on a scale from 0 (perfect health) to 1 (dead). This approach had several drawbacks. For example, data related to the average duration of the disease are often unavailable, and taking into account comorbidities is rather complicated than a prevalence approach. While the incidence rate is the ratio of new cases of a disease divided by the number of exposure to the risk in a specific population over a particular period, the prevalence rate is the ratio of the number of cases of a disease divided by the number of exposure to the risk in a specific population over a particular period.

In 2010, GBD and WHO switched to a prevalence-based approach to the calculation of YLDs, estimating comorbidities under the assumption of independence within age-sex groups (see WHO (2020) for a detailed description of the methods and the weight’s calculation).

In this paper, we refer to the HLE data of the Global Burden of Disease Study (Wang et al. 2020) over the period 1990-2019 for the countries listed in Table 1. They have been downloaded at Global Health Data Exchange, http://ghdx.healthdata.org/gbd-results-tool (Institute for Health Metrics and Evaluation 2022).

Table 1 List of the countries used in this study and related International Standards Organization (ISO) 3-digit alphabetic codes

3 Methodology

In this section, we describe the model for the multivariate forecasting of the HLE based on a multi-country clustering. We explore the evolution of HLE over time in different countries to outline a comparative analysis. We firstly implement a functional data clustering of the HLE trends, which can be described as curves over time observed for each country. The functional approach allows the clustering of HLE trends and the identification of countries that are evolving according to similar patterns. Then, we provide the fundamentals of the multivariate random walk with drift that we use to simultaneously forecast the populations’ HLE within each cluster.

3.1 Functional HLE data clustering

Describing data in a functional curve consists in assuming the existence of a continuous function for which the observed data constitute a discretization. Jacques et al. (2014a) propose the first clustering procedure for multivariate functional time series to catch the similarities among curves. One of the methods proposed in the literature to perform the functional clustering is based on a two-step functional algorithm Abraham et al. (2003). In the first step, the functional curve is derived from discrete data, through the filtering step (James and Sugar 2003) aiming at approximating the curves by using a basis expansion of cubic B-splines functions. In particular, B-splines are widespread because they allow the analysis of the non-linear effects in the covariates and are locally susceptible to data (De Boor 1978). In the second step, the selected clustering tool is implemented. We refer to the two-step functional algorithm. We consider the basis expansion of \(h^{j}(t)\):

$$\begin{aligned} h^{j}(t)=\sum _{l=1}^{L}\beta _{jl} \cdot B_{l}(t) \end{aligned}$$
(1)

where \(B_{l}(t)\) with \(L \ge 1\) (\(L \in {\mathbb {N}}\)) is the selected number of basis functions, \(\beta _{jl} \in {\mathbb {R}}\) are the basis coefficients estimated from the observed data using the classical least square estimation. We observe that Eq. 1 provides a functional representation of the observed values of HLE.

Given the discrete observations \(h^{jk}\) of each sample path \(h^{j}(t)\) at a finite set of knots \(\lbrace t_{jk}: k=1,...,m_{j}\rbrace\), we reconstruct the HLE functional form through the functional predictor:

$$\begin{aligned} h^{jk}=h^{j}(t_{jk})+\varepsilon _{jk} \,\,\,\,\,\,k=1,...,m_{j} \end{aligned}$$

with \(\varepsilon _{jk}\) independent and identically zero mean distributed errors. The spline coefficients of each sample path \(h^{j}(t)\) are estimated by

$$\begin{aligned} {\hat{\beta }}_{j}=\left[ (B^{j})'B^{j}\right] ^{-1}(B^{j})'{\hat{h}}^{j} \end{aligned}$$

with \({\hat{\beta }}_{j}=({\hat{\beta }}_{j1},...,{\hat{\beta }}_{jL})'\), \(B^{j}=(B_{l}(t_{jk}))_{1\le k \le m_{j}, 1\le l \le L}\) and \({\hat{h}}^{j}=({\hat{h}}^{j1},...,{\hat{h}}^{jm_{j}})'\). The approximation of the functional curves of HLE for each population is performed through the fda package (version 5.1.9) of the R software using a cubic spline, where the number of knots is equal to the number of the available observations.

Once the functional form of each curve is reconstructed, we implement the k-means cluster to the fitted basis coefficients of all expanded curves. As it is well known, the k-means is an iterative clustering method, according to which a data point is assigned to a cluster minimizing the squared distance between each data point and the arithmetic mean of all data points in the cluster. We frame our choice within the existing literature, starting from the contributions of Singhal and Seborg (2005) and Ieva et al. (2011) that use a k-means algorithm on distances between multivariate functional data. Kayano et al. (2010) implement Self-Organizing Maps on the coefficients of orthonormalized Gaussian basis expansions of multivariate curves. Jacques et al. (2014a) consider the use of non orthonormal basis functions. In literature, other functional data clustering approaches have been proposed (James and Sugar 2003; Tarpey and Kinateder 2003; Chiou et al. 2007; Bouveyron and Jacques 2011), like the hierarchical non-parametric clustering (Ferraty and Vieu 2006) or the model-based approach that assumes a given density probability on a finite number of parameters describing the curves (Bouveyron and Jacques 2011). In this paper, we have selected the two-step filtering procedure to describe the HLE multivariate time series, to show how similar dynamic behaviours emerge among countries, by performing clustering on the time series. On the other hand other types of probabilistic models could be useful to address the aforementioned problem. For instance, Mixture Models (see Consonni et al. 1995; Giudici 2003), are often used for data clustering being an efficient implementation framework by means of the Expectation–Maximization algorithm.

3.2 HLE multivariate modeling

Considering the groups of populations obtained through the functional clustering, we now work on the cluster-related HLE discrete observations.

Let C the number of clusters obtained from the previous step, and \(n_i\) the number of populations within the generic cluster \(i\in \left\{ 1,2,...,C\right\}\). Hence, the total number of populations is the sum of the populations belonging to each cluster, \(n=\sum _{i=1}^{C}n_i\). We denote \(h^{(i,j)}_t\) the observed value of HLE at birth (for simplicity, age a has been omitted) for population \(j \in [1,...,n]\) in cluster i for year \(t \in [t_1,t_2,..,t_\tau ]\). For the generic cluster i, we define the matrix of HLE as:

$$\begin{aligned} \varvec{H_{t}^{(i)}}= \begin{bmatrix} h_{t_1}^{(i,1)} &{} h_{t_2}^{(i,1)} &{} \dots &{} h_{t_\tau }^{(i,1)} \\ h_{t_1}^{(i,2)} &{} h_{t_2}^{(i,2)} &{} \dots &{} h_{t_\tau }^{(i,2)} \\ \vdots &{} \vdots &{} \vdots &{} \vdots \\ h_{t_1}^{(i,n_i)} &{} h_{t_2}^{(i,n_i)} &{} \dots &{} h_{t_\tau }^{(i,n_i)} \end{bmatrix} \end{aligned}$$
(2)

where the row indicates the population in the cluster i and the column indicates the time. Therefore, \(\left( \varvec{H_{t}^{(i)}}\right) _{t\in \left\{ t_1,...,t_\tau \right\} }\) is the multivariate time series of HLE at birth of the populations in cluster i that we simultaneously model by a multivariate random walk with drift (MRWD):

$$\begin{aligned} \varvec{H_{t}^{(i)}}=\varvec{H_{t-1}^{(i)}}+\varvec{\delta ^{(i)}}+\varvec{\epsilon _t^{(i)}}, \quad \varvec{\epsilon _t^{(i)}} \sim N(\varvec{0},\varvec{\Sigma }^{(i)}) \end{aligned}$$
(3)

where \(\varvec{\delta ^{(i)}}\) is the vector of drift parameters driving the dynamics of the populations in cluster i, and \(\varvec{\Sigma ^{(i)}}\) is the \(n_i \times n_i\) variance-covariance matrix of the multivariate white noise \(\varvec{\epsilon _t^{(i)}}\) in cluster i.

Using data in the in-sample period, we calculate one-step-ahead and two-step-ahead and five-step-ahead point forecasts. We follow an expanding window approach, increasing the in-sample period by one year (two or five years) and calculating one-step-ahead (two-step-ahead or five-step-ahead) forecasts. We determine the out-of-sample forecast accuracy by comparing the forecasts with the actual out-of-sample data. A similar approach has been followed by, e.g., Shang and Yang (2021) for jointly modeling Australian sub-populations using one-step-ahead and five-step-ahead forecast, and Shang and Hyndman (2017) that provided one- to ten-step-ahead forecasts for grouped functional time series using the regional age-specific mortality rates in Japan.

Considering \(t_\tau\) the forecast origin and r the forecast horizon, the r-step-ahead forecast of HLE of the populations in the generic cluster i is given by:

$$\begin{aligned} \varvec{{\widehat{H}}_{t_\tau }^{(i)}}(r)=E\left[ \varvec{H_{t_\tau +r}^{(i)}} \vert \varvec{H_{t_\tau }^{(i)}},\varvec{H_{t_\tau -1}^{(i)}},...,\varvec{H_{1}^{(i)}} \right] \end{aligned}$$
(4)

The r-step-ahead forecast for MRWD in Eq. 3 can be written as:

$$\begin{aligned} \varvec{{\widehat{H}}_{t_\tau }^{(i)}}(r)=\varvec{H_{t_\tau }^{(i)}}+\varvec{\delta ^{(i)}} r \end{aligned}$$
(5)

3.3 Performance measures

We measure the performance of the model by the Mean Absolute Error (MAE) and the Root Mean Square Error (RMSE). Given \(t_T\) the final year of the forecast, we calculate MAE and RMSE for each population j within cluster i as:

$$\begin{aligned} MAE^{(i,j)}= & {} \frac{\sum _{t = t_{\tau }+1}^{t_T} \mid h_{t}^{(i,j)} - {\hat{h}}_{t}^{(i,j)} \mid }{(t_T-t_{\tau }-1)}, \end{aligned}$$
(6)
$$\begin{aligned} RMSE^{(i,j)}= & {} \sqrt{\frac{\sum _{t = t_{\tau }+1}^{t_T} \left( h_{t}^{(i,j)} - {\hat{h}}_{t}^{(i,j)}\right) ^2}{(t_T-t_{\tau }-1)}}. \end{aligned}$$
(7)

And, their average values for each cluster i as:

$$\begin{aligned} MAE^{(i)}= & {} \frac{1}{n_i}\sum _{j=1}^{n_i} MAE^{(i,j)}, \end{aligned}$$
(8)
$$\begin{aligned} RMSE^{(i)}= & {} \frac{1}{n_i}\sum _{j=1}^{n_i} RMSE^{(i,j)}. \end{aligned}$$
(9)

Furthermore, to compare the predictive accuracy of two competing forecasts we use the Diebold Mariano (DM) test (see Diebold and Mariano 1995). In a nutshell, given the forecast error \(e_{it}\) defined as the difference between the values predicted by a certain model i and the actual values at time t, the loss associated with forecast from model i is assumed to be \(g(e_{it})=e_{it}^2\). The two forecasts have equal accuracy if and only if the loss differential between the two forecasts, \(d_t = g(e_{1t})-g(e_{2t})\), has zero expectation for all t. The DM test statistic is defined as follows:

$$\begin{aligned} DM=\frac{{\bar{d}}}{\sqrt{\frac{s}{N}}} \end{aligned}$$
(10)

where \({\bar{d}}\) is the sample mean of the loss differential, s is the variance, and N the sample size. The null hypothesis of this test is that the models have the same forecast accuracy, i.e. \(H_0: E[d_t] = 0, \, \forall t\), while the alternative hypothesis is that \(H_1: E[d_t] \ne 0, \, \forall t\). If \(H_0\) is true, then the DM statistic is asymptotically distributed as a normal standard normal distribution with 0 mean and standard deviation equal to 1.

4 Numerical application

We remind that the proposed methodology consists of two stages. The first one is the functional data clustering of the HLE at birth of the countries listed in Table 1. The second one consists of simultaneously forecast HLE at birth of the countries inside the cluster by a multivariate random walk with drift.

We apply this methodology to the data on HLE for the period 1990-2019, where the first 15 years, 1990-2009, represent the in-sample period, and years 2010-2019 the out-of-sample period.

The optimal number of clusters C is chosen according to the Elbow and Silhouette methods (see Figs. 14-16 in Appendix A). From Figs. 15-16, we have chosen \(C=3\) as the optimal number of clusters for both genders.

The results of the application of the functional data clustering to the HLE data can be visualized in the world map in Fig. 1 for females and in Fig. 2 for males. Focusing on female populations, cluster 1 shows a strong geographical connotation as mainly collects former Soviet Union countries (Belarus, Latvia, Russia, and Ukraine). The other two clusters collect both European countries and North America, while Australia is in cluster 2. Looking at male populations, cluster 2 gathers the same countries as cluster 1 for females plus the remaining Baltic States (Estonia, Lithuania) and some Eastern Europe countries (Hungary, Poland, Slovakia). Male cluster 3 is similar to female cluster 2, collecting the same countries except for Austria, Belgium, Germany, and Finland.

Fig. 1
figure 1

Clustered populations. Females

Fig. 2
figure 2

Clustered populations. Males

To test the statistical significance of the drift, we used the Augmented Dickey-Fuller Test (ADF) Unit Root Test, analyzing whether the drift term is needed in the regression model. The null hypothesis means that our process is a random walk with drift. The results of the ADF test are reported in Table 2 and Table 3 for males and females, respectively. We can deduce that the values of test statistic are greater than critical values at levels 1%, 5%, and 10% for all the clusters and both genders with the exception of cluster 2 males and cluster 1 females. Therefore, we conclude that the null hypothesis (the process is a random walk with drift) can be accepted in all the cases except for cluster 2 males and cluster 1 females. Tables 23 also show the values of the drift parameters for the populations whose HLE time series follows a random walk with drift, estimated over the in-sample period 1990-2009. We observe positive values of \(\delta\) for all the countries. The \(\delta\) values for male populations are higher than those for female populations highlighting an improvement in the men’s health in the period 1990-2009. Indeed, men live less than women but spend a lower proportion of their total life expectancy in poor health. This phenomenon is known as the “male-female health-survival paradox” (Oksuzyan et al. 2009), and is not constant and universal, depending on different health domain indicators as argued by Di Lego et al. (2020).

Table 2 List of male populations by cluster number, results of the ADF test (null: random walk with drift), and drift parameter
Table 3 List of female populations by cluster number, results of the ADF test (null: random walk with drift), and drift parameter

Finally, we apply the ADF test for testing whether HLE time series is a random walk (without drift) for the populations included in cluster 2 males and cluster 1 females. The results, provided in Tables 4 and 5, show that for both the clusters above mentioned a random walk is appropriate.

Table 4 List of male populations by cluster number, results of the ADF test (null: random walk)
Table 5 List of female populations by cluster number, results of the ADF test (null: random walk). Years 1990–2009. Critical values for test statistics: \(***-2.66\) for \(p<0.01\), \(**-1.95\) for \(p<0.05, *-1.6\) for \(p<0.1\)

In conclusion, we model all the clusters with a MRWD, with the exception of cluster 2 males and cluster 1 females that are modeled as a MRW.

The results of the out-of-sample test over the years 2010-2019 for each female population cluster are depicted in Fig. 3, 4 and 5 for cluster 1, 2, and 3 respectively.

Fig. 3
figure 3

Out-of-sample results for cluster 1, females. Observed values (solid line), one-step-ahead forecast (dotted line), and two-step-ahead (dash-dotted line)

Fig. 4
figure 4

Out-of-sample results for cluster 2, females. Observed values (solid line), one-step-ahead forecast (dotted line), and two-step-ahead (dash-dotted line)

Fig. 5
figure 5

Out-of-sample results for cluster 3, females. Observed values (solid line), one-step-ahead forecast (dotted line), and two-step-ahead (dash-dotted line)

The results of the out-of-sample test for each male population cluster are depicted in Fig. 6, 7 and 8 for cluster 1, 2, and 3 respectively.

Fig. 6
figure 6

Out-of-sample results for cluster 1, males. Observed values (solid line), one-step-ahead forecast (dotted line), and two-step-ahead (dash-dotted line)

Fig. 7
figure 7

Out-of-sample results for cluster 2, males. Observed values (solid line), one-step-ahead forecast (dotted line), and two-step-ahead (dash-dotted line)

Fig. 8
figure 8

Out-of-sample results for cluster 3, males. Observed values (solid line), one-step-ahead forecast (dotted line), and two-step-ahead (dash-dotted line)

We can trace some reflections starting from the analysis of Figs. 38. We choose to provide specific qualitative comments for U.S. and Russia, which show some interesting health and longevity dynamics supported by the literature. The health disadvantage of the United States (National Research Council 2013) is clear for both males and females: HLE in the U.S. is significantly lower than that of other countries. In particular, from the WHO data, it emerges that the U.S. is the only developed country in the world that has experienced a decrease in healthy life expectancy since 2010, despite it having the highest GDP in the world and is one of the wealthiest countries in per capita GDP. The decreasing trend in HLE occurs parallel to the decreasing trend in life expectancy. As early as the 1990s, a slowdown in improving longevity was observed and the life expectancy of Americans fell below the average for developed countries. This trend continued until around 2010 when the average death age in the U.S. stabilized. In 2011, the U.S. recorded the highest health expenditure as a proportion of GDP but the lowest coverage (Lorenzoni et al. 2014). This is mainly due to the fragility of a health system strongly biased in favor of private assistance and to the inequalities that it generates in health care for the poorest sections of the population. In the literature, it has been asked whether the deficiency in the U.S. health system is the sole cause of health disadvantage. McGinnis and Foege (1993) emphasized the important role of health behaviors. Mokdad et al. (2005) highlighted that 40 percent of all deaths in the U.S. are associated with four poor health behaviors: tobacco use, unhealthy diet, physical inactivity, and drinking problems.

Another evidence that emerges from Figs. 38 is the particular volatility that characterized the trend in health expectancy in the Russian Federation and Eastern European countries. Russia is another exception to the well-known Preston curve relationship. After an improvement in health conditions in 1985-87 due to Gorbachev’s anti-alcohol campaign, the 1990s saw a return to easy access to alcohol and economic tensions with the end of the Soviet Union. Unlike what happened in the U.S., however, a growing trend has been consolidated in these countries since 2010. Despite this, the great volatility recorded in the previous period actually has repercussions on higher forecast errors for these countries for both genders as shown in Figs. 910, which illustrate the values of the error measures on the single population for females and males, respectively.

Fig. 9
figure 9

Error measures. Females

Fig. 10
figure 10

Error measures. Males

In Table 6, we show the average RMSE and MAE values, by cluster, of both the one-step-ahead and two-step-ahead forecasts. On average one-step-ahead forecast outperforms the two-step-ahead prediction in all the clusters (both for males and females).

Table 6 Average RMSE and MAE values by cluster, years 2010–2019. One-step-ahead and two-step-ahead forecasts

Figures 11 and 12 displays the boxplots of the distribution of the performance measures for females and males, respectively.

Fig. 11
figure 11

Boxplots of the point forecast performance measures. Females

Fig. 12
figure 12

Boxplots of the point forecast performance measures. Males

The two-step-ahead prediction interquartile range is larger than one of the one-step-ahead forecasts, consistent with the errors reported in Table 6. This is especially true for the male cluster 3, while for the other two clusters this quantity is comparable across the two forecasting horizons. For females, the cluster 3 interquartile range is the lowest for both forecasting horizons. On average in all box plots, we observe larger upper-whiskers compared with the lower ones suggesting higher asymmetries for larger values of the forecast distributions.

We apply the DM test for comparing the accuracy of forecast performance between 1-step ahead, 2-step ahead and 5-step ahead forecasts. The results of the DM test are reported in Table 7. When the null hypothesis is accepted at the certain level of significance, the differences between the two models compared are significant and the forecasting accuracy of first model is higher than that of the second model. Therefore, from Table 7, we observe that the accuracy of 1-step-ahead forecasts is always better than the accuracy of the other two models, and that 2-step-ahead forecasts are more accurate than 5-step-ahead ones.

Table 7 DM test by cluster, gender, and the forecast horizon h

5 Discussion

While life expectancy is an essential indicator for pension and health care systems to plan health spending to manage disease events that occur as the population ages, HLE is also an important measure to plan health care costs to prevent the onset of diseases. If the first indicator necessarily leads to considerations of coexistence between the lengthening of average life and morbidity, the second focuses on the years of life lived in full health. A choice of economic and social planning is thus posed: there is a trade-off between the expenditure destined to improve healthy life expectancy and that necessary to manage morbidity in senescence. Correct management of the former can have reductive effects on the latter.

In this work, we have focused on healthy life expectancy and have proposed a combined model based on functional clustering and multivariate time series forecasting to offer an overall picture of the evolution of life expectancy on a global level and coherently depict the country-specific life expectancy based on the cluster mortality profiles analyzing its possible future evolution. As regards the cluster methodology, numerous insights can be gathered from recent literature contributions. Since Mixture Model-based clustering, usually applied to multidimensional data, has become a popular approach in many data analysis problems, both for its good statistical properties and for the simplicity of implementation of the Expectation–Maximization algorithm, our future research will involve the development of a framework for dealing with mortality data time series which combines function data analysis and mixture models, where each cluster will be described by a regression in which the polynomial coefficients vary according to a discrete hidden process. Regarding the modeling and forecasting, as in the classical longevity approach, we have followed the assumption according to which the extrapolation of past trends of healthy life expectancy can provide accurate forecasting. Despite the simplicity of the forecasting model, the performance measures showed satisfactory accuracy results. Future proposals may refer to model flexibility, assuming forecasting also conditionally on a cluster latent variable. This would lead to a mixture of a random walk-with-drift model, thus embedding the two steps in a one-stage estimation as in Fraley and Raftery (2002) Moreover, the valuable contribution of this work can be framed in the 2030 Agenda for the achievement of Sustainable Goal n. 3 “Good Health and Well Being”, under which countries are required to obtain universal health coverage, access to essential health services of quality, and accessible to all to pursue the improvement of healthy living conditions globally. Starting from the updated data that will be available in the next future, further research will be able to monitor the degree of satisfaction of the Sustainable Goal leveraging the analysis outlined in this work on the healthy life expectancy.

6 Conclusions

Although the demographic literature is rich in contributions on the analysis and forecasting of life expectancy, only recently has attention been paid to the HLE indicator, focusing on the descriptive analysis of the variable rather than on the forecast. The proposed work tries to help bridge this gap by extending the predictive analysis in a multi-population perspective to obtain more accurate information by exploiting the similarities between countries that have shown similar trends. Our paper thus opens the way to a line of research with interesting practical implications, which can act as a guide both for governments, grappling with public spending, and for private operators, such as insurers and pension funds, which aim to grasp the challenges of the market of health and life insurance policies.