1 Introduction

Two important articles on aggregate mortality trends were published in the spring of 2002, with important implications for our perspective on modeling, forecasting, and interpreting mortality trends. One such article was Oeppen and Vaupel (2002, henceforth OV), which shows a remarkable linear trend in the female life expectancy (at birth, period basis) of the national population with the highest value for this variable from 1840 to 2000. Of course the set of nations reporting credible life expectancy values has greatly expanded over this period, but that is unlikely to have mattered much for the results. Over this entire 160-year period, the record life expectancy consistently increased by 0.24 years of life per calendar year of time, or at the rate of 24 years per century. Extrapolation would lead us to expect a female life expectancy of around 108 years at the end of the twenty-first century.

A closely related article by White (2002) finds a linear trend in sexes-combined life expectancy for 21 industrial nations from 1955 to 1995, with an increase of 0.21 years of life per calendar year. White also finds that a linear trend in life expectancy fits the experience of almost all the individual countries better than does a linear trend in the age-standardized death rate or in the log of the age-standardized death rate. He further finds that when a quadratic time trend is fitted to the standardized rates, the coefficient on the squared term is significantly positive, indicating that the rate of improvement has been accelerating.

Both OV and White discuss the processes of catch-up and convergence. OV note that some countries converge toward the leader (e.g. Japan), some have moved away from it (e.g. the US in recent decades), and some move more or less parallel to it. White finds that nations experience more rapid e0 gains when they are farther below the international average, and conversely, and therefore tend to converge toward the average. The variance across countries has diminished markedly over the forty years. However, there has been no tendency for the rate of increase of average e0 to slow down. Based on the current position of the US, which is somewhat below the average (just as OV show that the US is below the record line), White predicts that US e0 will grow a bit more rapidly than the average rate of 0.21 years per year, perhaps at 0.22 years per year. At this rate, the US would reach e0 = 83.3 in 2030, about 1.5 years above the Lee-Carter (1992, henceforth LC) forecast, and about 3.8 years above the Social Security Administration (2002) projection for that year. Extrapolation of the linear trend in either OV or White generates more rapid gains in future longevity than are foreseen by LC, which projects increases of 0.144 years per year between now and 2030. This is only about two-thirds as fast as the 0.22 years per year in White and the 0.23 years per year in OV (averaging the female and male rates for OV).

Two major points are made in both articles. First, life expectancy (record or average) appears to have changed linearly over long periods of time. Second, national mortality trends should be viewed in a larger international context rather than being analyzed and projected individually. In this paper I will discuss both these points, and conclude with suggestions for incorporating them in forecasting methods. I will draw on the Human Mortality Database or HMD (at http://www.mortality.org/), to fit various models.

Figure 14.1 plots the OV maximum life expectancy together with that of the HMD and we see that they sometimes coincide, and sometimes the OV record exceeds the HMD, which includes fewer countries.

Fig. 14.1 Record life expectancy, by sex, from Oeppen-Vaupel and the Human Mortality Database, 1840 to 2000

2 Linear Change in Life Expectancy over Long Historical Periods

Before reading the OV article, I had expected that the trajectory of record life expectancy over the past two centuries would have a tilted S shape, in which life expectancy began at first to increase slowly, then accelerated, and then decelerated in the second half of the twentieth century. If we go back far enough in time, we know that life expectancy had no systematic trend at all, although there might have been long fluctuations. We also can be pretty sure that initial gains in life expectancy, once the trend began, were slow. Based on the OV results, it appears that these portions of the history occurred out of our sight, before the start date of 1840. Indeed, Fig. 5 in the OV Supplementary Materials on the Science Web site plots English life expectancy over a longer period, and its trajectory conforms to this description.

OV do not actually test or explore the constancy of the slope for record life expectancy, so it is worth examining this point more carefully here. As a start, we can compute the average rates of life expectancy increase for the OV data by sex and sub-period, as follows:

Average Annual Rates of Increase of Record e0 by Subperiod (years of life per calendar year)

Period       Females   Males   Average
1840–1900    0.24      0.24    0.24
1900–1950    0.27      0.26    0.27
1950–2000    0.23      0.15    0.19

From this we see that the regularity of the linear increase is not quite as strong as it appears from the striking figure in OV. For males in particular, there has been a noticeable deceleration over the past 50 years. For both sexes, there is a hint of the S-shaped path that I had expected to see.

I have taken two more simple steps. First, I fitted a cubic polynomial to the data, and found that all three terms were significantly different from zero. The fitted curve, as shown in Fig. 14.2 for females, does have a slight S-shape. To see more clearly the implied rate of change, I plot the first derivative of the polynomial for females in Fig. 14.3. This suggests that the rate of change in fact increased substantially, more than doubling from 1840 to 1925 or so, and then declined substantially thereafter, challenging the linear interpretation of the OV plot. Second, I calculated a 25-year moving average of the annual pace of increase for females, and this also is plotted in Fig. 14.3.

Fig. 14.2 Linear and cubic trends fitted to the Oeppen-Vaupel record female life expectancy, 1840–2000

Fig. 14.3 Rate of change in female life expectancy calculated from linear and cubic fits to Oeppen-Vaupel record and 25-year moving average of change in record

The moving average, a less severe smoothing of the rate of change than the cubic fit, cautions us against drawing any firm conclusions from the data about linearity or nonlinearity. A case could be made for either.
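The two smoothing steps described above can be sketched as follows. This is a minimal illustration, not the original computation: the series is a hypothetical, exactly cubic stand-in for the record (not the OV data), and the function names are invented for this example.

```python
import numpy as np

def cubic_rate_of_change(years, e0):
    """Fit a cubic in time to e0 and return the fitted annual rate of change."""
    x = np.asarray(years, dtype=float)
    x = x - x.mean()  # centre time to keep the polynomial fit well conditioned
    coef = np.polyfit(x, np.asarray(e0, dtype=float), deg=3)
    return np.polyval(np.polyder(coef), x)  # first derivative of the fitted cubic

def moving_average_change(e0, window=25):
    """Moving average of year-to-year changes in e0 over `window` years."""
    changes = np.diff(np.asarray(e0, dtype=float))
    return np.convolve(changes, np.ones(window) / window, mode="valid")

# Hypothetical series with a mild S-shape (assumed values, for illustration only).
years = np.arange(1840, 2001)
x = years - 1920.0
e0 = 65.0 + 0.27 * x - 5e-6 * x**3

rate = cubic_rate_of_change(years, e0)   # implied pace of increase, year by year
smooth = moving_average_change(e0)       # 25-year moving average of annual changes
```

With real data the two curves would differ, which is the point made in the text: the cubic imposes a smooth rise and fall in the pace of change, while the moving average shows how much of that shape survives lighter smoothing.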

If we accept that the OV trajectory is strikingly close to linear, then we are led to ponder why the record life expectancy might have risen in this way. After considerable thought, I find I have little useful to contribute on this important question. I find I am equally unable to explain the relative constancy of age-specific proportional rates of mortality decline, as summarized by the trend in the Lee-Carter (1992) k for the US since 1900, and the G7 countries since 1950 (Tuljapurkar et al. 2000).

Of the two striking regularities, linear life expectancy trends and constant rate of decline of age-specific mortality, it is the linearity of life expectancy increase which I find more puzzling. In my mind, the risks of death (that is, the force of mortality or death rates, by age) are the fundamental aspect of mortality which we should model and interpret. One view, perhaps an incorrect view, is that period life expectancy is just a very particular and highly nonlinear summary measure, with little or no causal significance. If age-specific death rates (ASDRs) decline at constant exponential rates, then life expectancy will rise at a declining rate, at least for a long time.

This point is worth elaborating because OV, in the Supplementary Materials on the Science Web site, say: “Note that steady rates of change in mortality levels produce steady, absolute increases in life expectancy: This relationship may underlie the linear trend of record life expectancy.” I agree that ultimately, it is likely that life expectancy would rise linearly, once death rates below the ages which obey Gompertz’s Law have fallen to near zero, as Vaupel (1986) has pointed out. If θ is the Gompertz parameter (rate of increase of mortality with age in a period life table or cohort life table) and ρ is the annual rate of decline over time in mortality at all ages above, say, 50, then the rate of increase of e50 will be ρ/θ years per year (Vaupel 1986). However, there is substantial mortality at younger ages before Gompertz’s Law applies, particularly in the nineteenth century. There we would expect a “steady” rate of decline in death rates to lead to a declining rate of increase in life expectancy.
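The ρ/θ result can be checked numerically. The sketch below assumes a pure Gompertz hazard above age 50 and uses illustrative parameter values (θ = 0.10, ρ = 0.01, and an initial hazard of 0.005 at age 50); none of these numbers come from the text, and the function names are invented for this example.

```python
import math

def gompertz_e50(a, theta, horizon=150.0, step=0.01):
    """Remaining life expectancy at 50 when the hazard s years later is a*exp(theta*s)."""
    total, prev = 0.0, 1.0  # prev holds survival at the previous grid point
    for k in range(1, int(horizon / step) + 1):
        s = k * step
        surv = math.exp(-(a / theta) * (math.exp(theta * s) - 1.0))
        total += 0.5 * (prev + surv) * step  # trapezoid rule
        prev = surv
    return total

def annual_gain(a, theta, rho):
    """Gain in e50 over one year when the hazard level falls at rate rho."""
    return gompertz_e50(a * math.exp(-rho), theta) - gompertz_e50(a, theta)

theta, rho = 0.10, 0.01  # assumed values, so rho/theta = 0.10 years per year
gain_now = annual_gain(0.005, theta, rho)
gain_later = annual_gain(0.005 * math.exp(-100 * rho), theta, rho)  # a century on
```

Under these assumptions the annual gain starts somewhat below ρ/θ and moves toward it as the hazard level falls, which is consistent with the claim that the linear relationship holds only ultimately.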

These points are illustrated in Fig. 14.4, based on Swedish mortality experience. The average exponential rate of decline is calculated for each age-specific death rate for the period 1861 to 1961. This rate of decline is then applied to the initial age-specific death rates, and used to simulate them forward for 200 years. The resulting life expectancy is plotted in Fig. 14.4 along with the actual life expectancy. It can be seen that the simulated life expectancy trajectory is highly nonlinear, and its pace of improvement decelerates.

Fig. 14.4 Actual and simulated Swedish female life expectancy assuming constant proportional rates of decline for age-specific death rates, at average rates for 1861–1961

As time passes, the gains in life expectancy become more nearly linear, and for the last 50 years, are quite close to linear. By construction, the lines cross in 1961. Figure 14.4 shows that the constant exponential rates of decline in age-specific death rates could not account for the linearity of the increase in record e0 since 1840.
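The simulation described above can be sketched in a few lines. The death-rate schedules here are hypothetical (a childhood component, a constant background level, and a Gompertz term, with assumed rates of decline by age), not the Swedish data, and the life-table shortcut of a constant hazard within each single year of age is likewise an assumption of this example.

```python
import numpy as np

def life_expectancy(mx):
    """Period e0 from single-year death rates, assuming a constant hazard within each age."""
    mx = np.asarray(mx, dtype=float)
    lx = np.concatenate(([1.0], np.exp(-np.cumsum(mx))))  # survivors to each exact age
    Lx = (lx[:-1] - lx[1:]) / mx                          # person-years lived at each age
    return float(Lx.sum())                                # truncated at the last age

def average_decline_rates(m_first, m_last, n_years):
    """Average exponential rate of decline of each age-specific death rate."""
    return np.log(np.asarray(m_first) / np.asarray(m_last)) / n_years

# Hypothetical start and end schedules, 100 years apart (illustrative values only).
ages = np.arange(130)
m_start = 0.15 * np.exp(-ages) + 0.005 + 0.00005 * np.exp(0.09 * ages)
assumed_r = np.where(ages < 15, 0.02, np.where(ages < 65, 0.015, 0.007))
m_end = m_start * np.exp(-100 * assumed_r)

# Apply the average rates to the initial schedule and simulate 200 years forward.
r = average_decline_rates(m_start, m_end, 100)
e0_path = np.array([life_expectancy(m_start * np.exp(-r * t)) for t in range(201)])
```

By construction the simulated value in year 100 matches the life expectancy implied by the end schedule, mirroring the crossing of the two lines in 1961. In this illustrative setup the gain over the first 50 years is much larger than the gain over the last 50, which is the deceleration discussed in the text.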

When we look at the trajectories of the logs of the Swedish ASDRs from 1861 to 2000, they appear very far from linear, even if we restrict attention to the last 50 years (see Fig. 14.5 for selected rates). Most rates decline rapidly in some periods, and slowly in others, with patterns varying across the age span. One would not think to characterize these patterns as showing a constant rate of decline at each age. Yet this is a period over which the Lee-Carter model does a good job of fitting life expectancy, and projecting it within sample (Lee and Miller 2001). Evidently, the Lee-Carter method succeeds by picking out average tendencies from among a welter of variation, not by describing strong real-world regularities.

Fig. 14.5 Log of selected age-specific death rates for Swedish females, 1861–2000, showing irregular rates of decline

3 What Is Fundamental, Age at Death or Risk of Death?

The OV and White findings challenge the view that risks of death are fundamental, and age at death is derivative. If life expectancy (e0) changes linearly, then the rate of decline of death rates cannot be constant, and in particular must be accelerating for at least some ages, as found by White for many of the 21 countries he analyzed. How can we reconcile the linearity of the change in e0 with the fact that when LC models are fit, they have almost always revealed linear changes in k over rather long periods, such as a century in the US? To focus on the US case, there are two explanations.

First, as the second figure in OV makes clear, over the twentieth century the US first approached the record line, then briefly was close to being the leader, and finally fell away from the line starting in the 1960s. (This falling away very likely reflects the relatively early uptake of smoking in the US.) Since the trajectory of US e0 in fact had the shape we would expect with a constant rate of decline in ASDRs, perhaps there is no puzzle to explain for the US case. But can the same story hold for all the G7 countries analyzed and projected by Tuljapurkar et al. (2000)?

This brings us to the second explanation, which is that contrary to the LC assumptions, the rates of decline have not been constant for each age, which is to say that the LC bx coefficients have not been constant over the sample period. Instead, they have changed shape between the first half of the century, when the mortality decline was much more rapid for the young than for the old, and the second half, when there is little difference among the rates of decline above age 20 or so. Just when the ASDRs of the young became so low that their further decline could contribute little to increasing e0, the rates of decline at the older ages began to accelerate, as noted by Horiuchi and Wilmoth (1998). This tilting of the bx schedule has meant that a given rate of decline of k can produce more rapid rates of increase in e0 than would have been the case with the old bx schedule. The tilting of the bx schedule is shown for the US in Fig. 14.6, and for Sweden, France, Canada, and Japan in Fig. 14.7. In each case the annual rate of decline for mortality is plotted by age for the first and second halves of the twentieth century, except for Japan, for which the break point is 1975.

Fig. 14.6 Average annual reductions in age-specific death rates, US (sexes combined), showing the changing age pattern of decline

Fig. 14.7 Average annual reductions in age-specific death rates, selected low mortality countries (sexes combined), showing the changing age pattern of decline

4 Using These Findings to Improve Mortality Forecasts

The first question is whether or not we should expect record e0 to continue to increase at this rate in the future, and if so for how long? Since I do not understand why this linearity has occurred in the past, I have no reason to think it should, or should not, continue in the future. The regularity in the past invites the forecaster to assume it will continue in the future, at least for a while. Suppose then that we do assume it will continue. How can we use that assumption to mold our forecasts? This line of thinking leads us unavoidably to consider national mortality change in an international context, to which we now turn.

5 Considering National Mortality Change in an International Context

Let E(t) be the best-practice life expectancy at time t. It is imperfectly estimated by the OV record series. The White average e0 measure reflects a different concept. Let ei(t) be actual life expectancy at birth for country i in year t. I will consider a number of possible kinds of models describing the relation between changes in ei(t) and E(t). I will write the equations in continuous time, but they are readily rewritten for discrete annual changes for purposes of estimation.

First Category of Models: All Countries Are Structurally Similar, But Start at Different Levels

$$ {de}_i(t)/ dt=\phi +\alpha \left(E(t)-{e}_i(t)\right)+{\varepsilon}_i(t) \tag{14.1} $$

Here, life expectancy tends to increase at some constant rate ϕ, and in addition it tends to move a proportion α of the way toward the best practice level (record level) E(t) each year. It is also subject to a disturbance ε which could move it toward or away from this trajectory. This specification is consistent with the equation estimated by White. In estimation, I allow the εi(t) for each country to be autocorrelated, εi(t) = ρεi(t−1) + ηi(t), with all countries sharing the same autocorrelation coefficient ρ.

I fit this and later models to life expectancy series for 18 countries with relatively low mortality, with data drawn from the Human Mortality Database (HMD) at http://www.mortality.org/. The data series are of varying historical depth, with the shortest covering 29 years and the longest 159 years. Models are fit using an unbalanced design, so that the full range of data could be exploited. However, the estimation range is sometimes restricted to the period since 1900.
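As a rough illustration of how the discrete-time version of Eq. 14.1 might be estimated, the sketch below simulates a balanced panel and recovers ϕ and α by pooled ordinary least squares. Everything in it is an assumption made for the example: the data are simulated rather than drawn from the HMD, the panel is balanced, errors are not autocorrelated, and the parameter values and function names are invented.

```python
import numpy as np

def simulate_panel(n_countries=18, n_years=100, phi=0.05, alpha=0.07,
                   slope=0.24, sigma=0.3, seed=0):
    """Simulate e_i(t) converging toward a linearly rising record E(t)."""
    rng = np.random.default_rng(seed)
    E = 45.0 + slope * np.arange(n_years + 1)
    e = np.empty((n_countries, n_years + 1))
    e[:, 0] = E[0] - rng.uniform(0.0, 15.0, n_countries)  # different starting levels
    for t in range(n_years):
        shock = rng.normal(0.0, sigma, n_countries)
        e[:, t + 1] = e[:, t] + phi + alpha * (E[t] - e[:, t]) + shock
    return E, e

def fit_convergence(E, e):
    """Pooled OLS of annual change in e_i on a constant and the gap E(t) - e_i(t)."""
    change = np.diff(e, axis=1).ravel()
    gap = (E[:-1] - e[:, :-1]).ravel()
    X = np.column_stack([np.ones_like(gap), gap])
    coef, *_ = np.linalg.lstsq(X, change, rcond=None)
    return coef[0], coef[1]  # estimates of phi and alpha

E, e = simulate_panel()
phi_hat, alpha_hat = fit_convergence(E, e)
```

Allowing for a shared autocorrelation coefficient and an unbalanced design, as in the estimates reported here, would require a more elaborate estimator than this sketch.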

Table 14.1a reports estimates of α for females and for males, based on model specifications with and without autocorrelated errors, and using the OV record.

Table 14.1a Estimated rate of convergence of national life expectancy to Oeppen-Vaupel record level in 18 countries of the Human Mortality Database (Eq. 14.1)

In all cases α is highly significantly different from 0, with values lying between 0.06 and 0.08, indicating a tendency for the life expectancy of the countries to converge towards that of the leader country. The half-life of a deviation from the record level is around 10 years (exp(−10 × 0.07) ≈ 0.5). Here and throughout, results are very similar if the equation is estimated with no constant, so that the only source of life expectancy increase is catching up with the leader, or if there is no allowance for autocorrelated errors. Note that the R2 is low at around 0.04, and that the estimated autocorrelation is negative, which is somewhat surprising. Table 14.1b is the same, except that it uses the HMD record life expectancy in place of OV. The results are very similar, but with a slightly slower rate of convergence and lower R2.
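The half-life quoted above follows from the exponential decay of a deviation at rate α in the continuous-time model, ignoring disturbances. Across the estimated range of α:

$$ t_{1/2}=\frac{\ln 2}{\alpha },\qquad \alpha =0.06,\ 0.07,\ 0.08\ \Rightarrow\ t_{1/2}\approx 11.6,\ 9.9,\ 8.7\ \text{years} $$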

Table 14.1b Estimated rate of convergence of national life expectancy to the highest level in the 18 countries of the Human Mortality Database (Eq. 14.1)

Rather than taking the actual record e0 from OV or HMD as an estimate of the target trajectory toward which life expectancy in all countries is tending, we can instead estimate the implicit unobserved target as part of fitting the model, as in the following equation:

$$ {de}_i(t)/ dt=\phi +{\gamma}_t{D}_t-\alpha {e}_i(t)+{\varepsilon}_i(t) \tag{14.2} $$

Here Dt is a dummy variable equal to 1 in year t and 0 otherwise, and γt is its coefficient. The ratio γt/α gives the target trajectory, playing a role much like the OV record level. Results are reported in Table 14.2 (with estimates of γ not shown, to save space). Because the target is chosen to maximize its explanatory power, the R2 is now much greater, while rates of convergence, α, are somewhat slower.

Table 14.2 Estimated rate of convergence of national life expectancy to an annual implicit target in the 18 countries of the Human Mortality Database (Eq. 14.2)

Figure 14.8 plots the estimated values of γt/α, corresponding to the implicit target trajectory. For comparison the record life expectancy for the HMD is also plotted. We see that the target trajectory lies above the maximum about half the time and also that the target trajectory is highly erratic, possibly with negative autocorrelation.

Fig. 14.8 Estimated implicit target of convergence (Eq. 14.2) in the 18 countries of the Human Mortality Database (erratic line), compared to the HMD record life expectancy (smooth line)

When life expectancy is generally above trend, as might happen in a year with a mild winter affecting many countries, for example, the regression will try to fit this by estimating a very high target value, and conversely. This will lead to an underestimate of the size of the convergence coefficient, α. To avoid these problems, it is desirable to impose a smoothness constraint of some kind on the target trajectory. Here I will take the simplest route, assuming that the target trajectory is a linear function of time, leading to the following equation:

$$ {de}_i(t)/ dt=\phi +\alpha \left(a+ bt-{e}_i(t)\right)+{\varepsilon}_i(t) \tag{14.3} $$

The results are shown in Table 14.3. The estimated rate of convergence, α, is now slightly higher than in the first set of estimates. The rate of increase of the linear target trajectory is found by dividing the coefficient on “year” by the estimate of α, that is the coefficient on –ei,0, which is also given in the table. The rate of increase in the target calculated in this way is slightly higher than for the record for OV or the HMD. For example, the gain per year in target e0 estimated here for the whole period is 0.271 years per year, while in OV it is 0.243 years per year. Other comparisons are similar.
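To see why this ratio recovers the slope of the target, the linear-target specification (Eq. 14.3) can be regrouped by regressor. The line below is only a restatement of that equation:

$$ {de}_i(t)/ dt=\left(\phi +\alpha a\right)+\alpha b\,t-\alpha\,{e}_i(t)+{\varepsilon}_i(t) $$

The coefficient on time is αb and the coefficient on −ei(t) is α, so their ratio is b. Note that ϕ and a enter only through the combined intercept ϕ + αa, so they are not separately identified by this regression.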

Table 14.3 Estimated rate of convergence of national life expectancy to a linear implicit target in the 18 countries of the Human Mortality Database (Eq. 14.3)

It is possible that countries that are twice as far from E(t) may not converge twice as quickly. To allow for this, we can add a term that is quadratic in the size of the gap (the quantity in parentheses in Eq. 14.1). A negative coefficient on the quadratic term would indicate that the pace of increase in e0 is less than proportionate to the size of the gap, and a positive coefficient that it is more than proportionate.

$$ {de}_i(t)/ dt=\phi +\alpha \left(E(t)-{e}_i(t)\right)+\beta {\left(E(t)-{e}_i(t)\right)}^2+{\varepsilon}_i(t) \tag{14.4} $$

The results of estimating this specification are given in Tables 14.4a and 14.4b, and are unambiguous: in every case, the coefficient on the quadratic term, β, is highly significantly greater than zero, and the coefficient on the linear term is negative. In order to interpret these coefficients, I show in Fig. 14.9 the derivative of the change in ei(t) with respect to the size of the gap, E(t) – ei(t).

Table 14.4a Estimated quadratic rate of convergence of national life expectancy to Oeppen-Vaupel record level in the 18 countries of the Human Mortality Database (Eq. 14.4)

Table 14.4b Estimated quadratic rate of convergence of national life expectancy to HMD record level in the 18 countries of the Human Mortality Database (Eq. 14.4)

Fig. 14.9 Derivative of quadratic convergence to the Oeppen-Vaupel target: how the proportional effect of a gap increases with the size of the gap

Under the linear specification used earlier, this plot would be a horizontal line at height α. Here, however, we see that all the lines slope decisively upward to the right, indicating that the rate of convergence increases more than proportionately with the size of the gap. The initial negative values most likely reflect the limitations of the quadratic specification, rather than a true tendency of the rate of change to decline as the gap increases in this low range. Most of the gaps, 90 to 95% of them, are less than 8 years. Only a few fall outside that range, and are subject to the higher sensitivities to the right on the plot. In future work it should be possible to examine the nonlinearity of the response better, drawing on data for Third World countries with higher mortality, but these have not yet been added to the HMD.
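The plotted quantity follows directly from the quadratic specification. Writing g = E(t) − ei(t) for the gap (a symbol introduced here only for convenience), the derivative of the expected change with respect to the gap is:

$$ \frac{\partial }{\partial g}\left(\phi +\alpha g+\beta {g}^2\right)=\alpha +2\beta g $$

With α negative and β positive, as estimated, this expression is negative for gaps smaller than −α/(2β) and rises linearly with the gap thereafter, which is why the lines start below zero and slope upward.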

6 Extensions

6.1 Heterogeneous Targets

If the foregoing models were the whole story, we would expect the life expectancies of countries to be distributed randomly around E(t), since their mortality would have had decades or centuries in which to converge to E(t) under the influences described in the equations. But of course, this is not the case. A more realistic model would take into account the heterogeneity of international experience, by incorporating additional factors that influence the level toward which each country’s e0 converges, which may not be the best practice level. I will call this modified target the idiosyncratic target. We can take it to equal E(t) + πX(t), where X is a vector of relevant factors and π is a vector of coefficients. X includes relevant variables such as per capita income, educational attainment, nutritional measures, dietary measures, smoking behavior, and geographic/climatic conditions. πX expresses a deviation from the best practice level. Over time, E(t) rises. If X remained constant, the target level would nonetheless increase with E(t). More likely, πX also increases, indicating an additional source of increase in the target level of e0. πX could capture influences like those included in Preston’s (1980) analysis, in which he fit socioeconomic models to international cross-sections of life expectancy, and then decomposed gains in life expectancy into movements along the πX curve with economic development, and upward shifts in the whole equation, which would here be reflected in the combination of convergence and a common growth rate, ϕ. The ε shocks could reflect political, military, weather, or epidemiological factors of a transitory nature. This model would be:

$$ {de}_i(t)/ dt=\phi +\alpha \left(E(t)+\pi {X}_{i,t}-{e}_i(t)\right)+{\varepsilon}_i(t) \tag{14.5} $$

Once again, it would be possible to estimate E(t) as part of fitting the model, either unconstrained or constrained to have a linear trajectory. If estimated in this way it will reflect changes in the target net of socioeconomic progress, a concept closer to Preston’s residual improvement of life expectancy. Country i will have a target or equilibrium life expectancy in year t of E(t) + πXi,t, so heterogeneity in equilibria is now incorporated. Countries that are poor, smoke, eat a high cholesterol diet, have low education, or perhaps have a tropical climate, will tend towards lower levels of life expectancy.

6.2 Heterogeneous Rates of Convergence

It is also possible that different countries will have different rates of convergence, α. For isolated countries, or perhaps for very poor ones, or ones with very little transportation or communication infrastructure, α may be smaller.

We can take this into account by making α a function of a set of variables Z.

$$ {de}_i(t)/ dt=\phi +\left(\alpha +\delta {Z}_{i,t}\right)\left(E(t)+\pi {X}_{i,t}-{e}_i(t)\right)+{\varepsilon}_i(t) \tag{14.6} $$

Z would include factors indicating the degree of integration of country i in the global community, and perhaps other factors bearing on the strength of government and the communications and transportation infrastructure in the country. It might be difficult to identify factors that belonged in Z rather than in X.

7 Forecasting Mortality

Let us assume that the linear trend in record or average life expectancy will continue. Then the next steps are straightforward. We use the linear trend to project the record life expectancy (or the target trend that was estimated as part of the convergence model). We will know the current life expectancy for a particular country of interest. We can use the appropriate or preferred equation for dei(t)/dt to estimate e0 one year later, and then continue recursively. The projected e0 will gradually approach the projected linear trend.
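The recursive step can be sketched as below, using the simplest convergence equation in annual steps with no disturbance term. The starting values, target line, and parameters are hypothetical and chosen only for illustration, and the function name is invented for this example.

```python
def project_e0(e_start, target_start, target_slope, phi, alpha, n_years):
    """Recursively project e0 toward a linearly rising target, one year at a time."""
    path = [e_start]
    for step in range(n_years):
        target = target_start + target_slope * step  # projected record in this year
        path.append(path[-1] + phi + alpha * (target - path[-1]))
    return path

# Hypothetical inputs (not estimates from the tables).
path = project_e0(e_start=77.0, target_start=85.0, target_slope=0.24,
                  phi=0.0, alpha=0.07, n_years=200)
final_gap = (85.0 + 0.24 * 200) - path[-1]
```

In this discrete recursion the projected series settles onto a path parallel to the target, trailing it by roughly (slope − ϕ)/α years, so the long-run gap vanishes only when ϕ equals the slope of the target.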

This procedure could be improved by using a model version which allowed for some heterogeneity, as in Eqs. (14.5) and (14.6). Not all countries will approach the same trend line, but each should approach a trajectory that is parallel to it. In these specifications, we would also have to consider the advisability of projecting changes in the X and Z variables, and methods for doing so.

The initial assumption of a pure linear trend could also be questioned and dropped. The central tendency (record, average, or other) could instead be modeled as a stochastic time series, and forecast in that way. That could certainly be done for the γ series, for example.

In general, the approach of forecasting mortality for individual countries in reference to the international context is very appealing, and I believe it is the natural way to go in future work. Whether this approach is applied to life expectancy itself, or to a Lee-Carter type k, or in some other way, will have to be settled by further research. In the meantime, these recent papers, and particularly OV, challenge our current perceptions of mortality change and expectations about future trends.