Introduction

The improvement in mortality rates in developed countries until the first half of the 20th century was largely due to improvements in infant and child mortality rates, whereas the improvement since the 1970s has been largely due to improvements in old-age mortality rates (Olshansky & Ault, 1986). This shift was particularly remarkable in Japan, which suggests that mortality modeling in Japan requires models that are compatible with the remarkable improvements in old-age mortality rates.

In the field of demography, relational models that enable life table estimation from a small number of demographic indices have been developed, such as the Log Quad Model by Wilmoth et al. (2012) and the Extended Log Quad Model 1 by Horiguchi (2022). The Log Quad Model enables the indirect estimation of life tables from the probability of dying between ages 0 and 4 (\(_{5}q_{0}\)). The model is based on life tables from the Human Mortality Database (HMD) (Human Mortality Database, 2023). Horiguchi (2022) modified the Log Quad Model to fit the recent improvements in old-age mortality rates in Japan, using the singular value decomposition (SVD) method, and proposed this model as the Extended Log Quad Model 1.

This model can be used to estimate life tables from a small number of indices (\(_{5}q_{0}\), \(_{45}q_{15}\), and \(e_{65}\)), which are relatively easy to obtain. This is useful when municipalities make their own assumptions regarding life tables for population projections. However, the Extended Log Quad Model 1 has larger estimation errors for ages under 50 for females, particularly in the 20–39 age group.

Hence, this study aimed to develop an extended mortality model, the Extended Log Quad Model 2, by improving the Extended Log Quad Model 1, and to apply it to municipal life table estimation.

Background

Mortality models can be broadly classified into three categories: (1) mathematical models, (2) model life tables, and (3) relational models. In a mathematical model, life table functions are expressed as mathematical functions of age. A well-known example is the Gompertz model (Gompertz, 1825), which expresses the force of mortality as an exponential function of age. A model life table expresses the life table function using several numerical tables based on empirical experience. The Coale–Demeny model life tables (Coale & Demeny, 1983) are a typical example. This model classifies life table shapes into four types (North, South, East, and West), each of which has a model life table consisting of 25 levels representing various life tables. Relational models establish a standard mortality pattern by age and mathematically express divergence from it. The Lee–Carter (Lee & Carter, 1992) and Brass logit (Brass, 1971) models are representative examples, and the Log Quad Model by Wilmoth et al. (2012) is also a type of relational model. In recent years, relational models have been used in various applications because they combine the advantage of expressing life table functions with a small number of parameters, which is a feature of (1), with the advantage of reflecting empirical life tables, which is a feature of (2).

In recent years, research has been conducted to apply relational models to develop new model life tables that allow for indirect estimations based on limited information, primarily in developing regions where high-quality demographic data are not available. The Log Quad Model is one such model (Wilmoth et al., 2012). This model is based on the following equation (1):

$$\begin{aligned} \ln \textbf{m}_{x} = \textbf{a}_{x} + \textbf{b}_{x} h + \textbf{c}_{x} h^2 + \textbf{v}_{x} k \end{aligned}$$
(1)

where \(x\) denotes age by 5-year age group (except for ages 0 and 1–4 and the open interval of age 110 and older). Furthermore, \(h\) is the logarithm of the probability of dying at ages 0–4 (\(\ln _{5}q_0\)), a constant representing the level of child mortality. \(k\) is a constant that represents the deviation from the typical age pattern expressed by the terms up to second order in \(h\) on the right side of Eq. (1), and it usually takes values in the range \((-2, 2)\). \(k\) is derived numerically to reproduce the actual value of the probability of dying at ages 15–59 (\(_{45}q_{15}\)). When only information on child mortality is available, the life table is estimated with \(k\) fixed at \(k=0\). When adult mortality is also available, the accuracy of the life table estimation can be improved by varying \(k\). The coefficients (\(\textbf{a}_{x}\), \(\textbf{b}_{x}\), \(\textbf{c}_{x}\), \(\textbf{v}_{x}\)) are estimated by weighted least squares and the SVD method, using the 719 life tables in the HMD.
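As an illustration of how Eq. (1) is evaluated, the following minimal Python sketch computes \(\ln \textbf{m}_{x}\) for a few age groups. The function name `log_quad_log_mx` and all coefficient values are hypothetical placeholders chosen for illustration; they are not the coefficients published by Wilmoth et al. (2012).

```python
import math

def log_quad_log_mx(a, b, c, v, q5_0, k=0.0):
    """Return ln m_x for each age group under Eq. (1).

    a, b, c, v are per-age-group coefficient lists; q5_0 is the
    probability of dying between ages 0 and 4; k is the deviation
    from the typical age pattern (k = 0 when only 5q0 is known).
    """
    h = math.log(q5_0)  # h = ln(5q0), the level of child mortality
    return [ai + bi * h + ci * h ** 2 + vi * k
            for ai, bi, ci, vi in zip(a, b, c, v)]

# Hypothetical coefficients for three age groups (NOT published values).
a = [-5.0, -4.0, -3.0]
b = [0.5, 0.4, 0.3]
c = [0.01, 0.02, 0.03]
v = [0.2, 0.1, 0.0]

base = log_quad_log_mx(a, b, c, v, q5_0=0.005)            # k fixed at 0
shifted = log_quad_log_mx(a, b, c, v, q5_0=0.005, k=1.0)  # k varied
```

With \(h\) held fixed, changing \(k\) shifts each age group's log mortality by \(\textbf{v}_{x} k\), so age groups whose \(\textbf{v}_{x}\) is zero are unaffected.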

The Log Quad Model can efficiently express various age patterns from limited information and is reported to perform as well as or better than the model life tables proposed thus far. The development of such an excellent model life table is considered to have been made possible by the HMD project, which has compiled a large number of life tables of guaranteed quality in a uniform manner.

Separately, Clark (2019) proposed the SVD-Component (SVD-Comp) Model, which uses higher-order singular components than the Log Quad Model, and applied it to the indirect estimation of the probability of dying. This model is expressed as follows. Let \(\textbf{Q}_{z}\) be an \(A \times L\) matrix that stores the probability of dying for each sex (\(z\): female/male), where \(A\) is the number of age classes, \(L\) is the number of life tables, and \(l=1, 2, \cdots , L\) indexes life tables. Applying SVD to \(\textbf{Q}_{z}\), we have

$$\begin{aligned} \textrm{SVD}(\textbf{Q}_{z}) = \varvec{\Theta }_{z}\textbf{S}_{z}\varvec{\Xi }_{z}^{\textrm{T}} = \sum _{i=1}^{\rho } s_{zi} \varvec{\theta }_{zi} \varvec{\xi }_{zi}^{\textrm{T}} \end{aligned}$$
(2)

Here, \(\varvec{\Theta }_{z}\) is a matrix whose columns are the left singular vectors \(\varvec{\theta }_{zi}\), \(\varvec{\Xi }_{z}\) is a matrix whose columns are the right singular vectors \(\varvec{\xi }_{zi}\), and \(\textbf{S}_{z}\) is a diagonal matrix whose diagonal elements are the singular values \(s_{zi}\). Finally, \(\rho =\textrm{rank}(\textbf{Q}_{z})\). When the \(l\)th component of \(\varvec{\xi }_{zi}\) is written as \(\xi _{zli}\), the probability of dying by sex and age for life table \(l\) (\(\textbf{q}_{zl}\)) is approximated by the sum of the first \(c\) terms (\(c \le \rho\)) of the SVD as follows:

$$\begin{aligned} \textbf{q}_{zl} \approx \sum _{i=1}^{c}\xi _{zli} \cdot s_{zi} \varvec{\theta }_{zi} \end{aligned}$$
(3)

According to Clark (2019), the probabilities of dying included in the HMD can be approximated sufficiently well with \(c=4\). For indirect estimation, Clark (2019) proposed a method in which the values \(s_{zi} \varvec{\theta }_{zi}\) obtained from this SVD are held fixed and \(\xi _{zli}\) is estimated by regression on (\(_{5}q_{0}\), \(_{45}q_{15}\)).
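The truncated reconstruction in Eq. (3) can be sketched with NumPy as follows. The matrix `Q` is a synthetic stand-in built to have rank 4, not HMD data, and the helper name `rank_c_column` is hypothetical. The sketch only illustrates the mechanics of keeping the first \(c\) components; it does not reproduce the regression step of Clark (2019).

```python
import numpy as np

rng = np.random.default_rng(0)
A, L = 24, 50  # number of age classes and of life tables (illustrative)

# Synthetic A x L matrix of rank 4; real use would load life-table data.
Q = rng.random((A, 4)) @ rng.random((4, L))

# Thin SVD: columns of U are theta_i, s holds s_i, rows of Vt are xi_i^T.
U, s, Vt = np.linalg.svd(Q, full_matrices=False)

def rank_c_column(l, c):
    """Approximate column l of Q by sum_{i<c} xi_{li} * s_i * theta_i."""
    scaled_components = U[:, :c] * s[:c]   # s_i * theta_i, one per column
    weights = Vt[:c, l]                    # xi_{li} for life table l
    return scaled_components @ weights

err_c1 = np.linalg.norm(rank_c_column(0, 1) - Q[:, 0])
err_c4 = np.linalg.norm(rank_c_column(0, 4) - Q[:, 0])
```

Because the synthetic matrix has rank 4 by construction, four components reproduce each column to numerical precision, whereas a single component leaves a visible error. Real life table data are only approximately low rank.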

Thus, studies applying the relational model have been conducted in Wilmoth et al. (2012) and Clark (2019), mainly for the purpose of contributing to the indirect estimation of life tables for developing regions. The Extended Log Quad Model 1 proposed by Horiguchi (2022) is an application of these findings to regional life table estimations in Japan, a developed country. The background to the development of the Extended Log Quad Model 1 is described below.

In Japan, since 2003, official future population projections by municipality have been published and updated approximately every five years. However, the assumptions for future survival rates have been limited to one for each municipality, with no scope for local adjustments.

Considering this limitation, Horiguchi (2022) attempted to apply this line of research on relational models to the estimation of municipal life tables in Japan. The model is expressed by the following equation (4):

$$\begin{aligned} \ln \textbf{m}_{x} = \textbf{a}_{x} + \textbf{b}_{x} h + \textbf{c}_{x} h^2 + \textbf{v}_{x} k + \textbf{u}_{x} r \end{aligned}$$
(4)

This takes the form of the Log Quad Model of Eq. (1) with an added correction term \(\textbf{u}_{x}r\). The coefficients (\(\textbf{a}_{x}\), \(\textbf{b}_{x}\), \(\textbf{c}_{x}\), and \(\textbf{v}_{x}\)) are taken directly from the Log Quad Model. In the following, we discuss the derivation of the correction term \(\textbf{u}_{x} r\). To do so, we first examine the problems that arise when the Log Quad Model is applied to Japanese mortality rates in recent years (1978–2017).

Fig. 1 Mean Squared Error (MSE) of the Log Quad Model and \(e_{65}\) (47 Prefectures of Japan, Female, Ages 65+, Years 1978–2017). Source: Author

Figure 1 shows the results of fitting the Log Quad Model of Eq. (1) to the life tables for women in Japan from 1978 to 2017. The accuracy of the estimation was measured using the mean squared error (MSE) of the Log Quad Model at ages 65 and above. The MSE is expressed by the following equation (5):

$$\begin{aligned} \textrm{MSE} = \dfrac{1}{n} \sum _{i=1}^{n} (\ln \hat{\textbf{m}}_{65+5(i-1)}^{\textrm{Quad}}-\ln \textbf{m}_{65+5(i-1)})^2 \end{aligned}$$
(5)

Here, \(n\) represents the number of age groups aged 65 and above (\(n=10\)), \(\textbf{m}_{x}\) denotes the actual mortality rate, and \(\hat{\textbf{m}}_{x}^{\textrm{Quad}}\) denotes the mortality rate estimated by the right-hand side of the Log Quad Model [Eq. (1)]. Figure 1 plots this MSE against life expectancy at age 65 (\(e_{65}\)).
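The error measure in Eq. (5) can be written as a short Python function. This is a sketch for illustration: the function name `mse_log_mortality` and the two-element example rates are hypothetical, and in the paper the inputs would be the mortality rates of the ten age groups from 65–69 upward.

```python
import math

def mse_log_mortality(m_hat, m_obs):
    """Mean squared error of ln(m_hat) - ln(m_obs) over the age groups given."""
    if len(m_hat) != len(m_obs):
        raise ValueError("m_hat and m_obs must cover the same age groups")
    n = len(m_obs)
    return sum((math.log(mh) - math.log(mo)) ** 2
               for mh, mo in zip(m_hat, m_obs)) / n

# Hypothetical rates: the first estimate is too high by a factor of e,
# the second is exact, so the squared log errors are 1 and 0.
m_obs = [0.01, 0.02]
m_hat = [0.01 * math.e, 0.02]
mse = mse_log_mortality(m_hat, m_obs)
```

Because the error is measured on the log scale, it reflects relative rather than absolute deviations, so the low mortality rates at younger old ages count as much as the high rates at the oldest ages.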

The figure shows that the MSE increases concomitantly with \(e_{65}\). In other words, the estimation error of the Log Quad Model tends to increase as the old-age mortality rate improves.

Based on these observations, the Extended Log Quad Model 1 models the estimation error of the Log Quad Model in the 50+ age group and estimates it as the product of \(\textbf{u}_{x}\) and \(r\), so that the model can accommodate the remarkable improvement in old-age mortality rates in Japan in recent years.

The estimation error by the Log Quad Model (with \(k\) variable) is expressed as follows:

$$\begin{aligned} \textrm{d} \ln \textbf{m}_{x}^{\textrm{Quad}}=\ln \textbf{m}_{x} - \ln \hat{\textbf{m}}_{x}^{\textrm{Quad}} \end{aligned}$$
(6)

Applying SVD to \(\textrm{d} \ln \textbf{m}_{x}^{\textrm{Quad}}\) for the age groups 50 years and over gives

$$\begin{aligned} \textrm{SVD}(\textrm{d} \ln \textbf{m}_{x}^{\textrm{Quad}})= \varvec{\Phi } \textbf{S} \varvec{\Psi }^{\textrm{T}} = \sum _{i=1}^{\rho } \varvec{\phi }_{i} s_i \varvec{\psi }_{i}^{\textrm{T}} \approx \sum _{i=1}^{c} \varvec{\phi }_{i} s_i \varvec{\psi }_{i}^{\textrm{T}} \end{aligned}$$
(7)

where \(\rho = \textrm{rank}(\textrm{d} \ln \textbf{m}_{x}^{\textrm{Quad}})\), \(c \le \rho\), \(\varvec{\Phi }=[\varvec{\phi }_{1}, \cdots , \varvec{\phi }_{\rho }]\), \(\varvec{\Psi } = [\varvec{\psi }_{1}, \cdots , \varvec{\psi }_{\rho }]\), and \(\textbf{S}\) is a diagonal matrix with the singular values \(s_i\) as its diagonal components.

On the right-most side of Eq. (7), by setting \(c=1\), the estimation error of the log of mortality for life table \(l\) (\(\textrm{d} \ln \textbf{m}_{xl}^{\textrm{Quad}}\)) is approximated as follows:

$$\begin{aligned} \textrm{d} \ln \textbf{m}_{xl}^{\textrm{Quad}} \approx \varvec{\phi }_{1} \cdot s_1 \psi _{1l} \end{aligned}$$
(8)

where \(\psi _{ij}\) denotes the \(j\)th component of \(\varvec{\psi }_{i}\). Therefore, we estimate \(\textbf{u}_{x}\) and \(r\) as \(\textbf{u}_{x}=\varvec{\phi }_{1} \cdot s_1\) and \(r=\psi _{1l}\) in Eq. (8).

Table 1 lists the coefficients of the Extended Log Quad Model 1. Note that Table 1 displays four digits, whereas in the original paper by Horiguchi (2022), the coefficients \(\textbf{u}_{x}\) are shown with two digits.

Table 1 Coefficients of the Extended Log Quad Model 1

The Extended Log Quad Model 1 by Horiguchi (2022) is a relational model that can be applied to life table estimation by region in Japan. It can be used to create life tables from a small number of relatively easily available indicators, such as life expectancy at age 65. Thus, this model is highly versatile and can be applied when local governments want to estimate future populations by making their own assumptions about life tables. However, the Extended Log Quad Model 1 tends to have relatively large estimation errors for ages below 50 years for females, especially for females aged 20–39. Therefore, this study proposes the Extended Log Quad Model 2 as a method to address this issue, using the prefectural life tables of the Japanese Mortality Database (JMD), and aims to apply this model to life table estimation by municipality. Table 2 summarizes the relationships between the models (the Log Quad Model, the Extended Log Quad Model 1, and the Extended Log Quad Model 2) in terms of the model coefficients, constants, indicators, and modeling methods.

Table 2 Relationships of the models: the Log Quad Model, the Extended Log Quad Model 1, and the Extended Log Quad Model 2

Data and Methods

To construct the Extended Log Quad Model 2 proposed in this study, we used the prefectural life tables of the JMD (National Institute of Population and Social Security Research, 2022). Specifically, we used the JMD prefectural life tables by sex (by five-year age group, with a five-year base period centered on the census year). We used life tables for the period 1978–2017 because this series contains eight five-year base periods in that span: 1978–1982, 1983–1987, 1988–1992, 1993–1997, 1998–2002, 2003–2007, 2008–2012, and 2013–2017.

The accuracy of estimation by the Extended Log Quad Model 2 was evaluated by comparison with existing models (the Log Quad Model, the SVD-Comp Model, and the Extended Log Quad Model 1) using the JMD prefectural life tables. Specifically, the accuracy of each model was evaluated by comparing the estimated value of \(e_0\) obtained from the model with the actual value in the JMD prefectural life table and calculating the root mean squared error (RMSE) of the differences. The parameters and regression coefficients used for the indirect estimation of the SVD-Comp Model were taken from those published by Clark (2019b).

To test the applicability of the proposed model to life tables by municipality, we used the “Municipal Life Tables” of the Ministry of Health, Labor, and Welfare (MHLW) for 2010 and 2015. Japan has three levels of government: national, prefectural, and municipal. The country is divided into 47 prefectures, each of which includes numerous municipalities (cities, towns, and villages). Cities that have a population of over 500,000 are called ordinance-designated cities, and each ordinance-designated city is subdivided into several wards. The 23 special wards of Tokyo are counted among these wards, although they are not part of this administrative system and are effectively independent cities.

In the “Municipal Life Tables,” the mortality rate, which is the basis for creating the life table, is calculated using Bayesian estimation, because the number of deaths in each municipality is small and a direct calculation from the actual values of population and deaths is often unstable owing to the influence of random fluctuations (Ministry of Health, Labor and Welfare of Japan, 2013; Ministry of Health, Labor, and Welfare of Japan, 2018). The life table produced without Bayesian estimation is not shown in the published results.

Additionally, the “Municipal Life Tables” are typically prepared using the number of deaths for three years, including one year before and one year after the census, and the regions used to set the prior distribution for Bayesian estimation are based on secondary medical regions (Suga, 2018); however, in 2010, owing to the effects of the Great East Japan Earthquake, the number of deaths for only one year (2010) was used. Furthermore, the prior distribution was set to “prefectures, ordinance-designated cities, and Tokyo special wards” after 2010 (Ministry of Health, Labor and Welfare of Japan, 2013; Ministry of Health, Labor, and Welfare of Japan, 2018). When comparing life tables for different years, it is necessary to consider such differences as the method of preparation. However, in this study, the published values in the “Municipal Life Tables” were used as the actual values.

The Extended Log Quad Model 2 proposed in this study is expressed by the following equation (9):

$$\begin{aligned} \ln \textbf{m}_{x} = \textbf{a}_{x} + \textbf{b}_{x} h + \textbf{c}_{x} h^2 + \textbf{w}_{x} \gamma \end{aligned}$$
(9)

This form is obtained by removing the terms \(\textbf{v}_{x} k\) and \(\textbf{u}_{x} r\) from the Extended Log Quad Model 1 in Eq. (4) and adding the term \(\textbf{w}_{x} \gamma\). The coefficients (\(\textbf{a}_{x}\), \(\textbf{b}_{x}\), \(\textbf{c}_{x}\)) are taken from the Log Quad Model, whereas \(\textbf{w}_{x}\) is newly estimated by the method described below. Next, we describe the derivation of \(\textbf{w}_{x} \gamma\).

Horiguchi (2022) suggested that the estimation error tends to be pronounced for ages under 50 for females, especially for females between 20 and 39 years of age. Therefore, using the SVD method, we modeled the estimation error of the Log Quad Model (\(k=0\)) for ages 20 and above, a range that includes the ages at which the error is particularly pronounced, and estimated it as the product of \(\textbf{w}_{x}\) and \(\gamma\).

The \(\textbf{w}_{x}\) and \(\gamma\) are estimated by the following method. Let \(\textbf{m}_{x}\) be the actual mortality rate and \(\hat{\textbf{m}}_{x}^{0}\) be the estimated mortality rate of the Log Quad Model (\(k=0\)). Then, its estimation error (\(\textrm{d} \ln \textbf{m}_{x}^{0}\)) is expressed as follows:

$$\begin{aligned} \textrm{d} \ln \textbf{m}_{x}^{0} = \ln \textbf{m}_{x} - \ln \hat{\textbf{m}}_{x}^{0} \end{aligned}$$
(10)

Here, by applying SVD to \(\textrm{d} \ln \textbf{m}_{x}^{0}\) for the age groups aged 20 and above, \(\textrm{d} \ln \textbf{m}_{x}^{0}\) can be expressed as follows:

$$\begin{aligned} \textrm{SVD}(\textrm{d} \ln \textbf{m}_{x}^{0}) = \varvec{\Phi } \varvec{S} \varvec{\Psi }^{\textrm{T}} = \sum _{i=1}^{\rho } \varvec{\phi }_{i} s_{i} \varvec{\psi }_{i}^{\textrm{T}} \approx \sum _{i=1}^{c} \varvec{\phi }_{i} s_{i} \varvec{\psi }_{i}^{\textrm{T}} \end{aligned}$$
(11)

where \(\rho =\textrm{rank}(\textrm{d} \ln \textbf{m}_{x}^{0})\), \(c \le \rho\), \(\varvec{ \Phi } = [\varvec{\phi }_{1}, \cdots , \varvec{\phi }_{\rho }]\), \(\varvec{\Psi }=[\varvec{\psi }_{1}, \cdots , \varvec{\psi }_{\rho }]\), and \(\textbf{S}\) is a diagonal matrix with the singular values \(s_i\) as its diagonal components. By setting \(c=1\) in Eq. (11), the error in estimating the log mortality of life table \(l\) (\(\textrm{d} \ln \textbf{m}_{xl}^{0}\)) is approximated as follows:

$$\begin{aligned} \textrm{d} \ln \textbf{m}_{xl}^{0} \approx \varvec{\phi }_{1} \cdot s_{1} \psi _{1l} \end{aligned}$$
(12)

where \(\psi _{ij}\) denotes the \(j\)th component of \(\varvec{\psi }_{i}\). Subsequently, \(\textbf{w}_{x}\) and \(\gamma\) are estimated as follows: \(\textbf{w}_{x} = \varvec{\phi }_{1} \cdot s_{1}\) and \(\gamma = \psi _{1l}\).
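The rank-one extraction in Eqs. (11) and (12) can be sketched with NumPy as follows. The residual matrix `resid` here is synthetic (a single age pattern times a level per life table, plus small noise) and merely stands in for \(\textrm{d} \ln \textbf{m}_{x}^{0}\). The dimensions and all variable names are assumptions made for illustration, not values from the JMD.

```python
import numpy as np

rng = np.random.default_rng(1)
n_ages, n_tables = 19, 40  # illustrative: age groups 20+ and life tables

# Synthetic residuals: rows are age groups, columns are life tables.
pattern = np.linspace(-0.1, -0.6, n_ages)   # hypothetical age pattern
level = rng.normal(size=n_tables)           # hypothetical level per table
resid = np.outer(pattern, level) + 0.01 * rng.normal(size=(n_ages, n_tables))

Phi, s, PsiT = np.linalg.svd(resid, full_matrices=False)

w_x = Phi[:, 0] * s[0]   # w_x = phi_1 * s_1 (one value per age group)
gamma = PsiT[0, :]       # gamma for life table l is psi_{1l}

rank1 = np.outer(w_x, gamma)             # Eq. (12) for all life tables
share = s[0] ** 2 / np.sum(s ** 2)       # share of squared error captured
```

Note that the SVD fixes \(\varvec{\phi }_{1}\) and \(\varvec{\psi }_{1}\) only up to a common sign, so `w_x` and `gamma` may both be flipped without changing their product. The quantity `share` is a convenient diagnostic, not part of the paper's procedure, for judging whether a single component (\(c=1\)) is adequate.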

Estimation of Life Tables Using the Extended Log Quad Model 2

In the Extended Log Quad Model 1, the life table was estimated by determining \(h\), \(k\), and \(r\) in Eq. (4) using \(_{5}q_{0}\), \(_{45}q_{15}\), and \(e_{65}\) for the target population. In the Extended Log Quad Model 2 of this study, only two indicators, \(_{5}q_{0}\) and \(e_{65}\), are used to estimate \(h\) and \(\gamma\) in Eq. (9).

We first obtained the actual (or estimated) values of (\(_{5}q_{0}\), \(e_{65}\)) for the target population and set \(h=\ln _{5}q_{0}\). Next, \(\gamma\) was obtained numerically to ensure \(e_{65}\) derived from the Extended Log Quad Model 2 was equal to the actual value. By substituting \(h\) and \(\gamma\) obtained by the above procedure into the expression (9), an estimate of the mortality rate was obtained. This is denoted as \(\hat{\textbf{m}}_{x}^{\textrm{Ext2}}\).
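The numerical step for \(\gamma\) described above can be sketched in Python as a bisection search. This is not the authors' implementation: the life table closure (deaths at mid-interval, person-years in the open group equal to \(l/m\)), the function names, and the values of `base_log_mx` and `w` are assumptions made for illustration. Only the ten age groups from 65–69 to 110+ are needed, since \(e_{65}\) depends on no others.

```python
import math

def e65_from_mx(mx, n=5.0):
    """Life expectancy at 65 from rates for 65-69, ..., and a final open group.

    Assumes deaths fall at mid-interval on average and that person-years
    in the open group equal survivors divided by the open-group rate.
    """
    lx, person_years = 1.0, 0.0
    for m in mx[:-1]:
        q = min(1.0, n * m / (1.0 + 0.5 * n * m))
        dx = lx * q
        person_years += n * (lx - 0.5 * dx)
        lx -= dx
    return person_years + lx / mx[-1]

def ext2_mx(base_log_mx, w, gamma):
    """Rates from ln m_x = (a_x + b_x h + c_x h^2) + w_x * gamma."""
    return [math.exp(bl + wi * gamma) for bl, wi in zip(base_log_mx, w)]

def solve_gamma(base_log_mx, w, target_e65, lo=-10.0, hi=10.0, tol=1e-10):
    """Bisection for the gamma whose implied e65 equals target_e65."""
    def f(g):
        return e65_from_mx(ext2_mx(base_log_mx, w, g)) - target_e65
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("target e65 is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid                 # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid    # root lies in [mid, hi]
    return 0.5 * (lo + hi)

# Hypothetical inputs for ages 65-69, ..., 105-109, 110+ (not Table 3 values).
base_log_mx = [math.log(0.01) + 0.5 * i for i in range(10)]
w = [-0.3] * 10
gamma = solve_gamma(base_log_mx, w, target_e65=24.0)
fitted_e65 = e65_from_mx(ext2_mx(base_log_mx, w, gamma))
```

Bisection is used here because, when all \(\textbf{w}_{x}\) at ages 65 and above share the same sign, \(e_{65}\) is monotone in \(\gamma\) and the root is unique within the bracket. Any standard one-dimensional root finder would serve equally well.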

Deriving the constant \(\gamma\) numerically to reproduce the actual value of \(e_{65}\) is reasonable for the following reason. We examined the correlation between the first right singular vector (RSV1) \(\psi _{1l}\) on the right-hand side of Eq. (12) and \(e_{65}\) obtained from the JMD prefectural life tables, and confirmed that there is a strong positive correlation between them (correlation coefficient \(r=0.94\) for females, \(r=0.74\) for males).

Results

Coefficients of the Extended Log Quad Model 2 and Results of Mortality Estimates

The coefficients of the Extended Log Quad Model 2 are derived from the JMD prefectural life tables and are shown in Table 3. Table 4 shows the RMSE of \(e_0\) for each model when applied to the JMD prefectural life tables for the years 1978–2017. This table shows that the Extended Log Quad Models 1 and 2 have higher estimation accuracies for both men and women than the two existing models (the Log Quad Model and the SVD-Comp Model). However, when comparing the Extended Log Quad Models 1 and 2, the following points need to be borne in mind.

Table 3 Coefficients of the Extended Log Quad Model 2
Table 4 RMSE for difference in \(e_0\) by each model (year 1978–2017)
Table 5 RMSE for difference in \(e_0\) by each model (year 2013–2017)
Table 6 RMSE of log of mortality rate by each model (year 1978–2017)

Table 4 shows that for women, the RMSE of the Extended Log Quad Model 2 is 0.28 years, which is approximately the same level of accuracy as that of the Extended Log Quad Model 1. In contrast, for men, the RMSE of the Extended Log Quad Model 2 is 0.43 years, which is greater than the 0.10 years of the Extended Log Quad Model 1. This is due to the reduction in the number of model parameters and to the fact that, as explained previously, the correlation between RSV1 and \(e_{65}\) is weaker for men than for women. However, the accuracy of the estimates of the Extended Log Quad Model 2 tended to improve over time for both men and women, and a different picture emerges when the analysis is restricted to the most recent period, 2013–2017. Table 5 shows these results. The RMSE of the Extended Log Quad Model 2 is 0.14 years for women, which is less than the 0.27 years of the Extended Log Quad Model 1. For men, it is 0.25 years, which is slightly larger than the 0.14 years of the Extended Log Quad Model 1 but is an improvement over the result for the full period.

The most important advantage of the Extended Log Quad Model 2, however, is that it allows life table estimation with fewer parameters than the Extended Log Quad Model 1. Taking this advantage into account, the Extended Log Quad Model 2 can be regarded as estimating life tables with approximately the same degree of accuracy as the Extended Log Quad Model 1.

Next, we show that the Extended Log Quad Model 2 reduces the estimation error for women aged 20–39 years, which was an issue in the Extended Log Quad Model 1. Table 6 lists the results. For women aged 20–39, the RMSE of the log mortality rate using the Extended Log Quad Model 2 is 0.14, which is smaller than the 0.55 of the Extended Log Quad Model 1. For women aged 40 and over, the RMSE is 0.09 in the Extended Log Quad Model 2, which is almost the same as the 0.07 in the Extended Log Quad Model 1. In contrast, for men, the RMSE of the Extended Log Quad Model 2 is 0.18 for those aged 20–39 years and 0.08 for those aged 40 and over, which is slightly larger than that of the Extended Log Quad Model 1 in both age groups. However, as mentioned, the advantage of the Extended Log Quad Model 2 is that it can estimate the life table with a reduced number of parameters; taking this advantage into consideration, the two models can be regarded as estimating life tables with almost the same level of accuracy, even for men. It can thus be concluded that the Extended Log Quad Model 2 reduces the estimation error for women aged 20–39 years, which was a challenge with the Extended Log Quad Model 1, while maintaining approximately the same level of accuracy as the Extended Log Quad Model 1 for men and for the other female age groups.

Application of the Extended Log Quad Model 2 to Projections of Municipal Life Tables

Next, we discuss the application of the Extended Log Quad Model 2 to the estimation of future municipal life tables. The Extended Log Quad Model 2 obtains the life table from two indicators (\(_{5}q_{0}\) and \(e_{65}\)), while the Extended Log Quad Model 1 obtains the life table from three indicators (\(_{45}q_{15}\) in addition to the other two). These models can be applied to future life table estimation if these indicators can be projected. Here, we assumed that only the actual values up to 2010 are known, projected the life table for 2015 on that basis, and evaluated the accuracy by comparing the projection with the actual values from the “Municipal Life Tables” for 2015.

Specifically, first, each indicator (\(_{5}q_{0}\), \(_{45}q_{15}\), and \(e_{65}\)) was obtained from the JMD prefectural life tables for 2003–2007 and 2008–2012 (by 5-year age groups, 5-year base periods). For \(_{45}q_{15}\) and \(e_{65}\), we calculated the percentage increase in value from the years 2008–2012 compared to the value from 2003 to 2007 for each prefecture. Then, \(_{45}q_{15}\) and \(e_{65}\) for each municipality in 2015 were estimated by multiplying the actual values of \(_{45}q_{15}\) and \(e_{65}\) in 2010 for that municipality by the growth rate for the prefecture containing that municipality.
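The projection step above can be sketched in Python as follows. The sketch assumes that the "growth rate" is applied as the ratio of the prefecture's 2008–2012 value to its 2003–2007 value; the function name `project_2015`, the municipality and prefecture labels, and all numbers are hypothetical and are not taken from the JMD or the “Municipal Life Tables.”

```python
def project_2015(municipal_2010, prefecture_of, pref_2003_2007, pref_2008_2012):
    """Project each municipality's indicator from 2010 to 2015.

    Each 2010 municipal value is multiplied by the ratio of its
    prefecture's 2008-2012 value to the prefecture's 2003-2007 value.
    """
    projected = {}
    for name, value in municipal_2010.items():
        pref = prefecture_of[name]
        growth = pref_2008_2012[pref] / pref_2003_2007[pref]
        projected[name] = value * growth
    return projected

# Hypothetical e65 values in years (illustrative only).
municipal_e65_2010 = {"City A": 23.8, "Town B": 24.1}
prefecture_of = {"City A": "Pref X", "Town B": "Pref X"}
pref_e65_0307 = {"Pref X": 23.5}
pref_e65_0812 = {"Pref X": 24.0}

e65_2015 = project_2015(municipal_e65_2010, prefecture_of,
                        pref_e65_0307, pref_e65_0812)
```

The same function applies unchanged to \(_{45}q_{15}\), for which the ratio would typically be below one when adult mortality is declining.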

In contrast, for \(_{5}q_{0}\), considering the possibility that the data may be unstable in municipalities with relatively small populations, the value obtained from the national life table in the 2010 “Municipal Life Tables” was used uniformly as the estimate for every municipality in 2015 (Method 1).

The average life expectancy \(e_{0}^{\textrm{Ext2}}\) is calculated from the mortality \(\textbf{m}_{x}^{\textrm{Ext2}}\) of the Extended Log Quad Model 2 based on Method 1, and the corresponding values are calculated in the same way for the Extended Log Quad Model 1 and the Log Quad Model. The accuracy of the life table projection is subsequently evaluated by comparing these values with the actual life expectancy from the 2015 “Municipal Life Tables.” The evaluation was conducted by population size.

Table 7 RMSE for difference in \(e_0\) by municipal life tables and estimated \(e_0\)

Table 7 presents the estimation results obtained using Method 1. For women, the RMSE of \(e_0\) for all municipalities is 0.63 years in the Extended Log Quad Model 2, 0.69 years in the Extended Log Quad Model 1, and 3.44 years in the Log Quad Model. For males, the corresponding values are 0.72 years, 0.64 years, and 2.26 years. For males, the RMSE of \(e_0\) using the Extended Log Quad Model 2 is thus approximately 0.1 year larger than that using the Extended Log Quad Model 1, but the estimation accuracy of the two models is similar. The results confirm that both models can estimate future life tables with higher accuracy than the Log Quad Model. However, under Method 1, which fixes \(_{5}q_{0}\) at the national value, estimation with the Extended Log Quad Model 1 requires projecting two indices (\(_{45}q_{15}\) and \(e_{65}\)), whereas the Extended Log Quad Model 2 requires projecting only \(e_{65}\). The Extended Log Quad Model 2 is therefore simpler than the Extended Log Quad Model 1, while its accuracy is almost the same.

Discussion and Conclusion

The Log Quad Model, the direct predecessor of the model of this study and of Horiguchi (2022), was designed to enable indirect estimation from limited information in developing regions; future estimation in developed regions such as Japan is not necessarily a central goal of the Log Quad Model. Its fit to recent life tables in Japan has not necessarily been good for the old-age population. In contrast, the Extended Log Quad Model 2 proposed in this study can represent the recent mortality situation of old-age groups in Japan more accurately. At the same time, it succeeds in reducing the estimation error in the 20–39 female age group, which was considered a problem with the Extended Log Quad Model 1. Furthermore, the Extended Log Quad Model 2 enables life table estimation from a smaller number of indicators than the Extended Log Quad Model 1, which is an advantage. These results indicate that the Extended Log Quad Model 2 can be used as a model life table for Japan’s mortality rates in recent years.

In the long-term care insurance program, plans are formulated based on future benefit projections; therefore, for estimating the number of people requiring long-term care, estimating the future population of the older age groups as accurately as possible is essential.

In situations where such future population estimations are crucial, the Extended Log Quad Model 2 can be used to easily obtain a future life table by projecting only the life expectancy at age 65 (\(e_{65}\)), provided that \(_{5}q_{0}\) is fixed at some level. In such a case, the Extended Log Quad Model 2 has the advantage over the Extended Log Quad Model 1 that only one index, \(e_{65}\), needs to be projected, whereas the Extended Log Quad Model 1 requires projecting two indices (\(_{45}q_{15}\), \(e_{65}\)) to project life tables.

Fig. 2 Estimated Log of Mortality Using the Log Quad Model and the Extended Log Quad Model 2 (Female, Year 2020). Note: \(\textbf{m}_{x}\) is the actual value of the mortality rate, \(\textbf{m}_{x}^{\textrm{Quad}}\) is the estimated value by the Log Quad Model, and \(\textbf{m}_{x}^{\textrm{Ext2}}\) is the estimated value by the Extended Log Quad Model 2. The abbreviations of country names are as follows. AUS: Australia, CAN: Canada, FRA: France (Total Population), NZL: New Zealand (Total Population). Data: Human Mortality Database (2023)

Fig. 3 Estimation Error of Log of Mortality Using the Log Quad Model and the Extended Log Quad Model 2 (Female, Year 2020). Note: \(\textrm{d} \ln \textbf{m}_{x}^{\textrm{Quad}}\) is the estimation error of the log of mortality by the Log Quad Model, and \(\textrm{d} \ln \textbf{m}_{x}^{\textrm{Ext2}}\) is the estimation error of the log of mortality by the Extended Log Quad Model 2. The abbreviations of country names are as follows. AUS: Australia, CAN: Canada, FRA: France (Total Population), NZL: New Zealand (Total Population). Data: Human Mortality Database (2023)

Fig. 4 Mean Squared Error (MSE) of the Log Quad Model and \(e_{65}\) (France, Female, Ages 65+, Years 1975–2020). Data: Human Mortality Database (2023)

To verify the applicability of the Extended Log Quad Model 2 to countries other than Japan, we applied it to the HMD life tables for 2020. For most countries included in the HMD, the Extended Log Quad Model 2 did not necessarily achieve higher estimation accuracy than the Log Quad Model. This is thought to be because the Log Quad Model is estimated from the HMD life tables, whereas the Extended Log Quad Model 2 is estimated from the JMD prefectural life tables, and because the improvement in Japan’s old-age mortality rate is large compared with that in many other countries. However, the Extended Log Quad Model 2 may be applicable to some countries other than Japan. Figures 2 and 3 show the estimation results for four countries (Australia, Canada, France, and New Zealand) for which the Extended Log Quad Model 2 estimates female life tables with high accuracy. The figures show that the Extended Log Quad Model 2 fits the mortality rates of females in these countries in 2020 well, especially for those aged 70 and above, even though the model is based on the Japanese Mortality Database.

These countries have \(e_{65}\) values for females that are relatively close to those of Japan (Australia: 23.2, Canada: 22.2, France: 22.9, New Zealand: 22.4, Japan: 24.9), and they are considered to be countries that are making progress in improving the mortality rates of older adults. Japan is one of the countries where longevity is most advanced, and it is expected that other countries will experience similar improvements in old-age mortality rates in the future.

In fact, the fit of the Log Quad Model is already worsening in some countries, especially for females. Figure 4 plots the relationship between the Log Quad Model MSE and \(e_{65}\) for French females from 1975 to 2020. The figure shows that in France, the fit of the Log Quad Model worsened as the old-age mortality rate improved. In this study, the coefficients (\(\textbf{a}_{x}\), \(\textbf{b}_{x}\), \(\textbf{c}_{x}\)) were taken directly from Wilmoth et al. (2012), while the coefficient \(\textbf{w}_{x}\) was estimated from the JMD prefectural life tables. However, further improvements may be required to develop a model suitable not only for Japan but also for other countries. This could be achieved by directly estimating these coefficients from the HMD life tables of countries and regions where the mortality levels are close to Japan’s.

In addition, life table projection by municipality using the Extended Log Quad Model 2 in this study was slightly less accurate than the Extended Log Quad Model 1 for males. This may be because of the weaker correlation between the right singular vector (RSV1) and life expectancy at age 65 (\(e_{65}\)) for males than for females. Future work will involve modeling to enable a more appropriate estimation, including re-estimation of the aforementioned coefficients.

In this study, we re-examined the issues raised regarding the Extended Log Quad Model 1 by Horiguchi (2022), such as a large estimation error of mortality at ages 20–39 for females. We propose the Extended Log Quad Model 2 by modeling the estimation error for ages 20 and above in the Log Quad Model (\(k=0\)). The Extended Log Quad Model 2 can obtain life tables based on a few indices (\(_{5}q_{0}\), \(e_{65}\)), and the Extended Log Quad Model 2’s performance was confirmed as accurate when it was compared with the 2015 municipal life tables. Certain issues remain to be examined regarding the Extended Log Quad Model 2; nonetheless, the simplicity of the model, which can obtain a life table with only \(e_{65}\), if \(_{5}q_{0}\) is fixed to national values, is a notable advantage over the Extended Log Quad Model 1. In addition, the accuracy of future estimations using the Extended Log Quad Model 2 is confirmed to be superior to that of the Log Quad Model. These results indicate that the Extended Log Quad Model 2 can be applied to future population estimation at the municipal level. We believe that the Extended Log Quad Model 2 can be proposed as a new practical model life table.