Introduction

The negative effects of physical inactivity on health are well known. Unhealthy diets and physical inactivity are the main contributors to overweight and obesity, which are among the leading risk factors for the major non-communicable diseases [1]. Heart disease is a costly outcome of physical inactivity [2].

However, half of Japanese workers are physically inactive because they do not engage in enough physical activity during their leisure time and because jobs are increasingly sedentary in nature. The 2007 National Health and Nutrition Survey in Japan indicated that one out of two males 40–74 years old was likely to develop metabolic syndrome.Footnote 1 In addition, the prevalence of overweight or obesity in Japan (a body mass index of 25 or higher) showed a tendency to increase in males regardless of age group compared with 1997 statistics. Over that decade, the prevalence of overweight or obesity in males aged 50–59 rose by 10.2 percentage points (from 24.1 to 34.3 %), the largest change among the working generations. In contrast, the proportion of regular exercisers among males aged 50–59 was 21.0 %, which was smaller than the value of 24.7 % for females.Footnote 2

The effect in which a past value by itself influences future values of the same process is known as genuine state dependence. To the best of our knowledge, only a few studies have taken into account the importance of both state dependence and unobserved heterogeneity in explaining health outcomes [3–5]. The studies by Contoyannis et al. [3, 4] supported the existence of a certain degree of state dependence in self-assessed health in the UK. Contoyannis et al. [3] presented evidence of persistence in self-assessed health, attributed in part to state dependence, and found that such persistence was stronger among men than among women. They showed that the impact of individual heterogeneity was reduced when state dependence was controlled for and that unobserved heterogeneity accounted for 30 % of the unexplained variation in health. Hernández-Quevedo et al. [5] reported that health limitations had a high state dependence even after controlling for measures of socioeconomic status. Their model conditioned on previous health status and parameterized the unobserved individual effect as a function of the initial period observations on time-varying regressors and health.Footnote 3

Taking into account the unobserved heterogeneity of employed persons, Brown and Roberts [6] examined frequency of participation in physical activity using a generalized random-effects ordered probit model and revealed that there was a time trade-off among non-market work, market work, and the frequency of participation. They used the method of Mundlak [7], which takes the group means of the time-varying explanatory variables into account in order to remove the time-invariant individual effects from the model, thereby allowing for unbiased estimation. However, they did not take into account the contribution of state dependence to participation in physical activity.

Health depreciation may not be solely a consequence of aging but may also be related to adverse health behavior. In line with Grossman [8], health behavior can be treated as an investment in health. The starting point for an economic analysis of health dynamics is Grossman’s household production model. Grossman’s investment model determines the optimal stock of health in any period.Footnote 4 Under a partial adjustment mechanism, because of adjustment costs to the desired health stock, current health will depend on previous health, and this model can be estimated using longitudinal data [4]. We include the lagged dependent variables in our dynamic empirical models and suggest that these may be viewed as approximating partial adjustment mechanisms. Our models include regular physical activity (RPA) as a representative lifestyle choice. We consider that there may be a direct causal link between lifestyle choices such as RPA and health status.

In this article, we examine the association between participation in RPA and latent health stock (LHS) in a middle-aged population, drawing on the health investment framework of Grossman’s model. We focus on measuring the degree of genuine state dependence in both RPA and LHS. Accounting for state dependence will correct the possible overestimation of the impacts of socioeconomic factors. Estimation results showing that the degree of state dependence of LHS is positive and significant would imply that policy interventions that improve LHS will have lasting consequences over time.

The remainder of this article is structured as follows. In section “Participation in physical activity and latent health stock”, we summarize the studies on participation in physical activity. Following a review of the literature, we describe the characteristics of the longitudinal data used in this study. Data from a large nationwide survey by the Japanese Ministry of Health, Labor and Welfare were used. In longitudinal data (panel data) analysis, it is possible to focus on changes in health behavior occurring in subjects and to make population inferences that are not as sensitive to variations between subjects. In section “Empirical strategy and results”, we present the estimation results of three probit models of middle-aged populations, taking physical health variables into account. Comparing the estimation results of the pooled probit model, random-effects probit model, and dynamic random-effects probit model, we show that the dynamic random-effects probit model provides the best specification. Our main results indicate that state dependence and unobserved heterogeneity make important contributions to a given health status. We also examine whether the participation in RPA is associated indirectly with a decreased risk for chronic diseases. In section “Conclusions”, we offer conclusions and argue that both participation in RPA and improved health policy are factors that could reduce the costs incurred by Japanese society in the treatment of chronic diseases.

Participation in physical activity and latent health stock

Literature review

Several previous studies have used health production models that include participation in physical activity in order to examine the effects of lifestyle choices on health. The results of those studies suggest that individuals with healthier lifestyles tend to have better self-assessed health [9, 10].

Labor force participation should be considered an important factor in health-relevant behavior when we analyze the effects of lifestyle choices on health. First, the kind of work performed has a decisive impact on the health depreciation rate [11, 12]. Blue-collar workers with physically exhausting jobs tend not to exercise after work. Individuals with lower socioeconomic status (SES) are more likely to report engaging in job-related physical activity compared to higher SES individuals, who are more likely to report engaging in leisure-time physical activity [13]. Monthly leisure-time physical activity for males differs significantly among occupations, with clerks having greater physical activity than managers and blue-collar workers [14]. Second, working hours are used to explain the trade-off among work, health investment, and leisure. Long working hours reduce leisure-time and health investment activities. A study of Canadian time-use data collected in 2005 indicated that time poverty may be more important than income poverty as a barrier to RPA. Both income and time deprivation can contribute to low levels of physical activity [15]. Individuals make choices about how to allocate their time and resources to health investments and other activities. On the one hand, time spent on RPA reduces time available for other activities. On the other hand, time spent on RPA increases health stock and in turn reduces time lost to illness. The expenditure of time for health-producing activities such as RPA may improve one’s available hours of productive activity.

Education, employment, and income are among the most powerful components of SES, and lower levels of education can lead to insecure income, hazardous work conditions, and poor housing, which can in turn increase the risk of death due to external causes [16]. Educational attainment has been shown to have a positive association with habitual exercise regardless of age [17].Footnote 5 Several important behavioral risk factors for poor health are more common among people in lower SES groups. Nishi et al. [18] reported that females with lower educational levels were more likely to have a smoking habit.

Adults with lower incomes or less education are more likely to smoke and more likely to be obese than adults with higher incomes and more education. Low-income individuals tend to consume cheaper meals with lower nutritional value. As a consequence, the risk of overweight or even obesity is much higher for those with low incomes.

In contrast, individuals at the highest levels of income, education, and job classification were more likely to engage in RPA during their leisure time than those with lower job status and incomes. Nonsmoking, moderation in drinking, and normal-range body weight may be seen as the consequences of health investments [19].

The Japanese workplace is characterized by several unique features, such as an intense work environment [16]. On the one hand, hours of market work are likely to affect both income and health. For regular workers, working more hours than the prescribed 40 h per week is a major constraint on leisure-time physical activity. Both higher occupational status and longer working hours may reduce the leisure-time physical activity of middle-aged persons. Managers, for example, tend to have stressful jobs with long working hours that allow less leisure time for disease-preventive physical activity. On the other hand, increases in non-market working time make it less costly to undertake health-conducive activities such as exercise or the consumption of a healthy diet [20].

Data

Table 1 shows a summary of statistics for all health-related variables at an individual level. The data were obtained from nationwide surveys in Japan. The 5 years’ longitudinal data (2005–2009) used in this study were taken from the Longitudinal Survey of Middle and Elderly Persons (LSMEP) by the Japanese Ministry of Health, Labor and Welfare. The respondents of this survey were 50–59 years old in 2005. Data were collected through a combination of interviews and self-administered questionnaires. The LSMEP asked each respondent about his or her illnesses and lifestyle variables. From these surveys, we obtained information about demographic variables, educational background, and occupational status. However, the LSMEP did not ask about the number of family members or the number of housemates, except for spouses.

Table 1 Descriptive statistics

As lifestyle variables, we converted the survey responses into dichotomous variables (yes = 1) for each of the following: regular physical activity (RPA), consumption of alcoholic beverages almost every day or every day, and current smoking. RPA was coded as a dummy variable that took the value of one if the individual took part in sports more than twice a week during leisure time. The intensity of exercise was classified as follows: light (stretch, light gymnastics), moderate (walking, jogging), or vigorous (aerobics, swimming). The proportion of respondents who were considered to have participated in RPA was 0.333 (light: 0.155; moderate or vigorous: 0.178). The LSMEP did not ask about the amount of time devoted to physical activity.

Each health outcome was measured as a binary variable that took a value of one if the individual reported having any of the following conditions (the proportion of individuals with each condition is given in parentheses): diabetes (0.090), heart disease (0.036), cerebral stroke (0.012), hypertension (0.241), hyperlipemia (0.136), and cancer (0.016). The proportions of individuals who reported feeling the following self-reported mental health statuses all of the time during the past 30 days were: nervous (0.021), hopeless (0.007), restless or fidgety (0.006), so depressed that nothing could cheer you up (0.011), everything was an effort (0.010), and worthless (0.008). The proportion of individuals who were taking medication or had consulted a doctor was 0.279. The proportion experiencing difficulty in activities of daily life because of a physical health problem was 0.068.

Non-market working time includes the time spent on housework and child care, activities that do not generate income but nonetheless affect lifestyle. In Japan, women often specialize in non-market domestic work such as child care, food preparation, and nursing care. Homemakers, unemployed persons, and retired persons are not a part of the labor force and are not included in the following analysis. It has been reported that males spend more time doing some type of exercise than females in Japan.Footnote 6 The trend was stable from 1991 to 2006 according to the Survey on Time Use and Leisure Activities published by the Statistics Bureau of the Ministry of Internal Affairs and Communications in Japan [9].Footnote 7 For occupational status, the following proportions were reported: regular employees (0.419), management executives (0.064), part-time and casual workers (0.220), self-employed (0.144), contracted employees (0.067), family workers (0.049), and dispatched workers (0.007). We also employed the data on income per month at the time of the survey. We used income per month of the individuals, because the LSMEP had not asked for household income since 2006.Footnote 8 With respect to demographic variables, we considered age, sex, and marital status. The proportion of married workers was 0.814. Almost half the workers self-reported completing high school. The proportion of workers who self-reported completing a university degree was 0.177.

Real income, which did not include income from public pensions, was deflated by the consumer price index (CPI). The CPI in 2005 was 100. We transformed real income into a natural log, considering the nonlinear association between income and health. The extent of missing income data was relatively large. Missing data are a major concern for surveys: unless the absence of the variables in question is completely random, the analysis is likely to be biased [21]. We therefore constructed a dummy variable that took on the value of one if the observations had missing values for income. The correlation coefficients between this dummy variable and the relevant health variables are reported in Table 2. All the correlation coefficients shown in Table 2 are very small, and no systematic pattern can be found. We therefore concluded that the missing income data were not systematically related to health variables.Footnote 9

Table 2 Missing observations and the correlation between missing and health variables

The original self-assessed health (SAH) variable is a six-point scale variable ranging from very good to bad. The SAH variable may be vulnerable to reporting bias because of anticipation and measurement heterogeneity [22]. There may be simultaneity between physical activity and health status, since health affects the participation in leisure-time physical activity directly. In order to overcome the problems associated with the measurement error of SAH, we created a latent health stock variable. To correct for possible reporting heterogeneity, we applied a technique previously proposed by Disney et al. [23].

Methods

People with worse objective health status may tend to overstate their subjective health. In addition, the self-assessed health status may be affected by personal characteristics such as age, education, or the utilization of medical resources. Following the procedure of Disney et al. [23], we estimated a model of SAH as a function of physical and mental health status (\( d_{it} \)) as well as personal characteristics such as age and education (\( w_{it} \)). First, we wrote the unobservable health status (\( Z_{it} \)) as a function of \( w_{it} \), \( d_{it} \), and unobserved variables (\( \mu_{it} \)):

$$ Z_{it} = \delta^{\prime } w_{it} + \gamma^{\prime } d_{it} + \mu_{it} . $$
(1)

Instead of \( Z_{it} \), the categorical variable SAH (\( h_{it} \)) was observed in our data set. This variable may be measured with a reporting error since the assessment of health may depend on age, education, and health problems [19]. The latent health stock (\( h_{it}^{*} \)) as the counterpart of the observed \( h_{it} \) is a function of \( Z_{it} \) and the reporting error (\( e_{it} \)) as follows:

$$ h_{it}^{*} = Z_{it} + e_{it} . $$
(2)

The latent health variable can be linked to the categorical variable \( h_{it} \) using the mechanism below:

$$ h_{it} = j,\quad \text{if}\ \varphi_{j - 1} < h_{it}^{*} < \varphi_{j} ,\quad j = 1,2, \ldots ,\,6. $$
(3)

Equation (3) shows that our observable health variable takes the value j if the latent health stock lies between the two thresholds \( \varphi_{j - 1} \) and \( \varphi_{j} \). Combining this observation mechanism with (1), the model can be estimated using an ordered probit model. Using the predicted values, we can normalize the health stock via a z-transformation. We used health stock as a dummy variable, which took on the value of one if the latent health stock was good. It was classified according to the median of the standardized variable (median = good).
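The observation mechanism and the dichotomization step can be sketched in a few lines. The sketch below is illustrative only: the cut-points and predicted index values are hypothetical, and it assumes that a higher index means better health and that a value at or above the median counts as good, which the text does not state explicitly.

```python
from bisect import bisect_left
from statistics import mean, median, pstdev

def observed_category(h_star, cutpoints):
    """Map a latent value to category j = 1..len(cutpoints)+1.

    cutpoints holds the interior thresholds phi_1 < ... < phi_5; phi_0 and phi_6
    are taken to be minus and plus infinity.
    """
    return bisect_left(cutpoints, h_star) + 1

def standardize(values):
    """z-transformation: subtract the mean and divide by the standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def good_health_dummy(z_scores):
    """Return 1 when the standardized health stock is at or above its median."""
    cut = median(z_scores)
    return [1 if z >= cut else 0 for z in z_scores]

cuts = [-2.0, -1.0, 0.0, 1.0, 2.0]          # hypothetical thresholds
predicted = [-1.2, 0.3, 0.8, -0.4, 1.5]     # hypothetical predicted index values
z = standardize(predicted)
good = good_health_dummy(z)
```

In practice the thresholds and the predicted index would come from the fitted ordered probit model rather than being set by hand.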

Table 3 refers to the estimation results. Eleven health status effects on SAH were significantly negative at the 1 % level. Six illnesses (diabetes, heart disease, cerebral stroke, hypertension, hyperlipemia, and cancer), five mental health variables (nervous, hopeless, worthless, depressed, and everything was an effort), and two health care variables (medication or doctor’s consultation, hospitalization) had negative effects on SAH. Low educational attainment also had negative effects on SAH. In contrast, greater education had positive impacts on SAH at the 1 % significance level.

Table 3 Estimation results: self-assessed health (N = 79,167)

Empirical strategy and results

The theoretical notion is that health deteriorates over time but is capable of enhancement as a result of household production. Models of rational decision-making have been developed for a variety of health behaviors. Nevertheless, little is known about the relationship between exercise and health within a framework of rational health-capital formation. Becker [24] revealed that there is greater benefit from becoming addicted to activities such as regular exercise if the probability of surviving to older ages is high.Footnote 10 Caputo and Levy [25] showed the effects of an agent’s marginal value of health. If the marginal value of health is negative and exercise is a substitute for consumption and mood, then consumption and work increase while exercise decreases with an agent’s mood state.Footnote 11

Differences in latent mental health may affect participation in leisure-time physical activity and, in turn, affect health stock of the individuals. Healthy workers are more likely to invest in health. In our empirical framework, we assumed that the instantaneous utility function of the individual depended on a lifestyle vector and latent health stock, which was conditional on exogenous variables, and a vector of unobservable factors that influence personal preferences.Footnote 12 When utility was updated with the optimal levels of lifestyle at each period from the utility maximization problem in the previous period, future utility clearly depended on past consumption decisions. Thus, examining the degree of the state dependence of RPA or LHS is important.Footnote 13

Dynamic random-effects probit model and state dependence

In modeling the state dependence of RPA or LHS among Japanese workers, we consider three specifications. First, the analysis begins with a dynamic specification estimated as a simple pooled probit. Under this formulation, the response probability of a positive outcome depends on the unobserved effect and past experience. It is important to take unobserved heterogeneity into account because ignoring it overestimates the degree of state dependence. Second, the random-effects probit specification allows for unobserved heterogeneity but treats the initial conditions as exogenous. Estimating a standard uncorrelated random-effects probit model implicitly assumes zero correlation between the unobserved effect and the set of explanatory variables.Footnote 14

However, it is reasonable to expect the unobserved effect to be correlated with at least some of the elements of the set of explanatory variables if the unobserved effect captures an individual’s behavior. Therefore, the unobserved effect must be integrated out before estimation can progress [32]. The need to integrate out the unobserved effect evokes the question how the initial observation is to be treated. The treatment of initial conditions of the dynamic random-effects probit model is crucial, since misspecification will result in an inflated parameter of the lagged dependent variable term. Ignoring the initial conditions problem yields inconsistent estimates [32, 33].

Wooldridge proposed a conditional maximum likelihood estimator that considers the distribution conditional on the initial period observations and exogenous covariates. Parameterizing the distribution of the unobserved effects leads to a likelihood function that is easily maximized using preprogrammed commands with standard software [32].Footnote 15 The latent equation for the dynamic random-effects probit model of RPA participation is specified as:

$$ y_{it}^{*} = \rho^{\prime } y_{it - 1} + \beta^{\prime } x_{it} + \alpha_{i} + u_{it} , $$
(4)

where \( y_{it}^{*} \) is the latent dependent variable, \( x_{it} \) is a vector of exogenous explanatory variables, \( \alpha_{i} \) are individual-specific random effects, and \( u_{it} \) are assumed to be normally distributed. The coefficient \( \rho \) is the state dependence parameter. The observed binary outcome variable is defined as \( y_{it} \) = 1 if \( y_{it}^{*} \) ≥ 0 and \( y_{it} \) = 0 otherwise. The subscript i indexes individuals and t time periods.
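The difference between genuine state dependence and unobserved heterogeneity in Eq. (4) can be illustrated by simulation. The following is a minimal sketch and not the estimation code used in this study: it sets \( \beta = 0 \), draws \( \alpha_{i} \) and \( u_{it} \) from normal distributions, and generates the initial period from the static version of the model. The function names and parameter values are hypothetical.

```python
import random

def simulate_panel(n, periods, rho, sigma_alpha, seed=0):
    """Simulate binary outcomes from y*_it = rho * y_{i,t-1} + alpha_i + u_it."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n):
        alpha = rng.gauss(0.0, sigma_alpha)            # individual random effect
        # Initial period: assumed here to follow the static model without a lag.
        y = [1 if alpha + rng.gauss(0.0, 1.0) >= 0 else 0]
        for _ in range(periods):
            y_star = rho * y[-1] + alpha + rng.gauss(0.0, 1.0)
            y.append(1 if y_star >= 0 else 0)
        panel.append(y)
    return panel

def raw_persistence(panel):
    """P(y_t = 1 | y_{t-1} = 1) minus P(y_t = 1 | y_{t-1} = 0), pooled over all i and t."""
    counts = {0: [0, 0], 1: [0, 0]}                     # previous state -> [transitions, ones]
    for y in panel:
        for prev, cur in zip(y, y[1:]):
            counts[prev][0] += 1
            counts[prev][1] += cur
    return counts[1][1] / counts[1][0] - counts[0][1] / counts[0][0]

genuine = raw_persistence(simulate_panel(4000, 4, rho=1.0, sigma_alpha=0.0))
spurious = raw_persistence(simulate_panel(4000, 4, rho=0.0, sigma_alpha=1.0))
neither = raw_persistence(simulate_panel(4000, 4, rho=0.0, sigma_alpha=0.0))
```

In this sketch the raw transition gap is clearly positive both when only \( \rho \) is non-zero and when only the individual effect is present, so raw persistence alone cannot separate the two sources.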

Following Wooldridge [32], we assume a certain correlation between \( x_{it} \) and \( \alpha_{i} \) and therefore the time-averages of all time-varying explanatory variables (\( \bar{x}_{i} \)) are included in the specification. We implement the conditional maximum likelihood approach by parameterizing the distribution of the individual effects as:

$$ \alpha_{i} = \alpha_{0} + \alpha_{1}^{\prime } y_{i0} + \alpha_{2}^{\prime } \bar{x}_{i} + \varepsilon_{i} , $$
(5)

where \( \varepsilon_{i} \) is assumed to be distributed N(0, \( \sigma_{\varepsilon }^{2} \)) and independent of (\( y_{i0} \), \( \bar{x}_{i} \)). \( \bar{x}_{i} \) is the average over the sample period of the observations on the exogenous variables. Substituting Eq. (5) into Eq. (4) gives Eq. (6). The estimates of \( \alpha_{1} \) are of interest as they are informative about the relationship between the individual effect and initial health. We would expect there to be a positive gradient in the coefficient estimates.Footnote 16

$$ y_{it}^{*} = \rho^{\prime } y_{i,\,t - 1} + \beta^{\prime } x_{it} + \alpha_{0} + \alpha_{1}^{\prime } y_{i0} + \alpha_{2}^{\prime } \bar{x}_{i} + \varepsilon_{i} + u_{it} $$
(6)
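The regressors on the right-hand side of Eq. (6) can be assembled from a panel as sketched below. This is an illustration under stated assumptions, not the authors' code: the data layout, the function name, and the single covariate are hypothetical, and the time average is taken over the estimation periods t = 1, ..., T, which is one common choice.

```python
def wooldridge_rows(panel):
    """Build one row per individual and period t >= 1 with the terms of Eq. (6).

    panel maps an individual id to a list of (y, x) pairs ordered by period,
    where the first pair is the initial period t = 0.
    """
    rows = []
    for pid, obs in panel.items():
        if len(obs) < 2:
            continue                                    # no lag available
        ys = [y for y, _ in obs]
        xs = [x for _, x in obs]
        x_bar = sum(xs[1:]) / len(xs[1:])               # time average over t = 1..T (assumed)
        for t in range(1, len(obs)):
            rows.append({
                "id": pid,
                "t": t,
                "y": ys[t],          # current outcome
                "y_lag": ys[t - 1],  # lagged dependent variable (coefficient rho)
                "x": xs[t],          # time-varying covariate (coefficient beta)
                "y0": ys[0],         # initial condition (coefficient alpha_1)
                "x_bar": x_bar,      # Mundlak-type time average (coefficient alpha_2)
            })
    return rows

# Hypothetical individual observed in periods 0, 1, 2.
example = {"a": [(1, 2.0), (0, 4.0), (1, 6.0)]}
rows = wooldridge_rows(example)
```

Once the rows are built, the model reduces to a standard random-effects probit on the augmented regressor set, which is what makes the approach convenient in standard software.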

Empirical results

Following [5], we assess the statistical fit of the different models using the Akaike information criterion and the Schwarz Bayesian information criterion (AIC and SBIC, respectively) for model selection:

$$ {\text{AIC}} = - 2\ln L + 2q $$
(7)
$$ {\text{SBIC}} = - 2\ln L + (\ln M)q, $$
(8)

where q represents the number of parameters in each specification and M the number of observations. When the estimation results of the three models (pooled probit, random-effects probit, and dynamic random-effects probit) were compared (see Tables 4, 5), both the AIC and SBIC of the dynamic random-effects probit model were the smallest values among the three. The variance of the unobserved individual effect (\( \sigma_{u}^{2} \)) of the dynamic random-effects probit model of RPA or LHS was significant at 1 %. Therefore, the dynamic random-effects probit model was the best specification. The corresponding pooled probit model is given in the first column of Tables 4 and 5.
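Equations (7) and (8) translate directly into code. The sketch below uses hypothetical log-likelihoods, parameter counts, and sample size chosen only to show the comparison; they are not the values reported in Tables 4 and 5.

```python
import math

def aic(log_lik, q):
    """Akaike information criterion, Eq. (7): -2 ln L + 2q."""
    return -2.0 * log_lik + 2.0 * q

def sbic(log_lik, q, m):
    """Schwarz Bayesian information criterion, Eq. (8): -2 ln L + (ln M) q."""
    return -2.0 * log_lik + math.log(m) * q

# Hypothetical (log-likelihood, number of parameters) for each specification.
candidates = {
    "pooled probit": (-5200.0, 20),
    "random-effects probit": (-5000.0, 21),
    "dynamic random-effects probit": (-4800.0, 25),
}
m = 10000  # hypothetical number of observations

scores = {name: (aic(ll, q), sbic(ll, q, m)) for name, (ll, q) in candidates.items()}
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_sbic = min(scores, key=lambda name: scores[name][1])
```

The SBIC penalizes additional parameters more heavily than the AIC whenever ln M exceeds 2, so agreement between the two criteria is a somewhat stronger signal than either alone.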

Table 4 Determinants of regular physical activity
Table 5 Determinants of latent health stock

Table 4 indicates that the major determinants of participation in RPA were as follows: income, age, educational attainment, work, and lifestyle. Age, high educational attainment (university), special work such as professionals, managerial work, and security had positive effects on RPA participation. In contrast, a smoking habit, low educational attainment, longer work hours, and longer commuting time had negative effects on RPA participation.Footnote 17 We used six Lehman shock dummy variables to capture the effects of a sudden income decrease. The sum total of Lehman shock dummy variables was 0.16 in 2008. One Lehman shock dummy (from very high income to middle income) was statistically significant at the 5 % level, which suggested that some persons with a sudden income decrease changed their lifestyle.

RPA also had positive effects on LHS at the 1 % significance level (see Table 5). We consider that there was a direct causal link between lifestyle choices such as RPA and health stock. Since the estimation results showed that the degree of state dependence of LHS was positive and significant, it would appear that policy interventions that promote RPA have lasting consequences across time.

It is noteworthy that the major determinants of LHS, except RPA, were slightly different from those of RPA (see the third column of Tables 5 and 11). Individuals with higher income had greater LHS. Very high income had positive effects on LHS. Low educational attainment, difficulty in daily life activities, and care for family members had negative effects. The estimation results of the dynamic random-effects probit model, on the contrary, showed that a smoking habit did not have significant effect on LHS. We therefore consider that there were two causal relationships between smoking habit, RPA, and LHS—a flow to RPA from smoking habit and a flow to LHS from RPA.Footnote 18

The intraclass correlation coefficient (ICC) from an error components panel data model is determined as (ICC = \( \sigma_{u}^{2} /(1 + \sigma_{u}^{2} ) \)), where \( \sigma_{u}^{2} \) represents the variance of the unobserved individual effect. The ICC measures the proportion of the total unexplained variation that is attributed to the individual effects.

The ICC represents the correlation of health scores across periods of observation. Values of the ICC close to unity indicate high persistence in health outcomes (Jones et al. 2005). We can consider that there existed moderate persistence in RPA because the value of the ICC of RPA was 0.393, which is almost the same as the value of LHS (see Tables 4, 5).
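As a worked illustration of the ICC formula, the short sketch below converts between the variance of the individual effect and the ICC, and backs out the variance implied by the ICC of 0.393 reported for RPA. The function names are hypothetical, and the idiosyncratic error variance is normalized to one, as the formula implies.

```python
def icc_from_variance(sigma_u2):
    """ICC = sigma_u^2 / (1 + sigma_u^2), with the idiosyncratic variance fixed at 1."""
    return sigma_u2 / (1.0 + sigma_u2)

def variance_from_icc(icc):
    """Invert the ICC formula: sigma_u^2 = ICC / (1 - ICC)."""
    return icc / (1.0 - icc)

# Variance of the individual effect implied by the reported ICC of RPA.
sigma_u2_rpa = variance_from_icc(0.393)   # approximately 0.647
```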

Testing the hypothesis of a non-zero ρ is equivalent to testing the presence of true state dependence, having controlled for the unobserved heterogeneity. When the dynamic random-effects probit model was estimated, the results changed substantially and the state dependence estimate was reduced to less than half. As Table 4 shows, the state dependence parameter of RPA, 0.608, was statistically significant at the 1 % level. Past lifestyle is itself a determinant of future lifestyle. The state dependence parameter of LHS, 0.363, was statistically significant at the 1 % level. The size of the estimated coefficient was smaller than that of RPA. The degree of dependence between previous health stock and current health stock exhibited moderate persistence.

The exogeneity of the initial conditions in the dynamic random-effects probit model can be tested by a simple significance test under the null of \( \alpha_{1} = 0 \) for Eq. (6). As Tables 4 and 5 show, the exogeneity hypothesis was strongly rejected in these models. We therefore concluded that the estimate of the random-effects probit model overstated the extent of state dependence when the unobserved individual-specific effect influenced the initial conditions.

Our main results indicated that state dependence and unobserved heterogeneity were important explanatory factors of a given health status. In fact, the explanatory power of several observed variables vanished when individual-specific effects and lags of the dependent variable were introduced. The variables so affected were marital status (married), work-related variables such as agriculture, forestry, and fishing, and occupational status (family worker) for the RPA equation, and a smoking habit, low income, high educational attainment (university and graduate school), and occupational status (self-employed, management executive, and domestic side job worker in a home) for the LHS equation. We also found gender differences in the determinants of RPA or LHS (see Table 6).Footnote 19 Very high income had positive effects and longer working hours had negative effects on both RPA participation and LHS in males. For both males and females, a smoking habit had negative effects on RPA at the 1 % significance level. Thus, smoking cessation is an important health policy to increase the participation in RPA.

Table 6 Gender differences in the determinants of RPA and LHS

Using a subsample that excluded individuals without RPA, we investigated the effects of the change in the intensity of RPA on LHS.Footnote 20 As Table 7 shows, increasing the intensity of RPA had positive effects on LHS at the 1 % significance level. However, the value of ICC was 0.338, smaller than the full-sample estimate of 0.389. The coefficient of the previous latent health stock was 0.596, larger than the full-sample value of 0.363 (see Table 5). The results showed that the full-sample estimate of the coefficient of previous latent health stock was smaller than the subsample estimate because the former included individuals without RPA. This implies that the individuals with RPA were associated with greater persistence of LHS compared to the individuals without RPA. The impact of individual heterogeneity was reduced (from 0.389 to 0.338) when we excluded individuals without RPA, and unobserved heterogeneity accounted for 34 % of the unexplained variation in LHS.

Table 7 Effects of the change in the intensity of RPA on LHS

Conclusions

No prior investigation has considered the effects of state dependence and unobserved heterogeneity on the relationship between RPA and LHS. Accounting for state dependence corrects the possible overestimation of the impact of socioeconomic factors. We estimated the degree of the state dependence of RPA and LHS among middle-aged Japanese workers. Our dynamic empirical models included RPA as a representative lifestyle choice, on our hypothesis that there is a direct causal link between lifestyle choice and health status. We also included the lagged dependent variables of these two dependent variables in our models and analyzed partial adjustment mechanisms.

The 5 years’ longitudinal data (2005–2009) used in this study were taken from the Longitudinal Survey of Middle and Elderly Persons by the Japanese Ministry of Health, Labor and Welfare. The respondents were subjects who were 50–59 years old in 2005. Because the original self-assessed health variable might be vulnerable to reporting bias, we used health stock as a dummy variable, which took on the value of one if the LHS was good using the procedure of [23].

The dynamic random-effects probit model provided the best specification. The estimate of the random-effects probit model overstated the extent of state dependence when the unobserved individual-specific effect influenced the initial conditions. From the estimation results, we found that RPA had positive effects on LHS, taking into consideration the possibility of confounding with other lifestyle variables. These results indicated that there was a direct causal link between RPA and health stock.

There was moderate persistence in RPA. The impact of individual heterogeneity was reduced when we used a subsample that excluded individuals without RPA. In fact, when individuals without RPA were excluded, the unobserved heterogeneity was reduced to 34 % of the unexplained variation in LHS, from 39 % of the unexplained variation in the full-sample estimation result. Increasing the intensity of RPA had positive effects on LHS and caused individuals with RPA to exhibit greater persistence of LHS compared to individuals without RPA. A smoking habit, low educational attainment, longer work hours, and longer commuting time had negative effects on RPA participation. For both males and females, a smoking habit had negative effects on RPA participation at the 1 % significance level. The estimation results showed that the degree of state dependence of LHS was positive and significant and support the implication that policy interventions that promote RPA, such as smoking cessation, have lasting consequences. We therefore concluded that smoking cessation is an important health policy to increase both the participation in RPA and LHS.

Finally, we discuss the main limitation of our empirical analysis. The results of this article would not be applicable to other age groups or to the whole population because there are intergenerational differences in smoking rate and hours worked. For both males and females, the smoking rate of those in their 30s was higher than that of those in their 50s. For males, the proportion of workers who worked long hours was higher in their 30s than in their 50s. For females, the labor force participation rate in their 30s was lower than that in their 50s. The time poverty due to the responsibility for domestic work influences both the participation in the labor market and regular physical activity.