1 Introduction

The global population is ageing. The United Nations projects that, by 2050, the population aged 60 years or older will reach 2.1 billion (United Nations, 2017). This shift in global demographics is a success story that comes with a major challenge: to ensure that those added years are lived in good health so that older people are able to do what they value (Beard et al., 2016). This idea is captured and extended by the World Health Organization (WHO) in the World Report on Ageing and Health (World Health Organization, 2015), where “happiness, satisfaction and fulfilment”, also known as subjective wellbeing (SWB) in the literature (Diener, 2006), are identified as the ultimate goal of the healthy ageing process. This is consistent with the advice by entities such as the European Commission (2016) or the Organisation for Economic Co-operation and Development (OECD, 2013) to foster subjective wellbeing at the population level.

Many studies have explored the relationship between health and subjective wellbeing in the general population. A meta-analysis of 29 studies showed an overall statistically significant positive relationship between health and subjective wellbeing (Ngamaba et al., 2017). As in most of the SWB literature, the studies included in the meta-analysis mainly considered evaluative measures of SWB (e.g. global assessments of satisfaction with life as a whole or general statements about affect, including happiness), which are widely present in population surveys (Dolan & Metcalfe, 2012).

This relationship is also found in the older population. Even if the relationship between health and SWB is not among the aims of the study, researchers focusing on the determinants of SWB in older adults usually include some measure of health (typically self-reported health, number of chronic conditions, or functional ability) in order to adjust the results for the potential confounding effect of health on those relationships (some recent examples are Lam & García-Román, 2019; Moreno-Agostino et al., 2019, 2020b). Moreover, a plethora of studies have specifically analysed the relationship between health and SWB in older adults, mostly focusing on SWB as the outcome. In line with the abovementioned evidence, these studies have also shown that health predicts better SWB in older adults from low-, middle-, and high-income countries (Gildner et al., 2019; Lu et al., 2020; Moreno-Agostino et al., 2021a; Puvill et al., 2016, 2019). These findings are in line with a “bottom–up” perspective of SWB, where pleasurable and unpleasurable moments and experiences are what determine the SWB assessments people make (Brief et al., 1993; Diener et al., 2018; Feist et al., 1995); and with what has been called the “disability hypothesis”, which posits that health problems can exert a negative effect on SWB (Watson & Pennebaker, 1989). However, the reverse hypothesis (which could be denoted as the “psychosomatic hypothesis”) has also been tested, showing that SWB predicts better health in older adults (Miret et al., 2017; Steptoe & Fancourt, 2019). Nevertheless, a common limitation of all these studies is that, due to the use of cross-sectional designs, there is no way to ascertain any directionality in the results, leaving unanswered the question of whether health predicts SWB or vice versa in older adults.

Fewer studies have used a longitudinal approach that allows for exploring a specific directionality in the relationship between health and SWB in older adults. For instance, using a linear mixed modelling approach, Calderón-Larrañaga et al. (2019) recently found that SWB predicted a slower accumulation of disability and multimorbidity over time, replicating the same results when taking into account only older adults with no disability or multimorbidity at baseline. On the other hand, studies focusing on the trajectories of SWB in older adults have found that better health is consistently associated with the best trajectories over time (Lim et al., 2017; Moreno-Agostino et al., 2021b). A common limitation of these longitudinal approaches is that they fail to provide evidence that accounts for the potential bidirectionality in the relationship between SWB and health.

A notable exception to this is the study by Gana et al. (2013), where this bidirectionality is explicitly modelled by using a cross-lagged model in a sample of older adults over five time points. In this article, the authors provide evidence on the highly autoregressive nature of health and SWB measures (i.e. the impact of the previous state on the next one), along with the relative impact of each of them on the other, showing that poorer health levels are related to subsequent lower levels of SWB, whereas the reverse relationship was found to be statistically non-significant. Nevertheless, previous research has shown that classic cross-lagged models fail to separate the within-person processes, which they are intended to model, from stable between-person differences (Hamaker et al., 2015). This limitation is especially relevant in this context, since both SWB and health measures have been found to have a relatively high personal stability, including in old age (Daskalopoulou et al., 2019; de la Fuente et al., 2018; Diener et al., 2009; Moreno-Agostino et al., 2020a, 2021b). When this within-person stability is not specifically modelled, it is captured by the autoregressive coefficients of the cross-lagged model, which inflates their magnitude and can affect the cross-lagged estimates (Zyphur et al., 2019). Moreover, failing to separate within- and between-person variability does not rule out the possibility that the relationship between time-specific assessments of health and SWB may be explained by broader “top down” individual predispositions to assess different aspects of one's life more positively or negatively (Brief et al., 1993; Diener et al., 2018; Feist et al., 1995), especially if, as is usually the case, health assessments rely exclusively on self-reported information.

Considering the abovementioned literature, there is a need for gathering new evidence on the potentially bidirectional relationship of health and SWB in older adults that overcomes the limitations of previous studies and provides useful insights that can be used to prioritise interventions aimed at fostering the SWB and healthy ageing of the population (European Commission, 2016; World Health Organization, 2017).

Measures of life satisfaction are widely available in population-based studies (Dolan & Metcalfe, 2012), including ageing cohort studies such as the English Longitudinal Study of Ageing (ELSA) (Steptoe et al., 2013). Although SWB has additional components (e.g., experiential and eudemonic wellbeing), the lack of adequate measures of these components in population-based studies makes evaluative wellbeing measures, such as life satisfaction, the most suitable choice.

Therefore, the main aim of the present study is to explore the relationship between health and life satisfaction in older adults, accounting for the potential bidirectionality in these effects as well as the within-person stability in both constructs. Since most studies show that health is one of the most important predictors of life satisfaction, and the reverse effect is usually smaller, the hypothesis is that the effect of health on SWB will be larger than the effect of SWB on health in older adults.

2 Materials and Methods

2.1 Sample and Procedure

Data were drawn from the English Longitudinal Study of Ageing (ELSA), a biennial longitudinal study of a nationally representative sample of the English population aged 50 and older (Steptoe et al., 2013). Waves 4 (2008) to 8 (2016) were considered for this work due to the uninterrupted inclusion of the variables of interest. The initial sample comprised 13,243 older adults. Among those, only participants aged 50+ at baseline, with at least one observation in one of the variables whose longitudinal interrelation was being analysed, and with complete information in the covariates used (i.e., gender and baseline age and wealth) were included in the corresponding analyses. All participants provided written informed consent. The ELSA study was approved by the National Research Ethics Service (MREC/01/2/91). Additional information on the study and how to access the data can be found on the webpage https://www.elsa-project.ac.uk/.

2.2 Measures

2.2.1 Life Satisfaction

Life satisfaction was assessed by means of the Satisfaction with Life Scale (SWLS) (Diener et al., 1985). In the SWLS, respondents are asked to express their agreement with five statements about their life satisfaction on a 7-point Likert scale from 1 (“Strongly disagree”) to 7 (“Strongly agree”). The resulting score ranges from 5 (minimum life satisfaction) to 35 (maximum life satisfaction). Previous research has shown that the SWLS items have an adequate internal consistency (with Cronbach’s α values ranging between .80 and .90), and that the scores present a moderate stability over time (even over a decade) as well as reasonable sensitivity to changes over time (Pavot & Diener, 2008). In the five waves considered, the internal consistency of the SWLS, measured with Cronbach’s α, ranged from .89 (wave 4) to .90 (waves 5, 6, 7, and 8).
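The scoring and reliability computations described above can be illustrated with a minimal sketch. The function names and the four response vectors are hypothetical (they are not ELSA data), and the formula shown is the usual variance-based expression for Cronbach's α, which is assumed rather than taken from the study's own code.

```python
from statistics import variance

def swls_total(items):
    """Sum five SWLS item responses (each 1-7) into a total between 5 and 35."""
    if len(items) != 5 or any(not 1 <= x <= 7 for x in items):
        raise ValueError("expected five responses between 1 and 7")
    return sum(items)

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    # Sample variance of each item across respondents
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    # Sample variance of the total score across respondents
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical responses from four respondents (illustration only)
responses = [
    [7, 6, 7, 6, 7],
    [4, 4, 5, 3, 4],
    [2, 1, 2, 2, 1],
    [5, 5, 6, 5, 4],
]
totals = [swls_total(r) for r in responses]
alpha = cronbach_alpha(responses)
```

With real data, the same function would be applied separately to the item responses of each wave.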

2.2.2 Health

Health was operationalised using the analytical approach proposed by Caballero et al. (2017). A set of 45 health-related items covering different health domains [i.e. walking, sight, hearing, balance, dizziness, memory, orientation in time, cognition, pain, energy, sleep, incontinence, mobility, and limitations in Activities of Daily Living (ADLs) and Instrumental Activities of Daily Living (IADLs)], consistent with the WHO recommendations for health measurement (Salomon et al., 2003), were selected. These items included both self-reported questions on functioning and difficulties, and measured tests on cognitive (e.g. verbal fluency or immediate recall) and physical performance (i.e. walking speed). The items were subsequently used to compute a health score at each time point for each participant in that wave based on a polytomous Bayesian multilevel Item Response Theory (IRT) approach (Fox & Glas, 2001). Previous research has shown that this score is able to predict mortality and institutionalisation better than other frequent operationalisations of health such as the count of chronic conditions (Caballero et al., 2017; de la Fuente et al., 2018). Although not all 45 items were present across all waves, the IRT approach used made it possible to overcome this obstacle. The full list of items is included in Table S1 (Supplementary Material). Following the procedure described in Caballero et al. (2017), walking speed was recoded into three groups based on the quartiles of the distribution at each wave (lowest quartile, two central quartiles, and highest quartile), whereas cognitive tasks’ scores were recoded into three groups based on their mean and standard deviation (SD) at each wave: high (> one SD above the mean), medium (± one SD around the mean), and low (< one SD below the mean).
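The two recoding rules described above can be sketched as follows. This is an illustration under stated assumptions: the function names and example values are hypothetical, the handling of values falling exactly on a cut-point is a choice made here, and the use of the sample SD and of the default quantile method of Python's `statistics` module may differ from the original procedure.

```python
from statistics import mean, quantiles, stdev

def recode_by_quartiles(values):
    """0 = lowest quartile, 1 = two central quartiles, 2 = highest quartile."""
    q1, _, q3 = quantiles(values, n=4)  # cut-points within one wave
    # Assumption: values equal to q1 go to the lowest group
    return [0 if v <= q1 else 2 if v > q3 else 1 for v in values]

def recode_by_sd(scores):
    """'high' above mean + 1 SD, 'low' below mean - 1 SD, 'medium' otherwise."""
    m, s = mean(scores), stdev(scores)  # assumption: sample SD
    return ["high" if x > m + s else "low" if x < m - s else "medium"
            for x in scores]

# Hypothetical walking speeds (m/s) and cognitive scores for one wave
walking_speed = [0.4, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.3]
cognitive = [2, 5, 6, 6, 7, 10]

speed_groups = recode_by_quartiles(walking_speed)
cognitive_groups = recode_by_sd(cognitive)
```

Because the cut-points are computed within each wave, the same raw value may fall into different groups at different waves.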

2.2.3 Other Variables

Gender and baseline age and wealth were included as covariates affecting health and life satisfaction at all time points in the cross-lagged models. Wealth was measured using the quintiles of net total wealth elaborated by the Institute for Fiscal Studies (2011) for the ELSA study. This measure reflects the savings, investments, physical wealth and housing wealth after financial debt and mortgage have been subtracted. In those cases where information on wealth was missing in all the waves under study, that information was retrieved, if available, from previous waves of the ELSA study.

Additionally, a question on self-reported health (“Would you say your health is excellent, very good, good, fair, or poor?”), with scores ranging from 1 (“excellent”) to 5 (“poor”) was considered for the sensitivity analyses. Response category scores were reversed so higher scores reflected a better self-reported health.
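As a minimal sketch of the reversal described above, a five-category response coded 1 to 5 can be reversed by subtracting it from 6. The function name is hypothetical.

```python
def reverse_self_rated_health(score):
    """Map 1 (excellent) ... 5 (poor) onto 5 (excellent) ... 1 (poor)."""
    if score not in (1, 2, 3, 4, 5):
        raise ValueError("expected an integer from 1 to 5")
    return 6 - score

reversed_scores = [reverse_self_rated_health(s) for s in (1, 2, 3, 4, 5)]
```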

2.3 Statistical Analysis

First, descriptive statistics of the age, gender, and wealth of the analytic and excluded samples in the main analyses were obtained. Differences between the two samples were tested with unpaired t-tests for continuous variables and chi-square tests for categorical ones, using Cohen’s d and Cramér’s V, respectively, as effect size estimates.
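For reference, the two effect sizes named above can be computed as in the following sketch. It assumes the pooled-SD version of Cohen's d and the uncorrected chi-square statistic for Cramér's V; the function names and the example data are hypothetical and do not reproduce Table 1.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(a, b):
    """Cohen's d for two independent groups, using the pooled SD."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)

def cramers_v(table):
    """Cramér's V from a contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Pearson chi-square: sum of (observed - expected)^2 / expected
    chi2 = sum(
        (obs - rt * ct / n) ** 2 / (rt * ct / n)
        for row, rt in zip(table, row_totals)
        for obs, ct in zip(row, col_totals)
    )
    k = min(len(row_totals), len(col_totals))
    return sqrt(chi2 / (n * (k - 1)))

# Hypothetical ages in two groups and a hypothetical 2 x 2 table of counts
d = cohens_d([60, 62, 64, 66, 68], [66, 68, 70, 72, 74])
v = cramers_v([[30, 20], [20, 30]])
```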

A Bayesian multilevel IRT approach was used in order to obtain the health scores of each individual at each time point in which they participated. A set of four IRT models with increasing complexity were computed in order to select the model with the best fit to the data. The resulting expected a posteriori (EAP) estimates were extracted from the model with the best fit (lowest Deviance Information Criterion, DIC). Details on the approach can be found in Appendix S1 (Supplementary Material).

Although the raw sum scores of the SWLS were used in the subsequent models, we assessed the correlation between these and the factor scores estimated from a confirmatory factor analysis (CFA) model with an adequate level of longitudinal invariance as a way to assess the equivalence between these scores in terms of the rank-ordering of individuals. A model with metric (or weak) longitudinal invariance was deemed sufficient considering that the focus of the study is on the relationships between variables rather than on the mean scores over time (Mulder & Hamaker, 2020). Details on the approach can be found in the Appendix S2 (Supplementary Material).

In order to analyse the longitudinal interrelationship between health and life satisfaction, we computed a set of cross-lagged models. These models are dynamic models that allow assessing the effects of two or more variables on each other over time (Hamaker et al., 2015). We used a structural equation modelling (SEM) approach to specify two different models. The first one, a standard cross-lagged model, was computed mainly for comparison purposes with the second model and with previous research (Gana et al., 2013). This first model included both autoregressive effects (the effect of health or life satisfaction at the previous wave t − 1 on the same variable at the wave t) and cross-lagged effects (the effect of health or life satisfaction at t − 1 on the other variable at t). The covariance between the residuals of health and life satisfaction at each wave was freely estimated in order to allow for potential within-occasion effects on the observations. The main advantage of the standard specification is its simplicity and transparency, whereas its main downside is that it fails to separate the within-person and between-person variance, even if the multiple observations are nested within individuals, so the within-person stability is absorbed by the autoregressive component of the models, which biases the effect estimates (Hamaker et al., 2015).
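The way unmodelled stability is absorbed by the autoregressive component can be illustrated with a small simulation. This is a deliberately simplified, single-variable sketch and not the SEM specification used in this study: each simulated score is assumed to be a stable person-level trait plus a within-person deviation that follows a first-order autoregressive process, and all names and parameter values are hypothetical.

```python
import random

def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate_autoregression(n=5000, rho=0.3, trait_sd=1.0, seed=1):
    """Return (naive, within) autoregressive slopes for simulated two-wave data."""
    rng = random.Random(seed)
    innovation_sd = (1 - rho ** 2) ** 0.5  # keeps within-person variance at 1
    obs1, obs2, dev1, dev2 = [], [], [], []
    for _ in range(n):
        trait = rng.gauss(0, trait_sd)               # stable between-person part
        w1 = rng.gauss(0, 1)                         # within-person deviation, wave 1
        w2 = rho * w1 + rng.gauss(0, innovation_sd)  # within-person deviation, wave 2
        obs1.append(trait + w1)
        obs2.append(trait + w2)
        dev1.append(w1)
        dev2.append(w2)
    naive = ols_slope(obs1, obs2)   # regression on observed scores, trait ignored
    within = ols_slope(dev1, dev2)  # regression on the true within-person deviations
    return naive, within

naive_slope, within_slope = simulate_autoregression()
```

Under these assumptions, the slope estimated from the observed scores converges to (trait variance + rho) / (trait variance + 1), which exceeds rho whenever the trait variance is positive, whereas the slope estimated from the within-person deviations recovers rho. In real data the deviations are not observed, which is why the stable component has to be modelled explicitly.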

A second cross-lagged model was computed in order to overcome this limitation. In addition to the previous model’s specifications, this model included a set of additional latent variables to model additional phenomena. Following the suggestions in Zyphur et al. (2019), we specified two latent variables (ηhealth and ηlife satisfaction), each of them identified by their respective observed variables with unrestricted factor loadings except for those corresponding to t = T (total number of waves), which were fixed to 1 for identification purposes. The covariance between these two latent variables was freely estimated. By using this strategy, “fixed-effects” due to between-person variability in the repeated measures are captured by these latent variables, thus not biasing the resulting autoregressive and cross-lagged effects. The freely estimated covariance between ηhealth and ηlife satisfaction may be especially important when, as often happens in the literature (and is specifically modelled in the sensitivity analyses described below), health is operationalised by means of a self-reported item, whose relationship with life satisfaction measures obtained through similar psychological processes may bias the results.
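The specification described above can be written out explicitly. The following is a sketch in illustrative notation: H and L denote health and life satisfaction for person i at wave t, z collects the covariates, and the symbols α, λ, β, γ, and ψ are labels assumed here rather than taken from the original model syntax. The lagged terms apply from the second wave onwards.

```latex
\begin{aligned}
H_{it} &= \alpha^{H}_{t} + \lambda^{H}_{t}\,\eta^{\text{health}}_{i}
        + \beta^{HH}_{t} H_{i,t-1} + \beta^{HL}_{t} L_{i,t-1}
        + \boldsymbol{\gamma}^{H\top}_{t}\mathbf{z}_{i} + \zeta^{H}_{it},\\
L_{it} &= \alpha^{L}_{t} + \lambda^{L}_{t}\,\eta^{\text{life satisfaction}}_{i}
        + \beta^{LL}_{t} L_{i,t-1} + \beta^{LH}_{t} H_{i,t-1}
        + \boldsymbol{\gamma}^{L\top}_{t}\mathbf{z}_{i} + \zeta^{L}_{it},\\
&\lambda^{H}_{T} = \lambda^{L}_{T} = 1, \qquad
\operatorname{Cov}\!\left(\eta^{\text{health}}_{i}, \eta^{\text{life satisfaction}}_{i}\right) = \phi, \qquad
\operatorname{Cov}\!\left(\zeta^{H}_{it}, \zeta^{L}_{it}\right) = \psi_{t}.
\end{aligned}
```

In this notation, the autoregressive effects are the coefficients on the same variable at t − 1, the cross-lagged effects are the coefficients on the other variable at t − 1, and removing the two λη terms yields the standard cross-lagged model.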

As a sensitivity analysis, two additional models were computed with the same specifications as the two above-mentioned cross-lagged models, using the self-reported measure on health instead of the IRT score.

All cross-lagged models were computed with a SEM approach, using the MLR estimator, which provides standard errors that are robust to non-normality (Maydeu-Olivares, 2017). Gender and baseline age and wealth were included as exogenous variables with effects on all health and life satisfaction observations at all time points. Model fit was assessed by means of Comparative Fit Index (CFI), Tucker-Lewis Index (TLI), and Root Mean Square Error of Approximation (RMSEA) values. Following Hu and Bentler's (1999) recommendations, CFI and TLI values higher than .95, and RMSEA values lower than .06 were considered to indicate good model fit. In both models, cross-lagged effects of health and life satisfaction were compared in the standardised solution to ascertain the relative impact of each variable on the other.
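The cut-offs stated above amount to a simple decision rule, sketched here for illustration. The function name and the index values passed to it are hypothetical and are not the values reported in Table 3.

```python
def meets_hu_bentler(cfi, tli, rmsea):
    """True when CFI and TLI exceed .95 and RMSEA is below .06."""
    return cfi > 0.95 and tli > 0.95 and rmsea < 0.06

# Hypothetical fit indices (illustration only)
example_good = meets_hu_bentler(cfi=0.98, tli=0.97, rmsea=0.03)
example_poor = meets_hu_bentler(cfi=0.93, tli=0.90, rmsea=0.08)
```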

Missing data were assumed to be missing at random (MAR), meaning that missingness is ignorable upon conditioning on the observed variables. Full Information Maximum Likelihood (FIML) estimation was used, which allows each case to contribute to the estimation of the parameters for which there are complete data, thus using all available data for parameter estimation and providing unbiased estimates under the MAR assumption (Enders & Bandalos, 2001). Nevertheless, the FIML estimation does not account for missingness in the exogenous variables (i.e., gender and baseline age and wealth). Therefore, considering the substantial subset of participants whose wealth level could not be ascertained, and in order to analyse the potential impact of this missingness in the results, we computed both cross-lagged models (with and without the within-person stability factors) excluding wealth.

Data management was performed with Stata SE version 14 (StataCorp, 2015). Bayesian multilevel IRT and CFA were performed in R version 3.6.3 (R Core Team, 2020), using sirt R package version 3.9-4 (Robitzsch, 2020) for IRT and lavaan R package version 0.6-7 (Rosseel, 2012) for CFA. Cross-lagged models were performed with Mplus version 7 (Muthén and Muthén 1998–2017). The code is available upon reasonable request.

3 Results

3.1 Sample Characteristics

Among the 13,243 participants aged 50+ that comprised the initial sample, 13,233 (99.92%) provided information on more than 50% of the selected health indicators on at least one occasion; 11,779 (88.95%) completed the SWLS questionnaire on at least one occasion; 12,634 (95.40%) provided information on their self-reported health on at least one occasion; and 11,673 (88.14%) had complete information in the covariates (the wealth of the remaining 1570 participants could not be ascertained). The analytic sample of the main models (using the IRT-based health measure and adjusting for gender and baseline age and wealth) comprised 11,667 participants. Table 1 summarises the baseline characteristics of the analytic and excluded samples. Almost all cases in the excluded sample (n = 1570, 99.62%) were excluded because their wealth level could not be ascertained. Significant differences were found in the age of the samples, with the included sample being 5.38 years younger (p < .001) than the excluded sample. A higher percentage of women was found in the included sample (55.65%) compared to the excluded sample (44.35%). The distributions of the health and life satisfaction scores across the different waves are shown in Table S2 (Supplementary Material).
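The counts and percentages reported above can be cross-checked with simple arithmetic. The sketch below uses only the figures stated in this section; the variable names are illustrative.

```python
initial = 13243   # initial sample
analytic = 11667  # analytic sample of the main models

counts = {
    "health_items": 13233,
    "swls": 11779,
    "self_reported_health": 12634,
    "complete_covariates": 11673,
}
# Percentages of the initial sample, rounded to two decimals
percentages = {name: round(100 * n / initial, 2) for name, n in counts.items()}

missing_wealth = initial - counts["complete_covariates"]
excluded = initial - analytic
share_excluded_for_wealth = round(100 * missing_wealth / excluded, 2)
```

The excluded sample therefore comprises 1576 participants, of whom 1570 lacked information on wealth.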

Table 1 Baseline characteristics of the analytic and excluded samples

3.2 Estimation of the Health Score

The DIC values and the EAP reliability estimates for the different Bayesian multilevel IRT models performed are detailed in Table 2. The best fitting model was the model including item-wise difficulties and slopes, with item-specific standard deviation of item difficulties (model 4), showing the lowest DIC value (546,015) and an adequate EAP reliability (.947). Therefore, this model was used for obtaining the health score of participants.

Table 2 Information criteria and reliability of the Bayesian multilevel item response theory models of health

3.3 Rank-Ordering Equivalence of SWLS Scores

The goodness-of-fit indices of the CFA models computed to assess the longitudinal invariance of the SWLS are shown in Table S3 (Supplementary Material). The metric invariance model adequately reproduced the data, and the correlations found between the predicted factor scores from this model and the raw sum scores used in the subsequent models ranged from .981 to .982, showing an almost identical rank-ordering of participants.

3.4 Cross-lagged Models with the IRT-Based Health Measure

In order to address the aims of the present study, we used the health scores from the selected IRT model along with the SWLS scores to perform the cross-lagged models. The goodness-of-fit indices associated with the cross-lagged models computed with the IRT-based health score are included in the left section of Table 3. The standard cross-lagged model did not show an optimal fit to the data, with some of its fit indices being very close to the conventional cut-off criteria for unacceptable fit. The visual depiction of this cross-lagged model is shown in Fig. 1, including the standardised regression coefficients (β) and the residual correlations at each time point. All autoregressive and cross-lagged effects were found to be positive and significant. The largest effect was found for the autoregressive effect of health, with β estimates ranging from .849 to .871, followed by the autoregressive effect of life satisfaction, which ranged from .694 to .717. Regarding the cross-lagged effects, the effect of health on life satisfaction (ranging from .099 to .119) was found to be larger than the effect of life satisfaction on health (ranging from .018 to .020).

Table 3 Goodness-of-fit indices of the cross-lagged models
Fig. 1 Standard cross-lagged model between life satisfaction and the IRT-based health measure. Note: ζ residuals, LS life satisfaction. All coefficients are standardised. The effects of the exogenous variables (i.e. age, gender, and wealth) on health and life satisfaction on later time points (Health2 and LS2 and onwards) are not depicted for the sake of clarity

The cross-lagged model accounting for the within-person stability showed substantially better fit. All goodness-of-fit indices showed an excellent fit of the model to the data, except for the χ2 estimate (to be expected due to the large sample size). The visual depiction of this model along with the standardised coefficients is shown in Fig. 2. The inclusion of the latent factors capturing the within-person stability in health and life satisfaction over time substantially reduced the autoregressive effects of both variables, although these remained the largest effects in the model. In this model, the cross-lagged effects of life satisfaction on health ranged from β = .010 to β = .012 (p = .019), whereas its counterpart, the effect of health on life satisfaction, ranged from β = .075 to β = .093 (p < .001) and was around 7–8 times larger than the former. The two within-person stability latent factors had a positive correlation (φ = .329).
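The relative size of the two cross-lagged effects follows directly from the reported ranges, as the short sketch below shows. Pairing the lower bounds with each other and the upper bounds with each other is an assumption made here for illustration, since the ranges summarise estimates across waves.

```python
life_satisfaction_to_health = (0.010, 0.012)  # reported range of standardised effects
health_to_life_satisfaction = (0.075, 0.093)  # reported range of standardised effects

# Ratio of the health -> life satisfaction effect to the reverse effect
ratios = [
    h / ls
    for h, ls in zip(health_to_life_satisfaction, life_satisfaction_to_health)
]
```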

Fig. 2 Within-person stability cross-lagged model between life satisfaction and the IRT-based health measure. Note: η within-person stability factors, ζ residuals, LS life satisfaction. All coefficients are standardised. The effects of the exogenous variables (i.e. age, gender, and wealth) on health and life satisfaction on later time points (Health2 and LS2 and onwards) are not depicted for the sake of clarity. The dotted lines represent effects with a significance .01 < p < .05

3.5 Sensitivity Analyses: Cross-lagged Models with Self-reported Health

In order to ascertain to what extent the relationships found would be affected by the use of an exclusively self-reported measure of health instead of the IRT-based measure including both self-reported and measured tests of cognitive and physical functioning, we computed an additional set of models. The goodness-of-fit indices of these sensitivity models are shown in the middle section of Table 3. In this case, the standard cross-lagged model showed a poor fit to the data, whereas the model including the within-person stability latent factors showed an excellent fit. The visual depiction of the standard cross-lagged model computed with the self-reported measure of health is shown in Figure S1 (Supplementary Material). First, it is important to note that the lack of fit of this model hinders its substantive interpretation. Nevertheless, compared to its counterpart with the IRT-based health score, this model showed a lower autoregressive effect of health (ranging from .629 to .639), which was also lower than that of life satisfaction. Moreover, the cross-lagged effects were in this case very similar across the two variables: the effect of life satisfaction on health ranged from .081 to .084, whereas the effect of health on life satisfaction ranged from .091 to .093, only slightly larger.

The cross-lagged model including the within-person stability latent factors is depicted in Figure S2 (Supplementary Material). Compared to the analogous model with the IRT-based health score, and as in the case of the standard cross-lagged model with the self-reported health measure, the autoregressive effect of health was smaller (ranging from .142 to .145). Regarding the cross-lagged effects, the effect of life satisfaction on health (β = .012) turned out to be non-significant (p = .093) once the within-person stability was accounted for. Notably, the cross-lagged effect of health on life satisfaction was substantially smaller (ranging from .036 to .038) than in the models with the IRT-based measure. The correlation between the two latent factors was larger in this model (φ = .410).

3.6 Sensitivity Analyses: Cross-lagged Models Without Adjusting for Wealth

To assess the effect of excluding cases with no information on wealth, and in light of the differences found in the comparison between included and excluded samples, an additional set of cross-lagged models was computed without adjusting for that variable. This increased the sample size by 1566 additional cases. The goodness-of-fit indices of these models are shown in the right section of Table 3, and their visual depictions, along with the coefficients, are shown in Figure S3 (standard cross-lagged model) and Figure S4 (cross-lagged model with within-person stability factors) of the Supplementary Material. The results were very similar to those adjusted for wealth, again showing that, once the within-person stability in the measures was modelled, the cross-lagged effects of life satisfaction on health became very small and, as in the case with the self-reported health measure, not statistically significant (p = .064).

4 Discussion

4.1 Main Findings and Interpretation

The present study shows that, while health predicts higher life satisfaction in older adults, once the within-person stability in both variables is accounted for, the effect of life satisfaction on health becomes negligible. These results were replicated using two different operationalisations of health: a more comprehensive approach including both self-reported and performance information and a single self-reported question on general health. Models not accounting for that stability may yield biased results, suggesting that life satisfaction is a relevant predictor of health, especially when using self-reported measures of health, where the results may even suggest that the ability of each variable to predict the other one is equivalent. Altogether, our study reflects the importance of accounting for the within-person stability when analysing the relationship between health and life satisfaction in older adults, most especially when the operationalisation of health relies exclusively on self-reported information.

In this latter case, failing to account for that stability and the relationship between both stability factors can become a greater source of bias, since both constructs are more likely to be related due to unspecific factors such as response styles (Angelini et al., 2012) or broader “top down” individual predispositions when assessing personal levels of health and life satisfaction (Diener et al., 2018), given that both variables are exclusively self-reported. In our study, such response styles may explain why the health and life satisfaction stability factors are more highly related in the models using exclusively self-reported information for operationalising health.

Our results are in line with a previous study by Gana et al. (2013), which used a similar analytical approach and found that health predicts life satisfaction but life satisfaction does not predict health in older adults. Our study provides further evidence on these relationships by specifically accounting for the within-person stability of both measures and using a health operationalisation that includes objective performance information, thus reducing the risk of bias in the estimates. Moreover, our results underpin the relative personal stability of both constructs over time in the older population (Daskalopoulou et al., 2019; de la Fuente et al., 2018; Diener et al., 2009; Moreno-Agostino et al., 2020a, 2021b).

By contrast, the present study suggests that caution should be taken when interpreting evidence analysing the relationship between health and life satisfaction (Ngamaba et al., 2017), especially evidence reporting that SWB predicts better health in older adults (Miret et al., 2017; Steptoe & Fancourt, 2019). It is important to note that most of the previous evidence (Miret et al., 2017; Ngamaba et al., 2017) is based on cross-sectional data, and therefore no directionality in the results can be inferred. But even when longitudinal information is used (Steptoe & Fancourt, 2019), the above-mentioned stability in the constructs involved is usually disregarded. By using a robust approach that accounts for the complex existing dynamics between health and life satisfaction in older adults, our study overcomes these limitations, providing a more accurate picture of the underlying relationships.

4.2 Strengths and Limitations

To our knowledge, this is the first study to analyse the relationship between health and life satisfaction in older adults using a methodological approach that accounts for the within-person stability in both constructs, while adjusting the results for potentially relevant confounders such as age, gender, and wealth. Importantly, unlike most of the available literature on this relationship, our approach allows us to model the directionality of the effects in a longitudinal framework, thus leaving less room for alternative explanations. Furthermore, we have used a robust approach to the measurement of health in old age (Caballero et al., 2017; Daskalopoulou et al., 2019), thus avoiding relying exclusively on self-reported information. Nevertheless, by using an alternative approach to the operationalisation of health based exclusively on self-reported information, we have been able to show how greater bias may take place when failing to account for the within-person stability in this habitual case in the literature.

Notwithstanding, the present study should be interpreted in light of several limitations. First, it is crucial to note that the sample analysed in this study belongs to a population from a high-income Anglo-Saxon country. This may substantially limit the generalisability of the results, which might have differed had the study been conducted on the older population of a country that is not Western, Educated, Industrialised, Rich, and Democratic (WEIRD), where living conditions are expected to change the way in which health and SWB interrelate over time (Henrich et al., 2010). Moreover, it has been suggested that people from Anglo-Saxon countries may display differences in how their SWB is expected to change over time (Steptoe et al., 2015), although previous research has found similar results when studying older participants from France (Gana et al., 2013). Therefore, replication studies in non-WEIRD older populations are needed in order to generalise the findings of this work. Another limitation of the present study is the exclusive consideration of life satisfaction measures. SWB comprises additional components that, despite being related to life satisfaction, are not strictly equivalent to it (Dolan et al., 2017). The existence of a positive relationship between health and these different components has been proposed in the literature (Moreno-Agostino et al., 2021b; Ngamaba et al., 2017; Steptoe et al., 2015), but the magnitude of the relationships found in the present study may have varied if alternative components of SWB had been considered. Unfortunately, as is usual in population-based studies (Dolan & Metcalfe, 2012), only life satisfaction was consistently included and measured over time. Future research may focus on the relationship between health and SWB over time using different components such as experiential or eudemonic wellbeing.
It is also important to note that measurement error was not accounted for in the relationships found. This is because, although the higher complexity of the models including the within-person trait-like stability factors allows separating within- and between-person variability (Hamaker et al., 2015; Mulder & Hamaker, 2020), it also makes it difficult to simultaneously estimate the measurement and structural models, as such models are prone to convergence problems (Usami, 2020; Usami et al., 2019). Nevertheless, methods for estimating longitudinal models on reciprocal relationships are being developed and refined, and future research may replicate these findings embedding measurement models for the variables under study in the structural model testing the reciprocal effects. Finally, the direction of the differences found between the analytic and excluded samples suggests that the latter comprised more men and older participants, which may limit the generalisability of our results. However, the additional set of analyses performed without adjusting for wealth, and thus avoiding the exclusion of the great majority of those cases, showed that the results were very similar to those of the main analysis, supporting their robustness.

5 Conclusions

Health and life satisfaction are not bidirectionally related in older adulthood when the within-person stability in both constructs is accounted for: only health predicts better life satisfaction over time. Policies aimed at fostering the SWB of the older population may achieve that goal by focusing on health enhancement and maintenance, whereas policies aimed at promoting healthy ageing by increasing SWB may not be as effective. Research aimed at analysing health and life satisfaction in older adults, and possibly in the general population, should account for the relative stability of those constructs in order to adequately ascertain the underlying relationships between them and with other variables.