Background

Quality of life (QoL) is a multi-dimensional concept of an individual’s general well-being in relation to the values, environment, and cultural and social context in which they live [1]. Since QoL measures outcomes beyond biological functioning and morbidity [2], it is recognised as an important measure of overall health [1]. The term QoL dates back to the early 1970s, when it was introduced as a measure of wellness linked to health status, such as disease or disability [3, 4]. Since then, interest in QoL has increased considerably [5]. As life expectancy increases, more emphasis has been placed on the importance of better QoL and the maintenance of good health for as long as possible [6,7,8,9]. Indeed, leading global health organizations have emphasized the importance of QoL and well-being as a goal across all life stages [10,11,12].

Moreover, QoL has increasingly been used in the wider context to monitor the efficacy of health services (e.g. patient reported outcome measures, PROMs), to assess intervention outcomes, and as an indicator of unmet needs [13,14,15]. Several studies have reported that QoL is negatively associated with rehospitalization and death in patients with diseases such as coronary disease [16, 17], and pulmonary diseases [18]. Further, QoL is also predictive of overall survival in patients affected by cancer, chronic kidney disease or after coronary bypass graft surgery [19,20,21,22]. In recent years, an increasing number of studies have investigated whether QoL is also a predictor of mortality risk in the general population [23,24,25,26,27].

To date, there has been only one pooled analysis, of eight heterogeneous Finnish cohorts. That study of 3153 older adults focused exclusively on the prognostic value of the validated 15-dimensional (15D) health-related QoL (HRQoL) measure [28] for predicting all-cause mortality [29]. However, no systematic review has investigated the association between QoL measured by different instruments and all-cause mortality in population-based samples, which could be used to monitor health changes in the general population. A broad and comprehensive systematic review of the prognostic value of QoL for all-cause mortality is needed to determine the utility of QoL measures as a potential screening tool in general clinical practice. Therefore, this systematic review and meta-analysis was conducted to determine whether QoL is predictive of mortality in the general population, which includes individuals with or without a range of health conditions.

Methods

Search methods

This systematic review and meta-analysis was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [30]. The protocol for this review was registered with the International Prospective Register of Systematic Reviews (PROSPERO) [31], under the registration number CRD42019139994 [32]. The electronic bibliographic databases MEDLINE, EMBASE and PsycINFO (through OVID) were searched from database inception until June 21, 2019. The search strategy was developed in consultation with a Senior Medical Librarian. The MeSH terms and keywords were developed for MEDLINE (through OVID) and were translated to EMBASE and PsycINFO using the OVID platform (see Supplementary Tables S1-S3, Additional File 1). When the full text of an article was not available, the authors were contacted directly to obtain it. To identify further potentially relevant studies, an additional search was developed using the specific QoL / HRQoL measures identified in this review (see Supplementary Table S4, Additional File 1). The reference lists of the included articles were also hand searched.

Inclusion and exclusion criteria

Articles were included if they: (a) involved adults aged 18 years and older; (b) were general population-based samples with or without a range of health conditions; (c) assessed mortality from any cause or cause-specific mortality using a longitudinal design; and (d) included a QoL / HRQoL measure using a standard tool. QoL, the general well-being of individuals, consists of a range of contexts – health, education, employment, wealth, politics and the environment [33]. HRQoL, the self-perceived health status, includes physical, mental, emotional, and social domains [33]. We excluded papers not written in English, reviews, or studies including only specific groups of patients (e.g. patients on dialysis, those with fractures, after surgery, or individuals with a terminal illness).

Study selection

Titles and abstracts were screened for eligibility independently by two reviewers (AZZP and HC), who then independently reviewed all relevant full-text articles against the inclusion criteria. Inter-coder reliability between the two reviewers was 98%. Discrepancies and disagreements were resolved through discussion with a third reviewer (JR). The screening process was undertaken using Covidence online software [34] and EndNote X9 software.
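The text does not state which statistic underlies the 98% inter-coder reliability figure. As a hedged illustration only, the sketch below computes two common candidates, simple percent agreement and Cohen's kappa, for two reviewers' include/exclude decisions. The decision lists are hypothetical and are not data from this review.

```python
from collections import Counter


def percent_agreement(a, b):
    """Share of records on which two reviewers made the same decision."""
    if len(a) != len(b) or not a:
        raise ValueError("decision lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(a)
    observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    labels = set(counts_a) | set(counts_b)
    expected = sum(counts_a[k] * counts_b[k] for k in labels) / n ** 2
    if expected == 1.0:
        return 1.0  # both reviewers used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical screening decisions for 100 records (not data from this review).
reviewer_1 = ["include"] * 10 + ["exclude"] * 90
reviewer_2 = ["include"] * 9 + ["exclude", "include"] + ["exclude"] * 89

if __name__ == "__main__":
    print(f"percent agreement: {percent_agreement(reviewer_1, reviewer_2):.2%}")
    print(f"Cohen's kappa:     {cohens_kappa(reviewer_1, reviewer_2):.3f}")
```

When most records are excluded, as is typical at the title and abstract stage, percent agreement can be high even if kappa is noticeably lower, so it is useful to know which of the two a reported figure refers to.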

Data extraction

A standard data extraction form was used, which included the following fields: title, authors, year of publication, setting/country, name of the study and design, sample size, follow-up period, participant characteristics (age and sex), specific QoL measure, cause of death (if available), and results (risk estimates including 95% confidence intervals, CI). Results were standardized in terms of a 1-unit increase or a 1-SD increase for continuous risk estimates, or high vs. low for categorical risk estimates. The first reviewer (AZZP) completed the data extraction form and a second reviewer (HC) verified the extracted information. Authors were contacted when information was missing.
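The text does not describe how continuous risk estimates were rescaled to a common increment. One common approach, shown here only as a sketch, assumes the log of the ratio is linear in the QoL score, so an estimate per 1 unit can be raised to the power of the new increment. The function name `rescale_ratio` and the example values are hypothetical.

```python
import math


def rescale_ratio(estimate, lower, upper, factor):
    """Rescale a ratio (HR, OR or RR) and its CI to a different increment.

    `factor` is the new increment expressed in units of the old one, for
    example the scale's SD when converting "per 1 unit" to "per 1 SD".
    Assumes a log-linear relationship between the score and the outcome.
    """
    if min(estimate, lower, upper) <= 0:
        raise ValueError("ratio estimates and CI bounds must be positive")
    if factor <= 0:
        raise ValueError("factor must be positive so the CI keeps its order")

    def scale(value):
        return math.exp(factor * math.log(value))

    return scale(estimate), scale(lower), scale(upper)


if __name__ == "__main__":
    # Hypothetical HR of 0.98 (95% CI 0.97 to 0.99) per 1 unit, SD of 10 units.
    hr, lo, hi = rescale_ratio(0.98, 0.97, 0.99, factor=10)
    print(f"HR per 1 SD: {hr:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

The same function converts in the other direction by passing the reciprocal increment, for example `factor=1/10` to go from per 1 SD back to per 1 unit.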

Quality appraisal

The quality of included studies was appraised using the Newcastle-Ottawa Quality Assessment Scale (NOS) [35]. The NOS includes eight items, categorized into three dimensions: (a) Selection, (b) Comparability, and (c) Outcome. The NOS uses a star system to evaluate the quality of each study: a study can be awarded a maximum of one star for each item within the Selection and Outcome dimensions and two stars for the Comparability item. When considering the comparability of each study, one star was awarded to studies which controlled for relevant covariates: age, sex (where appropriate), socioeconomic status or a proxy (including socioeconomic position, education level or income), and some measure of co-morbidity (for example a specific health condition). An additional star was given to studies which considered other factors associated with QoL and mortality, including clinical measures, BMI, or lifestyle factors (i.e. smoking, alcohol, physical activity). NOS scores ranged from 0 to 9 stars, with higher scores indicating less susceptibility to bias. The methodological quality of included studies was rated by one reviewer (AZZP) and verified by a second reviewer (HC). Disagreements were resolved through discussion with a third reviewer (JR).
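The star system described above can be tallied as in the following sketch. It assumes the cohort-study version of the NOS, with four Selection items, one Comparability item and three Outcome items, which matches the eight items and nine-star maximum given in the text. The function name and the example ratings are hypothetical.

```python
# Maximum stars per item, by dimension (cohort-study version of the NOS).
NOS_MAX_STARS = {
    "selection": [1, 1, 1, 1],
    "comparability": [2],
    "outcome": [1, 1, 1],
}


def nos_total(stars):
    """Sum the stars awarded to one study, checking each item's maximum."""
    total = 0
    for dimension, maxima in NOS_MAX_STARS.items():
        awarded = stars[dimension]
        if len(awarded) != len(maxima):
            raise ValueError(f"{dimension}: expected {len(maxima)} item(s)")
        for got, cap in zip(awarded, maxima):
            if not 0 <= got <= cap:
                raise ValueError(f"{dimension}: {got} stars exceeds limit {cap}")
            total += got
    return total


if __name__ == "__main__":
    # Hypothetical study: cohort not representative, only core covariates adjusted.
    example = {"selection": [0, 1, 1, 1], "comparability": [1], "outcome": [1, 1, 1]}
    print(f"NOS score: {nos_total(example)} of 9 stars")
```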

Data synthesis

The clinical and methodological heterogeneity of the studies was examined, in particular considering the measure of QoL used and the effect estimates reported (Hazard Ratio (HR), Relative Risk (RR) or Odds Ratio (OR)). Where studies were considered too methodologically heterogeneous to enable pooling, the results were summarized quantitatively in tables according to related categories, with risk estimates and 95% CIs.

Meta-analysis

A meta-analysis was performed when there was a sufficient number of studies (four or more) which used the same domain of a QoL measure and equivalent effect estimate parameters. In the present study, four meta-analyses were conducted to obtain pooled risk estimates from studies using (a) the physical component score (PCS) of the 36-item Short Form (SF-36) and OR / RR; (b) the physical function domain of the SF-36 and HR; (c) the mental component score (MCS) of the SF-36 and OR / RR; and (d) the 15-dimensional measure (15D) and HR. A DerSimonian-Laird random-effects model was chosen given the heterogeneity of the studies in terms of population characteristics and varying health status. When more than one risk estimate was reported in a study, the estimate from the fully adjusted/final regression model was included. In addition, when included studies from the same cohort with the same follow-up were eligible for a meta-analysis, only the study with the larger sample size was chosen. Effect estimates were standardized where possible, so that all values corresponded to a 1-unit increase in SF-36 or a 1-SD increase in 15D (single index number). A pooled risk estimate of less than one indicates a decreased risk of mortality with higher QoL. Statistical heterogeneity was evaluated using the I2 statistic, and the results were interpreted based on the Cochrane guidelines (0–40% = might not be important; 30–60% = moderate heterogeneity; 50–90% = substantial heterogeneity; and 75–100% = considerable heterogeneity) [36]. In addition, when the I2 statistic showed considerable heterogeneity (≥ 75%), the influence of individual studies on the pooled risk estimate was assessed using the metaninf command of STATA. Funnel plots and Egger’s test were used to assess publication bias. Data analysis was undertaken using STATA statistical software, version 15.0 (StataCorp LP, College Station, TX, USA).
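The analyses in this review were run in STATA. The following Python sketch is not the authors' code; it only illustrates the DerSimonian-Laird random-effects calculation and the I2 statistic named above. It assumes each study reports a ratio estimate with a 95% CI, from which the standard error is recovered on the log scale. All function names and input values are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_effect_and_se(estimate, lower, upper):
    """Convert a ratio estimate and its 95% CI to a log effect and its SE."""
    return math.log(estimate), (math.log(upper) - math.log(lower)) / (2 * Z_95)


def dersimonian_laird(studies):
    """Pool (estimate, lower, upper) tuples with a random-effects model."""
    if len(studies) < 2:
        raise ValueError("need at least two studies")
    effects = [log_effect_and_se(*s) for s in studies]
    y = [e[0] for e in effects]
    v = [e[1] ** 2 for e in effects]

    # Inverse-variance (fixed-effect) weights and Cochran's Q.
    w = [1.0 / vi for vi in v]
    sum_w = sum(w)
    fixed_mean = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, y))
    df = len(studies) - 1

    # Moment estimator of the between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # I2: percentage of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights, pooled log effect and its 95% CI.
    w_re = [1.0 / (vi + tau2) for vi in v]
    sum_w_re = sum(w_re)
    pooled_log = sum(wi * yi for wi, yi in zip(w_re, y)) / sum_w_re
    se = math.sqrt(1.0 / sum_w_re)
    return {
        "pooled": math.exp(pooled_log),
        "ci_lower": math.exp(pooled_log - Z_95 * se),
        "ci_upper": math.exp(pooled_log + Z_95 * se),
        "tau2": tau2,
        "i2": i2,
    }


if __name__ == "__main__":
    # Hypothetical per-unit odds ratios with 95% CIs (not data from this review).
    example = [(0.93, 0.90, 0.96), (0.96, 0.94, 0.98),
               (0.95, 0.91, 0.99), (0.97, 0.95, 0.99)]
    result = dersimonian_laird(example)
    print({key: round(value, 4) for key, value in result.items()})
```

Pooling on the log scale keeps the confidence interval symmetric around the log estimate, which is why ratios are log-transformed before weighting and exponentiated afterwards.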

Results

Search result

A total of 4175 articles were identified from the systematic database search, and six additional articles were found by searching the reference lists of included articles (Fig. 1). After removing duplicates, 3140 records remained for review. After title and abstract screening, 3058 articles were excluded and the full texts of the remaining 82 articles were evaluated for eligibility. A total of 44 articles met all inclusion criteria. Excluded articles, with reasons for exclusion, are presented in Supplementary Table S5, Additional File 1. Three articles from the additional search were also added to this review. Therefore, a total of 47 articles were included in this systematic review.
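The screening counts above can be cross-checked with simple arithmetic, as in this sketch. The number of articles excluded at full-text review is not stated in the text and is derived here as an assumption; the function and key names are hypothetical.

```python
def screening_flow(after_duplicates, excluded_on_title_abstract,
                   met_criteria, added_from_extra_search):
    """Derive the intermediate counts of a review flow diagram."""
    full_text_assessed = after_duplicates - excluded_on_title_abstract
    excluded_on_full_text = full_text_assessed - met_criteria
    total_included = met_criteria + added_from_extra_search
    if min(full_text_assessed, excluded_on_full_text) < 0:
        raise ValueError("counts are inconsistent")
    return {
        "full_text_assessed": full_text_assessed,
        "excluded_on_full_text": excluded_on_full_text,  # derived, not stated in the text
        "total_included": total_included,
    }


# Counts taken from the paragraph above.
flow = screening_flow(after_duplicates=3140, excluded_on_title_abstract=3058,
                      met_criteria=44, added_from_extra_search=3)

if __name__ == "__main__":
    for name, count in flow.items():
        print(f"{name}: {count}")
```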

Fig. 1 Flow Diagram of Review Process

Description of included studies

Table 1 presents the characteristics of the 47 included studies. The earliest study was published in 1993, while the remaining articles were published between 2002 and 2019, with 28% published in the past 5 years. All studies except the retrospective cohort study of Ul-Haq et al. [75] were prospective cohort studies. The included studies were conducted in USA (34%), UK (9%), Australia (6%), Canada (6%), Spain (6%), Taiwan (6%), Belgium (4%), Finland (4%), Scotland (4%), Sweden (4%), Bangladesh (2%), China (2%), Germany (2%), South Korea (2%), Italy (2%), Norway (2%), and South Africa (2%). The sample sizes of the included studies ranged from 171 [41] to 559,985 [40]; 14 studies had a sample size of less than 1000, 17 studies between 1000 and 10,000, 13 studies between 10,000 and 100,000, and the remaining three studies [38, 40, 53] had a sample size of more than 100,000 participants. Five studies included only males [41, 42, 54, 71, 73] and three studies only females [56, 59, 74]. In the remaining 39 studies, women made up between 3% and 78% of participants. The follow-up periods of the studies varied between 9 months [72] and 18 years [73].

Table 1 Characteristics of the 47 included studies

This review included a variety of different QoL measures, and about half of the included studies (24 studies) measured QoL using the Short Form 36 (SF-36) (Tables 1 and 2). Of the 47 articles included in this review (Table 1), some involved the same cohorts and, in several cases, likely the same participants. Subsequent publications often reported effect estimates over different lengths of follow-up or using different QoL tools. Two articles by De Buyser et al. reported the results of the same population-based cohort study [41, 42]; three articles by De Salvo et al. and Fan et al. were from the same study and included participants enrolled in the Veterans Affairs Ambulatory Care Quality Improvement Project [24, 43, 47]; two studies by Mold et al. and Lawler et al. used the same community-dwelling cohort [57, 61]; two studies by Higueras-Fresnillo et al. and Otero-Rodriguez et al. were from the same Spanish cohort [52, 67]; two studies by Feeny et al. and Kaplan et al. were from the same Canadian cohort [48, 55]; and Myint et al. published three articles [26, 64, 65] with different perspectives on the same population-based study. Additionally, the study by Liira et al. [29] included eight individual cohorts; however, only five of these met the inclusion criteria for the current systematic review and are therefore shown in Table 1.

Table 2 Quality of life scales included in the systematic review

Risk of Bias assessment

The methodological quality of included studies based on the NOS ranged between five and nine stars. Seven of the included studies were of high methodological quality, with nine stars. The ten studies with fewer than seven stars scored most poorly on the items assessing how representative the cohort was of the overall population being sampled and whether the analysis adjusted for potential confounding factors (see Supplementary Tables S6-S7, Additional File 1).

Qualitative synthesis

Of the 47 included studies, 43 (91.5%) reported that, for at least one of the domains examined, better QoL was associated with lower mortality risk (Table 1). Of the 33 studies which assessed physical HRQoL (nine assessed physical HRQoL exclusively), 30 (91%) reported that better HRQoL was associated with lower mortality risk. Among the 23 studies which examined mental HRQoL (one assessed MCS exclusively), 13 (57%) reported that higher mental HRQoL was associated with decreased mortality risk (Table 1). The five studies [49, 52, 57, 59, 76] that measured HRQoL using the SF-36 or SF-20 reported not only the physical functioning and mental health domains, but also general health perception, bodily pain, vitality, and social functioning. The findings were generally consistent for general health perception and social functioning: better levels of both were associated with decreased mortality risk (Table 1).

The mortality risk estimates of the studies which were not included in the meta-analyses are shown in Tables 3, 4 and 5. Eighteen of the 20 studies which measured the PCS using the SF-36 or SF-12, or the physical functioning subscale using the SF-36, RAND-36, or SF-20, reported these to be predictors of mortality risk, with better physical health being associated with lower mortality risk (Table 3). Nine of the 16 studies which assessed the MCS or mental health subscale using the SF-36 or SF-12 showed that better mental health was associated with lower mortality risk (Table 4). Twelve of the 15 studies that measured the association between QoL and mortality risk using measures other than the SF or RAND instruments found that higher QoL scores were associated with lower mortality risk (Table 5).

Table 3 Physical component score / physical functioning as predictors of all-cause mortality
Table 4 Mental component score / mental health as predictors of all-cause mortality
Table 5 QoL measures other than SF / RAND as predictors of all-cause mortality

Meta-analyses

Four studies including 53,642 participants [23, 24, 60, 70] measured QoL using the SF-36, examined the association between the PCS and all-cause mortality, and provided estimates from logistic regression analysis (OR or RR). With an average 1.8-year follow-up, a one-unit increase in the SF-36 PCS was associated with a 5% decrease in all-cause mortality (pooled OR/RR = 0.950; 95% CI: 0.935 to 0.965; P-value < 0.001). There was substantial heterogeneity between studies (I2 = 82.1%; P-value = 0.001) (Fig. 2-a).

Fig. 2 Forest plot of all-cause mortality risk per one-unit increase in (a) SF-36 PCS, (b) SF-36 Physical Functioning, (c) SF-36 MCS. CI = confidence interval; FU (yrs) = follow-up in years; N = sample size; OR = odds ratio; RR = relative risk; HR = hazard ratio

Six studies including 22,570 participants [42, 46, 57, 59, 68, 76] measured QoL using the SF-36 and investigated the association between physical functioning (PF) and all-cause mortality using time-to-event survival analysis. With an average 8.7-year follow-up, a one-unit increase in the SF-36 PF was associated with a 1.3% decrease in the hazard of all-cause mortality (pooled HR = 0.987; 95% CI: 0.982 to 0.992; P-value < 0.001). There was substantial heterogeneity between studies (I2 = 83.8%; P-value < 0.001) (Fig. 2-b).

Four studies including 53,642 participants [23, 24, 60, 70] measured QoL using the SF-36, examined the association between the MCS and all-cause mortality, and reported estimates from logistic regression analysis (OR or RR). With an average 1.8-year follow-up, a one-unit increase in the SF-36 MCS was associated with a 2% decrease in all-cause mortality (pooled OR/RR = 0.980; 95% CI: 0.969 to 0.992; P-value = 0.001). There was substantial heterogeneity between studies (I2 = 75.9%; P-value = 0.01) (Fig. 2-c).

Given the heterogeneity identified in the three meta-analyses described above, the influence of individual studies on the pooled risk estimate was assessed. Removing any single study did not change the association (Supplementary Tables S8-S10, Additional File 1).
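The influence analysis in this review was run with the metaninf command of STATA. As an illustration of the underlying idea only, the Python sketch below re-pools the studies with each one omitted in turn, using a compact DerSimonian-Laird estimator. It is not a reimplementation of metaninf, and the study labels and estimates are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def pool_random_effects(studies):
    """DerSimonian-Laird pooled ratio for (estimate, lower, upper) tuples."""
    y = [math.log(est) for est, _, _ in studies]
    v = [((math.log(up) - math.log(lo)) / (2 * Z_95)) ** 2 for _, lo, up in studies]
    w = [1.0 / vi for vi in v]
    sum_w = sum(w)
    fixed_mean = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, y))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(studies) - 1)) / c)
    w_re = [1.0 / (vi + tau2) for vi in v]
    return math.exp(sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re))


def leave_one_out(labelled_studies):
    """Return (omitted label, pooled ratio without that study) pairs."""
    if len(labelled_studies) < 3:
        raise ValueError("need at least three studies to omit one and still pool")
    results = []
    for i, (label, _) in enumerate(labelled_studies):
        remaining = [s for j, (_, s) in enumerate(labelled_studies) if j != i]
        results.append((label, pool_random_effects(remaining)))
    return results


# Hypothetical studies (not data from this review): label, (estimate, lower, upper).
EXAMPLE = [
    ("Study A", (0.93, 0.90, 0.96)),
    ("Study B", (0.96, 0.94, 0.98)),
    ("Study C", (0.95, 0.91, 0.99)),
    ("Study D", (0.97, 0.95, 0.99)),
]

if __name__ == "__main__":
    for omitted, pooled in leave_one_out(EXAMPLE):
        print(f"omitting {omitted}: pooled estimate = {pooled:.3f}")
```

If the pooled estimate stays on the same side of one and of similar size whichever study is omitted, no single study is driving the result.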

Five individual Finnish cohorts from the Liira et al. study, including 2377 participants [29], measured QoL using the 15D index and explored its association with all-cause mortality using time-to-event survival analysis. With an average 2-year follow-up, a one-SD (0.14) increase in the 15D index was associated with a 36.7% decrease in all-cause mortality (pooled HR = 0.633; 95% CI: 0.514 to 0.780; P-value < 0.001). There was moderate heterogeneity between studies (I2 = 49.4%; P-value = 0.10) (Fig. 3).

Fig. 3 Forest plot of all-cause mortality risk per one-SD (0.14) increase in 15D index. CI = confidence interval; FU (yrs) = follow-up in years; HR = hazard ratio; N = sample size

The funnel plots used to assess publication bias are presented in Supplementary Figures S1-S4, Additional File 1. For three of the four meta-analyses, there was no strong evidence of publication bias; for the meta-analysis of MCS, however, Egger’s test was statistically significant (P = 0.04).
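Egger's test regresses each study's standardized effect (log effect divided by its standard error) on its precision (one divided by the standard error); an intercept far from zero suggests funnel plot asymmetry. The sketch below shows that regression using ordinary least squares. It is an illustration and not the STATA routine used in this review, and it returns the t statistic and degrees of freedom without a P-value, which would require a t distribution from a statistics library. The input values are hypothetical.

```python
import math


def egger_regression(log_effects, standard_errors):
    """Regress standardized effects on precision and test the intercept."""
    n = len(log_effects)
    if n != len(standard_errors) or n < 3:
        raise ValueError("need at least three studies with matching inputs")
    x = [1.0 / se for se in standard_errors]                      # precision
    z = [y / se for y, se in zip(log_effects, standard_errors)]   # standardized effect

    x_bar, z_bar = sum(x) / n, sum(z) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    slope = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z)) / sxx
    intercept = z_bar - slope * x_bar

    # Residual variance and the standard error of the intercept.
    residual_ss = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    s2 = residual_ss / (n - 2)
    se_intercept = math.sqrt(s2 * (1.0 / n + x_bar ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return {"intercept": intercept, "slope": slope,
            "se_intercept": se_intercept, "t": t_stat, "df": n - 2}


if __name__ == "__main__":
    # Hypothetical log odds ratios and standard errors (not data from this review).
    effects = [-0.073, -0.041, -0.051, -0.030]
    errors = [0.016, 0.011, 0.021, 0.010]
    print(egger_regression(effects, errors))
```

With fewer than about ten studies the test has low power, so a non-significant intercept is weak evidence of symmetry.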

Discussion

This systematic review is the first to investigate the association between QoL and mortality in community-dwelling individuals with or without health conditions, rather than in hospital patients or people in assisted living. It summarizes the findings from 47 studies including approximately 1,200,000 individuals aged predominantly 65 years and older (age range 18–101 years), with 46 studies (98%) conducted in high-income or upper-middle-income countries. Overall, thirteen different instruments were used to assess the association between QoL, or more specifically HRQoL, and mortality risk after 9 months to 18 years of follow-up, with the SF-36 or its derivatives (RAND-36, SF-20, SF-6D) most commonly used. Of the 47 included studies, 43 (91.5%) reported that, for at least one of the domains examined, better QoL was associated with lower mortality risk. This was also supported by the results of four meta-analyses (11 studies, n = 78,589) of the PCS, physical function and MCS domains of the SF-36, and of 15D HRQoL.

Our findings are in line with a previous pooled analysis [29] of eight heterogeneous Finnish cohorts, which used the 15D HRQoL measure and included a wide range of participants, from community-dwelling individuals with or without morbidity (such as cardiovascular disease or dementia) to hospitalized patients with delirium. That analysis also found that the 15D HRQoL measure was associated with two-year survival, with a stronger association (a lower hazard ratio) than that found in our study (HR per 1-SD = 0.44, 95% CI 0.40 to 0.48) [29]. This difference may relate to their inclusion of patient groups in generally poorer health, while our systematic review focused on the community-dwelling population. Moreover, our findings in the general non-patient population are also comparable with studies of people with specific diseases such as cancer and chronic kidney disease, which reported QoL to be a predictor of mortality risk [19,20,21].

The findings of the present study are also consistent with those of a recent population-based systematic review by Makovski et al. (2019), which investigated the association between QoL and multimorbidity [78]. They observed a stronger relationship with multimorbidity for the physical domain of QoL than for the mental domain (overall decline in QoL per additional disease = −4.37, 95% CI −7.13% to −1.61% for the WHOQoL-BREF physical domain and −1.57, 95% CI −2.70% to −0.44% for the WHOQoL-BREF mental domain) [78]. These findings align with the results of the present study, where the meta-analysis indicated a stronger effect size for PCS than for MCS using the SF-36 tool (pooled OR/RR = 0.950; 95% CI: 0.935 to 0.965 for PCS; and pooled OR/RR = 0.980; 95% CI: 0.969 to 0.992 for MCS). Since physical health is generally recognised as a strong risk factor for comorbidity, hospitalisations and mortality [79,80,81,82], our findings add further support to the predictive capacity of physical HRQoL for mortality risk. As with objective health measures such as body mass index, glycaemia, and blood pressure, these findings highlight the utility of assessing physical HRQoL in general clinical practice to help identify individuals at greatest risk of death [83].

Given the evidence regarding the longitudinal relationship between QoL and mortality risk, the use of a QoL tool in general care may help improve patients’ health, which in turn could decrease mortality. Furthermore, mental health issues such as depression or anxiety could also be identified through QoL measures, enabling early mental health interventions which in turn could improve the long-term QoL of individuals. Hence, the findings of this review can help to increase the efficacy of disease prevention strategies in older people by identifying individuals at higher risk of adverse health outcomes in general practice / primary health settings. However, mortality risk prediction by QoL might be less relevant to younger healthy populations, although generic QoL measures were designed to be used across a wide range of populations [84]. Further studies are needed, in particular to better understand the influence of gender on these associations, and whether they differ between males and females. Understanding these specific relationships could help identify which groups are most at risk and enable interventions to be targeted to these individuals.

Strengths of the review

A strength of this systematic review is that it was performed in a rigorous manner, adhering to strict systematic review guidelines. The protocol was registered with the International Prospective Register of Systematic Reviews (PROSPERO), and the review was undertaken in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement. A reproducible and rigorous search strategy using three electronic databases was used, which helped ensure that all relevant articles were included. The literature screening was independently performed by two reviewers, who were also involved in the data extraction and the methodological quality assessment of the included studies in accordance with the NOS. Based on the NOS, all studies received at least five out of nine stars, which indicates that there was generally a low risk of bias. Similarly, most studies provided risk estimates that controlled for important factors including current health and socio-economic status. Since our review criteria were not limited to articles using the most common QoL (or HRQoL) tools such as the SF-36, the generalisability of the findings is increased. Therefore, this review has a broad and comprehensive perspective, with results that are rigorous and can be reproduced.

Limitations of the review

Among included articles, large heterogeneity was observed in terms of country of origin, participant characteristics, and evaluation of QoL. The majority of the included articles were conducted in English-speaking countries, and the restriction to English-language articles as part of our inclusion criteria may affect the generalisability of these findings. Since the different standard QoL tools examine different aspects [33, 85] and are not directly comparable, comparing the included studies in the data synthesis was difficult. There were also some differences in the way the data analysis was performed and the results were presented, for example reporting OR versus HR. In addition, some articles reported risk estimates by comparing categorical QoL groups, while others provided risk estimates per change of one or more units on a continuous scale. Hence, the different nature of each QoL scale and the inconsistency in risk comparison precluded us from including some articles in the meta-analyses. As such, only 11 studies were included across the four meta-analyses of this systematic review, and the meta-analyses still showed substantial heterogeneity. Therefore, the overall effect estimates should be interpreted with caution. Moreover, since each meta-analysis included fewer than 10 studies, the results of the funnel plots and Egger’s test should also be interpreted with caution. Of particular interest here, it has commonly been reported that gender differences exist in QoL and that women of all age groups have lower QoL than their male counterparts [86,87,88,89,90]. However, in this review, it was not possible to perform statistical pooling by gender and age group due to the different reporting strategies of the reviewed studies.

Finally, it is important to consider that although studies of mortality are not directly affected by reverse causation, individuals whose health declines severely before death would likely report a decreased HRQoL. An ideal study design would exclude individuals who died in the first year of the study or, at least, include a sensitivity analysis to ensure these early deaths were not driving the results. Most of the studies included in this review did not undertake such analyses. Furthermore, around 10% of the included studies had very short follow-up periods of less than 2 years.

Conclusion

This is the first systematic review and meta-analysis to determine whether QoL is associated with mortality in the general non-patient population. In summary, the findings provide evidence that better QoL or HRQoL, measured by different tools, was associated with lower mortality risk in the general population. Therefore, our findings could be applied more generally to QoL or HRQoL assessed using different instruments. This review indicates that QoL measures can be considered as potential screening tools beyond the existing traditional clinical assessment of mortality risk. Additionally, our results encourage clinicians to incorporate QoL measures into the routine data collection of health systems, which in turn could enable early primary health care for people at high risk of premature death. Furthermore, this study adds further support to the predictive capacity of physical HRQoL for mortality risk. Additional research is needed to determine whether these associations differ by gender and in populations in low- and lower-middle-income countries, which face a double burden of infectious and chronic diseases and have difficulty accessing quality health services. Ultimately, these findings suggest the utility of QoL measures to help identify populations at greatest risk of mortality who might benefit most from routine screening in general practice and possible interventions.