Background

Subjective health measures are widely used in clinical and epidemiological research, as well as in health policy settings. They can be assessed easily with a single-item self-rated health (SRH) question or more thoroughly with health-related quality of life (HRQoL) instruments. The Short Form Health Survey (SF-36) is a well-documented and validated HRQoL instrument [1-3], with the SF-12 developed as a shorter alternative. Because it offers two summary scores, the physical (PCS-12) and mental (MCS-12) component summaries, the SF-12 has been extensively applied in epidemiological studies [4].

In particular, the relationship between SRH and mortality has been repeatedly reported [5-8], suggesting that a single measure of SRH is a strong predictor of poor overall health status and increased mortality risk [9]. However, previous studies are limited in that the associations between subjective health and mortality were assessed in elderly [10-12] or disease-specific patient populations, including patients with cancer [13, 14], diabetes mellitus [15], coronary artery disease [16, 17], respiratory disease [18, 19], chronic kidney disease [20], or HIV infection [21]. Besides these limitations, the impact and comparative predictive performance of different biomarkers on the association between subjective health measures and mortality risk are largely unknown. This gap is all the more notable because multi-biomarker approaches have recently gained widespread attention as powerful predictors of clinical [22, 23] and subclinical outcomes [24].

The present study aims to investigate the impact and comparative predictive performance of a multi-biomarker panel on the association of subjective health with mortality, analyzing data from the 10-year follow-up of the population-based Study of Health in Pomerania (SHIP).

Methods

Study population

The SHIP is a population-based cohort study in West Pomerania, a region in north-eastern Germany comprising the cities of Greifswald, Stralsund, and Anklam and 29 surrounding communities with a total of 212,157 residents [25, 26]. A representative sample of 7,008 adults aged 20 to 79 years was invited to participate. A two-stage cluster sampling method, adopted from the WHO MONICA project in Germany (Augsburg), yielded twelve five-year age strata for both genders, each including 292 individuals, in a total of 34 towns or villages. Only individuals with German citizenship and main residency in the study area were included. The net sample (without migrated or deceased persons) comprised 6,267 eligible subjects, of whom 4,308 participated (response proportion 68.8%). Data collection was performed in two examination centers (Greifswald and Stralsund) between October 1997 and May 2001 after written consent was obtained from each participant. The study conformed to the ethical guidelines of the Declaration of Helsinki as reflected in an a priori approval by the local Ethics Committee of the University of Greifswald. Subjects with missing data for the modelled variables (N = 49) were excluded, yielding a study population of 4,259 individuals.

Measures

A computer-assisted personal interview assessed socio-demographic information including age, gender, educational level (< 10, = 10, or > 10 years of education), civil status (cohabitation), and occupational status (having no paid job, worker, employed, academic/self-employed); health-related behaviour including physical activity (physical training during summers or winters for at least one hour a week), excessive/high-risk alcohol consumption (> 30 g alcohol/day for men and > 20 g alcohol/day for women [27]), smoking habits (current, former, or never-smoker), and diet (gender-specific tertiles from a validated food-frequency questionnaire reflecting food quality [28]); as well as subject's self-reported medical history including hypertension, myocardial infarction, stroke, and diabetes mellitus. Because income is a household-level variable, "equalized" household income (in Euros) was calculated using the commonly adopted procedure of the Luxembourg Income Study to divide the household income by the square root of the number of household members [29]. Somatometric measures included waist circumference (WC), measured to the nearest 0.1 cm using an inelastic tape midway between the lower rib margin and the iliac crest in the horizontal plane with the subject standing comfortably with weight distributed evenly on both feet. We measured HRQoL using the two SF-12 components PCS-12 and MCS-12. To assess SRH, we used the single-item question: "Over the last 12 months would you say your health has been very good, good, fair, poor, or very poor?"
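The income equalization described above can be illustrated with a minimal sketch. The function name and the example values are hypothetical and are not taken from the study data; only the rule itself (household income divided by the square root of the number of household members) comes from the text.

```python
import math


def equalized_income(household_income: float, household_size: int) -> float:
    """Divide household income by the square root of the household size."""
    if household_size < 1:
        raise ValueError("household_size must be at least 1")
    return household_income / math.sqrt(household_size)


# Hypothetical example values (not from the study data).
print(equalized_income(3000.0, 4))  # 3000 / sqrt(4) = 1500.0
print(equalized_income(2000.0, 1))  # a single-person household is unchanged
```

Under this scaling, a four-person household needs twice, not four times, the income of a single person to reach the same equalized value.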

Laboratory assessment of the multi-biomarker panel included measurement of 10 biomarkers from distinct biological pathways associated with increased morbidity and mortality including inflammation [high sensitive C-reactive protein (hs-CRP)], hemostasis (fibrinogen), metabolic disturbances [glycated hemoglobin (HbA1c), total cholesterol, and triglycerides], liver disease [gamma glutamyltransferase (GGT)], kidney disease [urine albumin and glomerular filtration rate (GFR)], thyroid status [thyrotropin (TSH)], and hypothalamic-pituitary-adrenal axis activity [insulin-like growth factor-I (IGF-I)]. Biomarkers were measured as follows: hs-CRP determined immunologically on a Behring Nephelometer II with commercially available reagents from Dade Behring (Dade Behring, Eschborn, Germany); plasma fibrinogen assayed according to Clauss using an Electra 1600 analyzer (Instrumentation Laboratory, Barcelona, Spain); HbA1c determined by high-performance liquid chromatography (Bio-Rad Diamat, Munich, Germany); total cholesterol measured photometrically (Hitachi 704, Roche, Mannheim, Germany); triglyceride determined enzymatically using reagents from Roche Diagnostics (Hitachi 717, Roche Diagnostics, Mannheim, Germany); urine albumin determined on a Behring Nephelometer (Siemens BN albumin; Siemens Healthcare, Marburg, Germany); GGT measured photometrically (Hitachi 717; Roche Diagnostics, Mannheim, Germany); creatinine determined with the Jaffé method (Hitachi 717, Roche Diagnostics, Germany) and the GFR estimated according to the modified MDRD formula [30]; TSH measured by immunochemiluminescent procedures (Byk Sangtec Diagnostica, Frankfurt, Germany); and IGF-I determined by automated two-site chemiluminescence immunoassays (Nichols Advantage; Nichols Institute Diagnostica GmbH, Bad Vilbel, Germany) [26].

Information on vital status was acquired at regular intervals from the time of enrollment into the study through December 15, 2009. Individuals were censored at either death or loss to follow-up. The number of months between baseline examination and censoring was used as follow-up length.

Statistical analysis

Data on quantitative and qualitative characteristics are expressed as median (inter-quartile range) or percent, respectively. Intergroup comparisons with regard to vital status were performed using the χ² test (qualitative data) or the Mann-Whitney U test (quantitative data). PCS-12 and MCS-12 were divided into quartiles to calculate crude incidence rates (per 1000 person-years) and to fit multivariable Cox proportional-hazards regression models associating PCS-12, MCS-12, and SRH with all-cause mortality. Kaplan-Meier survival curves were graphed for SRH and compared using the log-rank test. First, we prespecified gender- and age-adjusted Cox models and alternately included socio-demographic factors (civil status, educational level, occupational status, and equalized income), behavioral factors (smoking status, alcohol consumption, physical activity, food consumption, and WC), comorbidities (hypertension, myocardial infarction, stroke, and diabetes mellitus), and the multi-biomarker panel (hs-CRP, fibrinogen, HbA1c, total cholesterol, triglycerides, GFR, albumin, GGT, TSH, and IGF-I). In secondary analyses, we used backward elimination (p ≥ 0.20 for removal) and forward selection (p < 0.05 for inclusion) procedures to identify a parsimonious adjustment set among all of the applied standard covariates and the multi-biomarker panel.
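As an illustration of the Kaplan-Meier curves mentioned above, the following is a minimal sketch of the product-limit estimator in plain Python. It is not the Stata routine used in the study; the function name and the toy follow-up data are assumptions made for the example, and subjects censored at the same time as a death are treated as still at risk at that time (a common convention).

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  -- follow-up length per subject (e.g. months)
    events -- 1 if the subject died at that time, 0 if censored
    Returns a list of (time, survival_probability) at each death time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    pairs = sorted(zip(times, events))
    n_at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(pairs) and pairs[i][0] == t:
            deaths += pairs[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


# Hypothetical toy data: five subjects, three deaths, two censored.
for t, s in kaplan_meier([2, 3, 3, 5, 8], [1, 0, 1, 1, 0]):
    print(t, round(s, 3))
```

Curves estimated this way for each SRH category are what a log-rank test then compares.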

To compare the predictive performance of the implemented models, we measured the area under the receiver operating characteristic (ROC) curve, or C-statistic, and tested differences between models using Stata's "roccomp" command. The C-statistic ranges from 0.5 (no discrimination) to a theoretical maximum of 1 (perfect discrimination) and is equivalent to the probability that the predicted risk is higher for a case (decedent) than for a non-case (survivor) [31]. Furthermore, we estimated the integrated discrimination improvement (IDI) to examine whether prediction based on a model without the biomarker panel was significantly improved after inclusion of the biomarker panel [32]. In contrast to the net reclassification improvement, which requires a priori meaningful predicted risk categories, the IDI is based on continuous differences in the predicted risk between the new and old models. IDI estimates were obtained with logistic regression models that examined deaths through 10 years of follow-up.
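The two measures described above can be sketched in a few lines of Python. This is an illustrative sketch, not the Stata procedure used in the study: it takes already-computed predicted risks as input, computes the C-statistic as the proportion of decedent/survivor pairs in which the decedent has the higher predicted risk (ties counted as one half), and computes the IDI as the change in the difference of mean predicted risk between decedents and survivors. All function names and example values are hypothetical.

```python
from statistics import mean


def _split(risks, died):
    """Separate predicted risks into decedents (cases) and survivors."""
    cases = [r for r, d in zip(risks, died) if d]
    controls = [r for r, d in zip(risks, died) if not d]
    if not cases or not controls:
        raise ValueError("need at least one decedent and one survivor")
    return cases, controls


def c_statistic(risks, died):
    """Probability that a decedent has a higher predicted risk than a survivor."""
    cases, controls = _split(risks, died)
    score = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                score += 1.0
            elif c == n:
                score += 0.5  # ties count as half a concordant pair
    return score / (len(cases) * len(controls))


def discrimination_slope(risks, died):
    """Mean predicted risk in decedents minus mean predicted risk in survivors."""
    cases, controls = _split(risks, died)
    return mean(cases) - mean(controls)


def idi(old_risks, new_risks, died):
    """Integrated discrimination improvement of the new over the old model."""
    return discrimination_slope(new_risks, died) - discrimination_slope(old_risks, died)


# Hypothetical predicted risks for five subjects (first two died).
died = [1, 1, 0, 0, 0]
old = [0.4, 0.2, 0.3, 0.1, 0.1]
new = [0.6, 0.3, 0.3, 0.1, 0.1]
print(round(c_statistic(old, died), 3), round(c_statistic(new, died), 3))
print(round(idi(old, new, died), 3))
```

The pairwise loop is quadratic in the sample size, which is acceptable for illustration; rank-based formulas give the same value more efficiently for large cohorts.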

We assessed potential effect modification of the investigated associations by age and gender by additionally including multiplicative interaction terms (age and gender * HRQoL and SRH, respectively) in the multivariable Cox models. To account for the potential impact of changing risk factor patterns over time, we entered the applied adjustment sets as time-varying covariates into the analysis. Further sensitivity analyses were performed by recalculating the applied models stratified by 20-year age groups and gender, as well as by adjusting for possible non-response bias using inverse probability weights [33]. Item nonresponse exceeded 5% for the MCS-12 and PCS-12. We therefore used multiple imputation by chained equations (MICE) to obtain completed versions of the incomplete MCS-12 and PCS-12 scores [34, 35]. Item nonresponse for all other variables was less than 2%. We verified that the assumption of proportionality of hazards was satisfied. Hazard ratios (HR) were calculated with 95% confidence intervals (95% CI). We considered two-sided p values < 0.05 to be statistically significant. This manuscript was written in accordance with the STROBE statement, which gives guidelines for the reporting of observational studies [36]. All statistical analyses were performed using Stata 11.0 (Stata Corporation, College Station, TX).

Results

Data on subjective health measures, biomarkers, and covariates are presented in Table 1. During 41,180 person-years (mean, 9.7 years; 25th, 9.3; 75th, 10.7) of follow-up, 456 individuals (10.7%) died, resulting in an overall death rate of 11.1 deaths per 1000 person-years. Crude incidence rates of all-cause mortality decreased across quartiles of PCS-12 but not MCS-12 (Table 2).
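The crude figures reported above follow directly from the counts in the text, as the short check below shows. The function name is hypothetical; the inputs (456 deaths, 41,180 person-years, 4,259 participants) are those stated in the Methods and Results.

```python
def rate_per_1000_person_years(events: int, person_years: float) -> float:
    """Crude incidence rate expressed per 1000 person-years."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return 1000.0 * events / person_years


deaths = 456
person_years = 41180
cohort_size = 4259

print(round(rate_per_1000_person_years(deaths, person_years), 1))  # 11.1
print(round(100.0 * deaths / cohort_size, 1))  # 10.7 (% of cohort deceased)
```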

Table 1 Baseline characteristics of the study population stratified by vital status.
Table 2 Crude incidence rates of all-cause mortality by quartiles of PCS-12 and MCS-12.

In Cox proportional-hazards models adjusted for gender and age, we found a distinct association between low PCS-12 scores and all-cause mortality: subjects with PCS-12 scores in the lowest quartile had an increased mortality risk (HR, 1.75; 95% CI, 1.31-2.33) compared to subjects in the highest quartile. The inclusion of potentially confounding socio-demographic factors (HR, 1.63; 95% CI, 1.22-2.17), behavioral factors (HR, 1.78; 95% CI, 1.33-2.40), comorbidities (HR, 1.60; 95% CI, 1.19-2.14), or the multi-biomarker panel (HR, 1.64; 95% CI, 1.19-2.27) changed the estimates only slightly (Table 3). P for trend statistics confirmed that HRs increased linearly across PCS-12 quartiles in all applied models (p < 0.001). Cox proportional-hazards analyses for low MCS-12 scores did not yield any associations with all-cause mortality (Table 3). We further fitted Cox models for the association of SRH with all-cause mortality and found that subjects reporting "poor" or "very poor" SRH had a twofold higher mortality risk (HR, 2.07; 95% CI, 1.34-3.20) compared to subjects reporting "good" or "very good" SRH. Again, additional adjustment altered the relationship only slightly (Table 4). The complete regression results are given in Additional file 1. Kaplan-Meier survival curves additionally indicated that subjects with "fair", "poor", or "very poor" SRH had significantly shorter survival times (log-rank test: p < 0.001) than subjects who reported their SRH as "good" or "very good" (Figure 1).

Table 3 Adjusted hazard ratios (HR, 95% CI) for quartiles of PCS-12 and MCS-12 associated with all-cause mortality.
Table 4 Adjusted hazard ratios (HR, 95% CI) for self-rated health (SRH) associated with all-cause mortality.
Figure 1 Kaplan-Meier survival curves for self-rated health associated with 10-year mortality risk. Subjects with "fair", "poor", or "very poor" self-rated health had a significantly shorter survival compared to subjects with "good" or "very good" self-rated health (log-rank test; p < 0.001).

Initially, we compared the discriminatory power of the complete self-reported measures panel (incorporating HRQoL, SRH, socio-demographic, behavioral, and comorbidity measures) with that of the multi-biomarker panel (age, gender, and all 10 biomarkers). The ROC curves depicted in Figure 2 illustrate the significantly better discriminatory power of the self-reported measures (C-statistic of 0.883) compared to the biomarker panel (0.872). To evaluate the added discriminatory power in multivariable Cox models, we implemented a Cox model including only age and gender, which yielded a significantly lower C-statistic of 0.843 (p < 0.001). To define a parsimonious adjustment set for each panel, we applied variable selection procedures, which identified gender, age, occupational status, educational level, cohabitation, smoking, WC, and history of stroke and diabetes mellitus as relevant socio-demographic and behavioral covariates for mortality risk prediction (model 1), and fibrinogen, HbA1c, albumin, and GGT as the most informative biomarkers (model 2). The calculated C-statistics confirmed the better discriminatory power of model 1 (incorporating only SRH and the selected socio-demographic and behavioral covariates) compared to model 2 (age, gender, and the selected biomarkers), with a significantly higher C-statistic of 0.883 vs. 0.873 (p = 0.010). Finally, we combined both reduced models (SRH, selected covariates, and selected biomarkers) and detected the best discriminatory power, with a significantly higher C-statistic of 0.887 (p < 0.001) compared to the separate assessment of model 1 (0.883) or model 2 (0.873). We confirmed this finding with a highly significant IDI, estimated at 1.5% (p < 0.001).

Figure 2 Receiver-operating characteristic (ROC) curves for mortality risk predicted by subjective health measures and covariates vs. multi-biomarker panel. The Cox model including subjective health measures and covariates incorporated age, gender, self-rated health, PCS-12, MCS-12, civil status, educational level, occupational status, equalized income, smoking status, alcohol consumption, physical activity, food consumption, waist circumference, previous history of hypertension, myocardial infarction, stroke, and diabetes mellitus vs. a multi-biomarker panel including age, gender, high sensitive C-reactive protein, fibrinogen, glycated hemoglobin, total cholesterol, triglycerides, glomerular filtration rate, albumin, gamma glutamyltransferase, thyrotropin, and insulin-like growth factor-I.

Sensitivity analyses with the inclusion of multiplicative interaction terms did not yield any significant effect modification by gender or age (at the p < 0.05 level). Furthermore, we found virtually no differences in the risk estimates between Cox models with and without time-varying covariates. Stratified analyses revealed that the associations of low PCS-12 (HR, 3.61; 95% CI, 1.43-9.14) and poor SRH (HR, 4.37; 95% CI, 1.27-15.04) with all-cause mortality were especially pronounced in middle-aged men (40-59 years). Finally, the additional inclusion of non-response weights altered the estimates only slightly (data not shown).

Discussion

Principal findings

The present study investigated the associations between subjective health, multiple biomarkers, and mortality risk, and offers three principal findings. First, we found that poor SRH and low PCS-12 scores were significantly associated with increased risk of all-cause mortality, independent of a broad spectrum of standard covariates and multiple biomarkers. Second, we found that a risk assessment using subjective health instruments and standard covariates yielded a better mortality risk prediction than a multi-biomarker panel. Finally, we were able to show that the most accurate mortality risk prediction was obtained from a combined assessment of subjective health and biomarkers.

HRQoL & mortality

Our risk estimates for the association between low PCS-12 and mortality were similar to those previously reported among Taiwanese community-dwelling elderly [12], but much smaller than estimates among American community-dwelling elderly [10]. We were not able to detect any association of MCS-12 with all-cause mortality in the present study. This result is in line with previous studies based on particular disease groups [17, 19, 20, 37-39] or community cohorts [10-12], which similarly found PCS-12 but not MCS-12 associated with mortality after multivariable adjustment. Thus, our findings support the notion that physical domains of HRQoL measures are more tightly related to mortality than mental domains [14, 40, 41]. As a potential explanation, it is possible that despite the high concordance between PCS-12/MCS-12 and SF-36 scores [2], particular aspects of mental functioning are not captured, ultimately leading to absent associations. However, it has been shown that the mental health status contributes to the PCS-12 as opposed to MCS-12, most likely due to the strong interrelationship between physical and mental domains of health [42]. Thus, the strength of the relationship between mental HRQoL and health outcomes may be diluted due to limitations in the applied MCS-12 metric.

SRH & mortality

Our observed effect sizes between SRH and mortality reflect fairly well the range found in the literature, although the confidence intervals are somewhat wider [7, 43]. Our results are in line with previous investigations suggesting SRH not only as an independent predictor of mortality risk, but also as a stronger predictor than HRQoL [40, 44]. Because we and previous studies found the relationship between SRH and mortality to be stronger in men than in women [6, 45], and stronger in younger than older individuals [46], we conducted sensitivity analyses incorporating interaction terms for gender and age, but without detecting any significant effect modification.

Biomarker panel

To the best of our knowledge, this is the first population-based study to systematically assess the impact of a comprehensive multi-biomarker panel on the subjective health-mortality association. Our finding of an association between poor SRH and low PCS-12 scores with increased mortality risk was independent of a broad spectrum of multiple biomarkers from distinct biological pathways. This finding is in line with previous results from a five-year follow-up of 4,065 individuals aged 71 years or older, showing SRH significantly associated with mortality risk after adjustment for various biomarkers including albumin, white blood cell count, hemoglobin, high-density lipoprotein cholesterol, and creatinine [47]. It is important to note that this previous study included biomarkers only for additional adjustment, but not to assess their comparative predictive performance with regard to mortality risk.

When we applied variable selection procedures to determine which of the 10 biomarkers were most predictive of mortality risk, we identified fibrinogen, HbA1c, albumin, and GGT. These pathophysiologically highly plausible candidate biomarkers reflect disturbances in hemostasis (fibrinogen), metabolism (HbA1c), kidney disease (albumin), and liver disease (GGT). Their predictive ability was similar to that of the full biomarker panel, suggesting that this parsimonious set of biomarkers allows more efficient risk stratification. Even more interestingly, the predictive performance of these four selected biomarkers was nearly as good as that of a risk assessment based on subjective health instruments and standard covariates. While previous studies accumulated evidence that subjective health is a powerful mortality risk predictor, the present findings add to the existing literature by indicating that this selected biomarker set plus information about gender and age does about as well. Finally, we were able to show that the combined assessment of subjective health and biomarkers significantly improved the discriminatory ability of the mortality risk prediction model beyond that of each separate panel. There is only one previous study, among US veterans, that investigated the predictive power of PCS-12 [43]. Compared to our estimates, it reported a lower C-statistic of 0.73, which could be explained by sampling artifacts, as veterans tend to be older and include a higher proportion of males than general populations. Overall, our results suggest that the physiological effects of subjective health measures are synergistic with those captured by the biomarker panel, and a combined assessment was identified as the most sensitive barometer of physiologic states associated with increased mortality risk.

Strengths and limitations

The strengths of our study include a prospective population-based sample of adults aged 20-79 years, a completed 10-year follow-up period utilizing valid and reliable mortality data based on a national death register, comprehensive subjective health and covariable assessment, as well as a broad multi-biomarker panel. Nonetheless, the present study has several limitations. First, non-response bias may exist in this sample, because non-response is particularly relevant for self-report of mental properties. However, when we performed sensitivity analyses accounting for non-response bias, we detected only minimal deviations from our main results. Second, this study sampled a healthy adult population, with only 11% of participants deceased over the follow-up period. Although this proportion is lower than in several previous studies, a comprehensive review of single-item measured SRH and mortality concluded that studies with lower than 10% mortality found effect sizes similar to those of studies with higher mortality rates [7]. Even a study with only 5% decedents demonstrated similar effect sizes for the association between SF-12 derived HRQoL and mortality risk [12]. Third, changes in HRQoL over time have been suggested to be as important as the actual baseline value in terms of mortality risk prediction. Studies in selected populations showed that individuals moving from low to high SF-36 scores (improvement) had mortality patterns similar to those of individuals who scored high over time [11, 48]. Although this analytical approach would have been a valuable extension of the present study based on baseline measurements, it is as yet unclear how a change in HRQoL over time is related to mortality in the general population.

Conclusions

The present results from a large population-based epidemiological study demonstrate that the association of poor SRH and low PCS-12 with mortality is independent of a broad spectrum of standard covariates and multiple biomarkers from distinct biological pathways. But the key finding of the present study is that a small set of biomarkers conjointly with subjective health instruments and socio-demographic standard measures significantly improved the mortality risk prediction above and beyond a separate assessment.

While the first finding is more confirmatory in nature, the latter holds important implications from epidemiological and public health perspectives. Incorporating both subjective health assessment and a small set of biomarkers into routine data collection could be used to monitor population health, especially to identify subpopulations particularly at risk. In doing so, the improved identification of high-risk individuals could increase the efficacy of disease prevention strategies. However, further research must be conducted to elucidate how subjective health and biomarkers are interrelated and interdependent. Carrying the present results forward to cause-specific mortality and morbidity will lead to a better understanding of these relationships.

Disclosures

The authors have nothing to disclose.