Background

Self-reported general health (SRGH) is the most widely used measure of health in both population and clinical health surveys and the most frequent tool for health comparisons between populations. A Medline literature search showed that in the year 2002, 1,991 scientific papers were published using this question [1]. Most of these studies relied on the standard ‘In general, how would you rate your health?’ question answered on a five-point Likert-type scale, with response options of either very bad, bad, fair, good, very good, or poor, fair, good, very good, excellent. This question is also included in widely used questionnaires, such as the Short-Form 36 [2] and the European Organization for Research and Treatment of Cancer Quality of Life Questionnaire [3].

The studies using SRGH usually belong to one of two types: SRGH is used either as a predictor of specific health outcomes, such as mortality [4, 5], social-psychological well-being [6, 7], morbidity [8, 9] and health care utilisation [10], or as an outcome of other factors such as medical diagnoses, physical symptoms and functioning [11], social role activities, social relationships [12] and emotional factors [13].

SRGH is not, however, an approach to measuring health that fits all purposes. Salomon et al. [14] claim that SRGH may not be suitable for tracking changes in population health over time or for comparing the level of health of subpopulations.

We claim in this paper that one reason to question the validity of SRGH for tracking health over time and for cross-population comparability involves the different meanings of health that respondents have in mind when answering the SRGH question.

To test what respondents have in mind when answering the SRGH question, qualitative studies are a good place to begin. These studies scrutinize what respondents are thinking about when answering. Some of these studies have already shown that SRGH is a multidimensional construct and that the perception of health is determined not only by the presence or the absence of health problems (that is, biological health), but also by one or more of the following factors: (1) functional factors - the extent to which people are able to perform actions and tasks; (2) coping factors - the person’s level of adaptability, or his or her attitudes towards the health condition, and (3) wellbeing factors - their emotions or feelings [15]. These qualitative studies also suggest that it is very important to anchor the assessment of SRGH to age, gender and time [15, 16].

Bearing in mind the value of these studies, the question we wish to answer in this investigation is whether we can psychometrically study what respondents have in mind when answering the SRGH question. To address this question we will use the conceptual basis of the International Classification of Functioning, Disability and Health (ICF) of the World Health Organization (WHO) [17]. According to the ICF model, the construct of capacity reflects the intrinsic ability of a person to do an action or execute a task, independent of the positive or negative influence of the person’s physical, attitudinal or social environment. The construct of performance, on the other hand, refers to health in terms of what one’s level of capacity in different functioning domains allows one to do in life, taking full account of the impact, positive or negative, of one’s environment, such as the assistive devices one may use. Health in the sense of capacity is what we mean by ‘biological health’ and performance is what we mean by ‘lived health’. The ICF provides the best framework to describe and measure people’s limitations and restrictions and was explicitly not intended to measure quality of life understood as how people feel about these limitations and restrictions.

For this investigation, we selected a population-based study, namely the 2008 Spanish National Disability Survey. We selected this study because it captured both the concepts of biological health and lived health, making it possible for us to answer the question whether SRGH is more related to one or the other. The questionnaires used for this survey contained the SRGH question as well as questions about the extent of problems in different domains of functioning, with and without assistive devices or personal assistance. We believe that questions about the extent of problems in domains of functioning without any aids or personal assistance capture biological health, whereas questions about the extent of problems in the same domains but taking into consideration personal or technical assistance address the concept of lived health. The aim of this study is therefore to determine whether biological health or lived health is more predictive of SRGH.

Methods

Study design and participants

This is a psychometric study using cross-sectional data from the Spanish National Disability Survey from 2008 (Survey on Disabilities, Independence and Dependence Situations - EDAD). This survey included two residence-based population samples, one community-dwelling and the other institutionalized. The 2008 EDAD design has been described previously [18]. Data were only collected for people who fulfilled the disability criterion of having ‘important limitations to carrying out everyday activities that have lasted, or are expected to last, more than one year, and whose origin is an impairment in one of the following eight domains: seeing, hearing, communication, learning and application of knowledge and development of tasks, mobility, self-care, home life, interactions and interpersonal relationships’.

Variables

Forty-two questions were used to assess the level of difficulty in carrying out activities without any technical aid or personal assistance. In our judgment, these are questions about a person’s biological health. Thirty-one questions assessing the level of difficulty in most of the same activities but taking into account any kind of technical aid or personal assistance were also asked. These we judged to be lived health questions. The ordinal scale used to assess the limitation level consisted of the following response options: 1 = Without difficulty or with little difficulty; 2 = With moderate difficulty; 3 = With severe difficulty and 4 = Cannot carry out the activity.

When people did not use technical assistance or have personal assistance, only the question about the level of difficulty ‘without’ was asked. Additional questions about medical conditions, diagnoses, professional life, education, discrimination, social contacts, accessibility and main caregivers were also asked. The SRGH level was collected using the five-point scale with the response options very bad, bad, fair, good and very good.

Data analysis

The questions referring to vision and hearing were not considered because no differentiation was made between with and without assistive devices or personal assistance. Furthermore, only people who had a difficulty in at least one of the remaining biological health questions were included in the analyses. As a result, 17,739 people from the community-dwelling population and 9,707 from the institutionalized population were kept in the analyses.

We used descriptive statistics to present the characteristics of both study populations, taking sampling weights into account. The response options ‘With moderate difficulty’ and ‘With severe difficulty’ in both biological health and lived health questions showed a low frequency. Thus, we collapsed them into a single option called ‘with moderate/severe difficulty’.
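The collapsing of response options described above can be illustrated with a minimal Python sketch. The 1-4 codes are those listed in the Variables section; the 0-2 target coding and the name `collapse_responses` are assumptions made here for illustration and are not taken from the survey files or the original analysis code.

```python
# Hypothetical recoding of the four EDAD response options (codes 1-4, as
# listed in the Variables section) into three ordered categories.
# The 0-2 target coding is an assumption made for this illustration.
COLLAPSE_MAP = {
    1: 0,  # without difficulty or with little difficulty
    2: 1,  # with moderate difficulty  -> 'with moderate/severe difficulty'
    3: 1,  # with severe difficulty    -> 'with moderate/severe difficulty'
    4: 2,  # cannot carry out the activity
}


def collapse_responses(responses):
    """Recode a list of 1-4 responses onto the collapsed 0-2 scale.

    Unanswered questions (None) are passed through unchanged.
    """
    return [None if r is None else COLLAPSE_MAP[r] for r in responses]
```

With three ordered categories, each question has two thresholds in a graded response model, which is consistent with the two threshold parameters reported per question.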

To answer the question whether biological health or lived health is more predictive of SRGH, we (1) calculated a biological health and a lived health score for each person by constructing a biological health scale (BHS) and a lived health scale (LHS) using the Item Response Theory (IRT) Model called Samejima’s Graded Response Model (GRM); and (2) calculated the variable importance measures using Random Forest with SRGH as the dependent variable and the biological health score and lived health score as independent variables.
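For readers unfamiliar with the GRM, the following Python sketch shows how the model turns a person's latent score into response-category probabilities for one question. It uses the standard logistic form of Samejima's model; the function name and the parameter values in the usage note are hypothetical, and the sketch only evaluates the model for given parameters, it does not estimate them.

```python
import math


def grm_category_probabilities(theta, discrimination, thresholds):
    """Category probabilities P(X = k | theta) under the graded response model.

    theta          -- latent score of a person (here: higher = more difficulty)
    discrimination -- slope parameter of the question
    thresholds     -- increasing threshold parameters, one fewer than categories
    Returns a list with one probability per response category, lowest first.
    """
    # Cumulative probabilities P(X >= k | theta) for k = 1 .. K-1.
    cumulative = [
        1.0 / (1.0 + math.exp(-discrimination * (theta - b))) for b in thresholds
    ]
    # Pad with P(X >= 0) = 1 and P(X >= K) = 0, then take adjacent differences.
    bounds = [1.0] + cumulative + [0.0]
    return [bounds[k] - bounds[k + 1] for k in range(len(bounds) - 1)]
```

For example, `grm_category_probabilities(0.0, 1.5, [-1.0, 1.0])` returns three probabilities that sum to one. A larger discrimination makes the probabilities change more sharply around the thresholds, which is why highly discriminating questions separate people with higher and lower difficulty well.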

For step one, three specific steps were followed:

  a) We evaluated the assumptions of Item Response Theory (IRT) - unidimensionality, local independency and monotonicity - separately for biological health and lived health questions to find out whether IRT could be used for our data. Unidimensionality was examined with bifactor analysis with the analytic bifactor rotations [19, 20]. Local independency was tested by examining the residual correlations among questions in a one-factor confirmatory factor analysis model [21]. We estimated the GRM with and without the flagged locally dependent questions (residual correlations higher than 0.2) to see if results were robust to question dependencies [22]. Monotonicity was studied by examining graphs of the question mean scores conditional on ‘rest-scores’ (i.e. total raw scale score minus the question score). Questions that failed one of these three assumptions were not considered in the final model [23].

  b) Biological health questions and lived health questions that satisfied the IRT assumptions were used to create a BHS and an LHS using GRM [24].

  c) Biological health questions and lived health questions were tested for differential item functioning (DIF) for study population (institutionalized and community-dwelling), gender (male and female), age groups (≤65 and >65) and reported number of health conditions groups (0, 1-2 and >2) using iterative hybrid ordinal logistic regression with change in McFadden’s pseudo R-squared measure (above 0.02) as DIF criterion [25, 26]. Questions showing DIF were calibrated separately for each of the groups showing DIF, and after DIF correction the final GRMs were calculated. Based on the resulting biological health question parameters and lived health question parameters, a summary score of biological health and a summary score of lived health were calculated for each individual in the sample. For a more intuitive summary score, we transformed the resulting scores into values ranging from 0 (best biological health or lived health) to 100 (worst biological health or lived health). For both study populations, the relation between biological health and lived health was studied using Pearson correlation analysis.
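The transformation of the model-based scores to the 0-100 range is not specified further in the text. One common choice, assumed here purely for illustration, is a linear min-max rescaling over the pooled sample; the function name `rescale_to_0_100` is hypothetical.

```python
def rescale_to_0_100(scores):
    """Linearly rescale scores so the minimum maps to 0 and the maximum to 100.

    Assumption: a plain min-max transformation over all supplied scores.
    The order of, and relative distances between, scores are preserved.
    """
    lowest, highest = min(scores), max(scores)
    if highest == lowest:
        raise ValueError("scores must contain at least two distinct values")
    return [100.0 * (s - lowest) / (highest - lowest) for s in scores]
```

Because the transformation is linear, it changes neither the Pearson correlation between the two scores nor the rank order of individuals.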

For step two, for each of the community-dwelling and institutionalized data sets, we (1) studied the association between biological health scores, lived health scores and SRGH using the Spearman correlation coefficient (rS) and box-plots displaying the distribution of biological health scores and lived health scores in each of the five SRGH response options; and (2) compared the importance value of the biological health score with that of the lived health score obtained from Random Forest regression with 1000 trees and mtry = 2, where mtry is the number of randomly preselected independent variables, which in Random Forest are called split variables. Random Forest regression can provide improved prediction accuracy compared with other regression techniques (e.g. logistic or linear regression) because it deals with collinearity and with the main and interaction effects of the independent variables. The variable importance measure is the frequency with which each independent variable (biological health or lived health) appears in the trees calculated to predict the dependent variable (SRGH), averaged over all 1000 trees. It takes values from 0 to 1; the higher the value, the better the prediction of SRGH. The permutation importance was computed with the conditional permutation scheme proposed by Strobl and colleagues, which controls for the correlation of the predictor variables [27].
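The general idea behind permutation-based importance can be sketched in a few lines of Python. Note the simplifications: this is the plain (unconditional) permutation scheme applied to an arbitrary fitted prediction function, not the conditional scheme of Strobl and colleagues used in the study, and it is not tied to Random Forest. All names are hypothetical.

```python
import random


def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def permutation_importance(predict, rows, outcome, column, n_repeats=20, seed=0):
    """Mean increase in prediction error after shuffling one predictor column.

    predict -- function mapping a list of feature rows to a list of predictions
    rows    -- list of feature lists (one list per person)
    outcome -- list of observed outcome values
    column  -- index of the predictor to permute
    """
    rng = random.Random(seed)
    baseline = mean_squared_error(outcome, predict(rows))
    increases = []
    for _ in range(n_repeats):
        shuffled = [row[column] for row in rows]
        rng.shuffle(shuffled)
        # Rebuild the rows with only the chosen column replaced.
        permuted = [
            row[:column] + [value] + row[column + 1:]
            for row, value in zip(rows, shuffled)
        ]
        increases.append(mean_squared_error(outcome, predict(permuted)) - baseline)
    return sum(increases) / n_repeats
```

A predictor that the model does not rely on yields an importance of about zero, while shuffling a predictor that drives the predictions increases the error. With strongly correlated predictors, such as the two health scores here, the unconditional scheme can overstate the importance of a predictor that merely correlates with the relevant one, which is the motivation for the conditional scheme.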

All the analyses were performed with R version 2.15.1 [28].

Results

Characteristics of both study populations are presented in Table 1. In both study populations around 60% were female. Most of the institutionalized people were aged more than 65 years (82%). The percentage of respondents reporting very good or good health is 38.2 in the institutionalized and 20.7 in the community-dwelling population.

Table 1 Characteristics of institutionalized and community-dwelling population

Table 2 shows the biological health and lived health questions considered for BHS and LHS, respectively.

Table 2 Biological health and lived health questions initially considered for the GRM models for each study population: institutionalized and community-dwelling population

Biological health and lived health scores

IRT Assumptions

Unidimensionality

For both BHS and LHS the bifactor analyses supported the assumption of a strong general factor, with all questions loading highly on the general factor. However, questions from the mobility domain of BHS and questions from the communication domain and the learning and application of knowledge and development of tasks domain of LHS loaded higher on their respective group factors than on the general factor. We decided to proceed with a unidimensional BHS and a unidimensional LHS, since these domains contribute to biological health and lived health, respectively. We checked our decision by estimating the GRMs both with and without mobility for BHS, and with and without communication and learning and application of knowledge and development of tasks for LHS, and by analyzing the correlation between the item thresholds of the two models in each case. The results showed that our decision did not affect the results.

Local independency

While the examination of the residual correlations of biological health questions indicated violation of local independency in five groups of questions, the results for lived health questions revealed violation in six. Table 2 shows the locally dependent questions as well as the questions considered in the final models.
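The flagging rule for local dependency (residual correlations higher than 0.2) can be written as a short Python sketch. The residual correlation matrix is assumed to come from a one-factor confirmatory factor analysis fitted elsewhere; the function name, question labels and numbers in the usage note are hypothetical.

```python
def flag_locally_dependent_pairs(residual_corr, names, cutoff=0.2):
    """Return question pairs whose residual correlation exceeds the cutoff.

    residual_corr -- symmetric matrix (list of lists) of residual correlations
    names         -- question labels, in the same order as the matrix rows
    cutoff        -- flagging threshold; 0.2 is the value stated in the Methods
    """
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):  # upper triangle only
            if residual_corr[i][j] > cutoff:
                flagged.append((names[i], names[j], residual_corr[i][j]))
    return flagged
```

For instance, with labels `["q1", "q2", "q3"]` and a residual correlation of 0.25 between `q1` and `q2` only, the function returns the single pair `("q1", "q2", 0.25)`. Questions that appear in several flagged pairs form the groups of dependent questions referred to above.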

Monotonicity

The monotonicity IRT assumption was satisfied by most of the biological health and lived health questions, with the exception of the questions: ‘With what level of difficulty would you say are you able to carry out activities related to menstrual care?’ and ‘With what level of difficulty would you say are you able to drive vehicles?’.
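The rest-score check behind this assessment can be sketched in Python as follows. The study inspected graphs; the sketch instead returns the plotted quantities and a simple numeric check, assumes complete response data, and uses hypothetical function names.

```python
from collections import defaultdict


def mean_item_score_by_rest_score(responses, item):
    """Mean score on one question for each observed rest-score.

    responses -- list of per-person lists of question scores (no missing data)
    item      -- index of the question under study
    Returns (rest_score, mean question score) pairs sorted by rest-score.
    """
    groups = defaultdict(list)
    for person in responses:
        rest_score = sum(person) - person[item]  # total score minus this question
        groups[rest_score].append(person[item])
    return [(rest, sum(v) / len(v)) for rest, v in sorted(groups.items())]


def is_monotone_non_decreasing(pairs, tolerance=0.0):
    """True if the mean question score never drops as the rest-score rises."""
    means = [mean for _, mean in pairs]
    return all(b >= a - tolerance for a, b in zip(means, means[1:]))
```

In practice, sparse rest-score groups are usually merged before judging monotonicity, since means based on a handful of respondents fluctuate; the `tolerance` argument is a crude stand-in for that step.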

Differential item functioning

Table 3 presents the biological health and lived health questions included in the BHS and LHS respectively and their parameter estimates (discrimination and threshold parameters) for the final GRM models. While for BHS 6 questions showed DIF for study population and 11 questions for age groups, for LHS 7 questions showed DIF for study population, and 4 for age groups. All questions of the BHS and of the LHS were free of DIF for gender and number of health conditions.
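The DIF criterion used here, a change in McFadden's pseudo R-squared above 0.02 between nested ordinal logistic regression models, reduces to simple arithmetic once the model log-likelihoods are available. The Python sketch below shows only that final step; fitting the ordinal models is not shown, and the function names and the log-likelihood values in the usage note are hypothetical.

```python
def mcfadden_pseudo_r2(loglik_model, loglik_null):
    """McFadden's pseudo R-squared: 1 - logL(model) / logL(null model)."""
    return 1.0 - loglik_model / loglik_null


def flags_dif(loglik_null, loglik_without_group, loglik_with_group, threshold=0.02):
    """Compare nested models with and without group terms for one question.

    Returns (flagged, change), where change is the gain in pseudo R-squared
    from adding the group terms and flagged is True if it exceeds threshold.
    """
    change = mcfadden_pseudo_r2(loglik_with_group, loglik_null) - mcfadden_pseudo_r2(
        loglik_without_group, loglik_null
    )
    return change > threshold, change
```

As an illustration with made-up values, a null log-likelihood of -1000, a trait-only model at -800 and a model with group terms at -770 give a pseudo R-squared change of about 0.03, so the question would be flagged; with -795 instead of -770 the change is about 0.005 and the question would not be flagged.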

Table 3 Biological health and lived health questions included in the single biological health scale and lived health scale and their parameter estimates (discrimination (Discr) and threshold parameters (Thr 1-2)) for the final GRM models

Biological health scale and lived health scale

The most discriminating biological health question was ‘initiate and maintain intimate or sexual relations’ in the community-dwelling old age group (with a discrimination of 4.97). This means that this question differentiates well between people with higher and lower difficulties in biological health in the old age group. The least discriminating question was ‘walk or move outside the home’ (with a discrimination of 0.62). For LHS, the most discriminating question was ‘carry out housework’ (with a discrimination of 4.25); the least discriminating was ‘speak intelligibly or utter coherent phrases’ in the community-dwelling young-age group (with a discrimination of 0.91). While the question for which only those individuals in the worst biological health are expected to have median difficulties is ‘hold a gaze or pay attention when listening’ (with a threshold of 2.55 on the logit scale), the question for which individuals in the worst lived health are expected to have high difficulties is ‘speak intelligibly or utter coherent phrases’ in the community-dwelling young-age group (with a threshold of 3.80).

On a scale from 0 (best biological health) to 100 (worst biological health), biological health was better, that is, scores were lower, in the community-dwelling population (mean = 31.07, standard deviation = 21.22, range = [0; 98.96]) than in the institutionalized population (mean = 48.86, standard deviation = 23.54, range = [0; 100]). For lived health, which takes into account technical assistance, personal assistance or both, the difference between the community-dwelling (mean = 31.94, standard deviation = 20.72, range = [0; 100]) and institutionalized populations (mean = 36.75, standard deviation = 22.69, range = [2.09; 93.22]) was smaller. The biological health score and the lived health score are not directly comparable, since they were calculated from two separate sets of questions.

For both study populations the Pearson correlation between biological health and lived health was high: 0.79 for community-dwelling population and 0.85 for institutionalized population.

Conditional permutation importance of biological health and lived health scores

For both community-dwelling and institutionalized populations the association between SRGH and lived health scores (community dwelling: rS = 0.33, institutionalized: rS = 0.36) was higher than the association between SRGH and biological health scores (community dwelling: rS = 0.23, institutionalized: rS = 0.30). The relation between SRGH and biological health scores and lived health scores is displayed in Figure 1.
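Because SRGH has only five response options, the data contain many ties, and the Spearman coefficient is then computed as the Pearson correlation of average ranks. A self-contained Python sketch of that computation is given below; the function names are hypothetical and the sketch ignores sampling weights.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

With SRGH coded from 1 (very good) to 5 (very bad) and health scores running from 0 (best) to 100 (worst), a positive coefficient means that people with worse scores tend to rate their health as worse.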

Figure 1

Box-plot showing the distribution of biological health scores and lived health scores in each of the five SRGH response options (1 = very good, 2 = good, 3 = fair, 4 = bad, 5 = very bad).

The resulting importance measures of the two predictors (biological health score and lived health score) of SRGH are displayed in Figure 2. For both samples, the lived health score showed the higher variable importance and therefore was a better predictor of SRGH than the biological health score.

Figure 2

Conditional permutation importance of biological health and lived health scores as predictors of SRGH by study population. The higher the value, the better the prediction of SRGH.

Discussion

Comparing the predictive value for SRGH of biological health and lived health in a psychometric space is the first step towards a true understanding of what people are thinking about when rating their general health. Our study showed that people base their evaluation of their health, not on their biological state, but on their lived experience of their health. This is an important result because it implies that any kind of intervention that targets population health should address, not merely the intrinsic capacity of a person, but also his or her environment.

We are not aware of studies comparing the predictive power of biological health and lived health for SRGH. Yet our finding is consistent with, and supports, the conclusion of Smith et al. [29] that ‘sickness is a social role in addition to biological state’ and that SRGH ‘is not a continuum of biological states’. As Jylhä [30] suggested, the response to SRGH is influenced not only by ‘earlier health experiences, present health conditions’, but also by the health-related environment.

Bifactor analyses of biological health questions and lived health questions supported the construction of BHS and LHS, in terms of the contribution of questions to a single common dimension. The presence of an underlying factor that links domains of functioning commonly used to operationalize biological health and lived health helped us to quantify both biological and lived health as a single number, which facilitated comparability between people’s abilities from the two study populations. Our results with respect to BHS and LHS are also concordant with other findings [31].

The GRM IRT modelling was used to assess the levels of biological health and lived health. The primary advantage of using an IRT model is that it allows for an estimation of biological health and lived health independent of the set of test questions administered [32]. For BHS, this makes it possible for us to consider questions that addressed domains of functioning that were not addressed by lived health questions.

The different gradients captured in the developed BHS and LHS - study population and age - support the validity of both scales. However, a large number of lived health questions showed DIF. One possible explanation is that institutionalized people receive constant support from hospital personnel. This is not the case in the community-dwelling population. In fact, more than half of the community-dwelling population did not benefit from personal help. For the age groups, the DIF could be explained by the use of a cut-off of 65 years, which was available in both populations and was in line with other studies that showed that SRGH is worse after an age of 65 years in the Spanish population [33].

For both study populations, the Spearman correlation analysis showed a stronger association between lived health and SRGH than between biological health and SRGH. Since correlation analysis alone cannot establish how strongly biological health and lived health determine the answer to the SRGH question, regression analysis was also used. The correlation analysis suggests the following causal chain: biological health → lived health → SRGH. This implies that a linear regression with SRGH as the dependent variable and biological health and lived health as independent variables would have a coefficient of zero for biological health; that is, conditional on lived health, biological health contributes nothing to the prediction of SRGH. Because the results of qualitative studies suggest that some people would disagree that biological health is unimportant to SRGH, we used an approach suited to overcoming the structural relation between the variables, namely Random Forest regression and its variable importance measures. Certainly, the causal chain indicates that both biological health and lived health are important factors to consider when people rate their health. However, the Random Forest regression indicates that measuring lived health is sufficient for predicting SRGH.

Strengths and limitations

The most important strength of this study was its large nationally representative Spanish sample. Yet it is significant that this sample is only representative of persons with limitations in functioning and not of the general population. This is because the design of the 2008 EDAD used a representative Spanish sample as a starting point but only obtained more detailed information about lived health from the subpopulation with limitations in biological health. Thus, our results are not generalizable to the entire Spanish population. There were additional limitations. First, more aspects of the environment that affect the experience of health in everyday life should be considered in addition to personal support and technical aids. Secondly, an artificial cut-off was set, in the sense that only what was considered larger than or equal to moderate difficulty could be rated as ‘moderate’, ‘severe’ or ‘cannot carry out the activity’. We had to assume that people answering ‘no difficulty’ were those who either had no or little difficulty, and in any event did not have a severe enough problem to rate it as moderate. We also had to collapse the response options ‘moderate’ and ‘severe’ difficulty because of the skewed distribution towards complete limitation of the response options.

Conclusions

Our study showed that people base their evaluation of health on their lived health experience rather than their experience of biological health. This result needs to be confirmed and supported by further studies before conclusions can be drawn and practical implications proposed to improve health policy. However, since SRGH can predict the use of health services [34], our result points to the need for health service personnel and decision makers to consider lived health when they develop and implement health promotion programs or select study outcomes. People with health problems are handed over to health professionals, and this creates an important responsibility. The decisions of health professionals should take into account the fact that their patients may be less concerned with medical facts and more interested in how their health affects everything that they do in their lives. Further research is necessary to determine whether lived health rather than SRGH could be used when health professionals track health changes over time and make cross-population health comparisons.