Introduction

Over the past decades, quality of life has become a focal point in scientific research and clinical practice. Although researchers disagree on the domains that make up quality of life, the general consensus is that quality of life measurement should focus on the subjective experience of the individual. This implies that the individual in question is the most valid source of information [1, 2]. However, nursing home residents may not be able to respond to self-report measures or may lose this ability during their stay, for instance due to dementia, which complicates the assessment and monitoring of a resident’s quality of life across time [3–5]. Although self-report is a complex process of introspection and evaluation [6], research has asserted that moderately demented patients can still report on their quality of life, even when they have poor insight into and awareness of their dementia [4, 7–9]. According to Kane et al. [8], 60% of the nursing home population would be able to reliably report on their quality of life. It would, therefore, be helpful to know which scales can be applied to nursing home residents with varying degrees of cognitive impairment.

Most scales measure separate dimensions of quality of life [10]. This has the advantage of a higher responsiveness to change than a measure for overall quality of life (OQOL). However, OQOL is an attractive outcome that can be measured as a single subjective result of weighing unspecified dimensions that are considered to be relevant by the patient. The administration is also less burdensome, which is an important factor in a very frail elderly population. Therefore, several researchers (also) use a single overall measure [e.g. 3, 7].

In our approach to quality of life [11, 12], OQOL is represented by subjective (i.e. psychological) well-being. Given this, scales for subjective (psychological) well-being can be used as OQOL scales. Although not always considered as the central outcome, psychological well-being is an important dimension of quality of life in many other approaches to quality of life in the elderly [13–19]. It encompasses both positive and negative affect, and life satisfaction (i.e. morale and contentment) [14, 20, 21], but most often a selection of these concepts is used for measurement. Instruments for psychological well-being that are used in the elderly are, for instance, the Philadelphia Geriatric Center Morale Scale (PGCMS) [22] and the Bradburn Affect Balance Scale [23]. For measuring affect in the elderly, the Positive and Negative Affect Scales [24, 25] and the observational Philadelphia Geriatric Center Positive and Negative Affect Scales [26] have been recommended [21]. Both positive affect and negative affect are important dimensions in quality of life scales [7, 17–19, 27]. Sometimes, however, only negative scales, such as scales for depression, are used in the measurement of quality of life [28]. As the absence of depression does not automatically imply that a resident is happy or content, this poses the question of whether positive and negative scales do, indeed, measure separate constructs and thus, whether or not a negative scale can be used as a single scale for OQOL.

In this paper the aim is to investigate the usefulness of six self-report measurement scales for OQOL, by studying whether they can be administered reliably and validly in a large group of nursing home residents, and whether this is related to cognitive impairment. We hypothesized that if all scales measure OQOL in nursing home patients validly and reliably, they would correlate highly, within all cognition groups. Moreover, the scales should be related to observational scales that measure OQOL.

Methods

Data were collected in ten nursing homes in the Netherlands. The Medical Ethics Committee of the VU University Medical Center had approved the research proposal, and written informed consent was obtained from the participants or their legal representative. Data were collected on a maximum sample of 30 residents over a period of 3 months per nursing home, with an equal distribution of residents with mainly physical handicaps (in so-called ‘somatic’ units) and mainly dementia syndromes (in so-called ‘psychogeriatric’ units).

The principal investigator (DLG, a trained psychologist) administered the self-report OQOL scales and the cognitive test (see later), while the nursing staff carried out the observational assessments. The completeness of the interview data depended on the resident’s cognitive and physical abilities and willingness to answer questions. The scales were administered in random order. The administration of a scale was terminated when a resident proved to be unwilling or unable to respond to the questions that were asked. To assure the validity of cross-sectional comparisons, the self-report and observational scales for each resident were both assessed within the same 4-week period.

Measurement instruments

The scales that were selected to measure OQOL had been used before in published research among nursing home residents or frail elderly populations. A distinction was made between scales that ask about OQOL literally, scales that focus on positive affect, negative affect or life-satisfaction, and scales that indicate clinical depression.

Self-report OQOL scales

A general question on OQOL (GEN-QOLQ) was asked: ‘Overall, how would you rate the quality of your life at the moment?’. This is a modification of the general question on quality of life that is part of Brod and co-workers’ scale for quality of life in people with dementia [7]. The phrase ‘at the moment’ was added to our question, because a pilot study showed that, without this explicit time-limit, the residents tended to evaluate the whole of their past life. The response scale, which is presented in the form of a card, consists of the following response categories: 1 = bad, 2 = moderate, 3 = good, 4 = very good, and 5 = excellent.

The Philadelphia Geriatric Center Morale Scale (PGCMS) [22] is a self-report scale that has been developed to assess elderly people in institutions, and has regularly been used as an outcome measure in research on quality of life and well-being in the elderly [e.g. 29, 30]. It consists of 17 dichotomous items measuring life satisfaction, and the scores are summed, with a high score indicating high quality of life. The scale has been found to be reliable, valid and sensitive [30], and internally consistent (KR-20 of .79) [31].

The Positive And Negative Affect Scales (PANAS) [24], were also used. The Positive Affect Scale (PAS) consists of 10 items concerning positive feelings, such as enthusiasm, interest and determination. The Negative Affect Scale (NAS) consists of 10 items concerning ‘negative’ feelings, such as fear, sadness, anxiety and hostility. For this study, the time frame ‘today’ was chosen, and instead of the original 5-category scale, a 2-category response scale was used, because a pilot study showed that very few residents were able to answer the 5-category scale. The administration was visually mediated, following the procedure proposed for the Depression List (see further). Summing the item-scores yielded two separate total scores, ranging from 0 (no positive/negative affects confirmed) to 10 (all positive/negative affects confirmed). The PAS and the NAS were found to be suitable for use in the elderly [21, 25]. Earlier reported internal consistency with the time-frame of ‘today’ yielded a Cronbach’s alpha of .90 for the PAS and .87 for the NAS in the general population [24].

The Depression List (DL) is a Dutch self-report screening instrument for depression, especially suitable for the assessment of (elderly) people with cognitive impairment [32]. It consists of 15 keywords that are presented on cards, one by one, accompanied by a simple question. For instance, a card with ‘down’ printed on it is accompanied by the question ‘do you feel down?’. Sum-scores range from 0 (no depressive complaints) to 30 (many depressive complaints). In psychometric research, the reported internal consistency of the DL was .82 in a group of visitors to a psychogeriatric day-care clinic [32].

The Geriatric Depression Scale (GDS) is a self-report screening instrument for depression in the elderly that is of known reliability and validity, also in long-term care [33, 34]. It consists of 30 dichotomous questions, which are summed into total scores, ranging from 0 (no depressive complaints) to 30 (many depressive complaints).

Other scales

The Mini Mental State Examination (MMSE) is a test for cognition, and has scores ranging from 0 (very severe cognitive impairment) to 30 (no cognitive impairment). It is widely used and has been validated, also in long-term care populations [35, 36].

The GIP-sad behavior [37, 38] is a sub-scale of the Behavior Observation Scale for Geriatric Inpatients (GIP), which is widely used in nursing homes in the Netherlands. The 6-item GIP-sad behavior (GIP-S) measures the behavior of elderly people in intramural care settings that expresses sadness, unhappiness, and anxiety, and is used in the present paper as a scale for OQOL. Sum-scores range from 0 (no sad behavior) to 18 (frequent sad behavior). When first published, the internal consistency (Cronbach’s alpha) of the scale was .84, and the inter-rater reliability (Pearson’s r) was .74 [37]. In a validation study, internal consistency was found to be .86, and the average inter-rater reliability of the items (Cohen’s weighted kappa) was .43 [39].

The MDS Depression Rating Scale (DRS) is an observational scale, based on items from the Minimum Data Set of the Resident Assessment Instrument [40], which can be used to screen for depression [41]. The DRS consists of seven MDS items that are summed. The scores range from 0 (no depressive behavior) to 14 (frequent depressive behavior). The internal consistency (Cronbach’s alpha) when it was developed was .75 in the derivation sample and .71 in the validation sample. Its sensitivity against a psychiatric diagnosis of depression was 91% [41]. In a validity study the internal consistency of the DRS was .68, and its correlations with the Geriatric Depression Scale (GDS) and the Hamilton Depression Rating Scale were .19 and .24 respectively [42].

Analyses

In order to determine whether cognitive status relates to the psychometric properties of the scales, the total group of residents was divided into four MMSE score-groups. An attempt was made to find a division based on known cut-off points that also resulted in equally large groups. The traditional MMSE cut-off point indicating cognitive impairment is 22/23 [43]. Among the reported cut-off points for severe cognitive impairment are 16/17 and 17/18 [36], and known cut-off points for reliable self-report assessment are 9/10 [e.g. 7] and 14/15 [e.g. 44]. The division into cognition groups was carried out as follows: an MMSE score below 5 (very low cognition group); scores from 5 to 12 (low cognition group); scores from 13 to 21 (moderate cognition group); and scores of 22 or higher (‘high’ cognition group).
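The grouping rule above can be written out as a small function. This is only an illustrative sketch: the function name `mmse_group` and the group labels as strings are our own choices for the example, not part of the study's analysis scripts.

```python
def mmse_group(score: int) -> str:
    """Map an MMSE total score (0-30) to the cognition group defined in the text.

    The name `mmse_group` and the string labels are illustrative assumptions.
    """
    if not 0 <= score <= 30:
        raise ValueError("MMSE scores range from 0 to 30")
    if score < 5:
        return "very low"   # MMSE below 5
    if score <= 12:
        return "low"        # MMSE 5 to 12
    if score <= 21:
        return "moderate"   # MMSE 13 to 21
    return "high"           # MMSE 22 or higher


if __name__ == "__main__":
    for s in (4, 5, 12, 13, 21, 22):
        print(s, mmse_group(s))
```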

For each MMSE score-group, the number of residents who could complete each scale was calculated, and compared with the number of residents to whom it was offered. Cronbach’s alphas were calculated to determine the internal consistency of the scales, and were compared across the different cognition groups. Cronbach’s alpha is considered to be fairly good if higher than .70, but should not be higher than .90 [45]. For construct validity, Spearman coefficients for the interrelationships of the self-report OQOL scales and for the relationships of the OQOL scales with the observational scales for OQOL were calculated and compared across the different cognition groups. Significant correlation coefficients are described in the results and interpreted as follows: .21–.40 = fair correlation, .41–.60 = substantial correlation, and .61–.90 = strong correlation.
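For readers who want to reproduce this kind of analysis, the following sketch shows the standard formula for Cronbach's alpha and the interpretation bands stated above. It is not the code used in the study; the function names are hypothetical, the data layout (one list of item scores per resident, no missing values) is an assumption, and the choice to round a coefficient to two decimals before labelling it is ours.

```python
from statistics import variance
from typing import List, Optional, Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for complete data: one row per resident, one column per item."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Variance of each item across respondents, and of the summed scale score.
    item_vars: List[float] = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_in_recommended_range(alpha: float) -> bool:
    """True if alpha is higher than .70 but not higher than .90, as in the text."""
    return 0.70 < alpha <= 0.90


def correlation_strength(rho: float) -> Optional[str]:
    """Label a coefficient using the bands in the text; None outside those bands."""
    r = round(abs(rho), 2)  # assumption: coefficients are judged at two decimals
    if 0.21 <= r <= 0.40:
        return "fair"
    if 0.41 <= r <= 0.60:
        return "substantial"
    if 0.61 <= r <= 0.90:
        return "strong"
    return None


if __name__ == "__main__":
    toy = [[0, 0], [1, 1], [0, 0], [1, 1]]  # made-up, perfectly consistent items
    print(cronbach_alpha(toy))
    print(correlation_strength(0.57))
```

The Spearman coefficients themselves would normally be obtained from a statistics package; only the labelling step is sketched here.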

Results

Sample description

The overall sample consisted of 227 residents. Their average age was 80.5 (SD 9.26; range 52–100), and 78% were female. The average score on the MMSE (N = 200) was 11.8 (SD 9.26); 26.5% had a score below 5, and 18.5% had a score of 22 or higher. The internal consistency of the MMSE in this sample was .89. The score distributions of the NAS and the DL were slightly positively skewed. The scores on the NAS were found to be quite low (Table 1, descriptives total group).

Table 1 Descriptives and practical utility of self-report overall quality of life-scales

Due to practical considerations and the frailty of the residents, not all scales were administered to all residents. Therefore, the number of completed scales varied.

Proportion of completed scales

Table 1 reports that in the high cognition group (MMSE score of 22 or higher), all six scales could be completed by 94–100% of the residents. In the moderate cognition group (MMSE score between 13 and 21), all scales, except the GEN-QOLQ and the PGCMS, could be completed by 91–97% of the residents. The GEN-QOLQ and the PGCMS could be completed by 84% and 81% in this group, respectively. In the low cognition group (MMSE score 5–12) the DL still could be completed by all residents (100%), the PAS and the NAS by 94%, the GEN-QOLQ by 80%, and the PGCMS and the GDS by 72%. In the very low cognition group (MMSE score below 5) the DL could still be completed by 43% of the residents, but only 21% or less could complete the other scales. The GDS was not completed by any of these residents, and the PGCMS only by 3%.

Only 3 MMSE-groups were used for all further analyses, because in the very low cognition group (N = 53) too few residents could complete the scales.

Internal consistency

The GEN-QOLQ consisted of one question, so internal consistency analysis was not applicable. Table 2 shows that internal consistency of the DL and GDS was satisfactory in all cognition groups; that of the NAS was also acceptable, although this scale was somewhat less consistent in the high cognition group (.68). The PGCMS shows good consistency, except in the moderate cognition group (.53). The PAS is the least reliable scale, as it only reaches an acceptable alpha in the moderate cognition group. Further, the scales showed no linear trends of decreasing internal consistency with increasing cognitive impairment, but there were some variations between cognitive groups.

Table 2 Internal consistency of self-report overall quality of life-scales for different cognition groups

Construct validity

Construct validity, inter-relationship

Table 3 presents the Spearman correlation coefficients for the self-report OQOL scales in the three cognition groups. Each correlation coefficient for two scales pertains to all residents who completed both scales. Therefore, the group-sizes differ for the correlation coefficients. Also, the group sizes may be larger than in Table 2, because scale-scores were also calculated when one of the scale’s items was missing, using mean substitution.
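The handling of a single missing item can be illustrated as follows. The text does not specify which mean was substituted, so this sketch assumes person-mean substitution (the mean of the resident's own answered items on that scale); the function name `scale_score` and the `max_missing` parameter are hypothetical.

```python
from typing import Optional, Sequence


def scale_score(items: Sequence[Optional[float]],
                max_missing: int = 1) -> Optional[float]:
    """Sum score with mean substitution for at most `max_missing` missing items.

    Assumption: a missing item (None) is replaced by the mean of the items
    this resident did answer. Returns None if too many items are missing.
    """
    answered = [x for x in items if x is not None]
    missing = len(items) - len(answered)
    if missing > max_missing or not answered:
        return None
    person_mean = sum(answered) / len(answered)
    return sum(answered) + missing * person_mean


if __name__ == "__main__":
    # Made-up dichotomous answers with one item missing.
    print(scale_score([1, 0, 1, None, 1]))
```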

Table 3 Spearman correlation coefficients between self-report overall quality of life scales, for the separate MMSE score-groups

Table 3 shows that the PGCMS, DL and GDS had the strongest intercorrelation. These three scales correlated significantly (P < .01) in all cognition groups. The NAS was also substantially correlated with the GDS and the DL. The other relationships between the scales varied. As far as the different cognitive groups are concerned, although the correlations between the scales were mostly lowest in the low cognition group, no clear linear trend across cognition groups was visible. The strength of the relationships between the positive PGCMS and the negative NAS, DL and GDS further indicates that these constructs are far from independent, as has often been found.

Construct validity, relationship with GIP-S and DRS

As Table 4 shows, the GIP-S correlated significantly with the GEN-QOLQ, PGCMS, NAS, DL and GDS, but only in the moderate cognition group. The DRS correlated significantly only with the GEN-QOLQ (only for the high cognition group), PGCMS (for both high and moderate cognition), DL (for moderate cognition) and GDS (for both high and moderate cognition). None of the self-report scales correlated significantly with the observational scales in the low cognition group. Remarkably, there was no correlation between the NAS and the DRS. The correlation coefficients for the GIP-S and the DRS (not shown in Table 4) were .57 in the high cognition group (N = 34); .48 in the moderate cognition group (N = 39); and .42 in the low cognition group (N = 62). In the very low cognition group, in which almost no self-report scales could be administered, the correlation between these two observational scales was .28 (N = 70).

Table 4 Relationships of self-report overall quality of life-scales with observational measures for overall quality of life, for the separate MMSE score-groups

Discussion

The aim of this paper was to investigate the usefulness of six self-report scales for measuring OQOL in nursing home residents. Hence, it was examined what percentage of residents could complete each scale, how high internal consistency and construct validity of the scales were, and whether these results were associated with level of cognitive impairment. It appeared that, of all the scales considered here, the DL could be administered to the most residents, even to almost half (43%) of the residents with very severe cognitive impairment. The other scales could only be administered to a small minority of this group. However, in the higher cognition groups, all scales (except for the PGCMS in the 5–12 MMSE group) could be administered to 80% to 90% of the residents. This percentage is higher than the 60% that was reported by Kane et al. [8], but refers to shorter scales that only measure OQOL instead of various dimensions of quality of life.

In addition to residents being able to complete a scale, the resulting psychometric properties of the scales are of importance. It appeared that all scales, with the exception of the PAS and, to a lesser extent, the PGCMS, had an acceptable internal consistency. Although the alphas varied across the cognition groups, there was no linear trend of decreasing consistency with increasing cognitive impairment. With regard to validity, the PGCMS, DL and GDS (and the NAS to a lesser extent), were strongly interrelated in all cognition groups, although the correlations in the low cognition group overall were somewhat lower. These four scales also had the strongest relationships with the observational scales for OQOL. However, these relationships were not very strong. Moreover, in the low cognition group the scales were not related to the observational scales. This suggests that level of cognitive impairment has a substantial influence on the validity of self-report OQOL-scales.

The PAS performed worst on all aspects. The low internal consistency in the high cognition group could be explained by one item (‘determined’). When it was omitted, Cronbach’s alpha increased from .52 to .70. Furthermore, its disappointing characteristics may partly be explained by the fact that the PAS measures a somewhat different construct than the other scales. Its low correlation with the NAS, the other PANAS scale, is not unexpected. Positive and negative affect are considered to be largely independent. Considering the item content of the five scales, six affects (items) of the PAS were not included in the PGCMS, the DL or the GDS (i.e. interested, excited, strong, proud, inspired, and determined), whereas only three affects of the NAS were not included (i.e. guilty, hostile, and shameful). The PAS therefore must be considered to measure a different construct than the other scales, which also explains the absence of a correlation with the GIP-sad behavior and the Depression Rating Scale. Although conceptually it may be a good addition to the measurement of OQOL (as psychological well-being is made up of positive affect, negative affect and life satisfaction), its poor psychometric performance makes it unsuitable in its present form. In contrast, the other PANAS scale, the NAS, had quite good properties, although the mean score on the NAS was low. The rating of affective states ideally involves a consideration of intensity, duration and frequency [21]. However, in cognitively impaired residents it is important to use a self-report scale that is as simple as possible, which may lead to loss of information, and therefore a loss of psychometric quality. For instance, the dichotomized response scale and the time-frame of ‘today’ that was used for the PAS and the NAS may have resulted in less discriminatory and lower scores, but increasing the response categories and the time-frame would threaten its reliability. We therefore suggest further research into the optimal time frame and response categories of the NAS.

Although the DL and the GDS have been developed as screening instruments for depression, they correlated very strongly with the PGCMS, which is a scale for life satisfaction. Studying the item content of the scales, it appeared that the items of these three scales have many similarities. The items of the DL and the GDS contain positive as well as negative affects and also contentment (e.g. ‘satisfied’, ‘happy’ and ‘hopeful’). Remarkably, although the PGCMS is considered to be a positive scale, it has more items that contain negative affects or cognitions (12 out of 17; 71%) than the GDS (20 out of 30; 67%), and especially the DL (7 out of 15; 47%). Therefore, despite the fact that the names of the GDS and the DL suggest a negative scale, they both contain ample positive items. Likewise, the PGCMS cannot be considered to be a solely positive scale. So, with adapted scoring methods, each of these scales could be used as a single measure for OQOL, covering both positive and negative aspects.

As in most proxy studies [2, 46–49], in the present study a low correlation was found between self-report and proxy assessment in low cognition groups (in this study: observational scales, rated by the nursing staff). This may suggest that the validity of self-report scales decreases with the level of cognition, but also that the validity of observational scales is lower in the low cognition group. Indeed, the correlation between the two observational scales in this study was lower in the low cognition group. Nevertheless, the relationships between the self-report and observational scales were also not strong in the high cognition group. This suggests that, although they are certainly related, self-report and observational scales may measure a different construct, irrespective of the cognitive functioning of the resident. The fact that both observational scales were negative may have complicated matters even further. Additional research into the relationship of self-report and observational scales for OQOL as well as proxy-report measures (following Edelman et al. and Sloane et al. [46, 48]) in relation to cognitive performance is therefore necessary. This research should use positive observational scales for OQOL, for instance the Philadelphia Geriatric Center Positive and Negative Affect Scales [26]. It should also apply a shorter maximum time-interval for collecting observational data than the 4-week period used in this study, since this may have resulted in an underestimation of the association between the self-report scales on the one hand and the observational scales on the other.

For now, the decision as to whether or not a self-report scale can still be administered reliably and validly in a cognitively impaired resident could best be made for individual assessments when administering a quality of life-scale. The research on quality of life measurement in dementia has shown that one can use screening questions, incorporated in the scale, and thus tailored to the specific cognitive demands of the scale, in order to determine whether a resident is cognitively able to answer the questions [7, 50]. Nevertheless, even if a resident appears to be able to understand the questions, it is not certain that her answers are a true reflection of her inner state. Therefore, further research should focus on developing guidelines on when the administration of a scale can be considered as reliable and valid. This research could, for instance, study the possibility of examining test-retest reliability by repeating questions of a scale throughout the assessment. Such test-retest reliability in the assessment of cognitively impaired residents is an important indication that the resident has understood the questions and that she really communicates her true subjective state. In addition, repeating the assessment on another day and calculating a mean score for the two assessments can result in a more stable self-report OQOL score.

In conclusion, measuring overall quality of life reliably and validly through self-report may not be possible in nursing home residents with at least moderate cognitive impairment. The quality of observational assessment of OQOL may also be lower in cognitively impaired residents. Before drawing definite conclusions about the usefulness of self-report scales, it is necessary to study their reproducibility. Nevertheless, in clinical practice, using self-report scales will provide interesting information on the experience of the residents, and is therefore in itself a valuable addition to observational data. The Depression List is a useful scale in this respect, especially for the assessment of nursing home residents with mild to moderate cognitive impairment.