Background

Patient-reported outcomes (PROs) are increasingly becoming the focus of research and clinical practice [1],[2]. A major challenge in the use of PROs is the practical consideration of the number of items that can be asked [3]. Considerable evidence has been generated to demonstrate the value of simple single-item PRO assessments for describing the effects of disease and treatment in cancer and other diseases [4]–[6]. Single-item assessments have in fact been the most often-used measures in National Cancer Institute (NCI) cancer control clinical trials [7],[8]. The purpose of this manuscript is to present normative data for a specific set of single-item PRO measures that have been used in numerous clinical trials and clinical practice settings, so as to serve as a reference resource.

Linear Analogue Self Assessment (LASA) items have been validated as general measures of global QOL dimensional constructs in numerous settings [9]–[13]. The acronym LASA actually refers only to the type of response scale, but has come to be associated with simple single-item PRO measures in clinical research. This is partially due to the wide application of these simple measures and their ready acceptance by clinical researchers and clinicians. These single-item assessments have become the most-used assessment in all NCI-sponsored cancer control studies [8]. Single-item tools are in widespread use; for example, JCAHO has mandated that single-item pain assessments be completed at the time of every clinical intake for institutions to maintain accreditation [14]. The incorporation of these requirements into clinical practice presumes that patient care has improved, although the evidence is inconsistent [15]. Recently, a Patient-Reported Outcomes Measurement Information System (PROMIS) paper compared a single-item pain measure to a longer assessment and indicated that the two were psychometrically similar but complementary [16]. This would seem to indicate that there is a place for both in the clinical trials armamentarium.

The advantages of the LASA include brevity and minimized burden for both the patient and the clinical or research system. Sloan and colleagues explored the advantages and disadvantages of single versus multiple item approaches extensively [1] and have further demonstrated that where an indication of clinically significant deficits is the goal of assessment, the LASAs are superior to longer multi-item scales. This is in part due to the LASA allowing a patient to make the gestalt combination of sub-constructs rather than relying on a predetermined metric formula derived empirically from, for example, a factor analysis [3]. Trusting that the patient has this capability is a key assumption for success. Whatever a patient says their QOL is, that is what it is. Some psychometric analyses assume that the patient has to be fooled into providing an accurate score or that they may give an “inaccurate score”. This is condescending and paternalistic in the extreme. The most obvious application of an individual LASA is as a triage screen or trigger. Also, the brevity allows for routine application in clinical settings where a longer tool would be economically and temporally prohibitive.

The disadvantages of the LASA include a lack of detail about the deficit indicated by the single item. Others have pointed to the inability to obtain a measure of reliability for a single item. However, as demonstrated by Cleeland, if the construct being measured is valid and understandable to the subject, then unidimensional reliability is automatically present [17],[18]. Furthermore, recent research has indicated that the reliability of single items can indeed be measured by using correction for attenuation or factor analysis [19].

The LASA have become the focus of a specific line of research into prognostic factors for survival. Specifically, single item measures of overall QOL and fatigue have been seen to be prognostic for survival in multiple disease groups [1],[20]–[22].

Typically, LASAs are scored on a 0 to 10 scale. Initially a true linear analogue (i.e. a line) was presented to patients who were asked to then place an X on the line to represent their score. This had the benefit of producing a “continuous” variable, but was arduous in terms of scoring, as staff would have to use a ruler to measure the score on each item. Research indicated that patients tended to clump around the middle and quartiles of the line so that the true measurement accuracy that was being provided was realistically a five-point scale with errors around each point. Subsequent LASAs hence used a 0–4 or 0–10 numerical response scale (NRS). In some papers one will see NRS instead of LASA for a label to be psychometrically precise. The use of the 0–10 scale was demonstrated by Norman et al. to have advantages over other alternatives, although 0–4, 1–7, 1–5 and other response scales have all been employed [23]. A linear transformation of any such scaling can be used to translate all scores onto a 0–100 point scale.
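The linear transformation described above can be sketched as follows. This is a minimal illustration rather than the scoring code used in the studies; the function name `to_0_100` and the example responses are hypothetical, and the only assumption is that each response scale has a known minimum and maximum.

```python
def to_0_100(score, low, high):
    """Linearly rescale a response from the range [low, high] onto 0-100."""
    if high <= low:
        raise ValueError("high must be greater than low")
    if not low <= score <= high:
        raise ValueError("score lies outside the response scale")
    return 100.0 * (score - low) / (high - low)


# Hypothetical responses on three of the response scales mentioned in the text.
print(to_0_100(7, 0, 10))  # 0-10 numerical response scale
print(to_0_100(3, 0, 4))   # 0-4 scale
print(to_0_100(4, 1, 7))   # 1-7 scale
```

Because the mapping is linear, the ordering of scores and the relative distances between them are preserved, which is what allows studies using different response scales to be summarized together.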

A score of 50 or below on the transformed 0–100 scale is indicative of a need for immediate exploration and intervention for the QOL deficit [4],[24]. Due to these findings, the NCCTG and subsequently the Alliance for Clinical Trials in Oncology (Alliance) decided to include LASA measures for overall QOL and fatigue in all future phase II and phase III clinical trials as prognostic factors independent of performance status. Our study objective was to analyze and present normative data for LASA measures from various patient and control populations.

Methods

This paper presents a series of normative data for the overall QOL LASA scale (Additional file 1) drawn from different populations ranging from healthy volunteers to hospice patients. In total, baseline QOL LASA data from 36 clinical trials and 6 observational studies are included (Table 1). The reference given for each study is a published manuscript, a protocol, an abstract or an unpublished dataset, as indicated. Healthy NCCTG volunteers (54) provided LASA data via a survey at a semi-annual meeting. Data for Mayo physicians and residents were drawn from a survey. Please refer to Additional file 2 for details about where each sample was obtained.

Table 1 Data sources, population type, summary statistics for overall QOL

Simple summary statistics (means, standard deviations) are the primary analytical tool for this work. Associations between the LASA and other measures or demographics were assessed via correlation coefficients. We compared LASA scores across subpopulations using Fisher’s exact tests for categorical variables and Kruskal-Wallis tests for continuous variables.
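As an illustration of these analyses, the sketch below computes cohort means and standard deviations and the Kruskal-Wallis H statistic (with the usual correction for ties) using only the Python standard library. The software actually used in the study is not stated, so this is an assumption-laden example: the cohort names and scores are invented, and the helper names `midranks` and `kruskal_wallis_h` are hypothetical.

```python
from statistics import mean, stdev


def midranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    first, count = {}, {}
    for pos, v in enumerate(sorted(values), start=1):
        first.setdefault(v, pos)
        count[v] = count.get(v, 0) + 1
    return [first[v] + (count[v] - 1) / 2 for v in values]


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with tie correction for a list of samples."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = midranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    # Tie correction: 1 - sum(t^3 - t) / (n^3 - n), t = size of each tie group.
    ties = {}
    for x in pooled:
        ties[x] = ties.get(x, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in ties.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction


# Invented 0-10 overall QOL scores for three hypothetical cohorts.
cohorts = {
    "healthy": [9, 8, 8, 10, 7],
    "cancer": [7, 8, 6, 9, 5],
    "hospice": [5, 6, 4, 7, 3],
}
for name, scores in cohorts.items():
    print(name, round(mean(scores), 2), round(stdev(scores), 2))
print("H =", kruskal_wallis_h(list(cohorts.values())))
```

Under the null hypothesis, H is approximately chi-square distributed with one fewer degree of freedom than the number of groups. The p-value is omitted here because it requires a chi-square distribution function, which a statistical package would normally supply.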

Results

In total, for the collective sample of 9,295 individuals, the average overall QOL reported was 7.39 (SD = 2.11) with an overall distribution displayed in Figure 1. The distribution is markedly skewed with roughly 17% reporting a score of 5 or below indicating a clinically significant deficit in overall QOL. Distributions for the various cohorts of Table 1 are displayed in Figures 2 and 3. Comparison of overall QOL scores for select groups is shown in Figure 4.

Figure 1 Distribution of overall QOL score, N=9,161. QOL scores were not available for 134 patients.

Figure 2 Boxplots of Overall QOL for individual studies. Error bars indicate standard deviation; medians are indicated by the horizontal lines within each graphic.

Figure 3 Boxplots of Overall QOL for Patient Categories. Error bars indicate standard deviation; medians are indicated by the horizontal lines within each graphic.

Figure 4 Mean Overall QOL scores. Error bars indicate standard deviation.

Healthy individuals average above 8.3 (SD = 1.2) on the 0–10 scale and rarely report a score of 5 or below, which indicates a clinical deficit (Tables 1 and 2). Hospice patients report a much worse average score of 5.7 upon entry to hospice, although it has been seen that their QOL improves after hospice care has been initiated [25]. Hospice caregivers average 7.4. Cancer patients fall between these two extremes, with most patients averaging in the 7’s on the 0–10 scale. Health care professionals score on average almost as poorly as their patients. In particular, Mayo physicians and Minnesota medical students average 7.3, while residents average 6.5. The full range of the scale was reported by almost all cohorts except for healthy individuals, hospice caregivers and skin cancer patients.

Table 2 differentiates patient cohorts by the proportion of patients reporting a clinically meaningful deficit (CMD), defined as a score of 5 or less. This CMD is associated with a doubling of the risk of death [1]. Healthy volunteers rarely (2%) reported a CMD. Hospice patients upon entry had the highest prevalence (42%) of CMDs in overall QOL.
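The proportion of patients with a deficit in a cohort can be computed as in the following sketch. The function name `cmd_proportion` and the example scores are hypothetical, and dropping missing scores (coded here as `None`) from the denominator is an assumption made for illustration, not a description of how the study handled missing data.

```python
def cmd_proportion(scores, cutoff=5):
    """Fraction of non-missing 0-10 scores at or below the deficit cutoff."""
    valid = [s for s in scores if s is not None]
    if not valid:
        raise ValueError("no non-missing scores in this cohort")
    return sum(1 for s in valid if s <= cutoff) / len(valid)


# Invented cohort: six patients, one of whom has a missing score.
example = [8, 5, None, 3, 9, 7]
print(f"{cmd_proportion(example):.0%} of patients report a deficit")
```

On the transformed 0–100 scale the same function applies with `cutoff=50`.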

Table 2 Incidence of clinically significant deficits in QOL for each cohort

In terms of cancer site, between 20% and 33% of patients with lung, brain, musculoskeletal, metastatic, head and neck, and lymphatic cancers reported a CMD in overall QOL. All other cancer cohorts reported lower rates of CMD in overall QOL. It was notable that 28% of Mayo residents, 13% of Mayo physicians and 15% of MN medical students reported CMDs in QOL.

Overall QOL scores were subsequently analyzed by selected demographics. Overall QOL on average declined slightly with increased age, but only by one seventh of a standard deviation, corresponding to a 7% increase in the percentage reporting a CMD in overall QOL (Table 3). Men’s and women’s overall QOL distributions were virtually identical (Table 4). When examining data separately from cancer treatment trials vs. observational studies, differences were noted by age in treatment trials, but not in observational studies (Additional file 3). Most data for gender came from cancer treatment trials, which showed identical scores in men and women (Additional file 3); the few data available from observational studies showed some gender differences (Additional file 4).

Table 3 Overall QOL by age
Table 4 Overall QOL by gender

Overall QOL was weakly related to performance status, with a Spearman correlation coefficient of −0.29 indicating that patients with poorer performance status tended to have worse overall QOL (Table 5). Roughly 14% of patients with performance status 0 or 1 reported a CMD, compared to 58% of patients with a performance status of 2 or worse.
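For readers unfamiliar with the Spearman coefficient, the sketch below shows one way to compute it: rank both variables (averaging tied ranks) and take the Pearson correlation of the ranks. The data are invented, the helper names are hypothetical, and the example assumes performance status is coded so that higher values mean poorer status, which is why a negative coefficient corresponds to worse QOL with poorer status.

```python
from math import sqrt


def midranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    first, count = {}, {}
    for pos, v in enumerate(sorted(values), start=1):
        first.setdefault(v, pos)
        count[v] = count.get(v, 0) + 1
    return [first[v] + (count[v] - 1) / 2 for v in values]


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the midranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    rx, ry = midranks(x), midranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Invented data: performance status (higher = poorer) and 0-10 overall QOL.
performance_status = [0, 0, 1, 1, 2, 3]
overall_qol = [9, 8, 8, 6, 5, 3]
print(round(spearman(performance_status, overall_qol), 2))
```

Working on ranks rather than raw values makes the coefficient suitable for ordinal variables such as performance status and for skewed score distributions.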

Table 5 Overall QOL by performance score

Overall baseline QOL was somewhat related to subsequent tumor response (Table 6; p = 0.0094). Patients with a full or partial response reported a CMD at baseline in 11.4% of cases compared to 14.4% among those with stable disease and 18.5% among those with tumor progression.

Table 6 Overall QOL by best response

Discussion

This paper provides a series of normative data drawn from multiple sources for the simple single-item measure of overall QOL that has been used in numerous clinical trials, observational research and clinical practice settings. The overall QOL item differentiates across healthy populations and various patient populations in terms of average values and the incidence of CMDs reported.

A key finding is that overall QOL is different from performance status. This result has been demonstrated previously in individual studies [3],[4],[26], but was demonstrated here to be consistent across study populations. Similarly, the relationship between tumor response in cancer patients and QOL is weak, as reported previously in a study of 989 patients with metastatic colorectal cancer [27]. In that study, neither baseline QOL nor changes in QOL indicated a relationship of any strength with tumor response [27]. A limitation of our analyses of the associations of QOL with performance status and tumor response was that a large amount of data was missing. The impact of this missingness on our results is unclear and must be taken into account when interpreting these results. Gender differences in reported QOL were also essentially absent. Overall QOL also does not automatically decline with age, although a general trend is present. Collectively, these findings indicate that a patient’s self-reported QOL is more than merely a function of performance status, age, gender or any other demographic/clinical variable.

Although there are numerous well-validated and reliable, but much longer, measures of quality of life in cancer patients, there is an overriding need for simple single-item assessment measures, such as the LASA used for recording overall QOL in this study [28]. This brief QOL measure is advantageous because it reduces patient burden, both in clinical situations and in clinical trials, and has greater clinical utility for the busy practitioner [29]. Nearly 10 years ago, the editors of Health and Quality of Life Outcomes indicated that there may be too many QOL assessment tools, making the goal of finding an optimal tool difficult [30], as suggested by our work published in 1998 [31]. In the textbook by Fayers and Machin, section 2.5 states that “the simplest and most overtly sensible approach to measure QOL is to use global rating scales” and that “A global single item measure may be a more valid measure of the concept of interest than a score from a multi-item scale” [32]. In a series of studies, Zimmerman et al. had almost 2,000 psychiatric outpatients complete single-item assessments of psychosocial functioning and QOL, as well as more complex measures [33]. The single-item measures of symptom severity, psychosocial functioning and QOL were strongly correlated with the multi-item measures and were able to discriminate among various clinical populations, e.g., depressed and non-depressed patients. They concluded that single-item measures could be easily incorporated into a busy clinical practice and were reliable and valid for collecting data on patient condition and treatment effectiveness. Similar results were found by Yohannes et al. in patients with cystic fibrosis [34]. Krause et al. discussed the practical utility of single-item assessments [35]. In fact, this measure is presently being used routinely in our clinical practice for every patient visit.

Results of a survey of usage of the overall QOL item indicate that it allows clinicians to identify otherwise unidentified patient concerns and to facilitate conversations regarding the precise nature of the issues underlying the concerns [36]. A single item can play a central role in triaging and routine screening for issues that patients want addressed but that have been neither raised by the clinician nor volunteered by the patient, for various reasons including lack of time in the clinical visit or discomfort surrounding sensitive issues like sexuality [37].

The clinical importance of the single-item overall QOL measure is inherent in its ability to tap into the simple construct of overall well-being within a patient using his/her own internal weighting scheme for the innumerable component constructs [3]. While some multiple-item measures may look at many aspects of QOL, it is impossible to cover all facets of QOL or give them appropriate weighting. Indeed, it has been previously demonstrated that a patient may report a deficit in overall QOL due to a deficit in a single sub-domain that they consider of primary importance and that overrides positive indications on all other domains [1],[4]. It is this gestalt capability of a single item that is likely the reason that it has been seen to be empirically linked to overall survival. In its simplest form, the item is asking “Do you think you are doing well?” This, even in the presence of overwhelmingly positive objective laboratory and clinical data, may be the overriding determinant of the individual’s well-being. In one way, this general item can capture unknown important aspects of well-being that are beyond the capability of presently available clinical measures.

A major drawback and concern with the use of a simple single-item measure of QOL is the lack of detail and precise determination of what is being measured or meant by “overall QOL” [38]. Clearly it is not possible for any single item to capture sufficient detail so as to delineate the appropriate clinical pathway that should be pursued. Its utility lies instead in the ability to identify those patients who have CMDs in well-being that can be further explored and subsequently treated.

Another issue with the use of an overall QOL measure is that it may involve issues that are beyond the purview of the clinician, such as financial or legal issues. In the age of comprehensive, multi-disciplinary, patient-centered care, however, identifying such issues can improve the efficacy of clinical care [39]. Indeed, much has been written about how issues beyond clinical care can impede or block positive clinical outcomes [40],[41].

The overall LASA is routinely supplemented in clinical trials and practice by a series of other items relating to physical, mental, emotional and spiritual well-being. These data are described elsewhere in the context of individual studies. The decision to present only the overall item in this analysis is based on its universality and its demonstrated linkage to survival in a wide variety of patient populations.

Conclusions

The present study indicates that the single-item measure of overall QOL has acceptable content and construct validity to be used as a clinical indicator of patient well-being. The relative capability of single-item versus multiple-item PRO measures to help us understand patient well-being is the focus of an ongoing R01-funded investigation. This project will compare psychometric properties, including the prognostic capability for survival, among the simple LASA measures, the PRO version of the Common Toxicity Criteria (PRO-CTCAE), and the PROMIS. This and other studies will further enhance our understanding of how we may “cross-walk” results from alternative measures of the patient experience. Ultimately, this work will lead to a day when PROs are routinely incorporated into clinical care as a supplementary vital sign.

Additional files