Background

Patient-reported outcomes (PROs) are defined as a “measurement of any aspect of a patient’s health status that comes directly from the patient (i.e., without the interpretation of the patient’s responses by a physician or anyone else)” [1]. Patient-reported outcome measures (PROMs) provide an opportunity for patients to indicate the impact of a disease and its treatment on their lives. Health-related quality of life (HRQoL) represents a patient’s physical, psychological, and social response to disease and therapy and is one type of PRO [2]. PROs can provide additional information to help with treatment approval, reimbursement, and selection/dosing decisions; management of medication side effects; health monitoring; and patient-provider decision-making.

Breast cancer is the most commonly occurring cancer in women, with an estimated 2 million new cancer cases diagnosed globally in 2018 [3]. Advanced breast cancer has been described as a generally incurable, yet treatable disease, and the primary goals of treatment are to reduce symptom burden, maintain quality of life (QoL), and prolong survival [4, 5]. Treatment for patients with advanced disease includes neoadjuvant chemotherapy, surgery, postsurgical radiation therapy, and systemic adjuvant therapy (including hormone therapy for those with hormone receptor-positive breast cancers). Understanding the impact of treatment on patients’ HRQoL outside clinical trials can provide useful information for patients and clinicians in making treatment decisions.

Several PROMs have been used to assess patients with breast cancer. PROMs are questionnaires that capture patients’ feelings and functioning in a structured manner and consist of items and corresponding response options. When developed and validated according to international guidelines, PROMs can provide reliable and valid patient assessment.

A systematic literature review by Nguyen et al. [6] indicated that the European Organization for Research and Treatment of Cancer (EORTC) Breast Cancer–Specific Quality of Life Questionnaire-23 item (QLQ-BR23) and the Functional Assessment of Cancer Therapy-Breast (FACT-B) are the only HRQoL questionnaires that have been developed specifically for patients with breast cancer facing different disease stages and treatments. Both tools act as supplements to their general cancer questionnaires, the EORTC Quality of Life Questionnaire, Version 3.0 (QLQ-C30) and the FACT-G, respectively. Given recent developments in breast cancer treatment, we sought to determine whether additional valid and reliable HRQoL measures are available in the public domain. Specifically, we aimed to identify breast cancer–specific HRQoL measures with evidence of validation in the breast cancer population for potential use in patients who underwent systemic treatment for breast cancer (excluding surgery and radiotherapy).

As HRQoL measures are focused on patients’ overall health and well-being, regional characteristics and traditions may be included. This review focused on identifying HRQoL measures regardless of whether or not they were region specific.

Methods

Literature search

The literature review was conducted on August 19, 2019, in the PubMed, Embase, and PsycINFO databases. Table 1 presents the search strategy used for PubMed; the key words used were translated for each of the individual databases. The search focused on the past 10 years (January 1, 2009–August 19, 2019), was limited to publications written in English, and excluded commentaries, letters to editors, editorials, book chapters and case reports because they did not contain detailed information on the psychometric properties of the instruments.

Table 1 PubMed Search Strategy

Literature review

Unique records that were identified across the three databases were reviewed in accordance with prespecified inclusion criteria. Studies were required to include patients (aged ≥18 years) with breast cancer who were treated with a pharmaceutical intervention and to assess a psychometric property of an HRQoL-focused PROM. Psychometric properties of interest included reliability (internal consistency, Cronbach alpha, test-retest), validity (content, convergent, divergent), and responsiveness. Psychometric properties for different modes of administration (e.g., electronic PRO [ePRO]) or for translations were included. Reasons for exclusion were populations receiving surgery or radiation, studies focused on HRQoL of treatment efficacy only, and studies only considering caregiver burden. References of relevant review articles were reviewed for any pertinent articles not identified in the original search. Two investigators reviewed the abstracts and selected those that fulfilled the inclusion criteria. Any disagreement between the investigators was discussed, and the final decision was made by consensus.

During level 1 screening (titles and abstracts), studies that did not meet criteria were excluded. Full texts of included studies were reviewed (level 2 screening) using the same relevance criteria applied at level 1. Upon completion of level 2, an additional criterion was added to focus the review on breast cancer–specific HRQoL instruments only.

Information regarding reliability (internal consistency, Cronbach alpha, test-retest) and validity (content, convergent, divergent) was extracted from the studies. These psychometric properties were analyzed in accordance with prespecified thresholds (e.g., Cronbach alpha > 0.7). In addition, item content of the instruments was reviewed.
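
As an illustration of the internal consistency threshold mentioned above, the following minimal Python sketch computes Cronbach alpha from item-level responses and compares it with the 0.7 criterion. The function name and the response data are hypothetical and are not drawn from any of the reviewed studies; the sketch assumes complete data (no missing items) and at least two items per scale.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach alpha for one scale.

    scores: list of respondents, each a list of item scores of equal length.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(scores[0])                      # number of items in the scale
    items = list(zip(*scores))              # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical responses: 5 respondents, 3 items scored 1-4
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 4],
    [2, 3, 2],
]

alpha = cronbach_alpha(responses)
meets_threshold = alpha > 0.7               # threshold used in this review
print(f"alpha = {alpha:.3f}, meets threshold: {meets_threshold}")
```

In practice, alpha would be computed separately for each multi-item scale of an instrument, since the coefficient describes a single scale rather than a whole questionnaire.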

Results

Figure 1 summarizes the literature review, which identified 613 unique records for level 1 screening, of which 131 full-text articles were reviewed; 80 articles presented psychometric properties for identified PROMs used in breast cancer. This review focuses on the 33 that described psychometric properties of breast cancer–specific HRQoL instruments.

Fig. 1 PRISMA Diagram

Psychometric properties of 12 PROMs were identified: EORTC QLQ-C30, EORTC QLQ-BR23, FACT-B, Functional Assessment of Cancer Therapy-Breast Symptom Index (FBSI), National Comprehensive Cancer Network-Functional Assessment of Cancer Therapy-Breast Cancer Symptom Index-16 (NFBSI-16), Young Women with Breast Cancer Inventory (YW-BCI36), Breast Cancer Symptom Scale (BCSS), QuEST Breast Cancer Questionnaire (QuEST-Br), Quality of Life Instruments for Cancer Patients-Breast Cancer (QLICP-BR), Indonesian Breast Cancer Health-Related Quality of Life (INA-BCHRQoL), and two new unnamed measures [7,8,9]. For each identified PROM, Table 2 provides an overview of the measure’s purpose, the domains assessed and the number of items. Table 3 provides an overview of concepts addressed for each PROM. Table 4 provides an overview of the psychometric articles (instrument, objective, population), and Table 5 provides psychometric qualities of the identified instruments.

Table 2 Overview of Identified Measures
Table 3 Content Mapping Across Patient-Reported Outcome Measures
Table 4 Overview of Psychometric Articles Identified
Table 5 Psychometric Qualities of Identified Instruments

Comparison of item content

A review of the item content of the identified measures reveals variability in the content that each assesses (Table 3). Most of the PROMs assess not only breast cancer symptoms but also the physical and emotional aspects of the disease. Breast cancer–specific symptoms appear not to be included in INA-BCHRQoL, YW-BCI36, and the new measures [7,8,9]. Physical and emotional or social functioning are included in the QLQ-C30/QLQ-BR23, FACT-B, FBSI, NFBSI-16, QuEST-Br, and QLICP-BR, whereas other concepts are addressed within only one or two PROMs. For example, sexual function is included in the QLQ-BR23 but in no other PROM; body image is included only in the QLQ-BR23 and the YW-BCI36. Vanlemmens and colleagues’ measure for young women (< 45 years of age) is focused not only on the patient with breast cancer but also on her partner and their relationship (i.e., couple cohesion, managing children/everyday life) [8]. Regarding the financial impact of living with breast cancer, only the measure by Vanlemmens et al. [8] and the YW-BCI36 include this concept. Approximately two thirds of the identified articles (22/33) focused on the EORTC QLQ-C30, EORTC QLQ-BR23, or the FACT-B.

EORTC QLQ-C30 and EORTC QLQ-BR23

Fourteen publications provided additional psychometric validation of the EORTC measures (QLQ-C30 and QLQ-BR23) (Table 4). Of these, eight were focused on validation of the EORTC modules in new languages (e.g., Chinese [23], Arabic [10, 11, 14, 20], Mexican-Spanish [13]) or confirmation that the measures were understood by or used in larger populations (Brazil [16], Albania [17], Singapore [20]); two articles presented additional confirmation of content validity [18, 19], one article tested the replicability of cutoff scores [19], and three demonstrated acceptability of the measures in ePRO platforms [15, 21, 22].

Internal consistency was assessed using the Cronbach alpha coefficient for translations. The Cronbach alpha for the breast symptoms scale was below 0.70 in several translations (Chinese [23], Arabic [14], and Mexican-Spanish [13]); otherwise, reliability of the QLQ-BR23 translations met established criteria (Table 5). Test-retest reliability was also established for the Arabic version [10]. Various methods were used to evaluate the validity of the translations, including multitrait scaling and known-groups comparisons. Item-convergent validity was demonstrated (i.e., exceeding the 0.40 criterion) [14, 17, 23]. The questionnaires differentiated patients with lymphedema from those without [29], differentiated patients with early-stage breast cancer from those with locally advanced breast cancer [11, 13], and were responsive to changes following treatment [13]. Additional content validity for signs and symptoms was evaluated by testing the correlation between reported adverse events and responses to the QLQ-C30 [18].
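
To make the 0.40 item-convergent criterion concrete, the sketch below shows one common way such a check is carried out in multitrait scaling: each item is correlated with the sum of the remaining items in its own scale (i.e., corrected for overlap), and the correlation is compared with the criterion. This is an illustrative assumption about the procedure, not the exact analysis reported in the cited studies; the function names and data are hypothetical.

```python
from math import sqrt
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def item_convergent_validity(scores, criterion=0.40):
    """Correlate each item with the sum of the other items in its scale.

    scores: list of respondents, each a list of item scores for ONE scale.
    Returns a list of (correlation, exceeds_criterion) tuples, one per item.
    """
    results = []
    for i in range(len(scores[0])):
        item = [row[i] for row in scores]
        rest = [sum(row) - row[i] for row in scores]  # overlap-corrected
        r = pearson(item, rest)
        results.append((r, r > criterion))
    return results

# Hypothetical responses: 5 respondents, 3 items from a single scale
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 4],
    [2, 3, 2],
]

convergent = item_convergent_validity(responses)
for idx, (r, ok) in enumerate(convergent, start=1):
    print(f"item {idx}: r = {r:.2f}, exceeds 0.40: {ok}")
```

An item with no variance across respondents would cause a division by zero in this sketch and would need to be handled separately.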

Bjelic-Radisic et al. [12] evaluated whether updates in breast cancer treatment necessitate updating the EORTC QLQ-BR23, which was developed in 1996. A literature review and interviews with patients and health care providers suggested that relevant concepts were missing. The new items form two multi-item scales: a target symptom scale (20 items) and a satisfaction scale (2 items). The target symptom scale can be further divided into three subscales: endocrine therapy scale, endocrine sexual scale, and skin/mucosa scale. Further psychometric validation is underway.

Administration of the measures via ePRO also has demonstrated reliability and validity [15, 21].

FACT-B

The majority of the FACT-B publications presented reliability and validity data for translations of the measure into Arabic [24], Persian [31], Czech [26], Lebanese Arabic [27], and Chinese [30] (Table 4). One publication presented data regarding the appropriateness of an ePRO application [29]. Two articles compared the properties of the FACT-B (disease-specific measure) with that of a general HRQoL measure, the EQ-5D [25, 28].

Internal consistency reliability (Cronbach alpha) and test-retest reliability (intraclass correlation coefficient) met accepted thresholds for the translations [24, 26, 31] (Table 5). Content validity was established in the Lebanese Arabic translation [27]. Administration of the measure via ePRO was deemed acceptable [29].

Other measures

FBSI and NFBSI-16

The Chinese translation of the FBSI has demonstrated adequate test-retest reliability as well as known-group validity and convergent and divergent validity [32]. Garcia et al. [33] sought to develop a new version of the FBSI in accordance with US Food and Drug Administration guidance for PRO measures that provides assessment on a symptom level and improves upon the original FBSI by emphasizing input from patients. Specifically, 52 patients with breast cancer provided their top-priority symptoms/concerns through open-ended interviews and symptom checklists. After patient input was reviewed, eight additional items were added to the original FBSI, creating the NFBSI-16. Conceptual relevance was supported for most items in the NFBSI-16 based on patients’ reports of experiencing the concepts as part of their breast cancer experience [34].

YW-BCI36

Christophe et al. [35] developed a questionnaire specifically measuring the subjective experience of nonmetastatic breast cancer in young women (aged 45 years or younger when diagnosed), their perceptions regarding its treatment in their daily life, and the repercussions of the disease. Reliability and validity of the new measure were demonstrated (Table 5).

BCSS

Horigan et al. [36] conducted a large survey of registered patients with breast cancer to further document the content validity of the BCSS. Specifically, the patients were asked to rank 21 issues identified as important to them. The nine highest ranked items include good QoL, maintaining independence, able to sleep, able to concentrate, perform normal activities, being fatigued, having depression, being anxious, and having pain. The five lowest ranked items include appetite, breast-specific issues, hot flashes, and sexuality. Ratings by breast cancer subset (newly diagnosed, on treatment, no evidence of disease, hormonal or nonhormonal treatment, metastatic disease, survivors) showed some differences compared with those by the whole group.

QuEST-Br

Harley et al. [38] adapted existing HRQoL instruments (EORTC measures) for use in routine clinical practice delivering outpatient chemotherapy for breast cancer. Methods followed the guidelines laid out by the EORTC Quality-of-Life Group for developing questionnaire modules [40]. Internal consistency reliability was > 0.70 for the QuEST-Br scale [38].

QLICP-BR

Wan et al. [37] developed and validated a QoL instrument for patients with breast cancer in China. The measure was developed with particular attention to Chinese culture. For example, the family relationship and kinship play very important roles in daily life. Taoism and traditional medicine focus on good temper and high spirit. Good appetite, sleep, and energy are highly regarded in daily life, and food culture is very important [37]. The QLICP-BR was found to have adequate reliability and validity (Table 5).

INA-BCHRQoL

Saptaningsih et al. [39] developed a new measure to capture not only the physical, cognitive, and psychological aspects of patients but also the spiritual aspect. The questionnaire was developed in Indonesia and was designed to be culturally relevant (i.e., it included a spiritual domain, which is suitable for Indonesia, as it is a very religious country) to the breast cancer population in Indonesia. The INA-BCHRQoL was found to have adequate reliability and validity (Table 5).

Unnamed measures

Deshpande et al. [7] developed and validated a patient-reported questionnaire to assess the QoL outcomes of patients with breast cancer in India. Reliability and content validity were demonstrated (Table 5).

Vanlemmens et al. [8, 9] developed and validated a specific inventory for measuring the impact of breast cancer on the QoL of young women (< 45 years of age) with nonmetastatic disease and the QoL of their partners. Reliability (internal consistency and test-retest) has been established. Convergent validity showed strong correlations with QoL measures (EORTC QLQ-C30) (Table 5).

Discussion

Understanding the effect of breast cancer treatment on a patient’s HRQoL is a central clinical and research question. However, to accurately assess HRQoL, valid and reliable PROMs are needed: that is, PROMs with evidence of reliability, validity, and responsiveness in the population of interest (breast cancer). This review sought to identify disease-specific HRQoL measures with evidence of validation in the breast cancer population for potential use in patients who underwent systemic treatment for breast cancer (excluding surgery and radiotherapy). In addition to the EORTC QLQ-C30, QLQ-BR23, and FACT-B, we identified nine additional potential measures.

The identified PROMs vary in the content that they assess. For example, Vanlemmens and colleagues’ measure for young women (< 45 years of age) is focused not only on the patient with breast cancer but also on her partner and their relationship (i.e., couple cohesion, managing children/everyday life) [9]. This measure also assesses the impact of breast cancer on the woman’s career and finances. Other than the YW-BCI36, no other instrument assesses the impact of breast cancer on a woman’s career or finances. Conversely, most do assess not only breast symptoms but also the physical and emotional/psychological impact of disease.

The YW-BCI36 [35] and the measure developed by Vanlemmens et al. [8, 9] were both developed specifically for women < 45 years old with breast cancer. Younger women with breast cancer have concerns (e.g., childcare, finances) that older women with breast cancer may not; thus, these measures were developed specifically for this population. Several other PROMs were developed to meet a specific unmet need within regions (China, Indonesia, India) for measures that were culturally appropriate (QLICP-BR [37], INA-BCHRQoL [39], Indian breast cancer measure [7]). Given that these PROMs have been developed to be culturally relevant for a specific region/population, they may not be appropriate for global studies.

Psychometric qualities examined in the evaluation of an instrument may include acceptability, validity, reliability (including internal consistency and test-retest reliability), and responsiveness. Information on questionnaire responsiveness (the ability of a scale to detect significant change over time, assessed by comparing scores before and after an intervention of known efficacy) was scarce, regardless of the method used to examine it, including t tests, effect sizes, standardized response means, or responsiveness statistics.
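
For readers less familiar with the responsiveness indices named above, the following minimal Python sketch computes three of them from paired scores: the effect size (mean change divided by the baseline standard deviation), the standardized response mean (mean change divided by the standard deviation of the change scores), and the paired t statistic. These are the conventional textbook definitions, assumed here for illustration; the function name and scores are hypothetical and do not come from the reviewed studies.

```python
from math import sqrt
from statistics import mean, stdev

def responsiveness(baseline, follow_up):
    """Responsiveness indices for paired pre/post scores (same respondents)."""
    changes = [post - pre for pre, post in zip(baseline, follow_up)]
    mean_change = mean(changes)
    sd_change = stdev(changes)
    return {
        # Effect size: mean change relative to baseline variability
        "effect_size": mean_change / stdev(baseline),
        # Standardized response mean: mean change relative to change variability
        "srm": mean_change / sd_change,
        # Paired t statistic with n - 1 degrees of freedom
        "paired_t": mean_change / (sd_change / sqrt(len(changes))),
    }

# Hypothetical scale scores before and after an intervention of known efficacy
baseline = [50, 60, 55, 65, 70]
follow_up = [58, 66, 65, 69, 82]

indices = responsiveness(baseline, follow_up)
for name, value in indices.items():
    print(f"{name}: {value:.2f}")
```

The three indices share the same numerator and differ only in the denominator, which is why they can lead to different conclusions about the same change scores.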

The FACT-B and the QLQ-BR23 were designed for use in patients with breast cancer with a range of disease stages and undergoing different treatments. The EORTC QLQ-BR23 and the FACT-B are well-developed instruments that have been extensively tested among patients with breast cancer. The FACT-B is shorter than the QLQ-BR23 and covers fewer symptoms and treatment-related side effects. This review has identified additional translations of the measures, providing further evidence of their validity. Internal consistency estimates of reliability were adequate for research purposes, although the internal consistency estimates were somewhat lower for the cognitive and breast symptoms scales. Further psychometric testing of the Breast Cancer–Specific Quality of Life Questionnaire-45 item (QLQ-BR45) may provide improved results. A recent publication [41] provides more detailed updates on the development of the QLQ-BR45. The QLQ-BR23, developed in 1996, was one of the first disease-specific questionnaires to assess QoL in patients with breast cancer. Given the effects of newer therapeutic options available since then, the developers believed that the original 23-item QLQ-BR23 might not cover many important QoL issues and potential side effects of newer treatments. Therefore, the EORTC Quality of Life Group decided to update this module, eventually creating the QLQ-BR45. The development of the QLQ-BR45 involved a systematic literature review to identify relevant QoL issues for patients, interviews with patients and providers, and pretesting of a preliminary module in an international phase 3 study. Results of the literature review and discussions with patients and providers indicated that the original QLQ-BR23 inadequately covered concepts currently relevant to patients with breast cancer. Thus, new items were developed (added to the existing QLQ-BR23) and pretested in a multinational study, resulting in the QLQ-BR45, which is currently undergoing further psychometric testing.

Much of the additional psychometric data for the EORTC QLQ-C30, QLQ-BR23, and FACT-B are from new translations, further confirming the acceptability of these measures. Reliability of tablet or ePRO versions of the measures was also confirmed. New instruments were developed de novo in order to be more culturally relevant to patients in Asian countries. While these measures have demonstrated adequate internal consistency, test-retest and responsiveness data are lacking.

Conclusions

Although there have historically been limited options for validated measures to assess HRQoL in patients with breast cancer, a number of new options for assessing HRQoL in the breast cancer population have been developed and validated in recent years. This review supports the reliability and validity of the EORTC QLQ-C30 and FACT-B; new translations and electronic versions of these measures further support their use for this population. Researchers should ensure that their selected PROMs are suitable for their target patient population, anticipated line of therapy, and the expected side effects of the therapies involved.