Key Points for Decision Makers

Thirteen randomized controlled trials (RCTs) assessing the longitudinal effects of adjuvant endocrine therapy on quality of life (QOL) in post-menopausal women with non-metastatic estrogen receptor-positive breast cancer used a variety of QOL instruments; the FACT/FACIT, MENQOL and SF-36 were the most common.

Most studies found no statistically significant differences between tamoxifen and aromatase inhibitor groups in terms of global QOL, although—in a few cases—tamoxifen exhibited a better QOL profile in the early stages of treatment.

The QOL of post-menopausal women is unlikely to be adversely affected by long-term use of adjuvant endocrine therapy.

1 Introduction

Globally, breast cancer is the most common cancer in women and the second leading cause of cancer mortality [1]. There exist three major clinically relevant subtypes: estrogen receptor positive (ER+), human epidermal growth factor receptor (HER)-2 positive and triple negative breast cancer, with ER+ breast cancer being the most prevalent subset, representing more than 75% of all breast cancer [2].

Anti-estrogen endocrine therapy has been reported in several meta-analyses to be effective in reducing breast cancer mortality [3, 4]. Recent estimates suggest that 5 years of adjuvant anti-estrogen therapy reduces cancer mortality by about 33% [4]. There are two major classes of anti-estrogen therapy: (1) selective ER modulators (SERMs), e.g. tamoxifen, and (2) aromatase inhibitors (AIs). SERMs and AIs have major side effects, including vasomotor symptoms, myalgias/arthralgias, sexual dysfunction and thrombo-embolic toxicities [5, 6]. Although endocrine therapy has a much more favourable therapeutic index than chemotherapy, the chronicity of endocrine therapy can have a major impact on patients’ quality of life (QOL). For this reason, understanding the impact of adjuvant endocrine therapy on patient-reported QOL is clinically relevant and important.

Lemieux et al. [7] published a systematic review in 2011 focusing on QOL measurement. This was an updated review of previous work by Goodwin et al. [8] and included 190 randomized controlled trials (RCTs) conducted from 2001 to 2009 that measured QOL as an outcome in women of any age with breast cancer receiving any biomedical or non-biomedical intervention. The authors concluded that the reporting of QOL methodology could be improved. In a similar systematic review, Montazeri [9] included 477 papers ranging from descriptive studies to clinical trials that documented health-related QOL (HR-QOL) in patients with breast cancer from 1974 to 2007. The author concluded that a better understanding of HR-QOL in patients with breast cancer could be achieved by more qualitative research focusing on treatment side effects and symptoms and sexual functioning. Although very informative, the evidence synthesized in these studies may be dated and focusses on heterogeneous populations and diverse treatment options.

We performed a systematic review of published prospectively collected QOL data on adjuvant endocrine therapy, either AI or tamoxifen, in post-menopausal women with non-metastatic stage 1–3 ER+ breast cancer, where the therapeutic intent was curative. In the metastatic setting (i.e. stage 4), the therapeutic intent is palliative, i.e. delay of tumour progression, so we limited our literature review to non-metastatic ER+ breast cancer. The objectives of this systematic review were to (1) describe QOL instruments used in ER+ non-metastatic breast cancer trials and (2) document the longitudinal effects (treatment follow-up ≥ 5 years) of adjuvant endocrine therapy on the QOL of post-menopausal women with ER+ non-metastatic breast cancer.

2 Methods

2.1 Literature Search

We performed a systematic review of RCTs reporting the longitudinal effects (≥ 5 years) of adjuvant endocrine therapy with either an AI or tamoxifen in post-menopausal women with non-metastatic stage 1–3 ER+ breast cancer. Relevant studies were identified through an initial search of three electronic data sources from inception until 30 September 2016: Ovid MEDLINE, Cochrane Central Register of Controlled Trials and the US National Library of Medicine’s PubMed using controlled vocabulary as well as all applicable search terms. An update search was performed on 30 October 2017. The search was conducted in accordance with Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement criteria [10]. We used the following terms: breast neoplasms, breast cancer, breast carcinoma, anti-estrogen therapy, endocrine therapy, aromatase inhibitor, tamoxifen, letrozole, anastrozole, exemestane, quality of life, value of life, and females. We also used Boolean operators, truncation and wildcard operators (Appendix 1 in the Electronic Supplementary Material). Additionally, we manually searched the reference lists of included articles and consulted with experts in the breast cancer field to increase the likelihood of identifying further relevant articles.

2.2 Review Selection Process

The bibliographic records obtained via the literature search were exported into Endnote. Figure 1 shows the process by which articles were selected for this systematic review. First, we removed duplicates from our original sample of bibliographic records. Two reviewers (XJ and CC) then independently screened the titles and abstracts and excluded those that did not meet the eligibility criteria. XJ and CC then independently assessed the full text of the remaining articles against predetermined inclusion criteria. We tested the inter-rater agreement for the reviewers’ assessments of study selection by calculating Cohen’s kappa coefficient (κ), a chance-corrected measure of agreement [11]. Disagreements were resolved via discussion with a third reviewer (VD).
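As an illustration of the agreement statistic described above, the following minimal sketch computes Cohen's kappa for two reviewers' screening decisions. The function name `cohens_kappa` and the ten example decisions are hypothetical; they are not data from this review.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical decisions on the same records."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of records on which both reviewers agree.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions,
    # summed over all categories used by either rater.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = set(count_a) | set(count_b)
    p_e = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical screening decisions for ten records (illustration only).
reviewer_1 = ["include", "exclude", "exclude", "include", "exclude",
              "exclude", "exclude", "include", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude", "exclude",
              "exclude", "exclude", "include", "exclude", "exclude"]
kappa = cohens_kappa(reviewer_1, reviewer_2)
print(round(kappa, 3))
```

In this hypothetical example the reviewers agree on nine of ten records, yet kappa is lower than 0.9 because part of that agreement is expected by chance alone.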

Fig. 1

Flow diagram showing the study-selection process. DCIS ductal carcinoma in situ, RCTs randomized controlled trials, SDs standard deviations

Studies were included only if they described (1) an RCT of non-metastatic breast cancer with at least one arm including an adjuvant endocrine regimen; (2) the use of a patient self-report measure that assessed general or breast cancer-specific QOL; and (3) QOL outcomes at multiple time points during follow-up of at least 5 years (followed longitudinally). Studies were excluded if they (1) described only a methodological approach; (2) were duplicates using data from the same patients; or (3) enrolled patients with stage 0/ductal carcinoma in situ (DCIS) or advanced/metastatic breast cancer.

2.3 Data Extraction

XJ and CC extracted and synthesized (numerically) the data meeting the inclusion criteria using published predefined extraction forms [7]. These forms present criteria that relate to the quality of the RCTs using the Jadad scoring system [12], the summary of the QOL instruments used, and findings from QOL studies. The following criteria were used: trial name and type (superiority, non-inferiority, equivalence), geographical location, treatment arms description, mean age of patient population, QOL endpoint (primary or secondary), QOL instruments, trial and QOL study sample sizes, statistical power for QOL reporting, statistical method for QOL analysis, timing of QOL measurements, QOL study findings, and clinical significance of QOL study findings.

3 Results

3.1 Literature Search

Figure 1 shows the systematic search process. The literature search yielded 1031 bibliographic records from the three databases (871 and 160 bibliographic records obtained, respectively, from the initial and updated database searches). Tables in Appendix 1 show the search strategy and results from Ovid MEDLINE, the Cochrane Central Register of Controlled Trials and PubMed, respectively. We removed 230 duplicates from the 1031 records. Following review of titles and abstracts, 706 studies were rejected as they enrolled patients with stage 0/DCIS or advanced/metastatic breast cancer. Of the remaining records subject to full-text review, 82 were removed for a range of reasons, including that they had a follow-up time of < 5 years, were conducted in the metastatic setting or involved non-adjuvant therapies. The final set eligible for inclusion comprised 13 bibliographic records.
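The record counts reported above can be checked with simple arithmetic. The sketch below uses only the numbers given in the text; the intermediate totals (records screened and records reviewed in full text) are derived here rather than quoted from the review, and the variable names are illustrative.

```python
# Record counts as reported in the text.
initial_search = 871
update_search = 160
duplicates_removed = 230
excluded_title_abstract = 706
excluded_full_text = 82

# Derived totals at each stage of the selection process.
identified = initial_search + update_search
screened = identified - duplicates_removed
full_text_reviewed = screened - excluded_title_abstract
included = full_text_reviewed - excluded_full_text

print(identified, screened, full_text_reviewed, included)
# prints: 1031 801 95 13
```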

Inter-rater agreement for reviewer assessments of study selection was very high: 0.92 for title and abstract screening, indicating almost perfect agreement, and 0.82 for full-text screening, indicating strong agreement [11].

3.2 Quality Assessment of the Randomized Controlled Trials

Table 1 shows the results of the quality assessment of the trials. The Jadad score was calculated for each of the 13 trials based on randomization, blinding and an account of all patients [12]. Seven studies (53.8%) [13,14,15,16,17,18,19] scored 3, one (7.7%) [20] scored 4 and five (38.5%) [21,22,23,24,25] scored 5, indicating that the studies included in this systematic review were of reasonably good overall quality.
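The following sketch shows how a Jadad score in the range 0–5 can be assembled from the three components named above. The item-level rules follow the scale as it is commonly described; this review does not spell them out, so the function signature and the example trial are assumptions for illustration only.

```python
def jadad_score(randomized, randomization_method,
                double_blind, blinding_method, withdrawals_described):
    """
    Jadad score (0-5), as the scale is commonly described.

    randomization_method and blinding_method take "appropriate",
    "inappropriate", or None when the method is not described.
    """
    adjust = {"appropriate": 1, "inappropriate": -1, None: 0}
    score = 0
    if randomized:
        # 1 point for being described as randomized, +/-1 for the method.
        score += 1 + adjust[randomization_method]
    if double_blind:
        # 1 point for being described as double blind, +/-1 for the method.
        score += 1 + adjust[blinding_method]
    if withdrawals_described:
        # 1 point for an account of withdrawals and dropouts.
        score += 1
    return max(score, 0)


# Hypothetical open-label trial: appropriate randomization, no blinding,
# withdrawals accounted for.
print(jadad_score(True, "appropriate", False, None, True))  # prints: 3
```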

Table 1 Quality assessment of clinical trials

3.3 Description of the Quality of Life (QOL) Instruments in the Clinical Trials

Our review included 13 trials [13,14,15,16,17,18,19,20,21,22,23,24,25] reporting on different QOL domains in patients with non-metastatic ER+ breast cancer. Table 2 summarizes the QOL instruments used in these trials: Functional Assessment of Cancer Therapy/Functional Assessment of Chronic Illness Therapy (FACT/FACIT), Short Form-36 (SF-36), Menopause-Specific Quality of Life (MENQOL), Center for Epidemiological Studies–Depression Scale (CES-D), Symptom Checklist Depression Scale (SCL), Linear Analogue Self-Assessment (LASA), Rotterdam Symptom Checklist (RSCL), European Organization for the Research and Treatment of Cancer Quality-of-Life Questionnaire (EORTC QLQ), Personal Adjustment to Chronic Illness Scale (PACIS), and the visual analogue scale (VAS). A description of each instrument and the extent to which they were used follows.

Table 2 Summary of quality-of-life instruments used in clinical trials in this systematic review

The FACT measurement system was developed by the FACIT organization [26] and was intended to score and interpret HR-QOL for people with chronic diseases. It includes FACT-G (27 items) for general QOL, which measures four subdomains (physical well-being, social/family well-being, emotional well-being and functional well-being); FACT-B, which is FACT-G plus the nine-item Breast Cancer Subscale (BCS); and FACT-ES, which is FACT-G plus the 18-item endocrine subscale. The total possible score ranges for FACT-G, the BCS and the endocrine subscale are 0–108, 0–36 and 0–72, respectively. Higher scores indicate better QOL. The FACT/FACIT questionnaire, along with its variants, was one of the two most commonly used QOL instruments (38.5%) in the selected trials [16, 17, 19, 21, 22].
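The three score ranges quoted above (0–108, 0–36 and 0–72) are what the stated item counts of 27, 9 and 18 give if every item is rated from 0 to 4. That per-item range is an assumption of this sketch, not something stated in the text, and the helper name `score_range` is hypothetical.

```python
MAX_ITEM_SCORE = 4  # assumption: each item is rated on a 0-4 scale


def score_range(n_items, max_item_score=MAX_ITEM_SCORE):
    """Lowest and highest possible summed score for a block of items."""
    return (0, n_items * max_item_score)


# Item counts as given in the text.
item_counts = {"27 items": 27, "9 items": 9, "18 items": 18}
ranges = {label: score_range(n) for label, n in item_counts.items()}

for label, (low, high) in ranges.items():
    print(f"{label}: {low}-{high}")
```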

SF-36 is a 36-item questionnaire that measures overall QOL for the general population [27] across eight physical and emotional domains: physical function (ten items), role-physical (four items), bodily pain (two items), general health (five items), vitality (four items), social functioning (two items), role-emotional (three items) and mental health (five items). Each scale is directly transformed into a scale of 0–100, assuming each question carries an equal weight. Higher scores indicate better QOL. The SF-36 questionnaire was the other of the two most commonly used questionnaires (38.5%) in the trials [13, 16, 23,24,25].
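The transformation to a 0–100 scale can be sketched as a linear rescaling of a raw scale score between its lowest and highest possible values. This is a generic illustration rather than the official SF-36 scoring algorithm, which also involves item recoding; the function name `to_0_100` and the example scale (ten items each scored 1–3, raw range 10–30) are assumptions.

```python
def to_0_100(raw, lowest, highest):
    """Linearly rescale a raw scale score onto 0-100."""
    if highest <= lowest:
        raise ValueError("highest must exceed lowest")
    if not lowest <= raw <= highest:
        raise ValueError("raw score outside the possible range")
    return (raw - lowest) / (highest - lowest) * 100


# Hypothetical ten-item scale with each item scored 1-3 (raw range 10-30).
print(to_0_100(25, 10, 30))  # prints: 75.0
```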

MENQOL is a tool to assess HR-QOL in the immediate post-menopausal period [28]. It is a 29-item self-administered questionnaire in which each item assesses the impact of one of four domains of menopausal symptoms: vasomotor, psychosocial, physical and sexual. The average for each domain scores between 1 (not at all a problem) and 8 (experiencing symptom in the domain at the highest degree of bother). In this review, MENQOL was used in 30.8% of the trials, making it the third most commonly used QOL instrument [20, 23,24,25].

CES-D is a 20-item self-administered tool that measures symptoms associated with depression [29]. The total score can range from 0 to 60 and is obtained by summing the individual score for each item. A responder is said to be at risk for clinical depression if the total score is ≥ 16. In total, 15.4% of the trials used the CES-D to measure a dimension of QOL [17, 19].
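The CES-D scoring rule described above can be sketched as a sum with a cut-off. The 0–3 range per item is inferred from the 20 items and the 0–60 total and is an assumption here; the sketch also assumes that any reverse-scored items have already been recoded. The function names and example responses are hypothetical.

```python
CESD_ITEMS = 20
CESD_CUTOFF = 16  # total score at or above which a responder is considered at risk


def cesd_total(item_scores):
    """Sum 20 item scores, each assumed to lie between 0 and 3."""
    if len(item_scores) != CESD_ITEMS:
        raise ValueError("expected 20 item scores")
    if any(not 0 <= s <= 3 for s in item_scores):
        raise ValueError("item scores must be between 0 and 3")
    return sum(item_scores)


def at_risk(total):
    """Apply the cut-off of 16 stated in the text."""
    return total >= CESD_CUTOFF


# Hypothetical responder (illustration only).
responses = [1] * 16 + [0] * 4
total = cesd_total(responses)
print(total, at_risk(total))  # prints: 16 True
```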

The RSCL is a self-reported questionnaire that measures four main dimensions of QOL in patients with cancer: (1) physical symptom distress, (2) psychological distress, (3) activity level and (4) overall global life quality [30]. The scores for these dimensions range from 23 to 92, 7 to 28, 1 to 8, and 1 to 7 for dimensions 1–4, respectively. High scores correlate with poor QOL, except for activity, for which higher scores correlate with better QOL. The RSCL was used in 7.7% of the trials selected for this systematic review [13].

The EORTC QLQ-C30 was developed to assess the QOL of patients with cancer and comprises five functioning scales (i.e., physical functioning, role functioning, cognitive functioning, emotional functioning and social functioning), a global health status/QOL scale, three symptom scales (fatigue, pain and nausea/vomiting), and six single items (dyspnoea, appetite loss, sleep disturbance, constipation, diarrhoea and perceived financial impact). Each scale of the QLQ-C30 is converted to a 100-point scale [31]. High scores represent higher response levels. For instance, a high score for a functional scale indicates a high/healthy level of functioning, whereas a high score for a symptom scale corresponds to more frequent or more intense symptoms. The EORTC QLQ-C30 was used in 7.7% of the trials selected in this systematic review [15].

PACIS is a tool used in clinical trials to assess coping in patients with operable breast cancer [32]. The score on this scale ranges from 0 (no effort at all to cope) to 100 (a great deal of effort to cope). Only 7.7% of the selected trials used the PACIS [14].

The VAS is a commonly used 100-millimeter horizontal or vertical scale with verbal anchors at the low and high endpoints. It is used to measure various subjective experiences, such as pain, on a continuum of values [33]. VAS scores are obtained by measuring the distance between the low endpoint of the line and the point representing the intensity of the responder’s experience. A VAS was used in 7.7% of the trials included in this systematic review [13].

LASA is commonly used to measure functioning, overall QOL and other symptoms [34]. It resembles a 100-millimeter thermometer (line) with descriptors at opposite extremes (0 low endpoint or anchor and 100 upper endpoint). For each respondent, a score is obtained by measuring the distance (mm) between the low endpoint and the patient’s mark. LASA was used in 7.7% of the trials included in this systematic review [18].

3.4 Main Findings Regarding QOL in Clinical Trials in Non-metastatic Estrogen Receptor-Positive Breast Cancer

Tables 3 and 4 summarize the characteristics of the included clinical trials and their main findings regarding QOL. All studies involved adjuvant endocrine therapy in post-menopausal women with non-metastatic (stage 1–3) breast cancer in at least one treatment arm and were published in or after the year 2000. The RCTs covered a variety of anti-estrogen treatment regimens ranging from monotherapy to combinations of the following treatments: tamoxifen, letrozole, anastrozole and exemestane. Interventions were compared with another anti-estrogen therapy, chemotherapy, radiation therapy, toremifene or placebo. The trials were conducted in North America [16, 20, 23], Europe [13], and Asia [17, 19], as well as in multiple regions [14, 18, 21, 22, 24, 25]. In total, 12 trials followed patients for 5 years, and one study had a follow-up time of 10 years. In all trials, QOL was measured as a secondary endpoint in a sub-population of the trial. The mean age of patients included in the clinical trials ranged from 59 to 65.1 years, and 38.5% of the trials reported power calculations for QOL outcomes. The main findings in terms of QOL outcomes are presented below by treatment comparison.

Table 3 Basic information of included studies
Table 4 Main findings regarding quality of life

3.5 Hormonal Therapy vs. Hormonal Therapy

Eight studies in this systematic review investigated a hormonal therapy compared with another hormonal therapy [15, 17,18,19, 21,22,23, 25].

Cella et al. [21] and Fallowfield et al. [22] compared 5 years of anastrozole alone, tamoxifen alone, and anastrozole plus tamoxifen. A slight improvement was seen in the overall QOL of all three groups during the first 2 years of treatment. No statistically significant difference was seen between intervention groups over the 5 years. Endocrine scores for both arms decreased initially and recovered partially during the first 2 years after initiation of therapy. However, side-effect profiles differed between treatment groups: patients receiving an AI experienced more sexual side effects but fewer cold sweats than those receiving tamoxifen.

Francini et al. [15] investigated tamoxifen compared with AIs and found no statistically significant difference in global health status between groups, measured with the EORTC QLQ-C30.

Ohsumi et al. [17] compared tamoxifen versus AIs and found that total FACT-G scores were better in the tamoxifen group than in the anastrozole group. The scores in the tamoxifen group were stable over time, whereas those in the anastrozole group declined but not significantly. Additionally, FACT-B scores remained stable, without any clinically significant change [17].

Whelan et al. [25] compared tamoxifen versus placebo. No statistically significant between-group differences were found in summary scores measured with the SF-36, although differences between interventions might exist across dimensions measured by the QOL instrument.

Goss et al. [23] studied 5 years of AIs + letrozole compared with 5 years’ AI alone using MENQOL. The authors reported no significant treatment effects on domains of MENQOL over time [20, 23, 24].

Pagani et al. [18] investigated toremifene compared with tamoxifen using the LASA. No significant difference between toremifene and tamoxifen was found on any of the ten items assessed. LASA scores for hot flushes declined initially, indicating decreased QOL associated with hot flushes, but began to recover with further follow-up after month 12.

Takei et al. [19] studied tamoxifen compared with two types of AIs in the N-SAS BC 04 trial, which included three arms: two AIs alone versus tamoxifen alone. Scores of overall QOL increased after treatment began. The tamoxifen group achieved better QOL than the AI group in the first year. FACT-B scores increased after treatment began and remained significantly higher in the tamoxifen group than in the AI group for 1 year. Endocrine-related symptoms remained unchanged in all groups and did not differ between the interventions. In addition, this study used FACT-G scores to measure global QOL among those receiving exemestane versus tamoxifen versus anastrozole and found no significant differences across groups. FACT-G scores increased after treatment began for the tamoxifen group. However, the maximum change in score over time for the tamoxifen group was four points, meaning there were no significant changes for the tamoxifen group over time [19].

3.6 Hormonal Therapy + Chemotherapy vs. Hormonal Therapy

Crivellari et al. [14] used the PACIS to study QOL with tamoxifen alone compared with tamoxifen + the combination of cyclophosphamide, methotrexate and fluorouracil (CMF). All treatment groups showed substantial improvement in QOL scores during adjuvant therapy. Longer initial cytotoxic therapy delayed improvement in QOL scores.

Mamounas et al. [20] used the MENQOL to study the effects of 5 years’ tamoxifen followed by exemestane compared with 5 years’ tamoxifen alone. They found no significant treatment effects in the vasomotor (p = 0.87), psychosocial (p = 0.27), physical (p = 0.13) or sexual (p = 0.23) scales.

Muss et al. [24] investigated 5 years’ tamoxifen followed by exemestane compared with 5 years’ letrozole alone using the SF-36 and MENQOL. There was no difference in QOL at 24 months among letrozole- and placebo-treated patients aged ≥ 70 years [24].

3.7 Hormonal Therapy + Chemotherapy vs. Hormonal Therapy + Chemotherapy

Buijs et al. [13] investigated a high-dose chemotherapy regimen with radiotherapy and tamoxifen compared with a conventional chemotherapy regimen with radiotherapy and tamoxifen. The time horizon was 5 years, with QOL measured using the SF-36, VAS and RSCL. The authors found HR-QOL to be negatively affected by high-dose chemotherapy compared with conventional-dose chemotherapy. Differences were not statistically significant 1 year post-randomization.

Land et al. [16] conducted a study of CMF + 5 years of tamoxifen versus CMF + placebo versus adriamycin + cytoxan (AC) + tamoxifen versus AC + placebo. The QOL instruments used were FACT-B, SCL and SF-36. The results of the study showed that the effects of the tamoxifen arm, surgery, tumour size group, and age were either not statistically significant or were of negligible magnitude.

4 Discussion

This study aimed to describe the QOL instruments used in ER+ non-metastatic breast cancer trials and to document the long-term effects of adjuvant endocrine therapy on the QOL of post-menopausal women with ER+ non-metastatic breast cancer. To this end, we conducted a systematic review, retrieving citation data from three major electronic databases from their inception to 2017. A variety of QOL instruments have been used to capture different dimensions of QOL in ER+ non-metastatic breast cancer trials, with FACT/FACIT, MENQOL and SF-36 being the most common. In addition, most studies found no statistically significant differences between tamoxifen and AI groups in terms of global QOL, although, in a few cases, tamoxifen exhibited a better QOL profile in the early stages of treatment.

The results of this systematic review are concordant with those found in previous systematic reviews covering similar topics [7, 9], except that our study focused on post-menopausal women with ER+ non-metastatic breast cancer. This systematic review was built on structured search and data extraction strategies that enabled the retrieval of bibliographic records from several electronic databases over an up-to-date and wide-ranging time horizon. Our study has several limitations that warrant highlighting. First, no search strategy is perfect in terms of sensitivity, specificity and accuracy, meaning there is a chance we missed potentially relevant articles. Second, as acknowledged by Lemieux et al. [7], it was not straightforward in all cases to clearly establish that QOL results contributed to the “trial authors’ decision to recommend the use of” a treatment regimen. Third, given the variation among the RCTs included in our review in terms of sample size, comparators, QOL instrument used, and timing of QOL measurement, a meta-analysis was not feasible.

Despite the above limitations, this review is the first to focus on QOL and its change over time (≥ 5 years) in post-menopausal women with non-metastatic breast cancer who have received adjuvant endocrine therapy with either tamoxifen or AI therapy.

This systematic review suggests that the QOL of post-menopausal women is unlikely to be adversely affected by the long-term use of adjuvant endocrine therapy (AI or tamoxifen). Nonetheless, efforts are needed to harmonize the use of QOL instruments and the reporting of QOL data to enable quantitative assessment of QOL for patients receiving adjuvant endocrine therapy and thereby empower clinicians and patients in their shared decision making.