Introduction

Breast cancer is the most common cancer in women, and over 1.6 million cases are diagnosed annually [1]. It is a highly heterogeneous disease with respect to clinical and molecular characteristics. Both adjuvant endocrine therapy and chemotherapy, after initial surgery, have proven to be highly effective methods to reduce the risk of disease recurrence, preventing both local and distant metastasis [2] and reducing mortality. Despite the proven benefits of adjuvant endocrine therapy in women with hormone receptor positive breast cancer, relapses still occur even after initial treatment with endocrine therapy for 5 years. Adjuvant chemotherapy has also proven to be effective in reducing the risk of recurrence within the first 5 years after diagnosis [2], but little is known about its effect on long-term outcome.

For women with HR-negative disease, the risk of recurrence is confined mostly to the first 5 years after diagnosis and relapse rates fall rapidly thereafter [3,4]. However, women with HR-positive tumors remain at risk for late recurrences, and the annual rate is in excess of 2% for at least 15 years, even after 5 years of tamoxifen therapy [5]. The ATLAS (Adjuvant Tamoxifen: Longer Against Shorter) and aTTom (Adjuvant Tamoxifen: To Offer More?) trials have reported their findings, and results show that extended tamoxifen therapy (5 additional years) in women with early-stage breast cancer reduces the risk of late recurrences [6]. Similar findings were observed for aromatase inhibitors, which have been shown to reduce the risk of late relapse if treatment was extended beyond the initial 5 years [7,8]. However, there is a need to identify those women at high risk of recurrence and who will benefit most from these extended therapies. Thus, the benefits of extended treatment must be weighed against potential side effects of prolonged therapy for each individual patient. In this article, we will review the evidence for the use of clinicopathological factors, individual biomarkers, and multi-gene/molecular scores for the prediction of late recurrence, especially distant recurrence.

Clinicopathological parameters

Tumor size, stage, and nodal involvement are routinely used to estimate the likelihood of breast cancer recurrence and are relevant for both estrogen receptor (ER)-positive and -negative cancer. These clinical parameters are useful for predicting recurrence in the first 5 years after diagnosis [9]. Distant recurrence has been associated with large tumor size, poorly differentiated disease, and nodal involvement [10], and these factors are believed to be correlated also with late metastasis. However, only a few studies have investigated the utility of these clinical factors in this setting. The Netherlands Cancer Institute studied 252 breast cancer patients with clinical data, long-term follow-up, and treatment details [11] and found that, of the clinical parameters, only nodal involvement was a significant prognostic factor for late metastasis. In the large translational Arimidex, Tamoxifen Alone or in Combination (transATAC) trial, we confirmed the role of nodal status but also found tumor size to be an independent prognostic factor for early and late distant recurrence, whereas grade was predictive only in the first 5 years after diagnosis (Table 1) [9]. These two findings are in agreement with previously reported results that nodal involvement is the dominant feature predicting disease-specific outcome [4]. Clinicopathological factors nevertheless differ in their value for identifying women at high risk of late recurrence, and better discrimination is desirable. Many women with node-positive disease receive chemotherapy, but some may be over-treated and may be at a sufficiently low risk of recurrence 5 years after diagnosis. This raises the question of the duration of endocrine treatment, and it is not yet clear which women will benefit most from extended therapy.

Table 1 Individual clinical and immunohistochemical markers for the prediction of early versus late recurrence in transATAC

Cancer-related genes

Markers for the prediction of breast cancer have been widely investigated, particularly biomarkers for early breast cancer diagnosis. The evaluation of ER, progesterone receptor (PgR), Ki67, and human epidermal growth factor receptor 2 (HER2) is common in clinical practice for prognostic purposes and treatment decisions. However, laboratory variability in Ki67 scoring is well known and therefore this biomarker is not ideal for clinical decision making [12]. Other markers, such as B-cell lymphoma 2, androgen receptor, epidermal growth factor, phosphatase and tensin homolog, and PIK3CA, have been investigated for their prognostic value in breast cancer. Tumors with mutations in PIK3CA have been shown to be associated with lower recurrence and mortality rates in the late time period [13,14]. Liu and colleagues [15] analyzed predictors of late relapse in early-stage ER-positive breast cancer by comparing the primary tumor tissue of patients with distant relapses occurring early (less than 3 years) with that of patients with relapses occurring late (after 7 years). A set of genes was identified that was prognostic specifically for early relapse (CALM1, CALM2, CALM3, SRC, CDK1, and MAPK1), but they also identified genes that appear to predict late relapse (ESR1, ESR2, EGFR, BCL2, and AR). High expression levels of BCL2 have also been shown to be a good predictor of late breast cancer recurrence in a subset of 73 patients with primary breast cancer [16]. TP53 gene mutation is very frequently found in breast cancers [17]. High expression of p53 in HR-positive tumors has been associated with risk of both early and late recurrence [18]. Furthermore, Bianchini and colleagues [19] investigated a mitotic kinase score (average expression of 12 kinases) and an estrogen-related score (four genes) and observed that women who recurred late (after 5 years) had highly proliferative and highly estrogen-sensitive tumors. In contrast, in the transATAC trial, in which ER, PgR, Ki67, and HER2 as measured by immunohistochemistry were analyzed for their prognostic value for late distant recurrence, no prognostic information for late recurrence was observed for any of these four markers (Table 1) [9]. It remains to be seen whether these various genes can prospectively predict late (distant) recurrence and be implemented in the clinical setting.

Molecular scores

Over the past decade, many molecular markers have been developed for the prediction of recurrence risk. Most of them are good prognostic markers during the first 5 years after diagnosis (Table 2). All of these markers have in common that they evaluate the patient’s individual risk of recurrence (ROR) by combining expression profiles of a panel of cancer-related genes. However, they need to be combined with traditional clinicopathological factors to make use of all available information on prognosis.

Table 2 Summary of multi-gene/molecular scores for the prediction of recurrence

Currently, seven genomic assays are available for use in early-stage breast cancer (Table 2). All of these give an overall risk assessment of breast cancer recurrence and provide prognostic information not contained in clinicopathological factors. None of these signatures was specifically developed for predicting late (distant) recurrence; nevertheless, some of them have been investigated in this context, as summarized below.

Current genetic tests for the prediction of recurrence

MammaPrint

The MammaPrint (Agendia, Irvine, CA, USA) is a 70 gene-based molecular score, which was developed in a cohort of women who did not receive systemic therapy and with no long-term clinical outcome data [20,21]. The genes included in the MammaPrint are all related to the regulation of cell cycle, invasion, metastasis, proliferation, and angiogenesis and were retrospectively selected. This multi-gene score classifies women into low-risk (good prognosis) or high-risk (bad prognosis) groups according to their 10-year distant recurrence risk of less than 10% or more than 10%, respectively, but no continuous score is available for MammaPrint. The signature was validated in two studies in women with node-negative or node-positive disease and showed that the test predicted breast cancer recurrence accurately [20,21]. The predictive value of this test is being evaluated in the MINDACT (Microarray In Node-negative and 1-3 node-positive Disease may Avoid ChemoTherapy) trial, in which women are randomly assigned to chemotherapy plus endocrine therapy or endocrine therapy alone if the results of MammaPrint and Adjuvant Online! are discordant [22]. The MammaPrint has been evaluated in many patient subgroups, but the prognostic information added by this assay is confined to the first 5 years after diagnosis.

Genomic grade index

The Genomic Grade Index (Ipsogen, now known as Qiagen Marseille, Marseille, France) is a 97 gene-based assay including mostly proliferation genes and was developed to refine the commonly used histological grade assessment (low, intermediate, and high). The assay classifies women into low- or high-grade groups and has been shown to better define tumor grade, patient prognosis, and breast cancer subtypes [23,24]. Again, this assay has been validated in many different patient subgroups, and the prediction of (distant) recurrence was limited to 5 years after diagnosis.

Oncotype DX recurrence score

The 21 gene-based Oncotype Recurrence Score (Genomic Health, Redwood City, CA, USA) is a well-established multi-gene assay, which was developed to assess the risk of recurrence in women with HR-positive, node-negative breast cancer treated with tamoxifen [25]. The signature is based on 16 breast cancer-specific genes and five reference genes, including information on proliferation, estrogen, invasion, HER2, and other factors [25]. A mathematical algorithm was used to generate a continuous recurrence score (RS), with a higher RS corresponding to an increased risk of recurrence. The RS furthermore classifies women into low-risk (less than 18), intermediate-risk (18 to 30), and high-risk (more than 30) groups for recurrence. The RS has been validated and evaluated in a number of clinical trials and patient subgroups, and all results confirm the prognostic performance of the RS for (distant) recurrence in the first 5 years after diagnosis [25-27].
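The risk grouping described above amounts to a simple threshold rule, which can be sketched as follows. This is only an illustration of the cutoff points quoted in the text (less than 18, 18 to 30, more than 30); the function name `rs_risk_group` is hypothetical, and the sketch does not reproduce the proprietary algorithm that derives the RS from gene expression.

```python
def rs_risk_group(rs: float) -> str:
    """Map a recurrence score (RS) to the risk group quoted in the text.

    Hypothetical helper for illustration only; it applies the published
    cutoff points and nothing else.
    """
    if rs < 18:
        return "low"            # RS less than 18
    if rs <= 30:
        return "intermediate"   # RS from 18 to 30
    return "high"               # RS more than 30


# Illustrative scores chosen to sit on either side of each cutoff point.
for score in (17, 18, 30, 31):
    print(score, rs_risk_group(score))
```

The grouping discards the information carried by the continuous score, which is one reason the analyses discussed below report both continuous and categorical results.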

The Oncotype RS has also been investigated in postmenopausal women receiving aromatase inhibitors as adjuvant treatment. In the transATAC trial [28], the RS has been evaluated in 1,125 postmenopausal women who were chemotherapy-free and who received either tamoxifen or anastrozole alone for 5 years. In this patient group, the RS was shown to add significant prognostic information for distant recurrence, independently of clinicopathological factors and Adjuvant Online!. The results furthermore confirmed that a combination of molecular predictors, as found in the RS, and clinical factors would provide a good prognostic tool for clinicians and oncologists [28].

The predictive value of the Oncotype RS has not been evaluated in the transATAC trial, but other studies have shown that women with a low RS will benefit little if at all from additional chemotherapy [26,27,29,30]. Nevertheless, the RS analysis in the transATAC trial confirmed indirectly that those with a very low RS, even with one to three positive nodes, will not benefit from additional chemotherapy.

The prognostic performance of the RS for late distant recurrence was furthermore evaluated in the transATAC trial [9] and compared with clinical factors and other molecular scores. The Oncotype RS added significant prognostic value in the first 5 years after breast cancer diagnosis but failed to be substantially predictive of late distant recurrence in the overall population in years 5 to 10 when adjusted for clinical parameters. The RS was more prognostic for late metastasis in women with HER2-negative disease and added little prognostic value for those with node-positive disease [9]. The RS was least prognostic for late distant recurrence when compared with other multi-gene scores, such as the PAM50 ROR score or the immunohistochemistry 4 (IHC4) score [31] (ER, PgR, Ki67, and HER2). The results of this analysis showed that the RS is not a strong candidate for the prediction of late distant recurrence.

Prosigna PAM50 risk of recurrence

The PAM50 ROR (NanoString Technologies, Seattle, WA, USA) score is based on a 50-gene test, which was developed to identify intrinsic breast cancer subtypes [32,33]. The ROR is derived from an expression profile of the 50 genes and includes information on tumor size as well. The ROR score has been validated in women with node-negative or node-positive disease [32,33] and has been shown to classify women into low- or high-risk groups and added prognostic information beyond that of clinical or IHC4 factors.

In the transATAC trial, the PAM50 has been evaluated for the ability to add prognostic information for distant recurrence beyond that of clinical factors [34]. Furthermore, the prediction of distant recurrence in years 0 to 10 was compared with the Oncotype RS and IHC4 as well. The results showed that the ROR added significant prognostic information beyond that of clinical parameters in all patients and furthermore in all subgroups as well. In addition, it was shown that the ROR was more predictive of distant recurrence overall than the Oncotype RS and categorized more patients into the high-risk group and fewer into the intermediate-risk group than the RS, indicating better discrimination of risk groups.

The ROR score was also evaluated in the ABCSG-8 (Austrian Breast and Colorectal Cancer Study Group 8) trial, in which postmenopausal women with early breast cancer were randomly assigned to receive tamoxifen or anastrozole for 5 years [35]. In this large analysis, the ROR score added significant prognostic value beyond that of clinical parameters for distant recurrence in the overall population and all subgroups. It also confirmed the better discrimination between low- and high-risk groups in all subgroups.

The ROR score was further investigated for the prediction of late distant recurrence in the transATAC trial [9]. In this context, the ROR score was the most prognostic overall and in all patient subgroups, when compared with the Oncotype RS and IHC4, and reclassified more women into the low- or high-risk group than the other two tests. This was the first analysis to show that the ROR score, though developed to predict overall recurrence, added significant prognostic information for late distant recurrence. In a combined analysis of the transATAC and ABCSG-8 trials, the ROR score was investigated specifically for the prediction of late distant recurrence [36]. A total of 2,137 postmenopausal women who did not have a recurrence in the first 5 years after diagnosis were included in the analysis. In both the univariate and bivariate analyses (adjusted for clinical parameters), the ROR score added significant prognostic information for late distant recurrence in all patients and was more predictive in node-negative and node-negative/HER2-negative patients than clinical factors alone [36]. The results of this analysis indicated that the ROR score is able to identify women who are at sufficiently low risk of late distant recurrence, even if they have node-positive disease, and who might be spared additional endocrine therapy and therefore overtreatment.
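Restricting the analysis to women who were recurrence-free at 5 years is a landmark-type selection, which can be sketched as follows. This is a minimal illustration under stated assumptions, not the method used in the cited trials: the `Patient` record, the function names, and the five example patients are all hypothetical and are not trial data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Patient:
    patient_id: str
    followup_years: float      # time from diagnosis to distant recurrence or last contact
    distant_recurrence: bool   # True if follow-up ended with a distant recurrence


def landmark_cohort(patients: List[Patient], landmark: float = 5.0) -> List[Patient]:
    """Keep women still under follow-up and recurrence-free at the landmark."""
    return [p for p in patients if p.followup_years > landmark]


def late_recurrences(cohort: List[Patient], window_end: float = 10.0) -> List[Patient]:
    """Distant recurrences between the landmark and the end of the late window."""
    return [p for p in cohort if p.distant_recurrence and p.followup_years <= window_end]


# Hypothetical records for illustration only.
patients = [
    Patient("A", 3.2, True),    # early recurrence: excluded from the landmark cohort
    Patient("B", 7.5, True),    # late recurrence in years 5 to 10
    Patient("C", 9.0, False),   # recurrence-free, follow-up ended at 9 years
    Patient("D", 4.0, False),   # follow-up ended before the landmark: excluded
    Patient("E", 12.0, True),   # recurrence after year 10: outside the late window
]

cohort = landmark_cohort(patients)
late = late_recurrences(cohort)
print([p.patient_id for p in cohort], [p.patient_id for p in late])
```

A full analysis would model time to event within the landmark cohort rather than count events; the sketch shows only which patients enter the late-recurrence comparison.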

Breast cancer index

The Breast Cancer Index (BCI) (bioTheranostics, San Diego, CA, USA) is a gene expression module based on two components - the HOXB13/IL17BR (H/I) and the molecular grade index (MGI), which is a proliferation module. The continuous BCI score with a cubic component was developed in tamoxifen-treated women with lymph node-negative disease and has been shown to be a good predictor of distant recurrence in this cohort [37]. A second risk model was developed by combining H/I and MGI into a linear, continuous risk score [38], and this model provided better and statistically significant prognostic information for distant recurrence in node-negative patients. Furthermore, the linear BCI stratified the majority of women as low risk for distant recurrence in years 0 to 5 and also in years 5 to 10 after diagnosis.

The BCI was further evaluated in the transATAC cohort in which both the cubic BCI and the linear BCI were investigated for the prediction of distant recurrence in the early and late follow-up period [39]. The results of this analysis showed that only the linear BCI as a continuous score was an independent strong predictor for distant recurrence in women with node-negative or HER2-negative/node-negative disease. As a categorical score, the linear BCI compared low-, intermediate-, and high-risk groups, and significant differences in distant recurrence rates at 5 years of follow-up were found for the high-risk group when compared with the low/intermediate-risk group (1.3% versus 5.6% versus 18.1%; P < 0.001). Conversely, in years 5 to 10 after diagnosis, the low-risk group had a recurrence rate of 3.5%, which was significantly lower when compared with the intermediate-risk (13.4%) and high-risk (13.3%) group (P = 0.001). To further evaluate whether both components of the BCI score predict for late distant recurrence, an exploratory analysis was performed to investigate the prognostic performance of each component separately. The results showed that, although both components added prognostic value in the first 5 years after diagnosis when adjusted for clinical parameters, only H/I also significantly predicted for late recurrence [39]. The finding that MGI is not prognostic for late distant recurrence is consistent with other results [9] in which proliferation-related factors, such as Ki67 or clinical tumor grade, are not prognostic for late distant recurrence. Overall, the BCI identifies women with node-negative disease at increased risk of late distant recurrence and might be used in the clinical setting to identify women who might benefit from further endocrine therapy. However, this signature has not been evaluated in women with node-positive disease.

EndoPredict

The EndoPredict (EP) (Sividon, Cologne, Germany) has been developed for women with HR-positive/HER2-negative disease and includes information on eight cancer-related and three normalization genes [40]. The signature has been validated in a cohort of women from two large clinical trials (ABCSG-6/8) who were treated with adjuvant endocrine therapy only. Apart from the EP score, a second score, EPclin, has been developed, which combines the EP score with clinical information, such as nodal status and tumor size. Both signatures are available as continuous scores or as low-risk and high-risk groups as defined by pre-specified cutoff points. In both validation cohorts, the EP was an independent predictor of distant recurrence beyond that of clinical variables. Furthermore, the EPclin provided additional prognostic information beyond that of clinicopathological parameters alone in women with HER2-negative breast cancer. In the same two cohorts, the EP and EPclin were specifically investigated for the prediction of late distant recurrence [41]. The results showed that both signatures are clearly associated with late metastasis. The EPclin furthermore identified a subgroup of patients who have a very good prognosis after 5 years of endocrine therapy.

Novel predictors

None of the multi-gene tests discussed above was specifically developed for the prediction of late recurrence, and there is great interest in new biomarkers for the prediction of these events. The ultimate goal of a new biomarker is that it specifically predicts either early or late recurrence. Cheng and colleagues [42] identified a 51-gene signature from primary breast cancer tumors, which is particularly associated with late recurrence. Although this gene signature has not been validated yet, it suggests that novel biomarkers might help to identify women who might be at risk of late recurrence and are therefore candidates for extended therapy.

A new area of great interest is the identification of circulating tumor cells (CTCs) in peripheral blood of patients with breast cancer (from either primary tumor or metastasis). CTCs have been linked to worse prognosis and early relapse in breast cancer and can be used for the identification of response to treatment [43]. Many studies have shown that CTCs in early breast cancer predict recurrence within the first 5 years after diagnosis [44-47]. An overview of all studies investigating CTCs for detection of metastatic breast cancer was recently published [48] (Table 3). CTCs thus far have been investigated mostly in the metastatic setting, and further research is needed for the early breast cancer setting.

Table 3 Summary of ongoing clinical trials with circulating tumor cells in metastatic breast cancer

In addition, the detection of one or more CTCs in the first 5 years after diagnosis has been associated with late relapse in early breast cancer [49]. A potential advantage of CTCs is that their presence can be measured at several time points during disease from blood samples (‘liquid biopsies’) and thus give information on disease progression. However, it is still not clear what the best detection method for CTCs is, as these cells occur at a very low yield. Another research area that needs to be addressed is the correct time point of measuring CTCs for late relapse. Although it has become clear that CTCs are a powerful marker for the prediction of early metastatic disease, few studies have specifically addressed the use of CTCs for the identification of late relapse, and this area of research needs further investigation.

Conclusions

It is important to accurately identify women at high risk of late (distant) recurrence as some of them may be spared extended endocrine therapy whereas others may benefit from further treatment. Risk of distant recurrence and timing vary among patients with breast cancer. Clinicopathological parameters, specifically nodal status and tumor size, are well-established predictors for late recurrence in postmenopausal women with HR-positive breast cancer. However, it has become evident that molecular signatures improve the prediction of late relapse and can identify women who are at sufficiently low risk, even with node-positive disease, who do not warrant extended endocrine therapy. The ROR score, the BCI, and the EP score all have been shown to add significant prognostic information for late (distant) recurrence in postmenopausal women who are chemotherapy-free and these results indicate that these signatures may be helpful in the clinical setting.

However, the question of which molecular test is best for each individual patient needs further research. Although all of the signatures predict early and some of them late recurrence in a variety of patients, it has become apparent that tumor biology is not all that matters. Non-clinical baseline factors, such as age or body mass index, may influence the prognostication of these signatures [50] and furthermore may help to identify specific women who will benefit most from these tests.

The difficulty that all discussed molecular signatures have in common is that the information is derived from the primary tumor, assuming that driving forces for late recurrences are in these primary tumors. This might be true for early relapses but not necessarily for late recurrence. Dormant micrometastatic cells might survive for a long time in distant organs and undergo genetic changes, which might drive the recurrence but not be present in the primary tumor [51]. CTCs (micrometastases) may stay dormant in distant organs for many years, and genetic changes may arise in these cells that make them distinctly different from those of the primary tumor [52]. Therefore, it remains a challenge to accurately classify primary tumors for the prediction of late recurrences.

Many molecular signatures have been evaluated and validated in large clinical trials, and most of them are officially approved for use in the clinic. None of the discussed signatures has been developed particularly for the prediction of late (distant) recurrence. Nevertheless, the results clearly show that some of them have great potential for the identification of women who are at high risk of developing a late recurrence, particularly those with node-negative disease. With the help of these signatures, tailored therapies and specific patient treatments can be optimized for late (distant) recurrence.