Background

Five years of adjuvant endocrine therapy is standard treatment for patients with primary oestrogen receptor-positive (ER+) breast cancer, and it clearly improves prognosis [1]. Multiparametric molecular assays are increasingly used to estimate prognosis and guide treatment decisions of patients with primary ER+ breast cancer. These include the Oncotype DX (OncotypeIQ/Genomic Health, Inc., Redwood City, CA, USA) Recurrence Score (RS) [2], Prosigna PAM50 (NanoString Technologies, Seattle, WA, USA) [3], Breast Cancer Index (BCI) [4], EndoPredict (Myriad Genetics, Zurich, Switzerland) [5] and IHC4 [6]. All of them have been evaluated in the TransATAC series of samples that were established from patients with ER+ primary breast cancer randomised to treatment with 5 years of anastrozole or tamoxifen in the ATAC (Arimidex, Tamoxifen, Alone or in Combination) trial [7]. It has become clear that, following surgery, the risk of recurrence in ER+ primary breast cancer is not constant, which is underlined by molecular differences. In TransATAC we have previously shown that the oestrogen module of RS was prognostic within 5 years of surgery (during endocrine therapy), however it became non-informative for recurrences beyond 5 years, thus weakening the overall prognostic value of RS [8]. In the same data set, patients with high ER expression by RT-PCR were twice as likely to have a relapse 5–10 years after surgery than within the first 5 years. Bianchini et al. reported risk stratification by integrating the mitotic kinase score (MKS) and an oestrogen receptor-related score (ERS), both based on genes constituting the proliferation and oestrogen modules of RS. Women with high MKS and ERS tumours were at greater risk of late recurrence [9]. More recently, improved risk estimation beyond 5 years by RS was reported when integrated with dichotomised ER expression assessed by RT-PCR [10].

Extending endocrine therapy beyond 5 years has been shown to reduce late-recurrence rate [11, 12], however those most likely to benefit from such therapy need to be identified. Although some of the widely used prognostic assays for ER+ patients have been shown to be prognostic for risk beyond 5 years [13,14,15,16], none of them have been optimised to quantify residual risk after 5 years free from recurrence, and their ability to predict late relapse varies substantially [17]. The different time-dependent performance of multiparametric molecular signatures indicates that molecular features of ER+ breast cancers may be identified to improve prediction of residual risk in order to spare those patients with significantly low risk of late recurrence from extended endocrine therapy.

We therefore hypothesised that prognostic signatures optimised specifically for the early (0–5 years) and late (5–10 years) follow-up periods, respectively, would be more prognostic than a single signature optimised for the whole 10-year follow-up period. To test this hypothesis, we developed time-dependent prognostic signatures in patient samples from the TransATAC series for early, late and 10-year follow-up periods. The prognostic performance was tested in an independent sample set and against commercial signatures already assessed in TransATAC. Our primary aim was to compare the prognostic value of the newly developed signature(s) added to Clinical Treatment Score (CTS) [6] with that of PAM50 risk of recurrence (ROR) based on subtype and proliferation added to CTS.

Methods

Patient cohorts

Our initial analysis drew from four published breast cancer cohorts (GSE6532, GSE9195, GSE17705, GSE26971) analysed on either of the Affymetrix Human Genome HG-U133A (GPL96) and HG-U133 Plus 2.0 (GPL570) microarray platforms (Affymetrix, Santa Clara, CA, USA). The two platforms shared 22,277 probes to which we restricted our analyses. This cohort had 747 unique patient samples that matched our selection criteria: ER+, HER2−, treated with 5 years of endocrine therapy, chemotherapy-naive, with information on either distant metastasis-free survival (DMFS) or relapse-free survival (RFS) available with a long follow-up. Details of the inclusion criteria are listed in Additional file 1: Methods, and a full list of samples included in the analysis is shown in Additional file 2: Table S1.

In the TransATAC cohort, RNA was available from 948 formalin-fixed, paraffin-embedded (FFPE) tumours from the ATAC trial, previously extracted by Genomic Health Inc. (GHI) [18]. Eligibility required hormone receptor-positive/HER2− disease, without chemotherapy treatment and at least 500 ng of RNA available. One hundred eighty-three recurrence events were recorded for this cohort. This study was approved by the South-East London Research Ethics Committee, and all patients gave informed consent.

The POLAR (Predictors Of early versus LAte Recurrence in ER+ breast cancer) samples were identified from archives of Royal Marsden Hospital (RMH), London, UK, and Lund University Hospital Biobank, Lund, Sweden. Eligibility criteria were patients with ER+/HER2− early breast cancer diagnosed between January 2000 and December 2004, treated with curative intent and with a follow-up data cut-off at May 2014. Patients must have received 5 years of adjuvant endocrine therapy (unless relapse occurred within this time); (neo)adjuvant chemotherapy was permitted. A 422-sample case-control design was used; control subjects were randomly selected according to matching criteria from among the remaining cohort of patients who did not relapse during follow-up. The total number of patients drawn upon was 1449. The following four matching criteria were used in this study: (1) age at diagnosis (< 50 or > 50 years), (2) Nottingham Prognostic Index (NPI) category (< 3.4, 3.4–5.4, > 5.4), (3) type of adjuvant endocrine therapy (tamoxifen only vs. any aromatase inhibitor [AI]) and (4) chemotherapy use (yes or no). Two-hundred forty-seven recurrence events were recorded. The POLAR study was approved by the RMH Research Ethics Committee (CCR 4122) and the ethics committee of Lund University Hospital (LU 240-01).

Study endpoints

The primary endpoint was time to any recurrence, which was defined as locoregional (ipsilateral breast, contralateral breast and regional lymph nodes) and/or distant recurrence. Secondary endpoint was time to distant recurrence, which was the time from diagnosis until metastasis from the primary tumour at distant organs, excluding contralateral disease and locoregional and ipsilateral recurrences. Death before recurrence was treated as a censoring event for both endpoints.

Analytic procedures

In the microarray data set, 454 probes representing 454 genes (Additional file 2: Table S3) were analysed at univariate level; those significant in univariate analyses in a particular setting were entered into multivariable analyses. Further details are provided in Additional file 1: Methods.

For TransATAC, RNA was extracted by GHI for the RS study [18]. RNA (100 ng) was used with the nCounter platform (NanoString Technologies) to assay the 93 endogenous and 7 reference genes selected in the process of the microarray expression analysis in 948 TransATAC samples.

For POLAR, RNA was extracted from three 3 × 10-μm unstained sections with more than 40% tumour cellularity using the RNeasy FFPE kit (Qiagen, Hilden, Germany) according to the manufacturer’s instructions. RNA was quantified by using a NanoDrop instrument (Thermo Fisher Scientific, Wilmington, DE, USA). Between 50 and 200 ng of RNA was used to profile the expression of 27 endogenous and 5 reference genes with the NanoString nCounter.

NanoString expression data were background-corrected by subtracting the mean of the eight negative control probes, normalised with the geometric mean of five reference genes that had a correlation of Pearson’s r > 0.8 with all endogenous genes. The data set was then logarithmically (base 2) transformed and z-score-transformed. The KIF20A gene was detected in < 10% of samples in the TransATAC cohort and was removed from the data set. CTS, which carries information on tumour size, nodal status, grade, age and type of endocrine therapy, was calculated as published previously [6].

We trained separate early, late and 10-year signatures by performing elastic net analysis in the TransATAC training cohort. Our objective was to test if the early and late signatures had statistically significantly more prognostic power than the 10-year signature. If so, we would test the validity of the early and late signatures in the non-chemotherapy-treated subpopulation of POLAR and also test their performance in the chemotherapy-treated POLAR cohort. If the early and late signatures were not statistically significantly more prognostic than the overall signature, we would test the validity of the overall signature in the chemotherapy-naive POLAR group and explore its performance in the chemotherapy-treated POLAR group.

Statistical analyses of the cohort with microarray data were carried out at the Institute of Cancer Research (ICR) using R version 3.03 software (R Foundation for Statistical Computing, Vienna, Austria). Statistical analyses using the TransATAC cohort were performed at Queen Mary University of London with STATA version 13.1 (StataCorp, College Station, TX, USA) and R version 3.0.3 software. Statistical work on POLAR was carried out at RMH using the Statistical Analysis Plan version 2.0 and Prism 6.0c (GraphPad Software, La Jolla, CA, USA) software. Before data analysis took place, the statistical analysis plan for the TransATAC study was approved by the Long-term Anastrozole vs Tamoxifen Treatment Effects committee and that for the POLAR study was approved by the RMH Committee for Clinical Research, and these plans are described in Additional file 1: Methods. All statistical tests were two-sided.

Results

We performed the following steps in our study. We used publicly available microarray data to generate lists of prognostic genes to be analysed in the TransATAC cohort. We developed early, late and 10-year prognostic signatures in a training data set (two-thirds of TransATAC) while setting aside a test set (one-third of TransATAC) so that the performance of the newly trained signatures could be evaluated. This internal validation included comparison with commercial signatures of BCI, Oncotype DX RS, PAM50 ROR and IHC4. Finally, we conducted an external validation in the POLAR case-control sample set.

Candidate gene selection and microarray expression data analysis

In order to derive time-dependent prognostic signatures, we shortlisted 585 candidate genes representing proliferation, oestrogen signalling, immune infiltration and immune signalling. These genes were tested for prognostic significance in publicly available gene expression sets of ER+ endocrine therapy-treated breast cancer. A flowchart illustrating the approach is shown in Fig. 1. Sixty-seven genes of interest that are part of the PAM50, Oncotype DX RS, EndoPredict and BCI profilers were also included. Additional genes likely to be related to benefit from endocrine therapy were identified from 81 patients by reanalysing our previously published neoadjuvant endocrine therapy-treated set of samples [19] (https://www.synapse.org/#!Synapse:syn16243). From this dataset, we identified 164 candidate genes by examining correlation of individual gene expression from untreated biopsies with change in the following after 2 weeks of AI treatment: (1) Ki-67, (2) proliferation-associated gene cluster, (3) oestrogen-associated gene cluster, and (4) expression of the modified version of the Global Index of Dependence on Estrogen [20] genes. An additional 354 genes were selected on the basis of literature searches. Genes from published gene modules of the proliferation-associated gene cluster, oestrogen-associated gene cluster and inflammatory response signature [19], the tumour invasion/metastasis module (PLAU) [21] and IGG-14 module (immunoglobulin-gamma) [22] were also included. The complete list of candidate genes and the reason for their inclusion are detailed in Additional file 2: Table S1.

Fig. 1
figure 1

Flowchart of gene signature derivation in the microarray and TransATAC cohorts. QC Quality control, ATAC Arimidex, Tamoxifen, Alone or in Combination

Seven hundred forty-seven samples from the microarray expression dataset were compiled from four publicly available breast cancer cohorts to investigate the relationship between genes and outcome (Additional file 2: Table S2) [5, 23,24,25]. Expression data were available for 454 genes (Additional file 2: Table S3). We performed univariate Cox proportional hazards regression analyses for early, late and 10-year follow-up periods using RFS and DMFS as endpoints, respectively (six analyses), that identified 212 genes that were significant at p < 0.01 in any of the analyses (Additional file 2: Table S4). Genes significantly prognostic in a particular time period were taken forward for multivariable analyses performed by Cox proportional hazards regression with DMFS and RFS as endpoints, respectively, in the early, late and 10-year follow-up settings (six analyses). This resulted in 88 genes being selected in the models (Additional file 2: Table S5), of which 17 genes were removed owing to high correlation of expression with other candidates already selected (Additional file 2: Table S6). An additional 29 genes were added that included candidates without probes available in the microarray expression data analyses, some recently emerging candidates and also seven reference genes (Additional file 2: Table S7).

Expression profiling and signature building in TransATAC

Sample availability in TransATAC is shown in Fig. 2a. Expression data for the 100 selected genes (including housekeeping genes) (Additional file 2: Table S9) were obtained for 948 patient samples in TransATAC using the NanoString nCounter. We assessed the prognostic value of these molecular variables in TransATAC for early, late and 10-year time periods for RFS. Sixty-three genes were statistically significant in at least one of the time windows assessed (Additional file 2: Table S7, Additional file 3: Figure S1). We found different prognostic properties between early and late periods for 20 genes. Six genes were prognostic early but not in the late period (CD79, IL6ST, LRRC48, MPZL1, PGR and PIGV), and 14 genes were not significantly prognostic early but gained prognostic significance in the late setting (ANP32E, ANXA1, CTSL2, EPB41L2, ESR1, FOXA1, ICOS, IL17RB, MMP9, MYCBP2, NR2F1, PDZK1, SLAMF8 and TCF7L2).

Fig. 2
figure 2

Consolidated Standards of Reporting Trials (CONSORT) diagram of the availability of samples for analysis from (a) the ATAC trial and (b) the POLAR collection of samples. POLAR Molecular Predictors Of early versus LAte Recurrence in ER-positive breast cancer, ATAC Arimidex, Tamoxifen, Alone or in Combination, ER Oestrogen receptor, PgR Progesterone receptor, RMH Royal Marsden Hospital, LUH Lund University Hospital, ET Endocrine therapy, HER2 Human epidermal growth factor receptor 2

The TransATAC cohort was then randomly split into two-thirds (n = 634) training and one-third (n = 314) validation sets while ensuring that the recurrence rate was similar in the two subgroups. Demographics for the training, validation and overall cohorts are presented in Table 1. We aimed to select prognostic variables independent of clinicopathological features that are commonly used for prognosis. To achieve this, on top of the 63 statistically significant genes in univariate analyses, CTS was also entered into multivariable selections for early, late and 10-year time-periods, respectively. Elastic net penalised Cox regression with leave-one-out cross-validation was used for feature selection in the TransATAC training set. CTS was selected in all three signatures in addition to 18 genes in the 10-year, 16 genes in the early, and 15 genes in the late follow-up analyses. The variables and their coefficients derived from the elastic net models are listed in Table 2. CTS had the highest coefficient in each of the time periods.

Table 1 Demographics of TransATAC and POLAR cohorts
Table 2 Variables and corresponding beta-coefficients of the time-dependent 10-year, early and late signatures

Comparison of time period-optimised prognostic signatures in TransATAC validation set

TransATAC was used to validate and compare the prognostic information of the three time period-dependent signatures (Table 3). In the 0–10-year follow-up period, all three newly derived signatures were significantly prognostic, with the late signature being significantly less informative than the 10-year signature (10-year signature likelihood ratio chi-square test [LRχ2] = 28.0; early signature LRχ2 = 33.4; late signature LRχ2 = 18.1). In the 0–5-year period, the 10-year signature and early signature were equally prognostic and significantly more than the late signature (LRχ2 for 10-year signature = 14.1; LRχ2 for early signature = 14.9; LRχ2 for late signature = 8.9). In the late setting, the early signature was the most prognostic, followed by the 10-year and late signatures (LRχ2 for 10-year signature = 13.9; LRχ2 for early signature = 18.6; LRχ2 for late signature = 9.3). CTS was strongly prognostic in all three time periods (CTS 0–10-year LRχ2 = 48.7; CTS 0–5-year LRχ2 = 29; CTS 5–10-year LRχ2 = 19.8).

Table 3 Statistical analysis of TransATAC validation cohort

For the 0–10-year period, all three signatures added statistically significant prognostic information beyond that of the CTS (ΔLRχ2 for 10-year signature = 7.9; ΔLRχ2 for early signature = 10.3; ΔLRχ2 for late signature = 4.3). In the 0–5-year period none of the signatures added significant prognostic information to CTS. However, in the 5–10-year period, the 10-year and early signatures added statistically significant prognostic information to CTS (10-year signature ΔLRχ2 = 4.8; early signature ΔLRχ2 = 8.0; late signature ΔLRχ2 = 2.7).

Given that the early and the late signatures were not statistically significantly more prognostic than the 10-year signature in the respective periods they were optimised for, we rejected our primary hypothesis that signatures optimised separately for the early and the late follow-up periods, respectively, are more prognostic than a 10-year signature, but we proceeded to assess the validity of the 18-gene, 10-year signature in an independent cohort and to compare its performance with that of commercial signatures.

Signature test of 10-year validity in POLAR cohort

A matched case-control set of samples was compiled from RMH and Lund University Hospital archives (POLAR) to validate the 10-year signature (Fig. 2b, Table 1). Our aims were to test the validity the 10-year signature in an endocrine therapy-only cohort similar to the training set and also to explore if the prognostic property (if any) extends to a higher-risk, chemotherapy-treated population. The latter cohort was of interest in the 5–10-year period because of the potential for its use in selecting patients for extended adjuvant endocrine therapy.

Despite having matched cases and controls on NPI category, the CTS was still higher in cases than in control subjects: 201.9 ± 98 (SD) vs. 170.8 ± 87.6 (p = 0.0009), respectively. In a univariate analysis, CTS had an OR of 1.004 (95% CI, 1.001–1.006) for a one-unit increase. We assessed a multivariable model with CTS with and the 10-year signature, and both were found to be statistically significant: 10-year signature OR = 1.851 (95% CI, 1.194–2.868), p = 0.006; CTS OR = 1.003 (1.001–1.005), p = 0.012.

We also assessed whether the 10-year signature added significant prognostic information above CTS alone using LR tests (Table 4, Additional file 4: Table S10). In the overall POLAR cohort (n = 422), CTS was prognostic across 10 years and in the early follow-up period (CTS 0–10-year period LRχ2 = 11.23; 0–5-year period LRχ2 = 22.09), but not in the 5–10-year period. The 10-year signature was prognostic in all three follow-up periods and contributed to CTS with significant prognostic information in the 10-year and early periods (0–10-year period ΔLRχ2, CTS + 10-year signature vs. CTS = 7.74; 0–5-year period ΔLRχ2, CTS + 10-year signature vs. CTS = 7.59), but not in the 5–10-year period. Both CTS and the 10-year signature were marginally more informative across the 10 years in the chemotherapy-treated POLAR cohort than in the endocrine therapy-only population, despite the latter having more patients and events (patients, n = 170 vs. n = 252; events, 99 vs. 148). Additionally, the 10-year signature added significantly more prognostic information to CTS in the chemotherapy-treated group (ΔLRχ2: CTS + 10-year signature vs. CTS = 6.71) than among those receiving endocrine therapy only (ΔLRχ2, CTS + 10-year signature vs. CTS = 2.47).

Table 4 Statistical analysis of three groups of POLAR validation set for 0–10 years of follow-up

Prognostic properties of the 18 individual genes constituting the 10-year signature were assessed in POLAR and compared with data obtained in TransATAC. In POLAR, only 8 of the 18 genes were significantly prognostic at the univariate level (Fig. 3), but all genes except tumour necrosis factor-alpha (TNF) showed the same prognostic direction both in TransATAC and in POLAR.

Fig. 3
figure 3

HRs and ORs for the 10-year signature genes in TransATAC and POLAR, respectively. POLAR Molecular Predictors Of early versus LAte Recurrence in ER-positive breast cancer, ATAC Arimidex, Tamoxifen, Alone or in Combination

Comparison of the 10-year signature with CTS, RS, PAM50 ROR, BCI and IHC4 in TransATAC

We have previously published data on the prognostic performance of CTS, RS, PAM50 ROR, BCI and IHC4 in TransATAC [6, 15, 18, 26]; data for all scores were available for 271 patients in the validation cohort. We assessed their prognostic information for 10 years after surgery using any recurrence and distant recurrence as endpoints, respectively, and compared them with the newly developed 10-year signature (Table 5). For both any and distant recurrence, the BCI provided the most added information beyond the CTS in this set (any recurrence, CTS LRχ2 = 37.4; BCI ΔLRχ2 = 9.5; distant recurrence, CTS LRχ2 = 46.7; BCI ΔLRχ2 = 14.5, respectively). The novel 10-year signature performed similarly to the other three scores in this respect.

Table 5 Statistical analysis for all and distant recurrences in the TransATAC validation cohort

Discussion

We developed novel time-specific prognostic signatures for early, late and 10-year follow-up periods for ER+/HER2− patients treated with endocrine therapy alone to allow us to test the hypothesis that sequentially applying early and late signatures could be more prognostic for risk of relapse than a single newly developed 10-year signature. This hypothesis was based largely on our observation that the performance of some components in many of the commercially available signatures varied between these time periods. For example, we found that ESR1 and the oestrogen module overall in the RS was less prognostic in years 5–10 than in years 0–5 [8]. Analogous findings were reported by Bianchini et al. [9]. Very recently, the Early Breast Cancer Trialists’ Collaborative Group (EBCTCG) published data on clinicopathological and limited immunohistochemical data on over 60,000 women who were treated with 5 years of endocrine therapy [27]. Although progesterone receptor showed strong prognostic performance in years 0–5, it showed no significant relationship with prognosis thereafter. These data on markers associated with hormone responsiveness support the contention, but by no means prove, that cessation of endocrine treatment at 5 years may lead to increased recurrence risk in more hormonally responsive tumours. We therefore included in our assessment genes that we and others have found to be associated with the anti-proliferative response of primary ER+ breast cancer to oestrogen deprivation. Our work involved a discovery set of 747 samples; training and test sets of 634 and 314 TransATAC samples, respectively; and independent case-control series from 1449 eligible samples. As such, this was one of the largest original gene expression analyses undertaken for evaluating prognosis in ER+ breast cancer.

Of the 92 genes selected from microarray data and assessed in univariate analyses in TransATAC, we found 63 to be significantly prognostic (p < 0.05) in any of the three time periods, which is considerably more than expected by chance after allowing for multiple testing errors. For most genes, the same prognostic pattern was observed for early and late periods, however we observed some possibly different prognostic properties for 20 genes. Notably, consistent with the above arguments, higher levels of ESR1 and its pioneer factor FOXA1 showed a shift at 5 years to be associated with worse prognosis beyond 5 years, but surprisingly over the 10-year period, the two genes were associated with poor prognosis. The complementary role whereby upon stimulus ER binding to chromatin is dependent on the presence of FOXA1 is well established [28]. In our dataset, FOXA1 and ESR1 correlated highly (Pearson’s R = 0.65); the possibility that increased expression of one or both may put patients at increased risk of late relapse merits further investigation, particularly with regard to whether the genes also identify patients who benefit from extended adjuvant therapy.

The optimised time-dependent signatures derived in the TransATAC training set were rather similar to one another in makeup. All genes in the 10-year signature featured in either (or both) of the early and late signatures with their coefficients being in the same direction. The early and late signatures had five and three variables, respectively, not present in the 10-year signature, suggesting that the early and late signatures may not have captured time-specific features or that such time-specific features that exist exert a minor modulatory influence on the overall prognosis over 10 years. It is notable that CTS was consistently the most prognostic variable in the three time-dependent models and that its contribution was similar in both early and late recurrence. This is consistent with the data of the EBCTCG that classical clinicopathological features retain their strong prognostic influence beyond 5 years [27].

Given that the 10-year signature captured prognostic features of both early and late events, it is perhaps not surprising that no improvement was seen in the use of early and late signatures compared with the overall 10-year signature that led to the rejection of our hypothesis. Also, it should be noted that splitting of the 0–10-year time period into 0–5- and 5–10-year periods markedly reduces the power to detect prognostic contributions. At least a contributory factor for the lack of improvement may be the dominance of proliferation-related genes in our and other signatures. As shown in our earlier analysis of the RS, each of the individual proliferation genes and the integrated module are equally prognostic before and after 5 years [8]. Notably, this is also supported by the observation by the EBCTCG that Ki-67 was equally prognostic before and after 5 years in their overview analysis of late recurrence [27].

The 10-year signature was nonetheless validated in the POLAR sample set and provided significant prognostic information in both chemotherapy-naive and chemotherapy-treated cohorts. Moreover, it added independent prognostic information beyond that of CTS in the POLAR cohort. Comparison of the information provided by each gene showed that 8 of the 18 genes were significantly prognostic at univariate level in POLAR (4 genes at P < 0.05, 2 genes at P < 0.01 and 3 genes at P < 0.001). TNF showed an opposite prognostic direction in training and validation sets, thus weakening the performance of the signature in POLAR. TNF is a versatile pro-inflammatory cytokine that has both pro- and anti-tumour activities promoting lymphocytic infiltration and activating the nuclear factor-κB, c-Jun N-terminal kinase and mitogen-activated protein kinase pathways, and it is capable of inducing apoptosis through TNF receptors 1 and 2 [29]. It may be that the inclusion of higher-risk, chemotherapy-treated patients in POLAR contributed to the difference in TNF’s prognostic pattern; further investigation is needed to explain the relationship of TNF and risk of relapse in these cohorts.

The 10-year signature was compared with established prognostic signatures in the TransATAC validation set. Importantly, the 10-year signature was developed for the endpoint of any recurrence contrary to the endpoint of distant recurrence used in the development of RS, PAM50, ROR, BCI and IHC4. In univariate assessments, BCI and the 10-year signatures were the most informative for both all and distant recurrence. When added to CTS, all signatures assessed provided similar amounts of information, with CTS + BCI being the most informative for distant recurrence. This new signature did not outperform the established signatures, even though it was based on a large and wide-ranging analysis of both established prognostic genes and novel genes with a clear rationale for inclusion. Larger studies may be needed to fully optimise novel prognostic signatures with improved prognostic information, however the data from our studies indicate that the gain is unlikely to be large. Other approaches that assess response to treatment or integrate mutational and DNA copy number profiles or by the use of circulating tumour DNA are likely to be more fruitful.

The results presented here support the mounting evidence that better risk estimation can be achieved by combining molecular profilers with clinicopathological factors. For the three time-dependent signatures derived in TransATAC, CTS was the most prognostic in all three time-dependent signatures and provided more prognostic information than RS, ROR, BCI and IHC4, respectively. Additionally, all profilers added significant prognostic information to CTS, leading to combined signatures being significantly more informative. There is emerging evidence for genetic differences affecting outcome amongst various racial groups [30]. Although this is an important question with practical consequences, the cohort presented here was > 99% Caucasian and did not provide us with the opportunity to examine within TransATAC.

Our study has strengths and limitations. An advantage was that a large discovery cohort of 634 samples was used for signature training. All tumours were ER+/HER2− from post-menopausal patients who had received 5 years of endocrine therapy without chemotherapy. This was a homogeneous group of breast cancers, which reduced confounding factors such as tumour subtype and differing treatment lengths and types. Data for the clinical prognostic tests were obtained by the same methods as set out by the tests’ developers. The same batch of RNA was used for the newly developed signatures presented here and for the clinical prognostic tests used in the comparisons, reducing intra-sample variation. The clinical data were derived from a registration standard trial with comprehensive follow-up over 10 years. Limitations include that the candidate gene selection based on microarray data and associated clinical information from multiple studies did not allow the assessment of candidates by taking multiple clinical variables into account; this may have limited the performance of derived signatures that ultimately included CTS as a variable. Also, CTS, IHC4 and the 10-year signature were derived in TransATAC; therefore, their performance in the comparisons was slightly overestimated compared with what we would see in independent cohorts. Finally, although this study was relatively large compared with others, the splitting of the data into early and late signatures decreased the statistical power for comparisons within those time periods. The approach we have taken is likely to have somewhat overfitted the 10-year signature to the TransATAC population. An alternative approach for the derivation and validation of the 10-year signature would have been to fit the signature to the whole of the TransATAC cohort and validate it in the POLAR cohort. However, the approach we took allowed the comparison of the 10-year signature with commercially available signatures in the TransATAC test set. Had the 10-year signature not at least matched these, it would not have been worth proceeding further.

Conclusions

In summary, we found that early and late signatures are unlikely to be more informative for predicting relapse than a single signature optimised for 10 years. Larger studies may be needed to fully optimise novel gene expression signatures for prognosis in endocrine-treated ER+ patients with breast cancer, however a substantial improvement in performance is unlikely.