Introduction

A variety of genes involved in breast cancer biology have been studied and proposed as prognostic or predictive biomarkers, but only a few of them, such as hormone receptors and ERBB2, are used today to classify breast cancer patients and to make treatment decisions in the clinical routine [1, 2]. The introduction of microarray analysis recently led to a better characterization of breast cancer on a molecular level, underlining its biological heterogeneity and revealing that breast tumors can be grouped into different subtypes with distinct gene expression profiles and prognosis [3]. Some of these subtypes confirmed the relevance of established differences between phenotypes such as the estrogen receptor (ER) and ERBB2 status, but these studies also identified novel breast cancer subtypes or prognostic signatures of potential clinical value [3–7]. Although little overlap was observed between these gene signatures at the level of individual genes, recent data indicate that the underlying biological processes and pathways might be common [8–10].

In terms of tumor biology, proliferation has been recognized as a distinct hallmark of cancer and as an important determinant of cancer outcome [11–13]. Increased tumor cell proliferation is accompanied by cell matrix remodeling and neo-angiogenesis, which together form the basis for an aggressive tumor phenotype [14, 15]. This observation was further underlined by recent reports showing that several genes involved in gene signatures discriminating clinically relevant breast cancer subtypes were related to proliferation [3, 4, 9, 16, 17].

In the context of breast cancer molecular screening, we recently investigated by quantitative RT-PCR the expression of 60 tumor-related genes in various subsets of breast cancers from the Stiftung Tumorbank Basel (STB) [18, 19]. This gene set also comprised several genes involved in proliferation such as thymidylate synthase (TYMS), thymidine kinase 1 (TK1), topoisomerase 2-alpha (TOP2A), survivin (BIRC5) and the transcription factor E2F1. Since these genes strongly correlated with one another and since the assessment of a single gene able to accurately predict breast cancer patients' outcome would represent a major advantage for standard clinical use, we focused our efforts on the evaluation of E2F1 transcript levels as a surrogate marker for proliferation. This transcription factor is well known for being involved in the cyclin/cyclin-dependent kinase/retinoblastoma pathway and for controlling the expression of more than 1,000 genes involved in cell proliferation, differentiation and apoptosis [20–23]. In a set of 317 primary breast cancer patients with known clinical outcome (STB data set), we evaluated E2F1 mRNA expression levels with respect to other proliferation markers, ER and ERBB2 status and clinical outcome. All results obtained in our collective were subsequently validated in The Netherlands Cancer Institute (NKI) microarray data set comprising 295 breast cancer patients. Moreover, the prognostic value of E2F1 was compared with the 70-gene prognostic signature, and with other gene expression-based predictors such as the intrinsic subtypes, the wound response signature and the recurrence score available as reported by Fan and colleagues using the same NKI data set [8].

Methods

Study populations

Patients and methods have been described previously [18]. The 317 primary breast cancer tissue samples were obtained from the STB, Switzerland and were analyzed by quantitative RT-PCR (STB data set). The previously published microarray breast cancer data set reported by Van de Vijver and colleagues (NKI data set) [5] was used for validation and comparative analysis as reported by Fan and colleagues [8]. Major differences between the two study populations included the patient age, nodal status, adjuvant therapy and methodology (quantitative RT-PCR versus Agilent microarray). Detailed patient and tumor characteristics are summarized in Table 1.

Table 1 Patient and tumor characteristics

Quantitative real-time PCR analysis

Gene expression measurements by quantitative RT-PCR were performed as reported previously [24]. Total RNA was extracted using the RNeasy Mini Kit (Qiagen, Hilden, Germany) and was quality-checked on a Bioanalyzer 2100 (Agilent Technologies, Palo Alto, CA, USA). High-quality RNA samples were reverse-transcribed and PCR was carried out in 40 cycles on an ABI Prism 7000 using 2× SYBR Green I Master Mix (Applied Biosystems, Foster City, CA, USA). Relative gene expression quantities (Δ[Ct] values) were obtained by normalization against ribosomal 18S RNA.
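The normalization against 18S RNA can be illustrated with a minimal sketch. It assumes the common convention Δ[Ct] = Ct(target) − Ct(18S), so that a lower Δ[Ct] corresponds to higher expression, and a doubling of product per cycle for the optional conversion to a relative quantity. The sign convention and efficiency are assumptions not stated in the text, and the function names and Ct values are hypothetical.

```python
def delta_ct(ct_target, ct_reference):
    """Return delta-Ct = Ct(target) - Ct(reference) (assumed sign convention)."""
    return ct_target - ct_reference


def relative_quantity(dct, efficiency=2.0):
    """Convert delta-Ct to a relative quantity, assuming a fixed amplification factor per cycle."""
    return efficiency ** (-dct)


# Hypothetical Ct values for one sample (illustration only, not study data).
ct_e2f1 = 28.0  # target gene
ct_18s = 12.0   # ribosomal 18S RNA reference

dct = delta_ct(ct_e2f1, ct_18s)
print(dct)  # 16.0
print(relative_quantity(dct))
```

Under this convention, samples are compared on the Δ[Ct] scale directly or after conversion; either way the ranking of samples is the same, only reversed in direction.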

Statistical analysis

For the STB study the ER status was defined based on the mRNA level as reported previously [24], and for the NKI data set the status was defined as provided by the authors [5, 8]. The ERBB2 status was determined in both the STB and NKI data sets using mRNA expression levels for all study populations as previously described by Urban and colleagues [18].

The prognostic value of biomarkers was assessed by univariate and multivariate Cox analysis against metastasis-free survival (MFS), and in different patient subgroups according to the ER and ERBB2 status. The association of E2F1 with MFS in particular was assessed by univariate Cox analysis for various cutoff values (data not shown). For all subsequent analyses, the 30th percentile was used as the cutoff point for E2F1. Survival probabilities for MFS were calculated according to the Kaplan–Meier method, and group differences were assessed by the log-rank test. Multivariate P values were based on Wald statistics. Statistical analysis was performed with 'R' statistical software version 2.0.1 using the 'survival' package [25].
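The dichotomization at the 30th percentile and the Kaplan–Meier estimate can be sketched as follows. This is an illustration in Python rather than the R 'survival' code used in the study; it assumes expression values on a scale where higher means more transcript, linear interpolation for the percentile (Python's "inclusive" method), and it does not implement the log-rank test or Cox regression. The function names and the small data set are hypothetical.

```python
from statistics import quantiles


def split_at_percentile(values, pct=30):
    """Return (cutoff, labels); values below the pct-th percentile are labeled 'low'."""
    cutoff = quantiles(values, n=100, method="inclusive")[pct - 1]
    labels = ["low" if v < cutoff else "high" for v in values]
    return cutoff, labels


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times: follow-up time per patient; events: 1 = metastasis, 0 = censored.
    """
    pairs = sorted(zip(times, events))
    at_risk = len(pairs)
    surv = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        n_events = 0   # metastases observed at time t
        n_leaving = 0  # patients leaving the risk set at time t
        while i < len(pairs) and pairs[i][0] == t:
            n_events += pairs[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            surv *= 1.0 - n_events / at_risk
            curve.append((t, surv))
        at_risk -= n_leaving
    return curve


# Hypothetical data (illustration only, not study data).
expression = [0.2, 0.5, 0.9, 1.4, 1.8, 2.1, 2.6, 3.0, 3.3, 4.0]
months = [120, 110, 95, 100, 60, 80, 30, 45, 20, 15]
metastasis = [0, 0, 0, 1, 1, 0, 1, 1, 1, 1]

cutoff, groups = split_at_percentile(expression, pct=30)
for label in ("low", "high"):
    idx = [i for i, g in enumerate(groups) if g == label]
    curve = kaplan_meier([months[i] for i in idx], [metastasis[i] for i in idx])
    print(label, curve)
```

A group without any observed event yields an empty curve, meaning the estimated survival stays at 1.0 over the whole follow-up.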

Results

E2F1 correlated with other proliferation markers and clinical outcome

A strong and significant correlation was found between the five proliferation markers analyzed in the STB data set (Table 2). Univariate Cox regression analysis demonstrated a significant association of E2F1 as well as TYMS, TK1, TOP2A and BIRC5 expression levels with distant MFS (Table 2). Similar results were observed in the NKI data set (data not shown). In the NKI data set we also investigated Ki67. The RNA expression levels of this proliferation marker were positively correlated with E2F1 (correlation coefficient = 0.46) and were borderline significant (P = 0.02) in univariate Cox regression analysis.

Table 2 Correlation among different proliferation markers in the Stiftung Tumorbank Basel data set and association with survival

Distinct E2F1 expression patterns according to ER and ERBB2 status determined the clinical outcome

Scatter plots of E2F1 versus ER and ERBB2 expression levels in the STB data set (Figure 1a,b) revealed that ER-negative and ERBB2-positive breast tumors typically expressed high levels of E2F1, whereas in contrast low E2F1 levels (below the 30th percentile of its distribution in this collective) were detected almost exclusively in ER-positive and ERBB2-negative breast tumors. The same pattern was observed in the NKI data set (Figure 1c,d). Similar scatter plots were obtained analyzing the other proliferation markers (data not shown).

Figure 1

Estrogen receptor and ERBB2 versus E2F1 expression levels. Scatter plots of estrogen receptor (ER = ESR1) and ERBB2 versus E2F1 expression levels in (a), (b) the Stiftung Tumorbank Basel (STB) data set and (c), (d) The Netherlands Cancer Institute (NKI) data set. Open circles, no metastasis; filled circles, metastasis. Vertical lines, cutoff values for the ER and ERBB2 status, respectively; horizontal lines, 30th percentile for E2F1. Combined Kaplan–Meier analysis (metastasis-free survival) using the ER or ERBB2 status and E2F1 (30th percentile) in (e), (f) the STB data set and (g), (h) the NKI data set. Labels of the survival curves correspond to the groups as indicated on the respective scatter plot. CI, 95% confidence interval; HR, hazard ratio.

Cox univariate survival analysis performed in subsets of patients according to their ER and ERBB2 status showed that E2F1 correlated with MFS in ER-positive and ERBB2-negative tumors, but not in ER-negative and ERBB2-positive tumors (data not shown). Combined Kaplan–Meier analysis using E2F1 and the ER or ERBB2 status revealed that patients whose tumors expressed low E2F1 levels, a situation found mainly in ER-positive and ERBB2-negative phenotypes, were associated with favorable outcome, whereas patients with tumors expressing high E2F1 levels revealed a poor outcome independent of the ER and ERBB2 status (Figure 1e-h).

E2F1 correlated well with the 70-gene signature

The majority of the patients in the NKI data set assigned to the good-prognosis group by the 70-gene signature expressed low E2F1 levels and were found to be ER-positive or ERBB2-negative (Figure 2a,b). In addition, there was a strong correlation (r = 0.67) between E2F1 and the 70-gene signature (Figure 2c). In particular, 77% (69 out of 90) of patients with low E2F1-expressing tumors overlapped with patients assigned to the good-prognosis group by the 70-gene signature and were indeed found to be at the lowest risk of metastatic events. Patients with low E2F1 and a poor-prognosis signature or patients with high E2F1 and a good-prognosis signature had a comparable incidence of metastases (Table 3).
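The overlap quoted above can be checked with a short cross-tabulation. This sketch reconstructs only the low-E2F1 group from the counts given in the text (69 of 90 patients in the good-prognosis group); the remaining cells of Table 3 are not reproduced, and the function name is hypothetical.

```python
from collections import Counter


def crosstab(a_labels, b_labels):
    """Count patients for each (a, b) label pair."""
    return Counter(zip(a_labels, b_labels))


# Low-E2F1 patients only, rebuilt from the counts quoted in the text.
e2f1 = ["low"] * 90
signature = ["good"] * 69 + ["poor"] * 21

table = crosstab(e2f1, signature)
n_low = sum(n for (e, _), n in table.items() if e == "low")
share = table[("low", "good")] / n_low
print(f"{100 * share:.0f}%")  # 77%
```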

Table 3 Concordance of E2F1 with the 70-gene signature in The Netherlands Cancer Institute data set
Figure 2

Comparison of E2F1 and the 70-gene signature in The Netherlands Cancer Institute data set. (a), (b) Scatter plots of estrogen receptor (ER = ESR1) and ERBB2 versus E2F1 expression levels. Open circles, poor-prognosis group as defined by [5]; filled circles, good-prognosis group [5]. (c) Correlation between the 70-gene prognostic signature and E2F1. Open circles, no metastasis; filled circles, metastasis.

E2F1 stratification showed similar prognostic value as the 70-gene and other gene-based predictors

Kaplan–Meier analysis displayed the similar prognostic value of E2F1 and the 70-gene signature (hazard ratio = 5.1 (95% confidence interval = 2.7–9.8) and hazard ratio = 4.6 (95% confidence interval = 2.7–7.8), respectively; Figure 3a). We obtained similar results (Figure 3b–d) when E2F1 levels were compared with the breast cancer intrinsic subtypes [3], with the recurrence score [17] and with the wound response signature [7], all of these gene expression-based predictors being reported by Fan and colleagues in the NKI data set [8].

Figure 3

Kaplan–Meier analysis of metastasis-free survival. Kaplan–Meier analysis (metastasis-free survival) using (a) E2F1 expression (30th percentile) and the 70-gene signature, (b) intrinsic subtypes, (c) the recurrence score (Rsu), and (d) the wound response signature. CI, 95% confidence interval; HR, hazard ratio.

E2F1 was a strong and independent survival factor in multivariate analysis

Multivariate survival analysis including E2F1, nodal status, grade, tumor size, age, ER and ERBB2 status, and treatments revealed that only E2F1 and nodal status retained independent prognostic value in the STB data set (Table 4), and that E2F1, nodal status, tumor size, age and chemotherapy were significant in the NKI data set (Table 5). We performed a second multivariate Cox model including additionally the 70-gene signature in the NKI data set (Table 5), reconfirming that E2F1 and the 70-gene signature were significant and additive predictive survival factors together with the nodal status, tumor size and chemotherapy.

Table 4 Univariate and multivariate Cox analyses in the Stiftung Tumorbank Basel data set (n = 317)
Table 5 Univariate and multivariate Cox analyses in The Netherlands Cancer Institute data set (n = 295)

Discussion

In the present study we demonstrated that the assessment of E2F1 mRNA as a surrogate proliferation marker is a strong determinant of breast cancer outcome, particularly suitable for identifying patients at very low risk of metastasis, comparable with gene expression-based signatures such as the 70-gene signature. The prognostic component of the ER and ERBB2 status as well as different gene signatures were found to be strongly related to tumor proliferation. In fact, a large subset of patients classified with very favorable outcome shared a common molecular tumor phenotype characterized by ER-positive and/or ERBB2-negative status and low proliferation (low levels of E2F1 as well as BIRC5, TYMS, TOP2A and TK1). Moreover, the results obtained in our data set analyzed by quantitative RT-PCR were successfully validated in an independent breast cancer data set using microarray technology.

Sotiriou and colleagues developed a gene expression grade index able to reclassify breast cancer patients with tumor histological grade 2 into groups with high risk of recurrence versus low risk [9]. The gene expression grade index was developed on the basis of the analysis of five breast cancer microarray data sets including more than 600 tumors, from which the authors extracted a list of 242 genes associated with tumor grade and predicting patient outcome. Most of these genes were related to proliferation and cell survival, such as E2F1, MKI67, BIRC5, TOP2A and STK6, all being highly correlated and providing similar prognostic information. In our study, we demonstrated that the detection of a single gene is sufficient to select tumors with low proliferation. A single gene assessment requires high RNA quality from fresh (frozen) tissue, however, and might be insufficient in cases of more heterogeneous RNA quality (for example, RNA from paraffin-embedded tissues).

Breast cancer has been successfully classified using microarrays into clinically relevant subgroups based on variations in gene expression patterns. Sorlie and colleagues showed that ER-negative tumors grouped into basal-like and ERBB2 subtypes, both with poor prognosis [3]. In contrast, ER-positive breast cancers could be classified into luminal A and luminal B subtypes with significantly distinct prognosis: luminal A tumors displayed favorable outcome, whereas survival of patients with luminal B tumors was poor and comparable with those of the ER-negative ERBB2 and basal subtypes [3]. Our classification in the NKI data set revealed that 81% of the tumors expressing low E2F1 levels (below this study's cutoff point) corresponded with luminal A subtype as defined by Fan and colleagues [8], and subsequently had similar prognostic value (Figure 3b).

Van de Vijver and colleagues used a 70-gene prognostic signature to discriminate patients with good prognosis and poor prognosis [5], which according to our analysis strongly correlated with E2F1 expression levels. As shown in Figure 2, patients defined as of good prognosis by the 70-gene signature had tumors expressing low E2F1 levels and were mainly ER-positive. Despite all observed correlations, multivariate Cox analysis of the NKI data set showed that E2F1 levels and the 70-gene prognostic signature retained additive significance when both covariates were included (Table 5). This is probably due to the fact that both markers classified, in addition to the overlapping patients at very low risk, patients at similar but higher risk who would not have been selected by either classifier alone (Table 3). Furthermore, we found that almost all ERBB2-positive and ER-negative tumors expressed high levels of E2F1 and were classified as of poor prognosis according to the 70-gene signature – suggesting an explanation of why Espinosa and colleagues were unsuccessful in improving the accuracy of the 70-gene signature by incorporating additional genes such as ERBB2 [26].

Fan and colleagues [8] recently demonstrated that the different gene expression-based predictors including the 70-gene signature, the intrinsic subtypes, the wound response signature and the recurrence score were highly concordant in predicting breast cancer outcome. Our analysis revealed that low proliferation as quantified by low levels of E2F1 represented a common determinant of patients with good prognosis (Figures 2 and 3). It has to be noted that the prognostic value of E2F1 was independent of the nodal status. Indeed, 40% of the STB tumors and 50% of the NKI tumors with low E2F1 expression levels belonged to nodal-positive patients at very low risk of metastases, reconfirming the impact of proliferation recently reported in a study evaluating breast cancer patients with 10 or more positive lymph nodes [27, 28].

The STB and NKI data sets differed in adjuvant treatment modalities; in general, patients of the STB collective were older and consequently received more hormone therapy but less chemotherapy as compared with patients of the NKI collective. In this context, it has to be emphasized that treatment regimens were chosen independent of the E2F1 status (Additional file 1) and that E2F1 levels retained predictive survival significance in patients with and without different adjuvant treatments (Additional file 2). Multivariate analyses, however, revealed different treatment impacts in the two data sets (Tables 4 and 5). In the STB collective, chemotherapy was particularly significant in univariate Cox analysis but was nonsignificant in multivariate Cox models, suggesting that information about the higher risk cases receiving chemotherapy is already included in the combination of the other covariates. However, since E2F1 is co-expressed with or regulates genes such as TYMS, TK1 and TOP2A, which were mechanistically linked with response to 5-fluorouracil and anthracycline-based therapy [16, 29–32], our results with respect to specific chemotherapy response should be further investigated.

Conclusion

Since accurate monitoring of proliferation by assessing E2F1 mRNA levels, together with the determination of the ER and ERBB2 status, can be performed easily by quantitative RT-PCR even in small amounts of tissue such as core biopsies [19], we encourage the inclusion of such analyses in protocols of ongoing clinical and translational research investigations, including predictive studies with respect to specific chemotherapies.