Introduction

Breast cancer can be divided into different subtypes, based on immunohistochemical (IHC) marker expression or on gene expression array data. IHC markers are routinely used in most diagnostic labs. Hormone receptor status is used for the assignment of endocrine therapy, whereas HER2 overexpression identifies tumors sensitive to trastuzumab. The intrinsic subtypes based on gene expression analysis, first defined by Sorlie et al. [1] in 2001, have also gained widespread acceptance as a way of classifying breast tumors. Since gene expression data are often not available, approximations of intrinsic subtypes based on routinely available tests have been proposed. Recently, the expert panel of the St. Gallen meeting [2] proposed to classify tumors for therapeutic purposes based on such “surrogate intrinsic subtypes”. Luminal A disease (characterized by positive hormone receptor status, absence of HER2 amplification and “low” Ki67 [<14 %]) is stated to require only endocrine therapy, whereas in Luminal B/HER2−, Luminal B/HER2+, HER2-enriched and triple negative disease, chemotherapy should be considered. In case of HER2 amplification, trastuzumab should be considered as well. For the ER+/HER2− tumors, the new surrogate intrinsic subtype classification replaces the concept of endocrine responsiveness, which was proposed at the St. Gallen 2007 meeting [3] to separate the ER+/HER2− tumors into a chemotherapy-sensitive and a chemotherapy-resistant subgroup.

Recently, a large analysis by the German Breast Group focused on these surrogate intrinsic subtypes and chemotherapy responses in an attempt to establish the association between pathological complete remission (pCR) and survival for each of the different subtypes [4]. This analysis suggested that it is important to divide a breast cancer dataset according to subtypes, as the relation between pCR and survival depended strongly on the subtype. pCR was a suitable surrogate end point for patients with luminal B/HER2−, HER2+ (non-luminal), and triple-negative disease, but not for patients with luminal B/HER2+ or luminal A tumors. Another study applied the surrogate definitions to define different prognostic breast cancer groups [5]. However, gene expression data were not available in either study, so the association between the original gene expression based classification, the surrogate intrinsic subtypes and chemotherapy response could not be established.

In this study, we compared the power of gene expression based intrinsic subtypes and of surrogate intrinsic subtype definitions to predict neoadjuvant chemotherapy response. To this end, surrogate intrinsic subtypes were determined in 560 patients, and response and survival analyses were performed. PAM50 gene expression based intrinsic subtypes were available for 247 tumors.

Patients and methods

Patients

Pre-treatment biopsies of 560 primary breast tumors were collected. All patients had received neoadjuvant treatment at the Netherlands Cancer Institute between 2004 and 2012 as part of ongoing clinical trials, or were treated off protocol according to the standard arms of one of these studies. The studies had been approved by the ethical committee and informed consent was obtained from all patients. Tumors were either at least 3 cm in size or had axillary lymph node metastases proven by fine needle aspiration (FNA). Biopsies were taken using a 14 G core needle under ultrasound guidance. After collection, specimens were snap-frozen in liquid nitrogen and stored at −80 °C. Each patient had two or three biopsies taken to ensure that enough tumor material was available for both diagnostic and research purposes.

Depending on the particular study, a treatment regimen was assigned to each patient. For HER2-negative tumors, this consisted of one of the following: (1) six courses of dose-dense doxorubicin/cyclophosphamide (ddAC); (2) six courses of capecitabine/docetaxel (CD); or (3) a switch from ddAC to CD, or vice versa, if the therapy response was considered “unfavorable” by MRI evaluation [6] after three courses. Study results have not been published yet. However, a first analysis shows that switching of chemotherapy regimen may be beneficial for initially non-responding tumors (manuscript in preparation, de Rigter et al.). pCR rates were not significantly different for patients treated with ddAC or CD. All HER2+ patients were treated with a regimen of three cycles of 8-weekly courses of paclitaxel, trastuzumab, and carboplatin. Hormone receptor positive patients received both chemotherapy and endocrine therapy.

Response evaluation

The pathological response in the resection specimen after chemotherapy was used as an endpoint. Patients with a complete absence of invasive tumor cells (irrespective of carcinoma in situ) in the surgical specimen of the breast and of the lymph nodes were considered to have a pathological complete remission (pCR). All other patients (partial responses and non-responses) were included in the “no pCR” category.

Pathology

All tissue sections were reviewed by a consultant breast cancer pathologist (J.W.) for histological classification and immunohistochemical assessment. Samples were scored as positive for ER and/or PR by immunohistochemistry (IHC) when at least 10 % of the tumor cell nuclei showed staining for ER or PR, respectively. A sample was scored as HER2 positive when either strong membrane staining (3+) was observed by IHC, or chromogenic in situ hybridization (CISH) revealed amplification of HER2 in samples with moderate (2+) membrane staining at IHC. Ki67 staining was performed with the MIB1 antibody (Dako, Glostrup, Denmark), dilution 1:250. In another study by our group (manuscript in preparation, Rigter et al.) we assessed the best Ki67 cut-off to differentiate Luminal A and Luminal B tumors in our tumor set (Supplementary Fig. 1). We found that the 95 % confidence interval of this cut-off ranged from 10 to 15 %. A value of 14 % was then taken as the cut-off between low and high proliferation index, as this cut-off was also described by Cheang et al. [7]. For surrogate intrinsic subtyping based on grade, grades 1 and 2 were combined into the low proliferative group and grade 3 represented the high proliferative tumors.

For ER+/HER2− tumors, we determined the endocrine responsiveness, as was described by Colleoni et al. [8]. Tumors were classified as highly endocrine responsive when ER and PR were positive in at least 50 % of the cells and incompletely endocrine responsive when either ER or PR was positive in less than 50 % of the cells. Table 1 summarizes all different subtype definitions used in this study.

Table 1 Subtype definitions and numbers per subtype in this study
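
The definitions described above can be written down as a small set of rules. The following Python sketch is illustrative only and is not the code used in this study. It assumes the St. Gallen scheme as summarized in the Introduction; the function names, the label strings, and the choice to count a Ki67 value of exactly 14 % as "high" are assumptions.

```python
from typing import Optional

KI67_CUTOFF = 14.0  # percent; Ki67 below this value counts as "low" proliferation


def surrogate_subtype(
    hr_positive: bool,
    her2_positive: bool,
    ki67_percent: Optional[float] = None,
    grade: Optional[int] = None,
    use_grade: bool = False,
) -> Optional[str]:
    """Return a surrogate intrinsic subtype label, or None if the
    proliferation marker needed for the ER+/HER2- split is missing.

    hr_positive: ER and/or PR scored positive (at least 10 % stained nuclei).
    use_grade:   split ER+/HER2- tumors by grade (3 = high) instead of Ki67.
    """
    if her2_positive:
        return "Luminal B/HER2+" if hr_positive else "HER2-enriched"
    if not hr_positive:
        return "Triple negative"
    if use_grade:
        if grade is None:
            return None
        high = grade == 3
    else:
        if ki67_percent is None:
            return None
        high = ki67_percent >= KI67_CUTOFF  # assumption: exactly 14 % is "high"
    return "Luminal B/HER2-" if high else "Luminal A"


def endocrine_responsiveness(er_percent: float, pr_percent: float) -> str:
    """For ER+/HER2- tumors: 'highly' if ER and PR are both >= 50 %,
    otherwise 'incompletely' endocrine responsive."""
    if er_percent >= 50 and pr_percent >= 50:
        return "highly"
    return "incompletely"


# Hypothetical tumor: HR+, HER2-, Ki67 20 %, grade 2
print(surrogate_subtype(True, False, ki67_percent=20))
print(surrogate_subtype(True, False, grade=2, use_grade=True))
```

For this hypothetical tumor the Ki67-based rule returns Luminal B/HER2− while the grade-based rule returns Luminal A, which illustrates how the two surrogate definitions can disagree for the same patient.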

Gene expression data

mRNA isolation and extraction from the frozen material were performed as described previously [9]. In short, a 5-μm section halfway through the biopsy was stained with hematoxylin and eosin and analyzed by a pathologist for tumor percentage. Only samples that contained at least 60 % tumor cells were subsequently analyzed on a microarray. All samples were labeled and hybridized to Illumina 6v3 arrays (Illumina, La Jolla, CA), according to the manufacturer’s protocol. Data were log2 transformed and between-array normalized using simple scaling. The single sample predictor “PAM50” [10] was used to assign a molecular subtype to the samples based on their expression profiles across the intrinsic gene set. First, the intrinsic genes were selected from the Illumina 6v3 platform; when a single gene was represented by multiple probes, the probe with the highest variance was chosen. Subsequently, for each sample the Spearman correlation between that sample (i.e., the expression of its intrinsic genes) and the centroid of each molecular subtype was calculated. Each sample was then assigned to the subtype with the highest correlation.
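
The two steps described above (choosing one probe per gene, then nearest-centroid assignment by Spearman correlation) can be sketched in plain Python. This is a minimal illustration, not the PAM50 implementation itself: the function names are hypothetical, the toy centroids are invented for the example, and real PAM50 centroids cover 50 genes.

```python
from math import sqrt
from statistics import pvariance


def pick_probe_per_gene(probe_values, probe_to_gene):
    """For each gene, keep the probe with the highest variance across samples."""
    best = {}
    for probe, values in probe_values.items():
        gene = probe_to_gene[probe]
        var = pvariance(values)
        if gene not in best or var > best[gene][1]:
            best[gene] = (probe, var)
    return {gene: probe for gene, (probe, _) in best.items()}


def average_ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks.
    Undefined (division by zero) if either vector is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(vx * vy)


def assign_subtype(sample, centroids):
    """Return the subtype whose centroid correlates best with the sample,
    together with all correlations."""
    correlations = {name: spearman(sample, c) for name, c in centroids.items()}
    return max(correlations, key=correlations.get), correlations


# Toy example with 5 "genes"; centroid values are invented for illustration.
toy_centroids = {"Subtype X": [10, 20, 30, 40, 50], "Subtype Y": [5, 4, 3, 2, 1]}
label, corr = assign_subtype([1.0, 2.0, 3.0, 4.0, 5.0], toy_centroids)
print(label, corr)
```

Because the Spearman correlation depends only on the rank order of the genes within a sample, the assignment is insensitive to monotone rescaling of a single array.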

Statistics

Fisher’s exact test was used to assess associations between the dichotomized response measures and pathological and molecular markers. Survival curves were constructed using the Kaplan–Meier technique, and survival was compared using log rank tests. Multivariate analysis for chemotherapy response (pCR) and Cox regression models for recurrence free survival were performed to correct for age, T-stage and N-stage. The contribution of the intrinsic subtypes was assessed by the change in log likelihood ratio χ² between a model containing only clinical variables (age, T-stage, N-stage) and a model including clinical variables and (surrogate) subtype information. To assess concordance between PAM50 and the surrogate intrinsic subtypes, cross-tables were constructed and kappa values were calculated. All analyses were performed in SPSS 17.0.
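
As an illustration of the concordance measure, Cohen's kappa compares the observed agreement between two classifications with the agreement expected by chance. The sketch below is not the SPSS procedure used here; it assumes both classifications have already been mapped to a shared set of labels, and the labels in the example are hypothetical.

```python
from collections import Counter


def cross_table(labels_a, labels_b):
    """Count how often each pair (label from A, label from B) occurs."""
    return Counter(zip(labels_a, labels_b))


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: (observed - expected) / (1 - expected)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement from the marginal frequencies of each label.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single label")
    return (observed - expected) / (1 - expected)


# Hypothetical labels for four tumors under two classifications
method_a = ["LumA", "LumA", "LumB", "LumB"]
method_b = ["LumA", "LumB", "LumB", "LumB"]
print(cross_table(method_a, method_b))
print(cohen_kappa(method_a, method_b))
```

In this toy example three of four tumors agree (observed agreement 0.75) against a chance agreement of 0.5, giving a kappa of 0.5.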

Results

Patient and tumor characteristics

In this study we show data of all breast cancer patients who received neoadjuvant chemotherapy in a single institution. Since the start of the neoadjuvant program in 2004, 560 patients completed neoadjuvant treatment and had surgery. For a subset of 247 patients, gene expression data were available and PAM50 intrinsic subtypes were determined. Table 1 shows the different definitions of breast cancer subtyping as well as the number of cases per group. We show, in order, the following subtype classifications: IHC-based classification, PAM50 gene expression subtype, surrogate intrinsic subtype based on Ki67, surrogate intrinsic subtype based on grade, and endocrine responsiveness. Patient and tumor characteristics are presented in Table 2. Note that for the TN and HER2+ samples, the surrogate intrinsic subtypes completely overlap with the IHC-based classification. The concept of endocrine responsiveness was only developed for the ER+/HER2− tumors, so it is only shown for this group.

Table 2 Patient and tumor characteristics

Response for the surrogate intrinsic subtypes

We assessed response to neoadjuvant chemotherapy for the different subtype classifications. pCR rates are shown in Table 3 for the IHC-based subtyping, PAM50 subtyping, the surrogate intrinsic subtype definitions (based on both Ki67 staining and histological grade), and endocrine responsiveness. For the TN and HER2+ tumors, the surrogate intrinsic subtypes are identical to the clinical IHC-based subtypes, so the response rates for surrogate subtypes are only shown for the ER+/HER2− tumors. In this subgroup the various classifications are meant to select a chemotherapy-sensitive and a chemotherapy-resistant group. Therefore, we focus in the rest of this study mainly on the ER+/HER2− group. The surrogate intrinsic subtype based on grade divided this subgroup into a Luminal B/HER2− subgroup with limited response to chemotherapy (14.6 % pCR) and a Luminal A group with almost no pCRs (2.3 % pCR) (p = 0.002). The PAM50 gene expression based subtype, the surrogate intrinsic subtype based on Ki67 and endocrine responsiveness did not differentiate ER+/HER2− tumors into groups with significantly different chemotherapy responses.
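
A comparison of pCR rates between two subgroups of this kind is a 2 × 2 table, for which a two-sided Fisher's exact test can be computed directly from the hypergeometric distribution. The sketch below shows the calculation in Python. The counts in the example are hypothetical and are not the data of this study; the function name is an assumption.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance so equally likely tables are not lost to rounding.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Hypothetical counts (NOT study data): rows = subgroup, columns = pCR / no pCR
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))
```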

Table 3 Response according to different subtype classifications

For the ER+/HER2− tumors, we computed multivariate models to assess whether the subtypes have predictive power for chemotherapy response. We tested four different models, one employing PAM50, one employing Ki67, one employing grade, and one employing endocrine responsiveness, as these variables are used to define a Luminal A and a Luminal B tumor group. Age, T-stage, and N-stage were included as covariates. In the model including grade as a variable, grade 3 tumors had a higher probability of achieving a pCR than grade 1 or 2 tumors (OR = 6.38, p = 0.004). PAM50, Ki67 and endocrine responsiveness were not associated with pCR rates in a multivariate model (Table 4). By computing the change in log likelihood χ² between a model with only clinical covariates (age, T-stage, and N-stage) and a model including both clinical covariates and the subtype, we can compare the fit of the models. Higher values indicate more added predictive information. The model including grade adds the most information in predicting pCR rates.
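
The change in log likelihood χ² between two nested models can be computed from their maximized log likelihoods. The sketch below assumes that the subtype enters the model as a single binary variable (for example Luminal A versus Luminal B), so that the statistic has 1 degree of freedom; the log likelihood values are hypothetical, and the function names are illustrative.

```python
from math import erfc, sqrt


def lr_chi2_change(loglik_clinical, loglik_full):
    """Likelihood ratio chi-square: 2 * (logL_full - logL_clinical).

    Equivalent to the drop in -2 log likelihood when the subtype
    variable is added to the clinical-only model.
    """
    return 2.0 * (loglik_full - loglik_clinical)


def chi2_p_value_1df(statistic):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom."""
    if statistic < 0:
        raise ValueError("the full model cannot fit worse than the nested model")
    return erfc(sqrt(statistic / 2.0))


# Hypothetical log likelihoods (NOT study data)
loglik_clinical = -50.0   # age, T-stage, N-stage only
loglik_with_subtype = -46.0  # clinical covariates plus one binary subtype variable
change = lr_chi2_change(loglik_clinical, loglik_with_subtype)
print(change, chi2_p_value_1df(change))
```

A larger change means the subtype variable adds more information beyond the clinical covariates, which is how the models are ranked in the text.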

Table 4 Multivariate analysis of pCR in ER+/HER2− tumors

Survival analysis

To assess whether the surrogate intrinsic subtypes can discriminate tumors with different survival, Kaplan–Meier curves for recurrence free survival (Fig. 1) were computed for IHC, PAM50, surrogate subtypes (one based on Ki67 and one based on grade), and endocrine responsiveness. For the surrogate definitions and for endocrine responsiveness, we only show the curves for the ER+/HER2− tumors, as the triple negative and HER2+ subtypes are the same in the surrogate subtype definitions as in the IHC subtypes. Indeed, the different classifications separated the tumors into classes with distinct recurrence free survival (Fig. 1a–e), with the exception of the surrogate subtypes based on Ki67 (Fig. 1c, log rank p = 0.19). As expected, the triple negative or basal groups had the worst survival. Luminal A tumors had the best survival. Of the surrogate definitions, the definition based on grade had the lowest p value, indicating that grade was most strongly associated with survival. We should note that our data are preliminary, as the median follow-up time was only 29 months (range 2–82), and 169 patients had a follow-up of less than 2 years. This is very limited, especially for the Luminal subtypes, in which recurrences mostly occur late, between 5 and 10 years after treatment. We performed Cox regression analysis to determine whether the surrogate definitions still predict survival after correction for age, T-stage, and N-stage. Table 5 shows that, of the surrogate definitions, the one based on grade was the best predictor, with the lowest p value and the highest hazard ratio. Patients with a Luminal B/HER2− surrogate intrinsic subtype (based on grade) have an approximately five times higher risk of recurrence within the first years after surgery than patients with a Luminal A subtype tumor (HR = 5.3, p = 0.002). PAM50 based subtypes and endocrine responsiveness are also significantly associated with recurrence free survival (HR = 9.65, p = 0.02 for PAM50 and HR = 4.81, p = 0.01 for endocrine responsiveness). When we consider the change in log likelihood ratio χ², grade and endocrine responsiveness add the most predictive information, followed by PAM50 based subtypes.
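
For readers less familiar with the Kaplan–Meier technique, the estimator multiplies, at each time point with at least one recurrence, the fraction of patients at risk who remain recurrence free; censored patients leave the risk set without changing the estimate. The Python sketch below is a minimal illustration with hypothetical follow-up data, not the analysis code of this study.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of recurrence free survival.

    times:  follow-up time per patient (e.g. months)
    events: 1 if a recurrence was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        recurrences = 0
        leaving = 0
        # Process all patients sharing the same follow-up time together.
        while i < len(data) and data[i][0] == t:
            recurrences += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if recurrences:
            survival *= 1 - recurrences / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored patients both leave the risk set
    return curve


# Hypothetical follow-up in months (NOT study data)
months = [6, 12, 12, 20, 30]
recurred = [1, 1, 0, 0, 1]
for t, s in kaplan_meier(months, recurred):
    print(t, round(s, 3))
```

With short follow-up, as in this dataset, most patients are censored early, so the later part of such a curve rests on few patients at risk.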

Fig. 1 RFS curves for, in order, the traditional IHC subtypes (log rank p = 0.0002) (a), the PAM50 subtypes (log rank p = 0.003) (b), the surrogate intrinsic subtypes based on Ki67 (log rank p = 0.19) (c), the surrogate intrinsic subtypes based on grade (log rank p = 0.003) (d), and endocrine responsiveness (log rank p = 0.0067) (e)

Table 5 Multivariate Cox proportional hazards analysis of the risk of recurrence (recurrence free survival) in ER+/HER2− tumors

Discussion

In this study, we collected data from 560 patients in a single cancer center who took part in an ongoing neoadjuvant chemotherapy program. For 247 patients gene expression data were available. We investigated whether different intrinsic subtype definitions could help to predict neoadjuvant chemotherapy benefit in breast cancer. We compared IHC-based subtypes, gene expression based PAM50 subtypes, surrogate intrinsic subtypes based on either Ki67 or grade, and endocrine responsiveness. The surrogate intrinsic subtyping was identical to the IHC subtypes for the basal and the HER2-positive tumors. Clinically, the surrogate intrinsic subtypes have been suggested to be mainly important for dividing the clinical ER+/HER2− tumors into classes with different prognoses and chemotherapy responses. In our series, histological grade had the best predictive power. We therefore prefer to continue the use of the conventional IHC subtyping based on hormone receptors, HER2 and histological grade rather than the “surrogate intrinsic subtype” classification, at least in the setting of neoadjuvant chemotherapy.

Our results are in line with a recent study by von Minckwitz et al. [4], as both studies show that subtypes based on grade can differentiate tumors with different chemotherapy sensitivity. However, von Minckwitz et al. did not compare the different subtype definitions. By comparing the association between the different subtype classifications and chemotherapy response, we see that grade is better than Ki67 or endocrine responsiveness in identifying groups of tumors with different chemotherapy response in this dataset. A recent review by the Breast International Group and North American Breast Cancer Group [11] recommends the use of surrogate intrinsic subtypes based on Ki67, but acknowledges that the definition of low and high proliferation might be subject to change. Our study contradicts this recommendation, as we see that grade has more predictive power than Ki67. We should note that the number of responding ER+/HER2− tumors is small in our dataset, and a definite conclusion should therefore be based on larger datasets.

Breast cancer intrinsic subtypes have been the subject of many studies and a variety of classifications exist [1, 10, 12–15]. Different classifications are said to lead to the same molecular classification of breast cancer into the five subtypes; however, a recent study by Weigelt et al. [16] contradicts this assumption. In that study the agreement between three different gene expression based intrinsic subtyping methods was assessed. Only the basal subtype showed good agreement between classifiers, whereas for the other subtypes the agreement was fair to moderate. Consequently, the authors conclude that it is too early to introduce the intrinsic subtypes into clinical practice. We determined the concordance between the PAM50 based classification, immunohistochemical subtypes, surrogate intrinsic subtypes, and endocrine responsiveness (Supplementary Table 1). In line with Weigelt’s findings, the concordance was good for the basal-like tumors, but modest for the other subtypes.

This study has several limitations. First, there were missing values for Ki67 staining and for grade. We used both to differentiate between Luminal A and Luminal B/HER2− tumors [2]. As seen in Supplementary Fig. 2, there is a correlation between Ki67 staining and grade, but it is far from perfect. Both Ki67 staining and histological grade determination have their pros and cons. Cheang et al. determined a cut-off percentage of positive nuclei to classify a sample as either Luminal A or Luminal B, and found that the false positive and false negative rates were both 25 % [7]. Also, the cut-off of Cheang et al. is based on patients treated in the adjuvant setting, which is different from the neoadjuvant setting. It has been recommended that each laboratory establishes its own optimal value. In our dataset, the optimal cut-off lies between 10 and 15 %; we therefore continued to use the 14 % cut-off, as assessed by Viale and co-authors [17–19]. Apart from this, the superiority of grade over Ki67 is perhaps not surprising. In addition to proliferative activity, grade also incorporates two other characteristics of the tumor: the degree of tubule formation and the degree of nuclear pleomorphism, both of which could correlate with chemosensitivity. A second limitation is that survival data were derived from a limited follow-up period, as the study included many patients from recent years. The median follow-up was 29 months (range 2–82). This is relatively brief for the luminal subtypes, which are subject to late recurrences that may occur up to 15 years or later after the diagnosis of the primary tumor [20]. Strong points of this study are that we have a large, single institution dataset. For nearly half of the patients (247 of 560) we had both gene expression and IHC data. HER2 status was checked by CISH in tumors with intermediate IHC scores. Further, all pathology data were reviewed by the same expert breast pathologist. Also, chemotherapy regimens were the same for most patients: the large majority of the HER2− tumors were treated with six courses of anthracycline-based or anthracycline-taxane-based chemotherapy, and the HER2+ tumors received trastuzumab in combination with carboplatin and paclitaxel.

In conclusion, our findings do not make a strong case for the “surrogate intrinsic subtype” terminology. For the ER+/HER2− tumors, the routine determination of IHC positivity for the estrogen receptor, HER2 in situ hybridization and grade yields the best separation of prognostic (in terms of RFS) and predictive (in terms of pCR rates) subgroups. For the other surrogate subtypes, ER and HER2 define the surrogate subtype (by definition). As the concordance between surrogate intrinsic subtype and PAM50 subtype is far from perfect, the terminology appears needlessly confusing without adding clinically useful information.