Introduction

Biological subtyping of breast cancer is an integral part of the standard evaluation of patients diagnosed with breast cancer. Subtyping can be done with gene expression arrays [1], but the molecular subtypes are frequently approximated with immunohistochemistry (IHC) due to its wide availability and low cost. However, IHC assays for cancer oestrogen receptor (ER), progesterone receptor (PgR) and human epidermal growth factor receptor-2 (HER2) expression carry a risk of up to 20 % of discordant or erroneous results [2, 3], and distinguishing luminal A from luminal B breast cancer requires assessment of the proliferation marker Ki-67, which is prone to high intra- and inter-observer variability [4, 5].

In this study, we compared quantitative RT-qPCR assessment of the key breast cancer biomarkers ER, PgR, HER2 and Ki-67 with their assessment by IHC or in situ hybridization, performed as part of the clinical routine, in breast cancer subtyping and prediction of patient outcome. We hypothesized that quantifying Ki-67 with RT-qPCR might result in more robust outcome predictions. To our knowledge, few such comparative data are available.

Methods

Patients

The clinical data and breast tumour tissue samples were collected within the FinHer trial (identifier ISRCTN76560285), where 1010 women with axillary node-positive or high-risk axillary node-negative breast cancer were randomly assigned between October 2000 and September 2003 to receive either three cycles of docetaxel followed by three cycles of fluorouracil, epirubicin and cyclophosphamide (FEC) or three cycles of vinorelbine followed by three cycles of FEC [6, 7]. Breast tumour erbB2 (HER2) copy numbers were determined centrally by chromogenic in situ hybridization (CISH), and women with HER2-positive cancer (n = 232) had a second randomisation between nine weekly infusions of trastuzumab, given concomitantly with either docetaxel or vinorelbine, and similar chemotherapy without trastuzumab. After a median follow-up time of 62 months since randomisation, women assigned to docetaxel had better distant disease-free survival (DDFS, the primary objective) than those assigned to vinorelbine (HR 0.66, 95 % CI 0.49–0.91; P = 0.010) [6]. The absolute benefit in 5-year DDFS in favour of the docetaxel plus FEC regimen was 5.2 % (86.8 vs 81.6 %), and 3.3 % (92.6 vs 89.3 %) for overall survival (OS) across all biological subtypes [6].

Immunohistochemistry

Immunostaining for ER, PgR, HER2 and Ki-67 was performed on tissue sections cut from formalin-fixed, paraffin-embedded (FFPE) tumour tissue at the local pathology laboratories of the 17 study sites (all located in Finland) according to each laboratory’s standard procedures.

ER and PgR were considered positive when 10 % or more of the cancer cells stained positively. Ki-67 assays were analysed by estimating the proportion of positively staining cancer cell nuclei out of all cancer cell nuclei in the tissue section, and the result was provided as a percentage ranging from 0 to 100 %. For the present study, Ki-67 expression was considered positive when ≥20 % of cancer cell nuclei stained positively. Local pathologists interpreted the ER, PgR and Ki-67 immunostaining results, as per each institute’s standard practice.

Chromogenic in situ hybridization (CISH)

Tumours with a score of 2+ or 3+ (on a scale of 0 to 3+) for HER2 expression in IHC were further analysed for HER2 gene amplification by CISH in one of two central laboratories. The HER2 status was considered positive when six or more gene copies per nucleus were present. As in the original trial [6, 7], in the present study, cancer HER2 status was considered positive whenever CISH for HER2 was positive, and negative whenever CISH was negative, regardless of the degree of HER2 protein expression in IHC.

RT-qPCR

After pathologic confirmation of representativeness of the tissue sections for presence of cancer, a single whole-face 10-μm-thick slice from each FFPE tumour block was processed with the RNXtract® RNA extraction kit (BioNTech Diagnostics GmbH, Mainz) using a magnetic particle-based assay (Supplemental file 1A). RT-qPCR was done with the MammaTyper® kit (BioNTech Diagnostics GmbH, Mainz) for ESR1, PGR, ERBB2 and MKI67, and the two reference genes B2M and CALM2 on a Versant kPCR system (Siemens, Erlangen, Germany) by applying one cycle of primer-specific reverse transcription followed by 40 cycles of nucleic acid amplification (Supplemental file 1B). The median quantification cycle (Cq) for each of the four genes of interest (GOI) was normalized against the two reference genes (REF) and presented as a 40 − ΔΔCq value relative to the positive control, obtained after subtracting the ΔCq value of the positive control (pc) from the ΔCq of the sample (s) by the formula

$$40 - \Delta\Delta\text{Cq}(\text{GOI})_{\text{S}} = 40 - \left( \left( \text{Cq}[\text{GOI}]_{\text{S}} - \text{meanCq}[\text{REF}]_{\text{S}} \right) - \left( \text{Cq}[\text{GOI}]_{\text{pc}} - \text{meanCq}[\text{REF}]_{\text{pc}} \right) \right).$$
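As an illustration only, the normalization above can be sketched in a few lines of Python. The function name and the example Cq values are hypothetical, and the sketch assumes that each gene's Cq has already been summarized (for example as the median of replicate wells) before it is passed in; it is not the MammaTyper® software.

```python
from statistics import mean


def score_40_minus_ddcq(cq_goi_sample, cq_refs_sample, cq_goi_pc, cq_refs_pc):
    """Return 40 - ddCq for one gene of interest (GOI).

    cq_goi_sample  -- Cq of the GOI in the tumour sample
    cq_refs_sample -- Cq values of the reference genes in the sample
    cq_goi_pc      -- Cq of the GOI in the positive control (pc)
    cq_refs_pc     -- Cq values of the reference genes in the positive control
    """
    # dCq: GOI normalized against the mean of the reference genes
    dcq_sample = cq_goi_sample - mean(cq_refs_sample)
    dcq_pc = cq_goi_pc - mean(cq_refs_pc)
    # ddCq: sample relative to the positive control
    ddcq = dcq_sample - dcq_pc
    # Subtracting from 40 makes higher values mean higher expression
    return 40 - ddcq


# Hypothetical Cq values, chosen only to make the arithmetic easy to follow
example = score_40_minus_ddcq(30.0, [24.0, 26.0], 28.0, [25.0, 27.0])
print(example)
```

Because the result is subtracted from 40, a sample with more target mRNA (a lower Cq) receives a higher score, which is the direction in which the cut-offs are applied.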

To exclude a major influence of varying tumour cell content (TCC) on the assay results, sensitivity studies were undertaken as previously reported [8]. A series of extreme cases with a low content of invasive carcinoma and varying amounts of ductal carcinoma in situ (DCIS) was analysed before and after macrodissection, and it could be confirmed that the TCC did not influence the final test result [9, Laible et al. submitted]. Therefore, a major influence of TCC on MKI67 mRNA expression can be excluded. Cut-offs for the markers ERBB2, ESR1 and PGR were defined in an independent technical cohort based on reference pathology IHC results. The prognostic and predictive value of MKI67 cut-offs had previously been analysed by testing objective cut-offs in 562 Affymetrix U133A datasets from breast cancer patient cohorts that had received either no systemic therapy, only endocrine treatment or a chemo-endocrine regimen [10]. In view of these analyses, the MKI67 cut-off was set at the 3rd quartile of the normally distributed MKI67 expression data from 90 FFPE breast cancer reference tumour samples and thus ought to reflect a correlate to the standard Ki-67 cut-off at 20 % positively stained nuclei.

Definition of breast cancer biological subtypes

After defining each of the four biomarkers either positive or negative, the molecular subtype of each tumour was determined using a slightly modified version of the currently proposed IHC-based breast cancer molecular subtyping algorithm [1] (Supplemental File 1C). In brief, luminal A cancers were defined as having high ESR1 and/or PGR mRNA content and low ERBB2 and MKI67 content. Luminal B cancers were defined as having high cancer ESR1 and MKI67 content, or high ESR1 content but low PGR and ERBB2 content. Cancers with a high ERBB2 mRNA content were considered as HER2-positive cancers and were not further categorized into luminal and non-luminal (“enriched”) lesions. Triple-negative cancers consisted of cancers that had low ESR1, PGR and ERBB2 mRNA content irrespective of cancer MKI67 mRNA content.

The same scheme was used to categorize the cancers according to the IHC and CISH results, but using protein expression (at IHC) and the number of HER2 gene copies (at CISH) in place of cancer mRNA content. For example, cancers that were positive for ER and PgR (with ≥10 % of the nuclei that were positive in each staining), HER2 negative (by CISH) and had low Ki-67 (<20 % of nuclei stained positively at IHC) were considered luminal A cancers.
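The classification rules described in this section can be sketched as a small Python function. This is an illustrative reading of the text, not the algorithm of Supplemental File 1C: the function name is hypothetical, and two points are assumptions. First, a tumour with high ESR1 but low PGR is assigned to luminal B even when MKI67 is low, because the luminal B rule names that combination explicitly. Second, a combination the text does not cover (low ESR1, high PGR, high MKI67) is returned as "unclassified".

```python
def classify_subtype(esr1_high, pgr_high, erbb2_high, mki67_high):
    """Assign a subtype from four binary marker calls (True = high/positive)."""
    # High ERBB2 defines HER2-positive cancer, with no further luminal split
    if erbb2_high:
        return "HER2-positive"
    # Low ESR1, PGR and ERBB2, irrespective of MKI67
    if not esr1_high and not pgr_high:
        return "triple-negative"
    # High ESR1 with high MKI67, or high ESR1 with low PGR (assumed precedence)
    if esr1_high and (mki67_high or not pgr_high):
        return "luminal B"
    # Remaining hormone receptor-positive tumours with low MKI67
    if not mki67_high:
        return "luminal A"
    # Combination not described in the text (assumption: left unassigned)
    return "unclassified"


# ER and PgR positive, HER2 negative, low Ki-67: the example given in the text
print(classify_subtype(True, True, False, False))
```

The same function applies to the IHC/CISH-based scheme when the four arguments are taken from the protein staining and CISH results instead of the mRNA calls.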

Statistical methods

The results were analysed according to a statistical analysis plan written and approved prior to the initiation of the study, and the RT-qPCR results were interpreted blinded to the clinical information. Agreement between the assays was measured with the kappa (κ) statistic, with values categorized as poor (≤0.2), fair (>0.2–0.4), moderate (>0.4–0.6), good (>0.6–0.8) or very good (>0.8) agreement, and with positive percent agreement (PPA), negative percent agreement (NPA) and overall percent agreement (OPA). The estimates are accompanied by their respective 95 % confidence intervals (95 % CI). A two-sided P value <0.05 was considered significant.
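For readers less familiar with these agreement measures, the following Python sketch computes them from a 2 × 2 table, treating one assay as the test (here RT-qPCR) and the other as the reference (here IHC or CISH). The function names and the counts in the example are hypothetical and are not data from the study; the κ formula is the standard Cohen's kappa, and the category labels follow the thresholds given above.

```python
def agreement_stats(both_pos, test_pos_ref_neg, test_neg_ref_pos, both_neg):
    """Return PPA, NPA, OPA and Cohen's kappa for a 2 x 2 agreement table."""
    n = both_pos + test_pos_ref_neg + test_neg_ref_pos + both_neg
    # Overall agreement: share of cases where both assays give the same call
    opa = (both_pos + both_neg) / n
    # Agreement among reference-positive and reference-negative cases
    ppa = both_pos / (both_pos + test_neg_ref_pos)
    npa = both_neg / (both_neg + test_pos_ref_neg)
    # Agreement expected by chance from the marginal totals
    test_pos = both_pos + test_pos_ref_neg
    ref_pos = both_pos + test_neg_ref_pos
    expected = (test_pos * ref_pos + (n - test_pos) * (n - ref_pos)) / n**2
    kappa = (opa - expected) / (1 - expected)
    return {"PPA": ppa, "NPA": npa, "OPA": opa, "kappa": kappa}


def kappa_category(kappa):
    """Map a kappa value to the categories used in the statistical methods."""
    if kappa <= 0.2:
        return "poor"
    if kappa <= 0.4:
        return "fair"
    if kappa <= 0.6:
        return "moderate"
    if kappa <= 0.8:
        return "good"
    return "very good"


# Hypothetical counts, for illustration only
stats = agreement_stats(40, 10, 5, 45)
print(stats, kappa_category(stats["kappa"]))
```

Unlike OPA, κ discounts the agreement that two assays would reach by chance alone, which is why both are usually reported together.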

The primary clinical endpoint was DDFS, defined as the time period between the date of randomisation and the date of first distant metastasis or the date of death when death preceded detection of distant recurrence. Overall survival (OS) was defined as the time period between the date of randomisation and the date of death. Survival was analysed using the Kaplan–Meier method.
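The Kaplan–Meier (product-limit) estimate used for these endpoints can be sketched in plain Python as follows. The function name and the follow-up times in the example are hypothetical; an event flag of 1 stands for an endpoint event (for DDFS, distant metastasis or death) and 0 for a censored observation. A real analysis would normally rely on an established statistics package rather than this minimal version.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  -- follow-up time of each patient since randomisation
    events -- 1 if the endpoint event occurred at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Collect every patient whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            # Product-limit step: multiply by the share surviving time t
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        # Patients with an event and censored patients both leave the risk set
        n_at_risk -= n_leaving
    return curve


# Hypothetical follow-up times in months, for illustration only
print(kaplan_meier([12, 24, 36, 48, 60], [1, 0, 1, 0, 1]))
```

Censored patients do not lower the curve themselves, but they shrink the risk set, so later events produce larger steps.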

Univariable and multivariable Cox proportional hazards models were constructed to compare prognosis between groups and to study the interactions between variables. Hazard ratios (HRs) were calculated using a univariable Cox model. In multivariable Cox models, a backward selection procedure was used to adjust for the covariables.

Results

Patients

An RT-qPCR assay of ESR1, PGR, ERBB2 and MKI67 was successfully performed on breast cancer tissue from 769 (76 %) of the 1010 patients entered into the FinHer trial. In the remaining 241 cases, cancer tissue was not available, the tissue block did not consist mostly of cancer cells, or RNA extraction did not yield good-quality mRNA. We included in this study all 719 (71 % of 1010) cases with a successful RT-qPCR assay of the four genes and with IHC data available for subtyping. The proportion of unavailable tissue samples was similar across the study treatment arms (a modified CONSORT diagram is shown in Supplemental file 1D). The characteristics of the patients and tumours included in the present study (Table 1) were similar to those of the entire FinHer trial cohort [7].

Table 1 Patient demographics, clinicopathological data and frequencies of marker binary categories

The median age of the patients at study entry was 50.9 years (range, 25.5–65.8). Tumours had a mean diameter of 26 mm ± 16 mm (6–150 mm), and the majority (n = 637, 88.6 %) had given rise to regional lymph node metastases at the time of the diagnosis. There were 511 (71.1 %) ER-positive, 395 (54.9 %) PgR-positive and 163 (22.7 %) HER2-positive cancers. After random allocation, 357 (49.7 %) patients were treated with docetaxel plus FEC, 362 (50.4 %) with vinorelbine plus FEC and 83 (50.9 %) of the 163 patients with HER2-positive cancer received trastuzumab. The median follow-up time after randomisation was 62 months, during which time period 112 patients had distant cancer recurrence and 62 died.

Concordance between mRNA and IHC assays

Tumour ESR1, PGR and ERBB2 mRNA content assessed by RT-qPCR and the corresponding protein expressions determined by IHC for ER and PgR and DNA amplification status by CISH for HER2 showed good concordance, whereas cancer MKI67 mRNA content and protein expression correlated only moderately well (Table 2). Many of the discordant cases between IHC and the mRNA assay had a high cancer MKI67 mRNA content, but despite this, <20 % of cancer cell nuclei stained positively with IHC (Fig. 1).

Table 2 Agreement between RT-qPCR-based and IHC-based biomarker assessments
Fig. 1

A scatterplot depicting the relation between tumour MKI67 mRNA content measured with RT-qPCR, and Ki-67 expression determined by immunohistochemistry (IHC). Vertical axis, tumour Ki-67 expression (IHC, %); horizontal axis, tumour relative MKI67 mRNA expression. The cut-off for positivity was 20 % in the Ki-67 protein assays (the horizontal line) and 34.8 in the MKI67 mRNA (qPCR) assays (the vertical line). Sections A and D depict the discordant cases, and sections B and C depict the concordant cases

Prognostic value of cancer MKI67 mRNA content and Ki-67 expression

Patients with low breast tumour MKI67 mRNA content or low (<20 %) Ki-67 expression had more favourable DDFS and OS as compared to those with high MKI67 mRNA content or Ki-67 expression. Each method produced roughly similar hazard ratios for DDFS and OS (Fig. 2).

Fig. 2

Influence of cancer MKI67 mRNA expression and Ki-67 protein expression on DDFS (panels a and c) and survival (panels b and d). Results obtained by measuring MKI67 mRNA expression are shown in panels a and b, and those obtained by assessing Ki-67 protein expression in panels c and d

In a multivariable Cox regression analysis where the type of chemotherapy (vinorelbine-FEC or docetaxel-FEC), the axillary nodal status (pN0, pN1, pN2 or pN3), tumour size (as a continuous variable), histological grade (as a continuous variable) and cancer MKI67 mRNA content (as a continuous variable) were entered as covariables, low tumour MKI67 mRNA content was independently associated with favourable DDFS (adjusted HR 0.51; 95 % CI, 0.29–0.90; P = 0.019) together with a negative axillary nodal status (P < 0.0001) and small cancer size (P = 0.006). A low cancer MKI67 mRNA content was also independently associated with favourable OS (adjusted HR 0.44; 95 % CI, 0.23–0.87; P = 0.018) in addition to the axillary nodal status (P = 0.003) and tumour size (P = 0.006). When Ki-67 protein expression was entered into the same models in place of cancer mRNA content, Ki-67 was not significantly associated with DDFS (P = 0.266), but when OS was selected as the endpoint, low cancer Ki-67 expression was associated with favourable survival (adjusted HR 0.43; 95 % CI, 0.24–0.77; P = 0.005) together with the axillary nodal status (P = 0.002) and small tumour size (P = 0.006).

Concordance of molecular subtyping with IHC and RT-qPCR

The method of Ki-67 assessment had a substantial impact on the distinction between luminal A and B cancers. Of the 189 cancers that were classified as luminal A by IHC/CISH, only 102 (54.0 %) were similarly classified when MKI67 mRNA expression was used in place of Ki-67 protein staining; the 87 discordant cases were classified as either luminal B (n = 75, 39.7 %) or HER2 positive (n = 12, 6.4 %, Table 3). Of the 251 cancers that were classified as luminal B by IHC/CISH, 180 (71.7 %) were similarly classified using MKI67 mRNA expression, 48 (19.1 %) were classified as luminal A, 17 (6.8 %) as HER2 positive and 6 (2.4 %) as triple negative. Of the 156 and 294 tumours classified as luminal A and luminal B by RT-qPCR, respectively, 102 (65.4 %) and 180 (61.2 %) were also classified as luminal A or B with IHC/CISH.
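The agreement percentages depend on which method supplies the denominator. The short Python sketch below recomputes the concordant luminal A and luminal B percentages from the counts quoted in this paragraph; the helper name and the dictionary layout are illustrative choices, not part of the study's analysis.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts quoted in the text: tumours assigned the same subtype by both methods
concordant = {"luminal A": 102, "luminal B": 180}
# Totals assigned to each subtype by IHC/CISH and by RT-qPCR
ihc_totals = {"luminal A": 189, "luminal B": 251}
pcr_totals = {"luminal A": 156, "luminal B": 294}

for subtype, n_same in concordant.items():
    of_ihc = percent(n_same, ihc_totals[subtype])
    of_pcr = percent(n_same, pcr_totals[subtype])
    print(f"{subtype}: {of_ihc} % of IHC/CISH calls, {of_pcr} % of RT-qPCR calls")
```

The same 102 concordant luminal A tumours therefore represent about half of the IHC/CISH-defined group but about two thirds of the RT-qPCR-defined group, because RT-qPCR assigned fewer tumours to luminal A overall.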

Table 3 Concordance of breast cancer subtypes when cancer Ki-67 expression is assessed with IHC and MKI67 mRNA expression with RT-qPCR

Influence of cancer MKI67 mRNA expression-based and Ki-67 protein expression-based subtypes on outcome

There was no significant difference in DDFS or OS between patients treated with adjuvant docetaxel plus FEC and those treated with vinorelbine plus FEC in the subsets with luminal A, HER2-positive or triple-negative breast cancer when each subtype was defined either with IHC/CISH or with RT-qPCR (DDFS and OS statistics for each subtype according to chemotherapy agent are shown in Supplemental file 1E). Interestingly, when the luminal B subtype was defined by MKI67 mRNA expression, patients treated with docetaxel plus FEC had significantly more favourable DDFS and OS as compared with those treated with vinorelbine plus FEC (for DDFS, HR 0.52, 95 % CI 0.29–0.94, P = 0.031; OS, HR 0.24, 95 % CI 0.09–0.65, P = 0.005). In contrast, no significant difference in DDFS or OS was found when the luminal B subtype was defined with Ki-67 protein expression (P > 0.10 for both analyses; Fig. 3).

Fig. 3

Distant metastasis-free survival (panels a and b) and overall survival (panels c and d) of patients treated with adjuvant docetaxel plus FEC and those treated with vinorelbine plus FEC in the subset of patients with luminal B breast cancer. Panels a and c, the luminal B subtype was defined with MKI67 mRNA expression; panels b and d, the luminal B subtype was defined with Ki-67 protein expression

The type of adjuvant chemotherapy (tested docetaxel plus FEC vs vinorelbine plus FEC) had an independent influence on DDFS in the subset of patients who had luminal B cancer defined by cancer MKI67 mRNA content in a multivariable analysis (HR 0.44; 95 % CI 0.23–0.84, P = 0.013), together with cancer histological grade (tested as a continuous variable; HR 1.67, 95 % CI 1.03–2.72, P = 0.039) and tumour size (tested as a continuous factor; HR 1.02, 95 % CI 1.00–1.04, P = 0.026). Similarly, when OS was used as the end point in place of DDFS and the luminal B subtype was defined by cancer MKI67 mRNA content, docetaxel-containing chemotherapy was independently associated with favourable survival (HR 0.22; 95 % CI 0.08–0.60, P = 0.003) together with histological grade (HR 2.29, 95 % CI 1.15–4.57; P = 0.019), while tumour size lost its significance. Unlike MKI67 mRNA content, Ki-67 protein expression did not have an independent influence on DDFS or OS in these models. When the luminal B subtype was defined with tumour MKI67 mRNA content, the interaction with the type of adjuvant chemotherapy given was significant (P = 0.040) when OS was selected as the end point, but not when DDFS was considered (P = 0.352). No significant interaction with the type of adjuvant chemotherapy was present for either OS or DDFS when the luminal B subtype was defined with Ki-67 protein expression (P = 0.658 and 0.699, respectively).

Discussion

We approximated commonly used breast cancer biological subtypes using RT-qPCR and compared the results with the subtypes defined by IHC (and with CISH to detect HER2 amplification) within the framework of a large randomized clinical trial. The subtypes defined with each method agreed moderately well with most discrepancy occurring in the luminal B subtype. Both high cancer Ki-67 protein expression and high MKI67 mRNA content were associated with unfavourable DDFS and OS in a univariable analysis with approximately similar hazard ratios, but only tumour MKI67 mRNA content remained significant in a multivariable model for DDFS when both parameters were entered into the same model after a stepwise selection process of the covariables such as tumour size, nodal status, histological grade and the type of treatment given.

A key difference between luminal A and luminal B subtypes is a higher cell proliferation rate in the latter, which is often assessed by estimating the proportion of cancer cells that stain positively for Ki-67 after immunohistochemical staining. Interestingly, when the luminal B type was defined using cancer MKI67 mRNA content in place of Ki-67 expression assessed with immunohistochemistry, patients with luminal B breast cancer were found to benefit more from adjuvant docetaxel plus FEC than from adjuvant vinorelbine plus FEC. This association could not be detected when the luminal B breast cancer subtype was defined by Ki-67 protein expression with immunohistochemical staining.

Biological subtyping of breast cancer is the basis for selection of systemic cancer treatment [1]. Of the four biomarkers commonly used for this purpose, i.e. ER, PgR, HER2 and Ki-67, the assays for Ki-67 have proved the most challenging to standardize and to make reproducible. For example, in a study carried out in a few leading pathology laboratories, there was substantial variability between the laboratories in scoring of Ki-67 expression from shared breast cancer tissue slides stained with IHC, and attempts to reduce the interlaboratory variability were only partially successful [4]. In the present study, IHC staining for Ki-67 was done locally in many pathology laboratories using the institutional staining protocols and was assessed by many pathologists, whereas cancer MKI67 mRNA content was determined centrally in one laboratory. To reduce the potential variability in Ki-67 staining and scoring, we considered also carrying out staining for Ki-67 centrally, but because of the difficulty of standardizing Ki-67 immunostaining even in leading laboratories and of establishing a reference procedure [4], we preferred to use the Ki-67 staining results reported originally by the local laboratories from whole tumour tissue sections as the comparator for the MKI67 mRNA assay.

Image analysis of Ki-67 from IHC-stained slides is a promising method to improve the reproducibility of Ki-67 scoring, but, to our knowledge, no standard parameter values for scoring of the nuclei as either positive or negative are available. To estimate how well the locally assessed Ki-67 assays done from whole tumour tissue sections might correlate with a centrally done Ki-67 assay, we analysed cancer Ki-67 expression from tissue microarrays (TMAs; whole tumour sections were not available) containing tissue from 745 breast cancers using image analysis [11]. The median cancer Ki-67 expression turned out to be similar with image analysis and locally done IHC (19.7 and 20.0 %, respectively), and the two assays showed strong correlation (P < 0.0001, Spearman’s rho 0.633). These observations suggest that centrally done image analysis of Ki-67 might have resulted in similar conclusions had it been selected as the comparator assay in place of the local Ki-67 IHC assays.

The subtypes defined with MKI67 mRNA were associated with survival outcomes that agree well with the results obtained with IHC from other clinical trials [9, 10, 12–14]. Patients with the luminal A subtype had the best 5-year DDFS, patients with HER2-positive and triple-negative cancer had the least favourable outcomes, while patients with luminal B cancer had an outcome intermediate between these subtypes (see Supplemental file 1E). These results agree well despite slight dissimilarities in the definition of luminal B and HER2-positive subtypes between the trials.

Taxane-containing adjuvant regimens are effective in the treatment of early breast cancer but are associated with side effects, and therefore, methods to optimize patient selection for regimens that contain a taxane are needed. The current finding that patients with luminal B cancer have longer DDFS and OS when treated with docetaxel plus FEC as compared with vinorelbine plus FEC is supported by observations made by Jacquemier et al. and Nitz et al., who found that chemotherapy containing docetaxel was associated with a significant reduction in the risk of relapse in the subset of patients with luminal B breast cancer in the PACS 01 trial [13] and WSG-AGO EC-Doc trial [12], respectively. Both of these trials compared docetaxel-containing regimens with standard anthracycline-containing treatments. In the BCIRG 001 trial that compared docetaxel, doxorubicin and cyclophosphamide (TAC) versus fluorouracil, doxorubicin and cyclophosphamide (FAC) in the treatment of operable node-positive breast cancer, only patients with ER-positive tumours with either high Ki-67 expression or HER2 overexpression had a statistically significant improvement in disease-free survival when treated with TAC [14]. However, unlike these studies, we did not find a survival benefit from the docetaxel-containing regimen in the subset of women with HER2-positive cancer. In FinHer, half of the patients with HER2-amplified cancer were randomly assigned to receive adjuvant trastuzumab, which may have masked the potential docetaxel benefit in this subtype and may have reduced the statistical power to detect the association.

The PAM50 gene expression array has also been evaluated in predicting the potential benefit of adding a taxane to anthracycline-based chemotherapy, but none of the PAM50-derived subtypes, including the luminal B subtype, were predictive of a taxane benefit in the GEICAM/9966 and the NCIC CTG MA.21 randomized phase III trials [15, 16]. Similarly, the EndoPredict gene expression assay did not predict taxane benefit in the GEICAM/9966 study population [17].

The limitations of the study include the retrospective nature of the study, although we determined tumour MKI67 mRNA without knowledge of the clinical data and planned the statistical analyses prospectively. We tested the methods within the context of a relatively large randomized trial but lacked a validation series, and some subgroup analyses have limited power. However, the PCR method used turned out to be reproducible across multiple testing sites for all four biomarkers including MKI67 mRNA (Laible et al., manuscript submitted for publication). The details of the IHC methods used for assaying Ki-67 in the local pathology laboratories were not captured during the FinHer trial, as Ki-67 was not a protocol-mandated assay, but most pathology laboratories in Finland assess Ki-67 from the tumour hot spot areas. The recommended cut-off for ER and PgR positivity is now 1 % and no longer 10 % as it was at the time when the FinHer trial accrued patients, but the proportion of breast cancers where ER or PgR are expressed in 1 % to 10 % of nuclei is small [18].

Conclusions

Measurements of cancer ESR1, PGR and ERBB2 mRNA correlated well with the results obtained with IHC and CISH in clinical pathology laboratories. Tumour MKI67 mRNA content quantitated with RT-qPCR is associated with DDFS and OS of patients treated with modern adjuvant regimens. The results suggest that assessment of tumour MKI67 mRNA content may be valuable for selection of patients for docetaxel-containing adjuvant therapy. Since the immunohistochemical assay results for Ki-67 expression are challenging to transfer between laboratories, and the assay for measuring cancer MKI67 mRNA content with RT-qPCR might be less challenging to standardize than IHC staining, studies that evaluate interlaboratory comparisons of cancer ESR1, PGR, ERBB2 and MKI67 mRNA content using RT-qPCR are warranted.