Introduction

DNA microarray analyses (e.g., tiling arrays, mRNA arrays, and direct sequencing of complementary DNA) have significantly advanced our understanding of breast cancer. They showed that breast cancer is not a single disease with variable morphologic features, but a group of molecular distinct neoplasms [1]. Furthermore, in certain clinical settings it can help to determine whether or not adjuvant chemotherapy is justified [2].

Several assays, resulting in risk scores, have been developed and are partially commercially and partially clinically available. The 21-gene Recurrence Score (OncotypeDx assay, Genome Health inc, Redwood city, CA) [3], the Amsterdam 70-gene profile, commercially known as the Mammaprint (Agendia, Huntington Beach, CA) assay [4] and the Risk of Recurrence (ROR) score, derived from Predictor Analysis of Microarray 50 (PAM50) [5], are mostly used. Reliable results of the assays require a good quality tumor sample with high cellularity. To illustrate this, Elloumi et al. [6] revealed a systematic bias when too much normal tissue was present in a tumor sample. However, tumor percentage is also dependent on tumor morphology. For example, a tumor with solid growth more easily reaches a high tumor percentage than a tumor with glandular or lobular growth (a feature important for grade). Furthermore, the presence of sclerosis as well as stromal and inflammatory cells can reduce the tumor cell percentage substantially. Another feature that influences the ability to perform gene expression analysis is the RNA quality. Pre-analytical factors such as time to fixation, fixation duration, and storage temperature have an impact on the RNA quality [7].

Summarizing the above, a high tumor percentage and good quality RNA are prerequisites for successful gene expression analysis. These requirements lead to the dropout of samples not fulfilling these criteria. As a higher histological grade is associated with a higher tumor cellularity, gene expression analysis might be more successful in high-grade tumors. To assess if indeed clinico-pathological variables are associated with successful gene expression analysis, we compared clinical and tumor characteristics of tumors suitable and unsuitable for gene expression analysis in two large (neo) adjuvant-treated patient cohorts.

Materials and methods

Patient selection

Tissue samples of patients treated in the neoadjuvant setting for breast cancer were collected at the Netherlands Cancer Institute (NKI) between 2004 and 2012. For participation in the neoadjuvant program, the tumor diameter should exceed 3 cm or axillary lymph node metastasis should be proven by fine-needle aspiration. Part of these patients participated in two ongoing clinical trials of which details have been described previously [8]. Both studies were approved by the ethical committee and informed consent for gene expression analysis was obtained for all included patients. At least two tumor biopsies were taken under ultrasound guidance, using a 14G core needle to assure sufficient tissue for both adequate diagnostics as well as for research purposes. To facilitate such analyses, at least one biopsy was snap frozen in liquid nitrogen and stored at − 80 degrees.

Pathology

Paraffin-embedded sections were all stained by a hematoxylin and eosin (H&E) stain and reviewed by a consultant breast pathologist (JW) for immunohistochemically assessment and histological classification (including subtype and grade) on biopsy material (of which the details are described previously) [9]. In brief, samples were scored as positive for oestrogen receptor (ER) and progesterone receptor (PR) if at least 10% of the tumor cells showed nuclear staining. HER2 (or ERBB2) was scored as positive when there was strong membranous staining in more than 30% of the tumor cells (3+) by immunohistochemistry or if chromosome in situ hybridization (CISH) revealed amplification. Percentage nuclear staining of tumor cells in Ki67 (MIB1) was scored as a marker for proliferation. Chemotherapy response was determined by pathological examination of resection specimens. Pathological complete remission (pCR) was defined as the absence of invasive tumor in both the breast and axillary lymph nodes after neoadjuvant chemotherapy.

Imaging data

For a subset of the patients, detailed imaging data were available. A dedicated breast radiologist (CL) assessed according to BIRADS lexicon [10] whether pre-treatment MRIs showed the tumor to be either mass-like, or non-mass like. For analysis purposes, these two categories were used. Metabolic activity was assessed using baseline 18F-fluorodeoxyglucose (FDG) positron emission tomography combined with computed tomography (PET/CT) scans. FDG uptake was quantified using maximum standardized uptake values (SUVmax) measured with Osirix DICOM viewer (Pixmeo SARL, Geneva, Swiss).

Gene expression assay and tumor percentage

mRNA was isolated from the frozen material as described previously [9]. Briefly, a 5-micrometer section of the biopsy was H&E stained. A pathologist evaluated if the overall tissue quality of the frozen biopsy was sufficient for further analysis (i.e., samples dropped out if the biopsy was too small, too fatty or in the absence of invasive tumor). The pathologist also estimated the tumor percentage and only the samples with a tumor percentage ≥ 50% were selected for microarray analysis. Gene expression analysis was only performed if the RNA integrity number ≥ 6.5 and the quantity ≥ 3 µg. Samples obtained between 2004 and 2010 were analyzed using Illumina microarray analysis (WG6 v3 microarray chips); RNAseq was performed on the samples from 2011 to 2012.

Validation cohort

As a validation, an independent cohort obtained from the microarRAy PrognoSTics in Breast cancER (RASTER) study was used. Study design is described before [11]. In short, 812 women were enrolled in 16 hospitals in The Netherlands after having given informed consent. Patients received surgery (mastectomy or breast conserving surgery) as primary treatment. Within 1 h after surgery, a tumor sample was procured at the pathology department of the participating hospitals and sent to Netherlands Cancer institute by mail. After samples were received at the Netherlands Cancer Institute they were snap frozen at − 70 degrees. Pathologists analyzed paraffin-embedded tumor samples of the validation dataset at the pathology department of the participating hospitals. Histological tumor grade according Elston and Ellis, ER status, PR status, and ERBB2 status were established by each hospital according to locally used methods [11]. Frozen sections of the tumor samples of the validation set were obtained and stained with H&E stain, and subsequently analyzed by an experienced breast pathologist. Eligible samples had to contain ≥ 50% tumor cells. Agendia Laboratories performed the microarray analysis using the Mammaprint (Agilent microarray, Santa Clara USA) [11].

Data analyses and statistics

The variables age, histologic subtype of tumor, grade, T-stage, N-stage, and response (pathological complete remission (pCR) of breast and axilla) were compared between samples suitable or unsuitable for gene expression analysis. The χ 2 (Spearman) was used to compare dichotomized variables. We also assessed differences in clinical characteristics for each exclusion criteria as described above, i.e., tissue quality of the frozen biopsy, tumor cell percentage, and RNA quantity as well as quality. Multivariate logistic regression analysis was performed to assess the independent association of various clinical variables with the ability to perform GE analysis. Recurrence-free survival was assessed with Kaplan–Meier plots and the log rank test. A cox proportional hazard model was built to assess if the ability to perform GE analysis was independently associated with recurrence-free survival. The SPSS Package 23.0 was used for statistical analyses and p values (two-sided) < 0.05 were considered statistically significant. This study was designed according to the Reporting recommendations for tumor MARKER prognostic studies (REMARK) guidelines [12].

For the validation dataset, the variables age, histologic type of tumor, subtype, histological grade, and T-stage were compared between tumor samples suitable or ineligible for gene expression analysis. Statistical analysis was performed as described above.

Results

Patient selection in cohort

A total of 738 breast cancer patients were treated in the NKI with neoadjuvant chemotherapy between 2004 and 2012. From 665 patients, a frozen biopsy was available. Seventy-seven tissue samples were not processed because the biopsies were too small or too fatty, or did not contain invasive tumor. Of the remaining 587 tissue specimens, 461 had a tumor percentage of more than 50%; 391 of these samples met the criteria for sufficient RNA quality and quantity, allowing gene expression analysis (53% of the total cohort). Figure 1 shows the sample selection flow diagram.

Fig. 1
figure 1

Flow diagram of included patients. Flow diagram of included patients and in- and exclusion steps of the fresh frozen biopsies for gene expression studies. TP denotes tumor percentage, DCIS ductal carcinoma in situ, RIN RNA integrity number

Association with clinical characteristics

Comparisons of baseline characteristics between the tumors for which a gene expression (GE) profile could be generated (GE+ ; n = 391) and the tumors for which this was not possible (GE− ; n = 347) are shown in Table 1. GE + tumors were more often high grade and had a higher SUV max value than GE− tumors. When we stratify for subtype, the effect of grade is still significant in the ER+HER2− and in the triple-negative subgroup (see Supplementary, Table 1). In the ER+HER2− subgroup, these samples also have a higher SUV max value. Multivariate analysis shows that a high tumor grade and positive lymph node status are independently associated with GE + tumors (Table 2).

Table 1 Characteristics of patients with gene expression profiles versus patients without gene expression profiles
Table 2 Multivariate analyses of patient characteristics with gene expression profiles versus patients without gene expression profiles

To investigate the influence of the various steps of sample selection for gene expression analysis in more detail, we compared the clinical variables between in- and excluded samples after each selection step (Table 3). Interestingly, in every selection step, we enrich for high-grade tumors: a high tumor grade is associated with larger biopsies, higher tumor percentage and high quality and quantity RNA. In addition, high quality and sufficient quantity of RNA is more often found in HER2+ tumors and node-positive tumors.

Association with treatment response and survival

Chemotherapy response was not significantly different between GE + and GE − tumors. However, we observed that tumors with high tissue quality of frozen biopsies more often achieved a pathological compete response (pCR) after treatment than samples with poor quality biopsies (p = 0.02; Table 3). We did not observe a significant difference in recurrence-free survival between samples included and excluded in gene expression analysis, although a trend was visible in triple-negative patients (Fig. 2). A cox proportional hazard model did not indicate GE+ as a variable associated with survival (Supplementary, Table 2).

Table 3 Characteristics of patients excluded versus remaining group. Due to rounding, some percentages do not count up to 100%
Fig. 2
figure 2

Kaplan–Meier curves of recurrence-free survival. Kaplan–Meier curves of recurrence-free survival to compare patients with and without gene expression profiles, stratified by subtype. GE denoted gene expression

Results in the validation dataset

The RASTER data were used to validate our observations. This set includes 812 breast cancer patients enrolled between 2004 and 2006 (see Methods). Because node positive patients were excluded in this study for clinical reasons and therefore a gene expression profile was not performed, we analyzed the samples of node-negative patients (n = 585, see Supplementary Fig. 1, for a flowchart). Therefore, we could not validate our association with nodal status in this set; however, most other clinical variables were available. Of these samples, 27% dropped out because of incorrect procedure or sample failure. Similar to the observations in the neoadjuvant cohort, gene expression profiling was more often possible in high-grade tumors (borderline significant; p = 0.05, Table 4).

Table 4 Patient characteristics in the validation dataset, split for gene expression status

Discussion

In this study, we showed that patients of whom gene expression data were obtained had more often high-grade tumors and lymph node metastasis, features associated with a worse prognosis. It is important to acknowledge that, due to the selection of tumors with good quality samples and a high cellularity for gene expression studies, we select for a certain subgroup of tumors. Most likely, this selection bias is present in the majority of published gene expression studies for breast cancer.

Our findings are indeed in line with published literature. Cremoux et al. [13] studied (pre-) analytical steps in tissue handling by comparing different institutes. As in our study, they found that tumors suitable for gene expression profiling were more often high grade and of ductal subtype. Mook et al. observed a 17% dropout and concluded that the rejected samples were obtained from slightly smaller tumors [14]. Also Goetz et al. observed that women without expression data had more often a small tumor [15]. Together with the results of our validation dataset, there is substantial evidence towards the selection of larger and more aggressive tumors for gene expression studies.

In a systematic literature search that we performed prior to this study, finally resulting in 110 articles, 39% of the studies were indistinct about exclusion criteria and associated dropout rates, indicating unawareness. The remaining 61% did mention about exclusion criteria and numbers. These studies show a variety of dropout rates (1–83%, average 21%). Most gene expression profiles had been developed on frozen sample collections available at the biobank of the respective institute. These samples are a selection of relatively large and easily accessible tumors. Also, only samples that met the strict criteria of tumor RNA quality and quantity were used. Nowadays FFPE material from all laboratories (both biopsies and resection material), with very different protocols, can be measured by commercial available platforms, such as Mammaprint and OncotypeDx, and tumor percentage can be as low as 30% [16]. This is possible due to advances in the technique. However, we should be aware that such assays were originally not developed on samples with comparable characteristics, and validation on small samples with lower tumor cell percentage is warranted. In addition, also for research purposes, it is important to acknowledge that due to analytical requirements high-grade tumors might be overrepresented in GE datasets.

This study has some limitations. First, fresh frozen tumor samples were used. In general, it is more difficult to obtain fresh frozen material than FFPE tumor material, resulting in a higher dropout rate. Second, this study was done on pre-treatment biopsies, which yield smaller quantities of tumor material than resection specimens. Third, this study was performed in the neoadjuvant setting, which results in the selection of locally advanced tumors. Consequently, we could not look at stage I or stage IV tumors. Strong points of our study are that samples were obtained from a consecutive cohort of neoadjuvant treated patients, and not on a highly selected clinical trial population. Furthermore, we had an independent cohort for validation purposes that corroborated our findings. Although there were some differences in the way the samples were collected (resections versus biopsies, mailed transport versus snap frozen), the validation cohort consisted of early stage breast cancer samples and had information on grade, enabling us to validate our main finding in an independent cohort. Finally, all samples were from one institute and handled by one dedicated technician to preclude variability in centre or in lab handling.

In conclusion, we showed that breast cancers for which gene expression data were successfully obtained were associated with a higher grade and with lymph node metastasis, due to the selection of samples with a high tumor percentage and good quality RNA. These tumors have, on average, a more aggressive phenotype and a relatively poor prognosis. In general, when interpreting test results, it is important to realize that patient populations for which GE profiles are used, often differ substantially from the ones in which they were originally developed, particularly when using a development cohort consisting of frozen tumor tissue and a test cohort consisting of FFPE samples. At this point, it is uncertain what the impact might be on treatment decisions in the clinic.