Background

Histological subtype information in breast cancer is clinically relevant for treatment purposes but insufficient to describe all tumor heterogeneity [1]. The basal-like molecular subtype, which typically expresses cytokeratins 5/6 and EGFR, frequently overlaps with the histological subtype of triple-negative (TN) breast cancer [2, 3]. TN breast cancer is described by the absence of ERα/PR and HER2 expression and poor overall prognosis [4, 5]. Because of the lack of available targeted therapies for this subtype, the clinical impact of target discovery for patients with TN breast cancer is potentially significant.

Hereditary germline BRCA1 mutations are found in around 12% of all TN breast cancers [6,7,8]. BRCA1 plays a critical role in error-free DNA double-strand break repair via homologous recombination, and deficiency can result in genomic instability [9, 10]. Differential gene expression patterns in BRCA1 mutant tumors versus nonmutant tumors have been identified previously [11,12,13,14]. Because of the relative rarity of BRCA1 mutation in the general breast cancer population [15], however, these studies are often underpowered, making clinical impact for mutation carriers limited. Furthermore, the capacity of these signatures to predict response to targeted treatments such as PARP inhibitors has not been thoroughly explored in the randomized clinical trial setting.

BRCA1-mutated/promoter-methylated TN tumors with a specific pattern of copy number alterations are termed BRCA1-like [16,17,18,19,20]. ‘BRCAness’ describes tumors with molecular features of BRCA1-mutated tumors [21, 22]. Interestingly, the whole group of BRCA1-like tumors responds well to DNA double-strand break-inducing agents and intensified chemotherapy regardless of their BRCA1 mutation/promoter methylation status [16, 23, 24]. These findings suggest that a relatively large portion of TN breast cancers may be susceptible to targeted therapies such as PARP inhibitors. The efforts of many groups have resulted in various classifiers for BRCAness, typically based on mutation [13, 14, 25] or homologous recombination repair deficiency (HRD) markers [26] and using gene expression data as an input. Recent work has found that an assay designed to detect BRCAness using HRD as a biomarker failed to predict carboplatin response [27], illustrating the challenges of generating a signature with the capacity to predict treatment effect [28].

Molecular subgroups within the TN subtype have differential benefit from therapies [29,30,31]. In addition, previous work in TN tumors has determined that differentially expressed genes between BRCA1-like and non-BRCA1-like tumors center around DNA repair [29, 32, 33] and may lead to new information for clinical therapeutic decisions. A test based on gene expression levels may also lead to insight into the mechanisms which result in tumors with BRCA1-like features. We developed a 77-gene signature to identify samples with a BRCA1-like gene expression pattern we term BRCA1ness with a sensitivity and specificity of 96.7% and 73.1%, respectively. We explored this signature’s ability to predict response to the PARP inhibitor veliparib in combination with carboplatin (V-C) in the I-SPY 2 TRIAL, a phase 2, multicenter, adaptively randomized trial designed to screen multiple experimental regimens in combination with standard neoadjuvant chemotherapy for breast cancer, where V-C graduated in the TN signature [34,35,36]. Investigation of the BRCA1ness signature was part of a further evaluation of additional biomarkers in this setting. In this study, we aimed to answer the clinical question in the I-SPY 2 external validation set of whether to treat with PARP inhibition based on the well-studied mechanism of HRD (identified by our biomarker BRCA1ness).

Methods

Discovery set patient characterization and microarray data generation

The collaborative European Union-funded effort FP7 RATHER project (Rational Therapy for Breast Cancer) aims to integrate gene expression profiling, copy number variation, kinome variation and kinase activation status in an effort to identify new targets for therapy of difficult-to-treat breast cancer subtypes, including TN breast cancer (www.ratherproject.com). The RATHER project retrospectively identified 128 TN breast cancer patients with long-term follow-up in total: 70 from Netherlands Cancer Institute (NKI), Amsterdam, the Netherlands and 58 from Addenbrooke’s Hospital, Cambridge, UK.

The primary inclusion criteria for the RATHER cohort were availability of sufficient frozen tumor tissue for RNA isolation in the tissue bank and diagnostic information indicative of TN breast cancer. We enriched for frozen tumors with 30% or greater tumor content (2 × 8-μm serial sections, hematoxylin and eosin stained). The local medical ethics authorities of both centers approved the collection protocols. Sectioning of tumor tissue and RNA isolation were performed as described previously [37].

Samples with an RNA integrity number (RIN) > 5 according to 2100 Bioanalyzer (Agilent Technologies) assessment were selected for further analysis. Gene expression data were generated as described previously [37]. Briefly, feature signal intensities were extracted and processed using the ‘limma’ R package, with background subtraction using an offset of 10, and the data were log2 transformed. Probe intensities were quantile normalized with in-house R scripts, and missing values (including probes with signal intensities < 1 after preprocessing) were imputed by the 10 nearest-neighbor method. A biobank batch effect was adjusted for using ComBat [38]. Genes with multiple probes were summarized by the first principal component of a correlating subset.
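The quantile normalization step can be illustrated with a short example. The following is a minimal Python sketch and not the in-house R scripts used in the study; the function name quantile_normalize and the toy intensities are hypothetical, and ties are broken by input order as a simplifying assumption.

```python
def quantile_normalize(arrays):
    """Quantile-normalize a list of arrays (one list of probe values per sample).

    Every sample is forced onto the same reference distribution, defined as
    the mean of the sorted values across samples at each rank.
    Assumption: all arrays have equal length and no missing values.
    """
    n_probes = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean across samples at each rank.
    reference = [
        sum(s[rank] for s in sorted_arrays) / len(arrays)
        for rank in range(n_probes)
    ]
    normalized = []
    for a in arrays:
        # Probe indices ordered from lowest to highest intensity.
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            out[probe_index] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical log2 intensities for two samples and three probes.
toy = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized_toy = quantile_normalize(toy)
```

After normalization both toy samples contain the same set of values (1.5, 3.5 and 5.5), assigned according to each probe's rank within its own sample.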

BRCA1-like classification

The multiplex ligation-dependent probe amplification (MLPA) method was used to generate copy number profiles for the determination of the BRCA1-like status of the tumors. The assay was performed, fragments analyzed and data normalized according to the manufacturer’s protocol (MRC-Holland). Class prediction (BRCA1-like/non-BRCA1-like) was carried out on the normalized data according to published instructions [16].

Gene signature development

Signature development was performed using Partek Genomics Suite (partek.com) (categorical signature) and Matlab (https://mathworks.com/) (translation to continuous score) on 128 samples. Top variable genes (variance >1 across all samples) were used for the model input (N = 2049). The classification model of diagonal linear discriminant analysis (DLDA) with equal prior probabilities was run to select the signature genes. Groups of genes ranked by their significance in a univariate ANOVA examining the BRCA1-like/non-BRCA1-like MLPA status were tested (groups from 1 to 100, in increments of 1). Single-level leave-one-out cross validation (LOOCV) with the maximum number of partitions was used to internally validate and calculate the performance of the model.
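The combination of DLDA with leave-one-out cross validation can be sketched as follows. This is an illustrative Python example under stated assumptions, not the Partek Genomics Suite implementation: the function names and toy data are hypothetical, equal priors are assumed as in the text, and the ANOVA-based gene ranking step is omitted for brevity.

```python
from statistics import mean


def fit_dlda(X, y):
    """Fit DLDA: per-class gene means and a pooled per-gene variance.

    X is a list of samples (each a list of gene values), y the class labels.
    """
    classes = sorted(set(y))
    n_genes = len(X[0])
    means = {
        k: [mean(row[j] for row, lab in zip(X, y) if lab == k)
            for j in range(n_genes)]
        for k in classes
    }
    dof = len(X) - len(classes)
    # Pooled within-class variance (diagonal covariance, shared by classes).
    pooled = [
        sum((row[j] - means[lab][j]) ** 2 for row, lab in zip(X, y)) / dof
        for j in range(n_genes)
    ]
    return means, pooled


def predict_dlda(model, x):
    """Assign x to the class with the smallest variance-scaled distance.

    With equal priors this is equivalent to the highest posterior probability.
    """
    means, pooled = model

    def distance(k):
        return sum((xj - mj) ** 2 / vj
                   for xj, mj, vj in zip(x, means[k], pooled))

    return min(means, key=distance)


def loocv_accuracy(X, y):
    """Leave-one-out cross validation: each sample is held out once."""
    hits = 0
    for i in range(len(X)):
        model = fit_dlda(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        hits += predict_dlda(model, X[i]) == y[i]
    return hits / len(X)


# Hypothetical expression values for two genes in six samples.
X_toy = [[2.0, 0.1], [2.3, -0.2], [1.8, 0.3],
         [-2.1, 0.0], [-1.9, 0.2], [-2.2, -0.1]]
y_toy = ["BRCA1-like"] * 3 + ["non-BRCA1-like"] * 3
accuracy = loocv_accuracy(X_toy, y_toy)
```

In a full analysis, any gene ranking would normally be repeated inside each cross-validation partition to avoid optimistic performance estimates; whether the original software does so is not stated here.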

The number of genes in the final model (n = 77, Additional file 1: Table S1) was selected based on the ROC area under the curve (AUC as specified by Partek). After the model is run, each sample is allocated a posterior probability for each class (BRCA1-like and non-BRCA1-like) and the sample is assigned to the class with the highest posterior probability (BRCA1ness). This categorical signature was then transferred to the diagnostic setting to better comply with quality and regulatory requirements using a nearest centroid model, a robust method with favorable characteristics when many measurements are made on a modest number of patients [39]. Briefly, raw full genome data were normalized and class centroids were calculated (median per gene) for each of the 77 genes for each class (BRCA1ness/non-BRCA1ness) using the discovery set. These calculated centroids were used as the templates for BRCA1ness/non-BRCA1ness. Pearson correlations of each new sample with the BRCA1ness/non-BRCA1ness templates were calculated (Additional file 1: Table S1) and combined into a single continuous score by subtracting the correlation to the non-BRCA1ness template from the correlation to the BRCA1ness template. To classify samples, a threshold was chosen that gave a high sensitivity while preserving a specificity close to 0.75. The classification threshold was set at –0.3; that is, a sample with a BRCA1ness score > –0.3 was classified as BRCA1ness and a sample with a score < –0.3 was classified as non-BRCA1ness.
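The continuous score described above can be illustrated with a small example. This is a minimal Python sketch using toy data for four hypothetical genes rather than the 77 signature genes; the function names are assumptions, and a score exactly equal to the threshold is treated here as non-BRCA1ness, a case the text does not specify.

```python
from math import sqrt
from statistics import mean, median

THRESHOLD = -0.3  # classification threshold stated in the text


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def class_centroid(samples):
    """Per-gene median over the samples of one class (rows are samples)."""
    return [median(gene_values) for gene_values in zip(*samples)]


def brca1ness_score(sample, pos_centroid, neg_centroid):
    """Correlation to the BRCA1ness template minus correlation to the other."""
    return pearson(sample, pos_centroid) - pearson(sample, neg_centroid)


def classify(score, threshold=THRESHOLD):
    # Assumption: a score exactly at the threshold is called non-BRCA1ness.
    return "BRCA1ness" if score > threshold else "non-BRCA1ness"


# Hypothetical discovery-set profiles (rows are samples, columns are genes).
pos_samples = [[2.0, 1.0, -1.0, -2.0], [2.2, 0.8, -1.2, -1.8], [1.8, 1.2, -0.8, -2.2]]
neg_samples = [[-2.0, -1.0, 1.0, 2.0], [-1.8, -1.2, 0.8, 2.2], [-2.2, -0.8, 1.2, 1.8]]
pos_centroid = class_centroid(pos_samples)
neg_centroid = class_centroid(neg_samples)

new_sample = [1.5, 0.5, -0.5, -1.5]
score = brca1ness_score(new_sample, pos_centroid, neg_centroid)
label = classify(score)
```

Because the score is a difference of two correlations, it ranges from –2 to 2; the toy sample correlates strongly with the BRCA1ness template and therefore scores well above the threshold.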

I-SPY 2 TRIAL

The I-SPY 2 TRIAL is a standing multicenter, phase 2 platform trial to screen experimental regimens in combination with standard chemotherapy in the neoadjuvant treatment of breast cancer. Patients are adaptively randomized into one of four experimental arms or a control arm (Fig. 1) [35]. In this portion of the I-SPY 2 TRIAL, eligible patients received weekly paclitaxel at 80 mg/m² (T) i.v. for 12 doses alone (control), or in combination with an experimental regimen. Patients randomized to V-C received 50 mg of veliparib by mouth twice daily for 12 weeks and carboplatin at AUC 6 on weeks 1, 4, 7 and 10 concurrent with weekly paclitaxel. Following paclitaxel ± V-C, all patients received doxorubicin 60 mg/m² and cyclophosphamide 600 mg/m² (AC) i.v. every 2–3 weeks for four doses with myeloid growth factor support as appropriate per protocol, followed by surgery that included axillary node sampling. The V-C arm was open to HER2-negative patients and graduated in the TN group. The BRCA1ness signature was one of the qualifying dichotomous biomarkers assessed as a predictor of response to V-C relative to standard chemotherapy.

Fig. 1

CONSORT diagram. CONSORT diagram indicating how patients were randomized for the I-SPY 2 TRIAL

To assess the BRCA1ness signature in this validation set as a specific biomarker of V-C response, gene expression data from 116 HER2-negative patients (V-C, n = 72 and concurrent controls, n = 44) were analyzed. A customized Agilent 44K array (Agendia) was used to evaluate the 77-gene signature BRCA1ness classification. The association between BRCA1ness classification and response within the V-C arm and within the control arm was assessed using Fisher’s exact test, and the relative performance between arms (biomarker × treatment interaction, likelihood ratio test) was assessed using a logistic model. We included adjustment for hormone receptor status (HR/TN) and tumor size in our model. Our sample size is small, and thus statistical calculations (p values) are descriptive rather than inferential. This analysis does not adjust for multiplicities of other biomarkers evaluated in the trial but outside this study.
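The within-arm association test can be illustrated as follows. This is a minimal Python sketch of a two-sided Fisher's exact test and the sample odds ratio for a 2 × 2 table of biomarker status by response; the function names are assumptions and the counts are hypothetical, not trial data. The biomarker × treatment interaction test from the logistic model is not covered by this sketch.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    probs = [prob(x) for x in range(lowest, highest + 1)]
    observed = prob(a)
    # Small relative tolerance so that exact ties are counted despite rounding.
    return sum(p for p in probs if p <= observed * (1 + 1e-9))


def odds_ratio(a, b, c, d):
    """Sample odds ratio (a*d)/(b*c); undefined if b or c is zero."""
    return (a * d) / (b * c)


# Hypothetical counts: rows are biomarker-positive/negative,
# columns are response/no response.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
toy_or = odds_ratio(3, 1, 1, 3)
```

For the toy table the odds ratio is 9 and the two-sided p value is 34/70, which shows how a large odds ratio can remain far from significance in a small sample.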

Results

Signature development

We developed a BRCA1ness signature using whole genome gene expression data. The signature has been developed on fresh frozen (FF) breast tumors that were categorized as either BRCA1-like or non-BRCA1-like using a DNA copy number MLPA-based classifier [16], and endeavors to predict BRCA1-like tumors with a high sensitivity/specificity rate.

Forty-eight percent of the tumors (61/128) in the discovery cohort were classified as BRCA1-like and the remainder were assigned to the non-BRCA1-like class. We used the gene expression profiles of the tumors, DLDA and the labels assigned by the MLPA-based classifier to train a classifier that distinguishes between the two classes. Using the area under the ROC curve (AUC) as the performance criterion, we identified a 77-gene signature that resulted in the highest performance (Table 1).

Table 1 Sensitivity and specificity for detecting BRCA1-like status samples using BRCA1ness

Unsupervised hierarchical clustering of the genes in the 77-gene signature indicates separation between the classes (Fig. 2). We transferred the signature to a diagnostic setting using a nearest centroid-based algorithm. The sensitivity and specificity for detecting BRCA1-like status as defined by MLPA were 96.7% and 73.1%, respectively. Using Ingenuity Pathway Analysis (Qiagen) to identify key biological processes associated with the genes in the 77-gene BRCA1ness signature, we found cellular assembly and control and DNA replication, recombination and repair to be among the top associated pathways and functions (Fig. 3a). In addition, we observed serine and glycine biosynthesis to be associated with the signature genes, indicating that these genes may be responsible for reprogramming of metabolic processes, which can lead to tumor progression (Fig. 3a). Supporting the pathway and molecular function results, network analysis revealed a network centered upon the cell cycle control regulator cyclin A (Fig. 3b).
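The reported sensitivity and specificity can be related to confusion-matrix counts. The following Python sketch assumes the 61 BRCA1-like and 67 non-BRCA1-like discovery samples described above; the individual counts (59 true positives and 49 true negatives) are inferred so as to reproduce the reported percentages and are not taken from Table 1.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of MLPA BRCA1-like samples called BRCA1ness."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of MLPA non-BRCA1-like samples called non-BRCA1ness."""
    return true_neg / (true_neg + false_pos)


# Assumed counts, chosen to be consistent with the reported percentages.
TRUE_POS, FALSE_NEG = 59, 2    # 61 BRCA1-like samples
TRUE_NEG, FALSE_POS = 49, 18   # 67 non-BRCA1-like samples

sens_pct = round(100 * sensitivity(TRUE_POS, FALSE_NEG), 1)
spec_pct = round(100 * specificity(TRUE_NEG, FALSE_POS), 1)
```

Under these assumed counts, 59/61 and 49/67 round to 96.7% and 73.1%, matching the values reported above.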

Fig. 2

Unsupervised hierarchical clustering of 77 genes in the 128 discovery set samples. The 77 genes were derived from a supervised analysis to identify those genes most informative in distinguishing BRCA1-like from non-BRCA1-like TN breast cancers [33]. Scaled expression value denoted as Z score (red–blue scale: red indicates high expression and blue indicates low expression). Information bar indicates MLPA BRCA1-like status: true (green) or false (brown). MLPA multiplex ligation-dependent probe amplification (Color figure online)

Fig. 3

The 77-gene signature network analysis. a Significant canonical pathways (top) and molecular functions (bottom). Negative log p value is on the x axis. b Network analysis of the 77 genes in the BRCA1ness signature. Grey shading indicates genes found in signature, solid lines show direct relationships between proteins and dashed lines show indirect relationships

I-SPY 2 TRIAL

The BRCA1ness signature was applied to 116 HER2-negative patients (V-C, n = 72 and concurrent controls, n = 44). Fifty-five patients were classified as BRCA1ness. Fourteen percent of these patients were hormone receptor-positive (ERα/PR) and HER2-negative. The distribution of pathological complete response (pCR) rates among BRCA1ness and non-BRCA1ness groups is shown in Fig. 4a [36] and Table 2. Association between the BRCA1ness classification and patient response was seen in the V-C arm (OR = 3.2, p = 0.03) but not in the control arm (OR = 0.39, p = 0.45) (Fig. 4b). A significant biomarker × treatment interaction (p = 0.025) was also observed. Although there is enrichment for TN samples in the BRCA1ness group in univariate analysis (Table 2), this interaction remains significant upon adjusting for HR (p = 0.023) (Fig. 4c).

Fig. 4

I-SPY 2 TRIAL. a Mosaic plot depicting the number of patients with pathological complete response (pCR) in each treatment group and signature group. Top row indicates patients in the trial enrolled in the control arm and bottom row indicates patients in the V-C arm. Number of patients with pCR is shown in green and number of patients without pCR is shown in tan. Black outlined boxes indicate the patients with a non-BRCA1ness status (left), red outlined boxes depict those with BRCA1ness status (right). b Histological subtype of the patients in the trial divided by treatment arm (V-C) and control arm and pCR rate per group. c Odds ratio (OR) and likelihood ratio test (LR) for treatment and control arms of the trial and the biomarker × treatment interaction test. HER2 human epidermal growth factor receptor 2, HR hormone receptor status, TN triple-negative, V-C veliparib-carboplatin (Color figure online)

Table 2 Patient characteristics

In addition, we found that the interaction also remains significant when adjusting for tumor size and HR (p = 0.038). We used the likelihood ratio test to formally demonstrate that the logistic regression with the addition of the HR (and tumor size) terms does not provide a better fit to the data. When the hormone receptor-positive BRCA1ness classified patients are added to the graduating TN subset, the OR associated with V-C is 4.03. This is comparable to that of the TN subset alone (OR = 4.04), while increasing the prevalence of ‘biomarker-positive’ predicted V-C sensitive patients by 8%.

Discussion

At around 15% of all breast cancers, the TN breast cancer subtype impacts a significant proportion of women [7, 40]. TN breast cancer tends to be aggressive independent of other known prognostic factors [5, 41]. Current guidelines indicate that standard therapy for TN breast cancer is chemotherapy [2]. Unfortunately, these tumors typically metastasize early despite therapy [5, 41]. This poor response to treatment may be due to the fact that the TN subtype itself is made up of molecular subgroups. Conversely, molecular data from these subgroups may indicate a targeted therapy that is likely to benefit patients.

Because of results in preclinical models, BRCA1 mutation carriers of multiple tumor types have been enrolled in clinical trials with PARP inhibitors [27, 32, 42,43,44]. Currently, there are no predictive biomarkers for PARP inhibitors other than germline BRCA1 mutation status, and that biomarker captures only a small subgroup of all patients that may benefit from carboplatin/veliparib [8]. Previous work has shown that genomic instability patterns are related to BRCA1 mutation/methylation and that these patterns can be used to classify tumors as BRCA1-like or non-BRCA1-like [16,17,18,19,20, 24]. These BRCA1-like tumors make up a larger group than BRCA1 mutant or methylated tumors alone, and importantly they respond well to DNA double-strand break-inducing chemotherapies [17, 23, 24]. We have developed a gene expression signature that is capable of identifying BRCA1-like samples with a high sensitivity/specificity rate (BRCA1ness).

Pathway analysis reveals that the genes in this signature are associated closely with cell cycle and cancer networks. We also observed a significant association between the genes of the signature and serine and glycine biosynthesis pathways. This is of particular interest because it has been recently shown that aerobic glycolysis signaling can promote tumor growth in TN breast cancer cell lines [45], indicating that the poor outcome for patients with a BRCA1ness tumor may be partially explained in this manner. Serine biosynthesis has been identified as essential to tumorigenesis in estrogen receptor-negative breast cancer cell lines [46]. We identified serine biosynthesis to be related specifically to the genes found in the BRCA1ness gene expression signature, suggesting that tumors having BRCA1-like features may have particular vulnerabilities to drugs that interfere with serine biosynthesis. It would be interesting to test whether high expression of genes involved in serine and glycine biosynthesis can confer sensitivity to drugs that interfere with this biosynthesis in breast cancer cell lines. In addition, we found that the signature is capable of predicting response to the PARP inhibitor veliparib in combination with carboplatin compared with a control treatment regimen.

Conclusion

The sample size in the I-SPY 2 TRIAL is small, but our prespecified analysis suggests that the BRCA1ness signature shows promise for predicting response to V-C combination therapy relative to control. We focused on the experimental arm of the study that contains DNA-damaging agents because the BRCA1ness test is meant to identify patients who may derive substantial benefit from these agents. We observed a proportion of patients who were hormone receptor-positive and benefited from the V-C treatment. It is unlikely in a regular clinical setting that hormone receptor-positive patients would be tested for BRCA1ness, but our data indicate that these patients could derive benefit from specific tailored treatments like PARP inhibitors and/or platinum agents. Concurrently reported results studying carboplatin in TN breast cancer have indicated it may be difficult to translate the pCR rate to longer-term benefit such as recurrence-free survival (RFS) [47,48,49,50]. It should be noted that, for this trial, we used a surrogate endpoint (pCR) for RFS and longer follow-up is required to investigate the BRCA1ness classifier in relation to long-term benefit. In the event that downsizing of the tumor is required to facilitate conversion from mastectomy to breast-conserving therapy, this classifier may already have value. If verified in a larger trial, this signature may contribute to the selection criteria of PARP inhibitor trials in the future.