1 Introduction

Breast cancer is the most common malignancy diagnosed during pregnancy, and its incidence is rising, notably due to the present-day trend of delayed childbearing, the increase in young-onset breast cancer and the introduction of non-invasive prenatal testing (NIPT) in obstetrical care (resulting in the incidental identification of several asymptomatic pregnant patients in developed countries) [1,2,3,4].

Pregnancy-associated breast cancer (PABC), generally defined as breast cancer diagnosed at any time during gestation, lactation or within one year after delivery, represents a heterogeneous disease with fundamental histological and clinical variation between patients. Every year, one in 3,000 to 10,000 pregnant women is diagnosed with breast cancer, representing only 0.2 – 3.8% of overall breast cancer cases [2, 5,6,7].

The molecular nature of PABC remains an underexplored field, and considerable controversy exists regarding the influence of pregnancy on breast cancer prognosis [8,9,10,11,12,13,14,15,16]. PABC is generally believed to exhibit particularly aggressive behavior and its poor outcome is largely attributed to unfavorable tumor characteristics: advanced tumor (T) stage at diagnosis, lymph node involvement, high histologic grade, negative estrogen receptor (ER) and progesterone receptor (PR) status, and human epidermal growth factor receptor-2 (HER-2) amplification and overexpression [13, 14, 17, 18]. To date, little progress has been made in unraveling the molecular mechanisms of the aggressive pathological characteristics of PABC tumors. A deeper understanding of the molecular makeup of PABC may not only help explain its aggressive biological attributes, but may also provide individualized biomarkers and potential targets for new cancer therapies.

Somatic copy number alterations (CNAs) are part of the molecular makeup of breast cancer [19]. Multiple studies have reported an association between CNAs and specific tumor characteristics such as histologic grade, risk of recurrence, and metastasis [20,21,22,23,24,25,26]. CNAs have been reported to be of independent prognostic value, even after adjustment for stage, histologic grade, TP53 status, histologic subtype and total aneuploidy [20].

In our previous large population based study, triple negative breast cancer (TNBC: ER negative, PR negative, absence of HER-2 overexpression) was the most frequently observed subtype in PABC compared to age-matched non-PABC tumors [27], in line with other case control studies [14, 18, 28, 29].

To assess whether this frequently observed subtype in PABC bears a unique molecular signature, we compared the genomic background of triple negative PABC and control non-pregnant breast cancer patients by detection of specific DNA copy number alterations. Associations between individual gene CNAs, clinicopathological characteristics and survival were explored.

2 Materials and methods

2.1 Patient selection

Using our Dutch nationwide population-based ‘PABC cohort’ of women ≤ 45 years of age (n = 744), with a first diagnosis of invasive breast cancer (BC) during a first pregnancy or within six months after delivery [27], we extracted PABC patients with a triple negative receptor status (n = 283). For these patients, breast tumor specimens were requested from Dutch laboratories via the Dutch nationwide network and registry of histo- and cytopathology (PALGA) [30]. Only patients with full relevant clinical information about their outcome and available formalin-fixed paraffin-embedded material of their pregnancy-associated breast tumor could be included in this molecular analysis (n = 31). As controls, triple negative and poorly differentiated tumors of 23 randomly selected premenopausal non-PABC patients (defined as first diagnosis of invasive BC without any sign of pregnancy in the patient history) were identified from the archives of the Department of Pathology at the University Medical Center Utrecht, The Netherlands.

All data from the PALGA database are pseudonymized by a trusted third party (ZorgTTP, Houten, The Netherlands). Consent was given by all Dutch laboratories for the storage of their data by PALGA, and for scientific use of these data. Use of anonymous or coded ‘left over’ material for scientific purposes does not require informed consent according to our institutional medical ethical review board and according to Dutch legislation [30,31,32].

2.2 DNA extraction and multiplex ligation-dependent probe amplification

Hematoxylin–eosin stained slides were reviewed by an experienced pathologist (PJvD) to confirm and mark the presence of malignancy in tumor samples. Areas with lymphocytic infiltrate or ductal carcinoma in situ were avoided. The ratio of tumor cells compared to other cell types in the infiltrative tumor was determined and expressed as a percentage of the total number of cells. After deparaffinization in xylene, DNA was extracted from the marked tumor area on five 10-µm unstained sections. Areas were scraped off with a scalpel and specimens were heated at 90 °C for 15 min in 200 μL lysis buffer (lysis buffer: 50 mM Tris-HCl buffer, pH 8.0, 0.5% Tween 20). Then, 20 μL proteinase K solution (10 mg/mL; Roche, Almere, The Netherlands) was added, and the sample was incubated at 56 °C overnight (∼16 h) for lysis of the tissues. Inactivation of proteinase K was achieved by heating the sample for 15 min at 80 °C. The crude lysate was centrifuged for 10 min at 14,000 rpm, and 5 μL (50–100 ng) from the supernatant was used for each multiplex ligation-dependent probe amplification (MLPA) reaction according to the manufacturer’s instructions, using the P078-D2 breast tumor kit (MRC Holland, Amsterdam, The Netherlands) as described previously [33]. This probe mix contains 55 MLPA probes, including in total 41 probes for the following breast cancer associated chromosomal regions: 6q25 (ESR1), 7p11 (EGFR), 8p12-p11 (ZNF703, FGFR1, ADAM9, IKBKB), 8q13-q24 (PRDM14, MTDH, MYC), 11q13 (CCND1, EMSY), 16q22 (CDH1), 17q11-q25 (CPD, MED1, ERBB2, CDC6, TOP2A, MAPT, PPM1D, BIRC5), 19q12 (CCNE1) and 20q13 (AURKA). In addition, 14 reference probes are included which target copy number stable regions in various tumor types including breast cancer.

All tests were performed in duplicate on an ABI 9700 PCR machine (Applied Biosystems, Foster City, CA, USA). PCR products were analyzed on an ABI3730 capillary sequencer (Applied Biosystems). Gene copy numbers were analyzed using GeneMapper (Applied Biosystems) and Coffalyser.Net software (MRC-Holland). Six negative reference samples (two blood and four formalin-fixed paraffin-embedded normal breast tissue specimens) were included in each MLPA run to normalize MLPA ratios. For genes with more than one probe present in the kit, the arithmetic mean of all the probe peaks of this gene in duplicate was calculated. A mean probe ratio value below 0.7 was defined as loss, a value between 0.7 and 1.3 was defined as normal, 1.3–2.0 as gain/low-level amplification, and values > 2.0 were defined as high-level amplification, as established previously [34].
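The averaging and thresholding rule described above can be illustrated with a minimal Python sketch. It is not the Coffalyser.Net workflow used in the study; the function names and the example probe ratios are hypothetical, and the handling of values exactly equal to 0.7, 1.3 or 2.0 is an assumption, because the text does not specify which category the boundaries belong to.

```python
from statistics import mean

# Cut-offs as stated in the text. Assigning boundary values (exactly 0.7,
# 1.3 or 2.0) to the lower category is an assumption of this sketch.
LOSS_BELOW = 0.7
GAIN_ABOVE = 1.3
AMP_ABOVE = 2.0


def gene_ratio(probe_ratios_run1, probe_ratios_run2):
    """Arithmetic mean of all probe ratios of one gene across both duplicates."""
    return mean(list(probe_ratios_run1) + list(probe_ratios_run2))


def classify_ratio(ratio):
    """Map a mean MLPA probe ratio to a copy number category."""
    if ratio < LOSS_BELOW:
        return "loss"
    if ratio <= GAIN_ABOVE:
        return "normal"
    if ratio <= AMP_ABOVE:
        return "gain/low-level amplification"
    return "high-level amplification"


# Hypothetical normalized ratios for a gene covered by three probes,
# measured in duplicate (illustrative values, not study data).
example = gene_ratio([0.62, 0.66, 0.71], [0.60, 0.69, 0.68])
print(round(example, 2), classify_ratio(example))
```

Averaging over all probes and both duplicates before thresholding means that a single outlying probe cannot on its own change the call for a gene.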

2.3 Statistics

CNA data were summarized and plotted using GraphPad Prism version 8.3.0 for Windows (GraphPad Software, San Diego, California USA). The web tool ClustVis was used to visualize CNA patterns and create heatmaps after unsupervised hierarchical cluster analysis using Ward’s linkage algorithm with Euclidean distance metrics [35].

Statistical analysis was performed using IBM SPSS statistics for Windows version 26.0.0.1. Differences in number of CNAs between PABC and non-PABC patients, and between PABC subgroups (clusters), were evaluated by independent samples t-test and ANOVA with post-hoc Tukey HSD test, respectively. Differences between categorical variables were examined by chi-square statistics or Fisher’s exact test when indicated. The significance level for individual tests was set at p < 0.05. Bonferroni-Holm correction was applied for multiple comparisons. Overall survival curves were constructed using the Kaplan–Meier method and the log-rank test was used to test for significance. Multivariate survival analysis was done using a backward Cox proportional hazards model. Characteristics with a p-value < 0.10 in univariate analysis and potential confounders were included.
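As an illustration of the Bonferroni-Holm step-down procedure mentioned above, the following Python sketch computes Holm-adjusted p-values. It is a generic textbook implementation, not the SPSS routine used in the study; the function name `holm_adjust` and the input p-values are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm (step-down Bonferroni) adjusted p-values in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease along the order.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for three comparisons.
print(holm_adjust([0.01, 0.04, 0.03]))
```

With 22 genes tested, the smallest raw p-value is multiplied by 22 under this procedure, so a raw p-value must be below roughly 0.0023 to remain significant at the 0.05 level.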

3 Results

Table 1 compares the clinicopathologic characteristics of the selected PABC and non-PABC patients. All non-PABC and all but one of the PABC tumors were poorly differentiated (grade III) according to the modified Scarff-Bloom-Richardson grading system [36]. Mean age of PABC and non-PABC patients was 33 (range 23 – 42) and 40 (range 29 – 48) years, respectively. The mean tumor percentage of the microscopic slides of PABC and non-PABC patients was 70.3% (SD ± 12.2%) and 70.9% (SD ± 11.6%) respectively, whilst the median tumor percentage was identical in both groups (70%, IQR 60–80%).

Table 1 Clinicopathologic characteristics of pregnancy-associated breast cancer (PABC) and non-PABC cohorts in this study

3.1 TOP2A copy number loss is more frequent in triple negative PABC compared to non-PABC

In general, PABC triple negative tumors showed significantly more losses (p = 0.046) and tended to show fewer high-level amplifications than non-PABC triple negative tumors. Table 2 compares the frequencies of individual gene CNAs between PABC and non-PABC cohorts, and shows mean and median MLPA copy number ratios per gene. TOP2A loss was frequent in PABC (19%) while it was not observed in non-PABC patients (p = 0.03; non-significant after correction for multiple comparisons). For all 21 other genes, no significant differences were observed. Figure 1 depicts observed frequencies of losses, gains and amplifications in PABC and non-PABC patients. MYC was the most frequently gained/amplified gene (81% and 66% of PABC and non-PABC patients, respectively). No ESR1 and ERBB2 (HER2) high-level amplifications were observed.
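The TOP2A comparison can be reproduced approximately with a two-sided Fisher's exact test. The sketch below is a plain standard-library implementation, not the SPSS procedure used in the study. The cell counts are an assumption: 6 of 31 PABC tumors with loss is inferred from the reported 19%, and 0 of 23 non-PABC tumors follows from the statement that no loss was observed in that group.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k events in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / total

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum all tables that are at most as likely as the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )


# Assumed counts: TOP2A loss in 6/31 PABC tumors versus 0/23 non-PABC tumors.
p_top2a = fisher_exact_two_sided(6, 25, 0, 23)
print(round(p_top2a, 3))
```

Under these assumed counts the unadjusted p-value is close to the reported 0.03, and multiplying it by the number of genes tested illustrates why the difference does not survive correction for multiple comparisons.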

Table 2 Gene-specific frequencies of copy number alterations by MLPA in pregnancy-associated breast cancer (PABC) and non-PABC patients. Mean and median MLPA copy number ratio, including standard deviation (stdev) and interquartile range (iqr), as well as the results of inter-group statistical comparison, are also given
Fig. 1 Copy number alteration (amplification, gain and loss) frequencies of 22 breast-cancer related genes in pregnancy associated breast cancer (PABC) and non-PABC

3.2 Cluster analysis identifies triple negative PABC subgroup with poor outcome

Unsupervised hierarchical cluster analysis of PABC and non-PABC patients based on CNA profiles revealed no clear distinction between the two groups (Supplementary Fig. 1). Clustering within PABC patients, however, revealed three major clusters (Fig. 2) based on significant CNA differences between chromosomal regions 6q (ESR1), 8p (ZNF703, FGFR1, ADAM9), 11q (CCND1), 17q (CPD, MED1, CDC6, TOP2A, MAPT) and 20q (AURKA). Supplementary Table 1 provides an overview of the different clusters. One of these three clusters (cluster 2) consisted of patients with markedly worse survival than the other triple negative PABC patients (p = 0.038; Fig. 3), and was characterized by more 8p loss (ZNF703, FGFR1 and ADAM9) compared to the other two clusters (Fig. 4). No significant differences in gestational trimester, age at BC diagnosis or cTNM stage were observed between clusters.

Fig. 2 Unsupervised hierarchical cluster analysis of triple negative pregnancy associated breast cancer (PABC) patients based on somatic copy number alteration patterns of 22 breast cancer related genes. Cluster 2 was associated with significantly worse prognosis compared to clusters 1 and 3

Fig. 3 Kaplan Meier survival plots comparing outcome (A) in three pregnancy associated breast cancer (PABC) copy number alteration-classified subgroups identified by unsupervised hierarchical cluster analysis, and (B) patients with (ratio < 0.7) and without FGFR1 copy number loss by multiplex ligation-dependent probe amplification

Fig. 4 Differences in ZNF703, FGFR1, ADAM9 and CCND1 copy number between three triple negative pregnancy associated breast cancer (PABC) subgroups identified by unsupervised hierarchical cluster analysis (clusters 1, 2 and 3). Boxplots extend from the 25th to 75th percentiles. Whiskers and outliers were identified by the Tukey method. Cumulative number of patients with neutral copy number, loss and gain/amplification per cluster are shown in the bottom row. * p < 0.05; ** p < 0.01; *** p < 0.001

3.3 FGFR1 and TOP2A copy number loss are independent prognosticators in triple negative PABC

CNAs individually associated with poor overall survival were ESR1 loss (n = 3 events, p = 0.025), FGFR1 loss (n = 9 events, p = 0.0042; Fig. 3), ADAM9 loss (n = 7 events, p = 0.037) and CCNE1 gain (n = 2 events, p = 0.021). Patients presenting with tumors harboring FGFR1 loss more frequently developed distant metastases (67% vs. 25% if copy number neutral, p = 0.048). Tumors harboring MYC gain or amplification were less likely to develop lymph node metastases (9% vs. 57% if copy number neutral, p = 0.018). TOP2A loss, ESR1 loss, and FGFR1 loss were independent predictors of overall survival (OS) in Cox regression analysis including cT and cN (HR 8.960 (95% CI 1.407–57.079), p = 0.020; HR 10.589 (95% CI 1.046–107.2108), p = 0.046; and HR 3.586 (95% CI 0.981–13.103), p = 0.053, respectively). Of these three CNAs, only FGFR1 loss and TOP2A loss remained in the model when entered together (HR 4.408, p = 0.073 and HR 7.100, p = 0.056 respectively).

3.4 Associations found in PABC are not seen in non-PABC

In the non-PABC group, no significant associations with survival were observed for FGFR1 or any other interrogated gene, although AURKA gain/amp (p = 0.066) and EMSY gain/amp (p = 0.056) tended to predict worse survival. Unsupervised cluster analysis of non-PABC patients also revealed three clusters based on significant CNA differences between chromosomal regions 8p (ZNF703, FGFR1, ADAM9) and 17q (TOP2A, MAPT and BIRC5). All three patient clusters however had a similar survival (p = 0.463).

4 Discussion

To investigate the underlying mechanisms resulting in the aggressive clinical behavior of PABC, we aimed to identify specific gene CNAs characterizing triple negative PABC, by conducting a comparative analysis of a cohort of triple negative PABC patients and subtype-matched non-PABC patients (with a diagnosis of invasive breast cancer ≤ 45 years of age). We have shown that triple negative PABC tumors exhibit enrichment for copy number losses by MLPA in general and some unique CNAs, including the enrichment for TOP2A copy number loss. In addition, MYC was the most frequently gained/amplified gene in PABC [37].

Cluster analysis based on CNA profiles identified a triple negative PABC subgroup with a particularly poor prognosis, characterized by chromosome 8p copy number loss. Further analysis of individual gene CNAs revealed that FGFR1 copy number loss on chromosome 8p11.23 was the best prognosticator residing in this chromosomal region. FGFR1 loss was an independent predictor of worse overall survival in multivariate analysis and predicted the development of distant metastases.

In line with our observations, other studies have previously described 8p copy number loss as a frequent event in various cancer types including breast cancer, suggesting that this region harbors one or more tumor suppressor genes. Loss of 8p has been linked to advanced tumor stage, high grade, high proliferation index, negative ER and PR status, early-onset breast cancer, poor survival rates and shortened response to oncologic systemic treatment [38,39,40]. Cai et al. examined the effect of a chromosome 8p 2–35 Mb targeted deletion, which was insufficient to transform MCF10A cells, but altered the fatty acid and ceramide metabolism leading to increased invasiveness and enhanced autophagy [41]. Their results provided evidence to suggest that screening for 8p loss in breast tumors may serve as a selection strategy for treatment with microtubule inhibitors (confers resistance), statins (confers resistance), and/or autophagy inhibitors (confers sensitivity). This strategy may thus be of particular interest in a PABC context.

Besides FGFR1, TOP2A copy number loss on 17q21.2, ESR1 loss on chromosome 6q25.1 and CCNE1 gain on 19q12 were identified as biomarkers for poor outcome in triple negative PABC patients. TOP2A loss, enriched in PABC tumors and covered by three independent MLPA probes, was an independent predictor of poor overall survival alongside FGFR1 loss.

TOP2A encodes the topoisomerase IIα protein, an intracellular target of anthracyclines. Several studies have therefore suggested that anthracycline-containing therapy might be most effective in patients whose tumors carry amplified TOP2A [42,43,44]. Interestingly, TOP2A gene deletion might also confer increased sensitivity to anthracyclines [42, 45,46,47] suggesting a potential benefit of anthracycline-containing chemotherapy in triple negative PABC patients.

ESR1 encodes estrogen receptor alpha. As expected, since ESR1 amplification usually leads to ER alpha overexpression, no amplifications were found in either triple negative cohort. Yet, we did observe several ESR1 losses in PABC tumors (9%; covered by two MLPA probes). Activating mutations in ESR1 are recurrent mechanisms of acquired resistance to aromatase inhibitors in ER-positive tumors, but ESR1 allelic losses have only rarely been described [48]. Laenkholm et al. reported that a large fraction of ER negative tumors showed ESR1 deletion (55%) by FISH [49]. They also noted an elevated number of deletions in cohorts with a higher number of ER negative patients in the DBCG trial 89D. Thus, ESR1 deletions may contribute to the ER alpha negative status of these cancers.

Cyclin E1 (CCNE1) plays a critical role in cell cycle regulation, DNA replication, chromosome segregation, and G1 to S-phase transition [50]. CCNE1 overexpression and gene amplification have both been associated with poor prognosis in triple negative breast cancer [51,52,53] as well as epithelial ovarian cancer [54]. In ovarian cancer, the near mutual exclusivity of homologous recombination pathway mutations and CCNE1 amplification generally results in resistance to platinum-based cytotoxic chemotherapies and ineffective poly (adenosine diphosphate [ADP]-ribose) polymerase (PARP) inhibition [54].

CCNE1 amplification can cause faster mitotic exit, an increased rate of mitotic slippage and resistance to anti-mitotic chemotherapies such as taxanes. Breast tumor cells engineered to overexpress cyclin E have been shown to have an increased sensitivity to cisplatin and paclitaxel combinations [55, 56]. Promising targeted strategies using CDK2 inhibitors and WEE1 kinase inhibitors are currently being examined in ongoing biomarker-driven clinical trials.

The abovementioned prognostic CNAs proved to be unique to PABC tumors, as similar associations were not observed in the breast tumors diagnosed outside pregnancy or the postpartum period. This reinforces the notion that PABC represents an even more distinctive entity of breast cancer than previously reported, requiring its own biomarkers and therapeutic approaches.

Even though several studies on the genomic profile of PABC have been conducted [57], this analysis is novel as it focuses specifically on the triple-negative PABC subtype and correlates the clinicopathological features of the disease and outcomes with the CNAs. Although MLPA analysis alone cannot determine whether triple-negative PABC is defective in homologous recombination, recent genomic analysis has revealed that a significant portion of TNBC is characterized by abundant chromosomal structural variants and CNAs due to homologous recombination deficiency [58].

Some limitations of this study should be noted. Although MLPA is a multiplex technique that can assess multiple relevant CNAs simultaneously, we have only examined a limited number of genes here. PABC and non-PABC cohorts were relatively small but perfectly matched for triple negative subtype, and still yielded prognostically significant findings. These new genetic insights can serve as a starting point for further, more extensive copy number analyses by next generation sequencing in our entire PABC cohort, after obtaining the formalin-fixed paraffin-embedded (FFPE) tumor material of the remaining patients. In addition, age-matching was not perfect, as PABC patients were on average slightly younger (33 years) than non-PABC patients (40 years) upon final analysis. Age was, however, not significantly associated with any of the interrogated variables, so we do not believe that this has played an important role here.

In conclusion, this study provides important new insights into the biology of triple negative PABC and suggests that several copy number alterations, particularly 8p loss, TOP2A loss, ESR1 loss and CCNE1 gain, are implicated in tumor progression during pregnancy. FGFR1 loss and TOP2A loss are promising new biomarkers that independently identify a subgroup of poor-prognosis triple negative PABC patients who require personalized cancer treatment. In addition, this study provides unprecedented therapeutic clues for further studies to pursue in a larger PABC population.