Introduction

Even though the prognosis of breast cancer has greatly improved over the past decades [1], approximately 30% of patients diagnosed with primary breast cancer will develop recurrent disease within 10 years [2,3,4]. In these patients, metastatic disease is the primary reason for shorter survival, yet the process of metastasis is still poorly understood [5]. Epithelial-to-mesenchymal transition (EMT) is an evolutionarily conserved program that is active during physiological processes such as embryogenesis and branching morphogenesis of the mammary gland, and that is reactivated during tumor progression and metastasis of carcinomas [5,6,7,8,9]. During this transition, the epithelial cells downregulate epithelial markers, lose features such as polarity and intercellular adhesion, and upregulate mesenchymal markers to acquire features such as a fibroblast-like morphology and increased motility [5,6,7,8,9]. EMT is a reversible process, and mesenchymal-to-epithelial transition, the process of regaining epithelial properties, is considered essential for the establishment and outgrowth of cancer cells at secondary sites [5,6,7,8,9,10,11]. Recently, tumor cells co-expressing epithelial and mesenchymal markers have been identified and described as “partial EMT”, representing a continuous spectrum of intermediate states between the epithelial and mesenchymal phenotypes [12,13,14,15,16]. Tumor cells in an intermediate state of EMT have been associated with greater metastatic potential and a worse prognosis in breast cancer [12, 13, 16]. The processes of EMT are well recognized in preclinical studies; however, direct evidence of EMT in clinical samples is still lacking [17,18,19,20,21].

Currently, there are no standard biomarkers to demonstrate EMT; however, downregulation of epithelial markers, such as the cell surface protein E-cadherin and the cytoskeleton protein cytokeratin, is considered a hallmark of EMT. Common mesenchymal markers that are upregulated during EMT are the cell surface protein N-cadherin, the cytoskeleton protein vimentin and the transcription factor twist [22,23,24]. The primary aim of the present study was to evaluate the expression of these five selected EMT-related markers by immunohistochemistry (IHC) in breast cancer specimens, and to investigate whether changes occur during tumor progression by examining paired samples of primary tumors (PTs) and synchronous lymph node metastases (LNMs), and paired samples of PTs and recurrences. Expression of the EMT-related markers was assessed individually, and EMT phenotypes were defined based upon the combined expression pattern of the markers to explore the wide spectrum of previously reported EMT phenotypes, including partial EMT. We hypothesized that the expression of EMT-related markers and phenotypes would differ significantly between tumor progression stages. To evaluate the clinical significance of EMT-related markers and phenotypes, correlation with clinicopathological factors and patient outcome was analyzed.

Materials and methods

Patients

The present study is based on a subset of breast cancer patients previously included in a prospective observational study originally evaluating the presence and prognostic value of disseminated tumor cells in bone marrow [25]. A total of 569 primary breast cancer patients diagnosed between 1999 and 2003 were included (South Swedish Health Care Region: Lund, Landskrona, Helsingborg), and the study is hereafter referred to as the Bone Marrow Metastases (BMM) cohort [25]. The study was approved by the Lund University ethics committee (LU699-09, LU75-02), and all patients included provided written informed consent. Results of the observational study and information about the patient cohort, including biomarker protocols and assessment, have been described in detail previously [25,26,27]. Patients diagnosed with lobular carcinomas were excluded from the present study, as several studies have demonstrated a different expression pattern of E-cadherin in this type of breast cancer [28,29,30]. Median follow-up time for patients alive and without distant recurrences was 13.9 years at last follow-up. The latest data on recurrences were retrieved from individual patient charts and causes of death from the Swedish Register of Causes of Death (Central Statistics Office; November 2015).

Tissue microarray and immunohistochemistry

Formalin-fixed paraffin-embedded archival blocks with tumor tissue from the BMM cohort were retrieved from the archives of pathology departments and one set of tissue microarrays (TMAs) (2 × 1 mm core diameter) was constructed as previously described (Beecher instruments, MD, USA) [26].

Consecutive 3–4 μm sections from each TMA block were cut and transferred to glass slides (Menzel Super frost plus, Thermo Scientific, Germany), dried at room temperature and baked for 2 h at 60 °C in a heat chamber. Following deparaffinization and antigen retrieval, IHC staining was performed using Autostainer Plus (Dako Denmark A/S, Glostrup, Denmark). The following antibodies and dilutions were used: E-cadherin (NCH-38, #M3612 Dako Denmark A/S, 1:100), pancytokeratin (AE1/AE3, #3515 Dako Denmark A/S, 1:500), N-cadherin (3B9; #33–39 Invitrogen, 1:25), vimentin (V9, #M0725 Dako Denmark A/S, 1:300) and twist (2C1a, #ab50887 Abcam, 1:10). Counterstain with Mayer’s Haematoxylin was applied for 2 min to each section, and the visualization kit K801021-2 (Dako Denmark A/S) was used for all stainings.

Stained TMA sections were scanned (Hamamatsu Photonics, NanoZoomer, software NDP-Scan, Japan), and using the web-based image and data management platform Xplore (Philips) all markers were scored independently by two observers blinded to clinical data [E-cadherin, N-cadherin (KA, CF); pancytokeratin, vimentin, twist (KL, CLTJ)]. Stainings were evaluated for intensity 0–3 (0 = negative, 1 = weak, 2 = intermediate, 3 = strong) and percentage of stained tumor cells. Only invasive tumor cells were assessed, and only TMA core biopsies with > 100 invasive tumor cells were included. Samples with differences in assessment between the two investigators were re-evaluated and a consensus decision was reached. The highest value of the two cores was used for statistical analysis.
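The core-level rules described above can be illustrated with a minimal Python sketch. The names Core and sample_percentage are hypothetical, and reading "the highest value" as the higher percentage of stained tumor cells is an assumption; the text does not state whether intensity also entered this step.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Cores must contain MORE than this many invasive tumor cells to be evaluable.
MIN_INVASIVE_CELLS = 100


@dataclass
class Core:
    """One TMA core after consensus scoring (hypothetical structure)."""
    invasive_cells: int      # number of invasive tumor cells in the core
    percent_stained: float   # consensus percentage of stained tumor cells (0-100)


def sample_percentage(cores: Sequence[Core]) -> Optional[float]:
    """Return the highest percentage among evaluable cores.

    Returns None when no core has more than MIN_INVASIVE_CELLS invasive
    tumor cells, i.e. the sample is not evaluable for this marker.
    """
    evaluable = [
        c.percent_stained for c in cores if c.invasive_cells > MIN_INVASIVE_CELLS
    ]
    return max(evaluable) if evaluable else None
```

With two evaluable cores scored at 5% and 40%, this sketch would carry 40% forward; a sample whose only core holds exactly 100 invasive cells would be treated as not evaluable.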

In accordance with similar thresholds used in previous studies, a value of > 10% positive tumor cells independent of intensity was chosen to define positive expression of E-cadherin, pancytokeratin, N-cadherin, vimentin and twist [31,32,33,34,35,36]. As all samples but one were positive regarding pancytokeratin expression, the majority in the range of 90–100%, this marker was excluded from further analysis. EMT phenotype classification was based upon the EMT phenotypes proposed previously in breast, small intestinal and esophageal cancer [13, 15, 16, 37, 38]. We stratified all samples where E-cadherin (epithelial marker) and at least one mesenchymal marker (N-cadherin, twist, vimentin) were evaluable into the following phenotypes of EMT: epithelial type (positive expression of epithelial marker and negative expression of all evaluable mesenchymal markers); mesenchymal type (negative expression of epithelial marker and positive expression of at least one mesenchymal marker); partial EMT type (positive expression of epithelial marker and of at least one mesenchymal marker); and negative type (negative expression for epithelial marker and all evaluable mesenchymal markers).
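The positivity cut-off and the four phenotype definitions above can be written as a short decision rule. The following Python sketch is an illustration only; the function names and dictionary keys are hypothetical, and a value of None is assumed to stand for a marker that was not evaluable.

```python
from typing import Dict, Optional

# A marker is positive when MORE than 10% of tumor cells stain,
# independent of staining intensity.
POSITIVE_CUTOFF = 10.0

# Hypothetical dictionary keys for the three mesenchymal markers.
MESENCHYMAL_MARKERS = ("n_cadherin", "twist", "vimentin")


def is_positive(percent_stained: Optional[float]) -> Optional[bool]:
    """Dichotomize a percentage; None means the marker was not evaluable."""
    if percent_stained is None:
        return None
    return percent_stained > POSITIVE_CUTOFF


def emt_phenotype(percentages: Dict[str, Optional[float]]) -> Optional[str]:
    """Assign an EMT phenotype from per-marker percentages of stained cells.

    Returns None unless E-cadherin and at least one mesenchymal marker
    are evaluable.
    """
    epithelial = is_positive(percentages.get("e_cadherin"))
    mesenchymal = [is_positive(percentages.get(m)) for m in MESENCHYMAL_MARKERS]
    mesenchymal = [status for status in mesenchymal if status is not None]
    if epithelial is None or not mesenchymal:
        return None

    any_mesenchymal = any(mesenchymal)
    if epithelial and any_mesenchymal:
        return "partial EMT"
    if epithelial:
        return "epithelial"
    if any_mesenchymal:
        return "mesenchymal"
    return "negative"
```

For example, a sample with 95% E-cadherin and 30% vimentin staining would be labeled partial EMT, whereas 5% E-cadherin with 20% N-cadherin would be labeled mesenchymal. Note that exactly 10% counts as negative under the "> 10%" rule.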

Statistical analysis

Markers were analyzed individually and combined as defined EMT phenotypes. The association between EMT-related marker/phenotype expression and different patient and PT characteristics was analyzed with the χ2 test and Fisher’s exact test. Comparison of EMT marker/phenotype status between PTs, synchronous LNMs and recurrences was performed using the exact McNemar test. To evaluate survival effect, Kaplan–Meier survival curves and the log rank test were used. Hazard ratios (HR) were calculated by Cox regression, and multivariable analyses were adjusted for age, tumor size, lymph node status, Nottingham histologic grade (NHG) and St Gallen breast cancer subtype. Schoenfeld’s test (estat phtest in STATA) was used to check the assumption of proportional hazards, and the evidence against proportionality was found to be weak for all markers. Distant recurrence-free interval (DRFi) was chosen as primary endpoint before evaluation of EMT-related markers. DRFi was defined as the time from surgery until verified distant recurrence by radiology/biopsy or breast cancer-related death. For event-free patients, follow-up was censored at the last medical follow-up visit. All P-values presented are two-sided. No adjustment for multiple testing has been performed. P-values should be regarded as the level of evidence against the null hypothesis. We follow the advice in Benjamin et al. and use the term suggestive evidence for P-values in the range 0.005–0.05 and significant for P-values below 0.005 [39]. Statistical calculations were performed using IBM SPSS Statistics (version 24.0, IBM, Armonk, NY, USA) and STATA (version 15.1, StataCorp. College Station, TX, USA). Reporting Recommendations for Tumor Marker Prognostic Studies (REMARK) were followed where applicable [40].
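The P-value terminology adopted above can be summarized in a small Python sketch. The function name evidence_label is hypothetical, the handling of the exact boundary values is an assumption, and the label "weak" for P-values above 0.05 is our illustrative choice rather than a term defined in this section.

```python
def evidence_label(p_value: float) -> str:
    """Map a two-sided P-value to the terminology used in this paper.

    Below 0.005 -> "significant"; 0.005 to 0.05 -> "suggestive".
    The "weak" label for larger values is an illustrative assumption.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if p_value < 0.005:
        return "significant"
    if p_value <= 0.05:
        return "suggestive"
    return "weak"
```

Under this convention, P = 0.004 would be called significant, whereas P = 0.006 and P = 0.03 would both be called suggestive.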

Results

Patients and tumor characteristics

The original BMM trial recruited 569 participants. A total of 14 patients did not fulfill the inclusion criteria [25], thus leaving 555 patients for the present study (Fig. 1). Archival tumor tissue was available in the form of TMA from 535 of the included patients (96%). Excluding samples of histological lobular type, ductal carcinoma in situ and one sample with missing histological status, PT samples from a total of 419 patients (78%), matched synchronous LNMs from 131 patients (24%), and recurrence samples from 34 patients (6%) were included in the final analyses. Patient and PT characteristics of the entire BMM cohort and the subset included in the present study are summarized in Supplementary Table 1 (Online Resource 1). Overall, the characteristics of the 419 included patients were similar to the characteristics of all patients included in the BMM trial.

Fig. 1

Flowchart of patient cohort and EMT-related marker expression/phenotype in primary tumor, synchronous lymph node metastases and recurrences. Synchronous lymph node metastasis and recurrences were only evaluable in 131 and 34 patients, respectively. Boxes inserted into the flowchart display information of EMT-related marker/phenotype on matched pairs, i.e., numbers of primary tumors and lymph node metastases, and primary tumor and recurrences, respectively. BMM bone marrow metastasis, DCIS ductal carcinoma in situ, EMT epithelial-mesenchymal transition, N number, LNM lymph node metastasis, PT primary tumor, TMA tissue microarray

Table 1 Associations between primary tumor epithelial-mesenchymal transition (EMT)-related markers and patient and primary tumor characteristics

Patient and tumor characteristics in relation to primary tumor expression of EMT-related markers and phenotype

IHC analysis was successful in 76–92% of cases (Fig. 1). Positive staining of E-cadherin and N-cadherin was localized to the cellular membrane, twist to the nucleus, and vimentin to the cytoplasm. Photomicrographs demonstrating examples of IHC staining are presented in Fig. 2.

Fig. 2

Representative cases of immunohistochemistry stainings for E-cadherin, N-cadherin, twist and vimentin, and of each epithelial-mesenchymal transition (EMT) phenotype. All samples evaluable for E-cadherin (epithelial marker) and at least one mesenchymal marker (N-cadherin, twist, vimentin) were stratified into the following phenotypes of EMT: epithelial type (positive expression of epithelial marker and negative expression of all evaluable mesenchymal markers); mesenchymal type (negative expression of epithelial marker and positive expression of at least one mesenchymal marker); partial EMT type (positive expression of epithelial marker and of at least one mesenchymal marker); and negative type (negative expression for epithelial marker and all evaluable mesenchymal markers). Original magnification ×40

Table 1 presents an overview of patient and PT characteristics in relation to PT expression of the EMT-related markers. Neither PT status of E-cadherin nor PT status of twist was associated with any clinicopathological factor. Notably, suggestive evidence of an association was seen between N-cadherin positive status in the PT and negative ER status (P = 0.03), and triple-negative St Gallen subtype (P = 0.02), as well as between vimentin positive status in the PT and younger age (P = 0.006), and tumor size > 20 mm (P = 0.02). Associations were seen between N-cadherin positive status in the PT and NHG III (P < 0.001) and high Ki67 (P = 0.004), as well as between vimentin positive status in the PT and NHG III, negative hormone receptor status, high Ki67, and triple-negative St Gallen subtype (P < 0.001). PT status of E-cadherin was inversely associated with PT status of vimentin (P = 0.01), whereas PT status of N-cadherin was positively associated with PT status of vimentin (P < 0.001).

Mesenchymal and partial EMT phenotypes were associated with NHG III, hormone receptor negative status, high Ki67, triple-negative St Gallen subtype as well as epidermal growth factor receptor (EGFR) positivity and platelet-derived growth factor C (PDGFC) positivity, whereas epithelial and negative EMT phenotypes were associated with NHG I and II, hormone receptor positive status, low Ki67, and luminal A St Gallen subtype (P < 0.001) (Table 2).

Table 2 Associations between primary tumor epithelial-mesenchymal transition (EMT) phenotype and patient and primary tumor characteristics in 382 patients with invasive breast cancer

EMT-related marker status across tumor progression stages

E-cadherin expression was downregulated more frequently in recurrences, whereas vimentin was more frequently expressed in recurrences, when compared to expression in the PTs and LNMs. No recurrence was positive regarding the expression of twist (Table 3).

Table 3 Distribution of epithelial-mesenchymal transition (EMT)-related marker and phenotype status in the subsets of patients included at different stages

The expression of each EMT-related marker was compared across the different tumor progression sites as paired data (Table 4). A total of 103 cases had paired data regarding E-cadherin expression between PTs and synchronous LNMs, and a discordance rate of 5% was observed. An E-cadherin discordance rate of 17% was observed between 23 paired PTs and recurrences. Regarding N-cadherin expression, discordance rates of 9% and 13% were observed between PTs and synchronous LNMs (103 pairs), and PTs and recurrences (24 pairs), respectively. Twist expression was more stable between the 92 pairs of PTs and synchronous LNMs, with a discordance rate of only 2%. None of the 20 matched pair samples of PTs and recurrences were twist positive. Discordant vimentin expression was observed in 9% of 97 pairs of PTs and synchronous LNMs, and in 14% of 22 pairs of PTs and recurrences. However, none of the shifts observed were statistically skewed (Exact McNemar test, P > 0.05).
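For clarity, a discordance rate as used above is the fraction of matched pairs in which marker status differs between the two sites. The Python sketch below illustrates the calculation; the function name is hypothetical and the example pairs are made up for illustration, not taken from the study data.

```python
from typing import Sequence, Tuple


def discordance_rate(pairs: Sequence[Tuple[bool, bool]]) -> float:
    """Fraction of matched pairs whose marker status differs.

    Each pair holds (status at first site, status at second site),
    with True meaning marker positive.
    """
    if not pairs:
        raise ValueError("at least one matched pair is required")
    discordant = sum(1 for first, second in pairs if first != second)
    return discordant / len(pairs)


# Made-up illustration: 23 pairs, 4 of which change status between sites.
example_pairs = [(True, True)] * 19 + [(True, False)] * 3 + [(False, True)]
rate = discordance_rate(example_pairs)  # 4 / 23, i.e. about 17%
```

The rate alone does not convey the direction of change, which is why the paired comparison additionally relies on the exact McNemar test.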

Table 4 Epithelial-mesenchymal transition (EMT)-related marker and EMT phenotype conversion rate between paired primary tumors and corresponding metastases

EMT phenotypes across tumor progression stages

Based upon combined IHC results for E-cadherin (epithelial marker) and N-cadherin, twist, and vimentin (mesenchymal markers), an EMT phenotype could be assigned to 382 PTs, 110 LNMs and 27 recurrences. Of the LNMs and recurrences, 102 and 23, respectively, had paired data with the PT.

In recurrences the epithelial phenotype was less frequently observed, whereas partial EMT and negative phenotypes were more frequently seen, compared to the phenotype distribution in PT and LNM. No recurrence was classified as mesenchymal phenotype (Table 3).

An overall EMT phenotype conversion was seen in 20% of cases between PT and LNM, and in 39% between PT and recurrence. The evidence for a difference in EMT phenotypes between pairs of PT and LNM, or PT and recurrence, was generally weak when analyzed by the exact McNemar test (Table 4). However, a trend towards a shift from epithelial type in PT to any other phenotype in the recurrence was seen (P = 0.07). An EMT phenotype conversion was seen in 8/23 (35%) cases when looking at the epithelial vs non-epithelial phenotype between matched pairs of PT and recurrence samples, 88% (7/8) of which changed from an epithelial to a non-epithelial phenotype.
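The exact McNemar test uses only the discordant pairs and asks whether shifts in one direction outnumber shifts in the other more than expected under a binomial distribution with probability 0.5. The following Python sketch of the two-sided test is a generic illustration, not the SPSS or STATA routine used in the study. Applying it to 7 versus 1 discordant pairs assumes that the one remaining conversion went from non-epithelial to epithelial.

```python
from math import comb


def exact_mcnemar_p(b: int, c: int) -> float:
    """Two-sided exact McNemar P-value from the two discordant counts.

    b: pairs shifting in one direction (e.g. epithelial -> non-epithelial)
    c: pairs shifting in the opposite direction
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, hence no evidence of a shift
    k = min(b, c)
    # Probability of a split at least as extreme as observed, one tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


p = exact_mcnemar_p(7, 1)  # 2 * (1 + 8) / 256 = 0.0703...
```

Under that assumption, the result of roughly 0.07 is consistent with the P-value reported for the epithelial vs non-epithelial comparison between PT and recurrence.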

EMT-related marker and phenotype status and patient outcome

Kaplan–Meier analyses and the log rank test revealed that patients with twist positive PTs had shorter DRFi compared to patients with twist negative PTs (P = 0.014) (Fig. 3). The Cox univariable regression analysis confirmed that PT status of twist was associated with DRFi [HR 2.4, 95% confidence interval (CI) 1.2–5.1, P = 0.02], and the point estimate remained essentially the same (HR 2.5, 95% CI 0.97–6.6, P = 0.06) in the Cox multivariable analysis (Table 5). Neither PT status of E-cadherin, N-cadherin, vimentin or EMT phenotype, nor status of E-cadherin, N-cadherin, twist, vimentin or EMT phenotype in LNMs, predicted DRFi. When evaluating the overall shift of EMT phenotype by comparing two sites, there was no difference in DRFi between patients with a shift and those with no shift of phenotype (Supplementary Fig. 1/Online Resource 2).

Fig. 3

Kaplan–Meier survival curves showing distant recurrence-free interval (DRFi; years) in relation to twist status in primary tumor. P value from log rank test

Table 5 Cox univariate and multivariate analysis of distant recurrence-free interval

Discussion

In the present study, we assessed EMT-related proteins in matched samples of PTs, synchronous LNMs and recurrences to test the hypothesis that EMT profiles are unstable throughout tumor progression in breast cancer patients. We found discordance rates of expression of single EMT-related markers and defined EMT phenotypes between matched tumor samples in the range of 2–35%. Interestingly, non-epithelial phenotypes were more frequently identified in recurrences than in PTs and LNMs. However, in paired analysis between tumor progression sites, including only patients with paired marker readings, the evidence for change in EMT-related markers and EMT phenotypes was generally weak. This lack of evidence should, however, not be interpreted as no change, as the power to detect change with the sample size in this study is low. PTs with a positive N-cadherin, positive vimentin, mesenchymal or partial EMT status were associated with more aggressive tumor characteristics, exemplified by the triple-negative subtype. We further evaluated the clinical significance of EMT-related markers and EMT phenotypes, and found that twist positive status of the PT was a negative prognostic factor for DRFi.

Previous studies have compared IHC status of the EMT-related markers included in our study between tumor progression sites, mostly comparing PTs and LNMs, and only a few studies have described conversion rates between matched samples [10, 13, 28, 31, 32, 41]. Overall, our study showed similar expression of the EMT-related markers and phenotypes in PTs and LNMs, whereas the expression in recurrences differed. In contrast to what has been reported previously in distant metastases [10], we found that E-cadherin was expressed less frequently in recurrences. Accordingly, loss of epithelial phenotype and gain of partial EMT and negative phenotypes were more frequently observed in recurrences, as compared to the phenotype distribution in PTs and LNMs. Interestingly, none of the recurrence samples were classified as mesenchymal or twist positive, suggesting that these characteristics do not provide the same advantage at the secondary tumor site, where the tumor cells need to re-epithelialize, as they might in the PT.

To the best of our knowledge, only two previous breast cancer studies have evaluated EMT-related markers in tumor tissue in combination to define distinct EMT phenotypes [13, 16]. We used a four-marker panel to define the EMT phenotypes and found a distribution similar to that reported previously in a large study classifying EMT phenotypes based upon the expression pattern of E-cadherin and fibronectin in PT samples from 1495 breast cancer patients [16]. A smaller study of EMT phenotypes based upon combined expression of E-cadherin and vimentin in 176 breast PT samples reported a lower fraction of the epithelial subtype and consequently a higher fraction of mesenchymal, partial EMT and negative subtypes [13]. Discrepancies in phenotype frequencies might be a consequence of investigating a panel of four markers rather than two, as well as of excluding all lobular tumors from our study cohort. As in the two previous studies, we were able to confirm the existence of tumor specimens with a partial EMT phenotype in breast cancer; however, the method used in our study makes it impossible to distinguish the coexistence of both epithelial and mesenchymal tumor cells from the presence of true ‘double positive’ tumor cells.

Positivity of N-cadherin and vimentin was associated with tumor aggressiveness, consistent with previous reports [42, 43]. Nevertheless, in this study, we did not see any difference by marker status in relation to prognosis in terms of DRFi. Expression of twist in PTs has also previously been linked to various clinical parameters [44, 45]. However, we did not find a significant association with any tumor or patient characteristic analyzed in this study. Still, we found that the few patients (5%) with a twist positive PT had a shorter DRFi, compared to patients with a twist negative PT. This is in line with results obtained previously, where positive expression of twist has been associated with poor outcome of breast cancer patients [44, 46], though an association between positive twist expression and a superior overall survival has also been described [33]. Our results might seem paradoxical, though twist is a transcription factor involved in multiple signaling pathways and could be a more biologically relevant marker to study compared to the three EMT effector markers included in our study [47, 48].

Furthermore, in our study, the mesenchymal and partial EMT phenotypes were associated with more aggressive tumor characteristics, such as the triple-negative subtype, in line with what has been reported previously [13, 16]. A novel finding in the present study is that the partial and mesenchymal EMT phenotypes displayed a high fraction of EGFR and PDGFC positivity compared with the epithelial and negative phenotypes, further supporting that these phenotypes define an aggressive type of breast cancer. EGFR is a hallmark of basal-like breast cancer and has repeatedly been presented as a key player promoting EMT [49, 50]. Interestingly, PDGFC is also associated with features of inferior prognosis in human breast cancer [27], and the PDGFC gene strongly correlates with gene sets defining the EMT pathways, supporting an association between PDGFC and EMT as well [51]. The functional role of PDGFC in the EMT promoting process is, however, not settled. EMT phenotype has also been reported to provide prognostic information, with patients with a partial EMT phenotype exhibiting a higher risk of recurrence and inferior survival [13, 16]. We did not find any differences in DRFi according to PT EMT phenotype in our study. Of note, the published studies evaluated other endpoints than we did, and we excluded patients with lobular cancers, which Bae et al. did not. Importantly, our study included only one fourth of the number of samples included in the Bae et al. [16] study, and thus has weaker statistical power to perform subgroup analysis.

This study has several strengths, including assessment of EMT-related markers on tissue samples available from 75% of the participants included in the BMM cohort, prospectively defined hypotheses and analysis plan, evaluation of several EMT-related markers on tumor site pairs by two independent assessors blinded to clinical data, as well as relatively long median follow-up (> 10 years). However, our study also has limitations, such as evaluation on TMAs. Nevertheless, similar IHC expression of E-cadherin, N-cadherin, twist, and vimentin between margin and center of primary tumor has been reported, which suggests that no change in our results would be obtained from analysis of whole-tissue sections [32, 52]. Moreover, classification of EMT phenotypes is still controversial, and although we have selected representative EMT-related markers, there are several other known markers related to EMT that would be relevant to evaluate in this context. In addition, the power to detect significant marker changes between PTs and recurrences was low due to the limited number of sample pairs. Furthermore, a main obstacle is to differentiate tumor cells undergoing EMT from stromal fibroblasts by IHC, and thus it is possible that we have underestimated the extent of stained tumor cells, especially considering a marker like vimentin which is present in fibroblasts [53].

In summary, the study confirms the association of single EMT-related markers and specific EMT phenotypes with aggressive features of the primary tumors, and the negative prognostic information provided by twist expression in PTs. The findings indicated loss of the epithelial phenotype between PTs and recurrences as a reflection of tumor progression. Still, we were not able to demonstrate strong evidence for differences in expressed EMT-related markers and defined EMT phenotypes between primary tumors and synchronous LNMs or recurrences. Our findings may be explained by underlying biology, add to the current knowledge of breast cancer progression, and suggest that further investigation in larger cohorts is warranted.