Introduction

With 1.7 million new cases causing 522,000 deaths worldwide per year, breast cancer is the leading cause of female cancer death [1]. Early detection, optimal surgery, and adjuvant therapy are the key strategies to improve prognosis. Although 5-year overall survival increased from 77 % in the period 1978–1984 to 82 % in the period 1995–2003, about 16 % of patients will develop distant metastases and eventually die of the disease [2]. The preferred site of distant metastases strongly depends on the subtype of breast cancer. Lobular-type breast cancer preferentially metastasizes to bone, GI tract and ovaries, triple negative breast cancer to liver and brain, and luminal breast cancer to the bone and skin, while well-circulated organs like the spleen and heart almost never harbor metastases [3–5]. This “organotropism” was first described by Paget et al. about a century ago as the “seed and soil” analogy, where tumors are supposed to have a “seminal influence” on the metastatic micro-environment, and thereby act together with the distant organ to effect tumor metastases [6]. The identity of these seminal influences remains elusive. Both genetic and epigenetic changes may play a role here. Epigenetic alterations are of pivotal interest since they can not only influence tumor behavior but may also become important therapeutic targets, as these processes are potentially reversible. Therapies that target DNA methylation (DNA methyl-transferase (DNMT) inhibitors) or histone modification (histone deacetylase (HDAC) inhibitors) already exist, but newer versions of these drugs need to be developed to improve future clinical management [7].

Which mechanisms underlie development of distant metastases remains a topic of debate. The two main but not necessarily mutually exclusive hypotheses are the linear and the parallel model of metastasis. According to the linear model, genetic modifications progressively accumulate in cancer cells of the primary tumor, whereby cells with advantageous mutations will survive and expand through clonal evolution [8]. If we translate this into epigenetic alterations such as promoter hypermethylation, one would expect that tumor suppressor genes in metastases show more methylation than primary carcinomas. An increase in methylation values during local tumor progression has already been shown [9, 10]. In the parallel progression model, cancer cells disseminate early during tumor progression at a stage when the primary lesion is small. Disseminated cells then evolve independently of the primary tumor to form metastases. According to this latter model, one would expect different methylation patterns in primaries and their matched metastases.

Hypermethylation of tumor suppressor genes like APC, RASSF1A, and FEZ1/LZTS1 in primary breast cancer has been reported to correlate with development of distant metastases [11, 12]. However, little is known about the comparative methylation status of primary tumors and matched distant metastases, possibly related to the fact that metastatic material is rare. Rivenbark et al. compared the methylation status of CST6 in primary breast cancers to their lymph node metastases and showed that methylation-dependent silencing occurred more frequently in the lymph node metastases, possibly reflecting progression-related epigenetic events according to the linear model for metastasis [13].

Here we report promoter hypermethylation profiling for 40 tumor suppressor genes by methylation-specific multiplex ligation-dependent probe amplification (MS-MLPA) in 53 primary breast carcinomas and their matched non-bone distant metastases (skin, brain, lung or liver). This study is part of a project in which we study genotype and phenotype of distant breast cancer metastases [14–16]. Extensive knowledge of the hypermethylation status of tumor suppressor genes possibly involved in site-specific metastasis could lead to novel biomarkers predicting site of distant metastases and adjuvant targeted therapy strategies that could prevent such metastases from becoming clinically manifest.

Materials and methods

Patients

This study was performed on 53 formalin-fixed paraffin embedded (FFPE) samples of female primary breast carcinomas and 53 single corresponding metachronous non-bone distant metastases. The samples were selected randomly from an existing database entailing material from 300 patients from the departments of pathology of the University Medical Center Utrecht, the Meander Medical Center Amersfoort, the Deventer Hospital, the Rijnstate Hospital Arnhem, Tergooi Hospitals, the Academic Medical Center Amsterdam, the Radboud University Nijmegen Medical Center, the Canisius Wilhelmina Hospital Nijmegen, the Netherlands Cancer Institute Amsterdam, the Medical Center Alkmaar, the Medical Center Zaandam, the University Medical Center Groningen, the St. Antonius Hospital Nieuwegein, the Diakonessenhuis Utrecht, the Free University Medical Center Amsterdam, the Erasmus Medical Center Rotterdam, the Gelre hospital Apeldoorn, Isala clinics Zwolle, the Laboratory for Pathology Enschede, the Laboratory for Pathology Dordrecht, and the Laboratory for Pathology Foundation Sazinon Hoogeveen, all in The Netherlands.

This study was performed in accordance with the institutional medical ethical guidelines. The use of anonymous or coded left over material for scientific purposes is part of the standard treatment agreement with patients, and therefore, informed consent was not required according to Dutch law [17].

Molecular subtypes of breast tumors were assigned as follows: luminal A (ER+/PR+, HER2−, low cellular proliferation), luminal B (ER+/PR+, HER2−, high cellular proliferation or ER+/PR+, HER2+), triple negative or basal type (ER−/PR−, HER2−), and HER2 enriched (ER−/PR−, HER2+) as before [4].

To set methylation cut-off values, non-paired normal breast tissue (n = 25) was used from breast reduction specimens (mean age 39.4 years; n = 15) and autopsy specimens (mean age 48.9 years; n = 10), with no significant difference in age compared to breast cancer patients (p = 0.338). In addition, we analyzed normal non-paired tissue from brain (n = 5), lung (n = 5), liver (n = 5), and skin (n = 5) derived from our normal tissue biobank to exclude that methylation values in distant metastases would be influenced by admixture of normal surrounding tissue, with again no significant difference in age (45.8 years) compared to patients with breast cancer (p = 0.111). The mean patient age at diagnosis was 52.8 years and 84 % of patients presented with invasive ductal carcinoma. Follow-up ranged between 16 and 315 months, and metastases were diagnosed on average 55.4 months after the primary diagnosis. The metastases included were located in brain (n = 11), lung (n = 12), liver (n = 10), and skin (n = 20). Clinicopathological characteristics are shown in Table 1.

Table 1 Clinicopathological characteristics of the metastatic breast cancer patients (n = 53) analyzed for methylation status of 40 tumor suppressor genes with MS-MLPA

DNA extraction

Four-micrometer sections were cut from each FFPE tissue block and stained with haematoxylin and eosin (HE). The HE-section was used to guide macro-dissection for DNA extraction and to estimate tumor percentage. Only samples containing 80 per cent tumor load or higher (both primary tumor and metastasis) were selected. For proteinase K-based DNA extraction, five 5-µm-thick slides were cut, and tumor areas were macro-dissected using a scalpel. Areas with necrosis, dense lymphocytic infiltrates, and pre-invasive lesions were intentionally avoided. The DNA concentration and absorbance at 260 and 280 nm were measured with a spectrophotometer (Nanodrop ND-1000, Thermo Scientific Wilmington, USA).

MS-MLPA

MS-MLPA was performed according to the manufacturer’s protocol using the SALSA MS-MLPA probemixes ME001-C2 Tumor suppressor-1 and ME003-A1 Tumor suppressor-3 (Online Resource Tables 1 and 2), each containing 15 internal control probes and in total 53 HhaI-sensitive probes against the following tumor suppressor genes: TP73, CASP8, VHL, RARB, MLH1 (2 loci), RASSF1A (2 loci), FHIT, APC, ESR1, CDKN2A/B, DAPK1, KLLN, CD44, GSTP1, ATM, CADM1, CDKN1B, CHFR, BRCA1/2, CDH13, HIC1, TIMP3 (2 loci), PRDM2, RUNX3, HLTF (2 loci), SCGB3A1 (2 loci), ID4 (2 loci), TWIST1, SFRP4 (2 loci), DLC1 (2 loci), SFRP5 (2 loci), BNIP3, H2AFX (2 loci), CCND2 (2 loci), CACNA1G, TGIF1, BCL2, and CACNA1A. Since MS-MLPA is based on the methylation-sensitive restriction enzyme HhaI, the choice of CpG site to be evaluated within the promoter region depends largely on the presence of the GCGC restriction site rather than on a reported correlation with expression in the literature.

At least 50 ng of DNA was used in each MS-MLPA reaction. DNA concentration control fragments, present in each MS-MLPA mix, were evaluated to check for sufficient DNA quantity. All reactions were performed according to the manufacturer’s instructions in a Veriti 96 Well Thermo Cycler (Applied Biosystems). A water sample, a 100 % methylated control (MCF-7 DNA treated with M.SssI methyl-transferase), and a negative control (human sperm DNA) were included in every MLPA run. Fragment separation was done by capillary electrophoresis on an ABI-3730 capillary sequencer (Applied Biosystems). Peak patterns derived by Genescan Analysis were evaluated using Genemapper (version 4.1) and Coffalyser.net software (version 9.4, MRC-Holland, Amsterdam, The Netherlands). The cumulative methylation index (CMI) was calculated as the sum of all quantitative methylation values per tumor. Raw methylation percentages of all genes are given in Online Resource Table 7.
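The CMI definition above can be illustrated with a minimal sketch. It assumes that the index sums the methylation percentages of every probe measured in one sample; whether loci of the same gene are summed separately or averaged first is not stated in the text, so that choice is an assumption. The function name and the example values are hypothetical.

```python
def cumulative_methylation_index(methylation_by_locus):
    """Sum the quantitative methylation percentages of all loci of one tumor.

    methylation_by_locus maps a locus label (e.g. "HLTF-2") to its
    methylation percentage. Summing per locus rather than per gene is an
    assumption made for this illustration.
    """
    return sum(methylation_by_locus.values())


# Hypothetical values for a single tumor, not data from this study.
example_tumor = {"APC": 10.0, "ATM": 2.5, "HLTF-2": 0.0}
cmi = cumulative_methylation_index(example_tumor)
```

Because the index is a plain sum, it grows with the number of probes in the panel and is only comparable between samples tested with the same probemixes.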

Correlation between mRNA expression and promoter methylation by TCGA

To correlate methylation of the investigated tumor suppressor genes to mRNA expression, we used The Cancer Genome Atlas (https://tcga-data.nci.nih.gov/tcga/).

TCGA Breast Invasive Carcinoma mRNA Expression z-Scores (RNA Seq V2 RSEM) data (n = 1038) were downloaded via The cBioPortal for Cancer Genomics [18, 19]. Illumina Infinium Human DNA Methylation 27 level 3 data (calculated beta values (M/(M + U)), gene symbols, chromosomes, and genomic coordinates) were downloaded via TCGA Data Portal (n = 313).

Statistical analyses were performed on data of all available CpG sites of the TCGA database compared to the CpG sites used for MS-MLPA.

Statistics

Unsupervised hierarchical clustering of log-transformed quantitative methylation values was performed using non-parametric Spearman correlation with R software (version 3.0.1), including all cases that were tested with both MLPA probemixes. Statistical analysis was executed on absolute methylation percentages as well as on dichotomized values; the latter were determined by ROC curve analyses of methylation values in normal breast tissue compared to primary breast tumor tissue. The Kolmogorov–Smirnov test and Shapiro–Wilk test were used to test for normality of the distributions. Primary tumors and their paired metastases were compared per gene using the Wilcoxon signed-rank test. Non-paired analyses on patient differences and clinicopathological characteristics were computed using the Mann–Whitney test. The dichotomized values were analyzed using McNemar’s test or the chi-square test. Two-sided p values <0.05 were considered to be statistically significant. Correction for multiple comparisons was performed by the Bonferroni–Holm approach. Analysis of prognosis was performed using Kaplan–Meier survival curves/log-rank test for univariate analyses and Cox proportional hazard analysis for multivariate models (entry and remove limits 0.05), calculating hazard ratios (HR) with 95 % confidence intervals (CI). TCGA mRNA z-scores were compared to percentages of DNA methylation by Pearson’s r correlation.
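As an illustration of the Bonferroni–Holm step-down correction named above, a minimal sketch is given here. It is a generic textbook version of the procedure, not the SPSS or R routine used in this study; the function name and the example p values are hypothetical.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return, for each p value, whether it stays significant after Holm correction.

    The p values are tested from smallest to largest against alpha / (m - rank),
    where m is the number of tests and rank starts at 0. Testing stops at the
    first p value that fails its threshold.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            rejected[idx] = True
        else:
            break  # every larger p value is also retained
    return rejected


# Hypothetical per-gene p values, not results from this study.
example = holm_bonferroni([0.001, 0.04, 0.03, 0.005])
```

In this example the two smallest p values pass their thresholds (0.0125 and about 0.0167), while 0.03 fails against 0.025, so the remaining tests are not considered significant.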

To evaluate whether site of distant metastasis is determined by specific methylation patterns of the primary tumor or rather by inherent molecular subtype, we performed logistic regression comparing the different metastatic sites one by one with quantitative methylation status of individual genes and molecular subtype as variables in the model.

To evaluate whether adjuvant systemic treatment may influence conversion from low methylation in the primary to high methylation in the distant metastasis (or vice versa), we grouped patients according to conversion per individual gene and performed logistic regression for each individual gene including adjuvant chemotherapy (yes or no) and adjuvant hormonal therapy (yes or no) as variables in the model.
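The grouping by conversion described above can be sketched as follows. The sketch assumes that a sample counts as hypermethylated when its methylation percentage reaches or exceeds the gene-specific cut-off; whether the cut-off itself is inclusive is not stated in the text. The function name and the example values are hypothetical.

```python
def classify_conversion(primary_pct, metastasis_pct, cutoff):
    """Classify one primary/metastasis pair for a single gene.

    A value at or above the cut-off is treated as hypermethylated
    (inclusive threshold is an assumption of this sketch).
    """
    primary_high = primary_pct >= cutoff
    metastasis_high = metastasis_pct >= cutoff
    if primary_high == metastasis_high:
        return "no conversion"
    return "low to high" if metastasis_high else "high to low"


# Hypothetical pair, using the 8.5 % cut-off reported for CACNA1G.
label = classify_conversion(primary_pct=12.0, metastasis_pct=3.0, cutoff=8.5)
```

The resulting labels per gene can then serve as the outcome variable in a regression model with treatment indicators as covariates.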

All statistical calculations were done with IBM SPSS Statistics 21.

Results

Normal versus tumor tissue

Appropriate cut-offs to dichotomize methylation values of tumor suppressor genes, derived from ROC curve analysis of MS-MLPA values in normal breast versus primary breast tumor tissue, varied between 0.5 and 22.75 % for the 40 genes (53 loci) (Online Resource Table 8).
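How a cut-off can be derived from a ROC curve is sketched below. The criterion used to pick the optimal point on the curve is not stated here, so the sketch assumes Youden's J statistic (sensitivity + specificity − 1), which is one common choice. The function name and the input values are hypothetical.

```python
def youden_cutoff(normal_values, tumor_values):
    """Pick the methylation cut-off that maximizes Youden's J.

    A sample is called hypermethylated when its value is >= the cut-off.
    Every observed value is tried as a candidate threshold; ties keep the
    lowest candidate.
    """
    candidates = sorted(set(normal_values) | set(tumor_values))
    best_cutoff, best_j = None, -1.0
    for c in candidates:
        sensitivity = sum(v >= c for v in tumor_values) / len(tumor_values)
        specificity = sum(v < c for v in normal_values) / len(normal_values)
        j = sensitivity + specificity - 1
        if j > best_j:
            best_cutoff, best_j = c, j
    return best_cutoff, best_j


# Hypothetical methylation percentages, not data from this study.
cutoff, j_stat = youden_cutoff([0, 1, 2, 3], [2, 8, 10, 12])
```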

Although we only included samples of breast cancer metastases that contained 80 percent tumor load or higher, we wanted to further exclude that differences between primaries and metastases were due to the admixture of tumor micro-environment at distant sites. 17/40 genes showed significantly higher methylation values in normal lung, brain, or liver than in normal breast (Online Resource Table 3; Fig. 1a shows CASP8 as an example). Also the CMI values of normal liver and brain tissue were significantly higher than the CMI of normal lung, skin, and breast tissue (Fig. 1b).

Fig. 1

Differences in quantitative methylation percentages of CASP8 (a) and the CMI (b) by MS-MLPA between various normal tissues. N = 30 (brain n = 5, liver n = 5, lung n = 5, skin n = 5, and breast n = 10). Small horizontal lines depict the median per group. The gray horizontal line depicts the cut-off for hypermethylation of CASP8 (4.5 %). Chromosome location CASP8: chr2 (202122754-202152434), CpG site MS-MLPA probe: 202122649, #bp from probe to TSS: 104 and from probe to ATG: 104

Unsupervised hierarchical clustering of the quantitative methylation values of primary breast tumors, paired distant metastases, and normal tissues is shown in Fig. 2. Normal liver and brain tissue seems to cluster together due to hypermethylation of some genes (APC, CDKN2B, CCND2 both loci, RASSF1A both loci and CASP8) as already mentioned above, and normal breast, lung, and skin tissue showed a related pattern.

Fig. 2

Unsupervised hierarchical clustering analysis of log-transformed quantitative methylation percentages of 40 tumor suppressor genes (53 loci) in 53 primary breast tumors, 53 paired distant metastases, and 30 normal tissues (breast n = 10, brain n = 5, lung n = 5, liver n = 5, and skin n = 5). The sidebars depict location of tissue and type (primary, metastasis, or normal tissue)

Primary tumor versus metastasis

Using quantitative methylation values, 52.5 % (21/40) of genes were significantly less methylated in the metastases compared to their paired primary tumors: PRDM2 (p = 0.036), RARB-2 (p = 0.003), HLTF-2 (p = 0.013), H2AFX-1 (p = 0.001), CACNA1G (p < 0.001), TGIF1 (p = 0.029), TIMP3-1 (p = 0.046), TP73 (p = 0.019), FHIT (p = 0.002), APC (p = 0.048), CDKN2A (p = 0.002), CDKN2B (p = 0.012), PTEN (p = 0.002), CD44 (p = 0.011), ATM (p < 0.001), CADM1 (p = 0.006), CHFR (p = 0.005), BRCA2 (p = 0.001), HIC1 (p = 0.001), and BRCA1 (p = 0.002). After correction for multiple comparisons, H2AFX-1, CACNA1G, ATM, BRCA2, and HIC1 remained significant. CMI was not significantly different between primaries and metastases (p = 0.454). Figure 3a shows quantitative methylation values of CACNA1G in primary tumors and their distant metastases as an example.

Fig. 3

Quantitative methylation percentages of CACNA1G by MS-MLPA in primary breast tumors and their corresponding distant metastases (a). Methylation percentages in the primary tumor, divided per molecular subtype (b) and corrected for dissemination localization (brain) (c) are shown thereunder. At the bottom, methylation percentages in the primary tumor, divided per dissemination location (d) and corrected for molecular subtype (luminal B) (e), are presented. Small horizontal lines depict the median per group. The gray horizontal line depicts the cut-off for hypermethylation (8.5 %). Chromosome location CACNA1G: chr17:48638429-48704832, CpG site MS-MLPA probe: 48638728, #bp from probe to TSS: −300 and from probe to ATG: 92

Using dichotomized values, 55 % (22/40) of the tested tumor suppressor genes, namely PRDM2 (p = 0.049), RARB-1 (p = 0.002), HLTF-2 (p = 0.031), TWIST1 (p = 0.012), H2AFX both loci (p = 0.002 and p = 0.049), CACNA1G (p = 0.013), TGIF1 (p = 0.002), TIMP3-3 (p = 0.013), TP73 (p = 0.007), FHIT (p = 0.001), CDKN2A (p = 0.029), DAPK1 (p = 0.004), PTEN (p = 0.008), CD44 (p < 0.001), GSTP1 (p = 0.013), ATM (p < 0.001), CADM1 (p < 0.001), CHFR (p = 0.031), BRCA2 (p = 0.013), HIC1 (p = 0.016), and BRCA1 (p < 0.001), were significantly less methylated in the metastases than in the primaries. After correction for multiple comparisons, FHIT, CD44, ATM, CADM1, and BRCA1 remained significant.

PRDM2, HLTF-2, H2AFX-1, CACNA1G, TGIF1, TP73, FHIT, CDKN2A, PTEN, CD44, ATM, CADM1, CHFR, BRCA2, HIC1, and BRCA1 were significant in both quantitative and dichotomized analyses. Of these, PRDM2, H2AFX-1, TGIF1, TP73, CDKN2A, and CD44 were more methylated in normal brain and/or liver tissues than in normal breast, which indicates that the generally lower methylation values in the distant metastases must be tumor cell specific and excludes the potential admixture of cells from the distant microenvironment being a confounder here.

When comparing primaries and metastases for all investigated tumor suppressor genes per individual patient, significantly less methylation was seen in the metastases compared to the primary tumor in 30.2 % (16/53; quantitative) or 41.5 % (22/53; dichotomized) of patients (20.8 and 28.3 % after correction for multiple comparisons, respectively). Only 15.1 % (8/53; quantitative) or 3.8 % (2/53; dichotomized) of patients showed significantly more methylation in the metastasis compared to the primary tumor (3.8 or 1.9 %, respectively, if corrected for multiple comparisons). These higher methylation values cannot be explained by admixture of normal adjacent tissue in the metastases, since none of these patients had a metastasis in brain or liver, where high methylation values are found in normal tissue.

In cluster analysis (Fig. 2), 32/53 pairs of primaries and metastases clustered directly and another 9/53 pairs almost directly (within three positions), indicating that methylation patterns of the tested tumor suppressor genes show high patient specificity.

Molecular subtype

HER2 enriched tumors were excluded from statistical analyses because of the small number. Triple negative tumors tended to cluster together, but the difference between luminal A and B was less distinct (Fig. 4).

Fig. 4

Unsupervised hierarchical clustering analysis of log-transformed quantitative methylation percentages of 40 tumor suppressor genes (53 loci) in 53 primary breast tumors. The sidebars depict dissemination location, subtype (luminal A, luminal B, triple negative, and HER2 enriched), and ER status (according to 10 % positivity)

PRDM2, RARB, CACNA1G (Fig. 3b), SFRP4-2, H2AFX, CACNA1A, TIMP3-1/2, and DLC1-1 showed significantly less methylation in luminal A primary tumors compared to luminal B and/or triple negative primary tumors. Less methylation of SCGB3A1 was seen in triple-negative tumors compared to the other subtypes. Further, more methylation of ID4-2 was seen in luminal B tumors compared to the other subtypes. When corrected for metastatic site, these effects disappeared (Fig. 3c), indicating that although subgroups were small, molecular subtype is not a significant determinant of dissemination site in this group (Online Resource Table 4). No differences were seen between the CMI of the different molecular subtypes (p = 0.199) (Online Resource Table 5).

Concerning receptor status, 35 % (14/40; quantitative) or 25 % (10/40; dichotomized) of the tumor suppressor genes showed significantly higher methylation values in ER-positive tumors compared to ER-negative tumors. After correction for multiple comparisons, five of the tumor suppressor genes remained significant for both data types: SCGB3A1 (both loci), ID4-1, SFRP5-2, H2AFX-1, and FHIT.

In PR-positive tumors, this phenomenon was less distinct: 17.5 or 25 % of genes (quantitative or dichotomized respectively) showed higher methylation values, but no significance remained after multiple comparisons correction. Further, in HER2-positive tumors more methylation was seen in 2.5 % (quantitative) or 7.5 % (dichotomized) of tumor suppressor genes, but again no significance remained when corrected for multiple comparisons.

Metastatic site

The following genes were significantly more methylated in primary tumors metastasizing to brain, lung, or skin, than to liver: PRDM2 (quantitative and dichotomized), RARB-1 (quantitative and dichotomized), HLTF-1 (quantitative), ID4-2 (quantitative), TWIST1 (quantitative and dichotomized), SFRP4-2 (quantitative and dichotomized), DLC1 (both loci; quantitative), H2AFX-2 (quantitative and dichotomized), CACNA1G (quantitative and dichotomized) (Fig. 3d), CACNA1A (quantitative), and TIMP3 (all three loci; quantitative, −b; dichotomized). Also in the heatmap (Fig. 4), a distinct cluster was formed by primary breast tumors that metastasized to liver.

When corrected for molecular subtype by logistic regression, the largest differences in methylation of individual genes were seen between liver and skin (skin being more methylated), and also the CMI was significantly different here (p = 0.039). Figure 3e shows significantly more methylation of CACNA1G in brain, lung, and skin compared to liver (quantitative data) as an example.

Association with clinicopathological characteristics

Online Resource Table 5 shows the association between methylation in the primary tumor and classical clinicopathological characteristics. A higher CMI (quantitative values) significantly correlated with higher MAI (p = 0.040), although there was no association with lymph node status, localization of metastases, or molecular subtype. More aggressive tumor characteristics like higher grade and MAI showed a tendency toward higher methylation values of individual genes.

Logistic regression for methylation conversion between the primary cancers and their metastases did not show significance for chemotherapy or hormonal therapy for any of the genes, indicating that adjuvant systemic treatment is not a confounder in methylation conversion. No significant association was found (for both analysis methods) between methylation of individual tumor suppressor genes and age at diagnosis.

Prognostic value

Of the primary tumor characteristics, lymph node positivity, ER or PR negativity (10 % cut-off for positivity), and HER2 positivity (DAKO score 3) were significantly correlated to worse survival (Table 2). When comparing survival curves of patients that showed methylation conversion from low to high or vice versa with those that did not, conversion of HLTF-2, ID4-2, SFRP4-1, and DAPK1 was correlated to worse overall survival (Fig. 5a). Conversion for these genes was entered in Cox proportional hazard analyses together, where SFRP4-1 (HR 2.3, 95 % CI 1.03–5.05) and HLTF-2 (HR 2.2, 95 % CI 1.09–4.56) remained significant (Table 3). When analyzing prognostic value of methylation status of the individual genes in the metastases for survival time from biopsy of metastases to end of follow-up, three out of the four aforementioned genes were again significant (ID4-2, SFRP4-1, and DAPK1) (Fig. 5b).

Table 2 Cox proportional hazards modeling of tumor suppressor gene methylation
Table 3 Multivariate model of conversion between primary and metastasis in time between resection of metastasis and end of follow-up
Fig. 5

Kaplan–Meier survival curves of time between resection of metastasis to end of follow-up of HLTF-2, ID4-2, SFRP4-1 and DAPK1 of conversion of methylation status in the primary tumors compared to paired metastases (a). The dashed line depicts conversion from negative in the primary tumor to positive in the metastasis and the gray line depicts conversion from positive in the primary tumor to negative in the metastasis. Survival curves of ID4-2, SFRP4-1, and DAPK1 of methylation status of metastases are shown in (b). Chromosome location HLTF-2: chr3:148747904–148804341, CpG site MS-MLPA probe: 148804223, #bp from probe to TSS: −105 and from probe to ATG: −105. Chromosome location ID4-2: chr6:19837601–19842431, CpG site MS-MLPA probe: 19837620, #bp from probe to TSS: −20 and from probe to ATG: 365. Chromosome location SFRP4-1: chr7:37945535–37956525, CpG site MS-MLPA probe: 37956166, #bp from probe to TSS: −10632 and from probe to ATG: −9086. Chromosome location DAPK1: chr9:90113885–90323549, CpG site MS-MLPA probe: 90113281, #bp from probe to TSS: 603 and from probe to ATG: 711

Correlation of methylation to mRNA expression by TCGA data extraction

Despite possible heterogeneity in methylation between individual CpG sites within the same promoter region, we nevertheless tried to correlate methylation to mRNA expression by comparing the most closely located CpG sites between TCGA data and our MS-MLPA loci (criteria for matching: <1000 bp between CpG sites, significant inverse correlation, Pearson’s r < −0.2; Online Resource Table 9). Note that these results thus need to be interpreted with caution.
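The matching criteria can be illustrated with a minimal sketch. It reads the correlation criterion as an inverse correlation with Pearson's r below −0.2 and leaves out the significance test, which would need an additional p value; both simplifications are assumptions of this illustration. The function names, the TCGA coordinates, and the example values are hypothetical; only the CACNA1G probe position (48638728) is taken from this paper.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def sites_match(probe_pos, tcga_pos, r, max_distance=1000, r_threshold=-0.2):
    """Apply the distance and inverse-correlation criteria (significance omitted)."""
    return abs(probe_pos - tcga_pos) < max_distance and r < r_threshold


# Hypothetical methylation percentages and mRNA z-scores for four tumors.
methylation = [10, 20, 30, 40]
expression = [2.0, 1.0, 0.5, -1.0]
r = pearson_r(methylation, expression)

# MS-MLPA probe position of CACNA1G versus a hypothetical TCGA CpG coordinate.
matched = sites_match(48638728, 48638000, r)
```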

The evaluated CpG sites/regions of ATM, BCL2, BRCA1, BRCA2, CACNA1G, CADM1, CASP8, CCND2, CD44, CDKN2B, CHFR, DAPK1, ESR1, GSTP1, HLTF, ID4, MLH1, PRDM2, PTEN, RARB, RASSF1, RUNX3, TIMP3, TP73, and TWIST1 (25/40 genes) showed a significant inverse correlation with mRNA expression when quantitative data were used (Online Resource Table 6). Of these genes, fourteen showed higher methylation values in primaries compared to metastases in our cohort. For BNIP3, CACNA1A, CDH13, CDKN1B, FHIT, HIC1, SCGB3A1, SFRP4, SFRP5, and TGIF1 (10/40 genes), no correlation was found between CpG site methylation and mRNA expression.

Discussion

DNA methylation has a similar potential as genetic alterations in serving as a selectable driver during clonal expansion or metastatic dissemination and could therefore yield valuable markers for cancer detection and prognosis as well as targets for new therapeutic strategies [20]. Our study design allowed comparison of primary breast tumors to their paired distant metastases at different locations, enabling intra- and inter-individual comparison.

Our results show a general tendency for lower methylation at primary tumor-methylated regions in the matched metastases of 21/40 tumor suppressor genes. It is unlikely that admixture of cells from the tumor micro-environment at distant sites has caused these lower methylation values. First, we only included metastatic samples that contained at least 80 % tumor. Second, methylation values in normal breast were lower than in normal tissues from skin, lung, brain, and liver, so admixture of such normal cells (especially from liver and brain) would have raised methylation values. Third, all normal tissues clustered together in unsupervised analysis, which also showed that primary tumors and their paired metastases cluster together. Therefore, most of these hypermethylation events are likely patient specific and subject to specific selection across metastatic dissemination and expansion, emphasizing the need for personalized cancer treatment.

Higher CMI correlated with higher MAI as did methylation values of individual genes, indicating that proliferation rate correlates with methylation, which is biologically plausible. Adjuvant chemotherapy or hormonal therapy did not seem to influence methylation conversion.

To our knowledge, our study is the first that compared promoter methylation in a large group of multiple localizations of distant human breast cancer metastases to their matched primary breast carcinomas and we tried to apply the “reporting recommendations for tumor markers” (REMARK criteria) as adequately as possible [21]. Several studies have been performed addressing methylation differences between primary tumors and metastases. However, their methods failed to draw conclusions on intra-patient differences and site-specific markers. Limitations included: description of a single metastatic site or tumor suppressor gene, non-matched pairs of primaries and metastases, methylation only in the primary tumor (compared to the metastasizing tendency), or the use of mouse models instead of patient material [11, 12, 22–26]. Rivenbark et al. demonstrated “epigenetic progression” by showing more methylation in lymph node metastases compared to the primary breast tumor [13], but Wu et al. showed no differences in methylation of seven tumor suppressor genes in primary breast carcinomas compared to their matched distant metastases [27]. The discrepant findings with our generally lower methylation values in distant metastases (largely in line with results in head and neck squamous cell carcinomas [28]) are likely related to differences in distant metastasis localizations, differences in study populations and sample sizes, pairing of normal tissue, the inclusion of paired metastases, and variation in tumor suppressor genes and CpG regions studied. Further, methodologies for demonstration of methylation status (QM-MSP, methylation-specific PCR analysis, bisulfite sequencing, differential methylation hybridization, etc.) differ between studies. In our institute, we have extensive experience using MS-MLPA [10, 29–31], a restriction enzyme-based assay that allows a multi-target approach on small amounts of DNA extracted from formalin-fixed paraffin embedded material.
This technique shows a very good correlation with other techniques such as bisulfite pyrosequencing and (QM) MSP [32–37]. Besides, a tumor or metastasis-initiating clone or sub-clone in each individual has a unique DNA methylation signature that is closely maintained across metastatic dissemination [20]. However, for each tumor, we chose one of many available tissue blocks (that contained the largest amount of tumor load), which could have led to sampling bias. A previous study from our group clearly demonstrated that, although most variation in methylation status is present between individual breast cancers, clonal epigenetic heterogeneity is seen within most primary breast carcinomas, indicating that methylation results from a single random sample may not be representative of the whole tumor [30]. In addition, for 12 genes, two different CpG loci were analyzed separately, and exact results showed differences in methylation frequencies, indicating the presence of heterogeneous methylation. However, unsupervised hierarchical clustering showed an almost perfect correlation between six and eight of the 12 genes of which different CpG sites were analyzed. These limitations could perhaps explain some, but clearly not all, of the differences in methylation values between primary and metastasis.

To correct for the differences between locations of dissemination, differences between molecular subtypes should be taken into account, since they are known to preferentially metastasize to specific distant sites [4, 38]. For instance, a general hypomethylation of basal-like tumors compared to differential methylation across non-basal-like subtypes is often reported [38, 39]. We indeed saw some clustering of triple negative tumors and one cluster almost entirely composed of ER-positive cancers, but no evident hypomethylation was seen compared to other subtypes. Distinct methylation patterns relative to breast cancer subtype and normal breast tissue as shown by Bardowell et al. [38] were also not seen. Further, some of the chosen genes were significantly more methylated in tumors that metastasized to specific localizations (even when corrected for molecular subtype), which could lead to novel biomarkers predicting site of distant metastases and adjuvant targeted therapy strategies that could prevent such metastases from becoming clinically manifest.

In a therapeutic setting, the correlation between methylation and mRNA/protein expression may become relevant, which is why we explored TCGA data. Generally, methylation at the CpG sites investigated by MS-MLPA seemed inversely correlated to mRNA expression levels, as demonstrated before [40] (despite possible heterogeneity in methylation between the individual CpG sites used for MS-MLPA and those used in TCGA), indicating their relevance in gene silencing. Future studies should take into account actual protein expression of tumor suppressor genes in metastases in relation to methylation status.

Theoretically, less methylation in metastases would be prognostically beneficial for the patient because of reactivation of these tumor suppressor genes. However, survival analysis showed that conversion of HLTF-2, ID4-2, SFRP4-1, and DAPK1 from positive in the primary tumor to negative in the metastasis was correlated to worse overall survival. Interestingly, methylation status of three of these four genes (ID4-2, SFRP4-1, and DAPK1) predicted worse survival when hypermethylated in metastases. The most important independent predictors for shorter survival time over lymph node positivity and ER status were SFRP4 and HLTF, which are known predictors of worse survival. Hypermethylation of HLTF seems to predict poor outcome in colorectal [41, 42] and lung cancer [43]. SFRP4 has been shown to be an independent predictor of shorter survival in myelodysplastic syndrome [44] and invasive bladder cancer [45]. However, these studies emphasize hypermethylation status in primary tumors, and no studies were found on hypermethylation of these markers in paired metastases in relation to survival. Promoter hypermethylation of tumor suppressor genes is known to be an early event during carcinogenesis [9, 10]. There are several possible explanations for the trend that less promoter methylation of the investigated genes is seen in the metastases. First, the spread of tumor cells may take place even prior to methylation. It has been demonstrated before that in breast, prostate, and esophageal cancer, bone marrow disseminated tumor cells (DTCs: any tumor cell that has left the primary lesion and traveled to an ectopic environment, not necessarily forming a metastasis) display significantly fewer genetic aberrations than primary tumor cells [46–49]. Dissemination of tumor cells that are still evolving may lead to allopatric selection and expansion of variant cells adapted to specific microenvironments [50].
Second, it could be that methylation is a dynamic process and may even vary in different stages of the cell cycle. Graff et al. have shown that E-cadherin (a gene involved in homotypic cell–cell adhesion) in cell lines is hypermethylated when put in a culture model system for basement membrane invasion and hypomethylated in a tumor growth model [51]. The reversibility of methylation of tumor suppressor genes could therefore be beneficial to tumor spread, whether it is a random process or a response to specific signals.

In summary, we have shown that hypermethylation of tumor suppressor genes detected by MS-MLPA is generally lower in distant metastases than in the primary tumors. We already knew that hypermethylation, in contrast to DNA mutations, is reversible, but whether this is a random or a controlled process has not been fully elucidated. The question arises whether the difference in methylation pattern between these primaries and metastases could be explained by the loss/rearrangement of hypermethylation. Since we have shown that 21 of the 40 tested tumor suppressor genes show less methylation in metastases than in their matched primary carcinomas, methylation is probably not an epigenetic factor that could be used for therapy against metastatic tumor spread. However, since tumors metastasizing to different localizations show different methylation patterns, screening for a specific pattern that predicts the most likely site of metastases could be a useful clinical tool. Further, methylation status of several genes seems to predict survival after metastases. Therefore, more tumor suppressor genes should be screened in larger databases, and heterogeneity should be ruled out to include all tumor subclones.