Abstract
Less than 15–20% of patients who meet the criteria for hereditary breast and ovarian cancer (HBOC) carry pathogenic coding genetic mutations, implying that other molecular mechanisms may contribute to the increased risk of this condition. DNA methylation in peripheral blood has been suggested as a potential epigenetic marker for the risk of breast cancer (BC). We aimed to discover methylation marks in peripheral blood associated with BC in 231 pre-treatment BC patients meeting HBOC criteria, testing negative for coding pathogenic variants, and 156 healthy controls, through methylation analysis by targeted bisulfite sequencing on 18 tumor suppressor gene promoters (330 CpG sites). We found i) hypermethylation in EPCAM (17 CpG sites; p = 0.017) and RAD51C (27 CpG sites; p = 0.048); ii) hypermethylation in 36 CpG-specific sites (FDR q < 0.05) in the BC patients; iii) four specific CpG sites were associated with a higher risk of BC (FDR q < 0.01, Bonferroni p < 0.001): cg89786999-FANCI (OR = 1.65; 95% CI:1.2–2.2), cg23652916-PALB2 (OR = 2.83; 95% CI:1.7–4.7), cg47630224-MSH2 (OR = 4.17; 95% CI:2.1–8.5), and cg47596828-EPCAM (OR = 1.84; 95% CI:1.5–2.3). Validation of cg47630224-MSH2 methylation in one Australian cohort showed an association with 3-fold increased BC risk (AUC: 0.929; 95% CI: 0.904–0.955). Our findings suggest that four DNA methylation CpG sites may be associated with a higher risk of BC, potentially serving as biomarkers in patients without detectable coding mutations.
Introduction
Breast cancer (BC, OMIM#114480) is a highly heterogeneous and multifactorial disease that is the primary cause of cancer-related mortality among women worldwide1. Hereditary breast and ovarian cancer syndrome (HBOC) is the most prevalent hereditary form of BC, accounting for up to 10-15% of all cases2, frequently associated with pathogenic variants in BRCA1/2 genes3,4. Pathogenic variants in other genes, including BRIP1, CHEK2, ATM, and PALB2 have been identified in less than 5% of cases5, demonstrating a significant locus heterogeneity.
The contribution of epigenetic mechanisms in BC development, and their ability to modulate gene expression independently of coding region mutations have been extensively documented6. Among epigenetic marks, DNA methylation has been widely studied in cancer tissues, observed in more than 70% of tumor suppressor gene (TSG) promoters in CpG contexts (5-methylcytosine) within regions known as CpG islands7. Aberrant methylation patterns in promoter regions, known as epimutations8, lead to the transcriptional silencing of TSGs or the activation of oncogenes, resulting in loss or gain of gene function, respectively. Epimutations causing diseases are referred to as constitutional epimutations. Epimutations can be categorized based on their origin as somatic or germline, and mechanistically as primary or secondary. Primary epimutations arise from stochastic or unknown causes, while secondary epimutations result from cis or trans-acting mutations. Both primary and secondary epimutations can occur in multiple embryonic and adult tissues, leading to epigenetic mosaicism9.
At the molecular level, altered DNA methylation marks in the BC tissue occur in early stages of carcinogenic development10. In addition, DNA methylation alterations at the genomic level have been observed in peripheral blood of individuals meeting HBOC criteria, revealing specific CpG sites associated with BC risk11,12. In the last decade, genome-wide DNA methylation in peripheral blood has been investigated in cases with familial BC in TSGs like BRCA1, BRCA2, CHEK2, ATM, TP53, CDH1 and MLH113. Increased global methylation has been associated with reduced risk of BC, while elevated DNA methylation within functional promoters has been associated with increased risk14,15. Moreover, aberrant germline hypermethylation of the KIF116, NDRG117, ATM18, and PALB219 promoters has been proposed as a biomarker for BC and HBOC risk in some populations. However, current studies have focused on a limited number of genes and HBOC patients, and have not analyzed methylation in patients without genetic pathogenic variants evaluated in large gene panels.
Previously, we reported that in a cohort of 300 BC patients with HBOC criteria, 15% had coding pathogenic variants, 11% had variants of uncertain significance (VUS), and 74% were negative through the analysis of 143 cancer susceptibility genes20. Therefore, we hypothesized that alternative mechanisms might increase the risk for BC in patients that fulfill HBOC criteria in whom no genetic coding alterations were identified. Hence, in this study, we aim to identify CpG sites exhibiting aberrant methylation patterns in peripheral blood and explore their association with BC risk. We employed a state-of-the-art analysis of the methylation status in the promoter regions of 18 well-known TSGs in 231 BC patients with HBOC criteria and negative for pathogenic variants or VUS, and compared them with 156 healthy women population controls.
Results
Clinical and epidemiological characteristics of the study population
We selected 231 Mexican high-risk patients with BC, negative for coding pathogenic variants in 143 cancer susceptibility genes from a previous study conducted by our group20, and 156 healthy population controls (Fig. 1). There were significant differences in the average age (41.8 ± 7.4 vs 39.1 ± 4.8; p = 0.002, Wilcoxon rank-sum test), menarche (12.7 ± 1.5 and 12.4 ± 1.6; p = 0.008, Wilcoxon rank-sum test), menopause (30.7% vs 4.5%; p < 0.0001, chi-square test) and family history of cancer (83.5% vs 21.8%; p < 0.0001, chi-square test) (Table 1).
The most common clinicopathological characteristics of the patients were invasive ductal carcinoma (69.3%), clinical stage II (44.6%), estrogen receptor positivity (45.5%), progesterone receptor positivity (46.3%), HER2 receptor negativity (52.8%), and Luminal A molecular subtype (39%) (Table 1).
High-throughput DNA methylation sequencing reveals aberrant methylation in tumor suppressor genes in high-risk breast cancer patients
We evaluated the methylation levels of the promoter region of 18 TSGs in the 231 high-risk BC patients, and 156 healthy controls (Supplementary Tables 1, 2, Fig. 2a) by bisulfite sequencing PCR-NGS. In addition, MLH1 and KLLN were used as internal controls for bisulfite conversion and full methylation, respectively.
We mapped 330 CpG sites in the 18 genes (128,370 total CpG sites in all samples). Systematic inspection with the Revelio software showed that none of the evaluated CpG sites had SNPs (Supplementary Figs. 2–6; see Materials and methods). In addition, no SNPs in the target region were detected using the dbSNP track in the UCSC Genome Browser. Then, we evaluated the average methylation values across all CpG sites (Supplementary Table 3), and found higher methylation in EPCAM, encompassing 17 CpG sites (Z-score of 0.326 vs −3.74E−16, p = 0.017, Wilcoxon rank-sum test), and RAD51C (27 CpG sites; Z-score of 0.132 vs −1.17E−16, p = 0.048).
A subsequent analysis of specific CpG sites identified 36 hypermethylated marks (FDR q < 0.05) in 9 genes (Supplementary Table 4). Statistical analysis was done on the genes with more than two hypermethylated CpG sites, which included ATM (4 CpG sites; Z-score 0.07 vs 1.34E−17, p = 0.0074), RAD51C (4 CpG sites; Z-score 0.03 vs 2.38E−18, p = 0.018), and EPCAM (9 CpG sites; Z-score 0.42 vs 1.37E−17, p = 0.00088) (Fig. 2b). In addition, locus-wide hypermethylation in RAD51C (mean methylation: 23.5%; 27 CpGs), BRCA1 (mean methylation: 9.6%; 9 CpGs), and POLH (mean methylation: 9.1%; 33 CpGs) was detected in three patients.
Breast cancer patients showed enriched hypermethylation in four specific CpG sites
To identify general methylation patterns in the global normalized methylation levels of all 18 TSGs, we conducted a PCA. Using the differentiated Z-scores, the methylation profile of the gene promoters for patients and controls was classified into two distinct clusters (Fig. 3a). At the individual CpG site level, we identified hypermethylation in patients compared to healthy controls in four sites: i) cg47630224-MSH2 (Z-score 2.95 vs −6.41E−12, p = 1.99E−10; q-value = 2.23E−11), ii) cg23652916-PALB2 (Z-score 2.27 vs 1.92E−11, p = 1.19E−07; q-value = 3.34E−09), iii) cg89786999-FANCI (Z-score 1.3 vs 7.69E−11, p = 001; q-value = 1.10E−05), and iv) cg47596828-EPCAM (Z-score 1.1 vs −2.56E−11, p = 7.74E−09; q-value = 4.33E−10), (Fig. 3b–d, Fig. 4, Supplementary Table 5). Notably, all CpG sites in the EPCAM gene exhibited higher methylation levels compared to controls (Fig. 3b).
The distribution of the methylation levels at the four significant hypermethylated sites was assessed by a supervised hierarchical clustering analysis. Two clades resulted, one with clear hypermethylation and one with lower levels of methylation (Fig. 5a). Within the clade of high methylation, there was a subgroup of 22% of patients (51/231; cut-off two-sided +/−1 STD of Z-score [4.6, PALB2; 5.6 MSH2]) with co-methylation of the cg23652916-PALB2 and cg47630224-MSH2 sites (Fig. 5b). When exploring the molecular characteristics of this subset of patients, we found that 22% (13/58) presented a triple-negative subtype. The clinical stages of these patients were stage I in 17% (10/58), stage II in 29.5% (17/58), and stage III in 24% (14/58); the stage was not reported for 29.5% (17/58) (Fig. 5a).
Four specific hypermethylated CpG sites as potential biomarkers of breast cancer risk
Increased global methylation has been associated with a reduced risk for BC, while abnormal DNA hypermethylation within functional promoters has been associated with increased risk14,15. Therefore, we aimed to confirm the association between the four aberrant DNA methylation markers and BC risk through univariate and multivariate logistic regression analyses of binomial odds ratios (ORs), utilizing both raw methylation data and Z-scores (Supplementary Tables 6–11). The multivariate analysis was adjusted for age, age at menarche, BMI, family history of cancer, and depth of sequencing. DNA hypermethylation at the sites cg47596828-EPCAM (OR = 1.84 [1.46–2.32], 95% CI, p < 0.001), cg47630224-MSH2 (OR = 4.17 [2.05–8.48], 95% CI, p < 0.001), cg23652916-PALB2 (OR = 2.83 [1.71–4.66], 95% CI, p < 0.001), and cg89786999-FANCI (OR = 1.65 [1.24–2.20], 95% CI, p = 0.001) was associated with an increased risk of BC, using the raw methylation data (Fig. 6a).
Quartile analysis revealed that progressively higher methylation quartiles were linked to a greater risk of BC for each CpG site (Supplementary Tables 8–11). To assess the potential of these CpG sites as biomarkers for BC, we performed a receiver operating characteristic (ROC) analysis. The sites cg47596828-EPCAM (AUC: 0.700, Cut-off: 0.360) and cg47630224-MSH2 (AUC: 0.716, Cut-off: 0.329) showed the highest performance (Fig. 6b). Additionally, a combinatorial ROC analysis using all four aberrant methylation marks (11 combinations) showed three optimal combinations referred to as gold markers: combinations 7, 10, and 11 (Fig. 6c, d, Supplementary Fig. 7).
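For readers less familiar with ROC metrics, the following minimal Python sketch shows how an AUC and the sensitivity and specificity at a given cut-off can be computed for a single marker. It is an illustration only, not the pROC/CombiROC workflow used in this study; the function names, the methylation scores, and the cut-off of 0.35 are hypothetical.

```python
def auc(case_scores, control_scores):
    """AUC as the probability that a random case scores above a random control.

    Ties count as one half (Mann-Whitney formulation of the ROC area).
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def sensitivity_specificity(case_scores, control_scores, cutoff):
    """Classify scores >= cutoff as positive and return (sensitivity, specificity)."""
    true_pos = sum(score >= cutoff for score in case_scores)
    true_neg = sum(score < cutoff for score in control_scores)
    return true_pos / len(case_scores), true_neg / len(control_scores)


# Hypothetical methylation scores for one CpG site (not study data).
cases = [0.9, 0.4, 0.7, 0.2]
controls = [0.1, 0.3, 0.5, 0.2]

marker_auc = auc(cases, controls)
sens, spec = sensitivity_specificity(cases, controls, cutoff=0.35)
print(f"AUC={marker_auc:.3f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

An AUC of 0.5 corresponds to no discrimination between patients and controls, and 1.0 to perfect separation.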
Validation of the potential CpG biomarkers in three independent cohorts
To validate the results of our analysis for significant CpG sites, we conducted a comprehensive search for studies on methylation analysis that targeted the same sites. Our search revealed that the Infinium HumanMethylation450 (HM450K) BeadChip-Illumina was the only platform with structured methylation data available for BC, covering over 480,000 CpG sites. However, the design of this technology only includes the cg47630224-MSH2 (Illumina probe cg22269526), which was used to validate our findings. We initially screened 1786 articles related to BC using the NCBI-GEO (https://www.ncbi.nlm.nih.gov/gds) database and filtered them to 74 articles that included methylation profiles (case-control). We excluded studies that involved cell lines or chemotherapy treatment, resulting in three relevant cohorts: the Australian cohort GSE104942 (87 BC patients with HBOC criteria), the EPIC-Italy (HuGeF) cohort GSE51032 (222 sporadic patients), and the Uruguayan cohort GSE148663 (22 sporadic patients).
Higher methylation levels in the cg47630224-MSH2 site were observed in the Australian hereditary BC cohort (mean Z-score patients = 1.971 vs −8.65E−8, p < 0.0001) and the Italian sporadic cases cohort (mean Z-score patients = 0.238 vs −0.016, p = 0.0057). Conversely, no significant differences were found in the Uruguayan sporadic cases cohort (Z-score patients = 0.529 vs −0.6518, p = 0.573) (Fig. 6e).
Furthermore, logistic regression analysis showed an association with BC risk in the Australian cohort (OR: 3.41, 95% CI: 2.1–6.9, p < 0.001) compared to the cohorts from Italy (OR: 1.23, 95% CI: 1–1.4, p < 0.01) and Uruguay (OR: 0.81, 95% CI: 0.4–1.6, p > 0.05) (Fig. 6f). In the ROC analysis, we observed an AUC of 0.892 (Sensitivity: 0.900, Specificity: 0.885) in the Australian cohort (Fig. 6g).
Discussion
Prior studies have shown that around 70% of patients who meet NCCN criteria for HBOC lack pathogenic variants in cancer susceptibility genes20,21. This led us to hypothesize the existence of alternative molecular mechanisms in these patients. In this study, we aimed to identify epigenetic alterations in HBOC-eligible BC patients without pathogenic coding variants by DNA methylation analyses on 18 TSGs in peripheral blood from 231 Mexican patients and 156 healthy population controls.
Here, we detected hypermethylation in the promoter regions of ATM, RAD51C, and EPCAM genes, with implications for hereditary BC. In familial BC, ATM hypermethylation in peripheral blood DNA was associated with a 3-fold increased risk of bilateral BC (p = 0.0017)13, suggesting its potential as a novel marker for BC risk in individuals fulfilling HBOC criteria. Similar findings were observed for distinct ATM CpGs (OR: 1.89, p = 2 × 10−4)18. Constitutive RAD51C hypermethylation (>6%) was reported in HBOC patients22, but conflicting evidence exists23; therefore, further work is needed to determine its precise influence on BC risk. EPCAM mutations are known to cause methylation and transcriptional repression in the neighboring MSH2 gene in hereditary colorectal cancer24, yet there are no studies in BC patients with criteria for hereditary disease.
We discovered 36 specific hypermethylated marks, four of which were associated with an increased risk of BC: cg47630224-MSH2, cg23652916-PALB2, cg89786999-FANCI, and cg47596828-EPCAM. These CpG sites had not previously been linked to BC risk, which underscores the benefit of using open platforms such as NGS for evaluating genome-wide DNA methylation status in familial BC.
The link between blood DNA methylation in specific marks and BC patients with HBOC criteria has been examined. A previous work in Australian patients identified four heritable methylation CpG sites in GREB1, PNKD, C7orf50 and TMC3, associated with BC risk in 210 individuals from 25 families25. A nested case-cohort analysis within the prospective Sister study revealed 250 individual CpG sites with differential methylation between BC cases and controls11. Only one CpG site in ERCC1 was associated with an increased risk of BC in an independent cohort26. The potential implication of specific CpG site methylation in patients with hereditary cancer criteria was further supported by a comparative study between sporadic and hereditary BC patients lacking pathogenic variants from the same population. In this work, a CpG site in BRCA1 showed significant hypermethylation in the hereditary cases27. Overall, these findings suggest that methylation at specific CpG sites could play a role in the development of the disease in affected women whose families meet criteria for HBOC.
Sporadic BC and peripheral blood DNA methylation have also been studied. An integrative genetic analysis of 122,977 sporadic BC patients and 105,974 controls revealed 38 CpGs potentially increasing BC risk by regulating 21 genes28. A Chinese study identified four CpG sites in imprinting genes KCNQ1, KCNQ1OT1, and PHLDA2 associated with increased BC risk29. Conversely, a meta-analysis of four prospective cohorts found no evidence supporting individual CpG site methylation in blood as a sporadic BC risk factor30. Therefore, the link between methylation and sporadic BC remains unclear.
Among the four genes with altered hypermethylation, MSH2 and PALB2 showed the strongest association with BC. Interestingly, we observed MSH2-PALB2 co-methylation in 51 of 57 patients with high MSH2 methylation levels, suggesting a potential mutual association via an unknown biological mechanism (Fig. 5). Compelling evidence indicates that MSH2 promoter hypermethylation is induced by cis EPCAM gene rearrangements in Lynch syndrome patients31. Additionally, PALB2 hypermethylation has been reported in 8% of sporadic breast and ovarian cancer patients19. A study on familial BC found significant tumor methylation in four CpG sites within the MSH2 promoter region, including cg47630224-MSH2 (Illumina probe cg06478094), which was associated with increased BC risk in our study32. Although MSH2 is not currently associated with increased BC risk, PALB2 is a well-established high-risk BC susceptibility gene. We hypothesize that the concurrent methylation at the specific sites of PALB2 and MSH2 in the germline of 22% of patients might have a synergistic effect, possibly mediated by shared distant regulators, as other reports have proposed that co-methylation might be an indicator of functional associations between gene pairs in somatic BC33,34,35.
Cis mutations in TSGs are linked to locus-wide hypermethylation in colorectal cancer and other oncologic syndromes9,24. Interestingly, we observed this abnormal pattern in the promoter regions of RAD51C (23.5%), BRCA1 (9.6%), and POLH (9.1%) in patients GT25, GT214, and GT202, respectively, suggesting that neighboring mutations may induce hypermethylation. A single study investigated this mechanism in HBOC-criteria patients, finding a hypermethylated BRCA1 allele (>35%) cosegregating with the variant c.-107A>T in the BRCA1 5′UTR36. However, validation of this variant failed in a large German cohort and in BC and ovarian tumors with BRCA1 promoter hypermethylation, suggesting a low allelic frequency in German and Dutch patients37,38. In a follow-up study, we aim to explore the potential incidence and functional implications of these putative cis mutations.
To further support our findings, we assessed the aberrant methylation status of the cg47630224-MSH2 site across three independent cohorts that used the Illumina Infinium 450 K methylation array platform. We confirmed the association of this CpG site with BC risk in the cohort of patients with HBOC criteria (Australia; OR: 3.41, p < 0.001), suggesting the potential biomarker capacity of our approach. However, more studies are needed to fully confirm the role of cg47630224-MSH2 as a biomarker.
This study presents a thorough analysis, at single-base resolution, of high-risk gene promoters in a large case-control dataset, utilizing healthy population controls to minimize methylation level variability from non-genetic and environmental factors39. The limited availability of validation cohorts for the specific CpG sites we examined is a limitation of this study. Only one comparable report involving Australian patients was found for validation. Hence, we cannot fully exclude that these methylation marks are also associated with sporadic BC. Moreover, these findings should not yet be interpreted as indicative of an autosomal dominant trait, as the methylation patterns have not been assessed in family members. Additional limitations include the lack of assessment of variables known to influence methylation changes, such as dietary vitamin B12, folate, and choline concentration40, tobacco use, and exposure to endocrine-disrupting substances41. Despite these limitations, our findings support previous studies and offer compelling evidence of site-specific methylation as a potential marker in BC patients meeting HBOC criteria, particularly in a population that is underrepresented in the epigenetic literature.
We summarize our findings in three hypotheses (Fig. 7). Increased low-level methylation in ATM, RAD51C, and EPCAM promoter regions might arise from stochastic primary epimutations due to environmental factors during embryonic development and adulthood (Fig. 7a)42. High-level, locus-wide methylation in RAD51C, BRCA1, and POLH detected in three patients might be the product of non-coding genetic alterations in the neighboring regions, causing secondary cis epimutations (Fig. 7b)9,24,36. The aberrant methylation in specific CpG sites in MSH2, PALB2, FANCI, and EPCAM could result from changes in the DNA of peripheral blood leukocyte subpopulations, influenced by the tumor microenvironment and the immune response at the systemic level (Fig. 7c)43.
In conclusion, this study provides evidence of an association between germline methylation of cancer susceptibility genes and BC in patients who meet criteria for HBOC and have no detectable coding pathogenic variants. We identified four novel potential epigenetic markers associated with a higher risk of BC, one of which (cg47630224-MSH2) was validated in an independent cohort. Overall, this work contributes to improving our understanding of the epigenetic landscape of high-risk BC patients, which may represent an alternative etiopathological mechanism. Further investigation into epigenetic signatures holds the potential to enhance risk assessment and facilitate personalized approaches in BC management.
Methods
Study population
From the Latin American Study of Hereditary Breast and Ovarian Cancer (LACAM) cohort, we selected 231 Mexican patients with BC who fulfill the National Comprehensive Cancer Network (NCCN) criteria for HBOC and were previously screened for variants in 143 genes20. The inclusion criteria were: i) negative for pathogenic variants, ii) negative for VUS in clinically relevant genes, iii) with availability of more than 800 ng genomic DNA (gDNA), iv) sample obtained prior to chemotherapy treatment. We refer to these patients as high-risk BC patients. In addition, 156 healthy controls without a family history of BC were selected, including only those with family histories of other types of cancer. All participants signed an informed written consent for the use of their biological samples for research purposes. The research was conducted according to the Declaration of Helsinki and approved by the Ethics Committees of four health institutes in Mexico (Protocols: ECG-CEICANCL290515-05GENCMAHER, IECC-2015-01, ISEM-02092015, INSP-CI:1065, and INSP-341) (Fig. 1a).
gDNA extraction from peripheral blood
gDNA was extracted from peripheral blood using the DNeasy Blood & Tissues kit (Qiagen, Hilden, Germany). The integrity of the gDNA was evaluated by electrophoresis in a 0.8% agarose gel and the purity in an EPOCH BIOTEK spectrophotometer. Quantification was performed using a Quantus Fluorometer with the dsDNA Quantifluor kit (Promega, Madison, USA) (Fig. 1b).
Positive methylation controls
Two positive methylation controls were included: i) METC1-POS: a commercial Human HCT116 DKO Methylated DNA (Zymo Research, Irvine, CA, USA); and ii) METC2-POS gDNA, both treated in vitro with the DNA methyltransferase M.SssI. The methylation percentages in the positive controls were 98–100%, indicating complete methylation. The methylation status in positive controls was validated by a methylation-sensitive restriction enzyme assay using HpaII (New England Biolabs, UK). Methylated sites in CpG context block HpaII activity. The detailed procedures are described in Supplementary Fig. 1.
Methylation assay by bisulfite conversion
800 ng of gDNA from patients and controls were treated with Sodium Bisulfite using the EZ DNA Methylation-Gold kit (Zymo, California USA). The bisulfite-converted DNA was quantified with the ssDNA Quantifluor kit (Promega, Madison USA) and nanophotometer spectrophotometry (Implen), and stored at −20 °C for targeted bisulfite sequencing (Fig. 1b).
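The principle behind this assay can be illustrated in silico: sodium bisulfite converts unmethylated cytosines to uracil, which is read as thymine after PCR, whereas 5-methylcytosines are protected and still read as cytosine. The following Python sketch is a conceptual illustration only and is not part of the study pipeline; the sequence, the methylated position, and both function names are hypothetical, and complete conversion is assumed.

```python
def bisulfite_convert(seq, methylated_positions):
    """Return the top-strand sequence expected after bisulfite treatment and PCR.

    Unmethylated C is read as T; a methylated C (0-based index listed in
    methylated_positions) stays C. Assumes 100% conversion efficiency.
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq.upper())
    )


def cpg_methylation_calls(reference, read):
    """Map each CpG cytosine index in the reference to True if the read keeps a C."""
    calls = {}
    for i in range(len(reference) - 1):
        if reference[i:i + 2] == "CG":
            calls[i] = read[i] == "C"
    return calls


reference = "ACGTTCGACCGA"  # hypothetical promoter fragment with CpGs at 1, 5, 9
read = bisulfite_convert(reference, methylated_positions=[1])
calls = cpg_methylation_calls(reference, read)
print(read, calls)
```

Comparing the converted read against the unconverted reference at each CpG is what allows a methylation caller to count methylated and unmethylated reads per site.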
Primer design for bisulfite sequencing PCR
We designed primers for the promoter regions of 18 TSGs: ATM, ATR, BRCA1, BRCA2, BRIP1, CHEK2, EPCAM (first intron CpG island), ERCC3, FANCF, FANCI, FANCL, FANCM, MLH1, MSH2, PALB2, PMS2, POLH, and RAD51C using the MethPrimer 2.0 online software. See Supplementary Methods and Supplementary Table 1 for the primer design criteria.
Endpoint PCR amplification
We performed a total of 7780 endpoint PCRs to amplify the target regions within the promoters of the 18 TSGs (one amplicon per promoter). GoTaq Polymerase Master Mix® (Promega, Madison USA) was used for the amplification. Each 25 μL reaction contained 12.5 μL of GoTaq 2X Master Mix, forward and reverse primers at 200 nM, 20 ng of bisulfite-converted DNA template, and RNase-free water to a final volume of 25 μL. A no-template reaction was used as a negative control. The MLH1 promoter region was used as a positive control.
The thermal cycling conditions used for the PCR were: one initial denaturation cycle at 95 °C for 3 min; 40 cycles of denaturation at 95 °C for 30 s, annealing at the primer-specific Tm for 30 s, and extension at 72 °C for 30 s; and a final extension step at 72 °C for 5 min, followed by a hold at 4 °C. Subsequently, the amplified products were resolved on a 0.8% agarose gel (Supplementary Table 1).
Pooling, library preparation and next generation sequencing
The individual PCR products were equalized to 6.36 nM and pooled by sample. We obtained 389 equimolar PCR pools (231 patients, 156 controls, and 2 positive methylation controls) with 20 genes (18 study genes and 2 internal control genes). Each pool was purified with AMPure XP Beads (1.8X) and a total of 70 ng of DNA was used for the preparation of DNA libraries using the NEBNext Ultra™ II DNA Library Prep Kit for Illumina. The generation of high-quality DNA libraries was confirmed by Bioanalyzer 2100 High Sensitivity DNA Chip analysis (Agilent, California USA). The samples were paired-end sequenced (2 × 250) with a theoretical depth of coverage of 10,000X on a MiSeq instrument (Fig. 1c).
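The equimolar pooling step relies on standard arithmetic, sketched below in Python under stated assumptions. The sketch uses the common approximation of 660 g/mol per base pair of double-stranded DNA and the dilution relation C1·V1 = C2·V2; the amplicon length, stock concentration, final volume, and function names are hypothetical and are not taken from the study protocol. It also reproduces the sample and reaction counts reported in the text.

```python
AVG_BP_MASS = 660.0  # approximate g/mol per base pair of dsDNA (assumption)


def ng_per_ul_to_nM(conc_ng_per_ul, length_bp):
    """Convert a dsDNA mass concentration (ng/uL) to molarity (nM)."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * length_bp)


def stock_volume_ul(stock_nM, target_nM, final_volume_ul):
    """Volume of stock needed to reach target_nM in final_volume_ul (C1*V1 = C2*V2)."""
    if stock_nM < target_nM:
        raise ValueError("stock is more dilute than the target concentration")
    return target_nM * final_volume_ul / stock_nM


TARGET_NM = 6.36  # equalization concentration stated in the text

# Hypothetical amplicon: 300 bp at 10 ng/uL, diluted to 20 uL final volume.
stock = ng_per_ul_to_nM(conc_ng_per_ul=10.0, length_bp=300)
volume = stock_volume_ul(stock, TARGET_NM, final_volume_ul=20.0)
print(f"stock = {stock:.2f} nM; use {volume:.2f} uL of stock in 20 uL")

# Counts reported in the text: pools per sample and total endpoint PCRs.
n_pools = 231 + 156 + 2   # patients + controls + positive methylation controls
n_pcrs = n_pools * 20     # 18 study genes + 2 internal control genes
print(n_pools, n_pcrs)
```

Equalizing by molarity rather than by mass keeps shorter amplicons from being overrepresented in the sequencing pool.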
Bioinformatics and quantitative methylation
DNA methylation was detected by an automated program. The pipeline performed the mapping of the CpG sites and the filtering of site-specific methylation data as follows: i) raw reads were assessed with MultiQC v1.13 and processed with Trim Galore v0.6.6, keeping bases with Phred quality score >30; ii) mapping and methylation calling were done with Bismark v0.22.344, using the GRCh37/hg19 genome reference; and iii) regions of interest were kept for downstream analysis. The methylation values were expressed as percentages (0–100%) and normalized using the control samples, applying a Z-score for each specific CpG site according to the following equation:
\[{Z}_{{ij}}=\frac{{x}_{{ij}}-\bar{{x}_{j}}}{{S}_{j}}\]
where \({Z}_{{ij}}\) is the Z-score value of each CpG of each promoter (j) for each sample patient (i); \({x}_{{ij}}\) is the methylation of each CpG of the promoter evaluated (j) for each sample patient (i); \(\bar{{x}_{j}}\) is the mean methylation of each CpG of the promoter evaluated (j) in the control samples; and \({S}_{j}\) is the standard deviation of methylation in the control samples for each CpG of the promoter evaluated (j). This normalization approach has been applied in other reports to evaluate the methylation status of specific CpG sites in case-control studies45,46,47. A Z-score above zero was considered higher than the methylation control mean.
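The control-based normalization can be sketched in a few lines of Python. This is a minimal illustration with hypothetical methylation percentages for a single CpG site, not the deposited pipeline; the function name is hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which estimator was applied.

```python
from statistics import mean, stdev


def control_zscores(values, control_values):
    """Z-score each value against the control mean and SD for one CpG site."""
    mu = mean(control_values)
    sd = stdev(control_values)  # sample SD (n - 1); assumption, see lead-in
    if sd == 0:
        raise ValueError("control SD is zero; Z-score is undefined for this site")
    return [(x - mu) / sd for x in values]


# Hypothetical % methylation at one CpG site (not study data).
controls = [2.0, 3.0, 4.0, 3.0, 3.0]
patients = [5.0, 3.0, 6.5]

patient_z = control_zscores(patients, controls)
control_z = control_zscores(controls, controls)
print([round(z, 3) for z in patient_z], mean(control_z))
```

Because the controls are normalized against their own mean, their average Z-score is zero by construction, which is why the control group means reported in the Results are values on the order of 1E−16 (floating-point rounding) rather than exactly zero.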
The detection of single nucleotide polymorphisms (SNPs) from bisulfite sequencing alignments was done with the Revelio software48 (Supplementary Figs. 3–7). SNPs were also evaluated with the dbSNP track in the UCSC Genome Browser.
Statistical analysis
We conducted the Wilcoxon test with Bonferroni correction on the p-values of all CpG sites to control the Type I error rate in multiple comparisons. This analysis was carried out using scipy v1.10.1 and statsmodels v0.14. Differences with a p-value of less than 0.05 were considered statistically significant. Additionally, to correct for multiple testing, we set the false discovery rate (FDR) threshold at q < 0.05, employing the qvalue v2.32.0 library.
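The two multiple-testing corrections can be illustrated with a short, dependency-free Python sketch. Note that the study used the qvalue library for FDR control; the Benjamini-Hochberg adjustment shown here is a related but not identical procedure, swapped in because it is simple to write out in full. The p-values and function names are hypothetical.

```python
def bonferroni(pvalues):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (step-up FDR procedure)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-CpG p-values (not study data).
pvals = [0.001, 0.008, 0.039, 0.041, 0.20]
bonf = bonferroni(pvals)
bh = benjamini_hochberg(pvals)
significant_bh = [i for i, q in enumerate(bh) if q < 0.05]
print(bonf, bh, significant_bh)
```

With 330 CpG sites tested, a Bonferroni threshold at α = 0.05 corresponds to a per-site p-value of about 1.5E−04, which is considerably stricter than an FDR threshold of q < 0.05.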
We performed a supervised hierarchical analysis with Pearson correlation and principal component analysis (PCA), stratified into patient and control groups, using ComplexHeatmap v.2.13.1 and scikit-learn v1.1.2 libraries, respectively. The relation between the Z-score and the p-value obtained from the Wilcoxon rank-sum test was depicted in a volcano plot to visualize individual site methylation fold-change differences. Univariate and multivariate logistic regression analyses were conducted to assess the association between the peripheral blood methylation of each CpG site and the risk of BC, using OddsPlotty v1.0.2 and Stata V18.0. The results are represented as odds ratios (ORs) and 95% confidence intervals (95% CIs). Finally, we built a classifier with the significant sites individually and combined, and specificity and sensitivity were calculated with pROC v1.18.0 and CombiROC v0.2.3, respectively.
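The relation between a logistic regression coefficient and the reported OR with its 95% CI can be made explicit with a short Python sketch. It assumes a Wald-type interval, OR = exp(β) with limits exp(β ± 1.96·SE); the text does not state which interval type the software produced, so this is an assumption, and the function names are hypothetical. The worked example back-calculates the standard error from the interval reported for cg47630224-MSH2 (OR = 4.17, 95% CI 2.05–8.48) and checks that the round trip is self-consistent.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def odds_ratio_ci(beta, se):
    """Return (OR, lower, upper) for a log-odds coefficient and its standard error."""
    return (
        math.exp(beta),
        math.exp(beta - Z_95 * se),
        math.exp(beta + Z_95 * se),
    )


def se_from_ci(lower, upper):
    """Recover the standard error of the log-odds from a Wald 95% CI on the OR."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


# Values reported in the text for cg47630224-MSH2.
beta = math.log(4.17)
se = se_from_ci(2.05, 8.48)
odds_ratio, ci_low, ci_high = odds_ratio_ci(beta, se)
print(f"OR={odds_ratio:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f}), SE={se:.3f}")
```

On the log scale a Wald interval is symmetric around β, so the reported OR should sit close to the geometric mean of its confidence limits, as it does here.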
Validation in three independent cohorts
We used three datasets from NCBI-GEO: the EPIC-Italy (HuGeF) cohort GSE51032 (222 sporadic patients), the Uruguayan cohort GSE148663 (22 sporadic patients), and the Australian cohort GSE104942 (87 BC patients with HBOC criteria), all analyzed with Infinium HumanMethylation450 (HM450K) BeadChip-Illumina arrays. Samples with a detection p-value > 0.05 for the Infinium cg22269526 probe were excluded from the analysis across all three cohorts. The β-values derived from the HM450K platform were expressed as percentages, with 0 indicating 0% methylation and 1 indicating 100% methylation. These values were subsequently normalized using the Z-score. The univariate logistic regression ORs and ROC curves were calculated as described above.
Data availability
The data presented in the study are deposited in the Sequence Read Archive repository, accession numbers PRJNA987643 and PRJNA987641.
Code availability
The codes used in this study are deposited in https://github.com/UBIMED-Lab13/MethylationDetection.git.
References
Sung, H. et al. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA. Cancer J. Clin. 71, 209–249 (2021).
Yoshida, R. Hereditary breast and ovarian cancer (HBOC): review of its molecular characteristics, screening, treatment, and prognosis. Breast Cancer Tokyo Jpn. 28, 1167–1180 (2021).
Kast, K. et al. Prevalence of BRCA1/2 germline mutations in 21 401 families with breast and ovarian cancer. J. Med. Genet. 53, 465–471 (2016).
Fackenthal, J. D. & Olopade, O. I. Breast cancer risk associated with BRCA1 and BRCA2 in diverse populations. Nat. Rev. Cancer 7, 937–948 (2007).
Hu, C. et al. A Population-Based Study of Genes Previously Implicated in Breast Cancer. N. Engl. J. Med. 384, 440–451 (2021).
Allis, C. D. & Jenuwein, T. The molecular hallmarks of epigenetic control. Nat. Rev. Genet. 17, 487–500 (2016).
Esteller, M., Corn, P. G., Baylin, S. B. & Herman, J. G. A gene hypermethylation profile of human cancer. Cancer Res. 61, 3225–3229 (2001).
Oey, H. & Whitelaw, E. On the meaning of the word “epimutation”. Trends Genet. TIG 30, 519–520 (2014).
Ruiz de la Cruz, M. et al. Cis-Acting Factors Causing Secondary Epimutations: Impact on the Risk for Cancer and Other Diseases. Cancers 13, 4807 (2021).
Karsli-Ceppioglu, S. et al. Epigenetic mechanisms of breast cancer: an update of the current knowledge. Epigenomics 6, 651–664 (2014).
Xu, Z. et al. Epigenome-wide association study of breast cancer using prospectively collected sister study samples. J. Natl. Cancer Inst. 105, 694–700 (2013).
Xu, Z., Sandler, D. P. & Taylor, J. A. Blood DNA Methylation and Breast Cancer: A Prospective Case-Cohort Analysis in the Sister Study. J. Natl. Cancer Inst. 112, 87–94 (2020).
Flanagan, J. M. et al. Gene-body hypermethylation of ATM in peripheral blood DNA of bilateral breast cancer patients. Hum. Mol. Genet. 18, 1332–1342 (2009).
Severi, G. et al. Epigenome-wide methylation in DNA from peripheral blood as a marker of risk for breast cancer. Breast Cancer Res. Treat. 148, 665–673 (2014).
van Veldhoven, K. et al. Epigenome-wide association study reveals decreased average methylation levels years before breast cancer diagnosis. Clin. Epigenetics 7, 67 (2015).
Guerrero-Preston, R. et al. Differential promoter methylation of kinesin family member 1a in plasma is associated with breast cancer and DNA repair capacity. Oncol. Rep. 32, 505–512 (2014).
Han, L.-L. et al. Aberrant NDRG1 methylation associated with its decreased expression and clinicopathological significance in breast cancer. J. Biomed. Sci. 20, 52 (2013).
Brennan, K. et al. Intragenic ATM methylation in peripheral blood DNA as a biomarker of breast cancer risk. Cancer Res. 72, 2304–2313 (2012).
Potapova, A., Hoffman, A. M., Godwin, A. K., Al-Saleem, T. & Cairns, P. Promoter hypermethylation of the PALB2 susceptibility gene in inherited and sporadic breast and ovarian cancer. Cancer Res. 68, 998–1002 (2008).
Quezada Urban, R. et al. Comprehensive Analysis of Germline Variants in Mexican Patients with Hereditary Breast and Ovarian Cancer Susceptibility. Cancers 10, 361 (2018).
Oliver, J. et al. Latin American Study of Hereditary Breast and Ovarian Cancer LACAM: A Genomic Epidemiology Approach. Front. Oncol. 9, 1429 (2019).
Hansmann, T. et al. Constitutive promoter methylation of BRCA1 and RAD51C in patients with familial ovarian cancer and early-onset sporadic breast cancer. Hum. Mol. Genet. 21, 4669–4679 (2012).
Tabano, S. et al. Analysis of BRCA1 and RAD51C Promoter Methylation in Italian Families at High-Risk of Breast and Ovarian Cancer. Cancers 12, 910 (2020).
Ligtenberg, M. J. L. et al. Heritable somatic methylation and inactivation of MSH2 in families with Lynch syndrome due to deletion of the 3’ exons of TACSTD1. Nat. Genet. 41, 112–117 (2009).
Joo, J. E. et al. Heritable DNA methylation marks associated with susceptibility to breast cancer. Nat. Commun. 9, https://doi.org/10.1038/s41467-018-03058-6 (2018).
Sturgeon, S. R. et al. Prediagnostic White Blood Cell DNA Methylation and Risk of Breast Cancer in the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial (PLCO) Cohort. Cancer Epidemiol. Biomarkers Prev. 30, 1575–1581 (2021).
Pang, D. et al. Methylation profiles of the BRCA1 promoter in hereditary and sporadic breast cancer among Han Chinese. Med. Oncol. 29, 1561–1568 (2012).
Yang, Y. et al. Genetically Predicted Levels of DNA Methylation Biomarkers and Breast Cancer Risk: Data From 228 951 Women of European Descent. J. Natl. Cancer Inst. 112, 295–304 (2020).
Fu, J. et al. DNA Methylation of Imprinted Genes KCNQ1, KCNQ1OT1, and PHLDA2 in Peripheral Blood Is Associated with the Risk of Breast Cancer. Cancers 14, 2652 (2022).
Bodelon, C. et al. Blood DNA methylation and breast cancer risk: a meta-analysis of four prospective cohort studies. Breast Cancer Res. 21, 62 (2019).
Chan, T. L. et al. Heritable germline epimutation of MSH2 in a family with hereditary nonpolyposis colorectal cancer. Nat. Genet. 38, 1178–1183 (2006).
Scott, C. M. et al. Methylation of Breast Cancer Predisposition Genes in Early-Onset Breast Cancer: Australian Breast Cancer Family Registry. PloS One 11, e0165436 (2016).
Akulenko, R. & Helms, V. DNA co-methylation analysis suggests novel functional associations between gene pairs in breast cancer samples. Hum. Mol. Genet. 22, 3016–3022 (2013).
Sun, S., Dammann, J., Lai, P. & Tian, C. Thorough statistical analyses of breast cancer co-methylation patterns. BMC Genomic Data 23, 29 (2022).
Shi, J. et al. The concurrence of DNA methylation and demethylation is associated with transcription regulation. Nat. Commun. 12, 5285 (2021).
Evans, D. G. R. et al. A Dominantly Inherited 5′UTR Variant Causing Methylation-Associated Silencing of BRCA1 as a Cause of Breast and Ovarian Cancer. Am. J. Hum. Genet. 103, 213–220 (2018).
Laner, A., Benet-Pages, A., Neitzel, B. & Holinski-Feder, E. Analysis of 3297 individuals suggests that the pathogenic germline 5’-UTR variant BRCA1 c.-107A > T is not common in south-east Germany. Fam. Cancer 19, 211–213 (2020).
de Jong, V. M. T. et al. Identifying the BRCA1 c.-107A > T variant in Dutch patients with a tumor BRCA1 promoter hypermethylation. Fam. Cancer 22, 151–154 (2023).
Coppedè, F. Genes and the Environment in Cancer: Focus on Environmentally Induced DNA Methylation Changes. Cancers 15, https://doi.org/10.3390/cancers15041019 (2023).
Mossman, D. & Scott, R. J. Epimutations, Inheritance and Causes of Aberrant DNA Methylation in Cancer. Hered. Cancer Clin. Pract. 4, 75–80 (2006).
Zeilinger, S. et al. Tobacco smoking leads to extensive genome-wide changes in DNA methylation. PloS One 8, e63812 (2013).
Hitchins, M. P. Constitutional epimutation as a mechanism for cancer causality and heritability? Nat. Rev. Cancer 15, https://doi.org/10.1038/nrc4001 (2015).
Koestler, D. C. et al. Peripheral blood immune cell methylation profiles are associated with nonhematopoietic cancers. Cancer Epidemiol. Biomark. Prev. 21, 1293–1302 (2012).
Krueger, F. & Andrews, S. R. Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinforma. Oxf. Engl. 27, 1571–1572 (2011).
Bediaga, N. G. et al. DNA methylation epigenotypes in breast cancer molecular subtypes. Breast Cancer Res. BCR 12, R77 (2010).
Konishi, K. et al. Rare CpG island methylator phenotype in ulcerative colitis-associated neoplasias. Gastroenterology 132, 1254–1260 (2007).
Geli, J. et al. Global and regional CpG methylation in pheochromocytomas and abdominal paragangliomas: association to malignant behavior. Clin. Cancer Res. Off. J. Am. Assoc. Cancer Res. 14, 2551–2559 (2008).
Nunn, A., Otto, C., Fasold, M., Stadler, P. F. & Langenberger, D. Manipulating base quality scores enables variant calling from bisulfite sequencing alignments using conventional bayesian approaches. BMC Genomics 23, 477 (2022).
Acknowledgements
This work was supported by UNAM PAPIIT IN225224, UNAM PAPIIT IN225920, CONACYT Fondo Sectorial 272573, and Fondo SEP CONACYT 285879. Miguel Ruiz de la Cruz carried out this work as the recipient of a fellowship (CVU/Scholar number 854825/755818) granted by the National Council of Humanities, Science and Technology of Mexico (CONAHCYT) during his PhD studies in Infectomics and Molecular Pathogenesis. We thank Laura Margarita Marquez for her technical support.
Author information
Contributions
Conception, M.R.D.L.C., F.D.L.C.H.H., and F.V.P. Patient recruitment, sampling, database, R.G., M.P.R.C., E.M.G.G., and G.T.M. Experimental analysis, M.R.D.L.C., C.E.D.V., N.G.R.F., A.H.D.L.C.M., and F.V.P. Data analysis and visualization, M.R.D.L.C., H.M.G., F.A.B., and F.V.P. Manuscript writing and review, M.R.D.L.C., H.M.G., C.E.D.V., F.A.B., N.G.R.F., D.P., J.O., S.P., E.M.G.G., L.I.T., G.T.M., F.D.L.C.H.H., and F.V.P. Resources, F.V.P. Funding acquisition, F.V.P. All authors contributed to the article and approved the submitted version.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ruiz-De La Cruz, M., Martínez-Gregorio, H., Estela Díaz-Velásquez, C. et al. Methylation marks in blood DNA reveal breast cancer risk in patients fulfilling hereditary disease criteria. npj Precis. Onc. 8, 136 (2024). https://doi.org/10.1038/s41698-024-00611-z
DOI: https://doi.org/10.1038/s41698-024-00611-z