Background

Breast cancer (BC), the most commonly diagnosed cancer in women in the USA and worldwide [1, 2], is a heterogeneous disease with multiple clinical, histopathological, and molecular subtypes and is characterized by both genetic and epigenetic alterations [3, 4]. Among epigenetic events, DNA methylation (DNAm) is a well-characterized major modification that involves the mitotically heritable and reversible attachment of methyl groups at the 5-carbon position of cytosine in CpG dinucleotides (CpGs), influencing DNA transcription without altering the DNA sequence [5, 6]. Several DNAm studies of BC initiation and progression support global hypomethylation [7,8,9] and focal hypermethylation, such that some tumor-suppressor genes are frequently hypermethylated at CpG islands and promoters and are thus inactivated [8, 10, 11]; nevertheless, the role of epigenetic mechanisms in BC tumorigenesis remains inconclusive. For example, there is no consistent trend across studies toward an association between identified CpGs and the risk of BC, suggesting the need for large population-level epigenetic studies that prospectively evaluate BC development, specifically stratified by BC molecular subtype [7, 8].

In particular, among postmenopausal women, the obesity–insulin resistance (IR) connection is a well-established factor for BC risk/progression [1, 12,13,14,15], but the interrelated molecular pathways on the methylome have not been established. IR and type 2 diabetes (T2DM) are influenced by environmental and genomic factors as well as by their interplay [16,17,18,19]. In prior genome-wide association studies (GWASs), a majority of the identified genes are associated with insulin secretion, pointing to pancreatic islet defects rather than impaired insulin action [19,20,21], and these genes explain only a small portion of the estimated heritability [19, 22]. Epigenetic analysis may address these issues. For example, obesity status/adipose tissues and long-term exposure of beta cell lines to hyperglycemia altered DNAm of genes involved in glucose metabolism and their gene expression, leading to impaired insulin secretion as well as sensitivity [17, 23,24,25,26,27,28,29,30]. Thus, aberrant DNAm may directly influence the function of pancreatic beta cells as well as other organs involved in glucose homeostasis. Also, considering that age, as measured via epigenetic age [31, 32], influences DNAm, changes in DNAm that are associated with IR account for aging in the methylome and thus may be a better indicator of inter-/intra-individual genomic variability of IR. As such, DNAm can be a biomarker of decreased insulin sensitivity, but few epigenome-wide association studies (EWASs) have so far examined DNAm in IR [30, 33,34,35]. In addition, previous EWASs of obesity/metabolic syndrome provided limited evidence owing to a lack of validation of their findings and no comparison of findings between peripheral blood and tissues.

Further, one study [32] of IR in T2DM pancreatic islets reported that differentially methylated genes were enriched in pathways of cancer and MAPK signaling, suggesting a close link of epigenomic mechanisms between IR and cancer. Given that even a modest change in DNAm can have a substantial effect on gene expression and that, in late-onset disease, it has a large effect on disease over a long time period [36], DNAm markers can serve as biomarkers to detect an early at-risk group for morbid conditions, such as IR and BC, even several years before the clinical diagnosis is made.

Our study was a population-level EWAS to detect DNAm probes that are associated with IR phenotypes and that, by using data from a prospective evaluation of BC development, are further directly correlated with BC risk, both overall and in BC subtypes among postmenopausal women. DNAm is tissue specific, but the correlations between peripheral blood and tissue are gene specific. For example, the methylation levels of several genes in relation to IR, T2DM, and/or BC are highly correlated between peripheral blood and tissue [37,38,39,40]. Thus, we first conducted a peripheral blood leukocyte (PBL)–based EWAS and compared the methylation levels of the detected CpGs with those of the same CpGs in BC and adjacent normal breast tissues. Consistent with the results of a published study [41] of gene-methylation parallelisms between peripheral blood cells and tissues in glucose metabolism, PBLs may be the best non-invasive alternative tissue, serving as a surrogate DNAm marker that reflects multiple glucometabolic pathways. With the detected EWA-based IR-CpGs, we further tested for associations with BC risk in PBLs and conducted validation tests in BC tissues. This allowed us to determine whether our IR-CpGs at genome-wide significance that were associated with BC risk are systemic, tissue specific, or common to both.

Materials and methods

Study population

Our EWA analysis used data from the Women’s Health Initiative (WHI), a large, prospective cohort study of postmenopausal women who were aged 50–79 years at enrollment between 1993 and 1998 at 40 clinical centers in the USA; the WHI consists of 2 study arms, namely the clinical trial (CT) and the observational study (OS) [42]. For DNAm data, we included 3 WHI ancillary studies (ASs) with available genome-wide DNAm measured in PBLs (Fig. 1): for discovery, AS315 (Epigenetic Mechanisms of Particulate Matter-Mediated Cardiovascular Disease, random minority oversample from WHI CT, n = 2,243); for validation, Broad Agency Award (BAA23, Integrative Genomics for Risk of Coronary Heart Disease [CHD] and Related Phenotypes, case–control study of CHD from WHI CT and OS, n = 2,107) combined with AS311 (Bladder Cancer and Leukocyte Methylation, matched case–control study of bladder cancer from WHI CT and OS, n = 882) [43, 44]. Because racial/ethnic variation exists in BC-related DNAm [37, 45], we restricted the study population for our EWA analysis to women who reported their race or ethnicity as non-Hispanic white, the majority of the WHI AS population, and who had available IR phenotypes assessed via fasting blood levels of glucose (FG) and insulin (FI) (n = 1,132).

Fig. 1

Diagram of EWA and BC study populations from the WHI and TCGA cohorts. BC Breast cancer, CpGs CpG dinucleotide, DNAm DNA methylation, ER/PR + Estrogen receptor/progesterone receptor–positive, EWA Epigenome-wide association, HER2/neu– Human epidermal growth factor receptor-2–negative, TCGA The Cancer Genome Atlas, WHI Women’s Health Initiative. * Individuals within Stage 2 had DNAm data measured at 2 visits and, for the analysis, the DNAm of those with a shorter interval between enrollment and blood draw were selected. ** Those selection criteria were applied to TCGA BC tissues. § The cases of HER2/neu– contained 49 (13% of BC cases) triple negatives

For our analysis of the validated IR-CpGs with the risk of BC development, our discovery cohort included the women from those 3 WHI ASs with available BC outcomes, excluding women (n = 46) who had been followed up for less than 1 year and/or had been diagnosed with any type of cancer at enrollment, leaving a total of 1,086 women (Fig. 1). These women had been followed up through March 6, 2021, with a mean follow-up of 17 years, and 80 of them had developed invasive BC. Our replication cohort was derived from The Cancer Genome Atlas (TCGA) BC Study (n = 862), which houses tissue-derived genome-wide DNAm data and molecular profiles of different BC subtypes from BC tissues [46]. Our analyses for BC were restricted to women who were white, postmenopausal, and free of distant metastasis and who had available BC subtypes, resulting in a total of 412 samples (361 BC tissues + 51 adjacent normal breast tissues) (Fig. 1). The institutional review boards of each WHI clinical center and the University of California, Los Angeles, approved this study.

Data collection and BC outcome

Participants enrolled in the WHI completed self-administered questionnaires at screening, providing demographic information (e.g., age, race) and medical histories, such as DM. Trained staff obtained anthropometric measurements, including height, weight, and waist and hip circumferences, at baseline. Invasive BC development was initially ascertained through self-report of a new cancer diagnosis by all participants, further determined by a committee of physicians on the basis of a review of the patients’ medical records and pathology and cytology reports, and coded into the central WHI database according to the National Cancer Institute’s Surveillance, Epidemiology, and End Results guidelines [47]. The time from enrollment until BC development, censoring, or the study end point was measured as the number of days and then converted into years.

BC patient data from TCGA used in this study include information on age, race, menopausal status, and diagnosed tumor subtype and stage. For this study, data from primary invasive BC tissues and from normal breast tissues adjacent to BC (either primary or metastatic) tissues were analyzed.

Epigenome-wide DNAm array and laboratory methods

Using peripheral blood leukocytes isolated from the fasting blood of the WHI participants, we extracted DNA and measured DNAm via the Illumina 450 BeadChip (Illumina Inc.; San Diego, CA) at up to 485,511 CpG sites. DNAm levels (β values) were calculated as the ratio of the methylated probe intensity to the sum of the methylated and unmethylated probe intensities, ranging from 0 (completely unmethylated) to 1 (completely methylated) [48]. DNAm was beta-mixture quantile (BMIQ)-normalized [49] and batch-adjusted for stage and plate by using empirical Bayes methods [50] or by using a random intercept for plate and chip and a fixed effect for row. Leukocyte heterogeneity was estimated, for adjustment in the analysis, by using Houseman’s method [51] (for CD4+ T cells, natural killer cells, monocytes, and granulocytes) and Horvath’s method [52] (for plasmablasts, CD8+CD28–CD45RA– T cells, and naïve CD8 T cells).
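
The β-value definition above can be illustrated with a minimal sketch. It assumes the conventional Illumina form with a small stabilizing offset in the denominator; the offset of 100, the function name, and the example intensities are assumptions made for illustration and are not taken from this study.

```python
def beta_value(methylated: float, unmethylated: float, offset: float = 100.0) -> float:
    """Return the methylation level (beta value) for one CpG probe.

    beta = M / (M + U + offset), where M and U are the methylated and
    unmethylated signal intensities. The offset (assumed default: 100)
    keeps the ratio stable when both intensities are low.
    """
    if methylated < 0 or unmethylated < 0:
        raise ValueError("intensities must be non-negative")
    return methylated / (methylated + unmethylated + offset)


# Hypothetical intensities, for illustration only.
print(round(beta_value(9000.0, 900.0), 3))  # mostly methylated probe
print(round(beta_value(500.0, 9400.0), 3))  # mostly unmethylated probe
```

With the offset included, β stays within [0, 1) and approaches 1 only for strongly methylated probes with high signal.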

In TCGA, tissue-derived genome-wide DNAm was measured by using the Illumina Infinium 450K array and was normalized via normal-exponential out-of-band (Noob) background correction in minfi v.1.42.0 [53]. The tumor purity and cell-type proportions (cancer and normal epithelial, stromal, and immune cells) of each tumor sample were estimated by using the R packages InfiniumPurify v.1.3.1 [54] and RefFreeEWAS v.2.2 [55], respectively.

Serum samples were drawn at enrollment by trained phlebotomists from WHI participants who had fasted for at least 8 h. Glucose concentrations were assayed by using the hexokinase method on a Hitachi 747 analyzer (Boehringer Mannheim Diagnostics, Indianapolis, IN), and insulin concentrations were assayed by radioimmunoassay (Linco Research, Inc., St. Louis, MO) or the automated ES300 method (Boehringer Mannheim Diagnostics, Indianapolis, IN). Results from the 2 methods for insulin measurement were comparable at insulin concentrations < 60 μIU/ml, and the intra-class correlation coefficient with repeatedly measured insulin was 0.7 [56]. Homeostatic model assessment–IR (HOMA-IR), as a surrogate of IR, was estimated as glucose (unit: mg/dl) × insulin (unit: μIU/ml) / 405 [57].
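
As a minimal sketch of the HOMA-IR formula given above, the calculation can be written as follows. The function name and the example concentrations are hypothetical and serve only to illustrate the arithmetic.

```python
def homa_ir(glucose_mg_dl: float, insulin_uiu_ml: float) -> float:
    """Homeostatic model assessment of insulin resistance (HOMA-IR).

    Computed as fasting glucose (mg/dl) x fasting insulin (uIU/ml) / 405,
    matching the formula stated in the text.
    """
    if glucose_mg_dl < 0 or insulin_uiu_ml < 0:
        raise ValueError("concentrations must be non-negative")
    return glucose_mg_dl * insulin_uiu_ml / 405.0


# Hypothetical fasting values, for illustration only.
print(homa_ir(90.0, 4.5))             # 90 x 4.5 = 405, so HOMA-IR = 1.0
print(round(homa_ir(100.0, 8.6), 2))  # values at the FG and FI cut points
```

The divisor of 405 applies when glucose is expressed in mg/dl; glucose in mmol/l would require a different constant.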

Statistical analysis

For the DNAm site-specific analysis across the genome with IR phenotypes, each phenotype was log-transformed on the basis of tests of linearity and normality and was also categorized as follows: FG, FI, and HOMA-IR, using 100 mg/dl, 8.6 μIU/ml, and 3.0 (respectively), corresponding to the cut points of the American Heart Association/National Heart, Lung, and Blood Institute, the International Diabetes Federation, and the Adult Treatment Panel III for metabolic syndrome [58, 59]. The association between DNAm and each phenotype was evaluated via multiple linear and logistic regressions, adjusting for age and leukocyte heterogeneities. The summary of the leukocyte proportions is provided in Additional file 1: Table S1. A 2-sided p < 1E–07 (discovery) and p < 0.05 / number of discovered CpGs (replication; Bonferroni correction) were considered statistically significant. Results were combined across discovery and replication in a meta-analysis assuming a fixed-effect model.
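
The replication threshold and the fixed-effect pooling described above can be sketched as follows. This is an illustration only: it assumes standard inverse-variance weighting and a normal approximation for the pooled p value, neither of which is specified in the text, and the function names and effect estimates are hypothetical.

```python
import math


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def fixed_effect_meta(estimates, std_errors):
    """Pool effect estimates with inverse-variance weights (fixed-effect model).

    Returns the pooled estimate, its standard error, and a 2-sided p value
    from the normal approximation.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return pooled, pooled_se, p_value


# Replication threshold when 19 CpGs are carried forward from discovery.
print(f"{bonferroni_threshold(19):.1e}")

# Hypothetical discovery and replication estimates for a single CpG.
pooled, pooled_se, p_value = fixed_effect_meta([-0.02, -0.04], [0.01, 0.01])
print(round(pooled, 4), round(pooled_se, 4), f"{p_value:.1e}")
```

Because each study is weighted by the inverse of its variance, the stage with the smaller standard error contributes more to the pooled estimate.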

With the selected top 20 CpG sites that were most statistically significant after multiple-comparison corrections, we next performed, in the WHI data, multivariable Cox proportional hazards regression for BC development overall and within BC subtypes, testing the proportional hazards assumption via a Schoenfeld residual plot and rho and accounting for age, having ever been treated for diabetes, body mass index (BMI), waist-to-hip ratio (WHR), and leukocyte heterogeneities. Using TCGA data, we further conducted validation tests of the top 20 CpGs with BC risk by using logistic regression that was adjusted for age, tumor purity, and cell-type composition, both overall and in the BC subtypes. For the analysis of BC risk, the modeled CpGs in both cohorts were further standardized across samples; thus, the effect size reflected the effect of a 1-standard-deviation increase in DNAm on BC risk. Given that this testing was performed on the basis of our hypothesis-driven questions (i.e., IR-DNAm in association with BC systemically or in tissues), a 2-tailed p < 0.05 was considered significant.
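
The standardization step described above can be illustrated with a short sketch. It assumes z-scoring with the sample standard deviation (n − 1 denominator), which the text does not specify; the function name and β values are hypothetical.

```python
import statistics


def standardize(beta_values):
    """Z-score the methylation levels of one CpG across samples.

    After this transformation, a regression coefficient for the CpG is
    interpreted per 1-standard-deviation increase in DNAm.
    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    mean = statistics.mean(beta_values)
    sd = statistics.stdev(beta_values)
    if sd == 0:
        raise ValueError("standard deviation is zero; cannot standardize")
    return [(b - mean) / sd for b in beta_values]


# Hypothetical beta values for one CpG across five samples.
z_scores = standardize([0.62, 0.70, 0.58, 0.66, 0.74])
print([round(z, 2) for z in z_scores])
```

Standardizing each CpG separately puts all 20 probes on a common scale, so their effect sizes can be compared directly.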

Differences in methylation levels of the modeled CpGs by IR phenotypes in the PBLs and by BC risk in each of PBLs and tissues, as well as differences in the DNAm status between the PBLs and tissues among women with BC and those without BC, were tested via unpaired 2-sample t tests. If β values were skewed or had outliers, the Mann–Whitney/Wilcoxon rank-sum test was used. With the CpGs at genome-wide significance in the discovery and those that were associated with BC risk in either TCGA or WHI, we finally conducted a Gene Set Enrichment Analysis (GSEA) by IR phenotypes and by BC subtypes, respectively, using WebGestalt [60]. All statistical analyses were performed using R.

Results

Epigenome-wide association of DNAm and IR phenotypes

Among 484,220 CpGs in the discovery data, we found several differentially methylated CpGs associated with each IR phenotype (FG, FI, and HOMA-IR) and further validated them. In detail, 19 CpGs were associated with FG analyzed as a continuous variable; of those, 1 CpG (cg19693031 in TXNIP) was further validated, with p < 2.6E–03 (= 0.05/19) (Table 1, Figs. 2A and B). This same CpG was also replicated in the analysis for FG as a categorical variable, showing the same direction as the effect size estimated in the FG analysis as a continuous variable (Additional file 1: Table S2). Of 20 CpGs in relation to FI as a continuous variable in discovery, 7 CpGs were further validated, with p < 2.5E–03 (= 0.05/20; Table 2, Figs. 2C and D). Of those 7 CpGs, 1 CpG (cg00574957 in CPT1A) was also replicated in the analysis of FI as a categorical variable (Additional file 1: Table S3); in both linear and logistic analyses, this CpG was negatively associated with FI. For HOMA-IR as a continuous variable, 35 CpGs were detected in discovery; 7 of those were further validated (p < 1.4E–03 [= 0.05/35]; Table 3, Figs. 2E and F). In the analysis of HOMA-IR as a categorical variable, 4 of the validated 7 CpGs (cg14476101 in PHGDH, cg19693031 in TXNIP, cg00574958 in CPT1A, and cg06500161 in ABCG1) were also detected in discovery, yielding the same directions as those of the effect sizes estimated in the linear analyses, but none of them were further validated (Additional file 1: Table S4). Finally, we conducted a meta-analysis of all the detected epigenome-wide CpGs by combining their discovery and replication data. We detected 1 CpG (cg19693031 in TXNIP) that was replicated as inversely related to FG, FI, and HOMA-IR, each as a continuous variable; that CpG was also significant at the epigenome-wide level in association with FG and HOMA-IR, each as a categorical variable.

Table 1 Genome-wide scan of DNA methylation for an association with fasting glucose concentrations (as a continuous variable)
Fig. 2

Comparison among the effect sizes of EWA-CpGs in FG, FI, and HOMA-IR as continuous variable in discovery, validation, and meta-analyses. (CpG CpG dinucleotide, EWA Epigenome-wide association, FG and FI Fasting levels of glucose and insulin, HOMA-IR homeostatic model assessment-insulin resistance). A Line graph: FG: 19 CpGs; B Scatter plot: FG: 19 CpGs; C Line graph: FI: 20 CpGs; D Scatter plot: FI: 20 CpGs; E Line graph: IR: 35 CpGs; F Scatter plot: IR: 35 CpGs

Table 2 Genome-wide scan of DNA methylation for an association with fasting insulin concentrations (as a continuous variable)
Table 3 Genome-wide scan of DNA methylation for an association with fasting level of HOMA-IR (as a continuous variable)

Further, we conducted a subset analysis by selecting CpGs with a mean difference in DNAm of > 5% by IR phenotype and compared their mean differences in DNAm levels by each IR phenotype across chromosome (Chr), CpG context, enhancer and/or promoter, and gene region (Additional file 1: Figure S1). The mean levels of DNAm by FG (< 100 mg/dl vs. ≥ 100 mg/dl) differed in Chr 1, 7, 8, and 16. The mean levels of DNAm by FI (≤ 8.6 μIU/ml vs. > 8.6 μIU/ml) and those of DNAm by HOMA-IR (< 3.0 vs. ≥ 3.0) were different in Chr 1, 7, and 8 and in Chr 4, 8, and 11, respectively. Whereas S-Shores were hypomethylated in the groups with impaired glucose metabolism measured via FG and HOMA-IR, OpenSea, N Shelf, and S Shelf were hypermethylated in those with a greater level of FI. In this group with a higher level of FI, the enhancer was hypermethylated, whereas the promoter was hypomethylated. Gene regions, including intergenic, gene body, and 5' untranslated regions (5' UTR), were hypermethylated in the groups with, respectively, greater levels of FG, FI, and HOMA-IR, but the 200–1500 bp upstream of transcription start site (TSS1500) was hypomethylated in the group with a greater level of either FI or HOMA-IR.

Association of the detected IR-DNAm with BC risk

With the top 20 epigenome-wide IR-DNAm, we next tested for correlation with BC risk in the 2 independent cohorts, WHI and TCGA. In the WHI cohort, several CpGs were associated with BC development; their hazard ratios were consistent across the analyses both with and without adjustment for DM, BMI, and WHR (Table 4). In particular, 3 CpGs in WDR8 were detected across overall, estrogen receptor/progesterone receptor–positive (ER/PR +), and human epidermal growth factor receptor-2–negative (HER2/neu–) subtypes, with a positive association with BC risk (Table 4, Additional file 1: Figure S2). None of the CpGs replicated in the analysis of IR phenotypes were detected in the analysis for BC risk, but 2 epigenome-wide level CpGs detected in discovery (cg17058475 and cg16246545) in CPT1A and PHGDH (replicated genes in relation to IR phenotypes), respectively, were associated with the risk of BC.

Table 4 WHI: Differentially DNA-methylated CpGs in IR significantly associated with an invasive BC risk, overall and stratified by BC molecular subtype

In the TCGA cohort, multiple CpGs were significant across BC subtypes; specifically, 2 CpGs (cg06500161 and cg27243685) in ABCG1 (replicated gene in IR phenotypes) were significantly associated with BC risk (Table 5, Additional file 1: Figure S3). Only 1 CpG (cg01676795 in POR) was commonly detected across the WHI and TCGA analyses. A 1-standard-deviation increase in DNAm at this CpG was associated with a 75% greater risk (in the WHI) and a 5 times greater risk (in TCGA) for the ER/PR + subtype. Further, we compared DNAm levels between the WHI and TCGA from the IR-CpGs associated with BC that are shared by the 2 cohorts in terms of Chr, CpGs, CpG context, and gene region. Whereas DNAm levels of some CpG contexts and/or gene regions differed significantly between the 2 cohorts among the non-BC subcohorts (Additional file 1: Figure S4), no significant difference in DNAm levels between the cohorts was observed within the BC subcohorts (Fig. 3), suggesting DNAm parallelisms between PBLs and tissues in IR and BC.

Table 5 TCGA: Differentially DNA-methylated CpGs in IR significantly associated with primary invasive BC tissues, overall and stratified by BC molecular subtype
Fig. 3

Box plots for DNAm levels from EWA IR-CpGs in association with BC, shared by BC datasets according to Chr, CpG context, and gene region. (BC Breast cancer, Chr Chromosome, CpG CpG dinucleotide, DNAm DNA methylation, ER pos Estrogen receptor/progesterone receptor–positive, EWA Epigenome-wide association, IR Insulin resistance, TCGA The Cancer Genome Atlas, UTR Untranslated region, WHI Women’s Health Initiative. * Statistical significance after multiple-comparison correction). A By Chr; B By gene region; C By CpG context; D By enhancer; E By 1 CpG (cg01676795); F ER pos: By 1 CpG (cg01676795)

GSEA by IR phenotypes and by BC subtypes

Using GSEA strategies, we conducted multiple analyses of gene ontology (GO) with biologic process, cellular component, and molecular function; pathways with KEGG and Reactome; and diseases by using the DisGeNET and GLAD4U databases. In regard to IR phenotypes (Additional file 1: Tables S5.1–S5.7), GO with biologic process identified beta-catenin/T cell factor (TCF) complex assembly; its dysregulation is associated with cancer [61]. Gene-enrichment pathways were involved in glucose intolerance, transcriptional mis-regulation in cancer, IR signaling (AKT2, RSK/RAS/MAPK), and lipid metabolism. Diseases involved in the IR pathways included nutritional and metabolic diseases, DM, and obesity. For BC subtypes, GO with cellular component included a histone acetyltransferase and other transcription factors in the ER/PR + subtype. Genes were enriched in the pathways involving adipocytokine signaling and lipid metabolism in the HER2/neu– subtype and in those involving immune and insulin signaling (MAPK1/MAPK3, Rap1) in both ER/PR + and HER2/neu– subtypes (Additional file 1: Tables S5.8–S5.11).

Discussion

This is the first large population-level EWAS conducted in postmenopausal women for detecting differentially methylated CpGs in the PBLs that are associated with individual IR phenotypes and that are further prospectively evaluated for an association with BC development, both overall and in BC molecular subtypes. The methylation levels of the detected CpGs in IR and BC risk between the PBLs and the BC tissues were comparable, consistent with the findings of a gene-methylation parallelism study in glucose metabolism between peripheral blood cells and tissues [41]. This suggests that PBLs may serve as the best source of surrogate DNAm markers in non-invasive tissues, reflecting multiple interconnected glucometabolic carcinogenesis pathways.

Several EWA-CpGs in IR phenotypes detected in our study were also reported in previous studies, supporting the replicability and robustness of our findings. For example, cg19693031 in TXNIP, inversely associated with FG, FI, and IR in our study, was observed in previous studies with the same direction of association [62,63,64,65,66]. Thioredoxin-interacting protein (TXNIP) plays a key role in pancreatic beta cell biology involving oxidative stress and endothelial cell inflammation and its vascular complications [67], and it regulates glucose homeostasis by promoting fructose absorption in the small intestine [68]. The TXNIP gene is activated in both hyperglycemic animals and human adipose tissues [69], and it regulates glucometabolic pathways in human skeletal muscle [70]. Thus, our finding that a hypermethylated DNA probe in TXNIP (i.e., a negative effect on gene expression) is associated with decreased IR is supported.

Also, cg00574957 in CPT1A was negatively associated with FI and IR in both our and previous studies [63, 64, 71], showing the biological plausibility of its association with IR, including its role in obesity, metabolic syndrome, and fatty acid metabolism [72]. As this CpG is independent of nearby single-nucleotide polymorphisms (SNPs) located within 1 Mb upstream or downstream of this locus, representing rs1369 index, the decreased CPT1A expression can be caused solely by increased methylation at this CpG site [73]. CPT1A, 1 of the 3 isoforms of CPT-1, is found mostly in the liver, where it is involved in the regulation of mitochondrial fatty acid oxidation (FAO). CPT1A deficiency causes a metabolic disorder of FAO [74, 75]. A decrease in mitochondrial fatty acid uptake results in elevated intramuscular lipid levels, but upregulates glucose oxidation and improves whole-body insulin sensitivity in a mouse model [74]; this supports our finding of an inverse association between increased DNAm of the CpG (i.e., reduced gene expression) and FI/IR. However, most human gene studies have reported that this gene’s function is connected to fatty acid metabolism, not to clinical glucometabolic phenotypes, warranting a future functional study.

Similarly, cg14476101 in PHGDH was inversely associated with FI and IR in our study. Previous EWASs and Mendelian randomization studies confirmed the association between hypermethylation at that locus and lower risk of fatty liver, T2DM, and adiposity [76, 77]. Also, the role of this CpG in regulating the blood concentration of steroid hormones was upregulated by obesity [78]. Together, these findings suggest a plausible link between the PHGDH gene and lipid and adipocytokine metabolic pathways that can be altered by the methylation level of cg14476101.

In contrast, we found that cg06500161 in ABCG1 was positively associated with FI and IR. This CpG site is a well-known DNAm probe associated with glucometabolic phenotypes [30, 62, 63, 79, 80], and the gene’s expression was inversely associated with the methylation level at this CpG [30, 62]. ABCG1 is a crucial regulator of cholesterol efflux from macrophages to high-density lipoprotein (HDL); thus, suppressed gene activity by increased DNAm at this site can contribute to lowering the HDL level [81], which is a known independent risk factor for glucometabolic disorders. Also, the link between ABCG1 and T2DM/glucose traits has been reported previously in both human and animal studies [82,83,84], supporting our finding that increased DNAm at this site is associated with IR phenotypes.

Of those validated IR-genes, 3 genes (CPT1A, PHGDH, and ABCG1) were further correlated with BC risk. In particular, the ABC transporter gene (ABCG1) expression associated with cholesterol efflux in the liver results in inhibition of cell proliferation and stimulation of cell apoptosis in BC cells [85]. This highlights a potential epigenetic link between lipid–glucometabolic alteration and BC tumorigenesis and progression that deserves further study. In our study, the detected CpGs in those 3 genes were EWA-based IR-DNAm probes, which are novel with respect to their association with BC risk.

Although the methylation levels of the CpGs in relation to IR and BC that are common across Chr, CpG contexts, and the gene regions were comparable between the WHI and TCGA cohorts, only 1 individual IR-CpG (cg01676795 in POR) was common in its relationship to BC risk in both cohorts. P450 oxidoreductase (POR) gene expression has been studied in few cancer types, presenting significant overall suppression of POR expression in muscle-invasive bladder cancer [86] and differentially expressed gene proteins enriched in neutrophil and T cell activation in hepatocellular carcinoma [87]; those findings support the important role of POR in carcinogenesis via alteration of the immune tumor microenvironment. Our finding of this CpG in POR in association with BC risk is novel, which calls for a future study on the methylation in this gene linked to BC by taking into account the effects of nearby SNPs.

Our analysis for BC risk in TCGA included BC tissues and adjacent normal tissues. Different findings could result from an analysis comparing BC tissues with normal tissues obtained from patients without BC, although we adjusted for tumor purity in the analysis. A few DNAm probes from TCGA presented an extreme risk magnitude, warranting a further replication study with a larger independent dataset. To increase the comparability of analyses between the 2 cohorts, our study did not account for lifestyle factors in a comprehensive fashion and did not consider interactions with DNAm, which may affect the relationships between DNAm, IR, and BC. The validation data reflect a small fraction of the 2 ASs (BAA23; AS311) owing to the limited availability of IR phenotypes, resulting in reduced statistical power. In addition, given that each AS had its own study purpose, samples selected for our study may not fully represent the source population. Finally, our study population was confined to white postmenopausal women, so the generalizability of our results to other populations is limited.

Conclusions

In conclusion, we found several differentially methylated CpGs, both well-established and novel, at the epigenome-wide level in relation to IR that were further correlated with BC development. Our findings warrant further validation in larger, independent epigenetic and mechanistic studies. Our study contributes to a better understanding of the interconnected molecular pathways on the methylome between glucose intolerance and BC carcinogenesis and suggests the potential use of DNAm markers in PBLs as preventive targets for detecting an at-risk group for IR and BC among postmenopausal women.