Background

With an estimated 40,000 deaths in 2017, breast cancer is the second leading cause of cancer-related death in the United States [1]. Women who engage in physical activity prior to the diagnosis of breast cancer have better overall survival than those who do not [2], but the mechanisms of this association are unknown. Given that only 20% of the U.S. population achieves the Centers for Disease Control and Prevention’s physical activity guidelines [3], improved understanding of how physical activity influences breast cancer prognosis could have significant public health impact.

Epigenetics is the study of functionally relevant changes to the genome that do not involve a change in the nucleotide sequence. DNA methylation is the most extensively studied epigenetic modification and involves the addition or removal of methyl (-CH3) groups at CpG dinucleotides that influence gene regulation [4]. DNA methylation can be measured in a range of tissues, including tumor and blood [5], and has been associated with breast cancer prognosis in several studies, including our own [6, 7]. Although methylation signatures are largely established during embryogenesis [8], DNA methylation (and other features of the epigenome) may be modified throughout the life course as a result of both behavioral and environmental stimuli [9], including physical activity [10]. Interactions between the environment and DNA methylation may, therefore, inform prognostic outcomes among women diagnosed with breast cancer.

In a population-based sample of women diagnosed with first primary breast cancer, we aimed to understand whether the association between prediagnostic recreational physical activity (RPA) and all-cause or breast cancer-specific mortality was modified by gene promoter methylation (which regulates gene expression) in a panel of 13 breast cancer-related genes (APC, BRCA1, CCND2, CDH1, DAPK1, ESR1, GSTP1, HIN1, CDKN2A, PGR, RARβ, RASSF1A, and TWIST1) measured in tumor tissue. Similarly, we sought to determine whether the RPA-mortality association was modified by global DNA methylation (a marker of genome stability) using two methods to assess white blood cell methylation: long interspersed nucleotide element 1 (LINE-1), which approximates levels in repetitive elements [11], and the luminometric methylation assay (LUMA), which estimates methylation at CCGG sites [12]. We hypothesized that methylation of oncogenes (or lack of methylation in tumor suppressor genes) and high prediagnostic physical activity engagement would result in lower all-cause and breast cancer-specific mortality among women diagnosed with first primary breast cancer. We also hypothesized that physical activity and low LUMA (high LINE-1) would work in synergy to reduce mortality following a breast cancer diagnosis.

Methods

For this project, we used resources from the follow-up component of the Long Island Breast Cancer Study Project (LIBCSP), a population-based study. Details of the study design and participants for this component have been described previously [13, 14].

Study participants

Eligible participants in the LIBCSP follow-up study were English-speaking female residents of Nassau and Suffolk counties on Long Island, NY, USA, who were newly diagnosed with a first primary in situ or invasive breast cancer between 1 August 1996 and 31 July 1997. Potentially eligible subjects were identified through daily or weekly contact with pathology departments of all 28 hospitals on Long Island and 3 tertiary care hospitals in New York City. At diagnosis, the 1508 women with breast cancer were aged 20–98 years, predominately postmenopausal (67%) and white (94%), which is consistent with the underlying racial/ethnic distribution in these two New York counties at the time of data collection.

Data collection

Recreational physical activity and other covariates

Approximately 2–3 months after diagnosis, women were interviewed at home by trained interviewers using structured questionnaires. As part of this baseline (on average 100-minute) interview, RPA was assessed using a modified instrument developed by Bernstein and colleagues for epidemiologic studies of breast cancer [15]. RPA from menarche to diagnosis was used to estimate lifetime RPA, and women were classified as inactive, low RPA (<6.36 h/week), and high RPA (≥6.36 h/week) on the basis of the median for the entire cohort as previously described [16]. During the baseline interview, participants were additionally queried on their demographic characteristics (including age, race/ethnicity, income, and education), lifestyle characteristics (including cigarette smoking and body size), medical histories (including family history of breast cancer, exogenous hormone use, and mammography screening), and other breast cancer-related factors as previously described [13, 14].
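The three-level RPA classification described above can be illustrated with a minimal sketch. The 6.36 h/week cutoff comes from the text; treating "inactive" as zero reported hours per week is an assumption made only for illustration, and the function name is hypothetical.

```python
RPA_CUTOFF_HOURS = 6.36  # hours/week, the median cutoff reported in the text


def classify_rpa(hours_per_week: float) -> str:
    """Return 'inactive', 'low', or 'high' for average lifetime RPA.

    Assumption: 'inactive' means zero reported hours per week.
    """
    if hours_per_week < 0:
        raise ValueError("hours_per_week must be non-negative")
    if hours_per_week == 0:
        return "inactive"
    if hours_per_week < RPA_CUTOFF_HOURS:
        return "low"   # < 6.36 h/week
    return "high"      # >= 6.36 h/week
```

For example, `classify_rpa(3.0)` falls in the low category and `classify_rpa(6.36)` in the high category.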

Medical records data

Medical records were abstracted at baseline and again approximately 5 years later to determine tumor characteristics (e.g., estrogen receptor [ER]/progesterone receptor [PR] status, tumor size, and nodal involvement) as well as the first course of treatment for the first primary breast cancer diagnosis.

Gene-specific promoter methylation

DNA extraction from archived formalin-fixed, paraffin-embedded tumor tissue of the first primary breast cancer was performed as previously described [17]. Among the 975 women with archived tumor tissue, 807 (82.8%) had available gene promoter methylation data. The 807 women with tumor methylation data did not differ from the 1508 eligible women on most demographic and clinical characteristics. Women with tumor methylation data were more likely to have nodal involvement and invasive cancer (data not shown), which reflects the amount of tumor material available for methylation analyses.

Thirteen genes known to be involved in breast carcinogenesis, and frequently methylated in promoter regions, were selected for assessing interactions with RPA. Promoter methylation of ESR1, PGR, and BRCA1 was determined by methylation-specific (MSP) polymerase chain reaction (PCR) and was dichotomized (i.e., methylated vs. unmethylated) on the basis of the presence or absence of the PCR band [17, 18]. The methylation status of the ten remaining genes was assessed by the MethyLight assay (Qiagen, Valencia, CA, USA) [19, 20]. The percentage of methylation was calculated by the comparative cycle threshold method as $2^{-\Delta\Delta C_T} \times 100$, where $\Delta\Delta C_T = (C_{T,\mathrm{Target}} - C_{T,\mathrm{Actin}})_{\mathrm{sample}} - (C_{T,\mathrm{Target}} - C_{T,\mathrm{Actin}})_{\mathrm{fully\ methylated\ DNA}}$ [21]. Using a 4% cutoff, we dichotomized into methylated or unmethylated cases as previously reported [22].
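As a worked illustration of the comparative cycle threshold calculation and the 4% cutoff, the following sketch computes percent methylation from four CT values. The function and argument names are hypothetical, the example CT values are invented, and treating a value exactly at 4% as methylated is an assumption, because the text does not say whether the cutoff is inclusive.

```python
def percent_methylation(ct_target_sample: float, ct_actin_sample: float,
                        ct_target_ref: float, ct_actin_ref: float) -> float:
    """Percent methylation via the comparative CT (2^-ddCT) method.

    'ref' is the fully methylated reference DNA; actin is the control gene.
    """
    delta_sample = ct_target_sample - ct_actin_sample
    delta_ref = ct_target_ref - ct_actin_ref
    ddct = delta_sample - delta_ref
    return 100.0 * 2.0 ** (-ddct)


def is_methylated(pct: float, cutoff: float = 4.0) -> bool:
    """Dichotomize at the 4% cutoff (inclusive cutoff is an assumption)."""
    return pct >= cutoff


# Invented CT values: sample delta = 5, reference delta = 1, so ddCT = 4.
pct = percent_methylation(30.0, 25.0, 26.0, 25.0)
print(pct, is_methylated(pct))
```

With these invented inputs, ΔΔCT is 4, so the percentage is 100 × 2⁻⁴ = 6.25, which exceeds the 4% cutoff.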

Global methylation

For 1102 (73.1%) of the women with breast cancer, trained phlebotomists obtained a nonfasting 40-ml blood sample at the baseline interview, and DNA was isolated as previously described [23]. LUMA and LINE-1 assessment in the LIBCSP has been described in detail previously [12]. Briefly, LUMA was carried out according to the modified protocol described by Bjornsson et al. [24] and was expressed as a percentage based on the following equation: methylation (%) = [1 − (HpaII ΣG/ΣT)/(MspI ΣG/ΣT)] × 100 [24]. Four CpG sites in the promoter region of LINE-1 were assessed using a prevalidated pyrosequencing-based methylation assay [19] and were individually analyzed as a T/C single-nucleotide polymorphism using Q-CpG software (Qiagen). These data were subsequently averaged to provide an overall percentage 5-methylcytosine status.
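The two global methylation summaries above reduce to simple arithmetic, sketched below. Reading ΣG and ΣT as the summed G and T signals from each digest is an interpretation of the notation in the equation, and the function names and example values are hypothetical.

```python
from typing import Sequence


def luma_methylation_percent(hpaii_g: float, hpaii_t: float,
                             mspi_g: float, mspi_t: float) -> float:
    """LUMA methylation (%) = [1 - (HpaII G/T) / (MspI G/T)] x 100."""
    if hpaii_t == 0 or mspi_t == 0 or mspi_g == 0:
        raise ValueError("T signals and the MspI G signal must be non-zero")
    hpaii_ratio = hpaii_g / hpaii_t
    mspi_ratio = mspi_g / mspi_t
    return (1.0 - hpaii_ratio / mspi_ratio) * 100.0


def line1_methylation_percent(site_percents: Sequence[float]) -> float:
    """Average the per-CpG-site percentages (four sites in the text)."""
    if not site_percents:
        raise ValueError("at least one CpG site value is required")
    return sum(site_percents) / len(site_percents)


# Invented signals: HpaII ratio 0.2, MspI ratio 0.5, so about 60% methylation.
print(luma_methylation_percent(20.0, 100.0, 50.0, 100.0))
print(line1_methylation_percent([78.0, 80.0, 76.0, 82.0]))
```

In the study, both markers were then dichotomized at the median for the interaction analyses.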

Mortality

We used the National Death Index to determine vital status through the end of 2011 as previously reported [25]. After approximately 14.7 (0.2–15.4) years of follow-up, among the 1254 patients with any gene-specific (range n = 726–803 women with gene promoter methylation status) or global methylation (range n = 1005–1015 women with LUMA or LINE-1) assessments and complete RPA data, we identified 421 who died as a result of any cause, of which 186 deaths were breast cancer-related (determined using International Classification of Diseases code 174.9 or C50.9).

Statistical analysis

We used Cox proportional hazards regression [26] to estimate HRs and 95% CIs for the association between RPA, methylation status (global and gene-specific), and mortality (all-cause and breast cancer-specific) among 1254 women with any methylation biomarker and complete RPA assessment. The 1254 women with breast cancer did not meaningfully differ from the original 1508 who were eligible, except that they were more likely to have nodal involvement and invasive cancer, which relate to the amount of tumor material that would be available for assay. All statistical tests were two-sided (a priori significance level of 0.05). The proportional hazards assumption was assessed using exposure interactions with log-time [26]. We observed no violations of the proportional hazards assumption with the 13 breast cancer-related genes, global methylation markers, or RPA.

For interaction analyses, we assessed RPA using a three-level classification based on the median level among active participants: inactive, low RPA (<6.36 h/week), and high RPA (≥6.36 h/week). As detailed above, methylation of gene promoters was classified as methylated or unmethylated using a 4% cutoff, and global methylation markers (LUMA and LINE-1) were dichotomized at the median. Effect measure modification on the multiplicative scale between RPA and methylation was evaluated using the likelihood ratio test with a 0.05 significance level [27].
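The likelihood ratio test for multiplicative interaction compares a model with RPA-by-methylation product terms against one without them. The sketch below assumes the log-likelihoods come from already-fitted Cox models (the values shown are invented) and that three-level RPA enters as two indicator variables, so the product terms with a binary methylation marker contribute 2 degrees of freedom. The chi-square tail probability uses the closed form that holds for even degrees of freedom only.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    if x < 0:
        raise ValueError("x must be non-negative")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k          # (x/2)^k / k!
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(loglik_reduced: float, loglik_full: float, df: int):
    """Return (LRT statistic, p value) for nested models."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    return stat, chi2_sf_even_df(max(stat, 0.0), df)


# Invented log-likelihoods for models without and with product terms.
stat, p = likelihood_ratio_test(-1000.0, -997.0, df=2)
print(stat, p)  # interaction is declared if p < 0.05
```

For 2 degrees of freedom the tail probability is exp(−statistic/2), so a statistic of 6.0 gives a p value just under 0.05.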

All models were initially adjusted for age at diagnosis. We further considered inclusion of family history of breast cancer (yes/no), history of benign breast disease (yes/no), cigarette smoking (ever/never), race (white, black, other), and body mass index (BMI; <25.0 kg/m², 25.0–29.9 kg/m², ≥30 kg/m²). Covariates were removed from the multivariate model using backward elimination. Variables remained in the final model if their exclusion changed the effect estimate by >10% [28]. None of these covariates met our criteria, and thus all models were adjusted for age at diagnosis only.
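The change-in-estimate rule used during backward elimination can be sketched as follows. The HRs are invented, the function names are hypothetical, and computing the change on the HR scale relative to the fuller model (rather than on the log-HR scale) is an assumption, because the text does not specify the scale.

```python
def percent_change_in_estimate(hr_full: float, hr_reduced: float) -> float:
    """Percent change in the HR when a covariate is dropped from the model."""
    if hr_full <= 0 or hr_reduced <= 0:
        raise ValueError("hazard ratios must be positive")
    return abs(hr_reduced - hr_full) / hr_full * 100.0


def retain_covariate(hr_full: float, hr_reduced: float,
                     threshold_pct: float = 10.0) -> bool:
    """Keep the covariate only if dropping it changes the HR by > 10%."""
    return percent_change_in_estimate(hr_full, hr_reduced) > threshold_pct


# Invented example: HR for RPA is 0.70 with the covariate, 0.72 without it.
print(retain_covariate(0.70, 0.72))  # change is about 2.9%, so it is dropped
```

Applied to each candidate covariate in turn, this rule left only age at diagnosis in the final models reported here.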

When constructing our models, we did not consider tumor characteristics (e.g., tumor stage, grade, size, and nodal involvement) or hormone receptor status as potential confounders of the association between RPA, methylation, and mortality. These covariates are on the causal pathway between prediagnostic RPA and mortality, and adjustment for a causal intermediate would result in biased parametric estimates [29, 30]. Although our study population includes women with invasive (84%) and in situ (16%) breast cancer, our findings restricted to invasive tumors did not vary substantially from those among all women, likely owing to the lower proportion of in situ cases in our study population. We therefore considered both invasive and noninvasive cases in these analyses. All statistical analyses were performed using SAS statistical software version 9.4 (SAS Institute, Cary, NC, USA).

Results

Distribution of clinical characteristics

The distribution of clinical characteristics by RPA category among the 1254 women with breast cancer included in this study is provided in Table 1. The distribution of clinical characteristics by outcome (all-cause and breast cancer-specific mortality) is available in Additional file 1: Table S1. Women who engaged in RPA across the life course tended to have younger age at diagnosis and a lower BMI, and they were slightly less likely to have nodal involvement. We found little difference in other clinical characteristics (i.e., ER or PR status) among physically active women compared with inactive women.

Table 1 Distribution of clinical characteristics by recreational physical activity category among the 1254 participants with any information on methylation (gene-specific and/or global) and lifetime physical activity in a population-based cohort of women diagnosed with first primary breast cancer, Long Island Breast Cancer Study Project

Associations between RPA and all-cause and breast cancer-specific mortality

In Table 2, we provide effect estimates for the association between prediagnostic lifetime RPA and mortality after approximately 15 years of follow-up among our LIBCSP cohort of 1254 women newly diagnosed with first primary breast cancer in 1996–1997. The association between lifetime RPA and mortality among the entire cohort of 1508 women with breast cancer with follow-up through 2002 was previously reported [14]; follow-up has now been updated and extended through 2011. Our updated estimates showing inverse associations with both all-cause and breast cancer-specific mortality are similar to the earlier reported estimates in the LIBCSP based on 5 years of follow-up. The biological relevance and function of the 13 genes investigated in this study [31] are summarized in Additional file 2: Table S2, along with previously reported associations with RPA [32] and breast cancer-specific mortality [33].

Table 2 Age-adjusted HRs and 95% CIs for the association between lifetime recreational physical activity and 15-year all-cause and breast cancer-specific mortality among a population-based sample of 1254 women with a first primary breast cancer, Long Island Breast Cancer Study Project

Associations between RPA, DNA methylation, and mortality

As shown in Table 3, all-cause mortality following a breast cancer diagnosis was lower among active women (≥6.36 h/week of RPA) with breast tumor promoter methylation in APC (HR 0.60, 95% CI 0.40–0.80), CCND2 (HR 0.56, 95% CI 0.32–0.99), HIN1 (HR 0.55, 95% CI 0.38–0.80), and TWIST1 (HR 0.28, 95% CI 0.14–0.56), but not among active women with unmethylated tumors (p < 0.05 for multiplicative interaction). Overall, we found substantially lower risk of all-cause mortality among women with any RPA and methylated gene promoters than among active women with unmethylated promoters (Fig. 1). For example, we observed an almost 50% lower risk of death as a result of all causes among very active women with methylated HIN1 promoter (HR 0.55, 95% CI 0.38–0.80). In contrast, there was no corresponding risk reduction for RPA among those with unmethylated HIN1 promoter (HR 1.09, 95% CI 0.61–1.81). We observed similar patterns of association for breast cancer-specific mortality, although the interaction was not statistically significant (RPA HR 0.96, 95% CI 0.40–2.29 for unmethylated HIN1 vs. RPA HR 0.52, 95% CI 0.30–0.90 for methylated HIN1; multiplicative interaction p = 0.066). We did not observe an interaction between RPA, APC methylation, and breast cancer-specific mortality (p = 0.138). For CCND2 and TWIST1, we were unable to evaluate effect modification owing to small cells.

Table 3 Age-adjusted HRs and 95% CIs for the association between lifetime recreational physical activity and 15-year all-cause and breast cancer-specific mortality stratified by gene methylation status (methylated vs. unmethylated tumors) among 803 women diagnosed with a first primary breast cancer and with available gene promoter methylation data, Long Island Breast Cancer Study Project

Fig. 1 Age-adjusted HRs and 95% CIs for the association between lifetime recreational physical activity (RPA) and 15-year all-cause and breast cancer-specific mortality, stratified by gene methylation status (methylated vs. unmethylated tumors), among 803 women diagnosed with a first primary breast cancer and with available gene promoter methylation data, Long Island Breast Cancer Study Project. Closed circles = low RPA (<6.36 h/week). Open circles = high RPA (≥6.36 h/week). Reference group: inactive women (HR 1.0; data point not shown)

When we restricted our analyses to women with hormone receptor-positive breast cancer only (defined as any ER- or PR-positive), our estimates became less precise but were similar in effect size, and most interactions persisted (Additional file 3: Table S3). The association between RPA and mortality among women with breast cancer was not modified by global methylation markers, LINE-1, or LUMA (Additional file 4: Table S4).

Discussion

In this population-based follow-up study of 1254 women diagnosed with first primary breast cancer, we found that the overall improved survival among women with any lifetime prediagnostic RPA appeared to be modified by gene-specific methylation profiles. We observed substantially improved survival with high lifetime prediagnostic RPA in women with a tumor-methylated APC, CCND2, HIN1, or TWIST1 gene promoter compared with active women with unmethylated gene promoters. A more pronounced risk reduction was observed for breast cancer-specific mortality for the interaction with HIN1; however, we were unable to evaluate interactions with breast cancer-specific mortality for CCND2 and TWIST1 owing to small numbers. We found no interaction between RPA and global methylation as measured by LINE-1 and LUMA. Our findings suggest that the inverse association between RPA and mortality after breast cancer may depend upon gene-specific methylation profiles.

Improved survival with RPA among women with breast cancer has been observed in many epidemiologic studies [2], including our own [14]. Also, we [6] and others have reported associations between gene-specific methylation and prognosis [34]. However, to our knowledge, no previous investigation has considered gene methylation as a potential modifier of the RPA-mortality association, despite strong biologic plausibility. Not only does physical activity reduce adiposity and its numerous metabolic correlates, but it is itself thought to reduce markers of inflammation, alter immune functioning, and lower circulating insulin [35, 36]. These pathways have been linked to aberrant DNA methylation, altering several genes implicated in breast carcinogenesis [37–40]. Collectively, these data suggest that the mechanisms underlying the inverse association between RPA and mortality may be facilitated and/or altered by inflammation-related methylation changes.

In the present study, we found that the improved survival after breast cancer with high RPA was greatest among patients with methylated APC, CCND2, HIN1, and TWIST1 promoters. APC and HIN1 are candidate tumor suppressors thought to be involved in breast carcinogenesis [41, 42]. Our observation of decreased mortality among very active women with APC or HIN1 methylation is counter to our a priori hypothesis of lowered risk of death among active women with unmethylated (active) tumor suppressor genes, although we did observe that in the highest RPA group, there was no statistical difference in the effect among APC methylated and unmethylated cases. We observed risk reductions at both low and high RPA engagement among women with methylated HIN1 promoters. Methylation of HIN1 is linked to gene silencing, reduced expression, and loss of apoptosis [43]. HIN1 is a putative growth inhibitory cytokine thought to be inactivated at the earliest stages of breast tumorigenesis and silenced in the majority of sporadic breast carcinomas [44]. This may suggest that prediagnostic RPA could help overcome the deleterious effects of HIN1 inactivation in breast carcinogenesis, thereby improving survival outcomes.

The exact roles of CCND2 and TWIST1 in breast cancer are unresolved. CCND2 is important in cell cycle regulation and has been cited as both a tumor suppressor gene [45] and an oncogene [46]. Inactivation of CCND2 is thought to occur via promoter hypermethylation, which may be an early, though infrequent (about 11%), event in malignant breast cancer transformation [45, 47, 48]. In our study, we found lower all-cause mortality among women with promoter methylation (or loss of CCND2 expression) in tumor tissue. This may reflect synergy between physical exercise and inactivation of the CCND2 oncogene, particularly among women with low RPA. TWIST1 is an antiapoptotic and prometastatic transcription factor, and methylation of the gene promoter has been observed frequently in malignant breast tissue [49]. We observed pronounced reductions in all-cause mortality among physically active patients with TWIST1 methylation, which is consistent with our a priori hypothesis of synergy between the presumptive oncogene and RPA.

Although our population-based study of women with breast cancer was carefully conducted and included comprehensive exposure assessment and a long follow-up time, several potential limitations should be considered. First, information on RPA was collected systematically by trained interviewers [13]; nonetheless, there is potential for nondifferential measurement error, which would result in reduced effect estimates. However, LIBCSP investigators used a comprehensive, open-ended approach to query women on their lifetime RPA habits. This approach has been shown to elucidate important relationships between RPA and breast cancer in the LIBCSP [14, 16] and is consistent with other findings [50]. Second, postdiagnostic RPA, which likely influences prognosis [51], was not considered in this investigation, owing to small cells after stratification by methylation status. Third, we were limited to a panel of 13 biologically relevant genes and 2 global assays. However, studies employing more robust panels that interrogate hundreds of thousands of CpG sites are at high risk for false-positives, generally lack biologically driven hypotheses, and perform modestly using archived tumor samples [52, 53]. Additionally, we were limited by the use of conventional MSP PCR assays for three of the genes. However, whereas quantitative MSP PCR assays have the advantage of providing a quantitative estimate of methylation, the conventional MSP assay is a highly sensitive method to classify individuals by methylation status, which mitigates the threat of biased results [54, 55]. Finally, the racial homogeneity of our study population restricted our ability to explore potential variation by intrinsic subtype or by race, both of which are known to associate with prognostic outcomes [56]. Nonetheless, the largest hormonal subtype of breast cancer diagnosed among U.S. women of any race is ER+PR+ [57], which continues to increase with time [58] and is the predominant subtype of breast cancer diagnosed among LIBCSP study participants. When we restricted our findings to women with only hormone-responsive breast tumors, results were similar to those for all women.

Conclusions

To our knowledge, we are the first to show, using resources from a population-based follow-up study, that promoter methylation of APC, CCND2, HIN1, and TWIST1 may modify the inverse association between prediagnostic RPA and all-cause mortality following a breast cancer diagnosis. With the exception of HIN1, for which there was a suggestive interaction with RPA for breast cancer-specific mortality, power was limited for examining potential modification of the association between RPA and breast cancer-specific mortality. Although our results require confirmation in cohort studies with a larger number of women with breast cancer and more comprehensive gene coverage, they suggest that DNA methylation may play an important role in associations between physical activity and improved survival among women with breast cancer.