A genome-wide association study to identify genetic susceptibility loci that modify ductal and lobular postmenopausal breast cancer risk associated with menopausal hormone therapy use: a two-stage design with replication
Menopausal hormone therapy (MHT) is associated with an elevated risk of breast cancer in postmenopausal women. To identify genetic loci that modify breast cancer risk related to MHT use in postmenopausal women, we conducted a two-stage genome-wide association study (GWAS) with replication. In stage I, we performed a case-only GWAS in 731 invasive breast cancer cases from the German case-control study Mammary Carcinoma Risk Factor Investigation (MARIE). The 1,200 single nucleotide polymorphisms (SNPs) showing the lowest P values for interaction with current MHT use (within 6 months prior to breast cancer diagnosis) were carried forward to stage II, involving pooled case-control analyses including additional MARIE subjects (1,375 cases, 1,974 controls) as well as 795 cases and 764 controls of a Swedish case-control study. A joint P value was calculated for a combined analysis of stages I and II. Replication of the most significant interaction of the combined stage I and II was performed using 5,795 cases and 5,390 controls from nine studies of the Breast Cancer Association Consortium (BCAC). The combined stage I and II analysis yielded five SNPs on chromosomes 2, 7, and 18 with joint P values <6 × 10−6 for effect modification of current MHT use. The most significant interaction was observed for rs6707272 (P = 3 × 10−7) on chromosome 2 but was not replicated in the BCAC studies (P = 0.21). The potentially modifying SNPs are in strong linkage disequilibrium with SNPs in TRIP12 and DNER on chromosome 2 and SETBP1 on chromosome 18, previously linked to carcinogenesis. However, none of the interaction effects reached genome-wide significance. The inability to replicate the top SNP × MHT interaction may be due to limited power of the replication phase. Our study, however, suggests that there are unlikely to be SNPs that interact strongly enough with MHT use to be clinically significant in European women.
Keywords: Postmenopausal breast cancer risk; Menopausal hormone therapy; Polymorphisms; Gene-environment interaction; Genome-wide association study; Case-only study
Abbreviations
ARF: Alternate reading frame of the INK4a/CDKN2A locus
BCAC: Breast Cancer Association Consortium
CGEMS: Cancer Genetic Markers of Susceptibility Project
DNER: Delta/notch-like epidermal growth factor-like repeat containing
EPT: Estrogen-progestagen combined therapy
FBXO36: F-box protein 36
IBS: Identical by state
MAF: Minor allele frequency
MALDI-TOF MS: Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry
MARIE: Mammary Carcinoma Risk Factor Investigation
MHT: Menopausal hormone therapy
SASBAC: Singapore and Sweden Breast Cancer Study
SET: Suppressor of variegation, enhancer of zeste, and Trithorax
SETBP1: SET binding protein
SNP: Single nucleotide polymorphism
TRIP12: Thyroid hormone receptor interactor 12
Menopausal hormone therapy (MHT) is associated with an elevated risk of breast cancer in postmenopausal women [1, 2]. Estrogen and its metabolic compounds are involved in breast carcinogenesis through their influence on cell growth and proliferation [3, 4]. Progesterone increases breast cancer risk by provoking an elevated formation of estradiol through induction of 17 beta-hydroxysteroid dehydrogenase and by regulating estrogen receptor levels in breast cancer cells through progesterone metabolites. Furthermore, progesterone influences breast carcinogenesis directly by regulating proliferation, apoptosis, and cell adhesion in breast cells [7, 8]. Current use of any MHT compared to never use is associated with about 1.5 times higher breast cancer risk, whereas past use does not show any association [1, 9, 10]. The magnitude of risk varies with histological type of breast cancer, being more than twofold higher for lobular/tubular than for ductal invasive cancer. In a population-based case-control study, we found the odds ratios (ORs) of association between current MHT use and risk of invasive breast cancer to be 2.95 for lobular, 3.75 for tubular, and 1.39 for ductal tumors. Risk also differs by type of therapy and is higher for combined estrogen-progestagen therapy (EPT) than estrogen-only therapy (ET) [9–17]. For example, a meta-analysis reported ORs of 1.39 and 1.16 for the association of breast cancer with EPT and ET, respectively.
Even though the prevalence of MHT use decreased substantially after publication of the results of the Women’s Health Initiative trial in 2002, postmenopausal MHT remains relevant in terms of prevalence of use (e.g., ET/EPT prevalence reported for Germany: 3/5 %) and is an effective therapy for the treatment of menopausal symptoms [18–21]. Thus, the identification of potential genetic modifiers of MHT-associated breast cancer risk can be of great importance from a clinical (personalized medicine) as well as a public health point of view. Furthermore, from a scientific perspective, the detection of such genetic modifiers may help to elucidate the mechanisms underlying the involvement of sex hormones in breast carcinogenesis and may additionally improve risk prediction models. Previous studies showed that functionally relevant polymorphisms in genes involved in steroid metabolism may influence hormone levels [22, 23], which are associated with breast cancer risk. Thus, risk after exposure to MHT may vary according to individual susceptibility. However, only a limited number of studies have examined this and no established interactions between single nucleotide polymorphisms (SNPs) and MHT use exist to date [24–35]. For example, initial findings of interaction between variants in the established breast cancer susceptibility gene FGFR2 and MHT could not be confirmed in subsequent studies [24, 25, 27–29, 35].
Case-only studies are more efficient than case-control studies for detecting gene-environment interactions if the genetic and environmental factors are not associated in the population from which the cases were drawn. To efficiently identify novel genetic loci that modify breast cancer risk associated with current MHT use in postmenopausal women, we conducted a two-stage genome-wide association study (GWAS), consisting of a case-only GWAS and a case-control analysis. We attempted to replicate the most significant interactions from the combined stage I and II in nine studies within the Breast Cancer Association Consortium (BCAC; http://www.srl.cam.ac.uk/consortia/bcac/index.html). Additionally, fine mapping of the three most interesting loci from the two-stage GWAS was conducted.
Subjects and methods
Study population and data collection
The GWAS was carried out within the Mammary Carcinoma Risk Factor Investigation (MARIE) study. The MARIE study is a population-based case-control study of primarily European postmenopausal women carried out in two regions in Germany with incident breast cancer cases and controls matched by birth year and study region.
For the GWAS first stage, 500 ductal and 300 lobular tumor cases were randomly selected from the breast cancer patients with known age at menopause from the MARIE study, whereby lobular cases were over-sampled.
Stage II comprised 1,375 additional cases and 1,974 matched controls from the MARIE study (excluding those cases already included in stage I) as well as a random subset of 795 cases and 764 controls of European origin from the Singapore and Sweden Breast Cancer Study (SASBAC). SASBAC is a subset of the Swedish nationwide population-based case-control study, including 1,801 incident cases and 1,712 controls, aged 50–74 years.
The replication study was conducted using nine case-control studies with information on current MHT use and genotype data within BCAC. Thus, a total of 5,795 cases and 5,390 controls of the BCAC studies BBCS, CECILE, GENICA, KBCP, MCBCS, NHS, SASBAC, TWBCS, and UCIBCS (see Supplementary Table 1 for an explanation of acronyms and study details) were included in the replication stage. For the SASBAC study, only subjects not included in stage II were incorporated in the replication stage.
Fine mapping was performed in all MARIE participants with available blood sample (2,516 cases, 4,854 controls) including those involved in stages I and II.
Supplementary Table 1 contains further information on these studies.
Menopausal hormone therapy exposure definition
MHT use was defined as use of any type of menopausal hormone replacement therapy, including EPT and ET. Only women using MHT for more than 3 months were considered to be ever users of MHT. The available data on MHT use from MARIE and SASBAC were used to determine the categories “current use of any MHT” and “past/never use of any MHT”, where current use was defined as use within the last six months before reference date (i.e., date of diagnosis for cases, date of interview for controls). Current use of EPT was accordingly defined as “current use of any combination of estrogens and progestagens” versus “past/never use of EPT”. From the other BCAC studies, data on current versus past/never MHT and EPT use were centrally collected according to the BCAC data dictionary and underwent quality assurance.
In stage I, 318,237 SNPs were genotyped with the Illumina HumanCNV370-Duo chip (370 K) on quality-checked DNA samples from 799 MARIE cases (one sample was excluded due to low DNA concentration). Chips were scanned with BeadArray Readers and genotypes were called with BeadStudio 2.0 (Illumina). Samples with completion rate <90 % were assayed a second time. Twenty-three samples were subsequently excluded due to failure in the second run. Fifteen pairs of individuals with identical by state (IBS) score >0.8 were excluded from the analysis. There was no significant evidence of population stratification, i.e., the difference between IBS scores of groups defined by MHT exposure was not significant (P = 0.33). A further 15 individuals with >10 % missing genotypes were removed, and the genotyping rate in the remaining individuals was 99 %. Additionally, SNPs were excluded if minor allele frequencies (MAFs) were <0.01 (224 SNPs excluded) or if the rate of missing genotypes exceeded 10 % (1,043 SNPs). Thus, 731 cases and 316,974 SNPs remained for statistical analyses.
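The SNP-level exclusion criteria stated above (MAF <0.01, missing genotype rate >10 %) can be illustrated with a minimal sketch. It is not the pipeline used in the study: the function names and the genotype encoding (0, 1, or 2 copies of one allele, None for a failed call) are assumptions made for illustration only.

```python
def maf_and_missing_rate(genotypes):
    """Return (minor allele frequency, missing rate) for one SNP.

    genotypes: list of 0/1/2 allele counts, with None for a failed call
    (an assumed encoding, not taken from the study).
    """
    called = [g for g in genotypes if g is not None]
    missing_rate = 1 - len(called) / len(genotypes)
    if not called:
        return 0.0, missing_rate
    freq = sum(called) / (2 * len(called))  # frequency of the counted allele
    return min(freq, 1 - freq), missing_rate


def passes_snp_qc(genotypes, min_maf=0.01, max_missing=0.10):
    """Keep a SNP only if MAF >= 0.01 and missing rate <= 10 %."""
    maf, missing_rate = maf_and_missing_rate(genotypes)
    return maf >= min_maf and missing_rate <= max_missing
```

For instance, a monomorphic SNP or a SNP with 20 % failed calls would be rejected by `passes_snp_qc`, while a SNP with a MAF of 0.05 and complete calls would be kept.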
Genotyping capacities allowed for a total of 1,200 SNPs to be genotyped in stage II. Therefore, the 1,200 SNPs with the lowest P values for statistical interaction with current MHT (SNP × MHT interaction) in stage I were genotyped in the 3,349 MARIE participants using the Illumina GoldenGate Assay (Illumina, San Diego, CA). After excluding 101 SNPs that failed, the genotyping rate in the remaining individuals was 99.8 %. Two SNPs were excluded due to genotyping rate <90 %. We did not exclude SNPs based on MAFs (MAF < 0.05 for 10 SNPs) or deviation from Hardy–Weinberg equilibrium (HWE) (P < 0.01 for 23 SNPs). No individuals were excluded due to missing genotype data since all individuals were genotyped for >90 % of the SNPs. The SASBAC participants had already been genotyped with the Illumina HumanHap550. Quality control (QC) checked genotype data were provided for 1,154 of the 1,200 SNPs. Of the initial 1,200 SNPs, only SNPs that were successfully genotyped and also passed the QC in MARIE (1,097 SNPs) as well as SASBAC (1,154 SNPs) were eligible for inclusion in our analyses, i.e., 99 SNPs were discordant. Thus, a total of 1,055 concordant SNPs were included in the stage II analysis for a total of 2,170 cases and 2,738 controls.
For a case–control study nested within the NHS study (1,090 cases, 1,078 controls), which was involved in the replication phase, genotypes of the most significant SNP of the combined stage I and II were obtained from a GWAS conducted with the Illumina HumanHap500 as part of the Cancer Genetic Markers of Susceptibility Project (CGEMS) (for further information on genotyping and QC see ).
For the remaining eight BCAC studies that were incorporated in the replication phase (i.e., all nine BCAC studies except for the NHS), the most significant SNP of the combined stage I and II was genotyped along with other SNPs in BCAC by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) using iPLEX technology (www.sequenom.com). Standard QC guidelines were applied. Data were excluded for: (1) Any sample that consistently failed genotyping for >20 % of the SNPs; (2) All study data for any SNP with overall call rate <95 % or duplicate concordance <98 % (based on at least 2 % of samples in each study being genotyped in duplicate); (3) For any study with evidence for departure of SNP genotype distribution from HWE in controls (P < 0.005). Clustering of the intensity plots was reviewed manually and the data excluded if the clusters were poorly separated. In addition, all genotyping centres assayed an identical plate of 80 control DNA samples from the Coriell Institute for Medical Research, Camden, NJ, USA (referred to as the Coriell plate; which also included 14 internal duplicates) and had to achieve call rates and duplicate concordance >98 % in order for their data to be included. For these eight BCAC studies (4,705 cases, 4,312 controls) genotyping quality complied with the QC criteria.
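One of the QC criteria above is departure of the genotype distribution from HWE in controls (P < 0.005). The text does not say which HWE test was used; the sketch below assumes a 1-df chi-square goodness-of-fit test, one common choice, and the function name is hypothetical.

```python
import math


def hwe_chi2_p(n_aa, n_ab, n_bb):
    """Chi-square (1 df) test of Hardy-Weinberg equilibrium from genotype counts.

    Returns (chi2, p_value). An assumed test; the study may have used another.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Hypothetical control genotype counts, flagged if P < 0.005 as in the text
chi2, p_value = hwe_chi2_p(50, 0, 50)
flagged = p_value < 0.005
```

Counts that match the Hardy-Weinberg proportions exactly (for example 25, 50, 25) give a statistic of zero and a P value of 1, so such a SNP would not be flagged.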
In stage I analyses, we tested for SNP × MHT/SNP × EPT interactions on the genome-wide level in cases only using logistic regression with current hormone use as the outcome variable and the SNP as the explanatory variable. Each SNP was assessed for interaction in four different models. Interactions between SNP and current MHT use were investigated using models 1–3, whereby the SNP was assessed according to a log-additive model, i.e., a test for trend by number of minor alleles for all cases (model 1) or lobular cases only (model 2) was conducted. In addition, a genotypic model, i.e., two dummy variables were used to code for the heterozygous and homozygous mutant genotype (2 df test), was investigated for all cases (model 3). In model 4, interactions between SNP and current EPT use assuming a log-additive model were assessed in all invasive cases. All models were adjusted for the covariates age at reference date (date of diagnosis; continuously) and study centre.
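The case-only test described above regresses current hormone use on the SNP among cases only; under gene-environment independence, the exponentiated SNP coefficient estimates the multiplicative interaction OR. The following is a minimal sketch of that idea using a hand-written Newton-Raphson fit with a single predictor. It omits the adjustment for age and study centre used in the study, the analyses themselves were run in PLINK, and the function name and data are hypothetical.

```python
import math


def fit_logistic(x, y, iterations=25):
    """Fit logit P(y = 1) = b0 + b1 * x by Newton-Raphson.

    x: minor-allele counts (log-additive coding) or a 0/1 indicator
    y: 1 for current MHT use, 0 for past/never use (cases only)
    Returns (b0, b1, se_b1); se_b1 comes from the information matrix
    evaluated in the final iteration.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1 / (1 + math.exp(-(b0 + b1 * xi)))
            w = p * (1 - p)
            g0 += yi - p            # score for the intercept
            g1 += (yi - p) * xi     # score for the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1, math.sqrt(h00 / det)


# Hypothetical cases: carriers (x = 1) vs non-carriers (x = 0)
x = [0] * 100 + [1] * 60
y = [1] * 30 + [0] * 70 + [1] * 40 + [0] * 20
b0, b1, se = fit_logistic(x, y)
case_only_or = math.exp(b1)   # estimate of the SNP x MHT interaction OR
z = b1 / se                   # Wald statistic for the interaction
```

With a binary predictor the fitted slope equals the log of the cross-product ratio of the 2 × 2 table of SNP by MHT use among cases, which gives a simple check on the fit.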
The 1,200 SNPs carried forward to stage II were selected initially based on the 1,200 SNP × MHT interactions with the lowest P values obtained from model 1. Then the SNPs with the highest P values in this set were replaced by non-overlapping SNPs showing SNP × MHT (models 2 and 3) or SNP × EPT (model 4) interactions with P < 0.001.
In stage II, a case-control logistic regression analysis was performed. The model included SNP and current hormone use (vs. past/never) as well as the multiplicative SNP × MHT/SNP × EPT interaction term. Additionally, potential breast cancer risk factors or confounders entered the model as continuous [i.e., year of birth (in 5-year categories), BMI (≤22.4, 22.5–24.9, 25–29.9, ≥30 kg/m2), and number of full-term pregnancies (0, 1, 2, ≥3; coded continuously)] or categorical covariates [i.e., study, study centre (for MARIE), age at menarche (< 12, 12–14, ≥ 15 years), type of menopause (natural, induced, hysterectomy, other), ever benign breast disease (yes/no), first degree family history of breast cancer (yes/no), and number of mammograms (0, 1–4, 5–9, ≥ 10, unknown)]. Four different models corresponding to those used in stage I were employed to test for SNP × MHT (models 1–3)/SNP × EPT (model 4) interaction in the case-control scenario. Analyses in stages I and II were conducted with PLINK version 1.0.
For a combined analysis of stages I and II, joint test statistics zjoint and P values Pjoint were calculated, separately for the different models and only if interaction ORs of the two stages were in the same direction, according to Skol et al. The critical value Cjoint for zjoint, which corresponds to an overall significance level of 10−7, was calculated using the program CaTS (http://www.sph.umich.edu/csg/abecasis/cats/).
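As a rough illustration of the joint statistic, the sketch below uses the sample-fraction weighting that Skol et al. describe for two-stage designs, in which the stage I and stage II z statistics are combined with weights given by the square roots of the sample fractions. How the weights were set here, with a case-only stage I and a case-control stage II, is not stated in the text, so the fraction and z values below are hypothetical. The plain normal P value also ignores the selection of SNPs in stage I, which is why a dedicated critical value (Cjoint from CaTS) is needed in practice.

```python
import math


def joint_z(z1, z2, pi_samples):
    """Weighted sum of stage I and stage II z statistics.

    pi_samples: assumed fraction of all samples analysed in stage I.
    """
    return math.sqrt(pi_samples) * z1 + math.sqrt(1 - pi_samples) * z2


def two_sided_p(z):
    """Two-sided P value of a standard normal statistic (no selection correction)."""
    return math.erfc(abs(z) / math.sqrt(2))


def same_direction(log_or_stage1, log_or_stage2):
    """Stages are combined only if both interaction ORs point the same way."""
    return log_or_stage1 * log_or_stage2 > 0


# Hypothetical values for one SNP
if same_direction(math.log(0.60), math.log(0.76)):
    z = joint_z(-3.1, -3.9, pi_samples=0.25)
    p_naive = two_sided_p(z)
```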
For the replication study, a case-control logistic regression analysis was performed according to model 1 (since lowest P values for the interaction of the top SNP with current hormone use were obtained with model 1) and adjusted for age at reference date (date of diagnosis for cases/date of interview for controls; continuously) to analyze the NHS data. Similarly, a pooled case-control logistic regression analysis of the remaining eight BCAC studies according to model 1, adjusted for study and age at reference date, was conducted. A meta-analysis (fixed effects model) was performed to combine these results. The study specific analysis of NHS data as well as the meta-analysis was performed using the R software version 2.9.2 and the R-package meta. For the pooled analysis the SAS software version 9.2 of the SAS System for Windows was employed.
The fine mapping SNPs were assessed for SNP × MHT interaction according to model 1 (as lowest P values for interaction were obtained with model 1) in a case-control logistic regression framework as described for stage II, stratified by year of birth (in 5-year categories) and study centre and adjusted for the covariates used in stage II analyses except for study, year of birth, and study centre. Analyses were performed using the SAS software version 9.2 of the SAS System for Windows.
To test for independent SNP × MHT interaction effects for highly correlated SNPs (as measured in terms of LD), logistic regression analyses were performed using models that contained two correlated SNPs at a time along with the respective SNP × MHT interaction terms and the covariates that were also used in the single SNP models that led to the detection of the respective interaction effects. SNP × MHT interaction effects were considered to be independent if both interaction terms appeared to be significant in the respective regression model that included both interaction effects at a time. These analyses were performed using the SAS software version 9.2 of the SAS System for Windows.
The prevalence of MHT and EPT use in stages I and II individuals is presented in Supplementary Tables 2 and 3. The estimate of the main effect of MHT and EPT, respectively, obtained from the pooled stage II analysis was OR = 1.49 (95 % confidence interval (CI) = 1.31–1.70, P = 2.89 × 10−9) and 1.54 (95 % CI = 1.32–1.79, P = 1.86 × 10−8) for breast cancer overall, 2.37 (95 % CI = 1.80–3.12, P = 8.20 × 10−10) and 2.08 (95 % CI = 1.54–2.81, P = 2.05 × 10−6) for lobular carcinoma and 1.24 (95 % CI = 1.06–1.43, P = 0.005), and 1.39 (95 % CI = 1.18–1.64, P = 1.08 × 10−4) for ductal carcinoma.
The set of 1,200 SNPs selected from stage I and carried forward to stage II was composed of the 767 SNPs with lowest P values for SNP × MHT interaction obtained from model 1 and 433 additional SNPs with P < 0.001 for SNP × MHT or SNP × EPT interactions selected from models 2–4 (105, 206, 168 SNPs selected from models 2, 3, 4, respectively; 44 out of 433 SNPs were selected by more than one of the three models).
Effect estimates of the combined stage I and II with P values <10−5 for the interaction of current use of MHT and SNP on postmenopausal breast cancer risk as well as corresponding results of the fine mapping
Alleles | P value (combined stage I and II) | P value (fine mapping)
C > T | 2.98 × 10−7 | 1.01 × 10−3
C > A | 6.61 × 10−7 | 3.56 × 10−3
A > G | 5.27 × 10−7 | 4.39 × 10−3
G > A | 8.14 × 10−7 | 5.73 × 10−1
G > A | 5.32 × 10−6 | 7.08 × 10−4
The genotype distribution for rs6707272 in the BCAC studies that were used in the replication analysis is provided in Supplementary Table 4. Prevalence of MHT use in the respective study populations is shown in Supplementary Table 5. The ORs for breast cancer associated with current use of MHT obtained from the pooled BCAC studies and from the NHS were 0.88 (95 % CI = 0.79–0.97, P = 0.01) and 1.07 (95 % CI = 1.02–1.11, P = 4.30 × 10−3), respectively.
The SNP rs6707272 was not associated with breast cancer risk in the replication sample, with OR = 0.96 (95 % CI = 0.90–1.04), P = 0.33, in the pooled studies, and OR = 0.99 (95 % CI = 0.96–1.02), P = 0.37, in NHS. It did not modify the effect of current MHT in the pooled studies [OR = 1.06 (95 % CI = 0.90–1.24), Pinteraction = 0.51] or in NHS [OR = 1.19 (95 % CI = 0.92–1.55), Pinteraction = 0.19]. Thus, the meta-analysis of the nine studies did not yield a significant SNP × MHT interaction result [OR = 1.09 (95 % CI = 0.95–1.25), Pinteraction = 0.21]. Restriction to subjects of European ancestry did not change the results for interaction substantially [OR = 1.09 (95 % CI = 0.94–1.25), Pinteraction = 0.25].
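The fixed-effect combination of the two interaction estimates can be checked with an inverse-variance sketch. The study used the R package meta; the Python below is only an illustrative re-computation from the rounded ORs and confidence intervals reported above, and the function names are hypothetical.

```python
import math


def log_or_and_se(odds_ratio, lower, upper, z=1.96):
    """Recover the log OR and its standard error from an OR and its 95 % CI."""
    return math.log(odds_ratio), (math.log(upper) - math.log(lower)) / (2 * z)


def fixed_effect(estimates):
    """Inverse-variance weighted mean of (log_or, se) pairs and its standard error."""
    weights = [1 / se ** 2 for _, se in estimates]
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    return pooled, math.sqrt(1 / sum(weights))


pooled_bcac = log_or_and_se(1.06, 0.90, 1.24)  # eight pooled BCAC studies
nhs = log_or_and_se(1.19, 0.92, 1.55)          # NHS
b, se = fixed_effect([pooled_bcac, nhs])
meta_or = math.exp(b)
ci = (math.exp(b - 1.96 * se), math.exp(b + 1.96 * se))
p_value = math.erfc(abs(b / se) / math.sqrt(2))
```

Starting from the rounded inputs, this gives an OR of about 1.09 with a 95 % CI of about 0.95–1.25, in line with the reported meta-analysis result; the P value comes out near 0.2, close to the reported 0.21, with the small difference attributable to rounding of the inputs.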
Further analyses were carried out restricted to the four population-based BCAC studies, CECILE, GENICA, NHS, and SASBAC (excluding cases and controls involved in stage II) for which recruitment times for cases and controls were overlapping and estimates of MHT main effects were risk-enhancing. The corresponding SNP × MHT interaction effect was non-significant [OR = 1.03 (95 % CI = 0.87–1.23), Pinteraction = 0.73; 2,886 cases, 3,169 controls].
The genotype distributions of the 33 fine mapping SNPs along with main effect ORs and corresponding P values are given in Supplementary Table 6. Key characteristics of the study population are shown in Supplementary Table 7. The OR for breast cancer associated with current use of MHT was 1.60 (95 % CI = 1.44–1.79; P = 5.14 × 10−17).
Estimates of the main effect of current use of MHT on postmenopausal breast cancer risk in the fine mapping analysis in strata defined by genotypes of SNPs with P-values < 10−5 for the interaction with current MHT use in the combined stage I and II
Alleles | P value | P value | P value
C > T | 4.95 × 10−18 | 4.28 × 10−6 | 1.50 × 10−1
C > A | 1.70 × 10−16 | 2.18 × 10−7 | 1.64 × 10−1
A > G | 5.66 × 10−16 | 1.23 × 10−7 | 1.74 × 10−1
G > A | 1.63 × 10−7 | 1.09 × 10−6 | 1.60 × 10−10
G > A | 7.94 × 10−22 | 1.20 × 10−2 | 7.88 × 10−1
Our results suggest possible modifications of breast cancer risk associated with current MHT use by SNPs on chromosomes 2 and 18, though none of the observed interaction effects reached genome-wide significance in the detection stage, i.e., the combined stage I and II. Furthermore, the most significant interaction of combined stage I and II, the modification of the current MHT use related breast cancer risk by rs6707272, was not significant in the replication studies. Analyses of chromosome 2 including two SNPs at a time suggest that, if there is any effect at all, a single causative variant on chromosome 2 may modify breast cancer risk associated with current MHT use.
SNP rs6707272 lies in a ~100 kb LD block on 2q36.3 that contains one known gene, delta/notch-like epidermal growth factor-like repeat containing (DNER). Recent experimental studies suggest that the DNER signalling pathway has differentiating and tumor-suppressing actions on glioma-derived stem-like cells and may also be involved in modulating adipogenic differentiation. Although DNER is an unorthodox ligand of the single-pass transmembrane NOTCH receptors, it is interesting to note that NOTCH2 has been associated with breast cancer susceptibility. The other two SNPs on chromosome 2, rs13014061 and rs560304, lie in intronic regions of the thyroid hormone receptor interactor 12 (TRIP12) and of the F-box protein 36 (FBXO36) gene, respectively (see Fig. 2). No association with cancer has been reported for FBXO36, while TRIP12 may increase cancer risk by causing a ubiquitin-mediated degradation of ARF (alternate reading frame of the INK4a/CDKN2A locus), the key activator of the tumor suppressor p53.
SNP rs1942574 on chromosome 18, which showed the strongest interaction in the fine mapping analysis, is located approximately 80 kb downstream of the suppressor of variegation, enhancer of zeste, and Trithorax binding protein (SETBP1) (see Supplementary Fig. 1), which was proposed to be inversely associated with cancer through its involvement in the mechanism of SET-related leukemogenesis and tumorigenesis.
To date, GWAS are still rarely being conducted to agnostically search for gene-environment interactions. These studies confront the researchers with issues that similarly apply to GWAS screening for genetic marginal effects, such as the misspecification of the true genetic model. In our study, we analysed SNP × MHT interactions under the assumption of a log-additive (models 1, 2, and 4) as well as a genotypic (model 3) genetic model. On the one hand, we thereby allowed for different genetic models while restricting the number of tests performed by focussing on specific genetic models and, thus, reducing the problem of multiple testing. On the other hand, our study might have lost power due to a potential misspecification of the true genetic model.
In addition to the issues that arise when searching for genetic marginal effects in GWAS, further challenges come up when aiming to identify gene-environment interactions on a genome-wide scale. These include misspecification of the environmental exposure as well as magnified problems due to the indirect measurement of causal markers through LD and multiple testing which, in turn, imply the requirement of larger samples [49, 50].
In our well-powered combined stage I and II analysis with very good genome coverage, we agnostically searched for potential genetic modifiers of the MHT related breast cancer risk. Thereby, our study design exploits the advantages of the case-only approach and aims to avoid false positive results due to gene-environment dependence when combining the results of the case-only approach in stage I with the results of the case-control approach in stage II. Among other factors, the power of our study depends on the definition of the MHT exposure variable, which should reflect the true association of MHT use with breast cancer risk. We restricted our regression analyses to current MHT/EPT use since risk among past users was shown to be comparable to risk among never users of any MHT [1, 9, 10]. We also exploited the fact that MHT-related breast cancer risk is more pronounced for lobular tumors and for EPT use [9–17]. In our combined stage I and II, we investigated separate regression models for lobular breast cancer cases only (model 2) as well as current EPT use (model 4) with the aim to increase the power of our study. By applying the four different regression models on a genome-wide scale, we magnified the issues of multiple testing. Because we were more concerned with Type II error, we did not account for the different models (1–4) tested in our evaluation of the findings of the combined stage I and II analysis. However, since none of our findings reached genome-wide significance, correcting for the dependency between tests conducted for models 1–4 would not have changed our conclusions.
We employed a single definition of the MHT variable, i.e., we restricted our investigations to a binary MHT variable. Thereby, we did not take into account differences between MHT users with respect to duration of use and medication dosage. Moreover, we defined current MHT use as use within the 6 months prior to reference date. In the logistic regression model, the reference category, thus, included never MHT users as well as past users of MHT for whom last MHT use was more than 6 months prior to reference date. Given that breast cancer may be diagnosed months after disease onset, a limitation of our study is that it most likely includes patients misclassified with respect to MHT status: some patients who truly were current users of MHT at disease onset were classified as past users because this was the correct classification at the reference date, i.e., the date of diagnosis. For these reasons, it is likely that the estimates of MHT/EPT main effects derived from our study are downwardly biased and that the power of our study to detect MHT main and SNP × MHT interaction effects is, thus, decreased.
In the replication stage, we used data of nine BCAC studies with a harmonized definition of the MHT exposure variable, which corresponds to the one used in the combined stage I and II. The composition of MHT products differs, however, between countries and, thus, between studies involved in our replication analysis. By adjusting for study we were in part able to take these differences into account. While our stage II as well as fine mapping analyses demonstrate a positive association between MHT use and breast cancer risk, which is in line with results reported in the literature [1, 2], current MHT use was inversely associated with breast cancer risk in the replication stage. Since this may result in biased interaction effects [52, 53], we carried out an additional replication analysis including only population-based BCAC studies, all of which showed the established risk enhancing effect of MHT. However, restriction to these studies resulted in a reduction of power [80 % power to detect an interaction effect of 1.41 (or 0.71, respectively)] and a weaker, non-significant interaction effect.
Only a sub-sample of the MARIE cases was genotyped on a genome-wide scale and thereby power to detect interactions was reduced compared to a one-stage design with the whole sample genotyped. For GWAS analyses, Skol et al. proposed the combined analysis of the most significant results of a stage I GWAS analysis and the re-analysis of these findings in an independent sample. They showed that this design provides a trade-off between genotyping costs and power that can preserve much of the power of the corresponding one-stage GWAS design and is almost always more powerful than a replication-based analysis of the same two-stage design. We performed power calculations according to Skol et al. to determine the number of SNPs to be transferred from stage I to stage II. For the design of our study, we originally chose a scenario that was calculated to yield 80 % power to detect at least one true interaction with an OR of 1.7 or 0.59, respectively, when transferring 1,500 SNPs to stage II including an additional 200 cases and 1,000 controls. Due to decreases in genotyping costs during the study conduct we were able to increase the number of cases (1,375) and controls (1,974) for stage II. Whereas Skol et al. analyzed genetic main effects, we applied their design to the analysis of gene-environment interactions using a case-only design in the first and a case-control design in the second stage. Their results with respect to relative power of the different designs can be assumed to similarly apply to our analysis as estimates of the interaction effects derived from case-only versus case-control analyses are identical in the presence of gene-environment independence. The violation of this assumption, however, can result in biased effect estimates derived from case-only analysis.
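The equivalence of case-only and case-control interaction estimates under gene-environment independence can be made concrete for a binary genotype and a binary exposure. In that setting the multiplicative interaction OR from a case-control sample equals the genotype-exposure OR among cases divided by the genotype-exposure OR among controls, so the case-only OR is unbiased exactly when the latter is 1. The sketch below uses hypothetical counts and function names.

```python
def odds_ratio(a, b, c, d):
    """Cross-product ratio of the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def interaction_or(cases, controls):
    """Multiplicative G x E interaction OR from a case-control sample.

    Each argument is a tuple of counts (G+E+, G+E-, G-E+, G-E-).
    """
    return odds_ratio(*cases) / odds_ratio(*controls)


cases = (40, 60, 30, 70)                  # hypothetical case counts
case_only = odds_ratio(*cases)            # case-only interaction estimate

independent_controls = (20, 80, 25, 100)  # G-E odds ratio of 1 in controls
dependent_controls = (30, 70, 25, 100)    # G and E associated in controls

unbiased = interaction_or(cases, independent_controls)
biased_gap = interaction_or(cases, dependent_controls)
```

With the independent controls, the case-control and case-only estimates coincide; with the dependent controls they differ, which is the bias referred to in the last sentence above.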
Our inability to replicate the top SNP × MHT interaction involving rs6707272 in BCAC may indicate that this finding is indeed a null result. However, our replication study based on all nine BCAC studies had 80 % power to detect an interaction effect of 1.20 (or 0.83, respectively). The corresponding interaction effect estimates obtained in stages I and II were OR = 0.60 and 0.76, i.e., it is likely that we over-estimated the interaction effect in the detection phase (the combined stage I and II), a phenomenon known as the ‘winner’s curse’, and, thus, our replication phase may have lacked power to detect a true interaction effect.
Bonferroni correction was applied to correct the fine mapping results for multiple testing. Thereby, we did not take into account the dependence between the samples used in the combined stage I and II and in the fine mapping, i.e., the case-control sample included in the fine mapping analysis comprised all MARIE subjects involved in stage I and II. Therefore, the significance of the fine mapping results is likely to be limited.
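For reference, a Bonferroni correction over the 33 fine mapping SNPs can be sketched as follows. The family-wise significance level of 0.05 is an assumption, since the text does not state the level used, and the function names are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise level at alpha."""
    return alpha / n_tests


def bonferroni_adjust(p_values):
    """Bonferroni-adjusted P values: min(1, p * number of tests)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# 33 fine mapping SNPs (from the text); alpha = 0.05 is assumed
threshold = bonferroni_threshold(0.05, 33)
```

Under these assumptions the per-SNP threshold is roughly 0.0015. Because this correction treats the fine mapping sample as independent of the detection sample, it does not address the overlap described in this paragraph.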
Our study provides a hint for a possible modification of the current MHT use related breast cancer risk by SNPs on chromosomes 2 and 18. We were not successful in replicating our top SNP × MHT interaction on chromosome 2 in an independent sample, which may be attributable to a lack of power. However, since none of the investigated interaction effects reached genome-wide significance in our well-powered combined stage I and II analyses, we conclude that the presence of clinically relevant SNP × MHT interactions in European women is questionable.
We thank all the individuals who took part in these studies and all the researchers, clinicians, technicians and administrative staff who have enabled this work to be carried out. In particular, we thank: Tracy Slanger, Renate Birr, Ursula Eilber, Belinda Kaspereit, N. Knese, Kathi Smit, and Nicole Knese (German Cancer Research Center, Heidelberg, Germany), Elke Mutschelknauss, W. Busch (German Research Center for Environmental Health, Neuherberg, Germany), M. Schick, R. Fischer and B. Korn (Genomics and Proteomics Core Facilities, German Cancer Research Center, Heidelberg, Germany), W. Höppner and Ramona Salazar (BioGlobe GmbH, Hamburg, Germany), Eik Vettorazzi (Institute for Biostatistics and Epidemiology, University Medical Centre, Hamburg, Germany) (MARIE study); Eileen Williams, Elaine Ryder-Mills and Kara Sargus (British Breast Cancer Study); the GENICA (Gene Environment Interaction and Breast Cancer in Germany) network: Christina Justenhoven (Dr. Margarete Fischer-Bosch-Institute of Clinical Pharmacology, Stuttgart, University of Tübingen, Germany), Yon-Dschun Ko and Christian Baisch (Department of Internal Medicine, Evangelische Kliniken Bonn gGmbH, Johanniter Krankenhaus, Bonn, Germany), Hans-Peter Fischer (Institute of Pathology, University of Bonn, Bonn, Germany), Ute Hamann (Molecular Genetics of Breast Cancer, German Cancer Research Center, Heidelberg, Germany), Thomas Brüning, Beate Pesch, Sylvia Rabstein and Anne Lotz (Institute for Prevention and Occupational Medicine of the German Social Accident Insurance (IPA), Bochum, Germany), Volker Harth (Institute and Outpatient Clinic of Occupational Medicine, Saarland University Medical Center and Saarland University Faculty of Medicine, Homburg, Germany) (GENICA study); Eija Myöhänen, Helena Kemiläinen (Kuopio Breast Cancer Project); Irene Masunaka (UCI Breast Cancer Study).
This work was supported by the Federal Ministry of Education and Research (BMBF) Germany grants 01KH0402, 01KH0408, 01KH0409 and the European Community’s Seventh Framework Programme (Collaborative Oncological Gene Environment Study) [grant agreement number 223175, grant number HEALTH-F2-2009-223175]. The MARIE study was supported by the Deutsche Krebshilfe e.V., grant number 70-2892-BR I, the German Cancer Research Center (DKFZ) and the Hamburg Cancer Society. Genotyping in the BCAC studies was funded by Cancer Research UK [C1287/A10118, C1287/A7497]. Meetings of the BCAC have been funded by the European Union COST (European Cooperation in Science and Technology) programme [BM0606]. D.F.E. is a Principal Research Fellow of Cancer Research UK. The BBCS (British Breast Cancer Study) is funded by Cancer Research UK and Breakthrough Breast Cancer and acknowledges NHS funding to the NIHR Biomedical Research Centre, and the National Cancer Research Network (NCRN). The CECILE study was funded by Fondation de France, Institut National du Cancer (INCa), Ligue Nationale contre le Cancer, Ligue contre le Cancer Grand Ouest, Agence Nationale de Sécurité Sanitaire (ANSES), Agence Nationale de la Recherche (ANR). The GENICA (Gene Environment Interaction and Breast Cancer in Germany) study was funded by the Federal Ministry of Education and Research (BMBF) Germany grants 01KW9975/5, 01KW9976/8, 01KW9977/0, 01KW0114, 01KH0401, 01KH0402, 01KH0410, and 01KH0411, the Robert Bosch Foundation, Stuttgart, German Cancer Research Center (DKFZ), Heidelberg, Institute for Prevention and Occupational Medicine of the German Social Accident Insurance (IPA), Bochum, as well as the Department of Internal Medicine, Evangelische Kliniken Bonn gGmbH, Johanniter Krankenhaus, Bonn, Germany.
The KBCP was financially supported by the special Government Funding (EVO) of Kuopio University Hospital grants, Cancer Fund of North Savo, the Finnish Cancer Organizations, the Academy of Finland and by the strategic funding of the University of Eastern Finland. The MCBCS (Mayo Clinic Breast Cancer Study) was supported by the NIH (National Institutes of Health) grants [CA122340, CA128978], an NIH Specialized Program of Research Excellence (SPORE) in Breast Cancer [CA116201], the Breast Cancer Research Foundation, and the Komen Race for the Cure. The Nurses’ Health Studies are supported by US NIH grants CA65725, CA87969, CA49449, CA67262, CA50385 and 5UO1CA098233. The work on SASBAC (Singapore and Sweden Breast Cancer Study) was supported by the NIH (RO1 CA58427), the Märit and Hans Rausing’s Initiative against Breast Cancer, and the Agency for Science, Technology and Research (A*STAR). KH was supported by the Swedish Research Council (523-2006-972). KC was financed by the Swedish Cancer Society (5128-B07-01PAF). The TWBCS (Taiwanese Breast Cancer Study) is supported by the Taiwan Biobank project of the Institute of Biomedical Sciences, Academia Sinica, Taiwan. The UCIBCS (UCI Breast Cancer Study) component of this research was supported by the NIH [CA58860, CA92044] and the Lon V Smith Foundation [LVS39420].
Conflict of Interest
The authors declare that they have no conflict of interest.