Introduction

High mammographic density (a high proportion of radiographically dense tissue in the breast on mammography) is one of the strongest known risk factors for breast cancer [1]. High breast density (dense tissue on 50% or more of the breast) could account for up to one third of breast cancer cases [2]. Factors such as body mass index, parity, age, smoking, and physical activity jointly account for only a small proportion of the variability in mammographic density [3]. In contrast, mammographic density has a strong genetic component. Twin studies have demonstrated that heritability (the proportion of variance attributable to genetic factors) is approximately 60% for mammographic density [4, 5].

It is feasible that genetic variation in sex steroids or in estrogen receptors (ESRs) produced in breast tissue could lead to differing degrees of proliferation that may be manifest radiographically as interindividual differences in mammographic density. The presence of sex steroid metabolic enzymes and ESRs in breast tissue [6–24] suggests that local activation of estrogen to potentially reactive metabolites within breast tissue may play a role in initiating and promoting carcinogenesis [18]. Such enzymes include CYP1A1, CYP1B1, and 17β-hydroxysteroid dehydrogenase (17β-HSD). In addition to metabolizing environmental carcinogens (for example, polycyclic aromatic hydrocarbons), CYP1A1 has high activity with the 17β-estradiol substrate [25, 26]. CYP1A1 forms mainly 2-hydroxyestrone, and to a lesser degree, some 4-hydroxyestrone, from estrone. In contrast, CYP1B1 predominantly catalyzes formation of potentially carcinogenic catechol estrogens, especially 4-hydroxyestrogens [6, 26–28]. The implication of 4-hydroxy catechol estrogens in carcinogenesis suggests a key role for CYP1B1 in carcinogenesis [19, 27, 29, 30]. CYP19A1 is the gene encoding the aromatase enzyme that catalyzes the formation of aromatic C18 estrogens from C19 androgens [6, 31]. Type I 17β-HSD is the enzyme responsible for interconversion of estrone and estradiol [32]. In addition to potential local effects of these enzymes on breast tissue, ESR-estrogen interactions stimulate breast epithelial cell growth [33]. Single-nucleotide polymorphisms (SNPs) in genes encoding sex steroid-metabolizing enzymes or receptors have effects on the hormonal milieu of the breast and on levels of potential mammary carcinogens [6].

A few studies have explored associations between mammographic density and SNPs in genes encoding CYP1A1, CYP1B1, aromatase, 17β-HSD, ESR1, and ESR2 [34–39]. However, most studies were focused on postmenopausal women [36, 37]. Premenopausal breast density may be more highly heritable than is postmenopausal density [40], and some genes may be associated with premenopausal but not with postmenopausal density [4]. The goal of this study was to examine the association between mammographic density and SNPs in genes encoding CYP1A1, CYP1B1, aromatase, type I 17β-HSD, ESR1, and ESR2 in a group of pre- and early perimenopausal white, African-American, Chinese, and Japanese women.

Materials and methods

To determine the association between SNPs in genes encoding sex steroid-metabolizing enzymes and ESRs and mammographic density, we used data from women who participated in the SWAN ancillary Mammographic Density Study and the SWAN Genetics Study, which are described later. All protocols were IRB approved at participating sites, and all participating women provided signed, written informed consent.

The Study of Women's Health Across the Nation (SWAN)

SWAN is a multisite longitudinal community-based cohort study of 3,302 midlife women, serving as the parent study for the Mammographic Density ancillary study. In brief, at baseline, women were aged 42 to 52 years and premenopausal (reporting no change in usual menstrual pattern) or early perimenopausal (reporting change in menstrual pattern but occurrence of menstruation in the past 3 months), had an intact uterus and one or more ovaries, were not pregnant or lactating, and were not using exogenous reproductive hormones [41]. Initiation of exogenous hormones after the baseline visit did not preclude inclusion in the longitudinal cohort study. Each of the seven study sites enrolled white women in addition to women of one other self-identified racial/ethnic group: African-American women (Boston, Detroit area, Chicago, and Pittsburgh), Japanese women (Los Angeles), Hispanic women (New Jersey), and Chinese women (Oakland, California). SWAN participants completed questionnaires and underwent fasting blood sampling annually.

The SWAN Mammographic Density Study

Three SWAN clinical sites (Los Angeles, Oakland, Pittsburgh) participated in the SWAN Mammographic Density ancillary study, which retrieved and analyzed existing participants' mammograms that had been performed by accredited mammography facilities as a part of routine medical care.

At the time of enrollment into the ancillary study, 1,248 participants were active at the three sites. Of these, 22 (2%) women were ineligible because of bilateral breast surgery, 82 (7%) were not recruited because of having an abbreviated follow-up, and 89 (7%) refused to participate. Thus, 1,055 (85%) women were eligible and agreed to participate in the mammographic density study; of these, 1,005 women had at least one mammographic density assessment.

By using previously published methods, a single expert reviewer quantified mammographic density (that is, the percentage of the breast composed of dense tissue) [42]. The reader assessed mammographic density by using the craniocaudal view of the mammogram of the right breast [43]. If a participant reported prior breast surgery involving the right breast, mammograms of the left breast were used for density assessments. A compensating polar planimeter was used to measure the total breast area (in square centimeters) and the area of dense breast tissue (in square centimeters). Percentage density was calculated as the area of dense breast tissue divided by the area of the breast. A repeated review of a 10% random subset of mammograms for intrarater reliability yielded an intraclass correlation coefficient for percentage mammographic density of 0.96 [43].
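The percentage density calculation described above can be written as a short function. This is a minimal sketch rather than the study's software; the function name and the example planimeter readings are hypothetical.

```python
def percent_density(dense_area_cm2: float, total_area_cm2: float) -> float:
    """Return the percentage of the breast area occupied by dense tissue."""
    if total_area_cm2 <= 0:
        raise ValueError("total breast area must be positive")
    if not 0 <= dense_area_cm2 <= total_area_cm2:
        raise ValueError("dense area must lie between 0 and the total area")
    return 100.0 * dense_area_cm2 / total_area_cm2


# Hypothetical planimeter readings (square centimeters), for illustration only.
example = percent_density(dense_area_cm2=40.0, total_area_cm2=100.0)
```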

Our goal was to examine associations of SNPs with mammographic density among pre- and early perimenopausal participants. Of the 1,005 participants with at least one assessment of mammographic density, we chose one mammogram for each of the 643 pre- or early perimenopausal SWAN Mammographic Density study participants. If more than one mammogram was available for a given participant, we selected the mammogram temporally closest to the preceding annual follow-up visit that was flanked by pre- or early perimenopausal status on the visits before and after the mammogram. For example, if a participant had mammographic density assessments from two mammograms during her premenopausal stage and one mammogram during her early perimenopausal stage, we chose a single mammogram for the participant by picking the mammogram that was temporally closest to its preceding annual follow-up visit. Mammograms that occurred more than 3 months before baseline and mammograms obtained during the use of current exogenous reproductive hormones were excluded.
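The selection rule above might be sketched as follows. The record layout, field names, and status labels are hypothetical and are not taken from the SWAN data dictionary; ties are resolved by keeping the first qualifying mammogram, which is an assumption.

```python
from typing import NamedTuple, Optional, Sequence

# Menopausal stages that qualify a mammogram for this analysis (labels are assumed).
ELIGIBLE_STATUS = frozenset({"premenopausal", "early_perimenopausal"})


class Mammogram(NamedTuple):
    mammogram_id: str
    days_after_preceding_visit: int      # gap between preceding annual visit and mammogram
    status_at_preceding_visit: str
    status_at_following_visit: str
    on_exogenous_hormones: bool = False
    over_3_months_before_baseline: bool = False


def select_mammogram(mammograms: Sequence[Mammogram]) -> Optional[Mammogram]:
    """Pick one mammogram per participant, or None if none qualifies."""
    candidates = [
        m for m in mammograms
        if m.status_at_preceding_visit in ELIGIBLE_STATUS
        and m.status_at_following_visit in ELIGIBLE_STATUS
        and not m.on_exogenous_hormones
        and not m.over_3_months_before_baseline
    ]
    if not candidates:
        return None
    # Closest in time to its preceding annual follow-up visit.
    return min(candidates, key=lambda m: m.days_after_preceding_visit)


# Hypothetical participant with four mammograms.
films = [
    Mammogram("m1", 120, "premenopausal", "premenopausal"),
    Mammogram("m2", 45, "premenopausal", "early_perimenopausal"),
    Mammogram("m3", 10, "early_perimenopausal", "postmenopausal"),
    Mammogram("m4", 5, "premenopausal", "premenopausal", on_exogenous_hormones=True),
]
chosen = select_mammogram(films)
```

In this example, m3 is dropped because the following visit is no longer pre- or early perimenopausal, and m4 is dropped because of hormone use, so m2 is selected over m1.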

The SWAN Genetics Study

The SWAN Genetics Study genotyped 25 SNPs relating to sex-steroid metabolism and estrogen receptors (Figure 1, Table 1). Of the 1,988 women who were eligible (that is, still participating and providing blood for the SWAN parent study at the follow-up year 5 visit), 88% agreed to participate in the genetics study. Details regarding specimen collection, specimen processing, and genotyping were previously reported [44]. Genotyping was performed by using TaqMan (Roche Molecular Systems, Inc., Pleasanton, CA) and an ABI 7900 HT sequence detection system (Applied Biosystems Inc., Foster City, CA, USA).

Figure 1

Functions of SWAN genetics sex steroid metabolism enzymes and receptors. Used with permission of Sowers and colleagues [93].

Table 1 SNPs examined in the SWAN Genetics Study

Between three and eight SNPs per gene were selected based on use in previous genetics studies, a review of the literature, and information from gene databases (National Center for Biotechnology Information SNP database [45] and Celera [46]). The original SNP selection process is discussed in the first SWAN Genetics Study manuscript [44]. The SWAN Genetics Study searched for published literature supporting the biologic significance of the SNPs chosen. SNPs were chosen if they were thought potentially to influence circulating sex hormone levels [47, 48] or disease patterns: breast cancer [49–51], ovarian cancer [52], and bone mineral density [53, 54].

Of the 643 premenopausal or early perimenopausal SWAN participants with available mammographic density information, at least partial genotyping data were available for 463 (72%) women. For this analysis, we excluded one participant lacking genotyping data for 24 of the 25 SNPs and an additional 11 participants who were missing information for one or more covariates. Thus, the analytic sample for this study comprised the 451 pre- and early perimenopausal women for whom complete information was available regarding mammographic density, genotypes, and covariates.

Questionnaire-based and anthropometric measures

At baseline and at each annual follow-up visit, SWAN participants were asked to complete standardized questionnaires and underwent measurement of height and weight for calculation of body mass index (BMI, weight in kilograms divided by the square of the height in meters). We took information regarding age, race/ethnicity, reproductive history, medication use, smoking, and alcohol intake from annual questionnaires.

Statistical analysis

Allele frequencies in the SWAN Genetics Study were estimated by race/ethnicity (Mendel Version 8.0 [55]). Hardy-Weinberg equilibrium (HWE) was assessed by using Fisher's Exact tests [55]. Because of the multiple statistical tests performed, we considered a P value of < 0.01 as the criterion to reject the null hypothesis of HWE.
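For readers unfamiliar with exact tests of HWE, the following is a minimal sketch of a standard exact test for a biallelic SNP: conditional on the observed allele counts, it sums the probabilities of all heterozygote counts that are no more likely than the observed one. It is not the Mendel implementation used in the study, and the function names and example genotype counts are hypothetical.

```python
from math import exp, lgamma, log


def _log_prob_het(n_het: int, n_minor: int, n_total: int) -> float:
    """Log probability of n_het heterozygotes given the allele counts, under HWE."""
    n_hom_minor = (n_minor - n_het) // 2
    n_hom_major = n_total - n_het - n_hom_minor
    n_major = 2 * n_total - n_minor
    return (
        lgamma(n_total + 1)
        - lgamma(n_hom_minor + 1) - lgamma(n_het + 1) - lgamma(n_hom_major + 1)
        + n_het * log(2.0)
        + lgamma(n_minor + 1) + lgamma(n_major + 1)
        - lgamma(2 * n_total + 1)
    )


def hwe_exact_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Exact HWE P value from genotype counts (aa, ab, bb)."""
    n_total = n_aa + n_ab + n_bb
    if n_total == 0:
        raise ValueError("at least one genotype is required")
    n_minor = min(2 * n_aa + n_ab, 2 * n_bb + n_ab)
    log_observed = _log_prob_het(n_ab, n_minor, n_total)
    p_value = 0.0
    # Heterozygote counts must share the parity of the minor allele count.
    for n_het in range(n_minor % 2, n_minor + 1, 2):
        log_p = _log_prob_het(n_het, n_minor, n_total)
        if log_p <= log_observed + 1e-9:
            p_value += exp(log_p)
    return min(1.0, p_value)


# Hypothetical genotype counts, compared against the P < 0.01 criterion used in the text.
ALPHA = 0.01
deviates = hwe_exact_p(50, 0, 50) < ALPHA
```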

Creating a separate model for each of the 25 SNPs, we used multivariable linear regression to examine the relation between percentage mammographic breast density (outcome) and SNP (primary predictor). Based on previously published studies, we considered the following candidate covariates: age, race/ethnicity, number of live births, BMI, oral contraceptive use, menopausal hormone use, cigarette smoking, and alcohol intake [3, 43, 56–68]. Of these candidate covariates, age, number of live births, and BMI were included in all models, based on previously well-established associations with mammographic density. The remaining candidate covariates (cigarette smoking, alcohol intake, oral contraceptive use, and menopausal hormone use) were evaluated for model inclusion by using backwards regression performed on data from the 643 pre- and early perimenopausal participants of the SWAN Mammographic Density study. We used a P value of 0.10 as the cutoff for covariate inclusion. In addition, because each site recruited a specific racial/ethnic group in addition to non-Hispanic whites, a combined variable was created for race/ethnicity and study site; this variable was included in all models. Categories for this variable were whites in Oakland, Chinese in Oakland, whites in Los Angeles, Japanese in Los Angeles, whites in Pittsburgh, or African Americans in Pittsburgh. Age at mammogram (continuous), race/ethnicity-study site, number of live births (continuous), current cigarette smoking (yes/no), and BMI (continuous) were the covariates retained in the final models. We modeled the alleles as acting in either an additive (aa versus Aa and AA, where the effect of the Aa genotype is half the effect of the AA genotype) or recessive (aa/Aa versus AA) manner, in which A is the minor allele.
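The two genetic models differ only in how a genotype is turned into a numeric predictor. A minimal sketch of that coding is shown here; the function names and the example alleles are hypothetical, and genotypes are assumed to be two-letter strings such as "TC".

```python
def minor_allele_count(genotype: str, minor_allele: str) -> int:
    """Count copies of the minor allele in a two-letter genotype such as 'TC'."""
    if len(genotype) != 2:
        raise ValueError("genotype must contain exactly two alleles")
    return sum(1 for allele in genotype if allele == minor_allele)


def additive_code(genotype: str, minor_allele: str) -> int:
    """Additive model: 0, 1, or 2 copies, so Aa carries half the effect of AA."""
    return minor_allele_count(genotype, minor_allele)


def recessive_code(genotype: str, minor_allele: str) -> int:
    """Recessive model: 1 only for minor-allele homozygotes (AA), else 0."""
    return 1 if minor_allele_count(genotype, minor_allele) == 2 else 0


# Illustration with a hypothetical SNP whose minor allele is C.
genotypes = ["TT", "TC", "CT", "CC"]
additive = [additive_code(g, "C") for g in genotypes]
recessive = [recessive_code(g, "C") for g in genotypes]
```

Under this coding, the regression coefficient is a per-allele difference in percentage density in the additive model and a homozygote-versus-others difference in the recessive model.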

Because of prior studies showing that associations of sex steroid-related SNPs may be more evident among women with BMI greater than 25 [69], we conducted secondary analyses wherein we added an SNP*BMI interaction term to the multivariable linear regression models. Because we suspected that the sample size of certain racial/ethnic subgroups may have been too small to allow detection of SNP*race/ethnicity interactions, and because the allele frequencies for 13 of the 25 SNPs differed by more than 0.20 between the ethnic groups, we repeated all of our analyses in the subsample of white participants, the largest racial/ethnic subgroup. All regression analyses were performed with the software program R [70].
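To make the interaction analysis concrete, the sketch below fits an ordinary least-squares model containing an SNP term, a BMI term, and their product. It is an illustration in plain Python rather than the R code used in the study: the other covariates are omitted, BMI is treated as continuous (an assumption), and the data are synthetic values generated from made-up coefficients.

```python
from typing import List, Sequence


def solve_linear_system(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def design_row(snp_code: int, bmi: float) -> List[float]:
    """Columns: intercept, SNP (additive code), BMI, SNP*BMI interaction."""
    return [1.0, float(snp_code), bmi, snp_code * bmi]


# Synthetic, noise-free data from made-up coefficients (not study results).
TRUE_BETA = [60.0, 0.5, -1.0, 0.2]
grid = [(s, b) for s in (0, 1, 2) for b in (20.0, 25.0, 32.0)]
rows = [design_row(s, b) for s, b in grid]
density = [sum(c * x for c, x in zip(TRUE_BETA, row)) for row in rows]
beta = fit_ols(rows, density)
```

The last coefficient is the interaction: it describes how much the per-allele difference in density changes for each additional unit of BMI.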

Results

Baseline characteristics of the participants

Baseline characteristics of the analytic sample (N = 451 with mammographic density, genotyping, and covariate data) are displayed in Table 2. No notable differences in characteristics were found between the overall mammographic density sample (N = 643) and the analytic sample. The median age of the participants in the analytic sample was 48.7 years (Table 2). Median BMI was 24.4 kg/m². Mean percentage mammographic density was 43.6%. Forty-nine percent of the participants in the analytic sample were white, 24% were Chinese, 22% were Japanese, and 6% were African American. Twenty-six percent were premenopausal, and 74% were early perimenopausal at the visit immediately preceding mammography (Table 2).

Table 2 Characteristics of the study participants: analytic sample of the current study (N = 451)

Hardy-Weinberg Equilibrium assessment

We examined allele frequencies by ethnicity (Table 3) and assessed HWE (see Table 4). Within racial/ethnic subgroups, only CYP194947 showed significant deviation from HWE. Because this SNP was also the SNP with the highest frequency (8%) of missing genotypes, methodologic considerations related to genotyping of this SNP may have contributed to deviation from HWE.

Table 3 Allele frequencies of SWAN Genetics Study participants by race/ethnicity
Table 4 Hardy-Weinberg equilibrium evaluation by race/ethnicity

Associations between SNPs and percentage mammographic density

We examined the association between percentage mammographic breast density and each of the SNPs in models adjusted for age, race/ethnicity-study site, parity, smoking, and BMI (Tables 5 and 6).

Table 5 Percentage mammographic density as a function of single-nucleotide polymorphism: recessive modelsa
Table 6 Percentage mammographic density as a function of single-nucleotide polymorphism: additive modelsa

In the fully adjusted recessive models (adjusted for age, race/ethnicity-study site, parity, smoking, and BMI), the CYP1B1 rs162555 CC genotype was associated with 9.4% higher percentage mammographic density than the TC/TT genotype (P = 0.04). The CYP19A1 rs936306 TT genotype was associated with 6.2% lower percentage mammographic density than the TC/CC genotype (P = 0.03) (Table 5). In contrast to analyses restricted to white participants, ESR1 rs2234693 was not significantly associated with mammographic density in either recessive or additive models that included the entire analytic sample (Tables 5 and 6).

Interaction by BMI

In additive models, two SNPs were significantly associated with BMI: CYP1A1 rs2606345 (1.1 kg/m² higher for each A allele; P = 0.03) and CYP19A1 rs2414096 (1.1 kg/m² lower for each A allele; P = 0.01; data not shown). Similarly, in recessive models restricted to white participants, the CYP194947 GG genotype was associated with a 2.1 kg/m² lower BMI compared with the GA/AA genotype (P = 0.05), and the CYP19A1 rs749292 AA genotype was associated with a 2.3 kg/m² lower BMI than the GA/GG genotype (P = 0.05; data not shown).

To determine whether associations between SNPs and mammographic density varied according to BMI, we added BMI*SNP interaction terms to multiple linear regression models that included age, race/ethnicity-study site, smoking, parity, and BMI as covariates and percentage mammographic density as the outcome (data not shown). In additive models, the CYP1A1 rs2606345-mammographic density association was significantly different (stronger) among participants with BMIs greater than 30 kg/m² compared with participants with BMIs less than 25 kg/m² (P for interaction = 0.05). Specifically, among participants with BMIs less than 25 kg/m², percentage mammographic density was 0.57% higher for each CYP1A1 rs2606345 C allele; in contrast, among participants with BMIs greater than 30 kg/m², percentage mammographic density was 6.1% higher for each additional C allele. The associations of SNPs with mammographic density did not significantly differ by BMI category for CYP19A1 rs2414096, CYP19A1 rs749292, or CYP194947. However, we may not have had adequate statistical power to detect an SNP*BMI interaction when BMI was categorized into tertiles.

Analyses restricted to white participants

In analyses restricted to whites (n = 219), we detected two associations that were similar to those seen in the overall analytic sample (for example, CYP19A1 rs936306 in recessive models, CYP19A1 rs2414096 in additive models). In white participants, the ESR1 rs2234693 CC genotype was associated with a 7.0% higher percentage mammographic density than the CT/TT genotype (P = 0.01; Table 5); this finding was also apparent in additive models (P = 0.01; Table 6). The association between ESR1 rs2234693 and mammographic density varied by ethnicity; the association was stronger among whites than among Japanese (interaction P value, 0.09) or Chinese (interaction P value, 0.03) participants.

Discussion

In pre- and early perimenopausal women, SNPs involving CYP1B1 (rs162555 CC genotype), CYP19A1 (rs936306 TC/CC genotype), and ESR1 (rs2234693 CC genotype) were each significantly positively associated with mammographic density. Associations between several SNPs (CYP1A1 rs2606345, CYP194947, CYP19A1 rs749292, CYP19A1 rs2414096) and mammographic density were attenuated after adjustment for BMI. Percentage mammographic density varied at least 3% per allele for the statistically significant associations. These differences in mammographic density according to genotype are of a clinically relevant magnitude, given that each 1% increment in mammographic density is associated with a 2% higher relative risk of breast cancer [2]. Several SNP-mammographic density associations varied significantly by ethnicity.

Several of our findings are novel. As far as we know, other publications have not reported information regarding associations between mammographic density and the following SNPs: 17β-HSD rs615942, 17β-HSD rs592389, 17β-HSD rs2830, ESR1 rs728524, ESR1 rs3798577, ESR2 rs1256030, ESR2 rs1255998, ESR2 rs1256049, CYP1B1 rs162555, CYP1B1 rs1800440, CYP19A1 rs700519, CYP19A1 rs2446405, CYP19A1 rs2445759, CYP19A1 rs1008805, CYP19A1 rs936306, CYP19A1 rs2414096, CYP19A1 rs749292, CYP194947, CYP1A1 rs1531163, or CYP1A1 rs2606345.

Our finding of an association between ESR1 rs2234693 and mammographic density among white women conflicts with some prior studies. The association between ESR1 rs2234693 and mammographic density was described in three reports from the EPIC study. In the first EPIC report, the T allele was associated with higher mammographic density [39], whereas in this study, the CC genotype is associated with higher mammographic density.

The second EPIC analysis found a statistically significant difference in mammographic density between hormone therapy users and never-users of hormone therapy among women with the CT or TT genotype, but not among those with the CC genotype [71].

The third EPIC analysis reported no association between ESR1 rs2234693 and mammographic density [36]; the discrepancy among studies may have arisen because the previous study used a different mammographic density measurement technique, had a less heterogeneous study population, and focused on postmenopausal women.

We found an association between CYP1B1 rs1056836 and mammographic density that approached statistical significance before, but not after, adjustment for BMI. These results may be consistent with three previously published studies [35, 36, 38].

A cross-sectional observational European study of white women found statistically significantly higher mammographic density in carriers of at least one ESR1 rs9340799 A allele [39]. Although we had similar results, our findings were not statistically significant, possibly because of the smaller number of participants in our study or the younger age of our participants.

The other SNPs involved in sex steroid metabolism or estrogen receptors were not significantly associated with mammographic density in the present study. As with our study, past studies reported absence of an association between mammographic density and CYP1A1 rs1048943 and CYP1A1 rs4646903 [35, 38].

Although previously published studies have not included a systematic examination of sex steroid metabolism SNPs and mammographic density, some previously studied SNPs may be linked with the SNPs that we examined. We used Haploview version 4.1 (Daly Lab, Cambridge, MA) with HapMap genotype data to look for information regarding linkage disequilibrium (LD) between each of the three SNPs that we found to be associated with mammographic density and other SNPs previously studied in relation to mammographic density. LD r² values for ESR1 rs2234693 (which we found to be associated with mammographic density) and rs9340799 (which prior studies found to be associated with mammographic density) range from 0.234 to 0.55, depending on the ethnic group. For CYP19A1 rs936306 (which we found to be associated with mammographic density) and rs10046 (which prior studies found not to be associated with mammographic density), r² values range from 0.017 to 0.193. LD information is not currently available for CYP1B1 rs162555 on HapMap; however, we note that its chromosomal location is not close to the other two previously studied CYP1B1 SNPs.
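As a reminder of what the r² statistic measures, it can be computed from the frequency of the haplotype carrying both alleles and the two allele frequencies. The sketch below assumes the haplotype frequency is already known; tools such as Haploview must first estimate it from unphased genotypes, a step not shown here. The function name and the example frequencies are hypothetical.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Squared allelic correlation between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a, p_b: frequencies of allele A and allele B
    """
    for p in (p_a, p_b):
        if not 0.0 < p < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")
    d = p_ab - p_a * p_b          # linkage disequilibrium coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# Hypothetical frequencies: perfectly correlated alleles, then independent alleles.
complete_ld = ld_r_squared(p_ab=0.3, p_a=0.3, p_b=0.3)
no_ld = ld_r_squared(p_ab=0.3 * 0.4, p_a=0.3, p_b=0.4)
```

An r² near 1 means one SNP is an almost perfect proxy for the other, whereas values near 0 mean that an association reported for one SNP says little about the other.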

Our findings have a biologic rationale. A local influence of sex steroid metabolism SNPs on breast tissue is suggested by prior breast cancer studies. For example, ESR1 rs2234693 has been associated with duration of breast cancer survival [72], degree of breast cancer differentiation [73], age at breast cancer diagnosis [74], and receptor status of breast cancer tumors [75, 76]. Likewise, CYP19A1 rs936306 may be associated with breast cancer disease-free survival [77]. Case-control studies of breast cancer risk in relation to ESR1 rs2234693 [73, 76, 78–87] and CYP19A1 rs936306 [77, 88] have yielded conflicting results. Inconsistent results of breast cancer case-control studies are likely due to differences in ethnicity and menopausal status of participants across studies. Reasons exist to suspect that associations of SNPs with mammographic density may vary by BMI, as we found for CYP1A1 rs2606345. Sex steroid metabolism (for example, peripheral aromatization of androstenedione) varies by BMI, so that effects of sex steroid SNPs on breast tissue may be more pronounced among obese women. Although prior studies have not examined whether associations of CYP1A1 rs2606345 with mammographic density vary by BMI, a prior study reported that the association of an ESR1 SNP with increased breast cancer risk was apparent only among women with BMI greater than 25 kg/m² [69].

Strengths of our study included its multiethnic study population, use of validated and reproducible mammographic density-assessment techniques, rigorous attention to genotyping methods, and collection of detailed information regarding key covariates related to mammographic density. However, this study did not directly assess sex steroid activity in breast tissue samples. Furthermore, although our sample size was relatively large, its heterogeneity may have precluded detection of statistically significant race-specific associations or interactions of SNPs with mammographic density. Finally, the observational study design precluded coordination of mammographic density with menstrual-cycle phase. Relations between SNPs and mammographic density may have been diluted because we analyzed mammograms taken during varying menstrual phases. Breasts are more radiographically dense during the luteal phase [89–91], although a recent study found that variation in mammographic density over the menstrual cycle may be subtle (that is, may not be statistically significant) [92].

Conclusions

In conclusion, SNPs involving sex steroid metabolism enzymes and ESR1 may be associated with mammographic density in pre- and early perimenopausal women. Future studies relating these SNPs to mammographic density not only should adjust for BMI but also should consider interactions by BMI. The mechanisms underlying the association (for example, increased proliferation of epithelial and stromal cells) require elucidation. Because these enzymes and ESR1 are expressed in target tissues, these SNPs (or genetic factors with which they are in linkage disequilibrium) may alter breast cancer risk by altering mammographic density. These findings inform the understanding of biologic influences on mammographic density, a strong risk factor for breast cancer.