Introduction

It has recently been recognized that cancer is a manifestation of both abnormal genetic and epigenetic events [1]. Dysregulated epigenetic controls, usually represented by abnormal DNA methylation patterns such as global hypomethylation and region-specific hypermethylation, are a hallmark of most cancers. Although the precise mechanisms underlying methylation alterations are far from fully understood, the overall methylation process is mainly regulated by several groups of regulatory proteins [2–4].

The methyl-CpG-binding domain (MBD) proteins are one such family; they bind specifically to methylated genes and mediate transcriptional repression via effects on chromatin structure. Thus far, five genes that encode putative MBDs have been identified in mammalian cells, namely MeCP2, MBD1, MBD2, MBD3, and MBD4 [5–7]. Human MBD genes are considered housekeeping genes because they are widely expressed in somatic tissues [6]. Given the epigenetic role of MBD proteins in regulating gene expression, MBDs may be involved in cancer development by affecting the expression of cancer-related genes. In fact, there is growing evidence that aberrant expression of MBD proteins is associated with human cancers [8, 9].

The MBD2 gene maps to a conserved region within human chromosome 18q21 [10]. Genomic sequence analysis determined that the MBD2 gene contains six exons and one noncoding exon spanning more than 50 kb in the genome. The MBD2 gene encodes two potential forms of the MBD2 protein, which correspond to initiation of translation at either the first (MBD2a; 43.5 kDa) or the second (MBD2b; 29.1 kDa) methionine codon. The main functions of MBD2 are to bind specifically to methylated gene promoters and to recruit histone deacetylases and chromatin remodeling proteins. The altered chromatin structure resulting from the binding of these factors may be resistant to the transcriptional machinery and, as a result, repress gene expression [11, 12]. A recent finding also suggests that MBD2 has potential DNA demethylase activity [13], implying that it might mediate gene activation in addition to transcriptional repression. However, two subsequent studies could not demonstrate any demethylase activity of MBD2 [14, 15], and this inconsistency remains to be resolved.

Although our understanding of the exact function of MBD2 in epigenetics is still in its early stages, several studies in human cancer research have indicated that the MBD2 protein plays a role in tumorigenesis. For example, a recent study [16] showed that breast carcinomas exhibit alterations in MBD2 expression. One interesting finding from that study was that breast carcinomas can be divided into two groups, one expressing very high levels of MBD2 and the other expressing a much lower level. MBD2 has also been reported to be involved in the repression of GSTP1 transcription in breast cancer cells [17]. Moreover, a significant reduction in MBD2 mRNA expression was found in human colorectal and gastric cancerous tissues [18] and in peripheral blood lymphocytes of bladder cancer patients [19], implying a protective role for MBD2 in tumorigenesis. MBD2 protein expression and its demethylase activity were detected in normal human prostate tissue but not in cancerous tissue [20]. These differences in MBD2 abundance between cancer types may reflect different roles for MBD2, either in transcriptional repression or in the demethylation process.

Given that there is a potential role for MBD2 in tumorigenesis, we hypothesized that genetic polymorphisms in the MBD2 gene may modify an individual's susceptibility to human cancers. In this molecular epidemiologic study, we genotyped two single nucleotide polymorphisms (SNPs; rs1259938 and rs609791) in the MBD2 gene to investigate whether genetic variations in the MBD2 gene are associated with breast cancer risk and whether the potential associations are modified by menopausal status.

Materials and methods

Study population

This study was built upon a recently completed breast cancer case–control study undertaken in Connecticut, USA. Detailed information regarding the study population is provided elsewhere [21]. Briefly, a total of 475 histologically confirmed incident breast cancer cases (ICD-O, 174.0–174.9) and 502 randomly selected control individuals were identified in Tolland and New Haven Counties, Connecticut, between 1 January 1994 and 31 December 1997. All cases and controls were aged 30–80 years and had no previous diagnosis of cancer, with the exception of nonmelanoma skin cancer.

For New Haven County, eligible cases were identified from the major hospital of the county (Yale–New Haven Hospital) through the computer database system at the Department of Surgical Pathology. Controls were randomly selected from the same database from among women who were histologically confirmed to be without breast cancer. The participation rates were 77% for cases and 71% for controls in New Haven County. For Tolland County, which has no major county hospital, newly diagnosed breast cancer cases were identified from area hospital records by the Rapid Case Ascertainment system at the Yale Comprehensive Cancer Center. Controls from Tolland County were recruited through random digit dialing for those under age 65 years and were randomly selected from Health Care Financing Administration files for those aged 65 years and over. The participation rates were 74% for cases and 64% for controls in Tolland County.

The study pathologist reviewed all of the pathologic diagnoses for breast cancer patients and benign breast disease controls. Breast carcinomas were classified as carcinoma in situ, invasive ductal carcinoma, or lobular carcinoma, and were staged according to the American Joint Committee on Cancer staging system [22].

Data collection

Informed consent was obtained from all study participants before collection of epidemiologic data through personal interview. The 45-min in-person interview, completed by all study participants, was administered by trained interviewers following institutional guidelines for human subjects. Data on smoking habits, alcohol consumption, and hormone replacement therapy were obtained from case and control individuals. Other information, including menstrual and reproductive factors (age at menarche, age at first pregnancy, age at menopause, parity, lifetime lactation history), family breast cancer history, lifetime occupational history, body mass index, hair dye use, and residence history, was also collected. Dietary information was obtained using a scannable semiquantitative food frequency questionnaire developed by the Fred Hutchinson Cancer Research Center, designed to optimize estimation of fat intake. Menopausal status was assessed at the time of diagnosis. Women with hysterectomy or bilateral oophorectomy were considered postmenopausal, whereas the very few women whose menopausal status was uncertain were treated as having missing data. At the completion of the interview, blood was drawn for DNA isolation and subsequent molecular analysis. The status of all samples (case or control) was concealed before they were handed to laboratory personnel.

Single nucleotide polymorphism selection

A SNP search using the National Center for Biotechnology Information SNP database [23] showed no non-synonymous SNPs in the coding region of the MBD2 gene. Therefore, two noncoding SNPs were chosen for genotyping. One (rs1259938) is located in the noncoding exon and the other (rs609791) is located in intron 3 of the MBD2 gene. A noncoding exon is generally found in the 3'-untranslated region of a gene. There is increasing evidence that the 3'-untranslated region of a gene plays a vital biologic role in many post-transcriptional regulatory pathways that control mRNA localization, stability, and translation efficiency [24].

Genotyping methods

A PCR–restriction fragment length polymorphism assay was used to determine the genotypes of SNP rs1259938. The genomic DNA used for the assay was extracted from peripheral blood lymphocytes. The PCR primers used for amplifying this polymorphism were as follows: forward 5'-CCTTGCCTGTGACTTGGACT-3' and reverse 5'-TCGCGAGTTTCAACAGAAAA-3'. Standard PCR was performed in a 25 μl volume with an annealing temperature of 58°C, followed by overnight digestion with XbaI (New England BioLabs, Beverly, MA, USA) at 37°C. The products were separated for 45 min at 220 V on a 4% agarose gel stained with ethidium bromide. Following electrophoresis, G/G homozygotes were represented by a single DNA band of 319 bp, A/A homozygotes by two bands of 103 bp and 216 bp, and heterozygotes by all three bands (103 bp, 216 bp, and 319 bp).
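The band patterns described above amount to a simple lookup from observed fragment sizes to a genotype call. The following is a minimal sketch of that lookup; the fragment sizes come from the text, whereas the function name, the exact-match rule, and the "undetermined" label are illustrative assumptions rather than part of the study protocol.

```python
# Fragment sizes (bp) reported in the text for the XbaI digest.
UNCUT = 319        # band seen in G/G homozygotes (amplicon left intact)
CUT = (103, 216)   # bands seen in A/A homozygotes (amplicon cleaved)


def call_genotype(bands):
    """Map observed band sizes (bp) to an rs1259938 genotype call.

    `bands` is an iterable of integer sizes read off the gel. The name and
    the exact-match rule are assumptions made for illustration.
    """
    observed = frozenset(bands)
    patterns = {
        frozenset({UNCUT}): "G/G",
        frozenset(CUT): "A/A",
        frozenset({UNCUT, *CUT}): "G/A",
    }
    # Any other pattern (for example a partial digest) is left uncalled.
    return patterns.get(observed, "undetermined")
```

The two cleavage fragments sum to the size of the intact amplicon (103 + 216 = 319 bp), which gives a quick consistency check on the reported sizes.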

The TaqMan Assay was used to determine the genotypes of SNP rs609791. Assays-on-Demand primers and probes (C_3079439_10; Applied Biosystems, Inc., Foster City, CA, USA) were mixed with PCR reagents following the manufacturer's instructions in the TaqMan assay. Plates were sealed and cycled at 95°C for 5 min, followed by 45 cycles at 92°C for 15 seconds, and 60°C for 1 min in a Stratagene Real-Time Mx3000 thermocycler (Stratagene Corp., La Jolla, CA, USA).

Each genotyping plate contained positive and negative controls. Approximately 5% of the samples were duplicated to ensure quality control in genotyping and two reviewers separately performed genotype scoring to confirm results.

Statistical analysis

Because more than 90% of the study participants were Caucasian (about 6% were black, 1% Asian, and 2% of other races), we restricted our analysis to Caucasians (393 cases and 436 controls). Pearson's χ2 test was used to evaluate differences in the distribution of selected characteristics between cases and controls. Genotype frequencies at both SNP loci in the control population were first checked for compliance with Hardy–Weinberg equilibrium using STATA statistical software (StataCorp, LP, College Station, TX, USA). Haplotypes were estimated using the PHASE program, which reconstructs haplotypes from population genotype data [25]. The best haplotypes estimated by PHASE were assigned to each study participant. STATA was also used to calculate both crude and adjusted odds ratios (ORs). ORs with 95% confidence intervals (CIs) were reported to illustrate the relative cancer risk associated with genotypes and haplotypes.
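For readers unfamiliar with the Hardy–Weinberg check, the sketch below shows a generic one-degree-of-freedom Pearson goodness-of-fit test in Python. It is not the STATA routine used in the study; the function name and the example counts are hypothetical, and the code assumes both alleles are observed in the sample.

```python
import math


def hwe_chi_square(n_common_hom, n_het, n_minor_hom):
    """Return (chi2, p_value) for a 1-df test of Hardy-Weinberg proportions.

    Assumes both alleles are present, so that no expected count is zero.
    """
    n = n_common_hom + n_het + n_minor_hom
    p = (2 * n_common_hom + n_het) / (2 * n)  # common-allele frequency
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_common_hom, n_het, n_minor_hom)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts, NOT taken from Table 2.
chi2, p_value = hwe_chi_square(200, 180, 56)
print(f"chi2 = {chi2:.2f}, P = {p_value:.2f}")
```

A large P value, as reported for both loci in this study, indicates no detectable departure from Hardy–Weinberg proportions.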

For a SNP genotype, study participants with homozygous common allele were used as the reference group in OR calculation. For haplotype analysis, the most common haplotype (G-C) was used as the reference group in risk estimation. Logistic regression was used to control for confounding by age (as a continuous variable), body mass index (<25 kg/m2, 25–29.99 kg/m2, >29.99 kg/m2), family history of breast cancer in first-degree relatives, family income (tertiles based on distribution of controls), lifetime months of breastfeeding (never, 1–5, 6–15, >15 months) and study site (New Haven County, Tolland County). Control of other variables (such as age at menarche, age at menopause, number of live births) did not change the ORs significantly, and these variables were not included in the final model.
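As an illustration of how a crude OR and its 95% CI follow from a two-by-two table, the sketch below uses the standard Wald (Woolf) interval on the log odds scale. It covers only the crude estimate; the adjusted ORs in this study came from logistic regression in STATA. The function and argument names are assumptions made for illustration.

```python
import math


def crude_odds_ratio(case_variant, case_ref, control_variant, control_ref, z=1.96):
    """Crude OR with a Wald (Woolf) confidence interval from a 2x2 table.

    'variant' counts carriers of the genotype or haplotype of interest;
    'ref' counts members of the reference group. All counts must be > 0.
    """
    odds_ratio = (case_variant * control_ref) / (case_ref * control_variant)
    se_log_or = math.sqrt(
        1 / case_variant + 1 / case_ref + 1 / control_variant + 1 / control_ref
    )
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper
```

An OR below 1 with an upper confidence limit below 1 corresponds to a statistically significant inverse association at the 5% level.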

Results

The analysis included 393 breast cancer cases and 436 controls, all of whom were Caucasian. Table 1 presents the distribution of selected baseline characteristics for cases and controls. There were significantly more postmenopausal women among cases (77.6%) than among controls (66.5%), indicating an increased risk for breast cancer associated with menopausal status. In addition, a higher proportion of controls (31%) than cases (23.5%) had a high family income. No other baseline factors exhibited a material difference between cases and controls.

Table 1 Distributions of selected characteristics by case–control status in Caucasians

The genotype distributions of both SNP loci for cases and controls (Table 2) were in Hardy–Weinberg equilibrium (χ2 = 0.35, P = 0.56 for rs1259938; χ2 = 0.31, P = 0.57 for rs609791).

Table 2 MBD2 genotypes, menopausal status and breast cancer risk in Caucasians

Among all women, we found no overall associations between genotypes at these two loci and breast cancer risk after adjustment for age, menopausal status, family history of breast cancer in first-degree relatives, family income, body mass index, lifetime months of breastfeeding, and study site. Among premenopausal women, a reduced breast cancer risk was significantly associated with variant genotypes (homozygous minor allele + heterozygote) at both SNP loci. Specifically, women with G/A and A/A at rs1259938 had 59% reduced breast cancer risk (OR = 0.41, 95% CI = 0.23–0.72) and women with C/G and G/G at rs609791 had 46% reduced breast cancer risk (OR = 0.54, 95% CI = 0.30–0.96). However, no significant associations were detected among postmenopausal women.
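The comparison above uses a dominant coding, in which heterozygotes and minor-allele homozygotes are pooled against common-allele homozygotes, and reports each OR as a percentage reduction, (1 − OR) × 100. A minimal sketch of both steps follows; the function names are hypothetical, and the ORs are those quoted in the text.

```python
def carries_variant(genotype, common_allele):
    """Dominant coding: True for heterozygotes and minor-allele homozygotes."""
    alleles = genotype.split("/")
    return any(allele != common_allele for allele in alleles)


def percent_reduction(odds_ratio):
    """Express an OR below 1 as a percentage reduction in the odds of disease."""
    return round((1.0 - odds_ratio) * 100)


# ORs reported in the text for premenopausal women.
for snp, odds_ratio in (("rs1259938", 0.41), ("rs609791", 0.54)):
    print(snp, f"{percent_reduction(odds_ratio)}% reduction")
```

Strictly speaking, the percentage describes a reduction in odds; it approximates a reduction in risk when the disease is uncommon in the source population.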

These two SNP loci in the MBD2 gene can generate four possible haplotypes, and their frequency distributions among cases and controls are shown in Table 3. G-C was the most common haplotype, with a frequency of 66.90% in our control group. The frequencies of the other three haplotypes were 5.56% (G-G), 11.70% (A-C), and 15.84% (A-G) in controls. Among all women, none of these haplotypes was associated with breast cancer risk. However, in premenopausal women the two haplotypes carrying the A allele at rs1259938 were each associated with a reduction in breast cancer risk of more than 50%, with ORs of 0.40 (95% CI = 0.20–0.83, P = 0.013) for A-C and 0.47 (95% CI = 0.26–0.84, P = 0.011) for A-G. Similar associations were not observed in postmenopausal women.
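The count of four haplotypes follows from combining two biallelic loci (2 × 2). The short sketch below enumerates them and checks that the control frequencies quoted in the text sum to 100%; the variable names are illustrative assumptions.

```python
from itertools import product

# Alleles at each locus, common allele first (as reported in the text).
RS1259938 = ("G", "A")
RS609791 = ("C", "G")

# Two biallelic loci give 2 x 2 = 4 possible haplotypes.
haplotypes = [f"{a}-{b}" for a, b in product(RS1259938, RS609791)]

# Control-group haplotype frequencies (%) quoted in the text.
control_freq = {"G-C": 66.90, "G-G": 5.56, "A-C": 11.70, "A-G": 15.84}

total = sum(control_freq.values())
# The most common haplotype serves as the reference group.
reference = max(control_freq, key=control_freq.get)
```

Here `reference` identifies the haplotype against which the ORs for the other three haplotypes are estimated.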

Table 3 MBD2 haplotypes, menopausal status and breast cancer risk in Caucasians

Discussion

It is becoming clear that carcinogenesis is a stepwise process of accumulation of both genetic and epigenetic abnormalities that can lead to cellular dysfunction. A large body of evidence has demonstrated that the epigenetic process is involved in breast carcinogenesis by influencing several broad gene categories, including cell cycle regulation, cell growth, steroid receptors, tumor susceptibility, carcinogen detoxification, cell adhesion, and inhibitors of matrix metalloproteinase genes. For example, methylation of the p16 promoter and exon 1 regions is observed in both human breast cancer cell lines and 20–30% of primary breast cancers [26, 27]. Methylation of the promoter region of GSTP1, a member of the glutathione S-transferase supergene family involved in the detoxification of carcinogens, is associated with gene inactivation in about 30% of primary breast carcinomas [28]. DNA methylation has also been found to be an alternative mechanism of inactivation of BRCA1 [29, 30], a gene that accounts for one half of inherited breast carcinomas [31]. Moreover, three members of the steroid hormone receptor superfamily (estrogen receptor, progesterone receptor, and retinoic acid receptor) have long been linked to mammary carcinogenesis [32], and recent studies have shown that epigenetic alterations appear to play a role in silencing estrogen receptors and retinoic acid receptors in breast malignancy [33–35]. Given the increasing evidence for a role of the epigenetic process, especially DNA methylation, in breast carcinogenesis, it is speculated that some genetic variations in methylation-related genes may affect the expression and function of these genes and consequently contribute to breast cancer development.

Findings from the present study show associations between genotypes and haplotypes of the MBD2 gene and breast cancer, which have not previously been examined. These results support a potential role for methylation-related genes in breast tumorigenesis. Interestingly, MBD2 polymorphisms have different effects in women depending on menopausal status. Our results demonstrate significant associations between MBD2 genotypes and haplotypes and breast cancer risk in premenopausal women but not in postmenopausal women. Menopausal effects on breast cancer risk were also observed in a previous study investigating genetic polymorphisms in catechol-O-methyltransferase [36]. That study found that the low-activity allele of catechol-O-methyltransferase was associated with increased risk among premenopausal women (OR = 2.1, 95% CI = 1.4–4.3) but was inversely associated with postmenopausal risk (OR = 0.4, 95% CI = 0.2–0.7). Our findings support arguments from previous studies suggesting that breast carcinogenesis may have different etiologies in premenopausal and postmenopausal women [36, 37].

Although the mechanisms have not been elucidated, menopausal effects on the role of MBD2 in breast cancer development may be related to changes in sex hormone levels. One of the phases of breast cancer pathogenesis is exposure of breast tissue to ovarian hormones that drive the kinetics of breast tissue stem cells, resulting in carcinogenesis [38]. Dividing cells are particularly susceptible to alterations in DNA synthesis, DNA repair, and DNA methylation. These biologic and physiologic effects of sex hormones are controlled by hormone receptors, the expression of which is regulated by the methylation status of their promoter regions [33–35].

On the other hand, steroid hormones may influence the epigenetic blueprint of methylation of certain genes and consequently activate or inactivate gene expression [39]. Even though MBD2 might be involved in the epigenetic regulation of steroid hormone receptor gene expression, MBD2 itself could also be affected by steroid hormones. It is possible that MBD2 plays different roles in breast carcinogenesis when hormone levels change dramatically. Our findings are consistent with this speculation in that we found a significant protective association for MBD2 variants in premenopausal women but no significant associations in postmenopausal women.

There are limitations to the present study. Only Caucasian women were included, and so the hypotheses must be examined further in multi-ethnic groups. In addition, the sample size limited our ability to explore other potential risk factors. Traditional risk factors such as parity and family history of cancer did not differ between cases and controls, which could also reflect the sample size. Inaccurate recall may also have affected our assessment of family history of cancer.

Conclusion

These findings imply a potential link between DNA methylation processes and hormonal expression. Although large molecular epidemiologic studies are warranted to further examine associations between MBD2 polymorphisms and breast cancer in multi-ethnic groups, this study suggests that genetic variations in methylation-related genes may serve as promising biomarkers for estimating breast cancer risk.