Background

Although breast cancer (BC) is the most frequent cancer in women in western countries and the second cause of cancer death after lung cancer [1], the risk factors that lead to the disease are not completely understood, although it is widely accepted that they include a combination of environmental and genetic factors. On the genetic side, a polygenic model has been proposed in which a combination of common variants, each having a modest individual effect, together contribute to BC predisposition [2].

Considerable evidence links carcinogenesis and oxidative stress regulation, including prooxidant and antioxidant defense systems [3–7]. Oxidative stress is defined as an imbalance between the production of reactive oxygen species (ROS) and reactive nitrogen species (RNS) and their removal by antioxidants. When this imbalance occurs, biomolecules are damaged by ROS and RNS and normal cellular metabolism is impaired, leading to changes in intra- and extracellular environmental conditions. ROS can cause lesions in DNA, such as mutations, deletions, gene amplification and rearrangements, that may lead to malignant transformation and cancer initiation and progression [8–10]. The effect of ROS and RNS, however, is balanced by the antioxidant action of non-enzymatic antioxidants and antioxidant enzymes, which maintain cellular redox levels under physiological conditions [4, 11].

Previous studies with knockout animals that lack antioxidant enzymes support the view that ROS contribute to the age-related development of cancer. For instance, mice deficient in the antioxidant enzyme CuZnSOD showed increased cell proliferation in the presence of persistent oxidative damage contributing to hepatocarcinogenesis later in life [12]. Another study showed that mice lacking the antioxidant enzyme Prdx1 had a shortened lifespan owing to the development, beginning at about 9 months, of severe hemolytic anemia and several malignant cancers [13].

In this context, single nucleotide polymorphisms (SNPs) in components of the cellular redox systems can modify the redox balance and take part in BC initiation and/or progression, as well as determine possible therapeutic treatments [14–17].

Despite the importance of oxidative stress in the development and progression of cancer, few studies have evaluated the relationship between genetic variation in genes coding for enzymes related to the redox system and the susceptibility to develop BC. Previous studies have focused mainly on the analysis of genes related to antioxidant defense enzymes [18, 19], whereas information about variation in genes involved in the oxidation process is relatively sparse.

The aim of this study was to evaluate the association between common variants in genes coding for proteins related to the redox system (antioxidant and oxidant systems or proteins) and the susceptibility to develop BC. We hypothesized that common SNPs related to the redox pathway are associated with an altered risk for BC. We chose 76 SNPs on which to perform a two-step study: a first exploratory set and a second, independent, validation set. We also decided to investigate the impact of complex interactions between SNPs at different genes of the oxidative stress pathway. To address this issue, we analyzed the effects of gene-gene interactions by the multifactor dimensionality reduction (MDR) approach. This analysis was carried out on four SNPs that were statistically significant in the combined set.

Methods

Study population

The underlying analyses were carried out in a Caucasian Spanish population. The study was carried out in two steps with two population groups. A first group of 1176 samples was composed of 493 female patients diagnosed for BC between the years 1998–2008 at La Paz Hospital and Foundation Jimenez Díaz (Madrid), and 683 healthy women controls recruited at the Hospital of Valladolid (Spain).

Subsequently, we chose the polymorphisms that showed a marginally significant association (p-value ≤ 0.15), and we replicated the procedure in a second independent group (n = 1233) that included 430 female patients diagnosed for BC between the years 1988–1998 at the Clinic Hospital of Valencia (Spain) and 803 samples from cancer-free women recruited at the blood donor bank at the same Hospital. Blood was collected between 2010 and 2011 during periodical patient visits. The blood from controls was extracted between the years 2009 and 2012. In both groups, the controls were women without pathology or history of cancer. Controls were not matched to cases, but were similar in age. In group 1, cases’ mean age was 57.5 (range 23.5-89.5), and that for donors was 52.7 (21.5-96.5). In group 2, cases’ mean age was 54.1 (20.5-86.5) while in donors, it was 54 (22.5-92.5).

We selected this staged approach because it allowed us to analyze only those polymorphisms with indicative results and reduced the number of genotyping reactions without significantly affecting statistical power [18, 20].

The research protocols were approved by the ethics committee of the INCLIVA Biomedical Research Institute. All the participants in the study were informed and gave their written consent to participate in the study.

Single nucleotide polymorphisms selection and genotyping

Two public databases were used to collect information about SNPs in oxidative pathway genes: NCBI (http://www.ncbi.nlm.nih.gov/projects/SNP/) and HapMap (http://www.hapmap.org). The selection of polymorphisms was performed by SYSNP [20] and by a literature search in PubMed, Scopus and EBSCO databases using the terms “breast cancer and polymorphisms and oxidative”, along with additional terms such as “SNPs and oxidative pathway and susceptibility”, and their possible combinations. The following criteria were used to select the SNPs: functional known or potentially functional effect, location in promoter regions, minor allele frequency (MAF) over 0.1 in Caucasian populations analyzed previously, localization and distribution along the gene (including upstream and downstream regions) and low described linkage disequilibrium between candidate polymorphisms. We included variants with potential influence in the gene and protein function, as well as the most important variants described in the literature.

Finally, we selected a total of 76 polymorphisms located in 27 genes related to the redox system: 17 were classified as antioxidant genes (CAT, GCLC, GCLM, GNAS, GPX6, GSR, GSS, M6PR, MSRB2, OGG1, SOD1, SOD2, SOD3, TXN, TXN2, TXNRD1, TXNRD2) and 10 as reactive species generators (mainly NADPH oxidase-related genes CYBB, NCF2, NCF4, NOS1, NOS2A, NOX1, NOX3, NOX4, NOX5 and XDH). Reference names and characteristics of the selected SNPs are provided in Table 1.

Table 1 Summary of the 76 selected SNPs in 27 genes

Experimental procedures

The blood samples remained frozen until the DNA extraction was performed. Genomic DNA was extracted from blood samples using a DNA Isolation Kit (Qiagen, Izasa, Madrid, Spain) following the manufacturer’s protocol, except that a final elution volume of 100 μl was used. DNA concentration and quality were measured in a NanoDrop spectrophotometer. Each DNA sample was stored at -20°C until analysis, which in all cases was performed within a year of the DNA extraction.

Genotyping analysis in both sets was performed by SNPlex technology (Applied Biosystems, Foster City, California, USA) according to the manufacturer’s protocol [21]. This genotyping system, based on an oligonucleotide ligation assay/polymerase chain reaction and capillary electrophoresis, was developed for accurate genotyping, high sample throughput, design flexibility and cost efficiency. Its precision and concordance with genotypes obtained using TaqMan probe-based assays have been validated. The sets of SNPlex probes were reanalyzed in about 10% of the samples, with a reproducibility of over 99%. Polymorphisms and samples with a genotyping rate lower than 85% in the first set were excluded from further analysis.

Statistical and MDR analyses

Statistical analysis was performed using SNPstats software [22], a free web-based tool, which allows the analysis of association between genetic polymorphisms and diseases. The proper analysis of these studies can be performed with general purpose statistical packages, but this software facilitates the integration of data. The association with disease is modeled as binary; the application assumes an unmatched case–control design and unconditional logistic regression models are used. The statistical analyses are performed in a batch call to the R package (http://www.R-project.org). SNPStats returns a complete set of results for the analysis. SNPstats provides genotype frequencies, proportions, odds ratios (OR) and 95% confidence intervals (CI), and p-values for multiple inheritance models. The lowest Akaike’s Information Criterion and Bayesian Information Criterion values indicate the best inheritance genetic model for each specific polymorphism. All the analyses were adjusted by age. Only SNPs with no significant deviation from Hardy-Weinberg equilibrium (HWE) in controls and a MAF exceeding 5% were retained for the association analysis (Table 1).
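The two quality filters named above (MAF above 5% and no significant deviation from HWE in controls) can be illustrated with a minimal sketch. It is not the SNPStats implementation: the function names are hypothetical, the HWE test is a plain 1-degree-of-freedom chi-square test, and the 0.05 significance threshold for HWE is an assumption, since the text does not state the cut-off that was used.

```python
import math


def maf(n_aa, n_ab, n_bb):
    """Minor allele frequency from genotype counts (AA, AB, BB)."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1 - freq_a)


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def passes_qc(n_aa, n_ab, n_bb, maf_min=0.05, hwe_alpha=0.05):
    """Keep a SNP if MAF > maf_min and HWE is not rejected in controls.

    hwe_alpha is an assumed threshold, not one reported in the study.
    """
    return (maf(n_aa, n_ab, n_bb) > maf_min
            and hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= hwe_alpha)
```

For example, control genotype counts of 100/200/100 sit exactly on the HWE expectation and pass both filters, whereas counts of 200/0/200 (no heterozygotes) are rejected by the HWE test.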

To identify gene-gene interactions, MDR was used. It is a non-parametric and genetic model-free approach that uses a data reduction strategy [23–25]. This method constructs a single variable that incorporates information from several loci, whose genotype combinations are divided into high-risk and low-risk groups. This new variable can be evaluated for its ability to classify and predict outcome risk status using cross-validation and permutation testing. Both were used to prevent over-fitting and false positives arising from multiple testing. With n-fold cross-validation, the data are divided into n pieces of equal size. An MDR model is fit using (n-1)/n of the data (the training set) and then evaluated for its generalizability on the remaining 1/n of the data (the testing set). The fitness of an MDR model is assessed by estimating accuracy in the training set and the testing set. Moreover, the method estimates the degree to which the same best model is discovered across the n divisions of the data, referred to as the cross-validation consistency (CVC). The best MDR model is the one with the maximum testing accuracy. Statistical significance is determined using permutation testing. We used 10-fold cross-validation and 1000-fold permutation testing. MDR results were considered statistically significant at the 0.05 level. An advantage of this method is that it makes no underlying assumptions about the independence or biological relevance of SNPs or any other factor. This is important for diseases such as sporadic BC, where the etiology is not completely known. We used the MDR software (version 2.0 beta 8.4), which is freely available (Epistasis.org: http://www.epistasis.org).
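The core of the procedure described above can be sketched in a few lines. This is an illustrative re-implementation under simplifying assumptions, not the MDR software used in the study: genotypes are coded 0/1/2, a genotype combination is called high risk when its case:control ratio exceeds the overall ratio in the training data, permutation testing is omitted, and the data come from a hypothetical simulation (`simulate`) in which loci 0 and 2 interact.

```python
import random
from collections import Counter
from itertools import combinations


def fit_mdr(rows, labels, loci):
    """Label each multilocus genotype cell high risk (True) or low risk (False)."""
    cases, controls = Counter(), Counter()
    for row, y in zip(rows, labels):
        cell = tuple(row[i] for i in loci)
        if y == 1:
            cases[cell] += 1
        else:
            controls[cell] += 1
    n_cases = sum(labels)
    n_controls = len(labels) - n_cases
    # High risk if the cell's case:control ratio exceeds the overall ratio
    # (written as a cross-product to avoid dividing by zero).
    return {cell: cases[cell] * n_controls > controls[cell] * n_cases
            for cell in set(cases) | set(controls)}


def balanced_accuracy(model, rows, labels, loci):
    """Mean of sensitivity and specificity of the high/low-risk classifier."""
    tp = tn = fp = fn = 0
    for row, y in zip(rows, labels):
        pred = model.get(tuple(row[i] for i in loci), False)  # unseen cell: low risk
        if y == 1:
            tp, fn = tp + pred, fn + (not pred)
        else:
            fp, tn = fp + pred, tn + (not pred)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return (sens + spec) / 2


def mdr_cv(rows, labels, n_loci, k=10, seed=0):
    """Return (most selected loci, CVC, mean testing balanced accuracy)."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    chosen, test_acc = Counter(), []
    for f in range(k):
        train = [i for g in range(k) if g != f for i in folds[g]]
        tr_rows, tr_y = [rows[i] for i in train], [labels[i] for i in train]
        te_rows, te_y = [rows[i] for i in folds[f]], [labels[i] for i in folds[f]]
        best_loci, best_model, best_train = None, None, -1.0
        for loci in combinations(range(len(rows[0])), n_loci):
            model = fit_mdr(tr_rows, tr_y, loci)
            acc = balanced_accuracy(model, tr_rows, tr_y, loci)
            if acc > best_train:
                best_loci, best_model, best_train = loci, model, acc
        chosen[best_loci] += 1
        test_acc.append(balanced_accuracy(best_model, te_rows, te_y, best_loci))
    loci, cvc = chosen.most_common(1)[0]
    return loci, cvc, sum(test_acc) / k


def simulate(n=600, seed=1):
    """Hypothetical data: 4 SNPs, disease risk driven by loci 0 and 2 jointly."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        g = [rng.randrange(3) for _ in range(4)]
        risk = 0.8 if (g[0] + g[2]) % 2 == 1 else 0.2
        rows.append(g)
        labels.append(1 if rng.random() < risk else 0)
    return rows, labels
```

Calling `mdr_cv(*simulate(), n_loci=2)` returns the locus pair selected most often across the 10 folds, its CVC, and the mean testing balanced accuracy. A permutation test would repeat the same call on data with shuffled case/control labels and compare the observed testing accuracy with that null distribution.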

Results

Single nucleotide polymorphisms and susceptibility to breast cancer

Set 1: To determine the possible association of polymorphisms related to oxidative stress genes and BC, we analyzed 76 polymorphisms in 27 genes of the redox system in 493 cases and 683 controls (Table 1). Seven SNPs (rs3749930, rs2036343, rs34990910, rs17881274, rs17011353, rs17011368, rs17323225) with MAF <0.05 in controls, along with two SNPs (rs725521 and rs231954) not in Hardy-Weinberg equilibrium, were excluded from the association analysis (Table 1). A total of 67 SNPs were successfully genotyped and analyzed.

Our association analysis in set 1 identified four nominally statistically significant results (p < 0.05). Table 2 shows the results for the selected polymorphisms. Polymorphisms rs974334, rs1805754, rs4135225 and rs207454 showed an association with modified risk for BC. All the results were adjusted by age.

Table 2 Comparison of genotype frequencies between breast cancer patients and controls (Set 1)

Set 2: Subsequently, and in order to better identify those polymorphisms that could be associated with BC, we replicated the 10 SNPs with a p-value equal to or lower than 0.15 in group 1 [rs3736729, OR: 0.74 (0.54-1.01); rs406113, OR:1.26 (0.98-1.62); rs974334, OR:2.01 (1.07-3.80); rs1805754, OR: 1.31 (1.02-1.68); rs1052133, OR: 1.76 (1.00-3.10); rs2284659, OR: 1.30 (0.92-1.84); rs2301241, OR: 0.80 (0.60-1.07); rs4135179, OR: 1.27 (0.97-1.66); rs4135225, OR: 0.66 (0.45-0.96); rs207454, OR: 4.98 (1.28-19.34)] in a second independent set.

Set 1 + Set 2: Finally, we analyzed the 10 polymorphisms in the global population set 1 + set 2 (n = 2409; cases = 923, controls = 1486). The results are listed in Table 3. Of the 10 polymorphisms analyzed in both samples, 6 showed a statistically significant association with BC risk when the combined data were analyzed: rs406113 [OR: 1.23 (1.04-1.46)], rs974334 [OR: 1.73 (1.09-2.73)], rs1052133 [OR: 1.82 (1.31-2.52)], rs2284659 [OR: 1.33 (1.05-1.67)], rs4135225 [OR: 0.77 (0.60-0.99)] and rs207454 [OR: 2.12 (1.11-4.04)]. Of these polymorphisms, rs1052133 on the OGG1 gene maintained statistical significance (p-value = 0.0004) after the Bonferroni correction.
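As a rough plausibility check on the Bonferroni statement, the sketch below back-calculates an approximate two-sided p-value from a reported OR and 95% CI, assuming a Wald-type interval on the log-odds scale, and then applies the Bonferroni adjustment. The number of tests is not stated in the text, so the values 10 (SNPs carried into the combined analysis) and 67 (SNPs analyzed in set 1) are assumptions used only for illustration; the function names are hypothetical.

```python
import math


def p_from_or_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Approximate two-sided p-value from an OR and its 95% Wald CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# rs1052133 in set 1 + set 2, as reported: OR 1.82 (1.31-2.52), p = 0.0004.
approx_p = p_from_or_ci(1.82, 1.31, 2.52)
adjusted_10 = bonferroni(0.0004, 10)   # assumed: 10 tests
adjusted_67 = bonferroni(0.0004, 67)   # assumed: 67 tests
```

Under either assumed number of tests the adjusted value for the reported p = 0.0004 stays below 0.05 (0.004 and about 0.027), and the p-value recovered from the confidence interval is of the same order of magnitude as the reported one.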

Table 3 Genotype frequencies of relevant polymorphisms in different Sets

Gene-gene interactions in breast cancer patients

There is growing evidence that epistatic interactions between genes may play a role in cancer risk, and different variable selection approaches have been developed to analyze potential gene-gene and gene-environment interactions [25]. The four polymorphisms most significantly associated with susceptibility to BC in set 1 + set 2 were selected for this analysis: rs406113 [OR: 1.23 (1.04-1.46)], rs974334 [OR: 1.73 (1.09-2.73)], rs1052133 [OR: 1.82 (1.31-2.52)] and rs2284659 [OR: 1.33 (1.05-1.67)]. Data from 1182 samples (controls and patients) from both groups were used. The genotypes were grouped according to the model predicted for each of the four polymorphisms: recessive model for rs1052133 (CC and CG were grouped into a single block), dominant model for rs406113 (CC and AC genotypes were grouped into a single block), recessive model for rs974334 (CC and CG genotypes were grouped into a single block) and recessive model for rs2284659 (GG and GT genotypes were grouped into a single block). For a two-loci interaction, the combination of polymorphisms rs406113 (GPX6) and rs1052133 (OGG1) was the most significant (p = 0.041). The best three-loci model included rs406113 on the GPX6 gene, rs1052133 on the OGG1 gene and rs2284659 on the SOD3 gene, and it showed statistical significance (p < 0.0007) with an OR = 1.82 and 95% CI = 1.28-2.58. The four-loci model, comprising rs406113 and rs974334 on the GPX6 gene, rs1052133 on the OGG1 gene and rs2284659 on the SOD3 gene, predicted breast cancer with a testing balanced accuracy of 0.5267. This four-loci model had a chi-square value of 11.284 (p = 0.0008) and an OR of 1.75 [95% CI = 1.26-2.44]. The four-polymorphism combined model showed a higher predisposition to BC than the polymorphisms rs406113, rs974334 and rs2284659 did individually (ORX2 = 1.23, ORX3 = 1.73, ORX6 = 1.33) and had a value similar to that of polymorphism rs1052133 (ORX5 = 1.82). A summary of the multifactor dimensionality reduction results is given in Table 4.
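The genotype grouping described above amounts to recoding each SNP as a binary variable before the MDR step. A minimal sketch of that recoding follows; the genotype blocks are taken from the text, whereas the function names and the 0/1 coding are illustrative assumptions rather than the encoding used by the MDR software.

```python
# Genotypes grouped into a single block for each SNP, as listed in the text.
GROUPED = {
    "rs1052133": {"CC", "CG"},  # OGG1
    "rs406113": {"CC", "AC"},   # GPX6
    "rs974334": {"CC", "CG"},   # GPX6
    "rs2284659": {"GG", "GT"},  # SOD3
}


def collapse(snp, genotype):
    """Return 1 if the genotype falls in the grouped block for this SNP, else 0."""
    # Sort the two alleles so that "GC" and "CG" are treated identically.
    normalized = "".join(sorted(genotype.upper()))
    return 1 if normalized in GROUPED[snp] else 0


def encode_sample(genotypes):
    """Encode a dict {snp: genotype} as a tuple of 0/1 codes in a fixed SNP order."""
    return tuple(collapse(snp, genotypes[snp]) for snp in sorted(GROUPED))
```

With four binary variables there are at most 16 multilocus combinations, each of which MDR then labels as high or low risk.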

Table 4 Summary of Multi-factor Dimensionality (MDR) results

The combined genotype AA for rs406113, CC/CG for rs974334, CC/CG for rs1052133 and GG/ GT for rs2284659 showed a higher risk for BC, which is consistent with the models described for the polymorphisms individually. Figure 1 summarizes the four-loci genotype combinations associated with high and low risk and with the distribution of cases and controls.

Figure 1

The polymorphisms rs406113, rs974334, rs1052133 and rs2284659, showing the highest statistical significance in the combined set 1 + set 2, were chosen for the gene-gene interaction analysis. The MDR analysis was done with the genotypes collapsed according to the genetic models selected: rs1052133 (OGG1), recessive model, CC/CG vs. GG; rs406113 (GPX6), dominant model, CC/AC vs. AA; rs974334 (GPX6), recessive model, CC/CG vs. GG; rs2284659 (SOD3), recessive model, GG/GT vs. TT. The figure shows the summary of four-loci genotype combinations associated with high and low risk. Cases: left bars; controls: right bars. The epistatic gene-gene interaction corresponds to the high-risk combinations (darkest color).

Discussion

Genetic association studies involving SNPs and their possible interactions have become increasingly important for the study of human diseases. The present study has focused on genes encoding proteins of the redox system. It has long been established that they are involved in extensive damage to DNA, which in turn leads to gene mutations and, finally, carcinogenesis. The functionality of polymorphisms in relation to oxidative stress has been proven in several cases. For instance, the polymorphism in exon 2 of the superoxide dismutase 2 (SOD2) gene A16V (C/T) (rs4880) leads to structural alterations in the domain responsible for mitochondrial targeting, reducing the antioxidant potential [26]. Furthermore, a functional polymorphism in exon 9 of the CAT gene and other polymorphisms in endothelial NO synthase (eNOS) that seem relevant for their activity have been documented [26–28]. Therefore, it is clear that a single nucleotide modification can lead to structural changes, or to modifications in protein binding affinity or in activity, that may be relevant to the redox system. Our hypothesis was that variations in genes from the oxidative stress pathway that have been shown to have a possible link to BC can be associated with predisposition to this disease. Indeed, genetic variations in these pathways have been shown to modify the risk for BC [29].

In the present epidemiological study, we have assessed the effect of 76 SNPs in 27 genes in a case–control study in a Spanish population. For the SNPs retained in the association analysis, genotype distributions in the controls did not differ significantly from those expected under HWE. The study was performed in two independent sets of patients and controls: first, to select the relevant polymorphisms and, second, to check the reproducibility and significance of these preliminary results.

Six SNPs (rs406113 and rs974334 on the glutathione peroxidase 6 (GPX6) gene, rs1052133 on the 8-oxoguanine DNA glycosylase (OGG1) gene, rs2284659 on the superoxide dismutase 3 (SOD3) gene, rs4135225 on the thioredoxin (TXN) gene and rs207454 on the xanthine dehydrogenase (XDH) gene) are associated with variations in the predisposition to BC.

The rs406113 (c.39 T > G; p.F13L) and rs974334 (c.242-12G > C) polymorphisms on the GPX6 gene had not been studied previously; in fact, there was no information available in the literature about polymorphisms on the GPX6 gene even though they can have a functional effect. Genetic variants in other genes of the GPX family have been associated with BC [30–32].

Thioredoxin (TXN) is overexpressed in BC, and it is related to tumor grade [33], being a crucial element in redox homeostasis [34]. Studies of polymorphisms in the TXN gene, encoding thioredoxin, are few in cancer. Seibold et al. [19] evaluated the influence of common variants in the TXN, thioredoxin reductase 1 (TXNRD1) and thioredoxin 2 (TXN2) genes on the risk of BC after menopause, including seven of the SNPs analyzed in our study. rs2301241 and rs2281082 were not significantly related to BC risk in our study; however, Seibold et al. found a limited association of rs2301241 with BC risk when comparing rare homozygotes vs. common homozygotes. Other studies found a borderline significance [18]. In the case of rs2281082, the borderline association of the Seibold study was not confirmed in other publications [19]. In our study population, we found that carriers of one T allele on rs4135225 (c.196-192C > T) had a lower risk of BC development (OR = 0.77 [95% CI; 0.60-0.99] p = 0.041). Seibold et al. found a predisposition to BC for this polymorphism (OR = 1.22 [95% CI; 1.06-1.41]). One must take into account that the Seibold study focused on postmenopausal women, unlike ours. Still, in their analysis, they compared only two of the three possible genotypes (heterozygotes vs. common homozygotes). In any case, their results showed borderline significance.

Xanthine dehydrogenase (XDH) is an important enzyme involved in the first-pass metabolism of 6-mercaptopurine [35]. Polymorphisms in the XDH gene have been related to cancer. The rs1884725 polymorphism has been identified as a genetic variant associated with disease risk and outcomes in multiple myeloma [36]. In our study, one of the thirteen polymorphisms evaluated on this gene showed an association with BC risk. Carriers of one A allele of rs207454 had a 2.12-fold higher risk ([95% CI; 1.11-4.04], p = 0.024) of developing the illness than non-carriers. To our knowledge, there are no studies of these polymorphisms in the literature. The results presented here suggest an association with the development of BC, although further studies are needed to confirm it.

The polymorphism on the SOD3 gene (rs2284659) analyzed in our study showed a trend to the predisposition for BC in the global analysis. There was no information about this polymorphism in the literature. Other polymorphisms in this gene, like rs2536512 and rs699473, have been associated in BC patients with the incidence of tumor and poorer progression-free survival (PFS) [37]. Moreover, some results suggest that rs699473 may influence brain tumor risk [38].

The variant rs1052133 (Ser326Cys) in the OGG1 gene showed the same tendency towards predisposition for BC in both sets, separately and in the combined data set. Previous studies of this polymorphism had conflicting results [39–45]. Three meta-analyses have attempted to summarize the results [39, 41, 46]. In one study, the authors analyzed this polymorphism in relation to several cancers, finding a significant association only with the risk for lung cancer [46]. The other two meta-analyses focused on BC, and their results are contradictory. Yuan et al. found an association only in the European population subgroup [41], while Gu et al. did not find any association, even when stratifying the analysis by ethnicity or menopausal status [39]. These differences may have arisen from the different number of studies included in the European group.

We found an increased risk of developing BC in carriers of at least one Ser allele (recessive model), whether the sets were considered separately or together (OR = 1.82 [95% CI 1.31-2.52]; p-value = 0.0004). Our results are in concordance with the meta-analysis by Yuan and collaborators, which suggests that the hOGG1 326 Cys allele provides a significant protective effect for BC in European women [41]. The importance of this SNP rests on the role of the 8-oxoguanine DNA glycosylase, encoded by OGG1 [47]. This enzyme can excise the 8-hydroxy-2´-deoxyguanosine (8-OHdG) modifications occurring in the DNA as a result of hydroxyl radical interaction [41, 48, 49]. An incorrect expression of the protein could interfere with the suitable repair of the genetic material. Other polymorphisms in the OGG1 gene, such as rs2304277, recently described by Osorio et al., have been associated with ovarian cancer risk in BRCA1 mutation carriers [50]. These data support the importance of genetic changes in the OGG1 gene in relation to the predisposition to cancer.

The epistatic analysis of the four most significant polymorphisms in relation to the susceptibility to BC was performed with the MDR method. This is a reliable approach that has been widely used [23, 25, 51–54]. The genotypes were grouped according to the model predicted for the four polymorphisms in Tables 2 and 3. The result obtained was an OR = 1.75 [95% CI = 1.26-2.44; p-value = 0.0008], a value similar to that obtained for rs1052133 (OR = 1.82 [95% CI = 1.31-2.52; p-value = 0.0004]). The previous study of Cebrian et al. on antioxidant defence enzymes and BC susceptibility shares twelve SNPs with our study. Two showed discrepancies with our data: rs511895 in the CAT gene was not significant in our analysis, but it presented a borderline tendency in the Cebrian et al. study. Moreover, they found a significant difference in genotype distribution between cases and controls for rs4135179 (TXN). We, however, were unable to confirm this in our global analysis, although we detected a marginal significance in set 1. The reason for this discrepancy may lie in the characteristics of the populations, in particular the higher age of the population included in the Cebrian study [18].

Our study has several limitations to take into consideration. Firstly, there are no data available about the lifestyle of controls and patients that could be related to oxidative stress, such as diet, exercise and the consumption of tobacco and alcohol. Secondly, polymorphisms that were not explored in our study may affect the risk of developing BC and should be taken into account in the analysis of our data and in further studies. Nevertheless, the association between SNPs and risk for BC is reliable, since the statistical power exceeded 95% in all cases. All samples are from the same country and ethnicity, and the adjustment for age reduces variability.

Additionally, MDR has 80% statistical power to detect true interactions in two-, three-, and four-way gene-gene interactions, even with a small number of cases and controls [24]. Furthermore, several associations detected in these data involved SNPs occurring in non-coding regions. However, variations in the intronic structure have been proposed to influence cancer susceptibility via regulation of gene expression, gene splicing or mRNA stability. It is also possible that these polymorphisms are in linkage disequilibrium with other functional polymorphisms that may affect BC susceptibility.

Despite these considerations, our work, as far as we know, is the largest study in the Spanish population that analyzes the influence of polymorphisms in oxidative genes on susceptibility to BC. Overall, our data, together with those published in the literature [18, 19, 29, 37, 41, 45, 55–62], suggest a role of stress-response gene variants in the susceptibility to BC.

Conclusions

Our results suggest that different genotypes in genes of the oxidant/antioxidant pathway could affect the susceptibility to breast cancer. We have found six polymorphisms in OGG1, GPX6, SOD3, TXN and XDH genes significantly associated with predisposition to breast cancer. These associations have not been described previously, except for rs1052133 (OGG1). Concerning this polymorphism the published results in breast cancer were contradictory, and some authors found only a significant association with the risk of developing lung cancer. We have found an increment in the risk of developing breast cancer in the carriers of at least one Ser allele (recessive model) in concordance with a meta-analysis of breast cancer susceptibility in European women. In this particular case, an incorrect expression of the protein encoded by the OGG1 gene could interfere with the suitable repair of the genetic material. Furthermore, our study highlighted the importance of the analysis of the epistatic interactions in order to define the influence of genetic variants in susceptibility to breast cancer more precisely. Further studies on the relevance of these and other polymorphisms in the development of breast cancer should be performed.