Introduction

Every year in developed countries, including Poland, a growing number of cases of sporadic colorectal cancer is observed. This tumor is still associated with a high mortality rate in patients. Several lines of evidence indicate that lifestyle factors including cigarette smoking, alcohol consumption and dietary habits may contribute to sporadic colorectal cancer (CRC) risk [1]. In addition, genetic variations in different biological pathways may modulate the influence of environmental insults in a gene × environment (G × E) manner [2].

To date, a plethora of low-penetrance single nucleotide polymorphisms (SNPs) has been linked to the molecular background of CRC [3]. Numerous studies have been dedicated to the role of genetic variants in genetic/biochemical pathways such as xenobiotic-metabolizing enzymes or DNA repair networks, which may modify personal vulnerability to environmental exposures and consequently individual susceptibility to CRC [4–7].

Xenobiotic-metabolizing enzymes (XMEs) create a first line of protection against numerous mutagenic agents [8]. Activation of chemical pro-carcinogens is mediated mainly by cytochrome P-450 enzymes (phase I biometabolism of xenobiotics), while elimination of environmental chemicals proceeds through complex detoxification mechanisms and is associated with phase II biometabolic enzymes [9]. Among phase I enzymes, CYP1A1 polymorphisms have been extensively studied in the etiology of different cancers such as breast, prostate and lung [9–11]. In turn, the impact of genetic variation in phase II xenobiotic clearance enzymes has been attributed, among others, to genes such as NAT2 (N-acetyltransferase 2) and GSTT1 (glutathione S-transferase T1) [12].

The DNA repair network is a key protection against DNA-damaging carcinogenesis. A DNA repair system consists of a multitude of genes acting in several pathways specialized in the repair of distinct types of DNA damage. The base excision repair (BER) pathway is implicated in counteracting the adverse effects of reactive oxygen species and is ensured by the 8-oxoguanine DNA glycosylase (OGG1) gene and the XRCC1 gene [13]. Nucleotide excision repair (NER), involved in removing ultraviolet- and xenobiotic-induced DNA modifications, involves the activity of xeroderma pigmentosum (XP) genes and the excision repair cross-complementing group (ERCC) genes [14]. Finally, homologous recombination repair (HR) is responsible for repairing double-strand DNA damage and is linked, among others, to the activity of nibrin 1 (NBS1) and Rad51 [15].

The results of numerous studies on associations involving genes in the aforementioned pathways were often contradictory and had insufficient statistical power. Moreover, assessing individual genes separately did not provide comprehensive insight into the molecular etiology of cancer, including CRC.

In the present study, we performed an analysis on a Polish cohort consisting of 478 patients and 404 controls. Participants of the study originate from southwest (Lower Silesia) and central Poland. We analyzed two polymorphisms in XMEs, CYP1A1 (rs1048943) and NAT2 A803G (rs1208), and three in DNA repair genes: ERCC1 Asp118Asn (rs11615), XRCC3 Met241Thr (rs861539) and XPC i11C/A (rs2279017). The selection of these polymorphisms was supported by an analysis of the available and most current literature data [16–28] as well as by our preliminary studies concerning CYP1A1 and NAT2 polymorphisms (unpublished).

Materials and methods

Subjects

Colorectal cancer patients’ group

In total, 478 CRC patients were analyzed. Blood samples from 110 sporadic CRC patients were obtained from the 2nd Department of General and Oncological Surgery and the 1st Department of General Gastroenterological and Endocrine Surgery, Wroclaw Medical University, Lower Silesia (aged 65.64 ± 10.10) (Supplementary Fig. 1, Supplementary Table 1). DNA from the remaining 368 patients with sporadic CRC was obtained from the Warsaw Centre of Oncology-Institute, central Poland (aged 48.85 ± 12.23) (Supplementary Table 2 and Supplementary Fig. 2).

Control group

In total, 404 controls were analyzed. One hundred healthy controls were enrolled at the Department of Internal Diseases, Clinical Hospital, Swiebodzice, Lower Silesia, as described previously [16]. This group is briefly presented in Supplementary Table 1 and Supplementary Fig. 1. A further 304 healthy volunteers (from central Poland) tested negative on screening colonoscopy for any abnormalities in the lower gastrointestinal tract. Familial and individual history of neoplasia was negative within the group of controls. All participants were Caucasians; all completed and signed an informed consent form. They were aged 58.35 ± 4.92 (Supplementary Table 2 and Supplementary Fig. 2). The study was approved by the local Ethics Committee.

Combined groups, demographic data

Five selected polymorphisms were analyzed in a combined group consisting of 478 CRC cases versus 404 controls. The mean age of CRC patients was 52.75 ± 13.73 and that of controls was 62.45 ± 9.15 (Supplementary Table 3 and Supplementary Fig. 3).

Analysis of studied polymorphisms

Genomic DNA from Lower Silesian participants was obtained from peripheral blood lymphocytes using standard phenol–chloroform extraction and ethanol precipitation. Genotyping of studied polymorphisms was performed using polymerase chain reaction and restriction fragment length polymorphism (PCR–RFLP) method described elsewhere [8, 16].

The DNA for the Centre of Oncology-Institute (Warsaw) group was extracted using the QIAamp DNA Blood Mini Kit (QIAGEN). Genotyping of selected polymorphisms was done using the TaqMan® SNP Genotyping Assays (Life Technologies), SensiMix II Probe Kit (Bioline) and the ABI Prism® 7900HT real-time PCR system.

Statistics

Before running the association analysis, the PLINK v1.07 software was used to rule out any deviations from the Hardy–Weinberg equilibrium in the control groups of individuals.
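As an illustration of what a Hardy–Weinberg check computes, the following minimal Python sketch compares observed genotype counts with the counts expected from the allele frequencies. It uses a 1-degree-of-freedom chi-square approximation, which is a simpler substitute chosen for this sketch and not necessarily the test PLINK applies; the function name and the example counts are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb are observed genotype counts for a biallelic marker.
    Returns (chi2, p_value). Assumes both alleles are present in the sample.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df:
    # P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical control-group genotype counts, for illustration only.
    chi2, p_value = hwe_chi_square(250, 130, 24)
    print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}")
```

For rare alleles the chi-square approximation can be inaccurate, so results from this sketch may differ from those of an exact test.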

To assess the statistical power of the analysis for each cohort, the PS: Power and Sample Size Calculation version 3.0 software was used, taking into account common and rare variants [29].
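The idea behind such a power assessment can be sketched in Python with a normal approximation for comparing allele frequencies between cases and controls. This is an illustrative approximation, not the method implemented in the PS software; it treats the two alleles of each individual as independent observations, ignores the negligible opposite tail of the two-sided test, and all function names are hypothetical.

```python
from statistics import NormalDist


def case_allele_freq(p0, odds_ratio):
    """Allele frequency in cases implied by control frequency p0 and an allelic OR."""
    return odds_ratio * p0 / (1.0 - p0 + odds_ratio * p0)


def allelic_power(n_cases, n_controls, p0, odds_ratio, alpha=0.05):
    """Approximate power of a two-sided test comparing allele frequencies."""
    norm = NormalDist()
    p1 = case_allele_freq(p0, odds_ratio)
    n1, n0 = 2 * n_cases, 2 * n_controls  # two alleles per individual
    p_bar = (n1 * p1 + n0 * p0) / (n1 + n0)
    # Standard error under the null (pooled) and under the alternative
    se_null = (p_bar * (1 - p_bar) * (1 / n1 + 1 / n0)) ** 0.5
    se_alt = (p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0) ** 0.5
    z_crit = norm.inv_cdf(1 - alpha / 2)
    return norm.cdf((abs(p1 - p0) - z_crit * se_null) / se_alt)


if __name__ == "__main__":
    # Sample sizes of the combined cohort; frequencies and ORs are scenarios.
    for p0, odds_ratio in [(0.5, 1.5), (0.05, 2.0)]:
        power = allelic_power(478, 404, p0, odds_ratio)
        print(f"p0 = {p0}, OR = {odds_ratio}: power = {power:.2f}")
```

The sketch makes the dependence on sample size explicit: for a fixed odds ratio, a low control allele frequency shrinks the absolute frequency difference and therefore requires more participants to reach the same power.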

The selected polymorphisms were analyzed using both Fisher’s exact test, performed in the PLINK software, and logistic regression analysis, performed in R-project and accounting for the additive model of gene action. The logistic regression model included the patients’ gender as an additional independent variable. The reference (index) level for the sex variable was set to female.
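A single-marker allelic test of this kind operates on a 2 × 2 table of allele counts (minor versus major allele, cases versus controls). The Python sketch below computes a two-sided Fisher's exact p-value and an odds ratio with a Woolf (log-based) 95 % confidence interval. The confidence interval method is an assumption of this sketch and may differ from the one used by PLINK; the function names and example counts are hypothetical.

```python
import math


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = math.comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell
        return math.comb(row1, x) * math.comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf confidence interval; all cells must be non-zero."""
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical allele counts: rows = cases, controls; columns = minor, major.
    a, b, c, d = 60, 160, 35, 165
    print("p =", fisher_exact_two_sided(a, b, c, d))
    print("OR (95% CI) =", odds_ratio_ci(a, b, c, d))
```

The logistic regression with an additive genotype coding (0, 1 or 2 copies of the minor allele) and sex as a covariate is not reproduced here; the sketch covers only the allelic test.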

To evaluate the cumulative performance of the studied polymorphic variants to discriminate between high and low CRC risk individuals, the ROC (receiver operating characteristic) curves were plotted and the AUC (area under the curve) parameters were computed using the epicalc package of the R-project [30].
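The AUC can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control. The following Python sketch computes it directly from that definition (the rank-based Mann–Whitney formulation). It is not the epicalc implementation, and the function name and example scores are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve from predicted scores and 0/1 labels.

    Counts, over all case-control pairs, how often the case has the higher
    score; ties contribute one half.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("both cases and controls are required")
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


if __name__ == "__main__":
    # Hypothetical predicted risks (e.g., from a logistic regression model).
    scores = [0.10, 0.40, 0.35, 0.80]
    labels = [0, 0, 1, 1]
    print("AUC =", roc_auc(scores, labels))
```

An AUC of 0.5 corresponds to no discrimination between cases and controls, and 1.0 to perfect discrimination.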

Results

For statistical analysis, we took into account the aforementioned Wroclaw Medical University (WMU) cohort, an independent group of CRC cases and controls collected at the Warsaw Centre of Oncology-Institute (COI), and both groups combined. The respective sub-studies reached various levels of statistical power to identify associations for alleles found in the general population at high (p0 = 0.5) and low (p0 = 0.05) frequencies (Supplementary Fig. 4). The sub-study focusing on the combined group of WMU and COI patients reached a power of >80 % to identify allele associations of common variants at OR > 1.5 and of rare variants at OR > 2 (Supplementary Fig. 4 C).

The Hardy–Weinberg equilibrium analysis indicated a deviation from the equilibrium in the case of rs2279017 (XPC) at p = 4.48E−04 and rs11615 (ERCC1) p = 4.78E−02, only in the group of CRC affected individuals collected and genotyped at WMU (Supplementary Table 4 in comparison with Supplementary Table 5 and Table 6).

The single marker, allelic association analysis performed in the whole WMU cohort indicated association of rs1208 (NAT2) minor allele G at OR 1.75 (1.18–2.6); p = 2.72E−02 (Supplementary Table 7). A very similar result was obtained for the WMU cohort individuals 50 years of age and above (Supplementary Table 8). This association showed a slight tendency of gender specificity being more pronounced in the female group of individuals after 50 years of age.

In contrast, a similar allelic association analysis performed in the COI cohort showed no association with CRC in the whole group, in the group of individuals above 50 years of age, or in the combined groups (Supplementary Table 9, Table 10 and Table 11).

Interestingly, for the combined cohort of WMU and COI patients 50 years of age or above, we identified an independent association (not seen in any of the individual cohorts) of the marker rs1048943 in the CYP1A1 gene, OR 2.05 (1.29–3.28); p = 1.25E−02 (Supplementary Table 12 A). The association showed quite pronounced gender specificity, being stronger in the female group of CRC patients, OR 2.72 (1.43–5.14); p = 1.14E−02 (Supplementary Table 12 C).

The logistic regression model analysis confirmed the results of the allelic one and also indicated an important influence of gender on overall CRC susceptibility, with males being at somewhat greater risk of CRC, OR 1.48 (1.11–1.98); p = 4.55E−02 (Supplementary Table 15 A). This effect was especially strong in the WMU cohort, OR 3.18 (1.73–5.84); p = 1.19E−03 (Supplementary Table 13 A). However, in the COI patient group, this effect was not seen (Supplementary Table 14). The receiver operating characteristic (ROC) plots computed from the logistic regression additive model showed that the cumulative effects of the selected markers were a moderately strong predictor of increased risk of CRC, with AUC (area under the curve) values of 0.709, 0.576 and 0.586 for the WMU, COI and combined cohorts, respectively (Supplementary Fig. 5). In general, the indicator performance was observed to be slightly better for the cohorts 50 years of age or above.

Discussion

In Poland, an upward trend in sporadic colorectal cancer (CRC) is observed among both male and female individuals. It has been estimated that over the forthcoming years (until 2025), the number of cases will almost double [19]. Annually, approximately 14,600 new cases of the disease are diagnosed. The average 5-year survival rate remains at a similar level for both sexes and approximates 30 percent. It is estimated that mortality from CRC will double in the next 20 years, especially in patients over 65 years of age [31].

The main risk factors for the disease are age (about 65 % of patients are elderly), gender (1.4:1 male to female ratio), inflammatory bowel disease, family history of CRC, individual susceptibility to cancer modulated by low-penetrance gene polymorphisms, and external factors such as diet and lifestyle [32].

In this study, we aimed to assess the role of the chosen polymorphisms as risk factors in carcinogenesis of the colon. Only one of the studied SNPs, the CYP1A1 Ile462Val variant (minor allele), emerged as a single putative risk allele in sporadic colorectal cancer patients, especially for women over 50 years of age.

The CYP1A1 gene is located on the long arm of chromosome 15 (15q24.1) and belongs to the cytochrome P450 enzyme family, which participates in the first phase of xenobiotic metabolism. The main substrates of CYP1A1 include many environmental carcinogens such as alkaloids or heterocyclic aromatic amines. Hormones (e.g., estrone, testosterone) and some drugs are also metabolized during the first phase via CYP1A1 [33]. Polymorphisms present in the CYP1A1 gene may influence the effectiveness of the metabolism of environmental carcinogens. The A>G transition in codon 462 (exon 7) leads to a substitution of isoleucine by valine (Ile462Val). In consequence, the effectiveness of the enzyme is increased, and thereby a higher amount of carcinogenic active molecules, especially polycyclic aromatic hydrocarbons (PAHs), may be involved in carcinogenesis of the colorectal mucosa [33].

The Ile462Val polymorphism has been extensively investigated in various cancers, such as endometrial, lung, cervical, oral, head and neck, ovarian, gastric, breast and colorectal [16–28], as well as in healthy populations, e.g., the Polish population of the Silesian region [34]. However, the role of this polymorphism remains ambiguous and not fully explored because of the low frequency of the minor allele, especially in Caucasians.

Recently, two meta-analyses, done by Jin (2011) and Zheng (2012), concerning CYP1A1 Ile462Val polymorphism and its role in colorectal cancer risk have been published [25, 26]. Jin et al. took into consideration thirteen case–control studies consisting of 5,336 CRC cases and 6,226 controls. They concluded that CYP1A1 Ile462Val polymorphism may contribute to colorectal cancer risk in Asians as well as Europeans [25].

Zheng et al. investigated fourteen studies including 6,654 CRC cases and 7,895 controls. They found a statistically significant relationship between CYP1A1 Ile462Val minor allele and increased risk of colorectal cancer (recessive model: OR 1.45, 95 % CI 1.16–1.81) [26].

Moreover, our analysis clearly showed that the design of the study, especially the size of the group (number of participants), plays a key role. The effect of the CYP1A1 Ile462Val polymorphism emerged only in the combined groups (it was not seen in the smaller ones), which leads us to propose that the power of the statistical test should be assessed before choosing the sample size; otherwise, the significance of rare minor alleles may be overlooked.

Hence, we suggest that the CYP1A1 Ile462Val polymorphism should be taken into consideration in future larger-scale studies to support or reject its role in CRC tumorigenesis in the Polish population.