Introduction

Hereditary factors are thought to contribute about 35% to causation of colorectal cancer (CRC)1. This view is supported by the fact that while rare genetic variants with high penetrance do confer a predisposition for inherited forms of CRC, such as the APC gene mutation in familial adenomatous polyposis (FAP) and the mismatch repair (MMR) gene mutation in Lynch syndrome, they account for only about 5% of CRC cases2. In order to explain the remaining genetic heritability, genome-wide association studies (GWAS) have identified approximately 40 common genetic loci for sporadic CRC3; susceptibility single-nucleotide polymorphisms (SNPs) are thought to confer weak but cumulative and increasing effects on CRC risk4.

Genetic variants in susceptibility SNPs for CRC are likely to influence age at onset4. It has been suggested that, compared with late-onset CRC, genetic contributions are enriched in early-onset CRC5, which tends to present with clinico-pathologically advanced disease and a poor prognosis6. Furthermore, the fact that age was differently distributed according to molecular features, such as CpG island methylator phenotype (CIMP)7, DNA microsatellite instability (MSI) status8, precursor adenomas9, and mutations in the BRAF or KRAS gene9, in sporadic CRC suggests that the genetic background of the disease differs between early- and late-onset CRC4. However, a considerable number of genetic variants remain unidentified, and replication studies of previously reported CRC susceptibility SNPs according to age at onset are needed.

We hypothesized that several common genetic variants of susceptibility SNPs could be related either to early or late age at onset of CRC. To test this hypothesis, allele frequencies of 33 susceptibility SNPs identified by previous GWAS were compared between early-onset CRC patients (aged <50 years) and later-onset CRC patients (aged ≥50 years) in a case-only analysis. We assessed the heterogeneity of associations between SNPs and CRC risk according to age groups and interactions between SNPs and age groups in case-control analyses.

Results

Table 1 shows the baseline characteristics of CRC patients in each study. A total of 1,962 sporadic CRC patients comprising 436 early-onset (aged <50 years, mean: 42.5 years) patients and 1,526 late-onset (aged ≥50 years, mean: 62.2 years) patients were included in this analysis. In both the NCC 2010–2013 and NCC 2000–2004 studies, late-onset CRC patients were more likely to have a higher body mass index (PNCC 2010–2013 = 0.03, PNCC 2000–2004 < 0.01, and Pcombined < 0.01) and a lower education level (PNCC 2010–2013 < 0.01, PNCC 2000–2004 < 0.01, and Pcombined < 0.01) than early-onset patients. Early-onset patients were more likely than late-onset patients to have reported ever consuming alcohol (PNCC 2000–2004 < 0.01 and Pcombined < 0.01). In the NCC 2010–2013 study, late-onset CRC patients more frequently reported a history of fecal occult blood testing (FOBT) than early-onset patients. There were no differences in sex, smoking, TNM stage, or CRC site between onset age groups.

Table 1 Characteristics of colorectal cancer patients by age of onset.

All SNPs were in HWE (P > 0.05) except for one SNP, rs10411210 on 19q13.11 (RHPN2, P = 0.01). When CRC patients aged ≥50 years were considered as the reference, the risk allele (G) of SNP rs704017 at 10q22.3 (ZMIZ1-AS1) was less frequent among patients aged <50 years under the additive model (ORNCC 2010–2013 = 0.72, 95% CI = 0.54–0.97, P = 0.03, ORNCC 2000–2004 = 0.80, 95% CI = 0.66–0.98, P = 0.03, and ORcombined = 0.78, 95% CI = 0.66–0.92, P = 2.7 × 10−3) (Table 2). When late-onset CRC patients were restricted to patients aged ≥65 years for the sensitivity analysis (Supplementary Table 2), the risk allele of rs704017 also tended to be less frequent among early-onset patients (ORNCC 2010–2013 = 0.63, 95% CI = 0.44–0.90, P = 0.01, ORNCC 2000–2004 = 0.80, 95% CI = 0.64–1.01, P = 0.06, and ORcombined = 0.63, 95% CI = 0.63–0.93, P = 6.6 × 10−3). No other SNPs showed differences in RAFs between onset age groups (P > 0.05).
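As a rough consistency check on the two study-specific case-only estimates for rs704017 (<50 vs. ≥50 years; ORs of 0.72 and 0.80 with their 95% CIs), the sketch below combines them by inverse-variance weighting and computes Cochran's Q. It is only an illustration on summary statistics: the standard errors are back-calculated from the reported CIs assuming a normal approximation (z = 1.96), a fixed-effect weighting is assumed, and the function names are hypothetical. It does not reproduce the pooled individual-level analysis described in the Methods.

```python
import math

def log_or_and_se(odds_ratio, ci_low, ci_high, z=1.96):
    """Return (log OR, SE) with the SE back-calculated from a 95% CI (assumed normal approximation)."""
    return math.log(odds_ratio), (math.log(ci_high) - math.log(ci_low)) / (2 * z)

def inverse_variance_pool(estimates):
    """Fixed-effect inverse-variance pooled log OR and Cochran's Q for a list of (log OR, SE)."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, (b, _) in zip(weights, estimates))
    return pooled, q

# Case-only ORs for rs704017 as reported in the text (reference: patients aged >=50 years).
studies = [
    log_or_and_se(0.72, 0.54, 0.97),  # NCC 2010-2013
    log_or_and_se(0.80, 0.66, 0.98),  # the earlier NCC study
]

pooled_log_or, q_stat = inverse_variance_pool(studies)
pooled_or = math.exp(pooled_log_or)

# With two studies Q has 1 degree of freedom; the chi-square(1) upper tail is erfc(sqrt(Q / 2)).
p_heterogeneity = math.erfc(math.sqrt(q_stat / 2))

print(f"pooled OR = {pooled_or:.2f}, Q = {q_stat:.2f}, P_heterogeneity = {p_heterogeneity:.2f}")
```

Under these assumptions the pooled OR comes out at about 0.77, close to the reported combined estimate of 0.78, and Q gives no indication of heterogeneity between the two studies.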

Table 2 Allelic frequency comparison of identified susceptibility single-nucleotide polymorphisms between age of onset groups (<50 vs. ≥50 years) in colorectal cancer patients.

In the NCC 2010–2013 study, rs704017 was significantly associated with increased risk of CRC among patients aged ≥50 years (ORadditive model = 1.24, 95% CI = 1.06–1.45, P = 6.5 × 10−3 and ORdominant model = 1.42, 95% CI = 1.14–1.77, P = 2.0 × 10−3) with significant heterogeneity for associations between age groups (Padditive model = 0.04 and Pdominant model = 0.02) and genotype × age group interaction (Padditive model = 0.04 and Pdominant model = 0.02) (Table 3). Although no significant association was observed in the NCC 2000–2004 study, rs704017 was associated with decreased risk of CRC among patients aged <50 years in the combined dataset (ORdominant model = 0.77, 95% CI = 0.60–0.98, P = 0.03), and the combined associations were in the opposite direction among patients aged ≥50 years. In addition, we found heterogeneity between age groups (Padditive model = 0.02 and Pdominant model = 7.5 × 10−3) and interaction between genotypes and age groups (Padditive model = 0.02 and Pdominant model = 7.8 × 10−3) in the combined dataset.

Table 3 Associations between rs704017 and risk of colorectal cancer stratified by age groups (<50 and ≥50 years).

Discussion

We found that the risk allele of SNP rs704017 at 10q22.3 (ZMIZ1-AS1) was less frequent among sporadic CRC patients with an early age at onset (<50 years) than among patients with a late age at onset (≥50 years) in our case-only analysis. Furthermore, both heterogeneity and interaction were observed in the association between genotypes of rs704017 and risk of CRC according to age groups (<50 and ≥50 years) in our case-control analysis.

Early-onset CRC includes approximately 30% of hereditary and 70% of sporadic CRC cases10. The molecular mechanisms driving hereditary early-onset CRC have been well defined as germline mutations such as the MLH1, MSH2, MSH6, PMS2, and EPCAM mutations in Lynch syndrome and APC and MUTYH mutations in FAP11, whereas sporadic early-onset CRC has not been fully clarified10. Although sporadic early-onset CRC is thought to be attributable to common genetic variants with low penetrance4, only a few SNPs, including rs10795668 at 10p14, rs3802842 at 11q23.1, and rs4779584 at 15q13.3, have been associated with an increased risk for early-onset CRC12.

We found that the risk allele (G) of rs704017 was less frequent among early-onset CRC patients and was associated with increased risk among late-onset CRC patients. Accordingly, it may be that this variant plays a role in genetic predisposition to late-onset CRC. To date, associations of this risk variant with CRC have been reported among East Asians (P = 2.07 × 10−8) and Europeans (P = 4.71 × 10−4)13. In those analyses, the mean age of CRC patients was 60.25 years in East Asians and 64.10 years in Europeans, and the analyses included all CRC patients regardless of onset age. Therefore, more association studies are needed to confirm the associations of rs704017 with CRC risk according to onset age.

Rs704017 is located in an intron of the zinc finger MIZ-type containing 1 antisense RNA 1 (ZMIZ1-AS1) gene in the 10q22.3 region. ZMIZ1-AS1 interferes with and inhibits translation of the ZMIZ1 gene. Reduced ZMIZ1 gene expression and greater frequencies of somatic mutations were observed in colon tumors based on data from The Cancer Genome Atlas (TCGA)14 and the Catalogue of Somatic Mutations in Cancer (COSMIC)15. The ZMIZ1 gene encodes a member of the protein inhibitor of activated signal transducer and activator of transcription (STAT) protein family (PIAS). Together with Janus kinase (JAK), the STAT protein belongs to the JAK-STAT signaling pathway, which can control survival, proliferation, and differentiation of various cells16. Oncogenic transformation can be promoted by persistently activated STAT proteins resulting from somatic mutations in the JAK-STAT pathway. Such mutations have been identified in patients with a variety of diseases, including myeloproliferative disease, polycythemia vera, megakaryoblastic myeloid leukemia, lymphoblastic leukemia, and uterine leiomyosarcomas16, and could also cause CRC17.

A large proportion of CRC patients have late-onset sporadic disease without an obvious hereditary syndrome18. Although the majority of late-onset CRC is located in the distal colon and microsatellite stable (MSS), some features more characteristic of late-onset CRC include occurrence in the proximal colon, as well as the presence of MSI via MLH1 gene promoter methylation, chromosomal instability, and a high CpG island methylator phenotype, especially when compared with sporadic early-onset CRC11. In addition to these characteristics, constitutively decreased PTEN expression in colon mucosa and p53 were experimentally observed to be associated with a late process of tumorigenesis in CRC19,20,21. Because the PIAS protein family has been known to regulate p5322 and PTEN23, tumor development of CRC may also occur late.

On the other hand, rs704017 (G) was less frequent and tended to be associated with decreased risk of early-onset CRC compared to late-onset CRC. This may be because rs704017 has only small effects on early-onset CRC, in line with both the common disease-common variant hypothesis24 and the polygenic inheritance model25. Moreover, several early-onset sporadic CRC cases without family history showed the possibility of hereditary CRC, suggesting a role for germline mutations in hMLH1 and hMSH2 in carcinogenesis, in contrast to general sporadic CRC, which is more related to epigenetic changes26. Thus, tumorigenesis of early-onset CRC could be more influenced by germline mutations than by somatic mutations.

An age of 50 years has been considered the cut-off for early- vs. late-onset CRC according to previous publications11,27. CRC screening is recommended for people starting at age 50 years in Korea28, as well as in many other national guidelines29,30,31, because screening colonoscopy studies have shown a significantly increased risk of advanced neoplasms among people older than 50 years32,33,34. Additionally, we considered CRC patients aged 65 years or more as late-onset for the sensitivity analysis. We also compared allelic frequencies between CRC patients aged under 30 or 40 years (early-onset) and patients aged 50 or 65 years or more (late-onset), but the results were attenuated, likely because of the small sample sizes.

One strength of our study is that we evaluated the association of risk variants according to onset age of CRC in both the case-only and the case-control analyses. Because case-only analysis is considered to produce more precise estimates than case-control analysis due to both small dispersion and homogeneity35, we conducted a case-only analysis before the case-control analysis. From those analyses, we were able to observe the relationship between rs704017 and onset age of CRC. A limitation of this study is that the association of rs704017 with CRC onset age was not statistically significant after adjustment for multiple testing with the Bonferroni and false-discovery rate (FDR) methods. The p-value was 2.7 × 10−3 in the combined dataset when comparing allele frequencies between onset age groups of CRC patients, whereas the Bonferroni-corrected threshold for 33 SNPs was 0.05 divided by 33 (=1.5 × 10−3), and the FDR-adjusted p-value of rs704017 was estimated to be 0.09. Accordingly, the results are presented without adjustment for multiple testing, and further analyses with larger sample sizes are needed to rule out false-positive results and confirm the possible association noted in this study.
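The multiple-testing arithmetic above can be sketched as follows. The Bonferroni threshold and the p-value for rs704017 are taken from the text; the Benjamini-Hochberg routine is a generic implementation, and the other 32 p-values are hypothetical placeholders (the text only states that they exceed 0.05), so the adjusted value is illustrative rather than a reproduction of the authors' calculation.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

N_SNPS = 33
P_RS704017 = 2.7e-3  # combined case-only p-value reported in the text

threshold = bonferroni_threshold(0.05, N_SNPS)

# Hypothetical placeholders for the other 32 SNPs: values from 0.06 to 0.99, all above 0.05.
placeholder_p = [0.03 * k for k in range(2, N_SNPS + 1)]
adjusted = benjamini_hochberg([P_RS704017] + placeholder_p)

print(f"Bonferroni threshold = {threshold:.4f}")
print(f"rs704017 passes Bonferroni: {P_RS704017 < threshold}")
print(f"BH-adjusted p for rs704017 = {adjusted[0]:.2f}")
```

Because rs704017 has the smallest p-value of the 33 tests, its Benjamini-Hochberg adjusted value in this sketch is 2.7 × 10−3 × 33 / 1 ≈ 0.09, which agrees with the FDR-adjusted value reported above, provided no larger-ranked p-value pulls it lower.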

In conclusion, we found that the risk variant of rs704017 at 10q22.3 (ZMIZ1-AS1) was significantly less frequent among early-onset sporadic CRC patients, although this did not surpass the threshold for multiple testing. Moreover, the association between rs704017 and risk of CRC tended to be in opposite directions according to the onset age, and heterogeneity and genotype-onset age interaction were observed. To ascertain the role of susceptibility SNPs on the onset age of CRC, further studies are needed.

Methods

Study population

This study used data from two independent, hospital-based case-control studies conducted by the National Cancer Center (NCC) in Korea, NCC 2010–2013 and NCC 2000–2004, the details of which have been reported previously13,36,37. NCC 2010–2013 recruited 1,070 newly diagnosed CRC patients, who had been surgically treated between 2010 and 2013. The controls were recruited from among people who visited a cancer-screening center at the NCC for a health check-up through a benefit program of the National Health Insurance Corporation between 2007 and 2014. After excluding individuals who did not complete a structured written questionnaire or whose blood sample was insufficient for genotyping, the remaining 703 cases were 1:2 matched with 1,406 controls by sex and age (5-year intervals). Of these, 49 cases and 67 controls who had a first- or second-degree family history of CRC were also excluded. Thus, a total of 654 cases and 1,339 controls from NCC 2010–2013 were included in the analysis.

In NCC 2000–2004, cases comprised CRC patients who had been histologically confirmed and received surgical treatment between 2000 and 2004 at the same hospital as NCC 2010–2013. After applying the same exclusion criteria used in NCC 2010–2013, a total of 1,308 sporadic CRC patients were eligible for the analysis. Among controls who were recruited from the same cancer-screening center as NCC 2010–2013 between 2002 and 2004, 1,329 individuals were frequency-matched with cases by age and sex for NCC 2000–2004. No participant was included in both studies. All study participants provided written informed consent to participate. Both studies were approved by the institutional review board (IRB) of the NCC (IRB No. NCCNCS-10-350 and NCCNCS-10-396).

Data collection

From CRC patients, general and lifestyle information on age, sex, body mass index, education level, alcohol consumption and smoking habits, and previous FOBT history was obtained by a face-to-face interview conducted by a trained interviewer using a structured, written questionnaire. Clinico-pathological information on tumor-node-metastasis (TNM) stage and CRC site was obtained from patients’ medical records from the Center for Colorectal Cancer at the NCC. The control participants conducted self-administered questionnaires on general and lifestyle information, after which an interviewer contacted them by phone and confirmed the participants’ responses.

Genotyping

For genotyping, we selected 36 susceptibility SNPs at 27 loci that had been associated with CRC risk by previous GWAS13,38,39,40,41,42,43,44,45,46,47. For participants in the NCC 2010–2013 study, genomic DNA from blood was extracted using the MagAttract DNA Blood M48 kit and BioRobot M48 automatic extraction equipment (Qiagen, Inc., Valencia, CA, US), according to the manufacturer’s instructions. Genotyping was performed using the Agenabio MassArray iPLEX® gold assay (Agena Bioscience, Inc., San Diego, CA, US), and 32 of the 36 selected SNPs (88.9%) were successfully genotyped. Genotyping for the NCC 2000–2004 study had been conducted using the iPLEX Sequenom MassARRAY platform (Sequenom, Inc., San Diego, CA, US) for 29 susceptibility SNPs as previously described13,37, and 28 SNPs overlapped with the 36 SNPs selected for this analysis. Accordingly, one SNP, rs719725, was excluded from the two studies. Additionally, two SNPs, rs6691170 and rs16892766, were monomorphic and therefore excluded. Thus, of the originally selected 36 SNPs, a total of 33 GWAS-identified SNPs at 25 loci were included in the analysis (Supplementary Table 1). All experimental methods were approved by the IRB of the NCC and performed in accordance with the manufacturer’s guidelines and regulations.

Statistical analysis

To compare the characteristics of sporadic CRC patients aged <50 years with those aged ≥50 years, we used Student’s t-test for continuous variables and the chi-square test for categorical variables. For all selected SNPs, Hardy-Weinberg Equilibrium (HWE) was tested among controls. The risk allele frequencies (RAFs) of the SNPs were calculated for early-onset (aged <50 years) CRC patients, late-onset (aged ≥50 years) CRC patients, and all controls. To compare the RAFs of SNPs between onset age groups (<50 and ≥50 years) of the CRC patients under an additive model, a logistic regression model adjusted for sex was used. For the sensitivity analysis, the RAFs of SNPs were also compared between CRC patients aged <50 years and those aged ≥65 years. To investigate associations of susceptibility SNPs with CRC risk according to age groups (<50 and ≥50 years) under additive, dominant, and recessive models, we used logistic regression models adjusted for age and sex and stratified by age groups. The heterogeneity of the associations between age groups was evaluated with Cochran’s Q test. Interactions were assessed with Wald statistics by adding a genotype × age group interaction term to the models. The heterogeneity of the association for SNP rs704017 between NCC 2010–2013 and NCC 2000–2004 was evaluated with Cochran’s Q test, and both a random-effects meta-analysis and a pooled analysis were performed. Since there was no statistically significant heterogeneity between the two study groups and the two approaches gave similar combined results, the pooled analysis was used for the combined results of the two study groups. To address multiple comparisons, Bonferroni and false-discovery rate (FDR) adjustments were conducted. Associations were evaluated by odds ratios (ORs) and 95% confidence intervals (CIs), and p-values less than 0.05 were considered to be statistically significant.
All statistical analyses were two-sided and performed separately in each of the datasets from the two studies and in the combined dataset using SAS version 9.3 software (SAS Institute, Inc., Cary, NC, US) and STATA version 13 software (STATA Corp., College Station, TX, US).
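The HWE check among controls mentioned above can be illustrated with a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts. This is a minimal sketch, not the authors' SAS or STATA code: the function name and the genotype counts are hypothetical, and the source does not say whether a chi-square or an exact test was used.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg equilibrium from genotype counts (assumes a polymorphic SNP)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Three genotype classes minus one estimated allele frequency minus one leaves 1 df;
    # the chi-square(1) upper tail is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical control genotype counts, for illustration only.
chi2_in_hwe, p_in_hwe = hwe_chi_square(250, 500, 250)      # matches HWE expectations exactly
chi2_off_hwe, p_off_hwe = hwe_chi_square(300, 400, 300)    # heterozygote deficit

print(f"in HWE:  chi2 = {chi2_in_hwe:.2f}, P = {p_in_hwe:.3f}")
print(f"off HWE: chi2 = {chi2_off_hwe:.2f}, P = {p_off_hwe:.2e}")
```

A SNP would be flagged as departing from HWE when the resulting p-value falls below the chosen cut-off (0.05 in this study).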

Additional Information

How to cite this article: Song, N. et al. Common risk variants for colorectal cancer: an evaluation of associations with age at cancer onset. Sci. Rep. 7, 40644; doi: 10.1038/srep40644 (2017).
