Findings

Recent genome-wide association studies (GWAS) have identified a single nucleotide polymorphism (SNP) rs6983267 within the 8q24 region associated with increased susceptibility to colorectal and prostate cancer [16]. Follow-up association studies have suggested that the same variant may also increase the risk for cancers of the kidney, thyroid and larynx [7, 8]. The location of rs6983267 in the intergenic region 335 Kb upstream from the MYC gene, a well-known oncogene [9], generated a hypothesis that this SNP might be involved in a long-distance regulation of MYC expression. Located in a region with significant evolutionary conservation and enhancer potential [1012], the SNP was predicted to affect a binding site for TCF7L2 [10, 11], a key transcription factor in the WNT pathway. The risk allele G of rs6983267 was found to have a slightly stronger affinity to TCF7L2 in binding assays compared to the non-risk allele T, and stronger regulatory activity in luciferase reporter assays [10, 11]. An analysis of long-range interactions showed that the region containing rs6983267 might be in physical proximity with MYC region [10]. These findings suggested that rs6983267 might be located within an enhancer element that interacts with TCF7L2 and regulates MYC expression [10, 11]. MYC is a target gene of TCF7L2 [1315] and its expression is regulated through two TCF7L2 binding sites within the MYC promoter [13]. No association has been found between rs6983267 and the mRNA expression of MYC in lymphoblastoid cell lines [11, 16], normal and tumor colon samples [10, 11, 1720], or with MYC immunostaining in colon tumors [5].

Previously, we performed a detailed study of TCF7L2 expression in several types of human tissue, including colon where we measured the expression of multiple assays targeting the majority of known splicing forms of TCF7L2 [21, 22]. In the current study we sought to determine, whether the expression of TCF7L2 splicing forms we identified in non-cancer colon samples correlated with MYC expression and whether this expression was dependent on alleles of rs6983267 or interaction of rs6983267 with TCF7L2 expression.

We investigated non-cancer colon samples on the assumption that the effect of a germline genetic variation might be more easily detectable in conditions not affected by the effects of cancer or its treatment. The samples and the methods are described in Additional file 1. The mRNA expression of MYC was detected by sensitive quantitative reverse-transcriptase PCR (qRT-PCR) and 3 expression assays (Figure 1A.) The expression of MYC assays 1 and 2, corresponding to exons 2-3, and 1-2, respectively (RefSeq transcript NM_002467), was highly correlated (r = 0.95). MYC assay 3 targeted an alternative transcript initiated from a promoter P0 (GenBank accession number M13929) [23], however, expression of this transcript was very low (at a level of at least 100 times lower than of assays 1 and 2) and was not studied further. Expression of TCF7L2 was measured with 7 assays previously described (Figure 1B, Additional file 2) [21, 22].

Figure 1
figure 1

Location of MYC and TCF7L2 expression assays. A. MYC exons and 5' and 3'untranslated regions (UTRs) are marked by rectangles and two translation starts are marked by vertical lines and arrows. MYC assay1 is located over the junction of exons 2 and 3, and MYC assay 2 is located over the junction of exons 1 and 2. MYC assay 3 targets alternative transcript with an upstream exon. B. Constitutive exons of TCF7L2 are represented as black rectangles and alternative exons as white rectangles. Protein domains are indicated above corresponding exons: β-catenin binding domain is encoded by exons 1 and 2, high mobility group (HMG) DNA-binding domain is encoded by exons 9 and 10 and the CRARF DNA-binding domain is encoded by exons 13 and 14. Location of expression assays is indicated under corresponding exons and arrows show primer positions. The specificity of detection of particular splicing forms is achieved by probes located over exon junctions.

The strongest correlation between MYC and TCF7L2 expression was observed for assay "ex13-14" of TCF7L2 (r = 0.57- 0.60, p < 10-6), followed by assay "ex11-13" (r = 0.52-0.54, p < 10-6). The weakest correlation was detected for assay "ex11-13a" (r = 0.10 - 0.15, p = 0.12 - 0.28) (Table 1). These assays detect alternative splicing forms that include combinations of exons 11-13-14 and 11-13a-14 in the C-terminal end of the TCF7L2 transcripts (GenBank accession numbers FJ010174 and FJ010167). Both protein isoforms encoded by these splicing forms have long C-terminal reading frames (E-tails) with binding sites for the C-terminal binding protein (CtBP) involved in post-translational regulation of TCF7L2 expression [21, 22]. Protein fragments encoded by the alternative exons 13 and 13a share 68% identity (17 amino acids of 25, Figure 2). The form with exons 11-13-14 encodes a 30-amino-acid highly conserved motif with a CRAR F signature protein sequence, while in the form with exons 11-13a-14 this sequence is changed to CRAL F (Figure 2). The CRARF protein sequence is also found in another member of TCF/LEF family of transcription factors, TCF-7 (former TCF-1) and in the ancestral drosophila TCF/pangolin protein [24]. The CRARF sequence serves as an additional DNA-binding domain and a strong transactivator of the WNT pathway [24, 25]. The CRARF-form of TCF7L2 was shown to interact with two TCF7L2 binding sites within MYC promoter, TBE1 at -1156 bp and TBE2 at -589 bp upstream the first translation start site [13, 24]. Therefore, we suggest that while both the CRARF and CRALF forms of TCF7L2 contain E-tails, only the CRARF form regulates MYC expression in the colon. A splicing form detected by assay "ex11-14" is the most common splicing form of TCF7L2 in all human tissues [21, 22]. This splicing combination utilizes an alternative stop codon in the beginning of exon 14 resulting in a protein without the CtBP-binding domain. A somatic frameshift mutation in a polyA stretch within exon 14 found in colorectal cancer cell lines results in similar outcome - termination of protein by an alternative stop codon in the beginning of exon 14 [26, 27]. We found a moderate correlation between assay "ex11-14" of TCF7L2 and MYC expression (Table 1). Next, we evaluated the levels of mRNA MYC expression in colon tissue in relation to genotypes of rs6983267. We observed significant effect of age, of several TCF7L2 splicing forms but no effect of rs6983267 alone or in interaction with TCF7L2 (Table 2).

Figure 2
figure 2

Detailed structure of C-terminal part of TCF7L2 gene. A. Combinations of exons 11, 13, 13a and 14 of TCF7L2 encoding proteins with long reading frames (E-tail). B. A form with alternative exon 13 encodes a protein sequence with CRARF motif; a form with alternative exons 13a encodes a protein sequence with CRALF motif, differences in amino acids in proteins encoded by exons 13 and 13a are underlined, CRARF and CRALF motifs are boxed. C. Expression of a splicing form with exon13 correlates with MYC expression (r = 0.60, p < 10-6), while expression of a splicing form with exon 13a does not correlate with MYC expression (r = 0.15, p = 0.12).

Table 1 Correlation between expression of MYC and TCF7L2 in colon samples
Table 2 Association of rs6983267 with MYC expression in colon tissue

Our results show a strong role of TCF7L2 in regulation of MYC expression in colon, but not through rs6983267. Both TCF7L2 and MYC genes are expressed in colon and are important for maintaining proliferation of intestinal epithelium [28, 29] (Additional file 3). Inactivation of the adenomatous polyposis coli (APC) tumor suppressor gene leads to formation of β-catenin/TCF7L2 complexes, constitutive activation of the WNT pathway and eventually colorectal cancer. Rare point mutations within TCF7L2 are also found in colorectal cancers [30]. The proliferative effect of TCF7L2 is achieved through its transcriptional regulation of several target genes such as MYC and CCND1 (Cyclin D1)[31]. Our results suggest that expression of MYC in colon tissue is most likely regulated by a splicing form of TCF7L2 encoding a protein with a potent transactivation CRARF-domain. However, we did not find any evidence for an effect of rs6983267 on TCF7L2 regulation of MYC expression. Of the family of TCF/LEF transcription factors, TCF7L2 has the highest expression in colon, but other members of this family may also be involved. Inactivation of TCF-7 (former TCF-1) leads to development of intestinal polyps [32]. Expression of LEF1 is found in tumors but not in normal colon tissue [33]. Each of these proteins can recognize the same TCF/LEF consensus binding site and, therefore, might bind alleles of rs6983267. High degree of similarity between TCF/LEF proteins may lead to cross-reactivity in chromatin immunoprecipitation (ChIP) assays. Thus, other TCF/LEF factors should also be examined for their effect on regulation of MYC expression.

In conclusion, SNP rs6983267 within the 8q24 region has been established as one of the strongest genetic risk factors for development of at least two types of cancer. Identification of functional mechanisms of this association is the highest priority of cancer genetics and would mean a significant step forward towards understanding of cancer pathogenesis and development of better diagnostic and therapeutic approaches. Our results provide new insights into the regulation of MYC expression by TCF7L2. However, further studies are needed to investigate alternative molecular mechanisms that can explain the association between rs6983267 and cancer risk.