Background

Colorectal cancer (CRC) is a clinically important malignant disease due to its high incidence and mortality. According to the GLOBOCAN estimates with 1.4 million new cases and 694.000 deaths annually, CRC is the third most common cancer in the world, after lung and breast cancers [1].

The majority of sporadic CRCs develop according to the normal-adenoma-dysplasia-carcinoma sequence described by Fearon and Vogelstein [2]. The accumulation of genetic and epigenetic alterations in colonic epithelium leads to CRC through early and late precancerous adenoma stages in which promoter DNA methylation changes of certain tumor suppressor genes with consecutive mRNA expression changes are one of the earliest events, often prior to the appearance of mutations in well-known genes such as the adenomatosis polyposis coli gene (APC) [3].

Recently, comprehensive molecular characterization of several human cancers including CRC has been performed and the data integrated into The Cancer Genome Atlas (TCGA) database (https://cancergenome.nih.gov/). Integrative evaluation of genetic, epigenetic and gene expression data of hundreds of CRC and paired normal adjacent tissue (NAT) samples revealed that in addition to the known mutations, epigenetic changes (especially DNA methylation) also play a key role in establishing CRC subtypes with different prognostic and therapeutic phenotypes [4]. The majority (84%) of CRCs were found to be non-hypermutated. Non-hypermutated cancers with distinct colonic or rectal location could be distinguished according to copy-number alteration, DNA methylation or gene expression profiles [4].

DNA methylation changes both in promoter and gene body regions contribute to cancer phenotype as they can affect the gene transcription in several ways [3, 5,6,7]. In addition to the earlier methods focusing on gene promoter methylation analysis, new technologies, such as BeadChip methylation arrays [4, 8,9,10,11,12], reduced representation bisulfite sequencing (RRBS) [13], whole genomic bisulfite sequencing (WGBS) and methyl capture sequencing (MethylCap-Seq) [7, 14] were applied to study DNA methylation profiles in CRC. While the majority of investigations included CRC and NAT tissues [4, 9, 10, 12, 14], analysis of precancerous adenomas (AD) are represented in a small number of previous studies [8, 11] including a MethylCap-Seq study of WNT pathway genes we undertook [7]. We have also identified hypermethylated markers (mal, T-cell differentiation protein (MAL), proline rich membrane anchor 1 (PRIMA1), prostaglandin D2 receptor (PTGDR) and secreted frizzled related protein 1 (SFRP1)) in CRC and adenoma using bisulfite sequencing [15] and determined a common ten-gene methylation signature in colorectal adenomas and CRC based on methylation qPCR arrays [16].

BeadChip 27K and 450K arrays and RRBS offer opportunities for analysis of DNA methylation at single nucleotide resolution mainly within CpG islands, however recently developed Epic BeadChip arrays – besides examination of CpG island methylation - allow more extensive study of CpG sites outside of CpG islands, as well. WGBS provides the most widespread whole methylome analysis at single nucleotide resolution, but it is not commonly used due to its high cost. MethylCap-seq is an alternative genome-wide methylation analysis technique to identify novel differentially methylated regions (DMRs) [17, 18]. It gives extensive information about both promoter and gene body methylation, though at lower resolution [18]. Unlike BeadChip arrays, it is suitable for investigation of mutation hot spot regions within the gene body. It is known that mutations can cause altered DNA methylation and DNA methylation changes also can lead to development of mutations [19, 20]. The mutation rate is higher at methylated CpG sites than non-methylated ones [21, 22]. The change of 5-methylcytosine to thymine via spontaneous deamination [23, 24] ‘which is less effectively repaired by the DNA repair machinery than the cytosine to uracil deamination reaction’ [22, 23] can cause the increased mutability of cytosines within CpG sites.

The aim of this study was to analyze genome-wide tissue DNA methylation differences along the colorectal normal-adenoma-carcinoma sequence progression, including gene body methylation changes using MethylCap-seq. The second aim was to search for a potential relation between DNA methylation and mutation alterations for 12 CRC-associated genes. The possible effects of the genetic and epigenetic changes on neoplastic phenotype at transcriptome level were also examined.

Methods

Estimation of global methylation levels using long interspersed nuclear element-1 (LINE-1) bisulfite sequencing

After DNA isolation from 5 colorectal adenoma, 5 CRC and 10 normal (N) colonic biopsy samples, bisulfite conversion of DNA samples was performed using EZ DNA Methylation-Direct Kit (Zymo Research). For quantification of methylation levels of the LINE-1 retrotransposable element, bisulfite-specific PCR (BS-PCR) was done and 146 bp long LINE-1 PCR products were sequenced on Pyromark Q24 system (Qiagen) using the Qiagen Q24 CpG LINE-1 Kit (Qiagen) according to the manufacturers’ instruction.

MethylCap-seq

Global DNA methylation alterations were determined using MethylCap-seq data of 30 colonic tissue samples (15 AD, 9 CRC, 6 NAT) published previously by our research group [7]. In the previous study [7], only the DNA methylation changes of 160 WNT pathway genes and promoters were evaluated, while in this study whole methylome analysis was performed.

After informed consent of untreated patients, colonic biopsy samples were taken during routine endoscopic intervention. Using parallel formalin-fixed samples from the same site, histological diagnoses were established by experienced pathologists. Tissue samples from untreated CRC patients were also obtained from surgically removed colon or rectal tumors and from NAT that originated from the area farthest available from the tumor. The detailed patient specification has been described earlier [7]. The study was conducted according to the Helsinki declaration and approved by the local ethics committee and government authorities (Regional and Institutional Committee of Science and Research Ethics (TUKEB) Nr.: 69/2008, 202/2009 and 23,970/2011 Semmelweis University, Budapest, Hungary).

Genomic DNA was isolated using High Pure PCR Template Preparation Kit (Roche Applied Science) according to the manufacturer’s instructions [16]. The capture of methylated DNA fragments and next generation sequencing were performed as previously described [7]. Briefly, after fragmentation of 3 μg genomic DNA samples, the DNA fragments with methylated CpGs were selected using the Auto MethylCap kit (Diagenode). Purification of the methylated DNA fraction was carried out on QIAquick PCR purification columns (Qiagen). Library preparation was performed using the TruSeq ChIP Sample Preparation kit (Illumina) and clusters were generated using TruSeq SR Cluster Kit v3-cBot-HS (Illumina). Next generation sequencing of the methylated DNA fragments was performed on the HiScanSQ instrument using TruSeq SBS v3-HS reagents (Illumina,) according to the manufacturer’s instructions. Bowtie2 software with default settings was used to map the 100 bp paired and 50 bp unpaired reads to the hg19 human genome reference assembly [25]. The aligned data were processed using the MEDIPS Bioconductor R package [26]. Methylation probabilities (β-values hereafter) were calculated for 100 bp long analysis windows (differentially methylated regions = DMRs), with respect to genome-wide CpG density dependent Poisson distributions.

Mutation analysis

Using normal, benign and malignant colorectal tissue samples, mutation hot-spot regions of 12 CRC-associated genes (APC, B-Raf proto-oncogene, serine/threonine kinase (BRAF), catenin beta 1 (CTNNB1), epidermal growth factor receptor (EGFR), F-box and WD repeat domain containing 7 (FBXW7), KRAS proto-oncogene, GTPase (KRAS), NRAS proto-oncogene, GTPase (NRAS), mutS homolog 6 (MSH6), phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha (PIK3CA), SMAD family member 2 and 4 (SMAD2 and SMAD4), tumor protein 53 (TP53)) were amplified using a custom-made multiplex PCR panel previously designed by our research group [27]. Amplicon sequencing was carried out on a GS Junior instrument (Roche) using ligated and barcoded adaptors as described earlier [27]. Bead enrichment and sequencing were performed using GS Junior Titanium Sequencing Kit (Roche) according to the Sequencing Method Manual, GS FLX Titanium Series. For variant identification, Amplicon Variant Analyzer software (Roche) was applied.

Promoter DNA methylation and mRNA expression analysis of TP53 signaling pathway genes

The list of the TP53 pathway genes (in total 67 gene symbols) was constructed according to the KEGG pathway database. Promoters were defined as described earlier using Encode ChromHMM results [7]. Promoter DNA methylation was determined using methyl capture results of 30 colonic biopsy samples in a 100 base pair analysis window resolution and DMRs were identified between the diagnostic groups. In silico mRNA expression analysis for TP53 signaling pathway genes was performed using microarray data from colonic tissue samples (Affymetrix HGU133Plus2.0; GEO accession numbers: GSE37364 [28], GSE18105 [29], GSE4107 [30], GSE9348 [31], GSE22242 [32], GSE8671 [33]).

Statistical analysis

For MethylCap-seq DNA methylation data analysis, differences between diagnostic groups (9 CRC samples versus 6 NAT samples, 15 AD samples versus 6 NAT samples) were characterized by Δβ-values (the differences of the average β-values of each sample group). The top50 candidate DMRs were selected according to the highest absolute values of Δβ-values. For estimation of global methylation levels using LINE-1 bisulfite sequencing, average methylation percentages of 3 analyzed CpG sites were calculated. For gene expression logFC calculations, the differences between the averages of samples groups were compared. During statistical evaluation of DNA methylation and gene expression data, for paired comparisons of diagnostic groups, Student’s t-test and False Discovery Rate (FDR) were applied as the Kolmogorov-Smirnov test resulted in normal distribution and the standard deviation of data were similar. Variance analysis was performed using the non-parametric Kruskal-Wallis test. A p-value of < 0.05 was considered as significant.

Results

Global DNA methylation alterations of the colorectal normal-adenoma-carcinoma sequence

Genome-wide decreases in DNA methylation were observed for samples from the adenoma stage of colorectal carcinogenesis. Based on the LINE-1 bisulfite sequencing results, significant global DNA hypomethylation was detected both in CRC (63 ± 6.7%; p = 0.0302) and adenoma samples (65 ± 3.8%; p = 0.0093) compared to normal tissue (73 ± 1.4%). Variance analysis also revealed significantly lower DNA methylation level both in CRC and adenoma than in normal samples (Kruskal-Wallis test: p < 0.00104) (Fig. 1a). MethylCap-seq results showed that decreased DNA methylation appeared principally in 40–60% and 80–100% methylation percentage categories in adenoma and CRC samples compared to NAT controls (Fig. 1b).

Fig. 1
figure 1

Global DNA methylation alterations of the normal-adenoma-colorectal cancer sequence. a DNA methylation of LINE-1 (long interspersed nuclear element-1) in CRC, adenoma and normal tissue samples. N = normal, Ad = adenoma, CRC = colorectal cancer; b Category distribution of global DNA methylation in CRC, adenoma and NAT samples analyzed by methyl capture sequencing. DNA methylation percentage categories are shown on the X axis, while the numbers of 100 base pair analysis windows are represented on the Y axis. NAT = normal adjacent tissue, AD = adenoma, CRC = colorectal cancer.

Top DMRs in CRC and adenoma samples identified by MethylCap-seq

In CRC samples known CRC-associated genes including heparan sulfate-glucosamine 3-sulfotransferase 2 (HS3ST2), dickkopf WNT signaling pathway inhibitor 2 (DKK2), tissue factor pathway inhibitor 2 (TFPI2) and syndecan 2 (SDC2) occurred in the top50 significantly hypermethylated 100 base paired regions (p < 0.001), showing elevated promoter DNA methylation levels located within CpG islands. Δβ-values representing methylation differences between CRC and NAT samples were in a range from 0.68 to 0.81. More than one third of the top50 hypermethylated DMRs align with weak (9%) or active (9%) promoters according to the Encode ChromHMM data. The majority of the top50 DMRs that were significantly hypomethylated in CRC compared to NAT samples (p < 0.001) could not be assigned to genes, gene promoters, and were located in intergenic regions. Similar to the hypermethylated DMRs, large differences were found for hypomethylated DMRs with Δβ-values between − 0.74 and − 0.65 (Additional file 1: Table S1A, B).

In the AD versus NAT comparison, 94% of the top50 highly methylated DMRs were found in CpG islands including generally known CRC-associated DNA methylation markers like Fli-1 proto-oncogene, ETS transcription factor (FLI1), GATA binding protein 4 (GATA4) and nerve growth factor receptor (NGFR). The top50 significant (p < 0.0001) methylation alterations appeared to be more intensive in adenomas compared to NAT samples (Δβ-values were between 0.86 and 0.79). Considering Encode ChromHMM data, 38% of top50 hypermethylated DMRs were found to be located in promoter regions, and 26% can function as active promoters. Similar to the results in CRC versus NAT comparison, almost all of the top50 DMRs showing significantly decreased DNA methylation in AD could not be annotated (p < 0.0001) with stronger methylation differences than found in CRC versus NAT (Δβ-values between − 0.90 and − 0.74) (Additional file 1: Table S1C, D).

DNA methylation alterations and expression of CRC-associated, frequently mutated genes

The mutation frequencies of a panel consisting 12 CRC-associated genes in CRC and AD samples were measured in our previous multiplex PCR-based CRC mutation hot-spot sequencing study [27]. DNA methylation alterations were also detected in the mutation hot-spot regions of 12 analyzed CRC-associated genes that are frequently mutated, including TP53, APC, KRAS, BRAF and FBXW7. DNA methylation changes on 100 base pair long analysis windows located on mutation hot-spot regions of TP53, APC, KRAS, BRAF and FBXW7 can be seen in Fig. 2. Evaluation of promoter methylation patterns of the 12 frequently mutated genes revealed several significant alterations including hypermethylation of the APC promoter in CRC and AD tissue specimens (p < 0.05; Δβ = 0.27–0.39) (Fig. 3a), hypermethylation of the TP53 promoter in AD (p < 0.001; Δβ = 0.40) and hypomethylation of CTNNB1 (p < 0.05; Δβ between − 0.30 and − 0.45) (Fig. 3a) and SMAD2 (p = 0.024; Δβ = − 0.28) in CRC compared to NAT samples. SMAD4 promoter region was found to be hypomethylated both in AD and CRC biopsy samples (p < 0.05; Δβ between − 0.25 and − 0.32). mRNA expression profiles of the 12 analyzed CRC-associated genes revealed that APC and CTNNB1 could be regulated by DNA methylation during the colorectal carcinogenesis as showing inverse relation between promoter DNA methylation and mRNA expression (Fig. 3b) .

Fig. 2
figure 2

DNA methylation alterations in mutation hot-spot regions of genes frequently mutated in CRC and adenoma. Methylation percentage values are shown in 100 base pair analysis regions located in mutation hot-spot areas of genes (TP53, APC, KRAS, BRAF and FBXW7) frequently mutated in CRC and adenoma tissue. The frequencies of mutations in CRC and adenoma samples detected in our previous multiplex PCR-based CRC mutation hot-spot sequencing study [27] are also represented. *p < 0.05, CRC = colorectal cancer, Ad = adenoma, N = normal adjacent tissue

Fig. 3
figure 3

Inverse promoter DNA methylation and mRNA expression alterations of APC and CTNNB1 genes in CRC and adenoma samples. a Significant DMRs in promoter regions of APC and CTNNB1 genes in CRC and adenoma samples (p < 0.05). Hypermethylation is marked with red, while hypomethylated DMRs are green. The names of the DMRs indicate the official gene symbol_number of the chromosome_start position of the DMR. CRC = colorectal cancer, NAT = normal adjacent tissue. b mRNA expression pattern of APC and CTNNB1 genes in CRC and adenoma (GSE37364 [28]). Overexpression is marked with red, while downregulated genes are green. CRC = colorectal cancer

DNA methylation on TP53 signaling pathway gene promoters – Relation with gene expression results

The TP53 pathway genes selected according to the KEGG pathway database were represented with 67 gene symbols. Promoters were defined as described earlier using Encode ChromHMM results [7]. In the CRC versus NAT comparison, 26.9% of TP53 pathway genes (18 from 67 genes) showed significant DNA methylation alterations in their promoter regions with at least a 10% methylation difference (p < 0.05, Δβ ≥ 0.1) (Table 1). In CRC samples hypermethylated DMRs were found in the promoter regions of 11 genes such as caspase 8 (CASP8), cyclin dependent kinase inhibitor 1A and 2A (CDKN1A and CDKN2A), insulin-like growth factor binding protein 3 (IGFBP3), sestrin 2 (SESN2) and tumor protein p73 (TP73), while seven TP53 pathway genes including G2 and S-phase expressed 1 (GTSE1) showed hypomethylation in their promoters. The box plots of the significant hyper-, and hypomethylated DMRs in TP53 pathway gene promoters showing the highest DNA methylation differences between CRC and NAT samples are represented on Fig. 4 and the box plots of all DMRs fulfilling the criteria can be seen in Additional file 2: Figure S1.

Table 1 DNA methylation alterations in promoter regions of TP53 signaling pathway genes in CRC and AD tissues compared to NAT samples
Fig. 4
figure 4

Box plots of DMRs in TP53 pathway gene promoters with the highest DNA methylation differences between CRC and NAT samples. Box plots represent the DNA methylation levels (β-values) of differentially methylated regions (DMRs) in TP53 pathway gene promoters showing the highest DNA methylation differences between CRC and NAT tissue samples. Individual DNA methylation level values are shown by red dots, and the median and standard deviation of the β-values are also demonstrated. The names of the DMRs indicate the official gene symbol_number of the chromosome_start position of the DMR. NAT = normal adjacent tissue, CRC = colorectal cancer, CASP8 = caspase 8, CCNE1 = cyclin E1, EI24 = EI24 autophagy associated transmembrane protein, FAS = Fas cell surface death receptor, IGFBP3 = insulin-like growth factor binding protein 3, TP73 = tumor protein p73

By applying the same criteria, significant promoter DNA methylation changes were observed in 37.3% of TP53 pathway genes (25/67) in AD compared to NAT samples (p < 0.05, Δβ ≥ 0.1) (Table 1). Fifteen TP53 pathway genes showed elevated promoter methylation in AD samples including CDKN2A, IGFBP3 and TP73, while hypomethylation was detected in the promoter regions of 10 genes such as GTSE1, damage specific DNA binding protein 2 (DDB2) and cyclin dependent kinase 1 (CDK1).

Using in silico expression analysis of microarray data from colonic biopsy samples (Affymetrix HGU133Plus2.0 GEO accession numbers: GSE37364 [28], GSE18105 [29], GSE4107 [30], GSE9348 [31], GSE22242 [32], GSE8671 [33]), an inverse relation between promoter DNA methylation alteration and mRNA expression (Table 2) was shown for a number of differentially methylated TP53 pathway genes including CDKN1A, CDKN2A, GTSE1, IGFBP3, SESN2 and SESN3 (Table 2).

Table 2 TP53 signaling pathway genes showing inverse relation between promoter DNA methylation and mRNA expression

Discussion

The accumulation of DNA methylation alterations accompanied by genetic changes such as mutations and deletions is known to contribute to the pathogenesis of various cancer types including CRC [3, 4]. Comprehensive DNA methylation changes found in precancerous adenoma stages can serve as early detection markers [7, 8, 11, 34]. In this study, global DNA methylation alterations were analyzed along the colorectal normal-adenoma-carcinoma sequence, and top differentially methylated genes/regions were identified using genome-wide MethylCap-seq analysis. The second aim of the study was to find out if there is a potential correlation between DNA methylational and mutational alterations for 12 CRC-associated genes. Furthermore, the possible effects of the genetic and epigenetic changes on TP53 signaling pathway genes at the transcriptome level were also examined.

Global hypomethylation was detected by LINE-1 bisulfite sequencing in CRC samples compared to normal tissue in line with previous data [35,36,37]. Although to a lower extent, global DNA hypomethylation could be detected as early as the AD stage. LINE-1 bisulfite sequencing was used for overall hypomethylation analysis due to its superior advance over MethylCap-seq, which predominantly targets genomic regions with high methylated CpGs density [14].

In this study, we identified 22 novel AD- and/or CRC-associated hypermethylated DMRs (approximately one fourth of top50 hypermethylated DMRs) which could be assigned to genes with previously undescribed methylation changes in cancers including CRC. These markers are principally involved in transcription regulation (e.g. BHLHE23, CUX2, HLX, MAFB, MKX, NKX1–1, and GSC2), transport processes (e.g. SLC24A2, GLRA3, LRRC38, SNAP91), and intracellular signaling (e.g. RGS20, GNAL, NRG3). Among the hypermethylated transcription factors, the expression of H2.0 like homeobox (HLX) was found to be reduced in moderately differentiated CRCs [38]. Moreover, HLX is also considered as a tumor suppressor in hepatocellular carcinoma [39]. The platelet derived growth factor receptor alpha (PDGFRA) was observed to be hypermethylated in AD compared to normal controls in our study. It was found to be overexpressed in CRC, but - in accordance with promoter hypermethylation detected in our MethylCap-seq study – it was down-regulated in adenomatous polyps [40]. Nevertheless, one fifth of hypermethylated and the majority of hypomethylated DMRs could not be associated with known genes, both in CRC versus NAT and AD versus NAT comparisons. The identified significant top50 methylation changes could be observed in a high proportion (> 80%) of the specimens within a sample group compared to the mutational alterations analyzed in this study.

On the basis of the methylation levels of the top50 hypomethylated and hypermethylated markers determined in this study, including the newly identified DMRs, the clear separation of CRC and NAT samples was also apparent for an independent sample set (Additional file 2: Figure S2). Furthermore, a partially overlapping set of samples also showed consistent DNA methylation profiles analyzed by MethylCap-seq and EpiTect Methyl qPCR methods (Additional file 2: Figure S3).

Approximately half of the top50 identified hypermethylated DMRs in CRC represent genes found to demonstrate elevated DNA methylation levels in different types of cancers [14, 16, 34, 41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60]. Seven of the top50 markers (BNC1 [16], DKK2 [48,49,50], HS3ST2 [51], MIR124–3 [52], SDC2 [14, 34, 53, 54], TFPI2 [55, 56] and ZIC1 [57]) were previously described as methylated genes in CRC. Hypermethylation of basonuclin (BNC1) zinc finger protein, SDC2 transmembrane heparin sulfate proteoglycan, and DKK2 dickkopf WNT signaling pathway inhibitor 2 genes were also reported in previous studies by our research group [16, 34]. Among the annotated AD versus NAT top50 hypermethylated DMRs, several markers were found to be hypermethylated in various cancers including FLI1 [58, 59], GATA4 [51] and NGFR [60]. These showed elevated methylation levels in CRC samples in other studies.

In this study, we present a comparative analysis between the promoter methylation and mRNA expression data of 12 genes frequently mutated during colorectal carcinogenesis and progression. The results revealed that DNA methylation can play a role in the regulation of APC and CTNNB1 expression in addition to and in parallel with the mutational changes. The above genes are members of the WNT signaling pathway investigated in details in our previous analysis [7]. Hypermethylation of the APC promoter [7, 50, 61, 62] and hypomethylation of the CTNNB1 promoter [7, 49] in AD and CRC samples have also been detected in other studies indicating that the DNA methylation alterations of frequently mutated canonical WNT pathway key genes can contribute to its constitutive activation in colorectal carcinogenesis from the premalignant adenoma stage.

Farkas et al. evaluated DNA methylation changes of genes frequently mutated in CRC using BeadChip450K technology, including 11 of the 12 genes analyzed in our study [49] and reported hypomethylation in CTNNB1 and SMAD2 promoters in CRC compared to NAT samples. Decreased promoter DNA methylation levels of these genes were also observed in our MethylCap-seq analysis together with methylation alterations of other genes such as SMAD4 and TP53 promoters during colorectal carcinogenesis.

In the current project, DNA methylation alterations were also detected in the mutation hot-spot regions of 12 analyzed CRC-associated frequently mutated genes including TP53, APC, KRAS, BRAF, and FBXW7. In accordance with the observation that C - T transitions at CpG sites are the most prevalent mutations in TP53 gene in colon tumors [63], the high mutation rate and methylation changes at mutation hot spot regions of this gene could be detected in our study. DNA methylation can cause mutations in tumor suppressor genes such as TP53, as mutations occur 10–40 times more frequently on the basis of methylated cytosine than of unmethylated cytosine [19, 20]. The conversion of 5-methylcytosine to thymine via spontaneous deamination [23, 24] or by the APOBEC/AID system [64] can lead to a high mutational burden of 5-methylcytosine. The 5-methylcytosine can be involved in increased mutability through other mechanisms. According to a recent report, elevated C to G transversion rate in cancer genomes can be associated with 5-hydroxymethylcytosines derived by the oxidation of 5-methylcytosine catalyzed by TET proteins [65].

Hypomethylation was also detected in addition to the elevated methylation levels on certain mutation hot-spots. This is only seemingly contradictory to previous data indicating that the mutation rate is higher on methylated CpG sites than on unmethylated ones [21], as the relative hypomethylation (from high level to intermediate level) and not the absolute loss of DNA methylation was observed on certain mutation hot-spots in our study. It is in conjunction with the results of a recent work describing that among the methylated CpG sites, the rate of mutations (or SNP density) was found to be increased on less methylated CpG sites (20–60%) as compared to high-intermediately and highly methylated CpGs (60–80%; > 80%) [21, 66]. Cancer-associated overall hypomethylation of the genome including heterochromatic DNA repeats, retrotransposons, and endogenous retroviral elements also contribute to genome instability [20].

In our analysis, DMRs could be identified on all chromosomes with the relatively largest number of aligned sequence reads on chromosome 17, similar to the MethylCap-seq study performed by Simmer et al. [14]. Next, DNA methylation alterations of TP53 (encoded on chr 17) signaling pathway genes were also investigated. TP53 pathway deregulation frequently occurs through the mutations or deletion of TP53 itself [67]. Outside the mutations of the TP53 gene, this pathway is rarely hit by any other mutations/polymorphisms [68,69,70]. Other mechanisms, such as epigenetic regulation including DNA methylation changes of TP53 pathway genes, also contribute to attenuating the pathway and participate in cancer development [67], and TP53 itself is also thought to regulate cancer-associated genes showing altered methylation patterns [71]. Accordingly, our MethylCap-Seq analysis revealed significant promoter DNA methylation changes in approximately one third of TP53 signaling pathway genes in CRC. Moreover, an even greater proportion of TP53 pathway gene promoters (around 40%) showed altered DNA methylation in AD samples compared to NAT controls. The alterations of the identified TP53 pathway genes with inverse promoter DNA methylation and mRNA expression differences (Table 2) were found to be associated with tumorigenesis in different cancer types including CRC [72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88]. Among these markers, in addition to the down-regulation of well known p16 (CDKN2A) [75, 76] and p21 (CDKN1A) [74] cyclin dependent kinase inhibitors, BCL2 associated X, apoptosis regulator (BAX) [72], SESN2 [85,86,87], IGFBP3 [84] and cytochrome c, somatic (CYCS) [77] are also thought to exert tumor suppressor functions. Diminished or loss of CYCS protein expression in AD and CRC tissue was found to be correlated with apoptosis resistance [77]. DDB2 damage specific DNA binding protein, which was described to suppress the tumorigenicity in case of ovarian cancer [78] and reduces CRC invasiveness [79], showed promoter hypomethylation and overexpression in AD samples in our study, suggesting its contribution to the inhibition of uncontrolled expansion in the adenoma stage.

Conclusions

Using genome-wide DNA methylation analysis, we identified novel aberrant methylation profiles of genes including HLX, CUX2, MKX, NRG3 and PDGFRA associated with the colorectal adenoma-carcinoma sequence progression. In addition to the genetic changes, DNA methylation alterations were also shown in the mutation hot-spot regions of 12 analyzed, CRC-associated, frequently mutated genes including, TP53, APC, KRAS, BRAF, and FBXW7. Global hypomethylation – which might be linked to genetic instability - could be detected as early as the adenoma stage.

Our study also revealed that promoter DNA methylation changes influence the mRNA expression level in the case of a significant part of the TP53 pathway genes. Thus epigenetic alterations can also contribute to a whole pathway-related effect on DNA repair and apoptosis in addition to single gene (e.g. TP53) mutations.

In summary, the methyl-capture sequencing technique yielded reproducible, clinically relevant results on the whole genome level which are related to cancer phenotype development through mRNA expression changes and to the cancer genotype through the link of mutation formation.