Background

Colorectal cancer (CRC) is one of the most frequent malignant tumours globally [1]. This heterogeneous disease can develop through at least three distinct molecular pathways by which genetic and/or epigenetic dysregulation influences gene expression and protein levels, finally leading to colorectal adenoma and carcinoma formation [2, 3]. One of the epigenetic alterations that can contribute to CRC formation is the abnormal DNA hypermethylation of promoters, resulting in reduced or absent gene expression [4]. DNA hypermethylation occurs at regulatory sites, e.g. promoters, in a tissue- and cancer type-specific manner [5]. Besides genetic alterations, DNA hypermethylation of tumour suppressor genes is a frequently detected mechanism behind the inactivation of these genes, leading to tumour initiation [6]. Although more and more genes are associated with various types of cancers, our knowledge of DNA methylation markers in CRC development remains incomplete.

Another key posttranscriptional epigenetic regulator of gene expression, miRNA, regulates the stability and translation of mRNAs. The expression of miRNAs has been shown to differ in colorectal tumours compared to healthy colon tissue specimens, and several experimental results indicate that they play a role in colorectal cancer formation. Up- and downregulation of certain miRNAs has been identified along the adenoma-carcinoma sequence of CRC, and evidence supports the role of miRNAs in CRC development and progression, as these small non-coding RNAs affect proliferation and invasion [7].

The identification of genes affected by epigenetic changes can be achieved using whole genome gene expression analysis [8]. DNA methylation and miRNA expression alterations can both lead to a certain degree of downregulation of mRNA expression and consequently of protein levels, which can be confirmed by immunohistochemistry.

In the present study, our aims were (1) to identify DNA methylation markers in CRC samples on the basis of whole genome gene expression analysis and (2) to analyse the DNA methylation levels of these candidate markers along the colorectal adenoma-carcinoma sequence in colorectal adenoma and cancer samples. Furthermore, (3) we aimed to confirm the relationship between gene expression, DNA methylation status, miRNA expression and protein levels of the analysed candidate markers.

Methods

Selection of candidate genes regulated by DNA methylation

The selection of candidate genes was based on expression data generated from 147 colonic biopsy specimens (from 49 normal, 49 adenoma, and 49 CRC patients) and laser capture microdissected colonic epithelial cells (from 6 NAT, 6 adenomas, and 6 CRC), analysed in a previous study by whole genome HGU133 Plus 2.0 microarrays (Affymetrix) [8, 9]. These data files are available in the Gene Expression Omnibus database (http://www.ncbi.nlm.nih.gov/geo/) under GSE series accession numbers GSE4183 (8 normal, 15 adenoma and 15 CRC), GSE10714 (3 normal, 5 adenoma and 7 CRC), GSE37364 (38 normal, 29 adenoma and 27 CRC) and GSE15960 (laser microdissected colonic epithelial cells from 6 normal, 6 adenoma and 6 CRC). Clinical data of patients involved in the analysed gene expression studies can be found in Additional file 1: Table S1.

Although the bioinformatic analysis and the candidate selection were based on previously published raw gene expression data from HGU133 Plus 2.0 microarrays, the aim of the present study was substantially different from that of the previously published studies. We aimed to identify genes with gradually altering expression in adenoma and tumour samples that can be potentially regulated by DNA methylation. The data sets GSE4183 [10], GSE10714 [11], GSE37364 [9], and GSE15960 [8] were analysed for this purpose. Transcripts with gradually decreasing or increasing expression along the adenoma-carcinoma sequence were selected on the basis of Kendall (tau coefficient) rank correlation analysis (−0.5 ≤ tau coefficient ≤ 0.5). DNA methylation analysis was performed for genes with CpG island(s) on the basis of in silico prediction by the CpG Plot EMBOSS application (http://www.ebi.ac.uk/Tools/seqstats/emboss_cpgplot/) [12].
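
The rank correlation step can be illustrated with a minimal sketch. It assumes that sample groups are coded as ordinal stages (0 = normal, 1 = adenoma, 2 = CRC), that the tie-corrected tau-b variant is used, and that a "gradual" change is read as an absolute tau at or above the cutoff; none of these details is stated in the text, and the function names are hypothetical.

```python
from math import sqrt

def kendall_tau_b(x, y):
    """Kendall's tau-b (tie-corrected), simple O(n^2) pairwise version."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue          # tied in both variables: ignored
            elif dx == 0:
                ties_x += 1       # tied only in stage
            elif dy == 0:
                ties_y += 1       # tied only in expression
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + ties_x) * (untied + ties_y))
    return (concordant - discordant) / denom if denom else 0.0

def is_graded(stages, expression, cutoff=0.5):
    # Assumption: "gradual" change is read as |tau| >= cutoff.
    return abs(kendall_tau_b(stages, expression)) >= cutoff

# Illustrative values only: expression falling from normal to adenoma to CRC.
stages = [0, 0, 1, 1, 2, 2]
expression = [9.1, 8.8, 7.5, 7.9, 6.0, 6.2]
print(round(kendall_tau_b(stages, expression), 3), is_graded(stages, expression))
```

Because every stage contains several samples, the stage variable is heavily tied, which is why a tie-corrected coefficient is used in this sketch.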

Expression of the selected gene set was also analysed on gene expression data sets of human colorectal cell lines before and after DNA demethylation treatment with 5-Aza (GSE29060: 10 μM 5-Aza treatment for 72 h on HT-29 cell line; GSE14526: 3 μM 5-Aza treatment for 72 h on HCT116 and SW480 cell lines; GSE32323: 0.5 μM 5-Aza treatment for 72 h on Colo32, HCT116, HT-29, RKO and SW480 cell lines).

Student's t-test and the Benjamini-Hochberg method were applied to determine the significance of gene expression and DNA methylation level comparisons (p < 0.05). For logFc, a threshold of abs(difference of the average intensity values) > 1 was applied.
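
As an illustration of the multiple-testing step, the following minimal sketch applies the Benjamini-Hochberg adjustment to a list of p-values and then combines it with the logFc threshold. The p-values are assumed to come from a t-test computed elsewhere, the intensities are assumed to be on a log scale, and the function names are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def passes_filter(mean_a, mean_b, adjusted_p, alpha=0.05, min_abs_diff=1.0):
    """Significant after adjustment and |difference of group means| > 1."""
    return adjusted_p < alpha and abs(mean_a - mean_b) > min_abs_diff

# Illustrative p-values only.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
print([round(p, 3) for p in adj])
print(passes_filter(8.9, 7.2, adj[0]))
```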

Tissue sample collection

For DNA methylation analysis, tissue specimens were obtained from surgically removed colon tumours (moderately differentiated, Dukes B-C stages; MSS) (n = 15) and from histologically normal adjacent tissue (NAT) (n = 15) taken from the furthest available area away from the tumour. In addition, adenomas (n = 15) were analysed, comprising biopsy samples (n = 10) and fresh frozen tissue samples (n = 5). Fresh frozen samples were snap-frozen in liquid nitrogen directly after surgery and stored at −80 °C. Written informed consent was obtained from all patients, and the study was approved by the local ethics committee (approval Nr.: TUKEB 2005/037 and TUKEB Nr.: 2008/69, Semmelweis University Regional and Institutional Committee of Science and Research Ethics, Budapest, Hungary). The study was performed according to the ethical standards of the revised version of the Helsinki Declaration. Clinical data of patients involved in the study can be found in Additional file 2: Table S2.

Laser capture microdissection, macrodissection

Frozen tissue samples were embedded in OCT compound (Sakura Finetek, Japan). Then, 10 μm cryosections were cut at −20 °C in a cryostat instrument and mounted on PALM Membrane Slides 1.0 PEN (Carl Zeiss, Bernried, Germany). After fixation with 70 % ethanol for 5 min and absolute ethanol for 2 min, slides were stained with cresyl violet acetate (Sigma-Aldrich, St. Louis, USA). Colonic epithelial and stromal cells (approx. 10^3 cells) were collected using the PALM Microbeam laser capture microdissection system (PALM, Bernried, Germany). Macrodissected samples were collected from cryosections after toluidine blue staining. Selected areas containing both stromal and epithelial cells were harvested by scratching the tissue slide with a single-use needle.

DNA methylation analysis

Bisulfite conversion

Bisulfite conversion was performed using the EZ DNA Methylation Direct Kit (Zymo Research) without prior DNA isolation. Proteinase K digestion was performed in 20 μl (according to Section I Protocol A) followed by bisulfite conversion. The elution volume was 20 μl.

Bisulfite-specific PCR (BS-PCR)

In silico CpG island prediction was performed by CpG Plot EMBOSS Application (http://www.ebi.ac.uk/Tools/seqstats/emboss_cpgplot/). Bisulfite-specific PCR reactions were performed using primers designed with PyroMark Assay Design software (SW 2.0, Qiagen, Hilden, Germany) to be specific for non-CpG regions in order to amplify the bisulfite converted DNA samples without discriminating between methylated and non-methylated sequences (Table 1). PCR primers in the opposite direction of sequencing primers were biotin labelled. Primer specificities were tested in silico by BiSearch software (http://bisearch.enzim.hu) [13].

Table 1 Genes analysed in the study. Genes with gradually decreasing or increasing expression along the adenoma-carcinoma sequence with predictable CpG islands were selected on the basis of Kendall (tau coefficient) rank correlation analysis (−0.5 ≤ tau coefficient ≤ 0.5)

BS-PCR reactions were performed using AmpliTaq Gold 360 mastermix (2x) (Life Technologies, Carlsbad, USA), LightCycler 480 ResoLight Dye (40x) (Roche Applied Science), primer mix (200 nM final concentration), and bisulfite converted DNA samples (approx. 10 ng bcDNA/well) in a final volume of 15 μl. Real-time PCR amplification was carried out on the LightCycler 480 System with the following thermocycling conditions: 95 °C for 10 min, then 10 touchdown cycles of 95 °C for 30 s, 60 °C with a 0.4 °C decrease/cycle for 30 s, and 72 °C for 30 s, followed by 40 amplification cycles of 95 °C for 30 s, 56 °C for 30 s, and 72 °C for 30 s.
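
The touchdown phase can be written out as a short worked example. This is only a sketch, assuming that the first touchdown cycle anneals at 60 °C and that the 0.4 °C decrement applies from the second cycle onwards; the function name is hypothetical.

```python
def touchdown_annealing_temps(start_c=60.0, step_c=0.4, cycles=10):
    """Annealing temperature (in degrees C) for each touchdown cycle."""
    return [round(start_c - step_c * i, 1) for i in range(cycles)]

temps = touchdown_annealing_temps()
print(temps)  # the last touchdown cycle sits just above the fixed 56 °C step
```

Under these assumptions the tenth touchdown cycle anneals at 56.4 °C, so the subsequent 40 cycles at 56 °C continue the same 0.4 °C progression.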

By providing single-base resolution information about the methylation status of a CpG island, direct sequencing is one of the most robust methods to analyse BS-PCR products. After bisulfite treatment and BS-PCR, all cytosines are converted to thymines except for those originally methylated. Two different pyrosequencing technologies were applied to analyse DNA methylation of BS-PCR products, i.e. the Qiagen PyroMark System and the Roche GS Junior System utilising the 454 technology. The read lengths of the two technologies differ: with the PyroMark system, sequences of up to 60 bp can be analysed, while read lengths of up to 400 bp can be achieved with the 454 technology.
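
The principle of reading methylation from bisulfite-converted sequence can be sketched as follows. The example assumes forward-strand reads that are already aligned to the template and have the same length; the sequences and function names are purely illustrative and are not taken from the assays in Table 1.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Convert every C to T except cytosines at the given 0-based positions."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq.upper())
    )

def cpg_positions(seq):
    """0-based indices of the C in each CpG of the unconverted sequence."""
    s = seq.upper()
    return [i for i in range(len(s) - 1) if s[i] == "C" and s[i + 1] == "G"]

def methylation_percent(reads, position):
    """Percentage of C among reads showing C or T at an aligned position."""
    c = sum(1 for r in reads if r[position] == "C")
    t = sum(1 for r in reads if r[position] == "T")
    return 100.0 * c / (c + t) if (c + t) else float("nan")

template = "ACGTCCGA"                      # toy template with two CpG sites
sites = cpg_positions(template)            # positions of the CpG cytosines
reads = [
    bisulfite_convert(template, {1}),      # first CpG methylated
    bisulfite_convert(template),           # fully unmethylated
    bisulfite_convert(template, {1, 5}),   # both CpGs methylated
    bisulfite_convert(template, {1}),
]
for site in sites:
    print(site, methylation_percent(reads, site))
```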

PyroMark Q24 sequencing

Pyrosequencing was performed on a PyroMark Q24 instrument (Qiagen) using PyroMark Gold Q24 Reagents (Qiagen) according to the manufacturer’s recommendations. Purification and subsequent processing of the biotinylated single-stranded DNA were performed in two consecutive runs by applying two different sequencing primers in order to cover more CpG sites in the amplicons [14, 15]. Sequencing results were analysed using the PyroMark Q24 software v2.0.6 (Qiagen).

GS Junior sequencing

Library preparation with ligated adaptors and emulsion-PCR amplification were as described in “Guidelines for Amplicon Experimental Design”. The concentrations of BS-PCR amplicons were measured by Qubit fluorometer with High Sensitivity dsDNA reagent (Life Technologies). Amplicons belonging to the same sample were pooled at an equimolar ratio and PCR products were purified with AMPure beads (Agencourt, Beckman Coulter Genomics, Pasadena, USA) according to the manufacturer’s standard protocol. The Agilent Bioanalyzer was used with the High Sensitivity DNA Chip (Agilent, Santa Clara, USA) to assess sample quality. Fragment End Repair was performed using the GS FLX Titanium Rapid Library Preparation Kit (Rapid Library Preparation Method Manual 3.2). RL MID Adaptor Ligation was carried out using GS FLX Titanium Rapid Library Preparation Kit (Rapid Library Preparation Method Manual 3.4). After ligation, purification of amplicon libraries was performed with AMPure beads, and assessment of library quality was done using the Agilent Bioanalyzer with High Sensitivity DNA Chip. Library quantification was performed based on fluorometric measurements with Qubit High Sensitivity dsDNA reagent. Equimolar mixing of the libraries was performed by MIDs identifying different samples with different MID adaptors. Amplicon library pools were then amplified by emPCR at a 0.5 DNA molecule per bead ratio using the Lib-L emPCR Kit. Since amplicon lengths were short, the emPCR procedure was performed with reduced Amp Primer quantity (emPCR Amplification Method Manual – Lib-L, GS Junior Titanium Series, Live Amp Mix for paired end libraries). Bead enrichment and sequencing were performed using the GS Junior Titanium Sequencing Kit and the method described in the Sequencing Method Manual, GS FLX Titanium Series.
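
Equimolar pooling relies on converting a fluorometric mass concentration into a molar one. The sketch below uses the common approximation of 660 g/mol per base pair of double-stranded DNA; the amplicon lengths, concentrations, target amount, and function names are illustrative assumptions and do not come from the protocol manuals cited above.

```python
def dsdna_nanomolar(conc_ng_per_ul, length_bp, avg_bp_mass=660.0):
    """Convert ng/ul of dsDNA to nM, assuming ~660 g/mol per base pair."""
    return conc_ng_per_ul * 1e6 / (avg_bp_mass * length_bp)

def pooling_volumes(amplicons, target_fmol=100.0):
    """Volume (ul) of each amplicon that contains target_fmol of molecules.

    amplicons maps a name to (concentration in ng/ul, length in bp).
    1 nM equals 1 fmol/ul, so volume = target_fmol / nM.
    """
    return {
        name: target_fmol / dsdna_nanomolar(conc, length)
        for name, (conc, length) in amplicons.items()
    }

# Illustrative values only.
example = {"amplicon_A": (6.6, 200), "amplicon_B": (3.3, 250)}
for name, volume in pooling_volumes(example).items():
    print(name, round(volume, 2), "ul")
```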

The Smith-Waterman algorithm with Gotoh’s improvement, as implemented in the JAligner software package, was used for matching the reads to template sequences [16, 17]. As 454 technology can result in sequencing errors in homopolymer stretches, e.g. in bisulfite-sequencing templates [18], gaps or insertions were frequently observed in the sequenced reads. Reads reaching at least 80 % of the maximum alignment score were analysed further, after which the actual nucleotides at the potential methylation sites were summarised.
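
The alignment and filtering idea can be sketched with a compact local aligner using affine gap penalties (the Gotoh recurrences). This is not the JAligner implementation; the scoring values are illustrative, and the "maximum alignment score" is assumed here to be the score of a perfect full-length match to the template.

```python
NEG = float("-inf")

def smith_waterman_affine(a, b, match=2, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Best local alignment score of a versus b with affine gaps (Gotoh).

    A gap of length k costs gap_open + (k - 1) * gap_extend.
    """
    n, m = len(a), len(b)
    # H: best score of an alignment ending at (i, j).
    # E: best score ending with a gap that consumes b[j-1].
    # F: best score ending with a gap that consumes a[i-1].
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[NEG] * (m + 1) for _ in range(n + 1)]
    F = [[NEG] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(E[i][j - 1] + gap_extend, H[i][j - 1] + gap_open)
            F[i][j] = max(F[i - 1][j] + gap_extend, H[i - 1][j] + gap_open)
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best

def keep_read(read, template, min_fraction=0.8, match=2, **scoring):
    """Keep reads scoring at least min_fraction of a perfect match (assumed)."""
    max_score = match * len(template)
    score = smith_waterman_affine(read, template, match=match, **scoring)
    return score >= min_fraction * max_score

template = "ACGTTTGACCATGGATTCGA"
read_with_deletion = "ACGTTGACCATGGATTCGA"   # one T lost from the TTT stretch
print(smith_waterman_affine(read_with_deletion, template))
print(keep_read(read_with_deletion, template), keep_read("GGGGGGGGGG", template))
```

With affine penalties, a single homopolymer indel costs one gap opening, so such a read can still pass the 80 % filter while unrelated reads are discarded.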

miRNA analysis

miRNA analysis was performed on an independent formalin-fixed, paraffin-embedded (FFPE) sample set including CRC (n = 3), adenoma (n = 3) and NAT (n = 3) samples. miRNA isolation was performed with the High Pure miRNA kit (Roche), and the expression of approximately 800 miRNAs was assessed on Human Panel I + II (Exiqon) with the miRCURY™ Universal RT microRNA PCR protocol according to the manufacturer’s instructions. Raw Ct data were normalised with interplate calibrators and then with miR-423-5p, used as a housekeeping gene expressed at relatively constant levels in our analysed samples. In silico miRNA prediction was performed for all analysed genes using the miRWALK database prediction algorithm including validated mRNA targets [19] in order to select experimentally verified miRNA interaction information associated with genes, pathways, organs, diseases, cell lines, OMIM disorders, and literature on miRNAs. Subsequently, expression of selected miRNAs in normal, adenoma and cancer samples was compared.
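
One possible reading of the Ct normalisation is sketched below. The exact formula is not given in the text, so the additive interplate calibration and the subtraction of the reference value are assumptions; only the maximal cycle number of 45 (stated in the Fig. 4 caption) and the miR-423-5p reference come from the study. The function names are hypothetical.

```python
MAX_CYCLES = 45  # maximal qPCR cycle number, as stated in the Fig. 4 caption

def calibrate(ct, plate_ipc_mean, global_ipc_mean):
    """Remove a plate's interplate-calibrator offset (assumed to be additive)."""
    return ct - (plate_ipc_mean - global_ipc_mean)

def normalised_expression(target_ct, reference_ct):
    """(45 - Ct_target) - (45 - Ct_reference); higher means more abundant."""
    return (MAX_CYCLES - target_ct) - (MAX_CYCLES - reference_ct)

# Illustrative Ct values only: one target miRNA and the miR-423-5p reference.
target = calibrate(25.5, plate_ipc_mean=20.5, global_ipc_mean=20.0)
reference = calibrate(28.5, plate_ipc_mean=20.5, global_ipc_mean=20.0)
print(normalised_expression(target, reference))
```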

Immunohistochemistry

Among the 18 analysed genes, the SFRP1 protein level was analysed because of the special interest of our working group. Surgically removed colonic tissues from NAT (n = 10), AD (n = 10), and CRC specimens (n = 10) were fixed in formalin and embedded in paraffin, and tissue microarrays (TMA) were constructed. Sections of 4 μm were cut, deparaffinised, and rehydrated. For SFRP1 staining, antigen retrieval was performed in TRIS EDTA buffer (pH 9.0) using a microwave (900 W for 10 min, 340 W for 40 min). Samples were incubated with anti-SFRP1 rabbit polyclonal antibody (ab4193, Abcam, Cambridge, UK) diluted 1:800 for 60 min at 37 °C. EnVision + HRP system (Labeled Polymer Anti-Mouse, K4001, Dako) and diaminobenzidine-hydrogen peroxidase-chromogen substrate system (Cytomation Liquid DAB + Substrate Chromogen System, K3468, Dako) were used with hematoxylin counterstaining. Slides were digitalised using the Pannoramic Scanner p250 Flash instrument (software version 1.11.25.0, 3DHISTECH Ltd., Budapest, Hungary) and analysed with digital microscope software (Pannoramic Viewer, v. 1.11.43.0, 3DHISTECH Ltd., Budapest, Hungary). The semiquantitative Quick-score (Q) method was applied for SFRP1 protein level alteration analysis. Every TMA core was scored by multiplying the percentage of positive cells by the given intensity value (0 for no staining, +1 for weak, +2 for moderate, and +3 for strong diffuse immunostaining).
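
The Q-score calculation described above can be written as a short sketch. The per-core values are illustrative only, and the function name is hypothetical.

```python
from statistics import mean

def quick_score(percent_positive, intensity):
    """Q-score of one TMA core: percent positive cells x intensity (0 to 3)."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percent_positive must be between 0 and 100")
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2 or 3")
    return percent_positive * intensity

# Illustrative cores only: (percent positive cells, staining intensity).
cores = [(80, 2), (60, 1), (10, 0)]
scores = [quick_score(p, i) for p, i in cores]
print(scores, mean(scores))
```

With this definition, the score of a core ranges from 0 (no staining) to 300 (all cells with strong staining).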

Results

Gene expression analysis

Genes potentially regulated by DNA methylation were selected on the basis of whole genome gene expression data from previously performed microarray experiments of 49 normal, 49 adenoma, and 49 tumour biopsy samples [9]. Based on Kendall analysis, a set of 18 transcripts was selected showing continuously altering expression (p ≤ 0.01) in the biopsy samples along the adenoma-carcinoma sequence (Table 1). Along colorectal adenoma-carcinoma progression, the following genes showed downregulation: BCL2, CDX1, ENTPD5, MAL, PRIMA1, PTGDR, SFRP1, and SULT1A1, while the following genes showed upregulation: ALDH1A3, COL1A2, CYP27B1, FADS1, PTGS2, SFRP2, SOCS3, SULF1, THBS2, and TIMP1. Gene expression alterations of BCL2, CDX1, CYP27B1, ENTPD5, MAL, PRIMA1, PTGDR, PTGS2, SFRP1, SOCS3, SULT1A1, and TIMP1 were found to be significant (p < 0.05) in the adenoma versus healthy and also in the tumour versus healthy comparison. In addition, ALDH1A3, COL1A2, FADS1, SFRP2, SULF1, and THBS2 were found to be significantly (p < 0.01) differentially expressed in tumour samples but not in adenomas compared to healthy samples (Fig. 1, Table 2, Additional file 3: Figure S1, Additional file 4: Table S3).

Fig. 1

Summary of genes with altered expression levels in the analysed samples. Venn diagrams display genes that exhibit significantly altered gene expression patterns (p < 0.05) in (a) colon biopsy samples, (b) laser capture microdissected (LCM) epithelial cells, and (c) stromal cells in the normal versus adenoma, normal versus tumour comparisons and their intersections. The majority of gene expression changes could be detected in biopsy samples, while LCM epithelial and stromal cells show fewer altered transcript levels, primarily in normal vs. tumour comparison

Table 2 Gene expression data of biopsies and laser microdissected (LCM) colon epithelial cells

In order to investigate the cellular origin of altered gene expression of the analysed transcript set during colorectal cancer formation, laser capture microdissection was applied to separate epithelial and stromal cells from the colonic mucosa. Significantly altered expression (p < 0.05) of SOCS3 and PRIMA1 could be detected in epithelial cells from normal versus adenomatous samples. Gene expression changes of BCL2, CYP27B1, COL1A2, FADS1, and SULT1A1 were significant (p < 0.05) only in tumours compared to healthy samples, while CDX1, ENTPD5, PTGDR, and TIMP1 showed gene expression differences in both normal vs. adenoma and normal vs. tumour comparisons (Fig. 1, Table 2, Additional file 4: Table S3).

No significant gene expression alterations could be detected in the stromal cells isolated from adenomas compared to normal samples, but transcripts of the COL1A2, FADS1, MAL, PRIMA1, SULF1, THBS2, and TIMP1 genes showed significant differences (p < 0.05) in logFc values for the tumour versus normal comparison (Fig. 1; Additional file 4: Table S3). As stromal cells showed the fewest gene expression alterations, we focused further on biopsy and laser microdissected epithelial samples.

Demethylation treatment on colon adenocarcinoma cell lines

Gene expression of the selected marker set was analysed on data sets containing control and 5-Aza treated colon adenocarcinoma cell lines. According to GSE29060 data, in HT-29 adenocarcinoma cells 4 transcripts showed minimally decreased expression after demethylation treatment (TIMP1, FADS1, CYP27B1 and SULT1A1), while PTGS2 was found to be upregulated. HCT-116 cells showed higher re-expression of the selected genes, as PTGS2, THBS2 and TIMP1 showed upregulation (1 < logFc_control-treated), and TIMP1 was also upregulated in 5-Aza treated SW480 cells according to GSE14526. Among the 5 CRC cell lines of GSE32323, SULT1A1 in Colo32 cells, PTGS2 in HCT-116 cells, ALDH1A3 and SOCS3 in HT-29 cells, and ALDH1A3 and TIMP1 in SW480 cells showed remarkable upregulation after demethylation treatment (Fig. 2, Additional file 4: Table S3).

Fig. 2

Heat map of gene expression data of the selected marker set in 5-aza-2’-deoxycytidine-treated human colon adenocarcinoma cells (GSE29060; GSE14526; GSE32323). Intensity values on the colour scale were as follows: red – high intensity, black – intermediate intensity, green – low intensity. Demethylation treatment resulted in varying degrees of upregulation of certain transcripts

DNA methylation analysis

DNA methylation was assessed in human colonic samples using two different pyrosequencing systems. Firstly, routinely collected biopsy samples and macrodissected specimens, which naturally contain both epithelial and stromal cells, were analysed. Among the 18 analysed markers (Table 1), DNA methylation was significantly (p < 0.05) altered for six loci belonging to four genes, of which COL1A2, SFRP2, and SOCS3 showed hypermethylation and THBS2 showed hypomethylation in both AD and CRC samples compared to NAT. Three additional genes, BCL2, PRIMA1, and PTGDR, showed hypermethylation only in tumour samples (Table 3, Additional file 5: Figure S2).

Table 3 DNA methylation data of biopsies, macrodissected samples and laser microdissected (LCM) colon epithelial cells

Interestingly, two of the analysed regions in the THBS2 promoter showed hypomethylation during tumour formation, while the third locus examined showed significant hypermethylation in tumours compared to NAT.

Unsupervised clustering of genes with DNA hypermethylation

Unsupervised hierarchical clustering of DNA methylation data revealed three groups of markers in biopsy and macrodissected sample groups. The first group of genes (SFRP2, COL1A2, THBS2, SOCS3, CYP27B1, SULT1A1, PRIMA1 and MAL) showed a relatively high degree of DNA methylation already in AD and also in CRC samples. The second group included most markers and did not show remarkable differences among the sample groups, while the third, minor cluster included only two THBS2 loci with high methylation levels across all samples (Fig. 3a). Unsupervised hierarchical clustering of LCM epithelial cells revealed relationships similar to those in the biopsy and macrodissected samples above. Certain genes showed relatively high DNA methylation levels in both biopsies and epithelial cells in adenoma and cancer cases, namely PRIMA1, SFRP1, SFRP2, MAL, SOCS3, CYP27B1, COL1A2 and SULT1A1. THBS2 showed high methylation levels across all samples. The second major marker group included most genes and did not show remarkable differences between the sample groups (Fig. 3b).

Fig. 3

Heatmap representing the level of DNA methylation in (a) NAT, AD, and CRC biopsies and macrodissected samples and in (b) NAT, AD and CRC LCM epithelial cells. Intensity values on the colour scale were as follows: red - high intensity, black - intermediate intensity, green - low intensity. Samples are shown in columns, selected genes are in rows. A similar DNA methylation pattern was found in both sample types, as PRIMA1, SFRP1, SFRP2, MAL, SOCS3, CYP27B1, COL1A2 and SULT1A1 showed relatively high DNA methylation levels in colon biopsies and LCM epithelial cells

miRNA analysis

We used the miRWALK database to predict miRNAs that could target genes of our selected set. Multiple miRNAs could be predicted using the miRWALK ‘Validated Target’ in silico searching application. Certain miRNAs were predicted to target several of the genes analysed in our present study; the expression of miR-21 (predicted for BCL2, MAL, PTGS2, SFRP1, SOCS3) was found to be remarkably upregulated in CRC compared to NAT (Fig. 4). Furthermore, miR-21* (predicted for BCL2, MAL, SFRP1, SOCS3, PTGS2), miR-181c (predicted for ALDH1A3, BCL2, MAL), and let-7i* (predicted for BCL2, CYP27B1, and SOCS3) were also found to be upregulated in AD and CRC samples (Fig. 4).

Fig. 4

Normalised Ct values of selected miRNAs (hsa-miR-21, hsa-miR-21*, hsa-miR-181c, hsa-let-7i*) targeting the selected marker set. Raw Ct data were subtracted from the maximal qPCR cycle number (45) and data were normalised with interplate calibrators and also with miR-423-5p Ct values. Red dots represent individual miRNA normalised Ct values, box plots represent median and standard deviation of the data

Immunohistochemistry

Colonic FFPE tissue samples were immunostained for SFRP1. In NAT epithelium, moderate diffuse cytoplasmic staining (+2) could be detected (Fig. 5a, white arrows), in contrast to adjacent myofibroblasts (identified by their localisation and morphology), which showed strong diffuse immunostaining (+3) (Fig. 5a, red arrows). In tubular AD samples, weak diffuse cytoplasmic protein expression (+1) was accompanied by strong and spotted immunostaining (+2/+3) (Fig. 5b). The majority of CRC cases (9 out of 10 cases) showed weak (+1) or no (0) SFRP1 immunostaining (Fig. 5c). According to Q-score values used for semiquantitative immunohistochemistry analysis, the overall SFRP1 protein expression decreased along the colorectal adenoma-carcinoma sequence (Fig. 5).

Fig. 5

Continuously decreasing SFRP1 protein expression could be observed along colorectal adenoma-carcinoma development in epithelial/CRC compartment of NAT (a), AD (b), and CRC (c) samples. SFRP1 protein expression of healthy epithelial cells (a, white arrows) was compared to that in endogenous myofibroblasts (a, red arrows) with strong (+3) immunopositivity (digital microscopy images, 90x magnification, scale bar: 20 μm). Semiquantitative immunohistochemistry results (Q-score values) of NAT, AD, and CRC specimens are summarised as bar charts with whiskers representing standard deviation (d)

Discussion

The goal of this study was to identify DNA methylation and miRNA markers associated with the sequence of adenoma-carcinoma formation leading to CRC. The candidate markers were selected based on whole genome gene expression array data, DNA methylation analysis, and in silico prediction and validation of miRNA expression.

The study identified a set of 18 transcripts showing continuous gene expression alterations that correlated with CRC progression. Microarray experiments revealed 12 genes (BCL2, CDX1, CYP27B1, ENTPD5, MAL, PRIMA1, PTGDR, PTGS2, SFRP1, SOCS3, SULT1A1, and TIMP1) with significantly different transcriptional activities in AD compared to NAT controls, while 6 genes (ALDH1A3, COL1A2, FADS1, SFRP2, SULF1, and THBS2) showed gene expression alterations only in CRC samples. More specifically, looking at cellular components of the abovementioned stages of CRC formation, the results showed that epithelial cells in AD express decreased amounts of SOCS3 and PRIMA1, whereas those in CRC express less BCL2, CYP27B1, COL1A2, FADS1, and SULT1A1.

Demethylation treatment of colon adenocarcinoma cell lines led to varying degrees of upregulation of certain transcripts. In the HT-29 cell line, ALDH1A3 and SOCS3 were found to be upregulated by 0.5 μM 5-Aza. Interestingly, PTGS2 in HCT-116 cells and TIMP1 in the SW480 cell line also showed higher expression after both 0.5 and 3 μM 5-Aza treatments.

From the resulting marker set, COL1A2, SFRP2, and SOCS3 were hypermethylated and THBS2 was hypomethylated in both AD and CRC samples compared to NAT. Based on the literature, hypermethylation of COL1A2 was confirmed in head and neck cancer [20], melanoma [21], and bladder cancer [22]. This suggests that COL1A2 may contribute to the formation of various cancers by modulating cell proliferation and migration. In the gastrointestinal tract, expression of COL1A2 may be associated with endothelial-to-mesenchymal transition [23]. Collagen production of carcinoma cells decreases during oncogenic transformation [24], and hypermethylation of COL1A2 was confirmed in several CRC cell lines (HCT 116, SW480, and SW620) as well as in primary CRC tissues [25]. SFRP2 is a member of the well-known inhibitors of the Wnt pathway, abnormal activation of which (e.g. via APC mutation or beta-catenin translocation) is a frequent and early event in the genesis of CRC [26]. It has already been shown to be hypermethylated in colorectal cancer cell lines (e.g. HCT116) as well as in primary CRC [27, 28]. Furthermore, it has recently been recognised as a promising and sensitive marker for stool-based screening of CRC [26]. SOCS3 is a negative regulator of the JAK-STAT3 pathway; therefore, it may affect cell proliferation and the cell cycle [29]. Mutational analysis of the gene revealed no marked association between SOCS3 promoter region polymorphisms and the risk of developing metastatic colorectal cancer [30]. Epigenetic inactivation of SOCS3 was reported in human malignant melanomas and glioblastoma multiforme [31, 32]. Reduced gene expression of SOCS3 was found during the progression from ulcerative colitis (UC) through low-grade dysplasia to CRC. Related to this, DNA methylation of SOCS3 could also be detected in colonic biopsies of UC-CRC patients but not in those of healthy controls or inactive UC patients [33, 34].
THBS2 hypermethylation might be responsible for altered expression of thrombospondin-2 protein in ovarian cancer and endometrial adenocarcinomas [35]. Thrombospondin-2 is an antiangiogenic factor in CRC, and its expression was associated with inhibition of angiogenesis and metastasis formation in CRC [36].

The set of BCL2, PRIMA1, and PTGDR showed hypermethylation only in CRC. BCL2 (B-cell CLL/lymphoma 2) is an apoptotic inhibitor. Its hypermethylation was documented in breast cancer [37] and bladder cancer [38]. Bcl-2 protein plays a role in CRC formation [39] and has reduced expression in CRCs with microsatellite instability [40]. DNA hypermethylation of BCL2 was detected in CRC cases; however, there was no relationship between gene expression and methylation of specific CpG sites [41]. PRIMA1 encodes a membrane protein anchoring acetylcholinesterase to cell membranes [42]. Its promoter hypermethylation was detected in major depressive disorder with a concomitant decrease in gene expression [43]. It has not yet been associated with CRC development. Decreased mRNA expression levels of the PTGDR gene in colorectal AD and CRC caused by DNA methylation were previously described [8].

In summary, MAL, PRIMA1, PTGDR and SFRP1 showed downregulation of gene expression and, in parallel, an increasing DNA methylation level that correlated with CRC development. Meanwhile, downregulation of BCL2, CDX1, ENTPD5 and SULT1A1 was not accompanied by significant DNA methylation changes; thus, other regulatory processes should be further investigated to understand these changes in gene expression.

After DNA methylation analysis of candidate genes with altered gene expression, the potential influence of DNA methylation on the protein level was also investigated. Significantly decreasing protein levels of SFRP1 could be observed along the adenoma-carcinoma sequence. This result is in accordance with the literature, as epigenetic regulation of SFRP1 can lead to decreased protein levels [44, 45].

On a limited sample set, miRNAs with upregulation along the AD-CRC sequence were also identified. miR-21 was found to be remarkably upregulated in AD and CRC samples compared to NAT controls. On the basis of in silico prediction, miR-21 can target genes showing no remarkable alteration in their promoter methylation (e.g. BCL2, MAL, PTGS2) during CRC development, which might influence their gene expression levels. miR-21 is known to play a role in tumour formation and was also found to be upregulated in CRC tissues along tumour formation [46, 47]. The expression level of miR-21 is elevated in both colorectal adenomas and cancers, and the degree of upregulation correlates with more advanced stages of CRC [7]. This small non-coding RNA could have a fundamental role in the progression of CRC, as an elevated level of miR-21, which may increase proliferation, migration and invasion, was found to be predictive of poor survival [48]. In CRC cell lines with the EMT phenotype, the expression of the miR-21 oncomiR is regulated by AP-1 and ETS transcription factors and also by epigenetic factors. Activating histone modifications (H3K4me3, H3K9/14ac, H3K27ac), but no inactivating ones, were detected in the miR-21 promoter region [49]. These epigenetic mechanisms can affect the binding affinity of transcription factors to the miR-21 promoter, regulating its expression level. Upregulated miR-181 in CRC cases might also influence the gene expression level of Bcl-2 family members [50].

Conclusion

In summary, we identified 18 transcripts with changes in gene expression that correlate with CRC development. On the basis of genome-wide gene expression-based screening, we could identify genes potentially downregulated by promoter hypermethylation. Silencing of the markers identified in our study by hypermethylation or miRNA upregulation can result in reduced gene expression, leading to decreased protein levels that contribute to CRC formation.