Background

Worldwide, colorectal cancer (CRC) is the third most common cancer in terms of incidence, accounting for nearly 1.8 million new cases annually (Bray et al. 2018). By 2030, the global burden of CRC is expected to increase by 60%, and new cases will reach 2.2 million (Arnold et al. 2017).

CRC arises from the interaction of internal and external factors. The association between immune evasion and CRC risk currently attracts great research interest. The dual host-protective and tumor-promoting function of the immune system is termed cancer immunoediting, a process consisting of three sequential phases: immunosurveillance, equilibrium, and escape (Pernot et al. 2014; Dunn et al. 2004). Studies have revealed that the development of CRC is related to the epigenetic silencing of immunosuppressive checkpoints (ICs), which may impact antigen processing and presentation by tumor cells and facilitate evasion of immunosurveillance. Epigenetic regulation is a major mechanism behind IC expression (Héninger et al. 2015). Aberrant expression of ICs in cancer creates an immunosuppressive microenvironment that supports the immune escape of tumor cells. Numerous negative regulatory mechanisms that can inhibit anti-tumor responses are implemented through the expression of various ICs on immune and tumor cells (Elashi 2019).

PDCD-1, one of the most widely studied and recognized ICs, is involved in almost every aspect of the immune response (Okazaki et al. 2007); it dampens the immune response of activated T cells and mediates the inhibition of T cells during long-term antigen exposure (Flemming 2012). Polymorphisms in PDCD-1 have been reported to be associated with the risk of CRC (Zhang 2016). LAG-3 is another IC that is expected to be targeted in the clinic and has consequently garnered considerable interest and scrutiny. LAG-3 is a surface protein normally expressed by immune cells (Andrews et al. 2017; Marin-Acevedo et al. 2018) and plays a vital role in evasion of the host immune response in various tumors. LAG-3 is required for optimal T cell regulation and homeostasis. Persistent antigen stimulation in tumors, including CRC (Camisaschi et al. 2010; Llosa et al. 2015), leads to anomalous expression of LAG-3, promoting T cell exhaustion (Ruffo 2019). Sasidharan Nair et al. discovered abnormal expression levels and anomalous methylation patterns of various ICs, including PDCD-1 and LAG-3, in colorectal tumor tissues compared with normal tissues (Sasidharan Nair et al. 2018). Furthermore, co-expression of PDCD-1 and LAG-3 in environments containing tumor, self-antigen, or chronic infection can result in T cells with poor effector function (Woo et al. 2012; Lucas et al. 2011; Matsuzaki et al. 2010; Grosso et al. 2009). Until now, research in this field has mostly been carried out at the tumor tissue level; whether the methylation statuses of PDCD-1 and LAG-3 in peripheral blood leukocytes (PBL) are involved in the anomalous expression of ICs and related to CRC risk remains unclear.

We therefore carried out this study to identify significantly differentially methylated CpG sites (DMCs) of PDCD-1 and LAG-3 using the GSE51032 dataset from the European Prospective Investigation into Cancer and Nutrition (EPIC) study. For each gene, the DMC that was located in a CpG island and had the smallest P value was selected as the candidate CpG site to further explore the relationship between methylation of the two genes and CRC risk in a case–control study. Moreover, we investigated the combination effect of the methylation of the two genes on CRC risk. DNA methylation alterations in leukocytes may reflect epigenetic modifications (gene expression), environmental exposures, or interactions between these factors that increase cancer susceptibility (Kitkumthorn et al. 2012; Huang et al. 2012; Walters et al. 2013; Nan 2013; Ally et al. 2009; Kaaks et al. 2009; Miroglio et al. 2010; Gao et al. 2012; Luo et al. 2016). We therefore further explored combination and interaction effects between PDCD-1 and LAG-3 methylation statuses and environmental factors on CRC risk in PBL.

Methods

Data sources

The workflow of the study is summarized in Fig. 1. First, the GSE51032 dataset, which comes from a nested case–control study within the EPIC study (designed to investigate the relationships between genetic and environmental factors and the incidence of different cancers), was used to screen the DMCs of PDCD-1 and LAG-3 as candidate CpG sites for further study. Second, a case–control study was carried out to explore the relationship between methylation of the two genes and CRC risk. Third, we explored combination and interaction effects between PDCD-1 and LAG-3 methylation statuses and environmental factors on CRC risk.

Fig. 1 Flow chart of the study

DNA methylation data of peripheral blood leukocytes for CRC were downloaded from the GEO database (https://www.ncbi.nlm.nih.gov/geo/) under accession number GSE51032 (166 CRC patients and 424 normal samples). The platform was the Illumina Infinium HumanMethylation450 BeadChip. The methylation level of each CpG site was represented by the beta-value, defined as the ratio of the methylated probe intensity to the total probe intensity.
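
As an illustration of the beta-value definition above, the following minimal Python sketch computes the ratio from a pair of probe intensities. The function name, the optional `offset` argument (a small stabilizing constant that some pipelines add to the denominator) and the example intensities are assumptions made for illustration; they are not taken from the dataset or from the authors' pipeline.

```python
def beta_value(methylated, unmethylated, offset=0.0):
    """Return methylated / (methylated + unmethylated + offset).

    `offset` is a hypothetical stabilizing constant; with the default of 0
    the result is exactly the ratio of methylated to total intensity.
    """
    total = methylated + unmethylated + offset
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return methylated / total


# Hypothetical probe intensities, for illustration only.
print(beta_value(3000.0, 1000.0))  # 0.75
```

A beta-value of 0 corresponds to a completely unmethylated site and a value of 1 to a completely methylated site.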

Data preprocessing and DMCs selection

First, CpG sites with missing values in more than 10% of the samples were removed. Second, probes were excluded if they met any of the following conditions: (i) probes with < 3 beads in at least 5% of the samples examined; (ii) probes with a detection P > 0.01; (iii) non-CpG probes; (iv) single-nucleotide polymorphism (SNP)-related probes; (v) multi-hit probes; (vi) probes located on the X and Y chromosomes. Third, all remaining missing data were imputed using the k-nearest neighbor imputation method. Beta-mixture quantile normalization (BMIQ) was applied for normalization. DMCs related to CRC were identified using the minfi package in R with a false discovery rate (FDR) of less than 0.05.
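
Two of the steps above, the 10% missing-value filter and the FDR threshold, can be sketched in a few lines of Python. This is not the minfi implementation used in the study: the helper names are hypothetical, missing values are represented as `None`, and the Benjamini-Hochberg procedure is assumed as the FDR method because the text does not name one.

```python
def keep_probe(values, max_missing=0.10):
    """True if the fraction of missing (None) values is at most max_missing."""
    missing = sum(v is None for v in values)
    return missing / len(values) <= max_missing


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-site p-values, for illustration only.
p = [0.01, 0.04, 0.03, 0.20]
q = benjamini_hochberg(p)
significant = [i for i, qi in enumerate(q) if qi < 0.05]
print(q, significant)
```

Sites whose adjusted value falls below 0.05 would be retained as DMCs under this assumption.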

Study subjects and sample collection

We then carried out a hospital-based case–control study including 390 primary CRC patients diagnosed at the Second Affiliated Hospital of Harbin Medical University and the Third Affiliated Hospital of Harbin Medical University from 2015 to 2017, and 397 cancer-free control individuals enrolled from the Second Affiliated Hospital of Harbin Medical University during the same period. Patients with neuroendocrine carcinoma, malignant melanoma, non-Hodgkin’s lymphoma, gastrointestinal stromal tumors, or Lynch syndrome, and controls with a self-reported history of gastrointestinal disease, were excluded. Peripheral blood (5 ml) was collected from every subject before surgery or other treatment and immediately stored at −80 °C. Because of insufficient DNA, the methylation levels of PDCD-1 were measured in only 389 cases and 386 controls.

Genomic DNA extraction and sodium bisulfite modification

The blood samples were centrifuged at 1600 × g for 10 min to separate the plasma and the buffy coats. We extracted genomic DNA from the buffy coats using the QIAamp DNA Blood Mini kit (Qiagen, Hilden, Germany) and bisulfite-modified it using the EpiTect Plus DNA Bisulfite Kit (Qiagen, Hilden, Germany) according to the manufacturer’s instructions. DNA quantity was measured using a NanoDrop 2000 spectrophotometer (Thermo Fisher, USA).

Data collection

Through in-person interviews, all subjects completed a structured questionnaire covering demographic characteristics (i.e., age, gender, and education), lifestyle (i.e., alcohol drinking, cigarette smoking, and physical activity), dietary status within the 12 months before the interview (i.e., fruits and vegetables, fried food, overnight food) and family history of CRC. The questions were modified from the questionnaire of Shu et al. (Shu et al. 2004). This study was approved by the ethical committee of Harbin Medical University. We obtained written informed consent from all subjects, and the study was performed in accordance with the ethical standards of the 1964 Declaration of Helsinki and its later amendments.

Quantitative methylation specific PCR (QMSP)

The CpG sites to be measured for the two genes were identified from the GEO dataset. The methylation statuses of PDCD-1 and LAG-3 were detected using quantitative methylation specific PCR (QMSP) on a LightCycler480 system (Roche Applied Science, Mannheim, Germany) and analyzed with Abs Quant/2nd Derivative Max software (version 2.0). Each PCR mixture had a total volume of 10 μL and contained 5 μL LightCycler480 High-Resolution Melting Master Mix (Roche Applied Science, Mannheim, Germany), 0.5 μL primer, approximately 10 ng bisulfite-modified template DNA, and 3.4 μL of PCR-grade water. The PCR conditions are shown in Additional file 1: Table S1. We constructed a series of plasmid standards (5 × 10⁷, 5 × 10⁶, 5 × 10⁵, 5 × 10⁴, 5 × 10³, 5 × 10², 5 × 10¹ copies/μL) on a background of diluted positive standards (Bisulfite Converted Universal Methylated Human DNA Standards; Zymo Research, D5015) to generate standard curves. The standard curves were obtained by plotting the cycle threshold (Ct) on the ordinate against the logarithm of the standard concentration on the abscissa. For PDCD-1, the pre-test sample range was 5 × 10⁷–5 × 10² copies/μL and the corresponding Cts were 13.51, 16.75, 20.05, 23.38, 28.05 and 29.59 (Tm = 79 ± 0.5 ℃) (Additional file 1: Fig S1). For LAG-3, the Cts for 5 × 10⁷–5 × 10¹ copies/μL were 14.14, 17.62, 21.08, 24.67, 28.17, 31.18 and 33.1 (Tm = 80 ± 0.5 ℃) (Additional file 1: Fig S2). For ACTB, the pre-test sample range was 5 × 10⁷–5 × 10³ copies/μL and the corresponding Cts were 13.83, 17.33, 20.71, 24.24 and 27.72 (Tm = 76 ± 0.5 ℃) (Additional file 1: Fig S3). The amplification efficiency (e) satisfied 0.9 < e < 1.0.
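
To make the standard-curve step concrete, the sketch below fits Ct against the base-10 logarithm of the standard concentration by ordinary least squares, using the ACTB values reported above, and derives an amplification efficiency from the slope. The relation e = 10^(−1/slope) − 1 is the conventional qPCR formula and is an assumption here, since the text does not state how the efficiency was calculated; the function names are likewise illustrative.

```python
import math


def fit_standard_curve(copies_per_ul, ct_values):
    """Least-squares fit of Ct = slope * log10(concentration) + intercept."""
    xs = [math.log10(c) for c in copies_per_ul]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Conventional efficiency e = 10^(-1/slope) - 1 (1.0 = perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


# ACTB standards and Cts as reported in the text.
actb_conc = [5e7, 5e6, 5e5, 5e4, 5e3]
actb_ct = [13.83, 17.33, 20.71, 24.24, 27.72]

slope, intercept = fit_standard_curve(actb_conc, actb_ct)
print(round(slope, 3), round(intercept, 2),
      round(amplification_efficiency(slope), 3))
```

Under this assumed formula, a slope near −3.32 corresponds to an efficiency of 1.0, and shallower (less negative) slopes give efficiencies above 1.0.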

The methylation ratio (QMSP value) was defined as the ratio of the fluorescence emission intensity value for the target gene-specific PCR product to that of ACTB (the reference gene), multiplied by 100 for easier tabulation. Additionally, two blank control (no-template control) samples were included in each batch, and all reactions were performed in duplicate. A third run was conducted if the two runs gave inconsistent results.
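
The QMSP value described above can be sketched as follows. The sketch assumes that target and reference quantities are read off their standard curves from the measured Ct and that duplicate reactions are averaged; the function names, the curve parameters and the example quantities are all hypothetical and serve only to show the arithmetic.

```python
def copies_from_ct(ct, slope, intercept):
    """Invert Ct = slope * log10(copies) + intercept to recover copies."""
    return 10 ** ((ct - intercept) / slope)


def mean_of_duplicates(first, second):
    """Average the two replicate measurements of one sample."""
    return (first + second) / 2.0


def qmsp_value(target_quantity, actb_quantity):
    """Methylation ratio: target / reference (ACTB), multiplied by 100."""
    if actb_quantity <= 0:
        raise ValueError("reference quantity must be positive")
    return 100.0 * target_quantity / actb_quantity


# Hypothetical duplicate quantities, for illustration only.
target = mean_of_duplicates(240.0, 260.0)
reference = mean_of_duplicates(990.0, 1010.0)
print(qmsp_value(target, reference))  # 25.0
```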

Statistical analysis

To evaluate the homogeneity between cases and controls, two-sample t-tests were performed for continuous variables and chi-square (χ2) tests for categorical variables. We used the receiver operating characteristic (ROC) curve to determine the cut-off value for categorizing all subjects into a hypomethylation group and a hypermethylation group. A methylation level in the target gene higher than the cut-off value was defined as hypermethylation; otherwise, the level was defined as hypomethylation. Univariate and multivariate logistic regressions were applied to estimate the associations of environmental factors, gene methylation in PBL and their interactions with CRC risk, reported as crude and adjusted odds ratios (ORs) with 95% confidence intervals (CIs). We applied crossover analysis to evaluate the interaction effects between environmental factors and methylation of PDCD-1 and LAG-3, and the interaction effect between methylation of PDCD-1 and LAG-3, on the risk of CRC. The statistical power of the study was calculated using PASS 15 (NCSS, USA). Statistical analyses were performed using STATA 14 (Stata Corp., USA) and R version 3.5.1. P values < 0.05 were considered statistically significant.
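
The ROC-based dichotomization can be illustrated with a small Python sketch. The text does not state which criterion was used to pick the point on the ROC curve, so maximizing Youden's index (sensitivity + specificity − 1) is an assumption here, as are the function names and the toy data. Because hypermethylation is defined as a level above the cut-off, the sketch treats a level at or below the cut-off as the "case-like" result.

```python
def youden_cutoff(values, is_case):
    """Return (cutoff, J) maximizing Youden's J over the observed values.

    A subject is predicted to be a case when value <= cutoff.
    """
    n_case = sum(is_case)
    n_ctrl = len(is_case) - n_case
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, c in zip(values, is_case) if c and v <= cutoff)
        tn = sum(1 for v, c in zip(values, is_case) if not c and v > cutoff)
        j = tp / n_case + tn / n_ctrl - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


def is_hypermethylated(value, cutoff):
    """Hypermethylation is a level strictly above the cut-off."""
    return value > cutoff


# Toy methylation levels and case labels, for illustration only.
levels = [1, 2, 3, 4, 5, 6, 7, 8]
cases = [True, True, True, False, True, False, False, False]
cutoff, j = youden_cutoff(levels, cases)
print(cutoff, j, [is_hypermethylated(v, cutoff) for v in levels])
```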

Results

Identification of DMCs and the association between gene methylation and CRC risk in GEO dataset

The basic characteristics of CRC patients and controls in the GEO dataset are shown in Additional file 1: Table S2. We adjusted for age and gender in the following analyses. Eight DMCs located in PDCD-1 (cg09031938, cg03903296, cg06291111, cg07281781, cg10057601, cg11532131, cg04789125, cg10994870) and four DMCs located in LAG-3 (cg10191002, cg04153135, cg06157570, cg14292870) were associated with CRC risk in the GEO dataset. Genomic information for the 12 CpG sites is shown in Table 1. For each gene, the CpG site that was located in a CpG island and had the smallest P value was selected as the candidate site for the following study (PDCD-1: cg06291111; LAG-3: cg10191002). As shown in Table 2, hypermethylation of PDCD-1 and of LAG-3 was associated with a lower risk of CRC (ORadj = 0.322, 95% CI 0.197–0.528, P < 0.001; ORadj = 0.666, 95% CI 0.446–0.996, P = 0.048, respectively).

Table 1 Genomic information of 12 CpG sites for PDCD-1 and LAG-3 in GEO dataset
Table 2 Associations between gene methylation and the CRC risk in GEO dataset

Basic demographic characteristics of the cases and controls in the validation study

A total of 361 CRC patients and 358 controls were included in the study. The basic characteristics of all subjects are shown in Table 3. The age distribution differed significantly between cases and controls and was therefore adjusted for in the following analyses. Previous reports suggest that estrogen delays the occurrence of CRC in females by 7–8 years (Lieberman et al. 2005). Although there was no statistically significant difference in gender distribution between the groups, we also adjusted for gender in the following analyses. Moreover, the distributions of environmental factors in all subjects are shown in Additional file 1: Table S3.

Table 3 Distribution of the basic characteristics of CRC patients and controls

The association between gene methylation and CRC risk

The cut-off values of PDCD-1 and LAG-3 were 22.94% and 1.46%, respectively (Additional file 1: Fig S8). For PDCD-1, 59.13% of cases (230/389) and 76.17% of controls (294/386) were hypermethylated. Hypermethylation of PDCD-1 was associated with a lower risk of CRC (ORadj = 0.448, 95% CI 0.322–0.622, P < 0.001). For LAG-3, 56.67% of cases (221/390) and 72.80% of controls (289/397) were hypermethylated. Hypermethylation of LAG-3 was also significantly associated with a lower risk of CRC (ORadj = 0.417, 95% CI 0.301–0.578, P < 0.001) (Table 4, Additional file 1: Fig S4–S6).
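
As a rough check on the magnitudes reported above, a crude (unadjusted) odds ratio and a Woolf-type 95% CI can be computed directly from the PDCD-1 counts. This is only a sketch: the reported estimates come from logistic regression adjusted for age and gender, so the crude value is expected to differ somewhat, and the function name is illustrative.

```python
import math


def odds_ratio_ci(exp_cases, unexp_cases, exp_ctrls, unexp_ctrls, z=1.96):
    """Crude odds ratio with a Woolf (log-scale) confidence interval."""
    odds_ratio = (exp_cases * unexp_ctrls) / (unexp_cases * exp_ctrls)
    se = math.sqrt(1 / exp_cases + 1 / unexp_cases
                   + 1 / exp_ctrls + 1 / unexp_ctrls)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# PDCD-1: 230 of 389 cases and 294 of 386 controls were hypermethylated.
crude_or, lower, upper = odds_ratio_ci(230, 389 - 230, 294, 386 - 294)
print(round(crude_or, 3), round(lower, 3), round(upper, 3))
```

Here "exposed" means hypermethylated, so an odds ratio below 1 indicates that hypermethylation is more frequent among controls than among cases.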

Table 4 Association between methylation levels of PDCD-1, LAG-3 in PBL and CRC risk

To eliminate variation caused by differences between repeated tests, we conducted quality control and recalculated the methylation levels using the average of all coefficients and slopes. Hypermethylation of PDCD-1 and of LAG-3 remained associated with a lower risk of CRC, with similar ORs (Additional file 1: Table S4).

Age stratification analysis

There were significant associations between hypermethylation of PDCD-1 and LAG-3 and reduced CRC risk in young subjects (< 60 years) (PDCD-1: OR = 0.441, 95% CI 0.289–0.673, P < 0.001; LAG-3: OR = 0.336, 95% CI 0.215–0.525, P < 0.001), as well as in old subjects (≥ 60 years) (PDCD-1: OR = 0.457, 95% CI 0.278–0.754, P < 0.001; LAG-3: OR = 0.584, 95% CI 0.368–0.926, P = 0.022) (Table 5). The associations between hypermethylation of these two genes and CRC risk did not differ significantly between young and old subjects (P > 0.05 for all comparisons). We observed similar results in the GEO dataset (Additional file 1: Table S5). The associations between hypermethylation of PDCD-1 and LAG-3 and CRC risk stratified by different environmental factors are shown in Additional file 1: Table S6.
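
One way to compare the age strata from the summary statistics alone is a z-test on the difference of the log odds ratios, with standard errors recovered from the reported 95% CIs. The text does not say which test the authors used for the between-stratum comparison, so this approach, and the function names, are assumptions made for illustration.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(OR) recovered from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def compare_odds_ratios(or1, ci1, or2, ci2):
    """z statistic and two-sided p-value for log(or1) - log(or2)."""
    diff = math.log(or1) - math.log(or2)
    se = math.sqrt(se_from_ci(*ci1) ** 2 + se_from_ci(*ci2) ** 2)
    z = diff / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return z, p


# Young vs old strata, using the ORs and CIs reported above.
print(compare_odds_ratios(0.441, (0.289, 0.673), 0.457, (0.278, 0.754)))  # PDCD-1
print(compare_odds_ratios(0.336, (0.215, 0.525), 0.584, (0.368, 0.926)))  # LAG-3
```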

Table 5 Associations between the methylation of individual genes and CRC risk stratified by age

Combination effect between the methylation of PDCD-1 and LAG-3 on CRC risk

A combination effect between PDCD-1 and LAG-3 methylation on CRC risk was observed (OReg = 0.217, 95% CI 0.127–0.370, P < 0.001). We did not observe a significant interaction between the methylation levels of PDCD-1 and LAG-3 on CRC risk (ORi = 0.683, 95% CI 0.341–1.371, P = 0.286) (Table 6). We observed similar results in the GEO dataset (OReg = 0.135, 95% CI 0.067–0.271, P < 0.001; ORi = 1.017, 95% CI 0.383–2.676, P = 0.980).
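
The distinction between a combination effect and an interaction can be illustrated with a crude crossover table. In this sketch, the combination effect is read as the odds ratio for subjects hypermethylated at both genes relative to subjects hypomethylated at both, and the multiplicative interaction is read as that odds ratio divided by the product of the two single-gene odds ratios. Both readings, the function name and all counts below are assumptions for illustration; the study's estimates come from adjusted logistic regression, not from this calculation.

```python
def crossover_odds_ratios(counts):
    """counts maps (pdcd1_hyper, lag3_hyper) -> (n_cases, n_controls).

    Returns the crude OR of each cell against the doubly hypomethylated
    reference cell, plus the multiplicative interaction ratio.
    """
    ref_cases, ref_ctrls = counts[(False, False)]
    ors = {}
    for key, (cases, ctrls) in counts.items():
        ors[key] = (cases * ref_ctrls) / (ctrls * ref_cases)
    interaction = ors[(True, True)] / (ors[(True, False)] * ors[(False, True)])
    return ors, interaction


# Hypothetical counts, for illustration only (not the study data).
table = {
    (False, False): (80, 40),
    (True, False): (60, 60),
    (False, True): (50, 50),
    (True, True): (40, 80),
}
ors, interaction = crossover_odds_ratios(table)
print(ors[(True, True)], interaction)  # 0.25 1.0
```

In this toy table the joint odds ratio equals the product of the single-gene odds ratios, so the interaction ratio is 1: a combination effect is present without any multiplicative interaction.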

Table 6 Effects of combination and interaction between methylation of PDCD-1 and LAG-3

Effects between environmental factors and PDCD-1 and LAG-3 methylation on CRC risk

We observed significant combination effects of hypermethylation of PDCD-1 and LAG-3 with higher consumption of alcohol, overnight food, salty food, hot-scalded food, pork, beef, fowl, pickled cabbage, and smoked and baked food on increased CRC risk. Moreover, a significant combination effect of LAG-3 hypermethylation with higher intake of eggs on reduced CRC risk was observed (Additional file 1: Table S7–S8).

A synergistic interaction between LAG-3 hypermethylation and higher intake of eggs in reducing CRC risk was observed (ORi = 0.389, 95% CI 0.190–0.794, P = 0.010). We did not find interactions between other environmental factors and PDCD-1 methylation on CRC risk (Additional file 1: Table S7–S8).

Discussion

In our study, we identified the DMCs of PDCD-1 and LAG-3 in the GSE51032 dataset and validated the relationship between the methylation levels of the two genes (PDCD-1: cg06291111; LAG-3: cg10191002) and CRC risk in a case–control study. We found that hypermethylation of PDCD-1 and of LAG-3 in white blood cell (WBC)-derived DNA was associated with a reduced risk of CRC. There were significant interaction and combination effects between hypermethylation of the two genes and environmental factors on CRC risk. We also observed a combination effect between PDCD-1 and LAG-3 methylation on CRC risk.

Epigenetic alterations of ICs, such as aberrant methylation/demethylation patterns, may alter gene expression and tumorigenesis (Elashi 2019). To examine the epigenetic modifications behind the abnormal expression of ICs, Sasidharan Nair et al. (2018) compared the expression of demethylation enzymes (TETs) and methylation enzymes (DNMTs) in tumor tissue and normal tissue and found that the expression of demethylation enzymes was significantly higher, and that of methylation enzymes lower, in tumor tissue. Evidence has also shown that TET protein levels are upregulated in solid tumors (Ficz et al. 2014). These data prompted us to examine the CpG methylation profile of the promoter regions of ICs. Moreover, leukocyte DNA methylation statuses of some genes may be associated with susceptibility to or risk of CRC (Gao et al. 2012). Therefore, the methylation levels of ICs in PBL may reflect an individual's susceptibility to CRC. Interestingly, studies have reported strong synergy between the PDCD-1 and LAG-3 inhibitory pathways in tolerance to both self and tumor antigens and have argued strongly that dual blockade of these molecules represents a promising combinatorial strategy for cancer, suggesting that the two molecules may jointly contribute to T cell apoptosis, reduce autoimmune function, and induce tumor-mediated immune suppression (Woo et al. 2012; Lucas et al. 2011; Matsuzaki et al. 2010; Grosso et al. 2009). In our study, we measured the methylation levels and the interaction effect of PDCD-1 and LAG-3 in nearly 800 PBL samples, including 390 CRC cases and 397 controls. We observed protective effects of hypermethylation of PDCD-1 and LAG-3 in PBL and a statistically significant combination effect between the hypermethylation of the two genes, similar to the results in the GEO dataset. Notably, the combination effect of PDCD-1 and LAG-3 in the leukocytes of CRC patients and the underlying epigenetic modifications are important for understanding the complex inhibitory immune mechanisms involved in CRC risk.

The results from the GEO dataset, the TCGA database (Additional file 1: Fig S7) and previous studies (Elashi 2019; Sasidharan Nair et al. 2018) showed that, compared with normal tissue and PBL samples from healthy controls, the methylation status of PDCD-1 in tumor tissues and in PBL samples from CRC patients was concordant with its transcriptomic expression in CRC: the lower the methylation, the higher the expression. For LAG-3, however, the results were not completely consistent between tissue and PBL samples. Thus, blood-derived DNA methylation measurements may not always represent the methylation levels in colorectal tissue (Li et al. 2013; McKay et al. 2011). Although all somatic cells in a given individual are genetically identical, different cell types form highly different anatomical structures and perform widely different physiological functions (Christensen 2009). It is speculated that, during tissue differentiation and development, transcription-relevant control regions in the genome are selectively demethylated or hypermethylated so that a restricted set of genes can be transcribed within a given tissue (Walsh et al. 1999). Compared with other tissues, the methylation pattern in DNA obtained from blood may be more "plastic", because blood is highly exposed to environmental influences (such as lifestyle) (McKay et al. 2011).

Some changes in methylation correlate closely with age; these may provide markers of biological aging and play an important part in cancer. The genome continues to undergo programmed variation in methylation after birth in response to environmental inputs, serving as a memory that could affect aging and predisposition to cancer (Christensen 2009; Teschendorff et al. 2010; Jones et al. 2015). Because age is a key factor in both cancer risk and gene methylation, we speculated that age might be a confounding factor affecting the association with CRC risk. Notably, we did not find any statistically significant difference between the young group (< 60 years) and the old group (≥ 60 years) (P > 0.05 for all comparisons) in the associations between methylation of these two genes and CRC risk. This may imply that the methylation of these two genes is an independent biomarker of susceptibility to CRC.

Altered DNA methylation is associated with environmental exposures encountered throughout life (Walsh et al. 1999). In our study, we observed that LAG-3 hypermethylation and higher intake of eggs (≥ 3 per week) could synergistically reduce the risk of CRC. Vitamin B2 in eggs has antioxidant properties, can serve as a cofactor to enhance one-carbon metabolism and maintain intestinal mucosal stability, and has been implicated in lowering CRC risk (Yoon et al. 2016). Moreover, eggs are rich in a variety of trace elements, such as zinc (Stepien et al. 2017), which have a role in reducing the risk of cancer. In addition, egg yolk can be enriched in conjugated linoleic acid, which has a wide range of biological functions, such as cancer inhibition (Bhattacharya et al. 2006; Larsson et al. 2005) and immune enhancement (Bassaganya-Riera et al. 2012). Therefore, DNA methylation changes in white blood cells may reflect the effects of epigenetic modification, the immune system, environmental exposure, or the interaction among these factors on the risk of CRC.

However, our study has some limitations. First, we did not distinguish between the cell types within leukocytes. Different types of leukocytes may have different DNA methylation statuses during carcinogenesis. Nevertheless, studies of DNA methylation in different leukocyte subtypes have suggested that confounding by leukocyte subtype is potentially a minor issue that would not affect the DNA methylation level in peripheral blood (Zilbauer et al. 2013). Second, our results are based on a case–control study and cannot confirm whether abnormal changes in the methylation statuses of PDCD-1 and LAG-3 are epigenetic events preceding CRC or consequences of the cancer. However, the GSE51032 dataset comes from a nested case–control study of the prospective EPIC-Italy cohort (Cordero et al. 2015), in which the blood samples were collected 74.1 months (range from 0.2 to 172.8 months) prior to CRC diagnosis. This dataset can directly confirm the temporal relationship between methylation changes and tumorigenesis, and our results are consistent with those from the GEO dataset. Third, information on the subjects' exposures was collected by recall, so the reported environmental factors may not reflect the true frequencies and amounts. Nevertheless, the interviewers tried their best to collect accurate information. Finally, the sample size in the stratified analysis was relatively small, which may limit the statistical power of our research. A statistical power of 0.8 is usually considered an acceptable threshold. The statistical power of the age stratification analysis in our study was 0.82, only slightly above this threshold. Therefore, more studies involving larger samples may be needed to improve the statistical power.

Conclusion

The methylation levels of PDCD-1 and LAG-3 might serve as blood-based predictive biomarkers for identifying individuals at lower risk of developing CRC. Moreover, gene-environment interactions may play a vital role in CRC risk.