Introduction

Tobacco smoking is a risk factor for various diseases such as cancers, pulmonary and cardiovascular disease, type-2 diabetes, and obesity (Vineis et al. 2004; Mathers and Loncar 2006; Thun et al. 2010; CDC 2008). The World Health Organization (WHO) reported that about 6 million people worldwide die from smoking annually (WHO 2014). Smoking is the primary cause of approximately half a million deaths annually in the United States (Mokdad et al. 2004), and of most lung cancer cases (more than 4 of 5 cases) in developed countries (Fairley et al. 2010). Notably, smoking is a main source of cadmium exposure; cadmium accumulates gradually in the human body and has a half-life of 10–30 years (Jarup and Akesson 2009). Cadmium concentrations are much higher in the blood of smokers than in that of non-smokers (Batariova et al. 2006; El-Agha and Gokmen 2002; Elinder et al. 1983), and a smoking-induced increase in urinary cadmium concentration is associated with kidney dysfunction (Mortensen et al. 2011). Cadmium exposure is associated with the development or progression of cancers, cardiovascular dysfunction, nephrotoxicity, and bone damage (Larsson and Wolk 2016; Nordberg et al. 1992; Jarup and Akesson 2009).

Smoking can cause changes in DNA methylation, which plays an essential role in the transcriptional regulation of oncogenes, tumor suppressor genes, and inflammation-related genes (Sundar et al. 2011; Yao and Rahman 2011). Generally, hypermethylation at specific CpG sites in the gene promoter is associated with gene silencing, and hypomethylation is related to the activation of gene expression (Yang and Schwartz 2011; Jones 2012). Variation in DNA methylation at specific CpG sites is associated with diseases such as cancers and inflammatory and pulmonary diseases (Sundar et al. 2011, 2013; Selamat et al. 2012; Morrow et al. 2016; Cheng et al. 2016). Recently, several studies have investigated the changes in DNA methylation associated with smoking, as well as the link between smoking and diseases, such as lung cancer and chronic obstructive pulmonary disease, with respect to DNA methylation status (Ma and Li 2017; Sundar et al. 2017). Cadmium exposure due to smoking can induce alterations in DNA methylation (Virani et al. 2016); however, only a limited number of studies have focused on alterations in DNA methylation due to smoking-related cadmium exposure. In this study, we identified differentially methylated CpG sites in Korean smokers compared to Korean non-smokers through a microarray-based approach. We also investigated whether these CpG sites were differentially methylated by cadmium exposure due to smoking.

Methods and Materials

Study Population

Study participants included 100 non-smokers and 100 current smokers, who enrolled as volunteers for the fifth Korean National Health and Nutrition Examination Survey (2008–2011). We randomly selected subjects based on sex, age, and self-reported smoking history. All selected subjects were male and there was no difference in the average age of the two groups; the average age of the non-smokers was 52 years and that of smokers was 51 years (Table 1). Non-smokers had no smoking history over their lifetime. Smokers had a smoking history of more than 20 cigarettes per day for the past 20 years and first started smoking after the age of 15; the average smoking period per person was 31.54 years and the annual average number of packs of cigarettes per person was 439.64. All study participants provided informed consent.

Table 1 Characteristics of study subjects

DNA Methylation Analyses

A total of 50 non-smokers and 50 smokers were randomly selected from participants enrolled in the study for DNA methylation analyses. DNA samples, extracted from buffy coat samples of these subjects, were obtained from the National Biobank of Korea. DNA was bisulfite-converted using the EZ DNA Methylation™ Kit (Zymo Research, California, USA), and DNA methylation profiles were analyzed using the Infinium Human Methylation 450K BeadChip (Illumina, San Diego, CA), which contains 485,512 CpG sites, according to the manufacturer’s protocol. The methylation rate at each CpG site was calculated by comparing fluorescent signals from methylated and unmethylated sites. The methylation rates are presented as mean beta (β) values, which range from 0 (at a completely unmethylated site) to 1.0 (at a completely methylated site). The delta (Δ) β value is defined as the difference between the mean β value of smokers and that of non-smokers (mean β value of smokers − mean β value of non-smokers). Methylation rates of the two groups were compared using the independent t test. CpG sites with |Δβ value| ≥ 0.05 and P value < 0.01 were considered differentially methylated.
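The per-site comparison described above can be sketched as follows. This is a minimal illustration and not the pipeline used in the study: the function names, the intensity offset of 100 in the β calculation, the pooled-variance form of the t test, and the numerical integration used for the P value are all assumptions made for the example.

```python
import math
from statistics import mean, variance


def beta_value(methylated, unmethylated, offset=100.0):
    """Methylation rate from the two probe intensities.

    The offset is an assumption (a common convention for stabilising
    low-intensity probes); the text does not state the exact formula.
    """
    return methylated / (methylated + unmethylated + offset)


def t_two_sided_p(t, df, steps=2000):
    """Two-sided P value of Student's t, by Simpson integration of the density."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * central)


def compare_site(smoker_betas, nonsmoker_betas):
    """Return (delta_beta, p_value) for one CpG site (pooled-variance t test)."""
    n1, n2 = len(smoker_betas), len(nonsmoker_betas)
    delta = mean(smoker_betas) - mean(nonsmoker_betas)  # smokers minus non-smokers
    pooled = ((n1 - 1) * variance(smoker_betas)
              + (n2 - 1) * variance(nonsmoker_betas)) / (n1 + n2 - 2)
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    if se == 0.0:
        return delta, (1.0 if delta == 0.0 else 0.0)
    return delta, t_two_sided_p(delta / se, n1 + n2 - 2)


def is_differentially_methylated(delta, p, min_delta=0.05, max_p=0.01):
    """Apply the thresholds stated in the text: |delta beta| >= 0.05 and P < 0.01."""
    return abs(delta) >= min_delta and p < max_p


# Hypothetical beta values for a single site (not data from the study).
delta, p = compare_site([0.60, 0.62, 0.64, 0.66], [0.80, 0.82, 0.84, 0.86])
print(round(delta, 2), is_differentially_methylated(delta, p))
```

In practice, a statistics library routine such as scipy.stats.ttest_ind would normally replace the hand-written P value calculation; the standard-library version is used here only to keep the sketch self-contained.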

Measurement of Urinary Cotinine and Blood Cadmium Levels

Urinary cotinine concentrations were measured by gas chromatography–mass spectrometry (GC–MS) using a PerkinElmer Clarus 600 T (PerkinElmer, Finland). Blood cadmium concentrations were measured by graphite furnace atomic absorption spectrometry (GFAAS) using a PerkinElmer AAnalyst 600 (PerkinElmer, Finland).

Gene Ontology Analyses

To identify the biological functions of genes that are differentially methylated due to smoking or cadmium exposure, Gene Ontology (biological process terms) analysis was performed using the DAVID Bioinformatics Resources 6.8 Functional Annotation Tool (https://david.ncifcrf.gov/). For this analysis, we used genes differentially methylated by smoking or cadmium exposure. Significant terms were chosen when the Benjamini–Hochberg-corrected P value < 0.05.
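The multiple-testing step can be illustrated with a short sketch of the Benjamini–Hochberg adjustment. It shows only the correction applied to a list of raw P values, not how DAVID computes enrichment; the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values for four GO terms (not results from the study).
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q < 0.05]
print(adj, significant)
```

With these example inputs only the first term remains below the 0.05 cutoff after correction, even though three raw P values are below 0.05.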

Correlation Analyses

The correlation between blood cadmium concentrations and DNA methylation rates was assessed using linear regression. For this analysis, the methylation rate (β value) at each CpG site and the blood cadmium concentration (μg/L) from 50 non-smokers and 50 smokers were used. P value < 0.001 was considered statistically significant.
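A per-site regression of this kind can be sketched as below. The example assumes an ordinary least-squares fit of β value on blood cadmium concentration with no covariates, which the text does not specify; the function name and the input values are hypothetical.

```python
import math


def regress_methylation_on_cadmium(cadmium, betas):
    """Least-squares fit of beta value on blood cadmium (ug/L) for one CpG site.

    Returns (slope, intercept, r, t), where t is the test statistic for the
    slope with n - 2 degrees of freedom. Undefined for a perfect fit (|r| = 1).
    """
    n = len(cadmium)
    mean_x = sum(cadmium) / n
    mean_y = sum(betas) / n
    sxx = sum((x - mean_x) ** 2 for x in cadmium)
    syy = sum((y - mean_y) ** 2 for y in betas)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cadmium, betas))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    t = r * math.sqrt((n - 2) / (1 - r * r))
    return slope, intercept, r, t


# Hypothetical values for five subjects (not data from the study).
cadmium = [0.5, 1.0, 1.5, 2.0, 2.5]
betas = [0.85, 0.80, 0.78, 0.70, 0.67]
slope, intercept, r, t = regress_methylation_on_cadmium(cadmium, betas)
print(round(slope, 3), round(intercept, 3))
```

The t statistic would then be converted to a two-sided P value from the t distribution with n − 2 degrees of freedom and compared with the 0.001 threshold, repeating the fit for every CpG site on the array.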

Results

Comparison of Blood Cadmium and Urinary Cotinine Levels Between Smokers and Non-smokers

The smokers (n = 100) and non-smokers (n = 100) had mean blood cadmium concentrations of 1.67 ± 0.68 μg/L and 0.83 ± 4.23 μg/L, respectively (Table 1). The average concentration of blood cadmium was more than twice as high in smokers as in non-smokers. The urinary cotinine level is a sensitive biomarker for tobacco smoking (Kulza et al. 2012; Raja et al. 2016). Our results showed that the smokers and non-smokers had mean urinary cotinine concentrations of 1847.95 ± 1178.87 ng/mL and 12.22 ± 17.16 ng/mL, respectively. The average concentration of urinary cotinine was more than 100 times higher in smokers than in non-smokers.

Differential DNA Methylation Between Smokers and Non-smokers

A total of 136 CpG sites, mapping to 70 unique genes, were differentially methylated in smokers compared to non-smokers (|Δβ value| ≥ 0.05; P value < 0.01) (Supplementary data 1). Among these, 92 CpG sites, mapping to 51 unique genes, were under-methylated in smokers compared to non-smokers; 44 CpG sites, mapping to 19 unique genes, were over-methylated in smokers. The average Δβ value of the 92 under-methylated CpG sites was −0.07 (range, −0.05 to −0.21), and the average Δβ value of the 44 over-methylated CpG sites was 0.07 (range, 0.05 to 0.15). The top 30 sites with the highest fold change among the differentially methylated CpG sites are listed in Table 2; of these, 25 were under-methylated and 5 were over-methylated in smokers. The cg05575921 site in AHRR showed hypomethylation [Δβ value = −0.21; log2 (fold change) = −0.41]. The rs951295 site in the RNA gene LOC105370802 [Δβ value = −0.18; log2 (fold change) = −0.62], cg00587941 [Δβ value = −0.16; log2 (fold change) = −0.31], and cg23576855 in AHRR [Δβ value = −0.17; log2 (fold change) = −0.40] were under-methylated, while cg11314779 in CELF6 [Δβ value = 0.15; log2 (fold change) = 0.38] and cg02126896 [Δβ value = 0.15; log2 (fold change) = 0.39] were over-methylated.
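The two effect-size measures reported for each site are related but not interchangeable, as the following sketch illustrates. It assumes that fold change is the ratio of the mean β value of smokers to that of non-smokers, which the text does not define explicitly; the group means used are hypothetical and are not values from Table 2.

```python
import math


def effect_sizes(mean_beta_smokers, mean_beta_nonsmokers):
    """Return (delta_beta, log2_fold_change) for one CpG site.

    Fold change is assumed here to be the smoker/non-smoker ratio of mean betas.
    """
    delta_beta = mean_beta_smokers - mean_beta_nonsmokers
    log2_fc = math.log2(mean_beta_smokers / mean_beta_nonsmokers)
    return delta_beta, log2_fc


# Hypothetical group means chosen only for illustration.
delta_beta, log2_fc = effect_sizes(0.64, 0.85)
print(round(delta_beta, 2), round(log2_fc, 2))
```

Under this assumption, the same absolute difference in β yields a larger log2 fold change at a site with low baseline methylation than at a highly methylated site, which is one reason a ranking by fold change can differ from a ranking by Δβ.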

Table 2 Top 30 differentially methylated CpG sites in non-smokers (n = 50) and smokers (n = 50)

We performed gene ontology analysis on 70 genes that were differentially methylated in smokers (Table 3). AHRR (cg05575921, cg23576855, cg14817490, cg03991871, cg21161138, cg25648203), GFI1 (cg09935388), HOPX (cg25456368), RARA (cg19572487), RARG (cg20059012), REST (cg25313468), and ZFP57 (cg12463578) are associated with negative regulation of transcription, and all were under-methylated in smokers.

Table 3 Gene ontology analysis of 70 differentially methylated genes in non-smokers and smokers

DNA Methylation Associated with Cadmium Exposure

To identify cadmium exposure-related DNA methylation, we evaluated the correlation between blood cadmium concentration and DNA methylation rate at each CpG site, using genome-wide DNA methylation data obtained from 50 smokers and 50 non-smokers. DNA methylation rates at 307 CpG sites, mapping to 207 unique genes, were significantly correlated with the blood cadmium concentrations of the study subjects (P value < 0.001) (data not shown). The top ten sites with the most significant correlations were cg03991871, cg05575921, cg12806681, cg21161138, and cg23576855 in AHRR, cg03636183 in F2RL3, cg05951221, cg01940273, cg19859270 in GPR15, and cg21566642. We analyzed the biological functions of the 207 genes (Table 4); these genes, including AHRR, F2RL3, HOPX, RARA, and RARB, were typically associated with transcription regulation and signal transduction.

Table 4 Gene ontology analysis of genes showing significant correlation between DNA methylation rate and blood cadmium concentration

Identification of Genes Commonly Associated with Smoking and Blood Cadmium Exposure

To identify DNA methylation induced by cadmium exposure due to smoking, we selected CpG sites that were differentially methylated in smokers from among the cadmium exposure-associated CpG sites described above. Thirty-eight CpG sites (mapping to 23 unique genes) were identified (Table 5). The cg05575921 and cg23576855 sites in AHRR, cg03636183 in F2RL3, and cg21566642 showed a Δβ value < −0.10 for DNA methylation rates between smokers and non-smokers.
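The selection step amounts to intersecting two lists of CpG sites, which can be sketched as follows. The function name is hypothetical, and the two input lists are small illustrative subsets: the four named sites are those the text reports in both analyses, while the placeholder identifiers are invented to show sites found in only one list.

```python
def shared_cpg_sites(smoking_sites, cadmium_sites):
    """Return the CpG sites present in both analyses, sorted by identifier."""
    return sorted(set(smoking_sites) & set(cadmium_sites))


# Illustrative subsets only; "site_only_*" are placeholders, not real probes.
smoking_sites = ["cg05575921", "cg23576855", "cg03636183", "cg21566642",
                 "site_only_smoking"]
cadmium_sites = ["cg21566642", "cg03636183", "cg23576855", "cg05575921",
                 "site_only_cadmium"]

common = shared_cpg_sites(smoking_sites, cadmium_sites)
print(common)
```

Applied to the full lists of 136 smoking-associated sites and 307 cadmium-associated sites, the same operation would return the sites reported in Table 5.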

Table 5 Thirty-eight CpG sites that were differentially methylated due to smoking and blood cadmium concentration

Discussion

In this study, we identified smoking-induced methylation alterations at 136 CpG sites (mapping to 70 unique genes). The cg05575921 site in AHRR showed hypomethylation in smokers, in accordance with previous studies (Zeilinger et al. 2013; Dogan et al. 2014; Lee et al. 2017). These data support the suggestion that methylation levels of AHRR (cg05575921) may be used as an indicator of smoking intensity (Beach et al. 2015). Our findings were also consistent with those of previous studies on smoking-associated DNA methylation at other CpG sites. Studies have reported that cg21161138 and cg26703534 in AHRR, cg01940273, cg06126421, cg21566642 (Zeilinger et al. 2013; Dogan et al. 2014), cg03636183 in F2RL3 (Zeilinger et al. 2013; Breitling et al. 2011), and cg19572487 in RARA (Zeilinger et al. 2013) are under-methylated in smokers. These studies included participants from the KORA S4 survey, African American females from the states of Iowa and Georgia, and participants in the general population-based epidemiological ESTHER study (Zeilinger et al. 2013; Dogan et al. 2014; Breitling et al. 2011). Taken together, these data suggest that the DNA methylation patterns of 8 CpG sites (cg05575921, cg21161138, and cg26703534 in AHRR, cg01940273, cg06126421, cg21566642, cg03636183 in F2RL3, and cg19572487 in RARA) might change depending on smoking status, regardless of race. In addition, we found that rs951295, cg00587941, cg11314779, and cg02126896 were under- or over-methylated by ≥ 15% in smokers. These results indicate that methylation at these 4 sites may provide new candidate indicators of long-term smoking exposure. The biological implications of methylation changes at rs951295 (within the RNA gene LOC105370802), cg00587941, and cg02126896 remain unknown. The cg11314779 site is located in an intron of CELF6. Because CELF6 is associated with addiction (Bryant and Yazdani 2016), it may be interesting to study the relationship between methylation of cg11314779 and addiction.

Smoking is the main source of cadmium exposure. The blood cadmium concentrations of smokers were found to be significantly higher than those of non-smokers in this study. We identified cadmium exposure-related DNA methylation at 307 CpG sites (mapping to 207 unique genes). CCL22, a signal transduction-related gene, was differentially methylated by cadmium exposure. A search of the PubMed database (February 17, 2020) retrieved a previous study on the relationship between cadmium exposure and CCL22. The mRNA level of CCL22 decreased in antigen-activated lymphocytes due to cadmium treatment (Ebaid et al. 2014). Further studies are needed to determine whether DNA methylation of CCL22 due to cadmium exposure affects its expression. The cadmium-associated sites in our study included cg05575921 and cg23576855 in AHRR, cg03636183 in F2RL3, and cg21566642, which have not previously been reported to be associated with cadmium exposure. These sites were under-methylated by > 10% in smokers compared to non-smokers. DNA methylation of cg05575921 in AHRR (Beach et al. 2015) and of cg03636183 in F2RL3 (Zeilinger et al. 2013; Breitling et al. 2011) is a putative indicator of smoking. Taken together, these CpG sites (cg05575921 and cg23576855 in AHRR, cg03636183 in F2RL3, and cg21566642) may be differentially methylated by cadmium exposure due to smoking.

The Gene Ontology terms were analyzed to identify the biological functions of genes differentially methylated due to smoking and cadmium exposure. Among the 70 genes found to be differentially methylated in smokers compared to non-smokers, HOPX (cg25456368), RARG (cg20059012), and ZFP57 (cg12463578), which were under-methylated in non-promoter regions, are involved in negative regulation of transcription. Previous studies have reported that hyper-methylation in non-promoter regions of MMP9 (Falzone et al. 2016) and CDKN2A (Ben-Dayan et al. 2017) was associated with transcriptional activation. Therefore, the expression of HOPX, RARG, and ZFP57 might also be regulated by DNA methylation in non-promoter regions. Glandular epithelial cell development (RARA), negative regulation of transcription from RNA polymerase II promoter (AHRR), and regulation of myelination (RARA) are the biological functions of some genes that were differentially methylated by both smoking and cadmium exposure. The RARA gene acts as a retinoic acid receptor, nuclear receptor, and steroid hormone receptor (Tsaprouni et al. 2014). Recently, it has been reported that differential DNA methylation in RARA is associated with smoking in African Americans (Barcelona et al. 2019). Our data showed that cg19572487 in RARA was under-methylated by 9% (Δβ value = −0.05) in smokers compared to non-smokers (data not shown). Thus, alterations to DNA methylation in RARA may be caused by cadmium exposure as well as smoking.

The urinary cotinine level is a sensitive biomarker for tobacco smoking (Kulza et al. 2012; Raja et al. 2016). Behera et al. (2003) showed that mean urinary cotinine levels were 2736.20 ± 983.29 ng/mL and 7.30 ± 2.47 ng/mL in smokers and non-smokers, respectively. Sharma et al. (2019) reported that mean urinary cotinine levels were 1043.69 ± 1514.01 ng/mL and 13.60 ± 12.73 ng/mL in smokers and non-smokers, respectively. Our study found that smokers had a mean urinary cotinine concentration of 1847.95 ± 1178.87 ng/mL and non-smokers had a mean of 12.22 ± 17.16 ng/mL. Together, these findings suggest that urinary cotinine values may be an indicator of smoking.

In conclusion, our study showed that 136 CpG sites (mapping to 70 unique genes) were differentially methylated by smoking. Among these, DNA methylation levels at rs951295, cg00587941, cg11314779, and cg02126896 may be new putative indicators of smoking intensity. Furthermore, DNA methylation at cg05575921 and cg23576855 in AHRR, cg03636183 in F2RL3, and cg21566642 may be altered by smoking-induced cadmium exposure. These findings provide novel insight into smoking-induced genetic alterations that might be involved in associated diseases.