Introduction

Lung cancer is one of the most common malignant tumors [1]. With the rising detection rate of lung cancer and worsening environmental pollution, the morbidity and mortality of lung cancer are increasing year by year [2]. Lung cancer is now the leading cause of cancer death worldwide in both sexes combined (18.4% of total cancer deaths) [3, 4]. In 2012, 1.8 million new cases of lung cancer occurred, accounting for about 13% of all new cancer cases [1, 4, 5]. Epidemiological studies have demonstrated that lung cancer diagnoses are associated with smoking in highly developed countries [6], but only 10–15% of smokers develop lung cancer, and the morbidity of lung cancer varies by sex, race, and region. Studies show that lung cancer morbidity is highest in North America, East Asia, the Middle East, and Southern Europe, while female lung cancer rates are highest in North America and Southern Europe [5, 7]. In addition, many studies have shown that the development, metastasis, and prognosis of lung cancer are related to mutations in many genes, such as EGFR, KRAS, and BRAF [8, 9]. Together, these findings indicate that lung cancer is not caused by environmental factors alone; genetic factors also play a role in its occurrence and development.

Cytochrome P450 (CYP450) enzymes, which include ω-hydroxylases, participate in the metabolism of many endogenous substances and exogenous compounds, including fatty acids, docosahexaenoic acid, and vitamin D, as well as a wide variety of carcinogens and anticancer drugs [10, 11]. The resulting reactive metabolites can interact with DNA, thereby altering gene expression or function and eventually leading to carcinogenesis [12]. CYP450 may therefore influence tumorigenesis and tumor progression. CYP4F2 and CYP3A5 are members of the CYP450 family. A recent study showed that CYP4F2 expression is closely associated with hepatocellular carcinoma cells and may contribute to tumor progression [10]. Related studies demonstrated that 20-HETE (a CYP4F2-related product) was associated with tumor growth in mouse non-small cell lung cancer cell lines [13]. Another study confirmed that CYP3A5 was associated with lung cancer in the population of Taiwan, China [14]. However, little research has been done on the association between CYP4F2 and CYP3A5 gene polymorphisms and lung cancer in the Chinese Han population of mainland China.

Therefore, in this study, four SNP loci in the CYP4F2 and CYP3A5 genes were analyzed to explore the association between polymorphisms of these genes and the risk of lung cancer.

Materials and methods

Subject recruitment and sample collection

A case–control study, involving 510 lung cancer patients as the case group and 504 healthy individuals as the control group, was conducted at the First Affiliated Hospital of Xi’an Jiaotong University, Shaanxi, China. All included patients had recently been diagnosed with pathologically confirmed primary lung cancer. The subjects in the control group were recruited from the Health Checkup Center of the First Affiliated Hospital of Xi’an Jiaotong University; they undergo a health examination annually and have no history of cancer and no chronic or serious endocrine, metabolic, or nutritional diseases. Patients were ascertained to be free from any acute or chronic pathology. Blood samples from the patients with lung cancer were collected before initiation of chemotherapy or radiotherapy. All of the participants were genetically unrelated ethnic Han Chinese and agreed to participate in the present study. The protocols for this study were approved by the Ethics Committees of both the First Affiliated Hospital of Xi’an Jiaotong University and Northwest University.

Five milliliters of whole blood was collected from each subject into tubes containing ethylenediaminetetraacetic acid at the time of initial diagnosis. After centrifugation, the samples were stored at −80 °C until further use. The characteristics of all study participants are summarized in Table 1.

Table 1 Characteristics of case group and control group in the study

SNP selection and primer design

Based on GWAS studies of tumors and reports in the related literature, four SNP loci in the CYP4F2 and CYP3A5 genes were selected. All loci met the criterion that the minor allele frequency was greater than 5% in the HapMap Chinese Han Beijing population. Primers were designed with ASSAY Design SUITE V2.0 (https://agenacx.com/online-tools). All primers were designed according to the forward-strand sequence from the dbSNP database.
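The minor allele frequency criterion can be illustrated with a minimal sketch. The function names and genotype counts below are hypothetical; they are not taken from HapMap or from this study, and they only show how a 5% threshold might be checked from genotype counts.

```python
def minor_allele_frequency(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Return the minor allele frequency from counts of genotypes AA, AB, BB."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    if total_alleles == 0:
        raise ValueError("no genotyped samples")
    # Each AA individual carries two A alleles, each AB individual carries one.
    freq_a = (2 * n_aa + n_ab) / total_alleles
    # The minor allele is whichever of the two alleles is less frequent.
    return min(freq_a, 1.0 - freq_a)


def passes_maf_filter(n_aa: int, n_ab: int, n_bb: int,
                      threshold: float = 0.05) -> bool:
    """True if the minor allele frequency is strictly greater than the threshold."""
    return minor_allele_frequency(n_aa, n_ab, n_bb) > threshold


# Hypothetical genotype counts for illustration only (not study data).
common_snp = (60, 30, 10)
rare_snp = (98, 2, 0)
print(passes_maf_filter(*common_snp), passes_maf_filter(*rare_snp))
```

A locus failing this check would carry too few minor alleles for an association test of this size to be informative, which is the usual motivation for such a filter.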

DNA purity detection and genotyping

DNA was extracted using a whole-blood genomic DNA purification kit (Xi’an GoldMag Biological Company). The concentration and purity of the DNA were measured with a NanoDrop Lite ultraviolet spectrophotometer (Thermo Scientific, Waltham, Massachusetts, USA). Genotyping of all SNPs was performed on the MassARRAY iPLEX platform (Agena Bioscience, San Diego, CA, USA) using a matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) mass spectrometer. The results were output by Agena Bioscience TYPER 4.0 software.

Statistical analysis

Microsoft Excel (Microsoft, Redmond, WA) and SPSS software (version 19.0, SPSS, Chicago, IL) were used for statistical analysis. The Chi-square test was used to compare the frequency distributions of potential confounding factors (age, sex, etc.) between the case and control groups, to determine whether the two groups were comparable. Hardy–Weinberg equilibrium (HWE) was tested for all SNPs in the control group using the Chi-square test. Fisher's exact test was used to compare the allele and genotype frequencies of each locus between the two groups. Logistic regression analysis, adjusted for age and gender, was used to assess the association between each SNP and the risk of lung cancer under different genetic models (additive, dominant, and recessive) and to calculate odds ratios (ORs) and 95% confidence intervals (CIs). In all comparisons, a two-sided P value < 0.05 was considered statistically significant. The association between each SNP and lung cancer risk was also evaluated, using the same methods, after stratification by age, gender, and pathological type of lung cancer.
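As an illustration of the HWE check described above, the following is a minimal sketch of a one-degree-of-freedom Chi-square goodness-of-fit test on genotype counts. It is not the SPSS procedure used in the study; the function name and the example counts are hypothetical.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Chi-square test (1 degree of freedom) for Hardy-Weinberg equilibrium.

    Returns (chi2 statistic, p-value) for observed genotype counts AA, AB, BB.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped samples")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Skip classes with zero expected count (monomorphic locus).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical control-group counts for illustration only (not study data).
chi2, p_value = hwe_chi_square(25, 50, 25)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A P value above 0.05 in the control group is conventionally read as no evidence of departure from HWE, which is the condition the SNPs were required to meet.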

Results

Population characteristics

A total of 510 patients with lung cancer were included in this study. The mean age of the cases was 58.08 (± 10.548) years; 75.3% were male and 24.7% female. The control group comprised 504 individuals with a mean age of 57.27 (± 10.852) years, of whom 75.6% were male and 24.4% female. The Chi-square test showed no significant difference in age or sex between the case group and the control group (age: P = 0.227; gender: P = 0.911) (Table 1).

SNP and the risk of lung cancer

Basic information on the four SNP loci in the CYP4F2 and CYP3A5 genes is shown in Table 2. All SNP loci were in accordance with Hardy–Weinberg equilibrium (HWE), as assessed by the Chi-square test and Fisher’s exact test in SPSS software. The allele frequency distributions of the case group and the control group were compared using the Chi-square test in Plink software. The P values of all SNP loci in the whole population were greater than 0.05 under the allele model, indicating no significant difference between the two groups in the whole population.

Table 2 SNPs in the CYP4F2 and CYP3A5 genes

Because none of the four SNP loci in the CYP4F2 and CYP3A5 genes was significantly associated with lung cancer risk in the overall population under the allele model, stratified analyses were carried out by age, sex, and pathological type of lung cancer. In the stratified analyses, three of the four selected SNP loci were significantly associated with a lower risk of lung cancer under the allele model: rs3093106 (CYP4F2), rs3093105 (CYP4F2), and rs10242455 (CYP3A5). The SNPs rs3093105 and rs3093106 differed significantly between the two groups in subjects older than 58 years (rs3093105: P = 0.023, OR 0.59, 95% CI 0.37–0.93; rs3093106: P = 0.029, OR 0.60, 95% CI 0.38–0.94), in lung adenocarcinoma (rs3093105: P = 0.023, OR 0.59, 95% CI 0.37–0.93; rs3093106: P = 0.025, OR 0.60, 95% CI 0.38–0.94), and in male patients (rs3093105: P = 0.017, OR 0.68, 95% CI 0.49–0.93; rs3093106: P = 0.020, OR 0.68, 95% CI 0.50–0.94). The SNP rs10242455 in the CYP3A5 gene differed significantly between the two groups in lung squamous cell carcinoma (P = 0.018, OR 0.71, 95% CI 0.53–0.94) (Tables 3, 4).
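To make the reported odds ratios and confidence intervals easier to interpret, the sketch below computes an unadjusted allelic OR with a Wald (Woolf) 95% CI from a 2 × 2 table of allele counts. This is an assumption-laden illustration, not the adjusted logistic regression used in the study, and the allele counts are hypothetical.

```python
import math


def allelic_odds_ratio(case_minor: int, case_major: int,
                       ctrl_minor: int, ctrl_major: int,
                       z: float = 1.96) -> tuple[float, float, float]:
    """Unadjusted allelic odds ratio with a Wald (Woolf) confidence interval.

    Returns (OR, lower bound, upper bound); z = 1.96 gives a 95% interval.
    """
    counts = (case_minor, case_major, ctrl_minor, ctrl_major)
    if min(counts) <= 0:
        raise ValueError("all four allele counts must be positive")
    odds_ratio = (case_minor * ctrl_major) / (case_major * ctrl_minor)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(sum(1.0 / c for c in counts))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts for illustration only (not study data).
odds_ratio, lower, upper = allelic_odds_ratio(40, 160, 60, 140)
print(f"OR {odds_ratio:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

An OR below 1 whose upper confidence bound is also below 1, as in the stratified results above, indicates that the minor allele is less frequent among cases than among controls at the 5% significance level.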

Table 3 Association between CYP4F2 gene polymorphism and lung cancer under different stratification analyses (adjusted by sex and age)
Table 4 Association between CYP3A5 gene polymorphism and lung cancer under different pathological types (adjusted by sex and age)

In addition, the three selected SNP loci were analyzed in different subgroups and under different genetic models using logistic regression analysis. The results showed that rs3093105 and rs3093106 were associated with a reduced risk of lung cancer under the dominant and additive models in lung adenocarcinoma, in male patients, and in patients older than 58 years. The association remained after adjustment for age and gender (Table 3). The SNP rs10242455 was associated with a lower risk of lung squamous cell carcinoma under the dominant and additive models, and this association also remained after adjustment for age and gender (Table 4).
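The additive, dominant, and recessive models differ only in how each genotype is coded before it enters the regression. The sketch below shows one common coding convention; the function name is hypothetical, and the study's own coding in SPSS is assumed rather than known to follow it.

```python
def encode_genotype(minor_allele_count: int, model: str) -> int:
    """Code a genotype (0, 1, or 2 copies of the minor allele) for a genetic model.

    additive:  0, 1, 2 (risk changes with each copy of the minor allele)
    dominant:  0, 1, 1 (one copy is enough to change risk)
    recessive: 0, 0, 1 (two copies are required to change risk)
    """
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1, or 2")
    if model == "additive":
        return minor_allele_count
    if model == "dominant":
        return int(minor_allele_count >= 1)
    if model == "recessive":
        return int(minor_allele_count == 2)
    raise ValueError(f"unknown genetic model: {model}")


# Show the coding of the three possible genotypes under each model.
for model in ("additive", "dominant", "recessive"):
    print(model, [encode_genotype(g, model) for g in (0, 1, 2)])
```

The coded value would then serve as the SNP predictor in a logistic regression that also includes age and gender as covariates.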

In addition, the correlation between CYP3A5 expression and prognosis was analyzed using the TCGA database, the GEPIA database (http://gepia.cancer-pku.cn/), and the Kaplan–Meier plotter database (http://kmplot.com/). The expression of the CYP3A5 gene was lower in cancer tissues than in para-tumor tissues (Fig. 1), and lung cancer patients with lower expression had a worse prognosis (Fig. 2).

Fig. 1

Expression of the CYP3A5 gene in lung squamous cell carcinoma (n = 486) and para-tumor tissues (n = 338) from the GEPIA database. The Y-axis shows log2(TPM + 1) (TPM, transcripts per million). The box plots show the interquartile range (IQR) and the median (bar in box). CYP3A5 expression is significantly lower in lung squamous cell carcinoma (*p < 0.01)

Fig. 2

Relationship between CYP3A5 gene expression and prognosis of lung squamous cell carcinoma. The Y-axis is the survival rate. The red line represents patients with lung squamous cell carcinoma and low CYP3A5 expression, and the black line represents those with high CYP3A5 expression; a P value < 0.05 was considered statistically significant; HR, hazard ratio; the numbers at the bottom indicate the number of patients still alive at each survival time (color figure online)

Discussion

This study suggests that polymorphisms in CYP3A5 and CYP4F2 are associated with a reduced risk of NSCLC, and that this association depends on age, sex, and the pathological type of lung cancer.

CYP4F2 is a member of the CYP4F family. Several studies have revealed marked mRNA up-regulation of genes encoding CYP4 enzymes in thyroid, breast, colon, and ovarian cancers. Alexanian et al. [15] confirmed that the levels of CYP4F2 and 20-HETE in ovarian cancer tissues were higher than those in the normal control group. However, the correlation between the CYP4F gene and lung cancer has not previously been reported. To our knowledge, our study is the first to report a significant correlation between CYP4F2 gene polymorphisms and lung cancer in the Chinese Han population, with a lower risk of lung cancer observed in people older than 58 years, in lung adenocarcinoma, and in men. Similarly, Ankit et al. confirmed that the expression of CYP4F2 was increased in pancreatic ductal carcinoma, was negatively correlated with age, and was higher in males [16]. This is similar to the conclusion of the present study. In addition, many studies have confirmed that CYP4F2 is closely related to the metabolism of 20-hydroxyeicosatetraenoic acid (20-HETE) [17,18,19]. In the past decade, 20-HETE has been recognized as a key regulator of cancer progression, which can induce cell proliferation in vitro by stimulating the formation of reactive oxygen species and the production of vascular endothelial growth factor. Previous studies have shown that a 20-HETE antagonist (WIT002) can inhibit the proliferation of renal adenocarcinoma [20]. Similarly, two studies have demonstrated that HET0016 (a 20-HETE antagonist) can inhibit the growth of tumors in non-small cell lung cancer cell lines and of human glioma [13, 21]. We hypothesize that CYP4F2 gene polymorphisms influence the risk of lung cancer by altering the metabolism of 20-HETE, which in turn affects the growth of cancer cells by regulating the vascular endothelial growth factor signaling pathway. However, further experiments are needed to confirm this.

CYP3A5 is an important member of the CYP3A family. It participates in the catalytic oxidation of many exogenous substances, including toxins and carcinogens, and in the metabolism and clearance of some drugs [1]. Studies have shown that CYP3A5 plays an important role in the development of acute and chronic leukemia, colorectal cancer, and esophageal cancer [22,23,24,25]. Islam et al. [26] reported that CYP3A5 was a risk factor for lung cancer in the Bangladeshi population. Interestingly, we found that CYP3A5 was a protective factor for NSCLC in the Chinese Han population, which may be related to racial differences. Similarly, in a study of the population of Taiwan, China, CYP3A5 was confirmed to play a protective role in the development of lung cancer [14]. Also, Feng et al. indicated that CYP3A5 plays a protective role in the occurrence and metastasis of hepatocellular carcinoma. They also confirmed that CYP3A5 over-expression in hepatocellular carcinoma cells inhibits the metastasis and invasion of cancer cells in vivo and in vitro by regulating the ROS/mTORC2/p-AKT (S473) signaling pathway and limiting MMP2/9 function [25, 27]. Research has found that a SNP within intron 3 (CYP3A5*3) results in aberrant mRNA splicing and a pronounced reduction in protein synthesis [28]. Likewise, rs10242455 is an intronic variant of the CYP3A5 gene. We therefore suspect that this variant may alter mRNA splicing and protein synthesis of CYP3A5, which may in turn affect the ROS/mTORC2/p-AKT (S473) signaling pathway and MMP2/9 function and thereby influence the occurrence of lung cancer.

In addition, we found that the CYP3A5 gene was expressed at low levels in lung squamous cell carcinomas, and that the survival rate was lower among lung cancer patients with low expression. Similarly, Tingdong suggested that the lower the expression of CYP3A5, the worse the prognosis in hepatocellular carcinoma patients [25]. Another study in a Chinese population showed that the CYP3A5 gene is closely related to the prognosis of patients with non-small cell lung cancer undergoing chemotherapy and surgical treatment [1]. This is similar to our conclusion. Moreover, two recent studies indicated that the CYP3A5 gene participates in the metabolism of docetaxel and sunitinib, and that different genotypes differ in drug dosage requirements and drug toxicity [29, 30]. This suggests that the CYP3A5 gene may be related not only to the risk and prognosis of lung cancer, but also to its treatment and drug selection. It may be a predictor of the occurrence, development, and prognosis of lung cancer, but studies with larger samples are needed to confirm these findings.

Our results indicate that CYP4F2 and CYP3A5 gene polymorphisms are associated with the risk of lung cancer. We hope that these results will encourage future studies with larger sample sizes to confirm the relationship between the CYP4F2 and CYP3A5 genes and lung cancer, and to clarify their specific mechanisms in the occurrence and development of lung cancer. Our study still has limitations. First, the treatment and survival time of the lung cancer patients were not taken into consideration. Second, we did not collect smoking data for the participants, so further study is needed to address these deficiencies.

Conclusion

This study found that CYP4F2 and CYP3A5 gene polymorphisms were associated with the risk of NSCLC.