Background

Lung cancer is one of the most common malignant neoplasms worldwide, according to data gathered from many countries, including China [1, 2]. In China, the crude incidence and mortality of lung cancer have increased rapidly over the past 30 years [3]. Previous epidemiological studies suggest that the occurrence and development of lung cancer result from complicated interactions between environmental and genetic factors, with tobacco exposure considered a major environmental risk factor. With regard to genetic risk factors, as genome-wide association studies (GWAS) have advanced, multiple studies of the associations between single nucleotide polymorphisms (SNPs) in non-coding RNA genes and various cancers have gradually emerged.

Long non-coding RNAs (lncRNAs) are non-coding RNAs more than 200 nucleotides in length that lack protein-coding function and open reading frames [4,5,6]. In terms of cellular function, lncRNAs can be classified as tumor-suppressive or oncogenic, in a way similar to protein-coding genes [7]. LncRNAs may act as gene regulators through intricate mechanisms, including post-transcriptional, trans- and cis- gene regulation in carcinogenic pathways [8,9,10,11]. Many previous studies have focused on the role of lncRNAs in cancer risk, prognosis, diagnosis and targeted therapy, which indicates that lncRNAs have moved to the forefront of human cancer research.

Paired-box gene 8 (PAX8), a member of the paired-box (PAX) gene family, is mapped to chromosome 2q13 and has multiple oncogenic properties, such as inhibition of apoptosis to promote tumor cell survival, activation of BCL2 transcription and suppression of p53 [12,13,14]. A large number of studies have shown that PAX8 is strongly expressed in several types of tumors, where it exerts its biological functions and ultimately affects cancer development [15]. PAX2/5/8 can promote cell proliferation, while these transcription factors play conserved roles in affecting cell apoptosis [16,17,18]. In addition, specific protein–protein interactions can regulate the activity of PAX transcription factors. For example, PAX8 transcriptional activity may be activated when PAX2/5/8 interact with the tumor suppressor pRb [19]. Several studies have also identified the PAX8-PPARγ rearrangement as the most common genetic event, apart from RAS gene mutations, in follicular thyroid tumors [20,21,22,23,24]. The PAX8-PPARγ rearrangement may impair normal PPARγ protein function, and this loss of function would result in uncontrolled cell growth [25, 26]. PAX8 may also eliminate the inhibitory effect of the PTEN gene by acting on AKT to immunologically activate the PI3K signaling pathway [26]. Simultaneous activation of these signaling pathways is highly carcinogenic and may cause distant metastasis of tumor cells and the occurrence of locally infiltrating follicular carcinoma. In particular, by analyzing the expression profile of the PAX gene family, Kanteti et al. [27] reported that PAX8 is overexpressed in non-small cell lung cancer (NSCLC), whereas PAX5 protein is predominantly expressed in small cell lung cancer (SCLC), and speculated that PAX8 is likely to interfere with p53 transcription and contribute to tumor development. Silencing of PAX8 expression may impair the viability and motility of NSCLC A549 cells [28]. These previous studies laid a solid foundation for the hypothesis examined here.

Recently, two studies evaluated the association of expression quantitative trait loci (eQTL) SNPs in lncRNA AC016683.6, which is located in the upstream region of the PAX8 gene and could alter PAX8 expression, with susceptibility to cervical cancer [29] and the prognosis of hepatocellular carcinoma [30]. On the basis of this evidence, we hypothesized that the rs4848320 and rs1110839 polymorphisms in AC016683.6 may influence the risk of lung cancer. To test this hypothesis, we conducted a hospital-based case–control study of 434 lung cancer cases and 593 cancer-free controls to evaluate the associations between the two SNPs and lung cancer susceptibility. We also investigated the potential interaction of the two SNPs with smoking exposure on lung cancer risk, which is valuable for exploring cancer etiology and for informing the control of environmental risk factors in cancer prevention.

Materials and methods

Study subjects and sample data collection

The hospital-based case–control study was conducted in Shenyang, Liaoning Province, in northeast China. All enrolled subjects were unrelated ethnic Han Chinese. In total, 434 newly diagnosed lung cancer patients (111 males and 323 females) and 593 cancer-free subjects (180 males and 413 females) were enrolled in the case and control groups, respectively. The inclusion criteria for cases were as follows: (a) newly diagnosed lung cancer confirmed by histopathology; (b) no previous cancer; (c) no previous radiotherapy or chemotherapy. The inclusion criteria for controls were criteria (b) and (c) above. A small number of controls had other conditions, mainly coronary heart disease, hypertension and diabetes. The controls were recruited from the medical examination centers of the hospitals during the same period and were matched to the cases on age (± 5 years) to ensure comparability. The study was approved by the Institutional Review Board of China Medical University. All participants or their representatives signed informed consent in accordance with relevant regulations. Each participant donated 10 ml of peripheral blood and completed a 1.5 h interview in which demographic data were collected by questionnaire. A subject who had smoked more than 100 cigarettes in his or her lifetime was defined as a smoker; otherwise, he or she was regarded as a non-smoker.

SNP selection and genotyping

Based on data from the Regulome Database (http://www.regulomedb.org) [31], we selected the rs4848320 and rs1110839 SNPs in PAX8-AS1, both of which meet the criterion of minor allele frequency (MAF) > 0.05 in the Han Chinese population [32, 33]. Following a previously described method [34], we isolated genomic DNA from the venous blood of each participant by phenol–chloroform extraction. The two SNPs were genotyped on an Applied Biosystems 7500 FAST Real-Time PCR System (Foster City, CA, USA) using Taqman® allelic discrimination (Applied Biosystems, Foster City, CA) with a commercial primer–probe set. The real-time PCR conditions were 10 min at 95 °C, followed by 47 cycles of 30 s at 92 °C and 1 min at 60 °C.

Statistical analysis

Differences in demographic variables between the case and control groups were tested with the t test, the χ2 test or logistic regression analysis, according to the type of variable (categorical or continuous). A goodness-of-fit χ2 test was performed to assess Hardy–Weinberg equilibrium (HWE) in the control group. The linear-by-linear association χ2 test was used for trend analysis of dose-dependent effects. Odds ratios (ORs) and their 95% confidence intervals (CIs), estimated by logistic regression analysis, were used to evaluate the relationships between rs4848320 and rs1110839 genotypes and lung cancer risk. Crossover analysis was used to examine the interaction between the SNPs and smoking exposure. Multiplicative interaction was evaluated by the OR and its 95% CI from an unconditional logistic regression model. Additive interaction was assessed using the relative excess risk due to interaction (RERI), the synergy index (S) and the attributable proportion due to interaction (AP). Additive interaction was considered present if the 95% CIs of RERI and AP did not contain 0 and the 95% CI of S did not contain 1 [35]. All statistical analyses were performed with SPSS 22.0 software (IBM SPSS, Inc. Chicago, IL, USA). All tests were two-sided, and P < 0.05 was regarded as statistically significant.
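To illustrate how an OR and its 95% CI are obtained, the following minimal Python sketch computes a crude (unadjusted) OR with a Wald confidence interval from a 2×2 table. It is only an illustration and not the SPSS procedure used in this study, which estimated adjusted ORs by logistic regression; the function name `odds_ratio_wald` and the cell counts are hypothetical.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.959964):
    """Crude odds ratio with a Wald confidence interval.

    a: exposed cases,   b: exposed controls,
    c: unexposed cases, d: unexposed controls.
    z is the normal quantile (1.959964 gives a 95% interval).
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, NOT taken from the study tables
or_, lo, hi = odds_ratio_wald(40, 30, 60, 90)
print(f"OR = {or_:.3f}, 95% CI = {lo:.3f}-{hi:.3f}")
```

An interval that excludes 1 corresponds to P < 0.05 for a two-sided Wald test. Adjusted estimates such as those reported here require a regression model with age, gender and smoking status as covariates.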

Results

Baseline characteristics

Table 1 presents the demographic characteristics of the 593 controls and 434 lung cancer patients, including 331 with non-small cell lung cancer (NSCLC), 233 with adenocarcinoma (AD), 89 with squamous cell carcinoma (SQ) and 61 with small cell lung cancer (SCLC). The mean age ± SD was 57.97 ± 11.85 in the case group and 56.67 ± 15.88 in the control group. There were no statistically significant differences in age or gender between the case and control groups. However, the distribution of smoking status differed significantly between the two groups (P = 0.044). Therefore, all further statistical analyses were adjusted for age, gender and smoking status. The genotype frequency distributions of the two SNPs in the control group were in agreement with HWE (P = 0.556 for rs4848320, P = 0.643 for rs1110839), which is consistent with the selected subjects being a representative random sample of the overall population.
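The HWE check on control genotypes can be sketched as follows. This is a minimal illustration of a goodness-of-fit χ2 test with 1 degree of freedom for a biallelic SNP, assuming genotype counts are available; the function name `hwe_chi_square` and the counts in the example are hypothetical and are not the study data.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Goodness-of-fit chi-square test for Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb: observed counts of the two homozygotes and the
    heterozygote (n_ab) for a biallelic SNP. Returns (chi2, p_value)
    with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    if p in (0.0, 1.0):
        raise ValueError("SNP is monomorphic; HWE test is undefined")
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical control genotype counts, NOT the study data
chi2, p_value = hwe_chi_square(370, 190, 33)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A P value above 0.05, as reported for both SNPs in the control group, means that the observed genotype counts do not deviate significantly from the Hardy–Weinberg expectation.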

Table 1 Demographics of patients with lung cancer and controls

Genotype distribution and lung cancer susceptibility

Table 2 summarizes the associations between the SNP genotypes and susceptibility to lung cancer and NSCLC. For the rs4848320 polymorphism, no statistically significant association was found in any genetic model (CT vs. CC: OR = 1.036, 95% CI = 0.788–1.362, P = 0.800; TT vs. CC: OR = 1.041, 95% CI = 0.482–2.247, P = 0.918; dominant model: OR = 1.036, 95% CI = 0.795–1.352, P = 0.792; recessive model: OR = 1.030, 95% CI = 0.479–2.212, P = 0.940; adjusted for age, gender and smoking). Likewise, rs1110839 was not significantly associated with lung cancer in any model (TG vs. TT: OR = 1.083, 95% CI = 0.830–1.413, P = 0.559; GG vs. TT: OR = 1.079, 95% CI = 0.714–1.632, P = 0.717; dominant model: OR = 1.082, 95% CI = 0.839–1.394, P = 0.543; recessive model: OR = 1.034, 95% CI = 0.702–1.525, P = 0.864; adjusted for age, gender and smoking). Similar results were observed in the NSCLC, lung adenocarcinoma and squamous cell carcinoma subgroups (Tables 2 and 3). We then estimated the combined effect of the variant alleles (rs4848320-T and rs1110839-G) on the risk of lung cancer and NSCLC (Table 4). The risks of lung cancer and NSCLC did not change significantly with an increasing number of risk genotypes of rs4848320 and rs1110839 in the trend analysis (LC: P = 0.893; NSCLC: P = 0.964). The results of analyses stratified by smoking exposure, gender and age are presented in Tables 5, 6 and 7, respectively. Among smokers, carriers of the rs4848320 CT and TT genotypes had a significantly increased risk of lung cancer in the dominant model (OR = 1.929, 95% CI = 1.070–3.479, P = 0.029, adjusted for age and gender) but not in the other three models. The rs1110839 GG genotype was also significantly associated with an increased risk of lung cancer (OR = 3.224, 95% CI = 1.089–9.544, P = 0.034). In males, rs1110839 genotypes were associated with a significantly increased risk of lung cancer in all genetic models except the heterozygous model (GG vs. TT: OR = 3.096, 95% CI = 1.343–7.134, P = 0.008; dominant model: OR = 1.804, 95% CI = 1.061–3.067, P = 0.029; recessive model: OR = 2.376, 95% CI = 1.102–5.121, P = 0.027). However, no significant effect on lung cancer susceptibility was observed in the non-smoking, female or age subgroups.

Table 2 Association between the two SNPs and risk of lung cancer and non-small cell lung cancer
Table 3 Association between the two SNPs and risk of lung adenocarcinoma and squamous cell lung cancer
Table 4 Cumulative effects of rs4848320-T and rs1110839-G on lung cancer and non-small cell lung cancer susceptibility
Table 5 Stratified analyses of association between the two SNPs and lung cancer risk by smoking exposure
Table 6 Stratified analyses of association between the two SNPs and lung cancer risk by gender
Table 7 Stratified analyses of association between the two SNPs and lung cancer risk by age

Interaction between SNPs and smoking exposure

Table 8 presents the results of the crossover analysis. For rs4848320, carriers of the TT and CT genotypes with smoking exposure had a 2.218-fold increased risk of lung cancer (P = 0.005). For rs1110839, carriers of the TG and GG genotypes with smoking exposure had a 1.755-fold increased risk of lung cancer (P = 0.017). However, the measures of additive interaction indicated that neither rs4848320 nor rs1110839 had an additive interaction with smoking exposure (Table 9). For the multiplicative interaction between rs4848320 risk genotypes and smoking exposure, the OR was greater than 1, indicating a positive multiplicative interaction (P = 0.016; Table 10). In other words, when the rs4848320 TT and CT genotypes coexist with smoking exposure, the joint effect of the two risk factors is stronger than expected from their separate effects, which suggests a synergistic effect.
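The additive and multiplicative interaction measures can be sketched from the three ORs of a crossover analysis, each relative to the reference group of non-smokers without a risk genotype. The sketch below gives point estimates only and uses the standard definitions of RERI, AP and S; the function names and the example ORs are hypothetical, and the confidence intervals used to judge significance (which need covariance estimates from the fitted model, for example via the delta method or bootstrap) are not shown.

```python
def additive_interaction(or10, or01, or11):
    """Point estimates of RERI, AP and S.

    or10: OR for risk genotype only (non-smokers),
    or01: OR for smoking only (no risk genotype),
    or11: OR for risk genotype and smoking together,
    all relative to non-smokers without a risk genotype.
    """
    reri = or11 - or10 - or01 + 1
    ap = reri / or11
    denom = (or10 - 1) + (or01 - 1)
    # S is undefined when the separate excess risks sum to zero
    s = (or11 - 1) / denom if denom != 0 else float("nan")
    return reri, ap, s


def multiplicative_ratio(or10, or01, or11):
    """Ratio of the joint OR to the product of the separate ORs.

    In a model without other covariates this equals the OR of the
    genotype-by-smoking product term; a value above 1 indicates
    positive multiplicative interaction.
    """
    return or11 / (or10 * or01)


# Hypothetical ORs, NOT the values in Tables 8 to 10
reri, ap, s = additive_interaction(1.5, 2.0, 4.0)
print(f"RERI = {reri:.3f}, AP = {ap:.3f}, S = {s:.3f}")
print(f"multiplicative ratio = {multiplicative_ratio(1.5, 2.0, 4.0):.3f}")
```

Under exact additivity RERI and AP equal 0 and S equals 1, which is why those values serve as the null values for the confidence intervals described in the Statistical analysis section.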

Table 8 Crossover analysis of interaction between rs4848320, rs1110839 risk genotypes and smoking exposure
Table 9 Additive interaction between rs4848320, rs1110839 risk genotypes and smoking exposure
Table 10 Multiplicative interaction between rs4848320, rs1110839 risk genotypes and smoking exposure

Discussion

This case–control study provides an initial investigation of the associations between polymorphisms in AC016683.6 and lung cancer susceptibility. We found that the rs4848320 and rs1110839 polymorphisms significantly increased the risk of lung cancer in the smoking and male populations. The crossover analyses then showed that carriers of the risk genotypes of the two SNPs with smoking exposure had an increased risk of lung cancer. Moreover, there was a positive multiplicative interaction between rs4848320 risk genotypes and smoking.

Genome-wide cancer mutation analyses are uncovering how the particular properties of lncRNAs contribute to crucial cancer phenotypes through the regulation of lncRNA transcription [6], suggesting that lncRNAs are powerful modulators involved in chromatin interaction, transcriptional regulation, and DNA, protein and RNA processing [36]. McGinnis et al. first discovered the paired box gene family in Drosophila [37]. Each member of the family is a highly conserved transcriptional regulator that actively participates in the sophisticated regulation of intracellular signal transduction. Cooperative regulation among the PAX genes plays a crucial role in biological processes such as the transcription of development-related genes, the promotion of cell proliferation and differentiation, the inhibition of apoptosis, the induction of directional metastasis of precursor cells and the selective processing of different splice sites of precursor mRNA [38,39,40]. PAX8, a member of the paired-box gene family, plays a complex role in the development of various diseases, including cancers [12]. Substantial evidence indicates that PAX8 is an effective marker for identifying serous ovarian tumors [41, 42], metastatic müllerian carcinomas [43] and renal cell carcinoma [44]. The expression of PAX5 and PAX8 differs in lung cancer tissues, which may help to identify potential therapeutic targets: PAX5 tends to be expressed in SCLC cells, whereas PAX8 is overexpressed in most NSCLC cell lines [27]. Moreover, Muratovska et al. showed that PAX genes influence cell survival in specific types of cancer [17].

Previous studies have suggested that eQTL SNPs in AC016683.6 might contribute to the susceptibility to or prognosis of different cancers [29, 30, 45]. Han et al. first conducted a case–control study, which revealed that rs4848320 TT and rs1110839 GG decreased the risk of cervical cancer, with corresponding adjusted ORs (95% CIs) of 0.58 (0.36–0.93) and 0.75 (0.58–0.97), respectively [29]. However, another study found that the rs4848320 polymorphism may be a risk factor for childhood acute lymphoblastic leukemia [45]. Moreover, Ma et al. demonstrated that SNPs in AC016683.6 were significantly associated with the prognosis of hepatocellular carcinoma [30]. In the present study, neither rs4848320 nor rs1110839 was associated with lung cancer risk in the overall population. However, we found that the rs4848320 and rs1110839 polymorphisms increased the risk of lung cancer in the smoking and male populations. In addition, we found a positive multiplicative interaction between rs4848320 risk genotypes and smoking exposure, indicating that the interaction increased susceptibility to lung cancer.

Previous evidence indicates that SNPs in lncRNAs may alter their structural stability, generate alternative splicing, impair the translation of target mRNAs, and ultimately affect the risk of various cancers [6]. For example, the rs35592567 polymorphism affects the expression of TP63 by interfering with miR-140, which may explain the increased susceptibility to gastric cancer [46]. Additionally, Xue et al. reported that rs7958904 G > C markedly altered the secondary structure of HOTAIR, suggesting that the SNP affects susceptibility to colorectal cancer. LncRNA AC016683.6 is located in the intron region of PAX8, which is mapped to chromosome 2q13. The rs1110839 and rs4848320 polymorphisms in AC016683.6 may be eQTL SNPs for the PAX8 gene. Consequently, it is likely that the two SNPs influence the specific interaction between AC016683.6 and PAX8, directly regulating the expression of PAX8. Therefore, the functional properties of SNPs in oncogenic lncRNAs need to be explored, which may help to develop the potential of lncRNAs in the diagnosis and treatment of cancer.

Several limitations of this study should be noted. First, the subjects were all from Shenyang and may not be fully representative. Second, the sample size is limited, especially in the stratified subgroups. Third, functional verification of the two SNPs in AC016683.6 was not performed in the current study. Therefore, larger studies across different ethnicities are needed to confirm these results.

Conclusions

The rs4848320 and rs1110839 genetic variants may be associated with an increased risk of lung cancer. The rs4848320 risk genotypes combined with smoking exposure strengthened the effect on lung cancer risk. The rs4848320 and rs1110839 genetic variants in AC016683.6 may therefore serve as potential genetic risk factors for lung cancer susceptibility, although this needs to be verified in a larger population. In addition, the biological function of the two SNPs in AC016683.6 in lung cancer needs further clarification. Together with previous studies, the current study provides new clues for the functional analysis of cancer-related susceptibility loci.