Introduction

Primary liver cancer is currently the fourth most common malignancy and the second leading cause of cancer-related death in China, seriously threatening public health [1]. Hepatocellular carcinoma (HCC) is the most common form of primary liver cancer, accounting for approximately 75–85% of cases [2]. Because its early symptoms are not obvious, HCC is often not detected until it has progressed to a later stage. Modern treatment can prolong patients’ lives to some extent, but HCC cannot be completely cured; it is prone to recurrence and metastasis and has a poor prognosis [3]. HCC has a multifactorial etiology that includes viral factors, such as hepatitis B virus (HBV) and hepatitis C virus (HCV), and non-viral factors, such as smoking, alcohol consumption, aflatoxin, and obesity [4]. However, these risk factors do not fully account for the rising morbidity and mortality of HCC. With the rapid development of molecular genetics, researchers have found that genetic variants, such as single nucleotide polymorphisms (SNPs), are an important contributor to the occurrence of HCC [5]. Hence, studying the relationship between SNPs and HCC susceptibility is a pressing need and can provide theoretical guidance for the clinical diagnosis and prevention of HCC.

The patatin-like phospholipase domain-containing 3 gene (PNPLA3), also known as adiponutrin, encodes a transmembrane protein of 481 amino acids that is expressed on the hepatocyte membrane and regulates lipid metabolism and inflammatory mediators [6, 7]. PNPLA3, located on the long arm of human chromosome 22 (22q13.31), belongs to the patatin-like phospholipase family and is highly expressed in liver and adipose tissue [8, 9]. Numerous studies have examined the association of PNPLA3 gene polymorphisms with non-alcoholic fatty liver disease (NAFLD) [10,11,12], alcoholic liver disease (ALD) [13], liver fibrosis (LF) [14, 15], and liver cirrhosis (LC) [16]. In contrast, few studies have addressed the association of PNPLA3 gene polymorphisms with HCC. Therefore, further research on the relationship between PNPLA3 gene polymorphisms and HCC susceptibility is of great value.

In this study, we performed association analyses to determine the effect of PNPLA3 gene polymorphisms on HCC susceptibility in the Chinese Han population. Based on the 1000 Genomes Project database and Haploview software, combined with Hardy–Weinberg equilibrium (HWE) test and primer design principles, we selected four SNPs (rs738409, rs3747207, rs4823173, and rs2896019) of the PNPLA3 gene for further study. Finally, we assessed the relationship between PNPLA3 SNPs and HCC susceptibility, which can provide clues for identifying more genetic polymorphisms related to HCC susceptibility.

Materials and methods

Study subjects

A power analysis was carried out before the study to determine the required sample size. According to this analysis, the case group and control group should consist of at least 484 and 486 individuals, respectively. Accordingly, this study included 971 subjects (484 HCC patients and 487 controls) from Hainan General Hospital. All patients with HCC were diagnosed by imaging, hematological molecular and pathological examinations in accordance with the guidelines for the diagnosis and treatment of primary HCC in China. The control group consisted of healthy individuals who underwent physical examination during the same period as the cases. The inclusion criteria for the control group were routine blood and biochemical indicators within the normal reference range and no endocrine, cardiovascular, kidney, or liver diseases. All subjects were genetically unrelated Han Chinese individuals. This study was approved by the Ethics Committee of Hainan General Hospital, and all subjects signed informed consent.

SNP selection and genotyping

Four genetic polymorphisms (rs738409, rs3747207, rs4823173, and rs2896019) in PNPLA3 were selected according to the following procedure. First, all variant loci in PNPLA3 were downloaded from the 1000 Genomes Project. Second, Haploview software was used to filter SNPs with the following parameters: HWE p > 0.01 and minor allele frequency (MAF) > 0.05. Finally, based on primer design and a literature search, four candidate SNPs were identified. Peripheral venous blood (5 mL) was drawn from each subject, placed in an EDTA anticoagulant tube and stored at 4 °C for DNA extraction. A GoldMag extraction kit (GoldMag Co., Ltd., Xi’an, China) was used to extract genomic DNA from peripheral venous blood. DNA concentration was measured using a NanoDrop 2000 (Thermo Scientific, Waltham, Massachusetts, USA). The primer design software Assay Designer 3.1 was used to design primers for the four SNPs (Additional file 1: Table S1). Genotyping of the four SNPs was performed using the Agena MassARRAY system.
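The HWE and MAF filter described above can be illustrated with a minimal Python sketch. It is not the Haploview implementation: the function names, the chi-square goodness-of-fit form of the HWE test, and the genotype counts are assumptions made for illustration only.

```python
import math

def minor_allele_frequency(n_aa, n_ab, n_bb):
    """MAF from genotype counts (aa, ab, bb)."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1 - freq_a)

def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 df: erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(chi2 / 2))

def passes_filter(counts, hwe_threshold=0.01, maf_threshold=0.05):
    """Keep a SNP only if HWE p > threshold and MAF > threshold."""
    return (hwe_chi2_pvalue(*counts) > hwe_threshold
            and minor_allele_frequency(*counts) > maf_threshold)

# Hypothetical genotype counts, not taken from the study or from 1000 Genomes
candidates = {
    "snp_A": (250, 200, 50),   # common variant, close to HWE
    "snp_B": (480, 19, 1),     # rare variant, fails the MAF threshold
    "snp_C": (300, 100, 100),  # strong heterozygote deficit, fails HWE
}
kept = [name for name, counts in candidates.items() if passes_filter(counts)]
print(kept)
```

Only the first hypothetical SNP passes both thresholds; the other two are removed for low MAF and HWE deviation, respectively.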

Statistical analysis

The independent samples t-test (continuous variables) and chi-square test (categorical variables) were used to analyze sample characteristics. The HWE test for SNP genotypes in the control group was performed using the chi-square test. Logistic regression analysis was used to evaluate the association between PNPLA3 gene polymorphisms and HCC susceptibility, and odds ratios (ORs) and 95% confidence intervals (CIs) were calculated to assess the influence of different alleles and genotypes on HCC susceptibility under multiple genetic models (co-dominant, dominant, recessive, and log-additive models). PLINK software was applied to perform haplotype analysis. Multifactor dimensionality reduction (MDR) software 3.0.2 was used to determine the effect of SNP-SNP interactions on HCC susceptibility. To reduce false positives, a tenfold cross-validation procedure was used, and the best model was selected based on the maximum cross-validation consistency (CVC) and testing balanced accuracy. In addition, false discovery rate (FDR) analysis was carried out to correct for multiple testing, and false-positive report probability (FPRP) analysis was conducted to test whether significant results were credible. Statistical analyses were performed using SPSS 22.0, and p < 0.05 was considered statistically significant.
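To show how genotype counts map onto the genetic models listed above, the following Python sketch collapses hypothetical counts into 2x2 tables and computes an unadjusted OR with a Wald 95% CI. This is not the SPSS logistic regression used in the study: the function names and counts are illustrative assumptions, no covariates are adjusted for, and the log-additive model is omitted because it requires a fitted regression.

```python
import math

def odds_ratio_ci(case_exposed, case_ref, ctrl_exposed, ctrl_ref, z=1.96):
    """Unadjusted OR with a Wald (Woolf) 95% CI from a 2x2 table."""
    odds_ratio = (case_exposed * ctrl_ref) / (case_ref * ctrl_exposed)
    se_log_or = math.sqrt(
        1 / case_exposed + 1 / case_ref + 1 / ctrl_exposed + 1 / ctrl_ref
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

def model_tables(cases, controls):
    """Collapse (major homozygote, heterozygote, minor homozygote) counts.

    Each table is (case_exposed, case_ref, ctrl_exposed, ctrl_ref).
    """
    c0, c1, c2 = cases
    k0, k1, k2 = controls
    return {
        "heterozygous": (c1, c0, k1, k0),          # het vs major homozygote
        "homozygous": (c2, c0, k2, k0),            # minor vs major homozygote
        "dominant": (c1 + c2, c0, k1 + k2, k0),    # any minor allele vs none
        "recessive": (c2, c0 + c1, k2, k0 + k1),   # two minor alleles vs fewer
    }

# Hypothetical genotype counts, not the study data
tables = model_tables(cases=(100, 80, 20), controls=(120, 70, 10))
results = {model: odds_ratio_ci(*table) for model, table in tables.items()}
for model, (or_value, lower, upper) in results.items():
    print(f"{model}: OR = {or_value:.2f}, 95% CI = {lower:.2f}-{upper:.2f}")
```

For a single binary predictor without covariates, this 2x2 OR coincides with the OR from a logistic regression; the two diverge once adjustment variables are added.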

Results

Participant characteristics

The basic characteristics of the 484 HCC patients and 487 healthy controls are shown in Table 1. The mean ages of HCC patients and controls were 55.06 ± 11.33 years and 55.07 ± 11.02 years, respectively, and no significant difference in age distribution was found between the two groups. There were also no significant differences in the distribution of gender, smoking, or carbohydrate antigen 50 (CA 50) between the two groups. However, the levels of carcino-embryonic antigen (CEA) (p < 0.001), alpha-fetoprotein (AFP) (p = 0.028), carbohydrate antigen 125 (CA 125) (p < 0.001), and carbohydrate antigen 199 (CA 199) (p < 0.001) differed significantly between the two groups.

Table 1 Basic characteristics of subjects

Basic information and allele frequencies of PNPLA3 SNPs

The basic information and allele frequency distribution of four PNPLA3 SNPs (rs738409, rs3747207, rs4823173, and rs2896019) in cases and controls are shown in Table 2. These four SNPs in the control group were all in line with HWE (p > 0.05). The allele frequency distribution of rs738409 (p = 0.045), rs4823173 (p = 0.017) and rs2896019 (p = 0.018) was statistically different between the two groups. After FDR correction, the results indicated that the minor allele of rs2896019 was significantly associated with increased HCC susceptibility (FDR-p = 0.035).
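The text does not state which FDR procedure was applied. As a hedged illustration, the sketch below assumes the Benjamini-Hochberg step-up adjustment; the function name and the p-values are hypothetical and are not the values in Table 2.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values for four tests
raw = [0.01, 0.04, 0.03, 0.20]
fdr = benjamini_hochberg(raw)
print([round(value, 4) for value in fdr])
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw p-values increase.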

Table 2 Basic information and allele frequencies of rs738409, rs3747207, rs4823173, and rs2896019 in PNPLA3

Association between PNPLA3 gene polymorphisms and HCC susceptibility

The overall analysis indicated that three SNPs were associated with increased susceptibility to HCC (Table 3). Specifically, rs738409 was significantly correlated with an increased susceptibility to HCC (homozygous model: OR = 1.49, 95% CI = 1.00–2.22, p = 0.049; log-additive model: OR = 1.22, 95% CI = 1.01–1.48, p = 0.038). Besides, rs4823173 was closely associated with increased susceptibility to HCC (homozygous model: OR = 1.63, 95% CI = 1.08–2.46, p = 0.019; dominant model: OR = 1.35, 95% CI = 1.04–1.77, p = 0.027; log-additive model: OR = 1.28, 95% CI = 1.06–1.55, p = 0.012). Rs2896019 was also linked to an increased susceptibility to HCC (heterozygous model: OR = 1.33, 95% CI = 1.01–1.76, p = 0.045; homozygous model: OR = 1.59, 95% CI = 1.06–2.39, p = 0.024; dominant model: OR = 1.38, 95% CI = 1.06–1.81, p = 0.018; log-additive model: OR = 1.28, 95% CI = 1.06–1.55, p = 0.012). After FDR correction, rs4823173 showed borderline association with increased HCC susceptibility in the log-additive model (FDR-p = 0.049), and rs2896019 was remarkably related to increased susceptibility to HCC in both homozygous (FDR-p = 0.048) and additive (FDR-p = 0.024) models.
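The log-additive model reported above treats the genotype as the number of minor-allele copies (0, 1, or 2) in a logistic regression, so the OR is a per-allele effect. The following Python sketch fits such a model by Newton-Raphson on grouped counts. It is an unadjusted illustration with hypothetical counts and a hypothetical function name, not the SPSS analysis used in the study.

```python
import math

def fit_log_additive(cases, controls, iterations=25):
    """Fit logit P(case) = b0 + b1 * g, where g = 0, 1, 2 minor-allele copies.

    cases and controls are (g=0, g=1, g=2) genotype counts.
    Returns the per-allele log-OR (b1) and its standard error.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        grad0 = grad1 = h00 = h01 = h11 = 0.0
        for g in (0, 1, 2):
            n = cases[g] + controls[g]
            p = 1 / (1 + math.exp(-(b0 + b1 * g)))
            resid = cases[g] - n * p          # observed minus expected cases
            weight = n * p * (1 - p)
            grad0 += resid
            grad1 += resid * g
            h00 += weight
            h01 += weight * g
            h11 += weight * g * g
        det = h00 * h11 - h01 * h01
        # Newton step: beta += inverse(information matrix) * score
        b0 += (h11 * grad0 - h01 * grad1) / det
        b1 += (h00 * grad1 - h01 * grad0) / det
    se_b1 = math.sqrt(h00 / det)
    return b1, se_b1

# Hypothetical counts constructed so the case/control odds double per allele
b1, se_b1 = fit_log_additive(cases=(50, 40, 40), controls=(50, 20, 10))
per_allele_or = math.exp(b1)
ci = (math.exp(b1 - 1.96 * se_b1), math.exp(b1 + 1.96 * se_b1))
print(f"per-allele OR = {per_allele_or:.2f}, 95% CI = {ci[0]:.2f}-{ci[1]:.2f}")
```

Because the hypothetical odds are 1, 2, and 4 across the three genotypes, the fitted per-allele OR is 2, which makes the example easy to check by hand.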

Table 3 Association between PNPLA3 polymorphisms and HCC susceptibility

Stratified analysis of the association between PNPLA3 gene polymorphisms and HCC susceptibility

Basic information including age, gender, and smoking status was collected from all subjects to perform stratified analyses. Age-stratified analysis showed that rs738409, rs3747207, rs4823173, and rs2896019 were all associated with increased HCC susceptibility in subjects aged > 55 years (Table 4). After FDR correction, these four SNPs were still strikingly related to increased HCC susceptibility in subjects aged > 55 years, suggesting that age differences might affect the relationship between PNPLA3 gene polymorphisms and HCC susceptibility. Stratified analysis based on gender (Additional file 1: Table S2) and smoking status (Additional file 1: Table S3) showed that these four SNPs were all correlated with increased HCC susceptibility in men and women, as well as smokers and nonsmokers, which indicated that the effect of these SNPs on HCC susceptibility might not be related to gender and smoking status.

Table 4 Association between PNPLA3 polymorphisms and HCC susceptibility stratified by age

Correlation between PNPLA3 gene polymorphisms and serum tumor markers

We also investigated the relationship between PNPLA3 gene polymorphisms and serum tumor markers (Table 5). The results revealed that the four candidate SNPs were not correlated with the levels of CEA, CA 50, CA 125, and CA 199 in HCC patients. However, rs738409 was significantly related to the levels of AFP in HCC patients (p = 0.007), and HCC patients with the GG genotype had a higher level of AFP.

Table 5 Association of PNPLA3 gene polymorphisms with serum tumor marker levels

Linkage disequilibrium (LD) and haplotype analysis

LD analysis of PNPLA3 SNPs (rs738409, rs3747207, rs4823173, and rs2896019) was carried out using Haploview software. The results showed that there were linkages between SNPs, and one LD block was obtained, namely rs738409-rs3747207-rs4823173-rs2896019 (Fig. 1). At the same time, haplotype analyses of PNPLA3 SNPs were performed, and haplotypes with frequencies < 0.03 were ignored. As shown in Table 6, one haplotype, rs738409 (G)-rs3747207 (A)-rs4823173 (A)-rs2896019 (G), was detected. We found that the GAAG haplotype was significantly associated with increased HCC susceptibility (OR = 1.25, 95% CI = 1.03–1.53, p = 0.023). After FDR correction, this haplotype was still significantly associated with increased HCC susceptibility (FDR-p = 0.046).

Fig. 1

LD analysis of PNPLA3 SNPs. Red squares represent statistically significant associations between SNPs, as measured by D’. Darker red indicates higher D’. LD: linkage disequilibrium; SNP: single nucleotide polymorphism
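For readers unfamiliar with D’, the following Python sketch computes D’ and r² for a pair of biallelic loci from a haplotype frequency and two allele frequencies. The function name and input frequencies are hypothetical, and the sketch assumes both loci are polymorphic; it is not the Haploview computation, which also has to estimate haplotype frequencies from unphased genotypes.

```python
def ld_statistics(p_ab, p_a, p_b):
    """Return (D', r^2) for alleles A and B.

    p_ab: frequency of the A-B haplotype
    p_a, p_b: frequencies of allele A and allele B (each strictly between 0 and 1)
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r_squared

# Hypothetical frequencies: partial LD, then complete LD (D' = 1)
partial = ld_statistics(p_ab=0.4, p_a=0.5, p_b=0.5)
complete = ld_statistics(p_ab=0.3, p_a=0.3, p_b=0.5)
print(partial, complete)
```

The second call shows that D’ can equal 1 while r² stays well below 1, which is why the two measures are usually reported together.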

Table 6 Haplotype frequencies of PNPLA3 polymorphisms and their association with HCC susceptibility

MDR analysis

SNP-SNP interactions were analyzed using MDR software (Table 7). The best single-locus prediction model was rs2896019, with a testing balanced accuracy of 0.536 and a CVC of 10/10 for predicting HCC susceptibility. Carriers of the rs2896019 GG and TG genotypes had 1.25-fold and 1.08-fold increased susceptibility to HCC, respectively (CVC = 10/10, testing balanced accuracy = 0.536, OR = 1.37, 95% CI = 1.05–1.79, p = 0.021; Fig. 2). Moreover, the best multi-locus prediction model was the four-locus model (the combination of rs738409, rs3747207, rs4823173, and rs2896019) (CVC = 10/10, testing balanced accuracy = 0.532, OR = 1.48, 95% CI = 1.13–1.93, p = 0.004).

Table 7 SNP-SNP interaction models of candidate SNPs analyzed by the MDR method

Fig. 2

Genotype distribution of PNPLA3 rs2896019 in cases and controls based on MDR analysis. For each genotype, the number of cases is displayed in the histogram on the left of each cell, while the number of controls is displayed on the right. Darker shadows indicate higher HCC susceptibility. MDR: multifactor dimensionality reduction; HCC: hepatocellular carcinoma
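The core MDR step for a single locus can be sketched in Python as follows: each genotype cell is labeled high-risk when its case-to-control ratio exceeds the overall ratio, and the resulting classifier is scored by balanced accuracy. The function name and counts are hypothetical, and the sketch reports accuracy on the full data set rather than the cross-validated testing accuracy produced by the MDR software.

```python
def mdr_single_locus(case_counts, control_counts):
    """Label genotype cells as high-risk and compute balanced accuracy."""
    n_cases = sum(case_counts.values())
    n_controls = sum(control_counts.values())
    # High-risk if cases/controls in the cell exceeds the overall ratio;
    # cross-multiplication avoids dividing by an empty control cell.
    high_risk = {
        genotype
        for genotype in case_counts
        if case_counts[genotype] * n_controls > control_counts[genotype] * n_cases
    }
    true_positives = sum(case_counts[g] for g in high_risk)
    true_negatives = sum(
        count for g, count in control_counts.items() if g not in high_risk
    )
    sensitivity = true_positives / n_cases
    specificity = true_negatives / n_controls
    return high_risk, (sensitivity + specificity) / 2

# Hypothetical genotype counts, not the study data
case_counts = {"TT": 100, "TG": 80, "GG": 20}
control_counts = {"TT": 120, "TG": 70, "GG": 10}
high_risk, balanced_accuracy = mdr_single_locus(case_counts, control_counts)
print(sorted(high_risk), round(balanced_accuracy, 3))
```

In a full MDR run this labeling is repeated on each training fold and scored on the held-out fold, and the CVC counts how often the same model is selected across folds.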

FPRP analysis

FPRP analysis was used to verify the reliability of the observed associations between PNPLA3 SNPs and HCC susceptibility (Table 8). Associations with FPRP values below 0.2 were considered noteworthy. When the prior probability was 0.25, the significant associations between all SNPs and increased HCC susceptibility were noteworthy in all genetic models, except for the rs738409 CG versus CC model in the overall and stratified analyses (age > 55). When the prior probability was 0.1, the significant associations of rs4823173 and rs2896019 with increased susceptibility to HCC remained noteworthy in several genetic models: rs4823173 A versus G (FPRP = 0.132, power = 0.975) and log-additive (FPRP = 0.098, power = 0.948) model in the overall analysis, rs2896019 G versus T (FPRP = 0.132, power = 0.975), TG versus TT (FPRP = 0.109, power = 0.856), TG + GG versus TT (FPRP = 0.198, power = 0.727), and log-additive (FPRP = 0.098, power = 0.948) model in the overall analysis, and rs2896019 TG + GG versus TT (FPRP = 0.195, power = 0.225) model in stratified analysis (age > 55). Since the relationship between the four SNPs and HCC susceptibility might not be affected by gender and smoking status, FPRP analysis was not performed in subgroups stratified by gender and smoking.
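The FPRP calculation can be sketched in Python under the assumption that the commonly used formulation applies, in which FPRP = α(1 − π) / [α(1 − π) + (1 − β)π], with α the observed p-value, π the prior probability, and 1 − β the statistical power. The text does not spell out the formula, so this form and the function name are assumptions.

```python
def fprp(p_value, power, prior):
    """False-positive report probability for one association.

    p_value: observed significance level (alpha)
    power: statistical power to detect the assumed effect (1 - beta)
    prior: prior probability that the association is real (pi)
    """
    false_positive_mass = p_value * (1 - prior)
    true_positive_mass = power * prior
    return false_positive_mass / (false_positive_mass + true_positive_mass)

# Inputs as rounded in the text for the rs4823173 log-additive model
value = fprp(p_value=0.012, power=0.948, prior=0.1)
noteworthy = value < 0.2
print(round(value, 3), noteworthy)
```

With these rounded inputs the sketch gives a value of about 0.10, in the neighborhood of the reported 0.098; the small gap may reflect rounding of the published p-value and power.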

Table 8 FPRP analysis of significant findings

Discussion

HCC has become a public health problem worldwide, especially in China, which has the highest morbidity and mortality, and about 50% of newly diagnosed HCC cases and HCC deaths in the world occur in China [17, 18]. Since SNPs are considered one of the major risk factors for HCC, it is of great significance to study the association between SNPs and HCC susceptibility [19, 20]. In our study, the association of PNPLA3 SNPs with HCC susceptibility was analyzed in 484 HCC patients and 487 healthy controls, and the results showed that rs2896019 was significantly associated with increased HCC susceptibility. Our research contributes to the understanding of the pathogenesis of HCC and may provide a new perspective for the prevention and treatment of HCC.

To date, research on PNPLA3 gene polymorphisms has mainly focused on rs738409 and its relationship with NAFLD risk. The study by Hikmet Akkiz et al. showed that rs738409 markedly increases the risk of NAFLD in the Turkish population in an unadjusted regression model [21]. The study by Daniel F Mazo et al. also found that Brazilian subjects with the rs738409-GG genotype have a 3.29-fold increased risk of NAFLD [22]. Meanwhile, a meta-analysis highlighted that people with rs738409 CG and GG genotypes have a 19% and 105% higher likelihood of developing NAFLD, respectively [11]. Furthermore, one study revealed the association between PNPLA3 rs738409 and HCC risk and demonstrated that rs738409 is an independent predictor of HCC occurrence [23]. However, that study was conducted in white Italian patients, and racial differences may explain the inconsistency between our results and those of that study. We also explored the association of rs3747207, rs4823173 and rs2896019 with HCC susceptibility. One study has also revealed a significant correlation of PNPLA3 rs4823173 and rs2896019 with HCC susceptibility [24]. Notably, although the association of rs4823173 with HCC susceptibility was only borderline after FDR correction, our study demonstrated that the rs2896019 G allele and GG genotype were clearly associated with increased HCC susceptibility. So far, only one study has reported that rs3747207 is linked to NAFLD [25]. No significant association of rs3747207 with HCC susceptibility was observed in our study, which might be related to differences between the two diseases.

Given that multiple factors contribute to HCC occurrence, our study explored the relationship between PNPLA3 SNPs and HCC susceptibility in terms of age, gender and smoking status. First, in terms of age, the mean age of HCC cases in our study was 55 years, and people aged more than 55 years had an increased risk of HCC. Specifically, subjects aged more than 55 years with rs738409 CG, rs3747207 GA, rs4823173 GA, and rs2896019 TG were more likely to develop HCC. This may be due to the decline of the body’s immunity with age, which leads to an increased risk of HCC. Second, as for gender, studies around the world have shown that HCC is a male-predominant malignancy [26]. In most countries, men are 2 to 4 times, or even 3 to 5 times, more likely to suffer from HCC than women [27,28,29]. The male to female ratio of HCC cases in this study was 3.35:1, which is broadly consistent with the international gender trend of HCC incidence. This difference in gender distribution is thought to be closely related to hormone levels in men and to unhealthy lifestyle factors such as overworking, staying up late and excessive drinking. Third, in terms of smoking status, studies have shown that smoking is a minor risk factor for HCC [30, 31]. According to reports, smoking can increase the morbidity and mortality of HCC [32]. Our study showed no significant difference in the distribution of smoking between cases and controls, suggesting that smoking did not affect the association of the selected SNPs with the occurrence of HCC. This discrepancy with previous reports may be due to the randomness of sample selection.

Considering that HCC is a complex polygenic disease, studies on SNP-SNP interactions may help identify risk factors for HCC. MDR is an effective method for detecting SNP-SNP interactions in case–control studies. In this study, MDR analysis was used to analyze the interactions between the four candidate SNPs, and the results showed that the best single-locus model for predicting HCC susceptibility was rs2896019. This result was consistent with that of the overall analysis, in which rs2896019 was significantly associated with increased susceptibility to HCC. The best multi-locus prediction model was the four-locus model, the combination of rs738409, rs3747207, rs4823173, and rs2896019, which might further support the influence of SNP-SNP interactions on HCC susceptibility. However, the complex interactions between PNPLA3 SNPs in the progression of HCC remain to be further investigated.

In conclusion, rs2896019 was associated with increased HCC susceptibility, and FPRP analysis supported the reliability of the significant associations in our findings. However, the current study examined only one gene, PNPLA3, and its SNPs. In follow-up studies, we will select multiple genes that share the same molecular pathway with PNPLA3, together with their corresponding SNPs, to better investigate the relationship between genetic polymorphisms and HCC.

Conclusions

This study revealed that PNPLA3 rs2896019 was associated with an increased susceptibility to HCC, but what roles these PNPLA3 SNPs play in the occurrence and development of HCC remains to be further studied.