Obesity and insulin resistance are commonly associated with fat accumulation in the liver, which is classified as nonalcoholic fatty liver disease (NAFLD) [1, 2]. This common disease encompasses a spectrum of liver conditions, from hepatic steatosis to nonalcoholic steatohepatitis (NASH), in the absence of significant alcohol consumption [3]. NAFLD/NASH has become a leading cause of cryptogenic cirrhosis [4] and is currently the third leading clinical indication for liver transplantation in the USA [5]. The molecular pathogenesis underlying NAFLD appears to involve many factors, including those of genetic and environmental origins [6].

The relationship between NAFLD and obesity raises the possibility that epigenetic mechanisms associated with “metabolic memory” may be involved [7]. DNA methylation of CpG dinucleotides is a well-characterized epigenetic modification that occurs mostly within CpG islands (CGIs) in promoter regions and functions to regulate gene expression. Evidence supporting a role for DNA methylation in the development of metabolic diseases, including NAFLD, is emerging. For example, diets enriched for energy and certain macronutrients appear to induce transgenerational epigenetic changes in rodent models that predispose offspring to hepatic steatosis and NASH [8]. Changes in DNA methylation have been associated with hepatic steatosis [9, 10] and progression of fibrosis [11], while methyl-depleted diets contribute to the development of steatohepatitis, cirrhosis, and liver cancer in rodents [12, 13]. In humans, hypermethylation of NADH dehydrogenase 6 [14] and PPARGC1A has been correlated with NAFLD [15]. To our knowledge, only a few studies have investigated differential methylation of genes using a genome-wide approach in patients with NASH [16,17,18,19]. Several studies have also recently identified changes in the methylation profile of specific genes in NAFLD and NASH [20, 21], and type 2 diabetes (T2D) has been associated with epigenetic modifications in human liver [22].

To complement previous studies, we assessed DNA methylation status in liver biopsies of individuals with histologically documented NAFLD-related cirrhosis in the setting of extreme obesity. Because methylation of CpG sites in promoter regions is related to transcriptional regulation, we also examined the relationship between methylation status and gene expression. We identified over 30 CGIs that were differentially methylated in NAFLD cirrhosis and correlated with hepatic gene expression in the same liver samples. These findings contribute supporting evidence for a role for CpG methylation in the pathogenesis of NAFLD-related cirrhosis, including confirmation of almost 90 previously reported differentially methylated CpG sites, and provide new insight into the molecular mechanisms underlying the initiation and progression of liver fibrosis and cirrhosis.

Materials and methods

Study sample

Liver wedge biopsies were intraoperatively obtained from Caucasian women enrolled in the Bariatric Surgery Program at the Geisinger Clinic Center for Nutrition and Weight Management and histologically evaluated using NASH CRN criteria as described [23,24,25]. Patients with histologic or serologic evidence for other chronic liver diseases were excluded from this study. Both medical history and histological assessment excluded individuals with clinically significant alcohol intake and drug use from participation in the bariatric surgery program. Clinical data, including demographics, clinical measures, ICD-9 codes, medical history, medication use, and common lab results, were available for all study participants as described previously [26]. All study participants provided written informed consent for research, which was conducted according to The Code of Ethics of the World Medical Association (Declaration of Helsinki). The Institutional Review Boards of Geisinger Health System, the Translational Genomics Research Institute, and the Lewis Katz School of Medicine at Temple University approved the research.

Methylation analysis

Whole genome methylation profiling was performed using the 450K Infinium Methylation BeadChip Assay (Illumina; San Diego, CA, USA). Liver genomic DNA was extracted using the QIAamp DNA kit (Qiagen; Germantown, MD, USA), and DNA concentration was determined using the Quant-iT PicoGreen dsDNA Assay kit (Thermo Fisher Scientific; Waltham, MA, USA). Bisulfite conversion of genomic DNA was performed using the EZ DNA Methylation Kit (Zymo Research; Irvine, CA, USA). Bisulfite-treated DNA was then hybridized to arrays according to the manufacturer’s protocol. Methylation levels for each CpG residue were estimated as the ratio of the methylated signal intensity over the sum of the methylated and unmethylated intensities at each locus using the minfi R package and presented as β values [27]. For quality control, we used the publicly available software CpGassoc [28] to exclude any samples with probe detection call rates < 95%, as well as those with an average intensity value of either < 50% of the experiment-wide sample mean or < 2000 arbitrary units (AU). Data points with detection P values > 0.01 were set to missing. Probes overlapping with copy number variants, as well as those mapping to multiple locations with up to two mismatches, were excluded from the analysis. All samples were checked for atypical raw intensity distributions and for β value correlation with the other samples in their respective group. Data were normalized using the SWAN algorithm provided by the minfi package.
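
The β value definition and the detection P value masking described above can be illustrated with a short sketch. It is written in plain Python for readability and is not the minfi or CpGassoc implementation; the function names, the toy intensities, the handling of zero total intensity, and the reading of "call rate" as the fraction of probes with a usable value are all assumptions made for the example.

```python
from typing import List, Optional

DETECTION_P_MAX = 0.01   # data points above this detection P value are set to missing
MIN_CALL_RATE = 0.95     # samples below this probe call rate are excluded


def beta_value(meth: float, unmeth: float) -> Optional[float]:
    """Ratio of methylated intensity to total (methylated + unmethylated) intensity."""
    total = meth + unmeth
    if total <= 0:       # assumption: no signal means no estimate
        return None
    return meth / total


def masked_betas(meth: List[float], unmeth: List[float],
                 det_p: List[float]) -> List[Optional[float]]:
    """Per-probe beta values for one sample; unreliable probes become None."""
    return [
        beta_value(m, u) if p <= DETECTION_P_MAX else None
        for m, u, p in zip(meth, unmeth, det_p)
    ]


def call_rate(betas: List[Optional[float]]) -> float:
    """Fraction of probes with a usable beta value (assumed definition)."""
    return sum(b is not None for b in betas) / len(betas)


# Toy sample with four probes (hypothetical intensities and detection P values).
betas = masked_betas(
    meth=[3000.0, 500.0, 1200.0, 40.0],
    unmeth=[1000.0, 4500.0, 1200.0, 60.0],
    det_p=[0.0001, 0.002, 0.0005, 0.2],
)
sample_passes_qc = call_rate(betas) >= MIN_CALL_RATE
```

In this toy sample the fourth probe is masked because its detection P value exceeds 0.01, so only three of four probes are usable and the sample would fail the 95% call rate filter. The intensity-based sample filters and the SWAN normalization are not reproduced here.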

To identify CpG sites showing differential methylation between fibrotic and normal samples, we used the City of Hope CpG Island Analysis Pipeline (COHCAP) [29] and the minfi package. Differentially methylated site analysis encompasses testing each genomic position for association between methylation and phenotype with an F-test for categorical outcome. CpG sites were defined as methylated if they had β values > 0.5 and unmethylated if they had β values < 0.5. To identify differentially methylated CGIs, signal from differentially methylated CpG sites within a CGI was averaged for each sample before group comparison. We used an FDR < 0.05 [30] to define a set of differentially methylated CGIs.
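
The two steps described above, averaging site-level signal within a CGI for each sample and applying an FDR threshold, can be sketched as follows. This is an illustration rather than the COHCAP code: it assumes the Benjamini-Hochberg procedure for the FDR adjustment (the text does not name the procedure for this step), and the site identifiers, β values, and P values are hypothetical.

```python
from statistics import mean


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def cgi_signal(site_betas):
    """Average the beta values of the sites in one CGI, separately per sample.

    `site_betas` maps a site ID to its list of per-sample beta values.
    """
    per_sample = zip(*site_betas.values())
    return [mean(values) for values in per_sample]


# Hypothetical CGI with two differentially methylated sites in three samples.
island = {"cg_site_a": [0.80, 0.70, 0.20], "cg_site_b": [0.60, 0.90, 0.40]}
island_means = cgi_signal(island)   # one averaged value per sample

# Hypothetical per-CGI P values from the group comparison.
adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [q < 0.05 for q in adj]
```

The averaged per-sample values would then enter the group comparison, and only CGIs whose adjusted P value falls below 0.05 would be retained.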

RNA extraction, sequencing, and analysis

Total RNA was extracted using the RNeasy kit (Qiagen), quantified using the NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific), and converted to cDNA using the Ovation RNA-Seq System V2 (NuGEN; San Carlos, CA, USA). We performed massively parallel RNA-Seq on polyA-selected RNA from biopsied liver samples using the Illumina HiSeq2000 platform, sequencing to a depth of 60 M 100 bp paired-end reads. During the generation of qseq and fastq files for alignment, low-quality reads were identified and removed, and the indexed reads identified and grouped accordingly. Filtered reads were aligned to the human genome using the Bowtie program [31]. Aligned RNA-Seq reads were imported into the Cufflinks program [32] to assemble alignments into a parsimonious set of transcripts, and this set of annotated transcripts was quantified in each sample using DESeq2 [33]. To identify genes with negative expression correlations, we compared β values for each CGI to the fold-change of its corresponding gene obtained from RNA-seq analysis. The correlation was estimated using Spearman’s rho statistic, with P values computed via the asymptotic t distribution approximation.
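
The correlation statistic named above can be illustrated with a small, dependency-free sketch. It computes Spearman's rho as the Pearson correlation of average ranks and the corresponding t statistic, which would be referred to a t distribution with n - 2 degrees of freedom to obtain the P value; that last lookup is left to a statistics library and is not implemented here. The function names and the five-sample data are hypothetical and are not taken from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1   # mean of sorted positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def t_statistic(rho, n):
    """Asymptotic t statistic for rho, with n - 2 degrees of freedom."""
    if abs(rho) >= 1.0:
        return math.copysign(math.inf, rho)
    return rho * math.sqrt((n - 2) / (1.0 - rho * rho))


# Hypothetical CGI beta values and expression values for five samples.
cgi_beta = [0.10, 0.25, 0.40, 0.55, 0.70]
expression = [9.1, 6.0, 7.4, 2.2, 3.5]
rho = spearman_rho(cgi_beta, expression)
t = t_statistic(rho, len(cgi_beta))
```

A negative rho, as in this toy example, corresponds to the pattern of interest: higher CGI methylation accompanied by lower expression of the associated gene.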

Functional enrichment analysis

The Ingenuity Pathway Analysis (IPA) software (Qiagen) was used to identify canonical signaling pathways and network connections associated with CGIs. The significance of the association between a CGI gene set and a canonical pathway was assessed using two criteria: (1) the ratio of the number of molecules mapped to the pathway to the total number of molecules involved in the canonical pathway and (2) the Benjamini-Hochberg corrected P value from the right-tailed Fisher exact test.
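
The two criteria above can be made concrete with a generic over-representation calculation. The sketch below is not IPA's implementation, which is proprietary; it assumes the standard hypergeometric formulation of the right-tailed Fisher exact test, and the function names and gene counts are hypothetical.

```python
from math import comb


def pathway_ratio(overlap, pathway_size):
    """Molecules mapped to the pathway divided by all molecules in the pathway."""
    return overlap / pathway_size


def right_tailed_fisher(overlap, pathway_size, n_selected, n_background):
    """P(X >= overlap) under the hypergeometric null.

    n_background: molecules in the reference set
    pathway_size: reference molecules that belong to the pathway
    n_selected:   molecules in the list being tested
    overlap:      tested molecules that belong to the pathway
    """
    max_overlap = min(pathway_size, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(pathway_size, k) * comb(n_background - pathway_size, n_selected - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts: 2 of 3 tested genes fall in a 4-gene pathway,
# drawn from a 10-gene reference set.
ratio = pathway_ratio(overlap=2, pathway_size=4)
p_right = right_tailed_fisher(overlap=2, pathway_size=4,
                              n_selected=3, n_background=10)
```

Raw P values computed this way for every pathway would then be adjusted across pathways with the Benjamini-Hochberg procedure before being interpreted.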


Results

Study sample characteristics

DNA was obtained from liver biopsies from 15 patients with NAFLD, including 4 with stage 3 bridging fibrosis (F3) and 11 with either incomplete cirrhosis or stage 4 (F4) fibrosis (cirrhosis), and 15 individuals with normal liver histology matched by age, sex, BMI at biopsy, and T2D status. The demographic information and clinical characteristics of the study participants are shown in Table 1. Individuals with NAFLD fibrosis had significantly higher serum levels of aspartate aminotransferase (AST), alanine aminotransferase (ALT), insulin, and triglycerides compared to individuals without fibrosis. NAFLD patients with fibrosis also manifested lobular inflammation, portal inflammation, and hepatocyte ballooning.

Table 1 Patient demographics and clinical characteristics

Identification of differentially methylated CGIs

For each hepatic DNA sample, methylation levels for each CpG residue were estimated as β values, which represent the ratio of the methylated signal intensity over the sum of the methylated and unmethylated intensities. Initial hierarchical clustering analysis revealed that F3 fibrosis could be segregated as a separate group from F4, consistent with the distinct histology of stage 3 versus stage 4. To avoid the introduction of unnecessary heterogeneity, we excluded these three samples from further analysis. In addition, one NAFLD fibrosis sample failed quality control measures and was removed from further analysis. Using data from the remaining 11 cirrhosis and 15 normal samples, we identified 208 CGIs, including 99 hypomethylated and 109 hypermethylated CGIs (Additional file 1: Table S1), showing statistically significant evidence (adjusted P value < 0.05) for differential methylation. Analysis of individual sites revealed differential methylation of 4275 CpG sites (adjusted P value < 0.05), corresponding to 1713 hypomethylated and 2562 hypermethylated CpG sites; the top 100 differentially methylated CpG sites are shown in Additional file 2: Table S2. The top CGIs, prioritized by strength of statistically significant evidence for differential methylation, magnitude of difference in methylation levels (Δ beta), and biological relevance to general liver function or metabolism, are shown in Table 2 (hypermethylation) and Table 3 (hypomethylation).

Table 2 CpG islands showing the strongest evidence for hypermethylation in NAFLD cirrhosis
Table 3 Ten CpG islands showing the strongest evidence for hypomethylation in NAFLD cirrhosis

Correlation of CpG methylation with hepatic gene expression

To identify gene-methylation correlations, we compared β values for each CGI to the FPKM (fragments per kilobase of transcript per million mapped reads) data of its corresponding gene. Focusing on pairwise associations with a significant negative correlation at FDR < 0.05, we found evidence for negative correlation between 34 CGI-transcript pairs (Table 4). Box plots of mean (± standard deviation) methylation β values for CGIs associated with gene expression differences in patients with NAFLD fibrosis (n = 11) compared to individuals with normal liver histology (n = 15) are presented in Additional file 3: Figure S1.

Table 4 Integration with gene expression based on correlation statistic

Functional enrichment analysis

We used functional enrichment analyses to identify the effect of differentially methylated CGI on different canonical pathways. The analysis revealed 113 enriched canonical pathways (Additional file 4: Table S3). Pathways with relevance to NAFLD cirrhosis identified in this analysis are highlighted in Table 5. The top pathways included production of nitric oxide and reactive oxygen species, LXR/RXR activation, and FXR/RXR activation.

Table 5 Functional enrichment analysis: canonical pathways

Discussion


To our knowledge, only four studies have investigated methylation profiles in NAFLD patients [16,17,18,19]. Although all four studies utilized the same array-based platform for measuring DNA methylation, they differed with respect to analytical strategy, experimental design, and study sample composition. In the first published study, levels of methylation and mRNA expression were assessed in normal-weight controls (N = 18) and obese individuals with liver histology consistent with normal (N = 18), steatosis (N = 12), or NASH (N = 15). Seventy-four differentially methylated CpG sites (FDR < 0.004) were found in comparisons of the four phenotypic groups with NAFLD-specific methylation and expression differences observed for nine genes: ACLY (ATP citrate lyase), GALNTL4 (polypeptide N-acetylgalactosaminyltransferase 18), GRID1 (glutamate ionotropic receptor delta type subunit 1), IGF1 (insulin like growth factor 1), IGFBP2 (insulin like growth factor binding protein 2), IP6K3 (inositol hexakisphosphate kinase 3), PC (pyruvate carboxylase), PLCG1 (phospholipase C gamma 1), and PRKCE (protein kinase C epsilon) [16].

In the same year, Murphy et al. [18] identified 69,247 differentially methylated CpG sites, most of which were hypomethylated, in obese patients with advanced (N = 23) versus mild (N = 33) NAFLD. In that study, methylation was correlated with gene transcript abundance levels for 7% of the differentially methylated CpG sites, and methylation at FGFR2, MAT1A, and CASP1 was validated in a replication cohort. Despite the similar design between that study and the current work, there were notable differences between the two, which could account for the disparate findings. In the published study, (1) the mild fibrosis group was a mix of NAFLD histological types including grade 1 fibrosis; (2) the advanced NAFLD group included stage 3 fibrosis; (3) the BMI range was 32.0–33.8 across cases and controls in both discovery and replication cohorts; and (4) the case and control groups were not matched for T2D. More recently, de Mello et al. [17] identified 1292 CpG sites, representing 677 genes that showed differences in DNA methylation between 26 NASH patients and 34 individuals with normal liver histology, independent of T2D status, age, sex, and BMI. In the present work, we also used a homogeneous control group without histopathological manifestations of NAFLD. The lack of a normal control group is widespread in studies of NAFLD, where liver tissue is often obtained in conjunction with clinically indicated biopsies; thus, a high pre-procedure probability of a pathological finding often results in the presence of some level of NAFLD. We also found that stage 3 fibrosis could be segregated as an epigenetically separate group using hierarchical clustering analysis; thus, we removed these samples from further analysis. This level of bioinformatic quality control is critical when analyzing large genomic datasets such as genome-wide methylation array data.

In the most recent study, Hotta et al. [19] identified a number of differentially methylated regions in a comparison of patients from Japan with either mild or advanced NAFLD fibrosis. That study differed from the current study not only in ethnic background but also in design: the NAFLD groups were heterogeneous with respect to fibrosis status, and neither the mild nor the advanced fibrosis group was obese [19].

Despite the differences in design, we compared our results with those reported by the published studies discussed above and identified 86 common differentially methylated sites (Additional file 5: Table S4). Interestingly, four (AQP1, FGFR2, RBP5, and MGMT) overlapped with differentially methylated CpG sites in all four studies, suggesting that these genes may be a common core set with potential importance in disease pathogenesis. AQP1 (aquaporin 1) encodes a water channel that is overexpressed in fibrosis and cirrhosis and appears to promote hepatic fibrosis through mechanisms involving pathological angiogenesis [34]; increased AQP1 expression is associated with the hypomethylation of the CpG site found among these studies. FGFR2 (fibroblast growth factor receptor 2), which was found to be hypomethylated across these studies, has been linked with liver fibrosis. FGF2 levels were increased in hepatic stellate cell activation [35] and liver cirrhosis [36]. Further, FGF2 knockout in the carbon tetrachloride mouse model of hepatic injury was associated with decreased collagen expression and protection from liver fibrosis [37], while Brivanib, an inhibitor of FGFR and vascular endothelial growth factor (VEGF), was shown to inhibit activation of hepatic stellate cells in vitro and liver fibrosis in three different animal models [38]. The biological relevance of the other two genes, retinol-binding protein 5 (RBP5) and O6-methylguanine-DNA methyltransferase (MGMT), both of which were hypomethylated in these studies, to the development of NAFLD-related fibrosis remains unknown.

The most highly associated canonical pathways identified through functional enrichment analysis provide additional evidence for epigenetic regulation in the pathophysiology of NAFLD-related cirrhosis. The most highly enriched pathway, Production of Nitric Oxide and Reactive Oxygen Species in Macrophages, implicates a well-studied molecular mechanism in the progression of NAFLD to cirrhosis. In human studies and in rodent models of NAFLD/NASH, increased reactive oxygen species appear to be a central feature of lipotoxicity and inflammation [39]. Although the use of antioxidants as a therapeutic strategy has not resulted in significant effects in either humans or animal models, most studies suffer from one or more apparent weaknesses [40]. Our results suggest that epigenetic modulation of antioxidant pathways could be a promising therapeutic approach. The next two associated pathways, LXR/RXR activation and FXR/RXR activation, are mechanistically linked via heterodimerization of the nuclear receptors farnesoid X receptor (FXR) and retinoid X receptor (RXR) [41]. FXR is highly expressed in the liver and is involved in several key metabolic processes including bile acid synthesis, glucose and lipid metabolism, and regulation of inflammatory pathways [42]. Administration of the synthetic bile acid derivative and FXR agonist, obeticholic acid, has been shown to reduce the histological NAFLD fibrosis score in a multicenter, randomized, double-blind, placebo-controlled study [43]. To date, the role of epigenetic regulation of this pathway in the potential clinical effectiveness of bile acid-based therapeutics has not been explored.

Despite the overlap in findings of differential methylation among studies, our sample size, while comparable to several other reports [16, 44, 45], is still limited. We dichotomized our cohort into extreme histologic phenotypes to increase our power to identify clinically relevant differences in DNA methylation levels and were therefore unable to stratify into phenotypic groups to assess intermediate levels of peri-sinusoidal fibrosis or portal fibrosis. Additional studies including individuals spanning the spectrum of NAFLD fibrosis will be critical to extend our findings. We also note that the cross-sectional design does not allow associations with disease progression to be drawn. In the absence of serial liver biopsies in the present cohort, we were not able to utilize a prospective design. Assessment of methylation levels in longitudinal biopsies will be necessary to determine the roles of specific candidates in disease progression. In addition, due to the small sample size in this study, we focused our investigation on females with extreme obesity, who underwent gastric bypass surgery and had liver biopsies. We utilized this design because obesity is a significant risk factor for NAFLD [46], and liver biopsy tissue was obtained without clinical indication, which effectively eliminates bias toward clinically suspect liver disease. Despite the relative phenotypic homogeneity, we were able to replicate findings of differential methylation from study samples encompassing a greater range of phenotypic diversity, indicating the influence of shared epigenetic effects. However, conclusions with regard to NAFLD-related fibrosis obtained from this cohort may still not be relevant to populations with less severe obesity or of different ethnicities without additional validation.

In summary, the results obtained in the current study not only confirm previous findings of differential methylation in NAFLD patients but also provide novel evidence of differential methylation and RNA expression profiles associated with NAFLD-related fibrosis. Future studies will be needed to determine the extent to which DNA methylation patterns in the liver are represented in other metabolically relevant tissues such as visceral and subcutaneous fat [47, 48], as well as peripheral blood leukocytes, which will be critical for the development of non-invasive markers of NAFLD stage. Additional studies, including those showing functional consequences of differentially methylated sites, i.e., disruption of transcription factor binding, will be necessary to confirm the role of specific CpG loci in liver fibrosis. These approaches are expected to yield new insights into the pathological mechanisms underlying the development of fibrosis and cirrhosis in NAFLD.