Background

Hepatitis B virus (HBV), a prototypical member of the Hepadnaviridae family, is responsible for inducing acute and chronic hepatitis B, cirrhosis, and hepatocellular carcinoma (HCC) [1]. Despite the availability of safe and highly effective vaccines against HBV, chronic infection remains a significant public health challenge, affecting an estimated 296 million individuals worldwide. It is the second-leading cause of cancer-related mortality worldwide [2]. HBV is an enveloped DNA virus that contains a small, partially double-stranded 3.2 kb DNA genome, or relaxed circular DNA (rcDNA). The rcDNA transforms into covalently closed circular DNA (cccDNA), serving as a template for viral transcription. cccDNA persists within the nuclei of infected hepatocytes in an episomal state via the intervention of cellular enzymes [3]. Current treatments include nucleos(t)ide analogs (NAs), such as tenofovir or entecavir, which effectively target viral reverse transcriptase activity and reduce viral replication. However, the challenge persists as these treatments do not directly eradicate cccDNA, a pivotal obstacle to achieving a complete virological or sterilizing cure for HBV infection.

Viral transcription produces viral RNAs of multiple lengths (3.5, 2.4, 2.1, and 0.7 kb) under the control of four viral promoters and two enhancers. Additionally, numerous host cellular transcription factors, such as nuclear receptors and hepatocyte-enriched and ubiquitous transcription factors, participate in the HBV life cycle [4]. Recent reports have highlighted the biological significance of HBV RNA regulation, potentially impacting splicing regulation in a cell type-specific manner [5]. Therefore, a more extensive exploration is needed to understand the biological significance of HBV RNA regulation in hepatocytes, encompassing viral replication, host range, and the intricate regulatory mechanisms governing post-transcriptional processing events within the HBV life cycle. Furthermore, understanding the multitude of host factors involved in the HBV life cycle holds promise for guiding the discovery of anti-HBV therapies, particularly drugs targeting cellular factors.

In this study, we performed an RNA-sequencing (RNA-seq) experiment to identify differentially expressed and co-regulated genes in HBV-infected primary human hepatocytes (PHHs). Additionally, to validate our transcriptome data analysis, we compared it with eligible HBV gene expression datasets collected from the public repository Gene Expression Omnibus (GEO). Using this approach, we aim to identify specific gene expression patterns that could serve as an accurate clinical tool for predicting prognosis and adjuvant therapy responses in HBV infection.

Materials and methods

Cell culture

HepAD38 cells were described in our previous study [6]. Cells were cultured with or without 5 μg/mL tetracycline (Sigma-Aldrich, St Louis, MO, USA) in Dulbecco’s modified Eagle’s medium (DMEM)/F-12 (Gibco, Grand Island, NY, USA) supplemented with 10% fetal bovine serum (FBS, Gibco) and 1% penicillin–streptomycin (Gibco) at 37 °C under a humidified atmosphere containing 5% CO₂. Cultured HepAD38 cells were used to produce an HBV inoculum for infection experiments as previously described [6]. PHHs were purchased from Corning (Tewksbury, MA, USA) and maintained in the hepatocyte-specific medium (Corning) with 0.01 μg/mL EGF according to the manufacturer’s recommendations.

HBV production and infection

For the infection experiments, the HBV inoculum was prepared from freshly collected supernatants of HepAD38 cells as described previously [6]. The titer of the HBV solution was adjusted to 5 × 10⁸ viral genome equivalents (GEq) per mL. PHHs (8 × 10⁵ cells/well) were seeded onto six-well plates coated with collagen I (Gibco) and were inoculated with 80 GEq/cell of HBV in the presence of 4% PEG 8000 (Sigma). After inoculum removal and washing, the cells were incubated in fresh medium. The culture medium was collected and replaced every two days.
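As a worked check of the inoculation arithmetic, the sketch below computes the volume of virus stock needed per well, assuming a titer of 5 × 10⁸ GEq/mL and 8 × 10⁵ cells per well inoculated at 80 GEq/cell. The function name is illustrative and is not part of the original protocol.

```python
def inoculum_volume_ml(cells_per_well: float,
                       geq_per_cell: float,
                       titer_geq_per_ml: float) -> float:
    """Volume of virus stock (mL) that delivers the target GEq per cell."""
    total_geq = cells_per_well * geq_per_cell  # genome equivalents needed per well
    return total_geq / titer_geq_per_ml


# Values taken from the infection protocol described above.
volume = inoculum_volume_ml(cells_per_well=8e5, geq_per_cell=80, titer_geq_per_ml=5e8)
print(f"{volume * 1000:.0f} µL per well")  # prints: 128 µL per well
```

Under these values each well requires 6.4 × 10⁷ GEq, i.e. roughly 128 µL of the adjusted stock.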

Detection of HBsAg and HBeAg

The levels of HBV surface antigen (HBsAg) and HBV e antigen (HBeAg) in the culture medium were determined using an ELISA kit (Wantai Bio-Pharm, Beijing, China) according to the manufacturer’s instructions. The absorbance was measured at 450 nm using a spectrophotometer (Synergy H1; BioTek, Winooski, VT, USA). All experiments were performed a minimum of three times.

Quantitative real-time PCR analysis of intracellular HBV DNA

Total DNA, including intracellular HBV rcDNA, was extracted using a QIAamp DNA Mini Kit (Qiagen, Venlo, Netherlands) according to the manufacturer’s instructions. To assess HBV rcDNA levels, quantitative real-time PCR (qRT-PCR) was performed using a Power SYBR green PCR master mix (Applied Biosystems, Warrington, UK) with primers for HBV rcDNA, and amplification was run using the QuantStudio 3.0 program (Applied Biosystems). The primer pairs for HBV DNA were forward primer (nt 256 to 274): 5′-CTCGTGGTGGACTTCTCTC-3′; and reverse primer (nt 404 to 421): 5′-CTGCAGGATGAAGAGGAA-3′. Relative HBV DNA levels were normalized against those of β-actin.
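The text does not state how the normalization to β-actin was calculated. A common choice is the comparative Ct method, sketched below under the assumption of 100% amplification efficiency (a doubling per cycle). The Ct values and function names are hypothetical and serve only to illustrate the calculation.

```python
def relative_level(ct_target: float, ct_reference: float) -> float:
    """Target level relative to the reference gene: 2^-(Ct_target - Ct_reference)."""
    return 2.0 ** -(ct_target - ct_reference)


def fold_vs_control(ct_target_sample: float, ct_ref_sample: float,
                    ct_target_control: float, ct_ref_control: float) -> float:
    """Comparative Ct (2^-ddCt) fold difference of a sample versus a control."""
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values: HBV DNA versus beta-actin at two time points.
early = relative_level(ct_target=24.0, ct_reference=18.0)
late = relative_level(ct_target=22.0, ct_reference=18.0)
print(late / early)                            # ratio of normalized levels
print(fold_vs_control(22.0, 18.0, 24.0, 18.0))  # same comparison via ddCt
```

Both routes give the same ratio because dividing two normalized levels is equivalent to exponentiating the difference of the two ΔCt values.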

RNA-sequencing data generation and analysis

Total RNA was extracted using the TRIzol reagent (Invitrogen, CA, USA) according to the manufacturer’s instructions. Isolated RNA was evaluated using an Agilent RNA 6000 Pico kit (Agilent, Santa Clara, CA, USA), and the concentration was measured using a BioPhotometer® spectrophotometer (Eppendorf, Hamburg, Germany). RNA samples with an absorbance ratio at 260/280 nm between 1.8 and 2.0 and verified structural integrity were used for sequencing library preparation. A cDNA library was generated using the QIAseq FX Single Cell RNA Library Kit (Qiagen) according to the manufacturer’s protocol. In this protocol, polyA-selected mRNA was converted into cDNA, and then enzymatic fragmentation (using the incubation time for “fragment size = 200–500 bp”) and library preparation were performed using 1 μg cDNA. The cDNA concentration was measured using the LightCycler qPCR system (Roche, Penzberg, Germany), and the size of the library was checked using an Agilent High Sensitivity D5000 ScreenTape System (Santa Clara, CA). RNA-seq was conducted by GnCBio (Daejeon, Korea) using HiSeq X (Illumina, CA, USA), as previously reported [7,8,9]. Low-quality sequence reads from the raw sequence data were filtered using Trim Galore software (https://github.com/FelixKrueger/TrimGalore). High-quality sequence reads with a base quality > 30 and length > 50 were selected and mapped to the Homo sapiens reference genome using the Bowtie2 aligner tool [10]. These preprocessing procedures were performed for both HBV-infected and HBV-uninfected PHH samples. Differential expression analysis was conducted between the two groups (HBV-infected and uninfected PHHs) using read count data (see Additional file 1).

In this study, we focused on identifying protein-coding genes related to pathogenesis, and 17,613 protein-coding genes were analyzed. Each gene in the RNA-seq data was measured on days 3, 5, and 7 after HBV infection. For each group, measurements per gene were fitted to a linear regression model. R² > 0.7 and false discovery rate (FDR)-adjusted p-values (i.e., q-values) < 0.05 were used as significance thresholds to select genes showing dynamic changes. In addition to within-group comparisons, between-group comparisons were performed using the Wilcoxon rank-sum test at a significance level of 0.05.
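The selection rule above can be illustrated with a minimal sketch. It fits an ordinary least-squares line to one gene's time course, computes R², and converts a list of p-values to q-values. The Benjamini-Hochberg procedure is assumed here because the text does not name the FDR method, and the p-values are taken as given rather than derived from the regression. All expression values and p-values are hypothetical.

```python
def fit_line(days, expression):
    """Ordinary least-squares fit; returns (slope, intercept, r_squared)."""
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    syy = sum((y - mean_y) ** 2 for y in expression)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, expression))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value stays monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        qvalues[index] = running_min
    return qvalues


def is_dynamic(r_squared, q_value, r2_cutoff=0.7, q_cutoff=0.05):
    """Apply the thresholds stated in the text (R² > 0.7 and q < 0.05)."""
    return r_squared > r2_cutoff and q_value < q_cutoff


# Hypothetical gene measured on days 3, 5, and 7.
slope, intercept, r2 = fit_line([3, 5, 7], [10.0, 8.0, 6.0])
qvals = bh_qvalues([0.01, 0.04, 0.03, 0.5])
print(slope, r2, qvals, is_dynamic(r2, qvals[0]))
```

The sign of the slope distinguishes decreasing from increasing patterns, while R² and the q-value together determine whether the trend is retained.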

Dataset collection on HBV-infected primary human hepatocytes

To prevent the selection of candidate genes specific to a single dataset, validation analysis was carried out using the GSE72068 dataset, which has a study design similar to ours [11]. The GSE72068 dataset contains microarray expression profiling data showing the gene expression response at various time points (i.e., 4 and 8 h and 1, 6, and 12 days) in HBV-infected PHHs. Among the 18,036 protein-coding genes identified in the GSE72068 dataset, 13,862 overlapped with those in the main dataset generated through RNA-seq experiments in this study. The same statistical methods used to analyze the main dataset were also applied in the validation analysis.

The GSE25097 dataset was used to identify candidate genes as potential biomarkers for HBV-associated HCC. The GSE25097 dataset is a microarray expression profile designed to identify prognostic genetic markers between 268 HCC tumor samples and 6 healthy liver samples. Among the 18,076 protein-coding genes identified in the GSE25097 dataset, 14,143 genes overlapped with the main dataset used in this study. Comparison of the expression levels between HCC and healthy samples was performed using the Wilcoxon rank-sum test.
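For readers unfamiliar with the Wilcoxon rank-sum test used in these comparisons, the following sketch computes the statistic with a normal approximation. It is a simplified illustration, not the implementation used in the study: it applies neither a tie correction nor a continuity correction, and the example values are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_sum_test(group_a, group_b):
    """Return (U statistic of group_a, z score, two-sided p-value)."""
    n1, n2 = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n1])
    u_a = rank_sum_a - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    z = (u_a - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return u_a, z, p_value


# Hypothetical expression values for one gene in two groups.
u, z, p = rank_sum_test([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(u, round(z, 3), round(p, 4))
```

For very small groups, such as the 6 healthy liver samples, an exact test may be preferable to the normal approximation used in this sketch.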

Furthermore, to gain insight into the mechanisms of potential biomarkers, we performed a functional enrichment analysis using the Database for Annotation, Visualization, and Integrated Discovery (DAVID) [12].

Statistical analysis

The data are expressed as the mean ± standard deviation. Statistical analysis was performed using the unpaired t test (GraphPad Prism 8) to determine statistically significant differences between groups. *p < 0.05, **p < 0.01, ***p < 0.001, and ****p < 1 × 10⁻¹⁵ were considered statistically significant.

Results

Establishment and characteristics of HBV-infected primary human hepatocytes

PHHs were infected with HBV particles collected from HepAD38 supernatant. Successful HBV infection was observed over 7 days, as confirmed by the detection of secreted HBV antigens (HBsAg and HBeAg) and intracellular HBV rcDNA at the indicated time points (Fig. 1A). As shown in Fig. 1B, the levels of HBsAg and HBeAg plateaued in HBV-infected cells at approximately 7 days post-infection (dpi). Next, intracellular HBV rcDNA was detected using qRT-PCR (Fig. 1C). The level of HBV rcDNA increased continuously until 7 dpi. Therefore, PHHs infected with HBV constitute an effective model for studying the cellular effects of post-infection stages of the HBV life cycle. Subsequently, samples were collected at these time points and subjected to RNA-seq analysis.

Fig. 1

A Workflow for HBV infection. B Levels of HBsAg and HBeAg were analyzed by ELISA at the indicated time points. The data compare the antigens secreted from HBV-infected cells (red bars) and uninfected cells (blue bars). Statistical significance is indicated (***p < 0.001). ELISA data are presented as bar charts (n = 4). C Levels of HBV DNA detected by qRT-PCR

Changes in gene expression patterns after HBV infection in primary human hepatocytes

To comprehensively identify host gene expression changes in HBV-infected PHHs, we analyzed the dynamic alterations in gene expression patterns following HBV infection. We observed significant alterations in the expression patterns of 149 genes over 7 dpi (R² > 0.7, q < 0.05; Fig. 2A). Among these genes, the majority (141 genes, 95%; Fig. 2B) showed a decreasing trend over time, whereas a smaller subset (8 genes, 5%; Fig. 2C) exhibited an increasing pattern. The rate of decrease or increase in the expression of these 149 genes was more pronounced in HBV-infected cells than in uninfected controls (Supplementary Fig. 1). Specifically, our analysis revealed that while both decreasing and increasing expression patterns were observed in uninfected controls over the 7-day period, the magnitude of these changes was significantly greater in HBV-infected cells (Supplementary Fig. 1B). These findings highlight the distinct expression dynamics induced by HBV infection.

Fig. 2

Expression patterns of 149 genes showing significant dynamic changes. A Expression patterns over time in HBV-infected cells. Expression levels were normalized for each gene and then used as input to the heatmap. B, C Boxplots showing the differences between HBV-infected and uninfected groups for patterns of decrease and increase. The expression patterns of the two groups were compared by date. D Scatter plot with R² as the y-axis and the fold change at day 7 after infection as the x-axis. Fold change was defined as the average expression level in HBV-infected cells divided by the average expression level in uninfected cells. Green or orange indicates genes with 1.5-fold decreased or increased expression levels, respectively, in HBV-infected cells compared with those in uninfected cells. E Expression levels over time for representative genes exhibiting distinct patterns, as determined by RNA-seq data analysis. Genes shown display more than a 1.5-fold difference in expression levels between HBV-infected and uninfected groups at the indicated time points

Gene expression patterns in HBV-infected cells exhibited notable and statistically significant differences at 3–7 dpi. Notably, by 7 dpi, over half of the genes (58%) displayed at least a 1.5-fold decrease or increase in expression in HBV-infected cells relative to uninfected cells. The scatter plot illustrates both upregulated and downregulated genes (Fig. 2D). The most significantly downregulated genes included RPL18, RPL28, RBM14, ABCF1, HMGA1, RBM10, and PABPC4, whereas the most significantly upregulated genes were GPAM and LDLRAD4.
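The fold-change criterion can be made concrete with a short sketch. Fold change follows the definition given in the Fig. 2 legend (mean expression in HBV-infected cells divided by mean expression in uninfected cells). Treating a "1.5-fold decrease" as a fold change of at most 1/1.5 is an assumption, and the expression values are hypothetical.

```python
from statistics import mean


def fold_change(infected, uninfected):
    """Mean expression in infected cells divided by mean in uninfected cells."""
    return mean(infected) / mean(uninfected)


def classify(fc, threshold=1.5):
    """Label a gene by a symmetric fold-change threshold (assumed convention)."""
    if fc >= threshold:
        return "increased"
    if fc <= 1 / threshold:
        return "decreased"
    return "unchanged"


# Hypothetical day-7 expression values for two genes.
fc_up = fold_change([30.0, 34.0], [18.0, 22.0])    # 32 / 20 = 1.6
fc_down = fold_change([9.0, 11.0], [19.0, 21.0])   # 10 / 20 = 0.5
print(classify(fc_up), classify(fc_down), classify(1.2))
```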

Cross-dataset validation and analysis of HBV-associated gene expression patterns

To prevent bias toward HBV-associated genes specific to one dataset, we conducted a cross-dataset validation analysis using the GSE72068 dataset, whose study design mirrors ours. We focused on the patterns observed in the 149 genes that exhibited statistical significance in the main dataset generated through RNA-seq experiments in this study. Of these 149 genes, 112 were present in the GSE72068 dataset (Fig. 3A–C). Within the GSE72068 dataset, the expression of these 112 genes showed distinct patterns of decrease or increase in HBV-infected cells, whereas no discernible patterns emerged in uninfected cells over time (Fig. 3D, E). Despite the clear temporal patterns observed in the HBV-infected cells, the gene composition of each pattern differed between datasets. Specifically, 51 of the 112 genes displayed consistent expression patterns in both datasets (Fig. 3B, C). Relative to the expression levels measured 4 h after HBV infection, the expression levels of these 51 genes began to exhibit significant differences from 6 dpi (p = 2.7 × 10⁻⁷ for the decreasing pattern and p = 0.0047 for the increasing pattern). Notably, the degree of expression reduction between HBV-infected and uninfected cells began to display statistically significant differences at 12 dpi (Fig. 3F). Although the R² values for the 51 genes were relatively modest, RPL28 and GPAM emerged among the top genes in terms of fold change (FC), which was consistent with the findings from the primary dataset (Fig. 3G).
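The cross-dataset comparison reduces to two set operations: finding genes present in both datasets, then keeping those whose direction of change agrees. The sketch below illustrates this with a toy input. RPL28 (decreasing) and GPAM (increasing) follow the directions reported in the text, whereas GENE_A, GENE_B, and GENE_C are hypothetical placeholders.

```python
def consistent_genes(main_patterns, validation_patterns):
    """Return (genes in both datasets, genes with the same direction in both)."""
    shared = sorted(set(main_patterns) & set(validation_patterns))
    consistent = [gene for gene in shared
                  if main_patterns[gene] == validation_patterns[gene]]
    return shared, consistent


# Direction of the temporal trend per gene in each dataset (toy example).
main_patterns = {"RPL28": "down", "GPAM": "up", "GENE_A": "down", "GENE_B": "down"}
validation_patterns = {"RPL28": "down", "GPAM": "up", "GENE_A": "up", "GENE_C": "down"}

shared, consistent = consistent_genes(main_patterns, validation_patterns)
print(shared)       # genes measurable in both datasets
print(consistent)   # genes whose direction agrees
```

In the study, the first step corresponds to the 112 of 149 genes present in GSE72068, and the second to the 51 genes with consistent patterns.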

Fig. 3

Expression patterns for 149 genes in the GSE72068 dataset. A Venn diagram showing the relationship between the 149 candidate genes in the main dataset and the genes represented in the GSE72068 dataset. B Venn diagram showing the number of genes with a decreasing pattern in each dataset among the 112 candidate genes. C Venn diagram showing the number of genes with an increasing pattern in each dataset among the 112 candidate genes. D, E Expression pattern over time for 112 genes in HBV-infected and uninfected cells in the GSE72068 dataset. Expression levels were normalized for each gene and then used as input for the heatmap. The order of the genes is the same. F Boxplots showing the differences between HBV-infected and uninfected groups for both patterns. We focused on 51 genes that showed common patterns with the main dataset. The expression patterns of the two groups were compared by date. G Scatter plot with R² as the y-axis and the fold change at day 12 after infection as the x-axis. Fold change was defined as the average expression level in HBV-infected cells divided by the average expression level in uninfected cells. Green or orange indicates genes with 1.1-fold decreased or increased expression levels, respectively, in HBV-infected cells compared with those in uninfected cells

Selection of potential biomarkers for HBV-mediated chronic liver disease

In total, 88 genes were identified as potential biomarkers for HBV-associated HCC: the 51 genes validated using the GSE72068 dataset and an additional 37 genes supported by substantial evidence (high R² and FC values). Detailed results of the main analysis for the 51 significant genes exhibiting consistent patterns across both datasets are provided in Table 1. The 37 genes were included in the list of potential biomarkers because they could not be assessed in the GSE72068 dataset, in which they are not represented (see Table 2). Notably, these 37 genes had high R² and FC values, including ELOA (R² = 0.818) and LDLRAD4 (FC = 1.730). The DAVID functional analysis revealed that most of these genes (80%) are annotated with protein binding (GO:0005515; q-value = 0.0133). Within this subset, 6 genes, including RPL28, correspond to structural elements of ribosomes (hsa03010; q-value = 0.0341). Additionally, among the statistically significant functional pathways and gene ontology terms, we focused on RNA-binding proteins (RBPs; GO:0003723; q-value = 5.6 × 10⁻⁴). Among the 88 genes, 21 (24%) were identified as RBPs, which were downregulated in HBV-infected cells relative to uninfected cells. While previously reported RBPs have shown associations with prognostic markers in HCC patients, regardless of chronic HBV infection, differences exist between these RBPs and the biomarkers identified in this study [13, 14]. Validation experiments employing qRT-PCR were conducted for 21 genes, resulting in the validation of 6 genes (ABCF1, HMGA1, RPL28, RBM10, RBM14, and PABPC4) (Fig. 4B, C). Interestingly, RBM14 and RPL28 exhibited a trend of downregulation in HBV-infected PHHs and in HCC tumor tissue based on the GSE72068 and GSE25097 datasets (Fig. 4D, E). These findings, supported by evidence from HCC tumor samples, suggest the potential of RBM14 and RPL28 as biomarkers for HBV-associated HCC. RBM10, although not prominent in the GSE72068 and GSE25097 datasets, was previously identified as a downregulated tumor suppressor gene in HCC tissues by qRT-PCR [15].

Table 1 Detailed results of main dataset analysis for genes with common patterns
Table 2 Detailed results from analysis of main dataset for undesigned genes in GSE72068 set
Fig. 4

Functional enrichment analysis and qRT-PCR-based validation analysis. A All statistically significant enriched terms are displayed (q-value < 0.05). The count indicates the number of genes belonging to each term. B, C Quantitative analysis of the genes assessed by qRT-PCR. Relative expression levels were calculated by normalizing to β-actin expression. Significant differences between uninfected and HBV-infected primary human hepatocytes at 7 dpi are indicated (*p < 0.05, **p < 0.01). D Comparison of expression levels between two groups on the last day of measurement for each dataset. Fold change (FC) is defined as the average RNA expression level in the HBV group divided by that in the control (Ctrl) group. E Two genes validated in the GSE25097 dataset show downregulated expression levels in HCC samples compared with those in healthy liver samples

Discussion

HBV is a non-cytopathic, hepatotropic virus known for causing persistent infections that may lead to cirrhosis and HCC. Consequently, numerous studies have focused on characterizing the altered gene expression profiles in host cells following HBV infection [16,17,18]. Despite extensive research on gene expression changes following HBV infection in tumor-derived cell lines and hepatocytes, little is known about HBV infection in PHHs. Our study aimed to identify potential biomarkers for functionally curing HBV infection through a comprehensive understanding of the genes and pathways involved in HBV replication/transcription in PHHs. The cryopreserved PHHs used in this study represent a physiologically relevant in vitro culture system for studying HBV infection. They closely mimic the characteristics of HBV-infected human hepatocytes [19].

In the HBV life cycle, HBV enters hepatocytes by binding to specific receptors on their surface. Once inside, the uncoated viral genome enters the nucleus, forming cccDNA, which acts as a template for viral transcript synthesis. Viral transcription driven by promoter and enhancer regions (ENI and ENII) generates four unspliced viral RNAs (3.5, 2.4, 2.1, and 0.7 kb), among which the 3.5 kb RNA comprises the precore and pregenomic RNA species. Notably, the precore mRNA encodes the precore antigen (HBeAg), while the pregenomic RNA directs the translation of the core antigen (HBcAg) and polymerase. After encapsidation, pregenomic RNA serves as a reverse transcription template. Assembled HBV virions are secreted from hepatocytes. Each step of the HBV life cycle is heavily dependent on host factors, such as hepatocyte-enriched and ubiquitous transcription factors, which regulate viral replication and transcription via promoter and enhancer regions. Therefore, we focused on changes in gene expression and transcription dynamics in hepatocytes related to HBV replication/transcription processes after HBV infection. Our analysis suggests significant time-dependent changes, primarily in RNA metabolism-related genes, within PHHs following HBV infection.

HBV directly alters RNA metabolism by interacting with cellular RBPs, affecting RNA processing and stability in host cells [20]. A study by Chabrolles et al. revealed that the HBV core protein interacts with RBPs, notably SRSF10. Their functional studies identified SRSF10 as an HBV RNA regulator that acts primarily in its dephosphorylated state within infected cell nuclei, influencing nascent HBV RNA levels without affecting HBV RNA splicing [21]. Additionally, virus-induced changes in the cell environment indirectly impact RNA metabolism, leading to dysregulated gene expression and altered mRNA levels crucial for immune responses, cell functions, and apoptosis during HBV infection [22,23,24]. In this study, we identified six RNA metabolism-associated genes involved in HBV infection, namely ABCF1, HMGA1, RPL28, RBM10, RBM14, and PABPC4. They were successfully validated through qRT-PCR experiments and analysis of two separate datasets. Notably, RBM14 and RPL28 have been implicated in hepatocarcinogenesis and used as potential biomarkers for HBV-associated HCC [25, 26]. RBM14 plays diverse roles in maintaining the stem-like state of glioblastoma multiforme spheres and assisting PARP-dependent DNA repair at double-strand breaks. Moreover, elevated RBM14 expression in HCC modulates the M2-phenotype polarization of Kupffer cells (KCs) through N6-methyladenosine (m6A) methylation regulation, thereby promoting the malignant aggressiveness of HCC cells [27].

RPL28 is implicated in HCC through its modulation of murine double minute 2 (MDM2), which affects the function of the tumor suppressor p53. Its negative regulation of MDM2 inhibits p53 ubiquitination, stabilizing the tumor-suppressive role of p53. Dysregulated RPL28 activity in hepatocarcinogenesis may disrupt these processes, potentially facilitating HCC cell growth [28]. However, there is insufficient research on the functions of RBM14 and RPL28 in relation to HBV infection. Collectively, our results and previous reports suggest that altered expression of RNA processing regulatory factors in HBV-associated HCC could serve as potential biomarkers.

In this study, we identified only a modest number of significantly differentially expressed genes between HBV-infected and uninfected PHHs. This unexpected result might reflect differences in cell viability and metabolic activity between freshly isolated and cryopreserved PHHs, arising from the preservation stress incurred during freezing and thawing [29, 30]. Therefore, further validation and investigation using more natural models and clinical samples will be necessary. Additional experiments are also needed to investigate differences in gene expression related to susceptibility to HBV infection in freshly isolated PHHs obtained from different donors.

In summary, our study presents comprehensive transcriptional profiling of HBV infection dynamics and validates 10 genes related to RNA metabolism in HBV infection. We utilized PHHs, which are regarded as the most physiologically relevant in vitro model for studying HBV infection. PHHs closely mirror the characteristics of human liver cells, the primary target of HBV. Our findings uncovered alterations in various genes and cellular pathways linked to mRNA metabolism, alternative splicing, and spliceosomes during HBV replication. Several of these genes are associated with HBV-associated HCC. These findings hint at the potential for identifying novel biomarkers to address HBV-associated HCC.

Conclusions

Our study delineated alterations in gene expression within PHHs caused by HBV infection. We pinpointed RBPs crucial in mRNA metabolism and the regulation of alternative splicing during HBV infection. Grasping the functional roles of host factor networks in HBV infection holds promise in elucidating the molecular mechanisms behind HBV replication/transcription and could pave the way for developing therapeutic interventions against HBV infection.