Background

Chronic hepatitis B virus (HBV) infection is an endemic disease with a global burden of 350 million patients [1]. The disease persists for decades, and its natural history comprises the immune tolerance, immune clearance and inactive residual phases [2–6]. During chronic infection, episodes of liver inflammation may occur and cause progressive liver fibrosis and cirrhosis, leading to thrombocytopenia [7], hypoalbuminemia [8], portal hypertension, esophageal varices, ascites [9], liver decompensation and hepatocellular carcinoma (HCC) [10]. To prevent such devastating consequences, effective antiviral therapies are now widely used, with viral and host status carefully monitored [11]. Serum concentrations of HBV DNA and hepatitis B surface antigen (HBsAg) are both important viral markers [12]. HBsAg is encoded by the HBV genome. Thus, higher HBV DNA concentrations would be expected to imply higher HBsAg levels.

Despite this shared molecular origin, serum HBV DNA and HBsAg do not show a simple linear relationship over the natural course. HBV DNA and hepatitis B e-antigen (HBeAg) levels are drastically reduced in the immune clearance phase, whereas HBsAg levels continue to decline in the inactive residual phase [2–6]. This discrepancy makes HBV DNA and HBsAg levels independent variables rather than confounding variables in clinical studies. For example, they play different roles in the prediction of subsequent HCC [13]. Medical guidelines suggest that antiviral treatment should be given to patients in the immune clearance phase, to expedite the natural course into the inactive residual phase, and to patients with viral reactivation during the inactive residual phase [11, 12, 14]. HBV DNA can be effectively suppressed, often to undetectable levels, by approved nucleos(t)ide analogs including lamivudine [15], adefovir [16], entecavir [17], telbivudine [18] and tenofovir [19]. HBsAg, however, remains positive for years in most treated patients. This is why HBsAg seroconversion (the disappearance of HBsAg and the production of anti-HBs antibody), rather than HBV DNA undetectability, is regarded as the closest sign of cure [12]. On the other hand, patients with negative HBsAg but positive HBV DNA are occasionally identified and are referred to as occult hepatitis B patients [20–22]. These patients remain at risk of HBV reactivation [22].

The lack of a linear relationship between HBV DNA and HBsAg may be partly explained by the viral life cycle. The covalently closed circular DNA (cccDNA) is the template for the messenger RNAs, which are translated to produce HBsAg, as well as for the pregenomic RNAs, which are reverse transcribed into viral DNA [23]. Because HBV DNA can integrate into the human genome, HBsAg may also be derived from integrated HBV DNA in addition to cccDNA [24]. The viral life cycle takes place in human hepatocytes, making it susceptible to host factors.

The discrepancy between serum HBV DNA and HBsAg levels has yet to be quantitatively evaluated. Therefore, we employed a data-driven approach and conducted a systematic, multivariate analysis of hematological, histological and viral factors to assess their effects on HBsAg concentrations.

Methods

Patients

This study was approved by the institutional review board of the Chang Gung Memorial Hospital, Taiwan, and conducted in accordance with the Declaration of Helsinki. All patients gave informed consent for the deposition of their clinical samples in the tissue bank of Chang Gung Memorial Hospital, Taiwan, for academic research.

In the first stage, clinical records of 327 chronic hepatitis B patients who received pretreatment hematology, liver histology and viral serology assessments between 2007 and 2009 were retrospectively retrieved for quantitative modeling (Table 1). Liver histology was evaluated by the ISHAK hepatic activity indexes [25]. In the second stage, two independent cohorts were assessed for validation. The first validation cohort comprised 45 patients who also received liver biopsy for pretreatment evaluation between 2007 and 2009. The second validation cohort comprised 80 anti-hepatitis B treatment-naïve patients evaluated between 2010 and 2012. These patients did not receive liver biopsy.

Table 1 Baseline characteristics of patients in the model construction cohort

Quantitation of HBV DNA and HBsAg concentrations

HBV DNA levels were measured using the COBAS AmpliPrep/COBAS TaqMan HBV Test, v2.0 assay (Roche Molecular Systems Inc, Pleasanton, CA) according to the manufacturer’s protocols. HBsAg concentrations were measured using the Elecsys HBsAg II assay (Roche Diagnostics GmbH, Mannheim, Germany).

Statistical analysis

The HBV DNA and HBsAg concentrations are consistently expressed here on the log10 scale because of their wide numerical ranges. Clinical associations were evaluated by univariate and multivariate linear regression. Subgroup analysis was then performed to identify the patient stratum in which the significant HBsAg-HBV DNA correlation was lost. For this subgroup, modulating factors of the HBV DNA-HBsAg relationship were evaluated by backward stepwise linear regression, with the F-test used to evaluate model performance. The modulating factors were then introduced into a prediction model. Statistical analysis was performed using the SPSS software (IBM, Armonk, NY). P values smaller than 0.05 were considered statistically significant.

Results

HBV DNA levels, HBeAg positivity and ISHAK fibrosis stages were independently associated with HBsAg levels

The model construction cohort comprised 327 chronic hepatitis B patients (Table 1). Age, ISHAK fibrosis stages, HBV DNA levels, hepatitis B e-antigen (HBeAg) positivity, platelet counts and hemoglobin levels were significantly associated with HBsAg levels in the univariate analysis (Table 2). When these variables were entered into the multivariate analysis, only three variables remained significantly associated (ISHAK fibrosis stages, HBV DNA levels and HBeAg positivity) (Table 2). Among them, HBV DNA was the most strongly associated variable (P < 0.001). An initial model of HBsAg based on these three independent variables was therefore constructed as a benchmark using multivariate linear regression as:

$$ \mathrm{HBsAg} = 0.274 \times \mathrm{HBV\ DNA} + 0.314 \times \left(1\ \mathrm{if\ HBeAg\ positive};\ 0\ \mathrm{if\ HBeAg\ negative}\right) - 0.123 \times \mathrm{ISHAK\ fibrosis\ score} + 1.858 $$

Table 2 Association of viral and host variables to quantitative HBsAg concentrations using linear regression

The HBsAg levels estimated by the three-variable model were significantly correlated with the measured HBsAg levels (Pearson’s correlation r = 0.59; P < 0.001). The standard deviation of the regression residual was 0.79 log10 IU/mL.
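The benchmark equation can be written as a small function for readers who want to reproduce an estimate. This is only a sketch of the published coefficients. The function and argument names are illustrative, and the HBeAg term is read here as an indicator coded 1 for HBeAg-positive and 0 for HBeAg-negative patients, which is an interpretation of the equation and is not spelled out in the Methods. HBV DNA and HBsAg are in log10 IU/mL.

```python
def estimate_hbsag_three_variable(hbv_dna_log10, hbeag_positive, ishak_fibrosis_score):
    """Estimate HBsAg (log10 IU/mL) with the three-variable benchmark model.

    Assumption: HBeAg status enters as an indicator (1 = positive, 0 = negative).
    """
    hbeag_indicator = 1.0 if hbeag_positive else 0.0
    return (0.274 * hbv_dna_log10
            + 0.314 * hbeag_indicator
            - 0.123 * ishak_fibrosis_score
            + 1.858)


# Hypothetical patient: HBV DNA 7.0 log10 IU/mL, HBeAg positive, ISHAK fibrosis score 2.
print(round(estimate_hbsag_three_variable(7.0, True, 2), 3))  # 3.844
```

Note that this model requires an ISHAK fibrosis score, and therefore a liver biopsy, for every estimate.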

Identification of a patient subgroup which lacked significant HBV DNA-HBsAg correlations

We further conducted a subgroup analysis of patients stratified by the above three variables. Significant HBsAg-DNA correlations remained in both the HBeAg-positive and HBeAg-negative subgroups (both P < 0.001), in the ISHAK score ≥ 4 and ≤ 3 subgroups (both P < 0.001), and in patients with HBV-DNA > 6 log10 IU/mL (P < 0.001). However, no significant association was found in the patient subgroup with HBV-DNA ≤ 6 log10 IU/mL (Fig. 1). A baseline comparison of the low- and high-HBV DNA titer subgroups, defined using the boundary threshold of 6 log10 IU/mL, showed that the low-titer subgroup had a significantly lower percentage of HBeAg-positive patients (27.48%) than the high-titer subgroup (56.12%, Table 3).

Fig. 1 Pearson’s correlation (r) of HBsAg and HBV DNA levels in patient subgroups stratified by HBeAg status, ISHAK fibrosis stages and HBV DNA levels

Table 3 Baseline characteristics of patients with HBV DNA below or above 10^6 IU/mL

A biphasic model of HBsAg concentrations using platelet counts and HBV DNA concentrations

A scatter plot was then produced to visualize the relationship between HBV DNA and HBsAg identified in the previous subgroup analysis (Fig. 2a). A significant HBsAg-DNA correlation was found in patients with HBV-DNA > 6 log10 IU/mL but not in patients with HBV-DNA ≤ 6 log10 IU/mL, suggesting that unknown factors modulate the HBsAg levels in the low-titer subgroup. Therefore, a backward stepwise linear regression analysis was performed in the subgroup with HBV-DNA ≤ 6 log10 IU/mL (N = 131). This was done by incorporating all 16 clinical variables into a multivariate linear regression equation, then removing the less relevant variables one at a time and evaluating the statistical significance at each step (Fig. 2b). At the end of the stepwise analysis, the linear combination of two variables, platelet counts and HBV DNA levels, was found to be significantly correlated with HBsAg levels (F-test P = 0.048, degrees of freedom = 2).
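The backward stepwise procedure described above can be sketched as follows. This is an illustrative NumPy sketch and not the SPSS procedure used in the study. The removal rule (drop the variable whose removal increases the residual sum of squares the least) and all function names are assumptions made for the example. The sketch reports the overall F statistic at each step; a P value could be obtained from the F distribution with the listed degrees of freedom (for example with `scipy.stats.f.sf`).

```python
import numpy as np


def fit_rss(X, y):
    """Ordinary least squares with an intercept; returns the residual sum of squares."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)


def overall_f_statistic(X, y):
    """Overall F statistic of the regression, with its degrees of freedom (k, n - k - 1)."""
    n, k = X.shape
    rss = fit_rss(X, y)
    tss = float(((y - y.mean()) ** 2).sum())
    f_stat = ((tss - rss) / k) / (rss / (n - k - 1))
    return f_stat, (k, n - k - 1)


def backward_elimination(X, y, names, keep=2):
    """Remove one variable per step until `keep` variables remain.

    Assumed removal rule: drop the variable whose removal raises the RSS the least.
    Returns a list of (variable names, F statistic, degrees of freedom) per step.
    """
    active = list(range(X.shape[1]))
    path = []
    while True:
        f_stat, dof = overall_f_statistic(X[:, active], y)
        path.append(([names[i] for i in active], f_stat, dof))
        if len(active) <= keep:
            break
        rss_without = {
            i: fit_rss(X[:, [j for j in active if j != i]], y) for i in active
        }
        active.remove(min(rss_without, key=rss_without.get))
    return path


# Synthetic illustration only (not patient data): y depends on x0 alone.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 3))
y_demo = 2.0 * X_demo[:, 0] + 0.1 * rng.normal(size=100)
for variables, f_stat, dof in backward_elimination(X_demo, y_demo, ["x0", "x1", "x2"], keep=1):
    print(variables, dof)
```

In the study setting, `X` would hold the 16 clinical variables for the 131 low-titer patients and `keep` would be 2.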

Fig. 2 a Scatter plot of HBV DNA and HBsAg levels in the model construction cohort. b Backward stepwise linear regression analysis in patients with HBV-DNA ≤ 6 log10 IU/mL. The x-axis shows the number of variables incorporated in the model, which equals the degrees of freedom in the F-test. The y-axis shows the P values calculated by the F-test. At the beginning, all 16 clinical variables were incorporated into a linear model. Less relevant variables were progressively removed. At the end of the stepwise process, a linear combination of platelet counts and HBV DNA levels showed a significant association with HBsAg levels (P = 0.048, degrees of freedom = 2). c Estimated HBsAg levels as a function of HBV DNA levels and platelet counts in the constructed biphasic model. d Scatter plot of the measured and estimated HBsAg levels in the two validation cohorts. Validation cohort 1: patients with biopsy-included pretreatment evaluations. Validation cohort 2: treatment-naïve patients

We then constructed a biphasic model of HBsAg levels using (i) HBV-DNA alone when HBV-DNA > 6 log10 IU/mL, and (ii) HBV-DNA and platelet counts together when HBV-DNA ≤ 6 log10 IU/mL:

$$ \mathrm{HBsAg} = 0.538 \times \mathrm{HBV\ DNA} + 0.001 \times \mathrm{platelet} \times \left( \left| 6 - \mathrm{HBV\ DNA} \right| + 6 - \mathrm{HBV\ DNA} \right) - 0.321 $$

where |·| denotes the absolute-value function, so that the platelet term contributes only when HBV DNA ≤ 6 log10 IU/mL. The relationship between the HBV DNA levels, platelet counts and the estimated HBsAg levels is visualized in Fig. 2c. In the model construction cohort, the HBsAg levels calculated by the biphasic model were significantly correlated with the measured HBsAg levels (r = 0.60, P < 0.001). The standard deviation of the regression residual was 0.78 log10 IU/mL.
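The biphasic behaviour follows directly from the absolute-value term: the bracketed factor equals 0 when HBV DNA > 6 log10 IU/mL and 2 × (6 − HBV DNA) otherwise. The minimal sketch below makes this explicit. The function and argument names are illustrative, and the platelet count is assumed to be in the same unit as reported in Table 1, which is not restated in the text.

```python
def estimate_hbsag_biphasic(hbv_dna_log10, platelet_count):
    """Estimate HBsAg (log10 IU/mL) with the biphasic model.

    hbv_dna_log10: HBV DNA in log10 IU/mL.
    platelet_count: platelet count, assumed to be in the unit used in Table 1.
    """
    gap = 6.0 - hbv_dna_log10
    # |gap| + gap is 0 above the threshold and 2 * (6 - HBV DNA) at or below it.
    platelet_weight = abs(gap) + gap
    return 0.538 * hbv_dna_log10 + 0.001 * platelet_count * platelet_weight - 0.321


# Hypothetical inputs: above the threshold the platelet count has no effect.
print(round(estimate_hbsag_biphasic(7.0, 200), 3))  # 3.445
print(round(estimate_hbsag_biphasic(7.0, 100), 3))  # 3.445
# Below the threshold the platelet term contributes.
print(round(estimate_hbsag_biphasic(4.0, 200), 3))  # 2.631
```

Writing the model with the absolute value keeps it as a single continuous expression, so no discontinuity arises at the 6 log10 IU/mL boundary.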

Clinical records of an additional 45 patients with biopsy-included pretreatment evaluations were used for the first validation (Table 4). A significant positive correlation was found between the estimated and measured HBsAg concentrations (r = 0.47, P = 0.001). The standard deviation of the residual was 0.82 log10 IU/mL. Furthermore, a cohort of 80 treatment-naïve patients (not receiving pretreatment liver biopsy) evaluated between 2010 and 2012 was used for the second validation (Table 5). A significant positive correlation was found again (r = 0.57, P < 0.001). The standard deviation of the residual was 0.88 log10 IU/mL. The estimated and measured HBsAg levels in the two validation cohorts are shown in Fig. 2d.
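The two validation metrics reported above, Pearson's r and the standard deviation of the residual, can be computed as in the following sketch. The residual is taken here as measured minus estimated HBsAg and the sample standard deviation (n − 1 denominator) is used; both choices are assumptions, since the text does not define them. The numbers in the usage lines are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def residual_sd(measured, estimated):
    """Sample standard deviation of the residuals (measured minus estimated)."""
    residuals = [m - e for m, e in zip(measured, estimated)]
    mean_res = sum(residuals) / len(residuals)
    return math.sqrt(sum((r - mean_res) ** 2 for r in residuals) / (len(residuals) - 1))


# Illustrative values only (log10 IU/mL), not data from the study.
measured = [1.0, 2.0, 3.0]
estimated = [0.0, 2.0, 4.0]
print(pearson_r(measured, estimated))
print(residual_sd(measured, estimated))
```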

Table 4 Baseline characteristics of patients in the first validation cohort. The patients all received biopsy-included pretreatment evaluations between 2007–2009
Table 5 Baseline characteristics of patients in the second validation cohort. The patients were treatment-naïve patients evaluated between 2010–2012

Discussion

Chronic hepatitis B often lasts for decades, if not a lifetime. The HBsAg level is high in the immune tolerance phase and declines gradually through the immune clearance and inactive residual phases [12]. A strong positive linear relationship between age and the annual rate of HBsAg seroclearance has been demonstrated in a meta-analysis of 13 study cohorts [26]. The highest rate of HBsAg seroclearance occurred at 50 years of age [26], an age when many patients have already developed mild or severe liver fibrosis. Negative correlations between fibrosis stages and HBsAg levels have also been demonstrated in previous univariate analyses [27, 28]. Significantly lower HBsAg levels were found in patients with ISHAK fibrosis score > 1, compared with those with score ≤ 1 (P < 0.001) [27]. Baseline data from a multicenter, phase III trial of peginterferon alfa-2a and a phase IV NEPTUNE trial showed that lower HBsAg levels were associated with lower PS1 and PS2 scores, which indicated more severe fibrosis [28].

Multivariate analysis is necessary because multiple factors (age, HBsAg level, fibrosis stage) were shown to be involved in univariate analyses [27, 28]. Our systematic evaluation of hematological, histological and viral serological variables showed that the progression of liver fibrosis was accompanied by HBsAg reduction (Table 2, adjusted regression coefficient of “ISHAK fibrosis stage” = -0.125, P = 0.002), independent of age, HBV DNA levels, HBeAg positivity, platelet counts and hemoglobin levels. Age, on the other hand, was negatively correlated with HBsAg concentrations in the univariate analysis but not in the multivariate analysis.

The discrepancy between HBV DNA and HBsAg levels explains why HBsAg cannot play a role comparable to that of HBV DNA in estimating subsequent HCC risk (except in patients with very low levels of HBV DNA). HBV DNA has been established as an important predictor of HCC risk [29]. A recent report showed that HBV DNA is in general a better predictor of HCC than HBsAg [13]. However, in the specific subgroup of HBeAg-negative patients with HBV DNA < 2000 IU/mL, HBsAg rather than HBV DNA was the better predictor [13]. This conclusion was based on a study population of non-cirrhotic, relatively young patients (>50% of patients were 28–39 years old at the time of enrolment). Considering the strong effect of fibrosis stages on subsequent HCC occurrence [12, 30] and the negative correlation between HBsAg levels and fibrosis stages demonstrated here, it is reasonable to suggest that any potential positive correlation between HBsAg and HCC incidence can only be found among patients with similar fibrosis status, which, however, requires liver biopsy to be assessed correctly. The predictive role of HBsAg for HCC reported in [13] may not be readily extrapolated to older patients with mild, moderate or severe fibrosis.

A model of HBsAg levels can be constructed straightforwardly using the three independent variables (HBV DNA, fibrosis stages and HBeAg status). This model served as a benchmark in the search for a simpler model with fewer clinical variables. We went on to investigate patient subgroups stratified by the three independent variables. We found that HBV DNA remained significantly associated with HBsAg in all strata except when HBV DNA ≤ 6 log10 IU/mL. A backward stepwise linear regression analysis in the low-titer subgroup showed that, after the less relevant variables were gradually removed, platelet counts and HBV DNA remained, and their combination was synergistically associated with HBsAg levels. Thus, a biphasic model was constructed using HBV DNA alone when HBV-DNA > 6 log10 IU/mL, and platelet counts in conjunction with HBV DNA when HBV-DNA ≤ 6 log10 IU/mL. This new model is simpler, with fewer variables, yet its correlation (r = 0.60) is slightly higher and the standard deviation of its regression residual (e = 0.78 log10 IU/mL) is slightly lower than those of the three-variable model (r = 0.59 and e = 0.79 log10 IU/mL).

The reduction of platelet counts, i.e. thrombocytopenia, is known to be associated with chronic liver diseases and cirrhosis [31, 32]. Because platelet counts and ISHAK fibrosis stages are correlated, both were associated with HBsAg levels in our univariate analysis (Table 2). When they were both introduced into the multivariate analysis, only the ISHAK stage remained significantly associated, whereas platelet counts did not (P = 0.064). However, in the low-titer subgroup with HBV-DNA ≤ 6 log10 IU/mL, platelet counts rather than ISHAK stages were retained in the backward stepwise regression analysis. This showed that platelet counts and HBV DNA formed an effective combination for estimating HBsAg when HBV-DNA ≤ 6 log10 IU/mL.

Platelets are widely known for their role in blood coagulation. In addition to this conventional role, their antimicrobial roles are gradually being recognized [33]. Platelets can secrete chemokine ligand 5 (CCL5) to stimulate the production of megakaryocytes, forming a positive feedback loop of platelet activation [34]. They can also secrete hepatocyte growth factor (HGF), which protects against liver fibrosis [35]. The detailed mechanisms by which platelets interact with the HBV life cycle warrant further investigation.

The quantitative modeling provided a numerical basis for our understanding of the relationship between HBsAg, HBV DNA, age, fibrosis stages and platelet counts. The estimated HBsAg concentrations correlated well with the measured HBsAg in the model construction cohort as well as in the two independent validation cohorts (P ≤ 0.001 in all), supporting the use of the biphasic model in retrospective studies where HBsAg was not measured at previous timepoints and no stored clinical samples are available. Since quantitative HBsAg measurement has become increasingly available, patients’ HBsAg levels can now be measured directly without the help of this biphasic model.

Patients in the immune activation and inactive residual phases particularly require quantitative HBsAg monitoring, and they made up the majority of our study cohorts. Although we analyzed a total of 452 patients, patients in the immune tolerance phase were not well represented. Therefore, the current analysis applies only to patients in the immune activation phase onward and may not be extrapolated to patients in the immune tolerance phase.

In conclusion, serum HBsAg levels depended on HBV DNA titers, liver fibrosis stages and HBeAg positivity. Taking all of the above into consideration, we constructed a noninvasive, biphasic quantitative model using two variables, HBV DNA and platelet levels, which can effectively estimate HBsAg concentrations.