Background

Non-alcoholic fatty liver disease (NAFLD) is currently the most prevalent chronic liver disease in the world [1]. Predictions indicate that NAFLD will become the leading cause of liver-related morbidity and mortality and the leading indication for liver transplantation in the next decade [2,3,4]. NAFLD presents as a range of liver disease severity from simple steatosis known as non-alcoholic fatty liver (NAFL), to steatosis with inflammation and hepatocyte injury termed non-alcoholic steatohepatitis (NASH). Fibrosis is more prevalent in NASH and is now considered the dominant histological feature that is predictive of outcomes. Several studies have shown that fibrosis stage is the most relevant liver biopsy feature associated with overall- and liver-related mortality as well as liver-related events [5,6,7,8,9,10,11]. Thus, quantifying liver fibrosis in NAFLD patients is an important part of management as it guides long-term prognosis, need for monitoring, treatment intensity and referral for clinical trials.

Although liver biopsy remains the gold standard investigation for assessing stage of disease, its invasive nature means it is not practical for use in all NAFLD patients [12]. Non-invasive tests (NITs) are widely used as alternative tools for the diagnosis of liver fibrosis in NAFLD patients. ‘Simple’ liver fibrosis NITs based on routine laboratory and clinical parameters include the AST to platelet ratio index (APRI), FIB-4, and the NAFLD fibrosis score (NFS). A meta-analysis of 34 studies demonstrated the mean AUC values of these measures to be reasonable, at between 0.75 and 0.80 for the detection of advanced fibrosis in NAFLD patients [13]. The summary positive and negative predictive values using standard cut-offs ranged between 55–67% and 89–93%, respectively, highlighting their strength in excluding advanced disease but also their potential to produce false positive results.

Hepascore is an alternative serum NIT which incorporates direct markers of fibrogenesis/fibrinolysis, namely hyaluronic acid and alpha-2 macroglobulin. Hepascore was originally developed in patients with chronic hepatitis C infection [14], but it has also been reported to be an accurate predictor of liver fibrosis in NAFLD patients [15, 16]. Liver stiffness measurement (LSM) by vibration controlled transient elastography (VCTE or Fibroscan®) utilizes the physical properties of the liver to predict fibrosis severity in NAFLD patients [17, 18]. In the setting of a valid scan, transient elastography has excellent accuracy for the determination of advanced fibrosis [13].

Despite a number of studies examining the use of NITs in NAFLD, surprisingly few studies have compared the accuracy of simple NITs with models including more complex biomarkers of fibrosis and VCTE. In addition, recent studies have highlighted that the accuracy of NITs in NAFLD patients may be impacted by a range of patient factors including older age (FIB-4, NFS), presence of type 2 diabetes mellitus (NFS, FIB-4, APRI, Hepascore), obesity and degree of hepatic steatosis (VCTE) [16, 19,20,21,22]. Moreover, many of these models allocate a significant proportion of patients to an indeterminate or undiagnosed zone, and these subjects may require additional testing or a liver biopsy to exclude or confirm the presence of advanced fibrosis. As a result, the optimal NIT method that accurately identifies liver fibrosis in NAFLD patients remains unclear. Thus, we aimed to compare the accuracy of simple serum models (NFS, FIB-4 and APRI) with a model incorporating direct measures of hepatic fibrogenesis (Hepascore) and with LSM by VCTE for the non-invasive diagnosis of liver fibrosis in a population of biopsy-proven NAFLD patients.

Material and Methods

Study Population

We conducted a retrospective study of patients at a tertiary centre [Department of Hepatology, Sir Charles Gairdner Hospital (SCGH)], Western Australia. The study cohort consisted of subjects recruited from the outpatient hepatology clinic at SCGH who fulfilled the following inclusion criteria: adult patients with histologically proven NAFLD and available data for calculation of Hepascore, NFS, FIB-4 and APRI within six months of liver biopsy [23]. Subjects were excluded if they had concomitant acute or chronic liver disease (viral hepatitis, hemochromatosis, α1-antitrypsin deficiency, Wilson disease, autoimmune or drug-induced liver disease), decompensated chronic liver disease or hepatocellular carcinoma at baseline, secondary causes of NAFLD, human immunodeficiency virus infection, or a history of average daily alcohol consumption of > 20 g for men or > 10 g for women. All subjects who elected to undergo liver biopsy and had histological confirmation of NAFLD were included in this analysis. We also included a subset of patients with additional LSM available using Fibroscan®.

Clinical and laboratory data were prospectively collected at time of liver biopsy. LSM was performed within 15 days of the liver biopsy. Demographic variables included age, gender and ethnicity. Body mass index (BMI) was calculated as weight (in kilograms) divided by height squared (in meters). Subjects with a BMI of 30 kg/m² or more were considered obese. Laboratory evaluations included alanine aminotransferase (ALT), aspartate aminotransferase (AST), total cholesterol, high-density lipoprotein (HDL) and low-density lipoprotein (LDL) cholesterol, triglycerides, fasting glucose, platelet count, international normalized ratio (INR), creatinine and albumin. Lipid and glucose levels were assessed after an overnight fast.
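The BMI calculation and obesity threshold described above can be illustrated with a minimal sketch. The function names are hypothetical and chosen only for this example; the 30 kg/m² cut-off is the one stated in the text.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Return BMI in kg/m^2: weight divided by height squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def is_obese(bmi: float, cutoff: float = 30.0) -> bool:
    """Obesity as defined in the text: BMI of 30 kg/m^2 or more."""
    return bmi >= cutoff


if __name__ == "__main__":
    # Hypothetical subject, for illustration only.
    bmi = body_mass_index(weight_kg=90.0, height_m=1.5)
    print(round(bmi, 1), is_obese(bmi))
```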

In addition, we assessed the presence of the following comorbidities: type 2 diabetes mellitus (T2DM), hypertension, hypertriglyceridemia and ischemic heart disease (IHD) based on medical records and a patient history review at time of liver biopsy. Diagnosis of T2DM was based on a history of diabetes and/or fasting plasma glucose ≥ 126 mg/dL or 2 h plasma glucose ≥ 200 mg/dL during an oral glucose tolerance test or HbA1c ≥ 6.5% or use of antidiabetic medications [24]. Hypertension was defined as systolic blood pressure ≥ 140 mmHg and diastolic blood pressure ≥ 90 mmHg or anti-hypertensive use [25]. All subjects had provided informed consent for their medical information to be used for medical research with the approval of the SCGH Human Research Ethics Committee.
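As an illustration only, the comorbidity definitions above can be written as simple rules. This is a sketch under the assumption that missing laboratory values are treated as not meeting the criterion; the function and parameter names are hypothetical.

```python
from typing import Optional


def has_t2dm(history_of_diabetes: bool = False,
             fasting_glucose_mg_dl: Optional[float] = None,
             ogtt_2h_glucose_mg_dl: Optional[float] = None,
             hba1c_percent: Optional[float] = None,
             on_antidiabetic_medication: bool = False) -> bool:
    """T2DM if any one of the criteria listed in the text is met.

    Assumption: a value of None (not measured) never satisfies a criterion.
    """
    return (
        history_of_diabetes
        or on_antidiabetic_medication
        or (fasting_glucose_mg_dl is not None and fasting_glucose_mg_dl >= 126)
        or (ogtt_2h_glucose_mg_dl is not None and ogtt_2h_glucose_mg_dl >= 200)
        or (hba1c_percent is not None and hba1c_percent >= 6.5)
    )


def has_hypertension(systolic_mmhg: float, diastolic_mmhg: float,
                     on_antihypertensives: bool = False) -> bool:
    """Hypertension as worded in the text: systolic >= 140 mmHg and
    diastolic >= 90 mmHg, or anti-hypertensive use."""
    return on_antihypertensives or (systolic_mmhg >= 140 and diastolic_mmhg >= 90)
```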

Liver Histology Assessment

All liver biopsies were scored by an expert liver histopathologist according to the NASH-CRN scoring system [26]. Fibrosis was classified on a 5-point scale: stage 0 = no fibrosis, stage 1 = zone 3 perisinusoidal/perivenular fibrosis, stage 2 = zone 3 and periportal fibrosis, stage 3 = septal/bridging fibrosis, stage 4 = cirrhosis. Advanced fibrosis was classified as F3 and F4. Fatty liver was defined by the presence of ≥ 5% steatosis, while NASH was diagnosed by the presence of steatosis, inflammation, and ballooning.

Non-invasive Fibrosis Models Assessment

Hepascore components include age, gender, bilirubin, gamma-glutamyltranspeptidase (GGT), hyaluronic acid and alpha-2 macroglobulin. APRI includes AST and platelet count. FIB-4 components include age, AST, ALT and platelet count. NFS components include age, BMI, presence of impaired fasting glucose or T2DM, AST/ALT ratio, platelet count and albumin. Non-invasive fibrosis models were calculated using the specific formulas for each score, NFS [27], FIB-4 [28], APRI [29] and Hepascore [14], based on clinical, anthropometric and laboratory data. NFS, FIB-4 and APRI were constructed retrospectively with the necessary information collected at the time of liver biopsy. Hepascore was based on the following formula: y = exp[−4.185818 − (0.0249 × age) + (0.7464 × sex) + (1.0039 × alpha-2 macroglobulin) + (0.0302 × hyaluronic acid) + (0.0691 × bilirubin) − (0.0012 × GGT)]. The final score is calculated as y/(1 + y). LSM was assessed with the Fibroscan® 502 (Echosens, Paris, France) medical device using the standard M or XL probe as per the manufacturer’s recommendations, by experienced nurses trained in the use of Fibroscan®. LSM was performed following a two-hour fast, was recorded after obtaining 10 valid measurements, and was reported as the median in kilopascals (kPa). Following recommended criteria for reliability [30], only those patients with measures meeting IQR/M ≤ 0.10, or 0.10 < IQR/M ≤ 0.30, or IQR/M > 0.30 with an LSE median < 7.1 kPa were considered for the analysis.
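The score calculations can be sketched as follows. Only the Hepascore coefficients and the LSM reliability rule come from the text. The APRI, FIB-4 and NFS expressions are the commonly published forms and are an assumption here, since the text cites them without reproducing them. The coding of sex (1 = male, 0 = female), the analyte units and all function names are likewise assumptions made for illustration.

```python
import math


def hepascore(age_years, male, a2m_g_per_l, ha_ug_per_l,
              bilirubin_umol_per_l, ggt_u_per_l):
    """Hepascore as given in the text: y = exp(linear), score = y / (1 + y).

    Assumed (not stated in the text): sex coded 1 for male, 0 for female,
    and analytes in g/L, ug/L, umol/L and U/L respectively.
    """
    sex = 1.0 if male else 0.0
    linear = (-4.185818
              - 0.0249 * age_years
              + 0.7464 * sex
              + 1.0039 * a2m_g_per_l
              + 0.0302 * ha_ug_per_l
              + 0.0691 * bilirubin_umol_per_l
              - 0.0012 * ggt_u_per_l)
    y = math.exp(linear)
    return y / (1.0 + y)


def apri(ast_u_per_l, ast_upper_limit_u_per_l, platelets_1e9_per_l):
    """AST to platelet ratio index (commonly published form, assumed)."""
    return (ast_u_per_l / ast_upper_limit_u_per_l) / platelets_1e9_per_l * 100.0


def fib4(age_years, ast_u_per_l, alt_u_per_l, platelets_1e9_per_l):
    """FIB-4 index (commonly published form, assumed)."""
    return (age_years * ast_u_per_l) / (platelets_1e9_per_l * math.sqrt(alt_u_per_l))


def nafld_fibrosis_score(age_years, bmi, ifg_or_diabetes, ast_u_per_l,
                         alt_u_per_l, platelets_1e9_per_l, albumin_g_per_dl):
    """NAFLD fibrosis score (commonly published form, assumed)."""
    return (-1.675
            + 0.037 * age_years
            + 0.094 * bmi
            + 1.13 * (1.0 if ifg_or_diabetes else 0.0)
            + 0.99 * (ast_u_per_l / alt_u_per_l)
            - 0.013 * platelets_1e9_per_l
            - 0.66 * albumin_g_per_dl)


def lsm_is_reliable(median_kpa, iqr_kpa):
    """Reliability rule from the text: IQR/M <= 0.30, or IQR/M > 0.30
    with a median below 7.1 kPa."""
    ratio = iqr_kpa / median_kpa
    return ratio <= 0.30 or median_kpa < 7.1
```

Because the Hepascore is a logistic transform of a linear predictor, it always lies strictly between 0 and 1, which is why its cut-offs (for example 0.60 and 0.80) are expressed on that scale.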

Statistical Analysis

Patients’ baseline characteristics were reported as number and percentage for categorical variables and, for quantitative variables, as mean ± SD if normally distributed or median and range if not. Diagnostic performance of the models for advanced fibrosis and cirrhosis was assessed using diagnostic accuracy (DA), sensitivity, specificity, negative and positive predictive values and the diagnostic odds ratio (DOR). DA represents the percentage of cases correctly identified by the model and is calculated by dividing the sum of true positive and true negative results by the total number of subjects.
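The diagnostic performance measures named above all derive from a 2 × 2 table of test result against biopsy. The sketch below shows one way to compute them; the function name and the example counts are hypothetical and are not data from this study. The DOR is computed with its standard definition, (TP × TN)/(FP × FN), which the text does not spell out.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute DA, sensitivity, specificity, PPV, NPV and DOR from counts.

    Ratios with a zero denominator are returned as NaN rather than raising.
    """
    def ratio(num: float, den: float) -> float:
        return num / den if den else float("nan")

    total = tp + fp + fn + tn
    return {
        "da": 100.0 * ratio(tp + tn, total),   # percentage correctly classified
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "dor": ratio(tp * tn, fp * fn),        # diagnostic odds ratio
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in diagnostic_metrics(tp=40, fp=10, fn=20, tn=130).items():
        print(f"{name}: {value:.3f}")
```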

All analyses were performed for each non-invasive score using previously published and validated cut-offs for the diagnosis of advanced fibrosis. The accuracy of each algorithm to predict advanced fibrosis and cirrhosis was assessed using the area under the receiver operating characteristic curve (AUC) with 95% confidence intervals (95% CI) and compared using the DeLong test. Comparison of the AUCs between all models was performed using the test of the differences in the AUCs (STATA, ROCCOMP) [31]. AUCs of models were also compared following stratification by age, sex and presence of obesity based on a BMI cut-off of 30 kg/m². A 60-year cut-off value for age was selected as this represented the median value in the cohort.
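For readers unfamiliar with the AUC, a minimal sketch follows. It computes the AUC as the probability that a randomly chosen patient with the condition scores higher than a randomly chosen patient without it, with ties counted as one half. This is not the Stata procedure used in the study, it does not implement the DeLong comparison or confidence intervals, and the example values are hypothetical.

```python
from typing import Sequence


def auc(scores: Sequence[float], has_condition: Sequence[bool]) -> float:
    """Area under the ROC curve via pairwise comparison of cases and non-cases."""
    if len(scores) != len(has_condition):
        raise ValueError("scores and labels must have the same length")
    cases = [s for s, y in zip(scores, has_condition) if y]
    controls = [s for s, y in zip(scores, has_condition) if not y]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5  # ties contribute half
    return wins / (len(cases) * len(controls))


if __name__ == "__main__":
    # Hypothetical scores and biopsy labels, for illustration only.
    print(auc([0.1, 0.4, 0.35, 0.8], [False, False, True, True]))
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of a few hundred subjects; a rank-based formulation would be preferable for much larger datasets.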

The indeterminate area for each score was determined using published cut-offs: NFS (-1.455, 0.676) [27], FIB-4 (1.30, 2.67), APRI (0.5, 1.5) [29] and Hepascore (0.60, 0.80) [14]. An LSM cut-off of 8.0 kPa was used to rule out advanced fibrosis and > 12.0 kPa was used to rule in advanced fibrosis [32, 33]. The area between 8.0 kPa and 12.0 kPa was considered indeterminate. The proportion of F3–F4 patients falling within the indeterminate range was compared for each score.
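The dual cut-off classification can be sketched as below, using the cut-offs quoted in the text. How values lying exactly on a cut-off are handled is an assumption (here they are counted as indeterminate), and the function names are hypothetical.

```python
# Lower (rule-out) and upper (rule-in) cut-offs as quoted in the text.
CUTOFFS = {
    "NFS": (-1.455, 0.676),
    "FIB-4": (1.30, 2.67),
    "APRI": (0.5, 1.5),
    "Hepascore": (0.60, 0.80),
    "LSM": (8.0, 12.0),  # kPa
}


def classify(test: str, value: float) -> str:
    """Return 'rule out', 'rule in' or 'indeterminate' for advanced fibrosis.

    Assumption: values equal to either cut-off are treated as indeterminate.
    """
    low, high = CUTOFFS[test]
    if value < low:
        return "rule out"
    if value > high:
        return "rule in"
    return "indeterminate"


def indeterminate_proportion(test: str, values: list) -> float:
    """Fraction of results that fall in the indeterminate range."""
    if not values:
        raise ValueError("no values supplied")
    n_indeterminate = sum(classify(test, v) == "indeterminate" for v in values)
    return n_indeterminate / len(values)


if __name__ == "__main__":
    # Hypothetical Hepascore results, for illustration only.
    print(indeterminate_proportion("Hepascore", [0.5, 0.7, 0.9, 0.95]))
```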

All confidence intervals, significance tests, and resulting P values were two-sided, with an alpha level of 0.05. Statistical analyses were performed using STATA software, release 14.2 (Stata Corp LP, College Station, TX).

Results

Baseline Characteristics

From 2004 to 2018, a total of 946 patients were assessed for eligibility. Of these, 271 biopsy-proven NAFLD patients were included. All patients had data available for calculation of non-invasive serum fibrosis scores. Of the included patients, 131 had LSM by Fibroscan® available and 125 had reliable LSM measurements (Fig. 1). Of the entire cohort, 60% were female, and mean age was 52 (± 12) years, with subjects being predominantly obese (158/271, 58%), as outlined in Table 1. Advanced fibrosis was present in 83 subjects (31%), and 47 (17%) had cirrhosis on biopsy.

Fig. 1 Flow diagram of the participants in the study

Table 1 Baseline characteristics of subjects with biopsy proven NAFLD (n = 271)

Diagnostic Performance of Non-invasive Tests

Diagnostic characteristics in the entire cohort (n = 271) and in those with reliable LSM measures are shown in Table 2. For the entire cohort, DA for advanced fibrosis was 87.1 for Hepascore, 74.2 for FIB-4, 74.5 for NFS and 72.5 for APRI. Similarly, Hepascore (DA 85.7) performed slightly better than the other fibrosis models (FIB-4, DA 84.5; NFS, DA 81.6; and APRI, DA 83.4) for the prediction of liver cirrhosis. Hepascore > 0.8 also had the highest diagnostic odds ratio (DOR) for advanced fibrosis (43.7) in comparison to FIB-4 (4.2), NFS (4.5) and APRI (4.1).

Table 2 Diagnostic performance of non-invasive fibrosis models for advanced fibrosis or cirrhosis in non-alcoholic fatty liver disease

In the subset of patients with reliable LSM measurements, the DA for LSM was 83.2 and 90.5 for advanced fibrosis and cirrhosis, respectively. The DOR for these categories was 17.7 and 7.9, respectively.

Table 2 also describes the performance of the models at specific cut-offs. Using a cut-off of 0.60, Hepascore accurately ruled out the presence of advanced fibrosis (DA 87, DOR 22). FIB-4 < 1.30 (DA 76.3, DOR 4.5) and NFS < -1.455 (DA 74.5, DOR 1.55) gave similar DA, while APRI < 0.5 showed the lowest performance (DA 66.2, DOR 1.07).

Accuracy of Non-invasive Tests

The estimated AUCs of the non-invasive fibrosis models for the detection of advanced fibrosis and cirrhosis, and their comparison, are summarized in Table 3 and Figs. 2, 3 and 4. For the detection of advanced fibrosis in the entire cohort, the accuracy of Hepascore (AUC 0.88) was higher than that of FIB-4 (AUC 0.73), NFS (AUC 0.72) and APRI (AUC 0.69) (p < 0.001 for all). Additionally, in the subset of patients with reliable LSM measurements, Hepascore performance (AUC 0.88) was similar to LSM (AUC 0.80), with LSM being more accurate than APRI (AUC 0.71, p = 0.02), as shown in Supplementary Table 1. Hepascore (AUC 0.85) performed similarly to FIB-4 (AUC 0.77) but was more accurate than APRI (AUC 0.71, p = 0.01) and NFS (AUC 0.73, p = 0.01) for the detection of cirrhosis (Table 3). Similar accuracy for detecting cirrhosis was observed when comparing LSM and Hepascore (AUC 0.79 and 0.85, respectively).

Table 3 Accuracy for the diagnosis of significant, advanced fibrosis or cirrhosis of the non-invasive fibrosis models

Fig. 2 Area under the curves of non-invasive fibrosis models for advanced fibrosis in NAFLD (n = 271)

Fig. 3 Area under the curves of non-invasive fibrosis models for cirrhosis in NAFLD (n = 271)

Fig. 4 Area under the curves of LSM by Fibroscan® in NAFLD (n = 125). A Advanced fibrosis, B Cirrhosis

Indeterminate Areas of Non-invasive Tests

The lowest proportion of subjects falling within an indeterminate range was observed with Hepascore (4%) compared to FIB-4 (33%), NFS (48%) and APRI (31%), shown in Fig. 5. Additionally, in those patients with F3-F4 (n = 83), Hepascore also showed the lowest rate of indeterminate patients (5%) compared to FIB-4 (42%), NFS (36%) and APRI (44%). In the subset of F3-F4 and F4 subjects with reliable LSM measurement available, 22% and 13% fell within the indeterminate range, respectively.

Fig. 5 Proportion of F3-F4 patients within the indeterminate range for each non-invasive fibrosis score

Graphical abstract legend: Hepascore has higher accuracy and a lower indeterminate area for detecting advanced fibrosis than simple non-invasive fibrosis models. Hepascore also showed greater accuracy than Fibroscan® in obese individuals.

Accuracy of Non-invasive Tests According to Age, Sex and BMI

Overall, gender did not influence the predictive accuracy of the models for the diagnosis of advanced fibrosis or cirrhosis, whereas the accuracy of NFS was lower in those subjects aged ≥ 60 years (see Supplementary Tables 2 and 3). The accuracy of the other fibrosis models was not impacted by age.

We additionally compared the predictive accuracy of Hepascore and LSM by Fibroscan® according to BMI. In obese patients (defined by a BMI ≥ 30 kg/m²), the predictive accuracy of Hepascore (AUC 0.91) was superior to LSM by Fibroscan® (AUC 0.80) for the diagnosis of advanced fibrosis. In contrast, no differences in predictive accuracy were observed between Hepascore and LSM in non-obese patients with advanced fibrosis or liver cirrhosis (Table 4).

Table 4 Diagnostic accuracy of fibrosis models in NAFLD patients for Fibrosis stage ≥ 3 and F4 according to the presence of obesity

Discussion

The accurate diagnosis of liver fibrosis in NAFLD patients is crucial to identify the subset of individuals at higher risk of progression to end-stage liver disease, hepatic/extrahepatic complications and death. In this study, we found overall equivalent accuracy between Hepascore and Fibroscan® in the diagnosis of advanced fibrosis or cirrhosis, but superior accuracy of Hepascore in obese individuals. These two tests performed better than FIB-4, NFS and APRI. Furthermore, when classifying patients across the full spectrum of fibrosis, Hepascore had a lower proportion of patients falling into the indeterminate area compared with other non-invasive fibrosis models. Overall, the accuracy of NITs was not impacted by gender, although the accuracy of NFS was reduced in patients aged ≥ 60 years.

Although the use of NITs is becoming increasingly established in clinical practice, studies comparing more complex serum markers, such as Hepascore, with elastography techniques are limited. In this study, LSM by Fibroscan® and Hepascore provided the highest accuracy to diagnose advanced fibrosis, which is identified as an important endpoint in clinical practice. A multicentre French study of 452 biopsy-proven NAFLD patients found that LSM by Fibroscan® (AUC 0.83) performed significantly better than Hepascore (AUC 0.78) for the diagnosis of advanced fibrosis; however, both markers had equivalent accuracy for predicting cirrhosis (0.82 and 0.81, respectively) [18]. Similar to our results, both Hepascore and LSM had a higher diagnostic accuracy when compared to APRI, NFS and FIB-4 for the prediction of advanced fibrosis. A smaller study of 186 NAFLD patients also found that complex models, including the Enhanced Liver Fibrosis score and FibrometerV2G/V3G, had similar accuracy compared to Fibroscan® and superior accuracy compared to FIB-4 and NFS [34]. Thus, our study strengthens the evidence base demonstrating that complex serum markers are more accurate than NFS and FIB-4 and equivalent to Fibroscan® in the determination of fibrosis in NAFLD. Additional studies are required to examine the comparative cost-effectiveness and the optimal combination of tools for screening and diagnosis of fibrosis in NAFLD.

The proportion of patients with an unclassified NIT result in the setting of NAFLD is pertinent to the use of these tests, as such patients may require a liver biopsy to rule in or rule out advanced fibrosis. These models usually also have excellent negative predictive values to exclude advanced fibrosis but high rates of false positive results, thus limiting their accuracy to confirm the diagnosis [13]. Compared to the other fibrosis tests, Hepascore demonstrated the lowest rate of unclassified patients, less than 5% in the entire cohort and less than 7% for both groups of patients with advanced fibrosis and cirrhosis. In contrast, the proportion of patients with advanced fibrosis and cirrhosis falling within the indeterminate area for NFS, FIB-4 and APRI ranged from 31 to 48%. In this study, Hepascore (5% indeterminate) and LSM by Fibroscan® (22% indeterminate) provided the highest rate of well-classified patients for advanced fibrosis.

Recent large multi-centre studies have reported similar indeterminate areas of 50–51% for NFS and 43–48% for FIB-4 but with the lowest indeterminate area with LSM by Fibroscan® at 8% [35, 36]. Indeterminate area may vary according to advanced fibrosis prevalence, with higher fibrosis prevalence associated with a higher indeterminate area [37]. Our study showed a 31% prevalence of advanced fibrosis. In a recent multicentre multinational Asian study designed to optimize the diagnosis of advanced fibrosis by NITs in NAFLD, the indeterminate areas in patients with 24% prevalence of advanced fibrosis were 29% and 23% for NFS and FIB-4, respectively. The grey zones were significantly reduced with the use of either of these models followed by Fibroscan® [37] but only in the population with a low prevalence of advanced fibrosis. Further research is required to assess if the combination of Hepascore with Fibroscan® can further narrow the indeterminate zone in this specific population.

It is important to highlight the limitations of our study. Firstly, the small sample of patients with Fibroscan® available limited the ability to perform additional analyses comparing sequential NITs. Secondly, up to 20% of NAFLD patients may have their fibrosis stage misclassified on liver biopsy. Furthermore, as all patients included in the study were from a tertiary center and were required to be biopsy proven, we acknowledge the presence of selection bias, which tends to include subjects with specific indications for liver biopsy and more advanced liver fibrosis. Strengths of the study include the availability of all data for calculation of non-invasive scores at the time of liver biopsy and the small proportion of patients with unreliable LSM measures.

Screening patients with chronic liver disease risk factors in primary care for advanced fibrosis is needed to reduce the burden of liver disease in the community. Serum models such as FIB-4 are attractive first-line candidates but have recently been criticized for their limited accuracy [38]. The analytes of Hepascore are inexpensive to assay and can be routinely assessed in an automated fashion in any moderate-sized laboratory. Furthermore, the use of the algorithm is not covered by patent, and thus Hepascore represents an attractive ‘direct measure’ of fibrogenesis which could be further examined in the primary care setting.

In conclusion, in this study Hepascore was more accurate than simple serum fibrosis models in predicting advanced fibrosis in NAFLD. Hepascore and Fibroscan® had equivalent accuracy to detect advanced fibrosis or cirrhosis in NAFLD patients; however, Hepascore was superior in obese individuals. Hepascore and Fibroscan® also have a lower indeterminate range than simple serum models for the diagnosis of advanced fibrosis or cirrhosis. Further work validating their use in the primary care setting and incorporating them into sequential layered diagnostic algorithms offers promise to more accurately identify patients at risk of liver-related morbidity and mortality.