Performance of Noninvasive Liver Fibrosis Tests in Morbidly Obese Patients with Nonalcoholic Fatty Liver Disease

Background Nonalcoholic fatty liver disease (NAFLD) is highly prevalent in morbidly obese patients, and fibrosis is an independent predictor of mortality. Noninvasive tests (NITs) are being developed for the detection of advanced fibrosis (AF).
Purpose To assess the performance of three NITs (NAFLD fibrosis score [NFS], fibrosis-4 index [FIB-4], and aspartate aminotransferase-to-platelet ratio index [APRI]) in the identification of AF among morbidly obese patients.
Materials and Methods Patients who underwent bariatric surgery between 2004 and 2009 and had a liver biopsy were included. Fibrosis stages ≥ F2 and ≥ F3 were defined as significant fibrosis and AF, respectively. Published and optimal thresholds (Youden index) for NFS, FIB-4, and APRI were evaluated using sensitivity, specificity, positive and negative predictive values (PPV and NPV), and area under the receiver operating characteristic curve (AUROC).
Results Among 584 patients (mean age 43.3 ± 11.3 years, 21.2% male, 75% white, mean BMI 45.5 ± 8.80), 31.7% had NASH. Stage distributions were F1 = 68.1%, F2 = 16.4%, F3 = 8%, and F4 = 3.2%. At published thresholds, all 3 NITs performed poorly for detection of AF, with AUROC < 0.62. Overall performance at optimal thresholds improved to 0.68, 0.72, and 0.74 for NFS, FIB-4, and APRI, respectively. At optimal thresholds, all tests had good NPV (94.4–95.9%) but low PPV (24.2–32.5%). Combinations of the tests did not improve their performance.
Conclusions NFS, FIB-4, and APRI fall short in detecting advanced fibrosis but are valuable for excluding it. More research is needed to develop new NITs with high positive predictive value.
Electronic supplementary material The online version of this article (10.1007/s11695-020-04996-1) contains supplementary material, which is available to authorized users.


Introduction
Globally, nonalcoholic fatty liver disease (NAFLD) affects around one billion individuals [1]. In the USA, where NAFLD is the most common cause of chronic liver disease, the prevalence is estimated to be about 25%, and this is predicted to substantially increase by 2030 [2,3]. NAFLD is characterized by excessive accumulation of hepatic fat in the absence of secondary causes, such as excessive alcohol consumption, use of steatogenic medications, viral infections, and hereditary disorders [2]. The disease encompasses a broad spectrum of histological characteristics, from steatosis, or nonalcoholic fatty liver (NAFL), to nonalcoholic steatohepatitis (NASH). NASH is characterized by liver cell injuries such as hepatocyte ballooning, inflammation, and fibrosis of the liver, which may lead to cirrhosis and hepatocellular carcinoma (HCC) [4]. The clinical burden of NAFLD is not limited to liver-related morbidity; it also increases the risk of cardiovascular diseases, type 2 diabetes mellitus (T2DM), and chronic kidney disease [5].
In patients with NAFLD, the stage of fibrosis is an independent predictor of overall and liver-related mortality [6]. Therefore, determining the fibrosis stage is vital for treatment planning and accurate prognosis. Liver biopsy remains the gold standard for staging fibrosis in NAFLD [7], but it is invasive, expensive, and subject to sampling error and variability in microscopic interpretation [8]. Liver biopsy is also impractical for repeated assessment of disease progression. To address these shortcomings, alternative, noninvasive methods of assessing liver fibrosis have been developed. These methods are based either on imaging, such as ultrasound-based transient elastography or magnetic resonance elastography to measure liver stiffness [9], or on clinical prediction models and serum biomarkers [10]. In many clinics, noninvasive serologic surrogate markers are the most practical tests, especially for evaluating large numbers of patients [11]. Models designed to assess the severity of fibrosis include the NAFLD fibrosis score (NFS) [12], fibrosis-4 index (FIB-4) [13], and aspartate aminotransferase (AST)-to-platelet ratio index (APRI) [14]. These easily accessible tests reliably differentiate patients who have significant fibrosis (≥ F2 METAVIR stage) from those without significant fibrosis (F0 and F1 METAVIR stage) [15]. Because these noninvasive algorithms or tests (NITs) allow rapid assessment of large numbers of patients, they are used routinely in clinical practice and increasingly in clinical trials [16,17].
The main risk factors for developing NAFLD are central obesity, dyslipidemia, and T2DM/insulin resistance [18]. Therefore, NAFLD is highly prevalent in morbidly obese patients [2]. Obesity plays a role not only in the development of NAFLD but also in determining the severity of the disease [19]. Knowing the degree of liver fibrosis in patients with NAFLD and, especially, monitoring the progression of fibrosis in morbidly obese patients helps in making decisions about intervention and the need for surveillance for hepatocellular carcinoma in patients with cirrhosis [20].
In a study of 187 morbidly obese patients who underwent bariatric surgery, intraoperative liver biopsy revealed steatosis in 91.4% of patients, whereas preoperative ultrasound imaging had a sensitivity of only 49% and a specificity of 75% [21]. The severity of steatosis may affect the diagnostic performance of NITs in patients with NAFLD, underscoring the need for tools tailored to specific NAFLD subgroups [22]. Hence, noninvasive methods of assessing fibrosis in morbidly obese patients with NAFLD may be unreliable, just as they are insensitive in making the diagnosis of NAFLD. Further investigation is needed to establish the effectiveness of NITs in these patients. Therefore, the aim of this study was to assess the performance of three NITs in the identification of advanced fibrosis among morbidly obese patients.

Study Population
We prospectively enrolled 584 patients undergoing bariatric surgery at two hospitals within the Inova Health System. Participation in the study was offered to all adult patients who were scheduled for bariatric surgery between 2004 and 2009 on days when research staff were available. Patients were excluded if they had a known history of another chronic liver disease, such as viral hepatitis or alcohol-associated liver disease. After informed consent, clinical data were collected and serum was obtained and frozen. During bariatric surgery, liver biopsies were collected and assessed by a single expert hepato-pathologist. Fibrosis stage was determined on the basis of the Brunt classification. Briefly, stage 0 represented absence of fibrosis (F0), stage 1 represented perisinusoidal or portal fibrosis (F1), stage 2 represented perisinusoidal and portal or periportal fibrosis (F2), stage 3 represented septal and bridging fibrosis (F3), and stage 4 represented cirrhosis (F4). Fibrosis stages ≥ F2 and ≥ F3 were defined as significant and advanced fibrosis, respectively. The presence of NASH was determined by the accepted histologic criteria for diagnosis: presence of steatosis, lobular inflammation, and ballooning degeneration with or without Mallory-Denk bodies [7]. Subjects with hepatic steatosis but without lobular inflammation and ballooning were considered to have NAFL, which is also called non-NASH NAFLD. Fibrosis stage on liver biopsy was compared with the results of the NITs, which included NFS, FIB-4, and APRI.
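For orientation, the three scores can be computed from routine clinical values. The sketch below uses the commonly published formulas for FIB-4, APRI, and NFS. The text above does not state them, so the coefficients, the assumed AST upper limit of normal (40 U/L), the function names, and the example patient are assumptions for illustration, not the study's own implementation.

```python
import math


def fib4(age_years, ast_u_l, alt_u_l, platelets_10e9_l):
    """FIB-4 = (age x AST) / (platelets x sqrt(ALT))."""
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))


def apri(ast_u_l, platelets_10e9_l, ast_uln_u_l=40.0):
    """APRI = (AST / upper limit of normal) / platelets x 100.

    The default upper limit of normal (40 U/L) is an assumption;
    laboratories differ.
    """
    return (ast_u_l / ast_uln_u_l) / platelets_10e9_l * 100.0


def nfs(age_years, bmi, ifg_or_diabetes, ast_u_l, alt_u_l,
        platelets_10e9_l, albumin_g_dl):
    """NAFLD fibrosis score, using the commonly published coefficients."""
    return (-1.675
            + 0.037 * age_years
            + 0.094 * bmi
            + 1.13 * (1 if ifg_or_diabetes else 0)
            + 0.99 * (ast_u_l / alt_u_l)
            - 0.013 * platelets_10e9_l
            - 0.66 * albumin_g_dl)


# Hypothetical patient, not taken from the study data.
example = dict(age_years=43, bmi=45.5, ifg_or_diabetes=False,
               ast_u_l=30, alt_u_l=36, platelets_10e9_l=250,
               albumin_g_dl=4.0)

print(round(fib4(43, 30, 36, 250), 3))
print(round(apri(30, 250), 3))
print(round(nfs(**example), 3))
```

Note that BMI enters NFS directly, so a morbidly obese cohort would be expected to score higher on NFS than a general NAFLD population with otherwise similar laboratory values.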

Other Definitions
T2DM was defined by a fasting glucose level greater than or equal to 126 mg/dL, self-reported medical history of diabetes, and the use of oral hypoglycemic agents. Hypertension (HTN) was defined as a history of high blood pressure or a history of oral antihypertensive medication use. Hyperlipidemia (HL) was defined by either a serum cholesterol level greater than or equal to 200 mg/dL, a low-density lipoprotein (LDL) level greater than or equal to 130 mg/dL, a high-density lipoprotein cholesterol (HDL) level less than or equal to 40 mg/dL for men and 50 mg/dL for women, or a history of hyperlipidemia.

Statistical Analysis
Characteristics were compared across fibrosis stages using the Kruskal-Wallis test for continuous variables and the χ² test for categorical variables. The test for trend was performed by generalized linear models with a binomial distribution for binary variables and a gamma distribution for numerical variables, using the fibrosis stage as a continuous variable. Published thresholds for NFS, FIB-4, and APRI for the detection of advanced liver fibrosis (≥ F3) were evaluated using sensitivity, specificity, positive and negative predictive values (PPV and NPV), and area under the receiver operating characteristic curve (AUROC). The DeLong method was used for AUROC comparisons. We also evaluated the three noninvasive tests using the optimal threshold, defined as the value corresponding to the Youden index. All analyses were performed with the SAS statistical software, version 9.4 (SAS Institute Inc). Statistical significance was set at α = .05.
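The Youden-index step can be illustrated with a short sketch. The analysis itself was run in SAS; the Python below is only an illustration under stated assumptions: a score at or above the threshold counts as test-positive, ties in the index are resolved in favor of the lowest threshold, and the function names and toy data are hypothetical.

```python
def sens_spec(scores, has_af, threshold):
    """Sensitivity and specificity when score >= threshold is test-positive."""
    tp = fp = tn = fn = 0
    for score, diseased in zip(scores, has_af):
        positive = score >= threshold
        if positive and diseased:
            tp += 1
        elif positive and not diseased:
            fp += 1
        elif diseased:
            fn += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def youden_optimal_threshold(scores, has_af):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1."""
    best_threshold, best_j = None, -1.0
    for candidate in sorted(set(scores)):
        sensitivity, specificity = sens_spec(scores, has_af, candidate)
        j = sensitivity + specificity - 1.0
        if j > best_j:  # strict comparison keeps the lowest tied threshold
            best_threshold, best_j = candidate, j
    return best_threshold, best_j


# Toy data (hypothetical): score values and biopsy-proven advanced fibrosis.
toy_scores = [0.2, 0.4, 0.5, 0.7, 0.9, 1.1, 1.3, 1.6]
toy_has_af = [False, False, False, False, True, False, True, True]

threshold, j = youden_optimal_threshold(toy_scores, toy_has_af)
print(threshold, round(j, 3))
```

Because the Youden index weights sensitivity and specificity equally, the threshold it selects does not take disease prevalence into account, which matters when predictive values are read off the same cut-off.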

Results
Among 584 morbidly obese NAFLD patients (mean age 43.4 ± 11.3 years, 21.2% male, 75% white, mean body mass index (BMI) 45.5 ± 8.80), 31.7% had histologic NASH, 55.8% had HTN, and 35.3% had T2DM. Stage distributions were F1 = 68.1%, F2 = 16.4%, F3 = 8%, and F4 = 3.2%. Comparison of demographic and clinical parameters of morbidly obese NAFLD patients across fibrosis stages is shown in Table 1. Higher fibrosis stage was associated with older age, male sex, Hispanic ethnicity, and a greater likelihood of HTN and T2DM. ALT, AST, and glucose increased significantly with each stage of liver fibrosis, whereas platelet count decreased at higher stages. In contrast, BMI and albumin did not change significantly at higher stages. Even though the majority of morbidly obese NAFLD patients with advanced fibrosis had high NFS, FIB-4, and APRI scores, a considerable number of such patients had low values. The accuracy of FIB-4 and APRI, as measured by AUROC (Fig. 2), was significantly better in diagnosing advanced fibrosis than that of NFS (AUROC 0.76, 95% CI: 0.73-0.80 and AUROC 0.79, 95% CI: 0.76-0.82 vs. AUROC 0.69, 95% CI: 0.65-0.73) (Table 2). When the analysis was limited to patients with NASH, no difference in AUROC was observed among the three NITs.
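As a reminder of what the AUROC values above measure, the sketch below computes the AUROC as the probability that a randomly chosen patient with advanced fibrosis has a higher score than a randomly chosen patient without it, counting ties as one half. This is the standard rank-based interpretation; it is not the DeLong comparison used in the study, and the function name and toy data are hypothetical.

```python
def auroc(scores, has_af):
    """AUROC as the fraction of (diseased, non-diseased) pairs ranked correctly."""
    diseased = [s for s, d in zip(scores, has_af) if d]
    healthy = [s for s, d in zip(scores, has_af) if not d]
    concordant = 0.0
    for d_score in diseased:
        for h_score in healthy:
            if d_score > h_score:
                concordant += 1.0
            elif d_score == h_score:
                concordant += 0.5  # ties count as half a concordant pair
    return concordant / (len(diseased) * len(healthy))


# Toy data (hypothetical), not taken from the study cohort.
toy_scores = [0.2, 0.4, 0.5, 0.7, 0.9, 1.1, 1.3, 1.6]
toy_has_af = [False, False, False, False, True, False, True, True]

print(round(auroc(toy_scores, toy_has_af), 3))
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so values between 0.69 and 0.79 indicate moderate discrimination.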

Performance of NFS, FIB-4, and APRI Tests
The diagnostic accuracy of published and optimal thresholds for NITs is shown in Table 3. Using published rule-in thresholds for advanced fibrosis, all three tests performed relatively poorly, with an AUROC of < 0.62. The lower accuracy with published thresholds was likely due to relatively higher BMI, ALT, and platelet values among morbidly obese patients with NAFLD compared with those in the general population of NAFLD patients from the National Health and Nutrition Examination Survey (NHANES) data (Supplementary Table 2). This led to a lower median of FIB-4 and APRI scores for our study cohort and a higher median of NFS scores. Using the optimal thresholds based on the Youden index, the overall accuracy (AUROC) of all three tests improved to 0.68 (0.64-0.72), 0.72 (0.69-0.76), and 0.74 (0.70-0.77) for NFS, FIB-4, and APRI, respectively. An NFS of ≥ −0.682 to diagnose advanced fibrosis had a sensitivity of 75.8%, specificity of 60.2%, PPV of 24.2%, and NPV of 95.1%; a FIB-4 of ≥ 0.986 had a sensitivity of 60.6%, specificity of 84.2%, PPV of 32.5%, and NPV of 94.4%; and an APRI of ≥ 0.241 had a sensitivity of 75.2%, specificity of 72.2%, PPV of 25.3%, and NPV of 95.9%. The optimal thresholds for all three tests were excellent for ruling out advanced fibrosis, with high NPV (94.4-95.9%), but could not confirm advanced fibrosis because of very low PPV (24.2-32.5%). At 90% sensitivity, APRI had better specificity than FIB-4 and NFS (41.3% vs. 31.9% and 22.2%), and at 90% specificity, FIB-4 had better sensitivity than APRI and NFS (51.5% vs. 47.0% and 31.8%). Combinations of the tests did not improve their performance (data not shown).
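The combination of high NPV and low PPV follows largely from the low prevalence of advanced fibrosis in this cohort. The sketch below applies Bayes' rule to the reported FIB-4 sensitivity and specificity, taking the prevalence as F3 + F4 = 11.2% from the stage distribution. It is an approximation for illustration: the reported predictive values come from the study's actual patient counts, and the function name and the 50% comparison prevalence are hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# FIB-4 at its optimal threshold, as reported in the Results.
# Prevalence is assumed to be F3 + F4 = 8% + 3.2%.
ppv, npv = predictive_values(sensitivity=0.606, specificity=0.842,
                             prevalence=0.112)
print(round(ppv, 3), round(npv, 3))

# Same test characteristics at a hypothetical 50% prevalence, for contrast.
ppv_high, npv_high = predictive_values(0.606, 0.842, 0.50)
print(round(ppv_high, 3), round(npv_high, 3))
```

With these inputs the computed values are close to the reported FIB-4 PPV of 32.5% and NPV of 94.4%. At the hypothetical higher prevalence, the same sensitivity and specificity give a much higher PPV and a lower NPV, which is why these predictive values should not be transferred to populations with a different burden of advanced fibrosis.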

Discussion
This study investigated the performance of three NITs, NFS, FIB-4, and APRI, in the identification and staging of advanced fibrosis in morbidly obese patients with NAFLD. With previously published thresholds, all three NITs performed relatively poorly; however, the use of optimal thresholds improved their accuracy. Furthermore, all three tests had excellent NPV but poor PPV.
Bariatric surgery has been increasingly utilized for morbidly obese NAFLD patients, which provides a valuable database of "biopsy-proven" NAFLD and NASH cases [23]. In this population of morbidly obese patients with NAFLD, the prevalence of NASH was 31.7%, which is higher than that in the general NAFLD population, in which the prevalence of NASH is about 10% [4]; other studies suggest that about 10-40% of patients with silent NAFLD will develop NASH [5,20]. However, the risk of developing NASH increases with obesity and higher BMI; NASH rates of up to 65% have been reported in morbidly obese people with NAFLD [24-26]. In a patient population with morbid obesity and vitamin D deficiency who underwent gastric bypass surgery, the rate of NASH was as high as 72% [27]. However, the prevalence rates in some of these studies are likely biased because only patients with a suspicion of liver disease received a liver biopsy, owing to the invasiveness of the technique and the risk of complications [28]. Rates of NASH can also be difficult to estimate because of the scale of the NAFLD burden and the level of screening required to achieve accurate numbers [29].
NITs for liver fibrosis staging are a major benefit to patients with NAFLD. Given the high prevalence of NAFLD, with millions of people affected worldwide, the invasiveness of liver biopsy and its sampling errors make it impractical, especially for the periodic assessment required to monitor disease progression [28]. In this study, patients with morbid obesity who underwent bariatric surgery and had protocol-driven liver biopsy were reviewed. NFS had 66.8% sensitivity and 87.5% specificity (no summary AUROC was provided because too few studies were included) [17]. However, these studies included general populations of NAFLD patients, not a select population of morbidly obese patients. Here, we report that the optimized AUROCs for detecting advanced fibrosis were 0.68, 0.72, and 0.74 for NFS, FIB-4, and APRI, respectively. Because the sensitivity and specificity of these NITs were only moderate, our findings also suggest that the utility of these markers needs further evaluation in this specific group of patients. APRI is the simplest test used in this study. In 111 patients with a histological diagnosis of NAFLD, APRI had an AUROC of 0.85 with an optimal cut-off of 0.98, giving a sensitivity of 75% and a specificity of 86% for detecting advanced fibrosis [32]. A meta-analysis [17] with an APRI threshold of 1.0 found a sensitivity of 50.0% and specificity of 84.0%, while a 1.5 threshold had 18.3% sensitivity and 96.1% specificity for advanced fibrosis. In our study, the AUROC for APRI was 0.74 after optimization, with a cut-off of 0.24, giving a sensitivity of 75.8% and a specificity of 72.2%. However, at a 1.5 threshold, the sensitivity decreased to 3.0% while the specificity rose to 100%. Similar results were found with the NFS and FIB-4 analysis [33].
For FIB-4, previously established thresholds had shown that a score ≥ 2.67 had an 80% positive predictive value and a score ≤ 1.30 had a 90% negative predictive value; meta-analysis suggested that a FIB-4 threshold of 2.67 had a sensitivity of 26.6% and a specificity of 96.5%, and a cut-off of 3.25 had a sensitivity of 31.8% and a specificity of 96.0% for advanced fibrosis [17]. The optimized FIB-4 threshold in our study was a cut-off of 0.99, which provided 60.6% sensitivity and 84.2% specificity. A meta-analysis for NFS used a cut-off of −1.455, which provided 72% sensitivity and 70% specificity [17], while our study found that the optimized cut-off was −0.682, giving a sensitivity of 75.8% and a specificity of 60.2%.
In each test investigated here, the NPV was much better than the PPV. This difference has also been found in other studies [34] and suggests that these tests are more effective at ruling out advanced fibrosis than at identifying it. This is useful for selecting patients for liver biopsy, for deciding which patients need to be referred to a specialized liver clinic, and for reassuring patients and providers that, in the absence of advanced fibrosis, decompensated cirrhosis is unlikely to develop in the near future. This information may be especially important for morbidly obese patients, for whom minimizing the number of invasive procedures is important because of their increased risk of complications and the technical difficulties with liver biopsies. Also, any abdominal surgery carries some post-operative risk in patients with advanced fibrosis and cirrhosis. In this context, ruling out advanced hepatic fibrosis with a simple NIT could provide assurance that patients at increased risk after bariatric surgery are not included.
In addition to ruling out advanced fibrosis before bariatric surgery, ruling in advanced fibrosis prior to bariatric surgery may be desirable in some instances. For example, when choosing the type of bariatric surgery (malabsorptive vs. restrictive), documentation of portal hypertension may be of value to the surgical team. In this context, the simple tests evaluated here may need to be performed in conjunction with more complex analyses to provide a more accurate estimate of the degree of fibrosis. A prospective study of 123 morbidly obese patients who underwent metabolic surgery [35] found that a liver stiffness measurement of > 7 kPa on transient elastography (TE) and an APRI of > 0.40 were independent factors associated with advanced fibrosis. A meta-analysis [36] indicated that TE alone was good for diagnosing advanced fibrosis, with 85% sensitivity and 82% specificity, with previously documented caveats in obese individuals [37,38]. In addition to TE, MR elastography (MRE) could improve the accuracy and PPV of these tests for advanced fibrosis. Therefore, more data are needed to advance the field of NITs in NASH.
The current study has some limitations. It was conducted in one clinical center, and as a retrospective study, it likely has a bias in the selection of patients. Thus, the results need to be validated in prospective studies with larger numbers of morbidly obese patients from multiple centers.
In summary, although NITs, such as NFS, FIB-4, and APRI, are increasingly being used, it is important to understand the context of use and utility of these tests. Currently, these 3 NITs in bariatric patients have excellent NPV and are accurately able to exclude advanced fibrosis. In contrast, PPV for advanced fibrosis is poor. Therefore, there is an urgent need to develop both sensitive and specific NITs which are independently validated to assess the degree of liver fibrosis in morbidly obese patients with NAFLD.