Introduction

Esophageal varices (EV) are a potentially lethal complication of liver cirrhosis. The prevalence of EV in cirrhotic patients is approximately 60–80%, and the 1-year rate of first variceal hemorrhage is approximately 12% [5% for low-risk varices and 15% for high-risk varices (HRV)] [1,2,3]. The mortality of variceal bleeding ranges from 15 to 55% [4]. In patients with established cirrhosis, esophagogastroduodenoscopy (EGD) is usually recommended to identify those at risk of bleeding who should receive prophylactic treatment [5]. However, EGD is a costly, invasive procedure that is not free of risks. More importantly, a large proportion of patients undergoing EGD screening, particularly those with compensated advanced chronic liver disease (cACLD), have no EV or only low-risk varices [6], so many of the EGDs performed under current guidelines are unnecessary. Hence, there is a critical need for a noninvasive method that identifies HRV and spares unnecessary EGD screening.

Over the past few years, non-invasive tests (NITs) have been established to triage patients and spare EGDs. Elastography, especially transient elastography (TE), is the most widely used noninvasive method for screening HRV. The ability of liver stiffness measurement (LSM) and spleen stiffness measurement (SSM) by TE to identify HRV has been widely evaluated. According to the Baveno VI consensus, patients with LSM < 20 kPa and platelet count (PLT) > 150 × 10⁹/l can avoid EGD screening for HRV [7]. More recently, some studies have suggested that SSM alone or together with LSM can avoid more EGDs [8,9,10]. Although these methods diagnose HRV with high accuracy, they have not been widely adopted in most liver centers because they rely on data that may not be readily available. TE is the most common elastography technology, but it is still not available in all liver units, especially in developing countries. Therefore, an easy-to-use model to screen for HRV and spare EGDs would be of great value in daily clinical work. Several easy-to-use models have been established: Calvaruso et al. developed a model based on PLT and albumin (ALB) [11], and Jangouk et al. reported a model based on PLT and MELD [12].

In this study, we developed and validated an easy-to-use risk scoring system based on routine laboratory tests and routine liver Doppler ultrasonography to identify HRV and spare EGDs in patients with hepatitis B virus (HBV)-related cACLD. We also validated several previously reported easy-to-use models for screening HRV.

Methods

Patients

This single-center retrospective study was approved by our institutional ethics committee, and the need for written informed consent was waived. Patients from the outpatient department were consecutively enrolled. The training cohort included patients from July 2016 to March 2019. The inclusion criteria were as follows: persistence of serum hepatitis B surface antigen (HBsAg) for > 6 months and anti-viral experience; age ≥ 18 years; an interval between the ultrasound and EGD examinations of no more than 6 months; and an ultrasound examination suggesting cACLD. Patients were considered to have cACLD when ultrasonography showed surface nodular irregularity and marginal blunting [13]. The exclusion criteria were as follows: decompensation events such as severe jaundice, ascites or hepatic encephalopathy; nonselective beta-blocker treatment; variceal bleeding; esophageal variceal ligation; portal vein or splenic vein thrombosis; splenectomy; splenic embolization; transjugular intrahepatic portosystemic shunt; hepatocellular carcinoma; or a history of liver transplantation. The validation cohort included patients from April 2019 to November 2019, with the same inclusion and exclusion criteria as the training cohort. In total, 584 HBV-related cACLD patients with routine laboratory, EGD and ultrasound results were collected. Eight patients with a history of splenectomy or splenic embolization, 13 patients with hepatocellular carcinoma, 187 patients with a history of esophageal variceal ligation or variceal bleeding and 42 patients with ascites were excluded. Finally, 334 patients with HBV-related cACLD were eligible for analysis; 221 patients were enrolled in the training cohort, and 113 patients were enrolled in the validation cohort (Fig. 1).

Fig. 1 Flowchart of patient selection and design of the study

Doppler Ultrasonography

Ultrasound examination was performed using a GE Logiq E9 system with a convex probe operating at a frequency of 3.75 MHz. All patients fasted for 8 h before the examination and were examined in the supine position during quiet respiration. Portal vein diameter (PVD) was measured in the mid-portion of the portal vein, 2 cm proximal to the portal bifurcation. Splenic vein diameter (SVD) was measured at the hilum of the spleen. The angle between the long axis of the vein and the Doppler beam was < 60 degrees.

Esophagogastroduodenoscopy

EGD examination was performed by experienced endoscopy operators with a CV-290 gastroscope (Olympus Optical Co., Tokyo, Japan). The endoscopic findings were recorded and graded as follows: grade I: varices were flattened by insufflation; grade II: varices were non-confluent and protruding in the lumen despite insufflation; grade III: confluent varices were not flattened by insufflation. The presence of red signs was also recorded in all patients. According to the criteria proposed at the Baveno VI consensus, the stages of varices were re-evaluated by one experienced endoscopist. Low-risk varices were defined as grade I varices without red signs. HRV was defined as grade I varices with red signs or ≥ grade II varices [7].
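The grading rule above can be written as a small function. This is a minimal sketch for illustration only, not code used in the study; the names `VaricesRisk` and `classify_varices`, and the use of grade 0 to encode "no varices", are assumptions introduced here.

```python
from enum import Enum


class VaricesRisk(Enum):
    NONE = "no varices"
    LOW = "low-risk varices"
    HIGH = "high-risk varices"


def classify_varices(grade: int, red_signs: bool) -> VaricesRisk:
    """Map an endoscopic grade and red-sign status to a risk category.

    grade: 0 (assumed encoding for no varices), 1, 2 or 3.
    red_signs: True if red signs were recorded at endoscopy.
    """
    if grade not in (0, 1, 2, 3):
        raise ValueError("grade must be 0, 1, 2 or 3")
    if grade == 0:
        return VaricesRisk.NONE
    # HRV: grade I with red signs, or grade II and above.
    if grade >= 2 or red_signs:
        return VaricesRisk.HIGH
    # Remaining case: grade I without red signs.
    return VaricesRisk.LOW
```

For example, `classify_varices(1, True)` returns `VaricesRisk.HIGH`, while `classify_varices(1, False)` returns `VaricesRisk.LOW`.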

Statistical Analysis

Characteristics are shown as median and interquartile range. Quantitative variables were compared with the Mann-Whitney test. Receiver-operating characteristic (ROC) curve analysis was used to evaluate the performance of noninvasive parameters for predicting HRV. The optimal cutoff values were determined by the maximum Youden index, and the corresponding sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (+ LR) and negative likelihood ratio (− LR) were calculated. The areas under the receiver-operating characteristic curves (AUROCs) of noninvasive parameters for predicting HRV were compared using DeLong’s test. Characteristics were first analyzed with univariate testing, and those with P < 0.05 were subsequently included in multivariable logistic regression analysis. A non-significant Hosmer-Lemeshow test (P > 0.05) was taken to indicate a good model fit. The HRV miss rate was defined as the number of patients with missed HRV divided by the number of patients who were spared EGD screening. The decision threshold for triaging patients to spare EGDs was NPV > 95%. Data were analyzed with SPSS software (version 23.0, SPSS), Medcalc (version 12.5.0.0, Medcalc) and GraphPad (Version 8.2.1). A statistically significant difference was defined as P < 0.05.
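The accuracy measures and the HRV miss rate defined above all follow from a 2 × 2 table of test result against HRV status. The sketch below is illustrative and was not part of the study's analysis, which used SPSS, Medcalc and GraphPad. The names `Confusion` and `diagnostic_metrics` are hypothetical, and the example counts are back-calculated here from the training-cohort percentages reported in the Results; they are an inference, not figures stated by the authors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tp: int  # HRV present, score at or above cutoff (EGD performed)
    fn: int  # HRV present, score below cutoff (EGD spared, HRV missed)
    fp: int  # no HRV, score at or above cutoff
    tn: int  # no HRV, score below cutoff (EGD correctly spared)


def diagnostic_metrics(c: Confusion) -> dict:
    """Return the accuracy measures named in the Statistical Analysis section.

    No guard against empty rows or columns is included; a zero denominator
    raises ZeroDivisionError.
    """
    total = c.tp + c.fn + c.fp + c.tn
    spared = c.tn + c.fn  # everyone below the cutoff is spared EGD
    sens = c.tp / (c.tp + c.fn)
    spec = c.tn / (c.tn + c.fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / spared,
        "lr_pos": sens / (1 - spec),
        "lr_neg": (1 - sens) / spec,
        "youden": sens + spec - 1,
        "spared_fraction": spared / total,
        # Missed HRV divided by patients spared EGD, as defined in the text.
        "hrv_miss_rate": c.fn / spared,
    }


# Assumed counts, inferred from the reported training-cohort results.
training = Confusion(tp=59, fn=6, fp=37, tn=119)
metrics = diagnostic_metrics(training)
```

Note that the HRV miss rate equals 1 − NPV, so the NPV > 95% decision threshold is the same requirement as a miss rate below 5%.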

Results

Training Cohort

The baseline characteristics of the training cohort are presented in Table 1. A total of 221 patients with HBV-related cACLD were included, most of whom were Child-Turcotte-Pugh A (94.1%). Of these patients, 122 (55.2%) had no varices, 34 (15.4%) had grade I varices without red signs and 65 (29.5%) had HRV. Table 2 shows the diagnostic performance of noninvasive parameters for predicting HRV in the training cohort. The AUROC of PLT was 0.83 (95% CI 0.77–0.88), which was not significantly different from that of albumin (ALB) (0.81, 95% CI 0.77–0.88) (P = 0.72), the aspartate aminotransferase-to-platelet ratio index (APRI) (0.81, 95% CI 0.75–0.86) (P = 0.31), fibrosis 4 score (FIB-4) (0.80, 95% CI 0.74–0.86) (P = 0.06) and model for end-stage liver disease (MELD) (0.77, 95% CI 0.71–0.84) (P = 0.13), but was better than that of PVD (0.72, 95% CI 0.65–0.80) (P = 0.02) and SVD (0.71, 95% CI 0.63–0.79) (P = 0.007).

Table 1 Baseline characteristics of the patients in training and validation cohorts
Table 2 Noninvasive characters and models for predicting the presentation of high-risk varices

APP Score Development

In the training cohort, univariate logistic regression analysis showed that bilirubin (P < 0.0001), ALB (P < 0.0001), platelet count (PLT) (P < 0.0001), prothrombin time (P < 0.0001), PVD (P < 0.0001), SVD (P < 0.0001), APRI (P < 0.0001), FIB-4 (P < 0.0001) and MELD (P < 0.0001) were associated with the presence of HRV. These variables were then entered into multivariable logistic regression analysis. ALB [OR 0.832; 95% confidence interval (CI) 0.773–0.895; P < 0.0001], PLT (OR 0.975; 95% CI 0.964–0.986; P < 0.0001) and PVD (OR 1.402; 95% CI 1.152–1.706; P = 0.001) were independent risk factors for the presence of HRV (Table 3). The following risk scoring system was established by multivariable logistic regression analysis: logit(P) = 4.780 + 0.338 × PVD − 0.025 × PLT − 0.184 × ALB, which was named the albumin-platelet-portal vein diameter varices risk score (APP score). The AUROC of the APP score was 0.90 (95% CI 0.86–0.94) for identifying HRV. There was also a good correlation between the predicted and the observed HRV with the APP score: R² = 0.95 with χ² = 7.09 (P = 0.53) (Hosmer-Lemeshow test). At an APP score cutoff of 0.24, the sensitivity, specificity, PPV, NPV, − LR and + LR for predicting HRV were 90.8%, 76.3%, 61.5%, 95.2%, 0.12 and 3.83, respectively (Table 4).
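The APP score formula can be evaluated in a few lines. The sketch below rests on two assumptions that this section does not state explicitly: that the APP score compared with the 0.24 cutoff is the predicted probability P obtained by inverting the logit, and that the inputs are in the usual reporting units (PVD in mm, PLT in 10⁹/l, ALB in g/l). The function names are hypothetical, and the code is an illustration, not a clinical tool.

```python
import math

# Coefficients copied from the regression equation in the text.
APP_INTERCEPT = 4.780
APP_COEF_PVD = 0.338
APP_COEF_PLT = -0.025
APP_COEF_ALB = -0.184
APP_CUTOFF = 0.24  # rule-out cutoff reported in the text


def app_logit(pvd_mm: float, plt_1e9_per_l: float, alb_g_per_l: float) -> float:
    """Linear predictor: 4.780 + 0.338*PVD - 0.025*PLT - 0.184*ALB."""
    return (
        APP_INTERCEPT
        + APP_COEF_PVD * pvd_mm
        + APP_COEF_PLT * plt_1e9_per_l
        + APP_COEF_ALB * alb_g_per_l
    )


def app_score(pvd_mm: float, plt_1e9_per_l: float, alb_g_per_l: float) -> float:
    """Predicted probability of HRV (assumed to be the APP score)."""
    z = app_logit(pvd_mm, plt_1e9_per_l, alb_g_per_l)
    return 1.0 / (1.0 + math.exp(-z))


def can_spare_egd(pvd_mm: float, plt_1e9_per_l: float, alb_g_per_l: float) -> bool:
    """True when the score falls below the rule-out cutoff."""
    return app_score(pvd_mm, plt_1e9_per_l, alb_g_per_l) < APP_CUTOFF


# Hypothetical patients, not taken from the study data.
low_risk = can_spare_egd(pvd_mm=11, plt_1e9_per_l=180, alb_g_per_l=42)
high_risk = can_spare_egd(pvd_mm=14, plt_1e9_per_l=80, alb_g_per_l=34)
```

In the first hypothetical patient the logit is −3.73, giving a score well below 0.24, so EGD would be spared; in the second the logit is 1.256 and the score exceeds the cutoff. The signs of the coefficients match the reported odds ratios: a wider portal vein raises the score, while higher PLT and ALB lower it.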

Table 3 Factors associated with high-risk varices in training cohorts
Table 4 APP score for predicting high-risk varices in training cohorts and validation cohorts

Validation Cohort

The validation cohort enrolled 113 patients: 61 (54.0%) without varices, 17 (15.0%) with grade I varices without red signs and 35 (31.0%) with HRV. Patient characteristics did not differ between the training and validation cohorts except for the Child-Turcotte-Pugh score (P = 0.002): the validation cohort had a larger proportion of Child B patients (14.2%) than the training cohort (4.1%) (Table 1). The AUROC of the APP score was 0.88 for diagnosing HRV. At an APP score cutoff of 0.24, the sensitivity, specificity, PPV, NPV, − LR and + LR for predicting HRV were 97.1%, 74.4%, 63.0%, 98.3%, 0.04 and 3.79, respectively (Table 4).

Performance of the APP Score and Several Reported Criteria Based on Routine Laboratory Tests for Saving Unnecessary EGDs

Within the training cohort, 125 of 221 (56.6%) patients had an APP score < 0.24, with a 4.8% HRV miss rate. Among these 125 patients, 18 (14.4%) had low-risk varices. In the subgroup of patients with Child-Turcotte-Pugh A, 125 of 208 (60.1%) patients had an APP score < 0.24, with a 4.8% HRV miss rate. In the validation cohort, 59 of 113 (51.3%) patients had an APP score < 0.24, with a 1.7% HRV miss rate. Among these 59 patients, 7 (11.7%) had low-risk varices. In the subgroup of patients with Child-Turcotte-Pugh A, 58 of 97 (59.8%) patients had an APP score < 0.24, with a 1.7% HRV miss rate. To assess the ability of the APP score to save unnecessary EGDs, several previously reported criteria based on routine laboratory tests were analyzed in the training cohort. Seventy of 221 (31.7%) patients met the criteria of ALB > 36 g/l and PLT > 120 × 10⁹/l, with a 1.4% HRV miss rate. Among these 70 patients, 7 (10.0%) had low-risk varices. PLT > 150 × 10⁹/l or MELD = 6 spared 61 of 221 (27.6%) EGDs, with a 1.5% HRV miss rate. Among these 61 patients, 5 (8.2%) had low-risk varices. The number of EGDs avoided with the APP score was larger than with the previously reported criteria (P < 0.0001), with an equivalent HRV miss rate (P = 0.42) and low-risk varices miss rate (P = 0.75) (Table 5).

Table 5 Reliability of methods for ruling out high-risk varices

Discussion

In this study, we developed and validated an easy-to-use varices risk scoring system for screening HRV in HBV-related cACLD. This novel algorithm includes only ALB, PLT and PVD, which makes it easy to apply in daily clinical work. The APP score avoided 51.3–56.6% of EGDs for screening HRV with a < 5% HRV miss rate. Moreover, the APP score saved more EGDs than previously reported models based on routine laboratory tests.

To create an easy-to-use model for screening HRV, routine laboratory tests and ultrasound results were included. ALB, PLT and PVD were independent risk factors for the presence of HRV, and the APP score was established based on multivariable logistic regression analysis. ALB has been reported as an independent risk variable for the presence of varices by Imran et al. [14] and Bressler et al. [15]. Li et al. [16] reported that ALB together with liver volume could be a potential predictor of the presence of esophageal varices. These results indicate that ALB could be a potential variable for predicting HRV. PLT, combined with LSM by TE, has been recommended by the Baveno VI consensus as a crucial variable for screening HRV [7]. Chen et al. [17] and Dong et al. [18] reported that PLT was one of the predictors that identified patients who did not need EGD screening. PLT combined with spleen length has also been confirmed to safely rule out HRV [19]. Portal hypertension is the driver of variceal development, and the diameter of the portal vein increases as portal hypertension develops [20]. These results suggest that the APP score could be a potential model for screening HRV.

To assess the ability of the APP score to screen for HRV, previously reported criteria based on routine laboratory tests were validated in this study. Calvaruso et al. found that ALB > 36 g/l and PLT > 120 × 10⁹/l can avoid > 30% of EGDs for HRV screening in patients with HCV-related cirrhosis, which was similar to the expanded Baveno VI criteria (LSM < 25 kPa and PLT > 110 × 10⁹/l) [11]. Jangouk et al. reported that PLT > 150 × 10⁹/l or MELD = 6 can save 30–54% of EGDs for screening HRV in patients with cACLD [12]. Tosetti et al. validated that PLT > 150 × 10⁹/l or MELD = 6 can avoid 39% of EGDs for screening varices needing treatment in patients with HBV-related cACLD [21]. The data required by these models are available in daily clinical work. In our study, these criteria avoided 31.7% and 27.6% of EGDs for screening HRV, respectively. The APP score saved more EGDs than these criteria, with a similar HRV miss rate and low-risk varices miss rate. These results suggest that an APP score < 0.24 can save more EGD screening without increasing the risk of missing HRV in patients with HBV-related cACLD. Chronic HBV infection is a main cause of cirrhosis. Among patients with HRV, PLT was significantly lower in those with chronic HBV or HCV infection than in those with other etiologies of liver disease [22]. An APP score < 0.24 shows great potential for screening HRV in patients with HBV-related cACLD. However, whether the APP score is useful in other etiologies, especially autoimmune liver disease, should be studied in further research.

The Baveno VI criteria (LSM < 20 kPa and PLT > 150 × 10⁹/l) are the most widely accepted noninvasive model for screening HRV [7]. Several large-sample studies have validated that the Baveno VI criteria can avoid 15–35% of EGDs in patients with cACLD [19, 23, 24]. Because the Baveno VI criteria avoid a relatively low number of EGDs, the expanded Baveno VI criteria (LSM < 25 kPa and PLT > 110 × 10⁹/l) were established to save more EGDs (30–50%) [24]; however, they were not safe in some settings, with a miss rate of > 5% for varices needing treatment (VNT) [23, 25]. In this study, the APP score saved > 50% of EGDs. Because TE data were lacking, the APP score could not be compared with LSM-based criteria for sparing EGDs.

Several limitations should be considered. First, this was a retrospective, single-center study, and the ultrasound and EGD examinations were done by several operators in daily clinical work. There could be some bias among operators, especially in ultrasound examination. Second, the APP score was built within HBV-related cACLD; whether this score works for other etiologies was not examined in this work. A prospective multicenter study with larger cohorts is needed to further validate this model. Third, we did not compare the APP score with the Baveno VI criteria, spleen stiffness or the platelet count/spleen diameter ratio (PSR) because LSM measured by TE and longitudinal spleen diameter were not available. Finally, because the patients were predominantly Child-Turcotte-Pugh A, this model may perform differently in non-Child-Turcotte-Pugh A patients.

In conclusion, we have developed and validated a scoring system that can avoid > 50% of EGDs with a low risk of missing HRV in patients with HBV-related cACLD. The APP score is based on routine laboratory tests and ultrasound examination, so it can be applied in daily clinical work. More studies are needed to validate whether the APP score is useful in other etiologies, especially autoimmune liver disease.