Radiomics analysis of contrast-enhanced CT for staging liver fibrosis: an update for image biomarker

Background To establish and validate a radiomics-based model for staging liver fibrosis on contrast-enhanced CT images. Materials and methods This retrospective study developed two radiomics-based models (R-score: radiomics signature; R-fibrosis: integrating radiomic and serum variables) in a training cohort of 332 patients (median age, 59 years; interquartile range, 51–67 years; 256 men) with biopsy-proven liver fibrosis who underwent contrast-enhanced CT between January 2017 and December 2020. Radiomic features were extracted from non-contrast, arterial and portal phase CT images and selected using least absolute shrinkage and selection operator (LASSO) logistic regression to differentiate stage F3–F4 from stage F0–F2. Optimal cutoffs to diagnose significant fibrosis (stage F2–F4), advanced fibrosis (stage F3–F4) and cirrhosis (stage F4) were determined by receiver operating characteristic curve analysis. Diagnostic performance was evaluated by area under the curve, Obuchowski index, calibration and decision curve analysis. An internal validation was conducted in 111 randomly assigned patients (median age, 58 years; interquartile range, 49–66 years; 89 men). Results In the validation cohort, R-score and R-fibrosis (Obuchowski index, 0.843 and 0.846, respectively) significantly outperformed the aspartate transaminase-to-platelet ratio index (APRI) (Obuchowski index, 0.651; p < .001) and the fibrosis-4 index (FIB-4) (Obuchowski index, 0.676; p < .001) for staging liver fibrosis. Using the cutoffs, R-fibrosis and R-score had a sensitivity range of 70–87%, a specificity range of 71–97% and an accuracy range of 82–86% in diagnosing significant fibrosis, advanced fibrosis and cirrhosis. Conclusion Radiomic analysis of contrast-enhanced CT images can achieve high diagnostic performance for staging liver fibrosis. Supplementary Information The online version contains supplementary material available at 10.1007/s12072-022-10326-7.


Background
Liver fibrosis is an important cause of morbidity and mortality in patients with chronic liver insults (e.g. viral hepatitis, alcohol and non-alcoholic fatty liver disease [NAFLD]), and complications mainly occur in advanced fibrosis [1]. Fibrosis staging is an essential step in the clinical assessment of patients with chronic liver disease to identify those who require treatment [2]. Liver biopsy is the current reference method for staging fibrosis, but it has limitations including invasiveness, sampling bias and interobserver variability [3][4][5][6]. Therefore, there is a need for noninvasive and accurate methods for staging liver fibrosis. The 2018 practice guidance of the American Association for the Study of Liver Diseases (AASLD) recommended multiphase CT or MRI for initial diagnostic testing in at-risk patients with abnormal surveillance test results [7]. Compared with MRI, CT offers advantages including low cost, fewer contraindications, nearly ubiquitous availability and whole-organ imaging capacity [8]. To date, several studies have evaluated the ability of contrast-enhanced CT imaging to determine the severity of liver fibrosis [8][9][10]. However, the sample sizes of these studies may not have been sufficient for the development and validation of models.
In the era of personalized medicine, radiomics has allowed a large number of quantitative features to be extracted from images, providing information on shape, signal intensity and texture [11,12]. Our previous study established and validated a radiomics-based model on non-contrast CT for the prediction of cirrhosis in patients with hepatitis B virus (HBV) infection [13]. We hypothesized that a model based on radiomic features extracted from contrast-enhanced CT images may improve the staging of liver fibrosis. Therefore, the aim of this study was to develop and validate a radiomics model for the prediction of liver fibrosis using contrast-enhanced CT of the liver.

Materials and methods
This retrospective study was approved by the institutional review board of our institution, and the requirement for written informed consent was waived.

Patients
Among the 1779 consecutive patients who underwent abdominal contrast-enhanced CT at our institution between January 2017 and December 2020, patients over 18 years who had available pathologic records of liver fibrosis within 3 months of liver images at 1.5 mm thickness were retrospectively reviewed. Of 927 eligible patients, 484 were excluded due to conditions that may interfere with the extraction of radiomic features of their nontumorous right hepatic lobes, including large (≥ 10 cm) or multiple (≥ 5) hepatic masses (n = 276), a tumor thrombus in the portal vein larger than the segmental branch (n = 63), bile duct obstruction (n = 29), previous surgical resection on the right hepatic lobe (n = 28), poor image quality because of metal or respiratory motion artifacts (n = 60) and incomplete clinical data (n = 28).
A total of 443 patients, including 345 men (median age, 56 years; age range, 28-86 years) and 98 women (median age, 61 years; age range, 40-84 years), were finally included in the study cohort. This cohort was randomized in a three-to-one ratio into training and validation cohorts, respectively, using computer-generated random numbers without matching of any patient characteristics. Accordingly, 332 patients (median age, 59 years; interquartile range, 51–67 years; 256 men) were in the training cohort and 111 patients (median age, 58 years; interquartile range, 49–66 years; 89 men) were in the validation cohort. The flow diagram for the study population is shown in Fig. 1, and Table 1 shows the demographic and clinical characteristics of the cohorts. The median interval between CT images and pathologic evaluation was 15 days ± 18 (standard deviation; range, 1-76 days).
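The three-to-one random allocation can be illustrated with a short sketch. It is a minimal example under stated assumptions: the function name `split_three_to_one`, the seed value and the use of integer patient identifiers are hypothetical, and the study's actual random number generator is not described in the text.

```python
import random

def split_three_to_one(patient_ids, seed=0):
    """Shuffle the IDs reproducibly and return (training, validation) in a 3:1 ratio."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)   # seeded generator, no stratification or matching
    n_train = round(len(ids) * 3 / 4)  # 443 patients -> 332 in training
    return ids[:n_train], ids[n_train:]

# Hypothetical patient identifiers 0..442 standing in for the 443 included patients.
train_ids, val_ids = split_three_to_one(range(443), seed=2020)
print(len(train_ids), len(val_ids))  # 332 111
```

With 443 patients, a 3:1 split rounds to the 332 and 111 patients reported for the training and validation cohorts.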

Reference standard for liver fibrosis
Liver pathologic examination served as the reference standard for staging liver fibrosis. Liver specimens were obtained by liver resection (n = 308); liver transplantation (n = 32); or percutaneous liver biopsy (n = 103) (Table 1), which were histologically analyzed by two pathologists in consensus. Fibrosis stage was determined according to the Metavir scoring system [14], as follows: F0, no fibrosis; F1, portal fibrosis without septa; F2, portal fibrosis with rare septa; F3, numerous septa without cirrhosis; F4, cirrhosis. F ≥ 2 was considered as significant fibrosis and F ≥ 3 as advanced fibrosis.

Serum fibrosis tests
The aspartate aminotransferase-to-platelet ratio index (APRI) and the fibrosis-4 index (FIB-4) were calculated as previously described [15,16], respectively. These indices were calculated using the results of laboratory tests performed within 26 days ± 13 (range 3-76 days) of the pathologic examination of the liver.
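For readers unfamiliar with the two serum indices, the sketch below uses their commonly published definitions, since the formulas themselves are not reproduced in the text above. The function and parameter names are illustrative, and the upper limit of normal for AST (40 U/L in the example) is an assumption that varies between laboratories.

```python
import math

def apri(ast_u_per_l, ast_upper_limit_u_per_l, platelets_1e9_per_l):
    """APRI = (AST / upper limit of normal AST) / platelet count (10^9/L) x 100."""
    return (ast_u_per_l / ast_upper_limit_u_per_l) / platelets_1e9_per_l * 100.0

def fib4(age_years, ast_u_per_l, alt_u_per_l, platelets_1e9_per_l):
    """FIB-4 = (age x AST) / (platelet count (10^9/L) x sqrt(ALT))."""
    return (age_years * ast_u_per_l) / (platelets_1e9_per_l * math.sqrt(alt_u_per_l))

# Hypothetical patient: 60 years, AST 80 U/L, ALT 64 U/L, platelets 100 x 10^9/L.
print(apri(80, 40, 100))     # 2.0 (assumes an AST upper limit of 40 U/L)
print(fib4(60, 80, 64, 100)) # 6.0
```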

CT image acquisition
Contrast-enhanced CT scans were acquired in the axial plane with 0.75-1.5-mm-thick sections and a 0.75-1.5-mm reconstruction interval. Image acquisition parameters are detailed in Appendix E1 (online resource).

Radiomic feature extraction and selection
One reader (S.N.T., with 7 years of clinical experience in abdominal radiology) selected regions of interest (ROIs) in the liver of all patients. ROIs were delineated along the margin of the right hepatic lobe at the level of the right portal vein, excluding large hepatic vessels and masses, on non-contrast (mean ROI area, 48 cm² ± 16; range, 17-108 cm²), arterial phase (mean ROI area, 48 cm² ± 18; range, 16-108 cm²) and portal venous phase (mean ROI area, 51 cm² ± 18; range, 16-109 cm²) CT images using 3D Slicer (version 4.11.1; http://www.slicer.org) (Fig. 2). To explore the stability of each feature, 30 patients were randomly chosen; reader 1 repeated the image segmentation twice and reader 2 independently performed segmentation to evaluate intra- and interobserver reproducibility. Reproducibility was quantified by the intraclass correlation coefficient (ICC).
Image preprocessing and feature extraction were performed using the open-source Pyradiomics package (version 2.2.0; http://www.radiomics.io/pyradiomics.html). The voxel spacing was resampled to 1 × 1 × 1 mm and voxel intensity values were discretized with a bin width of 25 HU, to reduce the interference of image noise and to normalize intensities [17]. We extracted 837 radiomic features (18 first-order statistics, 75 textural features and 744 wavelet-transformed features) from each two-dimensional segmentation, giving a total of 2511 features per patient across the three CT phases (non-contrast, arterial and portal venous). Feature values were standardized as z-scores; the mean and standard deviation determined in the training cohort were applied to the validation cohort.
A three-step procedure was followed to select significant radiomic features. First, the reliability of each feature was quantified using the ICC, and features with an ICC greater than 0.9 were kept for further analysis [18]. Second, features that correlated only weakly with fibrosis stage were removed: the correlation between each radiomic feature and Metavir fibrosis stage was evaluated using the Kendall correlation coefficient, and features with correlation coefficients less than 0.15 were eliminated. Third, the least absolute shrinkage and selection operator (LASSO) logistic regression algorithm, with the penalty parameter tuned by tenfold cross-validation [19], was applied to discriminate stages F0-F2 from stages F3-F4; features with nonzero coefficients were considered independently related to fibrosis stage.
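The standardization step, in which training-cohort statistics are reused for the validation cohort, can be sketched as follows. This is a minimal illustration for a single feature; the helper names are hypothetical, and the use of the sample standard deviation (rather than the population form) is an assumption not stated in the text.

```python
from statistics import mean, stdev

def fit_zscore(train_values):
    """Return the mean and sample standard deviation of one feature in the training cohort."""
    return mean(train_values), stdev(train_values)

def apply_zscore(values, mu, sigma):
    """Standardize values with statistics that were fitted on the training cohort only."""
    return [(v - mu) / sigma for v in values]

# Hypothetical values of one radiomic feature.
train_feature = [10.0, 20.0, 30.0]
val_feature = [20.0, 40.0]

mu, sigma = fit_zscore(train_feature)            # mean 20.0, SD 10.0
print(apply_zscore(train_feature, mu, sigma))    # [-1.0, 0.0, 1.0]
print(apply_zscore(val_feature, mu, sigma))      # [0.0, 2.0]
```

Reusing the training mean and standard deviation keeps information from the validation cohort out of model development.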
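The first two screening steps can be sketched in a few lines. The example below is illustrative only: it assumes the tau-b variant of the Kendall coefficient and applies the 0.15 threshold to its absolute value, neither of which is specified in the text, and the feature names and values are hypothetical. The final LASSO step is omitted because it would normally rely on a statistics library.

```python
import math

def kendall_tau_b(x, y):
    """Kendall tau-b from pairwise comparisons (O(n^2), adequate for small samples)."""
    concordant = discordant = ties_x = ties_y = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue                 # tied in both variables
            elif dx == 0:
                ties_x += 1              # tied in x only
            elif dy == 0:
                ties_y += 1              # tied in y only
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = math.sqrt((untied + ties_x) * (untied + ties_y))
    return (concordant - discordant) / denom if denom else 0.0

def screen_features(features, icc, stages, icc_min=0.9, tau_min=0.15):
    """Keep features with ICC > icc_min and |tau-b| >= tau_min against fibrosis stage."""
    kept = []
    for name, values in features.items():
        if icc[name] <= icc_min:
            continue                     # step 1: insufficient reproducibility
        if abs(kendall_tau_b(values, stages)) < tau_min:
            continue                     # step 2: weak correlation with stage
        kept.append(name)
    return kept

# Hypothetical data: six patients with Metavir stages and three candidate features.
stages = [0, 1, 2, 3, 4, 4]
features = {
    "f_monotone": [1, 2, 3, 4, 5, 6],
    "f_unstable": [1, 2, 3, 4, 5, 6],
    "f_unrelated": [2, 1, 1, 2, 2, 1],
}
icc = {"f_monotone": 0.95, "f_unstable": 0.80, "f_unrelated": 0.97}
print(screen_features(features, icc, stages))  # ['f_monotone']
```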

Clinical factors selection
We devised a three-step procedure for the selection of clinical factors. First, we used Kendall correlation analysis to screen for factors significantly correlated with fibrosis stage (p < 0.05). Second, forward conditional multivariable logistic regression was used to select factors for discriminating stages F0-F2 from stages F3-F4 (entry and removal p values, 0.05 and 0.1, respectively). Third, the variance inflation factor (VIF) was calculated to check for collinearity among the variables included in the regression equations [20]. Variables with a VIF greater than 10 (indicating multicollinearity) were excluded.
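As an illustration of the collinearity check, the VIF of predictor j equals 1 / (1 − R_j²), which is also the j-th diagonal element of the inverse of the predictors' correlation matrix. The sketch below uses that identity in plain Python; the helper names and the laboratory values are hypothetical, and it is not the function used in the study.

```python
from statistics import mean

def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long, non-constant columns."""
    centered = []
    for col in columns:
        mu = mean(col)
        centered.append([v - mu for v in col])
    norms = [sum(v * v for v in col) ** 0.5 for col in centered]
    k = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
             for j in range(k)] for i in range(k)]

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def variance_inflation_factors(columns):
    """VIF_j is the j-th diagonal element of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]

# Hypothetical laboratory values for six patients (PLT, ALB, TC).
plt_count = [210, 180, 150, 120, 95, 80]
alb = [45, 43, 41, 38, 35, 33]
tc = [4.9, 4.1, 4.6, 3.9, 3.2, 3.6]
for name, vif in zip(["PLT", "ALB", "TC"], variance_inflation_factors([plt_count, alb, tc])):
    print(name, round(vif, 2), "exclude" if vif > 10 else "keep")
```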

Model establishment and validation
The radiomics signature for the prediction of fibrosis (R-score) was created using a support vector machine (SVM) as a multiclass classifier to distinguish among stages F0, F1, F2, F3 and F4. SVMs can be used for general regression and classification; the analysis was performed using the "e1071" package (https://CRAN.R-project.org/package=e1071) in R software (version 3.6.1; http://www.r-project.org). Multivariate linear regression analysis (Appendix E2 in online resource) was performed to establish a final model (R-fibrosis) based on the radiomics signature (R-score) and clinical factors for the prediction of fibrosis. The performance of the models was tested in the independent validation cohort using the equation derived from the training cohort.

Statistical analysis
Categorical and continuous variables were compared using the χ² test and the Mann-Whitney U test, respectively. The correlation between model outputs and pathologic liver fibrosis stage was evaluated using Spearman correlation analysis. The performance of the models for staging liver fibrosis was evaluated using receiver operating characteristic (ROC) curve analysis, the area under the curve (AUC) and the Obuchowski index, a multinomial version of ROC curve analysis adapted for ordinal references such as Metavir staging of liver fibrosis [21]. The Obuchowski index is a weighted average of the areas under the curve obtained for all possible pairs of fibrosis stages to be differentiated, and it estimates the probability that a test will correctly rank two randomly chosen patients with different stages of fibrosis. The optimal thresholds of the models were determined by ROC analysis, maximizing the Youden index. The DeLong nonparametric approach was used to compare AUC values [22]. Calibration curves were plotted to evaluate the calibration of the established models, accompanied by the Hosmer-Lemeshow test. Additionally, decision curve analysis (DCA) was performed to assess the clinical usefulness and net benefit of the developed radiomics models [23]. A two-sided p value less than 0.05 was considered to indicate a statistically significant difference.
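The idea behind the Obuchowski index and the Youden-based cutoff can be sketched as follows. This is a simplified illustration, not the estimator used in the study: pairwise AUCs are averaged with equal weights by default, whereas the published index uses stage-dependent weights and a penalty function that are not given in the text. The optional `weight` callable and all data are hypothetical.

```python
def pairwise_auc(lower_scores, higher_scores):
    """Probability that a higher-stage patient scores above a lower-stage one (ties count 0.5)."""
    wins = 0.0
    for low in lower_scores:
        for high in higher_scores:
            if high > low:
                wins += 1.0
            elif high == low:
                wins += 0.5
    return wins / (len(lower_scores) * len(higher_scores))

def obuchowski_style_index(scores, stages, weight=None):
    """Weighted average of pairwise AUCs over all pairs of distinct stages."""
    by_stage = {}
    for score, stage in zip(scores, stages):
        by_stage.setdefault(stage, []).append(score)
    levels = sorted(by_stage)
    numerator = denominator = 0.0
    for idx, a in enumerate(levels):
        for b in levels[idx + 1:]:
            w = 1.0 if weight is None else weight(a, b)  # equal weights are an assumption
            numerator += w * pairwise_auc(by_stage[a], by_stage[b])
            denominator += w
    return numerator / denominator

def youden_cutoff(scores, is_positive):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1 for 'score >= cutoff'."""
    n_pos = sum(is_positive)
    n_neg = len(is_positive) - n_pos
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        tp = sum(1 for s, p in zip(scores, is_positive) if p and s >= cut)
        tn = sum(1 for s, p in zip(scores, is_positive) if not p and s < cut)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j

# Hypothetical model outputs and Metavir stages for eight patients.
scores = [0.1, 0.3, 0.2, 0.5, 0.4, 0.8, 0.7, 0.9]
stages = [0, 0, 1, 1, 2, 3, 3, 4]
print(obuchowski_style_index(scores, stages))               # 0.925
print(youden_cutoff(scores, [st >= 3 for st in stages]))    # (0.7, 1.0)
```

A weighting function such as `lambda a, b: abs(a - b)` could be passed to emphasize distant stage pairs; that choice is illustrative and not taken from the study.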

Characteristics of the study cohorts
The baseline characteristics of all patients are summarized in Table 1. There were no significant differences in clinical and pathological characteristics between the training and validation cohorts. No differences were found in rates of significant fibrosis (Training: 69

Fibrosis-related clinical factors
In the training cohort, platelet (PLT) count, γ-glutamyl transpeptidase (GGT), albumin (ALB), albumin-to-globulin ratio (A/G) and total cholesterol (TC) were identified as independent fibrosis predictors by multivariable logistic regression analysis (Table 2). The VIF of TC was 10.7 (greater than 10), indicating collinearity, so TC was excluded. Based on the Kendall correlation coefficients of GGT and PLT (0.15 and −0.292, respectively), the GGT-to-PLT ratio was included in the prediction model.

Radiomic feature selection and signature construction
Among 2084 radiomic features with high stability, 320 features significantly correlated with fibrosis stage were identified. Of these, 21 independent features with nonzero coefficients were finally selected by LASSO logistic regression (Fig. 3). A radiomic signature was constructed using the SVM algorithm. The SVM type was "eps-classification" with a radial basis kernel function. The values of gamma and epsilon were 0.045 and 0.1, respectively. The total number of support vectors was 153.

Development and validation of the prediction model
Calibration curves of R-fibrosis and R-score demonstrated good agreement between predicted and actual significant fibrosis, advanced fibrosis and cirrhosis in the validation cohort (Fig. 5). The Hosmer-Lemeshow test yielded a p value of > 0.05, suggesting no significant departure from a good fit. The decision curves for R-fibrosis, R-score, APRI and FIB-4 are presented in Fig. 5. R-fibrosis and R-score provided a higher net benefit than the other models and the simple strategies of treating all patients or no patients across the majority of the range of reasonable threshold probabilities in the validation cohort. No obvious differences in clinical benefit were found between R-fibrosis and R-score.
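The net benefit plotted in a decision curve can be illustrated with the standard definition: the true-positive fraction minus the false-positive fraction weighted by the odds of the threshold probability. The sketch below assumes that definition; the function names and the ten-patient example are hypothetical and do not reproduce the study's data.

```python
def net_benefit(predicted_positive, truly_positive, threshold_probability):
    """Net benefit = TP/n - FP/n * pt / (1 - pt) at threshold probability pt."""
    n = len(truly_positive)
    tp = sum(1 for p, t in zip(predicted_positive, truly_positive) if p and t)
    fp = sum(1 for p, t in zip(predicted_positive, truly_positive) if p and not t)
    odds = threshold_probability / (1.0 - threshold_probability)
    return tp / n - fp / n * odds

def net_benefit_treat_all(truly_positive, threshold_probability):
    """Reference strategy: every patient is treated as positive."""
    everyone = [True] * len(truly_positive)
    return net_benefit(everyone, truly_positive, threshold_probability)

# Hypothetical cohort of ten patients, six with the target fibrosis stage.
truth = [True] * 6 + [False] * 4
flagged = [True] * 5 + [False] + [True] + [False] * 3   # 5 true positives, 1 false positive

for pt in (0.2, 0.5):
    print(pt, net_benefit(flagged, truth, pt), net_benefit_treat_all(truth, pt))
```

The strategy of treating no patients has a net benefit of zero at every threshold, which is why it appears as a flat reference line in decision curves.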

Discussion
The aim of this study was to develop and validate radiomics-based models on contrast-enhanced CT for staging liver fibrosis. We found that radiomics analysis of contrast-enhanced CT allows for more accurate staging of liver fibrosis than the other models evaluated. R-fibrosis and R-score, developed with the training cohort data, predicted the stage of liver fibrosis in the validation cohort with AUCs of 0.84-0.90 and accuracies of 82-86%. In agreement with our hypothesis, the radiomics models (Obuchowski index, 0.84-0.85) outperformed conventional serum indices (Obuchowski index, 0.65-0.68).
There are various less-invasive methods for staging liver fibrosis, including serological markers and elastography. Ultrasound-based elastography (including transient elastography [TE] and two-dimensional shear wave elastography [2D-SWE]) and magnetic resonance elastography (MRE) are known to have high diagnostic performance for staging liver fibrosis [19,24,25]. However, elastography techniques are not widely used in China because of high prices and limited cost-effectiveness for general hospitals. HBV carriers in China are frequently advised to undergo annual contrast-enhanced CT or MRI. Our previous study developed a radiomics-based model on non-contrast CT for predicting cirrhosis [13], and this study used contrast-enhanced CT to extend the investigation to significant and advanced fibrosis. Worldwide, ultrasonography is used as the initial tool for early screening of liver tumors in patients with chronic hepatitis. However, the normalization of ultrasound images is difficult, and software that can preprocess and extract radiomic features from two-dimensional images is rare. We are also investigating image processing algorithms for future application to ultrasound images.
This study considered not only chronic liver diseases but also liver masses, to make R-fibrosis and R-score suitable for the major categories of patients with liver fibrosis. Fibrosis staging can help guide treatment plans. Both contrast-enhanced CT and MRI are recommended by guidelines for early detection of liver tumors in patients with chronic liver diseases [7,26], and many studies have focused on image data mining at MRI involving image findings and texture analysis [27][28][29]. Park et al. [29] analyzed gadoxetic acid-enhanced MRI for staging liver fibrosis using radiomics and obtained a radiomics fibrosis index with an AUC range of 0.89-0.91 (similar to R-score and R-fibrosis). However, CT is more readily available than MRI. Computer-aided visual assessment of liver or spleen volume and homogeneity on CT allowed for the detection of fibrosis stage but did not address multiclass accuracy [10,30]. Moreover, none of these methods were validated in independent test data sets. A deep convolutional neural network (DCNN) system for staging liver fibrosis was developed using portal venous phase CT images [31]. Unlike texture analysis, the DCNN system extracted and analyzed features from cropped and zoomed images. The diagnostic performance of that DCNN system (AUC range 0.73-0.76) was not higher than ours, although a head-to-head comparison would be needed to compare the two methods. A recent study showed that a DCNN system based on the entire upper abdomen on CT images can achieve significantly better diagnostic performance (AUC range 0.88-0.92) [32].

The established radiomic signature (R-score) in this study included 4 first-order statistics and 17 textural features. Similar to previous studies [33,34], most of the included features (90.5%, 19 of 21) were processed by wavelet transform. Twelve features were derived from non-contrast CT and the others from the arterial (4) and portal venous (5) phases, possibly because non-contrast CT provides more stable features that are free from the effects of individual contrast agent intake. The final model (R-fibrosis) included GGT/PLT, ALB and A/G in addition to R-score. The values calculated by R-score are decision values of all binary classifiers computed in multiclass classification, so it is normal for the established models to yield negative values. We aimed to develop models with detailed cutoff values for multiclass classification so that they can be easily applied in other centers. The predictive value of the GGT-to-PLT ratio for significant fibrosis and cirrhosis was confirmed by Lemoine et al. [35] and Lu et al. [36]. ALB has been confirmed as an independent indicator of advanced liver fibrosis in patients with NAFLD [37], and it can also significantly contribute to indices for staging liver fibrosis in patients with viral hepatitis [38,39]. A/G has been used as a biomarker in many settings such as tumor prognosis [40][41][42] and chronic diseases [43][44][45], but only one study included A/G among fibrosis markers [46]. The specificity of A/G for fibrosis might not be high; we therefore included it in the calculation only when ALB ≤ 40 g/L.

Figure caption (beginning lost in extraction): [...] for each model in the validation dataset. R-score and R-fibrosis were established in the training cohort and validated for the prediction of significant fibrosis (a), advanced fibrosis (b) and cirrhosis (c). In decision curve analysis, the y-axis measures the net benefit, which was calculated by summing the benefits (true-positive results) and subtracting the harms (false-positive results), weighting the latter by a factor related to the relative harm of an undetected fibrosis status compared with the harm of unnecessary treatment.
There were several limitations in our study. First, the limited sample size and the unbalanced distribution of the patient population constrained the development of the prediction model. Moreover, the retrospective design may have introduced selection bias, and there were more patients with advanced fibrosis (i.e. stages F3-F4) than with lower stages (i.e. stages F0-F2). Second, the proposed radiomics-based model was established using data obtained from a single center. Our model needs to be further validated by prospective multicenter studies with considerably larger datasets. Third, image findings related to significant fibrosis (a nodular or irregular hepatic surface, parenchymal abnormalities, a blunt liver edge, intrahepatic morphological changes and clinical manifestations of portal hypertension) were not considered in this study. The main reason is that these image findings are frequently suggestive of cirrhosis [47]. Fourth, this study did not consider the effect of different etiologies on feature extraction. Different etiologies have a certain impact on fibrosis, so feature values may differ by etiology. Therefore, subgroup analyses by etiology should be conducted to ensure the objectivity of the results. Finally, because elastography (TE or 2D-SWE) was not performed in these patients, we were unable to compare the efficacy of our model with that of elastography for staging liver fibrosis.

Conclusions
In conclusion, we proposed a noninvasive and convenient radiomics-based model on contrast-enhanced CT images that allowed for accurate diagnosis of clinically significant liver fibrosis. Compared with our previous radiomics model based on non-contrast CT scans, R-fibrosis can additionally serve as an updated version for the prediction of significant and advanced fibrosis.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.