Background

Hepatocellular carcinoma (HCC) is the most common primary liver cancer and a leading cause of death in patients with cirrhosis [1]. About 80% of HCC cases are associated with chronic hepatitis B virus (HBV) and hepatitis C virus (HCV) infections. HBV and HCV induce cirrhosis, which is seen in 80–90% of HCC patients [2]. The prognosis of HCC is generally poor, with an average survival of less than one year for patients with advanced disease. On the other hand, patients with early-stage HCC can achieve a 70% 5-year survival rate with curative treatments [3]. Therefore, several national and international guidelines have been published with the aim of standardizing the management of HCC [4,5,6]. Patients with cirrhosis of any etiology and those with chronic HBV and HCV infection are defined as the target population for surveillance, with slight differences in recommendations between guidelines. Regarding surveillance tests, most guidelines recommend ultrasonography (US) at 6-month intervals, with or without alpha-fetoprotein (AFP), as the primary method for surveillance [1, 4,5,6]. Several studies have shown variable results for US detection of early HCC. Moreover, the role of AFP in surveillance remains controversial, with contradictory reported results [7,8,9].

The aim of this study was to evaluate and compare the accuracy of US and AFP measurement for HCC surveillance in an at-risk population.

Methods

Study population

This retrospective study included patients with cirrhosis of any etiology and patients with chronic HBV and HCV infections who underwent US and AFP measurement between January 2013 and January 2016. The electronic medical records were searched for CT and MRI studies, as well as biopsy results where available.

Exclusion criteria

Patients with a previous HCC diagnosis, those with extrahepatic malignancy, and those lost to follow-up were excluded.

Ethical considerations

Our research ethics committee approved this retrospective study, and the requirement of written informed consent was waived.

Ultrasound examinations and image analysis

The US examinations were performed by one of four board-certified radiologists with 6, 7, 9, and 13 years of experience in abdominal imaging, respectively, using Philips EPIQ or Philips iU22 US machines (Philips Healthcare). The US examinations were performed according to the institutional protocol using a curvilinear transducer (1–5 MHz) and included a series of static greyscale images of the left and right liver lobes in the supine and left lateral decubitus positions, as well as color Doppler images of the portal and hepatic veins. The US examinations and reports were reviewed by a single radiologist (A.J) with 10 years of experience in abdominal imaging. The examination results were recorded as follows: no focal lesions, definitely benign focal lesions, indeterminate findings, and malignant lesions consistent with HCC. The indeterminate findings included limited visualization due to patient body habitus, incomplete liver visualization, small indeterminate nodules, and severe cirrhosis limiting the detection of focal lesions. If CT and/or MRI studies were available, the results were recorded as: no focal lesions, definitely benign focal lesions, indeterminate nodules, and malignant lesions consistent with HCC.

Reference standards

The diagnosis of HCC was based on typical imaging findings according to the Liver Imaging Reporting and Data System (LI-RADS) v2018 [10], and on biopsy results when biopsy was performed for indeterminate cases. Negative results were determined by the absence of focal lesions on CT and/or MRI, and by clinical and US follow-up in cases without CT or MRI examinations.

Statistical analysis

Data were expressed as mean ± SD for continuous data and as percentages and frequencies for categorical data. The chi-square test was used to compare categorical data, and the independent-samples t-test was used to compare continuous data. Receiver operating characteristic (ROC) curve analysis was used to estimate the area under the curve for US, AFP, and combined US and AFP. A cutoff value of 20 ng/ml was used for AFP to calculate diagnostic accuracy measures. For ultrasound, sensitivity and specificity were estimated twice: once with indeterminate results counted as positive and once with indeterminate results counted as negative. SPSS (version 24) was used for the analysis.
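The two ways of coding indeterminate US results can be illustrated with a minimal Python sketch. This is not the analysis code used in the study (the analysis was run in SPSS); the category labels, the function name, and the six-patient toy data are hypothetical and chosen only to show how the same examinations yield different sensitivity and specificity under each rule.

```python
from typing import Iterable, Tuple

# Hypothetical labels for the per-patient US result category.
POSITIVE = "hcc"
INDETERMINATE = "indeterminate"


def sensitivity_specificity(
    us_results: Iterable[str],
    has_hcc: Iterable[bool],
    indeterminate_is_positive: bool,
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for one coding of indeterminate results."""
    tp = fn = fp = tn = 0
    for result, disease in zip(us_results, has_hcc):
        # A test is positive if US reports HCC, or if it is indeterminate
        # and the chosen rule counts indeterminate results as positive.
        test_positive = result == POSITIVE or (
            indeterminate_is_positive and result == INDETERMINATE
        )
        if disease and test_positive:
            tp += 1
        elif disease:
            fn += 1
        elif test_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (hypothetical): six patients, three of whom have HCC.
us = ["hcc", "indeterminate", "negative", "indeterminate", "negative", "benign"]
hcc = [True, True, True, False, False, False]

strict = sensitivity_specificity(us, hcc, indeterminate_is_positive=False)
inclusive = sensitivity_specificity(us, hcc, indeterminate_is_positive=True)
print("indeterminate as negative:", strict)
print("indeterminate as positive:", inclusive)
```

In the toy data, counting indeterminate results as positive raises sensitivity and lowers specificity, which is the direction of the trade-off expected from this choice.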

Results

Study population

Of 400 patients initially evaluated, 335 were eligible for the study. A flowchart for the study population is shown in Fig. 1. The mean age of the study group was 55.7 ± 18 years. There were 168 males and 167 females. Table 1 summarizes patient characteristics. A total of 1127 ultrasound examinations were performed for the 335 patients (an average of about 3 examinations per patient). CT was performed in 142 patients, MRI was performed in 42 patients, and 40 patients had both CT and MRI. Biopsy was performed in 8 patients. The mean follow-up period was 19 ± 3 months.

Fig. 1

Flowchart of the study population

Table 1 Baseline characteristics of the study population

Incidence of HCC

Thirty-five patients (10.4%) were diagnosed with HCC, 28 (8.4%) had regenerative nodules, 5 (1.5%) had hemangiomas, and 12 (3.6%) had cysts, while 255 (76.1%) showed no focal lesions. Thirty-four of the 35 HCCs occurred in patients with cirrhosis, and 24 of the 35 HCCs were found in hepatitis-positive patients.

Ultrasound findings and further imaging

Negative results at US

No focal lesions were detected by US in 259 patients. Of these 259 patients, 172 had no further cross-sectional imaging, CT was performed in 87 patients, and MRI was performed in 19 patients. Three of these US-negative patients were diagnosed with HCC (false negatives). Five additional benign focal lesions (3 cysts, 2 hemangiomas) and 10 regenerative nodules were detected by CT and MRI.

Positive results at US

Ultrasound detected 16 (4.8%) focal lesions reported as HCC (Fig. 2). All of these HCC lesions were confirmed by further imaging. Nine benign focal lesions (3 cysts, 5 hemangiomas, and 1 focal fat) were reported by US (Fig. 3). The 3 patients with cysts did not undergo further imaging. CT and/or MRI was performed in the other six cases; the results were concordant with US in 5 cases, while one case did not show any focal lesion (false positive by US).

Fig. 2

A 56-year-old man with hepatitis B viral cirrhosis. (A) Surveillance US shows a 2.5 cm hypoechoic nodule in the right liver lobe. On axial contrast-enhanced CT images, the nodule (white arrow) shows arterial phase hyperenhancement (B) and washout on the portal venous phase image (C)

Fig. 3

A 48-year-old woman with hepatitis C viral cirrhosis. (A) Baseline surveillance US shows a 4 cm homogeneous hyperechoic nodule in the right liver lobe, suggestive of hemangioma. On follow-up US examination after 2 years (B), the nodule showed stable size and appearance

Indeterminate results at US

Ultrasound reported indeterminate results in 51 patients (15.2%). These consisted of limited visualization due to patient body habitus (17 patients), incomplete liver visualization (7 patients), small indeterminate nodules (6 patients), and severe cirrhosis limiting the detection of focal lesions (21 patients). Further imaging by CT and/or MRI was performed for all 51 indeterminate results. Figure 4 shows the final results of indeterminate US findings.

Fig. 4

Final results of indeterminate US findings

AFP levels and relation to US and HCC diagnosis

Fifty-six patients had abnormal AFP levels > 20 ng/ml; of these, 22 had HCC. A total of 279 patients showed normal AFP levels below 20 ng/ml; among them, 13 were diagnosed with HCC. AFP levels were elevated in 29 cases with normal US; of these, 2 cases were diagnosed with HCC. Table 2 shows the distribution of US findings in relation to AFP levels.

Table 2 Distribution of US results among AFP categories

Diagnostic performance of US and AFP

ROC curve analysis (Fig. 5) showed that US has the largest area under the curve (0.924, 95% CI 0.866–0.983) followed by combined US and AFP (0.897, 95% CI 0.854–0.941), then AFP alone (0.829, 95% CI 0.756–0.902).

Fig. 5

ROC curve analysis for US, AFP and combined US and AFP

US showed a 45.7% sensitivity and 100% specificity for HCC detection when definite lesions detected by US were considered as the only positive results.

When indeterminate results were considered as positive results, US showed a 91.4% sensitivity and 88.3% specificity.
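The two US sensitivity figures can be checked against the counts given in the Results text. The short Python sketch below is only an arithmetic illustration, not the study's analysis; the variable names are hypothetical, and the second calculation assumes that none of the nine lesions reported as benign at US were HCC, which is consistent with the reported 91.4%.

```python
# Counts taken from the Results text above.
total_hcc = 35          # patients diagnosed with HCC
us_definite_hcc = 16    # HCCs reported as definite lesions at US
us_false_negative = 3   # HCCs in patients with no focal lesion at US

# Rule 1: only definite lesions reported as HCC count as positive.
sens_definite_only = us_definite_hcc / total_hcc

# Rule 2: indeterminate results also count as positive, so only the
# US-negative HCCs are missed (assumes no HCC among benign-reported lesions).
sens_with_indeterminate = (total_hcc - us_false_negative) / total_hcc

print(f"definite lesions only:       {sens_definite_only:.1%}")
print(f"indeterminate also positive: {sens_with_indeterminate:.1%}")
```

Specificity under the second rule is not recomputed here because the number of indeterminate results without HCC is given only in Fig. 4.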

Using 20 ng/ml as a cutoff, AFP showed a 63% sensitivity and 88.7% specificity. Combined use of US and AFP showed 97% sensitivity and 82.3% specificity. ROC curve analysis of AFP levels indicated that a cutoff of 5.6 ng/ml would give AFP a 77% sensitivity and 78% specificity for HCC detection.
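The AFP figures at the 20 ng/ml cutoff follow directly from the counts reported in the AFP section above. As a minimal Python sketch (variable names are hypothetical; this is an arithmetic check, not the study's SPSS output), the 2 × 2 table can be rebuilt as follows:

```python
# Counts reported in the AFP section (cutoff 20 ng/ml).
afp_high_total, afp_high_hcc = 56, 22   # AFP > 20 ng/ml, of which HCC
afp_low_total, afp_low_hcc = 279, 13    # AFP below 20 ng/ml, of which HCC

true_pos = afp_high_hcc                     # 22
false_pos = afp_high_total - afp_high_hcc   # 34
false_neg = afp_low_hcc                     # 13
true_neg = afp_low_total - afp_low_hcc      # 266

sensitivity = true_pos / (true_pos + false_neg)   # 22 / 35
specificity = true_neg / (true_neg + false_pos)   # 266 / 300

print(f"sensitivity: {sensitivity:.1%}")
print(f"specificity: {specificity:.1%}")
```

The sensitivity evaluates to 62.9%, which the text rounds to 63%, and the specificity to 88.7%.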

Discussion

In this retrospective study, US detected 16 out of 35 HCCs, giving a sensitivity and specificity of 45.7% and 100%, respectively. Fifty-one indeterminate findings were reported at US, and in all of them further assessment was recommended instead of continued surveillance. Therefore, we conducted a second calculation treating these indeterminate results as positive findings. In this second analysis, the sensitivity of US increased to 91.4% and the specificity decreased to 88.3%. In contrast to our approach, Chang et al. [7] treated indeterminate US results as negative tests, reasoning that indeterminate results imply failure of surveillance to detect HCC.

In our study, HCC was diagnosed in 10.4% of the study population. Previous studies have reported an incidence of HCC ranging from 3 to 22.7%. This variation in HCC incidence is due to differences in study populations; studies that included only patients with cirrhosis had a higher HCC incidence than studies that included patients with and without cirrhosis [3, 7, 11,12,13,14]. Our study had similar results, with only 1 of 35 HCC cases diagnosed in a patient without cirrhosis.

In our study, HCC patients were significantly older than patients without HCC (65.5 vs. 44.4 years), and HCC was more frequent in male patients. This is in line with the results of the study by Chang et al. [7], which reported that 67.5% of HCC patients were male and that HCC patients were older than non-HCC patients.

Ultrasound is operator dependent and has several limitations, including but not limited to distortion of the hepatic parenchyma seen in cirrhosis, a narrow acoustic window in obese patients, subcapsular tumor locations, and infiltrative tumors [1]. A recent study assessed the sensitivity of US for HCC detection in obese patients using the pathology of liver explants as the gold standard; it found that US sensitivity was 77% in patients with a body mass index (BMI) less than 30 and only 21% in patients with a BMI above 30 [15].

The American College of Radiology developed the US Liver Imaging Reporting and Data System (LI-RADS) algorithm in 2017 [16]. US LI-RADS recommends assigning two scores: a US category from 1 to 3, which determines the need for follow-up, and a visualization score from A to C, which is used to communicate the expected level of sensitivity of the examination. A recent study by Son et al. [17] showed that the US-3 category has high specificity but low sensitivity for HCC detection, and that visualization score C has higher false negative rates than scores A and B. Our results suggest that indeterminate US findings should be considered positive surveillance results, especially when they prompt further assessment by cross-sectional imaging or closer follow-up.

Using the traditional AFP cutoff level of 20 ng/ml, 13 HCCs would be missed in our study, giving a sensitivity and specificity of 63% and 88.7%, respectively. Singal et al. [3] reported a sensitivity and specificity of 66% and 91%, respectively, for HCC detection using the same AFP cutoff level.

Ungtrakul et al. [13] reported a 41% sensitivity and 98% specificity for AFP using the same 20 ng/ml cutoff. However, they reported no additional improvement in surveillance from combining AFP and US. In contrast, our results showed that combining US and AFP resulted in the diagnosis of two additional HCC cases, giving a sensitivity of 97% and a specificity of 82.3%. This agrees with several previous studies that showed improved effectiveness of surveillance when combining US and AFP measurement [3, 7, 9, 14, 18, 19].

This study has limitations. First, the retrospective design did not allow for randomization. Second, a reference standard could not be obtained for every participant; however, we tried to minimize confirmation bias by excluding patients without sufficient follow-up. Third, mortality was not assessed, so this study cannot answer the question of whether HCC surveillance carries a significant survival benefit. Lastly, we did not assess cost-effectiveness.

Conclusions

Our results suggest that alternative surveillance intervals may be considered in chronic hepatitis patients without cirrhosis, given the low incidence of HCC in non-cirrhotic patients. US has better diagnostic accuracy than AFP. The current 20 ng/ml cutoff for AFP is not adequate for HCC surveillance; however, combined use of US and AFP improves the sensitivity of HCC surveillance with acceptable specificity. Our study provides evidence supporting the current recommendations for HCC surveillance using US and AFP.