Background

Despite the advancement and improvement of surgical techniques and perioperative management over recent years [1], major liver resection still bears the risk of inducing postoperative liver failure (LF) or other major liver-related complications, especially in patients with underlying parenchymal liver disease [2, 3]. LF is a serious complication following liver resection and the major cause of postoperative mortality and morbidity [4, 5]. However, liver resection remains the best curative method for treating colorectal liver metastases [6, 7] and primary hepatobiliary cancers [7]. Insufficient remnant liver function is an important factor contributing to a poor postoperative outcome [6]. Therefore, any procedure with the purpose of removing large amounts of diseased liver tissue should include a pre-procedural risk assessment by estimating the future liver remnant (FLR) function to avoid post-procedure LF, mortality, or other liver-related complications.

Both computed tomography (CT) and biochemical liver function tests have been employed in the preoperative assessment, measuring the volume of the FLR and the global liver function, respectively [8]. CT is presently the gold standard for the preoperative evaluation of the FLR [9]. Using CT volumetry as an indirect assessment of FLR function assumes that liver function is homogeneously distributed. However, as patients often present with compromised livers in which function is heterogeneously distributed, this assumption does not always hold. Therefore, estimating the function of the FLR directly may predict the real postoperative remnant liver function more reliably than estimating the volume or global liver function [8, 10, 11].

With the use of specific nuclear imaging techniques, it is possible to evaluate the function of the FLR directly. Presently, there are no guidelines or definite, widely accepted recommendations on liver function assessment with nuclear imaging methods prior to a procedure with the purpose of removing or destroying diseased, localized liver parenchyma. Therefore, it would be of great clinical value to establish a reliable, noninvasive method with specific guidelines and cut-off values for the preoperative assessment of postoperative risk in patients undergoing liver resection.

The purpose of this systematic review was to investigate the clinical documentation of preprocedural nuclear imaging methods to predict postprocedural clinical outcomes after local intervention in the liver, including the prediction of LF and death.

Materials and methods

Literature search strategy

The literature search was performed by a trained research librarian (LS) using two bibliographic databases, MEDLINE (Ovid) and Web of Science Core Collection (Clarivate Analytics). The search period spanned from the start date of each database until May 27, 2020. Because preparing this review took considerable time, the original search became outdated; the search was therefore repeated on May 27, 2020, to ensure that no relevant studies meeting the inclusion criteria were left out. The literature search was customized for each bibliographic database and set up to match the predefined PICOS criteria (patient, intervention, comparison, outcome, study design) (Supplementary file 1). In brief, we searched for original papers in which a nuclear medicine imaging examination was performed prior to any intervention for localized liver disease and the results of the imaging assessment were compared to a clinical outcome. Using both controlled thesaurus terms and natural language terms to include synonyms, the search terms included descriptions of the underlying liver condition, the liver intervention, and the liver nuclear imaging technique (see Supplementary file 2 for the exact search profiles). All the identified references were run through a reference management software tool (EndNote X9, Clarivate Analytics, Philadelphia, USA) to identify duplicate studies. The unique references were entered into a systematic review management system (Covidence, Veritas Health Innovation, Melbourne, Australia), which was used for title/abstract and full-text screening of the studies. There is no public protocol registration for this systematic review. The systematic review was conducted in full accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [12].

Eligibility criteria

The eligibility criteria were as follows. (1) The included patients had to undergo local treatment for focal or multifocal liver disease with an intent to eliminate the diseased liver tissue, irrespective of the underlying cause of disease. (2) The interventions included both liver interventions and preprocedural imaging interventions. The liver interventions included but were not limited to surgery, radiotherapy, cryotherapy, percutaneous ethanol injection, percutaneous microwave coagulation therapy, radiofrequency ablation, or transcatheter arterial chemoembolization. The imaging interventions involved the preprocedural assessment of liver function with a nuclear medicine imaging method. (3) There was no requirement of any comparator. (4) The outcome had to be correlated to the preprocedural nuclear imaging technique and included reporting of postoperative mortality, LF, postoperative complications, and/or postoperative liver function tests (e.g., biochemical tests or nuclear imaging techniques). (5) Any study design with a minimum of five patients per study was considered (Supplementary file 1).

Study selection

Two investigators independently reviewed the studies. Initially, the original studies retrieved from the literature search were screened for eligibility based on their titles and abstracts. A paper was discarded at this level only if both investigators agreed on exclusion. The full texts of all remaining papers were subsequently read. The systematic review included the studies that both investigators judged to meet all eligibility criteria.

Data extraction

The following data were extracted: country (affiliation of the first author), year of publication, patient enrolment (prospective/retrospective), patient selection (consecutive/nonconsecutive), number of included patients undergoing liver intervention and nuclear imaging who were followed for outcome, type of liver intervention, nuclear medicine imaging technique, and outcome reporting. The number of included patients only encompasses the patients undergoing a preprocedural nuclear imaging examination and an intervention with the purpose of removing diseased liver tissue. Patient enrolment was characterized as prospective if the word “prospective” was used, the terminology was clear, it was described as an interventional design, informed consent was obtained individually before participation in the study, and/or the study partly comprised healthy volunteers (even without providing information about informed consent or ethical approval). Patient selection was classified as consecutive if the word “consecutive” was used or it was clear that all patients meeting certain eligibility criteria in a well-defined time period were included in the study. The surgical procedure was characterized as a major or minor surgery according to the Couinaud criteria (resection ≥ 3 liver segments vs. 1–2 segments) [13].
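The major/minor rule stated above can be written as a one-line check. The sketch below is illustrative only; the function name `classify_resection` is a hypothetical label introduced here, not part of the review's methods.

```python
def classify_resection(n_segments: int) -> str:
    """Classify a resection by the number of resected liver segments.

    Follows the rule quoted in the text: >= 3 segments is 'major',
    1-2 segments is 'minor'. The function name is hypothetical.
    """
    if n_segments < 1:
        raise ValueError("at least one segment must be resected")
    return "major" if n_segments >= 3 else "minor"


print(classify_resection(2))  # minor
print(classify_resection(4))  # major
```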

It was recorded whether the study reported outcome data categorized as mortality, LF, other clinical morbidity, or liver function tests (yes/no option) if they correlated the preoperative nuclear medicine test to the postoperative outcome. Due to the large number of papers reporting high-level outcomes (mortality and LF), detailed data extraction was performed for these two categories only. The outcome data were extracted provided the study included data for a preprocedural nuclear imaging examination and postprocedural mortality or LF. If data on mortality and LF were reported independently, the data were extracted for each of these outcomes separately. If a study analyzed a postprocedural composite endpoint, e.g., overall complications including mortality or LF, the mortality and/or LF data were extracted if possible and reported separately if preprocedural nuclear imaging data were reported separately for those deceased and for those who developed LF. Otherwise, the outcome was shown as a composite outcome and reported in either the mortality or liver failure table or both depending on the number of events for each outcome. The outcome data were reported in the mortality table if all patients with LF died or in the LF table if the majority of these patients developed LF, of which a few died. If the definition of LF included death without other cause, mortality events due to LF were only reported in the LF outcome table. Last, if the paper reported on LF-related complications or signs of postoperative LF, it was included in the liver failure outcome table.

The following data were also extracted: (1) the definitions of mortality (e.g., all-cause mortality or in-hospital mortality) and LF; (2) the outcome event rate; (3) whether a study analyzed a predefined cut-off value of the nuclear imaging examination in a prospective setting or established a post hoc cut-off value based on the observed results; (4) the diagnostic characteristics of the cut-off value as well as comparisons between the uptake values of the nuclear imaging examination in patients with or without an outcome; and finally (5) available data regarding the predictive value of the nuclear imaging examination for the outcome based on univariate or multivariate analyses in logistic or Cox regression models.

Statistics

Descriptive statistics comprised the calculation of the median and range. No analytical statistics were used.
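As a minimal sketch of the descriptive statistics named above, the following computes a median and range with the Python standard library. The helper name `median_and_range` and the input values are hypothetical and are not data from the included studies.

```python
from statistics import median


def median_and_range(values):
    """Return (median, minimum, maximum) for a non-empty list of numbers."""
    if not values:
        raise ValueError("values must not be empty")
    return median(values), min(values), max(values)


# Hypothetical cohort sizes, for illustration only.
cohort_sizes = [12, 30, 45, 80, 210]
med, low, high = median_and_range(cohort_sizes)
print(f"median {med} (range {low} to {high})")
```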

Ethics

In compliance with national legislation, no ethical approval or informed consent was obtained for this systematic review, as it only contains data from previously published articles and no individual data.

Results

Literature search and review

A total of 1344 studies were retrieved from the literature search, and eight studies were retrieved from other sources (Fig. 1). After the removal of duplicates, 1119 studies were screened. Based on the title and abstract screening, 933 studies were excluded, leaving 186 studies for full-text screening, of which 82 eligible studies were included in the systematic review.
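The screening numbers above can be cross-checked with simple arithmetic. In this sketch, the counts of duplicates removed and of full-text exclusions are derived values, not figures stated in the text; the derivation assumes that all records from both sources entered duplicate removal.

```python
# Counts stated in the text.
from_search = 1344
from_other_sources = 8
screened = 1119
excluded_title_abstract = 933
included = 82

# Derived values (assumption: every identified record entered deduplication).
identified = from_search + from_other_sources
duplicates_removed = identified - screened
full_text_screened = screened - excluded_title_abstract
excluded_full_text = full_text_screened - included

# The text reports 186 studies at full-text screening.
assert full_text_screened == 186
print(identified, duplicates_removed, full_text_screened, excluded_full_text)
```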

Fig. 1 Consort flow diagram of the article selection process

Study demographics

The majority of the included studies investigated the use of [99mTc]Tc-GSA (57 studies), and the majority of those trials originated from Japan (56 studies) (Table 1). Nineteen studies investigated the use of [99mTc]Tc-mebrofenin. The remaining six studies employed the following tracers: [18F]FDG, [99mTc]Tc-Sn colloid, [99mTc]Tc-colloid, [99mTc]Tc-PMT, L-[methyl-11C]-methionine, and [198Au]Au colloid + [99mTc]Tc-Sn colloid. The vast majority of the studies were retrospective, and only nineteen papers were prospective. Forty-seven studies explicitly defined consecutive recruitment of patients. The year of publication spanned four decades from 1989 to 2020 (median 2010). The median number of included patients undergoing both a preprocedural nuclear medicine imaging examination and a procedure with the purpose of eliminating diseased liver tissue was 67 (5 to 625 patients). Some studies investigated major surgery exclusively including some studies investigating associating liver partition and portal vein ligation for staged hepatectomy (ALPPS) [21, 69, 72, 80, 82, 83]. The majority of the studies investigated both major and minor surgery. Four studies also included non-operative interventions such as radiofrequency ablation, transarterial embolization or chemoembolization, percutaneous ethanol injection, or microwave coagulation therapy [31, 52, 75, 87]. The vast majority of trials reported on postoperative LF or mortality. Sixteen papers reported solely on postprocedural morbidity and/or nonclinical outcome (liver function assessed by imaging or laboratory/biochemical tests) (Table 1), and an additional 48 studies reported on such data along with data on LF and/or mortality.

Table 1 Study demographics of the included papers

Prediction of postoperative mortality

Thirty-seven studies correlated the findings from a preoperative nuclear medicine imaging investigation to postoperative mortality (Table 2). All of the trials involved major intervention (entirely or partly) except for one study in which the extent of liver resection was not described [35]. Postoperative mortality was defined explicitly up front in fifteen trials only. The mortality definitions varied though most used 90-day postoperative mortality. The mortality rate varied considerably (0 to 52%). There were major technical methodological variations across the studies. For example, among the 22 studies investigating [99mTc]Tc-GSA, there were fifteen different ways of calculating the preoperative liver uptake of the tracer (data not shown).

Table 2 Overview of trials reporting on the correlation between a preoperative nuclear imaging examination and postoperative mortality

Most studies retrospectively reported the nuclear medicine investigation results among patients with a fatal outcome versus those with a non-fatal outcome or described the proportion of patients with a fatal outcome below or above a defined cut-off value. A minority of the studies provided clinically relevant prognostic information in the form of preoperative nuclear medicine results in patients with fatal versus non-fatal outcomes, analyzed diagnostic test accuracy characteristics, or assessed predictive values in univariate and multivariate analyses (Table 2). Thirty-three studies described a preoperative cut-off value for the prediction of postoperative mortality or mortality as part of a composite outcome. Twelve of these studies used a predetermined cut-off value. Five papers (of which two reports had mortality as part of a composite endpoint) compared nuclear medicine results in patients with fatal versus non-fatal outcomes. All studies found significant differences in the nuclear medicine investigation results between patients with a fatal outcome and those with a non-fatal outcome. Two papers compared survival rates in patients with liver function uptake values above or below a certain level (reported in the same column). Nishikawa et al. [60] found a significant difference between survival rates in patients above and below a certain cut-off, whereas Yano et al. [89] did not. Moreover, Rassam et al. [67] did not find a significant association between 90-day mortality and the preoperative FLR function or uptake rate of [99mTc]Tc-mebrofenin. Four papers retrospectively assessed the diagnostic test accuracy of the cut-off determined based on the observations from their study. Satoh et al. [68] analyzed the diagnostic characteristics of the cut-off value in regard to predicting overall complications in which mortality was included, whereas Dinant et al. [24] and Nishiyama et al. [61] analyzed the diagnostic characteristics of the cut-off value for the prediction of LF-related mortality. The positive predictive values were modest (50 to 71%), whereas the negative predictive values were excellent (98 to 100%). Dinant et al. [24] reported a sensitivity of 75% and specificity of 93% for predicting LF-related mortality. Olthof et al. [65] found that the FLR function had an AUC of 0.70 for the prediction of liver failure-related mortality.
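The diagnostic characteristics discussed above all derive from a 2×2 cross-tabulation of test result against outcome. The following is a minimal sketch of that calculation; the class and function names are hypothetical, and the example counts are invented for illustration and are not taken from any included study.

```python
from dataclasses import dataclass
from math import nan


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from cross-tabulating a test result against an outcome."""
    tp: int  # test positive (e.g., uptake below cut-off), outcome occurred
    fp: int  # test positive, outcome did not occur
    fn: int  # test negative, outcome occurred
    tn: int  # test negative, outcome did not occur


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else nan


def diagnostic_metrics(t: TwoByTwo) -> dict:
    """Compute sensitivity, specificity, PPV, and NPV from 2x2 counts."""
    return {
        "sensitivity": _ratio(t.tp, t.tp + t.fn),
        "specificity": _ratio(t.tn, t.tn + t.fp),
        "ppv": _ratio(t.tp, t.tp + t.fp),
        "npv": _ratio(t.tn, t.tn + t.fn),
    }


# Hypothetical cohort: 100 patients, 4 deaths, 10 patients below the cut-off.
metrics = diagnostic_metrics(TwoByTwo(tp=3, fp=7, fn=1, tn=89))
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these invented counts, sensitivity is 0.75 and specificity is about 0.93, yet the positive predictive value is only 0.30 while the negative predictive value is about 0.99. This illustrates how a rare outcome can yield a modest positive and a high negative predictive value from the same cut-off.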

Six papers assessed the predictive value of the preoperative nuclear medicine test for postoperative mortality or a composite outcome in univariate and/or multivariate regression analysis. Dinant et al. [24] found that future remnant (FR) uptake determined by [99mTc]Tc-mebrofenin liver imaging had significant predictive value for LF-related mortality, but not for all-cause mortality, in univariate analysis; this significance did not persist in multivariate analysis. Hayashi et al. [28] showed that the FR function determined from [99mTc]Tc-GSA imaging was a significant predictor of postoperative mortality (odds ratio (OR) 8.8, p = 0.008) in univariate analysis; no multivariate analysis was performed. Kim et al. [39] found LHL15 to be a significant predictor of overall complications, including mortality, in multivariate analysis. Nishikawa et al. [60] found the GSA index to be a significant predictor of recurrence-free survival and overall survival in univariate analysis and documented a significant predictive value of the GSA index for recurrence-free survival in multivariate Cox regression analysis (hazard ratio (HR) 2.4, p < 0.001). However, Yano et al. [89] found that the GSA-Rmax parameters were not associated with overall survival or tumor-free survival in multivariate Cox regression analysis, and Yamao et al. [87] did not find a significant association between LHL15 and overall survival in univariate Cox regression analysis. Several other studies claimed to document the predictive value of the imaging tests evaluated, but the findings were not supported by statistical analyses.

Prediction of postoperative LF

Fifty-two studies reported on LF alone or as part of a composite outcome (Table 3). Fifty (96%) of the trials involved major surgery (entirely or partly). Among the forty-four studies reporting on the definition of postoperative LF, the definition varied considerably (data not shown). The postprocedural LF rate ranged from 0 to 86% (Table 4). Most trials (n = 35) were performed with [99mTc]Tc-GSA, for which more than 25 different measures of liver function were applied (data not shown). Twenty-one studies analyzed a predetermined cut-off value, and nine studies did not report a cut-off value. The remaining studies established a post hoc (data-driven) cut-off value, and some reported both.

Table 3 Overview of the studies reporting on the correlation between a preoperative nuclear imaging examination and the postoperative outcome liver failure
Table 4 Details of the diagnostic characteristics, descriptive values, and predictive value of the preoperative nuclear imaging examinations and postoperative liver failure

A large number of trials (n = 35) provided detailed diagnostic or predictive data (Table 4). Twenty-six trials compared the results of radionuclide imaging in patients with or without liver failure for one or more imaging variables. Most trials showed significant differences among patients with and without liver failure for at least one of the liver function parameters. Sixteen studies analyzed the diagnostic characteristics of the cut-off values in predicting postoperative LF or overall complications. In general, the sensitivity and specificity of the cut-off values varied considerably from 50 to 100% and from 32 to 98%, respectively. The positive predictive values of the cut-offs varied considerably as well (7 to 92%), whereas the negative predictive values were consistently high (82 to 100%).
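One plausible contributor to the wide spread in positive predictive values, alongside differing cut-offs and outcome definitions, is the wide spread in LF rates, because predictive values depend on outcome prevalence. This is a standard identity rather than an analysis performed in the review; the symbols Se (sensitivity), Sp (specificity), and p (outcome prevalence) and the numbers in the worked example are illustrative assumptions.

```latex
\mathrm{PPV} = \frac{Se \cdot p}{Se \cdot p + (1 - Sp)(1 - p)},
\qquad
\mathrm{NPV} = \frac{Sp\,(1 - p)}{Sp\,(1 - p) + (1 - Se)\,p}

% Worked example with Se = 0.80 and Sp = 0.90 (hypothetical values):
% p = 0.05: PPV = 0.040 / (0.040 + 0.095) \approx 0.30,  NPV = 0.855 / (0.855 + 0.010) \approx 0.99
% p = 0.40: PPV = 0.320 / (0.320 + 0.060) \approx 0.84,  NPV = 0.540 / (0.540 + 0.080) \approx 0.87
```

For the same hypothetical test, the positive predictive value rises from about 0.30 to about 0.84 as prevalence increases from 5% to 40%, whereas the negative predictive value stays high when the outcome is rare.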

Twenty trials analyzed the predictive value of the nuclear medicine test for postoperative LF or a composite outcome including LF in univariate and/or multivariate regression analyses, and some trials presented up to six variables. All studies found at least one nuclear medicine variable to be significant in univariate regression analysis. Multivariate regression analyses were performed in eighteen trials, fifteen of which showed that a nuclear medicine liver function test was a significant independent predictor of postoperative LF (Table 4). Twelve of these trials presented the predictive impact with an OR or HR. Eleven of these studies tested other variables as well, and only six of these studies [21, 22, 36, 57, 64, 65] found that other variables (e.g., operation time, extent of hepatectomy, blood loss volume, and aspartate aminotransferase to platelet ratio index) had significant predictive value for LF.

Historical comparisons

Five studies reported historical comparisons on the outcome rate of LF and mortality in the period before and after implementation of nuclear imaging as a preoperative examination in patients undergoing liver surgery [16, 20, 25, 56, 58]. Overall, the historical comparisons involving [99mTc]Tc-mebrofenin found that implementation of nuclear imaging in the preoperative assessment resulted in lower mortality and liver failure rates [16, 20, 25], whereas the historical comparisons on the use of [99mTc]Tc-GSA did not result in a significant decrease in the number of liver failure cases in the period after implementation of [99mTc]Tc-GSA liver scintigraphy [56, 58].

Discussion

To the best of our knowledge, this is the first systematic review investigating the value of preprocedural nuclear imaging examinations for the prediction of postprocedural mortality and LF in patients undergoing localized, liver-directed interventions. This review demonstrated great technical heterogeneity, e.g., in terms of tracers, nuclear imaging uptake calculations, and outcome definitions. Most trials were retrospective and explored test-derived cut-off values rather than evaluating the clinical validity of predetermined variables and cut-off values. However, a few studies investigated predetermined cut-off values and confirmed the clinical utility of these for the preoperative nuclear imaging examination both in terms of producing low LF and mortality rates. In addition, a notable number of trials reported significant predictive values of the nuclear medicine imaging test in multivariate analyses for the prediction of LF, which favors further efforts to identify the clinical utility of these tests in a prospective setting.

The most promising and well-investigated nuclear medicine imaging tracers for the prediction of postoperative clinical outcome were [99mTc]Tc-GSA and [99mTc]Tc-mebrofenin. Despite the fact that a large proportion of these trials showed interesting diagnostic properties and excellent predictive values, the methodology was not optimal in most cases. The definitions of liver failure and mortality differed across the studies without consensus on predefined, clinically relevant liver failure and mortality definitions. This may ultimately affect the outcome of the individual analyses and complicate the ability to properly compare the tracers in the studies in terms of their predictive value. Furthermore, the majority of the studies were retrospective and had an exploratory approach, and the diagnostic properties of the nuclear imaging techniques were defined based on the actual observations. A few papers applied a predetermined cut-off value in a prospective setting and investigated outcomes among patients with uptake values above or below that cut-off. Thus, there were only a few clinical utility studies. However, the studies applying a predetermined cut-off value in a prospective setting validated the cut-off and proved that the cut-off value was able to safely determine which patients could undergo hepatic resection with resulting low mortality and LF rates [20, 63], thereby underscoring the importance of preoperative nuclear imaging in patients undergoing liver resection. A few studies investigated other tracers; the limited number of data makes it difficult to identify advantages or disadvantages over [99mTc]Tc-GSA and [99mTc]Tc-mebrofenin.

The major differences between [99mTc]Tc-GSA and [99mTc]Tc-mebrofenin lie in the uptake and excretion of these tracers. The uptake of [99mTc]Tc-GSA follows receptor-mediated endocytosis by attachment to asialoglycoprotein receptors on the functioning hepatocytes. The tracer is then transferred to the lysosomes for degradation [94]. As the only uptake site for [99mTc]Tc-GSA is in the liver, imaging with [99mTc]Tc-GSA offers a good representation of the functioning hepatocytes. In comparison, [99mTc]Tc-mebrofenin is transported to the liver predominantly bound to albumin, after which it is taken up by the functioning hepatocytes and excreted unmetabolized into the bile system. As a result, imaging with [99mTc]Tc-mebrofenin offers visualization of the liver’s uptake and excretion function, including the biliary tract system [94]. However, imaging with [99mTc]Tc-mebrofenin can be affected by high blood bilirubin levels, as both substrates compete for the same uptake transporter on the hepatocytes [94]. Both imaging methods are capable of estimating both the global and regional liver function with the use of SPECT. No head-to-head comparative studies of the two tracers in the same population were identified.

Hepatobiliary scintigraphy has shown a promising value in the preoperative assessment of patients undergoing liver resection, and it has been utilized by several nuclear medicine departments worldwide over the last decade. Still, there are no widely accepted international guidelines that recommend the use of nuclear medicine imaging to determine the resectability of patients prior to undergoing a procedure with the purpose of removing diseased liver tissue. CT volumetry, however, is well established for the preoperative estimation of FLR volume prior to liver surgery and is the present gold standard to determine resectability in patients undergoing liver resection. Preoperative CT volumetry cut-off values for a safe liver resection have been established on the basis of several well-designed studies [95,96,97,98,99,100,101]. It is generally accepted that approximately 75–80% of the total liver volume can be removed safely without postoperative complications, leaving a FLR of approximately 20–25% if the liver morphology and liver function are normal [95,96,97,98,99,100,101]. If the normal liver has been subjected to chemotherapy recently, a FLR volume of 30% is necessary, and in patients with cirrhosis or hepatitis, the FLR volume needs to be at least 40%, depending on the total liver function measured by a variety of tests and scores [99,100,101].
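The volumetric thresholds described above can be expressed as a simple lookup. The sketch below is illustrative only and is not clinical guidance: the function names and category labels are hypothetical, the 20% value for normal livers is the lower bound of the approximate 20–25% range quoted in the text, and the example volumes are invented.

```python
# Minimum FLR volume as a percentage of total liver volume, as quoted in the
# text. Assumption: 20% is used for normal livers (lower bound of ~20-25%).
MIN_FLR_PERCENT = {
    "normal": 20.0,
    "post_chemotherapy": 30.0,
    "cirrhosis_or_hepatitis": 40.0,
}


def flr_percent(flr_volume_ml: float, total_liver_volume_ml: float) -> float:
    """Return the FLR volume as a percentage of the total liver volume."""
    if total_liver_volume_ml <= 0:
        raise ValueError("total liver volume must be positive")
    if not 0 <= flr_volume_ml <= total_liver_volume_ml:
        raise ValueError("FLR volume must lie between 0 and the total volume")
    return 100.0 * flr_volume_ml / total_liver_volume_ml


def meets_volume_threshold(flr_volume_ml: float,
                           total_liver_volume_ml: float,
                           liver_status: str) -> bool:
    """Check the FLR percentage against the threshold for a liver status."""
    threshold = MIN_FLR_PERCENT[liver_status]
    return flr_percent(flr_volume_ml, total_liver_volume_ml) >= threshold


# Hypothetical patient: FLR of 450 mL in a 1500 mL liver (30%).
for status in MIN_FLR_PERCENT:
    print(status, meets_volume_threshold(450.0, 1500.0, status))
```

In this invented example, the same 30% remnant passes the threshold for a normal or recently chemotherapy-exposed liver but not for a cirrhotic one, which shows how a volume-based rule depends on a separate judgment about parenchymal quality.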

CT volumetry is, however, an indirect measure of liver function due to the heterogeneously distributed liver tissue, especially in patients with parenchymal liver disease. Therefore, using CT volumetry as a preoperative measure of the FLR requires knowledge of the quality of the liver parenchyma [11] as reflected in the different cut-off values for the various liver conditions. In addition, in using CT volumetry as the preoperative assessment of the FLR, some patients are at risk of being excluded from surgery with curative intent due to a small FLR volume, even if their actual FLR function is sufficient. Other patients with sufficient FLR volume are at risk of developing postoperative liver failure due to insufficient FLR function. To avoid these problems, a more direct evaluation of FLR function is needed, and nuclear imaging seems to be the most promising approach. Furthermore, with the use of nuclear imaging as a preoperative measure of the FLR function, one cut-off level for a safe liver resection might suffice for all patients regardless of the underlying liver function [10, 11]. Therefore, it would be advantageous for clinicians to estimate the function of the FLR and thus the postprocedural clinical course of the patients while taking into account the underlying liver disease.

For the nuclear imaging techniques to be included in the diagnostic assessment of patients undergoing liver-directed treatments, clinicians need evidence-based cut-off values for the nuclear imaging liver functional assessment based on a validated and simple liver function uptake calculation. Most cut-off values shown in this systematic review were post hoc, data-driven values, not prospective assessments of a fixed cut-off value. It was evident that nuclear imaging techniques play a promising role in the preprocedural work-up of patients undergoing liver-directed treatments, especially for the prediction of postoperative liver failure. However, the liver failure and mortality predictions are influenced by the different and inconsistent liver failure and mortality definitions. The abundant heterogeneity in nuclear imaging techniques, acquisition methods, and outcome definitions complicates the establishment of evidence-based guidelines for the preprocedural work-up of patients undergoing liver-directed treatment. External validation and comparison of the results across the different studies require procedural standardization, both of technical performance and outcome. Rassam et al. have described practical guidelines on how to use [99mTc]Tc-mebrofenin hepatobiliary scintigraphy [102]. Most papers investigated both major and minor surgery in the same study and established a joint cut-off for a safe liver resection irrespective of the surgery type. However, the extent of liver surgery may affect the FLR cut-off needed for a safe surgery and a favorable postoperative outcome. Therefore, future studies should distinguish between major and minor surgery in their prediction models and cut-off establishments. It remains unclear whether the same cut-off can be used across different types of intervention.

Due to the interesting data generated with the tracers [99mTc]Tc-GSA and [99mTc]Tc-mebrofenin, prospective trials with predefined and clinically relevant definitions of mortality and liver failure are warranted. However, [99mTc]Tc-GSA is not yet commercially available outside of Japan and there are currently no head-to-head studies comparing this tracer with [99mTc]Tc-mebrofenin. The two tracers should be directly compared in prospective, multicenter studies in patients undergoing major liver surgery using predefined definitions of liver failure and mortality; the predictive findings should be compared to the gold standard of CT volumetry. This would hopefully lead to an extraction of clinically relevant cut-off levels of the nuclear imaging techniques for a safe hepatectomy.

This up-to-date systematic review covered four decades of research published in two major medical databases, employed extensive and detailed research strings, and had two investigators throughout the review and data extraction process. The authors consider this systematic review to encompass the first synthesized evidence on this clinically relevant topic.

Some groups published several similar-looking papers; some groups published papers with an increasing number of subjects over time. We did not contact individual authors to investigate any overlapping data among papers from the same institution. This may introduce bias from duplicate or overlapping data if authors did not comply with established guidelines on the ethics of publishing.

Conclusion

In conclusion, more than 80 trials have been published on preoperative nuclear medicine methods, predominantly [99mTc]Tc-GSA and [99mTc]Tc-mebrofenin, to predict clinical outcomes after the liver-directed treatment of non-systemic liver diseases. Even though we identified evidence of benefit for preoperative nuclear medicine assessment across various liver diseases, the data were very heterogeneous concerning the methodology for liver function uptake calculations and dissimilar outcome definitions of LF and mortality. The general impression was that this area of research is short of confirmatory and consistent evidence to determine the patient-relevant benefit of the preoperative assessment of the postoperative FLR with nuclear medicine tests. We encourage the nuclear medicine society in collaboration with hepatobiliary surgeons to support prospective, multi-national clinical efficacy trials documenting that adding a preoperative nuclear medicine test benefits patients in comparison to the standard of care for preoperative investigations using CT volumetry.