Introduction

Gastrointestinal stromal tumors (GIST) are rare tumors arising in the muscle layer of the gastrointestinal (GI) tract. The interstitial cells of Cajal are thought to be precursor cells [1]. The Cajal cells are part of the autonomic nervous system of the intestine and control gastrointestinal motility. In contrast to most cancers of the GI tract, GIST belong to a group of connective tissue tumors, like sarcomas, and are thus of nonepithelial origin. The most common location of GIST is the stomach (60–70%), followed by the small intestine (20–30%), the duodenum (less than 10%), the rectum (less than 10%), and the esophagus (about 1%) [2]. The improved understanding of the molecular mechanisms underlying the development of GIST has led to a precise definition of the entity and to a change in the available therapeutic options. In particular, the identification of activating mutations in the KIT proto-oncogene, which encodes a transmembrane receptor and is the driver of GIST in about 85% of the tumors, has revolutionized both diagnosis (most GIST stain positive for CD117) and therapy [2]. Mutations of c-KIT are the crucial step in the development of a GIST, as autophosphorylation of the receptor leads to a high cell division rate and tumor proliferation. Another, less commonly mutated gene that also contributes to GIST development is the platelet-derived growth factor receptor alpha (PDGFR-α). The identification of c-KIT and PDGFR-α has led to new therapeutic approaches based on therapies targeting both receptors. In 2002, the first orally administered drug, imatinib mesylate, a tyrosine kinase inhibitor (TKI), was approved by the FDA for the treatment of metastatic and unresectable GIST and changed the prognosis and therapeutic outcome of this tumor entity. Over the last decade, the use of multiple TKIs beyond imatinib has led to an increase in median survival in GIST to nearly 5 years, compared to 9–20 months prior to the imatinib era [3, 4].
However, the development of secondary mutations induces resistance to imatinib [5]. In case of resistance, other TKIs such as sunitinib or regorafenib, both multikinase inhibitors, are used as second-line therapy. Both drugs have an antiangiogenic effect by blocking the vascular endothelial growth factor receptor (VEGFR).

The improvement of the therapeutic approaches in GIST and the introduction of effective but at the same time expensive targeted therapies require appropriate diagnostic tools, which allow the early assessment and confirmation of a therapeutic response. Early identification of non-responders in the course of treatment is crucial, to tailor therapy accordingly and save costs for the health system. Moreover, surgery is a therapeutic option in specific scenarios of metastatic GIST with only single tumor metastases no longer responding to TKIs. To decide about its utility and plan a potential surgical approach, it is indispensable to know which tumor lesions have responded to therapy and which ones are progressive. The aim of this review is to discuss the impact and limitations of PET-based therapy monitoring as well as its comparison to CT, MRI and survival data.

Search

PubMed and Cochrane Library searches were performed using the following keyword combinations: computed tomography AND gastrointestinal AND stromal AND tumors AND therapy AND (response OR monitoring); CT AND GIST AND therapy AND (response OR monitoring); FDG AND GIST AND therapy AND (response OR monitoring); fluorodeoxyglucose AND gastrointestinal AND stromal AND tumors AND therapy AND (response OR monitoring); magnetic resonance imaging AND gastrointestinal AND stromal AND tumors AND therapy AND (response OR monitoring); MRI AND GIST AND therapy AND (response OR monitoring); PET AND gastrointestinal AND stromal AND tumors AND therapy AND (response OR monitoring) NOT FDG.

The articles identified by this search were then reviewed, and articles published in languages other than English were excluded. Furthermore, studies based on preclinical data, case reports, and articles including fewer than 10 patients were also excluded. All remaining articles were screened and only those reporting on therapeutic assessment based on CT, MRI and PET were included. Overall, 39 original papers fulfilled these criteria and were included in this systematic review (Table 1).

Table 1 Summary of the studies included in this review

Imaging methods

Computed tomography (CT)—general considerations

Morphologic imaging modalities like CT are primarily used for the diagnosis and therapy management of GIST. The development of new targeted therapies for GIST, like imatinib mesylate and other TKIs, initiated a discussion about appropriate response criteria for CT imaging. In general, CT-based therapy response in GIST is based on changes in tumor volume and tumor density, as well as on the appearance of new lesions. However, GIST metastases treated with targeted therapies may show only a minor reduction of tumor volume even in cases of response. In the EORTC 62005 trial it took more than 9 months for responding patients to develop a partial remission (PR) according to RECIST [6]. In addition, some patients show progression in the size of metastases during TKI therapy due to intratumoral hemorrhage or myxoid degeneration, both of which represent phenomena not associated with true disease progression. Therefore, response cannot be correctly evaluated using the known Response Evaluation Criteria in Solid Tumors (RECIST), which assess single lesions exclusively based on measurements of the longest axial diameter [7]. Choi et al. introduced additional CT criteria based on changes in size but also in lesion density, in order to address the issue of pseudo-progression caused by myxoid degeneration during therapy [8]. However, the issue of intratumoral hemorrhage, mimicking disease progression, cannot be solved by the Choi criteria. Another important observation found in metastatic GIST is that patients with progressive GIST may demonstrate a ‘nodule in a mass’ and not necessarily an overall increase in tumor volume due to focal progression within a generally responsive lesion [9]. This issue is also not completely solved by the Choi criteria, since there are no standards for density measurements within lesions. 
It should be mentioned that TKI-based treatment response evaluation in GIST patients requires standardized CT protocols for acquisition, image reconstruction and contrast material administration, to ensure that differences in lesion density are not caused by variations in contrast administration or in scan settings, such as tube voltage, between examinations. Table 2 summarizes the RECIST 1.1 and Choi criteria [7, 8].

Table 2 Summary of the RECIST 1.1 and Choi criteria for treatment response evaluation
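The difference between the two sets of criteria can be illustrated with a minimal Python sketch. It is a simplification for target lesions only and is not a substitute for the full criteria: the thresholds are the commonly cited ones (RECIST 1.1: at least 30% decrease in the sum of diameters for PR, at least 20% and 5 mm increase over the nadir for PD; Choi: at least 10% decrease in size or at least 15% decrease in density for PR, at least 10% increase in size without a density response for PD). The function and parameter names are hypothetical, and non-target lesions and lymph node rules are left out.

```python
def recist_1_1(baseline_mm, follow_up_mm, nadir_mm=None, new_lesions=False):
    """Simplified RECIST 1.1 category from the sum of longest diameters (mm)."""
    nadir = baseline_mm if nadir_mm is None else nadir_mm
    if new_lesions:
        return "PD"
    if follow_up_mm == 0:
        return "CR"
    growth = follow_up_mm - nadir
    # PD: at least 20% increase over the smallest sum on study and at least 5 mm.
    if growth >= 0.20 * nadir and growth >= 5.0:
        return "PD"
    # PR: at least 30% decrease relative to baseline.
    if follow_up_mm <= 0.70 * baseline_mm:
        return "PR"
    return "SD"


def choi(baseline_mm, follow_up_mm, baseline_hu, follow_up_hu,
         new_lesions=False, new_nodule=False):
    """Simplified Choi category from size (mm) and density (HU)."""
    if new_lesions or new_nodule:
        return "PD"
    if follow_up_mm == 0:
        return "CR"
    size_change = (follow_up_mm - baseline_mm) / baseline_mm
    density_change = (follow_up_hu - baseline_hu) / baseline_hu
    # PR: size decrease of at least 10% OR density decrease of at least 15%.
    if size_change <= -0.10 or density_change <= -0.15:
        return "PR"
    # PD: size increase of at least 10% without a density response.
    if size_change >= 0.10:
        return "PD"
    return "SD"


# Hypothetical lesion that enlarges slightly while becoming hypodense.
print(recist_1_1(100, 115))    # SD by size alone
print(choi(100, 115, 60, 40))  # PR because density fell by one third
```

In this invented example the lesion grows by 15% but loses a third of its density, which is the pseudo-progression pattern the Choi criteria were designed to capture; a new intratumoral nodule is passed in as an explicit flag because it cannot be derived from size or mean density.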

Assessment of therapeutic response with CT

Neoadjuvant therapy

Shen et al. reported on 18 patients (13 with primary and 5 with recurrent or metastatic GIST) who received preoperative imatinib mesylate. Sixteen of 18 patients demonstrated PR and 2/18 stable disease (SD) based on CT follow-up studies evaluated with the Choi criteria. Tumor diameter decreased from 9.1 to 6 cm in response to treatment. Fifty percent of patients underwent surgical resection after a median of 7 months of imatinib therapy [10].

Comparable results were reported by Tirumani et al. in 20 patients with primary GIST, who received neoadjuvant imatinib prior to surgery. CT-based response monitoring and RECIST 1.1 were used for evaluation. There was PR in 16/20 patients and SD in 4/20 patients. At best response, which was seen at 28 weeks, the median decrease in the longest axial diameter was 43% and in tumor volume 83%. Plateau response, defined as the time point of <10% change in treatment response between two consecutive scans beyond best response, was seen at 34 weeks. The authors concluded that beyond this time point further treatment may not be beneficial [11].
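One possible reading of the best-response and plateau definitions can be sketched in Python. The sketch assumes that response is expressed as the percent decrease in longest diameter from baseline and that "<10% change" means a difference of fewer than 10 percentage points between consecutive scans; both are assumptions, as are the function names and the scan values, which are invented for illustration and are not study data.

```python
def percent_decrease(baseline, value):
    """Percent decrease of a measurement relative to baseline."""
    return 100.0 * (baseline - value) / baseline


def best_and_plateau(scans, threshold=10.0):
    """Return (week of best response, week of plateau response or None).

    scans: list of (week, longest_diameter_mm) sorted by week,
    with the baseline scan as the first entry.
    """
    baseline = scans[0][1]
    responses = [(week, percent_decrease(baseline, d)) for week, d in scans[1:]]
    # Best response: the follow-up scan with the largest decrease.
    best_idx = max(range(len(responses)), key=lambda i: responses[i][1])
    best_week = responses[best_idx][0]
    plateau_week = None
    # Plateau: first scan beyond best response whose response differs
    # from the previous scan by less than the threshold.
    for i in range(best_idx + 1, len(responses)):
        if abs(responses[i][1] - responses[i - 1][1]) < threshold:
            plateau_week = responses[i][0]
            break
    return best_week, plateau_week


# Hypothetical follow-up series: (week, longest diameter in mm).
scans = [(0, 90), (6, 70), (12, 55), (18, 50), (24, 51)]
print(best_and_plateau(scans))  # (18, 24)
```

A plateau can only be reported when at least one scan exists beyond the best response, which is why the function returns None for the plateau week otherwise.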

Both studies demonstrate the usefulness of CT in the neoadjuvant setting.

Adjuvant therapy

In contrast to CT-based therapy monitoring in the neoadjuvant setting, several studies demonstrate limitations of RECIST and Choi criteria in the adjuvant setting, which are mainly related to the change in the original morphology of liver metastases and to the presence of a nodule in a cyst as a sign of disease progression.

Schiavon et al. studied the morphology of liver metastases in GIST patients treated with imatinib, to determine if evaluations according to RECIST accurately reflect lesion volume. The authors demonstrated that at baseline only 44% of liver metastases were spherical, an underlying assumption for RECIST, while 56% of them were classified as ellipsoidal. Moreover, during treatment only 42% of all liver metastases retained their original morphology. The remainder demonstrated a change in morphology, e.g. from spheroidal to ellipsoidal and vice versa. Overall, the ellipsoidal volume was a more suitable surrogate for volume estimation of liver metastases [12]. The authors reported similar results in another study in 84 GIST patients treated with imatinib [13].
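Why lesion shape matters for a diameter-based criterion can be shown with a short calculation. The sketch below uses the standard geometric formulas for a sphere and an ellipsoid given their diameters; the lesion dimensions are hypothetical, and the assumption that three orthogonal diameters are available is ours, not a description of the measurement protocol used in the cited studies.

```python
import math


def sphere_volume(d):
    """Volume of a sphere from its diameter: pi/6 * d^3."""
    return math.pi / 6.0 * d ** 3


def ellipsoid_volume(a, b, c):
    """Volume of an ellipsoid from three orthogonal diameters: pi/6 * a*b*c."""
    return math.pi / 6.0 * a * b * c


# Hypothetical ellipsoidal lesion measuring 60 x 40 x 30 mm.
v_sphere = sphere_volume(60)               # assumes a sphere of the longest diameter
v_ellipsoid = ellipsoid_volume(60, 40, 30)
print(round(v_sphere / v_ellipsoid, 2))    # spherical assumption overestimates 3-fold

# For a truly spherical lesion, a 43% decrease in diameter corresponds to a
# volume decrease of 1 - 0.57^3.
print(round(100 * (1 - 0.57 ** 3), 1))     # 81.5
```

For a perfect sphere, a diameter decrease of 43% therefore corresponds to a volume decrease of roughly 81%, so small changes in the longest diameter can hide large volume changes, and the relationship breaks down further when the lesion is not spherical or changes shape during treatment.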

Cystic degeneration after imatinib treatment is another limiting factor for therapy assessment. Koh et al. reported on discrepancies between RECIST- and Choi-based response in 27/61 patients, who responded to imatinib with cystic degeneration related to lack of CD34 expression after treatment. Choi criteria correlated significantly more strongly with progression-free survival (PFS) than RECIST [14].

The problem of cystic-like transformation of metastases during imatinib treatment was evaluated in a prospective study by Mabille et al. in 107 patients with metastatic GIST. The authors found that 70 patients showed a total or partial cystic-like transformation of hepatic and/or peritoneal metastases. These pseudocysts remained unchanged in size and were classified as SD according to RECIST. However, 17 patients demonstrated increasing parietal thickness and 29 patients peripheral enhancing nodules as a sign of recurrence (Fig. 1). The authors concluded that peripheral thickening or enhancing nodules within a cystic-like metastatic lesion during imatinib treatment are related to progression even without any change in size [15]. Concordant results were reported by Desai et al. in 89 patients with metastatic GIST treated with imatinib. Twenty-three of the 48 (47.9%) progressive patients developed a new enhancing nodular focus within a preexisting tumor mass as a sign of progression (Fig. 2). Nodules were demonstrable at a median of 5 months prior to objective progression based on tumor size criteria and were the first sign of progression in 18/23 patients. Genotypic analysis, performed in 10 patients, revealed new activating kinase mutations of KIT and PDGFRA in 80% of them [16]. The importance of the appearance of a nodule within a cystic metastatic lesion as a sign of recurrent disease was also demonstrated by Shankar et al. in 2005. This group studied 92 patients with metastatic GIST treated with imatinib. Twenty-one of the 39 patients with PD (53.9%) developed a nodule within a known mass as a first sign of tumor progression [9]. Comparable results were reported in another study by Phongkitkarun et al. including 17 patients under imatinib. Patients with cystic-like metastatic lesions without peripheral thickening or enhancing nodules had better survival than patients with new solid lesions [17]. In line with these results, Ryu et al. studied 62 patients with metastatic or unresectable GIST treated with imatinib and demonstrated that new solid lesions as well as generalized progression were associated with shorter overall survival (OS) [18]. Further concordant results were published by Vanel et al. in 54 patients with hepatic and peritoneal metastatic GIST during imatinib therapy. The authors demonstrated that disease reactivation after an initial PR appeared first as a focal, peripheral, solid nodule in the wall of a cystic lesion or as a diffuse increase in lesion density [19].

Fig. 1
Left transversal CT image of cystic liver metastasis of a GIST (bright arrow) and multiple hyperdense and hyperperfused lesions at the periphery of the regressive metastasis (dark arrow) due to multifocal progression. Right resection specimen demonstrating the progressive tumor areas (dark arrow) around the myxoid degenerated cystic part (bright arrow)

Fig. 2
Left transversal CT image of a cystic-myxoid liver metastasis of a GIST (bright arrow) with a hyperdense lesion at the medial part of the metastasis (dark arrow) indicative for tumor progression. Right resection specimen demonstrating the progressive metastasis within the cyst (dark arrow)

Overall, all these studies demonstrate that the appearance of a focal, peripheral or solid nodule in the wall of a cystic lesion is a sign of progression. Furthermore, tumor morphology is another limiting factor for the RECIST or Choi based therapy assessment.

Assessment of other novel therapies

Comparison of RECIST with Choi

Shinagare et al. compared the different response evaluation criteria in 20 patients with advanced GIST receiving regorafenib, with prior response failure to imatinib and sunitinib (phase 2 study). Baseline CTs and follow-up studies were performed every 2 months. The authors compared WHO, RECIST 1.0, RECIST 1.1 and Choi criteria with PFS and OS. PR was more frequently observed by using the Choi criteria. PFS per RECIST 1.0 was similar to RECIST 1.1 (median 44 versus 58 weeks), and significantly shorter for WHO (median 34 weeks) and Choi (median 24 weeks). With RECIST 1.1, RECIST 1.0 and WHO, there was moderate concordance between PFS and OS (c-statistics 0.596–0.679). Choi criteria had less favorable concordance (c-statistic 0.506). The authors found that RECIST 1.1 and WHO criteria performed somewhat better than Choi criteria for response evaluation [20].

Comparable data were reported by Schramm et al. in 20 patients receiving second line sunitinib therapy. CT follow-up studies were evaluated with RECIST and Choi. PR was also more frequently observed when using the Choi criteria. However, the authors concluded that only PR according to RECIST indicated favorable survival [21]. Concordant results have been reported by Dudeck et al. in 51 patients who received second line sunitinib therapy. The use of Choi was favorable for PR as compared to RECIST at 3 months (16 patients with PR by Choi criteria, versus one patient by RECIST). However, PFS and OS were comparable for patients rated as SD with RECIST, and for patients rated as PR and SD with Choi criteria. Only patients with PD demonstrated significantly shorter PFS (10.1 weeks for both criteria) and OS (29.1 weeks for RECIST; 28.9 weeks for Choi criteria) regardless of the response classification applied. The authors concluded that discrimination of PR from SD with Choi criteria was of no predictive value [22]. It must be kept in mind that the use of sunitinib indicates failure of imatinib. Patients with a primarily non-responding tumor (due to PDGFRα mutation) might represent a cohort with different prognosis than those patients who initially responded to imatinib and developed secondary tumor progression.

Overall, these studies demonstrate that the use of Choi criteria was favorable for PR but the discrimination of PR from SD was of no predictive value with respect to survival data.

Novel CT response criteria based on dual energy CT and dynamic perfusion CT studies

Meyer et al. reported results in 17 patients with advanced GIST who received therapy with TKIs. They demonstrated that dual energy CT (DECT) monitoring allowed a better prediction of therapeutic outcome based on an increase of both tumor size >20% and iodine-related attenuation, or either an increase of tumor size >50% or iodine-related attenuation for non-responsive patients. The use of Choi criteria and the application of DECT measurements demonstrated a significantly longer median PFS of patients rated as responders (9–29 months) as compared to non-responders (2–6 months) [23]. Concordant results have been reported by Apfaltrer et al. in 24 patients with 291 liver metastases. DECT demonstrated significantly higher iodine-related attenuation in progressive disease [24]. The impact of DECT has been evaluated in another study by Schramm et al. in 18 patients with 48 tumor lesions. They found a correlation between selective iodine uptake and tumor growth. However, the study did not provide any survival analysis [25]. Although the results of DECT studies are promising, it remains unclear whether these DECT-based criteria are more helpful than the Choi or RECIST criteria. Figure 3 demonstrates an example of a DECT in a patient with liver metastases of a GIST. The follow-up study demonstrated pseudoprogression after imatinib therapy with an increase in size but a decrease in iodine uptake.

Fig. 3
Dual energy CT (DECT) of a patient with metastatic GIST within the liver. The follow-up DECT on the right demonstrates pseudoprogression 2 months after the initiation of imatinib therapy with an increase in size but a significant decrease in lesion iodine uptake

Dynamic perfusion CT is another promising functional technique potentially useful for the assessment of therapeutic response to TKIs in GIST. Schlemmer et al. studied 24 patients with metastatic GIST who received TKIs and reported a decrease in tumor perfusion parameters in responders for both intra- and extrahepatic lesions [26]. A study by De Giorgi et al. analyzed the impact of vascular perfusion by angiosonography with a second-generation contrast agent in 10 patients with metastatic GIST during imatinib therapy. The authors reported an initial reduction in tumor vascularization in all patients who responded. In addition, during follow-up the reduction of tumor perfusion in angiosonography correlated with pseudocystic appearance in CT [27]. Further studies are needed to confirm these results, as the technique depends on the experience of the examiner and is poorly reproducible.

Overall, the impact of these criteria is still open due to the limited data and should be assessed in larger patient studies.

Magnetic resonance imaging (MRI)

Therapy response in MRI is based on the assessment of the same criteria as in CT. In the case of MRI, changes in signal intensity (SI) are evaluated. However, compared to Hounsfield unit measurements with CT, changes in SI are less reliable even if standardized protocols are used [28]. Stroszczynski et al. performed MRI measurements in 45 patients with metastatic or locally advanced GIST during imatinib therapy. The patients with metastatic disease were treated in a multinational trial (EORTC 62005) using CT for assessment and comparison, whereas in the neoadjuvantly treated group of patients with locally advanced disease surgical and pathological evaluation of the resection specimen was available. The accuracy of RECIST-based MRI assessment was 51%, 69% and 73% at 2, 4 and 6 months after therapy. Responders had higher signal-to-noise ratios on T2-weighted images after 2 months and a decrease of vascularized areas in the target lesions 4 and 6 months after treatment [29]. Tang et al. evaluated the impact of diffusion weighted imaging (DWI) in 32 GIST patients who received imatinib. Baseline and follow-up studies 1, 4 and 12 weeks after therapy were performed. They found that low pretherapeutic apparent diffusion coefficient (ADC) values and a marked increase at 1 week after therapy were signs of good response [30].

Overall, MRI cannot be considered as the prime standard for the assessment of therapeutic response in GIST. Thus, in the consensus report of the German GIST Imaging Working Group, MRI is recommended only for liver-specific questions or when contraindications to CT are present [28].

PET-CT

Positron emission tomography (PET) as well as positron emission tomography-computed tomography (PET-CT) is being increasingly used in GIST for primary diagnosis, staging and therapy monitoring. In most cases, the radiotracer F-18-fluorodeoxyglucose (18F-FDG or FDG) is used. FDG, a glucose analog, is transported from plasma into the cell, where it is phosphorylated and then trapped. Its uptake thus reflects intracellular glucose metabolism and consumption and provides information about tissue metabolism. Despite being the workhorse of molecular imaging in oncology, FDG is a non-specific tracer, since it accumulates not only in tumors but also in several benign processes, like inflammation [31]. Several attempts have been made to find more specific tracers for oncological diagnosis. In fact, some new receptor-active peptides have found use in PET imaging in recent years. One example is Ga-68-DOTATOC, a tracer that binds to SSTR2-expressing tumors, like neuroendocrine carcinomas or meningiomas. Another example is Ga-68-PSMA, a tracer that binds to the prostate-specific membrane antigen (PSMA), a cell surface protein that is overexpressed in prostate carcinoma. These peptides can be used for both diagnostic and therapeutic purposes. Regarding GIST, Ga-68-BZH3, a pan-bombesin analog, has been used in a limited number of patients with metastatic disease [32]. However, this tracer cannot be recommended as a single, stand-alone tracer for diagnosis and therapy monitoring in GIST.

Assessment of therapeutic response with FDG PET

It is generally accepted that FDG PET is more sensitive for the assessment of early therapy response than morphologic imaging modalities (Fig. 4). However, several questions remain open, such as the appropriate time point for monitoring, the PET or PET-CT protocol to be applied, and the response evaluation criteria to be followed. The published data on therapy monitoring in GIST are divergent. Most studies report a general decrease in FDG tumor uptake following imatinib therapy, but the time interval between baseline and follow-up FDG PET studies differs significantly, ranging from 1 week after onset of imatinib to several months following treatment. Moreover, several studies report only the follow-up PET results but do not provide data on the baseline examination prior to treatment. Another issue is the heterogeneity of the criteria used for the assessment of the FDG PET response, with some studies including only visual evaluation of the scans and others evaluating changes in mean and/or maximum SUV (SUVmean, SUVmax) before and after treatment. Furthermore, some studies do not clearly state whether the authors used the most widely applied PET criteria, those of the European Organization for Research and Treatment of Cancer (EORTC). Finally, the literature results are largely divergent concerning the predictive value of the FDG PET response with respect to PFS or OS.

Fig. 4
Left side transversal fused FDG PET-CT images in a patient with a recurrent GIST of the gastroesophageal junction prior (upper row) and 2 months after onset of therapy with imatinib (400 mg/day). Pleural effusion at the right side in the baseline study. Good response after therapy. Decrease in FDG uptake (SUVmean from 7.3 to 3.2, and SUVmax from 11.2 to 5) and in tumor volume (from 12 cm to 7.7 cm). Right side maximum Intensity Projection Images (MIP) of the same patient

Response criteria for FDG PET

A prerequisite for therapy assessment by means of PET/CT is the use of the same scanning protocol regarding acquisition, image reconstruction, time between tracer injection and start of the scan, and blood glucose level for the baseline and the follow-up studies. In general, the EORTC criteria are used for therapeutic response assessment based on FDG PET [33]. These criteria are based on SUV changes, which are related to the time interval after onset of therapy. Progressive metabolic disease (PMD) is defined as a 25% increase in SUV or the appearance of new metastatic lesions. On the other hand, partial metabolic remission (PMR) is defined as a reduction of SUV of at least 15% after one cycle or more than 25% after more than one cycle. In 2009 Wahl et al. proposed the use of PERCIST criteria for the assessment of therapy response [34]. Some important components of the PERCIST criteria are the assessment of normal reference values in a 3-cm-diameter ROI in the liver as well as the assessment of the peak SUL (or SUVlean) in a small ROI (1 cm3) in the hottest tumor area. Furthermore, they recommend the use of SUVlean and a reduction of its value of at least 30% for the definition of PR. These criteria are more complex and have found limited use in GIST. One further limitation of these response criteria in the case of GIST is that TKIs are administered as a daily continuous therapy and not in the form of therapeutic cycles, as conventional cytotoxic therapies are. Therefore, there is no consensus about the time intervals which should be used for the response evaluation of TKIs or other targeted therapies used in GIST.
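The EORTC thresholds described above can be written down as a small decision rule. The following Python sketch encodes only the SUV-change cut-offs stated in the text; the function name, the handling of exact boundary values and the stable metabolic disease (SMD) fallback are assumptions, and complete metabolic remission and the PERCIST rules are deliberately left out.

```python
def eortc_pet_response(baseline_suv, follow_up_suv,
                       cycles_completed=1, new_lesions=False):
    """Simplified EORTC metabolic response category from SUV change."""
    if baseline_suv <= 0:
        raise ValueError("baseline SUV must be positive")
    if new_lesions:
        return "PMD"
    change = 100.0 * (follow_up_suv - baseline_suv) / baseline_suv
    # PMD: SUV increase of 25% (boundary treated as PMD here, an assumption).
    if change >= 25.0:
        return "PMD"
    if cycles_completed <= 1:
        # PMR after one cycle: reduction of at least 15%.
        if change <= -15.0:
            return "PMR"
    else:
        # PMR after more than one cycle: reduction of more than 25%.
        if change < -25.0:
            return "PMR"
    return "SMD"


# SUVmax values reported for the patient in Fig. 4 (11.2 at baseline, 5 at 2 months).
print(eortc_pet_response(11.2, 5.0, cycles_completed=2))  # PMR
```

The `cycles_completed` argument makes the limitation discussed above explicit: for a continuously administered TKI there is no natural cycle count, so the caller has to decide which of the two PMR thresholds applies at a given follow-up time.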

Assessment of the imatinib treatment response

Good correlation between FDG PET response and PFS

Stroobants et al. studied 17 patients at baseline and 8 days after onset of imatinib within a phase 1 dose-finding study with FDG PET-CT. FDG PET response evaluation was based on EORTC criteria, and CT response on RECIST. The authors reported an FDG PET response in 13/17 patients at day 8, as compared to a CT response in 10/17 patients after 8 weeks. Furthermore, the study found a longer 1-year PFS in FDG responders (92% for responders vs. 12% for stable and progressive disease) [35, 36]. Comparable results were found in another study by Jager et al. In this study, 16 patients underwent FDG PET-CT at baseline as well as 1 week after treatment. Eleven of 16 patients responded in FDG PET with a mean SUV reduction of 65%, whereas 5/16 patients did not respond and demonstrated an FDG uptake increase of 16%. The study found that PFS was longer in FDG PET responders [37]. Choi et al. studied 40 patients with metastatic GIST at baseline and 2 months after onset of therapy and compared the FDG PET data with the CT volumetric changes as well as with changes in density. They defined a 70% reduction in SUVmax and a SUVmax <2.5 at 2 months as response. Thirty-three patients responded in FDG PET based on these criteria. The comparison of the CT data with the FDG PET data revealed that a decrease of tumor size of more than 10% or a decrease in tumor density of more than 15% (detected on CT) correlated best to the FDG PET response with a sensitivity of 97% and a specificity of 100%; the respective values were 52 and 100% when using the RECIST criteria. Moreover, responders at 2 months had significantly longer PFS [38].

Overall, these studies demonstrate that PFS was significantly longer in FDG responders (follow-up study between 8 days and 2 months after onset of imatinib).

Poor correlation between FDG PET response and PFS

In contrast, Chacon et al. studied 16 patients on day 7 and day 37 after starting treatment with 800 mg/day imatinib. They found that FDG uptake on day 7 was not predictive of PFS [39]. McAuliffe et al. provided comparable results in 19 patients who underwent FDG PET prior to neoadjuvant therapy as well as after the end of the preoperative treatment within a randomized phase 2 trial [40]. The results are particularly interesting because the authors compared the FDG PET follow-up data to the rate of apoptosis in tumor biopsies before and after preoperative treatment, assessed by the TUNEL assay (terminal deoxynucleotidyl transferase dUTP nick end labeling), a standard method for detecting the DNA fragmentation that results from apoptotic signaling cascades. The authors defined FDG PET response as a minimum of 40% decrease of SUVmax or a SUVmax ≤3.9 after therapy. Using these criteria, the study found that 69% of the patients responded according to FDG PET and 71% according to contrast-enhanced CT (ceCT) criteria. However, FDG PET response was not predictive of PFS and was not correlated with apoptosis. Goerres et al. studied 20 patients with FDG PET and ceCT at baseline and at re-evaluation a median of 19 days after the start of imatinib therapy, and demonstrated that the changes in SUV and in CT correlated with OS but not with PFS. This study has, however, one major limitation, which is the lack of a fixed time point for re-assessment after therapy (the follow-up study was performed between 11 and 111 days). Moreover, the lack of correlation of the SUV changes with PFS might have been caused by the fact that seven patients were FDG-negative at baseline PET. Furthermore, the authors reported a higher detection rate of metastases using ceCT than using FDG PET, but a higher detection rate of skeletal lesions in PET than in CT [41]. Nevertheless, the latter finding must be interpreted with caution, as GIST hardly ever metastasize to the bone during early treatment lines like imatinib.
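Study-specific SUVmax definitions such as the one used by McAuliffe et al. combine a relative and an absolute cut-off. A minimal Python sketch of such a rule follows; the default values are the ones quoted above (at least 40% decrease or SUVmax ≤3.9 after therapy), while the function name, the parameter names and the explicit rejection of a non-positive baseline are assumptions made for illustration.

```python
def suvmax_response(baseline_suvmax, post_suvmax,
                    min_pct_decrease=40.0, abs_cutoff=3.9):
    """Return True if either the relative or the absolute SUVmax criterion is met."""
    if baseline_suvmax <= 0:
        # A relative change is undefined without measurable baseline uptake.
        raise ValueError("baseline SUVmax must be positive")
    pct_decrease = 100.0 * (baseline_suvmax - post_suvmax) / baseline_suvmax
    return pct_decrease >= min_pct_decrease or post_suvmax <= abs_cutoff


# Hypothetical patients (baseline SUVmax, post-therapy SUVmax).
print(suvmax_response(10.0, 5.5))  # True: 45% decrease
print(suvmax_response(5.0, 3.8))   # True: only 24% decrease, but SUVmax <= 3.9
print(suvmax_response(10.0, 7.0))  # False: 30% decrease and SUVmax above cut-off
```

The second example shows why the absolute cut-off matters: a lesion with low baseline uptake can satisfy the rule without any substantial metabolic change, which is one way in which FDG-negative or weakly FDG-avid baseline scans can blur the relationship between PET response and PFS.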

Overall, these studies do not demonstrate a longer PFS in FDG responders. A negative FDG scan at baseline, a wide range of follow-up time points or a different tumor biology in the group of patients who received high-dose imatinib may explain the lack of predictive value in FDG responders.

Studies without survival analysis

Some other studies assessed the response in FDG PET and partially compared the FDG PET data to CT data. However, those studies did not provide survival analysis. Holdsworth et al. studied 63 patients at baseline and 1 month after imatinib and found that the best parameters for the prediction of response were a SUVmax ≤3.4 after therapy, a reduction of SUVmax of 40%, and no growth in CT from baseline to 1 month. These criteria outperformed the existing standards, like the EORTC and SWOG criteria [42]. Antoch et al. studied 20 patients at baseline and 1, 3 and 6 months after onset of therapy. They used the EORTC criteria for response assessment and compared the data to CT follow-up studies. They demonstrated that FDG PET accurately diagnosed response in 85% of the patients at 1 month and in 100% of them at 3 and 6 months after onset of therapy. In comparison, CT was accurate in 44% of the patients at 1 month, in 60% of the patients at 3 months and in 57% of them at 6 months after treatment. Moreover, the authors found a decrease in tumor density (in HU) by at least 25% in 12/14 responders 1 month after onset of treatment [43]. Gayed et al. studied 54 GIST patients at baseline, 3 weeks after onset of therapy (in 54/54 patients), and 2 months after therapy (in 49/54 patients). The authors compared the FDG PET data with the CT data and reported concordant results in 71.4% of the patients at 2 months. They demonstrated that FDG PET predicted response earlier than CT in 22.5% of the patients, whereas CT predicted lack of response earlier in 4.1% of the studied cohort. The authors concluded that FDG PET is overall a better guide for GIST response assessment than CT [44].

All these studies demonstrate that FDG PET is better than CT for the early assessment of therapy response. The predictive value of FDG was not evaluated due to the lack of survival analysis.

Overall assessment of PET studies

The reported studies generally demonstrated a good assessment of early response to imatinib by means of PET, even 8 days after the onset of therapy. However, the response criteria used differ and depend on the time of the follow-up FDG study. For longer time intervals (>2 months) a decrease in SUV of more than 50% and a SUV of less than 2.5 at the follow-up study may be preferable and more robust criteria for the assessment of a sustained response. The combination with CT volumetric and density changes may improve the response assessment; however, this is not yet clarified. There is no clear evidence from a larger prospective data set, in which PET was not merely a side study, that the change in FDG uptake correlates with PFS and OS. Any lack of correlation between change in FDG uptake and OS may be explained by the different therapies used after first-line treatment with imatinib, which vary over time and depend on drug availability, which may differ from country to country despite worldwide registration of the drugs. However, there is no obvious explanation for the lack of correlation between SUV changes and PFS in some studies. Some authors report a negative baseline FDG PET study, which explains the lack of further FDG change during therapy and has, most likely, influenced the results. More often, metastases are not confirmed by biopsy, and other diseases may mimic GIST and vice versa [45].

Assessment of other treatment strategies

Good correlation between FDG response and PFS

Fuster et al., in an unusual approach, studied 21 patients with locally advanced and/or metastatic GIST refractory to high dose (800 mg/day) imatinib, who were treated with doxorubicin for 4 cycles (15–20 mg/m² weekly) followed by imatinib maintenance (400 mg/day). FDG PET and CT were performed at baseline and after completion of therapy. Six patients showed PMR in FDG PET, 15 patients had SMD, and 6 patients had PMD based on the EORTC criteria. In comparison, according to the RECIST criteria, 3 patients had PR, 12 patients SD, and 6 patients PD. Median PFS was 219 days. A significant correlation was found between FDG PET response and mean PFS (PMR 14 months, SMD 5.5 months, PMD 3.5 months). A residual SUVmax <5 after treatment correlated with improved PFS. Survival curves demonstrated a significant association between FDG PET response and PFS (p < 0.05). Patients with a PFS <6 months demonstrated a significantly higher SUVmax at baseline (SUVmax = 26.4) than those with PFS ≥6 months (SUVmax = 9.82) [46].

Prior et al. studied 23 patients after imatinib failure at baseline and 4 weeks after the onset of sunitinib. The PET response was evaluated based on the EORTC criteria. This group demonstrated that early metabolic response correlated with PFS. Median PFS was longer for metabolic responders (29 weeks) than for patients with stable metabolic disease (SMD) (16 weeks) and PMD (4 weeks). Furthermore, they reported that a SUV <8 at 4 weeks was a good prognostic sign and was associated with a PFS of 29 weeks, as compared to a PFS of 4 weeks in tumors with SUV >8. None of the patients with PMD responded subsequently based on RECIST [47].

In a phase 2 study, Kang et al. studied 30 patients with metastatic and/or unresectable GIST receiving dovitinib, a novel multikinase inhibitor targeting VEGFR 1–3, PDGFRβ, KIT, fibroblast growth factor receptors (FGFR) 1–3 and fetal liver tyrosine kinase receptor 3. In all patients, previous treatment with at least imatinib and sunitinib had failed. Patients were studied at baseline and 4 weeks after onset of treatment. The EORTC criteria were used for response assessment. Four patients (13%) had PMR. PMD at week 4 was significantly associated with shorter PFS (p = 0.03). Survival analysis demonstrated a predictive value of FDG PET response at week 4 with respect to PFS. The authors concluded that RECIST criteria underestimate the benefits of TKI treatment [48].

In a phase 2 study, Le Cesne et al. followed 30 imatinib-naive patients with advanced GIST receiving masitinib, an oral tyrosine kinase inhibitor with stronger in vitro activity and selectivity for the wild-type c-Kit receptor and its juxtamembrane mutations than imatinib. In 17/30 patients, the metabolic response was assessed with PET scans (based on EORTC criteria) at baseline, after 1 month (in 13/17 patients) and after 2 months (in 14/17 patients). Of the 17 patients studied, 3 patients (17.7%) had a negative baseline FDG PET. After 1 month, 9 of 13 patients (69.2%) had PMR (according to RECIST 20%), while 4/13 of them (30.8%) demonstrated SMD (according to RECIST 76.6%). At 2 months, 12/14 patients (85.7%) achieved a metabolic response (3 patients CMR, 9 patients PMR; according to RECIST 53.3% of patients showed response), and 2/14 patients (14.3%) showed SMD (according to RECIST 43.3%). Median PFS was 41.3 months. The authors concluded that RECIST evaluation is not appropriate for early response monitoring but only for PD assessment, and that the absence of progression on CT is a good prognostic sign [49].

Demetri et al. studied 67 imatinib-resistant patients treated with sunitinib (50 mg/d) within a phase 1/2 trial. FDG PET was performed at baseline, after 7 days, and after the end of the 1st cycle of 4 weeks of treatment followed by 2 weeks without treatment. Sixty patients had FDG PET at least at baseline and at 1st follow-up. Based on the EORTC criteria, 43 patients demonstrated PMR, and these early changes correlated with an improved clinical outcome based on subsequent CT and MRI follow-up scans. Most of these 43 patients subsequently showed clinical benefit based on CT or MRI, with 6 partial responders and 25 patients with SD of more than 6 months [50].

Overall, the presented studies demonstrate that an early metabolic response assessed between 7 days and 4 weeks after onset of treatment correlates with a longer PFS and an improved clinical outcome.

No correlation between FDG response and PFS

In a multicenter phase 2 study, Benjamin et al. studied 102 patients with advanced GIST resistant to imatinib at baseline and 8 weeks after onset of therapy with motesanib, an oral inhibitor of KIT, PDGFR and VEGFR. Response was assessed according to RECIST, and to modified RECIST using the Choi criteria, and then correlated with PFS. The objective response rate was 3%. 59% of patients achieved SD and 14% durable SD (more than 24 weeks), while 38% had PD. The authors reported a response rate of 30% for FDG PET and 41% for CT according to the Choi criteria. Median PFS was 16 weeks. However, neither CT nor FDG PET was predictive for PFS at 8 weeks [51].

In a phase 2 trial, Judson et al. studied 24 patients with advanced GIST treated with cediranib, a potent VEGF signaling inhibitor with activity against all three VEGF receptors and KIT. Cediranib was given as a second or third line therapy in patients who did not respond to imatinib or sunitinib. Patients were studied at baseline, on day 8 and on day 29 following onset of treatment. The response evaluation was based on the EORTC criteria for FDG PET and on the RECIST criteria for CT. The study found that 67% of the patients showed SMD on day 8 and 50% on day 29. Furthermore, 4 patients achieved PMR on day 29 (≥29% decrease). The decrease in SUVmax was low (6.8% on day 8 and 4.6% on day 29) with no statistically significant difference between both modalities. Best response was SD in 11/24 patients. Twenty patients were evaluated by RECIST with central review and none of them achieved an objective response. 11/20 patients had SD as best response and 8 of them had SD for more than 16 weeks. Median PFS was 2 months. In this study, no correlation was found between PMR in FDG PET and response in RECIST. The results reported in this study were different between the local investigators and a central review board, reflecting the investigator-dependent variability when assessing small tumor changes [52].

Wagner et al. report on 37 heavily pretreated patients within a phase I study, in which the HSP90 inhibitor retaspimycin hydrochloride was administered. This compound may be effective in TKI-resistant GIST, as it bypasses kinase mutational status and targets a fundamental pathway that supports oncoprotein activation [53]. FDG PET studies were performed at baseline, after the 2nd or 3rd cycle, and after drug holidays in 29/37 patients. EORTC criteria were used for the PET response assessment and PFS data were available. The overall objective response rate was 3% (one patient with partial remission), 73% achieved SD or PR at 6 weeks and 16% at 12 weeks based on RECIST. The overall FDG response rate (all patients with PMR) was 38%. Nineteen patients (66%) showed some overall reduction in FDG activity, and 2 patients (7%) progression of disease. FDG PET imaging conducted during drug holiday showed variable changes in SUVmax as compared to on-drug imaging. Median PFS was 10.6 weeks. The authors did not find any correlation between changes in FDG uptake and either CT response or duration of disease control [53].

All these studies fail to demonstrate a prognostic value of early metabolic response with respect to PFS.

Studies without a survival analysis

Bauer et al. studied 12 heavily pretreated patients receiving panobinostat, a pan-deacetylase inhibitor, combined with imatinib at baseline (prior to panobinostat) and 3 weeks later. They reported 1 patient with PMR, 7 patients with SMD and another 3 patients developing PMD. The longest treatment duration was 17 weeks [54].

In a phase 2 trial, Dickson et al. studied 23 heavily pretreated patients refractory to imatinib and sunitinib with another HSP90 inhibitor, BIIB021, which is an oral non-ansamycin inhibitor that binds to the ATP-binding pocket of HSP90. FDG PET was performed at baseline, on day 5, on day 8 (1st cycle) and on day 29 (1st day of 2nd cycle) after onset of treatment. Two different treatment protocols were used: one with a biweekly treatment (BIW) and a second one with a triweekly administration of the drug (TIW). FDG PET response was based on the EORTC criteria. Best response by FDG was PMR in 3/12 patients (25%) who received 600 mg BIW and in 2/11 patients who received 400 mg TIW (overall 22%). Best response by RECIST was SD in 4 patients receiving BIW and 6 patients receiving TIW (overall 43%). Best response according to Choi criteria was 1 patient from the BIW group with PR (overall 4%) and 6 patients with SD (2 patients from the BIW group and 2 patients from the TIW group, overall 26%). The response duration was 25–138 days. The study did not correlate PET results with survival data [55].

Comparison to other systematic reviews

Hassanzadeh-Rad et al. performed a systematic review and meta-analysis of 21 studies concerning the value of FDG PET for prediction of treatment response. The authors reported a pooled sensitivity of 90%, a pooled specificity of 62% and a pooled prognostic odds ratio of 14.99 [56]. Most of the studies used for this meta-analysis refer to imatinib treatment. However, the authors state that this meta-analysis has several limitations due to the heterogeneity between studies and because the predictive value of FDG PET was reported without adjusting for potential confounders like grading and staging. In our review, the large heterogeneity of the included studies precluded a meta-analysis. Factors contributing to the heterogeneity of the presented studies include the varying time intervals used for therapy monitoring, the lack of a clear reference for the assessment of therapy response (e.g. there are studies without survival data or CT studies using FDG PET as reference), the lack of standardized criteria for the assessment of therapy response with FDG (some studies use EORTC criteria, others do not), the lack of a baseline study in some of the studies, and the use of different PET systems (PET-only or PET-CT scanners). Table 1 demonstrates the variability of the studies included in this review.

The superiority of FDG PET for early assessment of therapy response to imatinib in GIST has been stated in a systematic review by Stefanelli et al. [57]. On the other hand, the authors concluded that more data are needed to establish the efficacy of FDG PET for the assessment of new therapies. Treglia et al. published another systematic review on the same topic and stated that therapy assessment based on FDG PET is a valuable method for early assessment of response not only to imatinib but also to other drugs, as well as for the prediction of therapy outcome [58]. However, this conclusion is not in accordance with our results, since no robust evidence has yet been provided that early treatment response with FDG PET correlates with therapy outcome in terms of survival [39–44, 51–55]. Therefore, more data are needed to support this statement.

General considerations and conclusions

Conventional morphological criteria based on changes in tumor size, such as RECIST, do not seem to solve the problem of assessing early response to TKI treatment, since they may underestimate the therapeutic effect. FDG PET studies performed as early as 1 week after the start of treatment demonstrate a change in metabolism, with a decrease in SUVmax in responders. In several cases of metabolic response, the respective changes in CT correspond only to small changes in tumor volume and tumor density. The more recently developed CT morphology-oriented criteria, like the Choi criteria or the iodine-related attenuation measured on DECT, are more sensitive than the RECIST criteria and show a better correlation to the FDG changes. PR based on RECIST is defined by a ≥30% decrease in the maximum diameter of all measurable lesions, while the Choi criteria use a 10% cutoff for the decrease in tumor size or 15% for the decrease in tumor density [59]. A decrease in tumor density may be due to necrosis, hemorrhage or myxoid degeneration as a sign of therapy response, which may contribute to an intercurrent increase in tumor size and lead to an underestimation of response to therapy. Therefore, although there is no clear evidence yet that early PET evaluation during treatment correlates with survival data, the data published so far suggest that it may be more reliable to evaluate metabolic changes rather than morphologic changes for early treatment assessment. FDG PET metabolic studies prior to therapy, as well as at least 8 weeks after onset of therapy, may be used for the assessment of the early therapeutic effect. A more than 50% reduction of SUV and/or an SUV of less than 2.5 in the follow-up study may be more robust criteria for the assessment of a sustained response. However, this should be assessed in prospective studies.
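The difference between the RECIST and Choi cutoffs for PR quoted above can be illustrated with a small worked example. The Python sketch below is an assumption-laden illustration only: the function names and the lesion values are invented, only the partial-response cutoffs stated in the text are encoded, and the remaining response categories of both systems are deliberately left out.

```python
def pct_decrease(baseline, followup):
    """Relative decrease from baseline in percent (negative if the value grew)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - followup) / baseline


def recist_pr(size_base_mm, size_fu_mm):
    """PR cutoff as quoted in the text: >=30% decrease in size."""
    return pct_decrease(size_base_mm, size_fu_mm) >= 30.0


def choi_pr(size_base_mm, size_fu_mm, hu_base, hu_fu):
    """PR cutoff as quoted in the text: >=10% size OR >=15% density decrease.

    Assumes a positive baseline attenuation in Hounsfield units.
    """
    return (pct_decrease(size_base_mm, size_fu_mm) >= 10.0
            or pct_decrease(hu_base, hu_fu) >= 15.0)


# Invented lesion: size 100 mm -> 92 mm (8% smaller), density 60 HU -> 45 HU (25% lower).
print(recist_pr(100.0, 92.0))            # False
print(choi_pr(100.0, 92.0, 60.0, 45.0))  # True
```

In this hypothetical case the lesion barely shrinks, so the size-only cutoff does not register a response, whereas the density decrease alone is sufficient under the Choi cutoff. This mirrors the argument that size-based criteria may underestimate the therapeutic effect.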

A multiparametric evaluation approach based on changes in both morphological data, including conventional as well as novel CT response criteria, and functional tumor data, like FDG metabolism and tumor perfusion, has to be assessed in further prospective studies.