Introduction

Gastrointestinal stromal tumor (GIST) is the most common gastrointestinal tract sarcoma, with an annual worldwide incidence of 7–19 cases per million inhabitants [1–3]. GISTs are found most often in the stomach, followed by the small intestine; however, a considerable number of GISTs are found in the colon, esophagus, and at other sites in the peritoneal cavity [4, 5]. Most GISTs have either KIT or platelet-derived growth factor receptor alpha (PDGFRA) mutations that are mutually exclusive and that are key molecular drivers of proliferation in GISTs [6, 7].

Surgery is indicated for primary resectable GIST and is the only curative therapeutic modality. Unfortunately, nearly 40 % of patients with resectable disease experience recurrence even after complete resection [8, 9]. After recurrence or when the tumor is unresectable, imatinib mesylate (Gleevec, Novartis Pharmaceuticals, Basel, Switzerland) is a first-line therapy [8, 10]. This agent has revolutionized the treatment of advanced GIST and acts by inhibiting the KIT or PDGFRA signaling pathways [11]. Randomized clinical trials have indicated that 1-year adjuvant imatinib therapy improves recurrence-free survival (RFS) compared with placebo and that 3-year adjuvant therapy improves not only RFS but also overall survival (OS) compared with 1-year treatment [12, 13]. Although imatinib is generally well tolerated, all patients have some adverse events. Since patients with very low-risk or low-risk GIST may not benefit from adjuvant treatment with imatinib [7, 13] and long-term imatinib therapy may impose a heavy burden of medical costs, patient selection for adjuvant treatment is critically important.

The prognostic factors for recurrence after surgery have been investigated previously, and tumor size, mitotic count, and tumor location are considered important and independent prognostic factors for recurrence in patients with R0 or R1 surgery for GIST [4, 14, 15]. Furthermore, the rare clinical event of tumor rupture has recently been identified as a prognostic factor [1, 16]. Other possible prognostic factors, such as macroscopic invasion, may also be useful in patient risk stratification [16].

To identify candidates who will benefit from adjuvant therapy, several risk-stratification systems that use the prognostic factors mentioned above have been proposed for analyzing patients after curative surgery for GIST [1, 4, 14–16]. The National Institutes of Health consensus criteria (NIHC) are based on tumor size and on the number of mitoses per 50 high-power fields (HPF) [14]. The Armed Forces Institute of Pathology Criteria (AFIPC) use tumor location in addition to size and mitotic count [4]. The Joensuu modified NIH classification (J-NIHC) combines the advantages of the NIHC and AFIPC with the additional factor of rupture [17]. The American Joint Committee on Cancer staging system (AJCCS) uses the TNM classification. Given these differences among the various risk stratification tools, it is not clear which classification is the best for selecting GIST patients for adjuvant therapy. Most of these data were reported from the USA or EU, and Asian data are practically lacking, although some differences in prognosis have been preliminarily indicated [18]. No reports have examined the validity of these risk stratification systems in a large data set of Japanese GIST cases. In the present study, we evaluated these previously proposed prognostic factors for GIST recurrence using a large set of Japanese data and also examined the sensitivity and accuracy of the reported risk classification systems in an adjuvant therapy setting.

Materials and methods

Patients

In this study, the data were retrospectively and prospectively collected from patients who underwent curative surgery for GIST between 1980 and 2010. Data for 67 patients treated between 1980 and 1989 were collected retrospectively, and data for 804 patients treated between 1990 and 2010 were collected prospectively. A total of 871 patients with GIST underwent surgery at our institutions and at hospitals affiliated with the Osaka University Hospital. Of these, 77 patients were treated with adjuvant and/or neoadjuvant chemotherapy, and 13 patients had liver metastasis and/or peritoneal dissemination at the time of surgery. Eight GISTs were microGISTs that were found incidentally in specimens resected for other diseases, including gastric cancers, and were diagnosed upon pathological examination of the specimens. Sixty-one patients lacked the minimal clinical information required for inclusion, such as age, gender, tumor size, mitotic count, or prognosis. These 159 GIST patients were excluded from further analyses. The remaining 712 patients who underwent surgery with curative intent (R0 or R1) were enrolled in the analysis. Postoperatively, the patients were not treated with any chemotherapeutic agents, including imatinib, until disease recurrence. Most postoperative follow-up was performed by periodic contrast-enhanced computed tomography to detect any recurrences and metastases, a schedule that proved to be very similar to that recommended in the GIST guidelines [5]. This study was conducted according to institutional ethics guidelines and was approved by the institutional review board at each institution.
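
The cohort accounting above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the counts reported in this paragraph; the dictionary keys and the function name `cohort_flow` are illustrative labels, and the sketch assumes that the four exclusion categories do not overlap, as the reported total of 159 implies.

```python
# Counts taken from the Patients paragraph; key names are illustrative labels.
collected = {
    "retrospective_1980_1989": 67,
    "prospective_1990_2010": 804,
}
excluded = {
    "adjuvant_or_neoadjuvant_chemotherapy": 77,
    "metastasis_or_dissemination_at_surgery": 13,
    "incidental_microgist": 8,
    "missing_minimal_clinical_information": 61,
}


def cohort_flow(collected, excluded):
    """Return (total collected, total excluded, number analyzed).

    Assumes the exclusion categories are mutually exclusive.
    """
    total = sum(collected.values())
    n_excluded = sum(excluded.values())
    return total, n_excluded, total - n_excluded


total, n_excluded, analyzed = cohort_flow(collected, excluded)
print(total, n_excluded, analyzed)  # 871 159 712
```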

Pathological diagnosis

When histopathology revealed spindle, epithelioid, or mixed features by hematoxylin and eosin (H&E) staining, and when immunohistochemical analysis showed KIT (CD117) and/or CD34 positivity, patients were diagnosed with GIST. The histopathological features, cell shape, and number of mitoses per 50 HPF were obtained by examination of H&E-stained specimens. Mitoses were counted at the highest power, and mean values were used for the analysis after counting the fields twice. For patients who lacked pathological data, including immunohistochemistry, the surgical specimens were histologically re-examined by one of the pathologists (S.H.) when the paraffin blocks were available and usable.

Risk stratification

Patients were classified using the NIHC, AFIPC, J-NIHC, AJCCS, and Japanese modified NIH criteria (m-NIHC) [1, 4, 14–16]. Since the NIH consensus criteria do not specify how to classify tumors with exactly 5 mitoses per 50 HPF or tumors that are exactly 2, 5, or 10 cm in size, we defined mitosis and tumor size in the NIH consensus criteria as follows: <5/50 or ≥5/50 HPF, and ≤10/50 or >10/50 HPF for mitosis, and <2 or ≥2 cm, ≤5 or >5 cm, and ≤10 or >10 cm for tumor size. The other classification systems in this analysis were used as in the original reports with some modifications. In brief, for the AFIP criteria, we stratified patients into 5 risk groups: no risk, very low risk, low risk, moderate risk, and high risk. Since a limited number of patients were analyzed, groups based on tumor location were defined: the “small intestine group” included GISTs in the duodenum, jejunum, and ileum; the “large intestine group” included GISTs in the colon and rectum; the “other group” included peritoneal or retroperitoneal GISTs. In the AJCC staging system, stages IA and IB were considered stage I, and stages IIIA and IIIB were considered stage III for both gastric and non-gastric locations. In the m-NIHC, GISTs with rupture or macroscopic invasion are classified separately as a clinically malignant group in addition to the very low risk, low risk, intermediate risk, and high risk groups of the NIH consensus criteria (Supplemental Table 1) [3, 16].
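
The boundary rules described above can be illustrated in code. The following is a minimal sketch, not the classification procedure actually used in this study: the cut-offs for size and mitotic count follow the definitions in this paragraph, whereas the mapping from size and mitotic bands to risk groups is an assumption based on the published NIH consensus table [14], and the function names are hypothetical. The m-NIHC wrapper only adds the clinically malignant group for rupture or macroscopic invasion as described in the text.

```python
def size_band(size_cm):
    """Bin tumor size using the cut-offs defined in the text."""
    if size_cm < 2:
        return "<2"
    if size_cm <= 5:
        return "2-5"
    if size_cm <= 10:
        return ">5-10"
    return ">10"


def mitosis_band(mitoses_per_50hpf):
    """Bin mitotic count per 50 HPF using the cut-offs defined in the text."""
    if mitoses_per_50hpf < 5:
        return "<5"
    if mitoses_per_50hpf <= 10:
        return "5-10"
    return ">10"


def nih_risk(size_cm, mitoses_per_50hpf):
    """Assumed mapping of size/mitosis bands to NIH consensus risk groups."""
    s = size_band(size_cm)
    m = mitosis_band(mitoses_per_50hpf)
    if s == ">10" or m == ">10":
        return "high"
    if s == ">5-10":
        return "intermediate" if m == "<5" else "high"
    # Remaining tumors are 5 cm or smaller.
    if m == "5-10":
        return "intermediate"
    return "very low" if s == "<2" else "low"


def m_nih_risk(size_cm, mitoses_per_50hpf, rupture=False, invasion=False):
    """m-NIHC sketch: rupture or macroscopic invasion forms a separate group."""
    if rupture or invasion:
        return "clinically malignant"
    return nih_risk(size_cm, mitoses_per_50hpf)


# Boundary cases that the original consensus criteria leave unspecified.
for size, mitoses in [(2.0, 4), (5.0, 5), (5.1, 5), (10.0, 4), (10.1, 0)]:
    print(size, mitoses, nih_risk(size, mitoses))
```

Under these assumptions, a tumor of exactly 5 cm with exactly 5 mitoses per 50 HPF falls in the intermediate-risk group, whereas a 5.1-cm tumor with the same mitotic count falls in the high-risk group.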

Statistical analysis

The original primary endpoint was RFS, which was calculated from the date of surgery to the date of first recurrence. Cause-specific survival (CSS) was calculated from the date of surgery to the date of death due to GIST or any death with GIST recurrence. OS was calculated from the date of surgery to the date of death. Survival was compared between groups using the Kaplan-Meier method and the log-rank test. A forward stepwise Cox proportional hazards model was used for multivariate analysis to evaluate risk factors associated with RFS. McNemar’s test was used to compare differences in sensitivity and accuracy between each risk classification system. The factors that were statistically significant at the 5 % level in univariate analysis were included as covariates in the multivariate model, and two-way interactions were then considered. Two-sided p values <0.05 were considered significant.
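
For readers unfamiliar with the Kaplan-Meier method, the product-limit estimate can be sketched in a few lines. The example below is illustrative only: the function name `kaplan_meier` and the five follow-up times are hypothetical and are not data from this study, and the statistical software actually used for the analysis is not specified in the text.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up time for each patient
    events : 1 if recurrence was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0   # recurrences observed at time t
        n_leaving = 0  # recurrences plus censored patients at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


# Hypothetical follow-up in months; 1 = recurrence, 0 = censored.
toy_times = [6, 12, 12, 20, 30]
toy_events = [1, 0, 1, 1, 0]
for t, s in kaplan_meier(toy_times, toy_events):
    print(t, round(s, 3))
```

In this toy example the estimate steps down only at months 6, 12, and 20, where recurrences occur; censored patients reduce the number at risk without lowering the curve.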

Results

Patients

The clinicopathological features of the 712 patients are shown in Table 1. The tumor sites included 549 GISTs in the stomach, 112 in the small intestine, 35 in the large intestine, 8 in the esophagus, and 8 in other sites. The median tumor size was 4.0 cm, and 70 of 712 patients (9.8 %) had tumors larger than 10 cm. The median number of mitoses per 50 HPF was 4, and 171 of 712 patients (24.0 %) had mitotic counts >10/50 HPF. At surgery, macroscopic invasion into neighboring structures, which does not always mean direct invasion, was clinically present in 22 of 711 patients (3.1 %), and tumor rupture (either spontaneous or due to surgery) occurred in 14 of 709 patients (2.0 %). Of the 22 GISTs with macroscopic invasion, 16 required multi-visceral resection because of macroscopic invasion, and 34 patients had tumor rupture and/or macroscopic invasion, both of which are clinically malignant features [16]. These GISTs with clinically malignant features were larger in size (median size 10.0 cm; p < 0.0001) and had a higher mitotic count (median mitoses = 15/50 HPF; p < 0.0001) than GISTs without these features (median size 4.0 cm and median mitotic count 3/50 HPF). The tumor cell types consisted of 513 spindle type, 25 epithelioid, and 20 mixed.

Table 1 Patient characteristics

Survival

During the median follow-up period of 50.2 months (range 0.1–310 months), there were 114 recurrences and 93 deaths. The estimated 5- and 10-year RFS rates were 82.3 and 77.9 %, respectively. Most GISTs appeared to relapse within the first 3 years after surgery, and a few (but not inconsequential) recurrences were observed after 5 years. The estimated 5- and 10-year OS rates were 87.9 and 79.5 %, respectively. The estimated 5- and 10-year CSS rates were 97.7 and 94.1 %, respectively.

Tumor size and mitotic count were strongly correlated with RFS, as shown in Fig. 1. Compared with the smallest GISTs (<2 cm), GISTs that were 2–5, 5.1–10, or >10 cm showed poorer prognosis, with hazard ratios (HRs) of 5.91 (95 % CI 0.79–44.01; p = 0.0829), 28.25 (95 % CI 3.82–208.83; p < 0.0001), and 51.75 (95 % CI 6.80–394.07; p < 0.0001), respectively. In terms of mitotic count, GISTs with 5–10 mitoses or with >10 mitoses per 50 HPF showed higher recurrence rates, with HRs of 3.52 (95 % CI 1.85–6.72; p < 0.0001) and 15.06 (95 % CI 8.49–26.72; p < 0.0001), respectively, compared with GISTs with the fewest mitoses (0–4 mitoses). Concerning tumor location, GISTs arising outside the stomach, including those in the small intestine, colon, and other locations, showed significantly more unfavorable outcomes than gastric GISTs, with an HR of 2.42 (95 % CI 1.58–3.70; p < 0.0001). Among non-gastric GISTs, there was no difference in outcome according to tumor site in this study. As indicated in previous studies [1, 16], tumor rupture (n = 14) was associated with poor prognosis (median RFS 1.8 years for rupture; p < 0.0001). Macroscopic invasion (n = 22) was also associated with worse prognosis (median RFS 1.4 years for invasion; p < 0.0001). Taken together, the occurrence of rupture and/or invasion was associated with poor RFS [HR = 15.68 (95 % CI 7.26–33.88) p < 0.0001] (Fig. 1). Although men had marginally but significantly poorer prognoses than women [HR = 1.52 (95 % CI 1.01–2.30) p = 0.04476], age and histological cell type were not correlated with RFS (Table 2).
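
The relationship between a hazard ratio, its 95 % CI, and the p value can be made explicit with a short calculation. The following sketch assumes that the reported intervals are Wald intervals that are symmetric on the log-hazard scale, which the text does not state; under that assumption the standard error can be recovered from the interval width and an approximate two-sided p value follows from the normal distribution. The function name `wald_from_ci` is hypothetical, and small differences from the reported p values are expected because of rounding and because the original test may have been a likelihood-ratio or score test.

```python
import math


def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover SE(log HR), the z statistic, and a two-sided p value.

    Assumes the 95 % CI is exp(log(HR) +/- z_crit * SE).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return se, z, p


# Values reported in the text for tumors of 2-5 cm and for male sex.
for label, hr, lo, hi in [("size 2-5 cm", 5.91, 0.79, 44.01),
                          ("male sex", 1.52, 1.01, 2.30)]:
    se, z, p = wald_from_ci(hr, lo, hi)
    print(label, round(se, 3), round(z, 2), round(p, 3))
```

For the 2–5 cm group, the wide interval spanning 1 corresponds to a large standard error, which is why an HR of 5.91 does not reach significance at the 5 % level.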

Fig. 1

Recurrence-free survival by tumor size (a), mitosis count (b), site (c), and tumor rupture and/or invasion (d). HPF high power field

Table 2 Univariate analysis of risk factors for recurrence-free survival (RFS)

Multivariate analysis indicated that four factors were independently correlated with RFS: tumor size >5 cm [HR = 3.38 (95 % CI 2.10–5.43) p < 0.0001], mitotic count >5/50 HPF [HR = 7.11 (95 % CI 4.32–11.72) p < 0.0001], non-gastric location [HR = 2.72 (95 % CI 1.74–4.25) p < 0.0001], and the occurrence of rupture and/or invasion [HR = 4.33 (95 % CI 2.37–7.87) p < 0.0001] (Table 3). Other factors, including age, gender, and cell type, were not significant.

Table 3 Multivariate analysis of risk factors for recurrence-free survival (RFS)

Risk-group stratification and outcome analysis

Each risk group showed different RFS curves (p < 0.0001) for all classifications (Fig. 2). The high-risk groups identified by the NIHC, J-NIHC, and AFIPC, stage III in the AJCCS, and the high-risk and clinically malignant groups in the m-NIHC all showed high recurrence rates. Further, most of the patients in the “no risk,” “very low risk,” “low risk,” and “stage I” groups did not have recurrences. When GISTs with rupture and/or macroscopic invasion were categorized as a “clinically malignant group,” the recurrence rate in this group was estimated to be >90 %; the recurrence rate in the high-risk group was nearly 50 % in this study. This suggests that high-risk GISTs, as well as clinically malignant GISTs, have ominous prognoses even after complete resection. Patients in these groups should be considered candidates for adjuvant therapy, which was recently confirmed by the results of the EORTC study [19].

Fig. 2

Recurrence-free survival according to the NIH consensus criteria (a), AFIP criteria (b), Joensuu’s modified NIH classification (c), AJCC staging (d), and the Japanese modified NIH criteria (e)

Next we evaluated the sensitivity and accuracy of each classification method in predicting recurrent GISTs. The J-NIHC showed the highest sensitivity for predicting recurrence (compared with all other risk classifications: p < 0.0001). However, AJCCS was the most accurate (compared with AFIPC: p = 0.003; compared with others: p < 0.0001) of the five classifications evaluated in this study population (Table 4). The number of patients with ruptured GIST and/or GIST with macroscopic invasion was small; therefore, their addition to the high-risk group had little impact on sensitivity or accuracy determinations.
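
The quantities compared here can be written out explicitly. The sketch below is illustrative and rests on stated assumptions: a patient is counted as predicted to recur when a classification places the patient in its high-risk category, sensitivity is the fraction of recurrent cases so identified, accuracy is the fraction of all patients classified correctly, and two systems are compared on the same patients with an exact McNemar test based on the discordant pairs. The function names and the toy data are hypothetical, and the exact variant of McNemar’s test used in this study is not specified in the text.

```python
from math import comb


def sensitivity_accuracy(predicted_high, recurred):
    """Sensitivity and accuracy of a high-risk label as a predictor of recurrence."""
    pairs = list(zip(predicted_high, recurred))
    tp = sum(1 for p, r in pairs if p and r)
    fn = sum(1 for p, r in pairs if not p and r)
    tn = sum(1 for p, r in pairs if not p and not r)
    sensitivity = tp / (tp + fn)
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, accuracy


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p value.

    b : patients classified correctly by system A only
    c : patients classified correctly by system B only
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical example with five patients.
predicted = [True, True, False, False, True]
recurred = [True, False, False, True, True]
print(sensitivity_accuracy(predicted, recurred))
print(mcnemar_exact(0, 10))
```

Because both systems are applied to the same patients, only the discordant pairs carry information about the difference between them, which is why a paired test such as McNemar’s is appropriate here.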

Table 4 Sensitivity and accuracy of each classification system

Discussion

GIST is only curable after complete resection, although targeted therapy with imatinib or sunitinib greatly improves the survival of patients with advanced GIST. Several retrospective studies have identified tumor size, mitotic count, and location as independent prognostic factors for primary GISTs [1, 4, 16, 17, 20–22]. Recently, tumor rupture was suggested to be an important prognostic factor, although its incidence appears to be relatively low [1, 16, 23]. The study further indicated that patients who have localized GISTs with macroscopic invasion, which might require combined resection of the surrounding organs, have a poor prognosis similar to those with rupture. This study confirmed that size, mitotic count, location, and rupture and/or invasion were independent prognostic factors after complete resection of primary GIST. Most GISTs with rupture and/or invasion were larger and had higher mitotic counts and were therefore mostly categorized as high risk by the NIHC and AFIPC classification systems. In fact, all GISTs with macroscopic invasion were classified as being in a high-risk group in this study when either the NIHC or AFIPC was used. Patients with rupture and/or invasion appeared to have much higher recurrence rates than those with high-risk GIST, especially in the early postoperative period, as shown in Fig. 2e [1, 16, 23]. These results suggest that, unlike high-risk GIST, ruptured and invasive GIST may be considered a potential systemic disease and may require a combination of surgery and imatinib therapy for more than 3 years.

Gastrointestinal stromal tumor risk classifications are useful in follow-up after surgery and in decision making for initiating adjuvant therapy in clinical practice [5, 7]. The five classification systems evaluated in this study utilize very similar prognostic factors, and it remains unclear which of these risk classification tools is the best for predicting recurrence after curative surgery for GIST. In adjuvant therapy, most clinical studies include high-risk GIST, and the Z9001 study showed that only patients with high-risk GIST benefit from adjuvant treatment [1, 15, 24]. A recent report of the EORTC adjuvant study suggested that patients with intermediate-risk GIST do not appear to be good candidates for adjuvant therapy [19]. These results suggest that only patients with high-risk GIST might be candidates for adjuvant therapy.

In this study, we evaluated the sensitivity and accuracy of five classification systems for identifying patients at high risk of recurrence who were candidates for adjuvant therapy. We found that the J-NIHC had the highest sensitivity for predicting recurrence and that the AJCCS was the most accurate. Considering the high tolerability and low toxicity of imatinib, identifying the patients who are going to have recurrences may be the clinically important consideration in the selection of adjuvant therapy. In this respect, we select the J-NIHC to identify candidates for adjuvant chemotherapy. Another option for risk assessment in adjuvant therapy is the use of the Gold nomogram and heat maps [20, 24], which may give an individualized estimate of the recurrence risk for each patient. Recurrence risk as calculated by the nomogram may vary greatly when the mitotic count is below or above 5/50 HPF and may differ in various populations, as seen in a global study [18].

In conclusion, this study identified or confirmed key factors that impact disease recurrence after complete resection. These factors include the mitotic rate, size, location, and rupture and macroscopic invasion. The group with the highest risk of recurrence is patients with ruptured GIST and/or GIST with macroscopic invasion. These patients and those with high-risk GIST should be considered candidates for adjuvant therapy. Of the five risk classification systems that we examined, the J-NIHC might identify the most patients who are candidates for imatinib therapy (i.e., patients who are likely to have relapses), but a substantial number of patients who will not have recurrences would receive imatinib if the J-NIHC were used for decision making. The J-NIHC appeared to be better for selecting patients for adjuvant treatment because of the high tolerability of adjuvant imatinib therapy.