Introduction

In Western countries, the vast majority of gastric cancers are detected in the advanced stages, requiring multimodality therapy, and preoperative chemotherapy or chemoradiotherapy has become the standard of care [1, 2]. In Japan and Korea, on the other hand, half of newly diagnosed gastric cancers are stage I cancers, and are curable by surgery alone; thus the treatment strategy is gastrectomy first, followed by adjuvant chemotherapy only when pathological stage II or pathological stage III disease is confirmed [3]. In this situation, neoadjuvant chemotherapy (NAC) has been used only for marginally resectable tumors such as linitis plastica or those with extensive bulky nodal metastasis [4–6]. Although large-scale randomized controlled trials (RCTs) of adjuvant therapy in Japan and Korea showed excellent 5-year survival rates of more than 70% in patients with pathological stage II/III tumors [7, 8], the outcome for patients with pathological stage III tumors is still unsatisfactory, and a switch to NAC is being considered for this stage in expectation of greater chemotherapeutic intensity and effect than with postgastrectomy therapy.

A problem with the intensive NAC strategy is the possible inclusion of patients with pathological stage I disease, who may experience grave adverse events from unnecessary chemotherapy. This has not been faced seriously in previous NAC trials in the West because early disease was rare and the survival of the whole population was poor. In the European MAGIC trial [8, 9], 3% of the patients who underwent gastrectomy in the group that had surgery alone had pT1 disease, and it is estimated that a similar proportion of patients in the perioperative chemotherapy group received unnecessary intensive chemotherapy. This seems to have been disregarded in the shadow of overall survival benefit with the NAC strategy. However, Japanese physicians believe that the inclusion of patients with pathological stage I disease could be, and should be, minimized by appropriate pretreatment diagnosis. The diagnostic accuracy of the T and N categories of gastric cancer by endoscopy, endoscopic ultrasonography (EUS), and/or computed tomography (CT) has been reported to fall within a wide range [10, 11], and no criterion has been proposed to diagnose pathological stage II/III disease with minimal inclusion of pathological stage I disease. In this study, we set up a diagnostic criterion and tested its validity in a prospective multicenter setting for future NAC trials.

Patients and methods

This study was conducted by the Stomach Cancer Study Group of the Japan Clinical Oncology Group (JCOG). Preoperative diagnoses of the depth of tumor invasion (cT category) and lymph node metastasis (cN category) were prospectively recorded with use of fixed staging criteria and were compared with postoperative pathological diagnoses (pT category, pN category, and pathological stage).

Patients satisfying all the following criteria were eligible: (1) histologically proven gastric adenocarcinoma; (2) gastric endoscopy and contrast-enhanced CT had been performed, and at least one of them gave a diagnosis of cT2 (proper muscle layer) or deeper invasion; (3) no linitis plastica (Borrmann type 4) or diffuse ulcerative tumor (type 3) 8 cm or larger; (4) no esophageal invasion, or esophageal invasion of 3 cm or less; (5) no paraaortic or extensive bulky lymph node metastases (one metastasis larger than 3 cm or two metastases larger than 1.5 cm along the celiac, splenic, common, or proper hepatic arteries); (6) no evidence of stage IV disease; (7) no prior staging laparoscopy; (8) no previous gastrectomy, chemotherapy, or radiotherapy for gastric cancer. Criteria 3 and 5 were set because these tumors had already been evaluated in other NAC trials by our group [4, 5].

Endoscopic diagnosis of T category was made and recorded according to the Japanese conventional standard. EUS was optional, and when performed, its diagnosis was recorded separately from that obtained by endoscopy. Multidetector CT using more than four detector rows with a slice thickness of 1 or 5 mm was recommended for T category diagnosis. Model CT images for T staging were provided in the protocol as a diagnostic aid. Before registration, cT category was determined in each patient at preoperative conferences in each institution. When the diagnoses regarding the depth of invasion by the two compulsory diagnostic modalities or with the optional EUS were not consistent, the conference members were asked to reach a final conclusion on the cT category. All the lymph nodes having either a minor axis of 8 mm or greater or a major axis of 10 mm or greater were diagnosed as metastatic and were recorded with nodal size and their anatomic location (station numbers). The lymph nodes not satisfying the aforementioned criteria but clinically suspicious of metastasis were also recorded with their size and location. When the lymph nodes were aggregated and diagnosed as metastatic, the size and location of the individual nodes were recorded.
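The size rule for lymph nodes described above can be written as a short predicate. This is a minimal sketch, not the study's data-capture software: the `Node` record and its field names are hypothetical, and only the two thresholds (minor axis 8 mm or greater, major axis 10 mm or greater) come from the protocol as stated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One lymph node measured on CT (hypothetical record layout)."""
    station: str          # anatomic station number, kept as free text here
    minor_axis_mm: float
    major_axis_mm: float


def meets_size_criterion(node: Node) -> bool:
    """True if the node counts as metastatic by size alone:
    minor axis >= 8 mm OR major axis >= 10 mm."""
    return node.minor_axis_mm >= 8.0 or node.major_axis_mm >= 10.0


# Illustrative measurements only; these are not study data.
nodes = [Node("3", 8.0, 9.0), Node("7", 6.0, 10.0), Node("4d", 7.9, 9.9)]
flagged = [n.station for n in nodes if meets_size_criterion(n)]
```

Nodes that fail the size rule but are clinically suspicious were recorded separately in the study, so a fuller representation would need an additional flag for that judgement.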

Within 35 days of enrollment, patients underwent laparotomy, and, when the disease was judged resectable, gastrectomy with D2 lymph node dissection according to the third version of the Japanese Gastric Cancer Treatment Guidelines of the Japanese Gastric Cancer Association [12]. Pathological examination of the resected specimens was performed in each institution on the basis of the 14th edition of the Japanese Classification of Gastric Carcinoma [13].

The diagnostic criterion tested in this study was “cT3 (subserosal invasion) or cT4 (perforating the serosa or invading the adjacent organs) regardless of lymph node diagnosis.” This had been proposed by surgeons of the National Cancer Center Hospital, the leading institution of this study, on the basis of their retrospective study of 225 patients (data not published), in which the proportion of pathological stage I disease was 3.2% and the sensitivity for pathological stage III disease was 94.1% by this criterion.

The primary endpoint was the proportion of pathological stage I tumors among those clinically diagnosed as T3/T4. Secondary endpoints were positive predictive value and sensitivity of the diagnosis for pathological stage III tumors, the proportion of pathological stage II tumors among cT3/T4 tumors, and positive predictive value, negative predictive value, sensitivity, and specificity of CT diagnosis for lymph node metastases. We hypothesized that the proportion of pathological stage I tumors among those diagnosed as cT3/T4 would be less than 5% on the basis of our previous study. If this primary endpoint was met, we would judge that pathological stage III disease can be appropriately selected for NAC trials. If the primary endpoint was not met, we would explore more appropriate criteria using the data collected. The expected value of the primary endpoint was set at 3%, and the half width of the 95% confidence interval based on the Clopper–Pearson method was set to be at most 1.5%. Taking the variation of the point estimate of the primary endpoint into account, we set the probability that the half width of the 95% confidence interval for any point estimate was within 1.5% at 80%. Under these conditions, the sample size was calculated to be 968. Considering the uncertainty in clinical diagnosis of T category as well as possible further exploratory studies, we planned to collect data on cT2 (proper muscle layer) tumors as well. Because cT3/T4 tumors were expected to account for 80% of the eligible tumors, the sample size was determined to be 1250.
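For readers unfamiliar with the Clopper–Pearson (exact binomial) interval named above, the following sketch shows one way to compute it with the Python standard library. It is an illustration under stated assumptions, not the JCOG Data Center's calculation: the bisection solver and function names are ours, and the example count of 29 events in 968 patients is simply the count closest to the expected 3%. The sketch evaluates the interval at that single point and does not reproduce the 80% probability condition used for the sample size.

```python
import math


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), with terms computed in log space."""
    if k < 0:
        return 0.0
    if k >= n or p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1)
                    - math.lgamma(n - i + 1) + i * log_p + (n - i) * log_q)
        total += math.exp(log_term)
    return min(total, 1.0)


def _solve_increasing(f, target: float, iterations: int = 50) -> float:
    """Bisection on [0, 1] for f(p) = target, where f increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Exact two-sided (1 - alpha) confidence interval for x events in n trials."""
    # Lower limit: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else _solve_increasing(
        lambda p: 1.0 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper limit: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else _solve_increasing(
        lambda p: 1.0 - binom_cdf(x, n, p), 1.0 - alpha / 2)
    return lower, upper


# Assumed example: 29 of 968 (about 3%), the expected value of the primary endpoint.
low, high = clopper_pearson(29, 968)
half_width = (high - low) / 2
```

As a separate arithmetic check, dividing the 968 required cT3/T4 tumors by their expected 80% share gives 1210 patients, which is below the planned total of 1250.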

This study protocol was approved by the JCOG Protocol Review Committee and the institutional review board of each participating hospital before initiation of the study. This study was done in accordance with the international ethical recommendations stated in the Declaration of Helsinki [14] and the Japanese Ethical Guidelines for Epidemiological Research. The JCOG Data Center conducted central monitoring to ensure data submission, patient eligibility, and on-schedule study progress.

Results

Between July 2013 and November 2014, 1275 patients were enrolled from 53 Japanese specialized institutions (Fig. 1). Of these, eight did not undergo surgery and seven did not fulfill the eligibility criteria (three, type 4 or large type 3; two, intraoperative frozen section of lymph node revealed malignant lymphoma; two, multiple advanced lesions). Thus, 1260 patients were eligible for primary analysis. Of these, 29 patients underwent nonresection surgery (exploration or bypass) because of peritoneal dissemination, hepatic metastasis, or gross T4b disease found at laparotomy, for which pT and pN categories were not available.

Fig. 1 Patient enrollment flowchart

Table 1 gives the clinicopathological features of the patients. Table 2 gives the cross table for cT category and pathological stage. Of 928 tumors that had been clinically diagnosed as T3/T4, 114 (12.3%) were pathological stage I tumors (primary endpoint). The proportion of pathological stage II tumors among these 928 tumors was 31.8% (295/928). The positive predictive value of diagnosis of cT3/T4 for pathological stage III tumors was 43.6% (405/928), and 405 of 462 pathological stage III tumors were diagnosed as cT3/T4 (sensitivity 87.7%). Of the 928 patients, 114 (12.3%) had pathological stage IV disease at laparotomy, most cases of which were due to positive cytology findings and/or peritoneal dissemination. Most of them underwent palliative gastrectomy, except for the 29 patients described before.
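The percentages in this paragraph follow directly from the reported counts. The short sketch below recomputes them; the counts are taken from the text, while the variable names and the `pct` helper are ours.

```python
# Pathological stage of the 928 tumors clinically diagnosed as T3/T4 (from the text).
ct34_by_pstage = {"I": 114, "II": 295, "III": 405, "IV": 114}
ct34_total = sum(ct34_by_pstage.values())

# All pathological stage III tumors in the study, whatever the clinical T category.
pstage3_total = 462


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place, as reported in the paper."""
    return round(100 * numerator / denominator, 1)


stage1_share = pct(ct34_by_pstage["I"], ct34_total)        # primary endpoint
stage2_share = pct(ct34_by_pstage["II"], ct34_total)
ppv_stage3 = pct(ct34_by_pstage["III"], ct34_total)        # positive predictive value
sensitivity_stage3 = pct(ct34_by_pstage["III"], pstage3_total)
stage4_share = pct(ct34_by_pstage["IV"], ct34_total)
```

The four stage counts sum to 928, so the stage I to IV breakdown of the cT3/T4 group is internally consistent.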

Table 1 Clinicopathological characteristics of the patients analyzed (N = 1260)
Table 2 Clinical depth diagnosis and pathological stage (N = 1257)

Tables 3 and 4 give the cross tables for clinical and pathological diagnosis of tumor depth and lymph node metastasis respectively. Of 928 tumors diagnosed as cT3/T4, 71 (7.7%) and 141 (15.2%) were pathologically T1 and T2 tumors respectively. EUS was performed in 92 patients (7.3%), but its use did not increase the diagnostic accuracy (data not shown). Of 650 patients in whom lymph node metastasis was diagnosed, 505 had histologically positive nodes (positive predictive value 77.7%), whereas 278 of 581 patients who were node negative clinically did not have histological nodal metastasis (negative predictive value 47.8%). The sensitivity and specificity for nodal metastasis by the CT criteria in this study were 62.5% (505/808) and 65.7% (278/423) respectively.
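The four accuracy measures for nodal diagnosis all derive from one 2 × 2 table. The sketch below rebuilds that table from the counts in the text (650 clinically positive, of whom 505 were pathologically positive; 581 clinically negative, of whom 278 were pathologically negative); the variable names are ours.

```python
# 2 x 2 table for CT nodal diagnosis versus pathology, rebuilt from the text.
true_pos = 505                 # cN positive, pN positive
false_pos = 650 - true_pos     # cN positive, pN negative
true_neg = 278                 # cN negative, pN negative
false_neg = 581 - true_neg     # cN negative, pN positive


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


ppv = pct(true_pos, true_pos + false_pos)
npv = pct(true_neg, true_neg + false_neg)
sensitivity = pct(true_pos, true_pos + false_neg)
specificity = pct(true_neg, true_neg + false_pos)
n_evaluable = true_pos + false_pos + true_neg + false_neg
```

The reconstructed denominators (808 pathologically node-positive and 423 node-negative patients) match those quoted for sensitivity and specificity, and the total matches the N of Table 4.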

Table 3 Clinical and pathological diagnosis of depth of tumor invasion (N = 1258)
Table 4 Clinical and pathological diagnosis of lymph node metastasis (N = 1231)

As the primary endpoint was not met, we explored diagnostic criteria to select pathological stage III tumors with minimal inclusion of pathological stage I tumors. The following two criteria were considered: (1) clinical stage III (cT2N3, cT3N2/N3, or cT4N1/N2/N3; criterion A) and (2) cT3/T4 with cN1/N2/N3 (criterion B). The proportion of pathological stage I tumors included and the sensitivity for pathological stage III tumors by each criterion are compared in Table 5, together with the original criterion of this study, “cT3/T4” (criterion C). Criterion A shows a low inclusion rate of pathological stage I tumors (4.6%), but its sensitivity for pathological stage III tumors is as low as 52.4%. Criterion B shows a slightly higher inclusion proportion of pathological stage I tumors (6.5%) but higher sensitivity for pathological stage III tumors (64.5%) than criterion A. We propose use of criterion B to select patients for further NAC studies in Japan.
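The three criteria differ only in which combinations of cT and cN they accept, which can be made explicit as predicates. This is a sketch under a simplifying assumption: cT and cN are encoded as plain integers (cT4 is not split into T4a and T4b), and the function names are ours.

```python
def criterion_a(ct: int, cn: int) -> bool:
    """Clinical stage III as enumerated in the text:
    cT2N3, cT3N2/N3, or cT4N1/N2/N3."""
    return (ct, cn) in {(2, 3), (3, 2), (3, 3), (4, 1), (4, 2), (4, 3)}


def criterion_b(ct: int, cn: int) -> bool:
    """cT3/T4 with cN1/N2/N3."""
    return ct >= 3 and cn >= 1


def criterion_c(ct: int, cn: int) -> bool:
    """Original study criterion: cT3/T4 regardless of nodal diagnosis."""
    return ct >= 3


# List which criteria accept each combination in the eligible range (cT2 to cT4).
selected = {
    (ct, cn): [name for name, rule in
               (("A", criterion_a), ("B", criterion_b), ("C", criterion_c))
               if rule(ct, cn)]
    for ct in (2, 3, 4) for cn in (0, 1, 2, 3)
}
```

Every combination accepted by criterion B is also accepted by criterion C, whereas criterion A is not a subset of either because it admits cT2N3.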

Table 5 Comparison of the three clinical diagnostic criteria

Discussion

For the purpose of planning RCTs of NAC for gastric cancer in Japan, we evaluated a diagnostic criterion, “cT3/T4 by endoscopy and CT regardless of nodal status.” This criterion was expected to appropriately select pathological stage III gastric cancer patients with high sensitivity (90%) and minimal inclusion of pathological stage I gastric cancer patients (less than 5%). We considered this expectation plausible because it had been well documented that 70% of pT3/T4 tumors have histological lymph node metastasis regardless of the nodal appearance [15]. Although the sensitivity of this criterion for pathological stage III tumors was as high as expected (87.7%), the proportion of pathological stage I tumors was much higher than expected (12.3%) and did not meet the primary endpoint of this study. We conclude that this criterion is not suitable for NAC trials in Japan.

The reasons for this high rate of inclusion of pathological stage I tumors are multifactorial. First, we excluded some high-risk tumors from this study, that is, linitis plastica, large diffuse ulcerative tumors, and those with extensive bulky nodal disease, which are almost never pathological stage I lesions. We did so because these tumors had already been evaluated in our own NAC trials and we had no intention to include them in the next RCT [4–6]. Second, the prevalence of pathological stage I disease in our patients was not low. Predictive values of a diagnostic test are largely influenced by the prevalence of the disease, and if our diagnostic criterion had been tested in a population that has very low prevalence of pathological stage I disease, the results would have been different. Third, preoperative T staging for gastric cancer inherently has limitations. There have been many reports of accurate depth diagnosis of early gastric cancer using EUS and/or narrowband imaging as well as conventional endoscopy [16–18], but few gastroenterologists have shown interest in differentiating T2, T3, and T4 tumors. Indeed, in the current trial, where T1 tumors were excluded, EUS was used for T staging in only 92 patients (7.3%), and it did not increase the diagnostic accuracy. Contrast-enhanced multidetector CT cannot differentiate them correctly, either. We found that 212 of 928 tumors (22.8%) diagnosed as T3/T4 tumors by a combination of endoscopy and CT were pathologically T1/T2 tumors. This overdiagnosis was mainly caused by intratumoral peptic ulceration in T1/T2 lesions that made the tumor look deeper on endoscopy and thicker on CT. Even EUS would not have correctly distinguished the ulcerative fibrosis from tumor invasion.
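The second point, the dependence of predictive values on prevalence, can be written out with Bayes' theorem. The notation is ours and is not taken from the study protocol: \(\pi_s\) denotes the prevalence of pathological stage \(s\) in the eligible population, and the sum runs over all pathological stages.

```latex
\[
P(\text{pStage I} \mid \text{cT3/T4})
  = \frac{P(\text{cT3/T4} \mid \text{pStage I})\,\pi_{\mathrm{I}}}
         {\sum_{s} P(\text{cT3/T4} \mid \text{pStage } s)\,\pi_{s}}
\]
```

If the conditional probabilities of a cT3/T4 diagnosis are held fixed, lowering \(\pi_{\mathrm{I}}\) lowers the proportion of pathological stage I tumors among cT3/T4 diagnoses, which is why the same criterion can perform differently in populations with fewer early cancers.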

Lymph node assessment by size alone also has limitations. Our cutoff values (a minor axis of 8 mm or greater or a major axis of 10 mm or greater) are commonly used for the diagnosis of metastasis [19]. The low positive predictive value (77.7%) and sensitivity (62.5%) were similar to those in previous reports [20, 21]. Positron emission tomography would add useful information, but its sensitivity for gastric cancer with diffuse-type histological appearance is known to be low [22], and its routine preoperative use is not realistic.

We tested some combinations of cT and cN diagnosis to improve the selection (Table 5). Naturally, there is a trade-off between the inclusion of pathological stage I tumors and the sensitivity for pathological stage III tumors, and despite various changes in cutoff values for nodal size (e.g., major axis 12 or 15 mm) or CT slice thickness, we were unable to determine a diagnostic criterion that satisfies our initial expectation (data not shown). For future NAC trials, we have to choose either a low rate of inclusion of pathological stage I tumors or high sensitivity for pathological stage III tumors. In Japan, where T1/T2 tumors are commonly detected and mostly cured safely by the “surgery first” policy without adjuvant chemotherapy, many physicians are reluctant to give toxic chemotherapy to patients who may have pathological stage I disease. Therefore, we will choose the criterion having a low rate of inclusion of pathological stage I disease rather than high sensitivity for pathological stage III disease. With our new criterion, “cT3/T4 and cN1/N2/N3”, one third of patients with pathological stage III tumors will be excluded from the NAC trial. For these patients, new diagnostic modalities or therapeutic strategies should be evaluated in different settings.

Preoperative staging by endoscopy and CT also has limitations in detecting peritoneal disease. Despite the exclusion of high-risk cases such as linitis plastica or large diffuse ulcerative tumors, 12.3% of our patients were found to have positive lavage cytology findings and/or visible intra-abdominal metastases at the time of surgery. Staging laparoscopy would detect them before treatment, but its routine use cannot be accepted in daily practice for tumors with such a low possibility of peritoneal metastasis. For an NAC study, the inclusion of these patients with a low volume of peritoneal disease that might benefit from chemotherapy will be accepted.

In conclusion, in Japan, where gastric cancer is detected in early stages and treated with primary surgery, the benefits of NAC should be tested with carefully selected patients. Clinical diagnosis of T3/T4 tumors made possible selection of nearly 90% of patients with pathological stage III disease but was associated with undesirable inclusion of more than 10% of patients with pathological stage I disease. We propose the stricter criterion “cT3/T4 and cN1/N2/N3” as a more appropriate option for the next NAC trial.