Introduction

Gastric cancer is the fourth most commonly occurring cancer and second most common cancer-related cause of death in the world. The incidence of gastric cancer was estimated at 989,600 cases in 2008, with 61 % of the new cases being derived from Eastern Asia including China, Japan, and Korea [1].

Standard treatments for locally advanced gastric cancer differ among various regions: surgery followed by adjuvant chemoradiotherapy in the USA [2], pre- and postoperative chemotherapy in European countries [3], and postoperative chemotherapy in Asian countries [4, 5]. Preoperative chemotherapy is a part of the standard treatment used in Europe and has been evaluated in many clinical trials in other countries. However, there are no established short-term endpoints to screen the efficacy of preoperative chemotherapy regimens. The response rate based on the Response Evaluation Criteria in Solid Tumors (RECIST) [6] is one of the standard short-term endpoints, but is not always applicable to gastric cancer because locally advanced gastric cancer does not necessarily have a measurable lesion.

Pathological response rate (pathRR) is another commonly used endpoint in the preoperative settings of gastric cancer because it can be used even when there is no measurable lesion. PathRR is evaluated microscopically in a resected specimen of the stomach and is estimated based on the percentage of the residual tumor area in the primary tumorous bed. Although many phase II trials have adopted pathRR as the primary endpoint, there is no globally accepted consensus regarding the optimal cutoff percentage to determine the responder. Various definitions regarding the cutoff percentage of residual tumors such as 10 % [7–12], 40 % [13], 50 % [9, 14], or 67 % [15–23] have been used in previous clinical trials. According to the criteria proposed by Becker et al. [24, 25], 10 or 50 % is typically used as the cutoff percentage in Western countries, while 33 or 67 % is commonly used in Asian countries following the definition specified in the Japanese Classification of Gastric Carcinoma [26]. These differences in definitions between the East and West essentially impair the comparability of pathRRs between trials.

To increase the probability of success in phase III trials, it is essential to establish a short-term endpoint for phase II trials that predicts survival well. Against this background, we estimated the percentage of residual tumors on virtual microscopic slides as a continuous variable and determined which cutoff definition best predicted overall survival.

Methods

Included studies

The Stomach Cancer Study Group of JCOG has conducted a series of clinical trials evaluating preoperative chemotherapy. In this study, we used individual patient data from four phase II trials [17, 20, 22, 23] to evaluate the efficacy of preoperative chemotherapy. Details of the included studies are shown in Table 1. The main subjects of JCOG0002-DI and JCOG0210 were patients with Borrmann type 4 (linitis plastica) cancer, while those of JCOG0001 and JCOG0405 were patients with non-type 4 cancer with extended lymph node metastases.

Table 1 Details of included clinical trials

Pathological diagnosis

Hematoxylin-eosin (H&E)-stained pathological sections of resected tumors were collected from 24 participating institutions. Sections corresponding to the cut surface with the largest tumor diameter on the resected specimen were selected for each patient and were digitally captured on a virtual microscopic slide. The Japanese Classification of Gastric Carcinoma [26] was used for pathological diagnosis of the residual tumor and primary tumorous bed. For example, the primary tumorous bed volume was defined by microscopic findings such as necrosis, macrophage accumulation, or interstitial fibrosis below the submucosal layer. Inflammatory changes caused by peptic ulcer disease were excluded from the primary tumorous bed volume. Degenerative cancer cells were evaluated as viable cancer cells. Defining viable tumor cells was sometimes difficult, and any tumor cells identifiable under the microscope were regarded as viable unless they were totally necrotic, cyto-/karyolytic, or apoptotic. An example of non-viable cells is shown in Fig. 1a. The validity of the detailed criteria was examined by four pathologists (TK, TS, RK, HT) with a small number of cases prior to the consecutive pathological diagnosis for this study. According to the consensus criteria, the residual tumor area and primary tumorous bed were traced on a virtual microscopic slide by one pathologist (TK) and another (TS) confirmed these areas. If the opinions of the two pathologists differed, a consensus-based decision was made. The square measures of these two areas were automatically calculated by the software for virtual microscopic diagnosis (NanoZoomer Virtual Microscopy System, Hamamatsu Photonics). The percentage of the residual tumor, defined as the square measure of the residual tumor divided by that of the primary tumorous bed, was then calculated.
Tumor cells, particularly those in type 4 tumors, often exist sparsely in the interstitial area and sometimes the density of tumor cells is very low, for example, less than 0.1. These areas were identified separately, and the square measure multiplied by 0.1 was added to the sum of the residual tumor area. An example of a pathological diagnosis is shown in Fig. 1b.
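The calculation described above can be illustrated with a minimal Python sketch. The function name, parameter names, and example areas are hypothetical and are not taken from the study or from the NanoZoomer software; only the 0.1 weighting for low-density areas comes from the text.

```python
def percent_residual_tumor(residual_area, tumor_bed_area,
                           low_density_area=0.0, low_density_weight=0.1):
    """Return the residual tumor as a percentage of the primary tumorous bed.

    Areas may be in any unit as long as all three use the same one.
    low_density_area is the separately traced area with sparse tumor cells;
    it is down-weighted (0.1 in the main analysis) before being added.
    """
    if tumor_bed_area <= 0:
        raise ValueError("tumor_bed_area must be positive")
    weighted_residual = residual_area + low_density_weight * low_density_area
    return 100.0 * weighted_residual / tumor_bed_area


# Hypothetical areas in mm^2 (illustrative values, not study data)
main_estimate = percent_residual_tumor(12.0, 200.0, low_density_area=40.0)

# Same slide with the low-density area counted in full (weight 1.0)
full_weight_estimate = percent_residual_tumor(
    12.0, 200.0, low_density_area=40.0, low_density_weight=1.0
)

print(f"weight 0.1: {main_estimate:.1f} %")
print(f"weight 1.0: {full_weight_estimate:.1f} %")
```

Exposing the weight as a parameter makes it easy to see how strongly the estimate for a given slide depends on the way sparse tumor areas are counted.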

Fig. 1

a Example of tumor cells diagnosed as non-viable (arrows); b example of a pathological diagnosis of the residual tumor area and primary tumorous bed in a macroscopic type 3 tumor

Statistical consideration

According to the four typical cutoff percentages (10, 33, 50, 67 %), patients were classified into responder and non-responder groups. The primary outcome was overall survival, which was defined as the time from patient registration to death from any cause and was censored at the last day of follow-up for surviving patients. All patients were followed up for at least 3 years. The hazard ratio (HR) of non-responders to responders in overall survival was calculated for each cutoff percentage by stratified Cox regression analysis including the study as a stratification factor. Adjusted HRs were also estimated by a multivariate stratified Cox model including age, sex, performance status, pathological type, and macroscopic type as covariates. Concordance probability estimates (CPE) were also calculated from the stratified Cox model for each cutoff to investigate how well each cutoff discriminated overall survival [27]. A pair of patients was regarded as concordant if the patient in the responder group lived longer. CPE was the fraction of all pairs that were concordant and ranged from 0.5 to 1.0, with 0.5 indicating no association and 1.0 indicating a perfect association. All statistical analyses were performed using SAS 9.2 (SAS Institute, Cary, NC).
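For readers less familiar with the model, the unadjusted stratified Cox regression can be written in standard textbook notation as follows. The symbols are introduced here for illustration and do not appear in the original analysis: s indexes the trial (stratum), h_{0s}(t) is the trial-specific baseline hazard, and x is 1 for a non-responder and 0 for a responder at a given cutoff.

```latex
h_s(t \mid x) = h_{0s}(t)\,\exp(\beta x), \qquad \mathrm{HR} = \exp(\beta)
```

Under this coding, an HR above 1 means that non-responders had a higher hazard of death than responders within the same trial, so a larger HR indicates a cutoff that separates the two groups more strongly.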

Results

Of the 213 patients enrolled in the four trials, 188 underwent surgery, and pathological specimens were evaluated in 173 (92 %) of the 188 operated patients (Fig. 2).

Fig. 2

Flow diagram of the study population

The characteristics of all analyzed patients are shown in Table 2. Approximately two-thirds of patients had the histological diffuse type and 39 % of patients had Borrmann type 4 tumors. A total of 39 (23 %) out of 173 analyzed patients underwent R1/R2 resection.

Table 2 Patient characteristics

A pathological complete response was observed in only eight patients (4.6 %). There were 35 patients (20.2 %) with 1–10 % residual tumor, 33 patients (19.1 %) with 11–33 %, 27 patients (15.6 %) with 34–50 %, 23 patients (13.3 %) with 51–66 %, and 47 patients (27.2 %) with 67–100 %. Pathological response rates for the 10, 33, 50, and 67 % cutoffs were 25, 44, 60, and 73 %, respectively. Areas with a low density of tumor cells were identified in 36 patients (20.8 %) for whom the square measure of such areas was multiplied by 0.1 and was then added to the sum of the residual tumor area.
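The reported response rates follow directly from the category counts above, as this short Python check illustrates. The variable names are hypothetical; the counts are those given in the text, and the mapping from each cutoff to the categories counted as responders is inferred from the reported category boundaries.

```python
# Patient counts per residual-tumor category, in the order reported:
# 0 % (complete response), 1-10 %, 11-33 %, 34-50 %, 51-66 %, 67-100 %
category_counts = [8, 35, 33, 27, 23, 47]

# Number of leading categories counted as responders for each cutoff
# (assumption inferred from the category boundaries in the text)
responder_categories = {10: 2, 33: 3, 50: 4, 67: 5}

total = sum(category_counts)

# Pathological response rate per cutoff, rounded to a whole percentage
path_rr = {
    cutoff: round(100 * sum(category_counts[:k]) / total)
    for cutoff, k in responder_categories.items()
}

print(total)
for cutoff, rate in path_rr.items():
    print(f"{cutoff} % cutoff: pathRR = {rate} %")
```

The counts sum to the 173 analyzed patients, and the cumulative proportions reproduce the 25, 44, 60, and 73 % response rates.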

Prediction of overall survival

The HRs and CPEs for each cutoff percentage are shown in Fig. 3. In the overall population, the HR was largest with the 10 % cutoff, which also held in the multivariate analysis, while CPEs were almost the same for all cutoffs. When patients who underwent R1/R2 resection were excluded, both HR and CPE were largest with the 10 % cutoff.

Fig. 3

a Hazard ratio of overall survival and concordance probability estimates (CPEs) for the overall population (n = 173); b hazard ratio of overall survival and CPE for patients with R0 resection (n = 134)

Subgroup analyses

HRs and CPEs in the subgroup analyses for the macroscopic type (type 4/non-type 4) and histological type (intestinal/diffuse) are shown in Fig. 4. The 10, 33, or 50 % cutoffs did not predict survival well in the subgroup analysis for type 4 tumors, while the 67 % cutoff predicted survival moderately well. All cutoff percentages worked well in the subgroup analysis for non-type 4 tumors. All cutoff percentages worked well in the diffuse type, while only 10 % predicted overall survival moderately well in the intestinal type.

Fig. 4

Hazard ratio of overall survival and CPEs in subgroups: a macroscopic type 4 (n = 68), b macroscopic non-type 4 (n = 105), c pathological diffuse type (n = 115), and d pathological intestinal type (n = 58)

As a sensitivity analysis, we simply added low cellularity area to the residual tumor area and calculated HRs and CPEs. The HRs with the cutoff of 10, 33, 50, and 67 % were 0.92, 1.01, 1.40, and 1.36 for type 4 tumors and 2.57, 2.25, 1.91, and 1.75 for non-type 4 tumors. CPEs with respective cutoffs were 0.50, 0.50, 0.54, and 0.53 for type 4 tumors and 0.59, 0.60, 0.58, and 0.55 for non-type 4 tumors. These results were quite similar to those when multiplying by 0.1 for low cellularity area.

Discussion

In the present study, the 10 % cutoff was the best in terms of the hazard ratio in both the overall population and patients who underwent R0 resection, while CPEs were almost the same between 10 and 33 %. Based on these results, the 10 % cutoff is recommended in terms of predicting survival. In addition, the 10 or 33 % cutoff did not predict survival well in the subgroup analysis for type 4 tumors, which implied that the diagnosis of %residual tumor in type 4 tumors may not have been as accurate as that in non-type 4 tumors.

Several short-term endpoints have been used in clinical trials to evaluate preoperative therapy. The response rate is not applicable in many trials on gastric cancer because subjects include patients without measurable lesions. The R0 resection rate is another candidate, but the R0 resection rate is affected by selection bias. The complete response rate (CR rate) can also be used as a candidate; however, because it is commonly less than 10 % in gastric cancer, it is not a good endpoint to screen the efficacy of preoperative chemotherapy. The CR rate was only 4.6 % in the present study.

PathRR does not need any special modality and can be used without measurable lesions. Kurokawa et al. [28] demonstrated that response assessment validity was higher with pathRR with a cutoff of 67 % than with the response rate with RECIST. Becker et al. [25] showed in their multivariate analysis that pathRR with the 10 % cutoff remained a prognostic factor while R0 resection rate did not. Therefore, pathRR has currently become a common endpoint in preoperative settings in gastric cancer. However, different definitions of the cutoff percentage of residual tumors between the East and West have impaired the comparability of the results of different trials.

The 10 % cutoff was the best in terms of the hazard ratio in the overall population in the present study, while CPEs were almost the same between 10 and 33 %. Based on these results, the 10 % cutoff is recommended because of the larger hazard ratio observed in this study, the ease of the pathological diagnosis, and standardization of the definition between the East and West. Harmonizing the definitions used in the East and West may make it possible to compare the results of phase II trials from both regions, which would enable more efficient screening of treatments under development. In the current version of the Japanese Classification of Gastric Carcinoma, %residual tumors of both 1–10 % and 10–33 % are included in grade 2. Therefore, we propose a modification to the Japanese Classification of Gastric Carcinoma to include the 10 % cutoff in the histological grading system for preoperative chemotherapy.

A macroscopic type 4 tumor, linitis plastica type cancer, is a particular type of gastric cancer. It has been referred to as a scirrhous type, with tumor cells often existing sparsely in the interstitial area. Thus, identifying both the residual tumor area and primary tumorous bed was assumed to be difficult. In this study, the area with a low density of tumor cells was identified separately, and its square measure was multiplied by 0.1 and added to the sum of the residual tumor area. Nevertheless, the 10 or 33 % cutoff did not work well in the subgroup analysis for type 4 tumors, which implied that the diagnosis of %residual tumor in type 4 tumors may not have been as accurate as that in non-type 4 tumors. The area with a low density of tumor cells was multiplied by 1 for sensitivity analysis, and the results revealed the same trends for both type 4 and non-type 4 tumors. Thus, the pathological response rate is not recommended for clinical trials in which most subjects have macroscopic type 4 tumors.

The present study has some limitations. First, virtual microscopic slides were used to identify the areas determining %residual tumor in consideration of the reproducibility of the results. There may be a difference between the area diagnosis on the virtual slides and that on microscopic diagnosis in clinical practice. The reproducibility of the pathological area diagnosis in clinical practice should be verified in a multi-institutional setting. Second, determining the primary tumorous bed volume is generally harder than determining residual tumor volume. In addition to the criteria for evaluating primary tumorous bed volume employed in this study, we believe that including some clinical findings, especially endoscopic findings, may help to improve the assessment of primary tumorous bed volume. However, collecting this information for central review was not feasible in this multi-institutional study, and pathological evaluation was performed based only on pathological specimens.

In conclusion, the 10 % cutoff should be the global standard cutoff of %residual tumor to determine the pathological response rate. The pathological response rate might not be recommended for clinical trials in which most subjects have type 4 tumors.