Introduction

Gastric cancer (GC) is a major global health issue. About 1 million new cases occur annually across the globe, half of which occur in East Asia, including Korea [1, 2]. Although the incidence and mortality of GC have gradually declined over the past few years with widespread screening, the dismal prognosis of advanced GC is still threatening public health in Korea [2]. On the basis of benefits in terms of survival and quality of life compared with best supportive care [3], systemic chemotherapy has been established as the standard treatment for unresectable or metastatic GC. Fluoropyrimidine plus platinum-based combination chemotherapy is currently the most commonly used first-line regimen. Although molecularly targeted agents such as trastuzumab and ramucirumab have been integrated into the standard care for GC [4,5,6], survival advantage with the new agents is still minimal; cytotoxic chemotherapy is so far the mainstay of treatment in most patients, achieving a median overall survival of about 1 year [7].

However, the outcome of metastatic GC varies significantly among individuals treated with first-line chemotherapy, and evaluating an individual’s prognosis is important for clinical care, patient stratification, and analysis in clinical trials. Previous studies have reported prognostic models that predict overall survival by categorizing patients into risk groups based on the number of poor prognostic clinicopathological factors [8,9,10]. However, no currently available tool integrates multiple putative prognostic factors into a single numerical estimate of survival for individual patients with unresectable or metastatic GC.

Nomograms have been used to estimate individual survival probability, showing relatively improved predictive accuracy compared to prognostic grouping or a scoring system [11, 12]. They have been developed to predict the prognosis of various malignancies, including sarcoma, prostate, colon, and breast cancer [13]. Nomograms have also been constructed and validated for predicting survival of individuals with localized GC after curative resection [14, 15]. However, thus far no nomograms for predicting survival have been reported for patients with metastatic or unresectable GC.

On the other hand, sensitivity to treatment may be an important prognostic factor, although responses to cytotoxic chemotherapy cannot be reliably predicted before treatment. Therefore, incorporating early tumor response to chemotherapy into a prognostic model could enable provision of more personalized information to both clinicians and patients after initiation of treatment.

The objective of the current study was to develop and validate a prognostic nomogram (the Baseline Nomogram) using clinicopathological variables to predict the 1-year survival of patients with unresectable or metastatic GC receiving first-line chemotherapy. We also aimed to construct a second prognostic nomogram (the ChemoResponse-based Nomogram) incorporating the initial response to chemotherapy as well as baseline clinicopathological variables.

Materials and methods

Study patients

Patients with metastatic or unresectable GC who had received at least one cycle of first-line chemotherapy from April 2001 to December 2006 were identified from a computerized patient database of the National Cancer Center, Goyang, South Korea (n = 1327). The process for selecting patients for model development is presented in Fig. 1. Patients were eligible for this study if they had pathologically proven adenocarcinoma and at least one measurable or evaluable lesion on computed tomography (CT) before starting chemotherapy. Eligible chemotherapy regimens were restricted to modern doublet combination regimens containing at least two of the following: fluoropyrimidine, platinum, irinotecan, and taxane. Patients for whom one or more clinicopathological variables of interest (age, sex, histological differentiation, performance status, prior gastrectomy, metastatic sites, initial laboratory values, and chemotherapy regimen) were unavailable were excluded from the analysis. According to these criteria, 949 patients were selected for model development.

Fig. 1

Selection process of subjects for Cox proportional hazards model and construction of nomograms. Asterisk indicates clinicopathological variables of interest including age, sex, histological differentiation, performance status, prior gastrectomy, metastatic sites, initial laboratory values, or chemotherapy regimen

The following clinicopathological variables were collected: age, sex, performance status according to the Eastern Cooperative Oncology Group scale (ECOG PS), tumor location, endoscopic appearance (Borrmann type), histological differentiation, disease status, prior gastrectomy, metastatic sites, total number of organs harboring metastases, and measurability of lesions on baseline imaging studies according to the Response Evaluation Criteria in Solid Tumors (RECIST) version 1.0 [16]. The following baseline laboratory values (before starting first-line chemotherapy) were collected: white blood cell count (WBC), platelet count, hemoglobin (Hb), serum albumin, alkaline phosphatase (ALP), and serum total bilirubin concentrations. Additionally, chemotherapeutic agents given as a first-line regimen and receipt of second-line or later chemotherapy were identified.

Among these patients, initial tumor responses to first-line chemotherapy were known in 836 patients: they had been evaluated with CT scans within 9 weeks of starting chemotherapy or had stopped their treatment because of unequivocal symptomatic progression before undergoing imaging studies. The time point of 9 weeks from the start of chemotherapy was chosen because most chemotherapy regimens were given with a 3-week cycle and reassessment was done at the 9th week of treatment. Investigators retrospectively assessed the tumor responses of each patient within 9 weeks of commencing chemotherapy using the RECIST version 1.0 [16] and classified the patients as having progressive disease (PD) or non-progressive disease (non-PD). Patients who had stopped their treatment because of symptomatic progression before initial imaging studies were classified as having PD.

The protocol of this study was reviewed and approved by the Institutional Review Boards of National Cancer Center, and all information was obtained with appropriate Institutional Review Board waivers.

Development of prognostic model

Overall survival was defined as the interval between the date of starting first-line chemotherapy and the date of death from any cause. Overall survival according to each clinicopathological variable was estimated with the Kaplan–Meier method. The Cox proportional hazards model was employed to develop a multivariate model to predict 1-year survival of the patients. In the univariate analysis, crude and age-adjusted analyses were performed to identify potential risk factors. After potential risk factors were selected, we performed multivariate analyses with three selection procedures (forward, backward, and stepwise) to select the best-fit model. A statistical significance level of 0.20 was used to retain variables in the model. After comparing the models from each procedure, the model obtained by backward selection with p < 0.2 was chosen as the final model.
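
As an illustration of the Kaplan–Meier step described above, the following is a minimal sketch in plain Python. It is not the authors' code (the analyses were run in STATA and R); the function names and the toy follow-up data are hypothetical and serve only to show how a 1-year survival estimate is obtained from times and death/censoring indicators.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  : follow-up in months from the start of first-line chemotherapy
    events : 1 = death from any cause, 0 = censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends exactly at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Patients censored at t are still counted as at risk at t.
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def survival_at(curve: List[Tuple[float, float]], t: float) -> float:
    """Step-function lookup: survival probability at time t."""
    s = 1.0
    for time, surv in curve:
        if time <= t:
            s = surv
        else:
            break
    return s


# Hypothetical toy data, NOT taken from the study cohort.
toy_times = [3.0, 5.0, 5.0, 8.0, 12.0, 14.0, 20.0, 24.0]
toy_events = [1, 1, 0, 1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
one_year = survival_at(toy_curve, 12.0)
print(f"Estimated 1-year survival in the toy data: {one_year:.2f}")
```

In the toy data the estimate at 12 months is the product of the conditional survival fractions at each death time up to and including month 12.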

Based on a multivariate model from the Cox proportional hazards model, a baseline nomogram was constructed to generate survival probability at 1 year after commencement of chemotherapy. Through the same process, but also including response to chemotherapy within the first 9 weeks, another model called the chemoresponse-based nomogram was developed.
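
The way a Cox model is turned into nomogram points and a 1-year survival probability can be sketched as follows. This is a simplified illustration under stated assumptions, not the published nomogram: the coefficients in `BETAS` and the baseline survival `S0_1YEAR` are hypothetical placeholders (they are not the values in Table 2), all variables are coded as binary indicators with 1 meaning the adverse category, and the largest effect is scaled to 100 points.

```python
import math

# Hypothetical log hazard ratios, NOT the coefficients reported in Table 2.
BETAS = {
    "ecog_ps_2": math.log(1.8),
    "peritoneal_mets": math.log(1.3),
    "low_albumin": math.log(1.5),
}

# Hypothetical 1-year survival for a patient with no adverse factor.
S0_1YEAR = 0.55


def points_table(betas):
    """Scale each coefficient so that the largest effect is worth 100 points."""
    largest = max(abs(b) for b in betas.values())
    return {name: 100.0 * b / largest for name, b in betas.items()}


def total_points(patient, table):
    """Sum the points of the adverse factors present (indicator = 1)."""
    return sum(table[name] * patient.get(name, 0) for name in table)


def one_year_survival(patient, betas, s0):
    """Cox prediction: S(1 year | x) = S0(1 year) ** exp(linear predictor)."""
    lp = sum(betas[name] * patient.get(name, 0) for name in betas)
    return s0 ** math.exp(lp)


table = points_table(BETAS)
example_patient = {"ecog_ps_2": 1, "low_albumin": 1}  # hypothetical patient
print("total points:", round(total_points(example_patient, table), 1))
print("1-year survival:", round(one_year_survival(example_patient, BETAS, S0_1YEAR), 3))
```

Because total points are proportional to the linear predictor in this sketch, a higher point total always maps to a lower predicted 1-year survival, which is the relationship read off the "total points" axis of a nomogram.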

Model validation

The model was validated for discrimination and calibration by calculating the predicted survival probability of each patient in the whole dataset according to the model and comparing it with the actual survival of the patient. Discrimination is defined as a model’s ability to correctly distinguish non-events and events, which can be quantified by calculating the C statistic developed for the survival model. The C statistic, a concordance measure analogous to the area under the receiver operating characteristic curve, indicates the probability that a model produces higher risks for those who develop events than for those who do not develop events.
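
To make the concordance idea concrete, the sketch below computes a Harrell-style C statistic for censored survival data in plain Python. The text does not state which survival C estimator was used, so this particular form is an assumption, and the function name and toy values are hypothetical.

```python
def harrell_c(times, events, risks):
    """Harrell-style concordance index for right-censored data.

    times  : follow-up times
    events : 1 = death observed, 0 = censored
    risks  : model-predicted risk (higher = worse expected survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable only if patient i died before patient j's
            # follow-up ended, so we know i truly had the shorter survival.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data, NOT study data.
toy_times = [2.0, 4.0, 6.0, 8.0]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.4, 0.5, 0.1]
c_toy = harrell_c(toy_times, toy_events, toy_risks)
print(f"C statistic in the toy data: {c_toy:.2f}")
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking of patients by risk.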

Calibration measures how closely the predicted probabilities agree numerically with the actual outcomes. A Hosmer–Lemeshow (H–L) type χ² statistic was used. This χ² statistic was calculated by first dividing the data into ten groups (deciles) by the predicted probabilities produced by the model in ascending order. Then, for each decile, the average predicted probabilities were compared with the actual event rate estimated using the Kaplan–Meier approach.
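
The decile-based comparison described above can be sketched as follows. The exact formula is not given in the text, so the statistic below, which sums n(observed − predicted)² / [predicted(1 − predicted)] over groups, is an assumed H–L type form; the helper names and toy numbers are hypothetical, and the observed event rates are taken as already estimated (for example, by the Kaplan–Meier method within each group).

```python
def split_into_groups(predicted, n_groups=10):
    """Return lists of patient indices, ordered by ascending predicted probability."""
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    size, extra = divmod(len(order), n_groups)
    groups = []
    start = 0
    for g in range(n_groups):
        # The first `extra` groups take one additional patient.
        end = start + size + (1 if g < extra else 0)
        groups.append(order[start:end])
        start = end
    return groups


def hl_type_chi2(sizes, mean_predicted, observed):
    """Assumed H-L type statistic over risk groups.

    sizes          : number of patients in each group
    mean_predicted : average predicted event probability in each group
    observed       : observed event rate in each group (e.g., 1 - KM survival)
    """
    chi2 = 0.0
    for n, p, o in zip(sizes, mean_predicted, observed):
        chi2 += n * (o - p) ** 2 / (p * (1.0 - p))
    return chi2


# Hypothetical two-group toy example, NOT study data.
toy_groups = split_into_groups([0.3, 0.1, 0.2, 0.4], n_groups=2)
toy_chi2 = hl_type_chi2([100, 100], [0.5, 0.8], [0.55, 0.8])
print(toy_groups, round(toy_chi2, 3))
```

A small statistic (and a correspondingly large p value) indicates that predicted and observed event rates agree closely across the risk groups.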

The C statistics of the baseline and chemoresponse-based nomograms were compared with a Z statistic. Statistical analysis was performed with STATA version 12.0 (StataCorp, College Station, TX, USA) and R software version 2.13.2 (http://www.r-project.org).

Results

Patient characteristics

Baseline characteristics of the patients are summarized in Table 1. As of December 2010, 902 (95.0%) patients had died of GC and 28 patients (3.0%) of another medical condition (e.g., toxicity of chemotherapy or other illness) or unknown cause. The median survival was 9.6 months [95% confidence interval (CI), 9.0–10.2] and the survival rate at 1 year was 39.4% (95% CI, 36.3–42.5). In the chemoresponse-based nomogram cohort, the initial disease control rate (proportion of non-PD at the first evaluation performed within 9 weeks) was 70.2%.

Table 1 Baseline characteristics of study patients

Prognostic nomogram

Twenty-two clinicopathological variables were analyzed for association with overall survival. We excluded chemotherapy regimen from these analyses because it could be an inappropriate input for a prognostic model. The standard fluoropyrimidine-platinum (FP) regimen was associated with inferior survival compared with non-FP regimens (taxane- or irinotecan-containing doublets) in univariate analysis (HR, 0.79, 95% CI, 0.69–0.91, p = 0.001) as well as multivariate analysis (HR, 0.87, 95% CI, 0.76–1.00, p = 0.058). However, non-FP regimens were mostly given as study treatment in clinical trials, which suggests that selection bias, rather than efficacy of the regimen, might have caused the difference in survival.

Of the initial 22 variables, 6 variables (sex, location of tumor, endoscopic appearance of tumor, measurability of diseased sites, intraabdominal and extraabdominal lymph node metastases) were excluded from the Cox proportional hazards model because they showed no or weak association with survival in univariate analyses (p value > 0.2). An additional 3 variables (disease status, number of organs harboring metastases, hemoglobin) were excluded during variable selection because their influence on survival was attenuated by multicollinearity with other variables in the model; thus, the 13 variables shown in Table 2 were eventually included in the baseline nomogram. Additionally, a chemoresponse-based nomogram was developed by including the initial response to chemotherapy and the 10 other variables shown in Table 2. One-year survival probability can be estimated with these nomograms, as described in the figure legends (Figs. 2, 3).

Table 2 Selected variables included in nomogram according to Cox proportional hazards model
Fig. 2

Baseline Nomogram (baseline nomogram) was constructed from 13 clinicopathological parameters (Table 2). To calculate a patient’s 1-year survival probability, points for each parameter are assigned by corresponding values from the “points” axis, and the sum of the points is plotted on the “total points” axis. The patient’s 1-year survival probability is the value at a vertical line from the corresponding total points. ECOG Eastern Cooperative Oncology Group, WBC white blood cell, ALP alkaline phosphatase

Fig. 3

ChemoResponse-based Nomogram (chemoresponse-based nomogram) includes early response to chemotherapy (progressive disease vs. nonprogressive disease at the first evaluation within 9 weeks) and ten baseline parameters (see Table 2). ECOG Eastern Cooperative Oncology Group, WBC white blood cell, PD progressive disease

Validation and performance of the nomogram

The C statistics for the baseline nomogram and the chemoresponse-based nomogram were 0.656 (95% CI, 0.628–0.673) and 0.718 (95% CI, 0.694–0.741), respectively. The H–L type χ² statistics were 4.74 (p = 0.856) for the baseline and 7.96 (p = 0.548) for the chemoresponse-based nomogram, indicating good fit for both models (Fig. 4).

Fig. 4

Calibration plot of actual risk probability with 95% confidence interval by decile (y-axis), over predicted risk probability (x-axis) by baseline-nomogram (a) and chemoresponse-based nomogram (b). Dashed line corresponds to 10% margin of error

The difference in discriminating performance between the two models was significant (Z statistic = 3.74, p < 0.01), suggesting that the chemoresponse-based nomogram predicts 1-year survival more accurately.
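
As a rough arithmetic check of this comparison, the sketch below recomputes a Z statistic from the reported C statistics and their 95% confidence intervals. It rests on two simplifying assumptions that are not stated in the text: the standard errors are back-calculated from the interval widths (the baseline interval is not symmetric around 0.656, so this is approximate), and the two estimates are treated as independent even though they come from overlapping patients. The function names are hypothetical.

```python
import math


def se_from_ci(lower, upper, z_crit=1.96):
    """Approximate standard error from the width of a 95% confidence interval."""
    return (upper - lower) / (2.0 * z_crit)


def z_for_difference(c1, se1, c2, se2):
    """Z statistic for c2 - c1, assuming independent estimates (a simplification)."""
    return (c2 - c1) / math.sqrt(se1 ** 2 + se2 ** 2)


# Values reported for the baseline and chemoresponse-based nomograms.
se_base = se_from_ci(0.628, 0.673)
se_chemo = se_from_ci(0.694, 0.741)
z = z_for_difference(0.656, se_base, 0.718, se_chemo)

# Two-sided p value from the standard normal distribution.
p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
print(f"Z = {z:.2f}, two-sided p = {p_two_sided:.5f}")
```

Under these assumptions the result falls close to the reported Z statistic; a method that accounts for the correlation between the two models on the same patients could give a different value.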

Survival estimates by predicted probability

Survival estimates according to 1-year survival probability from the baseline and the chemoresponse-based nomogram are shown in Table 3 and Supplemental Figure S1. Using each nomogram and the survival estimate table, a patient can be assigned to one of the groups sorted by 1-year survival probability. For example, one of the study patients was a 54-year-old man with poorly differentiated adenocarcinoma of the stomach, peritoneal seeding, performance status of ECOG 1, elevated ALP (161 IU/l), and no history of gastrectomy, but no other unfavorable laboratory findings (WBC 7920/mm³, serum albumin 4.1 g/dl, serum bilirubin 0.4 mg/dl). His total score in the baseline nomogram was 145, which indicates a 1-year survival probability of 0.36. He was placed in the group with 1-year survival probability ≥0.3 and <0.4, which gives a median survival estimate of 10.2 months (95% CI, 8.9–12.1; Table 3; Supplemental Figure S1). When he showed nonprogressive disease at the first evaluation of response during treatment, his total score and 1-year survival probability by the chemoresponse-based nomogram were 79 and 0.55, respectively. He could then be assigned to a group with a median survival of 13.6 months (95% CI, 11.6–15.3; Table 3; Supplemental Figure S1). His actual survival was 15.2 months, which was better predicted by the chemoresponse-based nomogram than by the baseline nomogram.

Table 3 Survival estimates of subjects by 1-year probability generated by nomogram

Discussion

In this study, we constructed two nomogram-based prognostic models for patients with unresectable or metastatic GC who were treated with first-line doublet combination chemotherapy and performed internal validation for each model. Although survival of advanced gastric cancer has gradually increased, the median survival remains around 1 year with palliative combination cytotoxic chemotherapy [7], and untreated patients usually survive less than 6 months [3]. If physicians could estimate whether an individual patient is likely to survive longer or shorter than the median survival of patients who receive chemotherapy, this would be very useful in discussing prognosis with patients and in stratifying patients in clinical trials.

In this context, we constructed nomograms to predict 1-year survival. The baseline nomogram using clinicopathological variables that were readily available before treatment showed favorable performance in discriminating 1-year survivors from those who died within a year. This rating could give physicians and patients a general scope of prognosis at the time of diagnosis. The chemoresponse-based nomogram incorporating initial response to chemotherapy within 9 weeks more accurately predicted a patient’s probability of 1-year survival, which could help decide on whether to continue palliative chemotherapy. Given that prognostication is a dynamic process over the course of the disease, because a patient’s prognosis may change based on treatment response, each of our nomograms could be used to refine survival estimates and guide a decision on important inflection points during the continuum of care for patients with advanced GC.

Our study used a nomogram-based method to predict survival probability of patients with unresectable or metastatic GC rather than dividing patients into several prognostic groups, as has been done in other studies [8,9,10]. For model construction, we used prognostic variables shown in previous studies to be associated with survival of patients with metastatic or locally advanced GC: performance status [8,9,10, 17]; metastases in liver [10], lung [8], peritoneum [8, 10, 17], or bone [8, 9, 17]; multiple metastatic sites [17]; no previous gastrectomy [8, 9]; low serum albumin [8, 9] or high serum bilirubin [17]; and high ALP concentrations [8,9,10]. In addition, our study showed poorly differentiated histology had a significant impact on survival, which had been also shown in a large-scale retrospective study [18]. Host inflammatory response to tumor has been suggested as an important prognostic marker in various types of cancer, including GC [19, 20]; these findings support the relevance of leukocytosis and thrombocytosis in the model of the present study. Although many of these prognostic variables overlapped between studies, the present study included more variables than any other single study and had improved power to identify prognostic value because of the large sample size.

However, prediction based only on pretreatment baseline clinicopathological variables seemed to be insufficient to construct a good-fit model. Most of the selected categorical variables had modest hazard ratios (1.00–2.00); that is, no single adverse clinicopathological value increased the hazard of death by more than 100%. For this reason, our baseline nomogram showed a modest discrimination performance (C statistic = 0.656), although it was better than the value suggested by other prognostic models [8]. Recently, molecular biomarkers such as human epidermal growth factor receptor 2 (HER2), epidermal growth factor receptor (EGFR), fibroblast growth factor receptor (FGFR), and c-MET have been identified and used to develop molecularly targeted agents; however, their prognostic or predictive functions are limited in the setting of cytotoxic chemotherapy for metastatic GC [21,22,23,24]. To date, HER2 is the only validated predictive biomarker for targeted therapy in GC. As more molecularly targeted therapies become available, molecular characteristics should be incorporated into a prognostic model.

Initial response to chemotherapy was more significantly associated with risk for death than any other clinicopathological variable (hazard ratio = 3.25 in multivariable analysis) and was thus the most powerful predictor of survival in our study. It is not yet possible to reliably predict chemosensitivity with any biomarkers or clinical characteristics before starting treatment for GC. However, initial response determined in the early phase of chemotherapy could be used to predict long-term clinical outcome, as shown in colon and lung cancer [25, 26]. In our study, the chemoresponse-based nomogram provided relatively improved discrimination compared to the baseline nomogram by enabling reevaluation of a patient’s probability of survival after completion of a few cycles of first-line chemotherapy, thus guiding subsequent treatment decisions. However, it should be noted that this model has several limitations: tumor responses were determined at varying time points (within 9 weeks from starting treatment) and only categorized into PD versus non-PD, because a considerable proportion (13%) of the study patients had non-measurable disease, of which the degree of tumor regression is not quantifiable. We thought this categorization of response was reasonable considering previous findings that disease control rate at approximately 8 weeks was a more powerful predictor of survival than was the tumor response rate in several advanced solid cancers such as non-small cell lung cancer and breast cancer [26, 27]. Furthermore, when we conducted another analysis based on three-step categorization of response (CR/PR, SD and incomplete response of non-measurable disease, and PD), the performance of the model was not better than the one based on dichotomization (data not shown).

This study has other limitations, including biases from its retrospective nature. First, the model was derived from an old dataset; the study patients were treated in the 2000s. However, front-line standard treatment for GC has not changed greatly in the interim. The standard of care remains cytotoxic chemotherapy, consisting of preferably two drugs among fluoropyrimidines, platinum, taxane, and irinotecan, for the majority of GC [28], except for HER2-positive disease, which constitutes less than 20% of the disease [24]. Second, our model could not consider the impact of subsequent lines of chemotherapy on survival. The survival benefit of second-line chemotherapy has been proven in advanced GC [29], and undergoing subsequent treatment was associated with better overall survival in our dataset (data not shown). However, this factor is undetermined at the beginning of first-line chemotherapy and was therefore not included in model construction. Nevertheless, the influence of second-line or later treatment on survival of GC is increasing; more patients are being exposed to multiple lines of treatment than previously, and anti-angiogenic agents such as ramucirumab and apatinib are being introduced as second- or third-line treatment [5, 6, 30]. These factors beyond first-line treatment may attenuate the clinical utility of the nomograms from our study. Last, the models were validated by simply applying them to the whole dataset in order to maximize use of the data, but this approach carries a risk of overfitting. Model performance could have been overestimated, so external validation is essential to prove the clinical utility of the models.

In conclusion, we used baseline clinicopathological variables to develop and validate nomograms for prediction of individual survival in patients undergoing standard cytotoxic chemotherapy for unresectable or metastatic GC. These nomograms provide comprehensive prognostic information to discuss with patients and to optimize stratification in randomized clinical trials. The ChemoResponse-based Nomogram appears to better predict outcomes than the Baseline Nomogram by incorporating response to chemotherapy in the early phase of treatment. Ongoing chemotherapy and palliative care could be planned on the basis of the detailed information with this model. Further studies are needed for additional external validation and refining of these nomograms for diverse clinical settings.