Background

Gastric cancer (GC) is the fifth most common and fourth deadliest cancer type, with over 1 million diagnoses and 760,000 deaths in 2020 alone [1]. While GC patient 5-year survival rates have been slowly rising in China, they remain under 35.1% for all GC patients and under 10% for those with advanced disease [2, 3]. Surgical tumor resection remains the primary treatment for advanced GC [4, 5], and the accurate assessment and staging of GC patients are valuable for guiding clinical decision-making.

The Union for International Cancer Control/American Joint Committee on Cancer (UICC/AJCC) staging system is an internationally accepted set of criteria that is widely used in clinical practice [6, 7]. Under these guidelines, the N stage assesses the degree of lymph node metastasis based on the number of metastatic lymph nodes (mLNs), yet it fails to take into account the number of examined lymph nodes (eLNs). Patients with ≥ 7 mLNs in resected samples are diagnosed with pN3 stage disease. In Chinese, Japanese, Korean, and Western GC patient cohorts, these patients account for 39.2–50%, 36.2%, 40.2%, and 42.1% of all patients with mLNs, respectively [8, 9, 10, 11, 12]. Many studies have focused on subtyping patients with pN3b stage disease based upon the number of identified mLNs [13, 14]. While National Comprehensive Cancer Network (NCCN) guidelines recommend that a minimum of 15 lymph nodes be examined in GC patients to reduce stage migration [15], this number is not sufficient for pN3 stage patients (≥ 7 mLNs), particularly for pN3b stage patients with ≥ 16 mLNs. Multiple prior analyses have revealed that for patients with certain TNM stages of disease, the optimal number of eLNs associated with improved patient prognosis may be 23 or higher [9, 11, 16, 17]. There is thus a clear need to refine the definitions of pN3 patient subclassifications in order to further optimize the AJCC-TNM staging system.

As such, in light of the AJCC-TNM staging system, related guidelines, and other research results, this study was formulated to further subclassify pN3 stage GC patients, who make up a large proportion of GC patients, based upon numbers of mLNs and eLNs using the Surveillance, Epidemiology, and End Results (SEER) database.

Methods

Study population

The SEER program compiles authoritative cancer incidence and survival data pertaining to roughly 28% of the US population [18, 19]. The SEER*Stat software (v 8.3.6) was employed to screen the data for the present study. Patients in the SEER database who had undergone gastrectomy and were diagnosed with gastric adenocarcinoma between 2000 and 2016 were identified. Patients with > 6 mLNs (pN3) and without distant metastases (pM0) as per the 8th edition AJCC Cancer Staging Manual were selected for further analysis. Patients were excluded from this study if they: (1) exhibited tumors at the cardia; (2) were < 18 or > 90 years old; (3) lacked clear follow-up or clinical data; (4) survived for < 1 month; or (5) had < 16 eLNs. Based upon these criteria, 2894 patients were eligible for inclusion in this study (Fig. 1).

Fig. 1

Patient screening process for the current study from the SEER database

Clinicopathological characteristics extracted from the SEER database included age at diagnosis, race, sex, tumor grade, tumor primary site, tumor size, tumor depth of invasion, number of eLNs, number of mLNs, adjuvant therapy, and patient outcomes as of most recent follow-up (Nov. 2018). TNM staging was defined according to the 8th edition of the AJCC Staging Manual.

Propensity score-matching (PSM) analysis

As the number of eLNs was not randomly assigned among the patients recorded in the SEER database, subgroup analyses would be subject to intrinsic bias. To minimize the potential impact of such selection bias and associated confounding variables when separating patients into groups, a PSM analysis was conducted [20, 21]. One-to-one nearest-neighbor matching without replacement was performed on the logit of the propensity score within a caliper of 0.01, with this score having been derived from sex, age, grade, primary site, tumor size, T stage, N stage, and adjuvant therapy type.
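The matching step described above can be illustrated with a minimal Python sketch. It assumes that propensity scores have already been estimated (for example, by logistic regression on the listed covariates), treats the caliper of 0.01 as an absolute difference on the logit scale, and matches greedily in input order. The function names and patient identifiers are hypothetical, and the matching software and settings actually used in the study may differ.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def match_one_to_one(treated_ps, control_ps, caliper=0.01):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    treated_ps / control_ps: dicts mapping a patient id to a propensity score.
    Returns (treated_id, control_id) pairs whose logit propensity scores
    differ by at most `caliper` (assumed here to be an absolute difference).
    """
    available = {cid: logit(p) for cid, p in control_ps.items()}
    pairs = []
    for tid, p in treated_ps.items():
        if not available:
            break
        t_logit = logit(p)
        # Closest control that has not been matched yet.
        cid = min(available, key=lambda c: abs(available[c] - t_logit))
        if abs(available[cid] - t_logit) <= caliper:
            pairs.append((tid, cid))
            del available[cid]  # matching without replacement
    return pairs


if __name__ == "__main__":
    # Hypothetical scores: "treated" = > 31 eLNs, "control" = <= 31 eLNs.
    high = {"t1": 0.50, "t2": 0.80}
    low = {"c1": 0.501, "c2": 0.70, "c3": 0.799}
    print(match_one_to_one(high, low))
```

Patients without a control inside the caliper are left unmatched, which is why a matched cohort is typically smaller than the original one.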

Statistical analysis

Categorical variables are given as counts and proportions, and were analyzed using Pearson’s Chi-squared tests or Fisher’s exact test. Overall survival (OS) was defined as the time between tumor resection and death, and served as the key prognostic endpoint. OS values in different patient groups were compared using Kaplan–Meier curves and log-rank tests, with follow-up being quantified via the reverse Kaplan–Meier method [22]. The X-tile software (https://medicine.yale.edu/lab/rimm/research/software.aspx) was utilized to select the optimal eLN cutoff value for the reliable classification of pN3 patients so as to maximize prognostic accuracy [23]. Cox proportional hazards regression models were employed to establish hazard ratios (HRs) for prognostic variables of interest, with Cox proportional hazards regression analyses with a restricted cubic spline model being conducted to examine relationships between continuous variables and HRs [24]. Time-dependent receiver operating characteristic (ROC) curves were generated, and the area under the curve (AUC) was measured to gauge the accuracy of a given classification. To measure clinical utility, a decision curve analysis (DCA) was conducted by measuring the net benefits across a range of threshold probabilities. Likelihood ratio χ2 tests were used to assess homogeneity within a given classification, while the linear trend χ2 test was used to assess discriminatory ability and gradient monotonicity (i.e., whether patients with favorable clinical features exhibited prolonged survival relative to those with unfavorable features). The discriminatory ability of each classification was evaluated using Akaike information criterion (AIC) and Bayesian information criterion (BIC) values, with smaller AIC and BIC values being indicative of better prognostic utility [25].
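As an illustration of the Kaplan–Meier estimator underlying the OS comparisons described above, the following is a minimal pure-Python sketch. The function names and the toy data are hypothetical and are not taken from the study, which used R and SPSS.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times:  follow-up time per patient (e.g. months since resection)
    events: 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share this time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= leaving
    return curve


def survival_at(curve, t):
    """Step-function lookup: the last estimate at or before time t."""
    s = 1.0
    for time, surv in curve:
        if time > t:
            break
        s = surv
    return s


if __name__ == "__main__":
    # Toy data (months, death indicator); purely illustrative.
    months = [6, 12, 12, 30, 60, 72]
    died = [1, 1, 0, 1, 0, 1]
    curve = kaplan_meier(months, died)
    print(survival_at(curve, 60))  # estimated 5-year OS for the toy data
```

Calling the same function with the event indicator flipped (censoring treated as the event) gives the reverse Kaplan–Meier curve, from which median follow-up can be read off.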

R (v 3.6.0; R Foundation for Statistical Computing, Vienna, Austria) and SPSS (v 23.0; SPSS Inc, IL, USA) were used to conduct all statistical analyses, with P < 0.05 as the significance threshold.
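The net benefit used in decision curve analysis can be sketched in Python as follows. This is an illustration only and not the R or SPSS code used in the study: it assumes a binary outcome with no censoring (for example, death within 5 years among patients with complete follow-up), whereas a survival-based DCA would need to account for censored patients. The function names and toy values are hypothetical.

```python
def net_benefit(risks, outcomes, threshold):
    """Net benefit of flagging patients whose predicted risk >= threshold.

    risks:     predicted probability of the event for each patient
    outcomes:  1 if the event occurred, 0 otherwise
    threshold: threshold probability, strictly between 0 and 1
    """
    n = len(risks)
    tp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 1)
    fp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(outcomes, threshold):
    """Reference strategy in which every patient is flagged as high risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


if __name__ == "__main__":
    risks = [0.9, 0.8, 0.3, 0.2]  # toy predicted risks
    outcomes = [1, 0, 1, 0]       # toy observed outcomes
    for pt in (0.25, 0.5):
        print(pt, net_benefit(risks, outcomes, pt), net_benefit_treat_all(outcomes, pt))
```

A decision curve plots this quantity over a range of threshold probabilities for each staging system, alongside the "treat all" and "treat none" (net benefit of zero) reference strategies.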

Results

GC patient clinicopathological characteristics

The clinicopathological characteristics of the 2894 patients with pN3 stage GC identified in the SEER database who were eligible for inclusion in the present study are compiled in Table 1. The median age of these patients at the time of diagnosis was 67 years, and a majority of these patients were male (1661, 57.4%), with more than one-third having been diagnosed with GC affecting the lower third of the stomach (1081, 37.4%). The median tumor size for this patient cohort was 6 cm, with 1383 (47.8%) patients exhibiting a tumor > 6 cm in size. Approximately 60 percent of patients were diagnosed with pN3a stage disease (1763, 60.9%). The mean numbers of eLNs and mLNs in these patients were 29.05 ± 13.44 and 15.65 ± 8.48, respectively. Approximately two-thirds of patients (1940, 67.0%) underwent postoperative adjuvant therapy. The median follow-up time for these patients was 93 months (range: 0 – 203), and their 5-year OS was 20.10%.

Table 1 Basic clinicopathological characteristics of the 2894 patients with pN3 stage GC

Assessment of the prognostic relevance of eLNs and mLNs

To examine the relationship between the number of mLNs or eLNs and GC patient mortality risk, we conducted Cox proportional hazards regression analyses using a restricted cubic spline model. HRs rose significantly as the number of mLNs increased (Fig. 2A), with higher numbers of mLNs being associated with an increased risk of death. In contrast, HRs declined rapidly as the number of eLNs increased (Fig. 2B), suggesting that higher numbers of eLNs are associated with a reduced risk of death.
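For readers unfamiliar with restricted cubic splines, the sketch below builds the spline basis for a single covariate value in Python, using the commonly used truncated-power parameterisation in which the fitted curve is linear beyond the outermost knots. The number and placement of knots used in the study are not reported here, so the knots in the example are hypothetical, as is the function name. In a Cox model, the returned columns would enter as covariates in place of the raw lymph node count.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis for one value x.

    Returns [x, s_1(x), ..., s_{k-2}(x)] for k knots. Each nonlinear term
    is zero below the first knot and linear above the last knot.
    """
    k = len(knots)
    if k < 3:
        raise ValueError("need at least three knots")
    t = sorted(knots)

    def cube_plus(u):
        # Truncated cubic: u^3 for positive u, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    scale = (t[-1] - t[0]) ** 2  # keeps the terms on the scale of x
    basis = [float(x)]
    for j in range(k - 2):
        term = (
            cube_plus(x - t[j])
            - cube_plus(x - t[k - 2]) * (t[k - 1] - t[j]) / (t[k - 1] - t[k - 2])
            + cube_plus(x - t[k - 1]) * (t[k - 2] - t[j]) / (t[k - 1] - t[k - 2])
        )
        basis.append(term / scale)
    return basis


if __name__ == "__main__":
    hypothetical_knots = [16, 25, 31, 45]  # eLN counts; illustrative only
    for elns in (10, 20, 31, 50):
        print(elns, rcs_basis(elns, hypothetical_knots))
```

The linear tails are what distinguish a restricted spline from an ordinary cubic spline; they limit erratic behaviour at extreme lymph node counts where few patients are observed.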

Fig. 2

The association between the number of mLNs (A) or eLNs (B) and HRs for pN3b patients, estimated using univariate Cox proportional hazards regression analyses with a restricted cubic spline model

Cut‑off value selection, PSM, and survival analyses

In light of the apparent relationship between eLNs and HRs in pN3 GC patients detected above, we next used the X-tile software to establish an optimal eLN cutoff value capable of maximizing prognostic accuracy when evaluating these patients. Prior to PSM analyses, the optimal number of eLNs for separating the 2894 patients into two categories was 31, while the best cutoff values for three categories were 20 and 31 (Fig. 3A). Survival analyses indicated that patients with ≤ 31 eLNs exhibited significantly worse survival outcomes relative to patients with > 31 eLNs (5-year OS: 18.4% vs. 24.7%, P < 0.001, Fig. 3B). Significant differences in survival outcomes were also observed among groups when patients were separated into three categories according to the cutoff values of 20 and 31 eLNs (5-year OS: 14.9% vs. 20.7% vs. 24.4%, P < 0.001, Fig. 3C). In order to facilitate clinical decision-making, we separated patients into two groups based upon an eLN cutoff of 31 (≤ 31 or > 31) and conducted a PSM analysis. Following this analysis, 857 pairs of pN3 stage GC patients with ≤ 31 or > 31 eLNs remained, thereby minimizing the potential impacts of confounding variables and selection bias on analytical results (Table 2, All P > 0.05 after matching). Even after such matching, patients with > 31 eLNs exhibited a 5-year OS that was almost 8 percentage points higher than that observed for patients with ≤ 31 eLNs (5-year OS: 24.4% vs. 16.6%, P < 0.001, Fig. 3D).
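The idea of choosing a cutoff that best separates survival can be sketched in Python by scanning candidate eLN values and keeping the one with the largest two-group log-rank χ2 statistic. This is a simplified stand-in and does not reproduce the X-tile software's own procedure; the function names, the `min_group` parameter, and the toy data are hypothetical. Because the cutoff is selected to maximize the statistic, the associated P value is optimistic unless it is corrected or validated separately.

```python
def logrank_chi2(times, events, in_group1):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    observed = expected = variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t, overall and in group 1.
        n = sum(1 for ti in times if ti >= t)
        n1 = sum(1 for ti, g in zip(times, in_group1) if ti >= t and g)
        # Deaths at time t, overall and in group 1.
        d = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        d1 = sum(
            1 for ti, e, g in zip(times, events, in_group1)
            if ti == t and e == 1 and g
        )
        observed += d1
        expected += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return (observed - expected) ** 2 / variance if variance > 0 else 0.0


def best_cutoff(elns, times, events, min_group=2):
    """Scan cutoffs c (groups: eLN <= c vs. eLN > c); return (c, chi2)."""
    best = (None, 0.0)
    for c in sorted(set(elns))[:-1]:
        low = [x <= c for x in elns]
        # Skip splits that leave either group too small.
        if min(sum(low), len(low) - sum(low)) < min_group:
            continue
        chi2 = logrank_chi2(times, events, low)
        if chi2 > best[1]:
            best = (c, chi2)
    return best


if __name__ == "__main__":
    # Toy data, purely illustrative: eLN count, months of follow-up, death flag.
    elns = [16, 18, 20, 22, 33, 35, 40, 45]
    months = [5, 6, 7, 8, 50, 55, 60, 65]
    died = [1, 1, 1, 1, 1, 1, 1, 1]
    print(best_cutoff(elns, months, died))
```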

Fig. 3

Determination of the optimal eLN cut-off values for pN3 patients using the X-tile software (A). Survival curves of the pN3 patients grouped by the optimal cut-off values of eLNs: B patients with ≤ 31 eLNs vs. patients with > 31 eLNs; C patients with ≤ 20 eLNs vs. patients with > 20 and ≤ 31 eLNs vs. patients with > 31 eLNs; D patients with ≤ 31 eLNs vs. patients with > 31 eLNs after PSM analysis

Table 2 Clinicopathological characteristics of patients grouped by the optimal cut-off value of eLNs before and after PSM analysis

Subgroup survival comparisons for patients with different numbers of eLNs

As shown in Fig. 4A, we found that the prognosis of pN3 patients with ≤ 31 or > 31 eLNs differed among different pT stages. For pT1 or pT2 patients, although lower HRs were evident for individuals with > 31 eLNs (HR = 0.691 and 0.819, respectively), there were no significant differences in survival when comparing individuals with ≤ 31 or > 31 eLNs (All P > 0.05). Conversely, patients with > 31 eLNs exhibited a significantly better prognosis than those with ≤ 31 eLNs for pT3/4a and pT4b stages (pT3/4a, HR = 0.740, P < 0.001; pT4b, HR = 0.614, P = 0.002). We next separated all pN3a and pN3b stage GC patients into four groups according to the number of eLNs: pN3a patients with ≤ 31 or > 31 eLNs, and pN3b patients with ≤ 31 or > 31 eLNs. As shown in Fig. 4B, these four groups exhibited significantly different prognoses (All P < 0.05), with pN3b patients with ≤ 31 eLNs having the worst prognosis (5-year OS: 7.3%). For patients with a given pN stage, those with > 31 eLNs exhibited better survival outcomes than those with ≤ 31 eLNs (pN3a stage, 5-year OS: 35.9% vs. 28.5%, P = 0.004; pN3b stage, 5-year OS: 14.6% vs. 7.3%, P < 0.001). We additionally conducted subgroup analyses of pN3 patients with different pT stages. In pT1 stage patients, no significant differences in survival outcomes were observed among groups, likely due to the small number of patients in this cohort (All P > 0.05, Fig. 4C). In pN3a or pN3b patients in the pT2 cohort, there were also no significant differences between patients with ≤ 31 or > 31 eLNs (Fig. 4D). However, pN3b patients with ≤ 31 eLNs exhibited a significantly worse prognosis than both pN3a patient groups (pN3a patients with ≤ 31 eLNs, P = 0.016; pN3a patients with > 31 eLNs, P = 0.009). In the cohort of patients with pT3/4a stage disease, there were significant differences in survival outcomes among these four groups (All P < 0.05, Fig. 4E). Patients with pT4b stage disease additionally exhibited significant differences in survival among these four groups (Fig. 4F), with pN3a patients with > 31 eLNs exhibiting a significantly better prognosis (5-year OS: 31.5%, P < 0.05), whereas there were no significant survival differences among the other three groups (pN3a patients with ≤ 31 eLNs, 5-year OS: 7.6%; pN3b patients with ≤ 31 eLNs, 5-year OS: 2.9%; pN3b patients with > 31 eLNs, 5-year OS: 5.3%).

Fig. 4

Subgroup survival analyses and forest plot of pN3 patients under different pT stages after PSM analysis: A forest plot of HRs and 95% CIs for OS of patients with ≤ 31 or > 31 eLNs; B survival curves of the whole matched pN3 cohort; C survival curves of pT1 stage cohort; D survival curves of pT2 stage cohort; E survival curves of pT3/4a stage cohort; F survival curves of pT4b stage cohort

Establishment and evaluation of a novel TNM staging system for pN3 stage GC patients

In light of our above subgroup analyses, we modified the AJCC-TNM staging system for pN3 GC patients and proposed a novel TNM (nTNM) staging system that takes the number of eLNs into account (Fig. 5A). In this nTNM staging system, pN3 patients were separated into six groups with distinct prognoses. For those patients with pT3 or higher stage disease, the classification system was expanded from the original two classifications to four under our nTNM staging system. Survival curves for the AJCC-TNM and nTNM staging systems are shown in Fig. 5B and C. While both systems were able to effectively classify pN3 patients according to their survival outcomes, our novel system was more precise as a classification tool. When the 3-year OS of pN3 patients was assessed, the AUC values for the AJCC-TNM and nTNM staging systems were 0.669 and 0.693, respectively (Fig. 6A), while for 5-year OS these values were 0.694 and 0.722, respectively (Fig. 6B). DCA curves also revealed that the nTNM staging system exhibited better clinical utility when used for prognostic analyses as compared to the AJCC TNM staging system (Fig. 6C). The homogeneity, discriminatory ability, and monotonicity of gradients were improved for this nTNM staging system, with higher linear trend χ2 and likelihood ratio χ2 values relative to those associated with AJCC-TNM staging (Table 3). Furthermore, the smaller AIC and BIC values associated with our novel system suggested that it may be an optimal tool for prognostic patient stratification.
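The information criteria used in the comparison above follow directly from a model's log-likelihood and its number of parameters, as in the short Python sketch below. The log-likelihoods and parameter counts in the example are hypothetical and are not the values reported in Table 3; for a Cox model the partial log-likelihood would be used, and whether the BIC sample size is taken as the number of patients or the number of events is a convention that varies between software packages.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


if __name__ == "__main__":
    n_patients = 1714  # 857 matched pairs, as reported in the text
    # Hypothetical log-likelihoods and parameter counts for two staging models.
    models = {
        "AJCC-TNM": {"loglik": -9000.0, "k": 4},
        "nTNM": {"loglik": -8985.0, "k": 5},
    }
    for name, m in models.items():
        print(name, aic(m["loglik"], m["k"]), bic(m["loglik"], m["k"], n_patients))
```

Both criteria penalize additional staging categories, so a lower value for a system with more groups indicates that the improvement in fit outweighs the added complexity.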

Fig. 5

The novel TNM staging system for pN3 stage GC patients, which takes the number of eLNs into account (A). Survival curves of pN3 patients under different staging systems: B AJCC-TNM staging system; C the novel TNM staging system

Fig. 6

ROC curves of pN3 patients under different staging systems for predicting OS: A 3-year OS; B 5-year OS. C The DCA curves of pN3 patients under AJCC-TNM and nTNM staging systems

Table 3 Comparison of the performance of the AJCC-TNM staging system and the novel TNM staging system

Discussion

Herein, we examined the prognostic relevance of different numbers of eLNs in 2894 pTxN3M0 GC patients in the SEER database who had undergone gastrectomy. Following PSM analyses aimed at controlling for selection bias and confounding variables, 857 patient pairs were retained for subsequent analyses, which indicated that pN3 GC patients with > 31 eLNs survived for longer than did individuals with fewer eLNs. Based on these results, we proposed an optimized version of the AJCC-TNM staging system for these patients, and found that this nTNM staging system predicted patient prognosis more reliably than the 8th edition AJCC-TNM staging system.

Currently, pN3 stage GC is pathologically diagnosed based upon the identification of 7 or more mLNs in postoperative tissue specimens, with a cutoff of 16 mLNs being used to further stratify these patients into those with pN3a and pN3b stage disease [6]. An estimated 15.7% of total GC patients in the world are diagnosed with pN3 stage disease, accounting for 38.1% of patients with pN+ disease [12]. Among patients free of distant metastases, all pN3 patients are classified as having stage III disease under the AJCC-TNM staging system, with the exception of those with pT1N3aM0 early-stage GC, which is classified as stage IIB [6, 7]. A single-center retrospective analysis conducted in China determined that among M0 stage patients with matched T stage disease, pN3 stage patients exhibited a worse prognosis than other patients, with a 5-year OS as low as 10.5% (pT4bN3bM0) or 7.1% (pT4aN3bM0), whereas the 5-year OS for M1 patients was 7.6% [8]. Similarly, pN3a GC has been linked to a poor prognosis in Western patient cohorts, with pN3a stage disease being associated with a 5-year OS of approximately 20%, falling to under 10% in those with pN3b stage disease [26]. While pN3 patients generally exhibit a poor prognosis, the 5-year OS of these patients varies significantly from 7.1 to 62.5% across different TNM stages [8].

AJCC-N staging according to the number of identified mLNs has been confirmed to be a key prognostic indicator in multiple multicenter retrospective analyses of Chinese, Western, and global populations [12, 26, 27]. Several studies have, to date, sought to further optimize such AJCC-N staging based upon the number of mLNs and/or eLNs [9, 11, 13, 14, 28, 29, 30, 31]. For example, one multicenter Chinese cohort study led to the proposal of a novel mLN number-based subclassification system for pN3b GC patients [15]. Specifically, pN3b patients with > 24 mLNs were found to exhibit a significantly lower 5-year OS relative to patients with 16–24 mLNs (13.5% vs. 16.4%, P = 0.048). Another single-center study of 222 GC patients proposed a cutoff value of 21 for further stratifying such pN3b patients [16]. With respect to the number of eLNs, one 10-year retrospective study found that pN2-N3 patients in whom at least 25 LNs had been evaluated exhibited a better prognosis than did other patients, with a roughly 10% improvement in 5-year OS among pN3 patients [11]. Zheng et al. [9] detected no significant differences in pN1 or pN2 patient prognosis as a function of the number of eLNs, whereas they found that the examination of > 22 LNs was linked to significantly prolonged survival following radical gastrectomy. Herein, we also determined that pN3 stage GC patients with > 31 eLNs exhibited a better prognosis than those with fewer eLNs both before and after PSM analysis (All P < 0.001). Given that pN3 patients make up a large fraction of total GC patients and have an inconsistent prognosis, we suggest that it is important to subclassify these patients not only based upon the number of mLNs, but also on the number of eLNs, leading us to propose a new staging system. Indeed, the LN ratio (LNR) and log-odds of metastatic lymph nodes (LODDs) staging systems that take both mLN and eLN numbers into account have been validated in multiple previous reports [28, 29, 30, 31]. While promising, these two prior systems are complicated to implement, making them impractical for use in clinical practice.

In addition, although several studies have suggested that more LNs should be examined in pN3 patients [9, 11] and the main subjects of the present study were pN3 GC patients with sufficient eLNs, it remains challenging to examine a sufficient number of LNs in certain clinical contexts, such as in patients after neoadjuvant or conversion therapy, or as a consequence of limited clinician experience in performing LN examinations. For these patients, we posit that eLN-based staging optimization should also be performed. One retrospective study combining data from multiple centers and the SEER database found that pN3a patients with < 16 eLNs exhibited a significantly poorer prognosis relative to patients with ≥ 16 eLNs, and should thus be classified as having pN3b stage disease [32]. In another single-center study focused on patients with stage III GC, researchers found that at stages IIIA, IIIB, and IIIC, the prognosis of patients with < 16 eLNs was significantly poorer than that of patients with ≥ 16 eLNs, suggesting substantial stage migration among stage III patients with insufficient eLNs [33]. In the present study, following PSM analyses conducted to control for potential confounding factors and selection bias, we confirmed that pN3 patients with > 31 or ≤ 31 eLNs still exhibited significantly different prognoses within the SEER cohort. However, due to sample size limitations we were unable to detect significant differences in survival outcomes as a function of the number of eLNs in pN3a or pN3b patients with pT1 or pT2 stage disease, although we did detect significant differences in pT3 and pT4 stage patient outcomes such that we were able to divide this population into four subgroups rather than the original two AJCC-TNM stages (Fig. 5A). We further confirmed that our novel staging system offered prognostic advantages over the AJCC-TNM staging system when evaluating pN3 patients.

Despite our promising results, this study is subject to certain limitations. For one, this was a retrospective analysis of the SEER database, which compiles data from many centers over an extended period of time, potentially introducing variability with respect to patient diagnosis and treatment strategies. In addition, our pT1/2 patient sample size was limited, reducing our statistical power when conducting prognostic analyses of these patients. Furthermore, while a PSM approach was employed to decrease the influence of bias on our study results, this approach cannot provide the level of evidence generated by a randomized controlled study. Future prospective randomized controlled studies and/or larger retrospective analyses are thus warranted to validate and expand upon our results. The AUC improvements observed for ROC curve analyses and the changes in AIC and BIC indexes associated with this novel TNM staging system were limited, potentially owing to sample size limitations and biases associated with different treatment regimens across centers in the SEER database. However, the primary significance of the present study was not only that we were able to further optimize the AJCC TNM staging system, but also that we were able to provide evidence suggesting that there were certain limitations associated with staging based solely on the number of mLNs for pN3 GC patients. More detailed staging systems based upon the numbers of both eLNs and mLNs may thus represent a valuable future direction for the precision medicine-based treatment of GC.

Conclusions

In summary, we were able to further subclassify patients with pN3 stage GC using an optimal eLN cutoff number of 31 identified using the SEER database. In the present study, patients in whom > 31 LNs were examined exhibited a significant survival benefit. Subgroup-based analyses of pT stages further revealed that there were significant differences in the prognostic outcomes of pN3a/b stage patients with > 31 eLNs relative to those of patients with ≤ 31 eLNs. In light of these analyses, we additionally proposed a novel TNM staging system capable of differentiating pN3 patients into six prognosis-related subgroups. Future external prospective studies will be essential to validate the utility of this new TNM staging approach.