The log odds of negative lymph nodes/T stage: a new prognostic and predictive tool for resected gastric cancer patients

Purpose When only the TNM classification is used to predict survival in gastric cancer (GC) patients, the impact of the degree of lymphadenectomy on the prognosis is neglected. This study aimed to establish a more effective nomogram based on the log odds of negative lymph nodes/T stage ratio (LONT) to predict survival in surgically treated GC patients.
Methods The data of resected GC patients were extracted from the Surveillance, Epidemiology, and End Results Program (SEER) database. Univariate and multivariate Cox regression analyses were used to identify the significant prognostic factors. The prognostic performance was assessed using a calibration plot, concordance index (C-index), and area under the (time-dependent receiver operating characteristic) curve (AUC) to compare the predicted survival probability based on the nomogram score groups.
Results The results showed LONT as an independent prognostic factor for cancer-specific survival (CSS) and overall survival (OS), independent of clinicopathological factors. After removing potential redundancy, only LONT, T stage, N stage, location and age were used in the final nomogram model. The model had a higher C-index (0.736 ± 0.012) and AUC (0.798) than the TNM staging system (0.685 ± 0.012 and 0.744). The nomogram score could predict a significant survival difference between any two adjacent groups in terms of CSS and OS.
Conclusion High LONT is associated with improved survival of gastric cancer patients, independent of other clinicopathological factors. The prognostic nomogram model based on LONT could effectively predict CSS and OS for resectable GC patients.
Supplementary Information The online version contains supplementary material available at 10.1007/s00432-021-03654-y.


Introduction
Gastric cancer (GC) is the third most common cause of cancer death and the fifth most common cancer globally, with over 1 million estimated new cases annually (Smyth et al. 2020). GC is therefore regarded as a major global health problem, especially in East Asian countries (GBD 2017 Stomach Cancer Collaborators 2020). The accuracy of survival prediction for GC patients is crucial for postoperative treatment and follow-up plans. To date, tumor-node-metastasis (TNM) staging based on the depth of tumor invasion and the number of regional positive lymph nodes is the most widely accepted system for risk stratification in GC (Fujitani et al. 2018). However, the clinical outcomes among GC patients with the same TNM stage might be completely different. This discrepancy might arise because the system is based only on the extent of disease and disregards the influence of the degree of lymph node dissection (LND) on survival.
Presently, the evaluation of the degree of LND mainly depends on the extent of lymph nodes removed at the time of gastrectomy (Japanese Gastric Cancer Association 2020). Although D2 LND describes the extent of lymphadenectomy with the goal of examining more than 16 lymph nodes (Degiuli et al. 2004;Schwarz and Smith 2007;Son et al. 2012), differences in the technical aspects of performing D2 LND may lead to different survival (Enzinger et al. 2007;Liang et al. 2015). Moreover, this model has inherent limitations; namely, it relies on the evaluation by the surgeon rather than the pathologist, and it lacks objective quantitative indicators. In recent years, observational studies have indicated that the counts of examined lymph nodes (ELNs) (Smith et al. 2005;Son et al. 2012) and negative LNs (NLNs) (Kattan et al. 2003;Martinez-Ramos et al. 2014;Wang et al. 2017) have independent prognostic value in GC and can reflect the degree of LND. However, they also have limitations due to the lack of information on individualized tumor characteristics.
Indeed, T stage is a robust risk factor in GC, which is based on the depth of tumor invasion and can represent the major tumor characteristics. A growing number of studies have shown that T stage is not only related to prognosis but also closely related to tumor biological characteristics (Mao et al. 2019;Sun et al. 2020;Wang et al. 2019). Both NLNs and T stage are important independent prognostic factors in GC, which represent the degree of LND and the severity of disease, respectively. However, whether the combination of NLNs and T stage can serve as a novel prognostic factor that reflects the degree of individualized LND in GC patients remains unclear. Thus, in our study, we first defined the log odds of negative lymph nodes/T stage (LONT) as log[(NLNs + 1)/T stage], which represents the NLNs adjusted by the T stage, to better reflect the degree of LND.
Here, we used a population-based cohort from the Surveillance, Epidemiology, and End Results (SEER) database and aimed to investigate the correlation between LONT and prognosis. Based on LONT, we also constructed a novel prognostic nomogram model to predict survival in surgically treated GC patients.
Information on surgery type, exact tumor size, location, regional nodes examined, regional nodes positive, tissue type, tumor differentiation, CS extension, survival status, and demographic characteristics was collected. The TNM status of each patient was re-evaluated according to the 8th edition of the American Joint Committee on Cancer (AJCC) Cancer Staging Manual based on CS extension and regional nodes positive in the SEER database. Patients with stage I to III GC were excluded if they did not undergo radical excision or lacked a detailed description of the surgery, were less than 18 years old, or had missing data for any of the selected variables.
The included patients were randomly divided into the training cohort and the validation cohort (7:3) for cross-validation. The cases excluded due to unknown tumor size, race, differentiation, or location were used to assess the robustness of the nomogram. The training cohort was used to develop the prognostic nomogram model, and the validation cohort and the missing data cohort were used to validate the model.
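The 7:3 random division described above can be illustrated with a minimal sketch. This is not the authors' code (the paper reports using R); the function name `split_cohort`, the fixed seed, and the use of integer patient IDs are assumptions made purely for illustration.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=0):
    """Randomly partition patient IDs into training and validation lists.

    Hypothetical helper: the seed and the rounding rule are illustrative
    choices, not details taken from the paper.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# With the 15,160 included patients, a 70% share gives the cohort sizes
# reported in the baseline characteristics.
train, valid = split_cohort(range(15160))
print(len(train), len(valid))
```

Every patient lands in exactly one cohort, so the validation cohort is never seen while the nomogram is being fitted.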

Statistical analysis and nomogram development
In our study, T1, T2, T3, T4a, and T4b were assigned 1, 2, 3, 4, and 5, respectively. LONT was analyzed as a continuous variable in the study and was defined as log[(NLNs + 1)/T stage], where NLNs is the ELNs minus the count of positive LNs. One is added to NLNs to avoid the occurrence of zero (Wen et al. 2017). Continuous and categorical variables are expressed as medians and totals (percentages), respectively. Cancer-specific survival (CSS) was the primary endpoint, and cases in which the cause of death was unclear or death was due to other causes were treated as censored observations. Overall survival (OS) was also analyzed.
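As a worked illustration of the LONT definition, the sketch below computes the value from a pathology report. It follows the reading given in the Discussion (the log of the ratio between the NLN count plus one and the T stage score). The natural logarithm is an assumption, since the text does not state the base, and the names `T_STAGE_SCORE` and `lont` are hypothetical.

```python
import math

# T stage scores as assigned in the study: T1..T4b -> 1..5.
T_STAGE_SCORE = {"T1": 1, "T2": 2, "T3": 3, "T4a": 4, "T4b": 5}


def lont(examined_nodes, positive_nodes, t_stage):
    """Return log((NLNs + 1) / T stage score).

    NLNs = examined nodes minus positive nodes. The natural log is an
    assumption; the paper does not specify the base of the logarithm.
    """
    if positive_nodes > examined_nodes:
        raise ValueError("positive nodes cannot exceed examined nodes")
    negative_nodes = examined_nodes - positive_nodes
    return math.log((negative_nodes + 1) / T_STAGE_SCORE[t_stage])


# A T1 tumor with 30 negative nodes scores higher than a T4b tumor with 5.
print(lont(30, 0, "T1"))
print(lont(10, 5, "T4b"))
```

Adding one to the NLN count keeps the logarithm defined when every examined node is positive, which is the role the text attributes to the correction from Wen et al. (2017).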
Univariate and multivariate Cox regression models were used to identify variables that were independently associated with CSS and OS. Stratified analysis of the LONT effect on CSS and OS based on different clinicopathological factors was also performed by Cox proportional hazards regression. A time-dependent receiver operating characteristic curve (ROC) and concordance index (C-index) were utilized to evaluate the discriminative ability of LONT and other factors and used to further exclude unnecessary variables. Then, according to the ROC and C-index results, we chose the simplest combination of variables and built the final nomogram.

Model validation
To validate our model, we used the following four criteria to assess the prediction performance in both validation cohorts (Wang et al. 2018). First, calibration curves were plotted to judge the consistency between predicted survival probability and actual survival proportion at 3, 5, and 10 years for CSS and OS, respectively. Second, the cases were grouped according to their total nomogram score, and Kaplan-Meier survival analysis was used to plot the survival curves and to compare the survival rate among the different groups by log-rank test. Third, the AUC of the time-dependent ROC curve was calculated at 5 years and compared with that of the model based on the 8th edition TNM staging system. Fourth, the C-index was used to evaluate the prediction accuracy of the model, with a larger C-index value indicating better accuracy.
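To make the fourth criterion concrete, the following is a minimal pure-Python sketch of Harrell's concordance index for right-censored data. It is a generic textbook formulation, not the authors' R implementation; the function name `harrell_c_index` and the toy data are hypothetical.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when subject i had the event and did so
    before subject j's observed time. The pair is concordant when the
    subject who failed earlier has the higher risk score; ties in risk
    count as half. Pairs with tied times are skipped in this simple
    version.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier failure
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: months of follow-up, event flag (1 = cancer death), risk score.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
print(harrell_c_index(times, events, [4, 3, 2, 1]))  # risk ordered with outcome
print(harrell_c_index(times, events, [1, 1, 1, 1]))  # uninformative score
```

A value of 0.5 corresponds to a score that carries no ranking information and 1.0 to perfect ranking, which is why a larger C-index is read as better discrimination.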
Nomogram development and validation were performed using RStudio software (version 3.6.3). Other analyses were performed using SPSS (version 22.0). A two-tailed P value less than 0.05 was considered statistically significant.

Baseline characteristics
The data of 55,536 patients pathologically diagnosed with primary GC from 2004 to 2017 were obtained from the SEER database. Among these patients, 32,653 did not undergo surgical resection, or details about their surgery were lacking. Information regarding the exact tumor size was not available for 3370 patients. In addition, 1026 patients were excluded because no LNs were removed or information regarding ELNs or PLNs was incomplete. Of the remaining 18,487 patients, 2342 did not have stage I to III disease (2155) or had an unknown T stage (187), 573 had an unknown grade, 57 were of an unknown race, and 3 were less than 18 years old. Finally, 352 lacked information regarding survival time. Thus, a total of 15,160 patients were ultimately included and randomly divided into two cohorts (Fig. 1): the training cohort (n = 10,612, 70%) and the validation cohort (n = 4548, 30%). In addition, 2799 patients with missing information on tumor size, race, differentiation, or location constituted the missing data cohort (Table S1). The latest follow-up date was in November 2019. The median follow-up time was 30 months (range 1-167 months) in the training cohort and 29 months (range 1-167 months) in the validation cohort. The clinicopathological characteristics between the two cohorts were similar, but the proportion of patients with stage III disease in the validation cohort was significantly higher than that in the training cohort (P = 0.013). Detailed information about the clinicopathological features is shown in Table 1.
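The patient-selection arithmetic above can be checked with a short sketch. All counts are taken from the text; the dictionary labels and variable names are illustrative only.

```python
initial_patients = 55536

# Exclusions applied before the "remaining 18,487 patients" step.
first_round = {
    "no surgical resection or surgery details lacking": 32653,
    "exact tumor size unavailable": 3370,
    "no LNs removed or incomplete ELN/PLN information": 1026,
}

# Exclusions applied to the remaining patients.
second_round = {
    "not stage I to III": 2155,
    "unknown T stage": 187,
    "unknown grade": 573,
    "unknown race": 57,
    "younger than 18 years": 3,
    "missing survival time": 352,
}

after_first_round = initial_patients - sum(first_round.values())
final_cohort = after_first_round - sum(second_round.values())
training, validation = 10612, 4548

print(after_first_round)                      # 18487
print(final_cohort)                           # 15160
print(training + validation == final_cohort)  # True
print(round(training / final_cohort, 2))      # 0.7
```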

The prognostic impact of LONT on CSS and OS
In univariate analysis, all the included variables were significantly correlated with CSS (Table 2) and OS (Table S2) in the training group. Except for sex on CSS (Table 2) and histology type on OS (Table S2), the other variables had similar results in the validation cohort. To avoid losing prognostic information, we used T stage and N stage instead of TNM stage for the multivariate analysis. The results showed that LONT, age, race, T stage, N stage, and location, but not sex or histology type, were independent prognostic factors for CSS (Table 3) and OS (Table S3) in both cohorts, while tumor size for CSS and grade for OS were not significantly associated with the survival outcome in the validation subset. To confirm the independent prognostic effect of LONT, the prognostic impact of LONT on CSS and OS was further examined by stratified analysis with a multivariate Cox proportional hazards model. According to the results, LONT was significantly associated with CSS (Table 4) and OS (Table S4) in all the subsets in both cohorts. All the results showed that lower LONT values indicated a worse prognosis.

Nomogram development in the training cohort
To obtain a simple nomogram for clinical application, a time-dependent ROC and C-index were used to further remove potential redundancy. After removing the factors of size, race, and grade, the C-index and AUC decreased slightly; however, when we removed location and age, and especially T stage and N stage, the C-index and AUC all decreased strikingly for CSS (Table 5) and OS (Table S5). Thus, only LONT, N stage, T stage, location and age were used in the final nomogram model. Figure 2a depicts the risk score of each item in the final nomogram. LONT occupied the largest proportion of risk scores, followed by N stage and T stage. Kaplan-Meier survival analysis confirmed that the nomogram risk score had excellent survival prediction ability for CSS (Fig. 2b) and OS (Fig. 2c). The C-index (0.741; 95% CI 0.733-0.749) and AUC (0.810) for the established nomogram were higher than those for the 8th TNM classification (C-index 0.691; 95% CI 0.683-0.699; AUC 0.755).

Nomogram in the validation cohort and the missing data cohort
Figures 3a (CSS) and 3b (OS) show that the nomogram risk score can accurately predict the survival difference between any two adjacent groups. The integrated AUC (Fig. 3c) and C-index (0.736; 95% CI 0.724-0.748) for the nomogram were higher than those for the 8th TNM classification (0.685; 95% CI 0.673-0.697). The calibration curves showed that predictions of 3-year (Fig. 3d), 5-year (Fig. 3e), and 10-year (Fig. 3f) CSS were highly consistent with the actual survival proportion (see also Figure S1). To verify the reliability of the nomogram, the missing data cohort was also used for further verification. The results were highly consistent with the validation group, and the survival difference between any two neighboring risk groups was still significant in both CSS (Figure S2A) and OS (Figure S2B) analyses. The values of the C-index (0.751; 95% CI 0.735-0.767) and AUC were even higher than those in the validation cohort (Figure S2C). Similar calibration results also showed that the predicted survival probability at 3, 5, and 10 years for CSS (Figure S2D, E, F) was highly consistent with the actual survival proportion.

Discussion
In the present study, we used LONT to quantify the relative degree of LND; moreover, a prognostic nomogram model based on LONT was established and validated using the SEER database. Our results showed that a high LONT was associated with improved survival of GC patients, independent of clinicopathological factors. The nomogram based on LONT, N stage, T stage, location and age included information not only on the characteristics of the tumor itself but also on the degree of LND, and it exhibited better prognostic performance than TNM stage, which can assist clinicians in developing individualized treatment strategies for GC patients after gastrectomy. To our knowledge, this is the first study quantifying the relative degree of LND and developing a novel nomogram based on LONT and clinicopathological factors to predict the survival of GC patients after gastrectomy.
An accurate prediction of the prognosis of patients with GC plays a very important role in postoperative treatment and follow-up planning. For resectable GC, the prognosis is related not only to the TNM stage and biological characteristics (Mao et al. 2019;Sun et al. 2020;Wang et al. 2019) but also to the degree of LND (Degiuli et al. 2004;Enzinger et al. 2007;Schwarz and Smith 2007) and postoperative comprehensive treatment (Smyth et al. 2020). So far, apart from the extent of lymphadenectomy at the time of gastrectomy, only ELNs and NLNs can reflect the degree of LND to some extent. However, patients with different disease states have their own individualized optimal ELNs and NLNs; therefore, ELNs or NLNs alone cannot be used to compare the degree of LND among patients with different TNM stages. Overall, we still lack indicators for objectively evaluating the degree of LND.
Previous studies have shown that both ELNs (Smith et al. 2005;Son et al. 2012) and NLNs (Kattan et al. 2003;Martinez-Ramos et al. 2014;Wang et al. 2017) are independent prognostic factors for GC. Unfortunately, due to the lack of important information, such as tumor biological characteristics, their clinical value needs further study. In our study, LONT was defined for the first time as the log of the ratio between the NLN counts plus one (Wen et al. 2017) and the T stage; the NLNs represent the total level of LND, and the T stage represents the severity of the disease. The NLNs adjusted by T stage can be understood as the relative number of negative lymph nodes removed for each patient. A higher value indicates that more NLNs were obtained, and conversely, a lower LONT value means fewer NLNs were obtained. Therefore, it can be used to compare the relative level of LND among different patients. Even in patients with different TNM stages, different ELNs or NLNs, the same LONT value represents the same risk level.
Fig. 3 Performance of the prognostic nomogram model in the validation cohort. a Kaplan-Meier survival analyses for cancer-specific survival based on nomogram scores, which were calculated according to the nomogram in Fig. 2a. b Kaplan-Meier survival analyses for overall survival based on nomogram scores. c The area under the time-dependent ROC curve was calculated for the 8th TNM staging system and the nomogram at five years for cancer-specific survival. Red: the nomogram established in the present study; green: the 8th TNM staging system. The calibration curves for predicting patient CSS at 3 years (d), 5 years (e) and 10 years (f). The nomogram model predicting CSS is plotted on the x-axis, and the actual survival proportion is shown on the y-axis
In our study, the univariate analyses showed that the HR of NLNs was 0.965 (95% CI 0.961-0.968), and remarkably, that of the NLNs adjusted by T stage was 0.255 (95% CI 0.24-0.272). This confirmed that adjusting for the effect of the T stage may significantly improve the prognostic value of NLNs. A similar result was validated by multivariate Cox analysis and in the validation cohort. The results of subgroup analysis also further confirmed the LONT effect on the CSS and OS rates for different clinicopathological factors. All the results indicated that LONT was an independent prognostic factor for surgically treated GC patients.
Unlike previously published nomograms that included all prognostic factors (Dikken et al. 2013;Kim et al. 2015;Wen et al. 2017), the present study used ROC and C-index analyses to further remove potential redundancy and obtain the simplest and most effective nomogram for clinical application. The results showed that after removing the factors of size, race, and grade, the C-index and AUC decreased only slightly; however, when we removed location and age, and especially T stage and N stage, the C-index and AUC all decreased strikingly for CSS (Table 5) and OS (Table S5). Therefore, our nomogram model only included age, location, LONT, N stage, and T stage. The C-index of the nomogram model was 0.741 (95% CI 0.733-0.749) for CSS. Kaplan-Meier survival analysis confirmed the excellent discriminant ability of the model between any two neighboring risk groups for CSS and OS in the validation and missing data cohorts. The values of the C-index and AUC were higher than those of the 8th TNM stage, also indicating the strong predictive ability of our nomogram model (Table 5). Thus, this comprehensive and personalized risk score prediction method could be applied as stratification criteria in guiding postoperative treatment and follow-up planning.
Notably, LONT had the largest proportion of risk scores in the model, with clear risk discriminatory ability for the same T stage, N stage, TNM stage or other clinicopathological factors (Table 3), and showed a higher C-index (0.689 ± 0.008) and AUC (0.733) than T stage (Table 5). The vital contribution of LONT to the nomogram also confirmed the influence of the degree of LND on the prognosis and highlighted the importance of using LONT for prognosis prediction of GC. Moreover, this marker can be simply calculated from the postoperative pathological report at no extra cost. As the model is adopted more widely in clinical practice, further improvements in the accuracy of survival estimates will benefit more patients.
Using the SEER data allows us to draw sound conclusions consistent with general clinical practice based on a large sample of GC patients. However, we must admit that the current study has some inherent limitations.
First, we lacked some routinely available clinical parameters, such as lymphovascular invasion, margin status, nerve invasion, CEA, and CA19-9. This information may affect the predictive value of the factors identified in our model. Second, the treatment was not considered because the SEER database did not provide detailed preoperative and postoperative treatment information for these patients. We therefore assumed that all patients received the same treatment. Third, the T stage as an indication of tumor characteristics is not accurate because the histological type, differentiation degree and genotyping of GC are also important biological characteristics but were not included in our adjustment. Furthermore, although we used the validation cohort and missing data cohort to verify our model, our results were not validated in our own database, and due to the retrospective nature of the SEER database and the above limitations, prospective data are needed to confirm these findings.

Conclusions
In conclusion, LONT is a new prognostic indicator that can reflect the relative degree of LND among different patients. It could effectively predict CSS and OS for resectable GC patients, independent of clinicopathological features. The established nomogram based on LONT showed better discriminatory ability than the 8th TNM staging system and is a simple, accurate and easy-to-use scoring method that can help clinicians develop individualized treatment strategies.

Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.

Consent to participate
Informed consent is waived as SEER is a deidentified, publicly available cancer database.

Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.