Introduction

Pancreatic cancer (PC) is the most aggressive and lethal malignancy among gastrointestinal cancers. The overall 5-year survival rate is less than 10% and has shown little improvement for years (Ansari et al. 2016; Siegel et al. 2022). The primary treatments for PC include surgery, neoadjuvant therapy, and postoperative therapy, of which surgical resection is considered the only potentially curative option (Stott et al. 2022). However, most PC patients undergo surgical resection with an inadequate number and extent of lymph node dissection (Groot et al. 2017; Kovac et al. 2019). R0 resection is often difficult to achieve, and patients diagnosed with PC usually experience early local recurrence and metastasis after surgery (Suto et al. 2022; Torphy et al. 2020). Moreover, imaging appearance alone is insufficient for evaluating lymph node metastasis (LNM) preoperatively. The preoperative evaluation of LNM is therefore important in resectable PC: LNM status is a prognostic determinant that guides the type of surgical resection and the use of preoperative neoadjuvant therapy and aggressive postoperative adjuvant therapy (Shi et al. 2019; Suto et al. 2022).

Nomograms have been widely used to predict lymph node metastasis. However, nomograms for predicting LNM preoperatively in resectable PC patients are lacking. In this study, the clinical characteristics of patients diagnosed with PC were analyzed and a nomogram for predicting LNM was developed, which may help provide personalized guidance for resectable PC patients.

Materials and methods

Data collection

In our study, data on patients diagnosed with pancreatic cancer from 2000 to 2019 were collected from the Surveillance, Epidemiology, and End Results (SEER) database. The exclusion criteria were as follows: diagnosis confirmed by clinical diagnosis only, by radiography without microscopic confirmation, by direct visualization without microscopic confirmation, or by an unknown method; more than 2 primaries; SEER cause-specific death unknown; survival months equal to zero or unknown; grade unknown; stage or T, N, M stage unknown; surgery unknown; tumor size unknown; regional nodes examined or positive unknown; age < 18 years; race unknown; distant metastasis confirmed during surgery (stage M1); unresectable pancreatic cancer; death within one month after surgery. The following clinicopathological variables were collected: gender, age at diagnosis, race, grade, primary site, histology, T-stage, and lymph node status. The screening flowchart is shown in Fig. 1.

Fig. 1 Patient enrollment and exclusion process in the SEER database

A total of 62 patients diagnosed with PC from December 2018 to February 2022 at The First Affiliated Hospital of Xinxiang Medical University were used to validate the constructed nomogram externally. The inclusion and exclusion criteria were the same as those for the training set. The last follow-up was in March 2023. This study was approved by the institutional review board of our hospital.

Statistical analysis

Medians (IQR) and frequencies (proportions) were calculated, and Mann–Whitney U tests, independent t-tests, Pearson’s chi-square tests, Fisher’s exact tests, and univariate and multivariate binary logistic regression analyses were performed, using SPSS (SPSS Inc., Chicago, USA). Nomograms, receiver operating characteristic (ROC) curves, calibration plots, the net reclassification improvement (NRI), the integrated discrimination improvement (IDI), decision curve analysis (DCA) curves, and Kaplan–Meier plots were generated with R software (version 4.2.2). P < 0.05 was considered statistically significant.

Construction, validation, and clinical usefulness of the nomogram

Univariate and multivariate logistic regression analyses were used to identify independent predictors of LNM, and the minimum Akaike’s information criterion (AIC) was used to select the optimal model parameters and construct a nomogram for evaluating the risk of LNM. The predictors included age at diagnosis, race, primary site, grade, histology, and T-stage. The nomogram was constructed from these variables (a dynamic nomogram is also provided). The accuracy and discrimination of the nomogram were assessed by the ROC curve and the C-index. Calibration curves were used to evaluate the consistency between the actual outcomes and the predicted probabilities. The NRI and IDI were calculated to compare the performance of the nomogram with that of individual clinical predictors. Additionally, the clinical utility in decision-making was assessed by DCA.
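For reference, the binary logistic model and the AIC used to compare candidate models can be written in standard notation as below. This is a generic sketch, not the authors' fitted model: p denotes the probability of LNM, x_1, ..., x_m the (dummy-coded) predictors, k the number of estimated parameters, and \hat{L} the maximized likelihood. These symbols are introduced here for illustration only.

```latex
\operatorname{logit}(p) = \ln\frac{p}{1-p} = \beta_0 + \sum_{i=1}^{m} \beta_i x_i,
\qquad
\mathrm{AIC} = 2k - 2\ln\hat{L}
```

Among candidate models, the one with the smallest AIC is preferred. The odds ratio for a predictor level corresponds to e^{\beta_i}.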

Results

Patient characteristics

A total of 13,200 patients with resectable pancreatic cancer diagnosed between 2000 and 2019 were enrolled from the SEER database according to the screening flowchart and randomly divided into a training group (n = 9279) and an internal validation group (n = 3921) at a ratio of 7:3 (Fig. 1). In addition, 62 patients with PC who underwent surgical resection at the First Affiliated Hospital of Xinxiang Medical University served as the external validation group. The detailed clinicopathological features of all patients are presented in Table 1. No significant differences were found among the three groups except for race (P < 0.001) (Table 1).
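A random 7:3 division of this kind can be sketched as follows. The function name, the seed, and the use of integer patient identifiers are assumptions made for illustration; the study does not describe its splitting procedure. The reported group sizes (9279 and 3921) are close to, but not exactly, a 70/30 cut, so this sketch is not expected to reproduce them.

```python
import random

def split_train_validation(patient_ids, train_fraction=0.7, seed=2023):
    """Shuffle the identifiers reproducibly and cut them into two groups."""
    rng = random.Random(seed)          # hypothetical seed, for reproducibility
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# 13,200 placeholder identifiers standing in for the SEER cohort
train_ids, valid_ids = split_train_validation(range(13200))
print(len(train_ids), len(valid_ids))
```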

Table 1 Clinicopathological characteristics in resectable pancreatic cancer

Univariate and multivariate logistic regression results

The clinicopathological factors associated with LNM were identified by univariate and multivariate logistic regression analyses. Univariate logistic regression analysis showed that age at diagnosis, race, primary site, grade, histology, and T-stage were significantly associated with LNM in PC patients (Table 2). Multivariate logistic regression analysis then identified the following independent factors: race [Asian: odds ratio (OR) = 0.807 (95%CI = 0.686–0.949), P = 0.009], primary site [Body of pancreas: OR = 0.479 (95%CI = 0.404–0.568), P < 0.001], grade [G3: OR = 1.904 (95%CI = 1.642–2.208), P < 0.001], histology [Neuroendocrine carcinoma: OR = 5.465 (95%CI = 4.586–6.513), P < 0.001], and T-stage [T4: OR = 4.892 (95%CI = 3.694–6.408), P < 0.001] (Table 2).
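A reported OR and its confidence interval can be mapped back to the underlying logistic coefficient. The sketch below assumes a Wald-type 95% interval that is symmetric on the log-odds scale, which is the usual default but is not stated in the text; the helper name is hypothetical. It uses the grade G3 values quoted above.

```python
import math

def log_odds_summary(or_value, ci_low, ci_high, z_crit=1.96):
    """Recover the coefficient, its standard error, and the Wald z statistic
    from an odds ratio and its 95% CI (assumes a Wald interval)."""
    beta = math.log(or_value)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return beta, se, beta / se

# Grade G3 as reported: OR = 1.904 (95% CI 1.642 to 2.208)
beta, se, z = log_odds_summary(1.904, 1.642, 2.208)
print(f"beta={beta:.3f}, SE={se:.3f}, z={z:.2f}")
```

A z statistic far above 1.96 is consistent with the reported P < 0.001.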

Table 2 Risk variables for lymph node metastasis determined by univariate and multivariate logistic regression analyses

Construction and validation of the nomogram based on predictors of lymph node metastasis

The minimum Akaike’s information criterion (AIC) was used to select the optimal model parameters and construct a nomogram for assessing the risk of LNM (Arunajadai 2009; Coles et al. 1980; Wang et al. 2004; Zhang 2016), and a total of six predictors (age at diagnosis, race, primary site, grade, histology, and T-stage) were integrated into the nomogram (Fig. 2). The AUC of the nomogram was 0.711 (95%CI: 0.700–0.722) in the training group, 0.700 (95%CI: 0.683–0.717) in the internal validation group, and 0.845 (95%CI: 0.749–0.942) in the external validation group, outperforming the single factors (Fig. 3). The AUCs of T-stage and grade alone were lower than that of the nomogram. The AUC for T-stage was 0.645 (95%CI: 0.635–0.656), 0.649 (95%CI: 0.634–0.665), and 0.704 (95%CI: 0.587–0.821) in the training, internal validation, and external validation sets, respectively. The AUC for grade was 0.619 (95%CI: 0.608–0.630), 0.615 (95%CI: 0.598–0.632), and 0.601 (95%CI: 0.472–0.729) in the training, internal validation, and external validation groups, respectively. Furthermore, the calibration plots showed good consistency in the training set (C-index: 0.689), internal validation set (C-index: 0.686), and external validation set (C-index: 0.752) (Fig. 4). We also designed an online web calculator: https://xxlchxjh.shinyapps.io/DynNomappforLNMinpancreaticcancer/.
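The AUC values above can be read as the probability that a randomly chosen node-positive patient receives a higher predicted risk than a randomly chosen node-negative patient. A minimal sketch of that rank-based definition follows; the function name and the predicted risks are hypothetical and are not taken from the study.

```python
def auc_rank(scores_pos, scores_neg):
    """AUC as the share of (positive, negative) pairs in which the positive
    case has the higher score; ties count as half a win."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical predicted LNM risks for illustration only
pos = [0.8, 0.6, 0.55, 0.4]   # patients with LNM
neg = [0.7, 0.5, 0.3, 0.2]    # patients without LNM
auc = auc_rank(pos, neg)
print(auc)
```

The pairwise loop is quadratic in cohort size, which is fine for illustration; rank-based formulas give the same value more efficiently on large cohorts.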

Fig. 2 The nomogram for the risk of lymph node metastasis in resectable pancreatic cancer patients

Fig. 3 ROC of the nomogram for the training cohort (A), the internal validation cohort (B), and the external validation cohort (C)

Fig. 4 The calibration plots of the training cohort (A), the internal validation cohort (B), and the external validation cohort (C)

Clinical utility was assessed by DCA, which calculates the net benefit at different risk threshold probabilities. The nomogram yielded a larger net benefit than grade or T-stage alone, indicating that it is a reliable clinical tool for predicting LNM in PC patients who undergo surgical resection (Fig. 5).
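The net benefit plotted in a decision curve follows the standard definition: true positives per patient minus false positives per patient, weighted by the odds of the risk threshold. The sketch below uses hypothetical outcomes and predicted risks, and the function names are illustrative; it also includes the "treat all" reference strategy that decision curves usually show.

```python
def net_benefit(y_true, risk, threshold):
    """Net benefit of treating every patient whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 1)
    fp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Hypothetical data: 1 = LNM present, 0 = LNM absent
y = [1, 1, 1, 0, 0, 0, 0, 1]
risk = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.8]
nb_model = net_benefit(y, risk, 0.5)
nb_all = net_benefit_treat_all(y, 0.5)
print(nb_model, nb_all)
```

Evaluating both functions over a grid of thresholds and plotting the results would give curves of the kind shown in a DCA figure.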

Fig. 5 Nomogram decision curves (DCA) for the training cohort (A), the internal validation cohort (B), and the external validation cohort (C)

Additionally, the accuracy of the nomogram relative to the T-stage was evaluated with the NRI and IDI. In the training group, the NRI was 0.370 (95%CI: 0.329–0.411) and the IDI was 0.044 (95%CI: 0.039–0.048, P < 0.001). In the internal validation group, the NRI was 0.274 (95%CI: 0.211–0.337) and the IDI was 0.035 (95%CI: 0.029–0.041, P < 0.001). In the external validation group, the NRI was 0.577 (95%CI: 0.091–1.063) and the IDI was 0.062 (95%CI: 0.004–0.120, P = 0.037). Thus, the nomogram predicted LNM more accurately than the T-stage.
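The two indices can be illustrated with a short sketch. The text does not state whether a categorical or a category-free NRI was computed, so the code below assumes the category-free (continuous) form. The data and function names are hypothetical, with "old" standing for the risk from a reference model such as T-stage alone and "new" for the risk from the nomogram.

```python
def continuous_nri(y, old, new):
    """Category-free NRI: net share of events moved up in risk
    plus net share of non-events moved down in risk."""
    def net_up(pairs):
        up = sum(1 for o, n in pairs if n > o)
        down = sum(1 for o, n in pairs if n < o)
        return (up - down) / len(pairs)
    events = [(o, n) for yi, o, n in zip(y, old, new) if yi == 1]
    nonevents = [(o, n) for yi, o, n in zip(y, old, new) if yi == 0]
    return net_up(events) - net_up(nonevents)

def idi(y, old, new):
    """IDI: mean risk gain among events minus mean risk gain among non-events."""
    ev = [n - o for yi, o, n in zip(y, old, new) if yi == 1]
    ne = [n - o for yi, o, n in zip(y, old, new) if yi == 0]
    return sum(ev) / len(ev) - sum(ne) / len(ne)

# Hypothetical outcomes and predicted risks
y = [1, 1, 1, 0, 0, 0]
old = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
new = [0.7, 0.6, 0.4, 0.3, 0.4, 0.6]
nri_value = continuous_nri(y, old, new)
idi_value = idi(y, old, new)
print(round(nri_value, 3), round(idi_value, 3))
```

Positive values of both indices mean that the new model assigns higher risks to patients with LNM and lower risks to patients without it, on balance.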

The Kaplan–Meier overall survival curves of the training and internal/external validation groups are plotted in Fig. 6. Overall survival was significantly worse in PC patients with positive LNM in the training group and in both validation groups (P < 0.01).
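The Kaplan–Meier estimate behind such curves is a running product of conditional survival probabilities at each observed death time. A minimal sketch with hypothetical follow-up data is given below; the function name and the data are illustrative, and the group comparison (for example a log-rank test) is not included.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.
    events[i] is 1 for death and 0 for censoring."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve

# Hypothetical survival months and death indicators
months = [3, 5, 5, 8, 12, 12]
died = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(months, died)
for t, s in km_curve:
    print(t, round(s, 3))
```

Patients censored at a given time are counted as at risk for deaths at that same time, which is the usual convention.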

Fig. 6 The Kaplan–Meier overall survival (OS) analysis of lymph node metastasis in the training set (A), the internal validation set (B), and the external validation set (C)

Discussion

PC is one of the most lethal cancers and the seventh leading cause of cancer death worldwide (Sung et al. 2021). Even after surgical resection, early recurrence rates are reported to be 50% to 60%, with 5-year survival rates of only 20% to 30% (Gupta et al. 2017; Shin et al. 2018). PC patients with positive LNM have a worse prognosis with or without surgical resection. LNM status is a significant prognostic factor in PC patients and is also important for treatment decisions. PC patients with positive lymph node metastasis should receive neoadjuvant chemotherapy or immunotherapy before surgical resection (Barrak et al. 2022; Kanda et al. 2011; Roland et al. 2015). It is therefore important to determine lymph node status before surgical resection. At present, imaging examinations have low sensitivity and specificity for lymph node metastasis, and LNM is difficult to identify before surgical resection. A sensitive and efficient prediction model for assessing LNM status preoperatively in PC patients is therefore needed.

In our study, six clinicopathological factors were identified as risk factors associated with LNM in PC patients: age at diagnosis, grade, histology, T-stage, primary site, and race. This was largely consistent with previous analyses (Huang et al. 2023; Song et al. 2018). A convenient preoperative nomogram was constructed from these independent predictors. This is the first study to construct and validate a nomogram for predicting LNM in resectable PC patients based on a large population. Previously, researchers paid more attention to lymph node metastasis in pancreatic head cancer. Xingren Guo et al. developed a nomogram for predicting lymphatic metastasis in pancreatic head cancer based on 191 pancreatic head cancer patients who received laparoscopic pancreaticoduodenectomy (Guo et al. 2023). Yi-Nan Shen et al. constructed a nomogram for predicting peripancreatic vein invasion in pancreatic head cancer patients. However, lymphatic metastasis also occurs in tumors at other sites, such as the body and tail of the pancreas (Shi et al. 2022; Tanaka et al. 2022, 2020), and a model for predicting LNM status at those tumor sites is needed. The nomogram constructed in our study could meet this need. In our study, PC patients with tumors in the head of the pancreas were more likely to have LNM than those with tumors in the body or tail, which was consistent with previous studies and clinical practice (Guo et al. 2023; Kobayashi et al. 2022).

Various studies have demonstrated that race is related to lymph node metastasis and prognosis (Oweira et al. 2017; Zheng-Pywell et al. 2022). Rui Zheng-Pywell et al. revealed that black patients had a higher risk of LNM than white patients in tumors less than 2 cm in size (Zheng-Pywell et al. 2022). In our study, Asian PC patients, such as Chinese, Japanese, and Korean patients, were less likely to have LNM. Moreover, a higher rate of positive LNM was observed in black PC patients, which is consistent with the previous conclusion.

The correlation between grade and LNM in PC patients has been widely reported in previous studies. Norifumi Harimoto et al. showed that lymph node metastasis was significantly associated with higher tumor grade in pancreatic neuroendocrine neoplasms (Harimoto et al. 2019). Similarly, our study found that grade was an independent risk factor associated with LNM in PC patients. LNM is more likely to occur in patients with poorly differentiated or undifferentiated PC.

The histological type is commonly considered an important predictor of prognosis in PC patients. Bi-Yang Cao et al. found that adenocarcinoma was an independent risk factor for poor prognosis in PC patients with liver metastasis (Cao et al. 2023). Until now, few studies have focused on the association between histological type and the risk of LNM. In this study, the risk of LNM was higher in PC patients with infiltrating duct carcinoma, while PC patients with the histological type of neuroendocrine carcinoma had less LNM. Furthermore, the T-stage, which reflects tumor size and the extent of invasion, is a significant prognostic factor in PC. In 2022, Xi-Tai Huang et al. showed that the T-stage was significantly associated with LNM in pancreatic neuroendocrine tumors (Huang et al. 2022). In this study, PC patients with T4 tumors had a higher risk of LNM than those with T1 or T2 tumors.

The nomogram for evaluating the risk of LNM in PC patients was developed from easily available clinicopathological factors, including age at diagnosis, race, grade, histology, T-stage, and tumor location. The AUC and the calibration curves demonstrated good discrimination and consistency for this nomogram. The risk of LNM in PC patients can be calculated conveniently from these accessible variables. Furthermore, DCA curves were used to estimate clinical utility and showed a good net benefit. In summary, the risk of LNM in PC patients can be predicted easily and accurately before surgery with the newly established nomogram.

Although the nomogram had good accuracy for predicting the risk of LNM in PC patients, this study has several limitations. First, selection bias could not be avoided because of the retrospective design. For example, patients with missing data were excluded from our study, which may have caused selection bias. Second, variables such as age, tumor size, leucocytes, albumin, and lymphocytes/monocytes have been identified as independent predictors of LNM in pancreatic head cancer (Guo et al. 2023). Serum CA 19–9, PC.ae.C42_5, and PC.aa.C38_4 were considered powerful preoperative clinical variables for predicting the early recurrence of pancreatic cancer (Rho et al. 2019). However, several of these variables are not available in the SEER database and therefore could not be incorporated into the nomogram. Finally, the external validation cohort from our hospital is small, which may compromise the robustness of the validation, and more external validations are needed.

Conclusion

In summary, a nomogram for predicting preoperative LNM in PC patients was developed based on the SEER database, and it showed good performance and clinical applicability.