Introduction

According to Global Cancer Statistics, colorectal cancer (CRC) ranks third in incidence and second in mortality among malignant tumors. According to the American Cancer Society, the 5-year survival rate for colorectal cancer with distant metastases is only 14%. Approximately 26.5% of these can be ascribed to liver metastases of colorectal cancer (CRLM) [1]. The median overall survival (mOS) of CRLM patients was only 6.9 months, whereas CRLM patients who underwent surgical resection of liver metastases had a much longer median survival of up to 35 months [2,3,4]. Because surgery has such a substantial impact on prognosis, treatment strategies focus on making surgical resection available to a broader patient population. Neoadjuvant therapy is recommended for patients whose CRLM can be completely resected at the initial surgery (R0 resection) to improve their chances of long-term tumor-free survival [5]. Nevertheless, a significant proportion of patients (up to 70%) diagnosed with CRLM are deemed inoperable at the time of initial diagnosis [6]. In such cases, it is crucial to administer comprehensive treatment promptly and aggressively to reduce the tumor burden so that patients may qualify for surgical intervention, an approach known as conversion therapy.

Presently, the response rate to chemotherapy combined with targeted therapy can reach 60% in patients with initially unresectable CRLM [7, 8]. The assessment of whether liver metastases can be surgically removed is made by a multidisciplinary team (MDT). However, there is a lack of defined criteria to determine whether patients with initially inoperable CRLM should undergo palliative or conversion therapy. Hence, it is crucial to establish a scoring system that evaluates the survival benefit for patients with CRLM following conversion or neoadjuvant chemotherapy in conjunction with surgical resection. A range of predictive systems have been created to determine whether patients will benefit from surgery. One of the most frequently cited is the Clinical Risk Score (CRS), which predicts a patient’s overall survival (OS) based on five variables [9]. Because it was derived from a single center and a limited number of patients, its external validity and relevance across different populations remain debated [10, 11]. Other well-established models include those of Nordlinger [12] and Iwatsuki [13], while the Genetic And Morphological Evaluation (GAME) score [14] and the modified clinical score (m-CS) [15] have been adapted to the genomic and chemotherapeutic variables of the modern era by including KRAS mutation status. Nevertheless, there is still ongoing debate over the influence of some variables on survival outcomes, and accurate prognostic assessment for CRLM patients who receive treatment is still lacking.

We obtained data on patients pathologically diagnosed with CRC from 2000 to 2019 from the Surveillance, Epidemiology, and End Results Database (SEER). We also collected data from the Affiliated Hospital of Qingdao University from January 1, 2010, to June 1, 2022. Prognostic models were created for CRLM patients following preoperative chemotherapy and surgical treatment using Cox proportional hazards regression and competing risk regression analyses.

Methods and materials

Case information collection and data collation

We downloaded data on CRLM patients from the SEER database (www.seer.cancer.gov), using the “Incidence SEER Research Plus Data, 17 Registries, Nov 2021 Sub (2000–2019)” dataset. This database contains retrospective information on cancer epidemiology and demographics for approximately 35% of the population of the United States of America. Because the SEER database does not include confidential patient data, ethical clearance was not necessary for use of the dataset. We also collected data from the Affiliated Hospital of Qingdao University from January 1, 2010, to June 1, 2022. The study protocol was approved by the Medical Ethics Committee of the Affiliated Hospital of Qingdao University (Approval No. QDFY WZLL 28552).

Inclusion and exclusion criteria

Filter the SEER data by the following fields: (i) {Site recode ICD-O-3/WHO 2008} = ‘Colon and Rectum’; (ii) {Behavior code ICD-O-3} = ‘Malignant’; (iii) {Chemotherapy recode (yes, no/unk)} = ‘Yes’; (iv) {RX Summ—Systemic/Sur Seq} = ‘Systemic therapy before surgery’; (v) {SEER Combined Mets at DX-liver (2010 +)} = ‘Yes’; (vi) {Diagnostic Confirmation} = ‘Positive exfoliative cytology, no positive histology’, ‘Positive histology’, ‘Positive microscopic confirm, method not specified’; (vii) {Histologic Type ICD-O-3} = ‘8140’, ‘8210’, ‘8220’, ‘8480’, ‘8481’; (viii) {RX Summ–Surg Oth Reg/Dis (2003 +)} = ‘Non-primary surgical procedure to distant site’; (ix) {Mets at DX-Other(2016 +)} = ‘None; no other metastases’; (x) {RX SUMM—SURG PRIM SITE (1998 +)} = ‘Resection’, ‘A surgical procedure to the primary site was done’.

Inclusion criteria: (i) primary site is colon, junction of colon and rectum or rectum; (ii) adenocarcinoma confirmed by microscopic pathology; (iii) liver metastases are identified either concurrently with the diagnosis of CRC or within 6 months following the surgery of primary cancer; (iv) surgical resection (R0 resection) of the primary site and liver metastasis; (v) receiving complete chemotherapy before surgery.

Exclusion criteria: (i) confirmation of multiple primary tumors; (ii) accompanied by distant metastasis to other organs and/or distant lymph node metastasis; (iii) those whose survival status is death, but the cause is unknown; (iv) those with missing follow-up dates.

The SEER data obtained after inclusion and exclusion criteria were randomly divided 1:1 into training cohort and internal validation cohort using the R (v.4.3.1) ‘splithalfr’ package. The screened data from the Affiliated Hospital of Qingdao University were defined as the external validation cohort (QDU cohort).
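The random 1:1 division described above can be illustrated with a short Python sketch. This is not the ‘splithalfr’ routine used in the study; the function name, the integer patient identifiers, and the seed value are assumptions made only for illustration.

```python
import random

def split_half(ids, seed=2024):
    """Randomly split patient identifiers into two cohorts of (nearly) equal size."""
    rng = random.Random(seed)      # fixed seed (assumed value) keeps the split reproducible
    shuffled = list(ids)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2       # with an odd count, the second cohort gets the extra patient
    return shuffled[:mid], shuffled[mid:]

# 633 eligible SEER patients, represented here by hypothetical integer identifiers
training, validation = split_half(range(633))
```

With 633 patients, such a split yields cohorts of 316 and 317, the same sizes as the training and internal validation cohorts reported in the Results.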

Study variables and outcomes

We selected the following variables for further analysis: age, gender, marital status, race, primary tumor site, pre-treatment CEA level, tumor size, T stage, N stage, pathological grade, number of tumor deposits, perineural infiltration, number of regional lymph nodes examined, and number of positive regional lymph nodes. Among them, age, tumor size, number of tumor deposits, number of regional lymph nodes examined, and number of positive regional lymph nodes were continuous variables, and the other variables were categorical variables. We transformed the continuous variables into categorical variables, using Restricted Cubic Spline (RCS) analysis to define the number of categories and the cut-off points (grouping nodes). OS was used as the study outcome. Kaplan–Meier (K–M) survival analyses were performed within different subgroups of the SEER and QDU cohorts, respectively, and mOS was obtained using SPSS software (v.26).
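For readers unfamiliar with how a K–M curve and a median survival time are obtained, the following minimal Python sketch shows the product-limit calculation. It is not the SPSS procedure used in the study; the function names and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, S(time)) at each distinct death time.

    times  -- follow-up in months
    events -- 1 = death observed, 0 = censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # gather everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve

def median_survival(curve):
    """Smallest time at which S(t) drops to 0.5 or below; None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Toy data (hypothetical): six patients, two of them censored
curve = kaplan_meier([5, 10, 10, 20, 30, 40], [1, 1, 0, 1, 0, 1])
mos = median_survival(curve)
```

Censored patients leave the risk set without lowering the curve, which is why the estimate differs from a simple proportion of deaths.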

Cox proportional hazards regression model and competing risk regression model construction

Univariate Cox regression analysis was performed in the training cohort. Variables with p < 0.05 in the univariate Cox regression were considered significant. A stepwise regression method based on the Akaike information criterion (AIC) was applied to further screen the variables, and the combination of variables with the smallest AIC value was selected to construct the best-fitting Cox regression model. The proportional hazards assumption was tested to check whether the hazard ratio (HR) of each variable remained constant over time.
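For reference, the quantities involved in this step can be written in standard form. The notation is assumed here rather than taken from the study: h_0(t) is the baseline hazard, x the covariate vector, β the coefficient vector, k the number of estimated coefficients, and ℓ the maximized log partial likelihood.

```latex
% Standard Cox model, hazard ratio, and AIC (notation assumed, not from the source)
\[
  h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right), \qquad
  \mathrm{HR}_j = \exp(\beta_j), \qquad
  \mathrm{AIC} = 2k - 2\,\ell(\hat{\beta})
\]
```

Under the proportional hazards assumption, HR_j does not depend on t; stepwise selection retains the variable set for which the AIC is smallest.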

Considering that deaths from non-cancer-related causes act as competing events, we further developed a competing risk regression model. Univariate analysis was performed to estimate the cumulative incidence of recurrence by cumulative incidence function (CIF). Cumulative risk curves were plotted using the Nelson–Aalen estimator. Between-group differences were tested by Gray’s test.
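A minimal Python sketch of a nonparametric cumulative incidence estimate under competing risks is given below. It is not the ‘riskRegression’ implementation used in the study; the function name, the cause coding (0 = censored, 1 = cancer death, 2 = death from other causes), and the toy data are assumptions.

```python
def cumulative_incidence(times, causes, cause_of_interest=1):
    """Cumulative incidence of one cause of death in the presence of competing causes.

    causes -- 0 = censored, 1 = cancer death, 2 = other-cause death (assumed coding)
    Returns a list of (time, CIF(time)) at each time with at least one death.
    """
    data = sorted(zip(times, causes))
    n_at_risk = len(data)
    overall_surv = 1.0   # all-cause survival just before the current time
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d_interest = 0
        d_any = 0
        removed = 0
        while i < len(data) and data[i][0] == t:
            cause = data[i][1]
            if cause != 0:
                d_any += 1
            if cause == cause_of_interest:
                d_interest += 1
            removed += 1
            i += 1
        if d_any > 0:
            # probability of surviving all causes so far, then dying of this cause now
            cif += overall_surv * d_interest / n_at_risk
            overall_surv *= 1.0 - d_any / n_at_risk
            curve.append((t, cif))
        n_at_risk -= removed
    return curve

# Toy data (hypothetical): four patients
cancer = cumulative_incidence([1, 2, 3, 4], [1, 2, 0, 1], cause_of_interest=1)
other = cumulative_incidence([1, 2, 3, 4], [1, 2, 0, 1], cause_of_interest=2)
```

In this toy example the two cause-specific curves end at 0.75 and 0.25. Treating the other-cause death as censored and taking one minus the K–M estimate would instead give 1.0 for cancer death, which illustrates why a competing risk analysis can yield lower estimates.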

Nomogram construction and model testing

We constructed nomograms for the Cox proportional hazards regression model and the competing risk regression model, respectively. Prognostic assessment of selected patients in the SEER database was performed to compare the predicted outcomes of the two nomograms. Internal validation and external validation were performed to further test the reliability and accuracy of the nomograms. Calibration curves were used to assess the agreement between predicted and observed outcome probabilities. The discrimination of the model was tested using the receiver operating characteristic (ROC) curve and the area under the curve (AUC). Decision curve analysis (DCA) was used to evaluate the clinical utility of the model.
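The concordance index (C-index) reported alongside these validation tools can be sketched as follows. This is a plain illustration of Harrell’s pairwise definition, not the routine used in the study; the function name and the toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs in which the patient who died
    earlier was assigned the higher predicted risk (ties in risk count as 0.5)."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # a pair is comparable when patient i died strictly before patient j's time
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data (hypothetical): follow-up in months, 1 = death, and a model risk score
c_index = concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.3, 0.2, 0.4])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk.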

Statistical analyses

The study utilized R software version 4.3.1, with the major R packages utilized as follows:

(i) The ‘ggrcs’ package was used to conduct RCS analysis, and the ‘car’ package was used to analyze the baseline patient data;

(ii) The ‘splithalfr’ package was used for the random division of the SEER data;

(iii) The ‘survival’, ‘survminer’, and ‘coin’ packages were used to construct the Cox proportional hazards regression models;

(iv) The ‘riskRegression’ package was used to construct the competing risk models;

(v) The ‘survival’, ‘regplot’, ‘vioplot’, ‘beanplot’, ‘survivalROC’, and ‘ggDCA’ packages were used to create and validate the nomograms.

The statistical analysis of survival analysis was performed using SPSS (v.26).

A two-sided p < 0.05 indicates a statistically significant difference.

Results

Baseline characteristics

A total of 735 patients met the inclusion and exclusion criteria and were included in this study. Of these, 316 were in the SEER training cohort, 317 in the SEER internal validation cohort, and 102 in the external validation cohort. RCS was utilized to determine the optimal grouping nodes for the following continuous variables: age (≤ 57, > 57), tumor size (< 5 cm, ≥ 5 cm), number of tumor deposits (negative, positive), number of regional lymph nodes examined (< 17, ≥ 17), and number of regional lymph node positives (negative, positive) (Fig. 1A–E).
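The grouping described above can be expressed as a small Python sketch. The cut-off values are those reported in the text; the function name and the dictionary field names are hypothetical.

```python
def categorize(patient):
    """Map raw continuous values to the two-level groups defined by the RCS cut-offs."""
    return {
        "age": "> 57" if patient["age_years"] > 57 else "<= 57",
        "tumor_size": ">= 5 cm" if patient["tumor_size_mm"] >= 50 else "< 5 cm",
        "tumor_deposits": "positive" if patient["tumor_deposits"] >= 1 else "negative",
        "nodes_examined": ">= 17" if patient["nodes_examined"] >= 17 else "< 17",
        "positive_nodes": "positive" if patient["positive_nodes"] >= 1 else "negative",
    }

# Hypothetical patient record
example = categorize({
    "age_years": 63,
    "tumor_size_mm": 42,
    "tumor_deposits": 0,
    "nodes_examined": 17,
    "positive_nodes": 2,
})
```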

Fig. 1

Optimal grouping nodes for continuous variables identified by RCS. A The best grouping node for tumor size was 49.899497 mm; B the best grouping node for the number of positive regional lymph nodes was 0.9657; C the best grouping node for the number of regional lymph nodes examined was 16.824121; D the best grouping node for age was 57.03015 years; and E the best grouping node for tumor deposits was 0.000

The baseline characteristics of all patients included in the study are shown in Tables 1 and 2. The years of diagnosis of patients in the SEER and QDU cohorts are listed in Supplemental Table 1. Within the SEER cohort, the majority of individuals were male (61.5%), with Caucasians being the most prevalent ethnic group (79.9%). The most frequent tumor type was moderately differentiated adenocarcinoma (66.0%), the most common primary site was the colon (49.3%), and T3 (55.1%) and N1 (49.1%) were the most common stages. Four hundred five (64.0%) patients had elevated pre-treatment CEA levels. Positive regional lymph node metastases were found in 402 (63.5%) patients. One hundred forty-nine (23.5%) patients showed positive perineural infiltration, and 95 (15.0%) patients had ≥ 1 tumor deposit. Two hundred sixteen patients (34.1%) had received radiotherapy. There were no statistically significant differences between the training cohort and the internal validation cohort in variables other than primary tumor site and pretreatment CEA level.

Table 1 Baseline characteristics of patients in the total SEER population, modeling cohort, and internal validation cohort
Table 2 Baseline characteristics of patients in the total population, SEER cohort, and QDU external validation cohort

Baseline characteristics of the external validation cohort revealed that a large majority of the patients were male (76.5%) and over the age of 57 (72.5%). The primary tumor was located in the rectum in 52.0% of patients, and 79.4% of patients had moderately differentiated adenocarcinoma. Tumor deposits were negative in 63.7% of patients, and 65.7% of patients had positive regional lymph node metastases. There were 20 patients (19.6%) in the external validation cohort who had received radiotherapy (Table 2).

Results of overall and subgroup survival analyses

The mean survival time of the training cohort was 64.24 months (95% CI 58.50–69.97) with a mOS of 55.00 months (95% CI 46.97–63.03), the mean survival time of the internal validation cohort was 60.27 months (95% CI 54.91–65.63) with a mOS of 48.00 months (95% CI 40.65–55.35), and the external validation cohort had a mean survival time of 65.84 months (95% CI 59.34–72.34) and a mOS of 68.00 months (95% CI 54.91–81.08). Subgroup K–M analysis of the SEER cohort showed significant survival differences (p < 0.05) between subgroups of age, number of tumor deposits, number of positive regional lymph nodes, presence of perineural infiltration, T stage, N stage, and primary tumor site (Fig. 2A). Subgroup K–M survival analysis of the external validation cohort similarly yielded significant survival differences (p < 0.05) in subgroups of the above variables (Fig. 2B). The mOS of all subgroups is shown in Table 3.

Fig. 2

Kaplan–Meier survival analysis in the SEER cohort and the QDU external validation cohort. A The SEER cohort showed significant survival differences in age, number of tumor deposits, number of positive regional lymph nodes, perineural infiltration, T stage, N stage and tumor site subgroups (p < 0.05); B the QDU cohort showed significant survival differences (p < 0.05)

Table 3 mOS of SEER cohort and QDU external validation

Cox proportional hazards regression model and nomogram

Univariate Cox regression analysis was performed on the training cohort. Age, T stage, N stage, presence of perineural infiltration, number of tumor deposits, number of positive regional lymph nodes and tumor size were verified as significant variables (p < 0.05) (Table 4). Variables were screened for the training cohort by AIC stepwise regression method and included in multivariate Cox regression analysis. Finally, age, N stage, presence of perineural infiltration, number of tumor deposits and number of positive regional lymph nodes were obtained as independent risk factors for prognosis (p < 0.05) (Table 4). These variables met the Cox regression proportional risk assumption test (p > 0.05) (Fig. 3A).

Table 4 Modeling cohort for univariate and multivariate Cox regression
Fig. 3

Cox proportional risk regression modeling and testing. A Age, N stage, number of tumor deposits, number of positive regional lymph nodes and perineural infiltration met the Cox regression proportional risk assumption test (p > 0.05); B calibration curves of the modeling set at 1 year, 3 years, and 5 years, respectively; C ROC curves of the modeling set at 1 year, 3 years, and 5 years, respectively, and the corresponding AUC values; D DCA curves of the modeling set at 1 year, 3 years, and 5 years, respectively, and compared with the TNM prediction model; E calibration curves for the internal validation set at 1 year, 3 years, and 5 years, respectively; F ROC curves and corresponding AUC values for the internal validation set at 1 year, 3 years, and 5 years, respectively; G DCA curves for the internal validation set at 1 year, 3 years, and 5 years, respectively; H calibration curves for the external validation set at 1 year, 3 years, and 5 years, respectively; I external ROC curves and corresponding AUC values for the validation set at 1 year, 3 years, and 5 years, respectively; J DCA curves for the external validation set at 1 year, 3 years, and 5 years, respectively. Model: the Cox proportional risk regression model, and TNM: the TNM prediction model

Calibration curves, ROC curves, and DCA curves were plotted for the modeling set, the internal validation set, and the external validation set at 1 year, 3 years, and 5 years, respectively. The model’s performance and accuracy were deemed satisfactory (Fig. 3B–J). The C-index and DCA curves suggested that the Cox model had higher clinical predictive and application value than TNM staging (Table 5, Fig. 3D).
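The quantity plotted on a decision curve is the net benefit at a given threshold probability. The Python sketch below shows the standard calculation for a binary outcome at a fixed time point. It assumes that every patient’s status at that time is known (no censoring), which is a simplification relative to the survival-based DCA used in the study; the function name and the toy data are hypothetical.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold.

    y_true -- 1 if the event occurred by the time point, else 0
    y_prob -- predicted event probability for each patient
    """
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # false positives are weighted by the odds of the threshold probability
    return tp / n - fp / n * threshold / (1.0 - threshold)

# Toy data (hypothetical): four patients, threshold probability of 0.3
outcomes = [1, 1, 0, 0]
predicted = [0.9, 0.6, 0.4, 0.1]
nb_model = net_benefit(outcomes, predicted, 0.3)
nb_treat_all = net_benefit(outcomes, [1.0] * 4, 0.3)   # "treat all" reference strategy
```

A model is considered clinically useful over a range of thresholds when its net benefit exceeds both the "treat all" and "treat none" (net benefit of zero) strategies, and it is preferred over a competing model when its curve lies higher.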

Table 5 C-index for Cox proportional risk regression model and competitive risk model

We constructed a Cox proportional hazards regression nomogram for 1 year, 3 years, and 5 years (Fig. 4A). The nomogram displays the covariate values for a specific patient (ID = 35,288,445) in the SEER dataset. The total points were used to calculate the probability that the patient’s survival time would be less than 1, 3, and 5 years. The probabilities were 0.0516, 0.358, and 0.668, respectively (Fig. 4A).
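As background on how a Cox nomogram converts points into probabilities: the total points are a linear rescaling of the model’s linear predictor, and the probability of death by time t follows from the baseline survival function. The notation below is assumed rather than taken from the study (S_0(t) is the baseline survival, β the coefficients, x the patient’s covariates).

```latex
% Standard relation between the Cox linear predictor and predicted risk (notation assumed)
\[
  \mathrm{lp}(x) = \beta^{\top} x, \qquad
  \hat{P}(T \le t \mid x) = 1 - \hat{S}_0(t)^{\exp(\mathrm{lp}(x))}, \qquad
  t \in \{1, 3, 5\}\ \text{years}
\]
```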

Fig. 4

Cox proportional risk regression model nomogram

Competing risk regression model and nomogram

Gray’s test was used to test for between-group differences. CIF was used to estimate the cumulative incidence of recurrence, and Nelson–Aalen cumulative risk curves were developed. There was a statistically significant difference in the risk of death between subgroups in the variables of age, number of regional lymph nodes examined, tumor size, N stage, presence of perineural infiltration, number of tumor deposits, and number of positive regional lymph nodes after controlling for competing risk events (p < 0.05). There was a statistically significant difference (p < 0.05) in the cumulative competing risks in age and primary tumor sites (Table 6, Fig. 5A). A multivariable competing risk regression model was constructed and visualized by the nomogram (Fig. 5B). The model was further validated with C-indexes of 0.623, 0.682, and 0.703 at 1 year, 3 years, and 5 years, respectively (Table 5). The nomogram calculated the probability of death within 1, 3, and 5 years for the same patient (id = 35,288,445). These probabilities, which take competing risk events into account, were 0.0443, 0.327, and 0.629, respectively (Fig. 5B). The cumulative risk of death calculated by the competing risk model differed discernibly from that calculated by the Cox proportional hazards model: the competing risk model yielded a slightly lower risk of death for this patient. The calibration curves at 1 year, 3 years, and 5 years are shown in Fig. 5C.

Table 6 Modeling cohort for Gray’s test
Fig. 5

Competing risk modeling and testing. A Nelson–Aalen curves for the variables age, pathological stage, tumor site, tumor size, tumor deposits, number of regional lymph nodes examined, N stage, number of positive regional lymph nodes, and perineural infiltration. B Competing risk model nomogram. For the patient with id = 35,288,445, the total score is 193, and the predicted cumulative probability of death at 1 year, 3 years, and 5 years is 0.0489, 0.357 and 0.663, respectively. C Calibration curves for the modeling set at 1 year, 3 years, and 5 years, respectively

Discussion

The primary benefit of preoperative chemotherapy in CRLM patients is the potential to render previously inoperable tumors operable, a concept commonly known as “conversion therapy”. Conversion therapy can downstage the disease to a resectable level in approximately 35% of these patients [16]. Preoperative chemotherapy can also help with downstaging in patients with initially resectable disease. Meanwhile, preoperative chemotherapy facilitates the treatment of micrometastatic lesions and enhances the ability to achieve negative margins after surgery [17]. Nevertheless, the heterogeneity of CRLM means that a single biomarker or predictor may not accurately reflect the complex tumor characteristics of individuals with initially inoperable CRLM. The primary factors limiting survival time in this patient population are failure of conversion and postoperative recurrence. The long-term outcomes of conversion treatment remain a subject of debate. A considerable proportion of patients experience early relapse [16, 18]. A retrospective study showed that patients with conversion therapy combined with surgery had a mOS of 24 months, whereas patients with conversion failure had a mOS of only 14 months (p < 0.001), which was similar to that of patients on palliative chemotherapy [19]. This implies that it is crucial to identify the patients who may benefit from conversion therapy, as well as from preoperative chemotherapy.

The determination of “unresectability” varies depending on the surgeon, the accuracy of cross-sectional imaging, and guidelines that have not yet been standardized. Resection of liver metastases must achieve complete (R0) resection while preserving sufficient functional liver tissue. Factors such as the number and location of liver metastases are no longer used as single factors in determining the feasibility of surgery [20, 21]. However, a clear distinction between initially resectable and initially unresectable lesions remains relatively difficult. From an oncological point of view, perioperative chemotherapy is recommended even if the lesion is anatomically resectable; induction chemotherapy followed by surgery is also recommended for patients with borderline resectable CRLM [22]. Therefore, our study combined into one patient population those with initially resectable CRLM who required preoperative neoadjuvant therapy and those with initially unresectable CRLM who received conversion therapy. We aimed to investigate the factors that affect the prognosis of patients with CRLM treated with preoperative chemotherapy combined with surgical resection.

Our model incorporated several characteristics that are supported by existing scoring systems. Preoperative CEA levels have been observed to have an impact on both overall survival (OS) and the response to systemic therapy [9, 23], and the presence of positive tumor deposits in CRLM patients is also associated with a worse outcome [24]. A tumor deposit is a distinct tumor mass located in the pericolonic or perirectal fat or nearby mesentery (colonic mesenteric fat), separate from the main tumor infiltrate and without any remaining lymphoid tissue [25]. The 8th edition of the American Joint Committee on Cancer (AJCC) TNM staging system classifies lesions with negative regional lymph nodes and positive tumor deposits as N1c. The presence and number of tumor deposits correlate substantially with an unfavorable postoperative prognosis in CRC. Furthermore, there is increasing support for considering tumor deposits as an indicator of distant metastasis [26,27,28]. Perineural invasion (PNI) is one of the most potent interactions between tumors and nerves. PNI is defined as tumor in close proximity to a nerve and involving at least 33% of the nerve circumference, or the presence of tumor cells within any layer of the nerve sheath (epineurium, perineurium, or endoneurium) [29]. The growth of nerves within the tumor promotes cancer progression, and the interaction between tumor and nerves is mutually reinforcing. Neurons release growth factors and angiogenic signals that promote the growth of tumor cells. In addition, tumors exploit nerves as an additional pathway for distant metastasis [30, 31]. PNI has been shown to correlate with a poor prognosis in a variety of malignancies such as pancreatic adenocarcinomas, squamous cell carcinomas of the head and neck, and gastric and colorectal cancers [29, 31, 32]. Specifically, PNI is strongly linked to disease progression and unfavorable outcomes in CRC, and it indicates the possibility of early spread of cancer cells to other parts of the body [32]. In our patient cohort, 25.6% of patients were positive for PNI, which is consistent with data from previous studies [33,34,35,36]. The impact of the number of positive regional lymph nodes on survival after surgery has also been taken into account. Research has demonstrated a direct correlation between the number of positive regional lymph nodes and pathological grade. In addition, a higher number of positive lymph nodes increases the probability of multiple metastases and is associated with a poorer response to chemotherapy [37]. Ozawa et al. [38] demonstrated that a higher number of positive regional lymph nodes was associated with worse 5-year survival in stage IV CRC patients who underwent surgery. Our investigation also found that the presence of positive regional lymph node metastases was an independent prognostic risk factor. Several prognostic studies have included the maximum diameter of liver metastases as a variable for postoperative prognostic prediction. Furthermore, previous studies have regarded the size of liver metastasis as a separate risk factor. However, Jang et al. [39] have demonstrated that the OS of patients with 1–2 liver metastatic nodules is not significantly different from that of patients with 3–8 nodules. This survival outcome may be attributed to the introduction of advanced therapeutic techniques [15].

Two prognostic models were created in this investigation. The Cox proportional hazards regression model offers the benefit of analyzing the impact of multiple variables on survival outcomes and estimating the survival of patients over time. In our study, the DCA curve of the Cox model lies consistently above that of the TNM system, suggesting that the model’s prediction has a higher net clinical benefit and better clinical applicability. Given the advanced age at which CRLM patients are generally diagnosed, there may be additional factors that impact cancer-specific mortality. Therefore, the competing risk regression model was used to assess the extent to which multiple variables affect the result. This study revealed that competing events exhibited a conflicting association with cancer-related outcomes with regard to age and tumor site. Patients over the age of 57 made up 51.2% of the overall population. Advanced age is associated with an increased risk of treatment-related adverse outcomes, including a higher likelihood of postoperative complications and readmission [40]. The presence of competing events within tumor site subgroups may be linked to the likelihood of postoperative complications with different surgical modalities. A study conducted by Zenger et al. [41] demonstrated that radical surgery on the mid-transverse colon increased the likelihood of developing gastroparesis and intestinal obstruction compared to radical surgery for CRC in other locations. This finding has implications for the prognosis of patients.

After excluding other competing events, our study indicated that the number of regional lymph nodes examined was an independent risk factor, separate from the other prognostic factors included in the Cox model. There is a correlation between the prognosis of CRC and the number of regional lymph nodes examined. In a study conducted by Murphy et al. [42], there was a notable disparity in the 5-year survival rate between patients who had fewer than nine regional lymph nodes examined and those who had more than ten (69.4% vs 87.6%, p = 0.001). Chang et al. [43] found that stage II patients who had fewer than 11 lymph nodes examined had a 5-year overall survival rate of 73%, those with 11–20 lymph nodes examined had a rate of 80%, and those with more than 20 lymph nodes examined had a rate of 87% (p = 0.001). Furthermore, our study indicated that the AUC values of the training cohort and the C-index of the competing risk model tended to increase from 1 year to 3 years and 5 years. This suggests that the model’s predictive performance and accuracy improved over time, particularly at the 5-year mark. This improvement may be attributed to the mOS of the entire population.

Certain variables included in other scoring systems or prognostic studies were not included in our analyses, such as the number of hepatic metastases. Jang et al. showed that OS in patients with 1–2 CRLM nodules was not statistically different from that in patients with 3–8 CRLM nodules [39]. The advent of modern treatment modalities may explain the similar survival outcomes independent of the number of liver metastases [15]. In recent years, molecular pathological features and gene mutation status (e.g., KRAS, BRAF, NRAS, TP53, microsatellite status, tumor mutation burden and immune checkpoint) and chemotherapy regimen have been shown to correlate with therapeutic response and survival outcomes [44, 45]. Chemotherapy regimens containing oxaliplatin or irinotecan, along with anti-EGFR for tumors with wild-type RAS/BRAF or anti-VEGF for tumors with RAS/BRAF mutations, were successful in enhancing the rate of tumor removal and improving overall survival [46]. The three-agent irinotecan-based chemotherapy regimen (FOLFOXIRI) has been shown to increase the rate of successful surgery compared to the two-agent regimen. However, it is also associated with a higher incidence of toxic side effects [47,48,49].

Our study has the following advantages: (i) Existing prognostic models for CRC have encompassed various patient groups, but little focus has been given to the prognosis of CRLM patients who undergo preoperative chemotherapy followed by surgery. This study aimed to develop a prognostic model specifically for this group of patients. (ii) Previous research indicates that only about 15% to 25% of CRC patients have liver metastases at the time of diagnosis, and the percentage of CRLM patients who can receive preoperative chemotherapy combined with surgery is even lower [50,51,52]. The large sample size of the SEER database allowed us to investigate potential risk factors and construct reasonably accurate predictive models. (iii) The SEER database is highly accurate and objective, which helps to minimize selection bias. In addition, the study included data from 102 patients for external validation, providing further confirmation of the model’s accuracy. (iv) The K–M survival analysis method treats competing events as censored events, which can impact the accuracy of survival outcome estimates [53, 54]. In this study, two models were employed: the Cox proportional hazards regression model and the competing risk regression model. The objective was to investigate the influence of multiple factors, including non-oncological factors, on the outcomes in a more comprehensive manner. (v) The study conducted a comparison with the TNM staging system in all three cohorts. Compared with the TNM staging approach, the model provided a higher C-index in all three cohorts.

Our study has the following limitations: (i) Certain variables, such as treatment regimen, molecular pathological features, and immunotherapeutic markers, were not included in the study due to restrictions imposed by the SEER database and the economic and other objective conditions of the patients in the external validation cohort. (ii) Disparities in race and variations in treatment guidelines contribute to an inherent imbalance in the baseline data of patients. Therefore, large-scale, prospective studies are still necessary to generate a higher level of evidence that can aid in clinical decision-making.

Conclusion

Our findings indicate that age, N stage, perineural invasion, tumor deposits, and regional lymph node metastasis are independent risk factors in the Cox proportional hazards regression analysis and can predict the prognosis of CRLM patients who undergo preoperative chemotherapy and surgery. After adjusting for competing risk events, the variables that remained as independent risk factors for cancer-specific death were the number of regional lymph nodes examined, N stage, perineural invasion, tumor deposits, and number of positive regional lymph nodes. Finally, our study found that the patients’ mortality risk estimated by the competing risk model was lower than that estimated by the Cox proportional hazards model, which indicates that the competing risk model predicts the risk of cancer-related death more precisely. However, further analysis using extensive datasets is necessary to validate this observation.