Introduction

Colorectal cancer (CRC) is a major health concern: it is the third most common cancer and the second leading cause of cancer death worldwide [1]. The incidence of early-onset CRC, defined as CRC diagnosed in individuals < 50 years of age, has risen worldwide over the past decades [2,3,4,5,6,7,8,9]. The clinical features of early-onset CRC differ from those of later-onset disease [4, 10]. A deeper understanding of the characteristics of early-onset CRC is therefore warranted.

Recent studies have begun to shed light on the unique genetic, clinicopathological, and molecular characteristics of early-onset CRC compared to its later-onset counterpart. These studies have revealed differences in genetic mutations [11,12,13,14], lifestyle factors [15,16,17], and the gut microbiome [18]. Despite this growing body of research, there remains a significant gap in our understanding of how tumor size affects survival outcomes in young patients with CRC. Although tumor size has been recognized as a prognostic factor for CRC, reported results have been inconsistent [19,20,21,22,23]. To the best of our knowledge, no study has investigated the impact of tumor size on survival outcomes in early-onset CRC.

This study aimed to bridge this knowledge gap by examining the impact of tumor size on the survival of patients with early-onset CRC. We hypothesized that tumor size may have a distinctive role in the prognosis of these patients, potentially influencing treatment decisions and survival outcomes differently than in older populations. Through a comprehensive analysis of clinical data from the Surveillance, Epidemiology, and End Results (SEER) database, this study sought to provide new insights into the prognostic significance of tumor size in early-onset CRC, with the goal of contributing to more tailored and effective treatment strategies and to better survival outcomes and quality of life for this patient population.

Materials and methods

Study design and participants

This was a retrospective cohort study. We used SEER*Stat 8.4.1 software and selected “Incidence - SEER Research Plus Data, 18 Registries, Nov 2020 Sub (2000–2018)” as the database. The clinicopathological data of patients diagnosed with early-onset CRC between 2004 and 2015 were extracted from this database. Primary tumor sites (C18.0, C18.2–18.7, C19.9, and C20.9) included the colon (cecum, ascending colon, hepatic flexure of colon, transverse colon, splenic flexure of colon, descending colon, sigmoid colon, and rectosigmoid junction) and the rectum. In addition, the histologic subtypes included adenocarcinoma (8140/3, 8144/3, 8201/3, 8210/3, 8211/3, 8213/3, 8220/3, 8221/3, 8255/3, 8260/3, 8261/3, 8262/3, 8263/3, 8310/3, 8323/3), mucinous adenocarcinoma (MA) (8480/3, 8481/3), and signet ring cell carcinoma (SRCC) (8490/3). Cases were coded according to the International Classification of Diseases for Oncology, Third Edition.

Patients with pathologically confirmed early-onset CRC diagnosed between 2004 and 2015 were eligible for inclusion. The exclusion criteria were as follows: tumor size of 0, unknown, or larger than 200 mm; no surgery; missing essential clinical or survival information; and age younger than 18 years.
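The selection rules above can be written as a simple filter. The following is a minimal sketch, not the authors' actual SEER*Stat query; the record fields (`age`, `tumor_size_mm`, `had_surgery`, `has_complete_data`) are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PatientRecord:
    # All field names are hypothetical; SEER uses its own variable names.
    age: int
    tumor_size_mm: Optional[int]  # None stands for "unknown"
    had_surgery: bool
    has_complete_data: bool


def is_eligible(rec: PatientRecord) -> bool:
    """Apply the inclusion and exclusion criteria described in the text."""
    if rec.age < 18 or rec.age >= 50:  # adults with early-onset disease only
        return False
    if rec.tumor_size_mm is None:  # unknown tumor size
        return False
    if rec.tumor_size_mm == 0 or rec.tumor_size_mm > 200:
        return False
    return rec.had_surgery and rec.has_complete_data


def select_cohort(records: List[PatientRecord]) -> List[PatientRecord]:
    """Keep only the records that pass every criterion."""
    return [rec for rec in records if is_eligible(rec)]


example_records = [
    PatientRecord(45, 60, True, True),    # eligible
    PatientRecord(17, 40, True, True),    # younger than 18
    PatientRecord(38, None, True, True),  # unknown tumor size
    PatientRecord(42, 250, True, True),   # larger than 200 mm
    PatientRecord(30, 35, False, True),   # no surgery
    PatientRecord(49, 200, True, True),   # eligible (200 mm is allowed)
]
cohort = select_cohort(example_records)
```

In this toy input only the first and last records remain in `cohort`.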

This study followed the Strengthening the Reporting of Cohort Studies in Surgery (STROCSS) reporting guidelines [24].

Study variables

The collected variables included age, sex, race, tumor size, histologic subtypes, grade, stage, chemotherapy, survival time, cause of death, and vital status records. The endpoints of this study were overall survival (OS) and cancer-specific survival (CSS). In the SEER database, patients between 2004 and 2010 were classified with the sixth American Joint Committee on Cancer (AJCC) classification, and patients between 2010 and 2015 were classified with both the sixth and seventh classifications. Thus, to unify the criteria, all patients were classified according to the sixth AJCC classification. OS was defined as the time from diagnosis to death from any cause or the last follow-up. CSS was defined as the time from diagnosis to death from colorectal cancer or the last follow-up.
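As an illustration of the two endpoint definitions, the sketch below derives event indicators for OS and CSS from a survival time, a vital status, and a cause of death. The argument names are hypothetical, and treating deaths from other causes as censored observations for CSS is an assumption (a common convention), not something the text states explicitly.

```python
from typing import NamedTuple


class Endpoints(NamedTuple):
    time_months: int
    os_event: int   # 1 = death from any cause, 0 = censored
    css_event: int  # 1 = death from colorectal cancer, 0 = censored


def derive_endpoints(survival_months: int, is_dead: bool, died_of_crc: bool) -> Endpoints:
    """Build OS and CSS event indicators for one patient."""
    os_event = 1 if is_dead else 0
    # Assumption: a death from another cause is censored in the CSS analysis.
    css_event = 1 if (is_dead and died_of_crc) else 0
    return Endpoints(survival_months, os_event, css_event)


alive = derive_endpoints(60, is_dead=False, died_of_crc=False)
crc_death = derive_endpoints(24, is_dead=True, died_of_crc=True)
other_death = derive_endpoints(36, is_dead=True, died_of_crc=False)
```

The same follow-up time is used for both endpoints; only the event indicator differs.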

Statistical analysis

Colon cancer was analyzed separately from rectal cancer. X-tile software [25] (version 3.6.1, Yale University School of Medicine) was used to determine the optimal cut-off points for age. Categorical variables were expressed as frequencies and percentages. OS and CSS were analyzed using Kaplan–Meier curves and compared using log-rank tests.
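For readers unfamiliar with these two methods, the following is a minimal, standard-library sketch of the Kaplan–Meier product-limit estimator and a two-group log-rank test. It is a teaching example on made-up data, not the R code used in the study, and all function names are illustrative.

```python
import math
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored cases both leave the risk set
    return curve


def logrank_test(times_a, events_a, times_b, events_b) -> Tuple[float, float]:
    """Two-group log-rank test; returns (chi-square statistic, P value), 1 df."""
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in pooled if e == 1})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for u, _, g in pooled if u >= t and g == 0)
        n_b = sum(1 for u, _, g in pooled if u >= t and g == 1)
        d_a = sum(1 for u, e, g in pooled if u == t and e == 1 and g == 0)
        d_b = sum(1 for u, e, g in pooled if u == t and e == 1 and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square upper tail, 1 df
    return chi2, p_value


curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
chi2, p_value = logrank_test([1, 2, 3], [1, 1, 1], [4, 5, 6], [1, 1, 1])
```

The log-rank function recomputes the risk set at every event time, which is easy to read but quadratic in the number of patients; a production analysis would use an established survival package.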

Potential nonlinear associations between tumor size and outcomes were examined using restricted cubic spline (RCS) [26] with 4 knots. Covariates included in the analysis were age, sex, race, histologic subtypes, grade, stage, and chemotherapy.
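To make the spline term concrete, the sketch below builds a restricted cubic spline basis in the truncated power form with 4 knots placed at the 5th, 35th, 65th, and 95th percentiles of tumor size (the knot positions given in the Fig. 3 legend). The scaling of the nonlinear terms and the interpolation rule for percentiles are assumptions of this sketch; the study's R implementation may differ in such details.

```python
from typing import List, Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """Percentile with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] * (1 - frac) + ordered[upper] * frac


def rcs_basis(x: float, knots: Sequence[float]) -> List[float]:
    """Restricted cubic spline basis for one value of x.

    Returns [x, s_1(x), ..., s_(k-2)(x)] for k knots. The nonlinear terms are
    divided by (last knot - first knot)^2 so they stay on the scale of x.
    """
    def cube_plus(u: float) -> float:
        return u ** 3 if u > 0 else 0.0

    t = list(knots)
    scale = (t[-1] - t[0]) ** 2
    span = t[-1] - t[-2]
    basis = [x]
    for j in range(len(t) - 2):
        term = (
            cube_plus(x - t[j])
            - cube_plus(x - t[-2]) * (t[-1] - t[j]) / span
            + cube_plus(x - t[-1]) * (t[-2] - t[j]) / span
        )
        basis.append(term / scale)
    return basis


# Made-up tumor sizes (mm) used only to place the knots.
sizes_mm = list(range(1, 201))
knots = [percentile(sizes_mm, q) for q in (5, 35, 65, 95)]
row = rcs_basis(50.0, knots)  # linear term plus two nonlinear terms
```

With 4 knots the basis has three columns, and each nonlinear term is zero below the first knot and linear above the last knot, which is what "restricted" refers to.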

Univariate and multivariate Cox regression analyses were performed to calculate hazard ratios (HR) and 95% confidence intervals (CI). To fully assess the relationship between tumor size and outcomes, tumor size was analyzed as both continuous and categorical variables (two, three, and four categories). Cut-off values for tumor size were determined based on professional experience and X-tile software.
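As a rough illustration of how a hazard ratio and its 95% CI arise from a Cox model, the sketch below fits a univariate model with a single binary covariate (tumor size > 50 mm versus ≤ 50 mm) by Newton-Raphson on the partial likelihood, using the Breslow convention for ties. The data are made up, the function name is hypothetical, and a real analysis (including the multivariate models) would rely on an established package such as the R software used in the study.

```python
import math
from typing import Sequence, Tuple


def cox_single_covariate(
    times: Sequence[float],
    events: Sequence[int],
    x: Sequence[float],
    max_iter: int = 50,
    tol: float = 1e-9,
) -> Tuple[float, float, float]:
    """Return (HR, lower 95% limit, upper 95% limit) for one covariate."""
    beta = 0.0
    info = 0.0
    event_idx = [i for i, e in enumerate(events) if e == 1]
    for _ in range(max_iter):
        score = 0.0
        info = 0.0
        for i in event_idx:
            # Risk set: everyone still under observation at this event time.
            risk = [j for j in range(len(times)) if times[j] >= times[i]]
            s0 = sum(math.exp(beta * x[j]) for j in risk)
            s1 = sum(x[j] * math.exp(beta * x[j]) for j in risk)
            s2 = sum(x[j] ** 2 * math.exp(beta * x[j]) for j in risk)
            score += x[i] - s1 / s0
            info += s2 / s0 - (s1 / s0) ** 2
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    # Standard error from the information evaluated just before the final
    # (negligible) step.
    se = 1.0 / math.sqrt(info)
    return math.exp(beta), math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se)


# Made-up example: eight patients, all with an observed event.
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1, 1, 1, 1, 1, 1, 1, 1]
sizes_mm = [80, 30, 70, 20, 90, 40, 60, 50]
large = [1 if s > 50 else 0 for s in sizes_mm]  # cut-off at 50 mm
hr, ci_low, ci_high = cox_single_covariate(times, events, large)
```

In this toy data set the large-tumor group tends to die earlier, so the estimated HR is above 1, but with only eight patients the interval is wide.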

Additionally, based on 3 models, the P values for linear trends were calculated using the quartile values as an ordinal variable. Model 1 was unadjusted; Model 2 was adjusted for age, sex, and race; Model 3 was further adjusted for histologic subtypes, grade, stage, and chemotherapy.
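The ordinal coding used for a trend test can be sketched as follows: each tumor size is mapped to its quartile (1 to 4), and that single numeric code would then enter the regression model in place of the category indicators. This is an illustrative sketch on made-up sizes; the cut-point rule of Python's `statistics.quantiles` is an assumption and may not match the rule used in the study.

```python
import statistics
from collections import Counter
from typing import List, Sequence


def quartile_codes(sizes: Sequence[float]) -> List[int]:
    """Map each value to its quartile, coded 1 (smallest) to 4 (largest)."""
    cuts = statistics.quantiles(sizes, n=4)  # three cut points
    return [1 + sum(size > c for c in cuts) for size in sizes]


sizes_mm = list(range(1, 101))  # made-up tumor sizes in mm
codes = quartile_codes(sizes_mm)
counts = Counter(codes)
```

The P value for linear trend is then the P value of the coefficient on this ordinal code in each of the three models.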

To assess the consistency of the impact of tumor size on outcomes, subgroup analysis was performed according to the above-mentioned covariates. Moreover, likelihood ratio tests were used to examine interaction [27].
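A likelihood ratio test for interaction compares the log-likelihood of a model with the tumor size by subgroup interaction term against the same model without it. The sketch below shows the arithmetic under that reading; the log-likelihood values are invented for illustration, and the chi-square tail function is a small standard-library helper written for this example.

```python
import math
from typing import Tuple


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square distribution (integer df >= 1)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * terms
    tail = math.erfc(math.sqrt(half))
    extra = sum(
        half ** (i - 0.5) / math.gamma(i + 0.5)
        for i in range(1, (df - 1) // 2 + 1)
    )
    return tail + math.exp(-half) * extra


def likelihood_ratio_test(loglik_without: float, loglik_with: float, df: int) -> Tuple[float, float]:
    """Return (test statistic, P value) for nested models.

    df is the number of extra parameters in the larger model, for example
    3 interaction terms for a binary size variable and a 4-level stage.
    """
    stat = 2.0 * (loglik_with - loglik_without)
    return stat, chi2_sf(stat, df)


# Invented log-likelihoods, for illustration only.
stat, p_interaction = likelihood_ratio_test(-1000.0, -996.0, df=3)
```

A small P value would indicate that the effect of tumor size differs across the levels of the subgroup variable.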

To reduce the impact of baseline differences on the outcomes, a sensitivity analysis was carried out using 1:1 propensity score matching (PSM) [28]. The balance in covariates was assessed by using the standardized mean difference (SMD) approach. SMD of 10% or less was considered to be adequate balance. After PSM, OS and CSS were analyzed using Kaplan–Meier curves and log-rank tests.
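The two ingredients of this sensitivity analysis can be sketched as below: an SMD for a continuous covariate and a 1:1 match on the propensity score. The text does not specify the matching algorithm, so greedy nearest-neighbor matching without replacement and the caliper of 0.05 are assumptions of this sketch, and the propensity scores are made up rather than estimated from a model.

```python
import math
import statistics
from typing import List, Sequence, Tuple


def standardized_mean_difference(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """SMD for a continuous covariate, using the pooled standard deviation."""
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    pooled_sd = math.sqrt(
        (statistics.variance(group_a) + statistics.variance(group_b)) / 2
    )
    return mean_diff / pooled_sd


def greedy_match(
    ps_large: Sequence[float],
    ps_small: Sequence[float],
    caliper: float = 0.05,  # assumed value, not taken from the study
) -> List[Tuple[int, int]]:
    """1:1 nearest-neighbor matching without replacement.

    Returns (index in ps_large, index in ps_small) pairs.
    """
    available = list(range(len(ps_small)))
    pairs = []
    for i, score in enumerate(ps_large):
        if not available:
            break
        j = min(available, key=lambda k: abs(ps_small[k] - score))
        if abs(ps_small[j] - score) <= caliper:
            pairs.append((i, j))
            available.remove(j)  # each control is used at most once
    return pairs


smd_age = standardized_mean_difference([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
pairs = greedy_match([0.30, 0.60], [0.10, 0.31, 0.58, 0.90])
```

After matching, the SMD would be recomputed for each covariate in the matched sample and compared with the 0.1 threshold.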

R software (version 4.3.1; http://www.r-project.org) was used for statistical analyses. Two-sided P < .05 indicated statistical significance.

Results

Characteristics of the participants

Overall, 33,356 patients with early-onset colon and rectal cancer were identified between 2004 and 2015 from the SEER database. According to the inclusion and exclusion criteria, 17,551 (76.7%) colon and 5323 (23.3%) rectal cancer patients were included. The study screening flow chart is shown in Fig. 1. The demographic and clinicopathological characteristics of early-onset colon and rectal cancer patients are summarized in Table 1.

Fig. 1 Flow chart of the study population

Table 1 Demographic and clinicopathological characteristics of patients with early-onset colon and rectal cancer

Kaplan–Meier survival analysis

For early-onset colon cancer, the 3-, 5-, and 10-year OS rates were 78.0%, 69.1%, and 60.7%, respectively, and the 3-, 5-, and 10-year CSS rates were 79.3%, 71.0%, and 63.9%, respectively. For early-onset rectal cancer, the 3-, 5-, and 10-year OS rates were 85.7%, 75.7%, and 65.2%, respectively, and the 3-, 5-, and 10-year CSS rates were 86.8%, 77.3%, and 67.9%, respectively. OS and CSS were better in patients with tumor size ≤ 50 mm than in those with tumor size > 50 mm (Fig. 2).

Fig. 2 Kaplan–Meier curves for overall survival (a) and cancer-specific survival (b) of patients with early-onset colon cancer; Kaplan–Meier curves for overall survival (c) and cancer-specific survival (d) of patients with early-onset rectal cancer. Large tumor: > 50 mm; Small tumor: ≤ 50 mm

Potential nonlinear associations between tumor size and survival

The RCS analysis showed that the risks of all-cause and cancer-specific death increased approximately linearly with increasing tumor size (Fig. 3).

Fig. 3 Association between tumor size and survival using a restricted cubic spline regression model. Early-onset colon cancer: a overall survival; b cancer-specific survival. Early-onset rectal cancer: c overall survival; d cancer-specific survival. Graphs show HRs for survival according to tumor size adjusted for age, sex, race, histologic types, grade, stage, and chemotherapy. Data were fitted with a restricted cubic spline Cox proportional hazards regression model with 4 knots at the 5th, 35th, 65th, and 95th percentiles of tumor size (reference is the 5th percentile). Solid lines indicate HRs, and shaded areas indicate 95% CIs. HR, hazard ratio; CI, confidence interval

Prognostic impact of tumor size on OS and CSS

Larger tumor size was associated with worse OS and CSS in both the unadjusted model and the fully adjusted model, regardless of whether it was analyzed as a continuous or categorical variable (Table 2).

Table 2 Association between tumor size and survival in early-onset colon and rectal cancer

Linear trend analysis

When tumor size was categorized based on quartiles, it still negatively impacted OS and CSS. Additionally, P-values for linear trend were significant in all 3 models for early-onset colon cancer, and significant in Model 1 and Model 2 for early-onset rectal cancer (Table 3).

Table 3 Association between tumor size and survival in early-onset colon and rectal cancer (according to quartile of tumor size)

Subgroup analysis

The results of the subgroup analyses of OS and CSS in early-onset rectal cancer were consistent (Supplementary Figs. S1 and S2). In early-onset colon cancer, Stage II stood out from the other stages. Although the HRs for Stages I, III, and IV were greater than 1, the HR for Stage II was less than 1 (HR, 0.95; 95% CI, 0.82–1.10), suggesting a trend toward better OS with larger tumors in this stage, although the confidence interval included 1 (Fig. 4). Similar results were observed in the subgroup analysis for CSS (Fig. 5).

Fig. 4 Forest plot for subgroup analysis of overall survival in early-onset colon cancer. Large tumor: > 50 mm; Small tumor: ≤ 50 mm. HR, hazard ratio; CI, confidence interval; MA, mucinous adenocarcinoma; SRCC, signet ring cell carcinoma

Fig. 5 Forest plot for subgroup analysis of cancer-specific survival in early-onset colon cancer. Large tumor: > 50 mm; Small tumor: ≤ 50 mm. HR, hazard ratio; CI, confidence interval; MA, mucinous adenocarcinoma; SRCC, signet ring cell carcinoma

Propensity score matching

Before PSM, baseline characteristics were substantially imbalanced. After PSM, the covariates were well balanced (all SMDs < 0.1) for the OS and CSS analyses in both early-onset colon cancer (Supplementary Tables S1 and S2) and rectal cancer (Supplementary Tables S3 and S4). After matching, Kaplan–Meier survival curves showed that the prognosis of patients with tumor size ≤ 50 mm was better than that of patients with tumor size > 50 mm (Fig. 6).

Fig. 6 Kaplan–Meier curves for overall survival (a) and cancer-specific survival (b) of patients with early-onset colon cancer after PSM; Kaplan–Meier curves for overall survival (c) and cancer-specific survival (d) of patients with early-onset rectal cancer after PSM. Large tumor: > 50 mm; Small tumor: ≤ 50 mm. PSM, propensity score matching

Discussion

Based on the SEER database, 22,874 participants with early-onset colon and rectal cancer were included. Our study revealed that larger tumor size, analyzed as both a continuous and a categorical variable, was significantly associated with worse OS and CSS in early-onset colon and rectal cancer patients. Notably, smaller tumor size tended to be associated with worse survival in stage II early-onset colon cancer, although the association was not statistically significant (adjusted HR, 0.95; 95% CI, 0.82–1.10, and adjusted HR, 0.95; 95% CI, 0.80–1.13 for OS and CSS, respectively). This suggests that smaller tumors may reflect a more biologically aggressive phenotype in patients with stage II early-onset colon cancer.

In accordance with our findings, some studies have shown that CRC patients with larger tumors had worse survival than those with smaller tumors, both with [20, 29, 30] and without [31] metastasis. However, other studies reached different conclusions. Hajibandeh et al. evaluated the predictive significance of tumor size in 192 CRC patients undergoing curative surgery [21]. They found that tumor size on its own may not have significant prognostic value for OS. Their study was limited by its small sample size. In addition, Shiraishi et al. performed a retrospective study of 95 patients with pT4 CRC and demonstrated that tumor size ≥ 50 mm was associated with better CSS than tumor size < 50 mm [23]. This contrasting result should be interpreted with caution because only 95 patients were included in the analysis. Overall, we still believe that larger tumor size is associated with worse survival outcomes in CRC.

An interesting result emerged from the subgroup analysis. Surprisingly, tumor size > 50 mm was associated with better OS and CSS than tumor size ≤ 50 mm in patients with stage II colon cancer; however, this difference was not statistically significant. This finding contrasted with the other stages and highlighted that the impact of tumor size on survival may depend on the stage of the disease. After a thorough literature search, we found that previous studies had reported this seemingly paradoxical finding [23, 32,33,34,35,36,37]. For example, Huang et al. analyzed 7719 patients with stage II colon cancer from the SEER database and indicated that patients with smaller tumors had decreased CSS compared with those with larger tumors [32]. This was very similar to what we report here. It has been speculated that smaller tumors with heavy intestinal wall invasion may represent a biologically aggressive phenotype, whereas larger tumors may reflect a biologically indolent phenotype in stage II CRC. This distinct growth pattern may be caused by inter-tumor heterogeneity of CRC that results from various genetic and epigenetic factors. More studies are needed to elucidate the underlying mechanism.

Our study has several strengths. In addition to the large sample size, the main strength was the use of multiple rigorous statistical methods. First, colon cancer was analyzed separately from rectal cancer because of their different biological behaviors. Second, potential nonlinear associations between tumor size and outcomes (OS and CSS) were evaluated using RCS. Third, to correct for potential confounding factors, univariable and multivariable Cox regression were used, and PSM was performed as a sensitivity analysis, which supports the robustness of the findings. Fourth, tumor size was analyzed as both a continuous and a categorical variable. Moreover, when it was analyzed as a categorical variable, different numbers of categories (two, three, and four categories) and cut-off values were used, which supports the reliability of the results. Fifth, both OS and CSS were evaluated as survival outcomes. Last, subgroup analyses were conducted and a special population was identified (stage II early-onset colon cancer).

However, several limitations should be considered when interpreting the results of the present study. First, the retrospective nature of the study limited the generalizability of the results. Prospective studies are needed in the future. Second, genetic and epigenetic data, microsatellite instability status, comorbidities, intestinal obstruction or perforation, and detailed information on CEA, radiotherapy, and chemotherapy were not available in the SEER database. Third, there is still a possibility of residual confounding, despite adjustment for potential confounders. Fourth, the SEER database includes only the US population, possibly resulting in a degree of selection bias. The results of the present study might not apply to patients in other countries, suggesting that a large-scale multicenter global study is necessary. Fifth, only patients who had undergone surgical resection were included in this study. Therefore, our results may not apply to patients without surgical resection.

To the best of our knowledge, this is the first large study to reveal the association between tumor size and survival outcomes in early-onset colon and rectal cancer. Our study highlights that tumor size is an important risk factor for OS and CSS in early-onset colon and rectal cancer. More prospective multicenter studies are needed to validate the association between tumor size and survival in stage II early-onset colon cancer, especially stratified by microsatellite instability status. Further studies should also be undertaken to elucidate the underlying genetic and molecular mechanisms of the impact of tumor size on the survival of early-onset colon and rectal cancer.

Conclusions

This study found that patients with larger tumors experienced worse OS and CSS compared to those with smaller tumors in early-onset colon and rectal cancer. Notably, smaller tumors may reflect a more biologically aggressive phenotype and be associated with worse survival in stage II early-onset colon cancer. More studies are warranted to verify our findings and elucidate the underlying mechanisms.