Colorectal cancer is among the leading causes of cancer-related mortality in both Western countries and China [1, 2]. It is divided into colon cancer and rectal cancer according to the primary tumor location, with colon cancer accounting for approximately 70% of cases [1, 3]. Early colon cancer refers to carcinoma with invasion limited to the submucosa [4, 5] and is designated T1NXM0 in the TNM classification system.

T1 colon cancer is heterogeneous in its clinical presentation and prognostic outcome [4]. Generally, the long-term survival of patients with stage I colorectal cancer is excellent after radical resection [6]. The risk of lymph node metastasis (LNM) in T1 colorectal cancer has been reported to range between 8% and 15% [6, 7, 8]. The probability of lymph node involvement is considered in the clinical management of colon cancer because lymph node status substantially affects patient prognosis [9]. On the one hand, inadequate removal of positive regional lymph nodes increases the risk of local recurrence and worsens prognosis. On the other hand, unnecessarily extensive surgical resection leads to postoperative morbidity and reduced quality of life.

Advanced endoscopic techniques have become established therapeutic approaches in carefully selected and evaluated patients with T1 colon cancer [8, 10]. Because LNM occurs in only approximately 10% of all T1 colorectal cancers [7, 11], additional surgical resection might be avoided in most patients after initial endoscopic resection, provided that careful evaluation rules out possible risk factors, including LNM. For these patients, unnecessary surgery carries risks such as anastomotic leakage and bowel dysfunction but yields no survival benefit [12]. However, for patients with a high risk of LNM, surgical resection is required to decrease the local recurrence rate and subsequently increase survival. Therefore, to establish a proper therapeutic strategy and minimize the local recurrence rate, patients with a high risk of LNM should be identified.

To this end, in the present study we aimed to determine the predictors of LNM in T1 colon cancer using data on eligible patients from the Surveillance, Epidemiology, and End Results (SEER) database.

Materials and methods

Data source and patient selection

The National Cancer Institute-based SEER database covers approximately 28% of all cancer cases and includes 18 population-based cancer registries in the USA [13]. SEER is also one of the largest publicly accessible databases globally and is updated annually. In this study, relevant data were retrieved from the SEER database. This study was approved by the institutional ethical review board of Shanghai Ninth People’s Hospital, School of Medicine, Shanghai Jiao Tong University.

A total of 8056 eligible patients diagnosed between 2004 and 2012 were enrolled according to the following inclusion criteria: (1) age 18 years or older; (2) a pathological diagnosis of T1 adenocarcinoma or mucinous adenocarcinoma of the colon; (3) at least 12 lymph nodes sampled; and (4) active follow-up. Patients were excluded if they had in situ cancer, underwent preoperative radiotherapy, or had another primary malignancy.
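The selection rules above can be sketched as a simple filter. This is only an illustration under stated assumptions, not the extraction procedure actually used: the dictionary keys (`age`, `t_stage`, `nodes_examined`, and so on) are hypothetical names chosen for readability and are not real SEER variable names.

```python
# Hypothetical record layout; the keys below are NOT real SEER variable names.
ELIGIBLE_HISTOLOGY = {"adenocarcinoma", "mucinous adenocarcinoma"}


def is_eligible(record: dict) -> bool:
    """Apply the inclusion and exclusion criteria described in the text."""
    included = (
        record["age"] >= 18
        and record["t_stage"] == "T1"  # also rules out in situ (Tis) disease
        and record["histology"] in ELIGIBLE_HISTOLOGY
        and record["nodes_examined"] >= 12
        and record["active_follow_up"]
    )
    excluded = (
        record["preoperative_radiotherapy"]
        or record["other_primary_malignancy"]
    )
    return bool(included and not excluded)


def make_record(**overrides) -> dict:
    """Build an eligible toy record, then override selected fields."""
    base = {
        "age": 63,
        "t_stage": "T1",
        "histology": "adenocarcinoma",
        "nodes_examined": 17,
        "active_follow_up": True,
        "preoperative_radiotherapy": False,
        "other_primary_malignancy": False,
    }
    base.update(overrides)
    return base


if __name__ == "__main__":
    toy_cohort = [
        make_record(),
        make_record(nodes_examined=9),   # fewer than 12 nodes sampled
        make_record(t_stage="Tis"),      # in situ cancer
    ]
    print([is_eligible(r) for r in toy_cohort])
```

Keeping inclusion and exclusion as two separate expressions mirrors the way the criteria are stated in the text, which makes each rule easy to audit.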

Data on patient demographics (age, sex, year at diagnosis, ethnicity, and marital status) and tumor characteristics [tumor size, histology, carcinoembryonic antigen (CEA) level, tumor grade, primary tumor site, number of resected lymph nodes, and postoperative radiation] were retrieved from the SEER database and subsequently analyzed.

Overall survival (OS) was defined as the time from the date of diagnosis to death from any cause or to the last follow-up. Cancer-specific survival (CSS) was defined as the time from the date of diagnosis to death attributed to colon cancer.
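The two endpoint definitions differ only in which deaths count as events. The sketch below shows one common way to code them; it assumes (this is not stated in the text) that deaths from other causes are treated as censored observations for CSS and are given their own code for competing risks analysis. The function and constant names are hypothetical.

```python
from typing import Optional

# Codes for competing risks analysis (a conventional choice, assumed here).
CENSORED, CANCER_DEATH, OTHER_DEATH = 0, 1, 2


def code_endpoints(alive: bool, died_of_colon_cancer: Optional[bool] = None) -> dict:
    """Map a patient's vital status to event indicators for each endpoint.

    os_event:       1 for death from any cause, otherwise 0.
    css_event:      1 only for death attributed to colon cancer; deaths from
                    other causes are treated as censored (an assumption).
    competing_code: 0 = censored, 1 = colon cancer death, 2 = other death.
    """
    if alive:
        return {"os_event": 0, "css_event": 0, "competing_code": CENSORED}
    if died_of_colon_cancer:
        return {"os_event": 1, "css_event": 1, "competing_code": CANCER_DEATH}
    return {"os_event": 1, "css_event": 0, "competing_code": OTHER_DEATH}


if __name__ == "__main__":
    print(code_endpoints(alive=True))
    print(code_endpoints(alive=False, died_of_colon_cancer=True))
    print(code_endpoints(alive=False, died_of_colon_cancer=False))
```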

Statistical analysis

Chi-square or Fisher’s exact tests were used to compare categorical variables. An unadjusted logistic regression model, an adjusted logistic regression model, and a backward logistic regression model were used to identify and confirm risk factors for positive lymph node involvement. Odds ratios (ORs) and 95% confidence intervals (CIs) were determined. A Cox regression model was used to identify independent prognostic factors for OS and CSS. In addition, OS and CSS curves were generated using the Kaplan–Meier method, with a log-rank test to determine statistical significance. Finally, a competing risks model was established and the cumulative incidence function (CIF) was estimated. SPSS version 13.0 (SPSS Inc., Chicago, IL, USA) and R software for Windows version R-3.4.3 (The R Foundation for Statistical Computing, Vienna, Austria) were used for statistical analysis. A two-sided P value < 0.05 was considered to indicate statistical significance.
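As an illustration of how an OR and its 95% CI arise, the following sketch computes them from a 2 × 2 table using the Wald (log-scale) interval. For a single binary predictor this coincides with the estimate from an unadjusted logistic regression. The counts are invented for the example and are not taken from the study; the function name is hypothetical, and the analyses reported here were performed in SPSS and R, not with this code.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.959964):
    """Odds ratio and Wald confidence interval from a 2 x 2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z: standard normal quantile (1.959964 gives a 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Invented counts: 30 of 100 exposed and 100 of 900 unexposed have the outcome.
    or_, lo, hi = odds_ratio_ci(30, 70, 100, 800)
    print(f"OR = {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

Because the interval is built on the log scale, it is symmetric around the OR in the multiplicative sense: the product of the two limits equals the squared OR.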

Results

Baseline characteristics

The patient selection process is shown in Table 1. Of 161,589 patients in the SEER database who were diagnosed with colon cancer and underwent surgical resection during 2004–2012, 8056 eligible patients were included in the present analysis, comprising 3924 male and 4132 female patients. The median number of lymph nodes sampled was 17 [interquartile range (IQR): 14–22]. The overall risk of LNM in patients with T1 colon cancer was 12.0% (N = 967). The median follow-up was 68 months (ranging from 47 to 94 months). At the end of follow-up, 6650 (82.55%) patients were still alive. The cancer-specific mortality rates were 9.41% (N = 91) and 3.26% (N = 231) in patients with and without LNM, respectively. Other detailed clinicopathological information is shown in Table 2.
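The percentages in this paragraph follow directly from the reported counts. The short check below recomputes them; all counts are taken from the text, and only the variable and function names are introduced here for illustration.

```python
# Counts as reported in the text.
TOTAL_PATIENTS = 8056
LNM_POSITIVE = 967
LNM_NEGATIVE = TOTAL_PATIENTS - LNM_POSITIVE  # 7089
ALIVE_AT_END = 6650
CANCER_DEATHS_LNM_POSITIVE = 91
CANCER_DEATHS_LNM_NEGATIVE = 231


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to two decimal places."""
    return round(100 * numerator / denominator, 2)


if __name__ == "__main__":
    print("LNM rate:", pct(LNM_POSITIVE, TOTAL_PATIENTS))
    print("Alive at end of follow-up:", pct(ALIVE_AT_END, TOTAL_PATIENTS))
    print("Cancer-specific mortality, LNM positive:",
          pct(CANCER_DEATHS_LNM_POSITIVE, LNM_POSITIVE))
    print("Cancer-specific mortality, LNM negative:",
          pct(CANCER_DEATHS_LNM_NEGATIVE, LNM_NEGATIVE))
```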

Table 1 Flowchart of patient selection
Table 2 Clinicopathological characteristics of the selected patients

Risk factors of lymph node metastasis

Unadjusted and adjusted multivariate logistic regression analyses were used to determine the risk factors for LNM. Mucinous carcinoma, tumor grade, age, and primary tumor location were consistently confirmed as significant predictive factors for LNM (Table 3). Patients with mucinous carcinoma had a significantly higher risk of LNM. Compared with patients who had well-differentiated colon cancer, those with moderately differentiated, poorly differentiated, or undifferentiated carcinoma were at higher risk of LNM. In terms of age, a lower LNM risk was detected in older patients (age 65–79 years and age over 80 years). Of note, compared with carcinoma located in the cecum, carcinoma located in the ascending colon was significantly associated with lower LNM risk, whereas carcinoma located in the sigmoid colon was significantly associated with higher LNM risk.

Table 3 Logistic regression analysis of the risk factors for lymph node metastasis in T1 colon cancer

Lymph node metastasis and patient survival

We further evaluated the association between LNM and patient survival. Both unadjusted and adjusted multivariate Cox regression models consistently showed that tumor size, CEA level, age, and marital status were significant prognostic factors for OS in patients with T1 colon cancer (Table 4). Similarly, lymph node status, tumor size, CEA level, tumor grade, year at diagnosis, age, and marital status had significant prognostic value for CSS (Table 5). Notably, positive lymph node involvement was significantly associated with CSS [hazard ratio (HR) = 3.02 (2.34–3.89), P < 0.001 in adjusted analysis] but not with OS [HR = 1.11 (0.95–1.29), P = 0.21 in unadjusted analysis]. To further investigate the prognostic significance of LNM, patients were categorized into two groups according to their lymph node status. Kaplan–Meier curves showed no statistically significant difference in OS between the two groups (P = 0.21) (Fig. 1A), whereas the CSS rate was significantly lower in the lymph node positive group than in the lymph node negative group (P < 0.0001) (Fig. 1B).
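For readers unfamiliar with the Kaplan–Meier method, the sketch below implements the product-limit estimator on a small invented data set. It is a minimal illustration, not the code used for Fig. 1; the function name and the toy follow-up times are hypothetical, and the log-rank test is not shown.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each patient
    events: 1 if the endpoint occurred at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            # Patients censored at t are still counted as at risk at t.
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


if __name__ == "__main__":
    # Invented follow-up times in months; 1 = death, 0 = censored.
    toy_times = [5, 8, 8, 12, 15]
    toy_events = [1, 0, 1, 1, 0]
    for t, s in kaplan_meier(toy_times, toy_events):
        print(f"t = {t:>2} months, S(t) = {s:.2f}")
```

Running the same function with the OS indicator and then with the CSS indicator for each lymph node group would yield curves of the kind compared in Fig. 1A and Fig. 1B.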

Table 4 Cox regression analysis of prognostic factors for overall survival in T1 colon cancer
Table 5 Cox regression analysis of prognostic factors for cancer-specific survival in T1 colon cancer
Fig. 1 Effect of lymph node metastasis on overall survival (A) and cancer-specific survival (B) in T1 colon cancer

Competing risk analysis

The prognostic outcomes of cancer patients are influenced by both oncological factors and non-oncological factors. Therefore, cancer patients might die from other causes before cancer-specific death occurs [14].

For accurate determination of the prognostic role of LNM in T1 colon cancer, a competing risks model was used, which directly links the effects of risk factors with cause-specific cumulative incidence of death [15]. As a result, LNM [subdistribution hazard ratio (SHR) = 2.96, P < 0.001], tumor size > 3.0 cm (SHR = 1.50, P = 0.026), negative CEA level (SHR = 0.45, P < 0.001), poorly differentiated (SHR = 1.60, P < 0.031) or undifferentiated (SHR = 2.91, P = 0.022) carcinoma, diagnosis during 2010–2012 (SHR = 0.60, P = 0.001), older age (SHR = 1.61, P = 0.048 for age 65–79 years; SHR = 3.01, P < 0.001 for age over 80 years), white ethnicity (SHR = 0.57, P < 0.001), and single/widowed marital status were all significant prognostic factors for T1 colon cancer (Table 6). In addition, the CIF was used to evaluate the probability of cancer-specific mortality and death from other causes [16]. As shown in Fig. 2, the cancer-specific death rate was significantly higher in patients with LNM (shown as a red curve) than in patients without LNM (shown as a black curve).
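The CIF can be estimated nonparametrically, as sketched below on invented data. This shows only the descriptive estimator (the Aalen–Johansen form), in which each cause-specific increment is weighted by the all-cause survival just before the event time; the regression model that yields the SHRs reported above is not reproduced here. The function name, cause codes, and toy data are hypothetical.

```python
def cumulative_incidence(times, causes, cause_of_interest):
    """Nonparametric cumulative incidence for one cause of death.

    times:  follow-up time for each patient
    causes: 0 = censored, 1 = cancer-specific death, 2 = death from other causes
    Returns a list of (time, CIF) at each time the cause of interest occurs.
    """
    data = sorted(zip(times, causes))
    n_at_risk = len(data)
    overall_survival = 1.0  # all-cause survival just before the current time
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_interest = n_any = n_leaving = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            cause = data[i][1]
            n_any += cause != 0
            n_interest += cause == cause_of_interest
            n_leaving += 1
            i += 1
        if n_any:
            cif += overall_survival * n_interest / n_at_risk
            overall_survival *= 1 - n_any / n_at_risk
            if n_interest:
                curve.append((t, cif))
        n_at_risk -= n_leaving
    return curve


if __name__ == "__main__":
    # Invented data: follow-up in months and cause codes as defined above.
    toy_times = [2, 4, 4, 6, 9, 10]
    toy_causes = [1, 2, 0, 1, 0, 2]
    for cause in (1, 2):
        print(cause, cumulative_incidence(toy_times, toy_causes, cause))
```

Unlike one minus a Kaplan–Meier estimate that censors other-cause deaths, the CIFs for all causes sum to one minus the all-cause survival, so they do not overstate the probability of cancer-specific death when deaths from other causes are frequent.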

Table 6 Competing risks analysis for cancer-specific death
Fig. 2 Cumulative incidence function for cancer-specific death. Black curve indicates cancer-specific death without lymph node metastasis; red curve indicates cancer-specific death with lymph node metastasis in T1 colon cancer (Color figure online)

Discussion

With great advances in endoscopic techniques, endoscopic resection has become advantageous for low-risk submucosal colon cancer: it markedly decreases postoperative morbidity, improves quality of life, and yields long-term clinical outcomes comparable to those of radical surgical resection. However, the indications for endoscopic resection in T1 colon cancer should be applied cautiously. In a retrospective study including 428 patients with T1 colorectal cancer [17], the authors indicated that the conventional indications for endoscopic treatment should not be expanded, mainly owing to the risk of LNM. Therefore, accurate identification of the predictors of LNM risk is crucial to distinguishing patients at low risk of LNM, who can be treated using endoscopic resection with oncological outcomes comparable to those of radical resection.

In this population-based study, we investigated the predictors of LNM in T1 colon cancer. Mucinous carcinoma, tumor grade, age, and primary tumor location were significant predictors of LNM. Mucinous carcinoma is a relatively rare pathological type of colorectal cancer, accounting for approximately 10–15% of all colorectal cancer cases [18]. As a distinct subtype, mucinous carcinoma has been reported to be associated with higher risks of lymph node involvement in stage I and II colorectal cancer [19, 20]. Our population-based analysis consistently revealed that patients with mucinous carcinoma of the colon had a higher risk of LNM. Not surprisingly, tumor grade was significantly predictive of lymph node involvement. Of note, poorly differentiated carcinoma was associated with a more than fivefold increase in LNM risk, in comparison with well-differentiated carcinoma, in all three logistic regression models. Consistent with previous findings in T1 rectal cancer [21], in the present study we identified older age as a significant negative predictor of LNM. Compared with patients aged up to 49 years, the relative risk of LNM in patients aged 65–79 years and over 80 years dropped to approximately 0.65 and 0.44, respectively (both P < 0.001). It has been reported that lymph node yield declines with age in patients with colorectal cancer, with mean lymph node yield reduced by 1 for every 7-year increase in age overall [22].

Primary tumor location has long been reported to have an impact on the risk of LNM in colorectal cancer [4, 23]. The LNM risk in T1 rectal carcinoma has been reported to be as high as 15% [4, 5, 24], dropping to 8% in the left colon and 3% in the right colon [4]. Here, we report similar observations, which suggest that carcinoma of the ascending colon is a significant negative predictor of LNM risk, whereas sigmoid colon cancer significantly increases the LNM risk. The differing LNM risks according to primary tumor location might be attributable to intrinsic genetic differences [4, 25]. Unlike other studies concerning rectal cancer [21], we found that tumor size was not a predictive factor for the risk of LNM in T1 colon cancer. Consistent with our findings, Okabe et al. also demonstrated no significant association between tumor size and LNM risk in T1 adenocarcinoma of the colon and rectum [4]. Therefore, it remains controversial whether primary tumor size is a predictive factor for the risk of LNM in T1 colorectal cancer, a question that deserves further investigation.

During the patient selection process, patients without an adequate number of resected lymph nodes were excluded. The cutoff value for the number of sampled lymph nodes was set to 12, according to the general consensus that at least 12 lymph nodes are required for accurate pathological judgement [26]. In this population-based analysis, LNM was detected in 12.0% (967 out of 8056) of patients with T1 colon cancer, which was slightly higher than the proportion in other studies [4, 27]. It is plausible that the lymph node positive rate increases with the number of sampled lymph nodes. Because only patients with at least 12 resected lymph nodes were enrolled, the LNM rate in our study might be slightly higher.

In survival analysis, LNM was a significant prognostic factor for CSS but not for OS. Patients with T1 colon cancer generally have a good prognosis. In this study, the cancer-specific and noncancer-specific death rates were 3.26% and 14.02%, respectively, for patients without LNM (Table 2), so most deaths in this group were unrelated to cancer. In patients with LNM, however, the two rates were comparable (9.41% for cancer-specific death and 9.31% for noncancer-specific death). These observations indicate the importance of lymph node status in determining oncological outcome in T1 colon cancer.

Owing to relatively long survival in patients with T1 colon cancer, long-term patient survival is influenced by other noncancer risks. That is to say, a considerable proportion of patients might die from causes other than cancer-related causes [15, 28, 29]. Therefore, to accurately illustrate the prognostic role of lymph node status in T1 colon cancer, we constructed a competing risks model and estimated the CIF. LNM was revealed as a definite risk factor for prognosis in patients with T1 colon cancer.

In the present population-based analysis, our conclusions are based on real-world outcomes. With a median follow-up of 68 months among 8056 eligible participants, our findings are supported by a high degree of statistical power. Nevertheless, certain limitations must be acknowledged. The limited availability of data from the SEER database is the main drawback. Factors including submucosal invasion depth, tumor budding, and lymphovascular invasion, which were not assessed in our study, might also affect the likelihood of LNM. In terms of primary tumor location, ascending colon and sigmoid colon carcinomas were significant predictors of lymph node involvement; however, we failed to reveal any association of the hepatic flexure, transverse colon, splenic flexure, and descending colon with the risk of LNM. The relatively small number of tumors at these locations might be the cause.

In conclusion, the overall LNM rate is approximately 12.0% for T1 colon cancer. Mucinous carcinoma, tumor grade, age, and primary tumor location are significant predictors for LNM in patients with T1 colon cancer. Moreover, positive lymph node involvement is a significant prognostic factor for CSS. Thus, careful preoperative assessment of lymph node status is essential in clinical decision making, to achieve better long-term outcomes.