Background

Colorectal cancer ranks as the second most common malignancy in women and the third in men worldwide. Annual global incidence is approximately 1.4 million cases, with nearly 700,000 deaths [1, 2]. In the United States, more than 130,000 new cases and more than 50,000 deaths are reported [2]. In the European Union, 215,000 cases have been reported, and colorectal cancer is listed as the second most common cause of cancer death [3]. In China, colorectal cancer is one of the five most common malignancies in both men and women [4].

The genomic characterization of colorectal cancer has been well elucidated, and the role of immunology is increasingly valued [5,6,7]. Therapeutically, surgical intervention and chemotherapy-based strategies are widely accepted for colorectal cancer [8, 9]. Notably, the impact of colorectal cancer surgery on long-term survival in elderly patients is similar to that in younger patients [10].

In general, elderly colorectal cancer patients (ECRC), defined here as those aged 70 years or older, may be expected to have higher mortality as age increases. However, no study has fully quantified the association between age and prognostic risk in ECRC [11, 12]. The tumor-node-metastasis (TNM) staging system of the American Joint Committee on Cancer (AJCC) is widely used in the therapeutic and prognostic management of colorectal cancer. Given that the prognostic value of additional variables, including tumor size and marital status, has been increasingly recognized [13, 14], a more comprehensive prognostic predictor is needed for ECRC.

Notably, knowledge regarding clinical prediction in ECRC is limited, and very few studies have focused on nomograms. In this study, an ECRC-specific nomogram was established for prognostic prediction based on a large sample retrieved from the Surveillance, Epidemiology, and End Results (SEER) database, in the hope of providing further prognostic insights [15].

Methods

Recruitment of patients from SEER database

The clinical variables of patients diagnosed with ECRC between 2004 and 2016 were retrieved from the SEER database, a program established by the National Cancer Institute for comprehensive national-level clinical investigation [16, 17]. The reference number was 16,595-Nov2018. The inclusion criteria were: 1) colon and rectum as the primary site (site recode, International Classification of Diseases for Oncology (ICD-O-3)/WHO 2009); 2) age ≥ 70; 3) complete information on TNM stage; 4) only one primary tumor; 5) surgery performed. All included cases were then randomly divided into training and validation sets of equal size. In addition, X-tile software was used to determine and visualize the best cutoff points for the age and tumor size variables [18].
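The random division into two equally sized sets can be illustrated with a short Python sketch. This is not the authors' code (the analysis was performed in R); the function name `split_equal` and the seed value are assumptions made for the example, and the case count is the one reported in the Results.

```python
import random


def split_equal(case_ids, seed=2018):
    """Randomly split case identifiers into two near-equal halves."""
    ids = list(case_ids)
    rng = random.Random(seed)  # fixed seed is an assumption, used only for reproducibility
    rng.shuffle(ids)
    half = (len(ids) + 1) // 2  # the training set receives the extra case when n is odd
    return ids[:half], ids[half:]


# 44,761 is the total number of included cases reported in the Results.
training, validation = split_equal(range(44761))
print(len(training), len(validation))  # prints: 22381 22380
```

With an odd total, the two sets necessarily differ by one case, which matches the 22,381 and 22,380 cases reported for the training and validation sets.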

Clinical variables extracted for analysis

Age, sex, marital status, tumor site, histological grade, SEER stage, AJCC TNM stage, distant metastasis (bone, brain, liver and lung) and tumor size were selected for nomogram construction. Overall survival (OS) and cancer-specific survival (CSS) were chosen as the primary and secondary endpoints, respectively.

Construction and validation of the nomogram

The chi-square test was used to compare all included categorical variables between the training and validation groups. Univariate and multivariate analyses were then used to identify significant variables, which were used to construct the nomogram models in R software 3.3.0 (R Foundation for Statistical Computing, Vienna, Austria, www.r-project.org). The validation group was then used to assess the newly established nomograms. Agreement between nomogram predictions and observed outcomes was assessed by the concordance index (C-index). Calibration plots were used to visually compare the prognosis predicted by the nomogram with the actual outcome. Sensitivity and specificity were evaluated by receiver operating characteristic (ROC) curves and the area under the curve (AUC). Furthermore, the performance of the nomogram models was compared with that of the TNM stage and SEER stage using both ROC and decision curve analysis (DCA). All analyses were performed in R software 3.3.0, with a p value < 0.05 considered statistically significant.
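To make the C-index concrete, the following Python sketch computes Harrell's concordance index for right-censored survival data. It is a minimal illustration and not the authors' implementation, which used R; the function name `harrell_c_index` and the toy data are assumptions. Pairs with tied follow-up times are simply skipped, which is a simplification.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data.

    times: follow-up times; events: 1 if death was observed, 0 if censored;
    risk_scores: higher score means higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is only usable if the earlier subject had an observed event
        for j in range(n):
            if times[i] < times[j]:  # subject i died before subject j's last follow-up
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # model ranked the earlier death as higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical, not from SEER): 5 patients, 2 of them censored.
times = [5, 10, 12, 20, 30]
events = [1, 1, 0, 1, 0]
risk = [0.9, 0.4, 0.6, 0.3, 0.1]
print(harrell_c_index(times, events, risk))  # prints: 0.875
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The double loop is quadratic in the number of patients, so a cohort of the size used here would call for an optimized library routine rather than this sketch.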

Results

Characterization of included cases

Following the inclusion criteria, a total of 44,761 cases were included in this study, with 22,381 randomly assigned to the training set and 22,380 to the validation set (Fig. 1). Among all patients, 44.6% were male and 55.4% female; 47.6% were unmarried and 46.8% married; 81.9% had colon cancer and 18.1% rectal cancer; 0.3% had bone metastasis, 0.1% brain metastasis, 7.0% liver metastasis and 1.8% lung metastasis. The cutoff points for age and tumor size were determined by X-tile (Fig. 2). Specifically, 40.9% of patients were ≤ 76 years old, 44.5% were between 77 and 86 years old, and 14.7% were ≥ 87 years old; 29.8% of tumors were ≤ 3.4 cm, 36.3% were 3.5–5.9 cm and 25.4% were ≥ 6 cm (Table 1). No significant difference was identified between the training and validation cohorts for any included variable.
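The X-tile cutoffs reported above can be applied as a simple categorization step. The Python sketch below shows one way to do this; the function names are hypothetical, and it assumes that tumor size is recorded in whole millimetres and that a missing size is kept as its own category, neither of which is stated in the text.

```python
def age_group(age_years):
    """Map age in years to the three age groups defined by the X-tile cutoffs."""
    if age_years < 70:
        raise ValueError("the cohort only includes patients aged 70 or older")
    if age_years <= 76:
        return "<=76"
    if age_years <= 86:
        return "77-86"
    return ">=87"


def tumor_size_group(size_mm):
    """Map tumor size in whole millimetres (assumed unit) to the three size groups."""
    if size_mm is None:
        return "unknown"  # assumption: missing sizes form a separate category
    if size_mm <= 34:
        return "<=3.4 cm"
    if size_mm <= 59:
        return "3.5-5.9 cm"
    return ">=6 cm"


print(age_group(82), tumor_size_group(47))  # prints: 77-86 3.5-5.9 cm
```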

Fig. 1

The inclusion criteria flowchart of recruited patients in SEER database

Fig. 2

X-tile analysis of the best cutoff points for the age and tumor size variables. a X-tile plot of the training set for age; b the cutoff points highlighted on a histogram of the entire cohort; c Kaplan-Meier plot of the distinct prognoses determined by the cutoff points (low subset = blue, middle subset = gray, high subset = magenta); d X-tile plot of the training set for tumor size; e the cutoff points highlighted on a histogram; f Kaplan-Meier plot of the prognoses determined by the cutoff points (low subset = blue, middle subset = gray, high subset = magenta)

Table 1 Baseline demographic and clinical characteristics of elderly patients with CRC

Establishment of the nomogram

Sex, age, marital status, tumor size, grade, SEER stage, AJCC TNM stage, bone metastasis, brain metastasis, liver metastasis and lung metastasis all showed statistically significant differences in the univariate OS analysis (Table 2). Sex, age, marital status, grade, AJCC TNM stage, bone metastasis, brain metastasis, liver metastasis, lung metastasis and tumor size remained significant in the multivariate OS analysis (Table 2). For CSS, age, marital status, tumor site, grade, SEER stage, AJCC TNM stage, bone metastasis, brain metastasis, liver metastasis, lung metastasis and tumor size were significant in the univariate analysis, and all of these variables remained significantly associated with CSS in the multivariate analysis (Table 3). Accordingly, nomogram models for 1-, 3- and 5-year OS and CSS were established (Fig. 3a, b).
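For readers unfamiliar with how a nomogram turns a multivariate model into a point scale, the Python sketch below shows the usual scaling idea: the variable with the widest coefficient range spans 0 to 100 points, and every other level is scaled in proportion. The coefficients are hypothetical illustrative values and are not taken from Tables 2 or 3; the function names are likewise assumptions, and the authors built their nomograms in R.

```python
# Hypothetical log hazard ratios relative to each variable's reference level.
# These numbers are illustrative only and are NOT the values reported in the study.
COEFS = {
    "age": {"<=76": 0.0, "77-86": 0.4, ">=87": 0.8},
    "marital_status": {"married": 0.0, "unmarried": 0.2},
    "liver_metastasis": {"no": 0.0, "yes": 1.0},
}


def build_point_table(coefs):
    """Scale coefficients so the widest-ranging variable spans 0 to 100 points."""
    widest = max(max(lv.values()) - min(lv.values()) for lv in coefs.values())
    table = {}
    for var, levels in coefs.items():
        lowest = min(levels.values())
        table[var] = {lvl: 100.0 * (b - lowest) / widest for lvl, b in levels.items()}
    return table


def total_points(table, patient):
    """Sum the points of the patient's level for every variable in the table."""
    return sum(table[var][patient[var]] for var in table)


table = build_point_table(COEFS)
patient = {"age": ">=87", "marital_status": "unmarried", "liver_metastasis": "no"}
print(round(total_points(table, patient), 1))
```

Converting total points into 1-, 3- and 5-year survival probabilities additionally requires the baseline survival estimated from the fitted model, which is omitted from this sketch.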

Table 2 Univariate and multivariate analysis of overall survival in the training cohort
Table 3 Univariate and multivariate analysis of cancer-specific survival in the training cohort
Fig. 3

Establishment of overall survival (OS) and cancer-specific survival (CSS) nomograms. a Construction of OS nomogram; b construction of CSS nomogram

Nomogram validation

The nomograms were assessed both internally and externally using the C-index and calibration plots. The C-index of the OS nomogram was 0.726 (95% confidence interval (95%CI): 0.720–0.732) in the training set and 0.722 (95%CI: 0.716–0.728) in the validation set (Table 4). The C-index of the CSS nomogram was 0.791 (95%CI: 0.785–0.797) in the training set and 0.789 (95%CI: 0.783–0.795) in the validation set (Table 4). Calibration plots indicated good agreement between the outcomes predicted by the OS and CSS nomograms and the observed outcomes (Figs. 4, 5). To further compare the nomograms with classic staging methods, namely the AJCC TNM stage and the SEER stage, DCA and ROC analyses were performed for both OS and CSS. In DCA, the OS and CSS nomograms both outperformed the AJCC TNM stage and the SEER stage (Fig. 6). In ROC analysis, the OS and CSS nomograms also showed higher AUC values than the AJCC TNM stage and the SEER stage (Figs. 7, 8, Table 5).
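The quantity plotted in a decision curve is the net benefit at each threshold probability. The Python sketch below computes it for a binary outcome at a fixed time horizon; it ignores censoring, which is a simplifying assumption, and it is not the authors' R code. The function names and toy data are hypothetical.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on patients whose predicted risk is >= threshold.

    y_true: 1 if the event occurred by the horizon, else 0 (censoring ignored);
    y_prob: predicted event probabilities; threshold: probability in (0, 1).
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data (hypothetical): two events, two non-events.
y = [1, 1, 0, 0]
p = [0.9, 0.6, 0.4, 0.1]
print(net_benefit(y, p, 0.2), net_benefit_treat_all(y, 0.2))  # prints: 0.4375 0.375
```

A model is considered more useful than a comparator over the range of thresholds where its net benefit curve lies above the comparator's curve and above the treat-all and treat-none (net benefit 0) references.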

Table 4 C-indexes for the nomograms and other stage systems in patients with CRC
Fig. 4

Calibration plots of OS nomogram model. a 1-year calibration plot of OS using training set; b 3-year calibration plot of OS using training set; c 5-year calibration plot of OS using training set; d 1-year calibration plot of OS using validation set; e 3-year calibration plot of OS using validation set; f 5-year calibration plot of OS using validation set

Fig. 5

Calibration plots of CSS nomogram model. a 1-year calibration plot of CSS using training set; b 3-year calibration plot of CSS using training set; c 5-year calibration plot of CSS using training set; d 1-year calibration plot of CSS using validation set; e 3-year calibration plot of CSS using validation set; f 5-year calibration plot of CSS using validation set

Fig. 6

Decision curve analysis (DCA) of OS and CSS nomograms. a DCA of OS nomogram using training set; b DCA of OS nomogram using validation set; c DCA of CSS nomogram using training set; d DCA of CSS nomogram using validation set

Fig. 7

Receiver operating characteristic (ROC) curve comparison of OS nomogram, AJCC TNM stage and SEER stage. a 1-year ROC of OS nomogram using training set; b 3-year ROC of OS nomogram using training set; c 5-year ROC of OS nomogram using training set; d 1-year ROC of OS nomogram using validation set; e 3-year ROC of OS nomogram using validation set; f 5-year ROC of OS nomogram using validation set

Fig. 8

ROC comparison of CSS nomogram, AJCC TNM stage and SEER stage. a 1-year ROC of CSS nomogram using training set; b 3-year ROC of CSS nomogram using training set; c 5-year ROC of CSS nomogram using training set; d 1-year ROC of CSS nomogram using validation set; e 3-year ROC of CSS nomogram using validation set; f 5-year ROC of CSS nomogram using validation set

Table 5 The area under the curve (AUC) of the nomograms compared with the AJCC TNM stage and the Surveillance, Epidemiology, and End Results (SEER) stage

Discussion

To date, numerous studies have investigated prognostic nomograms for colorectal cancer patients using the SEER database for various patient groups [19, 20]. However, an increasing number of studies have focused on therapeutics or modified classification, and very few have highlighted the role of age in the prognostic assessment of colorectal cancer. Our previous study reported that a nomogram for early-onset colorectal cancer patients displayed a higher C-index and better performance than conventional variables [21]. ECRC, on the other hand, has been explored in only a limited number of studies. Li et al. reported, in 18,937 included cases, that adjuvant chemotherapy did not offer additional survival benefit to elderly patients with stage II or III disease [22]. Nonetheless, a general prognostic nomogram for ECRC has yet to be fully characterized. In this study, the nomograms displayed higher C-index values and convincing calibration plots for OS and CSS prediction using the SEER database. Moreover, they performed better than the AJCC TNM and SEER stages in both AUC and DCA assessments.

Of note, for OS, 12 of 15 variables (sex, age, marital status, grade, AJCC TNM, bone metastasis, brain metastasis, liver metastasis, lung metastasis and tumor size) were selected for nomogram construction. A similar pattern was observed for the CSS nomogram. It is therefore quite possible that the prognosis of ECRC is associated with more variables than that of colorectal cancer in general. Moreover, four types of distant metastasis were, for the first time, incorporated into a nomogram for ECRC in a SEER analysis.

In addition, the X-tile tool was used to determine the best cutoff values for age and tumor size in this study. X-tile was established as a powerful graphical method for illustrating potential subsets (cutoffs) by constructing a two-dimensional projection [18]. It has been widely used in numerous investigations, including studies of esophageal squamous cell carcinoma, bladder cancer and chondrosarcoma [23,24,25]. In this study, for the first time, subsets of the continuous variables age and tumor size were determined by X-tile. The role of tumor size has been intensively studied [26]; however, the cutoff points for tumor size in colorectal cancer remain largely arbitrary. Using X-tile to classify tumor size could therefore be both reliable and reproducible.

In general, elderly patients may be expected to have higher mortality as age increases. However, no study has fully quantified the association between age and prognostic risk, particularly in patients older than 70 years. In our study, age carried more weight in the OS nomogram than in the CSS nomogram, with age ≥ 87 representing nearly 90 points for OS but fewer than 60 points for CSS. Interestingly, female sex was identified as a protective factor in the OS nomogram but not in the CSS nomogram. Moreover, being married was identified as a protective factor in both the OS and CSS nomograms. Comparing the OS and CSS nomograms thus provides useful clues for further external clinical investigation.

Conclusion

This study established nomograms for elderly colorectal cancer patients that offer distinct clinical value compared with the AJCC TNM and SEER stages for both OS and CSS.