Introduction

Since the introduction of the Danish national colorectal cancer (CRC) screening program in 2014, there has been a threefold rise in the incidence of stage I CRC, and according to data from the Danish Colorectal Cancer Group (DCCG.dk), 19.8% of newly diagnosed colon cancers in 2017 were pT1 cancers [1]. Unfortunately, the optimal management of early (pT1) CRC is still unclear. In many cases, major bowel resection with regional lymphadenectomy is performed, but this may be associated with a significant risk of post-operative mortality and morbidity, especially in elderly or frail patients. Endoscopic local excision of pT1 CRC is a less invasive, organ-preserving procedure that may be an especially attractive option for patients with significant comorbidity and frailty. The risk of disease recurrence after resection of pT1 cancer ranges between 2 and 10% [2,3,4], depending heavily on several histopathological risk factors. Seven to 20% of all pT1 CRC patients will have lymph node metastasis (LNM) at the time of diagnosis [5, 6], and the oncological outcomes of local excision are comparable to those of major bowel resection with regional lymphadenectomy only when LNM are absent [7]. Moreover, approximately 1.8–3% of patients with T1 cancer will develop distant metastasis [2, 8]. Today, when local excision is performed for pT1 CRC, pathologists play a crucial role in stratifying patient risk by examining histopathological risk factors. Over the years, several risk factors for lymph node metastasis have been reported [9, 10]. Many of these risk factors correlate not only with the risk of LNM but also with the risk of distant metastasis and thereby with the overall risk of disease recurrence [3, 8]. However, this single parameter-based risk assessment overestimates the risk of LNM, as 80–90% of the patients selected for additional surgery will have no evidence of LNM or residual disease [11, 12]. Several risk scores and prediction models have already been proposed [3, 13,14,15].
Nevertheless, clinical multicentre studies are still needed to fully confirm the predictive value of histopathological risk factors, not only their role in predicting LNM, but also their impact on overall disease recurrence, despite the pathophysiological variations in recurrence mechanisms. The current study aimed to develop a prediction model based on histopathological data for the probability of disease recurrence and residual tumour in a large, nationwide cohort of Danish patients with pT1 CRC.

Materials and methods

Study design

This is a nationwide retrospective cohort study of patients diagnosed with pT1 CRC between January 2001 and December 2011. The Data Protection Agency in Denmark and the Medical Ethics Committee of the Capital Region of Denmark approved the study (Approval ID: 2013–41–2475 and H-15001716). The study was performed in accordance with the Declaration of Helsinki.

Study population

Patients over the age of 17 who underwent endoscopic resection (ER) of pT1 CRC with or without subsequent bowel resection (SBR) between January 2001 and December 2011 were retrospectively evaluated. Only patients without previous surgery for colorectal cancer who underwent complete endoscopic resection of pT1 CRC were included in the study. Patients with Lynch syndrome, familial adenomatous polyposis, active inflammatory bowel disease, multiple malignant lesions or synchronous tumours were excluded. Patients were also excluded if histological blocks or endoscopy reports were missing, if the histological re-evaluation revealed non-invasive lesions or if the patients had received neoadjuvant radiotherapy. ER included endoscopic mucosal resection (EMR), endoscopic submucosal dissection (ESD) and snare polypectomy. pT1 CRC was defined as adenocarcinoma invading through the muscularis mucosae into the submucosa, but not involving the muscularis propria [16]. During the study period, guidelines from the Danish Colorectal Cancer Group (DCCG.dk) recommended subsequent surgery if at least one of the following risk factors was present: positive resection margin (< 1 mm), poorly differentiated adenocarcinoma or lymphovascular invasion [17]. SBR was performed as an open or laparoscopic procedure.
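The guideline rule described above amounts to a logical "at least one of three" check. The following Python sketch is illustrative only: the class name, field names and function name are hypothetical, and the margin threshold simply follows the "< 1 mm" wording of the text.

```python
from dataclasses import dataclass


@dataclass
class RiskFactors:
    """Findings for one locally excised pT1 CRC (hypothetical field names)."""
    margin_mm: float               # distance from invasive tumour to the resection margin
    poorly_differentiated: bool    # poorly differentiated adenocarcinoma
    lymphovascular_invasion: bool


def surgery_recommended(rf: RiskFactors) -> bool:
    """Return True if at least one of the three guideline risk factors is present."""
    positive_margin = rf.margin_mm < 1.0  # "positive resection margin (< 1 mm)"
    return positive_margin or rf.poorly_differentiated or rf.lymphovascular_invasion
```

A lesion with a 2 mm margin and no other risk factor would not trigger the recommendation, whereas a 0.5 mm margin alone would.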

Data source

The patients were identified from the Danish Colorectal Cancer Group (DCCG.dk) database. These data were supplemented with data from the Danish National Patient Register (NPR) and the Danish National Pathology Register and Data Bank (DNPR) [18,19,20]. All data were cross-checked by manual review of medical, endoscopy and pathology reports and radiology charts, and additional information on patient and tumour characteristics was collected. All available paraffin blocks and haematoxylin and eosin (HE)-stained sections from primary confirmed cases of pT1 CRC were retrieved from pathology departments nationwide. Patients were followed up until December 2016 or until death.

Pathological evaluation

HE staining was used as the standard for the histopathological re-evaluation. In all available cases, the original HE slides were re-evaluated to confirm or reject the diagnosis of pT1 CRC. Where the original HE slides were missing, new sections were cut from all available blocks and stained with HE. From each case, one or two blocks were selected for inclusion in the study. On the included material, both control HE staining and immunohistochemical staining were performed: cytokeratin (CKAE1/AE3), D2-40, caldesmon, pMLH1, pMSH2, pMSH6 and pPMS2. All original HE slides and new HE- and immunohistochemically stained slides from the included cases were re-evaluated by an experienced pathologist subspecialised in gastrointestinal pathology (TPK). She was blinded to the original pathology reports and clinical characteristics, except for the endoscopic polyp type (pedunculated or sessile) and whether the polyp had been removed completely in one piece or by piecemeal technique. The following data were recorded at the re-evaluation of each case:
Tumour type: defined according to WHO 2019 [16].
Presence of a mucinous tumour component.
Invasive tumour size: measured as the largest diameter at the invasive front in mm.
Tumour level: Haggitt level 1–4 for pedunculated polyps, Kikuchi level Sm1–3 for sessile polyps [21, 22].
Tumour grade: low grade or high grade, based on the worst area of differentiation.
Distance from invasive tumour to the resection margin, measured in mm: 0 mm (involved margin), ≤ 1 mm, > 1 mm.
Perineural invasion, intramural lymphatic invasion (HE and D2-40 staining) and intramural venous invasion (HE and caldesmon staining).
Tumour budding: Bd1 (0–4 buds), Bd2 (5–9 buds) or Bd3 (≥ 10 buds). Tumour budding was defined as “a single cancer cell or a cell cluster of up to four tumour cells” and counted according to the recommendations of the International Tumour Budding Consensus Conference 2016 (scored on HE, if necessary guided by CK staining) [23].
Mismatch repair protein (MMR) status: pMLH1, pMSH2, pMSH6 and pPMS2.
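The budding and margin categories recorded at re-evaluation map directly onto simple cut-offs. The sketch below restates them in Python; the function names are hypothetical, and `bud_count` is assumed to be the count already obtained by the consensus counting procedure cited above.

```python
def budding_grade(bud_count: int) -> str:
    """Map a tumour bud count to the Bd1/Bd2/Bd3 categories used in the text."""
    if bud_count < 0:
        raise ValueError("bud_count must be non-negative")
    if bud_count <= 4:
        return "Bd1"   # 0-4 buds
    if bud_count <= 9:
        return "Bd2"   # 5-9 buds
    return "Bd3"       # 10 or more buds


def margin_category(distance_mm: float) -> str:
    """Map the tumour-to-margin distance (mm) to the three recorded categories."""
    if distance_mm < 0:
        raise ValueError("distance_mm must be non-negative")
    if distance_mm == 0:
        return "0 mm (involved margin)"
    if distance_mm <= 1:
        return "<= 1 mm"
    return "> 1 mm"
```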

Outcomes

The primary outcome was to develop a prediction model for disease recurrence in patients with pT1 CRC. Patients who underwent complete ER without SBR and developed locoregional and/or distant CRC recurrence during a 5-year follow-up period and patients who underwent complete ER followed by SBR with ≥ 1 positive lymph node in the resection specimen or developed distant CRC recurrence during a 5-year follow-up period were defined as disease recurrence-positive cases. Locoregional recurrence was defined as any recurrent tumour growth or recurrences in lymph nodes near the primary resection site. Distant recurrence was defined as any histological, morphological and clinical evidence of metastasis in distant organs, bones or peritoneum.
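Because the definition of a disease recurrence-positive case differs between the two treatment groups, it may help to see it written as a single predicate. This is a minimal sketch with hypothetical field names; it encodes only the criteria stated in the paragraph above.

```python
from dataclasses import dataclass


@dataclass
class FollowUp:
    """Five-year follow-up summary for one patient (hypothetical field names)."""
    had_sbr: bool                     # ER followed by subsequent bowel resection
    positive_lymph_nodes: int         # nodes with metastasis in the SBR specimen (0 if no SBR)
    locoregional_recurrence_5y: bool  # locoregional recurrence within 5 years
    distant_recurrence_5y: bool       # distant recurrence within 5 years


def is_recurrence_positive(p: FollowUp) -> bool:
    """Apply the group-specific outcome definition."""
    if p.had_sbr:
        # ER + SBR: at least one positive lymph node or distant recurrence.
        return p.positive_lymph_nodes >= 1 or p.distant_recurrence_5y
    # ER only: locoregional and/or distant recurrence.
    return p.locoregional_recurrence_5y or p.distant_recurrence_5y
```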

The secondary outcome was to develop a prediction model for residual disease in patients with pT1 CRC after primary endoscopic resection. Residual disease was defined as histologically verified tumour tissue in the mucosa and bowel wall at the primary resection site following SBR.

Candidate variables for predicting disease recurrence

Based on previous literature and current guidelines, a set of candidate variables for predicting disease recurrence were selected. These included tumour grade, polyp shape, polyp size, distance to the resection margin, high risk (Haggitt level 3–4 or Kikuchi Sm3), intramural venous invasion, lymphatic invasion and the tumour budding score (Bd1–3).

Statistical analysis

Categorical variables were summarised as counts and percentages; medians (interquartile ranges; IQR) were used for continuous variables. Missing data were handled by multiple imputation with the fully conditional specification (FCS) method, generating 20 imputed data sets with the SAS procedure PROC MI [24]. Univariate and multivariate analyses of disease recurrence and residual disease were performed using logistic regression and reported as odds ratios (OR) with 95% confidence intervals (CI). Backward selection with a liberal significance level of 0.157 was used to select the prediction model. Because multiple imputation was used, the selection procedure was run in all imputed data sets, and variables selected in at least 10 of the 20 analyses were included. Model performance was assessed in terms of calibration and discrimination. Calibration was assessed using the Hosmer–Lemeshow (HL) goodness-of-fit test and the scaled Brier score. ROC curves and the corresponding area under the ROC curve (AUC) were calculated to assess discrimination [25]. Statistical analyses were conducted using SAS version 9.4. All reporting was conducted in accordance with the STROBE statement.
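Three elements of this analysis can be illustrated with a short Python sketch. It is not the SAS code used in the study, and all function names are hypothetical. First, the text does not state why 0.157 was chosen, but this value coincides with the p value of a chi-square statistic of 2 on one degree of freedom, which is the threshold implied by Akaike's information criterion for a single parameter. Second, the "selected in at least 10 data sets" rule is a simple count over the imputed data sets. Third, the scaled Brier score is assumed here to follow the common definition 1 − Brier/Brier_max, where Brier_max is the Brier score of a model that predicts the observed prevalence for every patient.

```python
import math
from collections import Counter


def chi2_1df_upper_tail(x: float) -> float:
    """P(X > x) for a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def stable_variables(selected_per_imputation, min_count):
    """Variables retained by backward selection in at least `min_count` imputed data sets."""
    counts = Counter(v for selected in selected_per_imputation for v in set(selected))
    return sorted(v for v, n in counts.items() if n >= min_count)


def scaled_brier(outcomes, predicted):
    """Scaled Brier score: 1 - Brier / Brier_max, with Brier_max = p * (1 - p)."""
    n = len(outcomes)
    brier = sum((p - y) ** 2 for y, p in zip(outcomes, predicted)) / n
    prevalence = sum(outcomes) / n
    brier_max = prevalence * (1.0 - prevalence)  # non-informative reference model
    return 1.0 - brier / brier_max
```

For example, `chi2_1df_upper_tail(2.0)` evaluates to approximately 0.157, and a variable retained in only 8 of 20 hypothetical selection runs would not be returned by `stable_variables(..., min_count=10)`.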

Results

Study population

A total of 692 patients with pT1 CRC were identified through the DCCG.dk database. Paraffin blocks and HE slides from 49 patients could not be retrieved, and these patients were excluded from the analysis. After the histopathological re-evaluation of the original HE slides, another 85 patients were excluded from further analysis, either because the primary diagnosis of adenocarcinoma was rejected or because the diagnosis was uncertain based on the available material. The final cohort consisted of 558 patients. Among these, 339 patients (61%) underwent complete endoscopic resection (ER), and 219 patients (39%) underwent ER and subsequent bowel resection (ER + SBR). Figure 1 shows the study flow chart. The median follow-up times in the ER and ER + SBR groups were 79.0 months (IQR 55.5–112.0 months) and 96.0 months (IQR 71.0–122.5 months), respectively. Baseline clinical and histopathological characteristics are shown in Table 1.

Fig. 1 Flow chart illustrating the study population. CRC colorectal cancer

Table 1 Baseline clinical and histopathological characteristics

Disease recurrence and residual disease

A total of 27 patients (8.0%) in the ER group experienced disease recurrence. Among them, 12 patients (3.5%) were diagnosed with locoregional recurrence, and 15 patients (4.5%) developed distant metastasis. In contrast, a significantly higher proportion of patients in the ER + SBR group, 34 (15.5%), developed disease recurrence, p = 0.008. A total of 15 (11.9%) had positive lymph nodes in the resection specimen after SBR. The pathology reports of the resection specimens revealed 11 (5.0%) cases in which the pathological T-category was higher than pT1. These cases were excluded from further analysis. A total of 8 (3.7%) patients in the ER + SBR group developed distant metastasis during the follow-up period. There was no significant difference in the proportion of distant metastases between the ER group and the ER + SBR group, p = 0.68. Finally, 50 (8.1%) disease recurrence-positive cases were used for the development of the clinical prediction model for disease recurrence. Residual disease was identified in 21 (9.6%) cases after SBR. Table 2 shows the rates of disease recurrence and residual disease in the study population.

Table 2 The rate of disease recurrence and residual disease

Derivation of the prediction model for disease recurrence

As described previously, 50 (8.1%) cases were identified positive for disease recurrence. The logistic regression analysis is illustrated in Table 3.

Table 3 Univariate and multivariate logistic regression analysis for disease recurrence

After backward model selection, the following variables remained in the final model: resection margin with a cut-off point of 0 mm [OR, 2.84; 95% CI, 1.39–5.79; p = 0.004], presence of intramural venous invasion [3.12; 1.52–6.42; p = 0.002] and lymphatic invasion [3.34; 1.67–6.68; p = 0.002]. Table 4 lists the variables selected for the prediction model after backward selection.

Table 4 Variables selected after backward selection for the prediction model for disease recurrence
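To illustrate how the three odds ratios reported above combine, the following Python sketch multiplies them for a patient with several risk factors. It assumes a logistic model with main effects only (no interaction terms are mentioned in the text) and uses hypothetical variable names. Since the model intercept is not given in the text, only relative odds, not absolute recurrence probabilities, can be derived this way.

```python
import math

# Odds ratios from the final model as reported in the text (keys are hypothetical names).
ODDS_RATIOS = {
    "involved_margin_0mm": 2.84,
    "intramural_venous_invasion": 3.12,
    "lymphatic_invasion": 3.34,
}


def combined_odds_ratio(present):
    """Odds relative to a patient with none of the risk factors.

    In a main-effects logistic model the log odds ratios add,
    so the odds ratios multiply.
    """
    log_or = sum(math.log(ODDS_RATIOS[name]) for name in present)
    return math.exp(log_or)
```

Under these assumptions, a patient with all three risk factors would have roughly 2.84 × 3.12 × 3.34 ≈ 29.6 times the odds of recurrence of a patient with none; the wide confidence intervals of the individual estimates carry over to this product.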

The model demonstrated good performance for the prediction of disease recurrence (AUC = 0.75; 95% CI, 0.72–0.78; scaled Brier score = 10%). Figure 2 shows the ROC curve for disease recurrence prediction. The Hosmer–Lemeshow goodness-of-fit test yielded a p value of 0.59, suggesting good agreement between observed and predicted numbers of disease recurrence.
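The AUC can be read as the probability that a randomly chosen patient with recurrence receives a higher predicted risk than a randomly chosen patient without recurrence. The short Python sketch below computes it by direct pairwise comparison; it is an illustration of the quantity, not the SAS routine used in the study, and the function name is hypothetical.

```python
def auc(outcomes, predicted):
    """Concordance probability for binary outcomes (1 = event); ties count as 1/2."""
    cases = [p for y, p in zip(outcomes, predicted) if y == 1]
    controls = [p for y, p in zip(outcomes, predicted) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))
```

An AUC of 0.5 corresponds to predictions no better than chance, and 1.0 to perfect separation of the two groups.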

Fig. 2 Receiver operating characteristic (ROC) curve of the prediction model for disease recurrence

Derivation of the prediction model for residual disease

A total of 21 patients had residual disease after SBR. The prediction model was constructed using the same methodology as the disease recurrence prediction model. Univariate and multivariate logistic regression analysis is illustrated in Table 5.

Table 5 Univariate and multivariate logistic regression analysis for residual disease

After backward model selection, only resection margin with a cut-off point of 0 mm [OR, 2.91; 95% CI, 1.07–7.94; p = 0.04] was included in the model. Budding level Bd2–3 did not reach significance [1.96; 0.70–5.52; p = 0.20] and was selected in 8 of the 20 imputed data sets. Because no other relevant variables were retained, we nevertheless included the budding level in the final prediction model. Table 6 lists the variables selected for the prediction model after backward selection.

Table 6 Variables selected after backward selection for the prediction model for residual disease

The ROC curve demonstrated medium performance of the prediction model with an AUC of 0.68 (95% CI, 0.63–0.72). Figure 3 shows the ROC curve for residual disease prediction. The Hosmer–Lemeshow goodness-of-fit test yielded a p value of 0.77, and the scaled Brier score was 3%.

Fig. 3 Receiver operating characteristic (ROC) curve of the prediction model for residual disease

Discussion

The aim of the present study was to develop a prediction model for disease recurrence and residual disease based on histopathological factors in patients with pT1 CRC. We identified 50 (8.1%) disease recurrence positive cases in our data set. Intramural venous invasion, lymphatic invasion and a positive resection margin (involved margin) were all independent predictive factors for disease recurrence. Consequently, these variables were selected for the prediction model for disease recurrence. The model performance was good in terms of discrimination and calibration. Furthermore, we developed a prediction model for residual disease. Multivariate analysis identified a positive (involved) resection margin as an independent predictive factor, and additionally, we included tumour budding Bd2–3 in the prediction model, despite borderline significance. The model demonstrated medium performance for discriminating patients with residual disease, most likely due to the small sample size of the dataset available for model derivation.

Among patients who underwent subsequent bowel resection, more than 80% had no LNM in the surgical specimen, which illustrates how difficult it remains to distinguish between high- and low-risk pT1 CRC patients. The prevalence of LNM and distant metastases in the current study was in accordance with the existing literature [26]. As in our study, lymphovascular invasion has been one of the most reliable predictors of LNM in pT1 CRC in many studies [27]. However, previous studies have underlined that lymphatic and venous invasion should be recorded separately, as was done in our study, since submucosal lymphatic invasion and, to a lesser degree, venous invasion are among the strongest predictors of LNM in pT1 CRC [28]. In contrast to these reports, we found similar odds ratios for lymphatic invasion and vascular invasion. The recognition of both lymphatic and vascular invasion can be difficult, as lymphatics can be hard to distinguish from venules, and other factors such as retraction artefacts, tumour budding or poorly differentiated clusters may further complicate the picture. Consequently, the histopathological evaluation of lymphatic invasion is known to be subjective, with significant inter-observer variation [29]. Unlike in several other studies, the presence or absence of both lymphatic and venous invasion in the current study was confirmed by immunohistochemistry for D2-40 and caldesmon, respectively. The use of immunohistochemistry has been shown both to increase the number of detected cases and to significantly improve inter-observer agreement [30].

The Kikuchi and Haggitt classifications are used for risk stratification of lymph node metastasis in several international guidelines, including the current Danish guidelines [31]. In accordance with the challenges described in the literature regarding the use of the Kikuchi and Haggitt classifications, the level of invasion could not be evaluated in 11.4% and 48% of cases, respectively, during histopathological re-evaluation. This limitation hinders the accurate determination of the extent of tumour invasion and, consequently, the ability to make informed decisions regarding subsequent treatment [32]. There is still significant controversy about the risk of local recurrence, lymph node metastasis and distant metastasis in cases where a tumour extends close to the deep resection margin (1 mm or less) but does not directly involve it. In the current Danish guidelines, a resection margin distance of > 1 mm is still recommended [33], but the cut-off for a positive margin is under discussion in Denmark as well. Some studies have reported that a resection margin > 0 mm, in the absence of other histological risk factors, effectively identifies patients at low risk of residual disease and lymph node metastases [34, 35]. In the current study, we included resection margin with both a 0 mm cut-off value (involved margin) and a 1 mm cut-off value as predictors of disease recurrence and residual disease. Interestingly, only resection margin with a cut-off point of 0 mm qualified for inclusion in the final prediction model.

Previous studies have reported prediction models for both LNM and distant metastasis based on histopathological factors, with results similar to those of our study [8, 15, 36]. Recently, prediction models developed by artificial intelligence (AI) methods and AI-aided histopathological evaluation have demonstrated stronger performance than conventional models [37, 38]. Aside from the fact that these models are not yet fully integrated into clinical practice, one of their limitations is that some rely solely on histopathology reports rather than digital histopathology slides. Furthermore, the current AI models for detecting LNM are based on a sensitivity level of 100%, which may also present certain limitations. As a result, only a few additional unnecessary bowel resections could potentially be avoided compared to the use of histopathological risk factors as we know them today.

Overall, common limitations of most studies on prediction of disease recurrence in pT1 CRC are restricted information on histopathological factors, heterogeneity in surgical procedures, small sample size and single-centre data. Our study has significant strengths compared to some of these earlier studies: it uses nationwide, validated patient data; it includes both patients treated with ER alone and patients treated with ER followed by SBR, with sufficiently long follow-up; and the prediction model is based on re-evaluation of all cases by one experienced pathologist and not only on pre-existing pathology reports.

However, the study also has several limitations. The limited sample size and the low number of patients with disease recurrence and residual disease may introduce bias. Handling missing data poses inherent challenges, and the use of imputation may lead to a different final model in each imputed data set. A suggested solution is to include the variables that consistently appear in the final model. However, it is crucial to acknowledge that this method does not guarantee the relevance or stability of the selected variables. A notable limitation of backward elimination is that once a variable is rejected, it is not re-entered, even though a rejected variable might have become significant in the final model. We did not perform internal validation by splitting the data into training and testing sets, since such validation would be misleading given the insufficient sample size [39, 40]. Finally, we cannot determine the generalisability of the prediction model since it has not been externally validated.

In conclusion, while our prediction model for residual disease failed to demonstrate good performance, we succeeded in developing a prediction model for disease recurrence with good performance and calibration based on histopathological data. A unique result of this study is the finding of an involved resection margin (0 mm) as opposed to a margin of ≤ 1 mm, as an independent risk factor for both disease recurrence and residual disease. This finding might impact the coming Danish recommendations for the optimal treatment of patients with pT1 CRC.