Accurate diagnosis of malignant lymphadenopathy on cross-sectional imaging for rectal cancer remains challenging, with sensitivity and specificity estimated at approximately 70% to 80% when pathologic assessment is used as the reference standard.1 This often creates a management dilemma because decisions around neoadjuvant treatment and surgical excision with lateral pelvic lymph node dissection (LPLND) hinge on this preoperative nodal assessment. What remains unclear is whether the relatively poor diagnostic accuracy of cross-sectional imaging for lymphadenopathy is a limitation of the imaging technology, a weakness of human interpretation of the images generated, or a combination of both.
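
To make these two accuracy measures concrete, the minimal Python sketch below computes sensitivity and specificity for imaging calls scored against pathology. The function name and the ten toy cases are invented purely for illustration and are not data from any cited study.

```python
def sensitivity_specificity(imaging_positive, pathology_positive):
    """Return (sensitivity, specificity) of imaging calls against pathology."""
    pairs = list(zip(imaging_positive, pathology_positive))
    tp = sum(1 for img, path in pairs if img and path)          # true positives
    fn = sum(1 for img, path in pairs if not img and path)      # false negatives
    tn = sum(1 for img, path in pairs if not img and not path)  # true negatives
    fp = sum(1 for img, path in pairs if img and not path)      # false positives
    return tp / (tp + fn), tn / (tn + fp)


# Invented toy data: one entry per node (True = called or confirmed malignant).
imaging = [True, True, True, False, True, False, False, False, True, False]
pathology = [True, True, True, True, False, False, False, False, False, False]

sens, spec = sensitivity_specificity(imaging, pathology)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

In this toy set, three of the four pathologically positive nodes are called positive on imaging, and four of the six negative nodes are called negative.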

In this issue of Annals of Surgical Oncology, Nakanishi et al.2 describe an elegant study using a radiomics-based artificial intelligence prediction model to diagnose abnormal lateral pelvic lymph nodes (LPLNs) in a large cohort of rectal cancer patients treated with neoadjuvant chemoradiotherapy (CRTx) and LPLND. The radiomics method was significantly better at diagnosing true-positive nodes than the traditional parameters of pre-treatment short-axis diameter and size change after neoadjuvant CRTx.

As the first published investigation using radiomics to evaluate LPLN disease in rectal cancer, this study represents a significant advance in the field by a group uniquely placed to undertake such research, owing to their expertise in sophisticated machine-learning analytics and their access to a large population of patients who have undergone both CRTx and LPLND.

During a 13-year period, 854 patients with rectal cancer underwent total mesorectal excision (TME) in the two Japanese centers involved in this study. Of these patients, 247 also underwent LPLND after CRTx because of enlarged lateral nodes (long-axis diameter, ≥ 7 mm) and were eligible for inclusion in the study.

Radiomics analysis of contrast-enhanced computed tomography (CT) images was undertaken with the largest lateral node labeled manually. The radiomics score achieved a striking area under the curve (AUC) above 0.90 in both the primary and validation cohorts for prediction of viable tumor cells on pathologic assessment. On multivariate analysis including relevant clinical variables (age, gender, carcinoembryonic antigen [CEA] level, T stage, tumor height, neoadjuvant protocol), the radiomics score was the only independent predictor. These preliminary data suggest that treatment of lateral nodes can be better targeted by using this radiomics protocol to improve the accuracy of nodal assessment in clinical staging.
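
For readers less familiar with the AUC, it can be read as the probability that a randomly chosen pathologically positive node receives a higher score than a randomly chosen negative node. The short Python sketch below computes it by direct pairwise comparison (the Mann-Whitney formulation). The function name and the seven toy scores are invented for illustration and are not values from the study.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of positive-negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented toy data: model scores and pathology labels (1 = viable tumor).
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0, 0]

auc = roc_auc(toy_scores, toy_labels)
print(f"AUC = {auc:.3f}")
```

Here 11 of the 12 positive-negative pairs are ordered correctly; an AUC of 0.5 would correspond to chance and 1.0 to perfect separation.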

Although this study is exciting and has potentially wide-reaching implications, several limitations highlighted by the authors will appropriately temper its adoption until further progress is made. First and most importantly, all the data were retrospective, which introduces an element of selection and recall bias. In particular, all the included patients had undergone treatment under the assumption that they had abnormal LPLNs, and it is unclear how many in the denominator population would have been included had the radiomics nomogram been available. This baseline population becomes very relevant if the nomogram is to be applied prospectively (the logical next step), and it may affect the performance of the algorithm going forward because the patients would then not be pre-selected. In the included population, LPLND was performed for all the patients with nodes 7 mm or longer on the long axis, regardless of the size of the LPLNs after neoadjuvant treatment. This is inconsistent with emerging protocols that focus on short-axis size, abnormal node morphology, and lack of response to CRTx as the main indications for LPLND.3,4

Another important issue is that the imaging method of choice for assessment of LPLNs is magnetic resonance imaging (MRI), not CT as used in this study. Although CT lends itself better to the radiomics approach, as explained by the authors, future comparisons should include radiologists’ assessment of MRI because that is what clinicians use in practice to guide treatment decisions.

In addition, the reference gold standard used in the study was the presence of viable tumor cells in the pathologic analysis, whereas nodes with mucin or fibrosis indicating potential complete clinical response were considered negative. Although this was a practical approach, it is unclear whether a truly positive abnormal node rendered “sterile” by neoadjuvant CRTx can be treated the same as a true-negative node without any difference in long-term outcome.

The field of artificial intelligence in medicine is evolving rapidly, and the application of machine-learning techniques in medical imaging holds particular promise. Radiomics is a subset of machine learning that involves the simultaneous extraction and analysis of a large number of quantitative image features to generate a score that can then be used as part of a clinical nomogram. Although this method has been shown to improve diagnostic accuracy in many areas of medical imaging (now including the diagnosis of LPLNs on CT), it still relies on human identification and input of relevant imaging and clinical parameters.5
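
As a rough illustration of the general idea, and not the authors' model, the Python sketch below extracts three simple first-order features from a manually labeled region of interest and combines them into a single score with a logistic function. The feature set, the weights, the intercept, and the intensity values are all hypothetical. A real radiomics pipeline would typically extract hundreds of features and fit the weights to pathology-confirmed cases.

```python
import math
import statistics


def first_order_features(roi_values):
    """Compute a few simple intensity features from a labeled region."""
    return {
        "mean": statistics.fmean(roi_values),
        "sd": statistics.pstdev(roi_values),
        "range": max(roi_values) - min(roi_values),
    }


def radiomics_score(features, weights, intercept):
    """Combine weighted features into a score between 0 and 1 (logistic)."""
    linear = intercept + sum(weights[name] * features[name] for name in weights)
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical CT attenuation values (HU) inside one labeled node.
roi = [42, 55, 61, 48, 70, 66, 53, 59]

# Hypothetical weights; in practice these would be learned from training data.
HYPOTHETICAL_WEIGHTS = {"mean": 0.02, "sd": 0.10, "range": -0.01}

features = first_order_features(roi)
score = radiomics_score(features, HYPOTHETICAL_WEIGHTS, intercept=-2.0)
print(features, f"score = {score:.3f}")
```

The human input mentioned above appears in two places in this sketch: the region of interest is outlined by hand, and the feature definitions are chosen in advance rather than learned from the images.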

In contrast, deep-learning models such as convolutional neural networks represent a more recent variant of artificial intelligence analytics. When applied to medical imaging, these models can automatically learn the optimal features for diagnosis if given a reference gold standard, without reliance on human-defined parameters.6 It remains to be seen whether deep learning can further improve the efficiency and accuracy of nodal diagnosis in colorectal cancer, or indeed what further technological advances are forthcoming. Either way, there is a very real sense that enhanced clinical nodal staging, and therefore improved treatment targeting, for both colon and rectal cancer is just around the corner. Studies such as this investigation by Nakanishi et al.2 are rapidly laying the groundwork for this eventuality and bringing us one step closer to computer-assisted intelligence in surgical practice.