Integration of ultrasound-based radiomics with clinical features for predicting cervical lymph node metastasis in postoperative patients with differentiated thyroid carcinoma

Objective The primary objective was to establish a radiomics model utilizing longitudinal +cross-sectional ultrasound (US) images of lymph nodes (LNs) to predict cervical lymph node metastasis (CLNM) following differentiated thyroid carcinoma (DTC) surgery. Methods A retrospective collection of 211 LNs from 211 postoperative DTC patients who underwent neck US with suspicious LN fine needle aspiration cytopathology findings at our institution was conducted between June 2021 and April 2023. Conventional US and clinicopathological information of patients were gathered. Based on the pathological results, patients were categorized into CLNM and non-CLNM groups. The database was randomly divided into a training cohort (n = 147) and a test cohort (n = 64) at a 7:3 ratio. The least absolute shrinkage and selection operator algorithm was applied to screen the most relevant radiomic features from the longitudinal + cross-sectional US images, and a radiomics model was constructed. Univariate and multivariate analyses were used to assess US and clinicopathological significance features. Subsequently, a combined model for predicting CLNM was constructed by integrating radiomics, conventional US, and clinicopathological features and presented as a nomogram. Results The area under the curves (AUCs) of the longitudinal + cross-sectional radiomics models were 0.846 and 0.801 in the training and test sets, respectively, outperforming the single longitudinal and cross-sectional models (p < 0.05). In the testing cohort, the AUC of the combined model in predicting CLNM was 0.901, surpassing that of the single US model (AUC, 0.731) and radiomics model (AUC, 0.801). Conclusions The US-based radiomics model exhibits the potential to accurately predict CLNM following DTC surgery, thereby enhancing diagnostic accuracy.


Introduction
Differentiated thyroid carcinoma (DTC), mainly including papillary thyroid carcinoma (PTC) and follicular thyroid carcinoma (FTC), accounts for the great majority (>90%) of thyroid cancers and is characterized by indolent growth and low malignancy [1,2].The cervical lymph node (LN) is one of the most prone to metastasis after thyroid cancer surgery.Previous studies have shown that approximately 13.3% of patients experience recurrence after primary surgery, which may be related to mortality in a small subset of PTC patients [3][4][5][6].In the postoperative management of thyroid cancer, neck ultrasound (US) plays an important role in thyroid bed recurrence or cervical lymph node metastasis (CLNM).
The US characteristics of CLNM in thyroid cancer have been well demonstrated in the previous literature.These malignant signs include round shape, the absence of echogenic fatty hilum, the presence of cystic area, calcification, hyperechogenicity, abnormal vascularity, and increasing size [7][8][9].However, no single US feature is sensitive enough to detect CLNM of thyroid cancer [7].Moreover, the interpretation of US images is subjective and will be affected by the examiner's clinical experience.Additionally, changes after thyroidectomy, such as residual thyroid glands, postoperative suture granuloma, fibrosis, reactive hyperplasia of LNs, or fat necrosis, are similar to metastatic cervical LNs on US [10], further increasing the difficulty of diagnosis.
Radiomics has developed rapidly in recent years, with the objectives of extracting many quantitative signatures from medical images, deeply mining tumor biological information, analyzing tumor heterogeneity and facilitating clinical decision-making [11][12][13].Radiomics is a quantitative, reproducible, less operator dependent, relatively objective method that compensates for the shortcomings of conventional US.Currently, US-based radiomics is widely used in the thyroid.Previous studies have focused on thyroid nodules or the use of US features of preoperative thyroid cancer primary lesions to predict CLNM [14,15].To our knowledge, few studies have explored the US-based radiomic features of LNs to predict CLNM in patients with postoperative DTC.
Therefore, the aim of this study was to investigate the value of US-based radiomics for the prediction of CLNM in postoperative patients with DTC and construct a model combining US-based radiomics and clinical features for predicting CLNM in this population.

Patients
The Ethics Committee of Shandong University Affiliated Qianfoshan Hospital authorized this retrospective study and waived the need to obtain informed consent.
From June 2021 to April 2023, patients with postoperative DTC who underwent neck US with suspicious LN fine needle aspiration (FNA) cytopathology findings at our institution were retrospectively enrolled.The suspicious LN criteria for FNA in our study were as follows: the absence of a fatty echogenic hilum, alteration of the internal echogenicity (calcification, cystic changes, hyperechogenicity), and a long-to-short-axis ratio ≤2.The selection criteria of patients enrolled were as follows: 1) patients operated on with total thyroidectomy and central region LN dissection, with or without lateral cervical LN dissection; 2) high US image quality, saved in Digital Imaging and Communications in Medicine (DICOM) format; and 3) complete clinical information.The exclusion criteria of patients enrolled were as follows: 1) combined history of other head and neck malignancies; 2) patients who had received previous therapy (puncture, radiotherapy, chemotherapy, radiofrequency or microwave ablation); and 3) incomplete clinical information or inadequate cytological samples.All patients underwent FNA within one week after ultrasonography, and all laboratory findings were obtained before aspiration.If a patient had multiple suspicious LNs, one of the most suspicious LNs was selected for inclusion in the study.A total of 211 LNs from 211 patients (135 females and 76 males, mean age, 45.47 years ±11.81;range, 18-76 years) were enrolled in our study.Figure 1 illustrates a detailed patient screening flowchart.
The clinicopathological features of postoperative DTC patients were collected based on the postoperative pathological results, including the size of the primary thyroid cancer lesion (maximum diameter of the largest node selected for multiple nodes), the presence of Hashimoto's thyroiditis, extrathyroidal invasion (three grades: no, thyroid capsule invasion and extrathyroidal invasion), multifocality (number of thyroid lesions ≥ 2 defined as multifocal), and the number of lymph node metastasis (LNM) in the central and lateral neck regions (three grades: 0, 1-5, >5).In addition, whether the patient received radioiodine therapy and iodine dose were recorded.All patient laboratory test data were collected, including thyroglobulin (Tg) and antithyroglobulin antibody (TgAb), which were measured by the electrochemiluminescence immunoassay method.With reference to Augusto et al., DTC patients with postoperative suppressed Tg levels ≥1 ng/ml or stimulated Tg levels ≥2 ng/ ml or a progressive increase in TgAb on consecutive follow-up examinations were defined as having biochemical recurrence [16].
The total dataset was randomly divided into a training set and a testing set at a ratio of 7:3 using random seed value 52.Table 1 shows the baseline conventional US and clinicopathological information of the training and testing cohort patients.

US image acquisition
US examination was performed using a uniform type of high-resolution US instrument (GE LOGIQ E9) with a linear array probe (6-15 MHz).Then, a specific "thyroid" procedure was used to examine the thyroid bed and bilateral cervical LNs, and the location, size, internal echogenicity, presence of hilus, calcification, cystic changes and hyperechogenicity of the suspected LNs were recorded.Color Doppler flow imaging was used to observe the LN vascular distribution.
The locations of the suspected LNs were determined based on the guidelines from the American Head and Neck Society and the American Academy of Otolaryngology-Head and Neck Surgery [17].LN size was measured as the long-axis and short-axis diameter on the maximal longitudinal section.Calcifications included microcalcifications and macrocalcifications.The presence of echo-free areas in the LNs was defined as cystic changes.Hyperechogenicity was defined as the cortex echogenicity of LNs being higher than that of adjacent muscles.Abnormal vascularity was the presence of abundant, peripheral or mixed blood flow.The conventional US images of LNs were evaluated independently by two sonographers with at least 5 years of diagnostic experience who were blind to the pathological results.If there was any disagreement, the two radiologists would make a decision by consensus.

US image segmentation and feature extraction
Anonymized US grayscale DICOM images of LNs, captured in both longitudinal and cross-sectional images, were securely uploaded to the RadCloud platform (version 7.2; Huiying Medical Technology Co., Ltd, Beijing, China, https://mics.huiyihuiying.com).Two experienced radiologists, with over 5 years of professional experience, conducted the image analysis without access to the pathological outcomes, ensuring unbiased evaluation.The radiologists manually performed region of interest (ROI) segmentation, meticulously outlining the LNs.The ROIs consisted of two distinct components: longitudinal and cross-sectional segments.Subsequently, relevant feature values were extracted from the segmented ROIs.To ensure the precision and dependability of the results, an experienced radiologist (with 8 years of work experience) conducted thorough validation and correction of all delineated labels.
Radiomics feature extraction was then performed on this platform.Specifically, the platform utilized the PyRadiomics package (version 3.1.0,https://pyradiomics.rea dthedocs.io/),based on the Python language (version 3.7.0,https://www.python.org), to extract a comprehensive set of radiomics features from medical US images.To ensure standardization, preprocessing steps of the US images, including image resampling and grayscale discretization, were performed.A total of 661 image features were extracted from each ROI, encompassing seven major categories.These categories included first-order statistics, 2D shape features, gray-level cooccurrence matrix (GLCM) features, gray-level run-length matrix (GLRL) features, gray-level size zone matrix (GLSZM) features, gray-level dependence matrix (GLDM) features, and neighboring gray-tone difference matrix (NGTDM) features.

Feature selection
Prior to selecting radiomics features, Z-score normalization was performed on all features.To mitigate the risk of overfitting and enhance the relevance of features, dimensionality reduction techniques were employed.Initially, the intraclass correlation coefficient (ICC) was utilized to evaluate the interobserver reproducibility of the radiomics features.Features with ICC values > 0.8 were considered reliable and were included in the subsequent feature selection process.Subsequently, a variance threshold of 0.8 and the Select K Best method based on univariate analysis were employed.Last, the least absolute shrinkage and selection operator (LASSO) regression method was used for feature selection utilizing 10-fold cross-validation.

Development and validation of the model
Radiomics models were developed using longitudinal section, cross-section, and combined longitudinal + cross-sectional images of LNs to distinguish cervical LN status.Logistic regression (LR), random forest (RF), support vector machine (SVM), and decision tree (DT) algorithms were employed to create radiomics models, and then the best performing model was determined.
Significant US features from the training cohort were incorporated into multivariate analysis to construct US models in the training and test cohorts.Furthermore, considering the potential influences of clinicopathological characteristics, we performed multivariate analysis to integrate radiomic scores (Rad-scores), conventional US features, and independent clinical risk factors.Subsequently, a nomogram for predicting CLNM was constructed based on the developed model to facilitate clinical decision-making.
The Hosmer-Lemeshow goodness-of-fit test and calibration curve were constructed to assess the performance of the prediction model.Figure 2 illustrates the workflow diagram depicting the various stages of this study.

Statistical analysis
Student's t test and Mann-Whitney U test were utilized to compare the differences between groups in measurement data.Count data comparisons between groups were made using the Chi-square test or Fisher's exact probability method.Univariate and multivariate logistic regression analyses were utilized to screen US and clinicopathological characteristics.The area under the curve (AUC) estimates of different prediction models were compared using the nonparametric DeLong's test.SPSS 20 (IBM Corp., Armonk,

Clinical and ultrasound characteristics
The 211 patients included 208 with PTC and 3 with FTC.Among PTCs, there were 197 low-risk subtypes (190 classical variants, 7 follicular variants) and 11 high-risk subtypes (5 diffuse sclerosing variants, 4 tall cell and 2 columnar cell subtypes), and no significant differences were found between low-risk and high-risk subtypes in predicting CLNM.Table 2 presents the results of univariate and multivariate analyses predicting CLNM in the training cohort.In terms of US features, compared with the non-CLNM group, the CLNM group mostly showed calcification (p < 0.001), cystic area (p = 0.02), hyperechogenicity (p < 0.001) and abnormal vascularity (p < 0.001).In addition, the short axis of LNs in the CLNM group was significantly higher than that in the non-CLNM group (p = 0.02).
Based on postoperative pathological features of the thyroid primary lesion, the incidence of extrathyroidal invasion (p ≤ 0.001) and the number of LNMs >5 in the central and lateral neck regions were substantially higher in the CLNM group (p = 0.01) than in the non-CLNM group.Moreover, the primary tumor size in the CLNM group was significantly larger than that in the non-CLNM group (p = 0.03).Additionally, patients with biochemical recurrence were more likely to present with CLNM (p < 0.001).Among 114 patients with biochemical recurrence, 83 had elevated Tg only, 26 had elevated TgAb only, and 5 exhibited elevations of both Tg and TgAb.
Multivariate analysis revealed that US features (calcification, hyperechogenicity) and clinicopathological features (extrathyroidal invasion, biochemical recurrence) were independent factors for predicting CLNM.

Radiomics results and model building
In our study, a total of 1322 feature values were extracted from the longitudinal + cross-sectional images of LNs.After data dimensionality reduction, 13 nonzero coefficient radiomics features were obtained (6 from cross-sections and 7 from longitudinal sections).These features were applied to compute the total cohort per patient Rad-score, which was formulated by calculating the sum of the corresponding feature values multiplied by their respective weights as illustrated in Fig. 3.
A combined radiomics feature machine learning model was constructed with LR, SVM, RF and DT, and the best overall performance of the LR model was shown (Table 3).The AUCs of the LR model were 0.846 and 0.801 in the training and test cohorts, respectively, and the accuracy, sensitivity and  Based on the methodology mentioned above, 8 radiomics features were extracted from 661 longitudinal section imaging signatures to calculate the Rad-score of longitudinal sections, and 8 radiomics features were extracted from 661 cross-sectional imaging signatures to calculate the Rad-score of cross-sectional sections.Notably, the features screened from longitudinal sections and cross-sections alone were different from those screened from the combination of longitudinal +cross-sectional sections.LR was used to construct the radiomics model, and the results showed that the longitudinal +cross-sectional radiomics models outperformed the single longitudinal section and cross-section models (Table 4).

Construction and validation of the model
In our research, a model combining conventional US, Radscores and clinical features was established and presented as a nomogram (Fig. 4).Then, to assess the model performance, the Hosmer-Lemeshow goodness of fit test was applied to the training and test cohorts, with p values of 0.18 and 0.25, respectively, indicating that no significant differences were found between the model-predicted risk of CLNM and the actually occurring risk.The calibration curve of the radiomics nomogram showed good agreement between the predicted and actual CLNM in the training and test sets (Fig. 4).
Compared with conventional US and radiomics models, the combined model showed the best performance in predicting CLNM.Table 5 shows the performance evaluation of each model.As Fig. 5 shows, the receiver operating characteristic (ROC) curves corresponding to the three models and the AUCs of the combined model in the training and test cohorts were 0.943 and 0.901, respectively.DeLong's test indicated that the radiomics model was significantly better than the conventional US model (p < 0.05), and the combined model was significantly better than the single US model and radiomics model (p < 0.05).
CLNM cervical lymph node metastasis, LNM lymph node metastasis, CI confidence interval, NA values were not available *P-value of univariate logistic regression analysis <0.05 **P-value of multivariate logistic regression analysis <0.05 The decision curve analysis suggested that the combined model yielded the highest net benefit in predicting CLNM compared to the radiomic and US models (Fig. 6).In patients with postoperative DTC, the combination of US-based radiomics with clinical features showed greater efficacy in predicting the status of the cervical LNs.

Discussion
The most common pathological subtype of DTC is PTC, and lymphatic metastasis is its main metastasis mode.The most common site of recurrence after DTC surgery is also cervical LNs, and high-frequency US has become an important tool to evaluate postoperative locoregional  recurrence.This study showed that US characteristics of LNs, including short axis, calcification, cystic area, hyperechogenicity and abnormal vascularity, were risk factors for predicting CLNM, similar to previous study findings [18,19].No significant differences were found in terms of LN hilus (p = 0.09), LN long axis (p = 0.69) or long-to-short axis ratio (p = 0.05) between the non-CLNM and CLNM groups.This was because only suspicious LNs were included in this study, and LNs with normal LN US features were not included.Moreover, including data from such LNs could also reduce the overall diagnostic accuracy.Recently, previous studies have shown that radiomics features enable objective and quantitative analysis that can provide insights for personalized medicine and potentially improve the accuracy of diagnosis, prognosis, and prediction [20,21].US-based radiomics studies have also shown critical value in predicting the prognosis of many diseases, such as breast, thyroid and liver cancer [15,[22][23][24].Previous studies have mostly focused on preoperatively predicting CLNM status using the US radiomics features of primary thyroid tumors.Vivian et al.'s study showed the potential of using the US radiomics signature of the primary tumor alone to predict lateral neck region LNM in PTC patients, with AUCs of 0.710 and 0.62 in the training and test sets, respectively [25].
This study innovatively used US radiomics of cervical LNs to determine LN status after DTC surgery.We preliminarily found that the radiomics of both longitudinal and cross-sectional images of cervical LNs had good diagnostic performance in predicting CLNM, and the diagnostic performance of the longitudinal + cross-sectional radiomics model was superior.The AUCs of the radiomics model in predicting CLNM were 0.846 and 0.801 in the training and test cohorts, respectively, which were better than those of conventional US, suggesting that US-based radiomics analysis had good application value in predicting CLNM after DTC.
Lin et al. showed that grayscale US radiomics analysis of LNs had potential value in extracting significant US features and enhancing the diagnosis of CLNM in patients with nasopharyngeal carcinoma, with an AUC of 0.810 (0.739-0.881) in the radiomics model [26].Additionally, in some studies, US techniques such as elastography, contrastenhanced US and radiomics have been used to predict the risk of CLNM of PTC preoperatively.The clinical-radiomics nomogram constructed by Li et al. based on contrastenhanced US for the preoperative prediction of LNM in PTC showed good performance in the training set (AUC = 0.820) and test set (AUC = 0.814) [14].The study by Meng et al. showed that a radiomics nomogram based on shear-wave elastography had greater discriminatory efficacy in the USreported LN-negative subgroup (AUC = 0.812) [15].
This study showed that extrathyroidal invasion of the primary thyroid tumor was independently related to CLNM, similar to previous studies [27].In addition, postoperative serum Tg and TgAb levels are of great value in monitoring recurrence after total thyroidectomy for DTC [7,28].Our study findings showed that the biochemical recurrence rate was significantly higher in the CLNM group than in the non-CLNM group.Consequently, in cases where biochemical recurrence is clinically indicated, the sonographer should conduct a thorough neck examination to carefully assess CLNM.
Accounting for the influence of clinical factors in each patient, our study integrated conventional US, Rad-scores, extrathyroidal invasion, and biochemical recurrence into a comprehensive combined model for predicting CLNM.This combined model was then represented as a nomogram to aid clinical decision-making.Remarkably, when compared with the US and radiomics models, the combined model exhibited superior diagnostic performance, achieving AUCs of 0.943 and 0.901 in the training and test sets, respectively.These results underscore the potential clinical utility of our combined model in accurately predicting CLNM in postoperative patients with DTC.
Nonetheless, several limitations need to be recognized in our study.First, since this was a single-center retrospective study with a relatively small sample size, it is essential to conduct prospective multicenter investigations with larger cohorts to validate and strengthen the findings.Second, our analysis focused solely on static grayscale US images of LNs, and the integration of multimodal images, such as radiomics combined with elastography or contrast-enhanced US, could offer additional information and provide superior performance.Therefore, future research will include the acquisition of multimodal images for comprehensive analysis.Third, due to the limited sample size, the classification of LN size was not conducted in detail, warranting further exploration in future studies.Addressing these limitations will be critical for advancing the understanding and clinical application of our findings.
In conclusion, our preliminary study indicates that the identified radiomic features have the potential to serve as predictors of cervical LN status after DTC surgery.The integration of US-based radiomics with clinical features demonstrated enhanced diagnostic efficacy and significantly improved discriminatory performance in evaluating CLNM among patients with postoperative DTC.

Fig. 1
Fig. 1 Patient selection flowchart in this study.DTC differentiated thyroid carcinoma, CLNM cervical lymph node metastasis

Fig. 2
Fig. 2 The workflow diagram of this study.ROC receiver operating characteristic, LASSO the least absolute shrinkage and selection operator

Fig. 3
Fig.3This study uses the least absolute shrinkage and selection operator (LASSO) in the parametric method to select the ultrasound image radiomics features.a Select the optimal alpha of 0.0183 with

Fig. 4 A
Fig. 4 A nomogram combining Rad-score, conventional ultrasound, extrathyroidal invasion and biochemical recurrence was developed to predict the risk of CLNM in patients with postoperative differentiated

Fig. 5 Fig. 6
Fig. 5 Receiver operating characteristic (ROC) curves of the combine model, conventional ultrasound model, and radiomics model were compared in the training (a) and test (b) cohorts to predict the risk of CLNM in patients with postoperative differentiated thyroid cancer

Table 1
Baseline conventional ultrasound and clinicopathological data of patients in the training and test cohorts

Table 2
Results of the univariate and multivariate analysis based on the training cohort

Table 3
The performance of LR, SVM, DT and RF in longitudinal + cross-sectional radiomics for predicting lymph node status after differentiated thyroid carcinoma LR logistic regression, SVM support vector machine, DT decision tree, RF random forest, AUC area under the curve

Table 4
The performance of lymph nodes longitudinal, cross-sectional and longitudinal + cross-sectional radiomics model using logistic regression for predicting lymph node status after differentiated thyroid carcinoma

Table 5
The performance of