Significance statement

This study used fine-needle aspiration for sampling and prediction simultaneously during the preoperative examination, and innovatively established prediction models based on genetic mutations and clinicopathological factors that can be used to evaluate CLNM of PTC with high sensitivity and specificity, thus achieving the ideal combination of molecular and traditional diagnostic methods for clinical application. The high diagnostic efficacy of these models can help clinicians to diagnose metastases in the central lymph nodes of PTC, thus avoiding prophylactic central neck dissection in patients with negative lymph nodes and reducing the incidence of complications such as laryngeal nerve injury and hypoparathyroidism. These prediction models facilitate the formulation of individualized patient treatment plans and assist clinicians in accurately diagnosing the disease.

Introduction

Papillary thyroid carcinoma (PTC) is the most common thyroid malignancy, accounting for approximately 84% of all thyroid cancers [1, 2]. Cervical lymph node metastasis in PTC is closely related to the local recurrence of tumors, disease-free survival (DFS), and overall survival (OS). The mode and scope of surgery are determined by the presence of lymph node metastasis and the site of metastasis. Central lymph node metastasis (CLNM) is the primary cervical lymph node metastasis site in PTC [3,4,5].

Few tests can diagnose CLNM of PTC with high accuracy. Preoperative ultrasound has high diagnostic value in detecting lateral neck lymph node metastasis (LLNM) but is less sensitive in diagnosing CLNM [6, 7]. The accuracy of intraoperative palpation for CLNM by surgeons is usually less than 30% [8]. However, in PTCs with lymph nodes larger than 1 cm in diameter, the rate of CLNM is as high as 40%-90%. Even in microscopic thyroid cancer, the rate of CLNM can be 25%-45% [9,10,11]. Prophylactic central neck dissection (PCND) reduces the risk of recurrence but increases the incidence of complications such as recurrent laryngeal nerve injury and hypoparathyroidism [12, 13]. Therefore, PCND and the extent of dissection in patients with PTC are controversial [14,15,16]. There is an urgent need for a marker that can be used to accurately predict CLNM or establish a predictive model for CLNM.

Genetic sequencing of tumor tissues helps improve the understanding of tumor genomes and enables the development of personalized cancer treatments for patients with certain genetic variants [17]. BRAF V600E is the most common mutation in PTC [18]. The BRAF V600E mutation is a useful reference for predicting the presence of CLNM and could be used to determine whether to perform PCND in patients with clinically lymph node-negative (cN0) stage PTC [19,20,21]. However, the predictive value of the BRAF V600E mutation for CLNM needs to be confirmed by further studies with large patient cohorts that account for the clinicopathological characteristics of and geographical differences in CLNM.

In this study, we included 488 patients diagnosed with PTC by ultrasound-guided fine-needle aspiration biopsy, collected clinicopathological data, analyzed the correlation between CLNM and clinicopathological features using univariate analysis and binary logistic regression, and constructed a nomogram prediction model and a convolutional neural network (CNN) prediction model of CLNM. Both models were also validated in the subclinical metastasis group and the clinical metastasis group. This study provides a new method for predicting CLNM in patients with PTC.

Methods

Patient selection

From January 2019 to December 2021, a total of 582 patients suspected of having PTC after ultrasound-guided fine-needle aspiration biopsy of thyroid nodules were recruited for this study. The inclusion criteria were as follows: (a) preoperative ultrasonography of the thyroid gland and cervical lymph nodes, ultrasound-guided fine-needle aspiration cytology of thyroid nodules, and fine-needle aspiration tissue DNA (FNAT-DNA) BRAF V600E gene mutation testing, with the aspiration biopsy considered to be PTC; (b) first thyroid surgery, with the surgical approach following the "Thyroid Cancer Diagnosis and Treatment Standard" issued by the Medical Administration of the Chinese Health Care Commission; (c) complete genetic information, ultrasound findings and clinicopathological data of the patients; and (d) intraoperative rapid pathology and postoperative pathology confirming PTC. The exclusion criteria were as follows: (a) a history of thyroid-related medication; (b) preoperative treatment with neck radiation and iodine-131; and (c) recurrence of the tumor and presence of distant metastases. Among the patients, 94 patients with incomplete clinical data were excluded, and 488 patients were finally included in this study. The median age was 45 years, 128 patients were male, and 360 patients were female (Fig. 1, Table 1).

Fig. 1 Flowchart of patient selection for the study

Table 1 Basic and comprehensive characteristics in this study

All patients were operated on by the same team of thyroid surgeons. All patients underwent thyroid lobectomy and ipsilateral central lymph node dissection (CLND) on the side where the nodule was located, or total thyroidectomy and bilateral CLND if the lesion was present on both sides; some patients underwent lateral lymph node dissection (LLND) based on preoperative evaluation, intraoperative rapid frozen section reports, and intraoperative conditions. Of the 488 patients included in the study, 213 patients underwent unilateral thyroidectomy, 275 patients underwent total thyroidectomy, and 73 patients underwent LLND. All pathological specimens were sent to the pathology department for paraffin fixation and histological analysis. All specimens were examined microscopically and cross-checked by two or more experienced pathologists for analysis and lymph node metastasis evaluation, and a final diagnostic report was given. According to the postoperative pathological findings, 251 patients did not have lymph node metastasis, 173 patients had CLNM only, 11 patients had LLNM only, and 53 patients had both CLNM and LLNM (Table 1).

Clinicopathological features

Clinicopathological data, such as sex, age (45 years [median age] was used as the grouping criterion), combined benign thyroid disease, surgical method, BRAF V600E gene mutation results, cervical lymph node metastasis and ultrasound characteristics (maximum diameter of thyroid nodule, nodule location, nodule aspect ratio, nodule microcalcification, nodule boundary, capsular invasion, number of lesions and ACR-TIRADS classification), were collected from all included patients (Table 1).

Detection of BRAF V600E gene mutation status by next-generation sequencing

The specimens extracted from ultrasound-guided fine-needle aspiration biopsies of thyroid nodules were preserved in preservation solution and then submitted to a genetic testing company for BRAF V600E mutation testing. The BRAF V600E mutation test was performed by next-generation sequencing (NGS) at Ovison Gene Technology Tianjin Co., Ltd. using DNA-EZ Reagents F DNA-Be-Locked A for storage and transportation at room temperature.

Construction and validation of the nomogram prediction model

A nomogram prediction model was constructed using a logistic regression algorithm, and a nomogram was plotted with the training data set. The goodness of fit between the observed and predicted values was examined using the Hosmer–Lemeshow method and is shown on the calibration curve plot [22]. The discrimination and calibration of the prediction model were examined by receiver operating characteristic (ROC) curves and calibration curves, respectively. The “rms” (ver. 6.3.0) and “pROC” (ver. 1.18.0) packages in R were applied to create the calibration curve and ROC graph [23]. The likelihood of CLNM was quantified as a risk score for predictive classification.
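As an illustration of what the area under the ROC curve measures, the following minimal Python sketch computes it as the probability that a randomly chosen patient with CLNM receives a higher risk score than a randomly chosen patient without CLNM. It is not the pROC implementation used in the study; the function name `roc_auc` and the example scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical risk scores (illustrative only, not study data).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to random ranking and an AUC of 1.0 to perfect separation of the two groups.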

1D-CNN model building and training

The discrete variables with multiple categories (combined benign thyroid disease, nodule location, and capsular invasion) were one-hot encoded so that distances between feature values could be calculated more reasonably [24]. The numerical variables (age and maximum diameter of thyroid nodules) were z-score normalized to make them comparable across features. In this study, 11 clinicopathological features (sex, age, combined benign thyroid disease, maximum diameter of thyroid nodules, nodule location, aspect ratio, microcalcification, nodule boundary, capsular invasion, BRAF V600E gene mutation and number of lesions) were incorporated in the construction of a one-dimensional convolutional neural network (1D-CNN) model, which was implemented in Python using the PyTorch framework [25]. We constructed a 12-layer 1D-CNN, as shown in Supplementary Fig. 1, which includes one input layer, six 1D convolutional layers, one flattening layer, two dropout layers and two fully connected layers. The samples were divided into a training set and a test set at a ratio of 8:2. To keep lymph node metastasis status balanced between the two sets, 80% of the patients in each of the CLNM and nonmetastasis groups were randomly selected to form the training set, and the remaining 20% formed the test set. The training set was used to train the model, and the test set was used to validate it. The 1D-CNN model was trained with cross-entropy as the loss function using the Adam optimizer. The learning rate was set to 0.0001, and the number of iterations was set to 200. After the model was trained, the test set samples were input into the model for prediction [26]. The area under the ROC curve (AUC) was used to evaluate the discriminative ability of the model [27].
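The preprocessing and splitting steps described above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the study's code: the category names, the use of the population standard deviation for z-scores, the rounding rule for the split and the random seed are all hypothetical choices. Only the group sizes (226 with CLNM, 251 without) are taken from the study.

```python
import random
from statistics import mean, pstdev


def one_hot(value, categories):
    """Encode one categorical value as a 0/1 vector over the category list."""
    return [1.0 if value == c else 0.0 for c in categories]


def z_scores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Split indices so each class contributes train_fraction of its cases to training."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train_idx += idx[:cut]
        test_idx += idx[cut:]
    return sorted(train_idx), sorted(test_idx)


# Hypothetical category names for nodule location.
locations = ["upper", "middle", "lower"]
encoded = one_hot("middle", locations)

# Hypothetical ages, used only to show the normalization.
z = z_scores([30.0, 45.0, 60.0])

# Group sizes from the study: 226 patients with CLNM (1), 251 without (0).
labels = [1] * 226 + [0] * 251
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Splitting within each group separately keeps the proportion of CLNM cases nearly identical in the training and test sets.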

Subgroup analysis and validation

According to the preoperative ultrasound results, the 477 patients who participated in the model construction were divided into a subclinical metastasis group and a clinical metastasis group; those with lymph node metastasis indicated by preoperative ultrasound were included in the clinical metastasis group (n = 73), and those without lymph node metastasis were included in the subclinical metastasis group (n = 404). The constructed nomogram prediction model and CNN model were validated in these two groups (Table 2).

Table 2 Distribution of clinicopathological characteristics in the subclinical metastasis group and the clinical metastasis group

Statistical analysis

Binary count data, unordered multicategorical count data, and ordered multicategorical count data were analyzed using the χ2 test. Binary logistic regression analysis was used to identify independent risk factors for CLNM. Then, a nomogram prediction model was constructed based on statistically significant indicators from the binary logistic regression analysis. Statistical analyses and the creation of the nomogram prediction model were performed with SPSS statistical software v26.0 and R version 4.1.2 (R Project for Statistical Computing) in RStudio [28], and the 1D-CNN was built in Python using the PyCharm Community Edition development environment [25]. P < 0.05 was considered statistically significant.
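For a binary feature, the χ2 test compares the observed 2 × 2 table of feature status against CLNM status with the table expected under independence. The sketch below computes the Pearson statistic without continuity correction and its P value for one degree of freedom. The function name and the counts are hypothetical, and whether SPSS applied a continuity correction in the study is not stated, so this is an assumption.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts (not study data):
# rows = feature present / absent, columns = CLNM / no CLNM.
chi2, p = chi_square_2x2(30, 20, 15, 35)
print(round(chi2, 3), p < 0.05)
```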

Results

CLNM is associated with several clinicopathological features

Among the 488 patients enrolled, 389 patients had the BRAF V600E gene mutation according to the FNAT-DNA BRAF V600E gene mutation results, with a gene mutation rate of 79.7% (Table 1). Eleven patients with LLNM only were excluded, and the remaining 477 patients were divided into groups with (n = 226) and without lymph node metastasis in the central region (n = 251) (Fig. 1). Fifty-three patients in the metastasis group also had LLNM. Univariate analysis revealed that the differences in age, maximum diameter of thyroid nodules, capsular invasion, and BRAF V600E mutation were statistically significant between the two groups (Fig. 2, P < 0.05). In contrast, the differences in sex, combined benign thyroid disease, surgical method, nodule location, aspect ratio, microcalcification, nodule boundary, number of lesions and ACR-TIRADS classification were not statistically significant between the two groups (Table 3, P > 0.05). Thus, CLNM was associated with age, maximum diameter of thyroid nodules, capsular invasion, and BRAF V600E gene mutation. In addition, multivariate analysis further showed that age (OR = 3.380, P < 0.01), maximum diameter of thyroid nodules (10–20 mm vs. < 10 mm: OR = 2.228, P < 0.01; > 20 mm vs. < 10 mm: OR = 4.795, P < 0.01), BRAF V600E gene mutation (OR = 6.410, P < 0.01), and capsular invasion (OR = 1.507, P = 0.027) were independent risk factors for CLNM (Table 4).

Fig. 2 Correlation of central lymph node metastasis with age (2A), maximum diameter of thyroid nodules (2B), capsular invasion (2C), and BRAF V600E gene mutation (2D). The vertical axis indicates the proportion of the number of patients. *** stands for P ≤ 0.001

Table 3 Correlation of central lymph node metastasis and clinicopathological features
Table 4 Multivariate logistic regression of factors associated with central lymph node metastasis

Construction of the nomogram prediction model

The statistically significant indicators (age, maximum diameter of thyroid nodules, capsular invasion and BRAF V600E mutation) from the multivariate analysis were used to construct a nomogram prediction model for CLNM. The total of the points assigned to each feature corresponds to the predicted probability of CLNM of PTC (Fig. 3A). To characterize the predictive efficacy of the model, we performed ROC curve analysis, and the AUC value of the model was 0.778 (95% CI: 0.7374–0.8196; P < 0.001). The calibration curve of the model showed good agreement between the predicted and observed results (Fig. 3B-C). In addition, we divided the enrolled patients into two subgroups for validation (Table 2); the AUC value of the model was 0.76 in the subclinical metastasis group and 0.96 in the clinical metastasis group, both indicating high diagnostic efficacy (Fig. 4A-D).
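To illustrate how a nomogram turns logistic regression results into points and a probability, the sketch below converts the reported odds ratios into coefficients (the natural logarithm of each OR), scales them so that the strongest predictor scores 100 points, and maps the linear predictor to a probability with the logistic function. The intercept is not reported in the text, so the value used here is purely hypothetical, as are the variable names and the assumption that each predictor is coded as present or absent. The scaling performed by the rms package may differ in detail.

```python
import math

# Odds ratios as reported in the multivariate analysis.
odds_ratios = {
    "age": 3.380,
    "diameter_10_20_mm": 2.228,
    "diameter_over_20_mm": 4.795,
    "braf_v600e": 6.410,
    "capsular_invasion": 1.507,
}

# HYPOTHETICAL intercept: not reported in the text, chosen only for illustration.
INTERCEPT = -2.5

# Logistic regression coefficient = natural log of the odds ratio.
betas = {name: math.log(value) for name, value in odds_ratios.items()}

# Nomogram-style points: the largest coefficient is scaled to 100 points.
max_beta = max(betas.values())
points = {name: 100.0 * beta / max_beta for name, beta in betas.items()}


def predicted_probability(present):
    """Logistic probability for the list of risk factors that are present."""
    linear = INTERCEPT + sum(betas[name] for name in present)
    return 1.0 / (1.0 + math.exp(-linear))


low = predicted_probability([])
high = predicted_probability(
    ["age", "diameter_over_20_mm", "braf_v600e", "capsular_invasion"]
)
print({name: round(p, 1) for name, p in points.items()})
print(round(low, 3), round(high, 3))
```

Because the points are a linear rescaling of the coefficients, a higher total always corresponds to a higher predicted probability, which is what the probability axis at the bottom of a nomogram reads off.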

Fig. 3 Construction of the nomogram prediction model. 3A: Nomogram of the logistic regression model for predicting central lymph node metastasis; 3B: ROC curve plot; 3C: Prediction model calibration curve

Fig. 4 Subgroup analysis and validation. 4A-B: ROC curves (A) and diagnostic efficacy evaluation indexes (B) of the nomogram prediction model in subclinical metastasis group; 4C-D: ROC curves (C) and diagnostic efficacy evaluation indexes (D) of the nomogram prediction model in clinical metastasis group; 4E-F: ROC curves (E) and diagnostic efficacy evaluation indexes (F) of the CNN prediction model in subclinical metastasis group; 4G-H: ROC curves (G) and diagnostic efficacy evaluation indexes (H) of the CNN prediction model in clinical metastasis group

Construction of the CNN prediction model

Next, we incorporated all the indicators to build a CNN prediction model for CLNM (Table 5). The iterative graph of this model showed that the loss of the training model decreased as the number of iterations increased. Figure 5A indicates that the accuracy of the training model increased as the number of iterations increased, and the prediction model reached the minimum loss of training data and the maximum accuracy after 60 iterations of training. ROC curve analysis in the training set showed that the model had high predictive efficacy; the AUC value was as high as 0.89, the specificity and sensitivity were 0.84 and 0.75, and the false positive and false negative rates were 0.16 and 0.25, respectively (Fig. 5B-C). In the test set, the AUC value was 0.78, the specificity and sensitivity were 0.82 and 0.62, and the false positive and false negative rates were 0.18 and 0.38, respectively, suggesting that the model has good predictive efficacy (Fig. 5D-E). We also validated the model in the subclinical metastasis group and the clinical metastasis group (Table 2), and the AUC values of the model were 0.8 in the subclinical metastasis group and 0.99 in the clinical metastasis group, suggesting that the model has high diagnostic efficacy. Notably, the model had a high specificity (85%) in the subclinical metastasis group and a high sensitivity (98%) in the clinical metastasis group (Fig. 4E-H).
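The four rates reported above are all derived from the same confusion matrix, and the false positive and false negative rates are the complements of specificity and sensitivity, respectively. A minimal sketch of the calculation follows; the function name and the counts are hypothetical and are not the study's confusion matrix.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and error rates from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),          # true positive rate
        "specificity": tn / (tn + fp),          # true negative rate
        "false_positive_rate": fp / (fp + tn),  # equals 1 - specificity
        "false_negative_rate": fn / (fn + tp),  # equals 1 - sensitivity
    }


# Hypothetical counts (illustrative only).
metrics = diagnostic_metrics(tp=30, fp=5, tn=45, fn=10)
print(metrics)
```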

Table 5 Convolutional neural network baseline table
Fig. 5 Construction of the convolutional neural network prediction model. 5A: Iterative plot of the convolutional neural network prediction model; 5B, 5D: ROC curves obtained with the training and test sets; 5C, 5E: Sensitivity, specificity, false positive rate and false negative rate obtained with the training and test sets

Stratified analysis according to nodule size

We next constructed another CNN model for the population (284 patients) with thyroid nodules measuring less than or equal to 1 cm in diameter to assess its predictive value for this specific subgroup. The prediction model also had high performance, and its accuracy in the training set increased as the number of iterations increased. After 70 iterations, the training data loss was minimized, and the accuracy was maximized (Fig. 6A). The AUC values were 0.87 and 0.76 for the training and test sets, with specificities of 0.81 and 0.83, sensitivities of 0.76 and 0.64, false positive rates of 0.19 and 0.17, and false negative rates of 0.24 and 0.36, respectively (Fig. 6B-E). Therefore, the model could be a good predictor of CLNM in patients with nodule diameters measuring ≤ 1 cm.

Fig. 6 Construction of convolutional neural network prediction models for patients with nodule diameters ≤ 1 cm. 6A: Iterative plot of the convolutional neural network prediction model; 6B, D: ROC curves obtained with the training and test sets; 6C, E: Sensitivity, specificity, false positive rate and false negative rate obtained with the training and test sets

Discussion

In this study, we retrospectively collected data from 488 patients with PTC and investigated the correlations between the CLNM and clinicopathological features using univariate analysis and binary logistic regression analysis. We constructed a nomogram prediction model and a CNN prediction model for CLNM, thus facilitating the formulation of individualized patient treatment plans and assisting clinicians in accurately diagnosing the disease.

PTC is usually characterized by chromosomal rearrangements of RET or point mutations in the RAS or BRAF proto-oncogene, and mutations in the BRAF, RAS, or RET genes are found in nearly 70% of PTC cases [19, 29]. Of these, BRAF mutations are seen in 60%-70% of PTCs, making them the most common mutations in PTC [30, 31]. Yan et al. showed that 1715 of 2048 patients with PTC had BRAF V600E mutations, with a mutation rate of 83.7% [32]. In the present study, the BRAF V600E gene mutation rate was as high as 79.71%. There is some controversy in previous studies regarding the correlation between the BRAF V600E gene mutation and cervical lymph node metastasis [33,34,35,36,37,38,39], and our study showed that the BRAF V600E gene mutation (OR = 6.410, P < 0.001) was an independent risk factor for CLNM, suggesting that it may serve as an important reference in predicting lymph node metastasis.

In further analyses, we found that, in addition to the BRAF V600E mutation, age, maximum diameter of thyroid nodules, and capsular invasion were independent risk factors for CLNM. The risk of CLNM was higher in younger patients and in those with larger nodules or capsular invasion. The prediction model developed by incorporating these independent risk factors into the logistic regression model had good predictive efficacy. The CNN is a typical deep learning-based feedforward neural network that employs convolutional computations and a deep structure; it has good self-learning ability, adaptability and fault tolerance, and can automatically extract features from the input data and use them for further classification or prediction [40, 41]. Of the different kinds of CNNs, the 1D-CNN is mainly applied to sequence-type data [42,43,44]; therefore, the 1D-CNN framework was used to automatically process all clinical variables and indicators of the enrolled patients. Compared with the regression prediction model, the 1D-CNN model showed a more accurate prediction capability and illustrates how AI algorithms can be integrated with medical diagnosis. To our knowledge, this is the first application of deep learning to the prediction of CLNM of PTC.

A retrospective analysis of 500 thyroid nodules (maximum nodule diameter ≤ 2.0 cm, TI-RADS classification 4c) examined by ultrasound showed that an anteroposterior diameter of 0.9 cm could be used as a threshold for assessing the risk of metastasis of malignant thyroid nodules [45]. In this study, our results showed that the risk of CLNM was 2.228 times higher in patients with thyroid nodules 10–20 mm in maximum diameter than in patients with thyroid nodules < 10 mm in maximum diameter and 4.795 times higher in patients with thyroid nodules > 20 mm than in patients with thyroid nodules < 10 mm in maximum diameter. Therefore, the larger the maximum diameter of the thyroid nodule is, the higher the rate of CLNM. In clinical practice, patients with thyroid nodules < 10 mm in diameter are usually managed by observation alone, so treatment may be delayed for those who also have CLNM. Because better means of lymph node detection are limited, it is crucial to accurately predict lymph node metastasis in this group of patients. We developed a predictive model with high predictive efficacy for this population that can distinguish between high- and low-risk clinical groups, assist in determining the follow-up treatment, and thus improve the prognosis of patients. The high specificity of the model is also useful for screening subsets of patients for clinical observation.

The subgroup analysis and validation in this study showed that the nomogram prediction model and the CNN prediction model had high specificity in the subclinical metastasis group, suggesting that, in cN0 patients, our models can better identify those who ultimately do not have CLNM, thus enabling these patients to avoid unnecessary PCND and reducing complications. In addition, our prediction models had high sensitivity in the clinical metastasis group, suggesting that the models can better identify those who actually have metastasis among clinically lymph node-positive patients, thus providing a reference for clinical diagnosis.

Multifocal papillary thyroid carcinoma (MPTC) accounts for 20.0%-40.0% of PTCs [46]. Compared with unifocal PTC, MPTC is more malignant, more aggressive and has a higher rate of neck lymph node metastasis. Some studies have shown that the rate of CLNM in MPTC ranges from 23.3% to 58.5%, and there is a high risk of recurrence and poor prognosis after surgery [47]. Tam et al. studied the clinicopathological data of 912 patients with PTC and found that patients with multifocal PTC and total tumor diameters > 1 cm had a significantly higher risk of CLNM than patients with unifocal PTC [48], and some studies found multifocality to be an independent risk factor for CLNM in PTC [49, 50]. In this study, the number of lesions did not significantly differ between the CLNM group and the nonmetastasis group, probably because most of the enrolled patients were unifocal PTC patients and there were more patients with tumor diameters less than 1 cm, which may have produced some bias in the results. Thus, the correlation between multifocality and CLNM in PTC patients still needs to be confirmed in a large sample and multicenter clinical study.

Although we developed a new prediction model to diagnose CLNM in patients with PTC, there are still some limitations. The high rate of BRAF V600E gene mutation may be related to ethnicity or race, and further validation of the BRAF V600E gene mutation in tissues should be performed in surgical specimens. BRAF V600E is not solely responsible for PTC aggressiveness and further research on mutations coexisting with BRAF V600E is needed [51]. In addition, the prediction model of this study needs to be further validated in multicenter and large-sample studies.

In conclusion, we innovatively established a prediction model based on genetic mutations and clinicopathological factors that can be used to evaluate CLNM of PTC with high sensitivity and specificity, thus achieving the ideal combination of molecular and traditional diagnostic methods for clinical application.