Thyroid nodules are highly prevalent in the general adult population, with a detection rate of 19–67% during routine ultrasound examinations [1]. An epidemiological study showed that approximately 5–15% of these nodules are malignant [2]. Despite the high incidence of thyroid malignancy, most patients referred for suspected nodules have benign conditions. Overestimating malignancy leads to unnecessary procedures and places a burden on both society and patients. Therefore, reliable preoperative differentiation of benign from malignant thyroid nodules is required.

To date, the Thyroid Imaging Reporting and Data System (TIRADS) and the American Thyroid Association guidelines are considered the main criteria for determining malignancy and are generally followed by radiologists in practice [3]. However, these categorization systems were established based on fine needle aspiration (FNA) cytology results that included data from nodules > 1 cm. In addition, a few reports have presented serum thyrotropin (TSH) and positive thyroid autoantibodies as possible predictors of thyroid malignancy [4, 5]. However, these guidelines and studies either used FNA cytology results for their final diagnoses, which are less reliable than those confirmed via surgical inspection, or included a relatively small number of patients. Additionally, most studies to date have focused on a single type of risk factor (clinical, biochemical or radiological), and only a few have analyzed these risk factors in combination. A robust predictive model involving easily accessible clinical, laboratory and radiological risk factors may serve as a pragmatic aid in making decisions regarding malignancy differentiation.

In the present study, we reviewed a large cohort of 2984 patients in China who underwent thyroid surgery and had final pathological data available. The purpose of our study was to verify the independent risk factors of clinical, laboratory and ultrasonographic (US) features in patients with thyroid carcinomas and to establish a predictive model for determining malignancy that can be used by clinical practitioners.



We retrospectively studied the data from 3145 consecutive patients, most of whom received routine neck ultrasound examinations, who underwent total or partial thyroid surgery between 2006 and 2009 at four tertiary hospitals in China. Patients with a previous thyroid surgery or radiation ablation and patients who were taking thyroxine or antithyroid drugs were not included. Patients with medullary thyroid cancer, anaplastic cancer or lymphoma were considered TSH-nonresponsive and were excluded. After the exclusions, 2984 patients were included in the analysis. Their clinical, laboratory, and US variables were assessed retrospectively. This study had institutional review board approval.

US imaging analysis

US examinations at the four tertiary hospitals were performed using a GE LOGIQ9 US scanner (USA) equipped with a 5–12-MHz linear transducer for morphological examinations and a 4.7-MHz transducer for color Doppler evaluations. The examinations were conducted and recorded by two skilled sonographers from the respective hospitals according to a standard procedure, and the two observers reached agreement on each US finding. The following US parameters of the nodules were recorded: (1) number of nodules, (2) nodule size, (3) echoic texture, (4) echogenicity, (5) shape, (6) margin, (7) calcification (microcalcification, macrocalcification, or egg-shell calcification) and (8) intranodular central flow.

Laboratory variables

The levels of serum TSH, free triiodothyronine (FT3) and free thyroxine (FT4) were determined using a Roche Cobas E601 chemiluminescence analyzer (Switzerland) and the matched kit. The normal ranges were 0.35 to 5.5 μIU/ml for TSH, 11.5 to 22.7 pmol/l for FT4 and 3.5 to 6.5 pmol/l for FT3. If the other laboratories had different normal ranges, the values were adjusted to reflect the same normal range. Anti-thyroid peroxidase antibody (TPOAb, reference value < 60 μIU/ml) and anti-thyroglobulin antibody (TGAb, reference value < 60 IU/ml) levels were measured using immunometric assays. Thyroid antibody levels above the upper reference limit were considered positive.
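The text does not state how values from laboratories with different normal ranges were adjusted. One common approach, shown here purely as an assumption, is a linear rescaling that maps the source laboratory's lower and upper limits onto the reference limits quoted above. The function name and the example source range are hypothetical.

```python
def rescale_to_reference(value, src_low, src_high, ref_low, ref_high):
    """Linearly map a value from a source lab's normal range onto a reference range.

    Assumption: linear rescaling. The study does not describe its adjustment method.
    """
    if src_high <= src_low:
        raise ValueError("source range must satisfy src_low < src_high")
    # Position of the value within the source range (0 = lower limit, 1 = upper limit).
    fraction = (value - src_low) / (src_high - src_low)
    return ref_low + fraction * (ref_high - ref_low)


# Hypothetical example: a lab reporting TSH with a 0.4-4.0 normal range,
# mapped onto the 0.35-5.5 range used in this study.
adjusted_upper = rescale_to_reference(4.0, 0.4, 4.0, 0.35, 5.5)
adjusted_mid = rescale_to_reference(2.2, 0.4, 4.0, 0.35, 5.5)
```

Under this scheme a value at the source laboratory's upper limit lands exactly on the reference upper limit, so "above normal" keeps the same meaning across centers.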


FNA cytology was not generally performed or considered a routine preoperative assessment when the study was conducted. Postoperative histopathologic evaluations were performed by pathologists experienced in thyroid pathology. The histopathologic results of the operated patients were grouped as either malignant or benign.

Statistical analysis

Descriptive statistics are presented as the means ± standard deviations for continuous variables and as the number of patients and percentages for categorical variables. Differences between independent groups for continuous variables were evaluated using a Student’s t-test or a Mann–Whitney U-test, where applicable. Categorical data were analyzed using Pearson’s chi-square test. Univariate and multivariate logistic regression analyses were performed to evaluate the association between malignancy and risk factors. Receiver operating characteristic (ROC) curve analyses were performed to examine the predictive power of combinations of clinical, laboratory and sonographic features. The areas under the curves (AUCs) were derived from the ROC curves. The Youden index was used to define the optimal cut-off value [6]. All statistical analyses were performed using SPSS version 17.0 (SPSS, Inc., Chicago, IL). Differences between AUCs were detected using Delong’s test [7]. A p-value of < 0.05 was considered statistically significant.
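As an illustration of the ROC-based steps described above, the following minimal sketch computes an AUC and a Youden-index cut-off in plain Python. It is not the SPSS procedure used in the study; the function names and the toy scores and labels are hypothetical.

```python
def roc_points(scores, labels):
    """Return (threshold, sensitivity, specificity) for each distinct score.

    A case is called positive when score >= threshold.
    Labels are 1 (malignant) or 0 (benign).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for thr in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < thr and y == 0)
        points.append((thr, tp / n_pos, tn / n_neg))
    return points


def youden_cutoff(scores, labels):
    """Return the ROC point maximizing J = sensitivity + specificity - 1."""
    return max(roc_points(scores, labels), key=lambda p: p[1] + p[2] - 1)


def auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, not taken from the study.
toy_scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 1, 0, 0, 1, 1, 1]
best_threshold, best_sens, best_spec = youden_cutoff(toy_scores, toy_labels)
toy_auc = auc(toy_scores, toy_labels)
```

The pairwise definition of the AUC is used here because it makes the interpretation explicit: the AUC measures how well the score ranks malignant above benign cases, independent of any single cut-off.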


Clinical characteristics

The study cohort consisted of 541 men and 2443 women. Overall, 2460 patients were diagnosed with pathologically benign nodules, and 524 patients were diagnosed with malignant nodules. The malignancy rate in our study was 17.6%. Most of the nodules were detected incidentally during routine health check-ups; in total, 10.5% of the patients presented with clinical symptoms such as hoarseness, swallowing difficulty or thyroid enlargement, with the duration of symptoms varying from 7 days to 26 years. As shown in Table 1, there was no difference in the sex ratios between the patients with benign and malignant nodules. Patients with malignant nodules were younger than those without malignant nodules (43.5 ± 11.6 years vs. 48.5 ± 11.5 years, p < 0.001) (Table 1).

Table 1 Clinical characteristics of 2984 subjects with thyroid nodules

The mean maximal diameter of malignant nodules was significantly smaller than that of benign nodules (1.96 ± 1.16 cm vs. 2.75 ± 1.70 cm, p < 0.001). The prevalence of solitary nodules in malignant cases was not different from that in benign cases (29.0% vs. 25.1%, p = 0.109).

Laboratory values

As shown in Table 2, there were no significant differences in FT3 and FT4 values between the two groups. The TSH level was higher in the malignant group than in the benign group [median (IQR): 1.63 (0.89–2.66) mIU/L vs. 1.19 (0.59–2.10) mIU/L, p < 0.001]. Subsequently, based on the cutoff values predetermined in population studies, TSH levels were divided into five categories: below normal (< 0.35 mIU/L), above normal (> 5.5 mIU/L), and within normal, with the latter divided into three groups of similar size (0.35–0.99 mIU/L, 1.0–2.49 mIU/L, and 2.5–5.49 mIU/L). The prevalence of malignancy was 9.8% when TSH levels were less than 0.35 mIU/L, compared with 13.2% when TSH levels were 5.5 mIU/L or greater (p = 0.17). Within the normal range, a higher rate of malignancy was observed in patients with higher TSH levels. The prevalence of malignancy was 15.8% when TSH levels were between 1.0 and 2.49 mIU/L and 24.4% when TSH levels were between 2.50 and 5.49 mIU/L, compared with 12.6% when TSH levels were between 0.35 and 0.99 mIU/L (p = 0.09 and p < 0.001, respectively) (Fig. 1).
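The TSH grouping described above can be sketched as a simple binning function followed by a per-category prevalence count. The function names and the toy records are hypothetical. Placing a value of exactly 5.5 mIU/L in the top category is an assumption, since the text gives that boundary both as "> 5.5" and as "5.5 mIU/L or greater".

```python
from collections import defaultdict


def tsh_category(tsh_miu_per_l):
    """Assign a TSH value (mIU/L) to one of the five categories used in the text."""
    if tsh_miu_per_l < 0.35:
        return "<0.35"
    if tsh_miu_per_l < 1.0:
        return "0.35-0.99"
    if tsh_miu_per_l < 2.5:
        return "1.0-2.49"
    if tsh_miu_per_l < 5.5:
        return "2.5-5.49"
    return ">=5.5"  # assumption: exactly 5.5 falls in the top category


def prevalence_by_category(records):
    """records: iterable of (tsh, is_malignant). Returns {category: percent malignant}."""
    counts = defaultdict(lambda: [0, 0])  # category -> [malignant, total]
    for tsh, is_malignant in records:
        category = tsh_category(tsh)
        counts[category][0] += int(is_malignant)
        counts[category][1] += 1
    return {cat: 100.0 * m / n for cat, (m, n) in counts.items()}


# Toy records, not taken from the study.
toy_records = [(0.2, False), (0.8, False), (0.9, True), (1.7, False),
               (3.1, True), (3.4, True), (6.0, False)]
toy_prevalence = prevalence_by_category(toy_records)
```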

Table 2 Laboratory variables of subjects with thyroid nodules
Fig. 1 Prevalence of malignancy in relation to the serum TSH concentration, indicating an increased prevalence in patients with higher TSH levels. **P < 0.05, compared with patients with TSH levels less than 0.35 mIU/L

Patients with malignant nodules had positive TGAb and TPOAb results more frequently than did patients with benign nodules (for TGAb, 30.3% vs. 15.0%, p < 0.001; for TPOAb, 25.6% vs. 18.0%, p = 0.028).

Sonographic features

The prevalences of an irregular shape (42.7% vs. 10.7%, p < 0.001), an ill-defined margin (38.7% vs. 9.7%, p < 0.001), a solid structure (75.8% vs. 41.3%, p < 0.001), hypoechogenicity (68.5% vs. 27.1%, p < 0.01), microcalcification (48.5% vs. 13%, p < 0.001), macrocalcification (18.5% vs. 12.5%, p = 0.001), and intranodular central flow (60.3% vs. 47.1%, p < 0.001) were significantly higher in malignant nodules than in benign nodules (Table 3). There were no differences between the benign and malignant groups for egg-shell calcifications (p > 0.05).

Table 3 Sonographic features of subjects with thyroid nodules

Clinical, biochemical and sonographic characteristics of microcarcinoma

Of the 524 malignant nodules, 104 nodules ≤ 1 cm in diameter were defined as microcarcinomas. Since microcarcinoma is considered “more silent”, we analyzed its clinical, biochemical and sonographic parameters separately. As shown in Additional file 1: Table S1, age, a positive TGAb result, hypoechogenicity, microcalcification and intranodular central flow were also associated with an increased risk of malignancy in the nodules less than 1 cm in diameter.

The associations between risk factors and the presence of malignant nodules

We further explored the association of clinical characteristics, laboratory values and US features with the risk of malignant nodules via univariate analysis, which gave results consistent with those from the prevalence analysis (data not shown). Multivariate analysis confirmed that age was significantly and inversely associated with the risk of thyroid malignancy (OR 0.963, 95% CI 0.934–0.993, p = 0.017) (Table 4). Additionally, a positive TGAb result, hypoechogenicity, microcalcification and intranodular central flow were independently associated with an increased risk of malignant nodules (TGAb OR 4.435, 95% CI 1.902–10.345, p = 0.001; hypoechogenicity OR 2.830, 95% CI 1.113–7.195, p = 0.029; microcalcification OR 4.624, 95% CI 2.008–10.646, p < 0.001; central flow OR 2.155, 95% CI 1.011–4.594, p < 0.05).
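To relate the odds ratios above to logistic regression coefficients, recall that each coefficient is the natural logarithm of its OR. The sketch below converts a reported OR and 95% CI to the log-odds scale. It assumes the intervals are Wald intervals, symmetric on the log scale, which the reported values are consistent with but which the text does not state. The function name is hypothetical.

```python
import math


def log_odds_summary(odds_ratio, ci_low, ci_high, z=1.96):
    """Convert an odds ratio and its 95% CI to a coefficient and standard error.

    Assumption: the CI is a Wald interval, exp(beta +/- z * se).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


# Values reported in the multivariate analysis above.
age_beta, age_se = log_odds_summary(0.963, 0.934, 0.993)
tgab_beta, tgab_se = log_odds_summary(4.435, 1.902, 10.345)
```

For age, the OR of 0.963 corresponds to a coefficient of about −0.038 per unit of age in this five-factor multivariate model.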

Table 4 Multivariate logistic regression of risk factors for the presence of thyroid malignancy

The performance of independent risk factors—A mathematical model to predict malignancy

To evaluate the predictive power of combinations of clinical characteristics, laboratory values and US features and to establish a mathematical model to calculate the risk of malignancy, a series of ROC curve analyses were performed, and AUCs were calculated. When the factors age, TGAb, hypoechogenicity and microcalcification were combined, the AUC reached its best value of 0.808 (95% CI 0.761–0.855), indicating good discrimination between malignant and benign nodules (Fig. 2). By combining these four independent risk factors of malignancy, we established the following formula for a predictive model:

Fig. 2 ROC curve for cancer prediction with a discrimination accuracy (AUC) of 0.808, 95% CI 0.761–0.855

$$p = \frac{\exp(z)}{1 + \exp(z)}, \qquad z = -0.963 - 0.4 \times \text{age} + 1.108 \times \text{TGAb} + 1.441 \times \text{microcalcification} + 1.722 \times \text{hypoechogenicity}$$

The best cut-off value was calculated as 0.52, with a sensitivity of 84.6% and a specificity of 76.3%.
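The predictive model and cut-off can be sketched as a small function. The coefficients and the 0.52 cut-off are copied from the text as printed. How age is coded (for example in years or as a category indicator) is not stated alongside the formula, so the function passes through whatever numeric value the caller supplies. The 0/1 age values in the example are purely an assumption for illustration, as are all function and variable names.

```python
import math

# Coefficients as printed in the formula above.
INTERCEPT = -0.963
COEF_AGE = -0.4
COEF_TGAB = 1.108
COEF_MICROCALCIFICATION = 1.441
COEF_HYPOECHOGENICITY = 1.722
CUTOFF = 0.52  # Youden-index cut-off reported in the text


def malignancy_probability(age, tgab_positive, microcalcification, hypoechogenicity):
    """Return p = exp(z) / (1 + exp(z)) for the linear predictor z.

    `age` is used as given; its coding is not specified in the source.
    The three remaining inputs are treated as 0/1 indicators.
    """
    z = (INTERCEPT
         + COEF_AGE * age
         + COEF_TGAB * int(tgab_positive)
         + COEF_MICROCALCIFICATION * int(microcalcification)
         + COEF_HYPOECHOGENICITY * int(hypoechogenicity))
    # Numerically stable logistic function (avoids overflow for large |z|).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    return math.exp(z) / (1.0 + math.exp(z))


def predicted_malignant(probability, cutoff=CUTOFF):
    """Classify as malignant when the probability reaches the cut-off."""
    return probability >= cutoff


# Illustrative calls only; the 0/1 age coding is an assumption, not the study's.
p_high = malignancy_probability(age=0, tgab_positive=True,
                                microcalcification=True, hypoechogenicity=True)
p_low = malignancy_probability(age=1, tgab_positive=False,
                               microcalcification=False, hypoechogenicity=False)
```

With the printed coefficient of −0.4, entering age in years would drive the probability close to zero for any adult, so the coding of age should be confirmed against Table 4 before the model is applied.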


In this study, we verified risk factors associated with thyroid malignancy after comprehensively evaluating clinical, laboratory and sonographic variables in a population of 2984 patients who underwent thyroidectomy. Subsequently, we developed a mathematical model for cancer prediction, thereby providing a practical tool for clinicians to distinguish thyroid nodules preoperatively.

In agreement with previous studies, we identified younger age as one of the independent risk factors for thyroid cancer [8]. Malignant nodules were smaller than benign nodules (1.96 ± 1.16 cm vs. 2.75 ± 1.70 cm, p < 0.001). However, our multivariate logistic analysis did not confirm a predictive role of nodule size. This discrepancy suggests that smaller nodules do not necessarily carry a higher risk of malignancy; rather, patients with larger nodules are more likely to undergo surgery for benign reasons, such as compressive symptoms, whereas patients with smaller nodules without any suspicious sonographic findings often choose conservative follow-up.

Higher TSH values, even within normal ranges, have been associated with a higher prevalence of thyroid malignancy in some studies [4, 5, 9, 10]. The results of our study are in agreement with those of previous studies, except that TSH levels higher than 5.5 mIU/L were not associated with a further increase in the prevalence of malignancy. This difference may be due to selection bias because we excluded patients who were taking thyroxine; therefore, the number of patients with TSH levels > 5.5 mIU/L would have been quite small. However, in our study TSH lost its diagnostic value after being included in the multivariate logistic regression analysis, probably because its weak predictive effect was masked by the other covariates. Elevated TGAb, but not TPOAb, levels were a significant predictor of thyroid cancer, which is consistent with the findings of other reports [11,12,13,14]. Consistently, our study confirmed that lymphocytic thyroiditis was more frequent in patients with malignant nodules (Additional file 2: Table S2). Additionally, our data confirmed that patients with thyroiditis had positive TGAb results more frequently than patients without thyroiditis (63.9% vs. 13.0%, p < 0.001).

Numerous studies have investigated the role of US findings in the diagnosis of malignant nodules [1, 15,16,17]. These studies state that hypoechogenicity, microcalcification, thyroid nodules with irregular margins, and intranodular vascularity are important features in determining the risk of malignancy. However, Cappelli et al. showed that an ill-defined margin was a nonspecific finding that could be seen for both benign and malignant nodules [18]. Consistent with these previous findings, we confirmed that microcalcifications, hypoechogenicity and intranodular central flow were associated with increased risks of malignancy. Our study did not find an association between egg-shell calcification and malignancy. Peripheral-rim or eggshell calcification has generally been considered to be an indicator of a benign nodule. However, a recently published study of thyroid nodules with eggshell calcifications reported that the findings of a peripheral halo and disruption of eggshell calcifications may be useful predictors of malignancy [19, 20]. Further studies are needed to confirm this observation.

Previously, some researchers have reported several systems for malignancy assessment [21,22,23,24,25]. Stojadinovic et al. established a model based on the performance of electrical impedance scanning (EIS), which was not routinely scheduled in clinics [21]. Zahir et al. presented a complicated two-step predictive model that was less accessible for clinicians [22]. Koike et al. included US features alone for differentiating non-follicular neoplasms > 5 mm [23]. Maia et al. evaluated malignancy risk based on patients from a single center [24]. Banks et al. analyzed 639 patients and established a diagnostic model using the variables age, nodule size and FNA cytology [25]. In contrast to these previous reports, in this study we enrolled 2984 patients from multiple tertiary medical centers, which greatly strengthens the evidence for diagnostic evaluations. Additionally, our mathematical model is derived from a combination of easily accessible clinical, biochemical and sonographic predictors, which improves its feasibility and practical appeal, thereby helping clinicians with decision making and reducing unnecessary invasive procedures.

In addition, we analyzed predictive variables based on postoperative pathological inspections instead of FNA cytology examinations. Although FNA is considered to be an accurate and cost-effective method for evaluating thyroid nodules with a high diagnostic sensitivity and specificity [26], there are some limitations to diagnostic FNAs. First, FNA is recommended for nodules > 1 cm at their greatest dimension with a highly or intermediately suspicious sonographic pattern and for nodules > 1.5 cm at their greatest dimension with a minimally suspicious sonographic pattern [3]. Nodules smaller than 1 cm are difficult to distinguish via FNA cytology. Second, the performance of FNA is largely affected by the experience of radiologists, and the quality of the FNA procedure may affect the results. Reflecting these limitations, a number of previous studies have analyzed risk stratification based on FNA diagnoses [4, 26, 27] and have shown that it is less reliable than postoperative pathological examinations, which were used in our study.

However, there are some limitations to this study. The US feature of a nodule being taller than it is wide is considered to be a reliable indicator of thyroid malignancy. Unfortunately, these data were not available for the majority of the patients; therefore, this parameter was not included in the analysis. An algorithm including this US feature might improve the diagnostic accuracy of the predictive model in our study. Although less convincing than operative confirmation, FNA cytology is a relatively effective and robust method for identifying malignancies. Unfortunately, due to limitations relating to the skill with which FNAs are performed and a lack of compliance by patients, FNAs were not routinely performed for suspicious thyroid nodules in this study. Lastly, our study is retrospective, and prospective studies in a larger patient population are required to refine and verify this risk prediction model and to improve clinical management.


In summary, we analyzed 2984 patients who underwent thyroidectomy from multiple tertiary medical centers and established a practical model for predicting malignancies using a combination of simple and accessible clinical, biochemical and sonographic predictors. Prospective studies are required to validate this predictive model in a larger population.