Introduction

The routine measurement of serum calcitonin (CT) in patients with nodular/multinodular goiter represents the best method to precociously identify non-clinically evident MTC, very often in an early stage of progression, with a consequent positive impact on prognosis [1]. Although its cost effectiveness has been clearly demonstrated [2], the indication for the routine assessment of CT is still not universally shared and suggested by the scientific societies. In particular, though CT was recommended in the initial work-up of thyroid nodules by the European Consensus published in 2006 [3], more recent ATA and AACE/ACE/AME guidelines do not recommend either for or against routine measurement of serum CT (ATA, AACE) [4, 5] or limit the indication to patients to be submitted to surgery [6]. A very recent systematic review confirms the high sensitivity and specificity of CT testing, still questioning the value of its routine use due to the low prevalence of medullary thyroid cancer (MTC) (0.32% in thyroid nodular disease patients), and to the risk of overdiagnosis of possibly indolent tumors [7]. Nevertheless, since lymph nodal metastases are already present in up to 43% of MTCs ≤ 10 mm and 20% of them fail to be cured [8], the final goal of CT screening is the identification of micro-MTCs. Another reported drawback for the routine CT testing is related to the variability of CT measurement, depending on the laboratory/assay type [9, 10] and to the frequent finding of basal CT levels slightly higher than the normal range, which imply the need to perform a confirmatory stimulation test. The identification of universal and reliable cut-offs, which are still lacking for both basal and calcium stimulated CT (bCT and casCT), will likely support a more accurate diagnostic work up of thyroid nodules. In this context, when calcium (ca) test became the gold standard to evaluate casCt, we performed receiver operating characteristic (ROC) plot analyses showing that the best thresholds for the identification of MTC were > 26 and > 68 for bCT and > 79 and > 544 pg/mL for casCT in females (F) and males (M), respectively [11]. Since then, though these thresholds are currently used in the clinical practice, scanty data have been published. Data on bCT thresholds (> 35 pg/mL for F and > 46 pg/mL for M) have been recently reported to distinguish between non-MTC and MTC cases, though four different CT assay were used, all with different functional sensitivities [12]. In 2018, Niederle’s group reported that, in a series of 62 cases, MTC was predicted in F for bCT > 23 pg/mL or casCT > 780 pg/mL and in M for bCT > 43 pg/mL or casCT > 1500 pg/mL [13]. In 2020, the same Authors found in a larger series that these cut-offs were able to distinguish between a group of patients with definitive diagnosis of MTC and another group including C-cells hyperplasia and microMTC, and reported new cut-offs associated with lateral neck lymph node metastases [14].

Aim of this study was to identify cut-offs able to clearly differentiate between non-MTC and MTC cases in a large series of patients, followed by the same protocols in different Italian Institutions. The identification of reliable cut-offs is predicted to make CT screening more widely included in the initial diagnostic evaluation of thyroid nodules, particularly considering that ultrasonographic (US) features unique to this cancer are lacking. In this context, a large systematic review reported that US patterns of suspicion performed well even in MTC, though extremely specific signs used for the identification of papillary thyroid cancers, such as the taller than wide shape and the microcalcifications, were present in only 11% and 32% of MTCs, respectively [15]. Finally, since one of the hypothesized drawbacks of calcium test is related to a possible effect on cardiac function as reported in one patient [16] and not further confirmed, we recorded all cardiac variations during the test in a subgroup of the present series.

Patients and methods

A total of 54 new patients (33 F and 21 M) affected with uni- or multi-nodular goiter were enrolled, to join their data, for the establishment of more precise cut-offs, with those (n = 81) previously published by our group [11], for a total of 135 cases. Patients carriers of a RET germline mutation were excluded, to avoid the possible interference exerted by diffuse C cells hyperplasia in the determination of the cut-offs. Indeed, aim of the present study was to calculate the best cut-offs for the identification of MTC, being C cells hyperplasia (CCH) considered as a non-MTC result together with benign thyroid diseases (such as follicular adenomas, papillary thyroid cancers, nodular goiters). Patients came from five different Italian Centres (Milan, Padua, Bologna, Udine, and Pavia), all measuring CT using a 2-site automated chemiluminescent immunometric assay (Immulite2000; Siemens Diagnostics), with an analytical sensitivity of 2 pg/mL, and all performing the calcium test as previously described [11]. Briefly, calcium gluconate was administered iv at the dose of 25 mg (2.3 mg or 0.12 mEq of elemental calcium)/kg, and the adjusted body weight was calculated (www.manuelsweb.com/IBW.htmforidealbodyweightandadjustedbodyweightcalculator) for each patient to avoid overdosage. After a basal blood sampling for CT, calcium gluconate was administered iv at 5 mL/min, with a minimum time of administration of 3 min. All the possible causes interfering with the correct CT assessment were excluded and drugs, such as proton-pump inhibitors, withdrawn for 2 weeks. All patients were submitted to ultrasound (US) evaluation and, when appropriate, to cytology.

Patients with CT levels higher than the normal range were submitted to at least an additional control sampling by the above reported assay, and only those with confirmed CT levels above the cut-off were included. Patients enrolled in the present study had nodular/multinodular goiter (MNG) with bCT levels ≥ 10 pg/mL (30 F and 20 M), and underwent surgery according to a surgical and pathological protocol based on International guidelines [9], and shared among the five participating Centers (Table 1). Four patients with bCT levels detectable but < 10 pg/mL (6–9 pg/mL), with huge MNG or suspicious cytology were also enrolled. All cases with a suspicion of CCH/MTC based on previous cut-offs for basal and/or stimulated CT (49/54 cases) were submitted to total thyroidectomy and prophylactic bilateral central neck dissection, whereas the remaining five patients were submitted to total thyroidectomy alone, for tracheal compression and/or for cytological suspicion (indeterminate or suspicious for papillary thyroid cancer). To note, among the 81 patients previously reported, 50 had a bCT ≥ 10 pg/mL and the remaining had a bCT < 10 pg/mL and were submitted to surgery for reasons other than a suspicion of MTC [11]. Patients with thyroid autoimmunity were included both in the previous [11] and in the present study (n = 9 in the previous, n = 7 in the present), since autoimmunity has been demonstrated not to influence CT levels [17].

Table 1 Clinical features of the patients firstly described in the present study

To better evaluate the side effects of calcium test, we examined 133 consecutive cases (74 F and 59 M), submitted to this procedure in the Milan site during 5 years (2014–2018). Moreover, in 67 cases (43 F and 24 M), which underwent calcium test in the last 22 months (April 2018–January 2020), we evaluated data coming from the continuous ECG evaluation pre-, during and post-test.

All patients gave their informed consent to be submitted to the test and to include the results obtained in the present study, which has been approved by the Ethical Committees of the Institutions involved.

Statistical analyses

The correlation between MTC diameter, bCT levels and casCT, and whiskers plot graphs were done by means of a regression analysis and Chi-square test, respectively.

The gender-specific cut-off levels corresponding to the highest accuracy to differentiate between non-MTC and MTC cases were assessed by receiver-operating characteristic (ROC) curves, and their sensitivity and specificity were defined. It is worth noting that the data obtained in the present series were pooled with those already reported by our group [4] to increase the potency of the analysis. In a following step, the obtained cut-offs were used to evaluate if the combination of bCT and CasCT cut-offs could improve the prediction of MTC. Statistical significance was defined as P < 0.05. All statistical analyses were performed using SPSS version 8.0 for Windows and MedCalc Software version 11.6.1 for Windows.

Results

The histological examination showed a MTC in 27/54 (50%) cases, 19/33 (57.6%) females and 8/21 males (38%). All MTCs were pT1 (pT1a in 13 and pT1b in 14 cases), and the majority of them had no regional lymph nodes involved (p N0 in 20 cases and pN1 in 7, i.e., 26% of cases). No patient had distant metastases. In the remaining cases, histology showed a MNG in 10 F and 7 M (associated with CCH in nine cases), a follicular adenoma in 1 F (with CCH), a PTC in 4 F (associated with MTC in one case) and in 6 M, associated with CCH in four cases (Table 1). The US features and the cytological results, when available, of the 27 patients with a final diagnosis of MTC are reported in Table 2. Interestingly, according to the Italian classification of thyroid cytology [18], the majority of the nodules submitted to biopsy were classified as TIR1 (22.2%) or TIR2 (44.4%), while the remaining 33.3% were equally divided into TIR3B, TIR4, and TIR5 cases. At US, the majority of nodules were solid and hypoechoic (77.7 and 76.9%, respectively), while a minority were mixed or spongiform; regular margins were recorded in 64% of cases and microcalcifications were present in 38.5% of nodules. Finally, the majority of nodules were round (65%), while a taller than wide shape was found in only 10% of patients.

Table 2 Ultrasonographic features and cytological results of the 27 patients with a final diagnosis of MTC

The below reported results refer to the combination of these new cases with our previously published [11]. The whole series includes a total of 135 cases (77 F and 58 M), with 47 MTC cases (29 F and 18 M). In particular, MTCs < 5 mm were 3/27 in the new series and 1/20 in the previous one, those ≥ 5 < 10 mm were 9/27 in the new series and 7/20 in the old one.

Correlation between bCT levels and tumour size

The mean and median CT levels were 21.38 and 15 pg/mL (range 2.8–53.7) for tumors < 5 mm, 52.26 and 58.8 pg/mL (range 5.6–126) for 5–10 mm tumors, 227.6 and 121 pg/mL (range 12.9–1860) for tumors ≥ 10 mm (P < 0.001) (data not shown). A strongly significant correlation between tumor size and bCT levels was found either considering all MTCs (R2 = 0.352, P < 0.0001) or only microMTCs (≤ 10 mm) (R2 = 0.377, P = 0.0014) (Fig. 1). On the other hand, the correlation between tumor size and casCT levels was low (R2 = 0.1213, P = 0.01) (data not shown).

Fig. 1
figure 1

the highly significant correlation between basal CT levels and tumor size is shown (upper part); an outlier level (1860 pg/mL) is not represented in the whiskers plot for graphical reasons, but it was included in the statistical analysis. The significant correlation exists also considering only tumors ≤ 10 mm (lower part)

Sensitivity, specificity, positive and negative predictive value, and accuracy for MTC (ROC analysis)

ROC plot analyses were used to find the bCT and casCT thresholds able to differentiate between non-MTC (including nodular goiter associated or not with CCH) and MTC for women and men. The best thresholds for bCT were > 30 pg/mL and > 34 pg/mL, in F and M, respectively (Fig. 2). Interestingly, the same thresholds were obtained by ROC curves after the exclusion from the whole series of cases with bCT ≤ 10 pg/mL (23 F and 15 M) (data not shown). On the other hand, casCT > 79 and > 466 pg/mL, in F and M, respectively, were selected as the most accurate to distinguish non-MTC cases from patients with MTC (Fig. 3).

Fig. 2
figure 2

ROC curve analyses to identify the cut-off levels for basal CT with the highest accuracy to differentiate between non-MTC and MTC in females (left panel) and males (right panel). Non-MTC includes nodular goiter with and without C cells hyperplasia. Abbreviations: TP: true positive, FP: false positive, TN: true negative, FN: false negative cases

Fig. 3
figure 3

ROC curve analyses to identify the cut-off levels for calcium stimulated CT with the highest accuracy to differentiate between non-MTC and MTC in males (left panel) and males (right panel). Non-MTC includes nodular goiter with and without C cells hyperplasia. Abbreviations: TP: true positive, FP: false positive, TN: true negative, FN: false negative cases

Combined cut-offs

After the identification of the most accurate cut-offs, we aimed to verify if the combination of bCT and casCT could improve the diagnostic performance for MTC identification (Figs. 4, 5). Interestingly, all cases with both bCT and casCT levels below the thresholds for MTC determination were found at histology with a benign nodular diseases, associated with CCH in 29 and 50% of F and M cases, respectively. On the contrary, the majority (88 and 89% for females and males, respectively) of patients with bCT and casCT higher than the cut-offs established in the present study, had a MTC at histology. In both situations, a high specificity was obtained, ranging 94–100%. Since all patient with a bCT higher than the threshold had a casCT over the cut-off, but one male who had a bCT > 34 and CasCt < 466 pg/mL, the accuracy of bCT was almost superimposable to that found for bCT combined with CasCT.

Fig. 4
figure 4

prevalence of MTC and goiter not associated (benign) or associated with C cells hyperplasia (CCH) by preoperative basal CT and calcium stimulated CT levels, alone or in combination, in female patients who underwent total thyroidectomy. * for the identification of non-MTC cases; ° for the identification of MTC cases

Fig. 5
figure 5

prevalence of MTC and goiter not associated (benign) or associated with C cells hyperplasia (CCH) by preoperative basal CT and calcium stimulated CT levels, alone or in combination, in male patients who underwent total thyroidectomy. * for the identification of non-MTC cases; ° for the identification of MTC cases. The number of patients with MTC or non-MTC are also reported

Nevertheless, seven female MTCs (all pT1aN0, tumor size ranging 2.3–13 mm) and two male MTCs (all pT1bN0, tumor sizes 16 and 17 mm) would have been misdiagnosed by considering bCT over the threshold either alone or in combination with a casCT higher than the cut-off, since they had bCT below and casCT above the cut-off. Thus, the important finding of this study is that, by combining bCT below or above the cut-offs with casCT above the cut-offs, all the MTC cases were correctly identified, either in F and M patients.

Prevalence of CCH and hypercalcitoninemia in non-MTC cases

We evaluated the prevalence of CCH in non-MTC cases, i.e., those with a final diagnosis of nodular goiter or PTC. In the whole series, CCH was found to associate with PTC in 45.7% of cases (6/18 F and 10/17 M), and with MNG in 73.8% of cases (17/25 F and 14/17 M) (P = 0.31). As far as CT levels concerns, in females, mean bCT ± SD levels were 19.12 ± 29.81 and 26.82 ± 68.54 pg/mL in patients with a final diagnosis of PTC and in those with a final diagnosis of MNG, respectively (P = 0.59), while casCT ± SD levels were 249.55 ± 545.31 for those with PTC and 264 ± 287 pg/mL for those with MNG (P = 0.9). In males, mean bCT ± SD levels were 17.65 ± 19.39 and 19.43 ± 18.63 pg/mL in patients with PTC and MNG, respectively (P = 0.77), while casCT ± SD levels were 335.2 ± 324.9 for those with PTC and 394.5 ± 434.2 pg/mL for those with MNG (P = 0.64).

Evaluation of calcium test safety

To assess the safety of calcium test, in the subgroup of patients from Milan, all the adverse events were recorded during the test, and possible rhythm alterations were considered before and during the test (Supplemental Figure 1). The most frequent side effect recorded was the feeling of warmth (57% of cases), whereas other side effects were present in a minority of cases. Concerning the cardiological status of the patients before being submitted to calcium test, a sinus rhythm was recorded in the majority of cases (67.2%), a sinus bradycardia was present in 16.4% of patients, while other rhythm alterations were found in the minority of cases (1.5–4.5%). Interestingly, at the end of the test, the percentage of patients with a sinus rhythm significantly decreased (58.2%) due to an increase of patient with a sinus bradycardia (25.4%).

Finally, it is worth to note that the test was performed without side effects even in patients with hypertension (20%), with previous myocardial infarction (3.7%), with deep vein thrombosis, previous stroke, stenotic carotid artery, history of atrial fibrillation (0.7–3%).

Discussion

The present study, including a novel multicentric series, provides data on the most accurate bCT and casCT cut-offs for the preoperative identification of subjects with MTC. These findings represent a refinement of those reported by our group in 2014 in a more limited number of cases [11]. After pooling data, for a total of 135 patients, ROC curve analyses were used to compare the preoperative bCT and casCT levels with the histological findings. The bCT cut-offs able to separate non-MTC from MTC patients were > 30 in F and > 34 pg/mL in M. Interestingly, a strong correlation was found between tumor size and bCT, either considering all tumors or limiting to micro-MTCs. The best CasCT thresholds for the identification of MTC were > 79 and > 466 pg/mL for women and men, respectively. The casCT thresholds are equal or similar to those previously reported (> 79 and > 544 pg/mL, F and M) [11], whereas present bCT cut-offs differ from the previous report (> 26 and > 68 pg/mL, F and M) [11], particularly in males.

After the establishment of these new cut-offs, we evaluated their diagnostic power either alone or in combination. Similar to our previous data [11] and a recent report [14], bCT was shown to have an accuracy superimposable to that derived from the combination of bCT and casCT, indicating that, in both F and M, bCT values are extremely good predictors of MTC, and suggesting that serum CT assays with improved functional sensitivity may avoid the stimulation test in several conditions. Nevertheless, in the present series, we recorded patients for whom MTC could be diagnosed only upon calcium test results.

Present cut-offs have been established to discriminate MTC cases from non-MTC cases (i.e., follicular adenomas, multinodular goiters, PTCs), independently from their possible association with CCH. Indeed, CCH was found to be highly prevalent in both goiters and PTC cases, without significant differences, and is responsible for the bCT levels above the normal range recorded in these cases. We previously demonstrated that CCH is frequent nearby benign or PTC nodules likely due to a paracrine effect on C cells [19]. Nevertheless, the real impact of PTC on bCT levels is still debated and has been poorly explored. In a study including almost 500 patients, no significant differences in the prevalence of basal hypercalcitoninemia was found between patients with or without PTC [17], but stimulated data were not available. In our whole series, we found no significant differences in either bCT or casCT levels between patients with PTC or benign diseases at histology. Thus, hypercalcitoninemia seems not to be indicative of PTC and, when found in patients with thyroid nodules, it should be regarded as being suggestive of MTC.

The cut-offs defined in the present series are different with respect to those recently reported by another group [13], particularly related to stimulated data, which are higher and that would have led to the loss of the majority of our MTC cases (data not shown). This is consistent with the lower sensitivity that Niederle et al. found for both bCT and casCT with respect to our findings [14]. Although this evidence shows that the direct comparison of results obtained in different laboratories is difficult, making it challenging to find universal CT cut-offs, a possible explanation of this discrepant data could reside in the different population examined, being the tumors with pT > 1 higher in Niederle’s than in our series (16 vs 2%). In this context, since lymph nodal metastases are already present in up to 43% of MTCs ≤ 10 mm [8], and the aim of the routine CT measurement is to perform an early surgical intervention to improve cure rates [20], we believe that our cut-offs are more appropriate to reach that goal. Moreover, we tried to overcome the problem of comparability by pooling data from different Institutions using the same assay. Since the sensitive method used in the present study is widely diffused, and the calcium test is now well standardized, present cut-offs are likely to be reliably used in many different Institutions. We propose a diagnostic and therapeutic decision based only on the acquirement of a bCT and a casCT. Due to the variability in CT measurements among commercial assays [9, 10], bCT should be assayed by a sensitive method, and repeated once in the absence of interfering factors. This strategy may be preferable and more cost effective than the basal re-evaluation at intervals of 3–6 months suggested for patients with bCT values falling in the grey zone [21].

The high tolerability of the Ca test was confirmed, with minor side effects mostly consisting of a brief feeling of warmth. The test was safely performed also in patients with previous diseases, such as myocardial infarction o stroke, or atrial fibrillation, and the only effect observed was the onset of sinus bradycardia in 9% of patients during the test, with a prompt recovery of the basal rate at the end of the procedure.

Finally, US evaluation showed that in our series MTC nodules are frequently solid and hypoechoic, with a round shape and regular margins, consistent with data reported in a large meta-analysis [15], and confirming that suspicious sonographic features are less often present in MTCs than in PTCs, with the exception of hypoechogenicity. Moreover, the low diagnostic accuracy of cytology in MTC [9, 10] has been confirmed in the present series, being the majority of our cases classified as inadequate or benign.

In conclusion, we report accurate cut-offs for basal and calcium stimulated CT in patients with nodular goiter for the clear separation between non-MTC and MTC cases. Although basal CT cut-offs have been shown to identify most cases, the combination with a casCT is needed, if the aim is to diagnose all MTCs. The reliability and safety of calcium test support the routine use of CT determination in nodular thyroid disease, particularly considering that MTCs do not present a unique US pattern and can be misdiagnosed at cytology.