Shear wave elastography and Afirma™ gene expression classifier in thyroid nodules with indeterminate cytology: a comparison study

Purpose To compare shear wave elastography (SWE) and Afirma™ gene expression classifier (GEC) for diagnosis of malignancy in thyroid nodules (TNs) with Bethesda Classification (BC) III or IV indeterminate cytology. Methods This preliminary single-center prospective study was approved by the Institutional Review Board. We evaluated 151 consented patients with 151 indeterminate TNs (123 BC III, 28 BC IV) on fine-needle aspiration biopsy (FNAB). B-mode ultrasound, vascularity, and SWE were performed prior to FNAB. TN stiffness was measured as shear wave velocity (SWV) in meters per second (m/s). The stiffest area of the TN was selected for SWV measurement. GEC testing was performed with a second FNAB. Surgery was recommended for GEC-suspicious TNs, or GEC-benign TNs with two or more worrisome B-mode US features. Results Surgical pathology confirmed 31 malignant TNs. Among the GEC-suspicious group, 28 of 59 TNs were malignant. An SWV value of ≥3.59 m/s was the best cut-off for malignancy risk based on the receiver operating characteristic (ROC) curve. Twenty-six malignant TNs had SWV ≥ 3.59 m/s. The sensitivity and specificity for SWV ≥ 3.59 m/s were 83.9 and 79.2%, respectively. Positive predictive value (PPV) was 51.0% and negative predictive value (NPV) was 95.0%. For the GEC-suspicious group, sensitivity, specificity, PPV, and NPV were 90.3, 74.2, 47.5, and 96.7%, respectively. In multivariate analysis, SWV and GEC-suspicious results were significant predictors of malignancy, but B-mode features and vascularity were not. Conclusion This preliminary study indicates that SWE and GEC are independent predictors of malignancy in TNs with BC III or IV.


Introduction
Management of thyroid nodules (TNs) with indeterminate fine-needle aspiration biopsy (FNAB) has been a major challenge for thyroid experts. The Bethesda System for the Reporting of Thyroid Cytopathology renders standardized reporting criteria for TN FNAB cytopathology. Bethesda categories (BC) include: I (non-diagnostic or unsatisfactory), II (benign), III (atypia of undetermined significance or follicular lesion of undetermined significance), IV (follicular neoplasm or suspicious for follicular neoplasm), V (suspicious for malignancy) or VI (malignant) [1]. Indeterminate cytology categories include BC III and BC IV.
Approximately 15-30% of TN aspirations are interpreted as indeterminate, but the majority have benign surgical pathology. Among these patients, the risk for serious surgical complications ranges from 2 to 10% [2].
Prior to molecular testing, surgical resection was recommended for TNs with two indeterminate FNABs [1]. The risk stratification process has become more refined since molecular testing was introduced.
The use of molecular markers in indeterminate thyroid FNAB specimens improves diagnostic accuracy and subsequently influences the decision to perform surgery as well as the extent of the procedure. It is important to note that long-term outcome data on the true value of this modality to guide therapeutic decision-making are lacking [3].
In the setting of indeterminate FNAB, some molecular tests have a high positive predictive value (PPV) in predicting malignancy [4], while others have a high negative predictive value (NPV) capable of predicting benign histology [2]. The combination of high NPV and sensitivity can be helpful to avoid surgery.
Several studies that used the Gene Expression Classifier (GEC) methodology for indeterminate TNs ≥ 10 mm reported a risk for thyroid cancer (TC) ranging from 13 to 47% [2,5-7]. Practical limitations of molecular testing include cost and the invasive nature of performing FNAB to obtain tissue samples for GEC or other molecular testing.
Elastography is a recent and non-invasive technology used to differentiate benign TNs from TC through tissue stiffness measurement [8]. Several recent publications report shear wave elastography (SWE) as an independent predictor of TC [9][10][11][12].
Virtual touch imaging quantification (VTIQ) is a 2D-SWE technology generated by acoustic radiation force impulse. VTIQ is capable of creating a shear wave image and subsequent tissue quantification in one display. This allows for identification of regions for measurement of tissue stiffness [13]. The diagnostic performance of SWE in indeterminate TNs has not been evaluated in comparison to the GEC standard. We previously reported that TN stiffness measured by VTIQ-generated SWE is an independent predictor of TC when TNs with all Bethesda classifications were included. Based on the ROC curve, a single cut-off at 3.54 m/s had a sensitivity, specificity, PPV, and NPV of 79.27, 71.52, 26.75, and 96.34%, respectively [11]. Other predictors for TC were microcalcifications and irregular margins.
The goal of this study was to prospectively compare the performance of SWE with GEC in TNs with indeterminate FNAB. Other B-mode ultrasound (US) characteristics and Color Doppler (CD) vascularity patterns were also examined and included in TC risk stratification.

Study design
This single-center prospective study was approved by the Institutional Review Board. Patients were evaluated between April 2014 and October 2016. All participants gave written informed consent in this Health Insurance Portability and Accountability Act compliant study. All patients were examined clinically and with US prior to FNAB by a single practitioner with more than 15 years of experience in thyroid and neck US and 12 months of experience using SWE.

Study population and protocol
We prospectively evaluated 1520 patients with 1674 TNs from April 2014 to October 2016. All TNs were evaluated with high-resolution US and FNAB. A total of 221 TNs had BC III or IV; of those, 160 had GEC testing. Fifty-nine individuals with 61 TNs did not have GEC testing for a variety of reasons, including lack of insurance coverage for GEC testing or deciding for or against surgery without additional testing. This manuscript presents the data on TNs with GEC testing only.
Inclusion criteria included (a) age of 18 years and older; (b) TNs ≥ 10 mm or ≥5 mm with at least one suspicious US feature; (c) indeterminate FNAB results (BC III or IV) and a subsequent second FNAB with GEC methodology; and (d) surgical resection for those with GEC-suspicious results, or with GEC-benign results and two or more suspicious B-mode US findings (irregular margins, microcalcification, central vascularity, tall shape, and hypoechoic pattern). All surgical candidates had cervical neck mapping and FNAB of lymph nodes, if indicated, prior to surgery. Nine patients were excluded: (a) five refused surgery after GEC-suspicious results; (b) two had non-diagnostic GEC testing and refused repeat testing (initially four patients had non-diagnostic GEC testing, of whom two had a successful second GEC); and (c) in two, the second FNAB yielded BC V or VI results and GEC was not performed. Surgical pathology results were used as the reference standard for the classification of benign or malignant TN pathology.

Conventional US examination and biochemical testing
Prior to FNAB, B-mode characteristics were examined with an 18L6 High Definition probe. The initial US exam included B-mode information related to the thyroid gland and TNs: homogenous versus heterogeneous gland, size of TN, sub-capsular location, isthmus location, macrocalcifications, microcalcifications, isoechoic, hypoechoic, hyperechoic, tall, solid, or complex TN. CD vascularity patterns of TNs were divided into four groups: no blood flow, peripheral blood flow only, peripheral and central blood flow, and primarily central blood flow. Sub-capsular location was assigned to lesions <2 mm from the thyroid capsule.

Shear wave elastography
SWE was the last part of the US exam prior to FNAB. For this study a Siemens Acuson S3000 US system was used. SWE was performed using VTIQ. During the elastography exam, patients were asked not to swallow or breathe for a few seconds. A qualitative elastography image was created first. Subsequently, SWV of the stiffest area within the TN was measured in meters per second (m/s) twice using a small region of interest (ROI) box measuring 1.5 mm in diameter (ROI box size with VTIQ is predetermined and cannot be changed). The higher velocity of Read 1 and Read 2 was reported as the maximum SWV, and the average of both reads was defined as mean velocity. SWV of the thyroid tissue surrounding the TN (tissue velocity) was measured once. SWV measurement can be converted from m/s to Young's modulus using the following formula: $E = 3\rho c^{2}$, where $c$ is shear wave speed in m/s and $\rho$ is tissue density (a constant = 1000 kg/m$^{3}$) [14]. With these units $E$ is in pascals, and dividing by 1000 gives kilopascals (kPa). For example, a SWV of 4 m/s is 48 kPa.
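The conversion and the two-read summary described above can be sketched in a few lines of Python. This is a minimal illustration, not the software used in the study; the function names `swv_to_kpa` and `summarize_reads` and the example reads of 3.0 and 4.0 m/s are hypothetical.

```python
TISSUE_DENSITY_KG_M3 = 1000.0  # constant tissue density stated in the text


def swv_to_kpa(swv_m_s: float, density_kg_m3: float = TISSUE_DENSITY_KG_M3) -> float:
    """Convert shear wave velocity (m/s) to Young's modulus (kPa) via E = 3 * rho * c^2."""
    pascals = 3.0 * density_kg_m3 * swv_m_s ** 2
    return pascals / 1000.0  # Pa -> kPa


def summarize_reads(read_1: float, read_2: float) -> dict:
    """Return the maximum and mean SWV of two reads (hypothetical helper)."""
    return {
        "max_swv": max(read_1, read_2),
        "mean_swv": (read_1 + read_2) / 2.0,
    }


# Worked example from the text: 4 m/s corresponds to 48 kPa.
print(swv_to_kpa(4.0))

# Illustrative reads only; these are not study measurements.
print(summarize_reads(3.0, 4.0))
```

Because the modulus scales with the square of the velocity, the study's cut-off of 3.59 m/s corresponds to roughly 38.7 kPa under the same density assumption.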
In addition to the main clinician of this study, three sonographers with 24, 7, and 14 years of experience, respectively, reviewed the SWE and B-mode features, including calcifications and irregular margins. They were blinded to the GEC result and surgical pathology outcomes. There was 100% agreement among all three reviewers for SWE image and SWV measurements. Among B-mode features, there was 96.9% agreement yielding a high interrater reliability, measured by Brennan and Prediger's Kappa due to the skewed distribution of the ratings, of 93.8 [15].
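As a rough check on the reliability figure above, Brennan and Prediger's kappa replaces the chance-agreement term with a uniform 1/q over q rating categories. The sketch below assumes two rating categories (feature present or absent), which is an assumption and is not stated in the text; the function name is hypothetical.

```python
def brennan_prediger_kappa(observed_agreement: float, n_categories: int) -> float:
    """Brennan-Prediger kappa: (Po - 1/q) / (1 - 1/q) for q rating categories."""
    if n_categories < 2:
        raise ValueError("at least two rating categories are required")
    chance_agreement = 1.0 / n_categories
    return (observed_agreement - chance_agreement) / (1.0 - chance_agreement)


# Assumption: binary ratings (q = 2) with the reported 96.9% observed agreement.
kappa = brennan_prediger_kappa(0.969, n_categories=2)
print(round(kappa, 3))
```

Under that assumption the result, expressed as a percentage, agrees with the reported value of 93.8.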

FNAB procedure, cytopathology, and GEC
FNAB was performed with US guidance to confirm accurate needle placement. For FNAB, 2-3 passes were made. Approximately six weeks after the first FNAB, a second FNAB was performed with two additional passes to obtain material for GEC analysis. Cytopathology was performed by Thyroid Cytopathology Partners (Austin, TX), who are associated with Veracyte. Afirma™ GEC testing was performed by Veracyte (South San Francisco, CA). Patients were enrolled only when GEC testing and repeat cytology with second FNAB were performed. A total of 151 TNs met all inclusion criteria; 123 TNs were BC III and 28 TNs were BC IV.

Statistical analysis
All statistical analyses were conducted using Stata 14.1 (StataCorp LP, College Station, TX, USA), and used a statistical significance level of 0.05 and two-sided hypothesis tests. Continuous variables were summarized using means and S.D., while categorical variables were summarized using frequencies and percentages. Bivariate associations with malignant pathology were assessed using Fisher's exact tests for categorical variables and Wilcoxon rank-sum tests for continuous variables. Multivariate logistic regression was used to determine the influence of the maximum SWV and GEC on malignant pathology while controlling for all variables with a p-value of <0.1 in the bivariate analyses.
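To illustrate the bivariate step, the sketch below implements a two-sided Fisher's exact test in plain Python and applies it to the GEC-by-pathology 2 × 2 table that follows from the counts reported in the Results (59 GEC-suspicious TNs with 28 malignant, 31 malignant TNs in total, 151 TNs overall). The study itself used Stata; this is an independent illustration, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row_1, row_2 = a + b, c + d
    col_1 = a + c
    total = row_1 + row_2
    denominator = comb(total, col_1)

    def pmf(x: int) -> float:
        return comb(row_1, x) * comb(row_2, col_1 - x) / denominator

    lowest = max(0, col_1 - row_2)
    highest = min(row_1, col_1)
    p_observed = pmf(a)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    return sum(
        pmf(x) for x in range(lowest, highest + 1) if pmf(x) <= p_observed * tolerance
    )


# Rows: GEC-suspicious, GEC-benign; columns: malignant, benign.
p_value = fisher_exact_two_sided(28, 31, 3, 89)
print(p_value < 0.001)
```

The table is derived from the reported totals rather than taken from a published table, so it should be read as a reconstruction.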

Results
Demographics
Figure 1 summarizes our study data. One hundred fifty-one TNs in 151 mostly female patients (89.4%) met all inclusion criteria. Mean age was 51.4 years (S.D. 15.77) (Table 1). Neither age nor gender was a statistically significant predictor of TC.

Surgery
A total of 76 patients had thyroid surgery, including all 59 GEC-suspicious patients and all 17 with GEC-benign but two or more worrisome US features or worrisome TN size (>4 cm). There were 31 malignant TNs confirmed in 31 patients; ten were papillary thyroid cancer (PTC), 17 were follicular variant of PTC (FVPTC), and four were follicular carcinoma. One patient had metastatic tumor to lymph nodes. Among all surgical cases, 15 follicular adenomas (FAs) were detected, with eight in the GEC-suspicious group. Figure 2 demonstrates B-mode, vascularity and elastography images of a TN with low SWV. FNAB pathology indicated BC III and GEC-suspicious results; final surgical pathology showed a FA.

B-mode and Doppler analysis
B-mode US showed an average nodule size of 16.5 mm (S.D. 11.96) (Table 1). Nodule size was, on average, significantly larger among benign pathologies with a mean of 17.3 mm (S.D. 12.39) than among malignant pathologies with a mean of 13.4 mm (S.D. 9.63) (p = 0.034). The presence of macrocalcifications was significantly higher in malignant TNs with 62.5% compared to 37.5% in benign TNs (p = 0.007). None of the other TN characteristics, however, showed a significant bivariate association with TC. Nodular blood flow on CD imaging was also not significantly different between the two groups.

Serum analysis
None of the biochemical variables, including serum TSH, TPOAb, TgAb, thyroglobulin, and calcitonin, were significantly associated with TC (Table 1).

GEC findings
GEC-benign results were found in 60.9% of TNs. Among patients with GEC-suspicious results, 47.5% were found to be TC, and among those with GEC-benign results, 3.3% were found to be TC (p < 0.001) (Table 1). The sensitivity and specificity of GEC as a predictor of TC were 90.3% and 74.2%, respectively, with a PPV of 47.5%, an NPV of 96.7%, a positive likelihood ratio (LR+) of 3.50, and a negative likelihood ratio (LR−) of 0.13 (Table 2). For nodules with both GEC-suspicious results and a maximum SWV of ≥3.59 m/s as a predictor of TC (compared to GEC-benign results, a maximum SWV of <3.59 m/s, or both), sensitivity was 77.4%, specificity was 91.7%, PPV was 70.6%, NPV was 94.0%, LR+ was 9.29, and LR− was 0.25. Figure 3 shows a TN with high SWV. FNAB pathology showed BC III and GEC-suspicious results. Surgical pathology was suggestive of un-encapsulated FVPTC.
Both the mean and the maximum of the two SWV readings were significant predictors of TC (p < 0.001) (Table 3). The average maximum SWV in the benign group was 3.3 m/s (S.D. 0.96) compared to 4.3 m/s (S.D. 1.13) for the malignant group. The receiver operating characteristic (ROC) curve was used to determine the best single cut-off for maximum SWV to predict TN malignancy (Fig. 4).
All independent variables with a p-value < 0.1 in the bivariate analyses were included in a multivariate logistic regression model. In this context, calcifications, maximum nodule size, and being a single nodule gland were not significant predictors of TC (Table 4). GEC-suspicious results and SWV of ≥3.59 m/s remained strong, independent, statistically significant predictors of TC (p < 0.001) with odds ratios of 16.03 and 17.91, respectively.
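The GEC performance figures in this section can be reproduced from a 2 × 2 table. The sketch below derives the counts from numbers stated in the text (59 GEC-suspicious TNs with 28 malignant, 31 malignant TNs, 151 TNs in total), giving 28 true positives, 31 false positives, 3 false negatives, and 89 true negatives. The function name is hypothetical and the code is illustrative, not the authors' analysis script.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard diagnostic accuracy measures from 2x2 counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_pos": sensitivity / (1.0 - specificity),
        "lr_neg": (1.0 - sensitivity) / specificity,
    }


# GEC-suspicious as the positive test, surgical pathology as the reference.
gec = diagnostic_metrics(tp=28, fp=31, fn=3, tn=89)
for name in ("sensitivity", "specificity", "ppv", "npv"):
    print(name, round(gec[name] * 100, 1))
print("lr_pos", round(gec["lr_pos"], 2))
print("lr_neg", round(gec["lr_neg"], 2))
```

The rounded values agree with the reported 90.3%, 74.2%, 47.5%, 96.7%, 3.50, and 0.13. Note that the true-negative count treats unresected GEC-benign TNs as benign, which is the convention implied by the reported specificity.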
In an additional analysis, when we combined GEC-suspicious results with a SWV ≥ 3.59 m/s, we had a higher specificity and PPV than either criterion alone (Table 4). On the other hand, when a SWV cutoff of 2.9 m/s was used, among 51 TNs, 37 were GEC-benign and 13 were GEC-suspicious. We found only one TC. Table 5 demonstrates an analysis of concordance between GEC and SWE groups with the purpose of better understanding our data.
[Figure caption, partially recovered] ... Color Doppler vascularity and c shows SWV measurements. Initial FNAB pathology was read as Atypia of Undetermined Significance (BC III); repeat FNAB confirmed this diagnosis and showed GEC-suspicious results. Surgical pathology showed a follicular adenoma.

Discussion
This single-center study prospectively determined SWV in TNs with indeterminate cytology (BC III or IV) plus a GEC analysis. SWV has a similar sensitivity, specificity, PPV, and NPV to that of GEC. High SWV (≥3.59 m/s) is a risk factor for TC. Our result for indeterminate TN with low SWV (<2.9 m/s) was also promising. Of 51 TNs (33.7%) with SWV of <2.9 m/s, only 1 TN was malignant. In this group, 39 were GEC-benign and 12 GEC-suspicious. SWV of ≥3.59 m/s and GEC-suspicious results are independent predictors of malignancy.
This may be particularly useful in subgroups of BC III and IV cytopathology. The first group is GEC-benign plus low SWV (<2.9 m/s), which has a very low risk of TC. The second group is considered high risk for TC (GEC-suspicious plus SWV ≥ 3.59 m/s). Combining both methodologies may serve us in new ways. While GEC testing can improve pre-operative risk stratification for TC, it cannot help with nodule selection. SWE may become a tool to assess TC risk prior to FNAB. It might not only improve nodule selection for FNAB, but also enhance the GEC interpretation. Adding SWV as an independent risk factor for TC can potentially increase the number of surgical procedures among GEC-benign nodules. It is important to mention that we do not suggest that surgery should be recommended solely based on high SWV (≥3.59 m/s). Future single and multi-center studies are needed to validate our results.
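The two subgroups named above can be written as a simple rule. This is an illustrative sketch only and not a clinical decision tool: the function name and the "not assigned" label for all other combinations are hypothetical, and only the two thresholds come from the text.

```python
LOW_SWV_M_S = 2.9    # lower cut-off discussed in the text
HIGH_SWV_M_S = 3.59  # ROC-derived cut-off reported in the text


def combined_risk_group(gec_suspicious: bool, max_swv_m_s: float) -> str:
    """Assign one of the two subgroups described in the Discussion."""
    if gec_suspicious and max_swv_m_s >= HIGH_SWV_M_S:
        return "high risk"
    if not gec_suspicious and max_swv_m_s < LOW_SWV_M_S:
        return "very low risk"
    # The text does not assign a label to the remaining combinations.
    return "not assigned"


print(combined_risk_group(gec_suspicious=True, max_swv_m_s=4.2))
print(combined_risk_group(gec_suspicious=False, max_swv_m_s=2.5))
print(combined_risk_group(gec_suspicious=False, max_swv_m_s=3.8))
```

The third call reflects the GEC-benign nodules with high SWV that, as stated above, should not be referred for surgery on SWV alone.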
There are several differences in study design between this study and other studies using GEC methodology [2,6,7,16-21]. In this study, all patients had an SWE exam prior to FNAB. We included TNs < 10 mm with at least one worrisome US feature. Furthermore, all patients in the GEC-suspicious group had a final surgical pathology diagnosis.
The frequency of final pathology diagnoses in this study differed from that in other studies. In this study, the majority of TCs were determined to be FVPTC (55%). In most studies, all Bethesda classifications are included and the majority of TCs are PTC [16,22-24]. TNs with indeterminate FNAB appear to have a different US profile when compared to TNs read as suspicious or diagnostic for TC (BC V and BC VI) on FNAB because the majority of the latter group are PTC [23-26]. In our current study, only one patient had metastatic lymph nodes. This is a lower prevalence of metastatic lymph nodes when compared to other publications [3,27-31]. FVPTC may have a lower

Shear wave velocity (SWV) performance
A ROC analysis determined a single cut-off of SWV ≥ 3.59 m/s as the best predictor of TC with a sensitivity of 83.9%, a specificity of 79.2%, a PPV of 51.0%, and an NPV of 95.0%. This result is similar to our previous study with 707 TNs including all Bethesda categories [11]. We analyzed two additional cut-off values for SWV measurement (>4 m/s and >4.5 m/s). As anticipated, these had slightly higher specificity, but lower sensitivity (Table 2).
Among five malignant TNs with SWV < 3.59 m/s, three were <10 mm (2 FVPTC and 1 PTC). In this group, four were GEC-suspicious and one was GEC-benign.
We compared SWV with B-mode US characteristics. By multivariate analysis, B-mode US features were not predictors of malignancy in these indeterminate TNs. This is consistent with the findings of Samir et al. [32] but contrary to our two previous elastography studies [11,33]. Both of those studies reported that microcalcifications and irregular margins were independent risk factors for TC when all Bethesda categories were included. Excluding TNs with BC V and BC VI cytology on FNAB might be the main reason for this difference. The other reason may be the smaller sample size of the current study.

Comparison with other shear wave studies/ technologies
Two studies using the SuperSonic US system evaluated the performance of SWE in nodules with indeterminate FNAB and yielded conflicting results.
Samir et al. demonstrated the potential benefit of SWV in TNs with atypical FNAB (BC III or BC IV) in a pilot study of 35 patients [32]. ROC analysis determined a cut-off value of 22.3 kPa, with sensitivity, specificity, PPV, and NPV of 82, 88, 75, and 91%, respectively. Their relatively low cut-off is possibly due to measurement of the entire TN rather than a smaller area with the highest SWV. In contrast, Bardet et al. examined BC III, BC IV, plus BC V TNs with SWE [34]. Of 131 patients enrolled, 21 (16%) had TC. The mean and maximum elastography measurements for benign and malignant nodules were similar. However, among malignant TNs, classic PTC had a higher SWV compared with other TC. In addition to using a different device, our procedure was different. Bardet et al. positioned the ROI box within the nodule in the area identified on the elastography color map as most homogenous. In our study, we measured the stiffest area or highest SWV seen on the color map.
Measurement of a larger area within a TN can decrease the overall SWV. For example, when using a larger ROI box of 6 × 5 mm with Virtual Touch Quantification, the measured SWV is lower than when the smaller ROI box of 1.5 × 1.5 mm of VTIQ is used [10,11].
[Figure caption, partially recovered] Surgical pathology confirmed a 10 × 8 × 5 mm FVPTC.

GEC performance
Among 59 GEC-suspicious TNs, we found that 28 were malignant (47.5%). Our malignancy rate is similar to that of Alexander et al. [2,17], Harrison et al. [21], and Yang et al. [18]. Our study confirmed that GEC has a relatively high NPV (96.7%), and suggests that there was no inherent selection bias in our study as compared to other studies in the literature.
In the GEC-benign group there were three malignant TNs, of which two had SWV ≥ 3.59 m/s. Two were follicular carcinomas and one was PTC. The median SWV for the GEC-benign group was 3.00 m/s (mean 3.17 m/s). Among these patients, 17 had surgery with three malignancies. It is possible that additional nodules in this group may be malignant since not all TNs were resected. For example, eight TNs are being monitored despite SWV ≥3.59 m/s. This potential limitation of our study is a function of current clinical practice recommendations.
The malignancy rate for TNs < 10 mm was higher (26%) than for TNs ≥ 10 mm (17.5%). A probable explanation is that smaller TNs (<10 mm) were selected for FNAB only when there was at least one worrisome US feature. There was no statistical difference in SWV between malignant TNs in the two groups.
GEC testing for TNs ≥ 10 mm has been validated to describe the risk of thyroid malignancy in many studies [2,6,7,16,17,19,21]. Our analysis demonstrated that TN size did not affect GEC performance. This observation was reported by Wu et al. [20] as well.

Autoimmune thyroid disease and TN with indeterminate FNAB
Some large studies demonstrated an association between differentiated TC and autoimmune thyroid disease and/or TSH when all Bethesda classifications were included [23][24][25][26]. There are at least three possible reasons that these relationships were not observed in this study. This relatively small study may have been underpowered to identify these associations. These associations may not occur in BC III and IV nodules. Alternatively, the final pathologies of this study were largely FVPTC and follicular carcinoma (69.2%), with only 30.7% being PTC. PTC may be more

Limitations
Our study was a single center prospective study. Although all TNs in the GEC-suspicious group had surgery, not all in the GEC-benign group did. There are eight TNs in the GEC-benign group with high SWV (≥3.59 m/s) and ≤1 suspicious US feature. These individuals did not undergo surgical resection since GEC-benign result is felt to represent the current clinical practice standard [3]. We continue to monitor these eight TNs with US for growth and if indicated, repeat FNAB. A number of factors can affect thyroid tissue stiffness and subsequently cause false SWV measurement: TNs with high fluid content (>50%), isthmus location, tissue fibrosis, and high calcified content of TNs (>25%).
There has been a recent nomenclature revision for encapsulated follicular variant of papillary thyroid cancer (EFVPTC). Due to its indolent behavior, this thyroid tumor was downgraded from cancer to noninvasive follicular thyroid neoplasm with papillary-like nuclear features (NIFTP) [36]. The implementation of this reclassification will lower the number of malignancies, particularly in the GEC-suspicious group. Despite the change in nomenclature, surgery is still the treatment of choice for this group of TNs. This recommendation was published in August 2016 and did not affect our study.

Conclusion
In TNs with BC III and IV cytopathology, SWV and GEC demonstrate similar diagnostic performance and are independent predictors of TC.
Acknowledgements For performing the statistical analysis and for expert statistical advice, we acknowledge Sonja Ziniel, PhD (Children's Hospital Colorado, Aurora, CO). We would also like to thank our independent reviewers Steve Lundberg, BS, RDMS, RVT; David Hawa, RDMS, ABD, OBGYN, RVT; and Megan Cox, RDMS, RVT.

Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest.
Ethical approval All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. This article does not contain any studies with animals performed by any of the authors.
Informed consent Informed consent was obtained from all individual participants included in the study.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.