Background

Thyroid cancer (TC) is one of the most frequent endocrine malignancies, accounting for 3–4% of cancers [1], and its occurrence has increased by approximately 5% on a yearly basis, with higher prevalence in females than in males (20.6 vs. 6.9 new cases per 1000 persons) [2]. The number of newly diagnosed cases has risen dramatically in the last 10 years, which could be partially ascribed to the availability of more sensitive diagnostic tools, i.e., ultrasonography and fine-needle aspiration (FNA) and the smaller size of diagnosed tumors. However, over diagnosis is also an issue because its occurrence rate has risen 15-fold since 2003, whereas mortality rates have not changed [3].

In general, the 5- and 10-year survival rates for TC patients are excellent (approx. 98%) but are related to the age of the patient at the time of diagnosis and the cancer subtype [1, 4, 5].

Both papillary (PTC) and follicular thyroid carcinoma (FTC) arise from follicular epithelial thyroid cells involved in iodine metabolism. PTC and FTC, together with the less common Hürtle cell carcinoma, are classified as differentiated thyroid cancer (DTC, see Fig. 1) [6, 7]. Both PTC and FTC progress slowly and are generally characterized by good prognosis, especially if diagnosed early [5].

Fig. 1
figure 1

Overview of thyroid cancer types and their origins

Undifferentiated anaplastic thyroid carcinoma (ATC) is the most aggressive TC type. Although ATC also originates from follicular cells, similar to PTC and FTC, it does not possess their original biological properties [8]. ATC represents 2–5% of cases, (77% in women) with the worst prognosis and a 5-year survival rate of 5% [3]. ATC is insensitive to conventional methods of treatment [9].

In contrast, medullary thyroid cancer (MTC) is derived from parafollicular thyroid “C” cells, which produce calcitonin [2].

The majority of TC cases are sporadic, with only 5% of DTC characterized as familial (mostly PTC) and ~ 25% of MTC inherited as an autosomal trait [10]. Only sporadic tumors are analyzed in this review.

Although most mutations found in TC differ among types, certain DNA alterations were found to be common in more than one subtype. As discussed later in this review, ATC tumors appear to derive from other differentiated tumors and thus possess a large overlap with mutations present in DTCs, such as TMPRSS4. Mutations in certain genes, e.g., CHEK2, are reported in both PTC and FTC, although not with the same prevalence [11, 12], and their potential contribution to TC carcinogenesis is described in the respective paragraphs. In this work, we focus on tumor heterogeneity and the mutation burden carried by thyroid tumors, as tested primarily by high-throughput methods performed within larger genomic projects, including The Cancer Genome Atlas (TCGA).

We gathered the data from RNA expression and DNA sequencing experiments and identified potential genetic biomarkers of disease progression. Genome-wide association studies (GWAS) as well as sequencing and microarrays were considered. In this work, we present an overview of the available biomarkers candidates for progression and development of thyroid cancer and drivers of carcinogenesis, as discussed in detail in the respective sections. All gene functions were inferred using GeneCards (www.genecards.org) [13].

Genome-wide studies significantly aid in the identification of cancer-specific germline and somatic mutations, which can contribute to more sensitive diversification of cancer subtypes and facilitate early diagnosis. Identification of disease-specific point mutations can accelerate the evaluation of candidate target genes for therapeutic drugs and the search for novel driver mutations. However, the identification of polymorphisms (SNPs) could additionally improve prognosis and patient outcomes.

Common genetic determinants of thyroid cancer subtypes

In recent years, the development of sequencing and microarray technologies has permitted a whole-genome search for TC-linked or associated genes. Genome-wide association studies (GWAS) are a highly potent method for identification of high-incidence single nucleotide polymorphisms (SNPs) and copy number variations (CNVs). Recently, GWAS were used to study large TC patient cohorts [14,15,16,17] and were followed by studies confirming the findings [18,19,20,21,22,23,24,25,26,27]. Mutation hot spots identified through GWAS (microarray, next-generation sequencing (NGS) and high-resolution melting (HRM)) are collected in Table 1. Specific SNPs could be associated with susceptibility to DTC (mostly papillary and follicular) in single or multiple populations with variable strength.

Table 1 Somatic mutations associated with susceptibility to differentiated thyroid cancers

Sixteen case/control studies allowed identification of 27 SNPs located primarily within the coding regions (see Table 1). Only rs6983267 was located in the non-coding RNA; rs1220597, rs73227498, and rs13184587 were located in the introns; and rs965513, rs1867277, rs71369530 and rs944289 were located in proximity to NKX2–1. This observation might stem from the fact that most microarray and NGS experiments are focused on transcriptome analysis and can be biased against regulatory or non-coding fragments. TC-associated genes are often connected to DNA-damage repair or transcription.

Using a slightly different approach, Gudmundsson et al., selected 22 SNPs based on a score of high association with high levels of thyroid stimulating hormone in a GWAS study of over 27,000 samples from an Icelandic population [16]. The results of genotyping of 561 samples of the non-medullary type were compared with over 40,000 controls from different populations (Dutch, American and Spanish). Three variants proved to be significantly correlated, namely, rs966423 in non-coding RNA-DIRC3 (OR = 1.34, P = 1.3·10− 9), rs2439302 membrane glycoprotein involved in cell signaling NRG1 (OR = 1.36, P = 2.0·10− 9) and rs116909374 in thyroid-specific transcription factor NKX2–1 (OR = 2.09, P = 4.6·10− 11), and their functions in thyroid tumorigenesis are still unknown.

The ThyroSeq microarray panel (ThyroSeq) is widely used and offers the possibility of testing more than 1000 hotspots in 14 TC-related genes and over 40 fusions simultaneously. Nikiforova and Nikiforov tested over 800 TC samples of all types using ThyroSeq panels, thus proving its usefulness in detection and classification of cancerous tissue [28,29,30].

Figlioli et al., performed SNP genotyping of an Italian population (case/controls: 1437/1534), validated in DTC patients from Poland (case/controls: 448/424) and Spain (case/controls: 375/408) [14]. The strongest correlation among all tested cohorts was found for rs10136427 localized in transcription regulator BATF, (OR = 1.40, P = 4.35·10− 7) and rs7267944 in putative RNA helicase DHX3 (OR = 1.39, P = 2.13·10− 8).

Gudmundsson et al., published a follow-up study in Icelandic, Dutch, Spanish and 2 American populations (case/controls: 1003/278,991, 85/4956, 83/1612, 1580/1628 and 250/363, respectively) confirming 5 novel loci associated with non-medullary thyroid cancer (Pcombined < 3 × 10− 8), i.e., rs12129938, rs6793295, rs73227498, and two independently associated variants, i.e., rs2289261 (OR = 1.23; P = 3.1·10–9) and rs56062135 (OR = 1.24; P = 4.9·10− 9) [31].

Applying a presumption that the DNA repair genes of base (BER) or nucleotide (NER) excision repair pathways might be involved in TC tumorigenesis, Cipollini et al., genotyped known SNPs in 450 case-control paired DTC samples from an Italian population [32]. The TT variant of base excision repair gene NEIL3, which codes a DNA glycosylase, was associated with increased risk of DTC. Another GWAS study on an Italian population performed by Köhler et al. associated mutations in non-coding RNA genes DIRC3, RARRES1, SNAPC4 and IMMP2L with increased DTC in a high-incidence population of 690 cases and 497 controls and confirmed this finding in 3 low-incidence populations (total of 2958 cases and 3727 controls) [15] (See Table 1). SNAPC4 encodes a large subunit of the RNA-activation protein complex, and RARRES1 and IMMP2L are transmembrane proteins.

Papillary thyroid Cancer (PTC)

Derived from follicular cells, papillary thyroid cancer is named after its cyto-architecture and can be further divided into 3 subtypes based on histotype: tall cell variant (TCV), follicular, and classical (most common) [33]. According to TCGA, up to 70% of somatic PTC drivers are found in activators of the MAPK pathway and include BRAF, RAS and rearrangements of the RET and NTRK1 genes [5] (See Table 2). The alterations are generally thought to be mutually exclusive in PTC [34,35,36,37], but contradictory data have emerged [38,39,40,41]. Other mutations such as PTEN and PIK3CA [42] have been reported at lower frequencies (2/86 (2.32%) and 3/86 (3.48%), respectively). The mutation density is relatively low at 0.41 mutations/Mb for PTC and 0.5 mutations/Mb for TCV. PTC is often multifocal, with a main tumor (> 1 cmØ) and several microcarcinomas [43, 44]. Nodules might be positioned unilaterally or bilaterally in the thyroid lobes. Multifocality is a characteristic of up to 40% of all PTC, [45, 46] leading to aggressiveness and resistance to radioiodine treatment [47]. The clonal origin of each singular carcinoma is not necessarily the same because tumors might arise independently through a series of molecular events, such as chromosome X inactivation [43, 48,49,50,51,52], but certain authors suggest clonal homogeneity between the nodules [49, 53,54,55,56,57].

Table 2 Somatic mutations characteristic of PTCs

Genetic alterations in kinases

BRAF

The most common somatic mutation occurring in PTC is a mis-sense BRAF mutation resulting in thymine-to-adenine substitution at position 1799 of the B-type Raf Kinase (BRAF) gene. This mutation leads to a valine-to-glutamate substitution at codon 600 of the BRAF protein (BRAFV600E) and constitutive activation of the MAPK signaling pathway via activation of the G-coupled receptor in the membrane [58,59,60], and it is common for several cancers, including non-small cell lung cancer and melanomas. [59,60,61]. BRAF is an activator of BRAF-activated non-coding RNA (BANCR), which regulates many cellular processes, including tumorigenesis, metastasis and, apoptosis [62]. BRAF can function as both a tumor suppressor and disease progression factor [63]. BRAFV600E is typical for TCV and classical subtypes, whereas RAS mutations predominantly drive the follicular subtype [33, 64]. This dependence, in combination with the various prevalence of driver mutations in populations, might explain certain of the disparities between different studies.

Recently, the potential heterogeneity of BRAF mutants (intra- and inter-tumoral) has been emphasized using both traditional methods (PCR verified by Sanger sequencing) as well as novel techniques such as exome capture and pyrosequencing. Kimbrell et al., tested 57 tumors from 27 patients for the presence of the BRAF V600E mutation [65]. The results were discordant between primary and secondary tumors in 10 out of 27 cases, but no significant histological changes were observed. However, the irregularity of the tumor edge appears to indicate its metastatic origin. No correlation was detected for the lobe positioning of the concordant and discordant nor the size of BRAF-positive and negative tumors. Sun et al., showed (n = 455) that 75.5% of the patients in a Chinese population harbored a BRAFV600E mutation, which was significantly correlated with increasing patient age [66]. In contrast, the rate of BRAFV600E mutations was two times lower in children than in adults [67]. One of 14 pediatric patient samples was positive for concomitant BRAF mutation and RET/PTC3 rearrangement (see below). Lu et al., identified BRAFV600E mutation as the most common using deep sequencing of 21 foci from 8 patients [68]. The experiments confirmed that multifocal TC could be heterogeneous and that BRAF is not necessarily the driver because up to 75% of the clones had independent clonal origins. Those results were supported by reports from other groups in which foci did not share the same mutation patterns [48, 69,70,71]. Gandolfi et al., tested 37 primary PTC tumors and 95 metastases in adults and found that 43.9% of the samples were BRAF-positive, but no correlation was observed with metastasis. The allele percentage shows that BRAF mutations are heterogeneous and rarely a result of a clonal event [69, 72]. De Biase et al., tested the distribution of neoplastic cells in BRAFV600E-positive tumors (n = 85/155) [51]. The percentage of cells harboring a mutated BRAF allele present in each sample varied from less than 30% (n = 9/85) to 80% (n = 39/85). Down-regulation of the transcript was observed in paired PTC tumor samples and normal adjacent tissues. Real-time PCR shows that the down-regulation of BANCR correlates with patient prognosis with consideration of tumor size, number of nodules, stage, gender, metastasis and extrathyroidal extensions but not with age.

PIK3CA

Mutations in PIK3CA, a catalytic subunit of the phosphatidylinositol 3-kinase and a component of the PI3K/Akt signaling pathway, were found by Lee et al. in a targeted sequencing experiment (n = 240). One sample carried a PIK3CAE542K mutation (0.4%), 24 p.E545A mutation (10%) and 138 concomitant BRAFV600E and PIK3CAE454A mutations (57.7%) [73]. Independently, Wang et al., found 20 samples carrying the PIK3CA copy gain mutation (14%, n = 141) [74].

RET proto-oncogene

The RET proto-oncogene encodes a tyrosine kinase receptor [75, 76], and RET activation promotes downstream signaling, leading to cell proliferation, differentiation and survival. [75]. Depending on the length of the C-terminus of the RET protein, three splice variants of the RET mRNA can be distinguished, namely, RET9, RET43 and RET51, and all present different cellular localization and function [77]. In PTC, gene fusions are the most common, but RET gene mutations were also associated with tumorigenesis, specifically RET G691S (rs1799939), L769 L (rs1800861) and S904S (rs1800863) [78]. Khan et al., suggested that rare variants G691SA and S904S are more prevalent in PTC and might be associated with a predisposition to TC development, as opposed to the underrepresented L769 L variant. However, this study was conducted on blood samples of post-thyroidectomy patients, thus the sensitivity of the assay remains to be determined.

Gene fusions

RET/PTC gene fusions

The variants of RET rearrangements are characterized by the fusion of the kinase domain to the 5′ terminus of the donor gene, resulting in a change of the subcellular localization of the receptor to the cytosol and leading to constitutive activation of the MAPK signaling pathway [79]. Until now, 25 fusion variants were described, 19 of which are associated with PTC [33, 80,81,82,83,84,85,86,87,88,89,90,91,92]. The RET kinase domain and 5′ end of CCD6 gene (RET/PTC1) fusion [84] or the nuclear receptor co-activator 4 gene (NCOA4) (RET/PTC3) are most common [81]. Zou et al., reported a 14% rate of RET/PTC rearrangement and co-occurrence of BRAFV600E with RAS/PTC1 (n = 82) [93]. Rossi et al., tested fine-needle aspiration of PTC samples by real-time PCR and showed that in 7.3% of the 940 samples, either RET/PTC1 or RET/PTC3 was present [37]. Six of the patients had both RET rearrangement and BRAF mutation. RET rearrangement appears to be fairly common in children with PTC [67]. Out of 13 samples in the study, RET gene fusions were detected in 2 (15%) samples by fluorescence in situ hybridization (FISH) assay.

KAZN–C1ORF196

Le Pennec et al., identified 4 novel gene fusions, most prominently KAZN–C1ORF196 [94], and this finding was confirmed in both a case study and in 85% of additional PTC samples (n = 94). KAZN encodes a keratinization-associated adhesion protein, whereas C1ORF196 is a putative gene. The biological function of such gene fusion is unknown, but it is predicted to be a result of an alternative splicing event generates a transcript coding for an in-frame protein. RNA sequencing of 115 samples from thyroid tumor tissues and metastases was performed, and 87 samples classified as PTC were sequenced using the Sanger method to validate the existing mutations [94]. KAZN–C1ORF196 gene fusion was absent in both tumor-adjacent (n = 37) and normal thyroid tissue (n = 23). Other mutations specific for the patients were identified, all of which highlight the tumor genetic heterogeneity. What is remarkable about this study is the fact that most of the mutations found were specific for a particular patient only.

Mutations of DNA-repair genes

CHEK2

Mutations in DNA repair genes appear to be mutually exclusive with MAPK activator mutations such as BRAFV600E, but they might exist simultaneously with other mutations involved in the MAPK signaling pathway, e.g., RAS (see below) [95]. Disruption of DNA repair can be a prognostic marker for aggressive PTC development, according to TCGA (See Table 2) [33]. Genotyping of a Polish population showed that 15.6% of samples (n = 468) had one of four cell cycle checkpoint kinase 2 (CHEK2) mutations known to contribute to carcinogenesis (truncating mutations IVS2 + 1G > A, 1100delC or del5395 and a mis-sense mutation I157T) [11]. Wójcicka et al., identified the rs17879961 variant as a risk allele for PTC in a group of 1781 patients (OR = 2.2, P = 2.37·10− 10) [96]. In a Greater Poland female population (case/control: 602/829), the c.470C (I157T) homozygous variant was shown to increase the risk of developing PTC by nearly 13-fold (OR = 12.81, P = 1.9·10− 2) and was observed in 3 women (0.57%), as determined by pyrosequencing [97]. A heterozygous variant of the same mutation increases the risk by 2-fold (OR = 1.7, P = 1.7·10− 2).This association was not observed for male patients.

Alterations in cell signaling pathways

RAS

Mutations in the family of RAS proteins are associated with AKT phosphorylation and result in preferential activation of the PI3K/AKT pathway in TC by evasion of apoptosis, proliferation and cellular growth [98, 99]. The RAS family consists of 4 proto-oncogenes: H-RAS, N-RAS, K-RAS4A and K-RAS4B [100]. Although RAS mutations are more prevalent in FTC, they are also observed in a subset of PTCs [101]. Zou et al., detected KRAS mutations (p.Q61R and p.S65 N) in 2 samples (2%, 2/88) and an NRAS (p.Q61R) mutation in 3 cases (PTC 1%, TCV 2%). Rossi et al., observed 3.4% of samples harboring a somatic RAS mutation (n = 940), which correlated with an aggressive histotype and poorer prognosis [37]. Until now, RAS mutations have not been found in juvenile thyroid tumors [67].

MUC1

Mucin (MUC1) plays a role in the signaling pathways of proliferation and differentiation of epithelial cells and is crucial in metastasis and tumorigenesis of epithelial cancers such as adenocarcinomas and ovarian cancer [102]. In PTC, MUC1 is thought to be a marker of poorer outcome (See Table 2), although this stance is controversial. Using pyrosequencing, Renaud et al., showed that 40% of 94 PTC samples overexpressed MUC1 in the cytoplasm, which correlated with the presence of the BRAFV600E mutation in 95% of samples.

Deregulation of protease expression

TMPRSS4

Transmembrane protease serine 4 (TMPRSS4) is a type II transmembrane serine protease overexpressed in several cancer types, including gastric [103], breast [104], lung [105] and thyroid cancers [105,106,107]. TMPRSS4 promotes cell proliferation, invasion, metastasis and epithelial-mesenchymal transition (EMT) and is predominantly overexpressed in PTC. Kebebew et al., tested 131 tumors by cDNA microarrays, and TMPRSS4 was one of the 6 genes deregulated in malignant tumors [107]. Jarząb et al., tested 50 samples from 33 patients (23 PTC, 10 other thyroid malignancies) paired with normal tissue using microarray analysis [106]. TMPRSS4 was classified as one of the genes forming a set of markers that distinguish between benign and malignant tumors.

Mutations in transcription regulators

EIF1AX

Eukaryotic translation initiation factor 1A/X-linked (EIF1AX) is a major player in the transfer of Met-tRNAf and has a high mutation rate in PTC (1.5%, 6/402). EIF1AX is suggested as a potential driver of tumorigenesis in other cancers, e.g., uveal melanoma [33, 108, 109], and in TC, it is a promising biomarker candidate. This observation is supported by Karanamurthy et al., who detected EIF1AX mutation in 2.3% (n = 3/86) of tested PTC samples and 1 of 5 PTC FNA samples using NGS [110]. Almost all of the EIF1AX mutations were located at a hotspot A113_splice site at intron 5/exon 6.

FOXE1

The thyroid transcription factor forkhead box E1 (FOXE1) possesses a well-conserved DNA binding domain (FDH) and is crucial in the development of a healthy thyroid [111]. Deregulation of transcription factors from the FOX family is recognized as an important element of TC progression.

Penna-Martinez et al., used PCR to genotype 196 PTC samples (German population) for the presence of two known susceptibility SNPs in FOXE1 [17, 112]. The rs965513 phenotypes “AA” and “AG” were more common in DTC patients in contrast to the “GG” phenotype, which was common in healthy controls. The rs965513 variant is more pronounced in PTC than in FTC [112]. Mond et al., sequenced 120 PTC tumors for SNPs in the coding region of FOXE1. Four mis-sense mutations were found in the FHD (c.821C > A, p.P54Q; c.943A > C p.K95Q; c.994C > T, p.L112F), each in a single tumor. Molecular modeling of the described mutations showed their location in a region highly conserved across species, thus explaining the potential carcinogenic effect [111].

TERT promoter

Telomerase reverse transcriptase (TERT) is a catalytic subunit of telomerase vital for the gain of immortality by cancer cells [113, 114]. Two mutations located in the TERT promoter region are associated with carcinogenesis, namely, C-to-T substitution (C1,295,228 T) and C-to-A substitution (C1,295,250A) [115]. TERT promoter mutations appear to be rare in PTC (4.4%, n = 455, Chinese population) [64], but they correlate positively with aggressiveness of the tumor and patient age (See Table 2). These results confirm studies performed by Liu et al., [116, 117]. TERT mutations are less common in PTC (11.3%, n = 408) than in ATC (42.6%, n = 54) when pooled data are considered [118]. Studies also show that TERT promoter mutations correlate with poorer outcomes and an increase in aggressiveness of the tumor, even if they do not coincide with BRAF mutation [115, 119]. TERT promoter mutations are most common in TCV.

Regulatory RNAs

RNA-mediated regulatory pathways disrupted in carcinogenesis involve micro-RNA (miRNA, miR) signaling. Micro-RNAs are short, 21–23 nt, non-coding endogenous RNA fragments that regulate expression at the posttranscriptional level [120]. MicroRNA-deregulated thyroid cancers are collected in Table 3. T Yoruker et al., used RT-PCR to test serum from pre- and post-operative PTC patients to measure the level of micro-RNA expression [121]. The PTC patient sera levels of 4 miRNAs (miR-222, miR-31, miR-151-5p, let-7) were significantly higher compared with healthy controls, and the miR-21 level was lower (see Table 2). General levels of all miRNAs were lower in the post-operative samples and showed no significant difference with the healthy control group. A similar study was performed by Lee et al., to measure the expression of miR-222 and miR-146b in plasma and tumor tissues [122]. In recurrent tumors, miRNAs were significantly up-regulated compared with non-recurrent patients and healthy controls. Plasma miRNAs levels decreased after thyroidectomy in both cases. The results, especially miR-222 overexpression, confirm the results of other groups [123, 124], suggesting that both miRNAs might be used as biomarkers of cancer progression. MiR-221, miR-22, and miR-21 are involved in PTEN regulation [125], whereas miR-126 is associated with angiogenesis [120], and its expression in PTCs as well as undifferentiated thyroid cancers showed a correlation between miR-126 down-regulation and overexpression of VEGF-A mRNA and protein in tumors. miR-639 expression was upregulated in cancer tissues [126]. In contrast, expression of miR-20b a regulator of the MAPK/ERK signaling pathway with potential tumor suppressor qualities, was down-regulated in TC [127]. Samsonov et al., showed the potential differentiating miRNAs (miR-21 and miR-181a) that might be useful in distinguishing PTC from FTC [128]. Studies conducted by Hu et al., associated down-regulation of miR-940, miR-15a, and miR-16 with PTC phenotype [129].

Table 3 microRNAs differentially expressed in PTC and their tissue of origin

Follicular thyroid carcinoma (FTC)

Follicular thyroid carcinoma is the second most common thyroid malignancy, is considered more aggressive than PTC, and has a 95% 5-year survival rate. Mortality rate and disease aggressiveness increase with the age of the patient at diagnosis [130].

Hou et al., showed the occurrence of PTEN (7%, 6/86 samples) and PIK3CA (6%, 5/85 samples) mutations in FTC [42]. PIK3CA gene copy gain was found in 20% of tested samples (24/85). These mutations might affect the activation and regulation of the PI3K/Akt pathway. In contrast to PTC, the BRAFV600E mutation is generally rare in FTC [115]. TERT promoter mutations (see Table 4) were also tested, but the FTC sample number was low (20 minimally invasive FTCs without metastasis and 3 FTCs with metastasis). Nevertheless, the results correlated positively with the presence of distant metastases (1/2 minimally invasive samples with distant metastases).

Table 4 Somatic mutations found in FTCs. SNV: Single nucleotide variant

Świerniak et al., performed targeted NGS sequencing of 48 FTC tumors [12]. The authors identified previously undescribed somatic mutations in both intronic and exonic regions. FTC mutations were found in FOXO4 (transcription suppressor), CHEK2 and NCOA2 (epigenetic modifier) genes. Additionally, 10/18 identified single nucleotide variants (SNVs) were located in the non-coding regions of the studied genes. Other types of mutations included indels in MITF and KTN1 genes (transcription factor and transmembrane kinesin receptor, respectively) and loss of heterozygosity (LOH) in the IDH1 gene that belongs to the dehydrogenase family. Copy number variations (CNV) in ARNT (facilitates transport to the nucleus, transcriptional co-regulator of HIF1 expression), FBXW7 (component of the ubiquitin degradation signaling chain) and USP6 (ubiquitin specific peptidase) were also found in samples with populations of cells highly represented in tumors. In the low-confidence FTC group, a distinct subset of mutations was found, meaning that the differentiation of the two subsets based on their molecular profiles might be possible. In lower-confidence FTC, subset mutations were found in the COL1A1 gene, which is a fibrin-forming type of collagen. LOHs were identified in WRN (belonging to a family of DNA and RNA helicases) and PPARγ (member of a nuclear receptor subfamily), among others. A new translocation of unknown function was described, namely, COX6C/DERL2. KAZN/C1ORF196 gene fusion was confirmed in the case study and in 55% (out of 11) of FTC additional samples [94].

One of the most common genetic events in follicular thyroid cancer is the gene fusion of PAX8/PPARγ or PPFP oncoprotein gene [131, 132]. PAX8 on its own is necessary for the normal development of the thyroid [133], and PPARγ is a nuclear receptor [134]. PAX8/PPARγ fusion is present in 35% of FTC tumors on average, can be overexpressed by up to 50-fold compared with endogenous PPARγ in tumor tissues [135, 136] and is probably the effector component of the oncogenic rearrangement [137].

In FTC, as in PTC, overexpression of TMPRSS4 is observed in 53.6% (15/28) of the samples, as shown by Guan et al. [138].

Sun et al., found a positive correlation between FTC tumorigenesis and low levels of miR-199a-5p expression [131]. MiR-199a-5p was identified as a regulator of the connective tissue growth factor (CTFG), which acts as an inhibitor of the cell cycle in healthy tissue. In tumor conditions, both fusion proteins appear to possess binding domains that retain their function in the correct cellular context [132].

Anaplastic thyroid carcinoma (ATC)

Anaplastic thyroid carcinoma is the most aggressive type of TC and contributes to 1–2% of all thyroid cancers and 39% of reported deaths [133]. The 6- to 12-month mortality rates reach 80%. The high aggressiveness of ATC is caused by dedifferentiation of well-differentiated thyroid cancer forms such as PTC [134,135,136]. Compared with PTC and poorly differentiated thyroid cancers, the mutation burden in ATC is much larger [137] (see Table 5).

Table 5 Somatic mutations found in ATCs

ATC can arise independently, but it often coincides with well-differentiated tumors. Co-occurrence of BRAF and RAS mutations in ATC suggests its common genetic origin with DTC [135, 139, 140]. Hou et al., tested 50 ATC tumors and found a high prevalence of mutations associated with PI3K/Akt pathway activation: PTEN 16% (8/50) and PIK3CA 12% (6/50) [42]. RAS mutations were also identified in 8% (4/50) of samples. The molecular heterogeneity of ATC makes it incredibly difficult to analyze. Kasaian et al., performed whole-genome sequencing of 1 ATC sample and identified 24 somatic mutations, including two heterozygous mutations in BRAF (V600E) and TP53 (Y163C) genes. [141]. Kunstman et al., tested 22 tumor samples with whole-exome sequencing [142]. The majority (68%) of the observed variants code for mis-sense mutations. A total of 16 genes were identified as potential drivers of tumorigenesis, 6 of which were present in multiple samples, namely, NF1 (negative regulator of RAS pathway), mTOR (kinase, mediates response to stress), ERBB2 (EGF receptor), DAXX (apoptosis regulator and transcription repressor among other functions), MLL2 (histone methyltransferase), and NOTCH2 (regulator of cell fate). In addition, recurrent mutations of EIF1AX and HECTD1 (ubiquitin-transferase activity) and non-synonymous USH2A (development of retina and inner ear) mutations were observed. Several of the tested cases presented a hypermutation phenotype, resulting in a high mutation burden of mismatch repair genes. Bonhomme et al., sequenced 94 ATC tumors targeted to TERT using NGS and 98 samples using Sanger sequencing [143]. More than 50% of samples possessed the TP53 mutations, and ALK rearrangements were rare. In total, 210 different alterations were found, including those not previously described in the context of TC, such as MET (proto-oncogene) and ERBB2 mutations. In the Korean population, 60% samples (3/5) had a TERT promoter mutation, which coincided with BRAFV600E [115]. In a study by Landa et al., the presence of BRAFV600E mutation was observed in 45% out of 33 tumors [137]. In the same study RAS mutations (H-RAS, K-RAS, and N-RAS) occurred in 24% of the samples but were mutually exclusive with BRAFV600E.

Other mutations found in ATCs were NF1 (3 samples), PIK3CA (18%), and PTEN (15%). PIK3CA mutation tends to co-occur with BRAF mutations, whereas NF1 tends to be present simultaneously with PTEN mutations. EIF1AX mutations were present in 9% of the 33 studied tumors.

For the first time, Landa et al. reported mutations in components of the SWI/SNF complex (chromatin remodeling system), as reported in 36% (n = 33) of tumors. Mutations were also found in histone methyltransferase genes (KMT2A, KMT2C, KMT2D, and SETD2) in 24% (n = 33) of ATCs. Additional genes involved in epigenetic processes, i.e., CREBBP, EP300, BCOR, and BCL6, were mutated at low frequencies. One sample carried a CTNNB1 (p.L347P; WNT signaling pathway) mutation, but this finding was not validated by others. Mutations were also observed in members of the MMR DNA repair pathway (MSH2, MSH6, and MLH1) in 12% of samples. Another DNA damage response element, ATM, was mutated in 9% of tested ATCs. Landa et al., reported frequent (73%, n = 33) TERT promoter and TP53 mutations. The TERT promoter C228T variant was more common than the C250T variant. TERT promoter mutations significantly diminished the survival rate from 732 to 147 days.

Gene fusions are also present in ATC. KAZN/C1ORF196 was identified by Le Pennec et al., in a case study and confirmed in 11% of additional ATC samples [94]. Guan et al., observed an increase of TMPRSS4 expression in all ATC samples (n = 12) compared with adjacent normal tissue [138]. Targeted DNA sequencing for TP53, RAS, BRAF, ALK, and NF1 of 30 formalin-fixed paraffin-embedded (FFPE) ATC tumor samples by Latteyer et al., showed that 28/30 tested samples carried at least one of the tested mutations [144]. TP53 mutation was most common (18/30), followed by NF1 (11/30) and RAS family mutations (7/30 combined). It is also worth mentioning that nearly a third of the samples showed residual contaminations of either PTC or FTC tissue, proving the anaplastic tumor heterogeneity.

Zhang et al., tested the expression of myocardin family genes (involved in cell growth arrest, inhibition of differentiation, metastasis and tumor invasion) [145]. MRTF-A was overexpressed in metastatic ATC but was not present in either in primary tumor or the adjacent tissue. Following this finding, down-regulation of miR-206 was identified as the factor leading to the MRTF-A overexpression.

Conclusions

Despite the large number of mutations involved in the tumorigenesis of thyroid carcinomas (Fig. 2), many tumors remain unclassified by FNA biopsy or even genetic testing. Pagan et al., notes that over 50% of samples tested for a large number of reported mutations already observed in TC by RNA-seq do not show a phenotype, leading to the conclusion that the fast-growing database of somatic and driver mutations in thyroid cancers must be expanded with respect to histological subtype [146].

Fig. 2
figure 2

Genetic changes identified in thyroid cancers of follicular origin. Genes common for all three TC subtypes are marked in red

DNA methylation in thyroid cancer has been extensively studied and reviewed but was not discussed in detail in this review. However, it is worth mentioning that the advances in next-generation sequencing and microarray techniques enable in-depth research on the methylation pattern in GC-rich regions and its effect on gene expression. Most studies focus on pre-determined loci [147, 148], and fewer are available at the whole-genome scale [149, 150]. Determination of the methylation patterns can be potentially useful for differentiating between TC subtypes with greater precision. The largest study to date that examines whole-genome methylation was performed as a component of the TCGA project (PTC, n = 496) [33]. In a recent study, Bisarro dos Reis et al., proposed a hyper/hypomethylation genetic signature that allows distinction between TC subtypes (Hürtle cell, PTC, FTC, non-neoplastic tissue and benign lesions, ATC) based on the Illumina 45 k platform, with high sensitivity and specificity (63 and 92%, respectively) [151]. Methylation can also be used as a prognostic marker of disease outcome, as proposed in the same article. Beltrami et al., proposed the PTC hypomethylation signature of 41 PTC-paired samples (88% of hypomethylation) as a prognostic biomarker of PTC development [152]. This signature coincides with the presence of the BRAFV600E mutation (68% of the hypomethylation signature).

In the era of advanced molecular analysis, genetic markers have become a useful tool for the evaluation of thyroid tumor growth and progression. Molecular biomarkers can be applied in the classification of thyroid tumor subtypes and the prediction of disease outcome and might also aid development of systemic molecular therapies in cancers that are refractory to standard treatment. The discovery of specific genetic alterations and mechanisms of thyroid carcinoma development is expected to lead to more personalized treatment for patients with advanced and recurrent disease. Despite the presence of the molecular changes described in this review, the roles of molecular biomarkers in the development of different thyroid tumor subtypes still remain unclear.