1 Introduction

Cancer arises when normal cells are transformed into malignant cells by acquiring a number of hallmarks such as sustained proliferative signaling; evasion of cell death, growth suppression and immune destruction; replicative immortality; and activation of invasion and metastasis (Hanahan et al. 2000, 2011). Sequential accumulation of genetic mutations is a major cause of acquiring these cancer hallmarks in the cell transformation process, and hence a complete characterization of the landscape of pathogenic somatic and congenital mutations in cancer cells is a holy grail for fully understanding cancer biology. The introduction of next generation sequencing technology around 2005 heralded a new era in which cancer geneticists have been able to detect mutations in whole exomes, and more recently also in full genomes, of thousands of cancer samples at an unprecedented speed and price (Martincorena et al. 2017; Consortium ITP-CAoWG 2020). These efforts have revealed an astounding number of genetic aberrations, with an average of as many as 18,399 somatic variants detected per tumor at whole genome level. Large heterogeneity exists around this average, both between tumor types and between samples within each tumor type. For instance, skin cancer (melanoma), which has a heavy mutational burden due to UV-induced mutagenesis, displays an average of 116,855 variants per genome, as opposed to the central nervous system tumor pilocytic astrocytoma, where each tumor genome counts on average 351 variants (Consortium ITP-CAoWG 2020). The large majority (92%) of the detected mutations are single nucleotide variants (SNVs), and 99.2% of these SNVs occur in regions that do not encode proteins (the non-coding region). This is not entirely surprising, considering that only 1% of the human genome is protein coding.

Being able to detect mutations in cancer samples is one thing. The story, however, does not end there: the next challenge is to distinguish innocent random mutational events that have no role in promoting cancerous transformation (the so-called ‘passengers’) from mutations that are driving oncogenesis (the ‘drivers’) within the long lists of genetic lesions that are detected in cancer samples. Indeed, the large majority of mutations in tumor samples are probably not involved in disease pathogenesis and represent irrelevant passenger events originating from errors in DNA replication, exposure to mutagenic toxins or environmental factors such as UV-irradiation and cigarette smoke, and genomic instability in the cancer cells. A number of biostatistical methods can aid in a first prioritization of candidate driver mutations within long lists of sequence variants. However, to fully confirm the driver role of a mutation, further follow-up experiments are needed in which the mutation is expressed in cells to test whether it can provide these cells with one of the cancer hallmarks. Such follow-up experiments are labor intensive, imposing serious limitations on the number of mutations that can be tested in this fashion.

Historically, cancer geneticists and biologists have mainly focused on identifying cancer driver mutations in the protein coding genome, and it is only recently that attention has also been directed towards lesions in the non-coding genome. One of the reasons for this evolution is the significant drop in recent years in the sequencing and data analysis cost of a full genome mutational screen as compared to an exome screen. A second reason is the large progress that has been made in understanding the functionality of the non-coding genome, with a better characterization of the roles of non-coding RNAs and gene regulatory regions. We refer to other literature for an update on the exciting insights that have been gained on the function of our non-coding genome and on its role in cancer pathogenesis (Diederichs et al. 2016; Rheinbay et al. 2020). Although the protein coding genomic regions have thus attracted the attention of cancer biologists since the beginning of the cancer genetics field, not all types of mutations in these regions have been studied in equal detail. A lot of effort has gone towards characterizing somatic missense and nonsense SNVs. These nucleotide changes result in an amino acid substitution or a premature STOP codon in the encoded protein and represent 64% and 4% of the point mutations in the protein coding regions, respectively. Small insertions and deletions (INDELs) make up another 5% of mutations in the coding regions and can result in additional or missing amino acids or in a change in protein translation reading frame resulting in an alternative STOP codon (Sharma et al. 2019). Finally, synonymous mutations represent 23% of the somatic variants in the protein coding genome (Sharma et al. 2019). These nucleotide changes do not result in an amino acid change in the protein that they encode and have previously attracted significantly less attention as candidate cancer driver mutations than INDELs, missense and nonsense mutations. However, in a variety of other diseases such as cystic fibrosis, ataxia telangiectasia and even in hereditary cancer syndromes like familial adenomatous polyposis, a causative role for synonymous mutations in disease pathogenesis has been described (Sauna et al. 2011). Furthermore, also in the context of cancer, the number of synonymous mutations that have a significant impact on the corresponding RNA and protein expression level or isoform is rapidly rising. It is thus becoming clear that there might be a significant fraction of synonymous mutations that are not as ‘silent’ as they have long been considered to be. In this chapter, we will discuss why synonymous mutations have received little attention in the context of cancer. Furthermore, we will describe the recent progress that was made in characterizing the landscape of oncogenic synonymous mutations as well as the variety of molecular mechanisms by which synonymous mutations affect RNA and protein expression levels of oncogenes and tumor suppressors.
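As an illustration of the mutation classes introduced in this section, the following Python sketch labels a single nucleotide substitution in a coding sequence as synonymous, missense or nonsense. It is a minimal sketch, not a variant annotation tool: the function name classify_snv and the toy sequence in the usage note are assumptions made here, the standard genetic code is assumed, and INDELs, splicing and untranslated regions are ignored.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snv(cds, pos, alt):
    """Classify a substitution at 1-based coding position `pos` to base `alt`.

    `cds` is a coding DNA sequence that starts at the first base of the
    start codon (hypothetical input format chosen for this sketch).
    """
    cds = cds.upper()
    alt = alt.upper()
    index = pos - 1
    if cds[index] == alt:
        raise ValueError("alternative base equals the reference base")
    codon_start = (index // 3) * 3
    ref_codon = cds[codon_start:codon_start + 3]
    offset = index - codon_start
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"   # same amino acid (or still a STOP codon)
    if alt_aa == "*":
        return "nonsense"     # premature STOP codon
    if ref_aa == "*":
        return "stop-lost"    # STOP codon replaced by an amino acid
    return "missense"         # amino acid substitution
```

For the toy sequence ATGGGTTGGAAA (codons ATG, GGT, TGG, AAA), classify_snv("ATGGGTTGGAAA", 6, "C") changes GGT to GGC, both encoding glycine, and is therefore reported as synonymous, whereas a G-to-A change at position 9 turns TGG into the STOP codon TGA.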

2 Why Synonymous Mutations Have Remained Silent for a Long Time in the Cancer Field

As outlined above, the availability of exome and by extension full genome sequencing data has resulted in the identification of long lists of somatic SNVs in tumor samples and initiated the challenge to distinguish disease driver mutations from random passengers. Biostatistical methods and tools have been developed to assist cancer geneticists and biologists in this mission. A common approach that emerged is based on determining mutational significance, which means determining whether the mutation frequency of a gene or region is significantly higher than one would expect by chance. Genes or mutated regions with higher mutation rates than the random background mutation rate (BMR) are considered to be cancer drivers. This method thus heavily depends on choosing a correct BMR, a task that is far from trivial as BMR is influenced by many parameters such as point mutation rates, heterochromatin-associated nucleosome modification H3K9me3, replication timing, base composition and mRNA expression levels. Since synonymous mutations do not induce any amino acid changes in the proteins that they encode, several of the widely used mutational significance algorithms that have been developed in the past (e.g. MutSig1.5 and its later versions such as MutSigCV) were based on the assumption that synonymous mutations are meaningless in cancer pathogenesis. Therefore, these algorithms use the somatic synonymous mutations that are detected in cancer samples as BMR and thus cannot be used to determine whether a synonymous mutation that is detected in a cancer sample could be a driver mutation (Martincorena et al. 2017; Ding et al. 2008; Peifer et al. 2012; Lawrence et al. 2013). Other in silico tools that are commonly used by cancer biologists cannot be used to assess synonymous mutations either. For instance, algorithms such as SIFT (Ng et al. 2003), PolyPhen2 (Adzhubei et al. 2010) and transFIC (Gonzalez-Perez et al. 2012) analyze the functional impact of an amino acid change imposed by a mutation on protein functioning. It is worth mentioning that a machine-learning based bioinformatic tool has been developed to identify functional synonymous mutations based on features such as conservation, codon usage, splice sites, splicing enhancers and suppressors, and mRNA folding. This tool is, however, not specific for mutations in cancer, and the algorithm was trained on a very limited number of synonymous mutations (Buske et al. 2013). A final factor contributing to the silence on synonymous mutations in the cancer field is their poor visibility to cancer biologists: most research articles describing results from cancer genomics studies do not report synonymous mutations. Furthermore, prominent cancer genomics databases such as the cBio Cancer Genomics Portal (http://cbioportal.org) (Cerami et al. 2012) do not allow querying for synonymous mutations (Soussi et al. 2017).
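The principle of mutational significance described in this section can be sketched as a one-sided binomial test of the observed mutation count in a gene against the count expected from the BMR. The Python code below is a deliberately simplified illustration and does not reproduce MutSig, MutSigCV or any other published algorithm: the function names, the single uniform per-base BMR and the example numbers are all assumptions introduced here, and none of the covariates listed above (replication timing, expression level, chromatin state) is modeled.

```python
import math


def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), summing terms in log space."""
    if k <= 0:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k, n + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * log_p + (n - i) * log_q
        )
        term = math.exp(log_term)
        total += term
        # Beyond the mean the terms only shrink; stop once they are negligible.
        if i > n * p and term <= 1e-18 * total:
            break
    return min(total, 1.0)


def gene_p_value(n_mutations, gene_length_bp, n_samples, bmr):
    """P-value for seeing at least `n_mutations` in a gene by chance.

    `bmr` is an assumed uniform background mutation rate per base per sample.
    Every base in every sample is treated as an independent trial.
    """
    n_trials = gene_length_bp * n_samples
    return binomial_upper_tail(n_mutations, n_trials, bmr)
```

With hypothetical inputs of a 1500 bp coding sequence, 500 samples and a BMR of 2e-6 per base, about 1.5 mutations are expected by chance, so gene_p_value(12, 1500, 500, 2e-6) is very small while gene_p_value(1, 1500, 500, 2e-6) is not. The sketch also makes the limitation discussed above tangible: if the BMR itself is estimated from the synonymous mutations, those same mutations cannot be tested against it.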

3 Screenings for Synonymous Cancer Driver Mutations

The first comprehensive screening for synonymous mutations at pan-cancer level was published in 2014. In that study, Fran Supek and colleagues analyzed 3851 tumors affecting 11 different tissues and screened them for synonymous mutations in 77 known cancer genes (39 oncogenes and 38 tumor suppressors). To do so, they compared the incidence of synonymous mutations in these established oncogenes and tumor suppressors to that in control gene sets of non-cancer-associated genes, after having matched the gene sets for BMR influencing factors. The analyzed oncogenes showed a 23–30% excess of synonymous mutations compared to the genes in the control set, with 16 of these oncogenes even showing an excess of >50%. This conclusion was confirmed using other approaches that are based on comparing mutation rates in the coding sequence of oncogenes to those in their intronic regions or in the coding sequence of neighboring genes. In sharp contrast, the analyzed tumor suppressors were not enriched for synonymous mutations and were even slightly depleted for this type of mutation, with the only exception being the TP53 tumor suppressor. Whereas synonymous mutations in a particular tissue tend to target the same oncogenes as those hit by missense mutations in that tissue, synonymous and missense mutations typically do not co-occur in the same gene in a particular sample (Supek et al. 2014). These observations support the idea that the same oncogene can be activated either by synonymous mutations or by missense mutations in different patients and that synonymous mutations may thus have the same capacity as missense mutations to activate particular oncogenes. Also, the observation that synonymous mutations in a particular oncogene are clustered in evolutionarily conserved genomic regions such as exonic splicing motifs underscores the functional relevance of these mutations. When analyzing the molecular mechanisms by which synonymous mutations may activate oncogenes, Supek et al. provide statistical evidence that the most frequent mechanism is altered splicing (Supek et al. 2014). The study by Supek et al. is a significant milestone in the cancer genomics field, providing the first recognition of the functional role that synonymous mutations may play in cancer pathogenesis. Nevertheless, most analyses are restricted to 77 known cancer genes, whereas 574 experimentally verified cancer genes have been described (Sondka et al. 2018). Many relevant mutations may thus have been missed in the cancer genes that were not analyzed. An example is the set of synonymous mutations in the BCL2L12 oncogene that were previously identified in melanoma (Gartner et al. 2013). Furthermore, the design of the Supek study requires a prior assumption on whether a gene is a cancer gene or not. This prevents the identification of novel cancer genes that may only be hit by synonymous mutations, and may induce errors because of the presence of unrecognized cancer genes in the reference gene set.

In 2019, a novel pan-cancer study by the group of Sven Diederichs was published. Data from 18,028 tumor samples representing 88 different tumor types were included, allowing the analysis of 659,194 synonymous mutations. In this study, cancer mutation signature 1 (Alexandrov et al. 2013) was used to correct the observed frequency of synonymous mutations for background mutation bias in cancer samples. The observations that more than 40% of synonymous mutations target highly conserved nucleotides and that synonymous mutation frequency in samples negatively correlates with their mutational load further underscore positive selection of synonymous mutations. In an effort to rank synonymous mutations for their likelihood to have functional impact, the SynMICdb Score was developed. This score is based on nine different parameters such as mutation frequency in cancer, mutational load of samples containing the mutation, evolutionary conservation, annotation as cancer gene and predicted impact on secondary RNA structure. Interestingly, a significant depletion of synonymous (and missense) mutations was observed in the 5′-end of the coding region of genes, but synonymous mutations in this region had significantly higher SynMICdb scores. The 5′-end of a coding region is typically more structured, and while synonymous mutations in the first codons of the mRNA are thus rarer, the mutations that do occur there show a higher likelihood to alter the structure of this region. A real asset for the field is that synonymous mutations from this study and their associated SynMICdb score can easily be consulted on an online web portal (http://SynMICdb.dkfz.de) (Sharma et al. 2019).

4 How Synonymous Mutations Break the Silence

The studies described above strongly support the notion that positive selection of synonymous mutations occurs in cancer samples. More insights are emerging on the molecular mechanisms that underlie this positive selection. Synonymous mutations can have an impact on mRNA splicing or on mRNA translation speed, with the latter dictating protein folding and stability. Through these and other mechanisms, synonymous mutations can profoundly modulate protein function. In this section, we will elaborate on the growing list of molecular mechanisms by which synonymous mutations affect expression levels of oncogenes and tumor suppressors in cancer (Fig. 5.1 and Table 5.1).

Fig. 5.1 Overview of molecular mechanisms by which synonymous mutations affect gene expression. The chart groups the mechanisms in four rows: splicing and protein stability; codon usage; RNA structure, RBP interaction and miRNA binding; RNA stability and translation efficiency

Table 5.1 Overview table of synonymous mutations described in this text

4.1 Splicing

Splicing is the process in which the intronic sequences in the primary RNA transcript are removed to obtain a mature messenger RNA (mRNA). Synonymous and non-synonymous mutations affecting splicing have widely been reported, and the mutations that are located within a window of 30 bp next to the splice junctions typically have the highest impact on the splicing process. Splicing perturbations can have multiple outcomes. First, a mutation can create a novel splice donor or splice acceptor site within an exon, resulting in exon truncation. Second, perturbed splicing due to a mutation can result in the aberrant addition or removal of an entire exon from the transcript (the latter being referred to as exon skipping). Finally, intron retention can occur, where the creation of a novel splice donor or splice acceptor site by a mutation results in the retention of an entire intron or part of an intron in the mature transcript (Fig. 5.2) (Cartegni et al. 2002).

Fig. 5.2 Splicing dysregulation by single nucleotide variants. This figure illustrates a non-exhaustive set of examples by which synonymous mutations (indicated in red) alter the splicing of exons (indicated as rectangles). Panels, from top to bottom: normal splicing regulation, intron retention, exon skipping and exon truncation

In two recent pan-cancer studies, exome and RNA sequencing data from 8656 (study by Jayasinghe et al.) and 1812 tumors (study by Jung et al.) were integrated with the aim to identify pathogenic cancer mutations that affect splicing. In addition to many non-synonymous mutations that alter splicing, these studies identified 239 synonymous mutations with a detectable impact on splicing in the associated tumor RNA sequencing data. Interestingly, only 33 of these synonymous mutations occurred in genes that have previously been linked to cancer pathogenesis (Jayasinghe et al. 2018; Jung et al. 2015). The study by Jayasinghe et al. developed MiSplice (mutation-induced splicing), a bioinformatic tool to identify mutations that create splice sites. A couple of the synonymous mutations that were picked up in their study were further tested in splicing minigene assays. In such an assay, a minigene sequence consisting of the exonic and intronic regions of interest is tested for effective splicing (Stoss et al. 1999; Cooper 2005; Desviat et al. 2012). These minigene assays support that the c.2817C > T (p.S939S) mutation in the Poly ADP-ribose polymerase 1 (PARP1) gene, which encodes an enzyme involved in cellular DNA repair, generates a novel splice donor site. This then results in a 10 amino acid deletion within the PARP1 catalytic domain in a lung squamous cell carcinoma (LUSC) patient. The c.234A > G (p.T78T) mutation in RAD51C, which encodes a protein involved in DNA double-strand break repair, was likewise validated to cause aberrant splicing in splicing minigene assays (Jayasinghe et al. 2018). Jung et al. analyzed data from six cancer types to identify somatic SNVs that cause intron retention. They characterized two synonymous mutations that caused intron retention in breast cancer, TP53 c.375C > G (p.T125T) and CDH1 c.1008G > A (p.E336E), and one in lung cancer, ARID1A c.4101G > A (p.Q1367Q). 
Intron retention was validated using minigene splicing assays corroborating the synonymous mutation computational analysis results (Jung et al. 2015).

Synonymous mutations can also influence pre-mRNA splicing by affecting exonic splicing enhancer (ESE) and exonic splicing silencer (ESS) motifs. ESE and ESS motifs are sequences of 4–18 bases within an exon that enhance or inhibit splicing, respectively. ESEs execute this function by assisting in the recruitment of splicing factors to the adjacent intron. ESSs, on the other hand, recruit proteins that negatively affect the splicing machinery. Zhang and Xia et al. explored the role of somatic synonymous mutations in melanoma samples from The Cancer Genome Atlas (TCGA), reporting 402 synonymous mutations affecting ESEs and 316 affecting ESSs, and concluded that pathogenic synonymous mutations are enriched in regions with a role in splicing regulation (Zhang et al. 2020). Supek et al. described that synonymous mutations are 1.75 times enriched within 30 bp of an exon boundary in oncogenes as compared to control genes, and that this is not the case for tumor suppressor genes. Analysis of RNA-sequencing data from more than 2000 cancer patients for whom DNA mutation data were also available documented detectable splicing aberrations in the oncogenes that showed the strongest enrichment for synonymous mutations. Their data support that the most frequent mechanism by which synonymous mutations affect the splicing of oncogene encoding RNAs is by creating ESE or by destroying ESS motifs. These results were further validated by analyzing the effects of 12 synonymous mutations in five oncogenes in splicing minigene assays. Splicing changes due to reduced exon skipping by ESS loss or ESE gain were seen for half of the tested mutations (Supek et al. 2014).
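The gain or loss of ESE and ESS motifs by a single nucleotide change, as described above, can be checked with a simple before-and-after motif scan. The Python sketch below is illustrative only: the function names are hypothetical, the motif strings used in the usage note are placeholders rather than a curated ESE or ESS catalogue, and exact string matching is a simplification of the scoring matrices used by dedicated splicing prediction tools.

```python
def motif_hits(seq, motifs):
    """Return the set of (start, motif) pairs for all motif occurrences in seq."""
    hits = set()
    for motif in motifs:
        start = seq.find(motif)
        while start != -1:
            hits.add((start, motif))
            start = seq.find(motif, start + 1)  # allow overlapping matches
    return hits


def motif_changes(exon, pos, alt, motifs):
    """Report motifs gained or lost when exon[pos] (0-based) is changed to alt."""
    mutant = exon[:pos] + alt + exon[pos + 1:]
    before = motif_hits(exon, motifs)
    after = motif_hits(mutant, motifs)
    return {"gained": sorted(after - before), "lost": sorted(before - after)}
```

For example, with the placeholder enhancer motif GAAGAA, motif_changes("CCGAAGAACC", 5, "C", {"GAAGAA"}) reports the motif at offset 2 as lost, and with the placeholder silencer motif TAGG, motif_changes("ACTTGGCA", 3, "A", {"TAGG"}) reports a gained motif at offset 2. Running the same scan separately with an ESE set and an ESS set distinguishes the four cases (ESE gain, ESE loss, ESS gain, ESS loss) discussed in the text.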

Synonymous mutations affecting splicing have been described in major tumor suppressor genes such as APC, BRCA1/2 and TP53. The TP53 gene encodes the p53 tumor suppressor protein, which is considered to be the “Guardian of the genome” because of its role in conserving DNA stability by sensing DNA damage and activating the DNA repair mechanisms. The pathogenic synonymous mutation c.375G > A (p.T125T) in the TP53 gene was first detected in patients with Li-Fraumeni syndrome, an autosomal dominant syndrome with an elevated risk for a variety of cancers. This mutation affects the splice donor site of exon 4, provoking an intron retention (Warneford et al. 1992; Varley et al. 1998). Synonymous mutations affecting the same nucleotide have also been recurrently detected as somatically acquired mutations in cancer patients and in a human T-cell leukemia cell line (Supek et al. 2014; Soudon et al. 1991). Overall, TP53 is strongly enriched for synonymous mutations that mainly cluster in nucleotides adjacent to splice sites (75% of synonymous mutations in TP53). The 3′ terminal nucleotide of exon 6 is also recurrently hit by synonymous (e.g. c.672G > A (p.E224E)) and non-synonymous mutations. Data from a minigene splicing assay support that the c.672G > A (p.E224E) mutation activates a cryptic splice site, resulting in an mRNA with a shifted open reading frame. Yet another synonymous mutation in TP53 (c.993G > A (p.Q331Q)), in the 3′ terminal nucleotide of exon 9, was described with similar properties (Supek et al. 2014).

The APC protein is a negative regulator of the WNT signaling cascade that is involved in cell adhesion. In patients with familial adenomatous polyposis, an autosomal dominant disorder characterized by the development of colorectal adenomas, a c.1869G > T (p.R623R) mutation in APC was shown to cause exon skipping (Montera et al. 2001). Another study in a patient who primarily developed lung carcinoma and later brain metastasis, described a c.5883G > A (p.P1961P) nucleotide substitution in the APC gene that results in an aberrantly spliced mRNA. Interestingly, this mutation was present in the metastasis but not in the primary lung tumor, suggesting that this mutation may have played a role in tumor metastasis (Pecina-Slaus et al. 2010).

The BRCA1 and BRCA2 genes encode tumor suppressor proteins involved in the repair pathway of DNA double-strand breaks, and their inactivation results in an elevated risk to develop breast and ovarian cancer. Around 5–10% of women with breast or ovarian cancer have a congenital mutation in the BRCA1 or BRCA2 gene, and more than 50% of breast tumors and 7% of ovarian tumors display acquired mutations in these genes (O’Donovan et al. 2010). Disruption of BRCA1 and BRCA2 function by non-synonymous mutations has been well established, but a few studies also indicate a role for synonymous mutations in this context. In BRCA1, the c.5073A > T (p.T1691T) mutation was reported by two studies to affect splicing in families with breast and ovarian cancer. Using the NNSPLICE splicing prediction tool (Reese et al. 1997), this mutation was predicted to promote skipping of exon 17. These results were further consolidated by RT-PCR detection of the aberrant transcript (Coppa et al. 2018; Minucci et al. 2018). Regarding BRCA2, Hansen et al. described the c.744G > A (p.K172K) synonymous mutation at the last base in exon 6, next to the splice donor site, in a Danish family with breast and ovarian cancer. Utilizing exon trapping analysis, they showed that the mutation results in skipping of exon 6 or of both exons 5 and 6 (Hansen et al. 2010). Recently, one group described mutations that could affect splicing between exons 2–9 of the BRCA2 gene. From the mutation databases BIC and UMD (Beroud et al. 2016), they extracted 302 different variants. After in silico analysis utilizing NNSPLICE (Reese et al. 1997) and Human Splicing Finder (Desmet et al. 2009), 84 variants were selected for splicing minigene assays. Fragment analysis by capillary electrophoresis showed that 53 mutations (63.8%) impaired splicing: 27 exonic (18 missense, 3 synonymous, 2 nonsense, and 4 frameshift) and 26 intronic changes. 
One of these synonymous mutations (c.378A > G (p.Q126Q)) was reported to disrupt an ESE motif, whereas the synonymous mutations, c.222G > A (p.L74L) and c.441A > G (p.Q147Q) create an ESS motif. Interestingly, the c.441A > G (p.Q147Q) mutation was previously catalogued as a variant of uncertain clinical significance (VUS) and was now reclassified as pathogenic or likely pathogenic (Fraile-Bethencourt et al. 2019).

Finally, we want to highlight a recent study on synonymous mutations in the GATA2 gene. A variety of germline mutations driving deficiency for the GATA2 transcription factor lead to a complex multi-system disorder that can present with many manifestations including cytopenias, bone marrow failure, myelodysplastic syndrome/acute myeloid leukemia (MDS/AML), and severe immunodeficiency (Crispino et al. 2017; Hirabayashi et al. 2017). Kozyra et al. analyzed a cohort of 911 individuals with the cancer predisposing disease MDS. In this cohort, 110 patients carried at least one mutation in GATA2, of which nine patients carried synonymous GATA2 mutations: c.351C > G (p.T117T) (n = 4 patients); c.649C > T (p.L217L); c.981G > A (p.G327G); c.1023C > T (p.A341A) and c.1416G > A (p.P472P) (n = 2). Among these five different synonymous mutations, three were predicted in silico to affect splicing. The c.351C > G (p.T117T) and c.981G > A (p.G327G) mutations were predicted to create a novel ESS motif, whereas c.1023C > T (p.A341A) was predicted to disrupt an ESE motif. Splicing analysis by RNA sequencing and RT-PCR on patient material confirmed the aberrant splicing associated with the most recurrent c.351C > G (p.T117T) variant, but not for the other variants. However, for these other mutations, a complete lack or substantial reduction in RNA levels of the mutant allele could be shown (Kozyra et al. 2020).

4.2 mRNA Structure

The secondary structure of an mRNA molecule, and by extension its three-dimensional folding, is dictated by base pairing between complementary nucleotides. The structure of an mRNA molecule is essential for its functioning. First, mRNA structure determines its stability (Nowakowski et al. 1997), and a mutation that alters mRNA structure may thus affect cellular levels of the protein that it encodes by altering mRNA stability. The RNA structure also determines the accessibility of the mRNA for post-transcriptional modifications such as methylation, and can profoundly impact mRNA function (Anreiter et al. 2020). Furthermore, the mRNA structure dictates the efficiency and speed by which a ribosome can translate an mRNA into protein. In this context, it has been well established that setting the correct translation initiation speed is essential, and that this initiation speed is heavily determined by mRNA structures in the 5′ UTR region (Livingstone et al. 2010). The impact of an altered folding of the open reading frame of an mRNA on protein translation efficiency by the ribosome is less well understood. However, it is hard to imagine that a significant change in its 3D structure due to a nucleotide change would not affect ribosomal translation speed. Finally, mRNA structure also dictates interaction of an mRNA with RNA binding proteins (RBPs) and with microRNAs, which again will affect RNA stability and translation efficiency.

Some nucleotides have an essential role in maintaining more complex elements of mRNA structure such as hairpin or pseudoknot structures, and mutation of such a nucleotide can thus have far reaching consequences. Interestingly, while all positions in a codon are important to maintain secondary mRNA structure, synonymous codon positions contribute most heavily to mRNA stability, and base pairing at the third codon position is significantly higher as compared to other codon sites in mammalian transcriptomes (Shabalina et al. 2006). Hence, the impact of synonymous nucleotide substitutions on secondary mRNA structure is not to be underestimated.

A first example of a synonymous mutation in cancer that may promote oncogenesis by altering mRNA structure is the c.30A > C (p.G10G) mutation in the KRAS oncogene. The RAS family proteins are small GTPases involved in transmitting incoming signals at the cell surface in order to activate genes involved in cell differentiation, survival and growth. Mutations in KRAS, HRAS or NRAS are observed in around 25% of all tumors, and these mutations promote cancer by causing hyperactive cell signaling (Downward 2003). The c.30A > C (p.G10G) mutation in KRAS was picked up in the pan-cancer study by the Diederichs group because its SynMICdb score ranked in the 99.9th percentile of all analyzed synonymous mutations and because of its high score with two different algorithms to predict mutation induced changes in RNA structure (remuRNA (Salari et al. 2013) and RNAsnp (Sabarinathan et al. 2013)). Additionally, in silico RNA structure prediction tools as well as chemical probing experiments by SHAPE (Selective 2′-Hydroxyl Acylation analyzed by Primer Extension) confirmed the impact of the c.30A > C mutation on KRAS mRNA structure. Upon overexpression of a cDNA with this mutation in HEK293 cells, a minor (±20%) induction of KRAS expression was observed. Addition of the endogenous 5′ UTR to this overexpression construct may be needed to see a more profound effect on KRAS expression and mRNA structure (Sharma et al. 2019).

A second interesting example is the c.309C > T (p.R103R) variant in ZFP36. ZFP36 encodes an RNA binding protein that binds adenylate and uridylate (AU)-rich elements (ARE) in the 3′UTR of mRNAs involved in inflammation, cell cycle regulation and angiogenesis, leading to the degradation of these mRNAs. This function of ZFP36, together with the reduced ZFP36 mRNA and protein levels in a variety of cancer types, points towards a tumor suppressor role for this protein (Sanduja et al. 2012). The c.309C > T (p.R103R) variant in ZFP36 corresponds to a known single nucleotide polymorphism (SNP; rs3746083) that has been linked to rheumatoid arthritis and that occurs in the healthy population at a frequency of 3.5% (Carrick et al. 2006). The c.309C > T variant in ZFP36 is present in an aggressive breast cancer cell line with detectable ZFP36 mRNA levels, in which ZFP36 protein expression could not be shown. In vitro transcription and translation assays as well as cDNA transfection experiments in HEK293 cells confirmed that introduction of the c.309C > T mutation in the ZFP36 mRNA drastically impairs ZFP36 protein expression. Whereas this mutation does not significantly alter the ZFP36 mRNA stability, it is associated with a profound reduction in association with heavy polysome fractions in polysome profiling assays. These data support that this variant reduces ZFP36 translation efficiency, which may be caused by a more stable secondary structure of the c.309C > T mutant ZFP36 mRNA as compared to wild type, as indicated by in silico predictions of mRNA folding (Griseri et al. 2011). The International Cancer Genome Consortium (ICGC; https://icgc.org/) reports the c.309C > T mutation in ZFP36 in acute myeloid leukemia and colon adenocarcinoma. 
Interestingly, the incidence of this variant is also 5 times higher in breast tumors that are resistant to treatment with the anti-HER2 antibody Herceptin (or Trastuzumab) as compared to Herceptin responsive breast cancers (Griseri et al. 2011). This observation supports that synonymous mutations may not only be relevant for the oncogenic transformation process of cells, but also for determining the response of cancer cells to therapy.

The idea that an altered mRNA structure due to synonymous mutations may impact affinity for RNA binding proteins (RBPs) was recently explored at pan-cancer level using PIVar. PIVar is a computational pipeline that was developed to mine TCGA mutation data for SNVs that overlap with RBP-binding sites identified by crosslinking and immunoprecipitation sequencing (CLIP-seq). To filter SNVs that may have a functional impact, PIVar subsequently analyzes these SNVs for impact on RNA expression, secondary RNA structure and RBP binding. Using PIVar, almost 23,000 synonymous SNVs across 22 cancer types and targeting 2042 genes were predicted to affect RBP affinity. Depending on the cancer type, this corresponds to 5–15% of the synonymous mutations in that cancer type. The binding motifs of 35 RBPs were overrepresented in the RBP binding disrupting synonymous mutations identified by PIVar. Network analyses revealed genes related to phosphoinositide 3-kinase (PI3K), NOTCH and mTOR signaling to be significantly affected by RBP affinity disrupting synonymous mutations. Interestingly, besides prominent known cancer genes, this study identifies many genes that have previously not been linked to cancer as host genes for RBP disrupting synonymous mutations. Electrophoretic mobility shift assays (EMSAs) provided experimental support for disruption of RBP binding upon introduction of synonymous mutations in DAB2, ZFHX3 and USP9X (Teng et al. 2020). Further experiments are, however, required to test the functional relevance of these findings in cancer pathogenesis.
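The first step described above, finding SNVs that fall inside CLIP-seq derived RBP-binding sites, is in essence an interval overlap query. The Python sketch below illustrates that step only and is not the PIVar implementation: the function names, the tuple-based input format (0-based, half-open intervals) and the placeholder RBP labels in the usage note are assumptions made here, and the downstream filters on expression, RNA structure and binding affinity are not modeled.

```python
from bisect import bisect_right
from collections import defaultdict


def index_sites(sites):
    """Group binding sites by chromosome and sort them by start coordinate.

    `sites` is an iterable of (chrom, start, end, rbp) tuples using
    0-based, half-open coordinates (assumed format for this sketch).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, rbp in sites:
        by_chrom[chrom].append((start, end, rbp))
    for chrom_sites in by_chrom.values():
        chrom_sites.sort()
    return by_chrom


def rbps_at(index, chrom, pos):
    """Return the sorted names of RBPs whose binding site covers position `pos`."""
    chrom_sites = index.get(chrom, [])
    # Only sites starting at or before `pos` can cover it.
    upper = bisect_right(chrom_sites, (pos, float("inf"), ""))
    # Linear scan of the candidates; an interval tree would scale better.
    return sorted({rbp for start, end, rbp in chrom_sites[:upper] if end > pos})
```

Given an index built from the placeholder sites ("chr1", 100, 120, "RBP_A") and ("chr1", 110, 130, "RBP_B"), a variant at position 115 on chr1 overlaps both sites, whereas a variant at position 120 overlaps only the second one because the intervals are half-open.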

MicroRNAs (miRNAs) are a family of small non-coding RNAs of approximately 18–28 nucleotides in length. Their primary role is post-transcriptional regulation of the expression of proteins involved in a wide variety of biological processes. Many miRNAs have been identified as ‘oncomirs’. These oncogenic miRNAs are misexpressed in cancer and promote cancer by modulating oncogenes or tumor suppressors (Calin et al. 2006; Peng et al. 2016). Misexpression of these miRNAs in cancer originates from a variety of sources such as copy number changes, epigenetic changes in DNA methylation and histone deacetylation, and alterations in miRNA transcription or miRNA processing factors (Calin et al. 2004; Mavrakis et al. 2010; Scott et al. 2006; Saito et al. 2006; Chang et al. 2008; He et al. 2007; Thomson et al. 2006; Karube et al. 2005). Besides misexpression, the function of a miRNA can be abrogated by mutation of its binding region in the target mRNA. This mechanism has not been studied extensively, but several interesting examples have been described (Yuan et al. 2019). One such example is the c.51C > T (p.F17F) synonymous mutation in the BCL2L12 gene. This highly recurrent mutation is present in 4% of melanoma tumors and results in higher BCL2L12 transcript and protein levels. Two different miRNA target prediction programs predicted that the miRNA hsa-miR-671-5p binds the wild type BCL2L12 sequence, but not the c.51C > T mutant sequence. In agreement with this, transfection of this miRNA into cells did not affect c.51C > T mutant BCL2L12 mRNA levels, whereas it significantly reduced wild type BCL2L12 mRNA levels. In other words, this synonymous mutation protects BCL2L12 from hsa-miR-671-5p mediated reduction of its transcript levels (Gartner et al. 2013). BCL2L12 belongs to the Bcl-2 protein family of apoptotic regulators and has been previously linked to tumorigenesis. In the majority of glioblastomas, BCL2L12 is upregulated and confers resistance to apoptosis by binding TP53 (Stegh et al. 2010). Whereas the synonymous BCL2L12 mutation in melanoma does not affect TP53 binding, the mutation was shown to protect melanoma cells from UV-induced apoptosis and is associated with reduced p53 target gene transcription (Gartner et al. 2013).
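How a single nucleotide change can abolish a predicted miRNA binding site can be illustrated with a simple seed-match check. The Python sketch below assumes the canonical rule that a target site is the reverse complement of miRNA nucleotides 2–8; actual target prediction programs weigh many additional features. The sequences are invented toy sequences, not the real hsa-miR-671-5p or BCL2L12 sequences, and the function names are hypothetical.

```python
# Translation table for RNA complementarity (A-U, C-G).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def to_rna(seq):
    """Upper-case a sequence and convert DNA notation (T) to RNA notation (U)."""
    return seq.upper().replace("T", "U")


def seed_target_site(mirna):
    """Return the mRNA motif (5'->3') complementary to miRNA nucleotides 2-8."""
    seed = to_rna(mirna)[1:8]  # positions 2-8, counted from the miRNA 5' end
    return seed.translate(COMPLEMENT)[::-1]


def has_seed_site(mrna, mirna):
    """Check whether the mRNA contains a perfect match to the miRNA seed."""
    return seed_target_site(mirna) in to_rna(mrna)


# Invented toy sequences for illustration only.
toy_mirna = "UACGGAUCCGUAAGCUAGCUAG"
toy_wild_type = "AAGGAUCCGUCC"
toy_mutant = "AAGGAUCUGUCC"  # single C>U change inside the seed match

wild_type_bound = has_seed_site(toy_wild_type, toy_mirna)
mutant_bound = has_seed_site(toy_mutant, toy_mirna)
```

With these toy inputs, the wild type fragment carries the seed match and the mutant fragment does not, mirroring the loss of predicted binding described above.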

4.3 Codon Usage

The protein coding region of human DNA uses 64 different nucleotide triplets (codons), of which 61 encode the 20 different amino acids and 3 signal translation termination. Most amino acids can thus be encoded by two to six synonymous codons. The particular codon that is used for an amino acid however has implications for the translation speed of the protein, as not every tRNA is present in equal abundance in each cell or tissue (Kames et al. 2020). A synonymous nucleotide change can thus result in a codon for which the tRNA abundance is different, which in turn affects translation speed, resulting in altered protein levels and/or protein misfolding and degradation (Tsai et al. 2008; Lampson et al. 2013; Hanson et al. 2018). Codon frequency has been shown to correlate well with expression of the corresponding tRNAs (Mahlab et al. 2012). Therefore, potential effects on translation speed can be scored by evaluating the difference in abundance between the wild type and mutant codon in the human protein coding genome. In addition to codon frequency, two other types of codon usage bias exist: codon pair bias and codon co-occurrence bias. Codon pair bias refers to the non-random distribution of nucleotides neighboring a particular codon. This mechanism is thought to influence the efficiency of the translation process through interaction of the tRNAs in the A and P sites of the ribosome (Buchan et al. 2006). Codon co-occurrence bias works by clustering synonymous codons that are recognized by the same tRNA. This type of codon bias involves both frequent and rare codons and is most prominent in highly expressed genes that must be rapidly induced (Cannarozzi et al. 2010). Chu and Wei calculated the impact of synonymous mutations on codon bias and codon frequency. For this purpose, they assigned a codon bias value between −1 and 1, where a positive value indicates a preferred codon. They found that in cancer-related genes, synonymous codons ending in C/G were preferred over those ending in A/T. Furthermore, cancer-related genes were significantly positively correlated with codon bias changes, implying that the synonymous SNPs in cancer-related genes tend to gain a more frequently used or preferred codon (Chu et al. 2019). A recent study used different statistical measures to analyze the possible role of codon bias in thyroid carcinoma genes and proposed some synonymous mutations that could significantly affect codon usage (Pepe et al. 2020). Optimization of codon usage has been reported to benefit cellular translation and cell growth (Qian et al. 2012; Yang et al. 2014).
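Scoring a synonymous change by the difference in abundance between the wild type and mutant codon can be sketched in a few lines of Python. The sketch below derives relative synonymous codon frequencies from a set of coding sequences and subtracts the wild type from the mutant frequency. This is an illustrative measure under stated assumptions, not the codon bias value defined by Chu and Wei; the function names and the toy coding sequence are hypothetical, and a real analysis would use the complete human protein coding genome.

```python
from collections import Counter
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_counts(coding_sequences):
    """Count in-frame codons across a collection of coding DNA sequences."""
    counts = Counter()
    for cds in coding_sequences:
        cds = cds.upper()
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:  # skip codons with ambiguous bases
                counts[codon] += 1
    return counts


def relative_frequencies(counts):
    """Frequency of each codon among the codons encoding the same amino acid."""
    totals = Counter()
    for codon, n in counts.items():
        totals[CODON_TABLE[codon]] += n
    return {codon: n / totals[CODON_TABLE[codon]] for codon, n in counts.items()}


def synonymous_shift(freqs, wt_codon, mut_codon):
    """Mutant minus wild type relative frequency; positive means a more common codon."""
    if CODON_TABLE[wt_codon] != CODON_TABLE[mut_codon]:
        raise ValueError("codons are not synonymous")
    return freqs.get(mut_codon, 0.0) - freqs.get(wt_codon, 0.0)


# Hypothetical toy coding sequence: ATG CTG CTG CTG CTC TTA TAA.
toy_freqs = relative_frequencies(codon_counts(["ATGCTGCTGCTGCTCTTATAA"]))
toy_shift = synonymous_shift(toy_freqs, "CTC", "CTG")  # leucine, rarer to more common
```

In this toy example the leucine codon CTG is three times as frequent as CTC, so a CTC to CTG change yields a positive shift, corresponding to a gain of a more frequently used codon.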

For the RAS family of oncogenes, codon bias is an established mechanism of gene expression regulation. Whereas the three RAS family proteins have a very similar amino acid composition, their codon usage is different. The nucleotide sequence of KRAS is enriched in rare codons (decoded by low-abundance tRNAs) in comparison to HRAS, which contains many common codons. This explains the poor cellular translation efficiency and low protein expression level of KRAS as compared to HRAS in the human body. In agreement with this, HCT116 colon cancer cells in which the rare KRAS codons were replaced by more common codons using an adeno-associated viral vector showed fivefold higher KRAS protein and twofold higher KRAS mRNA expression than cells with the original KRAS sequence (Lampson et al. 2013). In other cancer cell lines as well, higher KRAS protein expression was observed upon expression of a cDNA in which the rare KRAS codons were replaced by common codons. Conversely, HRAS protein expression is reduced by replacing its common codons by rare codons. Interestingly, these codon replacements affected drug sensitivity of the cancer cell lines, and the higher KRAS expression and reduced HRAS expression were associated with resistance to kinase inhibitors such as vemurafenib, gefitinib and sunitinib (Ali et al. 2017).

4.4 Protein Stability

In the context of TP53, the synonymous c.66A > G (p.L22L) variant was shown to affect TP53 protein stability. In non-stressed cellular conditions, the ubiquitin ligase MDM2 induces continuous TP53 ubiquitination followed by proteasomal TP53 degradation. When DNA double-strand breaks occur in the cell, TP53 accumulates because of phosphorylation of MDM2 by the ATM kinase. This phosphorylation switches MDM2 from a negative to a positive regulator of TP53 by causing MDM2 to bind the TP53 mRNA and promote its translation. ATM also phosphorylates residue S15 on the nascent TP53 protein, and this phosphorylation prevents the newly produced TP53 protein from being immediately ubiquitinated by MDM2 and degraded by the proteasome. Interestingly, the synonymous c.66A > G nucleotide change in codon L22 of TP53 prevents this phosphorylation of the nascent TP53 protein, leading to TP53 degradation (Karakostis et al. 2019). Whereas this synonymous variant clearly affects TP53 function, it is very rare in cancer. The mutation has been reported only once, in a patient with chronic lymphocytic leukemia (CLL) (Oscier et al. 2002), and the variant also corresponds to a rare SNP (rs748527030).

4.5 Other Mechanisms

For the synonymous mutations described above, a (potential) mechanism by which these variants affect the protein expression level of the mutated gene has been proposed or experimentally demonstrated. However, the mechanism is not always clear. An interesting example is the c.2361G > A (p.Q787Q) polymorphism (SNP rs1050171) in the epidermal growth factor receptor (EGFR). This SNP has been linked to higher EGFR translation and reduced sensitivity to EGFR targeting drugs in the context of renal cell carcinoma (Grepin et al. 2020). In complete contrast, this SNP induces higher sensitivity to EGFR targeting drugs in head and neck cancer, where it alters EGFR splicing because of reduced expression of the EGFR-AS1 long noncoding RNA (Tan et al. 2017). In fact, a number of synonymous SNPs in EGFR have been linked to drug sensitivity and patient outcome, but mechanistic details on these effects are missing (Toomey et al. 2016). The variety of molecular mechanisms by which synonymous mutations may potentially affect gene expression is very large, and it is to be expected that synonymous mutations for which mechanistic insights are currently lacking act through less well characterized mechanisms such as exonic DNA methylation, transcription regulatory sequences or post-transcriptional RNA modifications.

5 What Is Needed to Entirely Break the Silence on Synonymous Mutations?

In the past 15 years, next generation sequencing analysis of large cohorts of cancer samples has resulted in a detailed atlas of mutations. Significant efforts have gone towards characterizing the functional impact of non-synonymous mutations in protein coding regions and recently also of non-coding mutations. Relatively little attention has been dedicated to the characterization of synonymous mutations, despite the fact that they represent 23% of the point mutations in the coding region (Sharma et al. 2019). Based on biostatistical analyses, it is estimated that somatic synonymous mutations represent 6–8% of all cancer driver mutations due to single nucleotide substitutions (Supek et al. 2014). This is however only a rough estimate, as the number of somatic synonymous mutations that have been experimentally tested is highly limited. Furthermore, this experimental testing is often restricted to showing an effect of the mutation on the mRNA and/or protein expression level of the mutant gene. In order to draw conclusions on cancer driver activity, cell transformation assays and/or tumor acceleration studies in animal models would be required.

Transformation of a normal cell into a malignant one typically requires accumulation of a series of mutations. It is unclear whether synonymous mutations act as early mutations that bring the cell into a pre-malignant state, or as late mutations that release the brake on full-blown cell transformation. Single cell sequencing studies can shed light on this mutational order. As described in this text, a number of synonymous SNPs have been shown to affect the protein expression level of cancer genes, resulting in altered drug sensitivity. It will be of interest to see whether somatic synonymous mutations can also be linked to therapy failure and disease reappearance in relapse samples.