Introduction

The simple one-to-one relationship between genes and traits provides an excellent framework for understanding the core concepts of genetics, but real genotype-phenotype relationships are rarely that simple. For instance, even ‘identical twins’ who are raised together are never truly identical (Jonsson et al. 2021).

On one hand, the dominant and recessive patterns of inheritance described by Mendel represent the extremes of a spectrum of states. On the other hand, this spectrum is a function of (1) allele strength, as not all mutant alleles are equal; (2) genetic background, as other genes influence sensitivity or resistance to a given mutation; and (3) interaction with the environment, a phenomenon known as phenotypic plasticity (West-Eberhard 1989).

Thus, the relationship between the genotype of an organism and its phenotypes is complex. This complexity has been illustrated by genetic studies of monogenic disorders in humans, in which the same mutation can produce different phenotypes (Quarton et al. 2020). Examples include cystic fibrosis (CF), Duchenne muscular dystrophy (DMD), Marfan syndrome and beta thalassemia, all of which are monogenic disorders but display wide phenotypic variability (Amaral 2015; Cao and Galanello 2010; Carter 1977; Lovering et al. 2005).

A question that many scientists have therefore been trying to answer is: ‘What causes this phenotypic variability at the molecular level?’

One part of the answer started to emerge more than a century ago with the famous work of Calvin Bridges on modifier genes in Drosophila melanogaster (Bridges 1919). Modifier genes can influence the penetrance, dominance, expressivity, and pleiotropy of a phenotype (Nadeau 2001). The evidence for the action of modifier genes is now extensive, both in humans and in model organisms, and the corresponding studies have shown that their effects on the phenotypic presentation of disease-causing variants can be subtle or profound (Cutting 2010; Nadeau 2001).

A typical example of a modifier gene comes from the APC−/+ (adenomatous polyposis coli) mouse model, a murine counterpart of human FAP (familial adenomatous polyposis) (Moser et al. 1990). These mice display wide phenotypic variation, for instance in the number of polyps, depending on the genetic background (Moser et al. 1990). By linkage analysis, a modifier gene in the APC−/+ mouse model, originally named Mom1 (Modifier of Min-1), was mapped to the distal part of chromosome 4 (Dietrich et al. 1999). Mom1 was able to explain 50% of the genetic variance in polyp number (Nadeau 2001). To date, many modifier genes have been described for different diseases, including retinitis pigmentosa, CF, DMD, Bardet-Biedl syndrome, Marfan syndrome, Rett syndrome, neurofibromatosis and thalassemia (Collacoa and Cutting 2008; Dietz et al. 1994; Lobo 2008; Nadeau 2001).

Some modifier genes are inherited as genetic background, while others can be activated through different compensatory mechanisms (Hartman et al. 2001; Hilgert et al. 2009; Rossi et al. 2015; Rutherford 2000; Whitacre and Bender 2010).

Genetic buffering mechanisms such as genetic compensation and transcriptional adaptation can trigger the expression of modifier genes and have been proposed as mechanisms to explain apparent genotype to phenotype discrepancies (El-Brolosy et al. 2019; Lewontin 1974; Rossi et al. 2015; Wagner and Zhang 2011). By buffering the effects of deleterious genetic variations, cells can modify the outcome phenotype in different ways and at different levels (Rossi et al. 2015; El-Brolosy and Stainier 2017).

Genetic Redundancy and Genetic Compensation

Eukaryotic cells show surprising robustness against internal and external perturbations, which can be partially attributed to functional redundancy established after small-scale and whole genome duplication events throughout evolution (Kuzmin et al. 2021; Lynch and Conery 2000).

After their duplication, genes initially gain redundant copies that can further evolve via processes such as non-functionalization, neofunctionalization or subfunctionalization into pseudogenes, genes with new biochemical functions or genes retaining part of the ancestral functions, respectively (Lundin 1999; Kuzmin et al. 2021). The presence of gene copies that retain some of the ancestral activity can provide a certain level of genetic redundancy that supports genetic robustness (Ihmels et al. 2007; Kafri et al. 2006; Kuzmin et al. 2021; Li et al. 2010). The evolution of genetic robustness by duplication is followed by a neutral mode characterized by a loss of backup capacity that is proportional to the divergence time. In the meantime, natural selection might act on a few pairs to maintain their long-term backup capacity; these pairs evolve slowly, are co-clustered in the same protein complexes and tend to interact with similar partners (Kuzmin et al. 2021). This can be defined as cellular robustness or genetic compensation arising from genetic redundancy (Stelling et al. 2004; Wagner 2005). The budding yeast S. cerevisiae has been a prime model for studying duplicated genes and redundancy (Goffeau et al. 1996). From a set of 201 duplicate gene pairs, 69 (34%) were found to be at least partially redundant (Dean et al. 2008). Furthermore, 49 of those redundant pairs (71%) were synthetically lethal, indicating that duplication partners could compensate for gene loss in the single mutants through functional redundancy (Dean et al. 2008).
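The proportions quoted above for the yeast duplicate pairs can be checked with a few lines of arithmetic. The sketch below is only illustrative: the three counts are taken from the text, while the variable names, the helper function and the reading of the 49 synthetically lethal cases as a subset of the 69 redundant pairs are assumptions made here, not part of the cited study.

```python
# Counts as quoted in the text (Dean et al. 2008); names are hypothetical.
total_pairs = 201            # duplicate gene pairs examined
redundant_pairs = 69         # at least partially redundant
synthetic_lethal_pairs = 49  # assumed to be a subset of the redundant pairs


def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Share of all examined pairs that are at least partially redundant.
redundant_pct = percent(redundant_pairs, total_pairs)

# Share of the redundant pairs that are synthetically lethal.
lethal_pct = percent(synthetic_lethal_pairs, redundant_pairs)

# Share of all examined pairs that are both redundant and synthetically lethal.
lethal_of_total_pct = percent(synthetic_lethal_pairs, total_pairs)

print(f"redundant: {redundant_pct}% of all pairs")
print(f"synthetically lethal: {lethal_pct}% of redundant pairs")
print(f"synthetically lethal: {lethal_of_total_pct}% of all pairs")
```

Under that reading, the two percentages in the text refer to different denominators: 34% is relative to all 201 pairs, whereas 71% is relative to the 69 redundant pairs, which corresponds to roughly a quarter of the full set.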

In contrast to genetic redundancy, genetic compensation refers to an active mechanism in which a deleterious mutation triggers the expression of modifier genes, so that the organism does not develop the expected harmful phenotype (Hartman et al. 2001; Rutherford 2000; Whitacre and Bender 2010). How genetic compensation is triggered is not fully understood, and different mechanisms seem to be involved. In general, genetic compensation seems to be achieved through protein feedback loops and to be independent of the type of mutation (Deconinck et al. 1997). A well-known example of genetic compensation involves Dystrophin (DMD), a scaffold protein that links the actin cytoskeleton to the extracellular matrix (Blake et al. 2002; Nowak and Davies 2004). The absence of Dystrophin causes Duchenne muscular dystrophy, a devastating X-linked hereditary childhood disease characterized by progressive muscle degeneration, loss of ambulation in adolescence, and cardiopulmonary failure leading to the death of patients during the third decade of their lives (Blake et al. 2002). mdx mice have been widely used to model DMD despite showing a milder phenotype than their human counterparts (Deconinck et al. 1997; Spitali et al. 2013). In these mice, an overall upregulation and sarcolemmal recruitment of the product of a modifier gene named Utrophin (UTRN) in skeletal muscle and other tissues has been well described (Deconinck et al. 1997). A similar compensatory effect via UTRN upregulation has been suggested in some DMD patients, who also show higher levels of UTRN in their muscles and exhibit milder symptoms (Janghra et al. 2016; Nowak and Davies 2004; Kleopa et al. 2006). UTRN upregulation is mediated, at least in part, by a protein feedback loop. The Dystrophin-associated protein complex (DAPC) has been shown to play both mechanical and nonmechanical roles in stabilizing the sarcolemma and protecting muscle cells from contraction-induced damage.
Thus, DMD mutations destabilize the DAPC and produce muscle weakness and muscular dystrophy (Ehmsen et al. 2002). The myogenic Akt (protein kinase B) signaling family (Peter et al. 2009) can be activated by numerous extracellular stimuli, for example changes in the composition of the extracellular matrix. In mouse models of dystrophy, the instability of the DAPC leads to signals that are transduced by tyrosine kinase receptors and integrins, followed by a cascade of Akt signaling that leads to the upregulation of UTRN (Peter et al. 2009). Notably, Integrin α7β1 is strongly upregulated in mdx mice, and this expression change has been linked to a reduction of the dystrophic phenotype and a partial restoration of viability in dystrophic mice (Lowell and Mayadas 2012). Other examples of genetic buffering via protein feedback loops include Anthrax toxin receptor 1 (ANTXR1), Lamins, Filamins, Emerin and other mechanosensing molecules (Cheng et al. 2019; Razinia et al. 2011; Shemesh et al. 2005).

Transcriptional Adaptation

Transcriptional adaptation (TA) is a more recent example of ‘active’ genetic buffering where the presence of premature termination codons (PTCs) in engineered mutant alleles has been reported to correlate with cases of genotype–phenotype discrepancies (El-Brolosy et al. 2019; Ma et al. 2019; Rossi et al. 2015; Serobyan et al. 2020).

TA was initially described in zebrafish during the analysis of discrepancies between antisense (morpholino oligonucleotide) and genetically engineered models of vascular development (Rossi et al. 2015). Following extensive discussion in the zebrafish field about the shift from antisense reagents to the newer genome engineering tools, a meta-analysis of the accumulating engineered zebrafish mutants lacking a phenotype revealed a poor correlation between morpholino-induced and mutant phenotypes, raising concerns about off-target effects (Kok et al. 2015; Schulte-Merker and Stainier 2014; Stainier et al. 2015). Knockdown of a protein product by inhibiting mRNA translation produced strong vascular phenotypes, whereas engineered mutants of the underlying gene developed a functional vascular system and were viable and fertile (Rossi et al. 2015). Unlike in cases of weak mutant alleles or nonspecific effects of the antisense reagents used, mutant embryos were also resistant to knockdown, indicating that genetic rewiring had allowed them to develop through alternative avenues (Rossi et al. 2015). This result also argues against genetic redundancy, in which the presence of redundant genes or pathways means that their combined inactivation is required to uncover a phenotype (Rossi et al. 2015).

In the same study, transcriptional and proteomic comparisons of the knockdown and knockout states led to the identification of a group of modifier genes that were upregulated in the latter. Introducing some of those candidates could partially protect from the knockdown effects, indicating that these genes could replace the function of the mutated gene (Rossi et al. 2015). These results also reinforced the hypothesis that silent potential within the genome can be mobilized to overcome genetic insults (Kontarakis and Stainier 2020). Failure to do so can lead to disease, whereas successfully activating compensatory mechanisms, such as transcriptional adaptation, could ameliorate disease symptoms (Kontarakis and Stainier 2020). Many reports involving zebrafish have since associated mild phenotypes with transcriptional adaptation, but only some have provided experimental evidence of such a relationship (Kontarakis and Stainier 2020). A good example is the analysis of actc1b (Actin alpha cardiac muscle 1b) mutants, which show mild muscle defects and resistance to actc1b MO injection (Sztal et al. 2018). The authors showed the upregulation of actc1a (Actin Alpha Cardiac Muscle 1) and suggested that this paralogue served as a functionally redundant gene in actc1b mutants. Similarly, nid1a (Nidogen 1a) mutant zebrafish were reported to lack the short body phenotype exhibited by nid1a morphants (Zhu et al. 2017). The increased expression of nid1b (Nidogen 1b) and nid2a (Nidogen 2a) in nid1a mutants, but not in morphants, indicated that a transcriptional adaptation response was at play (Zhu et al. 2017). The authors were able to uncover the short body phenotype in nid1a mutants through nid1b and nid2a morpholino knockdown experiments. This result provided further support for the functional compensation provided by nid1b and nid2a activation in nid1a mutants.

How the transcriptome is modulated falls into the general field of gene expression regulation, but experimental support for a mechanism that initiates TA had long been lacking (El-Brolosy and Stainier 2017). Two independent studies recently showed that, in zebrafish and mouse models of transcriptional adaptation, the mutant mRNA serves as a signal to induce transcription of the adapting genes (El-Brolosy et al. 2019; Ma et al. 2019). This finding points to a number of possible mechanisms, including degradation via RNA quality control mechanisms such as nonsense-mediated decay (NMD) (Brogna and Wen 2009). NMD represents only one form of such cellular pathways. The recent zebrafish data support a parallel pathway in which the premature termination codon (PTC)-containing mRNA is “saved” from degradation and repurposed as a regulator of transcription (Ma et al. 2019).

Processing of the repurposed mRNA or mRNA fragments, their transport into the nucleus and the formation of transcriptional regulation units involve many currently unknown players. Both the complexity and the conservation of this phenomenon have been highlighted in a report of transcriptional adaptation in C. elegans (Serobyan et al. 2020). A small candidate RNAi screen revealed players in RNA metabolism and transport that can influence the upregulation of adapting genes and the resulting phenotypes. Taken together, small differences in seemingly housekeeping pathways could potentially affect the perceived strength of mutant alleles, adding an interesting and underexplored parameter to consider in phenotypic analyses (Kontarakis and Stainier 2020).

Phenotypic Plasticity

It has been repeatedly proposed that another potential source of phenotypic variation in many monogenic diseases is exposure to environmental factors or stressors (Gallati 2014). In this context, the ability of one genotype to produce more than one phenotype when exposed to different forms of environmental stress has been defined as phenotypic plasticity (Klingenberg 2019; Price et al. 2003; Via and Lande 1985).

Clinical and basic science data suggest that non-genetic, i.e., exposomal, factors (including environmental, lifestyle and dietary factors) might affect modifier gene expression and thereby contribute to phenotypic plasticity (Cutting 2010; Cuvertino et al. 2017; Genin et al. 2008; Groman et al. 2002; Kleopa et al. 2006; Medici and Weiss 2017; Spiegler et al. 2018; Tummler 2019). The description and characterization of at least two genetic compensation pathways by which modifier gene expression can be regulated has provided one mechanistic explanation for phenotypic variability (El-Brolosy et al. 2019; Ma et al. 2019; Deconinck et al. 1997). One common denominator through which such factors could work is the generation of oxidative stress (Allen and Tresini 2000). An example that illustrates the impact of oxidative stress on clinical severity is the identification of single nucleotide polymorphisms in the glutathione pathway that affect bacterial colonization and lung inflammation in patients suffering from cystic fibrosis (CF) (Marson et al. 2014). CF is a common, life-limiting monogenic disease, which typically manifests as progressive bronchiectasis and recurrent sinopulmonary infections. Since CFTR (cystic fibrosis transmembrane conductance regulator) was described in 1989, it has become increasingly evident that modifier genes and environmental factors play substantial roles in determining disease severity (Collacoa and Cutting 2008; Maiuri et al. 2017). Siblings and twins with identical CFTR genotypes show different disease severity, strongly indicating that environmental factors play significant roles in determining the severity of CF (Mekus et al. 2000).
Apart from phenotypic variation associated with oxidative stress, histone deacetylase (HDAC) inhibitors have been proposed to be beneficial for CF patients (Angles et al. 2019). Interestingly, human bronchial epithelial cells exposed to environmental stress, including diesel exhaust particles, display increased HDAC6 mRNA expression levels (Lin et al. 2020). The mechanism of action of HDACs in CF patients is not known, but HDACs regulate transcription and can interact with the ribosome, thus potentially controlling the expression of modifier genes.

Nevertheless, the scientific evidence supporting a role for environmental stressors in phenotypic variability is mainly circumstantial, and the molecular mechanisms underlying phenotypic plasticity are currently not fully understood (Murren et al. 2015).

An intriguing possibility is that environmental stressors indirectly affect genetic buffering and modifier gene expression by interfering with RNA quality control pathways (e.g., NMD) (Nickless et al. 2017). In support of this hypothesis, it has been shown that environmental stress, including oxidative stress, suppresses NMD and affects cells by stabilizing the expression of NMD-targeted genes (Usuki et al. 2019). In such a scenario, it can be postulated that the cells of a patient carrying a disease-causing premature stop codon mutation would be unable to activate a compensatory machinery or enhance the expression of modifier genes and, therefore, should display a more severe phenotype. It is worth noting that CF patients who carry the G542X, R553X, S1255X or W1316X CFTR PTC mutations show a severe decrease in mutant mRNA, but display a mild, albeit variable, lung phenotype. In contrast, CF patients who carry the R1162X or W1282X CFTR PTC mutations, in which mRNA stability is not affected, display a severe lung phenotype (Ferec et al. 2012).

Cerebral cavernous malformations (CCM) are enlarged vascular lesions that consist of closely clustered, abnormally dilated and leaky capillary caverns and that affect up to 0.2% of the general population (Choquet et al. 2015; Mouchtouris et al. 2015; Salman et al. 2008). A subset of mutation types has been identified within the three known CCM genes, which should allow for better genotype to phenotype correlation and characterization (Choquet et al. 2015; Morrison and Akers 2003). Interestingly, the wide variability in phenotypes seen among different patients carrying the same mutation strongly suggests the influence of additional genetic and/or environmental components (Morrison and Akers 2003; Shenkar et al. 2015; Spiegler et al. 2018). In support of this, around 25% of people with cavernous malformations in the brain never have symptoms (Taslimi et al. 2016). Mouse models for cerebral cavernous malformations include Ccm1 and Ccm2 endothelial cell-specific knockout mice (Boulday et al. 2011; McDonald et al. 2011). In these models, vascular malformations are seen at around postnatal day 6 (Tang et al. 2017).

Interestingly, it was noted that, following a change in vivarium, Ccm1 and Ccm2 endothelial cell-specific KO mice did not develop a severe phenotype (minimal to no hindbrain lesions). Further analysis revealed that the gut microbiome plays a key role in the formation of CCM lesions through the activation of Toll-like receptor 4 (TLR4) (Tang et al. 2017). Germ-free mice were protected from CCM formation, and even a single course of antibiotics permanently altered CCM susceptibility in mice (Tang et al. 2017). Gene–environment interactions and phenotype variability have been described for other diseases, but the underlying molecular mechanisms are still not fully understood. Environmental factors such as exposure to toxic chemicals and brain injury, but also nutrition, traffic-related air pollution and viral or bacterial infections, have long been linked to phenotype variability in many diseases, including Alzheimer's disease, dementia, autism and Parkinson's disease (Ball et al. 2019; Dunn et al. 2019).

Genome Editing, Induced Pluripotent Stem Cells (iPSCs), Isogenic Lines and Organoid Models

Combined, iPSCs and genome editing represent a cutting-edge toolset to model diseases and better understand biological processes that remain unsolved or poorly understood (Ramachandran et al. 2021). In particular, isogenic cell lines provide a precise control for the genetic disease model of interest, especially for genes where discrepancies between genotype and phenotype have been described.

One noteworthy development is the production of organoids from engineered cells to model and recapitulate disease phenotypes in three-dimensional tissues (Kim et al. 2020; Lancaster et al. 2013). This strategy provides a framework for both disease modeling and regenerative medicine based on the synthetic reconstitution of tissues. An interesting perspective would be to study genetic buffering in different cell types that carry the same type of mutation. Interesting questions to answer are: is the genetic buffering triggered at the same level in different cell types? Or is it cell specific? How are modifier genes affected by environmental stressors? Do environmental stressors affect mRNA quality control pathways that have been linked to TA?

These are all questions that, if scientifically addressed, might lead to better treatments and the development of new therapies that focus on enhancing the expression of modifier genes rather than fixing the ‘broken’ mutated gene. 

In the last few years, iPSCs and genome editing have been exploited to model human diseases using proper controls (isogenic lines) (Jones et al. 2017). For instance, correcting the cystic fibrosis transmembrane conductance regulator (CFTR) sequence in patient-derived iPSCs produced corrected cells that differentiated into healthy mature airway epithelial cells in vitro (Crane 2015). A mutation in the Presenilin 1 (PSEN-1) gene, which is responsible for the majority of familial cases of Alzheimer's disease (AD), was corrected in iPSCs generated from a 58-year-old patient (Pires et al. 2016). CRISPR–Cas9 gene editing was also used to correct a mutation in the DMD gene in patient-derived iPSCs (Min et al. 2019).

Closing Remarks

Sequencing studies of the Icelandic population using Illumina short reads (Gudbjartsson et al. 2015) and Oxford Nanopore long reads (Beyter et al. 2021) identified several loss-of-function mutations without apparent phenotypes, and one possible explanation is the presence of compensatory mechanisms, such as modifier genes, in these individuals (Sulem et al. 2015).

Furthermore, the advent of Nanopore sequencing technology has brought significant advantages to the field. For instance, alternative splicing (AS), alternative transcription initiation (ATI), and alternative cleavage and polyadenylation (APA) have been identified as major contributors to transcriptome diversity (Lee et al. 2021). While AS events can be quantified and annotated with good accuracy using NGS, it has been hard to deduce full-length splicing isoforms that contain a combination of these individual splicing events (Lee et al. 2021).

Thus, long-read sequencing now offers the ability to map full-length sequences and to identify complex splice isoforms with diverse unknown roles, potentially also in genetic compensation.

Epigenetic modifications are heritable changes that do not involve alteration of the nucleotide sequence but play a key role in gene expression and are associated with many diseases. Despite their presence in the human genome and their role in gene expression, base modifications are often overlooked because they are difficult to detect (Liu and Seki 2020).

Using Nanopore sequencing, researchers have identified 5-methylcytosine (5mC), 5-hydroxymethylcytosine (5hmC), N6-methyladenine (6mA), and bromodeoxyuridine (BrdU) modifications in DNA and, through direct RNA sequencing, the N6-methyladenosine (m6A) modification in RNA (Liu et al. 2019). Furthermore, detection of other natural or synthetic epigenetic modifications is also possible by training base-calling algorithms, which could shed light on phenotypic variability. It has been proposed that these DNA and RNA modifications play a role in different biological processes, including the control of gene expression and possibly genetic compensation (Barbieri and Kouzarides 2020; Liu et al. 2019).

Finally, genetic buffering makes it possible for genetic variation to accumulate and be stored in phenotypically normal populations. Silent variations can produce phenotypic differences when the buffering threshold is crossed, at which point these differences become sensitive to selection. Buffering mechanisms thus regulate the balance between evolutionary stasis and change. Yet, little is known about how buffering mechanisms work and respond to environmental stimuli, thereby influencing phenotypic variability within a population.