Random Penetrance of Mutations Among Individuals: A New Type of Genetic Drift in Molecular Evolution

Abstract

The deterministic view of mutation penetrance is a fundamental assumption in molecular evolutionary theory: individuals in a population with the same genotype experience the same fitness effect. Because this view has been repeatedly challenged by experimental evidence, it is desirable to examine to what extent its violation could affect our understanding of molecular evolution. To this end, the author formulated a new theory of molecular evolution under a random model of penetrance: among individuals with the same mutational genotype, the coefficient of selection is a random variable. It follows that, in addition to the conventional Ne-genetic drift (Ne is the effective population size), the variance of penetrance among individuals (ε²) represents a new type of genetic drift, termed the ε²-genetic drift. These two genetic drifts together provide new insights into nearly neutral evolution: the evolutionary rate is inversely related to the logarithm of Ne when the ε²-genetic drift is nontrivial. This log-of-Ne feature of the ε²-genetic drift explains why the dN/dS ratio (the ratio of the nonsynonymous rate to the synonymous rate) in humans is only about twofold that in mice, while the effective population size (Ne) of mice is about two orders of magnitude larger than that of humans. The variance of random penetrance in mammalian genes was estimated, for the first time, to be approximately ε² ≈ 5.89 × 10⁻³.

Introduction

Current wisdom in population genetics and molecular evolution postulates that any single mutation has a fixed fitness effect among individuals with the same genotype (Kimura 1983). However, this deterministic view has been repeatedly challenged by the fact that mutations frequently exhibit different effects in different individuals of a population (Nadeau 2001; Chandler et al. 2013; Riordan and Nadeau 2017). Two major types of variation are recognized. Incomplete penetrance occurs when a mutation shows an effect in some individuals but not in others (Eldar et al. 2009; Raj et al. 2010): for instance, many genetic causes of disease are classified as dominant mutations, yet not all individuals carrying the same mutation show the disease phenotype (Riordan and Nadeau 2017; Griffiths et al. 2015). Variable expressivity, by contrast, occurs when a mutation shows quantitatively different phenotypic effects among individuals (Vu et al. 2015). As these two phenomena are not mutually exclusive, Mullis et al. (2018) used the term ‘background effects’ for a mutation that shows an effect only in certain individuals and whose effect varies in severity among those affected.

To avoid terminological confusion in evolutionary genetics, the term random penetrance is used hereafter to refer to any mutation that exhibits different fitness effects among individuals, in contrast to the conventional view of deterministic penetrance. In essence, random penetrance reflects the stochastic nature of genotype–phenotype mapping (Dowell et al. 2010; Lehner 2013), which can arise through a variety of mechanisms, including epistasis (genetic interactions) (Mullis et al. 2018; Taylor and Ehrenreich 2014), epigenetics (Maamar et al. 2007; Raj and van Oudenaarden 2008), stochastic noise (Ozbudak et al. 2002; Elowitz et al. 2002; Süel et al. 2007; Chang et al. 2008), developmental cell colonization (Binder et al. 2015), and environmental influences (Vu et al. 2015; Khoury et al. 1988). In the literature, for instance, differences in gene expression between affected and unaffected individuals have frequently been used to explain the randomness of penetrance (Raj et al. 2010; Hume 2000; Wernet et al. 2006); such differences may reflect regulatory or epigenetic effects triggered by stochastic fluctuations during development (Ozbudak et al. 2002; Elowitz et al. 2002; Süel et al. 2007; Chang et al. 2008; Binder et al. 2015) or by environmental cues (Vu et al. 2015; Khoury et al. 1988).

Together, these considerations raise a challenging question: to what extent would a violation of deterministic penetrance affect our understanding of molecular evolution? In this report, I formulate a new theory of molecular evolution under a random model of penetrance; that is, for any individual carrying the same mutation, the coefficient of selection (z) is a random variable drawn from a distribution with mean s and variance ε². The central thesis of this study is that, in addition to the Wright–Fisher sampling effect, whose strength is inversely related to the effective population size (Ne) and which is the well-known Ne-genetic drift (Lynch et al. 2011), the variance of penetrance among individuals (ε²) can serve as a novel and important source of genetic drift, termed the ε²-genetic drift. As exemplified by an analysis of molecular evolution in the human and mouse lineages (Kosiol et al. 2008; Ellegren 2009; Ohta 1993), I show that the Ne-genetic drift and the ε²-genetic drift together provide new insights into the nearly neutral theory of molecular evolution.

Results and Discussion

The Random Model of Mutation Penetrance

The random model of penetrance postulates that, in each generation, the coefficient of selection of a mutation (z) in an individual is a random variable drawn from a distribution with (population) mean s and variance ε²; a large ε² indicates a high level of penetrance variation among individuals, and vice versa. Under the Wright–Fisher model with a finite effective population size (Ne), I formulated the following diffusion-limit model to describe the stochastic process of the allele frequency (x), characterized by the infinitesimal mean μ(x) = sx(1 − x) and the infinitesimal variance σ²(x) = x(1 − x)/(2Ne) + ε²x²(1 − x)² (see “Materials and Methods”).

Note that genetic drift refers to the random fluctuation of allele frequencies across generations (Kimura 1983; Lynch et al. 2011), which is quantified by the infinitesimal variance σ²(x) under the diffusion model. The two terms in σ²(x) represent two different sources of genetic drift: the Wright–Fisher random sampling of gametes in each generation in a finite population, termed the Ne-genetic drift, and the penetrance variance (ε²) among individuals, termed the ε²-genetic drift.
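To make the two drift terms concrete, the following Python sketch evaluates both components of σ²(x) and their ratio, which simplifies to 2Neε²x(1 − x). The function names and the example values (2Ne = 10⁴ and ε² = 5.89 × 10⁻³, borrowed from figures quoted elsewhere in this paper) are illustrative assumptions, not part of the original analysis.

```python
def infinitesimal_mean(x, s):
    """Infinitesimal mean mu(x) = s * x * (1 - x)."""
    return s * x * (1.0 - x)


def drift_terms(x, ne, eps2):
    """Return the two components of sigma^2(x).

    First item: Wright-Fisher sampling term x(1 - x) / (2 Ne).
    Second item: penetrance term eps2 * x^2 * (1 - x)^2.
    """
    h = x * (1.0 - x)
    return h / (2.0 * ne), eps2 * h * h


def drift_ratio(x, ne, eps2):
    """Penetrance term divided by sampling term, i.e. 2 Ne eps2 x (1 - x)."""
    sampling, penetrance = drift_terms(x, ne, eps2)
    return penetrance / sampling


# Illustrative values only (assumed for this sketch).
NE = 5_000        # so that 2 Ne = 10^4
EPS2 = 5.89e-3

for freq in (1e-4, 0.01, 0.1, 0.5):
    print(f"x = {freq:<6g} penetrance/sampling = {drift_ratio(freq, NE, EPS2):.4g}")
```

Because the penetrance term carries an extra factor x(1 − x), in this sketch it is largest at intermediate frequencies and small for rare alleles.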

Rate of Molecular Evolution Under the Random Penetrance Model

The first goal of this study is to derive the evolutionary rate (λ) at a nucleotide site under the random model of penetrance. Following common practice (Kimura 1962, 1983), λ is defined as the number of new mutations per generation (2Nv) multiplied by the fixation probability of a mutation with initial frequency 1/(2N), where v is the mutation rate and N is the census population size. The fixation probability u(p), given the initial frequency p, is derived in “Materials and Methods”. From λ = 2Nv × u(p) with p = 1/(2N), one can show that the evolutionary rate can be written as

$$\begin{aligned} \lambda & = v\frac{{4N_{{\text{e}}} s - 2N_{{\text{e}}} \varepsilon^{2} }}{{1 - \left( {2N_{{\text{e}}} \varepsilon^{2} + 1} \right)^{{1 - 2s/\varepsilon^{2} }} }} \\ & = v\frac{{2N_{{\text{e}}} \varepsilon^{2} (\rho - 1)}}{{1 - \left( {2N_{{\text{e}}} \varepsilon^{2} + 1} \right)^{1 - \rho } }}. \\ \end{aligned}$$
(1)

The first line of Eq. (1) shows that the strength of penetrance (2Neε²) and the strength of selection (4Nes), together with the mutation rate (v), determine the evolutionary rate (λ). The second, equivalent form introduces a new parameter called the selection–penetrance ratio (ρ), defined by

$$\rho = \frac{2s}{{\varepsilon^{2} }}.$$
(2)

The sign of ρ (negative or positive) gives the sign of the selection coefficient, while for a given s the magnitude of ρ is inversely related to the penetrance variance.

Figure 1 illustrates the rate–mutation ratio (λ/v) plotted against the strength of selection (S = 4Nes) for different strengths of penetrance (2Neε²). Regardless of 2Neε², the basic rule of molecular evolution holds: λ > v if s > 0 (positive selection), λ = v if s = 0 (neutrality), and λ < v if s < 0 (negative selection). As 2Neε² increases, the evolutionary rate (λ) increases under negative selection (s < 0) but decreases under positive selection (s > 0) (Fig. 1). In particular, when ε² → 0, Eq. (1) reduces to Kimura’s classical formula (Kimura 1962), that is

$$\lambda = v\frac{{4N_{{\text{e}}} s}}{{1 - {\text{e}}^{{ - 4N_{{\text{e}}} s}} }}.$$
(3)

Fig. 1 The rate–mutation ratio (λ/v) plotted against the strength of selection (S = 4Nes) for different strengths of penetrance: 2Neε² = 0, 0.1, 0.5, 1.0, 2.0, and 10.0
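The behaviour summarized in Fig. 1 can be explored numerically. The sketch below is a minimal Python transcription of Eq. (1) in terms of S = 4Nes and K = 2Neε² (so that ρ = S/K), with Eq. (3) used when K = 0. The function name `rate_ratio` and the grid of S values are assumptions made for illustration; the special-casing of S = 0 and ρ = 1 simply handles the removable singularities of the formula.

```python
import math


def rate_ratio(S, K):
    """lambda / v from Eq. (1), with S = 4*Ne*s and K = 2*Ne*eps2."""
    if S == 0.0:
        return 1.0                       # neutrality: lambda = v
    if K == 0.0:
        return S / -math.expm1(-S)       # Kimura's formula, Eq. (3)
    rho = S / K                          # selection-penetrance ratio 2s/eps2
    if math.isclose(rho, 1.0):
        return K / math.log1p(K)         # limit of Eq. (1) as rho -> 1
    return (S - K) / (1.0 - (K + 1.0) ** (1.0 - rho))


# Strengths of penetrance used in Fig. 1; the S grid is an arbitrary choice.
for K in (0.0, 0.1, 0.5, 1.0, 2.0, 10.0):
    row = ", ".join(f"{rate_ratio(S, K):.3f}" for S in (-4.0, -2.0, 0.0, 2.0, 4.0))
    print(f"2Ne*eps2 = {K:>4}: {row}")
```

Each printed row lists λ/v for S = −4, −2, 0, 2, 4, so the ordering of rows can be compared with the ordering of curves in Fig. 1.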

Penetrance Variation (ε²) as a New Type of Genetic Drift

In the classical theory of molecular evolution, genetic drift refers specifically to the Ne-genetic drift, the magnitude of which is inversely related to the effective population size (Ne) (Kimura 1983; Lynch et al. 2011). Independent of Ne, the magnitude of the new ε²-genetic drift is determined by the variance (ε²) of penetrance among individuals. Moreover, the ε²-genetic drift implies an important role for the genotype-to-phenotype mapping in shaping the pattern of molecular evolution (Dowell et al. 2010; Lehner 2013). In the simplest scenario, in which a mutation is slightly deleterious for some individuals but strictly neutral for others, the ε²-genetic drift can technically be regarded as a reduction of the effective population size (Ne). In general, however, the effect of the ε²-genetic drift could be much more profound, for two reasons. First, the magnitude of the ε²-genetic drift is not only gene-specific (i.e., different genes may have different values of ε²) but also mutation-type-specific (e.g., ε² for nonsynonymous mutations could be considerably larger than that for synonymous mutations) (Mullis et al. 2018; Ozbudak et al. 2002). Second, the effects of a nearly neutral mutation may span a spectrum in a population from slightly deleterious to slightly beneficial, raising the possibility that a single mutation has dual properties of nearly neutral evolution and positive selection at the population level, rather than being either neutral or nearly neutral (Kimura 1968, 1983; Ohta 1973; Lynch 2007) or selectively beneficial (Gillespie 2000, 2001; Hahn 2008).

The new theory reveals that molecular evolution depends on the relative strengths of the Ne-genetic drift and the ε²-genetic drift. Tentatively, the evolutionary scenarios can be classified as follows (see also Table 1). The S-theme refers to a weak ε²-genetic drift in a large population, where molecular evolution is mostly determined by the nature of selection (deleterious or beneficial) (Hahn 2008). The Ne-theme refers to a weak ε²-genetic drift in a small population, where the classical nearly neutral model (Ohta 1973; Kimura 1968), driven mainly by the Ne-genetic drift, is sufficient. By contrast, the ε²-theme refers to a novel evolutionary scenario in which the nontrivial effect of penetrance variation could dominate the pace of molecular evolution, especially when the population size is not small. Finally, the Ne-ε²-theme refers to a mutation with a large penetrance variation in a small population, where both genetic drifts are strong.

Table 1 Evolutionary themes of molecular evolution, classified according to the relative strengths of the Ne-genetic drift and the ε²-genetic drift, as well as the selection–penetrance ratio (ρ)

Evolutionary Rate When Penetrance Variation is Nontrivial (2Neε² > 1)

A remaining technical issue is the criterion for a strong ε²-genetic drift. Roughly speaking, one may use 2Neε² > 1, or equivalently ε² > 1/(2Ne), which means that the ε²-genetic drift exceeds the Ne-genetic drift. Numerical analysis (Fig. 2) indicates that the effect of 2Neε² > 1 on the evolutionary rate (λ) is nontrivial and that its extent also depends on the selection–penetrance ratio ρ. Furthermore, the ε²-genetic drift can profoundly modulate the relationship between the evolutionary rate (λ) and the effective population size (Ne) when 2Neε² > 1: after some algebra, one can show that, for sufficiently large 2Neε²,

$$\frac{\lambda }{v} \approx \begin{cases} 2N_{\text{e}} \varepsilon^{2} (\rho - 1) & \text{if } \rho > 1 \\ 2N_{\text{e}} \varepsilon^{2} /\ln (2N_{\text{e}} \varepsilon^{2} ) & \text{if } \rho = 1 \\ \left( 2N_{\text{e}} \varepsilon^{2} \right)^{\rho } (1 - \rho ) & \text{if } 0 < \rho < 1 \\ 1 & \text{if } \rho = 0 \\ \left( 2N_{\text{e}} \varepsilon^{2} \right)^{\rho } (1 - \rho ) & \text{if } \rho < 0 \end{cases}.$$
(4)

Fig. 2 The rate–mutation ratio (λ/v) plotted against the effective population size (Ne) under various selection–penetrance ratios (ρ = 2s/ε²), where ε² is fixed at 5.0 × 10⁻³. a Positive selection (s > 0 or ρ > 0): when Ne is sufficiently large, the curve approaches the asymptotic line ln(λ/v) ~ ln Ne when ρ > 1, or ln(λ/v) ~ ρ ln Ne when 0 < ρ < 1. b Negative selection (s < 0 or ρ < 0): the curve approaches the asymptotic line ln(λ/v) ~ ρ ln Ne when Ne is sufficiently large

Equation (4) shows explicitly how the evolutionary rate depends on the effective population size (Ne) when the ε²-genetic drift is nontrivial (Fig. 2). (i) Strong positive selection (ρ > 1): if the coefficient of selection of a mutation satisfies s > ε²/2, evolution is driven mainly by positive selection and only marginally affected by the ε²-genetic drift; consequently, the classical result λ/v ≈ 4Nes (Kimura 1983) is approached asymptotically as ρ becomes large. (ii) Intermediate positive selection (ρ ≈ 1): in this special case of s ≈ ε²/2, the evolutionary rate is proportional to Ne/ln Ne. (iii) Weak positive selection (0 < ρ < 1): owing to the ε²-genetic drift, the evolutionary rate (λ) scales as Ne^ρ, where the exponent ρ is positive and less than one. (iv) Neutral evolution (ρ = 0): the evolutionary rate equals the mutation rate (v), independent of Ne. (v) Negative selection (ρ < 0): the evolutionary rate decreases with Ne as Ne^ρ, with a negative exponent ρ. Some numerical examples are illustrated in Fig. 2.
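As a rough check on how quickly the asymptotic forms of Eq. (4) become accurate, the following Python sketch compares them with the second form of Eq. (1) for several values of K = 2Neε². The function names and the chosen K and ρ values are assumptions for illustration only.

```python
import math


def exact_ratio(K, rho):
    """lambda / v from the second form of Eq. (1); K = 2*Ne*eps2."""
    if rho == 1.0:
        return K / math.log1p(K)         # limit of Eq. (1) as rho -> 1
    return K * (rho - 1.0) / (1.0 - (K + 1.0) ** (1.0 - rho))


def asymptotic_ratio(K, rho):
    """Large-K approximations listed in Eq. (4)."""
    if rho > 1.0:
        return K * (rho - 1.0)
    if rho == 1.0:
        return K / math.log(K)
    # Covers 0 < rho < 1, rho = 0 (which gives exactly 1) and rho < 0.
    return K ** rho * (1.0 - rho)


for K in (10.0, 1e2, 1e4):
    for rho in (2.0, 1.0, 0.5, 0.0, -0.5):
        exact = exact_ratio(K, rho)
        approx = asymptotic_ratio(K, rho)
        rel_err = abs(approx - exact) / exact
        print(f"K = {K:<8g} rho = {rho:>4}: exact = {exact:.4g}, "
              f"Eq. (4) = {approx:.4g}, rel. error = {rel_err:.2%}")
```

In this sketch the relative error shrinks as K grows, which is consistent with reading Eq. (4) as a large-2Neε² limit rather than an identity at 2Neε² just above 1.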

ε²-Genetic Drift Explains the Human–Mouse Ne Puzzle

The inverse relationship between the evolutionary rate (λ) and the effective population size (Ne) is one of the fundamental predictions of the nearly neutral theory. Ohta (1993) first tested this inverse λ–Ne relationship using the ratio of nonsynonymous to synonymous substitution rates (ω = dN/dS) of protein-coding genes as a proxy for the rate–mutation ratio (λ/v). Based on a limited number of genes, Ohta (1993) showed that the mean dN/dS in the human lineage (ωh) is about twice as high as that in the mouse lineage (ωm). This observation has been confirmed by a number of follow-up studies, e.g., the genome-wide estimates of ωh = 0.249 and ωm = 0.127 by Kosiol et al. (2008). Since the effective population size of humans is much smaller than that of mice, many textbooks have cited this example as solid evidence in support of the nearly neutral theory (Li 1997).

Yet one caveat has been largely neglected for several decades. Since the effective population size (Ne) in mice is about two orders of magnitude larger than that in humans (Ellegren 2009; Lanfear et al. 2014), Ohta’s model (Ohta 1973) implies that the dN/dS ratio should also differ by about two orders of magnitude between the human and mouse lineages. This so-called human–mouse rate–Ne puzzle casts some doubt on the accuracy of the classical nearly neutral model (Ohta 1973), which predicts ω ~ 1/Ne. Here, I show that the ε²-genetic drift may provide a plausible resolution. Assume a constant penetrance variance (ε²) in mammals: while the Ne-genetic drift differed considerably between the human and mouse lineages, the ε²-genetic drift remained roughly the same. Ellegren (2009) estimated that the effective population size at the node of the mouse–rat split was approximately 2Nm = 10⁶, and that at the node of the human–chimpanzee split was 2Nh = 10⁴. Under Ohta’s (1973) model, the coefficient of selection (s) varies over sites according to an exponential distribution. From the last line of Eq. (4), the ωh/ωm ratio was approximately derived under the nearly neutral theme with 2Neε² > 1 (the ε²-Ohta model) as follows:

$$\frac{{\omega_{h} }}{{\omega_{m} }} \approx \frac{{\ln 2N_{m} + \ln \varepsilon^{2} }}{{\ln 2N_{h} + \ln \varepsilon^{2} }}.$$
(5)

As long as the ε²-genetic drift is nontrivial, the inverse effect of the effective population size (Ne) on the evolutionary rate tends to be logarithmic, which sheds light on the human–mouse rate–Ne puzzle.
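The contrast between the logarithmic and the inverse dependence on Ne can be made concrete with a few lines of Python. The sketch below evaluates Eq. (5) next to the classical expectation ω ~ 1/Ne; the function names are hypothetical, and the inputs (2Nh = 10⁴, 2Nm = 10⁶, ε² = 5.89 × 10⁻³) are simply the values quoted in this paper.

```python
import math


def omega_ratio_log(two_nh, two_nm, eps2):
    """omega_h / omega_m from Eq. (5); requires 2*Ne*eps2 > 1 in both lineages."""
    kh, km = two_nh * eps2, two_nm * eps2
    if kh <= 1.0 or km <= 1.0:
        raise ValueError("Eq. (5) assumes 2*Ne*eps2 > 1")
    return math.log(km) / math.log(kh)


def omega_ratio_inverse(two_nh, two_nm):
    """omega_h / omega_m expected if omega ~ 1/Ne."""
    return two_nm / two_nh


TWO_NH, TWO_NM, EPS2 = 1e4, 1e6, 5.89e-3

print("Eq. (5):      ", round(omega_ratio_log(TWO_NH, TWO_NM, EPS2), 3))
print("omega ~ 1/Ne: ", omega_ratio_inverse(TWO_NH, TWO_NM))
print("observed:     ", round(0.249 / 0.127, 3))
```

With these inputs Eq. (5) gives a ratio of roughly 2, compared with 100 under ω ~ 1/Ne.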

Based on the human dN/dS ratio ωh = 0.249 (Kosiol et al. 2008) and the effective population size 2Ne = 10⁴ (Ellegren 2009), the genome-wide mean of the selection coefficient (s) can be calculated from Eq. (17) for any given ε². This treatment allows the dN/dS ratio (ω) to be plotted against the effective population size (2Ne) for a given ε² value (Fig. 3), where all curves cross at ω = 0.249 and 2Ne = 10⁴ (the human). One can further extrapolate the dN/dS ratio (ω) to 2Ne = 10⁶ (Ellegren 2009) as a predictor of the mouse dN/dS ratio. While the ε²-Ohta model reproduces the observations of ωh = 0.249 and ωm = 0.127 (Kosiol et al. 2008) for ε² ≈ 5.89 × 10⁻³, the classical Ohta model (ε² = 0) predicts the mouse dN/dS ratio to be virtually zero (Fig. 3).
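The calibrate-then-extrapolate procedure described above can be sketched in Python as follows. This sketch uses the first equality of Eq. (17) together with Eq. (16) and treats the quoted 2Ne values literally; the function names are hypothetical, and it is not the fitting procedure behind Fig. 3, so its output need not match the figure exactly.

```python
import math


def alpha(two_ne, eps2):
    """Eq. (16): alpha = 2 ln(2 Ne eps2) / eps2, assuming 2 Ne eps2 > 1."""
    k = two_ne * eps2
    if k <= 1.0:
        raise ValueError("Eq. (16) is used here only for 2*Ne*eps2 > 1")
    return 2.0 * math.log(k) / eps2


def omega(two_ne, eps2, mean_abs_s):
    """First equality of Eq. (17): omega = 1 / (1 + alpha * |s_bar|)."""
    return 1.0 / (1.0 + alpha(two_ne, eps2) * mean_abs_s)


def calibrate_mean_abs_s(omega_obs, two_ne, eps2):
    """Solve Eq. (17) for |s_bar| given an observed omega at a known 2 Ne."""
    return (1.0 / omega_obs - 1.0) / alpha(two_ne, eps2)


EPS2 = 5.89e-3
s_bar = calibrate_mean_abs_s(0.249, 1e4, EPS2)   # anchor at the human point
print("calibrated |s_bar|      :", s_bar)
print("omega at 2Ne = 1e4 (fit):", omega(1e4, EPS2, s_bar))
print("omega at 2Ne = 1e6      :", omega(1e6, EPS2, s_bar))
```

Repeating the last three lines for other ε² values traces out a family of curves through the human point, in the spirit of Fig. 3.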

Fig. 3 The dN/dS ratio (ω) plotted against the effective population size (2Ne) for a given ε² value, with the genome-wide mean of the selection coefficient (s) chosen such that ω exactly equals the observed human value ωh = 0.249 at 2Ne = 10⁴. When ε² ≈ 5.89 × 10⁻³, the ε²-Ohta model reproduces the observed mouse dN/dS ratio (ωm = 0.127) at 2Ne = 10⁶. Note that the classical Ohta model (ε² = 0) predicts the mouse dN/dS ratio to be virtually zero

Concluding Remarks

A new theory of molecular evolution has been formulated under a random model of penetrance: the coefficient of selection varies among individuals carrying the same mutation (Riordan and Nadeau 2017; Eldar et al. 2009; Raj et al. 2010; Griffiths et al. 2015; Vu et al. 2015; Mullis et al. 2018). The main result is that the variance of penetrance (ε²) provides a novel, nontrivial source of genetic drift, termed the ε²-genetic drift. Further analysis showed that the new theory explains the inverse relationship between the rate (ω) and Ne in mammals much better than the classical nearly neutral theory (Ohta 1993). The current study concludes that any model that does not take the ε²-genetic drift into account may be inadequate for interpreting the pattern of molecular evolution.

Apart from the ε²-genetic drift, there are two alternative explanations for the observed inverse ω–Ne relationship between the human and mouse (Fig. 3). The first is to assume that the genome-wide mean (s) of the selection coefficient in mice is about two orders of magnitude lower than that in humans, which is biologically unlikely. The second invokes Kimura’s (1979) model, which predicts an ω–Ne power law, i.e., ω ~ Ne^(−β). Rather than the β = 0.5 suggested by Kimura (1983), a simple calculation shows that β must be as low as 0.15 to fit the observation; the problem is that the biological meaning of such a small value of β is obscure. More extensive analyses are needed in the future to rule out these possibilities completely.
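The "simple calculation" for β can be reproduced directly: if ω ~ Ne^(−β), then β = ln(ωh/ωm)/ln(Nm/Nh). The short Python sketch below performs this arithmetic with the values quoted in this paper (ωh = 0.249, ωm = 0.127, 2Nh = 10⁴, 2Nm = 10⁶); the function name is a hypothetical choice for illustration.

```python
import math


def power_law_exponent(omega_h, omega_m, two_nh, two_nm):
    """beta such that omega_h / omega_m = (Nm / Nh) ** beta."""
    return math.log(omega_h / omega_m) / math.log(two_nm / two_nh)


beta = power_law_exponent(0.249, 0.127, 1e4, 1e6)
print(f"beta = {beta:.3f}")
```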

The concept of ε²-genetic drift may profoundly affect our understanding of molecular evolution. Several speculations that merit further investigation are listed below: (i) molecular evolution in a large population (Kimura 1983; Ohta 1973) may be mainly driven by the ε²-genetic drift; (ii) a universal ε²-genetic drift may underlie the baseline of the molecular clock (Zuckerkandl and Pauling 1965) along the tree of life; (iii) the role of the effective population size (Ne) (Lynch et al. 2011; Lanfear et al. 2014) in molecular evolution may be overstated; and (iv) the connection between the ε²-genetic drift and the genotype-to-phenotype map (Dowell et al. 2010; Lehner 2013) provides new insights into the non-adaptive origin of biological complexity (Lynch 2007). This new theory of molecular evolution may also help to resolve the more than half-century-old debate between neutral theory and natural selection; see Kern and Hahn (2018) and Jensen et al. (2019) for recent updates.

Materials and Methods

Population Genetics of Random Penetrance Model

Consider a simple genetic model of a locus with two alleles, the wild-type (A) and the mutant (a), in a diploid population. The relative fitnesses of the three genotypes AA, Aa, and aa are additive, given by 1, 1 + z, and 1 + 2z, respectively, where z is the coefficient of selection of mutant a. The frequency of the mutant a in a population changes across generations, jointly determined by the selection pressure (z) and the random sampling of individuals in a finite population (the genetic drift, whose strength is inversely related to the effective population size Ne).

In addition to the random sampling of gametes, a new source of genetic drift emerges under the random model of penetrance: the coefficient of selection (z) varies among individuals, with mean and variance denoted by s and ε², respectively. Denote the frequency of the mutant a at time t by x(t). In the diffusion limit, one can show that the stochastic process of the frequency of the mutant a follows the standard single-locus diffusion equation, characterized by the infinitesimal mean and variance

$$\begin{aligned} \mu (x) & = sx(1 - x), \\ \sigma^{2} (x) & = \varepsilon^{2} x^{2} (1 - x)^{2} + \frac{x(1 - x)}{{2N_{{\text{e}}} }}. \\ \end{aligned}$$
(6)

Derivation of the Fixation Probability u(p)

Kolmogorov Backward Equation

The probability of ultimate fixation of a mutant allele can be computed by solving the Kolmogorov backward equation. Given the initial frequency p, the fixation probability, denoted by u(p), is given by

$$u(p) = \int_{0}^{p} G (x){\text{d}}x/\int_{0}^{1} G (x){\text{d}}x,$$
(7)

where function G(x) is given by

$$\begin{aligned} G(x) & = \exp \left\{ { - 2\int {\frac{\mu (x)}{{\sigma^{2} (x)}}} {\text{d}}x} \right\} \\ & = \left[ {\frac{x + a}{{1 - (x + a)}}} \right]^{{ - 2s/\varepsilon^{2} }} , \\ \end{aligned}$$
(8)

and a is the parameter defined by

$$a = \frac{{\sqrt {1 + 2/(N_{{\text{e}}} \varepsilon^{2} )} - 1}}{2}.$$
(9)

Equation (8) shows that two parameters determine u(p): the product of the effective population size and the variance of the selection coefficient (Neε²), and the selection–penetrance ratio (ρ) defined by Eq. (2). Note that the product of the effective population size and the mean selection coefficient (Nes) can be derived from the product of Neε² and ρ. In the case of ε² → 0 (no penetrance variation), u(p) reduces to the standard result, that is

$$u(p) = \frac{{1 - {\text{e}}^{{ - 4N_{{\text{e}}} sp}} }}{{1 - {\text{e}}^{{ - 4N_{{\text{e}}} s}} }}.$$
(10)

Moreover, when the initial frequency p is small, the following approximation is used to simplify the algebra:

$$\frac{\mu (x)}{{\sigma^{2} (x)}} = \frac{{2N_{{\text{e}}} s}}{{2N_{{\text{e}}} \varepsilon^{2} x(1 - x) + 1}} \approx \frac{{2N_{{\text{e}}} s}}{{2N_{{\text{e}}} \varepsilon^{2} x + 1}},$$
(11)

which holds closely for x ≪ 0.5. It follows that G(x) of Eq. (8) is well approximated by

$$G(x) \approx \left( {x + \frac{1}{{2N_{{\text{e}}} \varepsilon^{2} }}} \right)^{{ - 2s/\varepsilon^{2} }} ,$$
(12)

and a closed-form analytical result for the fixation probability for p ≪ 0.5 is given by

$$\begin{aligned} u(p) & \approx \left[ {\left( {p + \frac{1}{{2N_{{\text{e}}} \varepsilon^{2} }}} \right)^{1 - \rho } - \left( {\frac{1}{{2N_{{\text{e}}} \varepsilon^{2} }}} \right)^{1 - \rho } } \right]/\left[ {\left( {1 + \frac{1}{{2N_{{\text{e}}} \varepsilon^{2} }}} \right)^{1 - \rho } - \left( {\frac{1}{{2N_{{\text{e}}} \varepsilon^{2} }}} \right)^{1 - \rho } } \right] \\ & = \left[ {\left( {2N_{{\text{e}}} \varepsilon^{2} p + 1} \right)^{1 - \rho } - 1} \right]/\left[ {\left( {2N_{{\text{e}}} \varepsilon^{2} + 1} \right)^{1 - \rho } - 1} \right], \\ \end{aligned}$$
(13)

where ρ is given by Eq. (2).
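The chain from Eq. (7) through Eq. (12) to Eq. (13) can be checked numerically. The Python sketch below integrates the approximate G(x) of Eq. (12) by a simple midpoint rule and compares the result with the closed form of Eq. (13); it also compares Eq. (13) at very small 2Neε² with the classical formula of Eq. (10). All function names, the midpoint rule, and the parameter values are assumptions made for illustration.

```python
import math


def fixation_closed_form(p, K, rho):
    """Eq. (13) with K = 2*Ne*eps2 and rho = 2s/eps2 (rho != 1)."""
    e = 1.0 - rho
    return ((K * p + 1.0) ** e - 1.0) / ((K + 1.0) ** e - 1.0)


def fixation_kimura(p, S):
    """Eq. (10) with S = 4*Ne*s (S != 0)."""
    return math.expm1(-S * p) / math.expm1(-S)


def midpoint_integral(f, lo, hi, n=20_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


def fixation_by_quadrature(p, K, rho):
    """Eq. (7) evaluated numerically with the approximate G(x) of Eq. (12)."""
    def g(x):
        return (x + 1.0 / K) ** (-rho)
    return midpoint_integral(g, 0.0, p) / midpoint_integral(g, 0.0, 1.0)


P, K = 0.01, 58.9   # K = 2*Ne*eps2 for 2Ne = 1e4, eps2 = 5.89e-3 (illustrative)
for rho in (-0.3, 0.5):
    print(f"rho = {rho:>4}: closed form = {fixation_closed_form(P, K, rho):.6g}, "
          f"quadrature = {fixation_by_quadrature(P, K, rho):.6g}")

S = 2.0
print("eps2 -> 0 check:", fixation_closed_form(P, 1e-6, S / 1e-6),
      "vs Eq. (10):", fixation_kimura(P, S))
```

This sketch only confirms that Eq. (13) follows from the approximate integrand of Eq. (12); it does not assess how far that integrand departs from the exact G(x) of Eq. (8) at frequencies that are not small.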

Rate of Molecular Evolution Under Random Penetrance Model

In each generation, the number of new mutations entering the population is expected to be 2Nv, where v is the mutation rate and N is the census population size. Each mutation occurs in only a single individual, so that the initial frequency of each mutation is p = 1/(2N). The rate of molecular evolution is thus defined as the fixation probability of a new mutation multiplied by the number of new mutations per generation, i.e., λ = u(1/(2N)) × 2Nv. From Eq. (13), the fixation probability can be approximated by

$$\begin{aligned} u(1/2N) & = \left[ {\left( {\frac{{\varepsilon^{2} N_{{\text{e}}} }}{N} + 1} \right)^{1 - \rho } - 1} \right]/\left[ {\left( {2N_{{\text{e}}} \varepsilon^{2} + 1} \right)^{1 - \rho } - 1} \right] \\ & \approx \frac{1}{2N}\left[ {\frac{{2N_{{\text{e}}} \varepsilon^{2} (1 - \rho )}}{{\left( {2N_{{\text{e}}} \varepsilon^{2} + 1} \right)^{1 - \rho } - 1}}} \right], \\ \end{aligned}$$
(14)

which directly leads to Eq. (1); note that the last approximation holds as long as 2Ne > 10².
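The step from the first to the second line of Eq. (14) is a first-order expansion in ε²Ne/N. The Python sketch below compares the two lines and confirms that 2N times the linearised form coincides with the second form of Eq. (1). The function names and the parameter values (N = Ne = 5000, ε² = 5.89 × 10⁻³, ρ = −0.3) are assumptions chosen for illustration.

```python
import math


def fixation_new_mutation(N, ne, eps2, rho):
    """First line of Eq. (14): u(1/(2N)) before linearisation (rho != 1)."""
    e = 1.0 - rho
    num = math.expm1(e * math.log1p(eps2 * ne / N))   # (1 + eps2*Ne/N)^e - 1
    den = (2.0 * ne * eps2 + 1.0) ** e - 1.0
    return num / den


def fixation_new_mutation_linear(N, ne, eps2, rho):
    """Second line of Eq. (14): first-order expansion in eps2*Ne/N."""
    e = 1.0 - rho
    K = 2.0 * ne * eps2
    return (K * e / ((K + 1.0) ** e - 1.0)) / (2.0 * N)


def rate_ratio_eq1(ne, eps2, rho):
    """lambda / v from the second form of Eq. (1)."""
    K = 2.0 * ne * eps2
    return K * (rho - 1.0) / (1.0 - (K + 1.0) ** (1.0 - rho))


N, NE, EPS2, RHO = 5_000, 5_000, 5.89e-3, -0.3
u_full = fixation_new_mutation(N, NE, EPS2, RHO)
u_lin = fixation_new_mutation_linear(N, NE, EPS2, RHO)
print("u (first line) :", u_full)
print("u (linearised) :", u_lin)
print("relative gap   :", abs(u_full - u_lin) / u_full)
print("2N * u_lin     :", 2 * N * u_lin, "vs Eq. (1):", rate_ratio_eq1(NE, EPS2, RHO))
```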

Derivation of Eq. (5) from the ε²-Ohta Model

From the last line of Eq. (4), the evolutionary rate of a nearly neutral mutation (s < 0) can be rewritten as follows:

$$\lambda \approx ve^{\alpha s} \left( {1 - \frac{2s}{{\varepsilon^{2} }}} \right) \approx ve^{\alpha s} ,$$
(15)

where the new parameter α is given by

$$\alpha = 2\frac{{\ln (2N_{e} \varepsilon^{2} )}}{{\varepsilon^{2} }}.$$
(16)

In the case of 2Neε² > 1, the parameter α is always larger than zero. Note that the last approximation of Eq. (15) holds under the nearly neutral model, in which |s| ≪ ε²/2.

Next, consider the evolutionary rate of a gene, denoted by λg. Suppose that the magnitude of selection coefficient (|s|) varies among nucleotide sites of a gene according to an exponential distribution. One may then easily show

$$\frac{{\lambda_{{\text{g}}} }}{v} = \frac{1}{{1 + \alpha |\overline{s}|}} \approx \frac{{\varepsilon^{2} }}{{2|\overline{s}|\ln (2N_{{\text{e}}} \varepsilon^{2} )}}.$$
(17)

It is straightforward to derive Eq. (5) by using the dN/dS ratio as a proxy for the rate–mutation ratio (λg/v) in Eq. (17), under the assumption that ε² and s remain roughly the same across mammals.
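The first equality of Eq. (17) is the average of e^(αs) from Eq. (15) over an exponential distribution of |s| with mean |s̄| (recall s < 0, so e^(αs) = e^(−α|s|)). The Python sketch below checks this identity by midpoint quadrature; the function names, the truncation of the integral at 50 mean units, and the example values of α and |s̄| are assumptions for illustration.

```python
import math


def gene_rate_closed_form(alpha, mean_abs_s):
    """First equality of Eq. (17): 1 / (1 + alpha * |s_bar|)."""
    return 1.0 / (1.0 + alpha * mean_abs_s)


def gene_rate_quadrature(alpha, mean_abs_s, n=100_000, upper=50.0):
    """Midpoint-rule average of exp(-alpha*|s|) for |s| ~ Exponential(mean).

    The integration variable is t = |s| / mean_abs_s, truncated at `upper`.
    """
    h = upper / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += math.exp(-alpha * mean_abs_s * t) * math.exp(-t)
    return total * h


ALPHA, MEAN_ABS_S = 1384.0, 2.18e-3   # illustrative magnitudes only
print("closed form:", gene_rate_closed_form(ALPHA, MEAN_ABS_S))
print("quadrature :", gene_rate_quadrature(ALPHA, MEAN_ABS_S))
```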

Data availability

The datasets analyzed during the current study are available from the corresponding author on reasonable request.

References

  1. Binder BJ, Landman KA, Newgreen DF, Ross JV (2015) Incomplete penetrance: the role of stochasticity in developmental cell colonization. J Theor Biol 380:309–314. https://doi.org/10.1016/j.jtbi.2015.05.028

  2. Chandler CH, Chari S, Dworkin I (2013) Does your gene need a background check? How genetic background impacts the analysis of mutations, genes, and evolution. Trends Genet 29:358–366. https://doi.org/10.1016/j.tig.2013.01.009

  3. Chang HH, Hemberg M, Barahona M, Ingber DE, Huang S (2008) Transcriptome-wide noise controls lineage choice in mammalian progenitor cells. Nature 453:544–547. https://doi.org/10.1038/nature06965

  4. Dowell RD, Ryan O, Jansen A, Cheung D, Agarwala S, Danford T, Bernstein DA, Rolfe PA, Heisler LE, Chin B, Nislow C, Giaever G, Phillips PC, Fink GR, Gifford DK, Boone C (2010) Genotype to phenotype: a complex problem. Science 328:469. https://doi.org/10.1126/science.1189015

  5. Eldar A, Chary VK, Xenopoulos P, Fontes ME, Losón OC, Dworkin J, Piggot PJ, Elowitz MB (2009) Partial penetrance facilitates developmental evolution in bacteria. Nature 460:510–514. https://doi.org/10.1038/nature08150

  6. Ellegren H (2009) A selection model of molecular evolution incorporating the effective population size. Evolution 63:301–305. https://doi.org/10.1111/j.1558-5646.2008.00560.x

  7. Elowitz MB, Levine AJ, Siggia ED, Swain PS (2002) Stochastic gene expression in a single cell. Science 297:1183–1186. https://doi.org/10.1126/science.1070919

  8. Gillespie JH (2000) The neutral theory in an infinite population. Gene 261:11–18. https://doi.org/10.1016/s0378-1119(00)00485-6

  9. Gillespie JH (2001) Is the population size of a species relevant to its evolution? Evolution 55:2161–2169. https://doi.org/10.1111/j.0014-3820.2001.tb00732.x

  10. Griffiths AJ, Wessler SR, Carroll SB, Doebley J (2015) Introduction to genetic analysis. Macmillan Publishers, New York

  11. Hahn MW (2008) Toward a selection theory of molecular evolution. Evolution 62:255–265. https://doi.org/10.1111/j.1558-5646.2007.00308.x

  12. Hume DA (2000) Probability in transcriptional regulation and its implications for leukocyte differentiation and inducible gene expression. Blood 96:2323–2328

  13. Jensen JD, Payseur BA, Stephan W, Aquadro CF, Lynch M, Charlesworth D, Charlesworth B (2019) The importance of the neutral theory in 1968 and 50 years on: a response to Kern and Hahn 2018. Evolution 73:111–114. https://doi.org/10.1111/evo.13650

  14. Kern AD, Hahn MW (2018) The neutral theory in light of natural selection. Mol Biol Evol 35:1366–1371. https://doi.org/10.1093/molbev/msy092

  15. Khoury MJ, Flanders WD, Beaty TH (1988) Penetrance in the presence of genetic susceptibility to environmental factors. Am J Med Genet 29:397–403. https://doi.org/10.1002/ajmg.1320290222

  16. Kimura M (1962) On the probability of fixation of mutant genes in a population. Genetics 47:713–719

  17. Kimura M (1968) Evolutionary rate at the molecular level. Nature 217:624–626. https://doi.org/10.1038/217624a0

  18. Kimura M (1979) Model of effectively neutral mutations in which selective constraint is incorporated. Proc Natl Acad Sci USA 76:3440–3444. https://doi.org/10.1073/pnas.76.7.3440

  19. Kimura M (1983) The neutral theory and molecular evolution. Cambridge University Press, New York

  20. Kosiol C, Vinar T, da Fonseca RR, Hubisz MJ, Bustamante CD, Nielsen R, Siepel A (2008) Patterns of positive selection in six Mammalian genomes. PLoS Genet 4:e1000144. https://doi.org/10.1371/journal.pgen.1000144

  21. Lanfear R, Kokko H, Eyre-Walker A (2014) Population size and the rate of evolution. Trends Ecol Evol 29:33–41. https://doi.org/10.1016/j.tree.2013.09.009

  22. Lehner B (2013) Genotype to phenotype: lessons from model organisms for human genetics. Nat Rev Genet 14:168–178. https://doi.org/10.1038/nrg3404

  23. Li WH (1997) Molecular evolution. Sinauer Associates Incorporated, Sunderland

  24. Lynch M (2007) The origins of genome architecture. Sinauer Associates Inc, Sunderland

  25. Lynch M, Bobay L-M, Catania F, Gout J-F, Rho M (2011) The repatterning of eukaryotic genomes by random genetic drift. Annu Rev Genom Hum Genet 12:347–366. https://doi.org/10.1146/annurev-genom-082410-101412

  26. Maamar H, Raj A, Dubnau D (2007) Noise in gene expression determines cell fate in Bacillus subtilis. Science 317:526–529. https://doi.org/10.1126/science.1140818

  27. Mullis MN, Matsui T, Schell R, Foree R, Ehrenreich IM (2018) The complex underpinnings of genetic background effects. Nat Commun 9:3548. https://doi.org/10.1038/s41467-018-06023-5

  28. Nadeau JH (2001) Modifier genes in mice and humans. Nat Rev Genet 2:165–174. https://doi.org/10.1038/35056009

  29. Ohta T (1973) Slightly deleterious mutant substitutions in evolution. Nature 246:96–98. https://doi.org/10.1038/246096a0

  30. Ohta T (1993) An examination of the generation-time effect on molecular evolution. Proc Natl Acad Sci USA 90:10676–10680. https://doi.org/10.1073/pnas.90.22.10676

  31. Ozbudak EM, Thattai M, Kurtser I, Grossman AD, van Oudenaarden A (2002) Regulation of noise in the expression of a single gene. Nat Genet 31:69–73. https://doi.org/10.1038/ng869

  32. Raj A, van Oudenaarden A (2008) Nature, nurture, or chance: stochastic gene expression and its consequences. Cell 135:216–226. https://doi.org/10.1016/j.cell.2008.09.050

  33. Raj A, Rifkin SA, Andersen E, van Oudenaarden A (2010) Variability in gene expression underlies incomplete penetrance. Nature 463:913–918. https://doi.org/10.1038/nature08781

  34. Riordan JD, Nadeau JH (2017) From peas to disease: modifier genes, network resilience, and the genetics of health. Am J Hum Genet 101:177–191. https://doi.org/10.1016/j.ajhg.2017.06.004

  35. Süel GM, Kulkarni RP, Dworkin J, Garcia-Ojalvo J, Elowitz MB (2007) Tunability and noise dependence in differentiation dynamics. Science 315:1716–1719. https://doi.org/10.1126/science.1137455

  36. Taylor MB, Ehrenreich IM (2014) Genetic interactions involving five or more genes contribute to a complex trait in yeast. PLoS Genet 10:e1004324. https://doi.org/10.1371/journal.pgen.1004324

  37. Vu V, Verster AJ, Schertzberg M, Chuluunbaatar T, Spensley M, Pajkic D, Hart GT, Moffat J, Fraser AG (2015) Natural variation in gene expression modulates the severity of mutant phenotypes. Cell 162:391–402. https://doi.org/10.1016/j.cell.2015.06.037

  38. Wernet MF, Mazzoni EO, Çelik A, Duncan DM, Duncan I, Desplan C (2006) Stochastic spineless expression creates the retinal mosaic for colour vision. Nature 440:174–180. https://doi.org/10.1038/nature04615

  39. Zuckerkandl E, Pauling L (1965) Evolutionary divergence and convergence in proteins. In: Bryson V, Vogel HJ (eds) Evolving genes and proteins. Academic Press, New York, pp 97–166

Acknowledgements

The author is grateful to all research group members for their constructive comments on an early version of the manuscript.

Funding

This work was partly supported by funding from Iowa State University.

Author information

Corresponding author

Correspondence to Xun Gu.

About this article

Cite this article

Gu, X. Random Penetrance of Mutations Among Individuals: A New Type of Genetic Drift in Molecular Evolution. Phenomics 1, 105–112 (2021). https://doi.org/10.1007/s43657-021-00013-2

Keywords

  • Random penetrance
  • Genetic drift
  • Nearly neutral evolution
  • Effective population size
  • Rate of molecular evolution