In college, biologists learn how Darwin and Mendel, whose ideas eventually resulted in the modern evolutionary synthesis [1], prevailed over Lamarck and Lysenko. Now, from a cursory reading of the literature, it is possible to get the impression that this is changing [2-4]. But, although epigenetics is clearly enriching modern genetic research, reports of the end of genetics have - in our opinion - been an exaggeration. Since the 1990s, the molecular basis of hundreds of naturally occurring phenotypic variants has been identified in crop or wild species and, overwhelmingly, DNA sequence differences are involved (for example, [5-7]). Indeed, the number of natural epialleles that we know of in plants is only about a dozen. However, it is very likely that there is an ascertainment bias in favor of DNA sequence changes in the studies carried out so far. Thus, the extent to which epigenetic variation contributes to phenotypic variation in plants is still not known with certainty.

Epigenetics was a term coined by Waddington, to reflect - in modern terms - the causal mechanisms that lie between genes and phenotypes [8]; however, today it is mainly used to describe modifications that cause changes in gene expression that are stably transmitted during mitosis or meiosis, but that do not involve differences in the underlying DNA sequence. In eukaryotes, most known epigenetic mechanisms are chromatin based, and may involve still poorly defined combinations of post-translational histone modifications and histone variants, small or long non-coding RNAs, and DNA methylation [9]. Although the role of epigenetic processes in development is now well established, the field that investigates the transgenerational inheritance of epigenetic modifications is still in its infancy.

Natural epialleles in plants

The first natural plant mutant for which the molecular basis was determined to be an epimutation rather than a change in DNA sequence was a peloric variant of toadflax, Linaria vulgaris. Whereas toadflax normally has bilaterally symmetrical flowers, the flowers of this variant show radial symmetry, a phenotype strikingly similar to the one seen in induced cycloidea (cyc) mutants of snapdragon, Antirrhinum majus [10]. Isolation of a cyc homolog from toadflax revealed that it was genetically linked to the peloric phenotype, and that RNA of the cyc homolog did not accumulate in the peloric strain [11]. The open reading frame of cyc, however, appeared intact, and on DNA blots no gross differences could be detected around the gene. In contrast, analysis with DNA-methylation-sensitive restriction enzymes provided evidence for increased methylation of the cyc locus. A direct relationship between DNA methylation and reduced cyc expression was deduced from the analysis of spontaneous somatic revertants ranging in phenotype from semi-peloric to near wild type. In these plants, DNA methylation was reduced, which also confirmed that the open reading frame was indeed intact and functional. Notably, demethylation was less pronounced in semi-peloric than in near-wild-type flowers, both supporting a direct relationship between DNA methylation and gene expression, and confirming that DNA methylation is not necessarily an all-or-nothing affair [11, 12].

A second example of a natural epimutation is provided by the colorless nonripening (Cnr) locus from tomato [13]. As in the example from toadflax, the causal locus, which encodes another type of transcription factor, is intact in the non-ripening strain, but expressed at much lower levels. Again, there were differences in DNA methylation. This being a more recent study, the authors were able to investigate DNA methylation more quantitatively, using bisulfite sequencing. A block of about 300 bp approximately 2 kb upstream of the gene was heavily methylated in the non-ripening plants. Strikingly, the Liberto wild-type background, in which the colorless nonripening variant was found, was also quite highly methylated in this region, although not quite as extensively as Cnr plants. In contrast, another wild-type strain, Ailsa Craig, had very low levels of DNA methylation at the locus, even though there were no obvious differences in the DNA sequence. Thus, one may speculate that the Liberto strain is more likely to give rise to Cnr mutant plants than the Ailsa Craig strain.

Within 95 kb around the Cnr locus, the epimutant and the Liberto parent had no DNA sequence differences. Since the final mapping interval of 13 kb was approximately in the center of this 95 kb region, it is reasonable to assume that a nearby structural variation is not responsible for the modification of Cnr chromatin [13], although such a variant might have provided an initial trigger, as discussed below with respect to FOLT in Arabidopsis thaliana [14].

In contrast to cyc and Cnr, several other epialleles are clearly associated with alterations in DNA sequences. Because DNA methylation may spread outwards from repeats and transposable elements [15-17] - although it does not always [18] - structural variants could in fact be the primary causes of differences in the activity of adjacent genes, with DNA methylation playing a secondary or mediating role. One such example is provided by melon plants, in which the gynoecious (g) locus is inactive and which therefore produce only female flowers [19]. The sequences responsible for loss of g expression were mapped to a 1.4-kb non-coding sequence, which contained a DNA transposon insertion in all gynoecious plants tested. The transposon was heavily methylated, and high level DNA methylation was also detected in the promoter of the adjacent gene, perhaps as a result of spreading from the transposon. However, in a recombinant gene in which the transposon was segregated away, DNA methylation was reduced and gene expression was increased. Moreover, DNA methylation was much lower in phenotypically revertant branches, indicating that the transposon effects were variable, at least to a certain extent. A similar case has been described for rice plants with a metastable epiallele at the DWARF1 (D1) locus [20], with a large tandem repeat being responsible for variable DNA methylation.

Thus, all natural epialleles reported to date for which sequence information is available have involved a gain or loss of DNA methylation. Moreover, these differences in DNA methylation are often in transposable elements or other types of repeat sequences located near or within the affected genes. This suggests that the 'epimutability' of many genes is ultimately conditioned by the presence of repeat sequences near or within them, and is thus likely to differ substantially between genotypes (Figure 1a).

Figure 1

Classes of epialleles. (a) Epigenetic modifications and associated silencing of the adjacent gene is dependent on a specific cis-element, often a repetitive element. (b) Epigenetic modification is triggered by another locus or allele. Once the modification has been established, the trigger is no longer required for its maintenance. (c) Epigenetic modification is triggered by another locus or allele, but the trigger is permanently required.

Communication between homologous sequences

In A. thaliana, tandem repeats in the promoter are also associated with gene silencing, in this case of the FWA gene. In wild-type plants, these repeats are methylated, except in the triploid endosperm, where the two copies of the maternal allele are demethylated and expressed [21]. Stable epialleles in which the repeats have become unmethylated throughout the life cycle have been obtained either after ethyl methanesulfonate (EMS) mutagenesis, or in plants that are defective in DNA methylation. Demethylation in the adult plant leads to activation of FWA and late flowering [22, 23]. Once fully unmethylated, these tandem repeats very rarely, if ever, become spontaneously re-methylated [22, 24]. In contrast, when an unmethylated copy is transformed into wild-type plants, its repeats become rapidly methylated, shutting down expression of the transgene, apparently because of information transfer from the endogenous, methylated copies [25]. Such a communication between alleles may be widespread [26], but is not observed in crosses of plants with a methylated and silenced FWA allele to plants with an unmethylated, activated copy at the endogenous locus, and fwa epimutants therefore behave like normal mutants (as do the examples discussed above, with the exception of the reversion events).

Epigenetic interactions at the FWA locus thus differ from the classic examples of paramutation in maize, in which silenced alleles frequently induce silencing of normal alleles [27, 28] (Figure 1). Nonetheless, as with FWA, paramutation has been linked to tandem repeats in the promoter of paramutable alleles at the maize b1 locus [29]. Tandem repeats are seemingly also important for paramutation at the r1 locus, but in this case they are apparently much larger, as the r1 locus is a tandem array of several very similar genes [30]. In contrast, the role of repeats in paramutation at the p1 locus is less clear [31].

Just as information between alleles or between endogenous genes and transgenes can be transferred (relying on short interfering RNAs (siRNAs) and the DNA methylation machinery they recruit [28, 32]), there is communication between homologous sequences throughout the genome. The first case reported in A. thaliana was that of the PAI family of genes. One natural strain of A. thaliana has two PAI genes in an inverted tandem arrangement, plus two more dispersed single copies, and all four genes are heavily methylated [33]. Another strain has only three single copies, which are not methylated, but which become methylated after a cross to the strain with the inverted tandem copies [18, 34].

A similar situation, with interesting phenotypic consequences, is seen at the FOLT1 and FOLT2 loci in A. thaliana [14]. In one strain, the FOLT2 locus contains multiple truncated copies, and siRNAs produced by these truncated versions target the intact FOLT1 copy and silence it. Notably, FOLT2 itself escapes complete silencing, preserving FOLT activity. Another strain lacks the FOLT2 locus, which induces silencing, but has an active FOLT1 copy. When this copy is replaced by the silenced FOLT1 allele from the other strain through crossing, plants lack FOLT activity and almost always die [14]. Important for this phenomenon is that FOLT1 stays silenced even after the FOLT2 locus that induces silencing has been segregated away (Figure 1b). In other words, FOLT1 may be seen as a 'pure' epiallele [35], but without complete information about the history of the genetic background it has passed through, it is impossible to know whether it reached this state without any external influence. Genome-wide analyses with genetic material derived from crossing closely related tomato species have recently confirmed that such trans interactions are likely to be quite common, and that they may underlie many aspects of the superior or inferior performances of hybrid plants [36]. An important finding in this case was that silencing was only established gradually - similar to what has been observed in A. thaliana [37] - which is discussed below. Once complete genome sequences for the tomato lines become available, it will also be possible to address systematically the question of whether there are epialleles that are absolutely dependent on a trans-acting trigger (Figure 1c).

Spontaneous changes in DNA methylation patterns

The examples discussed so far indicate that changes in DNA methylation patterns are far from random, but that they are also not always entirely predictable. To distinguish spontaneous changes from the effects of interactions between different genomes and of new structural variants, whole-genome methylation patterns were studied in isogenic A. thaliana lines [38, 39]. Lines were derived from a single progenitor and then propagated in a benign greenhouse environment by single-seed descent. After thirty generations, almost 10% of all methylated cytosines in the genome had increased or decreased methylation in at least one out of ten lines examined. However, there is little evidence that such differentially methylated positions (DMPs) can have major effects on the activity of adjacent genes. Rather, it is large contiguous regions of differential methylation (differentially methylated regions, or DMRs), as in the epialleles discussed above, which normally matter. In contrast to DMPs, there were very few DMRs in the studied A. thaliana lines [38, 39].
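
The distinction between isolated DMPs and contiguous DMRs can be made concrete with a small sketch. The Python function below is not the procedure used in the cited studies [38, 39]; it is a hypothetical illustration that assumes DMPs have already been called and are given as base-pair coordinates on one chromosome. The function name call_dmrs and the thresholds max_gap and min_sites are assumptions chosen only for the example.

```python
def call_dmrs(dmp_positions, max_gap=50, min_sites=5):
    """Group DMP coordinates into candidate DMRs.

    A candidate region is a run of DMPs in which neighbouring sites are
    at most `max_gap` bp apart and which contains at least `min_sites`
    sites. Both thresholds are illustrative assumptions, not published
    values.
    """
    regions = []
    cluster = []
    for pos in sorted(dmp_positions):
        # Close the current run when the next DMP is too far away.
        if cluster and pos - cluster[-1] > max_gap:
            if len(cluster) >= min_sites:
                regions.append((cluster[0], cluster[-1], len(cluster)))
            cluster = []
        cluster.append(pos)
    # Handle the final run.
    if len(cluster) >= min_sites:
        regions.append((cluster[0], cluster[-1], len(cluster)))
    return regions


# Two isolated DMPs and one dense run of five DMPs.
example = [100, 5000, 5010, 5025, 5040, 5060, 9000]
print(call_dmrs(example))
```

Under these assumed thresholds, scattered single sites are discarded and only the dense run is reported as a region, which mirrors the observation that many DMPs need not translate into many DMRs.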

In agreement with what is known about the establishment and maintenance of DNA methylation, DMPs were not randomly distributed. DNA methylation on and near transposons was highly stable, whereas it often changed over genes and far away from transposons. Moreover, the same changes were seen much more often than expected by chance in different lines, indicating that certain sites are considerably less stable than others. The bias in spontaneous DNA methylation changes parallels what has been reported for differences between wild strains, in which transposon methylation is much more similar than genic methylation [40]. It is also consistent with transposon methylation being under much greater selective pressure. Loss of DNA methylation has comparatively few effects on the expression of protein-coding genes, but it greatly reduces transposon silencing [41-43]. In turn, active transposons are powerful mutagens.

Lessons from crosses between methylated and demethylated genomes

Given the frequent implication of repeat elements in the epimutability of genes, an important question is the extent to which the accidental loss of DNA methylation over transposons and other repeats can be inherited and affect phenotypes. Two experimental studies have provided genome-wide answers to this question in A. thaliana [44, 45]. Both studies relied on the creation of epigenetic recombinant inbred lines (epiRILs). In one case [44], the epiRILs were derived from the cross of a wild-type individual with a near-isogenic plant homozygous for a mutant allele of MET1, which encodes the main DNA methyltransferase responsible for maintaining CG methylation in repeat sequences, as well as in gene bodies. In another case [45], a wild-type individual was crossed with a plant mutant for DDM1, which encodes a putative chromatin remodeler involved in maintaining all types of DNA methylation (CG, CHG and CHH), specifically over repeat sequences. After the initial cross, a single F1 individual was either selfed [44], or backcrossed to the wild-type parent [45]. F2 progeny homozygous for the wild-type MET1 or DDM1 allele were selected, and epiRILs were propagated through seven rounds of selfing. Analysis of these lines indicated that met1- and ddm1-induced hypomethylation of repeat sequences could be either stably inherited for at least eight generations or else fully reversed [44, 45]. Reversion was mediated by small RNAs mainly acting in cis, and often occurred in several steps over successive generations [37]. Moreover, heritable variation for several complex traits was observed in the epiRILs [44-47], highlighting the potentially important role of repeat-associated epigenetic changes in generating heritable phenotypic diversity.
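
The two crossing designs differ in how much of the genome is expected to descend from the hypomethylated parent. The short Python sketch below works through that arithmetic using standard Mendelian expectations. It is an illustration rather than a result taken from [44, 45], and it assumes loci unlinked to MET1 or DDM1, no selection other than at those loci, and no remethylation. The function name and the design labels "selfed" and "backcrossed" are hypothetical.

```python
from fractions import Fraction


def expected_mutant_derived_fraction(design):
    """Expected share of alleles descending from the mutant parent.

    Applies to loci unlinked to the selected MET1 or DDM1 locus.
    Selfing to homozygosity does not change this expectation, so it
    also gives the expected share of loci fixed for the
    mutant-derived (hypomethylated) state in the final lines.
    """
    # Every F1 locus carries one mutant-derived and one wild-type copy.
    f1 = Fraction(1, 2)
    if design == "selfed":
        # Selfing the F1 leaves the expected allele frequency unchanged.
        return f1
    if design == "backcrossed":
        # Half of each genome now comes from the wild-type parent.
        return (f1 + Fraction(0)) / 2
    raise ValueError("design must be 'selfed' or 'backcrossed'")


for design in ("selfed", "backcrossed"):
    print(design, expected_mutant_derived_fraction(design))
```

Under these assumptions the selfed design starts from an expectation of one half and the backcrossed design from one quarter; observed values would deviate wherever hypomethylation reverts or loci are linked to the selected allele.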


Although the mechanisms by which repeat elements are targeted for DNA methylation and become fully methylated are now understood in detail, much less is known about the tempo of this process, which presumably is both progressive over several generations and dependent on a multiplicity of factors, such as the type of repeat sequence concerned and environmental conditions. Moreover, it is still unclear how DNA methylation can be lost over repeat elements in natural settings, and how stable hypomethylation can be. Here again, the DNA sequence and environment are likely key determinants. Indeed, there are now several reports of transgenerational effects of stresses such as heat, where the progeny of stressed plants apparently withstand a specific stress better than the initial line - amazingly similar to what Lamarck and Lysenko believed [48-52]. Assuming such phenomena can be confirmed, they must be the product of Darwinian evolution, which would have produced the (epi)genetic mechanisms that underlie such transgenerational effects. That the environment can effect heritable changes is not new; inducible hypermutability is a well-documented phenomenon in bacteria [53]. Exploring the role of the environment in inducing epigenetic variation is therefore an important task for the future, as is the study of epigenome-wide changes that can be induced by different environments. Similarly, we need more knowledge of how the genome-wide effect sizes of genetic and epigenetic alleles compare. Finally, we need an explicit theory of population epigenetics that describes the parameters under which epimutations could contribute to evolution (Figure 2).

Figure 2

The potential role of inherited epigenetic changes, comparing the effects of spontaneous and induced epimutations. A population of genotypically identical individuals is shown, which contain a single locus that can exist in two epigenetic states. Like spontaneous epimutations, induced epimutations are maintained across generations, but revert randomly without the inducing environment (which almost never happens for DNA mutations). The epiallele marked in purple is disadvantageous in a normal environment (leading to increased death; red crosses). In a stress environment (indicated by a thunder bolt), the unmodified allele (shown in grey) is disadvantageous. If the environment changes randomly from generation to generation, induced epivariation is unlikely to be advantageous. If there are longer episodes of stress, induced epivariation could be advantageous, and Darwinian selection might favor alleles that can become subject to induced epivariation. However, formalization is needed to determine the boundary conditions for such a scenario.
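
The scenario in Figure 2 can be explored with a toy numerical sketch. The Python code below is not a model from the cited literature and is no substitute for the formal theory called for above. It is a deterministic recursion for an infinite population, built on assumptions introduced only for illustration: the epiallele state of offspring is set by the parental environment (induction at rate induce after a stress generation, reversion at rate revert after a normal generation), the mismatched state pays a fitness cost s, and a strictly alternating environment stands in for rapid environmental change. All function and parameter names are hypothetical.

```python
import math


def mean_log_fitness(envs, induce, revert=0.2, s=0.5):
    """Mean log fitness per generation of a lineage with one epi-locus.

    envs   : sequence of booleans, True for a stress generation
    induce : probability that an unmodified allele is transmitted as
             modified after a stress generation (0 = non-inducible)
    revert : probability that a modified allele is transmitted as
             unmodified after a normal generation
    s      : fitness cost of carrying the mismatched state
    """
    p = 0.0  # frequency of the modified epiallele among newborns
    total = 0.0
    for stressed in envs:
        # Viability selection in the current environment.
        if stressed:
            w_mod, w_unmod = 1.0, 1.0 - s
        else:
            w_mod, w_unmod = 1.0 - s, 1.0
        mean_w = p * w_mod + (1.0 - p) * w_unmod
        total += math.log(mean_w)
        p = p * w_mod / mean_w  # frequency among survivors
        # Transmission depends on the environment the parents experienced.
        if stressed:
            p += (1.0 - p) * induce
        else:
            p *= 1.0 - revert
    return total / len(envs)


def alternating(n):
    """Stress in every other generation, starting with stress."""
    return [i % 2 == 0 for i in range(n)]


def episodes(n, length):
    """Blocks of `length` stress generations, then `length` normal ones."""
    return [(i // length) % 2 == 0 for i in range(n)]


for label, envs in (("alternating", alternating(200)),
                    ("episodes of 10", episodes(200, 10))):
    print(label,
          "non-inducible:", round(mean_log_fitness(envs, induce=0.0), 3),
          "inducible:", round(mean_log_fitness(envs, induce=0.9), 3))
```

With these illustrative parameter values, the inducible lineage has the lower mean log fitness when the environment flips every generation and the higher one when stress comes in episodes of ten generations, in line with the qualitative argument of the caption. Where that boundary lies depends entirely on the assumed rates and costs.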