Role of Random Mutations in Evolution

Land plants (embryophytes) are a monophyletic group believed to have evolved from a single clade of streptophyte algae (Kenrick and Crane 1997; Magallón and Hilu 2009; de Vries and Archibald 2018; Morris et al. 2018). In spermatophytes, spontaneous mutations have played a pivotal role in evolution. Examples include bud variations (sports); the first bud sports were noted by Peter Collison in 1741, as reported by Shamel and Pomeroy (1936). Bud sports were also reported in chrysanthemum, rose, banana, pineapple, and citrus varieties, but some of these variations were not transmissible to progeny (East 1917). Bud sport mutations most likely occurred in the shoot apical meristem (SAM) or an axillary meristem (AXM). Prior to the use of chemicals or radiation (induced mutagenesis), more than 1600 fruit tree bud sports were noted and propagated as distinct varieties (Shamel and Pomeroy 1936). These somatic mutants could retain ideal traits of the original plant along with the new/altered trait(s).

Somatic mutations were defined by Schoen and Schultz (2019) as “any genetic change that occurs during the plant mitotic cell cycle in either sporophyte or gametophyte and is passed on to descendent cells.” Somatic mutations that arose in SAM and/or AXM could yield stable variation through asexual propagation and in progeny—if that altered genotype yielded flowers and viable seeds (Schoen and Schultz 2019). Also, the specific cells/cell layers the mutations arose in were important. Angiosperms generally display the tunica-corpus arrangement of SAM, containing stratified layers of tunica (such as L1 and L2) and multi-layered corpus (beginning at L3 in this example) (Bowman and Eshed 2000; Evert 2006). In AXM development, derivative corpus cells start to undergo cell divisions, then derivative tunica layers join in. Although all contributed to AXM development, mutations in cells located in the L2 layer were most often transmitted to germline cells (D’Amato 1997; Foster and Aranzana 2018).

As noted above, somatic mutations could be transmitted to progeny. In addition to bud sports, seeds of increased age are examples of tissues that could yield heritable somatic mutations (D’Amato 1997). Although not working on plants, Muller (1927) stated “most modern geneticists will agree that gene mutations form the chief basis of organic evolution.”

Spontaneous mutation frequencies could vary within different tissues of plants as well as between different plant types. Wang et al. (2019) sequenced the genomes of various tissues and ages from 22 plants representing annuals and perennials. In general, perennial roots displayed greater numbers of mutations compared to shoots, which could reflect the different mutation rates, per cell division, of SAM and root apical meristems (RAM). This difference was not noted in annual plants. The per-generation mutation rate in perennials was noted to be much higher than in annuals; however, annuals tended to evolve faster than perennials on a per-year basis (Wang et al. 2019). Schoen and Schultz (2019) noted that both external and internal environmental factors could play roles in genetic instability leading to mutations, in addition to stage in plant development, how early the mutation occurred, how many cells arose from it, and the degree of apical dominance the plant displayed.

Heritable somatic and gametic mutations (occur during meiosis, resulting in genetically altered zygote), collectively called de novo mutations, can accumulate over generations. To study recent evolution in a plant species based on de novo mutations, plants of Arabidopsis thaliana were analyzed (Exposito-Alonso et al. 2018). Native to Africa and Eurasia, these self-pollinating plants recently colonized North America (primarily clade HPG1), so mutations arising from initial progenitors could be assessed. Genomes from the plants in that clade maintained as herbarium specimens (earliest from 1863) up to current plants were sequenced and compared for presence of de novo mutations based on single nucleotide polymorphisms (SNPs). Researchers identified >5000 mutations with some appearing at mid-to-high frequencies in genomes. Narrow sense heritability estimates >0.5 were obtained for the following traits: root growth rate, root gravitropism, and shoot growth rate which could have resulted from positive natural selection of these de novo mutations (Exposito-Alonso et al. 2018).

Plant Domestication and Breeding

Random Mutations

Domestication of crop plants was estimated to have occurred 11,000 to 13,000 years ago when people began to transition from hunter-gatherers to farmer-herders (Evert and Eichhorn 2013). Domesticated plants are those that were genetically and phenotypically different or distinct from their wild progenitors (Meyer et al. 2012). Some general traits of domesticated food crops included larger sized grain or fruit, increased apical dominance, and seed retention on the plant vs. dispersal (Doebley et al. 2006). Approximately 2500 plant species, including those used for food, feed, fiber, medicine, and ornamentals, have been domesticated for current use (Dirzo and Raven 2003). Examples of two “domestication” genes are briefly described below.

Zea mays ssp. mays (maize) underwent domestication approximately 9000 years ago in Mexico. It arose from Z. mays ssp. parviglumis (teosinte) (reviewed by Doebley 2004; Doebley et al. 2006; Stitzer and Ross-Ibarra 2018). Differences between teosinte and maize included inflorescence, fertility, and seed set, as well as differences in vegetative growth, with maize demonstrating reduced axillary branching (increased apical dominance). The teosinte branched 1 gene (tb1) was implicated in this altered plant architecture. Humans unknowingly selected plants with greater expression of tb1, which resulted in repression of branching/tillering and better “ear” characteristics.

There was approximately 2 times greater transcription of maize tb1 compared to teosinte tb1, but there were no amino acid differences in their coding sequences (Lukens and Doebley 1999). Quantitative differences in gene expression vs. qualitative suggested external sequences were involved. Maize near isogenic lines containing teosinte tb1 and upstream sequences (next gene was 161 kbp upstream) were crossed with its inbred parent to look for cross-over events within that intergenic region in the F2 generation. The sequence 58 to 69 kbp upstream from tb1 appeared to contain cis-regulatory sequences and controlled expression of multiple traits (basal branching and ear phenotype; Clark et al. 2006). In fact, this sequence was found to contain two transposable elements, confirmed through sequence analysis of bacterial artificial chromosome (BAC) clones from libraries of teosinte and maize inbreds (Zhou et al. 2011).

A trait that was altered in the domestication of barley (Hordeum vulgare ssp. vulgare) from its progenitor H. vulgare ssp. spontaneum was the retention of seeds on the plant, enabling more efficient harvest (Meyer et al. 2012; Pourkheirandish et al. 2015). This trait arose from mutations in one of the two closely linked complementary dominant genes controlling brittle rachis (Btr1 and Btr2); at least one gene needed to be mutated for the non-brittle phenotype to be observed. Farmers from different regions selected for this trait, with the btr1 mutation predominant in Europe/Middle East and btr2 predominant in East Asia/Africa. Sequence analysis revealed Btr1 encoded a transmembrane protein; compared to Btr1, the mutant btr1 gene sequence differed by a 1-bp deletion. Btr2 encoded a soluble protein; compared to the Btr2 gene sequence, mutant btr2 had an 11-bp deletion (Pourkheirandish et al. 2015).

Human-directed selection of plants and natural selection may work in concert, or appear antagonistic as exemplified in the barley example above. The non-brittle rachis mutants were of benefit to farmers facilitating efficient seed harvest, whereas brittle rachis enabled many/most seeds to escape harvest and disperse, allowing continued growth and adaptation (Allard 1999). In addition to farmers selecting the best seeds to grow or propagules to maintain, plant breeders began a concerted effort to take advantage of spontaneous mutations for analysis and potential incorporation into their plant species of interest. Plant breeders would assemble and assess genetic variation to meet their specific goals in a given plant species. As indicated, de novo mutations contributed to meeting their goals, but due to the “extreme infrequency of their occurrence” (Muller 1927), induced mutations became a valued tool in the plant breeder’s toolbox still in use today.

Induced mutations

After Muller’s mutation experiments using X-rays on Drosophila melanogaster (1927), research on maize and barley demonstrated that X-rays could similarly generate de novo mutations in plants (Stadler 1928a, 1928b). At the time, Stadler did not appear to show great enthusiasm for its promise: “The author does not predict much practical benefit from induced mutations except in such special cases as to produce bud variations in fruit trees and such highly heterozygous plants” (Stadler 1930).

Limited induced mutation experimentation took place for a few decades, until after World War II. At an address before the United Nations (UN) General Assembly in 1953, US President Dwight Eisenhower gave an “Atoms for Peace” speech (IAEA 2021). Eisenhower proposed creation of an international atomic energy agency; one goal was to have experts who could “apply atomic energy to the needs of agriculture…” The International Atomic Energy Agency (IAEA) was created in 1957 and jointly promoted and kept track of mutation breeding efforts with the UN Food and Agriculture Organization (FAO). After the release of their “Manual on Mutation” in 1969 as well as hands-on training courses, a greater effort was placed on induced mutations by plant breeders (Micke et al. 1990). Types of mutagenic agents plant breeders employed included physical (X-rays, gamma rays, ion beams, fast neutrons) and chemical (e.g., ethyl methanesulfonate (EMS), sodium azide). Mutation rates were difficult to determine, and varied by locus being studied and the specific mutagen used (as well as duration and intensity of treatment), so the estimated range using Solanum lycopersicum (tomato) and barley as examples was 8×10⁻² to 2.7×10⁻³ (Brock 1971).

The first registered induced mutant plant was a Nicotiana tabacum (tobacco) variety. Hybrid F1 “materials” were treated with X-rays; the resulting mutant, ‘Chlorina F1,’ displayed a pale color and higher leaf quality. It was registered in 1929 and released as a variety in the mid-1930s (Jankowicz-Cieslak et al. 2017). This and many other registered mutants can be found in the FAO/IAEA Mutant Variety Database (FAO/IAEA-MVD 2021). To date, 75 countries have registered their mutant materials in this database.

An example of a mutant that arose from the combination of both spontaneous and induced mutations is a common citrus many have consumed—‘Rio Star’ red grapefruit. The origin of grapefruit (Citrus paradisi) was postulated to be the West Indies in the 1700s, and arose as a natural hybrid of C. grandis (pummelo) and C. sinensis (sweet orange) (Rouse et al. 2001; da Graça and Louzada 2004; de Moraes et al. 2007). Use of molecular markers confirmed grapefruit shared markers with pummelo and sweet orange (Nicolosi et al. 2000). Grapefruit was introduced into the USA (Florida) via seeds in the 1800s. One of the seedling-derived plants was propagated as the variety ‘Duncan’; the fruit was white and contained numerous seeds.

‘Duncan’ was the progenitor of all grapefruit varieties in the USA (Rouse et al. 2001; da Graça and Louzada 2004). Two progeny that arose from ‘Duncan’ were maintained as separate varieties, ‘Marsh’ and ‘Walters.’ Pink-fleshed and seedless varieties arose from each in steps that involved both spontaneous and induced mutations. Information regarding these lineages was described by Hensz (1991), Rouse et al. (2001), da Graça and Louzada (2004; contained a family tree), and Louzada (2009); summaries regarding the development of two important commercial varieties are included below.

  • ‘Marsh’: White fruit, seedless. A bud sport on ‘Marsh’ that produced pink fruit (pulp) was named ‘Thompson Pink.’ In Texas, a bud sport on ‘Thompson Pink’ that produced darker pink fruit was named ‘Ruby Red.’ In 1963, budwood from ‘Ruby Red’ was irradiated with thermal neutrons, and one tree (A&I 1–48) produced darker fruit. A bud sport on that tree produced red fruit and was named ‘Rio Red.’ It was registered in 1970 as mutant #281 (FAO/IAEA-MVD 2021).

  • ‘Walters’: White fruit, seeded. A bud sport on ‘Walters’ that produced pink fruit was named ‘Foster.’ A bud sport on ‘Foster’ that produced seedless fruit was named ‘Foster Seedless.’ In Texas, a bud sport on ‘Foster Seedless’ generated red fruit, but contained seeds, and was named ‘Hudson.’ In 1969, seeds of ‘Hudson’ were irradiated with thermal neutrons; one seedling produced seedless red fruit and was named ‘Star Ruby.’ It was registered in 1984 as mutant #282 (FAO/IAEA-MVD 2021).

Together, ‘Rio Red’ and ‘Star Ruby’ account for the majority of grapefruit grown in Texas (Sauls 2008) and are marketed as ‘Rio Star’ across the USA (Taylor et al. 1988).

In vitro mutations via tissue culture

Plant tissue culture, quite simply, is the culture of plant cells/tissues/organs in/on artificial media for generation of growth. Addition of hormones or their synthetic counterparts, plant growth regulators (PGRs), provided cues to the cells regarding the desired response. This was elegantly demonstrated in an auxin/cytokinin media matrix conducted with tobacco tissues by Skoog and Miller (1957), and was visually recreated in a media overview by Phillips and Garda (2019). Two reviews by distinguished researchers in the field discussed the history of plant tissue culture (Gamborg 2002; Thorpe 2007). Plant regeneration could be achieved via organogenesis or somatic embryogenesis (Carman 1990; Hicks 1994; Dodds and Roberts 1995), and could be direct or indirect (callus intermediate stage; George and Debergh 2008).

With increased success in culturing and regenerating numerous plant species, it became apparent that some plants exhibited different phenotypes than the plants that supplied donor tissues. This phenomenon was labeled somaclonal variation (SV) by Larkin and Scowcroft (1981). Some of the noted somaclonal variants displayed altered traits that were heritable, therefore classified as mutants (Meins Jr 1983); the noted rates of SV ranged from 8×10⁻³ to 2×10⁻⁷. For comparison, the spontaneous mutation rate in Arabidopsis thaliana plants maintained for 30 generations by single seed descent was estimated to be 7×10⁻⁹ (Ossowski et al. 2010). Since that time, somaclonal variants have been generated in numerous plant species, with many of the noted variations actually confirmed to be stably inherited mutations. Unfortunately, most yielded undesirable or deleterious changes (Skirvin et al. 1994; De Klerk 1990).
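To put the rates quoted above side by side, the short Python sketch below computes fold differences between the reported SV range and the Arabidopsis spontaneous rate. It assumes, for illustration only, that the values are directly comparable; the cited sources may express them in different units (for example, per locus versus per site per generation), so the ratios should be read as rough orders of magnitude at best. The function and constant names are hypothetical.

```python
import math

# Rates quoted in the paragraph above.
SV_RATE_HIGH = 8e-3       # upper end of the reported SV range
SV_RATE_LOW = 2e-7        # lower end of the reported SV range
SPONTANEOUS_RATE = 7e-9   # Arabidopsis thaliana estimate (Ossowski et al. 2010)


def fold_difference(rate, baseline):
    """Return how many times larger `rate` is than `baseline`."""
    return rate / baseline


def orders_of_magnitude(rate, baseline):
    """Return the base-10 logarithm of the fold difference."""
    return math.log10(fold_difference(rate, baseline))


low_fold = fold_difference(SV_RATE_LOW, SPONTANEOUS_RATE)
high_fold = fold_difference(SV_RATE_HIGH, SPONTANEOUS_RATE)
print(f"low end of SV range: about {low_fold:.0f}-fold above baseline")
print(f"high end of SV range: about {high_fold:.2e}-fold above baseline")
```

Under that comparability assumption, even the low end of the SV range sits more than an order of magnitude above the spontaneous estimate.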

Some factors implicated in SV included tissue source, culture environment and duration, and mode of regeneration with indirect regeneration increasing cellular instability (Skirvin et al. 1994; Karp 1995). A review that listed nearly 90 genera exhibiting somaclonal variation (Bairu et al. 2011) also included the types of analyses conducted on these lines; they included morphological analyses and DNA analyses such as random amplified polymorphic DNA (RAPD), amplified fragment length polymorphism (AFLP), and use of microsatellite markers. Inclusion of mutagen treatments could increase the SV rate as demonstrated in Miscanthus × giganteus detected by inter simple sequence repeat (ISSR) analysis (Perera et al. 2015a, 2015b); treatment with EMS was needed to induce differential banding patterns in regenerants arising from indirect somatic embryogenesis.

Cymbidium sp. ‘Cocktail Dress’ was generated in Japan through tissue culture via SV. It was the first tissue-culture-generated mutant included in the database (1997; mutant #3221; FAO/IAEA-MVD 2021). This variety displayed improved petal color and shape compared to the variety (‘Inasa’) the tissues were obtained from. Since that time, greater than 100 mutants have been registered with approximately 1/3 SV-based and approximately 2/3 SV in concert with mutagenesis, primarily physical (FAO/IAEA-MVD 2021).

In addition to the above analyses of SV, more in-depth analyses have been incorporated into the characterization of variants/mutants. As an example, next-generation sequencing (NGS) was used in Cucumis sativus (cucumber) to identify small nucleotide variants (SNVs); that classification included single and multiple nucleotide polymorphisms (SNPs and MNPs), and insertions and deletions (indels) shorter than 50 bp in length (Skarzyńska et al. 2020). Three stable SV lines were compared to the inbred line (B10) used to generate them. Initial explant used, culture conditions, and duration led to differences in the number and types of SNVs identified, and in the chromosomes they mapped to. Algorithm-based predictions of the number of SNVs in the three lines were >7000, >8000, and >44,000 variants. Interestingly, the SV line that was most similar phenotypically to B10 actually contained the greatest number of SNVs. For all three SV lines, the majority of SNVs were SNPs with deletions being the next highest type of polymorphism detected (Skarzyńska et al. 2020). Through transcriptome profiling, researchers determined that 273 to 418 genes in these three SV lines were differentially regulated compared to B10 (Pawełkowicz et al. 2021). Greater numbers of these genes were upregulated (164–275) compared to downregulated (90–143).

FAO/IAEA database

The FAO/IAEA-MVD (2021) referred to above currently lists greater than 3350 registered mutant lines along with their national or local release date. Mutants obtained through treatments (physical and chemical) and SV are included. The majority of mutants arose from physical mutagenic treatments with gamma rays as the preferred treatment. The database included >170 genera, with entries from 75 countries. As indicated previously, induced mutation breeding efforts increased post–World War II. The numbers of mutants registered per decade are listed in Fig. 1. Rice accounted for 25% of the registered mutants spanning 1957 to 2020. Barley was the second most prominent (9%) with registrations spanning 1955 to 2019.

Figure 1. The number of induced mutations registered in the FAO/IAEA-MVD database. The figure does not include numbers for the following decades: 1920 (1), 1940 (2), and 2020 (4, to date). Database accessed January 30, 2021.

Since there were many types of mutations described in the database, a sense of what new/altered characteristics they contained was estimated by how many times certain words appeared in their descriptions. Under the “character improvement details” column in the database, the percentages of descriptions containing the following terms were determined: “maturity” (nearly 50%), “resistance” or “tolerance” (36%), “yield” (35%); examples of qualifier terms included “early,” “mid,” “late,” “high,” “low,” and “improved” (FAO/IAEA-MVD 2021). Oladosu et al. (2016) provided examples of mutated plant species that now contained specific resistances to both biotic and abiotic stresses, and those with improved nutritional quality.
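A keyword tally of this kind can be sketched in a few lines of Python. The example below is a minimal illustration, not the procedure actually used for the percentages above: it assumes a case-insensitive substring match, and the sample descriptions are hypothetical placeholders rather than database entries. Because a description can mention several terms, the percentages need not sum to 100.

```python
def term_percentages(descriptions, term_groups):
    """Percentage of descriptions that mention any term in each group.

    Terms must be given in lowercase; matching is a case-insensitive
    substring test against each description.
    """
    total = len(descriptions)
    result = {}
    for label, terms in term_groups.items():
        hits = sum(
            1 for text in descriptions
            if any(term in text.lower() for term in terms)
        )
        result[label] = 100.0 * hits / total if total else 0.0
    return result


# Hypothetical example descriptions, not taken from the database.
sample = [
    "Early maturity, high yield",
    "Resistance to blast",
    "Improved tolerance to salinity, late maturity",
    "Altered flower color",
]
groups = {
    "maturity": ["maturity"],
    "resistance or tolerance": ["resistance", "tolerance"],
    "yield": ["yield"],
}
percentages = term_percentages(sample, groups)
print(percentages)
```

A substring match is the simplest choice; it would also count words such as “yielding,” so a stricter whole-word match may be preferable for a real analysis.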

Mutations—known and unknown

As described above, analyses of cucumber SV lines demonstrated that, in addition to phenotypic changes, there were numerous nucleotide and gene expression changes not evident by simply characterizing altered phenotypes (Skarzyńska et al. 2020; Pawełkowicz et al. 2021). Regarding the induced mutants generated through physical or chemical methods, studies have begun relying on NGS to better characterize mutant lines in efforts to describe the mutated genes as well as other unknown mutations that were also present. Oryza sativa (rice) ‘Kitaake’ seeds irradiated with fast neutrons were grown out and 41 M3 lines were sequenced (Li et al. 2016). NGS analysis detected 2418 mutations (average = 59 mutations per line). Types (and relative amounts) of mutations included single base substitutions (53%), deletions (36%), and smaller percentages of insertions, inversions, and translocations. However, in the >1200 genes that were affected, deletions accounted for the majority of mutations (Li et al. 2016). A separate analysis of a rice variety mutated with EMS revealed there was an average of >16,000 SNPs and approximately 1700 indels per mutant line analyzed, but no drastic phenotypic changes were noted (Sevanthi et al. 2018). These analyses clearly demonstrated that many unknown and undetected mutations may be present in commercial varieties resulting from induced mutations. An altered phenotype may just be the tip of the iceberg, so characterizing the desired mutant as well as identifying unknown mutations has become increasingly important.

Modern Genome Editing

Genetic engineering

Any history of genome editing should include a brief history of plant genetic engineering. Many techniques and procedures used in modern genome editing were developed for use in genetic engineering. This knowledge base formed the platform for genome editing to start its ascent and, along the way, form its own knowledge base for what is yet to come.

Various pieces needed to be brought together to generate genetically modified (GM) plants. The term transgenic was used since introduced genes were those from unrelated species. Tissue culture procedures were employed to regenerate plants from transgenic protoplasts, cells, and tissues; if the wild-type plant was fertile, its transgenic counterpart should be able to transmit introduced trait(s) to progeny.

The ability to generate transgenic tissues and plants was made possible by the research conducted on Agrobacterium tumefaciens including making disarmed tumor-inducing (Ti) plasmids, characterizing the DNA sequences transferred (T-DNA) to plant cells, using co-integrate vectors, constructing binary vectors, etc. These areas will not be covered, but Gelvin (2003) provided an excellent review. The earliest proof-of-concept that plants could be genetically engineered to contain and express foreign genes (transgenes) and yield regenerated plants occurred in 1983. Barton et al. (1983; April) confirmed transmission of foreign genes into plants and progeny using A. tumefaciens. They generated tobacco plants that contained an introduced yeast alcohol dehydrogenase 1 (adh1) genomic clone and the nopaline synthase (nos) gene from A. tumefaciens that naturally contained plant-recognized promoter and 3′ untranslated sequences. Intact T-DNA (containing those transgenes) was confirmed to be present in regenerants and progeny, the nos gene was expressed, but the eukaryotic adh1 gene was not expressed in any tissues analyzed. Herrera-Estrella et al. (1983b; May) confirmed expression of transgenes in plant tissues transformed with A. tumefaciens. They generated tobacco cells containing octopine synthase (ocs) or chloramphenicol acetyltransferase (cat) coding sequence each controlled by A. tumefaciens nos regulatory regions (promoter and 3′ untranslated sequences). Separate lines of tobacco cells expressed OCS or CAT. These researchers first used the term “cassette” to describe the combination of components from different genes.

A selectable marker gene was proven useful to select transgenic cells. Herrera-Estrella et al. (1983a; June), Bevan et al. (1983; July), and Fraley et al. (1983; August) confirmed transmission of a valuable selectable marker gene into tobacco or Petunia sp. (petunia) protoplasts or tobacco stem sections using A. tumefaciens. Cells contained a type II neomycin phosphotransferase (npt) coding sequence [nptII; same as aph(3′)II] fused to nos regulatory regions. Transgenic cells were selected on media containing kanamycin or G418. Horsch et al. (1984; February) and De Block et al. (1984; August) confirmed transmission of the selectable marker gene to progeny. This gene is still being used for selection of transgenic plant tissues.

The research described so far confirmed successful expression of prokaryotic coding sequences fused to nos regulatory regions from A. tumefaciens. Murai et al. (1983; November) proved a plant eukaryotic gene controlled by its native promoter could be expressed in plant cells after delivery via A. tumefaciens. The genomic β-phaseolin storage protein gene isolated from Phaseolus vulgaris (bean; Slightom et al. 1983) was introduced into Helianthus annuus (sunflower) cells. In beans, this gene was developmentally regulated, being highly expressed during seed development. In sunflower cells, the full-length gene was expressed at much lower levels. However, the mRNA was correctly processed (5 introns removed) and β-phaseolin protein was detected by enzyme-linked immunosorbent assay (ELISA). Sengupta-Gopalan et al. (1985; May) confirmed the β-phaseolin gene introduced into tobacco by A. tumefaciens was correctly developmentally regulated, with mRNA and protein (both of correct size) accumulating during seed development. The gene was stably transferred to progeny in an expected 1:2:1 Mendelian ratio based on ELISA analysis of individual tobacco seeds. This research confirmed that a plant gene controlled by its own regulatory sequences could be correctly expressed in a different, unrelated plant species after transfer via A. tumefaciens.

An abiotic method of DNA delivery, particle bombardment, could also successfully deliver DNA into plant cells (Klein et al. 1987). This method, also called biolistics, used a device that could propel DNA coated onto small particles (gold or tungsten microprojectiles) at velocities great enough to penetrate the plant cell wall, plasma membrane, and nuclear membrane. Maize cells were successfully bombarded (Klein et al. 1988a) as well as tobacco leading to generation of transgenic plants (Klein et al. 1988b).

Figure 2 displayed results of Google Scholar searches for articles that generated GM plants via A. tumefaciens or particle bombardment per decade. This was meant to show relative comparisons between the two types of plant transformation; A. tumefaciens appeared to be preferred over particle bombardment, quite possibly because no expensive specialized equipment was needed to conduct transformations.

Figure 2. Google Scholar articles listing A. tumefaciens (At) or particle bombardment (gun) as the plant transformation method. The number of “hits” for A. tumefaciens in 2000–2009 was set at 100%, with all other values listed as percentages compared to it. Articles that included the acronym “CRISPR” were not included in the counts. Data accessed February 16, 2021. Search terms: agrobacterium + tumefaciens + plant + transformation + transgenic; particle + bombardment + plant + transformation + transgenic.
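The normalization described in the Fig. 2 caption can be sketched as follows. This is a minimal illustration only: the hit counts are hypothetical placeholders, not the values behind the figure, and the function name is an assumption.

```python
def normalize_to_reference(counts, reference_key):
    """Express every count as a percentage of the reference count."""
    reference = counts[reference_key]
    if reference == 0:
        raise ValueError("reference count must be nonzero")
    return {key: 100.0 * value / reference for key, value in counts.items()}


# Hypothetical hit counts keyed by (method, decade); not the figure's data.
hits = {
    ("At", "1990-1999"): 400,
    ("At", "2000-2009"): 1000,
    ("gun", "1990-1999"): 150,
    ("gun", "2000-2009"): 250,
}

# The A. tumefaciens count for 2000-2009 serves as the 100% reference.
relative = normalize_to_reference(hits, ("At", "2000-2009"))
for key, value in relative.items():
    print(key, f"{value:.0f}%")
```

Scaling to a single reference value keeps the two methods comparable across decades without reporting raw search counts, which depend on the search engine and access date.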

In 2019, 29 countries grew transgenic plants on 190 M hectares (470 M acres), with 80% composed of Glycine max (soybean) and maize (ISAAA 2020). An additional 43 countries allowed GM imports for food, feed, and processing. The US Department of Agriculture-Animal and Plant Health Inspection Service (USDA-APHIS) maintains a database of all GM plants that have been deregulated for growth in the USA (USDA-APHIS 2021). This dataset listed 131 GM plants that have been deregulated. A separate database maintained by the International Service for the Acquisition of Agri-biotech Applications (ISAAA 2021) listed 425 approved GM crop events (included discontinued events).

Natural infection of plants in the wild by Agrobacterium normally generated abnormal growth (crown galls, A. tumefaciens; hairy roots, A. rhizogenes) in specific tissues. However, these localized infections also led to horizontal gene transfer of some T-DNA genes into plants that persisted and were transferred to subsequent generations. This natural introduction of genes (mutations) was recently analyzed in Ipomoea batatas (sweet potato; Kyndt et al. 2015). Two types of T-DNA were found to be inserted into sweet potato genomes primarily from A. rhizogenes: IbT-DNA1 (4 open reading frames identified), and IbT-DNA2 (≥5 open reading frames identified). IbT-DNA1 was believed to have been horizontally transferred into an ancestor or ancestral form of sweet potato because it was found (and expressed) in all domesticated lines analyzed. Therefore, one or more genes could have contributed to a selective advantage during domestication because these genes are fixed in the genome (Kyndt et al. 2015). To determine how widespread natural genetic transformation was among plant species, that is, how many species carried cellular T-DNA (cT-DNA), two databases (Whole Genome Shotgun and Transcriptome Shotgun Assembly) were screened for presence of cT-DNA (Matveeva and Otten 2019). Screenings included 631 eudicot and 205 monocot species; 7% of eudicots and 1% of monocots in these databases contained cT-DNA.

CRISPR/Cas genome editing

The history of clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein (Cas) genome editing in plants is included below. This is not the only type of genome editing being conducted in plants, but it has become the most prominent protocol in recent years. For basic comparisons among genome editing procedures, Iqbal et al. (2020) provided a recent review. Reviews on CRISPR history (Lander 2016; Wright et al. 2016) and the early history of genome editing in plants with CRISPR/Cas9 (Belhaj et al. 2013, 2015) were starting points for the history summarized in these remaining sections.

Ishino et al. (1987) first identified short repeated/repetitive sequences separated by short “spacer” sequences in Escherichia coli, and repetitive sequences were later identified in Haloferax species (Mojica et al. 1993, 1995). These were determined to be a family of repetitive elements found in many prokaryotic species (Mojica et al. 2000); these sequences (21–37 bp) were given the acronym CRISPR (Jansen et al. 2002). CRISPRs were different than other previously characterized repetitive sequences since they were interspaced with non-repetitive DNA of similar size. Interestingly, until the focus shifted to these spacer sequences, the functions of CRISPR and CRISPR-associated genes (cas) remained unclear. Searches of nucleotide sequence databases revealed that the majority of spacer sequences were derived from pre-existing (extant) sequences, primarily from bacteriophages (phages) and conjugal plasmids and, over time, new spacer sequences were added (Bolotin et al. 2005; Mojica et al. 2005; Pourcel et al. 2005). Researchers postulated that the CRISPR locus was a place to store information about past infections, and bacterial resistance was observed against phages whose extant DNA was found in their spacer DNA, so CRISPR was thought to be involved in some sort of cell immunity. Barrangou et al. (2007) provided the critical evidence: after challenging Streptococcus thermophilus with phage(s), bacteria integrated new phage-related spacer sequences and, thereby, became resistant to subsequent infection by the specific strain(s). Researchers could also remove resistance to a specific phage by deleting its related sequence in the spacer region (Barrangou et al. 2007). These researchers also determined that cas5 (renamed cas9) was integral to this immunity/resistance mechanism.

The CRISPR/Cas system was more fully elucidated to show that two types of RNAs were required: CRISPR RNA (crRNA) that corresponded to spacer sequences, and trans-activating CRISPR RNA (tracrRNA) that directed maturation of crRNA (Deltcheva et al. 2011). These RNAs together with Cas9 targeted invading DNA sequences complementary to the crRNA sequence. The Cas9 endonuclease generated double-strand breaks (DSBs) a few bases upstream of an identified protospacer adjacent motif (PAM; for Cas9, sequence was NGG) in this target DNA. These steps were characteristic of the type II CRISPR/Cas9 system (Garneau et al. 2010; Makarova et al. 2011).
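The relationship between the PAM and the break site can be illustrated with a short Python sketch that scans one DNA strand for NGG motifs. Two values in it are assumptions not given in the text above: a 20-nt protospacer length and a break placed 3 bp upstream of the PAM, which is the position commonly reported for Streptococcus pyogenes Cas9. The function name is hypothetical, and only the strand supplied is scanned; a fuller search would also scan the reverse complement.

```python
PROTOSPACER_LEN = 20  # assumed guide-matching length; not stated in the text
CUT_OFFSET = 3        # assumed distance (bp) from the PAM to the break


def find_cas9_sites(dna):
    """Find NGG PAMs on the given strand with room for a full protospacer.

    Returns a list of (protospacer, pam, cut_index) tuples, where cut_index
    is the 0-based index of the first base after the predicted break.
    """
    dna = dna.upper()
    sites = []
    for pam_start in range(PROTOSPACER_LEN, len(dna) - 2):
        pam = dna[pam_start:pam_start + 3]
        if pam[1:] == "GG":  # the first PAM base (N) can be any nucleotide
            protospacer = dna[pam_start - PROTOSPACER_LEN:pam_start]
            sites.append((protospacer, pam, pam_start - CUT_OFFSET))
    return sites


# Hypothetical sequence: a 24-nt repeat region, a TGG PAM, then 6 more bases.
example = "ATGCATGCATGCATGCATGCATGCTGGTTAACC"
sites = find_cas9_sites(example)
for protospacer, pam, cut_index in sites:
    print(protospacer, pam, cut_index)
```

In this example the only qualifying PAM is the TGG at index 24, so the predicted break falls between indices 20 and 21 of the input strand.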

Jinek et al. (2012) analyzed each component of the type II CRISPR/Cas9 system in vitro to determine requirements for efficient editing of plasmid DNA or short double-stranded (ds) DNA. Synthesized or in vitro transcribed RNA and purified Cas9 protein were used in experiments. A PAM sequence (site) downstream from the target DNA was required; mismatches in the crRNA that would align with target DNA sequences close to the PAM site prevented DSBs. Cas9 plus crRNA and tracrRNA were required to generate targeted DSBs. Cas9 plus a chimeric single guide RNA (gRNA; term from Mali et al. 2013), which contained crRNA plus truncated tracrRNA sequences and retained the hairpin structure of the original crRNA:tracrRNA duplex, also generated targeted DSBs. The gRNA sequence could be modified to be complementary to different target sites and Cas9 would generate DSBs within those sites, thereby demonstrating its great potential for use in precise genome editing.
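The observation that mismatches close to the PAM site prevented DSBs can be expressed as a simple rule of thumb. The sketch below is a deliberately crude illustration, not the criterion used by Jinek et al. (2012): the function names and the 12-nt PAM-proximal "seed" length are hypothetical choices, and both sequences are assumed to be written 5′ to 3′ as DNA with the PAM lying immediately 3′ of the last base.

```python
SEED_LEN = 12  # assumption: number of PAM-proximal bases treated as critical


def mismatch_positions(guide, target):
    """Return distances from the PAM (1 = base adjacent to the PAM) where the sequences differ."""
    if len(guide) != len(target):
        raise ValueError("guide and target must have the same length")
    n = len(guide)
    pairs = zip(guide.upper(), target.upper())
    # Index 0 is the PAM-distal end, so distance from the PAM is n - index.
    return [n - i for i, (g, t) in enumerate(pairs) if g != t]


def predicted_to_cleave(guide, target, seed_len=SEED_LEN):
    """Crude rule: cleavage is predicted only if no mismatch falls within the seed."""
    return all(pos > seed_len for pos in mismatch_positions(guide, target))
```

Under this simplification, a single mismatch at the PAM-adjacent base blocks the prediction, whereas one at the far (PAM-distal) end does not. Actual tolerance depends on the position and number of mismatches and would need to be established experimentally.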

Cong et al. (2013) demonstrated that the CRISPR/Cas9 system (collated components) could work in mammalian cells (human and mouse) to facilitate RNA-guided site-specific cleavage of DNA, creating DSBs that were subsequently repaired by non-homologous end joining (NHEJ). The codon-optimized cas9 was controlled by the elongation factor 1α (EF1α) promoter, and designed crRNA plus tracrRNA were controlled by U6 (RNA polymerase III type 3) promoters. Cells were transfected for DNA delivery. Chimeric gRNA as described by Jinek et al. (2012) was compared to use of separate RNA constructs to create the crRNA:tracrRNA hybrid, but the gRNA did not appear to work as efficiently in editing the genes being targeted. CRISPR/Cas9 facilitated multiplex editing using two crRNAs that targeted different sites within a gene or one site within two different genes. CRISPR/Cas9 was also used to demonstrate homology-directed repair (HDR) by adding donor template DNA for precise repair of the DSBs; HDR worked as efficiently with Cas9n (mutated to create a single-strand nick instead of a DSB) as with Cas9 (Cong et al. 2013).

Published on the same date as Cong et al. (2013), Mali et al. (2013) also demonstrated that the CRISPR/Cas9 system could work in human cells. Each designed gRNA was controlled by the U6 promoter for targeting, along with human codon-optimized cas9 controlled by the cytomegalovirus (CMV) promoter. A human cell line containing a green fluorescent protein (gfp) gene disrupted by 68 bp of adeno-associated virus integration site 1 (AAVS1) was used to demonstrate homologous recombination (HR). Two gRNAs were used to target AAVS1 sequences separately using Cas9 and, with added donor template DNA, yielded 3–8% HR. Using the Cas9D10A mutant that functioned as a nickase yielded similar HR percentages in this system but reduced NHEJ (a desirable outcome when the goal is HR). With the goal of NHEJ, the same gRNAs were used to target a native gene, human protein phosphatase 1 regulatory subunit 12C (ppp1r12c), which contained the AAVS1 locus. Three human cell lines were targeted separately with each gRNA and yielded an overall 2–38% NHEJ.

2013—a pivotal year for genome editing in plants

At the onset of plant genetic engineering in 1983, research groups vied to be first in an aspect of the new field. Three decades later, the same trend was seen as groups vied to be first to confirm CRISPR/Cas9 genome editing in plants. In fact, nine articles (including five “notes” to the Editor) were published within a span of 3 mo (August to October) in 2013. Articles included the following: Feng et al. (2013); Jiang et al. (2013); Li et al. (2013); Mao et al. (2013); Miao et al. (2013); Nekrasov et al. (2013); Shan et al. (2013); Upadhyay et al. (2013); Xie and Yang (2013). The plant species, targeted genes, and mutation rates are included in Supplemental Table 1, and a few key points are summarized below.

Plant species successfully genome-edited (cells or whole plants) included Arabidopsis, N. benthamiana, rice, Sorghum bicolor, and Triticum aestivum (wheat). All groups used nuclear localization signals (NLS) in their cas9 gene constructs, as did Cong et al. (2013) and Mali et al. (2013), with either one or two NLS; use of gfp-tagged Cas9 confirmed nuclear localization (Nekrasov et al. 2013). Four groups used plant codon-optimized cas9 (Jiang et al. 2013; Li et al. 2013; Miao et al. 2013; Shan et al. 2013).

Use of one A. tumefaciens strain to deliver all genes used in editing improved overall editing efficiency compared to delivering individual genes in separate strains (Upadhyay et al. 2013). Among those who checked for off-targets, two identified minor editing of off-target sites (Shan et al. 2013; Xie and Yang 2013). In Arabidopsis, the floral dip method for A. tumefaciens transformation yielded individual T1 seedlings that displayed a number of different mutations, and there appeared to be non-uniform editing in individual cells, indicating the editing continued beyond the fertilization/early zygote stages (Feng et al. 2013; Mao et al. 2013).

Li et al. (2013) used one gRNA to successfully target two members of a gene family. Two groups confirmed HDR in edits conducted with added donor template DNA (Li et al. 2013; Shan et al. 2013); two confirmed HR using genes modified to contain overlapping coding sequences interrupted by a target site (Feng et al. 2013; Mao et al. 2013). Multiplex targeting was achieved by targeting two sites within one gene, and one site in each of two separate genes (Mao et al. 2013; Upadhyay et al. 2013). Li et al. (2013) generated a database that contained nearly 1.5 million unique gRNA target sites in exons of Arabidopsis nuclear genes. In screening the rice genome for potential gRNA sequences 5′ to putative PAM sites, Xie and Yang (2013) determined that >90% of rice transcripts could be specifically targeted.
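A genome-wide screen of the kind described above can be sketched in a few lines of Python. The example below is only a toy illustration and does not reproduce the pipelines of Li et al. (2013) or Xie and Yang (2013): it collects every 20-nt sequence lying 5′ to an NGG motif on both strands of an input string and keeps those that occur exactly once. The function names, the 20-nt spacer length, and the exact-match definition of "unique" are assumptions; a realistic screen would also consider near-matches elsewhere in the genome.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def candidate_spacers(seq, spacer_len=20):
    """Collect spacers located immediately 5' of an NGG PAM on either strand."""
    seq = seq.upper()
    found = []
    for strand in (seq, reverse_complement(seq)):
        # i is the index of the PAM's first base (the "N" of NGG).
        for i in range(spacer_len, len(strand) - 2):
            if strand[i + 1:i + 3] == "GG":
                found.append(strand[i - spacer_len:i])
    return found


def unique_spacers(seq, spacer_len=20):
    """Return spacers that occur exactly once among all candidates (exact match only)."""
    counts = Counter(candidate_spacers(seq, spacer_len))
    return sorted(s for s, c in counts.items() if c == 1)
```

Scanning both strands matters because a site whose PAM reads CCN on the given strand is a valid NGG site on the opposite strand.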

Since 2013, the number of publications on CRISPR-based genome editing in plants increased dramatically. Jaganathan et al. (2018) used the Web of Science database to identify the numbers of CRISPR articles published per year (2013–2018) for 16 plant species. Through data mining, Kaul et al. (2019) collated information on CRISPR articles covering specific plant species. Their supplemental files included information on all retrieved publications, a list of edited genes, and vector profiles. Lists of available web-based bioinformatics tools for experimental design (target site selection, gRNA design, predicted outcomes, etc.) and techniques to screen for edited plants were included in Bhat et al. (2020) and Pramanik et al. (2021).

Reviews cited above also provided updates on the progress made in five major crops that employed CRISPR technologies for plant improvement: maize, rice, soybean, tomato, and wheat (Kaul et al. 2019; Bhat et al. 2020). An NSF-supported Plant Genome Editing Database described by Zheng et al. (2019) was designed to be a repository for plant CRISPR/Cas mutants, and currently contains data on mutants in eight plant species: Brachypodium distachyon, Fragaria sp. (strawberry), Manihot sp. (cassava), Medicago truncatula (barrelclover), N. benthamiana, Physalis sp. (groundcherry), rice, and tomato (accessed March 6, 2021).

Numerous researchers have confirmed the successful use of CRISPR/Cas9 for plant genome editing. Components generally included codon-optimized cas9 controlled by a strong promoter (like 35S from cauliflower mosaic virus) and gRNA (single or multiplexed) controlled by a U6 promoter (Zhan et al. 2020). Efforts to build on this success and take genome editing to the next level include the use of engineered variants (or replacements) of Cas9 and the development or refinement of techniques such as gene replacement, base editing, and prime editing, to name a few (Wada et al. 2020; Zhan et al. 2020; Zhu et al. 2020; Huang and Puchta 2021; Pramanik et al. 2021).

Plant mutations without direct human influences will continue to enable plants to adapt to the ever-changing environment. Generation of precise mutations may assist in this adaptation and also address human-focused goals.