Introduction

Woody plants are important components of the global ecosystem; they play an important role in limiting emissions of carbon dioxide and other greenhouse gases, and the water-retaining capacity of forests is critical for flood control. In addition, woody plants are a major biomass resource and are gaining attention as a source of biofuel.

Various useful traits of woody plants have been investigated for sustainable production of biomass, bioremediation using trees, and improvements in the efficiency of energy production from woody plant materials. A good understanding of the key genetic factors regulating the phenotypes involved in those processes is crucial. Various genetic engineering tools have been used to analyze such factors, and important target genes have been manipulated. As a result, many genetically engineered trees with superior traits have been produced.

Recent advances in sequencing technologies have resulted in the availability of whole genome sequences of a number of industrially important woody plants. In addition, data on useful traits have been accumulating rapidly in various woody plants through so-called “multi-omics” analyses. Such data are needed to develop breeding systems for the production of woody plants with novel favorable traits. These tools will contribute to a more detailed understanding of the gene expression involved in growth and development, environmental stress responses, and the regulation of cell wall biosynthesis.

Although many instances of the genetic manipulation of important target genes have been reported in trees, the time and effort needed to establish transgenic plants of this type still need to be reduced by at least one order of magnitude for the technology to be cost effective. “Traditional” methods of genetic engineering require lengthy molecular or phenotypic screening to identify the desired characteristics. A recently emerged genetic engineering tool, so-called “genome editing,” is now used widely to modify the genomes of various organisms. Engineered endonucleases that digest specific sequences in the target genome are the key technology applied in genome editing. Currently, four types of systems are used for this purpose: zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), engineered mega-nucleases (EMNs), and the clustered, regularly interspaced, short palindromic repeats and CRISPR-associated protein 9 (CRISPR/Cas9) system. Of these, the CRISPR/Cas9 system has emerged as the most popular method of achieving target-specific manipulation of the gene of interest in the genomes of many organisms, including various plant species. Together with the accumulation of data on woody biomass production and conventional genome modification technologies, genome editing has the potential to produce novel tree species more rapidly than ever.

In this review article, we present a brief history of genetic engineering in woody plants, and then describe recent advances in this new technology, with an emphasis on genome editing to improve plant genomes via the generation of site-directed sequence modifications by engineered nucleases in model plant species. We also discuss the application and future prospects of these technologies in woody plant molecular breeding and biotechnology.

Molecular breeding: genomics and genetic engineering in woody plants

Molecular breeding, which has become an important approach used to accelerate the introduction of useful genetic traits into woody plant genomes, can be divided mainly into two distinct techniques: (1) marker-assisted selection/breeding and quantitative trait loci (QTLs) analysis, and (2) genetic engineering based on recombinant DNA and gene transfer techniques.

Marker-assisted breeding relies on genetic variations (most often DNA polymorphisms) and requires many germplasm lines as a resource for selection. Currently, a number of genetic variations and germplasm lines have been identified in both hardwoods and softwoods, and information on genetic mapping and quantitative trait locus mapping for valuable traits is available [1–6].

The availability of whole genome sequences of woody plants has aided molecular breeding. Recently, the whole genome sequences of Populus trichocarpa [7] (http://phytozome.jgi.doe.gov/pz/#!info?alias=Org_Ptrichocarpa) and Eucalyptus grandis [8] (http://phytozome.jgi.doe.gov/pz/;#!info?alias=Org_Egrandis) have been determined and published. Such data sets open up a wide range of research activities, such as gene discovery, gene function, gene expression, and comparative genomics as well as physical mapping via bioinformatics approaches. In addition, recent advances in “omics” techniques in woody plants have provided further valuable information.

The transcriptome and proteome provide a dynamic link between the genome and cellular characterization through gene/protein expression data. When initially introduced, microarray-based technology was applied to the transcriptome [9], and online tools comprising a microarray-based expression data resource (PoplarPLEX; http://www.plexdb.org/plex.php?database=Poplar) and the Poplar eFP Browser (http://www.bar.utoronto.ca/efppop/cgi-bin/efpWeb.cgi) became available to better understand the transcriptome and genome of Populus. Now, with the development of next-generation sequencing technologies, RNA-sequencing analysis is accelerating genome-wide expression studies [10] (http://www.eucgenie.org/).

The metabolome of woody plants also provides valuable resources for molecular breeding applications. The study of the “metabolome” is the study of all the collected end products of gene/protein expression. Therefore, “multi-omics” analyses, such as a combination of transcriptome and metabolome analyses, provide an interactive view of intracellular function [11, 12]. Metabolome analysis is also useful for screening in molecular breeding projects. Indeed, Eckert et al. [13] identified over 3000 associations for a total of 1617 unique single nucleotide polymorphisms (SNPs) associated with 255 metabolites.

Genetic engineering can produce novel traits that would not arise from natural variations in the target genome. In addition, genetic engineering techniques can reduce the breeding period of woody plants, which generally requires a long time when using molecular marker-assisted techniques. Although there are several techniques for genetic engineering for woody plants, Agrobacterium-mediated transformation is currently the method of choice for a wide variety of woody plant species.

Populus species were among the first hardwoods to undergo genetic transformation, which was established in 1987 to produce herbicide-resistant plants [14]. Since then, many transgenic Populus plants have been produced via Agrobacterium-mediated transformation, with altered growth and wood characteristics [15], nitrogen metabolism [16], and lignin content and character [17, 18], and with improved salt tolerance [19, 20].

Eucalyptus species are another main hardwood target for genetic engineering. Although Eucalyptus plants are relatively difficult to regenerate in tissue culture compared to Populus plants, several research groups have developed improved tissue culture and transformation protocols [21, 22], and both endogenous and exogenous genes have been introduced into Eucalyptus cells to produce Eucalyptus plants with modified secondary cell walls [23] and improved salt tolerance [24, 25].

Although softwoods are more recalcitrant to Agrobacterium-mediated transformation, improvements in tissue culture systems have allowed the production of transgenic softwoods such as Pinus radiata [26], Pinus taeda [27, 28], and Picea abies [27]. As in the case of Populus and Eucalyptus plants, the introduction and overexpression of genes of commercial interest have also been demonstrated, e.g., production of salt-tolerant Pinus taeda [29]. A combination of next-generation sequencing technology, multi-omics, and genetic engineering will also accelerate the molecular breeding of softwoods of commercial interest.

Genome editing: a modern tool for precise genome engineering of woody plants

Natural germplasm variations and mutated germplasm lines obtained using exogenous mutagens have been used in recent decades for breeding to increase quality and yield. Screening of the mutant population to identify individual plants that have the desired phenotype is needed to identify mutant lines of interest. This procedure is quite laborious, and genetic engineering has been used widely to reduce the time required for molecular breeding. However, conventional genetic engineering techniques rely on random gene transfer mechanisms. New methods of targeted gene modification are thus a key technology for improvement of woody plants as well as for analysis of gene function.

The recent development of engineered nucleases has allowed more precisely targeted gene engineering, namely genome editing. Four types of engineered nuclease systems are currently in use: EMNs, ZFNs, TALENs, and the CRISPR/Cas9 system; all these systems rely on the induction of double-stranded breaks (DSBs) in the target genome DNA [30, 31]. Creating site-directed DSBs in genomic DNA results in gene modifications through either non-homologous end joining (NHEJ) in the case of site-directed mutagenesis or homologous recombination repair (HR) in HR-mediated gene targeting [32].

EMNs, ZFNs and TALENs are based on protein–DNA interactions. ZFNs and TALENs utilize the nuclease domain of the restriction enzyme Fok I [33, 34]. These latter two engineered nucleases comprise a DNA-binding domain (DBD) and a separate nuclease domain, and thus ZFNs and TALENs are easier to customize than EMNs [33, 34]. In the case of ZFNs, the ZF DBD comes from ZF transcription factors [35, 36]. The DBD of ZFNs is typically composed of an array of three to four ZFs, each of which recognizes three bases. Similar to ZFNs, TALENs use the TALE domain as the DBD; this DBD is composed of tandem repeats of 34–35 amino acids, and each repeat recognizes one base of the target DNA sequence [37]. In both ZFNs and TALENs, the identification and construction of highly specific DBDs are key steps for subsequent genome editing, and these steps are relatively laborious.
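The modular recognition described above can be illustrated with a minimal sketch, assuming only the figures given in the text (three bases per ZF unit, one base per TALE unit). The function name split_into_modules and the 9-base target are hypothetical and chosen purely for illustration; the sketch says nothing about which protein modules would actually bind each sub-site.

```python
def split_into_modules(target, bases_per_module):
    """Split a target site into the sub-sites read by successive DBD modules.

    bases_per_module = 3 models a ZF unit, 1 models a TALE unit
    (figures taken from the text; everything else is illustrative).
    """
    target = target.upper()
    if bases_per_module < 1:
        raise ValueError("bases_per_module must be at least 1")
    if len(target) % bases_per_module != 0:
        raise ValueError("target length must be a multiple of bases_per_module")
    return [target[i:i + bases_per_module]
            for i in range(0, len(target), bases_per_module)]


# Hypothetical 9-base target, not taken from any real gene.
target = "GATTACAGA"
zf_subsites = split_into_modules(target, 3)    # one entry per ZF unit
tale_subsites = split_into_modules(target, 1)  # one entry per TALE unit
print(len(zf_subsites), zf_subsites)
print(len(tale_subsites), tale_subsites)
```

Under these assumptions the same 9-base site needs three ZF units but nine TALE units, which is one way to see why DBD construction is laborious in both systems.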

Unlike ZFNs and TALENs, the CRISPR/Cas9 system uses RNA–DNA recognition [38, 39]. The CRISPR/Cas9 system consists of two components: a so-called guide RNA (gRNA) for target recognition; and the RNA-binding/endonuclease protein, Cas9. The gRNA in the current system is a chimeric hybrid of an endogenous bacterial crRNA and tracrRNA, namely single guide RNA (sgRNA) [38, 39]. The base-pairing of part of the gRNA and its complementary DNA sequence on the target genome can guide the complex of sgRNA-Cas9 to the target DNA region, where Cas9 can then introduce a DSB at the target DNA sequence. For correct binding of the sgRNA-Cas9 complex onto the target DNA sequence, a so-called protospacer adjacent motif (PAM) sequence immediately following the target sequence is required. In the most widely used CRISPR/Cas9 system, derived from Streptococcus pyogenes, correct target recognition requires the PAM sequence “NGG” (N = A, T, G, or C) and a 20-nucleotide guide sequence, located at the 5′-end of the gRNA, that is complementary to the target [38–40]. These features make the CRISPR/Cas9 system quite simple and easy to design, as well as being highly effective, as proven in human [38] and mouse [39] cells. Following successful site-directed mutagenesis in mammalian cells, the CRISPR/Cas9 system has been used widely in a variety of organisms from bacteria to higher eukaryotes: Candida albicans [41], Caenorhabditis elegans [42], fruit fly [43], zebrafish [44], rodents [45], and cattle [46].
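The target rule described above (a 20-nucleotide sequence immediately followed by an NGG PAM) can be sketched as a simple sequence scan. This is a minimal illustration rather than a guide-design tool: the function names are hypothetical, the example sequence is invented, and no scoring of efficiency or off-target risk is attempted.

```python
import re


def reverse_complement(seq):
    """Return the reverse complement of an uppercase or lowercase DNA string."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_spcas9_targets(seq, guide_len=20):
    """List candidate sites: guide_len bases immediately followed by NGG.

    Both strands are scanned. For "-" hits, `start` is the 0-based
    position on the reverse-complemented sequence, not on the input.
    """
    # The lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    hits = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for m in pattern.finditer(s):
            hits.append({
                "strand": strand,
                "start": m.start(),
                "protospacer": m.group(1),
                "pam": m.group(2),
            })
    return hits


# Invented 23-base sequence: a 20-base site followed by the PAM "AGG".
example = "GATTACAGATTACAGATTACAGG"
for hit in find_spcas9_targets(example):
    print(hit)
```

In practice the 20-nucleotide protospacer reported here would correspond to the guide portion at the 5′-end of the gRNA; the PAM itself lies in the genomic DNA and is not part of the gRNA.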

In higher plants, genome editing technologies exploiting ZFNs and TALENs have been utilized for site-directed mutagenesis and/or gene targeting in Arabidopsis [47–49], maize [50], and tobacco [51]. Since the appearance of the CRISPR/Cas9 system, the efficacy of genome editing has improved rapidly in a wide variety of plant species.

The first reports using the CRISPR/Cas9 system in plants confirmed that the transient expression of gRNA and Cas9 in Arabidopsis protoplasts, tobacco cells, and rice plants introduced mutations in the target gene of interest [52–54]. To establish knockout model plant lines, Arabidopsis [55] and liverwort [56] were subjected to Agrobacterium-mediated transformation. Fauser et al. [55] demonstrated the stable inheritance of nuclease-induced targeted mutagenesis events in the Arabidopsis ADH1 and TT4 genes at frequencies from 2.5 up to 70.0 % in the T3 generation. Sugano et al. [56] demonstrated the production of liverwort knockout plants for the ARF1 gene using the CRISPR/Cas9 system. It is interesting to note that genome editing in the haploid generation of liverwort allows knockout plants to be obtained in the parent generation of gene transfer (T0 generation). Other than model plants, genome editing with the CRISPR/Cas9 system has been performed in many crop plants, e.g., maize [57], rice [54, 58], sorghum [59], soybean [60, 61], tomato [62], and wheat [54, 63]. These reports demonstrated (1) a high frequency of mutation following gene transfer of sgRNA/Cas9 via Agrobacterium-mediated transformation, with mutation rates varying by species (in rice, mutation efficiency reached 100 % [58]); (2) the achievement of biallelic mutations after gene transfer of sgRNA/Cas9 in the T0 generation in several crop plants [58, 60–62]; and (3) that multiplex gRNA expression can lead to simultaneous mutation of multiple target genes [54, 58, 60–63]. Biallelic mutations produced by the introduction of gRNA/Cas9 genes have the advantage of yielding plants with a knockout phenotype at the T0 generation, which will reduce the breeding period. In addition, the successive genome engineering of multiple genes will be of great value in breeding programs where the goal is to disrupt many target genes or to knock out redundant genes.

As mentioned above, genome editing tools, such as ZFNs, TALENs, and the CRISPR/Cas9 system, have been developed as site-specific endonucleases to introduce site-specific mutations. Recently, genome editing tools have also been used for gene activation/repression and DNA/chromatin modification technologies. ZFs and TALEs were originally identified as transcription factors; therefore, the use of target-specific transcriptional activators/repressors with custom-designed ZFs and TALEs is quite straightforward. In the case of Cas9, the catalytically inactive mutant form of Cas9 (referred to as dCas9 [40]), which contains mutations in both the RuvC1 and HNH nuclease domains, can be used as an RNA-guided, custom-designed transcription factor. Currently, gene fusions of TALEs/dCas9 to several types of activation/repression domains are available as gene regulation tools in many organisms including mammals and plants [64–66]. Piatek et al. [67] demonstrated target-specific gene regulation using dCas9-based transcriptional activators and repressors in tobacco. They used the EDLL activation domain [68] and the TAL activation domain [64] for transcriptional activation, and the synthetic SRDX domain [69] for transcriptional repression.

Effector domains of DNA-/chromatin-modifying enzymes are also used as genome editing tools. Hilton et al. [70] reported the efficacy in human cells of a dCas9-based histone acetyltransferase, produced by fusing dCas9 to the catalytic core of the human acetyltransferase p300. They demonstrated that the dCas9-based histone acetyltransferase catalyzed acetylation of histone H3 lysine 27 at its target sites, and that this acetylation led to transcriptional activation of target genes from promoters and enhancers. The establishment of dCas9-based DNA-/chromatin-modifying enzymes other than histone acetyltransferase, such as DNA methyltransferase, methylcytosine dioxygenase, ubiquitin ligase, and poly-ADP ribosyltransferase, will enable precise epigenome editing to control genome-wide gene regulation and chromatin status.

Challenges for genome engineering of woody plants and wood decay basidiomycetes

There are a number of studies of genome editing in higher plants; however, reports of genome editing in tree species are limited. Trees are subject to the difficulties in applying genome editing technologies that are common to all higher plants: low transformation efficiencies, a lack of information on optimal expression cassettes for expressing engineered nucleases, and the difficulty of isolating clonal engineered plants. This is compounded by the slower growth of perennial trees compared to grass species. Therefore, the development of genome editing tools for tree species needs to be accomplished within a limited number of trial-and-error cycles.

Jia and Wang [71, 72] reported CRISPR/Cas9-based genome editing in the sweet orange, Citrus sinensis. In this latter report, transient expression of plant-codon optimized Cas9 and CsPDS-targeted gRNA disrupted the endogenous CsPDS locus. Notably, the low transformation efficiency in sweet orange was overcome using Agrobacterium infection facilitated by pre-infection with Xanthomonas citri subsp. citri (Xcc). Xcc improves the efficiency of Agrobacterium infection in citrus [72]. In addition, the cauliflower mosaic virus (CaMV) 35S promoter, which is transcribed by RNA polymerase II, was used for the expression of gRNA in sweet orange [71]. RNA polymerase III-transcribed promoters, such as U3 and U6, have more commonly been used to express gRNA. However, in tree species, information on U3/U6-snRNA expression is somewhat lacking. RNA polymerase II-based gRNA expression is thus one of the possible approaches currently being pursued to design predictable CRISPR/Cas9-based expression in trees.

Peer et al. [73] utilized ZFN targeting of the uidA transgene, which expresses β-glucuronidase (GUS), and demonstrated ZFN-based site-directed mutagenesis in apple, Malus domestica, and fig, Ficus carica. Similar to the work in Arabidopsis [31], a heat-shock promoter was used to express ZFN cassettes to avoid toxicity of ZFN [73]. In this report, individual plants were produced and cloned after a tissue culture period of almost a year. Since Agrobacterium-mediated transformation of engineered nucleases produces NHEJ-based mutagenesis only in transfected cells, isolation of transfected cells and their regeneration is an unavoidable step in the cloning of engineered plants [73]. As another approach, systemic infection with a virus harboring genome editing vectors, which would not require regeneration from mutated cells if genome editing occurred in shoot apical meristems, was proposed by Peer et al. [73]. Combination of a systemic virus and an engineered nuclease is one possible approach to rapid breeding in woody plants.

Most recently, Zhou et al. [74] demonstrated the successful knockout of the 4-coumarate:CoA ligase gene 4CL1 in Populus tremula × alba clone 717-1B4. The 4CL1 gene is a key gene in lignin biosynthesis [17, 75]. Although an RNA polymerase III-transcribed promoter has not been cloned so far in Populus, Zhou et al. used the Medicago U6.6 snRNA gene promoter [60] for gRNA expression, and highly efficient biallelic mutation of the 4CL1 gene was achieved. Mutant lines with these biallelic mutations showed phenotypes similar to those of transgenic poplar plants expressing antisense 4CL1 RNA [75]. As mentioned above, highly efficient regeneration and transformation procedures have been developed for Populus plants. The establishment of efficient transformation methods, the optimization of expression cassettes, and methodologies for isolating clonal plants will be helpful in developing genome editing tools in woody plants.

For efficient use of woody plant biomass, it is necessary not only to breed woody plants themselves but also to utilize fungal and bacterial species that decay difficult-to-use biomass. Reports on the molecular breeding of wood decay basidiomycetes have been limited; however, there are some examples of molecular breeding in basidiomycete mushrooms and cellulose-degrading bacteria.

Schizophyllum commune is a wood-rotting fungus that has also found use as a model mushroom [76, 77]. Polyethylene glycol (PEG)-based transformation methods and HR-based gene targeting have already been developed in S. commune [77], although the HR rate is relatively low and selection markers must be used [77]. Artificial regulation of mushroom development in S. commune through disruption of the transcription factors fst3 and fst4 has already been reported [76]. Higher HR rates could be achieved using artificially engineered nucleases, and other genetic factors regulating mushroom development and lignin degradation may be uncovered.

Coprinopsis cinerea is another model mushroom, and its genome has already been sequenced [78]. Some tools for the molecular biology of this organism have been developed, such as PEG-based transformation, RNAi, fluorescent reporters, and HR-based gene targeting using selection markers [79]. C. cinerea generally grows on dung, a non-woody substrate, and is thought to be a non-wood-rotting fungus. However, heterologous expression of lignin-degradation enzymes in Coprinopsis has been shown to increase lignin-decolorization activities [80]. Recently, electroporation-based gene delivery has been developed in Coprinopsis (Sugano et al. unpublished). Electroporation might allow mRNA transfection. As with mRNA-injection-based genome editing in mice and zebrafish, transfection of engineered nuclease mRNA into basidiomycetes would open up marker-free genome manipulation in wood decay fungi. Without the need for foreign selection markers, fungal species derived from molecular breeding could rapidly be brought to market.

To date, there are no reports of genome editing in basidiomycetes; however, efficient Platinum-TALEN-based genome editing (nearly 100 % gene targeting) has been demonstrated in a filamentous fungus, Pyricularia oryzae [81], so it is thought to be only a matter of time until genome editing in basidiomycetes is accomplished. Beyond the demonstration of genome editing per se, its practical use in basidiomycetes may face a problem specific to fungi. Basidiomycetes live predominantly as heterokaryons; therefore, the establishment of clonal individuals is more difficult than in plants. Cloning technologies using monokaryotic stages of fungal cells would be required.

Recently, CRISPR/Cas9-based genome editing in the wood decay bacterium Clostridium cellulolyticum has been reported: Xu et al. [82] reported that Cas9 expression with gRNA, which induces DSBs in vivo, was lethal in C. cellulolyticum. It was assumed that C. cellulolyticum might have less active NHEJ-based repair pathways. On the other hand, the “nickase type” Cas9n was shown to induce highly efficient (>95 %) HR-based gene targeting at the targeted locus [74]. This high gene targeting efficiency will open the door to high-throughput molecular genetics in wood decay bacteria.

Given the molecular tools that have been developed in basidiomycetes, and the recent demonstration of highly efficient genome editing in filamentous fungi and wood decay bacteria, the molecular breeding of wood decay fungi and bacteria looks set to accelerate in the near future.

Concluding remarks and future prospects

Targeted gene engineering has been achieved effectively by custom-designed engineered nucleases. In gene modification, these “targetable” nucleases have the potential to become alternatives to standard breeding methods to identify novel traits in economically important plants, especially once the efficacy of the CRISPR/Cas9 system has been improved. In crop plants, several features of genome editing have been reported: (1) highly efficient gene disruption mediated by the coupling of the CRISPR/Cas9 system and Agrobacterium-mediated transformation, (2) multiple-gene disruption using multiplex gRNA expression, and (3) frequent biallelic mutations.

However, to extend genome engineering technologies to make them more applicable and useful for woody plant species, further improvements are required to overcome their limitations. Highly specific and efficient genome editing systems will be required in woody plants, because most woody plant species exhibit lower transformation efficiencies compared to annual crop plant species tested with the CRISPR/Cas9 system.

The generation of off-target mutations is a problem encountered with the CRISPR/Cas9 system that will also need to be overcome in plant genome editing. One solution for the off-target problem was demonstrated recently using a double-nicking CRISPR/Cas9 system [83]. This system has already been tested in plants [55] and, therefore, could also be valuable for woody plant species. In addition to the double-nicking CRISPR/Cas9 system, a Fok I-based Cas9 nuclease system [84] and a novel gRNA design system with a 17- to 18-nt target sequence [85] have also been reported as a means of drastically reducing the frequency of off-target mutations. A combination of these techniques will be required for precise genome engineering in woody plant species.
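To make the off-target problem concrete, the following minimal sketch lists genomic sites that resemble a guide sequence and carry an NGG PAM. It rests on loudly simplified assumptions that are not taken from the cited reports: only the forward strand is scanned, every mismatch position is weighted equally, insertions and deletions are ignored, and the mismatch threshold is arbitrary. The function names and sequences are hypothetical.

```python
def count_mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def candidate_off_targets(guide, genome, max_mismatches=3):
    """Return (position, site, PAM, mismatches) for guide-like sites.

    Assumption: a site counts as a candidate if it is followed by NGG
    and differs from the guide at no more than max_mismatches positions.
    Only the forward strand of `genome` is scanned.
    """
    guide = guide.upper()
    genome = genome.upper()
    n = len(guide)
    hits = []
    for i in range(len(genome) - n - 2):
        site = genome[i:i + n]
        pam = genome[i + n:i + n + 3]
        if pam[1:] != "GG":
            continue  # no NGG PAM directly after this window
        mm = count_mismatches(guide, site)
        if mm <= max_mismatches:
            hits.append((i, site, pam, mm))
    return hits


# Invented sequences: a perfect site and a one-mismatch site, 4 bases apart.
guide = "GATTACAGATTACAGATTAC"
genome = "GATTACAGATTACAGATTACAGG" + "TTTT" + "GATTACAGATTACAGATTGCTGG"
for hit in candidate_off_targets(guide, genome):
    print(hit)
```

Any site other than the intended target that appears in such a list is a potential off-target site; the strategies cited above aim to make cleavage at such sites less likely.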

As new plant-breeding techniques develop, these efforts, together with a deeper understanding of the whole genome structure and function of a wide variety of woody plants, will enable the development of future technologies for breeding novel and important traits in woody plants, as well as in fungal and bacterial species, for efficient use of woody plant biomass.