Introduction

In 1983, four groups independently published investigations introducing foreign genes into plants through cell transformation, thereby creating the first transgenic plants (Bevan et al. 1983; Fraley et al. 1983; Herrera-Estrella et al. 1983; Murai et al. 1983). Since then, countless genes, RNAs and regulatory elements have been introduced and studied in various transgenic plants. Several important agronomic crops with commercially useful traits, such as cotton, soybean, canola and maize, have been genetically modified (GM) and released for commercial production (Moeller and Wang 2008). Because of the enhanced quality of the plants, GM technology has had a significant and positive impact on farm income, derived from a combination of increased productivity and efficiency gains. In 2007, the direct global farm income benefit from biotech crops was $10.1 billion. Since 1996, farm incomes have increased by $44.1 billion (Brookes and Barfoot 2009). Technologies for plant genetic engineering have improved significantly, and transformation procedures have become routine for a wide variety of plant species (Moeller and Wang 2008). These sophisticated breeding techniques and GM technology have not only had a positive impact on food producers; global hunger has also shown a decreasing trend due, in part, to the decreased cost and increased availability of superior food crops. Although the global hunger index is continually decreasing, the ability to enhance food crops for better quality and yield through genetic modification will undoubtedly play an important role in addressing this concern.

Agrobacterium-mediated and particle bombardment methods are the two most commonly used techniques for crop transformation. However, both methods depend on random transgene integration. Multiple copy transgene insertion, which leads to gene silencing and unpredictable expression, is often encountered in transgenic plants. The isolation of stable transgenic lines with the desired level of transgene expression is labor intensive and costly. It is often necessary to screen hundreds of independently transformed plants to identify those with suitable transgene structure and expression (Ow 2005). Therefore, many research endeavors aim to eliminate random DNA integration and/or reduce the frequency of multi-copy transgene insertions, thus reducing or eliminating events that exhibit unreliable transgene expression (Ow 2005). As more genes are discovered through whole genome sequencing of different organisms, including economically important plants, applications to improve the traits of crops (e.g. rice and wheat) will become an important focus in the post-genomic era. Expressing or manipulating multiple genes in the plant genome remains a major technical hurdle (Halpin 2005). Site-specific recombination is a promising technology that can be used to address these challenges of crop genome engineering. In this review, we examine previous studies and discuss recent advances in the application of site-specific recombinase technology. We also propose a novel strategy to achieve both site-specific gene integration and deletion of unneeded DNA through the combined use of two irreversible site-specific recombination systems.

Recombinase types and their modes of action

Site-specific recombinase systems were discovered in bacteria and yeast and were found to facilitate a number of biological functions, including the phase variation of certain bacterial virulence factors and the integration of bacteriophage into the host genome. Site-specific recombination occurs at a specific sequence or recognition site and involves cleavage and reunion leading to integration, deletion or inversion of a DNA fragment without the gain or loss of nucleotides. Whether the outcome is integration, deletion or inversion of a DNA fragment, the orientation of the recognition sites determines the mode of action (Grindley et al. 2006).
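As a rough illustration of how site orientation determines the outcome, the sketch below models a DNA molecule as a Python list of labels. The notation (">" and "<" for a recognition site in forward and reverse orientation, a trailing apostrophe for an inverted segment) and the function names are assumptions made for illustration only; the model ignores sequence, kinetics and inter-molecular reactions.

```python
SITES = (">", "<")  # hypothetical notation: forward / reverse recognition site


def flip(label):
    """Return the label as it reads after the segment carrying it is inverted."""
    if label == ">":
        return "<"
    if label == "<":
        return ">"
    # Toggle a trailing apostrophe to mark an inverted gene or segment.
    return label[:-1] if label.endswith("'") else label + "'"


def recombine_sites(dna):
    """Recombine the first two recognition sites found in `dna`.

    Returns (molecule, circle): `circle` is the excised circular fragment
    for directly oriented sites, or None when no excision took place.
    """
    positions = [i for i, label in enumerate(dna) if label in SITES]
    if len(positions) < 2:
        return list(dna), None
    i, j = positions[0], positions[1]
    if dna[i] == dna[j]:
        # Direct orientation: the intervening DNA leaves as a circle that
        # carries one site; a single site stays behind in the molecule.
        circle = list(dna[i + 1 : j + 1])
        return list(dna[: i + 1]) + list(dna[j + 1 :]), circle
    # Opposite orientation: the intervening DNA is inverted in place.
    inverted = [flip(label) for label in reversed(dna[i + 1 : j])]
    return list(dna[: i + 1]) + inverted + list(dna[j:]), None


if __name__ == "__main__":
    print(recombine_sites(["GOI", ">", "marker", ">", "flank"]))  # excision
    print(recombine_sites(["GOI", ">", "marker", "<", "flank"]))  # inversion
```

In this toy model the excised circle still carries a recognition site, which is why, for recombinases that use identical sites, re-integration of the circle is possible in principle.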

The recombinase super family can be split into two fundamental groups, the tyrosine and serine recombinases. This division is based on the active amino acid (Tyr or Ser) within the catalytic domain of the enzymes in each family. Both families can be further subdivided into unique members based on either size or mode of recombinase action (Fig. 1). The first and best-characterized group has members that include the Cre-lox (Sauer and Henderson 1990), FLP-FRT (Golic and Lindquist 1989) and R-RS (Onouchi et al. 1991) systems where Cre, FLP and R are bidirectional tyrosine recombinases and lox, FRT, and RS are the respective identical DNA recognition sites (i.e. sequences the enzymes recognize to perform recombination). Within this bidirectional tyrosine sub-family, the recombinase-mediated genetic cross-over occurs between the two identical recognition sites. Because of the identical nature of the recognition sites the recombination reaction is fully reversible, although intra-molecular recombination (excision) is highly favored over inter-molecular reactions (integration).

Fig. 1

Diagram of the recombinase super family. The two major families are divided based on the active amino acid of the catalytic domain, either a tyrosine or a serine. The tyrosine family can be divided into members that utilize identical and non-identical recognition sites. Those that depend on identical recognition sites are the bidirectional tyrosine recombinases and are reversible in action, while the unidirectional tyrosine recombinases, which utilize non-identical sites, are irreversible. Members of the serine family are irreversible in action but can be further divided into “large” (~60 kDa) or “small” (~23 kDa) families based on enzyme size. While the “small” members utilize identical recognition sites, they appear only capable of excision due to topological constraints. The “large” serine recombinases are efficient at excision, integration and inversion. Examples of each subfamily are listed and their mode of action and the nature of the recognition sites are shown

The unidirectional tyrosine sub-family has non-identical recognition sites typically known as attB (attachment site bacteria) and attP (attachment site phage) and performs irreversible recombination in the absence of a helper protein, termed an excisionase. The unidirectional tyrosine recombinases that have been shown to be useful for genome manipulation include HK022 (Kolot et al. 1999; Gottfried et al. 2005) and a modified form of λ (Christ and Droge 2002).

The serine recombinase family also has two distinct sub-families, with the division based on the size of the enzyme. The small serine sub-family contains β-six (Diaz et al. 2001), γδ-res (Schwikardi and Droge 2000), CinH-RS2 (Kholodii 2001; Thomson and Ow 2006) and ParA-MRS (Gerlitz et al. 1990; Thomson et al. 2009), where β, γδ, CinH and ParA are small serine recombinases, and six, res, RS2 and MRS are the respective DNA recognition sites. Although recombination mediated by these small serine recombinases (a.k.a. resolvases) utilizes identical recognition sites, only intra-molecular excision events are observed. Studies have determined that, due to conformational strain, small serine recombinases cannot facilitate inter-molecular integration (Mouw et al. 2008). Therefore, an excision event mediated by the small serine recombinases is considered irreversible.

The large serine sub-family is represented by phiC31 (Thomason et al. 2001; Rubtsova et al. 2008), TP901-1 (Stoll et al. 2002), R4 (Olivares et al. 2001) and Bxb1 (Kim et al. 2003; Keravala et al. 2006; Thomson and Ow 2006). These enzymes act on two recognition sites that differ in sequence, typically known as attB and attP, to yield hybrid product sites known as attL and attR. Excision, inversion or integration reactions can occur, but because the recognition site sequences of attB and attP are changed to attL and attR, the reverse reaction cannot occur. A reversal of the reaction is only possible through the addition of a second protein, the corresponding excisionase (Thorpe et al. 2000; Ghosh et al. 2006).
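The directionality described above can be sketched by treating each att site as two half-sites that are exchanged during recombination. The data model below (half-sites labelled "B" or "P", a `Site` tuple, and the function `integrase_reaction`) is a hypothetical simplification introduced here, and it assumes the common convention that attL carries the left half of attB and the right half of attP.

```python
from typing import NamedTuple, Optional, Tuple


class Site(NamedTuple):
    left: str   # origin of the left half-site: "B" (bacterial) or "P" (phage)
    right: str  # origin of the right half-site


ATTB = Site("B", "B")
ATTP = Site("P", "P")

NAMES = {
    ("B", "B"): "attB",
    ("P", "P"): "attP",
    ("B", "P"): "attL",  # assumed convention: left half from attB
    ("P", "B"): "attR",
}


def site_name(site: Site) -> str:
    return NAMES[(site.left, site.right)]


def integrase_reaction(
    a: Site, b: Site, excisionase: bool = False
) -> Optional[Tuple[Site, Site]]:
    """Exchange half-sites between `a` and `b` if the pair is a valid substrate.

    Without the excisionase only attB x attP is accepted; with it, only
    attL x attR. Returns None when the pair is not recombined.
    """
    substrates = {"attL", "attR"} if excisionase else {"attB", "attP"}
    if {site_name(a), site_name(b)} != substrates:
        return None
    return Site(a.left, b.right), Site(b.left, a.right)


if __name__ == "__main__":
    products = integrase_reaction(ATTB, ATTP)
    print([site_name(s) for s in products])
    print(integrase_reaction(*products))  # integrase alone: no reverse reaction
    print(integrase_reaction(*products, excisionase=True))
```

The all-or-nothing substrate check is a deliberate simplification of the biochemistry; it is meant only to show why the attL and attR products are stable in the absence of the excisionase.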

Uses of recombinases for excision

Site-specific recombination was among the first methods used to create transgenic plants without retention of a selectable marker transgene (Dale and Ow 1991; Russell et al. 1992) (Fig. 2). Adoption of this technology could potentially eliminate the movement of selectable marker transgenes within the environment. Removal of the selectable marker also allows reuse of the same selection regime for subsequent rounds of gene transfer. A number of recombinase-mediated marker deletion strategies have been reported in model plants (Dale and Ow 1991; Albert et al. 1995; Gleave et al. 1999; Sugita et al. 1999, 2000a, b; Endo et al. 2001, 2002; Hohn et al. 2001; Hare and Chua 2002; Nanto et al. 2005; Nanto and Ebinuma 2008; Nanto et al. 2009; Thomson et al. 2009, 2010), as well as in crop species (Lyznik et al. 1996; Srivastava et al. 1999; Ebinuma and Komamine 2001; Hoa et al. 2002; Matsunaga et al. 2002; Gilbertson et al. 2003; Srivastava and Ow 2003; Zhang et al. 2003; Kerbach et al. 2005; Radhakrishnan and Srivastava 2005; Sreekala et al. 2005; Ballester et al. 2006; Cao et al. 2006; Chawla et al. 2006; Cuellar et al. 2006; Djukanovic et al. 2008; Hu et al. 2008; Kempe et al. 2010).

Fig. 2

Schematic representation of recombinase-mediated selectable marker removal. The marker gene is flanked by directly oriented recombinase recognition sites (red arrows). The excision event removes the DNA between the associated recognition sites leaving the external DNA, such as the gene of interest (GOI) intact and a single recognition site behind in the genome. The non-replicating circular DNA fragment is lost. The recombinase can be provided in cis or in trans—not shown

In 2006, the first commercial marker-free corn, LY038, developed by Monsanto with the Cre-lox system, obtained USDA approval (Ow 2007). LY038 has a high lysine content, providing supplemental lysine for poultry and swine diets (Comprehensive Reviews in Food Science and Food Safety 2008). To produce LY038, a plasmid containing the coding sequences of cordapA (the coding region of the dihydrodipicolinate synthase gene from Corynebacterium glutamicum) and the kanamycin selectable marker gene nptII was introduced into maize H99 through biolistic particle bombardment transformation. The nptII cassette was flanked by loxP sites in direct orientation for Cre-mediated excision. The selected cordapA-nptII transgenic maize was crossed with another corn line expressing the Cre recombinase. Recombinase-mediated excision removed the nptII gene cassette, leaving only the cordapA gene cassette. However, producing these marker-free transgenic plants involved crossing the recombinase-expressing line to the target line, selecting for complete excision, and finally segregating away the recombinase gene. Although it is feasible to produce marker-free transgenic crop plants using this process, it may not be optimal due to constraints on time, labor and the substantial financial resources needed. Generation of transgenic crops in this manner requires multiple generations and may not be practical in species with longer generation times, such as trees, or in crops such as potato that are propagated asexually.

To shorten and simplify the process of selectable marker excision, various groups have designed one-step auto-excision strategies for DNA removal (Fig. 3). In these strategies the gene(s) of interest (GOI), the marker gene, and the recombinase are cloned into a single construct with the recombinase gene under the control of an inducible promoter (for example, heat shock promoter HSP81-1; Takahashi et al. 1992; Hoff et al. 2001; Liu et al. 2005). The selection gene is placed in a cassette flanked by directly oriented recognition sites, while the gene of interest is inserted outside of the region flanked by recognition sites. After transformation, the putative transgenic plants are induced to initiate the expression of a recombinase, such as Cre. Other inducible systems have been constructed by fusing Cre to various ligand binding domains (Logie and Stewart 1995; Metzger et al. 1995; Joubès et al. 2004). Cre has also been fused to the estrogen hormone receptor, which in the absence of the target hormone can limit access of the recombinase fusion protein to the nucleus and thereby inhibit recombination events (Feil et al. 1996; Kellendonk et al. 1996; Brocard et al. 1998; Danielian et al. 1998). In the presence of the hormone inducer, a recombinase-mediated excision event deletes the intervening region between directly oriented recognition sites (e.g. loxP), removing both the recombinase and the selectable marker genes (Fig. 3). An autoexcision strategy like this has been applied to Arabidopsis (Zuo et al. 2001), tomato (Zhang et al. 2006), maize (Zhang et al. 2003), rice and aspen (Ebinuma and Komamine 2001; Matsunaga et al. 2002; Sreekala et al. 2005) and tobacco (Sugita et al. 2000a, b; Endo et al. 2002; Liu et al. 2005; Wang et al. 2005). Because chemical or heat shock treatments are required for recombinase activation, marker gene deletion utilizing these treatments may be limited to certain plant species and/or may present complications for the transformation process due to premature recombinase expression (Li et al. 2007). One situation to consider is a leaky system, in which uninduced expression of the recombinase leads to undesirable excision. This can result from either promoter mis-regulation or a genomic positional effect. Conversely, weak induced expression of the recombinase can result in incomplete transgene excision.

Fig. 3

Schematic representation of an inducible recombinase-mediated selectable marker removal strategy. The marker and recombinase genes are flanked by directly oriented recombinase recognition sites (red arrows). In this strategy the recombinase gene is present in the genome and can be externally or developmentally induced. Activation of the recombinase causes excision of the recombinase and marker gene, leaving behind only the gene of interest (GOI), a single recognition site and a non-replicating circular DNA fragment

An alternative approach to the previously described strategy is the use of a developmentally inducible promoter to activate recombinase expression only within specific organs/tissues during development. Various germline-specific promoters have been employed for recombinase expression (Mlynárová et al. 2006; Li et al. 2007; Luo et al. 2007; Verweire et al. 2007; Kopertekh et al. 2010); see Gidoni et al. (2008) for a detailed listing of promoters used to drive recombinases. This strategy may be easier to implement than the inducible systems since it does not require extra steps for recombinase expression, and it potentially enables higher rates of excision. The developmentally regulated recombinase-mediated excision strategy has provided a containment system to prevent transgene movement via pollen: recombinase expression is activated during pollen development and mediates the excision of transgenes from the genome (Mlynárová et al. 2006; Luo et al. 2007). This technique could potentially reduce the risk of transgene flow within the environment, eliminating the adventitious presence of transgenes in non-GM crops or related wild species. Li et al. (2007) demonstrated the feasibility of this process in seed with a self-activating excision system in soybean using an embryo-specific promoter to drive the temporal expression of the recombinase Cre.

Another strategy is to provide transient expression of the recombinase. The most direct approach is to transform cells directly with a recombinase expression cassette (Albert et al. 1995; Araki et al. 1995; Vergunst et al. 1998; Srivastava and Ow 2001). The recombinase is transiently expressed in cells and should not stably integrate into the genome of the host cell. However, Srivastava and Ow (2001) measured genomic integration of the recombinase gene in 40% of host cells that underwent recombinase-mediated excision. Two published expression vectors have been designed specifically for transient recombinase expression: one utilizes A. tumefaciens transformation proteins (Vergunst et al. 2000; Kopertekh and Schiemann 2005), the other is a Cre/virus vector (Kopertekh et al. 2004a, b; Jia et al. 2006). Other options known to provide transient recombinase expression include direct transformation of recombinase mRNA (De Wit et al. 1998) and use of peptides to facilitate direct cellular uptake of the recombinase protein (Peitz et al. 2002).

Resolution of transgene concatemers

A unique feature of recombinase-mediated excision is its ability to resolve complex insertion sites containing multiple transgenes down to single copy structures. The technique, originally demonstrated in wheat by Srivastava et al. (1999), involved four multi-copy transgenic lines being resolved to single copy by Cre-mediated excision. This strategy requires the presence of at least one recognition site within the transgene structure, although two sites with inverted orientation flanking the entire T-DNA appear to be the optimal configuration. Tandem arrays will be excised due to the presence of the multiple recognition sites in direct orientation. Recombinase-mediated excision will continue until a single recognition site is left (Fig. 4). The possibility does exist for fragmented T-DNA to be located outside the outermost recognition site, but this should be detectable with proper molecular characterization. Single copy transgene structures are generally the most sought-after due to their consistent expression pattern, stability within the genome, heritability, low occurrence of silencing, and simplicity of structural characterization (Day et al. 2000). A study by Chawla et al. (2006) demonstrated that rice with multicopy transgene inserts, initially silenced for expression, recovered expression when resolved by segregation to a single genomic copy. This procedure has the advantage of lowering the total number of transgenic plants required in order to find a properly expressing single copy line that is heritable.

Fig. 4

Schematic representation of a recombinase-mediated resolution event. This technique uses the recombinases’ capacity to excise DNA from between any two directly oriented recognition sites (red arrows) thereby removing the intervening or ‘complexed’ DNA from the genome. a The initial construct used for transformation. b Complex transgene integration. Dotted line designates all possible excision events mediated by directly oriented recognition sites. c Transgene resolution to a single recognition site in the genome and non-replicating circular fragment. DNA fragments present without flanking recognition sites will not be removed. The recombinase can be provided in cis or in trans—not shown. All possible excision products—not shown
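The resolution process can be sketched as repeated excision between neighbouring recognition sites until only one site remains. The example below is a toy model under stated assumptions: the locus is a list of labels, every "lox" label stands for a site in the same (direct) orientation, and the function name `resolve` is hypothetical.

```python
def resolve(locus, site="lox"):
    """Excise DNA between directly oriented sites until one site is left.

    Returns (resolved_locus, circles), where `circles` lists the excised
    circular fragments in the order they were removed.
    """
    locus = list(locus)
    circles = []
    while locus.count(site) > 1:
        i = locus.index(site)          # first remaining site
        j = locus.index(site, i + 1)   # next site downstream
        # The intervening DNA plus one site leaves as a circular fragment.
        circles.append(locus[i + 1 : j + 1])
        del locus[i + 1 : j + 1]
    return locus, circles


if __name__ == "__main__":
    # Three tandem copies plus a transgene fragment lying outside the
    # outermost site; the fragment has no flanking site on its outer side.
    complex_locus = ["genome", "fragment", "lox", "GOI", "lox", "GOI",
                     "lox", "GOI", "genome"]
    resolved, circles = resolve(complex_locus)
    print(resolved)
    print(circles)
```

Consistent with the caption, the fragment outside the outermost site is retained in this model, which is why molecular characterization of the resolved locus is still needed.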

In recent years, a number of novel recombinase systems have been identified that show the ability to excise DNA in eukaryotic cells. These include phiC31 (Kempe et al. 2010; Thomason et al. 2001, 2010), Bxb1, TP901-1 and U153 (Keravala et al. 2006; Thomson and Ow 2006) of the large serine recombinase family and CinH, ParA, Tn1721 and Tn5053 from the small serine resolvase family (Thomson and Ow 2006; Thomson et al. 2009). In the S. pombe system described by Thomson and Ow (2006), the recombinases with the most effective rates of genomic excision are Bxb1, CinH, ParA and phiC31 at 100, 95, 97 and 91%, respectively. Currently, phiC31 is patent protected by the USDA (US Patent 6,746,870) while the remaining recombinases have a USDA patent pending (US Patent application 20060046294). This technology is publicly available for research purposes and non-exclusive licenses for commercial uses are granted. As can be seen, recombinase-mediated excision offers many benefits over traditional genomic engineering methods, namely the removal of unwanted DNA, recycling of selectable markers and resolution of complex transgene concatemers to single copy stable loci. Other current reviews are available on the topic of recombinase-mediated excision strategies in plants (Gidoni et al. 2008; Moon et al. 2009) and ‘clean gene’ technology (Afolade 2007).

Chromosomal engineering

Physical distance between recognition sites does not appear to limit the capacity of the recombinase, although larger distances (megabases) will lower the efficiency of recombination. This feature of site-specific recombination has made planned chromosomal rearrangements a feasible option for genetic engineering. Large deletions, duplications or chromosomal translocations allow the study of known genetic diseases such as Down syndrome, Smith–Magenis syndrome, Cri-du-chat, and Charcot–Marie–Tooth type 1A (Korenberg et al. 1994; Chen et al. 1997; Lupski 1998). Cre-induced site-specific translocations have been reported in ES cells (van Deursen et al. 1995) and plants (Qin et al. 1994), with site-specific translocation being shown to occur at 1:1,200–2,400 (non-random:random) in ES cells that express the Cre protein. Because the orientation of the loxP sites determines the type of recombination event observed, and since eukaryotes are diploid, complications can occur. With two loxP sites located on a single chromosome in directly repeated orientation a deletion event is expected, but duplication events have also been detected (Medberry et al. 1995; Ramirez-Solis et al. 1995; van Deursen et al. 1995; Uemura et al. 2010). The deletion/duplication events result from a non-sister chromatid recombination event that generates a series of balanced and unbalanced chromatids. The difference in chromosomal relation (i.e. intra-chromosomal, homologous, non-homologous) and loxP placement appears to govern the frequency of recombination (Burgess and Kleckner 1999). A more controlled and effective form of chromosomal deletion and/or duplication involves the use of a temporally specific promoter, that of the synaptonemal complex protein 1 (SYCP1) gene. In the spermatocytes of mice during meiosis, the homologous chromosomes are tightly paired in the synaptonemal complex. The SYCP1 promoter drives Cre expression, generating efficient deletion and duplication events due to the close proximity of the chromosomes in the synaptonemal complex (Herault et al. 1998). With the loxP sites in the opposite orientation and located on the same chromosome, an inversion event is generated. This form of chromosomal rearrangement can be used to study genetic abnormalities and establish balanced lethal systems to facilitate stock maintenance (Zheng et al. 1999). This loxP orientation can also lead to unequal recombination between sister chromatids, generating dicentric and acentric chromosomes (Uemura et al. 2010). Chromosomes in these configurations are lost during the next cell division, generating monosomic cells (Lewandoski and Martin 1997; Uemura et al. 2010).

Placement of the loxP sites in the same orientation on non-homologous chromosomes can lead to balanced and unbalanced chromosomal translocations. These rearrangements have been used to study the effects of inappropriate regulation of spatial and temporal gene expression leading to various forms of human cancer, developmental abnormalities and genetic diseases (van Deursen et al. 1995; Smith et al. 1995). A translocation experiment was performed in tobacco plants to determine efficiencies in programmed chromosomal rearrangements. A chromosome containing a loxP/hygromycin open reading frame was allowed to recombine with a second chromosome containing a 35S promoter/loxP construct. When Cre protein was introduced into the in vivo system, 2.5% of the resulting plants were hygromycin resistant (Qin et al. 1994). It has also been shown that chromosomal translocations can be induced across species. Protoplasts from two species of plants (Arabidopsis and tobacco) were fused in culture and induced for a Cre-mediated recombination event. A successful recombination event joined the promoter region with the open reading frame of the hygromycin resistance gene. Resistant calli were analyzed and found to contain the junction between Arabidopsis chromosome V and a chromosome from the tobacco genome. Unfortunately, after the calli were grown and self-fertilized, it was determined that the interspecies transferred arm was not maintained (Koshinsky et al. 2000).

While as yet undemonstrated in crops, chromosomal recombination offers the possibility of speeding introgression between transformation-competent laboratory lines and elite high-production lines by breaking the associated linkage drag, thereby accelerating the transfer of genomic modifications from laboratory to elite lines for agronomic cultivation, as proposed by Ow (2005) (Fig. 5).

Fig. 5

Schematic representation of a recombinase-mediated introgression event. The gene of interest (GOI) is flanked by oppositely oriented recombinase recognition sites (red arrows; #s 1 and 2). Inverted recognition sites prevent unwanted DNA excision. The recombination event targets the DNA between the associated recognition sites of different chromosomes. Two recombination events are needed to break the linkage drag associated with traditional breeding techniques. a The first event produces a transposition between the different chromosomes of the lab and elite lines (see recognition sites; red arrows # 1). b The second recombination event reverses the transposition (see recognition sites; red arrows # 2) and c leaves the transgene in the elite line. In theory this technique could be used to stack genes directly from laboratory lines into elite lines

Benefits of recombinase-mediated integration

Various factors appear to affect transgene expression and stability. The most prominent of these are the genomic location of transgene integration (positional effect) and the complexity of integration. Transgene expression may be increased, decreased or mis-regulated depending on surrounding genomic elements. The integration pattern refers to aspects of the transgene such as its final structural configuration, number of copies, presence of transgene fragments, and number of loci where transgene insertion occurred. In genomic engineering the ability to insert a single copy transgene into a predicted location is most desirable. The single copy transgene produces comparable gene expression levels and effects (Day et al. 2000) by reducing or even eliminating the ‘position effect’ (Clark et al. 1994; Meyer 2000), mosaicism (Burdon and Wall 1992), genomic instability (Collick et al. 1996; Maqbool and Christou 1999), gene variegation (Dobie et al. 1997) and silencing (Henikoff 1998; Selker 1999; Muskens et al. 2000). Furthermore, single copy transgene inserts give more reliable and reproducible expression than multicopy insertions (Day et al. 2000; Iyer et al. 2000). Because of these benefits, considerable effort and time are spent isolating and characterizing single copy lines for predictability of expression and inheritance. These single copy insertion lines also offer a much simpler molecular characterization, which, in turn, may ease the process of federal de-regulation (Ow 2007).

Recombinase-mediated integration can be used to insert a single copy of foreign DNA into predetermined locations within a genome (Sauer and Henderson 1990; O’Gorman et al. 1991). This technology has allowed the production of precisely engineered transgenic plants and has been reported to function in Arabidopsis (Louwerse et al. 2007; Vergunst and Hooykaas 1998; Vergunst et al. 1998), aspen (Fladung and Becker 2010), tobacco (Albert et al. 1995; Choi et al. 2000; Day et al. 2000; Nanto et al. 2005, 2009; Nanto and Ebinuma 2008), maize (Baszczynski et al. 2003; Kerbach et al. 2005), rice (Srivastava and Ow 2002; Srivastava et al. 2004; Chawla et al. 2006), soybean (Li et al. 2009) and the plastid genome of tobacco (Lutz et al. 2004). Rates of integration have been documented from ~33% in tobacco (Albert et al. 1995; Day et al. 2000) to nearly 50% in rice (Srivastava et al. 2004). In other words, a minimum of one plant in three demonstrated a precise single copy insertion event. Of the single copy insertion events reported in rice, nearly all displayed consistent expression patterns based on genomic loci, while in tobacco approximately half showed uniformity. The remaining half of the tobacco single copy transgene insertions were affected by methylation-dependent DNA silencing (Day et al. 2000). These rates are much better than the ~1–10% single copy insertion associated with random integration transformation methods. However, it should be pointed out that, compared to conventional methods, the tissue culture process for transformation is labor intensive and time consuming due to the number of explants (protoplasts or callus) required. For practical purposes the transformation efficiency would need improvement. In another study, Chawla et al. (2006) documented that site-specific integration in rice exhibited stable gene expression over multiple generations. Also noted in this study was that multi-copy transgenic plants that were initially silenced recovered expression when segregation removed the extra transgene copies. This can most likely be attributed to inactivation of homology-dependent gene silencing due to removal of repetitive transgene DNA (Jakowitsch et al. 1999; Luff et al. 1999).
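To put the quoted rates in terms of screening effort, the following back-of-the-envelope sketch compares recovery rates of roughly one in three (site-specific integration) with 1–10% (random integration). It assumes, purely for illustration, that each regenerated plant is an independent trial with a fixed probability of being a precise single copy event; this simple model and the function names are not taken from the cited studies.

```python
import math


def expected_plants_screened(p):
    """Mean number of plants screened until the first single copy event."""
    return 1.0 / p


def plants_for_confidence(p, confidence=0.95):
    """Smallest n such that at least one success is seen with the given
    probability, i.e. 1 - (1 - p)**n >= confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


if __name__ == "__main__":
    # Rates quoted in the text; the independence assumption is ours.
    for label, p in [("site-specific, ~33%", 1 / 3),
                     ("random, 10%", 0.10),
                     ("random, 1%", 0.01)]:
        print(label,
              round(expected_plants_screened(p), 1),
              plants_for_confidence(p))
```

Under these assumptions, the difference between the two approaches is roughly one to two orders of magnitude in the number of plants that must be characterized, although this does not account for the tissue culture effort noted above.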

Targeted integration can be achieved by co-transforming a cell containing a single recombinase recognition site within the genome with two plasmids (Fig. 6). One plasmid contains the complementary recognition site along with the gene of interest and the other contains the recombinase expression cassette. This was first shown with the Cre recombinase in yeast (Sauer and Henderson 1990), thereby demonstrating targeted site-specific insertion into a eukaryotic genome. Cre, Flp and R are reversible recombinase systems (RRS) that favor excision over integration; thus, modifications were required to facilitate integration. Improved targeted integration using Cre was achieved by providing the recombinase transiently, thereby trapping the DNA in its final position as the enzyme expression ceased (Baubonis and Sauer 1993; Albert et al. 1995; Vergunst and Hooykaas 1998; Srivastava and Ow 2001). This strategy was improved upon by placing the initial target loxP recognition site in the genome between the promoter and open reading frame of the Cre recombinase. In this manner, the recombinase protein is pre-loaded within the cell, allowing integration. Once integration has occurred, the recombinase open reading frame is displaced from the promoter and effectively shut off, thereby trapping the DNA in the integrated state. This technique has been very effective for the reversible recombination systems of Cre and Flp. It has been used for integration in Arabidopsis (Vergunst et al. 1998), tobacco (Albert et al. 1995; Choi et al. 2000; Day et al. 2000) and rice (Srivastava and Ow 2002; Srivastava et al. 2004; Chawla et al. 2006). A novel twist was employed that used two loxP sites in the targeting DNA, such that in the presence of Cre, all unwanted ‘backbone/vector’ DNA was removed, producing a circular fragment. The circular DNA was subsequently integrated into the genomic loxP site, providing a ‘clean’ transgene (Kolb and Siddell 1997; Vergunst et al. 1998; Vergunst and Hooykaas 1998; Srivastava et al. 2004). Integration of plasmid backbone into the host genome has been linked to transgene silencing, and therefore its removal is desirable (Iglesias et al. 1997).

Fig. 6

Schematic representation of a recombinase-mediated integration event. The targeting DNA is circular and contains a single recognition site (red arrow), a selectable marker and gene of interest. The accepting DNA (in genome) contains a single complementary recognition site. The resulting integration event between the two associated recognition sites inserts the entire circular targeting DNA fragment into the genome. For the reversible recombinases this leaves an unstable configuration of two directly oriented recognition sites that can be immediately excised. Transient expression of the reversible recombinase is commonly used to trap the integrated DNA (not shown)

A model proposed by Hoess and Abremski (1984) suggested that mutations in one of the two 13 bp binding domains of a loxP recognition site could be tolerated. By definition, the RRSs Cre, Flp and R have recognition sites (loxP, FRT and RS, respectively) that contain two recombinase-binding domains separated by a spacer region that determines the orientation of the sequence. When one 13 bp binding domain was mutated, cooperative Cre–Cre interaction allowed loxP site attachment to occur normally, but if mutations were present in both 13 bp binding domains, overall binding efficiency dropped significantly. Mutant loxP sites could therefore be tolerated for integration, since each loxP site had one ‘good’ binding domain facilitating normal enzyme attachment. After recombination, one loxP site would be wild type, while the other would have a mutation in both its binding domains, inhibiting further Cre attachment and thereby trapping the DNA in the integrated position (Fig. 7). This prediction was demonstrated through the use of randomly generated mutant loxP sites (Albert et al. 1995). The system was tested in vitro with a plasmid inversion system so both forward and reverse reactions could be analyzed. These optimized mutant loxP sites have subsequently allowed targeted integration to be achieved at useful levels in plants such as tobacco (Albert et al. 1995; Day et al. 2000), rice (Srivastava and Ow 2002; Srivastava et al. 2004; Chawla et al. 2006) and maize (Srivastava and Ow 2001). Use of double mutant FRT recognition sites for enhanced integration stability with the Flp system has not been successful, probably due to the intrinsic nature of the Flp/FRT interaction, where Flp binding is not a cooperative event (Senecoff et al. 1988; Huang et al. 1991).
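The half-site logic of this model can be made concrete with a small sketch. The Python below is only an illustration under simplifying assumptions: each loxP site is reduced to two flags recording whether its left and right 13 bp binding domains are mutated, crossover is assumed to swap the arms of the two sites, and a site is treated as competent whenever at least one arm is wild type. The names LoxSite, is_competent and recombine are hypothetical and do not come from the cited studies.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LoxSite:
    """A loxP site reduced to the mutation status of its two binding arms."""
    left_mutant: bool
    right_mutant: bool


def is_competent(site: LoxSite) -> bool:
    # Assumption: one wild-type arm is enough for cooperative Cre binding;
    # a site mutated in both arms is treated as unable to recombine.
    return not (site.left_mutant and site.right_mutant)


def recombine(a: LoxSite, b: LoxSite) -> Tuple[LoxSite, LoxSite]:
    # Crossover within the spacer: each product keeps the left arm of one
    # parent site and the right arm of the other.
    return (
        LoxSite(a.left_mutant, b.right_mutant),
        LoxSite(b.left_mutant, a.right_mutant),
    )


# Genomic target mutated in its left arm, incoming site mutated in its right arm
genomic = LoxSite(left_mutant=True, right_mutant=False)
incoming = LoxSite(left_mutant=False, right_mutant=True)

# Both substrates are competent, so integration can proceed
can_integrate = is_competent(genomic) and is_competent(incoming)

# After integration, one flanking site is a double mutant and one is wild type
double_mutant, wild_type = recombine(genomic, incoming)

# Excision needs both flanking sites, so the double mutant blocks the reverse reaction
can_excise = is_competent(double_mutant) and is_competent(wild_type)
```

In this toy model the forward reaction is allowed while the reverse reaction is blocked, which is the trapping behavior described above. Real mutant sites reduce rather than abolish binding, so the boolean treatment overstates the effect.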

Fig. 7

Schematic of recombinase-mediated integration with mutant loxP sites. Because the reversible recombinase Cre produces an unstable configuration upon integration, loxP recognition sites with half-site mutations (red arrow with yellow star) have been exploited. The half-site mutations allow integration to occur at approximately wild-type rates due to cooperative binding of the recombinase to the recognition site. Once recombination has brought both half-site mutations together in one site, future recombination events (excision) are inhibited, thereby trapping the integration event. This is believed to be due to the loss of cooperative binding. Transient expression of the reversible recombinase further enhances the effectiveness of trapping of the integrated DNA (not shown)

Another option for targeted chromosomal integration is the use of endogenous, genome-located recognition sites (or cryptic sites). This method has been demonstrated in various prokaryotes and eukaryotes (Sauer and Henderson 1990; Sauer 1996; Thyagarajan et al. 2000, 2001; Groth et al. 2000; Olivares et al. 2001; Thomson et al. 2003; Allen and Weeks 2005, 2009; Held et al. 2005; Calos 2006; Ou et al. 2009). In plants, potential cryptic recognition sites have only recently been identified via sequence analysis (Thomson et al. 2009, 2010). However, empirical studies are required to determine the utility of the predicted sites for practical application. Other researchers have used DNA mutagenesis techniques to modify the recombinase’s binding domain to more effectively recognize the cryptic site within the genome of interest. The Calos lab has modified the phiC31 recombinase to more effectively bind a cryptic attP site from the human chromosome (Sclimenti et al. 2001; Keravala et al. 2009). The modified phiC31 has shown enhanced genome targeting capacity and gene delivery (Keravala et al. 2009; Chavez et al. 2010). The use of recombinases in the field of gene therapy has provided a substantial step forward for programmed genetic treatment of incurable diseases.

Despite the potential advantages, the commercial application of recombinase-mediated technology in plants has been modest. This is due, in part, to the intellectual property restrictions that limit the availability and commercial use of the effective recombination systems Cre, Flp and R. To provide DNA manipulation tools for genetic modification with the freedom to operate, a number of labs have recently screened and described novel recombinase systems with properties analogous to the irreversible recombination system (IRS) phiC31 (Thorpe and Smith 1998; Thomason et al. 2001). These systems are from the large serine subfamily, whose members perform recombination between non-identical attB and attP recognition sites. As such, the novel recombinase systems R4 (Olivares et al. 2001), TP901-1 (Stoll et al. 2002; Thomson and Ow 2006) and Bxb1 (Keravala et al. 2006; Russell et al. 2006; Thomson and Ow 2006) have been identified. These unidirectional recombinases present a unique set of tools for genomic engineering and offer improved techniques over those involving the RRS Cre, Flp or R recombinases. While effective, the bidirectional systems require more complex schemes in order to reduce the reverse reaction and trap the desired product, as previously described. In the S. pombe system described by Thomson and Ow (2006), the relative rates of completion for genomic integration of Bxb1, phiC31 and TP901-1 were 85, 95 and 78%, respectively. These rates can be directly compared, since the unique target site for each respective recombinase was placed by homologous recombination in the same genomic location, thereby removing positional effects.

The phiC31 recombinase system was the first large serine recombinase found to be functional in eukaryotes and is the best characterized as a genomic engineering tool. PhiC31 has been tested in both Arabidopsis and wheat for its ability to excise a DNA fragment from the genome and transmit the excision event to the next generation (Kempe et al. 2010; Thomson et al. 2010). These studies found that phiC31 was fully functional in the germinal tissue, which demonstrates that the recombination system is suitable for the generation of stable marker-free, recombinase-free transgenic plants. This recombinase has also been used for both integration (Lutz et al. 2004) and excision (Kittiwongwattana et al. 2007) within the tobacco plastid genome. The results showed that a transformation efficiency of 17 independent lines per bombarded sample was achieved via site-specific integration, with long-term stable transgene expression observed. Most recently, Yau et al. (2010) described the use of the Bxb1 large serine recombinase for site-specific integration into a pre-determined locus of the tobacco genome. In this experiment, a construct with the Bxb1 attP site was transformed into the tobacco genome (inserted via random integration). Single copy transgenic plants were isolated and used for site-specific integration of a plasmid containing the Bxb1 attB site and a hygromycin-resistance gene. The Bxb1 recombinase-expressing plasmid was provided in trans and co-transformed into the protoplasts derived from the target lines by PEG-mediated transformation. Integration lines were obtained through hygromycin selection and confirmed by sequencing of integration junction PCR products and by Southern blot analysis. The primary results showed that approximately 5% of the transformation events were site-specific. For more details on the use of site-specific integration for crop plant improvement, see Srivastava and Gidoni (2010).

Strategies combining recombinase-mediated integration and excision

As a substantial number of site-specific recombination systems from prokaryotes and lower eukaryotes have been identified, future GM plants may be produced using multiple recombination systems and strategies for multi-gene stacking and deletion. Ow (2005) described a strategy for gene stacking and deletion using both a reversible and an irreversible recombination system. The idea rests on a concept that the integrating DNA carries an extra recombination site, such that after insertion into the plant genome, the extra recombination site becomes a new target for the next round of DNA integration (Fig. 8). However, this strategy as described can only apply to sexually propagated plants due to crossing requirements. For each round of targeted integration, the target plant line has to be crossed with the recombinase expression line in order to remove the marker gene, with a second generation being required for segregation to remove the recombinase gene. Although time consuming, this technique is powerful allowing multiple genes (or groups of genes) to be sequentially stacked into a predetermined genomic locus.

Fig. 8

Schematic representation of gene stacking via recombinase technology. a The original ‘TAG’ DNA contains a previously targeted gene of interest 1 (GOI1), an IRS recognition site (attB, green arrow) and an RRS recognition site (lox, red arrow). b For effective gene stacking, an incoming vector consisting of attP-GOI2-attP-lox-Marker-inducible Cre can be introduced into the ‘TAG’ line (attP, yellow arrow, is the complementary IRS recognition site for attB). The IRS recombinase is provided transiently by a co-transformed plasmid. c Recombination between attB and one of the two attP sites integrates the GOI2 construct into the GOI1 locus. As two attP sites are present on the incoming vector, integration will produce a useful product only 50% of the time. d Induction of Cre will remove both itself and the marker gene, leaving the e GOI1-attL-GOI2-attP-lox structure (attL and attR, bicolored yellow/green arrows, are the hybrid sites of the attB/attP integration and are not competent for further recombination). The remaining genome-located attP is available for future targeted integration at the ‘TAG’ locus. f Analogous to the previous steps, the attP can be used to add a third gene, GOI3, with the construct attB-GOI3-attB-lox-Marker-inducible Cre, which brings in a new attB site for yet another round of integration. Not all possible conformations are represented

An alternative to the previously described strategy would be the inclusion of an inducible recombinase system for excision without crossing and segregation of the recombinase gene as currently proposed (Fig. 8). There are several advantages to this revised strategy. First, only one selectable marker gene is used for both target construct and stacking vector. Second, the strategy avoids the need for crossing and segregation of a recombinase gene to mediate excision as both the marker and recombinase gene are deleted once recombinase expression is induced. Third, only two simple vectors are used. Theoretically, this strategy could be used to stack numerous genes in one transgenic plant line via irreversible site-specific recombination utilizing attB and attP recognition sites. Finally, the overall strategy could benefit greatly by the use of a positive/negative selectable marker cassette to allow direct selection of plants that have undergone an excision event (Kondrak et al. 2006). Negative selection prior to molecular characterization would save time and effort invested in screening for candidate lines (Gleave et al. 1999).
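The stacking cycle outlined in Fig. 8 can be followed step by step with a short simulation. The sketch below is an illustration only, written under several assumptions: the locus is reduced to an ordered list of labels, only the productive integration (through the att site upstream of the incoming gene) is modeled, and the hybrid sites are labeled attL on the left and attR on the right when the genomic site is attB, with the labels swapped when the genomic site is attP. The function name stack_gene and the element labels are hypothetical.

```python
def stack_gene(locus, goi):
    """Simulate one stacking round at a locus that ends in [att site, 'lox'].

    Returns the locus after integration and the locus after Cre-mediated excision.
    """
    genomic_site = locus[-2]
    if locus[-1] != "lox" or genomic_site not in ("attB", "attP"):
        raise ValueError("locus must end with an attB or attP site followed by lox")

    # The incoming vector carries the complementary att site (two copies)
    incoming_site = "attP" if genomic_site == "attB" else "attB"
    # Hybrid site labels: an assumed naming convention, see the lead-in
    left_hybrid, right_hybrid = (
        ("attL", "attR") if genomic_site == "attB" else ("attR", "attL")
    )

    # Integration opens the circular vector at its first att site and
    # inserts everything between the two new hybrid sites.
    integrated = locus[:-2] + [
        left_hybrid, goi, incoming_site, "lox",
        "Marker", "Cre", "backbone",
        right_hybrid, "lox",
    ]

    # Induced Cre recombines the two directly oriented lox sites,
    # deleting the DNA between them and leaving a single lox site.
    first_lox = integrated.index("lox")
    last_lox = len(integrated) - 1
    resolved = integrated[:first_lox] + integrated[last_lox:]
    return integrated, resolved


tag = ["GOI1", "attB", "lox"]
_, round1 = stack_gene(tag, "GOI2")
_, round2 = stack_gene(round1, "GOI3")
```

After each round the locus again ends in a single att site followed by lox, with the att site alternating between attP and attB, which is what allows the next vector to be targeted to the same position.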

Recombinase-mediated cassette exchange (RMCE)

A unique targeted integration strategy was developed that takes advantage of the homology requirement between the 8 bp spacer regions of recombining FRT, loxP and RS sites (Hoess et al. 1986; Lee and Saito 1998; Nanto et al. 2005). This technique is termed recombinase-mediated cassette exchange and was originally used with the Flp/FRT recombinase system (Schlake and Bode 1994; Seibler and Bode 1997; Seibler et al. 1998). This technique has been shown to be an efficient way to produce transgenics with a minimum of excess DNA added to the host (Bouhassira et al. 1997). This strategy combines both site-specific integration and excision mediated by one recombinase. The approach involves a targeted integration event followed by a recombinase-mediated excision event removing unwanted transgenic DNA (Fig. 9). The initial design is similar to most biphasic systems where a tagged genome (TAG) must first be generated before it can be further modified by the recombinase. The ‘TAG’ usually consists of a positive selectable marker flanked by two inverted recognition sites. The inverted recognition site orientation inhibits unwanted ‘TAG’ auto-excision. The ‘EXCH’ plasmid also contains inverted recognition sites, which flank the DNA of interest to be inserted into the genome. These recognition sites are homologous to the sites already within the genome. The recombinase gene is either provided in cis on the incoming DNA, outside the flanked ‘EXCH’ cassette, or in trans from a separate molecule. Re-transformation of the ‘TAG’ lines with the recombinase and exchange (EXCH) vector allows a double recombination event to swap DNA. During the double recombinase-mediated exchange event, both sets of homologous sites will undergo a separate recombination event, exchanging the original ‘TAG’ gene with the incoming ‘EXCH’ gene. This is predicted to be a two-step process where one site initially undergoes recombination, integrating the entire construct into the genome, followed by a second recombination event removing the ‘TAG’ gene (selectable marker) from the genome along with the ‘EXCH’ plasmid backbone, leaving behind the ‘exchanged’ DNA of interest (Fiering et al. 1993; Feng et al. 1999; Thomason et al. 2001; Nanto et al. 2005). Recent evidence obtained by atomic force microscopy has confirmed that RMCE is a two-step process (Malchin et al. 2008). Because the Cre recombinase is more active than Flp (Buchholz et al. 1996; Dymecki 1996; Westerman and Leboulch 1996), the RMCE system was redesigned to include the loxP site. It was modified yet again to accommodate the use of other recombinase systems.
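The role of site orientation in the ‘TAG’ design can be illustrated with a minimal sketch. It rests on the general behavior of reversible recombinases, in which two directly oriented sites on one molecule lead to excision of the intervening DNA while two inverted sites lead to inversion. Each element is modeled as a (name, orientation) pair with orientation +1 or -1; the function recombine_pair and the element names are hypothetical and chosen only for illustration.

```python
def recombine_pair(locus, i, j):
    """Recombine the recognition sites at positions i < j of a linear locus.

    Directly oriented sites: the intervening DNA and one site are excised.
    Inverted sites: the intervening DNA is flipped in place.
    """
    orientation_i = locus[i][1]
    orientation_j = locus[j][1]
    if orientation_i == orientation_j:
        # Excision: keep the first site, drop everything up to and including the second
        return locus[: i + 1] + locus[j + 1:]
    # Inversion: reverse the order of the intervening elements and flip each orientation
    flipped = [(name, -orientation) for name, orientation in reversed(locus[i + 1: j])]
    return locus[: i + 1] + flipped + locus[j:]


# 'TAG' with inverted sites: the marker is inverted but retained
tag_inverted = [("lox", +1), ("Marker", +1), ("lox", -1)]
after_inverted = recombine_pair(tag_inverted, 0, 2)

# The same cassette with directly oriented sites: the marker is lost
tag_direct = [("lox", +1), ("Marker", +1), ("lox", +1)]
after_direct = recombine_pair(tag_direct, 0, 2)
```

The inverted arrangement protects the ‘TAG’ from auto-excision, but the same inversion reaction means the cassette between the sites can end up in either orientation.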

Fig. 9

General RMCE schematic for a single bidirectional recombinase (the green and red arrows represent identical recognition sites; the colors are only intended to help visualize the various recombination possibilities). This technique has been shown to be an efficient way to produce transgenics with a minimum of excess DNA added to the host. This strategy combines both site-specific integration and excision mediated by one recombinase. The approach involves a targeted integration event followed by a recombinase-mediated excision event removing the unwanted transgenic DNA. Specifically, RMCE is a technique where DNA can be integrated in a specific manner into a pre-existing genomic target ‘TAG’ with a minimum of backbone DNA. The pre-existing ‘TAG’ contains a selectable marker (Marker) and is flanked by inverted RRS recognition sites (red and green arrows, thin lines). The incoming plasmid DNA contains a gene of interest (GOI) also flanked by inverted RRS recognition sites (red and green arrows, thick lines) and is termed the exchange ‘EXCH’ cassette. The recombinase will integrate the incoming ‘EXCH’ DNA into the ‘TAG’ utilizing one of the two flanking recognition sites. To proceed forward, the second set of recognition sites is then used for site-specific excision of the intervening DNA. This results in the exchange of the GOI of the ‘EXCH’ cassette for the Marker of the ‘TAG’ DNA. Although useful, inverted sites also result in ‘exchanged’ cassettes with both forward and reverse orientations, which makes the molecular conformation more complicated and leads to differing transgene expression levels. Not all possible conformations are represented. The recombinase can be provided in cis or in trans (not shown)

A limitation to the use of the bidirectional tyrosine recombinases (Cre, R or Flp) is the reversible nature of the recombination events, meaning that the RMCE can be reversed at any step in the process or even be repeated. Further, the final product can be integrated in two possible orientations making molecular confirmation complex. However, despite these potential limitations, this technique has been used to successfully generate transgenic yeast, mammalian cells, mice, Drosophila and plants (Feng et al. 1999; Baszczynski et al. 2003; Belteki et al. 2003; Horn and Handler 2005; Nanto et al. 2005; Louwerse et al. 2007; Nanto and Ebinuma 2008; Watson et al. 2008; Li et al. 2009; Nanto et al. 2009; Fladung and Becker 2010). Scientists working in mammalian systems have demonstrated targeting efficiencies approaching 100%, from a pre-existing chromosomal site, through the use of a negative-selection marker exchange vector. They have reported that the technique is powerful enough to specifically integrate into a targeted location without the use of any selection and still obtain an integration frequency of 1% (Feng et al. 1999). Recombinase systems that have been successful with this method include Cre/loxP, Flp/FRT, R/RS, phiC31 and Bxb1, indicating that even bidirectional recombinases such as Flp and R, which are reported to be less active than Cre, can be successfully used in this strategy. Of interest is the fact that this technique has been successfully employed using both direct DNA transformation and Agrobacterium-mediated methods (Vergunst et al. 1998; Nanto and Ebinuma 2008).

Use of the reversible recombinase systems Cre/loxP, Flp/FRT and R/RS requires special attention due to unwanted self-excision. Careful cloning strategies must be devised to avoid generating directly oriented recombinase-binding sites on the same construct, which can result in premature excision (Nanto et al. 2005; Louwerse et al. 2007; Nanto and Ebinuma 2008). Use of heterologous or mutant binding sites has been shown to reduce this problem with some success (Louwerse et al. 2007; Watson et al. 2008). Unfortunately, minor differences in heterologous binding sites are not always sufficient to inhibit unintended DNA excision (Siegel et al. 2001). While mutant binding sites help stabilize the integration event of a reversible recombinase system, recombination efficiency is lowered, and this reduction correlates with the stabilizing effect of specific mutations (Albert et al. 1995; Thomson et al. 2003; Araki et al. 2010). Also, some recombinase systems such as R/RS do not have known mutant sites. As mentioned earlier, the most universally successful binding site strategy employs a design where the recombination sites are in an inverted arrangement for both the ‘TAG’ and ‘EXCH’ constructs. Although useful, inverted sites also result in ‘exchanged’ cassettes with both forward and reverse orientations, resulting in differences in transgene expression levels (Fig. 9) (Feng et al. 1999; Nanto et al. 2005, 2009).

A modification to RMCE is a strategy that employs two recombinase expression systems where both recombinase enzymes are simultaneously expressed. Lauth et al. (2002) first demonstrated the use of the Cre/loxP and Flp/FRT systems, along with the use of mutant binding sites to stabilize the integration event. Targeting without selection was 3%. Put into perspective, this is threefold better than the levels obtained with homologous recombination using a selectable marker. The research did, however, require the use of a fluorescence-activated cell sorter (FACS) to isolate transgenic tissue culture cells and failed to address the presence or absence of the recombinase genes in the final cell lines. Two other groups have now used a modified dual RMCE technique. Dafhnis-Calas et al. (2005) used the Cre/loxP and phiC31 recombinases for RMCE. The authors called their technique ‘iterative site-specific integration’ and used it in mammalian cell culture to stack genes into the same locus. However, this strategy required the retention of a selectable marker in the resultant transformed lines. In contrast, Nanto and Ebinuma (2008) demonstrated the use of both R/RS and Cre/loxP systems for RMCE in tobacco. This well-executed study successfully generated marker-free, single copy, targeted integration transgenic lines. Agrobacterium was used as the vehicle for transformation, demonstrating that RMCE can be executed from the incoming T-DNA molecule. Further, the R/RS RMCE targeting strategy (Nanto et al. 2005; Nanto and Ebinuma 2008) was enriched for ‘clean’ exchange events by including an extra RS recognition site within the incoming vector. If random integration were to occur, the continued expression of the R recombinase gene would excise the incoming vector, removing the entire construct from the genome, minus an LB-RS-RB footprint.

We describe here a novel approach to RMCE that can potentially overcome the limitations previously described. This strategy utilizes two unidirectional recombinase systems and was developed to create single insertion events in previously characterized lines having stable expression patterns. The system is intended for high throughput gene integration such as that needed by biopharmaceutical companies for protein production or for the study of genes with regard to biotechnology risk assessment. Another important feature is that the insertion event itself is unidirectional, and thus transgene expression will be constant and not compromised by forward and reverse insertion directions, unlike most RRS-dependent RMCE techniques. The large serine recombinases, namely Bxb1 and phiC31, are naturally unidirectional, and therefore trap the DNA within their target loci. The second recombinase needed for excision is an irreversible small serine recombinase. Due to topological constraints (Sarkis et al. 2001; Mouw et al. 2008), these recombinases are only capable of site-specific excision and therefore cannot re-integrate DNA into the genome, as can occur with bidirectional tyrosine systems (Srivastava and Ow 2003). During this process the selectable marker, recombinase genes and plasmid backbone will be eliminated from the host genome during the initial transformation event (Fig. 10). All progeny containing the gene of interest produced and confirmed from the initial transformation will serve as the final product. In other words, the plant lines will be immediately available for production and/or study. No secondary crosses or segregation to remove unwanted DNA is necessary. This design employs a positive/negative selectable marker scheme that has been very effective for replacement strategies (Kondrak et al. 2006; Nanto and Ebinuma 2008) and makes the site-specific excision event completely dependent on precise integration and cell survival. When the ‘EXCH’ vector is site-specifically inserted into the chromosomal ‘TAG’ cassette, the recognition sites required for excision will align, stimulating the removal of the positive/negative selectable marker gene (Fig. 10). This technique is meant to eliminate undesirable random genomic integration, thereby reducing background and screening time. The presence of the negative marker gene (e.g. codA) eliminates cells that experienced site-specific integration of the ‘EXCH’ cassette without subsequent excision, and those cells that had only random integration events.

Fig. 10

Schematic representation of the irreversible recombinase-mediated cassette exchange. This strategy simultaneously uses two irreversible recombinases, one for integration into the pre-existing chromosomal ‘TAG,’ and one for excision of unwanted DNA. This strategy traps the GOI from the incoming ‘EXCH’ cassette in the genome (seen as circle with thick lines). Even though two recombinase-mediated events are necessary, these reactions occur in a single transformation experiment. Integration of the ‘EXCH’ cassette is identified via negative selection against the presence of the Pos/Neg selectable marker. The design makes the proper alignment of the recognition sites for excision dependent on the site-specific integration event. No site-specific or random integration event where the negative selection marker is present survives negative selection. a The ‘TAG’ contains a positive–negative selectable marker and two irreversible recombinase genes flanked by the recognition sites (attP and Res) of both IRSs, represented by yellow and blue arrows, respectively. b The ‘EXCH’ vector is a simple vector containing a GOI flanked by the recognition sites of the two corresponding IRSs (attB and Res, represented by green and blue arrows, respectively). c Irreversible recombinase-mediated integration inserts the entire plasmid into the pre-existing ‘TAG’ locus (i.e. the yellow and green arrows are recombined, giving integration). Site-specific integration aligns the Res recognition sites in direct orientation (blue arrows), enabling excision. Irreversible recombinase-mediated excision removes the positive–negative marker and recombinase genes from the genome (i.e. the Res sites are recombined, excising the undesirable DNA). d Upon completion of both irreversible recombination events, the GOI has been site-specifically inserted into the genome free of unwanted plasmid backbone, marker and recombinase genes. Circularized DNA does not replicate and cannot be reintegrated into the genome via the recombinase genes, as neither the attL and attR sites (seen as bicolored yellow and green arrows) nor the Res recognition sites (blue arrows) can mediate genomic integration
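The selection logic of Fig. 10 can be checked with a small simulation. This is a sketch under stated assumptions rather than a description of an actual construct: the element order within the ‘TAG’ and ‘EXCH’ lists is assumed for illustration, the labels integrase and resolvase stand in for the two irreversible recombinase genes, a random insertion is assumed to leave the ‘TAG’ locus unchanged, and negative selection is reduced to a check for the presence of the codA marker. All function names are hypothetical.

```python
# Assumed layouts; the real order of elements may differ.
TAG = ["attP", "codA-marker", "integrase", "resolvase", "Res"]
EXCH = ["attB", "GOI", "Res", "backbone"]  # circular vector, written starting at attB


def integrate(tag, exch):
    """attP x attB recombination inserts the whole vector between hybrid sites."""
    i = tag.index("attP")
    return tag[:i] + ["attL"] + exch[1:] + ["attR"] + tag[i + 1:]


def excise(locus):
    """Recombination between two directly oriented Res sites deletes the DNA between them."""
    res_positions = [k for k, element in enumerate(locus) if element == "Res"]
    if len(res_positions) < 2:
        return locus  # a single Res site has no partner, so nothing is excised
    first, last = res_positions[0], res_positions[-1]
    return locus[:first] + locus[last:]


def survives_negative_selection(locus):
    # Cells still carrying the negative marker are killed
    return "codA-marker" not in locus


integrated_only = integrate(TAG, EXCH)
exchanged = excise(integrated_only)
random_insertion = excise(TAG)  # TAG locus untouched by a random insertion elsewhere

outcomes = {
    "integration + excision": survives_negative_selection(exchanged),
    "integration only": survives_negative_selection(integrated_only),
    "random insertion": survives_negative_selection(random_insertion),
}
```

Only the fully exchanged locus lacks the marker, so it is the only class of event expected to pass negative selection in this model.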

This strategy can be modified for gene stacking by including a second, integration-specific recognition site in the ‘EXCH’ vector. This design results in an additional recognition site being added to the insertion locus, which can be used as a subsequent target site. By placing the additional site in an inverted orientation, only the correctly targeted recognition site will allow the negative marker gene, codA, to be excised, and thus allow selection of the appropriate event.

Conclusion

This review summarizes the impact that site-specific recombination-based technologies are having on the field of genomic engineering, and how these methods will change genomic engineering in the future. This technology enables the removal of extraneous DNA such as selectable markers (e.g. antibiotic resistance genes) from the genome and speeds the transition from laboratory manipulation to field production, an important benefit for both the industry and the general public. From an application point of view, the number and type of recombinases available, along with the innovative strategies being developed, offer a multitude of applications for genome manipulation, with everything from single copy high throughput targeted integration, to sequential gene stacking, to complete transgene removal from the pollen and/or seed. These strategies are applicable to the commonly used transformation methods of biolistics and Agrobacterium, providing the widest available use in diverse systems. Site-specific recombinase technology is ready for routine application in genomic engineering and will likely become an integral part of crop biotechnology, enabling the generation of improved crops that rely less on pesticides and fertilizers and have the ability to produce high quality, abundant foods, impacting the farm economy. However, utilization of recombinases as tools for genome manipulation has so far been limited. This could be due to the perceived complexity of the systems involved (Albert et al. 1995) or possibly to the low efficiencies observed in initial recombinase studies (Vergunst and Hooykaas 1998; Vergunst et al. 1998). Nevertheless, through improvements to initial protocols, practical utilization of recombinase technology has been demonstrated (Srivastava and Ow 2002; Baszczynski et al. 2003; Srivastava et al. 2004; Nanto et al. 2005; Louwerse et al. 2007; Nanto and Ebinuma 2008; Li et al. 2010). In addition, the RMCE strategy and general availability of recombinases provide greater commercial application to the use of these systems. The public will further benefit from a greater supply of healthy, economical foods, and issues of hunger facing economically challenged regions of the world can be more easily addressed. Finally, public and regulatory concerns over the potential unintended effects of extraneous DNA may also be alleviated, improving acceptance for genetically engineered crops in the future.