1 Introduction

It is now well established that microorganisms, such as Escherichia coli, Corynebacterium glutamicum, and yeast, exhibit excellent characteristics in producing amino acids, organic acids, and vitamins (Hou et al., 2012; Xu et al., 2013; 2014a; 2014b; 2015). Although high-producing strains have been obtained via repeated physical and/or chemical mutagenesis, strains bred in this way have many disadvantages, such as slow growth, a low sugar consumption rate, and low stress tolerance (Xu et al., 2013). Fortunately, the development of techniques for directed DNA manipulation and the availability of whole genome sequences make it possible to construct high-producing strains using genetic engineering techniques (Kalinowski et al., 2003; Baba et al., 2006). These strategies have been successfully used in modifying E. coli or C. glutamicum strains for producing, e.g., amino acids (Georgi et al., 2005; Becker et al., 2011), organic acids (Inui et al., 2007), and vitamins (Martens et al., 2002). The strategies of genetic engineering breeding for constructing high-producing strains are as follows: (1) up-regulation of the key enzyme genes involved in the biosynthetic pathways of the target product; (2) relief of the inhibition and/or repression of the key enzymes; (3) interruption of the pathways for synthesizing by-products. The implementation of these strategies involves a wide variety of DNA manipulations, including site-directed mutagenesis (SDM), gene inactivation, and over-expression. In contrast to conventional breeding via repeated physical and/or chemical mutagenesis, genetic engineering breeding gives more practicable options for subsequently isolating mutant strains because it can avoid poor physiological characteristics and unknown inherited characteristics. Therefore, developing new methods to achieve faultless DNA manipulations is one of the most popular research subjects.

Here we review the common strategies used for DNA manipulations. We focus on strategies that genetically modify the E. coli and C. glutamicum genomes. In addition, the potential problems of multi-layered DNA manipulations are considered. However, this review does not discuss plasmid-mediated gene over-expression. It is intended as an easy introduction for novices and a source of new experimental information for experienced investigators.

2 Strategies for SDM

SDM, also called “rational mutagenesis” (Mandaci, 2011), is commonly used to introduce mutations at defined sites of a target DNA fragment, including the genome and plasmids, via polymerase chain reaction (PCR) or restriction endonuclease reaction (RER). It plays a great role in understanding the regulatory motifs of operons and the relationship of protein structure to function (Ling and Robinson, 1997; Seyfang and Jin, 2004; Wu et al., 2013). Depending on the number of mutational sites, SDM can be divided into two types: single site-directed mutagenesis (Single-SDM) and multiple site-directed mutagenesis (Multi-SDM) (Liang et al., 2012).

2.1 Strategies of Single-SDM

Single-SDM is mainly based on the amplification of a double-stranded DNA (dsDNA) plasmid using a complementary primer pair, in which each primer is 20–30 nucleotides long and carries the desired mutation (Ling and Robinson, 1997; Holland et al., 2015). Because of its simplicity, speed, and relatively high efficiency, this approach has become a common strategy for introducing mutations into E. coli and C. glutamicum genomes (Ling and Robinson, 1997; Muyrers et al., 2001; Xu et al., 2014b). Although most of the methods used for Single-SDM were reviewed around 15–20 years ago (Chatellier et al., 1995; Ling and Robinson, 1997; Muyrers et al., 2001), many new methods are springing up along with the development of genetic engineering techniques.
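The idea of a complementary primer pair carrying the desired mutation can be illustrated with a minimal Python sketch. It is not taken from any of the cited protocols: the function names, the default flank length of 12 nt (giving 25-nt primers, within the 20–30 nt range mentioned above), and the example template are assumptions made for illustration only, and real primer design would also consider melting temperature and secondary structure.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def design_primer_pair(template: str, position: int, new_base: str, flank: int = 12):
    """Design a complementary primer pair carrying one base substitution.

    position is the 0-based index of the base to change. flank is the number
    of unchanged bases kept on each side of it (hypothetical default: 12).
    Returns (forward_primer, reverse_primer), both written 5'->3'.
    """
    if position - flank < 0 or position + flank >= len(template):
        raise ValueError("mutation site is too close to the template end")
    window = template[position - flank: position + flank + 1]
    # Replace only the central base; the flanks stay identical to the template.
    forward = window[:flank] + new_base + window[flank + 1:]
    return forward, reverse_complement(forward)


# Illustrative template (made-up sequence, not from the article).
template = "ATGGCTAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCCC"
fwd, rev = design_primer_pair(template, position=20, new_base="G")
print(fwd)
print(rev)
```

Because the reverse primer is simply the reverse complement of the forward primer, both primers anneal to the same site on opposite strands, which is the arrangement the text describes.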

2.1.1 Single-SDM by enzyme digestion and homologous recombination (ED-HR)

Commercial kits are simple and easy to use for SDM (Imai et al., 1991; Martin et al., 1995; Chiu et al., 2004; Stoynova et al., 2004), but some defects usually limit their application in creating larger deletions (Li et al., 2008). In order to overcome the limitations of commercial kits, methods for larger deletions have been developed based on DNA ligation and hybridization in vitro (Imai et al., 1991; Chiu et al., 2004; Stoynova et al., 2004) or on homologous recombination in vivo (Martin et al., 1995). The improved homologous recombination technique has greatly simplified the procedure of mutagenesis (Li et al., 2008), requiring only two steps: PCR amplification and transformation (Martin et al., 1995). However, the efficiency varies because the plasmid used as a template in PCR amplification is also transformed into the competent cells (Li et al., 2008). Although many efforts have been made to overcome this obstacle, such as gel purification of PCR products, these measures are time-consuming and do not work well for purifying the small amounts of product that result from fewer PCR cycles and/or lower amplification efficiency (Li et al., 2008).

It is generally known that most bacteria possess DNA restriction-modification (R-M) systems that protect the cell from intrusion by foreign DNA. In most cases, “self” DNA is marked by methylation, whereas “non-self” DNA is recognized by the absence of methylation (Siwek et al., 2012). The DpnI restriction system is one of the R-M systems (Johnston et al., 2013). DpnI endonuclease cleaves only the double-stranded (ds)-methylated 5′-GmATC-3′ sequence in DNA (Lacks and Greenberg, 1977). Interestingly, plasmid extracted from bacteria contains methyladenosine at the DpnI restriction site, which makes the plasmid susceptible to DpnI (Lu et al., 2002), whereas PCR-amplified DNA does not contain methyladenosine (Li et al., 2008). Based on this principle, PCR using plasmid DNA as a template is a very useful way to combine replication with mutagenesis (Lu et al., 2002). PCR products digested by DpnI have been widely used in mutagenesis, and the protocols are outlined in Fig. 1a. In general, the PCR products are digested in vitro by DpnI (Weiner et al., 1994; Martin et al., 1995; Qi and Scholthof, 2008). Li et al. (2008) proposed a simple and rapid strategy to digest PCR products in vivo by DpnI when carrying out SDM. The crucial aspects of this strategy are inactivation of the dam gene in the chromosome and heterologous expression of the dpnC gene (encoding DpnI) in the host cell. Since DpnI is expressed in vivo, there is no need to purchase DpnI or to manipulate the PCR products.
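The selection principle described above (methylated template is cut, unmethylated PCR product is not) can be sketched in a few lines of Python. This is an illustrative model only: the function names are hypothetical, methylation is reduced to a single yes/no flag for the whole molecule, and hemimethylated hybrids are not modelled.

```python
import re


def dpni_sites(seq: str, circular: bool = True) -> list:
    """Return 0-based start positions of GATC motifs in a DNA sequence.

    For a circular plasmid, the first three bases are appended so that a
    motif spanning the origin of the sequence string is also found.
    """
    seq = seq.upper()
    search = seq + seq[:3] if circular else seq
    # A lookahead is used so that every start position is reported.
    return [match.start() for match in re.finditer("(?=GATC)", search)]


def survives_dpni(seq: str, dam_methylated: bool, circular: bool = True) -> bool:
    """Simplified rule: DpnI cuts only Dam-methylated GATC sites.

    Unmethylated DNA (e.g. a PCR product) always survives; methylated DNA
    survives only if it carries no GATC site at all.
    """
    return (not dam_methylated) or not dpni_sites(seq, circular)


plasmid = "AAGATCTTTGGATCCA"  # made-up sequence with two GATC motifs
print(dpni_sites(plasmid))                          # positions of GATC
print(survives_dpni(plasmid, dam_methylated=True))   # template from a dam+ host
print(survives_dpni(plasmid, dam_methylated=False))  # in vitro PCR product
```

The same check is useful in practice before choosing a DpnI-based protocol, since a template without any GATC site would not be removed by the digestion.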

Fig. 1

Schematic diagrams of the Single-SDM method for base deletion, substitution, and insertion by enzyme digestion and homologous recombination (a) (Li et al., 2008) and by enzyme ligation and homologous recombination (b) (Wu et al., 2013)

Blue lines represent the targeted gene; bright-green lines represent the homologous fragment between the target gene and the primers; red lines represent the desired mutant sites; pink lines represent restriction enzyme sites (Note: for interpretation of the references to color in this figure legend, the reader is referred to the web version of this article)

2.1.2 Single-SDM by enzyme ligation and homologous recombination (EL-HR)

The above-mentioned ED-HR and other PCR-based traditional Single-SDMs (Picard et al., 1994; Brøns-Poulsen et al., 1998; Chapnik et al., 2008; Li et al., 2008) require at least two PCR rounds, multiple steps of enzymatic treatment, DNA template purification, and primer transformation or purification via high-performance liquid chromatography (HPLC) (Picard et al., 1994; Qi and Scholthof, 2008; Wan et al., 2012). In addition, the targeted gene should be sub-cloned into an original plasmid to form targeted plasmid which is used as a template to amplify the mutant plasmid with the desired mutation site (Picard et al., 1994). However, undesired amplification of these methods results in long extension time and low mutagenesis efficiency (Wang et al., 2011; Adachi and Fukuhara, 2012). Moreover, these methods require the digestion of the PCR products with DpnI in vitro or in vivo (Chapnik et al., 2008; Li et al., 2008; Qi and Scholthof, 2008). All these can be expensive and time-consuming (Wan et al., 2012).

Many plasmids, including T-vector (e.g. pUC18 and pMD™18-T), mobilizable plasmid (e.g. pK18mobsacB and pK19mobsacB), and expression plasmid (e.g. pDXW-8 and pET28a), contain a multiple cloning site (MCS) (Schäfer et al., 1994; Xu et al., 2010). The MCS can be used to clone the targeted DNA fragment into these plasmids. Based on the circularization of PCR products via homologous recombination in vivo (Jones and Winistorfer, 1991), Wu et al. (2013) proposed a rapid and efficient protocol depending on two separate PCR amplifications and one-step ligation, which we call an “EL-HR”. This protocol can be used to introduce deletions, insertions, and substitutions into any site of the target genes (Fig. 1b). Since the up-fragment and down-fragment (i.e. A fragment and B fragment) of the targeted gene are independently amplified with the same PCR parameters and at the same time, it avoids long extension time, high PCR error rate, and low mutagenesis efficiency. In addition, this protocol does not need the digestion of the parental template because the targeted plasmid is constructed by enzyme ligation rather than by PCR.

2.1.3 Single-SDM by TA strategy

Although these PCR-based approaches are useful and powerful for SDM, they still have many drawbacks: (1) the need for long or HPLC-purified primers (Davis et al., 1999; Salerno et al., 2005); (2) the need for high-fidelity thermostable DNA polymerase and restriction endonuclease (Salerno et al., 2005); (3) the need for subcloning the desired gene into the original plasmid to form a targeted plasmid (Davis et al., 1999); (4) undesired amplification of the wild-type gene template (Ke and Madison, 1997); (5) the self-annealing of megaprimer strands (Brøns-Poulsen et al., 1998; Siloto and Weselake, 2012); (6) the unbalanced melting temperatures of the primers (Tseng et al., 2008; Saeedi et al., 2012); (7) the requirement for a high concentration of megaprimer to ensure successful modification (Wan et al., 2012); (8) the requirement for additional steps for preparation of the template plasmid (Urban et al., 1997) or restriction endonuclease treatment of PCR products (Seraphin and Kandels-Lewis, 1996).

Alternatively, Adachi and Fukuhara (2012) proposed a simple method using TA strategy with synthetic mutagenic oligonucleotides. By application of this strategy, virtually any DNA manipulation can be achieved, and it is even possible to introduce foreign sequences into the PCR-amplified 3′ A-overhang vector in a size-independent and site-specific manner (Adachi and Fukuhara, 2012). The protocols of this method are illustrated in Fig. 2 and the crucial aspects are as follows:

(1) The choice of thermostable DNA polymerase. 3′→5′ exonuclease activity is inherent in DNA polymerase γ and T4 polymerase (Khare and Eckert, 2002) and is involved in correcting mismatched base pairs. Therefore, in order to leave a 3′ A-overhang after PCR, a 3′→5′ exonuclease-deficient thermostable DNA polymerase should be selected for PCR.

(2) The synthetic mutagenic ds-oligonucleotides. According to the principle of adenine-thymine (AT) base pairing, the 3′ A-overhang vector and the 3′ T-overhang synthetic mutagenic ds-oligonucleotides are joined. Different mutation types are constructed by synthesizing the corresponding ds-oligonucleotides with 3′ T-overhang sticky ends.

(3) The length of the targeted plasmid carrying the target gene. PCR-mediated amplification increases the risk of introducing undesired mutations when a thermostable polymerase is used, especially a low-fidelity one (Wu et al., 2013). Moreover, Adachi and Fukuhara (2012) have pointed out that an excessively long extension time increases the error rate and the likelihood of generating artifacts.

(4) The storage time of the PCR product. Because the 3′ A-overhangs on the PCR product degrade over time, which reduces the efficiency of ligation with the 3′ T-overhang synthetic mutagenic ds-oligonucleotides, fresh PCR product should be used for SDM.
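Point (2) above can be made concrete with a small Python sketch that derives the two strands of a mutagenic duplex whose ends each carry a single 3′ T-overhang, ready to pair with the 3′ A-overhangs of the PCR-amplified vector. The function names and the example insert are assumptions for illustration; this is not the published oligonucleotide design of Adachi and Fukuhara (2012).

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def ta_mutagenic_duplex(insert: str):
    """Return (top, bottom) strands, both 5'->3', for a T-overhang duplex.

    The double-stranded core is `insert` paired with its reverse complement;
    one extra unpaired T is added to the 3' end of each strand so that the
    annealed duplex has a 3' T-overhang on both sides.
    """
    top = insert + "T"
    bottom = reverse_complement(insert) + "T"
    return top, bottom


# Hypothetical mutagenic sequence to be placed between the vector ends.
top, bottom = ta_mutagenic_duplex("GAATTCGG")
print(top)     # GAATTCGGT
print(bottom)  # CCGAATTCT
```

Substitutions, insertions, and deletions differ only in what is passed as `insert`, which mirrors the statement that the mutation type is set by the synthesized ds-oligonucleotide.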

Fig. 2

Schematic diagram of the TA strategy-based Single-SDM method for base deletion, substitution, and insertion (Adachi and Fukuhara, 2012)

Blue lines represent targeted gene; red lines represent mutant sites; 5′→3′ represents base direction (Note: for interpretation of the references to color in this figure legend, the reader is referred to the web version of this article)

2.1.4 Comparison of major methods for Single-SDM

With the development of genetic engineering technology, numerous novel Single-SDM methods based on QuikChange™ SDM system (QCM)-like kits or previous protocols have been developed to overcome the disadvantages of earlier methods (Ke and Madison, 1997; Urban et al., 1997; Salerno et al., 2005; Qi and Scholthof, 2008; Tseng et al., 2008; Wang et al., 2011; Adachi and Fukuhara, 2012; Wan et al., 2012; Sun et al., 2015). Most of these methods can be used to construct difficult mutants that traditional SDM protocols cannot produce. Moreover, these methods can vastly simplify routine work and reduce experimental cost. However, as far as we know, there is no perfect method for Single-SDM, including the above-mentioned methods. To summarize the preceding sections on PCR-based and non-PCR-based approaches in Single-SDM, some aspects of the above-mentioned methods are compared in Table 1.

Table 1 Comparison of major methods in vitro for Single-SDM

2.2 Strategies of Multi-SDM

As stated previously, SDM plays an important role in genetic research, such as understanding the regulation site of operon or the relationship of protein structure to function (Ling and Robinson, 1997; Seyfang and Jin, 2004; Wan et al., 2012). Therefore, many strategies used for Single-SDM are springing up with the development of genetic engineering technology (see Section 2.1). However, it is sometimes necessary to modify several sites in one amino acid sequence to understand the structure and function of a given protein. Hence, a rapid and efficient Multi-SDM strategy would be of great benefit in various applications in functional proteomics and genetic engineering, such as codon optimization for heterologous expression, generation of cysteine-less proteins for subsequent cysteine-scanning mutagenesis, and disulfide-scanning mutagenesis studies (Davis and Seqaloff, 2002; Wang et al., 2012) or (re)design and removal of restriction endonuclease sites in expression vectors (Seyfang and Jin, 2004). Many strategies have been developed to simplify the generation of multiple mutations in a targeted gene.

2.2.1 Multi-SDM by multiple mutagenic oligonucleotide-directed (MMOD) method

Oligonucleotide-directed SDM methods have been described, including PCR-based methods, and methods based on plasmid DNA templates and mutant DNA (Hogrefe et al., 2002). Most of these methods are used to introduce only one mutation site at a time. Perlak (1990) presented evidence that it is possible to modify multiple sites in a targeted DNA by the oligonucleotide-directed SDM method using multiple mutagenic oligonucleotides as primers. We call this the “multiple mutagenic oligonucleotide-directed (MMOD) method”. The theoretical basis of this method is that the mutant strand synthesized by primer extension can be mediated by T4 DNA polymerase or by the Klenow fragment of DNA polymerase I. T4 DNA polymerase cannot remove the mutagenic primer, whereas the Klenow fragment can be used to displace the primer oligonucleotide which contains the desired mutations and homologous fragment of the original strand (Nossal, 1974). After a series of PCR reactions with multiple mutagenic oligonucleotides as primers, the mutants with the desired mutations are screened. Tu and Sun (1996) successfully introduced three mutation sites in a targeted gene. However, the efficiency of mutagenesis is low (Perlak, 1990).

2.2.2 Multi-SDM by adaptation methods based on QuikChange™

The QuikChange™ protocol, developed by Stratagene (La Jolla, CA, USA), is one of the simplest and fastest methods for introducing mutations (Hogrefe et al., 2002). The QuikChange™ protocol uses a reverse-complementary mutagenic primer pair to replicate a template plasmid carrying the targeted gene, and thereby introduces mutation(s) at the primer binding site(s). It relies on the fact that DNA synthesized in vitro is not methylated (Li et al., 2008), whereas the template plasmid extracted from E. coli cells is methylated (Lu et al., 2002), and the methylated DNA can be digested by DpnI. However, there are several deficiencies (Mitchell et al., 2013): (1) DNA replication is linear rather than exponential; (2) it needs high-fidelity DNA polymerase; (3) it is hard to introduce multiple mutations in a single reaction; (4) efficiency is limited by the size of the template plasmid; (5) it has high error rates due to long extension time (Adachi and Fukuhara, 2012); (6) DpnI is required to digest the template DNA; (7) primers are synthesized with a 5′ phosphate moiety and purified by polyacrylamide gel electrophoresis (PAGE) (Hogrefe et al., 2002); (8) self-annealing and self-pairing need to be considered during primer design (Liu and Naismith, 2008); (9) it is hard to carry out deletions and insertions. In order to address these shortcomings, many researchers have proposed new strategies.
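Deficiency (1) can be quantified with an idealized Python calculation. The sketch assumes 100% efficiency in every cycle and counts whole molecules only; the function names and the figure of 18 cycles are illustrative assumptions, not values from the cited protocols. In the linear case only the original template is copied in each cycle, whereas in the exponential case every product also serves as a template.

```python
def linear_product(template_molecules: int, cycles: int) -> int:
    """New molecules when only the original template is copied each cycle."""
    return template_molecules * cycles


def exponential_product(template_molecules: int, cycles: int) -> int:
    """New molecules when products also act as templates (ideal doubling)."""
    return template_molecules * (2 ** cycles - 1)


cycles = 18  # illustrative cycle number
print(linear_product(1000, cycles))       # 18000
print(exponential_product(1000, cycles))  # 262143000
```

Under these assumptions the gap is more than four orders of magnitude, which explains why linear protocols need comparatively more template and why carry-over of parental plasmid matters.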

Kirsch and Joly (1998) divided the in vitro DNA synthesis into two successive phases, and the two newly amplified DNA fragments served as “megaprimers” to synthesize the remainder of the plasmid. This method can be used with non-overlapping oligonucleotides and non-specific restriction sites, and introduces two separate mutation sites in a gene simultaneously. By the use of in vitro dam-methylation between successive PCR amplifications, Kim and Maas (2000) showed that all mutations could be introduced after a single transformation and DNA preparation step. This adaptation not only saves time but also preserves high targeting efficiency. In addition, this method is valuable when intermittent phenotypic selection is unfavorable, or when the targeted gene is unstable or tends to recombine in vivo. Moreover, Liu and Naismith (2008) used novel primers to achieve multiple mutations in a gene. The primers, containing extended non-overlapping sequences at the 3′ end and primer-primer complementary sequences at the 5′ end, promote primer-template annealing by eliminating primer dimerization and permit the newly synthesized DNA to be used as the template in subsequent PCR cycles.

Multichange isothermal (MISO) mutagenesis is another new protocol to introduce Multi-SDM into plasmid DNA, which is based on the QuikChange™ protocol and one-step isothermal (ISO) assembly (Mitchell et al., 2013). This method overcomes all of the aforementioned defects of the traditional QuikChange™ protocol. Fig. 3 shows an overview of this protocol. The MISO mutagenesis can be used as a simple strategy for making three types of DNA manipulations (i.e. base substitution, insertion, and deletion) in a single round of experimentation (Adachi and Fukuhara, 2012). However, there are still several limitations, such as DNA sequence, the error rate associated with oligonucleotide synthesis, and mis-assembly errors from the ISO-assembly step. In addition, it is difficult to introduce mutations at nearby bases.

Fig. 3

Schematic diagram of multichange isothermal mutagenesis for Multi-SDM (a) and of one-step isothermal assembly, which relies on the concerted action of three enzymes (b) (Mitchell et al., 2013)

Blue lines represent the targeted gene; red lines represent insertion sequences; lines of the same color represent homologous regions; symbols “^” represent mutant sites; the orange box represents the deletion region (Note: for interpretation of the references to color in this figure legend, the reader is referred to the web version of this article)

2.2.3 Multi-SDM by overlap extension PCR (OE-PCR)

OE-PCR has been widely used to introduce a single mutation embedded in the oligonucleotide primers (Ho et al., 1989) and to engineer genes (Horton et al., 1990). Based on the principle and strategy of OE-PCR in Single-SDM, OE-PCR has been developed for use in Multi-SDM (Urban et al., 1997; Tian et al., 2010; Luo et al., 2012; Wäneskog and Bjerling, 2014). In traditional OE-PCR, two or more mutation-embedding fragments are separately amplified using internal primers which contain the mutation sites and an overlapping sequence. These fragments are fused together to form a new double-stranded template DNA in a subsequent extension reaction, in which the 3′ overlap of each strand serves as a primer for extending the 3′-end of the complementary strand. Finally, the resulting mutant DNA is generated from the fused product by PCR using external oligonucleotide primers (Luo et al., 2012). Compared with the QuikChange™ method, traditional OE-PCR is well suited to Multi-SDM on a linear template. However, it is necessary to remove the wild-type DNA template and residual primers after the first amplification (Peng et al., 2006). This purification step is time-consuming and expensive. In order to avoid the purification step, Tian et al. (2010) developed a method based on OE-PCR and DpnI digestion. In this method, a circular plasmid carrying the targeted gene, rather than linear DNA, is used as the template. The key point of this method is that the template plasmid must be methylated by Dam methyltransferase or extracted from dam+ E. coli (Tian et al., 2010). Despite continuous efforts to improve OE-PCR, some defects remain, such as the difficulty of manipulating large DNA fragments (i.e. >7 kb).
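The fusion step of OE-PCR can be mimicked in silico with a short Python sketch. It works on one strand only and requires a perfect overlap; the function name, the minimum overlap of 15 nt, and the example fragments are assumptions for illustration and do not come from the cited protocols.

```python
def fuse_by_overlap(left: str, right: str, min_overlap: int = 15) -> str:
    """Fuse two fragments that share an identical end/start overlap.

    The longest overlap of at least `min_overlap` bases between the 3' end
    of `left` and the 5' start of `right` is used, and it is kept only once
    in the fused product, as happens in the overlap-extension reaction.
    """
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    raise ValueError("fragments share no overlap of the required length")


# Two made-up fragments; the shared 15-nt region would carry the mutation
# introduced by the internal primers.
overlap = "ACGTACGTAGCTAGC"
fragment_a = "ATGAAACCCGGGTTT" + overlap
fragment_b = overlap + "TTTGGGCCCAAATAA"
print(fuse_by_overlap(fragment_a, fragment_b))
```

For Multi-SDM with more than two fragments, the same function can be applied repeatedly from left to right, which corresponds to fusing several mutation-embedding fragments before the final amplification with external primers.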

2.2.4 Multi-SDM by multiple rounds PCR (MR-PCR)

PCR-based mutagenesis protocols have been widely used in recent years to substitute one amino acid for another. Several protocols have been developed to achieve the desired mutation using PCR technology (Ling and Robinson, 1997). In MR-PCR, the mutations are introduced into the gene by multiple rounds of PCR with multiple mutagenic oligonucleotides as primers (Meetei and Rao, 1998). In contrast to the aforementioned three methods, this method does not require DNA ligase or 5′ phosphorylation of mutagenic primers, which is concerned with annealing efficiency of the mutagenic primer and 5′ to 3′ exonuclease activity of thermostable DNA polymerase, and it needs to subclone the PCR product each time (Meetei and Rao, 1998). One of the key points of this method is to have two templates, i.e. “A” and “B”. In addition to the gene or part of the targeted gene, “A” has an extra flanking sequence, which is used as a binding site for one of the flanking primers (S1), whereas “B” does not have this sequence. “B” either has a different flanking fragment or lacks the flanking sequence with few mismatches at the end of the flanking fragment (Barettino et al., 1994; Boles and Miosga, 1995). It should be noted that template “A” must be removed by gel electrophoresis after the first round of PCR.

2.2.5 Multi-SDM by primer extension and ligation (PEL)

As described previously, multiple mutagenic oligonucleotides are used as primers to execute Multi-SDM by MR-PCR (Tu and Sun, 1996; Patel et al., 2009). However, the efficiency of mutagenesis is low (Patel et al., 2009). Moreover, these methods require two pairs of complementary primers to introduce one mutation (Seyfang and Jin, 2004). To avoid the use of two complementary primers and to overcome the shortcomings of MR-PCR, Seyfang and Jin (2004) developed a simple and rapid method for SDM, which can be used to achieve more than 10 mutations with up to 100% efficiency. The fragment with the mutations is synthesized by PEL, using T4 DNA polymerase for extension and T4 DNA ligase for ligation. The fragment is then amplified by high-fidelity PCR with specific tailed anchor primers, which contain a unique 25-nucleotide tail for subsequent mutational-fragment-specific PCR and three restriction endonuclease sites for subcloning the PCR product into different vectors. An overview of this method is given in the report of Seyfang and Jin (2004). In contrast to the above-mentioned methods, this method does not require the synthesis of single-stranded DNA as template and benefits from the use of only one antisense mutagenic primer for introducing one mutation (Seyfang and Jin, 2004). It also needs only a single round of PCR to introduce all of the mutations.

2.2.6 Multi-SDM by amplification, ligation, and suppression PCR (MALS)

In contrast to other PCR-based strategies for Multi-SDM, Fushan and Drayna (2009) developed a rapid and efficient method for the introduction of a wide variety of mutations. The type of mutation generated depends on the design of the internal oligonucleotides, and the desired mutation is located at the 5′ end of one or both internal oligonucleotides (Fushan and Drayna, 2009). An outline of this MALS-mediated strategy is shown in Fig. 4a. The novelty of this method is that the mutation is introduced using SDM in combination with suppression PCR. The procedure consists of sequential rounds, and each individual round requires PCR amplification of the target DNA with two pairs of non-overlapping primers (Fushan and Drayna, 2009). Three types of molecules, i.e. homomeric ligation products, non-ligated molecules, and heteromeric ligation products, are found in the ligation products (Fig. 4b). Owing to its ability to prevent PCR amplification (Diatchenko et al., 1996; Dai et al., 2007; Liew et al., 2015), suppression PCR is used to select the desired mutant molecules from the other types of molecules. It is important to note that the mutagenic primers must be phosphorylated at the 5′ end.

Fig. 4

Outline of the MALS-mediated strategy (a) and the three types of molecules obtained after mixing and ligation of two PCR products generated with SO1/IR1 and IF1/SO2 as primers, respectively (b) (Fushan and Drayna, 2009)

Black lines represent targeted gene; bright-green and blue lines represent homologous arms; red lines represent mutant sites; Ⓟ represents phosphate groups (Note: for interpretation of the references to color in this figure legend, the reader is referred to the web version of this article)

2.3 Strategies to design SDM primers

The efficiency of oligonucleotide-mediated SDM is greatly affected by the design of the mutagenic oligonucleotide primers and the quality of the subsequent primer-template annealing. At present, the design of primers is based on an optimized strategy using partially or completely overlapping primers (Zheng et al., 2004; Xu et al., 2013). The design of SDM primers has therefore become even more important in SDM, and especially in Multi-SDM (Seyfang and Jin, 2004). In order to distinguish the mutant sequence from the wild-type gene, the main strategy is to introduce new restriction sites or remove existing ones, and thereby add a unique identifier to each mutation during the design of the SDM primers. However, it is hard to distinguish different point mutations of the same wild-type gene (Karnik et al., 2013). Therefore, a unique restriction site (also known as a “silent” restriction site) specified for each mutation should be used as a simple and reliable identifier. There are four primer design programs that allow the introduction of a “silent” restriction site during SDM primer design, i.e. SILMUT (Shankarappa and Vijayananda, 1992), Primer Generator (Turchin and Lawler, 1999), SiteFind (Evans and Liu, 2005), and SDM-Assist (Karnik et al., 2013). SDM-Assist overcomes the limitation of Primer Generator, which only runs in a DOS-based, text-entry-intensive interface (Turchin and Lawler, 1999), and that of SiteFind, which only allows the input of about 400 bp of DNA (Evans and Liu, 2005). Moreover, SDM-Assist can be used to manipulate the input sequence in a user-friendly format and to help with the design of the primers and the calculation of their thermodynamic parameters (Karnik et al., 2013).
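The notion of a “silent” restriction site can be illustrated with a minimal Python sketch that searches a coding sequence for single-codon synonymous changes which create a chosen recognition site. This is not the algorithm of SILMUT, Primer Generator, SiteFind, or SDM-Assist; the function name, the restriction to one codon change at a time, and the example sequence are assumptions for illustration. The standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def silent_site_options(cds: str, site: str) -> list:
    """List single-codon synonymous changes that create `site` in `cds`.

    `cds` must be in frame (length a multiple of 3) and must not already
    contain `site`, so that the new site can act as a unique identifier.
    Each result is (codon_index, old_codon, new_codon, mutant_sequence).
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if site in cds:
        raise ValueError("site is already present in the wild-type sequence")
    options = []
    for start in range(0, len(cds), 3):
        codon = cds[start:start + 3]
        for alternative, amino_acid in CODON_TABLE.items():
            # Keep only different codons encoding the same amino acid.
            if alternative == codon or amino_acid != CODON_TABLE[codon]:
                continue
            mutant = cds[:start] + alternative + cds[start + 3:]
            if site in mutant:
                options.append((start // 3, codon, alternative, mutant))
    return options


# Made-up coding sequence (Met-Glu-Phe-Lys) and the EcoRI site GAATTC.
print(silent_site_options("ATGGAATTTAAA", "GAATTC"))
# [(2, 'TTT', 'TTC', 'ATGGAATTCAAA')]
```

Here the TTT to TTC change keeps phenylalanine but creates an EcoRI site, so the mutant can be told apart from the wild type by a simple digest.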

3 Strategies for gene inactivation

SDM can be used to delete bases, but such deletions are small, no more than 170 bp (Qi and Scholthof, 2008; Wan et al., 2012; Wu et al., 2013). This is in stark contrast to gene knockout, which deletes a gene completely or partly, up to 50 kb, in order to inactivate its function (Sawitzke et al., 2013). Gene inactivation plays an important role in revealing the function of specific genes and in developing novel variants (Wu et al., 2013). With its sequence interrupted, the target gene would be translated into a nonfunctional protein, which leads to phenotypic modulation in most cases. This strategy is very useful for classical genetic research and for modern biotechniques including functional genomics (Sawitzke et al., 2013). In general, gene inactivation is based on homologous recombination, which replaces the existing autologous gene with a designed heterologous gene. As the field has developed, new techniques, such as insertional inactivation, have gradually been introduced.

3.1 Gene inactivation based on homologous recombination

Homologous recombination-mediated gene inactivation is an in vivo genetic modification. In contrast to in vitro genetic modification, it is not limited by the location of the enzyme digestion sites and the construction of base-pairs (Sawitzke et al., 2013). It is executed by introducing recombinant DNA (plasmid or linear DNA) containing selectable marker(s), counter-selectable marker(s), and homologous arm(s) into cells (Datsenko and Wanner, 2000; Yu and Ellis, 2000; Muyrers et al., 2001; Nakashima and Miyazaki, 2014). Recombinant DNA can be constructed in vitro and then introduced into cells. Homologous recombination occurs through homologous arm(s), and the recombination strains are selected according to the selectable marker(s) and counter-selectable marker(s).
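The outcome of recombination through two homologous arms can be sketched in Python as a simple string operation. This models only the final product of a double crossover, not the mechanism, the markers, or the selection steps; the function name and the example sequences are hypothetical and chosen for illustration.

```python
def replace_by_double_crossover(chromosome: str, left_arm: str,
                                right_arm: str, cargo: str = "") -> str:
    """Return the chromosome after exchange between two homologous arms.

    The region lying between `left_arm` and `right_arm` on the chromosome is
    replaced by `cargo` (for example a marker gene). An empty `cargo` gives
    a clean deletion of the region.
    """
    left_start = chromosome.find(left_arm)
    if left_start < 0:
        raise ValueError("left homologous arm not found")
    region_start = left_start + len(left_arm)
    region_end = chromosome.find(right_arm, region_start)
    if region_end < 0:
        raise ValueError("right homologous arm not found downstream of left arm")
    return chromosome[:region_start] + cargo + chromosome[region_end:]


# Made-up sequences: the stretch of G's stands for the target gene.
left, right = "ACGTACGT", "TGCATGCA"
chromosome = "CC" + left + "GGGGGG" + right + "CC"
print(replace_by_double_crossover(chromosome, left, right))            # deletion
print(replace_by_double_crossover(chromosome, left, right, "ATGAAA"))  # replacement
```

Because the edit is defined only by the arm sequences, it is independent of restriction sites in the target region, which is the advantage over in vitro modification noted above.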

3.1.1 Plasmid-mediated homologous recombination

Research on plasmid-mediated homologous recombination has provided successful examples that use intermolecular recombination between two circular molecules (Poustka et al., 1984; Philippe et al., 2004; Xu et al., 2013; 2014b; 2015). In this method, the host cell must have a recA+ genetic background, and the approach is therefore called a RecA-dependent strategy. An outline is shown in Figs. 5a and 5b. Before the first round of homologous recombination, a dedicated plasmid with several components must be constructed. These components include: (1) selectable marker(s) and counter-selectable marker(s); (2) long homologous arm(s); and (3) controlling element(s). To avoid leaving selectable markers or scars in the chromosome, the most useful tools are suicide plasmids (Philippe et al., 2004). Three types of suicide plasmids have been used to construct recombinant E. coli or C. glutamicum strains: temperature-sensitive plasmids, e.g. pSC101 and its derivatives (Cornet et al., 1994); plasmids carrying the replication origin of R6K, such as pCVD441 and its derivatives (Donnenberg and Kaper, 1991); and plasmids carrying the replication origin of pMB1, such as pK18mobsacB and its derivatives (Schäfer et al., 1994). These systems involve a two-step procedure: the recombinant targeted plasmid integrates into the targeted sequence by the first round of homologous recombination, then excises from the chromosome via the second round. Excision of the recombinant plasmid from the chromosome is selected for with counter-selectable markers, because the cell would die in the presence of a counter-selectable compound if the plasmid remained in the chromosome (Reyrat et al., 1998). There are three major counter-selectable markers: (1) the fusaric acid-sensitivity system (Maloy and Nunn, 1981); (2) the streptomycin-sensitivity system (Dean, 1981); (3) the sucrose-sensitivity system (Gay et al., 1983).

Fig. 5 Schematic diagram of gene inactivation based on plasmid-mediated homologous recombination (a, b) and linear DNA-mediated homologous recombination (c, d)

sm represents selectable marker; csm represents counter-selectable marker; ce represents controlling element; A represents the autologous A gene; A-L represents the left homologous arm of the A gene; A-R represents the right homologous arm of the A gene; B represents the extraneous B gene; C represents the C gene; P1 and P2 represent the primers designed according to the sequence of the antibiotic marker
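The two-step integration and excision procedure described above can be sketched as a small selection model. This is only an illustrative sketch: it assumes a suicide plasmid carrying a kanamycin resistance gene (kan) as the selectable marker and sacB as the sucrose-sensitivity counter-selectable marker, and the names `Strain`, `grows`, and `second_crossover_outcomes` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Strain:
    allele: str               # "wild-type", "mutant", or "both" (integrant)
    plasmid_integrated: bool  # True after the first crossover


def grows(strain: Strain, kanamycin: bool, sucrose: bool) -> bool:
    """Return True if the strain is expected to form colonies on the plate.

    Assumption: the integrated plasmid carries kan (resistance) and
    sacB (lethal in the presence of sucrose).
    """
    if kanamycin and not strain.plasmid_integrated:
        return False  # no kan gene, so kanamycin prevents growth
    if sucrose and strain.plasmid_integrated:
        return False  # sacB still present, so sucrose is lethal
    return True


def second_crossover_outcomes(integrant: Strain) -> List[Strain]:
    """Excision either restores the wild type or leaves the mutant allele."""
    if not integrant.plasmid_integrated:
        raise ValueError("excision requires an integrated plasmid")
    return [Strain("wild-type", False), Strain("mutant", False)]
```

In this model, first-crossover integrants are recovered on kanamycin, and second-crossover strains are recovered on sucrose. Both excision outcomes survive the counter-selection, so the two must still be distinguished afterwards, for example by PCR or sequencing of the target locus.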

Many studies have reported that E. coli can be genetically modified to produce various amino acids and organic acids by plasmid-mediated homologous recombination (Zhou et al., 2003; Causey et al., 2004; Imaizumi et al., 2005; Zhang et al., 2007). At present, as far as we are aware, gene inactivation in the C. glutamicum genome is mainly carried out by plasmid-mediated homologous recombination. Over the past decades, many researchers have dedicated themselves to developing new methods of plasmid-mediated homologous recombination to overcome the defects of the existing method (Poustka et al., 1984; Hamilton et al., 1989; Kato et al., 1998; Philippe et al., 2004). However, problems remain: the requirement for long homologous arm(s), the time-consuming procedure, low success rates, and the limitation of the recombinogenic window (Muyrers et al., 2001).

3.1.2 Linear DNA-mediated homologous recombination

An alternative homologous recombination strategy was developed in which recombination is performed by introducing recombinant linear DNA rather than a plasmid. In addition, this strategy is not dependent on RecA; instead, it is mediated by phage-encoded recombination proteins, either RecE/RecT from the Rac prophage (Zhang et al., 1998) or Redα (Exo, encoded by exo), Redβ (Beta, encoded by bet), and Redγ (Gam, encoded by gam) from λ phage (Datsenko and Wanner, 2000; Yu and Ellis, 2000). RecE and Exo are 5′→3′ double-stranded DNA (dsDNA)-specific exonucleases, and are required for dsDNA recombination. RecT and Beta are single-stranded DNA (ssDNA) annealing proteins, and are the central recombinases in homologous recombination. Gam is not absolutely required for recombination but promotes dsDNA recombination because it inhibits the host RecBCD exonuclease, which would otherwise digest linear dsDNA (Sawitzke et al., 2013).

An outline of this linear DNA-mediated strategy is shown in Figs. 5c and 5d. This strategy possesses the advantages inherent in the use of linear targeting DNA and thus increases the success rate. A linear targeting DNA carries two short homologous arms (35–60 bp) flanking a selectable gene. Such short homologous arms can be made by oligonucleotide synthesis rather than by PCR, which shortens the operation time. In addition, the linear targeting DNA is obtained by PCR, so no dedicated plasmid needs to be constructed (Muyrers et al., 2001). In particular, PCR products carrying both selectable and counter-selectable gene(s) can be used to obtain recombinants that retain no marker gene. However, unintended additional homologous recombination may occur in the targeting DNA, which limits the recombinogenic window. In addition, a residual scar, such as a flippase (FLP) recognition target (FRT) site, could interfere with subsequent chromosomal rearrangements in FLP-promoted recombination events (Datsenko and Wanner, 2000).
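The construction of a linear targeting DNA with short homologous arms can be illustrated by a primer design sketch. It is a minimal example under stated assumptions, not a published protocol: each primer is taken to consist of a homology arm (35–60 nt, as above) followed by a priming sequence that anneals to the marker cassette template, and the function names and the priming sequences supplied by the user are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def design_targeting_primers(genome: str, target_start: int, target_end: int,
                             fwd_priming: str, rev_priming: str,
                             arm_len: int = 50):
    """Build the two primers used to amplify a linear targeting DNA.

    genome       : top-strand sequence of the chromosome (5'->3')
    target_start : 0-based index of the first base to be replaced
    target_end   : 0-based index one past the last base to be replaced
    fwd_priming  : primer part annealing to one end of the marker cassette
    rev_priming  : primer part annealing to the other end (written 5'->3')
    arm_len      : length of each homologous arm (35-60 nt)
    """
    if not 35 <= arm_len <= 60:
        raise ValueError("arm_len should be between 35 and 60 nt")
    if target_start < arm_len or target_end + arm_len > len(genome):
        raise ValueError("not enough flanking sequence for the arms")

    left_arm = genome[target_start - arm_len:target_start]
    right_arm = genome[target_end:target_end + arm_len]

    # The forward primer reads along the top strand; the reverse primer
    # reads along the bottom strand, hence the reverse complement.
    forward = left_arm + fwd_priming
    reverse = reverse_complement(right_arm) + rev_priming
    return forward, reverse
```

Amplifying the marker cassette with these two primers would give a PCR product in which the selectable gene is flanked by the two homologous arms, which corresponds to the linear targeting DNA outlined in Figs. 5c and 5d.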

3.2 Gene inactivation based on insertional inactivation

Insertional inactivation has been used for many years to inactivate prokaryotic (such as E. coli and C. glutamicum) and eukaryotic (such as yeast) genes. Several methods have been reported that can generate the desired gene-inactivation mutants by inserting foreign sequences into a target gene. The first method is based on allelic exchange. The targeting fragment with a selectable marker is constructed in vitro and then transferred into the host cell. The artificial fragment replaces the targeted wild-type gene by double-crossover homologous recombination (Tilly et al., 2000; Xu et al., 2014c). The key aspect of this method is that it requires a gene transfer mechanism and a selectable marker (Tilly et al., 2000). The second method is carried out by plasmid integration. The plasmid for integration must contain a selectable marker and a part of the targeted gene, and the integration can be reversed by a single-crossover recombination. Moreover, it is hard to maintain the selection for the plasmid, and plasmid integration can also occur without mutation at the insertion site (Tilly et al., 2000).

Unlike the first two methods, which depend on homologous recombination, the third method used for gene inactivation is random insertion into the genome using a transposable element (Berg and Berg, 1996). This method can easily generate large-scale disruption of the genome (Judson and Mekalanos, 2000; Suzuki et al., 2006). Transposons are mobile genetic elements that can transpose from one location in a genome to another (Judson and Mekalanos, 2000). Transposon insertions can alter the regulation and expression of genes, so their application has provided valuable functional genomic information to complement genome-sequencing projects (Kim, 2015). However, the drawbacks of this method are that the transposon insertion sites are difficult to identify and that complex procedures are required (Suzuki et al., 2006). To overcome these drawbacks, Suzuki et al. (2006) developed a method that simply identifies the transposon insertion sites from the sequences of thermal asymmetric interlaced (TAIL)-PCR products of mutant cells by using BLAST and Perl.
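The idea of locating an insertion site from a junction sequence can be sketched as follows. This is not the pipeline of Suzuki et al. (2006): exact string matching is used here as a simple stand-in for a BLAST search, the read is assumed to run from a known transposon end into the flanking genomic sequence, and the function names and the `min_flank` parameter are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def locate_insertion(read: str, transposon_end: str, genome: str,
                     min_flank: int = 20):
    """Locate the genomic base adjacent to the transposon end.

    Returns (position, strand), where position is the 0-based index of the
    genomic base next to the transposon end and strand is "+" or "-".
    Returns None if the junction or the flank cannot be found.
    """
    junction = read.find(transposon_end)
    if junction == -1:
        return None  # the read does not contain the transposon end

    # Everything after the transposon end is treated as genomic flank.
    flank = read[junction + len(transposon_end):]
    if len(flank) < min_flank:
        return None  # flank too short to be placed reliably

    # Exact match on the top strand (stand-in for a BLAST hit).
    pos = genome.find(flank)
    if pos != -1:
        return pos, "+"

    # Otherwise try the bottom strand.
    pos = genome.find(reverse_complement(flank))
    if pos != -1:
        return pos + len(flank) - 1, "-"
    return None
```

A real analysis would need to tolerate sequencing errors and repeated sequences, which is why an alignment tool such as BLAST is used in practice rather than exact matching.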

In summary, all three of the above methods of insertional inactivation require a usable selectable marker. Their outline is similar to that of homologous recombination-mediated gene inactivation, except that a selectable marker remains in the genome.

4 Strategies for gene over-expression

Plasmid-mediated structural gene over-expression is a common strategy in the genetic modification of E. coli and C. glutamicum (Ikeda and Katsumata, 1998). Plasmids generally carry genetic markers, such as a drug resistance marker, which are used for high-efficiency screening of target transformants (Xu et al., 2014c). However, plasmid-mediated gene over-expression has several disadvantages, such as plasmid instability and a negative effect on cell growth. Although Hu et al. (2014) constructed a novel plasmid-based expression system for gene amplification, it still introduces an antibiotic resistance gene into the cell. In past decades, the construction of integrative plasmids carrying a homologous chromosomal fragment and a drug resistance marker has made it possible to insert a gene into the chromosome. The homologous fragment carried by the plasmid is a sequence present in multiple copies in the chromosome, such as IS13869 (Amador et al., 2000) or an rRNA gene (Correia et al., 1996). The integration is carried out by single-crossover recombination between the integrative plasmid and the recipient genome within the homologous region (Ikeda and Katsumata, 1998). In addition, Suzuki et al. (2006) reported a new Cre/mutant lox system for integrating heterologous genes into the C. glutamicum genome. However, the host cell must contain a mutant lox site, and the integrated foreign gene must carry a selectable marker, such as an antibiotic resistance gene (Suzuki et al., 2006). Recently we developed a method for the simultaneous replacement of a targeted gene by a given gene cassette, leaving no genetic markers (Fig. 6) (Xu et al., 2014c). However, the methods mentioned above have only been used to modify the C. glutamicum chromosome. Although gene replacement is one of the easiest procedures for gene manipulation in E. coli, there is no method for gene over-expression by insertion into the E. coli chromosome. A possible reason is that it is hard to screen the target transformants.

Fig. 6 Strategy used for genetic manipulation in Corynebacterium glutamicum (Xu et al., 2014c)

A represents the autologous A gene; A-R represents the right arm of the A gene; A-L represents the left arm of the A gene; Ptac-B-rrnBT1T2 represents the insertional cassette; Ptac represents the tac promoter; B represents the extraneous B gene; rrnBT1T2 represents the rrnBT1T2 terminator; kan represents the kanamycin resistance gene; sacB represents the levansucrase-encoding gene
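The net result of replacing a targeted gene by a cassette, with no marker left behind, can be sketched at the sequence level. This is a simplified illustration and not the procedure of Xu et al. (2014c): it models only the final chromosome after both crossovers, and the function name and toy sequences are hypothetical.

```python
def replace_between_arms(chromosome: str, left_arm: str, right_arm: str,
                         cassette: str) -> str:
    """Replace the sequence between two homologous arms with a cassette.

    The arms themselves are kept, so the cassette (for example a
    promoter-gene-terminator unit) ends up flanked by A-L and A-R.
    """
    left = chromosome.find(left_arm)
    if left == -1:
        raise ValueError("left arm not found in chromosome")
    start = left + len(left_arm)  # first base after the left arm

    end = chromosome.find(right_arm, start)  # right arm must lie downstream
    if end == -1:
        raise ValueError("right arm not found downstream of left arm")

    return chromosome[:start] + cassette + chromosome[end:]
```

For example, with the toy chromosome `"CCAAAAGGGGTTTTCC"`, arms `"AAAA"` and `"TTTT"`, and cassette `"ACGT"`, the region `"GGGG"` is replaced by the cassette while both arms are retained.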

5 Conclusions and perspectives

Over the last several years, genetic engineering techniques have been widely employed in research and development for strain improvement. Genetic engineering is the name of a group of techniques used for direct genetic modification of organisms, such as SDM, gene inactivation, and gene over-expression (Montaldo, 2006). The strategies presented here outline how precise DNA modifications, including SDM, gene inactivation, and gene over-expression, can be made to DNA molecules of any size in E. coli and C. glutamicum. This review sums up the latest methods used for genetically modifying the E. coli and C. glutamicum genomes, and discusses the technical problems of multi-layered DNA manipulations. In particular, the key factors for performing the DNA manipulation have been described for each method. This review shows that many efficient methods are available for genetically modifying the E. coli and C. glutamicum genomes. The choice among them depends greatly on the experimental needs and the resources of the researchers.

It should be noted that, as far as we are aware, methods for gene over-expression from the genome are very rare, especially for E. coli. Plasmid-mediated structural gene over-expression is a common strategy in the genetic modification of E. coli and C. glutamicum (Ikeda and Katsumata, 1998). However, plasmid-mediated gene over-expression inevitably introduces a drug resistance marker, which raises concern over the spread of antibiotic resistance genes (Tauch et al., 2002). Considering the very broad interest generated by this subject, we can reasonably hope that strategies will be found to counteract this deficiency, and that new approaches will be developed. At the same time, it is crucial to construct stable high-yielding strains.

Compliance with ethics guidelines

Jian-zhong XU and Wei-guo ZHANG declare that they have no conflict of interest.

This article does not contain any studies with human or animal subjects performed by any of the authors.