Introduction

At a time when molecular cloning has become a routine laboratory technique, we thought it was important to provide readers with some cues for understanding the function and specificities of the different enzymes used to generate and manipulate nucleic acids. Over the past few years, the tremendous expansion of cloning techniques and applications has triggered enormous interest from laboratory suppliers. As a result, countless sources of enzymes are now available, and selecting the appropriate enzyme for a specific task may seem difficult to the novice.

Nucleic acids used for molecular cloning can be of natural or synthetic origin, and their length ranges from a few to several thousand nucleotides. Nucleic acids can be extensively manipulated in order to acquire specific characteristics and properties. Such manipulations include propagation, ligation, digestion, or addition of modifying groups such as phosphate or methyl groups. These modifications are catalyzed by polymerases, ligases, nucleases, phosphatases, and methylases, respectively. In this review, we provide a description of the main enzymes of each group, and explain their properties and mechanisms of action. Our goal is to give the reader a better understanding of the fundamental enzymatic activities that are used in molecular cloning.

DNA polymerases (DNA-dependent DNA polymerase, EC 2.7.7.7)

DNA polymerases are enzymes that catalyze the formation of polymers assembled from multiple structural units, the deoxyribonucleotide triphosphates (dNTPs). None of the DNA polymerases characterized thus far can direct de novo synthesis of a polynucleotide from individual nucleotides. DNA polymerases only add nucleotides to the 3′-OH end of a pre-existing primer containing a 5′-phosphate group. Primers are short stretches of RNA complementary to about 10 nucleotides of DNA at the 5′ end of the molecule to replicate. Primers are synthesized by an RNA polymerase called primase.
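
The primer requirement described above can be illustrated with a minimal Python sketch. It assumes an idealized polymerase without proofreading, a perfectly annealed primer written in DNA letters for simplicity (natural primers are RNA), and a template written 3′→5′ so that it reads left to right alongside the primer. The function name `extend_primer` is hypothetical and is not taken from any library.

```python
# Toy model of template-directed primer extension (illustrative only).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def extend_primer(template_3to5: str, primer_5to3: str) -> str:
    """Return the primer extended 5'->3' up to the end of the template.

    template_3to5: template strand written 3'->5'.
    primer_5to3:   primer written 5'->3', annealed at the template's 3' end.
    """
    if not primer_5to3:
        # No de novo synthesis: a pre-existing 3'-OH end is required.
        raise ValueError("a primer is required; no de novo synthesis")
    annealed = template_3to5[:len(primer_5to3)]
    if any(COMPLEMENT[t] != p for t, p in zip(annealed, primer_5to3)):
        raise ValueError("primer is not complementary to the template")
    # Add one complementary nucleotide at a time to the 3'-OH end.
    product = primer_5to3
    for base in template_3to5[len(primer_5to3):]:
        product += COMPLEMENT[base]
    return product


print(extend_primer("TACGGATC", "ATG"))
```

Calling the function with an empty primer raises an error, which mirrors the inability of the enzyme to start a chain from free nucleotides.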

Most polymerases used in molecular biology originate from bacteria or their infecting viruses (bacteriophages or phages). We will only discuss prokaryotic polymerases in this review. The functional and structural properties of eukaryotic DNA polymerases, which are specific for chromatin-embedded DNA, are reviewed elsewhere (Frouin et al. 2003; Garg and Burgers 2005).

A brief description, requirements, enzymatic activities, and main applications of the DNA polymerases presented in this section are summarized in Table 1.

Table 1 Main DNA polymerases used in molecular biology

Prokaryotic DNA polymerases

In bacteria, three DNA polymerases act in concert to achieve DNA replication: Pol I, Pol II, and Pol III. All three enzymes catalyze 5′→3′ elongation of DNA strands in the presence of primer and dNTPs, but differ in elongation rate. DNA Pol III was discovered in 1970 (Kornberg and Gefter 1970) and is the main enzymatic complex driving prokaryotic replication. All three polymerases also have a 3′→5′ (reverse) exonuclease activity, otherwise known as proofreading activity because it removes incorrectly added bases as polymerization progresses. The proofreading activity increases fidelity but slows down polymerase progression. In addition to the polymerase and proofreading activities common to all three DNA polymerases, DNA Pol I also has a 5′→3′ exonuclease activity, which is used for the removal of the RNA primer from the 5′ end of DNA chains, and for excision-repair (upstream of polymerization). The 5′→3′ exonuclease activity of Pol I is utilized in vitro for nick-translation (i.e. a labeling technique in which some of the nucleotides of a DNA sequence are replaced with labeled analogues).

Pol I and Klenow fragment

The native DNA Pol I has been successfully used to remove 3′ protruding DNA ends (in the absence of dNTPs), or to fill in cohesive ends (in the presence of dNTPs) before addition of molecular linkers. However, the 5′→3′ exonuclease activity of DNA Pol I makes it unsuitable for applications that require polymerization activity alone (e.g. to fill in cohesive ends before addition of linkers, or to copy single-stranded DNA in the dideoxy method for sequencing). Fortunately, it was discovered that proteolytic digestion of E. coli DNA Pol I (109 kDa) generates two fragments (76 and 36 kDa). The large fragment, also known as DNA Pol IK or Klenow fragment (named after its discoverer; Klenow and Henningsen 1970), contains the 5′→3′ polymerase and 3′→5′ exonuclease (proofreading) activities of DNA Pol I, while the small fragment exhibits the 5′→3′ exonuclease activity alone. Since then, recombinant sources of Klenow have significantly improved the functional quality of this fragment by eliminating contamination by residual native enzyme in proteolytically treated preparations.

T4 DNA polymerase

Bacteriophage T4 polymerase requires a template and a primer to exhibit two activities: it is a 3′→5′ (reverse) exonuclease in the absence of dNTPs, and a 5′→3′ polymerase in the presence of dNTPs. Unlike E. coli DNA Pol I, T4 DNA Pol does not exhibit a 5′→3′ exonuclease activity (Englund 1971). Therefore, T4 DNA Pol can be used instead of Klenow to fill in 5′-protruding ends of DNA fragments, for nick translation, or for labeling 3′ ends of duplex (double-stranded) DNA. The exonuclease rate of T4 DNA Pol is approximately 40 bases per minute on double-stranded DNA, and about 4,000 bases per minute on single-stranded DNA. Its polymerization rate reaches 15,000 nucleotides per minute when assayed under standardized conditions.

Modified bacteriophage T7 DNA polymerase

A chemically modified phage T7 DNA polymerase has been described by Richardson et al. as an ideal tool for DNA sequencing (Huber et al. 1987; Tabor et al. 1987; Tabor and Richardson 1987). Modified bacteriophage T7 DNA Pol is a complex of two proteins: the 84 kDa product of the T7 gene 5, and the 12 kDa E. coli thioredoxin (Mark and Richardson 1976; Modrich and Richardson 1975). The T7 gene 5 protein provides the catalytic properties of the complex, while thioredoxin tethers the T7 gene 5 protein to the primer-template, which allows the polymerization of thousands of nucleotides without dissociation, thereby increasing the efficiency of the T7 polymerase. Hence, modified T7 DNA Pol has a polymerization rate of more than 300 nucleotides per second, which is more than 70 times faster than that of AMV reverse transcriptase. Thus, this enzymatic complex can be used for preparation of radioactive probes and amplification of large DNA fragments.

Further characterization identified a 28-amino acid region (residues 118–145) as essential to T7 DNA Pol 3′→5′ exonuclease activity (Tabor and Richardson 1989). In vitro mutagenesis of the corresponding nucleotides in the T7 gene 5 resulted in complete elimination of the exonuclease activity, thereby increasing polymerase efficiency (9-fold) and spontaneous mutation rate (14-fold) (Tabor and Richardson 1989). The mutant T7 polymerase/thioredoxin complex, commercially available under the name Sequenase (United States Biochemical Corporation), is used for DNA sequencing because of its high efficiency and ability to incorporate nucleotide analogs (such as 5′-(α-thio)-dNTPs, dc7-GTP, or dITP, used to improve the resolution of DNA sequencing gels and to avoid gel compression resulting from base pairing).

Terminal deoxynucleotidyl transferase (DNA nucleotidyl-exotransferase, EC 2.7.7.31)

Terminal deoxynucleotidyl transferase (TdT), initially purified from calf thymus (Krakow et al. 1962), is a DNA polymerase that catalyzes the addition of a homopolymer tail to 3′-OH ends of DNA, in a template-independent manner. TdT is used in molecular biology for labeling DNA 3′ ends with modified nucleotides (such as ddNTP, DIG-dUTP, or radiolabeled nucleotides), for primer extension, or for DNA sequencing. It is also used in the TUNEL (TdT dUTP Nick End Labeling) assay for the demonstration of apoptosis (Gavrieli et al. 1992).

TdT requires a single-stranded DNA primer (at least three nucleotides long) in the presence of Mg2+, but can accept double-stranded DNA as a primer in the presence of cobalt ions (Roychoudhury et al. 1976). However, addition of Co2+ may result in a relaxation of the helical structure of the DNA, thereby allowing the tailing of internal nicks. Use of Mg2+ reduces this problem but results in a significant decrease of the tail length (approximately one-fifth of the length obtained in the presence of Co2+ at identical enzyme:DNA ratios). Hence, optimization of incubation conditions is critical to control specificity, reactivity, and activity of TdT.

Thermostable DNA polymerases

Although thermostable DNA polymerases were purified in the early seventies, considerable interest in them for molecular cloning arose with the development of the polymerase chain reaction (PCR) and the subsequent need for enzymes able to perform DNA synthesis at high temperature. Since thermostable polymerases are functional at high temperature, they can replicate DNA regions with high G/C content, similar to those frequently found in thermophilic organisms (sequences with high G/C content form secondary structures that need to be properly denatured in order to be efficiently copied).

Bst polymerase

Bst polymerase is a thermostable DNA Pol I that was isolated in 1968 (Stenesh et al. 1968) from the thermophilic bacterium Bacillus stearothermophilus (Bst), which proliferates between 39 and 70°C. Bst Pol I is active at an optimal temperature of 65°C, and is inactivated after 15 min incubation at 75°C. Bst Pol I possesses a 5′→3′ exonuclease activity and requires high Mg2+ concentration for maximum activity.

Protease digestion of Bst DNA Pol I generates two protein fragments. The large protein fragment of Bst DNA Pol I is thermostable, and thus very useful for sequencing reactions performed at 65°C to avoid problems due to hairpin formation. Like Klenow, the Large Fragment of Bst DNA Pol I shows a faster strand displacement than its full length counterpart. Recombinant Bst DNA Pol I is presently available from Epicentre Biotechnologies, which also commercializes the Large Fragment of rBst DNA Pol I under the name IsoTherm™.

Taq polymerase

Taq polymerase was first purified in 1976 (Chien et al. 1976) from a bacterium discovered 8 years earlier in the Great Fountain region of Yellowstone National Park (Brock and Freeze 1969). This bacterium, which thrives at 70°C and survives at temperatures as high as 80°C, was named Thermus aquaticus (T. aquaticus or Taq). Taq polymerase has a half-life of 40 min at 95°C, and 5 min at 100°C, which makes it suitable for PCR. For optimal activity, Taq polymerase requires Mg2+ and a temperature of 80°C.

Taq polymerase lacks 3′→5′ exonuclease (proofreading) activity, and is therefore often described as a low-fidelity enzyme (error rate between 1 × 10^−4 and 2 × 10^−5 errors per base pair, depending on experimental conditions). Yet, one should keep in mind that this corresponds to quite good accuracy (the inverse of the error rate), since 45,000 nucleotides can be incorporated into newly synthesized DNA strands before an error occurs.
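
The relation between error rate and accuracy used above is simple arithmetic, and can be checked with a short Python sketch. The function name `accuracy` is hypothetical; the two error rates are the bounds quoted in the text.

```python
def accuracy(error_rate_per_base: float) -> float:
    """Accuracy is the inverse of the error rate: the average number of
    nucleotides incorporated per error."""
    if error_rate_per_base <= 0:
        raise ValueError("error rate must be positive")
    return 1.0 / error_rate_per_base


# Bounds of the Taq error rate quoted in the text (errors per base pair).
for rate in (1e-4, 2e-5):
    print(f"error rate {rate:.0e}: about {accuracy(rate):,.0f} nucleotides per error")
```

With these bounds, the accuracy ranges from about 10,000 to about 50,000 nucleotides per error, so the figure of 45,000 corresponds to the favorable end of the range.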

Like other DNA polymerases lacking 3′→5′ exonuclease activity, Taq polymerase exhibits a deoxynucleotidyl transferase activity that is accountable for the addition of a few adenine residues at the 3′-end of PCR products.

Later on, additional thermostable DNA polymerases were discovered and commercialized (see below). These enzymes have better accuracy than the original Taq. However, the term “Taq” is commonly used in place of “thermostable DNA polymerase”, hence the erroneous term “high fidelity Taq”, often used in laboratories.

Tth polymerase

Tth polymerase, isolated from Thermus thermophilus HB-8 (Ruttimann et al. 1985), is a 94 kDa thermostable polymerase lacking 3′→5′ exonuclease (proofreading) activity. Tth polymerase catalyzes the polymerization of nucleotides into duplex DNA from a DNA template (DNA polymerase) in the presence of Mg2+, and from an RNA template (reverse transcriptase) in the presence of Mn2+.

Thermostable DNA polymerases with proofreading activity

Several DNA polymerases isolated from thermophilic organisms exhibit 3′→5′ exonuclease (proofreading) activity, which increases fidelity. Among them, the DNA polymerase from the archaebacterium Pyrococcus furiosus (Pfu) has an error rate (1.6 × 10^−6) 10-fold lower than that of Taq polymerase. It can be purchased from numerous providers.

Pow polymerase (isolated from Pyrococcus woesei) has a half-life greater than 2 h at 100°C and a very low error rate of 7.4 × 10^−7. Pow polymerase provides high PCR yields only for templates shorter than 3.5 kb. To circumvent this limitation on PCR product size, a mixture of Taq and Pow DNA polymerases has been commercialized by Roche Molecular Chemicals (under the name “Expand high fidelity”). This system allows amplification and cloning of long stretches of DNA (20–35 kb), and represents a unique tool for analyzing and sequencing large eukaryotic genes, or introducing large DNA fragments in lambda phages or cosmid vectors.

Vent polymerase (also known as Tli polymerase, since it was isolated from Thermococcus litoralis) has an error rate between those of Taq and Pfu. Vent polymerase has a half-life of 7 h at 95°C. It is marketed by New England Biolabs, together with a modified version that lacks exonuclease (proofreading) activity.

Two DNA polymerases (Pol I and II) that exhibit a 3′→5′ exonuclease proofreading activity were also isolated from Pyrococcus abyssi (Gueguen et al. 2001). Pol I DNA polymerase from P. abyssi (Pab), marketed by Qbiogen as “Isis DNA Polymerase”, has an error rate (0.66 × 10^−6) similar to that of Pfu. However, Pab is more thermostable than Pfu or Taq (Pab has a half-life of 5 h at 100°C), and is therefore very useful for conducting PCR reactions that involve high temperature incubations.
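
To put the error rates quoted in this section side by side, the following Python sketch computes the expected number of errors in one newly synthesized strand. It is a rough illustration that assumes errors are independent and uniformly distributed along the template; the 1 kb template length and the function names are arbitrary choices made for the example, while the rates are those given in the text.

```python
# Error rates (errors per base) as quoted in the text.
ERROR_RATES = {
    "Taq (upper bound)": 1e-4,
    "Taq (lower bound)": 2e-5,
    "Pfu": 1.6e-6,
    "Pow": 7.4e-7,
    "Pab": 0.66e-6,
}


def expected_errors(error_rate: float, length_nt: int) -> float:
    """Expected number of errors in one newly synthesized strand."""
    return error_rate * length_nt


def fold_improvement(reference_rate: float, rate: float) -> float:
    """How many times lower `rate` is than `reference_rate`."""
    return reference_rate / rate


for name, rate in ERROR_RATES.items():
    # 1 kb template chosen arbitrarily for illustration.
    print(f"{name}: {expected_errors(rate, 1_000):.5f} expected errors per 1 kb copy")
```

Against the lower bound for Taq, `fold_improvement(2e-5, 1.6e-6)` gives a ratio of about 12 for Pfu, in line with the roughly 10-fold difference stated above.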

Most DNA polymerases and reverse transcriptases (described below) are commercially available in “Hot Start” (or equivalent) versions. The Hot Start modification is aimed at preventing non-specific priming events such as template/primer hybridization or primer dimer formation, which occur in low stringency conditions (during PCR preparation). Non-specific priming events generate secondary products during the preparation and in the first cycle of the PCR, and are further amplified in subsequent cycles. Hot Start technology introduces a “barrier” between secondary structures and the polymerase DNA-binding site. The “barrier”, which can be an antibody, an oligonucleotide, or a reversible chemical modification of the polymerase’s amino acids, is released from the enzyme during the first denaturation cycle of the PCR, thus restoring enzymatic activity only after denaturation of secondary structures. Hot Start modifications are very efficient in limiting polymerase activity at room temperature, and thereby facilitate PCR preparation while enhancing PCR specificity.

Reverse transcriptase (RNA-dependent DNA polymerase EC 2.7.7.49)

Until the 1960s, the transfer of genetic information was thought to flow unidirectionally from DNA to RNA (Crick 1958). Characterization of reverse transcriptase (RT) from Rous Sarcoma Virus (RSV) (Temin and Mizutani 1970) and Rauscher Leukaemia virus (RLV) (Baltimore 1970) demonstrated that single-stranded DNA could be synthesized from an RNA template. The resulting single-stranded DNA is a “complementary” copy of the RNA template, or cDNA.

The main characteristics of the reverse transcriptases discussed in this section are summarized in Table 2.

Table 2 Main reverse transcriptases used in molecular biology

Natural occurrences of reverse transcriptase

Viral reverse-transcriptases

RSV and RLV belong to the group of retroviridae (retroviruses). The life cycle of retroviruses includes an RT-directed transcription of their RNA genome and formation of a proviral double stranded molecule of DNA, which is integrated in the host genome. Infectious viral progeny is produced from transcription of the proviral DNA.

Many viruses contain a reverse transcription stage in their replication cycle.

Metaviridae are closely related to retroviruses and exist as retrotransposons in the eukaryotic host genome. Retrotransposons are mobile elements that amplify (multiply) through intermediate RNA molecules, which are reverse transcribed and integrated at new places in the host genome. Examples of metaviridae include the Saccharomyces cerevisiae Ty3 virus, Drosophila melanogaster gypsy virus, and Ascaris lumbricoides Tas virus.

Pseudoviridae (e.g. Saccharomyces cerevisiae Ty1 virus and Drosophila melanogaster copia virus) have a segmented single stranded RNA genome.

Hepadnaviridae (such as the hepatitis B virus (Seeger and Mason 2000)) have a genome made of two uneven strands of DNA that exist as stretches of both single- and double-stranded circular DNA. The viral polymerase, which catalyzes RNA- and DNA-dependent DNA synthesis, possesses both RNase H and protein priming activities. Upon infection, the relaxed circular DNA is converted into a circular DNA that is transcribed by the host RNA polymerase. This “pre-genomic” RNA is reverse-transcribed into cDNA by the viral RT to give rise to the genomic viral DNA strands.

Caulimoviridae are unenveloped viruses that infect plants, including cauliflower (cauliflower mosaic virus) and soybean (soybean chlorotic mottle virus). Upon infection, the polyadenylated viral RNA is reverse-transcribed by the viral RT to give rise to genomic double-stranded DNA molecules. Viral RNA is also used for viral protein synthesis in the cytoplasm of infected cells. Although the life cycle of caulimoviridae resembles that of retroviruses, replication of caulimoviridae does not involve the integration of proviral DNA in the host genome.

Non-viral reverse transcriptases

Transposons

Nearly all organisms contain variable amounts of repetitive mobile DNA known as transposable elements (TEs) or transposons. TEs constitute more than 80% of the total genome in plants, while they represent about 45% of the human genome. Finnegan’s classification (Finnegan 1989) distinguishes RNA transposons (Class I or retrotransposons), which amplify through a “copy and paste” type of transposition, and DNA transposons (Class II), which use a “cut and paste” type of transposition. However, the newer Wicker’s classification takes into account the recent discovery of bacterial and eukaryotic TEs that copy and paste without RNA intermediates, and of new miniature inverted repeat transposable elements (MITEs) (Wicker et al. 2007).

Retrotransposons, which make up to 42% of the human genome, encode an RT used to make a cDNA copy of an RNA intermediate, which is produced after transcription of the retrotransposon integrated in the genome.

Telomerase

Telomeres are non-coding, linear sequences that cap the ends of DNA molecules in eukaryotic chromosomes. Telomeres are made of up to 2,000 repeats of TTAGGG stretches. In normal cells, the DNA replication machinery is unable to duplicate the complete telomeric DNA. As a result, telomeres are shortened after every cell division. Having reached a critical length, telomeres are recognized as double strand break DNA lesions, and cells eventually enter senescence.

In embryonic and adult stem cells, which have an extended lifespan, telomere length is maintained through activation of telomerase. Telomerase is a ribonucleoprotein complex that contains an RNA (TElomerase RNA Component, or TERC) and an RT (TElomerase Reverse Transcriptase, or TERT). The RNA provides the AAUCCC template directing the synthesis of TTAGGG repeats by the RT. Interestingly, telomerase is also activated during carcinogenesis (Raynaud et al. 2008).
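
The correspondence between the RNA template and the telomeric repeat can be verified with a minimal Python sketch. It assumes that the template is written 3′→5′, as in the text, so that a base-by-base complement yields the DNA product in the 5′→3′ direction; the function names are hypothetical.

```python
# Base pairing used by a reverse transcriptase: RNA template -> DNA product.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(template_rna_3to5: str) -> str:
    """Return the DNA (5'->3') copied from an RNA template written 3'->5'."""
    return "".join(RNA_TO_DNA[base] for base in template_rna_3to5)


def telomere_repeats(template_rna_3to5: str, n_repeats: int) -> str:
    """Telomeric DNA obtained by copying the same template n_repeats times."""
    return reverse_transcribe(template_rna_3to5) * n_repeats


print(reverse_transcribe("AAUCCC"))
print(telomere_repeats("AAUCCC", 3))
```

Copying the template repeatedly is a simplification of the actual translocation cycle of the enzyme, which the sketch does not attempt to model.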

Reverse transcriptase for molecular biology

The use of RT in fundamental and applied molecular biology has been propelled by the introduction of the Reverse Transcription Polymerase Chain Reaction (RT-PCR). As a result, commercial sources of RT have flourished over the past two decades.

Because RTs are devoid of 3′→5′ exonuclease (proofreading) activity, they have much lower fidelity than DNA-dependent DNA polymerases. For instance, HIV RT has an error rate of 1 mutation per 1,500 nucleotides, whereas RTs of avian and murine origin generate 1 mutation per 17,000 and 30,000 nucleotides, respectively (Arezi and Hogrefe 2007; Roberts et al. 1998).
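
As a rough illustration of what these figures mean in practice, the Python sketch below estimates the average number of mutations expected in a single cDNA. The 3 kb cDNA length is an arbitrary assumption made for the example, and the function name is hypothetical; the fidelity values are those quoted above.

```python
# Nucleotides copied per mutation, as quoted in the text.
NT_PER_MUTATION = {"HIV RT": 1_500, "avian RT": 17_000, "murine RT": 30_000}


def expected_mutations(cdna_length_nt: int, nt_per_mutation: int) -> float:
    """Average number of mutations expected in one cDNA of the given length."""
    return cdna_length_nt / nt_per_mutation


for name, nt in NT_PER_MUTATION.items():
    # 3 kb is an assumed, illustrative cDNA length.
    print(f"{name}: about {expected_mutations(3_000, nt):.2f} mutations per 3 kb cDNA")
```

Under this assumption, a 3 kb cDNA made with HIV RT carries two mutations on average, whereas most copies made with the avian or murine enzymes are expected to be error-free.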

AMV/MAV RT

The RT most commonly used in molecular biology is the one that allows the replication of the Avian Myeloblastosis Virus (AMV), an alpha retrovirus that induces myeloblastosis in chickens (Baluda et al. 1983). Interestingly, cloning of the AMV genome identified the v-myb oncogene as responsible for intense proliferation of transformed myeloblasts. Insertion of v-myb oncogenic sequences in the AMV genome interrupts the coding sequence of the RT, thus making AMV RT-deficient. Like all defective viruses, AMV depends on a helper virus for its replication. The AMV helper, the Myeloblastosis Associated Virus (MAV), is in fact the “real” source of RT in the life cycle of AMV, and for molecular biology (Perbal 2008).

The AMV/MAV RT is composed of two structurally related subunits designated α and β (65 kDa and 95 kDa, respectively). The α subunit of the enzyme provides RT and RNase H activities. The RT activity requires the presence of a primer and a template (Leis et al. 1983; Verma 1977). AMV/MAV RT is widely used to copy total messenger RNAs using poly(dT) or random primers.

RNase H activity is generated by proteolytic cleavage of the α subunit and is associated with a 24 kDa fragment. RNase H is a processive exoribonuclease that specifically degrades RNA strands in RNA-DNA hybrids in either the 5′→3′ or the 3′→5′ direction.

Reverse transcriptase has found many applications in molecular cloning, well-documented examples being the synthesis of cDNA from RNA in the preparation of expression libraries, as a first step for quantitative PCR, and for nucleotide sequencing. Several protocols have established optimal conditions for high-yield reactions (Berger et al. 1983), or for the synthesis of cDNA from large RNA templates (Retzel et al. 1980).

MuLV RT

The pol gene of Moloney murine leukemia virus (M-MuLV) encodes an RT lacking DNA endonuclease activity, and exhibiting a lower RNase H activity than AMV/MAV RT (Moelling 1974). M-MuLV RT is 4 times less efficient, and at least 4 times less stable than AMV/MAV RT. Therefore, comparable yields of cDNA synthesis require six to eight times more M-MuLV RT than AMV/MAV RT. However, M-MuLV RT is able to generate longer transcripts than AMV/MAV RT when used in excess (Houts et al. 1979).

Thermostable reverse transcriptases

Several RTs identified in thermophilic organisms are commercially available and are useful in certain challenging conditions (e.g. to transcribe templates with high GC content or abundant secondary structures).

Thermus thermophilus (Tth) polymerase is a DNA polymerase (see above) that efficiently reverse-transcribes RNA in the presence of MnCl2. Upon chelation of the Mn2+ ions, its DNA polymerase activity allows for PCR amplification in the presence of MgCl2.

Epicentre’s MonsterScript™ RT is an Mg2+-dependent thermostable RT that lacks RNase H activity, and is fully active at temperatures up to 65°C. According to its manufacturer, this enzyme can produce cDNA larger than 15 kb.

The Klenow fragment of Carboxydothermus hydrogenoformans (C. therm.) polymerase is an Mg2+-dependent RT that is active at temperatures up to 72°C. It is marketed by Roche (C. therm. Mix) for RT-PCR uses.

Thermo-X™ RT from Invitrogen has a half-life of 120 min at 65°C, which is the highest stability reported so far.

RNA polymerases (DNA-dependent RNA polymerases)

DNA-dependent RNA polymerases catalyze the 5′→3′ elongation of RNA copies from DNA templates, a process called transcription. Like DNA polymerases that lack proofreading exonuclease activity, RNA polymerases can add an extra base at the end of a transcript.

The two main RNA polymerases used in molecular biology are SP6- and T7- RNA polymerases. SP6 RNA polymerase is a 96 kDa polypeptide purified from SP6 bacteriophage-infected Salmonella typhimurium LT2 (Butler and Chamberlin 1982), while T7 RNA polymerase is a 98 kDa polypeptide (Stahl and Zinn 1981) produced by the T7 bacteriophage (Chamberlin et al. 1970). RNA polymerases are used for in vitro synthesis of anti-sense RNA transcripts (Melton et al. 1984), production of labeled RNA probes, or for RNase protection mapping (Zinn et al. 1983).

SP6 and T7 RNA polymerases are similar enzymes: both require Mg2+ and a double-stranded DNA template, and are greatly stimulated by spermidine and serum albumin (Butler and Chamberlin 1982). In addition, T7 RNA polymerase typically requires the presence of dithiothreitol in the reaction buffer.

SP6 and T7 RNA polymerases are extremely promoter-specific and will transcribe any DNA sequence cloned downstream of their respective promoter (SP6 and T7). Importantly, SP6 or T7 RNA polymerase transcription proceeds through poly(A) stretches (Melton et al. 1984), and thus can progress around a circular template multiple times before dissociating. Therefore, linearization of the template prior to transcription (a blunt end or a 5′ overhang is recommended (Schenborn and Mierendorf 1985)) will guarantee efficient termination of transcription. The RNA produced under these conditions is biologically active (Krieg and Melton 1984) and can be properly spliced (Green et al. 1983).
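
The idea of run-off transcription from a linearized template can be sketched in a few lines of Python. The sketch assumes the commonly cited T7 promoter consensus TAATACGACTCACTATAG, with transcription starting at its final G; this sequence is not given in this review and should be checked against the vector documentation. The function name and the example insert are hypothetical.

```python
# Assumed T7 promoter consensus; the final G is taken as the first transcribed base.
T7_PROMOTER = "TAATACGACTCACTATAG"


def runoff_transcript(coding_strand_5to3: str, promoter: str = T7_PROMOTER) -> str:
    """Return the RNA made from a linearized template.

    The transcript starts at the last base of the promoter and runs off the
    end of the linear DNA; it has the sequence of the coding strand (U for T).
    """
    start = coding_strand_5to3.find(promoter)
    if start == -1:
        # The polymerase is promoter-specific: no promoter, no transcript.
        raise ValueError("promoter not found in template")
    plus_one = start + len(promoter) - 1  # index of the initiating G
    return coding_strand_5to3[plus_one:].replace("T", "U")


# Hypothetical linearized template: two flanking bases, promoter, short insert.
template = "CC" + T7_PROMOTER + "GGAGACCT"
print(runoff_transcript(template))
```

Because the transcript ends where the linear DNA ends, the position of the restriction site used for linearization defines the 3′ end of the RNA.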

Ligases

The main characteristics of the ligases discussed in this section are summarized in Table 3.

Table 3 Main ligases used in molecular biology

DNA ligases (EC 6.5.1.1 and 6.5.1.2 for ATP and NAD+ DNA ligases, respectively)

DNA ligases connect DNA fragments by catalyzing the formation of a phosphodiester bond between a 3′-OH and a 5′-phosphate group at a single-strand break in double-stranded DNA (Lehman 1974). In cells, DNA ligases are essential for joining Okazaki fragments during replication, and in the last step of the DNA repair process. DNA ligases are used in molecular biology to join DNA fragments with blunt or sticky ends such as those generated by restriction enzyme digestion, to add linkers or adaptors to DNA, or to repair nicks.

DNA ligases operate in a three-step reaction. The first step involves the creation of a ligase-adenylate intermediate, in which a phosphoamide bond is created between a lysine residue and one AMP molecule of the enzyme cofactor (ATP or NAD+). Second, the AMP is transferred to the 5′-phosphate end of the DNA nick to form a DNA-adenylate (AppDNA). Finally, a nucleophilic attack from the 3′ end of the DNA nick directed to the AppDNA results in joining of the two polynucleotides and release of AMP.
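
The three steps can be laid out as a small Python model. This is only a bookkeeping sketch of the order of events and of the chemical groups required at the nick; the class and function names are hypothetical and no kinetics are implied.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Nick:
    """Chemical groups present on either side of a single-strand break."""
    three_prime_oh: bool
    five_prime_phosphate: bool


def ligate(nick: Nick, cofactor: str) -> List[str]:
    """Return the ordered steps of the ligation reaction for a valid nick."""
    if cofactor not in ("ATP", "NAD+"):
        raise ValueError("cofactor must be ATP or NAD+")
    if not (nick.three_prime_oh and nick.five_prime_phosphate):
        raise ValueError("ligation needs a 3'-OH and a 5'-phosphate at the nick")
    return [
        f"step 1: ligase + {cofactor} -> ligase-AMP (lysine adenylate)",
        "step 2: AMP transferred to the 5'-phosphate -> AppDNA",
        "step 3: 3'-OH attacks AppDNA -> phosphodiester bond, AMP released",
    ]


for step in ligate(Nick(three_prime_oh=True, five_prime_phosphate=True), "ATP"):
    print(step)
```

A nick lacking the 5′-phosphate, for instance after phosphatase treatment, is rejected by the model, which reflects why dephosphorylated ends cannot be joined.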

Original observations suggested that bacterial DNA ligases use NAD+ as a cofactor whereas DNA ligases from eukaryotes, viruses and bacteriophages use ATP (Doherty and Wigley 1999; Timson and Wigley 1999). However, it is known now that some ligases can accept either cofactor, even though both cofactors are not equally efficient (Nakatani et al. 2000; Rolland et al. 2004).

Non-thermostable DNA ligases

The smallest known DNA ligases are the ATP-dependent DNA ligases from Chlorella virus and bacteriophage T7 (34 and 41 kDa, respectively). They are much smaller than eukaryotic DNA ligases, which can reach 100 kDa in size. While T7 and Chlorella virus-encoded ATP-dependent ligases both contain only a nucleotidyl-transferase domain and an OB-fold domain (Agrawal and Kishan 2003; Arcus 2002; Theobald et al. 2003), the eukaryotic ligases contain additional domains that include zinc fingers and BRCT (C-terminal portion of BRCA-1) domains with nuclear or mitochondrial localization signals (Martin and MacNeill 2002).

The ligase most frequently used in molecular biology is the bacteriophage T4 DNA ligase. T4 DNA ligase is a 68 kDa monomer that requires Mg2+ and ATP as cofactors. T4 DNA ligase can connect blunt and cohesive ends, or repair single-stranded nicks in duplex DNA, RNA, or DNA/RNA hybrids.

The E. coli DNA ligase works preferentially on cohesive double-stranded DNA ends. However, it is also active on blunt-ended DNA in the presence of Ficoll or polyethylene glycol. Hybrids such as DNA-RNA or RNA-RNA are not efficiently formed by E. coli DNA ligase. This can be used as an advantage when double-stranded DNA ligation is wanted and blunt end ligation needs to be avoided.

Because the activity of DNA ligases depends on several factors (such as temperature, fragment concentration, and the nature of the fragments: blunt or sticky ends, length of the sticky end, stability of the hydrogen-bonded structure), it is difficult to predict the ideal incubation conditions for a specific ligation. For instance, ligation of fragments generated by Hind III is 10–40 times faster than that of fragments generated with Sal I (even though both enzymes generate sticky ends). For these reasons, the definition of the ligase unit is very specific: one ligation unit is defined by most suppliers as the amount of enzyme that catalyzes 50% ligation of Hind III fragments of lambda DNA in 30 min at 16°C under standard conditions.

Most standard ligation protocols recommend incubating the ligation reaction overnight at 16°C. However, ligation protocols should be empirically optimized for every ligation and according to the amount of DNA present in the reaction (although most suppliers give guidelines according to reaction volume, not DNA concentration). Typically, successful ligations with T4 DNA ligase have been reported with incubations varying from 10 min at room temperature to 24 h at 4°C. In addition, it is good practice to verify the activity of a ligase preparation that has been kept at −20°C for a long period of time by performing a periodic ligation test. A typical test for ligation of sticky ends is performed with Hind III-digested λ DNA. For blunt end ligations, the same procedure can be used with Hae III-digested DNA.

Thermostable DNA ligases

Thermostable DNA ligases can perform ligation of duplex molecules and repair of single stranded nicks at temperatures ranging from 45 to 80°C. They are highly specific and are very well suited for applications that need high stringency ligations. Thermostable DNA ligases are isolated from diverse sources such as Thermus thermophilus (Takahashi et al. 1984), Bacillus stearothermophilus (Brannigan et al. 1999), Thermus scotoductus (Jonsson et al. 1994), and Rhodothermus marinus (Thorbjarnardottir et al. 1995).

Thermostable DNA ligases are usually not a substitute for T4 or E. coli DNA ligases but are used for very specific techniques such as the Ligase Chain Reaction (LCR). LCR is a technique used to detect single base mutations: a primer is synthesized in two fragments that cover both sides of a possible mutation. The thermostable ligase will connect the two fragments only if they match the template sequence exactly. Subsequent PCR reactions will amplify a product only if the primer has been ligated.
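
The discrimination step of LCR can be sketched in Python as a simple sequence check. The sketch makes a strong simplification: ligation is assumed to occur if, and only if, the two primer fragments anneal side by side with perfect complementarity, which approximates the behavior of the ligase under high stringency. All names and sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def lcr_ligates(template_5to3: str, fragment_a: str, fragment_b: str) -> bool:
    """True if fragment_a and fragment_b (both 5'->3') anneal next to each
    other on the template without any mismatch, so that the 3' end of
    fragment_a can be joined to the 5' end of fragment_b."""
    joined = fragment_a + fragment_b
    return reverse_complement(joined) in template_5to3


# Hypothetical example: the fragments match the wild-type template only.
frag_a, frag_b = "GATTACA", "CCGGTA"
wild_type = "AA" + reverse_complement(frag_a + frag_b) + "GG"
mutant = wild_type.replace("GGTGT", "GGCGT")  # single base change at the junction

print(lcr_ligates(wild_type, frag_a, frag_b))
print(lcr_ligates(mutant, frag_a, frag_b))
```

In this toy example the mutation lies opposite the 3′ end of the first fragment, which is where a mismatch is expected to interfere most directly with ligation.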

RNA ligases (EC 6.5.1.3)

RNA ligases catalyze the ATP-dependent formation of phosphodiester bonds between 5′-phosphate and 3′-OH termini of single stranded RNA or DNA molecules. In cells, they act mainly to reseal broken RNAs. Like DNA ligases, RNA ligases operate in a three-step reaction (Silber et al. 1972; Sugino et al. 1977). First, the RNA ligase reacts with ATP to form a covalent ligase-(lysyl-N)-AMP intermediate, and pyrophosphate. Then the AMP moiety of the ligase adenylate is transferred to the 5′-phosphate end of the RNA to form an RNA-adenylate intermediate (AppRNA). In the third step, a nucleophilic attack of the 3′-OH end of the RNA on the AppRNA creates a phosphodiester bond, which seals the two RNA ends.

T4 RNA ligases

The best characterized RNA ligase is the T4-bacteriophage RNA ligase (gp63), which was identified in T4-infected E. Coli. T4 RNA ligase is also the most commonly used RNA ligase in molecular biology. T4 RNA ligase 1 is used to ligate single stranded nucleic acids and polynucleotides to RNA molecules, usually to label RNA molecules at the 3′-end for RNA structure analysis, protein binding site mapping, rapid amplification of cDNA ends (RLM-RACE) (Liu and Gorovsky 1993; Maruyama and Sugano 1994), ligation of oligonucleotide adaptors to cDNA (Tessier et al. 1986; Zhang and Chiang 1996), oligonucleotide synthesis (Kaluz et al. 1995), 5′ nucleotide modifications of nucleic acids (Kinoshita et al. 1997), and for primer extension for PCR. T4 RNA ligase can also be used to circularize RNA and DNA molecules. This enzyme is used in Ambion’s FirstChoice™ RLM-RACE Kit for tagging the 5′ ends of mRNA with oligonucleotide adaptor.

A second RNA ligase encoded by the bacteriophage T4 has recently been described (Ho and Shuman 2002). T4 RNA ligase 2, also known as T4 Rnl-2 (gp24.1), catalyzes both intramolecular and intermolecular RNA strand ligation. Unlike T4 RNA ligase 1, T4 RNA ligase 2 is much more active in joining nicks in double stranded RNA than in joining the ends of single stranded RNA. T4 RNA ligase 2 ligates 3′-OH/5′-phosphate RNA nicks, and can also ligate the 3′-OH of RNA to the 5′-phosphate of DNA in a double stranded structure (Nandakumar et al. 2004; Nandakumar and Shuman 2004).

A truncated form of T4 RNA ligase 2 (truncated T4 RNL2, also known as RNL2 [1–249]) is commercialized by New England Biolabs. Truncated T4 RNL2, which comprises the first 249 amino acids of the full-length T4 RNA ligase 2 (Ho and Shuman 2002), is unable to perform the first adenylation step of the ligation reaction. Thus, the enzyme does not require ATP but does need a pre-adenylated substrate: it specifically ligates the pre-adenylated 5′ end of DNA or RNA to the 3′ end of RNA molecules (Ho and Shuman 2002; Nandakumar et al. 2004). Truncated T4 RNL2 reduces background ligation because it selects adenylated primers. This enzyme can be used for optimized linker ligation in the cloning of microRNAs (Aravin and Tuschl 2005; Pfeffer et al. 2005).

An RNA ligase coding frame has also been identified in the pnk/pnl gene (ORF 86) from the baculovirus Autographa californica nucleopolyhedrovirus (ACNV) (Durantel et al. 1998; Martins and Shuman 2004a) and from the radiation-resistant bacterium Deinococcus radiodurans (Dra) (Martins and Shuman 2004b). DraRnl ligates 3′-OH/5′-phosphate RNA nicks that can occur in either duplex RNA or in RNA/DNA hybrids. However, it cannot ligate nicks in DNA molecules, as it requires a ribonucleotidic 3′-OH.

Thermostable RNA ligases

Thermostable RNA ligases have been isolated from bacteriophages RM378 and TS2126, which infect the eubacterium Rhodothermus marinus and the bacterium Thermus scotoductus, respectively.

Phosphate transfer and removal

The principal characteristics of the enzymes used for phosphate transfer and removal in molecular biology discussed in this section are summarized in Table 4.

Table 4 Main enzymes used for phosphate transfer or removal in molecular biology

Alkaline phosphatase (EC 3.1.3.1)

Alkaline phosphatase is purified from either E. coli or higher organisms (e.g. calf intestine). It is used for removal of 5′-phosphate groups from nucleic acids in order to prevent recircularization of DNA vectors in cloning experiments. Alkaline phosphatase does not hydrolyze phosphodiester bonds.

Although E. coli and calf intestine alkaline phosphatases have different structures (80 and 140 kDa, respectively), both enzymes are zinc- and magnesium-containing proteins (Reid and Wilson 1971). They are inactivated by chelating agents such as EGTA, and by low concentrations of inorganic phosphate. Importantly, inorganic phosphate needs to be removed after restriction endonuclease cleavage and prior to incubation with alkaline phosphatase. Inorganic phosphate is removed if the cloning strategy involves gel migration and purification of the linearized vector. If no purification is performed, dialysis of the endonuclease-digested DNA is required prior to alkaline phosphatase treatment (ethanol precipitation does not efficiently remove inorganic phosphate). Similarly, when labeling DNA fragments at the 5′-end (e.g. 32P labeling), incubation with alkaline phosphatase needs to precede incubation with polynucleotide kinase in the presence of [γ-32P]ATP.

T4 Polynucleotide kinase (polynucleotide 5′-hydroxy kinase, EC 2.7.1.78)

Polynucleotide kinase catalyzes the transfer of a phosphate group from an ATP molecule to the 5′-OH terminus of a nucleic acid (DNA or RNA, with no size limitations), the exchange of 5′-phosphate groups (Berkner and Folk 1980; Chaconas and van de Sande 1980), or the phosphorylation of 3′-ends of mononucleotides. Encoded by the pse T gene of bacteriophage T4 (Depew et al. 1975), the T4 polynucleotide kinase is a tetramer of four identical 33-kDa monomers (Lillehaug 1977; Panet et al. 1973). Polynucleotide kinase is commonly used for labeling experiments with radiolabeled ATP utilized as a phosphate donor.

Phosphate transfer activity is optimum at 37°C, pH 7.6, in the presence of Mg2+ and reducing reagents (DTT or β-mercaptoethanol), and with a minimum of 1 mM ATP and a 5:1 ratio of ATP over 5′-OH ends (Lillehaug et al. 1976). When substrate concentration is limited, addition of 6% polyethylene glycol (PEG 8000) in the reaction mixture enhances radiolabeling of recessed, protruding, and blunt 5′-termini of DNA (Harrison and Zimmerman 1986).
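The two ATP requirements quoted above (at least 1 mM ATP, and at least a 5:1 molar ratio of ATP over 5′-OH ends) can be checked with a short calculation. The following Python sketch is only illustrative: the function names are hypothetical, and the average mass of 660 g/mol per base pair (330 g/mol per nucleotide) is a common approximation that is assumed here, not a value taken from the references above.

```python
def five_prime_ends_pmol(dna_ng, length, double_stranded=True):
    """Estimate the pmol of 5' ends in a sample of linear DNA.

    ASSUMPTION: average mass of 660 g/mol per base pair for duplex DNA
    and 330 g/mol per nucleotide for single strands.
    """
    mass_per_unit = 660.0 if double_stranded else 330.0
    # ng * 1000 / (g/mol) gives pmol of molecules
    pmol_molecules = dna_ng * 1000.0 / (length * mass_per_unit)
    ends_per_molecule = 2 if double_stranded else 1
    return pmol_molecules * ends_per_molecule


def kinase_atp_ok(atp_mM, reaction_ul, ends_pmol):
    """Check the two ATP conditions quoted in the text.

    1 mM equals 1 nmol/ul, so mM * ul * 1000 gives pmol of ATP.
    """
    atp_pmol = atp_mM * reaction_ul * 1000.0
    return atp_mM >= 1.0 and atp_pmol >= 5.0 * ends_pmol


# Example: 1 ug of a 3,000 bp linear fragment in a 20 ul reaction
ends = five_prime_ends_pmol(dna_ng=1000, length=3000)
print(round(ends, 2), kinase_atp_ok(atp_mM=1.0, reaction_ul=20, ends_pmol=ends))
```

In this example the fragment carries about 1 pmol of 5′ ends, so 1 mM ATP in 20 μl is in large excess; the concentration threshold, rather than the ratio, is the limiting condition.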

When radiolabeled ATP is used (e.g. γ32P-ATP), polynucleotide kinase generates labeled DNA or RNA molecules with 32P at their 5′ ends, by catalyzing either direct phosphorylation of 5′-OH groups generated after alkaline phosphatase digestion, or exchange of the 5′-phosphate groups. This reaction is efficiently used for labeling DNA or RNA strands prior to base specific sequencing. Polynucleotide kinase is also used for mapping of restriction sites, DNA and RNA fingerprinting (Galas and Schmitz 1978; Gross et al. 1978; Reddy et al. 1981), and synthesis of substrates for DNA or RNA ligase (Khorana et al. 1972; Silber et al. 1972).

In addition to the kinase activity described above, T4 polynucleotide kinase also exhibits a 3′-phosphatase activity (Cameron and Uhlenbeck 1977). The optimum pH for 3′-phosphatase activity lies between 5 and 6, i.e. it is more acidic than for phosphate transfer activity. Based on this property, protocols using T4 polynucleotide kinase as a specific 3′-phosphatase have been developed (Cameron et al. 1978).

A T4 polynucleotide kinase lacking 3′-phosphatase activity has been purified from E. coli infected with a mutant T4 phage producing an altered pse T1 gene product (Cameron et al. 1978). Like the wild-type enzyme, the mutant polynucleotide kinase is made of four subunits of 33 kDa each, and effectively transfers the gamma phosphate of ATP to the 5′-OH terminus of DNA and RNA. In addition, the mutant and the wild-type polynucleotide kinases require similar magnesium ion concentrations, have the same pH optima, and are both inhibited by inorganic phosphate. The mutant polynucleotide kinase is very useful when the 3′-phosphatase activity must be avoided (e.g. 5′-32P terminal labeling of 3′-CMP in view of 3′ end-labeling of RNA species prior to fingerprinting or sequencing).

Tobacco acid pyrophosphatase

Tobacco acid pyrophosphatase is used as a first step for labeling mRNAs at their 5′ ends, which are usually capped, in order to generate radiolabeled probes for RNA sequencing: in vitro, 32P-labeling of the mRNA 5′ terminus requires the elimination of the 7-methylguanosine and 5′-phosphate moieties of the capped end. Tobacco acid pyrophosphatase hydrolyzes the pyrophosphate bond in the cap’s triphosphate bridge, generating a 5′-phosphate terminus on the RNA molecule (leading to p7MeG, pp7MeG, and ppN-pN-mRNA). The resulting open cap can then be dephosphorylated by alkaline phosphatase, and labeled with T4 polynucleotide kinase using γ32P-ATP.

Nucleases

Nucleases cleave phosphodiester bonds in the nucleic acid backbone. Based on their mode of action, two main classes have been defined: exonucleases are active at the ends of nucleic acid molecules, and endonucleases cleave nucleic acids internally. Deoxyribonucleases cleave DNA and generate nicks (points in a double stranded DNA molecule where there is no phosphodiester bond between adjacent nucleotides of one strand, typically through damage or enzyme action), whereas ribonucleases cleave RNA. Nucleases are double-edged swords for molecular biologists: on one hand, they are the worst enemy of nucleic acid integrity and, on the other hand, they are very useful to cut and manipulate nucleic acids for cloning purposes.

The principal characteristics of the nucleases discussed in this section are summarized in Table 5.

Table 5 Main nucleases used in molecular biology

Deoxyribonucleases

Deoxyribonuclease I (DNase I, EC 3.1.21.1)

Deoxyribonuclease I (DNase I) is an endonuclease that acts on single- or double-stranded DNA (either isolated or incorporated in chromatin). DNase I is used for nick translation of DNA, for generating random fragments for dideoxy sequencing, to digest DNA in RNA or protein preparations, and for the analysis of DNA-protein interactions by DNase footprinting.

DNase I is a 31 kDa glycoprotein, usually purified from bovine pancreas as a mixture of four isoenzymes (A, B, C, and D). It is important to note that crude preparations of DNase I are often contaminated with RNase A. Thus, great attention should be paid to the quality control provided by the manufacturer to ensure lack of RNase A activity in DNase I preparations. At the end of the reaction, DNase I can be removed from the preparation by thermal denaturation at 75°C for 5 min in the presence of 5 mM EGTA (Huang et al. 1996).

DNase I-catalyzed cleavage occurs preferentially on the 3′ side of a pyrimidine (C or T) nucleotide, and generates polynucleotides with a free 3′-OH group and a 5′-phosphate. In the presence of Mg2+, DNase I hydrolyzes each strand of duplex DNA independently, generating random cleavages. Maximal activity is obtained in the presence of Ca2+, Mg2+, and Mn2+ ions (Kunitz 1950). The nature of the divalent cations present in the incubation mixture affects both specificity and mode of action of DNase I (Campbell and Jackson 1980; Junowicz and Spencer 1973). For instance, in the presence of Mn2+, DNase I cleaves both DNA strands at approximately the same site, producing blunt ends or fragments with 1–2 base overhangs.

Exonuclease III (exodeoxyribonuclease III, EC 3.1.11.2)

Exonuclease III was first isolated from the BE 257 E. coli strain, which contains a thermo-inducible overproducing plasmid (pSGR3). Exonuclease III is a 28 kDa monomeric enzyme (Weiss 1976) with several interesting activities (Mol et al. 1995): i) 3′-exonuclease activity: exonuclease III catalyzes the removal of mononucleotides from 3′-OH ends of double-stranded DNA; ii) phosphatase activity: exonuclease III dephosphorylates DNA chains that terminate with a 3′-phosphate group (this type of chain is usually inert as a primer and inhibits DNA polymerase action); iii) RNase H activity: exonuclease III degrades RNA strands of DNA/RNA hybrids (Rogers and Weiss 1980); iv) endonuclease activity: exonuclease III cleaves the sugar-phosphate backbone at apurinic and apyrimidinic sites in DNA (Keller and Crouch 1972).

Another significant feature of exonuclease III is its relative specificity for double-stranded DNA. When this enzyme acts as an endonuclease, it generates a gap at nicks in the double-stranded DNA, whereas when it acts as an exonuclease, it generates a 5′-protruding end (that is resistant to further digestion since exonuclease III is not active on single-stranded DNA). Exonuclease III is unique among the exonucleases in its phosphomonoesterase action on a 3′-phosphate terminus (Demple and Harrison 1994; Mol et al. 1995).

Exonuclease III is often used in conjunction with the Klenow fragment of E. coli DNA polymerase to generate radiolabeled DNA strands. It is also used sequentially with S1 nuclease to reduce the length of double stranded DNA.

Optimum pH for exonuclease III’s endonucleolytic and exonucleolytic activities is between 7.6 and 8.5, while phosphatase activity is maximal at pH between 6.8 and 7.4. The presence of Mg2+ or Mn2+ ions is required for optimal activity, while the presence of Zn2+ inhibits enzyme activity (Richardson and Kornberg 1964; Richardson et al. 1964; Rogers and Weiss 1980).

Bal 31 nucleases (EC 3.1.11)

Two distinct molecular species described as fast (F) and slow (S) Bal 31 nucleases have been purified from the culture medium of Alteromonas espejiana Bal 31. Both species shorten duplex DNA (at both 3′ and 5′ ends) without introducing internal nicks, and exhibit a highly specific single stranded DNA endonuclease activity (cleaves at nicks, gaps and single-stranded regions of duplex DNA and RNA).

The purified F-Bal 31 nuclease is used for restriction mapping, removing long stretches (up to thousands of base pairs) or short stretches (tens to hundreds of base pairs) from duplex DNA, mapping B-Z DNA junctions, cleaving DNA at sites of covalent lesions (such as UV-induced), and shortening RNA molecules. The S-Bal 31 is a slower acting enzyme used only for restriction mapping and removal of short stretches from DNA duplex. Since Ca2+ is an essential cofactor, EGTA is required to inactivate both enzymes.

Most Bal 31-generated DNA fragments have fully base paired ends, which may further be ligated to any blunt end DNA fragments such as molecular linkers. A small fraction of Bal 31-generated DNA fragments harbors 5′-protruding ends, suggesting that Bal 31 acts sequentially as a 3′→5′ exonuclease, followed by endonucleolytic removal of the protruding ends (Shishido and Ando 1982; Talmadge et al. 1980; Wei et al. 1983). Thus, the efficacy of a blunt-end ligation can be increased by using the Klenow fragment of E. coli DNA polymerase or T4 DNA polymerase to fill in the 5′-protruding ends generated by Bal 31.

Although F and S species of Bal 31 have similar activities on single stranded DNA, they act differently on double stranded DNA: F-Bal 31 can shorten duplex DNA approximately 20 times faster than S-Bal 31 (Talmadge et al. 1980; Wei et al. 1983), although the reaction rate of digestion is dependent upon the C/G content of the substrate DNA (Kilpatrick et al. 1983). When tested under standard conditions, 1 μg/ml of the F- and S-Bal 31 shortens DNA at rates of approximately 130 and 10 base pairs/terminus/minute, respectively.
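As a rough planning aid, the rates quoted above can be turned into an estimated incubation time for a given deletion size. The sketch below is a minimal illustration, not a validated protocol: the function name is hypothetical, and it assumes that the rate scales linearly with enzyme concentration and ignores the dependence on base composition mentioned above.

```python
# Approximate shortening rates quoted in the text for 1 ug/ml of enzyme
# under standard conditions (base pairs per terminus per minute).
RATE_BP_PER_TERMINUS_PER_MIN = {"F": 130.0, "S": 10.0}


def bal31_minutes(bp_per_terminus, species="F", enzyme_ug_per_ml=1.0):
    """Estimate the minutes needed to remove bp_per_terminus from each end.

    ASSUMPTION: the rate is proportional to enzyme concentration and
    independent of sequence composition, which is a simplification.
    """
    rate = RATE_BP_PER_TERMINUS_PER_MIN[species] * enzyme_ug_per_ml
    return bp_per_terminus / rate


# Removing about 650 bp from each end of a linear duplex
print(bal31_minutes(650, "F"))  # fast species
print(bal31_minutes(650, "S"))  # slow species
```

In practice, a time course with aliquots removed at intervals and stopped with EGTA remains the safer way to obtain the desired deletion size.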

Exonuclease VII (EC 3.1.11.6)

Exonuclease VII was first isolated from the E. coli K12 strain as an 88 kDa polypeptide (Chase and Richardson 1974b). Exonuclease VII specifically degrades single stranded DNA, with no apparent activity on RNA or DNA/RNA hybrids (Chase and Richardson 1974a). Since it is able to attack either end of DNA molecules, it is often used to degrade long single-stranded protruding ends of duplex DNA (such as those generated by restriction endonucleases) and to generate blunt-ended DNA. Exonuclease VII differs from exonucleases I and III in that i) it can degrade DNA from either its 5′ or 3′ end, ii) it can generate oligonucleotides, and iii) it is still fully active in 8 mM EDTA. Exonuclease VII is particularly useful for rapid removal of single-stranded oligonucleotide primers from a completed PCR reaction, when different primers are required for subsequent PCR reactions (Li et al. 1991).

Ribonucleases

Pancreatic ribonuclease (RNase A, EC 3.1.27.5)

Pancreatic ribonuclease (also called RNase A) is an endonuclease that cleaves single stranded RNA at nucleoside 3′-phosphates and 3′-phospho-oligonucleotides ending in Cp or Up. RNase A is used to reduce RNA contamination in plasmid DNA preparations, and for mapping mutations in DNA or RNA by mismatch cleavage, since it cleaves the RNA in RNA/DNA hybrids at sites of single nucleotide mismatch.

RNase A degrades RNA into 3′-phosphorylated mononucleotides and oligonucleotides, generating a 2′,3′-cyclic phosphate intermediate during the reaction (Anfinsen and White 1961).

RNase A is active under a wide range of conditions and is difficult to inactivate. Hence, great care needs to be taken to avoid contamination with RNase A and subsequent degradation of RNA samples. RNase inhibitor from human placenta may be used to inhibit RNase activity. However, we recommend using diethyl pyrocarbonate (DEPC), guanidinium salts, beta-mercaptoethanol, heavy metals, or vanadyl-ribonucleoside complexes for efficient inhibition of RNase A.

RNase A is available from many commercial sources. As stated above for other enzymes, very few suppliers provide useful information regarding origin or purity of their enzyme preparation.

Ribonuclease H (EC 3.1.26.4)

Ribonuclease H (RNase H) specifically degrades RNA in RNA/DNA hybrids. It allows removal of RNA probes from prior hybridizations, or removal of poly-A tails at the 3′ end of mRNAs.

Optimal activity of RNase H is achieved at a pH between 7.5 and 9.1 (Berkower et al. 1973) in the presence of reducing reagents. RNase H activity is inhibited in the presence of N-ethylmaleimide (a chemical that reacts with SH groups), and is not markedly affected by high ionic strength (50% activity is retained in the presence of 0.3 M NaCl). RNase H requires Mg2+ ions, which can be replaced partially by Mn2+ ions.

Other ribonucleases

Many other ribonucleases have been used for RNA sequencing. Each of them has specific cleavage requirement and specificity. Below is a list of these ribonucleases.

Ribonuclease Phy I can be used for rapid sequencing of RNA. It is isolated from cultures of Physarum polycephalum. Ribonuclease Phy I cleaves the RNA molecule at G, A, and U, but not at C residues. The products are 3′ mononucleotides with 5′ C termini.

RNase CL3 is used for sequencing RNA. It is isolated from chicken liver (Gallus gallus). RNase CL3 digests RNA adjacent to cytidylic acid, in a ratio of 60 C residues digested for every U residue digested. The enzyme activity is inhibited by poly-A tracts, unless spermidine is added.

Used primarily for RNA sequencing (Lockard et al. 1978), Cereus ribonuclease is an endoribonuclease that preferentially cleaves RNA at U and C residues.

Ribonuclease Phy M, purified from Physarum polycephalum, has also been used for sequencing of RNA (Donis-Keller 1980). Ribonuclease Phy M preferentially cleaves RNA at U and A residues. Occasionally, cleavage at G may occur. This unwanted reaction can be minimized in the presence of 7 M urea.

RNase T1 is an endoribonuclease that specifically degrades single-stranded RNA at G residues. It cleaves the phosphodiester bond between 3′-guanylic residues and the 5′-OH residues of adjacent nucleotides with the formation of corresponding intermediate 2′, 3′-cyclic phosphates. RNase T1 is an 11 kDa protein purified from Aspergillus oryzae.

RNase T2 is also purified from Aspergillus oryzae. It has a molecular weight of 36,000, and cleaves all phosphodiester bonds in RNA, with a preference for adenylic bonds.

RNase U2, purified from Ustilago sphaerogena, is used as a complement to RNase T1 in RNA sequencing to discriminate between purine residues, since it specifically cleaves at adenine residues when incubated at 50°C, pH 3.5, and in the presence of 8 M urea. When incubated under standard conditions (Takahashi 1961), RNase U2 specifically cleaves the 3′-phosphodiester bond adjacent to purines, therefore generating purine 3′-phosphates or oligonucleotides with purine 3′-phosphate terminal groups. Cyclic 2′,3′ purine nucleotides are obtained as intermediates, and reversal of the final reaction step can be used to synthesize ApN and GpN. RNase U2 is thermostable (80°C for 4 min) in aqueous solution at pH 6.9.
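The base preferences listed above can be collected in a small lookup table to predict which fragments each ribonuclease would produce from a given RNA. The Python sketch below is a deliberate simplification and rests on several assumptions that are not in the text: cleavage is modeled as complete and as occurring on the 3′ side of every recognized residue, minor activities (such as the occasional cleavage at G by Phy M or at U by CL3) are ignored, and RNase U2 is modeled under the adenine-specific sequencing conditions. The dictionary and function names are hypothetical.

```python
# Preferred cleavage residues as summarized in the text (minor activities ignored).
CLEAVAGE_BASES = {
    "RNase T1": "G",
    "RNase U2": "A",       # adenine-specific sequencing conditions
    "RNase Phy M": "UA",
    "Cereus RNase": "UC",
    "RNase CL3": "C",
    "RNase Phy I": "GAU",
    "RNase T2": "GAUC",
}


def cleavage_fragments(rna, enzyme):
    """Return the fragments of a complete digest, in 5'->3' order.

    ASSUMPTION: the enzyme cuts the phosphodiester bond on the 3' side
    of every residue it recognizes (a complete, idealized digest).
    """
    rna = rna.upper()
    bases = CLEAVAGE_BASES[enzyme]
    fragments, start = [], 0
    for i, base in enumerate(rna[:-1]):  # no bond to cut after the last residue
        if base in bases:
            fragments.append(rna[start:i + 1])
            start = i + 1
    fragments.append(rna[start:])
    return fragments


print(cleavage_fragments("AUGGCUAC", "RNase T1"))
print(cleavage_fragments("AUGGCUAC", "RNase U2"))
```

Sequencing experiments rely on partial digests of end-labeled RNA rather than complete digests; the cut positions are the same, but each labeled fragment then extends from the labeled end to a single cut.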

DNA/RNA nucleases

S1 nuclease (EC 3.1.30.1)

S1 nuclease is a very useful tool for measuring the extent of hybridization (DNA-DNA or DNA-RNA), probing duplex DNA regions, and removing the single stranded DNA of protruding ends generated by restriction enzymes. S1 nuclease is purified from Aspergillus oryzae. This enzyme degrades RNA or single stranded DNA into 5′ mononucleotides, but does not degrade duplex DNA or RNA-DNA hybrids in native conformation.

S1 nuclease is a 32 kDa metalloprotein (Vogt 1973). It requires Zn2+ for activity. Co2+ and Hg2+ can replace Zn2+, but are less effective as cofactors. S1 nuclease’s optimal activity is achieved at pH 4.0–4.3 (a 50% reduction of activity is observed at pH 4.9, and the enzyme is inactive at pH > 6.0). S1 nuclease is strongly inhibited by chelating agents such as EDTA and citrate, or by low concentration of sodium phosphate (as low as 10 mM). Nuclease S1 is resistant to denaturing agents such as urea, SDS, or formamide (Hofstetter et al. 1976; Vogt 1973) and is thermostable (Ando 1966).

S1 nuclease hydrolyzes single-stranded DNA five times faster than RNA, and 75,000 times faster than double-stranded DNA. The low level of strand breaks which can be introduced by S1 nuclease in duplex DNA (Beard et al. 1973; Shenk et al. 1975; Vogt 1973, 1980; Wiegand et al. 1975) can be further reduced by high salt concentration (i.e. 0.2 M). S1 nuclease is active at S1 sensitive sites generated by negative supercoiling of the helical DNA structure (Beard et al. 1973; Godson 1973; Mechali et al. 1973), UV irradiation (Heflich et al. 1979; Hofstetter et al. 1976; Shishido and Ando 1974), or depurination (Shishido and Ando 1975).

S1 nuclease is widely used when specific removal of single-stranded portions of duplex DNA, RNA/DNA hybrids, or RNA molecules is required. Among these applications are mapping of spliced RNA molecules, isolation of duplex regions in single stranded viral genomes (Shishido and Ikeda 1970, 1971), probing strand breaks in duplex DNA molecules (Germond et al. 1974; Shishido and Ando 1975; Vogt 1973), cleavage of regions with lesser helix stability (Lilley 1980; Panayotatos and Wells 1981; Shishido 1979), localization of inverted repeated sequences (Lilley 1980; Panayotatos and Wells 1981; Shishido 1979), introduction of deletion mutations at D loop sites in duplex DNA (Green and Tibbetts 1980), and mapping of the genomic regions involved in interactions with DNA binding proteins (Meyer et al. 1980).

Mung bean nuclease

Mung bean nuclease can be used in the same way as nuclease S1, to remove protruding ends in duplex DNA or for transcription promoter mapping.

Mung bean nuclease is a single stranded specific DNA and RNA endonuclease purified from mung bean sprouts. It yields 5′-phosphate terminated mono- and oligonucleotides (Johnson and Laskowski 1968, 1970; Mikulski and Laskowski 1970; Sung and Laskowski 1962). Complete duplex DNA degradation may occur when high enzyme concentration and extended incubation time are used. This is due to a two-step process: the enzyme first introduces single stranded nicks, followed by double stranded scissions and exonucleolytic digestion of the resulting fragments (Johnson and Laskowski 1970; Kroeker and Kowalski 1978; Kroeker et al. 1976).

Proper trimming of 5′-protruding extensions is achieved when the final blunt end contains a G-C base pair at its terminus (Ghangas and Wu 1975). The presence of an A-T base pair at the position where the fragment would end after trimming seems to interfere with precise removal of the protruding end. The nucleotide composition of the overhang does not affect the efficiency or quality of the nucleolytic digestion (Ghangas and Wu 1975).

Mung bean nuclease was also proven to be very useful for excising cloned DNA fragments inserted in vectors following a dA·dT tailing (Wensink et al. 1974). Because poly(dA/dT) are not recognized as typical double stranded structures (Johnson and Laskowski 1970), they are hydrolyzed at half the rate of single stranded tails, but they are more efficiently cleaved than other duplex regions in DNA.

Mung bean nuclease requires Zn2+ and a reducing agent such as cysteine for maximum activity and stability. It is inhibited by high salt concentrations (80 to 90% inhibition in 200–400 mM NaCl). Triton X-100 (0.001% w/v) can increase mung bean nuclease stability (when the enzyme is used at low concentrations, such as less than 50 U/μl) and prevent nuclease adhesion to surfaces.

Methylases

Methylases from bacterial restriction-modification systems

Bacteria use restriction endonucleases to degrade foreign DNA introduced by infectious agents such as bacteriophages. In order to prevent destruction of their own DNA by the restriction enzymes, bacteria mark it by adding methyl groups (methylation). This modification, which must not interfere with DNA base-pairing, usually affects only a few specific bases on each strand. Methylases are part of the bacterial restriction-modification (RM) system. They catalyze the transfer of methyl groups from S-adenosyl-methionine (SAM) to specific nucleotides of double stranded DNA molecules. Methylases from type II RM systems (the most common) are separate proteins that act independently of their respective restriction endonucleases, whereas the methylase and restriction activities of type I and III RM systems are provided by a single protein complex (Wilson 1991).

Methylations often inhibit restriction enzymes that recognize the corresponding sequences (Sistla and Rao 2004), although there are exceptions to this rule (Gruenbaum et al. 1981a): i) some restriction endonucleases cleave DNA at a recognition sequence being modified by the dam or dcm methylases (see below) (e.g. Apy I, Bam HI, Sau 3AI, Bgl II, Pvu I, or BstNI); ii) some bacteria have restriction endonucleases that only degrade methylated DNA and not the host unmethylated DNA (to overcome the camouflage evolution of some bacteriophages that contain methylated DNA).

Methylases are often used in molecular cloning to protect DNA from digestion during DNA cloning or during cDNA or genomic library construction. For example, when cloning a DNA fragment at the Bam HI site of a given vector, the Bam HI methylase can first be used to protect potential internal Bam HI sites in the vector, before addition of Bam HI linkers and digestion with the Bam HI endonuclease.

A methylase can also inhibit a restriction endonuclease of a different RM system (e.g. TaqI methylase inhibits BamHI restriction enzyme). The extent of overlap between restriction and methylation sequences determines the extent to which methylation alters endonuclease cleavage specificity and/or activity. There are two types of overlap: in the first case, the methylation sequence completely overlaps with that of the restriction endonuclease, and methylation alters cleavage specificity. In the second case, the methylation sequence partially overlaps with that of the restriction endonuclease, and nucleotide methylation modifies the endonuclease recognition sequence, which becomes resistant to the endonuclease activity (see below).

The first class of overlap occurs when restriction endonucleases have a degenerate recognition sequence, and when methylases are active on only one of the possible sites. For example, the Hinc II recognition sequence is GTPyPuAC, which means that Hinc II can cleave duplex DNA at the following four combinations: GTCGAC, GTCAAC, GTTGAC, and GTTAAC. TaqI methylase (designated M.TaqI) catalyzes the transfer of a methyl group to the adenine residue of the TCGA sequence. Thus, among the four possible Hinc II recognition sites, only those containing a TCGA sequence are methylated by M.TaqI, and consequently become resistant to Hinc II digestion. This kind of overlap can be used to create new specificities for restriction endonucleases in duplex DNA (Nelson et al. 1984).
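The Hinc II example can be reproduced by enumerating the degenerate site and testing each variant for the M.TaqI target. This is a minimal Python sketch with hypothetical function names; it works on the sequence of a single strand as written in the text and does not model methylation efficiency.

```python
from itertools import product

# Degenerate positions of the Hinc II site GTPyPuAC, in the order used in the text
PYRIMIDINES = "CT"
PURINES = "GA"


def hincii_sites():
    """Expand GTPyPuAC into its four concrete recognition sequences."""
    return ["GT" + py + pu + "AC" for py, pu in product(PYRIMIDINES, PURINES)]


def blocked_by_methylase(sites, methylase_site="TCGA"):
    """Return the sites that contain the methylase target sequence."""
    return [site for site in sites if methylase_site in site]


sites = hincii_sites()
print(sites)
print(blocked_by_methylase(sites))
```

Only GTCGAC contains TCGA, so after M.TaqI treatment Hinc II would be expected to cut only the three remaining variants, which is the "new specificity" described above.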

The second class of overlap occurs at the boundaries of the recognition sequences for a restriction endonuclease and a methylase. For example, when the GGATCC sequence of a Bam HI restriction site is followed by GG, the Bam HI site partially overlaps with the CCGG methylation site of M.MspI. Since M.MspI transfers a methyl group to the 5′ cytosine residue in the CCGG sequence, it methylates the Bam HI site at its internal cytosine, thereby making this sequence resistant to the Bam HI endonuclease.
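The boundary overlap can be located in a sequence by checking whether the cytosine methylated by M.MspI coincides with the internal cytosine of a Bam HI site. The sketch below uses hypothetical function names and, as a simplifying assumption, scans only the strand as written; because both sites are palindromic, the opposite strand would need to be scanned in the same way for a complete answer.

```python
def find_all(seq, motif):
    """Return every start index of motif in seq, including overlapping hits."""
    return [i for i in range(len(seq) - len(motif) + 1) if seq.startswith(motif, i)]


def bamhi_sites_blocked_by_mmspi(seq):
    """Return start indices of GGATCC sites whose internal C is the 5' C of a CCGG.

    In GGATCC the internal cytosine sits at offset 4 from the start of the site;
    M.MspI methylates the first (5') cytosine of CCGG, as stated in the text.
    """
    seq = seq.upper()
    methylated_c = set(find_all(seq, "CCGG"))
    return [start for start in find_all(seq, "GGATCC") if start + 4 in methylated_c]


# Two Bam HI sites: the first is followed by GG, the second is not
example = "AAGGATCCGGTTGGATCCAA"
print(find_all(example, "GGATCC"))
print(bamhi_sites_blocked_by_mmspi(example))
```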

Dam and dcm methylases

Most B, K, and W strains of E. coli also contain two site-specific DNA methylases, which are not part of the RM system and are encoded by the dam and dcm genes (Pirrotta 1976). The dam methylase transfers a methyl group from SAM to the N6 position of the adenine residue in the sequence GATC, while the dcm methylase (also known as mec) transfers a methyl group from SAM to the internal cytosine residues in the sequences CCAGG or CCTGG (Geier and Modrich 1979; Marinus and Morris 1973; May and Hattman 1975).

Dam methylation regulates post-replication mismatch repair. Not all DNA isolated from E. coli is methylated to the same extent. For instance, the pBR322 plasmid DNA purified from E. coli after amplification with chloramphenicol is resistant to digestion by Mbo I, which does not cut at methylated sites. Furthermore, only 50% of the λ DNA obtained from a dam+ strain of E. coli is methylated. The degree of in vivo cytosine methylation in the sequence CpG is related to the level of gene expression in eukaryotic cells, and the quantity of methylated cytosine residues inversely correlates with gene activity (Ehrlich and Wang 1981; Gruenbaum et al. 1981b; Razin and Cedar 1977; Razin and Riggs 1980; Sutter and Doerfler 1980).
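Because the dam and dcm targets are short fixed motifs, the positions that would be methylated in DNA prepared from a dam+ dcm+ host can be listed with two regular expressions. The sketch below is illustrative only: the function name is hypothetical, positions are 0-based indices on the strand as written, and it reports potential sites without modeling partial methylation.

```python
import re

# Lookahead patterns so that overlapping sites are all reported
DAM_SITE = re.compile(r"(?=GATC)")
DCM_SITE = re.compile(r"(?=CC[AT]GG)")


def dam_dcm_positions(seq):
    """Return 0-based positions of the residues targeted by dam and dcm.

    dam: the adenine of GATC (offset 1 within the site).
    dcm: the internal cytosine of CCAGG or CCTGG (offset 1 within the site).
    """
    seq = seq.upper()
    return {
        "dam": [m.start() + 1 for m in DAM_SITE.finditer(seq)],
        "dcm": [m.start() + 1 for m in DCM_SITE.finditer(seq)],
    }


print(dam_dcm_positions("TTGATCAACCAGGTTCCTGGA"))
```

Such a scan is a quick way to anticipate which restriction sites in a plasmid may be blocked when the DNA is prepared from a methylation-proficient strain.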

Practical considerations on the in vitro use of methylases

DNA fragments obtained by restriction endonuclease digestion of in vitro methylated DNA are in most respects indistinguishable from unmethylated DNA. However, it is important to note that methylated cytosine does not generate a band in the C channel when sequencing DNA by the Maxam and Gilbert method. In addition, the presence of methyl-cytosine residues in DNA reduces the efficiency of transformation in most common strains of E. coli. This second problem results from the existence of restriction systems in E. coli K12 responsible for a specific degradation of DNA containing methylated cytosines. These restriction systems, designated mcrA and mcrB (mcr standing for 5-methylcytosine-specific restriction enzyme), are similar to those degrading hydroxymethylcytosine-containing DNA (rglA and rglB). The mcrA system cleaves DNA modified by the Hpa II methylase, while the mcrB system cleaves DNA modified by the Hae III, Alu I, Hha I, and Msp I methylases. E. coli strains deficient in the mcrB system are available from New England Biolabs.

DNA methylases from type II RM systems perform the methylation reaction under conditions similar to those used for restriction endonucleases, except that the methylase requires SAM as a methyl group donor. Therefore, it is generally acceptable to carry out the methylation reaction using standard restriction endonuclease buffers to which SAM has been added. Importantly, methylases do not require divalent cations to be active, while most restriction endonucleases do.

Other enzymes

Polyadenylate polymerase of E. coli (polynucleotide adenylyl-transferase, EC 2.7.7.19)

Polyadenylate (polyA) polymerase of E. coli is a 58 kDa polypeptide that polymerizes adenylate residues at the 3′ end of various polyribonucleotides, in a template-independent manner. In vitro, it is used for RNA 3′ extension to prepare a priming site for cDNA synthesis using oligo-dT, or to prepare RNA for poly-dT-based purification. Depending on the experimental conditions, polyA polymerase can polymerize polyA stretches from 20 to 2,000 nucleotides long.

E. coli polyA polymerase catalyzes the addition of adenosine monophosphates (AMPs) using ATP as a substrate. ADP, dATP, and GTP are not polyA polymerase substrates, but CTP and UTP are (though at less than 5% of the rate obtained with ATP). PolyA polymerase is inhibited by phosphate and pyrophosphate ions, and by aurin tricarboxylic acid, an inhibitor of both RNA polymerase and of the binding of mRNA to ribosomes. PolyA polymerase is a rather unusual enzyme in that its optimal activity requires the presence of high concentrations of monovalent cations (e.g. 400 mM NaCl). It is stimulated by Mn2+ and is insensitive to antibiotics such as rifampicin and streptolydigin (transcriptional inhibitors of initiation and elongation, respectively). In contrast to polynucleotide phosphorylase, another template-independent polynucleotide synthesizing enzyme, polyA polymerase does not degrade its own polyA product.

PolyA polymerase uses a wide variety of single stranded RNA species as primers. Double stranded RNA and some polynucleotides (such as polyUG, polyC, or di- and trinucleotides) are very poor primers. DNA does not function as a primer. This latter property is common to virtually all other polyA polymerases, except for the enzyme isolated from plants such as maize that shows considerable activity with natural and synthetic oligo- and poly-dNTPs.

Topoisomerase I (EC 5.99.1.2)

Topoisomerase I is a 105 kDa protein (Liu and Miller 1981) that catalyzes the relaxation of supercoiled DNA by transiently cleaving one strand of duplex DNA in the sugar-phosphate backbone, and then religating the two generated ends (Champoux 1978; Gellert 1981). Topoisomerase I is a ubiquitous nuclear protein that is used by cells during processes such as replication, recombination, or DNA-protein interactions that involve unwinding of the DNA, and thereby create excessive supercoiling upstream of the unwinding point (Champoux 2001).

In vitro, topoisomerase I was initially used for circular DNA preparation (Martin et al. 1983), and for studying nucleosome assembly (Germond et al. 1974; Laskey et al. 1977) or DNA tertiary structures (Peck and Wang 1981; Wang 1979). Later, it was shown that topoisomerase I also catalyzes the covalent transfer of a single stranded DNA (donor) to a heterologous DNA (acceptor) (Been and Champoux 1981; Halligan et al. 1982). This reaction involves the initial binding of the enzyme to single stranded DNA and formation of a covalent DNA/enzyme complex, which is then able to ligate the single stranded fragment to the 5′-OH end of the acceptor DNA (Halligan et al. 1982; Prell and Vosberg 1980).

This ability of topoisomerase I to function as both a nuclease and a ligase has been exploited to develop new cloning vectors (Invitrogen TOPO® Cloning Vectors). The TOPO® cloning method uses topoisomerase I of vaccinia virus, which i) specifically cuts double stranded DNA at the end of a (C/T)CCTT sequence, and ii) binds covalently to the 3′-phosphate of the thymidine at the end of the cleaved DNA. The ligase activity of topoisomerase I can then join an acceptor DNA with compatible ends. TOPO® vectors from Invitrogen are provided as linear duplex DNA carrying a topoisomerase I molecule covalently linked to the 3′-phosphate on one strand, and a GTGG overhang on the other strand. Cloning is achieved with PCR products amplified using a forward primer that contains four extra bases (CACC) at the 5′ end. The overhang in the cloning vector (GTGG) anneals to these added bases at the 5′ end of the PCR product, and topoisomerase I ligates the PCR product in the correct orientation. Topoisomerase I also ligates the PCR product at the blunt 3′ end.
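
The primer requirement described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names (`complement`, `add_topo_prefix`, `find_topo_sites`) are hypothetical, the GTGG overhang is assumed to be written so that it pairs base by base with the CACC primer extension, and sequences are assumed to contain only A, C, G, and T.

```python
import re

# Four extra bases added to the 5' end of the forward primer (from the text).
TOPO_PRIMER_PREFIX = "CACC"
# Overhang carried by the linearized vector (from the text).
VECTOR_OVERHANG = "GTGG"
# Vaccinia topoisomerase I motif (C/T)CCTT. A lookahead is used so that
# overlapping occurrences are all reported.
TOPO_SITE = re.compile(r"(?=[CT]CCTT)")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Return the base-by-base complement of a DNA sequence (not reversed)."""
    return seq.upper().translate(_COMPLEMENT)


def add_topo_prefix(forward_primer: str) -> str:
    """Prepend CACC to a forward primer unless it already starts with it.

    Assumption: a primer that already begins with CACC is left unchanged,
    whether those bases were added on purpose or belong to the target.
    """
    primer = forward_primer.upper()
    if primer.startswith(TOPO_PRIMER_PREFIX):
        return primer
    return TOPO_PRIMER_PREFIX + primer


def find_topo_sites(seq: str) -> list:
    """Return the 0-based start positions of every (C/T)CCTT motif in seq."""
    return [match.start() for match in TOPO_SITE.finditer(seq.upper())]
```

With these definitions, `complement("CACC")` equals the vector overhang `GTGG`, which is the pairing that orients the insert, and `add_topo_prefix("atggct")` returns `CACCATGGCT`. `find_topo_sites` simply locates occurrences of the recognition motif on the strand supplied; it makes no statement about whether a given site would be cleaved in practice.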

Topoisomerase I is active at pH 7.5 in the presence of 50 to 200 mM NaCl, and is inactivated by 0.2% SDS. Covalent binding to DNA is stimulated by 5–10 mM Mg2+.

Topoisomerase II (E. coli DNA gyrase, EC 5.99.1.3)

Topoisomerase II, which is purified from M. luteus (Klevan and Wang 1980), catalyzes the breakage and resealing of both strands of duplex DNA, thereby changing the linking number of discrete supercoiled forms of DNA by two (Brown et al. 1979; Liu and Miller 1981). Topoisomerase II is also capable of reversibly knotting intact circular DNA (Brown et al. 1979; Liu and Miller 1981). Type II topoisomerases are multimeric proteins that require ATP to be fully active. Type II topoisomerase is inhibited by etoposide, a chemotherapeutic agent used to reduce growth of rapidly dividing cancer cells (Baldwin and Osheroff 2005).
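
The step of two in linking number has a simple arithmetic consequence, sketched below in Python: starting from a given linking number, only values of the same parity can be reached. The function names are hypothetical and the model is deliberately minimal (integer linking numbers, one strand-passage event per step); it is not a description of enzyme kinetics.

```python
def type2_step(lk: int, direction: int) -> int:
    """Apply one strand-passage event: the linking number changes by two.

    direction must be +1 or -1 (hypothetical convention for this sketch).
    """
    if direction not in (1, -1):
        raise ValueError("direction must be +1 or -1")
    return lk + 2 * direction


def reachable_by_type2(lk_start: int, lk_target: int) -> bool:
    """True if lk_target differs from lk_start by a multiple of two."""
    return (lk_target - lk_start) % 2 == 0


def events_needed(lk_start: int, lk_target: int) -> int:
    """Minimum number of strand-passage events linking the two values."""
    if not reachable_by_type2(lk_start, lk_target):
        raise ValueError("target has a different parity and cannot be reached")
    return abs(lk_target - lk_start) // 2
```

For instance, going from a linking number of 20 to 14 takes three events in this model, whereas 15 cannot be reached from 20 at all.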

Guanylyl transferase from vaccinia virus

Guanylyl transferase is a capping enzyme complex isolated from vaccinia virus. It is used to label the 5′ di- and triphosphate ends of RNA molecules, or the capped 5′ ends of RNA after chemical removal of the terminal 7-methyl-guanosine (m7G) residue.

This complex has three enzymatic activities: 1) it acts as an RNA triphosphatase by catalyzing the cleavage of the terminal phosphate from a triphosphate end of an RNA molecule, leaving a diphosphate RNA end; 2) it is a guanylyl transferase, able to transfer the GMP moiety of one molecule of GTP to the RNA, releasing pyrophosphate (because the 5′ end of the RNA molecule ends in a phosphate group, the bond formed between the RNA and the guanylate is an unusual 5′-5′ triphosphate linkage, instead of the 3′-5′ bonds that exist between the other nucleotides forming an RNA strand); and 3) it is an RNA (guanine-7)-methyl transferase that transfers a methyl group from SAM to the 5′ guanine residue of an RNA molecule.

Guanylyl transferase complex does not accept monophosphate RNA as a substrate. Consequently, degraded or nicked RNA will not be labeled except at the 5′-cap end. The optimal amount of enzyme to be used must be determined empirically.
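
The order in which the three activities act on an RNA 5′ end, and the exclusion of monophosphate ends, can be summarized in a small Python sketch. The state labels and the function `cap_five_prime_end` are hypothetical names chosen for this illustration; the `sam_present` flag is an assumption used to show that the methylation step depends on the SAM methyl donor.

```python
# Hypothetical labels for the chemical state of an RNA 5' end.
TRIPHOSPHATE = "pppN"
DIPHOSPHATE = "ppN"
MONOPHOSPHATE = "pN"
CAP_UNMETHYLATED = "GpppN"
CAP_METHYLATED = "m7GpppN"


def cap_five_prime_end(end: str, sam_present: bool = True) -> list:
    """Return the successive 5'-end states produced by the capping complex.

    The first element is always the starting state. Ends that are not
    substrates (for example a monophosphate end) are returned unchanged.
    """
    states = [end]
    if end == TRIPHOSPHATE:
        # 1) RNA triphosphatase: triphosphate end -> diphosphate end.
        end = DIPHOSPHATE
        states.append(end)
    if end == DIPHOSPHATE:
        # 2) Guanylyl transferase: guanylate from GTP joined through a
        #    5'-5' triphosphate linkage, pyrophosphate released.
        end = CAP_UNMETHYLATED
        states.append(end)
    if end == CAP_UNMETHYLATED and sam_present:
        # 3) (Guanine-7)-methyl transferase: methyl group taken from SAM.
        end = CAP_METHYLATED
        states.append(end)
    return states
```

Calling `cap_five_prime_end("pppN")` walks through all three steps, whereas `cap_five_prime_end("pN")` returns the monophosphate end unchanged, which mirrors why the 5′ ends generated by degradation or nicking are not labeled.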

During labeling experiments, it is recommended to verify that all labeling is incorporated in an authentic cap structure. To this end, tobacco acid pyrophosphatase can be used on an aliquot of the labeled samples: all labeling should be released in the form of 5′ GMP after digestion.

Conclusion

Countless sources of enzymes are now available to scientists for propagating, cutting, or modifying nucleic acids. A good understanding of the activity of each enzyme used in a specific protocol will help in selecting the most appropriate source of enzymatic activity for a specific purpose, and in defining the optimal reaction conditions.

Most commercial preparations of enzymes are reliable and can be used with great confidence. However, it is not always possible to obtain information regarding the biological origin, extent of quality controls, and unit definition for the various enzymes offered by different suppliers. Yet such information is crucial for defining optimal cloning reaction conditions (units per reaction, optimal temperature, buffer composition, incubation time, denaturation conditions, enzymatic contamination, etc.). Carefully chosen enzymatic activities and quality reagents are keys to successful cloning experiments.