Cell-free protein synthesis: the state of the art
- Cite this article as:
- Whittaker, J.W. Biotechnol Lett (2013) 35: 143. doi:10.1007/s10529-012-1075-4
Cell-free protein synthesis harnesses the synthetic power of biology, programming the ribosomal translational machinery of the cell to create macromolecular products. Like PCR, which uses cellular replication machinery to create a DNA amplifier, cell-free protein synthesis is emerging as a transformative technology with broad applications in protein engineering, biopharmaceutical development, and post-genomic research. By breaking free from the constraints of cell-based systems, it takes the next step towards synthetic biology. Recent advances in reconstituted cell-free protein synthesis (Protein synthesis Using Recombinant Elements expression systems) are creating new opportunities to tailor the reactions for specialized applications including in vitro protein evolution, printing protein microarrays, isotopic labeling, and incorporating nonnatural amino acids.
Keywords: Cell-free translation; Protein engineering; Protein synthesis; PURE expression; Synthetic biology
Proteins represent the ultimate expression of biochemical function: as enzymes, they perform exquisitely selective chemical reactions; as receptors, they recognize specific partners in signaling pathways that integrate and regulate metabolism; as transporters, they control trafficking across cell membranes; and as macromolecular building blocks, they form the scaffolding of cell architecture. In biotechnology, proteins are important not only for their remarkable catalytic power, but also for their unique physical properties, and as tools of discovery in the development of new therapeutics. As a result, the production of pure proteins has itself become an important goal, driving the evolution of both chemical and biochemical methods for protein synthesis. Conventional solid phase chemical synthesis provides the ultimate control over molecular structure, but is generally restricted to smaller peptides (<40 residues) (Nilsson et al. 2005), limiting its ability to meet the need for a broad range of proteins for industrial and research applications. Alternatively, several decades of biochemical research have expanded and refined methods for recombinant protein production based on ribosomal protein synthesis. The protein expression field has traditionally been dominated by cell-based expression systems, including both conventional (Escherichia coli, Saccharomyces cerevisiae) and non-conventional (e.g., Pichia pastoris, Thermus thermophilus) host organisms that have been developed as ‘cell factories’. However, cell-free protein synthesis is rapidly emerging as an important complementary approach, rivaling cell-based protein production in terms of both convenience and scalability, and becoming widely recognized as a valuable tool for protein engineering (Hodgman and Jewett 2012; Kai et al. 2012; Swartz 2012).
Cell-free protein synthesis is not a new concept. Protein synthesis in cell-free extracts helped to crack the genetic code and has been extensively used to dissect ribosomal protein biosynthesis, providing a specialized niche for the development of these methods. However, advances in cell-free protein synthesis in recent years, particularly the ability to reconstitute protein synthesis from well-defined, purified components, have transformed the approach from a specialized analytical tool to a powerful preparative method with broad applicability. Cell extract- or lysate-derived systems for protein synthesis continue to be exploited and optimized, with the traditional wheat germ, rabbit reticulocyte and E. coli lysates being extended with Leishmania (Kovtun et al. 2011), Thermus (Zhou et al. 2012) and even human (Mikami et al. 2008) cell-free systems. Systems based on reconstituted highly purified components (PURE expression systems) have significantly extended the range of applications and will be the primary focus of this mini-review, which surveys recent advances and emerging applications of cell-free protein synthesis.
PURE expression systems
The advantages of the PURE systems include reduced levels of contaminating proteases, nucleases, and phosphatases, greater reproducibility resulting from more defined chemistry, and the flexibility of a modular system. Metabolic side reactions that deplete the amino acid pool in cell extracts can be entirely avoided. In addition, the 6 × His affinity tags associated with the PURE components can be utilized in ‘reverse purification’ of products, extracting the tagged proteins by metal affinity chromatography. Because it is modular, the PURE system supports a variety of modifications for specialized applications, including ribosome display and site-selective incorporation of nonnatural amino acids, described in more detail below. Both cell extracts and the components of the PURE system can be modified to mimic the macromolecular crowding of the interior of a cell (Ge et al. 2011).
Ribosomal protein synthesis is an energy-intensive process, requiring approximately 4 ATP equivalents/peptide bond (including 2 ATP equivalents/residue for amino acid activation, 1 GTP/residue for transfer of the charged aminoacyl-tRNA to the A site of the ribosome, and 1 GTP/residue for translocation of the ribosome through coupling to EF-G) (Calhoun and Swartz 2007; Kim and Kim 2009). Consequently, the choice of ATP-regenerating system is a critical factor for cell-free protein synthesis. A variety of ATP-regenerating reactions have been explored to achieve sustained, stable production of ATP while avoiding accumulation of inorganic phosphate, which inhibits translation by binding the Mg2+ ions that serve as essential cofactors in many nucleotide-dependent reactions, including protein synthesis. The choice of ATP-regeneration module determines both the rate and duration of active protein synthesis.
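The energy budget quoted above can be turned into a rough estimate of how much ATP a reaction must regenerate. The following is a minimal sketch, assuming the per-residue figures given in the text and ignoring initiation, termination and any side reactions; the function names and the 238-residue, 27 kDa example protein are hypothetical.

```python
# ATP-equivalent budget for ribosomal peptide bond formation, using the
# per-residue figures quoted in the text. All names here are illustrative.

ACTIVATION_ATP_EQ = 2    # amino acid activation
DELIVERY_GTP = 1         # delivery of aminoacyl-tRNA to the ribosomal A site
TRANSLOCATION_GTP = 1    # EF-G-coupled translocation


def atp_equivalents(n_residues: int) -> int:
    """ATP equivalents consumed in forming the peptide bonds of one chain."""
    if n_residues < 2:
        raise ValueError("at least two residues are needed to form a peptide bond")
    per_bond = ACTIVATION_ATP_EQ + DELIVERY_GTP + TRANSLOCATION_GTP
    return per_bond * (n_residues - 1)


def atp_demand_mM(protein_mg_per_l: float, mass_da: float, n_residues: int) -> float:
    """ATP equivalents (mmol/l) needed to reach a given protein titre."""
    # Molar mass in Da equals mg/mmol, so mg/l divided by Da gives mmol/l.
    protein_mM = protein_mg_per_l / mass_da
    return protein_mM * atp_equivalents(n_residues)


if __name__ == "__main__":
    print(atp_equivalents(238))                      # 4 x 237 peptide bonds
    print(round(atp_demand_mM(700, 27_000, 238), 1)) # hypothetical 27 kDa product
```

Under these assumptions a single 238-residue chain costs 948 ATP equivalents, so even modest protein titres correspond to tens of millimolar ATP turnover, far more than can be supplied as a single initial charge of nucleotide.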
In the PURE systems, all enzymes required to form a complete in vitro catalytic pathway must be included to regenerate ATP in one or more substrate-level phosphorylation steps. High phosphoryl-group transfer potential substrates (creatine phosphate, phosphoenolpyruvate (PEP), 3-phosphoglycerate) can be used directly as phosphate donors, although they are relatively expensive. Alternatively, non-phosphorylated substrates may be used, with the phosphate entering from ATP (lowering the overall yield) or as inorganic phosphate. Adenylate kinase and nonspecific nucleoside diphosphate kinase must also be included to catalyze phosphate exchange within the pool of ribonucleotides. These two enzymes equilibrate the phosphorylation state of all four nucleotides, and couple ATP regeneration to formation of GTP, which is required for delivery of charged tRNA and for translocation of the ribosome during protein synthesis.
The glycolytic intermediate, 3-phosphoglycerate, supports substrate-level phosphorylation in the presence of three enzymes of the glycolytic pathway (phosphoglyceromutase, enolase and pyruvate kinase). Once again the 1:1 stoichiometry for ATP production is a limiting factor. Another simple ATP-regeneration module is based on acetate kinase-catalyzed formation of ATP from ADP and acetyl phosphate, which is readily available (Ryabova et al. 1995), although formation of an organic acid (acetic acid) as an end product may interfere with protein synthesis (Wang and Zhang 2009). Polyphosphate, a linear polymer of phosphoric acid that occurs naturally in cells, can also serve as a phosphate donor for ATP regeneration. However, as in the other examples described above, chelation of divalent metal ions by inorganic phosphate is a complicating factor.
Some of the problems associated with ATP-regeneration based on simple phosphate donors can be resolved by reconstituting extended metabolic pathways, increasing the yield of ATP and recycling the inorganic phosphate. So far, approaches based on well-characterized cellular metabolism (including glycolysis and oxidative phosphorylation) have mainly been applied in cell extracts, although reconstitution of extended metabolic pathways in vitro from purified components is also possible (Stevenson et al. 2012).
The glycolytic reaction sequence is well-suited to ATP regeneration in the cell-free system, since the reactions do not require O2 and the presence of two substrate-level phosphorylation steps enhances the yield of ATP. When glucose is used, two ATP equivalents are required for substrate activation but four ATP are formed, and two equivalents of inorganic phosphate are recycled. One drawback of this scheme is that the redox stoichiometry of glycolysis results in a net production of NADH as an end product (Kim et al. 2008).
A number of variations on this theme have been described. The PANOxSP [PEP, Amino acids, NAD+, Oxalic acid, Spermidine and Putrescine] system utilizes PEP as a simple phosphate donor but extends the biochemical processing through the pyruvate dehydrogenase complex, and recycles one equivalent of inorganic phosphate, with a net yield of two ATP for each PEP (Calhoun and Swartz 2007) (Fig. 2). Another system, called cytomim (cytosolic mimicry), effectively reconstitutes oxidative phosphorylation, consuming simple organic acids (e.g., succinate) and using inverted respiratory membranes to generate ATP (Jewett and Swartz 2004; Jewett et al. 2008).
A particularly interesting method of ATP-regeneration has recently been described that involves the use of glucose storage polymers (maltodextrin or soluble starch) as a reservoir of chemical free energy (Wang and Zhang 2009; Kim et al. 2011). This approach is related to the reconstituted glycolysis system described above but is extended to include phosphorolytic cleavage of the storage polymer. There are four major advantages of this scheme: (1) it allows use of an inert and inexpensive maltodextrin or starch substrate, (2) it provides a ‘pacekeeper’ reaction that moderates ATP production, (3) it generates a high phosphoryl group transfer potential intermediate (glucose 1-phosphate) in situ, and (4) it completely recycles inorganic phosphate. When combined with the reactions of glycolysis and the PANOx pathway, the reactions have a theoretical yield of four ATP per glucose. This appears to be the most efficient energy coupling module currently available. Development of reaction modules to drive cell-free protein synthesis is an active area of research, and the possibilities are far from being exhausted.
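The ATP yields quoted for the different regeneration modules can be compared side by side. The sketch below simply tabulates the net yields stated in the text (one ATP per 3-phosphoglycerate, two per PEP in PANOxSP, four formed minus two invested for glucose, and a theoretical four per glucose unit for the maltodextrin scheme) and estimates how much substrate a given ATP demand would require; the dictionary keys and function name are illustrative, and real reactions will fall short of these theoretical values.

```python
# Net ATP formed per unit of substrate, as stated in the text.
NET_ATP_PER_SUBSTRATE = {
    "3-phosphoglycerate": 1,                      # 1:1 substrate-level phosphorylation
    "PEP (PANOxSP)": 2,                           # extended through pyruvate dehydrogenase
    "glucose (glycolysis)": 4 - 2,                # four ATP formed, two invested
    "maltodextrin glucose unit (with PANOx)": 4,  # theoretical yield
}


def substrate_required_mM(atp_demand_mM: float, substrate: str) -> float:
    """Minimum substrate concentration (mM) that could supply the ATP demand."""
    return atp_demand_mM / NET_ATP_PER_SUBSTRATE[substrate]


if __name__ == "__main__":
    demand = 24.0  # mM ATP equivalents, an arbitrary example value
    for name in NET_ATP_PER_SUBSTRATE:
        print(f"{name:42s} {substrate_required_mM(demand, name):6.1f} mM")
```

On these figures the maltodextrin scheme needs a quarter of the substrate (per glucose unit) that a 1:1 phosphate donor would, which is the sense in which it is the most efficient coupling module described here.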
Some of the limitations inherent in the closed-system (batch) reaction schemes described above can be overcome in a continuous-exchange open-system format that allows substrates to be replenished during the reaction (Yin et al. 2012). For example, filtration or dialysis can be used to continuously supply ATP as it is consumed and to remove inorganic phosphate as it is formed. This strategy is very effective at extending the duration of active translation in small-scale cell-free systems, but is not as scalable. Other strategies, in which the open-system character is restricted to pH control, have been used to extend cell-free protein synthesis to the industrial scale, resulting in a high yield of a disulfide-bonded protein product (700 mg/l) (Zawada et al. 2011).
Cell-free protein synthesis is programmed by addition of a DNA template, formed from either closed circular vector DNA or a linear PCR product. Transcription is performed by recombinant phage T7 RNAP, generating the mRNA upon which the ribosomal translation machinery acts (Beckert and Masquida 2011). T7 RNAP is a relatively simple, single-subunit polymerase with high promoter specificity and transcriptional fidelity (Sousa and Mukherjee 2003).
The minimum requirements for the DNA template include a 5′-untranslated region (UTR) comprising a strong T7 promoter sequence to support a high transcription rate, a Shine-Dalgarno sequence that serves as the ribosome entry point, and a 3′-UTR that includes an efficient translation termination codon (e.g., TAA), followed by six or more nucleotides (Shimizu and Ueda 2010). An epsilon (Enhancer of Protein Synthesis Initiation) sequence may be added to the 5′-UTR to improve translation, and a T7 terminator in the 3′-UTR improves efficient release and recycling of the ribosome. This simple transcription unit is easily prepared by two-step PCR, allowing virtually any coding sequence to be assembled together with promoter and terminator elements for cell-free synthesis.
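The minimal template layout described above can be illustrated in silico. The following sketch assembles a linear template from a T7 promoter, a Shine-Dalgarno motif, a coding sequence, a TAA stop codon and a short 3′ extension, and checks the requirements listed in the text. The promoter and Shine-Dalgarno strings are the commonly used consensus sequences; the spacer and trailer sequences, and the function name, are arbitrary placeholders rather than the elements of any particular kit or vector. The optional epsilon enhancer and T7 terminator are omitted.

```python
# In silico assembly of a minimal template for T7-driven cell-free expression.
T7_PROMOTER = "TAATACGACTCACTATAGGG"  # consensus promoter; transcript starts at GGG
LEADER = "AGACCACAACGGTTTCCCTC"       # placeholder 5'-UTR spacer (assumption)
SHINE_DALGARNO = "AGGAGG"             # consensus E. coli ribosome-binding motif
SD_SPACER = "ATATACAT"                # placeholder spacer before the start codon
TRAILER = "GGATCCGAATTC"              # placeholder 3' extension, at least 6 nt

STOP_CODONS = {"TAA", "TAG", "TGA"}


def build_template(cds: str, stop: str = "TAA") -> str:
    """Return promoter + 5'-UTR + CDS + stop codon + 3' extension.

    `cds` is the coding sequence from ATG up to, but not including, the stop codon.
    """
    cds = cds.upper()
    if set(cds) - set("ACGT"):
        raise ValueError("coding sequence must contain only A, C, G, T")
    if len(cds) % 3 != 0 or not cds.startswith("ATG"):
        raise ValueError("coding sequence must start with ATG and be in frame")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    if STOP_CODONS & set(codons):
        raise ValueError("coding sequence contains an in-frame stop codon")
    if stop not in STOP_CODONS:
        raise ValueError("unknown stop codon")
    if len(TRAILER) < 6:
        raise ValueError("3' extension must be six or more nucleotides")
    return T7_PROMOTER + LEADER + SHINE_DALGARNO + SD_SPACER + cds + stop + TRAILER


if __name__ == "__main__":
    print(build_template("ATGGCTAAA"))
```

In practice the flanking elements would be introduced on the primers of the two-step PCR; the function above only mirrors that assembly as string concatenation with basic sanity checks.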
Transcription and translation processes can be arranged as either linked or coupled reactions. Linked transcription/translation implies a two-step sequential process, where the transcript is formed first. In contrast, coupled transcription/translation implies simultaneous synthesis of mRNA and protein within an extended polysome complex. Cell-free protein synthesis based on prokaryotic components generally involves a coupled reaction system (Fig. 1).
It is possible to extend cell-free synthesis to simultaneous production of multiple polypeptides in a single reaction mixture, since addition of multiple templates results in the parallel synthesis of distinct proteins. This approach can be used to assemble complex multicomponent proteins, as demonstrated by the successful cell-free synthesis of the heterotrimeric core of Paracoccus denitrificans cytochrome c oxidase in an E. coli extract (Katayama et al. 2010). An alternative strategy for multigene expression from polycistronic constructs has been demonstrated for production of up to five distinct protein products from a single ‘BioBrick’ plasmid template (Du et al. 2009). Sequential synthesis is also possible: by immobilizing template DNA on magnetic microbeads, cell-free protein synthesis can be arbitrarily reset and reprogrammed, an example of artificial gene circuits (Lee et al. 2012).
Typical protein synthesis reactions are driven by transcription from sub-pmol quantities of template, and continuing improvements in the efficiency of transcriptional processing have brought the technology near the single-molecule limit of template sensitivity. In a recent example, green fluorescent protein was produced at quantized expression levels from 1 to 2 copies of template in picoliter volume reaction microchambers (Okano et al. 2012). This level of sensitivity is important for a variety of nanoscale processing applications, including printing protein microarrays.
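To see why picoliter chambers approach the single-molecule limit, it helps to convert template concentration into copy number. The sketch below does this conversion and, as an additional assumption not taken from the cited work, models random partitioning of template molecules into chambers with a Poisson distribution; the function names are illustrative.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def mean_copies(conc_molar: float, volume_litres: float) -> float:
    """Average number of template molecules in a chamber of the given volume."""
    return conc_molar * volume_litres * AVOGADRO


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of finding exactly k molecules, assuming random partitioning."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


if __name__ == "__main__":
    # A 1 pM template solution in a 1 pl chamber holds less than one copy on average.
    mean = mean_copies(1e-12, 1e-12)
    print(f"mean copies per chamber: {mean:.3f}")
    for k in range(4):
        print(f"P({k} copies) = {poisson_pmf(k, mean):.3f}")
```

At these dilutions most chambers hold zero, one or two template molecules, which is consistent with expression appearing in discrete steps rather than as a continuum.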
Addition of a species-independent translation sequence (SITS) in the 5′-UTR of the template eliminates species barriers to cell-free translation. This universal adapter element relaxes secondary structure in the transcript and facilitates assembly of the translation complex in yeast, wheat germ, insect cell, rabbit reticulocyte and E. coli translation systems, opening up new possibilities for cell-free protein expression (Mureev et al. 2009).
tRNA aminoacylation module
Amino acids are activated for peptide bond formation by aminoacyl tRNA synthetases that are specific for the amino acid substrate and the cognate set of tRNAs, covalently linking the residue to the 3′-terminal adenosine of the tRNA (Ling et al. 2009). Formation of inorganic pyrophosphate as a by-product during amino acid activation requires inorganic pyrophosphatase and adenylate kinase to be present to drive the reaction and recycle AMP.
Protein synthesis requires that all 20 amino acids are present in amounts super-stoichiometric with the amount of protein to be formed, in addition to catalytic amounts of all 20 aminoacyl tRNA synthetases and the full complement of tRNAs. In the PURE system the synthetases are affinity purified recombinant products, and tRNAs are isolated from special E. coli strains (Shimizu and Ueda 2010). The modular design of the PURE translation system makes it possible to supplement with rare tRNAs to compensate for codon bias in the template.
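Deciding whether a template would benefit from rare tRNA supplementation starts with counting the codons it uses. The following sketch scans a coding sequence for a small set of codons that are commonly cited as rare in E. coli; the set itself is an illustrative assumption and should be replaced with usage data for the actual tRNA source, and the function name is hypothetical.

```python
from collections import Counter

# Codons often cited as rare in E. coli (illustrative list, not exhaustive).
RARE_CODONS = {"AGA", "AGG", "ATA", "CTA", "CCC", "GGA"}


def rare_codon_counts(cds: str) -> Counter:
    """Count occurrences of each rare codon in an in-frame coding sequence."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    codons = (cds[i:i + 3] for i in range(0, len(cds), 3))
    return Counter(c for c in codons if c in RARE_CODONS)


if __name__ == "__main__":
    counts = rare_codon_counts("ATGAGAAGGCTGATA")
    for codon, n in sorted(counts.items()):
        print(codon, n)
```

A template with clusters of such codons is a candidate for supplementing the reaction with the corresponding tRNAs, or for recoding the gene.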
Modifying the composition of the amino acid mixture can be used to control the translation process. For example, using a drop-out mixture lacking one amino acid will result in ribosome stalling or pausing, allowing expression to be synchronized. Substitution with isotopomers facilitates isotopic labeling (with stable isotopes or radionuclides) while avoiding isotope dilution and scrambling resulting from unavoidable metabolic processing in cell-based systems (Ozawa et al. 2005; Su et al. 2011; Yokoyama et al. 2011). These approaches also permit efficient incorporation of Se-methionine into proteins without the toxic side effects that can compromise in vivo selenium labeling (Kigawa et al. 2002). Insertion of photochemically active analogs that are recognized by the native E. coli tRNA aminoacylation machinery (e.g., photo-leucine and photo-methionine (Suchanek et al. 2005)) is also expected to be straightforward. Alternatively, bioorthogonal pairs of suppressor tRNA/cognate aminoacyl tRNA synthetase can be added to the system to direct the incorporation of nonnatural amino acids into the protein product (Goerke and Swartz 2009; Ozawa et al. 2012), or aminoacylating ribozymes (flexizymes) can be used to expand the genetic code (Goto et al. 2011; Goto and Suga 2012) (see below).
Peptide synthesis module
Peptide synthesis is catalyzed by ribosomes, large (megadalton) ribonucleoprotein complexes composed of more than 50 proteins and three RNAs that represent about 30 % of the mass of rapidly growing bacterial cells (Bremer and Dennis 2008). Although in principle a small, catalytic amount of ribosomes could be continuously recycled during cell-free protein synthesis, in practice the yield of protein tends to be roughly proportional to the quantity of ribosomes, which therefore represent one of the main components of the reaction mixture. Because of their abundance, unmodified native ribosomes may be easily isolated from E. coli. However, an engineered strain of E. coli producing His-tagged ribosomes is now available, allowing a one-step purification of active ribosomes (Ederth et al. 2009) that lend themselves to the ‘reverse purification’ strategy described above. The PURE systems also provide control over the availability of release factors, facilitating applications like ribosome display (see below) by promoting stalling of the ribosome on the mRNA, or incorporation of nonnatural amino acids by suppressor tRNAs.
The modular design of the cell-free expression system provides a high degree of control over posttranslational processing events. One of the most widely appreciated advantages of the cell-free system is the possibility of directing insertion of an integral membrane protein product into a lipid structure, avoiding problems that are often encountered in high-level expression of membrane proteins in cellular systems (Schneider et al. 2010). Expression of integral membrane proteins in living cells can be problematic as a result of toxic side-effects relating to membrane disruption. Cell-free synthesis of integral membrane proteins has been accomplished utilizing a variety of lipid structures, including liposomes, micelles, bicelles and nanodiscs (Lyukmanova et al. 2012). Bacteriorhodopsin synthesized in a cell-free system has been cotranslationally inserted into giant liposomes, where its photochemical proton pumping function could be demonstrated (Kalmbach et al. 2007). Nanodisc technology appears to be particularly well-suited to marrying cell-free synthesis with protein structural studies, since the nanodisc is structurally well-defined (Bayburt and Sligar 2010).
Correct folding of the recombinant protein can be assisted by supplementing the basic PURE system with molecular chaperones to enhance the efficiency of protein folding, helping polypeptides navigate the folding funnel and reach their native conformational state (Shimizu et al. 2005; Ueda 2008). Interestingly, even when the cell-free synthesis protein product is insoluble, it tends to be more readily solubilized and re-folded than proteins that have been recovered from inclusion bodies in cell-based expression systems (Swartz 2012). The formation of disulfide bonds is another important posttranslational processing step that can be problematic in prokaryotic expression systems. In cell-free protein synthesis, disulfide bond formation has been shown to be enhanced by addition of a glutathione redox buffer that facilitates disulfide exchange (Goerke and Swartz 2008; Knapp et al. 2007).
Applications of cell-free protein synthesis
Cell-free protein synthesis is an enabling technology that has the potential to transform many aspects of biotechnology. Its value has already been demonstrated in production of ‘difficult’ proteins that may be toxic to conventional cell protein factories, and new applications are emerging in protein engineering, drug discovery and synthetic biology.
Printing protein arrays
Protein microarrays have emerged as a promising platform for high-throughput screening in postgenomic biomedical research, permitting massively parallel functional analysis of complete proteomes and providing an important tool for vaccine development and personalized medicine (Berrade et al. 2011). Cell-free protein synthesis is a crucial link between the established technology of DNA microarrays and protein arrays allowing multiplexed, robotic processing to be used to print protein arrays from gene chips. Cell-free expression based in situ protein microarrays can be produced by a variety of methods that take advantage of the programmability of ribosomal protein synthesis by transcripts formed from either linear DNA (generated by PCR) or circular plasmid vectors: PISA (protein in situ array), DAPA (DNA array to protein array), NAPPA (nucleic acid programmable protein array) and TUS-TER mediated affinity labeling differ in the details of template construction and affinity capture strategy (Nand et al. 2012).
Ribosome display is an approach to protein functional analysis, and a platform for protein engineering, that is based on the stability of the ribosomal translation complex (composed of mRNA, ribosome, and nascent polypeptide) in the absence of release factors. This translation complex provides a physical link between gene and protein that can be utilized in a recursive cycle of functional selection and mutagenesis to drive protein evolution (Plückthun 2012). The modular composition of the PURE cell-free system is especially well-suited to this application (Ohashi et al. 2007; Ueda et al. 2010), because it allows release factors to be omitted as a rational choice in the preparation and can accommodate incorporation of nonnatural amino acids (Watts and Forster 2012).
Labeling of recombinant proteins with stable isotopes or radionuclides is important for a variety of applications, including biological NMR and tracer experiments. When proteins are expressed in living cells or cell extracts, labeling is complicated by metabolic processes that dilute isotope or scramble labeling patterns as a result of molecular transformations, although strategies have been devised to suppress these side-reactions (Su et al. 2011; Yokoyama et al. 2011; Matsuda et al. 2007). In contrast, the severely edited metabolic map in PURE cell-free protein synthesis may be expected to eliminate many of these problems, because the interfering enzymes and metabolites are not present (Ozawa et al. 2005).
Incorporation of nonnatural amino acids
Expansion of the genetic code is perhaps the most significant development in ribosomal protein synthesis since biochemistry based on the canonical set of 20 amino acids evolved 3 billion years ago, adding new dimensions to protein engineering (Liu and Schultz 2010). The approach is conceptually simple, using a bioorthogonal cognate pair of tRNA and aminoacyl tRNA synthetase to suppress nonsense or frameshift mutations, incorporating nonnatural amino acids into the growing polypeptide and thereby creating a chemical toolbox for protein engineering. By systematic evolution of cognate tRNA/aminoacyl tRNA synthetases, more than 50 distinct nonnatural amino acids have been incorporated into proteins in this way, adding fluorescent tags or reactive groups that contribute novel chemistry or functionality in the protein product (de Graaf et al. 2009; Hartman et al. 2007).
While the majority of this work has been performed in vivo, cell-free expression systems offer some important advantages (Goerke and Swartz 2009; Ozawa et al. 2012). For example, cellular systems are susceptible to toxic side-effects of non-natural amino acids in cellular proteins or recombinant products. In contrast, cell-free systems avoid toxicity issues. Further, the cell-free system can be more easily tailored to accommodate unusual side chain structures by modifying the components of the translation machinery. Mutation of EF-Tu has already been shown to improve the efficiency of incorporating nonnatural amino acids with bulky side chains in the PURE system (Doi et al. 2007). The well-defined chemistry of the PURE system also lends itself to customization in terms of the choice of tRNAs and aminoacyl tRNA synthetases. Non-natural amino acid chemistry can be introduced into the protein product simply by adding the bioorthogonal components. The effectiveness of this approach has already been demonstrated in cell-free extracts by preparation of p-propargyloxyphenylalanine-containing proteins that can be crosslinked to form bioconjugates via click chemistry (Bundy and Swartz 2010). Selective omission of the RF1 release factor can enhance suppression by the bioorthogonal tRNA, allowing incorporation of nonnatural amino acids at multiple sites (Johnson et al. 2011; Loscha et al. 2012).
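The effect of nonsense suppression on the translated product can be illustrated with a toy translation routine. In the sketch below the amber codon (TAG) is read as a stop when release factor activity dominates, and as a nonnatural residue (written X) when an orthogonal suppressor tRNA is assumed to decode it with complete efficiency, as in a reaction lacking RF1. The codon table is the standard genetic code; the function name and the all-or-nothing treatment of suppression are simplifying assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str, amber_suppressed: bool = False, nonnatural: str = "X") -> str:
    """Translate a coding sequence, optionally reading TAG as a nonnatural residue."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        if codon == "TAG" and amber_suppressed:
            protein.append(nonnatural)  # suppressor tRNA decodes the amber codon
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break                       # release factor terminates translation
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    gene = "ATGTAGGCTTAGAAATAA"
    print(translate(gene))                         # truncated at the first amber codon
    print(translate(gene, amber_suppressed=True))  # both amber sites read through
```

With suppression switched on, both amber sites in the example are read through and the ochre codon (TAA) still terminates the chain, mirroring the multi-site incorporation that RF1 omission makes possible.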
Ribozyme technology has recently been developed as an alternative strategy for expanding the genetic code. Flexizymes are small catalytic RNAs that support acylation of tRNA with virtually any amino acid analog (Goto et al. 2011; Goto and Suga 2012). Flexizymes can be subjected to rapid in vitro molecular evolution to select for optimized performance, and the lack of membrane barriers in the cell-free system makes it the ideal platform for both selection and application of flexizyme technology.
Table 1 Comparison of platforms for protein synthesis
Columns: platform; substrates; elongation rate (residues/s); polypeptide length range (residues)
Cell-free (ribosomal): 20 α-amino acids
In vivo (ribosomal): glucose, inorganic salts
Solid phase (nonribosomal): 20 protected α-amino acids
Cell-free protein synthesis has emerged as an important and effective alternative to both cell-based expression systems and solid-phase protein synthesis (Table 1). Advances in cell-free methods, including commercialization of PURE expression technology, are creating new possibilities for applications in biotechnology, ranging from microscale to industrial scale and in diverse areas including protein microarrays, biotherapeutics and biomaterials.
The author would like to thank Corinna Tuckey (New England Biolabs) for helpful discussions. This work was supported by the National Institutes of Health (GM42680 to J. W. W.).