
1 Introduction

Proteins whose function is not well characterized make up a large percentage of the entries in many currently available biological databases, including the Protein Data Bank (PDB), and there is a constantly growing demand for reliable and fast synthesis and characterization methods. In drug discovery, proteins are key components because they can have therapeutic potential themselves (e.g., antibodies, coagulation factors, hormones, growth factors, enzymes, and antimicrobial peptides) and because they can serve as drug targets for diverse diseases (e.g., ion channels, receptors, enzymes, and transporters) [1,2,3,4,5,6,7]. A large proportion of approved pharmaceutical drugs target human proteins. Beyond that, protein-based therapeutics, such as antibody–drug conjugates, represent a significant percentage of the drug molecules currently approved. Their share is poised to grow further with advances in gene expression technology, improved protein engineering, and refined bioinformatics tools. Some proteins are very difficult to express in traditional cell-based systems, which can hamper our ability to define the mechanism of action and structure–function relationship of the individual protein, knowledge that aids the development of drugs targeting these proteins [1,2,3,4].

Generally, to exploit and fine-tune the structural and functional characteristics of a protein, it needs to be expressed and purified with high quality using recombinant expression technology. Traditionally, Escherichia coli-based systems have been widely used for the production of recombinant proteins owing to their simple preparation and operation and their cost effectiveness. As a result, extensive research and standardization have been carried out over many years using E. coli-based expression systems, and they are often cited as the state-of-the-art protein expression system [8]. For complex therapeutic proteins, membrane proteins (MPs) originating from humans, and virus-like particles (VLPs), mammalian expression systems fulfill all the requirements, such as post-translational modifications (PTMs), cofactors, and chaperones, for correct folding and efficient production. However, batch-to-batch variation in cell culture may be a source of process variation. Additionally, overexpression of MPs might be toxic for the cultivated cells, resulting in cell death or truncated and misfolded proteins [9].

Ideally, synthesized proteins are functionally folded and exhibit appropriate PTMs. Owing to the lack of extensive research and the low yields in recombinant protein expression, many MPs have not yet been crystallized, which limits computer-aided drug discovery efforts. Given the growing demand for protein biologics and for drug discovery targeting proteins, alternative strategies for protein synthesis should be developed. New expression technologies that allow proteins to be expressed in a simple way and that permit cost-effective high-throughput screening of different reaction conditions, genes, and supplements are extremely important for future drug development. In this review, we give an overview of recent advances in cell-free (CF) synthesis platforms and their diverse applications. Additionally, we focus on human and therapeutic proteins produced by different types of CF systems and on how these CF protein synthesis (CFPS) methods can play a prominent role in future drug development.

2 Cell-Free Protein Synthesis Systems

CFPS systems use crude cell extracts prepared from cells of choice by lysis followed by several washing steps to remove cell debris and genomic DNA [10, 11]. These cell extracts can be stored at −80 °C for years and are thawed just before the reaction. Such extracts contain all the principal components necessary for transcription and translation, such as aminoacyl-tRNA synthetases (AAS), ribosomes, and the factors necessary for transcription, initiation, and elongation. Protein synthesis is achieved by combining cell extracts with the necessary substrates, such as amino acids, energy substrates, DNA, cofactors, salts, and nucleotides. The appropriate CF system can be selected depending on the biochemical properties of the protein and its end application. CFPS is a fast protein production system since it does not require transfection or cell culture and lacks cell viability constraints. Owing to their open nature, CFPS platforms offer additional advantages over cell-based expression methods. A comparative analysis of CF and cell-based approaches is shown in Table 1.

Table 1 Comparison of in vitro (cell-free) and in vivo (cell-based) protein synthesis formats for drug discovery applications

For complex proteins, eukaryotic CF systems are ideal as they contain the endogenous microsomes derived from the endoplasmic reticulum (ER), enabling co-translational translocation of proteins and ER-based PTMs [10, 11, 18, 19]. There has been a constant improvement in the quality of lysate preparation, system optimization, linear template-based protein synthesis, and reduction of process costs, which has led to the preparation of cost-effective systems suitable for commercial purposes.

3 Cell-Free Systems

A general scheme of CF protein production is depicted in Fig. 1. CFPS platforms are based on either prokaryotic or eukaryotic origin. Among the prokaryotic CF systems, extracts based on E. coli are regularly used and are available commercially for CFPS of a diverse range of proteins. Very recently, CF systems based on Bacillus subtilis [32], Pseudomonas putida [33], Streptomyces [34], and Vibrio [12] have been optimized well at the laboratory level due to the ease of preparation of CF lysates. A wide range of detailed protocols is currently available for the preparation of E. coli-based lysates. Among the eukaryotic CF systems, extracts based on rabbit reticulocyte lysate (RRL), wheat germ, insect Spodoptera frugiperda 21 (Sf21), Chinese hamster ovary (CHO), and cultured human cells are regularly used. An increasing number of eukaryotic CF systems have so far reached technical maturity and become commercially available.

Fig. 1 General scheme depicting the overall process of cell-free protein production. aatRNA aminoacyl-tRNA, AAS aminoacyl-tRNA synthetase, ATP adenosine triphosphate, EF elongation factor, GSH glutathione, GSSG glutathione disulfide, GTP guanosine-5′-triphosphate, IF initiation factor, IRES internal ribosome entry site, MP membrane protein, nCAA non-canonical amino acid, PDI protein disulfide isomerase, PEG polyethylene glycol, PTM post-translational modification, R ribosomes, tRNA transfer RNA, TF transcription factor, UTR untranslated region, VLP virus-like particle

Recently, several eukaryotic CF extracts based on tobacco [35], Leishmania [36], Neurospora [37], yeast cells [38], and human blood cells [39] were characterized and optimized for a limited number of proteins at the laboratory level. There is a growing trend toward developing novel CF platforms that take advantage of the available genetic tools and the abundant literature on the in vivo expression of proteins.

3.1 Prokaryotic Cell-Free Platforms

Prokaryotic CF systems based on E. coli are the most commonly used for protein production towards drug development owing to their simplicity and the vast literature available on these cells. Protein synthesis starts with crude cell extracts prepared from E. coli cells, which contain the translation machinery along with all the essential components required for translation. A modified and reconstituted CF synthesis system known as the PURE system (protein synthesis using recombinant elements), in which all the components of the translation machinery are purified and added individually along with the DNA template to produce the protein, has also been reported [40]. This is a highly controlled system compared with crude extract methods. A major advantage of the PURE system is that the protein factors participating in the initiation, elongation, and termination of protein synthesis are defined and can be adjusted individually to the requirements of the CF reaction. Although the naturally occurring PTM machinery is not available in E. coli lysates, proteins with N-glycosylation were recently synthesized using E. coli extracts enriched with glycosylation components, including oligosaccharyltransferases (OSTs) and lipid-linked oligosaccharides (LLOs) [41]. Using release factor 1 (RF1)-deficient E. coli lysates, phosphorylated proteins were produced by incorporation of non-canonical amino acids, which will be addressed in a later part of this review [42].

3.2 Eukaryotic Cell-Free Platforms

Due to a constantly growing demand for more complex proteins of pharmaceutical value, CF systems based on eukaryotic lysates have been developed to produce high-quality proteins. CF systems based on wheat germ lysates (WGL) are among the most popular eukaryotic platforms owing to their capacity to produce eukaryotic proteins with high yields [43]. CFPS based on WGL has been used frequently for the discovery of novel vaccine candidates as well as for producing several high-quality proteins for structural analysis. Despite the high yields and quality of the lysate, this system does not offer PTMs such as glycosylation and does not support the solubilization of complex MPs [13]. Neither wheat germ lysate nor RRL contains translationally active endogenous microsomes. In the case of RRL, exogenous microsomes, typically from canine pancreas, are supplied to the reaction [13, 44]. Enriching RRLs with heterologous microsomes is quite laborious and difficult.

CF systems derived from cultured insect (Sf21) cells are one of the most popular eukaryotic approaches for synthesizing a wide variety of proteins. Sf21 lysates contain translationally active endogenous ER membranes, thereby supporting the signal peptide-mediated translocation of proteins across the membrane, and further provide functions such as signal peptide cleavage and PTMs such as N-glycosylation and lipid modification [13, 14, 45].

CHO cell-based expression is well established and has been approved by the FDA for the large-scale synthesis of several biologics because proteins produced in CHO cells carry human-compatible PTMs. Nearly 70% of the approved mammalian therapeutic proteins are currently expressed in CHO cells. However, these cells have limitations when it comes to difficult-to-express proteins, such as overexpressed complex MPs, toxic proteins, and multi-subunit proteins, as discussed above. CF systems based on CHO lysates are evolving as an alternative strategy for the expression of difficult-to-express proteins [13, 17,18,19, 46]. Apart from the many general advantages of CF systems, CHO-based CF systems retain most of the features of CHO cells while being more flexible owing to the lack of cell membrane boundaries. CHO-based lysates harbor endogenous microsomal vesicles, enabling translocation of transmembrane and secretory proteins. Furthermore, PTMs of de novo synthesized MPs, such as glycosylation, are possible using CHO lysate. Thus, CHO cell lysate has potential value for CFPS and opens new opportunities, in particular the high-yield production of pharmaceutically relevant MPs [13, 17,18,19]. The number of publications using CHO lysates for CFPS has increased significantly. Table 2 compares different CF systems and their advantages and limitations with some selected examples.

Table 2 Comparison of different cell-free systems reported in the literature with their significant properties

4 Cell-Free Protein Synthesis Reaction Formats

CFPS can be performed in different formats. The batch-based format is the most commonly used method both in the prokaryotic and eukaryotic systems. This method is relatively fast and cheap, and synthesis can be performed within 1.5–3 hours depending on the system. E. coli-based systems can provide protein yields ranging from 100 µg/mL to 2–3 mg/mL. Although the yields from batch-based eukaryotic systems are comparatively low, MPs are automatically incorporated into microsomal membranes and the functionality can be addressed immediately after the synthesis [62]. For researchers who would like to further scale up the protein yields via batch-based eukaryotic systems, a repetitive batch-based synthesis format has been proposed where the microsomes incorporating the MP of interest generated in an initial synthesis reaction can be added to a fresh CF synthesis reaction that has been depleted of its microsomes [19, 45].

Another popular CF synthesis format that has been used to increase protein yields is the continuous exchange cell-free (CECF) format. In this format, a semi-permeable dialysis membrane separates the reaction chamber from a feed chamber. The feed chamber supplies fresh reaction components to the reaction chamber and, in exchange, inhibitory by-products accumulated during the reaction are removed [14, 17, 18, 46]. Typically, the CECF format prolongs the reaction time and increases protein yields. To date, the CECF format has been used to increase protein yields several-fold and is widely used across CF platforms (Table 2).
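The principle of the exchange format can be illustrated with a toy mass-balance model. The following Python sketch is not a validated kinetic model of any lysate: the rate constants, the form of the inhibition term, the exchange rate, and the feed-to-reaction volume ratio are all arbitrary assumptions chosen only to show why supplying substrate and removing by-products prolongs synthesis.

```python
def simulate(exchange_rate, hours=24.0, dt=0.01, feed_ratio=10.0):
    """Toy model of a CF reaction; returns the protein made (arbitrary units).

    exchange_rate = 0 corresponds to batch mode; a value > 0 mimics dialysis
    against a feed chamber that is `feed_ratio` times larger than the
    reaction chamber. All parameter values are illustrative assumptions.
    """
    k_syn, k_inh = 1.0, 0.5      # hypothetical synthesis and inhibition constants
    s, i, p = 1.0, 0.0, 0.0      # reaction chamber: substrate, by-product, protein
    s_feed, i_feed = 1.0, 0.0    # feed chamber concentrations
    for _ in range(int(hours / dt)):
        rate = k_syn * s / (1.0 + i / k_inh)    # synthesis slows as by-product builds up
        flux_s = exchange_rate * (s_feed - s)   # substrate diffuses into the reaction
        flux_i = exchange_rate * (i - i_feed)   # by-product diffuses out of the reaction
        s += (flux_s - rate) * dt
        i += (rate - flux_i) * dt
        p += rate * dt                          # protein is retained by the membrane
        s_feed -= flux_s * dt / feed_ratio      # feed chamber is slowly depleted
        i_feed += flux_i * dt / feed_ratio
    return p


batch_yield = simulate(exchange_rate=0.0)
cecf_yield = simulate(exchange_rate=1.0)
print(f"batch: {batch_yield:.2f}  CECF: {cecf_yield:.2f} (arbitrary units)")
```

In this sketch the batch reaction stops once its initial substrate is consumed, whereas the exchange reaction keeps drawing on the larger feed reservoir; the actual gain in a real system depends on the lysate, the membrane, and the protein.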

5 Parameters Influencing Cell-Free Protein Production

This section highlights some of the key parameters that might influence the protein production using CF lysates.

5.1 Gene Design

Designing synthetic DNA and manipulating sequences for CF synthesis by adding regulatory elements plays a significant role in high-yield protein production. In eukaryotic CF systems, initiation factors (IFs) in particular limit the initiation of protein synthesis, leading to low protein yields. One alternative is to place internal ribosome entry site (IRES) elements, which are found in the 5′-untranslated region (5′UTR) of different viral genomes, upstream of the start codon for cap-independent translation initiation [14, 18, 62]. IRES elements from three different viral sources were compared for their translational efficiency in Sf21, CHO, and human leukemia K562 CF lysates. The IRES from the cricket paralysis virus (CrPV) typically increased protein yields by a factor of 3–5 [62]. When the CrPV-IRES was inserted into the corresponding vector upstream of the epidermal growth factor receptor (EGFR) gene and the CECF reaction format was used, EGFR yields increased more than 100-fold compared with the batch reaction format without CrPV-IRES [14]. Additionally, replacing the initiator codon (ATG) with a GCU codon in combination with the CrPV-IRES further improved protein expression levels in CHO and K562 CF systems [62]. The vector backbone also plays an important role in CFPS. A detailed study comparing commercially available vectors harboring the luciferase gene in combination with CrPV-IRES showed a significant 5-fold increase in protein yield with a change in the vector backbone [19].

Species-independent translational sequences (SITS) are another group of synthetic 5′UTRs capable of initiating cap-independent translation in multiple prokaryotic and eukaryotic CF systems [66]. Typically, polymerase chain reaction (PCR) products are generated with the SITS downstream of the T7 promoter and upstream of the ATG start codon [66]. The 3′ hairpin region of the SITS increases the residence time of the preinitiation complex in the vicinity of the start codon [66]. Using L. tarentolae CFPS with 58 Rab-encoding gene fragments in combination with a universal SITS, nearly a full complement of human Rab GTPases was produced with a yield of around 30 µg/mL [36]. Similarly, EGFP with a yield of around 300 µg/mL [66] and an active multisubunit enzyme, the heterodimeric farnesyl transferase (FTase) [36], were synthesized using L. tarentolae CFPS.
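The template layout described here (T7 promoter, then a 5′ leader such as a SITS or IRES, then the open reading frame) can be sketched as a simple sequence assembly with basic sanity checks. This is only an illustration: the function name is hypothetical, the leader sequence used below is a placeholder and not a real SITS or IRES element, and real templates usually also carry 3′ elements that are omitted here.

```python
T7_PROMOTER = "TAATACGACTCACTATAGGG"   # consensus T7 promoter sequence
STOP_CODONS = {"TAA", "TAG", "TGA"}


def build_linear_template(leader, orf, start_codon="ATG"):
    """Assemble promoter + 5' leader + ORF and check the reading frame.

    `leader` stands for a cap-independent 5'UTR element (placeholder here).
    `start_codon` can be changed, e.g. to "GCT" for the GCU-initiated
    constructs used together with the CrPV-IRES.
    """
    leader, orf = leader.upper(), orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    if not orf.startswith(start_codon):
        raise ValueError(f"ORF does not start with {start_codon}")
    if orf[-3:] not in STOP_CODONS:
        raise ValueError("ORF does not end with a stop codon")
    return T7_PROMOTER + leader + orf


# Placeholder leader and a toy three-codon ORF, for illustration only.
template = build_linear_template(leader="AAAA", orf="ATGGCTTAA")
print(template)
```

Checking the frame before ordering primers or synthetic fragments is a cheap way to catch the most common template design errors.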

Codon optimization is another important parameter that plays a crucial role in increasing the expression yields of proteins. Codon optimization has been shown to influence the translation efficiency of several proteins [71]. By taking advantage of the CF lysates derived from N. crassa and S. cerevisiae, transcription and translation reactions were uncoupled for ribosome profiling, which provided strong biochemical evidence that codon optimization enhances the rate of translational elongation, thereby affecting the ribosome traffic on the mRNA [72]. On the one hand, codon optimization usually improves protein yields, but on the other hand, it was shown that faster translation rates might negatively affect the protein folding and function of the individual protein [72, 73]. This problem often cannot be solved even by altering the tRNA population in the case of CFPS.
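One simple form of codon optimization replaces every codon with the synonymous codon that is most frequent in the expression host. The sketch below shows this "most frequent codon" strategy under stated assumptions: the usage values are invented for illustration and do not describe any real organism, and the function names are hypothetical. Only the genetic code table is standard.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG codon ordering.
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(orf):
    """Translate an in-frame DNA sequence into one-letter amino acids."""
    return "".join(CODON_TO_AA[orf[i:i + 3]] for i in range(0, len(orf), 3))


def optimize_codons(orf, usage):
    """Swap each codon for the most frequent synonymous codon in `usage`.

    Codons whose amino acid has no entry in `usage` are left unchanged.
    """
    best = {}
    for codon, freq in usage.items():
        aa = CODON_TO_AA[codon]
        if aa not in best or freq > usage[best[aa]]:
            best[aa] = codon
    return "".join(
        best.get(CODON_TO_AA[orf[i:i + 3]], orf[i:i + 3])
        for i in range(0, len(orf), 3)
    )


# Hypothetical usage frequencies for a few leucine and lysine codons.
usage = {"CTG": 0.50, "CTA": 0.05, "AAG": 0.60, "AAA": 0.40}
original = "ATGCTAAAATAA"
optimized = optimize_codons(original, usage)
assert translate(optimized) == translate(original)  # protein sequence is unchanged
print(original, "->", optimized)
```

As noted above, maximizing codon frequency everywhere is not always desirable, because faster elongation can impair folding; practical designs may therefore deliberately retain slower codons at selected positions.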

The addition of anti-spliced leader oligonucleotide to L. tarentolae cell extracts suppressed the translation of endogenous L. tarentolae mRNAs, thus increasing the translation efficiency of exogenously supplied mRNA [65]. Using the ER-specific signal sequence of honeybee melittin (melittin signal peptide) instead of the native signal peptide increased the translocation of synthesized proteins such as WNT proteins, single-chain antibody variable fragments, and the hTLR9-ectodomain into microsomes in the case of Sf21 and CHO-based CF systems [18, 59, 74, 75].

5.2 Reaction Conditions

Iterative optimization is required to develop high-yield CFPS. Factors that influence both protein quality and quantity include reaction temperature, reaction time, plasmid concentration, salt concentration, T7 polymerase, and other supplements. The influence of these factors on synthesis rates is also protein specific. Very recently, CFPS of human toll-like receptor 9 (hTLR9) in CHO-based lysates was reported using a CECF method, with high yields of around 0.9 mg/mL. Increasing the temperature from 27 to 30 °C raised the protein yield by almost 50%. Stable monitoring and maintenance of pH throughout the entire CF reaction, along with a sufficient adenosine triphosphate (ATP) supply, are essential for efficient, maximum-yield protein production. The pH can be controlled throughout the CF reaction by using an amino acid decarboxylase [76].

5.3 Influence of External Supplements

Supplementation with chaperones influences the functional folding of many proteins. Supplementation with chaperones such as GroES/EL and DnaK/DnaJ/GrpE in prokaryotic CF systems was used to increase the yield and solubility of colicin M from 16 to 100%, resulting in enhanced cell-killing activity [77]. Using CFPS based on wheat germ extracts, Li et al. demonstrated that expression of J-domain-containing chaperone proteins (DNAJB12 and DNAJB14) along with potassium channels plays a critical role in the folding, stabilization, and tetramerization of K⁺ channels [78]. Ion concentrations (potassium and magnesium) in the CF reaction have a significant effect on protein production. In the case of CHO-based CECF reactions, an increase in the magnesium ion (Mg²⁺) concentration from 3.9 to 22.5 mM led to a 3.9-fold increase in EGFR yield [46].

For efficient regeneration of ATP, several methods have been developed in CF systems. In prokaryotic systems, compounds such as phosphoenolpyruvate (PEP), glucose + glutamate decarboxylase, glucose-6-phosphate, fructose-1,6-bisphosphate, acetyl phosphate, maltodextrin, and creatine phosphate are widely used as energy sources [79]. In eukaryotic CF systems, a combination of creatine phosphate and creatine kinase is typically used for energy regeneration. Apart from these, phosphoglycerate (B. subtilis) and polyphosphate are used in CF systems [80, 81].

6 Applications of Cell-Free Systems

CF systems have evolved over the last decade from a prototype method used in research laboratories to commercial and large-scale applications. In this section, the utility of CF systems in MP synthesis, antibody production, vaccine development, protein labeling, and antimicrobial peptide synthesis is addressed.

6.1 Cell-Free Systems for the Synthesis of Membrane Proteins

MPs are structurally and functionally diverse and constitute 30% of the proteins encoded in the human genome. Drugs targeting MPs such as ion channels, transporters, and G-protein coupled receptors (GPCRs) account for 12 of the top 20 drugs by global revenue in the pharmaceutical industry [3]. Owing to the presence of 1 to 24 transmembrane domains, these proteins are highly hydrophobic and very challenging to express in traditional cell-based systems. Expression of human proteins in heterologous cellular hosts is limited by differences in lipid composition, which can prevent the MPs from attaining maximum functionality [9]. Synthesis of MPs by cell-based methods often leads to cytotoxicity, aggregation, and improper folding [9]. To analyze MP functionality, the protein needs to be folded properly and placed in an appropriate hydrophobic environment. CF systems derived from prokaryotic lysates, as well as from eukaryotic lysates lacking endogenous microsomes, require specific supplements for the solubilization of MPs in the form of detergents, nanodiscs, or liposomes. Non-ionic and zwitterionic detergents are commonly used as supplements in the majority of CFPS reactions to solubilize MPs during their production. Detergent-solubilized MPs can either be used directly for functional analysis or be reconstituted into liposomes by mixing with artificial lipids followed by detergent removal [9, 82, 83].

Alternatively, nanodiscs (NDs) and liposome-based reconstitution are detergent-free strategies in which NDs and liposomes, prepared and characterized externally, are supplemented directly into the reaction mixture for reconstitution [9, 84]. A detailed review of the CF synthesis of MPs and the use of solubilization supplements for isolation and functional analysis is available in the literature [9]. Advantages of NDs include easy purification, flexibility in using different lipids and membrane scaffold proteins to create different sizes, and their availability as monodisperse and homogeneous particles. Nonetheless, NDs have their limitations, particularly when working with proteins whose functionality depends on their orientation and when working with transporter proteins. Liposome-based reconstitution overcomes these limitations of NDs, but separating liposomes after the CF reaction is quite challenging, and the liposomes often suffer from disruption due to osmotic instability. Further, such passive reconstitution strategies do not offer the advantages of post-translational modifications within native membranes and are limited to MPs whose function does not depend on active translocon-based translation.

CF systems derived from eukaryotic lysates equipped with endogenous microsomes (e.g., Sf21, CHO, cultured human cells, and Tobacco-BY2) satisfy all the necessary requirements for proper folding of MPs. The microsomes offer a native environment and intact translocon machinery for a proper embedment and folding of MPs [13, 18, 19, 45, 46, 51, 59]. There are continuing efforts in analyzing the functionality of microsomal reconstituted MPs, indicated by a good number of publications reporting on this reconstitution strategy, which should help the pharmaceutical industry to develop more dynamic drug screening assays involving MPs [9, 46, 83]. Here we present recent works on ion channels, transporters, and GPCRs, which constitute more than 40% of the major drug targets in the pharma industry [85].

6.1.1 Ion Channels and Transporters

Ion channels constitute approximately 19% of all current human drug targets and play a crucial role in diverse physiological processes, including cell excitability, neuronal transmission, metabolism, sensory transduction, cognition, and electrolyte homeostasis. Transporters mediate the translocation of a variety of substrates across biological membranes [86]. The solute carrier (SLC) family is the largest class of transporters and is implicated in metabolic conditions and diseases and in the transport of drugs. These proteins typically have 9–12 transmembrane domains and are difficult to express by traditional methods [2, 9, 50, 87]. SLC transporters are an emerging drug target class and the molecular target of several approved inhibitor drugs [2]. Despite this, these classes of proteins have remained largely unexplored owing to the high costs involved and the lack of suitable expression methods [9, 53]. Table 3 highlights selected publications using CF methods for the synthesis, reconstitution, and functional analysis of ion channels and transporters.

Table 3 Functional ion channels and transporters synthesized using CF systems

The most widely used method of reconstitution for functional analysis is detergent-based reconstitution into liposomes or passive integration of MPs into liposomes and NDs. The majority of the functional assays were performed with planar lipid bilayer electrophysiology (PLBE) in the case of ion channels and with substrate uptake assays in the case of transporters.

6.1.2 G-Protein Coupled Receptors (GPCRs) and Drug Discovery in Cell-Free Systems

GPCRs transduce extracellular stimuli to the inside of cells after activation by a variety of molecules such as neurotransmitters, hormones, odorants, and peptides, thereby triggering several signal transduction cascades. The involvement of GPCRs in almost all processes in living cells has resulted in significant pharmaceutical interest in this protein class and in the development of robust assays suitable for high-throughput discovery of novel ligands and drugs targeting these proteins. In principle, ligands can be screened in whole-cell assays by measuring a downstream signaling event, or in CF assays, which are decoupled from the living organism. Usually, these decoupled methods are preferable for high-throughput screening, as they are easy to handle and therefore amenable to automation and downsizing. These features combine well with CFPS. An automated CF synthesis procedure for the production of different MPs has already been reported [97]. This procedure might be further expanded for the parallel analysis and identification of molecules that target different GPCRs. To date, only a few studies have analyzed in detail the activity of receptors produced by CF systems, mainly because few well-established activity assays are available. This section addresses possible activity assays that might be transferred to CF systems in the future.

Radioligand binding assays, the gold standard for identifying binding molecules, are already adapted for GPCRs that have been synthesized in eukaryotic and prokaryotic CF systems, and demonstrate similar binding affinities in comparison with in vivo produced GPCRs [20, 98,99,100]. Alternatively, fluorescently labeled ligands can be analyzed by an optical read-out system using eukaryotic CF systems harboring endogenous membrane structures [101]. Nevertheless, for these systems, radiolabeled or fluorescently labeled ligands are required, thereby limiting the analysis mainly to GPCRs with already known ligands. In addition, simple ligand binding assays usually do not differentiate between an agonistic and an antagonistic effect of the bound substance.
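Binding affinities from such saturation experiments are commonly summarized with the one-site binding model, B = Bmax·[L]/(Kd + [L]). The short Python sketch below evaluates this model; the Bmax and Kd values are hypothetical and serve only to show the shape of the curve, not to represent any receptor discussed here.

```python
def specific_binding(ligand_nM, bmax, kd_nM):
    """One-site saturation binding: B = Bmax * [L] / (Kd + [L])."""
    if ligand_nM < 0 or kd_nM <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return bmax * ligand_nM / (kd_nM + ligand_nM)


# Hypothetical receptor preparation: Bmax = 100 (arbitrary units), Kd = 2 nM.
BMAX, KD = 100.0, 2.0
for ligand in (0.5, 2.0, 8.0, 32.0):
    bound = specific_binding(ligand, BMAX, KD)
    print(f"[L] = {ligand:5.1f} nM -> bound = {bound:5.1f}")
```

At a ligand concentration equal to Kd, half of the binding sites are occupied, which is why comparing Kd values is a convenient way to check whether a CF-produced receptor behaves like its cell-derived counterpart. The model describes occupancy only and, as stated above, says nothing about whether the ligand acts as an agonist or antagonist.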

In this context, measuring downstream signaling to distinguish between an activating and an inhibitory ligand is preferable. One possible method is to measure the receptor-mediated coupling of G proteins [102]. This early event immediately follows GPCR activation and is detected by the binding of [³⁵S]GTPγS to Gα subunits. This method is not yet established in CF systems but might be transferable, assuming Gα proteins are present in the eukaryotic lysate. Alternatively, owing to the open nature of CF systems, the Gα proteins can be co-synthesized with the target GPCR or added directly to the reaction. After GPCR activation, GTP binding and hydrolysis should be detectable.

In addition to ligand binding and G protein coupling, intra- and intermolecular interactions can be visualized by Förster resonance energy transfer (FRET) and bioluminescence resonance energy transfer (BRET) techniques. Different sensor models are known in living cells [103]. Intermolecular interactions can also be monitored in CF systems using the already established in vivo models. One model involves tagging the C terminus of a GPCR of interest with a fluorophore (GFP/YFP) and fusing a binding partner such as β-arrestin to luciferase or a second fluorophore [104]. Upon activation of the GPCR, β-arrestin binds to the receptor and both tags come into close proximity, resulting in a measurable energy transfer. This model requires active G protein-coupled receptor kinases to phosphorylate the C terminus of the synthesized GPCR so that it is recognized by β-arrestins. This requirement has to be analyzed in detail in the individual CF systems. The second known in vivo model visualizes intramolecular changes after agonist and antagonist binding by introducing fluorophores into the third intracellular loop and the C terminus of different GPCRs. Upon activation, the distance between the two fluorophores changes and an alteration in the energy transfer can be measured [105, 106]. Initial experiments to transfer these energy transfer-based sensors to CF systems were recently performed [107]. Both models can be applied to high-throughput analyses.
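The sensitivity of these sensors to small conformational changes follows from the steep distance dependence of Förster resonance energy transfer, E = 1/(1 + (r/R0)⁶), where r is the donor–acceptor distance and R0 the Förster radius. The sketch below evaluates this relation; the distances and the Förster radius are hypothetical round numbers, not measurements from any of the cited sensors.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """Förster relation: E = 1 / (1 + (r / R0)**6)."""
    if distance_nm <= 0 or forster_radius_nm <= 0:
        raise ValueError("distances must be positive")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


R0 = 5.0                              # nm, assumed Förster radius of the dye pair
resting = fret_efficiency(5.0, R0)    # fluorophores exactly R0 apart
active = fret_efficiency(6.0, R0)     # hypothetical 1 nm increase on activation
print(f"resting E = {resting:.2f}, active E = {active:.2f}")
```

In this example a 1 nm change in distance roughly halves the transfer efficiency, which illustrates why ligand-induced rearrangements of a receptor can produce a readable signal.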

In summary, the successful CF synthesis of a variety of GPCRs has been demonstrated in recent years, and the transfer of these GPCR production systems to a high-throughput drug discovery format has recently started. In the near future, we might see novel technologies for ligand screening that exploit the automation and downsizing capacity of CF systems.

6.2 From Antibody Discovery to Production

The gold standard for synthesis, development, and production of antibody-based drugs (based on full-length antibodies) is mammalian cell culture-based expression systems. Although cultivation of mammalian cells is well established and widely used, the development of monoclonal antibodies (mAbs) and antibody-drug conjugates (ADCs) remains time-consuming and challenging. Thus, methods for high-throughput screenings, especially in the early-stage evaluation of antibody candidates, are valuable. In view of this, the use of the CF technology constitutes a promising strategy to shorten the time from antibody discovery to production.

6.2.1 Antibody Discovery

Using CF technology, antibodies can be produced at a flexible scale within a couple of hours. Besides the synthesis of individual antibodies, CF technology can be used to display libraries of antibodies. In contrast to phage and yeast display, in vitro CF display systems such as ribosome and mRNA display are open and thus allow larger library sizes. In theory, the size of the library is limited only by the quantity of supplemented mRNA/DNA, the volume of the CF reaction, and the number of ribosomes within the system, resulting in library sizes of ~10¹²–10¹⁵ per mL of CF reaction [108]. In comparison, phage and yeast display exhibit library diversities of ~10⁶–10¹⁰. Selection technologies such as ribosome display [109], mRNA display [110], and CIS display [111] have been developed based on reticulocyte lysate [110] and E. coli CF systems [109]. These systems focused on smaller antibody fragments because their functionality does not rely on the assembly of multiple polypeptide chains. Nonetheless, two groups have recently succeeded in developing completely CF display technologies that allow the selection of Fab fragments [112, 113]. The challenge of assembling the heavy and light polypeptide chains (HC/LC) of the Fab fragment was approached in different ways. Sumida et al. combined mRNA display based on two mRNA sub-libraries, one encoding HC and the other encoding LC, with in vitro compartmentalization PCR to link and then amplify HC and LC gene pairs [112]. Stafford et al. developed a ribosome display method in which only one of the two Fab chains was displayed, while the other was not presented in display format [113].
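The theoretical upper bound on library size can be estimated by counting molecules: the number of displayed variants cannot exceed either the number of template molecules or the number of ribosomes in the reaction. The following sketch performs this back-of-the-envelope calculation; the template and ribosome concentrations are assumed values for illustration, not figures taken from the cited studies.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecule_count(conc_nM, volume_mL):
    """Number of molecules at a given nanomolar concentration and volume."""
    return conc_nM * 1e-9 * volume_mL * 1e-3 * AVOGADRO


def max_library_size(template_nM, ribosome_nM, volume_mL):
    """Upper bound on displayed variants: limited by the scarcer component."""
    return min(
        molecule_count(template_nM, volume_mL),
        molecule_count(ribosome_nM, volume_mL),
    )


# Assumed values: 100 nM template and 1000 nM ribosomes in a 1 mL reaction.
size = max_library_size(template_nM=100.0, ribosome_nM=1000.0, volume_mL=1.0)
print(f"theoretical maximum: {size:.2e} variants")
```

With these assumed inputs the bound is about 6 × 10¹³ variants per mL, which falls within the range quoted above; in practice the functional library is smaller because not every template is translated and displayed.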

6.2.2 Antibody Production

Successful synthesis of different antibody formats, including single-chain variable fragments (scFvs), Fab fragments, as well as complete IgGs, has already been shown in E. coli [114, 115], Sf21 [10, 116], reticulocyte [110], wheat germ [117], and CHO CF systems [46, 75, 118]. Furthermore, the upscaling of CF reactions to the liter-scale [25, 115] as well as downscaling [119] and high-throughput applications [120] have been demonstrated.

In addition, advances in bioorthogonal reaction chemistries have paved the way to expand the possibilities for ADC development. The site-specific introduction of non-canonical amino acids into a genetically engineered sequence can be used to create site-specifically labeled ADCs [121]. Currently, seven ADCs are approved for therapy. To date, all of these ADCs have been generated by coupling mAbs to the cytotoxic linker-payload via surface-exposed lysines, or by partial disulfide reduction and conjugation to free cysteines, which typically results in a controlled but heterogeneous ADC population with varying numbers and positions of drug molecules attached to the mAb [122]. Homogeneous ADC populations can be achieved by introducing the payload at one or more defined positions. By developing a bioorthogonal tRNA/synthetase pair, Zimmerman et al. have shown that the optimized non-canonical amino acid para-azidomethyl-l-phenylalanine (pAMF) can be site-specifically incorporated into the tumor-specific, Her2-binding IgG trastuzumab [123]. Subsequently, the cytotoxic linker-payload DBCO-PEG-monomethyl auristatin F (DBCO-PEG-MMAF) was conjugated to pAMF via strain-promoted azide–alkyne cycloaddition (SPAAC), a copper-free click chemistry.
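The difference between a heterogeneous and a homogeneous ADC population can be illustrated by computing the average drug-to-antibody ratio (DAR) as a weighted mean over the drug loadings present. The Python sketch below is a minimal illustration; both distributions are hypothetical values chosen for this example and are not taken from the cited studies.

```python
# Average drug-to-antibody ratio (DAR) as a weighted mean over ADC species.
# All distributions below are hypothetical illustration values, not measured data.


def average_dar(distribution: dict[int, float]) -> float:
    """distribution maps drugs-per-antibody to the fraction of the population."""
    total = sum(distribution.values())
    if total <= 0:
        raise ValueError("fractions must sum to a positive value")
    return sum(n * frac for n, frac in distribution.items()) / total


def species_count(distribution: dict[int, float]) -> int:
    """Number of distinct drug loadings actually present in the population."""
    return sum(1 for frac in distribution.values() if frac > 0)


# Hypothetical lysine/cysteine conjugation: several loadings coexist.
stochastic = {0: 0.05, 2: 0.25, 4: 0.40, 6: 0.25, 8: 0.05}
# Hypothetical site-specific conjugation at two defined ncAA positions.
site_specific = {2: 1.0}

if __name__ == "__main__":
    for name, dist in (("stochastic", stochastic), ("site-specific", site_specific)):
        print(f"{name}: average DAR {average_dar(dist):.2f}, "
              f"{species_count(dist)} loading species")
```

The average DAR alone does not capture heterogeneity, which is why the sketch also reports the number of loading species; positional isomers are not modeled here.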

In the context of dual-functioning molecules, bispecific antibodies have also emerged as promising anti-cancer agents. One of the advantages of these proteins is their capability to target two different epitopes simultaneously, thereby increasing target engagement where mono-specific antibodies might fail [124]. Due to their open design, CFPS reactions can easily be manipulated, for instance by varying the template ratios and concentrations of HC and LC. Xu et al. showed the successful assembly of bispecific ‘knobs-into-holes’ antibodies in multiple scaffolds by using an E. coli-based CF expression platform [125].

Taken together, antibody evolution, selection, and engineering can benefit substantially from the technological advances in the field of CFPS. (1) Novel display technologies based on CF methods enable the in vitro evolution of multimeric proteins and allow for more sophisticated protein engineering. (2) Due to the very short time frame from synthesis to functional testing, CF systems can accelerate antibody construct evaluation by sequential (one after another) and/or parallel screening. (3) The introduction of non-canonical amino acids expands the chemical repertoire and thus the possibilities to modify and improve antibody-based therapeutics. Advanced labeling technologies allow for a very fast qualitative analysis of drug-to-antibody ratio (DAR), linker, linker/position, drug, and drug/position (research application), and allow full control of the ADC design (commercial application).

6.3 Application of Cell-Free Synthesis in Vaccine Development

CF systems are emerging as an option for synthesizing vaccine antigens. Most of the vaccine antigens produced by CF systems to date have used E. coli and WGL. Recent progress on eukaryotic CF systems may offer additional advantages. In this context, eukaryotic CF systems are endotoxin free and lack the complex plasma membrane, which simplifies protein purification. Some of the antigens synthesized by using CF systems are highlighted in Table 4. They are able to induce a strong immune response in experimental animals and could serve as a proof of concept for future vaccine development. Using recent advances in CFPS technology, a freeze-dried, cell-free (FD-CF) expression system was created based on E. coli CF lysates [31]. Using this FD-CF technique, diphtheria toxoid antigen variants (DT5 and DT6) were produced following rehydration with water, and the functionality of the synthesized proteins was verified by administering them to mice and measuring the immune response [31]. The FD-CF method could enable the production of on-demand, point-of-care biologics requiring just the simple addition of water for activation and synthesis.

Table 4 Vaccine antigens synthesized by using cell-free systems

Recently, CF-based expression has proven successful in producing difficult-to-express proteins like major outer membrane protein (mMOMP) of Chlamydia spp., a major vaccine antigen. Using E. coli-based CFPS, mMOMP was synthesized in a native trimeric form in the presence of nanolipoproteins (NLPs) with a yield of around 1.5 mg/mL. When injected into mice in the presence of an adjuvant, the protein elicited an enhanced humoral immune response [126]. Synthesizing antigens and simultaneously incorporating them into NLPs using a CF approach is a promising method for future vaccine development.

Conjugate vaccines are among the safest and most effective biologics [127]. Bioconjugate vaccines are produced using protein glycan coupling technology (PGCT). However, PGCT has its own limitations, such as time-consuming in vivo processes. Additionally, FDA-approved carrier proteins, such as toxins derived from Clostridium tetani and Corynebacterium diphtheriae, have not yet been demonstrated to be compatible with an E. coli-based production process. Relevant PTMs are often difficult to synthesize in E. coli CF systems.

Meanwhile, there have been further advances in using bacterial glycoengineering combined with CFPS for producing bioconjugate vaccines. This CF glycoprotein synthesis (CFGpS) used glycooptimized E. coli extracts integrating both N-linked glycosylation and protein synthesis. Using CFGpS, two bioconjugate vaccines were synthesized against F. tularensis and E. coli O78 [146, 147]. Besides post-translational modification, the assembly of macromolecular structures in CF systems is highly ambitious. Virus-like particles (VLPs), for example, are nanoscale structures that are formed from the self-assembly of viral proteins without the viral genome responsible for the infection. Usually, VLPs mimic the capsid structure of the real virus. VLP antigens are vaccine candidates for several diseases [148]. One of the vaccine candidates currently in clinical trials contains VLP antigens addressing noroviruses responsible for gastroenteritis in humans [149]. CF-synthesized VLPs were structurally confirmed by electron microscopy [150].

6.4 Antimicrobial Compounds

Using E. coli CF systems, antimicrobial colicins (Colicin M, Ia, E1, and E2) have been synthesized with high yields (around 300 µg/mL) and solubility. The synthesized colicins are able to effectively kill the target cells without any purification [151]. Antimicrobial peptides (AMPs) are another class of defense molecules that have a wide spectrum of targets, for example, bacteria, viruses, fungi, parasites, and cancer cells. AMPs are evolving as alternatives to antibiotics [152]. Using lyophilized E. coli CF lysates, ten different AMPs have been synthesized successfully and the functionality of BP100, Cecropin B, and Cecropin P1 was demonstrated by E. coli inhibition assay [31].

6.5 Site-Directed Labeling of Proteins

When non-canonical amino acids (ncAAs) are incorporated into proteins, novel functional, structural, and imaging properties can be generated. This synthetic biology application is fast emerging and has wide applications such as incorporating precise PTMs and adding novel functions to proteins [23, 24, 42, 45]. By taking advantage of the openness of CFPS, one can add the machinery responsible for the co-translational incorporation of ncAAs directly to the standard reaction components. One possible method to incorporate ncAAs is to use precharged tRNAs harboring the ncAA. One of the most commonly used methods is the amber suppression technology, which uses an orthogonal aminoacyl-tRNA synthetase/tRNA pair (aaRS/O-tRNA pairs from distinct organisms) that functions independently of the endogenous aaRSs and tRNAs in the host and directs the incorporation of ncAAs to specific positions encoded by the amber stop codon (UAG). After incorporation of an ncAA with a reactive group, bioorthogonal click reactions can be performed to conjugate a molecule of interest.

The most general bioorthogonal click reactions for conjugating molecular probes or polymers are the copper-catalyzed azide-alkyne cycloaddition (CuAAC), Staudinger ligation, photo-click cycloaddition, strain-promoted azide-alkyne cycloaddition (SPAAC), and inverse electron-demand Diels-Alder cycloadditions (IEDDA), including the strain-promoted variant (SPIEDAC). Using E. coli-based CF systems, Cui et al. showed the incorporation of two fluorescent labels, BODIPY fluorophore and TAMRA-DIBO, by using a precharged tRNA in combination with an orthogonal system for FRET measurements [153]. Using Sf21-based CF systems, Quast et al. demonstrated the incorporation of p-azido-l-phenylalanine (AzF) at defined amber positions in parallel in the two subunits of the human EGFR protein dimer. Later, the azido groups of the incorporated AzF residues were coupled by photoaffinity cross-linking using a bis-COMBO linker to create a stable synthetic dimer of EGFR [14]. The dimerized protein shows autophosphorylation in the presence of tyrosine kinase. In general, release factor 1 (RF1) competes with the orthogonal ncAA-tRNA for the amber codon, which results in truncated products along with successfully suppressed products. CF lysates derived from genetically modified E. coli lacking RF1 can therefore be used to enhance the incorporation efficiency of ncAAs. Using the orthogonal system and E. coli-based CFPS, human MEK1 kinase with PTMs was synthesized up to milligram quantities by site-specific, co-translational incorporation of phosphoserine at specific positions [154].
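The competition between RF1 and the orthogonal ncAA-tRNA described above can be illustrated with a simple probabilistic model. The Python sketch below is a minimal illustration under stated assumptions: each amber codon is suppressed with the same probability, independently of the other sites, and the numerical values are hypothetical rather than measured efficiencies.

```python
# Toy model of amber suppression versus RF1-mediated termination.
# Assumptions (not from the cited studies): every amber codon is suppressed
# with the same probability, and the sites behave independently.


def full_length_fraction(p_suppress: float, n_amber_sites: int) -> float:
    """Fraction of translation products that read through all amber codons."""
    if not 0.0 <= p_suppress <= 1.0:
        raise ValueError("p_suppress must lie between 0 and 1")
    if n_amber_sites < 0:
        raise ValueError("n_amber_sites must be non-negative")
    return p_suppress ** n_amber_sites


def truncation_profile(p_suppress: float, n_amber_sites: int) -> list[float]:
    """Fraction of products terminated at the 1st, 2nd, ... amber codon."""
    return [p_suppress ** k * (1.0 - p_suppress) for k in range(n_amber_sites)]


if __name__ == "__main__":
    # Hypothetical per-site efficiencies with and without RF1 competition.
    for label, p in (("RF1 present", 0.5), ("RF1-deficient lysate", 0.9)):
        frac = full_length_fraction(p, n_amber_sites=2)
        print(f"{label}: {frac:.2f} full-length product with two amber sites")
```

Because the full-length fraction falls as a power of the per-site efficiency in this model, reduced RF1 competition matters most for proteins carrying several amber codons.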

Various polyethylene glycol (PEG) moieties have been widely used to decorate therapeutic proteins. The PEG moiety usually offers high stability and extends the half-life of proteins while in circulation inside the body. The Food and Drug Administration (FDA) has recognized PEG moieties as safe due to their structural flexibility, hydrophilicity, and minimal toxicity, and several PEGylated drugs have been approved by the FDA. Using Sf21-based CF systems, a site-specific PEGylated human EPO was produced and characterized by autoradiography [45]. Apart from the amber suppression strategy, there are other strategies like frameshift suppression, sense codon reassignment, and unnatural base pairing. A detailed review of prominent methods for the incorporation of ncAAs into proteins using CFPS has been recently published [23].

7 Commercial Cell-Free Systems

A wide range of commercial CF systems based on lysates derived from diverse sources is available on the market. In addition, a few companies provide services for CF synthesis of proteins. Some of the products derived from CF systems based on E. coli lysates are already in clinical trials, such as ADCs targeting CD74 and folate receptor alpha, which are highly expressed in myeloma and cancer cells (Sutro Biopharma, Inc, USA). Table 5 lists commercial systems currently available for the CF production of proteins.

Table 5 List of commercially available cell-free synthesis kits in the market

8 Outlook and Future Directions

Evolving CF systems from a laboratory level to a robust production platform is necessary to fulfill their potential. Before CF systems can be fully realized as tools for drug discovery and evaluation, several factors need to be addressed, such as the synthesis of high-quality functional protein with proper folding and PTMs, cost of production, scalability, and safety issues. A more detailed understanding of the components in the CF lysates will substantially improve the quality and stability of the extract preparation. The quantity of the protein depends on the translation efficiency of the CF system. The most important factors that influence protein yields are the quality of the cell lysate, reaction conditions, and template optimization, as addressed in section 3.2. To increase the translation efficiency, further efforts are required to increase the quality of lysate production. This can be achieved by using genetic engineering tools to remove the factors responsible for nucleic acid degradation, ribosome inactivation, and protein degradation. Brodiazhenko et al. showed that genomic disruption of genes encoding ribosome-inactivating factors (HPF in B. subtilis and Stm1 in S. cerevisiae) improved the activities of bacterial and yeast translation systems [54]. In this context, advanced engineering tools like CRISPR-Cas could help to improve the translation efficiency of CF systems [155]. Activation and enrichment of translation-relevant factors could also increase translation efficiency [63].

When it comes to eukaryotic CFPS platforms, translocation through microsomes currently remains a black box. Optimizing the efficiency of coupling translation and translocation needs to be addressed. The most important issue with CF systems, especially when working with CECF systems, is to maintain the balance between the amount of protein synthesized and the stability and quality of the protein. Although CECF has been capable of producing 0.6–1 mg protein per mL, especially with the mammalian expression systems, only a small fraction of the produced protein was subject to detailed functional analysis [15, 155]. This is one of the reasons why the functional assays are limited to binding assays (GPCR, TLR, antibody), PLBE (ion channels), and colorimetric assays (enzymes). The problem of Ab translocation into the lumen of microsomes has already been addressed by optimizing the redox conditions [75, 155]. However, when it comes to the synthesis of complex transmembrane proteins in mammalian systems, the insertion efficiency might already be saturated at low synthesis rates due to restrictions on the level of the translocon’s functionality. A more detailed analysis of the lipid composition and the proteins constituting the microsomes present in the insect, CHO, and human-derived lysates will help to improve the quality of synthesized membrane proteins. One could use alternative supplements like nanodiscs or liposomes reconstituted from microsomal membranes to support MP integration [156]. Intense efforts on designing novel and improved mammalian CF systems should be maintained, as the majority of drug targets are related to complex eukaryotic proteins. Optimizing CF reactions in order to decrease protein aggregation during the purification processes and increasing the quality of the protein purification, especially when using the CECF method, are strongly required.

Another point to address in the field of CFPS is to decrease the costs of production, especially in the preparation of CF lysates and the individual reaction components. Substantial costs arise from the usage of phosphorylated energy systems, cofactors, nucleotides, amino acids, and DNA. Alternative energy regeneration systems are available in place of phosphorylated substrates (e.g., glucose, maltodextrin) for sustainable ATP regeneration throughout the synthesis reaction [157,158,159]. Using nucleoside monophosphates instead of nucleoside triphosphates as the nucleotide source in CF systems could be another way to reduce costs [159]. Avoiding the use of exogenous tRNAs and cyclic AMP and reducing the concentration of amino acids and nucleotides are further cost-saving measures one could optimize during protein synthesis. Additionally, new high-cell-density cultivation strategies and improvement in the quality of cell lines by genetic engineering could help to produce cost-effective, high-quality CF systems. Costs can also be decreased by engineering and optimization of eukaryotic lysates to extend the lifetime of these systems, thereby increasing the yield of the produced protein.

There has been considerable progress in the point-of-care production devices for on-demand biologic synthesis of small quantities of therapeutic proteins using CHO lysates and E. coli lysates through on-site good manufacturing practice (GMP) [30]. This type of miniaturized device could be useful for quick testing of proteins and thus help in treating common and rare diseases, and CFPS could help solve the challenges associated with in vivo expression.

Due to the open nature of the CF systems, proteins can be modified with chemically synthesized glycans by bioconjugate chemistries. This will help to increase the quality and therapeutic efficiency of the synthesized proteins. There has been an exponential increase over the last 5 years in the number of publications using CF lysates for producing a wide range of proteins [160]. With increasing awareness of the biosynthetic potential of CF systems, simpler protocols, improved lysate quality, and applicability to the preparation of a diverse range of proteins, unexpected outcomes can be anticipated in the field of protein production for future drug development.