Introduction

Nature provides a reservoir of active compounds that help us to fight and cure disease (Fig. 1). We have learned to exploit this reservoir by identifying and isolating the active compounds, by developing strategies for their chemical synthesis and by improving their biological activity. However, this process is laborious and very expensive. The large-scale isolation of drugs is often hindered by the low amount of material produced by the natural source, and the complexity of the molecules makes their total chemical synthesis costly or even unprofitable. Transferring the ability to biosynthesize these compounds from the source organisms to genetically more tractable hosts would greatly facilitate their production. Genome sequencing efforts now provide the genetic information of entire metabolic pathways necessary to recombinantly produce pharmaceutically relevant molecules. However, biosynthetic pathways are often highly intertwined, are regulated by extensive feedback mechanisms and require the correct organization of the individual entities in space and time. The simple transfer of genetic information from one organism to another is therefore rarely sufficient to confer the biosynthetic capability. Thus, we need to understand the properties of individual components as well as the properties of entire systems and eventually learn to assemble artificial biosynthetic networks in new hosts in a predictable, efficient manner. The design of life with novel properties from individual modules, in a way resembling the creation of new machines in modern engineering, is the ultimate aim of the emergent discipline of synthetic biology.

Fig. 1

Molecular structure of members of important natural drug families. a Vancomycin, a cyclo-peptide antibiotic. b Cyclosporin, a family member of the cyclic peptides and an important immunosuppressant. c Inhibitor of HIV budding selected from an in vitro cyclic peptide library (Tavassoli et al. 2008). d Artemisinin, a terpenoid with anti-malarial activity. e Epothilone D, member of the polyketide family of natural products with cytostatic activity

In its most stringent definition, synthetic biology is the application of the principles of engineering to the construction of life with desired properties in a rational and systematic way (Serrano 2007). Synthetic biology therefore uses well-characterized, modular parts that can be put together to create new functionality in a directed, predictable manner (Kiel et al. 2010; Ninfa 2010; Topp and Gallivan 2010). Several recent advances in DNA technology encourage the practitioners of this new discipline. First, inexpensive DNA synthesis enables the facile production of codon-usage optimized genes and other genetic elements, such as promoters and regulatory sequences. This technology has recently been extended to the synthesis of an entire genome, which was subsequently used to replace the genome of a living cell (Gibson et al. 2009; Gibson et al. 2008a, b; Lartigue et al. 2007, 2009). Second, rapid, next-generation DNA sequencing technologies provide vast amounts of genetic information. Entire genome sequences of many organisms are now publicly available and can be mined electronically for relevant genes (Benson et al. 2009). Third, well-characterized genetic modules are being developed, which can be combined rationally to create a new function (Lu et al. 2009). The Registry of Standard Biological Parts (http://parts.mit.edu) is a continuously growing collection of standardized parts (“BioBricks”) that are being used to rationally create organisms with desired properties (Voigt 2006). In addition, directed evolution has become a widespread strategy to adapt existing parts to novel functions. For example, the directed evolution of tRNAs and aminoacyl-tRNA synthetases (aaRS) to decode novel, unnatural amino acids has been used to expand the genetic code of several organisms (Chin et al. 2003; Liu et al. 2007; Wang et al. 2001) and allows the genetically programmed incorporation of unprecedented functionality into proteins in vivo (Xie and Schultz 2006).

With increasing experience in the reprogramming of microorganisms, entirely new types of therapy might become feasible in the future. Bacteria might no longer be limited to the mere production of drugs but might also deliver them to the site of action. In this review, we give an overview of recently developed individual components and of the attempts to recombine them into modules of biotechnological and pharmaceutical interest. We aim to convey how these individual modules might eventually be used as parts in a synthetic biology approach to pharmaceutical biotechnology.

Creating in vivo compound libraries

Combinatorial small molecule libraries are an efficient way to search for new lead structures. Establishing these libraries synthetically, however, is costly and many initial hits unfortunately fail in subsequent rounds of screening or in clinical trials.

Producing compound libraries directly in vivo would have several advantages. First, maintaining and amplifying such a library is simple and can be done by cultivating cells. Second, the screening process is facilitated because the genetic information remains directly coupled to the compound. Hence, the selected compound can be identified by analysing the genetic information of the corresponding cell. Third, an intracellular genetic selection can directly assay for effects on enzymatic activity, bypassing inherent limitations of in vitro assays. Fourth, the selection is performed in the context of a living cell requiring an enhanced level of selectivity of the drug for its target. Finally, potential problems with solubility and uptake of the compound are circumvented.

Cyclic peptide libraries

Many important natural compounds are modified cyclic peptides, for example, antibiotics like vancomycin (Fig. 1a), the immunosuppressant drug cyclosporin (Fig. 1b) and toxins such as phalloidin and actinomycin D (Katsara et al. 2006). Cyclization renders peptides resistant to cellular degradation and restricts conformational freedom, potentially improving binding affinity and specificity. Using a split intein approach, the Benkovic lab has developed a method to produce libraries of cyclic peptides in Escherichia coli (Scott et al. 1999). A number of promising candidates have since been identified in libraries generated this way (Horswill et al. 2004; Tavassoli and Benkovic 2005; Tavassoli et al. 2008). Natural cyclic peptides contain many modified, non-proteinogenic amino acids. These residues increase the diversity of the compounds and are crucial for their biological activity. As detailed below, the artificial expansion of the genetic code allows the co-translational incorporation of unnatural amino acids (Wang et al. 2001; Xie and Schultz 2006). Combining the Benkovic approach to the production of cyclic peptides with the incorporation of unnatural amino acids would vastly increase the diversity of cyclic peptides that can be produced in vivo.
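
The gain in diversity from an expanded genetic code can be illustrated with a simple count. The following Python sketch is only a back-of-the-envelope estimate: the number of randomized positions (five) and the use of a single additional unnatural amino acid are illustrative assumptions, not parameters taken from the cited studies, and the function name is hypothetical.

```python
def library_size(n_positions: int, alphabet_size: int = 20) -> int:
    """Number of distinct encoded sequences for n fully randomized positions.

    alphabet_size is the number of amino acids available at each position
    (20 canonical ones by default). Any redundancy between sequences is ignored.
    """
    return alphabet_size ** n_positions


# Assumed example: five randomized positions.
canonical = library_size(5)      # 20 canonical amino acids only
expanded = library_size(5, 21)   # plus one hypothetical unnatural amino acid

print(canonical, expanded, expanded / canonical)
```

Under these assumptions, a single additional building block enlarges a five-residue library from 3,200,000 to 4,084,101 members, and the relative gain grows with every further randomized position.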

Polyketide synthases—natural synthetic biology

The polyketides are another important class of natural products. Polyketides are synthesized by large multi-enzyme complexes, the polyketide synthases. These “assembly lines” are built from modular components that catalyze the formation of the carbon chain of the final product in a stepwise manner (Fig. 2). The modular composition of polyketide synthases makes them an ideal playground for synthetic biologists. Their individual modules can be split and recombined to form active hybrid enzymes (Watanabe et al. 2003b). The combinatorial recombination of individual modules has been achieved, and the new enzymes were shown to successfully catalyze the formation of polyketides in E. coli (Menzella et al. 2005). This approach can potentially be used to produce libraries of polyketides with novel biological activities in vivo. At present, simple recombination of different modules often produces inactive synthases because the transfer of intermediates between modules might be blocked or the connectivity between modules disturbed. Growing information on the structure of individual modules and their connectivity (Alekseyev et al. 2007; Keatinge-Clay and Stroud 2006; Tang et al. 2006) together with increasing experimental experience will help to develop predictive algorithms to rationally design synthases for “unnatural” polyketides in a combinatorial biosynthetic approach (Khosla et al. 2009). If a system to produce large libraries of hybrid enzymes is connected to a selectable output (Yin et al. 2007), active clones can be identified, even if their frequency in the library is very low (Menzella and Reeves 2007). A combination of computer-assisted prediction, combinatorial library design guided by structural information and selection might eventually develop into the mainstream of drug discovery.

Fig. 2

Microbial polyketide synthases are modular assembly lines that fit polyketides together from monomeric building blocks. In the first step of the reaction, the starter module is acylated with the first unit (1). This unit is subsequently transferred to the first of a row of extender modules that catalyze the elongation of the polyketide. All extender modules contain at least a ketosynthase (KS), an acyl transferase (AT) and an acyl carrier protein (ACP) domain. In each step, the KS receives the acyl group unit from the ACP of the preceding module and the AT adds an appropriate extender unit from its CoA ester to the prosthetic group of the ACP (2). The KS then catalyzes a decarboxylative Claisen condensation between the acylated KS and the extender unit to give a β-keto-acylated acyl carrier protein (3). Other modules may be included to further modify the growing polyketide chain. Upon completion of the elongation process, the polyketide chain is cleaved from the polyketide synthase by the thioesterase (TE) of the termination module (4)
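
The assembly-line logic of Fig. 2, and the combinatorial recombination of modules discussed above, can be sketched as a toy model in Python. This is a deliberately simplified illustration: the class and function names are hypothetical, the chain is represented only as the order in which units are incorporated, and every hybrid is assumed to be active, which the text notes is often not the case in practice.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ExtenderModule:
    """Toy extender module (KS, AT and ACP domains are not modelled separately)."""
    name: str
    extender_unit: str  # unit that the AT domain loads onto the ACP


def run_assembly_line(starter_unit, modules):
    """Return the order of incorporated units as a string."""
    chain = [starter_unit]                  # step 1: starter module is acylated
    for module in modules:
        # steps 2-3: the chain is passed on, the extender unit is loaded and
        # the condensation lengthens the chain by one unit
        chain.append(module.extender_unit)
    return "-".join(chain)                  # step 4: release by the thioesterase


def enumerate_hybrids(starter_units, module_pools):
    """Yield the product of every starter/module combination (one module per pool)."""
    for starter in starter_units:
        for combination in product(*module_pools):
            yield run_assembly_line(starter, combination)


# Illustrative unit names; not a description of any specific natural synthase.
m_malonyl = ExtenderModule("A", "malonyl")
m_methylmalonyl = ExtenderModule("B", "methylmalonyl")
pools = [[m_malonyl, m_methylmalonyl], [m_malonyl, m_methylmalonyl]]

for polyketide in enumerate_hybrids(["acetyl", "propionyl"], pools):
    print(polyketide)
```

Even this small example, with two starter units and two positions that each accept one of two modules, yields eight hypothetical products, which illustrates why a selectable output is attractive for finding the active hybrids in a large library.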

De novo creation of metabolic pathways

The de novo design of biosynthetic pathways to expand the biosynthetic potential of microorganisms is an intriguing route towards compounds for which natural pathways have not been elucidated or which are not of natural origin (Prather and Martin 2008). A “retro-biosynthetic” approach, analogous to retrosynthesis in organic chemistry, uses enzymes with evolved substrate specificity to create artificial biosynthetic pathways de novo. Many natural enzymes show high selectivity for their substrates, which is the result of divergent evolution from promiscuous precursor proteins (O'Brien and Herschlag 1999). However, there are examples of natural enzymes with broad substrate specificity. Starting from such a promiscuous enzyme, the sesquiterpene synthase γ-humulene synthase, Keasling and co-workers were able to recapitulate this evolutionary process and create seven specific and active enzymes that use different reaction pathways and produce different products (Yoshikuni et al. 2006). These enzymes could be used in the future to design biosynthetic pathways for unnatural terpenoids.

Similarly, prenylation is an important modification of natural products, like naphterpin, that confers anti-cancer, anti-viral or anti-microbial activity on the molecules (Botta et al. 2005). Prenyltransferases, the enzymes responsible for the derivatization, have been identified and characterized (Kuzuyama et al. 2005). These enzymes have a broad substrate spectrum and might form a starting point for the evolution of enzymes for regio-specific prenylation of aromatic small molecules. Engineered enzymes might eventually allow us to create enzymatic pathways de novo and to produce small molecule libraries of differently modified compounds of the same scaffold structure (Dietrich et al. 2009; Yoshikuni et al. 2006).

Expanding the chemistry of life

The cellular environment constrains the chemical scope of reactions that can be used to create in vivo compound libraries. Many reactions familiar to synthetic organic chemists require elevated temperature or are incompatible with aqueous environments. This limitation may be overcome in the future by the computer-assisted, directed evolution of enzymes (Kaplan and DeGrado 2004) that catalyze unnatural chemical reactions. Using several enzymes as scaffolds, Baker and co-workers (Rothlisberger et al. 2008) employed computer-aided design to engineer enzymes able to catalyze a Kemp elimination reaction for which no natural enzyme is yet known. It is thus conceivable that the limits to the chemistries available in vivo will be shifted in the future, thereby expanding the chemical scope of in vivo compound libraries.

The chemical diversity of proteins and peptides has been tremendously increased by the invention of strategies to introduce unnatural amino acids in vivo. Peter Schultz and his co-workers have artificially expanded the genetic code of several organisms (Chin et al. 2003; Liu et al. 2007; Wang et al. 2001) and succeeded in incorporating a wide range of unnatural amino acids (Xie and Schultz 2006). The basic principle of genetic code expansion is the expression of an additional pair of tRNA and its cognate aaRS in the host system (Fig. 3). Neither component may cross-react with the host machinery: the tRNA must not be a substrate for the host aaRSs, and the aaRS must not charge host tRNAs, so that host proteins are still translated correctly. Provided these prerequisites are met, the tRNA can be mutated to decode amber stop codons, for example, and the aaRS can be evolved to specifically recognize unnatural amino acids. This system allows the incorporation of additional, unnatural amino acids in response to amber stop codons. In the past decade, the list of unnatural amino acids that can be incorporated in this way has grown continuously (Neumann et al. 2008; Nguyen et al. 2009b; Xie and Schultz 2006). The recent development of ribosomes that efficiently decode amber and quadruplet codons now allows the efficient incorporation of multiple different unnatural amino acids into the same protein (Neumann et al. 2010; Wang et al. 2007). These unnatural amino acids have side chains with unusual chemical properties and reactivities and could be used to increase the chemical diversity of peptides and proteins.

Fig. 3

The principle of genetic code expansion. The basic principle of genetic code expansion is the heterologous expression of an additional tRNA/aminoacyl-tRNA synthetase (aaRS) pair in the host cell. This pair is orthogonal to the host’s translational machinery, meaning that the tRNA is not a substrate for endogenous synthetases and that the aaRS does not charge host tRNAs. The orthogonal tRNA is mutated to decode amber stop codons and the orthogonal aaRS is evolved to specifically recognize an unnatural amino acid (UAA) supplied with the growth medium. Cells with such an expanded genetic code are able to produce proteins with the UAA incorporated at a position encoded by an amber codon in the mRNA
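
The decoding rule summarized in Fig. 3 can be expressed as a minimal Python sketch. It is a simplification under stated assumptions: only a handful of codons are included in the table, suppression of the amber codon is treated as complete whenever the orthogonal pair is present, and the function name, the `orthogonal_pair` flag and the symbol `X` for the unnatural amino acid are hypothetical.

```python
# Small excerpt of the standard genetic code; "*" marks a stop codon.
CODON_TABLE = {
    "AUG": "M", "GGC": "G", "AAA": "K", "UUU": "F",
    "UAA": "*", "UGA": "*", "UAG": "*",  # UAG is the amber codon
}


def translate(mrna, orthogonal_pair=False, uaa_symbol="X"):
    """Translate an mRNA string codon by codon.

    Without the orthogonal tRNA/aaRS pair, the amber codon UAG terminates
    translation. With the pair (and the unnatural amino acid supplied), UAG
    is decoded as the unnatural amino acid, written here as uaa_symbol.
    """
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == "UAG" and orthogonal_pair:
            protein.append(uaa_symbol)  # amber suppression (assumed 100 % efficient)
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break                       # ordinary termination
        protein.append(residue)
    return "".join(protein)


message = "AUGGGCUAGAAAUUUUAA"  # AUG GGC UAG AAA UUU UAA
print(translate(message))                        # truncated at the amber codon
print(translate(message, orthogonal_pair=True))  # full length, X at the amber site
```

In a real cell, suppression competes with release factor-mediated termination, so both the truncated and the full-length product would be expected; the sketch ignores this competition.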

Drug screening

The quality of an assay system has an enormous impact on the outcome of a screen. In order to fully profit from the advantages of in vivo compound libraries, it is necessary to develop intracellular assays, like a colour signal or a genetically selectable phenotype, to detect or select positive clones. The continuously increasing number of biosensors could serve as such detector modules.

The Benkovic group has developed a method for identifying small molecule inhibitors of protein–protein interactions by combining a reverse two-hybrid system with the split intein-mediated formation of cyclic peptide libraries (Horswill et al. 2004). Using this system, cyclic peptides were identified that modulate the interaction between the two subunits of ribonucleotide reductase (Horswill et al. 2004), HIV-Gag and TSG101 (Fig. 1c) (Tavassoli et al. 2008) and AICAR transformylase homodimerization (Tavassoli and Benkovic 2005). A similar system that combines in vivo cyclic peptide libraries with a fluorescent reporter assay was developed by the same group and used to identify antibacterial cyclic peptide inhibitors of ClpXP protease (Cheng et al. 2007).

Tsien and co-workers created the Ca2+-sensor “Cameleon” (Miyawaki et al. 1997) by fusing calmodulin to a calmodulin-binding peptide and flanking the construct with two fluorescent proteins with overlapping spectral properties. Ca2+-dependent binding of calmodulin to its own peptide results in a predictable conformational rearrangement that leads to a change in energy transfer between the two fluorescent proteins. Biosensors of this sort are being used extensively to monitor the intracellular concentration and dynamics of ions and small molecules and the activation state of signalling proteins (Balla 2009). Similar sensors could be designed for metabolites of interest and combined with systems to produce small molecule libraries in vivo in order to identify strains and conditions that allow production of pharmacologically relevant compounds.
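
The distance dependence that makes such sensors work can be sketched with the standard Förster relation, E = 1 / (1 + (r/R0)^6), where r is the distance between the two fluorophores and R0 is the distance at which half of the energy is transferred. The numerical values below are illustrative assumptions and are not measurements of Cameleon; effects of fluorophore orientation are ignored, and the function name is hypothetical.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Förster energy-transfer efficiency for a donor/acceptor pair.

    distance_nm       -- donor-acceptor distance r
    forster_radius_nm -- Förster radius R0 (distance of 50 % transfer)
    """
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


R0 = 5.0  # assumed Förster radius in nm, for illustration only

# Hypothetical "open" and "closed" conformations of a sensor.
for label, r in [("open, ligand-free", 6.0), ("closed, ligand-bound", 4.0)]:
    print(f"{label}: r = {r} nm, E = {fret_efficiency(r, R0):.2f}")
```

Because the efficiency depends on the sixth power of the distance, even a modest conformational rearrangement near R0 produces a large change in the ratio of donor to acceptor emission, which is what makes the readout sensitive.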

Optimization of engineered metabolic pathways depends on our ability to measure the production of intermediates efficiently. Mass spectrometry can be used to quantify the relevant metabolites, but faster real-time assays allow the identification of optimal growth conditions or the screening of mutant libraries in high-throughput formats. Mevalonate, for example, is an important precursor in the biosynthesis of isoprenoids. Keasling and co-workers (Pfleger et al. 2007) turned a GFP-expressing, mevalonate-auxotrophic strain of E. coli into a mevalonate biosensor. The strain reports on the mevalonate concentration in the growth medium through a change in growth rate. This biosensor strain can now be used to measure mevalonate in high-throughput formats.

In a more direct approach, reporter proteins could be created by redesigning the affinities of binding proteins, like maltose binding protein (MBP), for ligands of interest. MBP is a member of the periplasmic binding protein superfamily that mediates chemosensory processes in a wide range of bacteria. This protein family has naturally evolved to bind a variety of ligands, such as amino acids, carbohydrates and ions, and has served as a starting point for the directed evolution of biosensors, receptors and enzymes (Dwyer and Hellinga 2004). Upon ligand binding, these proteins undergo a significant conformational change (induced fit), which has been exploited to monitor the event by positioning environmentally sensitive fluorophores within the binding site (endosteric placement), at the rim of the binding site (peristeric placement) or in a crevice that opens and closes with ligand binding (allosteric placement) (de Lorimier et al. 2002). Alternatively, GFP variants have been fused to the N- and C-termini of periplasmic binding proteins in order to monitor ligand binding by fluorescence energy transfer (Fehr et al. 2002; Fehr et al. 2003; Schultz and Moini 2003). Computational design has been successfully employed to create biosensor proteins with novel affinities for small molecules, such as trinitrotoluene, L-lactate and serotonin (Looger et al. 2003). These engineered sensors were incorporated into synthetic bacterial signal transduction pathways to regulate gene expression in response to extracellular stimuli (Looger et al. 2003).

G-protein coupled receptors (GPCRs) are the target of the majority of pharmaceuticals. This membrane protein family is stimulated by extracellular ligands and initiates a signalling cascade on the cytosolic side of the membrane upon agonist binding. The study of GPCRs in their natural environment is usually complicated by the presence of multiple different receptors on a single cell and by the considerable effort required to manipulate such cells genetically. Pausch and colleagues transplanted a mammalian GPCR into an engineered Saccharomyces cerevisiae strain and connected it to a downstream cascade controlling the expression of the his3 gene (Price et al. 1995). Activation of the receptor permitted the cells to grow in the absence of histidine. This approach has since been used extensively to study GPCR function and to identify ligands and antagonists (Brown et al. 2000; Minic et al. 2005; Sachpatzidis et al. 2003).

These examples demonstrate that it is possible to create modules for the detection of a molecule of interest by rational design. A synthetic biology approach to the production of pharmaceuticals should employ these sensors to control the expression level of individual parts or connect them to a selectable output. In a case study, Farmer and Liao coupled the E. coli Ntr regulon, which senses the intracellular metabolite acetylphosphate, to the expression of key enzymes in the pathway for the recombinant biosynthesis of lycopene (Farmer and Liao 2000). This approach significantly enhanced lycopene production and simultaneously reduced the toxic side effects caused by metabolic imbalance. Riboswitches, RNA elements with the ability to control the translation of a coupled open reading frame upon binding specifically to certain ligands, are another class of parts that can be used to control metabolism (Famulok et al. 2007). Even before their discovery in nature, artificial riboswitches had been designed to control the expression of reporter genes in response to the binding of small molecules, such as Hoechst dyes or tetracycline (Hanson et al. 2003; Suess et al. 2004; Suess et al. 2003; Werstuck and Green 1998). These elements could be used to couple gene expression on the translational level to the metabolic state of the cell. Similar strategies might form the basis of high-throughput screening systems to identify engineered strains that are able to biosynthesize the molecule of interest.

Biosynthesis of pharmaceuticals

Recombinant DNA technology has been used for more than three decades to engineer bacteria with the ability to produce pharmacologically important molecules like insulin (Goeddel et al. 1979). Progress has been made in recent years towards the heterologous production of important natural products, like terpenoids, polyketides, non-ribosomal peptides and alkaloids. Synthetic biology approaches are being employed to fine-tune the levels and activities of individual steps and components. Here, we present examples of recent successes of this strategy.

Terpenoids

Terpenoids are a large and diverse class of natural products with important applications in human health. Well-known examples are the anti-malarial drug artemisinin (Fig. 1d) (Tan et al. 1999), the anti-cancer agent paclitaxel (taxol) (Jennewein and Croteau 2001) and eleutherobin (Long et al. 1998). A major step forward towards the recombinant production of terpenoids in E. coli was the creation of a platform strain that heterologously expressed the mevalonate pathway of S. cerevisiae (Martin et al. 2003). The metabolic precursors produced by this strain can be further converted into relevant terpenoids by co-expression of the corresponding terpene synthases and modifying enzymes. Building on this work, the Keasling lab has engineered E. coli to produce artemisinic acid, which can be converted chemically into artemisinin (Roth and Acton 1989), with yields of more than 300 mg/l (Chang et al. 2007; Kizer et al. 2008; Newman et al. 2006). Similarly, the same group has recently reprogrammed the metabolism of S. cerevisiae to produce artemisinic acid with titres exceeding 100 mg/l (Ro et al. 2006).

Taxol (paclitaxel) is a well-established anti-neoplastic drug active against a variety of different cancers (Goldspiel 1997). Although the total synthesis of Taxol has been achieved (Kingston et al. 2002), the approach is too expensive to be commercially practical (Borman 1994). The drug is produced by yew (Taxus) species in about 19 individual enzymatic steps from the isoprenoid precursor geranylgeranyl diphosphate (Jennewein and Croteau 2001; Walker and Croteau 2001). Several Taxol biosynthetic genes have been expressed functionally in S. cerevisiae and used to establish the five-step biosynthesis of an intermediate of Taxol (Dejong et al. 2006; Engels et al. 2008). Efforts are also being undertaken to establish Taxol biosynthesis in E. coli (Huang et al. 2001). However, a number of problems, such as the lack of a complete list of the enzymes involved in the production of many important terpenoids, must still be overcome before their heterologous total biosynthesis can be achieved. Still, the achievements are promising and there is hope for cost-efficient production of some very important natural products in the near future.

Polyketides

Polyketides are a large class of important natural products, e.g. erythromycin, epothilone (Fig. 1e) and FK-506. They are produced by microorganisms, most prominently actinomycetes, which are soil-dwelling bacteria. A significant achievement in the recombinant production of polyketides in E. coli, which does not naturally produce any, was the engineering of the pathway for the production of 6-deoxyerythronolide B (6dEB) (Pfeifer et al. 2001). This strain was created by optimizing precursor production, enzyme engineering and destroying catabolic pathways. Subsequent optimization of the strain has raised 6dEB production to levels obtained from optimized S. coelicolor strains (Pfeifer et al. 2002). Building on this work, heterologous production of several other important polyketides or their precursors has been achieved recently, namely, the anti-cancer drugs epothilone C and D (Mutka et al. 2006) and ansamycin precursors (Rude and Khosla 2006; Watanabe et al. 2003a), aklanoic acid (precursor to several antitumor polyketides, e.g. doxorubicin and aclacinomycin A) (Lee et al. 2005) and aromatic bacterial polyketides (Zhang et al. 2008).

Metabolic fine-tuning

These promising examples illustrate the potential of a synthetic biology approach to the recombinant production of pharmaceuticals. There are, however, significant challenges ahead. In many cases, the relevant precursor metabolites are inefficiently produced by the host, or bottlenecks form during the synthesis of the product when an intermediate is unstable or an enzyme has low activity in the foreign environment. Synthetic biology aims to create autonomous circuits for the production of key metabolites, allowing the modular combination of partial reactions into new pathways. Carefully adjusting the expression level of individual components of the pathway is important to optimize flux and to avoid the accumulation of toxic intermediates. Synthetic promoter libraries and a range of engineered constitutive promoters of varying strengths have been created (Alper et al. 2005; Hammer et al. 2006) and could be used to fine-tune the expression levels of all enzymes in a pathway in order to optimize metabolic flux.

The Church lab has recently published a novel technology for multiplex automated genome engineering (“MAGE”) to optimize metabolic pathways in E. coli. By feeding cells with synthetic oligonucleotides, which are probably used as Okazaki fragments during replication, the authors were able to mutate and fine-tune ribosome binding sites genome-wide (Wang et al. 2009). They have used this method to increase the production of lycopene in an engineered E. coli strain by a factor of five within only 3 days of evolution. The same approach might allow the production of other metabolites to be optimized if a simple screening assay is available. The use of an automated setup greatly accelerated this technology and increased its throughput. The development of similar automated systems will have a major impact on the de novo creation of biosynthetic pathways and regulatory networks in the future.

An important recurring theme of cellular metabolism is compartmentalization and channelling of metabolites (Ovadi and Srere 2000). This avoids side reactions (especially hydrolysis) of activated intermediates and prevents toxic side effects. A strategy that mimics this organization of enzymes would facilitate the design of efficient metabolic pathways. A first step towards this aim has recently been taken: by fusing the components of the metabolic pathway for mevalonate to peptide motifs, Keasling and co-workers (Dueber et al. 2009) were able to recruit them into synthetic protein scaffolds and to optimize their stoichiometry within the synthetic complex. The activity of the pathway was thereby increased by a factor of 77. Compartmentalization, for example in the protein-based organelles that exist in many bacteria (Yeates et al. 2008), can also help to establish new chemical reactions by shielding unstable intermediates from a hostile environment or by creating a special phase, e.g. with low or high pH or hydrophobicity.

Bacteria as drug delivery units

In recent years, we have witnessed an explosion of new technologies to manipulate cellular genetic information. Our engineering abilities allow us to design cells with artificial genomes (Gibson et al. 2009; Gibson et al. 2008a, b; Lartigue et al. 2007, 2009), promising the advent of customized drug-producing or therapeutic bacteria. It might become possible to equip “synthetic” organisms with just the genetic information necessary for function, while omitting any interfering and potentially dangerous genetic and metabolic material. These cells could be rationally equipped with modular genetic elements to create an organism with a desired phenotype. In a futuristic scenario, genetic devices might be combined to program organisms like robots. Initial attempts have combined biosensor modules in new ways and linked them to rewired genetic circuits (Kobayashi et al. 2004; Looger et al. 2003; Lu et al. 2009; You et al. 2004). For example, a chimeric sensor domain of a phytochrome from cyanobacteria fused to a signal transduction domain of E. coli was used to design an E. coli strain responding to light (Levskaya et al. 2005). This strategy of engineering bacterial sensors that regulate genes in response to a new environmental cue can be extended to create other sensors and to reprogram downstream signalling at will. Promoters, for example, could be built to integrate (multiple) specific signals (Anderson et al. 2007).

This approach can be extended to design live bacteria as targeted delivery systems for live vaccination, as probiotics and anti-tumour agents (Garmory et al. 2003; Martin et al. 2003; Nguyen et al. 2010; Pawelek et al. 2003). Various bacteria invade tumours and have been engineered to destroy them by releasing a chemotherapeutic prodrug (Nemunaitis et al. 2003) or by secreting TNFα (Dang et al. 2001; Nuyts et al. 2001) or other cytokines (Murray et al. 1996). However, a synthetic biology approach would allow a bacterium to be designed rationally with the desired capabilities and safety features. The Voigt lab constructed an E. coli strain that senses a low-oxygen microenvironment such as that found in tumour tissues. In this particular E. coli strain, hypoxia is the cue to upregulate a Yersinia pseudotuberculosis adhesin protein (invasin) that is sufficient for E. coli to invade mammalian cells. Because invasion is only efficient above a certain cell density, the Voigt lab further equipped this E. coli with a second sensor and genetic circuit that senses cell density (the quorum-sensing circuit from Vibrio fischeri) (Anderson et al. 2006). Thus, this E. coli strain should specifically invade tumour cells in a population density-dependent manner.
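
Read as a logic circuit, the behaviour described above resembles an AND gate: invasin is produced only when the cell senses both hypoxia and a sufficient population density. The Python sketch below illustrates that reading only. It is an interpretation of the text rather than a model of the published constructs, and the function name, parameter names and threshold values are hypothetical.

```python
def invasin_expressed(oxygen_fraction: float,
                      cell_density_per_ml: float,
                      hypoxia_threshold: float = 0.02,
                      quorum_threshold: float = 1e8) -> bool:
    """Toy two-input AND gate for a tumour-targeting bacterium.

    Both thresholds are assumed values chosen purely for illustration.
    """
    hypoxic = oxygen_fraction < hypoxia_threshold       # sensor 1: low oxygen
    quorate = cell_density_per_ml >= quorum_threshold   # sensor 2: quorum sensing
    return hypoxic and quorate


# Illustrative scenarios (values are assumptions, not measurements).
scenarios = {
    "healthy tissue, few bacteria": (0.05, 1e5),
    "tumour core, few bacteria": (0.01, 1e5),
    "healthy tissue, dense population": (0.05, 1e9),
    "tumour core, dense population": (0.01, 1e9),
}
for label, (oxygen, density) in scenarios.items():
    print(f"{label}: invasin on = {invasin_expressed(oxygen, density)}")
```

Requiring two independent cues is one way to reduce the chance that the invasive phenotype is triggered outside the intended environment.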

The type III secretion apparatus, normally used by certain bacteria to pump effector proteins into the cytoplasm of the eukaryotic host in order to manipulate it, was exploited in a synthetic biology approach to pump spider silk proteins into the medium (Widmaier et al. 2009). Combined with the previous approaches, a (synthetic) bacterial strain equipped with rationally linked sensors and genetic circuits might be able to sense the environment of damaged tissues, adhere to them specifically and release therapeutic proteins into the affected cells, and could thus kill (or heal) damaged or infected tissues. Refining such approaches together with biological “safety tasks”, which can also be synthetic, would be a further step towards the use of safe therapeutic bacteria.

Optimizing drugs

Often the identification of a new drug candidate is only the beginning of a laborious and time-consuming effort to optimize and mature its pharmacokinetic properties.

Small therapeutic proteins, for example, have a short residence time in the bloodstream. This can be improved by modifying surface residues with polyethylene glycol (PEG). Random derivatization, however, usually creates an ill-defined mixture of molecules with varying degrees of PEGylation. A strategy that introduces PEG residues in a defined way is therefore far preferable to random modification. One such strategy is to incorporate unnatural amino acids with an alkyne side chain into the protein (Deiters et al. 2004). Subsequent modification with azide group-containing PEG residues generates proteins with site-specific PEGylation.
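
A short calculation shows why random derivatization gives an ill-defined mixture. The Python sketch below assumes a hypothetical protein with five reactive surface sites, each of which is independently either modified or not. The site count and function names are illustrative assumptions, not values from the cited work.

```python
from math import comb


def random_pegylation_species(n_sites: int) -> int:
    """Number of distinct products if each site is either PEGylated or not."""
    return 2 ** n_sites


def isomers_by_degree(n_sites: int) -> dict[int, int]:
    """Positional isomers for each degree of PEGylation (0..n_sites PEG chains)."""
    return {k: comb(n_sites, k) for k in range(n_sites + 1)}


n_reactive_sites = 5  # hypothetical number of reactive surface residues
total_species = random_pegylation_species(n_reactive_sites)
by_degree = isomers_by_degree(n_reactive_sites)
```

With five reactive sites there are 32 possible species, including 10 positional isomers each for the doubly and triply PEGylated forms. A single genetically encoded alkyne handle ideally yields one defined product, provided the coupling reaction goes to completion.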

Many proteins are stabilized by disulfide cross-bridges. In recombinant proteins, these disulfide bonds are usually difficult to reproduce because of their redox sensitivity and the lack of relevant chaperones. A general method for the genetically programmed installation of directed, redox-insensitive cross-links has recently been devised (Neumann et al. 2010). The authors combined evolved orthogonal ribosomes with two mutually orthogonal tRNA/aminoacyl-tRNA synthetase (aaRS) pairs to incorporate two different unnatural amino acids into the same protein. The side chain of one amino acid carried an azide group and that of the other an alkyne group, and the two formed a cross-link upon addition of a Cu(I) catalyst.

Post-translational modifications (PTMs) are often present on proteins and can tremendously change their properties. However, recombinant proteins from E. coli are usually unmodified. It is therefore often necessary to use eukaryotic expression systems to produce correctly processed proteins, but even then the pattern of modifications usually differs from the natural situation. Lysine acetylation has been recognized as a widespread PTM in prokaryotes and eukaryotes (Choudhary et al. 2009; Kim et al. 2006; Zhang et al. 2009), but its effect on protein function is largely unexplored. A method to genetically encode this PTM on recombinant proteins in E. coli has recently been devised (Neumann et al. 2008; Neumann et al. 2009). Hence, therapeutic proteins harbouring this modification can now be produced recombinantly without any knowledge of the modifying enzymes involved. Similar methods exist to produce proteins containing lysine methylation or tyrosine sulphation (Liu and Schultz 2006; Nguyen et al. 2009a). Other PTMs that cannot be encoded genetically to date, such as glycosylation or ubiquitination, can be mimicked by incorporating a chemical handle into the protein at the site of the modification, followed by in vitro derivatization (Liu et al. 2003).

Conclusion and perspective

In this review, we did not intend to give a complete list of all the parts available to a synthetic biology approach to pharmaceutical biotechnology, and we apologize to those whose work was left unmentioned (please see Kiel et al. (2010), Ninfa (2010) and Topp and Gallivan (2010) for more recent reviews of this topic). Our aim was to give a flavour of the potential impact that synthetic biology might have on the pharmaceutical industry. Future work should use these parts to create systems for the simultaneous production, identification and optimization of new drugs. Such an approach might look as follows: A reporter is engineered to detect a protein–protein interaction or an enzymatic activity of interest. To this end, one could develop biosensors that sense a reaction product or genetic circuits that are activated once a certain interaction is prevented. Such a sophisticated system will reduce the number of false hits during the screening process. This reporter system is subsequently combined with various in vivo compound libraries to isolate clones with the desired activity. Newly identified lead structures can then be improved further, for example, with the help of unnatural amino acids. Their biosynthesis could be optimized, for instance, by using MAGE or by compartmentalizing the enzymes involved. Any improvement to the system could be identified or selected quickly by the reporter. A lot of work needs to be invested before such an approach becomes the mainstream of drug development. An important next step towards this aim would be to form well-characterized modular parts from individual components that can be freely and quickly combined in the true sense of a synthetic biology approach to pharmaceutical biotechnology. Automating such an approach and developing computational design tools for synthetic biology (Marchisio and Stelling 2009) will be important technological advances that will help in its realization.