N-Linked glycoengineering for human therapeutic proteins in bacteria
- Cite this article as:
- Pandhal, J. & Wright, P.C. Biotechnol Lett (2010) 32: 1189. doi:10.1007/s10529-010-0289-6
Approx. 70% of human therapeutic proteins are N-linked glycoproteins, and therefore host cells for production must contain the relevant protein modification machinery. The discovery and characterisation of the N-linked glycosylation pathway in the pathogenic bacterium Campylobacter jejuni, and subsequently its functional transfer to Escherichia coli, presents the opportunity of using prokaryotes as cell factories for therapeutic protein production. Not only could bacteria reduce costs and increase yields, but the improved feasibility to genetically control microorganisms means new and improved pharmacokinetics of therapeutics is an exciting possibility. This is a relatively new concept, and progress in bacterial N-glycosylation characterisation is reviewed and metabolic engineering targets revealed.
Keywords: N-Linked glycosylation; Therapeutic proteins; Campylobacter jejuni; E. coli; Metabolic engineering
Therapeutic proteins for human use have traditionally been prepared from human or even animal sources with the well-known example, insulin, being derived from the pancreas. With the rapid increase in demand for therapeutics combined with the development of genetic engineering tools, recombinant expression systems are by far the most favourable method of production. Bacteria, yeast, plant, insect and mammalian cell lines are the most common hosts, and human insulin produced in the well-studied Gram-negative bacterium, Escherichia coli, was brought onto the market in 1982 (Johnson 1983). However, the structure of insulin consists of only two polypeptide chains linked by two inter-chain and one intra-chain disulphide bridges. Essentially, post-translational modifications are absent, and this contrasts significantly with the majority of therapeutic proteins. The most common, complex and energy demanding of these modifications is glycosylation, and approx. 70% of therapeutic proteins in the three phases of drug development are glycosylated (Sethuraman and Stadheim 2006). Because these modifications are essential for protein functionality, host cells must be chosen on their ability to perform them. For this reason, only eukaryotic cell lines can be used to produce N-type glycosylated therapeutic proteins, as they have the correct molecular machinery capable of including this structural variation, or so we thought.
N-linked: Glycans attach to nitrogen of asparagine or arginine side-chain,
O-linked: Glycans attach to hydroxyl group of serine, threonine, tyrosine, hydroxylysine, or hydroxyproline side-chains,
P-linked: phospho-glycans linked through the phosphate of a phosphoserine,
C-linked: Glycans attach to a carbon on a tryptophan side chain,
G-linked: Components of glycophosphatidylinositol anchor.
For this review, we concentrate on N-linked glycosylation, as this is the most widely distributed glycan–peptide bond found in therapeutic proteins. N-Linked glycosylation occurs when the membrane protein complex called the oligosaccharyltransferase recognises the Asn-Xaa-Ser/Thr (N-X-S/T) sequon (Lehle and Tanner 1978), although only approximately 66% of sequons are glycosylated due to further structural requirements (Apweiler et al. 1999). Most significantly, from a therapeutic point of view, the exact structure, type and location of the glycan groups (called macro and microheterogeneity) affect both the function and efficacy of the proteins. The incorrect presence or composition of glycans can affect the pharmacokinetic properties and lead to immunogenic responses. Using host systems which cannot correctly glycosylate proteins would ultimately lead to rapid clearance of the drug. Furthermore, it has been reported that as much as 80% of erythropoietin produced in Chinese hamster ovary (CHO) cells is wasted due to incomplete glycosylation (Jacobs and Callewaert 2009).
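As an illustration of what recognising the Asn-Xaa-Ser/Thr sequon means at the sequence level, the following minimal sketch scans a protein sequence for candidate sites. The function name `find_sequons` and the toy sequence are hypothetical, and excluding proline at the Xaa position is an assumption taken from general knowledge of sequons rather than from the text above. A match only marks a candidate site: as noted, roughly a third of sequons are not glycosylated because of further structural requirements.

```python
import re

# Eukaryotic sequon: Asn, then any residue except Pro (assumption), then Ser or Thr.
# The lookahead makes matches zero-width, so overlapping sequons are all reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein: str) -> list[tuple[int, str]]:
    """Return (1-based Asn position, tripeptide) for each candidate sequon."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    toy = "MKNGTLLNPSAANATQ"  # hypothetical sequence, not a real protein
    for position, motif in find_sequons(toy):
        print(position, motif)
```

In the toy sequence, the Asn followed by Pro is skipped, while the other two Asn residues are reported as candidates.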
On the cytoplasmic side of the endoplasmic reticulum (ER), N-acetylglucosamine phosphate transferase adds a GlcNAc group to lipid carrier dolichol-PP.
This lipid-linked oligosaccharide (LLO) is transferred to the luminal side of the ER through a membrane transporter (RFT1).
The core oligosaccharide (Glc3Man9GlcNAc2) is assembled on the dolichol-PP.
The oligosaccharyltransferase transfers Glc3Man9GlcNAc2 onto a specific asparagine site on the nascent polypeptide chain.
Glycoside hydrolases, called glucosidase I and II, then trim this glycan.
α-1,2-Mannosidase removes a specific terminal α-1,2-mannose residue producing a Man8GlcNAc2 structure.
Man8GlcNAc2-containing glycoproteins are then transferred to the Golgi where several α-1,2-mannosidases remove mannose to yield Man5GlcNAc2.
Glycosyltransferases assist in the addition of a diverse variety of monosaccharide units, for example, GalNAc, fucose, sialic acid and galactose, producing complex oligosaccharides.
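The trimming steps listed above can be followed as simple bookkeeping on the glycan composition. The sketch below is illustrative only: it tracks residue counts, not linkages or branching, and the helper names `trim` and `name` are hypothetical. It assumes that glucosidases I and II together remove all three glucose residues, which is consistent with the Man8GlcNAc2 structure named in the next step.

```python
from collections import Counter


def name(glycan: Counter) -> str:
    """Format a composition as e.g. 'Glc3Man9GlcNAc2', skipping absent sugars."""
    return "".join(f"{s}{glycan[s]}" for s in ("Glc", "Man", "GlcNAc") if glycan[s] > 0)


def trim(glycan: Counter, sugar: str, n: int) -> Counter:
    """Return a copy of `glycan` with `n` residues of `sugar` removed."""
    if glycan[sugar] < n:
        raise ValueError(f"cannot remove {n} {sugar} from {name(glycan)}")
    trimmed = glycan.copy()
    trimmed[sugar] -= n
    return trimmed


core = Counter(Glc=3, Man=9, GlcNAc=2)                     # transferred to Asn in the ER
after_glucosidases = trim(core, "Glc", 3)                  # glucosidase I and II (assumed: all Glc)
after_er_mannosidase = trim(after_glucosidases, "Man", 1)  # ER alpha-1,2-mannosidase
after_golgi = trim(after_er_mannosidase, "Man", 3)         # Golgi alpha-1,2-mannosidases

if __name__ == "__main__":
    for step in (core, after_glucosidases, after_er_mannosidase, after_golgi):
        print(name(step))
```

The final composition, Man5GlcNAc2, is the substrate on which the glycosyltransferases of the last step build complex oligosaccharides.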
The intention of this review is to present an overview of how the concept of glycoengineering in bacteria for production of human therapeutics has evolved since the discovery of N-glycosylation in bacteria. Targets for making this system successful are revealed through specific studies which have uncovered crucial cellular mechanisms, and further targets are suggested for improving overall production rates and hence making the bioprocess a viable one.
Chinese hamster ovary cells are presently the preferred host for therapeutic protein production (Sheeley et al. 1997). The main advantage of CHO cells is the wealth of information available regarding their implementation as protein production cell factories (although interestingly there is no public CHO genome sequence available). Their growth characteristics are well-defined from a scaling-up perspective, and technical advances mean they are relatively simple to manipulate genetically through transfection. Most significantly, they offer a post-translationally modified expression product (Sheeley et al. 1997). However, there are major disadvantages in using these cell lines. The costs associated with cell maintenance and growth can be high, and a relatively long time is required for the process of growing cells, expressing relevant genes, and finally harvesting the protein product. There is also the issue of transmitting viruses and prions. Perhaps the most important disadvantage of CHO cells from a biotechnology point of view is the inevitable production of different types of glycoprotein in addition to the glycoprotein desired. This heterogeneous mixture can also be the consequence of differing culture conditions, further complicating the matter (Yuen et al. 2005). Therefore, methods of purification are required, increasing overall production costs. Metabolic engineering of CHO cells has seen some advances. For example, the introduction of three proteins, alkaline phosphatase, p21 and CCAAT/enhancer-binding protein α, led to arrests in cell proliferation of growing cell cultures, and thereby increased recombinant protein production up to fifteen times (Fussenegger et al. 1998). More recently, the use of antisense DNA and gene targeting have improved glycosylated protein yield (Warner 1999).
Other host cell types include insect cell lines and insect larvae. Although naturally unable to produce complex oligosaccharides, some success has been achieved with sialylated forms using Mimic insect cells expressing five mammalian genes encoding glycosyltransferases (Legardinier et al. 2005). Transgenic plants have also been investigated for human glycoprotein production, mainly to overcome the lack of human sialic acid and galactose glycans and remove immunogenic xylose and α-1,3-fucose glycans (Gomord and Faye 2004; Strasser et al. 2004). Transgenic animals have been used as host cells (Houdebine 2002), although issues regarding serum half-life have arisen due to natively occurring high-mannose type glycans (van Berkel et al. 2002). Yeast expression systems have provided the most exciting advances of late in regard to using alternative host cells (Hamilton et al. 2003; Hamilton and Gerngross 2007; Li et al. 2006; Potgieter et al. 2009). Compared to previously discussed systems, the relatively shorter growth time means potential productivity rates are higher and scaling up fermentation is a well-understood procedure. Yeasts do not naturally perform the trimming associated with human-like N-glycosylations (discussed in “Introduction” section), but share the same initial biosynthesis pathway (Fig. 1). The yeast Pichia pastoris has been engineered to eliminate adverse glycans and produce human glycosylations, and very importantly to a high degree of homogeneity (Hamilton et al. 2003). This ability to control the glycoform of recombinant proteins has far-reaching advantages over rival host cell platforms, as screening different products led to the discovery of therapeutic proteins of enhanced efficacy (Li et al. 2006).
Glycosylation in bacteria
The significant breakthrough arrived when Wacker et al. (2002) transferred the Pgl pathway into E. coli, the workhorse of molecular biology, and successfully produced glycoproteins, although in tiny quantities. This opened the door to using this well-understood bacterium for human glycoprotein production. It is useful here to examine the advantages of using a bacterium such as E. coli to produce human glycoproteins. From a molecular standpoint, E. coli has a published genome sequence, and is easy to modify genetically. It is the most studied of all organisms; thus its metabolic pathways are well elucidated and there is a wealth of literature allowing more accurate predictions of its behaviour in various conditions or in response to genetic changes (Baneyx 1999). Generally, bacteria can produce higher titres of product (volumetric productivity mg l−1 h−1) even compared to yeast (Müller et al. 2006), and are certainly cheaper to culture with shorter fermentation times. The reduced risk of the viral contamination often associated with mammalian cell culture is also a vital advantage. More specifically, bacteria are less sensitive to glycosylation changes: cell survival is not dependent upon the glycosylation process, and cell death is therefore unlikely. This means glycosylation is more amenable to control in E. coli, leading to less heterogeneous glycoprotein products. Further, controlling glycan type and site can lead to increased efficacy as detailed earlier (Li et al. 2006). The potential to modify protein structure, half-life, immunogenicity, cellular uptake and target recognition is extremely exciting, and bacteria would be an ideal host compared to eukaryotes where hundreds of enzymes are involved, complicating process control.
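Volumetric productivity, quoted above in mg l−1 h−1, is the product concentration reached divided by the time taken to reach it, which is why shorter fermentation times matter even at equal titre. The short sketch below makes the unit arithmetic explicit; the function name and all numbers are hypothetical and are not measurements from any of the cited studies.

```python
def volumetric_productivity(titre_mg_per_l: float, hours: float) -> float:
    """Volumetric productivity in mg l^-1 h^-1: final titre divided by run time."""
    if hours <= 0:
        raise ValueError("fermentation time must be positive")
    return titre_mg_per_l / hours


# Hypothetical runs reaching the same final titre over different durations.
short_run = volumetric_productivity(100.0, 20.0)   # 5.0 mg l^-1 h^-1
long_run = volumetric_productivity(100.0, 200.0)   # 0.5 mg l^-1 h^-1

if __name__ == "__main__":
    print(short_run, long_run)
```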
More recent work has increased understanding of this N-glycosylation system. In bacteria, the N-glycosylation process occurs independently from protein translocation machinery, and therefore fully folded proteins can be glycosylated (Kowarik et al. 2006). This makes the process a true post-translational modification, as opposed to eukaryotes where it is more of a co-translational one. The putative oligosaccharyltransferase in C. jejuni, PglB, recognises an extended sequon compared to oligosaccharyltransferases in eukaryotes (Fig. 2). In addition, recognition requires a negatively charged amino acid side chain at position −2 (Kowarik et al. 2006). By performing elegant mutational and structural experiments, the relaxed specificity of PglB has been demonstrated (Linton et al. 2005). This means that variant glycan structures can be added to the polypeptide backbones. It is now known that proteins native to E. coli interact with the pgl locus (Linton et al. 2005). One example is Wzx, an ABC transporter which can replace the transport activity of PglK, and lead to the provision of incomplete glycans (Kelly et al. 2006). Another example is the WecA protein, an undecaprenylphosphate (UDP) GlcNAc-1-phosphate transferase that has functions in lipid-linked glycan biosynthesis. This latter example showed that a lipid-linked intermediate is involved in the process, and confirmed the earlier hypothesis by Wacker et al. (2002) that this leads to a heptasaccharide glycan starting with the HexNAc sugar residue, as opposed to the bacillosamine. This pioneering work has contributed significantly to the prospects of glycoengineering in E. coli.
Targets and progress for humanizing bacteria
There are several major differences between the N-glycosylation process in humans and Campylobacter species (Fig. 1), including consensus sequence recognition, lipid carrier type, method of glycan transfer and the actual glycan groups themselves. All will require further research in order to humanise glycoproteins produced in bacteria.
As specified earlier, the process of N-glycosylation in eukaryotes and bacteria is co-translational and post-translational, respectively. Therefore, bacteria can glycosylate completely folded proteins most likely because the consensus sites are presented in displayed regions of the protein (Kowarik et al. 2006; Rangarajan et al. 2007). Eukaryotic proteins are in a flexible form prior to glycosylation, and are subsequently correctly folded. To avoid missing essential N-glycosylation sites in therapeutic glycoproteins, the glycosylation sites would need to be in flexible positions at the time of the modification (Kowarik et al. 2006).
As shown in Fig. 1, the C. jejuni system uses a specific sugar donor and attaches bacillosamine to asparagine, using the enzyme glycosyl-1-phosphate transferase [encoded by PglC (Glover et al. 2006)]. Unfortunately PglC cannot use UDP-GlcNAc or UDP-GalNAc as donors, the primary glycans in human proteins. Therefore, the bacterial system requires an enzyme that can use GlcNAc as a donor substrate, and attach this sugar to the acceptor (UDP-PP). Bacterial systems do contain proteins that can transfer HexNAc sugars to the membrane bound polyprenol phosphate acceptor as a function of peptidoglycan biosynthesis. Examples include WecA, MraY, and WbpL which transfer GlcNAc, MurNAc-pp and FucNAc/QuiNAc from sugar donors, respectively (Price and Momany 2005). Therefore, WecA is a candidate protein here for GlcNAc transfer, and this strategy is discussed later. Moreover, studies on the catalytic mechanisms and substrate specificities in bacterial glycosyltransferases have revealed carbohydrate recognition domains which could be manipulated in the future (Price and Momany 2005).
Complex human glycans are characterised by terminal sialic acid residues that have a strong effect on the half-life of glycoproteins in the serum. Not only do incorrectly sialylated proteins have a reduced lifetime in serum, but additional sialic acids increase the half-life of recombinant human erythropoietin (Egrie and Browne 2001; Elliott et al. 2003). Sialylation is performed by sialyltransferases and, unfortunately, the expression of such mammalian glycosyltransferases in bacteria leads to very low production of functional product. However, Skretas et al. (2009) recently expressed human sialyltransferase ST6GalNAcI, an enzyme that sialylates O-linked glycoproteins, in E. coli. This is the first example of the functional expression of human glycosyltransferases in bacteria. Impressively, by generating thioredoxin and glutaredoxin/glutathione mutant cells, oxidising conditions were created in the cytoplasm, and hence functionality of soluble ST6GalNAcI was maintained (Skretas et al. 2009). Co-expression of specific cellular chaperones increased this yield further. Importantly, this study has implications for the functional expression of human glycosyltransferases (N-linked and O-linked) in E. coli.
The oligosaccharyltransferase in eukaryotes is an enzyme composed of eight membrane-bound subunits, including the STT3 unit in yeast, which has been directly implicated in the process of catalytically adding sugars to asparagine (Kelleher and Gilmore 2006). Campylobacter protein PglB is homologous to STT3 (Dempski and Imperiali 2002), and both contain the highly conserved WWDYG amino acid sequence. The flexibility of PglB, presented previously, adds encouragement to the concept of controlling its function in bacterial expression systems. In E. coli, O-antigen LPS biosynthesis involves the addition of glycan groups to a lipid A core. Replacing the enzyme O-antigen ligase with PglB illustrated that various O-antigen glycans could be transferred to acceptor proteins (Feldman et al. 2005). These findings combined with further engineering work mean PglB could potentially be used to transfer complex oligosaccharides in N-linked glycosylation. However, there are significant hurdles at present for using PglB. Firstly, the difference in consensus sequences means the existing N-X-S/T in eukaryotes is not recognised by PglB, although this can be overcome by engineering the bacterial consensus D/E-Z-N-X-S/T into eukaryotic proteins. Moreover, work uncovering structural and functional characteristics of the protein could contribute to altering the enzyme’s substrate specificity, and lead to eukaryotic consensus recognition.
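Engineering the bacterial consensus into a eukaryotic protein amounts, at the sequence level, to placing an acidic residue two positions before the asparagine of an existing sequon. The sketch below is a hypothetical helper, not a method from the cited studies: it lists the point mutations that would convert each N-X-S/T site into D/E-Z-N-X-S/T. Treating Z and X as any residue except proline is an assumption, and the example sequence is invented.

```python
import re

# Eukaryotic sequon N-X-S/T; excluding Pro at X is an assumption, not from the text.
EUKARYOTIC = re.compile(r"(?=N[^P][ST])")


def suggest_mutations(protein: str, acidic: str = "D") -> list[str]:
    """Propose point mutations (e.g. 'L3D') that place an acidic residue at -2
    of every eukaryotic sequon that does not already carry D or E there."""
    suggestions = []
    for match in EUKARYOTIC.finditer(protein):
        asn = match.start()          # 0-based index of the Asn
        if asn < 2:
            continue                 # no -2 position exists
        if protein[asn - 1] == "P":
            continue                 # Pro at -1 assumed incompatible; site left alone
        old = protein[asn - 2]
        if old in "DE":
            continue                 # already matches D/E-Z-N-X-S/T
        # 1-based position of the -2 residue is (asn - 2) + 1 = asn - 1
        suggestions.append(f"{old}{asn - 1}{acidic}")
    return suggestions


if __name__ == "__main__":
    print(suggest_mutations("MKLANGTDVNATAAQNPS"))  # hypothetical sequence
```

The sketch only performs the sequence bookkeeping. It does not check whether a suggested substitution would disturb a neighbouring sequon, protein folding or function, all of which would need to be assessed separately.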
In 2004, a new strategy for generating glycoproteins in E. coli was reported (Zhang et al. 2004). They expanded on previous breakthroughs regarding the co-translational process of incorporating unnatural amino acids into proteins using evolved tRNA synthetase/tRNA pairs (Wang et al. 2001). They provide evidence for successful incorporation of GlcNAc (importantly the primary human-type glycan) onto serine residues on myoglobin, and also demonstrated that expressing a glycosyltransferase can lead to the addition of the galactose sugar onto this glycan. However, this illustration of a unique strategy for producing homogenous glycoforms in E. coli was unwarranted as the authors retracted the paper in late 2009 as a consequence of their inability to replicate the results (Zhang et al. 2009). This is a good example of just how complex the process of glycosylation is, and how a variety of factors can affect the final protein product.
More recently, a combined (in vivo and in vitro) method for producing eukaryotic N-glycoproteins has been demonstrated (Schwarz et al. 2010). In this study, the Pgl pathway is introduced into E. coli, minus the genes coding for sugar biosynthesis (pglD, pglE and pglF) and transfer (pglC). This meant that the native WecA protein in E. coli (referred to previously) was able to transfer GlcNAc-1-phosphate to UND-P making the primary glycan GlcNAc (human-like) as opposed to bacillosamine. Subsequent purification and exo-α-N-acetylgalactosaminidase digestion meant GlcNAc tagged proteins could be used as a substrate for in vitro transglycosylation reactions. When an excess of Man3GlcNAc was added to these reactions, a branching Man3GlcNAc2 structure was obtained on Campylobacter protein AcrA as well as eukaryotic proteins, human IgG-Fc (CH2 domain) and the single-chain antibody F8 (Schwarz et al. 2010).
When E. coli is used for high-level recombinant protein expression, it is important to achieve a desirable balance between anabolic and catabolic reactions. Assembling oligosaccharide groups could represent a bottleneck in recombinant glycoprotein production that would ultimately lead to very small amounts of glycoprotein production. Low levels of the glycoprotein, AcrA, were produced when the Pgl pathway was transferred to E. coli (Wacker et al. 2002). Moreover, E. coli needs to be able to produce its own precursors for glycosylation, primarily because of the high cost of adding the required chemicals to the media, and the relatively poor uptake efficiency of glycans with free hydroxyl groups (Sarkar et al. 1995). E. coli generates the correct glycan GalNAc using the pathway shown in Fig. 3 and this was illustrated previously in an in vitro study using a combination of six purified recombinant enzymes (Shao et al. 2002). Processes including central carbon metabolism (glycolysis), nucleotide sugar metabolism and peptidoglycan synthesis are all involved in production.
As mentioned previously, although E. coli can glycosylate C. jejuni proteins PEB3 and AcrA (Wacker et al. 2002), it does so very inefficiently. A study using a global proteomics technique, isobaric tags for relative and absolute quantification (iTRAQ), together with metabolic maps on graphs (MMG) and pseudo-selective reaction monitoring (SRM), led to forward engineering of metabolism in an E. coli strain which contained the Pgl pathway and AcrA protein (Pandhal et al. submitted). Production of the glycosylated protein was improved by over 300% using this systems biology based approach. Further manipulation of the major and minor pathways would ultimately lead to a fitter strain for glycoprotein production, and engineered eukaryotic cells, e.g. CHO cells, can reveal already tried and tested successful strategies, which could feasibly be applied to bacteria (Warner 1999).
Now that it is known that the most popular bacterial host for recombinant protein production can perform N-glycosylation, the race (academically and industrially) to achieve the ultimate goal and produce human-type glycoproteins is on. Increased understanding of this modification in C. jejuni has revealed interesting insights into shared and contrasting processes with eukaryotic glycosylation, essential for achieving the first ‘E. coli-produced human glycoprotein’ ambition. Humanizing the yeast, P. pastoris, has been achieved through a multitude of techniques, and there is no reason why similar principles cannot be applied in E. coli, particularly whilst hurdles are continuing to be overcome. Metabolic engineering (or synthetic biology) strategies are taking a variety of forms due to the better availability and accessibility of sophisticated genomic and post-genomic tools, as well as the application of systems biology approaches to reveal metabolic bottle-necks in host cells. With a combination of these research outcomes, and from lessons learnt from previous studies on glycoprotein-producing mammalian cells, glycoengineering in E. coli for human therapeutic drug production holds a potentially sweet future.
The authors acknowledge funding from the UK’s Biotechnology and Biological Sciences Research Council (BBSRC) through the Bioprocess Research Industry Club (BRIC) programme (ref BBF0048421), and also from the Engineering and Physical Sciences Research Council (EPSRC) (ref EP/E036252/1).