Multicopy plasmid integration in Komagataella phaffii mediated by a defective auxotrophic marker
A commonly used approach to improve recombinant protein production is to increase expression levels by providing extra copies of a heterologous gene. In Komagataella phaffii (Pichia pastoris) this is usually accomplished by transforming cells with an expression vector carrying a drug-resistance marker, followed by screening for multicopy clones on plates with increasingly higher concentrations of an antibiotic. Alternatively, defective auxotrophic markers can be used for the same purpose. These markers are generally transcriptionally impaired genes lacking most of the promoter region. Among the defective markers commonly used in Saccharomyces cerevisiae is leu2-d, an allele of LEU2, which is involved in leucine metabolism. Cells transformed with this marker can recover prototrophy when they carry multiple copies of leu2-d, which compensate for the poor transcription from this defective allele.
A K. phaffii strain auxotrophic for leucine (M12) was constructed by disrupting endogenous LEU2. The resulting strain was successfully transformed with a vector carrying leu2-d and an EGFP (enhanced green fluorescent protein) reporter gene. Vector copy numbers were determined from selected clones which grew to different colony sizes on transformation plates. A direct correlation was observed between colony size, number of integrated vectors and EGFP production. By using this approach we were able to isolate genetically stable clones bearing as many as 20 integrated copies of the vector and with no significant effects on cell growth.
In this work we have successfully developed a genetic system based on a defective auxotrophic marker which can be applied to improve heterologous protein production in K. phaffii. The system comprises a K. phaffii leu2 strain and an expression vector carrying the defective leu2-d marker, which allowed the isolation of multicopy clones after a single transformation step. Because a linear correlation was observed between copy number and heterologous protein production, this system may provide a simple approach to improve recombinant protein productivity in K. phaffii.
Keywords: Komagataella phaffii; Leucine biosynthesis; Auxotrophic marker; Multicopy integration; Expression system
The methylotrophic yeast Komagataella phaffii (formerly Pichia pastoris) is one of the most important expression platforms for the production of recombinant proteins [1, 2]. It offers many advantages such as: easy genetic manipulation; growth at high cell densities, e.g. 200 g L−1 dry weight during a glucose-limited fed-batch cultivation; ability to produce heterologous proteins at high levels, e.g. more than 18 g L−1 of lignocellulolytic enzyme TrCBH2; and post-translational modifications similar to those of higher eukaryotes.
Due to its biotechnological interest, many studies have focused on the genetic improvement of K. phaffii in order to optimize protein production. A well-established approach is to ensure high transcription levels of a heterologous gene, thus favoring translation of the desired mRNA. Typically, this can be achieved by constructing expression cassettes under the control of strong promoters and/or by screening for clones bearing multiple copies of the desired gene (for a review see [6, 7]). Several genetic strategies are available for the isolation of multicopy clones. Yeast cells can be transformed with vectors carrying extra copies of the expression cassette cloned in tandem (multimeric construction), or successive rounds of transformation can be performed using different selection markers. In both cases cloning is labor-intensive and the extent of copy number increase is limited.
Another option is the use of antibiotic-resistance markers, in which case one looks for transformants growing at higher concentrations of the antibiotic (direct selection method). A previous study showed that this type of selection resulted in the isolation of sporadic multicopy integrants with increased productivity of the desired protein. Dominant markers can also give rise to multicopy clones by posttransformational vector amplification (PTVA) or liquid PTVA. It has been demonstrated that after transformation with a few copies of a vector carrying a drug-resistance marker, such as those conferring resistance to zeocin or G418, cells can be subjected to stepwise higher concentrations of the drug, resulting in the selection of multicopy clones. The use of PTVA in combination with the rDNA non-transcribed sequence (NTS) as an integration target resulted in multicopy clones in K. phaffii. Besides being laborious and expensive due to the high cost of eukaryotic antibiotics, the use of dominant markers has the disadvantage that a significant number of clones show increased natural drug resistance for reasons other than vector copy number.
An alternative strategy is based on the use of defective auxotrophic markers, i.e. genes that are poorly transcribed, typically due to extensive deletions of their promoters. To compensate for the low transcription levels, cells need to amplify the copy number of the defective marker in order to recover prototrophy. Consequently, the copy number of the neighboring heterologous gene is also amplified. An example of such a defective marker is the leu2-d allele, which contains only 29 base pairs of the original promoter and is commonly used in S. cerevisiae for plasmid maintenance at high copy number under selective pressure. Due to this feature, this system has also been used to increase recombinant protein production in this yeast [18, 19, 20]. This prompted us to develop an analogous system for K. phaffii. To accomplish this, we sought to construct a K. phaffii strain auxotrophic for leucine and to develop an integrative expression vector based on leu2-d as a tool to increase recombinant protein production in this yeast.
Results and discussion
Construction of a leu2 auxotrophic strain
Genetic manipulation of K. phaffii is possible due to its widely used transformation system, which enables integration of foreign DNA into the genome via homologous recombination. This approach has been successfully used to disrupt several genes in order to create auxotrophic mutants, e.g. URA5, ARG1, ARG2, ARG3, HIS1, HIS2, HIS5 and HIS6. Recently, a CRISPR-Cas9 system was developed for K. phaffii, which has greatly facilitated gene knockout in this yeast.
Heterologous expression in K. phaffii M12
Multiple copy integration
Although the defective leu2-d allele has been successfully used to increase plasmid copy number and protein production in S. cerevisiae, it has not yet been tested for the same purpose in K. phaffii. In a previous work, attenuated ADE1 and ADE2 genes involved in adenine biosynthesis were used to develop a color-based system for the screening of multicopy integrants in K. phaffii. However, this system is based on large plasmids and, in some cases, the effects of background transcription from vector sequences were responsible for the recovery of adenine prototrophy.
Seven days after transformation of K. phaffii M12 cells with pKGFP-ld, colonies of diverse sizes were observed on MD plates. Ten transformants representing colonies of different sizes were selected for further analysis. After a few passages on fresh MD plates, four clones derived from the smallest colonies present on the original transformation plate did not grow upon replica plating and were removed from this study. We speculate that these abortive clones were transformed with a limited number of copies of the defective marker and were therefore unable to sustain growth under selective conditions.
Main features of the selected clones studied in this work
Vector copy number
EGFP production (fluorescence units)
0.1596 ± 0.0062 | 19,215 ± 2388
0.0708 ± 0.0022 | 7705 ± 550
0.1614 ± 0.0027 | 23,688 ± 2152
0.1408 ± 0.0075 | 18,387 ± 1045
0.1650 ± 0.0078 | 20,074 ± 1738
0.0502 ± 0.0064 | 6711 ± 206
0.1707 ± 0.0069
Previous works have shown that single or low copy integrated messages are genetically stable in K. phaffii under different conditions [38, 39]; however, few studies have focused on the integrity of multicopy clones. Since multicopy K. phaffii strains generally arise from multiple events of homologous recombination at the same locus, the integrated messages are typically repeated in tandem. The stability of such an array may be compromised by excisional recombination, which can “loop out” the genetic message under non-selective conditions. In order to test the stability of integrated pKGFP-ld, transformed cells were grown in non-selective medium (YPD) for 36 and 72 generations. The culture was transferred to fresh medium every 24 h to ensure that cells were in exponential phase throughout the experiment. After growth for 96 h or 144 h (36 and 72 generations, respectively), the copy number of the selected clones was assessed by Southern blot analysis, which showed that all clones maintained the original vector copy number (data not shown).
In a recent work, S. cerevisiae strains with multiple integrated cassettes bearing different defective auxotrophic markers also showed mitotic stability under prolonged non-selective conditions. In contrast, S. cerevisiae cells transformed with five or more copies of an integration vector conferring resistance to G418 and expressing SUC2 (invertase) were very unstable during long-term culture in non-selective medium. When K. phaffii was transformed with a set of vectors containing sequentially increasing copies of the porcine insulin precursor gene (PIP), both low and high copy strains were stable in serial culture in non-selective YPD medium. However, in high copy strains, loss of PIP cassettes was observed after 96 h of methanol induction.
Based on these previous results, it has been proposed that multicopy strains should be carefully evaluated for genetic stability, especially under conditions of high expression or secretion. In our work, since EGFP was produced intracellularly from a moderately strong K. phaffii promoter (P PGK1), it is possible that the titers of this particular protein were not high enough to compromise cell growth, as shown in Fig. 4; hence, genetic stability was observed.
Correlation between copy number and protein production
However, the relationship between gene copy number and protein production is not always linear, and in some cases a higher copy number has proved detrimental, especially for secreted proteins. In another study involving EGFP, an increase in the secreted protein was observed with up to three copies but a decrease occurred with six copies. When multicopy clones were used to produce intracellular human superoxide dismutase (hSOD) and secreted human serum albumin (HSA), a difference was observed in the correlation between gene copy number and productivity for non-secreted and secreted proteins. The productivity of hSOD correlated linearly with gene copy number, while HSA productivity increased up to approximately 5–7 gene copies and then decreased with higher copy numbers. K. phaffii strains secreting human trypsinogen under the control of the AOX1 promoter presented a positive correlation between copy number and product yield from 1 to 2 copies per cell, and a negative correlation at 3 or more copies. Upon overexpression, a large part of the heterologous protein was retained in the insoluble fraction of the endoplasmic reticulum. From these studies it is clear that bottlenecks in the secretory pathway are to some extent responsible for the low productivity of some multicopy clones.
Since the effect of gene dosage may vary from one protein to another, it is not possible to define in advance the optimal copy number for a specific heterologous gene; this should be assessed on a case-by-case basis. However, by using the approach presented in this work one can easily obtain a panel of clones with different copy numbers to be screened for the desired application. Furthermore, we envision that this approach might also be applied in synthetic biology studies in which different doses of specific genes may be required. This could be rapidly achieved by transforming M12 with different plasmids bearing the leu2-d marker, followed by screening for the desired phenotype. Work is underway in our laboratory to test this new application.
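As an illustration of how such a panel of clones might be evaluated, the sketch below fits a least-squares line and computes the Pearson correlation coefficient between copy number and fluorescence. The function name `linear_fit` and all numeric values are hypothetical placeholders chosen for the example; they are not measurements from this work.

```python
import math


def linear_fit(copies, signal):
    """Return (slope, intercept, pearson_r) for signal as a function of copy number."""
    n = len(copies)
    if n != len(signal) or n < 2:
        raise ValueError("need two equally sized series with at least two points")
    mean_x = sum(copies) / n
    mean_y = sum(signal) / n
    sxx = sum((x - mean_x) ** 2 for x in copies)
    syy = sum((y - mean_y) ** 2 for y in signal)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(copies, signal))
    if sxx == 0 or syy == 0:
        raise ValueError("both series must vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, sxy / math.sqrt(sxx * syy)


# Placeholder values for illustration only (not data from this study).
example_copies = [2, 5, 10, 20]
example_signal = [2100.0, 5200.0, 9900.0, 20300.0]

slope, intercept, r = linear_fit(example_copies, example_signal)
print(f"slope = {slope:.1f} units per copy, intercept = {intercept:.1f}, r = {r:.3f}")
```

A coefficient close to 1 would be consistent with a linear gene-dosage effect, whereas a plateau or decline at high copy number, as reported for some secreted proteins, would lower it.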
In this work, we proposed a simple approach to obtain K. phaffii clones containing multiple copies of a desired expression vector. Our genetic system is based on a K. phaffii strain auxotrophic for leucine which is transformed with an expression vector bearing a defective leu2-d marker. The main advantage of the approach proposed here is the ease of selecting multicopy clones, which in our case was based on colony size. This approach might serve as a first step in the construction of strains with higher productivity, thus lowering the costs of industrial recombinant protein production.
Strains and growth conditions
Komagataella phaffii GS115 (his4) and X-33 (Invitrogen) were used as a source of template DNA to amplify LEU2 and cell host to perform transformation with the disruption cassette, respectively. K. phaffii was routinely grown on YPD (1% yeast extract, 2% peptone and 2% glucose) at 28 °C. Solid medium was prepared by the addition of 2% agar. After transformation yeast cells were plated on YPD containing 300–500 µg mL−1 G418 or 150 µg mL−1 hygromycin B. Transformants were tested on MD [0.34% Yeast Nitrogen Base (YNB), 1% ammonium sulphate, 2% glucose, 0.4 µg mL−1 biotin and 2% agar] and buffered MD [MD with 100 mM potassium phosphate (pH 6.0)] supplemented or not with 0.04 or 0.08% leucine. For induction of heterologous gene expression from the P AOX1 promoter cells were grown in a medium containing 1% yeast extract, 2% peptone, 100 mM potassium phosphate (pH 6.0), 0.34% YNB, 1% ammonium sulphate, 0.4 µg mL−1 biotin supplied with 1% glycerol (BMGY medium) or 0.5% methanol (BMMY medium). When liquid medium was used, growth was carried out under agitation (200 rpm) in shake flasks with a volume at least 10 times greater than the volume of the medium.
Cloning procedures were carried out in E. coli XL10-gold (Stratagene, USA) which was cultivated in modified LB medium (0.5% yeast extract, 1% peptone and 1% NaCl) containing the appropriate antibiotic for selection of transformants (100 µg mL−1 ampicillin or 50 µg mL−1 kanamycin). Bacterial cells were grown at 37 °C with constant shaking (250 rpm). For solid medium, 1.5% agar was added.
Phusion high-fidelity DNA polymerase (Finnzymes) was routinely used for PCR according to the instructions of the manufacturer. To amplify LEU2, Easy Taq DNA polymerase (LGC Bio, Brazil) was used in a final volume of 50 μL consisting of 0.2 mM dNTP, 0.2 μM each primer, 2 mM MgCl2, 1X Easy Taq buffer, 2 U polymerase and 1–5 ng template DNA. PCR involved an initial denaturation step at 96 °C for 3 min followed by 30 cycles of 94 °C for 1 min, 60 °C for 1 min and 72 °C for 2 min, and a final elongation step at 72 °C for 5 min.
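The reaction composition above can be translated into pipetting volumes with the dilution relation C1·V1 = C2·V2. The following is a minimal sketch: the final concentrations and the 50 μL volume come from the text, while the stock concentrations (10 mM dNTP, 10 μM primers, 50 mM MgCl2, 10X buffer) and the helper name `stock_volume_ul` are assumptions made for the example.

```python
REACTION_VOLUME_UL = 50.0  # final reaction volume stated in the text


def stock_volume_ul(final_conc, stock_conc, total_volume_ul=REACTION_VOLUME_UL):
    """Volume of stock needed to reach final_conc in the reaction (C1*V1 = C2*V2)."""
    if stock_conc <= 0:
        raise ValueError("stock concentration must be positive")
    return final_conc / stock_conc * total_volume_ul


# name: (final concentration from the text, ASSUMED stock concentration)
components = {
    "dNTP mix (mM)": (0.2, 10.0),
    "forward primer (uM)": (0.2, 10.0),
    "reverse primer (uM)": (0.2, 10.0),
    "MgCl2 (mM)": (2.0, 50.0),
    "Easy Taq buffer (X)": (1.0, 10.0),
}

volumes = {
    name: stock_volume_ul(final, stock)
    for name, (final, stock) in components.items()
}

for name, volume in volumes.items():
    print(f"{name}: {volume:.2f} uL")

# Whatever remains is shared by polymerase, template DNA and water.
print(f"remaining volume: {REACTION_VOLUME_UL - sum(volumes.values()):.2f} uL")
```

With different stock solutions only the second number of each pair needs to change.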
Plasmid extraction, electrophoretic analysis and other basic DNA manipulations were performed as described previously. For DNA elution from agarose gels and for amplicon purification, the Wizard SV Gel and PCR Clean-Up System (Promega, USA) was used according to the manufacturer’s instructions. Genomic DNA was purified using the Wizard Genomic DNA Purification Kit (Promega, USA) following the manufacturer’s protocol.
Construction of disruption cassette
Table 2 Primers used in this work (restriction sites: BamHI and BglII)
A vector based on pYRCre was constructed in order to promote marker excision in K. phaffii. Plasmid pYRCre was originally used to express the CreA recombinase in S. cerevisiae. The P GAL1 promoter present in this vector was removed by XbaI digestion and replaced by a 441-bp fragment corresponding to the S. cerevisiae P TEF1 promoter, which was obtained by PCR using primers TEF-1F and TEF-1R (Table 2). The amplicon was digested with AvrII and cloned into XbaI-digested pYRCre. The resulting vector, pYRCre2, was used to transform K. phaffii and selection was made on YPD plates containing hygromycin B. Transformants were incubated at 28 °C for 3 days to allow expression of CreA recombinase and then selected clones were transferred to a YPD plate for plasmid curing. Isolated colonies were replica plated on YPD plates containing G418 or hygromycin B to confirm the removal of the kan marker and the curing of pYRCre2, respectively.
Construction of expression plasmid pGFP-L2
First, a vector constructed in our lab derived from pPIC9 (Invitrogen, USA) with the EGFP reporter gene under the control of the P AOX1 promoter was digested with EcoRV. This digestion removed the entire HIS4 sequence, which was replaced by the LEU2 gene obtained from pGEM-LEU after digestion with PvuII. The resulting vector, pPIC-LEU, was digested with BamHI and NotI to remove the EGFP gene which was fused in-frame to the α-factor secretion signal. This secretable version of EGFP was replaced by a 741 bp EGFP fragment from pEGFP-N3 (Clontech, USA) after digestion of this plasmid with the same enzymes. The resulting plasmid, which allows intracellular expression of EGFP, was named pGFP-L2. Before K. phaffii transformation pGFP-L2 was linearized with SacI to promote targeted integration to the P AOX1 locus.
Construction of expression vector pKGFP-ld
The leu2-d allele was amplified by PCR from the S. cerevisiae genome with primers 5- and 3-leud (Table 2). The amplified 1.4 kb fragment included the LEU2 coding region with its transcription termination region and only 29 bp of its promoter. The leu2-d amplicon was cloned into pBlueScript SK II (Agilent Technologies) and then liberated by BglII digestion for subcloning into BamHI-linearized pPICK2, resulting in the pK-ld vector. This vector was digested with SacI and NotI to remove the α-factor secretory sequence. This digestion also removed a 179 bp fragment from P PGK1, which was restored when the EGFP gene was cloned. The 916 bp fragment including the EGFP gene fused to the 179 bp fragment from P PGK1 was obtained by digestion of pPICK-GFP [a vector derived from pPIC9 (Invitrogen) for intracellular expression of EGFP under the control of P PGK1] with SacI and NotI. Cloning of this 916 bp fragment into pK-ld resulted in the pKGFP-ld vector. This vector was linearized with SacI to target integration to the PGK1 locus.
Komagataella phaffii X-33 was transformed by electroporation following the protocol described in the Pichia Expression Kit (Invitrogen). Transformation with pYRCre2 was carried out as previously described for the auto-replicative pPICHOLI vector.
Komagataella phaffii cells expressing EGFP were grown in 5 mL BMGY for 16 h at 28 °C. After cell count, pelleted cells were re-suspended in 20 mL BMMY to a final OD600 of 0.3. The culture was incubated at 28 °C and methanol was added to a final concentration of 0.5%. After 24 h of induction cells were imaged in a Zeiss Axio Observer Z1 Inverted Fluorescence Microscope equipped with 63× NA 1.4 oil immersion objective and a cooled CCD camera to analyze EGFP fluorescence. The images were acquired with Zen2011 software (Zeiss) and manipulated with Microsoft Office Picture Manager or Adobe Photoshop.
A fresh colony was inoculated in 500 µL of MD medium in a deep-well plate and incubated for 24 h at 30 °C and 200 rpm. The appropriate volume of this culture was inoculated in 100 µL of MD to an OD600 of 0.08 in a 96-well plate. Cell growth was monitored on the Epoch Microplate Spectrophotometer (Biotek) by incubating at 30 °C under agitation at 300 rpm for 72 h. OD600 data were collected every 30 min. Three biological replicates were tested for each analyzed clone and the mean of the three values is presented. The natural logarithm of the OD600 values was used to construct growth curves. Maximal growth rate was calculated from the slope of the linear section of these curves (up to eight hours of growth).
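The growth-rate calculation described above (slope of ln OD600 versus time over the first eight hours) can be sketched as an ordinary least-squares fit. The function name `max_growth_rate` and the synthetic readings are assumptions for illustration; real plate-reader data would replace them.

```python
import math


def max_growth_rate(times_h, od600, window_h=8.0):
    """Least-squares slope of ln(OD600) versus time (h) within the first window_h hours."""
    points = [
        (t, math.log(od))
        for t, od in zip(times_h, od600)
        if t <= window_h and od > 0
    ]
    if len(points) < 2:
        raise ValueError("need at least two positive readings inside the window")
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    return sxy / sxx


# Synthetic readings every 30 min from a starting OD600 of 0.08,
# generated with an arbitrary rate of 0.16 per hour for demonstration.
times = [0.5 * i for i in range(17)]  # 0 to 8 h
readings = [0.08 * math.exp(0.16 * t) for t in times]

mu = max_growth_rate(times, readings)
print(f"growth rate = {mu:.4f} per hour, doubling time = {math.log(2) / mu:.2f} h")
```

In practice the fit would be run once per biological replicate and the three slopes averaged, as described in the text.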
Southern blot analysis
Yeast cells were grown in 40 mL of MD medium at 30 °C under agitation for 24 h and the DNA was extracted using phenol–chloroform as previously described. Approximately 10 µg of genomic DNA was digested with EcoRI at 37 °C overnight. Digested DNA was run on a 0.8% agarose gel and then transferred to a nitrocellulose membrane as described. Probe labeling, hybridization and detection were performed using the AlkPhos Direct Labeling and CDP-Star Detection System (GE Life Sciences) following the specifications of the manufacturer. The probe used was a fragment of ~600 bp corresponding to the PGK1 promoter obtained by digestion of pKGFP-ld with BglII and BamHI. The temperature for hybridization was 55 °C. Chemiluminescence was detected using the Amersham Imager 600 system (GE Life Sciences) and band intensity was measured with the ImageQuant TL 8.1 software.
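The text does not spell out how band intensities were converted into copy numbers. One common way to do this, shown here purely as a sketch, is to divide the background-corrected intensity of the vector-derived band by that of a band known to represent a single copy. The function name, the background-subtraction step and the numbers are all assumptions, not the procedure used in this work.

```python
def estimate_copy_number(vector_band, single_copy_band, background=0.0):
    """Ratio of background-corrected band intensities (hypothetical densitometry helper)."""
    reference = single_copy_band - background
    if reference <= 0:
        raise ValueError("single-copy reference band must exceed the background")
    signal = max(vector_band - background, 0.0)
    return signal / reference


# Illustrative intensities in arbitrary units (not values from this study).
copies = estimate_copy_number(vector_band=2100.0, single_copy_band=110.0, background=10.0)
print(f"estimated copies: {copies:.1f}")
```

Such ratios are only meaningful within the linear range of the detection system, so saturated bands would need a shorter exposure.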
Genetic stability testing
The stability of the heterologous DNA integrated into the yeast genome was tested in shake flasks after 36 and 72 generations. A fresh colony was grown in 10 mL YPD medium for 24 h at 30 °C and 200 rpm. Then, 400 µL of this pre-inoculum were inoculated in 40 mL YPD and incubated under the same conditions for 24 h. A 400 µL sample of the culture was transferred to a new flask with 40 mL YPD and incubated under the same conditions for 24 h. This procedure was repeated four more times for a total growth time of 144 h. After 96 h (36 generations) and 144 h (72 generations) genomic DNA was extracted and submitted to Southern blot analysis as described above.
Yeast cells were grown in 5 mL of MD medium for 24 h at 30 °C under agitation. The required volume of each pre-inoculum was inoculated in 5 mL of MD to start the culture with an OD of 0.5. After 24 h of incubation at 30 °C under agitation, cells were washed twice with PBS (137 mM NaCl, 2.7 mM KCl, 10 mM Na2HPO4 and 2 mM KH2PO4, pH 7.4) containing 0.5% Tween by centrifugation at 3000×g for 5 min at 4 °C. Cells were suspended in the required volume of PBS to obtain approximately 10^6 cells mL−1. Cells were maintained at 4 °C until analysis with a FACSVerse flow cytometer. All samples were collected with identical voltage parameters. Acquired data were analyzed using the FlowJo software. The gating strategy included: (a) gating on yeast cells on forward versus side scatter plots; (b) gating on single cells using forward scatter width versus forward scatter height plots and (c) selecting positive cells based on histograms from wild-type cells. Three biological replicates were tested for each analyzed clone and the mean of the three values is presented.
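Step (c) of the gating strategy can be pictured as placing a fluorescence threshold on the wild-type histogram and keeping events above it. The sketch below assumes a nearest-rank 99th percentile as the cut-off; that percentile, the function names and the toy event lists are assumptions for illustration, since the text states only that the gate was based on wild-type histograms.

```python
import math


def positive_gate(wild_type_signal, percentile=99):
    """Threshold taken as the nearest-rank percentile of wild-type (negative) events."""
    if not wild_type_signal:
        raise ValueError("wild-type sample is empty")
    ordered = sorted(wild_type_signal)
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1]


def summarize_positive(sample_signal, threshold):
    """Return (fraction of events above threshold, mean signal of those events)."""
    positive = [value for value in sample_signal if value > threshold]
    if not sample_signal or not positive:
        return 0.0, 0.0
    return len(positive) / len(sample_signal), sum(positive) / len(positive)


# Toy events in arbitrary fluorescence units (not data from this study).
wild_type_events = list(range(1, 101))
clone_events = [50, 99, 100, 150, 200, 1000]

gate = positive_gate(wild_type_events)
fraction, mean_signal = summarize_positive(clone_events, gate)
print(f"gate = {gate}, positive fraction = {fraction:.2f}, mean = {mean_signal:.1f}")
```

The scatter-based gates (a) and (b) would be applied before this step, so the lists here stand for events that have already passed them.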
Statistical analyses and figures were made with GraphPad Prism 5 software. ANOVA followed by Tukey’s post-test was applied. Error bars in graphs represent the standard error of the mean.
MB carried out the experimental studies and drafted the manuscript. VR and JLDM participated in the design of the study and helped to draft the manuscript. AMN performed and analyzed the flow cytometry experiments. LM participated in the design of the study. FT conceived the study, participated in its design and coordination and helped to draft the manuscript. All authors read and approved the final manuscript.
The authors wish to thank CNPq (Grant # 441978/2014-2) and FAPDF (Grant # 193.000.582/2009) for financial support of this Project.
The authors declare that they have no competing interests.
Availability of data and materials
All data and material supporting the conclusions of this work are presented in the main paper and are publicly available.
Consent for publication
We consent to BioMed Central publishing this manuscript should it be accepted.
CNPq (Grant # 441978/2014-2), FAPDF (Grant # 193.000.582/2009).
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
- 7. Piva LC, Betancur MO, Reis VCB, De Marco JL, Moraes LMP, Torres FAG. Molecular strategies to increase the levels of heterologous transcripts in Komagataella phaffii for protein production. Bioengineered. 2017;1–5. doi: 10.1080/21655979.2017.1296613.
- 10. Romanos M, Scorer C, Sreekrishna K, Clare J. The generation of multicopy recombinant strains. Methods Mol Biol. 1998;103:55–72.
- 17. Erhart E, Hollenberg CP. The presence of a defective LEU2 gene on 2 mu DNA recombinant plasmids of Saccharomyces cerevisiae is responsible for curing and high copy number. J Bacteriol. 1983;156:625–35.
- 19. Servienë E, Melvydas V. Effect of the defective leucine gene leu2-d, on the properties of recombinant plasmid in yeast Saccharomyces cerevisiae. Biologija. 2001;4:30–3.
- 25. Bussey H, Umbarger HE. Biosynthesis of the branched-chain amino acids in yeast: a leucine-binding component and regulation of leucine uptake. J Bacteriol. 1970;103:277–85.
- 28. Grenson M, Hou C, Crabeel M. Multiplicity of the amino acid permeases in Saccharomyces cerevisiae. J Bacteriol. 1970;103:770–7.
- 30. Kotliar N, Stella C, Ramos E, Mattoon J. l-leucine transport systems in Saccharomyces cerevisiae participation of GAP1, S1 and S2 transport systems. Cell Mol Biol (Noisy-le-grand). 1994;40:833–42.
- 35. Du M, Battles MB, Nett JH. A color-based stable multi-copy integrant selection system for Pichia pastoris using the attenuated ADE1 and ADE2 genes as auxotrophic markers. Bioeng Bugs. 2012;3:32–7.
- 39. Lim H-K, Kim K-Y, Lee K-J, Park D-H, Chung S-I, Jung K-H. Genetic stability of the integrated structural gene of guamerin in recombinant Pichia pastoris. J Microbiol Biotechnol. 2000;10:470–5.
- 48. Sambrook J, Fritsch EF, Maniatis T. Molecular cloning: a laboratory manual. 2nd ed. New York: Cold Spring Harbor Laboratory Press; 1989.
- 49. Reis VCB, Nicola AM, de Souza Oliveira Neto O, Batista VDF, de Moraes LMP, Torres FAG. Genetic characterization and construction of an auxotrophic strain of Saccharomyces cerevisiae JP1, a Brazilian industrial yeast strain for bioethanol production. J Ind Microbiol Biotechnol. 2012;39:1673–83.
- 51. Lueking A, Horn S, Lehrach H, Cahill D. A dual-expression vector allowing expression in E. coli and P. pastoris, including new modifications. In: Vaillancourt P, editor. E. coli gene expression protocols SE-3, vol. 205. Totowa: Humana Press; 2003. p. 31–42 (Methods in Molecular Biology).
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.