Introduction

Severe acute respiratory syndrome (SARS), caused by the novel coronavirus SARS-CoV, is a highly contagious respiratory disease [1, 2]. The first SARS outbreak began at the end of 2002 in Southern China and spread rapidly to many other countries. By November 27, 2003, more than 8400 SARS cases and over 900 deaths had been reported, according to the WHO database [3].

SARS-CoV is tentatively classified as a group IV coronavirus because its genomic sequence shares low homology with those of other coronaviruses. However, like other enveloped viruses, SARS-CoV enters host cells through binding of its Spike (S) protein to cellular receptors [4–6]. The S protein is a class I transmembrane glycoprotein and can be divided structurally into S1 and S2 fragments, although it is not cleaved. S1 is located at the N terminus and is responsible for binding to the cellular receptor, while S2 is located at the C terminus and is responsible for fusion with the cellular membrane [7]. Angiotensin-converting enzyme 2 (ACE2) was identified as the major cellular receptor of SARS-CoV [4, 5, 8]. The receptor-binding domain of S1 was mapped to amino acids (AA) 318–510 [9]. This region also includes several linear and conformation-dependent epitopes recognized by neutralizing antibodies [10, 11]. Therefore, S1 plays an important role in SARS-CoV infection and transmission, and it is a key vaccine antigen for the control of SARS-CoV infection. However, research on live SARS-CoV is highly restricted because of biosafety concerns. The cloning and in vitro expression of S1 may therefore provide an alternative approach to studying viral pathogenesis and infection control.

Pichia pastoris is a unicellular eukaryotic microorganism that grows rapidly and is easily manipulated. Like higher eukaryotic expression systems, the P. pastoris expression system performs posttranslational modifications such as disulfide bond formation, proteolytic processing, and glycosylation [12]. We previously cloned an S1 gene fragment (n.t. 777–1683) that includes the receptor-binding domain into the P. pastoris expression vector pPIC9K but failed to express it. More recently, S1 was successfully expressed in the P. pastoris expression system, but with much lower efficiency than in the E. coli system [13]. Sequence analysis of the native S1 gene excluded a codon bias problem in yeast. However, many A + T repeats were found at n.t. 1041–1050, n.t. 1236–1248, n.t. 1317–1335, and n.t. 1590–1605. Scorer et al. demonstrated that A + T-rich regions may act as a polyadenylation loop in P. pastoris and result in improper termination of transcription of the HIV envelope glycoprotein gp120 gene [14]. To test the hypothesis that the A + T-rich regions impede the expression of S1 in the P. pastoris system, the S1 gene was modified to remove the A + T repeats while keeping the encoded amino acids unchanged. This modification enabled high-level expression of S1 in P. pastoris.

Materials and methods

Plasmids and strains

The SARS-CoV S1 gene fragment (n.t. 777–1683; accession AY274119) was provided by Dr. Wang Hanzhong, Wuhan Institute of Virology, Chinese Academy of Sciences. The P. pastoris expression vector pPIC9K and the auxotrophic strain GS115 were purchased from Invitrogen (Carlsbad, CA, USA).

S1 gene manipulation

Ten primers were designed for overlap PCR to change the third base of selected codons from A or T to G or C without altering the amino acid encoded by the native SARS-CoV S1 codon. The four modified regions are shown in Table 1, and the primers used in overlap PCR are listed in Table 2.
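
The recoding rule in this paragraph can be illustrated with a minimal Python sketch. It assumes the standard genetic code and an in-frame input sequence. The function names `translate` and `recode_third_base`, and the choice to try C before G, are illustrative assumptions and not the authors' primer design, which targeted only the four regions in Table 1. The test sequence is the AT-rich gp120 motif quoted in the Discussion; its reading frame here is assumed.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(seq):
    """Translate an in-frame DNA sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def recode_third_base(seq):
    """Swap a third-position A or T for C or G when the codon stays synonymous.

    Assumption: C is tried before G. Codons with no synonymous G/C-ending
    alternative (for example ATG or TGA) are left unchanged.
    """
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    recoded = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        if codon[2] in "AT":
            for base in "CG":
                candidate = codon[:2] + base
                if CODON_TABLE[candidate] == CODON_TABLE[codon]:
                    codon = candidate
                    break
        recoded.append(codon)
    return "".join(recoded)


example = "ATTATTTATAAT"  # AT-rich motif, frame assumed
recoded = recode_third_base(example)
print(recoded)                                   # ATCATCTACAAC
print(translate(example) == translate(recoded))  # True
```

Only third positions are touched in this sketch, so an A + T run that spans first and second codon positions is shortened rather than eliminated.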

Table 1 The positions and sequences of SARS-CoV S1 where the modifications were performed
Table 2 The primers used in overlap PCR to modify the S1 gene sequence

Five fragments were first amplified with the primer pairs P1/P2, P3/P4, P5/P6, P7/P8, and P9/P10. The reaction conditions were 95°C for 5 min; 35 cycles of 94°C for 1 min, 52°C for 1 min, and 72°C for 50 s; and a final extension at 72°C for 10 min. The full-length modified S1 gene fragment was then assembled by PCR using the five PCR products as templates and the primer pair P1/P10. The overlap extension was programmed as 94°C for 1 min; 20 cycles of 50°C for 1 min and 72°C for 50 s; and 72°C for 10 min. Pyrobest Taq and other reagents were purchased from Takara (Dalian, China). The PCR strategy is illustrated in Fig. 1.
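
For planning purposes, the two thermocycling programs can be written down as data and their programmed hold times summed. This is a sketch only: the `Step` and `Program` types are hypothetical names introduced here, and the totals exclude ramp times, which depend on the instrument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int
    seconds: int


@dataclass(frozen=True)
class Program:
    initial: Step
    cycle: tuple   # steps repeated in every cycle
    cycles: int
    final: Step

    def hold_seconds(self):
        """Sum of programmed hold times, ignoring ramping between steps."""
        per_cycle = sum(step.seconds for step in self.cycle)
        return self.initial.seconds + self.cycles * per_cycle + self.final.seconds


# Fragment amplification: 95°C 5 min; 35 x (94°C 1 min, 52°C 1 min, 72°C 50 s); 72°C 10 min
FRAGMENT_PCR = Program(
    Step(95, 300), (Step(94, 60), Step(52, 60), Step(72, 50)), 35, Step(72, 600)
)
# Overlap extension: 94°C 1 min; 20 x (50°C 1 min, 72°C 50 s); 72°C 10 min
OVERLAP_EXTENSION = Program(
    Step(94, 60), (Step(50, 60), Step(72, 50)), 20, Step(72, 600)
)

for name, program in [("fragment PCR", FRAGMENT_PCR),
                      ("overlap extension", OVERLAP_EXTENSION)]:
    print(f"{name}: {program.hold_seconds() / 60:.1f} min")
# fragment PCR: 114.2 min
# overlap extension: 47.7 min
```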

Fig. 1 Overlap PCR sketch map. The red regions represent the four A + T-rich regions (n.t. 1041–1050, n.t. 1236–1248, n.t. 1317–1335, and n.t. 1590–1605) where the modifications were performed

Construction of the expression vector

The SnaBI and EcoRI restriction sites designed into P1 and P10 were used to clone the modified S1 gene fragment into the multiple cloning site of the secretion expression vector pPIC9K. The resultant vector was designated pPIC9K-S1. The insertion was confirmed by SalI and EcoRI digestion and by sequencing.

Transformation of P. pastoris GS115 and identification of the recombinant

The recombinant plasmid pPIC9K-S1 was completely digested with SalI and transformed by electroporation into competent GS115 cells previously prepared with lithium chloride (Sangon, Shanghai, China). A 200 μl aliquot of the transformation mixture was spread on Minimal Dextrose (MD) agar plates and incubated at 30°C for 2–4 days. Single colonies were picked and transferred individually into 5 ml of Yeast Extract Peptone Dextrose (YPD) broth and incubated for 36 h at 30°C. The yeast cells were harvested by centrifugation at 12,000 × g for 15 s, resuspended in 200 μl of TE, boiled for 10 min, incubated in an ice bath for 10 min, and pelleted by centrifugation. Integration of the S1 gene into the yeast genome was confirmed by PCR using the supernatant as the template and primers specific to the α factor and 3′AOX1 sequences in the vector. Yeast transformed with the blank vector was used as the negative control and the recombinant plasmid as the positive control. The α factor primer was 5′-TACTATTGCCAGCATTGCTGC-3′ and the 3′AOX1 primer was 5′-GCAAATGGCATTCTGACATCC-3′. The PCR conditions were 95°C for 5 min; 35 cycles of 94°C for 1 min, 52°C for 1 min, and 72°C for 1 min; and a final extension at 72°C for 10 min. The PCR products were checked by electrophoresis on a 0.8% agarose gel.
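
As a quick sanity check on the two screening primers, their length, GC content, and a rough melting temperature can be computed. The sketch below uses the Wallace rule, Tm = 2(A + T) + 4(G + C), which is a coarse estimate best suited to short oligonucleotides; the paper does not state how the 52°C annealing temperature was chosen, and the helper name `primer_stats` is introduced here for illustration.

```python
def primer_stats(primer):
    """Return length, GC percentage, and Wallace-rule Tm (°C) for a DNA primer."""
    seq = primer.upper()
    at = sum(seq.count(base) for base in "AT")
    gc = sum(seq.count(base) for base in "GC")
    if at + gc != len(seq):
        raise ValueError("primer contains characters other than A, C, G, T")
    return {
        "length": len(seq),
        "gc_percent": 100 * gc / len(seq),
        "wallace_tm_c": 2 * at + 4 * gc,  # rough estimate only
    }


ALPHA_FACTOR = "TACTATTGCCAGCATTGCTGC"
AOX1_3PRIME = "GCAAATGGCATTCTGACATCC"

for name, primer in [("alpha factor", ALPHA_FACTOR), ("3'AOX1", AOX1_3PRIME)]:
    stats = primer_stats(primer)
    print(f"{name}: {stats['length']} nt, "
          f"{stats['gc_percent']:.1f}% GC, Tm about {stats['wallace_tm_c']} °C")
```

Both primers are 21 nt with 10 G/C bases, so the rule gives the same estimate for each, which is consistent with using a single annealing temperature for the pair.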

Induced expression of S1 protein by recombinant yeast

Single positive colonies were separately inoculated into 2 ml of Buffered Glycerol-complex Medium (BMGY) and incubated at 28°C for 25–36 h in a shaker (250 r/min) until the OD600 of the culture reached 10. The yeast cells were harvested, diluted with Buffered Methanol-complex Medium (BMMY) to an OD600 of 1, and grown at 28°C in a shaking incubator (250 r/min). Every 24 h, samples were collected and methanol (100%) was added to a final concentration of 1%. After 7 days of induction, the culture supernatant was screened for S1 protein production by 12% SDS-PAGE and Coomassie Brilliant Blue G250 staining [15].
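
The two volume calculations implied by this protocol can be sketched as follows. The sketch assumes that OD600 scales linearly with cell density, that all cells are recovered at harvest, and that no methanol remains in the culture at each feeding; none of these is stated in the paper, and both function names are hypothetical.

```python
def resuspension_volume_ml(culture_ml, od_start, od_target):
    """Volume of BMMY needed to bring harvested cells to the target OD600.

    Uses C1 * V1 = C2 * V2, treating OD600 as proportional to cell density.
    """
    return culture_ml * od_start / od_target


def methanol_to_add_ml(culture_ml, final_percent=1.0, stock_percent=100.0):
    """Volume of methanol stock giving the requested final concentration (v/v).

    Solves stock * x = final * (culture + x) for x, assuming the culture
    contains no methanol before the addition.
    """
    return culture_ml * final_percent / (stock_percent - final_percent)


bmmy_ml = resuspension_volume_ml(culture_ml=2, od_start=10, od_target=1)
print(f"BMMY volume: {bmmy_ml:.0f} ml")                              # 20 ml
print(f"methanol per feeding: {methanol_to_add_ml(bmmy_ml):.3f} ml")  # 0.202 ml
```

In practice the simpler approximation of 1/100 of the culture volume (0.2 ml here) differs from the exact value by about 1%.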

Batch induction was conducted as described above with a 50 ml BMGY culture of the high-expression colony once the OD600 reached 10. The expressed S1 protein was concentrated by precipitation with ammonium sulfate at 32% saturation [16]. The crude S1 protein was finally dissolved in 1 ml of TE buffer for functional analysis. The protein concentration was estimated using the formula: Protein (mg/ml) = 1.75 × A280/0.74 × A260.

Function identification of S1 protein

ELISA and Western Blot were performed to analyze the antigenic activity of S1 protein.

S1 protein was separated by SDS-PAGE, transferred onto a nitrocellulose (NC) membrane, and probed with convalescent sera from SARS patients (provided by Dr. Huo Xixiang, Hubei province Center of Disease Control, China) followed by HRP-conjugated goat IgG against human immunoglobulin. The NC membrane was developed with DAB [15].

Conventional ELISA was performed. S1 protein was coated on ELISA plates and incubated with convalescent sera from SARS patients, followed by HRP-conjugated rabbit IgG against human immunoglobulins. TMB was used to develop the HRP reaction, and OD630 was read [16].

A ligand blot assay was performed to determine the receptor-binding activity of S1. The procedure was similar to Western blot, except that S1 on the NC membrane was probed sequentially with the SARS-CoV receptor ACE2 (expressed and purified in our laboratory), rabbit polyclonal antibody to human ACE2 (R&D Systems, MN, USA), and HRP-conjugated goat IgG against rabbit immunoglobulins [17].

Deglycosylation analysis of S1 protein

Deglycosylation was performed according to the manufacturer’s instructions for PNGase F (New England Biolabs, MA, USA). Twenty micrograms of S1 protein was denatured in 1 × Denaturing Buffer (0.5% SDS, 1% β-mercaptoethanol) by boiling at 100°C for 10 min, after which 1/10 volume (v/v) of 10 × G7 Reaction Buffer (50 mM sodium phosphate, pH 7.5) and 1/10 volume (v/v) of 10% NP-40 were added. Then 3 μl of PNGase F (1–2 U/μl) was added and the mixture was incubated at 37°C for 1 h. Deglycosylation was assessed by SDS-PAGE and Coomassie Brilliant Blue staining.

Results

S1 gene modification and construction of the expression vector

We used overlap PCR to modify the S1 gene fragment (n.t. 777–1683) of SARS-CoV by changing A or T in the native sequence to G or C in four regions (n.t. 1041–1050, n.t. 1236–1248, n.t. 1317–1335, and n.t. 1590–1605). The PCR product was cloned into the yeast expression vector pPIC9K, and the resultant vector was designated pPIC9K-S1. The cloned S1 gene was sequenced, and all modifications were confirmed to match the design in Table 1. No unexpected mutation was observed. Double digestion of the recombinant plasmid pPIC9K-S1 with SalI and EcoRI demonstrated the correct orientation of the S1 gene fragment. Integration of the expression vector pPIC9K-S1 into the P. pastoris GS115 genome was confirmed by PCR using primers specific to the α factor and 3′AOX1 sequences flanking the S1 gene insert (data not shown).

Protein expression of the recombinant yeast

Methanol successfully induced S1 protein expression in this study. A pilot induction was used to screen the positive colonies for S1 expression, and batch induction was then performed with the high-level expression colony. The protein yield reached 69 mg/l in the yeast culture supernatant. SDS-PAGE showed that a 70 kDa protein was expressed and secreted, whereas the S1 protein was expected to be 30 kDa. To test whether the higher molecular mass resulted from hyperglycosylation of S1 in this yeast expression system, we deglycosylated the secreted product with PNGase F, which hydrolyzes nearly all types of N-glycan chains from glycopeptides and glycoproteins. After deglycosylation, the apparent molecular mass of S1 on the Coomassie Brilliant Blue-stained SDS-PAGE gel decreased to the expected 30 kDa (Fig. 2).

Fig. 2 SDS-PAGE and Coomassie Brilliant Blue stain of S1 protein expressed by the recombinant yeast before and after deglycosylation. Lane M, protein marker with the molecular mass (kDa) shown on the left; lane 1, S1 protein of about 70 kDa before deglycosylation; lane 2, deglycosylated S1 protein of about 30 kDa after treatment with PNGase F. The arrows show the position of S1 protein before and after deglycosylation

The reaction of S1 with antisera to SARS-CoV

Western blot demonstrated that the expressed S1 reacted specifically with the convalescent sera of SARS patients, while yeast cells transformed with the blank expression vector gave no reaction. Although only one 30 kDa band was visible on the Coomassie Brilliant Blue-stained SDS-PAGE gel after deglycosylation of the S1 protein, Western blot analysis detected two bands: the expected 30 kDa band and another band of about 40 kDa. The latter band was probably the oligosaccharide cleaved by PNGase F, which could be recognized by antisera to SARS-CoV but could not be stained by Coomassie Brilliant Blue (Fig. 3).

Fig. 3 Western blot analysis of S1 protein expressed by the recombinant yeast, probed with the convalescent sera of SARS patients before and after deglycosylation. Lane M, protein marker with the molecular mass (kDa) shown on the left; lane 1, S1 protein of about 70 kDa before deglycosylation; lane 2, deglycosylated S1 protein of about 30 kDa after treatment with PNGase F, where the additional 40 kDa band probably represents the cleaved oligosaccharide recognized by the antisera to SARS-CoV; lane 3, mock recombinant yeast

ELISA detected a reaction of S1 protein with the positive sera from two SARS convalescent patients, but no reaction between S1 and the negative sera from six healthy people. The OD values of the positive sera were over three times higher than those of the negative sera, suggesting that the expressed S1 reacted specifically with antisera to SARS-CoV (Fig. 4).

Fig. 4 ELISA of the reaction of S1 protein expressed by the recombinant yeast with convalescent sera of SARS patients. S1 was coated on ELISA plates and reacted with the sera. Bars 1 and 2 represent the sera from two SARS convalescent patients; bars 3–6 represent sera from four healthy people. The OD value for each bar represents the mean of three independent experiments. The error bars represent the standard deviation (SD)
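
The comparison between positive and negative sera can be expressed as a positive-to-negative (P/N) ratio computed from replicate OD readings. The sketch below is illustrative only: the OD values are hypothetical placeholders and not data from this study, the sample standard deviation is assumed, and the function names are introduced here.

```python
from statistics import mean, stdev


def summarize(replicates):
    """Mean and sample standard deviation of replicate OD readings."""
    return mean(replicates), stdev(replicates)


def pn_ratio(positive, negatives):
    """Mean OD of one positive serum divided by the mean OD of the negative sera."""
    negative_mean = mean(mean(replicates) for replicates in negatives)
    return mean(positive) / negative_mean


# Hypothetical OD630 readings from three independent experiments per serum.
positive_serum = [0.90, 1.00, 1.10]
negative_sera = [[0.20, 0.25, 0.30], [0.20, 0.25, 0.30]]

avg, sd = summarize(positive_serum)
ratio = pn_ratio(positive_serum, negative_sera)
print(f"positive serum: {avg:.2f} ± {sd:.2f}")
print(f"P/N ratio: {ratio:.1f}, above threefold: {ratio > 3}")
```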

The reaction of S1 with the ACE2

A ligand blot assay was performed to analyze the specific interaction between S1 and its receptor ACE2. S1 protein was blotted onto an NC membrane and reacted sequentially with ACE2, polyclonal antibody to ACE2, and the secondary antibody. The ACE2 fragment (AA 1–365) was expressed in our laboratory from the gene cloned from Vero E6 cells. The recombinant S1 reacted with ACE2, producing a clear band on the NC membrane, while the mock recombinant yeast did not give any signal (Fig. 5).

Fig. 5 Ligand blot assay of S1 protein expressed by the recombinant yeast with the SARS-CoV receptor ACE2. Lane M, protein marker with the molecular mass (kDa) shown on the left; lane 1, mock recombinant yeast; lane 2, S1 protein (arrow) probed with its receptor ACE2

Discussion

The methylotrophic yeast P. pastoris is an excellent protein expression system because of the simplicity of its molecular genetic manipulation. P. pastoris produces foreign proteins with high efficiency, either intracellularly or extracellularly. Furthermore, this yeast expression system performs many eukaryotic posttranslational modifications such as processing of signal sequences, folding, glycosylation, disulfide bond formation, and proteolytic processing [12]. Expression vectors based on the methanol-inducible AOX1 promoter are integrated into the host chromosome. However, several factors can impede heterologous protein expression, including codon bias, the A + T composition of the cDNA, and glycosylation patterns. High-level expression of genes in P. pastoris favors a lower A + T content and C as the third nucleotide of a codon. A + T-rich sequences lower transcriptional efficiency because of transcriptional blocks [18]. For instance, AT-rich regions such as 5′-ATTATTTATAAT-3′ in the HIV-1 gp120 gene resulted in premature transcriptional termination and the production of truncated mRNA. When the G + C content was increased by synthesizing the gene, this transcriptional problem was overcome, giving rise to full-length mRNA [14].

In our previous study, we failed to detect expression of the SARS-CoV S1 gene fragment (n.t. 777–1683), which includes the receptor-binding region (n.t. 954–1530) [9]. We later realized that this fragment contains abundant A + T repeats, some of which span more than 7 nucleotides. We therefore assumed that these AT-rich sequences might cause premature termination of S1 gene transcription by acting as a polyadenylation loop in P. pastoris. While preserving the amino acid encoded by each native codon, we changed A or T to G or C wherever possible at four sites with more than 7 consecutive A + T nucleotides. This modification resulted in high-level expression of S1 in P. pastoris, with a yield of 69 mg/l. The function of the expressed S1 was demonstrated by its reaction with antisera to SARS-CoV and with the receptor ACE2 (Figs. 3–5). This evidence supports our assumption that a high A + T content in heterologous genes impedes expression in P. pastoris.
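
A screen for such A + T runs can be sketched with a regular expression. The threshold of 8 consecutive A/T bases is an assumption that reads "more than 7 nucleotides" literally, and `find_at_runs` is a hypothetical helper, not the tool used for the sequence analysis in this study. The demonstration sequence embeds the gp120 motif quoted above between arbitrary flanking bases.

```python
import re


def find_at_runs(seq, min_len=8):
    """Return (start, end, run) for each stretch of consecutive A/T bases.

    Positions are 1-based and inclusive, matching the n.t. notation in the text.
    """
    pattern = re.compile(r"[AT]{%d,}" % min_len)
    return [(match.start() + 1, match.end(), match.group())
            for match in pattern.finditer(seq.upper())]


demo = "GGC" + "ATTATTTATAAT" + "GCC"  # flanking bases are arbitrary
print(find_at_runs(demo))  # [(4, 15, 'ATTATTTATAAT')]
```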

One advantage of the P. pastoris expression system is that the expressed protein can be glycosylated. However, glycosylation patterns differ between yeast and mammalian expression systems [19]. P. pastoris can add both O- and N-linked carbohydrate moieties to secreted proteins, and it glycosylates heterologous proteins with O-linked and N-linked high-mannose oligosaccharides. Some foreign proteins appear to be hyperglycosylated, with outer chains of 50–150 mannose residues [12]. The native S protein of SARS-CoV is a highly glycosylated protein carrying high-mannose and/or hybrid oligosaccharides, and our S1 protein fragment includes four N-glycosylation sites at residues 269, 318, 330, and 357 [20, 21]. The hyperglycosylation of S1 protein secreted by the recombinant P. pastoris was confirmed by PNGase F deglycosylation. PNGase F is an amidase that cleaves between the innermost GlcNAc and asparagine residues of high-mannose, hybrid, and complex oligosaccharides on N-linked glycoproteins, and therefore hydrolyzes nearly all types of N-glycan chains from glycopeptides and glycoproteins. After PNGase F treatment, the apparent molecular mass of S1 decreased from 70 kDa to the expected 30 kDa. Unlike the SDS-PAGE result, Western blot analysis with antisera to SARS-CoV detected two bands on the NC membrane for the deglycosylated S1 protein. Because PNGase F cleaves between the innermost GlcNAc and the asparagine residue, it is possible that the additional 40 kDa band was the released oligosaccharide, which could be recognized by antisera to SARS-CoV but not detected by the protein stain Coomassie Brilliant Blue (Fig. 3).
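
A back-of-the-envelope calculation shows how the observed mass shift compares with the hyperglycosylation range cited above. It is a rough sketch under strong assumptions: SDS-PAGE masses are taken at face value, the glycan mass is split equally among the four N-glycosylation sites, the chains are treated as pure mannose, and O-linked glycans and the core GlcNAc residues are ignored. The function name is hypothetical.

```python
# Mass of one mannose residue in a chain: hexose (180.16 Da) minus the water
# (18.02 Da) lost on forming each glycosidic bond.
MANNOSE_RESIDUE_DA = 180.16 - 18.02


def glycan_estimate(glycosylated_kda, deglycosylated_kda, n_sites):
    """Return total glycan mass (kDa), mass per site (kDa), and mannose residues per site."""
    glycan_kda = glycosylated_kda - deglycosylated_kda
    per_site_kda = glycan_kda / n_sites
    residues_per_site = per_site_kda * 1000 / MANNOSE_RESIDUE_DA
    return glycan_kda, per_site_kda, residues_per_site


total, per_site, residues = glycan_estimate(70, 30, n_sites=4)
print(f"glycan mass: {total} kDa, about {per_site:.0f} kDa per site")
print(f"roughly {residues:.0f} mannose residues per site")
```

Under these assumptions the estimate is about 62 mannose residues per site, which falls within the 50–150 residue range reported for hyperglycosylated proteins [12]. The total glycan mass of about 40 kDa also matches the size of the additional band seen in the Western blot.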

In conclusion, we modified the S1 gene of SARS-CoV by changing A or T in the native gene to G or C. This modification enabled the recombinant P. pastoris to express S1 protein with high efficiency. The biological activity of the expressed S1 protein was confirmed by its reaction with antisera to SARS-CoV and with ACE2, the SARS-CoV receptor. Given the biosafety concerns surrounding live SARS-CoV, high-level expression of functional S1 protein in P. pastoris is of significance to research on SARS-CoV pathogenesis and infection control.