After the first report of coronavirus disease 2019 (COVID-19) in December 2019 (Li et al. 2020; Lu et al. 2020), the number of cases is now at 124 million with over 2.3 million deaths worldwide (JHU 2021). As of March 23, 2021, several vaccines have been approved for emergency use in multiple countries and are currently deployed for mass vaccinations globally (Zimmer et al. 2021). The development and production of the COVID-19 vaccines rely on several technologies or platforms, mainly nucleic acid and viral vector vaccines as novel technologies, and whole inactivated, live attenuated viral, recombinant protein subunit, or virus-like particle vaccines as conventional platforms (Chakraborty et al. 2021). Among the first to receive approval for use were two messenger RNA (mRNA)-based vaccines from Pfizer/BioNTech and Moderna. Both were produced in record time, but they are relatively expensive to manufacture, difficult to scale, and require transportation and storage at ultra-low temperatures. The viral vector–based vaccines developed by Oxford-AstraZeneca, Johnson & Johnson (J&J), Gamaleya, and CanSino have also now been approved for use in a variety of countries. They bring the added advantage that they do not have these ultra-low cold-chain requirements, which eases delivery, and, in the case of the J&J vaccine, that they can be given as a single dose. Furthermore, several whole inactivated viral vaccines produced by the Chinese manufacturers Sinopharm and Sinovac and by Bharat Biotech, an India-based manufacturer, have added to the list of approved vaccines (Craven 2020; Eccleston-Turner and Upton 2021). However, formidable challenges remain in producing COVID-19 vaccines at much larger scales and distributing them within low- and middle-income countries (LMICs) (Lancet Commission on and Therapeutics Task Force 2021). 
Data from Duke’s Global Health Innovation Center and others clearly highlight the continued procurement and manufacturing challenges, leading to enormous inequity in the access for COVID-19 vaccines in these regions of the world (Duke Global Health Innovation Center 2021; Ritchie et al. 2021).

This situation is leaving LMICs bereft of low-cost COVID-19 vaccines suitable for their modest or depleted health systems (Lancet Covid-19 Commissioners and Commission 2020). In response, LMIC vaccine developers and manufacturers from the Developing Country Vaccine Manufacturers Network (DCVMN) (DCVMN 2021) urgently need to accelerate the development of additional vaccines employing the traditional platforms, especially vaccines based on recombinant proteins (Craven 2020; Eccleston-Turner and Upton 2021). These vaccine platforms are less demanding with respect to transport and storage and often come with a long history of successful global large-scale production capabilities and affordable use for other infectious diseases (Hotez and Bottazzi 2020). Particularly attractive in this respect are recombinant protein antigens (Pollet et al. 2021a), especially those produced through microbial fermentation in yeast. For instance, recombinant hepatitis B vaccine has been administered to adults and children for decades (World Health Organization 2017). Currently, only a few protein-based COVID-19 vaccine candidates have advanced to phase 3 trials, namely those from Novavax, the Vector Institute in Russia, and Cuba’s Finlay Vaccine Institute (Pollet et al. 2021a). In aggregate, however, it is promising that more than 30 protein-based vaccines are actively in development (Zimmer et al. 2021), which will greatly enable future access to and distribution of safe, effective, and affordable COVID-19 vaccines for the world.

The SARS-CoV-2 virus, the pathogen that causes COVID-19, uses its surface spike (S) protein for host cell entry, just like its close relative, SARS-CoV that had caused an outbreak of severe acute respiratory disease in 2002. The receptor binding domain (RBD) of the S protein binds to a cellular receptor, angiotensin-converting enzyme 2 (ACE2), that mediates membrane fusion during viral entry into the cell (Li et al. 2003; Yan et al. 2020). S proteins for both viruses have served as vaccine antigens that could elicit antibodies to prevent virus entry by blocking the binding of RBD to ACE2 (Hoffmann et al. 2020), and all the leading COVID-19 vaccines currently in clinical trials, including the mRNA vaccines, use the S protein to elicit immunity (Haynes et al. 2020). Overwhelmingly, such vaccines protect through their induction of virus-neutralizing antibodies, together with T cell responses (Jiang et al. 2020).

Building on our experience with the RBD of the SARS-CoV spike protein (Chen et al. 2014, 2017, 2020), the SARS-CoV-2 RBD was cloned and expressed in the yeast Pichia pastoris. Yeast has a track record of serving as a host organism for the production of multiple regulatory-approved and prequalified recombinant subunit vaccines, including vaccines for hepatitis B, influenza B, human papillomavirus, as well as for diphtheria and tetanus (Bill 2015; Kumar and Kumar 2019). Eukaryotic expression in yeast shows advantages over the prokaryote, Escherichia coli, with respect to the production of recombinant protein vaccines. Proper protein folding, disulfide bridge formation, post-translational modifications, and secretory cleavage are better supported in yeast, while also allowing for robust production with low costs and full scalability, features that distinguish this platform from other eukaryotic systems, such as insect cells, mammalian cells, and plants.

The RBD219-N1C1 antigen is derived from residues 332–549 of the SARS-CoV-2 RBD with a single mutation of a free cysteine residue (Cys538) to alanine to prevent intermolecular disulfide bond formation and therefore unwanted oligomerization during process development (Chen et al. 2021). In addition, N1 refers to the deletion of Asn331 to avoid hyperglycosylation observed in previous studies with the SARS-CoV RBD219-N1 antigen (Chen et al. 2014). In initial studies, the modifications used to express RBD219-N1C1 recombinant protein did not affect the in vitro binding to its receptor ACE2, when compared to the yeast-expressed recombinant wild-type RBD (Chen et al. 2021). Additionally, when RBD219-N1C1 was adjuvanted with Alhydrogel®, the vaccine formulation has been shown to elicit a robust neutralizing antibody response against SARS-CoV-2 pseudovirus in mice (Pollet et al. 2021b).

With any COVID-19 vaccine candidate, the ability to produce billions of doses efficiently is crucial to satisfy the potential global vaccine demand. We, therefore, have been developing and optimizing a scalable production process of the RBD219-N1C1 vaccine candidate at a low cost to support its technology transfer. Initial fermentation runs scouting for growth media, induction time, and glycerol fed-batch conditions were executed in a 1-L bioreactor and resulted in a ~10-fold increase in RBD219-N1C1 expression levels. Further scale-up experiments in a 5-L bioreactor established the reproducibility of the selected conditions. Simultaneously, a purification scheme was developed based on the process used for the 70% homologous SARS-CoV RBD antigen (Chen et al. 2017) and further optimized to allow full scalability and lower the cost. Taking into consideration yield, purity, functionality, and removal of host cell contaminants, we have developed an optimized fermentation at the 5-L scale and purification (process 2) suitable for production and manufacturing of a high-yield (and therefore, potentially low-cost) COVID-19 vaccine antigen candidate. The developed process has already been transferred to Biological E, an industrial vaccine manufacturer in India and is currently undergoing further production maturity while the vaccine candidate has recently completed a combined phase 1 and 2 clinical trial. We expect the results from this trial to be published and available during the second quarter of 2021.

Materials and methods

Generation of research cell bank

To generate a research cell bank (RCB), P. pastoris X33 strain was transformed with expression plasmid pPICZαA containing RBD219-N1C1 coding DNA, and one transformed colony with high expression of recombinant RBD219-N1C1 protein (Chen et al. 2021) was selected and streaked on yeast extract peptone dextrose (YPD) plates containing 100 μg/mL Zeocin to make single colonies. The plates were incubated at 30 °C for approximately 3 days until single colonies were observed. Subsequently, 200-mL plant-derived phytone YPD medium was inoculated with a single colony from the respective plate and incubated at 30 °C with constant shaking (225 rpm) until the OD600 reached 9.3. Finally, the cell culture was mixed with plant-derived glycerol to a final concentration of 20% and aseptically aliquoted (1 mL each) into 1.2-mL cryovials. For long-term storage, the cryovials were stored at −80 °C.


One vial of the SARS-CoV-2 RBD219-N1C1 RCB was used to inoculate a 0.5-L buffered minimal glycerol (BMG) medium in a 2-L baffled shake flask. The shake flask culture was grown at 30 °C and 225 rpm to an OD600 of 5–10. For 1-L fermentations, this seed culture (20–40 mL) was inoculated into 0.4 L of sterile basal-salt medium (BSM) (pH 5.0; BSM: 18.2 g/L potassium sulfate, 14.9 g/L magnesium sulfate heptahydrate, 4.13 g/L potassium hydroxide, 0.93 g/L calcium sulfate dihydrate, 26.7 mL/L of 85% phosphoric acid, and 40 g/L glycerol) or low-salt medium (LSM) (pH 5.0; LSM: 4.55 g/L potassium sulfate, 3.73 g/L magnesium sulfate heptahydrate, 1.03 g/L potassium hydroxide, 0.23 g/L calcium sulfate dihydrate, 10.9 mL/L of 85% phosphoric acid, and 40 g/L glycerol) to a starting cell density (OD600) of 0.5. Fermentation was conducted using a Biostat Qplus bioreactor with a 1-L vessel (Sartorius Stedim, Guxhagen, Germany). For 5-L runs, the seed culture (125–250 mL) was inoculated into 2.5 L of LSM, and fermentation was conducted in a CelliGen 310 bioreactor with a 7.5-L vessel (Eppendorf, New York, USA), controlled by the Eppendorf Bio Command software. Cell expansion was continued at 30 °C with a dissolved oxygen (DO) set point of 30%. After 19 ± 2 h of growth, a dissolved oxygen spike was observed on the trend chart, which indicates glycerol depletion. A fed-batch was initiated with 50% glycerol at a feed rate of 15 mL/L/h for 6 h to further expand biomass. During the last hour of the fed-batch phase, pH was adjusted to 6.5 using 14% NH4OH, while the temperature was adjusted to 25 °C. When a glycerol fed-batch was not included in the fermentation process, the pH and temperature were adjusted to the desired value during the first hour of induction. After the fed-batch phase, methanol induction was initiated; the total induction time was approximately 68–72 h. 
Biomass was removed by centrifugation at 12,227×g for 30 min at 4 °C before the supernatant was filtered through 0.45-μm polyethersulfone (PES) filters and stored at −80 °C until purification.
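The glycerol feed described above can be translated into a total feed volume with simple arithmetic. The sketch below assumes that the rate of 15 mL/L/h is normalized to the starting batch volume (0.4 L or 2.5 L); the text does not state the normalization basis, so this is an interpretation, and the function name is hypothetical.

```python
def glycerol_feed_volume_ml(rate_ml_per_l_per_h, batch_volume_l, duration_h):
    """Total feed volume in mL.

    Assumption: the feed rate is expressed per liter of starting batch volume.
    """
    return rate_ml_per_l_per_h * batch_volume_l * duration_h


# Batch volumes taken from the text: 0.4 L (1-L vessel) and 2.5 L (5-L runs).
for label, volume_l in (("1-L vessel", 0.4), ("5-L run", 2.5)):
    total_ml = glycerol_feed_volume_ml(15, volume_l, 6)
    print(f"{label}: {total_ml:.0f} mL of 50% glycerol over 6 h")
```

If the rate were instead normalized to the current working volume, the total would be slightly higher because the volume grows during feeding.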

Purification overview of three processes

The fermentation supernatant (FS) was removed from −80 °C and thawed at 22 °C for 4–6 h. Three purification processes were performed with 1-L FS aliquots (Fig. 1b). In process 1, the RBD219-N1C1 protein was captured from the FS using hydrophobic interaction chromatography (HIC), concentrated by ultrafiltration/diafiltration (UFDF), and polished using size exclusion chromatography (SEC). In process 2, the RBD219-N1C1 protein was captured using HIC, buffer-exchanged (UFDF), and polished using anion-exchange chromatography (AEX). Finally, in process 3, the FS was buffer-exchanged using UFDF before the target protein was captured using cation-exchange chromatography (CEX), buffer-exchanged (UFDF), and polished using AEX.

Fig. 1

Fermentation (a) and purification (b) flow diagrams. Three purification processes performed are shown in different colors. The color scheme remains consistent throughout all figures. UFDF, ultrafiltration and diafiltration; HIC, hydrophobic interaction chromatography; SEC, size exclusion chromatography; TFF, tangential flow filtration; CEX, cation exchange chromatography; AEX, anion exchange chromatography

UFDF (ultrafiltration and diafiltration)

Two types of devices were used for UFDF, a centrifugal concentrator and a flat sheet membrane, depending on the target volume. For process 1, an Amicon centrifugal concentrator with a 10 kDa molecular weight cutoff (MWCO) (MilliporeSigma, Burlington, USA) was used to concentrate the HIC elution pool (2050×g at 4 °C). This allowed concentration to the small volume needed for SEC. For process 2, a flat sheet Pellicon XL Cassette with a Biomax 5 membrane (5 kDa MWCO) and a Labscale TFF System (MilliporeSigma, Burlington, USA) were used to concentrate the HIC elution pool 8-fold, followed by diafiltration with 4 diavolumes of 20 mM Tris-HCl (pH 7.5) and 100 mM NaCl. The crossflow was kept at 25 mL/min over a 0.005-m² membrane area throughout the entire process with an average transmembrane pressure (TMP) of ~15 psi. For process 3, a flat sheet Pellicon 2 Mini Cassette with a Biomax 5 membrane (MilliporeSigma, Burlington, USA) was used for the first UFDF (UFDF-1) to concentrate the FS 4-fold, followed by diafiltration with 4 diavolumes of 20 mM sodium citrate (pH 4.2) and 10 mM NaCl. The crossflow was kept constant at 200 mL/min over a 0.1-m² membrane area throughout the entire process with an average TMP of ~8 psi. For UFDF-2, the CEX elution pool was concentrated 4-fold, followed by diafiltration with 4 diavolumes of 25 mM Tris-HCl (pH 7.2) and 5 mM NaCl using the Pellicon XL Cassette as described for process 2.
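To illustrate what 4 diavolumes achieve, the following sketch uses the standard constant-volume diafiltration relation, in which the fraction of a solute remaining in the retentate is exp(−S·N) for N diavolumes and a sieving coefficient S. It assumes that the original buffer components pass the membrane freely (S = 1); the function names are hypothetical and no measured values are implied.

```python
import math


def residual_fraction(diavolumes, sieving=1.0):
    """Fraction of a permeable solute left after constant-volume diafiltration.

    sieving = 1.0 assumes the solute passes the membrane freely (an assumption).
    """
    return math.exp(-sieving * diavolumes)


def retentate_volume_ml(start_volume_ml, fold_concentration):
    """Retentate volume after concentrating by the given fold."""
    return start_volume_ml / fold_concentration


# 4 diavolumes, as used in all three processes
print(f"Original buffer remaining: {residual_fraction(4) * 100:.1f}%")
```

Under this assumption, roughly 2% of the original small-molecule buffer components remain after 4 diavolumes, which is why this number of exchanges is a common choice before an ion-exchange step.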

Hydrophobic interaction chromatography

In processes 1 and 2, HIC was used to capture RBD219-N1C1 proteins from the FS. Ammonium sulfate salt was added to the FS to a final concentration of 1 M (w/v), and the pH was adjusted to 8.0. The FS was filtered through a 0.45-μm PES filter unit and loaded on a 112-mL Butyl Sepharose High-Performance column (4.4 cm diameter and 7.4 cm bed height) at a 20 mL/min flow rate. The column was washed with 1 M ammonium sulfate in 30 mM Tris-HCl (pH 8.0). Bound proteins were eluted with 0.4 M ammonium sulfate in 30 mM Tris-HCl (pH 8.0).
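The stated column dimensions can be checked against the stated bed volume, and the volumetric flow rate converted into a linear velocity, which is the quantity usually held constant during scale-up. The sketch below is plain cylinder geometry; the function names are hypothetical.

```python
import math


def column_volume_ml(diameter_cm, bed_height_cm):
    """Packed bed volume of a cylindrical column (1 cm^3 = 1 mL)."""
    return math.pi * (diameter_cm / 2) ** 2 * bed_height_cm


def linear_velocity_cm_per_h(flow_ml_per_min, diameter_cm):
    """Superficial linear velocity for a given volumetric flow rate."""
    cross_section_cm2 = math.pi * (diameter_cm / 2) ** 2
    return flow_ml_per_min / cross_section_cm2 * 60


# Butyl Sepharose HP column from the text: 4.4 cm diameter, 7.4 cm bed height
print(f"Bed volume: {column_volume_ml(4.4, 7.4):.1f} mL")
print(f"Linear velocity at 20 mL/min: {linear_velocity_cm_per_h(20, 4.4):.0f} cm/h")
```

The computed bed volume agrees with the stated 112 mL. The same functions apply to the 2.6 cm × 9.3 cm CM Sepharose column used in process 3.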

Size exclusion chromatography

Five milliliters of the concentrated HIC elution pool was loaded on a HiLoad 16/600 Superdex 75 prep-grade column (Cytiva, Marlborough, USA), pre-equilibrated with 20 mM Tris-HCl (pH 7.5) and 150 mM NaCl, and eluted at a flow rate of 1 mL/min. The SEC elution pool was aseptically filtered using a 0.2-μm PES filter in a biosafety cabinet and stored at −80 °C until usage.

Ion exchange chromatography

In process 3, RBD219-N1C1 was captured using CEX. The Pellicon 2 retentate pool in 20 mM sodium citrate (pH 4.2) and 10 mM NaCl was loaded on a 50-mL CM Sepharose Fast Flow column (2.6 cm diameter and 9.3 cm bed height) at a 10 mL/min flow rate. The column was washed with 20 mM sodium citrate (pH 4.2) and 10 mM NaCl. Bound proteins were eluted in 20 mM sodium citrate (pH 6.6) and 10 mM NaCl.

In processes 2 and 3, RBD219-N1C1 was polished using a negative capture AEX. The Pellicon XL retentate pool was loaded on a HiPrep Q Sepharose XL 16/10 column (Cytiva, Marlborough, USA) that was pre-equilibrated with 20 mM Tris-HCl (pH 7.5) and 100 mM NaCl for process 2, and 25 mM Tris-HCl (pH 7.2) and 5 mM NaCl for process 3. The flowthrough from AEX was collected, aseptically filtered using 0.2-μm PES filters in a biosafety cabinet, and stored at −80 °C until usage. NaCl (95 mM) was added to the final purified proteins from process 3 prior to storage in 25 mM Tris-HCl (pH 7.2) and 100 mM NaCl.

Protein yield and purity determination by quantitative SDS-PAGE

In-process samples taken at each purification step were loaded on either 14% Tris-glycine gels or 4–12% Bis-Tris gels to determine the concentration and purity of the various RBD219-N1C1 samples. Purified RBD219-N1C1 proteins of known concentrations were used as standards. After SDS-PAGE, gels were stained with Coomassie blue and scanned with a GS-900 densitometer (Bio-Rad, Hercules, USA). Gel images were processed with Image Lab software (Bio-Rad, Hercules, USA) to create a standard curve and determine protein concentration and purity.

Western blot

Western blot analysis was performed to detect RBD219-N1C1 as well as P. pastoris host cell protein (HCP). Five micrograms of purified protein was run on 14% Tris-Glycine gels under non-reducing and reducing conditions to detect RBD219-N1C1 and HCP, respectively. Proteins in gel were transferred to PVDF membranes and blocked with 5% dry milk in PBST (1× PBS with 0.05% Tween-20). RBD219-N1C1 was detected using a rabbit monoclonal antibody against the SARS-CoV-2 spike S1 protein (Sino Biological, Beijing, China; Cat#: 40150-R007) and goat anti-rabbit IgG secondary antibodies conjugated with horseradish peroxidase (Invitrogen, Carlsbad, USA; Cat#: G21234). HCPs were detected using an anti-P. pastoris:HRP conjugate (2G) solution (Cygnus, Southport, USA; Cat#: F641-12). The blots were developed using ECL Prime Substrate System (Cytiva, Marlborough, USA).

Size exclusion chromatography-high performance liquid chromatography

Waters® Alliance HPLC Separations Modules and Associated PDA Detectors were operated to analyze the size and purity of purified RBD219-N1C1 proteins. Fifty micrograms of the RBD219-N1C1 protein was injected into a Yarra SEC-3000 column (300 mm × 7.8 mm; catalog #: 00H-4513-K0) and was eluted in 20 mM Tris-HCl (pH 7.5) and 150 mM NaCl, at the flow rate of 0.6 mL/min. The elution of protein was confirmed by detecting the absorbance at 280 nm.

Dynamic light scattering

The size of the purified RBD219-N1C1 proteins was also analyzed by dynamic light scattering (DLS) (Chen et al. 2017, 2020). In short, RBD219-N1C1 was first diluted to 1 mg/mL with TBS, and approximately 40 μL of protein was then loaded into a clear bottom 384-well plate in four replicates to evaluate the hydrodynamic radius and molecular weight using the cumulant fitting on a Wyatt Technology DynaPro Plate Reader II.
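For orientation, DLS instruments derive a hydrodynamic radius from the measured translational diffusion coefficient through the Stokes-Einstein relation, R_h = k_B·T / (6π·η·D). The sketch below shows only this conversion step. The temperature and viscosity defaults (25 °C, water-like buffer) and the example diffusion coefficient are assumptions for illustration, not values from this study, and the instrument software's cumulant analysis and molecular weight model are not reproduced here.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # exact SI value


def hydrodynamic_radius_nm(diffusion_m2_per_s, temperature_k=298.15,
                           viscosity_pa_s=0.00089):
    """Stokes-Einstein radius in nm.

    Defaults assume 25 degrees C and the viscosity of water (assumptions).
    """
    radius_m = BOLTZMANN_J_PER_K * temperature_k / (
        6 * math.pi * viscosity_pa_s * diffusion_m2_per_s
    )
    return radius_m * 1e9


# Hypothetical diffusion coefficient for a small globular protein
print(f"R_h = {hydrodynamic_radius_nm(1.0e-10):.2f} nm")
```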

Host cell protein quantification by ELISA

Yeast-expressed RBD219-N1C1 is N-glycosylated (Chen et al. 2021). To avoid any cross-reactivity from anti-P. pastoris HCP antibodies that recognize the N-glycans, which could result in an over-estimation of true HCP, we performed quantitative ELISAs with a second-generation anti-Pichia pastoris HCP ELISA Kit (Cygnus, Southport, USA; Cat#: F640) following the manufacturer’s instructions. This kit provides strips pre-coated with anti-P. pastoris HCP antibodies. Serially diluted RBD219-N1C1 was loaded onto the strips (HCP standards range from 0 to 250 ng/mL) in the presence of HRP-conjugated anti-P. pastoris antibodies. The strips were then incubated for approximately 3 h at room temperature followed by 4 washes. Finally, 100 μL of 3,3′,5,5′-tetramethylbenzidine (TMB) solution was added to react with the HRP-conjugated antibodies present in the strip for 30 min prior to the addition of 100 μL of 1 M HCl to stop the reaction. The absorbance at 450 nm was measured in each well of the strip, and a linear standard curve was generated by plotting absorbance against concentration for the HCP standards, from which the HCP concentration present in the RBD219-N1C1 proteins was calculated.
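The back-calculation from a linear standard curve can be sketched as follows. This is a minimal ordinary least-squares example and not the analysis software actually used. All standard concentrations and absorbance readings are hypothetical placeholders, and the function names are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def hcp_ng_per_mg(absorbance, slope, intercept, sample_mg_per_ml):
    """HCP content normalized to the antigen concentration in the well."""
    hcp_ng_per_ml = (absorbance - intercept) / slope
    return hcp_ng_per_ml / sample_mg_per_ml


# Hypothetical standards (ng/mL) and A450 readings, for illustration only
standards = [0, 25, 50, 100, 250]
readings = [0.05, 0.15, 0.25, 0.45, 1.05]
slope, intercept = fit_line(standards, readings)

# Hypothetical sample: A450 of 0.25 measured at 1 mg/mL RBD219-N1C1
print(f"{hcp_ng_per_mg(0.25, slope, intercept, 1.0):.1f} ng HCP per mg")
```

Only readings that fall within the range of the standards should be back-calculated this way; samples outside the range need a different dilution.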

Endotoxin test

Endotoxin levels in the purified RBD219-N1C1 samples were measured using the Endosafe Portable Testing System (Charles River Laboratory, Wilmington, USA). The purified protein was diluted 10-fold with Endosafe LAL reagent water, and 25 μL of diluted protein was loaded to each of the four wells of PTS20 Limulus amebocyte lysate Reagent Cartridge for the measurement as described in the literature (Charles River Laboratory, Wilmington, USA) (Jimenez et al. 2010).

In vitro ACE2 binding ELISA

The binding of RBD219-N1C1 to recombinant human ACE2 was evaluated using an ELISA procedure described previously (Chen et al. 2021). In short, 96-well ELISA plates were coated with 100 μL of 2 μg/mL RBD219-N1C1 overnight at 4 °C followed by blocking with PBST/0.1% BSA. One hundred microliters of serially diluted ACE2-hFc (LakePharma, San Carlos, USA; Cat#: 46672) was added to the wells and incubated at room temperature for 2 h, and the binding was detected by adding 100 μL 1:10,000 diluted HRP conjugated anti-human IgG antibodies (GenScript, Piscataway, USA; Cat#: A00166) with a 1-h incubation period at room temperature. Finally, 100 μL TMB substrate was provided to each well to react with HRP and the reaction was terminated with 100 μL of 1 M HCl. Absorbance at 450 nm was measured using an Epoch 2 microplate reader (BioTek, Winooski, USA).


Fermentation optimization

When BSM and LSM were compared for the production of the RBD219-N1C1 protein, no differences were observed in the growth profiles and the final biomass. However, the salt concentration appeared to have a significant effect on the yield. The yield of the RBD219-N1C1 protein using BSM was only 52 mg/L, while using LSM, 237 mg/L was achieved (Table 1, runs 1 and 2). Therefore, LSM was used for the further development of the fermentation process.

Table 1 Summary of the development fermentation runs

The baseline fermentation process consisted of two phases: a glycerol-batch phase and a methanol fed-batch phase. In glycerol-batch mode, LSM contained 40 g/L of glycerol. At the time of glycerol depletion, the initial induction biomass was 110 ± 10 g/L (WCW). In this study, a glycerol fed-batch phase was then added before methanol induction to test the efficiency of protein expression based on the initial induction biomass. After a 6-h glycerol-fed-batch phase, the initial induction biomass doubled to 210 ± 20 g/L (WCW). The methanol feed strategies were kept the same. At harvest, the final OD600 and the biomass were determined to be 260 AU and 417 g/L, respectively. By adding the glycerol fed-batch, the yield of RBD219-N1C1 was increased about 120% to 533 mg/L (Table 1, run 3).

Fermentation scalability and reproducibility

The fermentation process with 6 h of glycerol feed was then scaled up from 1 to 5 L to test scalability and reproducibility (Table 1, run 4). The induction time was extended to 87 h until biomass started to drop. This suggested that the cells were no longer actively dividing. Since excessive methanol feeding may lead to cell death thus leading to a loss of protein yield, it was decided to stop the methanol feeding after 87 h of induction. The peak yield of RBD219-N1C1 was 449 ± 8 mg/L at 70 h after induction (day 3), after which the yield slightly dropped to 408 ± 9 mg/L at 87 h after induction.

The fermentation process without the 6-h glycerol fed-batch phase was also scaled up from 1 to 5 L for comparison (Table 1, run 5). After 70 ± 2 h of induction, the yield of RBD219-N1C1 reached 479 ± 15 mg/L (a 128% increase compared to the 1-L scale). This yield was close to the yield of the fermentation run with the 6-h glycerol fed-batch phase (Table 1, run 4). Since the glycerol fed-batch gave no significant increase in yield at the 5-L scale, we decided to proceed without this step (Fig. 1a). To establish reproducibility, this fermentation process (Fig. 1a) was repeated four times (runs 5–8). The average yield of the four reproducibility runs was 428 ± 36 mg/L, with a coefficient of variation of 8.3%. The SDS-PAGE gel analysis of fermentation supernatants of a representative run (run 5) with the lockdown process is shown in Fig. 2. RBD219-N1C1 (a dominant protein band of ~28 kDa) was secreted and accumulated in the fermentation supernatant through the course of methanol induction.
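The reproducibility metric is the coefficient of variation, that is, the standard deviation divided by the mean. A minimal sketch follows. The four yields are hypothetical placeholders (the individual values for runs 5–8 are listed in Table 1, not here), and the use of the sample standard deviation is an assumption, since the text does not specify it.

```python
import statistics


def coefficient_of_variation_pct(values):
    """Sample standard deviation as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100


# Hypothetical yields in mg/L for four replicate runs, for illustration only
example_yields = [400, 410, 430, 470]
print(f"CV = {coefficient_of_variation_pct(example_yields):.1f}%")
```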

Fig. 2

Timepoint SDS-PAGE analysis of pre- and post-induction fermentation samples of the lockdown process (run 5). PI: pre-induction; D1, D2, D3: days 1–3 after induction. The arrow shows RBD219-N1C1 in the fermentation supernatant after induction

Three purification schemes

In parallel with the fermentation optimization, three different processes were performed to purify RBD219-N1C1 from the FS (Fig. 1b). Process 1 was developed by adapting the purification method of our SARS-CoV RBD219-N1 antigen that shares 70% homology with the SARS-CoV-2 RBD (Chen et al. 2017, 2021; Shang et al. 2020). In this process, the target protein was captured by HIC using a butyl HP column with 1 M ammonium sulfate salt for the binding. After the HIC, 67% of the target protein was recovered and purity significantly improved from 85.6% in the FS to 97.6% (Fig. 3a). The target protein was concentrated using Amicon centrifugal concentrators and further polished by SEC using a Superdex 75 column. The SEC elution pool was then diluted to 2 mg/mL for storage. Overall, the final yield of the target protein using process 1 was 188.8 mg/L FS (Fig. 3a), representing a recovery of 50% with a purity of 98.3%. This is similar to the overall recovery of 52% and the purity of 98.5% shown with the SARS-CoV RBD219-N1 protein (Chen et al. 2017).
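Step recovery and overall recovery, as used throughout this comparison, relate to each other in a simple way: each step recovery is the amount of target protein after a step divided by the amount before it, and the overall recovery is the product of all step recoveries. The sketch below uses illustrative masses that are not measured values; the function names are hypothetical.

```python
def step_recoveries_pct(amounts_mg):
    """Recovery of each step relative to the preceding step, in percent."""
    return [after / before * 100
            for before, after in zip(amounts_mg, amounts_mg[1:])]


def overall_recovery_pct(amounts_mg):
    """Recovery of the final pool relative to the starting material."""
    return amounts_mg[-1] / amounts_mg[0] * 100


# Illustrative target-protein amounts (mg): start, capture, UFDF, polish
amounts = [400.0, 268.0, 225.0, 200.0]
for name, pct in zip(("capture", "UFDF", "polish"), step_recoveries_pct(amounts)):
    print(f"{name}: {pct:.0f}%")
print(f"overall: {overall_recovery_pct(amounts):.0f}%")
```

Because the overall value is a product, a modest loss at an intermediate step such as UFDF lowers the final yield even when capture and polishing perform well.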

Fig. 3

In-process sample comparison from three processes. Yield, step recovery, overall recovery, and purity are shown as an average ± SD calculated from two independent gels that are shown in the table (left) and a representative gel stained with Coomassie blue that is shown (right) from process 1 (a), process 2 (b), and process 3 (c). FS, fermentation supernatant; HIC, hydrophobic interaction chromatography; UFDF, ultrafiltration and diafiltration; SEC, size exclusion chromatography; AEX, anion exchange chromatography; CEX, cation exchange chromatography

Although process 1 is sufficient and proven to produce proteins at high yield and purity at the laboratory scale, up to 10 L (Chen et al. 2017), there are considerations to be made with respect to scaling up manufacture. Both HIC and SEC are costly steps due to their low binding and process capacities, requiring large resin volumes and long processing times. Therefore, we explored two alternative processes utilizing IEX, favored in the biopharmaceutical industry due to its low cost and high scalability.

In process 2, the capture step was unchanged. After the UFDF step to concentrate and exchange buffer, the target protein was polished by a negative capture using AEX instead of SEC (Fig. 1b). Since the theoretical pI of RBD219-N1C1 was 8.44 calculated based on its amino acid sequence (Chen et al. 2021) using Emboss Pepstats (Madeira et al. 2019), RBD219-N1C1 was predicted to be positively charged at a pH below its theoretical pI. After screening buffer conditions, consisting of Tris-HCl at different pH values ranging from 7.0 to 8.0 and various concentrations of NaCl, the buffer consisting of 20 mM Tris-HCl (pH 7.5) and 100 mM NaCl showed that the RBD219-N1C1 did not bind to the Q XL column while non-specific HCPs were bound to the column and removed effectively. The step recovery during AEX was 78%, which is lower than the 89% of the step recovery seen from SEC in process 1. The final purity of the purified protein from process 2 was 95.1%, which is lower than the 98.3% purity seen in process 1 but still highly pure. However, the overall recovery in process 2 was only 39%, much lower than the 50% for process 1. This is due to the lower recovery during HIC, 45%, that lags the 67% recovery seen for the equivalent step in process 1. This lower recovery may offer the opportunity for improvement, but overall, it is fair to conclude that AEX can successfully replace SEC for the polishing step.

In process 3, we further optimized process 2 to utilize CEX for the capture step instead of HIC. After the first UFDF step (UFDF-1) to concentrate and buffer exchange the FS, RBD219-N1C1 was captured using a CM FF column followed by a second UFDF step (UFDF-2) and a polishing step using negative AEX capture with another buffer consisting of 25 mM Tris-HCl (pH 7.2) and 5 mM NaCl and selected from the aforementioned buffer screening (Fig. 1b). The additional UFDF-1 step required prior to CEX increased processing time compared to processes 1 and 2. Although the step recovery from CEX was 65%, very similar to the 67% seen after the HIC in process 1, the purity was only 83% after CEX which is significantly lower than the purity (97.6% and 95.2% from processes 1 and 2, respectively) after HIC (Fig. 3c). Purity was improved significantly by 9.5% after the polishing step, resulting in an overall purity of 92.5%, which is lower than the 98.3% and 95.1% seen in processes 1 and 2, respectively.

To summarize, HIC removed non-specific host proteins more effectively and, hence, yielded >95% purity after the capture step, which is higher than the 92.5% purity of the final protein product from process 3. This favored HIC over CEX; its only drawback is cost, and it has no limitation on scale-up. AEX and SEC, on the other hand, performed very similarly during the polishing step. However, while AEX is a cost-effective chromatography method with full scalability, SEC is expensive and has limitations in scale-up. This led us to favor process 2, which employs HIC and AEX for the capture and polishing steps, respectively. Before concluding that process 2 is the best process to produce RBD219-N1C1, we characterized and compared the purified proteins from process 2 and the two other processes for integrity, size estimation, impurity contents, and functionality.

Characterization and size estimation of the purified proteins from three processes

The purified proteins from all processes were characterized for integrity by Western blot, size exclusion chromatography-high performance liquid chromatography (SEC-HPLC), and DLS. When 5 μg of purified protein was analyzed by SDS-PAGE followed by Coomassie blue staining, a single band was seen at ~28 kDa under reducing conditions and at ~25 kDa under non-reducing conditions (Fig. 4a). Western blot analysis using a monoclonal antibody against the SARS-CoV-2 spike protein under non-reducing conditions indicated that the ~25 kDa band is indeed derived from the SARS-CoV-2 spike protein (Fig. 4b). An additional band at ~50 kDa was detected in the protein from process 3, likely representing a dimer. Dimerization through free cysteine residues had also been reported for SARS-CoV RBD219-N1, and therefore, the free cysteine (C538) was mutated to alanine in RBD219-N1C1 (Chen et al. 2021). Although RBD219-N1C1 theoretically lacks free cysteine residues, we observed some dimers during the fermentation that were removed during purification. Process 3 therefore appears to be less efficient at removing dimeric RBD219-N1C1 than the other processes.

Fig. 4

Characterization of purified RBD219-N1C1 proteins from three processes. Purified proteins were analyzed by SDS-PAGE with Coomassie blue stain (a) and Western blot with a monoclonal anti-SARS-CoV-2 spike antibody (b). Size and aggregate evaluation by SEC-HPLC (c). Hydrodynamic radius and size in solution measured by dynamic light scattering (d). Averages ± SD are shown from four independent measurements

SEC-HPLC with 50 μg of the purified protein preparations indicated that all three proteins were similar in size and had no aggregation. Only the purified protein from process 3 showed an additional peak eluting ~1 min earlier, likely, as reported above, a dimer (Fig. 4c). Finally, all three proteins were analyzed by DLS to estimate size and dispersity in solution. The estimated sizes of the purified proteins from each process were 29.75, 31.00, and 34.25 kDa, respectively (Fig. 4d). As expected, the protein from process 3 showed higher polydispersity than the other samples (Fig. S1).

Impurity assessment in the purified proteins

P. pastoris HCP was assayed by Western blot and quantified by ELISA using a second-generation P. pastoris HCP detection kit. When 5 μg of unpurified proteins (i.e., FS) and of the purified proteins were analyzed by SDS-PAGE followed by Coomassie blue stain and Western blot, we saw that HCP had been effectively removed by all three processes (Fig. 5a and b). The HCP content in the purified proteins was calculated as 95.9 ng, 6.8 ng, and 44.8 ng per mg of RBD219-N1C1 from processes 1–3, respectively (Fig. 5c). All these values were within the acceptable limits (1–100 ng/mg) for biopharmaceuticals (Bracewell et al. 2015; Zhu-Shimoni et al. 2014).

Fig. 5

Impurity evaluation of the purified RBD219-N1C1 proteins from three processes. Unpurified (FS) and purified RBD219-N1C1 in reduced SDS-PAGE with Coomassie blue stain (a) and with Western blot using anti-P. pastoris HCP antibody (b). Measured P. pastoris HCP content by quantitative ELISA (c) and endotoxin levels (d) are shown

Endotoxin levels in the purified proteins from processes 1–3 were 1.74, 1.48, and 2.10 EU per mg, respectively (Fig. 5d). These values are well below the maximum recommended endotoxin level for recombinant subunit vaccines, 20 EU/mL (Brito and Singh 2011).

Functionality assessment using ACE2 binding assay

To evaluate the functionality of the purified proteins from each process, the ability to bind to the human ACE2 receptor was tested in vitro. SARS-CoV-2 uses this human cell surface receptor for cell entry (Hoffmann et al. 2020), and here, the binding of each protein to ACE2 was quantified by ELISA. All proteins presented similar binding curves to ACE2, with calculated EC50 values (for 2 μg/mL purified protein) of 0.037, 0.033, and 0.038 μg/mL ACE2, respectively (Fig. 6), suggesting that all three proteins were functionally equivalent.
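As a rough illustration of how an EC50 relates to an ELISA binding curve, the sketch below uses a four-parameter logistic (4PL) model. The text does not state which curve model was used to calculate the EC50 values, so the 4PL form, the function name, and the example curve parameters (bottom, top, Hill slope) are assumptions. Only the three EC50 values come from the text.

```python
def four_pl(x, bottom, top, ec50, hill):
    """Four-parameter logistic response at concentration x (x > 0).

    At x == ec50 the response is exactly halfway between bottom and top.
    """
    return bottom + (top - bottom) / (1.0 + (ec50 / x) ** hill)


# EC50 values (ug/mL ACE2) reported in the text for processes 1-3.
ec50_ug_per_ml = {"process 1": 0.037, "process 2": 0.033, "process 3": 0.038}

# Ratio of the largest to the smallest EC50; a value near 1 means similar binding.
fold_spread = max(ec50_ug_per_ml.values()) / min(ec50_ug_per_ml.values())

# Hypothetical curve parameters (absorbance units), for illustration only.
BOTTOM, TOP, HILL = 0.0, 2.0, 1.3
half_max = four_pl(0.037, BOTTOM, TOP, 0.037, HILL)

print(f"fold spread between EC50 values: {fold_spread:.2f}")
print(f"response at x = EC50 (hypothetical curve): {half_max:.2f}")
```

The EC50 values differ by a factor of about 1.15 across the three processes, which is consistent with the conclusion that the proteins bind ACE2 similarly.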

Fig. 6

Binding ability of the purified RBD219-N1C1 from three processes to a recombinant human ACE2 receptor


We developed a process suitable for producing a recombinant protein COVID-19 vaccine antigen for clinical testing and transition to industrial manufacture. Fermentations were initially run at the 1-L scale (to optimize fermentation conditions) and then at the 5-L scale (to develop the downstream purification process). When the process was scaled to 5 L, with only gas flow and agitation rate modified to maintain 30% dissolved oxygen, differences in protein yield were observed. Four subsequent identical 5-L fermentation runs were reproducible, with a CV of 8.3%, further supporting the robustness of the process.
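The coefficient of variation (CV) used above as a reproducibility measure is the standard deviation divided by the mean. The minimal sketch below assumes the sample standard deviation (n − 1), which the text does not specify. The yields are hypothetical placeholders, not the study's data, so the printed value is not expected to match the reported 8.3%.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation in percent, using the sample SD (n - 1)."""
    if len(values) < 2:
        raise ValueError("at least two runs are needed to compute a CV")
    return 100.0 * stdev(values) / mean(values)


# Hypothetical yields (mg/L) for four replicate 5-L runs; NOT data from the study.
hypothetical_yields_mg_per_l = [410.0, 455.0, 390.0, 445.0]

print(f"CV = {cv_percent(hypothetical_yields_mg_per_l):.1f}%")
```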

In our previous experience with SARS-CoV RBD219-N1 (a prototype vaccine for SARS), a 1.6- to 2.5-fold yield increase was achieved when switching from basal-salt to low-salt medium during the glycerol batch phase of the fermentation process (Chen et al. 2014). For SARS-CoV-2 RBD219-N1C1, a 3.6-fold increase in yield suggests that the salt concentration was a significant factor. In basal-salt medium, the recombinant protein precipitates in the presence of magnesium and calcium phosphates as the pH is adjusted above 5.5. Precipitation also occurs in low-salt medium, though to a much lesser extent. Precipitate formation can have adverse effects on the fermentation process, as it can lead to an unbalanced nutrient supply, cause cell disruption, and induce secreted proteins to form aggregates (Zhang et al. 2006). Similar findings had previously been reported for the production of a therapeutic Fc-fusion protein in P. pastoris fermentation: when salt supplements were added at induction, the protein yield decreased (Lin et al. 2007).

Purification optimization produced RBD219-N1C1 at high purity and yield, with a high recovery rate, in a form suitable for scale-up to manufacturing. Three purification methods (processes 1–3) were tested and compared using 1 L of FS from identical fermentation runs for rapid development. Process 1 was adapted from the previous purification method for SARS-CoV RBD219-N1, with a slight modification of the ammonium salt concentration in HIC. Process 1 resulted in 98.3% purity with a 50% overall recovery rate, similar to the 98.5% purity and 52% overall recovery reported for SARS-CoV RBD219-N1 purification (Fig. 3a) (Chen et al. 2017). Purity increased dramatically, to >97%, after the HIC capture step (Fig. 3a). Process 1 is suitable for producing the target protein at the laboratory scale but is limited in scale-up by low binding and process capacities, as well as by a long processing time that leads to high production costs. Therefore, two other processes were tested in which the costly HIC and SEC steps were replaced with CEX and AEX, respectively. For biopharmaceuticals, IEX has been favored in chromatography due to its robustness and full scalability (Chen et al. 2017).
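An overall recovery rate for a multi-step purification is commonly computed as the product of the individual step recoveries. The sketch below illustrates that arithmetic. The step names and step recoveries are hypothetical and are not values from the study, which reports only the overall figure here.

```python
from math import prod


def overall_recovery(step_recoveries):
    """Multiply fractional step recoveries (each between 0 and 1) together."""
    steps = list(step_recoveries)
    for r in steps:
        if not 0.0 <= r <= 1.0:
            raise ValueError(f"step recovery out of range: {r}")
    return prod(steps)


# Hypothetical fractional recoveries for a three-step process; NOT study data.
hypothetical_steps = {"capture": 0.80, "intermediate": 0.90, "polishing": 0.70}

overall = overall_recovery(hypothetical_steps.values())
print(f"overall recovery: {overall:.1%}")
```

Because the step recoveries multiply, a single inefficient step limits the overall recovery no matter how well the other steps perform.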

While IEX was tested only in the polishing step of process 2, it was used for both the capture and polishing steps in process 3. In process 2, AEX showed a step recovery and purity increase comparable to those of SEC in process 1 (Fig. 3a and b). In process 3, however, AEX produced a substantial improvement in purity after the CEX capture step (Fig. 3c), as the CEX elution pool was only 83.0% pure. This suggests that AEX can not only replace SEC but also effectively remove non-specific host proteins. In contrast, CEX showed a comparable step recovery but a lower capability to remove host proteins during the capture step: the 83.0% purity after CEX capture is considerably lower than the purity after HIC capture (97.6% and 95.2% in processes 1 and 2, respectively) (Fig. 3). Overall, process 3 produced the least pure RBD219-N1C1 protein among the three processes tested.

Before choosing the best process for purification of RBD219-N1C1, the purified proteins were characterized for their quality based on size, specificity, and impurity. The integrity of the purified proteins was assessed by SDS-PAGE. Coomassie-stained gels showed a single band at ~25 kDa that was recognized by a SARS-CoV-2 spike protein-specific antibody (Fig. 4a and b). In addition, for process 3, the Western blot indicated the presence of an additional band presumed to be a dimer (Fig. 4b); this same product was also seen by SEC-HPLC (Fig. 4c). Although no difference in size was seen among the purified proteins from the three processes by SDS-PAGE (Fig. 4a), the size under native conditions, estimated by DLS, did differ. The sizes in solution were 29.75, 31.00, and 34.25 kDa for the products from processes 1–3, respectively. The purified protein from process 3 had a larger estimated size, suggesting the presence of additional molecules in the preparation (Fig. 4d and Fig. S1). Next, impurities such as P. pastoris HCPs and endotoxin were analyzed and compared. While none of the purified proteins showed detectable HCPs by Western blot with anti-P. pastoris antibodies (Fig. 5b), different HCP content levels were observed when measured by ELISA. Process 2 showed the lowest HCP content (6.8 ± 0.0 ng per mg of purified protein), while process 1 showed the highest (95.9 ± 8.2 ng) and process 3 showed 44.8 ± 13.0 ng (Fig. 5c). The higher HCP content found in the purified protein from process 1 was likely due to the presence of HCP of a size similar to that of RBD219-N1C1, which further suggested that SEC might not be an ideal purification step. No significant difference in endotoxin level was measured among the purified proteins from the three processes (Fig. 5d), and all protein preparations contained less than the maximally allowed endotoxin level. Finally, the in vitro ACE2 binding assay showed that the purified proteins from all three processes bound similarly to the recombinant human ACE2 receptor (Fig. 6).

In summary, after comparing yield, purity, and recovery after each purification, we conclude that process 2 is best suited to produce RBD219-N1C1. It uses HIC for capture, owing to its superior capability to remove non-specific host proteins and produce a protein with >95% purity, and AEX for polishing, owing to its low cost and full scalability. In addition, comparison of the integrity, dimer content, HCP content, and endotoxin level of the purified proteins supported that process 2 generates protein of a quality similar to that of process 1 but significantly better than that of process 3 and, hence, is the more suitable process for upscaling.

P. pastoris is widely used to produce recombinant proteins for clinical and commercial use. The P. pastoris system is licensed to more than 300 companies in the biotechnology, pharmaceutical, vaccine, animal health, and food industries, and more than 70 therapeutic and industrial products, including human insulin, Hep B vaccine, cytokines, and hormones, are approved by stringent regulatory bodies (Safder et al. 2018). P. pastoris offers high growth rates, high cell densities, and high protein yield using simple and inexpensive fermentation media. Fermentation conditions are highly scalable due to the robust nature of P. pastoris, and the manufacturing times are short. With such an effective production platform and the availability of manufacturing facilities, including vaccine manufacturers from the developing countries network, we can produce this COVID-19 vaccine candidate at a low cost to meet the urgent global needs. The production technology of RBD219-N1C1 was transferred to Biological E. Limited, an India-based vaccine and pharmaceutical company, and a phase I/II clinical trial was initiated in November 2020 in India (CTRI 2020; Dynavax 2020).