Introduction

Advances in molecular biology and genetic engineering, together with the development of various biotechnological tools, have enabled the production of a large number of protein drugs, vaccines, and industrial enzymes, such as α-amylase and phytase, in living organisms (Bandaranayake and Almo 2014; Chen 2012; Islam et al. 2018; Obembe et al. 2011; Sohn et al. 2018; Tang and Zhao 2009; Tschofen et al. 2016; Xu et al. 2011). Bacteria and animal cells have been widely used as platforms for the production of various proteins, which are currently available in the market as commercialized products (Bandaranayake and Almo 2014; Chen 2012). However, these conventional production systems have certain shortcomings; for example, the animal cell-based system is expensive and cannot be operated on a large scale (Egelkrout et al. 2012). Recently, plants have received increasing attention as a new platform for the production of recombinant proteins. Compared with animal cells and bacteria, plants have various advantages as protein production systems, such as easy scalability, and potentially zero risk of contamination with human pathogens and bacterial toxins. These advantages have led to the concept of green pharmaceuticals, i.e., pharmaceuticals produced in plants (Egelkrout et al. 2012; Obembe et al. 2011; Xu et al. 2011; Zoschke and Bock 2018).

In plant cells, recombinant proteins are often expressed in subcellular organelles (Jang et al. 1999). In leaf tissues, one of the most attractive organelles for protein production is the chloroplast. In mesophyll cells of mature leaves, the number of chloroplasts is as high as 100, and the area occupied by a chloroplast is second to that occupied by a vacuole (Daniell et al. 2016; Zoschke and Bock 2018). Moreover, unlike vacuoles, chloroplasts have neutral pH and minimally active proteolysis (Zoschke and Bock 2018). Thus, chloroplasts are ideal for the storage of recombinant proteins in plant cells. This is supported by the Rubisco protein complex, which is stored in chloroplasts and accounts for half of the total cellular proteins in leaf tissues (Vitlin Gruber and Feiz 2018).

Two different approaches have been developed for recombinant protein production in chloroplasts. One approach is based on the integration of the target gene into the chloroplast genome via homologous recombination, followed by transcription and translation in chloroplasts (Adem et al. 2017). Many excellent review papers have been published on this topic (Adem et al. 2017; Chebolu and Daniell 2009; Daniell et al. 2009, 2016; Jin and Daniell 2015; Olejniczak et al. 2016; Scotti et al. 2012); therefore, this approach is not covered in this review, and readers are referred to those papers.

The second approach is based on nuclear transformation; the transgene is either transiently expressed in the plant cell nucleus or stably integrated into the plant nuclear genome, and the translated protein is imported into the chloroplasts from the cytosol (Cui et al. 2011; Gelvin 2003; Zahin et al. 2016). Like the chloroplast transformation method, this approach also has great potential for the production of foreign proteins in chloroplasts; however, it has not received much attention and has not been fully explored for the production of recombinant proteins. Compared with research conducted on the chloroplast transformation approach, only a few research articles have been published on the import of recombinant proteins into chloroplasts using nuclear transformation technologies. In this review, we summarize the advantages and weaknesses of the nuclear transformation-based approach, in which recombinant proteins are imported from the cytosol into the chloroplast, for large-scale protein production.

Expression of target genes in the nucleus for protein production in chloroplasts

Agrobacterium tumefaciens, a phytopathogenic Gram-negative soil bacterium, has been used for delivering transgenes into the nucleus of plant cells. A substantial region of the tumor-inducing (Ti) plasmid of Agrobacterium, referred to as the transfer DNA (T-DNA), is integrated into the host plant genome (Gelvin 2003; Hwang et al. 2017a; Smith and Townsend 1907). Plant transformation is performed using the binary vector system, which was developed based on the T-DNA of Agrobacterium. Bacterial genes within the T-DNA are replaced with the gene of interest and delivered into the host nucleus during transformation. Recently, it has been shown that Agrobacterium can be used to transform not only tobacco but also a large number of other plant species (Guo et al. 2018). Thus, there is almost no restriction on plant species for foreign gene expression via nuclear transformation. To produce recombinant proteins in chloroplasts via nuclear transformation, two different approaches have been developed, based on the status of the transgene in the nucleus. In the first approach, the transgene is stably integrated into the nuclear genome, whereas, in the second approach, the transgene is transiently expressed in the nucleus by the unintegrated T-DNA (Fig. 1). Although both approaches rely on nuclear gene expression, there are several differences that distinguish these two approaches when used for recombinant protein production in plants.

Fig. 1

Stable or transient expression of foreign genes from the nuclear genome and post-translational import of foreign proteins into chloroplasts. Two methods are used for transgene expression following nuclear transformation. In the first method, T-DNA harboring the transgene is stably integrated into the nuclear genome, thereby giving rise to transgenic plants, and in the second method, the non-integrated transgene is expressed from the nucleus and subsequently proteins are synthesized on cytosolic ribosomes and post-translationally imported into chloroplasts

Stable expression system for foreign proteins

Stable integration of the transgene in the nuclear genome requires the generation of transgenic plants. First, the transgene is introduced into a binary vector. Subsequently, the binary vector carrying the transgene is introduced into A. tumefaciens, which transfers the T-DNA from the binary vector into the host cells, where the T-DNA is integrated into the nuclear genome. In this approach, a marker gene is integrated into the nuclear genome along with the gene of interest, thereby allowing the screening of transformed cells from which transgenic plants are regenerated. This is followed by the selection of homozygous plants for the production of proteins encoded by transgenes. The advantage of this approach is that the production of plant biomass expressing the recombinant proteins is as simple as growing plants in a greenhouse, and does not require any additional steps such as agroinfiltration. Elite lines selected on the basis of performance are propagated via seeds, thus requiring minimal maintenance. Thus, this approach is the most suitable for large-scale protein production. However, the disadvantage of this approach is that the generation of homozygous lines is a time-consuming process. In addition, this approach can be used for the production of transgenic plants in only a limited number of plant species. Furthermore, individual transgenic plants produced using this approach show high variability in transgene expression. To obtain transgenic plants expressing the transgene at high levels, it is necessary to screen a large number of independent transgenic plants; therefore, this method is labor intensive. Moreover, the expression level of a single copy of the transgene stably integrated into the host genome is generally lower than that resulting from transient transgene expression post-agroinfiltration, and the expression of transgenes is often silenced after several generations. To address these problems, many new approaches have been developed recently. These include the use of matrix attachment sequences [to ensure transgene expression independent of the integration site; (Halweg et al. 2005)], strong artificial transcription factors (Li et al. 2017; Lowder et al. 2018), inducible gene silencing suppressors, transcription terminators [with high transcription efficiency; (Csorba et al. 2015)], and 5′ untranslated regions [with high translational efficiency; (Kim et al. 2013; Li et al. 2017)]. These advances in the expression of transgenes in the host cell nucleus provide a new strategy for high-level protein production in chloroplasts. In addition, according to the current regulation of genetically modified (GM) plants, it is important to identify homozygous transgenic plants carrying a single-copy insertion of the target gene in the intergenic region. Thus, this approach would best fit the current regulation of GM plants for commercialization.

Transient expression system for foreign proteins

In this approach, the transgene carried by the non-integrating T-DNA system is only transiently present and expressed in the nucleus of leaf cells post-agroinfiltration through mutant Agrobacterium strains (Fig. 1; Gelvin 2003; Kapila et al. 1997; Sheludko 2008). Thus, this approach is not subject to the limitations of transgene integration in the nuclear genome, such as the lengthy process of transgenic plant selection and the effect of the insertion site on transgene expression (Marillonnet et al. 2005; Sheludko et al. 2007; Sheludko 2008; Vaquero et al. 1999). Using this approach, recombinant proteins can be produced within a few days, typically 3 to 10 days post-agroinfiltration. Thus, the transient expression system is a rapid method for the production of recombinant proteins compared with the stable transformation system. This approach would be particularly suitable for the production of recombinant vaccines in the event of viral disease outbreaks (Chen et al. 2013). The most important advantage of this approach is the high-level protein production compared with the transgenic approach. The transgene copy number in the nucleus is higher in transiently transformed plants than that in single-copy transgenic plants, leading to higher protein production (Huang and Mason 2004; Komarova et al. 2010; Sheludko 2008). If the binary vector is based on DNA or RNA viruses, the copy number of the target gene or mRNA, respectively, is amplified greatly, thus leading to high-level protein production (Cañizares et al. 2006; Huang et al. 2009; Lico et al. 2008; Marillonnet et al. 2005; Sheludko 2008). For example, the transient expression of a foreign gene using the MagnICON vector, an RNA virus-based expression vector, followed by import of translated foreign proteins into chloroplasts led to an expression level as high as that obtained with the transplastomic expression approach (Marillonnet et al. 2004, 2005). Moreover, gene silencing can be avoided by the coexpression of gene silencing suppressors (Csorba et al. 2015). Furthermore, it is possible to express multiple genes simultaneously, enabling the production of protein complexes composed of hetero-multimeric subunits (Pineo et al. 2013).

However, this approach also has certain disadvantages compared with the transgenic approach. In the transient expression approach, a new Agrobacterium culture is needed each time recombinant protein production is required, and plants must be infiltrated individually. Moreover, agroinfiltration is a laborious process compared with the sowing of seeds in the transgenic approach. Thus, the overall cost of biomass production using the transient expression approach is much higher than that using the transgenic approach. Another point of concern is that the efficiency of protein production varies among individual plants transformed in one experiment and among plants transformed in different experiments; this could be a major problem when the consistency of protein production is important, as in the case of pharmaceuticals (Fujiuchi et al. 2016). Another disadvantage of this method is the use of Agrobacterium, a Gram-negative species of bacteria. Recently, plant systems have been advocated as a safe system with no or very low level of endotoxins (Islam et al. 2018). However, with the use of Agrobacterium, the plant system is no longer considered an endotoxin-free system, and additional processing is needed to remove endotoxins of Agrobacterium origin from the purified proteins. It should be noted that the above-mentioned issues are not specific to protein production in chloroplasts; instead, these issues represent the general problems of this approach when used for recombinant protein production in plants.

Protein import into chloroplasts from the cytosol after translation

Successful application of nuclear transformation for foreign protein production in chloroplasts is critically dependent on the efficiency of protein import into chloroplasts after translation in the cytosol. Thus, it is important to understand how a large amount of protein is imported into chloroplasts from the cytosol for high-level protein accumulation. The capacity to deliver proteins from the cytosol to the chloroplast may not be a limiting factor, because more than 3000 different proteins are imported into chloroplasts from the cytosol (Jarvis 2008; Lee et al. 2013; Li and Chiu 2010). Moreover, the three most highly abundant proteins in plant leaf cells, Rubisco large subunit (RbcL), Rubisco small subunit (RbcS), and chlorophyll a/b-binding protein (Cab), occur in chloroplasts. Two of these three proteins, RbcS and Cab, are imported into chloroplasts from the cytosol. Thus, the first step for high-level protein production in chloroplasts via nuclear transformation is to identify a signal sequence that is highly efficient in delivering foreign proteins into chloroplasts. Since transit peptides are cargo-independent (Zhang and Glaser 2002; Zybailov et al. 2008), the selection of a highly efficient transit peptide is critical for the accumulation of proteins at high levels in chloroplasts. Transit peptides of RbcS and Cab proteins have been widely used for protein production in chloroplasts (Shen et al. 2017; see examples below). However, transit peptides are highly variable in sequence and protein import efficiency. Small sequence motifs present in the transit peptide play crucial roles in the efficiency of protein import into chloroplasts. Thus, it is possible that the modulation of these sequence motifs affects the protein import efficiency (Lee et al. 2006, 2018). Consistent with this possibility, a recent report showed that modified chloroplast transit peptides improve the import of diverse E. coli enzymes such as EcTSR and EcGCL into rice chloroplasts (Shen et al. 2017). Thus, the identification of transit peptides with high protein import efficiency could potentially enhance recombinant protein production in chloroplasts via nuclear transformation.

Another important issue in transit peptide-mediated delivery of proteins into chloroplasts is the removal of the transit peptide from the target recombinant protein after import into chloroplasts. During or after import of precursors into chloroplasts, the N-terminal transit peptide is cleaved off by stromal processing peptidases (SPPs) within chloroplasts (Chen and Li 2007; Richter and Lamppa 1998). However, the entire signal sequence is not removed from the target proteins after import, because the cleavage site of SPPs is located within the transit peptide (Emanuelsson et al. 1999; Lee et al. 2008). Thus, target proteins contain many extra residues derived from the transit peptide. It is possible that only the region of the transit peptide N-terminal to the SPP cleavage site can be used as a signal sequence for delivering proteins into chloroplasts. However, this may decrease the import efficiency, as the sequence downstream of the cleavage site also contributes to the import efficiency (Lee et al. 2008). Another approach for completely removing the transit peptide from the target protein after import into chloroplasts is to include a new proteolytic cleavage site between the transit peptide and target protein. For example, enterokinase removes the remaining region of the transit peptide when the enterokinase recognition site is inserted between the transit peptide and the N-terminus of the target protein. Enterokinase cleaves at the C-terminal end of its recognition site, leaving no extra residues (Hosfield and Lu 1999; Skala et al. 2013).
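The two-step processing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the transit peptide fragments and the target sequence are hypothetical placeholders, the SPP cleavage position is supplied by hand rather than predicted, and the only biochemical fact used is that enterokinase recognizes Asp-Asp-Asp-Asp-Lys (DDDDK) and cleaves after the lysine. The function names are ours and do not come from any library.

```python
# Hypothetical sequences for illustration only; none are taken from the review.
TP_UPSTREAM = "MASSMLSSAT"   # placeholder: part of the transit peptide removed by SPP
TP_RESIDUAL = "MQVW"         # placeholder: transit peptide residues left after SPP cleavage
TARGET = "MSKGEELF"          # placeholder: N-terminus of a target protein

ENTEROKINASE_SITE = "DDDDK"  # enterokinase cleaves after the Lys of this site


def build_precursor(tp_upstream, tp_residual, target, site=ENTEROKINASE_SITE):
    """Assemble transit peptide + enterokinase site + target as one sequence."""
    return tp_upstream + tp_residual + site + target


def spp_cleave(precursor, cleavage_index):
    """Model SPP processing: drop everything upstream of the cleavage position."""
    return precursor[cleavage_index:]


def enterokinase_cleave(protein, site=ENTEROKINASE_SITE):
    """Remove everything up to and including the first enterokinase site.

    If the site is absent, the protein is returned unchanged.
    """
    idx = protein.find(site)
    if idx == -1:
        return protein
    return protein[idx + len(site):]


precursor = build_precursor(TP_UPSTREAM, TP_RESIDUAL, TARGET)
after_spp = spp_cleave(precursor, len(TP_UPSTREAM))   # still carries TP_RESIDUAL
mature = enterokinase_cleave(after_spp)               # residual residues removed

print("precursor:", precursor)
print("after SPP:", after_spp)
print("mature   :", mature)
```

In this sketch, the protein after SPP processing still begins with the residual transit peptide residues, whereas the enterokinase step yields the target with its own N-terminus. A target that itself contains DDDDK would be cut internally, which is a design constraint to check before choosing this site.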

Demonstration of foreign gene expression and sequestration of translated proteins into chloroplasts via nuclear transformation technologies

Stable expression of foreign proteins

Several studies report successful production of recombinant proteins in chloroplasts using the nuclear transformation strategy (Table 1). These include production of β-glucuronidase (GUS), green fluorescent protein (GFP), and E. coli phosphoenolpyruvate synthetase and xylanase in various plant species such as petunia, potato, rice, and tobacco (Hyunjong et al. 2005; Kavanagh et al. 1988; Köhler et al. 1997; Panstruga et al. 1997). The purified GUS protein from petunia showed 95% bioactivity (Kavanagh et al. 1988). Transit peptides of Arabidopsis or rice RbcS or Cab proteins were used for the delivery of these proteins to chloroplasts. The level of protein accumulation in chloroplasts showed high variability, depending on the protein and plant species. For example, in potato, E. coli phosphoenolpyruvate synthetase accounted for approximately 0.1% of the total soluble protein (TSP) (Panstruga et al. 1997). However, xylanase and GFP were expressed at much higher levels, reaching approximately 4.5% and 10% of the TSP in transgenic tobacco and rice plants, respectively. In addition, HPV16-L1 and p24 antigens produced in transgenic tobacco plants accounted for 11% of the TSP and 230 µg/kg (fresh weight; fw) of plant biomass, respectively (Maclean et al. 2007; Meyers et al. 2008). To date, the level of HPV16-L1 protein produced in transgenic tobacco plants is the highest level of protein production achieved in chloroplasts through nuclear transformation (Jang et al. 1999). Hoppmann et al. (2002) used tobacco cell lines to produce GFP and Catharanthus roseus strictosidine synthase (Str1); these proteins accumulated to very high levels when delivered to chloroplasts using the transit peptide of potato granule-bound starch synthase (gbss). In addition, several enzymes have been produced using this approach. Polyhydroxybutyrate (PHB) enzymes of Alcaligenes eutrophus were produced in chloroplasts of transgenic maize (Zhong et al. 2003).
Hyperthermostable endoglucanase Cel5A and a protein encoded by the truncated cry1Ac gene were produced to approximately 5.2% and 2% of the TSP in transgenic tobacco and rice plants, respectively (Kim et al. 2009, 2010). Moreover, Cel5A produced in chloroplasts was highly efficient in the degradation of its substrate, carboxy-methyl cellulose (CMC) (Kim et al. 2010). In these three studies, the transit peptide of RbcS was used to deliver proteins to chloroplasts. Together, these studies support the potential of recombinant protein production in chloroplasts using stable transformation.

Table 1 List of proteins expressed via nuclear transformation, followed by transit peptide-mediated protein import into chloroplasts

Transient expression of foreign proteins

Recently, transient expression of foreign genes, followed by protein import into chloroplasts, was used for high-level protein production in plants (Table 2). For example, bioactive human growth hormone (hGH) and Norwalk virus capsid proteins were expressed to 0.1% of the TSP and approximately 0.1 mg/kg (fw) of plant biomass, respectively (Gils et al. 2005). The truncated Gag (p17/p24) and p24 capsid subunit proteins of human immunodeficiency virus type 1 (HIV-1) were produced at very high levels, exceeding 1 mg/kg (fw) of plant biomass (Meyers et al. 2008). Moreover, p17/p24 proteins boosted T-cell and humoral responses in mice that had been primed with a gag DNA vaccine (Meyers et al. 2008). Another example of a viral protein produced in chloroplasts is HPV16 L1; Maclean et al. (2007) produced 137 mg HPV16 L1 per kg (fw) of plant biomass. Moreover, L1 proteins were properly assembled into virus-like particles (VLPs) in N. benthamiana, which are highly immunogenic in target animals and can be developed into effective vaccines; thus, this result demonstrates the potential of this approach for vaccine development. Another study showed further improvement in the productivity of HPV16 L1 VLPs using the MagnICON vector (Zahin et al. 2016); using this vector, 250 mg HPV16 L1 protein was produced per kg (fw) of plant biomass. VLPs of the human papilloma virus (HPV) are composed of both L1 and L2 proteins, and are thought to exhibit higher immunogenicity than L1 alone (Zahin et al. 2016). Indeed, coexpression of HPV16 L1 and L2 genes encoding the major and minor coat proteins, respectively, resulted in the accumulation of 1710 mg HPV16 VLPs per kg (fw) of plant biomass (Pineo et al. 2013). In other studies, Yanez et al. (2017, 2018) demonstrated the production of HPV16 E7 to high levels in plants. 
Together, these studies demonstrate that transient transgene expression, followed by protein import into chloroplasts, is a powerful approach for high-level recombinant protein production in plants.
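To put the HPV16 yields quoted above on a common footing, the short Python sketch below computes fold changes relative to the earliest reported value. The numbers are copied from the text of this section as given (mg per kg fresh weight); the dictionary keys, the choice of baseline, and the helper name are our own illustrative assumptions, and the studies may differ in hosts, vectors, and quantification methods, so the ratios are indicative only.

```python
# Yields in mg per kg fresh weight, as quoted in this section.
yields_mg_per_kg_fw = {
    "HPV16 L1 (Maclean et al. 2007)": 137.0,
    "HPV16 L1, MagnICON vector (Zahin et al. 2016)": 250.0,
    "HPV16 L1 + L2 VLPs (Pineo et al. 2013)": 1710.0,
}

# Assumption: use the earliest reported value as the baseline for comparison.
BASELINE_KEY = "HPV16 L1 (Maclean et al. 2007)"


def fold_changes(yields, baseline_key):
    """Return each yield divided by the baseline yield."""
    baseline = yields[baseline_key]
    return {name: value / baseline for name, value in yields.items()}


folds = fold_changes(yields_mg_per_kg_fw, BASELINE_KEY)
for name, fold in folds.items():
    print(f"{name}: {yields_mg_per_kg_fw[name]:.0f} mg/kg fw ({fold:.2f}x baseline)")
```

With these values, the MagnICON result is roughly 1.8-fold and the L1 plus L2 coexpression result roughly 12.5-fold above the baseline.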

Table 2 List of proteins expressed in tobacco via agroinfiltration, followed by protein import into chloroplasts using the transit peptide of the small subunit of Rubisco (RbcS)

Potential production of N-glycosylated proteins in chloroplasts

Chloroplasts originally evolved from cyanobacteria and, therefore, do not have the ability to synthesize glycosylated proteins. However, many pharmaceutical proteins are N-glycosylated (Gomord et al. 2010; Solá and Griebenow 2010). These N-glycans are important for the stability of proteins. Moreover, in certain cases, N-glycans are crucial for the biological activity of proteins. Proteins are N-glycosylated in the endoplasmic reticulum (ER) via a system involving a large number of proteins (Ceriotti et al. 1998). However, it has been shown that certain chloroplast proteins such as α-type carbonic anhydrases and amylases are N-glycosylated (Burén et al. 2011; Chen et al. 2004; Faye and Daniell 2006; Lehtimäki et al. 2015). Moreover, α-carbonic anhydrases in Arabidopsis are among the most abundant proteins in chloroplasts, suggesting that a large amount of foreign proteins can be transported to chloroplasts via this route. Recent studies show that N-glycosylated chloroplast proteins use a special route to reach chloroplasts (Lehtimäki et al. 2015). These proteins are first targeted to the ER and then to the Golgi apparatus. In the ER, proteins are N-glycosylated, and N-glycans are modified to the complex type in the Golgi apparatus (Chen et al. 2004; Villarejo et al. 2005). Subsequently, N-glycosylated proteins are diverted from the Golgi apparatus to chloroplasts. The exact mechanism of trafficking from the Golgi to the chloroplast has not been elucidated. These data suggest that this pathway can be used for the import of N-glycosylated foreign proteins into chloroplasts for large-scale production. To use this pathway for recombinant protein production, proteins must be produced via nuclear transformation. However, this pathway has not yet been used for the import of foreign proteins into chloroplasts.

Conclusion and perspectives

Proteins are one of the most useful biomaterials. Advances in molecular biology and genetic engineering have enabled the use of proteins in a wide range of applications such as biopharmaceuticals, vaccines, antibodies, and industrial enzymes. With the sequencing of the entire genomes of numerous organisms, ranging from bacteria to human, the source of protein-coding genes has increased substantially. In addition, recombinant DNA technology has enabled the generation of new genes by de novo synthesis or by combining domains from naturally existing genes. To ensure that recombinant proteins are used in a wide range of applications, the method of protein production must be economical and highly scalable. For these reasons, plants have recently gained popularity as a platform for recombinant protein production. Chloroplasts are considered excellent subcellular organelles in plants for the production of recombinant proteins at high levels. Two main approaches have been developed for protein production in chloroplasts: chloroplast transformation, which has emerged as a highly cost-effective method, and nuclear transformation, followed by protein import into chloroplasts, which has not been explored extensively for the production of foreign proteins. Transient transgene expression via agroinfiltration has been shown to produce target proteins at extremely high levels in chloroplasts, suggesting that nuclear transformation-based expression is also an excellent option for protein production by import of translated recombinant proteins into chloroplasts. In contrast to transient expression, stable integration of a foreign gene in the nuclear genome has a clear disadvantage for protein production, because proteins are expressed from a single copy of the gene. However, it is still possible to achieve a high level of protein production from limited copy numbers of genes. 
For example, in Arabidopsis, only two copies of the RbcS gene have been used for the accumulation of RbcS protein to extremely high levels in chloroplasts. New innovative approaches are needed for improving the level of protein production in chloroplasts. One possible approach is to use RbcS as a fusion partner to drive the production of foreign proteins at high levels. Accumulation of foreign proteins has been improved by several fold using the full-length RbcS protein as a fusion partner rather than using only the transit peptide of RbcS (Hwang et al. 2017b). Based on the design principle of transit peptides, synthetic transit peptides with high efficiency in protein import into chloroplasts could be generated to deliver a large amount of protein into chloroplasts (Lee and Hwang 2018). In addition, the nuclear transformation approach could be explored for producing N-glycosylated proteins in chloroplasts. Although considerable research has demonstrated the production of recombinant proteins at high levels in chloroplasts by import from the cytosol, no recombinant proteins produced in chloroplasts have yet been commercialized. We expect that this promising approach will soon be used for the production of commercial protein products.

Author contribution statement

TMS, JSK, GC and IHH conceived and co-wrote this review. TMS drew Fig. 1.