Porcine epidemic diarrhea virus (PEDV), a member of the family Coronaviridae, causes acute diarrhea, dehydration, and high mortality in piglets [1, 2], which incurs significant economic losses in European and Asian swine industries [3–6]. PEDV has a single-stranded positive-sense RNA genome of approximately 28 kb that encodes four structural proteins (spike [S], envelope, membrane [M], and nucleocapsid [N] proteins) and four nonstructural proteins (1a, 1b, 3a, and 3b). Among the viral proteins, the S protein, which functions in receptor binding and virus-cell membrane fusion at the time of virus entry, represents an important site for virus neutralization [7–10]. It has been well documented that S proteins of several coronaviruses, such as mouse hepatitis virus (MHV), porcine transmissible gastroenteritis virus (TGEV), avian infectious bronchitis virus (IBV), and human severe acute respiratory syndrome coronavirus, play important roles in virus entry, neutralization, and pathogenicity [11–13]. The M protein of coronavirus is the most abundant component among the viral envelope proteins and plays important roles in virus assembly by interacting with S and N proteins [14–19]. The N protein of coronavirus, which has an RNA-binding property, packages viral genomic RNA into the nucleocapsid of virus particles [20, 21].

In general, live attenuated virus vaccines elicit protective immunity more efficiently than inactivated virus vaccines or subunit vaccines that consist of DNA or recombinant proteins. Likewise, the majority of commercially available PED vaccines are live attenuated strains of PEDV that have been established by growth adaptation of the virus through serial passages in vitro [22, 23]. While serial passages of other coronaviruses also often result in growth adaptation in new host cells and attenuation of virulence in their natural hosts, the molecular basis of virus adaptation and attenuation has not yet been elucidated.

Previously, we isolated the 83P-5 strain of PEDV from the small intestine of a piglet with diarrhea by using Vero cells, and the virus was subsequently subjected to serial passage in Vero cells [24]. The parent 83P-5 replicated only in Vero cells among the various cell lines we tested, whereas the virus at the 22nd passage replicated not only in Vero cells but also in MA104, MPK, and CPK cell lines. Thus, the serial passage of 83P-5 in Vero cells resulted in viral adaptation to growth in cultured cells. In this study, to better understand the mechanisms underlying in vitro growth adaptation and in vivo attenuation of PEDV, we further maintained 83P-5 in Vero cells up to the 100th passage and analyzed changes in the S, M, and N gene sequences and in the pathogenicity of the virus at the 34th, 61st, and 100th passage levels.

Full-length S, M, and N gene cDNAs were amplified from the parent and the 34th-, 61st-, and 100th-passaged 83P-5 by RT-PCR using the Transcriptor High Fidelity cDNA Synthesis kit (Roche Diagnostics, Switzerland) and the Expand High Fidelity Plus PCR System (Roche Diagnostics) with the primers shown in Table 1. The resultant amplicons were gel-purified, cloned into the pCR® 4-TOPO® vector (Invitrogen Corp., USA), and sequenced in both directions.

Table 1 Primer sequences for amplification of the S, M, and N genes and GenBank accession numbers of each passaged 83P-5 strain

The S genes of the parent and the 34th- and 61st-passaged 83P-5 viruses consisted of 4,152 nucleotides encoding 1,383 amino acids, whereas the 100th-passaged virus had an in-frame deletion of three nucleotides at positions 455–457 that resulted in a Tyr152 deletion (Table 2). Compared to the S gene of the parent 83P-5, the 34th-, 61st-, and 100th-passaged viruses had 6, 10, and 18 nucleotide changes, respectively. These mutations were predicted to change 5, 9, and 13 amino acids in the S protein of the 34th-, 61st-, and 100th-passaged 83P-5, respectively. The high nonsynonymous/synonymous ratio of nucleotide substitutions (5.00, 9.00, and 5.00 in the 34th-, 61st-, and 100th-passaged 83P-5, respectively) implied positive selection of the mutations in the S genes of 83P-5 during the consecutive passages in Vero cells.
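The arithmetic behind these ratios can be checked with a short calculation. The sketch below assumes that the reported value is a plain ratio of nonsynonymous to synonymous substitution counts and that every counted nucleotide change is a substitution; neither assumption is stated explicitly in the text, and the function name is illustrative. Under those assumptions, the totals and ratios given above imply splits of 5/1, 9/1, and 15/3. A plain count ratio of this kind is not a site-normalized dN/dS estimate.

```python
from fractions import Fraction

def split_from_ratio(total_changes, ratio):
    """Return the (nonsynonymous, synonymous) counts implied by a total
    number of substitutions and a nonsynonymous/synonymous count ratio.
    Assumes total = nonsyn + syn and ratio = nonsyn / syn."""
    ratio = Fraction(ratio)
    nonsyn = Fraction(total_changes) * ratio / (1 + ratio)
    syn = Fraction(total_changes) - nonsyn
    return nonsyn, syn

# (total nucleotide changes, reported ratio) per passage level, from the text
reported = {"34th": (6, 5), "61st": (10, 9), "100th": (18, 5)}

implied = {p: split_from_ratio(total, r) for p, (total, r) in reported.items()}
for passage, (nonsyn, syn) in implied.items():
    print(f"{passage}: {nonsyn} nonsynonymous, {syn} synonymous")
```

All three implied splits come out as whole numbers, which is at least consistent with the simple count-ratio reading.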

Table 2 Lengths and mutations in nucleotides and amino acids of the S gene of 83P-5 strain

The S protein of PEDV is a type I membrane glycoprotein composed of a signal peptide, S1 and S2 external domains, a transmembrane domain, and a C-terminal cytoplasmic domain. As shown in Fig. 1, the serial passage of 83P-5 up to the 100th passage was associated with amino acid substitutions in the signal peptide and the S1 and S2 external domains, yet no mutation was found in the transmembrane or cytoplasmic domains. Thus, the extracellular region of the S protein of 83P-5 was preferentially subjected to selection pressures for virus replication in Vero cells. Interestingly, all amino acid changes occurring at the 34th and 61st passages, except for the Arg1087Gly change at the 34th passage, were found to be carried over to the 100th passage. This implied that the mutations occurring at the 34th and 61st passages were strongly selected for and accumulated in the viral S gene during the course of consecutive passages. In addition to the mutations carried over from the 34th- and 61st-passaged viruses, the 100th-passaged 83P-5 had five mutations specific to this virus, including a Tyr152 deletion. As a result, the 100th-passaged 83P-5 showed mutations in close proximity to each other in the S1 domain (Asn112/Ser114, Tyr152 deletion/Leu153, and Lys378/Ser379) and in the S2 domain (Leu963/Thr965/His973). These closely spaced mutations may influence the structure and/or function of the S protein more strongly than the other sporadic mutations.

Fig. 1 Schematic representation of mutations in the S protein of the parent, 34th-, 61st-, and 100th-passaged 83P-5. Signal peptide (SP), transmembrane domain (TM), and neutralizing epitopes (SS2, SS6, and 2C10) are shown as gray boxes. Amino acid numbering corresponds to the parent virus. Dash indicates deletion

Previous studies of the S protein of PEDV have identified the neutralizing SS2 (a.a. position 748–755) and SS6 (a.a. position 764–771) epitopes in the S1 domain [10] and the 2C10 epitope (a.a. position 1368–1374) in the cytoplasmic domain [7]. We found no mutations in these neutralizing epitopes during the serial passages of 83P-5 in Vero cells (Figs. 1, 2).

Fig. 2 Alignment of the deduced amino acid sequences of the S protein of the parent and passaged PEDV 83P-5 strains with those of the parent and attenuated DR13 strains [27]. Dots represent amino acids that are identical to those in the parent 83P-5. Potential N-linked glycosylation sites of N-{P}-[ST] are underlined. Boxes indicate the signal peptide (position 1–21), transmembrane domain (1334–1356), and neutralizing epitopes (SS2, 748–755; SS6, 764–771; and 2C10, 1368–1374)

The S protein of PEDV is heavily glycosylated, and the parent 83P-5 contains 30 potential N-linked glycosylation sites defined by N-{P}-[ST], where {P} is any amino acid except proline. The predicted N-linked glycosylation sites in the S protein of the parent 83P-5 were well conserved in the 34th-, 61st-, and 100th-passaged 83P-5, except that the N-linked glycosylation site at Asn110 of the parent virus was shifted to Asn112 in all of the passaged viruses by the closely spaced Thr112Asn and Asn114Ser mutations (Fig. 2). The shift of this N-linked glycosylation site to a nearby position, together with the conservation of the other N-linked glycosylation sites, suggests that the glycan motif in the S protein of PEDV is important for protein conformation and/or function, as has been shown previously for S proteins of other coronaviruses [25, 26].
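The N-{P}-[ST] motif described above can be located with a regular expression. The following is a minimal sketch in Python. The seven-residue peptides are hypothetical toy sequences, not the actual 83P-5 S protein; they were constructed only so that a Thr-to-Asn and an Asn-to-Ser substitution reproduce the kind of site shift described for Asn110 and Asn112. The function name and the `first_residue` numbering offset are likewise illustrative assumptions.

```python
import re

# N, then any residue except P, then S or T. The lookahead keeps the match
# one residue wide so that overlapping motifs are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")

def sequon_positions(protein, first_residue=1):
    """Return the residue numbers of Asn in each N-{P}-[ST] motif.
    first_residue is the residue number assigned to protein[0]."""
    return [m.start() + first_residue for m in SEQUON.finditer(protein)]

# Hypothetical fragments numbered from residue 109 (toy data, not real sequence)
parent_toy = "GNVTGNA"    # Asn110-Val-Thr112 forms a motif
passaged_toy = "GNVNGSA"  # Thr112Asn and Asn114Ser: Asn112-Gly-Ser114 forms a motif

print(sequon_positions(parent_toy, first_residue=109))
print(sequon_positions(passaged_toy, first_residue=109))
```

In the toy parent fragment the only motif starts at residue 110, and in the toy passaged fragment it starts at residue 112; the first substitution removes the Thr that completed the original motif and the second supplies the Ser that completes the new one.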

Previously, Park et al. [27] reported the S gene sequences of a live attenuated DR13 vaccine strain of PEDV that was established by a serial passage of the parental virus in Vero cells. To examine whether the 83P-5 and DR13 viruses were subjected to a similar selection pressure during serial passage in Vero cells, the S protein sequences of the parent and the serially passaged 83P-5 were aligned with those of the parent and the attenuated DR13 (Fig. 2). The result revealed a remarkable sequence similarity between the 100th-passaged 83P-5 and the attenuated DR13. Among fourteen amino acid changes in the 100th-passaged 83P-5 compared to its parental virus, thirteen amino acid changes, including the Tyr152 deletion, were also present in the attenuated DR13. The parent strains of 83P-5 and DR13 had forty-seven amino acid differences; among them, only Asn112, Ser379, and Ala474 in the parent DR13 remained in the attenuated DR13. Thus, the S protein of attenuated DR13 appeared to more closely resemble the 83P-5 strains, in particular the 100th-passaged 83P-5, than the parent DR13.

To further define the genetic relationships of the 83P-5 and DR13 parental strains and their serially passaged derivatives, phylogenetic analysis was performed together with the full-length S gene sequences of PEDV that were available in the GenBank database. The phylogenetic tree topology formed five distinct genetic groups (groups I–V in Fig. 3), and the parent 83P-5 and parent DR13 belonged to the genetically distinct groups III and IV, respectively. With respect to the 83P-5 strains, the parent and the 34th-, 61st-, and 100th-passaged viruses were clustered into group III and ranked in order of their passage levels. As predicted, the attenuated DR13 belonged to group III and was most closely related to the 100th-passaged 83P-5.

Fig. 3 Phylogenetic analysis of the full-length S gene sequence of PEDV. Neighbor-joining tree was generated with 1,000 bootstrap replications using Clustal X v1.83 [39] and viewed with MEGA v4.0.2. The phylogenetic tree was rooted with an out-group TGEV strain (accession number 811785). GenBank accession numbers of reference PEDVs for the S gene are AY167585 (Chinju99), AF353511 (CV777), DQ462404 (DR13 attenuated), DQ862099 (DR13 parent), EU031893 (DX), AY653204 (JS-2004-2), AB548622 (Kawahira), GU180142 (KNU-0801), GU180143 (KNU-0802), GU180144 (KNU-0901), GU180145 (KNU-0902), GU180146 (KNU-0903), GU180147 (KNU-0904), GU190148 (KNU-0905), DQ985739 (LJB/03), EF185992 (LZC), AB548624 (MK), AB548623 (NK), and AF500215 (Spk1)

We also cloned and sequenced the M and N genes of each passaged 83P-5 strain. The M genes of the parent and the 34th-, 61st-, and 100th-passaged 83P-5 viruses consisted of 681 nucleotides encoding 226 amino acids. Compared to the M gene of the parent 83P-5, the 61st- and 100th-passaged viruses had one A to G nucleotide change at position 499 that substituted Asn167 with Asp in an amphipathic domain of the M protein (data not shown), whereas the 34th-passaged virus had no nucleotide change. In the N genes, consisting of 1,326 nucleotides encoding 441 amino acids, one silent mutation was found at position 102 of the 100th-passaged virus (data not shown).
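The relationship between the nucleotide positions and the amino acid numbers quoted for the S, M, and N genes can be verified with simple codon arithmetic. The sketch below is illustrative only: the function names are hypothetical, positions are taken to be 1-based from the first nucleotide of each open reading frame, and each reported gene length is assumed to include one stop codon.

```python
def codon_of(nt_position):
    """Map a 1-based nucleotide position in an ORF to
    (codon number, position within the codon), both 1-based."""
    codon = (nt_position - 1) // 3 + 1
    within = (nt_position - 1) % 3 + 1
    return codon, within

def encoded_length(orf_nt):
    """Amino acids encoded by an ORF, assuming its length includes one stop codon."""
    return orf_nt // 3 - 1

print(codon_of(499))                           # M gene A-to-G change
print(codon_of(102))                           # N gene silent change
print([codon_of(p) for p in (455, 456, 457)])  # S gene 3-nt deletion
print([encoded_length(n) for n in (4152, 681, 1326)])

# Both Asn codons become Asp codons when the first base changes from A to G.
ASN, ASP = {"AAT", "AAC"}, {"GAT", "GAC"}
asn_to_asp = all("G" + codon[1:] in ASP for codon in ASN)
print(asn_to_asp)
```

Position 499 falls on the first base of codon 167, where an A to G change converts either Asn codon into an Asp codon, and position 102 is the third base of codon 34, where silent changes are common. The deleted S gene positions 455–457 span the last two bases of codon 152 and the first base of codon 153, which is compatible with the paired Tyr152 deletion/Leu153 change reported for the 100th-passaged virus.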

It has been shown that serial passage of PEDV and other coronaviruses often results in attenuation of virus virulence. To investigate whether the serial passages of 83P-5 had influenced the viral pathogenicity, PEDV-seronegative 12-week-old pigs were inoculated orally with the 34th-, 61st-, or 100th-passaged viruses. All pigs that were inoculated with the 34th- or 61st-passaged 83P-5 (n = 3 each) showed watery diarrhea or loose stool by 5 days post-infection (Table 3). In contrast, none of the pigs infected with the 100th-passaged 83P-5 (n = 4) showed diarrhea within 14 days post-infection, indicating that the 83P-5 virus had been attenuated somewhere between the 61st and 100th passage levels.

Table 3 Positive pigs with diarrhea and viral RNA in fecal swab samples

We collected fecal swab samples daily from the infected pigs to analyze virus excretion by nested RT-PCR. The first RT-PCR was performed using the Titan One Tube RT-PCR kit (Roche Diagnostics) with a set of the S gene primers: 5′-TTCTGAGTCACGAACAGCCA-3′ (forward) and 5′-CATATGCAGCCTGCTCTGAA-3′ (reverse). The second PCR using BioTherm Taq DNA Polymerase (GeneCraft, Germany) was performed with a forward primer 5′-TATTACTGTCTCTGCGGCTTT-3′ and the reverse primer of the first RT-PCR. The resultant amplicons were visualized by ethidium bromide staining after electrophoresis in 1.5% agarose gels. Although one out of four pigs infected with the 100th-passaged 83P-5 did not show fecal virus excretion, all other pigs, regardless of the virus inoculum, showed sporadic virus excretion in the feces. Since the PCR analysis of virus excretion is not quantitative, it remains to be determined if there is a significant difference in the virus replication in vivo between the 34th-, 61st-, and 100th-passaged viruses.

In summary, the serial passage of 83P-5 in Vero cells resulted in viral growth adaptation in vitro and attenuation of virulence in vivo, and these phenotypic changes were associated with strong selection on the viral S gene. One-directional and cumulative selective pressure on the S gene of 83P-5 in Vero cells was evident from the fact that almost all mutations occurring at the early and intermediate passages had been carried over to the 100th-passaged attenuated virus. Given the significance of the S protein for PEDV entry and pathogenicity, the mutations that accumulated in the S gene of 83P-5 during the serial passage may contribute to the phenotypic change of the virus. For example, the growth adaptation of 83P-5 in Vero cells had occurred as early as the 22nd passage, so the four amino acid changes (Asn112, Ser114, Lys378, and Ser379) in the S gene at the 34th passage that were carried over through the 100th passage may be responsible for the in vitro adaptation. Similarly, the attenuation of 83P-5 at the 100th, but not the 61st, passage may be driven by the five newly acquired amino acid changes (Thr2, Tyr152 deletion, Leu153, Met774, and Leu963) in the S gene of the 100th-passaged virus.

The remarkable similarity of the S genes between the 100th-passaged 83P-5 and the attenuated DR13 is unusual with respect to previous studies of coronavirus attenuation. Analyses of various IBV strains attenuated in embryonated eggs have shown no common mutations in the S gene [28–31]. In two attenuated swine TGEV strains that were established from different virus strains, only one common amino acid change was found in the S gene [32]. The evolution of the closely related 100th-passaged 83P-5 and attenuated DR13 may be explained by a strong selection pressure on the S gene of PEDV during serial passages in Vero cells, although there are likely to be many other pathways leading to attenuation of coronaviruses.

The 100th-passaged 83P-5 has been licensed for use as an attenuated PEDV vaccine in Japan (Nisseiken Co. Ltd., Japan). As a first step toward better understanding the mechanisms underlying the attenuation of 83P-5, this study documented the positive selection and accumulation of mutations in the viral S gene during the serial passages in Vero cells. The extensive mutations of the S gene contrasted with the strong conservation of the M and N genes of 83P-5 during the serial passage. While the single substitution of Asn167 with Asp found in the M protein of the 61st- and 100th-passaged viruses may influence the virus phenotypes, its influence is likely to be smaller than that of the extensive S gene mutations.

Data indicating pivotal roles of the S protein in determining phenotypes of coronaviruses have accumulated; however, it remains to be determined whether the mutations in the S gene of 83P-5 actually contribute to the viral attenuation. In fact, it has recently been shown that mutations in the nonstructural proteins of MHV [33], severe acute respiratory syndrome coronavirus [34], and IBV [35] play an important role in viral attenuation. Furthermore, deletion of group-specific accessory genes has been shown to confer attenuation of feline infectious peritonitis virus [36], MHV [37], and TGEV [38]. Thus, a mutation in any gene of the ~30 kb genome of coronaviruses that affects the virus life cycle in natural hosts can potentially result in attenuation of pathogenicity. Further studies will be needed to define the genetic mechanisms that underlie the attenuation of PEDV and other coronaviruses.