Introduction

Porcine epidemic diarrhea (PED) is a devastating swine disease characterized by acute enteritis and lethal watery diarrhea, followed by dehydration, and it frequently leads to high mortality in piglets [1–3]. On most affected farms, the disease was first detected in farrowing barns and was followed by 100 % mortality of newborn piglets. The disease was first reported in England in 1971 [4], and outbreaks have since been reported frequently in Europe and Asia [5–7]. Since the 1990s, the disease has broken out continuously on pig farms in 26 major cities and provinces in China, causing tremendous economic losses to the swine industry [8].

The causative agent of PED, porcine epidemic diarrhea virus (PEDV), was first described in 1978 [9]. A cell culture system was subsequently developed for PEDV isolation and propagation [10]. PEDV is a member of the genus Coronavirus in the family Coronaviridae. Its genome is a positive-sense, single-stranded RNA of 27–32 kb that is transcribed into several subgenomic mRNAs and encodes structural and non-structural proteins in a conserved order [11]. The polymerase gene, which covers 70 % of the genome, encodes the replicase polyproteins. The genes for the major structural proteins, including the membrane protein (M), the phosphorylated nucleocapsid protein (N), the small membrane protein (sM), and the spike protein (S), are located downstream of the polymerase gene [11].

The S glycoprotein forms the large surface projections of the virion and plays an important role in the attachment of viral particles to the host cell receptor [12–14]. The S glycoprotein is therefore a primary target for the development of vaccines against PEDV. As the major envelope glycoprotein of the virion, it is also an important viral component for understanding the genetic relationships among PEDV strains and the epidemiological status of PEDV in the field [6, 15, 16].

The sM gene is the only accessory gene of PEDV. Accessory genes are generally maintained, and their loss mainly results in attenuation of the virus in the natural host [17]. As with TGEV, the virulence of PEDV can be reduced by altering the accessory gene region [18]; variation in this region could therefore serve as a marker of virus attenuation [19] and as a valuable tool for studying the molecular epidemiology of PEDV [8].

In China, PEDV was first isolated in 1982 [20]. Its prevalence has been a major problem for the swine industry in recent years, although periodic vaccination has been applied nationwide to prevent the disease [21]. A comprehensive study is therefore needed to better understand the genetic relationships among strains, which would help to identify the reasons for the continuing outbreaks of PEDV and to develop new strategies for controlling and preventing PEDV infection. In this study, we investigated the molecular epidemiology of Fujian PEDV field samples and analyzed their phylogenetic relationships with PEDV reference strains. The study focused mainly on the S1 region and the sM gene because of their vital roles in viral function and their higher variation.

Materials and methods

Sample collection

Portions of intestine or stool specimens were collected individually from piglets with acute enteritis and watery diarrhea on 3 large swine farms in Fujian province in 2011; the samples were designated P55, P68, and F422, respectively. Intestinal samples were homogenized in 9 volumes of phosphate-buffered saline (PBS). The suspensions were then vortexed and centrifuged for 10 min at 1,700×g. The supernatants were stored at −80 °C until use.

RT-PCR, DNA cloning and sequence analysis

To determine the sequences of the PEDV samples, primers were designed based on the sequences of reference PEDV strains (Table 1). Because the full S gene is long, only part of it, the S1 region, was amplified for investigation. In brief, viral RNA was extracted from the supernatants of the homogenized samples with RNAiso Plus reagent (Takara, Japan) according to the manufacturer’s instructions. RT-PCR was conducted separately for each fragment from the isolated RNA using the Primescript® One Step RT-PCR Kit Ver.2 (Takara, Japan) according to the manufacturer’s protocol under the following conditions: reverse transcription at 50 °C for 30 min; denaturation at 94 °C for 2 min; and 30 cycles of denaturation at 94 °C for 30 s, annealing at 55 °C for 30 s, and extension at 72 °C for 1 min.
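The cycling conditions above can be written down as a small data structure, which makes the programmed hold time easy to check. The following is a minimal sketch only: the names `Step`, `PRE_CYCLE`, `CYCLE`, and `total_hold_minutes` are hypothetical, and the calculation ignores ramp times and assumes no final extension step, since none is stated.

```python
from typing import NamedTuple


class Step(NamedTuple):
    name: str
    temp_c: int    # block temperature in degrees Celsius
    seconds: int   # hold time in seconds


# Steps run once before cycling (values taken from the conditions in the text).
PRE_CYCLE = (
    Step("reverse transcription", 50, 30 * 60),
    Step("initial denaturation", 94, 2 * 60),
)

# Steps repeated in every PCR cycle.
CYCLE = (
    Step("denaturation", 94, 30),
    Step("annealing", 55, 30),
    Step("extension", 72, 60),
)

N_CYCLES = 30


def total_hold_minutes(pre=PRE_CYCLE, cycle=CYCLE, n_cycles=N_CYCLES):
    """Sum of programmed hold times in minutes (ramp times not included)."""
    seconds = sum(s.seconds for s in pre) + n_cycles * sum(s.seconds for s in cycle)
    return seconds / 60


print(total_hold_minutes())  # 32 min before cycling plus 30 cycles of 2 min
```

Under these assumptions the programmed hold time is 92 min per run.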

Table 1 Amplification primers for the S1 and sM genes

The RT-PCR products were analyzed by 1.5 % agarose gel electrophoresis and visualized by ultraviolet illumination after ethidium bromide staining. Bands of the expected size for each gene were excised, and the amplified DNA was purified using a QIAquick Gel Extraction Kit (QIAGEN, Germany) according to the manufacturer’s instructions and then sequenced by Takara.

Multiple alignments and phylogenetic analysis

The nucleotide and deduced amino acid sequences of the S1 and sM genes of the PEDV samples were used independently for sequence alignment. Multiple sequence alignments were generated with the ClustalW method in Megalign 4.0 [22]. Phylogenetic trees were constructed from the deduced amino acid sequences by the neighbor-joining method with bootstrap analysis.
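To illustrate the neighbor-joining step in general terms, the sketch below computes pairwise p-distances from aligned sequences and then joins nodes by the standard neighbor-joining criterion. It is not the Megalign implementation: the function names are hypothetical, the p-distance is an assumed distance measure (the distance model is not stated here), branch lengths are not reported, and bootstrap resampling is left out for brevity.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing residues over columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(aligned):
    """aligned: dict mapping strain name to an aligned sequence of equal length."""
    d = {name: {name: 0.0} for name in aligned}
    for i, j in combinations(aligned, 2):
        d[i][j] = d[j][i] = p_distance(aligned[i], aligned[j])
    return d


def neighbor_joining(dist):
    """Return the clades (frozensets of leaf names) joined by NJ, in join order.

    dist is a symmetric dict-of-dicts including the zero diagonal.
    Joining stops when three nodes remain, which fixes the unrooted topology.
    """
    # Every active node is keyed by the set of leaves it contains.
    d = {frozenset([i]): {frozenset([j]): v for j, v in row.items()}
         for i, row in dist.items()}
    clades = []
    while len(d) > 3:
        n = len(d)
        r = {i: sum(d[i].values()) for i in d}  # row sums (diagonal adds 0)
        # Pick the pair minimising Q(i, j) = (n - 2) * d(i, j) - r(i) - r(j).
        i, j = min(combinations(d, 2),
                   key=lambda p: (n - 2) * d[p[0]][p[1]] - r[p[0]] - r[p[1]])
        u = i | j
        # Distance from the new node u to every remaining node k.
        new_row = {k: 0.5 * (d[i][k] + d[j][k] - d[i][j])
                   for k in d if k not in (i, j)}
        for k in (i, j):
            del d[k]
        for row in d.values():
            row.pop(i, None)
            row.pop(j, None)
        for k, v in new_row.items():
            d[k][u] = v
        new_row[u] = 0.0
        d[u] = new_row
        clades.append(u)
    return clades
```

A bootstrap analysis would repeat `distance_matrix` and `neighbor_joining` on alignments resampled column-wise with replacement and count how often each clade recurs.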

Protein characterizations prediction

In this study, the characteristics of the deduced amino acid sequences, including the pI value, antigenic peptides, hydrophobic positions, and transmembrane motifs, were analyzed with the DNAMAN program.
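As an illustration of how hydrophobic regions can be located in a protein sequence, the sketch below averages Kyte-Doolittle hydropathy values over a sliding window. This is an assumption for illustration: the algorithm and parameters used by the analysis program are not stated here, and the window size, the threshold, and the function names are hypothetical choices.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Mean hydropathy of each full window along an ungapped protein sequence."""
    scores = [KD[aa] for aa in seq.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def hydrophobic_regions(seq, window=9, threshold=1.6):
    """Merge windows whose mean is >= threshold into 1-based inclusive spans."""
    regions = []
    for i, score in enumerate(hydropathy_profile(seq, window)):
        if score < threshold:
            continue
        start, end = i + 1, i + window
        if regions and start <= regions[-1][1] + 1:
            # Overlapping or adjacent to the previous span: extend it.
            regions[-1] = (regions[-1][0], end)
        else:
            regions.append((start, end))
    return regions
```

For example, `hydrophobic_regions("DDDDD" + "L" * 9 + "DDDDD")` reports a single span around the leucine stretch; a longer window and a stricter threshold would be a more conservative choice for proposing transmembrane segments.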

Results

Sequence analysis

Sequence analysis of S1 region

The nucleotide sequences of the S1 region are 2,024 bp for P55, 2,032 bp for P68, and 2,036 bp for F422 (Accession numbers: JQ723739, JQ723740, and JQ723741). The S1 protein of P55 is 620 aa in length with a predicted Mr of 68.1 kDa, and the S1 proteins of P68 and F422 are 522 aa in length with a predicted Mr of 57.2 kDa. Twelve homologous sequences sharing 99 % similarity were found in GenBank (Table 2). However, mutations occurred frequently in the S1 gene. The alignment analysis indicated that five sequences, P68, F422, CH/FJND-3/2011 (Accession number: JN381492), KNU-0901 (Accession number: GU180144), and CNU-091222-02 (Accession number: JN184635), fell into one group (Group 1, Fig. 1) with the same nucleotide substitutions at 28 positions as well as one single-site insertion and one single-site deletion. Moreover, all five sequences except KNU-0901 formed one subgroup with 8 substitutions and 2 insertions in common. P68, F422, and CNU-091222-02 were closely related, sharing 5 mutations. A further 2 mutations (A/C→T at 409, C→T at 463) were found in P68 and one mutation (T→C at 1717) in F422. Interestingly, most of the mutations were observed in the N-terminal region. These variations in P68 and F422 probably reflect mutation of the gene in field strains. P55 and DR13 formed another group (Group 2, Fig. 1) with 8 specific nucleotide changes located in the middle of the S1 gene. Interestingly, these changes were interchanges between C/G and A/T (C/G↔A/T).
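Tallies of substitutions, insertions, and deletions such as those above can be derived from a pairwise alignment. The sketch below is one minimal way to do so, assuming two pre-aligned nucleotide strings with `-` for gaps and only unambiguous A/C/G/T bases; the function names are hypothetical, and positions are counted along the ungapped reference.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(ref_base, alt_base):
    """Transition = purine<->purine or pyrimidine<->pyrimidine; otherwise transversion."""
    pair = {ref_base, alt_base}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def compare_aligned(ref, query):
    """List (ref_position, kind, ref_base, query_base) for every differing column.

    ref_position is 1-based along the ungapped reference; an insertion is
    reported at the last reference position before it.
    """
    if len(ref) != len(query):
        raise ValueError("aligned sequences must have equal length")
    events = []
    pos = 0
    for r, q in zip(ref.upper(), query.upper()):
        if r != "-":
            pos += 1
        if r == q:
            continue  # identical base, or a gap in both sequences
        if r == "-":
            events.append((pos, "insertion", r, q))
        elif q == "-":
            events.append((pos, "deletion", r, q))
        else:
            events.append((pos, classify(r, q), r, q))
    return events
```

Applying `compare_aligned` to each sample against a chosen reference and intersecting the resulting event lists gives the mutations that a group of strains has in common.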

Table 2 Reference PEDV strains with 99 % similarities for the S1 and sM genes
Fig. 1 Alignment of amino acid sequences of S1 proteins of Fujian PEDV strains and reference strains. The asterisks represent the segments not shown in this figure. The dashes represent deleted amino acids. The shading indicates the unique substitutions of chosen strains. The boxes indicate the unique deletions of chosen strains

The grouping into Group 1 and Group 2 was verified using the deduced amino acid sequences. The Group 1 sequences had a long deletion at the beginning of the sequence followed by a short deletion. The Group 2 sequences had a deletion at position 157 and a substitution at position 329 (S→F). In terms of potential asparagine (N)-linked glycosylation sites, only 11 sites were found in Group 1, far fewer than in Group 2 (14 for P55 and 15 for DR13). Unlike the result of Lee et al. [23], neither GTAAAC nor a similar sequence was found upstream of the initiator ATG of the S gene in any of the Chinese strains or the English strain (CV777).
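Potential N-linked glycosylation sites are commonly predicted by scanning for the sequon N-X-S/T, where X is any residue except proline. The sketch below applies that general rule to an ungapped protein sequence; it is an illustration only, the function name is hypothetical, and the prediction tool actually used may apply additional criteria.

```python
import re

# Lookahead so that overlapping sequons (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def n_glycosylation_sites(protein):
    """Return 1-based positions of the Asn residue in each N-X-S/T sequon (X != P)."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


print(n_glycosylation_sites("MNNTSAPNPTQNGS"))  # NPT is skipped because X is proline
```

Comparing `len(n_glycosylation_sites(seq))` across strains gives per-strain counts of potential sites of the kind compared above.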

Sequence analysis of sM gene

The sM genes of the 3 Fujian PEDV field samples were sequenced (Accession numbers: JQ723734 for P55, JQ723732 for P68, and JQ723733 for F422) and compared with those of chosen PEDV reference strains sharing 99 % similarity. The sM genes of P68 and F422 have 1,050 nucleotides and encode a protein of 224 amino acids with a predicted Mr of 25.3 kDa. Unlike the other strains and samples, the sM gene of P55 has 996 nucleotides, with a 50-nucleotide deletion spanning positions 412 to 465, and encodes a protein of 128 amino acids with a predicted Mr of 14.4 kDa. P55 was closely related to CH/GSJIII/07 (Accession number: GU372743), whose sM protein has only 92 amino acids. Both carry the same deletion, but P55 has distinctive segment insertions that differ from all the other strains (Fig. 2).

Fig. 2 Alignment of amino acid sequences of sM proteins of Fujian PEDV strains and reference strains. The dashes represent deleted amino acids. The boxes indicate the unique deletions of P55 and CH/GSJIII/07

P55 and F422 have 7 and 8 unique point mutations, respectively. However, apart from the long deletion in P55, these mutations changed only one amino acid (F→L at 124 in F422, Fig. 2). In addition, P55 has one fewer asparagine (N)-linked glycosylation site than the others. All the PEDV strains, including the 3 Fujian samples but excluding the SM98 strain (Accession number: GU937797), have a conserved sequence (CTAGAC) 46 nucleotides upstream of the initiator ATG.

Phylogenetic analysis

In order to analyze the phylogenetic relationships between the 3 Fujian samples and other PEDV strains isolated in various regions worldwide, we constructed 2 phylogenetic trees using the deduced amino acid sequences of S1 and sM, respectively (Fig. 3).

Fig. 3 Phylogenetic relationships of Fujian PEDV isolates and other strains based on comparisons of S1 and sM amino acid sequences. The GenBank accession numbers for these genes are listed in Table 2. a Tree based on amino acid sequences of the S1 protein. b Tree based on amino acid sequences of the sM protein

The phylogeny based on the S1 glycoprotein clustered the strains into 3 major groups: one large mixed group (Group 1) and 2 Chinese groups (Groups 2 and 3). P68 and F422 formed a subgroup (Subgroup 4) distinct from the other strains. The subgroup comprising DR13 and P55 (Subgroup 1) was located in Group 1. These results were consistent with the findings of the sequence analysis.

In contrast to the S1 results, phylogenetic analysis based on the sM protein fragment divided the strains into 2 groups, one of which included P55 and CH/GSJIII/07 (Fig. 3b). This grouping might be explained by the deletions in P55 and CH/GSJIII/07. F422 was closely related to DX and formed a subgroup with it, while P68 formed another subgroup.

Protein characterizations

The characterization of the S1 protein supported the results of the phylogenetic analysis (Table 3). The characteristics of P55 and DR13, except for the number of antigenic peptides, differed greatly from those of the other strains. Strains F422, P68, CH/FJND-3/2011, CNU-091222-02, and CV777 shared similar antigenic peptides but showed one unique difference in amino acid substitution. In addition, the N-terminal hydrophobic region (underlined) of CV777 and CNU-091222-02 differed from that of the other 4 strains, which might be important for protein structure and function. The close relationship of P68, F422, and CH/FJND-3/2011 was consistent with the results of the sequence analysis. By contrast, DR13 and P55 were not closely related in these characteristics: they had only 4 hydrophobic regions in common, together with the same N-terminal hydrophobic region (Table 3, underlined).

Table 3 Predicted protein characterizations of the deduced protein of S1 gene

For the sM protein, the pI varied from about 6.5 to 11 among the 6 chosen strains (Table 4), indicating potential variation of the protein. Notably, F422 and DX had identical characteristics except for one hydrophobic region. P68 and CV777 were less similar to each other than DX and F422 were: they differed by a small pI variation, one variation in the hydrophobic and transmembrane segments, and amino acid mutations at 3 positions (Table 4, underlined). Consistent with the phylogenetic analysis, the characteristics of P55 and CH/GSJIII/07 were similar to each other and extremely different from those of the other strains. Since sM determines the virulence of PEDV [24], our results should benefit research on the variation of PEDV virulence in China.

Table 4 Predicted protein characterizations of the deduced protein of sM gene

Discussion

The diversity in S1 and sM among the strains was significant. Although there were many mutations in these segments, the first unique characteristic was the deletion in the sM genes of CH/GSJIII/07 and P55. Compared with CH/GSJIII/07, P55 was more variable because of an insertion within the C-terminal domain, unique point mutations, and fewer asparagine (N)-linked glycosylation sites. The long deletion in the sM gene, which was also found in the field strain DR13 (Accession number: JQ023161) and its attenuated strain (Accession number: JQ023162) [25], led to reduced pathogenicity and induced a protective immune response in pigs [24]. Remarkably, a similar deletion was found in P55, and no significant mutations were found in the other structural protein genes, including M, N (data not shown), and S; whether this mutated strain has reduced pathogenicity needs further study. The loss of sM generally results in attenuation of the virus in the natural host. However, we found that PEDV with the long sM deletion still caused typical clinical signs of PEDV infection; the pathogenic mechanism of the virus and the origin of the sM mutant strain therefore also need to be clarified. In general, variation in the sM gene may determine the different epidemiological and infection mechanisms of the strain.

In contrast to the diverse variation in the sM gene, the S1 regions of the 3 Fujian samples shared unique mutations. Coronaviruses have transcription regulatory sequences (TRSs) that include a highly conserved core sequence, 5′-CUAAAC-3′, or a related sequence upstream of the encoded genes [26]. Although the sequences ATAAAC, AGAAAC, and CTAGAC were found upstream of the initiator codons of the M, N (data not shown), and sM genes, respectively, the sequence GTAAAC reported in the Korean strains [23] was not found upstream of the S1 gene of the Fujian PEDV samples. However, the neutralizing epitope in S1, which is responsible for mediating the production of antiviral neutralizing antibodies, was conserved.
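A search for a TRS-like hexamer upstream of an initiator codon can be sketched as a simple string scan. The example below is illustrative only: the function name, the 100-nt search window, and the convention of measuring distance from the first base of the motif to the A of the ATG are assumptions, and the test sequence is synthetic rather than taken from any strain.

```python
def motifs_upstream(seq, atg_index, motif, max_upstream=100):
    """Return distances (in nt) from each motif start to the initiator ATG.

    seq          cDNA or RNA sequence (U is treated as T)
    atg_index    0-based index of the A of the initiator ATG
    motif        hexamer to search for, e.g. "CTAGAC" or "CUAAAC"
    max_upstream how far upstream of the ATG to search (assumed window)
    """
    seq = seq.upper().replace("U", "T")
    motif = motif.upper().replace("U", "T")
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index does not point at an ATG codon")
    hits = []
    i = seq.find(motif, max(0, atg_index - max_upstream))
    # Keep only motifs that end at or before the ATG.
    while i != -1 and i + len(motif) <= atg_index:
        hits.append(atg_index - i)
        i = seq.find(motif, i + 1)
    return hits


# Synthetic example: the motif starts 46 nt upstream of the ATG.
demo = "GG" + "CTAGAC" + "T" * 40 + "ATGAAA"
print(motifs_upstream(demo, demo.index("ATG"), "CTAGAC"))
```

Running the same scan with `"CUAAAC"` or `"GTAAAC"` on a real upstream region would show whether the canonical core or the variant reported for the Korean strains is present.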

Phylogenetic trees based on the protein sequences were constructed to analyze the relationships between the Fujian samples and the other strains. Phylogenetic analysis based on the sM protein indicated that strain CH/GSJIII/07 was relatively close to P55 but distantly related to Group 1. However, Park et al. [27] placed CH/GSJIII/07 in Group 1, which differs from our result. The discrepancy might arise because nucleotide sequences were used in the previous study, whereas amino acid sequences were used in this study. The positions of P68 and F422 in the tree based on the S1 protein suggested high variation among the Fujian samples. DR13 and P55 were within the same subgroup. As DR13 was used to develop the PEDV vaccine in Korea [28], it would be interesting to know whether P55 could be used to develop a PEDV vaccine in China.

The protein characterization predictions supported the relationships obtained from the sequence and phylogenetic analyses and revealed specific differences between closely related strains, which might be useful in further functional exploration. Notably, the unique hydrophobic region in the N-terminus of the S1 protein of CV777, CNU-091222-02, DR13, and P55 might be related to variation in protein structure and function.

In conclusion, the Fujian PEDV samples were classified into different groups. Both P68 and F422 were closely related to strains isolated in China but still had some unique characteristics. P55 had the highest variation and a close phylogenetic relationship with the field strain CH/GSJIII/07.