Sequence and phylogenetic analysis of nucleocapsid genes of porcine epidemic diarrhea virus (PEDV) strains in China

Porcine epidemic diarrhea virus (PEDV) causes acute diarrhea and dehydration with high mortality rates in swine, and it has become increasingly problematic in China. Because the nucleocapsid (N) protein is highly conserved, it is a candidate protein for early diagnosis and vaccine development. In this study, the N genes of 15 PEDV strains were amplified by RT-PCR, cloned into the pMD19-T vector, sequenced, and compared to each other as well as to PEDV reference strains. The N genes of the Chinese PEDV strains consist of 1326 nucleotides and encode a 441-aa-long peptide. The nucleotide sequences of the fifteen PEDV strains in our study were 96.1-100 % identical to each other, and the deduced amino acid sequences were 94.8-100 % identical. Sequence comparison with other PEDV strains selected from GenBank revealed that their nucleotide sequences were 94.2-99.7 % identical to those of the Chinese PEDV strains, and their deduced amino acid sequences were 94.1-99.5 % identical. In addition, the fifteen strains showed a high degree of nucleotide sequence identity to the early domestic strains (98.4-99.7 %), with the exception of the LZC strain, but less sequence identity to the vaccine strain (CV777) used in China (94.7-97.7 %). Phylogenetic analysis showed that the Chinese PEDV strains form a separate cluster that includes three early domestic strains (JS-2004-02, LJB/03 and DX) but differ genetically from the vaccine strain (CV777) and the early Korean strains (Chinju99 and SM98).


Introduction
Porcine epidemic diarrhea (PED), caused by porcine epidemic diarrhea virus (PEDV), is an acute, highly contagious, and devastating enteric disease that is characterized by severe enteritis and diarrhea, with high mortality rates in suckling pigs [1]. PED was first reported in England in 1971 [2]. Since then, PED has occurred in most swine-raising countries in Europe, as well as in China, Korea, Thailand, and Japan [3][4][5][6]. The disease is becoming a major concern, especially in Asia, where outbreaks are often more acute and severe than those observed in Europe [7,8]. Since the beginning of October 2010, a PED epizootic has been occurring in China, affecting pigs of all ages but characterized by high mortality rates among suckling piglets. The outbreak has been prevalent nationwide and has caused huge economic losses [9][10][11]. Most of the affected farms have lost 100 % of their newborn piglets, usually within 7 days, but sometimes within only a few hours of birth. Few sows or boars show any clinical signs, which is inconsistent with a previous report of an outbreak in Thailand in 2007 in which pigs of all ages were infected and showed different degrees of diarrhea and anorexia [4]. In addition, sporadic outbreaks of PED in China have been seen year round, and not just in the winter months.
Porcine epidemic diarrhea virus (PEDV) is an enveloped, single-stranded RNA virus belonging to the order Nidovirales, family Coronaviridae, genus Alphacoronavirus [12][13][14]. The genome comprises a 5' untranslated region (UTR), a 3' UTR, and at least seven open reading frames (ORFs) that encode four structural proteins (spike [S], envelope [E], membrane [M], and nucleocapsid [N]) and three non-structural proteins (replicase 1a and 1b and ORF3) [15][16][17]. It has been reported that the N protein, a basic phosphoprotein associated with the genome, binds to viral RNA and provides the structural basis for the helical nucleocapsid [18]. In addition, the N protein is thought to be important in inducing cell-mediated immunity in the host [19]. The N protein participates in transcription of the viral genome, the formation of the viral core, and packaging of viral RNA [20,21]. In the early stages of PEDV infection, a pig can produce high levels of antibodies against the N protein. Since the N protein is highly conserved, it is the best candidate protein for use as an antigen for early diagnosis reagents and vaccine development [22][23][24].
The purpose of the present study was to investigate the genetic characteristics of the N gene of PEDV strains circulating during PED outbreaks in different regions of China between 2010 and 2012. In this study, RNA was extracted directly from the feces or intestinal contents of piglets infected with PEDV. The N genes were cloned and sequenced, and the sequences were submitted to GenBank and compared with the N genes of other PEDV strains. In addition, the N protein motifs (including phosphorylation sites and hydrophilic regions) were identified. These data provide additional molecular epidemiological information on PEDV circulating in China and a basis for further development of diagnostic reagents and methods, as well as assisting in vaccine selection.

Sample collection
Porcine samples (feces and intestinal contents) from piglets with severe watery diarrhea, dehydration and high mortality were collected from 55 farms in five provinces in China between February 2010 and March 2012. These samples were confirmed to be positive for PEDV by reverse transcription polymerase chain reaction (RT-PCR) [9].

Primer design for RT-PCR
In order to determine the sequences of the N gene, primers were designed based on known published sequences in GenBank (CV777 and Brl/87). The sense primer was 5'-TGCGGTTCTCACAGATAGTG-3', and the antisense primer was 5'-AAGTCGCTAGAAAAACACTCAGTAAT-3'. The size of the amplified product was predicted to be 1380 bp.
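As a quick sanity check on the two primers, the sketch below reports their lengths and GC content and prints the reverse complement of the antisense primer. The sequences are taken from the text (whitespace ignored); the helper names `reverse_complement` and `gc_fraction` are illustrative and are not part of the original workflow.

```python
# Primer sequences as given in the text; helper names are illustrative only.
SENSE = "TGCGGTTCTCACAGATAGTG"
ANTISENSE = "AAGTCGCTAGAAAAACACTCAGTAAT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


for name, primer in (("sense", SENSE), ("antisense", ANTISENSE)):
    print(f"{name}: {len(primer)} nt, GC = {gc_fraction(primer):.1%}")

# The antisense primer anneals to the plus strand, so its reverse complement
# is the plus-strand sequence expected at the 3' end of the amplicon.
print(reverse_complement(ANTISENSE))
```

The predicted 1380-bp product is longer than the 1326-nt ORF, which is consistent with primers that flank the N gene rather than sit inside it.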

RT-PCR
Viral RNA was extracted from samples using TRIzol Reagent (Invitrogen, CA, USA), resuspended in nuclease-free water, and kept at -70°C until further use. Reverse transcription was performed at 50°C for 30 min in a reaction mixture consisting of 2.5 μl RNA (0.2 μg), 1 μl primer (10 pmol), 1 μl PrimeScript One Step Enzyme mix, 8 μl RNase-free H2O, and 12.5 μl 2× One Step Buffer. The cycling conditions for the PCR were 94°C for 2 min, followed by 32 cycles of denaturation (94°C for 10 s), annealing (58°C for 30 s), and extension (72°C for 1 min), followed by a final extension at 72°C for 7 min. Both the reverse transcription and the polymerase chain reaction were conducted using a PrimeScript One Step RT-PCR Kit (TaKaRa, Japan).
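The reaction set-up and thermal profile can be checked with simple arithmetic. The sketch below assumes that the listed volumes are in microlitres and that the components listed make up the whole reaction; it sums the volumes, derives the final buffer concentration, and adds up the programmed hold times. Ramping time between temperatures is ignored, so the real run time would be somewhat longer.

```python
# Volumes in microlitres for one reaction, as listed in the text
# (assumption: the extracted unit is microlitres).
reaction_ul = {
    "RNA template": 2.5,
    "primer": 1.0,
    "PrimeScript One Step Enzyme mix": 1.0,
    "RNase-free water": 8.0,
    "2x One Step Buffer": 12.5,
}
total_ul = sum(reaction_ul.values())
# Final concentration of the 2x buffer after dilution into the full reaction.
buffer_final_x = 2 * reaction_ul["2x One Step Buffer"] / total_ul

# Programmed hold times in seconds; ramping between steps is not included.
rt_s = 30 * 60                  # reverse transcription, 50 C
initial_denaturation_s = 2 * 60  # 94 C
cycles = 32
per_cycle_s = 10 + 30 + 60      # denaturation + annealing + extension
final_extension_s = 7 * 60      # 72 C
total_s = rt_s + initial_denaturation_s + cycles * per_cycle_s + final_extension_s

print(f"total volume: {total_ul} ul, buffer at {buffer_final_x}x")
print(f"programmed time: {total_s / 60:.1f} min")
```

Under these assumptions the components add up to a 25-μl reaction with the buffer at 1×.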

Cloning and sequencing
The amplified PCR products were subjected to gel electrophoresis, excised from the agarose gel and purified using an Agarose Gel DNA Purification Kit (TaKaRa, Japan). The purified products were cloned into the pMD19-T vector according to the manufacturer's instructions (TaKaRa, Japan). For each fragment, three clones were sent to Shanghai Sangon Bioengineering Ltd. to be sequenced in both directions, and the resulting sequences were analyzed.

Genome and amino acid analysis
Fifteen sequences were selected randomly and aligned using ClustalX software (version 1.83), BioEdit and DNASTAR to examine their genetic diversity. Phylogenetic trees were constructed by the neighbour-joining (NJ) method using Molecular Evolutionary Genetics Analysis (MEGA) software (version 4.0). Bootstrap values were estimated from 1000 replicates. The reference strains used for sequence alignment, sequence analysis, and phylogenetic analysis with the Chinese PEDV strains are shown in Table 1.
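To illustrate what the bootstrap step involves, the sketch below computes a pairwise distance matrix from an alignment and generates 1000 bootstrap replicates by resampling alignment columns with replacement. It is not MEGA's implementation: the text does not state which distance model was used, so the simple p-distance is an assumption, and the three toy sequences are invented for the example.

```python
import random


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap in either sequence are skipped.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(alignment: dict) -> dict:
    """Return {(name1, name2): p-distance} for every unordered pair."""
    names = sorted(alignment)
    return {
        (m, n): p_distance(alignment[m], alignment[n])
        for i, m in enumerate(names)
        for n in names[i + 1:]
    }


def bootstrap_replicate(alignment: dict, rng: random.Random) -> dict:
    """Resample alignment columns with replacement, keeping the length."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


# Toy alignment (made-up sequences, not data from this study).
toy = {
    "strain_A": "ATGGCTTCTG",
    "strain_B": "ATGGCATCTG",
    "strain_C": "ATGACATCTA",
}
rng = random.Random(0)  # fixed seed so the resampling is repeatable
replicates = [distance_matrix(bootstrap_replicate(toy, rng)) for _ in range(1000)]
print(distance_matrix(toy))
```

In a full analysis, each replicate matrix would be passed to a neighbour-joining routine, and the bootstrap value of a clade would be the percentage of replicate trees in which that clade appears.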

Protein sequence analysis
The hydrophilic regions of the deduced amino acid sequences were analysed using DNASTAR software. The phosphorylation sites were predicted using the NetPhos and NetPhosK analysis tools available at http://www.cbs.dtu.dk/services/NetPhos/ and http://www.cbs.dtu.dk/services/NetPhosK/.
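A hydrophilicity analysis of this kind is commonly done with a sliding-window average over a per-residue scale. The sketch below uses the Kyte-Doolittle hydropathy scale; the text does not say which scale or window size was applied in DNASTAR, so the scale, the window of 9 residues and the threshold of -1.0 are all assumptions, and the peptide is invented for the example.

```python
# Kyte-Doolittle hydropathy values; negative values indicate hydrophilic residues.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 9) -> list:
    """Mean hydropathy of every full window along the sequence."""
    scores = [KD[aa] for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophilic_windows(protein: str, window: int = 9, threshold: float = -1.0) -> list:
    """1-based start positions of windows whose mean falls below the threshold."""
    profile = hydropathy_profile(protein, window)
    return [i + 1 for i, value in enumerate(profile) if value < threshold]


# Made-up peptide with a charged stretch in the middle (not a PEDV sequence).
toy = "MLLVVAIGKRDEEKRSNQLLVVA"
print(hydrophilic_windows(toy))
```

Consecutive start positions returned by `hydrophilic_windows` can be merged into regions, which is how a broad hydrophilic stretch would show up in such a profile.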

Homology analysis of the N gene
Sequence homology results were based on the fifteen Chinese PEDV strains and seven commonly used strains published in GenBank (Table 2). The Chinese strains had 15 to 18 amino acid mismatches when compared to Chinju99. Interestingly, the sequences of the fifteen strains were highly conserved in the 5' region (bases 1 to 252), while some variation was observed in the 3' region (bases 1050 to 1233). This finding indicates that the N-terminal part of the protein is more conserved than the C-terminal part. In addition, there were four highly conserved regions at nucleotide positions 1-83, 517-616, 950-1020 and 1106-1178.
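Conserved regions such as those listed above can be located by scanning the alignment for runs of invariant columns. The text does not give the criterion used to call a region highly conserved, so the sketch below assumes the strictest one (every sequence identical at every column of the run). The function name `conserved_runs` and the toy alignment are illustrative.

```python
def conserved_runs(aligned: list, min_len: int = 5) -> list:
    """Return 1-based inclusive (start, end) spans of invariant columns.

    A column counts as conserved when all sequences carry the same base
    and that base is not a gap. Runs shorter than min_len are dropped.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")

    runs = []
    start = None
    # Iterate one position past the end so a run reaching the last column is closed.
    for i in range(length + 1):
        conserved = (
            i < length
            and aligned[0][i] != "-"
            and len({seq[i] for seq in aligned}) == 1
        )
        if conserved and start is None:
            start = i
        elif not conserved and start is not None:
            if i - start >= min_len:
                runs.append((start + 1, i))
            start = None
    return runs


# Toy alignment (made-up sequences): columns 6 and 11 vary.
toy = ["ATGGCTTCTGAA", "ATGGCATCTGAA", "ATGGCTTCTGTA"]
print(conserved_runs(toy, min_len=4))
```

A looser criterion, for example a minimum per-window identity, would give longer regions; the strict version is used here only because it needs no extra parameters.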

Prediction of phosphorylation sites and analysis of hydrophilic regions of the N protein
The deduced amino acid sequence of the N gene of strain CH-GX1-2011 was randomly selected to be analyzed for specific motifs. The analysis indicated that the protein had seven potential asparagine (N)-linked glycosylation sites, consistent with the number seen in the Chinju99, LJB/03 and DX strains. Moreover, the CH-GX1-2011 strain had seven potential protein kinase C phosphorylation sites, nine casein kinase II phosphorylation sites, one tyrosine kinase phosphorylation site, and two cAMP- and cGMP-dependent protein kinase phosphorylation sites. In addition, a large hydrophilic region was identified in the central region of the protein (Fig. 1).
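The site categories reported here correspond to short consensus motifs that can be counted with regular expressions. The sketch below is not the NetPhos or NetPhosK method, which the authors cite and which uses trained predictors; it substitutes the simpler PROSITE-style consensus patterns for the same site types, purely to show how such motifs are located. The dictionary keys and the toy peptide are invented for the example.

```python
import re

# PROSITE-style consensus patterns, wrapped in lookaheads so that
# overlapping sites are each reported once by start position.
MOTIFS = {
    "N_glycosylation": r"(?=N[^P][ST][^P])",        # N-{P}-[ST]-{P}
    "PKC": r"(?=[ST].[RK])",                        # [ST]-x-[RK]
    "CK2": r"(?=[ST]..[DE])",                       # [ST]-x(2)-[DE]
    "cAMP_cGMP": r"(?=[RK]{2}.[ST])",               # [RK](2)-x-[ST]
    # [RK]-x(2)-[DE]-x(3)-Y or [RK]-x(3)-[DE]-x(2)-Y
    "Tyr_kinase": r"(?=[RK](?:.{2}[DE].{3}|.{3}[DE].{2})Y)",
}


def scan_motifs(protein: str) -> dict:
    """Return {motif name: [1-based start positions]} for one protein."""
    protein = protein.upper()
    return {
        name: [m.start() + 1 for m in re.finditer(pattern, protein)]
        for name, pattern in MOTIFS.items()
    }


# Made-up peptide (not a PEDV sequence) containing one or two of each motif.
toy = "ANGTASPRKRASLSEEDRLIDEGQYA"
for name, positions in scan_motifs(toy).items():
    print(f"{name}: {len(positions)} site(s) at {positions}")
```

Pattern matches of this kind are only potential sites; they occur frequently by chance and say nothing about whether a residue is actually modified.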

Phylogenetic analysis of the N gene
Phylogenetic analysis based on nucleotide and encoded amino acid sequences of the N gene confirmed that all field strains fell into two groups (Fig. 2). Group I consisted of Chinju99 (Korean), CV777 (Europe), LZC (China), and

Discussion
In the present study, the N genes of Chinese PEDV field strains isolated between 2010 and 2012 were amplified by RT-PCR, cloned and sequenced to determine the genetic characteristics of viruses causing PED outbreaks in China.
The results confirmed that the N gene had an ORF of 1326 nucleotides, coding for a protein of 441 amino acids. None of the Chinese strains were found to have sequence insertions or deletions in their N genes. Sequence comparison with other PEDV strains selected from GenBank indicated that the N genes of the Chinese strains were highly conserved, even though these strains originated from different geographic regions. The alignment also showed that the N gene sequences have a high degree of nucleotide sequence identity. This could be useful information for the development of genetically engineered N proteins for vaccine development and prevention of PEDV infections. The N protein is a phosphorylated structural protein that is associated with the viral genome and is abundant in virus-infected cells [25]. Therefore, the appearance of the N protein indicates replication of PEDV, and this can be used for early and accurate detection of virus replication in infected cells [23,26]. Previous studies have shown that the N protein of the Chinju99 isolate has seven potential T- or S-linked phosphorylation sites [27]. In this study, the CH-GX1-2011 strain (representing all of the 15 Chinese strains) was found to have the same number of T- or S-linked phosphorylation sites as Chinju99, despite the fact that there were 15-18 mismatched amino acids when compared to Chinju99, LJB/03 and DX [28,29]. Moreover, the CH-GX1-2011 strain had seven potential protein kinase C phosphorylation sites, nine casein kinase II phosphorylation sites, one tyrosine kinase phosphorylation site, and two cAMP- and cGMP-dependent protein kinase phosphorylation sites. There are large hydrophilic regions in the center of the N protein that might play a role in transcription and replication of the viral genome.
Phylogenetic trees were constructed and analyzed using nucleotide and deduced amino acid sequences of the N gene. Similarities and differences among the PEDV strains were observed, which helped in elucidating the phylogenetic relationship between the Chinese PEDV strains and the reference PEDV strains. The results showed that the Chinese PEDV strains characterized in this study make up a separate cluster that includes three other Chinese strains (JS-2004-02, LJB/03 and DX), while they differed genetically from the vaccine strain (CV777) and the early Korean strains (Chinju99 and SM98).
In conclusion, the N genes of Chinese PEDV strains isolated in 2010-2012 during PED outbreaks were sequenced and compared to other reference strains. The N genes were found to be closely related to the N gene of earlier domestic PEDV isolates (JS-2004-02, LJB/03 and DX). The N gene was highly conserved but still had some unique point mutations as well as conserved regions. It is hoped that these data will add to other molecular epidemiological studies of PEDV in China and neighboring countries. Furthermore, they may also lay the foundation for further development and selection of PEDV vaccines.