Background

Canine distemper virus (CDV) is a single-stranded RNA virus belonging to the genus Morbillivirus within the family Paramyxoviridae. The CDV genome encodes the following virion proteins: nucleocapsid (N), phosphoprotein (P), matrix (M), fusion (F), hemagglutinin (H) and polymerase (L). The F protein mediates pH-independent fusion of the viral envelope with the plasma membrane of the host cell [1].

Paramyxovirus fusion proteins are synthesized as an inactive precursor, F0, that is cleaved by a host-cell protease to release the new N-terminus of F1 [2]. The biologically active protein thus consists of the disulfide-linked chains F1 and F2 [3]. The membrane-anchored F1 subunit contains several regions important for promoting membrane fusion. At its C-terminus, a hydrophobic trans-membrane (TM) domain anchors the protein in the membrane, leaving a short cytoplasmic tail (20-40 residues). The fusion peptide, located at the N-terminus of the F1 subunit, has been shown to insert into the target membrane upon initiation of membrane fusion [4]. F1 also contains two heptad repeat regions, one C-terminal to the fusion peptide (HRA) and the other adjacent to the trans-membrane domain (HRB) [2, 5, 6]. To date, intensive studies have been carried out on H gene sequencing and phylogenetic analysis [7-14], but little is known about variation in the F gene. Two H gene genotypes, Asia 1 [9, 15] and Asia 2 [12], have been recognized among Asian isolates of CDV and were found to differ from those of European and American CDV strains. In this study, the F and H protein genes of Asian CDV isolates were characterized phylogenetically to determine the genetic variation of the F gene.

Results

Phylogenetic analysis of deduced amino acids of H genes

The phylogenetic relationship based on the deduced amino acid sequences of the H protein of fourteen CDV strains was analyzed as shown in Fig. 1. Strains 007Lm, 55L, 66L, 009L, M25CR, 011C, 50Con and 50Cbl were classified into the Asia 2 group, and strains Ac96I, Th12, 50Sc, 81ND, 82Con and 83mLN into the Asia 1 group. Among the Asia 2 strains, 007Lm, 66L, 009L, M25CR and 011C had identical H gene amino acid sequences, although strain 009L differed from strains 007Lm, 66L, M25CR and 011C in its H gene nucleotide sequence (99% identity). In addition, these four strains differed from strains 55L, 50Con and 50Cbl in both amino acid and nucleotide sequences (99% identity for both). Among the Asia 1 strains, 50Sc, 82Con and 83mLN had identical H gene amino acid sequences, although strain 82Con differed from strains 50Sc and 83mLN in the H gene nucleotide sequence (99% identity). These three strains differed from strains Ac96I, Th12 and 81ND in both amino acid and nucleotide sequences (99% identity with Ac96I and 81ND, and 98% with Th12, in each case).

Figure 1

Phylogenetic analysis of the deduced amino acid sequences of the H gene of Asian isolates of canine distemper virus, using the neighbor-joining method in the MEGA 3.1 program. Accession numbers of the CDV strains used for comparison are shown in parentheses: A75/17 (AF164967), 5804 (AY386315), 00-2601 (AY443350), 01-2689 (AY649446), 98-2645 (AY445077), Yanaka (D87949) and Onderstepoort (AF378705).

An extra 27 nucleotides upstream of the usual F gene initiation codon characterize Asia 2 strains

Sequence analysis of the F gene revealed a new initiation codon and an extra 27 nucleotides upstream of the usual F gene open reading frame (ORF) in all Asia 2 isolates. To characterize this sequence, which extends from nt 4908 to nt 4934, the present fourteen strains and various other CDV strains were compared over nucleotides 4901 to 4940, as shown in Fig. 2. Interestingly, only Asia 2 isolates have the nucleotide change from 4909G to 4909T, which extends the F ORF. All Asia 2 isolates (007Lm, 55L, 66L, 009L, M25CR, 011C, 50Con and 50Cbl) had an identical 27-nucleotide sequence. Other nucleotide differences were also found in Asia 2 isolates: 4907T/4907C distinguished 50Con and 50Cbl from the other strains, and 4920G/4920A and 4930T/4930G characterized all Asia 2 strains, although the American, European and Yanaka (Asia 1) strains have the same nucleotide at position 4920 as the Asia 2 isolates. Nucleotides 4926A and 4928A were shared by all strains and differed from the Onderstepoort strain (Fig. 2).
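The frame arithmetic behind this extension can be checked with a short sketch. It uses only the coordinates given in the text (a new initiation codon starting at nt 4908 and the usual ORF starting at nt 4935); the function name is illustrative and is not part of the original analysis.

```python
# Coordinates taken from the text (1-based genome positions).
NEW_START = 4908    # first nucleotide of the new initiation codon in Asia 2 isolates
USUAL_START = 4935  # first nucleotide of the usual F gene ORF


def upstream_extension(new_start: int, usual_start: int) -> tuple:
    """Return (extra nucleotides, extra codons, in-frame?) for an upstream start codon."""
    extra_nt = usual_start - new_start
    return extra_nt, extra_nt // 3, extra_nt % 3 == 0


extra_nt, extra_codons, in_frame = upstream_extension(NEW_START, USUAL_START)
print(extra_nt, extra_codons, in_frame)  # 27 9 True

# Position 4909 is the second base of the codon spanning 4908-4910,
# which is where a G-to-T change can create an ATG.
print(4909 - NEW_START + 1)  # 2
```

Because the extension is a multiple of three, the new codon is in frame with the usual ORF, which is why it adds exactly 9 amino acids rather than shifting the reading frame.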

Figure 2

Alignment of the nucleotide sequences of different CDV field isolates and the Onderstepoort strain from nt 4901 to 4940, upstream of the usual initiation codon of the F gene ORF (starting at nt 4935), showing the extra 27 nucleotides and their deduced 9 amino acids in Asia 2 isolates. The following strains had identical sequences: Asia 2 strains (007Lm, 55L, 66L, 009L, M25CR and 011C), (50Con and 50Cbl), Asia 1 strains (Th12, Ac96I, 50Sc, 81ND, 82Con and 83mLN) and American strains (98-2645, 00-2601 and 01-2689). Accession numbers as in Fig. 1.

Structure of the F gene product and stability of the cleavage sites

The F genes of fourteen Asian CDV isolates were sequenced, and the deduced amino acid sequences were aligned to detect genetic variation among Asian isolates, as shown in Fig. 3. The F gene product is cleaved by the cellular proteases signal peptidase and furin into three regions: signal peptide, F2 and F1 [2]. The cleavage sites, AQIHW at the C-terminus of the signal peptide region and RRQRR at the N-terminus of the F1 region [16-18], were highly conserved in all Asian isolates, as shown in Fig. 3.
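Conservation of the two cleavage motifs across a set of deduced protein sequences could be checked along the following lines. This is a minimal sketch: the motif strings come from the text, but the function names and the toy sequences are hypothetical and are not real CDV F protein data.

```python
from typing import Dict, Optional

CLEAVAGE_MOTIFS = ("AQIHW", "RRQRR")  # motifs named in the text


def motif_positions(protein: str) -> Dict[str, Optional[int]]:
    """Map each motif to its 1-based start position, or None if it is absent."""
    positions = {}
    for motif in CLEAVAGE_MOTIFS:
        index = protein.find(motif)
        positions[motif] = index + 1 if index != -1 else None
    return positions


def motifs_conserved(proteins: Dict[str, str]) -> bool:
    """True only if every sequence contains every motif."""
    return all(
        position is not None
        for sequence in proteins.values()
        for position in motif_positions(sequence).values()
    )


# Hypothetical toy sequences for illustration only.
toy = {
    "strain_a": "MKTAQIHWGSLNRRQRRFAG",
    "strain_b": "MKSAQIHWGTLNRRQRRFAG",
    "strain_c": "MKSAQIHWGTLNRRQKRFAG",  # second motif altered
}
print(motif_positions(toy["strain_a"]))  # {'AQIHW': 4, 'RRQRR': 13}
print(motifs_conserved(toy))             # False
```

The sketch only reports where the motifs occur; it makes no assumption about which side of each motif the protease cuts.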

Figure 3

Alignment of the deduced amino acid sequences of the F genes of CDV strains. Only amino acids that differ from the Onderstepoort sequence are shown. Potential N-linked glycosylation sites (N-X-S/T) are boxed. Cysteine residues are marked (*), cleavage sites are marked (▼), and hydrophobic regions are underlined. Domains in the F gene are the fusion peptide (FP), heptad repeats (HRA and HRB), helical bundle (HB), trans-membrane domain (TM) and cytoplasmic tail (CT). Numbering starts at the first methionine residue of the Onderstepoort strain. The predicted amino acid sequences of the following pairs were identical: 66L/009L, M25CR/011C, 50Con/50Cbl and 82Con/83mLN.

Signal peptide region

The signal peptide region is important for directing the precursor F0 to the Golgi network, where it is cleaved into F1 and F2 for fusion activity [19], and it is highly divergent [2]. As shown in Fig. 3, all Asia 2 isolates have an extra 9 amino acids upstream of the N-terminus of this region, a feature characteristic of Asia 2 isolates. In comparison with the Onderstepoort strain, amino acid variation in the signal peptide region was 30-32 % for Asia 2 and 34-35 % for Asia 1, while nucleotide differences were 15-16 % and 18-19 %, respectively. Asia 1 and Asia 2 isolates share amino acids that differ from Onderstepoort, and each group also has its own specific amino acids. Within a group, some amino acids are characteristic of individual strains, such as 116Y/116C in 007Lm, and 9T/9S and 26R/26K in Th12 (Fig. 3).

F2 and F1 regions

In the F2 region (aa 136-224), the amino acid differences were 208N/208K and 216V/216L in Asia 1 isolates, and 186D/186G, 193S/193N, 195V/195I and 216V/216L in Asia 2 isolates, although strains 50Con and 50Cbl have the same amino acids at positions 193 and 195 as the Onderstepoort strain (Fig. 3).

The membrane-anchored F1 subunit contains the hydrophobic fusion peptide (FP) domain at the N-terminus, and the hydrophobic trans-membrane (TM) domain and the cytoplasmic tail (CT) domain at the C-terminus. The fusion peptide domain was highly conserved among all CDV strains. In contrast, amino acid changes were found in the TM domain: 616I/616S in all Asia 1 isolates and 627C/627Y in all Asia 2 isolates. In the CT domain, six amino acid changes were observed within a span of 33 amino acids. Changes common to all Asian isolates were 640N/640H and 646T/646A. Changes specific to all Asia 1 isolates were 634R/634Q, 637F/637L and 639H/639Q, while those specific to Asia 2 isolates were 637H/637L and 656R/656K. Strains 50Con and 50Cbl had the same amino acids at positions 637 and 656 as the Onderstepoort strain.

Adjacent to these domains, the heptad repeat HRA (aa 250-307), the helical bundle HB (aa 328-374) and the heptad repeat HRB (aa 557-601) were designated [6, 20, 21], as shown in Fig. 3. Amino acid changes were found in the HB domain (366N/366G in all Asian isolates) and in the HRB domain (600K/600R in Asia 1 isolates except for the Yanaka strain), while the HRA domain was conserved.

In regions outside the domains described above, amino acid changes common to all Asian isolates but different from the Onderstepoort strain were 317K/317R, 431V/431I and 556S/556G. Group-specific changes were 395V/395I in Asia 2 isolates and 309L/390F, 429K/429R, 466I/466L and 607S/607G in Asia 1 isolates, although strains Ac96I and Th12 show no amino acid difference at position 607 compared with the Onderstepoort strain. In addition, amino acids unique to one or more strains within a group were detected, such as 546S/546G for M25CR and 011C, 478G/478C for 81ND and 482W/482L for 50Sc, as shown in Fig. 3.
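Assigning a substituted position to a named region is a simple interval lookup. The sketch below uses only the region boundaries stated explicitly in the text (F2, HRA, HB and HRB); boundaries for FP, TM and CT are not given numerically, so they are left out, and the function name is illustrative.

```python
from typing import Optional

# Region boundaries (inclusive amino acid positions) as stated in the text.
REGIONS = {
    "F2": (136, 224),
    "HRA": (250, 307),
    "HB": (328, 374),
    "HRB": (557, 601),
}


def region_of(position: int) -> Optional[str]:
    """Return the name of the region containing the position, or None if outside all of them."""
    for name, (start, end) in REGIONS.items():
        if start <= position <= end:
            return name
    return None


for position in (208, 366, 600, 317, 556):
    print(position, region_of(position))
# 208 F2
# 366 HB
# 600 HRB
# 317 None
# 556 None
```

Positions 317 and 556 fall between the listed regions, which matches their description as changes outside the named domains.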

N-linked glycosylation sites and cysteine residues

The F protein of Asian isolates had seven potential glycosylation sites (Fig. 3). Four of them were located at positions 141-143, 173-175 and 179-181 in the F2 region and 517-519 in the F1 region, as reported previously [16-18, 21]. Interestingly, three additional potential sites matching the consensus sequence for N-glycosylation (N-X-S/T) were found in Asia 1 but not in Asia 2 isolates: two, at positions 62-64 and 108-110 in the signal peptide region, were conserved in all Asia 1 strains, while the site at 605-607 in the F1 region was found in only some Asia 1 strains (50Sc, 81ND, 82Con and 83mLN). These seven glycosylation sites are shared by Taiwanese field isolates [22].
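A scan for the N-X-S/T consensus can be sketched with a regular expression. The text states the consensus only as N-X-S/T; excluding proline at the X position is a commonly applied refinement and is an assumption here, as are the function name and the toy peptide, which is not CDV sequence.

```python
import re

# N-X-S/T with X not proline; the lookahead allows overlapping sequons to be found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def glycosylation_sites(protein: str) -> list:
    """Return 1-based (start, end) positions of potential N-glycosylation sequons."""
    return [(match.start() + 1, match.start() + 3) for match in SEQUON.finditer(protein)]


toy = "MNASANPTQNLT"  # hypothetical toy peptide
print(glycosylation_sites(toy))  # [(2, 4), (10, 12)]
```

Positions are reported as three-residue ranges, in the same style as the sites listed in the text (for example 141-143). The motif N-P-T at positions 6-8 of the toy peptide is skipped because of the proline.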

Cysteine residues are important for intramolecular disulfide bonds and for the three-dimensional structure of the protein. A total of 18 cysteine residues were detected in the F gene product; fourteen of them were located at identical positions in all CDV strains (Fig. 3). The remaining four, which were characteristic of particular strains or groups, were located at positions 67, 116, 478 and 627.

Phylogenetic analysis of amino acids of F genes

The amino acid and nucleotide sequence identities between Asia 2 strains and Onderstepoort were 91 %, except for 50Con and 50Cbl (92 %), whereas those between Asia 1 strains and Onderstepoort were 90 % and 91 %, respectively, as shown in Table 1. Strains 66L and 009L, strains 011C and M25CR, and strains 82Con and 83mLN showed 100 % identity in both amino acid and nucleotide sequences. Strain Th12 showed 99 % amino acid and 98 % nucleotide identity to the other Asia 1 strains. Its identity to the Asia 2 isolates was 91 % in amino acid and 93 % in nucleotide sequences, except for strains 50Con and 50Cbl, for which the amino acid identity was 92 % (Table 1).

Table 1 The identity of the deduced amino acid and nucleotide sequences of F genes of CDV Asian isolates.
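Pairwise identity values of the kind reported in Table 1 can be computed from two aligned sequences as sketched below. The gap-handling convention is an assumption (the text does not state how gaps were treated), and the function name and example strings are illustrative only.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percentage of compared alignment columns in which the two residues are identical.

    Assumption: columns where either sequence has a gap are skipped. Counting
    gaps as mismatches instead would give slightly lower values.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not compared:
        raise ValueError("no comparable columns")
    matches = sum(a == b for a, b in compared)
    return 100.0 * matches / len(compared)


print(percent_identity("ACGTACGTAC", "ACGTACGAAC"))  # 90.0
```

The same function applies to nucleotide or amino acid alignments, since it only compares characters column by column.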

Phylogenetic analysis of the F genes revealed that Asia 2 strains clustered into four clades: 50Con and 50Cbl; 007Lm; 011C and M25CR; and 66L, 55L and 009L (Fig. 4). Asia 1 isolates formed five clades: Ac96I; Th12; 81ND; 50Sc; and 82Con and 83mLN. Interestingly, when the phylogenetic relationship based on the F gene (Fig. 4) was compared with that based on the H gene (Fig. 1), Asia 1 isolates appeared closer to the European strain (5804) than to the American strains (A75/17, 98-2645, 00-2601 and 01-2689). In addition, strains with identical H protein sequences could be distinguished by F gene sequence analysis: 66L, 009L, M25CR, 011C and 007Lm fell into three distinct clades of Asia 2, and 82Con, 50Sc and 83mLN into two distinct clades of Asia 1, as shown in Fig. 4.

Figure 4

Phylogenetic analysis of deduced amino acid sequences of F gene products of Asian isolates. Accession numbers of CDV strains are shown in the legend of Fig. 1.

The phylogenetic relationship among various CDV strains based on the deduced amino acid sequences of the signal peptide region (Fig. 5) gave a classification similar, but not identical, to that based on the full F gene (Fig. 4).

Figure 5

Phylogenetic analysis of deduced amino acid sequences of signal peptide region in the F gene of Asian isolates of CDV. Accession numbers of canine distemper viruses are the same as shown in the legend of Fig. 1.

Discussion

Although the Asia 2 group of CDV was previously identified by sequencing and phylogenetic analysis of the H gene [12], the F gene and F protein of this group had not been characterized. In this study, sequence analysis of the F genes revealed for the first time a characteristic extra 27 nucleotides, encoding an extra 9 amino acids, added to the usual F gene ORF and to the usual 662 amino acids of the F protein in all Asia 2 isolates (Figs. 2 and 3). The extra 27-nucleotide sequence was identical among Asia 2 isolates. This indicates that Asia 2 isolates can easily be distinguished by this sequence from other CDV strains, including Asia 1, American and European isolates.

The nucleotide change from 4909G to 4909T created a new initiation codon at position 4908, upstream of the usual F gene ORF (Fig. 2). Previous studies have suggested that translation of the F protein starts at the first initiation codon, AUG1, or at the second, AUG61, which is located in the signal peptide region [2, 23]. In addition to these in-frame AUGs, a new AUG appeared in Asia 2 isolates. Thus, depending on which translation initiation codon is used, Asia 2 isolates may produce an unusually long signal peptide.

Cleavage of the signal peptide region is necessary for activation of the F gene product and its expression on the cell surface [2]. However, reports have indicated that this region may indirectly affect the fusion activity of the F protein and thus potentially contribute to neurovirulence, although this function differed among CDV strains [2, 6]. The frequent variation observed in the signal peptide region among Asian isolates may therefore be relevant to viral pathogenesis. In addition to the four conserved N-linked glycosylation sites, three were found only in Asia 1 isolates. Previous studies suggested that N-linked glycosylation of viral envelope proteins (H of measles virus, F of Newcastle disease virus, F of Nipah virus or prM of Japanese encephalitis virus) plays a number of critical roles in the virus life cycle and in virulence mechanisms, such as binding to cell surface receptors and protecting against antibody neutralization [24-28]. Glycosylation might also play an important role in the cleavage-dependent activation of the precursor F0 protein or in its transport to the sub-cellular region where proteolytic cleavage occurs [29]. Our finding of different glycosylation sites in the F proteins suggests that these F proteins have different properties.

Compared with the Asian isolates, the European strain 5804 shared all N-glycosylation sites with Asia 1 except for that at position 605-607; in contrast, the American strains have the four glycosylation sites common to Asian isolates, and strains A75/17 and 01-2689 additionally share the site at 108-110.

Interestingly, strains Ac96I and Th12 have the same amino acid as Asia 2 isolates at position 23 (23H), and Th12 has unique amino acids at 7K and 26R. In addition, the Yanaka strain, an Asia 1 isolate [9], shares amino acids with Asia 2 isolates, such as 19L, 23H, 84D and 101R, and has unique amino acids at positions 57F, 67G, 94F, 95I, 98V, 104K, 116R, 126M, 130L, 151N and 513G.

The genetic relationships shown in Figs. 1 and 4 indicate that the field isolates form two separate lineages based on the deduced amino acid sequences of either the F or the H gene. Surprisingly, in the comparison of the full F gene, Asia 1 isolates appeared more closely related to the European strain 5804 than to any of the American strains, whereas by H gene analysis Asia 1 isolates were clearly distinguishable from the European strain (Figs. 1 and 4). Moreover, phylogenetic analysis of the deduced amino acid sequences of the signal peptide region of the F gene is helpful for CDV classification, giving an overview similar to that of the full F gene, as shown in Figs. 4 and 5.

Conclusion

Phylogenetic analysis of the F gene clearly resolves CDV strains that are identical in their H gene, and the signal peptide region clearly differentiates Asia 1 from Asia 2 isolates.

Materials and methods

Cells and Viruses

Vero.DogSLAMtag cells were established as described previously [30]. Cells were passaged and maintained in Dulbecco's modified Eagle's medium (D-MEM; autoclavable; Nissui Pharmaceutical Co. Ltd., Tokyo, Japan) supplemented with 10 % fetal bovine serum in a CO2 incubator at 37°C.

Fourteen CDV strains (007Lm, 55L, 66L, 009L, M25CR, 011C, 50Con, 50Cbl, Ac96I, Th12, 50Sc, 81ND, 82Con and 83mLN) were isolated, propagated once or a few times in Vero.DogSLAMtag cells and stored at -80°C until use. Specimens were collected from diseased dogs as summarized in Table 2.

Table 2 Summarized data of CDV strains used in this study.

Sequencing of F and H genes of CDV and phylogenetic analysis

Vero.DogSLAMtag cells were infected with virus suspensions at MOI = 0.01 and incubated for 18-24 hours. When the cytopathic effect (CPE) covered almost the entire culture, total RNA was extracted using a MagExtractor™ RNA Extraction Kit (Toyobo Co., Ltd. Osaka, Japan) according to the manufacturer's instructions. Reverse transcription and PCR amplification (RT-PCR) were carried out using a ReverTra-Plus-™-RT-PCR Kit (Toyobo Co., Ltd. Osaka, Japan). The primers used were as follows: 5' ACTTGCCCGATCTCAAGCTA 3' and 5' ATGCTGGAGATGGTTTAATTCAATCG 3'. The forward primer corresponds to nucleotides 4754-4773 of the M-F region in the positive sense, and the reverse primer to nucleotides 8969-8994 of the H-L region in the negative sense. The amplified PCR products (4240 bp) were purified using a Gene Clean II kit (Biogene, Inc., USA) after agarose gel (0.7 %) electrophoresis and sequenced directly using a Big Dye® Terminator v.3.1 cycle sequencing kit (Applied Biosystems, Inc., CA, USA), with appropriate primers designed according to an overlapping strategy (Table 3). The sequences were aligned with CLUSTAL W (1.83) Multiple Sequence Alignments (DDBJ), and phylogenetic analysis was carried out by the neighbor-joining method in the MEGA 3.1 program.
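The primer coordinates can be cross-checked against the primer lengths with a few lines of arithmetic. This is a sketch only: the reverse primer is written here as one contiguous 26-mer, and the helper names are illustrative rather than part of the original protocol.

```python
FORWARD = "ACTTGCCCGATCTCAAGCTA"        # nt 4754-4773, positive sense
REVERSE = "ATGCTGGAGATGGTTTAATTCAATCG"  # nt 8969-8994, negative sense


def span(start: int, end: int) -> int:
    """Number of nucleotides in an inclusive 1-based coordinate range."""
    return end - start + 1


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G and T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


print(len(FORWARD), span(4754, 4773))  # 20 20
print(len(REVERSE), span(8969, 8994))  # 26 26
print(span(4754, 8994))                # 4241
print(reverse_complement(REVERSE))     # positive-sense sequence of the reverse primer site
```

Both primer lengths agree with their stated coordinate ranges. The inclusive span from the first base of the forward primer to the last base of the reverse primer site is 4241 nt, one base more than the 4240 bp reported; the small difference may simply reflect a counting convention.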

Table 3 Oligonucleotide primers used for RT-PCR amplification and nucleotide sequencing.