Genetic diversity and evolutionary history of Korean isolates of severe fever with thrombocytopenia syndrome virus from 2013–2016

Severe fever with thrombocytopenia syndrome (SFTS) is caused by SFTS virus (SFTSV). Although SFTS originated in China, it is an emerging infectious disease with prevalence confirmed in Japan, Korea, and Vietnam. The full-length genomes of 51 Korean SFTSV isolates from 2013 to 2016 were sequenced, and the sequences were deposited into a public database (GenBank) and analyzed to elucidate the phylogeny and evolution of the virus. Although most of the Korean SFTSV isolates were closely related to previously reported Japanese isolates, some were closely related to previously reported Chinese isolates. We identified one Korean strain that appears to have resulted from multiple inter-lineage reassortments. Several nucleotide and amino acid variations specific to the Korean isolates were identified. Future studies should focus on how these variations affect virus pathogenicity and evolution.


more molecular-level information on SFTSV toward the goal of developing a new diagnostic method for SFTS. To this end, we randomly selected 51 cases while ensuring that all provinces with a confirmed SFTS patient were included, and the isolates from these cases were sequenced.
The 51 clinical samples used in this study were collected during 2013-2016 as part of a laboratory surveillance system led by the Korea National Institute of Health (KNIH). In brief, the 5′- and 3′-terminal regions were sequenced by rapid amplification of cDNA ends. The genome sequences, comprising 41 tripartite (segments L, M, and S) and 10 bipartite (segments M and S) sequences, were generated by de novo assembly with DNASTAR SeqMan version 7.1 (Lasergene). The sequences obtained in this study were deposited in the GenBank/EMBL/DDBJ databases under accession numbers KU507543-KU507577, KP663731-KP663745, and MF094728-MF094820. For gene characterization, we collected and manually edited 207 tripartite genome sequences (163 Chinese, 43 Japanese, and one Korean) with available sampling dates from the GenBank database. We focused on the protein-coding regions of SFTSV to investigate sequence variation and evolutionary dynamics.
The geographical distribution of the sequenced SFTSV samples is shown in Figure 1. In our dataset, isolates from Daegu represented the majority of the SFTSV genomes sequenced.
Variation analysis was performed using the 207 genome sequences collected from the National Center for Biotechnology Information GenBank database and the 51 genome sequences from the KNIH. The genome sequences were aligned against a reference genome sequence (strain HB29; accession nos. NC_018139, NC_018138, and NC_018137 for the L, M, and S segments, respectively) using MUSCLE v3.8 [19]. At the nucleotide level, the total coding sequence length was 6255, 3222, and 1620 nucleotides for the L, M, and S segments, respectively. The dataset contained 1254 variations in segment L, 803 in segment M, and 358 in segment S, of which 207, 154, and 58, respectively, were present exclusively in the Korean isolates.
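The counting step described above can be illustrated with a minimal Python sketch. It assumes that a "variation" is any aligned base that differs from the reference and that a variation is "exclusive" to a group when no sequence outside that group carries it; the text does not spell out the exact definition used, so both are assumptions. The function names and the toy sequences are hypothetical and are not SFTSV data.

```python
def variants(reference, sequences):
    """Return the set of (column, base) differences from the reference.

    Columns are 0-based alignment positions. Gaps ('-') and ambiguous
    bases ('N') are skipped, which is an assumption of this sketch.
    """
    found = set()
    for seq in sequences:
        for col, (ref_base, base) in enumerate(zip(reference, seq)):
            if ref_base == "-" or base in "-N":
                continue
            if base != ref_base:
                found.add((col, base))
    return found


def exclusive_variants(reference, groups, target):
    """Variants seen in the target group and in no other group."""
    others = set()
    for name, seqs in groups.items():
        if name != target:
            others |= variants(reference, seqs)
    return variants(reference, groups[target]) - others


# Toy aligned sequences (hypothetical, not real SFTSV data).
reference = "ATGCATGC"
groups = {
    "China": ["ATGCATGA", "ATGTATGC"],
    "Japan": ["ATGCACGC"],
    "Korea": ["ATGCACGC", "TTGCATGC", "ATGCATGG"],
}

all_seqs = [s for seqs in groups.values() for s in seqs]
all_variants = variants(reference, all_seqs)
korea_only = exclusive_variants(reference, groups, "Korea")
print(len(all_variants), sorted(korea_only))
```

In practice the input would be the MUSCLE alignment for one segment, with the HB29 sequence as the reference row.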
At the amino acid level, the L, M, and S segments encode 2084, 1074, and 540 amino acid residues, respectively. In the Korean isolates, 82, 122, and 48 amino acids varied in the L, M, and S segments, respectively, of which 31, 37, and 16 were specific to the Korean sequences. In segment S, site 238 of the nonstructural protein coding region contained multiple variations: D (Asp) > E (Glu)/N (Asn)/G (Gly). In all of the Japanese sequences, this change was to E (Glu), whereas the Korean sequences presented three variations: one E (Glu) (strain 16KS28), two N (Asn) (strains 16KS31 and 16KS40), and one G (Gly) (strain 16KS26). A Japanese research group reported that substitution of amino acid residue 962 (R > S) is crucial for the membrane fusion step of viral infection [20]. In our data, all of the KNIH strains except strain 15KS7 (accession no. MF094809) had this replacement at residue 962. Another study found that the R > W substitution at position 624 was associated with strong cell-fusion activity under acidic conditions, although none of the KNIH strains showed this variation [21].
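As a small illustration of how the site 238 substitutions can be tabulated, the sketch below groups the four Korean strains named above by their residue at that site and labels each change in the usual reference-position-variant form. The residues are taken from the text; the function name and the labeling convention are assumptions of this sketch.

```python
from collections import defaultdict


def group_substitutions(ref_residue, position, observed):
    """Group strains by substitution label, e.g. 'D238E'.

    `observed` maps strain name to the residue found at `position`.
    Strains carrying the reference residue are left out.
    """
    grouped = defaultdict(list)
    for strain, residue in sorted(observed.items()):
        if residue != ref_residue:
            grouped[f"{ref_residue}{position}{residue}"].append(strain)
    return dict(grouped)


# Residues at site 238 of the nonstructural protein, as reported in the text.
site_238 = {"16KS28": "E", "16KS31": "N", "16KS40": "N", "16KS26": "G"}
substitutions = group_substitutions("D", 238, site_238)
print(substitutions)
```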
To investigate the evolutionary dynamics of SFTSV, a maximum-clade-credibility tree was constructed from Bayesian phylogenetic analysis using the BEAST v1.8.4 package [22] and the FigTree v1.4 program [23], with the general time-reversible substitution model, gamma-distributed substitution rate heterogeneity (G), and a proportion of invariable sites (I), under both strict and uncorrelated relaxed molecular clocks. The trees for the three segments showed a similar topology (Fig. 2). A total of 248 sequences for segment L and 258 sequences each for segments M and S were divided into two major geographical clades, designated the Chinese clade and the Korean/Japanese clade (hereafter referred to as clade B, representing the virus commonly circulating in South Korea and Japan). The Chinese geographical clade was composed of five clades (A, C-F), and geographical clade B was the largest single clade.
Among all of the analyzed isolates in clade B, there were 30 Chinese, 42 Japanese, and 34 Korean strains for segment L; 29 Chinese, 42 Japanese, and 41 Korean strains for segment M; and 30 Chinese, 42 Japanese, and 41 Korean strains for segment S. One Korean isolate, 16KS45, appears to have resulted from multiple inter-lineage reassortment. This isolate was grouped into different Chinese clades according to the segment analyzed: the segment L tree grouped 16KS45 in clade C, whereas the segment M and S trees grouped it in clade A. Although 98% of the Japanese isolates clustered in clade B, the isolate SPL087A grouped in Chinese clades: clade C for segment L, clade E for segment M, and clade A for segment S. For the Chinese isolates, 81.6% of the genome sequences clustered in the Chinese clades, whereas 30 isolates clustered in the Korean/Japanese clade B. Altogether, these results indicate that the majority of the Korean and Japanese SFTSV genomes cluster distinctly from the Chinese SFTSV genomes. Nevertheless, clade B may need to be separated into at least three subclades owing to its recent growth with the addition of a large number of Korean SFTSV sequences.
Genetic reassortment within the segmented RNA genome of SFTSV was observed in this study (Table 1). The Japanese isolate SPL087A emerged as a unique reassortant among the Japanese genomes and clustered in Chinese clades C, E, and A for the L, M, and S segments, respectively. The Korean isolate 16KS45 was a unique reassortant among the Korean sequences, belonging to Chinese clades C, A, and A for the L, M, and S segments, respectively. The Chinese strains NB32 and NB38 were reassigned from Chinese clades to the Korean/Japanese clade B for some segments: NB32 clustered in clades B, A, and B, and NB38 clustered in clades A, A, and B, for segments L, M, and S, respectively. Of the 15 strains that resulted from reassortment, eight had their L and S segments assigned to the same clade and their M segment assigned to a different clade, which is in accordance with the findings of Rezelj et al. [24]. The present analysis also identified a novel Korean reassortant of SFTSV that was not found in earlier studies [25,26].
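The reassortment patterns described above can be summarized with a short Python sketch that compares the clade assigned to each segment of a strain. The clade assignments for the four strains are those given in the text; the function name and the pattern labels are assumptions of this sketch, and the remaining reassortants listed in Table 1 are not reproduced here.

```python
def reassortment_pattern(l_clade, m_clade, s_clade):
    """Describe how the clades of the L, M, and S segments relate."""
    if l_clade == m_clade == s_clade:
        return "no reassortment"
    if l_clade == s_clade:
        return "M differs from L and S"
    if l_clade == m_clade:
        return "S differs from L and M"
    if m_clade == s_clade:
        return "L differs from M and S"
    return "all three segments differ"


# (L, M, S) clade assignments as stated in the text.
strains = {
    "SPL087A": ("C", "E", "A"),
    "16KS45": ("C", "A", "A"),
    "NB32": ("B", "A", "B"),
    "NB38": ("A", "A", "B"),
}

patterns = {name: reassortment_pattern(*clades) for name, clades in strains.items()}
for name, pattern in patterns.items():
    print(name, "->", pattern)
```

Under this labeling, NB32 shows the pattern reported for eight of the 15 reassortants, with L and S in one clade and M in another.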
Bayesian phylogenetic analysis was performed to estimate the evolutionary rate and timescale for SFTSV. The evolutionary rate of all sequences of SFTSV was estimated to be 1.07E-4 (5. Although a different dataset was used in each study, our estimates of the evolutionary rate were similar to those reported previously [27,28]. However, Liu et al. [26] reported evolutionary rates 3.25-4.2 times higher than our estimates.
In summary, we determined 51 full-length genome sequences of Korean SFTSV isolates sampled from 2013 to 2016. This is the first phylogenetic and evolutionary analysis of a large number of Korean SFTSV genome sequences. Most of these KNIH sequences clustered in a major clade with Japanese sequences, whereas six complete KNIH genome sequences clustered in Chinese clades. One of the Korean isolates was identified as a novel reassortant and was assigned to Chinese clades.