Molecular evolution of dengue virus types 1 and 4 in Korean travelers

Dengue virus (DV) is a mosquito-borne virus that is endemic to many tropical and subtropical areas. Recently, the annual incidence of DV infection has increased worldwide, including in Korea, due to global warming and increased global travel. We therefore sought to characterize the molecular and evolutionary features of DV-1 and DV-4 isolated from Korean overseas travelers. We used phylogenetic analysis based on the full coding region to classify isolates of DV-1 in Korea into genotype I (43251, KP406802), genotype IV (KP406803), and genotype V (KP406801). In addition, we found that strains of DV-4 belonged to genotype I (KP406806) and genotype II (43257). Evidence of positive selection in DV-1 strains was identified in the C, prM, NS2A, and NS5 proteins, whereas DV-4 showed positive selection only in the non-structural proteins NS2A, NS3, and NS5. The substitution rates per site per year were 5.58 × 10-4 and 6.72 × 10-4 for DV-1 and DV-4, respectively, and the time of the most recent common ancestor was determined using the Bayesian skyline coalescent method. In this study, the molecular, phylogenetic, and evolutionary characteristics of Korean DV-1 and DV-4 isolates were evaluated for the first time. Supplementary Information The online version contains supplementary material available at 10.1007/s00705-021-04973-8.


Introduction
Dengue fever is a major mosquito-borne viral illness that causes significant public health problems in subtropical and tropical regions of the world [1]. Dengue virus (DV) is commonly transmitted among humans by two mosquito vectors, Aedes aegypti and Aedes albopictus [2]. Global warming and urbanization have fueled the population growth of these vectors and broadened their habitat. During the past three decades, the incidence of dengue fever has increased by more than 300% in more than 100 countries [3][4][5]. There are 2.9 million dengue infections and 5,906 deaths reported in Southeast Asia each year [6]. For this reason, the World Health Organization has designated dengue infection as one of the top 17 tropical diseases [7].
Dengue has a wide spectrum of clinical severity, from asymptomatic through a self-limiting febrile illness (dengue fever) to life-threatening severe infections such as dengue hemorrhagic fever (DHF) and dengue shock syndrome (DSS). The mechanisms underlying severe dengue infections are not completely understood, but an antibody-dependent Handling Editor: Zhenhai Chen.

Supplementary Information
The online version contains supplementary material available at https ://doi.org/10.1007/s0070 5-021-04973 -8. enhancement phenomenon and infection with particularly virulent virus strains are known to correlate with severe dengue infection [9][10][11]. Specific amino acid changes in DV proteins are closely associated with altered virulence. For instance, mutation of amino acids 124 and 128 of the E protein, amino acid 64 of NS2A, or amino acid 55 of NS2B can increases DV virulence [11,12].
Since the first case of dengue fever was reported in South Korea in 1995, DV infections have been increasingly reported by travelers returning from dengue-endemic countries [13]. A. albopictus was recently identified as a mosquito vector for dengue infection in Korea. The expanding distribution of this mosquito in urban areas increases the likelihood of outbreaks in Korea [14][15][16]. Indeed, autochthonous dengue infections have been reported in Japan and some European countries at approximately the same latitudes as in Korea. DV-1 and DV-4 have been reported to cause severe dengue fever at a high rate of incidence in Asian countries [17,18]. Therefore, surveillance of these serotypes is necessary, but information on Korean DVs is very limited. For this report, we performed molecular and evolutionary analysis of DV-1 and DV-4.

Viruses and sequencing
Isolates of DV-1 and DV-4 (NCCP43251, NCCP43257) were obtained from the National Culture Collection for Pathogens (Cheongju, Korea). These viruses were isolated from serum samples from Korean overseas travelers (Table 1). The virus was cultivated on a monolayer of Vero E6 cells (ATCC CRL-1586) for propagation in minimal essential medium (Gibco, Life Technologies, NY, USA) containing 2% fetal bovine serum and 1% penicillin (100 IU/ml)/ streptomycin (100 μg/ml) (Gibco) at 37 ℃ and 5% CO 2 . Viral RNA was extracted from the supernatant of the DVinfected Vero E6 cells using a QIAamp Viral RNA Mini Kit (QIAGEN, CA, USA) according to the manufacturer's instructions. Reverse transcription (RT)-PCR was performed using a QIAGEN OneStep RT-PCR Kit with modified PCR primer sets targeting the full genome sequences of DV-1 and DV-4 (Supplementary Table S1) [19]. The PCR conditions as follows: reverse transcription at 45 ℃ for 30 minutes and denaturation at 95 ℃ for 15 minutes, followed by 40 cycles of amplification (94 ℃ for 10 seconds, 46 ℃ for 30 seconds, and 68 ℃ for 3 minutes), and then a final elongation step at 68 ℃ for 10 minutes. The final amplicons, stained with SYBR ® Safe DNA Gel Stain (Invitrogen, USA), were visualized on 1.6% agarose gels with UV transillumination and purified using an Expin Gel SV kit (GeneAll, Seoul, Korea). The purified products were then sequenced by the Sanger method using an ABI 3730XL System (Macrogen, Seoul, Korea). All procedures were conducted in a Biosafety Level 2 laboratory.

Phylogenetic and sequence analysis
The nucleotide sequences were trimmed using Bioedit software, version 7.0.5.3., and assembled using CLC Genomics Workbench 12.0 (QIAGEN). Multiple alignments of the complete coding regions were performed with DV reference genes (97 DV-1 and 57 DV-4), including previously reported Korean isolates, using the CLUSTAL W method. Maximum-likelihood phylogenetic analysis of DV-1 and DV-4 was performed using the Tamura Nei model with gamma distributed rates and 1,000 bootstrap replicates in Molecular Evolutionary Genetics Analysis (MEGA), version X [20]. The percentages of nucleotide and amino acid sequence identity among Korean isolates were calculated as pairwise p-distances.

Bayesian evolutionary analysis
The rate of nucleotide substitutions per site per year and the time to the most recent common ancestor (TMRCA) of 70 DV-1 and 54 DV-4 strains (differentiated based on their E protein sequences) was estimated using the Bayesian Markov chain Monte Carlo (MCMC) approach as implemented in the BEAST 2 package. Sequences with low quality or more than 99% identity as well as recombinant sequences were excluded from this analysis. The best-fit substitution model was selected using the Akaike information criterion and the Bayesian information criterion as implemented in CLC Genomics Workbench 12.0. The GTR + G + I model (general time-reversible model with gamma-distributed rates of variation among sites and a proportion of invariable sites) was found to be the best-fit model for the DV-1 and DV-4 datasets. Three independent MCMC analyses, each with 60,000,000 steps, were performed using the uncorrelated relaxed molecular clock with a Bayesian skyline coalescent prior, and they were combined with a burn-in value set to 10% generations using Log Combiner 2.5.2. The convergence of the chain was evaluated and viewed using Tracer version 1.

Selection pressure analysis
Selection pressure was evaluated for 101 DV-1 and 59 DV-4 strains by different methods, using the datamonkey web server and the HyPhy package [21]. Sequences with ambiguous characteristics or high similarity (> 99% identity) were excluded from this analysis. The ratio of synonymous to non-synonymous (ω ratio) changes was calculated using the single-likelihood ancestor, fixed-effects likelihood, mixedeffects model of evolution, and fast unconstrained Bayesian approximation methods [22,23]. Selection pressure analysis was performed based on the full coding regions of the structural (C, prM, and E) and non-structural (NS1, NS2A, NS2B, NS3, NS4A, NS4B, and NS5) genes. Low ω ratios (0.060-0.081) were considered an indication of negative selection. Positive and negative selection at each site were analyzed based on statistical significance (p-value < 0.1 or posterior probability < 0.9) by at least two methods.

Sequencing and phylogenetic analysis
The full coding sequences of the Korean DV-1 (43251; 10,179 nucleotides) and DV-4 (43257; 10,163 nucleotides) isolates were determined, and the serotypes of those viruses were confirmed by NCBI BLAST analysis. The phylogenetic analysis of the DV-1 and DV-4 strains was performed based on the full coding region ( Fig. 1a and b)

Sequence analysis
Among the Korean DV-1 isolates, the nucleotide and amino acid sequence identity ranged from 90.5% to 98.8% and 96.4% to 99.4%, respectively. The nucleotide and amino acid sequence identity among the DV-4 isolates was 94.3% and 96.8%, respectively ( Table 2). The amino acid sequence for each open reading frame of the Korean DV-1 and DV-4 isolates was compared with that of the most similar strain and the standard strain of each type (Tables 3 and 4). The region of the DV-1 isolates that was most variable, based on amino acid sequence comparisons (95.8-96.4% identity) to EU848545 (the DV-1 standard strain), was in the prM protein. Compared with the most similar strain of each DV-1 isolate, the KP406801 strain exhibited the lowest overall sequence similarity (95.4-99.2% identity). The region of the DV-4 isolates that differed the most from AF326573 (the DV-4 standard strain) was in the NS2A and NS1 proteins (95.4% and 95.5% identity, respectively). However, amino acid changes in the motifs related to virulence and viability [24,25] were not observed in either DV-1 or DV-4 (Supplementary Tables S2 and S3).

Bayesian evolutionary analysis
Rates of nucleotide substitution and TMRCA with ESS values above 200 were determined for the E genes of 70 DV-1 strains and 54 DV-4 strains ( Fig. 2a and b)

Selection pressure
Selection pressure was analyzed for the structural and nonstructural proteins of the DV-1 and DV-4 strains (Tables 5  and 6). The sites under positive or negative selection were identified based on statistical verification using at least two methods. Positive selection in DV-1 strains was observed in the C, prM, NS2A, and NS5 proteins, whereas positive selection in DV-4 strains was observed only in the non-structural proteins NS2A, NS3, and NS5. DV-4 had higher overall rates of negative selection than DV-1 in most structural and non-structural proteins.

Discussion
The incidence of global dengue outbreaks is growing at a very rapid rate, with a 400% increase in recent decades due to global warming, urbanization, and increased global travel [26]. Since 2010, the influx of global travelers infected with DV has been increasing rapidly in Korea [27], putting Korea at high risk for epidemics of dengue disease. Therefore, DV must be continually monitored and characterized to cope with this impending threat. In this study, we conducted the first molecular, biological, and evolutionary analysis of DV-1 and DV-4 isolates from Korea. The DV-1 strains were previously classified into five distinct genotypes (I-V) [28,29]. Phylogenetic analysis based on the E protein and the complete coding region revealed that the DV-1 isolates in Korea belong to genotype I (43251, KP406802), genotype IV (KP406803), and genotype V (KP406801) (Fig 1a). The strains D1/Singapore/ EU081276/2005 and D1/Indonesia/KC762641/2008, which were the ones most similar to two Korean DV-1 isolates (43251 and KP406802; genotype I), were the major strains that caused dengue outbreaks in Singapore in 2005 and Indonesia in 2007, respectively [30][31][32]. Our results indicate that these viruses are still circulating in Southeast Asian countries more than 10 years later. The KP406803 isolate was grouped into the genotype IV clade, which also includes D1/Philippines/GQ868602/2004 and D1/Hawaii/ DQ672564/2001. Genotype IV of DV-1 was first isolated in the Philippines in 1974, and it has predominated in each epidemic in the South Pacific since 2000 [33]. Among the five DV-1 genotypes, genotype V has been reported to be the most virulent and to be strongly associated with DHF [9]. We identified one strain (KP406801) from this virulent genotype. As shown by our results, various DV-1 genotypes that have caused epidemics in other countries have been introduced into Korea, where they could cause disease outbreaks.
The substitution rate per site per year of the E protein in DV-1 was 5.58 × 10 -4 , which is high for an RNA virus. The substitution rates for RNA viruses generally range from 10 -6 to 10 -4 . The rate for DV-1 was similar to that reported previously for DV-2 strains isolated in Korea (5.32 × 10 -4 ) [34]. However, the isolation years differed significantly between the Korean DV-1 strains and the most similar strains, differing by 7 to 8 years, even though these strains shared a high degree of nucleotide sequence similarity (98.7-99.5% identity) (Fig. 2a). The average TMRCAs of the DV-1 isolates from this study were 11 (43251 and KP406802), 21 (KP406803), and 17 years (KP406801) [23]. The TMRCA of previously reported DV-2 isolates from Korea (4 to 10 years) was shorter than those of the DV-1 and DV-4 strains we tested, although the overall substitution rates of the DV-2 strains were similar to those in this report [34]. These results suggest that some DV-1 genotypes with slow substitution rates caused concomitant DV outbreaks in Southeast Asia, including in Singapore, and were then introduced into Korea [32,35].
Among the three genotypes (I-III) of the DV-4 serotype, the Korean DV-4 isolates belong to genotype I (KP406806) and genotype II (43257) (Fig. 1b). The strain most similar to KP406806 is D4/Philippines/AY947539/1956, a representative strain that has been found consistently in China and Taiwan from about 60 years ago until the 2000s. The DV-4 43257 strain, which was isolated from a Korean traveling to the Philippines in 2010, has the highest nucleotide sequence similarity to DV-4 strains circulating in Singapore in 2005. DV-4 genotype II is the predominant genotype in Singapore, indicating that it was introduced into Korea via the Philippines [32]. The substitution rate per site per year of the DV-4 strains was 6.72 × 10 -4 , which was as fast as that of previously reported DV-1 and DV-2 strains [34]. The TMRCA analysis showed that this strain evolved within 20 years. Like the DV-1 isolates, DV-4 strains have been consistently circulating in Southeast Asia without significant mutation for a long time, even though we found DV-4 to have a rapid substitution rate, similar to those of DV-1 and DV-2 in Korea. Several amino acid sequence motifs affecting viability and virulence have been found in DV proteins. An S112A substitution in the prM protein inhibits the assembly of replicon particles, and a T193A mutation in the E protein reduces the infectivity of DV-1 tenfold [24]. An R888 mutation results in incorrect localization of NS5, affecting the synthesis of negative-stranded RNA [25]. A 40-aa deletion in the DV-4 NS2B protein eliminates autoproteolytic activity [36]. A D192N substitution in the NS3 protein causes increased neurovirulence in mice [37]. However, we did not find any amino acid substitution mutations known to be related to virulence or viability in the Korean DV-1 and DV-4 isolates we studied (Supplementary Tables S2 and S3).
Compared with EU848545 and AF326573, non-synonymous substitutions were observed in an average of 6.7 and 5 sites in the structural proteins of DV-1 and DV-4, respectively (Tables 3 and 4). In particular, amino acid mutations in the E protein can lead to structural changes in cell binding and immunogenicity [38], possibly leading to immune escape in the host [39]. In a previous report, sites near epitopes in the hemagglutinin of influenza viruses known to be important for antigenicity were mainly under positive selection, which favors non-synonymous substitutions [40]. The C and prM structural proteins and NS2A and NS5 non-structural proteins in DV-1 were found to be under positive selection (Table 5). However, positive selection in DV-4 was found only in the NS2A, NS3, and NS5 non-structural proteins ( Table 6). NS2A and NS5 play important roles in viral RNA replication and the suppression of host immune responses [41,42]. Therefore, those proteins might be under positive pressure to escape host immunity. On the other hand, the E protein was not found to be under positive selection in the Korean DV-1 or DV-4 strains, even though the E protein of DV is known to be important for immunogenicity. This finding is consistent with that of previous studies showing that the E protein of DV-1 and DV-2 was not strongly affected by positive selection [34,43]. The role of the proteins under positive pressure should be evaluated carefully to prepare for future DV epidemics. Interestingly, significantly fewer sites in DV-1 were under negative selective pressure than in DV-2 and DV-4. This significant difference suggests that the DV-1 strains have a more stable evolutionary status than the DV-2 and DV-4 strains [44].
In this study, we performed molecular and evolutionary analysis of DV-1 and DV-4 isolates from Korea. Because many global travelers with dengue infection are asymptomatic or have mild symptoms, the actual incidence of DV infection in Koreans is probably higher than the number of reported cases. Nevertheless, considering the extremely limited amount of information that was previously available about DV-1 and DV-4 isolates in Korea, this study has provided the first comprehensive molecular, phylogenetic, and evolutionary information about DV-1 and DV-4 strains in Korea. Various genotypes of DV-1 and DV-4 with distinct characteristics have been introduced into Korea, and some of them could cause autochthonous outbreaks in Korea. Therefore, further characterization and surveillance of DV should be performed.
Acknowledgements The pathogen resources (NCCP43251 and NCCP43257) for this study were provided by the National Culture Collection for Pathogens.
Author contributions EHH drafted the manuscript. GK, HC, and HO performed the data analysis. JHP and GHH revised the manuscript. JJH and BSK designed and supervised the experiments. All authors read and approved the final manuscript.
Funding This study was funded by a grant from the Defense Acquisition Program Administration (ADD-911255201), Republic of Korea.