Background

Despite significant advances in fighting childhood diarrhea through improved sanitation and vaccine introduction, diarrheal diseases remained the fourth most frequent cause of death worldwide among children < 5 years of age in 2016 [1]. Rotavirus (RV) infection remains the leading cause of severe acute gastroenteritis (AGE) [2]. In countries where rotavirus mass vaccination programs have been established, noroviruses have taken over as the most frequent cause of childhood AGE [3,4,5]. The decline in diarrheal disease caused by rotavirus has been faster in developed countries, whereas vaccine effectiveness remains disappointingly low in low- and middle-income countries [1, 6,7,8]. Such differences in disease distribution and severity are likely driven by circulating rotavirus strains and genotype distributions [6, 8].

Rotaviruses are members of the genus Rotavirus within the family Reoviridae, with a genome of 11 segments of double-stranded RNA (dsRNA) encoding six structural viral proteins (VP1-4, VP6, and VP7) and six non-structural proteins (NSP1-5/6) [9]. Based on the sequences of the VP4 and VP7 proteins, RVs are divided into different P (VP4) and G (VP7) genotypes, respectively. Globally, six G types (G1-4, G9, and G12) and three P types (P[4], P[6], and P[8]) have been predominant in the past decades [2, 10]. Data from the Chinese RV sentinel surveillance network showed that G1, G2, and G4 have been very rarely reported since 2012, and that the predominant G type in China was G9 (~ 90%), followed by G3 (~ 7%) [11]. The G8 genotype has not been reported before. In the present study, the whole genome sequences of G8 rotavirus strains in China are reported, and the genetic characteristics and evolutionary relationships between rotavirus strains from Guangzhou, China, and epidemic rotavirus strains derived from GenBank are described.

Materials and methods

Participants and specimen collection

A hospital-based study of children less than 5 years of age who were hospitalized for acute gastroenteritis caused by rotavirus infection was conducted at Zhujiang Hospital, Guangzhou, between December 2020 and February 2021. Stool specimens were collected from each patient with severe AGE at admission for laboratory diagnosis as a routine clinical procedure. Sixteen fecal specimens were collected.

Rotavirus antigen detection, RNA extraction and RT-PCR

A commercial enzyme immunoassay (EIA) (Ridascreen, R-Biopharm AG, Germany) was used for RV detection [12]. RV antigen-positive samples were suspended in 1× phosphate-buffered saline (PBS; Invitrogen) to approximately 10% and centrifuged for 10 min at 8000 × g. Viral RNA was automatically extracted from 200 µL of stool supernatant using the Nucleic Acid Extraction Kit (Tianlong, Xian, China) for use in RT-PCR. Amplification of the partial VP7/VP4 genes was performed in a 25 µL reaction volume containing 10 µM forward primer Beg9 (5’-GGCTTTAAAAGAGAGAATTTCCGTCTGG-3’) or Con3 (5’-TGGCTTCGCTCATTTATAGACA-3’) and reverse primer End9 (5’-GGTCACATCATACAATTCTAATCTAAG-3’) or Con2 (5’-ATTTCGGACCATTTATAACC-3’), respectively [13], and 2 µL RNA template, using the FastKing One Step RT-PCR Kit (TIANGEN, Beijing, China). Reverse transcription was conducted at 42 °C for 30 min, followed by initial denaturation at 95 °C for 3 min. In total, 40 cycles of amplification were performed, comprising denaturation at 94 °C for 30 s, annealing at 55 °C (VP7) or 49 °C (VP4) for 30 s, and extension at 72 °C for 30 s; the program ended with a final step at 72 °C for 5 min. The PCR products were analyzed by electrophoresis in 1% agarose gels. The amplicons obtained (1062 bp for VP7 and 876 bp for VP4) were Sanger sequenced and compared against the BLAST database to determine the G and P genotypes.
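As a rough check on the thermal profile described above, the programmed hold time can be added up step by step. The following is a minimal sketch only: the constant and function names are hypothetical, and ramp times between temperatures are ignored, so the real instrument run time would be somewhat longer.

```python
# Hold times taken from the RT-PCR program described in the text.
# Assumption: ramp times between steps are ignored.
RT_STEP_S = 30 * 60               # reverse transcription, 42 °C
INITIAL_DENATURATION_S = 3 * 60   # 95 °C
CYCLES = 40
CYCLE_STEPS_S = {
    "denaturation_94C": 30,
    "annealing": 30,              # 55 °C for VP7, 49 °C for VP4
    "extension_72C": 30,
}
FINAL_STEP_72C_S = 5 * 60


def total_hold_time_s() -> int:
    """Sum of all programmed hold times, in seconds."""
    cycling = CYCLES * sum(CYCLE_STEPS_S.values())
    return RT_STEP_S + INITIAL_DENATURATION_S + cycling + FINAL_STEP_72C_S


if __name__ == "__main__":
    print(f"Programmed hold time: {total_hold_time_s() / 60:.0f} min")
```

Under these assumptions the 40 cycles contribute 60 min and the full program holds for 98 min; the VP7 and VP4 reactions differ only in annealing temperature, not in duration.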

Whole-genome sequencing

Library preparation and Illumina sequencing were performed by a commercial provider (Tsingke, China). First, the nucleic acid was fragmented; the average fragment size used for sequencing was 300 to 500 bp. Paired-end (PE) 100-base sequencing was performed using Illumina NovaSeq 6000 PE150. Fastp (version 0.20.0) and bbmap (version 38.51) were used to remove adapter sequences and contaminating sequences from the reads [14, 15]. The remaining reads were then subjected to de novo contig assembly using SPAdes (version 3.14.1) and SOAPdenovo (version 2.04), which assemble reads based on the de Bruijn graph algorithm [16, 17]. The generated contigs were then analyzed by BLAST (version 2.10.0+) against the NCBI nonredundant nucleotide (NT) and viral RefSeq databases to evaluate the accuracy and completeness of the assembly results [18].
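The de Bruijn graph idea behind these assemblers can be illustrated with a toy example. This is a sketch of the general principle only, not the SPAdes or SOAPdenovo implementation: the function names and reads are hypothetical, and real assemblers additionally handle sequencing errors, reverse complements, coverage, and multiple k-mer sizes.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, and every k-mer
    found in a read adds an edge from its prefix to its suffix."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


def walk_unambiguous_path(graph, start):
    """Extend a contig from `start` while each node has exactly one successor."""
    contig, node, seen = start, start, {start}
    while node in graph and len(graph[node]) == 1:
        (successor,) = graph[node]
        if successor in seen:  # stop on cycles
            break
        contig += successor[-1]
        seen.add(successor)
        node = successor
    return contig


if __name__ == "__main__":
    # Three overlapping toy reads (hypothetical data).
    reads = ["ATGGCG", "GGCGTA", "CGTACC"]
    graph = de_bruijn_graph(reads, k=4)
    print(walk_unambiguous_path(graph, "ATG"))
```

In this toy case the three overlapping reads collapse into a single unbranched path, which spells out one contig.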

Phylogenetic analysis of rotavirus G and P genotypes

Sequence data for strains and reference sequences that were more similar to the sequenced strains were downloaded from the GenBank database. DNAMAN (version 9.0) software was used for sequence similarity analysis. Multiple alignments and phylogenetic analysis were performed with MEGA X software. The phylogenetic trees were constructed using the neighbor-joining method. The Kimura 2-parameter model with a gamma distribution was used to calculate genetic distances, and reliability was assessed by the bootstrap method with 1000 replicates; bootstrap values below 70% were not considered meaningful. The percentages of nucleotide-sequence similarity between Guangzhou RVA strains and RVA sequences deposited in GenBank were calculated using the p-distance method [19].
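The two distance measures named above can be sketched for a pair of aligned sequences as follows. This is an illustrative sketch, not the MEGA X implementation: the function names are hypothetical, gaps and ambiguous bases are simply skipped, and the gamma correction used in the study is not included (the plain Kimura 2-parameter formula d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)] is used, with P and Q the proportions of transitions and transversions).

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_differences(seq_a, seq_b):
    """Return (compared sites, transitions, transversions) for two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    return sites, transitions, transversions


def p_distance(seq_a, seq_b):
    """Proportion of compared sites that differ."""
    sites, ts, tv = count_differences(seq_a, seq_b)
    return (ts + tv) / sites


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance without gamma correction."""
    sites, ts, tv = count_differences(seq_a, seq_b)
    p, q = ts / sites, tv / sites
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    a, b = "AAAAAAAAAA", "GAAAAAAAAA"  # one transition in ten sites
    print(p_distance(a, b), k2p_distance(a, b))
```

Percent identity and p-distance are two views of the same quantity (identity = 100 × (1 - p)), so a p-distance of 0.0066 corresponds to 99.34% identity. For closely related sequences the Kimura 2-parameter distance is only slightly larger than the p-distance; the formula is undefined for highly saturated pairs.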

Construction of background RV strains

To compare the genome constellations of the G8P[8] isolates from Guangzhou with those of RVA mainly from Asia and Africa, complete genome sequences of 44 representative RVA strains available from GenBank were selected for comparison (Table 1). In addition, complete genome sequences of 47 representative RVA strains around the world which were available from GenBank were obtained in order to determine whether the Guangzhou strains had arisen following reassortment events.

Table 1 Comparison of genome constellations of the G8P[8] isolates from Guangzhou with those of RVA genomes available from GenBank

Results

Genotypes and genetic characteristics of Guangzhou G8 rotavirus

Between December 2020 and February 2021, 16 children < 5 years were hospitalized for treatment of severe AGE in Zhujiang Hospital, Guangzhou. Five of sixteen samples collected from hospitalized AGE children were positive for rotavirus. Further G/P typing differentiated the samples as two G8P[8] strains, two G9P[8] strains and one G2P[4] strain.

Whole-genome sequencing

The genome sequences can be retrieved from GenBank (accession no. OK349178 - OK349199) (Additional file 1: Table S1). The whole-genome analysis confirmed that the two G8P[8] strains were DS-1-like strains with a genotype constellation of G8-P[8]-I2-R2-C2-M2-A2-N2-T2-E2-H2 (Table 1). The whole genomes of the two G8 strains were highly similar, with an overall genome identity of 99.78%; the sequence identities of the 11 genome segments ranged from 99.47 to 99.96%.
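The genotype constellation notation can be unpacked programmatically. The sketch below assumes the standard segment order of the whole-genome classification system (VP7-VP4-VP6-VP1-VP2-VP3-NSP1-NSP2-NSP3-NSP4-NSP5/6) and the usual convention that a backbone consisting entirely of genotype 2 is DS-1-like and one consisting entirely of genotype 1 is Wa-like; the function names and the "other/reassortant" label are hypothetical.

```python
# Segment order of the Gx-P[x]-Ix-Rx-Cx-Mx-Ax-Nx-Tx-Ex-Hx notation.
SEGMENT_ORDER = [
    "VP7", "VP4", "VP6", "VP1", "VP2", "VP3",
    "NSP1", "NSP2", "NSP3", "NSP4", "NSP5/6",
]


def parse_constellation(constellation):
    """Map each genome segment to its genotype."""
    genotypes = constellation.split("-")
    if len(genotypes) != len(SEGMENT_ORDER):
        raise ValueError("expected 11 genotypes")
    return dict(zip(SEGMENT_ORDER, genotypes))


def backbone_type(constellation):
    """Classify the nine non-G/P segments (the backbone)."""
    backbone = constellation.split("-")[2:]
    numbers = {genotype[1:] for genotype in backbone}
    if numbers == {"2"}:
        return "DS-1-like"
    if numbers == {"1"}:
        return "Wa-like"
    return "other/reassortant"  # hypothetical label for mixed backbones


if __name__ == "__main__":
    guangzhou = "G8-P[8]-I2-R2-C2-M2-A2-N2-T2-E2-H2"
    print(parse_constellation(guangzhou)["VP7"], backbone_type(guangzhou))
```

Applied to the constellation reported here, the VP7 and VP4 segments parse as G8 and P[8] and the backbone is classified as DS-1-like, in line with the text.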

Large evolutionary distance between two Guangzhou G8P[8] strains and other circulating G8P[8] strains

Ten representative G8P[8] strains isolated in other areas shared the same genotype as the two Guangzhou strains in all 11 genome segments (Table 1). Of these, eight G8P[8] strains isolated between 2013 and 2019 were analyzed. The whole-genome sequence identity between the eight G8P[8] strains and the two studied strains varied from 87.23 to 95.21%. For the segments encoding VP2, VP3, VP4, VP7, NSP1, and NSP3, sequence identities of > 98% were observed, whereas the sequence identities of the other segments were lower (Table 2).

Table 2 Nucleotide sequence similarity of strains closely related to Guangzhou strains

Phylogenetic analysis of G8P[8] genotype VP7, VP4 segments

We used the full-length VP7 and VP4 gene sequences to construct phylogenetic trees. The nucleotide similarities between the two Guangzhou G8 strains and the Thailand strain (SSKT-269/THA/G8P[8]) were 99.34% and 99.15% (Table 2). In the phylogenetic tree of the VP7 genes, the two Guangzhou strains clustered exclusively with DS-1-like G8 strains previously isolated and described in multiple regions such as Singapore, Japan, Thailand, and Korea (lineage 1). In addition, several G8 bovine rotaviruses in Southeast Asia (BE4/IND/G8P[1], 79/IND/G8P[14], A5-13/THA/G8P[14], A5/THA/G8Px) were also located in lineage 1. Other G8 clusters, in lineage 3, comprised Wa-like RVA strains isolated in America (2009727045/USA/G8P[4]), African countries (6862/TUN/G8P[8]), and European countries (CR2006/HRV/G8P[8], SI-885/SVN/G8P[8]). Other African DS-1-like G8 strains clustered into a distinct lineage 2 (Fig. 1).

Fig. 1

Phylogenetic tree of the VP7 gene of the G8 rotavirus strains and the reference RVA strains used in the phylogenetic study. The tree includes five VP7 genotypes: G8, G3, G9, G1 and G2. The G8 genotype was further clustered into four lineages. Symbols of different shapes and colors mark the sample strains, G8P[8] strains from various regions, animal-derived strains, and strains from mainland China. All sequences except those of the sample strains were obtained from the NCBI public database. The neighbor-joining tree was constructed using the Kimura 2-parameter model. Bootstrap values are shown at the branch nodes; values > 70% are considered more reliable. The scale bar indicates nucleotide substitutions per site.

Further analysis indicated that the VP7 genetic distances between the Guangzhou G8 strains and the DS-1-like G8 strains in Southeast Asia were very small (0.0066 to 0.0114). However, the genetic distances between the Guangzhou strains and the DS-1-like G8 strains in Africa were larger (0.1476 to 0.1678), and those between the two Guangzhou strains and the Wa-like G8 strains were larger still, exceeding 0.1689 (Additional file 1: Table S2). The sequence identities between the Guangzhou strains and bovine strains ranged from 97.51 to 98.66%, lower than those between the Guangzhou strains and other human G8 strains (over 99%).

For the VP4 genes, strain DBM2018-291/THA/G9P[8] (DS-1), circulating in Thailand, had the highest sequence similarity (99.45–99.53%) (Table 2) and the smallest average genetic distance (0.0079) to the two Guangzhou strains. Analysis of G8P[8] strains circulating in Southeast Asia and East Asia from 2013 to 2019, and of one DS-1-like G8P[8] strain from the Czech Republic, suggests that these circulating G8P[8] strains were also genetically close (Fig. 2). The VP4 genes of P[8] RV strains detected in China during 2016–2019 were far less related to the two Guangzhou G8P[8] strains (Fig. 2).

Fig. 2

Phylogenetic tree of the VP4 gene of the G8 rotavirus strains and the reference RVA strains used in the phylogenetic study. The tree includes five VP4 genotypes: P[8], P[4], P[6], P[14] and P[9]. Symbols of different shapes and colors mark the sample strains, G8P[8] strains from various regions, animal-derived strains, and strains from mainland China. All sequences except those of the sample strains were obtained from the NCBI public database. The neighbor-joining tree was constructed using the Kimura 2-parameter model. Bootstrap values are shown at the branch nodes; values > 70% are considered more reliable. The scale bar indicates nucleotide substitutions per site.

Genogrouping analysis of whole genomes and reassortment analysis

The VP1 genes of the two Guangzhou strains had the highest sequence similarity with a 2018 G2P[4] strain from Thailand (DS-1). The VP2, VP4, VP6, and NSP5/6 genes of the two strains had the highest sequence similarity with 2017–2018 G9P[8] strains from Thailand (DS-1). The VP3, NSP1, NSP2, and NSP3 genes had the highest similarity with a 2015 G3P[8] strain from Spain (DS-1). Thus, for all segments except VP7, the sequences most similar to the two Guangzhou G8P[8] strains were found in DS-1-like G2P[4], G3P[8] and G9P[8] strains rather than in other circulating G8P[8] strains. Detailed similarity scores are given in Table 2.

Furthermore, the 9 genome segments other than those encoding VP7 and VP4 were analyzed through phylogenetic trees involving the two Guangzhou G8P[8] strains and 47 other RV strains derived from GenBank (Fig. 3). For each genome segment, we focused on the location of: (1) the two Guangzhou G8P[8] strains (red dot); (2) RV strains with the highest sequence similarity with the two Guangzhou strains (purple diamond); (3) other G8P[8] strains circulating globally (blue triangle), and (4) other RV strains circulating in China (brown square). The results showed that, except for the VP7 gene, the two Guangzhou G8P[8] strains did not cluster with any branch of other circulating G8P[8] strains or with RV strains circulating in China. The VP1 (Fig. 3a), VP2 (Fig. 3b), VP4 (Fig. 2), NSP1 (Fig. 3e), NSP2 (Fig. 3f), and NSP5/6 (Fig. 3i) genome segments of the two Guangzhou strains were located on the same branch as the 2018 Thailand G9P[8] strain (DBM2018-291, DS-1). The VP3 (Fig. 3c) and VP6 (Fig. 3d) genes had the closest genetic distance to the 2017 Thailand G9P[8] strains (DBM2017-016, DS-1). The NSP3 (Fig. 3g) and NSP4 (Fig. 3h) genes were genetically closest to GER33-15/DEU/G3P[8] (DS-1). In general, the 10 genome segments other than VP7 had relatively close genetic distances to DS-1-like G9P[8], G3P[8], and G2P[4] strains circulating in Thailand, Vietnam, Spain, Germany and other places during 2015–2018 (Figs. 2 and 3).

Fig. 3

Phylogenetic trees of genome segments not encoding VP7 or VP4. A, VP1 gene. B, VP2 gene. C, VP3 gene. D, VP6 gene. E, NSP1 gene. F, NSP2 gene. G, NSP3 gene. H, NSP4 gene. I, NSP5/6 gene. Symbols of different shapes and colors mark the sample strains, G8P[8] strains from various regions, strains from mainland China, and the strain with the highest similarity to the gene sequence of the given segment. All sequences except those of the sample strains were obtained from the NCBI public database. The neighbor-joining trees were constructed using the Kimura 2-parameter model. Bootstrap values are shown at the branch nodes; values > 70% are considered more reliable. The scale bar indicates nucleotide substitutions per site.

Discussion

This full-length genome analysis of G8P[8] RVA strains isolated in Guangzhou provided interesting results. G8 is one of the more common RV genotypes of bovine origin [20]. It was first discovered in humans in Indonesia between 1979 and 1981, in strains showing an “ultra-short” electrophoretic pattern [21, 22]. Since then, G8 strains have been detected in children in many countries and have even become one of the dominant strains in some sub-Saharan African countries [23]. Even though G8 strains have circulated in multiple countries surrounding China, including India [24], Iran [25], Vietnam [26], Thailand [27], Singapore [28] and Japan [29], they have been rare in China [30]. Of more than ten thousand rotavirus gene sequences submitted from China, only one strain was identified as a G8 strain (G17011060/CHN/G8P[8]) [31]. In the current study, two of five RV antigen-positive samples were confirmed as G8P[8] strains. During the same epidemic season, a high proportion of infants with severe AGE in our ongoing multi-center RV vaccine effectiveness study were found to be infected with G8P[8] strains: Huizhou (25.0%, 1/4), Shunde (55.6%, 5/9), Shenzhen (42.1%, 8/19) in Guangdong Province, Mianyang (36.4%, 8/22) in Sichuan Province, and Xiamen (11.1%, 3/27) in Fujian Province (whole-genome sequencing of these viruses has not yet been completed). G1, G2, G3, G4, G9 and G12 are recognized as globally important rotavirus genotypes [10, 32, 33], and studies have shown that a single novel RV (e.g., a vaccine escape mutant) can spread around the world in little more than a decade [33]. In China between 1998 and 2000, the predominant strains causing AGE in children less than five years of age were G1 (72.7%) [34]. After 2000, the G1 genotype decreased from 70 to 20%, while the G3 type rose from 33 to 43% [35]. Around 2010, G9 strains increased and eventually replaced G3 strains [30, 36,37,38]. A dominant-strain replacement cycle of about ten years could thus be inferred. These observations may indicate that the currently predominant G9 strains will be replaced by G8 strains in China, although this assumption needs to be supported by further surveillance data.

Serotype G8 rotaviruses are rarely found in humans, and the exchange of genes between human and bovine G8 viruses may have occurred on more than one occasion [39]. G8 reassortant strains are thought to have two major lineages, one originating in Africa [40, 41] and another originating in Southeast Asia [26]. In their BEAST analysis, Hoa-Tran et al. [26] confirmed the hypothesis that the G8P[8] strains in Southeast Asia were generated by reassortment of bovine G8 strains and human DS-1-like strains and that these events occurred between 2007 and 2012. First, in our study, the whole-genome sequencing results suggest that, although the similarity of VP7 genes between the Guangzhou G8P[8] strains and bovine RVA strains from Southeast Asia was more than 90% (91.90 ~ 97.93%), the Guangzhou strains had higher sequence identities (99.40 ~ 99.59%) with DS-1-like G8P[8] strains circulating in Southeast Asia in recent years. It is therefore doubtful whether the two Guangzhou RV strains originated from reassortment events between animal and human strains. Secondly, the whole genome sequences of the two Guangzhou isolates differed from those of DS-1-like G8 and Wa-like G8 strains from Africa and Europe with regard to sequence similarities and genetic distances. Thirdly, further whole-genome sequence comparisons between the two Guangzhou strains and G8P[8] strains circulating in Southeast Asia and East Asia suggest a low similarity, especially regarding the VP1, VP6, NSP2, NSP4, and NSP5/6 genes. Conversely, for the 10 gene segments other than VP7, higher similarity was observed between the two Guangzhou strains and G9P[8], G3P[8] and G2P[4] strains circulating in Thailand and Spain between 2014 and 2018. Hence, it seems most likely that the two Guangzhou strains originated from reassortment events of G8P[8], G9P[8], and G3P[8] strains circulating in Southeast Asia in recent years.

One limitation of the study is that it is based on only two G8P[8] isolates obtained in Guangzhou. Further studies on prevalence, evolution and origins are required to characterize the spread of these strains in China. Nevertheless, in the past epidemic season, we noticed the emergence of G8 strains not only in Guangzhou but also in other regions of southern China. It would be very valuable to study the evolutionary characteristics of more G8 strains that might spread elsewhere, to further verify our hypothesis on the origins of emerging G8P[8] RV strains in China.

Conclusions

Probably owing to frequent population movement and trade, RVAs of the G8 genotype, which have circulated for years in countries around China, have recently emerged in southern China and account for a considerable proportion of children presenting with severe AGE. The clinical and epidemiological significance of G8 RV strains in China should be closely monitored.