Introduction

Noroviruses are a major causative agent of acute gastroenteritis, with high prevalence across the globe [1]. Norovirus outbreaks in public spaces, such as kindergartens and primary or secondary schools, are generally associated with low hygiene levels and contaminated food or water [2,3,4]. Noroviruses belong to the family Caliciviridae and have three open reading frames (ORFs) in their genome. ORF1 encodes a large nonstructural polyprotein (including NTPase, p22, VPg, 3CLpro and RdRp), ORF2 encodes the major structural capsid protein (VP1), and ORF3 encodes a minor structural protein (VP2) [5]. Based on the amino acid sequence of the capsid protein, noroviruses have been classified into 10 genogroups (GI to GX) and approximately 53 genotypes so far [6, 7]. Of the genogroups, GI, GII and GIV can be found in human infections, with genogroup GII being the most common [8]. The GII.4[P4] genotype has been reported in many countries and is the prevalent strain in most outbreaks and human infections [9,10,11]. Previous reports have shown that GII.17[P17] became the main epidemic genotype of norovirus in China in 2014 [11,12,13,14]. GII.21[P21] and GII.13[P13] have also been found in the coastal environment and in seafood in Korea. These two genotypes share a close phylogenetic relationship with GII.17[P17]. However, they are found infrequently in human infections, which might be associated with a unique histo-blood group antigen (HBGA) binding site involved in host susceptibility and with the presence of decoy glycan receptors in the human gastrointestinal tract that prevent binding of the virus [15, 16].

Due to the high frequency of genetic recombination at the ORF1 and ORF2 junction, a number of recombinant strains have emerged under natural selection [1, 17, 18]. Emerging strains can replace older ones and cause new infections and outbreaks every two to four years, as has been observed with GII.4(Sydney)[P4(New Orleans)] and GII.2[P16] [19]. In addition, other recombinant strains, such as GII.6[P7], GII.13[P16] and GII.4[P31], have been reported in outbreaks and in environmental samples [20,21,22,23]. Although those genotypes have not caused severe outbreaks, studying these strains has provided more information on the genetic diversity and gene constellation of noroviruses.

Here, we report two GII.13[P21] recombinant strains from an outbreak in the city of Changsha. In this outbreak, there were also two cases of coinfection with GI and GII. To better understand the genetic background and molecular characteristics of these viruses, we carried out a comprehensive analysis of the complete genome sequences of these noroviruses obtained by next-generation sequencing (NGS).

Materials and methods

The stool samples in this study were provided by the YueLu District Center for Disease Prevention and Control and tested by the Changsha Center for Disease Prevention and Control (CSCDC). The epidemiological and clinical information on laboratory-confirmed human cases of norovirus infection was collected by the CSCDC staff and medical doctors at the hospital. The Institutional Review Board reviewed and approved the use of those samples for this research (no. CSCDC-2019-008).

Six stool samples were obtained from patients who suffered from fever, vomiting and diarrhea on 1 April 2019. The samples were stored at −70°C until RNA extraction. Viral RNA was extracted using a QIAamp RNA Mini Kit (QIAGEN), and detected using a norovirus real-time PCR kit (Jiangsu Bioperfectus Technologies, China) according to the manufacturer’s instructions. Then, the region corresponding to the ORF1 and ORF2 junction was amplified from norovirus-positive samples by reverse transcription polymerase chain reaction (RT-PCR) using norovirus type I and II ORF(1+2) junction region gene amplification kits (BioGerm, China). ORF1 and 2 junction sequences were obtained from the company (BioGerm, China). The genotypes of the norovirus sequences were determined using the Norovirus Genotyping Tool website (http://www.rivm.nl/mpf/norovirus/typingtool) [7].

The full genome was amplified from norovirus-positive samples by RT-PCR using a SuperScript™ III One-Step RT-PCR System with a Platinum™ Taq High Fidelity DNA Polymerase Kit (Thermo Fisher, USA), performed in a GeneAmp PCR System 9700 (Thermo Fisher, USA) [24, 25]. PCR products were purified using AMPure XP beads (Beckman, USA) according to the manufacturer’s instructions and eluted using 45 μl of nuclease-free water. The purified nucleic acids were quantitated on a Qubit 2.0 using a dsDNA HS (High Sensitivity) Assay Kit. A Nextera XT DNA Library Prep Kit (Illumina, USA) was used to construct a DNA library with 1 ng of input DNA. Then, the samples were sequenced using a MiSeq v2 Reagent Kit (Illumina, USA) on a MiSeq platform (Illumina, USA). The sequence data were analyzed using FastQC, Cutadapt and Virus Identification Pipeline (VIP) software. Sequences were assembled using SPAdes-3.13.0 software.

Sequence alignments were performed with Clustal W using Molecular Evolutionary Genetics Analysis software version 6 (MEGA 6). A phylogenetic tree based on full-genome sequences was constructed by the maximum-likelihood method in MEGA 6 (1000 bootstrap replicates). Phylogenetic trees based on RdRp and VP1 sequences were constructed by the neighbor-joining method with the Kimura two-parameter model in MEGA 6 (1000 bootstrap replicates). The other complete and partial genome sequences were downloaded from NCBI.
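
As an illustration of the distance that underlies the neighbor-joining trees, the Kimura two-parameter model estimates the distance between two aligned sequences as d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences. The following is a minimal sketch of that formula only, not the MEGA 6 implementation; the function name and the pairwise handling of gaps and ambiguous bases are assumptions made for this example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned nucleotide sequences.

    Assumption for this sketch: columns containing a gap or an ambiguous
    base in either sequence are skipped (pairwise deletion).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For closely related sequences, such as strains sharing about 98% identity, the corrected distance is only slightly larger than the raw proportion of differing sites.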

To identify break points in the genomes of the recombinant strains, their sequences were analyzed using SimPlot 3.5.1 software. The analysis was conducted using 1500-bp sequences from the ORF1 and ORF2 regions with a window size of 200 nt and a step size of 20 nt [24]. Amino acid sequences and HBGA binding sites were analyzed using Biological Sequence Alignment Editor (BioEdit 7.0.5).
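
The idea behind a similarity-plot analysis can be sketched as follows: the identity between the query and each putative parental sequence is computed in a sliding window along the alignment, and a break point is suggested where the more similar parent switches. This is a simplified illustration using uncorrected percent identity, not the SimPlot 3.5.1 algorithm; the function names are hypothetical, and the default window and step sizes are those stated in the text above.

```python
def percent_identity(a, b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return float("nan")
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def sliding_identity(query, reference, window=200, step=20):
    """Return (window midpoint, percent identity) pairs along an alignment."""
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    points = []
    for start in range(0, len(query) - window + 1, step):
        end = start + window
        midpoint = start + window // 2
        points.append((midpoint, percent_identity(query[start:end], reference[start:end])))
    return points


def crossover_positions(profile_a, profile_b):
    """Window midpoints at which the more similar reference switches."""
    crossings = []
    previous_sign = 0
    for (position, identity_a), (_, identity_b) in zip(profile_a, profile_b):
        difference = identity_a - identity_b
        sign = (difference > 0) - (difference < 0)
        if sign == 0:
            continue  # tie: neither reference is closer in this window
        if previous_sign != 0 and sign != previous_sign:
            crossings.append(position)
        previous_sign = sign
    return crossings
```

Positions returned by this sketch are alignment coordinates of window midpoints, so the resolution of an inferred break point is limited by the step size.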

Results

The outbreak occurred in a senior high school after 786 students participated in a group activity at a commercial park. On 29 March 2019, the students had lunch and dinner at the commercial park, and the first case of illness with nausea and vomiting was reported on 30 March. In this outbreak, 68 cases were reported, 31 males and 37 females, including one teacher. The cases were distributed across 13 classes. The symptoms in students who suffered from gastroenteritis mainly included dizziness (48.53%), nausea (75.00%), vomiting (83.82%), diarrhea (57.35%) and abdominal pain (47.06%). In all cases, the symptoms resolved within 48-72 h, with no severe outcome.
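
The reported symptom frequencies can be converted back into case counts as a consistency check. The sketch below assumes that the percentages were calculated over all 68 reported cases; this denominator is an assumption, since the text does not state it explicitly, and the variable and function names are chosen for this example.

```python
TOTAL_CASES = 68  # assumed denominator: all reported cases

# Symptom frequencies (%) as reported in the text.
reported_percent = {
    "dizziness": 48.53,
    "nausea": 75.00,
    "vomiting": 83.82,
    "diarrhea": 57.35,
    "abdominal pain": 47.06,
}


def implied_count(percent, total):
    """Nearest whole number of cases corresponding to a reported percentage."""
    return round(percent / 100 * total)


counts = {s: implied_count(p, TOTAL_CASES) for s, p in reported_percent.items()}

for symptom, n in counts.items():
    recomputed = 100 * n / TOTAL_CASES
    print(f"{symptom}: {n}/{TOTAL_CASES} cases ({recomputed:.2f}%)")
```

Under this assumption, each reported percentage corresponds to a whole number of cases, which supports 68 as the denominator.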

To identify the pathogens causing acute gastroenteritis in this outbreak, six stool specimens from patients, 10 anal swab samples from the park staff, and food and drinking water samples were collected for norovirus testing. The results showed that the six stool specimens were GII positive, two of which exhibited a mixed infection with GI and GII. However, other samples from the commercial park were negative. The ORF1 and ORF2 junction was amplified from six samples by RT-PCR and sequenced, and the following genotypes were found: GII.13[P21] (n = 2), GII.2[P16] (n = 3), GII.17[P17] (n = 1) and GI.4[P4] (n = 2). In the mixed infection, GI.4[P4] and GII.2[P16] were found. We obtained three complete genome sequences from two coinfection cases, including two GI.4[P4] sequences and one GII.2[P16] sequence. These sequences were submitted to the GenBank database with the accession numbers MN938460, MN938461, and MN394542 to MN394546.
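
The dual-typing names used above combine a capsid genotype with a polymerase type (P-type) in square brackets. The following sketch parses such names and tallies the sequences reported in this outbreak. The regular expression covers only the simple form used in this section (it does not handle variant names such as GII.4(Sydney)[P4(New Orleans)]), and flagging a strain when its capsid and polymerase numbers differ is a simplifying assumption for illustration, not the criterion used by the Norovirus Genotyping Tool.

```python
import re

# Matches simple dual-typing names such as "GII.13[P21]".
DUAL_TYPE = re.compile(r"^(G[IVX]+)\.(\d+)\[P(\d+)\]$")


def parse_dual_type(name):
    """Split a name like 'GII.13[P21]' into genogroup, capsid type and P-type."""
    match = DUAL_TYPE.match(name)
    if match is None:
        raise ValueError(f"unrecognized dual-typing name: {name!r}")
    genogroup, capsid, p_type = match.groups()
    return {"genogroup": genogroup, "capsid_type": int(capsid), "p_type": int(p_type)}


# Junction sequences per genotype, as reported in the text.
junction_counts = {"GII.13[P21]": 2, "GII.2[P16]": 3, "GII.17[P17]": 1, "GI.4[P4]": 2}

per_genogroup = {}
discordant = []  # capsid number differs from polymerase number
for name, n in junction_counts.items():
    parsed = parse_dual_type(name)
    per_genogroup[parsed["genogroup"]] = per_genogroup.get(parsed["genogroup"], 0) + n
    if parsed["capsid_type"] != parsed["p_type"]:
        discordant.append(name)

total_sequences = sum(junction_counts.values())
print(per_genogroup, total_sequences, discordant)
```

The tally gives eight junction sequences from six samples, which is consistent with two of the six GII-positive samples also containing a GI virus.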

To investigate the genetic relationship between the recombinants and other sequences available in the GenBank database, full-genome sequences and ORF1-2 fragments of strains were analyzed by constructing phylogenetic trees (Fig. 1A). A phylogenetic tree based on a fragment of the RdRp gene revealed that the recombinant strains belonged to the same branch as viruses from the United Kingdom (MH218651) and Bhutan (MH702263) and shared 98.4%-98.5% sequence identity with MH702263 (Fig. 2A). The ORF2 fragment of these strains belonged to the GII.13 genotype branch and shared 98.2% sequence identity with MG892908 (Fig. 2B). The complete genome sequences of the GII.13[P21] strains were 98.0%-98.1% identical to that of MH218651 and 99.8% identical to each other. A SimPlot analysis revealed a break point in the GII.13[P21] strains at approximately nt 5059, in ORF1 near the ORF1/ORF2 junction, as shown in Fig. 1B.

Fig. 1

Phylogenetic and SimPlot analysis of noroviruses based on nucleotide sequences. (A) Phylogenetic analysis of full-genome sequences of noroviruses in the outbreak (maximum-likelihood method). The strains from this study are indicated by a solid black circle. (B) SimPlot analysis of GII.13[P21] strains performed using a window size of 1500 nucleotides and a step size of 100 nucleotides. The GII.13[P21] strains were compared with KJ196284 (gray line) and MG892908 (black line) using a partial sequence (1500 bp) of the ORF1 and ORF2 regions.

Fig. 2

Phylogenetic analysis of noroviruses based on (A) the nucleotide sequence of the RdRp region and (B) the nucleotide sequence of the ORF2 region. The trees were constructed by the neighbor-joining method. The GII.13[P21] strains from this study are indicated by a solid black circle.

GII.13[P13] and GII.21[P21] strains have a unique binding site for recognition of glycans with a terminal β-galactose [15]. This site consists of eight residues located at the top of each P domain. The eight residues are as follows: N297 and W298 of the B loop; S357, T359, and S360 of the N loop; and N395, N397, and T398 of the T loop. None of these residues were mutated in the GII.13[P21] recombinant strains in this study. However, other substitutions that might affect the structure of the binding site were also observed – N294S, N309S, and V394Q – as well as a V541A substitution in the C-terminal amino acid sequence. The HBGA binding sites were conserved in the GII.13[P21] recombinant strains. T135A and K139R substitutions were found in the RdRp segment of these viruses.

Discussion

GI and GII noroviruses are the most important viral cause of non-bacterial gastroenteritis in humans globally. A large proportion of the population is infected with noroviruses via contact with contaminated food, water, and environments every year [17, 18, 26]. Various GII genotypes have been recognized in a number of outbreaks, including GII.4[P4], GII.16[P16], and GII.17[P17] [9, 11, 20]. Owing to the high frequency of genetic exchange at the ORF1 and ORF2 junction, multiple recombinant strains, such as GII.4[P31] and GII.2[P16], have emerged and caused outbreaks [19, 27, 28]. Previous studies have shown that GII.17[P17] was the most prevalent genotype in 2014 in China [11, 13]. Genotypes GII.17[P17], GII.13[P13], and GII.21[P21] belong to a separate genetic lineage in the phylogenetic tree [29, 30]. Unlike GII.4[P4] and GII.2[P16], which have caused large outbreaks and infections around the world, human infections with genotypes GII.21[P21] and GII.13[P13] have rarely been reported. Differences in the HBGA binding sites of GII.21[P21] and GII.13[P13] variants might contribute to the sporadic rate of infection [15]. Hence, GII.21[P21] and GII.13[P13] genotypes are still evolving and spreading, but at a lower rate. Although these genotypes might infect humans only sporadically, it is still useful to study their genetic diversity.

Due to the lack of in vitro cell culture systems and in vivo animal models for human noroviruses, there is still a lack of information on their pathogenesis and epidemiology. Whole-genome sequencing of recombinant strains is important for vaccine development strategies and studies of viral evolution. Hence, we determined the genome sequences of GII.13[P21] recombinant strains detected in the city of Changsha, China. From the epidemiological information, although only visitors to the commercial park suffered from gastroenteritis, all food and water samples from the park were free of contamination, indicating that the source of the contamination might have been direct contact with infected persons or contaminated environments. Based on the genetic and sequence data, we conclude that the outbreak involved multiple genotypes, including GI.4[P4], GII.2[P16], and GII.17[P17], which have been reported frequently in previous outbreaks [10]. These results reveal that the prevalent genotypes continue to infect humans. Phylogenetic analysis showed that the ORF1 and ORF2 segments of the recombinant strains belong to different branches. These results suggest that sequences in these recombinant strains might have originated in other countries neighboring China or that the strains might have evolved from local strains that have not been detected in the environment before. The high degree of sequence similarity between the two GII.13[P21] recombinant strains indicated that the infection probably had a common source or was spread via human-to-human transmission. Phylogenetic analysis has shown that members of genotypes GII.21[P21] and GII.13[P13] are closely related to each other [15]. GII.21[P21] and GII.13[P13] are continually observed in Korea and Thailand [13]. The break point identified in this study was located in the ORF1 segment, at the same position as reported previously in other recombinant strains [23]. Based on these findings, it is suggested that recombination between these genotypes might occur more easily than with other genotypes.

Although the genotypes GII.21[P21] and GII.13[P13] represent a new evolutionary lineage of norovirus selected by HBGAs, the binding sites of the recombinant strains are still conserved [16]. The role of substitutions in the B, N and T loops of the P domain, including N294S, N309S and V394Q, which are associated with adaptation of the HBGA binding site, needs to be clarified in future research. The T135A and K139R substitutions in the ORF1 sequence and the V541A substitution in the ORF2 sequence may have no influence on the structure of the binding pocket. Although the sporadic norovirus genotypes, unlike GII.2[P16] and GII.4[P4], do not cause serious epidemics worldwide because of these special binding sites, human infections with them are associated with severe vomiting and diarrhea. Hence, it is important to enrich the database of norovirus genome sequences.

Although only a few samples were collected in this outbreak, we obtained the full genome of GII.13[P21] recombinant strains that have rarely been reported in China. Due to the high rate of genetic exchange in the norovirus genome, the virus can escape immune monitoring in individuals and acquire new host specificity [31]. Our research is important for understanding the diversity and wide distribution of noroviruses.