Molecular epidemiology and temporal evolution of norovirus associated with acute gastroenteritis in Amazonas state, Brazil
Globally, norovirus (NoV) is considered the most common cause of diarrheal episodes across all age groups. Despite its wide genetic diversity, the GII.4 genotype predominates and has been associated with epidemics worldwide. In this study, we characterized sporadic cases of diarrhea in NoV-positive children over a five-year period (2010–2014).
A total of 250 NoV-positive samples identified by an enzyme immunoassay (EIA) were subjected to RT-PCR and partial nucleotide sequencing of the polymerase and capsid genes. Phylogenetic analysis was performed to identify NoV genotypes using the binary classification. In addition, sequences from the P2 subdomain (capsid) gene of GII.4 variants were characterized by evolutionary analyses, using the MCMC method implemented in the BEAST package. A 3D structure was built using protein modeling.
Phylogenetic analysis demonstrated a predominance of genotype GII.4 (52.4%, 99/189), represented by the variants New Orleans_2009 and Sydney_2012, followed by GII.P7/GII.6 with 6.3% (12/189). Amino acid analyses of the GII.4 strains showed several important amino acid changes. High evolutionary rates were found: 7.7 × 10⁻³ for the Sydney variant and 6.3 × 10⁻³ for New Orleans. Based on the evolutionary analysis, the time to the most recent common ancestor (TMRCA) was calculated as an estimate of the population divergence time: 2008.7 for the New Orleans variant and 2010.7 for the Sydney variant. We also observed a transitional lineage between New Orleans and Sydney.
This study describes the different strains of norovirus isolated from Amazonas state in Brazil during a five-year period. Considering that NoV are capable of changing their antigenic epitopes rapidly, a continuous surveillance is important to monitor the occurrence and changes of the NoV in the community through epidemiological studies. These results contribute to the understanding of NoV molecular epidemiology and its evolutionary dynamics in Amazonas state, Brazil.
Keywords: Norovirus, Acute gastroenteritis, Phylogenetic, Genetic diversity
BEAST: Bayesian Evolutionary Analysis Sampling Trees
ESS: Effective sample size
HBGA: Histoblood group antigens
MCMC: Bayesian Markov Chain Monte Carlo
ORFs: Open reading frames
PDB: Protein Data Bank
qPCR: Quantitative polymerase chain reaction
RdRp: RNA-dependent RNA polymerase
RT-PCR: Reverse transcription polymerase chain reaction
TMRCA: Time to the most recent common ancestor
Norovirus (NoV) is considered a major cause of non-bacterial gastroenteritis worldwide. These viruses are highly infectious, requiring a low viral load to cause infection (≥18 particles), and they are environmentally stable for long periods. These characteristics increase their infectivity and facilitate their transmission and spread, which can cause outbreaks, hospitalization, and global epidemics.
Noroviruses are transmitted mainly by the fecal-oral route, through contact with infected persons, ingestion of contaminated water or food, or aerosols produced by vomiting. A meta-analysis concluded that norovirus is responsible for 18% of the cases of acute gastroenteritis (AGE) worldwide, with approximately 24% of these cases in the community, 20% in outpatients and 17% in hospitalized patients.
The norovirus genome comprises single-stranded, positive-sense RNA organized into three open reading frames (ORFs). ORF1 encodes a large polyprotein of 1738 amino acids (AA), which is cleaved by the viral protease (3C) into six non-structural proteins: p28, NTPase, p22, VPg, 3C-like protease (3CLpro), and RNA-dependent RNA polymerase (RdRp). ORF2 encodes a major structural protein, VP1, and ORF3 encodes a minor structural protein, VP2.
The virus has an icosahedral capsid composed of 90 dimers of the VP1 protein, which consists of two domains, the shell (S) domain and the protruding (P) domain. The S domain (AA 50-225) is more closely related to the structure of the capsid. The P domain (AA 226-530) is subdivided into two subdomains, P1 and P2. The P2 subdomain contains important determinants of antigenicity and is responsible for binding to histoblood group antigens (HBGA), which function as attachment factors or co-receptors on host cells [7, 8]. Changes in the P2 nucleotide sequence of GII.4 strains are associated with the emergence of new pandemic/epidemic strains (variants) with altered antigenicity profiles.
The genus Norovirus is classified into at least six genogroups (GI to GVI), which are subdivided into more than 40 genotypes. Viruses from genogroups GI, GII, and GIV are known to infect humans. GII.4 is the predominant genotype, responsible for the majority of norovirus outbreaks.
The emergence of new GII.4 variants every two to three years is associated with most norovirus pandemics. Since 1995, six GII.4 pandemic variants have emerged, designated US 95/96, Farmington Hills_2002, Hunter_2004, Den Haag_2006b, New Orleans_2009, and Sydney_2012. In addition, other GII.4 variants have been described, including Asia 2003 and Yerseke 2006a, both of which were related to additional limited outbreaks.
In the present study, a molecular approach was designed for a phylogenetic analysis of norovirus lineages in Amazonas state, Brazil, over five years. Sequence analysis of the polymerase, capsid, and P2 subdomain regions was successfully used for the identification of genotypes, as well as for characterization of the recombinant strains.
Selection of clinical specimens and norovirus detection
In total, 1053 fecal specimens were collected from children (< 10 years old) with acute gastroenteritis symptoms by the National Program for Surveillance of Rotavirus Gastroenteritis in Manaus city, Amazonas State, between January 2010 and December 2014. The program investigated sporadic cases of diarrhea in inpatients at public health facilities. These samples were tested for the presence of norovirus by an enzyme immunoassay (EIA) using the RIDASCREEN® Norovirus 3rd Generation EIA kit (R-Biopharm, Darmstadt, Germany), resulting in 349 positive samples distributed over the five years. Seventy-one percent of these positive samples, those with available material (n = 250; 2010 = 36; 2011 = 33; 2012 = 70; 2013 = 52; 2014 = 59), were selected for amplification by reverse transcription polymerase chain reaction (RT-PCR) and genotyping.
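The sample counts reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures given in this section; the variable names are illustrative and not part of the study's workflow.

```python
# Counts reported in the text (variable names are illustrative).
total_specimens = 1053
eia_positive = 349
selected_by_year = {2010: 36, 2011: 33, 2012: 70, 2013: 52, 2014: 59}

# Samples with available material that went on to RT-PCR.
selected_total = sum(selected_by_year.values())

# Percentages derived from the reported counts.
eia_positivity = 100 * eia_positive / total_specimens
selected_share = 100 * selected_total / eia_positive

print(f"Selected for RT-PCR: {selected_total}")
print(f"EIA positivity: {eia_positivity:.1f}%")
print(f"Share of EIA positives selected: {selected_share:.1f}%")
```

The yearly counts sum to the stated n = 250, and the selected share comes out at roughly 71.6% of the 349 EIA-positive samples.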
This study was approved by the Ethics Committee on Human Research of Evandro Chagas Institute, Brazilian Ministry of Health (CEPH/IEC protocol No. 0017/2014 update No. 1.318.103 of 2015).
Nucleic acid extraction and reverse transcription
Nucleic acids were extracted using the silica method. The extracted genetic material was subjected to reverse transcription (RT) with a random primer [pd(N)6™ (Amersham Biosciences, UK)] using the enzyme Superscript™ II Reverse Transcriptase (Invitrogen, USA).
Norovirus RNA amplification
Norovirus-positive samples were amplified by RT-PCR targeting region B of the polymerase gene (213 bp) and region D of the capsid gene (253 bp) in the viral genome, using primers Mon 431/432/433/434 and Cap C/D1/D3, respectively. The GII.4 strains were also amplified with the primers EVP2F/EVP2R (653 bp), targeting the hypervariable capsid region, the P2 subdomain. To investigate samples with different genotypes in the polymerase and capsid genes, PCR targeting the ORF1/2 junction region was performed using the primers Mon 431 and G2SKR [14, 17].
DNA purification and sequencing
The amplicons were purified with the QIAquick® PCR purification kit (QIAGEN®) or the MEGAquick-spin™ Total Fragment DNA Purification Kit (iNtRON Biotechnology, Kyungki-Do, Korea), as described in the manufacturers’ protocols. Sequencing was performed with the Big Dye Terminator Cycle Sequencing Ready Reaction Kit (v.3.1) (Applied Biosystems, Foster City, CA, USA), using the same primer pairs as in the PCR, on an ABI Prism 3130 xl DNA Sequencer (Applied Biosystems, Foster City, USA). All reactions included positive controls (a positive standard sample, norovirus GII.4) and negative controls (DNase/RNase-free water). The sequences generated were deposited in GenBank under accession numbers MF401649-MF401943.
Molecular and phylogenetic analyses
Preliminary genotype analyses were performed with the Norovirus Genotyping Tool v.1.0 (http://www.rivm.nl/norovirus/typingtool). Phylogenetic analyses were performed by maximum likelihood with the IQTree program v.1.3.0, using ultrafast bootstrapping (UFboot). The statistical significance of the phylogenies was estimated with 1000 replicates. The phylogenetic trees were edited with the program FigTree v.1.4.2.
Analysis of the epitopes of the P2 region was done in the MEGA 6 program. The prototype sequences were obtained from the GenBank database of the National Center for Biotechnology Information (NCBI). The recombination analyses were performed using the Simplot v. 3.5.1 program.
Evolutionary analysis of norovirus GII.4 variants
To conduct time-scaled phylogenetic analysis, P2 subdomain sequences were analyzed using the Bayesian Markov Chain Monte Carlo (MCMC) method implemented in Bayesian Evolutionary Analysis Sampling Trees (BEAST) v1.8.2. The time to the most recent common ancestor (TMRCA) was estimated with an uncorrelated log-normal relaxed molecular clock model. TMRCA was determined using the Coalescent Piecewise Bayesian Skyride Plot method with 100 million replicates (more details in Additional file 1: Table S1). Nucleotide variations within and between clusters were examined by maximum likelihood under the GTR + I + G nucleotide substitution model, selected with jModelTest v. 2.1.
The 3D structure was built using protein homology modeling. The initial search and selection of templates for the norovirus capsid were done in the Protein Data Bank (PDB) (https://www.rcsb.org/). The selected templates were 1IHM and 4OP7. The MODELLER© v. 9.15 software was used to build the 3D models. After protein modeling, the results were validated using the PROCHECK and VERIFY3D programs in order to check the quality of the biochemical parameters. A visual inspection was performed using the PyMOL Molecular Graphics System v. 1.8 (Schrödinger, LLC).
Positivity rates obtained for norovirus in fecal samples from children with acute gastroenteritis, from Manaus, Amazonas, Brazil, between 2010 and 2014, using an enzyme immunoassay (EIA) and polymerase chain reaction (PCR)
Columns, each as Pos/Total (%): 2010 | 2011 | 2012 | 2013 | 2014 | TOTAL
Rows: Tested by EIA | Tested by PCR(a)
Genotypes of norovirus obtained in 189 positive fecal samples from children with gastroenteritis, from Manaus, Amazonas, Brazil, between 2010 and 2014
Column: Nº of cases (%)
Row: GII.P4/GII.4 New Orleans_2009
Norovirus GII.4 variants
The GII.4 Sydney_2012 variant accumulated several changes in putative epitopes of the P2 subdomain. When the Sydney_2012 strains were analyzed, we identified five changes in AA of antigenic epitopes: 297 and 372 (A epitope), 340 (C epitope), 393 (D epitope) and 412 (E epitope) (Fig. 4). The majority (4/5) of these changes occurred in strains collected in 2014, except for the alteration of AA 393 (D epitope), which was present in 97% of the total samples. Three of these changes modified the chemical nature of the AA: from polar uncharged to nonpolar (340), from negatively charged to polar uncharged (372), and from nonpolar to polar uncharged (393).
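The notion of a substitution that changes the chemical nature of a residue can be made concrete with a small lookup. The sketch below groups the 20 standard amino acids by side-chain class using one common textbook convention (some residues, such as glycine, cysteine and tyrosine, are assigned differently by other schemes). The helper `class_change` and the example substitutions are hypothetical illustrations; the text above does not state which residues were exchanged at each position.

```python
# One common side-chain classification (an assumption; conventions differ).
SIDE_CHAIN_CLASS = {
    **dict.fromkeys("GAVLIMPFW", "nonpolar"),
    **dict.fromkeys("STCYNQ", "polar uncharged"),
    **dict.fromkeys("DE", "negatively charged"),
    **dict.fromkeys("KRH", "positively charged"),
}


def class_change(old: str, new: str) -> tuple[str, str, bool]:
    """Return the class before, the class after, and whether they differ."""
    before = SIDE_CHAIN_CLASS[old.upper()]
    after = SIDE_CHAIN_CLASS[new.upper()]
    return before, after, before != after


# Hypothetical substitutions, chosen only to illustrate the three kinds
# of class change described in the text; they are not the study's data.
for old, new in [("T", "A"), ("D", "N"), ("G", "S"), ("I", "V")]:
    before, after, changed = class_change(old, new)
    print(f"{old}->{new}: {before} -> {after} (class changed: {changed})")
```

A substitution such as I to V stays within the nonpolar class, which is why only some of the observed changes are reported as altering the chemical nature of the residue.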
To investigate the temporal evolutionary dynamics of GII.4 norovirus in the Amazon, we applied Bayesian coalescent analysis, implemented in the BEAST package, to 58 P2 (VP1) sequences from 2010 to 2014. Different population dynamics models were tested (constant, exponential, expansion, skyline), and the ESS (effective sample size) values indicated that the skyride population growth model was the best fit to the data. The uncorrelated log-normal model estimated high rates of evolution: 7.7 × 10⁻³ for the Sydney variant and 6.3 × 10⁻³ for New Orleans. The same Bayesian inference estimated the TMRCA of the population analyzed as 2008.7 for the New Orleans strains and 2010.7 for Sydney. The boxplot showed that the variation in the evolutionary rate was higher in the Sydney strains (average 7.7 × 10⁻³), but the interval was larger than for New Orleans (1.1 × 10⁻³ to 1.8 × 10⁻²) (Fig. 5).
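The TMRCA values are expressed as decimal years. As a reading aid, the sketch below converts a decimal year into an approximate calendar date by scaling the fractional part by the number of days in that year. The helper `decimal_year_to_date` is a hypothetical convenience function, not part of BEAST, and it says nothing about the uncertainty around each estimate.

```python
from datetime import date, timedelta


def decimal_year_to_date(decimal_year: float) -> date:
    """Map a decimal year (e.g. 2008.7) to an approximate calendar date."""
    year = int(decimal_year)
    start = date(year, 1, 1)
    # Account for leap years when scaling the fractional part.
    days_in_year = (date(year + 1, 1, 1) - start).days
    fraction = decimal_year - year
    # Truncate to whole days; sub-day precision is not meaningful here.
    return start + timedelta(days=int(fraction * days_in_year))


for variant, tmrca in [("New Orleans_2009", 2008.7), ("Sydney_2012", 2010.7)]:
    print(variant, decimal_year_to_date(tmrca).isoformat())
```

Under this conversion, both estimates fall in September of their respective years (2008 and 2010).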
Considering the high genetic diversity and elevated rates of evolution of norovirus, continuous surveillance of cases is required to monitor genotypes and the emergence of new strains. The evolution of the GII.4 pandemic strains is a consequence of point mutations in the P domain of the capsid and of genomic recombination events between ORF1 and ORF2.
In epidemiological investigations carried out in several countries, the GII.4 strain was observed as the cause of most cases, consistent with the findings of this study [12, 34]. The pattern observed for the GII.4 variants in Manaus was similar to that found across the world, where new variants appear every 2 or 3 years, replacing the previous one. Between 2009 and 2012, the New Orleans pandemic strain was responsible for 75% of the outbreaks of diarrhea in New Zealand and Australia. In some studies, it was possible to observe its co-circulation with other variants of GII.4, such as Den Haag_2006b [10, 37]. Such co-circulation was found among the analyzed samples from Manaus from 2012 onward, when Sydney_2012 was predominant and New Orleans_2009 appeared in few cases.
Recombination events are known to be frequent evolutionary mechanisms in norovirus genomes and can strongly influence phylogenetic grouping. The GII.Pg/GII.12 strain found in Manaus in February 2010 shares 99.8% nucleotide identity with other recombinants that circulated in Rio Grande do Sul (South of Brazil) in 2009 (KR074161-62, KR074190-91), indicating possible circulation within the country. Moreover, data from the literature report that this recombinant emerged on almost all continents between 2009 and 2011 [39, 40, 41, 42].
Interestingly, in this work the recombinant strain GII.P7/GII.6 was detected in all years between 2011 and 2014. Recombinant strains are usually found in few cases or causing outbreaks. This finding suggests that this lineage is well established in the population and that more comprehensive studies involving immunogenicity would be necessary to assess its impact. The genotypes GII.6 and GII.7 are frequently described in studies conducted in Brazil, often second only to GII.4 strains [44, 45]. In most epidemiological studies, genotyping was performed by sequencing only one region of the genome, the viral capsid, which makes it difficult to understand the actual epidemiology of the recombinant strains. The data obtained in the present study reinforce the need for binary classification of norovirus genotypes, as already suggested by Kroneman et al.
Analysis of the P2 subdomain revealed changes in both the GII.4 New Orleans_2009 and the Sydney_2012 strains. This subdomain is known to interact with neutralizing or blocking antibodies and with HBGA ligands. Several studies using human and mouse monoclonal antibodies against norovirus VLPs in neutralization assays have identified a correlation between the emergence of novel GII.4 epidemic strains and amino acid changes in specific epitopes (A-E). Such changes alter the immunogenicity pattern, resulting in escape from the action of the immune system.
The alterations found in the Manaus strains involve residues 294/413 and 297/372, identified as antigenic determinants of New Orleans and Sydney 2012, respectively. Thus, the New Orleans and Sydney lineages have undergone significant mutations in blocking epitopes and already differ from the prototypes. In the Sydney strains, the changes in epitopes were observed only from May 2014 onward, which may indicate that during 2012 and 2013 these strains were acquiring point mutations that culminated in the amino acid changes seen in the 2014 strains.
RNA viruses are known to evolve faster than DNA genomes, mainly because their replication lacks repair mechanisms. The relaxed-clock estimates of the evolutionary rates in Manaus (Sydney: 7.7 × 10⁻³; New Orleans: 6.3 × 10⁻³) were similar to those of other studies, which reported 6.99 × 10⁻³ and 7.3 × 10⁻³ subst./site/year for full VP1 and P2 subdomain gene sequences, respectively [48, 49]. In a study spanning 30 years conducted in Belém, in the neighboring state of Pará, Siqueira et al. found an evolutionary rate of 9.05 × 10⁻³ subst./site/year for other GII.4 variants. The time-scaled evolutionary analysis was based on the P2 subdomain, the most variable region of VP1, which is under selective pressure from the immune system. This may explain the high evolutionary rates found in this study. Further studies should address the time-scaled phylogeny and phylodynamics of the VP1 and RdRp genes.
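To give these rates a more tangible scale, a rate in substitutions/site/year can be multiplied by a region length to obtain the expected number of substitutions per year across that region. The sketch below does this for the two rates estimated here. The region length of 653 sites is an assumption borrowed from the EVP2F/EVP2R amplicon size given in the methods; the length of the alignment actually analyzed is not stated, and the function name is illustrative.

```python
def expected_substitutions(rate: float, sites: int, years: float = 1.0) -> float:
    """Expected substitutions over a region: rate (subst./site/year) x sites x years."""
    return rate * sites * years


# Rates reported in this study (subst./site/year).
rates = {"Sydney_2012": 7.7e-3, "New Orleans_2009": 6.3e-3}

# Assumption: amplicon length from the methods, used only as a rough
# stand-in for the number of sites in the P2 alignment.
REGION_SITES = 653

for variant, rate in rates.items():
    per_year = expected_substitutions(rate, REGION_SITES)
    print(f"{variant}: about {per_year:.1f} substitutions/year over {REGION_SITES} sites")
```

Under that assumption, the two rates correspond to roughly four to five substitutions per year over the region.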
The amino acid changes acquired over the years in samples from the present study are already reflected in the 3D structure of the protein, which may have allowed the virus to evade host immune surveillance. Recently, the emergence of the GII.17_Kawasaki strain has been observed in several countries, but it is important to note that GII.4 strains still play a key role in norovirus cases.
One limitation of this study is the use of EIA for norovirus screening, given the limited detection potential of this technique compared with the current gold standard, quantitative PCR (qPCR), as well as its lower sensitivity for emerging genotypes. However, a study by Siqueira et al. demonstrated good performance using the same EIA kit, with a sensitivity of 92% and a specificity of 83.3%. This methodology has already been used in other studies carried out in the Amazon region, establishing higher positivity rates [44, 52, 53]. Uncommon recombinant strains (GII.P21/GII.2, GII.13/GII.17, GII.P21/GII.3) and emergent variants (GII.4_Sydney) were also detected by EIA [50, 54].
The present study was based on the epidemiological and molecular surveillance of norovirus strains in samples collected from an important state of the Brazilian Amazon region over a period of five years. Our data indicate that noroviruses are an important cause of gastroenteritis in the Amazon region. Although highly diverse, NoV circulating over the past 5 years was predominantly characterized as GII.4, including the GII.4 variants New Orleans_2009 and Sydney_2012. Monitoring of AGE cases caused by norovirus is essential to evaluate the impact of this virus in the community (sporadic cases and outbreaks), as well as for the development and evaluation of control measures, such as vaccines.
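For readers less familiar with these metrics, sensitivity and specificity are computed from the counts of a test compared against a reference method. The sketch below shows the calculation. The counts are hypothetical, chosen only so that the output matches the percentages quoted above; they are not the data of the cited evaluation.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of reference-positive samples that the test detects."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of reference-negative samples that the test calls negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts (not from the cited study), picked to reproduce
# the quoted 92% sensitivity and 83.3% specificity.
tp, fn, tn, fp = 46, 4, 25, 5

print(f"Sensitivity: {100 * sensitivity(tp, fn):.1f}%")
print(f"Specificity: {100 * specificity(tn, fp):.1f}%")
```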
The authors would like to acknowledge the valuable technical support provided by the entire staff of the Norovirus and other Enteric Viruses Laboratory (LNVE). We are grateful to the Fundação Amazônia de Amparo a Estudos e Pesquisas do Pará (FAPESPA) for the scholarship provided to the first author during her Master’s degree in the Postgraduate Program in Virology. Special thanks go to the Central Laboratory (LACEN) of Amazonas State and all involved in the Brazilian Network Surveillance Program of Viral Gastroenteritis, who conducted the sample collection. This research was supported by Evandro Chagas Institute, Secretary of Health Surveillance, Ministry of Health (IEC/SVS/MS).
This study was supported by the Evandro Chagas Institute, Brazilian Ministry of Health.
Availability of data and materials
All data generated or analyzed during this study are included in this article. In addition, all the sequences obtained are available in the GenBank database.
JMH contributed to all sections, including laboratory tests, molecular analyses, and writing the manuscript. LDS drafted and revised the manuscript and performed the study design. ECSJ and RSB performed the phylogenetic analysis of the sequences and all the bioinformatics work. EAMR and STPC participated in the laboratory analyses. MSSL processed the clinical samples and performed the EIA. YBG contributed to data analysis and the editing of the manuscript. All authors read and approved the final manuscript.
Ethics approval and consent to participate
All fecal specimens used in this research were collected by the National Program for Surveillance of Rotavirus/Norovirus Gastroenteritis, under the coordination of the Brazilian Ministry of Health (the governmental sector responsible for the administration and maintenance of public health in the country). Because the present study involved only molecular characterization of viral strains, consent to participate for those under 16 years old was waived, as approved by the Ethics Committee on Human Research of Evandro Chagas Institute (protocol No. 0017/2014, update No. 1.318.103 of 2015), according to Resolution 466/12, published in the DOU n° 12, June 13, 2013, Section 1, Page 59, and its complements, of the National Health Council / Ministry of Health, Brazil.
Consent for publication
The authors declare that they have no competing interests.
- 3. Allen DJ, Trainor E, Callaghan A, O'Brien SJ, Cunliffe NA, Iturriza-Gómara M. Early detection of epidemic GII-4 norovirus strains in UK and Malawi: role of surveillance of sporadic acute gastroenteritis in anticipating global epidemics. PLoS One. 2016;11:e0146972.
- 6. Green KY. Caliciviridae: the noroviruses. In: Knipe DM, Howley PM, editors. Fields Virology. Philadelphia: Lippincott Williams & Wilkins; 2013. p. 582–604.
- 21. Rambaut A. FigTree. Tree Figure Drawing Tool version 1.4.2, 2006-2014. Institute of Evolutionary Biology, University of Edinburgh. http://tree.bio.ed.ac.uk/. Accessed 14 Jan 2016.
- 51. Siqueira JA, Linhares AC, Oliveira DS, Soares LS, Lucena MS, Wanzeller AL, et al. Evaluation of third-generation RIDASCREEN enzyme immunoassay for the detection of norovirus antigens in stool samples of hospitalized children in Belém, Pará, Brazil. Diagn Microbiol Infect Dis. 2011;71:391–5.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.