Molecular epidemiology of respiratory syncytial virus among children and adults in India 2016 to 2018

Respiratory syncytial virus (RSV) is a common cause of respiratory tract infections among children less than 5 years of age and the elderly. This study aimed to determine the circulating genotypes of RSV among severe acute respiratory illness (SARI) cases during the period 2016–2018 in India, among hospitalized acute febrile illness cases aged 1 to 65 years. Throat/nasopharyngeal swab samples were tested for RSV and its subgroups by real-time reverse transcriptase polymerase chain reaction (RT-PCR); sequencing and phylogenetic analysis were then performed for the second hypervariable region of the G gene. RSV-A and RSV-B subgroups co-circulated during 2016, 2017, and 2018, with RSV-A as the dominant subgroup in 2016 and RSV-B as the dominant subgroup in 2017 and 2018. Phylogenetic analysis revealed that the circulating genotypes of RSV were GA2 (16/16) of RSV-A and GB5 (23/23) of RSV-B in the South, North, and Northeast regions of India between 2016 and 2018. Here we report the first study describing the distribution of RSV-A and RSV-B genotypes in different geographic regions of India among children and adults during 2016 to 2018. We also report the GA2.3.7 lineage of the GA2 genotype for the first time in India, to the best of our knowledge. Supplementary Information The online version contains supplementary material available at 10.1007/s11262-021-01859-4.


Introduction
Acute lower respiratory tract infections are the major cause of hospitalization and death among children less than 5 years of age, with RSV being an important viral pathogen causing these infections worldwide [1,2]. Previous studies estimate that about 33.1 million RSV infections occur annually, of which about 3.2 million lead to hospitalization and around 59,600 to death in children younger than 5 years [3]. Almost every child is infected with RSV by the age of 2 years, and primary infection is rarely asymptomatic [4]. Reinfections are common among individuals of any age group, and RSV infections also cause morbidity and mortality among the elderly [5]. The incubation period of RSV in human infection ranges from 3 to 5 days [6]. The common symptoms of RSV infection are nasal congestion, coryza, nonproductive cough, and low-grade fever, and the infection may progress to involve the lower respiratory tract, causing bronchiolitis, pneumonia, and tracheobronchitis [7–9]. The estimated mortality rate attributable to RSV infections is around 0.86 per 1000 live births [10]. RSV infections are usually observed from March to October in the Southern Hemisphere and from September to May in the Northern Hemisphere [11]. In India, RSV infections are observed during the monsoon, peaking in September and October [12–15].
RSV could become a vaccine-preventable disease if an effective vaccine is developed against it, and several vaccination strategies are in advanced clinical phases for protection against severe RSV disease. Developing an effective vaccine against RSV requires an understanding of the circulating genotypes. However, limited data are available on the molecular epidemiology of RSV among children and adults in the different geographical regions of India. Here we report a study to understand the molecular epidemiology of RSV in different geographical regions of India from April 2016 to September 2018.

Study population
A total of 151 RSV-positive archived samples were included in this study. The cases came from 10 states of India (Karnataka, Kerala, Assam, Goa, Gujarat, Maharashtra, Jharkhand, Tripura, Tamil Nadu, and Odisha; Fig. 1) and were recruited under the Acute Febrile Illness (AFI) surveillance study conducted by the Manipal Institute of Virology (MIV). The case definition was patients aged between 1 and 65 years admitted to hospital with fever (≥ 38 ℃); samples were tested for various viral pathogens including RSV, as well as bacterial and parasitic agents, by molecular and serological methods [24]. All inpatients fulfilling the inclusion criteria of the case definition were included in the surveillance study. The surveillance study was conducted for a duration of 5 years from 2014 to 2019 [24]. The total number of the AFI samples that were obtained annually during the study period is as follows: 414,359,. RSV-positive archived respiratory specimens of the AFI study were used in this study. The current study was reviewed and approved by the Institutional Ethical Committee, Manipal Academy of Higher Education (IEC No: UEC/32/2013-14, MUEC/Renewal-08/2017). Cases were recruited after obtaining informed consent as part of the AFI study.

Real-time RT-PCR for subgrouping
RNA was extracted from 150 μl of each of the 151 RT-PCR RSV-positive throat swab samples using the QIAamp® Viral RNA mini kit (QIAGEN, Hilden, Germany) as per the manufacturer's instructions. To determine the RSV subgroups of the samples, real-time RT-PCR was performed with the primers targeting the N gene published by Hu et al., using the AgPath-ID™ One-step RT-PCR kit (Applied Biosystems, Foster City, USA) [25]. The reaction was performed in a QuantStudio™ 5 PCR system (Applied Biosystems, Foster City, USA). The cycling conditions for the PCR reaction were as follows: reverse transcription at 50 °C for 30 min, initial denaturation at 95 °C for 15 min, followed by 40 cycles of denaturation at 95 °C for 15 s and extension at 58 °C for 30 s.
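The cycling conditions above can be written down as a small data structure, which makes the programmed run time easy to check. The following is a minimal sketch, not the instrument's run file: the names `Step`, `HOLDS`, `CYCLE`, and `programmed_seconds` are hypothetical, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


# One-off hold steps, as stated in the text.
HOLDS = [
    Step("reverse transcription", 50, 30 * 60),
    Step("initial denaturation", 95, 15 * 60),
]

# Steps repeated in every amplification cycle, as stated in the text.
CYCLE = [
    Step("denaturation", 95, 15),
    Step("extension", 58, 30),
]

N_CYCLES = 40


def programmed_seconds(holds, cycle, n_cycles):
    """Sum of programmed hold times; ramping between temperatures is not counted."""
    return sum(s.seconds for s in holds) + n_cycles * sum(s.seconds for s in cycle)


total = programmed_seconds(HOLDS, CYCLE, N_CYCLES)
print(f"programmed time: {total // 60} min")
```

Under these assumptions the programmed time is 75 min; the actual run on the instrument would be longer because of temperature ramping and data collection.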

Genotyping of RSV and sequencing
Sixty-nine samples were then selected for genotyping by a purposive sampling method based on the state (place), the year of illness, and the age of the patients. As the majority of cases were from two states, 10 samples were selected from each state based on the year of illness in order to reduce bias. For states with fewer than 10 samples, all samples were included for genotyping. Seminested PCR was carried out with the primers targeting the second hypervariable region of the G gene published by Parveen et al., using the AgPath-ID™ One-step RT-PCR kit (Applied Biosystems, Foster City, USA) [22]. The reactions were performed in a ProFlex™ 5 PCR system (Applied Biosystems, Foster City, USA). The amplified products were purified using the GenElute™ Gel purification Kit (Sigma Aldrich, Missouri, USA) according to the manufacturer's instructions. The purified products were sequenced using the BigDye Terminator v3.1 cycle sequencing kit (Applied Biosystems, Foster City, USA) on an ABI 3500xl genetic analyzer (Applied Biosystems, Foster City, USA).
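The per-state rule described above (at most 10 samples per state, all samples where fewer than 10 are available) can be sketched as follows. This is an illustration of the counting only: the function name `n_selected` and the `cap` parameter are hypothetical, the state counts are those reported in the description of the study population, and the choice of which samples to take within a state (by year and age) is not modeled.

```python
def n_selected(n_available: int, cap: int = 10) -> int:
    """Samples taken from one state: all of them if there are fewer than `cap`, else `cap`."""
    return min(n_available, cap)


# RSV-positive counts reported for the three states with the most cases.
reported = {"Tamil Nadu": 62, "Karnataka": 30, "Assam": 20}
from_large_states = sum(n_selected(n) for n in reported.values())

# Positives from the other seven states, out of 151 in total.
remaining = 151 - sum(reported.values())

print(from_large_states, remaining, from_large_states + remaining)
```

If each of the remaining seven states contributed no more than 10 samples (an assumption, since their individual counts are not listed in this section), all 39 of those samples would be included and the total would be 30 + 39 = 69, which agrees with the number of samples selected for genotyping.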

Sequence analysis and phylogeny
The sequences obtained consisted of partial regions of the second hypervariable region of the G gene, including the N-terminal of the F gene. The genome region from nucleotide 5419 to 5643 in reference to the sequence Acc. No. JX627336 of the RSV-A subgroup, and the genome region from nucleotide 5349 to 5618 in reference to the sequence Acc. No. KF640637 of the RSV-B subgroup, were used for the construction of the phylogenetic trees and amino acid alignments. The published sequences of the second hypervariable region of the G gene of RSV from different geographical regions of the world were downloaded from the GenBank database of NCBI (National Centre for Biotechnology Information). Multiple sequence alignments of the study sequences for the second hypervariable region of the G gene with known RSV genotypes, as recently classified by Goya et al., were carried out using the MUSCLE (Multiple Sequence Comparison by Log Expectation) algorithm in MEGA X version 10.0.5 [21]. The phylogenetic trees of nucleotide sequences were constructed separately for the RSV-A and RSV-B subgroups using the Maximum Likelihood method, with the Bayesian Information Criterion used to identify the suitable substitution model. GTR + G was the suitable substitution model for RSV-A sequences and HKY85 + G for RSV-B sequences; each was used to construct the corresponding phylogenetic tree, with support values from 1000 bootstrap replicates, in PhyML software (version 3.0). The trees were visualized and midpoint-rooted trees were constructed using FigTree software. A total of 16 RSV-A and 23 RSV-B sequences of the second hypervariable region of the G gene were deposited in GenBank with the accession numbers MN463622 to MN463637 and MN463638 to MN463660, respectively.
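As a quick check on the regions used for the trees, the lengths implied by the reference coordinates can be computed directly. This is a minimal sketch that assumes the coordinates are 1-based and inclusive, as is conventional for GenBank records; `region_length` and `extract` are hypothetical helper names, and the sequence passed to `extract` below is a toy string, not a reference genome.

```python
def region_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


def extract(seq: str, start: int, end: int) -> str:
    """Return the 1-based, inclusive range [start, end] from a Python string."""
    return seq[start - 1:end]


rsv_a_len = region_length(5419, 5643)  # relative to JX627336
rsv_b_len = region_length(5349, 5618)  # relative to KF640637
print(rsv_a_len, rsv_b_len)

# Toy example of the slicing convention (not real sequence data).
print(extract("ACGTACGT", 2, 4))
```

Under that assumption the analyzed regions are 225 nucleotides for RSV-A and 270 nucleotides for RSV-B.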

P-distance analysis
The intergenotypic and intragenotypic p-distances of the second hypervariable region of G gene sequences of RSV-A and RSV-B subgroups were calculated separately for each of the subgroups. The p-distance matrices were generated using MEGA X version 10.0.5 software.
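The p-distance is the proportion of compared sites at which two aligned sequences differ. The sketch below illustrates the calculation and the intra- and intergenotypic averages; it is not the MEGA X implementation. Skipping sites with gaps or ambiguous bases pair by pair is an assumption (MEGA offers several gap-handling options), the function names are hypothetical, and the sequences are toy strings rather than study data.

```python
from itertools import combinations, product
from statistics import mean

VALID = set("ACGT")


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x in VALID and y in VALID]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def mean_within(group):
    """Mean intragenotypic p-distance; needs at least two sequences."""
    return mean(p_distance(a, b) for a, b in combinations(group, 2))


def mean_between(group1, group2):
    """Mean intergenotypic p-distance over all cross-group pairs."""
    return mean(p_distance(a, b) for a, b in product(group1, group2))


# Toy aligned sequences for two hypothetical genotypes.
genotype_x = ["ACGTACGT", "ACGTACGA"]
genotype_y = ["ACCTACGT"]

print(mean_within(genotype_x))
print(mean_between(genotype_x, genotype_y))
```

Comparing the within-group and between-group means is the basis for judging whether two groups of sequences are distinct enough to be treated as separate genotypes.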

Data analysis
Sociodemographic factors, clinical features, and laboratory parameters of the RSV cases were obtained from the AFI study database and analyzed using SPSS 15.0 for Windows software (SPSS™ Inc, Chicago, IL, USA). Continuous variables were analyzed using the one-way ANOVA test and categorical variables using the chi-square test. A p value of < 0.05 was set as the level of statistical significance.

Description of the study population
Among the AFI cases recruited, a total of 151 samples tested positive for RSV by RT-PCR during the period 2016 to 2018. Of the 10 states, most cases were from Tamil Nadu (62 cases, 41.1%), followed by Karnataka (30, 19.9%) and Assam (20, 13.2%) (Table 1). The mean age of the study population was 22.2 years. The annual RSV positivity rate during the AFI study period was 0.6% (73/12414) in 2016, 0.2% (42/20359) in 2017, and 0.7% (36/4954) in 2018. Of the study population, 79 (52.3%) patients were male and 72 (47.7%) were female. Most of the RSV-infected cases were of low (75, 49.7%) or middle (74, 49%) socioeconomic status. The mean duration of hospital stay was 3.64 days. The frequencies of symptoms and signs among RSV cases are summarized in Table 1. RSV cases presented with symptoms such as cough (94%), general weakness (89.4%), and coryza (78.1%). Coryza (p = 0.015), headache (p < 0.001), night sweats (p = 0.035), and joint pain (p = 0.036) were significantly associated with RSV infection among adults and children (Supplementary Table 1). Mean ESR and CRP levels were observed to be elevated.
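The annual positivity rates quoted above follow directly from the reported numerators and denominators. The short sketch below recomputes them as a consistency check; the variable names are hypothetical and the counts are taken from the text.

```python
# RSV-positive samples and AFI samples tested per year, as reported in the text.
positives = {2016: 73, 2017: 42, 2018: 36}
tested = {2016: 12414, 2017: 20359, 2018: 4954}

# Positivity rate in percent, rounded to one decimal place.
rates = {year: round(100 * positives[year] / tested[year], 1) for year in positives}

for year, rate in rates.items():
    print(f"{year}: {rate}% ({positives[year]}/{tested[year]})")

print("total positives:", sum(positives.values()))
```

The yearly numerators add up to the 151 RSV-positive samples included in the study.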

Discussion
RSV is one of the most common viral causes of acute respiratory illness (ARI) among infants and young children throughout the world [23]. In India, most of the molecular epidemiological studies analyzing the circulation and genetic diversity of RSV were conducted in Assam, New Delhi, and Maharashtra [5,22,23,29,30]. This study is the first of its kind to cover 10 states (Karnataka, Kerala, Tamil Nadu, Assam, Goa, Gujarat, Maharashtra, Jharkhand, Tripura, and Odisha), spanning geographical locations in the southern, eastern, and western parts of India, from April 2016 to September 2018. Most molecular epidemiological studies of RSV focus only on children less than 5 years of age [1,31]. In this study, we included cases from the age group 1–65 years. In the current study, phylogenetic analyses were performed based on the genotypes newly reclassified by Goya et al. [21]. Phylogenetic analysis of the RSV-A subgroup revealed a novel genetic lineage, GA2.3.7, for the first time in India to the best of our knowledge, in addition to the GA2.3.5 lineage of the GA2 genotype. The distinct clade of the GA2.3.7 lineage of the GA2 genotype showed a patristic distance of approximately 0.120 when the distance was set to 0 (zero) at the ancestral node of the 2.3.0 subgenotype, fulfilling the criterion for designation as a new genetic lineage, which requires more than eight multiples of the patristic distance of 0.015, as proposed by Goya et al. [21]. The patristic distances of the 2.3.0 subgenotype and the lineages within it are depicted in Supplementary Fig. 1. The GA2.3.7 lineage of the GA2 genotype was first reported in 2016 in Lebanon with a 72 bp nucleotide duplication similar to that of the GA2.3.5 lineage of the GA2 genotype, and with characteristic amino acid substitutions (G284S, E295K, Y304H, L314P, T319I, and P320K) [18].
In the current study sequences of GA2.3.7 lineage, a substitutional change I319T was observed in both the GA2.3.7 lineage study sequences, whereas T292I was observed in a sequence from [5,23,30]. The amino acid substitutional changes mentioned earlier in the results section (i.e., from L248I to T320A) were also reported by Cui et al., Ogunsemowo et al., and Malasao et al. A few newly identified substitutional changes were also observed in our study sequences, but these changes were not sufficient to define the sequences as a novel genotype under the classification criteria of Goya et al. The genotypes circulating in other countries during our study period (i.e., 2016, 2017, and 2018) were the same as in our findings, namely the GA2.3.5 lineage of the GA2 genotype in South Korea, Portugal, Greece, Australia, Saudi Arabia, Lebanon, and Thailand [18,26–28,36–38]. In India, according to the old classification, previous studies reported circulation of GA2 and GA5 genotypes (2001–2005) and the ON1 genotype (2011–2015) in Delhi; GA5 and NA1 genotypes (2009–2012) in Assam; the NA1 lineage and ON1 genotype (2009–2012) in Pune; and NA1 and ON1 genotypes (2012–2014) in Kerala [14,22,23,30,39,40]. At the time of preparing this manuscript, a 2019 publication by Broor et al. on the status of the molecular epidemiology of RSV in India was, to the best of our knowledge, the latest information on the genotypes circulating in India [41].
Phylogenetic analysis of the RSV-B subgroup revealed the GB5.0.5a lineage of the GB5 genotype as the predominant circulating genotype in the South, North, and Northeast regions of India. GB5.0.5a was first detected in 2013 in England (Accession number: KY249660; unpublished data), with a 60 bp nucleotide duplication region along with twelve signature amino acids [21]. Sahu et al. (Madhya Pradesh) reported the substitutional changes I254T, E261G, S267P, I270T, and Y287H; Haider et al. (Delhi) reported the substitutional changes T227N, E241K, I254T, S267P, I270T, A271V, Y287H, and T290I; and Choudhary et al. (Maharashtra) reported the substitutional changes T227N, E261G, S267P, I270T, A271V, Y287H, and T290I, which were also observed in our study sequences [5,23,42]. The amino acid substitutional changes mentioned earlier in the results section (i.e., from T227N to T312I) were also reported by Abou-El-Hassan et al., Tsergouli et al., and Yun et al. [26–28]. The genotypes circulating in other countries during our study period (i.e., 2016, 2017, and 2018) were found to be of the GB5.0.5a lineage of the GB5 genotype in South Korea, Portugal, Greece, Australia, Saudi Arabia, Lebanon, and Thailand [18,26–28,36–38]. In India, according to the old classification, previous studies reported circulation of BA genotype (2001–2005), BA7, BA9, BA10, BA12 genotypes -2010), SAB4, BA8, and BA9 genotypes (2011 in Delhi; BA genotype (2005–2008) in Kolkata; GB2, BA9, and BA12 genotypes (2009–2012) in Maharashtra; and BA9 and BA10 genotypes (2012–2014) in Kerala [14,22,23,39,40,42]. The p-distance analysis was carried out for 6 genotypes (GB1, GB2, GB4, GB5, GB6, and GB7); 1 genotype (GB3) was excluded from the analysis because it lacked the signature amino acids needed for genotypic determination in the second hypervariable region of the G gene [21].

Co-circulation of the A and B subgroups was observed in this study, with a predominance of the RSV-A subgroup in 2016 and of the RSV-B subgroup in 2017 and 2018. Similar co-circulation of RSV-A and RSV-B was reported in Kolkata, with RSV-B as the predominant subgroup in 2005–2006 and RSV-A in 2007–2008 [14]. In Delhi, RSV-B was predominant among the co-circulating subgroups in 2001–2002, whereas RSV-A was predominant during 2002–2003, 2003–2004, and 2004–2005 [22]. A study that analyzed the laboratory parameters of RSV-infected individuals reported mean values of WBC count, ESR, and CRP of 9840 cells/mm³, 28 mm/h, and 27 mg/l, respectively [43]. In this study, the mean values of WBC count, ESR, and CRP were 6849 cells/mm³, 31.8 mm/h, and 15.6 mg/l, respectively.
The circulation and spread of the GA2 and GB5 genotypes in India might be due to changes in viral antigenic properties, leading to the emergence of new genotypes that evade the host immune response. The disappearance of old genotypes and their replacement by new genotypes of RSV might be due to herd immunity developed in the population [44]. The development of a vaccine against the genotypes circulating in a particular region could reduce the spread of RSV infection among children and the elderly.
This study has some limitations. Sequences could not be obtained for a few samples because the quality of the RNA was low, as the samples came from a retrospective study. The newly identified substitutional changes need to be examined in further studies in other parts of India. The p-distance analysis was carried out for 6 genotypes (GB1, GB2, GB4, GB5, GB6, and GB7); 1 genotype (GB3) was excluded from the analysis because it does not have the characteristic amino acids needed for genotypic determination in the second hypervariable region of the G gene.

Conclusions
GA2 was the circulating genotype of the RSV-A subgroup, whereas GB5 was the circulating genotype of the RSV-B subgroup, in the South, North, and Northeast regions of India between 2016 and 2018. To the best of our knowledge, this is the first report to identify the GA2.3.7 lineage of the GA2 genotype in India. More surveillance studies on the molecular epidemiology of RSV are needed to better understand the genotypes circulating globally, as mutations continue to be observed. Active surveillance is required to determine the predominance of the RSV-A and RSV-B subgroups, which may help in the development of therapeutics or preventive measures against RSV infection. The restrictions on population movement applied during the COVID-19 pandemic are expected to decrease the transmission of RSV infection worldwide. Maintaining the same social distancing norms even after the COVID-19 pandemic subsides, if possible, could lead to a massive decline in the infection rate of RSV and other transmissible diseases.

Mid-rooted phylogenetic tree of the RSV-B subgroup. Mid-rooted phylogenetic tree of the second hypervariable region of the RSV-B G protein gene, constructed using the Maximum Likelihood method with the HKY85 + G substitution model and 1000 bootstrap replicates in PhyML software. Study sequences are displayed in red. The reference sequences used to construct the tree were downloaded from the GenBank database. The two-letter codes in the labels of the MIV isolates represent the state names, where: Tn = Tamil Nadu, Ka = Karnataka, Jh = Jharkhand, Ga = Goa, Tr = Tripura, As = Assam, Kl = Kerala, Od = Odisha, and Mh = Maharashtra. The grid lines over the tree represent the patristic distance, where the distance between each grid line is 0.005. The names of the genotypes and lineages are highlighted in different colors.

Declarations
Conflict of interest All authors: no reported conflicts of interest. All authors have submitted the ICMJE form for disclosure of potential conflicts of interest. Conflicts that the editors consider relevant to the content of the manuscript have been disclosed.

Consent for publication
All authors have seen and approved the manuscript.
Ethical approval The current study was reviewed and approved by the Institutional Ethical Committee, Manipal Academy of Higher Education (IEC No: UEC/32/2013-14, MUEC/Renewal-08/2017). Cases were recruited after obtaining Informed consent as part of the AFI study.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.