Introduction

Influenza viruses are negative-sense, segmented RNA viruses belonging to the family Orthomyxoviridae and are responsible for seasonal epidemics that cause significant morbidity and mortality from acute respiratory illness every year [16]. Their genomes comprise seven or eight single-stranded RNA segments, and the viruses are classified into three types: A, B and C [14]. Influenza A viruses are further divided into subtypes based on the antigenic and genetic properties of the surface glycoproteins hemagglutinin (HA) and neuraminidase (NA). These include 18 HA subtypes (H1–H18) and 11 NA subtypes (N1–N11) [4].

Hemagglutinin is a major membrane-bound viral surface glycoprotein that plays an important role in viral attachment and in evasion of neutralizing antibody responses. Under selective pressure, HA accumulates mutations that help the virus evade the host immune system. The HA glycoprotein comprises two domains: a globular head (composed of the HA1 polypeptide) and a long fibrous stem (composed of the HA2 polypeptide). The HA1 region is the most rapidly evolving region of influenza viruses. Based on studies of HA sequence variation, four antigenic sites (Sa, Sb, Ca and Cb) have been identified for influenza A (H1N1) viruses [3, 22, 36]. Neuraminidase is the second major surface glycoprotein; it comprises a bulky head attached to a slender stalk and plays a major role in the release of progeny virus from the infected cell surface [21]. The host specificity of influenza A virus has changed significantly owing to antigenic drift and antigenic shift in the HA and NA genes, leading to rapid evolution of the virus [25].

In April 2009, a novel swine-origin influenza virus emerged in humans; it was first identified in Mexico and the United States and spread rapidly worldwide, causing significant global morbidity and mortality [8, 25]. Characterization of the influenza A (H1N1) pdm 09 virus revealed that it was a genetic reassortant carrying segments from the North American and Eurasian lineages [5, 13, 15]. In India, the first case of influenza A (H1N1) pdm 09 infection was identified in May 2009 [29]. The virus subsequently became endemic and was reported in different parts of India, including Maharashtra [24]. The Haffkine Institute for Training, Research and Testing is a National Influenza Center under WHO for testing of pandemic H1N1 influenza samples in Mumbai. Comparing antigenic differences alongside phylogenetic analysis is essential for understanding the multiple lineages of influenza virus variants. Information on the evolution of influenza A (H1N1) pdm 09 viruses in India is limited, with no published data from Mumbai. The present study aims to elucidate the evolution and genetic characteristics of influenza A (H1N1) pdm 09 viruses circulating in Mumbai during the pandemic seasons of 2009–2011.

Methods

Cells and Virus Isolation

Madin-Darby canine kidney (MDCK) cells, obtained from the National Center for Disease Control (NCDC), were maintained in Minimal Essential Medium (MEM, Gibco, by Life Technologies) supplemented with 10% fetal bovine serum (Gibco, by Life Technologies), 100 U/ml penicillin and 0.5 mg/ml streptomycin (Hi-Media Laboratories, India). During the period August 2009–July 2010, clinical samples (throat and nasal swabs) positive for influenza A (H1N1) pdm 09 virus were inoculated onto confluent MDCK cells in serum-free medium containing 2 µg/ml of tosyl phenylalanyl chloromethyl ketone (TPCK)-treated trypsin and were passaged twice to reach sufficient titers. A total of 150 samples were selected based on the cycle threshold value (Ct < 35), the maximum sample volume available and the availability of a complete clinical history of the patient [33]. After the MDCK cells had been observed for cytopathic effect, tissue culture fluid was harvested and stored at −80 °C. All samples were processed in an enhanced biosafety level 2 (BSL-2+) facility with BSL-3 precautions [1]. The presence of influenza virus in the tissue culture supernatant was determined by hemagglutination assay using guinea pig red blood cells (RBCs) as described elsewhere [18, 19].

RNA Extraction and Reverse Transcription Polymerase Chain Reaction

Viral RNA was extracted from 140 µl of viral cell culture supernatant using the QIAamp viral RNA mini kit (Qiagen, Hilden, Germany). The HA and NA genes were amplified using oligonucleotide primers as described elsewhere [34]. One-step reverse transcription polymerase chain reaction (RT-PCR) was performed using the Access Quick RT-PCR System (Promega Corporation, Madison, WI, USA). Each segment was amplified in three to four fragments of 600–800 bp with 200 bp overlap to obtain appropriate sequence coverage. Each reaction contained 5 µl of template RNA added to 20.5 µl of RT-PCR mixture consisting of 3.5 µl nuclease-free water, 2 µl forward primer, 2 µl reverse primer, 0.5 µl AMV reverse transcriptase (5 U) and 12.5 µl Master Mix (2X). PCR conditions followed the WHO protocol [34]. The resulting amplicons were analyzed by 1.5% agarose gel electrophoresis.
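The reaction volumes listed above can be checked with a few lines of arithmetic. The sketch below only restates the volumes given in the text; the variable names, the `scale_mix` helper and its pipetting overage are illustrative assumptions, not part of the published protocol.

```python
# Per-reaction volumes (µl) of the one-step RT-PCR mixture, as listed in the text.
rt_pcr_mix_ul = {
    "nuclease_free_water": 3.5,
    "forward_primer": 2.0,
    "reverse_primer": 2.0,
    "amv_reverse_transcriptase": 0.5,
    "master_mix_2x": 12.5,
}
template_rna_ul = 5.0

# Volume of the mixture alone, and of the complete reaction with template added.
mix_volume_ul = sum(rt_pcr_mix_ul.values())
final_volume_ul = mix_volume_ul + template_rna_ul


def scale_mix(n_reactions, overage=0.1):
    """Scale the mixture for several reactions.

    The default 10% pipetting overage is an assumption for illustration only.
    """
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 2) for name, vol in rt_pcr_mix_ul.items()}


print(mix_volume_ul, final_volume_ul)
print(scale_mix(4))
```

Summing the components confirms that they account for the stated 20.5 µl of mixture per reaction.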

PCR Product Purification and Sequencing

Amplified products were purified using the HiPurA PCR product purification kit (Hi Media Laboratories Pvt. Ltd) as per the manufacturer’s instructions and stored at −20 °C until sequencing. DNA sequencing was performed on an automated sequencer (ABI 3730xl, Applied Biosystems, USA) with the corresponding forward primers [34].

Nucleotide Sequence Deposition

The nucleotide sequence data from this study have been submitted to the National Center for Biotechnology Information (NCBI) GenBank database. The accession numbers of the HA gene sequences obtained in this study are KM219026-KM219072, and those of the NA gene sequences are KJ958934-KJ958980; these sequences are treated as testing data in further analysis.

Sequence Driven Phylogenetic Analysis

Multiple sequence alignment was performed using Molecular Evolutionary Genetics Analysis (MEGA) 6.0.5 [31]. Sequences were assembled and aligned with reference sequences of the same season and the same gene to generate consensus sequences. Phylogenetic trees were constructed by the maximum parsimony (MP) method with the subtree-pruning-regrafting (SPR) search algorithm. To compare the genomic sequences of isolates from Mumbai, sequences of the vaccine strains and of strains from other regions were obtained from the NCBI Influenza Virus Resource and included in the sequence-driven analysis as reference data. Potential N-glycosylation sites were predicted using nine artificial neural networks [17].
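To illustrate what the maximum parsimony criterion measures, the following minimal sketch scores a fixed tree topology with Fitch's small-parsimony algorithm. It is not MEGA's implementation and does not perform the SPR tree search; the function names, the nested-tuple tree encoding and the toy alignment are assumptions made for illustration.

```python
def fitch_score(tree, site_states):
    """Minimum number of state changes for one alignment column on a fixed tree.

    tree: nested 2-tuples of leaf names, e.g. (("a", "b"), ("c", "d")).
    site_states: dict mapping leaf name -> character observed at this site.
    """
    def visit(node):
        if isinstance(node, str):              # leaf: its observed state, no cost
            return {site_states[node]}, 0
        left_set, left_cost = visit(node[0])
        right_set, right_cost = visit(node[1])
        common = left_set & right_set
        if common:                             # children agree: no extra change
            return common, left_cost + right_cost
        # children disagree: keep the union and count one change
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


def parsimony_length(tree, alignment):
    """Sum Fitch scores over all columns of an alignment (dict name -> sequence)."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_score(tree, {name: seq[i] for name, seq in alignment.items()})
        for i in range(n_sites)
    )


# Toy alignment, made up for illustration (not study data).
toy = {"ref": "ACGTA", "iso1": "ACGTC", "iso2": "ATGAA", "iso3": "ATGAA"}
tree_a = (("ref", "iso1"), ("iso2", "iso3"))
tree_b = (("ref", "iso3"), ("iso1", "iso2"))
print(parsimony_length(tree_a, toy), parsimony_length(tree_b, toy))
```

A parsimony search such as SPR proposes rearranged topologies and keeps those with the lowest total length; in this toy case `tree_a` requires fewer changes than `tree_b`.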

Results

During the period August 2009–July 2010, a total of 150 clinical samples positive for influenza A (H1N1) pdm 09 virus were selected for virus isolation. Of these 150 samples, forty-seven viruses could be isolated and were subsequently characterized. Complete HA and NA gene sequences were analyzed for genetic reassortment and phylogenetic clustering.

Phylogenetic and Genetic Characterization of Influenza A (H1N1) pdm 09 Strains

Phylogenetic analysis of the HA and NA genes of influenza A (H1N1) pdm 09 isolates showed genetic variation relative to the human vaccine strain of 2010–2011. Apart from the vaccine strain, the Mumbai influenza A (H1N1) pdm 09 viruses were similar to strains circulating throughout the world during those years (Figs. 1 and 2).

Fig. 1 Phylogenetic analysis of the HA gene segments of influenza A (H1N1) pdm 09 isolates in Mumbai. The closed circle indicates the WHO-recommended reference vaccine strain; closed squares indicate Indian isolates from Mumbai

Fig. 2 Phylogenetic analysis of the NA gene segments of influenza A (H1N1) pdm 09 isolates in Mumbai. The closed circle indicates the WHO-recommended reference vaccine strain; closed squares indicate Indian isolates from Mumbai

For detailed genetic characterization of the isolates from Mumbai, the nucleotide and deduced amino acid sequences of the HA1 region from the 47 isolates were compared with the human vaccine strain and with virus isolates available in GenBank. For all isolates, A/California/07/2009(H1N1), the vaccine strain used in the Northern Hemisphere in 2010–2011, served as the reference. All isolates were similar to the reference vaccine strain at both the nucleotide (98.8–99.5%) and deduced amino acid (98.8–99.7%) levels.
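Percent similarity values of this kind can be computed from a pairwise alignment. The sketch below is one simple way to do so, assuming the two sequences are already aligned to equal length and that gapped columns are skipped; MEGA's distance options may treat gaps differently, and the function name and toy sequences are illustrative.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue                      # skip gapped columns (an assumption)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy sequences, made up for illustration (not study data).
reference = "ATGAAGGCAA"
isolate = "ATGAAGGCTA"
print(percent_identity(reference, isolate))
```

The same function applies to deduced amino acid sequences, since it only compares characters column by column.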

Phylogenetic trees of the HA1 and NA genes were constructed by the MP method using the SPR algorithm with search level 0, in which the initial trees were obtained by random addition of sequences (five replicates). The analysis involved 75 nucleotide sequences for the HA gene and 68 nucleotide sequences for the NA gene from the composite training and testing dataset [31]. Phylogenetic analysis of the HA1 gene revealed three distinct clades. The majority of the influenza A (H1N1) pdm 09 sequences from Mumbai grouped in Clade I. The isolates in Clade I were homologous to the WHO-recommended vaccine strain, A/California/07/2009(H1N1)-like virus, used in the Northern Hemisphere during the 2010–2011 season. When compared with A/Pune/NIV20007/2009(H1N1), the average percent nucleotide and amino acid similarities of thirty-three isolates in this study were 99–99.8% and 99.1–100%, respectively, while three isolates showed similarities of 99.5–99.7% and ~99.7%, respectively, to A/Mum/NIV9312/2009(H1N1).

Influenza A (H1N1) pdm 09 isolates from Mumbai placed in Clade II were also homologous to the A/California/07/2009(H1N1)-like vaccine virus. The average percent nucleotide and amino acid similarities of four isolates in this study were 99.6–99.8% and 100%, respectively, to A/California/24/2009(H1N1), and 99.7–99.9% and 100% to A/Singapore/ON812/2009(H1N1). When compared with A/Guangdong/1202/2009(H1N1), the average percent nucleotide and amino acid similarities of five isolates in this study were 99.5–99.6% and 99.7–100%, and with A/Pune/NIV807/2009(H1N1) they were 99.5–99.6% and 99.4–99.7%, respectively. Influenza A (H1N1) pdm 09 isolates of the 2010 season were placed in Clade III and showed homology to the A/California/07/2009(H1N1)-like vaccine virus. The average percent nucleotide and amino acid similarities of the 2010 season isolates were 99.4–99.5% and 99.1–99.4%, respectively, to A/Thailand/CU-H2389/2010(H1N1), and 99.2–99.5% and ~94% to A/India/NIV42443/2010(H1N1).

The patterns of the antigenic sites in the HA1 gene of influenza A (H1N1) pdm 09 viruses were examined by alignment of the deduced amino acid sequences. For H1N1 viruses, four antigenic sites of the hemagglutinin have already been defined: Sa and Sb (strain-specific) and Ca and Cb (common antigenic sites) [3, 22, 36]. Amino acid sequences of the HA1 domain of Mumbai strains were compared with the vaccine strain to highlight functional variation that might affect vaccine efficacy. Comparative analysis of Mumbai A (H1N1) isolates with A/California/07/2009(H1N1) revealed mutations across all antigenic sites. Amino acid changes were detected at 14 positions in the four antigenic sites (Table 1). K153E, G155E, K160N and K163E were observed at site Sa; S185T, Q193H and N194D at site Sb; A139V, A141S, S203T, D222G, E235K and K239N at site Ca; and L70F at site Cb. Of the 47 isolates, 42 (89.4%) carried the S203T substitution at antigenic site Ca when compared with the human vaccine strain.
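The tally of substitutions per antigenic site can be reproduced with a short script. The sketch below is illustrative only: the site table contains just the positions reported in the paragraph above (not the full site definitions), and the function names and toy sequences are assumptions.

```python
import re
from collections import Counter

# Positions per antigenic site, restricted to those reported in the text
# (an incomplete table used only for this illustration).
REPORTED_SITE_POSITIONS = {
    "Sa": {153, 155, 160, 163},
    "Sb": {185, 193, 194},
    "Ca": {139, 141, 203, 222, 235, 239},
    "Cb": {70},
}


def call_substitutions(reference, isolate, start=1):
    """List substitutions such as 'S203T' between two aligned protein sequences."""
    return [
        f"{r}{pos}{q}"
        for pos, (r, q) in enumerate(zip(reference, isolate), start=start)
        if r != q and "-" not in (r, q)
    ]


def site_of(substitution):
    """Return the antigenic site of a substitution label, or None if outside."""
    pos = int(re.fullmatch(r"[A-Z](\d+)[A-Z]", substitution).group(1))
    for site, positions in REPORTED_SITE_POSITIONS.items():
        if pos in positions:
            return site
    return None


reported = ["K153E", "G155E", "K160N", "K163E", "S185T", "Q193H", "N194D",
            "A139V", "A141S", "S203T", "D222G", "E235K", "K239N", "L70F"]
per_site = Counter(site_of(s) for s in reported)
print(dict(per_site), sum(per_site.values()))

# Share of isolates carrying S203T, from the counts given in the text.
print(round(100 * 42 / 47, 1))

# Toy fragment (made up) numbered from position 203.
print(call_substitutions("SKD", "TKG", start=203))
```

Applied to a full alignment, `call_substitutions` would be run once per isolate against the vaccine strain, and the labels passed to `site_of` for classification.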

Table 1 Comparison of the deduced HA1 amino acid sequences of representative influenza A (H1N1) pdm 09 strains from Mumbai with that of the vaccine strain (in bold italic). Amino acid positions correspond to strain A/California/07/2009(H1N1). Identical residues are indicated by dots

The epitope regions of the influenza H1 subtype have been redefined by Deem and Pan, who introduced five epitope sites (A–E) for the HA glycoprotein using A/California/04/2009 numbering [3, 10]. Amino acid changes were detected at 21 positions in the five epitope sites (A–E). These include N129D, A139V and A141S at epitope site A; K153E, G155E, K160N, S185T, Q193H and N194D at epitope site B; V272F at epitope site C; D94N, K163E, S207N, D222G, E235K, K239N and E243Q at epitope site D; and L70F, P83S, N260K and A261D at epitope site E. In addition, all Mumbai influenza A (H1N1) pdm 09 isolates had a unique substitution of proline by serine at residue 83 (P83S) in epitope site E. Mutations outside the antigenic sites include V30I, D97N, I116M, K119N, K130R, G131S, V175G, D238G, I266V, Q293H and I321V.

N-linked glycosylation is conserved among various HA subtypes of influenza A viruses, with Asn-Xaa-Ser/Thr as the specific sequence motif for glycosylation. These sites may determine antigenic conservation or variation [14]. In the Asn-Xaa-Ser/Thr sequence, Xaa can be any amino acid except aspartic acid and proline [21]. Eight potential N-glycosylation sites, at amino acid positions 27, 28, 40, 104, 293, 304, 498 and 557, were predicted in A/California/07/2009(H1N1). These positions were also identified in the influenza A (H1N1) pdm 09 isolates from Mumbai, except A/Mumbai/2867/2009(H1N1), which had lost a glycosylation site through a change from asparagine (N) to isoleucine (I) at residue 498. In addition to the pre-existing glycosylation sites, four isolates from Mumbai gained new glycosylation sites. Isolates A/Mumbai/3411/2009(H1N1), A/Mumbai/3417/2009(H1N1), A/Mumbai/4467/2009(H1N1) and A/Mumbai/4923/2010(H1N1) each possessed an additional glycosylation site resulting from a lysine (K) to asparagine (N) change at residues 146, 136, 177 and 256, respectively.
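The Asn-Xaa-Ser/Thr motif can be located with a simple pattern scan, shown below as a minimal sketch. This is not the neural-network predictor used in the study; it only lists candidate sequons. The function names and toy sequences are assumptions. By default only proline is excluded at Xaa, which is a common convention; passing `excluded="DP"` applies the rule as stated in the text.

```python
import re


def find_sequons(protein, excluded="P", offset=1):
    """Return 1-based positions of Asn in N-X-S/T motifs (X not in `excluded`).

    A lookahead is used so that overlapping motifs (for example, Asn at two
    adjacent positions) are both reported.
    """
    middle = f"[^{re.escape(excluded)}]" if excluded else "."
    pattern = re.compile(f"(?=N{middle}[ST])")
    return [m.start() + offset for m in pattern.finditer(protein.upper())]


def compare_sequons(reference, isolate, **kwargs):
    """Report sequon positions lost or gained in an isolate relative to a reference.

    Assumes both sequences share the same numbering (aligned, no gaps).
    """
    ref = set(find_sequons(reference, **kwargs))
    iso = set(find_sequons(isolate, **kwargs))
    return {"lost": sorted(ref - iso), "gained": sorted(iso - ref)}


# Toy sequences, made up for illustration (not study data).
toy = "MKNNSTDNPSQNDTA"
print(find_sequons(toy))
print(find_sequons(toy, excluded="DP"))
print(compare_sequons("AKASNGT", "ANASIGT"))
```

A lysine-to-asparagine change that creates a new N-X-S/T motif appears under `gained`, while an asparagine-to-isoleucine change inside an existing motif appears under `lost`, mirroring the kinds of changes described above.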

Further, the nucleotide and deduced amino acid sequences of the NA gene from the 47 isolates were compared in detail with those of the human vaccine strain A/California/07/2009(H1N1) used in the Northern Hemisphere in 2010–2011. All isolates were similar to the reference vaccine strain, with 98.3–99.6% similarity at the nucleotide level and 98.3–100% at the amino acid level.

Phylogenetic analysis of the NA gene of Mumbai H1N1 strains revealed two distinct clades. The majority of the influenza A (H1N1) pdm 09 isolates grouped in Clade I, showing homology to the A/California/07/2009(H1N1)-like vaccine virus. However, the average percent nucleotide and amino acid similarities of these isolates to A/California/07/2009(H1N1) were as low as 98.3–99.4% and 98.3–100%, respectively. Three isolates from Mumbai were placed in Clade II and were also homologous to the A/California/07/2009(H1N1)-like vaccine virus. When compared with the reference strains in this clade, the average percent nucleotide and amino acid similarities of the three Clade II isolates were 99.5–99.7% and 99.4–100% to A/India/NIV30784/2010(H1N1), 98.9–99.1% and 98.9–99.6% to A/Kenya/073/2010(H1N1), and 99.9% and 100% to A/Ontario/152846/2009(H1N1), respectively.

Furthermore, comparative analysis of Mumbai A (H1N1) isolates with A/California/07/2009(H1N1) revealed mutations across the NA protein. Two mutations, V106I and N248D, were found in the majority of Mumbai strains. Three of the 2010–2011 season isolates, A/Mumbai/4916/2010(H1N1), A/Mumbai/4923/2010(H1N1) and A/Mumbai/5116/2010(H1N1), had the amino acid substitutions V241I and N369K in the NA protein.

Eight potential N-glycosylation sites, at amino acid positions 50, 58, 63, 68, 88, 146, 235 and 386, were predicted in the A/California/07/2009(H1N1) strain. These positions were also identified in all Mumbai influenza A (H1N1) pdm 09 isolates. In addition, two isolates, A/Mumbai/154/2009(H1N1) and A/Mumbai/180/2009(H1N1), possessed an additional glycosylation site resulting from a change of isoleucine (I) to threonine (T) at position 46 (I46T). Isolate A/Mumbai/162/2009(H1N1) possessed an additional glycosylation site resulting from a change of aspartic acid (D) to asparagine (N) at residue 151; isolate A/Mumbai/3417/2009(H1N1) possessed an additional glycosylation site resulting from a change of serine (S) to asparagine (N) at residue 35; and isolate A/Mumbai/2867/2009(H1N1) possessed an additional glycosylation site resulting from a change of proline (P) to threonine (T) at position 272.

Discussion

Pandemics occur when a novel strain of swine or avian influenza acquires HA and/or NA genome segments through reassortment between human, swine and avian influenza viruses. Influenza A (H1N1) pdm 09 was the result of genetic reassortment of multiple gene segments from different lineages [11]. In the present study, nucleotide and deduced amino acid sequences of the HA and NA genes of influenza A (H1N1) pdm 09 Mumbai isolates were compared with the WHO-recommended A/California/07/2009(H1N1)-like vaccine strain. Phylogenetic analysis of the HA and NA genes confirmed that these isolates were related to the vaccine strain. The HA1 region of hemagglutinin forms the globular head, which carries the receptor-binding site. Accordingly, variations in the H1N1 isolates were predominantly located at the proposed antigenic sites in the HA1 region, owing to its receptor binding specificity [21]. Overall, 14 substitutions were observed in the four antigenic sites when compared with A/California/07/2009(H1N1). Previous reports indicate that antigenic variants usually exhibit more than four amino acid substitutions situated at two or more antigenic sites on the HA protein, or one variation in an antigenic binding site and one in a sialic acid binding site [30, 35]. This outcome suggests that the influenza A (H1N1) pdm 09 viruses circulating in Mumbai during the 2009–2011 seasons were similar to the vaccine strain but had gradually undergone changes.

Compared with the vaccine strain, the majority of the H1N1 viruses in Mumbai from 2009 to 2011 carried the S203T substitution in antigenic site Ca of the HA1 region, consistent with the major strains in the United States and Asia [2, 7, 8, 9]. Furthermore, two isolates possessed the D222G substitution in the receptor binding cavity at the Ca antigenic site, which changes the side chain at this position from acidic and polar to neutral and non-polar. The D222G substitution in the HA gene was reported in influenza A (H1N1) pdm 09 virus in November 2009 in Norway [11]. It has been suggested that this substitution can affect the binding of the HA protein to its receptor. Influenza virus infection is initiated by binding of HA to sialic acid receptors on the surface of host cells. In humans, influenza infection is restricted to the upper respiratory tract, where α2–6-linked sialic acid receptors predominate. Human respiratory cells also express α2–3-linked receptors, although these are more abundant in the lower respiratory tract. Experimental settings suggest that substitution at position 222 causes a shift to dual α2–3/α2–6-linkage specificity, which enables the variant protein to bind to both receptors. This observation leads to the hypothesis that viruses with dual receptor specificity can replicate to high titer in the lungs, resulting in a more severe disease course [32]. Because two isolates circulating in Mumbai carried this substitution, it is important to study the dual receptor specificity of this variant, which may lead to severe disease progression. However, recent experimental studies in mice with influenza A (H1N1) 1918 pandemic viruses do not support such a conclusion [23, 27]. Experimental data with influenza A (H1N1) pdm 09 are yet to be published.

Two substitutions, Q293H and I321V, were observed in the HA1 domain of influenza A (H1N1) pdm 09 isolates from Mumbai. Amino acid residues 280 to 327 represent a conserved region and form part of the stalk structure of hemagglutinin that supports the globular region of the HA1 polypeptide. Mutations in this region can affect the receptor binding specificity of the virus. It has also been suggested that the occurrence of substitution at residue 321 may simply reflect a change in frequency over time rather than an association with severity [11, 23]. Sequence analyses of influenza A (H1N1) pdm 09 viruses of the 2010–2011 season showed further evolution from viruses of the 2009–2010 season. Genetic analysis of Mumbai isolates showed the S185T substitution at antigenic site Sb, located on the globular head of HA. This substitution was also observed in 65% of strains from Tunisia that were isolated from severe cases [11]. It may also enhance the binding capability of the virus, resulting in increased disease severity.

In addition to variation in the antigenic sites of the HA1 domain, the number of N-glycosylation sequons in the HA1 domain was also analyzed. Epidemiological observations suggest that N-linked glycosylation sites are relatively conserved but have tended to increase in number as influenza viruses evolve [23]. Because glycosylation at antigenic sites is an essential mechanism of immune evasion by influenza virus and also affects the antigenic properties of the virus, changes in the potential N-glycosylation sites were also studied [11]. Influenza A (H1N1) pdm 09 isolates from Mumbai exhibited both loss and gain of N-linked glycosylation sites in the HA and NA gene sequences. The presence or absence of N-linked glycosylation is significant because it may cause gain or loss of function of the glycoprotein: glycosylation is involved in initiating and maintaining protein folding, oligomerization, stability, solubility, quality control, sorting, transport, antigenicity and immunogenicity [11, 23].

As for the HA gene, the catalytic and framework residues of the NA active site were also examined [6, 28]. All isolates had conserved catalytic and framework residues. The majority of the isolates from Mumbai exhibited the V106I and N248D mutations, which were observed in the majority of influenza A (H1N1) pdm 09 viruses circulating worldwide [2, 8, 20, 26]. It has been suggested that mutation at residue 248 alters the central part of an antibody recognition site on NA [28]. The 2010–2011 season viruses from Mumbai were found to have the characteristic amino acid substitutions N369K and V241I, consistent with the substitutions reported in strains from Japan [8]. It has been documented that the N369K and V241I substitutions would improve the stability of the neuraminidase protein, which could in turn improve the fitness of oseltamivir-resistant viruses [7, 12].

The findings of the present study confirm that influenza A (H1N1) pdm 09 strains in Mumbai were homologous to strains reported worldwide, although they rapidly acquired antigenic changes. The genetic instability of influenza A (H1N1) pdm 09 viruses highlights the importance of monitoring viral genetic changes throughout the year for effective management of influenza epidemics. Even with vaccines against influenza A (H1N1) pdm 09 viruses available, it remains essential to monitor genetic changes in viral strains circulating in various parts of India in order to identify antigenic shifts and drifts. This will help in understanding significant evolutionary changes and thereby inform appropriate vaccination strategies for the prevention of influenza outbreaks.