Background

H9N2 influenza A viruses have been known to cause infection in poultry populations around the globe, including Ireland, Iran, Germany, Italy, Pakistan, Saudi Arabia, South Africa and the USA, since the mid-1990s [1]. In 1998, domestic pigs from Hong Kong were also observed to be infected with H9N2 influenza Y280-like viruses [2]. Several human cases of H9N2 infection have been recorded since 1997 from Hong Kong and China in children and adults exhibiting influenza-like symptoms and mild upper respiratory tract infections [2-6]. Genetic analysis of H9N2 viruses from Hong Kong live bird markets showed preferential binding of the viruses to 2,6-linked sialic acid, the human-like receptor [6, 7]. All these findings point towards the possibility of interspecies transmission of H9N2 viruses and their persistent threat to the human population.

Influenza A viruses belong to the Orthomyxoviridae family and have a genome of eight single-stranded RNA segments encoding ten proteins. These include two surface glycoproteins, hemagglutinin (HA) and neuraminidase (NA), along with the nucleoprotein (NP), three polymerase proteins (PA, PB1, PB2), two matrix proteins (M1, M2) and two non-structural proteins (NS1, NS2) [8-11]. Of these ten proteins, HA and NA are primarily responsible for facilitating influenza virus infection. There are 16 HA and nine NA subtypes. HA acts in the early stages of infection, binding the sialic acid receptor present on the host cell surface and mediating fusion of the viral and endosomal membranes and subsequent entry into the host cell [11]. Virus aggregation is prevented by the NA glycoprotein through cleavage of the α-ketosidic linkage between sialic acid and an adjacent sugar residue. This facilitates the movement of the virus to and from the site of infection by destruction of the receptors recognized by HA [12]. Previous studies have defined two distinct lineages of H9N2 influenza viruses: North American and Eurasian. The Eurasian lineage can be further divided into three major sublineages: the G1 lineage, represented by A/Quail/Hong Kong/G1/97 (G1-like); the Y280 lineage, represented by three prototype viruses, A/duck/Hong Kong/Y280/97 (Y280-like), A/Chicken/Beijing/1/94 (BJ94-like) and A/Chicken/Hong Kong/G9/97 (G9-like); and the Korean lineage, represented by A/chicken/Korea/38349-p96323/96 (Korean-like) and A/duck/Hong Kong/Y439/97 (Y439-like) [7, 13, 14].

It is important to study the evolution of H9N2 viruses because of their constant prevalence in poultry flocks and repeated emergence in the human population. The present study involved computational molecular analysis and phylogenetic characterization of 11 influenza A (H9N2) viruses isolated between 1997 and 2009. The aim of this study was to aid in understanding the evolution of H9N2 strains with pandemic potential, which have circulated in various animal populations during the indicated period.

Methods

Viruses

To perform this study, a computational search of all reported cases of influenza A H9N2 human infections from 1997 to 2009 was conducted. A total of eleven nucleotide sequences and their respective deduced amino acid sequences for each of the hemagglutinin (HA) and neuraminidase (NA) segments were retrieved from the NCBI flu database [15], accessed on April 19, 2010. The viruses used in this study are listed in Table 1. Sequence data were obtained together with information on the host, subtype, isolation year and isolation place. The selected sequences from H9N2 human cases were then aligned and compared using the multiple sequence alignment software ClustalW2 [16].

Table 1 GenBank accession numbers of genes and proteins of avian influenza A (H9N2) viruses isolated from humans between 1997 and 2009.

Potential Glycosylation sites and antigenic variations

Identification and comparison of N-glycosylation sites in the reported HA and NA protein sequences was performed with the online server ScanProsite [17], and the extent of antigenic variation between viruses was assessed with the CTL epitope prediction method [18]. Each amino acid sequence of HA and NA was evaluated separately using a consensus approach.

Phylogenetic characterization and tree construction

Phylogenetic patterns of NA and HA nucleotide sequences of H9N2 influenza viruses isolated from humans between 1997 and 2009 were examined using MEGA 4.0.2 [19]. Nucleotide sequences selected on the basis of local alignment and homology searches using BLAST were aligned using ClustalW. Unrooted phylogenetic trees were constructed by minimum evolution analysis with maximum composite likelihood and the Tamura-Nei model. Internal branching probabilities were determined by bootstrap analysis of 1000 replicates and are indicated as percentage values on each branch.
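
The resampling idea behind the bootstrap analysis can be illustrated with a minimal Python sketch. This is not the MEGA implementation: the toy alignment, the strain labels and the function names are hypothetical, and a simple p-distance is used in place of the Tamura-Nei model purely for brevity. Each replicate resamples alignment columns with replacement and recomputes the pairwise distance; a full analysis would instead rebuild a tree from every replicate and report how often each branch recurs.

```python
import random

def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites, skipping columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)

def resample_columns(alignment: dict, rng: random.Random) -> dict:
    """Draw alignment columns with replacement (one bootstrap pseudo-replicate)."""
    n_cols = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}

def bootstrap_distances(alignment: dict, a: str, b: str,
                        replicates: int = 1000, seed: int = 0) -> list:
    """Distance between sequences a and b in each bootstrap replicate."""
    rng = random.Random(seed)
    distances = []
    for _ in range(replicates):
        replicate = resample_columns(alignment, rng)
        distances.append(p_distance(replicate[a], replicate[b]))
    return distances

# Hypothetical toy alignment (not real H9N2 data).
toy_alignment = {
    "strain_A": "ATGGCAACCA",
    "strain_B": "ATGGCTACCA",
    "strain_C": "ATGACTATCG",
}
distances = bootstrap_distances(toy_alignment, "strain_A", "strain_B")
```

The spread of `distances` gives a rough sense of how sensitive a pairwise estimate is to the sampled sites, which is the same uncertainty that the branch percentages summarize at the level of the tree.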

Results and Discussion

Molecular Analysis

Influenza A viruses of subtype H9N2 are now considered to be widespread in poultry and have demonstrated the ability to infect humans [20]. The recurring presence of H9N2 infections in humans has raised concerns about the possibility of H9N2 viruses evolving into pandemic strains. Therefore, it is crucial to evaluate the potential pandemic threat posed by H9N2 viruses using experimental and computational approaches. In this study, bioinformatics analysis of HA and NA from H9N2 viruses was performed, and the key residues in the receptor binding site (RBS) and the cleavage motif of HA, together with the hemadsorbing (HB) site, stalk length and enzyme active sites of NA, were studied in detail.

Hemagglutinin

It has been well documented that the receptor binding site motif of HA is critical for cellular receptor specificity and for determining virus host range [21, 22]. Out of five conserved amino acids in the pocket of the HA glycoprotein, two positions showed the maximum number of mutations (198, 234) and three remained 99% conserved (191, 235 and 236). The pattern of observed mutations at position 198 was E198T, E198A, E198V and E198D, whereas at position 234 it was L234Q, Q234L and Q234M, as summarized in Table 2. Mutations in these regions are considered strong factors for a change in sugar specificity, thus leading to a change in host specificity. The presence of glutamine (Q) at position 234 (H3 numbering: 226) is a typical avian virus signature, and it has been reported that this amino acid results in a preference for binding to 2,3-linked sialic acid (avian receptors), whereas leucine (L) at the same position confers a preference for 2,6-linked sialic acid (human receptors) and is a potential cause of the reported human infections [5, 6, 23-25]. However, upon analysis of amino acids at the receptor binding site of HA glycoproteins, we identified that H9N2 viruses isolated in 2009 from infected patients in Hong Kong possessed glutamine at position 234 (H3 numbering: 226) instead of the leucine found in previous isolates, yet were apparently still able to bind human sialic acid receptors. As shown in Table 2, this residue variation in avian H9N2 viruses has not been reported before in human cases. This observation may be explained by a number of phenomena, such as genetic evolution of influenza viruses to evade host defense mechanisms, or transfer of the virus from swine to humans rather than avian-to-human transmission. It is also known that pigs act as a "mixing vessel" because viruses isolated from pigs recognize both types of sialic acid receptors [26] and, most importantly, represent the balance that exists between the activities of HA and NA. Both the PB2 and HA genes are known to be critical for the pathogenicity of the virus. Therefore, other proteins, such as that encoded by the PB2 segment [4, 27], must be considered as potential contributors to zoonotic H9N2 infection, possibly resulting in future human-to-human transmission.
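
The substitution notation used above (for example E198D or L234Q) can be generated mechanically from a pair of aligned sequences. The following Python sketch is illustrative only: the function name and the short toy sequences are hypothetical, the positions are simple 1-based indices into the toy strings rather than H9 or H3 numbering, and both sequences are assumed to be already aligned and of equal length.

```python
def substitutions(reference: str, query: str, positions: list) -> list:
    """List substitutions (e.g. 'E3D') at the given 1-based positions
    of two aligned, equal-length amino acid sequences."""
    found = []
    for pos in positions:
        ref_aa = reference[pos - 1]
        query_aa = query[pos - 1]
        if ref_aa != query_aa:
            found.append(f"{ref_aa}{pos}{query_aa}")
    return found

# Hypothetical toy sequences (not real HA data).
toy_reference = "GAEKLQG"
toy_query = "GADKQQG"
print(substitutions(toy_reference, toy_query, [3, 5, 6]))  # ['E3D', 'L5Q']
```

Position 6 is reported as unchanged because both toy sequences carry Q there, mirroring how conserved pocket residues would drop out of such a comparison.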

Table 2 Comparison of critical amino acid residues in hemagglutinin and neuraminidase proteins.

Analysis of the HA cleavage site showed that strains isolated in 2009 have a different cleavage site compared to those previously reported. From 1997 to 2008, all H9N2 viruses retained a conserved amino acid pattern at the cleavage site, 335RSSR338GLF341, except for the sublineage prototype strain (A/chicken/Kr/38349-p96323/96), which had the cleavage site 335ASYR338GLF341. Presence of the R-S-S-R motif is suggestive of H9N2 viruses with low pathogenicity that have adapted to the chicken host [28-30]. However, viruses isolated in 2009 have a different pattern (335RSNR338GLF341) due to the substitution S337N (Table 2). To our knowledge, this is the first report of the R-S-N-R motif in avian H9N2 influenza viruses isolated from human patients, although the motif has been reported before in studies of avian H9N2 viruses from around the globe [31-33]. The significance of this mutation for viral stability or pathogenicity is not yet fully understood and requires further study, as the cleavage site is considered an indicator of pathogenicity [34, 35]. Comparison with vaccine strains also revealed variations at positions 198 and 234 in the HA glycoprotein. The currently available vaccine virus has an E and an L at positions 198 and 234, respectively, whereas strains associated with human infections reported in 2009 contain D and Q residues at these respective positions.
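
A comparison of cleavage motifs such as the one above can be sketched in a few lines of Python. This is a simplified illustration, not the procedure used in the study: the function name and the toy fragments are hypothetical, and the sketch assumes that the first occurrence of the conserved GLF tripeptide marks the start of HA2, which would need to be checked on full-length sequences where GLF could occur elsewhere.

```python
def cleavage_motif(ha: str, anchor: str = "GLF", width: int = 4):
    """Return the `width` residues immediately preceding the first
    occurrence of `anchor`, or None if the anchor is missing or too
    close to the start of the sequence."""
    idx = ha.find(anchor)
    if idx < width:  # also covers idx == -1 (anchor not found)
        return None
    return ha[idx - width:idx]

# Hypothetical toy fragments built around the motifs named in the text.
toy_fragments = {
    "toy_1997_2008": "PARSSRGLFGAI",
    "toy_2009": "PARSNRGLFGAI",
    "toy_korean_prototype": "PAASYRGLFGAI",
}
motifs = {name: cleavage_motif(seq) for name, seq in toy_fragments.items()}
print(motifs)
```

Grouping strains by the returned motif makes a change such as R-S-S-R to R-S-N-R stand out without inspecting each alignment by eye.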

Neuraminidase

The major factors known to influence the functional activities of the NA glycoprotein are the enzyme active sites, the stalk length, the HB site and the potential glycosylation sites. The HB site is located on the surface of the NA molecule, away from the neuraminidase enzyme active site [36]. Analysis of the HB site showed substitutions similar to those detected in the Y280 lineage prototype strain (A/Duck/Hong Kong/Y280/97) and in H9N2 human isolates from Hong Kong in 1999. These mutations are typical of human pandemic H2N2 and H3N2 viruses [7, 14]. The most prominent mutations are at position 402, from isoleucine to asparagine/serine, and at position 403, from arginine to tryptophan/serine. Strains from 1999 to 2009 also showed mutations at position 367, from lysine to glutamic acid, and at position 432, from glutamine to lysine, as described in Table 2. However, the biological significance of these substitutions in the HB site is not yet known.

The NA protein consists of a box-like head with an enzymatic active site that is connected to a fibrous stalk region of variable length. The NA stalk is important for balancing the complementary activities of HA and NA and has been correlated with the efficiency of virus replication and pathogenesis. Previous studies have concluded that a longer NA stalk results in better replication [37-40]. Analysis of stalk length revealed that the two prototype viruses of the G1 lineage, Qa/Hong Kong/G1/97 and A/Hong Kong/1073/97, contained a two amino acid deletion in the NA stalk region at positions 38 and 39, whereas a deletion at positions 62 - 64 was observed in A/Duck/Hong Kong/Y280/97, A/Shaoguan/408/98 and A/Shaoguan/447/98. The particular 46-50 amino acid deletion, which is important for poultry adaptation of the virus [41], was not found in any of the analyzed sequences even though these viruses are thought to be transmitted from avian species to humans. Furthermore, as shown in Table 2, none of the strains reported from 1999 to 2009 contained a stalk deletion at any position, which may also indicate evolution of the viral structure towards an NA protein that supports better replication in humans.
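
Stalk deletions of this kind appear as runs of gap characters once a sequence has been aligned against a full-length reference. The Python sketch below shows one way to list such runs; the function name and the toy aligned fragment are hypothetical, and the reported positions are alignment columns, which coincide with residue numbering only when the reference itself contains no gaps.

```python
import re

def gap_runs(aligned: str) -> list:
    """Return (start, end) alignment columns, 1-based and inclusive,
    for every run of '-' characters in an aligned sequence."""
    return [(m.start() + 1, m.end()) for m in re.finditer(r"-+", aligned)]

# Hypothetical aligned toy fragment with a 2-residue and a 3-residue deletion.
toy_aligned = "MNPNQ--KIITIG---SV"
print(gap_runs(toy_aligned))  # [(6, 7), (14, 16)]
```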

Sequence analysis of the NA binding pocket residues for drugs such as zanamivir (Relenza®) and oseltamivir (Tamiflu®) was also performed. For each virus, the amino acids in the enzyme active site were conserved, and no substitutions that might lead to the evolution of sialidase inhibitor-resistant viruses were seen (Table 2). However, mutations at positions outside the active site can still alter the activity of active site amino acids; therefore this possibility cannot be excluded. Such mutations are not always captured or extensively studied in homology models. Therefore, wet lab studies are required to assess the effectiveness of NA inhibitors against recently reported strains.

Phylogenetic characterization

Evolutionary relationships of HA and NA nucleotide sequences were determined by comparing H9N2 human isolates from 1997 - 2009 with the established Eurasian H9N2 lineages, namely the G1, Y280 and Korean lineages, represented by their respective prototype viruses (Table 1). The current and proposed vaccine strains were also included in this analysis.

Two distinct groups were observed in the unrooted phylogenetic tree for HA (Figure 1). Viruses isolated from Hong Kong in 1997, 1999 and 2009 showed a closer relationship with the G1 lineage strain and clustered together in one group. Sequences from Hong Kong isolated in 2003 and 2008, along with Chinese strains from 1998 and 1999, showed a close association with the Y280 lineage and clustered together in a second group. None of the reported sequences clustered within the Korean-like lineage.

Figure 1

Phylogenetic relationships of HA genes in H9N2 influenza viruses isolated from humans between 1997 and 2009. The phylogenetic tree was generated by minimum evolution analysis with maximum composite likelihood using the Tamura-Nei model in MEGA software version 4.0.2. Numbers below branches indicate bootstrap percentages from 1000 replicates. The scale bar represents the distance unit between sequence pairs. Representative prototype viruses for the different Eurasian lineages are shown in red. The sequences of H9N2 influenza viruses isolated from the human population are shown in blue.

The phylogenetic tree for NA (Figure 2) showed a clustering pattern different from that observed for HA. As with HA, none of the sequences grouped within the Korean-like lineage. However, sequences isolated from Hong Kong in 2003, 2008 and 2009 clustered with the G9-like strain, which is one of the prototype strains of the Y280 lineage. The proposed vaccine strain and the 2009 strain from Hong Kong appeared in the same subgroup with 100% similarity. Sequences isolated from Shaoguan, China in 1998 clustered together with the Y280 lineage strain in one subgroup, while another subgroup contained the G1 lineage strain along with the available vaccine strain and strains from Hong Kong isolated in 1997 and 1999.

Figure 2

Phylogenetic relationships of NA genes in H9N2 influenza viruses isolated from humans between 1997 and 2009. The phylogenetic methods and abbreviations are as described for Figure 1.

Collectively, we observed that the phylogenetic relationships of strains from different times and geographical regions displayed complexity and diversity. The HA and NA glycoproteins do not belong to a single lineage but originate from different Eurasian sublineages, which reflects the genetic heterogeneity of H9N2 viruses. These results were also consistent with the sequence analysis, in which strains within the same lineages and sublineages exhibited high sequence similarity.

N-Glycosylation sites of HA and NA

The N-linked glycosidic moieties have been found to play a vital role in mediating viral infectivity, receptor binding capacity and cell-associated host immune responses, as well as in protecting critical viral protein epitopes from immune attack [42, 43]. Potential glycosylation sites with the N-X-T/S motif (in which X may be any amino acid except proline) were identified, which revealed region-based similarities and yearly variations. For H9N2 HA protein sequences, the N-glycosylation sites of viruses isolated from Hong Kong in 1997 were similar to those isolated in 1999 (current vaccine strain) from Hong Kong. The strains isolated from China in 1998 exhibited a different pattern of glycosylation sites in terms of position and sequence pattern. A similar case was observed for another virus isolated from China in 1999 (A/Guangzhou/333/99), which showed a totally different pattern of glycosylation sites, as shown in Table 3. This may be because viruses from China fall into the Y280 lineage while the other cases are closest to the G1 sublineage, as shown in our phylogenetic analysis. Sequence analysis also showed that Chinese isolates contained mutations in the HA receptor binding site at positions 191 (191N) and 198 (198A), compared to other viruses having 191H and 198E or 198D (Table 2). Viruses from Hong Kong isolated during 2003 contained altogether different N-glycosylation sites at different positions, which were not found in the analysis of previous entries. Glycosylation sites of viral strains from 2008 and 2009 again had high similarity with each other but were different from all the previous strains except at one site, 305 - 308, with a single mutation (NISK - NCSK) in 2008 strains, which was also present in Hong Kong viruses from 1997 - 1999.
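
The N-X-T/S motif described above lends itself to a short regular-expression scan. The Python sketch below implements only the three-residue definition given in the text; it is not the ScanProsite implementation, and the function name and toy sequence are hypothetical. As far as we are aware, the corresponding PROSITE pattern (PS00001) additionally excludes proline at the fourth position, so results from the server may differ slightly from this simplified scan.

```python
import re

# N, then any residue except P, then S or T. The lookahead lets
# overlapping sequons (e.g. N-N-T-S) be reported separately.
SEQUON = re.compile(r"(?=(N[^P][ST]))")

def find_sequons(protein: str) -> list:
    """Return (1-based start position, matched tripeptide) for each
    potential N-glycosylation sequon in a protein sequence."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein.upper())]

# Hypothetical toy sequence (not a real HA or NA sequence).
toy_protein = "MNATANPSKNISKNNTS"
print(find_sequons(toy_protein))
# [(2, 'NAT'), (10, 'NIS'), (14, 'NNT'), (15, 'NTS')]
```

The N-P-S stretch at position 6 is skipped because proline occupies the X position, and comparing the returned positions between strains exposes shifts such as the same site appearing at different coordinates.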

Table 3 Comparison of predicted N-glycosylation sites of the virus hemagglutinin proteins.

As with HA, NA sequences isolated from Hong Kong in 1997 and 1999 showed similar glycosylation sites, as did the sequences isolated from China in 1999 and from Hong Kong in 2003. Sequences isolated from Hong Kong in 2008 and 2009 also had similar glycosylation sites. An interesting observation was that, in some cases for both HA and NA, the glycosylation site sequence remained the same as in previous or later strains but its position changed, such as 198 - 201 (NATA) in Hong Kong/1074/1997 and 183 - 186 (NATA) in Shantou/239/98 (Table 4).

Table 4 Comparison of predicted N-glycosylation sites of the virus neuraminidase proteins.

Antigenic sites of HA and NA

Analysis of the antigenic sites of HA and NA also showed region-based similarities and yearly variations, as observed for the distribution of glycosylation sites. Viruses isolated from Hong Kong in 1997 and 1999 had the same antigenic sites as strains isolated from China in 1998 and 1999, except for A/Shaoguan/447/98, which had only one antigenic site (219 - GRIDYYWSV) similar to other Chinese strains. Strains isolated from Hong Kong in 2003, 2008 and 2009 had antigenic sites similar to those of the earlier strains from the region but at different positions; the strains from 2008 and 2009 again presented similar antigenic site positions to each other (Table 5).

Table 5 Comparison of antigenic sites in amino acid sequences of hemagglutinin proteins (H9N2).

For NA, the sequences isolated from Hong Kong in 1997 and 1999 had the same antigenic sites. Sequences from China isolated in 1998 had similar antigenic sites but at different positions, with variation at one site. Chinese sequences from 1999 had three sites that were the same as those in the Hong Kong viruses isolated during 2003. Sequences isolated from Hong Kong in 2008 and 2009 also had antigenic sites similar to those of previous strains from the region but at different positions. Furthermore, one new antigenic site was observed in a 2008 sequence (102 - ASGDIWVTR) and another two in 2009 sequences at positions 19 (ALFATTMTL) and 240 (GRADTRILF), as shown in Table 6. Another observation was that neither the available nor the proposed vaccine strain fully matches the antigenic epitope regions of the HA and NA genes. Therefore, if incompatible viral strains reappear, vaccine products may become less effective.

Table 6 Comparison of antigenic sites in amino acid sequences of neuraminidase proteins (H9N2).

The potential for antigenic variation is high, especially when compared across sublineages of H9N2 viruses. Xu et al. [44] demonstrated antigenic diversity in H9N2 viruses using monoclonal antibodies, which corresponded with phylogenetic relationships. Their findings also correspond with our results, in which H9N2 viruses showed sequence variations and antigenic diversity based on evolution. Furthermore, differences in glycosylation of the virion surface proteins may contribute to antigenic variation; however, this requires further evaluation.

Conclusions

In summary, we have reported an in silico molecular analysis of HA and NA genes and their respective deduced amino acid sequences from H9N2 avian influenza viruses that were isolated from humans between 1997 and 2009. We found that H9N2 surface genes belonged to two distinct lineages, G1 and Y280, indicating that they have different sources of origin. Sequence analysis revealed unique variations in antigenic and N-linked glycosylation sites. Drug binding pockets remained highly conserved in all reported strains and hence the activity of NA inhibitors is expected to remain unaffected. However, comparison with vaccine strains showed variations, requiring further investigation in animal models. The RBS modification of leucine to glutamine (Leu226Gln), instead of glutamine to leucine, and a new cleavage site motif (R-S-N-R) for HA may be related to the balancing activity of NA. Findings from the study support the genetic instability of influenza A (H9N2) viruses and highlight the necessity for more comprehensive surveillance and further evaluation of H9N2 viruses with proper in vitro and in vivo models.