Introduction

Human infection with avian influenza A (H7N9) virus was first identified in eastern China in April 2013 [4, 14]. As of December 27, 2015, a total of 686 laboratory-confirmed cases of infection with the virus had been reported in China, and 274 of these patients had died [7]. There is great concern worldwide regarding the human-to-human transmission ability of the new virus [5].

An earlier study demonstrated that avian influenza A (H7N9) virus is possibly a triple reassortant in which the HA and NA genes originated from A/duck/Zhejiang/2/2011(H7N3) and A/wild bird/Korea/A14/2011(H7N9), respectively, while the internal genes were closely related to A/brambling/Beijing/16/2012-like viruses (H9N2) [14]. Subsequent studies indicated that the N9 gene of the influenza A (H7N9) virus was more closely related to H11N9 and H2N9 viruses found in the migratory wild birds of Hong Kong in 2010-2011 [19]. In addition, the internal genes of the H7N9 virus displayed much higher genetic heterogeneity after sequential and multistep reassortments with local prevalent H9N2 viruses [8, 11, 18, 20, 35].

Since avian influenza A (H7N9) virus emerged in March 2013, there have been three epidemic waves of human infection. The first epidemic wave was concentrated from March to May 2013, whereas the second epidemic wave occurred from December 2013 to May 2014, and the third epidemic wave occurred from December 2014 to May 2015. Between the earlier two epidemic waves, from early June 2013 to late November 2013, there was only one confirmed case of infection with avian influenza A (H7N9) virus reported in Guangdong province on August 10, 2013. To date, many studies regarding the origin and characteristics of the novel influenza A (H7N9) virus have been performed using sequences obtained from the first epidemic wave [15, 38, 41], and therefore, changes in genomic characteristics of the H7N9 virus between the earlier two epidemic waves remain to be elucidated.

During the second epidemic wave of human infection with the H7N9 virus, 27 confirmed cases of infection were identified from December 30, 2013 to April 15, 2014 in Shenzhen, a city in southern China bordering Hong Kong. All nasal swab samples of the 27 confirmed cases were cultured in embryonated chicken eggs; 11 strains of avian influenza A (H7N9) virus were successfully obtained and the entire genome was sequenced. At the same time, the genomes of four strains of the H7N9 virus obtained from environmental samples were fully sequenced. In this study, the genome characteristics of the 15 strains of avian influenza A (H7N9) virus obtained from the second epidemic wave were thoroughly analyzed to better understand the selective pressures and the specific amino acid mutations driving its viral evolution. The findings of this study may facilitate further study regarding the evolution of avian influenza A (H7N9) virus.

Materials and methods

Sample collection and virus isolation

Nasopharyngeal swabs from confirmed cases of infection with avian influenza A (H7N9) virus with severe pneumonia and acute respiratory distress syndrome were collected [13]. Environmental samples were collected from chicken feces, wastewater from slaughterhouses, chicken cage surfaces, and poultry chopping boards in live-poultry markets, all of which were epidemiologically associated with confirmed cases or the workplaces of confirmed cases. Collection of all samples was conducted under the guidelines of the standard operation procedures (SOPs) of the China Center for Disease Control and Prevention. All samples were transferred to our laboratory in a viral transport medium at 4 °C for avian influenza A (H7N9) virus detection.

RNA was extracted from nasopharyngeal swabs and environmental samples using MagNA Pure LC 2.0 (Roche, Switzerland) with a MagNA Pure LC Nucleic Acid Isolation Kit (Roche, Germany). H7 and N9 genes were detected using real-time reverse transcription polymerase chain reaction (RT-PCR) assays according to the SOPs of the World Health Organization [34].

Only samples testing positive for both the H7 and N9 genes were propagated in 10-day-old embryonated chicken eggs and then cultured at 37 °C for 2 days in a biosafety level 3 laboratory. Allantoic fluid was extracted and tested for H7 and N9 genes by RT-PCR assays and, if positive, used for further genomic sequencing.

Genomic sequencing

All eight segments of isolated positives from the H7N9 RT-PCR assays were sequenced using a high-throughput sequencing strategy on an Illumina HiSeq 2500 sequencer. The sequencing methods were developed in a previous study [39]. The nucleotide sequences for the viral genome of the H7N9 isolates have been submitted to GISAID, and the accession numbers are listed in Table 1.

Table 1 Accession numbers for the H7N9 viruses isolated in Shenzhen for this study

Phylogenetic analysis

All full-length genomic segments of avian influenza A (H7N9) virus were downloaded from the influenza virus database in GenBank. The full-length HA gene of the H7Nx virus, the full-length NA gene of the HxN9 virus, and the six internal full-length segments (PB2, PB1, PA, NP, M, and NS) of the H9N2 virus from 2010 to 2014 [35] were randomly downloaded from influenza virus databases. All sequences, including those from the Shenzhen influenza A (H7N9) virus strains (11 from confirmed cases and 4 from environmental samples) and those downloaded from GenBank, were subjected to multiple sequence alignment using the ClustalW program, and the amino acid mutations were analyzed using the Highlight model in MEGA software (version 5.0). Preliminary phylogenetic trees were generated using the neighbor-joining (NJ) method and a bootstrap analysis of 1000 replications using MEGA software (version 5.0). A detailed analytical phylogenetic tree for each segment was generated by including only sequences belonging to the same evolutionary branch as the novel influenza A (H7N9) virus strains and nearby branches.
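The two ingredients that feed a bootstrapped NJ analysis, a pairwise distance matrix and column-resampled pseudo-alignments, can be sketched as follows. This is a minimal illustration only: the uncorrected p-distance is an assumption (the substitution model used in MEGA is not stated above), the function names are hypothetical, and tree building itself is left to dedicated software.

```python
import random


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites between the two sequences")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(alignment):
    """Pairwise p-distances for an alignment given as {name: aligned_sequence}."""
    names = list(alignment)
    return {(m, n): p_distance(alignment[m], alignment[n]) for m in names for n in names}


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in columns) for name, seq in alignment.items()}


# Toy alignment (invented sequences, not data from this study).
toy = {"strain_a": "ACGT", "strain_b": "ACGA", "strain_c": "A-GA"}
matrix = distance_matrix(toy)
replicate = bootstrap_alignment(toy, random.Random(0))
```

In a full analysis, a tree would be inferred from each of the 1000 replicates, and the bootstrap value of a node is the percentage of replicate trees that recover it.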

Adaptive evolution analysis

Adaptive evolution was analyzed by measuring positive selection, which drives viruses to evolve through cross-species transmission. To understand whether the novel influenza A (H7N9) virus was driven by positive selection, the standard McDonald-Kreitman test (MKT) was used to detect natural selection pressure [9, 10]. The MKT was designed to compare the amount of variation (synonymous and nonsynonymous) within a species and the divergence across species. The ratio of nonsynonymous to synonymous polymorphisms within species (Pn/Ps) and the ratio of nonsynonymous to synonymous fixed substitutions between species (Dn/Ds) were calculated to derive the neutrality index (NI) using the formula NI = (Pn/Ps)/(Dn/Ds) = (Pn × Ds)/(Ps × Dn). An NI value <1 indicates an excess of fixed nonsynonymous replacements and was interpreted as positive selection driving the virus through cross-species transmission. An NI value of 1 was interpreted as neutral evolution. An NI value >1 indicates that negative selection precluded the fixation of harmful mutations, which generally leads to coexistence of multiple viral genotypes.
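The NI calculation and its interpretation can be illustrated with a short sketch. The function names and the example counts are hypothetical and are not taken from this study. In practice, the 2 × 2 table of counts is also tested for significance (for example, with Fisher's exact test), which is not shown here.

```python
def neutrality_index(pn, ps, dn, ds):
    """Return NI = (Pn/Ps) / (Dn/Ds) = (Pn * Ds) / (Ps * Dn).

    pn, ps: nonsynonymous and synonymous polymorphisms within the clade
    dn, ds: nonsynonymous and synonymous fixed differences between clades
    """
    if ps == 0 or dn == 0:
        raise ValueError("NI is undefined when Ps or Dn is zero")
    return (pn * ds) / (ps * dn)


def interpret_ni(ni, tol=1e-9):
    """Map an NI value to the direction of selection described in the text."""
    if abs(ni - 1.0) <= tol:
        return "neutral"
    return "positive" if ni < 1.0 else "negative"


# Hypothetical counts for illustration only.
example_ni = neutrality_index(pn=2, ps=10, dn=6, ds=8)  # (2 * 8) / (10 * 6)
example_call = interpret_ni(example_ni)
```

An NI far from 1 is only suggestive on its own; the direction of selection is usually reported together with the p value of the test on the underlying counts.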

Results

Phylogenetic analysis

In total, 15 influenza A (H7N9) virus strains were isolated from confirmed cases and environmental samples in Shenzhen during the second epidemic wave, and all genomic segments were sequenced. Together with genomic sequences downloaded from influenza virus databases, these 15 genomic sequences were analyzed to elucidate the origin and evolution of the influenza A (H7N9) virus in Shenzhen, and phylogenetic trees of all eight segments of the influenza A (H7N9) viruses were constructed, as shown in Fig. 1A-H.

Fig. 1

Phylogenetic trees of all eight segments of influenza A (H7N9) viruses isolated in Shenzhen during the second epidemic wave. The phylogenetic trees were constructed by the NJ method using MEGA 5.0, and the reliability of the tree was evaluated by the bootstrap method with 1,000 replications. Only bootstrap values of >50 are shown at the corresponding nodes. Sequences labeled in red with red dots represent the H7N9 virus isolated from confirmed cases in Shenzhen, and those labeled in blue with blue triangles represent the H7N9 virus isolated from environmental samples in Shenzhen. Sequences in green represent the reference strains that are most closely related to influenza A (H7N9) virus. The branches in red indicate the other H7N9 viruses, and those in black are H9N2 or other subtype viruses

For the HA and NA genes, all 15 avian influenza A (H7N9) virus strains isolated from Shenzhen clustered into a classic H7N9 clade with other influenza A (H7N9) viruses previously isolated during the first epidemic wave (Fig. 1A and B); however, the HA gene of the 15 strains from Shenzhen was concentrated on the far end of the phylogenetic tree and formed a distinct subclade (Fig. 1A). This implied that the HA gene of the influenza A (H7N9) virus from the second epidemic wave underwent continuous evolution and was divergent from those isolated during the first epidemic wave, although there was a high degree of nucleic acid sequence similarity between the influenza A (H7N9) virus isolated from the first and the second epidemic waves. The HA gene of the influenza A (H7N9) virus from Shenzhen originated from the A/duck/Zhejiang/2/2011(H7N3)-like virus, which is consistent with previous reports [14, 21]. Inconsistent with previous analyses [17, 21], our study revealed that the NA gene from local strains seemed most likely to have originated from an H2N9 or H11N9 virus found in migratory birds in the southern regions of China. This is because the H2N9 virus or H11N9 virus from southern China and the avian influenza virus from Korea clustered into two different clades in the phylogenetic tree for the NA gene, and the former viruses were more closely related to the novel influenza A (H7N9) virus (Fig. 1B).

For internal genes, phylogenetic analysis of the PA gene showed that the 11 strains from the confirmed cases in Shenzhen segregated into a single distinct subclade within a major clade that covered almost all influenza A (H7N9) virus strains isolated in the first epidemic wave [35]. The four strains isolated from the environmental samples in Shenzhen clustered into another single subclade, which was the closest to the local H9N2 virus (A/chicken/Guangdong/LG1/2013) (Fig. 1H). It did not appear that the internal segments of the H7N9 virus isolated from the confirmed cases in Shenzhen directly originated from the local H7N9 virus in live poultry markets; the most likely donor may have been local H9N2 viruses. For the M gene, 9 out of the 11 H7N9 virus strains from the confirmed cases were clustered into a separate subclade within a minor clade that included H7N9 virus strains from the first epidemic wave. The other two strains, i.e., A/Shenzhen/4/2014(H7N9) and A/Shenzhen/17/2014(H7N9), were scattered in this minor clade (Fig. 1D).

Further analysis of the phylogenetic trees of the other four internal genes (NP, NS, PB2, and PB1) revealed that their sources were diverse and complex, were inconsistent with some previous reports from the first wave [8, 12, 14, 28, 35], and were partially consistent with two other previous reports from the second wave [23, 24]. For the NP gene, except for A/Shenzhen/16/2014, the other 10 H7N9 virus strains from the confirmed cases fell into a separate subclade that was closest to the local H9N2 virus (A/chicken/Guangdong/LG1/2013(H9N2)) within a major clade of H7N9 and H9N2 viruses (Fig. 1C). The four H7N9 virus strains from the environmental samples and A/Shenzhen/16/2014 were segregated into another distinct clade that was far away from the major H7N9 virus clusters and closer to the H9N2 viruses isolated in southern China in 2011-2013.

The NS, PB2, and PB1 genes of the H7N9 virus strains isolated from confirmed cases belonged to a single cluster, as shown in Figs. 1E, F, and G, respectively. Sequences in this single cluster showed a closer relationship to an H9N2 virus that was prevalent in poultry in southern China, especially in Hong Kong and Guangxi, in 2010-2012. This indicated that the origins of the NS, PB2, and PB1 segments of the H7N9 virus strains isolated from confirmed cases from the second epidemic wave were different from those from the first epidemic wave. The NS, PB2, and PB1 segments of the H7N9 virus strains isolated from the environmental samples formed a distinct subclade within a major cluster that included H7N9 virus sequences isolated in the first epidemic wave and related H9N2 sequences. This implied that there were different gene donors for the influenza A (H7N9) viruses isolated from confirmed cases and those isolated from the environmental samples in the second epidemic wave.

Purifying selection drives the evolution of influenza A (H7N9) viruses in humans

Avian influenza virus can acquire the ability to undergo cross-species transmission and even human-to-human transmission through adaptive mutation and reassortment of multifarious lineages [27]. Previous studies have demonstrated that animal viruses experienced positive Darwinian selection in the process of cross-species transmission and in the initial stages of human outbreaks, followed by purifying (negative) selection when the virus adapted to the new host in a later epidemic [30, 40]. To explore whether positive selection pressure drove the influenza A (H7N9) virus to adapt to humans in the second epidemic wave, the MKT was performed to analyze the sequences of each segment of the influenza A (H7N9) virus isolated in Shenzhen (taxon name labeled in red in Fig. 1A-H), and the sequences closest to the H7N9 virus clade were referred to as the background clade (isolate names labeled in green in Fig. 1A-H). The results are summarized in Table 2.

Table 2 McDonald-Kreitman test for each segment of A (H7N9) virus in Shenzhen

The NI values for the HA, NA, PA, and M genes were 0.267, 0.514, 0.708, and 0.618, respectively. The HA gene, which encodes the most important surface protein for determining host range, had the lowest NI value. Importantly, the associated p value reached statistical significance (p = 0.004), implying that a strong positive selection pressure acted on the influenza A (H7N9) virus isolated from Shenzhen in the second epidemic wave. These four segments, particularly the HA gene, may play important roles in the cross-species transmission from poultry to humans as a result of mutations. In contrast, the NI values for PB2, PB1, and NP were 3.418, 2.645, and 3.097, respectively, indicating that negative selection acted on these segments and caused multifarious genotypes to co-prevail. The NI value for the NS gene was close to 1 (1.093), suggesting that the relatively stable NS sequence of the influenza A (H7N9) virus isolated from Shenzhen may persist for a long time.

Characteristic amino acids for influenza A (H7N9) viruses isolated in Shenzhen during the second epidemic wave

To further elucidate how the influenza A (H7N9) virus evolved during the second epidemic wave in Shenzhen, the full-length amino acid sequences of the 15 H7N9 virus strains were aligned with other downloaded sequences. Characteristic amino acids for the influenza A (H7N9) viruses isolated in Shenzhen were defined as those present only in the H7N9 viruses from Shenzhen and were different from those observed in the H7N9 viruses from the first epidemic wave. The amino acids characteristic of the influenza A (H7N9) viruses from Shenzhen are summarized in Table 3 and supplementary files (Figure S1-Figure S10).
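The definition of a characteristic amino acid can be expressed as a simple column scan over an alignment. The sketch below is an assumption-laden illustration, not the procedure used in MEGA: it applies a strict criterion (every local sequence carries the same residue, and that residue is absent from every reference sequence), the function name is hypothetical, and the toy sequences are invented. Reported positions are alignment columns, not a subtype-specific numbering.

```python
def characteristic_positions(local_seqs, reference_seqs):
    """Find alignment columns where all local sequences share one residue
    that does not occur in any reference sequence.

    Returns a list of (position, reference_residues, local_residue) tuples,
    with 1-based alignment positions.
    """
    local_seqs = list(local_seqs)
    reference_seqs = list(reference_seqs)
    length = len(local_seqs[0])
    if any(len(s) != length for s in local_seqs + reference_seqs):
        raise ValueError("all sequences must be aligned to the same length")

    hits = []
    for i in range(length):
        local = {s[i] for s in local_seqs}
        ref = {s[i] for s in reference_seqs}
        # Require a single, non-gap local residue that the reference set lacks.
        if len(local) == 1 and "-" not in local and not (local & ref):
            hits.append((i + 1, "".join(sorted(ref)), next(iter(local))))
    return hits


# Toy protein fragments (invented, for illustration only).
second_wave = ["MKAT", "MKAT"]
first_wave = ["MRTT", "MRTT"]
toy_hits = characteristic_positions(second_wave, first_wave)
```

A looser criterion, such as a residue present in most rather than all local strains, would need a frequency threshold in place of the `len(local) == 1` check.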

Table 3 Characteristic amino acids for genomic segments of H7N9 virus isolates collected during the second wave in Shenzhen

Discussion

Since its emergence in March 2013, the novel influenza A (H7N9) virus has experienced continuous evolution, and multiple genotypes have formed and circulated in poultry in several provinces of China [8, 21, 35]. Many studies have revealed the molecular characteristics of H7N9 viruses using sequences from the first epidemic wave. For example, HA-Q226L of the influenza A (H7N9) virus from the first wave is associated with human-virus-like receptor binding and may enable the virus to gain the ability to infect humans; this virus has a single basic residue at the HA0 cleavage site (EIPKGR↓G), indicating that it is a low-pathogenic virus [31]. The presence of NA-292R in most H7N9 viruses indicates that the virus is sensitive to oseltamivir and zanamivir [26, 36]. PB2-R627K may be associated with bird-to-human transmissibility and increased virulence of influenza A (H7N9) virus [2]. Deletion of amino acids 67-73 in NA may be associated with increased virulence [25]. M2-S31N is associated with resistance of the influenza A (H7N9) virus to the ion channel blockers amantadine and rimantadine [3]. However, there has been little information on the molecular characteristics of the H7N9 virus from the second epidemic wave. The current study has helped to fill that gap. We found that, aside from the molecular characteristics of the H7N9 virus from the first wave mentioned above, some new characteristic amino acids were present in the influenza A (H7N9) virus isolated in Shenzhen during the second epidemic wave (see Table 3 and supplementary materials). HA-R47K and T122A, particularly T122A as it is near the receptor-binding site (130 loop), may affect receptor binding and, consequently, alter the virulence of the virus [22, 32]. Our data suggest that these amino acid mutations in HA may be associated with decreased virulence of the H7N9 viruses isolated in Shenzhen during the second epidemic wave (mortality rate 2/27 [7.4 %] in Shenzhen vs. 47/144 [32.6 %] in China during the first wave), although further study is needed to confirm this result [6]. Furthermore, the presence of PB2-V139I, PB1-I397M, and NS1-T216P indicated that the source of the internal segments from the second wave was different from that from the first wave. Phylogenetic analysis of the PB2, PB1, NP, and NS genes revealed that there were complex donors for internal segments of the influenza A (H7N9) virus isolated in Shenzhen, perhaps due to frequent poultry transport or various traditions.

To better understand the selection pressure that drove the genomic evolution of the novel influenza A (H7N9) virus during the second epidemic wave, and to understand whether the virus had gained the ability of cross-species transmission and human-to-human transmission, we performed a standard and generalized McDonald-Kreitman test [10] on eight genomic segment sequences of the influenza A (H7N9) virus from the second wave. We found that the NI values for HA, NA, PA, and M were 0.267, 0.514, 0.708, and 0.618, respectively, indicating that the novel virus had possibly gained the capability of transmission from birds to humans.

The emergence of the influenza A (H7N9) virus in China raises concerns about potential virus adaptation to mammals [1, 14, 24, 29, 33, 37, 41] and human-to-human transmission [16]. Previous studies showed that the influenza A (H7N9) virus was a triple reassortant with the HA gene from avian influenza viruses that circulated in ducks, the NA gene from avian influenza viruses in wild birds, and the internal genes from an earlier H9N2 lineage [14]. Migratory birds and wild ducks, particularly mallards, may have played an important role in the emergence and transmission of the influenza A (H7N9) virus at the early stage, because these species were frequently infected with or carried H7 viruses. At the later stages of the epidemic, poultry, especially asymptomatic chickens, infected with the avian influenza A (H7N9) virus became the primary source of infection due to the large numbers of birds being transported [35]. Therefore enhanced surveillance of migratory birds and poultry should be implemented to determine the origin and mode of transmission of the novel influenza A (H7N9) virus, which would facilitate the formulation of effective prevention and containment strategies.