Introduction

Influenza virus infection is a major burden on public health, causing seasonal outbreaks that affect approximately 10 % of the population each year [1]. Influenza viruses undergo frequent antigenic variation, which can lead to epidemics and even worldwide pandemics. Influenza pandemics occurred four times in the last century: the 1918 Spanish influenza, the 1957 Asian influenza, the 1968 Hong Kong influenza and the 1977 Russian influenza [1, 2]. Of these, the Spanish and Russian influenza pandemics were caused by H1N1 subtype influenza viruses.

In April 2009, a novel H1N1 influenza virus emerged in Mexico and North America [3, 4] and rapidly spread to more than 70 countries, causing approximately 28,000 infections and 144 deaths before the World Health Organization declared it the first pandemic of the 21st century, known as pandemic H1N1 2009 (H1N1pdm09) [5]. The virus was a novel reassortant carrying gene segments from North American avian influenza, classical swine influenza, human influenza and Eurasian swine influenza viruses [6].

The first laboratory-confirmed case in China was detected on 10 May 2009 in Sichuan Province, in the southern part of China [isolated virus strain: A/Sichuan/1/2009(H1N1)] [7]. After the outbreak of H1N1pdm09 infection, large-scale surveillance was carried out in China to follow the molecular characteristics and the evolutionary and spatial dynamics of this virus in human populations. The number of network laboratories expanded from the original 83 to 410, and six new laboratories were set up in Tianjin. Influenza shows different prevalence patterns in southern China (two peaks, in winter and summer) and northern China (one peak, in winter). Tianjin is located in the northeast part of China and usually shows a single prevalence peak in winter. To clarify the prevalence pattern of H1N1pdm09 in Tianjin and its molecular characteristics, epidemiological and full genome sequence analyses were performed for the period April 2009–March 2012.

Materials and methods

Sample collection

A total of 11,324 throat swabs were collected from patients with influenza-like illness (ILI) in 9 sentinel influenza surveillance hospitals and from ILI outbreak cases in Tianjin between April 2009 and March 2012 [8]. A further 765 respiratory samples were collected from severe respiratory infection cases in Tianjin during the same period [8].

Virus screening and isolation

All samples were processed and tested by real-time PCR for influenza virus typing and subtyping according to the CDC protocol [9]. H1N1pdm09 viruses were then isolated from real-time PCR-positive samples. Samples were inoculated onto Madin–Darby canine kidney (MDCK) cells and cultured in serum-free minimum essential medium (MEM; Gibco, USA) in the presence of 2.0 μg/ml trypsin (Sigma, USA) [10]. Viruses were identified by hemagglutination inhibition (HI) tests with virus type-specific antiserum, which was distributed by the National Influenza Center (NIC) and stored at −80 °C.

RNA extraction and sequencing

RNA for real-time PCR was extracted from 500 μl of respiratory sample, and RNA for RT-PCR amplification and sequencing was extracted from 50 μl of viral aliquot, using NucliSENS easyMAG (BioMerieux, France) according to the manufacturer’s instructions.

The whole genomes of eight H1N1pdm09 isolates from patients with severe respiratory infection (reported as unexplained pneumonia) and the HA segments of seven H1N1pdm09 isolates from upper respiratory infection (mild symptom infection) were sequenced. One-step RT-PCR amplification (Qiagen, Germany) was followed by direct sequencing of the products using the WHO recommended protocol with some modification of the M13F/M13R-tailed primers [3]. In brief, 46 pairs of M13F/M13R-tailed primers covering the eight segments were used in separate RT-PCR reactions. The products of adjacent primer pairs overlapped by 200–300 bp. All sequencing was conducted by Invitrogen Biotechnology Co. Ltd., Shanghai, China using an ABI 3730 DNA Analyzer (Applied Biosystems, USA). The sequences of the individual amplicons were assembled using SeqMan software in DNAStar (version 5.01).
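
The assembly of overlapping amplicon sequences can be illustrated with a minimal sketch. It assumes exact, error-free overlaps and uses short toy sequences; the function name `merge_by_overlap` and the `min_overlap` parameter are hypothetical, and the sketch does not reproduce how SeqMan handles base quality or mismatches.

```python
def merge_by_overlap(left: str, right: str, min_overlap: int = 3) -> str:
    """Join two reads where a suffix of `left` exactly matches a prefix of `right`.

    The longest exact overlap of at least `min_overlap` bases is used.
    """
    longest_possible = min(len(left), len(right))
    for size in range(longest_possible, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            # Keep the shared bases once and append the rest of `right`.
            return left + right[size:]
    raise ValueError("no overlap of the required length was found")


# Toy amplicon sequences (not real influenza data) sharing the bases GTTAGC.
amplicon_1 = "ATGGCGTACGTTAGC"
amplicon_2 = "GTTAGCCATGA"
contig = merge_by_overlap(amplicon_1, amplicon_2)
print(contig)  # ATGGCGTACGTTAGCCATGA
```

With real amplicons the minimum overlap would be set far higher, for example close to the 200–300 bp overlap designed into the primer set.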

Sequence alignment and phylogenetic analysis

Nucleotide sequence alignments were generated using the ClustalW algorithm in MEGA 4.0.1 (http://www.megasoftware.net/mega.html). Nucleotide and deduced amino acid sequence identities were calculated using MegAlign software in DNAStar (version 5.01). The vaccine strain A/California/07/2009(H1N1), the early isolate A/California/04/2009(H1N1) and the first Chinese isolate A/Sichuan/01/2009(H1N1) were taken as reference genomes, and whole open reading frames (ORFs) were used for phylogenetic tree construction. Other representative strains from America, Australia, southeastern Asia, Europe and south and north China were included [10, 11]. All data were downloaded from GenBank. Phylogenetic trees were constructed in MEGA 4.0.1 using the neighbor-joining method. Bootstrap proportions were plotted at the main internal branches of the phylogram to show support values.

Substitution rate estimation

We estimated the substitution rate of the H1N1pdm09 virus sequences using a Bayesian Markov chain Monte Carlo (MCMC) approach as implemented in the BEAST package v1.7.5 (http://beast.bio.ed.ac.uk/). The HKY substitution model was used with gamma-distributed rate heterogeneity among sites, together with a strict molecular clock model. For each analysis, a chain length of 10,000,000 was used, with sampling every 1,000 states. Convergence was confirmed with Tracer v1.5.

The amino acid mutation rate was calculated as follows:

$$\text{Mutation rate} = \frac{\text{No. of amino acid mutations}}{\text{No. of all amino acids in a single protein}} \times 100\,\%$$
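
As an illustration of this ratio, the following minimal sketch counts differing positions between a reference and a query protein. It assumes the two sequences are already aligned to the same length without gaps; the function name `amino_acid_mutation_rate` and the toy peptides are hypothetical and are not taken from the study data.

```python
def amino_acid_mutation_rate(reference: str, query: str) -> float:
    """Percentage of positions at which `query` differs from `reference`.

    Both sequences must be pre-aligned and of equal length.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    if not reference:
        raise ValueError("sequences must not be empty")
    mutations = sum(1 for ref_aa, qry_aa in zip(reference, query) if ref_aa != qry_aa)
    return mutations / len(reference) * 100.0


# Toy 10-residue peptides with a single substitution (L -> M at position 8).
reference_peptide = "MKAILVVLLY"
query_peptide = "MKAILVVMLY"
rate = amino_acid_mutation_rate(reference_peptide, query_peptide)
print(f"{rate:.1f} %")  # 10.0 %
```

Under this definition, the amino acid identity with the reference is simply 100 % minus the mutation rate.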

Selection pressure measurement

To determine the overall selection pressure acting on each gene segment of H1N1pdm09, we estimated the mean numbers of nonsynonymous substitutions (dN) and synonymous substitutions (dS) per site using the single-likelihood ancestor counting (SLAC) method, accessed through the Datamonkey interface (http://www.datamonkey.org). In all cases, dN/dS estimates were based on neighbor-joining trees under the substitution model selected by the Datamonkey server.
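
The distinction between synonymous and nonsynonymous changes can be made concrete with a short sketch. This is not the SLAC method: SLAC reconstructs ancestral sequences on a phylogeny and normalises the counts by the numbers of synonymous and nonsynonymous sites, whereas the sketch below only classifies codon differences between two aligned coding sequences under the standard genetic code. The function name and toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_changes(ref_cds: str, query_cds: str) -> tuple[int, int]:
    """Return (synonymous, nonsynonymous) codon differences between two CDSs.

    Sequences must be aligned in frame and of equal length; codons that
    contain ambiguous bases are skipped.
    """
    if len(ref_cds) != len(query_cds) or len(ref_cds) % 3 != 0:
        raise ValueError("sequences must be in-frame and of equal length")
    synonymous = nonsynonymous = 0
    for start in range(0, len(ref_cds), 3):
        ref_codon = ref_cds[start:start + 3].upper()
        qry_codon = query_cds[start:start + 3].upper()
        if ref_codon == qry_codon:
            continue
        if ref_codon not in CODON_TABLE or qry_codon not in CODON_TABLE:
            continue
        if CODON_TABLE[ref_codon] == CODON_TABLE[qry_codon]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Toy example: GAT -> AAT changes D to N (nonsynonymous),
# CTG -> CTA leaves L unchanged (synonymous).
changes = count_codon_changes("ATGGATCTGAAA", "ATGAATCTAAAA")
print(changes)  # (1, 1)
```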

Statistical analyses

The chi-square test was applied to compare amino acid mutation rates, using SPSS 11.5 software.
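
For readers who wish to reproduce this kind of comparison, a minimal sketch of a Pearson chi-square test on a 2 × 2 table is given below. It assumes the comparison is laid out as substituted versus unchanged residues in two groups of isolates and that no continuity correction is applied; the source does not state either point. The counts are invented for illustration and the function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square statistic and p value for the table [[a, b], [c, d]].

    One degree of freedom, no continuity correction.
    """
    n = a + b + c + d
    row_1, row_2 = a + b, c + d
    col_1, col_2 = a + c, b + d
    if 0 in (row_1, row_2, col_1, col_2):
        raise ValueError("every row and column total must be positive")
    chi2 = n * (a * d - b * c) ** 2 / (row_1 * row_2 * col_1 * col_2)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Invented counts: group 1 has 10 substituted and 90 unchanged residues,
# group 2 has 20 substituted and 80 unchanged residues.
chi2, p_value = chi_square_2x2(10, 90, 20, 80)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")
```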

Results

Prevalence of influenza viruses

In total, 3,068 influenza viruses (25.4 %) were detected in 12,089 respiratory specimens between April 2009 and March 2012 (Table 1; Fig. 1). From April 2009 to September 2009 (the summer surveillance season), seasonal H1N1, seasonal H3N2, H1N1pdm09 and type B influenza viruses co-circulated, with seasonal H3N2 and H1N1pdm09 predominating at 57.7 % (184/319) and 37.0 % (118/319) of all influenza viruses, respectively. Only two seasonal H1N1 viruses were detected. From October 2009 to March 2010 (the winter surveillance season), seasonal H3N2, H1N1pdm09 and type B influenza viruses co-circulated, and H1N1pdm09 (69.1 %, 930/1,346) was clearly the predominant subtype. In the following summer season, seasonal H3N2 (79.7 %, 114/143) was the major subtype. In the following winter season, H1N1pdm09 (42.1 %, 220/523) and seasonal H3N2 (38.4 %, 201/523) were again the predominant subtypes. In the summer season of 2011, only type B was detected, and in the following winter season type B (90.0 %, 633/703) remained by far the predominant type. No seasonal H1N1 has been detected since October 2009.

Table 1 Prevalence of influenza virus from April 2009 to March 2012
Fig. 1 Prevalence of influenza virus from April 2009 to March 2012. Seasonal H3: seasonal influenza virus H3 subtype. Seasonal H1: seasonal influenza virus H1 subtype. H1N1pdm09: pandemic influenza A (H1N1) 2009 virus. H1N1pdm09(%): the constituent ratio of pandemic influenza A (H1N1) 2009 virus

Prevalence of H1N1pdm09

Among the 3,068 influenza virus-positive specimens, 41.4 % (1,269/3,068) were H1N1pdm09 positive. Of these H1N1pdm09-positive specimens, 15.1 % (192/1,269) came from severe respiratory infection cases. The H1N1pdm09 constituent ratio is shown in Table 1 and Fig. 1. No H1N1pdm09 has been detected since April 2011. H1N1pdm09 was the predominant subtype in October 2009–March 2010 and October 2010–March 2011.
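
The constituent ratios quoted in this section can be recomputed directly from the counts given in the text. The sketch below does so for the reported numerator and denominator pairs and also checks that the seasonal H1N1pdm09 counts add up to the overall total; the single detection in April–September 2010 is the one mentioned in the Discussion. The variable and function names are illustrative only.

```python
# (description, positives, total, percentage reported in the text)
REPORTED = [
    ("influenza positive, all specimens", 3068, 12089, 25.4),
    ("seasonal H3N2, Apr-Sep 2009", 184, 319, 57.7),
    ("H1N1pdm09, Apr-Sep 2009", 118, 319, 37.0),
    ("H1N1pdm09, Oct 2009-Mar 2010", 930, 1346, 69.1),
    ("seasonal H3N2, Apr-Sep 2010", 114, 143, 79.7),
    ("seasonal H3N2, Oct 2010-Mar 2011", 201, 523, 38.4),
    ("H1N1pdm09, Oct 2010-Mar 2011", 220, 523, 42.1),
    ("type B, Oct 2011-Mar 2012", 633, 703, 90.0),
    ("H1N1pdm09, all influenza positives", 1269, 3068, 41.4),
    ("severe cases, all H1N1pdm09 positives", 192, 1269, 15.1),
]


def constituent_ratio(positives: int, total: int) -> float:
    """Percentage of `total` accounted for by `positives`, to one decimal."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(positives / total * 100, 1)


for description, positives, total, reported in REPORTED:
    computed = constituent_ratio(positives, total)
    print(f"{description}: {computed:.1f} % (reported {reported:.1f} %)")

# H1N1pdm09 detections per surveillance season, in chronological order.
h1n1pdm09_by_season = [118, 930, 1, 220]
print(sum(h1n1pdm09_by_season))  # 1269
```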

Whole genome sequencing of H1N1pdm09

Eight H1N1pdm09 isolates from severe respiratory infection cases, some of which were fatal, were selected for whole genome sequencing (Table 2). Four were isolated in 2009, two in 2010 and two in 2011. Because HA is the major antigenic protein and the most variable protein, seven HA genes were also sequenced from throat swabs collected from upper respiratory tract infection cases, to analyze the relationship between amino acid changes in HA and clinical severity. Sequences were submitted to GenBank; the accession numbers are shown in Table 4. Patient and virus strain information is shown in Table 2.

Table 2 Information of the selected patients and virus strains sequenced

Phylogenetic and genetic diversity analysis

The eight whole genome sequences sampled from Tianjin were subjected to phylogenetic analysis. Two genes were selected as representatives: HA (Fig. 2a) and NA (Fig. 2b). Phylogenetic trees of the other genes are shown in Figure S1. Phylogenetic analysis showed that the HA, NA, M, NP and NS genes of H1N1pdm09 viruses clustered with swine influenza A(H1N1), whereas the PB2 and PA genes originated from avian influenza virus and the PB1 gene originated from human seasonal influenza virus. The strains isolated in 2011 clustered together in a separate branch, at some distance from the strains isolated in 2009 and 2010.

Fig. 2 Phylogenetic trees of the HA (a) and NA (b) genes of H1N1pdm09 isolates collected in Tianjin, China from April 2009 through March 2012. MEGA 4.0 was used to draw the phylogenetic trees by the neighbor-joining method. The values at the branches denote the percentage of 1,000 bootstrap re-samplings; bootstrap values >70 % are shown. A/California/07/2009(H1N1) and A/California/04/2009(H1N1) were taken as references, and the ranges adopted for phylogenetic analysis of three genes were the ORFs. Viruses isolated in Tianjin are indicated by a filled circle (2009 severe cases), open circle (2009 mild infection cases), filled triangle (2010) or open diamond (2011)

Compared with the vaccine strain A/California/07/2009(H1N1), HA amino acid sequence identities ranged from 97.4 to 99.5 %. Identities of HA sequences isolated from severe and mild infection cases were 97.4–99.3 and 98.9–99.3 %, respectively. The mutation rate of HA sequences isolated in 2011 was 2.3–2.6 %, higher than in 2009 (0.7–1.1 %) and 2010 (1.1–2.0 %). The mutation rate of NA sequences isolated in 2011 was 1.7 %, higher than in 2009 (0.6–0.9 %) and 2010 (0.6–1.1 %). Overall, among the eight gene segments, HA had the highest mutation rate, followed by NA, and M had the lowest (mutation rate 0–0.6 %), as shown in Tables 4 and 5. The amino acid substitution rates varied among the eight gene segments, ranging from 7.39 × 10⁻⁴ for PB2 to 7.40 × 10⁻³ for NA. The NA, HA, NS1 and PA proteins exhibited the highest genetic diversity. Higher dN/dS ratios were observed in the HA, PA and NS segments of H1N1pdm09 in Tianjin (Table 3).

Table 3 Genetic evolution of H1N1pdm09 sampled in Tianjin

Genetic mutation and antiviral drug resistance

The H1N1pdm09 HA amino acid changes (compared with the vaccine strain A/California/07/2009(H1N1)) are shown in Table 4. Among them, three HA amino acid substitutions occurred in the HA receptor-binding sites and antigenic determinants: S179N and K180T (located at antigenic site Sa) in A/Tianjinhedong/SWL44/2011(H1) and A/Tianjinjinnan/SWL41/2011(H1), and D239N (located at antigenic site Ca) in A/Tianjinninghe/SWL49/2009(H1). In addition, P100S, S200T and I338V occurred in all Tianjin isolates (8 from severe infection cases and 7 from mild infection cases), and the mild infection strains carried the unique mutation V428I. The E391G and E391K substitutions were observed in severe infection isolates and mild infection isolates, respectively. Residues 391 and 428 are not located in the receptor-binding site or antigenic site domains. No significant difference in HA mutation rate was found between isolates from severe and mild infection cases when isolates from the same year (2009) were compared (χ² = 0.05, p = 0.083).

Table 4 Amino acid changes and identity of HA from severe and mild respiratory infection cases compared with A/California/07/2009(H1N1)

Other amino acid substitutions of H1N1pdm09 are shown in Table 5. More site substitutions were observed in the 2011 isolates than in the 2009 and 2010 isolates in the HA (χ² = 9.64, p = 0.002), NA (χ² = 12.95, p = 0.003) and PA (χ² = 14.8, p = 0.001) proteins. In NA, all isolates carried the substitutions V106I and N248D. No antiviral drug resistance-associated substitution was observed at sites 275 and 295, indicating that H1N1pdm09 in Tianjin was still susceptible to oseltamivir.

Table 5 Amino acid changes and identity of the whole genome (except for HA) compared with A/California/07/2009(H1N1)

In the M2 protein, residue 31, the most important site related to amantadine resistance, was N in all eight isolates, suggesting that H1N1pdm09 was resistant to amantadine.

P224S in PA, I123V in NS and V100I in NP occurred in all eight Tianjin H1N1pdm09 isolates.

Discussion

H1N1pdm09 caused the first influenza pandemic in the 21st century. This virus distinctly differed from the seasonal H1N1 influenza virus that was circulating in humans, enabling it to spread in the thus-far naïve population.

Surveillance of influenza from April 2009 to March 2012 showed that, after its first detection in Tianjin in 2009, H1N1pdm09 became the predominant subtype in the following winter surveillance season (October 2009–March 2010), peaking in November, when 91.1 % of influenza viruses were H1N1pdm09. Over the whole surveillance season, 69.1 % of influenza viruses were H1N1pdm09. As the number of H1N1pdm09-infected individuals increased and the vaccinated population expanded, an immune barrier gradually developed, and only one H1N1pdm09 strain was detected in the summer surveillance season of April 2010–September 2010. In the next winter surveillance season (October 2010–March 2011), H1N1pdm09, together with seasonal H3N2, again became the predominant subtype. WHO has announced that H1N1pdm09 will circulate alternately with seasonal H1N1, H3N2 and B in humans. Three years of surveillance (April 2009–March 2012) in Tianjin further support this view.

Influenza A virus has eight gene segments, of which the HA gene has received the most attention. There are at least four antigenic determinant domains in HA: Sa, Sb, Ca and Cb [12]. The D239G/N mutation (located in the Ca antigenic determinant domain) of the HA protein has been associated with severe and fatal H1N1pdm09 cases [13–15]. In this study, the D239N mutation occurred in A/Tianjinninghe/SWL49/2009(H1), and the patient infected with this virus developed severe pneumonia and eventually died. S179N and K180T substitutions were observed in two 2011 isolates, A/Tianjinhedong/SWL44/2011(H1) and A/Tianjinjinnan/SWL41/2011(H1). These two sites, together with 239, all lie in antigenic determinant domains of the HA protein, and substitutions there could improve the adaptation of H1N1pdm09 to human hosts [15]. The transmissibility of influenza virus is mainly determined by the receptor-binding site (RBS), which is defined by three structural elements: a 130-loop (135–138), a 190-helix (190–198) and a 220-loop (221–228) (numbering based on the H3 HA sequence) [16, 17]. Compared with A/California/07/2009, our isolates were conserved at the 130-loop and 190-helix: VTAA (135–138) at the 130-loop and DQQSLYQNA (190–198) at the 190-helix. Only one substitution occurred at the 220-loop (PKVRDQEG → PKVRNQEG), in one HA sequence (A/Tianjinninghe/SWL49/2009(H1), which also carried the D239N substitution).

The HA gene of the H1N1pdm09 viruses was reportedly derived from “classical swine H1N1 virus”, which likely shares a common ancestor with the human H1N1 virus that caused the influenza pandemic in 1918, and whose descendant viruses continued to circulate in the human population with highly altered HA antigenicity [18]. Studies of the 1918 influenza virus report that three amino acid mutations at sites 190, 225 and 226 (numbering based on the H3 HA sequence) can switch receptor-binding preference from the avian α2-3-linked to the human α2-6-linked sialic acid, which changes the viral host range [19, 20]. The 190D, 239D and 240Q residues (225D and 226Q in H3 HA numbering) in the Tianjin H1N1pdm09 HA protein indicated that the virus bound preferentially to the human α2-6-linked sialyl receptor and could spread in the human population [19, 20].

No significant association between increased amino acid substitution and clinical severity was observed in this study. However, only limited sequence data were obtained, so further sequencing and case epidemiology are needed before conclusions can be drawn on this issue. We did find three substitutions in the antigenic domains of viruses from severe infections [D239N (isolated in 2009), which has reportedly been associated with severe cases [13], and S179N and K180T (isolated in 2011)], whereas none was found in viruses from mild infections. We also noticed that the number of substitution sites increased gradually with time (from 2009 to 2012) and that the substitution rates of eight proteins (except M2 and NEP) were relatively high (from 7.39 × 10⁻⁴ to 7.40 × 10⁻³ per site per year), suggesting that “antigenic drift” may have occurred in H1N1pdm09 during its ongoing evolution. Relatively high dN/dS ratios per site (especially for PA, HA, NA and NS1) were observed, most likely reflecting different selection pressures on the eight gene segments. The high evolutionary rate of H1N1pdm09 might be due to adaptation to a new host species, indicating strong selective advantages [21].

Single-nucleotide polymorphisms at codons encoding L26F, V27A, A30T, S31N and G34E in the M2 gene confer resistance to amantadine, and the most common substitution is S31N [22]. In this study, residues 26, 27, 30 and 34 were conserved in all eight H1N1pdm09 isolates and only residue 31 was N, which indicated that the virus was resistant to amantadine.

Neuraminidase inhibitors (such as oseltamivir and zanamivir) were the first-choice drugs against H1N1pdm09 influenza, but oseltamivir-resistant H1N1pdm09 strains have been reported repeatedly in different countries [23]. The H275Y and N295S (N1 numbering) substitutions in NA have been confirmed to cause oseltamivir resistance. None of the H1N1pdm09 viruses sequenced in this study had these two amino acid changes, but all had N248D, which was also found in southern China [24]. Pan et al. [25] indicated that site 248 is adjacent to 275 and may be associated with oseltamivir resistance. In contrast, a Japanese group concluded from crystal structure analysis that N248D has little effect on E227 and may not alter the effect of oseltamivir on the virus [26].
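
Screening a protein sequence for the resistance markers discussed above (M2: L26F, V27A, A30T, S31N, G34E; NA: H275Y, N295S) can be sketched as follows. The sketch assumes an ungapped protein sequence whose first residue is position 1, so that the indices match the numbering used in the text. The function name `find_resistance_markers` and the synthetic M2 fragment are hypothetical and are not study data.

```python
# Marker tables built from the substitutions named in the text:
# position -> (wild-type residue, resistance-associated residue).
M2_AMANTADINE_MARKERS = {
    26: ("L", "F"),
    27: ("V", "A"),
    30: ("A", "T"),
    31: ("S", "N"),
    34: ("G", "E"),
}
NA_OSELTAMIVIR_MARKERS = {
    275: ("H", "Y"),
    295: ("N", "S"),
}


def find_resistance_markers(protein: str, markers: dict) -> list[str]:
    """List the resistance-associated substitutions present in `protein`.

    Positions are 1-based; positions beyond the end of the sequence are skipped.
    """
    found = []
    for position, (wild_type, resistant) in sorted(markers.items()):
        if position > len(protein):
            continue
        if protein[position - 1] == resistant:
            found.append(f"{wild_type}{position}{resistant}")
    return found


# Synthetic 40-residue fragment: 'X' placeholders except at the marker sites,
# with wild-type residues at 26, 27, 30 and 34 and N at position 31.
residues = ["X"] * 40
for position, residue in {26: "L", 27: "V", 30: "A", 31: "N", 34: "G"}.items():
    residues[position - 1] = residue
synthetic_m2 = "".join(residues)

m2_hits = find_resistance_markers(synthetic_m2, M2_AMANTADINE_MARKERS)
print(m2_hits)  # ['S31N']
```

The same function can be applied to a full-length NA sequence with `NA_OSELTAMIVIR_MARKERS`; an empty list corresponds to the absence of H275Y and N295S.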

One of the main concerns arising from our analysis is the possibility that identified and forthcoming novel HA mutations may cause an “antigenic drift” sufficient to diminish the protective effect of vaccination against H1N1pdm09 in a significant proportion of the vaccinated population. This concern seems relevant because the viral strain used for vaccine development (influenza A/California/7/2009) does not carry the mutant form of the HA protein [27, 28].