Introduction

Avian influenza virus (AIV) is a negative-sense, segmented, single-stranded RNA virus that belongs to the genus Influenzavirus A of the family Orthomyxoviridae. Its genome is composed of eight segments encoding at least 11 proteins, including the surface glycoproteins hemagglutinin (HA) and neuraminidase (NA), transcriptase proteins (PB2, PB1, PB1-F2 and PA), matrix proteins (M1 and M2), nucleoprotein (NP), and non-structural proteins (NS1 and NS2) [1]. The segmented nature of the genome facilitates reassortment, which contributes to the rapid evolution of AIVs. Furthermore, this genomic reshuffling plays a major role in generating pandemic influenza strains [2]. In the past, reassortant viruses with avian lineage gene segments caused the influenza pandemics of 1957, 1968 and 2009 [3, 4]. Thus, for pandemic preparedness, it is important to monitor the evolution of AIVs in avian populations, including wild and domestic birds.

Wild birds, particularly waterfowl, are considered the main reservoir of AIVs [5-7], which usually cause either mild or subclinical disease in domestic poultry hosts. Generally, evolution of AIV involves mutations or reassortment caused by the exchange of gene segments [7, 8]. Surveillance studies and genetic characterization of AIVs in nature have been widely performed, as AIVs are prone to reassort in migratory birds, resulting in the appearance of different influenza viruses of either a new subtype or the same subtype [2, 9, 10]. Viruses that reassort in wild birds can result in rapid changes in viral genotype that are conducive to host switching, potentially causing serious disease in poultry and human hosts [11]. To date, only some viruses that belong to subtypes H5 and H7 have evolved into the highly pathogenic (HP) forms by acquiring genetic mutations in the HA cleavage site [12-14], but low-pathogenic (LP) forms remain a risk for epidemic spread. In 2014, Eurasian-origin H5 related to the Goose/Guangdong lineage spread rapidly to domestic flocks in East Asia, Western Europe and North America, consistent with movement by wild birds [15]. Thus, it is imperative to characterize newly isolated H5 AIVs, particularly in Asia, where biosecurity between wild and domestic birds is porous, in order to monitor viral distribution and any genetic changes that may alter the viral phenotype.

It has been hypothesized that LPAI H5N2 viruses are the precursor strains responsible for the outbreaks of H5N2 HPAIV in poultry in northern Europe [16], Mexico [17, 18], Italy [19], Texas [20] and Taiwan [21]. Also, the presence of avian-origin LPAI H5N2 viruses in domestic ducks and pigs in Korea [22, 23] and in food products in Singapore [24] has been reported. In addition, Japan experienced outbreaks in domestic poultry during 2005-2006, caused by LPAI H5N2 viruses of the North American lineage [25]. During these outbreaks, 41 farms were affected, in which 70 % of poultry were laying hens and 30 % were broilers [26]. Owing to past experiences with the LPAI H5N2 virus and its potential for mutating to HPAIV, the presence of H5 LPAIV as well as the H7 LPAIV in birds and poultry was incorporated in the list of Domestic Animal Infectious Diseases regulated by law in Japan and strictly monitored by the veterinary authorities (http://www.maff.go.jp/aqs/english).

Here, we report the genetic characterization of multiple reassortant H5N2 viruses isolated from wild birds in Hokkaido during 2009 and 2011 and propose a mechanism for their genesis. Molecular characteristics of these isolates were investigated in comparison with other H5N2 strains, including the previously reported Japanese strains from wild birds and chickens. In addition, phylogenetic analysis was conducted to estimate the evolutionary history of the isolates. Combined, this work helps us to understand the temporal-spatial scale for evolution of contemporary H5 viruses, which remain an agricultural and public-health concern.

Materials and methods

Viruses and virus identification

Ten viruses of the H5N2 subtype, isolated from wild birds in Hokkaido, the northern part of Japan, were included in this study. Isolation methods and surveillance details were reported previously [27]. Briefly, 10 % fecal homogenates were inoculated into 10-day-old embryonated chicken eggs by the allantoic route. The eggs were incubated at 37 °C for 4 days. The allantoic fluid was then collected and tested for hemagglutination activity. The presence of AIV was checked by real-time RT-PCR for detection of the M gene [28]. The AIV isolates were subtyped for HA and NA by RT-PCR and confirmed by serological tests according to the methods described previously [29-31]. Three of the 10 H5N2 viruses were isolated at Hakuryo Pond during 2009 from wild birds migrating through Hokkaido [27]. In 2011, the other seven strains of H5N2 were isolated from wild birds at Hakuryo Pond and the Obihiro River and identified as reported previously. Table 1 lists the information for all 10 isolates.

Table 1 H5N2 viruses obtained from wild birds in Hokkaido in 2009 and 2011 and used in this study

Nucleotide sequencing

Total RNA was extracted from allantoic fluids using ISOGEN-LS (NIPPON GENE CO., LTD., Tokyo, Japan). For nucleotide sequencing of AIV genes, the viral RNA was transcribed into cDNA using the universal 12-mer primer (5′-AGC RAA AGC AGG-3′) and Superscript III Reverse Transcriptase (Invitrogen, Carlsbad, CA, USA) at 42 °C for 60 min, followed by 70 °C for 10 min. Using the cDNA as a template, PCR was conducted to amplify the entire length of all eight viral segments using segment-specific primers as described by Hoffmann et al. [32] and Obenauer et al. [33]. The resulting PCR products were separated by 1 % agarose gel electrophoresis and purified using either a QIAquick PCR Purification Kit (QIAGEN, Hilden, Germany) or a Gene Clean II Kit (Biogene, Inc., USA). The purified PCR products were used as templates for sequencing reactions using a BigDye Terminator ver. 3.1 Cycle Sequencing Kit (Applied Biosystems, Foster City, CA). Sanger sequencing was performed using an Applied Biosystems 3500 Genetic Analyzer (Life Technologies, Carlsbad, CA). The primer sets described above as well as additional overlapping primers that were designed as needed (sequences available upon request) were used for nucleotide sequencing.
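The universal 12-mer primer quoted above contains the IUPAC degenerate code R (A or G), so it stands for two concrete oligonucleotides. The following minimal sketch expands a degenerate primer into its variants. The function name `expand_primer` is hypothetical and is used here only for illustration; it is not part of the authors' workflow.

```python
from itertools import product

# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_primer(primer: str) -> list:
    """Return every concrete sequence encoded by a degenerate primer."""
    # Spaces are only visual grouping in the printed primer, so drop them.
    bases = primer.replace(" ", "").upper()
    return ["".join(combo) for combo in product(*(IUPAC[b] for b in bases))]

# The 12-mer from the text: one degenerate position (R) gives two variants.
print(expand_primer("AGC RAA AGC AGG"))
# → ['AGCAAAAGCAGG', 'AGCGAAAGCAGG']
```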

Molecular characterization and genotypic assignment

Viral nucleotide sequences were analyzed with the Genetyx Ver. 10 software (GENETYX Corp., Tokyo, Japan) and compared with other sequences available in GenBank identified by BLAST homology searches (http://www.ncbi.nlm.nih.gov/genomes/FLU/FLU.html). Nucleotide and deduced amino acid sequences of all of the gene segments of the isolates were compared with those of other H5N2 strains, including previously reported Japanese strains, based on alignments made using Clustal W [34]. Genotyping of the H5N2 viruses was conducted using the influenza A virus genome segment genotyping tool available at http://www.flugenome.org/genotyping.php. The program was used to determine the individual genome segment lineage as described previously by Lu et al. [35]. Alleles for gene segments were determined using the BLAST method [36] with coverage >95 % and identity >90 %. Concatenating alleles for each gene segment created genotypes of each isolate.
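The allele-assignment rule described above (keep BLAST hits with coverage >95 % and identity >90 %, then concatenate one allele per segment) can be sketched as follows. This is a minimal illustration under stated assumptions, not the FluGenome implementation: the `Hit` record, the tie-breaking rule (highest identity, then coverage), the segment order and the `?` placeholder for unassigned segments are all hypothetical choices, and the lineage labels in the usage example are made up.

```python
from typing import Dict, List, NamedTuple, Optional

# Assumed segment order (segments 1 to 8); not stated in the text.
SEGMENT_ORDER = ["PB2", "PB1", "PA", "HA", "NP", "NA", "M", "NS"]

class Hit(NamedTuple):
    lineage: str     # hypothetical lineage label of the reference sequence
    coverage: float  # query coverage, in percent
    identity: float  # nucleotide identity, in percent

def assign_allele(hits: List[Hit], min_cov: float = 95.0,
                  min_id: float = 90.0) -> Optional[str]:
    """Return the lineage of the best hit passing both thresholds, else None."""
    passing = [h for h in hits if h.coverage > min_cov and h.identity > min_id]
    if not passing:
        return None
    # Assumed tie-break: prefer the highest identity, then the highest coverage.
    best = max(passing, key=lambda h: (h.identity, h.coverage))
    return best.lineage

def build_genotype(hits_by_segment: Dict[str, List[Hit]]) -> str:
    """Concatenate one allele per segment; '?' marks an unassigned segment."""
    alleles = []
    for segment in SEGMENT_ORDER:
        allele = assign_allele(hits_by_segment.get(segment, []))
        alleles.append(allele if allele is not None else "?")
    return "[" + ",".join(alleles) + "]"

# Usage with made-up hits: the HA hit fails the identity threshold.
example = {
    "PB2": [Hit("K", 99.0, 97.5), Hit("G", 99.0, 91.0)],
    "HA": [Hit("5J", 96.0, 89.0)],
}
print(build_genotype(example))
```

Under these assumptions, two isolates receive the same genotype only if all eight segment alleles agree, which is why a difference in a single segment yields a distinct genotype.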

Phylogenetic analysis

Phylogenetic analysis was performed by aligning nucleotide sequences of the H5N2 viruses from this study with those of other isolates of the Eurasian and American lineages and closely related isolates identified using BLAST. Multiple sequence alignment was performed using Clustal W [34]. The Tamura-Nei substitution model was identified as the best-fit model for the data and used for phylogenetic tree reconstructions of all eight gene segments. All positions containing gaps and missing data were eliminated. Phylogenetic trees were constructed using the maximum-likelihood (ML) method supported by 500 bootstrap replicates [37] using MEGA 6.06 software [38].
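Eliminating all positions that contain gaps or missing data corresponds to a complete-deletion treatment of the alignment. A minimal sketch of that idea is given below; it is an illustration only and not MEGA's own code. The function name `complete_deletion` and the choice of `-` and `?` as the characters marking gaps and missing data are assumptions.

```python
from typing import Dict

def complete_deletion(alignment: Dict[str, str],
                      drop_chars: str = "-?") -> Dict[str, str]:
    """Remove every alignment column in which any sequence has a gap or missing-data symbol."""
    if not alignment:
        return {}
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) > 1:
        raise ValueError("sequences must be aligned to equal length")
    n_sites = lengths.pop()
    # Keep a column only if no sequence carries a dropped character there.
    keep = [i for i in range(n_sites)
            if all(seq[i] not in drop_chars for seq in alignment.values())]
    return {name: "".join(seq[i] for i in keep)
            for name, seq in alignment.items()}

# Toy alignment: columns 3 and 4 (1-based) are removed.
toy = {"a": "ATG-CA", "b": "AT?ACA", "c": "ATGACA"}
print(complete_deletion(toy))
# → {'a': 'ATCA', 'b': 'ATCA', 'c': 'ATCA'}
```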

Bayesian phylogenetic analysis was conducted to determine the time to the most recent common ancestor (TMRCA) and date the divergence between LP and HP strains circulating in East Asia. Sequences of AIV from avian hosts collected globally were downloaded from the Influenza Research Database (IRD) on 15 July 2015 and aligned using MUSCLE [39], and any incomplete sequences were removed. Using PAUP 4.0 [40], neighbor-joining trees were generated using the HKY85 substitution model and taxa that formed outgroups were randomly down-sampled. The final number of sequences in each dataset was as follows: PB2, n = 450; PB1, n = 418; PA, n = 404; H5, n = 454; NP, n = 406; N2, n = 207; M, n = 386; NS, n = 894. Phylogenetic analysis was performed using a Bayesian Markov chain Monte Carlo (MCMC) method implemented in BEAST v1.8.0 [41]. The date of sample collection associated with each sequence was used to time calibrate the tree. The uncorrelated lognormal relaxed molecular clock and general time-reversible (I+G) substitution model were used with a Bayesian skyride coalescent tree prior. We performed four independent MCMC chains of 50 million generations that were diagnosed in Tracer 1.6 to ensure that the effective sample size of each run reached 200. Trees from each MCMC run were combined after appropriate burn-in (~10 %) to produce 10,000 trees. For TMRCA analysis, the most recent sequences were used to calibrate the root height of the tree and the nodes were used to estimate the divergence time between two sequences.
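The step of discarding burn-in from each independent chain and pooling the remainder can be illustrated with a short sketch. The text does not state the sampling interval or the tool used for combining runs (in practice, a utility such as LogCombiner from the BEAST package is commonly used), so the function `combine_chains`, its even-thinning rule and the toy chain lengths below are assumptions for illustration only.

```python
from typing import List, Optional, Sequence

def combine_chains(chains: Sequence[Sequence], burnin_frac: float = 0.10,
                   target: Optional[int] = None) -> List:
    """Drop the first burnin_frac of each chain, pool the rest, and optionally thin evenly."""
    pooled = []
    for samples in chains:
        start = int(len(samples) * burnin_frac)  # number of burn-in samples
        pooled.extend(samples[start:])
    if target is None or target >= len(pooled):
        return pooled
    # Even thinning down to `target` samples (assumed rule).
    step = len(pooled) / target
    return [pooled[int(i * step)] for i in range(target)]

# Toy example: four chains of 100 logged samples each, 10 % burn-in,
# thinned to 36 pooled samples. Integers stand in for sampled trees.
chains = [list(range(100)) for _ in range(4)]
combined = combine_chains(chains, burnin_frac=0.10, target=36)
print(len(combined), combined[0])
# → 36 10
```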

Results

Virus identification

Samples showing hemagglutination titers from 1:64 to 1:512 after harvest from inoculated eggs were confirmed for influenza A virus infection by real-time RT-PCR for detection of the M gene segment, which yielded Ct values ranging from 16.96 to 22.70 (Table 1). The 10 isolates were identified as H5N2 using RT-PCR and serological tests (data not shown) followed by sequencing of the viral genome. The entire genome sequences of the isolates obtained in this study were deposited in GenBank under the accession numbers shown in Table 1. The sequences including the HA, NA and M genes of three H5N2 viruses (9UO025, 9UO036 and 9UO139) isolated in 2009 were previously reported by Abao et al. [27], while the rest of the sequence data were obtained in this study.

Molecular characterization

Genomic features of the 10 H5N2 isolates are shown in Table 2. Analysis of the deduced amino acid sequences of the HA protein revealed that nine of the 10 H5N2 viruses had a PQRETR/GLF amino acid sequence motif at the HA cleavage site, whereas a single virus (9UO139) had PQKETR/GLF. Both motifs are typical of LPAI H5 viruses [42]. All of the isolates had a Gln at position 222 and Gly at position 224 of the HA, suggesting that all of them retained affinity for alpha 2,3 avian-type glycan receptors [43, 44]. No NA stalk deletions, which are associated with adaptation to terrestrial poultry [45], were found in any of the Japanese H5N2 strains, including the 10 isolates described here, except in the H5N2 Ibaraki strain, which was isolated in 2005 [25]. Several amino acids that enhance pathogenicity in mice and/or humans were detected in genes of the new H5N2 isolates. An Ala residue at position 42 of the NS1 protein was found in the three isolates from 2009, as well as in the reference Japanese strains, including Ibaraki and Akita (Table 2). All of the H5N2 viruses reported in this study and most of the reference strains contained the PDZ binding sequence motif (-ESEV-), although the Ibaraki strain has -ESKV- in the NS1 gene sequence. The PB1-F2 gene of the 9UO036 and 9UO139 H5N2 isolates had an S66 mutation, while the other strains and reference strains retained N66. No other genetic markers associated with virulence in mammals, such as mutations in the PB2 gene (E627K and D701N) [11, 46], were found in any of the isolates or selected Japanese strains. Moreover, no mutations conferring amantadine or oseltamivir resistance, such as H274Y in the NA gene, were found in any of the isolates or the selected reference strains shown in Table 2 [47-49], except for the Ibaraki strain, which had 26F and 31N in the M2 protein, indicating amantadine resistance.
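As a rough illustration of how the HA cleavage-site motifs reported above differ from a polybasic motif, the sketch below measures the longest run of consecutive basic residues (R or K) upstream of the cleavage point. This is a heuristic for illustration only and not a pathotyping method: the threshold of three consecutive basic residues and the function names are assumptions, and the polybasic example motif (PQRERRRKKR/GLF, commonly cited for Goose/Guangdong-lineage H5N1 HPAI viruses) is included only for contrast and does not come from this study.

```python
import re
from typing import Tuple

def split_cleavage_site(motif: str) -> Tuple[str, str]:
    """Split a motif written as 'UPSTREAM/DOWNSTREAM' at the cleavage point."""
    upstream, sep, downstream = motif.partition("/")
    if not sep:
        raise ValueError("motif must contain '/' marking the cleavage point")
    return upstream, downstream

def longest_basic_run(upstream: str) -> int:
    """Length of the longest stretch of consecutive R or K residues."""
    runs = re.findall(r"[RK]+", upstream.upper())
    return max((len(run) for run in runs), default=0)

def looks_polybasic(motif: str, min_run: int = 3) -> bool:
    """Heuristic flag: True if the upstream part has a basic run of at least min_run."""
    upstream, _ = split_cleavage_site(motif)
    return longest_basic_run(upstream) >= min_run

for motif in ["PQRETR/GLF", "PQKETR/GLF", "PQRERRRKKR/GLF"]:
    print(motif, looks_polybasic(motif))
# → PQRETR/GLF False
# → PQKETR/GLF False
# → PQRERRRKKR/GLF True
```

A sequence-based flag of this kind can only prompt closer inspection; it does not replace established pathogenicity testing.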

Table 2 Molecular analysis of amino acid sequences of the H5N2 viruses isolated in Hokkaido in 2009 and 2011 in comparison with reference strains

Phylogenetic analysis

HA and NA genes

The ten HA gene sequences of the H5N2 isolates were analyzed to explore their evolutionary relationships relative to the Eurasian and American lineages and H5 viruses. The initial ML analysis revealed that the HA genes of the H5N2 isolates belong to the Eurasian lineage and clustered separately from the HPAI H5 viruses. The HA genes of our H5N2 isolates were closely related and clustered with viruses from wild birds in Japan, Korea and China sampled between 2005 and 2010 (Fig. 1). Bayesian analysis indicated that the H5N2 isolates diverged from the HPAIV clade circa 1990 and were distantly related to clade 2.3.4.4 (Fig. 4).

Fig. 1

Maximum-likelihood phylogenetic trees of the H5 gene of avian influenza isolates in Japan together with reference strains of H5 subtypes available in the GenBank database. The H5N2 strains isolated in 2009 (●), 2011 at Hakuryo Pond (▲) and 2011 at Obihiro River (■) are indicated by markers. Numbers at each node indicate bootstrap values ≥60 %. Phylogenetic analysis was conducted using MEGA6.06. The scale bar indicates 0.05 nucleotide substitutions per site

All NA genes of the 10 H5N2 viruses fell within the Eurasian lineage. These sequences formed a monophyletic group structured according to year and location of isolation. All of the H5N2 isolates clustered with a lineage containing H4N2, H5N2, H9N2, H10N2, H13N2 and H6N2 viruses isolated in Japan, Korea and China. The N2 gene sequences of the 10 H5N2 isolates were most closely related to A/avian/Japan/8KI0148/2008(H4N2) (Fig. 2). The TMRCA of the N2 genes of these viruses and clade 2.3.4.4 is dated to approximately 1975-1976, demonstrating that the NA genes of these Japanese isolates are not an immediate precursor of the circulating clade 2.3.4.4 virus.

Fig. 2

Maximum-likelihood phylogenetic trees of the full N2 gene of avian influenza isolates in Japan together with reference strains of N2 subtypes available in the GenBank database. The H5N2 strains isolated in 2009 (●), 2011 at Hakuryo Pond (▲) and 2011 at Obihiro River (■) are indicated by markers. Numbers at each node indicate bootstrap values ≥60 %. Phylogenetic analysis was conducted using MEGA6.06. The scale bar indicates 0.05 nucleotide substitutions per site

Internal genes

ML analysis indicated that all of the internal genes of the H5N2 isolates belonged to the Eurasian-like avian lineage of AIV (Fig. 3A-F). The ML tree topology of the M gene was similar to those of the HA and NA trees, indicating possible co-carriage, as they clustered within a single branch with H3N8 viruses from Japan and Laos as well as H6N1 and H7N7 viruses from Korea (Fig. 3E). TMRCA analysis revealed that the H5N2 isolates diverged from the currently circulating clade 2.3.4.4 between 1977 and 1978 (Fig. 4).

Fig. 3

Maximum-likelihood phylogenetic trees of the internal genes of H5N2 avian influenza isolates in Japan together with reference strains available in the GenBank database. Phylogenetic trees are shown for the nucleotide sequences for PB2, PB1, PA, NP, M and NS (A-F, respectively). The H5N2 strains isolated in 2009 (●), 2011 at Hakuryo Pond (▲) and 2011 at Obihiro River (■) are indicated by markers. Numbers at each node indicate bootstrap values ≥60 %. Phylogenetic analysis was conducted using MEGA6.06. The scale bar indicates nucleotide substitutions per site

Fig. 4

Dated phylogenies of all six internal genes, the H5 gene, and the N2 gene belonging to the Eurasian lineage of influenza viruses using Bayesian inference (BEAST 1.8.0). The 10 isolates generated in this study are color coded in purple (9UO), green (11UO) and blue (11OG), distinguishing the three separate sampling events in Japan. All eight gene trees indicate that the 10 isolates are only distantly related to the highly pathogenic H5 clade 2.3.4.4, with the most recent common ancestor dated at 1975-1996. The horizontal axis shows the calendar years

For PB2, the 10 H5N2 isolates clustered into two distinct subgroups. Six of the isolates obtained from Hakuryo Pond (UO samples) clustered with H9N2, H6N5, H6N8 and H3N8 viruses isolated from ducks in Japan and an H3N8 strain isolated from an environmental sample in Korea. The other four isolates sampled during 2011 grouped with isolates from chickens (H7N7) and domestic ducks (H4N8 and H6N1) from China, as well as the H10N4 virus of wild birds in Korea (Fig. 3A). Molecular dating indicated that these four H5N2 viruses diverged from the H5 clade 2.3.4.4 during 1996-1997 (Fig. 4).

The PB1 genes of the 10 H5N2 isolates were divided into four distinct clusters. Three isolates at Hakuryo Pond, including 11UO0008, 11UO0012 and 11UO0023, were close to each other and closely related to the isolates from Korea (H4N6), Vietnam (H11N9), Hong Kong (H11N9) and Japan (H6N5). The four isolates in the second cluster, 11OG1032, 11OG1038, 11OG1083 and 11OG1084, were close to the isolates from China (H6N1), Vietnam (H5N2), Korea (H4N6) and Japan (H4N2). The other isolates, including 9UO025, 9UO036 and 9UO139, formed a distinct cluster relative to the 2011 group (Figs. 3B and 4).

For the PA gene, the ML tree revealed that the 10 isolates were separated into four distinct clusters. Among those, 11UO samples were close to each other and formed a cluster distinct from the 11OG samples. The 11UO samples were closely related to Korean strains (H2N3 and H3N8). The 11OG samples were close to A/wild duck/Korea/SH5-26/2008(H4N6). The 9UO isolates were structured into two distant clusters in which 9UO036 and 9UO139 formed a cluster with Japanese H3N5 and H6N5 and Chinese H5N2 isolates. The 9UO025 isolate was parental to the 11UO isolates (Fig. 3C).

For the NP gene, the ML tree showed that the 10 isolates were separated into two distinct clusters. The three isolates of the 11UO group were slightly distant from other H5N2 isolates and clustered with the H3N8 isolated from a Vietnamese domestic duck and other Korean (H3N1, H3N6 and H3N8) and Chinese isolates (H6N2). Other H5N2 isolates, including the three isolates in the 9UO group and four isolates in the 11OG group, clustered together, along with Japanese isolates of the H3N5, H3N8, H5N1, H6N5 and H9N2 subtypes (Fig. 3D).

The NS gene tree was structured into two alleles: allele A, containing 11UO and 11OG, and allele B, containing 9UO isolates. The ML analysis showed that the 11UO and 11OG isolates were closely related to each other and closely related to the Korean H7N7, Chinese H11N9 and American H4N6 isolates. The TMRCA for the H5N2 and American isolates was relatively recent, circa 2005 (Fig. 4). In contrast, the 9UO isolates were highly similar to each other and closely related to the H3N8 Japanese strains, Korean strains (H6N1, H3N8), Chinese H11N9 and Taiwanese strains (H4N6 and H3N8) (Fig. 3F).

Assignment of genotypes

Three unique genotypes were assigned to the 10 isolates of H5N2 AIVs (Table 3). These genotypes differed in the PB2 and NS gene segments, and each of the three genotypes was assigned to more than one isolate. These genotype assignments depended on either the year (2009 or 2011) or sampling site (Hakuryo Pond or Obihiro River) and resulted in at least five genomic patterns among the 10 H5N2 subtype isolates (Supplementary Table S1). All of the 11OG strains (four isolates) had the same pattern, in which the genes were derived from at least six different origins, while the 11UO strains (three isolates) shared a common pattern in which the genes originated from seven AIVs with four subtypes. In contrast, the 9UO strains showed three different genomic patterns, in which the 9UO036 and 9UO139 strains differed only in the NS gene segment. However, 9UO025 seemed to have evolved separately from the 9UO036 and 9UO139 strains.

Table 3 Genotypes of the H5N2 viruses isolated from wild birds in Hokkaido during 2009 and 2011

Discussion

Here, we report the results of full-genome sequencing and genetic analysis of 10 H5N2 viruses obtained during surveillance in wild birds in Japan during 2009 and 2011, revealing multiple genetic reassortments. Phylogenetic analysis of the eight genomic segments of the 10 H5N2 viruses indicated that the HA, NA and M genes were closely related to each other and to other LPAIVs isolated in East Asia. In contrast, the other segments were derived from various subtypes of AIV in the Eurasian-like avian lineage, while the NS lineage was associated with America-to-Asia viral flow. Phylogenetic analysis indicated that the H5N2 viruses contain segments closely related to the corresponding genes of a range of subtypes, including those of viruses isolated from wild birds in Korea, Japan, China, Taiwan and Vietnam. The detection of multiple reassortant H5N2 viruses sampled in the same year and at the same location (9UO, 11UO and 11OG isolates) implies a high frequency of reassortment among AIVs, as described previously [50, 51]. This observation suggests that these viruses could play a role as gene donors for H5N2 viruses that are found in Japan and provides further evidence of dynamic inter-subtype interaction in this region.

No mutations that increased virulence of the H5N2 viruses were found except in the NS and PB1-F2 genes. The NS1 protein of 9UO025, 9UO036 and 9UO139 displayed a mutation of S42A, which has been associated with an increase in virulence in chickens and mice [52]. In all of the H5N2 isolates in this study, the PDZ binding sequence at the C-terminus of the NS1 protein was ESEV and the residue at position 138 was Phe (F), suggesting increased virulence in mammals [33, 53]. In addition to the mutation in the NS1 gene, the mutation N66S, associated with increased virulence of influenza A viruses in mice [54], was found in the PB1-F2 gene of two isolates (9UO036 and 9UO139). Interestingly, the Ibaraki strain (a reference strain that caused an outbreak in Japan during 2005) did not contain either S66 or N66 in the PB1-F2 gene. Taken together, several critical residues that contribute to the pathogenicity of AIV were found in the NS1 and PB1-F2 genes of our H5N2 isolates.

In order to further characterize the 10 H5N2 strains, we analyzed the genome for molecular markers associated with resistance to existing antivirals. To date, amantadine and oseltamivir have been developed for prophylaxis and treatment of influenza A virus infection. However, a single substitution in the M2 or NA protein has been shown to confer resistance to amantadine and oseltamivir, respectively [47-49, 55]. In this study, the NA and M2 of the H5N2 viruses did not contain any of the amino acid substitutions associated with resistance to oseltamivir and amantadine (Table 2). On the other hand, the Ibaraki strain contained two amino acid substitutions in the M2 protein, at positions 26 (L26F) and 31 (S31N), suggesting a reduced susceptibility to amantadine (Table 2).

In this study, all H5N2 viruses were characterized as LPAIV derived from multiple reassortments. However, the presence of H5N2 viruses in migratory wild birds in Japan represents a continued risk for poultry, as H5N2 LPAIV may acquire high pathogenicity by consecutive passages in chickens [21, 56]. In this study, the NP genes of the H5N2 viruses isolated in 2011 (11UO samples) were from the same lineage as H3N8 viruses isolated from domestic ducks in Vietnam. This result implies exchange of virus between migratory birds and free-ranging domestic ducks in South Asia, followed by movement to Japan facilitated by wild migratory birds. The continued circulation and proximity of rapidly evolving viruses, as demonstrated with the group of H5N2 viruses analyzed here, call for expanded and continued surveillance of LPAIVs in poultry to ensure early detection of infection and limit regional spread of the virus due to the migration of wild birds. The rapid evolution of H5 and H7 viral subtypes to HP strains [13, 14] is a mechanism that was involved in 10 past outbreaks in poultry. Among them, three outbreaks were associated with H5N2 viruses [12]. Despite their phylogenetic distance from the current clade 2.3.4.4 H5 virus, the circulation and persistent presence of both HPAIV and LPAIV pathotypes in wild birds and poultry highlight the need for full genetic characterization of H5 viruses to better understand the evolution and potential origin of high-risk strains. This includes the risk to human health, as there have been reports of H5N2 virus of poultry origin infecting humans [57, 58], and we currently have little ability to predict pathogenicity from the viral sequence.