Inventory of molecular markers affecting biological characteristics of avian influenza A viruses

Avian influenza viruses (AIVs) circulate globally, spilling over into domestic poultry and causing zoonotic infections in humans. Fortunately, AIVs are not yet capable of causing sustained human-to-human infection; however, AIVs are still a high risk as future pandemic strains, especially if they acquire further mutations that facilitate human infection and/or increase pathogenesis. Molecular characterization of sequencing data for known genetic markers associated with AIV adaptation, transmission, and antiviral resistance allows for fast, efficient assessment of AIV risk. Here we summarize and update the current knowledge on experimentally verified molecular markers involved in AIV pathogenicity, receptor binding, replicative capacity, and transmission in both poultry and mammals with a broad focus to include data available on other AIV subtypes outside of A/H5N1 and A/H7N9.


Introduction
Avian influenza viruses (AIVs) (Family: Orthomyxoviridae, genus: Influenzavirus A) are negative-sense, single-stranded RNA viruses found globally in their natural reservoir hosts, wild waterfowl, and other aquatic birds (mainly of the orders Anseriformes and Charadriiformes). AIV genomes are composed of eight genomic RNA segments encoding at least twelve viral proteins. Viral nomenclature is based on combinations of the two surface glycoproteins, hemagglutinin (HA) and neuraminidase (NA). To date, sixteen HA (H1-H16) and nine NA (N1-N9) subtypes circulate in wild aquatic birds [1], and two novel HA subtypes (H17 and H18) and two novel NA subtypes (N10 and N11) have recently been identified in bats [2][3][4][5][6]. AIVs are sporadically transmitted from waterfowl to domestic avian species, resulting in a number of stable AIV lineages in domestic poultry. These domestic lineages typically circulate in poultry flocks as low pathogenic (LPAIV) variants, causing little to no apparent illness; however, some subtypes (namely A/H5 and A/H7) have the potential to mutate to form highly pathogenic (HPAIV) variants capable of causing high mortality rates in domestic avian species. These HPAIV lineages then spread back to wild bird species, potentiating global spread [1].

severe acute respiratory illness. Infections with particularly virulent viral subtypes, such as A/H5 and A/H7, or infections in immunocompromised, high-risk hosts can cause high morbidity, respiratory distress and/or multiple-organ failure, and, in some cases, mortality [13]. Fortunately, minimal human-to-human transmission occurs in human AIV cases; however, each zoonotic event in mammals represents a risk for AIVs to adapt replicative and transmissible properties in the mammalian host. Indeed, several studies using ferret models have shown that these AIVs have the potential to mutate into airborne transmissible forms, potentially with some mutations already present in nature [14,15]. However, while these observations are very important, point mutations associated with airborne transmission between ferrets are not universal, as they may not confer similar changes in biological characteristics of these viruses in other mammalian species, including humans.
Given the continual circulation of AIVs in wild birds and domestic poultry, the potential for human spillover, and the mutable nature of AIV itself, it is paramount to understand the potential risk of any emerging AIV. Indeed, risk assessment tools are vital in pandemic preparedness planning [16]. Genetic variation resulting in changes in viral properties such as receptor binding, replicative capacity, and transmission is a critical component of risk assessment. Historically, these factors have been assessed in vivo; however, the ability to evaluate these properties in silico from sequence data allows for faster, more efficient assessment of novel, emerging strains. Numerous studies identify molecular markers for AIV risk, and several previous papers compile markers of interest. Here, we summarize and update the current knowledge on experimentally verified molecular markers affecting biological characteristics of avian influenza viruses important for risk assessment and broaden the scope beyond the A/H5N1 and A/H7N9 subtypes.
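The in silico assessment described above can be illustrated with a minimal Python sketch that checks a protein sequence for a handful of markers named in this text (PB2 E627K, NP N319K, M1 N30D, and M1 T215A). The names `Marker`, `MARKERS`, and `screen` are hypothetical and not part of any published tool. The sketch assumes a full-length, ungapped amino acid sequence whose first residue is position 1 in the numbering used for that protein; it only reports the presence of a residue and says nothing about subtype- or host-specific effects.

```python
from typing import NamedTuple


class Marker(NamedTuple):
    protein: str   # protein name, e.g. "PB2"
    position: int  # 1-based position in that protein's numbering
    residue: str   # amino acid associated with the reported phenotype


# Illustrative subset only, taken from markers mentioned in this text.
MARKERS = [
    Marker("PB2", 627, "K"),
    Marker("NP", 319, "K"),
    Marker("M1", 30, "D"),
    Marker("M1", 215, "A"),
]


def screen(protein: str, sequence: str, markers=MARKERS) -> list[str]:
    """Return the markers for `protein` that are present in `sequence`.

    Assumes `sequence` is full length and ungapped, so that string index
    (position - 1) corresponds directly to the marker numbering.
    """
    hits = []
    for marker in markers:
        if marker.protein != protein:
            continue
        if marker.position > len(sequence):
            continue  # sequence too short to cover this position
        if sequence[marker.position - 1].upper() == marker.residue:
            hits.append(f"{marker.protein}:{marker.position}{marker.residue}")
    return hits
```

A hit from such a screen is a prompt for closer inspection rather than a risk call, since the effect of a given marker can depend on subtype and host.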

Data collection
All available information on AIV molecular markers/mutations was collected from: the CDC H5N1 Genetic changes inventory [17], the WHO Working Group on Surveillance of Influenza Antiviral Susceptibility (WHO-AVWG) [18], and publications summarizing AIV mutations and molecular markers affecting biological characteristics and potential risk [19][20][21]. Journal articles were sourced for each specific subtype using PubMed searches with MeSH Terms, Boolean operators and wild cards. For example, terms used to search for mutations in H6 AIVs: (

Inclusion/exclusion
All mutations/molecular markers from influenza viruses of avian origin were included in the initial screening and tabulation, including viruses isolated from human AIV cases. However, we do not consider this inventory an exhaustive list. Several studies computationally predict molecular markers of viral adaptation to humans [22][23][24][25][26]. While this work is extremely valuable, experimental validation of the majority of the markers described in these studies is not available, and these markers were therefore excluded from this data summary. In addition, since this inventory focuses on AIV and zoonotic infection, genome mutations in human seasonal influenza viruses were also excluded. As several publications observe the same mutations causing similar biological characteristics that could indicate risk markers, we excluded duplicate information from the tables.

Numbering
For all data presented, HA mutations are numbered according to the H3 subtype to maintain consistency with available literature; however, H5 numbering was also included in the table. N2 numbering was used for NA as this is most commonly used in the current literature. Internal proteins (including deletions) and deletions in NA are numbered according to the full length of A/Goose/Guangdong/1/1996 genome segments.
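Reporting a mutation in a reference numbering scheme requires translating positions in the query sequence to positions in the reference. The following Python sketch shows one way to do this from a pairwise alignment. The function name `map_to_reference` and the `ref_start` parameter are hypothetical; the sketch assumes the two gapped strings come from an aligner of the reader's choice and that `ref_start` is the number assigned to the first reference residue in the chosen scheme.

```python
def map_to_reference(ref_aln: str, query_aln: str, ref_start: int = 1) -> dict[int, int]:
    """Map 1-based query positions to reference numbering.

    `ref_aln` and `query_aln` are the two rows of a pairwise alignment,
    equal in length, with '-' marking gaps. Query residues aligned to a
    gap in the reference (insertions) receive no reference number.
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned sequences must have the same length")

    mapping: dict[int, int] = {}
    ref_pos = ref_start - 1  # number of the last reference residue seen
    query_pos = 0            # number of the last query residue seen
    for ref_char, query_char in zip(ref_aln, query_aln):
        if ref_char != "-":
            ref_pos += 1
        if query_char != "-":
            query_pos += 1
            if ref_char != "-":
                mapping[query_pos] = ref_pos
    return mapping
```

With a toy alignment in which the query lacks two reference residues, `map_to_reference("MKTIIAL", "MK--IAL")` maps query position 3 to reference position 5, which is why the same residue can carry different numbers in different schemes.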

Surface proteins
Hemagglutinin (HA; Table 1)
Hemagglutinin is a homotrimeric transmembrane protein and is the most abundant protein present on the surface of influenza virions. For virions to successfully enter and replicate in host cells, host proteases cleave the HA0 precursor into two subunits, HA1 and HA2. The main role of the HA1 subunit is to initiate infection by recognizing and binding receptors on the host cell surface. After internalization and entrance into the endosomal pathway, the HA2 subunit fuses the viral and endosomal membranes, creating a pore for viral RNA entry into the host cell, and initiation of transcription and translation of viral products [27,28]. Several mutations in HA are associated with changes in viral fitness and transmissibility, as they affect viral receptor binding avidity/specificity or viral membrane fusion activities (Table 1).
A switch in binding preference from α2,3 "avian-type" receptors to α2,6 "human-type" receptors is considered a key factor for pandemic potential of AIVs. There are numerous HA mutations that, individually or in combination, affect viral receptor binding preference, pathogenicity and transmissibility [40]. Two mutations, E190D and G225D, increase the preference for "human-type" receptors in both the 1918 "Spanish" A/H1N1 pandemic virus and the 2009 A/H1N1 "Swine Flu" pandemic virus [41,42]. However, the impact of these mutations on other AIVs appears to be subtype specific [43][44][45]. A single G225D mutation alters the receptor preference of an A/H6N1 virus isolated from a human in Taiwan. In contrast to Q226L, a singular G228S mutation in A/H5 subtype isolates produces a dual binding phenotype, increasing α2,6 binding and maintaining α2,3. Therefore, the combination of Q226L/G228S dual mutations decreases, or even ablates, α2,3 binding while simultaneously increasing affinity for α2,6. However, this elegant switch does not hold true for all subtypes.
For instance, single Q226L or dual Q226L/G228S mutations in a human A/H10N8 isolate significantly decrease α2,3 binding with only a minimal increase in α2,6 affinity [33, 34]. In addition, a number of other HA mutations, individually or in combination, affect viral receptor binding preference, pathogenicity and transmissibility, and more work is necessary to understand the risk of these mutations in all subtypes of concern.
Overall, the variable effect of HA mutations on AIV receptor binding preference between different AIV subtypes likely occurs due to differences in the HA receptor binding site (RBS). Four key structural elements are present in the HA RBS of all AIV subtypes: the 130-loop, 150-loop, 190-helix, and 220-loop. The conformation and amino acid composition of these structures are primary determinants of HA receptor specificity. Therefore, the variability of RBS loop length and amino acid composition between AIV subtypes could account for the variable effects observed with the same mutations in the RBS [40].

pH of fusion and HA stability
HA stability, or pH of fusion, refers to the pH required to trigger an irreversible conformational change in the HA1/HA2 trimer that activates the HA2 fusion peptide to mediate the fusion of the viral and endosomal membranes. This fusion creates pores through which viral ribonucleoproteins (RNPs) can exit the endosome, initiating AIV infection in the cytoplasm of the host cell. Following internalization, the endosome progressively becomes more acidic until the contents are destroyed in the host lysosome [55]. Therefore, pH of fusion dictates the efficient timing of the release of viral RNPs. If released too early, host cell recognition of viral products heightens host antiviral response, attenuating viral infection [56,57]. If released too late, endosomal contents are destroyed, preventing release of viral products. The optimal pH of fusion varies substantially. In avian species, the optimum can differ dramatically; however, a pH of fusion above 5.5 is believed to enhance AIV replication and transmission. In humans and ferrets, increased stability with a pH of fusion of less than 5.5 favors replication [28].
Understanding mutations affecting HA stability is vitally important since pH of fusion may partly contribute to the ability of AIVs to transmit by aerosol droplet in the ferret model [14,15].

Antiviral susceptibility and resistance
The function of NA is essential for productive AIV infection, as exemplified by conserved NA catalytic sites across influenza virus strains. Developed as antivirals in the 1990s, neuraminidase inhibitors (NAIs) bind to this active site of NA and prevent the release of new viruses from the surface of the infected host cell. While these antiviral compounds have been used successfully for several decades, the prevalence of NAI resistance is increasing [79]. Environmental contamination with NAIs is also a recent concern, increasing the possibility of NAI resistance in AIVs from wild birds and domestic poultry [80]. The majority of NA mutations in this updated inventory are markers associated with resistance to the major NAIs currently in use globally: oseltamivir, zanamivir, laninamivir, and peramivir. The WHO-AVWG periodically releases a comprehensive table of mutations shown to affect NAI susceptibility in seasonal human influenza viruses (including both A and B genera viruses) as well as in subtype A/H5N1 and A/H7N9 AIVs [18]. The NA table (Table 2) provides a summary of WHO-AVWG NAI susceptibility markers for A/H5N1 and A/H7N9 viruses.

Proteins of the ribonucleoprotein complex: PB2, PB1, PA, and NP
Inside the viral envelope, the eight genomic viral RNA (vRNA) segments of influenza A viruses form part of the viral ribonucleoprotein complex (vRNP). The vRNP consists of vRNA associated with multiple copies of the nucleoprotein (NP), and an RNA-dependent RNA polymerase sub-complex (RdRP), formed by polymerase basic protein 2 (PB2), polymerase basic protein 1 (PB1), and the polymerase acidic protein (PA). Cryo-EM and crystal structures of the RdRP show that PB1 forms the core of the structure, associating with PA via its N-terminus and with PB2 via its C-terminus [81-83]. Subunits of the RdRP associate with the 5′ and 3′ ends of viral RNA and with NP. The RdRP is crucial for viral transcription and replication, producing vRNA, complementary RNA and viral messenger RNA (mRNA) (reviewed in [84,85]). The synthesis of viral mRNA is dependent on RdRP cap snatching, whereby PB2 binds the 5′ cap of host RNA polymerase II transcripts [86-89], 10-13 nucleotides are cleaved by the endonuclease site of PA, and PB1 uses the cleaved fragment as a primer to initiate transcription [90]. Cap snatching not only facilitates the translation of viral mRNA, it also inhibits the production of host mRNA, referred to as host shutoff [91,92]. Mutations that hinder AIV cap snatching ability or the affinity of vRNP proteins can affect viral replicative capacity and, consequently, AIV virulence.
Mutations that increase the polymerase activity of AIVs are important for the adaptation of AIVs to mammalian hosts [84]. In a number of studies, the polymerase activity of AIVs is impaired in mammalian cell lines [93]. This reduction in polymerase activity limits the transcription of viral RNA, resulting in less viral material available to be packaged into progeny viruses. Additionally, limited replication capacity reduces viral genomic mutation, hindering the ability of AIVs to create progeny with beneficial mutations. A number of mutations in proteins of the polymerase complex enhance the replicative capacity of AIVs in mammalian cells. However, these increases in polymerase activity and/or replicative capacity do not always correlate with an increase in viral pathogenicity [94].

Polymerase basic protein 2 (PB2; Table 3)
Mutations in the PB2 protein, namely one mutation, the substitution of glutamate with lysine at position 627 (E627K), are by far the best-known mutations in the polymerase complex proteins to be associated with increases in viral fitness.
PB2 also potentially antagonizes the host interferon (IFN) response. A subset of AIVs contain PB2 proteins with an N-terminal mitochondrial targeting signal (MTS) that facilitates import of PB2 into the mitochondrial matrix [66, 67]. Mitochondrial PB2 then antagonizes host IFN production by interfering with the action of mitochondrial antiviral signaling proteins (MAVS) [113,114]. Disruption of the MTS through a single amino acid substitution at position 9 prevents mitochondrial localization of PB2, heightening host IFN response and attenuating viral virulence [115,116]. Asparagine (N9) or threonine (T9) at position 9 facilitates import of PB2 into the mitochondrial matrix. N9 is typically present in human seasonal viruses (A/H1N1 prior to 2009, A/H2N2, and A/H3N2). In contrast, AIVs typically have aspartic acid at position 9 (D9), and AIV PB2s are non-mitochondrial and predominantly found in the nucleus. Experimentally mutating seasonal human influenza viruses to contain the non-mitochondrial D9 heightens production of IFN-β and attenuates pathogenicity in mice [115]. Conversely, mutating avian A/H5N1 to contain N9 increases pathogenicity in mice, though specific cellular localization was not investigated [116].
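The residue 9 rule described above lends itself to a simple lookup. The Python sketch below encodes only what is stated in this section: N or T at position 9 is associated with mitochondrial import of PB2, and D with a non-mitochondrial PB2. The function name `pb2_mts_status` and the returned labels are hypothetical, and any other residue is reported as unknown rather than guessed. The sketch assumes the input is a PB2 amino acid sequence starting at residue 1.

```python
def pb2_mts_status(pb2_sequence: str) -> str:
    """Classify PB2 by the amino acid at position 9 (1-based)."""
    if len(pb2_sequence) < 9:
        raise ValueError("sequence too short to contain position 9")

    residue = pb2_sequence[8].upper()  # position 9 is string index 8
    if residue in {"N", "T"}:
        return "mitochondrial"
    if residue == "D":
        return "non-mitochondrial"
    return "unknown"  # residue not covered by the text
```

A result from this lookup reflects the sequence motif only; as noted above, cellular localization was not confirmed experimentally for every virus in which position 9 was mutated.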
Polymerase basic protein 1 (PB1; Table 4) and polymerase protein (PA;
Interestingly, a combination of two mutations in PA, S224P and N383D, increases polymerase activity of an A/H5N1 subtype virus and enhances viral replication in both mammalian and avian cell lines [121,122]. This dual mutation also increases A/H5N1 AIV virulence in both mice and ducks. The mutations act in synergy: N383D alone increases AIV polymerase activity, while the combination of S224P and N383D significantly increases AIV virulence. This combination differs from previously mentioned mutations, as single point mutations in polymerase proteins typically affect viral virulence in mammalian species, with minimal changes in polymerase activity, replication or virulence in avian species. Some mutations reportedly increase polymerase activity in avian and mammalian cell lines (PB2: A588V, Q591K; PB1: D3V, S678N). However, in these reports, only murine models were utilized to test AIV virulence.

Nucleoprotein (NP; Table 6)
The NP protein encapsulates viral genomic RNA and mediates import into the nucleus to initiate viral replication through nuclear localization signals and the importin-α/importin-β nuclear import pathway. Mutations that affect the functions of NP have not been extensively investigated; however, substitutions I41V, R91K, R198K, E210D, K227R, K229R, N319K, E434K, and K470R enhance polymerase activity in mammalian cell lines [53, 94, 111, 123]. These mutations do not always increase AIV virulence and the mechanism behind the increase in viral replication is not always clear. One particular mutation, N319K, improves A/H7N7 viral replication in mammalian cells by enhancing the interaction between NP and importin-α isoforms [94, 111, 124]. M105V, I109T, and A184K enhance viral replication and increase AIV virulence in chickens [125][126][127]. Interestingly, the M105V mutation may be involved in spillover adaptation from ducks to chickens, as M105V affects viral replication in embryo fibroblasts from chickens but not from ducks [127].

Matrix protein (MP; Table 8)
Influenza virus genomic segment 7 (MP) encodes the matrix 1 and 2 proteins (M1 and M2, respectively). M1 is located beneath the viral envelope where it associates with the lipid membrane and viral RNPs. This interaction needs to be ablated for viral RNP to enter the host cell cytoplasm.
Only a small number of mutations in segment 7 are associated with host adaptation. All occur in the region encoding the M1 protein. Four mutations, namely N30D, I43M, T139A and T215A, increase the virulence of A/H5N1 subtype AIVs in mice [155][156][157]. I43M also increases virulence in chickens and ducks; however, the underlying mechanisms remain unclear [157]. The M2 protein is an important target of antiviral compounds and mutations in this region contribute to antiviral resistance phenotypes. Adamantanes (amantadine and rimantadine) block the M2 ion channel and inhibit early stages of virus replication. This class of drugs is no longer recommended against seasonal human influenza as these viruses display a high degree of adamantane resistance. In addition, adamantane resistance is increasing in AIVs globally, including subtypes of major concern, such as
PA-X (

Limitations and usage considerations of this molecular inventory
As stated in the methodology, this inventory was not intended as an exhaustive list, so several caveats and limitations need to be considered when utilizing these tables. First, the inventory only includes experimentally validated (in vitro or in vivo) markers/mutations. As stated previously, several studies utilize in silico approaches to predict molecular markers of viral adaptation to humans [22][23][24][25][26]. This work is extremely valuable and findings from these computational studies guide future research. Indeed, machine learning approaches have helped to identify novel markers and amino acid positions in human and avian influenza strains associated with pandemic potential [184,185]. Further work on these novel mutations should be conducted in vitro and in vivo. Second, as stated in the text, changes in biological characteristics for specific point mutations and motifs are sometimes only associated with specific viral subtypes or hosts, so caution should be exercised when generalizing findings to novel strains or hosts. Finally, consideration must be given to the viral sequences themselves. Influenza viruses, like many other RNA viruses, replicate as a quasispecies, allowing for rapid changes in diversity and viral adaptation in response to environmental pressures [186]. The majority of molecular markers/motifs described in these studies come from consensus sequences, which may not adequately reflect the entire population of viruses in a given sample. With the continual advancement of Next Generation Sequencing (NGS) and bioinformatics, we are better able to understand how minority variants could contribute to pandemic potential of influenza strains. Indeed, recent work by Welkers et al. using NGS on human respiratory A/H5N1 samples identifies multiple single amino acid variants in all three polymerase subunits. In vitro analysis of these markers shows substantial increases in polymerase activity [187].
Therefore, we must move towards consideration of the influenza viral quasispecies as a whole when considering the dynamic evolutionary and adaptive pathways/processes and the meaning and predictive power for zoonotic and/or pandemic potential.
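The difference between a consensus call and a minority variant can be made concrete with a short Python sketch. It takes the amino acids observed at one position across sequencing reads, computes their frequencies, and reports residues other than the consensus that exceed a frequency threshold. The function names and the default threshold of 1% are assumptions chosen for illustration; they are not taken from the studies cited here, and real pipelines would also account for read quality and sequencing error.

```python
from collections import Counter


def variant_frequencies(residues: list[str]) -> dict[str, float]:
    """Return the fraction of reads supporting each amino acid at one position."""
    counts = Counter(residues)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no observations at this position")
    return {aa: n / total for aa, n in counts.items()}


def minority_variants(residues: list[str], min_freq: float = 0.01) -> dict[str, float]:
    """Return non-consensus residues with frequency >= min_freq.

    `min_freq` is an illustrative cutoff, not a validated threshold.
    """
    freqs = variant_frequencies(residues)
    consensus = max(freqs, key=freqs.get)  # most frequent residue
    return {
        aa: freq
        for aa, freq in freqs.items()
        if aa != consensus and freq >= min_freq
    }
```

For a position covered by 95 reads encoding E and 5 encoding K, the consensus sequence would show only E, while `minority_variants` would also report K at a frequency of 0.05.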

Discussion/conclusions
Since the discovery of the Goose/Guangdong-lineage A/H5 viruses in China in 1996, HPAIV of the A/H5N1 subtype have spread globally, causing numerous outbreaks in wild birds and poultry. Overall, 861 confirmed human A/H5N1 infections have been reported with a case fatality rate of 52.8% [188]. In addition, newly emerged and emerging AIVs with zoonotic potential continue to appear, including subtypes A/H7N9, A/H7N4, A/H5Nx, A/H9N2, A/H10N7, and A/H10N8 [189][190][191][192][193]. As of June 2019, there have been 1,568 human A/H7N9 cases reported with 616 deaths (CFR: 39%) [194]. There have been 23 human cases of A/H5N6 infection reported from 10 different provinces across mainland China, of which 15 were fatal (CFR 65.2%) [195]. A single human non-fatal infection with an A/H7N4 subtype virus occurred in an elderly woman in Jiangsu, China in December 2017 [192,196]. This virus is antigenically distinct from formerly circulating A/H7 strains, and, concerningly, appears to be spreading across Southeast Asia, continually reassorting with other viruses in the region [197,198]. To date, none of the novel A/H7N4 viruses contains known amino acid mutations that confer adaptation of AIV to humans (e.g., PB2 627/701 or HA 186/226/228) or antiviral resistance. However, A/H7N4 isolated from Cambodia does contain the M gene amino acid mutations N30D and T215A that increase pathogenicity of A/H5N1 virus in mice [198]. Due to the antigenic differences between the A/H7N4 viruses and other H7 lineages, including the A/Anhui/1/2013-like A/H7N9 lineage, the continual spread, and risk for human infection, this newly detected A/H7N4 lineage is now in preparation as a candidate vaccine virus for pandemic preparedness [199]. To date, no avian origin influenza viruses can transmit efficiently from human-to-human via aerosols.
Given the ongoing spillover of avian influenza viruses into domestic poultry, as well as the risk of human infection and potential for adaptation, continual, vigilant surveillance and risk assessment is vital to combat endemic and emerging AIVs. This inventory provides a list of the currently known, experimentally verified mutations/molecular markers affecting host adaptation in AIVs and can be utilized for molecular characterization and risk assessment of novel AIV strains.
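The case fatality rates quoted above follow directly from the reported counts, as deaths divided by confirmed cases. The short Python sketch below reproduces the arithmetic for the two subtypes where both counts are given in the text (A/H7N9 and A/H5N6); the function name `case_fatality_rate` is hypothetical.

```python
def case_fatality_rate(deaths: int, cases: int) -> float:
    """Return the case fatality rate as a percentage of confirmed cases."""
    if cases <= 0:
        raise ValueError("cases must be a positive integer")
    if not 0 <= deaths <= cases:
        raise ValueError("deaths must be between 0 and cases")
    return 100.0 * deaths / cases


# Counts as reported in the text above.
h7n9_cfr = round(case_fatality_rate(616, 1568), 1)  # A/H7N9
h5n6_cfr = round(case_fatality_rate(15, 23), 1)     # A/H5N6
```

These values are computed from confirmed cases only, so they describe reported infections rather than all infections that may have occurred.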