1 Introduction

The discovery of sialic acid, which unfolded from the mid-1930s to the mid-1980s, is one of the great discoveries in the medical sciences. In 1936, the Swedish chemist Blix isolated a crystalline disaccharide compound from a boiling-water solution of bovine submaxillary (salivary gland) mucin and separated its polyhydroxy acid part; in 1952 he proposed that this acid be internationally called “sialic acid (Sia)” (from the Greek root sialos, saliva) [1]. The German biochemist Klenk purified a brain glycolipid from ganglion cells (neurons in brain gray matter) and called it “ganglioside”; in 1941 he named an acidic crystalline product of methanolysis, which cleaves the glycosidic linkages of the ganglioside, “neuraminic acid (Neu)” [2]. In 1951, the German biochemist Gottschalk found a water-soluble, nitrogen-containing compound that was released from ovomucin or human urinary mucoprotein after incubation with Vibrio cholerae or with influenza A or B virus and reported it to be “N-substituted isoglucosamine” [3]. The pathogen enzyme that releases this monosaccharide had been named receptor-destroying enzyme (RDE), according to its activity, by Burnet and Stone in 1947. Later, the name “sialidase” was proposed by Heimer and Meyer in 1956 [4], and the name “neuraminidase (NA)” was proposed by Blix, Gottschalk, and Klenk in 1957 [5]. Finally, it became apparent that the characteristic building block shared by Sia, N-substituted isoglucosamine, and other compounds found in the 1950s, such as hemataminic acid from a hematoside glycolipid of equine blood stroma in 1951 [6] and lactaminic acid from cow colostrum in 1954 [7], is Neu, that is, 5-amino-3,5-dideoxy-d-glycero-d-galacto-non-2-ulosonic acid (a 9-carbon α-keto acid, C9H17NO8). In 1986, 2-keto-3-deoxy-d-glycero-d-galacto-nononic acid (C9H16O9), which is not Neu but a deaminated Neu, was isolated from rainbow trout egg polysialoglycoprotein and named “ketodeoxynonulosonic acid (KDN)” [8].
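As a quick arithmetic check on the two molecular formulas given above, the following minimal sketch computes average molecular masses from elemental composition. It is an illustration only and not part of the cited work: the helper names `parse_formula` and `average_mass` are hypothetical, and the atomic weights are standard values rounded to three decimals.

```python
import re

# Standard atomic weights, rounded to three decimals (assumed for illustration).
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def parse_formula(formula):
    """Return {element: count} for a simple formula such as 'C9H17NO8'."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        # An element without a trailing number occurs once.
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def average_mass(formula):
    """Sum atomic weights over the parsed elemental composition."""
    return sum(ATOMIC_WEIGHT[el] * n for el, n in parse_formula(formula).items())


NEU = "C9H17NO8"  # neuraminic acid, formula as given in the text
KDN = "C9H16O9"   # deaminated Neu, formula as given in the text

for name, formula in (("Neu", NEU), ("KDN", KDN)):
    print(f"{name}: {formula} -> {average_mass(formula):.3f} g/mol")

# Replacing the C5 amino group (NH2) by a hydroxyl group (OH) removes one N
# and one H and adds one O, which accounts for the small mass difference.
print(f"KDN - Neu = {average_mass(KDN) - average_mass(NEU):.3f} g/mol")
```

The same helper can be applied to any other formula built from C, H, N, and O; elements outside the small weight table would need to be added first.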

Currently, Sia (Fig. 1a) is used as the generic name for a family of acidic sugars with a 9-carbon backbone that are derivatives of Neu and KDN. As shown in Fig. 1b, modifications at the C5 position of the carbon backbone give four core molecules: Neu (C5-NH2; not found in nature but formed as a side effect of methanolysis, which provokes de-N-acylation of Sia), Neu5Ac (C5-N-acetyl, the most common derivative), Neu5Gc (C5-N-glycolyl), and KDN (C5-hydroxyl). These four core molecules can carry further substituents as described in the legend of Fig. 1b [9]. In nature, Sias exist predominantly as sialylglycoconjugates on N- and O-linked glycoproteins and on gangliosides in the cell plasma membrane (Fig. 1c) and occasionally on glycosylphosphatidylinositol (GPI)-anchored proteins and secreted glycoproteins. Negatively charged Sia moieties are typically found at the outermost ends of glycoconjugates, linked through (1) α2-3 or α2-6 to Gal or GalNAc in glycoproteins and gangliosides [10,11,12], (2) α2-6 to GlcNAc or Glc in glycoproteins and gangliosides [13, 14], or (3) α2-8 to a second Sia in glycoproteins and gangliosides [15] and, less often, through (4) α2-4 to Gal or GlcNAc in glycoproteins [16, 17], (5) α2-9 to a second Sia in glycoproteins [18], or (6) a Neu5Gc oligomer ((→5-O-glycolyl-Neu5Gcα2→)n, n = 4 to more than 40 Neu5Gc residues), found so far only in jelly coat glycoproteins of sea urchin eggs [19]. Figure 1 shows only the well-known sialylglycoconjugates recognized by viruses.

Fig. 1

Sialoglycan structures of virus receptors on host cell membranes. The complexity of sialoglycan structures can be divided into four levels: (1) Sia core and core modifications, (2) Sia linkages, (3) branches, and (4) classes [20]. (a) The Sia structure is a 9-carbon α-keto carboxylic acid skeleton with different substituents R4, R5, R7, R8, and R9 as indicated. (b) The parent molecule, Neu, carries an amino group as the R5 substituent at C5. N-Acetylation of the 5-amino group gives 5-N-acetyl-Neu (Neu5Ac), and hydroxylation of the 5-N-acetyl group gives 5-N-glycolyl-Neu (Neu5Gc). Deamination of the amino group at C5 of Neu gives 2-keto-3-deoxynononic acid (KDN). These four cores, which differ at the C5 position, can carry one or more O-substituents on their hydroxyl groups: acetyl group(s) at C4, C7, C8, and/or C9 and, less often, a lactyl group at C9, a sulfate group at C8, or a methyl group at C8. In addition, the C1 carboxylate group can react with a hydroxyl group or with the C5 amino group to form a lactone or a lactam, respectively, and free Sias in unsaturated and anhydro forms can also be found. (c) The C2 of Sia can form several types of glycosidic linkages with the penultimate sugar. Only three major linkages, Siaα2-3Gal, Siaα2-6Gal, and Siaα2-8Sia, are shown here. (d) Sialoglycans can be linear or branched (antennae). (e) Based on the biomolecule that carries the sugars, surface sialoglycans used for virus infection are classified into sialoglycoproteins and gangliosides (glycosphingolipids containing one or more Sia residues). Sialoglycoproteins can be further classified, on the basis of their covalent linkage to a protein through an Asn or a Ser/Thr, into N-glycans or O-glycans, respectively. For O-glycans, the major core 1–4 subtypes based on the second sugar(s) attached to GalNAc-Ser/Thr are shown.
Abbreviations: Cer ceramide (sphingosine-fatty acid complex), GM monosialoganglioside, GD disialoganglioside, GT trisialoganglioside, GQ tetrasialoganglioside, GM3 Siaα2-3Galβ1-4Glcβ1-1′Cer, GD3 Siaα2-8Siaα2-3Galβ1-4Glcβ1-1′Cer, ganglio-series gangliosides containing GalNAcβ1-4Galβ1-4Glcβ1-1′Cer, GM2 GalNAcβ1-4(Siaα2-3)Galβ1-4Glcβ1-1′Cer, GM1 (GM1a) Galβ1-3GalNAcβ1-4(Siaα2-3)Galβ1-4Glcβ1-1′Cer, GM1b Siaα2-3Galβ1-3GalNAcβ1-4Galβ1-4Glcβ1-1′Cer, GD1a Siaα2-3Galβ1-3GalNAcβ1-4(Siaα2-3)Galβ1-4Glcβ1-1′Cer, GT1a Siaα2-8Siaα2-3Galβ1-3GalNAcβ1-4(Siaα2-3)Galβ1-4Glcβ1-1′Cer, GD2 GalNAcβ1-4(Siaα2-8Siaα2-3)Galβ1-4Glcβ1-1′Cer, GD1b Galβ1-3GalNAcβ1-4(Siaα2-8Siaα2-3)Galβ1-4Glcβ1-1′Cer, GT1b Siaα2-3Galβ1-3GalNAcβ1-4(Siaα2-8Siaα2-3)Galβ1-4Glcβ1-1′Cer, GQ1b Siaα2-8Siaα2-3Galβ1-3GalNAcβ1-4(Siaα2-8Siaα2-3)Galβ1-4Glcβ1-1′Cer, neolacto-series gangliosides containing Galβ1-4GlcNAcβ1-3Galβ1-4Glcβ1-1′Cer, sialylparagloboside Siaα2-3/2-6Galβ1-4GlcNAcβ1-3Galβ1-4Glcβ1-1′Cer, i-active ganglioside Siaα2-3Galβ1-4GlcNAcβ1-3Galβ1-4GlcNAcβ1-3Galβ1-4Glcβ1-1′Cer, I-active ganglioside Siaα2-3Galβ1-4GlcNAcβ1-3(Galα1-3Galβ1-4GlcNAcβ1-6)Galβ1-4GlcNAcβ1-3Galβ1-4Glcβ1-1′Cer, terminal sialylglycoconjugates, sialyl-LacdiNAc, SLDN (Siaα2-6GalNAcβ1-4GlcNAcβ1-), Sda Siaα2-3(GalNAcβ1-4)Galβ1-4GlcNAcβ1-, sialyllactosamine sialyl-LacNAc, SLN (Siaα2-3/2-6Galβ1-4GlcNAcβ1-), (di)Sialyl-Lex/a sialyl-Lewisa (Siaα2-3Galβ1-3(Fucα1-4)GlcNAcβ1-), sialyl-Lewisx (Siaα2-3Galβ1-4(Fucα1-3)GlcNAcβ1-), or disialyl-Lewisa (Siaα2-3Galβ1-3(Fucα1-4)(Siaα2-6)GlcNAcβ1-), sialyllactose SL (Siaα2-3/2-6Galβ1-4Glcβ1-)
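The ganglioside naming convention in the abbreviation list (M, D, T, and Q for one, two, three, and four Sia residues) can be checked mechanically against the written structures. The following is a minimal sketch under stated assumptions: the structure strings are copied from the list above, while the function names `count_sia`, `expected_sia`, and `sia_linkages` are hypothetical helpers introduced only for this illustration.

```python
import re

# Structures copied from the abbreviation list (Cer = ceramide).
GANGLIOSIDES = {
    "GM3": "Siaα2-3Galβ1-4Glcβ1-1′Cer",
    "GD3": "Siaα2-8Siaα2-3Galβ1-4Glcβ1-1′Cer",
    "GM1": "Galβ1-3GalNAcβ1-4(Siaα2-3)Galβ1-4Glcβ1-1′Cer",
    "GD1a": "Siaα2-3Galβ1-3GalNAcβ1-4(Siaα2-3)Galβ1-4Glcβ1-1′Cer",
    "GT1b": "Siaα2-3Galβ1-3GalNAcβ1-4(Siaα2-8Siaα2-3)Galβ1-4Glcβ1-1′Cer",
    "GQ1b": "Siaα2-8Siaα2-3Galβ1-3GalNAcβ1-4(Siaα2-8Siaα2-3)Galβ1-4Glcβ1-1′Cer",
}

# Second letter of the name encodes the number of Sia residues.
LETTER_TO_SIA_COUNT = {"M": 1, "D": 2, "T": 3, "Q": 4}


def count_sia(structure):
    """Count Sia residues in a linear-notation glycan string."""
    return structure.count("Sia")


def expected_sia(name):
    """Number of Sia residues implied by the ganglioside name (e.g. GD3 -> 2)."""
    return LETTER_TO_SIA_COUNT[name[1]]


def sia_linkages(structure):
    """List the linkage of each Sia residue, e.g. ['2-8', '2-3']."""
    return re.findall(r"Siaα(2-\d)", structure)


for name, structure in GANGLIOSIDES.items():
    consistent = count_sia(structure) == expected_sia(name)
    print(name, count_sia(structure), sia_linkages(structure), consistent)
```

Plain substring counting works here only because "Sia" does not occur inside any other residue name used in the list (Gal, GalNAc, Glc, GlcNAc, Fuc, Cer); a general glycan parser would need a proper grammar.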

These terminal sialyl linkage antennae on glycoconjugates are important recognition determinants for endogenous sialyl glycan-binding proteins (lectins) that mediate various biological processes, such as the cell signaling that is pivotal for the development and maintenance of life functions, and for lectins of various exogenous agents, such as bacteria, toxins, protozoa, and viruses [20], that mediate symbiotic or pathogenic processes. The parasitic relationship between hosts and pathogens drives a coevolutionary arms race that has resulted in the diversity of Sias. For example, the human lineage has inactivated the gene encoding CMP-Neu5Ac hydroxylase, which converts CMP-Neu5Ac into CMP-Neu5Gc in other mammalian cells. In normal adult human cells, Neu5Gc, which derives from dietary intake, accounts for less than 0.1% of total Sias [21], and the loss of Neu5Gc production appears to render human cells resistant to infection by some pathogens, such as enterotoxigenic Escherichia coli K99 [22] and the malaria parasite Plasmodium reichenowi [23], but not to infection by rapidly evolving pathogens such as influenza A viruses, which can switch their binding specificity, including avidity, to specific Sia species on the host glycan surface by changing a few amino acids in the binding pockets of hemagglutinin (HA) lectins [24]. In contrast to Neu5Gc, the other major Sia, Neu5Ac, is critical in several endogenous functions [20], so evading pathogens by evolutionary suppression of its expression seems to be unfavorable. Abolishing Neu5Ac production in mice by inactivation of UDP-GlcNAc 2-epimerase (GNE), a key enzyme of Neu5Ac biosynthesis, results in early embryonic lethality [25]. Mutations of the GNE gene in humans can impair sialyl (Neu5Ac) O-glycan formation in sarcolemmal glycoproteins, a mechanism that explains a muscular disease called distal myopathy with rimmed vacuoles [26].

Several viruses cause important diseases in humans and livestock that affect health and impose large social and economic burdens, whereas some viruses are useful for therapeutic applications. Being obligate intracellular parasites, viruses must invade the host cell and take over its machinery for survival and multiplication. To complete their life cycle, viruses must pass through three important stages, (1) entry, (2) gene expression and genome replication, and (3) exit, as shown in Fig. 2. The entry stage is not only critical for survival and continuation of the life cycle but also contributes to virus transmission and pathogenesis. This stage requires specific interactions between viral capsid proteins of non-enveloped viruses, or viral spike glycoproteins of enveloped viruses, and specific receptor molecules on the host cell surface (viral attachment). This chapter focuses on Sia-binding viruses. The characteristics of Sia–viral lectin interactions are described, and the design and development of viral detection methods, antiviral drugs, and virus-based therapies are discussed.

Fig. 2

Simplified general scheme of the life cycle of animal and human viruses. The viral life cycle can be divided into three stages: entry, gene expression and genome replication, and exit. (I) The entry stage involves attachment of a virus particle to host cell surface receptors, penetration (for enveloped viruses, by fusion of the viral envelope with the host plasma membrane or with the host endosomal membrane; for non-enveloped viruses, by permeabilization of the plasma membrane (still under debate [93]), by permeabilization or lysis of the endosomal membrane, or by an endoplasmic reticulum (ER)-associated degradation (ERAD) pathway), and uncoating that releases the viral nucleocapsid into the cytosol. (II) The next stage is viral gene expression and genome replication. DNA viruses, except for poxviruses, which have a large genome encoding their own RNA and DNA polymerases, transcribe and replicate their genomes within the nucleus by using host polymerases. In contrast, RNA viruses, which have their own enzymes for transcription and replication, transcribe and replicate their genomes within the cytoplasm. The exceptions are orthomyxoviruses, which need the host 5′ cap for viral mRNA synthesis, and hepatitis D virus (HDV), which lacks enzymes for both transcription and replication; these viruses transcribe and replicate their genomes within the nucleus. (III) Finally, the exit stage occurs after viral components are assembled to form progeny viruses, which are ultimately released from the cell. The envelope of enveloped viruses is acquired by budding, either through an internal compartment followed by exit via a secretory pathway or through the plasma membrane, giving direct exit. For non-enveloped viruses, the primary mode of exit is cell lysis; in nonlytic viral spread, non-enveloped viruses are released from cells via extracellular vesicles [193, 194].

2 Determination of Sia-Binding Specificities for Viruses

The ability of a virus to bind to Sias can be determined via (1) a simple hemagglutination test by incubation of the virus and Sia-rich erythrocytes and observation of erythrocyte agglutination (hemagglutination) by the naked eye. To confirm that Sias or O-acetyl Sias are required for virus binding, (2) a virus-Sia-binding inhibition test can be performed by (2–1) the use of erythrocytes pretreated with an NA enzyme from a bacterium or a virus, which cleaves terminal Sia residues, or the use of Sia-deficient cell lines such as CHO-Lec2 cells. Specific binding of a virus to O-acetyl Sias can be checked by the use of erythrocytes pretreated with O-acetyl esterase from a virus such as bovine coronavirus or influenza C virus, which cleaves O-acetyl groups on Sia residues. (2–2) Hemagglutination inhibition can be performed by using a Sia-containing compound, such as fetuin, as a competitor. To reveal roles of Sia type, linkage type, sequence, compositions, chain lengths, and architectures in binding specificity of viruses/viral lectins, (3) direct binding of viruses/viral lectins to a variety of defined structures of sialyl glycopolymers coated on microplates or printed on microarray slides can be determined by quantitative detection. The detection can be performed by several methods such as measurement of the intensity of fluorescence products from viral NA enzyme activity [27] and measurement of fluorescently labeled antiviral/antiviral lectin antibodies or fluorescently labeled viruses/viral lectins [28] that correlate with the number of glycan-bound viruses/viral lectins. To reveal viral lectin–sialyl glycan interactions and conformational diversity of sialyl ligands bound to the viral lectins, (4) X-ray crystallography is now the most widely used technique for determining the atomic arrangement in a three-dimensional space of a co-crystal obtained after evaporation of viral lectin and sialyl glycan in an appropriate solvent.
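The readout of tests (1) and (2–1) above can be made concrete with a small sketch. It assumes, as is conventional, that the hemagglutination (HA) titer is the reciprocal of the highest dilution in a serial dilution series that still shows agglutination. The function names, the twofold series starting at 1:2, and the 4-fold reduction threshold used to call binding Sia-dependent are all hypothetical choices for illustration, not values taken from the cited studies.

```python
def ha_titer(wells, first_dilution=2, fold=2):
    """Return the HA titer from a serial dilution series.

    wells: booleans in order of increasing dilution (True = agglutination).
    The titer is the reciprocal of the last dilution in the unbroken run of
    agglutinated wells from the start of the series; 0 means no agglutination
    at the first dilution tested.
    """
    titer = 0
    dilution = first_dilution
    for agglutinated in wells:
        if not agglutinated:
            break
        titer = dilution
        dilution *= fold
    return titer


def sia_dependent(titer_untreated, titer_treated, min_fold_drop=4):
    """Call binding Sia-dependent if sialidase pretreatment of erythrocytes
    lowers the titer by at least min_fold_drop (an assumed threshold).
    A treated titer of 0 is counted as 1, i.e. below the first dilution.
    """
    return titer_untreated >= min_fold_drop * max(titer_treated, 1)


# Hypothetical plate rows: dilutions 1:2, 1:4, 1:8, 1:16, 1:32, 1:64.
untreated = [True, True, True, True, True, False]
sialidase_treated = [True, False, False, False, False, False]

t_untreated = ha_titer(untreated)
t_treated = ha_titer(sialidase_treated)
print(t_untreated, t_treated, sia_dependent(t_untreated, t_treated))
```

The same comparison applies to erythrocytes pretreated with an O-acetyl esterase when specific binding to O-acetyl Sias is being tested; only the pretreatment differs, not the readout.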

3 Sialylglycoconjugates as Host Receptor Determinants of Viral Spike Glycoproteins for Enveloped Viral Infection

For enveloped viruses (Table 1), only single-stranded (ss) RNA viruses have so far been shown to encode spike glycoproteins, which function as lectins that read codes of sialyl sugar chains during infection. These ssRNA enveloped viruses belong to the families of Orthomyxoviridae, Paramyxoviridae, and Coronaviridae.

Table 1 List of Sia-binding enveloped viruses, host range, viral lectin, receptor specificity and tissue/cellular tropism

3.1 Orthomyxoviridae

The role of the hemagglutination activity of tick-borne Thogoto virus remains to be determined [29]. Only five genera in the family Orthomyxoviridae have been confirmed to internalize into the host cell via endocytosis upon binding of their glycoproteins to the target Sia receptors. Each genus contains only one species. Infectious salmon anemia (ISA) virus (ISAV) in the genus Isavirus causes a systemic and lethal disease in farmed and wild salmonids, whereas influenza A, B, C, and D viruses in the genera Influenzavirus A, B, C, and D, respectively, cause influenza, a contagious respiratory disease, with differences in severity and host range as shown in Table 1. ISAV, which encodes hemagglutinin-esterase (HE) and fusion (F) glycoproteins separately, initiates viral attachment to terminal 4-O-acetyl-sialyl glycans via HE-F complexes. This attachment mediates dissociation of HE and F, and the virus is endocytosed into the host cell [30]. Influenza C and D viruses encode hemagglutinin-esterase-fusion (HEF) glycoproteins responsible for receptor binding, receptor destroying, and membrane fusion [31]. Influenza A and B viruses have hemagglutinin (HA) glycoproteins mediating receptor binding and membrane fusion and have separate neuraminidase (NA) glycoproteins possessing a receptor-destroying function [32, 33]. While HEF glycoproteins of influenza C and D viruses attach to 9-O-acetyl-Neu5Ac-carrying sugar chains found in the respiratory tract of animals [31] as a receptor determinant for infection in cattle and pigs (only C virus has been detected in humans), HAs of influenza A (Fig. 3a) and B viruses recognize α2-6Neu5Ac-carrying sugar chains and cause epidemics in humans as shown in Table 1 [24, 33, 34]. However, only influenza A viruses have a wide range of animal hosts and a variety of subtypes, owing to a high rate of genetic change from both reassortment of their genome segments and point mutations, which allows them to adapt quickly to a new environment [35, 36].
H1–H16 and N1–N9 subtypes have been reported to use specific Sia receptors as key determinants for host/tissue tropism and specificity. Avian influenza A viruses primarily recognize α2-3Sia, which is mainly found on bird intestinal epithelial cells [37] and embryonated chicken egg chorioallantoic cells [38]. Human-adapted (pandemic and seasonal) influenza A viruses can be detected in nasal or nasopharyngeal samples, throat swabs, and tracheal and bronchial aspirates [39,40,41], in which α2-6Neu5Ac is dominant [42]. Infection of humans with a nonhuman influenza A virus that has crossed the species barrier, with or without an intermediate host, can lead to a pandemic. So far four pandemics have been reported. While the H1N1/1918 pandemic (pdm) virus has an unknown origin, biochemical and evolutionary studies have indicated that the HA genes of the H2N2/1957, H3N2/1968, and H1N1/2009 pandemic viruses were derived from nonhuman viruses, providing major viral antigens that were new to humans [24, 32]. In the transitions from nonhuman avian (av)H2 to pdmH2/1957 HA and from avH3 to pdmH3/1968 HA, the binding preference changed from α2-3Sia to α2-6Sia receptors, allowing efficient transmission among humans; α2-6Sia receptors are predominant in most regions of the human respiratory system, such as the tracheal epithelium [42], except for the alveolar region, in which α2-3Neu5Ac receptors are dominant [24]. In the transition from classical swH1 to pdmH1/2009 HA, binding to Neu5Gc, which is found in the porcine respiratory tract [43] but is rare in healthy human adult hosts [21], decreased. While nonhuman-to-pandemic HA lectins acquire a Sia type/linkage shift in binding specificity, pandemic-to-seasonal, long-term human-adapted HAs drift in binding preference toward long α2-6Neu5Ac-polyLacNAc glycans.
The presence of up to 10 LacNAc units in human respiratory sialyl N-glycans [44] but only 2 LacNAc units in a Neu5Ac(LacNAc)2 structure responsible for 0.15% of total human alveolar N-glycans might be another factor explaining why seasonal viruses are rarely found in human alveoli [24]. While little is known about ISAV including its potential to spark a pandemic, extensive studies on influenza A viruses [45,46,47] have indicated the continual threat of nonhuman influenza A viruses, especially avian and swine influenza viruses, to human infections highlighting the need for surveillance of changes in receptor binding HA amino acids and receptor binding preference to the α2-6Neu5Ac human-type receptor.

Fig. 3

Binding of viral lectins to sialoglycans on the outer surface of host cell membranes. A schematic cross section through each virus particle with a diameter in nm is shown to locate viral lectins. The viral lectins of human influenza H3 HA trimer (a), mumps HN tetramer (b), porcine torovirus HE dimer (c), CVA24v VP1 monomer (d), reovirus T1L σ1 trimer (e), reovirus T3D σ1 trimer (f), HAdV-D37 fiber trimer (g), AAV1 VP3 monomer (h), and SV40 VP1 pentamer (i) are shown as surface diagrams using PYMOL, with monomers colored in salmon, sky blue, lime, light orange, and warm pink. The viral lectins are extended from the viral envelope (green) or viral capsid (pink) directly or via the pentameric λ2 protein for reoviruses and via the pentameric penton base for adenoviruses. The head/knob of each monomer of each viral lectin (except for T3D σ1 containing a Sia-binding site in its body) interacts with a sialoglycan on a glycoprotein/glycolipid (a brown dash), which is anchored to the host cell membrane. Details of interactions between viral amino acids in the receptor-binding site (RBS) and a sialoglycan receptor are zoomed in on one monomer. Sugar residues are shown as sticks with red color for Sia, yellow for Gal and GalNAc, and blue for GlcNAc and Glc according to sugar symbol colors in Fig. 1 (except for sugar residues of α2-3-sialyllactose in a mumps HN pocket that are in red since each residue cannot be colored independently). Only amino acids in RBS that are found to have direct contact with sugar residues are shown in lines with colors according to elements and are summarized on the upper part of each viral lectin-sialoglycan complex. Bond length in angstroms between two bonded atoms is on the dashed lines. PDB accession no. of each complex used for analysis is indicated in red text and details are provided in Table 3. 
Abbreviations in each schematic of a virus particle are M for matrix protein; NEP for nuclear export protein; NA for neuraminidase; HA for hemagglutinin; HN for hemagglutinin-neuraminidase; F for fusion protein; E and M for viruses in the family Coronaviridae for envelope protein and membrane protein, respectively; S for spike protein; and HE for hemagglutinin-esterase. Each monomer of PToV HE consists of 3 domains, MP, E, and R, that stand for a membrane-proximal domain, an esterase domain, and a receptor-binding domain, respectively. Residues are abbreviated as “res.” VP and CP stand for viral protein and capsid protein, respectively

3.2 Paramyxoviridae

In the family Paramyxoviridae, human parainfluenza virus type 1 (hPIV-1), type 2 (hPIV-2), and type 3 (hPIV-3) [48,49,50], mumps virus [51], Sendai virus [52,53,54], and Newcastle disease virus (NDV) [52, 55, 56] have been shown to be Sia-binding viruses. Unlike the viruses in Orthomyxoviridae, which fuse their envelope with the host endosomal membrane, these viruses have a non-segmented (−)ssRNA genome that is released into the host cytosol via direct fusion at the host plasma membrane [57]. These viruses have a single molecule that both binds and destroys the receptor (hemagglutinin-neuraminidase, HN) and that triggers the fusion activity of F proteins upon binding to sialic acid-containing receptors. HNs of hPIV-1 and hPIV-3 have been demonstrated to carry two Sia-binding sites. Results obtained by computer modeling analysis and glycan array assays suggested that binding-catalytic site I of hPIV-1 prefers α2-3-sialyl-Lex and that binding site II of hPIV-1 prefers a sulfated α2-3-sialyl-Lex and α2-8Sia [50], in agreement with the results of an earlier study showing that hPIV-1 preferentially binds to α2-3Neu5Ac linked to branched LacNAc [48]. Crystallographic studies on the hPIV-3 HN complexed with Neu5Ac revealed that site I, with both receptor binding and NA activities, is located on the globular head of HN, whereas site II, with receptor binding activity only, is near the hPIV-3 HN dimer interface and is thought to be involved in promotion of virus fusion [58]. Direct receptor binding assays showed that hPIV-3 can bind to α2-3Neu5Ac, α2-6Neu5Ac, and α2-3Neu5Gc receptors [48], suggesting flexibility of the binding pockets of hPIV-3 HN. Similar to hPIV-3, hPIV-2 was shown to bind and cleave both α2-3 and α2-6Neu5Ac ligands on a glycan array [49]. Despite these overlapping preferences, hPIV-1 (often) and hPIV-2 (less frequently) cause laryngotracheobronchitis (croup) in children, whereas hPIV-3 causes pneumonia and bronchiolitis in infants [48].
Studies on receptor binding and pathological observations have suggested that these viruses might target different cell types in the same and different regions of the human respiratory system. Thus, further analysis of virus cell type binding specificity and sialoglycoconjugates in the human respiratory system for each age group is needed.

Unlike the targets of the above-described respiratory viruses, mumps virus primarily infects the parotid and other salivary glands and sometimes spreads to other tissues and organs including the pancreas, testis, ovary, mammary glands, and kidney. Co-crystal structural analysis (Fig. 3b) and glycan-binding assays revealed that mumps virus HN proteins prefer α2-3Sia linked to unbranched sugar chains [51]. Uncovering the glycan expression profile of infected tissues should expand our understanding of this virus tropism and control.

Because of its similarities to hPIV-1 in sequence, structure, and antigenicity, Sendai virus is also called murine parainfluenza virus; it primarily infects the respiratory tract of mice and other laboratory animals, including rats, hamsters, and guinea pigs, and occasionally pigs. Since it can agglutinate red blood cells, it was previously known as hemagglutinating virus of Japan (HVJ). It has been shown to bind to linear and branched sialosylpolylactosamine (sialyl-poly-LacNAc) sequences in erythroglycan II on erythrocyte membranes [53] and to gangliosides, including I-active and i-active gangliosides with terminal Siaα2-3Galβ1-4GlcNAc (Fig. 1), coated on asialoerythrocytes [52]. The resistance of V. cholerae sialidase-treated cells to Sendai virus infection can be fully reversed by re-sialylation of the cells with CMP-Sia and β-galactoside α2-3-sialyltransferase but not with β-galactoside α2-6-sialyltransferase. This indicated that NeuAcα2-3Galβ1-3GalNAc, but not NeuAcα2-6Galβ1-4GlcNAc, is a receptor determinant of Sendai virus infection [54]. Further research to clarify the binding preferences of Sendai virus and its interactions with sialyl glycan receptors should lead to the generation of recombinant Sendai viruses that are specific to cell types and that improve the efficacy and specificity of virus entry, and thus the safety, of therapeutic approaches such as virus-based vaccines against human respiratory viruses [59], virus vectors to reprogram cell genomes for regenerative medicine, and oncolytic virotherapy against cancer [60].

NDV causes a minor zoonosis. Exposure to a large amount of the virus is necessary for human infection, which typically causes mild conjunctivitis and/or influenza-like symptoms. NDV mainly infects domestic poultry and causes a range of diseases, from inapparent to severe respiratory, gastrointestinal, or nervous system diseases, depending on the virus strain. Poultry infected with a neurovirulent strain develop pneumonitis followed by encephalitis, leading to severe economic effects on poultry production; hence, the disease is sometimes called avian pneumoencephalitis [61, 62]. Based on laboratory tests of their virulence (mean death time) in chicken embryos after allantoic inoculation, NDVs are divided into three groups: velogenic (most virulent), mesogenic (mid-virulent), and lentogenic (non-virulent). While velogenic strains cause economic loss, lentogenic strains are used for vaccination of chickens and for oncolytic virotherapy [62]. Understanding NDV attachment to its receptors and the viral entry mechanism is critical for controlling virus infection in both directions: inhibition of chicken infection by velogenic strains and enhancement of entry of a lentogenic strain into tumor cells. An early study showed that NDV prefers α2-3Sia with either Neu5Ac or Neu5Gc-paragloboside gangliosides containing linear lacto-series type 2 oligosaccharide and ganglioside GM3 with either NeuAc or NeuGc [52]. Recent advances in the Sia-protein co-crystal technique [55] and the hemadsorption/fusion deficiency of dimer interface mutants [63] have revealed that NDV HN carries two Sia-binding active sites. Site I mediates receptor binding and NA activity and triggers HN interaction with the F protein, while site II, located at the dimer interface, has only receptor binding activity and has been proposed to maintain virus-host Sia attachment during fusion.
Molecular modeling of NDV HN with small molecules and analysis of the binding free energy of interactions [56] indicated that site I prefers α2-3-sialyllactose. Site II can interact with α2-3-sialyllactose and, similarly, with α2-6-sialyllactose, but the interaction is much weaker than that of site I with α2-3-sialyllactose. Interestingly, binding of NDV to red blood cells was inhibited by α2-3-sialyllactose but was increased in the presence of zanamivir, an NA inhibitor that binds preferentially to site I [56]. This raises important questions about whether NA activity is required for NDV HN binding and/or fusion and whether site II works independently or must be activated by binding of site I to some compound such as zanamivir but not α2-3-sialyllactose. These questions should be answered in order to efficiently control the functions of these two sites.

3.3 Coronaviridae

In the family Coronaviridae, non-segmented (+)ssRNA torovirus [64], some coronaviruses in the genus Betacoronavirus clade A group 2 [65,66,67] and clade C [68], and transmissible gastroenteritis coronavirus (TGEV) in the genus Alphacoronavirus [69] (TGEV will be discussed in the family Parvoviridae) have been reported to bind to Sia. These viruses use the spike (S) glycoprotein as a multifunctional molecule: its S1 subunit binds to a receptor, and its S2 subunit contains the fusion peptide (FP) that fuses viral and host membranes after activation by proteolytic cleavage. Fusion of these viruses occurs at the endosomal membrane, except that Middle East respiratory syndrome coronavirus (MERS-CoV) mainly fuses its viral membrane at the plasma membrane, since its S proteins can be activated both by the pH-dependent endosomal protease and by host secreted or surface proteases [70].

Toroviruses cause gastroenteritis in vertebrates, mainly cattle, pigs, horses, and humans, especially children. Based on comparative sequence analysis, toroviruses can be classified into four genotypes: (1) equine torovirus (Berne virus) (EqToV), (2) bovine torovirus (Breda virus) (BToV), (3) porcine torovirus (Markelo virus) (PToV), and (4) human torovirus (HToV) [71]. Little is known about the receptor binding specificity of torovirus S proteins. There is evidence that EqToV, which lacks an intact hemagglutinin-esterase (HE) gene, can agglutinate human, rabbit, and guinea-pig erythrocytes and that this hemagglutination is decreased in the presence of fetuin and gangliosides. This evidence suggested that glycoproteins/glycolipids may be receptors of torovirus S proteins [72].

Coronaviruses have a wide range of hosts and can cause a variety of illnesses of varying severity. Their infection is mainly associated with respiratory diseases, especially common colds, in humans and often leads to diarrhea in animals. Coronaviruses have sparked global concern about their pandemic potential since the zoonotic transmission to humans, most likely from bats, of severe acute respiratory syndrome coronavirus (SARS-CoV; not a Sia-binding virus) via civet cats as an intermediate host, which first emerged in Asia in 2002–2003, and of MERS-CoV, which causes a severe viral respiratory illness, via camels as an intermediate host, which first emerged in a Middle Eastern country in 2012 [73]. Only some coronaviruses use Sia as a receptor, assistant receptor, or attachment factor, as shown in Table 1. The S proteins of bovine coronavirus (BCoV) [74] and human coronavirus (HCoV) OC43 and HKU1 [75] in the genus Betacoronavirus clade (or lineage) A group 2 have been reported to bind to 9-O-acetylated Sia in the receptor-binding site. Typically, these betacoronaviruses also carry HE spike proteins, which bind to O-acetylated Sias as well.

Why are HE proteins present in some but not all toroviruses and coronaviruses? Results of evolutionary studies have suggested that the intact HE genes present in three genotypes of toroviruses (BToV, PToV, and HToV) and in Betacoronavirus clade A group 2, whose S proteins bind to O-acetylated sialic acids [75], were acquired via independent heterologous RNA recombination events from an as yet unknown source, possibly during a mixed infection with another virus, such as influenza C virus [76, 77]. The fact that variants with the HE gene have become circulating strains could be explained by the finding that the HE protein increases the efficiency of production of infectious virus [78], possibly by acting as a lectin that enhances viral binding and as an enzyme that destroys receptors by de-O-acetylation (esterase), which enhances the release of trapped virions from the host mucosa and of budding virions from infected cells. However, it remains to be clarified how S proteins and HE proteins act cooperatively to support virus infection and whether the virus can still infect cells if receptor binding at one site is inhibited. An understanding of the mechanism of initiation of infection should lead to the development of a potential strategy to combat virus infection.

HE proteins of viruses from different hosts have distinct receptor/substrate preferences. Esterases of porcine/bovine toroviruses and equine/bovine/human coronaviruses are specific for 5-N-acetyl-9-O-acetylneuraminic acid (Neu5,9Ac2) [64, 77]. An esterase-deficient PToV HE (receptor-binding domain) can bind to 4,9-di-O-acetyl-Neu5Ac- [64] as shown in Fig. 3c. Esterases of bovine toroviruses prefer the di-O-acetylated substrate (5-N-acetyl-7(8),9-di-O-acetylneuraminic acid) [77]. Bovine coronavirus HE also binds to 7,9-di-O-acetyl Sia [67]. Most rodent coronaviruses express sialate-4-O-acetylesterases, and the rat coronavirus HE protein binds to 4-O-acetyl Sia. The HE proteins of murine coronavirus DVIM (diarrhea virus of infant mice) cleave 9-O-acetylated Sias [64, 67, 77]. Some viral HE proteins may have more flexible pockets that accommodate a greater range of receptors/substrates. The HE proteins of human toroviruses were shown to have 74% sequence identity with those of bovine toroviruses [79], but there is a lack of information regarding their receptor/substrate preferences. These host-associated receptor-binding and substrate-cleaving specificities of HE proteins indicate that if a virus jumps to a different animal species, its HE proteins need to acquire mutations in order to bind/cleave the O-acetylated Sias expressed on the new host target.

MERS-CoV, in the genus Betacoronavirus clade C, attaches to host cells via the S1 subunit of its spike (S) protein, whose domains are designated S1A through S1D from the N-terminus: domain B (S1B) binds to dipeptidyl peptidase 4 (DPP4, also called CD26) as a primary receptor [80], and domain A (S1A) binds to α2-3Sia glycans as an attachment factor [68]. These bindings trigger infection in nonciliated bronchial epithelial cells and type II pneumocytes [73] that mainly have α2-3Sia receptors [81]. Reduction in MERS-CoV infection of sialidase-treated Calu-3 human airway cells indicated that the host Sia receptor is involved in determining MERS-CoV host/tissue tropism and transmission [68]. It remains unclear how this virus emerged to infect humans. However, MERS cases have continued to be reported [82], indicating the possibility of a pandemic in the future. This highlights the need for continuous surveillance of changes in MERS-CoV, including changes in its two receptor-binding preferences, and for finding a way to control MERS.

4 Sialylglycoconjugates as Host Receptor Determinants of Viral Capsid Proteins for Non-enveloped Viral Infection

Various non-enveloped viruses with either an RNA or DNA genome have been reported to encode lectins for binding to sialyl glycans, which mediate entry into the host cells (Table 2).

Table 2 List of Sia-binding non-enveloped viruses, host range, viral lectin, receptor specificity and tissue/cellular tropism

4.1 Picornaviridae

Several strains listed in Table 2 of small (+)ssRNA-containing viruses, pico-rna-viruses, in the family Picornaviridae, including equine rhinitis A virus (ERAV) in the genus Aphthovirus; enterovirus D68 (EV-D68), enterovirus 70 (EV70), and coxsackievirus A24 variant (CVA24v) in the genus Enterovirus; and encephalomyocarditis virus (EMCV) and low-neurovirulent Theiler’s murine encephalomyelitis virus (TMEV) in the genus Cardiovirus, have been confirmed to use a Sia-containing receptor for host cell attachment. ERAV is a horse pathogen [83]; EV-D68, EV70, and CVA24v are human pathogens [84,85,86]; EMCV has a wide host range including humans [87]; and low-neurovirulent TMEV is a mouse pathogen that is able to cause a chronic demyelinating disease and is thus used for studies on human multiple sclerosis [88] as shown in Table 2. The viral capsid of the picornaviruses comprises three virion proteins (VP1, VP2, and VP3) that form the shell and one virion protein (VP4) lying on the inner surface of the virus particle as illustrated in Fig. 3d. The crystal structures of the viral capsid proteins, VP1, VP2, VP3, and VP4, in complex with a sialyl ligand have shown that a Sia-binding pocket is in VP1 of ERAV [83], in a pit between VP1 and VP3 of EV-D68 [89], in VP1 of CVA24v [90], and mainly in the VP2 puff B of low-neurovirulent TMEV [88, 91]. Analysis of EMCV mutants/variants has indicated that amino acids in VP1 are responsible for interactions with Sia on the host cell surface [87, 92]. Typically, binding of the viral capsid proteins of picornaviruses to receptors triggers endocytosis. 
When the endosome becomes acidic, the viral capsids undergo conformational change and/or a protease is activated, resulting in formation of a channel through which the viral genome passes into the host cytosol [84, 93]. EMCV and poliovirus are exceptions: because their infection does not require low pH, it remains unclear whether their genomes can penetrate directly through the plasma membrane [93, 94].

Information on which virus strains require Sia and which sialyl glycan structure is a determinant for viral attachment and infection is critical to understand virus tropism and pathogenesis for diagnosis and treatment, especially the design of a detection system and inhibitors targeting Sia-binding sites in viral lectins. ERAV has been shown to require Sia on the host cell surface for its infection: reduction of the virus infection into host cells was observed when sialidase-treated cells were used or when α2-3-sialyllactose was added to be a competitive receptor for binding to the virus. α2-3 Sialyllactose was found to bind in VP1 of the ERAV capsid by X-ray crystallography, and the residues including Gln65, Ala118, Gln120, and Arg129 that interact with Sia are conserved across all ERAV strains, implying that all strains of ERAV require Sia as a receptor for virus entry [83].

Investigation of the role of Sia in infections by six EV-D68 strains isolated from patients during the period from 2009 to 2010, in comparison with the prototype Fermon strain isolated more than 50 years ago, showed that Sia-deficient cells and sialidase-treated cells were resistant to infection by strains 670 (clade A), 2042 (clade B), and 2284 (clade C), as in the case of the Fermon strain [89], but were sensitive to infection by the other three strains, 947 (clade B), 1348 (clade A), and 742 (clade A). The two sets of strains can thus be classified as Sia-dependent and Sia-independent strains, respectively [95]. The use of knockout cell lines and gene reconstitution, glycan array screening, infection inhibition assays with receptor analogues, and co-crystal structure analysis has indicated that both α2-3Neu5Ac and α2-6Neu5Ac on either lactose or lactosamine can be receptors for virus infection [89, 95], possibly explaining why EV-D68 infection causes a wide spectrum of illnesses.

Although both EV70 and CVA24v (but not CVA24) are still major causes of acute hemorrhagic conjunctivitis (AHC) epidemics worldwide due to ocular infections of both conjunctival and corneal cells and still have the potential to spark pandemics [96], there are no vaccines or antiviral drugs for AHC diseases caused by EV70 and CVA24v [86]. Knowledge of the receptor-binding specificity of viruses is needed for prevention and treatment of infections.

Viral attachment studies using linkage-specific sialidase-treated/sialidase-untreated cells, sialidase-treated followed by α2-3- or α2-6-sialyltransferase-treated/α2-6-sialyltransferase-untreated cells, and cells blocked with α2-3Sia- or α2-6Sia-binding lectins suggested that the EV70 prototype strain J670/71 binds specifically to the α2-3-sialyl linkage on the host cell surface. EV70 can bind to phosphatidylinositol-specific phospholipase C-treated cells and tunicamycin-treated cells but not to benzyl N-acetyl-α-d-galactosaminide (benzylGalNAc)-treated cells or to proteinase K-treated corneal cells, suggesting that EV70 prefers binding to α2-3Sia O-glycosylated, non-GPI-anchored glycoproteins on human corneal epithelial (HCE) cells [85, 97]. Viral binding and infection studies using sialidase-, PNGase F-, tunicamycin-, or benzylGalNAc-treated/sialidase-, PNGase F-, tunicamycin-, or benzylGalNAc-untreated cells and binding competition assays in the presence of sialyl-Lex, 3′SLN, or 3′sialyl-TF (Neu5Acα2-3Galβ1-3GalNAcα1, TF, Thomsen-Friedenreich) suggested that CVA24v binds to and infects HCE cells via α2-3Neu5Ac O-linked cell surface proteins [86]. This shared α2-3Neu5Ac-binding preference could be a factor explaining why these viruses have overlapping cellular tropism. Binding to and infection of conjunctival cells by both AHC-causing human picornaviruses, EV70 and CVA24v, must be further investigated. In contrast to EV70, analysis of the crystal structures of CVA24v in complex with a range of sialyl glycans indicated that CVA24v binds strongly to 6′SL and DSLNT (Fig. 3d); binds weakly to LSTc, sialyl-Lex, 3′SL, and 3′SLN (Fig. 1); and does not bind to GM1, GM2, GD1a, GD1b, and GD3, suggesting that CVA24v can bind to both α2-3 and α2-6Neu5Ac with preference for 6′SL over 3′SL and 6′SL over LSTc structures and confirming that it does not bind to gangliosides [90]. 
The ability of CVA24v to bind well to α2-6Neu5Ac could partially explain why CVA24v, more commonly than EV70, infects human upper respiratory tract tissue rich in α2-6Neu5Ac. Terminal α2-6Neu5Ac is also abundant in the mucin-type O-glycans of tear films, which could facilitate the spread of CVA24v [90]. While the emergence of pandemic EV70 in 1971 after its first recognition in 1969 and the second pandemic in 1980 are still mysteries, two pandemics caused by CVA24v, one in 1985 after the emergence of CVA24v/1970 causing an AHC outbreak and one in 2002, have recently been investigated [98]. Both nonvariant and variant CVA24 viruses use ICAM-1 as an essential receptor, but only infection of HCE cells by the AHC-causing CVA24 variants depends significantly on a sialylated cell surface. A comparison of the amino acid sequences in the Sia-binding pockets of nonvariant and variant CVA24 viruses indicated that most nonvariant CVA24 viruses and the first variant CVA24v/1970 virus contain VP1 with Phe250 but that all variant viruses since 1985, including the CVA24v/2002 pandemic virus, possess Tyr250 in VP1. Binding to and infection of HCE cells by the wild-type CVA24v virus were more efficient than binding to and infection of the cells by a constructed Tyr250Phe CVA24v mutant. These findings suggested that enhanced Sia-binding engagement by a change of Phe250 to Tyr250 in VP1 contributed to the CVA24v/1970 → CVA24v/1985 pandemic, whereas the changes responsible for the emergence of AHC-causing CVA24v/1970 from CVA24 and for the emergence of the CVA24v/2002 pandemic are still unknown [98].

The detailed sialyl structures that are used for EMCV attachment are still not known. Earlier studies showed that the wild-type mengovirus, but not avirulent mutants, agglutinates human erythrocytes via binding to glycophorin, which is the major sialoglycoprotein on human erythrocytes [99]. The EMCV K2 strain agglutinated human and equine erythrocytes dominant in Neu5Ac and Neu5Gc, respectively, but did not agglutinate bovine erythrocytes dominant in Neu5Gc; hence, this selective agglutination needs to be further investigated [100]. Investigation of an EMCV receptor on permissive human cells revealed that the K2 strain uses a 70-kDa sialoglycoprotein(s) on HeLa and K562 cells as a cell surface receptor for virus attachment [101]. For the rat strain 1086C, the parental virus is not a Sia-binding virus [87]. The viruses, which were cultured in baby hamster kidney-21 (BHK-21) cells, produced two groups of variants. Group I with Lys231 in VP1 that becomes Sia-dependent can bind to and infect primary human cardiomyocytes more efficiently than can group II with Arg231 in VP1, which is not a Sia-dependent group [92]. In addition, the rat strain 1086C appeared to acquire the ability to replicate effectively in buffalo rat liver (BRL) cells as a result of 3 amino acid mutations, Lys49Glu, Leu142Phe, and Ile180Ala, in VP1 that provided the ability to bind to the sialylated BRL cell surface after 29 passages [87]. These findings indicate that some EMCVs adapt to a new host by acquisition of the ability to bind to Sia on the host cell surface.

The cytopathic effect (CPE) caused by infection of low-neurovirulent TMEV strains BeAn and DA in BHK-21 cells was reduced either by the use of sialidase-treated cells or by addition of α2-3-sialyllactose to the medium, indicating the importance of sialylated surface molecules for infection of these persistent TMEV strains [88]. Single amino acid substitutions, Gln2161Ala/Arg/Trp/Phe or Gly2174Trp/Phe, in VP2 puff B of BeAn virus cause loss or reduction of viral attachment to erythrocytes and loss or reduction of viral spread among BHK-21 cells. However, the Gln2161Ala mutant virus appeared to recover to the wild-type Gln2161 virus with potential for binding and infection after prolonged passage in BHK-21 cells, suggesting that this mutant can acquire rapid adaptation in the host environment [91].

4.2 Caliciviridae

Members of the family Caliciviridae, which have cuplike depressions in the viral surface, that have been shown to be Sia-binding viruses include human enteric GII.3 (Chron1), GII.4 (Dijon) [102], and GII.4 (MI001) [103] and murine enteric GV (MNV-1, WU11, and S99) [104] noroviruses in the genus Norovirus (Norwalk-like viruses), rhesus macaque enteric GI.1 Tulane virus (TV) in the genus Recovirus [105], cat respiratory GI (F9) feline calicivirus (FCV) in the genus Vesivirus [106], and porcine enteric GIII (Cowden) sapovirus (PSaV) in the genus Sapovirus (Sapporo-like viruses) [107]. The capsid of these non-enveloped viruses is composed of 90 dimeric units of a single major capsid protein (VP1). Each VP1 monomer has 2 domains: a shell (S) domain and a protrusion (P) domain. Binding of the P domains to host receptors triggers receptor-mediated endocytosis that depends on clathrin, dynamin II, and/or cholesterol [108,109,110] as shown in Table 2, and the endocytosis is followed by penetration of the viral genome into the host cytosol for multiplication.

The use of neoglycoproteins for binding studies has revealed that virus-like particles (VLPs) of Chron1 (GII.3), Dijon (GII.4), and Norwalk (GI.1) strains bind to Leb and H type 1 chain glycoconjugates. Chron1 and Dijon strains also bind to sialylated type 2 chain glycoconjugates including sialyl-Lex and sialyl-diLex but not to Lex or sialyl-Lea, indicating virus binding specificity [102]. Later analysis of binding stoichiometry by native mass spectrometry provided evidence that four B antigens or two α2-3-sialyllactoses (GM3 trisaccharides) form a complex with a recombinant P domain dimer from the GII.4 MI001 variant. Epitope mapping showed direct interaction of α2-3Sia with the P domain [103].

Binding to and infection of primary murine macrophages by the murine norovirus MNV-1 strain were reduced in the presence of either Sambucus nigra lectin (SNL), which preferentially binds α2-6Sia, or Maackia amurensis lectin (MAL), which preferentially binds α2-3Sia, or when sialidase-treated macrophages were used. An enzyme-linked immunosorbent assay showed that MNV-1 bound to the ganglioside GD1a but not to GM1 or asialo-GM1 (GA1). In addition, the reduction of MNV-1 binding to and infection of primary murine macrophages caused by depletion of gangliosides could be reversed by the addition of GD1a. Similarly, a role of GD1a was also observed during binding to and infection of the macrophages with murine norovirus strains WU11 and S99 [104].

In addition to interactions with histo-blood group antigens (HBGAs), Tulane virus has been shown to bind to synthetic sialoglycoconjugates: strongly to Neu5Ac and weakly to α2-6SLN but not to Neu5Gc, α2-3SL, or a type A disaccharide (GalNAc-Gal). Tulane virus infection in permissive LLC-MK2 cells was significantly reduced by treatment of the host cells with either NA or α2-6Sia-binding SNL [105]. However, further investigation is needed to determine how HBGAs and sialoglycoconjugates coordinate with each other to attach to Tulane virus for mediating infection.

In addition to feline junctional adhesion molecule-A (fJAM-A) being a receptor for feline calicivirus (FCV), reduction of FCV binding and infection by V. cholerae NA treatment of host cells indicated that Sia is necessary for virus binding and infection. FCV binding and infection were also reduced by α2-6Sia-binding SNL but not by α2-3Sia-binding MAL, indicating the importance of α2-6 linkage for infection. Furthermore, FCV binding and infection were inhibited in the presence of both tunicamycin, which inhibits N-glycosylation, and PNGase F, which releases N-linked oligosaccharides, but not in the presence of benzylGalNAc, which inhibits O-glycosylation. These findings indicated that FCV uses α2-6Sia on N-linked glycoproteins as a primary receptor or co-receptor for infection [106].

Attachment and infection of PSaV Cowden strain were markedly inhibited by treatment of cells with V. cholerae NA and were partially inhibited by treatment of cells with α2-3Sia-cutting sialidase S, α2-3Sia-binding MAL, or α2-6Sia-binding SNL, suggesting that the virus can attach to both α2-3Sia and α2-6Sia for infection. Virus binding and infection can be reduced by treatment of cells with proteases or with benzylGalNAc but not by treatment with tunicamycin or with dl-threo-1-phenyl-2-decanoylamino-3-morpholino-1-propanol (PDMP), a glucosylceramide synthase inhibitor. These findings suggest that PSaV Cowden strain uses α2-3Sia and α2-6Sia on O-linked glycoproteins that are present on porcine intestinal epithelial cells as receptors for infection [107]. Interestingly, this virus did not agglutinate pig, human, rat, chicken, or cow red blood cells and did not bind to synthetic HBGAs including A and H types, which are known to be expressed in pigs. The sialyl sugar chain structures that are not found on those red blood cell membranes but are recognized by the PSaV Cowden strain should be investigated in detail. Detailed information on sialyl glycans recognized by each virus, not only this PSaV, and how many receptors are necessary for infection should lead to the development of effective strategies for control of calicivirus infection.
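The inhibitor-panel reasoning used above for FCV and the PSaV Cowden strain can be sketched as a small lookup. This is only an illustrative simplification: the function name `infer_carriers` and the table `TREATMENT_TARGETS` are hypothetical, the treatment-to-target assignments are those stated in the text, and real experiments involve partial effects and controls that a yes/no table does not capture.

```python
from typing import Dict, Set

# Carrier class depleted by each cell treatment, as described in the text.
# The table and all names here are illustrative assumptions, not a published tool.
TREATMENT_TARGETS: Dict[str, str] = {
    "tunicamycin": "N-linked glycoprotein",   # inhibits N-glycosylation
    "PNGase F": "N-linked glycoprotein",      # releases N-linked oligosaccharides
    "benzylGalNAc": "O-linked glycoprotein",  # inhibits O-glycosylation
    "PDMP": "ganglioside",                    # inhibits glucosylceramide synthase
}


def infer_carriers(effects: Dict[str, bool]) -> Set[str]:
    """Return carrier classes implicated by the treatments.

    `effects` maps a treatment name to True if that treatment reduced
    virus binding/infection and False if it had no effect.
    A class is implicated when some treatment targeting it reduced infection
    and no treatment targeting it was without effect.
    """
    implicated = {TREATMENT_TARGETS[t] for t, reduced in effects.items() if reduced}
    excluded = {TREATMENT_TARGETS[t] for t, reduced in effects.items() if not reduced}
    return implicated - excluded


# Observations summarized in the text (proteases are omitted for simplicity).
fcv = {"tunicamycin": True, "PNGase F": True, "benzylGalNAc": False}
psav_cowden = {"benzylGalNAc": True, "tunicamycin": False, "PDMP": False}

fcv_carriers = infer_carriers(fcv)
psav_carriers = infer_carriers(psav_cowden)
print("FCV:", fcv_carriers)
print("PSaV Cowden:", psav_carriers)
```

Under these assumptions the lookup reproduces the conclusions cited above: N-linked glycoproteins for FCV [106] and O-linked glycoproteins for the PSaV Cowden strain [107].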

4.3 Reoviridae

Non-enveloped viruses that contain 10–12 discrete dsRNA segments within multilayered capsids belong to the family Reoviridae. “Reo-” is derived from respiratory enteric orphan viruses, according to the pathology of the first members, which were found in respiratory and enteric tracts as orphans that did not cause symptomatic disease at the time of discovery. These viruses are currently members of the genus Orthoreovirus (“true” reoviruses). One species of this genus, the mammalian orthoreoviruses (also called reoviruses), contains three serotypes (types), 1, 2, and 3, that infect a variety of mammalian species including humans. These viruses are classified as nonfusogenic: infectious virus particles are activated in the clathrin-dependent endosome by proteolysis of outer shell proteins, which results in capsid conformational rearrangement and exposure of a hydrophobic part of a viral membrane-penetration protein, and they then penetrate into the host cytosol without fusing with the endosomal membrane [111]. In humans, reoviruses usually cause subclinical or mild illness, such as the common cold and enteritis, but might sometimes lead to severe illnesses (Table 2) including a CNS disease in infants [112]. The spread to and infection of the CNS by reoviruses in newborn mice have also been shown to be serotype-specific; type 1 spreads hematogenously to the infection site, ependymal cells, producing nonlethal hydrocephalus, whereas type 3 spreads neurally to infect neurons, resulting in lethal encephalitis [113]. Type 2 has not been studied in detail due to the difficulty in experimental propagation. This serotype-specific pattern of neurotropism is primarily determined by the viral attachment σ1 protein encoded by the viral S1 gene. The ten dsRNA segments of the reoviruses are divided into three classes according to their size: three segments are large (L1, L2, L3), three are medium sized (M1, M2, M3), and four are small (S1, S2, S3, S4). 
Each segment encodes one protein, except for the S1 segment, which encodes structural and nonstructural proteins. The proteins are denoted by Greek letters corresponding to the L, M, and S segments that encode them (λ, μ, and σ), but the numbering of the proteins is not related to the numbering of the segments; for example, the S4 gene encodes the σ3 protein [111]. As shown in Fig. 3e, f, the outer shell of this double-shelled virus is composed mainly of the μ1 protein (including its cleavage products) and the σ3 protein and partly of the homotrimeric spike σ1 protein, which interacts with the homopentameric turret-like spike λ2 protein at each of the 12 vertices of the virus particle.

The serotype-specific differences in routes of spread and in recognition of different cells in the CNS of newborn mice are thought to be due not to binding to junctional adhesion molecule-A (JAM-A), which is conserved among these viruses and occurs at the base of the σ1 head domain, but to the different binding specificities of their σ1 proteins for sialyl glycans. Early binding specificity studies of reoviruses showed that reovirus hemagglutination is serotype-specific; types 1 and 2 preferentially agglutinate human erythrocytes, while type 3 favors bovine erythrocytes [114]. However, detailed structures of the sialyl sugar chains from these erythrocytes are still unknown. It is known only that human erythrocytes contain Neu5Ac alone [115], whereas bovine erythrocytes contain both Neu5Ac and Neu5Gc with a higher proportion of Neu5Gc [53]. Detailed information on Sia-binding specificity of reoviruses has been obtained by using glycan array screening and structural studies on the co-crystals of recombinant protein σ1 of prototypic type 1 strain Lang (T1L) [116] or prototypic type 3 strain Dearing (T3D) [117] and sialyl glycan. Glycan array analysis of T1L σ1 binding to gangliosides indicated that the GM2 binding signal is stronger than GM3, GM1, and GD1a binding signals [116]. GM2 appeared to specifically decrease type 1, but not type 3, infection of mouse embryonic fibroblasts. The crystal structure of T1L σ1 in complex with GM2 showed that both terminal Neu5Ac and GalNAc moieties of GM2 make contact with protein σ1 in a shallow groove in the globular head domain, while the crystal structure of the T1L σ1-GM3 (lacking terminal GalNAc) complex (Fig. 3e) showed that only terminal Neu5Ac of the GM3 trisaccharide interacts with T1L σ1 [116]. 
In contrast, the crystal structure of T3D σ1 in complex with α2-3-sialyllactose (a trisaccharide of GM3) showed that Neu5Ac makes extensive contact with each body domain of homotrimeric σ1 protein and that the lactose (Gal-Glc) moieties participate in the contact in different directions in each body domain, presumably as a result of flexibility of the three binding sites [117]. Figure 3f shows that not only terminal Neu5Ac but also the third sugar Glc of α2-3-sialyllactose has direct hydrogen bonds with the binding site of T3D σ1. The T3D σ1 binding site can also accommodate α2-6-sialyllactose and α2-8-di-sialyllactose with either Neu5Ac or Neu5Gc [116, 117]. It was proposed that σ1 proteins of a reovirus first selectively bind to specific sialyl glycans on the host cell surface with relatively low affinity. This is followed by binding of the viral σ1 proteins to JAM-As on the same host cell surface with high affinity. These firm attachments of the virus to sialyl glycans and JAM-As on the cell surface mediate viral internalization via clathrin-dependent endocytosis [117].

Both type 1 and type 3 reoviruses also cause serotype-specific patterns of infection in the intestine of adult mice; type 1 infects crypt epithelial cells with pathology restricted to the ileum, while type 3 prefers goblet and absorptive cells, causing widespread pathology including duodenitis, jejunitis, and ulcerative colitis [118]. Studies on reovirus infection in rat lungs indicated that both type 1 and type 3, which can bind to the α2-3-sialyl linkage, can infect and replicate in type I alveolar epithelial cells, leading to pneumonia, but that type 1 produced higher titers in the lungs than did type 3, which has a wide range of binding preferences for α2-3-, α2-6-, and α2-8-sialyl linkages [119]. Serotype-specific cellular tropism of reoviruses in the respiratory tract, including the upper respiratory tract, may exist, and more investigation is needed.

An understanding of the mechanisms responsible for the differences between reovirus serotypes (types) in infection potency, cellular tropism, and pathogenesis might lead to the development of methods for diagnosis and treatment. Such an understanding might also be useful for improvements of reoviruses in clinical therapeutic applications against cancers having changes in glycan composition profiles.

The other genus in the family Reoviridae whose members attach to Sia for efficient infection is the genus Rotavirus. Rotaviruses, which are a major cause of viral diarrhea in animals worldwide and cause death in about half a million infants and young children each year, possess 11 dsRNA genome segments enclosed in triple-shelled capsids. The outer shell consists of the glycoprotein VP7 layer and protruding VP4 dimeric spikes. In the presence of trypsin (a serine protease), VP4 is cleaved into VP5* and VP8*. These outer shell proteins have been shown to interact with several cell surface molecules; VP7 binds to integrins αvβ3 and αxβ2, VP5* binds to integrin α2β1 and to heat shock cognate protein 70 (hsc70), and VP8* binds to Sia. While binding to hsc70 is required for all virus strains, binding to integrins and to Sia appears to be strain-dependent [120]. It has been proposed that VP8* at the top of the VP4 molecule initiates cell attachment via Sia interactions. Sequentially or alternatively, VP5* at the body of VP4 interacts with integrin α2β1. Then VP5* at the foot of VP4 interacts with hsc70, followed by interactions of VP7 at the outer shell layer with integrins αvβ3 and αxβ2. These multistep interactions finally trigger endocytosis via a mechanism that depends on the virus strain [120]. These two outer shell viral proteins, the nonglycosylated protease-sensitive VP4 protein and the glycosylated VP7 protein, carry type-specific epitopes and are also used for classification of rotaviruses into P and G genotypes, respectively.

Based on results of hemagglutination tests, hemagglutination inhibition tests, or infection inhibition tests with NA treatment/blocking with Sia-carrying inhibitors [121, 122], inhibition of virus binding to MA104 cells or enterocytes by GM3 and GM2 [123], and/or direct interactions between GD1a and VP8* [124], all tested strains of P genotypes 1, 2, 3, and 7 in group A rotaviruses [121, 123, 124] and strain AmC-1 in group C rotavirus [122] have been shown to be Sia-dependent viruses that require binding to terminal Sia of glycoconjugates on the host cell surface, which is sensitive to hydrolysis by an NA. Therefore, these Sia-dependent viruses are subgrouped as sialidase-sensitive strains. Two human strains, KUN in genotype P4 and MO in P8, have been tested and found to be Sia-dependent viruses that require binding to internal (branched) Sia of glycoconjugates, i.e., GM1a (=GM1) (Fig. 1), and that do not bind to asialo GM1a (called GA1). The internal (branched) Sia of gangliosides appears to be insensitive to hydrolysis by an NA [125]. Therefore, these Sia-dependent viruses are subgrouped as sialidase-resistant strains. Later, infections by human strains Wa (G1P1A(8), 1A being a serotype that precedes the P genotype in parentheses), RV-3 (G3P2A(6)), RV-5 (G2P1B(4)), and S12/85 (G3P2A(6)) and bovine strain UK (G6P7(5)) were shown to be inhibited by the GM1-binding cholera toxin B (CTB) and to be increased by the use of cells in which exogenous GM1 is incorporated or by the use of sialidase-treated cells having increased GM1 levels [126]. Thus, these human and bovine rotaviruses in P genotypes 4, 5, 6, and 8 are classified as Sia-dependent viruses in a subgroup of sialidase-resistant strains. Surprisingly, infection by porcine strain TFR-41 (G5P9(7)), which is a sialidase-sensitive strain, was reduced by CTB treatment [126]. Thus, this virus is classified as a sialidase-sensitive strain that can bind to both terminal Sia and internal Sia, as shown in Table 2.

The binding preferences of rotaviruses for Sia species have been determined directly and indirectly. Infection of sialidase-treated cells by the human sialidase-resistant strains Wa, RV-3, and RV-5 is reduced in the presence of Neu5Acα2Me [126], suggesting that Neu5Ac is the preferred Sia species of these strains. Studies on inhibition of rotavirus binding to host cells by Neu5Gc-containing aceramido-GM3Gc and Neu5Ac-containing aceramido-GM3Ac [127]; inhibition of rotavirus infection by an anti-Neu5Gc antibody, Neu5Gcα2Me, and Neu5Acα2Me [126, 128]; and analysis of the crystallographic structure of VP8* in complex with Neu5Gcα2Me [128] revealed that VP8* proteins from the porcine sialidase-sensitive strains CRW-8, OSU, and YM and the bovine strains SA11 and NCDV, which carry the small residue Gly187, have greater specificity for Neu5Gc, which holds an extra hydroxyl group, than for Neu5Ac, whereas VP8* of the rhesus (simian) sialidase-sensitive strain RRV, which carries Lys187, has a higher preference for Neu5Ac than for Neu5Gc, and the bovine sialidase-resistant strain UK with Lys187 favors Neu5Ac. It was shown that CRW-8 virus acquires a Pro157-to-Ser157 mutation in VP8* that decreases Neu5Gc-binding affinity during cultivation in MA104 cells [128], indicating virus adaptation to host cell surface receptors.

Recent studies on sialyl linkage-specificity of sialidase-sensitive rotavirus strains have demonstrated that infection of bovine strain NCDV in genotype P1 and canine strain CU-1 in P3 was decreased by α2-6Sia-binding SNL but not by α2-3Sia-binding MAL. Infection of porcine strains PRG9121 in P7 and PRG942 in P23 was inhibited by both α2-6Sia-binding SNL and α2-3Sia-binding MAL. Pretreatment of MA104 cells with either PDMP, which inhibits glucosylceramide synthase, or tunicamycin, which inhibits N-glycosylation, inhibited infection of all viruses. In contrast, pretreatment with benzylGalNAc, which inhibits O-glycosylation, did not affect any virus infection. These results suggested that bovine P1 NCDV and canine P3 CU-1 strains have binding preference to α2-6Sia on gangliosides or N-linked glycoproteins, while both porcine P7 PRG9121 and P23 PRG942 strains can bind to both α2-6Sia and α2-3Sia on gangliosides or N-linked glycoproteins [129].
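The lectin-blocking logic behind these linkage assignments can be written out explicitly. The sketch below is a hedged illustration only: the function `linkage_preference` and the dictionary `LECTIN_RESULTS` are hypothetical names, the yes/no entries encode the results as summarized in the text for [129], and partial inhibition is not modeled.

```python
from typing import Dict, Tuple


def linkage_preference(snl_inhibits: bool, mal_inhibits: bool) -> str:
    """Infer the sialyl linkage implicated by lectin blocking.

    SNL preferentially binds α2-6Sia and MAL preferentially binds α2-3Sia,
    so inhibition of infection by a lectin implicates the linkage it masks.
    """
    if snl_inhibits and mal_inhibits:
        return "α2-3Sia and α2-6Sia"
    if snl_inhibits:
        return "α2-6Sia"
    if mal_inhibits:
        return "α2-3Sia"
    return "no linkage implicated by lectin blocking"


# (SNL inhibits infection, MAL inhibits infection), as summarized in the text.
LECTIN_RESULTS: Dict[str, Tuple[bool, bool]] = {
    "NCDV (bovine, P1)": (True, False),
    "CU-1 (canine, P3)": (True, False),
    "PRG9121 (porcine, P7)": (True, True),
    "PRG942 (porcine, P23)": (True, True),
}

PREFERENCES = {
    strain: linkage_preference(snl, mal)
    for strain, (snl, mal) in LECTIN_RESULTS.items()
}

for strain, preference in PREFERENCES.items():
    print(f"{strain}: {preference}")
```

The same two-lectin comparison underlies the linkage assignments reported for the caliciviruses in Sect. 4.2.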

A rotavirus vaccine is now available, but there is still a lack of antiviral drugs for treatment. Knowledge of the receptor-binding specificity of rotaviruses could be useful for designing effective inhibitors against VP8* attachment at the initial step of infection.

4.4 Adenoviridae

Human adenoviruses (HAdVs) belong to the genus Mastadenovirus (mammalian) in the family Adenoviridae. They are medium-sized (70–100 nm) non-enveloped viruses with a linear dsDNA genome. Type 90 (Ad90 or HAdV-90) has recently been identified, and there are currently 90 identified HAdV types [130] that are grouped into 7 species, A to G. Only some types in species D that are associated with human ocular diseases and type 52, the only current member in species G, that is associated with gastroenteritis are thought to use Sia as a primary receptor (Table 2). As shown in Fig. 3g, the viral capsid of HAdV is constructed from hexon trimers and penton pentamers that project long fiber trimers outward from each vertex. The fiber trimers carry receptor-binding sites at each monomeric distal globular end, the knob domain, for binding to primary receptors on the host cell surface. The primary binding allows the penton pentamers to make contact with secondary receptors on the host cell surface, leading to endocytosis typically via clathrin [131]. However, it has been shown that endocytosis of the human adenovirus type 37 in species D (HAdV-D37) into corneal cells uses caveolin-1 proteins clustered in lipid rafts [132].

Based on reduction of fiber knob/virus binding and infection of sialidase-treated cells, 6 types of ocular pathogens in HAdV species D including 8, 26, 37, 53 (a natural intertypic recombinant of HAdV types 8, 22 and 37), 54 (believed to be an HAdV-D8 variant strain), and 64 (formerly known as 19a, a natural recombinant of HAdV types 19p (p = prototype), 22 and 37) [133] have so far been shown to use Sia as a primary cellular receptor for binding that is mediated by their trimeric fiber knobs [134,135,136]. Pretreatment of cells with an NA reduced interactions of HAdV-D19p, which does not cause an eye infection, but did not affect its productive infection, suggesting that Sia may support the virus attachment but is not used as a functional receptor for virus infection [134]. X-ray crystallographic studies have shown that both HAdV-D37 and HAdV-D19p fiber knobs, which differ in only two amino acids (Lys240 and Asn340 for HAdV-D37 and Glu240 and Asp340 for HAdV-D19p), bind to both α2-3 and α2-6-sialyllactose, indicating that sialyllactoses are not determinants of the tropism of these viruses [137]. Glycan array screening showed that the HAdV-D37 knob domain specifically binds to GD1a glycan, which is a disialyl branched hexasaccharide. Surface plasmon resonance analysis revealed that GD1a glycan binds to the HAdV-D37 knob with about 260-fold higher affinity (Kd of 19 μM) than does sialyllactose (Kd of 5 mM) [137], presumably because two terminal Sias on a single GD1a glycan can directly interact with two of three binding sites in the trimeric knob (Fig. 3g) as indicated by molecular modeling, nuclear magnetic resonance, and X-ray crystallographic studies [138]. Several experiments have indicated that HAdV-D37 binds to cell surface O-linked glycoproteins rather than to gangliosides. 
For example, neither reduction of ganglioside biosynthesis by treatment with P4 compound [(1R,2R)-1-phenyl-2-hexadecanoylamino-3-pyrrolidino-1-propanol] nor removal of cell surface N-glycans by treatment with PNGase F affected binding of HAdV-D37 to HCE cells, while reduction of O-linked glycan synthesis by treatment with benzylGalNAc efficiently inhibited binding of HAdV-D37 to HCE cells and infection of the cells. Thus, O-linked glycoproteins that carry glycans that mimic GD1a glycan are functional host receptors of HAdV-D37. Amino acid sequence analysis revealed that fiber knobs of HAdV-D37 and HAdV-D64 have identity of 100% [136], suggesting that HAdV-D64 possesses the same binding pocket with preferential binding to O-linked glycoproteins bearing GD1a glycan. Fiber knobs of HAdV-D8, HAdV-D26, HAdV-D53, and HAdV-D54 carrying some amino acids that are different from those in the above fiber knobs [135, 136] should be further identified for determining positions of amino acids in 3D structures of the fiber knobs and screened for their receptor-binding specificity.
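
The roughly 260-fold affinity difference quoted above for binding of GD1a glycan versus sialyllactose to the HAdV-D37 knob can be checked directly from the two reported Kd values, and it can be expressed as a difference in binding free energy, ΔΔG = ΔG(GD1a glycan) − ΔG(sialyllactose). The following is only an illustrative sketch: the temperature of 298 K is an assumption that is not given in the text, and the free energy value is a back-of-the-envelope estimate, not a reported measurement.

```latex
% Fold difference from the two dissociation constants given in the text [137]
\[
\frac{K_d(\text{sialyllactose})}{K_d(\text{GD1a glycan})}
= \frac{5\ \mathrm{mM}}{19\ \mu\mathrm{M}}
= \frac{5000\ \mu\mathrm{M}}{19\ \mu\mathrm{M}} \approx 263
\]
% Assumption: T = 298 K (not stated in the text);
% R = 1.987 \times 10^{-3} kcal mol^{-1} K^{-1}, so RT \approx 0.592 kcal mol^{-1}
\[
\Delta\Delta G
= -RT \ln\!\left(\frac{K_d(\text{sialyllactose})}{K_d(\text{GD1a glycan})}\right)
\approx -(0.592\ \mathrm{kcal\,mol^{-1}}) \times \ln(263)
\approx -3.3\ \mathrm{kcal\,mol^{-1}}
\]
```

Under this assumption, engagement of a second Sia by the trimeric knob would correspond to a gain of only about 3 kcal/mol in binding free energy, which is enough to shift binding from the millimolar to the micromolar range.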

Using virus overlay protein blot assays, HAdV-D37 (Ad37) has been shown to interact with a human conjunctival (Chang C) membrane protein with an approximate molecular weight of 45 kDa (CAR) in a calcium-independent manner, with a Chang C membrane protein with a molecular weight of 50 kDa (a membrane protein that was later identified as CD46) in a calcium-dependent manner and with a Chang C membrane protein with a molecular weight of 60 kDa (sialylated protein) in a calcium-dependent manner. Pretreatment of Chang C cells with anti-CAR antibodies had little effect on HAdV-D37 infection, indicating that HAdV-D37 infection does not use CAR to infect Chang C conjunctival cells. Pretreatment of Chang C cells with an NA led to abolishment of HAdV-D37 binding to the 60-kDa protein but did not affect HAdV-D37 binding to the 50-kDa protein and infection. These findings suggested that HAdV-D37 can bind to and infect conjunctival cells through the 50-kDa protein (CD46) independently of Sia [139].

Typically, HAdVs have one type of fiber trimer, but all enteric HAdVs including species F containing HAdV-F40 and HAdV-F41 and species G containing HAdV-G52 have two different types of fiber trimers, long and short. All three of these viruses use their longer fibers for attachment to the coxsackie and adenovirus receptor (CAR) on target cells. However, HAdV-F40 and -F41 short fibers do not bind to Sia, while HAdV-G52 short fibers have been shown to attach specifically to Sia. It appears that HAdV-G52 requires either CAR or Sia for virus infection. Removal of the host cell surface Sia by NA treatment abolished HAdV-G52 binding to and infection of Sia-expressing original Chinese hamster ovary (CHO) cells but did not abolish (only partly reduced) virus binding to CHO cells that also express CAR (CAR-expressing CHO cells). When compared to original CHO cells, HAdV-G52 showed a remarkable increase in binding to CAR-expressing CHO cells [140]. Thus, HAdV-G52 may use its long or short fiber knob or both knobs for attachment to host CAR and/or Sia as a primary receptor or co-primary receptors to initiate virus infection. However, there are several questions that need to be answered for understanding the mechanism of virus infection. For example, why do all three known enteric HAdVs have two spikes (fibers), while other HAdVs need only one primary receptor-binding spike? Why do short fibers of HAdV species F not bind to Sia as short fibers of species G do? Both proteins and carbohydrates that are possible receptors on target gastrointestinal epithelial cells, especially those on the luminal (apical) surface that is a site of adenovirus infection, should be identified.

Sialyl glycan structures to which HAdV-G52 preferentially binds have been determined [140, 141]. Glycan microarray screening using a variety of α2-3- and α2-6-sialylated probes showed that several probes with one or two α2-3-sialyl linkages can be bound by the HAdV-G52 short fiber knob (52SFK) but that probes with α2-6-sialyl linkages cannot be significantly detected for their binding to 52SFK. The strongest binding observed was to the probe with a type II (Galβ1-4GlcNAc) backbone sequence, Neu5Acα2-3Galβ1-4GlcNAcβ1-3Galβ1-4Glcβ- [140]. However, the use of a broader range of probes including α2-8- and α2-9-sialyl linkages in glycan microarray screening of 52SFK binding specificity indicated that 52SFK interacts preferentially with linear α2-8-linked oligoSia (polySia, →8Neu5Acα2-), especially at a degree of polymerization of 5 to 9 that produced very strong binding signals that were much greater than those produced by α2-3-sialyl linkage [141]. Since polySia is abundant in brain and lung cancers, the molecular relationships between HAdV-G52-mediated gastroenteritis and 52SFK binding specificity to polySia receptors remain unknown [141]. Determination of the glycosylation type that is required for virus binding revealed that inhibition of O-linked glycosylation, but not inhibition of glycolipids or N-linked glycosylation, resulted in reduction of both virus particles and 52SFK binding to A549 cells, suggesting that O-linked glycans play a major role in 52SFK binding specificity [140].

Nonhuman adenoviruses that have been shown to interact with Sia are canine adenovirus type 2 (CAdV-2), which causes a respiratory disease in dogs and is grouped in the genus Mastadenovirus (mammals) [142], and turkey adenovirus type 3 (TAdV-3), which has a virulent form known as turkey hemorrhagic enteritis virus (THEV) that causes hemorrhagic enteritis in turkeys, splenomegaly in chickens, and marble spleen disease in pheasants and an avirulent form that is used as a vaccine. TAdV-3 was previously grouped in subgroup II avian adenoviruses but is currently grouped in the species turkey siadenovirus A in the genus Siadenovirus (amphibians, dinosauria, testudine species) [143]. Crystallographic studies showed that the CAdV-2 fiber knob can bind to both α2-3-sialyl-d-lactose and CAR D1 in different binding sites. However, binding between the CAdV-2 fiber knob and highly sialylated glycophorin cannot be detected, while hemagglutination of CAdV-2 correlates well with expression of CAR on erythrocytes: CAdV-2 can agglutinate rat and human erythrocytes, which have high CAR levels, but cannot agglutinate erythrocytes from dogs, mice, rabbits, lemurs, and monkeys, which have undetectable levels of CAR [142]. Thus, CAR on erythrocytes is likely to be an important factor for CAdV-2 binding.

By glycan microarray analysis, fiber knobs of both virulent and avirulent TAdV-3 strains have been shown to interact with both α2-3- and α2-6-sialyllactoses but not to bind to α2-3-sialyllactosamine. Isothermal titration calorimetry showed that the binding affinities of both fiber knobs (heads) to α2-3- and α2-6-sialyllactoses were in the mM range at pH 6.0 and less at pH 7.2 with a two- to fourfold higher affinity for α2-3-sialyllactose. Avirulent fiber knobs appeared to bind 1.5- to 3-fold more strongly to sialyllactoses than did virulent fiber knobs. Crystallographic studies showed that the Sia-binding region of TAdV-3 fiber heads comprises amino acids at positions 392 and 419–423, while the two amino acid differences between virulent and avirulent TAdV-3 fiber head domains are at positions 354 and 376; hydrophobic Ile354 and hydrophilic Thr376 in a virulent fiber head are substituted with hydrophobic Met354 and Met376, which carry a long side chain with a sulfur atom, in an avirulent fiber head [143]. These two amino acids are on the protein surface in separate regions different from the Sia-binding region. The 354 and 376 regions may be binding regions for different ligands, such as CAR and CD46, and may be responsible for the difference between virulent and avirulent pathogenesis, a possibility that needs more investigation.

The presence of multi-receptor-binding sites of the fiber knobs (heads) of adenoviruses (such as HAdV-D37 fiber knobs containing binding sites for CD46 [139], Sia [138], and CAR [142]) should be taken into consideration when designing antivirals. The roles of these multi-receptor-binding sites in the life cycle of adenoviruses also need to be understood not only for efficient control of pathogenic viruses but also for design of efficient vectors for gene therapy and for development of oncolytic virotherapy with cell-specific tropism.

4.5 Parvoviridae

Both dependoviruses and autonomous parvoviruses in the subfamily Parvovirinae of the family Parvoviridae, a family of small (20–26 nm in diameter) non-enveloped viruses with a 4–6-kb linear ssDNA genome, have been found to bind to Sia. Probably because their very small amount of genetic material cannot code for the necessary biochemical apparatus, efficient replication of the 4–5-kb ssDNA dependoviruses, except for duck and goose parvoviruses, depends, as their name indicates, on a larger helper virus such as adenovirus, herpesvirus, or papillomavirus. Thus, they are nicknamed replication-defective parvoviruses or adeno-associated viruses (AAVs). The other 5–6-kb ssDNA autonomous parvoviruses can autonomously replicate inside the cells but only during the DNA synthesis (S) phase of the host cell cycle due to the very simple viral genome [144]. Although replication-defective parvoviruses have not been reported to cause pathology, replication-autonomous parvoviruses have been found to have both nonpathogenic and pathogenic members [145].

Binding of a parvovirus to one or more plasma membrane receptors initiates virus infection typically via clathrin-mediated endocytosis, and the endosome is possibly breached by lipolytic activity of phospholipase A2 located in the N-terminal “unique region” of VP1 (uVP1). This leads to release and transport of the capsid and/or viral DNA into the cytosol and the nucleus [146]. Receptor binding is a function of a viral capsid protein (CP) as shown in Fig. 3h. Depending on the virus strain and the maturation stage, the CP comprises two to four overlapping capsid proteins (called VP1, VP2, VP3, and VP4). Sixty CP copies (subunits) in total are assembled to build an icosahedral capsid surrounding the parvovirus genome [147].

VP1 and VP2 are formed by alternative splicing of the same messenger RNA (mRNA), and the entire sequence of VP2 is encoded within the VP1 gene. In some viruses, a third structural protein, VP3, is formed (only in DNA-containing capsids) by cleavage of a peptide from the amino terminus of VP2.

As shown in Fig. 3h, each VP is an isoform of VP1 shortened at the N-terminus, and each VP thus contains the same C-terminal amino acids (receptor-binding region). VP2, which arises from alternative splicing of the same mRNA as that of VP1, lacks the N-terminal “unique region” of VP1 (uVP1) that is critical for releasing the virus from the endosome. VP3 and VP4 are likely to be produced through posttranslational cleavage of the N-terminal amino acids of VP2 and VP3, respectively. VP1 is produced at a much lower copy number than that of VP2. However, the major capsid component seems to be the final product of posttranslational cleavage [147]. The capsid illustrated in Fig. 3h is an adeno-associated viral capsid consisting of VP1 (full-length VP protein), VP2 (lacking uVP1), and VP3 (lacking nuclear localization signal, NLS), but not VP4, with a ratio of around 1:1:10 [148, 149]. Why do all VPs contain the same receptor-binding site? Do they have different roles? Binding of VP1 to a receptor(s) seems to be involved in triggering conformational changes in the uVP1 region [148], leading to exposure of the phospholipase A2 domain from the capsid interior that is critical for viral infectivity as mentioned above. The different roles of the receptor-binding regions present in VP1 and uVP1-lacking VP2 and VP3 should be investigated in detail.
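
The approximate VP1:VP2:VP3 ratio of 1:1:10 can be translated into subunit numbers for the 60-subunit icosahedral capsid described earlier in this section. The short calculation below is only a sketch that treats the ratio as exact; the symbols N_VP1, N_VP2, and N_VP3 are introduced here for illustration, and actual particles may deviate from these average values.

```latex
% Assumption: the ratio 1:1:10 [148, 149] is treated as exact for a 60-subunit capsid
\[
N_{\mathrm{VP1}} : N_{\mathrm{VP2}} : N_{\mathrm{VP3}} \approx 1 : 1 : 10,
\qquad
N_{\mathrm{VP1}} + N_{\mathrm{VP2}} + N_{\mathrm{VP3}} = 60
\]
\[
N_{\mathrm{VP1}} \approx N_{\mathrm{VP2}} \approx \frac{1}{12} \times 60 = 5,
\qquad
N_{\mathrm{VP3}} \approx \frac{10}{12} \times 60 = 50
\]
```

On this estimate, only about 5 of the 60 subunits carry the uVP1 region with its phospholipase A2 domain, whereas all 60 subunits carry the common C-terminal receptor-binding region.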

Receptors on the host cell plasma membrane, both carbohydrates and proteins, have been investigated for many viruses in this family. Here we focus on Sia receptors recognized by viral capsid proteins, especially those of adeno-associated viruses, which have been extensively studied as shown in Table 2. For autonomous parvoviruses, bovine parvovirus (BPV) [150] and porcine parvovirus (PPV) [151] are known to bind to α2-3 O-linked and either O-linked or N-linked sialic acids, respectively, on the cell surface for attachment. Binding of feline panleukopenia virus (FPV) and of canine parvovirus (CPV), a canine-adapted form of FPV that emerged in dogs in 1978, to both α2-3 and α2-6Sia does not seem to be for cell infection, though their binding preferences to different Sia types, CPV specific for Neu5Ac and FPV specific for Neu5Gc, seem to be determinants of host adaptation in dogs and cats, respectively [152]. During evolution, it appeared that binding of CPV to Sia drifted from neutral pH to acidic pH in association with an Asn375Asp mutation in VP2. This virus variant (CPV type 2a, CPV-2a) emerged in 1979 and replaced the CPV-2 strain globally. FPV (Asp375) and CPV-2a (Asp375), which are transmitted through an oro-nasal route, show preferential binding to Sia at pH below 6.5, and this was assumed to allow attachment of virus progenies released from infected progenitor cells in crypts of the intestinal epithelium to Sias on materials, such as mucus, in the intestinal lumen, where the pH is around 5.5–6.6. Such binding was suggested to enhance virus shedding in stools and persistence in the environment for transmission to other hosts [152]. This is an example of Sia-binding viral lectins, VP2 of CPV and FPV, that evolved not for infection of target cells. 
Another Sia-binding viral lectin that is not used for virus infection is spike (S) protein of several strains of transmissible gastroenteritis virus (TGEV), which is a linear (+)ssRNA enveloped virus in the genus Alphacoronavirus in the family Coronaviridae that causes fatal diarrhea in newborn piglets and binds to specific α2-3Neu5Gc. The protein is thought to allow the virus to persist in and pass through unfavorable environments, such as environments with low pH, proteases, and bile salts, when traveling in the alimentary tract to the target site, the intestinal epithelium [69].

Sia appears to be important for infection of minute virus of mice (MVM) and H-1 parvovirus (H-1PV, isolated from a human tumor cell line transplanted in rats), which are autonomous rodent parvoviruses that infect, propagate in, and kill tumor cells but not non-transformed cells and are thus promising antitumor agents [153]. The Sia-binding sites play a determinant role in cellular tropism of these two viruses and are thus important for tumor selectivity (oncotropism). Some studies have shown that a change in amino acid residues of the Sia-binding site affects receptor-binding preference, which probably leads to a change in host range (expanded, restricted, or retargeted) [147, 154]. For example, recombinant MVMp (p, prototype strain) viruses constructed with VP Ile362Ser and/or Lys368Arg substitution(s) near the Sia-positioned dimple, according to substitutions in lethal MVM variants derived from infection of severe combined immunodeficient (SCID) mice by the apathogenic strain (MVMp), were reported to have a lower affinity for Sia receptors, produce a large-plaque phenotype in cell lines in vitro, and cause lethal disease in SCID mice [147]. That study not only indicated that lowering of the Sia receptor-binding affinity of the virus leads to extension of viral tropism and a dramatic increase in viral pathogenicity but also suggested that a virus variant can be generated via both genetic engineering and natural selection. This raises serious concerns about whether the use of oncolytic viruses, especially in immunosuppressed patients, is safe and whether virus variants causing severe disease may emerge during oncovirotherapy. Virotherapeutics with formulations guaranteeing genetic stability may be required.

In contrast to autonomous parvoviruses, adeno-associated viruses (AAVs) have not been reported to cause any disease (generally low levels of immunogenicity and toxicity), and they can infect both dividing and nondividing cells, persisting for a long time either through site-specific integration in the host cell genome or, when the gene part required for chromosomal integration has been removed from the linear ssDNA, in an episomal state [155, 156]. Although AAVs are replication-defective parvoviruses, they are now widely used as viral vectors carrying therapeutic genes for gene therapy of various disorders; the helper genes from an adenovirus that mediate AAV replication may be supplied on a separate plasmid or inserted directly in the AAV vector. However, an immune response to AAV transduction has sometimes been observed, leading to limitation of the use of AAV vectors. For example, unexpected liver toxicity occurred when AAV2 was transduced into hepatocytes during a clinical gene therapy trial for hemophilia B due to cytotoxic T-lymphocyte (CTL) activation by the AAV2 capsid heparin binding motif [155]. This indicated that in addition to developments for increasing the AAV genome capacity and enhancing gene expression, an understanding of AAV-human host interactions is needed for the development of highly efficient transduction specific to target cells, not immune and other normal cells.

AAVs are ubiquitous; 13 serotypes (AAV1–AAV13) that differ in their capsid structure have so far been identified in human and nonhuman primate sources. Like other parvoviruses, the viral capsid protein (CP as shown in Fig. 3) is used for binding to cell surface receptors for mediating viral entry and is thus a prime determinant of the host range and host specificity [157]. AAV1, AAV4, AAV5, and AAV6 bind to the terminal Sia by the viral capsid Sia-binding motif found on all CP components, VP1, VP2, and VP3, but they differ in binding preference for the asialo portion, including glycosidic linkage types. As shown in Table 2, AAV4 requires α2-3Sia on O-glycans, whereas AAV5 preferentially binds to either α2-3 or α2-6Sia on N-glycans [158]. AAV1 and AAV6 can bind to either α2-3 or α2-6Sia-N-glycans [159]. Although AAV1 and AAV6 are closely related with only 6 residues from a total of 736 amino acid residues being different in their VPs, AAV6 appears to additionally bind to negatively charged heparan sulfate proteoglycans (HSPG), which is made possible by a single amino acid difference at the position of 531 (AAV6 Lys531, but AAV1 Glu531 as indicated in Fig. 3h). Some co-receptors have been found to be necessary for optimal AAV attachment and internalization: platelet-derived growth factor receptor (PDGFR) is a co-receptor for AAV5 and epidermal growth factor receptor (EGFR) is a co-receptor for AAV6 [160]. In addition to the primary receptor-binding region, the co-receptor-binding site should be identified for site-specific modification of both the primary and co-receptor-binding sites on the AAV capsid for therapeutic applications. Transduction efficiency of each natural AAV serotype in major tissues has been investigated and the results, summarized in Table 2, show differential tropism of AAV serotypes [160].

Bovine adeno-associated virus (BAAV) is a nonprimate AAV that uses Sia on glycosphingolipid (ganglioside) as a receptor for transduction (infection) [161]. It was shown that BAAV injected via canalostomy can efficiently deliver genes and control gene expression of connexin, leading to rescue of gap junction coupling in cochlear non-sensory cells of the inner ear in adult mice with a nonsyndromic hearing loss and deafness (DFNB1) phenotype [162]. Although persistence of BAAV-mediated gene replacement in the cochlea remains limited, the results of that study indicated that BAAV can be further developed as a recombinant viral vector for gene therapy.

Taken together, the results indicate that the receptor expression pattern on the host cell surface varies not only between different host cell membranes but also between different stages of cell differentiation and maturation. Further extensive analysis and identification of the host receptor expression pattern together with studies on AAV binding specificities to receptors on target tissue cells and nontarget immune cells could lead to successful rational modification of the AAV capsid to specific receptors found on cells in the target tissue but not on immune cells. The constructed AAV vectors could increase transduction efficacy and specificity in the desired tissue and overcome concerns about adverse side effects including induction of an immune response.

4.6 Polyomaviridae

Typically, infections by viruses in the family Polyomaviridae are asymptomatic. However, the viruses can cause diseases, especially in immunocompromised individuals, such as severe kidney and brain disorders caused by human John Cunningham polyomavirus (JCPyV) and skin cancer called Merkel cell carcinoma (MCC) that is caused by Merkel cell polyomavirus (MCPyV) (Table 2). Also, BK polyomavirus (BKPyV) can be reactivated to cause nephropathy in renal transplant recipients [163]. In addition, polyomaviruses (poly- = many, −oma = tumors) can transform cells in cultures and in immunocompromised laboratory animals including newborn mice [164]. Surprisingly, simian virus 40 (SV40), which is a monkey virus that was accidentally introduced into humans with contaminated polio vaccine and is able to be transmitted among humans, has recently been detected in a variety of human cancers [165]. The increasing number of detected human and animal polyomaviruses raises concerns that polyomaviruses might acquire mutation and/or recombination with human polyomaviruses with the potential to cause diseases and cancers. To infect a host cell, a polyomavirus, which is a non-enveloped DNA virus, must bind to a receptor(s) on the host plasma membrane, triggering caveolae/raft-mediated endocytosis or clathrin-mediated endocytosis (for JCPyV), and navigate through the ER before delivery of its DNA genome into the nucleus for transcription and replication (Fig. 1). It should be noted that a caveolin/clathrin-independent pathway was observed for some polyomaviruses such as entry of BKPyV into primary human renal proximal tubule epithelial cells [166]. Viral capsid protein 1 (VP1) is responsible for engagement of the host cell receptor, an important step that determines cellular tropism and pathogenesis of the virus. 
Internalization of JCPyV into cells via clathrin-mediated endocytosis, unlike other related polyomaviruses including murine polyomavirus (mPyV, which causes tumors in newborn mice, not shown in the table), SV40, BKPyV, and MCPyV, suggested that JCPyV has a receptor-binding preference that is different from those of other related polyomaviruses. JCPyV prefers α2-6Sia neolacto-series on N-linked glycoproteins [167, 168], while its binding to the 5-HT2 serotonin receptor seems to facilitate this endocytosis pathway [169]. SV40, mPyV, and BKPyV (either caveolin-dependent or caveolin-independent entry) share a common binding for ganglio-series gangliosides [166, 170,171,172] with differential binding preferences. SV40 prefers the branched α2-3Sia GM1 ganglioside [170], mPyV mainly attaches to α2-3Sia on GD1a/GT1b/GT1a [173, 174], and BKPyV and MCPyV commonly prefer to bind to α2-8Sia b-series gangliosides including GD3/GD2/GD1b/GT1b for BKPyV [171] and GT1b in cooperative binding with a glycosaminoglycan (GAG) for MCPyV [172]. Comparison of the crystal structures of BKPyV VP1-GD3 and SV40 VP1-GM1 (Fig. 3i) and site-directed mutagenesis studies have provided important evidence that the amino acid at position 68 is responsible for the different α2-3/α2-8 linkage preferences of these two viruses and thus acts as a determinant of receptor specificity. The use of distinct receptors to initiate virus infection contributes to the differences in virus host/tissue/cell tropism and pathological consequences as shown in Table 2.

Viruses can rapidly adapt to their environment. Although polyomaviruses are circular dsDNA viruses using host polymerases for transcription and replication, progressive multifocal leukoencephalopathy (PML)-mutant strains of JCPyV have been detected with VP1 mutations near the apical Sia-binding pocket, possibly resulting from positive selection during PML development [175]. Complete blockage of the attachment of PML-mutant virus-like particles and partial reduction of binding of wild-type JCPyV genotype 2 to either SFT cells (gliosarcoma cells with the SV40 large T antigen) or ART cells (ovarian cancer cells with the SV40 large T antigen) by heparin and HS20, which are GAGs, were observed. These findings together with results of other experiments, including experiments on infectivity in the presence of various treatments that inhibit engagement of GAGs or sialyl receptors, indicated that while the wild-type JCPyV strain can use either GAGs or sialylated glycans for infection, PML-VP1 mutant JCPyV strains have lost the ability to bind to sialylated glycans but use GAGs as alternative receptors for infection. Experiments with wild-type BKPyV and a constructed BKPyV-VP1 Phe76Trp mutant showed the same results. When the GM1 receptor is unavailable, GAGs are used as alternative receptors for SV40 infection in some cell types [175]. These findings provide a better understanding of virus evolution and of a switch of receptor-binding specificity that triggers alteration of virus tissue/cell tropism and virus pathology, an understanding that is critical for diagnosis and treatment.

5 Molecular and Structural Basis of Viral Lectin–Sialyl Glycan Interactions

The role of sialic acid in viral attachment, which is a key first step in infection of many viruses, has sparked much interest in characterization of viral lectin-Sia complexed structures. Characterization of the structures will facilitate further studies on a single amino acid mutation or a few mutations in the viral lectin binding pocket and investigation of the biological effects of the mutation or mutations for better understanding of virus evolution and variation in receptor-binding specificity, which is important for development of a plan for viral control. Characterization of the complexed structure will also facilitate studies on modifications of the sialyl glycan structure binding to the viral lectin that would pave the way for design of antivirals against viral lectin–Sia interactions. However, only a limited number of viral lectin-sialyl glycan complexes have been co-crystallized for structural studies of their interactions. In the family Caliciviridae, only the crystal structure of norovirus P lectin of a GII.9 strain (VA207) in complex with α2-3-sialyl-Lex tetrasaccharide (PDB, 3pvd) has been determined, but the Sia residue is far from the binding site and does not participate in binding, and VA207 virus was thus classified as a Lewis carbohydrate-binding virus, not a Sia-binding virus [176]. Thus, only interactions between a sialyl glycan and viral lectins of Orthomyxoviridae human influenza H3 HA trimer, Paramyxoviridae mumps HN tetramer, Coronaviridae porcine torovirus HE, Picornaviridae CVA24v VP1/VP2/VP3/VP4, Reoviridae reovirus T1L σ1 trimer, Reoviridae reovirus T3D σ1 trimer, Adenoviridae HAdV-D37 fiber trimer, Parvoviridae AAV VP3 monomer, and Polyomaviridae SV40 VP1 pentamer have been analyzed using PyMOL and are illustrated in Fig. 3. Their PDB structural information is shown in Table 3. As shown in Fig. 
3a, interactions of the receptor-binding site located on the head of human H3/2010 HA with pentasaccharide Neu5Acα2-6Galβ1-4GlcNAcβ1-3Galβ1-4Glc (LSTc), a human respiratory receptor analog [177], indicate that negatively charged Neu5Ac has direct hydrogen bond interactions with Tyr98, Tyr137, Ser136, Asn145, Ser227, and Ser228. A change of Tyr98 to nonhydroxyl Phe and a change of Ser136 to negatively charged Asp found on H17 and H18 hemagglutinins make them lack the ability to bind to negatively charged Sia. Ser228 is well known to be a critical determinant of the receptor-binding specificity of H3 HAs to the α2-6-sialyl linkage [24], different from H1 HAs that contain D190/D225 as critical determinants for α2-6-sialyl linkage [32]. Gly225 in human H3/2010 HA binds directly to Gal-2. Asn193 and Tyr159 provide direct hydrogen bond contacts with Glc-5 of long α2-6Sia receptors that are found in the human bronchus [44] but are rare in human alveoli [24].

Table 3 Crystal structures of viral lectins in complex with sialyl glycans that were used for analysis

Crystal structures of mumps HN in complex with the trisaccharide α2-3-sialyllactose have been determined, whereas no co-crystal structure with α2-6-sialyllactose could be detected, indicating that mumps HN cannot bind efficiently to the α2-6-sialyl linkage [51]. The difference in sialyl linkage binding preference of human mumps HNs from human influenza HAs indicates that further analysis of sialyl glycans expressed on mumps target sites including human parotid gland epithelial cells is needed for clarifying the different tissue tropism of these viruses. Neu5Ac of α2-3-sialyllactose is bound by an arginine triad (Arg180, Arg422, and Arg512), Glu264, and Tyr323, whereas Glc-3 of the compound forms a hydrogen bond with Val476 in a top pocket of the HN head domain. These interactions suggest that the third sugar of the sialyl glycan is an important part of a receptor determinant for mumps virus. Further studies on receptor-binding specificity of this virus using α2-3-sialyl glycans with variation in the asialo portion should provide an insight into key binding determinants required for mumps infection. Currently known interactions suggest that α2-3Sia but not an α2-6 analog may be developed as a mumps inhibitor.

The structure of the esterase-deficient PToV HE (Ser46Ala mutant) strain Markelo in complex with the synthetic receptor 4,9-di-O-acetyl-Neu5Acα2Me [64] shows the receptor ligand in the HA lectin domain (R, receptor-binding site) on the top of an HE homodimer. 4,9-Di-O-acetyl-Neu5Ac- forms hydrogen bonds with four amino acid residues, Arg161, Tyr164, Glu220, and Ser222, in the R domain and with one residue, Tyr118, in the esterase (E) domain. Studies on substrate specificities of enzymatic de-O-acetylation of PToV and BToV HEs indicated that PToV HE prefers 9-mono-O-acetylated Sias (Neu5,9Ac2), whereas BToV HE can use both mono- and di-O-acetylated Sias (Neu5,7,9Ac3) as substrates. It is assumed that esterase and HA lectin pockets coevolved, and the PToV HE receptor-binding site seems to bind 9-mono- and exclude 7,9-di-O-acetylated Sias. Observation of the synthetic receptor 4,9-di-O-acetyl-Neu5Acα2Me in the receptor-binding pocket of an esterase-deficient PToV HE indicated that the receptor-binding domain of PToV HE can bind to 4,9-di-O-acetyl-Neu5Ac- and possibly that an O-acetyl group at C7, but not at C4, is obstructed, presumably by the side chains of Val166 and Tyr118 near C7 of the Neu5Ac glycerol side chain. However, the difference in PToV and BToV HE esterase-substrate specificities indicated that viral HE proteins have adapted to their hosts [64]. Further analysis of glycan profiles on each of the host mucins and host target cells on the mid-jejunum to distal ileum together with HE amino acid sequencing and direct receptor-binding assays and esterase-substrate specificity assays should lead to an understanding of host-driven viral HE evolution of interactions between host-specific HE proteins of toroviruses and O-acetylated Sias.

Crystallization of CVA24v with several commercially available sialyloligosaccharides that have differences in glycan composition and linkage, including 6′SL, 3′SL, 3′SLN, LSTc, GD1a, sialyl-Lex, GD1b, DSLNT, GM1, GM2, and GD3, revealed that the virus has preferential binding to α2-6-sialyllactose (6′SL) and disialyllacto-N-tetraose (DSLNT, a hexasaccharide (Neu5Acα2-3Galβ1-3(Neu5Acα2-6)GlcNAcβ1-3Galβ1-4Glc-) that carries Neu5Acα2-3Gal- and Neu5Acα2-6GlcNAc- terminals); very weak binding to α2-3-sialyl glycans in 3′SL, 3′SLN, and sialyl-Lex; and undetectable binding to α2-8, α2-3-disialyl glycan in GD1b and GD3 and branched α2-3-sialylated glycans in GM1, GM2, and GD1a [90]. Figure 3d shows the crystal structure of CVA24v soaked with DSLNT. Neu5Ac of DSLNT is found in a shallow binding site at the tip of the crown, viral capsid protein VP1 on the virus shell. The Neu5Ac forms hydrogen bonds with Ser147 and Tyr145. Tyr250 was recently shown to be responsible for emergence of the CVA24v/1985 pandemic [98], and it was also shown to provide hydrogen bond formation with Neu5Ac in its clockwise rotated (cw) form due to flexibility of the VP1 structure (not shown here) [90].

Crystal structures of the T1L σ1-GM3 complex [116] and the T3D σ1-α2-3-sialyllactose complex [117] are shown in Fig. 3e and Fig. 3f, respectively. Both crystals show an elongated trimeric fiber of the outer-capsid protein σ1 that carries the Sia-binding sites in different regions: on the head domain of T1L σ1 and on the body domain of T3D σ1. The structure of T1L σ1 in complex with the GM3 trisaccharide Neu5Acα2-3Galβ1-4Glc- (α2-3-sialyllactose) indicated that only the Neu5Ac sugar forms hydrogen bonds, with Asn353, Thr355, Ser370, Gln371, and Thr373. The lactose moiety of the GM3 trisaccharide does not make contact with the binding pocket. The crystal structure of the T1L σ1-GM2 complex (not shown here) revealed that in addition to the terminal Neu5Ac moiety, the other terminal moiety of GM2, GalNAc, provides additional optimal contact via van der Waals interactions in the binding pocket. This suggested that the additional terminal GalNAc plays an important role in contact with the T1L σ1 binding pocket, explaining why T1L σ1 has a higher binding preference for GM2 than for GM3 in a glycan array binding assay [116]. This is different from the interactions between T3D σ1 and α2-3-sialyllactose [117] in that the α2-3-sialyllactose structure in the T3D σ1 pocket adopts a topology different from that seen in the T1L σ1 pocket. There are two key sets of T3D σ1 residues having direct hydrogen bonds with α2-3-sialyllactose: (1) Ile201, Arg202, Leu203, and Gly205 anchor Neu5Ac and (2) Ser195 and Gly196 make optimal contacts with the third sugar, Glc. These findings demonstrate different binding preferences of T1L σ1 and T3D σ1 for sialyl glycans, which contribute to the different tropism of these viruses in the CNS of newborn mice studied for reovirus pathogenesis; T1L virus is mainly found in ependymal cells, whereas T3D virus is commonly found in neurons [178]. These findings may also be useful for manipulating these viruses to specifically infect target cells for therapeutic applications.

The HAdV-D37 knob does not bind to immobilized gangliosides including GD1a, but a glycan part of GD1a can be docked into the HAdV-D37 knob, and it is thought that HAdV-D37 binds to the GD1a glycan part-like structure on glycoproteins rather than to the GD1a ganglioside. The crystal structure of the HAdV-D37 fiber knob in complex with GD1a oligosaccharide was then determined [138]. Figure 3g shows the specific interactions between GD1a oligosaccharide, Neu5Acα2-3Galβ1-3GalNAcβ1-4(Neu5Acα2-3)Galβ1-4Glc-, and an HAdV-D37 trimeric fiber capsid protein that protrudes from another capsid protein, a penton base. The two terminal Neu5Ac moieties of the GD1a glycan were found in different protomers (shown in salmon and sky blue colors) in Sia-binding sites in the HAdV-D37 knob. The HAdV-D37 knob residues, Tyr312, Pro317, and Lys345, of each protomer provide direct hydrogen bond formation with each terminal Neu5Ac. Lys345 in one protomer (sky blue color) also provides indirect hydrogen bond formation with GalNAc-3 via a water molecule (not shown). In silico substitution of Lys345 to Ala suggested and an in vitro experiment confirmed a crucial role of Lys345 in Sia-binding specificity of the HAdV-D37 knob. The Lys345Ala mutation leads to almost complete abolishment of binding of the HAdV-D37 knob to cells [138], indicating that the ability of this protein to bind to Sia on the cell surface is important for HAdV-D37 knob-cell binding. The importance of Sia for HAdV-D37 infection was verified when HAdV-D37 infection of human corneal epithelial cells was decreased in the presence of multivalent Sia conjugated to human serum albumin compared to that in the presence of a monovalent Sia-conjugated inhibitor [179]. This finding highlights the possibility of development of multivalent Sia-containing antiviral drugs for specific treatment of HAdV-D37-infected conjunctival and corneal cells.

The AAV1 crystal was soaked with an α2-3-sialyl-LacdiNAc trisaccharide, Neu5Acα2-3GalNAcβ1-4GlcNAc (3′SLDN), a top-hit glycan bound by AAV1 as determined by a glycan array. Only the negatively charged C1 carboxylate of the terminal Neu5Ac was found to form hydrogen bonds, with residues Asn447 and Arg448 in a Sia-binding pocket at the base of the protrusions of VP3 monomers (Fig. 3h). Residues Ser268, Asp270, Asn271, Asn447, Arg448, Ser472, Val473, Asn500, Thr502, and Trp503, which lie within 4 Å of the Neu5Ac, were considered to provide potential Neu5Ac binding contacts [180]. The AAV1-Neu5Ac interactions were used to guide rational structural engineering of AAV1 and AAV6 vectors (which differ in only 6 of a total of 736 residues) in order to improve therapeutic efficacy. For example, site-directed mutagenesis of some of the residues forming the Sia-binding pocket indicated that the S472R mutation can increase binding to Sia for both AAV1 and AAV6 [180]. One of the six residues that differ between AAV1 and AAV6, Glu531 in AAV1 (shown as a purple stick in Fig. 3h) but Lys531 in AAV6, appeared to be responsible for the ability of AAV6 to bind heparan sulfate proteoglycan (HSPG). This binding ability of AAV6, which AAV1 lacks, is one factor contributing to the difference in tissue tropism (Table 2) of these two viruses [181].
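The 4 Å criterion used above to list potential contact residues can be illustrated with a minimal sketch. The coordinates and residue labels below are hypothetical placeholders, not values taken from the AAV1 structure or any deposited model, and the function name is our own; the sketch only shows how a distance cutoff turns atomic coordinates into a list of candidate contact residues.

```python
import math

CUTOFF = 4.0  # distance cutoff in Å, as in the contact criterion described in the text


def residues_within_cutoff(protein_atoms, ligand_atoms, cutoff=CUTOFF):
    """Return sorted labels of residues with at least one atom within
    `cutoff` Å of any ligand atom.

    protein_atoms: list of (residue_label, (x, y, z)) tuples
    ligand_atoms:  list of (x, y, z) tuples
    """
    near = set()
    for label, coords in protein_atoms:
        if any(math.dist(coords, lig) <= cutoff for lig in ligand_atoms):
            near.add(label)
    return sorted(near)


# Hypothetical toy coordinates (illustration only, not experimental data).
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
protein = [
    ("ResA", (3.0, 0.0, 0.0)),   # 1.5 Å from the second ligand atom
    ("ResB", (0.0, 4.0, 0.0)),   # exactly 4.0 Å from the first ligand atom
    ("ResC", (0.0, 0.0, 9.0)),   # too far from both ligand atoms
    ("ResA", (10.0, 0.0, 0.0)),  # a distant atom of ResA; ResA still counts
]

contacts = residues_within_cutoff(protein, ligand)
print(contacts)
```

A residue is reported once even if several of its atoms fall within the cutoff. Proximity alone identifies only potential contacts; as the text notes, hydrogen bonds were observed for just two of the listed residues.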

In glycan array screening of 258 synthetic physiologically relevant oligosaccharides, GM1 oligosaccharide, Galβ1-3GalNAcβ1-4(Neu5Acα2-3)Galβ1-4Glc-, produced the highest binding signal with recombinantly produced SV40 capsid protein 1 (SV40 VP1). The GM1 oligosaccharide is found in a shallow groove at the outer surface of the SV40 VP1 capsid in a crystal of the VP1 pentamer-GM1 oligosaccharide complex (Fig. 3i). There are two critical sets of VP1 amino acid residues: (1) Ser68 and Gln84, which directly interact via hydrogen bonds with Gal- of the terminal Galβ1-3GalNAcβ1-, and (2) Gln62, Ser68, Asn272, Ser274, and Thr276, which directly form hydrogen bonds with the terminal Neu5Acα2- on the Galβ1-4Glc stem of GM1 oligosaccharide. SV40 VP1–GM1 interactions are used as a model for comparison with the sialyl oligosaccharide–VP1 interactions of JCPyV and BKPyV, closely related viruses in the same family (74% amino acid identity among the VP1 proteins of these three viruses), and for studies on their cellular tropism [170]. For example, structural comparison of SV40 VP1-GM1 (one Neu5Ac- terminal and one Gal- terminal) interactions with BKPyV VP1-GD3 (one Neu5Acα2-8Neu5Ac- terminal) interactions suggested that the amino acid at position 68 may regulate selective oligosaccharide affinity [171]. A single point mutation at this site, from Lys in BKPyV VP1 to Ser as in SV40 VP1, appeared to switch the binding preference of BKPyV from GD3 to GM1 oligosaccharide, which was confirmed by an in vitro binding assay and in cell culture. This finding highlights the plasticity of the viral binding site, which can lead to a change in receptor-binding specificity and could in turn trigger a change in cellular tropism and pathogenicity [171]. 
In the future, SV40 VP1–GM1 interactions may also be used for designing antiviral drugs against SV40 infection, given the increasing number of reports associating SV40 with human cancers including human brain tumors, bone cancers, malignant mesothelioma, and non-Hodgkin’s lymphoma [165].

6 Conclusions, Perspectives, and Future Directions

Sialoglycoproteins and/or gangliosides are important for infection, pathogenesis, and transmission of several enveloped and non-enveloped viruses that use their spike glycoproteins and viral capsid proteins, respectively, as viral Sia-binding lectins. While some viral Sia-binding lectins may bind to Sia-containing materials, such as mucus, in order to persist in and pass through environments for transmission, most grasp host cell surface Sias (like doorknobs) as sole, primary, or co-receptors, leading to opening of the host plasma or endosomal membrane to allow the release of the viral genome into the host cell for multiplication. While Sia-binding therapeutic viruses are useful for prevention and/or treatment of animal and human diseases, Sia-binding pathogens (including influenza A viruses, EV70, and CVA24v) continue to threaten the health and life of humans and/or economically important animals through epidemics and/or pandemics. To support the development of effective and specific Sia-binding therapeutic viruses and the effective control of Sia-binding pathogens, we reviewed sialyl glycans and the binding of viruses to Sia from their history up to recent data, and we generated tables of Sia-binding viruses that provide a roadmap linking viral lectins, Sia-binding preference, disease, host range, tissue tropism, and entry pathway, in an attempt to identify similarities and differences among these viruses as well as what is still unknown and needs further study. It is remarkable that (1) most of the enveloped viruses in the families Orthomyxoviridae, Paramyxoviridae, and Coronaviridae have both Sia receptor-binding activity and Sia receptor-destroying enzyme (RDE) activity. (1–1) Why do some of these viruses have both functions on the same molecule, while others have the functions on separate molecules? 
(1–2) Why do Sia-binding enveloped MERS-CoV and TGEV and non-enveloped viruses (even though the genus Siadenovirus contains a putative sialidase homologue gene for which the function is not known [143]) not encode an RDE? How can these viruses be released from traps by decoy Sia receptors such as those on mucins? (1–3) While influenza A viruses must acquire the ability to bind to α2-6Sia human-type receptors for efficient transmission among humans, why do NAs of all influenza A viruses keep cleavage specificity for α2-3Sia? How can these viruses be released/spread from α2-6Sia-decoy mucins or infected cells? Further studies on receptor-binding and receptor-destroying structures, functions and evolution could lead to answers and consequently open avenues for efficient control of virus spread/transmission. In addition, we can summarize that (2) sialyl glycan-binding preferences of viruses are associated with glycans presented on tissues. For example, TGEV, which is a swine intestinal virus, prefers binding to Neu5Gc over Neu5Ac present in pigs, while human-adapted viruses, such as influenza A viruses, have reduced preference for binding to Neu5Gc, which is rarely detected in normal human tissues. (2–1) The targets of these Sia-binding viruses are most often eyes, respiratory system, intestine, and nervous system implying that these organs contain high levels of Sia. Thus, identification of sialyl glycan structures, being viral receptors, on these tissues of various animals should lead to an understanding of viral host and tissue tropism. (3) Viruses with a broad range of hosts, such as influenza A viruses, coronaviruses, EMCV, caliciviruses, and rotaviruses, have zoonotic potential, and surveillance of these viruses should be maintained and plans for responses to newly emerging zoonotic diseases should be made. 
(4) It is important to realize that infections with different viruses may cause the same disease manifestation (such as red eyes, possibly caused by influenza A viruses, EV70, CVA24v, and HAdV-D). (5) While the entry pathway of Sia-binding enveloped viruses is direct fusion at the host plasma membrane or fusion with the endosomal membrane, the entry pathway of Sia-binding non-enveloped viruses is endocytosis followed either by endosomal permeabilization/lysis, as a result of proteolysis and/or capsid conformational change (picornaviruses [84], caliciviruses [109], reoviruses [111], and adenoviruses [131]) or of lipolysis (parvoviruses [146, 148]), or by the ERAD pathway (polyomaviruses [166]); the exception is EMCV, whose genome possibly penetrates directly through the plasma membrane. (6) Molecular and structural studies have indicated that Sia-binding sites are usually on the heads of viral lectins, except for the reovirus σ1 capsid protein of the T3D strain, which carries its Sia-binding sites on its body. It remains unknown why the Sia-binding sites are on the T3D σ1 body and on the T1L σ1 head, whereas the JAM-A-binding sites of both viruses are on their σ1 heads.

Integrated analyses of similarities and differences among Sia-binding viruses in the above fields and of viral lectin–sialyl glycan interactions will provide useful data for the design and development of new tools for combating pathogen infections and for the improvement of therapeutic viruses, as follows. (1) Many simple tests have been established for detecting changes in the receptor-binding preference of viruses with pandemic potential, such as viral NA-based detection [27] and immunochromatographic-based detection [182]. These assays can indicate whether α2-3/α2-6Sia-binding viruses are present in clinical samples and whether they bind specifically to α2-3Sia or α2-6Sia receptors. Because samples, especially poultry stool samples, may contain more than one Sia-binding pathogen as a result of co-infection, further improvements in detection methods, such as the development of tags for labeling the individual viruses expected in a sample, are needed. A receptor-binding-based diagnostic method that is able to identify a virus at the level of family, genus, species, subtype, or strain will be more specific and informative than current detection methods for viral identification, surveillance in pandemic preparedness, and treatment. (2) Viral lectins are potential drug targets because they are required for the crucial first stage of the virus life cycle, and several inhibitors of viral lectins have been developed, but most of them act against common pathogens causing widespread severe and life-threatening diseases in humans, such as sialylmimetics against rotavirus infection [183], trivalent sialic acid-based inhibitors to treat EV-D68 infections [184], and 6′SLN-lipo PGA (Neu5Acα2-6Galβ1-4GlcNAcβ1-eicosanoyl chain poly-α-l-glutamic acid) against influenza epidemics and pandemics [185]. Some viruses are Sia-independent, such as rotavirus human strains K8, KU, MO, and Wa [121] and EV-D68 strains 947, 1348, and 742 [95]. 
Some Sia-binding pathogens, such as MERS-CoV, CVA24v, JCPyV, and MCPyV, use more than one receptor for infection. In addition, some viruses, including SV40, BKPyV, and JCPyV in the family Polyomaviridae, can use an alternative receptor for infection. These facts should be taken into consideration in drug design and treatment. Moreover, sialyl glycans are important for both the host and Sia-dependent viruses, and antiviral drugs should thus be designed to be selective for viral lectins over their host lectin counterparts in order to reduce host toxicity during treatment. (2–1) The viral lectin inhibitor 6′SLN-lipo PGA, which inhibits influenza infection by blocking the attachment of HAs (influenza lectins) to cell surface Sia receptors, appeared to synergize with either of the two FDA (Food and Drug Administration)-approved NA inhibitors (oseltamivir carboxylate and zanamivir) [185], suggesting that viral lectin inhibitors may be used in combination with inhibitors of viral receptor-destroying enzymes (anti-NA/anti-esterase agents) for enhanced inhibitory activity, minimized toxicity, and delayed development of resistance. (2–2) Some compounds, such as mumefural and its derivative [186] and Neu5Ac3αF-DSPE (C-3-fluorinated sialyl distearoylphosphatidylethanolamine) [187], inhibit influenza virus infection by inhibiting both HA and NA functions, suggesting that a molecule with dual inhibitory functions against binding and release can be designed as a new antiviral chemotype. (2–3) Some different viruses infect the same cells and share Sia-binding specificity. For example, EV70, CVA24v, and HAdV-D can bind to and infect corneal cells through α2-3Neu5Ac-containing glycans. Thus, it is possible to design and develop broad-spectrum Sia-based antiviral drugs against Sia-binding pathogens infecting the same cells. 
(3) There is a Chinese saying “use the enemy to kill the enemy.” Some Sia-binding viruses, such as NDV [188], reoviruses [189], adenovirus vectors pseudotyped with fibers from HAdV-D [190], and AAVs [191], have been investigated for treatment of human diseases. Based on its oncolytic properties, that is, its ability to preferentially infect and lyse cancer cells (so-called oncolytic virotherapy), reovirus T3D has been developed as pelareorep (Reolysin®), which received FDA orphan drug designation in 2015 for treatment of malignant glioma and FDA fast track designation in 2017 for treatment of metastatic breast cancer. It continues to be investigated for treatment of other cancers and cell proliferative disorders, including malignant melanoma [189]. The lack of pathogenicity of AAVs has allowed them to be extensively investigated as gene-therapy vectors (called viral gene therapy). Alipogene tiparvovec (Glybera®) was developed as an AAV1 vector and was approved in 2012 by the European Commission for treatment of familial lipoprotein lipase deficiency (LPLD) in adult patients who have severe pancreatitis despite a strict low-fat diet [192]. Normally, the low abundance of viral receptors on the abnormal cells that need to be treated is a key limitation of the use of oncolytic viruses and viral vectors to treat human diseases. Thus, modification of viruses to bind efficiently to receptors present on the target cells is critical for virus infection. Typically, Sia-binding therapeutic viruses, including reoviruses, AAVs, and vectors pseudotyped with HAdV-D fiber, have more than one binding site, binding to a Sia receptor and to one or more other receptors. Designing a therapeutic virus to bind efficiently to at least two receptors may help to prevent target cells from escaping viral transfection/transduction through mutation of a single host receptor. Also, a virus that binds efficiently to two receptors could efficiently infect the host cell and be transmitted specifically among the target host cells. 
There is evidence that in addition to using ICAM-1 as an essential receptor, CVA24v, which emerged from CVA24 in 1970, caused the AHC pandemic in 1985 after adaptation of its VP1 to efficiently bind to Sia, an attachment receptor supporting ICAM-1-mediated infection of CVA24v [98]. It has also been shown that Sia-dependent rotavirus strains grow more efficiently than Sia-independent rotavirus strains in Sia-containing cells [121]. Thus, further analysis of receptor structures on host cells that need to be treated, experiments on virus binding preferences to sialyl glycan structures found on the host cells, and X-ray crystallographic studies of virus–sialyl glycan interactions coupled with structure-guided mutagenesis are needed to provide blueprints for engineering viral proteins to bind efficiently to specific sialyl glycans on the target host cells. This increased virus-Sia receptor affinity in cooperation with other receptor binding on the target cell could confine the virus to a specific plasma membrane of target cells and increase viral entry efficiency.

Finally, although most viruses have the ability to adapt to their environment, continued and sustainable improvements in surveillance/diagnostic tools and antivirals could lead to the eradication of viral pathogens, as in the case of smallpox virus, which was declared eradicated in 1980. With continued improvements in research technology, it will be possible to generate recombinant therapeutic viruses that bind more efficiently to dual receptors for synergistic internalization into target cells. This could prevent changes in cellular proteins that confer resistance to virus entry, reduce undesired toxicity to normal cells, and increase target cell specificity and transfection/transduction efficiency, leading to improvement of these viruses for treatment of animal and human diseases.