Of all the major classes of macromolecules in biology, glycoconjugates are amongst the most complex. This complexity is comprised not only of a large variety of potential monosaccharides, linkages and branching structures, but also a remarkable degree of intra- and inter-species diversity [1, 2]. The latter has long been one of the more puzzling aspects of their biology. One reasonable explanation is evolutionary selection, driven by the ongoing glycan-based interactions of hosts with their pathogens and symbionts [1, 3]. Of course, if such interactions were the sole cause of glycan heterogeneity, one would not find the negative consequences of genetically altering various glycan types in model organisms, and in human genetic disorders of glycosylation [4, 5]. Thus, glycans must have functions that are both intrinsic and extrinsic to the organism synthesizing them, functions that at times may be at odds with one another.

This review is about a class of monosaccharides with a nine-carbon backbone called sialic acids (Sias), which are typically found at the outermost ends of glycan chains in the deuterostome lineage of animals, and on certain bacterial species [6–8]. Given their location, Sias are intimately involved in recognition processes mediated by both intrinsic and extrinsic Sia-binding proteins [9, 10]. The focus of this review is on the unexpectedly high frequency of differences in Sia biology between humans and our closest evolutionary cousins, the so-called “great apes” (chimpanzees, bonobos, gorillas and orangutans, now classified along with humans as “hominids”). Given that there are <60 genes known to be involved in sialic acid biology [11], these findings imply that a series of related events occurred in this system during the course of human evolution [10].

1 N-glycolylneuraminic acid: the “Deuterostome-specific” sialic acid

From the early days of the discovery of Sias it was evident that there were many structural variants of their common 9-carbon backbone [12, 13]. One prominent Sia detected in many mammals was N-glycolylneuraminic acid (Neu5Gc), which differs from the other common Sia N-acetylneuraminic acid (Neu5Ac) by one additional oxygen atom in the acyl group at the C5 position [14]. Many groups investigated the biosynthetic pathway of Neu5Gc, and the answer emerged from work by Schauer and co-workers, who discovered the hydroxylase/monooxygenase enzyme involved [15, 16]. Detailed studies by Schauer, Suzuki, Kozutsumi and their co-workers later established that the enzyme worked at the CMP-Sia level, converting CMP-Neu5Ac to CMP-Neu5Gc, in a complex mechanism requiring a variety of co-factors, including cytochrome b5 and b5 reductase, iron, oxygen and NADH [16–22]. The conversion at the CMP-Sia level was also demonstrated by us in intact cells, using pulse-chase experiments [23]. Perhaps because of the multiple factors required, this CMP-Neu5Ac hydroxylase (CMAH) enzyme activity has not been reported in prokaryotes, nor in any non-deuterostome lineage animals. Thus, Neu5Gc appears to be a marker of the deuterostome lineage of animals (vertebrates and so-called “higher” invertebrates), and likely represents a unique evolutionary experiment that occurred at or just before the Cambrian expansion, ~500 million years ago.

2 Apparent lack of Neu5Gc in human tissues

Even in the early days of Sia studies it was noted that Neu5Gc was hard to find in normal human tissues [24, 25]. However, the methods used could have missed small quantities, and evidence was then provided for Neu5Gc in tumors and fetal meconium [26–30]. Thus it was presumed that Neu5Gc was an “oncofetal” antigen in humans, resulting from a gene that was turned off after fetal development, and then turned on again in cancer cells. This concept seemed to be supported by the findings that the “serum sickness” reaction of adult humans to horse serum infusions was partly directed against Neu5Gc [31, 32], and that similar “Hanganutziu–Deicher” antibodies were found in patients with cancer (reviewed in ref. [33]). However, the oncofetal theory was laid to rest in 1998, when two groups independently discovered a human mutation causing irreversible inactivation of the CMAH gene [34, 35]. Parallel studies also confirmed the absence of Neu5Gc in human blood samples, using modern high-sensitivity methods [36].

3 Discovery of a human-specific mutation in CMAH

While two groups reported the same 92 base pair deletion in the CMAH cDNA, there was a difference in the details. The first published report [34] made a major contribution by sequencing the entire relevant region of the human genome (a heroic task in the 1990s) and showing a deletion eliminating the 92 base pair exon 6. However, difficulties in cloning the 5′ region of the CMAH-encoding cDNA led to the assumption that the deletion resulted in an “in-frame stop codon”, and a suggestion that the human CMAH homologue was inactive because it lacked an N-terminal domain essential for enzyme activity [34]. Our group published a somewhat different conclusion, working with cDNAs and genomic PCR [35]. Although we found the same 92 base pair deletion in the cDNA and the human genome, the completed 5′ region of the cDNA showed that the deletion actually results in a frame shift mutation, causing early translation termination, and allowing production of only a very small 72 amino acid protein. Furthermore, we went on to show that this deletion occurred in all human populations, but not in any of the African great apes, indicating that the mutation event is both human-specific, and occurred prior to the common ancestor of modern humans [35].
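The frame shift follows from simple reading-frame arithmetic: a deletion within coding sequence preserves the downstream frame only if its length is a multiple of three. The short sketch below illustrates this for the 92 base pair deletion; it assumes all deleted bases lie within the coding region, and the function names are hypothetical, introduced here only for illustration.

```python
# Illustrative sketch only: reading-frame arithmetic for a coding-sequence deletion.
# Assumption: every deleted base lies within the open reading frame.
CODON_LENGTH = 3  # nucleotides per codon


def frame_offset(deleted_bp: int) -> int:
    """Bases by which the downstream reading frame is displaced (0 = in frame)."""
    return deleted_bp % CODON_LENGTH


def shifts_reading_frame(deleted_bp: int) -> bool:
    """True if a deletion of this length disrupts the downstream reading frame."""
    return frame_offset(deleted_bp) != 0


deletion = 92  # size of the exon deletion described in the text
print(shifts_reading_frame(deletion))  # True
print(frame_offset(deletion))          # 2
```

Because 92 leaves a remainder of 2 when divided by 3, every codon downstream of the deletion is read out of register, which is consistent with the early translation termination described above.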

4 When and why did humans become deficient in Neu5Gc production?

The human CMAH inactivating event was then shown to be an Alu-mediated genomic deletion [37], which originally occurred in one chromosome of one individual, and is now universal to humans. A major collaborative effort involving investigators from four continents then showed that the human mutation occurred prior to our common ancestor with Neanderthals (~0.5 million years ago), and likely ~2.5–3 million years ago, prior to the origin of the genus Homo [38]. Furthermore, haplotype studies of human populations suggested a very deep history, with a coalescence time of ~2 million years for the mutation [39], suggesting that selection rather than random drift was involved in the fixation of this mutation in human ancestral populations.

However, because of the depth of time involved, it is not possible to confidently detect the signatures of such selection in the genome. Thus, we are left to speculate about whether selection actually occurred to drive this mutation to fixation in human ancestral populations. If it did, the most likely candidate was some form of infectious disease, in which a pathogen (such as the “great ape” form of malaria, see below) or a bacterial toxin bound specifically to Neu5Gc, so that individuals who became homozygous for the CMAH mutation were protected. Another possibility is a change in binding preference of an important Sia-binding protein, favoring the loss of Neu5Gc and/or the accumulation of an excess of the metabolic precursor, Neu5Ac. A third possibility is that the ability of CMAH null individuals to generate anti-Neu5Gc antibodies (see below) protected them from enveloped viruses that originated from individuals with intact Neu5Gc expression, as is postulated to occur with other glycan variations associated with circulating antibodies [1, 40]. A fourth (not mutually exclusive) possibility is that the loss of Neu5Gc facilitated the speciation of the Homo lineage (Pascal Gagneux, personal communication) [41].

5 A “Sialoquake” in human evolution?

A single genetic mutation in Sia biology could have become universal to humans simply as a result of such a random mutation drifting to fixation in a small effective population size [14]. However, it was subsequently found that at least 10 other genes involved in Sia biology have undergone human-specific changes [10, 11]. Most of these additional changes seem to have occurred in the family of sialic acid-recognizing receptors called Siglecs (a Sia-binding family of immunoglobulin superfamily lectins) [42, 43]. As shown in Fig. 1 and Table 1, these changes range from gene inactivations, to specific amino acid changes altering function, to expression differences in different cell types. Given that fewer than 60 genes have so far been found to be directly involved in Sia biology [11], it seems unlikely that all these events occurred by chance. Thus, for want of a better word, we have suggested that human evolution was associated with a “sialoquake”, involving a series of related events in this system [10, 41]. The following sections outline each of these changes, indicating possible and probable implications for human evolution and physiology, and potentially for human disease. As with any such evolutionary discussions, the scenario presented in Fig. 1 and the details presented in Table 1 are likely to change over time, as additional information arises.

Fig. 1

Suggested scenario for multiple changes in sialic acid biology during human evolution. The initial loss of Neu5Gc expression during human evolution could have occurred randomly, or due to selection by a pathogen that preferentially recognized Neu5Gc on cell surfaces (e.g., a form of hominid malaria, or a bacterial toxin). Regardless of the reason, the resultant loss of Neu5Gc-binding sites for some CD33rSiglecs should have generated unusual immune activation, and further selection was likely required to allow adjustment for binding of Neu5Ac in some Siglecs, with elimination or loss of binding by others. Thus, all the other human specific changes in sialic acid biology discussed in the text could have resulted from adjustments to the original event of CMAH inactivation. Of course, other scenarios are possible. A non-genetic complexity is also indicated, in which dietary Neu5Gc can accumulate in human tissues in the face of an anti-Neu5Gc response, potentially facilitating diseases associated with chronic inflammation. (Modified from Varki A. Nature 446: 1023, 2007). Note that while CD33rSiglecs are shown as binding to sialic acids on the same cell surface, they can also potentially detect high densities of sialic acids on other cell surfaces or on Neu5Ac-expressing pathogens (which are common in humans)

Table 1 Uniquely human changes in sialic acid biology

6 Consequences of CMAH loss

Regardless of the original reasons for CMAH gene inactivation, there are multiple consequences. The loss of the CMAH enzyme would have caused a major change in the cell surfaces of ancestral hominids, with the loss of Neu5Gc and the accumulation of an excess of the precursor, Neu5Ac. This, in turn, would have resulted in a loss of proper binding to some of the major Siglecs of innate immune cells (e.g. Siglec-7, -9 and -11), which appear to strongly prefer Neu5Gc in chimpanzees and gorillas [44, 45], and presumably also in our shared common ancestor. Thus, there was likely a period of time when these Siglecs would have had a limited ability to recognize endogenous sialic acids as “self” (Fig. 1), perhaps resulting in a state of hyper-reactivity in the innate immune system. Adjustments in the binding pockets of these Siglecs have since occurred, allowing the binding of Neu5Ac [44, 45] (Fig. 1). Interestingly, this is a relaxation of binding specificity rather than a specific switch in binding preference—implying that the adjustment might not yet be complete.

A second consequence is a change in pathogen regimes, initially dictated by the fact that pathogens that bind Neu5Gc would no longer be able to infect humans. In contrast, those that bind Neu5Ac would have a special preference for human cells, because of the great increase in density of this precursor sialic acid. Examples of both scenarios can be found. E. coli K99 [46], transmissible gastroenteritis coronavirus [47], and simian virus 40 (SV40) [48] all prefer Neu5Gc-containing glycans for optimal binding and invasion. Thus, humans are expected to be resistant to these pathogens. The first two of these cause serious diarrheal disease in domesticated livestock, but do not appear to infect the humans who manage them. With regard to SV40, it is interesting that while humans were directly exposed to this virus due to contamination of poliovirus vaccines in the 1950s and 1960s, no major deleterious consequences have since ensued [49]. Further studies are needed to see if other pathogens of animals that live in close contact with humans preferentially recognize Neu5Gc, giving humans the advantage of intrinsic resistance.

The converse situation applies to pathogens such as P. falciparum malaria. Our recent studies showed that the major binding protein of this pathogen (in its merozoite stage) preferentially recognizes Neu5Ac in the process of red blood cell invasion [50]. In striking contrast, the corresponding major binding protein of the chimpanzee/gorilla malarial parasite P. reichenowi preferentially recognizes Neu5Gc. This may explain why humans and chimpanzees are relatively or absolutely resistant to the malarial pathogen derived from each other [51, 52]. This also raises the possibility that the original selecting agent for elimination of Neu5Gc may have been a severe form of P. reichenowi-like malaria, and the outcome would have been humans who were completely resistant. Later, a variant form of the chimpanzee malarial organism could have emerged that preferentially recognized Neu5Ac [50]. Indeed, this is supported by recent evidence that the human P. falciparum malaria organism emerged only over the last tens of thousands of years [53–56]. Analysis of multiple chimpanzee malarial isolates of P. reichenowi is needed, to ask if P. falciparum indeed arose more recently from a host transfer back from a chimpanzee, and such studies are currently being conducted by other investigators. Additional examples of relative human sensitivity and resistance to Neu5Ac- and Neu5Gc-preferring pathogens and toxins are also currently being pursued.

Another consequence of the loss of Neu5Gc is that it became a foreign antigen. This is of potential significance because of evidence that bound or free Neu5Gc from extracellular fluids can get incorporated into human cells, both in tissue culture [57, 58], and into the intact body (the latter from dietary sources) [57], and because all humans express varying levels of antibodies against glycans terminating in Neu5Gc [57, 59, 60]. These issues are discussed further below, and raise many new questions that are currently being pursued, both in a Cmah null mouse model, and in human populations.

7 Human-specific changes in genes involved in Sia biosynthesis

Comparisons of the mouse, rat, chimpanzee, and human genomes suggested that some other subtle human-specific changes may have occurred in certain enzymes of Sia biosynthesis [11]. These require further investigation, as to whether or not they represent unusual human adaptations. One gene that has likely undergone human-specific changes is ST6GAL1, which encodes the ST6Gal-I enzyme responsible for adding α2–6 linked Sias to the termini of N-glycans, generating the sequence Siaα2–6Galβ1–4GlcNAcβ1- [61]. This enzyme has widespread expression in many tissues, and shows several tissue-specific expression differences amongst multiple animals [62, 63]. A particularly striking feature is the human-specific selective up-regulation of Siaα2–6Galβ1–4GlcNAc-containing glycans in certain cell types, including blood cells and the respiratory epithelium (chimpanzees and gorillas appear more like mice in this regard, in not expressing this sequence at high levels) [64]. This difference in the respiratory epithelium is of interest because this uniquely human change likely protects us from a variety of pathogens that selectively bind to α2–3-linked Sias, particularly avian influenza viruses [65]. Indeed, human influenza virus strains bind preferentially to α2–6-linked Sias [66, 67], a binding specificity that is relatively uncommon for known pathogens. The mechanism by which this expression change on human respiratory epithelium [68] has occurred is unclear, but is presumably related to some aspect of the promoter region of the gene, and/or a change in a transcription factor. The possibility of involvement of a related enzyme ST6Gal-II [69, 70] must also be considered.

8 Changes in sialoadhesin (Siglec-1)

Siglecs are a Sia-binding family of immunoglobulin superfamily lectins that are widely distributed throughout various cell types in the immune system of primates and rodents [42, 43]. The evolutionarily conserved group includes Siglec-1 (sialoadhesin) [71], Siglec-2 (CD22) [72], Siglec-4 (myelin-associated glycoprotein) [73], and the recently discovered Siglec-15 [74]. Among these, at least one appears to have undergone a human-specific change, not in its binding properties, but in its expression pattern. Sialoadhesin, which is expressed in a subset of macrophages in the lymph nodes, spleen, and bone marrow of rodents [75] appears to be selectively upregulated in humans in comparison to chimpanzees, such that it is found almost universally on all splenic macrophages, with a strikingly different distribution in splenic follicles [76]. Also of interest is the fact that sialoadhesin expression is upregulated in a variety of common human diseases, including HIV infection [77, 78], rheumatoid arthritis [79], and cancer [80]. Given the strong preference of sialoadhesin for binding Neu5Ac over Neu5Gc [81], we have suggested that this change in sialoadhesin expression is related to the marked increase in endogenous Neu5Ac ligands during human evolution [76]. The biological significance of this human-specific difference needs further investigation.

9 Changes in binding specificity of some CD33-related Siglecs

The CD33-related Siglecs are a large family of rapidly evolving Siglecs, and are widely distributed in cells of the immune system. Many of them have cytosolic inhibitory tyrosine-based motifs, and we have postulated that they may thus serve as a simple “self” recognition system, to dampen unwanted innate immune responses against host cells bearing sialic acids [42, 43]. There appear to be quite a few human-specific changes in these molecules. First, as mentioned earlier, the major innate immune cell Siglecs-7, -9 and -11 have undergone amino acid changes in their binding pockets that result in tolerance for Neu5Ac binding, a derived state relative to the strong Neu5Gc-binding preference of the ancestral hominid orthologs [44]. It is not certain what this means in functional terms, but there was apparently a subsequent evolutionary adjustment in the Siglec binding profiles, and the question arises whether this adjustment has yet been completed. Regardless of the reason, this makes human innate immune cells theoretically more susceptible to the potential for molecular mimicry by pathogens that express Neu5Ac. We have hypothesized that such pathogens may take advantage of these Siglecs, thereby down-regulating innate immune responses [42]. While this hypothesis has yet to be proven, it is consistent with the finding that the majority of the Neu5Ac-expressing bacteria that are known to date are human pathogens, or even human-specific pathogens. In this regard it is interesting to note that human pathogens have used every conceivable biochemical mechanism to express Neu5Ac, many via convergent evolution [8]. Notably, in contrast to CD33-related Siglecs, sialoadhesin does not have the ability to deliver an inhibitory signal and, in fact, may serve to enhance phagocytosis of Sia-expressing bacteria [82]. In this regard, it is interesting that some of the Sia-expressing bacteria add O-acetyl groups to their Sias [8, 83], a modification that substantially reduces recognition of glycans by sialoadhesin [81, 84]. Meanwhile, Siglec-12 has undergone a human-specific mutation in a key arginine residue required for Sia recognition and, interestingly, the ancestral form preferred to bind Neu5Gc [85]. Additionally, Siglec-13 has been completely deleted in the human lineage [86]. It is possible that some of these findings are related to the original loss of Neu5Gc, and/or an ongoing evolutionary arms race between the Neu5Ac-expressing pathogens and the human host.

10 Human-specific expression of Siglec-6 in the placenta

As mentioned above, CD33-related Siglecs are primarily found on cells of the immune system. Surprisingly, Siglec-6 [87] was independently cloned by others from a placenta cDNA library [88]. Siglec-6 is indeed expressed at easily detectable levels in the trophoblast cells of the human placenta [87]. Interestingly, this expression is not found in placentae from chimpanzees, gorillas, and orangutans [89], even though potential Siglec-6 ligands are present in all these placentae. The expression of Siglec-6 in human placentae was variable, with the highest level found following normal full-term labor, and the lowest level seen in placentae removed during elective cesarean section, without the onset of labor [89]. These data suggest that Siglec-6 is upregulated during the process of labor, specifically in humans. In the absence of any other data, one can only speculate about the meaning of this finding. One possibility is that inhibitory cytosolic ITIM motifs of this molecule serve to down-regulate placental signals, to control or delay the process of labor. In this regard it is of interest that chimpanzees have, in contrast to humans, a very short labor process [90, 91]. While the great length of human labor is generally assumed to be because of the relative size of the fetal head and the maternal pelvic outlet, it is also possible that it is necessary to slow down the human birth process, to avoid further damage to the fetal brain and/or the maternal birth canal. Further studies will be needed to address this hypothesis.

11 Human-specific loss of CD33-related Siglec expression on T-cells

During the initial studies of Siglec expression on human blood cells, it was noted that T cells were the exception, expressing no or very low levels of CD33-related Siglecs. Surprisingly, this is a human-specific trait, in that easily detectable levels of multiple Siglecs are found on T cells from chimpanzees, bonobos, gorillas and orangutans [92]. Since the Siglecs in question are all also encoded in the human genome [86], this represents a selective down-regulation of expression on human T cells. Again, the evolutionary origins of this change are uncertain. Regardless, there does appear to be a functional consequence: human T cells were found to be more reactive than chimpanzee T cells when stimulated through the T cell receptor complex [92]. In contrast, strong non-specific stimulation using a lectin showed no major difference between human and chimpanzee T cells, suggesting that there is no intrinsic difference in the ability of the human T cell to respond. It is possible that the expression of CD33-related Siglecs (particularly Siglec-5) on chimpanzee T cells reduces or inhibits the ability of these cells to respond to physiological stimuli. Evidence supporting this hypothesis was obtained by either down-regulating the expression of Siglec-5 (the major Siglec of chimpanzee T cells), or by forcing expression of Siglec-5 on human T cells. In both cases the predicted responses occurred, i.e., chimpanzee T cells improved in their responses upon down-regulation of Siglec-5, and human T cells reduced their response upon expression of Siglec-5 [92]. A recent study [93] pointed out that the specific anti-CD3 antibody used in our work probably overestimated the extent of difference in chimpanzee and human T cell reactivity. Regardless, Fig. 1 of that paper continues to show obvious differences between human and chimpanzee T cells, with humans showing stronger responses overall, especially in the absence of anti-CD28 co-stimulation [93].

Taken together, all these data are consistent with the hypothesis that human T cells are prone to hyper-reactivity, perhaps explaining the human propensity for diseases associated with excessive T cell responses, such as rheumatoid arthritis, asthma, and other autoimmune disorders [94–96]. It is also possible that this can help explain why infection of T cells with HIV progresses rapidly to their loss and destruction in human AIDS, in contrast to chimpanzees, in whom the virus proliferates but does not result in large-scale loss of T cells [97]. Further studies are needed to confirm this overall hypothesis and whether or not the human loss of Siglec expression is the major cause. If so, a treatment that up-regulates Siglecs on human T cells may help to control human T cell-mediated diseases.

12 Human-specific expression of Siglec-11 in the brain

Siglec-11 is a CD33-related Siglec that is expressed in a subset of tissue macrophages of humans and other primates, and is believed to have an inhibitory function by virtue of its cytosolic ITIM motifs [98]. During the original characterization of human Siglec-11, its expression was also found in the microglia of the brain. At first glance this did not appear surprising, as microglia are essentially long-term macrophage-like cells derived from circulating blood monocytes. However, further studies showed several unique features of human Siglec-11 [45]. First, human SIGLEC11 has undergone a human-specific gene conversion with an adjacent pseudogene (recently shown to be SIGLEC16), a process involving the sequences encoding the first two amino-terminal domains of the molecule. Gene conversions are not uncommon, but they usually result in permanent damage to the gene that is converted. However, in the case of SIGLEC11 the open reading frame was maintained, giving a new molecule in which the amino-terminal sequences are rather different from those of the chimpanzee counterpart. Fortunately, the antibody originally raised against human Siglec-11 cross-reacts with the ancestral chimpanzee molecule. Use of this antibody showed that while chimpanzees and other great apes do express Siglec-11 in their tissue macrophages, they express it at very low levels in microglia.

As with many of these human-specific findings, it is difficult to be certain what the evolutionary selection mechanism or the current impact is [45]. As the ancestral chimpanzee form of Siglec-11 strongly prefers to bind Neu5Gc, this could have represented another evolutionary adjustment to the loss of Neu5Gc. Regardless of the reason, the gene conversion happened to include ~250 base pairs of the 5′ untranslated region, which may have helped induce expression on the microglia (our unpublished data). Once this happened there was potential impact not only on human resistance or susceptibility to brain-invading pathogens, but also on neuronal development and function. This is because long-lived microglia are not just immune cells [99], but are also known to have trophic functions, being involved even in the development of the brain and in the maintenance of some aspects of neuronal function [100]. Overall, it is reasonable to hypothesize that regardless of the original cause, the expression of Siglec-11 in microglia may have resulted in some changes in the human brain. It is also of interest that microglia play a prominent role in many human brain diseases such as Alzheimer disease, HIV-1-associated dementia and multiple sclerosis [101]. Unfortunately, mice do not have an ortholog of Siglec-11. Thus, unless we were to find a human defective in the synthesis of Siglec-11, it is difficult to pursue this hypothesis. The other alternative is to express Siglec-11 transgenically in the microglia of the mouse brain. Analysis of this matter is further complicated by a recent finding that some (but not all) humans have lost an activatory counterpart of Siglec-11, designated Siglec-16 [102]. It will be interesting to see if this was also (as suggested by the authors) a human-specific event.

13 Are there more examples of human-specific changes in CD33-related Siglec expression?

Overall, our data indicate that human evolution was associated with some major changes in expression patterns of multiple CD33-related Siglecs. It seems unlikely that each and every one of these was an independent event involving specific base pair changes in the promoter regions of individual Siglecs. Rather, given that most of these genes are clustered within an ~0.5 Mb region on chromosome 19 [86], it appears more likely that there were global changes in enhancers, locus control regions and/or epigenetic changes affecting the expression of the entire cluster. Thus, searching for additional uniquely human Siglec expression changes in other tissues and cell types appears worthwhile.

14 Restoration of Sia binding properties of Siglec-5 and -14

Siglec-5 is expressed at varying levels on many other blood leukocytes [103]. However, the situation is complicated by the fact that Siglec-14, which is encoded by an adjacent gene, has activatory rather than inhibitory properties [104]. Furthermore, the SIGLEC14 gene is undergoing lineage-specific gene conversion with the SIGLEC5 gene, involving the sequences encoding the first two domains of both proteins (a form of concerted evolution) [104]. Thus, the Sia-binding properties of Siglec-5 and -14 are kept the same within each primate species by ongoing gene conversion events. Surprisingly, an arginine residue in the V-set domain of these Siglecs that is critical for Sia recognition has been mutated in the gorilla, chimpanzee, and orangutan molecules, leaving an open reading frame, but a protein incapable of binding Sias. In contrast, the human Siglec-5/14 sequence appears to have restored the arginine residue, thereby regaining Sia binding properties [104]. The nature of the species sequence differences is such that one cannot be absolutely sure about the human-specific restoration. However, it seems less likely that each of the three other hominid lineages has independently mutated the arginine residue. The significance of this apparent restoration of human Sia binding of Siglec-5 and -14 is uncertain, as is the significance of the loss of binding in all the other hominid lineages. Regardless, it provides yet another example of a human-specific change in CD33-related Siglecs. Further work will be needed to analyze this issue, taking into account the additional difference with regard to lack of expression in human T cells.

15 Are humans really unique in having so many lineage-specific changes in sialic acid biology?

Despite the finding of so many human-specific changes in Sia biology, one must recognize that these changes occurred in a system that is prone to rapid evolution in many taxa, because of multiple selection pressures that have been discussed elsewhere [42]. The following points suggest that the human situation is unusual. First, all of the changes mentioned above are specific to the human lineage, with the other hominids studied showing no differences amongst each other. Second, comparisons of the Siglecs of mice and rats show few differences [86], despite the fact that these two species shared a common ancestor long before humans and chimpanzees did. Third, studies of the gene and protein sequences of the sialic acid binding Ig-like V-set domains of CD33-related Siglecs of multiple species show evidence of more rapid evolution in humans (with the sequences of the adjacent Ig C2-set domains providing good controls) [11]. Finally, we can construct an evolutionary scenario potentially connecting some of these changes to each other (see Fig. 1) [10]. Regardless, we must keep an open mind to the possibility that some of the human-specific changes in sialic acid biology are actually unrelated events, which happened to occur in a system that is intrinsically prone to rapid evolution. In this regard, it would be best to conduct a more detailed study of sialic acid biology in all the other hominids. Unfortunately, current NIH policies are making it increasingly difficult to conduct research of any kind in our closest evolutionary relatives, even research that is considered ethical in humans [105, 106].

16 Metabolic incorporation of Neu5Gc into normal human cells and tissues

Inactivation of the mouse Cmah gene leaves no detectable Neu5Gc in any tissues when examined by HPLC analysis [107], or even by mass spectrometry [108]. This suggests that there is only one gene dictating the biosynthesis of Neu5Gc in vertebrates. Despite this, we detected Neu5Gc not only in human carcinomas and fetal tissues (as expected from previous literature), but also in several normal tissue types, particularly in endothelium and epithelium of human surgical specimens or autopsy tissues [57]. This finding makes it likely that Neu5Gc is being incorporated from exogenous sources. In this regard, a human volunteer study confirmed that orally ingested Neu5Gc is indeed taken up into the human body [57]. The limited survey of foods that has been done so far indicates that the richest sources of Neu5Gc are red meats (lamb, pork, and beef), with bovine milk products also containing significant amounts. Thus we have hypothesized that long-term dietary intake of Neu5Gc, with incorporation into endothelium and epithelium, could combine with the circulating anti-Neu5Gc antibodies (see below) to stimulate chronic inflammation [10].

17 Circulating antibodies against Neu5Gc in humans

Many years ago it was shown that the so-called “heterophile antibodies” seen in certain human disease states can be directed against Neu5Gc-containing epitopes [31, 32, 109–113] (reviewed in ref. [33]). These antibodies were detected by the agglutination of animal erythrocytes, or by ELISA assays against high-molecular-weight glycoproteins from such erythrocytes [33, 114, 115]. Another assay involved the detection of Neu5Gc on a small glycolipid GM3(Neu5Gc) [31, 33, 116]. Using these assays it was reported that normal humans did not have anti-Neu5Gc antibodies [32, 111, 115, 117–120]. However, recent work using a more specific assay that takes into account the background and other controls has shown that all normal humans actually have significant levels of circulating antibodies against Neu5Gc [57, 59, 60, 121]. Indeed, some normal humans have remarkably large amounts of these circulating antibodies, even surpassing levels of some well-known natural blood group and xenoreactive antibodies [60]. Furthermore, these antibodies can induce complement deposition on Neu5Gc-fed human cells [59].

Notably, the Neu5Gc molecule cannot by itself fill the binding site (paratope) of an antibody, and can also be modified and/or presented in various linkages, on diverse underlying glycans. Following up on this, we have recently used a novel set of natural and chemoenzymatically-synthesized glycans to show that many normal humans have an abundant and quite diverse spectrum of such anti-Neu5Gc antibodies, each reacting differently with a variety of Neu5Gc-containing epitopes [60]. Contrary to the standard dogma about anti-glycan antibodies, the commonest anti-Neu5Gc antibodies in normal humans are also of the IgG class [60]. As noted above, earlier literature had also reported more easily detectable anti-Neu5Gc antibodies in patients with cancer, rheumatoid arthritis, infectious mononucleosis and other diseases. This finding is now being further pursued, using a novel glycan microarray.

18 Implications for dietary intake of Neu5Gc in humans

The major dietary sources of Neu5Gc appear to be foods of mammalian origin, and the major sites of accumulation (endothelia of blood vessels and epithelial cells lining hollow organs) [57] also happen to be the sites of diseases that seem to occur preferentially in humans, i.e. large-vessel occluding atherosclerosis and carcinomas of epithelial origin. Interestingly, both of these disease processes are epidemiologically associated with red meat or milk consumption [122–130], and are aggravated by chronic inflammation [131–137]. Furthermore, hypoxic conditions in tumors can up-regulate expression of the lysosomal Sia transporter that appears to be required for Neu5Gc incorporation into human cells [138].

Our current working hypothesis is that a combination of long-term Neu5Gc tissue incorporation with circulating antibodies against these glycans results in ongoing chronic inflammation in these cell types, contributing to uniquely human disease profiles. This hypothesis is currently being tested using the Cmah null animal as a human-like host. Epidemiological studies are also necessary in human populations. Such studies are complicated by the fact that the mechanisms of incorporation and turnover of Neu5Gc could vary between individuals, and the fact that antibody levels also vary widely between individuals. Overall, much further work needs to be done before we can come to any final conclusions about these interesting hypotheses.

19 Implications for Neu5Gc in biotechnology products

Regardless of the implications for human diseases, the incorporation of Neu5Gc from animal cells and/or animal-derived culture medium components into biotechnology products of various kinds intended for therapeutic use in humans [139–144] (including stem cells) [145–147] is of potential relevance, given that humans have such varying and sometimes high levels of anti-Neu5Gc antibodies [60]. Contrary to prior reassurances based on insensitive methods [148], further studies are needed to ascertain if there will be any short-term and long-term consequences of injecting humans with biotherapeutic products containing Neu5Gc. Potential complications to be considered include immediate hypersensitivity reactions, reduced half-life in circulation, immune-complex formation, boosting of existing anti-Neu5Gc antibody levels, enhancement of immune reactivity against the underlying polypeptide, and the direct loading of human tissues with more Neu5Gc.

20 Initial phenotyping of the Cmah-null mouse

Two groups produced mice null in the Cmah gene, either by neo-selected cassette insertion [107], or using a Cre-mediated excision of exon 6 that gave a genotype essentially identical to that of the human mutation [108]. The first mouse was studied primarily with regard to a defect in B cell biology, which arises because (unlike the case in humans) CD22 in the mouse is highly dependent on Neu5Gc as its preferred ligand. In this instance the loss of the ligand resulted in a phenotype similar to that of the loss of CD22 itself, i.e. hyperactive B cells [107]. While very interesting and instructive from the point of view of Siglec biology, this finding may or may not be relevant to the human situation, as both human and other hominid CD22 molecules appear to recognize Neu5Ac and Neu5Gc equally well [76]. The second mouse with the human-like genotype was evaluated further, after breeding into a congenic background. As predicted from the earlier work in humans, rapidly growing tumors in such mice were capable of taking up Neu5Gc given by oral feeding [108]. However, the extent of uptake and incorporation was much less than that seen in human tumors, perhaps because of the very short time (weeks) during which the experiment could be conducted and/or because of different metabolic pathways in mice. Regardless, these data confirm that Neu5Gc can be taken up by tumors from exogenous sources. We also noted that homozygous null mice that were born to heterozygous dams with endogenous Neu5Gc accumulated substantial amounts of Neu5Gc during in utero growth. This confirms that fetal tissues are capable of incorporating Neu5Gc from maternal sources, and supports the notion that the Neu5Gc previously found in human fetuses and placental tissues likely also originates from maternal dietary ingestion.

Further phenotypic characterization of the mice revealed other features of a human-like nature [108]. First, the older mice developed a degenerative process in the inner ear, resulting in poor hearing. This could potentially be relevant to age-related hearing loss in humans. Another commonly recognized feature of non-human primates is that skin wounds heal more rapidly [149, 150]. Interestingly, the Cmah null mice also showed a definite human-like delay in wound healing [108]. This phenotype also requires further study. Additional phenotypes involving metabolism, reproduction, etc. are being investigated, and further studies are needed to know if any of these phenotypes are relevant to human evolution. In this regard it must be recognized that even though the Cmah null mouse has a human-like genotype, it cannot provide a perfect phenocopy of the current human condition. After all, the human mutation occurred two to three million years ago, and much evolutionary time has elapsed since then. Thus, the null mice are at best a recapitulation of the situation as it existed two to three million years ago. Of course, even this is not necessarily the case, as the null phenotype was introduced into the background of rodent but not hominid biology.

21 Implications for human diseases

Despite the close genetic similarity between humans and other hominids, there are many definite and apparent differences in the incidence and severity of various diseases between humans and our closest evolutionary cousins. Lists of such diseases that cannot be explained simply by anatomical differences have been provided earlier [94–96]. While speculative at this point, we have suggested testable hypotheses for how several human-specific changes in Sias or Siglecs may have resulted in, or contributed to, these disease differences. For example, differences in the incidence and severity of late complications of HIV and hepatitis B and C virus infection, as well as the severe responses of adult humans to other viral infections, might be explained by the hyper-reactive T cells of humans, a state that we have suggested might be due to suppression of Siglec expression on these cells [92]. Other diseases in which T cells play a prominent role, and which have so far not been commonly reported in the other hominids, include asthma, psoriasis, and rheumatoid arthritis [94–96]. Given that these diseases occur in the human population at frequencies approaching or above 1%, and that several thousand great apes have received extensive medical care in captive facilities, it seems likely that these differences are significant. However, further studies will be needed to confirm this. As already mentioned earlier, the difference in susceptibility of humans and chimpanzees to the two different kinds of malaria could be partly explained by the difference in Sias on the surface of red blood cells, which are the targets of the malarial merozoite [50]. Likewise, other infectious diseases that require Sia for invasive binding might be differentially expressed or manifested in humans because of the difference in the Sia composition of human cells. Additionally, the differences in expression of alpha 2–6-linked sialic acids [64] could help explain differences in susceptibility to pathogens such as avian and human influenza viruses. Finally, as discussed above, the metabolic incorporation of Neu5Gc into human tissues in the face of anti-Neu5Gc responses could help explain some linkages between dietary intake of red meat and milk, and certain diseases that are aggravated by chronic inflammation. Of course, such chronic inflammation might be further fueled by the hyper-reactivity of human T cells, related to loss of Siglec expression. Further studies of some of these possibilities are under way.

22 Conclusions and future prospects

The CMAH mutation was the first known genetic and biochemical difference between humans and other hominids. When discovered ~10 years ago, it was possible that this was just a random event that simply drifted to fixation in a small effective population size. Also, there are other lineages in which Neu5Gc expression appears to be low or absent, such as chickens. Considering all of this, it was unclear at first glance whether this event had any significance for human evolution. However, the effects of such genetic events must be considered in the context of the particular species in which they occur, and Neu5Gc was a very prominent Sia in the chimpanzee–human common ancestor, widely expressed in many tissues, and involved in recognition by some of the Siglecs. Perhaps for this reason, the loss of Neu5Gc expression could have had more of an impact in this lineage. In any event it is reasonable to speculate that the multiple differences in Sia biology that we have now discovered are related to one another through a step-wise series of related events (a possible scenario is presented in Fig. 1).

Much further work will be needed to confirm or refute some of these hypotheses. In some cases we are handicapped by not being able to recreate the evolutionary events and/or not being able to do many of the theoretically interesting experiments in humans or other hominids, for ethical and/or practical reasons. Thus going forward it seems reasonable to genetically “humanize” or “chimpanize” mice and study the consequences. Also, given the unusual expression patterns of Siglecs in humans (such as in the placenta and the brain) it is worthwhile to search for other examples of Siglec expression in unexpected locations in humans. This will require careful comparisons with other hominid tissues, something that is made difficult by the lack of availability of even autopsy tissue samples from great apes, because of currently restrictive NIH policies [105, 106]. Thus it may be that mouse experiments are the best we can do to try and understand these stages of human evolution.

There also remains the possibility of finding human individuals with defects in some of these pathways and/or molecules that appear to have undergone human-specific changes. Such examples may serve to explain the specific functions of these changes during human evolution. Regardless of whether or not some of these changes are relevant to human evolution per se, this system provides an excellent case study of rapid evolution of glycan changes within a well-defined clade of species. As and when it becomes relevant it may be possible to examine some of these differences by directly studying hominid fossils [41]. However, such work would have to be very carefully justified in view of the precious nature of these kinds of samples. Overall, the future looks interesting, and it will be instructive to continue to pursue these issues on many fronts, not only to better comprehend human evolution, but also to understand human diseases.