Introduction

Non-human primates, notably rhesus macaques, are widely used as animal models in biomedical and immunological research. In particular, both the Indian-origin and Chinese-origin rhesus macaque models have contributed invaluable information to the study of disease pathogenesis and novel vaccine evaluation (Desrosiers 1990; Gardner and Luciw 2008; Haigwood 2009; ILAR 2003; Kindt et al. 1992; Persidsky and Fox 2007). However, these two distinct geographic populations have displayed decidedly different outcomes of infection despite their apparent physical similarity (Joag et al. 1994; Otting et al. 2005, 2007), perhaps attributable to variations in genetic profile that affect immune responses. Specifically, upon simian immunodeficiency virus (SIV) infection, Chinese rhesus macaques show a slower, more prolonged progression to AIDS than animals of Indian origin (Ling et al. 2002).

A number of studies have shown that several macaque major histocompatibility complex (MHC; Mamu) class I alleles, including Mamu-A*01 (Allen et al. 1998), Mamu-B*17 (Mothe et al. 2002), and Mamu-B*01 (Loffredo et al. 2005) among others (Loffredo et al. 2004; Sette et al. 2005), are expressed with high frequency in Indian rhesus macaque populations. Interbreeding of the Indian-origin animals in the USA since 1978, when the exportation of these animals from India was discontinued (Southwick and Siddiqi 1988), has likely played a significant role in this observation. The peptide-binding specificities for several of these and other Indian rhesus MHC allelic forms have been extensively characterized, leading to the identification of specific alleles which influence disease progression (Mothe et al. 2003; O’Connor et al. 2003; Yant et al. 2006; Loffredo et al. 2007) as well as the discovery of viral evasion from cytotoxic T lymphocyte (CTL) responses (Evans et al. 1999; Allen et al. 2000) in the SIV arena. Indeed, Indian rhesus macaques are the model most utilized in HIV- and AIDS-related research studies (Persidsky and Fox 2007; Patterson and Carrion 2005; Gardner and Luciw 2008; Watkins et al. 2008). However, the increased demand for these animals and, more importantly, the rapid progression to disease displayed after SIV infection of the Indian-origin populations (Ling et al. 2002) have underscored the advantages for developing alternative animal models.

Because of their relative accessibility, Chinese rhesus macaques are becoming more widely employed as non-human primate models in infectious disease research. They are utilized for the evaluation of vaccines and the study of immune responses in pathogen systems ranging from Marburg virus, Ebola virus, and influenza virus to the more well-studied SIV (Geisbert et al. 2007; Larsen et al. 2007; Carroll et al. 2008; Degenhardt et al. 2009; Ling et al. 2002, 2007). These animals, however, have not been characterized at the MHC loci to the same extent as their Indian counterparts. Studies to address this disparity have revealed a surprisingly high degree of MHC polymorphism (Otting et al. 2005, 2007, 2008; Karl et al. 2008; Ma et al. 2009; Wiseman et al. 2009; Ouyang et al. 2008), although the alleles identified are largely non-overlapping with those of Indian-origin macaques (Solomon et al. 2010). This polymorphism may be due to the diverse geographic origins from which the animals have been derived, comparable to human population distribution, suggesting that Chinese rhesus macaques may represent human leukocyte antigen (HLA) diversity more effectively than those of Indian origin.

HLA polymorphism and its role in binding a diverse array of antigenic peptides for CTL scrutiny have been well documented, as has the existence of HLA supertypes, groups of MHC molecules which share similar peptide-binding specificities (Bjorkman and Parham 1990; Maryanski et al. 1986; Parham et al. 1995; Sette and Sidney 1999; Sidney et al. 1995, 1996a, b; Townsend et al. 2006). Previous studies have demonstrated CTL repertoire overlaps between humans and chimpanzees (Bertoni et al. 1998), as well as between humans and Indian rhesus macaques (Loffredo et al. 2009), suggesting that HLA binding supertypes may extend to non-human primates. Recently, the peptide-binding specificity associated with the most frequent Chinese-origin allele, Mamu-A1*02201, was characterized and shown to be an HLA-B7 supertype analog (Solomon et al. 2010), though it is the only Chinese-origin allele characterized to date. For Chinese-origin macaques to become more valuable as animal models in infectious disease research, continued investigation and characterization of their MHC profile is necessary so that cellular immunity and immune correlates of protection can be more accurately assessed.

We previously reported that 11 alleles were found in multiple Chinese rhesus animals in different cohorts/colonies, each with a cumulative frequency greater than or equal to 5.6%. Combined, these alleles provide coverage for approximately 68% of all Chinese rhesus animals studied thus far and therefore represent a logical target for further analysis and characterization (Solomon et al. 2010). Accordingly, herein we sought to characterize Mamu-A1*02601 (6.7%) and Mamu-B*08301 (5.8%), two of the most frequently expressed Chinese-origin class I alleles. We report the specific peptide-binding motifs associated with these allelic forms and utilize their respective motifs to map SIV-derived Mamu-A1*02601 and Mamu-B*08301 binding peptides.

Materials and methods

Creation of stable Mamu-A1*02601, Mamu-B*08301 transfectant cell lines

Stable MHC class I transfectants were produced in the MHC class I-deficient EBV-transformed B-lymphoblastoid cell line 721.221. Separate expression constructs were created for Mamu-A1*02601 and Mamu-B*08301 by sub-cloning each full-length allele transcript into a pcDNA3.1 vector (Invitrogen). These constructs were then used to transfect 721.221 cells using an Amaxa Nucleofector II transfection device (Lonza AG, Walkersville, MD, USA).

To produce secreted Mamu-A1*02601 molecules for endogenous ligand identification, α-chain cDNAs of Mamu-A1*02601 were modified at the 3′ end by PCR mutagenesis to delete exons 5–7, which encode the transmembrane and cytoplasmic domains, and to add a 30-bp tail encoding a ten-amino-acid epitope of the rat very low density lipoprotein receptor (VLDLr), SVVSTDDDLA, for purification purposes (Hickman et al. 2000). The resulting sMHC-VLDLr constructs were cloned into the mammalian expression vector pcDNA3.1 (Invitrogen), and 721.221 cells were transfected with sMHC Mamu-A*26TVLDLr by electroporation. After 48 h incubation, cells were plated in 96-well plates (Falcon) in RPMI 1640 containing the antibiotic Geneticin. Transfectants were tested for production of sMHC molecules by a VLDLr-specific ELISA (Hawkins et al. 2008).

Mamu-A1*02601 endogenous ligand determination

Approximately 25 mg of Mamu-A*26TVLDLr molecules from the 721.221 cell line were purified over an affinity column composed of anti-VLDLr antibody (ATCC clone CRL-2197) coupled to CNBr activated Sepharose 4B (GE Healthcare, Piscataway, NJ, USA). sMHC molecules were then eluted in 0.2 N acetic acid, brought up to 10% acetic acid, and heated to 76°C for 10 min. Peptides were separated from heavy and light chains by ultra-filtration in a stirred cell with a 3-kDa molecular weight cutoff cellulose membrane (Millipore, Bedford, MA, USA). The peptide batch was flash frozen and lyophilized. The peptides were then reconstituted in 10% acetic acid.

Following isolation, 10% of the peptide pool was subjected to 14 rounds of N-terminal sequencing by Edman degradation. A motif was generated by calculating the fold increase of each amino acid over the prior round. A hierarchy was then determined based on the amino acid composition at each position (Falk et al. 1991).
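The fold-increase calculation described above can be sketched as follows. This is a minimal illustration that assumes each Edman cycle yields one amount per amino acid; the function name, the floor value used to avoid division by zero, and the toy yields are hypothetical and are not taken from the study.

```python
def fold_increase(cycles, floor=1.0):
    """Fold increase of each amino acid yield over the preceding cycle.

    cycles: list of dicts mapping one-letter residue -> yield (e.g. pmol).
    floor:  hypothetical minimum yield, used only to avoid division by zero.
    Returns one dict per cycle after the first.
    """
    result = []
    for prev, curr in zip(cycles, cycles[1:]):
        result.append({
            aa: max(curr.get(aa, 0.0), floor) / max(prev.get(aa, 0.0), floor)
            for aa in curr
        })
    return result


# Toy yields for three cycles (invented numbers, for illustration only).
cycles = [
    {"L": 10.0, "Q": 8.0, "Y": 40.0},
    {"L": 60.0, "Q": 8.0, "Y": 20.0},
    {"L": 30.0, "Q": 4.0, "Y": 10.0},
]
folds = fold_increase(cycles)

# Residue with the largest fold increase at cycle 2 (peptide position 2).
top_p2 = max(folds[0], key=folds[0].get)
print(top_p2)  # L
```

Ranking residues by fold increase within each cycle then gives the positional hierarchy referred to in the text.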

Peptides were reverse-phase HPLC fractionated using a Jupiter Proteo C12 column (Phenomenex, Torrance, CA, USA) on a Paradigm MG4 system (Michrom Bioresources, Auburn, CA, USA). A standard CH3CN gradient was employed to generate approximately 40 peptide-containing fractions. UV absorption was monitored at 215 nm. Peptide fractions were concentrated to dryness and reconstituted in 20 μl of nanospray buffer composed of 50% methanol, 50% H2O, and 0.5% acetic acid. Nano-electrospray capillaries (Proxeon, Denmark) were loaded with 1 μl of each peptide fraction and infused at 1,100 V on a Q-Star Elite quadrupole mass spectrometer with a time of flight detector (Applied Biosystems, Foster City, CA, USA). Ion maps were generated for each fraction in a mass range of 300–1,200 amu. Using independent data acquisition for selection, ions (putative peptides) were fragmented by tandem mass spectrometry (MS/MS). An amino acid sequence was assigned using the publicly available, web-based MASCOT (Matrix Science Ltd., London, UK) and/or de novo sequencing.

Positional scanning combinatorial library and peptide synthesis

Positional scanning combinatorial libraries (PSCL) were synthesized as previously described (Pinilla et al. 1999). In the PSCL, each pool in the library contains randomized 9-mer peptides with one fixed residue at a single position. With each of the 20 naturally occurring residues represented at each position along a 9-mer backbone, the entire library consisted of 180 peptide mixtures.
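The layout of the library can be illustrated with a short enumeration. This is a sketch only: the tuple format and the use of "X" to mark a randomized position are conventions assumed here for illustration.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 naturally occurring residues
PEPTIDE_LENGTH = 9


def pscl_pools(length=PEPTIDE_LENGTH, residues=AMINO_ACIDS):
    """Enumerate pools as (position, fixed_residue, pattern) tuples.

    'X' marks a randomized position; exactly one position per pool is fixed.
    """
    pools = []
    for pos in range(length):
        for aa in residues:
            pattern = "X" * pos + aa + "X" * (length - pos - 1)
            pools.append((pos + 1, aa, pattern))
    return pools


pools = pscl_pools()
print(len(pools))  # 180
print(pools[0])    # (1, 'A', 'AXXXXXXXX')
```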

Peptides utilized in screening studies were purchased as crude or purified material from Mimotopes (Minneapolis, MN, USA/Clayton, Victoria, Australia), Pepscan Systems B.V. (Lelystad, The Netherlands), A&A Labs (San Diego, CA, USA), Genescript Corporation (Piscataway, NJ, USA), or the Biotechnology Center at the University of Wisconsin—Madison (Madison, WI, USA). Peptides synthesized for use as radiolabeled ligands were synthesized by A&A Labs and purified to >95% homogeneity by reverse-phase HPLC. Peptide purity was determined with analytical reverse-phase HPLC and amino acid analysis, sequencing, and/or mass spectrometry. Peptides were radiolabeled utilizing the chloramine T method (Sidney et al. 2001b). Lyophilized peptides were re-suspended at 4–20 mg/ml in 100% DMSO, then diluted to required concentrations in PBS + 0.05% (v/v) nonidet P40 (Fluka Biochemika, Buchs, Switzerland).

SIV peptide sequences were derived from the SIVmac239 sequence, GenBank accession M33262 (Kestler et al. 1990).

MHC purification for peptide-binding assays

HLA and Mamu class I MHC purification was performed by affinity chromatography using the W6/32 and/or B123.2 class I antibodies, as previously described (Sidney et al. 2001b, 2005; Loffredo et al. 2009). Protein purity, concentration, and depletion efficiency steps were monitored by SDS-PAGE.

Quantitative assays for peptide binding to detergent solubilized MHC class I molecules were based on the inhibition of binding of a high-affinity radiolabeled standard probe peptide and performed as detailed in prior studies (Loffredo et al. 2004; Schneidewind et al. 2008; Sidney et al. 2001b, 2005). Peptides were tested at six different concentrations covering a 100,000-fold dose range in three or more independent assays. Control wells to measure non-specific (background) binding were also included. In each experiment, a titration of the unlabeled version of the radiolabeled probe was also tested as a positive control for inhibition.

The radiolabeled peptide utilized for the Mamu-A1*02601 assay was 3317.02 (sequence YLPTQQDVL), representing a sequence identified by Edman degradation and mass spectrometry analysis (described above). For Mamu-B*08301 assays, peptide 3317.04 (KSINKVYGK, an R9→K analog of Vaccinia B13R, an HLA-A3 supertype degenerate binder) was used. For each peptide, the concentration of peptide yielding 50% inhibition of the binding of the radiolabeled probe peptide (IC50) was calculated. Under the conditions used, where [radiolabeled probe] < [MHC] and IC50 ≥ [MHC], the measured IC50 values are reasonable approximations of the true Kd values (Gulukota et al. 1997; Sette et al. 1994a; Cheng and Prusoff 1973).
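For reference, the relationship of Cheng and Prusoff (1973) cited above is commonly written in the form below. The symbols are notation introduced here for illustration: [L*] is the concentration of the radiolabeled probe, K_d* is the dissociation constant of that probe, and K_i is the inhibition constant of the unlabeled competitor.

```latex
K_i \;=\; \frac{\mathrm{IC}_{50}}{1 + \dfrac{[L^{*}]}{K_d^{*}}}
\qquad\Longrightarrow\qquad
\mathrm{IC}_{50} \approx K_i \quad \text{when } [L^{*}] \ll K_d^{*}
```

When the probe concentration is small relative to its own dissociation constant, the denominator approaches 1, which is the sense in which a measured IC50 can stand in for the affinity of the competitor peptide.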

Bioinformatic analysis

We performed analysis of the PSCL data as described previously (Sidney et al. 2008). Briefly, the IC50 value (nM) for each residue/position mixture was standardized as a ratio to the geometric mean IC50 value (nM) of the entire set of 180 mixtures and further normalized to the average of the libraries tested at each position. To identify predicted binders, all possible 9-mer peptides in SIVmac239 sequences were scored using the matrix values derived from the PSCL analyses of Mamu-A1*02601 and Mamu-B*08301. The final score for each peptide represents the product of the corresponding matrix values for each peptide residue–position pair. Peptides scoring among the top 3.0% (n = 100) were selected for binding analysis.
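The scoring scheme can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure of Sidney et al. (2008): the ratio is taken as library geometric mean divided by pool IC50, so that larger values mean better binding; the position average is taken as a geometric mean; and all function names and the toy IC50 table are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))


def build_matrix(ic50):
    """ic50: list with one dict per position, mapping residue -> IC50 (nM).

    Assumption: ratio = library geometric mean / pool IC50 (higher = better),
    then divided by the geometric mean of the ratios at that position.
    """
    library_gm = geometric_mean([v for pos in ic50 for v in pos.values()])
    matrix = []
    for pos in ic50:
        ratios = {aa: library_gm / v for aa, v in pos.items()}
        pos_avg = geometric_mean(list(ratios.values()))
        matrix.append({aa: r / pos_avg for aa, r in ratios.items()})
    return matrix


def score(peptide, matrix):
    """Product of the matrix values for each residue-position pair."""
    total = 1.0
    for pos, aa in enumerate(peptide):
        total *= matrix[pos][aa]
    return total


def top_fraction(protein, matrix, fraction=0.03):
    """Score every overlapping peptide and return the top-scoring fraction."""
    k = len(matrix)
    peptides = [protein[i:i + k] for i in range(len(protein) - k + 1)]
    ranked = sorted(peptides, key=lambda p: score(p, matrix), reverse=True)
    n = max(1, round(len(ranked) * fraction))
    return ranked[:n]


# Toy IC50 table (invented values): every pool binds at 1,000 nM except
# L at position 2 and L at position 9, which bind at 10 nM.
toy_ic50 = [{aa: 1000.0 for aa in AMINO_ACIDS} for _ in range(9)]
toy_ic50[1]["L"] = 10.0
toy_ic50[8]["L"] = 10.0
matrix = build_matrix(toy_ic50)
best = top_fraction("GGGGALAAAAAALGGGG", matrix)
print(best)  # ['ALAAAAAAL']
```

Because the score is a product, a single strongly disfavored residue at an anchor position can outweigh favorable residues elsewhere, which is the intended behavior of a matrix of this kind.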

Phylogenetic analysis

We assembled representative MHC class I sequences from humans, Chinese, and Indian rhesus macaques. Sequences were normalized to 1,068 nucleotides in length and aligned using the ClustalX program (Thompson et al. 2002). A phylogenetic tree was built with the neighbor-joining method (Saitou and Nei 1987) using the Tamura three-parameter distance model (Tamura 1992). One thousand bootstrap samples were analyzed to ensure reliable clustering.
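As an illustration of the distance model named above, the Tamura (1992) three-parameter distance between two aligned sequences can be sketched as follows. This is a hedged sketch in the commonly stated form of the model, with P and Q the proportions of transitional and transversional differences and θ the G+C content; the function name and the handling of gaps are assumptions, and the software actually used for the tree is not specified in the text.

```python
import math

PURINES = {"A", "G"}


def tamura_distance(seq1, seq2):
    """Tamura three-parameter distance between two aligned DNA sequences.

    d = -h*ln(1 - P/h - Q) - 0.5*(1 - h)*ln(1 - 2Q), with h = 2*theta*(1 - theta).
    Sites with gaps or ambiguous bases are skipped (an assumption made here).
    The formula is undefined if the G+C content is 0 or 1, or if the
    sequences are too divergent for the logarithms to exist.
    """
    sites = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    n = len(sites)
    if n == 0:
        raise ValueError("no comparable sites")
    # Transitions: purine<->purine or pyrimidine<->pyrimidine changes.
    transitions = sum(1 for a, b in sites
                      if a != b and (a in PURINES) == (b in PURINES))
    # Transversions: purine<->pyrimidine changes.
    transversions = sum(1 for a, b in sites
                        if a != b and (a in PURINES) != (b in PURINES))
    p, q = transitions / n, transversions / n
    theta = sum((a in "GC") + (b in "GC") for a, b in sites) / (2 * n)
    h = 2 * theta * (1 - theta)
    return -h * math.log(1 - p / h - q) - 0.5 * (1 - h) * math.log(1 - 2 * q)


d_same = tamura_distance("ACGTACGT", "ACGTACGT")
d_one_transition = tamura_distance("ACGTACGT", "GCGTACGT")
```

A matrix of such pairwise distances is the input to the neighbor-joining step, and repeating the calculation on resampled alignment columns gives the bootstrap support values.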

Results

Determination of natural Mamu-A1*02601 ligands

Previous studies have demonstrated that the elution and characterization of naturally bound ligands is an effective method for determining the peptide-binding specificity of class I MHC molecules (Hickman-Miller et al. 2005; Kubo et al. 1994). Accordingly, an initial evaluation of the peptide-binding specificity of soluble Mamu-A1*02601 was performed by sequencing 30 different endogenously loaded Mamu-A1*02601 peptide ligands representing 14 different peptide fractions (Table 1). The majority of ligands (80%; 24 out of 30) were nine residues in length, with only four ligands of ten residues and two ligands of 11 residues identified. This observation suggests that 9-mers represent the preferred size for peptide ligands bound to Mamu-A1*02601.

Table 1 Endogenous Mamu-A1*02601 ligands identified by MS/MS sequencing analysis

Next, peptides were aligned and the frequency of each residue at each position was tabulated (Supplemental Table 1, online resource). At position 2, leucine (L) was found in 15 of the 30 ligands, and the other aliphatic residues valine (V) and isoleucine (I) were present in ten and three ligands, respectively. The amide residue glutamine (Q) was found in 15 peptides at position 6, and the related residues glutamic acid (E) or asparagine (N) were found in seven ligands. At the C terminus, L was dominant (26 out of 30 peptides). The related residues methionine (M) and I were also present at the C terminus. This information provides a preliminary motif in which position 2, position 6, and the C terminus are tentatively assigned as the main anchor positions, based on a residue frequency of ≥50%.
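The tabulation and the ≥50% criterion can be sketched as follows. This is an illustration only: the function names are hypothetical, the treatment of ligands of different lengths (N-terminal alignment plus a shared C-terminal column) is an assumption, and of the toy ligands only YLPTQQDVL appears in the text.

```python
from collections import Counter


def position_frequencies(peptides):
    """Tabulate residue counts at each position.

    Positions 1..(shortest length - 1) are counted from the N terminus and
    the last residue of every peptide is counted as the C terminus, so that
    ligands of different lengths share a common C-terminal column.
    """
    shortest = min(len(p) for p in peptides)
    table = {pos + 1: Counter(p[pos] for p in peptides)
             for pos in range(shortest - 1)}
    table["C"] = Counter(p[-1] for p in peptides)
    return table


def anchor_candidates(peptides, threshold=0.5):
    """Positions where the most frequent residue reaches the threshold."""
    n = len(peptides)
    return [pos for pos, counts in position_frequencies(peptides).items()
            if counts.most_common(1)[0][1] / n >= threshold]


# Toy ligands: only YLPTQQDVL comes from the text; the rest are invented.
ligands = ["YLPTQQDVL", "ALAAAQAAL", "GLSSSQSSL", "KVEEENEEM"]
print(anchor_candidates(ligands))  # [2, 6, 'C']
```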

Establishment of peptide-binding assays for Mamu-A1*02601 and Mamu-B*08301

To functionally characterize Mamu-A1*02601 further, MHC class I molecules expressed in single cell transfectants of 721.221 lines were purified by affinity chromatography. To establish a Mamu-A1*02601 binding assay, we took advantage of the natural ligand sequence information previously described. Accordingly, Mamu-A1*02601 natural ligands containing tyrosine (Y), an amino acid to which gamma-radioactive isotopes of iodine can attach, were synthesized and their capacity to bind purified MHC investigated. We found that the ligand YLPTQQDVL (peptide ID 3317.02) was associated with prominent binding to Mamu-A1*02601 (Fig. 1a). Significant binding was observed with as little as 700 pM of purified Mamu-A1*02601. The binding of YLPTQQDVL was also selective, since no significant binding was observed to purified Mamu-B*08301 molecules.

Fig. 1

The development of the Mamu-A1*02601 and Mamu-B*08301 MHC–peptide-binding assays. The endogenous Mamu-A1*02601 ligand YLPTQQDVL (peptide 3317.02) and HLA-A3 supertype ligand Vaccinia B13R analog KSINKVYGK (peptide 3317.04) were utilized as radiolabeled probes in direct binding dose titration experiments to ascertain binding potential to purified Mamu MHC molecules. a The radiolabeled ligand YLPTQQDVL binds Mamu-A1*02601 (solid circle), but not B*08301 (open triangle). b The radiolabeled ligand KSINKVYGK binds Mamu-B*08301 (solid triangle), but not A1*02601 (open circle). c Inhibition of radiolabeled ligand binding to Mamu-A1*02601 and Mamu-B*08301 by excess unlabeled 3317.02 and 3317.04 ligands in dose titration binding experiments demonstrated specificity of binding

Endogenous ligands or defined epitopes for the Mamu-B*08301 molecule have not been reported. However, previous studies demonstrated that HLA supertype ligands also bind to MHC molecules expressed in other species, such as chimpanzees (Pan troglodytes), mice (Mus musculus; McKinney et al. 2000; Sette et al. 2005; Sidney et al. 2006), and, recently, Chinese rhesus macaques (Solomon et al. 2010), where the most frequent allele Mamu-A1*02201 was shown to be an analog of the HLA-B7 supertype. Hierarchical clustering analysis performed by our group (data not shown) predicted that Mamu-B*08301 might be associated with an HLA-A3 supertypic peptide-binding specificity. Experiments examining direct binding for a panel of HLA-A3-supertype peptides to Mamu-B*08301 were performed. An analog of a Vaccinia-derived HLA-A3 supertype ligand (KSINKVYGK, peptide 3317.04) bound Mamu-B*08301 with significant counts and showed specificity, displaying negligible binding to Mamu-A1*02601 (Fig. 1b).

Furthermore, the binding for both alleles was specific at the level of inhibition by unlabeled ligands. Mamu-A1*02601 binding to the radiolabeled YLPTQQDVL ligand could be inhibited by an excess of unlabeled 3317.02 peptide with an IC50 value of approximately 0.2 nM. Mamu-B*08301 binding to the radiolabeled KSINKVYGK ligand could be inhibited by an excess of unlabeled 3317.04 with an IC50 value of approximately 2.0 nM (Fig. 1c). In conclusion, these results demonstrate that the binding assays are specific for the purified MHC molecule and thus enable detailed investigation of the peptide-binding specificity for both Mamu-A1*02601 and Mamu-B*08301.

Definition of Mamu-A1*02601 peptide-binding motif

Building upon our mass spectrometric analysis of Mamu-A1*02601 eluted ligands, we next tested the capacity of 9-mer PSCL to bind purified Mamu-A1*02601 MHC to further define the peptide-binding motif (Fig. 2). The binding capacity relative to the geometric mean for the entire PSCL (IC50 = 2,438 nM) was determined for each positional/residue mixture and further normalized to the average of libraries tested at each position, as previously described (Sidney et al. 2008). For this analysis, we defined preferred residues as those that displayed 20-fold higher binding capacity than the position average and defined tolerated residues as those that displayed 4-fold higher binding. We defined a main binding anchor as a position in which at least half of the residues (≥10) are associated with binding capacity either 4-fold higher (in blue in Fig. 2) or lower (orange) than the position average. Furthermore, secondary anchors are defined as those positions associated with at least five residues (25%), but less than ten, employing the same binding criteria.
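The anchor criteria above amount to counting, at each position, the residues whose normalized ratio departs from the position average by at least 4-fold in either direction. A minimal sketch follows; the function names and the toy ratios are hypothetical, and treating a ratio of exactly 4 or exactly 0.25 as meeting the criterion is an assumption.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def count_affected(ratios, fold=4.0):
    """Number of residues at least `fold` above or below the position average."""
    return sum(1 for r in ratios.values() if r >= fold or r <= 1.0 / fold)


def classify_position(ratios, fold=4.0):
    """Apply the main (>=10 residues) and secondary (5-9 residues) criteria."""
    affected = count_affected(ratios, fold)
    if affected >= 10:
        return "main anchor"
    if affected >= 5:
        return "secondary anchor"
    return "non-anchor"


def toy_position(n_strong, n_weak):
    """Invented ratios: n_strong residues at 20, n_weak at 0.1, rest at 1."""
    values = [20.0] * n_strong + [0.1] * n_weak
    values += [1.0] * (len(AMINO_ACIDS) - len(values))
    return dict(zip(AMINO_ACIDS, values))


print(classify_position(toy_position(3, 9)))  # main anchor
print(classify_position(toy_position(2, 3)))  # secondary anchor
print(classify_position(toy_position(1, 3)))  # non-anchor
```

Under these rules a position can carry one very strongly preferred residue and still fall short of anchor status if fewer than five residues in total depart from the average by 4-fold.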

Fig. 2

The relative influence of amino acids at each position on the binding to Mamu-A1*02601. The 9-mer positional scanning combinatorial library was tested for binding to Mamu-A1*02601. Values represent the ratio of the IC50 (nanomolars) of each pool at a given position relative to the geometric mean for the entire library (2,438 nM) and further normalized to the average of libraries tested at each position. Ratios of 0.25 or less are highlighted in orange; >4 are in blue

Accordingly, position 2 and the C terminus are the main anchor positions. At position 2, L, M, and V were overwhelmingly preferred above all other residues, while I and N were well tolerated. At the C terminus, residues L, I, and M were preferred and alanine (A), phenylalanine (F), glycine (G), and V were tolerated. Using the aforementioned criteria, position 1 is a secondary anchor with preferences for F and Y. Position 6, although failing to meet our secondary anchor criteria by one amino acid, has a very strong preference for Q, as displayed by a PSCL ratio of 41. These data are in overall agreement with the data observed with the natural ligands and were used to derive a Mamu-A1*02601 motif (Fig. 3). The observation that M was not identified in the natural ligand analysis, an exercise not intended to be comprehensive, but shown to be a preferred p2 anchor residue by PSCL, could in part be attributed to the residue’s low observed frequency in nature. The fact that both L and M contribute comparable binding potential at an anchor position stems from their shared chemical similarity as hydrophobic, aliphatic residues.

Fig. 3

Map of the Mamu-A1*02601 motif. Pictorial summary representation of the Mamu-A1*02601 PSCL matrix, indicating residues contributing positive or negative binding potential by position. The total number of residues affecting binding by position is summed, with a double-digit score indicating primary anchor position. Preferred residues at anchor positions, defined as residues showing a 20-fold increase in binding capacity versus the position average, are highlighted in larger font. Positions 2 and the C terminus are identified as main anchor positions

Identification of Mamu-A1*02601 binding SIV-derived peptides

Next, we determined if the PSCL-based motifs could be used to identify Mamu-A1*02601 binding peptides. With Chinese rhesus macaques becoming an increasingly utilized animal model in SIV pathogenesis studies, we targeted the SIV proteome for these experiments (Joag et al. 1994; Ling et al. 2002; Trichel et al. 2002; Burdo et al. 2005; Monceaux et al. 2007; Degenhardt et al. 2009). The PSCL matrix was used to score all 9-mer peptides in the SIVmac239 proteome. Peptides scoring in the upper 3.0% range (n = 100) were tested for binding capacity (Supplemental Table 2, online resource). Additionally, a set of 35 control peptides, ranging from 3.1% to 70% in rank (five peptides randomly selected within each 10% demarcation), was also tested.

Previous SIV epitope and in vivo T cell recognition studies have established 500 nM as an appropriate peptide-binding threshold of immunogenicity (Sette et al. 1994a, b; van der Most et al. 1998; Vitiello et al. 1996; Allen et al. 2001; Loffredo et al. 2004, 2005; Mothe et al. 2002; Sette et al. 2005). In total, 45 of the 100 “top 3% score” peptides bound Mamu-A1*02601 with an affinity of 500 nM or less (Table 2). Of those, 21 were high-affinity binders, possessing an IC50 ≤ 50 nM. By contrast, none of the 35 lower-scoring control peptides resulted in similar affinities. Furthermore, 25 of the 45 binders with affinity ≤500 nM (56%) scored in the top 1% of predicted peptides, as well as 76% (16 of 21) of high-affinity binders.
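The affinity categories used here can be expressed as a small helper, shown below as a sketch. The function names and the category labels are hypothetical; the 50 nM and 500 nM cutoffs and the counts in the usage lines are those given in the text.

```python
def classify_affinity(ic50_nm):
    """Bin a measured IC50 (nM) using the 50 nM and 500 nM cutoffs."""
    if ic50_nm <= 50:
        return "high"
    if ic50_nm <= 500:
        return "intermediate"
    return "non-binder"


def percent(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Counts reported in the text for Mamu-A1*02601.
print(percent(25, 45))  # 56  (binders found in the top 1% of scores)
print(percent(16, 21))  # 76  (high-affinity binders in the top 1%)
```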

Table 2 The efficiency of PSCL matrices in predicting Mamu-A1*02601 binders

Mamu-A1*02601 and the HLA-A2 MHC molecules share overlapping binding repertoires

The results described above define a Mamu-A1*02601 motif that appeared to match the motif associated with HLA-A2 supertype alleles, which is characterized by a preference for aliphatic and hydrophobic residues at position 2 and at the C terminus (Sidney et al. 2001a). Based on this observation, we predicted that cross-reactivity between these human and macaque molecules may exist (Dzuris et al. 2000). To investigate this hypothesis, we tested the 45 top SIV-derived binders (IC50 ≤ 500 nM) described above for binding to HLA-A2 supertype molecules (HLA-A*0201, HLA-A*0202, HLA-A*0203, HLA-A*0206, and HLA-A*6802). Twenty-seven of these 45 Mamu-A1*02601 binders (60%) bound at least one molecule of the HLA-A2 superfamily with an IC50 ≤ 500 nM (Fig. 4), with 18 (40%) being promiscuous binders, binding three or more HLA-A2 alleles. One peptide, FLIRQLIRL (Env.771), bound all five HLA-A2 supertype molecules examined.
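The definitions of cross-reactive and promiscuous binding used above can be sketched as follows. The function name is hypothetical and the IC50 values in the example are invented for illustration; they are not measurements from this study.

```python
HLA_A2_PANEL = ["A*0201", "A*0202", "A*0203", "A*0206", "A*6802"]


def summarize_cross_reactivity(ic50_by_allele, threshold_nm=500.0,
                               promiscuous_min=3):
    """Summarize binding of one peptide across a panel of HLA molecules.

    A peptide is cross-reactive if it binds at least one molecule at or
    below the threshold, and promiscuous if it binds three or more.
    """
    bound = [allele for allele, ic50 in ic50_by_allele.items()
             if ic50 <= threshold_nm]
    return {
        "bound": bound,
        "cross_reactive": len(bound) >= 1,
        "promiscuous": len(bound) >= promiscuous_min,
    }


# Invented IC50 values (nM) for a single hypothetical peptide.
example = dict(zip(HLA_A2_PANEL, [120.0, 35.0, 480.0, 2500.0, 30000.0]))
summary = summarize_cross_reactivity(example)
print(summary["bound"])        # ['A*0201', 'A*0202', 'A*0203']
print(summary["promiscuous"])  # True
```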

Fig. 4

Mamu-A1*02601 binders tested against HLA-A2 supertype alleles. SIVmac239-derived, Mamu-A1*02601 predicted peptides which had an IC50 ≤ 500 nM (n = 45) were tested for binding capacity to HLA-A2 supertype molecules. Affinities highlighted in blue possess an IC50 < 50 nM. Those affinities in the 50–500 nM range are highlighted in green. Dashes indicate binding affinity >20,000 nM

HLA-A*0202 displayed the highest degree of cross-reactivity with Mamu-A1*02601. Of the 27 Mamu-A1*02601 binders that bound an A2-like allele, 23 (85%) bound HLA-A*0202. Additionally, 17 of the 18 promiscuous binders (94%) bound HLA-A*0202. HLA-A*0201 and HLA-A*0203 were also highly cross-reactive, as both alleles bound 18 Mamu-A1*02601 binders (67%); 16 of these cross-reactive peptides were shared between the two alleles.

Interestingly, the amino acid present at position 6 greatly influences cross-reactivity. More specifically, while Q at position 6 is strongly preferred for Mamu-A1*02601 binding, peptides possessing Q at position 6 are less likely to cross-react with HLA-A2 supertype molecules. Indeed, only 14% (three out of 21) of Mamu-A1*02601 binders with Q at position 6 were promiscuous HLA-A2 supertype binders, while 63% (15 out of 24) of Mamu-A1*02601 binders that did not have Q at p6 bound three or more HLA-A2 supertype molecules.

Definition of Mamu-B*08301 peptide-binding motif

As was performed for Mamu-A1*02601, we tested the capacity of 9-mer PSCLs to bind Mamu-B*08301 molecules (Fig. 5). The binding affinity relative to the geometric mean for the entire PSCL (518 nM) was determined. Using the criteria outlined above, the C terminus was identified as the dominant anchor position, with the basic positively charged residues lysine (K) and arginine (R) being overwhelmingly preferred and residues F, L, and Y being tolerated. Unlike Mamu-A1*02601, a second main anchor position was not identified for Mamu-B*08301. Instead, position 2 was defined as a secondary anchor. In this position, the most preferred residues were N and serine (S), while threonine (T) was tolerated. These data were subsequently used to derive a Mamu-B*08301 motif (Fig. 6).

Fig. 5

The relative influence of amino acids at each position on the binding to Mamu-B*08301. The 9-mer positional scanning combinatorial library was tested for binding to Mamu-B*08301. Values represent the ratio of the IC50 (nanomolars) of each pool at a given position relative to the geometric mean for the entire library (518 nM) and further normalized to the average of libraries tested at each position. Ratios of 0.25 or less are highlighted in orange; >4 are in blue

Fig. 6

Map of the Mamu-B*08301 motif. Pictorial summary representation of the Mamu-B*08301 PSCL matrix, indicating residues contributing positive or negative binding potential by position. The total number of residues affecting binding by position is summed, with a double-digit score indicating primary anchor position. Preferred residues at anchor positions, defined as residues showing a 20-fold increase in binding capacity versus the position average, are highlighted in larger font. The C terminus position is identified as the sole main anchor position

Identification of Mamu-B*08301 binding SIV-derived peptides

Repeating the exercise performed for Mamu-A1*02601, all 9-mer peptides in the SIVmac239 proteome were scored using the Mamu-B*08301 PSCL matrix. Peptides scoring in the upper 3.0% range (n = 100) were tested for binding capacity (Supplemental Table 3, online resource), and a set of control peptides was included.

In total, 64 of the 100 “top 3% score” peptides bound Mamu-B*08301 with an affinity of 500 nM or less, and of those, 32 were high-affinity binders (Table 3). Furthermore, 30 of the 64 binders with affinity ≤500 nM (47%) scored in the top 1% of predicted peptides, as well as 72% (23 of 32) of high-affinity binders. By contrast, none of the 35 control peptides had an IC50 < 1,900 nM.

Table 3 The efficiency of PSCL matrices in predicting Mamu-B*08301 binders

Mamu-B*08301 and the HLA-A3 MHC molecules share overlapping binding repertoires

The results presented for Mamu-B*08301 depict a motif that closely resembles the one associated with HLA-A3 supertype alleles, which is characterized by a preference for basic residues at the C terminus and small and aliphatic residues at position 2 (Sidney et al. 1996a). Revisiting our hypothesis of the potential for cross-reactivity between human and macaque alleles, we tested the 64 top SIV-derived binders (IC50 ≤ 500 nM) for binding to HLA-A3 supertype molecules (HLA-A*0301, HLA-A*1101, HLA-A*3101, HLA-A*3301, and HLA-A*6801). Fifty-two of the 64 Mamu-B*08301 binders (81%) bound at least one molecule of the HLA-A3 superfamily with an IC50 ≤ 500 nM (Fig. 7).

Fig. 7

Mamu-B*08301 binders tested against HLA-A3 supertype alleles. SIVmac239-derived, Mamu-B*08301 predicted peptides which had an IC50 ≤ 500 nM (n = 64) were tested for binding capacity to HLA-A3 supertype molecules. Affinities highlighted in blue possess an IC50 < 50 nM. Those affinities in the 50–500 nM range are highlighted in green. Dashes indicate binding affinity >20,000 nM

HLA-A*3101 displayed the highest degree of cross-reactivity with Mamu-B*08301. Of the 52 Mamu-B*08301 binders that bound an A3-like allele, 39 (75%) bound HLA-A*3101. HLA-A*0301 and HLA-A*6801 also displayed good cross-reactivity, as both alleles bound at least 26 Mamu-B*08301 binders. Additionally, 41% of Mamu-B*08301 binders (26 out of 64) were promiscuous HLA-A3 supertype binders, a percentage that is comparable to that seen between Mamu-A1*02601 and HLA-A2 (40%).

Evolutionary origin of Chinese rhesus macaque HLA-A2 and HLA-A3 functional analogy

The two common Chinese rhesus class I alleles described above are associated with HLA-A2 and HLA-A3 supertype specificities, which have not been detected in Indian rhesus macaques, although functional analysis of many of the most common alleles expressed in Indian-origin animals has been performed. To investigate the evolutionary origin of the functional analogy between the HLA and Chinese rhesus macaque MHC class I alleles in this study, we built a phylogenetic tree using representative MHC allele sequences.

For humans, we included one HLA allele (HLA-A*01010101, HLA-A*02010101, HLA-A*03010101, HLA-A*24020101, HLA-B*070201, HLA-B*080101, HLA-B*270202, HLA-B*44020101, HLA-B*580101, and HLA-B*15010101) from each of the known supertypes (A1, A2, A3, A24, B7, B8, B27, B44, B58, and B62 respectively) with an additional four HLA-B7 supertype alleles (HLA-B*35010101, HLA-B*510101, HLA-B*530101, and HLA-B*5401) and the four additional HLA-A2 supertype alleles (HLA-A*0202, HLA-A*020301, HLA-A*020601, and HLA-A*68020101) and HLA-A3 supertype alleles (HLA-A*110101, HLA-A*310102, HLA-A*330101, and HLA-A*680101) that were used in this study. For Indian rhesus macaques, we selected 14 sequences represented among the most common specificities (Boyson et al. 1996; Kaizu et al. 2007; Knapp et al. 1997; Loffredo et al. 2007; Voss and Letvin 1996). Similarly, the 14 Chinese rhesus macaque sequences included in the analysis were selected from the most frequent alleles (Solomon et al. 2010).

Although Mamu-A1*02601 displays functional analogy to HLA-A2 supertype alleles and Mamu-B*08301 is associated with an HLA-A3 supertype peptide specificity, these Mamu alleles cluster separately from their respective HLA-A2 and HLA-A3 molecules (Fig. 8). Previous analysis of Mamu-A1*02201 showed a similar lack of sequence homology to its functional HLA cognate (Solomon et al. 2010). These results further support the hypothesis that these shared similarities in allele specificity have evolved independently in humans and macaques, most likely as a result of convergent evolution (Sette and Sidney 1999).

Fig. 8

Phylogenetic tree of HLA and Mamu MHC class I sequences. Phylogenetic analysis of 22 HLA, 14 Indian rhesus, and 14 Chinese rhesus MHC class I sequences is shown. The neighbor-joining tree was created from 1,068 aligned nucleotide sites. The percentage of bootstrap samples supporting each branch is shown (for values >50%). Mamu sequences derived from Indian animals are prefixed with In. Sequences identified in Chinese animals (Solomon et al. 2010) are prefixed with Ch. The Mamu allele B*08301, which demonstrates HLA-A3-like specificity, appears in the Mamu-B group and not in the Mamu-A group, indicating that this shared motif specificity between humans and macaques likely did not arise through persistence of a common allele.

Discussion

Herein we report the peptide-binding motifs associated with the common Chinese-origin rhesus macaque class I molecules Mamu-A1*02601 and Mamu-B*08301. This is the first description of a motif for a Chinese-origin Mamu-B allele and only the second for a Chinese-origin Mamu-A allele, since until now Mamu-A1*02201 was the only Chinese-origin MHC molecule for which a peptide-binding motif had been described (Solomon et al. 2010). Thus, our study triples the number of Chinese-origin alleles for which peptide-binding motifs are available.

The significance of our study is underscored by the fact that, while Mamu-A1*02201 is the most frequently expressed class I allele in Chinese rhesus macaques, Mamu-A1*02601 is the second most frequent allele and Mamu-B*08301 is the most frequent Mamu-B allele, a distinction it shares with six other molecules (Solomon et al. 2010). Combined, these three alleles allow for potential coverage of approximately 20% of Chinese rhesus macaques used in biomedical research, irrespective of their geographical origin within China, thus significantly enhancing our knowledge of the functional MHC profile of these experimental animals.

Unexpectedly, we found that both of these molecules are associated with motifs that overlap with well-known HLA supermotifs. Specifically, Mamu-A1*02601 shares a high degree of cross-reactivity with the HLA-A2 supertype allele HLA-A*0202, as well as HLA-A*0201 and HLA-A*0203. Mamu-B*08301 is highly cross-reactive with the HLA-A3 supertype alleles HLA-A*3101, HLA-A*0301, and HLA-A*6801. These results are even more remarkable in the context of the recently described HLA-B7 supertype-like specificity of Mamu-A1*02201 (Solomon et al. 2010). HLA-B7, HLA-A3, and HLA-A2 are the three most abundant supertypes in the human population. Separately, each of these three supertypes has a phenotypic frequency greater than 42% averaged across various ethnic groups. When the three supertypes are combined, over 86% of the human population is covered (Sette and Sidney 1999). Thus, each of the three common Chinese Mamu class I alleles investigated to date is associated with a motif corresponding to one of the three most common HLA supertypes expressed in humans. It is possible that cross-reactivity with additional HLA or Mamu alleles will be identified, thus increasing the relevance and utility of the assays characterized herein for studies directed toward epitope discovery and characterization of the immune response to specific pathogens.
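As a rough illustration of how individual supertype frequencies of about 42% could combine into coverage of about 86%, the sketch below applies a standard population-coverage calculation. It is not the calculation reported by Sette and Sidney (1999). It assumes Hardy-Weinberg proportions at each locus and independence between the HLA-A and HLA-B loci, and the gene frequency of 0.25 per supertype is a hypothetical round number chosen for the example, not a value from this study. The function names are likewise illustrative.

```python
def phenotypic_frequency(gene_freqs):
    """Fraction of individuals carrying at least one of the listed
    supertypes at a single locus, assuming Hardy-Weinberg proportions."""
    return 1.0 - (1.0 - sum(gene_freqs)) ** 2


def combined_coverage(per_locus_gene_freqs):
    """Coverage across several loci, assuming the loci are independent."""
    uncovered = 1.0
    for freqs in per_locus_gene_freqs:
        uncovered *= 1.0 - phenotypic_frequency(freqs)
    return 1.0 - uncovered


# Hypothetical gene frequencies (assumption, not data from the source).
gf = {"A2": 0.25, "A3": 0.25, "B7": 0.25}

single = phenotypic_frequency([gf["A2"]])
# A2 and A3 are both HLA-A supertypes, so they share one locus;
# B7 is an HLA-B supertype and is treated as a second locus.
total = combined_coverage([[gf["A2"], gf["A3"]], [gf["B7"]]])

print(f"single supertype: {single:.4f}")   # 0.4375
print(f"three supertypes: {total:.4f}")    # 0.8594
```

Under these assumptions, a gene frequency of 0.25 gives a phenotypic frequency of about 44% for one supertype and about 86% for the three together, which shows why coverage rises quickly when supertypes at two different loci are combined.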

These observed similarities of Mamu with HLA could be explained by either common ancestry or convergent evolution (Sette et al. 2003). Previous studies have detected functional similarities between Mamu and HLA class I molecules, such as Mamu-B*08 with HLA-B*27 molecules. Notably, other similarities, such as Mamu-A*11 with HLA-B*44 and Mamu-A*22 with HLA-B*07 (Dzuris et al. 2000; Sette et al. 2005; Loffredo et al. 2009; Solomon et al. 2010), span loci, arguing against the common ancestry hypothesis. In this study, we show that a Mamu molecule encoded by the B locus shares overlapping binding characteristics with HLA-A alleles, which also argues against common ancestry. This point was further and more formally demonstrated by phylogenetic analysis. Indeed, prior to this study, Mamu class I alleles, regardless of locus, had been purported to be functional HLA-B analogs only (Hickman-Miller et al. 2005), with only rare associations with HLA-A reported, even at the level of binding specificity. Our results present evidence of two Mamu class I molecules that share overlapping binding characteristics with HLA-A alleles.

The class I loci of both Indian- and Chinese-origin rhesus macaques are highly polymorphic, though the loci of Chinese-origin animals appear to be more polymorphic given the allele frequencies that have emerged in this population (Solomon et al. 2010). Additionally, the allelic variants are largely non-overlapping, possibly because the geographical distance between India and China results in a low probability of genetic exchange, as demonstrated by mitochondrial DNA analysis of captive rhesus macaques (Kanthaswamy and Smith 2004; Satkoski et al. 2008). In this respect, the absence of the HLA-A2, HLA-A3, and HLA-B7 specificities from the seven common Indian rhesus macaque MHC class I alleles characterized to date (Loffredo et al. 2007; Knapp et al. 1997; Kaizu et al. 2007) is intriguing. This absence is likely the result of a combination of factors, including but not limited to a probable founder effect following the 1978 moratorium on importation of Indian-origin animals and the consequent breeding of captive populations in the USA.

Regardless of the evolutionary ramifications of MHC polymorphism and function, our findings have important practical implications because of the role of Chinese-origin rhesus macaques in biomedical research. The identification of key HLA-like specificities in macaque populations of Chinese origin provides the scientific community with valuable tools to evaluate disease pathogenesis and vaccine concepts in a setting more reflective of the global human population, with implications for broader population coverage.