Introduction

Rhesus macaques have been extensively used in infectious disease research for several indications, most notably HIV/AIDS. Macaque models have provided key insights into disease pathogenesis and have enabled the evaluation of novel vaccine concepts. Two subspecies, Indian-origin and Chinese-origin rhesus macaques, have been utilized extensively in AIDS research and for other models of infectious disease (Desrosiers 1990; Gardner and Luciw 2008; Haigwood 2009; ILAR 2003; Kindt et al. 1992; Persidsky and Fox 2007). While physiologically these two subspecies appear to be nearly identical, genetic factors that can affect immune responses are quite varied, which may underlie the disparate outcomes of infections seen in these two subspecies (Joag et al. 1994; Ling et al. 2002; Otting et al. 2007, 2005). In the context of simian immunodeficiency virus (SIV) research, the vast majority of studies performed with Indian rhesus macaques have shown progression to AIDS in a shorter time than is seen in HIV-infected humans (Miller et al. 1989; Smith et al. 1999). For this reason, researchers have investigated other nonhuman primate animal models to more closely mimic HIV infections in humans. Chinese rhesus macaques have been an interesting choice because SIV infection in these animals yields slower disease progression, more similar to HIV infection in humans (Joag et al. 1994; Ling et al. 2002; Reimann et al. 2005).

Several independent observations have implicated cellular immunity, specifically cytotoxic T lymphocyte (CTL) responses, in the control of AIDS viral replication (Brander et al. 2001; Kalams et al. 1999; Kuroda et al. 1999; Schmitz et al. 1999). Major histocompatibility complex (MHC) class I and II molecules determine the repertoire of T cell responses that an individual can develop against SIV and/or any other foreign pathogen (Parham 2005). The Indian rhesus macaques have been extensively characterized in terms of their MHC allele composition, resulting in instrumental findings in the setting of SIV infection, including the discovery of viral evasion from CTL responses (Allen et al. 2000; Evans et al. 1999) and identification of specific MHC alleles which influence disease progression (Loffredo et al. 2007; Mothe et al. 2003; O'Connor et al. 2003; Pal et al. 2002; Yant et al. 2006). Several macaque MHC (Mamu) class I alleles, including Mamu-A*01 (Allen et al. 1998), B*17 (Mothe et al. 2002), and B*01 (Loffredo et al. 2005) among others (Loffredo et al. 2004; Sette et al. 2005), are expressed with high frequency (over 10%) in specific macaque populations. The fact that Indian rhesus macaques used in biomedical research in the USA have been interbreeding since 1978, when India banned the exportation of these animals (Southwick and Siddiqi 1988), is probably a major contributing factor to the high frequency of expression of these MHC class I molecules. Because of the extensive characterization of their MHC alleles, Indian rhesus macaques are the most widely utilized animal model in AIDS research studies (Gardner and Luciw 2008; Patterson and Carrion 2005; Persidsky and Fox 2007; Watkins et al. 2008). However, as previously mentioned, the rapid progression to disease after SIV infection of Indian rhesus macaques and, more recently, the increased demand for these animals have led to a desire to develop alternative animal models.

Chinese rhesus macaques are relatively accessible for research, but are not as well characterized at their MHC loci. Although studies have been performed in recent years to address this shortcoming (Karl et al. 2008; Ma et al. 2009; Otting et al. 2007, 2005, 2008; Ouyang et al. 2008; Wiseman et al. 2009), these studies have not yet generated any functional data in terms of epitopes recognized or the specific biological relevance of these MHC molecules upon infection. This lack of information makes it difficult to interpret data with respect to immune correlates of protection, especially in terms of CTL immune responses, and thereby severely hinders the greater use of Chinese macaques as animal models for infectious disease research.

Thus, while researchers have shown that Chinese rhesus macaques are of value as animal models for AIDS vaccine development and for other pathogens, their full potential has not been realized due to missing functional MHC and genetic information. In this study, we sought to characterize 50 unique Chinese rhesus macaque samples in detail for their entire MHC class I allele composition. Since these animals were derived from several national primate centers and sources, this provided a glimpse into the MHC composition of animals that reflect their vast geographic diversity. Furthermore, we thoroughly characterized the MHC/peptide binding motif of the most common MHC class I molecule in Chinese rhesus macaques, yielding the first functional MHC data in this subspecies.

Materials and methods

Sample acquisition, RNA isolation, and cDNA synthesis

We obtained peripheral blood mononuclear cells (PBMCs) from 12 animals at the Tulane National Primate Research Center (New Orleans, LA, USA) and from 11 animals at the Scripps Research Institute (La Jolla, CA, USA). Blood from 27 animals at the Washington National Primate Research Center (Seattle, WA, USA) was collected in 2 mL PAXgene Blood RNA tubes (Qiagen, Valencia, CA, USA).

We isolated RNA and DNA from PBMC samples using the Qiagen QIAshredder tissue homogenizer and the Qiagen AllPrep DNA/RNA Mini kit following the manufacturer's protocols. We used the PAXgene Blood RNA kit (Qiagen) to isolate RNA from the PAXgene Blood RNA tubes. Complementary DNA (cDNA) was synthesized using the Superscript III First-Strand Synthesis kit for RT-PCR (Invitrogen, Carlsbad, CA, USA) for all the samples.

PCR amplification, cloning, and sequencing of MHC class I transcripts

We performed polymerase chain reaction (PCR) for the MHC class I transcripts and cloned them as previously described (Karl et al. 2008). DNA from each clone was subjected to bidirectional sequencing using previously published primer sequences (Wiseman et al. 2007). Genewiz, Inc. (San Diego, CA, USA) sequenced the samples using the Sanger dideoxy method on an Applied Biosystems Prism 3730xl DNA analyzer (Foster City, CA, USA). We utilized the CodonCode Aligner (CodonCode, Dedham, MA, USA) software package to analyze the sequence data. To identify the most highly expressed transcripts, we sequenced 88 clones from each animal. We required a sequence to be found in at least three clones from a unique animal/locus pair to avoid analysis of aberrant mutations introduced in the PCR. Novel sequences were submitted to GenBank. Additionally, sequence information was submitted to the IMGT/MHC Nonhuman Primate Immuno Polymorphism (IPD-MHC) Nomenclature Committee for naming of the novel alleles (Robinson et al. 2003).
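The three-clone requirement can be illustrated with a simple count per animal/locus pair. The following is a minimal sketch, not the analysis pipeline used in this study: the function name `supported_sequences`, the tuple layout of the input, and the toy clone data are all assumptions made for illustration.

```python
from collections import Counter

MIN_CLONES = 3  # threshold stated in the text

def supported_sequences(clones, min_clones=MIN_CLONES):
    """Return {(animal, locus): sorted sequences seen in >= min_clones clones}.

    `clones` is an iterable of (animal_id, locus, sequence) tuples; this
    layout is hypothetical and chosen only for the illustration.
    """
    counts = Counter(clones)
    kept = {}
    for (animal, locus, seq), n in counts.items():
        if n >= min_clones:
            kept.setdefault((animal, locus), []).append(seq)
    return {key: sorted(seqs) for key, seqs in kept.items()}

# Toy data (invented): one sequence seen three times, one seen once,
# and one seen twice at a second locus.
toy = (
    [("r01", "A", "ATGGCG")] * 3
    + [("r01", "A", "ATGGCA")]
    + [("r01", "B", "ATGCGG")] * 2
)
result = supported_sequences(toy)
```

In this toy case only the sequence observed in three clones is retained; the singleton and the two-clone sequence are treated as possible PCR artifacts and dropped.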

Creation of stable MHC class I transfectants

Stable MHC class I transfectants were produced in the MHC class I-deficient EBV-transformed B-lymphoblastoid cell line 721.221. For each of the alleles studied (Mamu-A1*02201 and Mamu-B*08301), we created an expression construct by subcloning a full-length allele transcript into the pcDNA 3.1 vector (Invitrogen). This construct was then used to transfect 721.221 cells using an Amaxa Nucleofector II transfection machine (Lonza AG, Walkersville, MD, USA).

Positional scanning combinatorial library and peptide synthesis

Positional scanning combinatorial libraries (PSCL) were synthesized as previously described (Pinilla et al. 1999). PSCL are composed of systematically arranged mixtures, and each mixture in the library contains 9-mer peptides with one fixed residue at a single position. With each of the 20 naturally occurring residues represented at each position along a 9-mer backbone, the entire library consisted of 180 peptide mixtures.
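The library layout follows directly from the arithmetic of 20 residues at each of 9 positions. As a minimal sketch (the function names and the use of "X" to mark randomized positions are conventions assumed here, not taken from the synthesis protocol), the 180 mixtures can be enumerated as follows:

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes
PEPTIDE_LENGTH = 9

def pscl_mixtures(length=PEPTIDE_LENGTH, residues=AMINO_ACIDS):
    """List (position, fixed_residue) pairs, one per mixture; positions are 1-based."""
    return [(pos, aa) for pos in range(1, length + 1) for aa in residues]

def mixture_pattern(position, residue, length=PEPTIDE_LENGTH):
    """Render a mixture as a string, with 'X' marking the randomized positions."""
    return "".join(residue if i == position else "X" for i in range(1, length + 1))

mixtures = pscl_mixtures()
print(len(mixtures))              # 9 positions x 20 residues
print(mixture_pattern(2, "P"))    # proline fixed at position 2
```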

Peptides utilized in screening studies were purchased as crude or purified material from Mimotopes (Minneapolis, MN, USA/Clayton, Victoria, Australia), Pepscan Systems B. V. (Lelystad, Netherlands), A and A Labs (San Diego, CA, USA), GenScript Corporation (Piscataway, NJ, USA), or the Biotechnology Center at the University of Wisconsin-Madison (Madison, WI, USA). Peptides synthesized for use as radiolabeled ligands were synthesized by A and A Labs and purified to >95% homogeneity by reverse-phase high performance liquid chromatography (HPLC). Purity of these peptides was determined using analytical reverse-phase HPLC and amino acid analysis, sequencing, and/or mass spectrometry. Peptides were radiolabeled utilizing the chloramine T method (Sidney et al. 2001). Lyophilized peptides were resuspended at 4–20 mg/ml in 100% DMSO, then diluted to required concentrations in PBS + 0.05% (v/v) Nonidet P40 (Fluka Biochemika, Buchs, Switzerland).

SIV peptide sequences were derived from the SIVmac239 sequence, GenBank accession M33262 (Kestler et al. 1990).

MHC purification and peptide binding assays

HLA and Mamu class I MHC purification was performed by affinity chromatography using the W6/32 and/or B123.2 class I antibodies, as previously described (Loffredo et al. 2009; Sidney et al. 2001, 2005). Protein purity, concentration, and depletion efficiency steps were monitored by SDS-PAGE.

Quantitative assays for peptide binding to detergent-solubilized MHC class I molecules were based on the inhibition of binding of a high affinity radiolabeled standard probe peptide and performed as detailed in prior studies (Loffredo et al. 2004; Schneidewind et al. 2008; Sidney et al. 2001, 2005). Peptides were tested at six different concentrations covering a 100,000-fold dose range in three or more independent assays. Control wells to measure non-specific (background) binding were also included. In each experiment, a titration of the unlabeled version of the radiolabeled probe was also tested as a positive control for inhibition. The radiolabeled peptide utilized for the Mamu-A1*02201 and HLA-B7 supertype assays was 1021.05 (FPFKYAAAF, a B35 consensus sequence), with the exception of HLA-B*0702, which used 1075.23, a human A2 signal sequence p5 analog peptide (APRTLVYLL). For each peptide, the concentration of peptide yielding 50% inhibition of the binding of the radiolabeled probe peptide (IC50) was calculated. Under the conditions used, where [radiolabeled probe] < [MHC] and IC50 ≥ [MHC], the measured IC50 values are reasonable approximations of the true Kd values (Cheng and Prusoff 1973; Gulukota et al. 1997; Sette et al. 1994a).
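To illustrate how an IC50 can be read from a six-point titration, the sketch below interpolates linearly between the two doses that bracket 50% inhibition, on a log10 concentration scale. The calculation method is an assumption made for illustration, since the fitting procedure is detailed only in the cited studies; the function name and the titration values are likewise invented.

```python
import math

def ic50_by_interpolation(concentrations_nM, percent_inhibition):
    """Estimate IC50 by linear interpolation of % inhibition against log10(dose).

    Returns None when no pair of adjacent doses brackets 50% inhibition.
    """
    points = sorted(zip(concentrations_nM, percent_inhibition))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        if y_lo <= 50.0 <= y_hi and y_hi != y_lo:
            frac = (50.0 - y_lo) / (y_hi - y_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None

# Invented titration: six 10-fold steps spanning a 100,000-fold dose range.
doses = [0.1, 1, 10, 100, 1000, 10000]
inhibition = [2, 10, 30, 70, 92, 99]
estimate = ic50_by_interpolation(doses, inhibition)
```

A peptide that never reaches 50% inhibition within the tested range returns `None`, which corresponds to reporting its IC50 as greater than the highest dose tested.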

Bioinformatic analysis

We performed analysis of the PSCL data as described previously (Sidney et al. 2008). Briefly, IC50 nM values for each residue–position mixture were standardized as a ratio to the geometric mean IC50 nM value of the entire set of 180 mixtures and further normalized to the average of libraries tested at each position. To identify predicted binders, all possible 9-mer peptides in SIVmac239 sequences were scored using the matrix values derived from the PSCL analysis of Mamu-A1*02201. The final score for each peptide represents the product of the corresponding matrix values for each peptide residue–position pair. Peptides scoring among the top 3.0% were selected for binding analysis.
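The scoring and selection steps can be sketched as follows. This is an illustration under stated assumptions rather than the code used in the study: it starts from a finished matrix in which a larger value means better binding, the function names are hypothetical, and the rounding rule for the top fraction (round up, keep at least one peptide) is a choice made here. The toy example uses 3-mers and a three-letter alphabet to keep the matrix readable.

```python
import math

def score_peptide(peptide, matrix):
    """Product of matrix values over each residue-position pair.

    `matrix[i][aa]` is the value for residue `aa` at 0-based position `i`.
    """
    return math.prod(matrix[i][aa] for i, aa in enumerate(peptide))

def all_nmers(sequence, n=9):
    """Every overlapping n-mer in a protein sequence."""
    return [sequence[i:i + n] for i in range(len(sequence) - n + 1)]

def top_fraction(peptides, matrix, fraction=0.03):
    """Peptides ranked by descending score, truncated to the top `fraction`."""
    ranked = sorted(peptides, key=lambda p: score_peptide(p, matrix), reverse=True)
    keep = max(1, math.ceil(len(ranked) * fraction))
    return ranked[:keep]

# Toy 3-position matrix (invented values) favoring P at position 2 and F at the C terminus.
toy_matrix = [
    {"A": 1.0, "P": 0.5, "F": 1.0},
    {"A": 1.0, "P": 20.0, "F": 0.2},
    {"A": 2.0, "P": 0.1, "F": 10.0},
]
candidates = all_nmers("AAPFA", n=3)          # AAP, APF, PFA
selected = top_fraction(candidates, toy_matrix)
```

Because the score is a product, a single strongly disfavored residue at an anchor position can outweigh favorable residues elsewhere, which is the intended behavior of a multiplicative matrix.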

Phylogenetic analysis

We assembled representative MHC class I sequences from humans (HLA), Chinese, and Indian rhesus macaques. Sequences were normalized to 1,068 nucleotides in length and aligned using the ClustalX program (Thompson et al. 2002). A phylogenetic tree was built using the neighbor-joining method (Saitou and Nei 1987). One thousand bootstrap samples were analyzed to ensure reliable clustering.

Accession codes

MHC class I allele sequences were submitted to GenBank (accession numbers GQ902065–GQ902079, GU057837, GU057840–GU057846, GU080236–GU080246, GU120061, GU190191–GU190209, and GU198751–GU198752).

Results

Non-overlapping MHC class I repertoires in Indian and Chinese rhesus macaques

To analyze the frequency and functional features associated with the MHC class I alleles expressed by Chinese origin macaques, we assembled a cohort of 50 different animals derived from various colonies currently utilized in biomedical research. Samples were obtained from the Scripps Research Institute (n = 11), the Tulane National Primate Research Center (n = 12), and the Washington National Primate Research Center (n = 27). cDNA clones were sequenced from each of the individual macaques and assembled according to the described methods to generate a complete snapshot of the highly expressed MHC class I transcripts within the population of 50 animals.

In our study, we detected 58 distinct MHC class I A and B loci transcripts (Supplemental Table 1 online). Only nine of these had been previously detected in Indian-origin macaques (Mamu-A2*2402, Mamu-A4*1403, Mamu-B*00301, Mamu-B*00401, Mamu-B*00703, Mamu-B*03003, Mamu-B*03901, Mamu-B*04002, and Mamu-B*06901; Karl et al. 2008; Otting et al. 2007, 2005, 2008), yielding 49 sequences that were only detected in Chinese rhesus macaques. These results demonstrate that the repertoire of allelic variants found in Indian- and Chinese-origin macaques is largely non-overlapping. The limited allele sharing between Indian- and Chinese-origin animals was previously noted and suggests that some of the allelic polymorphism seen in the Chinese population may have arisen after geographic isolation of the two subspecies (Otting et al. 2007).

High degree of polymorphism in Mamu-A and B alleles in Chinese macaques

Of the 58 MHC class I A and B loci transcripts identified in this study, 28 represent novel sequences (13 novel alleles from the Mamu-A locus and 15 from the Mamu-B locus), including 16 that were detected in more than one animal (Table 1). Sixteen of the novel transcripts encode unique protein products, while 12 of the novel sequences are full-length transcripts of sequences that had been previously published as partial coding sequences. Two of the novel sequences, B*03701 and A1*0040202, are noteworthy because they exhibit very distinct nucleotide and amino acid differences from any full-length published allele, yet are present with appreciable frequency within our population (four and seven animals, representing 8% and 14%, respectively). Interestingly, an allele previously identified in Cynomolgus macaques was also identified in one animal (Mamu-B*1470101, identical to Mafa-B*04701). Given that there were 58 distinct transcripts detected in a cohort of 50 animals, these results demonstrate that a high amount of polymorphism exists within the MHC class I A and B loci of Chinese-origin macaques. Furthermore, that 48.3% (28/58) of the transcripts were novel underscores the need for continued sequencing efforts in this population.

Table 1 Novel and extended Mamu MHC class I allele transcripts detected in multiple animals

MHC class I frequencies detected in Chinese macaques from multiple colonies

A total of 32 alleles detected in this study share translated sequences with alleles that had been reported in previous studies utilizing different animal cohorts (Karl et al. 2008; Ma et al. 2009; Otting et al. 2007; Ouyang et al. 2008). These alleles are of interest, since they are likely to represent alleles truly more prevalent in the Chinese macaque population and not alleles whose frequencies are inflated by founder effects and inbreeding in specific animal colonies. Although we have not performed pedigree analysis on the animals to ascertain their relationships due to their diverse origins and the lack of complete history on their initial Chinese colony information, we feel that the diversity of the sources reduces the chance that allele frequencies might be artificially inflated due to familial relatedness.

Accordingly, Table 2 lists the alleles detected both in our cohort and in the previously published studies, together with their frequency within our cohort and the frequencies calculated by compilation of previous studies. Where possible, in compiling these data we grouped previously published partial coding sequences with their full-length cognate sequences from our study for purposes of frequency analysis. This analysis reveals 11 alleles that are found in multiple animals in both sets of studies, exhibiting a combined frequency greater than or equal to 5.6% for each allele. These alleles are Mamu-A1*02201, Mamu-A1*02601, Mamu-A7*0103, Mamu-B*08301, Mamu-B*06601, Mamu-B*03901, Mamu-B*01001, Mamu-B*00301, Mamu-B*08701, Mamu-B*00101, and Mamu-A2*0102. Combined, these alleles provide coverage for approximately 68% of all Chinese rhesus animals studied thus far by us and others, assuming that none of these alleles are co-inherited. In conclusion, these data identify a set of Mamu MHC class I alleles that are most frequent within Chinese-origin macaques derived from disparate animal colonies and therefore represent a logical target for further analysis and characterization.

Table 2 Mamu MHC class I alleles detected in the present study and in previous studies

Establishment of the Mamu-A1*02201 peptide binding assay

Next, we decided to investigate the function of the most frequent allele identified above, Mamu-A1*02201. MHC class I molecules expressed in a single cell transfectant were purified by affinity chromatography. A transfectant expressing the common allele Mamu-B*08301, previously generated in the context of an independent project, was also included in these investigations as a control.

Endogenous ligands or defined epitopes for the Mamu-A1*02201 or Mamu-B*08301 molecules have not been reported. However, previous studies demonstrated that HLA supertype ligands also bind to MHC molecules expressed in other species, such as chimpanzees (Pan troglodytes) or mice (Mus musculus; McKinney et al. 2000; Sette et al. 2005; Sidney et al. 2006). Accordingly, we investigated the ability of purified Mamu-A1*02201 or Mamu-B*08301 to bind a panel of radiolabeled ligands representative of the main human HLA supertypes (HLA-A1, HLA-A2, HLA-A3, HLA-A24, HLA-B7, and HLA-B44; Fig. 1a, c).

Fig. 1

The development of the Mamu-A1*02201 MHC/peptide binding assay. Radiolabeled ligand probes for six major HLA class I supertypes were used in a Mamu-A1*02201 and c Mamu-B*08301 direct binding dose titration experiments to ascertain binding potential to the purified MHC molecules. HLA supertype ligands are denoted by: solid circle, YTAVVPLVY, human J chain p102 (HLA-A1 supertype); open circle, YVIKVSARV, MAGE1 p282 (HLA-A2 supertype); solid square, KSINKVYGR, Vaccinia p63 (HLA-A3 supertype); open square, AYIDNYNKF, A24 consensus sequence (HLA-A24 supertype); solid triangle, FPFKYAAAF, B35 consensus sequence (HLA-B7 supertype); and open triangle, SEIDLILGY, naturally processed (HLA-B44 supertype). In the case of Mamu-A1*02201, the HLA-B7 supertype ligand was associated with prominent binding, whereas Mamu-B*08301 displayed significant binding only to the HLA-A3 supertype ligand. b Inhibition of HLA-B7 ligand binding to Mamu-A1*02201 by unlabeled ligand in dose titration binding experiments (1.9 nM) demonstrated specificity of ligand binding

In the case of Mamu-A1*02201 (Fig. 1a), the B7 supertype ligand FPFKYAAAF was associated with prominent binding, and significant binding was observed with as little as 0.43 nM of purified MHC. Binding was selective, since no significant binding was observed for any of the other radiolabeled supertype ligands examined. Furthermore, the binding was specific, in that it could be inhibited by excess unlabeled FPFKYAAAF, with an IC50 of approximately 1.9 nM (Fig. 1b).

Further evidence of the allelic specificity of the binding activity was demonstrated by the experiments examining purified Mamu-B*08301. This molecule did not bind the FPFKYAAAF ligand, but did bind the A3 supertype ligand 3035.0857 (KSINKVYGR) (Fig. 1c). In conclusion, these results demonstrate the development of a Mamu-A1*02201 binding assay, thus enabling detailed investigation of the peptide binding specificity of this MHC molecule.

Definition of Mamu-A1*02201 peptide-binding motif

To determine the capacity of Mamu-A1*02201 to bind peptide ligands of differing lengths, combinatorial peptide libraries of 8, 9, 10, and 11 amino acids in length, synthesized as described in the Materials and methods section, were tested as inhibitors in the Mamu-A1*02201 binding assay (data not shown). The 9-mer library exhibited the best binding affinity, with an average IC50 of 14,700 nM. The 10-mer library bound with slightly lower affinity (19,600 nM). The 8-mer and 11-mer libraries bound with even lower affinities (IC50 greater than 41,000 nM). These data suggest the optimal peptide length for binding to Mamu-A1*02201 is nine amino acids.

We next tested the capacity of a 9-mer PSCL to bind Mamu-A1*02201 molecules (Fig. 2). The binding capacity relative to the geometric mean for the entire PSCL (IC50 = 2681 nM) was determined for each mixture with a defined amino acid and further normalized to the average of activity of the mixtures tested at each position, as previously described (Sidney et al. 2008). The resulting matrix was used to derive a Mamu-A1*02201 motif (Fig. 3). For this analysis, we defined preferred residues as those that displayed 20-fold higher binding capacity than the position average and defined tolerated residues as those that displayed fourfold higher binding. We defined a main binding anchor as a position in which at least half of the residues (≥10) are associated with binding capacity either fourfold higher (in blue in Fig. 2) or lower (in orange in Fig. 2) than the position average. Furthermore, secondary anchor positions are defined as those associated with at least five residues (25%), but less than ten, meeting the described binding criteria.
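The anchor criteria above amount to counting, at each position, how many of the 20 residues deviate at least fourfold from the position average. A minimal sketch of that rule is given below; the function name, the inclusive treatment of the fourfold boundary, and the toy ratios are assumptions made for illustration.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def classify_position(ratios, fold=4.0):
    """Classify one position from its residue ratios relative to the position average.

    A residue counts as affecting binding when its ratio is >= fold or <= 1/fold.
    """
    affected = sum(1 for r in ratios.values() if r >= fold or r <= 1.0 / fold)
    if affected >= 10:
        return "main anchor"
    if affected >= 5:
        return "secondary anchor"
    return "non-anchor"

def toy_ratios(n_affected, value):
    """Invented data: the first n_affected residues get `value`, the rest are neutral (1.0)."""
    return {aa: (value if i < n_affected else 1.0) for i, aa in enumerate(AMINO_ACIDS)}

print(classify_position(toy_ratios(12, 0.1)))  # 12 of 20 residues deviate
print(classify_position(toy_ratios(5, 5.0)))   # 5 of 20 residues deviate
print(classify_position(toy_ratios(4, 5.0)))   # 4 of 20 residues deviate
```

Counting deviations in either direction means a position is flagged as an anchor when many residues are strongly disfavored, even if only one residue is strongly preferred.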

Fig. 2

The relative influence of amino acids at each position on the binding to Mamu-A1*02201. The 9-mer positional scanning combinatorial library was tested for binding to Mamu-A1*02201. Values represent the ratio of the IC50 nM of each pool at a given position relative to the geometric mean for the entire library (2681 nM) and further normalized to the average of libraries tested at each position. Ratios of 0.25 or less are highlighted in orange; >4 are in blue

Fig. 3

Map of the Mamu-A1*02201 motif. Pictorial summary representation of the Mamu-A1*02201 PSCL matrix, indicating residues contributing positive or negative binding potential by position. The total number of residues affecting binding by position is summed, with a double-digit score indicating a primary anchor position. Preferred residues at anchor positions, defined as residues showing a 20-fold increase in binding capacity versus the position average, are highlighted in larger font. Position 2 and the C terminus are identified as main anchor positions

Accordingly, position 2 and the C terminus are the main anchor positions. At position 2, proline (P) was overwhelmingly preferred above all others, and V, A, and I were somewhat tolerated. At the C terminus, the aromatic residues F and Y were preferred, and M, L, and A were tolerated. Using the criteria described above, position 1 was identified as a secondary anchor with a strong preference for R, while S and A residues were tolerated. Additionally, at position 1, the residues D and E, and to a lesser extent P, were associated with substantial decreases in binding capacity. Significant, but less pronounced, effects were also seen in positions 3 and 8, suggesting that these positions are also secondary anchors. The influence of specific residues at most other non-anchor positions was minimal, but in general, aromatic and aliphatic residues contributed positive binding energy, whereas charged residues, both acidic and basic, were associated with deleterious effects.

Identification of Mamu-A1*02201 peptide ligands

Next, we tested the usefulness of the PSCL-based motif as a means to identify Mamu-A1*02201 binding peptides. We targeted the SIV proteome for these experiments, as Chinese rhesus macaques are an increasingly utilized animal model in SIV pathogenesis studies (Burdo et al. 2005; Degenhardt et al. 2009; Joag et al. 1994; Ling et al. 2002; Monceaux et al. 2007; Reimann et al. 2005; Trichel et al. 2002). The PSCL matrix was used to score 9-mer peptides in the SIVmac239 proteome. Eighty-one peptides scoring in the upper 3.0% range were tested for binding capacity (Supplemental Table 2 online). A set of 40 lower-scoring control peptides, ranking between 5% and 50%, was also tested.

As previously described, 500 nM has been established as an appropriate threshold for sufficient peptide binding in SIV epitope and in vivo T cell recognition studies (Allen et al. 2001; Loffredo et al. 2005, 2004; Mothe et al. 2002; Sette et al. 2005, 1994a, 1994b; van der Most et al. 1998; Vitiello et al. 1996). In our experiments, we used the 500 nM threshold to classify peptides as positive binders. In total, 13 of the 81 “top 3% score” peptides bound Mamu-A1*02201 with an affinity of 500 nM or less (Table 3). By contrast, none of the 40 lower-scoring control peptides resulted in similar affinities.

Table 3 The efficiency of PSCL matrices in predicting Mamu-A1*02201 binders

Mamu-A1*02201 and the HLA-B7 supertype MHC molecules share overlapping binding repertoires

The results above define a Mamu-A1*02201 motif that closely matches the motif associated with HLA-B7 supertype alleles. Early studies examining peptide binding motifs in human MHC molecules have shown that many HLA molecules display significant overlap in peptide-binding specificity (Sidney et al. 1996). The HLA-B7 supertype motif describes MHC molecules that display a preference for peptides that have hydrophobic or aromatic residues at the C terminus and proline at position 2 (Sidney et al. 1995). Furthermore, HLA-B7 supertype molecules and Mamu-A1*02201 share residues at key positions within the B pocket where position 2 of the bound peptide typically resides (data not shown). This provides a molecular explanation for the observation that the HLA-B7 supertype ligand FPFKYAAAF bound Mamu-A1*02201 molecules with very high affinity. Based on this, we hypothesized that the crossreactivity between these human and macaque alleles might be relatively widespread (Dzuris et al. 2000). To test this hypothesis, we selected three HIV-derived epitopes restricted to HLA-B7-like molecules from the literature (Rowland-Jones et al. 1995; Shiga et al. 1996; Wilson et al. 1999) and tested them against Mamu-A1*02201 (Table 4). The selected HIV-derived epitopes displayed binding to Mamu-A1*02201, although the degree of crossreactivity varied among the peptides. The HLA-B*3501-restricted Pol 330 peptide bound Mamu-A1*02201 (9.7 nM) almost as well as its HLA restricting element (7.9 nM). The HLA-B*0702-restricted Nef 68 peptide also bound Mamu-A1*02201 (230 nM), but much less well than its HLA restricting element (1.0 nM). The remaining peptide, Nef 137, showed much weaker binding to Mamu-A1*02201 (1140 nM) in comparison to its HLA restricting element (9.5 nM).

Table 4 HLA-B7 supertype-restricted HIV epitopes tested for Mamu-A1*02201 binding affinity

Conversely, we also tested the 13 SIV-derived binders (IC50 ≤ 500 nM) identified in the course of the analysis described above for binding to HLA-B7 supertype molecules (HLA-B*0702, B*3501, B*5101, B*5301, and B*5401). Ten of the 13 top peptide binders to Mamu-A1*02201 also bound at least one molecule of the B7 supertype with affinity better than 500 nM (Fig. 4). HLA-B*0702 displayed the highest degree of crossreactivity, binding eight of 13 (62%) Mamu-A1*02201 binders. HLA-B*3501 was also highly crossreactive, binding seven of 13 (54%) Mamu-A1*02201 binders, five of which were shared with HLA-B*0702.

Fig. 4

SIVmac239-derived Mamu-A1*02201 binders tested against HLA-B7 supertype alleles. The top 13 SIV-derived binders (Table 3) for Mamu-A1*02201 were tested for crossreactivity with HLA-B7 supertype molecules (B*0702, B*3501, B*5101, B*5301, and B*5401). Extensive crossreactivity between HLA-B7 supertype molecules and Mamu-A1*02201 was demonstrated. HLA-B*0702 displayed the highest degree of crossreactivity with Mamu-A1*02201. Affinities highlighted in blue possess an IC50 < 50 nM. Those affinities in the 50–500 nM range are highlighted in green. Dashes indicate binding affinity >10,000 nM

Evolutionary origin of Chinese rhesus macaque HLA-B7 functional analogy

The data presented above illustrate that Chinese-origin rhesus macaques exhibit a high degree of MHC haplotypic diversity and that it is largely non-overlapping with that of rhesus macaques of Indian origin. Functionally, the most common Chinese allele described above is associated with an HLA-B7 supertype specificity never, thus far, detected in Indian macaques, despite analysis of many of the most common alleles expressed in these animals.

To investigate the origin of HLA-B7 specificity in Chinese-origin rhesus macaques, we built a phylogenetic tree using representative MHC allele sequences from humans and both Indian and Chinese rhesus macaques. For humans, we included one HLA allele (HLA-A*01010101, HLA-A*02010101, HLA-A*03010101, HLA-A*24020101, HLA-B*070201, HLA-B*080101, HLA-B*270502, HLA-B*44020101, HLA-B*580101, and HLA-B*150101) from each of the known supertypes (A1, A2, A3, A24, B7, B8, B27, B44, B58, and B62, respectively) and the four additional HLA-B7 supertype alleles (HLA-B*350101, HLA-B*510101, HLA-B*530101, and HLA-B*5401) that were used in this study. For Indian rhesus macaques, we selected 14 sequences represented among the most common specificities, namely Mamu-A*00101, Mamu-A*00201, Mamu-A*00301, Mamu-A*00401, Mamu-A*00701, Mamu-A*00801, Mamu-A*01101, Mamu-B*0010101, Mamu-B*00201, Mamu-B*00501, Mamu-B*00701, Mamu-B*00801, Mamu-B*01201, and Mamu-B*01701 (Boyson et al. 1996; Kaizu et al. 2007; Knapp et al. 1997; Loffredo et al. 2007; Voss and Letvin 1996). Similarly, the 14 Chinese rhesus macaques sequences included in the analysis were selected from the most frequent alleles listed in Table 2 above.

Despite functional evidence that Mamu-A1*02201 is an analog of HLA-B7 supertype molecules, the resulting phylogenetic tree (Fig. 5) shows that this Mamu allele appears in the Mamu-A group, and not Mamu-B. This suggests that HLA-B7 specificity in humans and macaques did not arise through persistence of a common B7-like allele. On the contrary, these data indicate that HLA-B7 supertype specificity has arisen independently in humans and macaques, most likely as a result of convergent evolution (Sette and Sidney 1999).

Fig. 5

Phylogenetic tree of HLA and Mamu MHC class I sequences. Phylogenetic analysis of 14 human, 14 Indian rhesus, and 14 Chinese rhesus MHC class I sequences is shown. Neighbor-joining tree created based on 1,068 aligned nucleotide sites. Mamu sequences derived from Indian animals are prepended with “In” and are colored in orange. Sequences from the Chinese animals in this study are prepended with “Ch” and colored in blue. Mamu-A1*02201 is colored in yellow. The Mamu allele demonstrating HLA-B7-like specificity appears in the Mamu-A group and not Mamu-B, indicating this motif specificity between humans and macaques likely did not arise through persistence of a common allele

Discussion

Herein, we present the first functional characterization of a Chinese rhesus macaque MHC class I molecule. Our study stems from the analysis of MHC class I sequences derived from a sample of 50 different rhesus macaques of Chinese origin. The results highlighted an extreme degree of polymorphism at the A and B class I loci of Chinese rhesus macaques, possibly reflective of the vast geographical region from which they originated, which supports large populations with limited genetic interchange. This situation favors the generation and maintenance of independent polymorphisms in the species. Indeed, previous analysis of mitochondrial DNA in captive rhesus macaque populations showed greater genetic diversity in Chinese-origin versus Indian-origin rhesus macaques (Kanthaswamy and Smith 2004; Satkoski et al. 2008).

Our results further indicated that the polymorphisms found in Chinese macaques are largely non-overlapping with the polymorphisms associated with the more thoroughly investigated Indian macaques. This likely reflects the geographical distance between India and China and the low frequency of contemporary genetic exchanges between the two populations. These results also have practical implications, as they suggest that the alleles associated with specific disease resistance and susceptibility found in Indian rhesus macaques will not be found in Chinese rhesus macaques, and thus, new correlations will need to be established. Conversely, the rich polymorphism of MHC class I alleles in Chinese rhesus macaques represents a valuable opportunity for the scientific community, as it is likely that new and interesting phenotype associations will be revealed, thus helping to further the understanding of host–pathogen interactions. A recent study using next-generation sequencing technology indicated that the number and polymorphism of class I alleles expressed in these animals may have been vastly underestimated (Wiseman et al. 2009). Such advances in technology are revealing an unprecedented array of sequences in Chinese macaques, but this only underscores the need to continue investigating the functional implications of such polymorphism and its correlations with disease outcome.

Perhaps the most striking observation resulting from our studies is that the most common MHC class I allele found in the disparate Chinese rhesus macaque colonies we examined is associated with HLA-B7 supertype specificity. The HLA-B7 supertype is estimated to be the most abundant supertype represented in the human population, with an average prevalence of 49.5% across various ethnic groups (Sette and Sidney 1999). Our finding is significant in light of the fact that this specificity is thus far absent from the Indian rhesus MHC class I alleles characterized to date, which represent a large majority of typed animals (Allen et al. 1998; Dzuris et al. 2000; Hickman-Miller et al. 2005; Kaizu et al. 2007; Knapp et al. 1997; Loffredo et al. 2007, 2009; Mothe et al. 2002). The selective forces responsible for the repeated appearance of HLA-B7 specificities may somehow be absent in the Indian rhesus environment. Alternatively, a catastrophic bottleneck event could have eliminated this specificity from Indian rhesus populations. Similar drastic events appear to have shaped the evolution of MHC polymorphism in chimpanzees (de Groot et al. 2002). A remaining possibility is that a founder effect occurred due to limited representation of some class I alleles in the Indian rhesus macaques that were imported prior to the 1978 ban, with the result that subsequent generations show no evidence of expressing an MHC class I molecule with HLA-B7 supertype binding characteristics.

Mamu-A1*02201 shares a high degree of crossreactivity with both the HLA-B*0702 and HLA-B*3501 alleles. Previous studies have noted the similarity between specific Mamu and HLA class I molecules, such as Mamu-A*11 and HLA-B*44, and Mamu-B*08 and HLA-B*27 (Dzuris et al. 2000; Loffredo et al. 2009; Sette et al. 2005). It has also been suggested that functional HLA-B analogs exist within the Mamu class I proteome (Hickman-Miller et al. 2005), despite highly dissimilar primary amino acid sequences in the binding pockets of the molecules from the two species. These findings suggest that, as the number of Mamu class I molecules functionally characterized at the level of peptide binding specificity increases, specific pairs of HLA–Mamu class I molecules with similar specificity will continue to be identified. Thus, specific macaque alleles could be used as stringent models of the corresponding human HLA class I molecules; conversely, observations relating to specific Mamu molecules might inform analyses targeting specific human HLA class I variants.
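To make the notion of a shared peptide binding specificity concrete, the following Python sketch scans a protein sequence for 9-mer peptides carrying the canonical HLA-B7 supertype anchors, namely proline at position 2 and a hydrophobic or aliphatic residue at the C-terminus. This is a deliberately crude anchor-only filter and not the quantitative binding analysis used to characterize Mamu-A1*02201; the function name, the default peptide length, and the example sequence are hypothetical.

```python
# Assumed C-terminal residues for a B7-like anchor motif
# (hydrophobic or aliphatic side chains).
B7_LIKE_CTERM = frozenset("AILMVFWY")


def scan_b7_like(protein, length=9):
    """Return (1-based start, peptide) for windows matching the anchor motif.

    A window matches when position 2 is proline and the final residue is
    in B7_LIKE_CTERM. Secondary anchors are ignored in this sketch.
    """
    protein = protein.upper()
    hits = []
    for start in range(len(protein) - length + 1):
        peptide = protein[start:start + length]
        if peptide[1] == "P" and peptide[-1] in B7_LIKE_CTERM:
            hits.append((start + 1, peptide))
    return hits


# Arbitrary example sequence, not a real viral protein.
for position, peptide in scan_b7_like("GAPRTLVLLLK"):
    print(position, peptide)
```

Candidates that pass such a filter for both a Mamu molecule and its proposed HLA analog would still require binding assays to confirm crossreactivity.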

Finally, our investigations reveal a set of 11 alleles that, taken together, are found in multiple independent macaque colonies. Extrapolating from our data and previous data, we estimate that about 68% of animals in a given macaque colony will be positive for at least one of these alleles. It should be straightforward to establish PCR-SSP assays for these alleles and validate them as a genotyping tool for the scientific community. Likewise, we anticipate that peptide binding motifs can be defined for the most common Chinese rhesus MHC class I molecules following the approach used herein to characterize in detail the specificity of Mamu-A1*02201. The availability of detailed motifs will, in turn, allow prediction of epitopes restricted by the various common Chinese rhesus MHC class I molecules, as has been done for rhesus macaques of Indian origin (Peters et al. 2005). These algorithms will be regularly updated on the Immune Epitope Database Analysis Resource website (http://tools.iedb.org; Zhang et al. 2008), which is freely available to the scientific community.
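The colony coverage estimate mentioned above amounts to counting the animals that carry at least one allele from a genotyping panel. The short Python sketch below shows that calculation on made-up data; the animal identifiers, the placeholder allele names other than Mamu-A1*02201, and the function name are hypothetical, and the result bears no relation to the 68% figure reported here.

```python
def panel_coverage(genotypes, panel):
    """Fraction of animals positive for at least one allele in the panel.

    `genotypes` maps an animal identifier to the alleles detected in it.
    """
    if not genotypes:
        raise ValueError("no animals supplied")
    panel = set(panel)
    positive = sum(1 for alleles in genotypes.values() if panel & set(alleles))
    return positive / len(genotypes)


# Hypothetical typing results for a small colony.
colony = {
    "animal_1": {"Mamu-A1*02201", "allele_X"},
    "animal_2": {"allele_Y"},
    "animal_3": {"allele_Z"},
    "animal_4": {"allele_X", "allele_W"},
}
# Hypothetical panel; the study's set contains 11 alleles.
panel = {"Mamu-A1*02201", "allele_X", "allele_Y"}
print(f"{panel_coverage(colony, panel):.0%} of animals carry a panel allele")
```

Applied to PCR-SSP typing results from several colonies, the same count would show how well a fixed allele panel generalizes beyond the animals sampled.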