Rhesus (Macaca mulatta) and pigtail (Macaca nemestrina) macaques infected with simian immunodeficiency virus (SIV) or chimeric SIV/HIV (SHIV) are common animal models in the field of HIV/AIDS research. SIV and HIV type 1 are similar not only in genomic structure but also in the immunopathology they induce in their respective hosts (Liska et al. 1999). SIV and SHIV infection of macaques results in CD4+ T cell depletion and disease progression to immunodeficiency similar to those found in HIV-1-infected humans (Balamurali et al. 2010; Boberg et al. 2008; Henning et al. 2011; Kearney et al. 2011; Klatt et al. 2011; Reece et al. 2010; Smith et al. 1999).

The control of both acute and chronic HIV and SIV infections has been linked to CD8+ T cell responses that are MHC class I restricted (De Rose et al. 2008; Hatziioannou et al. 2009; Loffredo et al. 2004; Mankowski et al. 2008; Miller et al. 1991; Smith et al. 2005). Rhesus macaques have been extensively characterized in terms of their major histocompatibility complex (MHC) class I alleles, and the detailed characterization of the peptide-binding specificity of several rhesus macaque MHC class I molecules (Mamu) has proven to be a useful tool for identifying SIV epitopes recognized in infected animals (Allen et al. 2001; Allen et al. 1998; Hickman-Miller et al. 2005; Loffredo et al. 2009; Loffredo et al. 2005; Loffredo et al. 2004; Mothe et al. 2002; Peters et al. 2005; Sette et al. 2005; Sidney et al. 2000; Solomon et al. 2010; Southwood et al. 2011; Walsh et al. 2009). As a result, they have proven to be valuable animal models for HIV vaccine and pathogenesis studies, as well as for studies of several other human diseases such as influenza, hepatitis, and smallpox (Gardner and Luciw 2008). Unfortunately, because rhesus macaques are in high demand for such studies, they are becoming increasingly difficult to obtain for research purposes, a situation exacerbated by the continued export ban placed on them in 1978 (Hatziioannou et al. 2009; Smith et al. 2005).

Pigtail macaques represent a potential alternative animal model to shoulder the research burden caused by the scarcity of available rhesus macaques. Despite the initial identification of their MHC class I alleles (Mane) (Lafont et al. 2003; Lafont et al. 2007; Smith et al. 2005), the characteristics of their peptide binding properties are largely unknown and, as a result, their use in similar types of studies has been hindered. With the eventual goal of creating vaccines for SIV and HIV, it is crucial that more is known about the cellular immune responses of this increasingly important animal model.

The pigtail macaque class I type Mane-A1*082:01, formerly known as Mane A*0301, is the second most common pigtail macaque MHC allele, with a frequency of approximately 23% across several cohorts (Pratt et al. 2006), making it an attractive target for characterization. The B pocket primary structure of Mane-A1*082:01 is very similar to that of the HLA class I allele B*2705, which has been associated with long-term disease non-progression in HIV-infected individuals (Carrington and O'Brien 2003), thus making Mane-A1*082:01 all the more interesting in the context of HIV vaccine development studies. Characterization of the peptide binding specificity of Mane-A1*082:01 would allow researchers to expand the usefulness of these animals in future HIV/SIV studies, as well as several other potential disease indications.

As an initial step to characterize the Mane-A1*082:01 binding specificity, we developed an in vitro MHC-peptide binding assay utilizing purified MHC molecules. The assay was developed utilizing published methodologies (Sidney et al. 2008a; Sidney et al. 2001), and Mane-A1*082:01 molecules were purified from a 721.221 cell line stably transfected with the allele (Allen et al. 1998; Loffredo et al. 2009; Shimizu and DeMars 1989). As the radiolabeled probe, an analog (peptide 3263.0005; sequence DHQAAFQYI) of the SHIV Gag-derived Mane-A1*082:01-restricted T cell epitope (Lafont et al., manuscript in preparation) was designed to incorporate a tyrosine residue (I8 to Y) to allow for radiolabeling and to replace the endogenous methionine (M6 to F) to increase stability. In direct binding assays with 3263.0005, a signal-to-noise ratio of about 5 could be detected with as little as 10 nM of purified MHC (data not shown). Inhibition assays to validate binding specificity and sensitivity revealed that the binding of 3263.0005 could be inhibited by unlabeled peptide in a dose-dependent manner, with an IC50 of 8.2 nM.

Development of a sensitive binding assay allowed us to probe the specificity of Mane-A1*082:01 in detail. A panel of single amino acid substitution (SAAS) analogs, corresponding to substitution with each of the 20 naturally occurring amino acids at each position of peptide 3263.0005, was tested for its capacity to bind Mane-A1*082:01. To derive a detailed binding motif, the data were analyzed essentially as described previously (Sidney et al. 2008a). Briefly, the IC50 value (nM) for each singly substituted peptide was standardized as a ratio to the geometric mean IC50 value (nM) of the entire panel of 172 analog peptides, and then normalized at each position so that the value associated with optimal binding at that position corresponds to 1. For each position, an average (geometric) relative binding affinity (ARB) was calculated. The ratio between the ARB value of each position and that of the entire SAAS panel is denoted the specificity factor (SF). For SAAS panels, primary anchor positions are then defined as those with an SF ≥5. This criterion identifies positions where the majority of residues are associated with significant decreases in binding capacity, signifying position-specific stringency. The binding data are summarized in Table 1.
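The SAAS analysis described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the toy IC50 values are hypothetical, and the SF is assumed here to be the panel-wide geometric mean relative binding divided by the position's ARB, so that a larger SF indicates a more stringent position.

```python
import math

def geometric_mean(values):
    """Geometric mean of a sequence of positive numbers."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))

def saas_specificity(ic50_nm):
    """ic50_nm maps position -> {residue: IC50 in nM} for one SAAS panel.

    Returns (relative, arb, sf), each keyed by position.
    """
    # Standardize every IC50 against the geometric mean of the whole panel.
    panel_gm = geometric_mean(v for row in ic50_nm.values() for v in row.values())
    relative = {}
    for pos, row in ic50_nm.items():
        standardized = {aa: panel_gm / ic50 for aa, ic50 in row.items()}
        # Normalize so the best-binding residue at this position equals 1.
        best = max(standardized.values())
        relative[pos] = {aa: v / best for aa, v in standardized.items()}
    # ARB: geometric mean of the relative binding values at each position.
    arb = {pos: geometric_mean(row.values()) for pos, row in relative.items()}
    # Assumed SF direction: panel-wide average divided by the position's ARB.
    panel_arb = geometric_mean(v for row in relative.values() for v in row.values())
    sf = {pos: panel_arb / arb[pos] for pos in arb}
    return relative, arb, sf

def primary_anchors(sf, threshold=5.0):
    """Positions meeting the SF >= 5 criterion stated in the text."""
    return [pos for pos, value in sf.items() if value >= threshold]

# Hypothetical toy panel (invented values, two positions, three residues each).
toy_ic50 = {
    2: {"H": 10.0, "A": 100000.0, "K": 100000.0},
    4: {"A": 10.0, "G": 20.0, "S": 10.0},
}
relative, arb, sf = saas_specificity(toy_ic50)
anchors = primary_anchors(sf)
```

In this toy panel only position 2 passes the threshold, because nearly every substitution there abolishes binding, whereas position 4 tolerates all three residues.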

Table 1 SAAS-derived matrix describing 9-mer binding to Mane-A1*082:01

As shown in Table 1, position 2 and the C-terminus were defined as the main anchor positions, with SF of 24 and 5.3, respectively. At position 2, histidine (H) was clearly the most dominant residue. Substitution with any other residue resulted in a 30-fold or more decrease in binding capacity, relative to the (wild-type) peptide with H in position 2, and in total 17 amino acids were associated with reductions greater than 50-fold. At the C-terminus, the aromatic residue phenylalanine (F) was the most preferred. The aromatic residue tyrosine (Y) was the second most preferred, with an ARB of 0.232, followed by M (ARB = 0.158). A few other residues (R, L, C, and I) were tolerated at the C-terminus, with ARBs in the 0.02–0.1 range, but the remaining 13 amino acids were associated with 50-fold or greater reductions in binding capacity.

Secondary influences on peptide binding were most prominent in positions 5 and 7, where 7 and 10 substitutions, respectively, were associated with 50-fold or greater reductions in binding capacity, relative to the optimal residue. These positions, also associated with the next highest SFs (2.27 and 1.87, respectively), have been identified as secondary anchor positions for several HLA class I molecules (see, e.g., Ruppert et al. 1993; Sidney et al. 2008a). Influences at other positions were relatively minor.

The initial motif was defined on the basis of analogs of a single ligand. To verify that this motif was generally applicable to Mane-A1*082:01 ligands, and to determine the optimal ligand size for Mane-A1*082:01 molecules, we utilized previously described methodologies (Bairoch and Apweiler 2000; Geer et al. 2004; Udeshi et al. 2008) to elute, and then analyze by tandem mass spectrometry (MS/MS), ligands endogenously bound to purified Mane-A1*082:01 molecules. Data from the MS/MS experiments were searched against the SwissProt human database using the open mass spectrometry search algorithm (OMSSA) software to generate a list of candidate peptide sequences (ESM Table 1). Among the 46 peptides identified, 20 (43.5%) were nine residues in length. Longer peptides of 10 and 11 residues were also relatively common, being represented by 7 (15.2%) and 11 (23.9%) sequences, respectively. Only two 8-mer peptides (4.3%) were identified, suggesting that peptides shorter than nine residues are less well tolerated by Mane-A1*082:01. Similarly, although peptides longer than 11 residues were identified, together they represented less than 15% of all eluted ligands.

Next, the sequences of the eluted peptides were analyzed to determine the absolute number and relative fraction of occurrence for each individual amino acid at each position (ESM Table 2 and Table 2, respectively). To do this, all of the peptides were aligned beginning at their N-terminus as shown in ESM Table 2, and the corresponding amino acid frequencies were determined for positions 1 through 8, and then for the C-terminal residue. Utilizing the fraction of occurrence data (Table 2), we defined main anchor positions as those where the median fraction of occurrence across amino acids is 0, indicating that only a limited range of residues is found and implying that these positions are associated with the most stringent specificity. Accordingly, and in agreement with the SAAS data, position 2 and the C-terminus were identified as main anchors, and at both of these positions there was a strikingly dominant preference for a single residue. More specifically, histidine was found at position 2 in 33 (72%) of the 46 sequenced peptides, and tyrosine was found at the C-terminus in 31 peptides (67%). At these positions, 14 and 15 different amino acids, respectively, were never seen in any of the eluted ligands. A dominant presence of asparagine (N) was also noted in position 1, although a number of other residues were also tolerated, suggesting that this is a secondary anchor position.
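The frequency analysis and the median-based anchor criterion described above can be illustrated as follows. This is a sketch only: the peptides are synthetic placeholders rather than the eluted ligands, the function names are hypothetical, and it is assumed that the C-terminal residue is tallied only in the C-terminal column and not in the numbered positions.

```python
from collections import Counter
from statistics import median

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def position_fractions(peptides, n_positions=8):
    """Fraction of occurrence of each amino acid at positions 1..n and the C-terminus."""
    columns = {pos: [] for pos in range(1, n_positions + 1)}
    columns["C"] = []
    for pep in peptides:
        # N-terminal alignment; the last residue is kept out of the numbered columns.
        for i, aa in enumerate(pep[:-1][:n_positions]):
            columns[i + 1].append(aa)
        columns["C"].append(pep[-1])
    fractions = {}
    for pos, residues in columns.items():
        if not residues:
            continue  # no peptide is long enough to cover this position
        counts = Counter(residues)
        fractions[pos] = {aa: counts[aa] / len(residues) for aa in AMINO_ACIDS}
    return fractions

def main_anchor_positions(fractions):
    """Positions where the median fraction across the 20 amino acids is 0."""
    return [pos for pos, row in fractions.items() if median(row.values()) == 0]

# Synthetic 9-mers: every residue varies except a fixed H at position 2
# and a fixed Y at the C-terminus (placeholders, not experimental data).
toy_peptides = []
for k in range(20):
    residues = [AMINO_ACIDS[(k + j) % 20] for j in range(9)]
    residues[1] = "H"
    residues[-1] = "Y"
    toy_peptides.append("".join(residues))

fractions = position_fractions(toy_peptides)
anchors = main_anchor_positions(fractions)
```

Because the median is taken over 20 amino acids, a position is flagged only when at least half of them never occur there, so the criterion is informative only for reasonably large ligand sets.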

Table 2 Frequency of amino acid residues in eluted peptide ligands and respective positions

When compared to the SAAS data, the hierarchy of preferences at the C-terminus identified in the eluted peptides was somewhat unexpected. Specifically, the aromatic residue Y was clearly the most dominant residue among the eluted ligands, being present with about a fourfold higher frequency than F, whereas the SAAS analysis identified a fourfold higher preference for F. However, it should be noted that the SAAS data directly reflect molecular binding affinity, generated in vitro using purified reagents, whereas preferences determined on the basis of the elution data may be influenced by factors inherent in the in vivo cellular environment, such as protein expression patterns, endogenous protein amino acid composition, and the preferences and/or limitations of protein processing mechanisms.

Finally, to develop a general prediction method for identifying candidate Mane-A1*082:01 epitopes, we determined the Mane-A1*082:01 binding capacity of a positional scanning combinatorial peptide library (PSCL). Quantitative matrices developed using this approach have previously been shown to be effective for the prediction of peptide binders (Lauemoller et al. 2001; Peters et al. 2006; Sidney et al. 2008a; Sidney et al. 2007; Stryhn et al. 1996; Udaka et al. 2000; Udaka et al. 1995). Accordingly, the PSCL data were analyzed as previously described (Sidney et al. 2008a), and a scoring matrix was derived (Table 3).

Table 3 PSCL scoring matrix for prediction of Mane-A1*082:01 binding peptides

To evaluate the efficacy of the PSCL matrix for predicting Mane-A1*082:01 binders, we selected all possible 9-mer peptides derived from the SHIV DH12R clone 7 sequence (Sadjadpour et al. 2004). Similar to what was done previously for other MHC class I alleles (Loffredo et al. 2009; Sidney et al. 2008a; Sidney et al. 2007), each peptide was assigned a score representing the sum of the (log) matrix value for the corresponding residue at each position. The top 4% scoring peptides (n = 124) were then selected, synthesized, and tested for their capacity to bind Mane-A1*082:01 in competition assays utilizing purified MHC, as previously described (Loffredo et al. 2009; Sidney et al. 2008a; Sidney et al. 2001). A control set of over 200 peptides, selected to cover the 4–100% range, was also tested. As shown in Fig. 1a, the algorithm demonstrated specificity, with over 30% of the peptides scoring in the top 2 percentile range binding Mane-A1*082:01 with an affinity of 500 nM or better. By contrast, only 7.6% of the peptides in the second to tenth percentile range, and 3.4% in the 10th to 25th percentile range, were binders. None of the 72 peptides at the 25th or worse percentile bound Mane-A1*082:01. Figure 1b details the prediction efficacy of peptides scoring in the top 4%, demonstrating that effective saturation is achieved at about the second percentile (AUC = 0.79). The PSCL matrix was also adapted to predict binders of 8, 10, and 11 residues in length. Synthesis and testing of the top 1% scoring peptides of each size demonstrated a similar prediction performance to that seen when predicting 9-mers, with between 23% and 35% of the peptides binding with an affinity of 500 nM or better. All peptides identified in the present study with Mane-A1*082:01 binding affinities of 500 nM, or better, represent candidates for T cell epitopes, and are listed in ESM Table 3.
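The matrix-based scoring and selection step described above can be sketched as follows. The sequence and matrix values are invented placeholders, not the SHIV DH12R sequence or the values of Table 3, the function names are hypothetical, and base-10 logarithms are an assumption (the ranking does not depend on the base).

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def all_kmers(sequence, k=9):
    """Every overlapping k-mer in a protein sequence."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]

def score_peptide(peptide, matrix):
    """Sum of the log10 matrix values for the residue found at each position."""
    return sum(math.log10(matrix[i][aa]) for i, aa in enumerate(peptide))

def top_fraction(peptides, matrix, fraction=0.04):
    """Highest-scoring fraction of peptides (at least one), best first."""
    ranked = sorted(peptides, key=lambda p: score_peptide(p, matrix), reverse=True)
    n_keep = max(1, math.ceil(fraction * len(ranked)))
    return ranked[:n_keep]

# Hypothetical 9-mer matrix: a uniform background with a preferred H at
# position 2 and preferred F (then Y) at the C-terminus.
toy_matrix = [{aa: 0.1 for aa in AMINO_ACIDS} for _ in range(9)]
toy_matrix[1]["H"] = 1.0
toy_matrix[8]["F"] = 1.0
toy_matrix[8]["Y"] = 0.5

toy_sequence = "KDHQAAFQYFGST"  # made-up sequence for illustration only
candidates = all_kmers(toy_sequence)
selected = top_fraction(candidates, toy_matrix)
```

Selecting by rank rather than by an absolute score cutoff mirrors the percentile-based evaluation in the text, where the fraction of true binders is reported per percentile range.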

Fig. 1

Efficacy of PSCL prediction of Mane-A1*082:01 binders. SHIV peptides were scored using the PSCL matrix. The top 4% scoring peptides were synthesized and tested in competitive binding assays utilizing purified MHC, as previously described (Loffredo et al. 2009; Sidney et al. 2008a; Sidney et al. 2001), and their IC50 (nM) affinities determined. Binders were defined as those with an affinity of 500 nM or better. a Percent binders as a function of the percentile score range; b cumulative number of Mane-A1*082:01 binders by percentile score in a complete set of peptides derived from SHIV scoring in the top 4 percentile range by PSCL matrix prediction.

In the context of most human and non-human primate MHC class I alleles studied to date, it has been shown that an appreciable fraction of high-affinity binders, and T cell epitopes, do not strictly adhere to canonical, elution-based, motifs (Kast et al. 1994; Kubo et al. 1994; Ruppert et al. 1993; Sette et al. 1994). The PSCL predictions demonstrate that this also appears to be true for Mane-A1*082:01. Specifically, while 53% (20/38) of the high-affinity peptides (IC50 <100 nM) identified had H in position 2, and 76% (29/38) had F or Y at the C-terminus, only 29% conformed to the canonical motif at both main anchor positions (ESM Table 3). These observations underline the value of the detailed motif defined herein.
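The motif-conformance tally reported above amounts to a simple count, sketched here with hypothetical function names and made-up peptides (not the binders listed in ESM Table 3). The canonical motif is taken to be H at position 2 together with F or Y at the C-terminus, as stated in the text.

```python
def matches_canonical_motif(peptide, p2="H", c_term="FY"):
    """True when position 2 and the C-terminal residue both fit the motif."""
    return len(peptide) >= 2 and peptide[1] in p2 and peptide[-1] in c_term

def motif_summary(peptides):
    """Fraction of peptides fitting each main anchor, and both anchors together."""
    n = len(peptides)
    return {
        "p2": sum(p[1] == "H" for p in peptides) / n,
        "c_term": sum(p[-1] in "FY" for p in peptides) / n,
        "both": sum(matches_canonical_motif(p) for p in peptides) / n,
    }

# Made-up peptides for illustration only.
toy_binders = ["DHQAAFQYF", "DHQAAFQYI", "DAQAAFQYY", "DKQAAFQYL"]
summary = motif_summary(toy_binders)
```

Counting each anchor separately as well as jointly makes it easy to see how many binders fit only one of the two main anchors.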

In the present study we have utilized MHC elution analyses and panels of single amino acid substitution analog peptides to probe the binding specificity of Mane-A1*082:01, one of the most frequent MHC class I alleles expressed in captive-bred pigtail macaques. Based on these analyses, Mane-A1*082:01 was found to recognize a motif with a preference for H in position 2 and the aromatic residues F and Y, or the hydrophobic/aliphatic residue M, at the C-terminus. The development of a sensitive MHC-peptide binding assay utilizing purified MHC allowed the derivation of a detailed quantitative motif that proved effective in predicting high-affinity binders from SHIV, an important model virus for studying HIV infection in humans. Future studies will address recognition of SHIV peptides in Mane-A1*082:01-positive monkeys infected with SHIV, as well as potential binding cross-reactivity between Mane-A1*082:01 and Mamu-A1*007:01, a rhesus macaque allele that recognizes a nearly identical primary anchor binding motif (Reed et al. 2011) and that has been associated with SIV elite controllers. Additional work will explore cross-reactivity with HLA alleles, such as those in the HLA-B27 supertype (Sidney et al. 2008b), that recognize a similar peptide binding motif.