The characterization of HLA peptide ligands and allelic peptide-binding motifs has received considerable attention over the past few years. Autoimmunity, disease resistance, transplantation and vaccine development are some of the areas in which the characterization of natural ligands can help in the design of novel prophylactic vaccines and immunotherapeutic strategies (Hunziker et al. 2001). The sequencing of MHC class I- and class II-bound naturally processed peptides has permitted the characterization of primary anchor residues and the definition of peptide motifs for many HLA isotypes (Falk et al. 1994; Rammensee et al. 1995). In the case of class II eluted peptides in particular, pool sequencing and alignment approaches need to analyse large binding-peptide repertoires in order to be effective or, even better, to combine the alignment of natural ligands with the consideration of predicted pocket structures (Friede et al. 1996; Sanjeevi et al. 2002) and peptide-binding studies (Chicz et al. 1992, 1997; Khalil-Daher et al. 1998).

As reflected in the HLA ligand/motif database that has recently been created (http://hlaligand.ouhsc.edu; Sathiamurthy et al. 2003), HLA-DP has been little studied in comparison with other classical HLA molecules. Although 107 HLA-DP alleles have already been characterized (IMGT/HLA database, http://www.ebi.ac.uk/imgt/hla/index.html), peptide ligand sequences have been described only for seven of them (http://hlaligand.ouhsc.edu). In contrast to the 885 peptide ligands described for HLA-DR, to date only 65 HLA-DP ligands have been identified, probably because of the difficulty involved in collecting sufficient amounts of purified HLA-DP protein (Chicz et al. 1997). The scant information available about HLA-DP contrasts with the association of this molecule with several autoimmune disorders. In particular, HLA-DP2 is one of the HLA-DP molecules that has been related to diseases such as juvenile chronic arthritis (Paul et al. 1993), juvenile rheumatoid arthritis (Begovich et al. 1989), Graves’ disease (Dong et al. 1992), berylliosis (Richeldi et al. 1993) and hard metal lung disease (Potolicchio et al. 1999).

Studies investigating the pool sequences of natural peptides eluted from HLA-DP molecules (Chicz et al. 1997; Falk et al. 1994; Verreck et al. 1994) have afforded the definition of putative anchor residues. Data concerning putative HLA-DP2 peptide-binding motifs (Fig. 1a) were first reported using the alignment of three endogenous peptide sequences (Rammensee et al. 1995; Rotzschke and Falk 1994). Some years later, HLA-DP2 peptide motifs were redefined (Chicz et al. 1997); the anchor residues identified by these authors (Fig. 1a) were more convincing, because they used 12 natural peptide sequences for the alignment and also carried out binding studies with synthetic peptides. Currently, the HLA ligands/motifs database contains the two different HLA-DP2 peptide-binding motifs proposed by these two groups.

Fig. 1 Proposed HLA-DP2 peptide-binding motifs. a Comparison of HLA-DP2 anchor features derived from this work and previously described motifs. b Amino acid residues detected with greatest frequency in the main HLA-DP2 anchor positions. Calculations were done including only the longer peptide sequences from each source protein

To gain further insight into the HLA-DP2 peptide-binding motifs and to define the peptide-binding properties of this molecule more precisely, we analysed a larger repertoire of peptides naturally bound to this molecule. HLA-DP2 molecules were purified from the EBV-transformed human B-cell line 45.1 (Kavathas et al. 1980). In the presence of lysis buffer (50 mM Tris-HCl, 150 mM NaCl, 5 mM EDTA, 0.5% Nonidet P-40, pH 8, plus protease inhibitors), 12×10¹⁰ cells were lysed and then centrifuged at 100,000 g for 90 min. The HLA-DP2-peptide complexes were purified from the detergent-soluble fraction by using tandem affinity chromatography columns in the following order: Sepharose CL-4B, L243-Sepharose CL-4B (anti-DR), B7/21-Sepharose CL-4B (anti-DP). Peptides were then eluted in the presence of 0.1% trifluoroacetic acid. Eluted peptide mixtures were centrifuged through 10,000-Da cut-off ultrafiltration tubes (Centriprep 10, Amicon, Beverly, Mass.) and collected from the flow-through. HLA-DP protein purification was verified at this step: the fraction retained in the ultrafiltration tube was treated with trypsin and analysed by MALDI-TOF, revealing the presence of tryptic peptides corresponding to the α and β chains of HLA-DP2. The peptide pools from the flow-through were vacuum-concentrated and separated by reverse-phase HPLC in a Micro-Reverse Phase column C2/C18, 2.1×100 mm (Amersham Pharmacia Biotech, Piscataway, N.J.). One-minute fractions were collected within the range of a 5–45% acetonitrile gradient and analysed by mass spectrometry. Mass assignments for 300 peptides were carried out by MALDI-TOF, revealing a size distribution consistent with previously studied HLA-DP2, DR and DQ repertoires (data not shown). Peptide sequences were determined by analysis of peptide fractions by HPLC-ion trap mass spectrometry, performed essentially as described previously (Lopez et al. 2002; Marina et al. 1999), with some modifications.
An LCQ Deca XP ion trap (Thermo Finnigan, San José, Calif., USA) was coupled to a Surveyor HPLC system, using a RP18 Thermo Hypersil-Keystone (180 μm ID × 15 cm) microcolumn, operating at a flow rate of 1.5 μl/min. The ion-trap mass spectrometer was programmed to work in the “triple play” mode: an MS scan is made to determine whether peaks are present with an intensity above a predefined threshold, in which case a second ZoomScan and a third MS/MS scan are performed on these peaks, using the “dynamic exclusion” method; only two consecutive MS/MS scans are allowed per ion in order to achieve the widest sequence coverage. Alternatively, the ion trap detector was programmed to perform a continuous sequential operation in MS/MS mode on the doubly and triply charged ions corresponding to the masses previously determined by MALDI-TOF analysis (single ion monitoring mode). Fragmentation spectra were assigned to peptide sequences using as search engines either the Sequest program from the Bioworks package (Thermo Finnigan) or the MS/MS ion search page from MASCOT, freely available on the Web (http://www.matrixscience.co.uk). The SWISS-PROT, NR.FASTA or EST protein databases were searched with these programs. To corroborate the results provided by the search programs, MS/MS spectra containing interpretable sequence information were also subjected either to a large-scale, automated, “de novo” sequence interpretation, using the program DeNovoX (Thermo Finnigan), or to a manual interpretation.
The following criteria were used for manual confirmation of sequences: (1) each of the most intense peaks must be assigned to either a b- or a y-ion, which could be doubly charged (y++, b++) when produced by fragmentation of a triply charged parent ion; (2) long, consecutive tags of y- and b-ions must be present, allowing the reading of a partial peptide sequence in both directions; and (3) it should be possible to assign most of the minor peaks to neutral losses from b- or y-ions, such as H2O (−18), NH3 (−17) or CO (−28).
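The manual confirmation criteria above rest on simple mass arithmetic, which can be illustrated with a short Python sketch. This is not the software used in this work (Sequest, MASCOT and DeNovoX were used); it is a minimal illustration that assumes standard monoisotopic residue masses, unmodified peptides and a hypothetical matching tolerance of 0.5 m/z units. The function names (precursor_mz, fragment_ions, annotate_peak) are invented for this example.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276
# Neutral losses named in criterion (3): H2O (-18), NH3 (-17), CO (-28).
NEUTRAL_LOSSES = {"H2O": 18.010565, "NH3": 17.026549, "CO": 27.994915}


def precursor_mz(neutral_mass, charge):
    """m/z of a peptide of given neutral mass carrying `charge` protons."""
    return (neutral_mass + charge * PROTON) / charge


def fragment_ions(peptide, charge=1):
    """Return b1..b(n-1) and y1..y(n-1) m/z values for an unmodified peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(masses)
    b = [(sum(masses[:k]) + charge * PROTON) / charge for k in range(1, n)]
    # y-ions are C-terminal fragments and retain one water.
    y = [(sum(masses[-k:]) + WATER + charge * PROTON) / charge
         for k in range(1, n)]
    return {"b": b, "y": y}


def annotate_peak(mz, peptide, charge=1, tol=0.5):
    """Label a peak as a b/y ion or a neutral loss from one, else None.

    `tol` is an assumed tolerance in m/z units, not a value from the text.
    """
    for series, values in fragment_ions(peptide, charge).items():
        for k, ion_mz in enumerate(values, start=1):
            if abs(mz - ion_mz) <= tol:
                return f"{series}{k}"
            for name, loss in NEUTRAL_LOSSES.items():
                if abs(mz - (ion_mz - loss / charge)) <= tol:
                    return f"{series}{k}-{name}"
    return None
```

Calling fragment_ions with charge=2 gives the b++ and y++ series expected from a triply charged parent ion, and precursor_mz converts a neutral peptide mass into the doubly or triply charged m/z value that would be targeted in single ion monitoring mode.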

Here, mass and sequence analyses of 44 HLA-DP2 self peptides derived from 24 different source proteins are reported (Table 1). In six of these source proteins (transferrin receptor, HLA-class I α chain, HLA-DR α chain, chemokine receptor, α-enolase and RAB9A) we identified predominant epitopes represented by several length variants of the same protein stretch, forming sets of nested peptides varying by 1–6 aa at the C-terminal or 1–2 aa at the N-terminal end (Table 1). Of the 44 peptides sequenced, only three (7%) were derived from exogenous proteins; 36% belonged to cytoplasmic proteins and 56% corresponded to surface membrane proteins (Rammensee et al. 1995). Cytoplasmic proteins included cytosolic proteins (such as α-enolase, tryptophanyl-tRNA synthetase, lactate dehydrogenase A, ubiquitin and actin β), lysosomal proteins (cathepsin 12), ER-Golgi proteins (disulfide isomerase A3 precursor, RAB9A, endoplasmin precursor), mitochondrial proteins (ATP synthase) and ribosomal proteins (eukaryotic translation elongation factor 1γ). In agreement with these findings, previous studies have shown that peptides derived from plasma membrane proteins are also predominant in other HLA class II molecules, although peptides derived from autologous proteins located in the ER, the Golgi apparatus, secretory vesicles and the cytosol have also been identified (Muntasell et al. 2002).
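Sets of nested peptides such as those described above can be recognized automatically once each peptide has been mapped to its source protein. The Python sketch below is only an illustration, not the procedure used to build Table 1: it assumes that every peptide is given as a hypothetical (protein, start, end) record with 1-based inclusive positions, and it treats any peptides of the same protein whose ranges overlap as one set of length variants.

```python
from collections import defaultdict


def nested_sets(peptides):
    """Group (protein, start, end) records into sets of overlapping peptides.

    Returns a list of (protein, [(start, end), ...]) tuples; peptides of the
    same protein whose ranges overlap end up in the same group.
    """
    by_protein = defaultdict(list)
    for protein, start, end in peptides:
        by_protein[protein].append((start, end))

    groups = []
    for protein, spans in by_protein.items():
        spans.sort()
        current = [spans[0]]
        reach = spans[0][1]          # furthest C-terminal position so far
        for span in spans[1:]:
            if span[0] <= reach:     # overlaps the current stretch
                current.append(span)
                reach = max(reach, span[1])
            else:                    # a separate stretch of the same protein
                groups.append((protein, current))
                current = [span]
                reach = span[1]
        groups.append((protein, current))
    return groups
```

Groups containing more than one range correspond to candidate nested sets; the protein names and positions passed in are placeholders supplied by the user, not data from this study.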

Table 1 Natural HLA-DP2 ligands. Amino acid sequences of individual peptides analysed from the HLA-DP2-bound peptide pool with their respective masses are indicated. Putative anchor residues, based on peptide alignment using Clustal W from EMBL (http://www.ebi.ac.uk/clustalw/), are in boldface. Sequences identified for the first time as MHC ligands are in italics. The known peptide sequences are classified according to the localization of the source protein within the cell: exogenous, surface membrane or cytoplasmic. Id no. corresponds to database sequence identification number

A striking feature of the HLA-DP2 peptide ligands sequenced was that many of them were enriched in lysine residues. Charged residues were found at the C terminus in 27 of the 44 HLA-DP2 natural ligands analysed, suggesting a dominant amino acid pattern in the flanking regions of the peptides isolated. The presence of positively (34%) or negatively (27%) charged amino acids at the C termini might be due to degradation by proteases involved in antigen processing that are highly specific for charged residues. Proteases such as trypsin or endopeptidase Lys-C cleave on the C-terminal side of Lys and/or Arg, corresponding to the pattern found most frequently in the HLA-DP2 natural ligands described here. Preferences at the N termini were not so clear, although charged amino acids were found at this position fairly frequently. Moreover, scrutiny of the complete sequence of the source proteins of the peptides identified showed that the residue immediately following the C terminus of the sequenced peptide was frequently a charged residue (31%) or Leu (20%).
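The C-terminal tally described above amounts to classifying the last residue of each peptide. A minimal Python sketch follows; it assumes that only Lys and Arg count as positively charged and only Asp and Glu as negatively charged (His is treated as uncharged, which is an assumption of this example), and the function name is hypothetical.

```python
from collections import Counter

POSITIVE = set("KR")  # assumption: His is not counted as positively charged
NEGATIVE = set("DE")


def c_terminal_charge_summary(peptides):
    """Fraction of peptides ending in a positive, negative or uncharged residue."""
    counts = Counter()
    for peptide in peptides:
        last = peptide[-1]
        if last in POSITIVE:
            counts["positive"] += 1
        elif last in NEGATIVE:
            counts["negative"] += 1
        else:
            counts["uncharged"] += 1
    total = len(peptides)
    return {key: counts[key] / total
            for key in ("positive", "negative", "uncharged")}
```

Applied to a list of ligand sequences, the function returns the three fractions directly. As a consistency check on the figures quoted above, 34% and 27% of 44 peptides correspond to about 15 and 12 peptides, respectively, which together give the 27 ligands with a charged C terminus.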

In this study, both promiscuous and putative allotype-specific self peptides were identified. The peptides derived from bovine serum albumin, ubiquitin, transferrin receptor, actin, ATP synthase, and class I and class II molecules are examples of promiscuous peptides that have been found to be associated with almost all MHC molecules (Rammensee et al. 1995). Specifically, transferrin receptor 5–21 and bovine serum albumin 152–170 have been identified previously as HLA-DP2 ligands (Rotzschke and Falk 1994). The fact that these peptides were also identified by us, using different elution and sequencing methods, suggests that these ligands must be present in relatively large amounts. Ligands derived from ubiquitin, HLA class I α chain, HLA-DR α chain, actin β and chemokine receptor 4, containing at least part of the peptide sequence identified in this study, have been found to be mainly associated with HLA-DR and HLA-DQ molecules (Futaki et al. 1995; van de Wal et al. 1997). However, the core-binding region of these promiscuous ligands seems to be different for HLA-DP2. Ligands derived from the same proteins, but with a peptide sequence completely different from the stretch identified here, have been reported to be presented by allelic forms of HLA class I and class II and even mouse MHC molecules. The promiscuous peptides described here should help to gain further insight into the functional role of HLA-DP2, since these ligands exhibit mismatches at anchor positions with respect to other MHC alleles.

Thirteen of the 24 source proteins correspond to proteins described for the first time as source proteins for MHC ligands. The 14 peptides derived from these proteins (Table 1) should be considered putative allotype-specific self peptides for HLA-DP2 molecules.

The 44 naturally HLA-DP2-bound peptide sequences were aligned optimally to fit the putative peptide-binding motifs (Table 1). This alignment showed a preference for ligands with hydrophobic and aromatic residues at pocket 1 (Fig. 1). Our data indicate that Phe is the most frequent amino acid for the P1 residue, although this pocket accepts other hydrophobic residues such as Tyr, Leu or Val. This is completely consistent with the anchor residues previously proposed (Chicz et al. 1997) at P1 (Fig. 1a). Using molecular modelling, we have previously described a deep and hydrophobic pocket 1 for HLA-DP2, mainly due to the presence of Gly at residue 84, allowing large hydrophobic residues to fit into it (Díaz et al. 2003). The same amino acid preferences have been reported for pocket 1 of HLA-DP4 molecules (Castelli et al. 2002). This is not surprising, since HLA-DP2 differs from HLA-DP4 by only four amino acids at positions 36, 55, 56 and 69, which are predicted to be located at pockets 4, 6 and 9 but not at pocket 1.

Positively charged residues (Lys, Arg) dominate the P4 anchor of HLA-DP2 (Fig. 1). Combining peptide-binding assays and molecular modelling, we and others have previously described the high affinity of this pocket for positive and polar residues (Arg, Thr, Ser) (Berretta et al. 2003; Chicz et al. 1997; Díaz et al. 2003). The preference for these residues is in agreement with the HLA-DP2 molecular modelling predictions (Berretta et al. 2003; Díaz et al. 2003), in which HLA-DP2 displays a negatively charged pocket 4 because of the presence of Glu at position 69. It is important to note that Lys, despite being a positively charged amino acid residue, has not been reported previously as a P4 anchor (Chicz et al. 1997; Rotzschke and Falk 1994). However, this residue was found to be located at P4 in 38% of the HLA-DP2 peptide sequences described here, representing the most frequent residue at this position. In agreement with previous data (Chicz et al. 1997), Gln was found at P4 in many of the peptides sequenced. Residues Arg, Ala and Leu were also found to be significantly present at P4, although to a lesser extent.

Similar to pocket 1, pocket 6 of HLA-DP2 is a deep and hydrophobic pocket with a preference for aromatic and aliphatic amino acids. Hydrophobic residues such as Phe and Leu were the most frequent amino acids found in P6. In addition to residues Tyr and Phe previously described (Chicz et al. 1997) as P6 anchor sites, our data suggest new P6 anchors such as Leu, Thr or His. Data collected by us previously demonstrated that residue 11 plays a crucial role in peptide binding and T-cell recognition, mainly determining the shape of this pocket (Díaz et al. 2003). Polymorphic position 11 is occupied by Gly in the HLA-DP2 molecule, resulting in a larger pocket 6, with a high affinity for aromatic or branched aliphatic amino acids. HLA-DP2 and HLA-DP402 exhibit similar binding motifs at P6 (Phe, Leu, Tyr) (Castelli et al. 2002; Díaz et al. 2003). The structural characteristics of pocket 6 appear to be similar in HLA-DP2 and HLA-DP402, since polymorphic positions within this pocket (residues 11 and 69) are occupied by the same amino acids in both alleles. Although to a lesser extent than the above-mentioned residues at P6, we detected the presence of His in some of the ligands analysed. The preference of pocket 6 for this aromatic and hydrophobic residue is in agreement with our previous results (Berretta et al. 2003).

The pattern was less restricted in P9, since both hydrophobic (Ala, Val) and polar (Thr, Ser) amino acids were found. However, although several chemical features are allowed in this pocket, most peptide residues found at P9 are small. As well as the previously favoured residues proposed for this pocket (Leu, Val, Ile), our data also implicate residues such as Ala, Thr and Ser at P9. The data from HLA-DP2 molecular modelling indicate that the presence of Asp at residue 55 forms a salt bridge hydrogen bond with Arg76α, reducing the size of this pocket (Díaz et al. 2003), which is consistent with small residues being the most favoured.
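The residue frequencies discussed for P1, P4, P6 and P9 can be tallied from an alignment with a few lines of code. The Python sketch below is illustrative only and is not the procedure used to produce Fig. 1b: it assumes that the alignment is supplied as hypothetical (peptide, p1_index) pairs, where p1_index is the 0-based position of the P1 residue within the peptide, and that the core is a contiguous nine-residue register.

```python
from collections import Counter

# Offsets of the anchor positions relative to P1 in a nine-residue core.
ANCHOR_OFFSETS = {"P1": 0, "P4": 3, "P6": 5, "P9": 8}


def anchor_frequencies(aligned):
    """Relative frequency of each residue at P1, P4, P6 and P9.

    `aligned` is a list of (peptide, p1_index) pairs; anchor positions that
    fall outside a peptide are skipped for that peptide.
    """
    tallies = {pos: Counter() for pos in ANCHOR_OFFSETS}
    for peptide, p1_index in aligned:
        for pos, offset in ANCHOR_OFFSETS.items():
            idx = p1_index + offset
            if 0 <= idx < len(peptide):
                tallies[pos][peptide[idx]] += 1

    frequencies = {}
    for pos, counter in tallies.items():
        total = sum(counter.values())
        if total:
            frequencies[pos] = {aa: n / total
                                for aa, n in counter.most_common()}
    return frequencies
```

To mirror the calculation described in the legend of Fig. 1, the input list would first be restricted to the longer peptide from each source protein, so that nested length variants of the same stretch are not counted several times.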

In conclusion, we have extended the number of previously described natural ligand peptide sequences associated with HLA-DP2 from 15 to 62. The alignment of this large number of peptides, together with previous HLA-DP2 peptide-binding studies and HLA-DP2 molecular modelling, has allowed us to define the peptide amino acid preferences of this allele more precisely. Our data extend the knowledge about HLA-DP2 peptide motifs. We hope this will contribute to a better understanding of the pathogenesis of HLA-DP2-associated diseases.