Major histocompatibility complexes (MHCs) class II molecules are glycoproteins involved in the exogenous antigen processing pathway, responsible for presenting self and non-self peptides to inspection by T-cells. Class II MHCs are expressed on specialised cell types, including professional Antigen Presenting Cells (APCs), such as B cells, macrophages and dendritic cells. MHC class II proteins bind oligopeptide fragments derived through the proteolysis of pathogen antigens, and present them at the cell surface for recognition by CD4+ T cells. If sufficient quantities of the epitope are presented, the T cell may trigger an adaptive immune response specific for the pathogen. The peptides binding to MHC class II proteins vary considerably in length from 12-25 amino acids. They are bound by the protrusion of peptide side chains into cavities within the groove and through a series of hydrogen bonds formed between the main chain peptide atoms and the side chains atoms of the MHC molecule. The peptide is able to extend from either of the two open ends of the binding groove. It takes an extended polyproline-like conformation [1].

MHCs are the most polymorphic protein in higher vertebrates, with more than 6000 class I and class II MHC molecules listed in IMGT/HLA in February 2011 [2]. Determining the peptide binding specificities exhibited by this vast collection of alleles is beyond the present capacity of experimental techniques, necessitating the development of bioinformatic prediction methodologies. The most successful prediction methods for T-cell epitopes developed to date have been data-driven. T-cell epitope prediction typically involves defining the peptide binding specificity of specific class I or class II MHC alleles and then predicting epitopes in silico.

Using peptide sequence data, experimentally-determined affinity data has been used in the construction of many MHC-peptide binding prediction algorithms. Such methods include motif-based systems, Support Vector Machines (SVMs) [3, 4], Hidden Markov Models (HMMs) [57], QSAR analysis [8, 9], and structure-based approaches [1012]. MHC binding motifs are an easily understood epitope identification method, although such motifs invariably generate numerous false positives and numerous false negatives.

At least for well-studied class I MHC alleles, immunoinformatic prediction methods work well [13, 14]. However, for prediction of all immune epitope data other than class I MHC peptide binding, results have rarely proved satisfactory. Over the last few years, several comparative studies have shown that the prediction of class II T-cell epitopes is usually poor [1517].

Human MHC class II alleles are grouped into three loci: HLA-DP, HLA-DQ and HLA-DR. Class II MHCs have been associated with many chronic inflammatory diseases [18], including rheumatoid arthritis and type 1 diabetes. Many crystal structures are now available for HLA-DQ and HLA-DR proteins [19], which show that the peptide binding site is composed of two separate chains: α and β. The walls of the binding site are formed by two anti-parallel helices and the floor is formed by an eight-stranded β-sheet [20]. Much of the extraordinary sequence polymorphism apparent in human MHCs is concentrated in residues forming the binding site. The site is open at both ends and peptides of different length could bind, even though only 9 amino acids occupy the site itself.

In contrast to HLA-DR and HLA-DQ, HLA-DP proteins have not been studied extensively, as they have been viewed as less important in immune responses than DRs and DQs. However, it is now known that HLA-DP proteins contribute to the risk of graft-versus-host (GVH) disease [21], sarcoidosis [22], juvenile chronic arthritis [23], Graves' disease [24], hard metal lung disease [25] and especially, chronic beryllium disease [26]. Quite recently, the X-ray structure of the HLA-DP2 (DPA*0103, DPB1*0201) in complex with a self-peptide derived from the HLA-DR α-chain has been determined [27]. Although the overall structure of DP2 is similar to that of other MHC class II proteins, it contains a unique solvent-exposed acidic pocket containing three glutamic acids (Glu26β, Glu68β and Glu69β). This pocket may be able to bind Be and present it to T cells, thus explaining the mechanism of chronic Beryllium disease [27, 28]. The X-ray data also revealed that the DP2 binding site consists of four binding pockets: deep and hydrophobic p1 and p6 pockets; large, shallow and negatively charged p4; and deep, narrow and polar p9.

Given the ready availability of the structural data to which we have briefly alluded above, the molecular docking has now become an appropriate tool, capable of application to the problem of binding prediction for class II MHCs. Structure-based docking is the repeated static docking - and subsequent empirical scoring - of sets of molecular structures to a biomacromolecular target, such as class II MHC complexes. The molecular docking has a burgeoning track-record of success, at least in area of identifying small molecule ligands of macromolecular targets, and can help identify MHC binders. Speaking generally, the molecular docking can be separated into five phases, beginning with the X-ray structure of a target MHC. This is combined with potential peptide binders. The resulting set of ligands is then docked into a binding site model and scored for some appropriate correlate of binding. Handful of the top ranked hits is selected, and assayed experimentally [29].

Specifically, in the present study, we applied a molecular docking protocol to a library of 247 modelled peptide-DP2 complexes to assess the contribution of each of the 20 naturally occurred amino acids at each of the nine binding core positions and the four flanking residues (two on both sides). The normalized binding scores formed a quantitative matrix (QM). The predictive ability of the QM was assessed by external test set of 457 known binders to DP2. A comparison with results generated by existing servers for DP2 binding prediction indicated an improvement in performance offered by our docking score-based QM (DS-QM).


Input data

The X-ray structure of the HLA-DP2 (DPA*0103, DPB1*0201) in complex with a self-peptide derived from the HLA-DR α-chain (pdb code: 3lqz) was used as a starting structure [27]. The covalently bound peptide was separated and defined as chain C. It consists of nine binding core positions (FHYLPFLPS) and six flanking residues (RK at the N terminus and TGGS at the C terminus). The conformation of the peptide was used as a template for the modelling process. Thirteen positions were examined: nine binding core positions and four flanking residues (two on both sides). A library of 248 peptides (19 amino acids × 13 positions + 1 original ligand) was built using PyMOL [30]. We used the SAAS (single amino acid substitution) approach to model the conformations of each altered side chains: after substitution, the side chain conformation was minimised while keeping the rest of the peptide structure and the whole MHC protein rigid. The protonation state of ionisable protein side chains was assigned to a standard ionisable state: neutral for His; positively charged for Arg and Lys; and negatively charged for Asp and Glu [31].

AutoDock protocol

AutoDock 4.2 [32], employing an implementation of the Lamarckian genetic algorithm (GA), was used to model the peptide binding to HLA-DP2. In order to limit the computational burden of calculating peptide-MHC interactions at positions not involved in the static docking, we kept all coordinates fixed apart from the peptide residues of interest. These were left flexible. All GA settings were kept to their default values, apart from the number of energy evaluations and the number of generations which were set to 250 000 and 27 000, respectively. The docking grid was defined as a cuboid with sizes 32 Å × 36 Å × 38 Å, which encompassed the entire peptide binding site on DP2. The output from ten independent GA runs for each ligand was processed and the pose (binding conformation) with the lowest Free Binding Energy (FBE) was considered. FBE values represent the direct output from the AutoDock 4.2 scoring function which takes into consideration weighted terms for van der Waals dispersion/repulsion, hydrogen bonding, electrostatics, and desolvation interactions as well as the change in torsional free energy when the ligand goes from an unbound to bound state. Data was mined by python scripts using the MGL Tools 1.5.4 package [33]. All retained poses considered in the study had an RMSD below 1.5 Å.

Docking score-based quantitative matrices (DS-QMs)

The FBEs derived from the docking experiments had negative and positive values. Negative FBEs correspond to binding peptides, while positive FBEs correspond to non-binding peptides. Only negative FBEs were considered; non-binding amino acids were assigned the penalty score of -10.000. The FBEs were normalized in two ways: correcting using an average calculated on a position-dependent basis (epithet: position-per-position; acronym: npp) or correcting using an average calculated over all positions (acronym: nap). Normalised FBEs were thus calculated using the following formula:

where FBEi is the binding energy of the i-th peptide, is the average for a given position (npp) or over all positions (nap), FBEmax and FBEmin - the maximum and minimum FBEs, respectively, for a given position (npp) or for all positions (nap). Normalized FBEs were multiplied by (-1) before being entered into the quantitative matrices (QMs) for ease of presentation. Thus, the positive FBEs correspond to preferred amino acids, and negative FBEs to non-preferred residues. Three QMs were derived: one QMnpp and two QMnap (one for 9 mers and one for 13 mers).

Test set

A test set of 457 peptides known to bind HLA-DP2 was collected from the Immune Epitope Database [34] (November 2010 release). The peptides were of different length and originated from 24 proteins. Each protein was represented as a set of overlapping nonamers and the binding score of each nonamer was calculated as a sum of the weights of all nine positions. Peptides originating from one protein were arranged in descending order according to their binding score; the top 5%, 10%, 15%, 20% and 25% were selected and compared to the known binders. If the nonamer sequence is included in the known binder sequence, the predicted peptide was considered as a true predicted binder. The ratio of all true predicted binders to all binders in the test set defined the sensitivity of prediction at the given cut-off. In the case of flanking residues, the procedure was the same but the parent proteins were represented as a set of overlapping 13 mers. The test set used in the present study is given as Additional file 1.


Docking score-based quantitative matrices (DS-QMs) for nonamers

A library of 172 peptides (19 amino acids × 9 positions + 1 original ligand) was built and each docked separately into the HLA-DP2 rigid binding site. Two QMs (QMnpp and QMnap) were derived based on normalized FBE, according to the method described in Methods. The two QMs are given in Table 1 (DS-QMnpp) and Table 2 (DS-QMnap), respectively. A good correlation exists between the two QMs (r = 0.997).

Table 1 DS-QMnpp for HLA-DP2 binding prediction.
Table 2 DS-QMnap for HLA-DP2 binding prediction.

According to QMnpp, the preferred amino acids at position 1 (p1) are Phe, Trp and Tyr, followed by His, Leu and Ile. QMnap selects only Phe, Trp and Tyr as preferred amino acids for p1. The X-ray structure shows that the p1 pocket is deep and hydrophobic [27]. It can accommodate all hydrophobic residues, including large aromatic amino acids, such as Phe, Trp and Tyr.

Peptide positions 2 and 3 (p2 and p3) project out from the binding site. Both QMs select Trp, His and Phe as preferred and Pro as non-preferred amino acid for p2. A great variety of other residues are well tolerated at p3, such as Pro, Tyr, Trp, Phe and Val; while Glu and Asp are disfavoured.

The binding pocket p4 is large, shallow and negatively charged due to Glu26β, Glu68β and Glu69β[27]. It strongly attracts positively charged amino acids as Arg and Lys, and distracts Asp and Glu. Leu, Tyr, Trp and Phe also are well accepted here, while Pro is not preferred.

Position 5 (p5) protrudes from the binding cleft but it is still in close proximity to the negatively charged residues Glu26β, Glu68β and Glu69β. That explains the preferences for the positively charged Arg and Lys and the avoidance of Asp and Glu.

The binding pocket p6 is deep and hydrophobic like pocket p1 [27]. Phe, Tyr and His are well accepted here; Pro does not bind at all; while Asp, Glu and Trp are deleterious.

Position 7 (p7) lies tangentially to the binding site and is considered as a secondary anchor position for some MHC class II proteins [20, 35]. It is also in the vicinity of Glu26β, Glu68β and Glu69β and prefers Lys and Arg but avoids Asp and Glu. Trp also binds well here, while Pro does not bind at all.

Position 8 (p8) is solvent-exposed, yet shows a strong preference for Pro. Although it is far from Glu26β, Glu68β and Glu69β, their influence on binding preferences is clear. Glu and Asp are deleterious at p8.

The binding pocket 9 (p9) accepts large aliphatic, polar, or even charged residues [27]. Accordingly, there is a wide variety of preferred amino acids at this position: Lys, Met, Cys, Gln, Asn, and Thr. In contrast, Pro and large aromatic amino acids, such as Phe, Trp and Tyr, do not bind at all; nor is Arg tolerated here.

External validation

A test set of 457 peptides known to bind HLA-DP2 originating from 24 proteins was used for external validation of the derived DS-QMs. Initially, the predictive ability of every position was assessed. Subsequently, different combinations of positions were evaluated. The sensitivities of the predictions were calculated at five different thresholds (5%, 10%, 15%, 20% and 25%) for each position are given at Figures 1 and 2. It is evident that QMnap and QMnpp predict equally well. QMnap was used next in the study. The highest predictive ability belongs to p6, followed by p1 and p2. The best two-position and three-position combinations slightly improve the predictions (Figure 3, models "p1p6" and "p1p2p6"). Addition of a cross term between p1 and p6 has no impact on the predictions (Figure 3, model "p1p6crossp1p6"). Combinations between anchor positions were also tested (Figure 3, models "p1p4p6p9" and "p1p4p6p7p9"). No improvement was seen. The combination of all positions also shows a lower predictive ability than the "p1p6" model, thus considering for the non-additivity of binding (Figure 3, model "all positions"). If the contribution made by each pocket to the overall binding energy was formally additive, then the model containing all pocket residues would have had the highest sensitivity. This was not the case.

Figure 1
figure 1

Sensitivities of the predictions calculated at five different thresholds (5%, 10%, 15%, 20% and 25%) for each peptide binding core position by DS-QMnpp.

Figure 2
figure 2

Sensitivities of the predictions calculated at five different thresholds (5%, 10%, 15%, 20% and 25%) for each peptide binding core position by DS-QMnap.

Figure 3
figure 3

Sensitivities of the predictions calculated at five different thresholds (5%, 10%, 15%, 20% and 25%) for different combinations of peptide binding core positions by DS-QMnap.

Comparison to existing servers for HLA-DP2 binding prediction

To the best of our knowledge, only three other servers exist for peptide HLA-DP2 binding prediction: NetMHCII [36], IEDB [37] and MultiRTA [38]. All three are sequence-based methods. NetMHCII and IEDB use artificial neural networks, while MultiRTA applies the Regularized Thermodynamic Average (RTA) prediction method [39]. However, MultiRTA selects only one binder from a protein, and it is not suitable for use with our test set, which consists of many binders originating from a limited number of proteins. Thus, the comparative study only includes NetMHCII, IEDB, and the DS-QMnap model. The 24 proteins from the test set were cleaved into successive overlapping nonamers. Sensitivities at different cutoffs were recorded (Figure 4). It is evident that DS-QMnap performed best when compared to the existing servers for DP2 binding prediction. It predicts 38% of the true binders at the top 5% threshold, 61% at top 10%, 75% at top 15%, 85% at top 20%, and 92% at top 25%.

Figure 4
figure 4

Sensitivities of the predictions calculated at five different thresholds (5%, 10%, 15%, 20% and 25%) by different servers for DP2 binding prediction.

Effect of the flanking residues on the peptide binding affinities

In the present study, we also examined the influence of flanking residues on peptide binding affinities. Four flanking residues were considered: two at each end. Seventy six additional peptides (19 amino acids × 4 positions) were modelled and docked into HLA-DP2 and the FBEs derived from the docking experiments were normalized using either a position-dependent average (npp) or an overall average taken over all positions (nap). Similarly to the evaluation of nonamers, two QMs were derived: QMnpp and QMnap. They are given in Additional file 2. The two are highly correlated (r = 0.995), thus only QMnap was chosen to test predictivity.

The preferred amino acids at p-1 (the first before p1) were Lys, Arg and Pro, while non-preferred residues were Asp and Glu. This preference could be explained by the presence of Glu55α in close proximity to p-1. Positon p-2 (the second before p1) can accommodate a great variety of amino acids, including Pro, Trp, Ala, Arg, Gly, Lys and Phe. No disfavoured amino acids were seen for this position. Thr, Phe, Cys, Ile and Val are well accepted at p+1 (the first after p9), Pro is deleterious. Finally, Gly, Pro and Trp are accommodated well at position p+2 (the second flanking position after p9), while Thr is not favoured here.

The proteins from the test set were also converted to sets of overlapping 13 mers. The binding score of each 13 mer was calculated using the 13 mer-specific QMnap. Note that using overlapping 13 mers significantly decreases the sensitivity, since the number of distinct registers originating from one binder decreases. Several combinations of flanking residues were compared. The first bars in Figure 5 give the sensitivities calculated when only the binding core of nine amino acids was considered (the centre of each 13 mer); subsequent bars show the sensitivities for different combinations of binding core and flanking residues. It is evident that the addition of flanking residues does not improve the predictions.

Figure 5
figure 5

Sensitivities of the predictions calculated at five different thresholds (5%, 10%, 15%, 20% and 25%) for different combinations of peptide binding core positions and flanking residues by DS-QMnap.

Identification of the peptide binding core

The binding peptide RKFHYLPFLPSTGGS from the X-ray structure of the peptide - HLA-DP2 protein complex was used to test if the molecular docking procedure could identify the peptide binding core. The binding 15 mer was presented as a set of overlapping peptides, with a moving binding core shown in bold in Table 3. The FBEs of the peptides and their binding scores calculated by the best DS-QMnap model "p1p6", are given in Table 3. It is evident that both methods clearly discriminated the binding core, since derived scores were significantly higher than the scores derived from the rest of the overlapping peptides.

Table 3 Identification of peptide binding core by molecular docking and DS-QM.

The same procedure was applied to five additional known DP2 binders [40] (Table 3). The FBE values identified four of the five binding cores, while the DS-QMnap model found all the five cores.


Molecular docking is a key structure-based method of immunoinformatics. In contrast to sequence-based methods, docking experiments do not require extensive pre-existing experimental data. The only information necessary is an X-ray structure of the peptide - MHC protein complex. Recently, the docking methodology was extensively tested on both peptide - MHC class I and peptide - MHC class II complexes: it proved to be a rapid and accurate method for evaluating peptide binding to MHCs [41].

Although the structures of a number of HLA-DR and HLA-DQ alleles have long been available [20, 35], the structure of a HLA-DP protein was only solved recently [27]. This has now enabled us to apply structure-based molecular docking to the analysis of the interaction interface of the HLA-DP2 peptide complex. The X-ray structure of the binding peptide was used as a starting template to create a combinatorial library of 247 peptides built using the SAAS principle. Using this, we were able to explore the structure-activity relationships of the nine binding core positions and the four flanking positions, two on each end. Peptides were docked into the DP2 binding site using AutoDock. The lowest resulting FBEs were recorded, normalized, and used to create DS-QMs. The predictive ability of these QMs was tested using an external test set and compared to existing servers for DP2 binding prediction. A similar docking-based procedure was applied recently to 12 HLA-DR1 proteins indicating that DS-QMs have good predictive ability [42].

Analysis of DS-QMs coupled to the external predictivity tests lead to several clear conclusions. The anchor positions p1 and p6 take a leading role in the binding predictions. Hydrophobic aromatic amino acids, like Phe, Tyr and Trp, are preferred at these two positions. Thus, our results confirm the unique binding motif for DP2 [43] and other DP alleles [44]. The prediction tests show that p1+p6 with or without a cross term p1p6 are self-sufficient to identify 38% of the true binders among the top 5% of the best predicted peptides, 61% among the top 10%, and 75% among the top 15% (Figure 3).

The anchors p4 and p9 have a low impact on the peptide binding prediction when used either as single predictors or in combination (Figures 1, 2 and 3). Instead, p2 is found to be the third most important position after p6 and p1 (Figures 1 and 2). It works equally well as a single predictor and in combination with p1 and p6 (Figure 3). Aromatic amino acids are preferred here, such as Trp, His and Phe. These preferences could be explained by the presence of residue His79β, situated close to the side chain of p2, thus enabling the stacking of aromatic rings [45].

The most striking feature of the peptide - HLA-DP2 complex is the unique solvent exposed acidic pocket formed between the bound peptide backbone and the protein α-helix. It contains three glutamic acids: Glu26β, Glu68β and Glu69β. Additionally, close to this acidic triad there is another glutamic acid: Glu67β. The strong negative electrostatic potential created by the four nominally negatively-charged residues, determines the amino acid preferences within the main part of the binding core. All six positions between positions p3 and p8 disfavour Glu and Asp. Positions p4, p5 and p7 prefer Lys and Arg. It has been hypothesized that this acidic pocket is able to bind divalent inorganic cations (e.g., Ca2+, Mg2+, Co2+, Be2+, etc.); this forms an explanation for the association that DP2 has with hard metal lung disease [27].

Analysis of amino acid preferences for all nine binding core positions reveals the ambiguous role of Pro. Pro is a preferred amino acid at peptide positions p3, p5 and p8 yet is not-preferred at p1, p2 and p4. When Pro is present at positions p6, p7 and p9, peptides do not bind at all. As Pro does not possess an interactive side chain, its role in the peptide binding is connected mainly with restrictions to the backbone conformation. This highlights the complex and conflicting influences at play here. Certain deep, hydrophobic inward-facing pockets seem highly selective, while the more exposed pockets tolerate more and more variable residue types. This is to be expected. No binding data exists for the TCR recognition of the HLA-DP2 pMHC complex, which would indicate the steric and physic-chemical constraints exerted on immunologically-active epitopes as opposed to binding peptides.

No effect of the flanking residues was found on the peptide binding predictions to DP2, although all four of them show strong preferences for particular amino acids. Pro also plays an ambiguous role here, being preferred at positions p-2, p-1 and p+2 and non-preferred at position p+1.

Additionally, the DS-QMs were used to identify the binding core of six known DP2 binding peptides, one of them taken from the X-ray structure. All six binding cores were identified with scores significantly higher than the scores derived for the rest of the overlapping peptides.

The DS-QMs derived in the present study were compared to experimental studies based on SAAS peptides binding to HLA-DP2. Berretta et al. [46] performed competition tests with the Ii-derived peptide CLIP and its SAAS peptides in p4 and p6 binding to DP2. Pocket 4 showed high affinity for positively charged, aromatic, and polar residues, whereas aliphatic residues were disfavoured. Pocket 6 showed high affinity for aromatic residues. Both experimentally-determined pocket preferences agree in full with our DS-QMs. Sidney et al. [44] also performed a SAAS analysis of the binding specificities of HLA-DPB1*0201. They defined a binding motif for DP2 including preferred amino acids at peptide positions p-2 (Ala, Phe, Lys, Ser, Thr, Val, Trp, Tyr), p1 (Phe, Ile, Leu, Met, Val, Trp, Tyr) and p6 (Phe, Ile,Leu, Met, Trp, Tyr). The DS-QMs are in partial agreement with these preferences. According to the DS-QMs, Ser, Thr and Tyr are not among the preferred amino acids at p-2, and Trp is not accepted at p6. Most recently, Greenbaum et al. [47] found a high degree of overlapping repertoire amongst all HLA class II molecules due to binding of multiple registers and dominant backbone interactions than peptide anchor preferences.

We can say with some confidence that both molecular docking procedure and the DS-QM based peptide binding prediction identify the binding core of the bound peptide from the X-ray structure in a straightforward manner [27]. Moreover, the comparison to other servers suggests that the method described in the present study should provide a reliable tool for DP2 binding prediction.


Amongst immunoinformatics problems, the prediction of class II peptide-MHC binding has recently been the subject of much critical comment [1517]. We must set this against the background of prediction in general, which is still treated with considerable scepticism by many. In many ways, accurate quantitative and qualitative prediction is the ultimate goal of scientific endeavour, since it affords us both true certainty in our understanding and also greatly augmented abilities to manipulate and design. The present study has continued our exploration of docking as an approach to the difficult and challenging problem of class II MHC-peptide binding prediction. In future work, we will explore more complete docking protocols that allow energetic relaxation of both the peptide and the protein, and also make use of a wider range of different scoring functions; as well as extending our analysis to include a wider range of class II alleles and undertaking prospective as well as retrospective analysis.