Background

The major histocompatibility complex (MHC) molecules are cell-surface glycoproteins that play a vital role in the adaptive immune response.[1] To help stimulate immune responses against a large repertoire of possible pathogens, MHC receptors can bind a wide variety of peptides. The interaction of peptide/MHC (pMHC) complexes with T-cell receptors (TcRs)[2] on the surface of T cells is responsible for T-cell activation and stimulation of the adaptive immune response. An understanding of the structural principles involved in the selection of specific antigenic peptides by the different MHC alleles, and subsequently in the selection of specific pMHC complexes by the relevant TcR, is critical for vaccine development. The experimentally determined three-dimensional (3-D) structures of TcR/pMHC and pMHC complexes are available in the Protein Data Bank (PDB),[3] with some interaction parameters reported as significant for pMHC interactions.[4] A comprehensive dataset to facilitate the sequence-structure-function mapping in peptide binding by MHC receptors is essential for the development of predictive algorithms in computational immunology.

A preliminary pMHC interaction database was developed by Govindarajan et al. in 2003[5] consisting of 86 entries of classical pMHC complexes with standard residues derived mainly from humans and rodents. Since then, new structures have become available and a new database, MHC-Peptide Interaction Database version T (MPID-T), was created to include interaction parameters on TcR/pMHC complexes and the latest available PDB data that contain classical and non-classical structures, as well as complexes with non-standard amino acid residues. MPID-T is a curated, structure-derived database containing interaction information on 187 pMHC complexes (represented by 40 human, murine and rat alleles) and 16 TcR/pMHC complexes (13 class I and three class II alleles). Information for each MPID-T entry is classified into four main groups: (i) MHC (allele, source, class); (ii) bound peptide (length, source, redundancy); (iii) computed interaction parameters (intermolecular hydrogen bonds, gap volume, gap index, interface area); and (iv) links to related external databases, particularly the IMGT/3Dstructure-DB[6] for annotations on TcR and MHC sequences with 3-D structures, and the Colliers de Perles for TcR/pMHC structural analysis of the international ImMuno-GeneTics information system® (IMGT; http://imgt.cines.fr).[7]
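The four information groups listed above can be pictured as a single record per entry. The following Python sketch is illustrative only: the class names, field names and the `search` helper are hypothetical and do not reflect the actual MPID-T MySQL schema, and the example values are placeholders rather than data from the database.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class InteractionParameters:
    """Group (iii): computed interaction parameters (hypothetical field names)."""
    hydrogen_bonds: int    # number of intermolecular hydrogen bonds
    gap_volume: float      # cubic angstroms
    gap_index: float       # angstroms
    interface_area: float  # square angstroms


@dataclass
class Entry:
    """One database entry, grouped as in the text (hypothetical layout)."""
    entry_id: str
    # (i) MHC
    mhc_allele: str
    mhc_source: str
    mhc_class: str
    # (ii) bound peptide
    peptide_length: int
    peptide_source: str
    non_redundant: bool
    # (iii) computed interaction parameters
    parameters: InteractionParameters
    # (iv) links to external databases, keyed by database name
    external_links: Dict[str, str] = field(default_factory=dict)


def search(entries: List[Entry],
           allele: Optional[str] = None,
           peptide_length: Optional[int] = None) -> List[Entry]:
    """Return entries matching every criterion that was supplied."""
    hits = []
    for entry in entries:
        if allele is not None and entry.mhc_allele != allele:
            continue
        if peptide_length is not None and entry.peptide_length != peptide_length:
            continue
        hits.append(entry)
    return hits


# Placeholder values only; these are not taken from MPID-T.
example_entries = [
    Entry("ENTRY-1", "ALLELE-A", "human", "I", 9, "viral", True,
          InteractionParameters(12, 900.0, 1.0, 900.0)),
    Entry("ENTRY-2", "ALLELE-B", "murine", "I", 8, "self", False,
          InteractionParameters(10, 800.0, 1.0, 800.0)),
]
nine_mers = search(example_entries, peptide_length=9)
```

Keeping the computed parameters in their own nested record mirrors the grouping in the text and makes it easy to compare entries on structural descriptors alone.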

Resource Description

Capabilities

MPID-T is a curated MySQL® (http://www.mysql.com) database hosted on a UNIX® server (IRIX 6.5, Apache 1.3.12). Currently, MPID-T contains only experimentally determined structures available in the PDB. For PDB entries with multiple molecular assemblies, the first TcR/pMHC or pMHC complex is stored as a single entity, for rapid visualisation, characterisation and comparison. Each structure is manually verified, classified, and analysed for intermolecular interactions (i) between the MHC and its corresponding bound peptide and (ii) between a TcR and its bound pMHC complex where TcR structural information is available. Included in MPID-T are non-classical structures and complexes with non-standard residues, which have implications for vaccine design. The non-redundant set of peptides bound to a particular allele is selected using the most accurate and complete structures.

Definition of Interaction Parameters

Specific interaction parameters have been identified as significant for the characterisation of the pMHC interface[4] and can be computed from the 3-D coordinates of a pMHC complex. These include (i) the number of intermolecular hydrogen bonds, (ii) the interface area between associating molecules, (iii) the gap volume and (iv) the gap index. Although the gap volume is computed as described by Kangueane et al.,[4] the accessible surface area (ASA), required for calculating the other three parameters, is now computed using the Naccess program (http://wolf.bms.umist.ac.uk/naccess/). A brief outline of the MPID-T interaction parameters follows.

Intermolecular Hydrogen Bonds

The total number of hydrogen bonds between the peptide and the MHC molecule is calculated using the program HBPLUS,[8] in which hydrogen bonds are defined according to standard geometric criteria of maximum distances (D−A = 3.9 Å, H−A = 2.5 Å and S−S = 3.0 Å) with minimum angles (D−H−A = 90°, H−A−AA = 90° and D−A−AA = 90°), where participating atoms are represented as D for donor, A for acceptor, H for hydrogen, AA for acceptor antecedent and S for sulphur.
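The geometric test can be illustrated with a short Python sketch. This is not HBPLUS itself, only a minimal check of the distance and angle thresholds quoted above for one candidate donor/acceptor pair. The function names are hypothetical, the thresholds are treated as inclusive (an assumption), the third angle is taken as D−A−AA as in the HBPLUS defaults, and the separate S−S distance criterion is omitted.

```python
import math
from typing import Sequence

MAX_D_A = 3.9     # maximum donor-acceptor distance, angstroms
MAX_H_A = 2.5     # maximum hydrogen-acceptor distance, angstroms
MIN_ANGLE = 90.0  # minimum for each of the three angles, degrees

Point = Sequence[float]


def angle_deg(a: Point, b: Point, c: Point) -> float:
    """Angle a-b-c in degrees, measured at the middle point b."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to [-1, 1] to guard against rounding error before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def is_hydrogen_bond(d: Point, h: Point, a: Point, aa: Point) -> bool:
    """Apply the distance and angle criteria to one D-H...A-AA arrangement."""
    return (
        math.dist(d, a) <= MAX_D_A
        and math.dist(h, a) <= MAX_H_A
        and angle_deg(d, h, a) >= MIN_ANGLE
        and angle_deg(h, a, aa) >= MIN_ANGLE
        and angle_deg(d, a, aa) >= MIN_ANGLE
    )


# Made-up coordinates (angstroms) for illustration only.
donor, hydrogen = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
acceptor, antecedent = (2.9, 0.0, 0.0), (3.5, 1.0, 0.0)
bonded = is_hydrogen_bond(donor, hydrogen, acceptor, antecedent)
```

Counting the pairs that pass this test across the peptide/MHC interface would give the hydrogen-bond total stored for an entry, although HBPLUS also handles hydrogen placement and donor/acceptor typing, which this sketch does not attempt.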

Gap Volume

Gap volume gives a measure of the complementarity of the interacting surfaces. The volume of the gaps between the two interacting subunits is calculated using the program SURFNET.[9] Each pair of subunit atoms is considered in turn, and a sphere (maximum radius 5.0 Å) is placed halfway between the surfaces of the two atoms such that its surface touches the surfaces of both atoms. The radius of the sphere is reduced whenever other atoms intercept it, and the sphere is discarded if its radius falls below a minimum of 1.0 Å. The gap volume between the two subunits is computed from the volume enclosed by all the allowable gap-spheres.
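The sphere-placement step can be sketched in Python for a single atom pair. This is a simplified illustration of the procedure described above, not the SURFNET code: the function name and the `(centre, radius)` atom representation are hypothetical, and skipping pairs whose initial sphere would exceed the maximum radius is an assumption about how that limit is applied.

```python
import math
from typing import List, Optional, Tuple

Vec = Tuple[float, float, float]
Atom = Tuple[Vec, float]  # (centre, atomic radius), both in angstroms


def gap_sphere(atom1: Atom, atom2: Atom, others: List[Atom],
               max_radius: float = 5.0,
               min_radius: float = 1.0) -> Optional[Tuple[Vec, float]]:
    """Return (centre, radius) of the gap sphere for one atom pair, or None."""
    (c1, r1), (c2, r2) = atom1, atom2
    d = math.dist(c1, c2)
    gap = d - r1 - r2  # surface-to-surface separation
    if gap <= 0.0:
        return None    # the atoms touch or overlap: no gap
    radius = gap / 2.0  # the sphere touches both atomic surfaces
    if radius > max_radius:
        return None    # assumption: over-sized initial spheres are skipped
    # The centre lies on the line between the atoms, midway between surfaces.
    t = (r1 + radius) / d
    centre = (c1[0] + t * (c2[0] - c1[0]),
              c1[1] + t * (c2[1] - c1[1]),
              c1[2] + t * (c2[2] - c1[2]))
    # Shrink the sphere until no other atom intercepts it.
    for c, r in others:
        radius = min(radius, math.dist(centre, c) - r)
    if radius < min_radius:
        return None    # too small to count as a gap sphere
    return centre, radius


# Made-up atoms for illustration only.
a1 = ((0.0, 0.0, 0.0), 1.5)
a2 = ((8.0, 0.0, 0.0), 1.5)
free = gap_sphere(a1, a2, [])
shrunk = gap_sphere(a1, a2, [((4.0, 3.0, 0.0), 1.5)])
```

Because the retained spheres generally overlap, summing their individual volumes would overestimate the gap volume; the enclosed volume has to be evaluated over their union, for example on a grid.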

Gap Index

The gap index[10] provides an estimate of the electrostatic and geometric complementarity of interacting interfaces expressed by equation 1:

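$$\text{Gap index (Å)} = \frac{\text{gap volume between molecules (Å}^3\text{)}}{\text{interface ASA (Å}^2\text{)}} \qquad (1)$$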

Interface Area

The interface area for a pMHC complex is defined as the change in solvent-accessible surface area (ΔASA) on complexation from an unbound MHC to a bound pMHC complex state and calculated using the program Naccess (equation 2):

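$$\text{Interface area (Å}^2\text{)} = \Delta\mathrm{ASA} = \mathrm{ASA}_{\text{unbound MHC}} - \mathrm{ASA}_{\text{MHC in pMHC complex}} \qquad (2)$$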
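As a worked illustration, the two derived parameters can be computed from pre-calculated ASA and gap-volume values. The Python sketch below rests on two assumptions: the interface area is read literally from the definition above as the ASA lost by the MHC on going from the unbound to the peptide-bound state, and the gap index is taken as the gap volume divided by the interface area. The function names and the numbers are hypothetical; in practice the ASA values would come from Naccess output and the gap volume from SURFNET.

```python
def interface_area(asa_unbound_mhc: float, asa_bound_mhc: float) -> float:
    """Change in MHC solvent-accessible surface area on complexation.

    Both inputs are in square angstroms; this literal reading of the
    definition is an assumption.
    """
    return asa_unbound_mhc - asa_bound_mhc


def gap_index(gap_volume: float, area: float) -> float:
    """Gap volume (cubic angstroms) per unit interface area (square angstroms)."""
    if area <= 0.0:
        raise ValueError("interface area must be positive")
    return gap_volume / area


# Made-up input values for illustration only.
area = interface_area(asa_unbound_mhc=18000.0, asa_bound_mhc=17100.0)
index = gap_index(gap_volume=1350.0, area=area)
```

Since the gap index normalises the gap volume by the size of the interface, a lower value corresponds to a more tightly packed interface for a given contact area.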

Implementation

The MPID-T database web interface permits searching the molecular complexes stored in the database based on MHC allele or PDB information, as shown in figure 1. Structural visualisation of the TcR/pMHC complex, pMHC complex, MHC or the bound peptide can be performed using freely available graphics applications such as RasMol (http://www.openrasmol.org) or MDL® Chime (http://www.mdlchime.com), whereas structural alignment (based on MHC class and peptide length)[11] can be viewed using the Jmol molecular viewer (http://www.jmol.org) or an MDL® Chime-compatible web browser client.

Fig. 1. Screenshots of (a) input and (b) output web interfaces from the MHC-Peptide Interaction Database version T (MPID-T) with user-defined input parameters (major histocompatibility complex [MHC] class, organism, data redundancy, MHC allele, peptide length and output format).

Each MPID-T entry bears a unique identifier, with sequence data hyperlinked to external databases that include IMGT/HLA (for the human MHC sequences),[7] IMGT/3Dstructure-DB (for pMHC and TcR/pMHC sequences and structures),[6] SYFPEITHI (for MHC ligands and peptide motifs)[12] and AntiJen (for experimental binding affinity).[13] Related sequences and structures for the relevant protein chains can be accessed via the National Center for Biotechnology Information (NCBI) Structure link (http://www.ncbi.nlm.nih.gov/Structure) and bibliographic references from PubMed. Pre-computed schematic diagrams based on the plotting program LIGPLOT[14] are provided to illustrate explicit pMHC interactions. Consensus patterns among peptides of the same length or allele, generated using the program WebLogo,[15] are also available in MPID-T. Other useful sources of information for researchers in vaccine design and immunology (referenced in Rammensee et al.[12]) are also provided under MHC resources on the MPID-T help page.

Discussion

MPID-T is a manually curated specialist database for sequence-structure-function information on pMHC and TcR/pMHC interactions. The aim of developing MPID-T is to define structural descriptors for in-depth characterisation of TcR/pMHC and pMHC interactions. Such descriptors should reflect TcR/pMHC and pMHC interactions better than sequence alone. Together with other relevant databases containing MHC- or antigen-related data such as AntiJen (experimental binding affinities),[13] MHCBN (MHC binding and non-binding peptide sequences)[16] and FIMM (fully referenced data on protein antigens, MHC, pMHC and relevant disease associations),[17] MPID-T aims to facilitate the extraction of high-level relationships hidden within TcR/pMHC interaction data by mapping the TcR footprint on the peptide-bound MHC. This mapping will eventually determine T-cell recognition and binding. The identification of such structural descriptors will enhance the understanding of the binding mechanism underlying TcR/pMHC and pMHC interactions and facilitate the extension of algorithms[18] that determine peptide binding to specific MHC alleles towards predicting the induction of TcR response. Future developments will include classification of the structures based on TcRs, enabling TcR-specific searches.