Classification of MHC I Proteins According to Their Ligand-Type Specificity

  • Eduardo Martínez-Naves
  • Esther M. Lafuente
  • Pedro A. Reche
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6825)


Major histocompatibility complex class I (MHC I) molecules belong to a large and diverse protein superfamily whose families can be divided into three groups according to the type of ligand that they can accommodate (ligand-type specificity): peptides, lipids or none. Here, we assembled a dataset of MHC I proteins of known ligand-type specificity (the MHCI556 dataset) and used it to train k-nearest neighbor and support vector machine classifiers. In cross-validation, the resulting classifiers predicted the ligand-type specificity of MHC I molecules with an accuracy ≥ 99%, using solely their amino acid composition. By holding out entire MHC I families prior to model building, we showed that machine learning (ML)-based classifiers trained on amino acid composition can predict the ligand-type specificity of MHC I molecules unrelated to those used for model building. Moreover, they are superior to BLAST at predicting the class of MHC I molecules that do not bind any ligand.
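The approach summarized above (amino acid composition as the only feature, a nearest-neighbor classifier, and evaluation with whole families held out) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the Euclidean distance, the choice of k = 1, and the toy sequences and family names are all assumptions made for illustration, and none of the data comes from the MHCI556 dataset.

```python
from collections import Counter
from math import dist

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, fixed order


def aa_composition(sequence):
    """Return the fraction of each standard amino acid in `sequence`."""
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def knn_predict(train, query, k=1):
    """Majority vote among the k training vectors closest to `query`.

    `train` is a list of (vector, label) pairs; distance is Euclidean
    (an assumption, the abstract does not name the metric).
    """
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_family_out_accuracy(records, k=1):
    """Hold out one family at a time, train on the rest, test on it.

    `records` is a list of (family, sequence, label) tuples.
    """
    features = [(fam, aa_composition(seq), lab) for fam, seq, lab in records]
    correct = 0
    for held_out in sorted({fam for fam, _, _ in features}):
        train = [(vec, lab) for fam, vec, lab in features if fam != held_out]
        test = [(vec, lab) for fam, vec, lab in features if fam == held_out]
        correct += sum(knn_predict(train, vec, k) == lab for vec, lab in test)
    return correct / len(features)


# Hypothetical toy records: invented sequences and family names.
toy_records = [
    ("famA", "AAAA", "peptide"),
    ("famB", "AAAC", "peptide"),
    ("famC", "WWWW", "lipid"),
    ("famD", "WWWY", "lipid"),
]
accuracy = leave_family_out_accuracy(toy_records)
```

Holding out a family only makes sense when its ligand type is still represented by at least one other family in the training set; otherwise the classifier has no example of that class to vote for.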


classical MHC class I molecules; non-classical MHC class I molecules; machine learning; ligand prediction





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Eduardo Martínez-Naves (2)
  • Esther M. Lafuente (2)
  • Pedro A. Reche (1, 2)
  1. Laboratory of Immunomedicine, Universidad Complutense de Madrid, Madrid, Spain
  2. Department of Microbiology I–Immunology, Facultad de Medicina, Universidad Complutense de Madrid, Madrid, Spain
