Leveraging Information Across HLA Alleles/Supertypes Improves Epitope Prediction

  • David Heckerman
  • Carl Kadie
  • Jennifer Listgarten
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3909)


We present a model for predicting HLA class I restricted CTL epitopes. In contrast to almost all other work in this area, we train a single model on epitopes from all HLA alleles and supertypes, yet retain the ability to make epitope predictions for specific HLA alleles. We are therefore able to leverage data across all HLA alleles and/or their supertypes, automatically learning what information should be shared and also how to combine allele-specific, supertype-specific, and global information in a principled way. We show that this leveraging can improve prediction of epitopes having HLA alleles with known supertypes, and dramatically increases our ability to predict epitopes having alleles which do not fall into any of the known supertypes. Our model, which is based on logistic regression, is simple to implement and understand, is solved by finding a single global maximum, and is more accurate (to our knowledge) than any other model.
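The sharing scheme described in the abstract can be illustrated with a minimal sketch. The abstract does not give the feature encoding, so everything below is an assumption made for illustration: each (peptide, allele) pair is expanded into binary features at three levels (global, supertype-specific, allele-specific), and a single L2-regularized logistic regression is fit to all alleles at once with plain gradient ascent. The `SUPERTYPE` table, the toy three-residue peptides, the placeholder allele `X*0000`, and the hyperparameters are hypothetical and are not taken from the paper. The regularized log-likelihood is concave, which is consistent with the abstract's remark about a single global maximum.

```python
import math
from collections import defaultdict

# Illustrative allele -> supertype lookup (an assumption, not the paper's table).
# Alleles missing from this table are treated as having no known supertype.
SUPERTYPE = {"A*0201": "A2", "A*0202": "A2", "B*0702": "B7"}


def features(peptide, allele):
    """Return sparse binary features at three levels of sharing."""
    feats = ["bias"]
    supertype = SUPERTYPE.get(allele)
    for pos, aa in enumerate(peptide):
        feats.append(f"global|{pos}{aa}")             # shared by every allele
        feats.append(f"allele={allele}|{pos}{aa}")    # specific to this allele
        if supertype is not None:
            feats.append(f"supertype={supertype}|{pos}{aa}")  # shared within supertype
    return feats


def predict(weights, peptide, allele):
    """Probability that the peptide is an epitope for the given allele."""
    z = sum(weights[f] for f in features(peptide, allele))
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, l2=0.1, lr=0.05, epochs=200):
    """Batch gradient ascent on the L2-penalized log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for peptide, allele, label in examples:
            err = label - predict(weights, peptide, allele)
            for f in features(peptide, allele):
                grad[f] += err
        for f in set(grad) | set(weights):
            weights[f] += lr * (grad[f] - l2 * weights[f])
    return weights


# Synthetic toy data (not real epitopes): "L" at position 1 marks positives.
TOY_EXAMPLES = [
    ("ALV", "A*0201", 1), ("GLV", "A*0201", 1),
    ("AKV", "A*0201", 0), ("GKV", "A*0201", 0),
    ("SLV", "A*0202", 1), ("SKV", "A*0202", 0),
]

weights = train(TOY_EXAMPLES)
# An allele with no training data and no supertype still gets a prediction,
# because the global features carry what was learned from the other alleles.
print(predict(weights, "TLV", "X*0000"), predict(weights, "TKV", "X*0000"))
```

In this sketch the optimizer decides how much weight ends up on the global, supertype, and allele levels; an allele with few examples leans on the shared features, while an allele with many examples can override them through its own features.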


Keywords: Epitope Prediction, Multitask Learn, Proteasomal Cleavage, Leverage Model, Eleven Amino Acid





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • David Heckerman (1)
  • Carl Kadie (1)
  • Jennifer Listgarten (1)
  1. Microsoft Research, Redmond, USA
