Leveraging Information Across HLA Alleles/Supertypes Improves Epitope Prediction
We present a model for predicting HLA class I-restricted CTL epitopes. In contrast to almost all other work in this area, we train a single model on epitopes from all HLA alleles and supertypes, yet retain the ability to make epitope predictions for specific HLA alleles. We can therefore leverage data across all HLA alleles and/or their supertypes, automatically learning what information should be shared and how to combine allele-specific, supertype-specific, and global information in a principled way. We show that this leveraging can improve prediction of epitopes restricted by HLA alleles with known supertypes, and dramatically increases our ability to predict epitopes restricted by alleles that do not fall into any known supertype. Our model, which is based on logistic regression, is simple to implement and understand, is solved by finding a single global maximum, and is, to our knowledge, more accurate than any other model.
Keywords: Epitope Prediction; Multitask Learn; Proteasomal Cleavage; Leverage Model; Eleven Amino Acid
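The abstract's central idea, a single logistic regression that combines allele-specific, supertype-specific, and global information, can be illustrated with a minimal sketch. The abstract does not give the feature encoding or the training procedure, so everything below is an assumption made for illustration: each peptide position/residue feature is copied at three levels of specificity, and the weights are fit with plain L2-regularized stochastic gradient ascent. The allele names (`X1`, `X2`, `Y1`), the supertype name (`SX`), the function names, and the three-residue toy peptides are all hypothetical and are not real epitope data.

```python
import math
from collections import defaultdict


def leveraged_features(peptide, allele, supertype=None):
    """Sparse binary features at three levels: global, supertype, allele.

    Hypothetical encoding (not taken from the paper): every position/residue
    indicator is emitted once per level so that the learner can decide how
    much weight to share across alleles.
    """
    feats = {"bias": 1.0}
    for pos, aa in enumerate(peptide):
        base = f"{pos}:{aa}"
        feats[f"global|{base}"] = 1.0
        if supertype is not None:  # alleles outside any supertype skip this level
            feats[f"supertype={supertype}|{base}"] = 1.0
        feats[f"allele={allele}|{base}"] = 1.0
    return feats


def predict(weights, feats):
    """Logistic regression probability that the peptide is an epitope."""
    z = sum(weights.get(name, 0.0) * value for name, value in feats.items())
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=300, lr=0.1, l2=0.01):
    """Fit weights by stochastic gradient ascent on the L2-penalized log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for feats, label in examples:
            error = label - predict(weights, feats)
            for name, value in feats.items():
                weights[name] += lr * (error * value - l2 * weights[name])
    return weights


# Invented toy data: only allele X1 (supertype SX) has training examples.
training = [
    (leveraged_features("AAL", "X1", "SX"), 1),
    (leveraged_features("AAK", "X1", "SX"), 0),
]
weights = train(training)

# Allele X2 has no training data of its own but shares supertype SX with X1.
p_leu = predict(weights, leveraged_features("GGL", "X2", "SX"))
p_lys = predict(weights, leveraged_features("GGK", "X2", "SX"))

# Allele Y1 belongs to no supertype, so only the global level can help.
p_leu_no_supertype = predict(weights, leveraged_features("GGL", "Y1"))
p_lys_no_supertype = predict(weights, leveraged_features("GGK", "Y1"))

print(p_leu > p_lys, p_leu_no_supertype > p_lys_no_supertype)
```

In this toy setting the unseen allele `X2` borrows from both the supertype-level and global weights learned from `X1`, while `Y1` can draw only on the global weights. Because the penalized log-likelihood is concave, the fit has a single global maximum, which matches the property the abstract highlights; the specific optimizer used here is a convenience of the sketch.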