T-Cell Epitope Prediction Methods: An Overview

  • Dattatraya V. Desai
  • Urmila Kulkarni-Kale
Part of the Methods in Molecular Biology book series (MIMB, volume 1184)


The volume of data on biological systems, including but not limited to the immune system, is growing rapidly. Consequently, immunoinformatics databases are continually being developed to accommodate these data, along with analytical tools to analyze them. Researchers are therefore equipped with numerous databases and analytical and prediction tools, in anticipation of better means of preventing and treating diseases of humans and other animals.

An epitope is the part of an antigen that is recognized by B cells, T cells, and/or molecules of the host immune system. Because the few amino acid residues that make up an epitope, rather than the whole protein, are sufficient to elicit an immune response, attempts are being made to identify or predict this critical stretch or patch of residues (i.e., T-cell and B-cell epitopes) for inclusion in multiple-subunit vaccines.

T-cell epitope prediction is challenging owing to the high degree of MHC polymorphism and to the disparity in the volume of data available on the various steps involved in the generation and presentation of T-cell epitopes in living systems. Many algorithms have been developed to predict T-cell epitopes, and Web servers implementing them are available. These are based on approaches such as amphipathicity profiles of proteins, sequence motifs, quantitative matrices (QM), artificial neural networks (ANN), support vector machines (SVM), quantitative structure-activity relationships (QSAR), and molecular docking simulations. This chapter introduces the reader to the principles underlying some of these methods as well as the procedural and practical aspects of using them.
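As a rough illustration of the quantitative matrix idea mentioned above, the following Python sketch slides a 9-residue window along a protein sequence and sums per-position residue scores. Everything in it is an assumption made for illustration: the matrix values, the names `TOY_MATRIX`, `score_peptide`, and `rank_peptides`, the additive scoring scheme, and the threshold are hypothetical and do not come from any published matrix or server.

```python
from typing import Dict, List, Tuple

PEPTIDE_LENGTH = 9  # assumed window size for this sketch

# Hypothetical per-position scores keyed by 0-based position in the peptide.
# Illustrative values only; residues not listed contribute DEFAULT_SCORE.
TOY_MATRIX: Dict[int, Dict[str, float]] = {
    1: {"L": 2.0, "M": 1.5, "I": 1.0},  # peptide position 2
    8: {"V": 2.0, "L": 1.5, "I": 1.0},  # peptide position 9
}
DEFAULT_SCORE = 0.0


def score_peptide(peptide: str) -> float:
    """Sum the matrix contribution of each residue at its position."""
    return sum(
        TOY_MATRIX.get(position, {}).get(residue, DEFAULT_SCORE)
        for position, residue in enumerate(peptide)
    )


def rank_peptides(sequence: str, threshold: float) -> List[Tuple[int, str, float]]:
    """Return (1-based start, peptide, score) for windows scoring >= threshold,
    sorted from highest to lowest score."""
    hits = []
    for start in range(len(sequence) - PEPTIDE_LENGTH + 1):
        peptide = sequence[start:start + PEPTIDE_LENGTH]
        score = score_peptide(peptide)
        if score >= threshold:
            hits.append((start + 1, peptide, score))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


if __name__ == "__main__":
    for start, peptide, score in rank_peptides("GALAAAAAAVG", threshold=3.0):
        print(start, peptide, score)
```

Real matrices assign a score to every residue at every position and are specific to an MHC allele; some schemes combine position scores multiplicatively rather than additively, so the summation here is only one possible choice.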

Key words

T-cell epitope; Proteasomal cleavage; MHC–peptide binding; TAP transport; Quantitative matrix; Motif; MHC polymorphism; Epitope prediction algorithm; Vaccine design; Immunoinformatics; Bioinformatics



D.V.D. and U.K.K. gratefully acknowledge financial support under the aegis of Center of Excellence (CoE) grant from the Department of Biotechnology (DBT), Government of India.


Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Bioinformatics Centre, University of Pune, Pune, India
