Computational Methods for Identification of T Cell Neoepitopes in Tumors

  • Vanessa Isabell Jurtz
  • Lars Rønn Olsen
Part of the Methods in Molecular Biology book series (MIMB, volume 1878)


Cancer immunotherapy has experienced several major breakthroughs in the past decade. Most recently, technical advances in next-generation sequencing methods have enabled discovery of tumor-specific mutations leading to protective T cell neoepitopes. Many of the successes are enabled by computational methods, which facilitate processing of raw data, mapping of mutations, and prediction of neoepitopes. In this book chapter, we provide an overview of the computational tasks related to the identification of neoepitopes, propose specific tools and best practices, and discuss strengths, weaknesses, and future challenges.

Key words

Cancer immunotherapy; Bioinformatics; Epitope prediction; Next-generation sequencing; Nonsynonymous mutations


Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Bio and Health Informatics, Technical University of Denmark, Lyngby, Denmark
