MuPeXI: prediction of neo-epitopes from tumor sequencing data


Personalization of immunotherapies such as cancer vaccines and adoptive T cell therapy depends on identifying patient-specific neo-epitopes that can be specifically targeted. MuPeXI, the mutant peptide extractor and informer, is a program that identifies tumor-specific peptides and assesses their potential to be neo-epitopes. The program takes as input a file of somatic mutation calls, a list of HLA types, and optionally a gene expression profile. The output is a table of all tumor-specific peptides derived from nucleotide substitutions, insertions, and deletions, along with comprehensive annotation, including HLA binding and similarity to normal peptides. The peptides are sorted by a priority score that is intended to roughly predict immunogenicity. We applied MuPeXI to three tumors for which predicted MHC-binding peptides had been screened for T cell reactivity, and found that MuPeXI was able to prioritize immunogenic peptides with an area under the curve of 0.63. Compared to other available tools, MuPeXI provides more information and is easier to use. MuPeXI is available as stand-alone software and as a web server.
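The peptide extraction step described above can be illustrated with a minimal sketch. It is not MuPeXI's implementation: it covers only a single amino-acid substitution in a known protein sequence, and the function names `mutant_peptides` and `mismatches`, the 0-based `position` argument, and the default peptide length of 9 are assumptions made for illustration. Each mutant peptide is paired with the normal peptide from the same window, which is one simple way to relate a tumor-specific peptide to its normal counterpart.

```python
from typing import List, Tuple


def mutant_peptides(
    normal_protein: str,
    position: int,
    mutant_residue: str,
    lengths: Tuple[int, ...] = (9,),
) -> List[Tuple[str, str]]:
    """Return (mutant_peptide, normal_peptide) pairs spanning a substitution.

    `position` is a 0-based index into `normal_protein` (an assumption of
    this sketch; variant annotation tools often report 1-based positions).
    """
    if not 0 <= position < len(normal_protein):
        raise ValueError("position is outside the protein sequence")
    if normal_protein[position] == mutant_residue:
        raise ValueError("mutant residue equals the normal residue")

    # Apply the substitution to obtain the mutant protein sequence.
    mutant_protein = (
        normal_protein[:position] + mutant_residue + normal_protein[position + 1:]
    )

    pairs = []
    for k in lengths:
        # Enumerate every window of length k that covers the mutated position.
        first_start = max(0, position - k + 1)
        last_start = min(position, len(normal_protein) - k)
        for start in range(first_start, last_start + 1):
            pairs.append(
                (mutant_protein[start:start + k], normal_protein[start:start + k])
            )
    return pairs


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length peptides differ."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    # Toy sequence, not a real protein.
    for mutant, normal in mutant_peptides("ACDEFGHIKLMNPQRSTVWY", 10, "A"):
        print(mutant, normal, mismatches(mutant, normal))
```

In a fuller pipeline, each mutant peptide would then be annotated with predicted HLA binding and, where available, gene expression before being ranked; those steps depend on external predictors and are left out of this sketch.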





Abbreviations

AUC: Area under the curve
MuPeXI: Mutant peptide extractor and informer
NGS: Next generation sequencing
NSCLC: Non-small cell lung cancer
RNA-seq: RNA sequencing
ROC: Receiver operator characteristic
SNV: Single nucleotide variant
VCF: Variant call format
VEP: Variant effect predictor
WES: Whole exome sequencing




We thank Charles Swanton and Nicholas McGranahan for providing the raw data from the two NSCLC studies; Sofie Ramskov, Rikke Lyngaa and Sunil Kumar Saini for their experimental work in these studies; Amalie Kai Bentzen for her contribution to methods development; and Thomas Trolle, Andrea Marquard and Marcin Krzystanek for helpful discussions.


This work was supported by the Danish Cancer Society under grant R72-A4618 (Aron Charles Eklund); the Novo Nordisk Foundation under grant 16,854 (Zoltan Szallasi); the Breast Cancer Research Foundation (Zoltan Szallasi); and the Danish Council for Independent Research under grant 1331-00283 (Sine Reker Hadrup, Zoltan Szallasi).

Author information



Corresponding authors

Correspondence to Anne-Mette Bjerregaard or Aron Charles Eklund.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Electronic supplementary material


Supplementary material 1 (PDF 205 kb)


About this article


Cite this article

Bjerregaard, A., Nielsen, M., Hadrup, S.R. et al. MuPeXI: prediction of neo-epitopes from tumor sequencing data. Cancer Immunol Immunother 66, 1123–1130 (2017).



  • Neo-epitopes
  • Neo-antigens
  • Immunotherapy
  • Prediction
  • Mutation
  • Sequencing