OpenVax: An Open-Source Computational Pipeline for Cancer Neoantigen Prediction

  • Julia Kodysh
  • Alex Rubinsteyn
Part of the Methods in Molecular Biology book series (MIMB, volume 2120)


OpenVax is a computational workflow for identifying somatic variants, predicting neoantigens, and selecting the contents of personalized cancer vaccines. It is a Dockerized end-to-end pipeline that takes as input raw tumor/normal sequencing data. It is currently used in three clinical trials (NCT02721043, NCT03223103, and NCT03359239). In this chapter, we describe how to install and use OpenVax, as well as how to interpret the generated results.

Key words

Neoantigen, Cancer vaccine, Bioinformatics pipeline, NGS, Docker, Immunoinformatics



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Authors and Affiliations

  1. Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, USA
