The mapping of the cancer genome is revolutionizing our understanding of malignancy and its underlying complexity. The heterogeneity of cancer uncovered by comprehensive sequencing has highlighted the importance of personalized cancer therapy. At the same time, breakthroughs in tumor immunology are leading to the incorporation of cancer immunotherapy as a potent anti-cancer modality complementing traditional therapies. The results seen with anti-CTLA-4 and anti-PD-1/PD-L1 antibodies serve as a proof-of-concept that immunotherapy can achieve durable remissions with limited toxicity [1–5]. However, as with chemotherapy and targeted therapies, responses to immunotherapy are not universal. The deep and durable responses seen with immune checkpoint inhibitors are only elicited in a subset of patients and only in certain cancers [6]. In contrast to the advances in targeted therapy, we have as yet no effective a priori method of uniformly identifying potential immunotherapy responders. Furthermore, clinical and radiographic responses to immunotherapy can be delayed, making it difficult to identify responders even after treatment using standard criteria [7]. As comprehensive sequencing of individual cancers becomes more affordable and widely available, new methods are needed to translate genomic sequencing data into useful information to guide the burgeoning practice of immunotherapy. Coupled with an expanding armamentarium of immunomodulatory agents, further definition of the immunogenic landscape of cancer will be needed to unlock the full potential of cancer immunotherapy.

Native antigens versus neoantigens

Perhaps the greatest barrier to successful immunotherapy has been the lack of targetable tumor-associated antigens [8]. Because of their broad applicability, the most widely utilized strategies have been to target native tumor-associated antigens, such as cancer–testis antigens, which tend to be more commonly shared among different patients with the same malignancy [9–11]. While this allows therapy to be generalizable across patients, these native tumor-associated antigens are also found on germline or native tissues, resulting in immune tolerance that may be difficult to overcome and in potential off-target side effects. The response rates to native antigen-derived vaccines have been disappointing [10], with only a single vaccine receiving FDA approval [12].

The alternative approach is to instead target “neoantigens” formed by the somatic mutations unique to each patient’s tumor [13–15]. The unique mutational profile of each tumor is predicted to result in a corresponding unique set of neoantigens specific to that tumor. As compared to native antigens, neoantigens are less likely to experience central immunological tolerance, theoretically making them superior antigens. Neoantigens suffer from one major drawback: they are unique to each patient. Because their identification requires costly and extensive effort, individualized neoantigens have largely been avoided as immunotherapy targets. Yet, there is indirect evidence suggesting a central role of neoantigens in the endogenous antitumor response. A tumor-associated antigen screen in a long-term melanoma survivor found that five of eight targeted tumor antigens were in fact neoantigens, with detectable responses lasting several years [16]. Furthermore, when tumor-infiltrating lymphocytes (TILs) from melanoma patients were stained with multimers for 145 melanoma epitopes, representing all known HLA-A2-restricted melanoma-associated native epitopes, <1 % of TILs were reactive [17]. These findings suggest that neoantigens may constitute the bulk of melanoma-associated antigens targeted by TILs.

Identification of neoantigens

The role of mutated tumor antigens as a target for the immune system had been theorized since the first experiments showing that syngeneic mice could reject transplanted carcinogen-induced tumors [18, 19]. The first identified tumor-specific antigens were idiotypes derived from the CDR regions of immunoglobulin variable domains found on lymphoma and myeloma cells [20]. These antigens are formed via natural immunoglobulin rearrangement machinery, creating unique epitopes with expression restricted to malignant cells. Their characterization was the first proof that mutated or rearranged genes could elicit antitumor responses which could be clinically exploited. However, the ability of idiotype vaccination to produce antitumor immune responses [21] did not translate into clinical success [22–24].

Whereas idiotype antigens created in the natural development of B cells were easily identified, neoantigens resulting from the underlying genetic instability of cancer proved more elusive. Decades after immune responses to idiotypes were first described, proof that mutated genes could produce targets for the antitumor immune response was demonstrated in murine models [25]. Later, autologous cytotoxic T lymphocytes isolated from human melanoma patients were found to target tumor-restricted mutated genes [26–28]. These studies, and those to follow, established neoantigens as a veritable class of tumor-associated antigens. However, these methods of neoantigen identification were laborious, largely relying on screens of TILs against cDNA tumor antigen libraries. Advancements in immunologic methods did little to facilitate the identification of these neoantigens, thus limiting their use in clinical applications.

In silico methods of neoantigen discovery

Next generation sequencing techniques are opening new avenues into neoantigen discovery. It is becoming increasingly affordable to sequence entire cancer genomes, allowing a personalized and complete characterization of the mutanome for each patient. Each non-synonymous mutation forms dozens of putative peptide targets. The number of potential neoantigens can be narrowed based on our understanding of antigen processing and presentation. To elicit a cytotoxic T cell response, neoantigens must be processed into peptides and presented on MHC class I molecules. Each HLA allele has a unique peptide-binding profile, with only a small fraction of potential peptides presented on any given HLA class I allele. Extremely accurate methods are now available to predict the binding affinity between any given peptide and specific HLA alleles [29, 30]. The netMHC model is frequently used and performs well in benchmarking studies, though it is simple to combine multiple models to improve performance [31, 32].
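The statement that each non-synonymous mutation forms dozens of putative peptide targets can be illustrated with a minimal sketch. It assumes a single amino acid substitution and class I peptide lengths of 8 to 11 residues; the function name, the toy protein sequence and the chosen position are hypothetical and serve only as an illustration, not as the method of any cited study.

```python
def mutant_peptides(protein, pos, mutant_aa, lengths=(8, 9, 10, 11)):
    """List every peptide of the given lengths that spans a point mutation.

    protein   -- wild-type protein sequence in one-letter code
    pos       -- 0-based index of the substituted residue
    mutant_aa -- amino acid present in the tumor at that position
    """
    if not 0 <= pos < len(protein):
        raise ValueError("pos lies outside the protein sequence")
    mutated = protein[:pos] + mutant_aa + protein[pos + 1:]
    peptides = []
    for k in lengths:
        # A window of length k covers pos when start <= pos <= start + k - 1,
        # and it must also fit inside the protein.
        first = max(0, pos - k + 1)
        last = min(pos, len(mutated) - k)
        for start in range(first, last + 1):
            peptides.append(mutated[start:start + k])
    return peptides


# Hypothetical 30-residue sequence with a substitution at index 15.
toy_protein = "MKTAYIAKQRQISFVKSHFSRQLEERLGLI"
candidates = mutant_peptides(toy_protein, 15, "E")
print(len(candidates))  # 8 + 9 + 10 + 11 = 38 windows for an internal position
```

For a mutation far from either end of the protein, a window of length k can be placed in k ways, so the four lengths together give 38 candidate peptides. Each of these would then be scored against the patient's HLA alleles. Mutations near the termini yield fewer peptides, and frameshift mutations would need a different treatment.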

The mutation rate varies considerably from cancer to cancer and even within different cancer types [33–35]. The number of non-synonymous mutations ranges from about ten mutations per genome in acute myeloid leukemia to hundreds of mutations per genome in cancers such as melanoma and lung cancer [36–38]. Filtering this limited number of coding mutations through peptide-MHC affinity prediction algorithms would theoretically yield a manageable final number of potential neoantigens per tumor (Fig. 1).

Fig. 1 Schematic for identification of neoantigens through cancer genome sequencing

This type of neoantigen prediction has been applied to the emerging cancer genome sequencing datasets, confirming a relatively small set of putative high-probability neoantigens. In the case of colon and breast cancer, there are on average 7 and 10 mutations per genome, respectively, which are predicted to form epitopes with high affinity binding to the HLA-A*0201 allele, one of the most common class I alleles [39]. Another study probed only missense mutations that had been annotated as functionally relevant in the Catalogue of Somatic Mutations in Cancer (COSMIC) database [40, 41]. A total of 26,672,189 corresponding peptides were tested against 57 human HLA-A and HLA-B alleles. Only 0.4 % of the peptides were predicted to have a high binding affinity to any human allele. As each patient has a total of only four HLA-A and HLA-B alleles, the frequency of peptides specific for self-MHC would be expected to be much lower. These studies suggest that mutations producing neoantigens with high MHC affinity are uncommon. While cancers with a high rate of mutation, such as melanoma, appear to have a significant number of relevant neoantigens, cancers with a lower mutation rate likely have few, if any, neoantigens.

The number of potential neoantigens depends on the threshold of peptide-MHC affinity used to establish candidacy. Higher affinity peptides induce stronger immune responses [42, 43], but the optimal cutoff is unclear. A recent model of murine tumor rejection found that peptide-MHC (pMHC) affinities of 10 nM or less are required for tumor rejection, whereas lower affinity peptides resulted in disease relapse [44]. Others have found that peptides with an IC50 as high as 200 nM were able to stimulate cytotoxic T cells to recognize and kill tumor cells in vitro [43]. In the case of neoantigen recognition in a melanoma patient, a neoantigen-specific response was seen with pMHC affinities up to 100 nM [45].
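The dependence of the candidate count on the chosen cutoff can be made concrete with a short sketch. The cutoffs of 10, 100 and 200 nM are the values cited above; the peptide names and predicted IC50 values are invented purely for illustration, and the function name is hypothetical.

```python
def count_candidates(predicted_ic50_nm, cutoffs=(10, 100, 200)):
    """Map each affinity cutoff (nM) to the number of peptides at or below it.

    A lower IC50 indicates tighter predicted binding to the MHC molecule.
    """
    return {
        cutoff: sum(1 for ic50 in predicted_ic50_nm.values() if ic50 <= cutoff)
        for cutoff in cutoffs
    }


# Hypothetical predicted affinities for five mutant peptides from one tumor.
example = {
    "pep1": 4.2,
    "pep2": 35.0,
    "pep3": 150.0,
    "pep4": 420.0,
    "pep5": 2300.0,
}
print(count_candidates(example))  # {10: 1, 100: 2, 200: 3}
```

Even in this toy set, moving the cutoff from 10 nM to 200 nM triples the number of peptides retained, which is why reported neoantigen counts are difficult to compare across studies that use different thresholds.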

Using predicted peptide-MHC affinity in neoantigen screening requires knowledge of the corresponding individual’s HLA alleles. While conventional HLA serotyping or genotyping methods are available, these carry additional cost and labor and cannot be applied retrospectively to the large cancer genome datasets. In theory, the HLA type of the patient should be extractable from the genomic sequence. However, the high degree of polymorphism in the HLA locus poses significant challenges in identifying HLA alleles from short read sequences. New methods are available that demonstrate high accuracy in identifying HLA genotype from next generation sequencing data [46, 47]. The first large-scale analysis to incorporate HLA type extracted in this manner provides the most complete survey of the neoantigen landscape to date [48]. Using a more lenient cutoff pMHC affinity of 500 nM, the average number of neoantigens per genome for melanoma, renal cell carcinoma and chronic lymphocytic leukemia was found to be 488, 80 and 24, respectively.

Detecting the immune response to neoantigens

Combining cancer genome sequencing with peptide-MHC binding prediction analysis is useful in identifying candidate neoantigens, but how many of these actually produce an immune response? Using filtering of whole exome sequencing data, Robbins et al. [45] identified candidate neoantigens from melanoma patients who demonstrated near complete regression of disease after treatment with autologous TILs. The top 55 peptides with the highest predicted affinity to HLA-A*0201 were assessed, and it was found that four of these had elicited a measurable T cell response. For each of the three patients analyzed, T cell responses as measured by interferon-gamma release could be detected against at least two neoantigens. The interferon-gamma responses against each neoantigen were almost as robust as the responses against the autologous tumor. Additionally, the proportion of reactive T cells was similar (~50 %) when stimulated with either neoantigen peptide or autologous tumor [45, 49]. T cells reactive against one such neoantigen were found to persist in the peripheral blood of one patient 5 years after transfer [49]. These findings suggest that a small fraction of predicted neoantigens may serve as the primary targets for melanoma-derived TILs. Thus, even after the number of candidate neoantigens is limited by peptide-MHC affinity, high throughput confirmatory methods will be needed to verify the small number of true neoantigen antitumor targets.

An alternative method to functional antitumor assays is to determine antigen specificity via pMHC multimer staining [50]. High throughput peptide-MHC production is possible through the use of a UV-cleavable loading peptide in the production of class I MHC tetramers [51] and through combinatorial pMHC labeling, which allows up to 64 distinct pMHCs to be assayed in the same sample [52, 53]. Peptide-MHC multimer staining represents an appealing methodology for rapid screening of T cell responses against large numbers of candidate neoantigens. Van Rooij et al. [54] employed this strategy in a neoantigen screen of an ipilimumab-responsive melanoma patient. The high number of non-synonymous somatic mutations associated with this specific tumor allowed for further filtering of the whole exome data. In addition to predicting peptide-MHC affinity, the authors used RNAseq data to account for gene expression and filtered by predicted proteasome processing [55]. This yielded 448 candidate neoepitopes, which were synthesized into pMHC multimers with combinatorial labeling. T cell responses were detected against two of these neoepitopes. One peptide was specific for 0.003 % of CD8-positive TILs, while the other stained for 3.3 % of CD8-positive TILs. By using multimers, the neoantigen-specific response could be easily tracked in serial samples. This approach enables an efficient and comprehensive cataloguing of all the putative neoantigens for a particular cancer. Similar to the findings of Robbins et al., these findings suggest that only a small number of candidate neoantigens actually elicit an immune response.
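The successive filters described for this screen (predicted affinity, gene expression and predicted proteasomal processing) can be sketched as a simple funnel. The record fields, the threshold values and the example candidates below are all hypothetical placeholders; the thresholds used in the cited study are not given here, so this is an illustration of the idea rather than a reproduction of that analysis.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    peptide: str
    ic50_nm: float      # predicted peptide-MHC affinity (lower binds more tightly)
    expression: float   # RNAseq expression of the source gene, arbitrary units
    processing: float   # predicted proteasomal processing score, 0 to 1


def filter_funnel(candidates, max_ic50_nm=500.0, min_expression=1.0,
                  min_processing=0.5):
    """Apply the three filters in turn and record how many candidates survive."""
    stages = [
        ("affinity", lambda c: c.ic50_nm <= max_ic50_nm),
        ("expression", lambda c: c.expression >= min_expression),
        ("processing", lambda c: c.processing >= min_processing),
    ]
    remaining = list(candidates)
    counts = {"input": len(remaining)}
    for name, keep in stages:
        remaining = [c for c in remaining if keep(c)]
        counts[name] = len(remaining)
    return remaining, counts


# Four invented candidates, three of which fail exactly one filter each.
pool = [
    Candidate("pep1", 20.0, 5.0, 0.9),
    Candidate("pep2", 800.0, 5.0, 0.9),   # weak predicted binder
    Candidate("pep3", 50.0, 0.0, 0.9),    # source gene not expressed
    Candidate("pep4", 50.0, 3.0, 0.1),    # unlikely to be processed
]
survivors, counts = filter_funnel(pool)
print(counts)  # {'input': 4, 'affinity': 3, 'expression': 2, 'processing': 1}
```

Recording the count after each stage shows which filter removes the most candidates, which is useful when deciding how many pMHC multimers need to be synthesized for the confirmatory screen.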

These methods of neoantigen screening do have inherent limitations. They assume presentation of a mutated neoantigen peptide from genomic sequencing data, but this assumption may not hold true in all cases. Allelic expression may occur preferentially or exclusively from a preserved germline allele instead of the mutated allele, resulting in null expression of the putative neoantigen. This possibility can be partially addressed through verification of mutant allele expression, either by using RNAseq as the initial sequencing method or through allele-specific expression measurement by a method such as quantitative allele-specific polymerase chain reaction. Additionally, cancer cells may evade detection via alteration of their antigen processing machinery or through downregulation of MHC class I expression, rendering the neoantigen undetectable by the immune system. Verifying the expression of MHC class I on the surface of the tumor cells can help exclude this possibility. However, confirming a neoantigen-specific tumor response with certainty would require directly testing reactivity of sorted neoantigen-specific cells against autologous tumor cells.

The use of peptide-MHC multimers to detect the immune response also poses challenges. One limitation is that multimers must be synthesized for any pMHC to be tested, and uncommon or rare HLA alleles may be difficult to study. This is alleviated to some degree by the observation that groups of MHC class I alleles, called supertypes, display similar affinities for peptides and that algorithms predicting peptide-MHC affinity can be applied across members within the same supertype [56]. Even alleles in different supertypes sometimes demonstrate shared affinities [57]. Similarly, T cell receptors can demonstrate promiscuity in recognizing peptides in the context of multiple MHC alleles [58]. Another limitation to the use of pMHC multimers is that their application to MHC class II analysis is substantially more troublesome. This is in part due to the inferiority of the algorithms for predicting affinity for peptides binding MHC class II [59], though many models do exist. Another difficulty is that peptide-specific CD4 cells have lower affinity for their pMHC target and are present at lower frequencies [60, 61], thus complicating their detection. Thus far, the role of MHC class II-restricted neoantigens has not been investigated, despite mounting evidence of the importance of the CD4 T cell population in the antitumor immune response [62].

Immunoediting and implications for neoantigens

There are theoretical considerations that may explain the rarity of immunogenic neoantigens found upon screening. The immune system conducts constant surveillance of developing tumors, and thus, it is thought that immunogenic mutations are deleted during the development of tumors. This process has been dubbed immunoediting [63, 64], and has now been confirmed experimentally in two murine models. In the first, an oncogene-driven, endogenous tumor engineered to express neoantigens was found to undergo deletion of the neoantigens only when passaged in an immunocompetent background [65]. The second study used an “unedited” carcinogen-induced immunogenic sarcoma cell line generated in an immunodeficient background [66]. The tumor cells were sequenced and mutations were subjected to MHC class I peptide prediction algorithms to identify potential neoantigens. When injected into immunocompetent mice, the tumors were rejected in about 80 % of recipients. An immunodominant neoantigen was discovered in the unedited, parental tumors which elicited a T cell response in mice challenged with the parental tumor. When tumors that escaped rejection were resequenced, this mutation was deleted. Thus, passage through an immunocompetent host was shown to “edit” the immunogenic neoantigen, resulting in outgrowth of a subclone of tumor lacking the neoantigen. It would be expected that human tumors may edit immunogenic neoantigens in a similar manner, potentially explaining the low numbers of such neoantigens seen in previous studies.

The effect of tumor immunoediting on neoantigen expression takes on increased importance when combined with our evolving understanding of intratumoral heterogeneity. Tumors are composed of a number of subclones with distinct genetic alterations [67–71]. Furthermore, comparison of primary tumors with sites of metastases shows that each metastasis contains a mixture of mutations found in the primary lesion and a new set of mutations present only in the metastasis [69, 72], with the heterogeneity primarily due to the accumulation of passenger mutations [73]. This has important implications for the use of neoantigens as targets. Since neoantigen expression may be limited to a subclonal population, neoantigens associated with passenger mutations may be more easily edited, resulting in the outgrowth of a resistant clone lacking that mutation. Similarly, the allelic frequency of a given mutation may be an important consideration in the selection of target neoantigens, with high allelic frequency mutations representing superior targets.
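Prioritizing mutations by allelic frequency can be sketched as follows, taking the allelic frequency of a mutation to be the fraction of sequencing reads at that position that carry the variant. The function names, mutation labels and read counts are hypothetical, and the sketch ignores tumor purity and copy number, both of which affect how an observed frequency should be interpreted.

```python
def variant_allele_frequency(alt_reads, ref_reads):
    """Fraction of reads at a position that support the mutant allele."""
    total = alt_reads + ref_reads
    if total == 0:
        raise ValueError("no reads cover this position")
    return alt_reads / total


def rank_by_allelic_frequency(mutations):
    """Sort (name, alt_reads, ref_reads) records from highest to lowest frequency."""
    scored = [
        (name, variant_allele_frequency(alt, ref))
        for name, alt, ref in mutations
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Invented read counts for three somatic mutations in one tumor sample.
calls = [
    ("mutA", 10, 90),   # likely confined to a minor subclone
    ("mutB", 45, 55),   # consistent with a clonal heterozygous mutation
    ("mutC", 30, 70),
]
for name, vaf in rank_by_allelic_frequency(calls):
    print(name, round(vaf, 2))
```

Under the reasoning above, mutations at the top of this ranking are more likely to be shared by most tumor cells, so neoantigens derived from them would be less easily lost through outgrowth of a subclone that lacks the mutation.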

An alternative strategy to unbiased neoantigen screening is to focus on neoantigens formed by founder or driver mutations. Natural immune responses are occasionally seen against these mutated proteins [26, 74, 75]. This strategy has been employed in vaccine development against frequently mutated genes such as ras, p53 and BCR-ABL with limited success [76–79]. Castle et al. [80] applied peptide-MHC affinity predictions to the B16F10 melanoma cell line with a focus on only those neoantigens associated with potential driver genes. Sequencing of the murine B16F10 melanoma cell line found 563 mutations, with several involving potential driver genes. Fifty mutated peptides were chosen that were related to potential driver genes and displayed high affinity for class I MHC. One-third of these peptides generated an immune response, and immunization with those peptides conferred a protective effect against tumor inoculation. A similar analysis was performed using the COSMIC database for human mutations [81]. Mutations which were present in 5 % or more of cancers were subjected to HLA-binding prediction algorithms to find neoantigens with affinity to class I HLA alleles. Thirty-six mutations were tested against all HLA class I alleles, and candidates were weighted by the frequency of the mutation in a given cancer, the binding affinity of the peptide for a given HLA allele and the frequency of that allele in the general population. Interestingly, the top six candidates by this methodology were all KRAS mutations. Since the HLA type was factored in only as an allelic frequency in the general population, it is possible that none of the mutations discovered occurred in a patient with a matching HLA allele.

Given the small number of driver mutations per tumor genome, restricting analysis to this subset of neoantigens may not be feasible nor may it be necessary. Those neoantigens discovered by sequencing and peptide-MHC affinity prediction have already survived the immunoediting process. Their persistence implies that either they contribute to tumor survival or their creation preceded a driver mutation that contributes to survival, allowing them to resist deletion. Tumors with these mutations may escape immune regulation via a number of alternatives to clonal deletion, including downregulation of antigen expression, downregulation of MHC molecules, dysfunction in antigen processing machinery or immune suppression through a number of mechanisms of immune dysregulation [82]. These cells may be the ones most amenable to cancer immunotherapy, such as immune checkpoint blockade.

Clinical implications

The identification of specific neoantigens has a number of direct clinical implications. One obvious benefit would be in the design of tumor-specific vaccines. For the reasons discussed above, neoantigen-derived vaccines would theoretically be less prone to immune tolerance than currently used tumor vaccines, and immune responses to vaccines could be readily measured. Knowledge of specific neoantigens could also address several challenges facing the next generation of immunomodulatory agents in development, such as antibodies targeting PD-1/PD-L1 and CD137. While these agents have produced some dramatic results in clinical trials, responses have been limited to a subset of patients with no clear biomarkers identified to aid in predicting response. Information regarding the presence of tumor-specific T cells and, perhaps more importantly, the immunophenotype of these tumor-specific T cells may help identify those patients most likely to benefit from immunomodulatory therapy. For example, it is known that PD-L1 expression by tumor cells does not accurately predict response to immune checkpoint blockade [4], indicating that perhaps the phenotype of the tumor-specific lymphocytes may be a better predictor than the tumor itself. One possibility is that PD-1 expression (a marker of lymphocyte exhaustion) on neoantigen-specific T cells may better predict response to anti-PD-1/PD-L1 antibody therapy. Additionally, monitoring the neoantigen-specific tumor response after immunomodulatory therapy may provide a better assessment of response than conventional imaging methods, which are unable to differentiate progressive disease from early immune responses, a phenomenon termed pseudoprogression. Lastly, identification of neoantigen-specific lymphocytes could improve the production protocols used for adoptive immunotherapy from TILs.
As an alternative approach to reversing T cell exhaustion with immune checkpoint inhibitors, T cell receptors (TCRs) from neoantigen-specific cells could easily be cloned and transduced into lymphocytes. There is evidence that transduction of tumor-specific TCRs into naïve cells may produce a more effective population of T cells for adoptive immunotherapy [83].

Conclusion

Cancer genome sequencing promises to open the once-hidden field of neoantigens to investigation. The challenge that remains is how to translate the genomic and epigenetic landscapes of cancer into an understanding of the immunogenicity of cancer. Identification of neoantigens through sequencing and computational methods has the potential to transform our approach to immunotherapy. This information could be used to predict those patients most likely to respond to immunotherapy or to detect early responses to therapy prior to traditional clinical indicators. It may also be the key to the development of successful cancer vaccines, with trials already being planned [84]. The next years will reveal whether neoantigens will finally provide cancer immunotherapy with the elusive tumor antigens needed to propel the field.