On the feasibility of mining CD8+ T cell receptor patterns underlying immunogenic peptide recognition
Current T cell epitope prediction tools are a valuable resource for designing targeted immunogenicity experiments. They typically focus on peptide binding and presentation by major histocompatibility complex (MHC) molecules on the surface of antigen-presenting cells, and they predict both accurately. However, recognition of the peptide-MHC complex by a T cell receptor (TCR) is often not included in these tools. We developed a classification approach based on random forest classifiers to predict recognition of a peptide by a TCR and to discover patterns that contribute to recognition. We considered two formulations of the problem: (1) distinguishing between two sets of TCRs that each bind to a known peptide and (2) retrieving TCRs that bind to a given peptide from a large pool of TCRs. Evaluation of the models on two HIV-1, B*08-restricted epitopes reveals good performance and hints at structural CDR3 features that can determine peptide immunogenicity. These results are of particular importance because they show that predicting T cell epitopes and T cell epitope recognition from sequence data is feasible. In addition, the validity of our models serves as a proof of concept for the prediction of immunogenic T cell epitopes and paves the way for more general and higher-performing models.
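The two formulations described above differ mainly in how the negative class is chosen. The following minimal Python sketch illustrates that difference under stated assumptions: the feature encoding (CDR3 length plus amino acid composition), the helper names `encode_cdr3` and `build_dataset`, and all sequences are hypothetical placeholders chosen for illustration, not the features or data used in the study.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every vector has the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def encode_cdr3(seq):
    """Encode a CDR3 sequence as [length, fraction of each amino acid].

    Assumption: this simple composition encoding stands in for whatever
    features a real model would use.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("CDR3 sequence must not be empty")
    counts = Counter(seq)
    n = len(seq)
    return [float(n)] + [counts[aa] / n for aa in AMINO_ACIDS]


def build_dataset(positives, negatives):
    """Return (X, y) with label 1 for positives and 0 for negatives."""
    X = [encode_cdr3(s) for s in positives + negatives]
    y = [1] * len(positives) + [0] * len(negatives)
    return X, y


# Placeholder sequences (hypothetical, for illustration only).
epitope_a_tcrs = ["CASSLGQAYEQYF", "CASSPGTGGYEQYF"]
epitope_b_tcrs = ["CASSIRSSYEQYF", "CASRGDTQYF"]
background_tcrs = ["CASSLAPGATNEKLFF", "CSARDRTGNGYTF", "CASSFGGNTEAFF"]

# Formulation 1: distinguish TCRs specific to one epitope from those specific to another.
X1, y1 = build_dataset(epitope_a_tcrs, epitope_b_tcrs)

# Formulation 2: retrieve TCRs specific to one epitope from a larger background pool.
X2, y2 = build_dataset(epitope_a_tcrs, background_tcrs)
```

Feature matrices of this shape could then be passed to any random forest implementation, for example the `fit(X, y)` method of scikit-learn's `RandomForestClassifier`. In the second formulation the background pool is typically much larger than the positive set, so class imbalance would need to be considered when training and evaluating.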
Keywords: Immunoinformatics, T cell receptor, T cell epitope prediction, Bioinformatics, Random forest classifier