T-Cell Receptor Optimization with Reinforcement Learning and Mutation Polices for Precision Immunotherapy

Chen, Ziqi; Min, Martin Renqiang; Guo, Hongyu; Cheng, Chao; Clancy, Trevor; Ning, Xia

doi:10.1007/978-3-031-29119-7_11

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 13976)

Included in the following conference series:

International Conference on Research in Computational Molecular Biology

Abstract

T cells monitor the health status of cells by identifying foreign peptides displayed on their surface. T-cell receptors (TCRs), which are protein complexes found on the surface of T cells, are able to bind to these peptides. This process is known as TCR recognition and constitutes a key step in the immune response. Optimizing TCR sequences for TCR recognition represents a fundamental step towards the development of personalized treatments that trigger immune responses to kill cancerous or virus-infected cells. In this paper, we formulated the search for these optimized TCRs as a reinforcement learning (RL) problem, and presented a framework, TCRPPO, with a mutation policy using proximal policy optimization. TCRPPO mutates TCRs into effective ones that can recognize given peptides. TCRPPO leverages a reward function that combines the likelihoods of mutated sequences being valid TCRs, measured by a new scoring function based on deep autoencoders, with the probabilities of mutated sequences recognizing peptides, obtained from a peptide-TCR interaction predictor. We compared TCRPPO with multiple baseline methods and demonstrated that TCRPPO significantly outperforms all the baseline methods in generating valid TCRs with positive binding. These results demonstrate the potential of TCRPPO for both precision immunotherapy and peptide-recognizing TCR motif discovery.
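
The idea of a reward that combines a validity score with a recognition probability can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not taken from the paper or its code: the function names, the gating rule and threshold in `combined_reward`, the two toy scorers (which stand in for the autoencoder-based scoring function and the peptide-TCR interaction predictor), and the random mutation loop (which stands in for a policy learned with proximal policy optimization).

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def mutate(tcr, position, residue):
    """Apply one point mutation: replace the residue at `position`."""
    if residue not in AMINO_ACIDS:
        raise ValueError(f"unknown amino acid: {residue!r}")
    return tcr[:position] + residue + tcr[position + 1:]


def toy_validity(tcr):
    """Toy stand-in for a learned validity score (NOT the paper's scorer)."""
    return 1.0 if tcr.startswith("C") and tcr.endswith("F") else 0.0


def toy_recognition(tcr, peptide):
    """Toy stand-in for a peptide-TCR interaction predictor (illustrative only)."""
    return sum(aa in peptide for aa in tcr) / len(tcr)


def combined_reward(tcr, peptide, validity_fn, recognition_fn, threshold=0.5):
    """Hypothetical combination rule: penalize sequences whose validity
    score falls below `threshold`, otherwise return the recognition
    probability. The paper's actual rule may differ."""
    validity = validity_fn(tcr)
    if validity < threshold:
        return validity - threshold  # negative reward for invalid sequences
    return recognition_fn(tcr, peptide)


def random_rollout(tcr, peptide, steps, rng):
    """Mutate at random for `steps` steps and keep the best sequence seen.
    A learned policy would choose the position and residue instead."""
    best = tcr
    best_reward = combined_reward(tcr, peptide, toy_validity, toy_recognition)
    current = tcr
    for _ in range(steps):
        position = rng.randrange(len(current))
        current = mutate(current, position, rng.choice(AMINO_ACIDS))
        reward = combined_reward(current, peptide, toy_validity, toy_recognition)
        if reward > best_reward:
            best, best_reward = current, reward
    return best, best_reward


if __name__ == "__main__":
    sequence, reward = random_rollout("CASSF", "AS", steps=20, rng=random.Random(0))
    print(sequence, reward)
```

In this sketch the gating keeps the search from drifting toward sequences that score well on recognition but no longer look like TCRs. Whether the two terms should be gated, summed, or weighted is a design choice that the abstract does not specify.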

Notes

1. The code is available at https://github.com/ninglab/TCRPPO.
2. https://vdjdb.cdr3.net/motif.

Author information

Authors and Affiliations

Computer Science and Engineering, The Ohio State University, Columbus, OH, 43210, USA
Ziqi Chen & Xia Ning
Machine Learning Department, NEC Labs, Princeton, NJ, 08540, USA
Martin Renqiang Min
Digital Technologies Research Centre, National Research Council Canada, Ontario, Canada
Hongyu Guo
Department of Medicine, Baylor College of Medicine, Houston, TX, 77030, USA
Chao Cheng
NEC OncoImmunity AS, Oslo Cancer Cluster, Innovation Park, Ullernchausséen 64, 0379, Oslo, Norway
Trevor Clancy
Biomedical Informatics, The Ohio State University, Columbus, OH, 43210, USA
Xia Ning
Translational Data Analytics Institute, The Ohio State University, Columbus, OH, 43210, USA
Xia Ning

Corresponding authors

Correspondence to Martin Renqiang Min or Xia Ning.

Editor information

Editors and Affiliations

Indiana University Bloomington, Bloomington, IN, USA
Haixu Tang

About this paper

Cite this paper

Chen, Z., Min, M.R., Guo, H., Cheng, C., Clancy, T., Ning, X. (2023). T-Cell Receptor Optimization with Reinforcement Learning and Mutation Polices for Precision Immunotherapy. In: Tang, H. (eds) Research in Computational Molecular Biology. RECOMB 2023. Lecture Notes in Computer Science, vol 13976. Springer, Cham. https://doi.org/10.1007/978-3-031-29119-7_11

DOI: https://doi.org/10.1007/978-3-031-29119-7_11
Published: 03 April 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-29118-0
Online ISBN: 978-3-031-29119-7
eBook Packages: Computer Science, Computer Science (R0)
