Abstract
T cells play an essential role in defending the organism against pathogens and cancer. Efficient protection requires a vast repertoire of immune receptors, which is created by the V(D)J recombination process. Multiple algorithms have been designed to annotate recombined T cell receptor (TR) sequences from traditional (short-read) RNA-Seq data; however, none has been adapted to long-read data. Here we examine whether existing methods for annotating TR sequences in traditional RNA-Seq data can be applied to long-read sequencing data. The ImReP, TRUST4, CATT and MiXCR algorithms were applied to data obtained with nanopore technology (PromethION), and their parameters were adjusted. The largest number of CDR3 sequences was detected by TRUST4 (20,599 unique TR sequences out of 73,904,478 total reads), representing 25% of the expected number of sequences. The distributions of annotated V and J genes were the same for MiXCR and TRUST4, so these tools may be used to analyze V/J gene usage in rearranged TR genes. Because of the high sequencing error rate of the analyzed sample (median read quality Q = 6.9), TR clonotype analysis is not advised, and additional error correction steps are recommended for such analyses.
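To put the reported median read quality in perspective, the following minimal sketch converts a Phred-scaled quality score into a per-base error probability using the standard definition p = 10^(-Q/10). It assumes that the reported Q = 6.9 follows this convention and that base errors are independent. The function names and the 45-nucleotide CDR3 length are illustrative assumptions, not values or code from the study.

```python
def phred_to_error_prob(q: float) -> float:
    """Per-base error probability under the standard Phred definition."""
    return 10 ** (-q / 10)


def error_free_prob(q: float, length: int) -> float:
    """Probability that a stretch of `length` bases has no errors,
    assuming independent errors at a uniform quality `q` (a simplification)."""
    return (1 - phred_to_error_prob(q)) ** length


if __name__ == "__main__":
    median_q = 6.9    # median read quality reported in the abstract
    cdr3_len = 45     # hypothetical CDR3 length in nucleotides (assumption)

    p_err = phred_to_error_prob(median_q)
    print(f"Per-base error probability at Q={median_q}: {p_err:.3f}")
    print(f"Chance of an error-free {cdr3_len}-nt stretch: "
          f"{error_free_prob(median_q, cdr3_len):.2e}")

    # Share of total reads that yielded a unique TR sequence with TRUST4,
    # using the counts quoted in the abstract.
    print(f"Unique TR sequences per read: {20_599 / 73_904_478:.4%}")
```

Under these assumptions, Q = 6.9 corresponds to roughly one erroneous base in five. Exact CDR3 matching across reads is therefore unreliable, which is in line with the recommendation to apply error correction before clonotype analysis.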
Acknowledgment
This work was funded by the European Social Fund grant POWR.03.02.00–00-I029 and by the Silesian University of Technology grant for Support and Development of Research Potential.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Mika, J., Candéias, S.M., Badie, C., Polanska, J. (2022). Can We Detect T Cell Receptors from Long-Read RNA-Seq Data? In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2022. Lecture Notes in Computer Science, vol 13347. Springer, Cham. https://doi.org/10.1007/978-3-031-07802-6_38
DOI: https://doi.org/10.1007/978-3-031-07802-6_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-07801-9
Online ISBN: 978-3-031-07802-6
eBook Packages: Computer Science (R0)