Can We Detect T Cell Receptors from Long-Read RNA-Seq Data?

  • Conference paper
Bioinformatics and Biomedical Engineering (IWBBIO 2022)

Abstract

T cells play an essential role in defending the organism against pathogens and cancer. Efficient protection requires a vast repertoire of immune receptors, which is created by the V(D)J recombination process. Multiple algorithms have been designed to annotate recombined T cell receptor (TR) sequences from traditional (short-read) RNA-Seq data; however, none has been adapted to long-read data. Here we examine whether existing methods for TR sequence annotation from traditional RNA-Seq can be applied to long-read sequencing data. The ImReP, TRUST4, CATT, and MiXCR algorithms were applied to data obtained with nanopore technology (PromethION), and their parameters were adjusted. The TRUST4 algorithm detected the largest number of CDR3 sequences (20,599 unique TR sequences out of 73,904,478 total reads), representing 25% of the expected number of sequences. The distribution of annotated V and J genes was the same for the MiXCR and TRUST4 algorithms and may be used to analyze the repertoire of V/J genes used in rearranged TR genes. Because of the high sequencing error rate of the analyzed sample (median read quality Q = 6.9), TR clonotype analysis is not advised, and additional error correction steps are recommended for such analyses.
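
To illustrate why a median read quality of Q = 6.9 is a concern for clonotype analysis, the following minimal sketch converts a quality score into a per-base error probability, assuming Q is a standard Phred-scaled score (P = 10^(-Q/10)). Under that assumption, Q = 6.9 corresponds to an error probability of roughly 0.20 per base. The 45-nucleotide CDR3 length and the assumption of independent errors are hypothetical simplifications chosen for illustration only; they are not taken from the paper.

```python
def phred_to_error_prob(q: float) -> float:
    """Convert a Phred-scaled quality score to a per-base error probability."""
    return 10 ** (-q / 10)


def prob_error_free(q: float, length: int) -> float:
    """Probability that `length` consecutive bases contain no error.

    Assumes errors are independent and every base has quality `q`,
    which is a simplification of real nanopore error profiles.
    """
    return (1 - phred_to_error_prob(q)) ** length


median_q = 6.9   # median read quality reported in the abstract
cdr3_len = 45    # hypothetical CDR3 length in nucleotides (illustrative only)

p_err = phred_to_error_prob(median_q)
p_clean = prob_error_free(median_q, cdr3_len)

print(f"Per-base error probability at Q={median_q}: {p_err:.3f}")
print(f"Chance of an error-free {cdr3_len}-nt stretch: {p_clean:.2e}")
```

Because clonotypes are defined by exact CDR3 sequences, even a single miscalled base can turn one true clonotype into an apparently new one, which is consistent with the abstract's recommendation to apply error correction before clonotype-level analysis.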

Acknowledgment

This work was funded by the European Social Fund grant POWR.03.02.00–00-I029 and by the Silesian University of Technology grant for Support and Development of Research Potential.

Corresponding authors

Correspondence to Justyna Mika or Joanna Polanska.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Mika, J., Candéias, S.M., Badie, C., Polanska, J. (2022). Can We Detect T Cell Receptors from Long-Read RNA-Seq Data? In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2022. Lecture Notes in Computer Science, vol 13347. Springer, Cham. https://doi.org/10.1007/978-3-031-07802-6_38

  • DOI: https://doi.org/10.1007/978-3-031-07802-6_38

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-07801-9

  • Online ISBN: 978-3-031-07802-6

  • eBook Packages: Computer Science (R0)
