Bioinformatic Pipelines to Analyze lncRNAs RNAseq Data

Long Non-Coding RNAs in Cancer

Part of the book series: Methods in Molecular Biology ((MIMB,volume 2348))

Abstract

RNA sequencing can now be considered the gold standard for studying the coding and noncoding transcriptome. The great advantage of high-throughput sequencing in the characterization and quantification of long noncoding RNAs (lncRNAs) lies in its ability to capture the complexity of lncRNA transcript configuration patterns, even in the presence of several alternative isoforms, with greater accuracy and discovery power than other technologies such as microarrays or PCR-based methods. In this chapter, we provide a protocol for lncRNA analysis using high-throughput sequencing, indicating the main difficulties in the annotation pipeline and showing how an accurate evaluation of the procedure can help to minimize biased observations.

Acknowledgments

SB is supported by the Italian Ministry of Education, University and Research (PRIN 2017 #2017PPS2X4_003), and the Italian Association of Cancer Research, AIRC, Milan, Italy (Investigator Grant IG2017 #20052). LA is supported by the Accelerator Award #29374 through CRUK-AIRC partnership.

Author information

Corresponding author

Correspondence to Luca Agnelli.

Copyright information

© 2021 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Agnelli, L., Bortoluzzi, S., Pruneri, G. (2021). Bioinformatic Pipelines to Analyze lncRNAs RNAseq Data. In: Navarro, A. (eds) Long Non-Coding RNAs in Cancer. Methods in Molecular Biology, vol 2348. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-1581-2_4

  • DOI: https://doi.org/10.1007/978-1-0716-1581-2_4

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-1580-5

  • Online ISBN: 978-1-0716-1581-2

  • eBook Packages: Springer Protocols
