Reference-Based Identification of Long Noncoding RNAs in Plants with Strand-Specific RNA-Sequencing Data

  • Xiao Lin
  • Meng Ni
  • Zhixia Xiao
  • Ting-Fung Chan (corresponding author)
  • Hon-Ming Lam (corresponding author)
Part of the Methods in Molecular Biology book series (MIMB, volume 1933)


Long noncoding RNAs (lncRNAs) have been shown to play important roles in various organisms, including plant species. Several tools and pipelines have emerged for lncRNA identification, including reference-based transcriptome assembly pipelines and various tools for calculating coding potential. In this protocol, we integrate some of the most up-to-date computational tools and describe, step by step, the procedures for identifying lncRNAs from plant strand-specific RNA-sequencing datasets. We start from clean RNA-sequencing reads and proceed through reference-based transcriptome assembly, filtering of known genes, and lncRNA prediction. At the end, users will obtain a set of predicted lncRNAs for downstream use.
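The known-gene filtering step named above can be illustrated with a minimal Python sketch. It is not the protocol's actual implementation: the `Feature` record, the function names, and the 200-nt minimum length (a commonly used lncRNA convention, not stated in this abstract) are assumptions made for illustration. Because the data are strand-specific, the sketch discards only transcripts that overlap a known gene on the same strand, so antisense candidates are retained. Coding-potential assessment, which would follow this step, is not modelled here.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Feature:
    """Hypothetical record for an assembled transcript or a known gene."""
    name: str
    chrom: str
    start: int       # 1-based, inclusive
    end: int         # 1-based, inclusive
    strand: str      # "+" or "-"
    length: int = 0  # spliced length in nt (only used for transcripts)


# Assumption: 200 nt is a widely used lower bound for lncRNAs;
# the abstract itself does not state a cutoff.
MIN_LENGTH = 200


def overlaps_same_strand(a: Feature, b: Feature) -> bool:
    """True if a and b share at least one base on the same chromosome and strand."""
    return (
        a.chrom == b.chrom
        and a.strand == b.strand
        and a.start <= b.end
        and b.start <= a.end
    )


def candidate_lncrnas(
    transcripts: Iterable[Feature],
    known_genes: Iterable[Feature],
    min_length: int = MIN_LENGTH,
) -> List[Feature]:
    """Keep transcripts that are long enough and do not overlap any
    known gene on the same strand."""
    genes = list(known_genes)
    return [
        t
        for t in transcripts
        if t.length >= min_length
        and not any(overlaps_same_strand(t, g) for g in genes)
    ]


if __name__ == "__main__":
    genes = [Feature("geneA", "chr1", 1000, 2000, "+")]
    assembled = [
        Feature("tx_sense", "chr1", 1500, 2500, "+", length=800),
        Feature("tx_antisense", "chr1", 1500, 2500, "-", length=800),
        Feature("tx_short", "chr2", 100, 250, "+", length=150),
    ]
    for tx in candidate_lncrnas(assembled, genes):
        print(tx.name)
```

In practice the transcripts and gene models would be read from the assembly and annotation files rather than built by hand, and the surviving candidates would then be passed to a coding-potential tool.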

Key words

Plant long noncoding RNA; Computational identification; Software pipeline; Strand-specific RNA-sequencing; Reference-based transcriptome assembly



This work was supported by grants from the Hong Kong Research Grants Council Area of Excellence Scheme (AoE/M-403/16); CUHK VC Discretionary Fund (VCF2014004); National Key Research and Development Program–Key Innovative and Collaborative Science and Technology Scheme for Hong Kong, Macau, and Taiwan (2017YFE0191100); CUHK Direct Grant (3132782); and the Lo Kwee-Seong Biomedical Research Fund to H.-M.L.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. School of Life Sciences and Center for Soybean Research of the Partner State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, Shatin, China