Computational approaches and challenges for identification and annotation of non-coding RNAs using RNA-Seq

  • Review
Functional & Integrative Genomics

Abstract

Significant innovations in next-generation sequencing techniques and bioinformatics tools have advanced our appreciation and understanding of RNA. Practical RNA-Seq applications have evolved in conjunction with advances in sequencing technology and bioinformatics tools. In this review, we describe computational resources, tools, and advances in bioinformatics analysis for small and large non-coding RNAs (ncRNAs), including PIWI-interacting RNAs, microRNAs, circular RNAs, and long ncRNAs (lncRNAs). In addition, we discuss future challenges, single-cell sequencing of ncRNAs, and the advantages of long-read sequencing for annotating lncRNAs.

Data availability

All relevant data discussed in this article are provided within the article.

Acknowledgements

The authors would like to acknowledge the authors of the various tools discussed in this article.

Author information

Contributions

K.D., I.M., and A.S.T. wrote, edited, and reviewed the original review article; K.D. prepared the figure and table.

Corresponding author

Correspondence to Amarinder Singh Thind.

Ethics declarations

Consent for publication

Not applicable.

Human and animal ethics

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cite this article

Dindhoria, K., Monga, I. & Thind, A.S. Computational approaches and challenges for identification and annotation of non-coding RNAs using RNA-Seq. Funct Integr Genomics 22, 1105–1112 (2022). https://doi.org/10.1007/s10142-022-00915-y
