Abstract
Significant innovations in next-generation sequencing techniques and bioinformatics tools have shaped our appreciation and understanding of RNA. Practical RNA-Seq applications have evolved in conjunction with advances in sequencing technology and bioinformatic tools. In this review, we explain the computational resources, tools, and advances in bioinformatic analysis available for small and large non-coding RNAs (ncRNAs), including piwi-interacting, micro, circular, and long non-coding RNAs (lncRNAs). In addition, this article discusses future challenges, single-cell sequencing of non-coding RNAs, and the advantages of using long-read sequencing to annotate lncRNAs.
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10142-022-00915-y/MediaObjects/10142_2022_915_Fig1_HTML.png)
Data availability
All relevant data discussed are provided in the article.
Acknowledgements
The authors would like to acknowledge the authors of the various tools discussed in this article.
Author information
Contributions
K.D., I.M., and A.S.T. wrote, edited, and reviewed the original review article; K.D. prepared the figure and table.
Ethics declarations
Consent for publication
Not applicable.
Human and animal ethics
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Dindhoria, K., Monga, I. & Thind, A.S. Computational approaches and challenges for identification and annotation of non-coding RNAs using RNA-Seq. Funct Integr Genomics 22, 1105–1112 (2022). https://doi.org/10.1007/s10142-022-00915-y
DOI: https://doi.org/10.1007/s10142-022-00915-y