Abstract
Long intergenic noncoding RNAs (lincRNAs) have emerged as critical participants in gene regulation in a wide range of cell types. The development of whole-transcriptome sequencing, or RNA-seq, has enabled the detection of novel lincRNAs, but the bioinformatics analysis needed to distinguish reliable candidates remains a challenge. Here, we describe, step by step, the bioinformatics workflow developed for identifying novel lincRNAs, including read alignment, transcriptome assembly, and transcript filtering.
Acknowledgment
The work is substantially supported by seven General Research Funds (GRF) to H.W. and H.S. from the Research Grants Council (RGC) of the Hong Kong Special Administrative Region, China (Project Code: 14133016, 14100415, 14102315, 14116014, 14113514, 476113 and 473713).
Copyright information
© 2017 Springer Science+Business Media LLC
About this protocol
Cite this protocol
Peng, X., Sun, K., Zhou, J., Sun, H., Wang, H. (2017). Bioinformatics for Novel Long Intergenic Noncoding RNA (lincRNA) Identification in Skeletal Muscle Cells. In: Perdiguero, E., Cornelison, D. (eds) Muscle Stem Cells. Methods in Molecular Biology, vol 1556. Humana, New York, NY. https://doi.org/10.1007/978-1-4939-6771-1_20
DOI: https://doi.org/10.1007/978-1-4939-6771-1_20
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-4939-6769-8
Online ISBN: 978-1-4939-6771-1
eBook Packages: Springer Protocols