Transposable elements (TEs) are mobile genetic elements that can readily change their genomic position. When not properly silenced, TEs can contribute a substantial portion of the cell’s transcriptome, but they are typically ignored in RNA-seq data analyses. One reason for leaving TE-derived reads out of RNA-seq analyses is the complexity involved in properly aligning short sequencing reads to these highly repetitive regions. Here we describe a method for including TE-derived reads in RNA-seq differential expression analysis using an open-source software package called TEtranscripts. TEtranscripts is designed to assign both uniquely and ambiguously mapped reads to all possible gene and TE-derived transcripts in order to statistically infer the correct gene/TE abundances. We also provide a detailed tutorial of TEtranscripts using a published, qPCR-validated dataset.
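To make the idea of statistically distributing ambiguously mapped reads concrete, the following is a minimal sketch of a generic expectation-maximization (EM) scheme for multi-mapped reads. It is an illustration under stated assumptions, not the TEtranscripts implementation: the function name `em_assign`, the input format (one list of candidate features per read), and the feature names `L1` and `Alu` are all hypothetical, and the sketch ignores feature length and other corrections a real tool would apply.

```python
from collections import defaultdict


def em_assign(reads, n_iter=50):
    """Estimate per-feature read counts from uniquely and ambiguously mapped reads.

    reads: list of lists; each inner list holds the distinct candidate
           features (genes or TEs) that one read aligns to.
    Hypothetical helper for illustration only.
    """
    features = sorted({f for candidates in reads for f in candidates})
    n_reads = len(reads)
    # Start from uniform relative abundances.
    abundance = {f: 1.0 / len(features) for f in features}

    for _ in range(n_iter):
        counts = defaultdict(float)
        # E-step: split each read among its candidates in proportion
        # to the current abundance estimates.
        for candidates in reads:
            total = sum(abundance[f] for f in candidates)
            for f in candidates:
                counts[f] += abundance[f] / total
        # M-step: re-estimate relative abundances from the fractional counts.
        abundance = {f: counts[f] / n_reads for f in features}

    # Report expected read counts per feature.
    return {f: abundance[f] * n_reads for f in features}


if __name__ == "__main__":
    # Three reads unique to L1, one unique to Alu, one shared by both.
    example = [["L1"], ["L1"], ["L1"], ["Alu"], ["L1", "Alu"]]
    for feature, count in em_assign(example).items():
        print(f"{feature}\t{count:.2f}")
```

In this toy input, the uniquely mapped reads favor `L1` three to one, so the single shared read ends up assigned mostly to `L1` rather than being discarded or split evenly. The total across features stays equal to the number of input reads.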
Barbara McClintock laid the foundation for TE research with her discoveries in maize of mobile genetic elements capable of inserting into novel locations in the genome, altering the expression of nearby genes. Since then, our appreciation of the contribution of repetitive TE-derived sequences to eukaryotic genomes has vastly increased. With the publication of the first human genome draft by the Human Genome Project, it was determined that nearly half of the human genome is derived from TE sequences [2, 3], with varying levels of repetitive DNA present in most plant and animal species. More recent studies looking at distantly related TE-like sequences have estimated that up to two-thirds of the human genome might be repeat-derived, with the vast majority of these sequences attributed to retrotransposons that require transcription as part of the mobilization process, as discussed below.
Keywords: RNA-seq, Transposable elements, TEtranscripts, Differential expression analysis, STAR, DESeq
Sciamanna I et al (2013) A tumor-promoting mechanism mediated by retrotransposon-encoded reverse transcriptase is active in human transformed cell lines. Oncotarget 4:2271–2287
Coufal NG et al (2011) Ataxia telangiectasia mutated (ATM) modulates long interspersed element-1 (L1) retrotransposition in human neural stem cells. Proc Natl Acad Sci U S A 108:20382–20387
Chung D et al (2011) Discovering transcription factor binding sites in highly repetitive regions of genomes with multiread analysis of ChIP-Seq data. PLoS Comput Biol 7:e1002111
Tucker BA et al (2011) Exome sequencing and analysis of induced pluripotent stem cells identify the cilia-related gene male germ cell-associated kinase (MAK) as a cause of retinitis pigmentosa. Proc Natl Acad Sci U S A 108:E569–E576
Jin Y, Tam OH, Paniagua E, Hammell M (2015) TEtranscripts: a package for including transposable elements in differential expression analysis of RNA-seq datasets. Bioinformatics 31:3593–3599
Lin Y, Golovnina K, Chen ZX, Lee HN et al (2016) Comparison of normalization and differential expression analysis using RNA-seq data from 726 individual Drosophila melanogaster. BMC Genomics 17:28