Comparing gene expression profiles measured across a wide range of tissue types, developmental stages, or environmental conditions can yield valuable insights into the mechanisms of cell/tissue specification and differentiation, and can identify cell/tissue-type-specific responses to environmental stimuli. Critical for such comparisons is that data from different sources are processed identically. This may also include integrating a novel data set into an existing collection of data sets (e.g., in-house and publicly available data). Here, I describe a complete workflow for RNA-Seq data, from the data processing steps to the comparison of gene expression profiles. I use publicly available data for demonstration purposes, but I also describe how to integrate your own data sets. The workflow runs on all three major operating systems (Linux, macOS, and Windows). The scripts and the tutorial are available at github.com/MWSchmid/RNAseq_protocol.
Keywords: RNA-Seq, Public data, Data integration, Analysis, Differential expression, Multigroup comparisons, Gene expression, Transcriptome, Workflow