Abstract
Changes in an organism's surroundings are mirrored by changes in its transcript profile. For a plant pathogen, host colonization is one such challenge, and it triggers changes in transcript expression patterns. Determining the transcriptional profile can therefore provide valuable clues about how an organism responds to a defined stimulus, in this case how a pathogen colonizes its host. Several robust data analysis methods and pipelines are available for identifying these differentially expressed transcripts. In this chapter we outline the steps needed to run one such pipeline, along with the caveats to keep in mind.
Key words
- RNA-seq
- Transcriptome
- Transcript profile
- Data analysis
- Pipeline
- Differentially expressed genes
- Splice-aware
- HISAT2
- StringTie
- DESeq2
Acknowledgments
This work is supported by the USDA National Institute of Food and Agriculture, Hatch project FLA-FTL-005926. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the view of the National Institute of Food and Agriculture (NIFA) or the US Department of Agriculture (USDA).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Gill, N., Dhillon, B. (2022). RNA-seq Data Analysis for Differential Expression. In: Coleman, J. (eds) Fusarium wilt. Methods in Molecular Biology, vol 2391. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-1795-3_4
DOI: https://doi.org/10.1007/978-1-0716-1795-3_4
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-1794-6
Online ISBN: 978-1-0716-1795-3
eBook Packages: Springer Protocols