RNA-seq Data Analysis for Differential Expression

Part of the Methods in Molecular Biology book series (MIMB, volume 2391)


Changes in an organism's surrounding environment are mirrored by changes in its transcript profile. For a plant pathogen, host colonization is a challenge that triggers changes in transcript expression patterns. Determining the transcriptional profile can therefore provide valuable clues about how an organism responds to a defined stimulus, in this case how a pathogen colonizes its host. Several robust data analysis methods and pipelines are available for identifying these differentially expressed transcripts. In this chapter, we outline the steps needed to run one such pipeline, along with the associated caveats.

Key words

  • RNA-seq
  • Transcriptome
  • Transcript profile
  • Data analysis
  • Pipeline
  • Differentially expressed genes
  • Splice-aware
  • HISAT2
  • StringTie
  • DESeq2
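
The key words name a splice-aware aligner (HISAT2), a transcript assembler and quantifier (StringTie), and a differential expression package (DESeq2). The chapter's exact commands are not visible in this preview, so the following is only a minimal sketch of how the first two stages are commonly chained for one paired-end sample. The function name, file names, and thread count are hypothetical; the flags shown are standard options of the respective command-line tools. The sketch only builds the command lines and does not run them.

```python
def build_pipeline_commands(sample, index_prefix, reads_1, reads_2,
                            annotation_gtf, threads=4):
    """Return command lines (as argument lists) for one paired-end sample.

    All file names are illustrative assumptions, not taken from the chapter.
    """
    sam = f"{sample}.sam"
    bam = f"{sample}.sorted.bam"
    gtf = f"{sample}/{sample}.gtf"
    return [
        # Splice-aware alignment; --dta reports alignments tailored
        # for downstream transcript assemblers such as StringTie.
        ["hisat2", "-p", str(threads), "--dta", "-x", index_prefix,
         "-1", reads_1, "-2", reads_2, "-S", sam],
        # Coordinate-sort the alignments, which StringTie expects.
        ["samtools", "sort", "-@", str(threads), "-o", bam, sam],
        # Quantify against a reference annotation (-G); -e restricts
        # estimation to the transcripts in that annotation.
        ["stringtie", bam, "-p", str(threads), "-G", annotation_gtf,
         "-e", "-o", gtf],
    ]


# Hypothetical usage: print the commands for a single sample.
for cmd in build_pipeline_commands("sample1", "index/genome",
                                   "sample1_R1.fastq.gz",
                                   "sample1_R2.fastq.gz",
                                   "annotation.gtf"):
    print(" ".join(cmd))
```

The per-sample StringTie output would then typically be converted to a count matrix and analyzed with DESeq2 in R; that stage is not shown here because its design (conditions, replicates) depends on the experiment.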



This work is supported by the USDA National Institute of Food and Agriculture, Hatch project FLA-FTL-005926. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the view of the National Institute of Food and Agriculture (NIFA) or the US Department of Agriculture (USDA).

Author information

Corresponding author

Correspondence to Braham Dhillon.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature

Cite this protocol

Gill, N., Dhillon, B. (2022). RNA-seq Data Analysis for Differential Expression. In: Coleman, J. (eds) Fusarium wilt. Methods in Molecular Biology, vol 2391. Humana, New York, NY.

  • DOI: 10.1007/978-1-0716-1795-3_4

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-1794-6

  • Online ISBN: 978-1-0716-1795-3

  • eBook Packages: Springer Protocols