Trimming and Validation of Illumina Short Reads Using Trimmomatic, Trinity Assembly, and Assessment of RNA-Seq Data

Plant Bioinformatics

Part of the book series: Methods in Molecular Biology (MIMB, volume 2443)

Abstract

Next-generation sequencing (NGS) technologies can generate billions of reads in a single sequencing run. However, such high throughput brings quality issues that must be addressed before downstream analysis. Quality control on short reads is usually performed at default settings, owing to a lack of in-depth understanding of a given tool's parameters and of how changing them affects the output. Here we demonstrate how to optimize read trimming using Trimmomatic. We highlight the benefits of trimming by comparing the quality of transcripts assembled from trimmed and untrimmed reads.
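To illustrate why a trimming parameter should not be left at its default without thought, the following is a minimal Python sketch of sliding-window quality trimming, the general idea behind Trimmomatic's SLIDINGWINDOW step. It is a simplified illustration, not Trimmomatic's implementation: the function names `phred33_scores` and `sliding_window_keep_length`, the example quality string, and the thresholds are all assumptions chosen for demonstration, and Phred+33 encoding is assumed.

```python
def phred33_scores(quality_string):
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(ch) - 33 for ch in quality_string]


def sliding_window_keep_length(scores, window=4, min_avg=15):
    """Return how many leading bases to keep.

    Scans windows from the 5' end and cuts the read at the start of the
    first window whose mean quality falls below min_avg. Simplified
    sketch: reads shorter than the window are returned unchanged.
    """
    n = len(scores)
    for start in range(0, n - window + 1):
        window_mean = sum(scores[start:start + window]) / window
        if window_mean < min_avg:
            return start
    return n


# Hypothetical read: six high-quality bases followed by four poor ones.
read = "ACGTACGTAC"
quals = phred33_scores("IIIIII++++")  # 'I' decodes to 40, '+' to 10

# The same read is cut at different positions as the threshold changes.
for threshold in (15, 20, 30):
    keep = sliding_window_keep_length(quals, window=4, min_avg=threshold)
    print(threshold, read[:keep])
```

Raising the threshold trims the read earlier, so a stricter setting trades read length for base quality; that trade-off is what the choice of trimming parameters controls.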

Author information

Corresponding author

Correspondence to Paul Visendi.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Sewe, S.O., Silva, G., Sicat, P., Seal, S.E., Visendi, P. (2022). Trimming and Validation of Illumina Short Reads Using Trimmomatic, Trinity Assembly, and Assessment of RNA-Seq Data. In: Edwards, D. (eds) Plant Bioinformatics. Methods in Molecular Biology, vol 2443. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2067-0_11

  • DOI: https://doi.org/10.1007/978-1-0716-2067-0_11

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-2066-3

  • Online ISBN: 978-1-0716-2067-0

  • eBook Packages: Springer Protocols
