Abstract
In this chapter, we describe a computational pipeline for the in silico detection of plant viruses by high-throughput sequencing (HTS) of total RNA samples. The pipeline is designed for the analysis of short reads generated on an Illumina platform and relies on freely available software tools. First, we provide advice on high-quality total RNA purification, library preparation, and sequencing. The bioinformatics pipeline begins with the raw reads obtained from the sequencing machine and performs several curation steps to obtain long contigs. The contigs are then searched with BLAST against a local database of reference viral nucleotide sequences to identify the viruses present in the samples, and the search is refined by applying specific filters. We also provide the code to re-map the short reads against the viruses found, which yields sequencing depth and read coverage for each virus. No previous bioinformatics background is required, but basic knowledge of the Unix command line and the R language is recommended.
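The filtering step mentioned above can be illustrated with a minimal sketch. It assumes the BLAST results were saved in the standard 12-column tabular format (`-outfmt 6`). The thresholds (90% identity, 200 nt alignment length, E-value of 1e-5), the function name `filter_hits`, and the contig and virus names in the example are illustrative assumptions, not the filters used in the chapter.

```python
import csv
import io

# Standard 12 columns of BLAST tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def filter_hits(handle, min_identity=90.0, min_length=200, max_evalue=1e-5):
    """Keep, for each contig, the highest-bitscore hit that passes all thresholds.

    The default thresholds are hypothetical and should be tuned per study.
    """
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        pident = float(hit["pident"])
        length = int(hit["length"])
        evalue = float(hit["evalue"])
        bitscore = float(hit["bitscore"])
        if pident < min_identity or length < min_length or evalue > max_evalue:
            continue
        current = best.get(hit["qseqid"])
        if current is None or bitscore > current["bitscore"]:
            best[hit["qseqid"]] = {
                "sseqid": hit["sseqid"],
                "pident": pident,
                "length": length,
                "evalue": evalue,
                "bitscore": bitscore,
            }
    return best


# Made-up example rows; contig and virus identifiers are placeholders.
EXAMPLE = (
    "contig_1\tvirus_A\t98.5\t1500\t22\t0\t1\t1500\t10\t1509\t0.0\t2650\n"
    "contig_1\tvirus_B\t91.0\t800\t72\t0\t1\t800\t5\t804\t1e-100\t900\n"
    "contig_2\tvirus_C\t75.0\t600\t150\t0\t1\t600\t1\t600\t1e-40\t300\n"
    "contig_3\tvirus_A\t99.0\t120\t1\t0\t1\t120\t1\t120\t2e-30\t200\n"
)

hits = filter_hits(io.StringIO(EXAMPLE))
for contig, hit in hits.items():
    print(contig, hit["sseqid"], hit["pident"], hit["length"])
```

In this example only `contig_1` is retained, assigned to its best-scoring subject; `contig_2` fails the identity threshold and `contig_3` fails the length threshold. With real data, the function would be called on an open file handle to the BLAST output instead of the in-memory string.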
Acknowledgments
We thank Ayoub Maachi, a former PhD student of Abiopep S.L., for his help with testing this bioinformatics pipeline. LD is a recipient of a fellowship of the Torres Quevedo Program (Programa de Contratación de Doctores y Tecnólogos, Ref. PTQ2021-011629) at Abiopep S.L. This research received funding from the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement no. 813542T.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Donaire, L., Aranda, M.A. (2024). Computational Pipeline for the Detection of Plant RNA Viruses Using High-Throughput Sequencing. In: Fontes, E.P., Mäkinen, K. (eds) Plant-Virus Interactions. Methods in Molecular Biology, vol 2724. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-3485-1_1
DOI: https://doi.org/10.1007/978-1-0716-3485-1_1
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-3484-4
Online ISBN: 978-1-0716-3485-1
eBook Packages: Springer Protocols