Bioinformatic Analysis of Small RNA Sequencing Libraries

  • Ricardo A. Chávez Montes
  • Fabiola Jaimes-Miranda
  • Stefan de Folter
Part of the Methods in Molecular Biology book series (MIMB, volume 1932)


Bioinformatic analysis of small RNA sequencing libraries consists of transforming the FASTQ files from a series of small RNA sequencing experiments into a table of small RNA sequences and their abundances. This is achieved by cleaning the reads, aligning the cleaned reads to a reference, and parsing the alignment results. In this protocol we present the most common option for each of these steps, together with its rationale.
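As a rough illustration of the transformation described above, the following Python sketch turns FASTQ records into a sequence-by-abundance table. It is not the protocol's own implementation. It covers only the cleaning and counting steps, and it omits alignment to a reference, which in practice is done with a dedicated aligner. The adapter string, the 18 to 30 nt size window, and all function names are assumptions made for this example.

```python
from collections import Counter
from io import StringIO

ADAPTER = "TGGAATTCTCGG"   # placeholder 3' adapter prefix (assumption); use your kit's adapter
MIN_LEN, MAX_LEN = 18, 30  # assumed size window for small RNA inserts


def read_fastq(handle):
    """Yield the sequence line of each four-line FASTQ record."""
    while True:
        header = handle.readline()
        if not header:
            return
        sequence = handle.readline().strip()
        handle.readline()  # '+' separator line
        handle.readline()  # quality line (ignored in this sketch)
        yield sequence


def clean_read(sequence, adapter=ADAPTER):
    """Return the insert preceding an exact adapter match, or None."""
    position = sequence.find(adapter)
    if position == -1:
        return None  # no adapter found: discard the read
    insert = sequence[:position]
    return insert if MIN_LEN <= len(insert) <= MAX_LEN else None


def abundance_table(handle):
    """Count how often each cleaned sequence occurs."""
    counts = Counter()
    for sequence in read_fastq(handle):
        insert = clean_read(sequence)
        if insert is not None:
            counts[insert] += 1
    return counts


if __name__ == "__main__":
    # Two toy reads carrying the same 21-nt insert followed by the adapter.
    read = "TCGGACCAGGCTTCATTCCCC" + ADAPTER
    record = "@read\n{0}\n+\n{1}\n".format(read, "I" * len(read))
    for sequence, count in abundance_table(StringIO(record * 2)).most_common():
        print(sequence, count, sep="\t")
```

Exact string matching is used here only to keep the example short. Dedicated trimmers tolerate sequencing errors and partial adapters, which this sketch does not.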

Key words

Small RNA, miRNA, Adapter, Bowtie, ShortStack, Bioinformatics, Sequences



Acknowledgments

Work in the SDF laboratory was financed by the Mexican National Council of Science and Technology (CONACyT) grants CB-2012-177739 and FC-2015-2/1061.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Ricardo A. Chávez Montes (1, corresponding author)
  • Fabiola Jaimes-Miranda (2)
  • Stefan de Folter (1)

  1. Unidad de Genómica Avanzada / Laboratorio Nacional de Genómica para la Biodiversidad (LANGEBIO), Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional (CINVESTAV-IPN), Irapuato, Mexico
  2. CONACyT-Instituto Potosino de Investigación Científica y Tecnológica AC / División de Biología Molecular, San Luis Potosí, Mexico
