The UEA Small RNA Workbench: A Suite of Computational Tools for Small RNA Analysis

  • Irina Mohorianu
  • Matthew Benedict Stocks
  • Christopher Steven Applegate
  • Leighton Folkes
  • Vincent Moulton
Part of the Methods in Molecular Biology book series (MIMB, volume 1580)

Abstract

RNA silencing (RNA interference, RNAi) is a complex, highly conserved mechanism mediated by short noncoding RNAs, typically 20–24 nt in length, known as small RNAs (sRNAs). These sRNAs act as guides for the sequence-specific transcriptional and posttranscriptional regulation of target mRNAs and play a key role in fine-tuning biological processes such as growth, stress responses, and defense mechanisms.

High-throughput sequencing (HTS) technologies are employed to capture the expression levels of sRNA populations. Processing the resulting large data sets facilitates the computational analysis of sRNA patterns of variation across biological samples, such as time-point experiments, tissue series, or various treatments. Rapid technological advances enable larger experiments, often with biological replicates, leading to a vast amount of raw data. As a result, in this fast-evolving field, the existing methods for sequence characterization and prediction of interaction (regulatory) networks periodically require adapting or, in extreme cases, a complete redesign to cope with the data deluge. In addition, the presence of numerous tools, each focused only on particular steps of HTS analysis, hinders the systematic parsing of the results and their interpretation.

The UEA small RNA Workbench (v1-4), described in this chapter, provides a user-friendly, modular, interactive analysis in the form of a suite of computational tools designed to process and mine sRNA datasets for interesting characteristics that can be linked back to the observed phenotypes. First, we show how to preprocess the raw sequencing output and prepare it for downstream analysis. Then, we review some quality checks that can be used as a first indication of sources of variability between samples. Next, we show how the Workbench can provide a comparison of the effects of different normalization approaches on the distributions of expression, enhanced methods for the identification of differentially expressed transcripts, and a summary of their corresponding patterns. Finally, we describe individual analysis tools, such as PAREsnip for the analysis of PARE (degradome) data and CoLIde for the identification of sRNA loci based on their expression patterns, and the visualization of the results using the software. We illustrate the features of the UEA sRNA Workbench on Arabidopsis thaliana and Homo sapiens datasets.

Key words

  • Small RNA (sRNA)
  • microRNA (miRNA)
  • High-throughput sequencing
  • UEA sRNA Workbench
  • Normalization
  • Differential expression
  • sRNA loci
  • Degradome analysis

Notes

Acknowledgments

We thank Matthew Beckers, Tamas Dalmay, Frank Schwach, Simon Moxon, Hugh Woolfenden, Helio Pais, and Daniel Mapleson for their contributions to the UEA sRNA Workbench.

Copyright information

© Springer Science+Business Media LLC 2017

Authors and Affiliations

  • Irina Mohorianu (1, 2)
  • Matthew Benedict Stocks (2)
  • Christopher Steven Applegate (2)
  • Leighton Folkes (3)
  • Vincent Moulton (2)

  1. School of Biological Sciences, University of East Anglia, Norwich, UK
  2. School of Computing Sciences, University of East Anglia, Norwich, UK
  3. The Earlham Institute, Norwich, UK
