RUbioSeq+: An Application that Executes Parallelized Pipelines to Analyse Next-Generation Sequencing Data

  • Miriam Rubio-Camarillo
  • Hugo López-Fernández
  • Gonzalo Gómez-López
  • Ángel Carro
  • José María Fernández
  • Florentino Fdez-Riverola
  • Daniel Glez-Peña
  • David G. Pisano
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 477)

Abstract

To facilitate routine analysis and improve the reproducibility of results, next-generation sequencing (NGS) analysis requires intuitive, efficient and integrated data processing pipelines. Here, we present RUbioSeq+, a multi-platform application that incorporates a suite of automated and parallelized workflows to analyse NGS data. The software supports DNA-seq (single-nucleotide and copy number variation analyses) as well as bisulfite-seq and ChIP-seq workflows. RUbioSeq+ supports parallelized and multithreaded execution, and its interactive graphical user interface facilitates its use by both biomedical researchers and bioinformaticians. Results generated by our software have been experimentally validated and accepted for publication. RUbioSeq+ is freely available to all users at http://rubioseq.bioinfo.cnio.es/.

Keywords

NGS analysis; Parallelized workflows; Whole-genome; Variant calling; ChIPSeq; Bisulfite-Seq; CNV; HPC; SGE

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Miriam Rubio-Camarillo (1)
  • Hugo López-Fernández (2, 3)
  • Gonzalo Gómez-López (1)
  • Ángel Carro (1)
  • José María Fernández (4)
  • Florentino Fdez-Riverola (2, 3)
  • Daniel Glez-Peña (2, 3)
  • David G. Pisano (1)

  1. Bioinformatics Unit (UBio), Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
  2. ESEI - Escuela Superior de Ingeniería Informática, Edificio Politécnico, Ourense, Spain
  3. Instituto de Investigación Biomédica de Vigo (IBIV), Vigo, Spain
  4. Structural Computational Biology Group, Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
