Journal of Grid Computing, Volume 14, Issue 4, pp 533–543

The Essential Components of a Successful Galaxy Service

  • Annette McGrath
  • Steve McMahon
  • Sean Li
  • Joel Ludbey
  • Tim Ho

Abstract

Driven by advances in data generation technologies and fuelled by a radical reduction in costs, genomics has become a data science. Nonetheless, progress in the field has been constrained by the ability to analyse data. Science gateways, such as Galaxy, have the potential to enable bench biologists to analyse their own data without needing to be familiar with the command line. Implementing a production-scale Galaxy service that is sufficiently well featured and resourced to meet the needs of end users is a significant undertaking, and a number of factors must be considered and combined for the service to be successfully adopted by the community. In this paper, we describe the process that we undertook to implement a Galaxy service and what we consider to be the essential components of such a service. Our experience and insights will be of interest to those planning to implement a science gateway service in a research organisation.

Keywords

Galaxy; HPC; Bioinformatics; Next generation sequencing; Science gateway

Copyright information

© Springer Science+Business Media Dordrecht 2016

Authors and Affiliations

  • Annette McGrath (1)
  • Steve McMahon (2)
  • Sean Li (1)
  • Joel Ludbey (3)
  • Tim Ho (3)
  1. CSIRO Data61, Acton, Australia
  2. CSIRO Information Management & Technology, Yarralumla, Australia
  3. CSIRO Information Management & Technology, Clayton, Australia
