Abstract
e-Science focuses on the use of computational tools and resources to analyze large scientific datasets. Performing these analyses often requires running a variety of computational tools specific to a given scientific domain. This places a significant burden on individual researchers, for whom simply running these tools may be prohibitively difficult, let alone combining tools into a complete analysis or acquiring data and appropriate computational resources. This limits the productivity of individual researchers and represents a significant barrier to potential scientific discovery. To relieve researchers of such unnecessary complexities and promote more robust science, we have developed a tool integration framework called Galaxy. Galaxy abstracts individual tools behind a consistent and easy-to-use web interface, enabling advanced data analysis that requires no informatics expertise. Furthermore, Galaxy makes it easy to add newly developed tools, thus supporting tool developers, and facilitates transparent and reproducible communication of computationally intensive analyses. Recently, we have enabled trivial deployment of a complete Galaxy solution on aggregated infrastructures, including cloud computing providers.
Acknowledgments
Galaxy is developed by the Galaxy Team: Enis Afgan, Guruprasad Ananda, Dannon Baker, Dan Blankenberg, Ramkrishna Chakrabarty, Nate Coraor, Jeremy Goecks, Greg Von Kuster, Ross Lazarus, Kanwei Li, Anton Nekrutenko, James Taylor, and Kelly Vincent. We thank our many collaborators who support and maintain data warehouses and browsers accessible through Galaxy. Development of the Galaxy framework is supported by NIH grants HG004909 (A.N. and J.T.), HG005133 (J.T. and A.N.), and HG005542 (J.T. and A.N.), by NSF grant DBI-0850103 (A.N. and J.T.), and by funds from the Huck Institutes for the Life Sciences and the Institute for CyberScience at Penn State. Additional funding is provided, in part, under a grant with the Pennsylvania Department of Health using Tobacco Settlement Funds. The Department specifically disclaims responsibility for any analyses, interpretations, or conclusions.
Copyright information
© 2011 Springer-Verlag London Limited
About this chapter
Cite this chapter
Afgan, E. et al. (2011). Galaxy: A Gateway to Tools in e-Science. In: Yang, X., Wang, L., Jie, W. (eds) Guide to e-Science. Computer Communications and Networks. Springer, London. https://doi.org/10.1007/978-0-85729-439-5_6
DOI: https://doi.org/10.1007/978-0-85729-439-5_6
Publisher Name: Springer, London
Print ISBN: 978-0-85729-438-8
Online ISBN: 978-0-85729-439-5
eBook Packages: Computer Science, Computer Science (R0)