Open Science in the Cloud: Towards a Universal Platform for Scientific and Statistical Computing
The UK, through the e-Science program, the US, through the NSF-funded cyberinfrastructure, and the European Union, through the ICT Calls, aimed to provide “the technological solution to the problem of efficiently connecting data, computers, and people with the goal of enabling derivation of novel scientific theories and knowledge”. The Grid (Foster, 2002; Foster, Kesselman, Nick, & Tuecke, 2002), foreseen as a major accelerator of discovery, did not meet the expectations it raised at its inception and was not adopted by the broad population of research professionals. The Grid is a good tool for particle physicists, and it has allowed them to tackle the tremendous computational challenges inherent to their field. However, as a technology and paradigm for delivering computing on demand, it does not work and it cannot be fixed. On the one hand, “the abstractions that Grids expose – to the end-user, to the deployers and to application developers – are inappropriate and they need to be higher level” (Jha, Merzky, & Fox, 2009); on the other hand, academic Grids are inherently economically unsustainable. They cannot compete with a service outsourced to industry, whose quality and price would be driven by market forces. Virtualization technologies and their corollary, the Infrastructure-as-a-Service (IaaS) style cloud, hold the promise of enabling what the Grid failed to deliver: a sustainable environment for computational sciences that would lower the barriers to accessing federated computational resources, software tools and data; enable collaboration and resource sharing; and provide the building blocks of a ubiquitous platform for traceable and reproducible computational research.
Keywords: Virtual Machine; Application Program Interface; Cell Range; Java Virtual Machine; Computational Engine
- Amazon, Inc. (2006). Amazon elastic compute cloud [Online]. Available: http://aws.amazon.com/ec2.
- Amazon, Inc. (2006). Amazon simple storage service [Online]. Available: http://aws.amazon.com/s3.
- Foster, I., Kesselman, C., Nick, J., & Tuecke, S. (2002). The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration. Open Grid Service Infrastructure WG, Global Grid Forum.
- Jha, S., Merzky, A., & Fox, G. (2009). Using clouds to provide grids with higher levels of abstraction and explicit support for usage modes. Concurrency and Computation: Practice and Experience, 21(8), 1087–1108.
- Keahey, K., & Freeman, T. (October 2008). Science clouds: Early experiences in cloud computing for scientific applications. Chicago, IL: Cloud Computing and Its Applications 2008 (CCA-08).
- R Development Core Team (2009). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.
- Theus, M., & Urbanek, S. (2008). Interactive graphics for data analysis: Principles and examples. CRC Press. ISBN 978-1-5848-8594-8.
- YarKhan, A., Dongarra, J., & Seymour, K. (July 2006). NetSolve to GridSolve: The evolution of a network enabled solver. IFIP WoCo9 Conference “Grid-Based Problem Solving Environments: Implications for Development and Deployment of Numerical Software,” Prescott, AZ.