Abstract
Scientists, educators, decision makers, students, and many others use scientific data produced by science instruments. With these data, they study our universe, make new discoveries in areas such as weather forecasting and cancer research, and shape policy decisions that affect nations fiscally, socially, economically, and in many other ways. Over the past 20 years or so, the data produced by these instruments have increased in volume, complexity, and resolution, and traditional computing infrastructures have struggled to scale to handle them. This reality has led us, and others, to investigate whether cloud computing can address these scalability challenges. NASA’s Jet Propulsion Laboratory (JPL) is at the forefront of transitioning its science applications to the cloud environment. Through the Apache Object Oriented Data Technology (OODT) framework, NASA’s first software released at the open-source Apache Software Foundation (ASF), engineers at JPL have been able to scale the storage and computational aspects of their scientific data systems to the cloud, thus achieving reduced costs and improved performance. In this chapter, we report on the use of Apache OODT for cloud computing, citing several examples from a number of scientific domains. We also report our experience and specific performance numbers, and suggest directions for future work in the area.
Acknowledgments
The authors would like to thank the many project sponsors and collaborators who have supported this effort from the National Aeronautics and Space Administration, the National Cancer Institute, and the Jet Propulsion Laboratory. These include Elizabeth Kay-Im, Gary Lau, Sudhir Srivastava, Christos Patriotis, Shan Malhotra, Dana Freeborn, Andrew Hart, Paul Ramirez, Brian Foster, and Brian Chafin.
This work was performed at the Jet Propulsion Laboratory, California Institute of Technology, under contract to the National Aeronautics and Space Administration.
Copyright information
© 2013 Springer-Verlag London
About this chapter
Cite this chapter
Crichton, D. et al. (2013). Architecting Scientific Data Systems in the Cloud. In: Mahmood, Z. (eds) Cloud Computing. Computer Communications and Networks. Springer, London. https://doi.org/10.1007/978-1-4471-5107-4_2
DOI: https://doi.org/10.1007/978-1-4471-5107-4_2
Publisher Name: Springer, London
Print ISBN: 978-1-4471-5106-7
Online ISBN: 978-1-4471-5107-4
eBook Packages: Computer Science, Computer Science (R0)