I/O-Focused Cost Model for the Exploitation of Public Cloud Resources in Data-Intensive Workflows

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10049)

Abstract

Ultrascale computing systems will blur the line between HPC and cloud platforms, transparently offering the end user every available computing resource, regardless of its characteristics, location, and philosophy. However, this vision is still far from being realized. In this work, we propose a model for calculating the costs of deploying data-intensive applications on IaaS cloud platforms. The model focuses on the I/O-related costs of such applications and on the evaluation of alternative I/O solutions. This paper also compares the costs of a typical cloud storage service with those of our proposed in-memory I/O accelerator, Hercules, which shows great flexibility in the price/performance trade-off. With Hercules, execution time is reduced by up to 25%, while costs remain similar to those of Amazon S3.

F. Rodrigo Duro—This work was supported by the project TIN2013-41350-P “Scalable Data Management Techniques for High-End Computing Systems” from the Ministerio de Economía y Competitividad, Spain.

Corresponding author

Correspondence to Francisco Rodrigo Duro.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Rodrigo Duro, F., Garcia Blas, J., Carretero, J. (2016). I/O-Focused Cost Model for the Exploitation of Public Cloud Resources in Data-Intensive Workflows. In: Carretero, J., et al. Algorithms and Architectures for Parallel Processing. ICA3PP 2016. Lecture Notes in Computer Science, vol 10049. Springer, Cham. https://doi.org/10.1007/978-3-319-49956-7_20

  • DOI: https://doi.org/10.1007/978-3-319-49956-7_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49955-0

  • Online ISBN: 978-3-319-49956-7

  • eBook Packages: Computer Science, Computer Science (R0)
