I/O-Focused Cost Model for the Exploitation of Public Cloud Resources in Data-Intensive Workflows

  • Francisco Rodrigo Duro
  • Javier Garcia Blas
  • Jesus Carretero
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10049)


Ultrascale computing systems will blur the line between HPC and cloud platforms, transparently offering the end user every available computing resource, regardless of its characteristics, location, and philosophy. However, this vision is still far from being realized. In this work, we propose a model for calculating the costs associated with deploying data-intensive applications on IaaS cloud platforms. The model focuses on the I/O-related costs of data-intensive applications and on the evaluation of alternative I/O solutions. We also compare the costs of a typical cloud storage service with those of our proposed in-memory I/O accelerator, Hercules, which shows great potential for flexibility in the price/performance trade-off. With Hercules, execution time is reduced by up to 25%, while costs remain similar to those of Amazon S3.


Keywords: Cloud, Amazon, Data-intensive, Cost model, Workflows



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Francisco Rodrigo Duro (1)
  • Javier Garcia Blas (1)
  • Jesus Carretero (1)
  1. Computer Science and Engineering Department, University Carlos III, Leganes, Spain
