Measuring the Effectiveness of Throttled Data Transfers on Data-Intensive Workflows

  • Ricardo J. Rodríguez
  • Rafael Tolosana-Calasanz
  • Omer F. Rana
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7327)

Abstract

In data-intensive workflows, which often involve files, transfers between tasks are typically carried out as fast as the network links allow, and once transferred, the files are buffered or stored at their destination. Where a task requires multiple files to execute (from different preceding tasks), it must remain idle until all files are available. Hence, network bandwidth and buffer/storage within a workflow are often not used effectively. In this paper, we quantitatively measure the impact that applying an intelligent data movement policy can have on buffer/storage use in comparison with existing approaches. Our main objective is to propose a metric that considers a workflow structure expressed as a Directed Acyclic Graph (DAG), together with performance information collected from past executions of the workflow. This metric is intended for use at the design stage, to compare various DAG structures and evaluate their potential for optimisation (of network bandwidth and buffer use).
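The idle-buffer effect described in the abstract can be illustrated with a minimal sketch. This is not the metric proposed in the paper; it is a toy model under stated assumptions: a single synchronisation point, a fixed link bandwidth per transfer, and hypothetical names (`Transfer`, `arrival`, `buffer_cost`, `throttled_rates`) introduced only for illustration. It computes how long fully received files sit in the destination buffer when sent at full speed, and the lowest transfer rate each file could use while still arriving by the time the joining task can start.

```python
from dataclasses import dataclass


@dataclass
class Transfer:
    """One input file of a joining task (all fields are illustrative assumptions)."""
    name: str
    size_mb: float    # file size in MB
    ready_at: float   # time (s) at which the producing task finishes
    bandwidth: float  # link capacity in MB/s


def arrival(t: Transfer) -> float:
    """Arrival time when the file is sent as fast as the link allows."""
    return t.ready_at + t.size_mb / t.bandwidth


def buffer_cost(transfers: list[Transfer]) -> float:
    """MB*s that fully received files wait in the buffer before the join starts."""
    start = max(arrival(t) for t in transfers)  # join starts when the last file arrives
    return sum(t.size_mb * (start - arrival(t)) for t in transfers)


def throttled_rates(transfers: list[Transfer]) -> dict[str, float]:
    """Lowest rate (MB/s) per file that still meets the join start time."""
    start = max(arrival(t) for t in transfers)
    return {t.name: t.size_mb / (start - t.ready_at) for t in transfers}


# Hypothetical example: two predecessors feeding one joining task.
inputs = [
    Transfer("A", size_mb=100.0, ready_at=0.0, bandwidth=10.0),
    Transfer("B", size_mb=200.0, ready_at=5.0, bandwidth=10.0),
]
unthrottled_cost = buffer_cost(inputs)
rates = throttled_rates(inputs)
```

In this example, file A arrives well before file B and waits in the buffer, whereas sending A at the lower rate returned by `throttled_rates` would have it arrive just as the join becomes runnable, freeing bandwidth in the meantime. The sketch ignores partially received data and contention between transfers.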

Keywords

Directed Acyclic Graph; Network Bandwidth; Performance Information; Input Place; Synchronisation Point
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Ricardo J. Rodríguez (1)
  • Rafael Tolosana-Calasanz (1)
  • Omer F. Rana (2)
  1. Dpto. de Informática e Ingeniería de Sistemas, Universidad de Zaragoza, Zaragoza, Spain
  2. School of Computer Science & Informatics, Cardiff University, Cardiff, United Kingdom
