Grid computing promises to enable a reliable and easy-to-use computational infrastructure for e-Science. To materialize this promise, grids need to provide full automation from the experiment design to the final result. Often, this automation relies on the execution of workflows, that is, jobs comprising many inter-related computing and data transfer tasks. While several grid workflow execution tools already exist, not much is known about their workload. This lack of knowledge hampers the development of new workflow scheduling algorithms and slows the tuning of existing ones. To address this situation, in this work we present an analysis of two workflow-based workload traces from the Austrian Grid. We introduce a method for analyzing such traces, focused on the intrinsic and the environment-related characteristics of the workflows. Then, we analyze the workflows executed in the Austrian Grid over the last two years. Finally, we identify six categories of workflows based on their intrinsic characteristics. We show that the six categories exhibit distinctive environment-related characteristics, and we identify the categories that are difficult for common workflow schedulers to execute.
grid
Copyright information
© 2008 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Ostermann, S., Prodan, R., Fahringer, T., Iosup, A., Epema, D. (2008). A Trace-Based Investigation Of The Characteristics Of Grid Workflows. In: Priol, T., Vanneschi, M. (eds) From Grids to Service and Pervasive Computing. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-09455-7_14
DOI: https://doi.org/10.1007/978-0-387-09455-7_14
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-09454-0
Online ISBN: 978-0-387-09455-7
eBook Packages: Computer Science, Computer Science (R0)