Abstract
Grids enable the sharing, selection, and aggregation of geographically distributed resources among various organizations. They are emerging as a promising computing paradigm for resource- and compute-intensive scientific workflow applications, which are modeled as Directed Acyclic Graphs (DAGs) with intricate inter-task dependencies. With the growing popularity of real-time applications, streaming workflows continuously produce large quantities of experimental or simulation datasets, which need to be processed in a timely manner subject to certain performance and resource constraints. However, the heterogeneity and dynamics of Grid resources complicate the scheduling of streaming applications. In addition, the commercialization of Grids as a future trend calls for policies that take resource cost into account while striving to satisfy users’ Quality of Service (QoS) requirements. In this paper, streaming workflow applications are modeled as DAGs. We formulate scheduling problems with two different objectives: either maximize the throughput under a budget (cost) constraint, or minimize the execution cost under a minimum throughput constraint. Two algorithms, Budget constrained RATE (\(B\)-RATE) and Budget constrained SWAP (\(B\)-SWAP), are developed and evaluated under the first objective; another two algorithms, Throughput constrained RATE (\(TP\)-RATE) and Throughput constrained SWAP (\(TP\)-SWAP), are evaluated under the second objective. Experimental results based on GridSim show that our algorithms achieve either much lower cost with similar throughput, or higher throughput with similar cost, compared with existing algorithms.
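The two objectives can be written compactly as a pair of constrained optimization problems. The statement below is only a sketch under assumed notation, none of which appears in the abstract itself: \(m\) denotes a mapping of workflow tasks to Grid resources, \(\mathcal{M}\) the set of feasible mappings, \(TP(m)\) the throughput achieved by \(m\), \(C(m)\) its execution cost, \(B\) the user's budget, and \(TP_{\min}\) the minimum required throughput.

\[
\text{(budget constrained)} \qquad \max_{m \in \mathcal{M}} \; TP(m) \quad \text{subject to} \quad C(m) \le B
\]

\[
\text{(throughput constrained)} \qquad \min_{m \in \mathcal{M}} \; C(m) \quad \text{subject to} \quad TP(m) \ge TP_{\min}
\]

Under this reading, \(B\)-RATE and \(B\)-SWAP address the first problem, while \(TP\)-RATE and \(TP\)-SWAP address the second.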
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cao, F., Zhu, M.M., Ding, D. (2014). Distributed Workflow Scheduling Under Throughput and Budget Constraints in Grid Environments. In: Desai, N., Cirne, W. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2013. Lecture Notes in Computer Science, vol 8429. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43779-7_4
DOI: https://doi.org/10.1007/978-3-662-43779-7_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-43778-0
Online ISBN: 978-3-662-43779-7
eBook Packages: Computer Science (R0)