A comparative study on resource allocation and energy efficient job scheduling strategies in large-scale parallel computing systems

Chandio, Aftab Ahmed; Bilal, Kashif; Tziritas, Nikos; Yu, Zhibin; Jiang, Qingshan; Khan, Samee U.; Xu, Cheng-Zhong

doi:10.1007/s10586-014-0384-x

Published: 15 June 2014

Volume 17, pages 1349–1367 (2014)
Cluster Computing

Aftab Ahmed Chandio^1,2,3, Kashif Bilal^4, Nikos Tziritas^1, Zhibin Yu^1, Qingshan Jiang^1, Samee U. Khan^1,4 & Cheng-Zhong Xu^1,5

Abstract

In large-scale parallel computing environments, resource allocation and energy-efficient techniques are required to deliver quality of service (QoS) and to reduce the operational cost of the system, because the cost of energy consumption is a dominant part of the owner’s and user’s budget. However, when energy efficiency is considered, resource allocation becomes more difficult, and QoS targets (i.e., queue time and response time) may be violated. This paper therefore presents a comparative study of job scheduling in large-scale parallel systems that aims to (a) minimize the queue time, response time, and energy consumption and (b) maximize the overall system utilization. We compare thirteen job scheduling policies to analyze their behavior. The set of policies includes (a) priority-based, (b) first-fit, (c) backfilling, and (d) window-based policies. All of the policies are extensively simulated and compared. For the simulation, a real data center workload comprising 22385 jobs is used. Based on their performance results, we incorporate energy efficiency into three policies: (1) the best result producer, (2) the average result producer, and (3) the worst result producer. We analyze the (a) queue time, (b) response time, (c) slowdown ratio, and (d) energy consumption to evaluate the policies. Moreover, we present a comprehensive workload characterization for optimizing system performance and for scheduler design. Major workload characteristics, including (a) Narrow, (b) Wide, (c) Short, and (d) Long jobs, are characterized for a detailed analysis of the schedulers’ performance. This study highlights the strengths and weaknesses of various job scheduling policies and helps in choosing an appropriate job scheduling policy for a given scenario.
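
The evaluation metrics and job categories named in the abstract can be illustrated with a minimal sketch. It is not the authors' simulator: the `Job` fields, the definition of slowdown as response time divided by runtime (a common convention, assumed here), and the two thresholds used to separate Narrow/Wide and Short/Long jobs are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    submit: float   # submission time in seconds
    start: float    # time the scheduler started the job, in seconds
    runtime: float  # execution time in seconds (must be > 0)
    procs: int      # number of processors requested


def queue_time(job: Job) -> float:
    """Time the job spent waiting in the queue."""
    return job.start - job.submit


def response_time(job: Job) -> float:
    """Queue time plus execution time."""
    return queue_time(job) + job.runtime


def slowdown(job: Job) -> float:
    """Response time normalized by runtime (assumed definition)."""
    return response_time(job) / job.runtime


def classify(job: Job, proc_threshold: int = 8,
             runtime_threshold: float = 3600.0) -> tuple[str, str]:
    """Label a job by width and length; both thresholds are hypothetical."""
    width = "Narrow" if job.procs <= proc_threshold else "Wide"
    length = "Short" if job.runtime <= runtime_threshold else "Long"
    return width, length


if __name__ == "__main__":
    job = Job(submit=0.0, start=30.0, runtime=60.0, procs=4)
    print(queue_time(job), response_time(job), slowdown(job), classify(job))
```

Averaging these per-job values over a trace, separately for each job category, is one way a scheduler comparison of this kind could be tabulated.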

References

Iosup, A., Ostermann, S., Yigitbasi, M.N., Prodan, R., Fahringer, T., Epema, D.H.J.: Performance analysis of cloud computing services for many-tasks scientific computing. IEEE Trans. Parallel Distrib. Syst. 22(6), 931–945 (2011)
Xu, C.Z., Rao, J., Bu, X.: URL: a unified reinforcement learning approach for autonomic cloud management. J. Parallel Distrib. Comput. 72(2), 95–105 (2012)
Tang, W., Ren, D., Lan, Z., Desai, N.: Adaptive metric-aware job scheduling for production supercomputers. In: 41st International Conference on Parallel Processing Workshops (ICPPW’12), 2012, pp. 107–115
Khan, S.U.: A self-adaptive weighted sum technique for the joint optimization of performance and power consumption in data centers. In: 22nd International Conference on Parallel and Distributed Computing and Communication Systems (PDCCS), Louisville, KY, USA, 2009, pp. 13–18
Khan, S.U., Ardil, C.: A weighted sum technique for the joint optimization of performance and power consumption in data centers. Int. J. Electr. Comput. Syst. Eng. 3(1), 35–40 (2009)
Khan, S.U., Ahmad, I.: Non-cooperative, semi-cooperative, and cooperative games-based grid resource allocation. In: 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), Rhodes Island, Greece, 25–29 April 2006, p. 10
Khan, S.U.: A goal programming approach for the joint optimization of energy consumption and response time in computational grids. In: 28th IEEE International Performance Computing and Communications Conference (IPCCC), Phoenix, AZ, USA, 2009, pp. 410–417
Braun, T.D., et al.: A comparison of eleven static heuristics for mapping a class of independent tasks onto heterogeneous distributed computing systems. J. Parallel Distrib. Comput. 61(6), 810–837 (2001)
Arndt, O., et al.: A comparative study of online scheduling algorithms for networks of workstations. Cluster Comput. 3(2), 95–112 (2000)
Baraglia, R., et al.: A multi-criteria job scheduling framework for large computing farms. J. Comput. Syst. Sci. 79, 230–244 (2013)
Feitelson, D.G., Tsafrir, D., Krakov, D.: Experience with the Parallel Workloads Archive. Technical Report 2012–6, School of Computer Science and Engineering, The Hebrew University of Jerusalem, 2012
Feitelson, D.G.: Workload modeling for performance evaluation. Workshop on Job Scheduling Strategies for Parallel Processing. Lecture Notes in Computer Science, vol. 2459, pp. 114–141 (2002)
Hotovy, S., Schneider, D., O’Donnell, T.: Analysis of the early workload on the Cornell Theory Center IBM SP2. In: ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, 1996 (poster)
Srinivasan, S., Kettimuthu, R., Subramani, V., Sadayappan, P.: Selective Reservation Strategies for Backfill Job Scheduling, Workshop on Job Scheduling Strategies for Parallel Processing. Lecture Notes in Computer Science, vol. 2537, pp. 55–71 (2002)
Maheswaran, M., Ali, S., Siegel, H.J., Hensgen, D., Freund, R.F.: Dynamic mapping of a class of independent tasks onto heterogeneous computing systems. J. Parallel Distrib. Comput. 59(2), 107–131 (1999)
Feitelson, D.G., et al.: Theory and Practice in Parallel Job Scheduling. Workshop on Job Scheduling Strategies for Parallel Processing. Lecture Notes in Computer Science, vol. 1291, pp. 1–34 (1997)
Lifka, D.A.: The ANL/IBM SP scheduling system. Workshop on Job Scheduling Strategies for Parallel Processing. Lecture Notes in Computer Science, vol. 945, pp. 295–303 (1995)
Mu’alem, A.W., Feitelson, D.G.: Utilization, predictability, workloads, and user runtime estimates in scheduling the IBM SP2 with backfilling. IEEE Trans. Parallel Distrib. Syst. 12(6), 529–543 (2001)
Ababneh, I., Bani-Mohammad, S.: A new window-based job scheduling scheme for 2D mesh multicomputers. Simul. Model. Pract. Theory 19(1), 482–493 (2011)
Chandio, A.A., Zhu, D., Sodhro, A.H.: Integration of Inter-Connectivity of Information System (i3) using Web Services. International MultiConference of Engineers and Computer Scientists (IMECS). Lecture Notes in Engineering and Computer Science, vol. 2195, pp. 651–655 (2012)
Chapin, S., Cirne, W., Feitelson, D.G., Jones, P., Leutenegger, S., Schwiegelshohn, U., Smith, W., Talby, D.: Benchmarks and Standards for the Evaluation of Parallel Job Schedulers. Workshop on Job Scheduling Strategies for Parallel Processing. Lecture Notes in Computer Science, vol. 1659, pp. 66–89 (1999)
Parallel Workload Archive. http://www.cs.huji.ac.il/labs/parallel/workload/
Lo, M., Mache, J., Windisch, K.J.: A Comparative Study of Real Workload Traces and Synthetic Workload Models for Parallel Job Scheduling. Workshop on Job Scheduling Strategies for Parallel Processing. Lecture Notes in Computer Science, vol. 1459, pp. 25–46 (1998)
Lublin, U., Feitelson, D.: The workload on parallel supercomputers: modeling the characteristics of rigid jobs. J. Parallel Distrib. Comput. 63(11), 1105–1122 (2003)
Chiang, S.-H., Vernon, M.K.: Characteristics of a large shared memory production workload. Workshop on Job Scheduling Strategies for Parallel Processing. Lecture Notes in Computer Science, vol. 2221, pp. 159–187 (2001)
Downey, A.B.: A parallel workload model and its implications for processor allocation. In: 6th IEEE International Symposium on High Performance Distributed Computing, pp. 112–123 (1997)
Chandio, A.A., Yu, Z., Syed, F.S., Korejo, I.A.: A Case Study on Job Scheduling Policy for Workload Characterization and Power Efficiency, Sindh University Research Journal (Science Series). 45(A-1), 23–28 (2013)
Wang, L., Khan, S.U., Dayal, J.: Thermal aware workload placement with task-temperature profiles in a data center. J. Supercomput. 61(3), 780–803 (2012)
Chandio, A.A., et al.: A Comparative Study of Scheduling Strategies in Large-scale Parallel Computational Systems. In: 11th IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA) co-located with TrustCom and IUCC, Melbourne, Australia, July 2013, pp. 949–957 (2013). doi:10.1109/TrustCom.2013.116
Fan, X., Weber, W.-D., Barroso, L.A.: Power provisioning for a warehouse-sized computer. SIGARCH Comput. Archit. News 35(2), 13–23 (2007)
Brown, R., et al.: Report to Congress on Server and Data Center Energy Efficiency. Public Law 109–431, 2008
Koomey, J.G.: Worldwide electricity used in data centers. Environ. Res. Lett. 3(3), 034008 (2008)
Koomey, J.: Growth in Data Center Electricity use 2005 to 2010. Analytics Press, Oakland (2011)
Shuja, J., et al.: Energy-efficient data centers. Computing 94(12), 973–994 (2012)
Masanet, E.R., et al.: Estimating the energy use and efficiency potential of U.S. data centers. Proc. IEEE 99(8), 1440–1453 (2011)
Benini, L., Bogliolo, A., De Micheli, G.: A survey of design techniques for system-level dynamic power management. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 8(3), 299–316 (2000)
Chen, G., et al.: Energy-aware server provisioning and load dispatching for connection-intensive internet services. In: 5th USENIX Symposium on Networked Systems Design and Implementation, USENIX Association: San Francisco, California, 2008, pp. 337–350
Liu, J., et al.: Challenges Towards Elastic Power Management in Internet Data Centers. In: 29th IEEE International Conference on Distributed Computing Systems Workshops, 2009, IEEE Computer Society, pp. 65–72
Kliazovich, D., Bouvry, P., Khan, S.U.: DENS: data center energy-efficient network-aware scheduling. In: IEEE/ACM International Conference on Green Computing and Communications (GreenCom) & IEEE/ACM International Conference on Cyber, Physical and Social Computing, Hangzhou, China, pp. 69–75 (2010)
Meisner, D., Wenisch, T.F.: DreamWeaver: architectural support for deep sleep. SIGARCH Comput. Archit. News 40(1), 313–324 (2012)
Isard, M., et al.: Dryad: distributed data-parallel programs from sequential building blocks. SIGOPS Oper. Syst. Rev. 41(3), 59–72 (2007)
Nedevschi, S., et al.: Reducing network energy consumption via sleeping and rate-adaptation. In: 5th USENIX Symposium on Networked Systems Design and Implementation, USENIX Association: San Francisco, California, 2008, pp. 323–336
Bo, L., et al.: EnaCloud: An Energy-Saving Application Live Placement Approach for Cloud Computing Environments. In: IEEE International Conference on Cloud Computing (CLOUD ’09), Bangalore, India, pp. 17–24 (2009)
Weiser, M., et al.: Scheduling for reduced CPU energy. In: 1st USENIX conference on Operating Systems Design and Implementation. USENIX Association: Monterey, California, 1994, p. 2
Gruian, F., Kuchcinski, K.: LEneS: task scheduling for low-energy systems using variable supply voltage processors. In: Asia and South Pacific Design Automation Conference (ASP-DAC), Yokohama, Japan, pp. 449–455 (2001)
Horvath, T., et al.: Dynamic voltage scaling in multitier web servers with end-to-end delay control. IEEE Trans. Comput. 56(4), 444–458 (2007)
Zhong, X., Xu, C.-Z.: System-wide energy minimization for real-time tasks: lower bound and approximation. ACM Trans. Embed. Comput. Syst. 7(3), 1–24 (2008)
Meisner, D., Gold, B.T., Wenisch, T.F.: PowerNap: Eliminating Server Idle Power. In: Architectural Support for Programming Languages and Operating Systems (ASPLOS ’09), Washington, DC, 2009, pp. 205–216
Kołodziej, J., et al.: Security, energy, and performance-aware resource allocation mechanisms for computational grids. Future Gener. Comput. Syst. 31, 77–92 (2014). doi:10.1016/j.future.2012.09.009
Bilal, K., et al.: A survey on green communications using adaptive link rate. Cluster Comput. 16(3), 575–589 (2013)
Kliazovich, D., Bouvry, P., Khan, S.U.: DENS: data center energy-efficient network-aware scheduling. Cluster Comput. 16(1), 65–75 (2013)
Kołodziej, J., et al.: Hierarchical genetic-based grid scheduling with energy optimization. Cluster Comput. 16(3), 591–609 (2013)
Kołodziej, J., Khan, S.U., Wang, L., Zomaya, A.Y.: Energy efficient genetic-based schedulers in computational grids. Concurr. Comput. (2012). doi:10.1002/cpe.2839

Acknowledgments

A shorter version [29] of this paper was published in the proceedings of the 11th IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA), co-located with TrustCom and IUCC, Melbourne, Australia, in July 2013. Aftab Ahmed Chandio’s work was partly supported during his PhD studies at the Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China. Nikos Tziritas’s work is partly supported by the Chinese Academy of Sciences. Samee U. Khan’s work was partly supported by the Young International Scientist Fellowship of the Chinese Academy of Sciences (Grant No. 2011Y2GA01).

Author information

Authors and Affiliations

Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, People’s Republic of China
Aftab Ahmed Chandio, Nikos Tziritas, Zhibin Yu, Qingshan Jiang, Samee U. Khan & Cheng-Zhong Xu
Graduate University of Chinese Academy of Sciences, Beijing, People’s Republic of China
Aftab Ahmed Chandio
Institute of Mathematics and Computer Science, University of Sindh, Jamshoro, Pakistan
Aftab Ahmed Chandio
Department of Electrical and Computer Engineering, North Dakota State University, Fargo, ND, USA
Kashif Bilal & Samee U. Khan
Department of Electrical and Computer Engineering, Wayne State University, Detroit, MI, USA
Cheng-Zhong Xu

Corresponding author

Correspondence to Samee U. Khan.

About this article

Cite this article

Chandio, A.A., Bilal, K., Tziritas, N. et al. A comparative study on resource allocation and energy efficient job scheduling strategies in large-scale parallel computing systems. Cluster Comput 17, 1349–1367 (2014). https://doi.org/10.1007/s10586-014-0384-x

Received: 11 November 2013
Revised: 17 March 2014
Accepted: 18 May 2014
Published: 15 June 2014
Issue Date: December 2014
DOI: https://doi.org/10.1007/s10586-014-0384-x
