Abstract
In large-scale parallel computing environments, resource allocation and energy-efficient techniques are required to deliver quality of service (QoS) and to reduce the operational cost of the system, because the cost of energy consumption is a dominant part of the owner’s and user’s budget. However, when energy efficiency is taken into account, resource allocation strategies become more difficult, and QoS requirements (i.e., queue time and response time) may be violated. This paper therefore presents a comparative study of job scheduling in large-scale parallel systems that aims to: (a) minimize the queue time, response time, and energy consumption and (b) maximize the overall system utilization. We compare thirteen job scheduling policies to analyze their behavior. The set of job scheduling policies includes (a) priority-based, (b) first fit, (c) backfilling, and (d) window-based policies. All of the policies are extensively simulated and compared. For the simulation, a real data center workload comprising 22385 jobs is used. Based on their performance results, we incorporate energy efficiency into three policies, i.e., the (1) best result producer, (2) average result producer, and (3) worst result producer. We analyze the (a) queue time, (b) response time, (c) slowdown ratio, and (d) energy consumption to evaluate the policies. Moreover, we present a comprehensive workload characterization for optimizing the system’s performance and for scheduler design. Major workload characteristics, including (a) Narrow, (b) Wide, (c) Short, and (d) Long jobs, are characterized for detailed analysis of the schedulers’ performance. This study highlights the strengths and weaknesses of various job scheduling policies and helps to choose an appropriate job scheduling policy in a given scenario.
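As a rough illustration of the evaluation metrics named in the abstract, the sketch below computes queue time, response time, and slowdown for one job and labels it Narrow/Wide and Short/Long. It is a minimal sketch under stated assumptions, not the paper's simulator. The `Job` record, the function names, and the thresholds (`proc_threshold`, `runtime_threshold`) are hypothetical, and slowdown is taken in its commonly used form (response time divided by run time), which the abstract does not spell out.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical job record; field names are illustrative only."""
    submit: float  # submission time, in seconds
    start: float   # time the scheduler started the job, in seconds
    end: float     # completion time, in seconds
    procs: int     # number of processors requested


def queue_time(job: Job) -> float:
    # Time spent waiting in the queue before execution starts.
    return job.start - job.submit


def response_time(job: Job) -> float:
    # Total time from submission to completion (wait + run).
    return job.end - job.submit


def slowdown(job: Job) -> float:
    # Assumed definition: response time normalized by run time.
    run = job.end - job.start
    return response_time(job) / run if run > 0 else float("inf")


def classify(job: Job, proc_threshold: int = 8,
             runtime_threshold: float = 3600.0) -> tuple:
    # Thresholds are placeholders chosen for illustration,
    # not the values used in the paper.
    width = "narrow" if job.procs <= proc_threshold else "wide"
    run = job.end - job.start
    length = "short" if run <= runtime_threshold else "long"
    return width, length


if __name__ == "__main__":
    job = Job(submit=0.0, start=10.0, end=30.0, procs=4)
    print(queue_time(job), response_time(job), slowdown(job), classify(job))
```

Averaging these per-job values over a whole trace, separately for each job class, would give the kind of per-policy comparison the abstract describes.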
References
Iosup, A., Ostermann, S., Yigitbasi, M.N., Prodan, R., Fahringer, T., Epema, D.H.J.: Performance analysis of cloud computing services for many-tasks scientific computing. IEEE Trans. Parallel Distrib. Syst. 22(6), 931–945 (2011)
Xu, C.Z., Rao, J., Bu, X.: URL: a unified reinforcement learning approach for autonomic cloud management. J. Parallel Distrib. Comput. 72(2), 95–105 (2012)
Wei, T., Dongxu, R., Zhiling, L., Narayan, D.: Adaptive metric-aware job scheduling for production supercomputers. In: 41st International Conference on Parallel Processing Workshops, (ICPPW’12), 2012, pp. 107–115
Khan, S.U.: A self-adaptive weighted sum technique for the joint optimization of performance and power consumption in data centers. In: 22nd International Conference on Parallel and Distributed Computing and Communication Systems (PDCCS), Louisville, KY, USA, 2009, pp. 13–18
Khan, S.U., Ardil, C.: A weighted sum technique for the joint optimization of performance and power consumption in data centers. Int. J. Electr. Comput. Syst. Eng. 3(1), 35–40 (2009)
Khan, S.U., Ahmad, I.: Non-cooperative, semi-cooperative, and cooperative games-based grid resource allocation. In: 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), Rhodes Island, Greece, 25–29 April 2006, p. 10
Khan, S.U.: A goal programming approach for the joint optimization of energy consumption and response time in computational grids. In: 28th IEEE International Performance Computing and Communications Conference (IPCCC), Phoenix, AZ, USA, 2009, pp. 410–417
Braun, T.D., et al.: A comparison of eleven static heuristics for mapping a class of independent tasks onto heterogeneous distributed computing systems. J. Parallel Distrib. Comput. 61(6), 810–837 (2001)
Arndt, O., et al.: A comparative study of online scheduling algorithms for networks of workstations. Cluster Comput. 3(2), 95–112 (2000)
Baraglia, R., et al.: A multi-criteria job scheduling framework for large computing farms. J. Comput. Syst. Sci. 79, 230–244 (2013)
Feitelson, D.G., Tsafrir, D., Krakov, D.: Experience with the Parallel Workloads Archive. Technical Report 2012–6, School of Computer Science and Engineering, The Hebrew University of Jerusalem, 2012
Feitelson, D.G.: Workload modeling for performance evaluation. Workshop on Job Scheduling Strategies for Parallel Processing. Lecture Notes in Computer Science, vol. 2459, pp. 114–141 (2002)
Steven, H., David, S., O’Donnell, T.: Analysis of the Early Workload on the Cornell Theory Center IBM SP2. In: ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, 1996; Poster
Srinivasan, S., Kettimuthu, R., Subramani, V., Sadayappan, P.: Selective Reservation Strategies for Backfill Job Scheduling, Workshop on Job Scheduling Strategies for Parallel Processing. Lecture Notes in Computer Science, vol. 2537, pp. 55–71 (2002)
Maheswaran, M., Ali, S., Siegel, H.J., Hensgen, D., Freund, R.F.: Dynamic mapping of a class of independent tasks onto heterogeneous computing systems. J. Parallel Distrib. Comput. 59(2), 107–131 (1999)
Feitelson, D.G., et al.: Theory and Practice in Parallel Job Scheduling. Workshop on Job Scheduling Strategies for Parallel Processing. Lecture Notes in Computer Science, vol. 1291, pp. 1–34 (1997)
Lifka, D.A.: The ANL/IBM SP Scheduling System. Workshop on Job Scheduling Strategies for Parallel Processing. Lecture Notes in Computer Science, vol. 945, pp. 295–303 (1995)
Mu’alem, A.W., Feitelson, D.G.: Utilization, predictability, workloads, and user runtime estimates in scheduling the IBM SP2 with backfilling. IEEE Trans. Parallel Distrib. Syst. 12(6), 529–543 (2001)
Ababneh, I., Bani-Mohammad, S.: A new window-based job scheduling scheme for 2D mesh multicomputers. Simul. Model. Pract. Theory 19(1), 482–493 (2011)
Chandio, A.A., Zhu, D., Sodhro, A.H.: Integration of Inter-Connectivity of Information System (i3) using Web Services. International MultiConference of Engineers and Computer Scientists (IMECS). Lecture Notes in Engineering and Computer Science, vol. 2195, pp. 651–655 (2012)
Chapin, S., Cirne, W., Feitelson, D.G., Jones, P., Leutenegger, S., Schwiegelshohn, U., Smith, W., Talby, D.: Benchmarks and Standards for the Evaluation of Parallel Job Schedulers. Workshop on Job Scheduling Strategies for Parallel Processing. Lecture Notes in Computer Science, vol. 1659, pp. 66–89 (1999)
Parallel Workload Archive. http://www.cs.huji.ac.il/labs/parallel/workload/
Lo, M., Mache, J., Windisch, K.J.: A Comparative Study of Real Workload Traces and Synthetic Workload Models for Parallel Job Scheduling. Workshop on Job Scheduling Strategies for Parallel Processing. Lecture Notes in Computer Science, vol. 1459, pp. 25–46 (1998)
Lublin, U., Feitelson, D.: The workload on parallel supercomputers: modeling the characteristics of rigid jobs. J. Parallel Distrib. Comput. 63(11), 1105–1122 (2003)
Chiang, S.-H., Vernon, M.K.: Characteristics of a large shared memory production workload. Workshop on Job Scheduling Strategies for Parallel Processing. Lecture Notes in Computer Science, vol. 2221, pp. 159–187 (2001)
Downey, A.B.: A parallel workload model and its implications for processor allocation. In: 6th IEEE International Symposium on High Performance Distributed Computing, pp. 112–123 (1997)
Chandio, A.A., Yu, Z., Syed, F.S., Korejo, I.A.: A Case Study on Job Scheduling Policy for Workload Characterization and Power Efficiency, Sindh University Research Journal (Science Series). 45(A-1), 23–28 (2013)
Wang, L., Khan, S.U., Dayal, J.: Thermal aware workload placement with task-temperature profiles in a data center. J. Supercomput. 61(3), 780–803 (2012)
Chandio, A.A., et al.: A Comparative Study of Scheduling Strategies in Large-scale Parallel Computational Systems. In: 11th IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA) co-located with TrustCom and IUCC, Melbourne, Australia, July 2013, pp. 949–957 (2013). doi:10.1109/TrustCom.2013.116
Fan, X., Weber, W.-D., Barroso, L.A.: Power provisioning for a warehouse-sized computer. SIGARCH Comput. Archit. News 35(2), 13–23 (2007)
Brown, R., et al.: Report to Congress on Server and Data Center Energy Efficiency. Public Law 109–431, 2008
Koomey, J.G.: Worldwide electricity used in data centers. Environ. Res. Lett. 3(3), 034008 (2008)
Koomey, J.: Growth in Data Center Electricity use 2005 to 2010. Analytics Press, Oakland (2011)
Shuja, J., et al.: Energy-efficient data centers. Computing 94(12), 973–994 (2012)
Masanet, E.R., et al.: Estimating the energy use and efficiency potential of U.S. data centers. Proc. IEEE 99(8), 1440–1453 (2011)
Benini, L., Bogliolo, A., De Micheli, G.: A survey of design techniques for system-level dynamic power management. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 8(3), 299–316 (2000)
Chen, G., et al.: Energy-aware server provisioning and load dispatching for connection-intensive internet services. In: 5th USENIX Symposium on Networked Systems Design and Implementation, USENIX Association: San Francisco, California, 2008, pp. 337–350
Liu, J., et al.: Challenges Towards Elastic Power Management in Internet Data Centers. In: 29th IEEE International Conference on Distributed Computing Systems Workshops, 2009, IEEE Computer Society, pp. 65–72
Kliazovich, D., Bouvry, P., Khan, S.U.: DENS: Data Center Energy-Efficient Network-Aware Scheduling. In: IEEE/ACM International Conference on Green Computing and Communications (GreenCom) & IEEE/ACM International Conference on Cyber, Physical and Social Computing, Hangzhou, China, pp. 69–75 (2010)
Meisner, D., Wenisch, T.F.: DreamWeaver: architectural support for deep sleep. SIGARCH Comput. Archit. News 40(1), 313–324 (2012)
Isard, M., et al.: Dryad: distributed data-parallel programs from sequential building blocks. SIGOPS Oper. Syst. Rev. 41(3), 59–72 (2007)
Nedevschi, S., et al.: Reducing network energy consumption via sleeping and rate-adaptation. In: 5th USENIX Symposium on Networked Systems Design and Implementation, USENIX Association: San Francisco, California, 2008, pp. 323–336
Bo, L., et al.: EnaCloud: An Energy-Saving Application Live Placement Approach for Cloud Computing Environments. In: IEEE International Conference on Cloud Computing (CLOUD ’09), Bangalore, India, pp. 17–24 (2009)
Weiser, M., et al.: Scheduling for reduced CPU energy. In: 1st USENIX conference on Operating Systems Design and Implementation. USENIX Association: Monterey, California, 1994, p. 2
Gruian, F., Kuchcinski, K.: LEneS: task scheduling for low-energy systems using variable supply voltage processors. In: Asia and South Pacific Design Automation Conference (ASP-DAC), Yokohama, Japan, pp. 449–455 (2001)
Horvath, T., et al.: Dynamic voltage scaling in multitier web servers with end-to-end delay control. IEEE Trans. Comput. 56(4), 444–458 (2007)
Zhong, X., Xu, C.-Z.: System-wide energy minimization for real-time tasks: lower bound and approximation. ACM Trans. Embed. Comput. Syst. 7(3), 1–24 (2008)
Meisner, D., Gold, B.T., Wenisch, T.F.: PowerNap: Eliminating Server Idle Power. In: Architectural Support for Programming Languages and Operating Systems (ASPLOS ’09), Washington, DC, 2009, pp. 205–216
Kołodziej, J., et al.: Security, energy, and performance-aware resource allocation mechanisms for computational grids. Future Gener. Comput. Syst. 31, 77–92 (2014). doi:10.1016/j.future.2012.09.009
Bilal, K., et al.: A survey on green communications using adaptive link rate. Cluster Comput. 16(3), 575–589 (2013)
Kliazovich, D., Bouvry, P., Khan, S.U.: DENS: data center energy-efficient network-aware scheduling. Cluster Comput. 16(1), 65–75 (2013)
Kołodziej, J., et al.: Hierarchical genetic-based grid scheduling with energy optimization. Cluster Comput. 16(3), 591–609 (2013)
Kołodziej, J., Khan, S.U., Wang, L., Zomaya, A.Y.: Energy efficient genetic-based schedulers in computational grids. Concurr. Comput. (2012). doi:10.1002/cpe.2839
Acknowledgments
A shorter version [29] of this paper was published in the proceedings of the 11th IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA), co-located with TrustCom and IUCC, Melbourne, Australia, in July 2013. Aftab Ahmed Chandio’s work was partly supported for his PhD studies at the Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China. Nikos Tziritas’s work is partly supported by the Chinese Academy of Sciences. Samee U. Khan’s work was partly supported by the Young International Scientist Fellowship of the Chinese Academy of Sciences (Grant No. 2011Y2GA01).
Cite this article
Chandio, A.A., Bilal, K., Tziritas, N. et al. A comparative study on resource allocation and energy efficient job scheduling strategies in large-scale parallel computing systems. Cluster Comput 17, 1349–1367 (2014). https://doi.org/10.1007/s10586-014-0384-x
DOI: https://doi.org/10.1007/s10586-014-0384-x