A comparative study on resource allocation and energy efficient job scheduling strategies in large-scale parallel computing systems

Cluster Computing

Abstract

In large-scale parallel computing environments, resource allocation and energy-efficient techniques are required to deliver quality of service (QoS) and to reduce the operational cost of the system, because the cost of energy consumption is a dominant part of the owner’s and user’s budget. However, when energy efficiency is considered, resource allocation becomes more difficult, and QoS requirements (i.e., queue time and response time) may be violated. This paper therefore presents a comparative study of job scheduling in large-scale parallel systems that aims to (a) minimize the queue time, response time, and energy consumption and (b) maximize the overall system utilization. We compare thirteen job scheduling policies to analyze their behavior. The set of policies includes (a) priority-based, (b) first fit, (c) backfilling, and (d) window-based policies. All of the policies are extensively simulated and compared. For the simulation, a real data center workload comprising 22385 jobs is used. Based on their performance, we incorporate energy efficiency into three policies: (1) the best result producer, (2) the average result producer, and (3) the worst result producer. We analyze (a) queue time, (b) response time, (c) slowdown ratio, and (d) energy consumption to evaluate the policies. Moreover, we present a comprehensive workload characterization for optimizing the system’s performance and for scheduler design. Major workload characteristics, including (a) narrow, (b) wide, (c) short, and (d) long jobs, are characterized for a detailed analysis of the schedulers’ performance. This study highlights the strengths and weaknesses of various job scheduling policies and helps in choosing an appropriate job scheduling policy for a given scenario.
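To make the evaluation metrics named in the abstract concrete, the following is a minimal sketch, not the authors' simulator. It assumes the commonly used definitions (queue time = start minus submit, response time = finish minus submit, slowdown = response time divided by runtime) and a plain first-come-first-served policy on a homogeneous cluster. The `Job` fields, the `simulate_fcfs` function, and the three-job workload are hypothetical names and data chosen only for illustration. Energy consumption is omitted because the abstract does not state a power model.

```python
from dataclasses import dataclass
import heapq


@dataclass
class Job:
    job_id: int
    submit: float   # arrival time in seconds
    runtime: float  # execution time in seconds, assumed > 0
    procs: int      # number of processors requested


def simulate_fcfs(jobs, total_procs):
    """Start jobs strictly in arrival order on `total_procs` processors.

    Returns {job_id: (queue_time, response_time, slowdown)} using the
    assumed metric definitions described in the text above.
    """
    running = []        # min-heap of (finish_time, procs) for running jobs
    free = total_procs  # processors currently idle
    clock = 0.0         # time at which the previous job started
    metrics = {}
    for job in sorted(jobs, key=lambda j: j.submit):
        if job.procs > total_procs:
            raise ValueError(f"job {job.job_id} needs more processors than exist")
        clock = max(clock, job.submit)
        # Release processors of jobs that have already finished.
        while running and running[0][0] <= clock:
            free += heapq.heappop(running)[1]
        # Otherwise wait for running jobs to finish until the job fits.
        while free < job.procs:
            finish, procs = heapq.heappop(running)
            clock = max(clock, finish)
            free += procs
        start = clock
        finish = start + job.runtime
        heapq.heappush(running, (finish, job.procs))
        free -= job.procs
        response = finish - job.submit
        metrics[job.job_id] = (start - job.submit, response, response / job.runtime)
    return metrics


if __name__ == "__main__":
    # Hypothetical three-job workload on a 4-processor machine.
    workload = [Job(1, 0.0, 10.0, 4), Job(2, 1.0, 5.0, 2), Job(3, 2.0, 5.0, 4)]
    for job_id, (q, r, s) in simulate_fcfs(workload, total_procs=4).items():
        print(f"job {job_id}: queue={q:.1f}s response={r:.1f}s slowdown={s:.2f}")
```

In this toy workload the narrow job 2 waits behind the wide job 1 even though it needs only two processors, which is the kind of idle capacity that first fit and backfilling policies are designed to reclaim.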

References

  1. Iosup, A., Ostermann, S., Yigitbasi, M.N., Prodan, R., Fahringer, T., Epema, D.H.J.: Performance analysis of cloud computing services for many-tasks scientific computing. IEEE Trans. Parallel Distrib. Syst. 22(6), 931–945 (2011)

  2. Xu, C.Z., Rao, J., Bu, X.: URL: a unified reinforcement learning approach for autonomic cloud management. J. Parallel Distrib. Comput. 72(2), 95–105 (2012)

  3. Tang, W., Ren, D., Lan, Z., Desai, N.: Adaptive metric-aware job scheduling for production supercomputers. In: 41st International Conference on Parallel Processing Workshops (ICPPW’12), 2012, pp. 107–115

  4. Khan, S.U.: A self-adaptive weighted sum technique for the joint optimization of performance and power consumption in data centers. In: 22nd International Conference on Parallel and Distributed Computing and Communication Systems (PDCCS), Louisville, KY, USA, 2009, pp. 13–18

  5. Khan, S.U., Ardil, C.: A weighted sum technique for the joint optimization of performance and power consumption in data centers. Int. J. Electr. Comput. Syst. Eng. 3(1), 35–40 (2009)

  6. Khan, S.U., Ahmad, I.: Non-cooperative, semi-cooperative, and cooperative games-based grid resource allocation. In: 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), Rhodes Island, Greece, 25–29 April 2006, p. 10

  7. Khan, S.U.: A goal programming approach for the joint optimization of energy consumption and response time in computational grids. In: 28th IEEE International Performance Computing and Communications Conference (IPCCC), Phoenix, AZ, USA, 2009, pp. 410–417

  8. Braun, T.D., et al.: A comparison of eleven static heuristics for mapping a class of independent tasks onto heterogeneous distributed computing systems. J. Parallel Distrib. Comput. 61(6), 810–837 (2001)

  9. Arndt, O., et al.: A comparative study of online scheduling algorithms for networks of workstations. Cluster Comput. 3(2), 95–112 (2000)

  10. Baraglia, R., et al.: A multi-criteria job scheduling framework for large computing farms. J. Comput. Syst. Sci. 79, 230–244 (2013)

  11. Feitelson, D.G., Tsafrir, D., Krakov, D.: Experience with the Parallel Workloads Archive. Technical Report 2012–6, School of Computer Science and Engineering, The Hebrew University of Jerusalem, 2012

  12. Feitelson, D.G.: Workload modeling for performance evaluation. Workshop on Job Scheduling Strategies for Parallel Processing. Lecture Notes in Computer Science, vol. 2459, pp. 114–141 (2002)

  13. Hotovy, S., Schneider, D., O’Donnell, T.: Analysis of the early workload on the Cornell Theory Center IBM SP2. In: ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, 1996 (poster)

  14. Srinivasan, S., Kettimuthu, R., Subramani, V., Sadayappan, P.: Selective Reservation Strategies for Backfill Job Scheduling. Workshop on Job Scheduling Strategies for Parallel Processing. Lecture Notes in Computer Science, vol. 2537, pp. 55–71 (2002)

  15. Maheswaran, M., Ali, S., Siegel, H.J., Hensgen, D., Freund, R.F.: Dynamic mapping of a class of independent tasks onto heterogeneous computing systems. J. Parallel Distrib. Comput. 59(2), 107–131 (1999)

  16. Feitelson, D.G., et al.: Theory and Practice in Parallel Job Scheduling. Workshop on Job Scheduling Strategies for Parallel Processing. Lecture Notes in Computer Science, vol. 1291, pp. 1–34 (1997)

  17. Lifka, D.A.: The ANL/IBM SP Scheduling System. Workshop on Job Scheduling Strategies for Parallel Processing. Lecture Notes in Computer Science, vol. 945, pp. 295–303 (1995)

  18. Mu’alem, A.W., Feitelson, D.G.: Utilization, predictability, workloads, and user runtime estimates in scheduling the IBM SP2 with backfilling. IEEE Trans. Parallel Distrib. Syst. 12(6), 529–543 (2001)

  19. Ababneh, I., Bani-Mohammad, S.: A new window-based job scheduling scheme for 2D mesh multicomputers. Simul. Model. Pract. Theory 19(1), 482–493 (2011)

  20. Chandio, A.A., Zhu, D., Sodhro, A.H.: Integration of Inter-Connectivity of Information System (i3) using Web Services. International MultiConference of Engineers and Computer Scientists (IMECS). Lecture Notes in Engineering and Computer Science, vol. 2195, pp. 651–655 (2012)

  21. Chapin, S., Cirne, W., Feitelson, D.G., Jones, P., Leutenegger, S., Schwiegelshohn, U., Smith, W., Talby, D.: Benchmarks and Standards for the Evaluation of Parallel Job Schedulers. Workshop on Job Scheduling Strategies for Parallel Processing. Lecture Notes in Computer Science, vol. 1659, pp. 66–89 (1999)

  22. Parallel Workload Archive. http://www.cs.huji.ac.il/labs/parallel/workload/

  23. Lo, M., Mache, J., Windisch, K.J.: A Comparative Study of Real Workload Traces and Synthetic Workload Models for Parallel Job Scheduling. Workshop on Job Scheduling Strategies for Parallel Processing. Lecture Notes in Computer Science, vol. 1459, pp. 25–46 (1998)

  24. Lublin, U., Feitelson, D.: The workload on parallel supercomputers: modeling the characteristics of rigid jobs. J. Parallel Distrib. Comput. 63(11), 1105–1122 (2003)

  25. Chiang, S.-H., Vernon, M.K.: Characteristics of a large shared memory production workload. Workshop on Job Scheduling Strategies for Parallel Processing. Lecture Notes in Computer Science, vol. 2221, pp. 159–187 (2001)

  26. Downey, A.B.: A parallel workload model and its implications for processor allocation. In: 6th IEEE International Symposium on High Performance Distributed Computing, pp. 112–123 (1997)

  27. Chandio, A.A., Yu, Z., Syed, F.S., Korejo, I.A.: A Case Study on Job Scheduling Policy for Workload Characterization and Power Efficiency. Sindh University Research Journal (Science Series) 45(A-1), 23–28 (2013)

  28. Wang, L., Khan, S.U., Dayal, J.: Thermal aware workload placement with task-temperature profiles in a data center. J. Supercomput. 61(3), 780–803 (2012)

  29. Chandio, A.A., et al.: A Comparative Study of Scheduling Strategies in Large-scale Parallel Computational Systems. In: 11th IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA) co-located with TrustCom and IUCC, Melbourne, Australia, July 2013, pp. 949–957 (2013). doi:10.1109/TrustCom.2013.116

  30. Fan, X., Weber, W.-D., Barroso, L.A.: Power provisioning for a warehouse-sized computer. SIGARCH Comput. Archit. News 35(2), 13–23 (2007)

  31. Brown, R., et al.: Report to Congress on Server and Data Center Energy Efficiency. Public Law 109–431, 2008

  32. Koomey, J.G.: Worldwide electricity used in data centers. Environ. Res. Lett. 3(3), 034008 (2008)

  33. Koomey, J.: Growth in Data Center Electricity use 2005 to 2010. Analytics Press, Oakland (2011)

  34. Shuja, J., et al.: Energy-efficient data centers. Computing 94(12), 973–994 (2012)

  35. Masanet, E.R., et al.: Estimating the energy use and efficiency potential of U.S. data centers. Proc. IEEE 99(8), 1440–1453 (2011)

  36. Benini, L., Bogliolo, A., De Micheli, G.: A survey of design techniques for system-level dynamic power management. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 8(3), 299–316 (2000)

  37. Chen, G., et al.: Energy-aware server provisioning and load dispatching for connection-intensive internet services. In: 5th USENIX Symposium on Networked Systems Design and Implementation, USENIX Association: San Francisco, California, 2008, pp. 337–350

  38. Liu, J., et al.: Challenges Towards Elastic Power Management in Internet Data Centers. In: 29th IEEE International Conference on Distributed Computing Systems Workshops, 2009, IEEE Computer Society, pp. 65–72

  39. Kliazovich, D., Bouvry, P., Khan, S.U.: DENS: Data Center Energy-Efficient Network-Aware Scheduling. In: IEEE/ACM International Conference on Green Computing and Communications (GreenCom) & IEEE/ACM International Conference on Cyber, Physical and Social Computing, Hangzhou, China, pp. 69–75 (2010)

  40. Meisner, D., Wenisch, T.F.: DreamWeaver: architectural support for deep sleep. SIGARCH Comput. Archit. News 40(1), 313–324 (2012)

  41. Isard, M., et al.: Dryad: distributed data-parallel programs from sequential building blocks. SIGOPS Oper. Syst. Rev. 41(3), 59–72 (2007)

  42. Nedevschi, S., et al.: Reducing network energy consumption via sleeping and rate-adaptation. In: 5th USENIX Symposium on Networked Systems Design and Implementation, USENIX Association: San Francisco, California, 2008, pp. 323–336

  43. Bo, L., et al.: EnaCloud: An Energy-Saving Application Live Placement Approach for Cloud Computing Environments. In: IEEE International Conference on Cloud Computing (CLOUD ’09), Bangalore, India, pp. 17–24 (2009)

  44. Weiser, M., et al.: Scheduling for reduced CPU energy. In: 1st USENIX conference on Operating Systems Design and Implementation. USENIX Association: Monterey, California, 1994, p. 2

  45. Gruian, F., Kuchcinski, K.: LEneS: task scheduling for low-energy systems using variable supply voltage processors. In: Asia and South Pacific Design Automation Conference (ASP-DAC), Yokohama, Japan, pp. 449–455 (2001)

  46. Horvath, T., et al.: Dynamic voltage scaling in multitier web servers with end-to-end delay control. IEEE Trans. Comput. 56(4), 444–458 (2007)

  47. Zhong, X., Xu, C.-Z.: System-wide energy minimization for real-time tasks: lower bound and approximation. ACM Trans. Embed. Comput. Syst. 7(3), 1–24 (2008)

  48. Meisner, D., Gold, B.T., Wenisch, T.F.: PowerNap: Eliminating Server Idle Power. In: Architectural Support for Programming Languages and Operating Systems (ASPLOS ’09), Washington, DC, 2009, pp. 205–216

  49. Kołodziej, J., et al.: Security, energy, and performance-aware resource allocation mechanisms for computational grids. Future Gener. Comput. Syst. 31, 77–92 (2014). doi:10.1016/j.future.2012.09.009

  50. Bilal, K., et al.: A survey on green communications using adaptive link rate. Cluster Comput. 16(3), 575–589 (2013)

  51. Kliazovich, D., Bouvry, P., Khan, S.U.: DENS: data center energy-efficient network-aware scheduling. Cluster Comput. 16(1), 65–75 (2013)

  52. Kołodziej, J., et al.: Hierarchical genetic-based grid scheduling with energy optimization. Cluster Comput. 16(3), 591–609 (2013)

  53. Kołodziej, J., Khan, S.U., Wang, L., Zomaya, A.Y.: Energy efficient genetic-based schedulers in computational grids. Concurr. Comput. (2012). doi:10.1002/cpe.2839

Acknowledgments

A shorter version of this paper [29] was published in the proceedings of the 11th IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA), co-located with TrustCom and IUCC, Melbourne, Australia, in July 2013. Aftab Ahmed Chandio’s work was partly supported during his PhD studies at the Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China. Nikos Tziritas’s work is partly supported by the Chinese Academy of Sciences. Samee U. Khan’s work was partly supported by the Young International Scientist Fellowship of the Chinese Academy of Sciences (Grant No. 2011Y2GA01).

Author information

Corresponding author

Correspondence to Samee U. Khan.

About this article

Cite this article

Chandio, A.A., Bilal, K., Tziritas, N. et al. A comparative study on resource allocation and energy efficient job scheduling strategies in large-scale parallel computing systems. Cluster Comput 17, 1349–1367 (2014). https://doi.org/10.1007/s10586-014-0384-x

  • DOI: https://doi.org/10.1007/s10586-014-0384-x
