
Cluster Computing, Volume 20, Issue 3, pp 2013–2030

Value of service based resource management for large-scale computing systems

  • Cihan Tunc
  • Dylan Machovec
  • Nirmal Kumbhare
  • Ali Akoglu
  • Salim Hariri
  • Bhavesh Khemka
  • Howard Jay Siegel

Abstract

Task scheduling for large-scale computing systems is a challenging problem. From the users' perspective, the main concern is the performance of the submitted tasks, whereas for cloud service providers, reducing operating cost while providing the required service is critical. It is therefore important for task scheduling mechanisms to balance users' performance requirements against energy efficiency, because energy consumption is one of the major operational costs. We present a time-dependent value of service (VoS) metric that is maximized by a scheduling algorithm that takes into consideration the arrival time of a task when evaluating the value functions for completing the task at a given time and the task's energy consumption. We consider the variation in value for completing a task at different times, such that the value of energy reduction can change significantly between peak and non-peak periods. To determine the value of a task completion, we use completion time and energy consumption with soft and hard thresholds. We define the VoS for a given workload to be the sum of the values of all tasks executed during a given period of time. Our system model is based on virtual machines, where each task is assigned a resource configuration characterized by a number of homogeneous cores and an amount of memory. To schedule each task submitted to our system, we use an estimated time to compute matrix and an estimated energy consumption matrix, both built from historical data. We design, evaluate, and compare our task scheduling methods to show that a significant improvement in energy consumption can be achieved by time-of-use dependent scheduling algorithms. The simulation results show improvements of up to 49% in the performance and energy values compared to schedulers that do not consider the value functions.
Consistent with the simulation results, our experimental results from running our value-based scheduling on an IBM blade server show up to 82% improvement in performance value, 110% improvement in energy value, and up to 77% improvement in VoS compared to schedulers that do not consider the value functions.
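The VoS computation described above can be sketched in a few lines. The abstract does not give the exact shape of the value functions, so the sketch below makes an illustrative assumption: a task earns its full value if it completes before its soft threshold, a linearly decaying value between the soft and hard thresholds, and zero value after the hard threshold; the workload's VoS is then the sum over tasks. All function and parameter names here are hypothetical, not taken from the paper.

```python
def task_value(max_value, completion_time, soft_threshold, hard_threshold):
    """Illustrative per-task value function with soft and hard thresholds.

    Full value before the soft threshold, linear decay to zero at the
    hard threshold, and zero afterward (a hypothetical shape; the paper
    defines its own value functions for performance and energy).
    """
    if completion_time <= soft_threshold:
        return max_value
    if completion_time >= hard_threshold:
        return 0.0
    # Linear decay over the interval (soft_threshold, hard_threshold)
    fraction = (hard_threshold - completion_time) / (hard_threshold - soft_threshold)
    return max_value * fraction


def value_of_service(tasks):
    """VoS of a workload: the sum of the values of all executed tasks.

    Each task is a (max_value, completion_time, soft, hard) tuple.
    """
    return sum(task_value(*task) for task in tasks)
```

An analogous function over energy consumption (with its own soft and hard thresholds, weighted by time of use) would contribute the energy component of each task's value.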

Keywords

Value of service · Task scheduling · Resource management · Virtual machines · Energy-efficient resource allocation · Performance metrics

Notes

Acknowledgements

A preliminary version of portions of this work appeared in [55, 56]. This work is partly supported by National Science Foundation (NSF) research projects NSF CNS-1624668, SES-1314631, CCF-1302693, and DUE-1303362. Furthermore, this work utilized Colorado State University's ISTeC Cray system, which is supported by the NSF under grant number CNS-0923386.

References

  1. Luo, J., Rao, L., Liu, X.: Eco-IDC: trade delay for energy cost with service delay guarantee for internet data centers. In: IEEE International Conference on Cluster Computing (CLUSTER), pp. 45–53 (2012)
  2. Greenberg, A., Hamilton, J., Maltz, D.A., Patel, P.: The cost of a cloud: research problems in data center networks. ACM SIGCOMM Comput. Commun. Rev. 39(1), 68–73 (2008)
  3. Tang, Z., Qi, L., Cheng, Z., Li, K., Khan, S.U., Li, K.: An energy-efficient task scheduling algorithm in DVFS-enabled cloud environment. J. Grid Comput. 14(1), 55–74 (2016)
  4. Delforge, P.: Critical action needed to save money and cut pollution (2015). https://www.nrdc.org/resources/americas-data-centers-consuming-and-wasting-growing-amounts-energy. Accessed May 2016
  5. Delforge, P., Whitney, J.: Scaling up energy efficiency across the data center industry: evaluating key drivers and barriers. Issue Paper, Natural Resources Defense Council (NRDC) (2014)
  6. Koomey, J.: Growth in Data Center Electricity Use 2005 to 2010. Analytics Press. http://www.analyticspress.com/datacenters.html. Accessed May 2016
  7. Top Ten Exascale Research Challenges. Technical Report, DOE ASCAC (2014). https://science.energy.gov/~/media/ascr/ascac/pdf/meetings/20140210/Top10reportFEB14.pdf. Accessed Aug 2015
  8. Rao, L., Liu, X., Xie, L., Liu, W.: Minimizing electricity cost: optimization of distributed internet data centers in a multi-electricity-market environment. In: IEEE INFOCOM (2010)
  9. Amazon EC2 Reserved Instances, AWS, Amazon. https://aws.amazon.com/ec2/purchasing-options/reserved-instances. Accessed May 2016
  10. Khokhar, A., Prasanna, V.K., Shaaban, M.E., Wang, C.: Heterogeneous computing: challenges and opportunities. IEEE Comput. 26(6), 18–27 (1993)
  11. Xu, D., Nahrstedt, K., Wichadakul, D.: QoS and contention-aware multi-resource reservation. Clust. Comput. 4(2), 95–107 (2001)
  12. Caron, E., Desprez, F., Muresan, A.: Forecasting for grid and cloud computing on-demand resources based on pattern matching. In: 2010 IEEE Second International Conference on Cloud Computing Technology and Science (CloudCom 2010), pp. 456–463 (2010)
  13. Kosta, S., Aucinas, A., Hui, P., Mortier, R., Zhang, X.: ThinkAir: dynamic resource allocation and parallel execution in the cloud for mobile code offloading. In: 2012 IEEE INFOCOM, pp. 945–953 (2012)
  14. Khemka, B., Machovec, D., Blandin, C., Siegel, H.J., Hariri, S., Louri, A., Tunc, C., Fargo, F., Maciejewski, A.A.: Resource management in heterogeneous parallel computing environments with soft and hard deadlines. In: 11th Metaheuristics International Conference (MIC 2015) (2015)
  15. Khemka, B., Friese, R., Briceño, L.D., Siegel, H.J., Maciejewski, A.A., Koenig, G.A., Groer, C., Okonski, G., Hilton, M.M., Rambharos, R., Poole, S.: Utility functions and resource management in an oversubscribed heterogeneous computing environment. IEEE Trans. Comput. 64(8), 2394–2407 (2015)
  16. Khemka, B., Friese, R., Pasricha, S., Maciejewski, A.A., Siegel, H.J., Koenig, G.A., Powers, S., Hilton, M., Rambharos, R., Poole, S.: Utility maximizing dynamic resource management in an oversubscribed energy-constrained heterogeneous computing system. Sustain. Comput. Inform. Syst. 5, 14–30 (2015)
  17. Wu, H., Ravindran, B., Jensen, E.D.: Energy-efficient, utility accrual real-time scheduling under the unimodal arbitrary arrival model. In: IEEE/ACM Design, Automation and Test in Europe (DATE 2005), pp. 474–479 (2005)
  18. Liu, S., Quan, G., Ren, S.: On-line scheduling of real-time services for cloud computing. In: 6th World Congress on Services (SERVICES 2010), pp. 459–464 (2010)
  19. Snir, M., Bader, D.A.: A framework for measuring supercomputer productivity. Int. J. High Perform. Comput. Appl. 18(4), 417–432 (2004)
  20. Bohra, A.E., Chaudhary, V.: VMeter: power modelling for virtualized clouds. In: IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum (IPDPSW 2010) (2010)
  21. Kansal, A., Zhao, F., Liu, J., Kothari, N., Bhattacharya, A.A.: Virtual machine power metering and provisioning. In: 1st ACM Symposium on Cloud Computing (SoCC 2010), pp. 39–50 (2010)
  22. Perf: Linux profiling with performance counters. https://perf.wiki.kernel.org/index.php/Main_Page. Accessed Nov 2015
  23. Qu, G., Hariri, S., Yousif, M.: A new dependency and correlation analysis for features. IEEE Trans. Knowl. Data Eng. 17(9), 1199–1207 (2005)
  24. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The WEKA data mining software: an update. ACM SIGKDD Explor. Newsl. 11(1), 10–18 (2009)
  25. Fargo, F., Tunc, C., Al-Nashif, Y., Hariri, S.: Autonomic performance-per-watt management (APM) of cloud resources and services. In: ACM Cloud and Autonomic Computing Conference (CAC 2013) (2013)
  26. Tunc, C.: Autonomic cloud resource management. PhD Thesis, Electrical and Computer Engineering Department, University of Arizona, Tucson, Arizona, USA (2015)
  27. Fargo, F., Tunc, C., Al-Nashif, Y., Akoglu, A., Hariri, S.: Autonomic workload and resources management of cloud computing services. In: International Conference on Cloud and Autonomic Computing (ICCAC 2014), pp. 101–110 (2014)
  28. NAS Parallel Benchmarks (NAS-NPB): NASA Advanced Supercomputing. https://www.nas.nasa.gov/publications/npb.html. Accessed Aug 2015
  29. Braun, T., Siegel, H.J., Beck, N., Bölöni, L.L., Maheswaran, M., Reuther, A.I., Robertson, J.P., Theys, M.D., Yao, B., Hensgen, D., Freund, R.F.: A comparison of eleven static heuristics for mapping a class of independent tasks onto heterogeneous distributed computing systems. J. Parallel Distrib. Comput. 61(6), 810–837 (2001)
  30. Ibarra, O.H., Kim, C.E.: Heuristic algorithms for scheduling independent tasks on non-identical processors. J. ACM (JACM) 24(2), 280–289 (1977)
  31. Maheswaran, M., Ali, S., Siegel, H.J., Hensgen, D., Freund, R.F.: Dynamic mapping of a class of independent tasks onto heterogeneous computing systems. J. Parallel Distrib. Comput. 59(2), 107–131 (1999)
  32. Mohsenian-Rad, A.H., Wong, V.W.S., Jatskevich, J., Schober, R.: Optimal and autonomous incentive-based energy consumption scheduling algorithm for smart grid. In: Innovative Smart Grid Technologies (ISGT 2010) (2010)
  33. Ali, S., Siegel, H.J., Maheswaran, M., Hensgen, D., Ali, S.: Representing task and machine heterogeneities for heterogeneous computing systems. Tamkang J. Sci. Eng. 3(3), 195–207 (2000)
  34. Downey, A.B.: A model for speedup of parallel programs. Technical Report UCB/CSD-97-933, Berkeley (1997)
  35. Kivity, A., Kamay, Y., Laor, D., Lublin, U., Liguori, A.: KVM: the Linux virtual machine monitor. Linux Symp. 1, 225–230 (2007)
  36.
  37. Plummer, D.: An Ethernet Address Resolution Protocol. RFC 826, MIT-LCS (1982)
  38. Address Resolution Protocol (ARP). http://linux-ip.net/html/ether-arp.html. Accessed Mar 2017
  39. Lifka, D.A.: The ANL/IBM SP scheduling system. In: Workshop on Job Scheduling Strategies for Parallel Processing (IPPS 1995), pp. 295–303 (1995)
  40. Mu'alem, A.W., Feitelson, D.: Utilization, predictability, workloads, and user runtime estimates in scheduling the IBM SP2 with backfilling. IEEE Trans. Parallel Distrib. Syst. 12(6), 529–543 (2001)
  41. Davis, R., Burns, A.: A survey of hard real-time scheduling for multiprocessor systems. ACM Comput. Surv. 43(4) (2011)
  42. Feitelson, D., Rudolph, L.: Parallel job scheduling: issues and approaches. In: Workshop on Job Scheduling Strategies for Parallel Processing (IPPS 1995) (1995)
  43. Feitelson, D., Rudolph, L., Schwiegelshohn, U., Sevcik, K., Wong, P.: Theory and practice in parallel job scheduling. In: Workshop on Job Scheduling Strategies for Parallel Processing (IPPS 1997) (1997)
  44. Feitelson, D.G., Schwiegelshohn, U., Rudolph, L.: Parallel job scheduling: a status report. In: 10th International Workshop on Job Scheduling Strategies for Parallel Processing (JSPP 2004). Lecture Notes in Computer Science, Springer (2004)
  45. Mishra, A., Mishra, S., Kushwaha, D.S.: An improved backfilling algorithm: SJF-B. Int. J. Recent Trends Eng. Technol. 5(1), 78–81 (2011)
  46. Jensen, E., Locke, C., Tokuda, H.: A time-driven scheduling model for real-time systems. In: IEEE Real-Time Systems Symposium, pp. 112–122 (1985)
  47. Li, P., Ravindran, B., Suhaib, S., Feizabadi, S.: A formally verified application-level framework for real-time scheduling on POSIX real-time operating systems. IEEE Trans. Softw. Eng. 30(9), 613–629 (2004)
  48. Vengerov, D., Mastroleon, L., Murphy, D., Bambos, N.: Adaptive data-aware utility-based scheduling in resource-constrained systems. J. Parallel Distrib. Comput. 70(9), 871–879 (2010)
  49. Zhou, Z., Lan, Z., Tang, W., Desai, N.: Reducing energy costs for IBM Blue Gene/P via power-aware job scheduling. In: Job Scheduling Strategies for Parallel Processing (JSSPP 2013). Lecture Notes in Computer Science, vol. 8429, pp. 96–115. Springer (2013)
  50. Kodama, Y., Itoh, S., Shimizu, T., Sekiguchi, S., Nakamura, H., Mori, N.: Imbalance of CPU temperatures in a blade system and its impact for power consumption of fans. Clust. Comput. 16(1), 27–37 (2013)
  51. Laszewski, G.V., Wang, L., Younge, A.J., He, X.: Power-aware scheduling of virtual machines in DVFS-enabled clusters. In: IEEE International Conference on Cluster Computing and Workshops (CLUSTER 2009) (2009)
  52. Ding, Y., Qin, X., Liu, L., Wang, T.: Energy efficient scheduling of virtual machines in cloud with deadline constraint. Future Gener. Comput. Syst. 50, 62–74 (2015)
  53. Goiri, I., Beauchea, R., Le, K., Nguyen, T.D., Haque, M.E., Guitart, J., Bianchini, R.: GreenSlot: scheduling energy consumption in green datacenters. In: ACM International Conference for High Performance Computing, Networking, Storage and Analysis (2011)
  54. Machovec, D., Khemka, B., Pasricha, S., Maciejewski, A.A., Siegel, H.J., Koenig, G.A., Wright, M., Hilton, M., Rambharos, R., Imam, N.: Dynamic resource management for parallel tasks in an oversubscribed energy-constrained heterogeneous environment. In: 25th Heterogeneity in Computing Workshop (HCW 2016), in 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 67–78 (2016)
  55. Machovec, D., Tunc, C., Kumbhare, N., Khemka, B., Akoglu, A., Hariri, S., Siegel, H.J.: Value-based resource management in high-performance computing systems. In: 7th Workshop on Scientific Cloud Computing (SCIENCECLOUD 2016), pp. 19–26 (2016)
  56. Tunc, C., Kumbhare, N., Akoglu, A., Hariri, S., Machovec, D., Siegel, H.J.: Value of service based task scheduling for cloud computing systems. In: IEEE 2016 International Conference on Cloud and Autonomic Computing (ICCAC 2016) (2016)

Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  • Cihan Tunc (1)
  • Dylan Machovec (2)
  • Nirmal Kumbhare (1)
  • Ali Akoglu (1)
  • Salim Hariri (1)
  • Bhavesh Khemka (2)
  • Howard Jay Siegel (2, 3)

  1. NSF Center for Cloud and Autonomic Computing, The University of Arizona, Tucson, USA
  2. Department of Electrical and Computer Engineering, Colorado State University, Fort Collins, USA
  3. Department of Computer Science, Colorado State University, Fort Collins, USA
