Heuristics for Resource Matching in Intel’s Compute Farm

  • Ohad Shai
  • Edi Shmueli
  • Dror G. Feitelson
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8429)

Abstract

In this paper we investigate the issue of resource matching between jobs and machines in Intel’s compute farm. We show that common heuristics such as Best-Fit and Worse-Fit may fail to properly utilize the available resources when applied to either cores or memory in isolation. To overcome this problem, we propose Mix-Fit, a heuristic that attempts to balance usage between resources. While this usually improves upon the single-resource heuristics, it too fails to be optimal in all cases. As a solution we default to Max-Jobs, a meta-heuristic that employs all the other heuristics as sub-routines and selects the one that matches the highest number of jobs. Extensive simulations based on real workload traces from four different Intel sites demonstrate that Max-Jobs is the most robust heuristic for diverse workloads and system configurations, and provides up to a 22% reduction in the average wait time of jobs.
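The interplay between the single-resource heuristics and Max-Jobs can be illustrated with a minimal sketch. It is not the authors' implementation: the `Machine` and `Job` types, the function names, and the greedy in-order matching loop (which skips a job that fits nowhere and continues with the next) are all assumptions made for illustration. Mix-Fit is left out because the abstract does not define how it balances resources.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Machine:  # hypothetical type: free capacity of one machine
    free_cores: int
    free_mem: float  # GB


@dataclass
class Job:  # hypothetical type: resource request of one job
    cores: int
    mem: float  # GB


Picker = Callable[[Job, List[Machine]], Optional[int]]


def fits(job: Job, m: Machine) -> bool:
    return job.cores <= m.free_cores and job.mem <= m.free_mem


def make_heuristic(resource: str, best: bool) -> Picker:
    """Build a picker that looks at a single resource only.

    best=True  -> Best-Fit: feasible machine with the LEAST free amount.
    best=False -> Worse-Fit: feasible machine with the MOST free amount.
    """
    def pick(job: Job, machines: List[Machine]) -> Optional[int]:
        candidates = [i for i, m in enumerate(machines) if fits(job, m)]
        if not candidates:
            return None
        key = lambda i: getattr(machines[i], resource)
        return min(candidates, key=key) if best else max(candidates, key=key)
    return pick


def match(jobs: List[Job], machines: List[Machine], pick: Picker) -> Dict[int, int]:
    """Greedily place jobs in queue order (an assumption of this sketch).

    Returns {job index: machine index}; the input machines are not modified.
    """
    state = [Machine(m.free_cores, m.free_mem) for m in machines]
    assignment: Dict[int, int] = {}
    for j, job in enumerate(jobs):
        i = pick(job, state)
        if i is not None:
            state[i].free_cores -= job.cores
            state[i].free_mem -= job.mem
            assignment[j] = i
    return assignment


HEURISTICS: Dict[str, Picker] = {
    "best-fit-cores": make_heuristic("free_cores", best=True),
    "worse-fit-cores": make_heuristic("free_cores", best=False),
    "best-fit-mem": make_heuristic("free_mem", best=True),
    "worse-fit-mem": make_heuristic("free_mem", best=False),
}


def max_jobs(jobs: List[Job], machines: List[Machine]) -> Tuple[str, Dict[int, int]]:
    """Run every heuristic as a sub-routine and keep the one matching most jobs."""
    results = {name: match(jobs, machines, h) for name, h in HEURISTICS.items()}
    winner = max(results, key=lambda name: len(results[name]))
    return winner, results[winner]
```

As a usage example, take two machines, `Machine(4, 32)` and `Machine(8, 8)`, and a queue of `Job(1, 4)` followed by `Job(4, 30)`. Best-Fit on cores puts the small job on the 4-core machine and leaves no room for the large one, whereas Worse-Fit on cores places both. `max_jobs` then returns a two-job assignment, which shows why no single-resource heuristic can be relied on in every case.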

Keywords

NetBatch, Job scheduling, Resource matching, Simulation, Best-Fit, Worse-Fit, First-Fit

Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  1. Intel Corporation, Haifa, Israel
  2. Blavatnik School of Computer Science, Tel-Aviv University, Tel-Aviv, Israel
  3. School of Computer Science and Engineering, The Hebrew University, Jerusalem, Israel
