
Applying Operations Management Principles on Optimisation of Scientific Computing Clusters

  • Ari-Pekka Hameri
  • Tapio Niemi
Conference paper

Abstract

We apply operations management principles of production scheduling and allocation to computing clusters and their storage resources in order to increase throughput and reduce the lead time of scientific computing jobs. In addition, we study how this approach affects the amount of energy consumed by a computing job comprising hundreds of calculation tasks. Methodologically, we use the design science approach, applying domain knowledge of operations management and efficient resource allocation to the management of computing resources. Using a test cluster, we collected data on CPU and memory utilisation, along with energy consumption, under different ways of allocating the jobs. We challenge the traditional method of scheduling scientific clusters, one job per processor core, with parallel processing and bottleneck management. We observed that increasing the utilisation rate of the cluster memory increases throughput and decreases energy consumption. We also studied scheduling methods that run multiple tasks per CPU core and methods that schedule based on the amount of free memory available. The test results showed that, at best, these methods decreased energy consumption down to 45% and increased throughput by up to 100% compared to the standard practices used in scientific computing. The results are being further tested to eventually support the LHC computing of CERN.
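
The contrast between one task per core and scheduling by free memory can be illustrated with a minimal sketch. It is not the authors' implementation: the `Task` type, both function names, the FIFO admission rule and all memory figures are hypothetical, chosen only to show how a memory-based rule may admit more tasks than there are cores.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    mem_mb: int  # estimated peak memory of the task (hypothetical figure)


def admit_one_per_core(queue, running, cores):
    """Baseline rule: start queued tasks only while a core is idle."""
    slots = max(cores - running, 0)
    return queue[:slots]


def admit_by_free_memory(queue, free_mem_mb, reserve_mb=0):
    """Assumed alternative: start queued tasks, in order, while their
    estimated memory fits into the free memory minus a safety reserve."""
    budget = free_mem_mb - reserve_mb
    admitted = []
    for task in queue:
        if task.mem_mb > budget:
            break  # keep queue order; do not skip ahead to smaller tasks
        admitted.append(task)
        budget -= task.mem_mb
    return admitted


if __name__ == "__main__":
    # Twelve identical tasks on a hypothetical 4-core node with 4 GB free.
    queue = [Task(f"t{i}", 500) for i in range(12)]
    print(len(admit_one_per_core(queue, running=0, cores=4)))
    print(len(admit_by_free_memory(queue, free_mem_mb=4096, reserve_mb=512)))
```

With these made-up numbers the baseline starts 4 tasks and the memory-based rule starts 7, so memory rather than core count becomes the limiting resource. Whether that improves throughput or energy use on a real cluster depends on the workload, which is what the measurements reported in the paper address.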

Keywords

Large Hadron Collider; Computing Cluster; Schedule Method; Decrease Energy Consumption; Memory Utilisation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag London Limited 2011

Authors and Affiliations

  1. HEC, University of Lausanne, Lausanne, Switzerland
  2. Helsinki Institute of Physics, CERN, Geneva, Switzerland
