User-Aware Metrics for Measuring Quality of Parallel Job Schedules

  • Šimon Tóth
  • Dalibor Klusáček
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8828)


The work presented in this paper is motivated by the challenges in the design of scheduling algorithms for the Czech National Grid MetaCentrum. One of the most notable problems is our inability to efficiently analyze the quality of schedules. While it is still possible to observe and measure certain aspects of generated schedules using various metrics, it is very challenging to choose a set of metrics that is representative of the schedule quality. Without quantifying quality (either relatively or absolutely), we have no way to determine the impact of new algorithms and configurations on the schedule quality prior to their deployment in a production service. The only two options we are left with are either to rely on expert assessment or to simply deploy new solutions into production and observe their impact on user satisfaction. To approach this problem, we have designed a novel user-aware model and a metric that can overcome the presented issues by evaluating the quality on a user level. The model assigns an expected end time (EET) to each job based on a fair partitioning of the system resources, modeling users' expectations. Using this calculated EET, we can then compare generated schedules in detail, while also being able to adequately visualize schedule artifacts, allowing an expert to further analyze them. Moreover, we present how coupling this model with a job scheduling simulator gives us the ability to do an in-depth evaluation of scheduling algorithms before they are deployed into a production environment.
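To make the idea of an expected end time more concrete, the following is a minimal sketch and not the model defined in the paper. It assumes a deliberately simple notion of fair partitioning: every user receives an equal static share of the CPUs, a user's jobs are processed in arrival order within that share, and a job wider than the share is stretched proportionally. The names `Job`, `expected_end_times`, and `user_delays` are hypothetical and introduced only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Job:
    job_id: str
    user: str
    arrival: float   # submission time in seconds
    cpus: int        # number of requested CPUs
    runtime: float   # runtime in seconds on the requested CPUs


def expected_end_times(jobs, total_cpus):
    """Assign a hypothetical EET to each job.

    Assumption (not from the paper): each user owns an equal share of the
    CPUs, and that user's jobs run one after another inside the share.
    """
    users = {job.user for job in jobs}
    share = total_cpus / len(users)
    share_free_at = defaultdict(float)  # when each user's share is free again
    eet = {}
    for job in sorted(jobs, key=lambda j: (j.arrival, j.job_id)):
        start = max(job.arrival, share_free_at[job.user])
        # A job wider than the share is stretched; a narrower one is not sped up.
        duration = job.runtime * max(1.0, job.cpus / share)
        eet[job.job_id] = start + duration
        share_free_at[job.user] = eet[job.job_id]
    return eet


def user_delays(jobs, eet, actual_end):
    """Sum, per user, how far each job finished past its EET (never negative)."""
    delays = defaultdict(float)
    for job in jobs:
        delays[job.user] += max(0.0, actual_end[job.job_id] - eet[job.job_id])
    return dict(delays)


# Illustrative data only: two users sharing an 8-CPU system.
jobs = [
    Job("a1", "alice", arrival=0, cpus=4, runtime=100),
    Job("a2", "alice", arrival=10, cpus=8, runtime=50),
    Job("b1", "bob", arrival=20, cpus=2, runtime=30),
]
eet = expected_end_times(jobs, total_cpus=8)
# End times as a real or simulated schedule might have produced them.
actual_end = {"a1": 100.0, "a2": 260.0, "b1": 45.0}
delays = user_delays(jobs, eet, actual_end)
```

Under these assumptions, comparing two schedules reduces to comparing the per-user values returned by `user_delays`: a schedule that finishes a user's jobs later than their EET shows up as a positive delay for that user, while jobs finishing early contribute nothing.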


Grid, Performance evaluation, Metrics, Queue-based scheduling, Fairness, User-aware scheduling



We highly appreciate the support of the Grant Agency of the Czech Republic under the grant No. P202/12/0306. The access to the MetaCentrum workloads is kindly acknowledged.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Faculty of Informatics, Masaryk University, Brno, Czech Republic
