SCOUT: Scheduling Core Utilization to Optimize the Performance of Scientific Computing Applications on CPU/Coprocessor-Based Cluster

Chung, Minh Thanh; Pham, Kien Trung; Nguyen, Manh-Thin; Thoai, Nam

doi:10.1007/978-3-030-55240-4_6

Abstract

Today’s scientific computing applications require many different kinds of tasks and computational resources. The success of scientific computing hinges on the development of High Performance Computing (HPC) systems that reduce execution time. This support has been further strengthened by the advent of accelerators such as the Graphics Processing Unit (GPU) and the Intel Xeon Phi (MIC) coprocessor. However, problems related to underutilization of the MIC coprocessor can lead to thread and memory over-subscription. Moreover, job scheduling based on the logged runtime behavior of scientific applications usually carries constraints on job completion time, expressed as deadlines or due date assignments. These problems can be addressed, and performance improved, by a suitable method such as scheduling or assigning priorities to submitted jobs. In this paper, we propose a scheduling module named SCOUT that exploits application-level performance factors to improve the scheduler on a CPU/Coprocessor-based cluster. SCOUT focuses on improving application performance and reducing execution time on the Xeon Phi accelerator. Furthermore, our scheduling module decides the order of job execution to increase throughput and minimize delay time. For a set of popular scientific applications, experimental results show that SCOUT achieves better performance and throughput than the other policies compared. In particular, we implement the entire module as a seamless plug-in to an HPC workload manager, PBS Professional, and demonstrate its efficiency in practice.
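To make the idea of ordering jobs under due date constraints concrete, the following minimal Python sketch applies the classic earliest-due-date (EDD) rule on a single resource and measures the worst delay past a due date. It is an illustration only and is not the SCOUT algorithm, which this abstract does not specify; the `Job` fields, the function names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    # All fields are hypothetical; times are in arbitrary units from time 0.
    name: str
    runtime: float  # estimated execution time
    due: float      # due date assigned to the job


def edd_order(jobs):
    """Return the jobs sorted by earliest due date (EDD rule)."""
    return sorted(jobs, key=lambda j: j.due)


def max_tardiness(ordered):
    """Run the jobs back to back and return the largest delay past a due date."""
    clock = 0.0
    worst = 0.0  # a job that finishes on time contributes zero tardiness
    for job in ordered:
        clock += job.runtime
        worst = max(worst, clock - job.due)
    return worst


# Hypothetical sample workload, listed in submission order.
jobs = [Job("A", 4.0, 10.0), Job("B", 2.0, 3.0), Job("C", 3.0, 6.0)]
submission_delay = max_tardiness(jobs)
edd_delay = max_tardiness(edd_order(jobs))
```

On this sample workload, running the jobs in submission order makes two of them finish late, while the EDD order meets every due date. A production scheduler would also have to account for multiple nodes, coprocessor cores, and memory limits, none of which this sketch models.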

Acknowledgements

We thank the anonymous reviewers for their insightful comments. This research was conducted within the project “Studying Tools to Support Applications Running on Powerful Clusters & Big Data Analytics (HPDA phase I 2018–2020)”, funded by the Ho Chi Minh City Department of Science and Technology (under grant number 46/2018/HD-QKHCN).

Author information

Authors and Affiliations

High Performance Computing Lab, Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology, 268 Ly Thuong Kiet Street, District 10, Ho Chi Minh City, Vietnam
Minh Thanh Chung, Kien Trung Pham, Manh-Thin Nguyen & Nam Thoai

Corresponding authors

Correspondence to Kien Trung Pham or Nam Thoai.

Editor information

Editors and Affiliations

Interdisciplinary Center for Scientific Computing (IWR), Heidelberg University, Heidelberg, Germany
Hans Georg Bock
Interdisciplinary Center for Scientific Computing (IWR), Heidelberg University, Heidelberg, Germany
Willi Jäger
Interdisciplinary Center for Scientific Computing (IWR), Heidelberg University, Heidelberg, Germany
Ekaterina Kostina
Institute of Mathematics, Vietnam Academy of Science and Technology, Hanoi, Vietnam
Hoang Xuan Phu

About this paper

Cite this paper

Chung, M.T., Pham, K.T., Nguyen, MT., Thoai, N. (2021). SCOUT: Scheduling Core Utilization to Optimize the Performance of Scientific Computing Applications on CPU/Coprocessor-Based Cluster. In: Bock, H.G., Jäger, W., Kostina, E., Phu, H.X. (eds) Modeling, Simulation and Optimization of Complex Processes HPSC 2018. Springer, Cham. https://doi.org/10.1007/978-3-030-55240-4_6

DOI: https://doi.org/10.1007/978-3-030-55240-4_6
Published: 02 December 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-55239-8
Online ISBN: 978-3-030-55240-4
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
