Abstract
Major chip manufacturers have all introduced multicore microprocessors, and multi-socket systems built from these processors run a wide range of server applications. Depending on the application, remote memory accesses can affect overall performance. This paper presents a new operating system (OS) scheduling optimization that reduces the impact of such remote memory accesses. By observing the pattern of local and remote DRAM accesses for every thread in each scheduling quantum and applying different algorithms, we derive a new schedule of threads for the next quantum. This new schedule can reduce remote DRAM accesses in the next scheduling quantum and improve overall performance. We present three new algorithms of varying complexity, followed by a fourth that is an adaptation of the Hungarian algorithm. We evaluate the algorithms with three different synthetic workloads and perform a sensitivity analysis with respect to DRAM latency. We show that these algorithms can reduce DRAM access latency by up to 55%, depending on the algorithm used. The benefit gained depends on the algorithm's complexity: in general, the higher the complexity, the higher the benefit. The Hungarian algorithm yields an optimal solution. We find that two of the four algorithms provide a good trade-off between performance and complexity for the workloads we studied.
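The abstract does not spell out the individual algorithms, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes a hypothetical per-quantum table `access_counts[t][s]` giving the number of DRAM accesses thread `t` made to memory attached to socket `s`, and a fixed number of cores per socket. A simple greedy heuristic stands in for a low-complexity algorithm, and an exhaustive search stands in for the optimal assignment that a Hungarian-style method would compute more efficiently. All function names and the sample numbers are assumptions made for illustration.

```python
from itertools import permutations


def remote_cost(access_counts, thread, socket):
    """Accesses that become remote if `thread` runs on `socket`."""
    return sum(access_counts[thread]) - access_counts[thread][socket]


def total_remote(access_counts, placement):
    """Total remote accesses for a thread -> socket placement."""
    return sum(remote_cost(access_counts, t, s) for t, s in placement.items())


def greedy_schedule(access_counts, cores_per_socket):
    """Low-complexity heuristic: place the most memory-intensive threads
    first, each on the free socket that minimizes its own remote accesses."""
    n_sockets = len(access_counts[0])
    free = [cores_per_socket] * n_sockets
    placement = {}
    order = sorted(range(len(access_counts)),
                   key=lambda t: -sum(access_counts[t]))
    for t in order:
        candidates = [s for s in range(n_sockets) if free[s] > 0]
        best = min(candidates, key=lambda s: remote_cost(access_counts, t, s))
        placement[t] = best
        free[best] -= 1
    return placement


def optimal_schedule(access_counts, cores_per_socket):
    """Exhaustive search over core slots. It gives the same minimum-cost
    assignment a Hungarian-style method would find, but in factorial time,
    so it is only usable for tiny examples."""
    n_sockets = len(access_counts[0])
    slots = [s for s in range(n_sockets) for _ in range(cores_per_socket)]
    n_threads = len(access_counts)
    best_cost, best_perm = None, None
    for perm in permutations(slots, n_threads):
        cost = sum(remote_cost(access_counts, t, perm[t])
                   for t in range(n_threads))
        if best_cost is None or cost < best_cost:
            best_cost, best_perm = cost, perm
    return dict(enumerate(best_perm))


# Hypothetical counts observed in one quantum: 4 threads, 2 sockets.
counts = [
    [60, 50],  # thread 0: nearly balanced, but the heaviest overall
    [50, 0],   # thread 1: all accesses go to socket 0 memory
    [70, 10],  # thread 2: mostly socket 0
    [5, 45],   # thread 3: mostly socket 1
]
greedy = greedy_schedule(counts, cores_per_socket=2)
optimal = optimal_schedule(counts, cores_per_socket=2)
print("greedy :", greedy, total_remote(counts, greedy))
print("optimal:", optimal, total_remote(counts, optimal))
```

In this made-up example the greedy pass gives the balanced thread 0 a slot on socket 0, which later forces thread 1 onto socket 1 where all of its accesses are remote. The exhaustive search avoids that, which mirrors the abstract's observation that higher-complexity algorithms tend to yield larger benefits.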
References
Chandra, R., Devine, S., Verghese, B., Gupta, A., Rosenblum, M.: Scheduling and page migration for multiprocessor compute servers. In: Proceedings of ASPLOS (1994)
Kaseridis, D., Stuecheli, J., Chen, J., John, L.K.: A bandwidth-aware memory-subsystem resource management using non-invasive resource profilers for large CMP systems. In: Proceedings of Sixteenth International Symposium on High Performance Computer Architecture (2010)
İpek, E., Mutlu, O., Martínez, J.F., Caruana, R.: Self-optimizing memory controllers: a reinforcement learning approach. In: Proceedings of International Symposium on Computer Architecture, Beijing, China, June 2008
Ahn, J.H., Erez, M., Dally, W.J.: The design space of data-parallel memory systems. In: Proceedings of SC (2006)
Zhu, Z., Zhang, Z.: A performance comparison of DRAM memory system optimizations for SMT processors. In: Proceedings of HPCA-11 (2005)
Nauman, R., Lim, W.-T., Thottethodi, M.: Effective management of DRAM bandwidth in multicore processors. In: Proceedings of PACT-2007
Tang, L., Mars, J., Vachharajani, N., Hundt, R., Soffa, M.L.: The impact of memory subsystem resource sharing on datacenter applications. In: Proceedings of International Symposium on Computer Architecture (2011)
Hur, I., Lin, C.: Adaptive history-based memory schedulers. In: Proceedings of the International Symposium on Microarchitecture (2004)
Rixner, S., Dally, W.J., Kapasi, U., Mattson, P.R., Owens, J.D.: Memory access scheduling. In: Proceedings of International Symposium on Computer Architecture (2000)
Kuhn, H.W.: The Hungarian method for the assignment problem. Nav. Res. Logist. Q. 2, 83–97 (1955)
Kim, C., Huh, J.: Fairness-oriented OS scheduling support for multicore system. In: Proceedings of 2016 International Conference on Supercomputing
Sahoo, P.K., Dehury, C.K.: Efficient data and CPU-intensive job scheduling algorithms for healthcare cloud. In: Elsevier Computers and Electrical Engineering, Vol. 68 (2018)
Lepers, B., Quema, V., Fedorova, A.: Thread and memory placement on NUMA systems: asymmetry matters. In: USENIX Annual Technical Conference (2015)
Harris, T., Maas, M., Marathe, V.J.: Callisto: co-scheduling parallel runtime systems. In: 9th EuroSys Conference (2014)
Acknowledgments
I would like to thank Prof. Alan Cox of Rice University for initially discussing with me the concept of optimizing OS scheduling algorithms for improving the performance of various workloads. I would also like to thank various reviewers of this work for their comments and feedback. Finally, I would like to thank my wife and kids for supporting me morally during the course of this work.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Durbhakula, M. (2019). OS Scheduling Algorithms for Memory Intensive Workloads in Multi-socket Multi-core Servers. In: Arai, K., Bhatia, R., Kapoor, S. (eds) Intelligent Computing. CompCom 2019. Advances in Intelligent Systems and Computing, vol 997. Springer, Cham. https://doi.org/10.1007/978-3-030-22871-2_18
DOI: https://doi.org/10.1007/978-3-030-22871-2_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-22870-5
Online ISBN: 978-3-030-22871-2
eBook Packages: Intelligent Technologies and Robotics (R0)