Abstract
SDN-enhanced MPI is a framework that integrates the network programmability of Software-Defined Networking (SDN) with the Message Passing Interface (MPI). Its aim is to improve MPI communication performance by dynamically steering traffic within the interconnect based on the communication patterns of applications. A major limitation of the current implementation of SDN-enhanced MPI is that multiple jobs cannot be executed concurrently. This paper removes this limitation by integrating SDN-enhanced MPI with the job scheduler of the cluster. Specifically, we have developed a plugin for the job scheduler that collects job information and reports it to the interconnect controller. A preliminary evaluation showed that applications could gain a communication speedup of up to 2.56×.
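The reporting path described above (scheduler plugin to interconnect controller) can be illustrated with a minimal sketch. This is not the plugin developed in the paper: the message fields (`job_id`, `nodes`, `num_procs`), the JSON encoding, and the `InterconnectController` class are all assumptions introduced here for illustration only. The sketch shows one way a controller could keep per-job state so that traffic from concurrently running jobs can be told apart.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Dict, List, Optional


@dataclass
class JobInfo:
    """Job attributes a scheduler plugin might report (hypothetical fields)."""
    job_id: int
    nodes: List[str]
    num_procs: int


def report_job_start(job: JobInfo) -> str:
    """Serialize job information as the plugin might before sending it."""
    return json.dumps(asdict(job))


@dataclass
class InterconnectController:
    """Toy stand-in for the interconnect controller (assumed interface)."""
    active_jobs: Dict[int, JobInfo] = field(default_factory=dict)

    def on_job_start(self, message: str) -> None:
        # Register the job so its nodes can be associated with it later.
        info = JobInfo(**json.loads(message))
        self.active_jobs[info.job_id] = info

    def on_job_end(self, job_id: int) -> None:
        # Forget the job once the scheduler reports that it has finished.
        self.active_jobs.pop(job_id, None)

    def job_of_node(self, node: str) -> Optional[int]:
        # Look up which running job, if any, currently occupies a node.
        for job in self.active_jobs.values():
            if node in job.nodes:
                return job.job_id
        return None


controller = InterconnectController()
controller.on_job_start(report_job_start(JobInfo(1, ["n0", "n1"], 8)))
controller.on_job_start(report_job_start(JobInfo(2, ["n2"], 4)))
```

With both jobs registered, `controller.job_of_node("n1")` identifies job 1 and `controller.job_of_node("n2")` identifies job 2, which is the kind of per-job separation that concurrent execution on a shared cluster requires.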
Acknowledgement
This work was supported by JSPS KAKENHI Grant Numbers JP17K00168 and JP26330145.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Takahashi, K., Date, S., Watashiba, Y., Kido, Y., Shimojo, S. (2020). Integrating SDN-Enhanced MPI with Job Scheduler to Support Shared Clusters. In: Resch, M., Kovalenko, Y., Bez, W., Focht, E., Kobayashi, H. (eds) Sustained Simulation Performance 2018 and 2019. Springer, Cham. https://doi.org/10.1007/978-3-030-39181-2_13
DOI: https://doi.org/10.1007/978-3-030-39181-2_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-39180-5
Online ISBN: 978-3-030-39181-2
eBook Packages: Mathematics and Statistics (R0)