A Locality-Aware Communication Layer for Virtualized Clusters

  • Simon Pickartz
  • Jonas Baude
  • Stefan Lankes
  • Antonello Monti
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10524)


Locality-aware HPC communication stacks have existed since the emergence of SMP systems in the early 2000s. Common MPI implementations provide communication paths optimized for the underlying transport mechanism, i.e., two processes residing on the same SMP node should leverage local shared-memory communication, while inter-node communication should be realized by means of HPC interconnects. As virtualization gains more and more importance in the area of HPC, locality-awareness becomes relevant again. HPC systems commonly lack support for efficient communication among co-located VMs, i.e., such VMs harness the local InfiniBand adapter instead of the shared physical memory of the host system. This results in significant performance penalties, especially for communication-intensive applications. IVShmem provides a means to exploit the local memory as a communication medium. In this paper, we present a locality-aware MPI layer that leverages this technology for efficient intra-host inter-VM communication. We evaluate our implementation by drawing a comparison to a non-locality-aware communication layer in virtualized clusters.
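The core decision the abstract describes can be sketched as a channel-selection rule: two peers compare the identity of their physical host (not their VM) and pick the shared-memory path when they are co-located, falling back to the InfiniBand path otherwise. The names below (`Endpoint`, `select_channel`, the `host_id` field) are illustrative assumptions, not the paper's actual API:

```python
# Hypothetical sketch of locality-aware channel selection for
# virtualized clusters: co-located VMs (same physical host) use an
# IVShmem-backed shared-memory channel; remote peers use InfiniBand.
# All names here are illustrative, not taken from the paper.

from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    rank: int      # MPI rank of the process
    host_id: str   # identifier of the physical host, not the VM


def select_channel(local: Endpoint, peer: Endpoint) -> str:
    """Pick the transport for a point-to-point connection.

    Co-located VMs share the host's physical memory, so a
    shared-memory channel (e.g., via IVShmem) avoids the detour
    through the local InfiniBand adapter.
    """
    if local.host_id == peer.host_id:
        return "ivshmem"      # intra-host, possibly inter-VM
    return "infiniband"       # inter-host


# Example: ranks 0 and 1 run in two VMs on the same host,
# rank 2 runs on a different host.
a = Endpoint(0, "host-a")
b = Endpoint(1, "host-a")
c = Endpoint(2, "host-b")
assert select_channel(a, b) == "ivshmem"
assert select_channel(a, c) == "infiniband"
```

The key point, as the abstract notes, is that the locality test must operate on the physical host rather than the guest: from inside a VM, a co-located peer looks like a remote node unless the host identity is exposed to the communication layer.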


Keywords: Locality-awareness · Virtualization · IVShmem · MPI



This research and development was supported by the Federal Ministry of Education and Research (BMBF) under Grant 01IH16010C (Project ENVELOPE).



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Simon Pickartz¹
  • Jonas Baude¹
  • Stefan Lankes¹
  • Antonello Monti¹

  1. Institute for Automation of Complex Power Systems, E.ON Energy Research Center, RWTH Aachen University, Aachen, Germany
