Runtime Support for Distributed Dynamic Locality

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10659)


Single-node hardware design is becoming increasingly heterogeneous, and many of today's largest HPC systems are clusters that combine heterogeneous compute device architectures. The need for new programming abstractions on the path to the exascale era has been widely recognized, and variants of the Partitioned Global Address Space (PGAS) programming model are discussed as a promising approach in this respect. In this work, we present a graph-based approach to provide runtime support for dynamic, distributed hardware locality, specifically considering heterogeneous systems and asymmetric, deep memory hierarchies. Our reference implementation dyloc leverages hwloc to provide high-level operations on logical hardware topology based on user-specified predicates, such as filter and group transformations and locality-aware partitioning. To facilitate integration into existing applications, we discuss adapters that maintain compatibility with the established hwloc API.


Keywords: Partitioned Global Address Space (PGAS), Dy Locus, Hardware Topology, Local Hardware, Deep Memory Hierarchies
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



This work was partially supported by the German Research Foundation (DFG) through the German Priority Programme 1648 Software for Exascale Computing (SPPEXA) and by the German Federal Ministry of Education and Research (BMBF) through the MEPHISTO project, grant agreement 01IH16006B.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. MNM-Team, Computer Science Department, Ludwig-Maximilians-Universität (LMU) München, Munich, Germany
