The Open Community Runtime on the Intel Knights Landing Architecture

  • Jiri Dokulil
  • Siegfried Benkner
  • Jakub Yaghob
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10393)


The Intel Xeon Phi Knights Landing manycore processor comes with new, interesting features: on-chip high-bandwidth memory and several user-selectable NUMA configurations. In this paper, we examine how these features affect applications that target the Open Community Runtime (OCR), an asynchronous task-based runtime system for future parallel architectures. We have extended our OCR runtime to make it NUMA-aware and to allow it to use the high-bandwidth memory. We have conducted a range of experiments comparing OpenMP, TBB, our OCR implementation, and the reference OCR implementation on different machine configurations, using a memory-intensive seismic simulation.


Open Community Runtime, Knights Landing, Intel Xeon Phi, High-bandwidth memory, Parallel runtime systems, NUMA



The work was supported in part by the Austrian Science Fund (FWF) project P 29783 Dynamic Runtime System for Future Parallel Architectures and by Charles University project PROGRES Q48.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Faculty of Computer Science, University of Vienna, Vienna, Austria
  2. Department of Software Engineering, Charles University, Prague, Czech Republic
