Task-Based Programming with OmpSs and Its Application

  • Alejandro Fernández
  • Vicenç Beltran
  • Xavier Martorell
  • Rosa M. Badia
  • Eduard Ayguadé
  • Jesus Labarta
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8806)

Abstract

OmpSs is a task-based programming model that aims to provide portability and flexibility to sequential codes, while performance is achieved by dynamically exploiting task-level parallelism. OmpSs targets the programming of heterogeneous and multi-core architectures and offers asynchronous parallelism in the execution of tasks. Its main extension, now incorporated in the recent OpenMP 4.0 standard, is the concept of data dependences between tasks.

Tasks in OmpSs are annotated with data-directionality clauses that specify which data a task uses and how it uses it (read, write, or read & write). The underlying OmpSs runtime uses this information during execution to synchronize the different task instances, building a dependence graph that guarantees a correct order of execution. This mechanism provides a simple way to express the order in which tasks must be executed, without the need for explicit synchronization.

Additionally, the OmpSs syntax offers the flexibility to express that given tasks can be executed on heterogeneous target architectures (i.e., regular processors, GPUs, or FPGAs). The runtime schedules and runs these tasks, taking care of the required data transfers and synchronizations. OmpSs is a promising programming model for future exascale systems, with the potential to exploit unprecedented amounts of parallelism while coping with memory latency, network latency and load imbalance.

The paper covers the basics of OmpSs and some recent developments to support a family of embedded DSLs (eDSLs) on top of the compiler and runtime, including a prototype implementation of a DSL for partial differential equations.



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Alejandro Fernández (1)
  • Vicenç Beltran (1)
  • Xavier Martorell (1, 3)
  • Rosa M. Badia (1, 2)
  • Eduard Ayguadé (1, 3)
  • Jesus Labarta (1, 3)

  1. Barcelona Supercomputing Center - Centro Nacional de Supercomputación (BSC-CNS), Spain
  2. Artificial Intelligence Research Institute (IIIA), Spanish Council for Scientific Research (CSIC), Spain
  3. Universitat Politècnica de Catalunya, Spain
