Compiler Automatic Discovery of OmpSs Task Dependencies

  • Sara Royuela
  • Alejandro Duran
  • Xavier Martorell
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7760)


Dependence analysis is an essential step for many compiler optimizations, from simple loop transformations to automatic parallelization. Parallel programming models require specific dependence analyses that take multi-threaded execution into account. Furthermore, the asynchronous parallelism introduced by OpenMP tasks has promoted the development of new dependence analysis techniques. In this context, the OmpSs parallel programming model extends OpenMP tasks with the definition of inter-task dependencies. This extension allows run-time dependency detection, which can improve performance when load balancing or locality dominates the execution time. On the other hand, the extension requires the user to determine the data-sharing attributes and the type of access to each piece of data in every task in order to specify the dependencies correctly. We aim to enhance the programmability of OmpSs with a new methodology that enables the compiler to determine the dependencies of OmpSs tasks automatically, thus relieving users of the burden of defining these dependencies manually. To this end, we have developed an algorithm based on liveness analysis and on the discovery of the code that runs concurrently with a task. The algorithm first finds all code concurrent with a given task. Then, it computes the data-sharing attributes of the variables appearing in the task. Finally, it analyzes the liveness properties of the task’s shared variables. With this information, the algorithm derives the proper dependencies of the task. We have implemented this algorithm in the Mercurium source-to-source compiler. We have tested it on several benchmarks, showing that the algorithm correctly finds a large number of dependency expressions.
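The last two steps described in the abstract can be illustrated with a minimal sketch. It is not the algorithm implemented in Mercurium: it assumes a straight-line task body, takes the set of shared variables as given, ignores the concurrent-code discovery step entirely, and uses a hypothetical function name, `derive_dependencies`. A shared variable that is read before any write inside the task (live on entry) is treated as an input, a variable that is only written as an output, and a variable that is both as an in/out dependency.

```python
def derive_dependencies(statements, shared):
    """Classify the shared variables of a task as in / out / inout.

    statements: list of (reads, writes) pairs of variable-name sets,
                in program order (assumed straight-line code).
    shared:     set of variable names assumed to have the shared
                data-sharing attribute; all other names are ignored.
    """
    live_in = set()   # variables read before any write inside the task
    written = set()   # variables written anywhere inside the task
    for reads, writes in statements:
        live_in |= set(reads) - written
        written |= set(writes)

    deps = {"in": [], "out": [], "inout": []}
    for var in sorted(shared):
        if var in live_in and var in written:
            deps["inout"].append(var)
        elif var in live_in:
            deps["in"].append(var)
        elif var in written:
            deps["out"].append(var)
    return deps


if __name__ == "__main__":
    # Hypothetical task body:  c = a + b;  a = c * 2;
    task_body = [({"a", "b"}, {"c"}), ({"c"}, {"a"})]
    print(derive_dependencies(task_body, shared={"a", "b", "c"}))
    # prints {'in': ['b'], 'out': ['c'], 'inout': ['a']}
```

In this example `c` is read only after the task itself has written it, so it is reported as an output rather than an in/out dependency. A real analysis would also have to handle control flow, array sections, and accesses made by code concurrent with the task.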


Block Size, Dependence Graph, Automatic Discovery, Race Condition, Liveness Property
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Sara Royuela (1)
  • Alejandro Duran (1, 2)
  • Xavier Martorell (1)

  1. Barcelona Supercomputing Center, Spain
  2. Intel Corporation, USA
