Compiler Generated Progress Estimation for OpenMP Programs

  • Peter Zangerl
  • Peter Thoman
  • Thomas Fahringer
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11657)


Task-parallel runtime systems have to tune several parameters and make scheduling decisions during program execution to achieve the best performance. To decide whether a parameter change benefited program performance, the runtime needs feedback on the progress of the program after the change. Traditionally, this feedback is derived from metrics that are only indirectly related to program progress.

To mitigate this drawback, we propose a fully automatic compiler analysis and transformation that generates progress estimates for sequential and OpenMP programs. Combined with a runtime system interface for progress reporting, this gives the runtime system direct feedback on the progress of the executed program.
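As a rough illustration of the idea, the following C sketch shows what a loop might look like after a compiler has inserted progress reporting calls. It is not the Insieme implementation or its interface: the functions `report_progress` and `read_progress`, the per-iteration cost `COST_PER_ITERATION`, and the batching interval `REPORT_INTERVAL` are all hypothetical names and values chosen for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical runtime interface (an assumption, not the Insieme API):
 * a shared counter of abstract "progress units" completed so far. */
static atomic_uint_fast64_t progress_units;

/* Called from instrumented code to report completed work. */
static void report_progress(uint64_t units) {
    atomic_fetch_add_explicit(&progress_units, units, memory_order_relaxed);
}

/* Called by the runtime to sample progress, e.g. before and after
 * changing a tuning parameter. */
static uint64_t read_progress(void) {
    return atomic_load_explicit(&progress_units, memory_order_relaxed);
}

/* Assumed values: a statically estimated cost per loop iteration and a
 * batching interval that keeps the reporting overhead low. */
enum { COST_PER_ITERATION = 3, REPORT_INTERVAL = 64 };

/* A simple reduction loop as it might look after instrumentation.
 * The reporting calls stand in for what a compiler pass would insert. */
static double sum_with_progress(const double *data, int n) {
    assert(n >= 0);
    double sum = 0.0;
    int pending = 0; /* iterations finished but not yet reported */
    for (int i = 0; i < n; ++i) {
        sum += data[i];
        if (++pending == REPORT_INTERVAL) {
            report_progress((uint64_t)pending * COST_PER_ITERATION);
            pending = 0;
        }
    }
    if (pending > 0) {
        /* Report the remainder so the total matches n * COST_PER_ITERATION. */
        report_progress((uint64_t)pending * COST_PER_ITERATION);
    }
    return sum;
}
```

A runtime could sample `read_progress()` at fixed time intervals and compare the rate of reported units before and after a parameter change. The atomic counter is used so that the same reporting call would remain safe if the loop were executed by several OpenMP threads.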

We based our implementation on the Insieme compiler and runtime system and evaluated it on a set of eight benchmarks representing a variety of different types of algorithms. Our evaluation results show a significant improvement in estimation accuracy over traditional estimation methods, with an increasing advantage for larger degrees of parallelism.



This work is supported by the D-A-CH project CELERITY, funded by DFG project CO1544/1-1 and FWF project 13388.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. University of Innsbruck, Innsbruck, Austria
