Toward Enhancing OpenMP’s Work-Sharing Directives

  • Barbara M. Chapman
  • Lei Huang
  • Haoqiang Jin
  • Gabriele Jost
  • Bronis R. de Supinski
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4128)


OpenMP provides a portable programming interface for shared memory parallel computers (SMPs). Although this interface has proven successful for small SMPs, it requires greater flexibility in light of the steadily growing size of individual SMPs and the recent advent of multithreaded chips. In this paper, we describe two application development experiences that exposed these expressivity problems in the current OpenMP specification. We then propose mechanisms to overcome these limitations, including thread subteams and thread topologies. Thus, we identify language features that improve OpenMP application performance on emerging and large-scale platforms while preserving ease of programming.


Shared Memory, Loop Nest, Seismic Data Processing, Library Routine, Thread Number
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Barbara M. Chapman (1)
  • Lei Huang (1)
  • Haoqiang Jin (2)
  • Gabriele Jost (3)
  • Bronis R. de Supinski (4)

  1. University of Houston, Houston, USA
  2. NASA Ames Research Center, USA
  3. Sun Microsystems, Inc., USA
  4. Lawrence Livermore National Laboratory, USA
