Skip to main content

Composing Low-Overhead Scheduling Strategies for Improving Performance of Scientific Applications

Part of the Lecture Notes in Computer Science book series (LNPSE,volume 9342)


Many different sources of overheads impact the efficiency of a scheduling strategy applied to a parallel loop within a scientific application. In prior work, we handled these overheads using multiple loop scheduling strategies, with each scheduling strategy focusing on mitigating a subset of the overheads. However, mitigating the impact of one source of overhead can lead to an increase in the impact of another source of overhead, and vice versa. In this work, we show that in order to improve efficiency of loop scheduling strategies, one must adapt the loop scheduling strategies so as to handle all overheads simultaneously. To show this, we describe a composition of our existing loop scheduling strategies, and experiment with the composed scheduling strategy on standard benchmarks and application codes. Applying the composed scheduling strategy to three MPI+OpenMP scientific codes run on a cluster of SMPs improves performance an average of 31 % over standard OpenMP static scheduling.


  • Schedule Strategy
  • Dynamic Fraction
  • Performance Loss
  • Static Fraction
  • Loop Iteration

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

This is a preview of subscription content, access via your institution.

Buying options

USD   29.95
Price excludes VAT (USA)
  • DOI: 10.1007/978-3-319-24595-9_2
  • Chapter length: 12 pages
  • Instant PDF download
  • Readable on all devices
  • Own it forever
  • Exclusive offer for individuals only
  • Tax calculation will be finalised during checkout
USD   44.99
Price excludes VAT (USA)
  • ISBN: 978-3-319-24595-9
  • Instant PDF download
  • Readable on all devices
  • Own it forever
  • Exclusive offer for individuals only
  • Tax calculation will be finalised during checkout
Softcover Book
USD   59.99
Price excludes VAT (USA)
Fig. 1.
Fig. 2.
Fig. 3.
Fig. 4.
Fig. 5.
Fig. 6.
Fig. 7.
Fig. 8.


  1. Bull, J.M.: Feedback guided dynamic loop scheduling: algorithms and experiments. In: Pritchard, D., Reeve, J.S. (eds.) Euro-Par 1998. LNCS, vol. 1470, p. 377. Springer, Heidelberg (1998)

    CrossRef  Google Scholar 

  2. Bull, J.M.: Measuring synchronisation and scheduling overheads in OpenMP. In: Proceedings of First European Workshop on OpenMP, pp. 99–105, Lund, Sweden (1999)

    Google Scholar 

  3. Dinan, J., Larkins, D.B., Sadayappan, P., Krishnamoorthy, S., Nieplocha, J.: Scalable work stealing. In: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, SC 2009, pp. 53:1–53:11, Portland, OR, USA. ACM (2009)

    Google Scholar 

  4. Donfack, S., Grigori, L., Gropp, W.D., Kale, V.: Hybrid static/dynamic scheduling for already optimized dense matrix factorizations. In: IEEE International Parallel and Distributed Processing Symposium, IPDPS 2012, Shanghai, China (2012)

    Google Scholar 

  5. Frigo, M., Leiserson, C.E., Randall, K.H.: The implementation of the Cilk-5 multithreaded language. SIGPLAN Not. 33(5), 212–223 (1998)

    CrossRef  Google Scholar 

  6. Heroux, M.: MiniFE documentation.

  7. Kale, V., Gamblin, T., Hoefler, T., de Supinski, B.R., Gropp, W.D.: Abstract: Slack-Conscious Lightweight Loop Scheduling for Improving Scalability of Bulk-synchronous MPI Applications, November 2012

    Google Scholar 

  8. Kale, V., Randles, A.P., Kale, V., Gropp, W.D.: Locality-optimized scheduling for improved load balancing on SMPs. In: Proceedings of the 21st European MPI Users’ Group Meeting Conference on Recent Advances in the Message Passing Interface, vol. 0, pp. 1063–1074. Association for Computing Machinery (2014)

    Google Scholar 

  9. Markatos, E.P., LeBlanc, T.J.: Using processor affinity in loop scheduling on shared-memory multiprocessors. In: Proceedings of the 1992 ACM/IEEE Conference on Supercomputing, Supercomputing 1992, pp. 104–113, Los Alamitos, CA, USA. IEEE Computer Society Press (1992)

    Google Scholar 

  10. Olivier, S.L., de Supinski, B.R., Schulz, M., Prins, J.F.: Characterizing and mitigating work time inflation in task parallel programs. In: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, SC 2012, pp. 65:1–65:12, Salt Lake City, UT, USA. IEEE Computer Society Press (2012)

    Google Scholar 

  11. Rein, H., Liu, S.F.: REBOUND: an open-source multi-purpose N-body code for collisional dynamics. Astron. Astrophys. 537, A128 (2012)

    CrossRef  Google Scholar 

  12. Rountree, B., Lowenthal, D.K., de Supinski, B.R., Schulz, M., Freeh, V.W., Bletsch, T.: Adagio: making DVS practical for complex HPC applications. In: Proceedings of the 23rd International Conference on Supercomputing, ICS 2009, pp. 460–469, Yorktown Heights, NY, USA. ACM (2009)

    Google Scholar 

  13. Talamo, A.: Numerical solution of the time dependent neutron transport equation by the method of the characteristics. J. Comput. Phys. 240, 248–267 (2013)

    MathSciNet  CrossRef  Google Scholar 

  14. Zhang, Y., Voss, M.: Runtime empirical selection of loop schedulers on hyperthreaded SMPs. In: Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2005), vol. 01, pp. 44.2, Washington, DC, USA. IEEE Computer Society (2005)

    Google Scholar 

Download references


This material is based in part upon work supported by the Department of Energy, National Nuclear Security Administration, under Award Number DE-NA0002374. This work was supported in part by the Office of Advanced Scientific Computing Research, Office of Science, U.S. Department of Energy award DE-FG02-13ER26138/DE-SC0010049.

Author information

Authors and Affiliations


Corresponding author

Correspondence to Vivek Kale .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and Permissions

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Kale, V., Gropp, W.D. (2015). Composing Low-Overhead Scheduling Strategies for Improving Performance of Scientific Applications. In: Terboven, C., de Supinski, B., Reble, P., Chapman, B., Müller, M. (eds) OpenMP: Heterogenous Execution and Data Movements. IWOMP 2015. Lecture Notes in Computer Science(), vol 9342. Springer, Cham.

Download citation

  • DOI:

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24594-2

  • Online ISBN: 978-3-319-24595-9

  • eBook Packages: Computer ScienceComputer Science (R0)