Abstract
To optimize programs for parallel computers with distributed shared memory, two main problems must be solved: balancing the load between processors and minimizing interprocessor communication. This article describes a new technique called data-driven scheduling that can be applied to sequentially iterated program regions on parallel computers with distributed shared memory. During the first execution of such a region, statistical data on task execution times and memory access behaviour are gathered. From this data, a special graph is generated, to which graph partitioning techniques are applied. The resulting partitioning is stored in a template that is used in subsequent executions of the region to schedule its parallel tasks efficiently. Data-driven scheduling is integrated into the SVM-Fortran compiler. Performance results are shown for the Intel Paragon XP/S with the DSM extension ASVM and for the SGI Origin2000.
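The pipeline the abstract describes (profile the first execution, build a weighted task graph, partition it, store the assignment as a reusable template) can be sketched as follows. This is a minimal illustrative sketch, not the SVM-Fortran implementation: all names are hypothetical, the edge weights crudely approximate communication by counting shared pages, and a toy greedy heuristic stands in for the multilevel partitioners the paper actually relies on.

```python
# Illustrative sketch of data-driven scheduling (hypothetical names, not
# the SVM-Fortran implementation). Phase 1 profiles a region's tasks;
# phase 2 builds a task graph (node weight = measured run time, edge
# weight = number of shared pages) and partitions it; the resulting
# template is reused on later executions of the region.

def build_task_graph(task_times, page_accesses):
    """Return (node_weights, edges) from first-execution profile data."""
    edges = {}
    tasks = list(task_times)
    for i in range(len(tasks)):
        for j in range(i + 1, len(tasks)):
            shared = page_accesses[tasks[i]] & page_accesses[tasks[j]]
            if shared:
                edges[(tasks[i], tasks[j])] = len(shared)
    return task_times, edges

def greedy_partition(node_weights, edges, n_procs):
    """Toy stand-in for a real graph partitioner: place heaviest tasks
    first, preferring the processor holding their most strongly
    connected neighbours, penalized by that processor's current load."""
    assign = {}
    load = [0.0] * n_procs
    affinity = {t: [0.0] * n_procs for t in node_weights}
    for task in sorted(node_weights, key=node_weights.get, reverse=True):
        # Score = communication kept local minus load-imbalance penalty.
        scores = [affinity[task][p] - load[p] for p in range(n_procs)]
        proc = scores.index(max(scores))
        assign[task] = proc
        load[proc] += node_weights[task]
        # Raise neighbours' affinity for the chosen processor.
        for (a, b), w in edges.items():
            if a == task:
                affinity[b][proc] += w
            elif b == task:
                affinity[a][proc] += w
    return assign  # the "template" reused in subsequent executions

# Example: 4 tasks on 2 processors; tasks 0/1 and 2/3 share pages.
times = {0: 1.0, 1: 1.1, 2: 0.9, 3: 1.0}
pages = {0: {10, 11, 12}, 1: {11, 12, 13}, 2: {20}, 3: {20, 21}}
weights, edge_weights = build_task_graph(times, pages)
template = greedy_partition(weights, edge_weights, 2)
```

With these profile numbers the heuristic co-locates the two communicating pairs on separate processors, which is exactly the trade-off (locality versus load balance) the template is meant to capture once and amortize over all later iterations of the region.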
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Berrendorf, R. (1998). Optimizing load balance and communication on parallel computers with distributed shared memory. In: Pritchard, D., Reeve, J. (eds) Euro-Par’98 Parallel Processing. Euro-Par 1998. Lecture Notes in Computer Science, vol 1470. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0057866
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64952-6
Online ISBN: 978-3-540-49920-6
eBook Packages: Springer Book Archive