Abstract
High-performance computing capability is crucial for the advanced calculations of scientific applications. A parallelizing compiler can take a sequential program as input and automatically translate it into parallel form. However, for loops whose arrays have irregular (i.e., indirectly indexed), nonlinear, or dynamic access patterns, no state-of-the-art compiler can determine the parallelism at compile time. In this paper, we propose an efficient run-time scheme that computes a highly parallel execution schedule for such loops. The scheme first constructs a predecessor iteration table in the inspector phase and then schedules all loop iterations into wavefronts for parallel execution. Whereas the performance of inspector/executor methods usually degrades dramatically on non-uniform access patterns, our scheme does not suffer this degradation. Furthermore, its high scalability and low overhead make it especially suitable for multiprocessor systems.
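The inspector/wavefront idea sketched in the abstract can be illustrated in a few lines. The sketch below is an assumption-laden simplification, not the paper's exact algorithm: it supposes each iteration i performs one indirect read A[read_idx[i]] and one indirect write A[write_idx[i]], records for every array element the last iteration that accessed it (a predecessor table), and places each iteration one wavefront after its latest predecessor. Iterations in the same wavefront carry no cross-iteration dependence and may run in parallel.

```python
def build_wavefronts(read_idx, write_idx, n_elems):
    """Inspector sketch: assign each iteration to a wavefront.

    read_idx[i] / write_idx[i] are the (indirect) element indices that
    iteration i reads and writes; these names are illustrative assumptions.
    Returns a list of wavefronts, each a list of iteration numbers.
    """
    n_iters = len(read_idx)
    last_writer = [-1] * n_elems  # predecessor table: last iteration writing each element
    last_access = [-1] * n_elems  # last iteration reading or writing each element
    wave = [0] * n_iters

    for i in range(n_iters):
        pred_wave = -1
        # Flow dependence: follow the last writer of the element we read.
        w = last_writer[read_idx[i]]
        if w >= 0:
            pred_wave = max(pred_wave, wave[w])
        # Anti/output dependence: follow any prior access to the element we write.
        a = last_access[write_idx[i]]
        if a >= 0:
            pred_wave = max(pred_wave, wave[a])
        wave[i] = pred_wave + 1

        # Update the predecessor table with this iteration's accesses.
        last_writer[write_idx[i]] = i
        last_access[write_idx[i]] = i
        last_access[read_idx[i]] = i

    fronts = [[] for _ in range(max(wave) + 1)]
    for i, w in enumerate(wave):
        fronts[w].append(i)
    return fronts


# Example: iterations 0 and 1 touch disjoint data, iteration 2 reads what
# iteration 0 wrote, so the schedule is two wavefronts: [0, 1] then [2].
print(build_wavefronts([0, 0, 1], [1, 2, 2], n_elems=3))  # [[0, 1], [2]]
```

The executor would then run one wavefront at a time, with a barrier between wavefronts; a fully sequential loop (each iteration reading its predecessor's result) degenerates to one iteration per wavefront.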
© 2000 Springer-Verlag Berlin Heidelberg
Cite this paper
Huang, T.C., Hsu, P.H., Wu, C.F. (2000). An Efficient Run-Time Scheme for Exploiting Parallelism on Multiprocessor Systems. In: Valero, M., Prasanna, V.K., Vajapeyam, S. (eds) High Performance Computing — HiPC 2000. HiPC 2000. Lecture Notes in Computer Science, vol 1970. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44467-X_3
Print ISBN: 978-3-540-41429-2
Online ISBN: 978-3-540-44467-1