Abstract
This paper presents adaptive execution techniques that determine whether automatically parallelized loops are executed in parallel or sequentially, in order to maximize performance and scalability. The adaptation and performance-estimation algorithms are implemented in a compiler preprocessor, which inserts code that automatically determines, at compile time or at run time, how each parallelized loop is executed. Running a set of standard numerical applications written in Fortran 77 with our techniques on a distributed shared-memory multiprocessor (SGI Origin2000), our techniques are, on average, 26%, 20%, 16%, and 10% faster than the original parallel programs on 32, 16, 8, and 4 processors, respectively. One of the applications runs more than twice as fast as its original parallel version on 32 processors.
This work was supported in part by the National Science Foundation under grant EIA-0130724 and by the National Computational Science Alliance under grant ocn, and utilized the Silicon Graphics Origin2000. This work was also supported in part by the Korean Ministry of Education under the BK21 program and by the Korean Ministry of Science and Technology under the National Research Laboratory program.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lee, J., Moonesinghe, H.D.K. (2005). Adaptively Increasing Performance and Scalability of Automatically Parallelized Programs. In: Pugh, B., Tseng, CW. (eds) Languages and Compilers for Parallel Computing. LCPC 2002. Lecture Notes in Computer Science, vol 2481. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11596110_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30781-5
Online ISBN: 978-3-540-31612-1
eBook Packages: Computer Science, Computer Science (R0)