Adaptively Increasing Performance and Scalability of Automatically Parallelized Programs

  • Conference paper
Languages and Compilers for Parallel Computing (LCPC 2002)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 2481)

Abstract

This paper presents adaptive execution techniques that determine whether automatically parallelized loops are executed in parallel or sequentially in order to maximize performance and scalability. The adaptation and performance estimation algorithms are implemented in a compiler preprocessor, which inserts code that decides, at compile time or at run time, how each parallelized loop is executed. Using a set of standard numerical applications written in Fortran77 and running them with our techniques on a distributed shared-memory multiprocessor (SGI Origin2000), our techniques run, on average, 26%, 20%, 16%, and 10% faster than the original parallel programs on 32, 16, 8, and 4 processors, respectively. One of the applications runs more than twice as fast as its original parallel version on 32 processors.
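
As a rough illustration of the run-time side of this scheme, the following minimal sketch shows the kind of two-version loop such a preprocessor could emit: an OpenMP PARALLEL DO whose IF clause serializes the loop whenever an estimated cost does not justify parallel execution. The subroutine name, EST_COST, and THRESH are illustrative placeholders, not identifiers from the paper.

      SUBROUTINE DAXPY_ADAPT(N, A, X, Y, EST_COST, THRESH)
C     Hypothetical adaptive loop (not from the paper): it executes in
C     parallel only when the estimated work exceeds a profitability
C     threshold; otherwise the IF clause makes it run sequentially.
      INTEGER N, I
      DOUBLE PRECISION A, X(N), Y(N), EST_COST, THRESH
C$OMP PARALLEL DO IF (EST_COST .GT. THRESH)
      DO 10 I = 1, N
         Y(I) = Y(I) + A * X(I)
   10 CONTINUE
C$OMP END PARALLEL DO
      RETURN
      END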

This work was supported in part by the National Science Foundation under grant EIA-0130724 and by the National Computational Science Alliance under grant ocn, and utilized the Silicon Graphics Origin2000. This work was also supported in part by the Korean Ministry of Education under the BK21 program and by the Korean Ministry of Science and Technology under the National Research Laboratory program.




Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lee, J., Moonesinghe, H.D.K. (2005). Adaptively Increasing Performance and Scalability of Automatically Parallelized Programs. In: Pugh, B., Tseng, CW. (eds) Languages and Compilers for Parallel Computing. LCPC 2002. Lecture Notes in Computer Science, vol 2481. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11596110_14

  • DOI: https://doi.org/10.1007/11596110_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-30781-5

  • Online ISBN: 978-3-540-31612-1

  • eBook Packages: Computer Science, Computer Science (R0)
