Speculative Parallelization of Partially Parallel Loops

  • Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1915)


Current parallelizing compilers cannot identify a significant fraction of parallelizable loops because these loops have complex or statically insufficiently defined access patterns. We have previously proposed a framework for their identification: a loop is speculatively executed as a doall, and a fully parallel data dependence test determines whether it had any cross-processor dependences; if the test fails, the loop is re-executed serially. While this method exploits doall parallelism well, it can cause slowdowns for loops with even a single cross-processor flow dependence, because the loop must then be re-executed sequentially. Moreover, the partial parallelism that such loops do contain is not exploited. In this paper we propose a generalization of our speculative doall parallelization technique, named the Recursive LRPD test, which can extract and exploit the maximum available parallelism of any loop and which limits potential slowdowns to the overhead of the run-time dependence test itself, i.e., it removes the time lost to incorrect parallel execution. For fully serial loops, the asymptotic time complexity equals that of sequential execution. We present the base algorithm and an analysis of different heuristics for its practical application. Preliminary experimental results on loops from Track illustrate the performance of this new technique.
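The strategy the abstract describes can be illustrated with a small sequential simulation: speculate on a block of iterations, run an LRPD-style dependence test over the logged access pattern, commit the dependence-free prefix, and recurse on the remainder. This is only a sketch, not the authors' implementation; the loop body, the index arrays `rd` and `wr`, and the helper names are hypothetical, and only flow dependences are tested (the full LRPD test also handles output dependences, privatization, and reductions).

```python
# Illustrative sketch (NOT the paper's implementation): a sequential
# simulation of the Recursive-LRPD idea. Each iteration i of the loop
# reads A[rd[i]] and writes A[wr[i]]; rd, wr, body(), and
# recursive_speculate() are hypothetical names for this example.

def body(i, A, rd, wr):
    """One loop iteration: reads A[rd[i]], writes A[wr[i]]."""
    A[wr[i]] = A[rd[i]] + 1

def first_violation(start, end, rd, wr):
    """LRPD-style test over iterations [start, end): return the first
    iteration that reads a location written by an earlier iteration of
    the block (a cross-iteration flow dependence), or `end` if the
    block is fully parallel. (Only flow dependences are checked here.)"""
    written = {}                        # location -> earliest writer
    for i in range(start, end):
        if rd[i] in written and written[rd[i]] < i:
            return i
        written.setdefault(wr[i], i)
    return end

def recursive_speculate(A, rd, wr, start, end, committed):
    """'Execute' [start, end) speculatively as a doall: run the test,
    commit the dependence-free prefix, then recurse on the rest, so a
    partially parallel loop runs as a sequence of smaller doalls."""
    if start >= end:
        return
    ok = first_violation(start, end, rd, wr)
    shadow = list(A)                    # speculate on a copy of A
    for i in range(start, ok):          # these iterations carry no flow
        body(i, shadow, rd, wr)         # dependence, so order is irrelevant
    A[:] = shadow                       # commit the correct prefix
    committed.append((start, ok))
    recursive_speculate(A, rd, wr, ok, end, committed)
```

For a fully parallel access pattern the whole loop commits in one block, matching the original LRPD doall case; for a fully serial pattern every block is a single iteration, so the work done degenerates to sequential execution plus the test overhead, which is the bound the abstract states.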


  • Execution Time
  • Load Balance
  • Access Pattern
  • Parallel Execution
  • Perfect Code


Research supported in part by NSF CAREER Award CCR-9734471, NSF Grant ACI-9872126, NSF Grant EIA-9975018, DOE ASCI ASAP Level 2 Grant B347886, and a Hewlett-Packard Equipment Grant.





Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dang, F.H., Rauchwerger, L. (2000). Speculative Parallelization of Partially Parallel Loops. In: Dwarkadas, S. (eds) Languages, Compilers, and Run-Time Systems for Scalable Computers. LCR 2000. Lecture Notes in Computer Science, vol 1915. Springer, Berlin, Heidelberg.

  • DOI: 10.1007/3-540-40889-4_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41185-7

  • Online ISBN: 978-3-540-40889-5

  • eBook Packages: Springer Book Archive
