
A multi-grain parallelizing compilation scheme for OSCAR (optimally scheduled advanced multiprocessor)

  • VII. Compilers & Scheduling
  • Conference paper
Languages and Compilers for Parallel Computing (LCPC 1991)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 589)

Abstract

This paper proposes a multi-grain parallelizing compilation scheme for Fortran programs. The scheme hierarchically exploits parallelism among coarse grain tasks such as loops, subroutines, and basic blocks; among medium grain tasks such as loop iterations; and among near fine grain tasks such as statements. Parallelism among the coarse grain tasks, called macrotasks, is exploited by carefully analyzing control dependences and data dependences. The macrotasks are dynamically assigned to processor clusters to cope with run-time uncertainties such as conditional branches among the macrotasks and variation of the execution time of each macrotask. This parallel processing of macrotasks is called macro-dataflow computation. A macrotask composed of a Do-all loop assigned to a processor cluster is processed at the medium grain in parallel by the processors inside the cluster. A macrotask composed of a sequential loop or a basic block is processed on a processor cluster at the near fine grain by using static scheduling. A macrotask composed of a subroutine or a large sequential loop is processed by hierarchically applying macro-dataflow computation inside a processor cluster. Performance of the proposed scheme is evaluated on a multiprocessor system named OSCAR. The evaluation shows that multi-grain parallel processing effectively exploits parallelism in Fortran programs.
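
As a rough illustration of the macro-dataflow idea summarized above, the following Python sketch dynamically assigns ready macrotasks to the earliest-available processor cluster and splits a Do-all macrotask's iterations among the processors inside a cluster. It is a minimal sketch written for this summary; the names (Macrotask, macro_dataflow_schedule, doall_chunks) and the earliest-free-cluster heuristic are hypothetical and are not taken from the OSCAR compiler itself.

    # Toy macro-dataflow scheduler: illustrative only, not the OSCAR compiler's code.
    from dataclasses import dataclass, field
    import heapq

    @dataclass
    class Macrotask:
        name: str                                  # a loop, subroutine, or basic block
        cost: float                                # estimated execution time
        preds: set = field(default_factory=set)    # data/control dependence predecessors

    def macro_dataflow_schedule(tasks, n_clusters):
        """Dynamically assign ready macrotasks to the earliest-free processor cluster;
        a macrotask becomes ready when all of its predecessors have finished."""
        finish = {}                                          # macrotask name -> finish time
        clusters = [(0.0, c) for c in range(n_clusters)]     # (time cluster becomes free, id)
        heapq.heapify(clusters)
        remaining = {t.name: t for t in tasks}
        schedule = []
        while remaining:
            # run-time readiness test (assumes the macrotask graph is acyclic)
            ready = [t for t in remaining.values() if t.preds <= set(finish)]
            task = min(ready, key=lambda t: t.cost)          # simple illustrative heuristic
            free_at, cid = heapq.heappop(clusters)
            start = max(free_at, max((finish[p] for p in task.preds), default=0.0))
            finish[task.name] = start + task.cost
            heapq.heappush(clusters, (finish[task.name], cid))
            schedule.append((task.name, cid, start))
            del remaining[task.name]
        return schedule

    def doall_chunks(n_iters, n_procs):
        """Medium grain: split a Do-all macrotask's iterations among the
        processors inside the cluster it was assigned to."""
        step = -(-n_iters // n_procs)                        # ceiling division
        return [range(i, min(i + step, n_iters)) for i in range(0, n_iters, step)]

    if __name__ == "__main__":
        graph = [Macrotask("MT1", 4.0),
                 Macrotask("MT2", 2.0, {"MT1"}),
                 Macrotask("MT3", 3.0, {"MT1"}),
                 Macrotask("MT4", 1.0, {"MT2", "MT3"})]
        print(macro_dataflow_schedule(graph, n_clusters=2))  # (macrotask, cluster, start time)
        print(doall_chunks(10, 4))                           # iteration ranges per processor

In the same spirit, near fine grain (statement-level) tasks inside a sequential loop or basic block would be placed by a static schedule computed at compile time rather than by the run-time readiness test shown above.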

Editor information

Utpal Banerjee, David Gelernter, Alex Nicolau, David Padua

Copyright information

© 1992 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kasahara, H., Honda, H., Mogi, A., Ogura, A., Fujiwara, K., Narita, S. (1992). A multi-grain parallelizing compilation scheme for OSCAR (optimally scheduled advanced multiprocessor). In: Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1991. Lecture Notes in Computer Science, vol 589. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0038671

  • DOI: https://doi.org/10.1007/BFb0038671

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-55422-6

  • Online ISBN: 978-3-540-47063-2

  • eBook Packages: Springer Book Archive
