Abstract
This paper proposes a multi-grain parallelizing compilation scheme for Fortran programs. The scheme hierarchically exploits parallelism among coarse-grain tasks such as loops, subroutines, and basic blocks; among medium-grain tasks such as loop iterations; and among near-fine-grain tasks such as statements. Parallelism among the coarse-grain tasks, called macrotasks, is exploited by carefully analyzing control dependences and data dependences. Macrotasks are dynamically assigned to processor clusters to cope with run-time uncertainties such as conditional branches among macrotasks and variation in each macrotask's execution time; this parallel processing of macrotasks is called macro-dataflow computation. A macrotask consisting of a Do-all loop assigned to a processor cluster is processed at the medium grain in parallel by the processors inside the cluster. A macrotask consisting of a sequential loop or a basic block is processed on a processor cluster at the near-fine grain using static scheduling. A macrotask consisting of a subroutine or a large sequential loop is processed by hierarchically applying macro-dataflow computation inside a processor cluster. The performance of the proposed scheme is evaluated on a multiprocessor system named OSCAR. The evaluation shows that multi-grain parallel processing effectively exploits parallelism from Fortran programs.
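To make the macro-dataflow idea concrete, the following is a minimal, hypothetical sketch (not the authors' implementation) of dynamic assignment of macrotasks to processor clusters: a macrotask becomes ready once all of its data-dependence predecessors have finished, and is then assigned to the earliest-free cluster. All names (`macro_dataflow_schedule`, the task/dependence dictionaries) are illustrative, and the sketch simplifies the paper's scheme by omitting control dependences and conditional branches among macrotasks.

```python
from collections import deque

def macro_dataflow_schedule(tasks, deps, num_clusters):
    """Sketch of macro-dataflow-style list scheduling.

    tasks: {macrotask name: estimated execution cost}
    deps:  {macrotask name: set of data-dependence predecessors}
    Returns (assignment of tasks to clusters, overall makespan).
    """
    preds = {t: set(deps.get(t, ())) for t in tasks}
    succs = {t: set() for t in tasks}
    for t, ps in preds.items():
        for p in ps:
            succs[p].add(t)

    # Macrotasks with no unresolved dependences are ready immediately.
    ready = deque(sorted(t for t in tasks if not preds[t]))
    cluster_free = [0.0] * num_clusters   # time each cluster becomes idle
    finish = {}                           # finish time of each macrotask
    assignment = {}

    while ready:
        t = ready.popleft()
        # Dynamic assignment: pick the cluster that frees up earliest.
        c = min(range(num_clusters), key=lambda i: cluster_free[i])
        # A macrotask starts only after its cluster is free and all
        # of its predecessors have produced their data.
        start = max([cluster_free[c]] + [finish[p] for p in deps.get(t, ())])
        finish[t] = start + tasks[t]
        cluster_free[c] = finish[t]
        assignment[t] = c
        # Resolve dependences of successors; newly ready tasks join the queue.
        for s in succs[t]:
            preds[s].discard(t)
            if not preds[s]:
                ready.append(s)

    return assignment, max(finish.values())
```

In the hierarchical scheme described above, each scheduled macrotask would itself be processed in parallel inside its cluster (medium or near-fine grain), which this flat sketch does not model.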
Copyright information
© 1992 Springer-Verlag Berlin Heidelberg
Cite this paper
Kasahara, H., Honda, H., Mogi, A., Ogura, A., Fujiwara, K., Narita, S. (1992). A multi-grain parallelizing compilation scheme for OSCAR (optimally scheduled advanced multiprocessor). In: Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1991. Lecture Notes in Computer Science, vol 589. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0038671
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-55422-6
Online ISBN: 978-3-540-47063-2