Abstract
The OSCAR Fortran multigrain compiler [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] has been developed since 1986 for the multiprocessor system OSCAR (Optimally Scheduled Advanced Multiprocessor) [11], which has centralized and distributed shared memories in addition to a local memory on each processor. The multigrain compiler allows ordinary users to obtain high effective performance with little effort. It automatically parallelizes every block of a program, such as Do-all loops, Do-across loops, sequential loops, subroutines, and basic blocks outside loops, at both the inter-block and intra-block levels. More concretely, the compiler hierarchically exploits coarse-grain parallelism among loops, subroutines, and basic blocks [2, 3, 4, 6]; conventional medium-grain parallelism among loop iterations in a Do-all loop; and near-fine-grain parallelism among statements inside a basic block [8, 9, 10]. Coarse-grain parallelism is detected automatically by the earliest executable condition analysis of macrotasks [3, 4], or coarse-grain tasks, which considers both control dependencies and data dependencies among macrotasks. Macrotasks are assigned dynamically, with low overhead, to processor clusters by a scheduling routine generated by the compiler [1, 4]. During macro-dataflow processing, data-localization techniques use the local memory on each processor to minimize data-transfer overhead among macrotasks [12, 13]. A macrotask consisting of a Do-all or Do-across loop that is assigned to a processor cluster is itself hierarchically processed in parallel at the medium grain (i.e., at the loop-iteration level) by the processors inside that cluster.
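As a rough illustration of the macrotask-scheduling idea described above, the sketch below performs greedy list scheduling of a macrotask graph onto a fixed number of processor clusters. It is not the OSCAR compiler's actual algorithm: readiness is simplified to "all data-dependence predecessors have finished," omitting the control-dependence part of the earliest executable condition, and all names (`schedule_macrotasks`, `deps`, `cost`) are hypothetical.

```python
import heapq

def schedule_macrotasks(deps, cost, num_clusters):
    """Greedy list scheduling of a macrotask graph onto processor clusters.

    deps: {task: set of data-dependence predecessor tasks}
    cost: {task: estimated execution time}
    Returns {task: (cluster_id, start_time)}.
    """
    preds = {t: set(p) for t, p in deps.items()}
    succs = {t: set() for t in deps}
    for t, ps in deps.items():
        for p in ps:
            succs[p].add(t)

    # Earliest time each task could start (max finish time of predecessors).
    ready_time = {t: 0.0 for t in deps}
    # Min-heap of (time the cluster becomes free, cluster id).
    clusters = [(0.0, c) for c in range(num_clusters)]
    heapq.heapify(clusters)

    ready = [t for t, ps in preds.items() if not ps]
    placement = {}
    while ready:
        # Pick the ready macrotask that can start earliest.
        ready.sort(key=lambda t: ready_time[t])
        t = ready.pop(0)
        free_at, c = heapq.heappop(clusters)
        start = max(free_at, ready_time[t])
        end = start + cost[t]
        placement[t] = (c, start)
        heapq.heappush(clusters, (end, c))
        # Release successors whose predecessors are now all finished.
        for s in succs[t]:
            preds[s].discard(t)
            ready_time[s] = max(ready_time[s], end)
            if not preds[s]:
                ready.append(s)
    return placement
```

For a diamond-shaped graph A → {B, C} → D on two clusters, B and C run concurrently after A, and D starts only after both finish; a real macro-dataflow scheduler would additionally resolve branches at run time and account for data-transfer costs between clusters.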
References
[1] H. Kasahara et al., “A Multi-grain Parallelizing Compilation Scheme on OSCAR,” Proc. 4th Workshop on Languages and Compilers for Parallel Computing, August 1991.
[2] H. Kasahara, H. Honda, M. Iwata, and M. Hirota, “A Macro-dataflow Compilation Scheme for Hierarchical Multiprocessor Systems,” Proc. International Conference on Parallel Processing, August 1990.
[3] H. Honda, M. Iwata, and H. Kasahara, “Coarse Grain Parallelism Detection Scheme of Fortran Programs,” Trans. IEICE, J73-D-I(12), December 1990 (in Japanese).
[4] H. Kasahara, Parallel Processing Technology, Corona Publishing, Tokyo, June 1991 (in Japanese).
[5] H. Kasahara, H. Honda, and S. Narita, “A Fortran Parallelizing Compilation Scheme for OSCAR Using Dependence Graph Analysis,” IEICE Trans., E74(10):3105–3114, 1991.
[6] H. Honda, K. Aida, M. Okamoto, A. Yoshida, W. Ogata, and H. Kasahara, “Fortran Macro-Dataflow Compiler,” Proc. 4th Workshop on Compilers for Parallel Computers, December 1993.
[7] M. Okamoto, K. Aida, M. Miyazawa, H. Honda, and H. Kasahara, “A Hierarchical Macro-Dataflow Computation Scheme for OSCAR Multi-grain Compiler,” Trans. of Information Processing Society of Japan, 35(4):513–521, April 1994 (in Japanese).
[8] H. Kasahara and S. Narita, “An Approach to Supercomputing Using Multiprocessor Scheduling Algorithms,” Proc. IEEE 1st International Conference on Supercomputing, 139–148, December 1985.
[9] H. Kasahara, H. Honda, and S. Narita, “Parallel Processing of Near-fine Grain Tasks Using Static Scheduling on OSCAR,” Proc. IEEE ACM Supercomputing ’90, November 1990.
[10] W. Ogata, A. Yoshida, K. Aida, M. Okamoto, and H. Kasahara, “Near-fine Grain Parallel Processing without Synchronization Using Static Scheduling,” Trans. Information Processing Society of Japan, 35(4):522–531, April 1994 (in Japanese).
[11] H. Kasahara, S. Narita, and S. Hashimoto, “OSCAR’s Architecture,” Trans. IEICE, J71-D-I(8), August 1988 (in Japanese).
[12] A. Yoshida, S. Maeda, W. Ogata, and H. Kasahara, “A Data-Localization Scheme for Fortran Macro-Dataflow Computation,” Trans. Information Processing Society of Japan, 35(9):1848–1860, September 1994 (in Japanese).
[13] A. Yoshida, S. Maeda, W. Ogata, and H. Kasahara, “A Data-Localization Scheme among Doall/Sequential Loops for Fortran Coarse-Grain Parallel Processing,” Trans. IEICE, J78-D-I(2), February 1995 (in Japanese).
[14] B.S. Baker, “An Algorithm for Structuring Flowgraphs,” J. ACM, 24(1):98–120, January 1977.
[15] M. Burke and R. Cytron, “Interprocedural Dependence Analysis and Parallelization,” Proc. ACM SIGPLAN ’86 Symposium on Compiler Construction, 1986.
[16] F. Allen, M. Burke, R. Cytron, J. Ferrante, W. Hsieh, and V. Sarkar, “A Framework for Determining Useful Parallelism,” Proc. 2nd ACM International Conference on Supercomputing, 1988.
[17] J. Ferrante, K.J. Ottenstein, and J.D. Warren, “The Program Dependence Graph and Its Use in Optimization,” ACM Trans. Programming Languages and Systems, 9(3):319–349, July 1987.
[18] M. Girkar and C.D. Polychronopoulos, “Optimization of Data/Control Conditions in Task Graphs,” Proc. 4th Workshop on Languages and Compilers for Parallel Computing, August 1991.
[19] H. Kasahara, T. Fujii, H. Nakayama, and S. Narita, “A Parallel Processing Scheme for the Solution of Sparse Linear Equations Using Static Optimal Multiprocessor Scheduling Algorithms,” Proc. 2nd International Conference on Supercomputing, May 1987.
[20] H. Kasahara, W. Premchaiswadi, M. Tamura, Y. Maekawa, and S. Narita, “Parallel Processing of Sparse Matrix Solution Using Fine Grain Tasks on OSCAR,” Proc. International Conference on Parallel Processing, August 1991.
[21] F.G. Gustavson, W. Liniger, and R.A. Willoughby, “Symbolic Generation of an Optimal Crout Algorithm for Sparse Systems of Linear Equations,” J. ACM, 17:87–109, January 1970.
[22] A.A. Berlin and R.J. Surati, “Exploiting the Parallelism Exposed by Partial Evaluation,” MIT AI Lab., A.I. Memo No. 1414, April 1993.
[23] H. Kasahara and S. Narita, “Practical Multiprocessor Scheduling Algorithms for Efficient Parallel Processing,” IEEE Trans. Computers, C-33(11):1023–1029, November 1984.
[24] E.G. Coffman Jr. (ed.), Computer and Job-shop Scheduling Theory, New York, Wiley, 1976.
[25] M.R. Garey and D.S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, San Francisco, Freeman, 1979.
[26] Y. Kodama, Y. Koumura, M. Sato, H. Sakane, S. Sakai, and Y. Yamaguchi, “EMC-Y: Parallel Processing Element Optimizing Communication and Computation,” Proc. ACM International Conference on Supercomputing, July 1993.
[27] H.G. Dietz, T. Schewederski, M.T. O’Keefe, and A. Zaafrani, “Extended Static Synchronization Beyond VLIW,” Proc. Supercomputing ’89, 1989.
[28] M. O’Keefe and H. Dietz, “Hardware Barrier Synchronization: Static Barrier MIMD,” Proc. 1990 International Conference on Parallel Processing, 1:35–42, August 1990.
[29] D.A. Padua and M.J. Wolfe, “Advanced Compiler Optimizations for Supercomputers,” Communications of the ACM, 29(12):1184–1201, December 1986.
[30] M. Wolfe, Optimizing Supercompilers for Supercomputers, Cambridge, MA, MIT Press, 1989.
[31] U. Banerjee, Dependence Analysis for Supercomputing, Boston, MA, Kluwer Academic, 1988.
[32] W. Pugh, “The OMEGA Test: A Fast and Practical Integer Programming Algorithm for Dependence Analysis,” Proc. Supercomputing ’91, 1991.
[33] P.M. Petersen and D.A. Padua, “Static and Dynamic Evaluation of Data Dependence Analysis,” Proc. International Conference on Supercomputing, June 1993.
[34] S.S. Munshi and B. Simons, “Scheduling Sequential Loops on Parallel Processors,” SIAM J. Comput., 19(4):728–741, August 1990.
[35] D.D. Gajski, D.J. Kuck, and D.A. Padua, “Dependence Driven Computation,” Proc. COMPCON 81 Spring Computer Conference, 168–172, February 1981.
[36] D. Gajski, D. Kuck, D. Lawrie, and A. Sameh, “CEDAR,” Report UIUCDCS-R-83-1123, Department of Computer Science, University of Illinois at Urbana-Champaign, February 1983.
[37] D.J. Kuck, E.S. Davidson, D.H. Lawrie, and A.H. Sameh, “Parallel Supercomputing Today and Cedar Approach,” Science, 231:967–974, February 1986.
[38] H.E. Husmann, D.J. Kuck, and D.A. Padua, “Automatic Compound Function Definition for Multiprocessors,” Proc. 1988 International Conference on Parallel Processing, August 1988.
[39] J.A. Fisher, “The VLIW Machine: A Multiprocessor for Compiling Scientific Code,” IEEE Computer, 17(7):45–53, July 1984.
[40] R.P. Colwell et al., “A VLIW Architecture for a Trace Scheduling Compiler,” IEEE Trans. Computers, C-37(8):967–979, August 1988.
[41] J.R. Ellis, Bulldog: A Compiler for VLIW Architectures, Cambridge, MA, MIT Press, 1985.
[42] J.A. Fisher, “Trace Scheduling: A Technique for Global Microcode Compaction,” IEEE Trans. Computers, C-30(7):478–490, July 1981.
[43] A. Nicolau, “Uniform Parallelism Exploitation in Ordinary Programs,” Proc. 1985 International Conference on Parallel Processing, August 1985.
[44] A. Aiken and A. Nicolau, “Perfect Pipelining: A New Loop Parallelization Technique,” Cornell University Computer Science Report, No. 87-873, October 1987.
[45] N.P. Jouppi, “The Nonuniform Distribution of Instruction-Level and Machine Parallelism and Its Effect on Performance,” IEEE Trans. Computers, C-38(12):1645–1657, December 1989.
[46] C.D. Polychronopoulos, Parallel Programming and Compilers, Boston, Kluwer Academic, 1988.
[47] V. Sarkar, “Determining Average Program Execution Times and Their Variance,” Proc. SIGPLAN ’89, June 1989.
[48] V. Sarkar, Partitioning and Scheduling Parallel Programs for Multiprocessors, Cambridge, MA, MIT Press, 1989.
[49] S. Hiranandani, K. Kennedy, C. Koelbel, U. Kremer, and C. Tseng, “An Overview of the Fortran D Programming System,” Proc. Workshop on Languages and Compilers for Parallel Computing, 18–34, August 1991.
[50] High Performance Fortran Forum, High Performance Fortran Language Specification, Version 1.0, May 1993.
[51] P. Tu and D. Padua, “Automatic Array Privatization,” 6th Annual Workshop on Languages and Compilers for Parallel Computing, 1993.
[52] Z. Li, “Array Privatization for Parallel Execution of Loops,” Proc. 1992 ACM International Conference on Supercomputing, 313–322, 1992.
[53] R. Eigenmann, J. Hoeflinger, G. Jaxon, Z. Li, and D. Padua, “Restructuring Fortran Programs for Cedar,” International Conference on Parallel Processing, 1:57–66, 1991.
[54] J. Li and M. Chen, “Generating Explicit Communication from Shared-Memory Program References,” Proc. Supercomputing ’90, 865–876, 1990.
[55] M. Gupta and P. Banerjee, “Demonstration of Automatic Data Partitioning Techniques for Parallelizing Compilers on Multicomputers,” IEEE Trans. Parallel and Distributed Systems, 3(2):179–193, 1992.
[56] J.M. Anderson and M.S. Lam, “Global Optimizations for Parallelism and Locality on Scalable Parallel Machines,” Proc. SIGPLAN ’93 Conference on Programming Language Design and Implementation, 112–125, 1993.
[57] L. Kipp, “Perfect Benchmarks Documentation Suite 1,” CSRD, University of Illinois at Urbana-Champaign, 1993.
© 1995 Springer Science+Business Media Dordrecht
Kasahara, H., Honda, H., Aida, K., Okamoto, M., Yoshida, A., Ogata, W. (1995). OSCAR Fortran Multigrain Compiler. In: Bic, L.F., Nicolau, A., Sato, M. (eds) Parallel Language and Compiler Research in Japan. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-2269-0_11
Print ISBN: 978-1-4613-5957-9
Online ISBN: 978-1-4615-2269-0