Compiling for the Cydra 5

Dehnert, James C.; Towle, Ross A.

doi:10.1007/BF01205184

Published: May 1993

Volume 7, pages 181–227 (1993)

The Journal of Supercomputing

Abstract

The Cydra 5 is a VLIW minisupercomputer with hardware designed to accelerate a broad class of inner loops, presenting unique challenges to its compilers. We discuss the organization of its Fortran/77 compiler and several of the key approaches developed to fully exploit the hardware. These include the intermediate representation used; the preparation, overlapped scheduling, and register allocation of inner loops; the speculative execution model used to control global code motion; and the machine model and local instruction scheduling approach.

Author information

Authors and Affiliations

Silicon Graphics Computer Systems, P.O. Box 7311, Mountain View, CA 94039
James C. Dehnert & Ross A. Towle

About this article

Cite this article

Dehnert, J.C., Towle, R.A. Compiling for the Cydra 5. J Supercomput 7, 181–227 (1993). https://doi.org/10.1007/BF01205184

Received: 15 March 1992
Accepted: 15 October 1992
Issue Date: May 1993
DOI: https://doi.org/10.1007/BF01205184
