Compiling for the Cydra

The Journal of Supercomputing

Abstract

The Cydra 5 is a VLIW minisupercomputer with hardware designed to accelerate a broad class of inner loops, presenting unique challenges to its compilers. We discuss the organization of its Fortran/77 compiler and several of the key approaches developed to fully exploit the hardware. These include the intermediate representation used; the preparation, overlapped scheduling, and register allocation of inner loops; the speculative execution model used to control global code motion; and the machine model and local instruction scheduling approach.

Cite this article

Dehnert, J.C., Towle, R.A. Compiling for the Cydra. J Supercomput 7, 181–227 (1993). https://doi.org/10.1007/BF01205184
