Mapping a single-assignment language onto the warp systolic array

  • Thomas Gross
  • Alan Sussman
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 274)


Single-assignment languages offer the potential to efficiently program parallel processors. This paper discusses issues that arise in mapping SISAL programs onto the Warp array, a linear systolic array in use at Carnegie Mellon. A Warp machine with ten cells can deliver up to 100 million floating point operations per second.

The paper begins with a discussion of systolic arrays as targets for single-assignment languages and the suitability of the Warp machine for this purpose. Systolic arrays can take advantage of both large-grain parallelism and fine-grain parallelism. The communication bandwidth of the systolic array gives the translator great flexibility in mapping a SISAL program onto the linear array.

We present two principal methods to exploit parallelism on Warp: data partitioning and pipelining. Data partitioning is effective for local computations that depend on only a small neighborhood of values. Since SISAL allows the specification of array sizes at run-time, we have to provide static and dynamic methods for data partitioning. Many operations on the SISAL stream data type can be parallelized as a special case of dynamic data partitioning. Pipelining allows the overlapping of different stages of a computation or of function invocations. This method is well suited for Warp since the systolic array has high inter-cell communication bandwidth, which makes it possible to send large data sets to the next processor in a computation pipeline without performance degradation.
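
As a rough illustration of the data partitioning idea, the following Python sketch splits the rows of a matrix into contiguous blocks, one per cell, and lets each cell compute its share of a matrix product independently. This is not the translator described in the paper: the function names (`partition_rows`, `cell_matmul`, `partitioned_matmul`) are hypothetical, the cells are simulated by an ordinary sequential loop, and the block sizes are computed at call time, which loosely corresponds to the dynamic case where array sizes are known only at run-time.

```python
def partition_rows(n_rows, n_cells):
    """Split range(n_rows) into n_cells contiguous blocks.

    Block sizes differ by at most one row. Hypothetical helper,
    not taken from the paper.
    """
    base, extra = divmod(n_rows, n_cells)
    blocks, start = [], 0
    for cell in range(n_cells):
        size = base + (1 if cell < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def cell_matmul(a_rows, b):
    """Work assigned to one cell: multiply its rows of A by all of B."""
    n_inner, n_cols = len(b), len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(n_inner)) for j in range(n_cols)]
        for row in a_rows
    ]


def partitioned_matmul(a, b, n_cells=10):
    """Simulate the cells one after another and concatenate their results."""
    result = []
    for block in partition_rows(len(a), n_cells):
        result.extend(cell_matmul([a[i] for i in block], b))
    return result
```

Each block depends only on its own rows of the first matrix (plus the shared second matrix), so the blocks could in principle be computed by different cells at the same time; the sequential loop here only stands in for that.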

We use matrix multiplication and a relaxation algorithm, respectively, as examples to illustrate the data partitioning and pipeline models for mapping SISAL programs onto the Warp array.
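
The pipeline model can be sketched in a similar spirit. In the Python sketch below, each stage stands in for one cell performing a single sweep of a one-dimensional relaxation (each interior value is replaced by the average of its two neighbors) and streaming its output to the next stage. The abstract does not specify the relaxation algorithm used in the paper, so the 1-D averaging scheme, the fixed boundary values, and the names `relax_stage` and `relax_pipeline` are all assumptions made for illustration; Python generators merely model the cell-to-cell data flow on a single thread.

```python
def relax_stage(stream):
    """One pipeline stage: a single 1-D relaxation sweep over a stream.

    Interior values become the average of their two neighbors;
    the first and last values pass through unchanged.
    """
    it = iter(stream)
    try:
        left = next(it)
    except StopIteration:
        return
    yield left  # left boundary is fixed
    try:
        mid = next(it)
    except StopIteration:
        return
    for right in it:
        # The value for `mid` can be emitted as soon as `right` arrives.
        yield (left + right) / 2.0
        left, mid = mid, right
    yield mid  # right boundary is fixed


def relax_pipeline(values, n_stages):
    """Chain n_stages sweeps; stage k consumes the output of stage k-1."""
    stream = iter(values)
    for _ in range(n_stages):
        stream = relax_stage(stream)
    return list(stream)
```

Because each stage emits a value as soon as the next input arrives, a later stage can begin its sweep while earlier stages are still working on the rest of the data, which is the overlap that pipelining is meant to exploit.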


Keywords: Systolic Array; Lawrence Livermore National Laboratory; Parallel Loop; Data Flow Graph; Dataflow Graph



Copyright information

© Springer-Verlag Berlin Heidelberg 1987

Authors and Affiliations

  • Thomas Gross (1)
  • Alan Sussman (1)
  1. Department of Computer Science, Carnegie Mellon University, Pittsburgh
