Multi-thread code generation for dataflow architectures from non-strict programs
This paper presents a new style of code generation for dataflow architectures, based on a view of such architectures as general multi-threaded von Neumann machines. Whereas the traditional picture of dataflow object code consists of tokens flowing along the arcs of a dataflow graph, the multi-threaded style treats a linear sequence of dataflow instructions as a sequential thread. Within a thread, data is passed by imperatively reading and writing an activation frame associated with a procedure invocation, just as in conventional architectures. Also, advanced dataflow architectures like Monsoon provide specific support for sequential threads, in the form of general registers for operand storage within a thread. Between threads, values are also communicated via the activation frame. The generation and synchronization of tokens, which in traditional dataflow is the primary means of communicating data, is relegated here to a purely control flow role, spawning new threads and gating their initiation, but themselves carrying no data.
Our results show that in many cases, the multi-thread style of code generation results in fewer total machine cycles to execute a given program as compared to the traditional dataflow style. Surprisingly, this remains true even if the general registers are not employed. The reason the multi-thread style is so successful is that by separating data flow from control flow, redundant control flow can be eliminated. In other words, fewer forking (token creation) and joining (token matching) operations are required. Central to eliminating redundant control flow is the compiler's ability to discover sequential threads that may be scheduled at compile time. While this is inherently difficult when starting from a non-strict language such as Id, even small threads have a beneficial effect. The most dramatic performance improvement comes about in compiling Id loops, where the language semantics specifies a greater degree of strictness, resulting in larger threads.
We demonstrate the comparative performance between the multi-thread and traditional dataflow styles of code generation through empirical observations on the Monsoon hardware, and also through a detailed analytic examination of the code generated under each paradigm.
Unable to display preview. Download preview PDF.
- [ACM88]Arvind, D. E. Culler, and G. K. Maa. Assessing the benefits of fine-grained parallelism in dataflow programs. International Journal of Supercomputer Applications, 2(3):10–36, 1988.Google Scholar
- [AN90]Arvind and R. S. Nikhil. Executing a program on the MIT tagged-token dataflow architecture. IEEE Transactions on Computers, 39(3):300–318, March 1990.Google Scholar
- [ANP86]Arvind, R. S. Nikhil, and K. K. Pingali. I-structures: Data structures for parallel computing. In Graph Reduction, volume 279 of Lecture Notes in Computer Science, pages 336–369. Springer-Verlag, October 1986.Google Scholar
- [ASU86]A. V. Aho, R. Sethi, and J. D. Ullman. Compilers: Principles, Techniques, and Tools. Addison-Wesley, Reading MA, 1986.Google Scholar
- [BHY88]A. Bloss, P. Hudak, and J. Young. Code optimizations for lazy evaluation. Lisp and Symbolic Computation, 1(2):147–164, September 1988.Google Scholar
- [CA88]D. E. Culler and Arvind. Resource requirements of dataflow programs. In Proceedings of the 15th Annual International Symposium on Computer Architecture, pages 141–150. IEEE, June 1988.Google Scholar
- [CSS+91]D. E. Culler, A. Sah, K. E. Schauser, T. von Eicken, and J. Wawrzynek. Fine-grain parallellism with minimal hardware support: A compiler-controlled threaded abstract machine. In Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 164–175. Association for Computing Machinery, April 1991.Google Scholar
- [Eka86]K. Ekanadham. Multi-tasking on a dataflow-like architecture. Research Report RC 12307, IBM T. J. Watson Research Center, Hawthorne NY, November 1986.Google Scholar
- [Ian88a]R. A. Iannucci. A dataflow/von Neumann hybrid architecture. Technical Report TR-418, Massachusetts Institute of Technology Laboratory for Computer Science, Cambridge MA, May 1988.Google Scholar
- [Ian88b]R. A. Iannucci. Toward a dataflow/von Neumann hybrid architecture. In Proceedings of the 15th Annual International Symposium on Computer Architecture, pages 131–140. IEEE, June 1988.Google Scholar
- [Joh84]T. Johnsson. Efficient compilation of lazy evaluation. ACM SIGPLAN Notices, 19(6):58–69, June 1984. (Proceedings of the SIGPLAN 84 Symposium on Compiler Construction).Google Scholar
- [NA89]R. S. Nikhil and Arvind. Can dataflow subsume von Neumann computing? In Proceedings of the 16th Annual International Symposium on Computer Architecture, pages 262–272. IEEE, June 1989.Google Scholar
- [Nik90]R. S. Nikhil. Id version 90.0 reference manual. Computation Structures Group Memo 284-1, Massachusetts Institute of Technology Laboratory for Computer Science, Cambridge MA, September 1990.Google Scholar
- [PC90]G. M. Papadopoulos and D. E. Culler. Monsoon: an explicit token store architecture. In Proceedings of the 17th Annual International Symposium on Computer Architecture, pages 82–91. IEEE, 1990.Google Scholar
- [PT91]G. M. Padadopoulos and K. R. Traub. Multithreading: A revisionist view of dataflow architectures. In Proceedings of the 18th Annual International Symposium on Computer Architecture, pages 342–351. IEEE, May 1991.Google Scholar
- [SCvE91]K. E. Schauser, D. E. Culler, and T. von Eicken. Compiler-controlled multithreading for lenient parallel languages. In Functional Programming Languages and Computer Architecture, 1991. (To Appear).Google Scholar
- [Ste89]K. Steele. An i-structure memory controller. Master's thesis, Massachusetts Institute of Technology, Cambridge MA, December 1989.Google Scholar
- [Tra86]K. R. Traub. A compiler for the MIT tagged-token dataflow architecture. Technical Report TR-370, Massachusetts Institute of Technology Laboratory for Computer Science, Cambridge MA, August 1986.Google Scholar
- [Tra89]K. R. Traub. Compilation as partitioning: A new approach to compiling non-strict functional languages. In Functional Programming Languages and Computer Architecture, pages 75–88. Association for Computing Machinery, September 1989.Google Scholar
- [Tra91]K. R. Traub. Implementation of Non-Strict Functional Programming Languages. Pitman Publishing, London, 1991. Also published by MIT Press, Cambridge MA.Google Scholar