An efficient scheme for fine-grain software pipelining

  • Implementation Issues For Novel Architectures And Languages
  • Conference paper
CONPAR 90 — VAPP IV (VAPP 1990, CONPAR 1990)

Abstract

Dataflow software pipelining was proposed as a means of structuring fine-grain parallelism and has been studied mostly under an idealized dataflow architecture model with infinite resources [7]. In this paper, we address some issues of software pipelining under a realistic architecture model with finite resources. A general framework for fine-grain code scheduling in pipelined machines is developed which simultaneously addresses both time and space efficiency issues for loops typically found in general-purpose scientific computations. This scheduling method exploits fine-grain parallelism through a loop optimization technique that performs limited balancing of the program graph at compile time, while instruction-level scheduling is done dynamically at runtime in a data-driven manner.
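
As a rough illustration of what balancing a program graph can mean, the sketch below pads the edges of an acyclic dataflow graph with buffer stages so that every path between two nodes has the same length. It is not the scheme developed in the paper: it assumes unit-latency nodes, assumes all inputs arrive at the same time, balances the graph fully rather than to a limited degree, and makes no attempt to minimize the number of buffers. The function name `balance` and the edge-list representation are hypothetical choices made for this example.

```python
from collections import defaultdict, deque


def balance(edges):
    """Return {(u, v): buffers} so that all paths between two nodes are equally long.

    Assumptions (not from the paper): the graph is acyclic, every node has
    unit latency, and all source nodes fire at the same time.
    """
    succs = defaultdict(list)
    indeg = defaultdict(int)
    nodes = set()
    for u, v in edges:
        succs[u].append(v)
        indeg[v] += 1
        nodes.update((u, v))

    # Longest-path level of each node, visited in topological order (Kahn's algorithm).
    level = {n: 0 for n in nodes}
    ready = deque(n for n in nodes if indeg[n] == 0)
    visited = 0
    while ready:
        u = ready.popleft()
        visited += 1
        for v in succs[u]:
            level[v] = max(level[v], level[u] + 1)
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    if visited != len(nodes):
        raise ValueError("graph has a cycle; this sketch handles acyclic graphs only")

    # An edge that spans k levels needs k - 1 buffer stages to match the longest path.
    return {(u, v): level[v] - level[u] - 1 for u, v in edges}


if __name__ == "__main__":
    # a feeds c both directly and through b, so the direct edge needs one buffer.
    print(balance([("a", "b"), ("b", "c"), ("a", "c")]))
```

In a balanced graph of this kind, successive loop iterations can enter the pipeline back to back without tokens from different iterations meeting at a node. Each buffer stage costs storage, which is the tension between time and space efficiency that the abstract refers to.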

References

  1. A. Aiken and A. Nicolau. Optimal loop parallelization. In Proc. of the 1988 ACM SIGPLAN Conf. on Programming Language Design and Implementation, June 1988.

  2. Arvind and D.E. Culler. Managing resources in a parallel machine. In J.V. Woods, editor, Fifth Generation Computer Architecture, pages 103–121. Elsevier Science Publishers, 1986.

  3. D. Bernstein and I. Gertner. Scheduling expressions on a pipelined processor with a maximal delay of one cycle. ACM Transactions on Programming Languages and Systems, 11(1):57–66, Jan. 1989.

  4. J.B. Dennis and G.R. Gao. An efficient pipelined dataflow processor architecture. In Proc. of the Supercomputing '88 Conf., pages 368–373, Florida, Nov. 1988. IEEE Computer Society and ACM SIGARCH.

  5. J.B. Dennis, G.R. Gao, and K.W. Todd. Modeling the weather with a data flow supercomputer. IEEE Trans. on Computers, C-33(7):592–603, 1984.

  6. G.R. Gao. A pipelined code mapping scheme for static dataflow computers. Technical Report TR-371, Laboratory for Computer Science, MIT, 1986.

  7. G.R. Gao. Aspects of balancing techniques for pipelined data flow code generation. Journal of Parallel and Distributed Computing, 6:39–61, 1989.

  8. G.R. Gao, H.H.J. Hum, and Y.B. Wong. Towards efficient fine-grain software pipelining. In Proc. of the ACM Int'l. Conf. on Supercomputing, Amsterdam, Netherlands, June 1990.

  9. G.R. Gao, R. Tio, and H.H.J. Hum. Design of an efficient dataflow architecture without dataflow. In Proc. of the International Conf. on Fifth-Generation Computers, pages 861–868, Tokyo, Japan, Dec. 1988.

  10. M.R. Garey and D.S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman and Company, 1979.

  11. P.B. Gibbons and S.S. Muchnick. Efficient instruction scheduling for a pipelined architecture. In Proc. of the ACM Symp. on Compiler Construction, pages 11–16, Palo Alto, Calif., June 1986.

  12. J. Hennessy and T. Gross. Postpass code optimization of pipeline constraints. ACM Transactions on Programming Languages and Systems, 5(3):422–448, July 1983.

  13. P.M. Kogge. The Architecture of Pipelined Computers. McGraw-Hill Book Company, New York, 1981.

  14. S.Y. Kung, S.C. Lo, and P.S. Lewis. Timing analysis and optimization of VLSI data flow arrays. In Proc. of the 1986 International Conf. on Parallel Processing, 1986.

  15. M. Lam. Software pipelining: An effective scheduling technique for VLIW machines. In Proc. of the 1988 ACM SIGPLAN Conf. on Programming Language Design and Implementation, pages 318–328, Atlanta, Georgia, June 1988.

  16. J.R. Larus and P.N. Hilfinger. Register allocation in the SPUR Lisp compiler. In Proc. of the ACM Symp. on Compiler Construction, pages 255–263, Palo Alto, Calif., June 1986.

  17. C.V. Ramamoorthy and G.S. Ho. Performance evaluation of asynchronous concurrent systems using Petri nets. IEEE Trans. on Computers, pages 440–448, Sept. 1980.

  18. C. Ramchandani. Analysis of asynchronous concurrent systems. Technical Report TR-120, Laboratory for Computer Science, MIT, 1974.

  19. B.R. Rau and C.D. Glaeser. Some scheduling techniques and an easily schedulable horizontal architecture for high performance scientific computing. In Proc. of the 14th Annual Workshop on Microprogramming, pages 183–198, 1981.

  20. R.F. Touzeau. A FORTRAN compiler for the FPS-164 scientific computer. In Proc. of the ACM SIGPLAN '84 Symp. on Compiler Construction, pages 48–57, June 1984.

Editor information

Helmar Burkhart

Copyright information

© 1990 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gao, G.R., Hum, H.H.J., Wong, Y.B. (1990). An efficient scheme for fine-grain software pipelining. In: Burkhart, H. (ed.) CONPAR 90 — VAPP IV (VAPP 1990, CONPAR 1990). Lecture Notes in Computer Science, vol 457. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-53065-7_147

  • DOI: https://doi.org/10.1007/3-540-53065-7_147

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-53065-7

  • Online ISBN: 978-3-540-46597-3

  • eBook Packages: Springer Book Archive
