Abstract

Various methods for mapping signal processing algorithms into systolic arrays have been developed in the past few years. In this paper, efficient scheduling techniques are developed for the partitioning problem, i.e., for problems whose size does not match the array size. In particular, scheduling methods are developed for the Locally Parallel-Globally Sequential (LPGS) technique and the Locally Sequential-Globally Parallel (LSGP) technique. The scheduling procedure exploits the fact that, after LPGS and LSGP partitioning, the locality constraints are less stringent, which allows more flexibility in the choice of algorithms and inter-processor communication. A flexible scheduling order is developed that is useful in evaluating the trade-off between execution time and the size of storage buffers. The benefits of the scheduling techniques are illustrated with matrix multiplication and least squares examples.
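As a rough illustration of the two partitioning styles named above, the following sketch assigns a one-dimensional chain of computation nodes to a smaller array of processing elements (PEs). It is not the scheduling procedure of the paper: the function names `lpgs_assign` and `lsgp_assign` are hypothetical, the index space is assumed to be one-dimensional, and data dependencies, communication, and buffer sizes are ignored.

```python
from math import ceil


def lpgs_assign(n_nodes, n_pes):
    """LPGS sketch: each group of n_pes consecutive nodes runs in parallel
    on the whole array, and the groups are processed one after another.

    Returns a list of (pe, step) pairs, one per node index.
    """
    return [(i % n_pes, i // n_pes) for i in range(n_nodes)]


def lsgp_assign(n_nodes, n_pes):
    """LSGP sketch: each PE owns one contiguous block of nodes and runs
    them sequentially, while all PEs work in parallel.

    Returns a list of (pe, step) pairs, one per node index.
    """
    block = ceil(n_nodes / n_pes)  # nodes handled by each PE
    return [(i // block, i % block) for i in range(n_nodes)]


if __name__ == "__main__":
    # Six nodes mapped onto a two-PE array (illustrative sizes only).
    print("LPGS:", lpgs_assign(6, 2))
    print("LSGP:", lsgp_assign(6, 2))
```

In this toy setting both mappings take the same number of steps. They differ in which nodes share a PE, and that in turn determines what must be stored locally and what must be passed between PEs.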

Additional information

This work was supported in part by the UCSD/NSF Integrated Circuits And Systems Research Center and by the Army Research Office under Grant No. DAAL-03-90-G-0095.

About this article

Cite this article

Kuchibhotla, P., Rao, B.D. A methodology for fast scheduling of partitioned systolic algorithms. Journal of VLSI Signal Processing 10, 111–126 (1995). https://doi.org/10.1007/BF02407030

  • DOI: https://doi.org/10.1007/BF02407030
