Implementation on RISC Architectures

  • Richard Tolimieri
  • Myoung An
  • Chao Lu
Part of the Signal Processing and Digital Filtering book series (SIGNAL PROCESS)


A wide variety of DFT and convolution algorithms have been designed to minimize the number of arithmetic operations, especially multiplications. Blahut (1985) [1] offers an excellent survey of many algorithms designed using this methodology. Today, with the rapid advance of VLSI technology and the availability of inexpensive high-speed floating-point processors, a fixed-point addressing operation or a floating-point addition can take effectively the same time as a floating-point multiplication. Some advanced architectures have these functional units working in parallel, so that multiple operations complete together in one or a few cycles. The traditional design strategy of trading multiplications for additions is therefore not only ineffective but can significantly decrease performance.
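The trade-off described above can be illustrated with a complex multiplication, which can be computed with 4 multiplications and 2 additions, or with 3 multiplications and 5 additions. The following is a minimal sketch, not the authors' code: the function names and the per-operation cycle costs are hypothetical, and the cost model is purely serial (it ignores pipelining and parallel functional units, which would further favor the form with fewer total operations).

```python
def cmul_4m2a(a, b, c, d):
    """(a + ib)(c + id) computed with 4 multiplications and 2 additions."""
    real = a * c - b * d
    imag = a * d + b * c
    return (real, imag), {"mul": 4, "add": 2}


def cmul_3m5a(a, b, c, d):
    """Same product with 3 multiplications and 5 additions."""
    k1 = c * (a + b)   # 1 add, 1 mul
    k2 = a * (d - c)   # 1 add, 1 mul
    k3 = b * (c + d)   # 1 add, 1 mul
    return (k1 - k3, k1 + k2), {"mul": 3, "add": 5}  # 2 more adds


def cycles(ops, mul_cost, add_cost):
    """Serial cycle estimate; the costs passed in are assumed, not measured."""
    return ops["mul"] * mul_cost + ops["add"] * add_cost


if __name__ == "__main__":
    _, ops4 = cmul_4m2a(1, 2, 3, 4)
    _, ops3 = cmul_3m5a(1, 2, 3, 4)
    # Hypothetical older machine: multiplication ten times slower than addition.
    print(cycles(ops4, 10, 1), cycles(ops3, 10, 1))
    # Hypothetical RISC-style machine: multiplication and addition cost the same.
    print(cycles(ops4, 1, 1), cycles(ops3, 1, 1))
```

Under the assumed equal costs, the version that saves a multiplication performs more total operations and therefore takes longer, which is the effect the paragraph describes.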


Keywords: Discrete Fourier Transform, Data Cache, Assembly Code, Intel i860, Machine Cycle




References

  [1] Blahut, R.E. (1985), Fast Algorithms for Digital Signal Processing, Addison-Wesley, Reading, MA.
  [2] Bogoch, S., Bason, I., Williams, J., and Russell, M. (1990), “Supercomputers Get Personal,” BYTE Magazine, 231–237.
  [3] Dewar, R.B. and Smosna, M. (1990), Microprocessors: A Programmer’s View, McGraw-Hill Publishing Co., New York.
  [4] Granata, J.A. (1990), The Design of Discrete Fourier Transform and Convolution Algorithms for RISC Architectures, Ph.D. dissertation, the City University of New York.
  [5] Hennessy, J.L. (1984), “VLSI Processor Architecture,” IEEE Trans. Computers C-33, 1221–1246.
  [6] Linzer, E. and Feig, E. (1991), “Implementation of Efficient FFT Algorithms on Fused Multiply-Add Architectures,” to appear.
  [7] Lu, C. (1991), “Implementation of ‘Multiply-Add’ FFT Algorithms for Complex and Real Data Sequences,” Proceedings of IEEE International Conference on Circuits and Systems, Singapore.
  [8] Lu, C., Cooley, J.W., and Tolimieri, R. (1993), “FFT Algorithms for Prime Transform Sizes and Their Implementations on VAX, IBM 3090VF and RS/6000,” IEEE Trans. Signal Processing 41(2), February.
  [9] Lu, C., Cooley, J.W., and Tolimieri, R. (1991), “Variants of the Winograd Multiplicative FFT Algorithms and Their Implementation on RS/6000,” Proceedings ICASSP-91, Toronto.
  [10] Lu, C., An, M., Qian, Z., and Tolimieri, R. (1992), “Small FFT Module Implementation on the Intel i860 Processor,” Proc. ICSPAT, November 2–5, Cambridge, MA.
  [11] Margulis, N. (1990), i860 Microprocessor Architecture, McGraw-Hill Publishing Co., New York.
  [12] Patterson, D.A. (1985), “Reduced Instruction Set Computers,” Communications of the ACM 28(1), 8–21.
  [13] Patterson, D.A. and Sequin, C.H. (1981), “RISC I: A Reduced Instruction Set VLSI Computer,” Proc. 8th Internat. Sympos. Computer Architectures ACM, 443–457.
  [14] Patterson, D.A. and Sequin, C.H. (1982), “A VLSI RISC,” IEEE Computer Mag., September, 8–22.
  [15] Radin, G. (1982), “The 801 Minicomputer,” Computer Architecture News 10, 39–47.
  [16] Stallings, W. (1990), Reduced Instruction Set Computers (RISC), Second Edition, IEEE Computer Society Press.
  [17] Tolimieri, R., An, M., and Lu, C. (1989), Algorithms for Discrete Fourier Transform and Convolutions, Springer-Verlag, New York.
  [18] IBM Journal of Research and Development: Special Issue on IBM RISC System/6000 Processor, June 1990.
  [19] Intel, iPSC/860 Supercomputer Advanced Information Fact Sheet, Intel, 1990.
  [20] AT&T DSP Parallel Processor BT-100 User Manual, AT&T, 1988.

Copyright information

© Springer Science+Business Media New York 1997

Authors and Affiliations

  • Richard Tolimieri (1)
  • Myoung An (2)
  • Chao Lu (3)
  1. Department of Electrical Engineering, City College of CUNY, New York, USA
  2. A.J. Devaney Associates, Allston, USA
  3. Department of Computer and Information Sciences, Towson State University, Towson, USA
