Abstract

Real-time signal processing requires fast computation of inner products. Distributed arithmetic is a method of inner product computation that uses table lookup and addition in place of multiplication. Distributed arithmetic has previously been shown to produce novel and seemingly efficient architectures for a variety of signal processing computations; however, the methods of design, analysis, and comparison have been ad hoc. We propose a systematic method for synthesizing optimal VLSI architectures using distributed arithmetic.
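
As an illustration of the table-lookup-and-add idea described above, the following is a minimal software sketch, not the authors' architecture. It assumes fixed integer coefficients and inputs that fit in `bits`-bit two's complement; the function names `build_table` and `da_inner_product` are hypothetical. Bit j of every input forms an address into a table of precomputed coefficient sums, and the looked-up values are accumulated with weight 2^j (negative weight for the sign bit).

```python
def build_table(coeffs):
    """Precompute, for every K-bit address, the sum of the coefficients
    whose address bit is 1 (2**K entries for K coefficients)."""
    K = len(coeffs)
    return [sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
            for addr in range(1 << K)]


def da_inner_product(coeffs, xs, bits):
    """Inner product sum(c * x) using only lookups, shifts and additions.

    Assumes each x fits in `bits`-bit two's complement (an assumption of
    this sketch, not something stated in the abstract).
    """
    table = build_table(coeffs)
    mask = (1 << bits) - 1
    acc = 0
    for j in range(bits):
        # Gather bit j of every input word into one table address.
        addr = 0
        for k, x in enumerate(xs):
            addr |= (((x & mask) >> j) & 1) << k
        term = table[addr] * (1 << j)
        # The most significant bit carries negative weight in two's complement.
        acc += -term if j == bits - 1 else term
    return acc


coeffs = [3, -1, 4, 2]
xs = [5, -7, 2, -8]
result = da_inner_product(coeffs, xs, bits=4)
direct = sum(c * x for c, x in zip(coeffs, xs))
```

Here `result` and `direct` agree, while the distributed-arithmetic version performs one lookup and one addition per input bit position instead of one multiplication per input word.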

A partition of the inner product computation at the word and bit level produces a computation consisting of lookups and additions. We study two classes of algorithms to implement this computation, regular iterative algorithms and tree algorithms, each of which can be expressed in the form of a dependency graph. We use linear and nonlinear maps to assign computations to processors in space and time. Expressions are developed for the area, latency, period, and arithmetic error for a particular partition and space/time map of the dependency graph. We use these expressions to formulate a constrained optimization problem over a large class of architectures. We compare distributed arithmetic with more conventional methods for inner product computation and show how area, latency, and period may be traded off while maintaining constant error.
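
One way to picture a word-level partition is sketched below, under assumptions of this example only: the input words are split into groups of `group` words, each group gets its own lookup table, and the group results are added for every bit position. The function name `partitioned_da` and the use of total table words as a rough stand-in for memory area are hypothetical and are not the area expressions developed in the paper.

```python
def partitioned_da(coeffs, xs, bits, group):
    """Distributed-arithmetic inner product with one table per group of words.

    Returns (inner product, total number of table words). Inputs are assumed
    to fit in `bits`-bit two's complement.
    """
    mask = (1 << bits) - 1
    # One table per group: 2**len(chunk) precomputed coefficient sums each.
    tables = []
    for start in range(0, len(coeffs), group):
        chunk = coeffs[start:start + group]
        tables.append([sum(c for k, c in enumerate(chunk) if (addr >> k) & 1)
                       for addr in range(1 << len(chunk))])
    acc = 0
    for j in range(bits):
        partial = 0
        for g, table in enumerate(tables):
            addr = 0
            for k, x in enumerate(xs[g * group:(g + 1) * group]):
                addr |= (((x & mask) >> j) & 1) << k
            partial += table[addr]  # extra additions replace table size
        # Sign bit has negative weight in two's complement.
        weight = -(1 << j) if j == bits - 1 else (1 << j)
        acc += weight * partial
    return acc, sum(len(t) for t in tables)


coeffs = [3, -1, 4, 2]
xs = [5, -7, 2, -8]
full_value, full_words = partitioned_da(coeffs, xs, bits=4, group=4)
split_value, split_words = partitioned_da(coeffs, xs, bits=4, group=2)
```

Both partitions return the same inner product, but the single table holds 16 words while the two smaller tables hold 8 in total, at the cost of one more addition per bit position. This is a toy version of the kind of trade-off the abstract describes.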

Additional information

This work was supported by Ball Aerospace, Boulder, CO and by the Office of Naval Research, Electronics Branch, Arlington, VA under contract ONR 89-J-1070.

About this article

Cite this article

Burleson, W.P., Scharf, L.L. A VLSI design methodology for distributed arithmetic. J VLSI Sign Process Syst Sign Image Video Technol 2, 235–252 (1991). https://doi.org/10.1007/BF00925468
