Abstract

Wave pipelining is a design methodology that can increase the clock frequency of digital systems. Also known as maximum-rate pipelining, it has long been considered a technique for approaching the physical speed limit of a digital circuit. Unlike conventional pipelining, wave pipelining does not require internal clocked elements to increase throughput. Instead, internal computations are synchronized by balancing the inherent RC delays of the combinational logic elements, which allows circuits to be pipelined at a very fine grain. In this article, we describe the design of a 16×16 wave-pipelined multiplier in a 1.0 μm CMOS process. The multiplier is designed using conventional static CMOS technology. Simulation results show a speedup of about 7× over a nonpipelined implementation.
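The role of delay balancing can be illustrated with the simplified timing bound commonly associated with maximum-rate pipelining: without internal registers, the clock period is limited by the spread between the longest and shortest path delays rather than by the longest delay alone. The Python sketch below is only an illustration under that assumption. The function names, the single lumped `overhead` term (standing in for setup time, hold time, and clock skew), and the delay values are hypothetical; they are not taken from the article and do not model the multiplier it describes.

```python
def conventional_min_period(d_max, overhead):
    """Unpipelined logic between two registers: the clock period must
    cover the longest combinational path plus register overhead."""
    return d_max + overhead


def wave_min_period(d_max, d_min, overhead):
    """Simplified wave-pipelining bound (an assumption of this sketch):
    a new wave may be launched once it can no longer catch up with the
    previous one, so the period is limited by the delay spread."""
    if d_min > d_max:
        raise ValueError("d_min must not exceed d_max")
    return (d_max - d_min) + overhead


def speedup(d_max, d_min, overhead):
    """Ratio of the conventional period to the wave-pipelined period."""
    return conventional_min_period(d_max, overhead) / wave_min_period(
        d_max, d_min, overhead
    )


if __name__ == "__main__":
    # Hypothetical delays in nanoseconds, chosen only for illustration.
    d_max, d_min, overhead = 20.0, 15.0, 1.0
    print("conventional period:", conventional_min_period(d_max, overhead))
    print("wave-pipelined period:", wave_min_period(d_max, d_min, overhead))
    print("speedup:", speedup(d_max, d_min, overhead))
```

In this model, narrowing the gap between `d_min` and `d_max` shrinks the achievable period toward the fixed overhead, which is why balancing path delays is central to the technique.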

Cite this article

Klass, F., Flynn, M.J. & Van De Goor, A.J. Fast multiplication in VLSI using wave pipelining techniques. Journal of VLSI Signal Processing 7, 233–248 (1994). https://doi.org/10.1007/BF02409400
