Constructing H4, a Fast Depth-Size Optimal Parallel Prefix Circuit

The Journal of Supercomputing

Abstract

Given n values x_1, x_2, ..., x_n and an associative binary operation ⊗, the prefix problem is to compute x_1 ⊗ x_2 ⊗ ⋯ ⊗ x_i for 1 ≤ i ≤ n. Prefix circuits are combinational circuits for solving the prefix problem. For any n-input prefix circuit D with depth d and size s, if d + s = 2n − 2, then D is depth-size optimal. In general, a prefix circuit with a small depth is faster than one with a large depth. For prefix circuits with the same depth, a prefix circuit with a smaller fan-out occupies less area and is faster in VLSI implementation. This paper is on constructing parallel prefix circuits that are depth-size optimal with small depth and small fan-out. We construct a depth-size optimal prefix circuit H4 with fan-out 4. It has the smallest depth among all known depth-size optimal prefix circuits with a constant fan-out; furthermore, when n ≥ 136, its depth is less than, or equal to, those of all known depth-size optimal prefix circuits with unlimited fan-out. A size lower bound of prefix circuits is also derived. Some properties related to depth-size optimality and size optimality are introduced; they are used to prove that H4 is depth-size optimal.
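
To make the definitions above concrete, the following is a minimal Python sketch, not the H4 construction itself. It simulates two simple prefix circuits, counts their operation nodes (size) and longest dependency chain (depth), and tests the condition d + s = 2n − 2. The function names `serial_prefix`, `divide_and_conquer_prefix`, and `is_depth_size_optimal` are hypothetical and chosen only for illustration.

```python
from itertools import accumulate


def serial_prefix(xs, op):
    """Serial prefix circuit: output i combines output i-1 with x_i.

    Returns (outputs, depth, size).
    """
    out = list(xs)
    depth_of = [0] * len(out)  # depth of the value currently on each line
    size = 0                   # number of operation nodes used
    for i in range(1, len(out)):
        out[i] = op(out[i - 1], out[i])
        depth_of[i] = max(depth_of[i - 1], depth_of[i]) + 1
        size += 1
    return out, max(depth_of, default=0), size


def divide_and_conquer_prefix(xs, op):
    """Divide-and-conquer prefix circuit with unbounded fan-out.

    At each level, the last value of every left block is combined
    into every position of the block to its right.
    Returns (outputs, depth, size).
    """
    out = list(xs)
    n = len(out)
    depth_of = [0] * n
    size = 0
    span = 1
    while span < n:
        for start in range(span, n, 2 * span):
            src = start - 1  # last line of the left block
            for j in range(start, min(start + span, n)):
                out[j] = op(out[src], out[j])
                depth_of[j] = max(depth_of[src], depth_of[j]) + 1
                size += 1
        span *= 2
    return out, max(depth_of, default=0), size


def is_depth_size_optimal(n, depth, size):
    """Check the abstract's condition d + s = 2n - 2."""
    return depth + size == 2 * n - 2


if __name__ == "__main__":
    # String concatenation is associative but not commutative,
    # so it also checks that operand order is preserved.
    xs = list("abcdefgh")
    concat = lambda a, b: a + b
    for circuit in (serial_prefix, divide_and_conquer_prefix):
        out, d, s = circuit(xs, concat)
        assert out == list(accumulate(xs, concat))
        print(circuit.__name__, d, s, is_depth_size_optimal(len(xs), d, s))
```

For n = 8 the serial circuit has depth 7 and size 7, so d + s = 14 = 2n − 2 and it is depth-size optimal, though slow. The divide-and-conquer circuit has depth 3 and size 12, so d + s = 15 and it is not. This illustrates the trade-off the paper addresses: reducing depth while keeping d + s at 2n − 2 and the fan-out bounded.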

Cite this article

Lin, YC., Hsu, YH. & Liu, CK. Constructing H4, a Fast Depth-Size Optimal Parallel Prefix Circuit. The Journal of Supercomputing 24, 279–304 (2003). https://doi.org/10.1023/A:1022084814175

  • DOI: https://doi.org/10.1023/A:1022084814175
