Abstract
Given n values x_1, x_2, ..., x_n and an associative binary operation ⊗, the prefix problem is to compute x_1 ⊗ x_2 ⊗ ⋯ ⊗ x_i for every i, 1 ≤ i ≤ n. Prefix circuits are combinational circuits that solve the prefix problem. An n-input prefix circuit D with depth d and size s is depth-size optimal if d + s = 2n − 2. In general, a prefix circuit with a small depth is faster than one with a large depth. Among prefix circuits of the same depth, one with a smaller fan-out occupies less area and is faster in a VLSI implementation. This paper addresses the construction of parallel prefix circuits that are depth-size optimal and have both small depth and small fan-out. We construct a depth-size optimal prefix circuit H4 with fan-out 4. It has the smallest depth among all known depth-size optimal prefix circuits with constant fan-out; furthermore, when n ≥ 136, its depth is less than or equal to that of every known depth-size optimal prefix circuit with unlimited fan-out. A lower bound on the size of prefix circuits is also derived. Some properties related to depth-size optimality and size optimality are introduced and used to prove that H4 is depth-size optimal.
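The definitions above can be made concrete with a small sketch. It does not construct H4; it only models the simplest prefix circuit, the serial one, in which each output is obtained from the previous output and the next input, and checks the condition d + s = 2n − 2 for it. The circuit representation (a list of operation nodes referring to earlier node ids) and the function names `serial_prefix_circuit`, `evaluate`, and `depth_and_size` are assumptions made for this illustration, not notation from the paper.

```python
from operator import add


def serial_prefix_circuit(n):
    """Build the serial prefix circuit on n inputs (illustrative only).

    Node ids 0..n-1 are the inputs x_1..x_n. Each operation node is a pair
    (left_id, right_id) and receives the next free id, starting at n.
    Returns (ops, outputs), where outputs[i] is the id of the node that
    carries x_1 ⊗ ... ⊗ x_(i+1).
    """
    ops = []
    outputs = [0]  # the first prefix is the input x_1 itself
    for i in range(1, n):
        ops.append((outputs[-1], i))      # previous prefix ⊗ next input
        outputs.append(n + len(ops) - 1)  # id of the node just added
    return ops, outputs


def evaluate(ops, outputs, xs, op):
    """Evaluate the circuit on inputs xs with the associative operation op."""
    values = list(xs)
    for left, right in ops:
        values.append(op(values[left], values[right]))
    return [values[node] for node in outputs]


def depth_and_size(n, ops):
    """Return (depth, size): longest input-to-node path and number of ops."""
    depth = [0] * n  # inputs sit at level 0
    for left, right in ops:
        depth.append(1 + max(depth[left], depth[right]))
    return max(depth), len(ops)


if __name__ == "__main__":
    n = 8
    ops, outputs = serial_prefix_circuit(n)
    print(evaluate(ops, outputs, range(1, n + 1), add))
    d, s = depth_and_size(n, ops)
    print(d, s, d + s == 2 * n - 2)
```

The serial circuit has depth n − 1 and size n − 1, so it satisfies d + s = 2n − 2, but its depth is the largest possible. A circuit such as H4 aims for the same sum with a much smaller depth.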
Cite this article
Lin, YC., Hsu, YH. & Liu, CK. Constructing H4, a Fast Depth-Size Optimal Parallel Prefix Circuit. The Journal of Supercomputing 24, 279–304 (2003). https://doi.org/10.1023/A:1022084814175