## Abstract

We consider the problem of constructing fast and small parallel prefix adders for non-uniform input arrival times. In modern computer chips, adders with up to hundreds of inputs occur frequently, and they are often embedded into more complex circuits, e.g. multipliers, leading to instance-specific non-uniform input arrival times. Most previous results are based on representing binary carry-propagate adders as parallel prefix graphs, in which pairs of generate and propagate signals are combined using complex gates called prefix gates. Examples of commonly-used adders are constructed based on the Kogge–Stone or Ladner–Fischer prefix graphs. Adders constructed in this model usually minimize the delay in terms of these prefix gates. However, the delay in terms of logic gates can be worse by a factor of two. In contrast, we aim to minimize the delay of the underlying logic circuit directly. We prove a lower bound on the delay of a carry bit computation achievable by any prefix carry bit circuit and develop an algorithm that computes a prefix carry bit circuit with optimum delay up to a small additive constant. Our algorithm improves the running time of a previous dynamic program for constructing a prefix carry bit from \(\mathcal {O}(n^3)\) to \(\mathcal {O}(n \log ^2 n)\) while simultaneously improving the delay and size guarantee, where *n* is the number of bits in the summands. Furthermore, we use this algorithm as a subroutine to compute a full adder in near-linear time, reducing the delay approximation factor of 2 from previous approaches to 1.441 for our algorithm.

## Keywords

Circuit Delay Parallel prefix problem Addition Prefix adder Non-uniform input arrival times## Mathematics Subject Classification

68Q25 65Y04## Notes

### Compliance with Ethical Standards

### Conflict of interest

The authors declare that they have no conflict of interest.

## References

- 1.Choi, Y.: Parallel Prefix Adder Design. Dissertation, University of Texas at Austin (2004)Google Scholar
- 2.Cole, R., Vishkin, U.: Faster optimal parallel prefix sums and list ranking. Inf. Comput.
**81**(3), 334–352 (1989)MathSciNetCrossRefzbMATHGoogle Scholar - 3.Keeter, M., Harris, D.M., Macrae, A., Glick, R., Ong, M., Schauer, J.: Implementation of 32-bit Ling and Jackson adders. In: Proceedings of the Forty Fifth Asilomar Conference on Signals, Systems and Computers, pp. 170–175 (2011)Google Scholar
- 4.Knowles, S.: A family of adders. In: Proceedings of the 15th IEEE Symposium on Computer Arithmetic (ARITH-15), pp. 277–281 (2001)Google Scholar
- 5.Kogge, P.M., Stone, H.S.: A parallel algorithm for the efficient solution of a general class of recurrence equations. IEEE Trans. Comput.
**100**(8), 786–793 (1973)MathSciNetCrossRefzbMATHGoogle Scholar - 6.Ladner, R.E., Fischer, M.J.: Parallel prefix computation. J. ACM
**27**(4), 831–838 (1980)MathSciNetCrossRefzbMATHGoogle Scholar - 7.Oklobdzija, V.G.: Design and analysis of fast carry-propagate adder under non-equal input signal arrival profile. In: Proceedings of the Twenty-Eighth Asilomar Conference on Signals, Systems and Computers, vol. 2, pp. 1398–1401 (1994)Google Scholar
- 8.Rautenbach, D., Szegedy, C., Werber, J.: Delay optimization of linear depth Boolean circuits with prescribed input arrival times. J. Discrete Algorithms
**4**(4), 526–537 (2006)MathSciNetCrossRefzbMATHGoogle Scholar - 9.Rautenbach, D., Szegedy, C., Werber, J.: The delay of circuits whose inputs have specified arrival times. Discrete Appl. Math.
**155**(10), 1233–1243 (2007)MathSciNetCrossRefzbMATHGoogle Scholar - 10.Rautenbach, D., Szegedy, C., Werber, J.: On the cost of optimal alphabetic code trees with unequal letter costs. Eur. J. Comb.
**29**(2), 386–394 (2008)MathSciNetCrossRefzbMATHGoogle Scholar - 11.Roy, S., Choudhury, M., Puri, R., Pan, D.Z.: Towards optimal performance-area trade-off in adders by synthesis of parallel prefix structures. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst.
**33**(10), 1517–1530 (2014)CrossRefGoogle Scholar - 12.Roy, S., Choudhury, M., Puri, R., Pan, D.Z.: Polynomial time algorithm for area and power efficient adder synthesis in high-performance designs. In: Proceedings of the 20th Asia and South Pacific Design Automation Conference (ASP-DAC), pp. 249–254 (2015)Google Scholar
- 13.Weinberger, A., Smith, J.L.: A logic for high-speed addition. Natl. Bur. Stand. Circul.
**591**, 3–12 (1958)Google Scholar - 14.Werber, J., Rautenbach, D., Szegedy, C.: Timing optimization by restructuring long combinatorial paths. In: Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD), pp. 536–543 (2007)Google Scholar
- 15.Zimmermann, R.: Binary Adder Architectures for Cell-Based VLSI and Their Synthesis. Dissertation, Swiss Federal Institute of Technology (ETH) in Zurich (1998)Google Scholar