Parameterized High Throughput Function Evaluation for FPGAs

Mencer, Oskar; Luk, Wayne

doi:10.1023/B:VLSI.0000008067.31043.35

Oskar Mencer¹ &
Wayne Luk²


Abstract

This paper presents parameterized module generators for pipelined function evaluation using lookup tables, adders, shifters, multipliers, and dividers. We discuss the trade-offs among (1) full lookup tables, (2) bipartite (lookup-add) units, (3) lookup-multiply units, (4) shift-and-add based CORDIC units, and (5) rational approximation. Our treatment focuses mainly on method (3) and briefly covers the background of the other methods. For lookup-multiply units, we provide equations for estimating approximation errors and rounding errors, which are used to parameterize the hardware units. The resources and performance of the resulting design can be estimated from the input parameters. A selection of the compared methods is implemented as part of the current PAM-Blox module generation environment. An example shows that, for data widths up to 20 bits, the lookup-multiply unit produces designs that are competitive with shift-and-add based CORDIC units. Additionally, the lookup-multiply method or rational approximation can produce efficient designs for larger data widths when evaluating functions not supported by CORDIC.
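
As an illustration of the general idea behind method (3), the following Python sketch models a lookup-multiply evaluation in software: the upper bits of a fixed-point input address a table of coefficients, and the lower bits are multiplied by a tabulated slope. This is a minimal sketch under stated assumptions, not the PAM-Blox generator or the paper's error equations. The function names, the choice of sin on [0, 1), the first-order Taylor coefficients, and the parameters `addr_bits` and `frac_bits` are all hypothetical, and the coefficients are kept in floating point, so only approximation error is modeled, not rounding error.

```python
import math


def build_table(f, df, addr_bits):
    """Tabulate (f, f') at the left edge of each of 2**addr_bits segments of [0, 1)."""
    n = 1 << addr_bits
    return [(f(i / n), df(i / n)) for i in range(n)]


def lookup_multiply(x_fixed, table, addr_bits, frac_bits):
    """Approximate f(x) as c0[hi] + c1[hi] * lo for an unsigned fixed-point x in [0, 1).

    x_fixed is an integer holding frac_bits fractional bits (assumed format).
    """
    low_bits = frac_bits - addr_bits
    hi = x_fixed >> low_bits               # upper bits address the table
    lo = x_fixed & ((1 << low_bits) - 1)   # lower bits feed the multiplier
    c0, c1 = table[hi]
    return c0 + c1 * (lo / (1 << frac_bits))


def max_abs_error(f, table, addr_bits, frac_bits):
    """Exhaustively measure the worst-case approximation error over all inputs."""
    scale = 1 << frac_bits
    return max(
        abs(lookup_multiply(x, table, addr_bits, frac_bits) - f(x / scale))
        for x in range(scale)
    )


# Hypothetical parameters: 12-bit input, compared for two table sizes.
FRAC_BITS = 12
err_6 = max_abs_error(math.sin, build_table(math.sin, math.cos, 6), 6, FRAC_BITS)
err_8 = max_abs_error(math.sin, build_table(math.sin, math.cos, 8), 8, FRAC_BITS)
print(f"max error with 64 entries:  {err_6:.3e}")
print(f"max error with 256 entries: {err_8:.3e}")
```

In this model, each extra address bit doubles the table size and roughly quarters the approximation error, since the first-order Taylor remainder scales with the square of the segment width. That is the kind of table-size versus accuracy trade-off a parameterized generator has to resolve for a given data width.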



Author information

Authors and Affiliations

MAXELER Technologies, Florham Park, NJ, 07932, USA
Oskar Mencer
Department of Computing, Imperial College, 180 Queen's Gate, London, SW7 2BZ, UK
Wayne Luk


About this article

Cite this article

Mencer, O., Luk, W. Parameterized High Throughput Function Evaluation for FPGAs. The Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology 36, 17–25 (2004). https://doi.org/10.1023/B:VLSI.0000008067.31043.35


Published: 01 February 2004
Issue Date: January 2004
DOI: https://doi.org/10.1023/B:VLSI.0000008067.31043.35
