Abstract
This paper presents parameterized module generators for pipelined function evaluation using lookup tables, adders, shifters, multipliers, and dividers. We discuss the trade-offs among (1) full-lookup tables, (2) bipartite (lookup-add) units, (3) lookup-multiply units, (4) shift-and-add based CORDIC units, and (5) rational approximation. Our treatment focuses mainly on method (3) and briefly covers the background of the other methods. For lookup-multiply units, we provide equations for estimating approximation errors and rounding errors, which are used to parameterize the hardware units. The resources and performance of the resulting design can be estimated from the input parameters. A selection of the compared methods is implemented as part of the current PAM-Blox module generation environment. An example shows that, for data widths up to 20 bits, the lookup-multiply unit produces designs that are competitive with shift-and-add based CORDIC units. Additionally, the lookup-multiply method or rational approximation can produce efficient designs for larger data widths when evaluating functions not supported by CORDIC.
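To make the lookup-multiply idea in method (3) concrete, the following is a minimal software sketch, not the authors' generator. It assumes a first-order (piecewise-linear) scheme on the interval [0, 1): the high bits of the fixed-point input index a table holding an offset and a slope, and the low bits are multiplied by the slope. All names and parameters (`build_tables`, `lookup_multiply`, `max_abs_error`, `index_bits`, `frac_bits`) are hypothetical, and the exhaustive error scan stands in for the error equations the paper refers to, which are not reproduced in this abstract.

```python
import math


def build_tables(f, index_bits, frac_bits):
    """Build per-segment offset and slope tables for f on [0, 1).

    Assumption: each segment is approximated by the straight line through
    its two endpoints, with values rounded to frac_bits fractional bits.
    """
    segments = 1 << index_bits
    scale = 1 << frac_bits
    offsets, slopes = [], []
    for i in range(segments):
        y0 = f(i / segments)
        y1 = f((i + 1) / segments)
        offsets.append(round(y0 * scale))
        # Slope is stored as the rise across one whole segment.
        slopes.append(round((y1 - y0) * scale))
    return offsets, slopes


def lookup_multiply(x_fixed, in_bits, index_bits, offsets, slopes):
    """Evaluate with one table read, one multiply, one shift, and one add."""
    low_bits = in_bits - index_bits
    index = x_fixed >> low_bits                # high bits select the segment
    low = x_fixed & ((1 << low_bits) - 1)      # low bits: position in segment
    return offsets[index] + ((slopes[index] * low) >> low_bits)


def max_abs_error(f, in_bits, index_bits, frac_bits):
    """Scan every input code and return the worst absolute error."""
    offsets, slopes = build_tables(f, index_bits, frac_bits)
    scale = 1 << frac_bits
    worst = 0.0
    for x_fixed in range(1 << in_bits):
        approx = lookup_multiply(x_fixed, in_bits, index_bits, offsets, slopes)
        exact = f(x_fixed / (1 << in_bits))
        worst = max(worst, abs(approx / scale - exact))
    return worst


if __name__ == "__main__":
    # Compare table sizes for a 12-bit input and 16 fractional output bits.
    for index_bits in (4, 6, 8):
        err = max_abs_error(math.sin, 12, index_bits, 16)
        print(f"index_bits={index_bits}: max abs error = {err:.2e}")
```

In this sketch, increasing `index_bits` shrinks the approximation error at the cost of larger tables, while `frac_bits` bounds the rounding error. That is the kind of trade-off the parameterization described above is meant to expose.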
Cite this article
Mencer, O., Luk, W. Parameterized High Throughput Function Evaluation for FPGAs. The Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology 36, 17–25 (2004). https://doi.org/10.1023/B:VLSI.0000008067.31043.35
DOI: https://doi.org/10.1023/B:VLSI.0000008067.31043.35