Abstract

This paper presents parameterized module-generators for pipelined function evaluation using lookup tables, adders, shifters, multipliers, and dividers. We discuss the trade-offs among (1) full-lookup tables, (2) bipartite (lookup-add) units, (3) lookup-multiply units, (4) shift-and-add based CORDIC units, and (5) rational approximation. Our treatment focuses mainly on method (3) and briefly covers the background of the other methods. For lookup-multiply units, we provide equations for estimating approximation errors and rounding errors, which are used to parameterize the hardware units. The resources and performance of the resulting design can be estimated from the input parameters. A selection of the compared methods is implemented as part of the current PAM-Blox module generation environment. An example shows that the lookup-multiply unit produces competitive designs for data widths up to 20 bits when compared with shift-and-add based CORDIC units. Additionally, the lookup-multiply method or rational approximation can produce efficient designs for larger data widths when evaluating functions not supported by CORDIC.
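As a rough illustration of the lookup-multiply idea, the following Python sketch models a generic first-order scheme: the upper bits of a fixed-point input select a table entry holding an offset and a slope, and the lower bits feed a multiplier. This is not the paper's module generator or its error equations. The function names, the choice of a secant slope per segment, and the use of floating-point table entries (so coefficient and output rounding are ignored) are all assumptions made for the example.

```python
import math

def build_tables(f, index_bits):
    """Hypothetical helper: one (offset, slope) pair per segment of [0, 1)."""
    n = 1 << index_bits
    h = 1.0 / n  # segment width
    offsets = [f(i * h) for i in range(n)]
    # Secant slope across each segment (an assumption; other fits are possible).
    slopes = [(f((i + 1) * h) - f(i * h)) / h for i in range(n)]
    return offsets, slopes

def lookup_multiply(x_fixed, total_bits, index_bits, offsets, slopes):
    """Approximate f(x) for x = x_fixed / 2**total_bits.

    The top index_bits bits address the tables; the remaining low bits
    are the distance from the segment start and feed the multiplier.
    """
    low_bits = total_bits - index_bits
    idx = x_fixed >> low_bits
    low = x_fixed & ((1 << low_bits) - 1)
    dx = low / (1 << total_bits)
    return offsets[idx] + slopes[idx] * dx

def max_abs_error(f, total_bits, index_bits):
    """Exhaustively measure the approximation error over all input codes."""
    offsets, slopes = build_tables(f, index_bits)
    scale = 1 << total_bits
    return max(
        abs(lookup_multiply(x, total_bits, index_bits, offsets, slopes) - f(x / scale))
        for x in range(scale)
    )

if __name__ == "__main__":
    for k in (4, 6, 8):
        print(k, max_abs_error(math.sin, 12, k))
```

For a secant fit, the approximation error on a segment of width h is bounded by h²/8 times the maximum of |f''|, so each extra index bit roughly quarters the error while doubling the table size. That is the kind of trade-off a parameterized generator has to balance against multiplier width.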




Cite this article

Mencer, O., Luk, W. Parameterized High Throughput Function Evaluation for FPGAs. The Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology 36, 17–25 (2004). https://doi.org/10.1023/B:VLSI.0000008067.31043.35


  • DOI: https://doi.org/10.1023/B:VLSI.0000008067.31043.35
