Skip to main content
Log in

AxRSU-\(2^{m}\): Higher-Order m-Bit Approximate Encoders for Radix-\(2^{m}\) Squarer Units

  • Published:
Circuits, Systems, and Signal Processing Aims and scope Submit manuscript

Abstract

Approximate computing has emerged as a design alternative to enhance design efficiency by capitalizing on the inherent error resilience observed in numerous applications. Various error-resilient and compute-intensive applications, such as signal, image and video processing, computer vision, and supervised machine learning, necessitate dedicated hardware accelerators for mean squared error estimation during runtime. In these application domains, using efficient arithmetic operators, particularly a squarer unit, represents one of the most effective strategies for low-power design. This work introduces an approximate Radix-\(2^{m}\) squarer unit, denoted as AxRSU-\(2^{m}\). The proposed squarer unit employs m-bit approximate encoders to execute operations on m-bit data concurrently. The AxRSU-\(2^{m}\) under consideration explores encoders with m equal to 2 (AxRSU-4), 3 (AxRSU-8), and 4 (AxRSU-16). These approximate encoders exhibit low complexity and diminish the necessary partial products operating on m bits simultaneously, thereby substantially enhancing energy efficiency and reducing circuit area in the AxRSU-\(2^{m}\). To illustrate the trade-off between error and quality in the AxRSU-\(2^{m}\), we apply it to an SSD (sum squared difference) hardware accelerator designed for video processing, with a square-accumulate serving as a case study. Our findings reveal a novel Pareto front, presenting eight optimal AxRSU-\(2^{m}\) solutions that achieve accuracy levels ranging from 3.76 to 75.53%. These solutions yield energy savings ranging from 46.20 to 95.57% and circuit area reductions ranging from 37.68 to 66.73%.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Algorithm 1
Fig. 10
Fig. 11

Similar content being viewed by others

References

  1. A. Banerjee, D. Kumar, A New Squarer design with reduced area and delay, in 19th International Symposium on VLSI Design and Test (VDAT), Avignon (2016), pp. 205–214

  2. S. Bui, J. Stine, Additional optimizations for parallel squarer units, in IEEE International Symposium on Circuits and Systems (ISCAS), Melbourne VIC, Australia (2014)

  3. F. Bossen, Common test conditions and software configurations, in JCT-VC Document No. L1100 (2013)

  4. Y.-H. Chen, Area-efficient fixed-width squarer with dynamic error-compensation circuit. IEEE Trans. Circuits Syst. II Express Briefs 62(9), 851–855 (2015)

    Google Scholar 

  5. L. Dadda, Some schemes for parallel multipliers. Alta Frequenza (1965)

  6. G. Gillani, M. Hanif, M. Krone, S. Gerez, M. Shafique, A. Kokkeler, SquASH: approximate square-accumulate with self-healing. IEEE Access 6, 49112–49128 (2018)

    Article  Google Scholar 

  7. V. Guidotti, G. Paim, L.M. Rocha, E. Costa, S. Almeida, S. Bampi, Power-efficient approximate Newton–Raphson integer divider applied to NLMS adaptive filter for high-quality interference cancelling. Circuits Syst. Signal Process. 39, 5729–5757 (2020)

    Article  Google Scholar 

  8. M.B. Hisham, S.N. Yaakob, R.A.A. Raof, A.A. Nazren, N.M.W. Embedded, Template matching using sum of squared difference and normalized cross correlation, in IEEE Student Conference on Research and Development (SCOReD) (2015)

  9. HM. HEVC Test Model (HM) v. 16.7 (2023). http://hevc.hhi.fraunhofer.de

  10. R. Jaikumar, M. Karpagam, L. Raju, A novel approach to implement high speed squaring circuit using ancient Vedic mathematics techniques. Int. J. Appl. Eng. Res. 10, 1–6 (2015)

    Google Scholar 

  11. H. Jiang, C. Liu, F. Lombardi, J. Han, Low-power approximate unsigned multipliers with configurable error recovery. IEEE Trans. Circuits Syst. I Regul. Pap. 66(1), 189–202 (2019)

    Article  Google Scholar 

  12. C. Kommu, D. Rani, High performance 3-2 compressors architectures for high speed multipliers, in International Conference on Smart Systems and Inventive Technology (ICSSIT) (2018)

  13. A. Liddicoat, M. Flynn, Parallel square and cube computations. Computer Systems Laboratory, Department of Electrical Engineering Stanford University (2014)

  14. G. Paim, G. Zervakis, G. Pahwa, Y.S. Chauhan, E.A.C. da Costa, S. Bampi, J. Henkel, H. Amrouch, On the resiliency of NCFET circuits against voltage over-scaling. IEEE Trans. Circuits Syst. I Regul. Pap. 68(4), 1481–1492 (2021)

    Article  Google Scholar 

  15. G. Paim, L.M.G. Rocha, H. Amrouch, E.A.C. da Costa, S. Bampi, J. Henkel, A cross-layer gate-level-to-application co- simulation for design space exploration of approximate circuits in HEVC video encoders. IEEE Trans. Circuits Syst. Video Technol. 30(10), 3814–3828 (2020)

    Article  Google Scholar 

  16. K. Reddy, M. Vasantha, Y. Kumar, D. Dwivedi, Design of approximate booth squarer for error-tolerant computing. IEEE Trans. Very Large Scale Integr. VLSI Syst. 28(5), 1230–1241 (2020)

    Article  Google Scholar 

  17. L. Rocha, G. Paim, R. Ferreira, E. Costa, S. Bampi, Framework-based arithmetic core generation to explore ASIC-based parallel binary multipliers, in 2017 24th IEEE International Conference on Electronics, Circuits and Systems (ICECS) (2017), pp. 478–481

  18. L. Rocha, G. Paim, G.M. Santana, E.A.C. da Costa, S. Bampi, Framework-based arithmetic datapath generation to explore parallel binary multipliers. J. Integr. Circuits Syst. 15(3), 1–10 (2020)

    Article  Google Scholar 

  19. M. da Rosa, G. Paim, L.M.G. Rocha, E.A.C. da Costa, S. Bampi, The Radix-2m Squared Multiplier, in 2020 27th IEEE International Conference on Electronics, Circuits and Systems (ICECS) (2020), pp. 1–4

  20. M. da Rosa, E.A. da Costa, L.G. Rocha, G. Paim, S. Bampi, Energy-efficient VLSI squarer unit with optimized Radix-2m multiplication logic. Circuits Syst. Signal Process. 1–25 (2022)

  21. M. da Rosa, G. Paim, J. Castro-Godínez, E.A.C. Da Costa, R.I. Soares, S. Bampi, AxRSU: approximate Radix-4 squarer unit, in 2022 IEEE International Symposium on Circuits and Systems (ISCAS) (2022), pp. 1655–1659

  22. M. da Rosa, G. Paim, P.L.D. Costa, E.A.C.D. Costa, R.I. Soares, S. Bampi, AxPPA: approximate parallel prefix adders. IEEE Trans. Very Large Scale Integr. VLSI Syst. 31(1), 17–28 (2023)

    Article  Google Scholar 

  23. M. da Rosa, G. Paim, L.M. Rocha, E.A. da Costa, S. Bampi, Exploring efficient adder compressors for power-efficient sum of squared differences design, in 2020 27th IEEE International Conference on Electronics, Circuits and Systems (ICECS) (2020), pp. 1–4

  24. S. Salvini, S.J. Wijnholds, Fast gain calibration in radio astronomy using alternating direction implicit methods: analysis and applications. Astron. Astrophys. 571, A97 (2014)

    Article  ADS  Google Scholar 

  25. H. Seidel, M.M.A. da Rosa, G. Paim, E.A.C. da Costa, S.J.M. Almeida, S. Bampi, Approximate pruned and truncated Haar discrete wavelet transform VLSI hardware for energy-efficient ECG signal processing. IEEE Trans. Circuits Syst. I Regul. Pap. 68(5), 1814–1826 (2021)

    Article  MathSciNet  Google Scholar 

  26. B. Shao, P. Li, Array-based approximate arithmetic computing: a general model and applications to multiplier and squarer design. IEEE Trans. Circuits Syst. I Regul. Pap. 62(4), 1090 (2015)

    Article  MathSciNet  Google Scholar 

  27. D. Sharath, G. Devaraju, C. Kavitha, Optimization and implementation of parallel squarer. IJRET Int. J. Res. Eng. Technol.

  28. R. Sharma, M. Kaur, G. Singh, Design and FPGA implementation of optimized 32-bit Vedic multiplier and square architectures, in International Conference on Industrial Instrumentation and Control (ICIC) (2015)

  29. B. Silveira, G. Paim, B. Abreu, M. Grellert, C.M. Diniz, E.A.C. da Costa, S. Bampi, Power-efficient sum of absolute differences hardware architecture using adder compressors for integer motion estimation design. IEEE Trans. Circuits Syst. I Regul. Pap. 64(12), 3126–3137 (2017)

    Article  Google Scholar 

  30. P. Stanley-Marbell, A. Alaghi, M. Carbin, E. Darulova, L. Dolecek, A. Gerstlauer, G. Gillani, D. Jevdjic, T. Moreau, M. Cacciotti, A. Daglis, N.E. Jerger, B. Falsafi, S. Misailovic, A. Sampson, D. Zufferey, Exploiting errors for efficiency: a survey from circuits to applications. ACM Comput. Surv. 53(3), 1–39 (2020)

    Article  Google Scholar 

  31. A. Strollo, E. Napoli, D. De Caro, N. Petra, G.D. Meo, Comparison and extension of approximate 4–2 compressors for low-power approximate multipliers. IEEE Trans. Circuits Syst. I Regul. Pap. 67(9), 3021–3034 (2020)

    Article  MathSciNet  Google Scholar 

  32. Z. Tasoulas, G. Zervakis, I. Anagnostopoulos, H. Amrouch, J. Henkel, Weight-oriented approximation for energy-efficient neural network inference accelerators. IEEE Trans. Circuits Syst. I Regul. Pap. 67(12), 4670–4683 (2020)

    Article  Google Scholar 

  33. K. Tsai, Y.-J. Chang, C.-H. Wang, C.-T. Chiang, Accuracy-configurable radix-4 adder with a dynamic output modification scheme. IEEE Trans. Circuits Syst. I Regul. Pap. 68(8), 3328–3336 (2021)

    Article  Google Scholar 

  34. C. Wallace, A suggestion for fast multiplier. IEEE Trans. Electron. Comput. (1964)

  35. G. Zervakis, H. Saadat, H. Amrouch, A. Gerstlauer, S. Parameswaran, J. Henkel, Approximate computing for ML: state-of-the-art, challenges and visions, in 2021 26th Asia and South Pacific Design Automation Conference (ASP-DAC) (2021), pp. 189–196

Download references

Acknowledgements

This work was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brazil (CAPES) - Finance Code 001. It was also supported by the Brazilian Microelectronics Society (SBMicro) through the Programa de Apoio a Projeto de Circuitos Integrados em Universidades (APCI).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Morgana M. A. da Rosa.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

da Rosa, M.M.A., Paim, G., da Costa, E.A.C. et al. AxRSU-\(2^{m}\): Higher-Order m-Bit Approximate Encoders for Radix-\(2^{m}\) Squarer Units. Circuits Syst Signal Process (2024). https://doi.org/10.1007/s00034-024-02616-2

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s00034-024-02616-2

Keywords

Navigation