The 4-2 Fused Adder–Subtractor Compressor for Low-Power Butterfly-Based Hardware Architectures

Abstract

Adder compressors have long been a promising way to reduce the power of dedicated hardware architectures, since they perform several additions in parallel by reusing their internal structure. Many signal and visual processing applications are based on discrete transforms and require butterfly-based architectures (i.e., architectures that perform additions and subtractions sharing the same operands). In butterfly-based structures, the savings of state-of-the-art adder compressors are reduced because the hardware must be duplicated. This work extends the results on a new fused adder–subtractor (FAS) 4-2 compressor arithmetic operator. The FAS 4-2 compressor performs the butterfly operation (i.e., addition and subtraction) in a single optimized arithmetic operator that reuses the internal structure, thereby increasing the power efficiency of butterfly-based architectures. We propose fusing the additions and subtractions into a single compact structure and show how the internal logic is shared to perform both operations simultaneously. As a case study of a butterfly-based architecture, we employ a sum of absolute transformed differences (SATD) design based on Hadamard transforms (HTs) of multiple sizes, in which the smallest HTs are reused to implement the largest ones. Synthesis results for a 45 nm CMOS technology show that the proposed FAS 4-2 architecture achieves power and circuit area savings of about 7.1% and 8.4%, respectively, compared with the state-of-the-art circuit, which employs two 4-2 adder compressors. Moreover, the full SATD architecture with the proposed FAS 4-2 compressors improves power efficiency by between 5.8% (against conventional adder compressors) and 10% on average (against the adder operator automatically inferred by the synthesis tool).
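To make the butterfly operation and its role in SATD concrete, the following is a minimal behavioral sketch in Python. It is not the hardware described in the paper and does not model the internal logic of a 4-2 compressor; it only illustrates, in software, that a butterfly produces a sum and a difference from the same two operands, that a Hadamard transform can be built from stages of such butterflies, and that a larger transform is composed of smaller ones. The function names (`butterfly`, `hadamard_1d`, `hadamard_2d`, `satd`) are hypothetical, and the result is left unnormalized, which is an assumption since encoders differ in how they scale SATD.

```python
def butterfly(a, b):
    """One butterfly: an addition and a subtraction on the same operands."""
    return a + b, a - b


def hadamard_1d(values):
    """Unnormalized 1-D Hadamard transform built from butterfly stages.

    Assumes len(values) is a power of two. The output ordering follows
    this particular recursion and is not claimed to match any standard.
    """
    n = len(values)
    if n == 1:
        return list(values)
    half = n // 2
    sums, diffs = [], []
    # Butterfly stage: pair element i with element i + half.
    for i in range(half):
        s, d = butterfly(values[i], values[i + half])
        sums.append(s)
        diffs.append(d)
    # Each half is itself a smaller Hadamard transform, so a size-n
    # transform is composed of two size-n/2 transforms.
    return hadamard_1d(sums) + hadamard_1d(diffs)


def hadamard_2d(block):
    """Apply the 1-D transform to every row, then to every column."""
    rows = [hadamard_1d(row) for row in block]
    cols = [hadamard_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def satd(original, predicted):
    """Sum of absolute transformed differences of two square blocks.

    No normalization factor is applied (an assumption of this sketch).
    """
    diff = [[o - p for o, p in zip(row_o, row_p)]
            for row_o, row_p in zip(original, predicted)]
    return sum(abs(c) for row in hadamard_2d(diff) for c in row)
```

In this sketch, every call to `butterfly` corresponds to one adder and one subtractor fed by the same pair of operands, which is the situation the abstract describes as duplicating hardware when separate adder compressors are used. For example, `satd([[1, 0], [0, 0]], [[0, 0], [0, 0]])` evaluates to 4, because the single unit difference spreads to all four transform coefficients.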

Data Availability Statement

Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.

Acknowledgements

The authors would like to thank the Brazilian agencies CNPq, Capes and Fapergs for their financial support of this research.

Author information

Corresponding author

Correspondence to Guilherme Paim.

About this article

Cite this article

Silveira, B., Paim, G., Abreu, B.A. et al. The 4-2 Fused Adder–Subtractor Compressor for Low-Power Butterfly-Based Hardware Architectures. Circuits Syst Signal Process (2021). https://doi.org/10.1007/s00034-021-01839-x

Keywords

  • Adder–Subtractor compressors
  • Low-power design
  • SATD