
Hybrid ADDer: A Viable Solution for Efficient Design of MAC in DNNs

Published in: Circuits, Systems, and Signal Processing

Abstract

This article proposes a solution for efficient hardware implementation of deep neural networks (DNNs) in edge-AI applications. An effective Hybrid ADDer (HADD) block is developed for accumulation in the fixed-point multiply-accumulate (MAC) operation to overcome area and power limitations. The proposed HADD design offers a considerable reduction in area and power consumption, with reduced latency and a tolerable accuracy loss. Inference with the LeNet-5 DNN model achieves accuracies of 96.97% and 96.64% on the MNIST and A-Z Handwritten Alphabet datasets, respectively. Compared to a conventional adder implementation, the proposed HADD design reduces area utilization by 44%, power consumption by 51%, and delay by 19% for 8-bit precision at 180 nm. For the same bit precision at 45 nm, it reduces area by 31%, power consumption by 34%, and delay by 8.1%. The design is further evaluated for edge-detection applications, with promising results on standard test images. Overall, the proposed accumulator arithmetic block is a viable solution for error-tolerant AI applications, including DNN-based image classification, object recognition, and other image-processing tasks.
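The exact logic of the HADD block is not given in this excerpt. As an illustration of the general hybrid-adder idea it builds on (an exact adder on the upper bits combined with a cheap approximation on the lower bits, with no carry propagated between the two parts), the following Python sketch models such an accumulator in software. The function name, the OR-based lower part, and the 8-bit/3-bit split are assumptions for illustration, not the paper's design.

```python
def hybrid_add(a: int, b: int, width: int = 8, lower: int = 3) -> int:
    """Software model of a hybrid approximate adder.

    The upper (width - lower) bits are added exactly; the `lower` bits
    are approximated with a bitwise OR, so no carry is generated by or
    propagated from the approximate part. The result wraps at `width`
    bits, mimicking fixed-point hardware.
    """
    mask = (1 << lower) - 1
    low = (a & mask) | (b & mask)        # approximate lower part (no carry)
    high = (a >> lower) + (b >> lower)   # exact upper part
    return ((high << lower) | low) & ((1 << width) - 1)


# Approximate MAC-style accumulation of partial products, as in a DNN layer.
acc = 0
for x, w in [(3, 5), (2, 7), (4, 1)]:
    acc = hybrid_add(acc, x * w)
```

Because the approximation is confined to the `lower` bits, the absolute error per addition is bounded by 2^lower, which is why such schemes tolerate the small accuracy loss reported above.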


Data Availability

Data sharing is not applicable to this article as no data sets were generated or analyzed during the current study, and detailed circuit simulation results are given in the manuscript.


Acknowledgements

The authors would like to thank the Indo-South Korea Joint Network Center for Environmental Cyber-Physical Systems (INT/Korea/JNC/CPS), Department of Science and Technology, Government of India, for providing financial support.

Author information

Correspondence to Santosh Kumar Vishvakarma.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.


About this article


Cite this article

Trivedi, V., Lalwani, K., Raut, G. et al. Hybrid ADDer: A Viable Solution for Efficient Design of MAC in DNNs. Circuits Syst Signal Process 42, 7596–7614 (2023). https://doi.org/10.1007/s00034-023-02469-1
