Abstract
This research article proposes a solution for efficient hardware implementation of deep neural networks (DNNs) in Edge-AI applications. An effective Hybrid ADDer (HADD) block is developed for accumulation in the fixed-point multiply-accumulate (MAC) operation to overcome area and power limitations. The proposed HADD design offers a considerable reduction in area and power consumption, together with reduced latency, at the cost of a tolerable accuracy loss. Inference with the LeNet-5 DNN model achieves accuracies of 96.97% and 96.64% on the MNIST and A-Z Handwritten Alphabet datasets, respectively. Compared to a conventional adder implementation at 8-bit precision, the proposed HADD design reduces area utilization by 44%, power consumption by 51%, and delay by 19% at 180 nm; at the same bit precision, it reduces area by 31%, power consumption by 34%, and delay by 8.1% at 45 nm. The proposed design is further evaluated for edge-detection applications, and the results for several standard test images are promising. Overall, the proposed accumulator arithmetic block is a viable solution for error-tolerant AI applications, including DNNs for image classification, object recognition, and other image-processing tasks.
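To make the idea of approximate accumulation concrete, the following is a minimal behavioral sketch of a hybrid adder inside a fixed-point MAC loop. The exact HADD circuit is not specified in this section; the sketch assumes the common lower-part-OR scheme (approximate least-significant bits, accurate most-significant bits), and the function names `hybrid_add` and `mac` are illustrative, not from the article.

```python
def hybrid_add(a, b, width=8, approx_bits=4):
    """Add two unsigned fixed-point values.

    The low `approx_bits` are approximated with a carry-free bitwise OR;
    the remaining high bits are added accurately, with no carry
    propagated from the approximate lower part (an assumption for
    illustration -- the actual HADD carry handling may differ).
    """
    mask = (1 << approx_bits) - 1
    low = (a & mask) | (b & mask)                    # approximate lower part
    high = (a >> approx_bits) + (b >> approx_bits)   # accurate upper part
    return ((high << approx_bits) | low) & ((1 << width) - 1)


def mac(weights, activations, width=16, approx_bits=4):
    """Multiply-accumulate with approximate accumulation of the products."""
    acc = 0
    for w, x in zip(weights, activations):
        product = (w * x) & ((1 << width) - 1)       # fixed-point wrap-around
        acc = hybrid_add(acc, product, width=width, approx_bits=approx_bits)
    return acc
```

With `approx_bits=0` the adder degenerates to an exact modular adder, so the accuracy/area trade-off can be swept by a single parameter, mirroring how the error of such designs is typically tuned against hardware cost.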
Data Availability
Data sharing is not applicable to this article as no data sets were generated or analyzed during the current study, and detailed circuit simulation results are given in the manuscript.
Acknowledgements
The authors would like to thank the Indo-South Korea Joint Network Center for Environmental Cyber-Physical Systems (INT/Korea/JNC/CPS), Department of Science and Technology, Government of India, for providing financial support.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Trivedi, V., Lalwani, K., Raut, G. et al. Hybrid ADDer: A Viable Solution for Efficient Design of MAC in DNNs. Circuits Syst Signal Process 42, 7596–7614 (2023). https://doi.org/10.1007/s00034-023-02469-1