Abstract
Transcendental nonlinear function design in deep neural accelerators is principally concerned with performance parameters such as area, power, delay, and throughput. Neural hardware demands resource-intensive blocks such as adders, multipliers, and nonlinear activation functions. This work addresses the issues involved in implementing activation functions for deep neural accelerators. The proposed design implements an activation function unit using stochastic computing together with clock gating to reduce active power dissipation in the hardware. A complete deep neural network, however, uses various activation functions in its hidden layers. To avoid implementing separate hardware for each activation function, we have designed the streamlined composite activation function unit for neural accelerators (SCAN), which implements the hyperbolic tangent and ReLU activation functions. The proposed method, combining stochastic computing with clock gating, is compared with other state-of-the-art designs: area is reduced by approximately 74.14% compared with a CORDIC-based design. When implementing a single neuron, both area and power are reduced severalfold, enhancing the performance of deep neural accelerators. Testing accuracy and inference time are evaluated on the AlexNet architecture using the benchmark MNIST dataset; testing accuracy with the proposed implementation increases by 1.08%, and loss is reduced by 40.66%.
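The abstract does not give implementation details, but the general idea behind stochastic-computing activation units can be sketched in software. The snippet below is a minimal illustrative model (not the paper's actual SCAN design): a value in [−1, 1] is encoded as a random bitstream in bipolar format, and the hyperbolic tangent is approximated by the classic K-state saturating-counter finite-state machine, which in hardware costs only a small counter and a comparator — the source of the area savings the abstract reports. All function and parameter names here are hypothetical.

```python
import math
import random

def to_bipolar_stream(x, n, rng):
    """Encode x in [-1, 1] as an n-bit stochastic stream with P(1) = (x + 1) / 2."""
    p = (x + 1.0) / 2.0
    return [1 if rng.random() < p else 0 for _ in range(n)]

def from_bipolar_stream(bits):
    """Decode a bipolar stochastic stream back to a value in [-1, 1]."""
    return 2.0 * sum(bits) / len(bits) - 1.0

def stanh(bits, k=8):
    """K-state saturating-counter FSM (the classic stochastic tanh element).

    Each 1-bit steps the counter up, each 0-bit steps it down, saturating
    at the ends; the output bit is 1 while the counter sits in the upper
    half. For a bipolar input stream encoding x, the output stream
    approximately encodes tanh(k * x / 2).
    """
    state = k // 2
    out = []
    for b in bits:
        state = min(k - 1, state + 1) if b else max(0, state - 1)
        out.append(1 if state >= k // 2 else 0)
    return out
```

For example, with k = 8 and an input encoding x = 0.5, the decoded output stream approximates tanh(2.0) ≈ 0.96 (up to stream-length noise and a small FSM bias). In hardware the same structure needs no multipliers or lookup tables, which is consistent with the area and power reductions claimed for the stochastic approach.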
Data Availability
Data sharing was not applicable to this article as no datasets were generated or analyzed during the current study, and detailed circuit simulation results are given in the manuscript.
Acknowledgements
The authors would like to thank the Council of Scientific and Industrial Research (CSIR), New Delhi, Government of India, for financial support under the JRF scheme, and the Special Manpower Development Program Chip to System Design, Department of Electronics and Information Technology (DeitY), Ministry of Communication and Information Technology, Government of India, for providing the necessary research facilities.
SCAN uses stochastic computing to implement a power-efficient composite activation function unit for deep neural accelerators.
About this article
Cite this article
Rajput, G., Biyani, K.N., Logashree, V. et al. SCAN: Streamlined Composite Activation Function Unit for Deep Neural Accelerators. Circuits Syst Signal Process 41, 3465–3486 (2022). https://doi.org/10.1007/s00034-021-01947-8