
SCAN: Streamlined Composite Activation Function Unit for Deep Neural Accelerators

Published in: Circuits, Systems, and Signal Processing

Abstract

The design of transcendental nonlinear functions in deep neural accelerators is principally governed by performance parameters such as area, power, delay, and throughput. Neural hardware demands resource-intensive blocks such as adders, multipliers, and nonlinear activation functions. This work addresses the implementation of the activation function for deep neural accelerators. The proposed design implements an activation function unit using stochastic computing together with clock gating to reduce active power dissipation in the hardware. A complete deep neural network, however, uses various activation functions in its hidden layers. To avoid implementing separate hardware for each activation function, we have designed the streamlined composite activation function unit for neural accelerators (SCAN), which implements the hyperbolic tangent and ReLU activation functions. The proposed method, combining stochastic computing with clock gating, is compared with other state-of-the-art designs. Area is reduced by approximately 74.14% compared with a CORDIC-based design. When implementing a single neuron, both area and power are reduced manifold, enhancing the performance of deep neural accelerators. Testing accuracy and inference time are evaluated on the AlexNet architecture with the benchmark MNIST dataset. Testing accuracy in the proposed implementation increases by 1.08%, and loss is reduced by 40.66%.
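The abstract does not detail the circuit, but the underlying idea of stochastic computing can be illustrated in software: a value in [-1, 1] is encoded as the probability of 1s in a bitstream (bipolar coding), and a hyperbolic tangent can be approximated by a small saturating finite-state machine operating on that stream, as in classic stochastic neural computation. The sketch below is purely illustrative, not the SCAN hardware; all function names and the FSM size `k` are assumptions for the demonstration.

```python
import random

def to_bitstream(x, n, rng):
    # Bipolar stochastic encoding: P(bit = 1) = (x + 1) / 2 for x in [-1, 1]
    p = (x + 1) / 2
    return [1 if rng.random() < p else 0 for _ in range(n)]

def from_bitstream(bits):
    # Decode a bipolar bitstream back to a value in [-1, 1]
    return 2 * sum(bits) / len(bits) - 1

def stanh(bits, k=8):
    # k-state saturating up/down counter: a 1 increments, a 0 decrements.
    # Output bit is 1 while the state is in the upper half, which
    # approximates tanh((k/2) * x) of the bipolar input value x.
    state = k // 2
    out = []
    for b in bits:
        state = min(k - 1, state + 1) if b else max(0, state - 1)
        out.append(1 if state >= k // 2 else 0)
    return out

rng = random.Random(0)
n = 200_000
x = 0.3
# With k = 8 this approximates tanh(4 * 0.3) = tanh(1.2), roughly 0.83
y = from_bitstream(stanh(to_bitstream(x, n, rng)))
```

Because the datapath reduces to single-gate and counter operations on bitstreams, the hardware cost is far below that of a fixed-point multiplier-based tanh, which is the motivation for using stochastic computing in activation units; long bitstreams trade latency for this area saving.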



Data Availability

Data sharing is not applicable to this article, as no datasets were generated or analyzed during the current study; detailed circuit simulation results are given in the manuscript.


Acknowledgements

The authors would like to thank the Council of Scientific and Industrial Research (CSIR), New Delhi, Government of India, for financial support under the JRF scheme, and the Special Manpower Development Program for Chip to System Design, Department of Electronics and Information Technology (DeitY), Ministry of Communication and Information Technology, Government of India, for providing the necessary research facilities.

Author information

Corresponding author

Correspondence to Santosh Kumar Vishvakarma.


SCAN uses stochastic computing to implement a power-efficient composite activation function unit for deep neural accelerators.


Cite this article

Rajput, G., Biyani, K.N., Logashree, V. et al. SCAN: Streamlined Composite Activation Function Unit for Deep Neural Accelerators. Circuits Syst Signal Process 41, 3465–3486 (2022). https://doi.org/10.1007/s00034-021-01947-8

