A survey of neural network accelerators

  • Review Article
  • Frontiers of Computer Science

Abstract

Machine-learning techniques have recently proven successful in a variety of domains, especially in emerging commercial applications. Among these techniques, artificial neural networks (ANNs), which require a considerable amount of computation and memory, are among the most popular algorithms and have been applied to a broad range of tasks such as speech recognition, face identification, and natural language processing. Conventional CPUs and GPUs, the most straightforward platforms for running ANNs, are energy-inefficient because of the overhead they incur for flexibility. Consequently, in recent years many researchers have proposed neural network accelerators that achieve high performance at low power consumption. The main purpose of this survey is to briefly review these recent works, including the DianNao family of accelerators. We hope this review can serve as a reference for hardware researchers in the area of neural networks.
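The abstract's claim that ANNs demand considerable computation and memory can be made concrete with a back-of-the-envelope sketch. The snippet below is illustrative only; the 4096 x 4096 layer size is a hypothetical example chosen by the editor, not a figure from the survey:

```python
def fc_layer_macs(n_inputs: int, n_outputs: int) -> int:
    """Multiply-accumulate (MAC) operations for one fully connected
    layer: each output neuron performs one MAC per input."""
    return n_inputs * n_outputs

def fc_layer_weight_bytes(n_inputs: int, n_outputs: int,
                          bytes_per_weight: int = 4) -> int:
    """Weight storage for the same layer, assuming 32-bit floats."""
    return n_inputs * n_outputs * bytes_per_weight

# A single hypothetical 4096 x 4096 fully connected layer:
macs = fc_layer_macs(4096, 4096)         # ~16.8 million MACs per inference
mem = fc_layer_weight_bytes(4096, 4096)  # 64 MiB of weights
print(macs, mem)
```

Even this one layer performs millions of operations and streams tens of megabytes of weights per inference, which is why general-purpose CPUs and GPUs spend a disproportionate share of energy on data movement and control rather than arithmetic, and why specialized accelerators are attractive.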



Acknowledgements

This work was partially supported by the National Natural Science Foundation of China (Grant Nos. 61100163, 61133004, 61222204, 61221062, 61303158, 61432016, 61472396, and 61473275), the National High Technology Research and Development Program (863 Program) of China (2012AA012202), the Strategic Priority Research Program of the CAS (XDA06010403), the International Collaboration Key Program of the CAS (171111KYSB20130002), and the 10,000 Talent Program.

Author information


Corresponding author

Correspondence to Tianshi Chen.

Additional information

Zhen Li received the BS degree in physics from the Special Class for the Gifted Young, University of Science and Technology of China, China in 2014. He is currently a PhD candidate at the Institute of Computing Technology, Chinese Academy of Sciences, China. His research interests include approximate computing, neural network accelerators, and computer architecture.

Yuqing Wang received the BS degree in computer science from the School of Computer Science and Technology, University of Science and Technology of China (USTC), China in 2015. He is currently a PhD candidate at USTC. His major research interests include neural network accelerators and the corresponding compiler optimizations.

Tian Zhi received the BS degree in engineering from Zhejiang University, China, and the PhD degree in computer science from the Institute of Microelectronics, Chinese Academy of Sciences, China in 2009 and 2014, respectively. Her current research interests include computer architecture, reconfigurable computing, and hardware neural networks.

Tianshi Chen received the BS degree in mathematics from the Special Class for the Gifted Young and the PhD degree in computer science from the School of Computer Science and Technology, University of Science and Technology of China, China in 2005 and 2010, respectively. He is currently a researcher at the Institute of Computing Technology, Chinese Academy of Sciences. His research interests lie in computer architecture and computational intelligence. He is an awardee of the NSFC Excellent Young Scholars Program in 2015.


About this article


Cite this article

Li, Z., Wang, Y., Zhi, T. et al. A survey of neural network accelerators. Front. Comput. Sci. 11, 746–761 (2017). https://doi.org/10.1007/s11704-016-6159-1
