Implementation of DNNs on IoT devices

  • Zhichao Zhang
  • Abbas Z. Kouzani
Review Article

Abstract

Driven by the recent growth in the fields of the internet of things (IoT) and deep neural networks (DNNs), DNN-powered IoT devices are expected to transform a variety of industrial applications. DNNs, however, require many parameters and operations to process the data generated by IoT devices, resulting in high data-processing latency and energy consumption. New approaches are thus being sought to tackle these issues and deploy real-time DNNs on resource-limited IoT devices. This paper presents a comprehensive review of hardware/software co-design approaches developed to implement DNNs on low-resource hardware platforms. These approaches explore the trade-offs between energy consumption, speed, classification accuracy, and model size. First, an overview of DNNs is given. Next, available tools for implementing DNNs on low-resource hardware platforms are described. Then, memory hierarchy designs together with dataflow mapping strategies are presented. Furthermore, various model optimization approaches, including pruning and quantization, are discussed. In addition, case studies are given to demonstrate the feasibility of implementing DNNs for IoT applications. Finally, detailed discussions, research gaps, and future directions are provided. The presented review can guide the design and implementation of the next generation of hardware and software solutions for real-world IoT applications.
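To make the two model-optimization techniques named above concrete, the following is a minimal, self-contained sketch (not taken from the reviewed works) of unstructured magnitude-based weight pruning and affine 8-bit post-training quantization. The function names and thresholding scheme are illustrative assumptions, not any specific paper's method.

```python
import numpy as np

def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights (unstructured pruning)."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)          # number of weights to remove
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]   # k-th smallest magnitude
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def quantize_uint8(weights):
    """Affine (asymmetric) quantization: w ~= scale * (q - zero_point)."""
    w_min, w_max = float(weights.min()), float(weights.max())
    scale = (w_max - w_min) / 255.0
    zero_point = int(np.clip(np.round(-w_min / scale), 0, 255))
    q = np.clip(np.round(weights / scale) + zero_point, 0, 255).astype(np.uint8)
    return q, scale, zero_point

# Example: prune 40% of a tiny weight vector, then quantize it to 8 bits.
w = np.array([-1.0, -0.1, 0.05, 0.5, 1.0])
w_pruned = prune_by_magnitude(w, sparsity=0.4)
q, scale, zp = quantize_uint8(w)
w_dequant = scale * (q.astype(np.float32) - zp)   # reconstruction for error check
```

In a real deployment the pruned weights would be stored in a sparse format and the uint8 tensor processed with integer arithmetic; the reconstruction error of this scheme is bounded by roughly one quantization step (`scale`).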

Keywords

Deep neural networks · Real-time accelerators · Low-resource IoT devices · Hardware and software co-design · Field-programmable gate arrays

Notes

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2019

Authors and Affiliations

  1. School of Engineering, Deakin University, Geelong, Australia