Abstract
The rapid emergence of deep learning (DL) algorithms has paved the way for bringing artificial intelligence (AI) services to end users. The intersection of edge computing and AI has given rise to an exciting area of research called edge artificial intelligence (Edge AI). Edge AI has enabled a paradigm shift in application areas such as precision medicine, wearable sensors, intelligent robotics, industry, and agriculture, and both the training and inference of DL algorithms are migrating from the cloud to the edge. Computationally expensive, memory- and power-hungry DL algorithms must be optimized to leverage the full potential of Edge AI. Embedding intelligence in edge devices such as Internet of Things (IoT) nodes, smartphones, and cyber-physical systems (CPS) helps ensure user privacy and data security. By processing data near its source, Edge AI eliminates the need for cloud transmission and significantly reduces latency, enabling real-time, learned, and automatic decision-making. However, computing resources at the edge suffer from power and memory constraints. Various compression and optimization techniques have been developed, in both the algorithm and the hardware, to overcome these resource constraints. In addition, algorithm-hardware codesign has emerged as a crucial element in realizing efficient Edge AI. This chapter examines each component of integrating DL into Edge AI: model compression, algorithm-hardware codesign, available edge hardware platforms, and open challenges and future opportunities.
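To make the compression ideas above concrete, the sketch below illustrates two widely used model compression techniques, unstructured magnitude pruning and symmetric 8-bit post-training quantization, in plain NumPy. It is a minimal illustration, not code from the chapter; the function names, the per-tensor scale factor, and the sparsity level are assumptions made for this example.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float = 0.5) -> np.ndarray:
    """Zero out the smallest-magnitude fraction of weights (unstructured pruning)."""
    k = int(weights.size * sparsity)               # number of weights to zero out
    if k == 0:
        return weights.copy()
    flat = np.abs(weights).ravel()
    threshold = np.partition(flat, k - 1)[k - 1]   # k-th smallest magnitude
    return np.where(np.abs(weights) > threshold, weights, 0.0)

def quantize_int8(weights: np.ndarray):
    """Symmetric linear post-training quantization of float weights to int8."""
    scale = np.abs(weights).max() / 127.0          # one scale factor per tensor (assumed)
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale                                # reconstruct approximately as q * scale

# Toy usage on a 4x4 weight matrix
w = np.random.randn(4, 4).astype(np.float32)
w_sparse = magnitude_prune(w, sparsity=0.75)       # keep only the 4 largest-magnitude weights
w_q, s = quantize_int8(w_sparse)                   # int8 storage: 4x smaller than float32
```

Pruning produces sparse weight tensors whose zero entries let hardware skip multiply-accumulate operations, while int8 quantization cuts weight storage fourfold relative to float32 and replaces floating-point arithmetic with cheaper integer operations, both of which directly target the power and memory constraints described above.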
Acknowledgements
The author would like to thank Dr. Syed Kamrul Islam, Professor and Chair, Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, USA, for his constructive feedback.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Shuvo, M.M.H. (2022). Edge AI: Leveraging the Full Potential of Deep Learning. In: Al-Emran, M., Shaalan, K. (eds) Recent Innovations in Artificial Intelligence and Smart Applications. Studies in Computational Intelligence, vol 1061. Springer, Cham. https://doi.org/10.1007/978-3-031-14748-7_2
DOI: https://doi.org/10.1007/978-3-031-14748-7_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-14747-0
Online ISBN: 978-3-031-14748-7
eBook Packages: Intelligent Technologies and Robotics