A Survey on Collaborative DNN Inference for Edge Intelligence

  • Review
  • Machine Intelligence Research

Abstract

With the rapid development of artificial intelligence (AI), intelligent applications based on deep neural networks (DNNs) have changed people's lifestyles and improved production efficiency. However, the large volume of computation and data generated at the network edge has become a major bottleneck, and the traditional cloud-based computing mode can no longer meet the requirements of real-time processing tasks. To address these problems, edge intelligence (EI) embeds AI model training and inference capabilities into the network edge, and it has become a cutting-edge direction in the field of AI. Furthermore, collaborative DNN inference among the cloud, edge, and end devices provides a promising way to boost EI. Nevertheless, EI-oriented collaborative DNN inference is still at an early stage, and existing research efforts lack systematic classification and discussion. Motivated by this gap, we comprehensively investigate recent studies on EI-oriented collaborative DNN inference. In this paper, we first review the background and motivation of EI. Then, we classify four typical collaborative DNN inference paradigms for EI, and analyse their characteristics and key technologies. Finally, we summarize the current challenges of collaborative DNN inference, discuss development trends, and suggest future research directions.

References

  1. Y. LeCun, Y. Bengio, G. Hinton. Deep learning. Nature, vol.521, no. 7553, pp.436–444, 2015. DOI: https://doi.org/10.1038/nature14539.

    Google Scholar 

  2. F. Belkadi, M. A. Dhuieb, J. V. Aguado, F. Laroche, A. Bernard, F. Chinesta. Intelligent assistant system as a context-aware decision-making support for the workers of the future. Computers & Industrial Engineering, vol. 139, Article number 105732, 2020. DOI: https://doi.org/10.1016/j.cie.2019.02.046.

  3. S. Bhattacharya, S. R. K. Somayaji, T. R. Gadekallu, M. Alazab, P. K. R. Maddikunta. A review on deep learning for future smart cities. Internet Technology Letters, vol.5, no. 1, Article number e187, 2022. DOI: https://doi.org/10.1002/it12.187.

    Google Scholar 

  4. K. Rzadca, P. Findeisen, J. Swiderski, P. Zych, P. Broniek, J. Kusmierek, P. Nowak, B. Strack, P. Witusowski, S. Hand, J. Wilkes. Autopilot: Workload auto-scaling at Google. In Proceedings of the 15th European Conference on Computer Systems, Heraklion, Greece, Article number 16, 2020. DOI: https://doi.org/10.1145/3342195.3387524.

  5. M. AshifuddinMondal, Z. Rehena. IoT based intelligent agriculture field monitoring system. In Proceedings of the 8th International Conference on Cloud Computing, Data Science & Engineering, IEEE, Noida, India, pp. 625–629, 2018. DOI: https://doi.org/10.1109/CONFLUENCE.2018.8442535.

    Google Scholar 

  6. D. Pal, S. Funilkul, N. Charoenkitkarn, P. Kanthamanon. Internet-of-things and smart homes for elderly healthcare: An end user perspective. IEEE Access, vol.6, pp. 10483–10496, 2018. DOI: https://doi.org/10.1109/ACCESS.2018.2808472.

    Google Scholar 

  7. Y. Y. Mao, C. S. You, J. Zhang, K. B. Huang, K. B. Letaief. A survey on mobile edge computing: The communication perspective. IEEE Communications Surveys & Tutorials, vol.19, no. 4, pp. 2322–2358, 2017. DOI: https://doi.org/10.1109/COMST.2017.2745201.

    Google Scholar 

  8. Q. F. Pu, G. Ananthanarayanan, P. Bodik, S. Kandula, A. Akella, P. Bahl, I. Stoica. Low latency geo-distributed data analytics. ACM SIGCOMM Computer Communication Review, vol.45, no.4, pp.421–434, 2015. DOI: https://doi.org/10.1145/2829988.2787505.

  9. Z. Zhou, X. Chen, E. Li, L. K. Zeng, K. Luo, J. S. Zhang. Edge intelligence: Paving the last mile of artificial intelligence with edge computing. Proceedings of the IEEE, vol.107, no. 8, pp. 1738–1762, 2019. DOI: https://doi.org/10.1109/JPROC.2019.2918951.

    Google Scholar 

  10. W. S. Shi, J. Cao, Q. Zhang, Y. H. Z. Li, L. Y. Xu. Edge computing: Vision and challenges. IEEE Internet of Things Journal, vol.3, no.5, pp.637–646, 2016. DOI: https://doi.org/10.1109/JIOT.2016.2579198.

    Google Scholar 

  11. J. W. Kang, Z. H. Xiong, D. Niyato, Y. Z. Zou, Y. Zhang, M. Guizani. Reliable federated learning for mobile networks. IEEE Wireless Communications, vol.27, no. 2, pp. 72–80, 2020. DOI: https://doi.org/10.1109/MWC.001.1900119.

    Google Scholar 

  12. J. W. Kang, X. D. Li, J. T. Nie, Y. Liu, M. R. Xu, Z. H. Xiong, D. Niyato, Q. Yan. Communication-efficient and cross-chain empowered federated learning for artificial intelligence of things. IEEE Transactions on Network Science and Engineering, vol.9, no. 5, pp. 2966–2977, 2022. DOI: https://doi.org/10.1109/TNSE.2022.3178970.

    Google Scholar 

  13. Y. B. Qu, C. Dong, J. C. Zheng, H. P. Dai, F. Wu, S. Guo, A. Anpalagan. Empowering edge intelligence by air-ground integrated federated learning. IEEE Network, vol.35, no.5, pp.34–41, 2021. DOI: https://doi.org/10.1109/MNET.111.2100044.

    Google Scholar 

  14. X. W. Xu, Y. K. Ding, S. X. Hu, M. Niemier, J. Cong, Y. Hu, Y. Y. Shi. Scaling for edge inference of deep neural networks. Nature Electronics, vol.1, no. 4, pp. 216–222, 2018. DOI: https://doi.org/10.1038/s41928-018-0059-3.

    Google Scholar 

  15. K. B. Letaief, Y. M. Shi, J. M. Lu, J. H. Lu. Edge artificial intelligence for 6G: Vision, enabling technologies, and applications. IEEE Journal on Selected Areas in Communications, vol.40, no.1, pp.5–36, 2022. DOI: https://doi.org/10.1109/JSAC.2021.3126076.

    Google Scholar 

  16. J. Park, S. Samarakoon, M. Bennis, M. Debbah. Wireless network intelligence at the edge. Proceedings of the IEEE, vol.107, no. 11, pp. 2204–2239, 2019. DOI: https://doi.org/10.1109/JPROC.2019.2941458.

    Google Scholar 

  17. H. Jang, O. Simeone, B. Gardner, A. Gruning. An introduction to probabilistic spiking neural networks: Probabilistic models, learning rules, and applications. IEEE Signal Processing Magazine, vol.36, no.6, pp.64–77, 2019.DOI: https://doi.org/10.1109/MSP.2019.2935234.

    Google Scholar 

  18. F. Bonomi, R. Milito, J. Zhu, S. Addepalli. Fog computing and its role in the internet of things. In Proceedings of the 1st Edition of the MCC Workshop on Mobile Cloud Computing, Helsinki, Finland, pp. 13–16, 2012. DOI: https://doi.org/10.1145/2342509.2342513.

  19. S. G. Deng, H. L. Zhao, W. J. Fang, J. W. Yin, S. Dustdar, A. Y. Zomaya. Edge intelligence: The confluence of edge computing and artificial intelligence. IEEE Internet of Things Journal, vol.7, no. 8, pp. 7457–7469, 2020. DOI: https://doi.org/10.1109/JIOT.2020.2984887.

    Google Scholar 

  20. J. Zhang, K. B. Letaief. Mobile edge intelligence and computing for the internet of vehicles. Proceedings of the IEEE, vol.108, no. 2, pp. 246–261, 2020. DOI: https://doi.org/10.1109/JPROC.2019.2947490.

    Google Scholar 

  21. M. Jouhari, A. K. AI-Ali, E. Baccour, A. Mohamed, A. Erbad, M. Guizani, M. Hamdi. Distributed CNN inference on resource-constrained UAVs for surveillance systems: Design and optimization. IEEE Internet of Things Journal, vol.9, no. 2, pp. 1227–1242, 2022. DOI: https://doi.org/10.1109/JIOT.2021.3079164.

    Google Scholar 

  22. M. Subramanian, A. Wojtusciszyn, L. Favre, S. Boughorbel, J. X. Shan, K. B. Letaief, N. Pitteloud, L. Chouchane. Precision medicine in the era of artificial intelligence: Implications in chronic disease management. Journal of Translational Medicine, vol. 18, no. 1, Article number 472, 2020. DOI: https://doi.org/10.1186/s12967-020-02658-5.

    Google Scholar 

  23. C. Y. Chen, A. Seff, A. Kornhauser, J. X. Xiao. Deep-Driving: Learning affordance for direct perception in autonomous driving. In Proceedings of IEEE International Conference on Computer Vision, Santiago, Chile, pp. 2722–2730, 2015. DOI: https://doi.org/10.1109/ICCV.2015.312.

  24. N. Kalatzis, M. Avgeris, D. Dechouniotis, K. Papadakis-Vlachopapadopoulos, I. Roussaki, S. Papavassiliou. Edge computing in IoT ecosystems for UAV-enabled early fire detection. In Proceedings of IEEE International Conference on Smart Computing, Taormina, Italy, pp. 106–114, 2018. DOI: https://doi.org/10.1109/SMARTCOMP.2018.00080.

  25. S. Q. Ren, K. M. He, R. Girshick, J. Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In Proceedings of the 28th International Conference on Neural Information Processing Systems, Montreal, Canada, pp. 91–99, 2015.

  26. W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C. Y. Fu, A. C. Berg. SSD: Single shot MultiBox detector. In Proceedings of the 14th European Conference on Computer Vision, Springer, Amsterdam, The Netherlands, pp. 21–37, 2016. DOI: https://doi.org/10.1007/978-3-319-46448-0_2.

  27. J. Redmon, A. Farhadi. YOLO9000: Better, faster, stronger. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, pp. 6517–6525, 2017. DOI: https://doi.org/10.1109/CVPR.2017.690.

  28. C. Szegedy, W. Liu, Y. Q. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich. Going deeper with convolutions. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, pp. 1–9, 2015. DOI: https://doi.org/10.1109/CVPR.2015.7298594.

  29. K. M. He, X. Y. Zhang, S. Q. Ren, J. Sun. Deep residual learning for image recognition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, pp. 770–778, 2016. DOI: https://doi.org/10.1109/CVPR.2016.90.

  30. K. Simonyan, A. Zisserman. Very deep convolutional networks for large-scale image recognition. In Proceedings of the 3rd International Conference on Learning Representations, [Online], Available: https://arxiv.org/abs/1409.1556, 2014.

  31. H. T. Dinh, C. Lee, D. Niyato, P. Wang. A survey of mobile cloud computing: Architecture, applications, and approaches. Wireless Communications and Mobile Computing, vol. 13, no. 18, pp. 1587–1611, 2013. DOI: https://doi.org/10.1002/wcm.1203.

    Google Scholar 

  32. G. Gobieski, B. Lucia, N. Beckmann. Intelligence beyond the edge: Inference on intermittent embedded systems. In Proceedings of the 24th International Conference on Architectural Support for Programming Languages and Operating Systems, Providence, USA, pp. 199–213, 2019. DOI: https://doi.org/10.1145/3297858.3304011.

  33. M. D. Ryan. Cloud computing privacy concerns on our doorstep. Communications of the ACM, vol.54, no.1, pp. 36–38, 2011. DOI: https://doi.org/10.1145/1866739.1866751.

    Google Scholar 

  34. K. Skala, D. Davidovic, E. Afgan, I. Sović, Z. Sojat. Scalable distributed computing hierarchy: Cloud, fog and dew computing. Open Journal of Cloud Computing, vol. 2, no. 1, pp. 16–24, 2015. DOI: https://doi.org/10.19210/1002.2.1.16.

    Google Scholar 

  35. Y. P. Kang, J. Hauswald, C. Gao, A. Rovinski, T. Mudge, J. Mars, L. J. Tang. Neurosurgeon: Collaborative intelligence between the cloud and mobile edge. ACM SIGARCH Computer Architecture News, vol.45, no.1, pp. 615–629, 2017. DOI: https://doi.org/10.1145/3093337.3037698.

    Google Scholar 

  36. M. Krouka, A. Elgabli, C. B. Issaid, M. Bennis. Energy-efficient model compression and splitting for collaborative inference over time-varying channels. In Proceedings of the 32nd IEEE Annual International Symposium on Personal, Indoor and Mobile Radio Communications, Helsinki, Finland, pp. 1173–1178, 2021. DOI: https://doi.org/10.1109/PIMRC50174.2021.9569707.

  37. K. K. Huang, Z. Tao, C. Wang, T. X. Guo, C. H. Yang, W. H. Gui. Cloud-edge collaborative method for industrial process monitoring based on error-triggered dictionary learning. IEEE Transactions on Industrial Informatics, vol. 18, no. 12, pp. 8957–8966, 2022.

    Google Scholar 

  38. L. Y. Liu, H. Y. Li, M. Gruteser. Edge assisted real-time object detection for mobile augmented reality. In Proceedings of the 25th Annual International Conference on Mobile Computing and Networking, Los Cabos, Mexico, Article number 25, 2019. DOI: https://doi.org/10.1145/3300061.3300116.

  39. H. B. Zhou, W. W. Zhang, C. W. Wang, X. Ma, H. R. Yu. BBNet: A novel convolutional neural network structure in edge-cloud collaborative inference. Sensors, vol.21, no. 13, Article number 4494, 2021. DOI: https://doi.org/10.3390/s21134494.

    Google Scholar 

  40. X. Dai, X. N. Kong, T. Guo, Y. X. Huang. CiNet: Redesigning deep neural networks for efficient mobile-cloud collaborative inference. In Proceedings of SIAM International Conference on Data Mining, pp. 459–467, 2021.

  41. J. Emmons, S. Fouladi, G. Ananthanarayanan, S. Venkataraman, S. Savarese, K. Winstein. Cracking open the DNN black-box: Video analytics with DNNS across the camera-cloud boundary. In Proceedings of Workshop on Hot Topics in Video Analytics and Intelligent Edges, Los Cabos, Mexico, pp. 27–32, 2019. DOI: https://doi.org/10.1145/3349614.3356023.

  42. M. C. Song, K. Zhong, J. Q. Zhang, Y. Hu, D. Liu, W. G. Zhang, J. Wang, T. Li. In-situ AI: Towards autonomous and incremental deep learning for IoT systems. In Proceedings of IEEE International Symposium on High Performance Computer Architecture, Vienna, Austria, pp. 92–103, 2018. DOI: https://doi.org/10.1109/HPCA.2018.00018.

  43. C. Hu, W. Bao, D. Wang, F. M. Liu. Dynamic adaptive DNN surgery for inference acceleration on the edge. In Proceedings of IEEE INFOCOM Conference on Computer Communications, Paris, France, pp. 1423–1431, 2019. DOI: https://doi.org/10.1109/INFOCOM.2019.8737614.

  44. N. Wang, Y. B. Duan, J. Wu. Accelerate cooperative deep inference via layer-wise processing schedule optimization. In Proceedings of International Conference on Computer Communications and Networks, IEEE, Athens, Greece, pp. 1–9, 2021. DOI: https://doi.org/10.1109/ICCCN52240.2021.9522274.

    Google Scholar 

  45. H. J. Jeong, H. J. Lee, C. H. Shin, S. M. Moon. IONN: Incremental offloading of neural network computations from mobile devices to edge servers. In Proceedings of the ACM Symposium on Cloud Computing, Carlsbad, USA, pp. 401–411, 2018. DOI: https://doi.org/10.1145/3267809.3267828.

  46. S. T. Nimi, A. Arefeen, Y. S. Uddin, Y. Lee. EARLIN: Early out-of-distribution detection for resource-efficient collaborative inference. In Proceedings of Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Springer, Bilbao, Spain, pp. 635–651, 2021. DOI: https://doi.org/10.1007/978-3-030-86486-6_39.

    Google Scholar 

  47. J. Hauswald, T. Manville, Q. Zheng, R. Dreslinski, C. Chakrabarti, T. Mudge. A hybrid approach to offloading mobile image classification. In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, Florence, Italy, pp. 8375–8379, 2014. DOI: https://doi.org/10.1109/ICASSP.2014.6855235.

  48. S. Laskaridis, S. I. Venieris, M. Almeida, I. Leontiadis, N. D. Lane. SPINN: Synergistic progressive inference of neural networks over device and cloud. In Proceedings of the 26th Annual International Conference on Mobile Computing and Networking, London, UK, Article number 37, 2020. DOI: https://doi.org/10.1145/3372224.3419194.

  49. A. E. Eshratifar, M. S. Abrishami, M. Pedram. JointDNN: An efficient training and inference engine for intelligent mobile cloud computing services. IEEE Transactions on Mobile Computing, vol.20, no. 2, pp. 565–576, 2021. DOI: https://doi.org/10.1109/TMC.2019.2947893.

    Google Scholar 

  50. M. F. Deng, H. Tian, B. Fan. Fine-granularity based application offloading policy in cloud-enhanced small cell networks. In Proceedings of IEEE International Conference on Communications Workshops, Kuala Lumpur, Malaysia, pp. 638–643, 2016. DOI: https://doi.org/10.1109/ICCW.2016.7503859.

  51. M. Gerla, E. K. Lee, G. Pau, U. Lee. Internet of vehicles: From intelligent grid to autonomous cars and vehicular clouds. In Proceedings of IEEE World Forum on Internet of Things, Seoul, Republic of Korea, pp. 241–246, 2014. DOI: https://doi.org/10.1109/WF-IoT.2014.6803166.

  52. B. Kizilkaya, E. Ever, H.Y. Yatbaz, A. Yazici. An effective forest fire detection framework using heterogeneous wireless multimedia sensor networks. ACM Transactions on Multimedia Computing, Communications, and Applications, vol. 18, no. 2, pp. 1–21, 2022.

    Google Scholar 

  53. J. R. Jiang, H. J. Li, L. M. Wang. Joint model, task partitioning and privacy preserving adaptation for edge DNN inference. In Proceedings of IEEE Wireless Communications and Networking Conference, Austin, USA, pp. 1224–1229, 2022. DOI: https://doi.org/10.1109/WCNC51071.2022.9771620.

  54. T. Mohammed, C. Joe-Wong, R. Babbar, M. Di Francesco. Distributed inference acceleration with adaptive DNN partitioning and offloading. In Proceedings of IEEE INFOCOM Conference on Computer Communications, Toronto, Canada, pp. 854–863, 2020. DOI: https://doi.org/10.1109/INFOCOM41043.2020.9155237.

  55. N. L. Shan, Z. C. Ye, X. L. Cui. Collaborative intelligence: Accelerating deep neural network inference via device-edge synergy. Security and Communication Networks, vol.2020, Article number 8831341, 2020. DOI: https://doi.org/10.1155/2020/8831341.

  56. C. Y. Yang, J. J. Kuo, J. P. Sheu, K. J. Zheng. Cooperative distributed deep neural network deployment with edge computing. In Proceedings of ICC/IEEE International Conference on Communications, IEEE, Montreal, Canada, 2021. DOI: https://doi.org/10.1109/ICC42927.2021.9500668.

    Google Scholar 

  57. H. R. Liu, H. Y. Zheng, M. H. Jiao, G. X. Chi. SCADS: Simultaneous computing and distribution strategy for task offloading in mobile-edge computing system. In Proceedings of IEEE 18th International Conference on Communication Technology, Chongqing, China, pp. 1286–1290, 2018. DOI: https://doi.org/10.1109/ICCT.2018.8599958.

  58. M. Hanyao, Y. B. Jin, Z. Z. Qian, S. Zhang, S. L. Lu. Edge-assisted online on-device object detection for realtime video analytics. In Proceedings of IEEE INFOCOM Conference on Computer Communications, Vancouver, Canada, pp. 1–10, 2021. DOI: https://doi.org/10.1109/INFOCOM42981.2021.9488741.

  59. S. Yun, J. M. Kang, S. Choi, I. M. Kim. Cooperative inference of DNNs over noisy wireless channels. IEEE Transactions on Vehicular Technology, vol.70, no.8, pp. 8298–8303, 2021. DOI: https://doi.org/10.1109/TVT.2021.3092179.

    Google Scholar 

  60. E. Li, Z. Zhou, X. Chen. Edge intelligence: On-demand deep learning model co-inference with device-edge synergy. In Proceedings of Workshop on Mobile Edge Communications, Budapest, Hungary, pp. 31–36, 2018. DOI: https://doi.org/10.1145/3229556.3229562.

  61. E. Li, L. K. Zeng, Z. Zhou, X. Chen.z Edge AI: On-demand accelerating deep neural network inference via edge computing. IEEE Transactions on Wireless Communications, vol.19, no. 1, pp.447–457, 2020. DOI: https://doi.org/10.1109/TWC.2019.2946140.

    Google Scholar 

  62. J. D. Song, Z. C. Liu, X. F. Wang, C. Qiu, X. Chen. Adaptive and collaborative edge inference in task stream with latency constraint. In Proceedings of ICC/IEEE International Conference on Communications, Montreal, Canada, 2021. DOI: https://doi.org/10.1109/ICC42927.2021.9500892.

  63. L. K. Zeng, E. Li, Z. Zhou, X. Chen. Boomerang: On-demand cooperative deep neural network inference for edge intelligence on the industrial internet of things. IEEE Network, vol.33, no.5, pp.96–103, 2019. DOI: https://doi.org/10.1109/MNET.001.1800506.

    Google Scholar 

  64. S. Hu, C. W. Dong, W. S. Wen. Enable pipeline processing of DNN co-inference tasks in the mobile-edge cloud. In Proceedings of the 6th IEEE International Conference on Computer and Communication Systems, Chengdu, China, pp. 186–192, 2021. DOI: https://doi.org/10.1109/IC-CCS52626.2021.9449178.

  65. B. Y. Fang, X. Zeng, M. Zhang. NestDNN: Resource-aware multi-tenant on-device deep learning for continuous mobile vision. In Proceedings of the 24th Annual International Conference on Mobile Computing and Networking, New Delhi, India, pp. 115–127, 2018. DOI: https://doi.org/10.1145/3241539.3241559.

  66. J. B. Du, L. Q. Zhao, J. Feng, X. L. Chu. Computation offloading and resource allocation in mixed fog/cloud computing systems with min-max fairness guarantee. IEEE Transactions on Communications, vol.66, no.4, pp. 1594–1608, 2018. DOI: https://doi.org/10.1109/TCOMM.2017.2787700.

    Google Scholar 

  67. X. Tang, X. Chen, L. K. Zeng, S. Yu, L. Chen. Joint multiuser DNN partitioning and computational resource allocation for collaborative edge intelligence. IEEE Internet of Things Journal, vol.8, no. 12, pp.9511–9522, 2021. DOI: https://doi.org/10.1109/JIOT.2020.3010258.

    Google Scholar 

  68. B. Yang, X. L. Cao, C. Yuen, L. J. Qian. Offloading optimization in edge computing for deep-learning-enabled target tracking by internet of UAVs. IEEE Internet of Things Journal, vol.8, no. 12, pp.9878–9893, 2021. DOI: https://doi.org/10.1109/JIOT.2020.3016694.

    Google Scholar 

  69. C. W. Dong, S. Hu, X. Chen, W. S. Wen. Joint optimization with DNN partitioning and resource allocation in mobile edge computing. IEEE Transactions on Network and Service Management, vol.18, no. 4, pp. 3973–3986, 2021. DOI: https://doi.org/10.1109/TNSM.2021.3116665.

    Google Scholar 

  70. A. E. Roth, M. Sotomayor. Two-sided matching. Handbook of Game Theory with Economic Applications, vol. 1, pp. 485–541, 1992. DOI: https://doi.org/10.1016/S1574-0005(05)80019-0.

    MathSciNet  MATH  Google Scholar 

  71. S. Teerapittayanon, B. McDanel, H. T. Kung.z Branchy-Net: Fast inference via early exiting from deep neural networks. In Proceedings of the 23rd International Conference on Pattern Recognition, IEEE, Cancun, Mexico, pp. 2464–2469, 2016. DOI: https://doi.org/10.1109/ICPR.2016.7900006.

    Google Scholar 

  72. M. Xue, H. M. Wu, R. D. Li, M. X. Xu, P. F. Jiao. Eos-DNN: An efficient offloading scheme for DNN inference acceleration in local-edge-cloud collaborative environments. IEEE Transactions on Green Communications and Networking, vol.6, no. 1, pp. 248–264, 2022. DOI: https://doi.org/10.1109/TGCN.2021.3111731.

    Google Scholar 

  73. X. J. Li, Y. J. Qin, H. C. Zhou, Z. W. Zhang. An intelligent collaborative inference approach of service partitioning and task offloading for deep learning based service in mobile edge computing networks. Transactions on Emerging Telecommunications Technologies, vol.32, no.9, Article number e4263, 2021. DOI: https://doi.org/10.1002/ett.4263.

    Google Scholar 

  74. P. Liu, B. Z. Qi, S. Banerjee. EdgeEye: An edge service framework for real-time intelligent video analytics. In Proceedings of the 1st International Workshop on Edge Systems, Analytics and Networking, Munich, Germany, pp. 1–6, 2018. DOI: https://doi.org/10.1145/3213344.3213345.

  75. A. Morshed, P. P. Jayaraman, T. Sellis, D. Georgakopoulos, M. Villari, R. Ranjan. Deep osmosis: Holistic distributed deep learning in osmotic computing. IEEE Cloud Computing, vol.4, no. 6, pp. 22–32, 2017. DOI: https://doi.org/10.1109/MCC.2018.1081070.

    Google Scholar 

  76. P. Ren, X. Q. Qiao, Y. K. Huang, L. Liu, C. Pu, S. Dustdar. Fine-grained elastic partitioning for distributed DNN towards mobile web AR services in the 5G era. IEEE Transactions on Services Computing, to be published. DOI: https://doi.org/10.1109/TSC.2021.3098816.

  77. C. Y. Lin, T. C. Wang, K. C. Chen, B. Y. Lee, J. J. Kuo. Distributed deep neural network deployment for smart devices from the edge to the cloud. In Proceedings of ACM MobiHoc Workshop on Pervasive Systems in the IoT Era, Catania, Italy, pp. 43–48, 2019. DOI: https://doi.org/10.1145/3331052.3332477.

  78. S. Dey, J. Mondal, A. Mukherjee. Offloaded execution of deep learning inference at edge: Challenges and insights. In Proceedings of IEEE International Conference on Pervasive Computing and Communications Workshops, Kyoto, Japan, pp. 855–861, 2019. DOI: https://doi.org/10.1109/PERCOMW.2019.8730817.

  79. B. Lin, Y. H. Huang, J. S. Zhang, J. Q. Hu, X. Chen, J. Li. Cost-driven off-loading for DNN-based applications over cloud, edge, and end devices. IEEE Transactions on Industrial Informatics, vol.16, no. 8, pp. 5456–5466, 2020. DOI: https://doi.org/10.1109/TII.2019.2961237.

    Google Scholar 

  80. S. Teerapittayanon, B. McDanel, H. T. Kung. Distributed deep neural networks over the cloud, the edge and end devices. In Proceedings of the 37th IEEE International Conference on Distributed Computing Systems, Atlanta, USA, pp. 328–339, 2017. DOI: https://doi.org/10.1109/ICDCS.2017.226.

  81. Z. Y. Tao, Q. Li. eSGD: Communication efficient distributed deep learning on the edge. In Proceedings of the 1st USENIX Workshop on Hot Topics in Edge Computing, HotEdge, Boston, USA, 2018. Available: https://www.usenix.org/conference/hotedgel8/presentation/tao.

  82. A. Yousefpour, S. Devic, B. Q. Nguyen, A. Kreidieh, A. Liao, A. M. Bayen, J. P. Jue. Guardians of the deep fog: Failure-resilient DNN inference from edge to cloud. In Proceedings of the First International Workshop on Challenges in Artificial Intelligence and Machine Learning for Internet of Things, New York, USA, pp. 25–31, 2019. DOI: https://doi.org/10.1145/3363347.3363366.

  83. A. Yousefpour, B. Q. Nguyen, S. Devic, G. H. Wang, A. Kreidieh, H. Lobel, A. M. Bayen, J. P. Jue. ResiliNet: Failure-resilient inference in distributed neural networks. [Online], Available: https://arxiv.org/abs/2002.07386, 2020.

  84. Y. Zhou, J. H. Xiao, Y. Zhou, G. Loianno. Multi-robot collaborative perception with graph neural networks. IEEE Robotics and Automation Letters, vol.7, no. 2, pp. 2289–2296, 2022. DOI: https://doi.org/10.1109/LRA.2022.3141661.

    Google Scholar 

  85. S. J. Wang, F. Jiang, B. Zhang, R. Ma, Q. Hao. Development of UAV-based target tracking and recognition systems. IEEE Transactions on Intelligent Transportation Systems, vol.21, no.8, pp.3409–3422, 2020. DOI: https://doi.org/10.1109/TITS.2019.2927838.

    Google Scholar 

  86. S. Bhagat, P. B. Sujit. UAV target tracking in urban environments using deep reinforcement learning. In Proceedings of International Conference on Unmanned Aircraft Systems, IEEE, Athens, Greece, pp. 694–701, 2020. DOI: https://doi.org/10.1109/ICUAS48674.2020.9213856.

    Google Scholar 

  87. M. Dhuheir, E. Baccour, A. Erbad, S. Sabeeh, M. Hamdi. Efficient real-time image recognition using collaborative swarm of UAVs and convolutional networks. In Proceedings of International Wireless Communications and Mobile Computing, IEEE, Harbin, China, pp. 1954–1959, 2021. DOI: https://doi.org/10.1109/IWCMC51323.2021.9498967.

    Google Scholar 

  88. Y. K. Huang, X. Q. Qiao, S. Dustdar, J. W. Zhang, J. L. Li. Toward decentralized and collaborative deep learning inference for intelligent IoT devices. IEEE Network, vol.36, no.1, pp.59–68, 2022. DOI: https://doi.org/10.1109/MNET.011.2000639.

    Google Scholar 

  89. N. Shlezinger, E. Farhan, H. Morgenstern, Y. C. Eldar. Collaborative inference via ensembles on the edge. In Proceedings of ICASSP/IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE, Toronto, Canada, pp. 8478–8482, 2021. DOI: https://doi.org/10.1109/ICASSP39728.2021.9414740.

    Google Scholar 

  90. S. Disabato, M. Roveri, C. Alippi. Distributed deep convolutional neural networks for the internet-of-things. IEEE Transactions on Computers, vol. 70, no. 8, pp. 1239–1252, 2021. DOI: https://doi.org/10.1109/TC.2021.3062227.

    MathSciNet  MATH  Google Scholar 

  91. S. Naveen, M. R. Kounte, M. R. Ahmed. Low latency deep learning inference model for distributed intelligent iot edge clusters. IEEE Access, vol.9, pp. 160607–160621, 2021. DOI: https://doi.org/10.1109/ACCESS.2021.3131396.

    Google Scholar 

  92. J. S. Du, M. H. Shen, Y. F. Du. A distributed in-situ CNN inference system for IoT applications. In Proceedings of the 38th IEEE International Conference on Computer Design, Hartford, USA, pp. 279–287, 2020. DOI: https://doi.org/10.1109/ICCD50377.2020.00055.

  93. E. Baccour, A. Erbad, A. Mohamed, M. Hamdi, M. Guizani. DistPrivacy: Privacy-aware distributed deep neural networks in IoT surveillance systems. In Proceedings of GLOBECOM/IEEE Global Communications Conference, IEEE, Taipei, China, 2020. DOI: https://doi.org/10.1109/GLOBE-COM42002.2020.9322470.

    Google Scholar 

  94. M. Hemmat, A. Davoodi, Y. H. Hu. Edgen AI: Distributed inference with local edge devices and minimal latency. In Proceedings of the 27th Asia and South Pacific Design Automation Conference, IEEE, Taipei, China, pp. 544–549, 2022. DOI: https://doi.org/10.1109/ASP-DAC52403.2022.9712496.

    Google Scholar 

  95. S. Zhang, S. Zhang, Z. Z. Qian, J. Wu, Y. B. Jin, S. L. Lu. DeepSlicing: Collaborative and adaptive CNN inference with low latency. IEEE Transactions on Parallel and Distributed Systems, vol.32, no.9, pp. 2175–2187, 2021. DOI: https://doi.org/10.1109/TPDS.2021.3058532.

    Google Scholar 

  96. J. C. Mao, X. Chen, K. W. Nixon, C. Krieger, Y. R. Chen. MoDNN: Local distributed mobile computing system for deep neural network. In Proceedings of Design, Automation & Test in Europe Conference & Exhibition, IEEE, Lausanne, Switzerland, pp. 1396–1401, 2017. DOI: https://doi.org/10.23919/DATE.2017.7927211.

    Google Scholar 

  97. Z. R. Zhao, K. M. Barijough, A. Gerstlauer.z DeepThings: Distributed adaptive deep learning inference on resource-constrained IoT edge clusters. IEEE Transactions on Computer-aided Design of Integrated Circuits and Systems, vol.37, no. 11, pp. 2348–2359, 2018. DOI: https://doi.org/10.1109/TCAD.2018.2858384.

    Google Scholar 

  98. L. K. Zeng, X. Chen, Z. Zhou, L. Yang, J. S. Zhang. CoEdge: Cooperative DNN inference with adaptive workload partitioning over heterogeneous edge devices. IEEE/ACM Transactions on Networking, vol. 29, no. 2, pp. 595–608, 2021. DOI: https://doi.org/10.1109/TNET.2020.3042320.

    Google Scholar 

  99. R. Hadidi, J. S. Cao, M. Woodward, M. S. Ryoo, H. Kim. Distributed perception by collaborative robots. IEEE Robotics and Automation Letters, vol.3, no.4, pp.3709–3716, 2018. DOI: https://doi.org/10.1109/LRA.2018.2856261.

    Google Scholar 

  100. A. Goel, C. Tung, X. Hu, G. K. Thiruvathukal, J. C. Davis, Y. H. Lu. Efficient computer vision on edge devices with pipeline-parallel hierarchical neural networks. In Proceedings of the 27th Asia and South Pacific Design Automation Conference, IEEE, Taipei, China, pp. 532–537, 2022. DOI: https://doi.org/10.1109/ASP-DAC52403.2022.9712574.

    Google Scholar 

  101. X. Liang, Z. Q. Li, D. D. Fan, B. Zhang, G. M. Lu, D. Zhang. Innovative contactless palmprint recognition system based on dual-camera alignment. IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 52, no. 10, pp. 6464–6476, 2022.

    Google Scholar 

  102. J. Huang, V. Rathod, C. Sun, M. L. Zhu, A. Korattikara, A. Fathi, I. Fischer, Z. Wojna, Y. Song, S. Guadarrama, K. Murphy. Speed/Accuracy trade-offs for modern convolutional object detectors. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, pp. 3296–3297, 2017. DOI: https://doi.org/10.1109/CV-PR.2017.351.

  103. C. Dong, Y. Shen, Y. B. Qu, K. Wang, J. C. Zheng, Q. H. Wu, F. Wu. UAVs as an intelligent service: Boosting edge intelligence for air-ground integrated networks. IEEE Network, vol.35, no.4, pp. 167–175, 2021. DOI: https://doi.org/10.1109/MNET.011.2000651.

  104. P. F. Wang, B. Y. Zhang, Y. G. Li, S. G. Zhang, Y. Zhang, B. Zhu. An adaptive task migration scheduling approach for edge-cloud collaborative inference. Wireless Communications & Mobile Computing, vol. 2022, 2022. DOI: https://doi.org/10.1155/2022/8804530.

  105. W. H. Liu, J. W. Geng, Z. W. Zhu, J. Cao, Z. R. Lian. Sniper: Cloud-edge collaborative inference scheduling with neural network similarity modeling. In Proceedings of the 59th ACM/IEEE Design Automation Conference, San Francisco, USA, pp. 505–510, 2022. DOI: https://doi.org/10.1145/3489517.3530474.

  106. M. Du, K. Wang, Y. F. Chen, X. Y. Wang, Y. F. Sun. Big data privacy preserving in multi-access edge computing for heterogeneous internet of things. IEEE Communications Magazine, vol. 56, no. 8, pp. 62–67, 2018. DOI: https://doi.org/10.1109/MCOM.2018.1701148.

  107. J. N. Li, J. Wu, J. H. Li, A. K. Bashir, M. J. Piran, A. Anjum. Blockchain-based trust edge knowledge inference of multi-robot systems for collaborative tasks. IEEE Communications Magazine, vol. 59, no. 7, pp. 94–100, 2021. DOI: https://doi.org/10.1109/MCOM.001.2000419.

  108. D. Li, Z. N. Zhang, W. Y. Liao, Z. W. Xu. KLRA: A kernel level resource auditing tool for IoT operating system security. In Proceedings of IEEE/ACM Symposium on Edge Computing, IEEE, Seattle, USA, pp. 427–432, 2018. DOI: https://doi.org/10.1109/SEC.2018.00058.

  109. Z. B. Wang, K. X. Liu, J. H. Hu, J. Ren, H. C. Guo, W. Yuan. Attrleaks on the edge: Exploiting information leakage from privacy-preserving co-inference. Chinese Journal of Electronics, vol. 32, no. 1, pp. 1–12, 2023.

  110. I. Jarin, B. Eshete. PRICURE: Privacy-preserving collaborative inference in a multi-party setting. In Proceedings of ACM Workshop on Security and Privacy Analytics, pp. 25–35, 2021. DOI: https://doi.org/10.1145/3445970.3451156.

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (Nos. 61931011, 62072303 and 61872310), the Key-area Research and Development Program of Guangdong Province, China (No. 2021B0101400003), the Hong Kong Research Grants Council (RGC) Research Impact Fund, China (No. R5060-19), the General Research Fund (Nos. 152221/19E, 152203/20E and 152244/21E), and the Shenzhen Science and Technology Innovation Commission, China (No. JCYJ20200109142008673).

Author information

Corresponding author

Correspondence to Chao Dong.

Additional information

Wei-Qing Ren received the B. Sc. degree in electronic science and technology from Nanjing University of Aeronautics and Astronautics, China in 2021. He is currently a master's student in electronic information at the College of Electronic Information Engineering, Nanjing University of Aeronautics and Astronautics, China.

His research interests include deep learning, UAV-based target detection and collaborative inference in UAV swarms.

Yu-Ben Qu received the B. Sc. degree in mathematics and applied mathematics from Nanjing University, China in 2009, and the M. Sc. degree in communication and information systems and the Ph. D. degree in computer science and technology from Nanjing Institute of Communications, China in 2012 and 2016, respectively. From October 2015 to January 2016, he was a visiting research associate in the School of Computer Science and Engineering, University of Aizu, Japan. From June 2019 to June 2022, he was a postdoctoral fellow with the Department of Computer Science and Engineering, Shanghai Jiao Tong University, China. He is currently an associate research fellow in the College of Electronic and Information Engineering, Nanjing University of Aeronautics and Astronautics, and is also with the Key Laboratory of Dynamic Cognitive System of Electromagnetic Spectrum Space, Ministry of Industry and Information Technology, China. He was a recipient of the Best Paper Awards of GPC 2020 and IEEE SAGC 2021.

His research interests include mobile edge computing, edge intelligence and UAV collaborative intelligence.

Chao Dong received the Ph. D. degree in communication engineering from PLA University of Science and Technology, China in 2007. From 2008 to 2011, he worked as a postdoctoral researcher at the Department of Computer Science and Technology, Nanjing University, China. From 2011 to 2017, he was an associate professor with the Institute of Communications Engineering, PLA University of Science and Technology, China. He is now a full professor with the College of Electronic and Information Engineering, Nanjing University of Aeronautics and Astronautics, China. He is a member of IEEE, ACM and IEICE.

His research interests include D2D communications, UAV swarm networking and anti-jamming network protocols.

Yu-Qian Jing received the B. Sc. degree in electronic science and technology from Nanjing University of Aeronautics and Astronautics, China in 2020. He is currently a master's student in information and communication engineering at Nanjing University of Aeronautics and Astronautics, China.

His research interest is edge network intelligence.

Hao Sun received the B. Sc. degree in electronic science and technology from Nanjing University of Aeronautics and Astronautics, China in 2022. He is currently a master's student in information and communication engineering at the College of Electronic and Information Engineering, Nanjing University of Aeronautics and Astronautics, China.

His research interests include deep learning, UAV cluster intelligence and UAV collaborative inference.

Qi-Hui Wu received the B. Sc. degree in communications engineering, and the M. Sc. and Ph. D. degrees in communications and information systems from the Institute of Communications Engineering, China in 1994, 1997 and 2000, respectively. From 2003 to 2005, he was a postdoctoral research associate at Southeast University, China. From 2005 to 2007, he was an associate professor with the Institute of Communications Engineering, PLA University of Science and Technology, China, where he is currently a full professor. From March 2011 to September 2011, he was an advanced visiting scholar at Stevens Institute of Technology, USA. Since 2016, he has been with Nanjing University of Aeronautics and Astronautics, where he was appointed a distinguished professor.

His research interests include wireless communications and statistical signal processing, with emphasis on system design of software defined radio, cognitive radio, and smart radio.

Song Guo received the Ph. D. degree in computer science from University of Ottawa, Canada in 2005. He is a full professor at the Department of Computing, Hong Kong Polytechnic University, China. He also holds a Changjiang Chair Professorship awarded by the Ministry of Education of China. He is a Fellow of the Canadian Academy of Engineering and a Fellow of the IEEE (Computer Society). He has published many papers in top venues with wide impact in his research areas and was recognized as a Highly Cited Researcher (Clarivate Web of Science). He is the recipient of over a dozen Best Paper Awards from IEEE/ACM conferences, journals, and technical committees. He is the Editor-in-Chief of IEEE Open Journal of the Computer Society and the Chair of the IEEE Communications Society (ComSoc) Space and Satellite Communications Technical Committee. He was an IEEE ComSoc Distinguished Lecturer and a Member of the IEEE ComSoc Board of Governors. He has served on the IEEE Computer Society Fellow Evaluation Committee and on the editorial boards of a number of prestigious international journals, such as IEEE TPDS, IEEE TCC and IEEE TETC. He has also served as chair of the organizing and technical committees of many international conferences.

His research interests include big data, edge AI, mobile computing and distributed systems.

About this article

Cite this article

Ren, WQ., Qu, YB., Dong, C. et al. A Survey on Collaborative DNN Inference for Edge Intelligence. Mach. Intell. Res. 20, 370–395 (2023). https://doi.org/10.1007/s11633-022-1391-7

  • DOI: https://doi.org/10.1007/s11633-022-1391-7
