Abstract
With the development of the smart Internet of Things (IoT), wireless devices increasingly deploy Deep Neural Network (DNN) models for real-time computing tasks. However, the inherent resource and energy constraints of wireless devices make completing real-time inference tasks locally impractical. DNN model partitioning splits a DNN model so that edge servers can assist in completing inference tasks, but offloading also incurs substantial transmission energy consumption. Moreover, because of the complex structure of DNN models, partitioning and offloading at different network layers significantly affect overall energy consumption, which complicates finding an optimal partitioning strategy. Furthermore, in certain application contexts, regularly charging or replacing batteries in smart IoT devices is impractical and environmentally harmful. Wireless energy transfer technology enables devices to harvest RF energy from wireless transmissions, providing a sustainable power supply. Motivated by this, we formulate a joint DNN model partitioning and resource allocation problem in Wireless Powered Mobile Edge Computing (WPMEC). However, the time-varying channel states in WPMEC strongly influence resource allocation decisions, so jointly optimizing DNN model partitioning and resource allocation is a significant challenge. We propose an online algorithm based on Deep Reinforcement Learning (DRL) to solve the time allocation decision, which simplifies a Mixed Integer Nonlinear Program (MINLP) into a convex optimization problem. Our approach maximizes the completion rate of DNN inference tasks under time-varying wireless channel states and delay constraints. Simulation results demonstrate the algorithm's strong performance in improving task completion rates.
Data availability
Not applicable.
Funding
This work was supported by the National Natural Science Foundation of China (Grant No. 61672465) and the Natural Science Foundation of Zhejiang Province (Grant No. LZ22F020004).
Author information
Authors and Affiliations
Contributions
Xianzhong Tian: Conceptualization, Methodology, and Supervision. Pengcheng Xu: Methodology, Writing - Original Draft, and Investigation. Yifan Shen: Writing - Review & Editing, and Validation. Yuheng Shao: Validation and Review.
Corresponding author
Ethics declarations
Ethics approval
Not applicable.
Consent to publish
The authors consent to the publication of this article in the journal.
Conflict of interest
The authors declare that they have no conflict of interest regarding the publication of this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Tian, X., Xu, P., Shen, Y. et al. Joint DNN partitioning and resource allocation for completion rate maximization of delay-aware DNN inference tasks in wireless powered mobile edge computing. Peer-to-Peer Netw. Appl. 16, 2865–2878 (2023). https://doi.org/10.1007/s12083-023-01564-z
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s12083-023-01564-z