Abstract
With the development of the smart Internet of Things (IoT), wireless devices increasingly deploy Deep Neural Network (DNN) models for real-time computing tasks. However, the inherent resource and energy constraints of wireless devices make completing real-time inference tasks locally impractical. DNN model partitioning splits a DNN model so that edge servers can assist in completing inference tasks, but offloading also incurs substantial transmission energy consumption. Moreover, because of the complex structure of DNN models, partitioning and offloading at different network layers significantly affect overall energy consumption, which complicates finding an optimal partitioning strategy. Furthermore, in certain application contexts, regularly charging or replacing batteries in smart IoT devices is impractical and environmentally harmful. Wireless energy transfer technology enables devices to harvest RF energy from wireless transmissions, providing a sustainable power supply. Motivated by this, we formulate a joint DNN model partitioning and resource allocation problem in Wireless Powered Mobile Edge Computing (WPMEC). However, the time-varying channel states in WPMEC strongly influence resource allocation decisions, so jointly optimizing DNN model partitioning and resource allocation is a significant challenge. We propose an online algorithm based on Deep Reinforcement Learning (DRL) to solve the time allocation decision, which simplifies a Mixed Integer Nonlinear Program (MINLP) into a convex optimization problem. Our approach maximizes the completion rate of DNN inference tasks under time-varying wireless channel states and delay constraints. Simulation results demonstrate the algorithm's strong performance in improving task completion rates.
Data availability
Not applicable.
Funding
This work was supported by the National Natural Science Foundation of China (Grant No. 61672465) and the Natural Science Foundation of Zhejiang Province (Grant No. LZ22F020004).
Author information
Authors and Affiliations
Contributions
Xianzhong Tian: Conceptualization, Methodology, and Supervision. Pengcheng Xu: Methodology, Writing - Original Draft, and Investigation. Yifan Shen: Writing - Review & Editing, and Validation. Yuheng Shao: Validation and Review.
Corresponding author
Ethics declarations
Ethics approval
Not applicable.
Consent to publish
The authors consent to the publication of this article in the journal.
Conflict of interest
The authors declare that they have no conflict of interest regarding the publication of this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Tian, X., Xu, P., Shen, Y. et al. Joint DNN partitioning and resource allocation for completion rate maximization of delay-aware DNN inference tasks in wireless powered mobile edge computing. Peer-to-Peer Netw. Appl. 16, 2865–2878 (2023). https://doi.org/10.1007/s12083-023-01564-z
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s12083-023-01564-z