Abstract
In mobile edge computing (MEC) systems, network entities and mobile devices must make decisions that enable efficient use of network and computational resources. Such decision making can be challenging because the MEC environment is complex and its system dynamics vary over time. Deep reinforcement learning (DRL) is a promising method for addressing these challenges: it enables agents (e.g., network entities, mobile devices) to learn an optimal decision-making policy by interacting with the environment. In this chapter, we describe how DRL can be incorporated into MEC systems to improve system performance. We first give an overview of DRL techniques. We then present a case study on the task offloading problem in MEC systems. In particular, we focus on the unknown and time-varying load level dynamics at the edge nodes and formulate a task offloading problem that minimizes the task delay and the ratio of dropped tasks. We propose a deep Q-learning-based algorithm that enables mobile devices to make their task offloading decisions in a decentralized fashion using only local information. The algorithm incorporates double deep Q-network (DQN) and dueling DQN techniques to enhance performance. Simulation results demonstrate that the proposed algorithm significantly reduces the task delay and the ratio of dropped tasks compared with existing methods. Finally, we outline several challenges and future research directions.
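The double DQN technique mentioned in the abstract can be sketched as follows. This is a generic illustration of how a double DQN bootstrapped target is formed, not the chapter's exact algorithm; the function name and signature are our own.

```python
import numpy as np

def double_dqn_target(reward, q_next_online, q_next_target, gamma=0.99, done=False):
    """Bootstrapped target for double DQN.

    The online network selects the best next action, while the
    (periodically synchronized) target network evaluates it. This
    decoupling reduces the overestimation bias of vanilla Q-learning.
    """
    if done:
        return reward  # terminal state: no bootstrap term
    best_action = int(np.argmax(q_next_online))  # selection by online net
    return reward + gamma * q_next_target[best_action]  # evaluation by target net
```

The resulting target would then drive a gradient step on the online network's temporal-difference error for the sampled transition.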
Notes
- 1. This setting is for simplicity of mathematical presentation. For any task k_m(t) that has been dropped, the value of Delay_m(t) is not taken into account in our proposed algorithm, as described in Sects. 9.3.2 and 9.3.3. Likewise, the delay of a dropped task is not counted when we evaluate the average task delay of our proposed algorithm and the benchmark methods in Sect. 9.3.4.
- 2. The weights of the connections between the A&V layer and the output layer, as well as the biases of the neurons in the output layer, are given and fixed. Hence, we do not include them in the network parameter vector θ_m, which contains only the parameters that are adjustable through learning in the DQL-based algorithm.
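The fixed aggregation described in Note 2 corresponds to the standard dueling-DQN combination of the value and advantage streams. A minimal sketch under that assumption (the function name is our own):

```python
import numpy as np

def dueling_q_values(state_value, advantages):
    """Combine the value stream V(s) and the advantage stream A(s, a).

    Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a').
    Subtracting the mean advantage makes the decomposition identifiable;
    the combining weights are fixed, so they are not learned parameters
    and need not appear in the trainable parameter vector.
    """
    advantages = np.asarray(advantages, dtype=float)
    return state_value + advantages - advantages.mean()
```

For example, with V(s) = 2 and advantages [1, 3], the mean advantage is 2, giving Q-values [1, 3].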
Copyright information
Ā© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Tang, M., Wong, V.S. (2022). Deep Reinforcement Learning for Mobile Edge Computing Systems. In: Cai, L., Mark, B.L., Pan, J. (eds) Broadband Communications, Computing, and Control for Ubiquitous Intelligence. Wireless Networks. Springer, Cham. https://doi.org/10.1007/978-3-030-98064-1_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-98063-4
Online ISBN: 978-3-030-98064-1