Actor-critic deep learning for efficient user association and bandwidth allocation in dense mobile networks with green base stations
In this paper, we introduce an efficient user-association and bandwidth-allocation scheme based on an actor-critic deep learning framework for downlink data transmission in dense mobile networks. In such a network, small cells are densely deployed within a single macrocell and share the same spectrum band with the macrocell. The small-cell base stations are called green base stations because they are powered solely by solar-energy harvesters. We therefore propose an actor-critic deep learning (ACDL) algorithm to maximize long-term network performance while adhering to constraints on harvested energy and spectrum sharing. To this end, the agent of the ACDL algorithm learns an optimal user-association and bandwidth-allocation policy by interacting with the network environment. We first formulate the optimization problem as a Markov decision process, in which the agent learns the evolution of the environment through trial-and-error experience. We then use deep neural networks to model the policy function and the value function in the actor and the critic of the agent, respectively. The actor selects an action based on the output of the policy network, while the critic uses the output of the value network to help the actor evaluate the action taken. Numerical results demonstrate that the proposed algorithm enhances network performance in the long run.
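The actor-critic interaction described above (the actor samples an action from a policy, the critic scores it with a value-based temporal-difference error, and both are updated accordingly) can be illustrated with a minimal sketch. This is not the paper's ACDL algorithm: the toy "load level" environment, the tabular softmax policy standing in for the policy network, and the value table standing in for the value network are all simplifying assumptions made here for illustration.

```python
import numpy as np

def softmax(x):
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

class ToyCellEnv:
    """Toy stand-in (an assumption, not the authors' model): the state is
    a discretized network 'load level' 0..n_states-1, each action (e.g.
    associating a user with one of n_actions base stations) stochastically
    shifts the load, and rewards favor keeping the load low."""
    def __init__(self, n_states=5, n_actions=3, seed=0):
        self.n_states, self.n_actions = n_states, n_actions
        self.rng = np.random.default_rng(seed)
        self.state = 0

    def reset(self):
        self.state = int(self.rng.integers(self.n_states))
        return self.state

    def step(self, action):
        drift = action - 1 + int(self.rng.integers(-1, 2))
        self.state = int(np.clip(self.state + drift, 0, self.n_states - 1))
        reward = -float(self.state)  # lower load -> higher reward
        return self.state, reward

def train_actor_critic(env, episodes=50, steps=20, gamma=0.95,
                       alpha_actor=0.1, alpha_critic=0.2):
    """One-step actor-critic: a softmax over per-state action preferences
    plays the role of the policy network, and a per-state value table
    plays the role of the value network."""
    theta = np.zeros((env.n_states, env.n_actions))   # actor parameters
    v = np.zeros(env.n_states)                        # critic values
    returns = []
    for _ in range(episodes):
        s = env.reset()
        total = 0.0
        for _ in range(steps):
            probs = softmax(theta[s])
            a = int(env.rng.choice(env.n_actions, p=probs))
            s2, r = env.step(a)
            td_error = r + gamma * v[s2] - v[s]       # critic evaluates the action
            v[s] += alpha_critic * td_error           # critic update
            grad = -probs
            grad[a] += 1.0                            # d log pi(a|s) / d theta[s]
            theta[s] += alpha_actor * td_error * grad # actor update
            s, total = s2, total + r
        returns.append(total)
    return theta, v, returns
```

In the full ACDL setting, `theta` and `v` would be replaced by deep neural networks trained on the same TD error signal, and the state would encode the harvested-energy and spectrum-sharing constraints.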
Keywords: Actor-critic · Deep learning · Energy harvesting · Resource allocation
This work was supported by the National Research Foundation of Korea (NRF) grant through the Korean Government (MSIT) under Grant NRF-2018R1A2B6001714.