Multi-agent Deep Reinforcement Learning Based Adaptive User Association in Heterogeneous Networks

  • Weiwen Yi
  • Xing Zhang
  • Wenbo Wang
  • Jing Li
Conference paper
Part of the Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering book series (LNICST, volume 262)


User association in increasingly complex 5G heterogeneous networks poses many technical challenges. Under distributed multiple attribute decision making (MADM) algorithms, users lack cooperation and selfishly maximize their own utilities, which leads to congestion. Artificial intelligence is a promising way to address these problems, since it enables users to learn with incomplete environment information. In this paper, we propose an adaptive user association approach based on multi-agent deep reinforcement learning (RL) that considers various user equipment types and femtocell access mechanisms. It aims to achieve a desirable trade-off between Quality of Experience (QoE) and load balancing. We formulate user association as a Markov Decision Process and use a deep RL approach, a semi-distributed deep Q-network (DQN), to obtain the optimal strategy. The individual reward is defined as a function of transmission rate and base station load, which are adaptively balanced by a designed weight. Simulation results show that DQN with adaptive weight achieves the highest average reward compared with DQN with fixed weight and MADM, which indicates that it obtains the best trade-off between QoE and load balancing. Compared with MADM, our approach achieves improvements of \(4\%\sim 11\%\), \(32\%\sim 40\%\), and \(99\%\) in QoE, load balancing, and blocking probability, respectively. Furthermore, the semi-distributed framework reduces computational complexity.
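The reward described above, a function of transmission rate and base station load balanced by a weight, might be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the actual reward formula or the weight design, so the linear form, the normalisation by `max_rate`, and the helper `adaptive_weight` are all hypothetical.

```python
def clamp01(x: float) -> float:
    """Clamp a value to the interval [0, 1]."""
    return min(max(x, 0.0), 1.0)


def adaptive_weight(load: float) -> float:
    """Hypothetical weight rule (an assumption, not the paper's design):
    the fuller the base station, the less weight the rate term receives."""
    return 1.0 - clamp01(load)


def reward(rate: float, max_rate: float, load: float) -> float:
    """Assumed linear trade-off between normalised rate and load.

    rate     -- transmission rate the user would obtain
    max_rate -- rate used for normalisation (hypothetical parameter)
    load     -- base station load as a fraction in [0, 1]
    """
    w = adaptive_weight(load)
    rate_utility = clamp01(rate / max_rate)
    # Reward high rate, penalise association with a loaded base station.
    return w * rate_utility - (1.0 - w) * clamp01(load)
```

With this assumed form, an agent attaching to an empty base station at full rate receives the maximum reward, while attaching to a fully loaded one is penalised regardless of rate.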


Heterogeneous networks · User association · Multi-agent · Deep Q-network



This work is supported by the National Science Foundation of China (NSFC) under grants 61771065, 61571054 and 61631005.



Copyright information

© ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2019

Authors and Affiliations

  1. Wireless Signal Processing and Network Laboratory, Beijing University of Posts and Telecommunications, Beijing, People’s Republic of China
