Mobile Networks and Applications, Volume 11, Issue 1, pp 101–110

Efficient QoS Provisioning for Adaptive Multimedia in Mobile Communication Networks by Reinforcement Learning

  • Yu Fei
  • Vincent W. S. Wong
  • Victor C. M. Leung


The scarcity and large fluctuations of link bandwidth in wireless networks have motivated the development of adaptive multimedia services in mobile communication networks, where it is possible to increase or decrease the bandwidth of individual ongoing flows. This paper studies the issues of quality of service (QoS) provisioning in such systems. In particular, call admission control and bandwidth adaptation are formulated as a constrained Markov decision problem. The rapid growth in the number of states and the difficulty in estimating state transition probabilities in practical systems make it very difficult to employ classical methods to find the optimal policy. We present a novel approach that uses a form of discounted reward reinforcement learning known as Q-learning to solve QoS provisioning for wireless adaptive multimedia. Q-learning does not require the explicit state transition model to solve the Markov decision problem; therefore more general and realistic assumptions can be applied to the underlying system model for this approach than in previous schemes. Moreover, the proposed scheme can efficiently handle the large state space and action set of the wireless adaptive multimedia QoS provisioning problem. Handoff dropping probability and average allocated bandwidth are considered as QoS constraints in our model and can be guaranteed simultaneously. Simulation results demonstrate the effectiveness of the proposed scheme in adaptive multimedia mobile communication networks.
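As a rough illustration of why Q-learning needs no explicit state transition model, the sketch below applies the standard tabular Q-learning update of Watkins and Dayan to a toy call admission problem. It is not the scheme proposed in the paper, which also covers bandwidth adaptation and the two QoS constraints. The state (number of busy channels), the two actions, the reward values, the call completion probability, and all function names are hypothetical choices made only for this example.

```python
import random

ACTIONS = ("reject", "admit")
CAPACITY = 3  # hypothetical number of channels in a single cell


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    Only an observed sample (state, action, reward, next_state) is used;
    no transition probabilities appear anywhere in the update.
    """
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


def step(busy, action, rng):
    """Toy simulator (an assumption) that only generates samples for the learner."""
    reward = 0.0
    if action == "admit" and busy < CAPACITY:
        busy += 1
        reward = 1.0  # hypothetical revenue for an admitted call
    # Each ongoing call ends independently with probability 0.3 (assumed).
    busy -= sum(1 for _ in range(busy) if rng.random() < 0.3)
    return busy, reward


def train(steps=2000, epsilon=0.1, seed=0):
    """Run epsilon-greedy Q-learning on the toy simulator and return the Q-table."""
    rng = random.Random(seed)
    q = {}
    busy = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # explore
        else:
            action = max(ACTIONS, key=lambda a: q.get((busy, a), 0.0))  # exploit
        next_busy, reward = step(busy, action, rng)
        q_update(q, busy, action, reward, next_busy, ACTIONS)
        busy = next_busy
    return q
```

Calling `train()` returns a dictionary keyed by `(busy_channels, action)` pairs. In this sketch the learner interacts with `step` only through sampled outcomes, which is the property the abstract relies on when it says that more general system models can be accommodated.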


Keywords: QoS, adaptive multimedia, mobile communication networks, reinforcement learning





Copyright information

© Springer Science + Business Media, Inc 2005

Authors and Affiliations

  • Yu Fei (1)
  • Vincent W. S. Wong (1)
  • Victor C. M. Leung (1)

  1. Department of Electrical and Computer Engineering, The University of British Columbia, Vancouver, Canada
