The Role of Information in System Stability with Partially Observable Servers

  • Azam Asanjarani
  • Yoni Nazarathy


We present a methodology for analyzing the role of information in system stability. For this we consider a simple discrete-time controlled queueing system, where the controller chooses which server to use at each time slot and server performance varies according to a Markov-modulated random environment. At the extreme cases of information availability, that is, when there is either full information or no information, stability regions and maximally stabilizing policies are trivial. But in the more realistic cases where only the environment state of the selected server is observed, only the service successes are observed, or only the queue length is observed, finding throughput-maximizing control laws is a challenge. To handle these situations, we devise a Partially Observable Markov Decision Process (POMDP) formulation of the problem and illustrate properties of its solution. We further model the system under given decision rules, using a Quasi-Birth-and-Death (QBD) structure to find a matrix-analytic expression for the stability bound. We use this formulation to illustrate how the stability region grows as the number of controller belief states increases. The example that we consider in this paper is a case of two servers where the environment of each is modulated like a Gilbert-Elliott channel. As simple as this case seems, there appear to be no closed-form descriptions of the stability region under the various regimes considered. However, the numerical approximations to the POMDP Bellman equations and the numerical solutions of the QBDs agree with each other and hint at a variety of structural results.
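To make the two extreme information regimes concrete, the following is a minimal Python sketch and not the paper's model or code. It assumes two independent, identical servers, each with a two-state (bad/good) Gilbert-Elliott environment, and it assumes that a service attempt succeeds with probability 1 in the good state and 0 in the bad state. The transition probabilities `p_bg` (bad to good) and `p_gb` (good to bad) and all function names are hypothetical choices made for illustration only.

```python
def stationary_good_probability(p_bg: float, p_gb: float) -> float:
    """Stationary probability that a two-state environment is in the good state."""
    return p_bg / (p_bg + p_gb)


def propagate_belief(belief: float, p_bg: float, p_gb: float) -> float:
    """One-step belief update for a server that was NOT observed.

    `belief` is the controller's probability that the server is currently good.
    """
    return belief * (1.0 - p_gb) + (1.0 - belief) * p_bg


def belief_after_observation(observed_good: bool, p_bg: float, p_gb: float) -> float:
    """Belief for the next slot after the current state was observed exactly."""
    return 1.0 - p_gb if observed_good else p_bg


def throughput_extremes(p_bg: float, p_gb: float) -> tuple[float, float]:
    """Long-run service rates at the two information extremes (assumed model).

    No information: every server choice succeeds with the stationary
    good-state probability. Full information: a success occurs whenever
    at least one of the two independent servers is good.
    """
    pi_good = stationary_good_probability(p_bg, p_gb)
    no_information = pi_good
    full_information = 1.0 - (1.0 - pi_good) ** 2
    return no_information, full_information


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the paper.
    low, high = throughput_extremes(p_bg=0.2, p_gb=0.1)
    print(f"no information: {low:.4f}, full information: {high:.4f}")
```

Under these assumptions, the arrival rates that can be stabilized with partial observations lie between the two values returned by `throughput_extremes`. The belief recursions are the kind of state that a POMDP formulation would track for each server.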


Keywords

System stability; POMDP; Control; Queueing systems; Optimal; Bellman equations; QBD; Information; Markov models

Mathematics Subject Classification 2010

60J28 49L20 





Azam Asanjarani’s research is supported by the Australian Research Council Centre of Excellence for the Mathematical and Statistical Frontiers (ACEMS). Yoni Nazarathy’s research is supported by ARC grant DP180101602.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. The University of Auckland, Auckland, New Zealand
  2. The University of Queensland, Brisbane, Australia
