
The Role of Information in System Stability with Partially Observable Servers


We present a methodology for analyzing the role of information in system stability. To this end, we consider a simple discrete-time controlled queueing system in which the controller chooses which server to use in each time slot, and server performance varies according to a Markov-modulated random environment. In the extreme cases of information availability, namely full information or no information, the stability regions and maximally stabilizing policies are trivial. In the more realistic cases, where the controller observes only the environment state of the selected server, only the service successes, or only the queue length, finding throughput-maximizing control laws is a challenge. To handle these situations, we devise a Partially Observable Markov Decision Process (POMDP) formulation of the problem and illustrate properties of its solution. We further model the system under given decision rules, using a Quasi-Birth-and-Death (QBD) structure to obtain a matrix-analytic expression for the stability bound. We use this formulation to illustrate how the stability region grows as the number of controller belief states increases. The example considered in this paper is a two-server system in which the environment of each server is modulated like a Gilbert-Elliott channel. As simple as this case seems, there appear to be no closed-form descriptions of the stability region under the various regimes considered. However, numerical approximations to the POMDP Bellman equations and numerical solutions of the QBDs, which agree with each other, hint at a variety of structural results.
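To make the two extreme information regimes concrete, the following Python sketch computes illustrative throughput bounds for a two-state (good/bad) server environment. It is not taken from the paper: the transition matrix `P`, the success probabilities `mu`, the function names, and the assumption that the two servers are independent and identically distributed are all hypothetical choices made for illustration. Under these assumptions, a controller with no information can do no better than the stationary success rate of a single server, while a controller with full information can always pick the server in the better state. The sketch also shows the belief propagation step a POMDP controller would apply to a server it does not observe.

```python
import numpy as np

def stationary(P):
    """Stationary distribution of an irreducible transition matrix P."""
    n = P.shape[0]
    # Solve pi (P - I) = 0 together with sum(pi) = 1 in the least-squares sense.
    A = np.vstack([(P - np.eye(n)).T, np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def propagate_belief(omega, P):
    """One-step belief update for an unobserved server.

    omega is the probability that the environment is in state 0 ("good").
    """
    return omega * P[0, 0] + (1.0 - omega) * P[1, 0]

def no_info_bound(P, mu):
    """Long-run success rate when one server is used without observations."""
    return float(stationary(P) @ mu)

def full_info_bound(P, mu):
    """Long-run success rate when both states are seen and the better one is used.

    Assumes two independent servers with the same P and mu (illustrative only).
    """
    pi = stationary(P)
    best = np.maximum.outer(mu, mu)   # best[i, j] = max(mu[i], mu[j])
    return float(pi @ best @ pi)

# Hypothetical parameters: state 0 = good, state 1 = bad.
P = np.array([[0.9, 0.1],
              [0.3, 0.7]])
mu = np.array([0.8, 0.2])   # service success probability in each state

lower = no_info_bound(P, mu)
upper = full_info_bound(P, mu)

# Without observations, the belief drifts to the stationary probability.
omega = 1.0
for _ in range(200):
    omega = propagate_belief(omega, P)

print(lower, upper, omega)
```

With these illustrative numbers the arrival rate would need to stay below roughly 0.65 with no information and below roughly 0.7625 with full information. The partially observable regimes discussed in the abstract would fall between these two values.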





Azam Asanjarani’s research is supported by the Australian Research Council Centre of Excellence for the Mathematical and Statistical Frontiers (ACEMS). Yoni Nazarathy’s research is supported by ARC grant DP180101602.

Author information



Corresponding author

Correspondence to Azam Asanjarani.


Cite this article

Asanjarani, A., Nazarathy, Y. The Role of Information in System Stability with Partially Observable Servers. Methodol Comput Appl Probab 22, 949–968 (2020).



Keywords

  • System stability
  • Control
  • Queueing systems
  • Optimal
  • Bellman equations
  • QBD
  • Information
  • Markov models

Mathematics Subject Classification 2010

  • 60J28
  • 49L20