One-Step Improvement Ideas and Computational Aspects

Markov Decision Processes in Practice

Part of the book series: International Series in Operations Research & Management Science (ISOR, volume 248)

Abstract

In this contribution we give a down-to-earth discussion of basic ideas for solving practical Markov decision problems. The emphasis is on the concept of the policy-improvement step for average cost optimization. This concept provides a flexible method of improving a given policy. By appropriately designing the policy-improvement step in specific applications, tailor-made algorithms may be developed to generate the best control rule within a class of control rules characterized by a few parameters. Also, in decision problems with an intractable multi-dimensional state space, decomposition and a once-only application of the policy-improvement step may lead to a good heuristic rule. These useful features of the policy-improvement concept are illustrated with a queueing control problem with variable service rate and with the dynamic routing of arrivals to parallel queues. In the final section, we deal with the concept of the one-stage-look-ahead rule in optimal stopping and give several applications.
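To make the policy-improvement concept concrete, the following is a minimal sketch of policy iteration for an average-cost MDP: evaluate a fixed policy by solving the value-determination (Poisson) equations for its average cost g and relative values v, then apply the policy-improvement step in each state. The 3-state, 2-action instance and all its numbers are hypothetical, chosen only for illustration; they are not taken from the chapter.

```python
import numpy as np

# Hypothetical 3-state, 2-action average-cost MDP (all numbers illustrative).
# P[a] is the transition matrix under action a; c[a] the one-step costs.
P = {
    0: np.array([[0.5, 0.5, 0.0],
                 [0.2, 0.6, 0.2],
                 [0.0, 0.5, 0.5]]),
    1: np.array([[0.9, 0.1, 0.0],
                 [0.1, 0.8, 0.1],
                 [0.0, 0.1, 0.9]]),
}
c = {0: np.array([1.0, 2.0, 3.0]),
     1: np.array([2.0, 1.0, 1.0])}
n = 3

def evaluate(policy):
    """Solve the value-determination equations
    g + v[i] = c[i] + sum_j p_ij * v[j], normalized by v[n-1] = 0."""
    Pp = np.array([P[policy[i]][i] for i in range(n)])
    cp = np.array([c[policy[i]][i] for i in range(n)])
    # Unknowns: g, v[0], ..., v[n-2]  (v[n-1] is fixed at 0).
    A = np.zeros((n, n))
    A[:, 0] = 1.0                       # coefficient of g in every equation
    for i in range(n):
        for j in range(n - 1):
            A[i, j + 1] = (1.0 if i == j else 0.0) - Pp[i, j]
    x = np.linalg.solve(A, cp)
    g, v = x[0], np.append(x[1:], 0.0)
    return g, v

def improve(v, policy):
    """Policy-improvement step: in each state pick the action minimizing the
    one-step cost plus expected relative value of the next state."""
    new = []
    for i in range(n):
        q = [c[a][i] + P[a][i] @ v for a in (0, 1)]
        best = int(np.argmin(q))
        # Keep the current action on (near-)ties to guarantee termination.
        if q[policy[i]] <= q[best] + 1e-12:
            best = policy[i]
        new.append(best)
    return new

policy = [0, 0, 0]
while True:
    g, v = evaluate(policy)
    new_policy = improve(v, policy)
    if new_policy == policy:
        break
    policy = new_policy
```

The same improvement step, applied only once to a tractable base policy, is what yields the heuristic rules for the queueing and routing applications discussed in the chapter.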



Author information

Correspondence to Henk Tijms.


Copyright information

© 2017 Springer International Publishing AG

Cite this chapter

Tijms, H. (2017). One-Step Improvement Ideas and Computational Aspects. In: Boucherie, R., van Dijk, N. (eds) Markov Decision Processes in Practice. International Series in Operations Research & Management Science, vol 248. Springer, Cham. https://doi.org/10.1007/978-3-319-47766-4_1
