Performance Potential Based Optimization and MDPs

Chapter in Stochastic Modeling and Optimization

Abstract

This chapter presents recent results on optimization and Markov decision processes (MDPs). The work starts with the performance sensitivity analysis of Markov processes and continues two decades of research on the optimization of discrete event dynamic systems, in particular the theory of perturbation analysis (PA). We approach the MDP problem from a sensitivity point of view; this perspective yields new insights and offers a clear, concise explanation of the basic concepts and results in MDPs.
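
As a concrete illustration of the potential-based viewpoint mentioned above, the sketch below computes the performance potentials of an ergodic finite-state Markov chain from the Poisson equation and uses them to drive policy iteration under the average-reward criterion. This is a minimal sketch in the standard textbook setting, not code from the chapter; the helper names and the small two-action example at the end are illustrative assumptions.

```python
# Potential-based policy iteration for an ergodic finite-state Markov chain
# (average-reward criterion).  The potential vector g solves the Poisson
# equation (I - P) g = f - eta * e, normalized so that pi . g = eta.
import numpy as np

def steady_state(P):
    """Stationary distribution pi of an ergodic transition matrix P (pi = pi P)."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones((1, n))])   # pi (P - I) = 0 and sum(pi) = 1
    b = np.zeros(n + 1)
    b[-1] = 1.0
    return np.linalg.lstsq(A, b, rcond=None)[0]

def potentials(P, f):
    """Average reward eta and potentials g from the Poisson equation."""
    n = P.shape[0]
    pi = steady_state(P)
    eta = float(pi @ f)
    # Fundamental-matrix form: g = (I - P + e pi)^{-1} f  gives pi . g = eta.
    g = np.linalg.solve(np.eye(n) - P + np.outer(np.ones(n), pi), f)
    return eta, g

def policy_iteration(P_act, f_act, max_iter=50):
    """P_act[a] and f_act[a] are the transition matrix and reward vector
    obtained when action a is applied in every state (illustrative layout)."""
    n = P_act[0].shape[0]
    policy = np.zeros(n, dtype=int)                      # start with action 0 everywhere
    for _ in range(max_iter):
        # Policy evaluation: assemble P, f for the current policy, get potentials.
        P = np.array([P_act[policy[i]][i] for i in range(n)])
        f = np.array([f_act[policy[i]][i] for i in range(n)])
        eta, g = potentials(P, f)
        # Policy improvement: in each state pick the action maximizing
        # f(i, a) + sum_j P_a(i, j) g(j), which uses only the current potentials.
        improved = np.array([max(range(len(P_act)),
                                 key=lambda a, i=i: f_act[a][i] + P_act[a][i] @ g)
                             for i in range(n)])
        if np.array_equal(improved, policy):
            break
        policy = improved
    return policy, eta

if __name__ == "__main__":
    # Two states, two actions; the numbers are made up purely for illustration.
    P_act = [np.array([[0.9, 0.1], [0.2, 0.8]]),
             np.array([[0.5, 0.5], [0.6, 0.4]])]
    f_act = [np.array([1.0, 0.0]), np.array([0.5, 2.0])]
    best_policy, best_eta = policy_iteration(P_act, f_act)
    print("policy:", best_policy, "average reward:", best_eta)
```

The improvement step depends only on the potentials of the current policy, which reflects the sensitivity point of view described in the abstract.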




Author: X. R. Cao


Copyright information

© 2003 Springer-Verlag New York, Inc.

About this chapter

Cite this chapter

Cao, XR. (2003). Performance Potential Based Optimization and MDPs. In: Stochastic Modeling and Optimization. Springer, New York, NY. https://doi.org/10.1007/978-0-387-21757-4_4

  • DOI: https://doi.org/10.1007/978-0-387-21757-4_4

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4419-3065-1

  • Online ISBN: 978-0-387-21757-4

  • eBook Packages: Springer Book Archive
