Abstract
This chapter presents some recent results in the area of optimization and Markov decision processes (MDPs). The work starts with performance sensitivity analysis of Markov processes and continues two decades of research on the optimization of discrete-event dynamic systems, in particular the theory of perturbation analysis (PA). We approach the MDP problem from a sensitivity point of view. This new perspective leads to new insights and offers a clear and concise explanation of the basic concepts and results in MDPs.
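As a small illustration of the sensitivity point of view (a sketch for this page, not code from the chapter), consider an ergodic finite Markov chain with transition matrix P, reward vector f, and stationary distribution pi. The performance potentials g solve a Poisson-type equation, and the derivative of the average reward eta = pi f along a perturbation direction dP is pi dP g. All function names below are illustrative:

```python
import numpy as np

def stationary(P):
    """Stationary distribution pi of an ergodic chain: pi P = pi, pi e = 1."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones(n)])  # pi (P - I) = 0 plus normalization
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def potentials(P, f):
    """Potentials g solving (I - P + e pi) g = f (one common normalization)."""
    n = P.shape[0]
    pi = stationary(P)
    return np.linalg.solve(np.eye(n) - P + np.outer(np.ones(n), pi), f)

def eta_derivative(P, dP, f):
    """Derivative of the average reward eta = pi f along direction dP."""
    return stationary(P) @ dP @ potentials(P, f)
```

For P + delta * dP (with each row of dP summing to zero), eta(delta) ≈ eta(0) + delta * eta_derivative(P, dP, f); adding a constant to g does not change the derivative, since the rows of dP sum to zero.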
Copyright information
© 2003 Springer-Verlag New York, Inc.
Cite this chapter
Cao, X. R. (2003). Performance Potential Based Optimization and MDPs. In: Stochastic Modeling and Optimization. Springer, New York, NY. https://doi.org/10.1007/978-0-387-21757-4_4
DOI: https://doi.org/10.1007/978-0-387-21757-4_4
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4419-3065-1
Online ISBN: 978-0-387-21757-4