The Relations Among Potentials, Perturbation Analysis, and Markov Decision Processes

Cao, Xi-Ren

doi:10.1023/A:1008260528575

Published: March 1998

Discrete Event Dynamic Systems, Volume 8, pages 71–87 (1998)

Abstract

This paper provides an introductory discussion of an important concept, the performance potentials of Markov processes, and of its relations with perturbation analysis (PA), average-cost Markov decision processes (MDPs), Poisson equations, α-potentials, the fundamental matrix, and the group inverse of the transition matrix (or of the infinitesimal generator). Applications to single-sample-path-based performance sensitivity estimation and performance optimization are also discussed. On-line algorithms for performance sensitivity estimation and on-line schemes for policy iteration are presented. The approach is closely related to reinforcement learning algorithms.
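
To make the link between potentials, the Poisson equation, and sensitivity analysis concrete, here is a minimal sketch in Python for a two-state discrete-time chain. It is not taken from the paper. It assumes the usual form of the Poisson equation, (I − P)g + ηe = f, where π is the stationary distribution, η = πf is the average reward, and e is the all-ones vector. It also assumes the standard sensitivity formulas for an unchanged reward f: η′ − η = π′(P′ − P)g and dη/dδ = π(P′ − P)g. The matrices, the reward vector, and the function names `stationary` and `potentials` are hypothetical, chosen only for illustration.

```python
def stationary(P, iters=2000):
    """Approximate the stationary distribution pi (pi = pi P) by power iteration.

    Assumes P is the transition matrix of an ergodic, aperiodic chain.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def potentials(P, f, iters=2000):
    """Return (pi, eta, g), where g solves (I - P) g + eta * e = f.

    g is built as the partial sums of sum_n P^n (f - eta * e), so it is
    normalized by pi . g = 0; potentials are unique only up to a constant.
    """
    n = len(P)
    pi = stationary(P, iters)
    eta = sum(pi[i] * f[i] for i in range(n))
    g = [0.0] * n
    for _ in range(iters):
        g = [f[i] - eta + sum(P[i][j] * g[j] for j in range(n)) for i in range(n)]
    return pi, eta, g


# Hypothetical example data: nominal chain, perturbed chain, reward vector.
P = [[0.9, 0.1], [0.3, 0.7]]
P_new = [[0.8, 0.2], [0.3, 0.7]]
f = [4.0, 0.0]

pi, eta, g = potentials(P, f)
pi_new, eta_new, _ = potentials(P_new, f)

# Q g, where Q = P_new - P is the direction of the perturbation.
Qg = [sum((P_new[i][j] - P[i][j]) * g[j] for j in range(2)) for i in range(2)]

# Exact difference eta_new - eta, expressed through the potentials of P.
difference = sum(pi_new[i] * Qg[i] for i in range(2))

# Derivative of eta along P + delta * Q at delta = 0.
derivative = sum(pi[i] * Qg[i] for i in range(2))

print(eta, eta_new, g, difference, derivative)
```

For this chain the difference formula reproduces η′ − η using only the potentials of the nominal chain and the stationary distribution of the perturbed one. Two policies can be compared the same way, which is the idea behind policy iteration. The iterative solver is used here only to avoid a linear-algebra dependency; solving the linear system directly would work equally well.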

Author information

Authors and Affiliations

The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
Xi-Ren Cao

About this article

Cite this article

Cao, XR. The Relations Among Potentials, Perturbation Analysis, and Markov Decision Processes. Discrete Event Dynamic Systems 8, 71–87 (1998). https://doi.org/10.1023/A:1008260528575

Issue Date: March 1998
DOI: https://doi.org/10.1023/A:1008260528575
