Abstract
This paper studies continuous-time Markov decision processes under the criterion of expected discounted total rewards, where the state space is countable, the reward rate function is extended real-valued, and the discount rate is a real number. Under necessary conditions for the model to be well defined, the state space is partitioned into three subsets, on which the optimal value function is positive infinity, negative infinity, or finite, respectively. Correspondingly, the model is reduced to three submodels by generalizing policies and eliminating some worst actions. For the submodel with finite optimal value, the validity of the optimality equation is shown and some of its properties are obtained.
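The discounted criterion above can be illustrated numerically. The following is a minimal sketch, not the paper's construction: it assumes a toy finite-state CTMDP (the states, transition-rate matrices, reward rates, and discount rate are all hypothetical) and computes the optimal discounted value by uniformization, which reduces the continuous-time problem to an equivalent discrete-time one with discount factor Λ/(Λ+α). In the paper's setting the state space is countable and rewards may be unbounded, so the value can be infinite on part of the state space; in this bounded toy example it is finite everywhere.

```python
import numpy as np

# Hypothetical toy data: two states, two actions.
alpha = 0.1                        # discount rate
states = [0, 1]
actions = [0, 1]
# Transition-rate matrices Q_a: off-diagonals >= 0, rows sum to 0.
Q = {0: np.array([[-1.0, 1.0], [2.0, -2.0]]),
     1: np.array([[-3.0, 3.0], [0.5, -0.5]])}
# Reward rates r(s, a), indexed as r[a][s].
r = {0: np.array([1.0, -1.0]),
     1: np.array([2.0, 0.0])}

# Uniformization constant: Lambda >= every total exit rate -Q_a(s, s).
Lam = max(-Q[a][s, s] for a in actions for s in states)
beta = Lam / (Lam + alpha)         # discrete-time discount factor

# Value iteration on the uniformized discrete-time MDP:
# V(s) = max_a [ r(s,a)/(Lam+alpha) + beta * sum_{s'} P_a(s,s') V(s') ],
# where P_a = I + Q_a / Lam is a stochastic matrix.
V = np.zeros(len(states))
for _ in range(1000):
    V_new = np.empty_like(V)
    for s in states:
        vals = []
        for a in actions:
            P_row = np.eye(len(states))[s] + Q[a][s] / Lam
            vals.append(r[a][s] / (Lam + alpha) + beta * (P_row @ V))
        V_new[s] = max(vals)
    if np.max(np.abs(V_new - V)) < 1e-10:
        V = V_new
        break
    V = V_new
```

Since `beta < 1`, the iteration is a contraction and converges to the unique fixed point of the optimality equation for this bounded example.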
This research was supported by the National Natural Science Foundation of China, by the Institute of Applied Mathematics, Academia Sinica, and by a Grant-in-Aid for Scientific Research (No. 13650440), Japan.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
Cite this paper
Hu, Q., Liu, J., Yue, W. (2003). Continuous Time Markov Decision Processes with Expected Discounted Total Rewards. In: Sloot, P.M.A., Abramson, D., Bogdanov, A.V., Gorbachev, Y.E., Dongarra, J.J., Zomaya, A.Y. (eds) Computational Science — ICCS 2003. ICCS 2003. Lecture Notes in Computer Science, vol 2658. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44862-4_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40195-7
Online ISBN: 978-3-540-44862-4