Date: 18 Jun 2003

Continuous Time Markov Decision Processes with Expected Discounted Total Rewards



Abstract

This paper discusses continuous time Markov decision processes under the criterion of expected discounted total rewards, where the state space is countable, the reward rate function is extended real-valued, and the discount rate is a real number. Under conditions necessary for the model to be well defined, the state space is partitioned into three subsets, on which the optimal value function is positive infinity, negative infinity, or finite, respectively. Correspondingly, the model is reduced to three submodels by generalizing policies and eliminating some worst actions. Then, for the submodel with finite optimal value, the validity of the optimality equation is shown and some of its properties are obtained.

This research was supported by the National Natural Science Foundation of China, by the Institute of Applied Mathematics, Academia Sinica, and by a Grant-in-Aid for Scientific Research (No. 13650440), Japan.