In this chapter, we study average optimality in discrete time Markov decision processes with a countable state space and measurable action sets. The average criterion differs from the discounted criterion. Under the discounted criterion, the reward at period n is discounted to period 0 by multiplying it by β^n, where β is the discount factor. Hence, the smaller n is, the more the reward of period n matters in the criterion; conversely, the larger n is, the less it matters. Under the average criterion, by contrast, the reward in any single period contributes nothing to the criterion; only the long-run trend of the reward is considered.
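The contrast between the two criteria can be illustrated with a small numerical sketch. The code below is not taken from the chapter: the function names, the reward streams, the discount factor 0.9, and the finite horizons are all assumptions chosen for illustration, and the finite-horizon average only approximates the long-run average criterion. It adds a one-off bonus to the reward of period 0 and compares how much each criterion changes.

```python
def discounted_value(rewards, beta):
    """Finite-horizon discounted sum: sum over n of beta**n * rewards[n]."""
    return sum((beta ** n) * r for n, r in enumerate(rewards))


def average_value(rewards):
    """Average reward per period over the finite horizon len(rewards)."""
    return sum(rewards) / len(rewards)


def streams(horizon, bonus=100.0):
    """Return a constant reward stream and a copy with a bonus in period 0 only.

    Both streams are hypothetical examples, not data from the chapter.
    """
    base = [1.0] * horizon
    bumped = [1.0 + bonus] + [1.0] * (horizon - 1)
    return base, bumped


beta = 0.9  # assumed discount factor

# Discounted criterion: the period-0 bonus enters with weight beta**0 = 1,
# so it shifts the value by the full bonus.
base, bumped = streams(1000)
disc_gap = discounted_value(bumped, beta) - discounted_value(base, beta)

# Average criterion: the same bonus is spread over the whole horizon,
# so its effect shrinks toward zero as the horizon grows.
avg_gaps = []
for horizon in (10, 1000, 100000):
    base, bumped = streams(horizon)
    avg_gaps.append(average_value(bumped) - average_value(base))

print("discounted gap:", disc_gap)
print("average gaps by horizon:", avg_gaps)
```

In this sketch the discounted value moves by about the size of the bonus, while the gap in the average shrinks in proportion to 1/horizon. This is the sense in which a reward in any single period has no effect on the average criterion in the limit.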
Copyright information
© 2008 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
(2008). Discrete Time Markov Decision Processes: Average Criterion. In: Markov Decision Processes With Their Applications. Advances in Mechanics and Mathematics, vol 14. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-36951-8_3
DOI: https://doi.org/10.1007/978-0-387-36951-8_3
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-36950-1
Online ISBN: 978-0-387-36951-8
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)