Abstract
In this chapter, we study the expected average reward (EAR) criterion for the same MDP model as in Chap. 6. After briefly introducing some basic facts in Sect. 7.2, we establish the average reward optimality equation and the existence of EAR optimal policies in Sect. 7.3. In Sect. 7.4, we provide a policy iteration algorithm for computing, or at least approximating, an EAR optimal policy. Finally, we illustrate the chapter's results with several examples in Sect. 7.5.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Guo, X., Hernández-Lerma, O. (2009). Average Optimality for Unbounded Rewards. In: Continuous-Time Markov Decision Processes. Stochastic Modelling and Applied Probability, vol 62. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02547-1_7
DOI: https://doi.org/10.1007/978-3-642-02547-1_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02546-4
Online ISBN: 978-3-642-02547-1
eBook Packages: Mathematics and Statistics (R0)