Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, Chichester (2005)
Cassandra, A.: Exact and Approximate Algorithms for Partially Observable Markov Decision Processes. Ph.D. Thesis, Brown University (1998)
Pyeatt, L.: Integration of Partially Observable Markov Decision Processes and Reinforcement Learning for Simulated Robot Navigation. Ph.D. Thesis, Colorado State University (1999)
Bertsekas, D.P., Tsitsiklis, J.N.: Neuro-Dynamic Programming. Athena Scientific (1996)
Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. The MIT Press, Cambridge (1998)
Lew, A., Mauch, H.: Dynamic Programming: A Computational Tool. Springer, Berlin (2007)
Reynolds, S.I.: Reinforcement Learning with Exploration. Ph.D. Thesis, School of Computer Science, University of Birmingham, UK (2002)
Van Roy, B.: Neuro-Dynamic Programming: Overview and Recent Trends. In: Feinberg, E.A., Schwartz, A. (eds.) Handbook of Markov Decision Processes: Methods and Applications. Kluwer Academic, Dordrecht (2002)
Si, J., et al.: Handbook of Learning and Approximate Dynamic Programming. Wiley InterScience, Hoboken (2004)
Chang, H.S., et al.: A Survey of Some Simulation-Based Algorithms for Markov Decision Processes. Communications in Information and Systems 7(1), 59–92 (2007)
Smith, J.E., McCardle, K.F.: Structural Properties of Stochastic Dynamic Programs. Operations Research 50, 796–809 (2002)
Fu, M.C., et al.: Monotone optimal policies for queuing staffing problem. Operations Research 46, 327–331 (2000)
Givan, R., et al.: Bounded Markov Decision Processes. Artificial Intelligence 122, 71–109 (2000)
Kaelbling, L.P., Littman, M.L., Moore, A.W.: Reinforcement Learning: A Survey. Journal of Artificial Intelligence Research 4, 237–285 (1996)
Gordon, G.J.: Approximate Solution to Markov Decision Processes. Ph.D. Thesis, School of Computer Science, Carnegie Mellon University (1999)
de Farias, D.P., Van Roy, B.: On the Existence of Fixed Points for Approximate Value Iteration and Temporal-Difference Learning. Journal of Optimization Theory and Applications 105(3), 589–608 (2000)
Royden, H.: Real Analysis, 3rd edn. Prentice Hall, Englewood Cliffs (1988)
Hu, Q., Yue, W.: Markov Decision Processes with Their Applications. Springer Science+Business Media, LLC (2008)
Chang, H.S., et al.: Simulation-based Algorithms for Markov Decision Processes. Springer, London (2007)
Fernandez, F., Veloso, M.: Exploration and Policy Reuse. Technical Report, School of Computer Science, Carnegie Mellon University (2005)
Fernandez, F., Veloso, M.: Probabilistic Reuse of Past Policies. Technical Report, School of Computer Science, Carnegie Mellon University (2005)
Fernandez, F., Veloso, M.: Building a Library of Policies through Policy Reuse. Technical Report, School of Computer Science, Carnegie Mellon University (2005)
Bernstein, D.S.: Reusing Old Policies to Accelerate Learning on New Markov Decision Processes. Technical Report, University of Massachusetts (1999)
Zhang, N.L., Zhang, W.: Speeding Up the Convergence of Value Iteration in Partially Observable Markov Decision Processes. Journal of Artificial Intelligence Research 14, 29–51 (2001)
Hansen, E.A.: An Improved Policy Iteration for Partially Observable Markov Decision Processes. In: Proceedings of 10th Neural Information Processing Systems Conference (1997)
Sallans, B.: Reinforcement Learning for Factored Markov Decision Processes. Ph.D. Thesis, Graduate Department of Computer Science, University of Toronto (2002)
Ogata, K.: Discrete-Time Control Systems, 2nd edn. Prentice Hall, Englewood Cliffs (1994)