Average-Reward Reinforcement Learning

  • Living reference work entry
  • In: Encyclopedia of Machine Learning and Data Mining

Recommended Reading

  • Abounadi J, Bertsekas DP, Borkar V (2002) Stochastic approximation for non-expansive maps: application to Q-learning algorithms. SIAM J Control Optim 41(1):1–22

  • Barto AG, Bradtke SJ, Singh SP (1995) Learning to act using real-time dynamic programming. Artif Intell 72(1):81–138

  • Bertsekas DP (1995) Dynamic programming and optimal control. Athena Scientific, Belmont

  • Brafman RI, Tennenholtz M (2002) R-MAX – a general polynomial time algorithm for near-optimal reinforcement learning. J Mach Learn Res 3:213–231

  • Crites RH, Barto AG (1998) Elevator group control using multiple reinforcement agents. Mach Learn 33(2/3):235–262

  • Ghavamzadeh M, Mahadevan S (2006) Hierarchical average reward reinforcement learning. J Mach Learn Res 13(2):197–229

  • Kearns M, Singh S (2002) Near-optimal reinforcement learning in polynomial time. Mach Learn 49(2/3):209–232

  • Mahadevan S (1996) Average reward reinforcement learning: foundations, algorithms, and empirical results. Mach Learn 22(1/2/3):159–195

  • Marbach P, Mihatsch O, Tsitsiklis JN (2000) Call admission control and routing in integrated service networks using neuro-dynamic programming. IEEE J Sel Areas Commun 18(2):197–208

  • Proper S, Tadepalli P (2006) Scaling model-based average-reward reinforcement learning for product delivery. In: European conference on machine learning, Berlin. Springer, pp 725–742

  • Puterman ML (1994) Markov decision processes: discrete stochastic dynamic programming. Wiley, New York

  • Schwartz A (1993) A reinforcement learning method for maximizing undiscounted rewards. In: Proceedings of the tenth international conference on machine learning, Amherst. Morgan Kaufmann, San Mateo, pp 298–305

  • Seri S, Tadepalli P (2002) Model-based hierarchical average-reward reinforcement learning. In: Proceedings of international machine learning conference, Sydney. Morgan Kaufmann, pp 562–569

  • Sutton R, Barto A (1998) Reinforcement learning: an introduction. MIT, Cambridge

  • Tadepalli P, Ok D (1998) Model-based average-reward reinforcement learning. Artif Intell 100:177–224

  • Tesauro G (1992) Practical issues in temporal difference learning. Mach Learn 8(3–4):257–277

  • Tsitsiklis J, Van Roy B (1999) Average cost temporal-difference learning. Automatica 35(11):1799–1808

  • Van Roy B, Tsitsiklis J (2002) On average versus discounted temporal-difference learning. Mach Learn 49(2/3):179–191

  • Wang G, Mahadevan S (1999) Hierarchical optimization of policy-coupled semi-Markov decision processes. In: Proceedings of the 16th international conference on machine learning, Bled, pp 464–473

Author information

Corresponding author

Correspondence to Prasad Tadepalli.

Copyright information

© 2014 Springer Science+Business Media New York

About this entry

Cite this entry

Tadepalli, P. (2014). Average-Reward Reinforcement Learning. In: Sammut, C., Webb, G. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7502-7_17-1

  • DOI: https://doi.org/10.1007/978-1-4899-7502-7_17-1

  • Publisher Name: Springer, Boston, MA

  • Online ISBN: 978-1-4899-7502-7

  • eBook Packages: Springer Reference Computer Sciences, Reference Module Computer Science and Engineering