In most Markov decision process applications, the decision-maker receives a reward each period. This reward can depend on the current state $s$, the action taken $a$, and the next state $s'$; in period $t$ it is denoted by $r_t(s, a, s')$.
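As a minimal sketch of this notation, the Python snippet below writes a reward as a function of the period, current state, action, and next state. The machine-maintenance setting, the state and action names, the numeric values, and the transition table `P` are all hypothetical and chosen only for illustration. The helper `expected_reward` shows one common way such a reward is used: averaging over the next state with the transition probabilities to get an expected one-period reward for a state-action pair.

```python
from typing import Dict, Tuple

# Hypothetical transition probabilities p(s' | s, a) for a toy
# machine-maintenance problem (illustrative values, not from the entry).
P: Dict[Tuple[str, str], Dict[str, float]] = {
    ("working", "run"): {"working": 0.9, "broken": 0.1},
    ("working", "repair"): {"working": 1.0},
    ("broken", "run"): {"broken": 1.0},
    ("broken", "repair"): {"working": 0.8, "broken": 0.2},
}


def reward(t: int, s: str, a: str, s_next: str) -> float:
    """Reward r_t(s, a, s') received in period t.

    Assumed structure: revenue for running a working machine, a cost
    for repairing, and a penalty when the next state is "broken".
    The period t is accepted but unused, so this reward is stationary.
    """
    value = 0.0
    if s == "working" and a == "run":
        value += 10.0  # revenue earned this period
    if a == "repair":
        value -= 4.0   # repair cost
    if s_next == "broken":
        value -= 5.0   # penalty that depends on the next state
    return value


def expected_reward(t: int, s: str, a: str) -> float:
    """Expected one-period reward: sum over s' of p(s' | s, a) * r_t(s, a, s')."""
    return sum(p * reward(t, s, a, s_next) for s_next, p in P[(s, a)].items())


if __name__ == "__main__":
    print(reward(0, "working", "run", "broken"))
    print(expected_reward(0, "working", "run"))
```

Passing `t` explicitly keeps the signature aligned with the time-indexed notation, so a nonstationary reward could be modeled by letting the returned value depend on `t`.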
© 2011 Springer Science+Business Media, LLC
(2011). Reward. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-30164-8_729
DOI: https://doi.org/10.1007/978-0-387-30164-8_729
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-30768-8
Online ISBN: 978-0-387-30164-8
eBook Packages: Computer Science, Reference Module Computer Science and Engineering