Reward Shaping

Synonyms

Heuristic rewards; Reward selection

Definition

Reward shaping is a technique, inspired by animal training, in which supplemental rewards are provided to make a problem easier to learn. Most problems have an obvious natural reward: for games, it is usually a win or a loss; for financial problems, it is usually profit. Reward shaping augments this natural reward signal with additional rewards for making progress toward a good solution.
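
As a rough illustration of this definition, the Python sketch below augments a sparse natural reward with a supplemental reward for progress. Everything in it is an assumption made for the example and does not come from the entry: the 5x5 grid, the goal cell GOAL, the Manhattan-distance progress measure, the scale of 0.1, and all function names are hypothetical.

```python
from typing import Tuple

State = Tuple[int, int]
GOAL: State = (4, 4)  # hypothetical goal cell on an assumed 5x5 grid


def natural_reward(next_state: State) -> float:
    """Sparse natural reward: 1 only when the goal is reached."""
    return 1.0 if next_state == GOAL else 0.0


def distance_to_goal(state: State) -> int:
    """Manhattan distance to the goal, a hand-picked measure of progress."""
    return abs(state[0] - GOAL[0]) + abs(state[1] - GOAL[1])


def shaping_reward(state: State, next_state: State, scale: float = 0.1) -> float:
    """Supplemental reward: positive when a step moves closer to the goal,
    negative when it moves away (scale is an arbitrary illustrative value)."""
    return scale * (distance_to_goal(state) - distance_to_goal(next_state))


def shaped_reward(state: State, next_state: State) -> float:
    """Natural reward augmented with the supplemental shaping term."""
    return natural_reward(next_state) + shaping_reward(state, next_state)
```

With these assumed definitions, a step from (0, 0) to (0, 1) earns a small positive reward even though the natural reward is still zero, so the learner receives feedback long before it first reaches the goal.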

Motivation and Background

Reward shaping is a method for engineering a reward function so that it provides more frequent feedback on appropriate behaviors. It is most often discussed in the reinforcement learning framework. Frequent feedback is crucial during early learning, so that promising behaviors are tried early. Such feedback is necessary in large domains, where reinforcement signals may be few and far between.

A good example of such a problem is chess. The objective of chess is to win a match, and an appropriate reinforcement...

Recommended Reading

  • Koenig S, Simmons RG (1996) The effect of representation and knowledge on goal directed exploration with reinforcement-learning algorithms. Mach Learn 22(1–3):227–250

  • Mataric MJ (1994) Reward functions for accelerated learning. In: International conference on machine learning, New Brunswick. Morgan Kaufmann, San Francisco, pp 181–189

  • Ng AY, Harada D, Russell S (1999) Policy invariance under reward transformations: theory and application to reward shaping. In: Machine learning, proceedings of the sixteenth international conference, Bled. Morgan Kaufmann, San Francisco, pp 278–287

  • Randlov J, Alstrom P (1998) Learning to drive a bicycle using reinforcement learning and shaping. In: Proceedings of the fifteenth international conference on machine learning, Madison. Morgan Kaufmann, San Francisco

  • Wiewiora E (2003) Potential-based shaping and Q-value initialization are equivalent. J Artif Intell Res 19:205–208

  • Wiewiora E, Cottrell G, Elkan C (2003) Principled methods for advising reinforcement learning agents. In: Machine learning, proceedings of the twentieth international conference, Washington, DC. AAAI Press, Menlo Park, pp 792–799

© 2017 Springer Science+Business Media New York

Cite this entry

Wiewiora, E. (2017). Reward Shaping. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7687-1_966
