Artificial General Intelligence
Volume 9782 of the series Lecture Notes in Computer Science pp 12-22
Avoiding Wireheading with Value Reinforcement Learning
- Tom Everitt, Australian National University
- Marcus Hutter, Australian National University
Abstract
How can we design good goals for arbitrarily intelligent agents? Reinforcement learning (RL) may seem like a natural approach. Unfortunately, RL does not work well for generally intelligent agents, as RL agents are incentivised to shortcut the reward sensor for maximum reward – the so-called wireheading problem. In this paper we suggest an alternative to RL called value reinforcement learning (VRL). In VRL, agents use the reward signal to learn a utility function. The VRL setup allows us to remove the incentive to wirehead by placing a constraint on the agent’s actions. The constraint is defined in terms of the agent’s belief distributions, and does not require an explicit specification of which actions constitute wireheading.
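The contrast drawn in the abstract can be made concrete with a toy sketch. It is not the paper's formal construction: the three actions, the two candidate utility functions, the sensor-override table, and the consistency test (keep only actions whose predicted reward matches the utility the agent's own beliefs imply for the resulting state) are all hypothetical choices made for illustration. Note that the filter compares two belief-derived quantities and never names an action as wireheading.

```python
# Toy illustration only: every name and number here is hypothetical.

ACTIONS = ["tidy", "mess", "hack_sensor"]

# Agent's belief about which state each action leads to (deterministic here).
NEXT_STATE = {"tidy": "clean", "mess": "dirty", "hack_sensor": "hacked"}

# Candidate utility functions the agent entertains, each mapping state -> utility.
UTILITIES = {
    "likes_clean": {"clean": 1.0, "dirty": 0.0, "hacked": 0.0},
    "likes_dirty": {"clean": 0.0, "dirty": 1.0, "hacked": 0.0},
}
PRIOR = {"likes_clean": 0.5, "likes_dirty": 0.5}

# World-model belief: after these actions the sensor reports a fixed reward,
# whatever the true utility of the resulting state is.
SENSOR_OVERRIDE = {"hack_sensor": 1.0}


def expected_utility(state, weights):
    """Utility of `state` averaged over the agent's belief about utility functions."""
    return sum(w * UTILITIES[name][state] for name, w in weights.items())


def predicted_reward(action, weights):
    """Reward the agent expects to observe after taking `action`."""
    if action in SENSOR_OVERRIDE:
        return SENSOR_OVERRIDE[action]
    return expected_utility(NEXT_STATE[action], weights)


def rl_choice(weights):
    """Plain reward maximiser: picks whatever yields the highest observed reward."""
    return max(ACTIONS, key=lambda a: predicted_reward(a, weights))


def consistent_actions(weights, tol=1e-9):
    """Actions whose predicted reward agrees with the belief-implied utility."""
    return [
        a for a in ACTIONS
        if abs(predicted_reward(a, weights)
               - expected_utility(NEXT_STATE[a], weights)) <= tol
    ]


def update(weights, state, reward):
    """Use an observed reward as evidence about which utility function is true."""
    post = {n: w * (UTILITIES[n][state] == reward) for n, w in weights.items()}
    total = sum(post.values())  # assumed non-zero in this toy setting
    return {n: w / total for n, w in post.items()}


def vrl_choice(weights):
    """Maximise expected utility, restricted to the consistent actions."""
    allowed = consistent_actions(weights)
    return max(allowed, key=lambda a: expected_utility(NEXT_STATE[a], weights))
```

Under the prior, `rl_choice(PRIOR)` selects `hack_sensor`, while `consistent_actions(PRIOR)` excludes it because the predicted reward of 1.0 disagrees with the expected utility of 0.0 for the `hacked` state. After `update(PRIOR, "dirty", 1.0)`, the constrained agent chooses `mess`.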
- Title: Avoiding Wireheading with Value Reinforcement Learning
- Book Title: Artificial General Intelligence
- Book Subtitle: 9th International Conference, AGI 2016, New York, NY, USA, July 16-19, 2016, Proceedings
- Pages: pp 12-22
- Copyright: 2016
- DOI: 10.1007/978-3-319-41649-6_2
- Print ISBN: 978-3-319-41648-9
- Online ISBN: 978-3-319-41649-6
- Series Title: Lecture Notes in Computer Science
- Series Volume: 9782
- Series ISSN: 0302-9743
- Publisher: Springer International Publishing
- Copyright Holder: Springer International Publishing Switzerland
- Editors: Bas Steunebrink (Galleria 1, IDSIA); Pei Wang (Temple University); Ben Goertzel (Hong Kong Polytechnic University)
- Authors: Tom Everitt and Marcus Hutter (Australian National University, Canberra, Australia)