Abstract
This chapter discusses lifelong reinforcement learning. Reinforcement learning (RL) is the problem in which an agent learns to act through trial-and-error interactions with a dynamic environment [Kaelbling et al., 1996, Sutton and Barto, 1998]. At each interaction step, the agent observes the current state of the environment and chooses an action from a set of possible actions. The action changes the state of the environment, and the agent then receives a scalar feedback signal for the resulting state transition, which can be a reward or a penalty. This process repeats as the agent learns a trajectory of actions that optimizes its objective, e.g., maximizing the long-run sum of rewards. The goal of RL is to learn an optimal policy that maps states to actions (possibly stochastically). There has been a recent surge of research in RL due to its successful use in the computer program AlphaGo [Silver et al., 2016], which won 4–1 against Lee Sedol, one of the legendary professional Go players, in March 2016. More recently, AlphaGo Zero [Silver et al., 2017] was designed to learn to master the game of Go from scratch without human knowledge, and it has achieved superhuman performance.
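The interaction loop described above can be sketched with tabular Q-learning, one standard RL algorithm (the abstract does not name a specific algorithm, so this is an illustrative choice). The toy chain environment, its reward scheme, and all hyperparameter values below are hypothetical, chosen only to make the state–action–reward cycle concrete:

```python
import random

def q_learning(n_states=5, n_episodes=500, alpha=0.5, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical chain MDP: states 0..n_states-1,
    actions 0 (left) and 1 (right); reward 1 only on reaching the last state."""
    rng = random.Random(seed)
    # Q-table: estimated long-run return for each (state, action) pair
    Q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(n_episodes):
        s = 0  # every episode starts at the leftmost state
        while s != n_states - 1:
            # epsilon-greedy: mostly exploit the current estimate, sometimes explore
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if Q[s][0] > Q[s][1] else 1
            # the action changes the state of the environment
            s_next = max(0, s - 1) if a == 0 else s + 1
            # the agent receives a reward for the state transition
            r = 1.0 if s_next == n_states - 1 else 0.0
            # Q-learning update: move the estimate toward the bootstrapped target
            Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
            s = s_next
    # the learned (deterministic) policy maps each state to its greedy action
    policy = [0 if Q[s][0] > Q[s][1] else 1 for s in range(n_states)]
    return Q, policy
```

After training on this toy chain, the greedy policy moves right in every non-terminal state, i.e., it has learned the trajectory of actions that maximizes the discounted sum of rewards.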
Copyright information
© 2018 Springer Nature Switzerland AG
Cite this chapter
Chen, Z., Liu, B. (2018). Lifelong Reinforcement Learning. In: Lifelong Machine Learning. Synthesis Lectures on Artificial Intelligence and Machine Learning. Springer, Cham. https://doi.org/10.1007/978-3-031-01581-6_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-00453-7
Online ISBN: 978-3-031-01581-6