Adaptive Reinforcement Learning for Dynamic Environment Based on Behavioral Habit
In our previous work, we proposed ALR-P, a method for adjusting the learning rate in reinforcement learning. ALR-P adapts the learning rate to the learning progress using the TD-error, a simple and general quantity. We confirmed on a maze problem posed as a dynamic environment that the method realizes adaptive learning. In this paper, we extend the method so that the learning agent takes behavioral habit into consideration. Humans take behavioral habit into account when making important decisions in the real world. We believe that a learning agent should likewise have behavioral habits and act on them. We applied ALR-P with several behavioral habits (ALR-BH) to a dynamic maze problem. The experimental results show that adaptive adjustment of the learning rate is effective in a dynamic environment and that ALR-BH enabled the learning agent to take appropriate actions based on its behavioral habit.
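The abstract does not state the ALR-P update rule, so the following is not the authors' method. It is a minimal sketch of the general idea only: a tabular Q-learning agent whose learning rate rises when the recent TD-error is large (the environment seems to have changed) and falls when it is small (learning has settled). The class name `AdaptiveQLearner`, the running average of |TD-error|, the squashing function, and all parameter values are assumptions made for illustration.

```python
from collections import defaultdict


class AdaptiveQLearner:
    """Tabular Q-learning with a TD-error-driven learning rate (illustrative only)."""

    def __init__(self, n_actions, gamma=0.9, alpha_min=0.05, alpha_max=0.9,
                 smoothing=0.1):
        self.q = defaultdict(lambda: [0.0] * n_actions)  # Q-table, zero-initialized
        self.gamma = gamma            # discount factor
        self.alpha_min = alpha_min    # lower bound on the learning rate (assumed)
        self.alpha_max = alpha_max    # upper bound on the learning rate (assumed)
        self.smoothing = smoothing    # step size of the running average (assumed)
        self.avg_abs_td = 0.0         # running average of |TD-error|
        self.alpha = alpha_min        # current learning rate

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning TD-error.
        td_error = (reward + self.gamma * max(self.q[next_state])
                    - self.q[state][action])
        # Track how large recent TD-errors have been.
        self.avg_abs_td += self.smoothing * (abs(td_error) - self.avg_abs_td)
        # Squash the average into [0, 1) and map it onto [alpha_min, alpha_max).
        scale = self.avg_abs_td / (1.0 + self.avg_abs_td)
        self.alpha = self.alpha_min + (self.alpha_max - self.alpha_min) * scale
        # Apply the update with the adapted learning rate.
        self.q[state][action] += self.alpha * td_error
        return td_error


if __name__ == "__main__":
    agent = AdaptiveQLearner(n_actions=2)
    for _ in range(500):                      # stationary phase: reward is always 1
        agent.update("s", 0, 1.0, "terminal")
    settled = agent.alpha
    agent.update("s", 0, 0.0, "terminal")     # the environment changes
    print(settled, agent.alpha)               # the rate rises after the change
```

In this sketch the learning rate stays near `alpha_min` while the environment is stationary and increases as soon as the reward changes, which is the qualitative behavior the abstract attributes to adaptive adjustment in a dynamic environment. The behavioral-habit component of ALR-BH is not modeled here because the abstract gives no detail on it.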
Keywords: Learning Rate, Learning Agent, Persistence Rate, Total Reward, Meta Parameter