Adaptation of Stepsize Parameter Using Newton’s Method
A method is proposed for optimizing the stepsize parameter of an exponential moving average (EMA), based on Newton’s method, so as to minimize squared errors. Stepsize parameters used in reinforcement learning must be selected and adjusted carefully in dynamic and non-stationary environments. To find suitable values for the stepsize parameters through learning, a framework has previously been proposed for acquiring higher-order derivatives of learned values with respect to the stepsize parameters. Building on this framework, the authors derive a method that determines the best stepsize by applying Newton’s method to minimize an EMA of the squared learning error. The method is supported by mathematical analysis and confirmed by experimental results.
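The idea can be illustrated with a short sketch. The following is not the paper's exact algorithm; it is a hypothetical minimal implementation of the general technique the abstract describes: maintain recursive first and second derivatives of the EMA estimate with respect to the stepsize α, average the corresponding derivatives of the squared prediction error, and apply a Newton step to α. The class name, the meta-stepsize `meta`, and the clipping bounds are all assumptions for illustration.

```python
# Illustrative sketch (assumed details, not the paper's exact method):
# adapt the stepsize alpha of an EMA by a Newton step that minimizes
# an EMA of the squared prediction error.

class NewtonStepsizeEMA:
    def __init__(self, alpha=0.1, meta=0.1):
        self.alpha = alpha   # EMA stepsize being adapted
        self.meta = meta     # assumed meta-stepsize for averaging derivatives
        self.x = 0.0         # current EMA estimate
        self.dx = 0.0        # d x / d alpha (recursive derivative)
        self.d2x = 0.0       # d^2 x / d alpha^2
        self.g = 0.0         # EMA of d(error^2)/d alpha
        self.h = 0.0         # EMA of d^2(error^2)/d alpha^2
        self.initialized = False

    def update(self, obs):
        if not self.initialized:
            self.x = obs
            self.initialized = True
            return self.x
        delta = self.x - obs                        # prediction error
        # Derivatives of the squared error w.r.t. alpha (chain rule).
        grad = 2.0 * delta * self.dx
        hess = 2.0 * self.dx ** 2 + 2.0 * delta * self.d2x
        self.g += self.meta * (grad - self.g)
        self.h += self.meta * (hess - self.h)
        # Newton step on alpha, taken only when curvature is positive,
        # with alpha clipped to (0, 1) to keep the EMA stable.
        if self.h > 1e-8:
            self.alpha -= self.g / self.h
        self.alpha = min(max(self.alpha, 1e-3), 0.999)
        # Recursive derivatives of x_t = (1 - a) x_{t-1} + a * obs:
        #   d2x_t = (1 - a) d2x_{t-1} - 2 dx_{t-1}
        #   dx_t  = (1 - a) dx_{t-1} + (obs - x_{t-1})
        self.d2x = (1.0 - self.alpha) * self.d2x - 2.0 * self.dx
        self.dx = (1.0 - self.alpha) * self.dx + (obs - self.x)
        self.x += self.alpha * (obs - self.x)
        return self.x
```

On a stationary signal the squared-error derivatives shrink, so α is left near its current value; after an abrupt change the gradient term grows and the Newton step moves α toward faster tracking.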
Keywords: Expected Utility, Learning Agent, Approximate Dynamic Programming, Reinforcement Learning Agent, Multiagent Learning