The kNN-TD Reinforcement Learning Algorithm
A reinforcement learning algorithm called kNN-TD is introduced. It combines the classical formulation of temporal difference methods with a k-nearest-neighbors scheme serving as its expectations memory. This kind of memory enables the algorithm to generalize properly over continuous state spaces and to benefit from collective action-selection and learning processes. Furthermore, with the addition of probability traces, we obtain the kNN-TD(λ) algorithm, which exhibits state-of-the-art performance. Finally, the proposed algorithm has been tested on a series of well-known reinforcement learning problems and at the Second Annual RL Competition, with excellent results.
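To make the idea concrete, the following is a minimal sketch of the kind of update the abstract describes: a TD(λ) learner whose value function is stored in a fixed set of prototype states (the "expectations memory"), where predictions are weighted averages over the k nearest prototypes and the normalized weights act as probability-style eligibility traces. All names, constants, and details here are illustrative assumptions, not the authors' exact algorithm.

```python
import random

random.seed(0)

# Illustrative constants -- not taken from the paper.
K = 3            # number of nearest neighbors consulted per query
ALPHA = 0.2      # learning rate
GAMMA = 0.95     # discount factor
LAMBDA = 0.9     # trace-decay parameter

# A fixed grid of prototype states on [0, 1] acts as the expectations
# memory: each center stores one value estimate and one eligibility trace.
centers = [i / 10.0 for i in range(11)]
values = [0.0] * len(centers)
traces = [0.0] * len(centers)

def knn_weights(state):
    """Return (index, weight) pairs for the k centers nearest to state,
    weighted by inverse distance and normalized to sum to 1, so the
    weights can be read as probabilities ("probability traces")."""
    ranked = sorted(range(len(centers)), key=lambda i: abs(centers[i] - state))
    knn = ranked[:K]
    w = [1.0 / (1e-6 + abs(centers[i] - state)) for i in knn]
    total = sum(w)
    return [(i, wi / total) for i, wi in zip(knn, w)]

def predict(state):
    """Value estimate: weighted average over the k nearest centers."""
    return sum(values[i] * w for i, w in knn_weights(state))

def td_update(state, reward, next_state, terminal):
    """One TD(lambda) step: the TD error is shared among the k active
    centers in proportion to their weights, via eligibility traces."""
    target = reward + (0.0 if terminal else GAMMA * predict(next_state))
    delta = target - predict(state)
    for i in range(len(traces)):
        traces[i] *= GAMMA * LAMBDA        # decay every trace
    for i, w in knn_weights(state):
        traces[i] = w                      # replacing traces on active centers
    for i in range(len(values)):
        values[i] += ALPHA * delta * traces[i]
    return delta

# Toy task: random states drift right; reaching 1.0 yields reward 1.
for _ in range(200):
    s = random.uniform(0.0, 1.0)
    s2 = min(1.0, s + 0.1)
    td_update(s, 1.0 if s2 >= 1.0 else 0.0, s2, s2 >= 1.0)
```

After training on this toy task, states near the rewarded end should be valued higher than states far from it, illustrating how the shared kNN weights let a single TD error generalize across the continuous state space.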