Abstract
A method for function approximation in reinforcement learning settings is proposed. The action-value function of the Q-learning method is approximated by a radial basis function neural network and learned by gradient descent. Radial basis units that are unable to fit the local action-value function accurately enough are decomposed into new units with smaller widths. The local temporal-difference error is modelled by a two-class learning vector quantization algorithm, which approximates the distributions of the positive and of the negative error and provides the centers of the new units. This method is especially convenient for smooth value functions with large local variation in certain parts of the state space, where non-uniform placement of basis functions is required. In comparison with four related methods, it requires the fewest basis functions while achieving comparable accuracy.
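The two ingredients of the abstract, a radial basis function approximation of the Q-function updated by a gradient-descent temporal-difference rule, and a two-class LVQ step on the sign of the TD error, can be sketched in a few lines. The class `RBFQApprox`, the function `lvq_update`, and every parameter value below are illustrative assumptions for a one-dimensional state space, not the paper's exact formulation (in particular, the decomposition criterion and the paper's LVQ variant are omitted).

```python
import math


class RBFQApprox:
    """Gaussian RBF approximator of Q(s, a): one weight vector per action.

    A minimal sketch, assuming a 1-D state space and a shared width;
    names and defaults are illustrative, not the paper's.
    """

    def __init__(self, centers, width, n_actions, alpha=0.1, gamma=0.9):
        self.centers = list(centers)     # centers of the radial basis units
        self.width = width               # shared Gaussian width
        self.w = [[0.0] * len(self.centers) for _ in range(n_actions)]
        self.alpha = alpha               # gradient-descent learning rate
        self.gamma = gamma               # discount factor

    def phi(self, s):
        # Gaussian basis activations for state s
        return [math.exp(-((s - c) ** 2) / (2.0 * self.width ** 2))
                for c in self.centers]

    def q(self, s, a):
        # Q(s, a) as a weighted sum of basis activations
        return sum(wi * pi for wi, pi in zip(self.w[a], self.phi(s)))

    def td_update(self, s, a, r, s_next, done):
        # Q-learning target: r + gamma * max_b Q(s', b)
        target = r if done else r + self.gamma * max(
            self.q(s_next, b) for b in range(len(self.w)))
        delta = target - self.q(s, a)    # temporal-difference error
        for i, pi in enumerate(self.phi(s)):
            self.w[a][i] += self.alpha * delta * pi   # gradient step
        return delta


def lvq_update(protos, labels, x, y, lr=0.05):
    """One LVQ1 step on a scalar sample x with class y (e.g. the sign of
    the local TD error): move the nearest prototype toward x if its class
    matches, otherwise away. A standard LVQ1 rule, used here as a stand-in
    for the paper's two-class error model."""
    i = min(range(len(protos)), key=lambda j: abs(protos[j] - x))
    protos[i] += (lr if labels[i] == y else -lr) * (x - protos[i])
    return i
```

Repeating `td_update` on a one-step transition drives `q(s, a)` toward the reward, while the prototypes produced by `lvq_update` would serve as candidate centers for the new, narrower units.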
Šter, B., Dobnikar, A. Adaptive Radial Basis Decomposition by Learning Vector Quantization. Neural Processing Letters 18, 17–27 (2003). https://doi.org/10.1023/A:1026242620248