Abstract
In the past decade, research in neurocomputing has been divided into two relatively well-defined tracks: one dealing with cognition and the other with behavior. Cognition deals with organizing, classifying and recognizing sensory stimuli. Behavior is more dynamic, involving sequences of actions and changing interactions with an external environment. The mathematical techniques that apply to these areas, at least from the point of view of neurocomputing, appear to have been quite separate as well. The purpose of this paper is to give an overview of some recent powerful mathematical results in behavioral neurocomputing, specifically the concept of Q-learning due to C. Watkins, and some new extensions. Finally, we propose ways in which the mathematics of cognition and the mathematics of behavior can move closer together to build more unified systems of information processing and action.
References
D. Bertsekas, Dynamic Programming and Optimal Control. Athena Scientific, Belmont, MA (1995).
R. Howard, Dynamic Programming and Markov Processes. MIT Press, Cambridge, MA (1960).
W.T. Miller III, R.S. Sutton, and P.J. Werbos, Neural Networks for Control. MIT Press, Cambridge, MA (1990).
R.S. Sutton, A.G. Barto, and R.J. Williams, Reinforcement learning is direct adaptive optimal control, IEEE Control Systems Magazine (April 1992), pp. 19–22.
J. Tsitsiklis, Asynchronous stochastic approximation and Q-Learning, Machine Learning, Vol. 16 (1994), pp. 185–202.
C.J.C.H. Watkins, Learning from delayed rewards, Ph.D. Dissertation, University of Cambridge (1989).
C.J.C.H. Watkins and P. Dayan, Q-Learning, Machine Learning, Vol. 8 (1992), pp. 279–292.
Copyright information
© 1997 Springer Science+Business Media New York
About this chapter
Cite this chapter
Cybenko, G., Gray, R., Moizumi, K. (1997). Q-Learning: A Tutorial and Extensions. In: Ellacott, S.W., Mason, J.C., Anderson, I.J. (eds) Mathematics of Neural Networks. Operations Research/Computer Science Interfaces Series, vol 8. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-6099-9_3
DOI: https://doi.org/10.1007/978-1-4615-6099-9_3
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-7794-8
Online ISBN: 978-1-4615-6099-9
eBook Packages: Springer Book Archive