
Autonomous Robots, Volume 7, Issue 1, pp 41–56

Dynamics of a Classical Conditioning Model

  • Christian Balkenius
  • Jan Morén

Abstract

Classical conditioning is a basic learning mechanism in animals and can be found in almost all organisms. If we want to construct robots with abilities matching those of their biological counterparts, this is one of the learning mechanisms that needs to be implemented first. This article describes a computational model of classical conditioning where the goal of learning is assumed to be the prediction of a temporally discounted reward or punishment based on the current stimulus situation.

The model is well suited for robotic implementation, as it models a number of classical conditioning paradigms and its learning is guaranteed to converge with arbitrarily complex stimulus sequences. This is an essential feature once the step is taken beyond the simple laboratory experiment with two or three stimuli to the real world, where no such limitations exist. It is also demonstrated how the model can be included in a more complex system that includes various forms of sensory pre-processing, how it can handle reinforcement learning and the timing of responses, and how it can function as an adaptive world model.
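As a rough illustration of the learning goal stated in the abstract (predicting a temporally discounted reward from the current stimulus situation), the following is a generic temporal-difference sketch. It is not the model described in the article. The one-weight-per-time-step stimulus representation, the function names `discounted_return` and `train_td0`, and the values of `gamma`, `alpha`, and `trials` are all assumptions made for the example.

```python
def discounted_return(rewards, gamma):
    """Target prediction: G_t = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ..."""
    g = 0.0
    targets = [0.0] * len(rewards)
    for t in reversed(range(len(rewards))):
        g = rewards[t] + gamma * g
        targets[t] = g
    return targets


def train_td0(rewards, gamma=0.9, alpha=0.1, trials=2000):
    """Learn one prediction weight per time step since stimulus onset.

    Assumption: the stimulus situation at step t is a one-hot vector,
    so the prediction at step t is simply w[t].
    """
    n = len(rewards)
    w = [0.0] * n
    for _ in range(trials):
        for t in range(n):
            v_next = w[t + 1] if t + 1 < n else 0.0  # nothing is predicted after the trial
            delta = rewards[t] + gamma * v_next - w[t]  # temporal-difference error
            w[t] += alpha * delta
    return w


# Hypothetical trial: stimulus onset at step 0, reward delivered at step 3.
trial_rewards = [0.0, 0.0, 0.0, 1.0, 0.0]
learned = train_td0(trial_rewards)
target = discounted_return(trial_rewards, gamma=0.9)
```

In this sketch the learned predictions approach the discounted targets, so steps closer to the reward carry a higher predicted value than steps just after stimulus onset.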

Keywords: classical conditioning, reinforcement learning, biological models

Copyright information

© Kluwer Academic Publishers 1999

Authors and Affiliations

  • Christian Balkenius (1)
  • Jan Morén (1)
  1. Lund University Cognitive Science, Lund, Sweden
