Internal Models and Anticipations in Adaptive Learning Systems

  • Martin V. Butz
  • Olivier Sigaud
  • Pierre Gérard
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2684)


The explicit investigation of anticipations in relation to adaptive behavior is a recent approach. This chapter first provides psychological background that motivates and inspires the study of anticipations in the adaptive behavior field. Next, a basic framework for the study of anticipations in adaptive behavior is suggested. Different anticipatory mechanisms are identified and characterized. First fundamental distinctions are drawn between implicit anticipatory behavior, payoff anticipatory behavior, sensory anticipatory behavior, and state anticipatory behavior. A case study allows further insights into the drawn distinctions. Many future research directions are suggested.


Internal Model; Markov Decision Process; Mirror Neuron; Inattentional Blindness; Anticipatory Behavior
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Martin V. Butz (2, 3)
  • Olivier Sigaud (1)
  • Pierre Gérard (1)
  1. AnimatLab-LIP6, Paris, France
  2. Department of Cognitive Psychology, University of Würzburg, Germany
  3. Illinois Genetic Algorithms Laboratory (IlliGAL), University of Illinois at Urbana-Champaign, USA
