
Recognizing Internal States of Other Agents to Anticipate and Coordinate Interactions

  • Conference paper
Multi-Agent Systems (EUMAS 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7541)


Abstract

In multi-agent systems, anticipating the behavior of other agents is a difficult problem. In this paper we consider the case of a cognitive agent placed in an unknown environment composed of different kinds of objects and agents. Our agent must incrementally learn a model of the environment dynamics using only its interaction experience; the learned model can then be used to define a policy of actions. This is relatively easy when the agent interacts with static objects, simple mobile objects, or trivial reactive agents. However, when the agent deals with other complex agents that may change their behavior according to non-directly observable internal properties (such as emotional or intentional states), constructing a model becomes significantly harder. The complete system can be described as a Factored and Partially Observable Markov Decision Process (FPOMDP). Our agent implements the Constructivist Anticipatory Learning Mechanism (CALM) algorithm, and the experiment (called mept) shows that inducing non-observable variables enables the agent to learn a deterministic model of most of the system events (provided the system represents a well-structured universe), allowing it to anticipate the actions of other agents and adapt to them, even when some interactions appear non-deterministic at first sight.
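
As a rough illustration of the idea sketched in the abstract (introducing a non-observable variable when an observable situation yields contradictory outcomes), the following Python fragment is a minimal, hypothetical sketch. It is not the paper's CALM implementation: the class name HiddenStateInducer, its methods, and the toy "mood" scenario are assumptions introduced here only to make the mechanism concrete.

```python
# Hypothetical sketch (not the paper's CALM code): when the same observable
# (context, action) pair has produced contradictory outcomes, induce a
# synthetic hidden variable whose value makes each branch deterministic again.
from collections import defaultdict

class HiddenStateInducer:
    def __init__(self):
        # (observed_context, action) -> set of outcomes seen so far
        self.outcomes = defaultdict(set)
        # (observed_context, action) -> name of the induced hidden variable
        self.hidden_vars = {}

    def observe(self, context, action, outcome):
        key = (context, action)
        self.outcomes[key].add(outcome)
        # Two distinct outcomes for the same observable situation look
        # non-deterministic; explain them with a new hidden variable.
        if len(self.outcomes[key]) > 1 and key not in self.hidden_vars:
            self.hidden_vars[key] = f"h_{len(self.hidden_vars)}"

    def predict(self, context, action, hidden_value=None):
        key = (context, action)
        outs = sorted(self.outcomes.get(key, set()))
        if not outs:
            return None                 # nothing learned yet
        if key in self.hidden_vars and hidden_value is not None:
            return outs[hidden_value]   # deterministic once the hidden value is known
        return outs                     # ambiguous without the hidden value

# Toy usage: the other agent flees or attacks depending on an unseen "mood".
m = HiddenStateInducer()
m.observe(context="agent_near", action="approach", outcome="flees")
m.observe(context="agent_near", action="approach", outcome="attacks")
print(m.hidden_vars)                                   # a hidden variable was induced
print(m.predict("agent_near", "approach", hidden_value=0))
```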




Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Perotto, F.S. (2012). Recognizing Internal States of Other Agents to Anticipate and Coordinate Interactions. In: Cossentino, M., Kaisers, M., Tuyls, K., Weiss, G. (eds.) Multi-Agent Systems. EUMAS 2011. Lecture Notes in Computer Science, vol 7541. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34799-3_16


  • DOI: https://doi.org/10.1007/978-3-642-34799-3_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34798-6

  • Online ISBN: 978-3-642-34799-3

  • eBook Packages: Computer Science (R0)
