Learning Intelligent Behavior in a Non-stationary and Partially Observable Environment

Published in: Artificial Intelligence Review

Abstract

Individual learning in an environment where more than one agent exists is a challenging task. In this paper, a single learning agent situated in an environment with multiple agents is modeled using reinforcement learning. The environment is non-stationary and only partially observable from an agent's point of view; consequently, an agent's learning is influenced by the actions of the other cooperative or competitive agents in the environment. A prey-hunter capture game with these characteristics is defined and used in experiments to simulate the learning process of individual agents. The experimental results show that there are no strict rules for reinforcement learning. We suggest two new methods to improve the performance of the agents; both reduce the number of states while retaining as much state information as necessary.

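The article itself is paywalled, so only the setup sketched in the abstract can be illustrated here. The following is a minimal, hypothetical sketch, not the authors' method: a hunter learns with tabular Q-learning (Watkins & Dayan, 1992) on a small grid where it senses the prey only within a limited view radius (partial observability) and the prey moves randomly (non-stationarity from the hunter's point of view). Every name and parameter below (GRID, VIEW_RADIUS, alpha, gamma, epsilon, the reward values) is an assumption for illustration.

```python
import random
from collections import defaultdict

# All constants here are illustrative assumptions, not values from the paper.
GRID = 5            # board is GRID x GRID
VIEW_RADIUS = 2     # the hunter only senses the prey within this distance
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1), (0, 0)]  # up, down, left, right, stay

def clamp(v):
    """Keep a coordinate on the board."""
    return max(0, min(GRID - 1, v))

def observe(hunter, prey):
    """Partial observation: the prey's relative position if visible, else None."""
    dx, dy = prey[0] - hunter[0], prey[1] - hunter[1]
    if abs(dx) <= VIEW_RADIUS and abs(dy) <= VIEW_RADIUS:
        return (dx, dy)
    return None  # prey is outside the hunter's field of view

def step(hunter, prey, action):
    """Move the hunter, then move the prey randomly; reward capture."""
    hunter = (clamp(hunter[0] + action[0]), clamp(hunter[1] + action[1]))
    pa = random.choice(ACTIONS)  # random prey => non-stationary for the hunter
    prey = (clamp(prey[0] + pa[0]), clamp(prey[1] + pa[1]))
    caught = hunter == prey
    reward = 10.0 if caught else -0.1  # small step cost encourages fast capture
    return hunter, prey, reward, caught

def train(episodes=5000, alpha=0.1, gamma=0.9, epsilon=0.1, max_steps=200):
    Q = defaultdict(float)  # Q[(observation, action_index)], defaults to 0.0
    for _ in range(episodes):
        hunter, prey = (0, 0), (GRID - 1, GRID - 1)
        obs = observe(hunter, prey)
        for _ in range(max_steps):  # cap episode length so training terminates
            # epsilon-greedy action selection over the *observed* state
            if random.random() < epsilon:
                a = random.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: Q[(obs, i)])
            hunter, prey, r, done = step(hunter, prey, ACTIONS[a])
            nxt = observe(hunter, prey)
            best_next = max(Q[(nxt, i)] for i in range(len(ACTIONS)))
            # Standard tabular Q-learning update (Watkins & Dayan, 1992)
            Q[(obs, a)] += alpha * (r + gamma * best_next - Q[(obs, a)])
            obs = nxt
            if done:
                break
    return Q

if __name__ == "__main__":
    Q = train()
    print(f"distinct (observation, action) entries learned: {len(Q)}")
```

Note that mapping every prey position outside the view radius to the single observation None is itself a crude form of state reduction: it keeps the table small while preserving the relative prey position that actually matters for pursuit. The two state-reduction methods proposed in the paper are presumably more principled and are not reproduced here.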



About this article

Cite this article

Şenkul, S., Polat, F. Learning Intelligent Behavior in a Non-stationary and Partially Observable Environment. Artificial Intelligence Review 18, 97–115 (2002). https://doi.org/10.1023/A:1019935502139
