Abstract
Sophisticated agents operating in open environments must make decisions that efficiently trade off the use of their limited resources between dynamic deliberative actions and domain actions. This is the meta-level control problem for agents operating in resource-bounded multi-agent environments. Control activities involve deciding when to invoke scheduling and coordination of domain activities, and how much effort to put into them. The focus of this paper is how to make effective meta-level control decisions. We show that meta-level control with bounded computational overhead allows complex agents to solve problems more efficiently than current approaches in dynamic open multi-agent environments. The meta-level control approach that we present is based on the decision-theoretic use of an abstract representation of the agent state. This abstraction concisely captures the critical information necessary for decision making while bounding the cost of meta-level control, and is well suited to automatically learning the meta-level control policies.
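To make the idea concrete, the following is a minimal sketch, not the paper's actual model: meta-level control cast as a small MDP whose states are a coarse abstraction of the agent's situation (here just two assumed features, whether a deadline is near and whether a new task has arrived) and whose actions are control choices that trade deliberation effort against acting in the domain. The state features, action names, and reward model below are illustrative assumptions; the policy is learned with plain tabular Q-learning.

```python
import random
from collections import defaultdict

# Abstract agent state: (deadline_near, new_task_arrived).  Keeping only
# these two features is what bounds the cost of meta-level reasoning.
STATES = [(deadline_near, new_task)
          for deadline_near in (False, True)
          for new_task in (False, True)]

# Meta-level control actions: invoke detailed scheduling, invoke a cheap
# quick scheduler, or skip deliberation and execute a domain action.
ACTIONS = ["schedule_detailed", "schedule_quick", "act"]

def reward(state, action):
    """Assumed reward model: detailed scheduling pays off only when a new
    task arrives and there is time; under a tight deadline a quick choice
    is better; with no new task the agent should just act."""
    deadline_near, new_task = state
    if new_task and not deadline_near and action == "schedule_detailed":
        return 1.0
    if new_task and deadline_near and action == "schedule_quick":
        return 1.0
    if not new_task and action == "act":
        return 1.0
    return 0.0

def learn_policy(episodes=2000, alpha=0.2, epsilon=0.1, seed=0):
    """Tabular Q-learning of a meta-level control policy (one-step episodes)."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        state = rng.choice(STATES)
        if rng.random() < epsilon:                      # explore
            action = rng.choice(ACTIONS)
        else:                                           # exploit
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        q[(state, action)] += alpha * (reward(state, action) - q[(state, action)])
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}

policy = learn_policy()
print(policy[(True, True)])    # deadline near + new task -> "schedule_quick"
print(policy[(False, True)])   # time available + new task -> "schedule_detailed"
```

The point of the sketch is the abstraction: the learned table has only |STATES| × |ACTIONS| entries, so consulting it at runtime costs a constant, which is what "bounded computational overhead" for meta-level control amounts to.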
Cite this article
Raja, A., Lesser, V. A framework for meta-level control in multi-agent systems. Auton Agent Multi-Agent Syst 15, 147–196 (2007). https://doi.org/10.1007/s10458-006-9008-z