Bounded Rationality in Multiagent Systems Using Decentralized Metareasoning

  • Alan Carlin
  • Shlomo Zilberstein
Part of the Intelligent Systems Reference Library book series (ISRL, volume 28)


Metareasoning has been used as a means for achieving bounded rationality by optimizing the tradeoff between the cost and value of the decision making process. Effective monitoring techniques have been developed to allow agents to stop their computation at the “right” time so as to optimize the overall time-dependent utility of the decision. However, these methods were designed for a single decision maker. In this chapter, we analyze the problems that arise when several agents solve components of a larger problem, each using an anytime algorithm. Metareasoning is more challenging in this case because each agent is uncertain about the progress made so far by the others. We develop a formal framework for decentralized monitoring of decision making, establish the complexity of several interesting variants of the problem, and propose solution techniques for each case.
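As a rough illustration of the single-agent monitoring problem described above, the following sketch computes when to stop an anytime algorithm by dynamic programming over quality levels. It is only a minimal sketch under stated assumptions, not the chapter's decentralized framework: the performance profile `PROFILE`, the values in `QUALITY_VALUE`, the linear `STEP_COST`, and the `HORIZON` are all hypothetical, and the agent is assumed to observe its own quality level exactly.

```python
from functools import lru_cache

# Hypothetical performance profile (an assumption, not from the chapter):
# PROFILE[q][q2] is the probability that one more computation step moves
# the solution quality from level q to level q2.
PROFILE = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
]
QUALITY_VALUE = [0.0, 6.0, 10.0]  # assumed value of stopping at each quality level
STEP_COST = 1.0                   # assumed cost of each computation step
HORIZON = 10                      # assumed deadline (maximum number of steps)


def utility(q, t):
    """Time-dependent utility of stopping with quality level q at time t."""
    return QUALITY_VALUE[q] - STEP_COST * t


@lru_cache(maxsize=None)
def value(q, t):
    """Best expected utility from (q, t): the max of stopping and continuing."""
    stop = utility(q, t)
    if t >= HORIZON:
        return stop  # the deadline forces a stop
    cont = sum(p * value(q2, t + 1) for q2, p in enumerate(PROFILE[q]) if p > 0)
    return max(stop, cont)


def should_stop(q, t):
    """Stop when stopping now is at least as good as continuing optimally."""
    return utility(q, t) >= value(q, t)
```

In the multiagent setting the chapter addresses, each agent would not know the quality levels reached by the others, so a stopping rule of this kind could not be applied directly to the joint state.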


Keywords: Quality Level, Multiagent System, Markov Decision Process, Bounded Rationality, Local Monitoring
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Alan Carlin (1)
  • Shlomo Zilberstein (1)
  1. Department of Computer Science, University of Massachusetts, Amherst
