
Formal models and algorithms for decentralized decision making under uncertainty

Autonomous Agents and Multi-Agent Systems

Abstract

Over the last 5 years, the AI community has shown considerable interest in decentralized control of multiple decision makers or “agents” under uncertainty. This problem arises in many application domains, such as multi-robot coordination, manufacturing, information gathering, and load balancing. Such problems must be treated as decentralized decision problems because each agent may have different partial information about the other agents and about the state of the world. It has been shown that these problems are significantly harder than their centralized counterparts, requiring new formal models and algorithms to be developed. Rapid progress in recent years has produced a number of different frameworks, complexity results, and planning algorithms. The objectives of this paper are to provide a comprehensive overview of these results, to compare and contrast the existing frameworks, and to provide a deeper understanding of their relationships with one another, their strengths, and their weaknesses. While we focus on cooperative systems, we do point out important connections with game-theoretic approaches. We analyze five different formal frameworks, three different optimal algorithms, as well as a series of approximation techniques. The paper provides interesting insights into the structure of decentralized problems, the expressiveness of the various models, and the relative advantages and limitations of the different solution techniques. A better understanding of these issues will facilitate further progress in the field and help resolve several open problems that we identify.
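Problems of this kind are commonly formalized in the literature as DEC-POMDPs, given by a tuple of states, per-agent actions, transitions, a shared reward, and per-agent observations. As a rough one-step illustration only (this is not the paper's notation; the class name, fields, and all numbers below are invented for the sketch), consider why shared uncertainty about the state already limits what uncoordinated action choices can achieve:

```python
from dataclasses import dataclass

# Hypothetical minimal encoding of a cooperative DEC-POMDP tuple
# <S, {A_i}, P, R, {Omega_i}, O, b0>; transitions and observations are
# omitted because this sketch evaluates only a single step.
@dataclass
class DecPOMDP:
    states: list          # S
    joint_actions: list   # cross product of the per-agent action sets
    reward: dict          # R[(s, a)] -> float, shared by all agents
    b0: dict              # initial belief: state -> probability

    def expected_reward(self, a):
        """One-step expected reward of joint action a under the initial belief."""
        return sum(p * self.reward[(s, a)] for s, p in self.b0.items())

# Tiny two-agent, two-state example (made up for illustration): the agents
# are rewarded for matching in s0 and for mismatching in s1.
m = DecPOMDP(
    states=["s0", "s1"],
    joint_actions=[("left", "left"), ("left", "right"),
                   ("right", "left"), ("right", "right")],
    reward={("s0", ("left", "left")): 1.0, ("s0", ("left", "right")): 0.0,
            ("s0", ("right", "left")): 0.0, ("s0", ("right", "right")): 1.0,
            ("s1", ("left", "left")): 0.0, ("s1", ("left", "right")): 1.0,
            ("s1", ("right", "left")): 1.0, ("s1", ("right", "right")): 0.0},
    b0={"s0": 0.5, "s1": 0.5},
)

# Under a uniform belief, every joint action has expected reward 0.5.
best = max(m.expected_reward(a) for a in m.joint_actions)
```

With no observations, no open-loop joint action here earns more than 0.5 in expectation, which hints at why decentralized policies must map each agent's private observation history to actions, and why such problems are harder than their centralized counterparts.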



Author information


Correspondence to Sven Seuken.

Additional information

This work was done while S. Seuken was a graduate student in the Computer Science Department of the University of Massachusetts, Amherst.


Cite this article

Seuken, S., Zilberstein, S. Formal models and algorithms for decentralized decision making under uncertainty. Auton Agent Multi-Agent Syst 17, 190–250 (2008). https://doi.org/10.1007/s10458-007-9026-5
