Influence Diagrams with Memory States: Representation and Algorithms

  • Xiaojian Wu
  • Akshat Kumar
  • Shlomo Zilberstein
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6992)

Abstract

Influence diagrams (IDs) offer a powerful framework for decision making under uncertainty, but their applicability has been hindered by the exponential growth of runtime and memory usage, largely due to the no-forgetting assumption. We present a novel way to maintain a limited amount of memory to inform each decision and still obtain near-optimal policies. The approach is based on augmenting the graphical model with memory states that represent key aspects of previous observations, a method that has proved useful in POMDP solvers. We also derive an efficient EM-based message-passing algorithm to compute the policy. Experimental results show that this approach produces high-quality approximate policies and offers better scalability than existing methods.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Xiaojian Wu (1)
  • Akshat Kumar (1)
  • Shlomo Zilberstein (1)
  1. Computer Science Department, University of Massachusetts, Amherst, USA
