Abstract
The problem of selecting actions in environments that are dynamic and not completely predictable or observable is a central problem in intelligent behavior. From an AI point of view, the problem is to design a mechanism that can select the best actions given the information provided by sensors and a suitable model of the actions and goals. We call this the problem of Planning, as it is a direct generalization of the problem considered in Planning research, where feedback is absent and the effect of actions is assumed to be predictable. In this paper we present an approach to Planning that combines ideas and methods from Operations Research and Artificial Intelligence. Basically, Planning problems are described in high-level action languages that are compiled into general mathematical models of sequential decision making known as Markov Decision Processes (MDPs) or Partially Observable Markov Decision Processes (POMDPs), which are then solved by suitable heuristic search algorithms. The result is a set of controllers that map sequences of observations into actions and which, under certain conditions, can be shown to be optimal. We show how this approach applies to a number of concrete problems and discuss its relation to work in Reinforcement Learning.
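To make the "compile, then solve" pipeline concrete, here is a minimal sketch of the solving step: standard value iteration computing optimal cost-to-go values and a greedy policy for a fully observable MDP. The toy corridor domain, state names, and cost structure below are illustrative assumptions, not the paper's benchmark problems, and the paper's actual solvers are heuristic search algorithms rather than this exhaustive sweep.

```python
def value_iteration(states, actions, P, cost, goal, gamma=1.0, eps=1e-6):
    """Compute optimal expected cost-to-go V and a greedy policy.

    P[s][a] is a list of (prob, next_state) pairs, cost[s][a] is the
    immediate cost of action a in state s, and goal states are
    absorbing with zero cost.
    """
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            if s in goal:
                continue
            # Bellman backup: pick the action minimizing expected cost
            q = [cost[s][a] + gamma * sum(p * V[t] for p, t in P[s][a])
                 for a in actions]
            new_v = min(q)
            delta = max(delta, abs(new_v - V[s]))
            V[s] = new_v
        if delta < eps:
            break
    # Greedy policy with respect to the converged values
    policy = {s: min(actions,
                     key=lambda a: cost[s][a] +
                     gamma * sum(p * V[t] for p, t in P[s][a]))
              for s in states if s not in goal}
    return V, policy

# Hypothetical 1-D corridor: states 0..3, goal state 3. Action 'right'
# advances with probability 0.9 and slips in place with 0.1; 'stay'
# does nothing. Every action costs 1 per step.
states = [0, 1, 2, 3]
actions = ['right', 'stay']
P = {s: {'right': [(0.9, min(s + 1, 3)), (0.1, s)],
         'stay': [(1.0, s)]} for s in states}
cost = {s: {a: 1.0 for a in actions} for s in states}
V, policy = value_iteration(states, actions, P, cost, goal={3})
```

Under these assumptions the optimal policy moves right from every non-goal state, and the expected number of steps from state 2 is 1/0.9. Heuristic search methods such as real-time dynamic programming perform the same Bellman backups, but only over states reachable from the initial situation, guided by a heuristic.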
© 1998 Springer-Verlag Berlin Heidelberg
Geffner, H. (1998). Modelling Intelligent Behaviour: The Markov Decision Process Approach. In: Coelho, H. (eds) Progress in Artificial Intelligence — IBERAMIA 98. IBERAMIA 1998. Lecture Notes in Computer Science, vol 1484. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49795-1_1
Print ISBN: 978-3-540-64992-2
Online ISBN: 978-3-540-49795-0