Instance-Based Action Models for Fast Action Planning

  • Mazda Ahmadi
  • Peter Stone
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5001)

Abstract

Two main challenges of robot action planning in real domains are uncertain action effects and dynamic environments. In this paper, an instance-based action model is learned empirically by robots trying actions in the environment. Modeling the action planning problem as a Markov decision process, the action model is used to build the transition function. In static environments, standard value iteration techniques are used to compute the optimal policy. In dynamic environments, an algorithm is proposed for fast replanning that updates a subset of the state-action values computed for the static environment. As a test-bed, the goal-scoring task from the RoboCup four-legged league is used, and the algorithms are validated on the problem of planning kicks for scoring goals in the presence of opponent robots. Experimental results, both in simulation and on real robots, show that the instance-based action model boosts performance over the parametric models used previously, and that incremental replanning significantly improves on the original off-line planning.
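The abstract outlines the core pipeline: outcomes observed while trying actions are stored as instances, a nearest-neighbor lookup over those instances serves as the MDP transition function, and value iteration computes the policy. Below is a minimal sketch of that idea in Python. It is an illustration only, under stated assumptions: the class and function names (`InstanceModel`, `value_iteration`, `discretize`, the reward signature) and the k-nearest-neighbor displacement model are our own placeholders, not the authors' implementation.

```python
# Hedged sketch: an instance-based action model driving value iteration.
# Names, discretization, and the kNN outcome model are illustrative assumptions.
import numpy as np
from collections import defaultdict


class InstanceModel:
    """Stores observed (state, action, next_state) instances and predicts
    action effects for a query state from its k nearest recorded instances."""

    def __init__(self, k=5):
        self.k = k
        self.instances = defaultdict(list)  # action -> list of (state, next_state)

    def record(self, state, action, next_state):
        self.instances[action].append((np.asarray(state), np.asarray(next_state)))

    def sample_outcomes(self, state, action):
        """Return the k most similar observed outcomes, shifted to the query state."""
        data = self.instances[action]
        state = np.asarray(state)
        if not data:
            return [state]  # no data yet: assume the action has no effect
        dists = [np.linalg.norm(s - state) for s, _ in data]
        nearest = np.argsort(dists)[: self.k]
        # Apply each recorded displacement to the query state.
        return [state + (data[i][1] - data[i][0]) for i in nearest]


def value_iteration(states, actions, model, reward, discretize,
                    gamma=0.95, tol=1e-3):
    """Standard value iteration; transition samples come from the instance model."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            q_values = []
            for a in actions:
                outcomes = model.sample_outcomes(np.array(s), a)
                q = np.mean([reward(s, a, o) + gamma * V[discretize(o)]
                             for o in outcomes])
                q_values.append(q)
            best = max(q_values)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V
```

For the dynamic-environment case described in the abstract, the corresponding modification to this sketch would be to re-run the backup only for the subset of states whose sampled outcomes are affected by the observed change (for example, states near a newly detected opponent), rather than sweeping the full state space.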


Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Mazda Ahmadi (1)
  • Peter Stone (1)
  1. Department of Computer Sciences, The University of Texas at Austin
