A Bayesian View on Motor Control and Planning

  • Marc Toussaint
  • Christian Goerick
Part of the Studies in Computational Intelligence book series (SCI, volume 264)

Abstract

The problem of motion control and planning can be formulated as an optimization problem. In this paper we discuss an alternative view that casts the problem as one of probabilistic inference. In simple cases, where the optimization problem can be solved analytically, the inference view leads to equivalent solutions. However, when approximate methods are necessary to tackle the problem, the tight relation between optimization and probabilistic inference has fruitfully led to a transfer of methods between the two fields. Here we show that such a transfer is also possible in the realm of robotics. The general idea is that motion can be generated by fusing motion objectives (task constraints, goals, motion priors) using probabilistic inference techniques. In realistic scenarios exact inference is infeasible (as is the analytic solution of the corresponding optimization problem), and the use of efficient approximate inference methods is a promising alternative to classical motion optimization methods. In this paper we first derive Bayesian control methods that are directly analogous to classical redundant motion rate control and optimal dynamic control (including operational space control). Then, by extending the probabilistic models to Markovian models of the whole trajectory, we show that approximate probabilistic inference methods (message passing) efficiently compute solutions to trajectory optimization problems. Using Gaussian belief approximations and local linearization, the algorithm becomes related to Differential Dynamic Programming (DDP), also known as iterative Linear Quadratic Gaussian (iLQG).
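The claimed equivalence between Bayesian fusion and classical redundant control can be illustrated in a minimal sketch (not taken from the paper itself; the Jacobian `J`, prior precision `W`, and task covariance `C` below are illustrative placeholders): fusing a Gaussian task likelihood N(ẏ* | J q̇, C) with a Gaussian motion prior N(q̇ | 0, W⁻¹) yields a posterior mean that, for tight task constraints (C → 0), coincides with the W-weighted pseudoinverse solution of motion rate control.

```python
import numpy as np

rng = np.random.default_rng(0)
J = rng.standard_normal((2, 5))   # task Jacobian (2-D task, 5 joints); placeholder values
y_dot = np.array([0.3, -0.1])     # desired task-space velocity
W = np.eye(5)                     # prior precision on joint velocities
C = 1e-8 * np.eye(2)              # task noise covariance (nearly hard constraint)

# Bayesian fusion: posterior mean of q_dot under the Gaussian task
# likelihood N(y_dot | J q_dot, C) and the prior N(q_dot | 0, W^-1)
Cinv = np.linalg.inv(C)
q_dot_bayes = np.linalg.solve(J.T @ Cinv @ J + W, J.T @ Cinv @ y_dot)

# Classical W-weighted pseudoinverse motion rate control
Winv = np.linalg.inv(W)
J_pinv = Winv @ J.T @ np.linalg.inv(J @ Winv @ J.T)
q_dot_classic = J_pinv @ y_dot

# As C -> 0 the two solutions coincide
print(np.allclose(q_dot_bayes, q_dot_classic, atol=1e-6))
```

Note that the inference view degrades gracefully: with a finite `C` the posterior mean is a regularized (singularity-robust) solution rather than the exact pseudoinverse, which is one practical motivation for the probabilistic formulation.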



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Marc Toussaint (1)
  • Christian Goerick (2)
  1. Technical University Berlin, Berlin, Germany
  2. Honda Research Institute Europe, Offenbach/Main, Germany
