A Tutorial on Newton Methods for Constrained Trajectory Optimization and Relations to SLAM, Gaussian Process Smoothing, Optimal Control, and Probabilistic Inference

  • Marc Toussaint
Chapter
Part of the Springer Tracts in Advanced Robotics book series (STAR, volume 117)

Abstract

Many state-of-the-art approaches to trajectory optimization and optimal control are intimately related to standard Newton methods. For researchers working at the intersection of machine learning, robotics, control, and optimization, these relations are highly relevant but sometimes hard to see across disciplines, not least because of the differing notations and conventions each field uses. The aim of this tutorial is to introduce constrained trajectory optimization in a manner that allows us to establish these relations. We consider a basic yet general formalization of the problem and discuss the structure of Newton steps in this setting. The computation of Newton steps can then be related to dynamic programming, establishing connections to DDP, iLQG, and AICO. We also clarify how inverting a banded symmetric matrix is related to dynamic programming as well as to message passing in Markov chains and factor graphs. Further, for a machine learner, path optimization and Gaussian Processes seem intuitively related problems. We establish such a relation and show how to solve a Gaussian Process-regularized path optimization problem efficiently. Further topics include how to derive an optimal controller around the path, model predictive control in constrained k-order control processes, and the pullback metric interpretation of the Gauss–Newton approximation.
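The claimed relation between inverting a banded symmetric matrix and dynamic programming can be illustrated in a few lines. The following sketch (my own minimal example, not code from the chapter) solves a symmetric tridiagonal system A x = b, the simplest banded case arising from a chain-structured objective, by a backward "Riccati-like" sweep that accumulates value-function coefficients (P_t, q_t), followed by a forward substitution that reads off the optimum, exactly the structure of an LQR/DP pass:

```python
def solve_tridiag_dp(d, e, b):
    """Minimize 0.5 x^T A x - b^T x (i.e. solve A x = b) for a symmetric
    tridiagonal A with diagonal d[0..T] and off-diagonal e[0..T-1],
    via a backward dynamic-programming sweep and a forward pass."""
    T = len(d) - 1
    P = [0.0] * (T + 1)   # quadratic value-function coefficients
    q = [0.0] * (T + 1)   # linear value-function coefficients
    P[T], q[T] = d[T], b[T]
    for t in range(T - 1, -1, -1):        # backward (Riccati-like) sweep
        P[t] = d[t] - e[t] ** 2 / P[t + 1]
        q[t] = b[t] - e[t] * q[t + 1] / P[t + 1]
    x = [0.0] * (T + 1)
    x[0] = q[0] / P[0]
    for t in range(T):                    # forward substitution
        x[t + 1] = (q[t + 1] - e[t] * x[t]) / P[t + 1]
    return x

# Small example system (values chosen arbitrarily for illustration)
d, e, b = [4.0, 4.0, 4.0], [1.0, 1.0], [1.0, 2.0, 3.0]
x = solve_tridiag_dp(d, e, b)
```

The backward recursion is the scalar analogue of the Riccati recursion; for block-banded matrices the same pattern appears with matrix-valued P_t, which is the structural link the tutorial develops between Newton steps, Riccati sweeps, and message passing on chains.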

Keywords

Newton Method · Path Optimization · Riccati Equation · Trajectory Optimization · Model Predictive Control

Acknowledgements

This work was supported by the DFG under grant TO 409/9-1 and by the EU project 3rdHand (FP7-ICT-2013-10610878).

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Machine Learning and Robotics Lab, University of Stuttgart, Stuttgart, Germany