Cognitive Processing, Volume 12, Issue 4, pp 319–340

Model learning for robot control: a survey

  • Duy Nguyen-Tuong
  • Jan Peters


Models, such as kinematics and dynamics models of the robot’s own body and of controllable external objects, are among the most essential tools in robotics. It is widely believed that intelligent mammals also rely on internal models to generate their actions. However, while classical robotics relies on manually generated models based on human insights into physics, future autonomous, cognitive robots need to generate models automatically from information extracted from the data streams accessible to the robot. In this paper, we survey the progress in model learning with a strong focus on robot control at the kinematic as well as the dynamical level. Here, a model describes essential information about the behavior of the environment and the influence of an agent on this environment. In the context of model-based learning control, we view the model from three different perspectives. First, we study the different possible model learning architectures for robotics. Second, we discuss what kinds of problems these architectures and the domain of robotics imply for the applicable learning methods. From this discussion, we deduce future directions for real-time learning algorithms. Third, we show in several case studies where these scenarios have been used successfully.
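To make the notion of a learned model concrete, the following is a minimal sketch and is not taken from the paper. It assumes a hypothetical 1-D point mass whose true dynamics (`simulate_step`, with made-up `mass` and `dt` values) are hidden from the learner. A forward model, which predicts the effect of an action, is fitted by least-squares regression on recorded data and is then inverted to obtain the action that produces a desired effect. All function names and constants are illustrative assumptions.

```python
import random


def simulate_step(velocity, force, mass=2.0, dt=0.01):
    """Hypothetical ground-truth dynamics of a 1-D point mass.

    The learner never sees mass or dt; it only observes the outcome.
    """
    return velocity + (force / mass) * dt


def collect_data(num_samples=200, seed=0):
    """Record (force, velocity change) pairs from random exploratory actions."""
    rng = random.Random(seed)
    data = []
    velocity = 0.0
    for _ in range(num_samples):
        force = rng.uniform(-1.0, 1.0)
        next_velocity = simulate_step(velocity, force)
        data.append((force, next_velocity - velocity))
        velocity = next_velocity
    return data


def fit_forward_model(data):
    """Least-squares estimate of theta in: velocity_change ~ theta * force."""
    numerator = sum(force * dv for force, dv in data)
    denominator = sum(force * force for force, _ in data)
    return numerator / denominator


def predict_forward(theta, velocity, force):
    """Forward model: predict the next velocity for a given action."""
    return velocity + theta * force


def predict_inverse(theta, velocity, desired_velocity):
    """Inverse model: action expected to reach the desired next velocity."""
    return (desired_velocity - velocity) / theta


data = collect_data()
theta = fit_forward_model(data)  # estimates dt / mass from data alone
force_needed = predict_inverse(theta, velocity=0.0, desired_velocity=0.01)
```

In this noise-free toy setting a single parameter suffices. Real robot dynamics are nonlinear and noisy, which is why the learning methods listed in the keywords center on regression over richer function classes.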


Keywords: Model learning; Robot control; Machine learning; Regression



Copyright information

© Marta Olivetti Belardinelli and Springer-Verlag 2011

Authors and Affiliations

  1. Max-Planck Institute for Biological Cybernetics, Tübingen, Germany
