Abstract
A prerequisite for achieving brain-like intelligence is the ability to rapidly learn new behaviors and actions. A fundamental mechanism for rapid learning in humans is imitation: children routinely learn new skills (e.g., opening a door or tying a shoelace) by imitating their parents; adults continue to learn by imitating skilled instructors (e.g., in tennis). In this chapter, we propose a probabilistic framework for imitation learning in robots that is inspired by how humans learn through imitation and exploration. Rather than relying on complex (and often brittle) physics-based models, the robot learns a dynamic Bayesian network that captures its dynamics directly in terms of sensor measurements and actions during an imitation-guided exploration phase. After learning, actions are selected based on probabilistic inference in the learned Bayesian network. We present results demonstrating that a 25-degree-of-freedom humanoid robot can learn dynamically stable, full-body imitative motions simply by observing a human demonstrator.
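The two-stage idea in the abstract (learn a model of sensor/action dynamics from exploration, then pick actions by inference in that model) can be illustrated with a deliberately small sketch. The abstract does not give the details of the chapter's model, so everything below is an assumption made for illustration: a five-state discrete "sensor" variable, three hypothetical motor commands, a toy simulator standing in for the robot, and count-based estimation of the transition probabilities. It is not the authors' implementation.

```python
import random
from collections import defaultdict

N_STATES = 5          # hypothetical discretized sensor readings 0..4
ACTIONS = (-1, 0, 1)  # hypothetical motor commands


def true_dynamics(state, action, rng):
    """Toy stand-in for the robot: the command takes effect 80% of the time."""
    nxt = state + action if rng.random() < 0.8 else state
    return min(max(nxt, 0), N_STATES - 1)


def learn_transition_model(n_samples, rng):
    """Estimate P(next_state | state, action) from random exploration counts."""
    # Start every count at 1 (Laplace smoothing) so no probability is zero.
    counts = defaultdict(lambda: [1.0] * N_STATES)
    for _ in range(n_samples):
        s = rng.randrange(N_STATES)
        a = rng.choice(ACTIONS)
        counts[(s, a)][true_dynamics(s, a, rng)] += 1.0
    model = {}
    for s in range(N_STATES):
        for a in ACTIONS:
            row = counts[(s, a)]
            total = sum(row)
            model[(s, a)] = [c / total for c in row]
    return model


def infer_action(model, state, target):
    """Posterior P(action | state, desired next state) under a uniform action prior."""
    likelihood = {a: model[(state, a)][target] for a in ACTIONS}
    z = sum(likelihood.values())
    posterior = {a: p / z for a, p in likelihood.items()}
    return max(posterior, key=posterior.get), posterior


rng = random.Random(0)
model = learn_transition_model(5000, rng)
best_action, posterior = infer_action(model, state=2, target=3)
```

Here the desired next state plays the role of the demonstrated motion: the action is not looked up from a hand-written controller but inferred by conditioning the learned model on where the system should end up.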
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Grimes, D.B., Rao, R.P.N. (2009). Learning Actions through Imitation and Exploration: Towards Humanoid Robots That Learn from Humans. In: Sendhoff, B., Körner, E., Sporns, O., Ritter, H., Doya, K. (eds) Creating Brain-Like Intelligence. Lecture Notes in Computer Science, vol 5436. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00616-6_7
DOI: https://doi.org/10.1007/978-3-642-00616-6_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00615-9
Online ISBN: 978-3-642-00616-6
eBook Packages: Computer Science, Computer Science (R0)