Autonomous Robots, Volume 31, Issue 1, pp 21–53

Teaching a humanoid robot to draw ‘Shapes’

  • Vishwanathan Mohan
  • Pietro Morasso
  • Jacopo Zenzeri
  • Giorgio Metta
  • V. Srinivasa Chakravarthy
  • Giulio Sandini


The core cognitive ability to perceive and synthesize ‘shapes’ underlies all our basic interactions with the world, be it shaping one’s fingers to grasp a ball or shaping one’s body while imitating a dance. In this article, we describe our attempts to understand this multifaceted problem by creating a primitive shape perception/synthesis system for the baby humanoid iCub. We specifically deal with the scenario of iCub learning to draw or scribble shapes of gradually increasing complexity, after observing a demonstration by a teacher, by using a series of self-evaluations of its performance. Learning to imitate a demonstrated human movement (specifically, visually observed end-effector trajectories of a teacher) can be considered a special case of the proposed computational machinery. This architecture is based on a loop of transformations that express the embodiment of the mechanism but, at the same time, are characterized by scale invariance and motor equivalence. The following transformations are integrated in the loop:

  (a) Characterizing in a compact, abstract way the ‘shape’ of a demonstrated trajectory using a finite set of critical points, derived using catastrophe theory: Abstract Visual Program (AVP);
  (b) Transforming the AVP into a Concrete Motor Goal (CMG) in iCub’s egocentric space;
  (c) Learning to synthesize a continuous virtual trajectory similar to the demonstrated shape using the discrete set of critical points defined in the CMG;
  (d) Using the virtual trajectory as an attractor for iCub’s internal body model, implemented by the Passive Motion Paradigm, which includes a forward and an inverse motor model;
  (e) Forming an Abstract Motor Program (AMP) by deriving the ‘shape’ of the self-generated movement (forward model output) using the same technique employed for creating the AVP;
  (f) Comparing the AVP and AMP in order to generate an internal performance score, hence closing the learning loop.
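To make steps (a), (e) and (f) concrete, the following is a minimal Python sketch of a scale-invariant shape descriptor and a comparison score. It is not the authors' method: the article derives critical points from catastrophe theory, whereas this sketch uses a crude stand-in (end points plus local extrema of x or y along the path). The function names `critical_points`, `abstract_program` and `performance_score`, the bounding-box normalisation and the scoring rule are all assumptions made for illustration.

```python
import math

def critical_points(traj):
    """Label end points and interior local extrema of x(t) or y(t).

    Simplified stand-in for the catastrophe-theory critical points
    used in the article (assumption, for illustration only).
    """
    points = [("end", traj[0])]
    for prev, cur, nxt in zip(traj, traj[1:], traj[2:]):
        dx_in, dx_out = cur[0] - prev[0], nxt[0] - cur[0]
        dy_in, dy_out = cur[1] - prev[1], nxt[1] - cur[1]
        if dx_in * dx_out < 0:        # x reverses direction at this sample
            points.append(("x_ext", cur))
        elif dy_in * dy_out < 0:      # y reverses direction at this sample
            points.append(("y_ext", cur))
    points.append(("end", traj[-1]))
    return points

def abstract_program(traj):
    """Critical points with coordinates normalised by the bounding box,
    so that translated or uniformly scaled copies give the same descriptor."""
    xs = [p[0] for p in traj]
    ys = [p[1] for p in traj]
    x0, y0 = min(xs), min(ys)
    scale = max(max(xs) - x0, max(ys) - y0) or 1.0  # guard degenerate paths
    return [(kind, ((x - x0) / scale, (y - y0) / scale))
            for kind, (x, y) in critical_points(traj)]

def performance_score(program_a, program_b):
    """Hypothetical score: 0.0 if the critical-point types differ,
    otherwise 1 minus the mean distance between matched points."""
    if [k for k, _ in program_a] != [k for k, _ in program_b]:
        return 0.0
    total = sum(math.hypot(p[0] - q[0], p[1] - q[1])
                for (_, p), (_, q) in zip(program_a, program_b))
    return max(0.0, 1.0 - total / len(program_a))

# Demonstrated shape: a half circle. Attempt: the same arc, scaled and shifted.
arc = [(math.cos(math.pi * i / 50), math.sin(math.pi * i / 50)) for i in range(51)]
attempt = [(3 * x + 5, 3 * y - 2) for x, y in arc]
line = [(i / 50, i / 50) for i in range(51)]

avp = abstract_program(arc)       # plays the role of the AVP
amp = abstract_program(attempt)   # plays the role of the AMP
score_same = performance_score(avp, amp)
score_line = performance_score(avp, abstract_program(line))
```

In this toy setting the scaled and shifted arc scores close to 1 against the demonstration, while a straight line, which has no interior critical point, scores 0.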
The resulting computational framework further combines three crucial streams of learning: (1) motor babbling (self-exploration), (2) imitative action learning (social interaction) and (3) mental simulation, to give rise to sensorimotor knowledge that is endowed with seamless compositionality, generalization capability and body-effectors/task independence. The robustness of the computational architecture is demonstrated by means of several experimental trials of gradually increasing complexity using a state-of-the-art humanoid platform.
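The internal body model of step (d), which also supports mental simulation, can be illustrated with a small relaxation loop in the spirit of the Passive Motion Paradigm: a virtual force pulls the end-effector toward the target, the transpose Jacobian maps it to joint torques, and the joints yield through an admittance. The sketch below assumes a planar two-link arm rather than iCub's kinematics; the link lengths, gains, step count and the function name `relax` are illustrative assumptions, and the terminal-attractor time gating named in the keywords is omitted.

```python
import math

L1, L2 = 0.30, 0.25   # link lengths in metres (illustrative values)

def forward(q):
    """Forward model: joint angles -> end-effector position."""
    return (L1 * math.cos(q[0]) + L2 * math.cos(q[0] + q[1]),
            L1 * math.sin(q[0]) + L2 * math.sin(q[0] + q[1]))

def jacobian(q):
    """2x2 Jacobian of the planar two-link arm."""
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    return ((-L1 * s1 - L2 * s12, -L2 * s12),
            (L1 * c1 + L2 * c12, L2 * c12))

def relax(q, target, stiffness=100.0, admittance=1.0, dt=0.01, steps=2000):
    """Let the arm model relax toward the target under a virtual force field.

    Returns the final joint angles and the simulated end-effector path.
    """
    path = [forward(q)]
    for _ in range(steps):
        x, y = path[-1]
        # Virtual elastic force pulling the end-effector toward the target.
        fx = stiffness * (target[0] - x)
        fy = stiffness * (target[1] - y)
        # Map the force to joint torques with the transpose Jacobian.
        J = jacobian(q)
        tau = (J[0][0] * fx + J[1][0] * fy,
               J[0][1] * fx + J[1][1] * fy)
        # Joints yield to the torques (admittance), integrated with Euler steps.
        q = (q[0] + dt * admittance * tau[0],
             q[1] + dt * admittance * tau[1])
        path.append(forward(q))
    return q, path

target = (0.25, 0.30)
q_final, path = relax((0.3, 1.0), target)
reach_error = math.hypot(path[-1][0] - target[0], path[-1][1] - target[1])
```

Because the loop only reads and updates the model's own state, the same code can be run without moving a physical arm, which is the sense in which such a model can serve mental simulation as well as execution.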


Keywords: Shape, Shaping, Catastrophe theory, Passive motion paradigm, Terminal attractors, iCub


Supplementary material

Observing a tool being used.wmv (WMV 11.2 MB)

Auro2iCubArtShort.wmv (WMV 26.7 MB)



Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Vishwanathan Mohan (1, email author)
  • Pietro Morasso (1)
  • Jacopo Zenzeri (1)
  • Giorgio Metta (1)
  • V. Srinivasa Chakravarthy (2)
  • Giulio Sandini (1)

  1. Robotics, Brain and Cognitive Sciences Department, Italian Institute of Technology, Genova, Italy
  2. Department of Biotechnology, Indian Institute of Technology, Chennai, India
