Abstract
Imitation and learning from humans require an adequate sensorimotor controller to learn and encode behaviors. We present the Dynamic Muscle Perception–Action (DM-PerAc) model to control a multiple degrees-of-freedom (DOF) robot arm. In the original PerAc model, path-following or place-reaching behaviors correspond to sensorimotor attractors resulting from the dynamics of learned sensorimotor associations. The DM-PerAc model, inspired by human muscles, combines impedance-like control with the capability of learning sensorimotor attraction basins. We detail a solution for learning the DM-PerAc visuomotor controller incrementally and online. Postural attractors are learned by adapting the muscle activations in the model according to movement errors. Visuomotor categories merging visual and proprioceptive signals are associated with these muscle activations. Thus, the visual and proprioceptive signals activate the motor action generating an attractor which satisfies both visual and proprioceptive constraints. This visuomotor controller can serve as a basis for imitative behaviors. In addition, the muscle activation patterns can define directions of movement instead of postural attractors; such patterns can be used in state-action couples to generate trajectories as in the PerAc model. Finally, we discuss a possible extension of the DM-PerAc controller by adapting the controller of Fukuyori et al., based on the Langevin equation, which can serve not only to reach attractors that were not explicitly learned, but also to learn the state/action couples defining trajectories.
Notes
In the minimum-jerk approach, movements minimize the time integral of the squared jerk (the third derivative of position), which maximizes the smoothness of the motion.
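As an illustration of this note (a standard closed-form result from Flash and Hogan 1985, not code from the article), the minimum-jerk position profile for a point-to-point reach can be sketched as:

```python
def minimum_jerk(x0, xf, T, t):
    """Minimum-jerk position at time t for a reach from x0 to xf in duration T.

    Closed-form minimizer of the integral of squared jerk; velocity and
    acceleration are zero at both endpoints, yielding a smooth bell-shaped
    velocity profile.
    """
    tau = min(max(t / T, 0.0), 1.0)  # normalized time, clamped to [0, 1]
    return x0 + (xf - x0) * (10 * tau**3 - 15 * tau**4 + 6 * tau**5)
```

By symmetry the trajectory passes through the midpoint at half the movement duration, and it starts slowly because acceleration is zero at onset.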
The damping can be constant. However, controlled movements are improved if the damping varies with the stiffness; for instance, the damping can be defined as proportional to the square root of the stiffness, as in Ganesh et al. (2010).
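A minimal sketch of this idea (illustrative only, not the authors' controller): a joint-level spring-damper command whose damping gain tracks the square root of the stiffness, so the response stays similarly damped when the stiffness is adapted online.

```python
import math

def impedance_torque(q, dq, q_eq, stiffness, damping_ratio=1.0):
    """Spring-damper torque pulling joint angle q toward equilibrium q_eq.

    The damping gain is proportional to the square root of the stiffness
    (as suggested by Ganesh et al. 2010); damping_ratio is a hypothetical
    tuning constant.
    """
    damping = damping_ratio * math.sqrt(stiffness)
    return stiffness * (q_eq - q) - damping * dq
```

With a constant damping gain instead, raising the stiffness would make the same joint progressively underdamped.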
Bold letters indicate vectors, whereas plain letters are scalars.
In practice, the range of activities was \([0,1]\) and we used \(nc=0.1\).
With the software Webots (Cyberbotics).
The robot TINO was co-funded by the French projects INTERACT and SESAME TINO, the Robotex and the CNRS. The robot only recently arrived in the ETIS lab.
References
Albu-Schäffer A, Ott C, Hirzinger G (2007) A unified passivity-based control framework for position, torque and impedance control of flexible joint robots. Int J Robot Res 26(1):23–39
Amari SI (1977) Dynamics of pattern formation in lateral-inhibition type neural fields. Biol Cybern 27(2):77–87
Andry P, Gaussier P, Nadel J, Hirsbrunner B (2004) Learning invariant sensorimotor behaviors: a developmental approach to imitation mechanisms. Adapt Behav 12(2):117–140
Argall BD, Chernova S, Veloso M, Browning B (2009) A survey of robot learning from demonstration. Robot Auton Syst 57(5):469–483
Atkeson CG, Moore AW, Schaal S (1997) Locally weighted learning. Artif Intell Rev 11(1–5):11–73
Bizzi E, Hogan N, Ivaldi FAM, Giszter S (1992) Does the nervous system use equilibrium-point control to guide single and multiple joint movements? Behav Brain Sci 15(Special Issue 04):603–613
Bullock D, Grossberg S (1989) VITE and FLETE: neural modules for trajectory formation and postural control. In: Hershberger W (ed) Volitional action, advances in psychology, vol 62. Elsevier, chap 11, pp 253–297
Burdet E, Tee KP, Mareels I, Milner TE, Chew CM, Franklin DW, Osu R, Kawato M (2006) Stability and motor adaptation in human arm movements. Biol Cybern 94(1):20–32
Butterworth G (1999) Neonatal imitation: existence, mechanisms and motives. In: Nadel J, Butterworth G (eds) Imitation in infancy. Cambridge University Press, Cambridge, pp 63–88
Calinon S, Guenter F, Billard A (2007) On learning, representing and generalizing a task in a humanoid robot. IEEE Trans Syst Man Cybern B Special Issue Robot Learn Obs Demonstr Imit 37(2):286–298
Calinon S, D’halluin F, Sauser E, Caldwell D, Billard A (2010a) Learning and reproduction of gestures by imitation: an approach based on hidden Markov model and Gaussian mixture regression. IEEE Robot Autom Mag 17(2):44–54
Calinon S, D’halluin F, Caldwell DG, Billard A (2009) Handling of multiple constraints and motion alternatives in a robot programming by demonstration framework. In: Proceedings of 2009 IEEE-RAS international conference on humanoid robots, pp 582–588
Calinon S, Sardellitti I, Caldwell DG (2010b) Learning-based control strategy for safe human-robot interaction exploiting task and robot redundancies. In: Proceedings of 2010 IEEE/RSJ international conference on intelligent robots and systems (IROS), Taipei, Taiwan, pp 249–254
Carpenter GA, Grossberg S (2002) Adaptive resonance theory (ART). In: The handbook of brain theory and neural networks. MIT Press, Cambridge, pp 79–82
Chiaverini S, Siciliano B, Villani L (1999) A survey of robot interaction control schemes with experimental comparison. IEEE/ASME Trans Mechatron 4(3):273–285
Cook G, Stark L (1968) The human eye-movement mechanism: experiments, modeling, and model testing. Arch Ophthalmol 79(4):428–436
de Rengervé A, Boucenna S, Andry P, Gaussier P (2010) Emergent imitative behavior on a robotic arm based on visuo-motor associative memories. In: Proceedings of 2010 IEEE/RSJ international conference on intelligent robots and systems (IROS), Taipei, Taiwan, pp 1754–1759
Droniou A, Ivaldi S, Padois V, Sigaud O (2012) Autonomous online learning of velocity kinematics on the iCub: a comparative study. In: Proceedings of 2012 IEEE/RSJ international conference on intelligent robots and systems (IROS), Vilamoura, Portugal, pp 3577–3582
Feldman AG, Levin MF (2009) The equilibrium-point hypothesis—past, present and future. In: Sternad D (ed) Progress in motor control, advances in experimental medicine and biology, vol 629. Springer, New York, chap 38, pp 699–726
Feldman AG (1966) Functional tuning of the nervous system with control of movement or maintenance of a steady posture. II. Controllable parameters of the muscle. Biophysics 11(3):565–578
Feldman AG (1986) Once more on the equilibrium-point hypothesis (lambda model) for motor control. J Motor Behav 18(1):17–54
Flash T, Hogan N (1985) The coordination of arm movements: an experimentally confirmed mathematical model. J Neurosci 5(7):1688–1703
Flash T (1987) The control of hand equilibrium trajectories in multi-joint arm movements. Biol Cybern 57(4):257–274
Franklin DW, Burdet E, Tee KP, Osu R, Chew CM, Milner TE, Kawato M (2008) CNS learns stable, accurate, and efficient movements using a simple algorithm. J Neurosci 28(44):11165–11173
Fukuyori I, Nakamura Y, Matsumoto Y, Ishiguro H (2008) Flexible control mechanism for multi-DOF robotic arm based on biological fluctuation. From Anim Animat 10:22–31
Ganesh G, Albu-Schaffer A, Haruno M, Kawato M, Burdet E (2010) Biomimetic motor behavior for simultaneous adaptation of force, impedance and trajectory in interaction tasks. In: Proceedings of 2010 IEEE international conference on robotics and automation (ICRA), pp 2705–2711
Gaussier P, Zrehen S (1995) PerAc: a neural architecture to control artificial animals. Robot Auton Syst 16(2–4):291–320
Gaussier P, Moga S, Banquet JP, Quoy M (1998) From perception-action loops to imitation processes: a bottom-up approach of learning by imitation. Appl Artif Intell 1(7):701–727
Georgopoulos A, Schwartz A, Kettner R (1986) Neuronal population coding of movement direction. Science 233(4771):1416–1419
Gergely G (2001) Is early differentiation of human action a precursor to the one-year-old’s understanding of intentionality? Dev Psychol 37:579–582
Giovannangeli C, Gaussier P (2010) Interactive teaching for vision-based mobile robots: a sensory-motor approach. IEEE Trans Syst Man Cybern A 40(1):13–28
Giovannangeli C, Gaussier P, Désilles G (2006) Robust mapless outdoor vision-based navigation. In: Proceedings of 2006 IEEE/RSJ international conference on intelligent robots and systems (IROS), IEEE, Beijing, China
Hersch M, Billard A (2006) A biologically-inspired model of reaching movements. In: Proceedings of 2006 IEEE/RAS-EMBS international conference on biomedical robotics and biomechatronics
Hill AV (1938) The heat of shortening and the dynamic constants of muscle. Proc R Soc B Biol Sci 126(843):136–195
Hoffmann H, Pastor P, Park DH, Schaal S (2009) Biologically-inspired dynamical systems for movement generation: automatic real-time goal adaptation and obstacle avoidance. In: Proceedings of 2009 IEEE international conference on robotics and automation (ICRA)
Hogan N (1984) An organizing principle for a class of voluntary movements. J Neurosci 4(11):2745–2754
Huxley AF (1957) Muscle structure and theories of contraction. Prog Biophys Biophys Chem 7:255–318
Ijspeert AJ, Nakanishi J, Schaal S (2003) Learning attractor landscapes for learning motor primitives. In: Advances in neural information processing systems 15, Cambridge, MA: MIT Press, pp 1547–1554
Ijspeert AJ, Nakanishi J, Hoffmann H, Pastor P, Schaal S (2013) Dynamical movement primitives: learning attractor models for motor behaviors. Neural Comput 25(2):328–373
Iossifidis I, Schöner G (2004) Autonomous reaching and obstacle avoidance with the anthropomorphic arm of a robotic assistant using the attractor dynamics approach. In: Proceedings of 2004 IEEE international conference on robotics and automation (ICRA), vol 5, pp 4295–4300
Iossifidis I, Schöner G (2006) Dynamical systems approach for the autonomous avoidance of obstacles and joint-limits for a redundant robot arm. In: Proceedings of 2006 IEEE/RSJ international conference on intelligent robots and systems (IROS), pp 580–585
Jiménez-Fabián R, Verlinden O (2011) Review of control algorithms for robotic ankle systems in lower-limb orthoses, prostheses, and exoskeletons. Med Eng Phys 34(4):397–408
Klute GK, Czerniecki JM, Hannaford B (2002) Artificial muscles: actuators for biorobotic systems. Int J Robot Res 21(4):295–309
Kohonen T (1982) Analysis of a simple self-organizing process. Biol Cybern 44(2):135–140
Kronander K, Billard A (2012) Online learning of varying stiffness through physical human–robot interaction. In: Proceedings of 2012 IEEE international conference on robotics and automation (ICRA), pp 1842–1849
Lagarde M, Andry P, Gaussier P, Boucenna S, Hafemeister L (2010) Proprioception and imitation: on the road to agent individuation. In: Sigaud O, Peters J (eds) From motor learning to interaction learning in robots, vol 264. Springer, Berlin, chap 3, pp 43–63
Law J, Shaw P, Earland K, Sheldon M, Lee MH (2014) A psychology based approach for longitudinal development in cognitive robotics. Front Neurorobotics 8(1). doi:10.3389/fnbot.2014.00001
Lungarella M, Metta G, Pfeifer R, Sandini G (2003) Developmental robotics: a survey. Connect Sci 15(4):151–190
Maillard M, Gapenne O, Hafemeister L, Gaussier P (2005) Perception as a dynamical sensori-motor attraction basin. In: Capcarrère M, Freitas A, Bentley P, Johnson C, Timmis J (eds) Advances in artificial life, lecture notes in computer science, vol 3630. Springer, Berlin, pp 37–46
Miyamoto H, Kawato M (1998) A tennis serve and upswing learning robot based on bi-directional theory. Neural Netw 11(7–8):1331–1344
Nehaniv CL, Dautenhahn K (2002) The correspondence problem. In: Dautenhahn K, Nehaniv CL (eds) Imitation in animals and artifacts. MIT Press, Cambridge, pp 41–61
Redgrave P, Gurney K (2006) The short-latency dopamine signal: a role in discovering novel actions? Nat Rev Neurosci 7(12):967–975
Rozo L, Calinon S, Caldwell DG, Jiménez P, Torras C (2013) Learning collaborative impedance-based robot behaviors. In: Proceedings of the 27th AAAI conference on artificial intelligence
Sanes JN, Jennings VA (1984) Centrally programmed patterns of muscle activity in voluntary motor behavior of humans. Exp Brain Res 54(1):23–32
Santrock JW (2005) A topical approach to life-span development, 2nd edn. McGraw-Hill, Boston
Schaal S (1997) Learning from demonstration. In: Advances in neural information processing systems, vol 9. MIT Press, pp 1040–1046
Schaal S, Atkeson CG (1998) Constructive incremental learning from only local information. Neural Comput 10(8):2047–2084
Schaal S (2006) Dynamic movement primitives—a framework for motor control in humans and humanoid robotics. In: Kimura H, Tsuchiya K, Ishiguro A, Witte H (eds) Adaptive motion of animals and machines. Springer, Tokyo, pp 261–280
Schöner G, Dose M, Engels C (1995) Dynamics of behavior: theory and applications for autonomous robot architectures. Robot Auton Syst 16(2–4):213–245
Slotine JJE (1988) Adaptive manipulator control: a case study. IEEE Trans Autom Control 33(11):995–1003
Todorov E (2007) Optimal control theory. In: Doya K (ed) Bayesian brain: probabilistic approaches to neural coding. MIT Press, Cambridge, chap 12, pp 269–298
Vijayakumar S, D’souza A, Schaal S (2005) Incremental online learning in high dimensions. Neural Comput 17(12):2602–2634
Winters JM, Stark L (1985) Analysis of fundamental human movement patterns through the use of in-depth antagonistic muscle models. IEEE Trans Bio-Med Eng 32(10):826–839
Winters JM, Stark L (1987) Muscle models: what is gained and what is lost by varying model complexity. Biol Cybern 55(6):403–420
Acknowledgments
This work was supported by the INTERACT French project reference number ANR-09-CORD-014.
Appendix: summary of the parameters and equations used in the Dynamic Muscle PerAc model
The different parameters and equations presented in this article are respectively summarized in Tables 1 and 2.
The proprioceptive (respectively visual) categorization depends on the vigilance parameter \(\lambda ^P\) (\(\lambda ^V\)) and on the parameter \(\beta ^P\) (\(\beta ^V\)) of the Gaussian similarity measure. High vigilance values cause the recruited categories to overlap; we therefore use \(\lambda ^P=\lambda ^V=0.005\) to avoid interference between categories. The Gaussian parameters are kept very low so that the categories remain selective enough. During the learning step, different values are used to progressively increase the number of learned categories (\(\beta ^P = 0.002\) then \(\beta ^P = 0.001\), and \(\beta ^V = 2\cdot 10^{-4}\) then \(\beta ^V = 5 \cdot 10^{-5}\)). During the tests, vision must drive the movements, so the proprioceptive categories must be less selective than the visual ones (\(\beta ^P = 0.1\) and \(\beta ^V = 5\cdot 10^{-5}\)).
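The recruitment scheme described above can be sketched as follows. This is our reading of the text, not the article's code: each category holds a prototype, its activity is a Gaussian of the distance to the input, and a new category is recruited when no existing activity reaches the vigilance threshold. The class and method names are illustrative.

```python
import math

def activity(x, w, beta):
    """Gaussian similarity between input vector x and prototype w."""
    d2 = sum((xi - wi) ** 2 for xi, wi in zip(x, w))
    return math.exp(-d2 / beta)

class Categorizer:
    """One-shot category recruitment driven by a vigilance threshold.

    Lowering beta narrows the Gaussians, making categories more
    selective; raising the vigilance lam makes recruitment easier,
    so the recruited categories overlap more.
    """
    def __init__(self, lam=0.005, beta=0.002):
        self.lam, self.beta = lam, beta
        self.prototypes = []

    def categorize(self, x):
        acts = [activity(x, w, self.beta) for w in self.prototypes]
        if not acts or max(acts) < self.lam:
            self.prototypes.append(list(x))  # recruit a new category
            return len(self.prototypes) - 1
        return acts.index(max(acts))         # best-matching category wins
```

With \(\beta = 0.002\), the activity falls off within a distance of a few hundredths, so inputs only slightly apart already recruit distinct categories.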
In the experiments, muscle activation learning depends on the learning factor \(\varepsilon ^A=10^{-3}\) and the decay factor \(\alpha ^A=10^{-4}\). Because the learning factor is small, the stiffness \(K_j\) of each joint changes slowly; the equilibrium position, however, adapts rapidly because it depends only on the ratio of the muscle activations. The decay must also be slow enough not to cancel the learning. With an error threshold \(th_D=0.01\), the muscle activations around a joint are adapted whenever the position error exceeds one hundredth of the rotational range.
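A toy sketch of this adaptation scheme, as we read it (the exact update rule is given by the article's equations; the antagonist geometry and function names here are illustrative assumptions):

```python
def adapt_activations(a_ago, a_ant, error, eps=1e-3, alpha=1e-4, th_d=0.01):
    """Adapt a pair of antagonist muscle activations from a position error.

    If |error| exceeds th_d, the activation pulling toward the target
    grows at rate eps; both activations decay at rate alpha. The joint
    stiffness is a_ago + a_ant and changes slowly, while the equilibrium
    position depends only on the activation ratio, so it can shift fast.
    """
    if abs(error) > th_d:
        if error > 0:            # target beyond current angle: strengthen agonist
            a_ago += eps * abs(error)
        else:                    # target behind: strengthen antagonist
            a_ant += eps * abs(error)
    a_ago = max(0.0, a_ago - alpha * a_ago)  # slow multiplicative decay
    a_ant = max(0.0, a_ant - alpha * a_ant)
    return a_ago, a_ant

def equilibrium(a_ago, a_ant, th_min=-1.0, th_max=1.0):
    """Equilibrium angle set by the ratio of the antagonist activations."""
    total = a_ago + a_ant
    return (a_ago * th_max + a_ant * th_min) / total if total else 0.0
```

Equal activations place the equilibrium at the middle of the range; biasing one activation shifts the equilibrium toward that muscle's side while the total stiffness barely changes.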
The parameters \(th_L\) and \(\gamma ^L\) define the dynamics of the “learning enable” signal \(L\), i.e., they determine how long each postural attractor is learned. With the values used, \(th_L=10^{-5}\) and \(\gamma ^L=0.95\), motor exploration resumes after roughly 10 s without any correction of the movements.
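The decay of the learning-enable signal can be sketched as a simple geometric decrease (our reading; a movement correction would reset \(L\), which is not modeled here):

```python
def steps_until_exploration(gamma=0.95, threshold=1e-5, signal=1.0):
    """Count update steps until the learning-enable signal L decays below
    the threshold, i.e. until motor exploration resumes.

    Each step without movement correction multiplies L by gamma.
    """
    steps = 0
    while signal >= threshold:
        signal *= gamma
        steps += 1
    return steps
```

With \(\gamma ^L = 0.95\) and \(th_L = 10^{-5}\), the signal needs about 225 update steps to decay from 1; the reported 10 s would then correspond to an update rate near 22 Hz (our inference).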
de Rengervé, A., Andry, P. & Gaussier, P. Online learning and control of attraction basins for the development of sensorimotor control strategies. Biol Cybern 109, 255–274 (2015). https://doi.org/10.1007/s00422-014-0640-4