Abstract
Imitation and learning from humans require an adequate sensorimotor controller to learn and encode behaviors. We present the Dynamic Muscle Perception–Action (DM-PerAc) model to control a multiple degrees-of-freedom (DOF) robot arm. In the original PerAc model, path-following or place-reaching behaviors correspond to sensorimotor attractors resulting from the dynamics of learned sensorimotor associations. The DM-PerAc model, inspired by human muscles, combines impedance-like control with the capability of learning sensorimotor attraction basins. We detail a solution for learning the DM-PerAc visuomotor controller incrementally and online. Postural attractors are learned by adapting the muscle activations in the model according to movement errors. Visuomotor categories merging visual and proprioceptive signals are associated with these muscle activations. Thus, the visual and proprioceptive signals activate the motor action generating an attractor which satisfies both visual and proprioceptive constraints. This visuomotor controller can serve as a basis for imitative behaviors. In addition, the muscle activation patterns can define directions of movement instead of postural attractors; such patterns can be used in state-action couples to generate trajectories as in the PerAc model. Finally, we discuss a possible extension of the DM-PerAc controller by adapting the controller of Fukuyori et al., based on the Langevin equation, which can serve not only to reach attractors that were not explicitly learned, but also to learn the state/action couples defining trajectories.
Notes
In the minimum-jerk approach, movements minimize the time integral of the squared jerk (the third derivative of position), which maximizes the smoothness of the motion.
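As an illustration of this note (a standard closed-form result from Flash and Hogan 1985, not code from the article), the minimum-jerk position profile for a point-to-point reach can be sketched as:

```python
def minimum_jerk(x0, xf, T, t):
    """Minimum-jerk position at time t for a reach from x0 to xf in duration T.

    Closed-form minimizer of the integral of squared jerk; velocity and
    acceleration are zero at both endpoints, yielding a smooth bell-shaped
    velocity profile.
    """
    tau = min(max(t / T, 0.0), 1.0)  # normalized time, clamped to [0, 1]
    return x0 + (xf - x0) * (10 * tau**3 - 15 * tau**4 + 6 * tau**5)
```

By symmetry the trajectory passes through the midpoint at half the movement duration, and it starts slowly because acceleration is zero at onset.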
The damping can be constant. However, controlled movements are improved if the damping varies with the stiffness; for instance, the damping can be defined as proportional to the square root of the stiffness, as in Ganesh et al. (2010).
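A minimal sketch of this idea (illustrative only, not the authors' controller): a joint-level spring-damper command whose damping gain tracks the square root of the stiffness, so the response stays similarly damped when the stiffness is adapted online.

```python
import math

def impedance_torque(q, dq, q_eq, stiffness, damping_ratio=1.0):
    """Spring-damper torque pulling joint angle q toward equilibrium q_eq.

    The damping gain is proportional to the square root of the stiffness
    (as suggested by Ganesh et al. 2010); damping_ratio is a hypothetical
    tuning constant.
    """
    damping = damping_ratio * math.sqrt(stiffness)
    return stiffness * (q_eq - q) - damping * dq
```

With a constant damping gain instead, raising the stiffness would make the same joint progressively underdamped.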
Bold letters indicate vectors, whereas plain letters are scalars.
In practice, the range of activities was \([0,1]\) and we used \(nc=0.1\).
With the software Webots (Cyberbotics).
The robot TINO was co-funded by the French projects INTERACT and SESAME TINO, the Robotex and the CNRS. The robot only recently arrived in the ETIS lab.
References
Albu-Schäffer A, Ott C, Hirzinger G (2007) A unified passivity-based control framework for position, torque and impedance control of flexible joint robots. Int J Robot Res 26(1):23–39
Amari SI (1977) Dynamics of pattern formation in lateral-inhibition type neural fields. Biol Cybern 27(2):77–87
Andry P, Gaussier P, Nadel J, Hirsbrunner B (2004) Learning invariant sensorimotor behaviors: a developmental approach to imitation mechanisms. Adapt Behav 12(2):117–140
Argall BD, Chernova S, Veloso M, Browning B (2009) A survey of robot learning from demonstration. Robot Auton Syst 57(5):469–483
Atkeson CG, Moore AW, Schaal S (1997) Locally weighted learning. Artif Intell Rev 11(1–5):11–73
Bizzi E, Hogan N, Ivaldi FAM, Giszter S (1992) Does the nervous system use equilibrium-point control to guide single and multiple joint movements? Behav Brain Sci 15(Special Issue 04):603–613
Bullock D, Grossberg S (1989) VITE and FLETE: neural modules for trajectory formation and postural control. In: Hershberger W (ed) Volitional action, advances in psychology, vol 62. Elsevier, chap 11, pp 253–297
Burdet E, Tee KP, Mareels I, Milner TE, Chew CM, Franklin DW, Osu R, Kawato M (2006) Stability and motor adaptation in human arm movements. Biol Cybern 94(1):20–32
Butterworth G (1999) Neonatal imitation: existence, mechanisms and motives. In: Nadel J, Butterworth G (eds) Imitation in infancy. Cambridge University Press, Cambridge, pp 63–88
Calinon S, Guenter F, Billard A (2007) On learning, representing and generalizing a task in a humanoid robot. IEEE Trans Syst Man Cybern B Special Issue Robot Learn Obs Demonstr Imit 37(2):286–298
Calinon S, D’halluin F, Sauser E, Caldwell D, Billard A (2010a) Learning and reproduction of gestures by imitation: an approach based on hidden Markov model and Gaussian mixture regression. IEEE Robot Autom Mag 17(2):44–54
Calinon S, D’halluin F, Caldwell DG, Billard A (2009) Handling of multiple constraints and motion alternatives in a robot programming by demonstration framework. In: Proceedings of 2009 IEEE-RAS international conference on humanoid robots, pp 582–588
Calinon S, Sardellitti I, Caldwell DG (2010b) Learning-based control strategy for safe human-robot interaction exploiting task and robot redundancies. In: Proceedings of 2010 IEEE/RSJ international conference on intelligent robots and systems (IROS), Taipei, Taiwan, pp 249–254
Carpenter GA, Grossberg S (2002) Adaptive resonance theory (ART). In: The handbook of brain theory and neural networks. MIT Press, Cambridge, pp 79–82
Chiaverini S, Siciliano B, Villani L (1999) A survey of robot interaction control schemes with experimental comparison. IEEE/ASME Trans Mechatron 4(3):273–285
Cook G, Stark L (1968) The human eye-movement mechanism: experiments, modeling, and model testing. Arch Ophthalmol 79(4):428–436
de Rengervé A, Boucenna S, Andry P, Gaussier P (2010) Emergent imitative behavior on a robotic arm based on visuo-motor associative memories. In: Proceedings of 2010 IEEE/RSJ international conference on intelligent robots and systems (IROS), Taipei, Taiwan, pp 1754–1759
Droniou A, Ivaldi S, Padois V, Sigaud O (2012) Autonomous online learning of velocity kinematics on the iCub: a comparative study. In: Proceedings of 2012 IEEE/RSJ international conference on intelligent robots and systems (IROS), Vilamoura, Portugal, pp 3577–3582
Feldman AG, Levin MF (2009) The equilibrium-point hypothesis—past, present and future. In: Sternad D (ed) Progress in motor control, advances in experimental medicine and biology, vol 629. Springer, New York, chap 38, pp 699–726
Feldman AG (1966) Functional tuning of the nervous system with control of movement or maintenance of a steady posture. II. Controllable parameters of the muscle. Biophysics 11(3):565–578
Feldman AG (1986) Once more on the equilibrium-point hypothesis (lambda model) for motor control. J Motor Behav 18(1):17–54
Flash T, Hogan N (1985) The coordination of arm movements: an experimentally confirmed mathematical model. J Neurosci 5(7):1688–1703
Flash T (1987) The control of hand equilibrium trajectories in multi-joint arm movements. Biol Cybern 57(4):257–274
Franklin DW, Burdet E, Tee KP, Osu R, Chew CM, Milner TE, Kawato M (2008) CNS learns stable, accurate, and efficient movements using a simple algorithm. J Neurosci 28(44):11165–11173
Fukuyori I, Nakamura Y, Matsumoto Y, Ishiguro H (2008) Flexible control mechanism for multi-DOF robotic arm based on biological fluctuation. From Anim Animat 10:22–31
Ganesh G, Albu-Schaffer A, Haruno M, Kawato M, Burdet E (2010) Biomimetic motor behavior for simultaneous adaptation of force, impedance and trajectory in interaction tasks. In: Proceedings of 2010 IEEE international conference on robotics and automation (ICRA), pp 2705–2711
Gaussier P, Zrehen S (1995) PerAc: a neural architecture to control artificial animals. Robot Auton Syst 16(2–4):291–320
Gaussier P, Moga S, Banquet JP, Quoy M (1998) From perception-action loops to imitation processes: a bottom-up approach of learning by imitation. Appl Artif Intell 1(7):701–727
Georgopoulos A, Schwartz A, Kettner R (1986) Neuronal population coding of movement direction. Science 233(4771):1416–1419
Gergely G (2001) Is early differentiation of human action a precursor to the one-year-old’s understanding of intentionality? Dev Psychol 37:579–582
Giovannangeli C, Gaussier P (2010) Interactive teaching for vision-based mobile robots: a sensory-motor approach. IEEE Trans Syst Man Cybern A 40(1):13–28
Giovannangeli C, Gaussier P, Désilles G (2006) Robust mapless outdoor vision-based navigation. In: Proceedings of 2006 IEEE/RSJ international conference on intelligent robots and systems (IROS), IEEE, Beijing, China
Hersch M, Billard A (2006) A biologically-inspired model of reaching movements. In: Proceedings of 2006 IEEE/RAS-EMBS international conference on biomedical robotics and biomechatronics
Hill AV (1938) The heat of shortening and the dynamic constants of muscle. Proc R Soc B Biol Sci 126(843):136–195
Hoffmann H, Pastor P, Park DH, Schaal S (2009) Biologically-inspired dynamical systems for movement generation: automatic real-time goal adaptation and obstacle avoidance. In: Proceedings of 2009 IEEE international conference on robotics and automation (ICRA)
Hogan N (1984) An organizing principle for a class of voluntary movements. J Neurosci 4(11):2745–2754
Huxley AF (1957) Muscle structure and theories of contraction. Prog Biophys Biophys Chem 7:255–318
Ijspeert AJ, Nakanishi J, Schaal S (2003) Learning attractor landscapes for learning motor primitives. In: Advances in neural information processing systems 15, Cambridge, MA: MIT Press, pp 1547–1554
Ijspeert AJ, Nakanishi J, Hoffmann H, Pastor P, Schaal S (2013) Dynamical movement primitives: learning attractor models for motor behaviors. Neural Comput 25(2):328–373
Iossifidis I, Schöner G (2004) Autonomous reaching and obstacle avoidance with the anthropomorphic arm of a robotic assistant using the attractor dynamics approach. In: Proceedings of 2004 IEEE international conference on robotics and automation (ICRA), vol 5, pp 4295–4300
Iossifidis I, Schöner G (2006) Dynamical systems approach for the autonomous avoidance of obstacles and joint-limits for a redundant robot arm. In: Proceedings of 2006 IEEE/RSJ international conference on intelligent robots and systems (IROS), pp 580–585
Jiménez-Fabián R, Verlinden O (2011) Review of control algorithms for robotic ankle systems in lower-limb orthoses, prostheses, and exoskeletons. Med Eng Phys 34(4):397–408
Klute GK, Czerniecki JM, Hannaford B (2002) Artificial muscles: actuators for biorobotic systems. Int J Robot Res 21(4):295–309
Kohonen T (1982) Analysis of a simple self-organizing process. Biol Cybern 44(2):135–140
Kronander K, Billard A (2012) Online learning of varying stiffness through physical human–robot interaction. In: Proceedings of 2012 IEEE international conference on robotics and automation (ICRA), pp 1842–1849
Lagarde M, Andry P, Gaussier P, Boucenna S, Hafemeister L (2010) Proprioception and imitation: on the road to agent individuation. In: Sigaud O, Peters J (eds) From motor learning to interaction learning in robots, vol 264. Springer, Berlin, chap 3, pp 43–63
Law J, Shaw P, Earland K, Sheldon M, Lee MH (2014) A psychology based approach for longitudinal development in cognitive robotics. Front Neurorobotics 8(1). doi:10.3389/fnbot.2014.00001
Lungarella M, Metta G, Pfeifer R, Sandini G (2003) Developmental robotics: a survey. Connect Sci 15(4):151–190
Maillard M, Gapenne O, Hafemeister L, Gaussier P (2005) Perception as a dynamical sensori-motor attraction basin. In: Capcarrère M, Freitas A, Bentley P, Johnson C, Timmis J (eds) Advances in artificial life, lecture notes in computer science, vol 3630. Springer, Berlin, pp 37–46
Miyamoto H, Kawato M (1998) A tennis serve and upswing learning robot based on bi-directional theory. Neural Netw 11(7–8):1331–1344
Nehaniv CL, Dautenhahn K (2002) The correspondence problem. In: Dautenhahn K, Nehaniv CL (eds) Imitation in animals and artifacts. MIT Press, Cambridge, pp 41–61
Redgrave P, Gurney K (2006) The short-latency dopamine signal: a role in discovering novel actions? Nat Rev Neurosci 7(12):967–975
Rozo L, Calinon S, Caldwell DG, Jiménez P, Torras C (2013) Learning collaborative impedance-based robot behaviors. In: Proceedings of the 27th AAAI conference on artificial intelligence
Sanes JN, Jennings VA (1984) Centrally programmed patterns of muscle activity in voluntary motor behavior of humans. Exp Brain Res 54(1):23–32
Santrock JW (2005) A topical approach to life-span development, 2nd edn. McGraw-Hill, Boston
Schaal S (1997) Learning from demonstration. In: Advances in neural information processing systems, vol 9. MIT Press, pp 1040–1046
Schaal S, Atkeson CG (1998) Constructive incremental learning from only local information. Neural Comput 10(8):2047–2084
Schaal S (2006) Dynamic movement primitives—a framework for motor control in humans and humanoid robotics. In: Kimura H, Tsuchiya K, Ishiguro A, Witte H (eds) Adaptive motion of animals and machines. Springer, Tokyo, pp 261–280
Schöner G, Dose M, Engels C (1995) Dynamics of behavior: theory and applications for autonomous robot architectures. Robot Auton Syst 16(2–4):213–245
Slotine JJE (1988) Adaptive manipulator control: a case study. IEEE Trans Autom Control 33(11):995–1003
Todorov E (2007) Optimal control theory. In: Doya K (ed) Bayesian brain: probabilistic approaches to neural coding. MIT Press, Cambridge, chap 12, pp 269–298
Vijayakumar S, D’souza A, Schaal S (2005) Incremental online learning in high dimensions. Neural Comput 17(12):2602–2634
Winters JM, Stark L (1985) Analysis of fundamental human movement patterns through the use of in-depth antagonistic muscle models. IEEE Trans Bio-Med Eng 32(10):826–839
Winters JM, Stark L (1987) Muscle models: what is gained and what is lost by varying model complexity. Biol Cybern 55(6):403–420
Acknowledgments
This work was supported by the INTERACT French project reference number ANR-09-CORD-014.
Appendix: summary of the parameters and equations used in the Dynamic Muscle PerAc model
The different parameters and equations presented in this article are respectively summarized in Tables 1 and 2.
The proprioceptive (respectively visual) categorization depends on the vigilance parameter \(\lambda ^P\) (\(\lambda ^V\)) and on the parameter \(\beta ^P\) (\(\beta ^V\)) of the Gaussian similarity measure. High vigilance values cause the recruited categories to overlap; we therefore use \(\lambda ^P=\lambda ^V=0.005\) to avoid interference between categories. The Gaussian parameters are kept very low so that the categories remain selective enough. During the learning step, different values are used to progressively increase the number of learned categories (\(\beta ^P = 0.002\) then \(\beta ^P = 0.001\), and \(\beta ^V = 2\cdot 10^{-4}\) then \(\beta ^V = 5 \cdot 10^{-5}\)). During the tests, vision must drive the movements, so the proprioceptive categories must be less selective than the visual ones (\(\beta ^P = 0.1\) and \(\beta ^V = 5\cdot 10^{-5}\)).
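The recruitment scheme described above can be sketched as follows. This is our reading of the text, not the article's code: each category holds a prototype, its activity is a Gaussian of the distance to the input, and a new category is recruited when no existing activity reaches the vigilance threshold. The class and method names are illustrative.

```python
import math

def activity(x, w, beta):
    """Gaussian similarity between input vector x and prototype w."""
    d2 = sum((xi - wi) ** 2 for xi, wi in zip(x, w))
    return math.exp(-d2 / beta)

class Categorizer:
    """One-shot category recruitment driven by a vigilance threshold.

    Lowering beta narrows the Gaussians, making categories more
    selective; raising the vigilance lam makes recruitment easier,
    so the recruited categories overlap more.
    """
    def __init__(self, lam=0.005, beta=0.002):
        self.lam, self.beta = lam, beta
        self.prototypes = []

    def categorize(self, x):
        acts = [activity(x, w, self.beta) for w in self.prototypes]
        if not acts or max(acts) < self.lam:
            self.prototypes.append(list(x))  # recruit a new category
            return len(self.prototypes) - 1
        return acts.index(max(acts))         # best-matching category wins
```

With \(\beta = 0.002\), the activity falls off within a distance of a few hundredths, so inputs only slightly apart already recruit distinct categories.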
In the experiments, muscle activation learning depends on the learning factor \(\varepsilon ^A=10^{-3}\) and the decay factor \(\alpha ^A=10^{-4}\). Because the learning factor is small, the stiffness \(K_j\) of each joint changes slowly; the equilibrium position, however, adapts rapidly because it depends only on the ratio of the muscle activations. The decay must also be slow enough not to cancel the learning. With an error threshold \(th_D=0.01\), the muscle activations around a joint are adapted whenever the position error exceeds one hundredth of the rotational range.
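A toy sketch of this adaptation scheme, as we read it (the exact update rule is given by the article's equations; the antagonist geometry and function names here are illustrative assumptions):

```python
def adapt_activations(a_ago, a_ant, error, eps=1e-3, alpha=1e-4, th_d=0.01):
    """Adapt a pair of antagonist muscle activations from a position error.

    If |error| exceeds th_d, the activation pulling toward the target
    grows at rate eps; both activations decay at rate alpha. The joint
    stiffness is a_ago + a_ant and changes slowly, while the equilibrium
    position depends only on the activation ratio, so it can shift fast.
    """
    if abs(error) > th_d:
        if error > 0:            # target beyond current angle: strengthen agonist
            a_ago += eps * abs(error)
        else:                    # target behind: strengthen antagonist
            a_ant += eps * abs(error)
    a_ago = max(0.0, a_ago - alpha * a_ago)  # slow multiplicative decay
    a_ant = max(0.0, a_ant - alpha * a_ant)
    return a_ago, a_ant

def equilibrium(a_ago, a_ant, th_min=-1.0, th_max=1.0):
    """Equilibrium angle set by the ratio of the antagonist activations."""
    total = a_ago + a_ant
    return (a_ago * th_max + a_ant * th_min) / total if total else 0.0
```

Equal activations place the equilibrium at the middle of the range; biasing one activation shifts the equilibrium toward that muscle's side while the total stiffness barely changes.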
The parameters \(th_L\) and \(\gamma ^L\) define the dynamics of the “learning enable” signal \(L\), i.e., they determine how long each postural attractor is learned. With the values used, \(th_L=10^{-5}\) and \(\gamma ^L=0.95\), motor exploration resumes after roughly 10 s without any correction of the movements.
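The decay of the learning-enable signal can be sketched as a simple geometric decrease (our reading; a movement correction would reset \(L\), which is not modeled here):

```python
def steps_until_exploration(gamma=0.95, threshold=1e-5, signal=1.0):
    """Count update steps until the learning-enable signal L decays below
    the threshold, i.e. until motor exploration resumes.

    Each step without movement correction multiplies L by gamma.
    """
    steps = 0
    while signal >= threshold:
        signal *= gamma
        steps += 1
    return steps
```

With \(\gamma ^L = 0.95\) and \(th_L = 10^{-5}\), the signal needs about 225 update steps to decay from 1; the reported 10 s would then correspond to an update rate near 22 Hz (our inference).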
de Rengervé, A., Andry, P. & Gaussier, P. Online learning and control of attraction basins for the development of sensorimotor control strategies. Biol Cybern 109, 255–274 (2015). https://doi.org/10.1007/s00422-014-0640-4