Modulation of Robotic Motor Synergies Using Reinforcement Learning Optimization

  • Stephen H. Lane
  • David A. Handelman
  • Jack J. Gelfand
Part of The Springer International Series in Engineering and Computer Science book series (SECS, volume 202)


It is thought that the brain produces coordinated action by recruiting, suppressing, and/or modulating appropriate sets of coordinative structures in the spinal cord. The work presented in this paper examines the ability of robotic systems to produce similar behavior through the modulation of motor synergy strengths using central pattern generator neural networks and reinforcement learning optimization. The motor synergies employed are forward kinematic approximations based on the Berkinblitt model of the spinal frog wiping reflex. The object of the reinforcement learning optimization is to modulate the synergy coefficient strengths in order to produce skilled motions that can be generalized across space and time. Simulation results demonstrate the acquisition of robotic skills associated with minimum energy cost functions.
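The idea of tuning synergy coefficient strengths against an energy-like cost can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the planar two-link arm, the link lengths, the cross-product velocity rule (a Jacobian-transpose-style rule used here as a stand-in for a forward kinematic synergy), the cost weights, and the names `joint_positions`, `rollout` and `optimize`. The abstract does not specify the reinforcement learning algorithm, so plain random-perturbation hill climbing is substituted for it.

```python
import math
import random

LINKS = (1.0, 1.0)  # hypothetical link lengths of a planar two-link arm


def joint_positions(q):
    """Return base, elbow and endpoint positions for joint angles q."""
    x1 = LINKS[0] * math.cos(q[0])
    y1 = LINKS[0] * math.sin(q[0])
    x2 = x1 + LINKS[1] * math.cos(q[0] + q[1])
    y2 = y1 + LINKS[1] * math.sin(q[0] + q[1])
    return [(0.0, 0.0), (x1, y1), (x2, y2)]


def rollout(gains, q0, target, steps=200, dt=0.02):
    """Simulate one reach and return its cost (assumed form).

    Each joint velocity is gain * (r x e), where r runs from the joint
    to the endpoint and e runs from the endpoint to the target.
    """
    q = list(q0)
    energy = 0.0
    for _ in range(steps):
        pts = joint_positions(q)
        end = pts[2]
        ex, ey = target[0] - end[0], target[1] - end[1]
        qdot = []
        for i in range(2):
            rx, ry = end[0] - pts[i][0], end[1] - pts[i][1]
            qdot.append(gains[i] * (rx * ey - ry * ex))
        # Sum of squared joint velocities as a proxy for energy (assumption).
        energy += sum(v * v for v in qdot) * dt
        q = [q[i] + qdot[i] * dt for i in range(2)]
    end = joint_positions(q)[2]
    err = math.hypot(target[0] - end[0], target[1] - end[1])
    # Terminal-error penalty with an arbitrary weight (assumption).
    return energy + 100.0 * err * err


def optimize(q0, target, iters=300, sigma=0.2, seed=0):
    """Random-perturbation hill climbing over the two synergy gains."""
    rng = random.Random(seed)
    gains = [1.0, 1.0]
    best = rollout(gains, q0, target)
    for _ in range(iters):
        trial = [max(0.0, g + rng.gauss(0.0, sigma)) for g in gains]
        cost = rollout(trial, q0, target)
        if cost < best:  # keep a perturbation only if it lowers the cost
            gains, best = trial, cost
    return gains, best


if __name__ == "__main__":
    gains, cost = optimize((0.3, 0.6), (1.0, 1.0))
    print("gains:", gains, "cost:", cost)
```

Because a trial set of gains is kept only when it lowers the cost, the returned cost can never exceed that of the initial gains. The method described in the paper may differ in every one of these details.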


Central Pattern Generator, Coordinative Structure, Joint Velocity, Skilled Motion, Neural Network Weight
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer Science+Business Media New York 1993

Authors and Affiliations

  • Stephen H. Lane (1, 2)
  • David A. Handelman (1, 2)
  • Jack J. Gelfand (1)
  1. Human Information Processing Group, Department of Psychology, Princeton University, Princeton, USA
  2. Robicon Systems Inc., Princeton, USA
