Decentralized robust zero-sum neuro-optimal control for modular robot manipulators in contact with uncertain environments: theory and experimental verification

  • Bo Dong
  • Tianjiao An
  • Fan Zhou
  • Keping Liu
  • Yuanchun Li
Original Paper


This paper presents a decentralized robust zero-sum optimal control approach, based on the adaptive dynamic programming (ADP) algorithm, for modular robot manipulators (MRMs) in contact with uncertain environments. The dynamic model of the MRM is formulated via a joint torque feedback technique, which is deployed at each joint module to design the model compensation controller. An uncertainty decomposition-based robust controller is developed to compensate for the model uncertainties, and the robust optimal control problem of the MRM system is then transformed into a two-player zero-sum optimal control problem. Using the ADP algorithm, the Hamilton–Jacobi–Isaacs equation can be solved by establishing action and critic neural networks, which makes the derivation of the approximate optimal control policy feasible. Based on Lyapunov theory, the closed-loop robotic system is proved to be asymptotically stable under the developed decentralized control method. Finally, experiments are conducted to verify the effectiveness and advantages of the proposed method.
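To make the two-player zero-sum formulation concrete, the following is a minimal sketch on a scalar linear toy system, not the MRM joint dynamics or the neural-network scheme of the paper. It assumes hypothetical dynamics x' = a*x + b*u + k*w, where u is the control (minimizing player) and w is the disturbance (maximizing player), with cost integrand q*x^2 + r*u^2 - gamma^2*w^2. All symbols and numerical values are illustrative assumptions. For this toy case the Hamilton–Jacobi–Isaacs (HJI) equation has a closed-form quadratic solution V(x) = p*x^2, which is the kind of value function a critic network would be trained to approximate in the nonlinear case.

```python
import math


def solve_scalar_hji(a, b, k, q, r, gamma):
    """Return p > 0 such that V(x) = p*x**2 solves the scalar HJI equation
    q + 2*a*p - (b**2/r - k**2/gamma**2) * p**2 = 0.

    All parameters are hypothetical toy values, not taken from the paper.
    """
    s = b**2 / r - k**2 / gamma**2
    if s <= 0:
        # In this sketch we only handle the case where the control
        # authority dominates the disturbance channel.
        raise ValueError("gamma is too small for this sketch (s <= 0)")
    # Positive root of the quadratic s*p**2 - 2*a*p - q = 0.
    return (a + math.sqrt(a**2 + q * s)) / s


def hji_residual(p, a, b, k, q, r, gamma):
    """HJI residual per unit x**2 for the candidate V(x) = p*x**2."""
    s = b**2 / r - k**2 / gamma**2
    return q + 2 * a * p - s * p**2


def saddle_point_gains(p, b, k, r, gamma):
    """Linear feedback gains of the saddle-point policies.

    u*(x) = ku * x minimizes the Hamiltonian,
    w*(x) = kw * x maximizes it.
    """
    ku = -b * p / r
    kw = k * p / gamma**2
    return ku, kw


if __name__ == "__main__":
    a, b, k, q, r, gamma = 1.0, 1.0, 0.5, 1.0, 1.0, 2.0
    p = solve_scalar_hji(a, b, k, q, r, gamma)
    ku, kw = saddle_point_gains(p, b, k, r, gamma)
    print("p =", p)
    print("HJI residual =", hji_residual(p, a, b, k, q, r, gamma))
    print("closed-loop pole =", a + b * ku + k * kw)
```

In this toy setting the closed-loop pole under both saddle-point policies is negative even though the open-loop system (a = 1) is unstable, which mirrors, in a much simpler form, the stability property the abstract claims for the closed-loop robotic system.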


Modular robot manipulators; Adaptive dynamic programming; Decentralized control; Optimal control; Zero-sum game



This work was supported by the National Natural Science Foundation of China (Grant Nos. 61374051, 61773075 and 61703055), the Scientific Technological Development Plan Project in Jilin Province of China (Grant Nos. 20170204067GX, 20160520013JH and 2016041403-3GH) and the Science and Technology Project of Jilin Provincial Education Department of China during the 13th Five-Year Plan Period (JJKH20170569KJ).

Compliance with ethical standards

Conflict of interest

The authors declared no potential conflicts of interest with respect to the research, authorship and publication of this article.

Ethical approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards.



Copyright information

© Springer Nature B.V. 2019

Authors and Affiliations

  1. Department of Control Science and Engineering, Changchun University of Technology, Changchun, China
  2. The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
