Online Actor-critic Reinforcement Learning Control for Uncertain Surface Vessel Systems with External Disturbances

Vu, Van Tu; Tran, Quang Huy; Pham, Thanh Loc; Dao, Phuong Nam

doi:10.1007/s12555-020-0809-7

Regular Papers
Intelligent Control and Applications
Published: 11 March 2022

Volume 20, pages 1029–1040 (2022)

International Journal of Control, Automation and Systems

Van Tu Vu¹, Quang Huy Tran², Thanh Loc Pham³ & Phuong Nam Dao³ (ORCID: orcid.org/0000-0002-8333-5572)

Abstract

This article presents a trajectory tracking control approach for an uncertain surface vessel based on a new cascade structure that combines an adaptive reinforcement learning (ARL) algorithm with a kinematic controller and a feed-forward term. Because the surface vessel model separates into a kinematic sub-system and a dynamic sub-system, a cascade control system is well suited to the tracking problem. In the proposed structure, the dynamic control loop is designed as an optimal controller for the dynamic sub-system, and the kinematic control loop is implemented by a nonlinear controller combined with a feed-forward term. The ARL algorithm uses an online actor-critic architecture to overcome the difficulty of solving the Hamilton-Jacobi-Bellman (HJB) equation. In addition, the proposed controller handles the difficulty of the non-autonomous optimal control problem by designing the ARL technique for the corresponding system with a small number of state variables. Theoretical analysis shows that the ARL-based control design guarantees uniformly ultimately bounded (UUB) stability of the closed-loop system. Finally, simulation results verify the effectiveness of the proposed control scheme.
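
The kinematic loop of such a cascade can be illustrated with a minimal sketch. It is not the controller from the article: it assumes the standard 3-DOF surface vessel kinematics (earth-frame pose rate equals the yaw rotation applied to the body-frame velocity), a textbook feed-forward plus proportional feedback law, and a dynamic loop that tracks the commanded velocity perfectly, which stands in for the ARL-based dynamic controller. The function names, the circular reference trajectory, the gain, and the initial pose are all hypothetical choices made for illustration.

```python
import math

def body_to_earth(psi, nu):
    """Rotate body-frame velocity nu = (u, v, r) into earth-frame pose rates."""
    u, v, r = nu
    c, s = math.cos(psi), math.sin(psi)
    return (c * u - s * v, s * u + c * v, r)

def earth_to_body(psi, rate):
    """Inverse rotation (transpose of the yaw rotation matrix)."""
    xd, yd, pd = rate
    c, s = math.cos(psi), math.sin(psi)
    return (c * xd + s * yd, -s * xd + c * yd, pd)

def kinematic_controller(eta, eta_d, eta_d_dot, gain):
    """Assumed law: feed-forward reference rate minus proportional error feedback."""
    earth_cmd = tuple(dd - gain * (e - d)
                      for e, d, dd in zip(eta, eta_d, eta_d_dot))
    return earth_to_body(eta[2], earth_cmd)

def reference(t, w=0.2):
    """Hypothetical reference: unit circle with linearly growing heading."""
    eta_d = (math.sin(w * t), math.cos(w * t), w * t)
    eta_d_dot = (w * math.cos(w * t), -w * math.sin(w * t), w)
    return eta_d, eta_d_dot

def simulate(t_end=10.0, dt=0.01, gain=1.0):
    """Euler simulation assuming the inner loop tracks nu_d exactly."""
    eta = (1.0, -1.0, 0.5)  # hypothetical initial pose (x, y, psi)
    errors = []
    for k in range(int(t_end / dt)):
        eta_d, eta_d_dot = reference(k * dt)
        errors.append(math.dist(eta, eta_d))
        nu_d = kinematic_controller(eta, eta_d, eta_d_dot, gain)
        eta_dot = body_to_earth(eta[2], nu_d)
        eta = tuple(x + dt * xd for x, xd in zip(eta, eta_dot))
    return errors

if __name__ == "__main__":
    errs = simulate()
    print(f"initial error {errs[0]:.3f}, final error {errs[-1]:.5f}")
```

Under the perfect-tracking assumption the pose error obeys a stable first-order equation, so the printed error shrinks over the run. In the article's setting the velocity command would instead be handed to the ARL-based dynamic loop, which has to cope with model uncertainty and external disturbances.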

Funding

This work was supported in part by the Ministry of Education and Training, Vietnam, under grant B2020-BKA-05.

Author information

Authors and Affiliations

Hai Phong University, Hai Phong, Vietnam
Van Tu Vu
Department of Mechanical Engineering, National Cheng Kung University (NCKU), Tainan, Taiwan
Quang Huy Tran
School of Electrical Engineering, Hanoi University of Science and Technology, 01 Dai Co Viet, Hai Ba Trung District, Hanoi, Vietnam
Thanh Loc Pham & Phuong Nam Dao

Corresponding author

Correspondence to Phuong Nam Dao.

Additional information

Van Tu Vu received his M.E. degree in electrical engineering from Vietnam Maritime University, Hai Phong, Vietnam, in 2015. He is currently a lecturer at Hai Phong University, Vietnam, and is working toward a Ph.D. degree at Hanoi University of Science and Technology, Vietnam. His current research interests include optimal control and robust/adaptive control.

Quang Huy Tran received his Engineer degree in control engineering and automation from Hanoi University of Science and Technology. He is currently a master's student in the Department of Mechanical Engineering, National Cheng Kung University. His research interests include robotics, automatic control, networked robot systems, and autonomous systems.

Thanh Loc Pham received his B.S. degree in electronic engineering from Hanoi University of Science and Technology, Vietnam, in 2020. He is currently an automation engineer at Viettel High Technology Industries Corporation. His research interests include control and navigation systems for unmanned aerial vehicles, surface vehicles, and manipulators.

Phuong Nam Dao received his Ph.D. degree in industrial automation from Hanoi University of Science and Technology, Hanoi, Vietnam, in 2013. He is currently a lecturer at Hanoi University of Science and Technology, Vietnam. His research interests include control of robotic systems, robust/adaptive control, and optimal control.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Vu, V.T., Tran, Q.H., Pham, T.L. et al. Online Actor-critic Reinforcement Learning Control for Uncertain Surface Vessel Systems with External Disturbances. Int. J. Control Autom. Syst. 20, 1029–1040 (2022). https://doi.org/10.1007/s12555-020-0809-7

Received: 01 November 2020
Revised: 05 March 2021
Accepted: 22 May 2021
Published: 11 March 2022
Issue Date: March 2022
DOI: https://doi.org/10.1007/s12555-020-0809-7
