A Modified ALOS Method of Path Tracking for AUVs with Reinforcement Learning Accelerated by Dynamic Data-Driven AUV Model

Wang, Dianrui; He, Bo; Shen, Yue; Li, Guangliang; Chen, Guanzhong

doi:10.1007/s10846-021-01504-0

Short paper
Published: 04 March 2022

Volume 104, article number 49, (2022)
Journal of Intelligent & Robotic Systems

Abstract

Path tracking has a significant impact on the success of long-term autonomous underwater vehicle (AUV) missions in terms of safety, energy saving, and efficiency. However, it is a challenging problem because of model uncertainty and ocean current disturbances. Moreover, the widely used line-of-sight (LOS) algorithm with a fixed lookahead distance does not perform well, which creates an urgent need for automatic adjustment of this parameter. To address these issues, this study proposes an adaptive line-of-sight (ALOS) guidance method with reinforcement learning (RL) based on a dynamic data-driven AUV model (DDDAM). First, we introduce a detailed AUV dynamic model, covering the cases with and without current influence. Next, based on this model, we analyze in detail the path tracking error dynamics and the factors that influence tracking performance. We then use the DDDAM, built on a long short-term memory (LSTM) neural network, to pre-train the RL framework by generating additional samples for online learning, which speeds up the learning process. Finally, a deterministic policy gradient (DPG) based RL algorithm is designed to optimize the continuously varying lookahead distance, taking the previously analyzed factors into account. Simulation cases and an evaluation of the algorithm are presented. The results indicate that the proposed method significantly improves path tracking performance and is both effective and robust.
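
To illustrate why the lookahead distance matters, the following minimal sketch implements the standard textbook lookahead-based LOS steering law for a straight path segment, psi_d = gamma_p - atan(y_e / Delta), where y_e is the cross-track error and Delta is the lookahead distance. It is not the ALOS method proposed in the article; the function names, the waypoints, and the vehicle position are hypothetical and chosen only for illustration.

```python
import math


def cross_track_error(x, y, wp_a, wp_b):
    """Return (y_e, gamma_p) for a vehicle at (x, y) and the path wp_a -> wp_b.

    gamma_p is the path-tangential angle; y_e is the signed cross-track error
    expressed in the path-fixed frame.
    """
    gamma_p = math.atan2(wp_b[1] - wp_a[1], wp_b[0] - wp_a[0])
    y_e = (-(x - wp_a[0]) * math.sin(gamma_p)
           + (y - wp_a[1]) * math.cos(gamma_p))
    return y_e, gamma_p


def los_heading(x, y, wp_a, wp_b, lookahead):
    """Desired heading from the lookahead-based LOS law (illustrative only)."""
    if lookahead <= 0:
        raise ValueError("lookahead distance must be positive")
    y_e, gamma_p = cross_track_error(x, y, wp_a, wp_b)
    return gamma_p - math.atan(y_e / lookahead)


if __name__ == "__main__":
    # Hypothetical scenario: path along the x-axis, vehicle 5 m off the path.
    wp_a, wp_b = (0.0, 0.0), (100.0, 0.0)
    for delta in (5.0, 50.0):
        psi_d = los_heading(10.0, 5.0, wp_a, wp_b, delta)
        print(f"lookahead = {delta:5.1f} m, "
              f"desired heading = {math.degrees(psi_d):7.2f} deg")
```

For the same cross-track error, a short lookahead distance commands a steep correction toward the path, while a long one commands a gentle correction. A fixed value therefore has to compromise between fast convergence and smooth steering, which is the parameter the abstract proposes to adjust automatically.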

Data Availability

The data sets supporting the results of this article are included in the article.

Acknowledgements

This work is partially supported by the National Key Research and Development Program of China (Project No. 2016YFC0301400), the National Natural Science Foundation of China (Project No. 51379198), the National Natural Science Foundation of China (under grant No. 51809246), the National Natural Science Foundation of Shandong Province (under grant No. ZR2018QF003), and the Fundamental Research Funds for the Central Universities (Project No. 201961005).

Author information

Authors and Affiliations

School of Information Science and Engineering, Ocean University of China, Qingdao, Shandong, 266000, China
Dianrui Wang, Bo He, Yue Shen, Guangliang Li & Guanzhong Chen

Contributions

Dianrui Wang: Methodology, Resources, Software, Writing-Original Draft, Writing-Review & Editing; Bo He: Conceptualization, Investigation, Supervision, Project Administration, Funding Acquisition; Yue Shen: Project Administration; Guangliang Li: Writing-Review & Editing; Guanzhong Chen: Data Curation.

Corresponding author

Correspondence to Bo He.

Ethics declarations

Consent for Publication

Consent for publication was obtained from all participants.

Competing interests

The authors declare that they have no competing financial interests.

Additional information

Consent to Participate

We confirm that the manuscript has been read and approved by all named authors. We further confirm that the order of authors listed in the manuscript has been approved by all of us.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, D., He, B., Shen, Y. et al. A Modified ALOS Method of Path Tracking for AUVs with Reinforcement Learning Accelerated by Dynamic Data-Driven AUV Model. J Intell Robot Syst 104, 49 (2022). https://doi.org/10.1007/s10846-021-01504-0

Received: 19 November 2020
Accepted: 09 September 2021
Published: 04 March 2022
DOI: https://doi.org/10.1007/s10846-021-01504-0
