Towards autonomous and optimal excavation of shield machine: a deep reinforcement learning-based approach

Zhang, Ya-kun; Gong, Guo-fang; Yang, Hua-yong; Chen, Yu-xi; Chen, Geng-lin

doi:10.1631/jzus.A2100325

Research Article
Published: 07 July 2022

Volume 23, pages 458–478 (2022)
Journal of Zhejiang University-SCIENCE A

Ya-kun Zhang (张亚坤)¹,
Guo-fang Gong (龚国芳)¹ (ORCID: orcid.org/0000-0001-9553-8783),
Hua-yong Yang (杨华勇)¹,
Yu-xi Chen (陈玉羲)¹ &
Geng-lin Chen (陈更林)²

Abstract

Autonomous excavation operation is a major trend in the development of a new generation of intelligent tunnel boring machines (TBMs). However, existing technologies are limited to supervised machine learning and static optimization, which can neither outperform human operation nor deal with ever-changing geological conditions and long-term performance measures. This study aims to resolve the problem of dynamic optimization of shield excavation performance and to achieve autonomous optimal excavation. A novel autonomous optimal excavation approach that integrates deep reinforcement learning and optimal control is proposed for shield machines. Based on a first-principles analysis of the machine-ground interaction dynamics of the excavation process, a deep neural network model is developed using construction field data consisting of 1.1 million samples. The multi-system coupling mechanism is revealed by establishing an overall system model. Based on the overall system analysis, the autonomous optimal excavation problem is decomposed into a multi-objective dynamic optimization problem and an optimal control problem. Subsequently, a dimensionless multi-objective comprehensive excavation performance measure is proposed. A deep reinforcement learning method is used to solve for the optimal action sequence, and optimal closed-loop feedback controllers are designed to execute it accurately. The performance of the proposed approach is compared with that of human operation using the construction field data. The simulation results show that the proposed approach not only has the potential to replace human operation but can also significantly improve the comprehensive excavation performance.
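
The decomposition described in the abstract can be pictured as two layers: a decision layer that chooses what the machine should do, and an execution layer of closed-loop feedback controllers that makes the machine do it. The following is a minimal sketch of the execution layer only, under stated assumptions: the first-order plant, the PI controller, its gains, and the names `PIController` and `track_setpoint` are hypothetical illustrations, not the controllers or models designed in the paper. The setpoint passed in stands in for an action that the learning agent would output.

```python
from dataclasses import dataclass


@dataclass
class PIController:
    """Textbook PI controller (hypothetical stand-in for the paper's controllers)."""
    kp: float
    ki: float
    dt: float
    integral: float = 0.0

    def step(self, setpoint: float, measured: float) -> float:
        # Accumulate the tracking error and return the control signal.
        error = setpoint - measured
        self.integral += error * self.dt
        return self.kp * error + self.ki * self.integral


def track_setpoint(setpoint: float, steps: int = 2000,
                   dt: float = 0.01, tau: float = 0.5) -> float:
    """Simulate a toy first-order plant, tau * dy/dt = -y + u, under PI control.

    The plant is an assumption chosen for illustration; it is not the
    machine-ground interaction model described in the abstract.
    """
    controller = PIController(kp=2.0, ki=4.0, dt=dt)
    y = 0.0
    for _ in range(steps):
        u = controller.step(setpoint, y)
        y += dt * (-y + u) / tau  # forward-Euler integration of the plant
    return y


if __name__ == "__main__":
    # The setpoint plays the role of one action chosen by the decision layer.
    print(track_setpoint(40.0))
```

With these illustrative gains the closed loop is stable, so the simulated output settles at the requested setpoint. In a two-layer design of this kind, the decision layer can be trained without modeling actuator-level transients, provided the execution layer tracks its setpoints closely.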

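Summary

Objective

Autonomous excavation is the development trend for a new generation of intelligent tunnel boring machines (TBMs). However, existing technologies are limited to supervised machine learning and static optimization; their performance cannot surpass human operation, and they struggle to handle ever-changing geological conditions and long-term excavation performance measures. This paper aims to solve the dynamic optimization problem of shield machine excavation performance and to achieve autonomous optimal excavation.

Innovations

1. For the shield machine-environment interaction dynamics of the excavation process, a high-accuracy hybrid modeling method that combines first-principles analysis with a deep neural network is proposed, which improves model interpretability and simplifies feature selection.
2. A dimensionless multi-objective comprehensive excavation performance measure suited to intelligent shield machine operating systems is proposed.
3. An autonomous optimal excavation method for shield machines that combines deep learning with optimal control is proposed, achieving intelligent decision-making on excavation parameters and multi-objective dynamic optimization of long-term comprehensive excavation performance.

Methods

1. Through theoretical derivation, the multi-system coupling relationships of the excavation process are revealed, yielding the two degrees of freedom for the design of the autonomous optimal excavation system (Fig. 8).
2. Through hybrid modeling driven jointly by mechanism and data, a high-accuracy training environment for the deep reinforcement learning agent is constructed.
3. Through simulation using construction field data, the performance of the autonomous optimal excavation system is compared with that of human operation, verifying the feasibility and effectiveness of the proposed method (Figs. 11–13).

Conclusions

1. When human operators decide on excavation parameters, the relative weight ratio between specific excavation speed and specific excavation energy consumption is close to 6:4.
2. Different geological conditions call for different decision strategies for excavation parameters: ordinary geology should use an autonomous optimal excavation system with a higher k1 value, whereas difficult geology, in which the specific excavation speed drops markedly, should use one with a higher k2 value.
3. Although training a deep reinforcement learning agent is very time-consuming, it still has great advantages over training a skilled shield machine operator.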
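
To illustrate how the relative weighting of speed and energy consumption can change which operating point is preferred, here is a minimal sketch of a weighted, dimensionless score. Everything in it is an assumption made for illustration: the linear form, the reading of k1 as the weight on specific excavation speed and k2 as the weight on specific excavation energy consumption, the reference values, and the three candidate operating points are hypothetical and are not taken from the paper.

```python
def performance(speed: float, energy: float, k1: float = 0.6, k2: float = 0.4,
                speed_ref: float = 1.0, energy_ref: float = 1.0) -> float:
    """Hypothetical dimensionless score: reward speed, penalize energy use.

    Dividing by reference values makes both terms dimensionless, so the
    weights k1 and k2 can be compared directly.
    """
    return k1 * speed / speed_ref - k2 * energy / energy_ref


# Made-up candidate operating points: (specific speed, specific energy).
candidates = {
    "fast": (1.2, 1.5),
    "balanced": (1.0, 1.0),
    "frugal": (0.7, 0.6),
}


def best(k1: float, k2: float) -> str:
    """Return the name of the candidate with the highest score for given weights."""
    return max(candidates,
               key=lambda name: performance(*candidates[name], k1=k1, k2=k2))


if __name__ == "__main__":
    for k1, k2 in [(0.8, 0.2), (0.6, 0.4), (0.3, 0.7)]:
        print(k1, k2, best(k1, k2))
```

For these made-up numbers, a speed-heavy weighting selects the fast candidate, a 6:4 weighting selects the balanced one, and an energy-heavy weighting selects the frugal one. This shows only that the choice of weights shifts the preferred trade-off; it does not reproduce the paper's measure or its results.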

Acknowledgments

This work is supported by the National Key Research and Development Program of China (Nos. 2020YFF0218004 and 2020YFF0218003) and the National Natural Science Foundation of China (No. 52105074). The authors give special thanks to the China Railway Engineering Equipment Group Co., Ltd. for providing construction field data.

Author information

Authors and Affiliations

State Key Laboratory of Fluid Power and Mechatronic Systems, Zhejiang University, Hangzhou, 310027, China
Ya-kun Zhang (张亚坤), Guo-fang Gong (龚国芳), Hua-yong Yang (杨华勇) & Yu-xi Chen (陈玉羲)
School of Electrical and Power Engineering, China University of Mining and Technology, Xuzhou, 221116, China
Geng-lin Chen (陈更林)

Corresponding author

Correspondence to Guo-fang Gong (龚国芳).

Additional information

Electronic Supplementary Materials

The pseudo-code for the implementation of the training environment.

Author contributions

Ya-kun ZHANG and Guo-fang GONG designed the research and wrote the first draft of the manuscript. Yu-xi CHEN conducted the literature review. Geng-lin CHEN helped to organize the manuscript. Hua-yong YANG revised the final version and provided the funding support.

Conflict of interest

Ya-kun ZHANG, Guo-fang GONG, Hua-yong YANG, Yu-xi CHEN, and Geng-lin CHEN declare that they have no conflict of interest.
