Towards autonomous and optimal excavation of shield machine: a deep reinforcement learning-based approach

  • Research Article
Journal of Zhejiang University-SCIENCE A

Abstract

Autonomous excavation is a major trend in the development of a new generation of intelligent tunnel boring machines (TBMs). However, existing technologies are limited to supervised machine learning and static optimization, which can neither outperform human operation nor cope with ever-changing geological conditions and long-term performance measures. The aim of this study is to solve the dynamic optimization problem of shield excavation performance and to achieve autonomous optimal excavation. A novel autonomous optimal excavation approach that integrates deep reinforcement learning and optimal control is proposed for shield machines. Based on a first-principles analysis of the machine-ground interaction dynamics of the excavation process, a deep neural network model is developed using construction field data consisting of 1.1 million samples. The multi-system coupling mechanism is revealed by establishing an overall system model. Based on the overall system analysis, the autonomous optimal excavation problem is decomposed into a multi-objective dynamic optimization problem and an optimal control problem. Subsequently, a dimensionless multi-objective comprehensive excavation performance measure is proposed. A deep reinforcement learning method is used to solve for the optimal action sequence, and optimal closed-loop feedback controllers are designed to execute it accurately. The performance of the proposed approach is compared with that of human operation using the construction field data. The simulation results show that the proposed approach not only has the potential to replace human operation but can also significantly improve the comprehensive excavation performance.
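The decomposition described in the abstract, in which a decision layer produces an action sequence and closed-loop feedback controllers execute it, can be illustrated with a toy example. The sketch below is not the authors' controller design, which this page does not describe. It assumes a hypothetical first-order plant, a plain PI control law, and hand-picked gains (`kp`, `ki`), and it treats the list `setpoints` as a stand-in for the action sequence that a trained agent would supply.

```python
from typing import List, Sequence


def track_setpoints(
    setpoints: Sequence[float],
    kp: float = 4.0,
    ki: float = 4.0,
    dt: float = 0.1,
    tau: float = 1.0,
    steps_per_setpoint: int = 100,
) -> List[float]:
    """Track a piecewise-constant setpoint sequence with a PI loop.

    Assumption: the plant is a toy first-order lag, dy/dt = (u - y) / tau,
    integrated with forward Euler. It is NOT a shield machine model.
    """
    y = 0.0          # plant output (e.g. a normalized actuator quantity)
    integral = 0.0   # accumulated tracking error for the integral term
    history: List[float] = []
    for target in setpoints:
        for _ in range(steps_per_setpoint):
            error = target - y
            integral += error * dt
            u = kp * error + ki * integral   # PI control law
            y += dt * (u - y) / tau          # forward-Euler plant update
            history.append(y)
    return history


if __name__ == "__main__":
    # Hypothetical two-step action sequence from a decision layer.
    trace = track_setpoints([1.0, 0.5])
    print(f"output after first setpoint:  {trace[99]:.4f}")
    print(f"output after second setpoint: {trace[-1]:.4f}")
```

The point of the split is that the decision layer only has to choose targets, while the feedback loop absorbs the tracking error at each step.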

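Summary

Objective

Autonomous excavation is the development trend for the new generation of intelligent tunnel boring machines (TBMs). However, existing technologies are limited to supervised machine learning and static optimization; their performance cannot exceed that of human operation, and they struggle to handle ever-changing geological conditions and long-term excavation performance measures. This paper aims to solve the dynamic optimization problem of shield machine excavation performance and to achieve autonomous optimal excavation.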

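Innovations

1. For the machine-environment interaction dynamics of the excavation process, a high-accuracy hybrid modeling method that combines first-principles analysis with a deep neural network is proposed, which improves model interpretability and simplifies feature selection.
2. A dimensionless multi-objective comprehensive excavation performance measure suited to intelligent shield machine operating systems is proposed.
3. An autonomous optimal excavation method for shield machines that combines deep learning with optimal control is proposed, enabling intelligent decision-making on excavation parameters and multi-objective dynamic optimization of long-term comprehensive excavation performance.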

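Methods

1. Through theoretical derivation, the multi-system coupling relationships of the excavation process are revealed, yielding the two degrees of freedom for designing the autonomous optimal excavation system (Fig. 8).
2. Through hybrid modeling driven jointly by mechanism and data, a high-accuracy training environment for the deep reinforcement learning agent is constructed.
3. Through simulation with construction field data, the performance of the autonomous optimal excavation system is compared with that of human operation, verifying the feasibility and effectiveness of the proposed method (Figs. 11-13).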

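Conclusions

1. When human operators decide on excavation parameters, the relative weight ratio of specific excavation rate to specific excavation energy consumption is close to 6:4.
2. Different geological conditions call for different parameter decision strategies: in regular geology, an autonomous optimal excavation system with a higher k1 value should be used, whereas in difficult geology, where the specific excavation rate drops markedly, a system with a higher k2 value should be used.
3. Although training a deep reinforcement learning agent is very time-consuming, it still has great advantages over training a skilled shield machine operator.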
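To make the role of the weights k1 and k2 concrete, the following sketch scores two hypothetical parameter choices under different weightings. The functional form is an assumption for illustration only, since this page does not give the paper's actual performance measure. Here the score is `k1 * rate_ratio + k2 / energy_ratio`, where both ratios are taken relative to a reference operating point so that the terms are dimensionless. The candidate values and the names `comprehensive_score`, `pick_best`, and `CANDIDATES` are likewise hypothetical.

```python
from typing import Dict, Tuple


def comprehensive_score(
    rate_ratio: float, energy_ratio: float, k1: float = 0.6, k2: float = 0.4
) -> float:
    """Assumed weighted dimensionless score (higher is better).

    rate_ratio:   excavation rate divided by a reference rate.
    energy_ratio: energy consumption divided by a reference consumption.
    The default weights mirror the roughly 6:4 ratio reported for human
    operators; the formula itself is an illustrative assumption.
    """
    if rate_ratio <= 0 or energy_ratio <= 0:
        raise ValueError("ratios must be positive")
    return k1 * rate_ratio + k2 / energy_ratio


def pick_best(
    candidates: Dict[str, Tuple[float, float]], k1: float, k2: float
) -> str:
    """Return the name of the candidate with the highest score."""
    return max(
        candidates,
        key=lambda name: comprehensive_score(*candidates[name], k1=k1, k2=k2),
    )


# Hypothetical candidates: (rate_ratio, energy_ratio).
CANDIDATES = {
    "fast": (1.3, 1.25),    # advances faster but uses more energy
    "frugal": (0.9, 0.8),   # advances slower but uses less energy
}

if __name__ == "__main__":
    print(pick_best(CANDIDATES, k1=0.6, k2=0.4))  # rate-weighted choice
    print(pick_best(CANDIDATES, k1=0.3, k2=0.7))  # energy-weighted choice
```

With these made-up numbers, the rate-heavy weighting prefers the faster candidate and the energy-heavy weighting prefers the more frugal one, which is the kind of trade-off the second conclusion describes.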

Acknowledgments

This work is supported by the National Key Research and Development Program of China (Nos. 2020YFF0218004 and 2020YFF0218003) and the National Natural Science Foundation of China (No. 52105074). The authors give special thanks to the China Railway Engineering Equipment Group Co., Ltd. for providing construction field data.

Author information

Corresponding author

Correspondence to Guo-fang Gong (龚国芳).

Additional information

Electronic Supplementary Materials

The pseudo-code for the implementation of the training environment.

Author contributions

Ya-kun ZHANG and Guo-fang GONG designed the research and wrote the first draft of the manuscript. Yu-xi CHEN conducted the literature review. Geng-lin CHEN helped to organize the manuscript. Hua-yong YANG revised the final version and provided the funding support.

Conflict of interest

Ya-kun ZHANG, Guo-fang GONG, Hua-yong YANG, Yu-xi CHEN, and Geng-lin CHEN declare that they have no conflict of interest.

About this article

Cite this article

Zhang, Yk., Gong, Gf., Yang, Hy. et al. Towards autonomous and optimal excavation of shield machine: a deep reinforcement learning-based approach. J. Zhejiang Univ. Sci. A 23, 458–478 (2022). https://doi.org/10.1631/jzus.A2100325

  • DOI: https://doi.org/10.1631/jzus.A2100325
