
Deep Reinforcement Learning for autonomous pre-failure tool life improvement

Original Article
The International Journal of Advanced Manufacturing Technology

Abstract

This paper develops an approach to automatically improve a CNC machine’s tool performance and slow down its degradation rate during the Pre-Failure stage. A Deep Reinforcement Learning (DRL) agent is developed to optimize the machining process performance online during the Pre-Failure interval of the tool’s life. The Pre-Failure agent presented in the proposed approach tunes the feed rate according to the learned optimal policy in order to slow down the tool’s degradation rate while maintaining an acceptable Material Removal Rate (MRR). Machine learning and pattern recognition techniques are implemented to monitor and detect the tool’s potential failure level. The proposed mechanism is applied to a CNC machine when turning Titanium Metal Matrix Composites (TiMMC). A Digital Twin (DT) of the CNC machine is developed to emulate the physical machine in the digital environment, and it is validated against the physical machine’s measurements. The proposed Pre-Failure mechanism is a model-free approach that can be implemented in any machining process with little online computational effort. It is also validated over a wide range of spindle speeds, up to 15,000 RPM. Deployment of the proposed machine learning approach for this particular case study improves the tool’s Time to Failure (T2F) by 40% and the MRR by 6%, on average, compared to the classical approach.
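
As a purely conceptual illustration of the closed loop described above, the Python sketch below couples a degradation monitor with a feed-rate-tuning policy once the tool enters the Pre-Failure stage. It is a minimal sketch, not the authors’ implementation: read_forces, ntpi_from_patterns, and policy_act are hypothetical placeholders standing in for the paper’s sensor stream, pattern-recognition monitor, and trained DRL policy.

```python
# Conceptual sketch of a Pre-Failure feed-rate control loop (hypothetical names).
# Assumption: a monitor maps force signals to a Normalized Tool Performance
# degradation Index (NTPI), and a trained policy returns a feed-rate adjustment.
from dataclasses import dataclass
import random

@dataclass
class MachiningState:
    fx: float    # radial force (N)
    fy: float    # feed force (N)
    fz: float    # cutting force (N)
    feed: float  # current feed rate
    ntpi: float  # degradation index in [0, 1]

def read_forces() -> tuple[float, float, float]:
    """Placeholder for force signals streamed from the CNC machine or its digital twin."""
    return (random.uniform(50, 200), random.uniform(50, 200), random.uniform(100, 400))

def ntpi_from_patterns(fx: float, fy: float, fz: float) -> float:
    """Placeholder for the pattern-recognition monitor that estimates tool degradation."""
    return min(1.0, (fx + fy + fz) / 800.0)

def policy_act(state: MachiningState) -> float:
    """Placeholder for the trained DRL policy: returns a bounded feed-rate change."""
    return -0.05 * state.ntpi  # slow down more as degradation grows

def pre_failure_loop(initial_feed: float, ntpi_threshold: float = 0.6, steps: int = 100) -> None:
    feed = initial_feed
    for _ in range(steps):
        fx, fy, fz = read_forces()
        ntpi = ntpi_from_patterns(fx, fy, fz)
        state = MachiningState(fx, fy, fz, feed, ntpi)
        if ntpi >= ntpi_threshold:                       # tool entered the Pre-Failure stage
            feed = max(0.05, feed + policy_act(state))   # agent tunes the feed rate
        # otherwise keep the nominal feed rate to preserve MRR

if __name__ == "__main__":
    pre_failure_loop(initial_feed=0.15)
```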


Availability of data and materials

Data are available.

Code availability

Code is available.

Abbreviations

\({\alpha }\) : Binary index
\({\beta }\) : Scaling factor
\({\Delta (O)}\) : Multi-class discriminator function
\({\gamma }\) : Discount factor
\({\nabla _{\theta ^\mu }J}\) : Training gradient of the policy network
\({\pi ^*}\) : RL optimal policy
\({\tau }\) : Learning rate
\({\theta ^\mu }\) : Policy network parameters
\({\theta ^Q}\) : Q-network parameters
\({a_t}\) : Action at time t
A : Action space
\({F_x}\) : Radial force (N)
\({F_y}\) : Feed force (N)
\({F_z}\) : Cutting force (N)
f : Feed rate (rev/min)
\({L(\theta ^Q)}\) : Training loss of the Q-network
\({N_h}\) : Number of hidden neurons
\({N_s}\) : Number of data observations
\({N_t}\) : Random noise
NTPI : Normalized Tool Performance degradation Index
\({Q^*(s_t,a_t)}\) : Optimal Q-value
\({r_t}\) : Reward at time t
\({s_t}\) : State at time t
S : State space
\({VB_p}\) : Potential-failure tool wear
v : Cutting speed (m/min)
\({W_j}\) : Pattern coverage weight
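
For context, the symbols \({L(\theta ^Q)}\), \({\nabla _{\theta ^\mu }J}\), \({\gamma }\), \({\theta ^Q}\), and \({\theta ^\mu }\) follow the DDPG formulation of Lillicrap et al. [29]. The standard update rules from [29], written over a minibatch of N transitions with primed quantities denoting the target networks, are stated below as general background rather than as the exact equations of this paper:

\[ L(\theta ^Q) = \frac{1}{N}\sum _i \Big ( r_i + \gamma \, Q'\big (s_{i+1}, \mu '(s_{i+1} \mid \theta ^{\mu '}) \mid \theta ^{Q'}\big ) - Q(s_i, a_i \mid \theta ^Q) \Big )^2 \]

\[ \nabla _{\theta ^\mu } J \approx \frac{1}{N}\sum _i \nabla _a Q(s, a \mid \theta ^Q)\big |_{s=s_i,\, a=\mu (s_i)} \; \nabla _{\theta ^\mu } \mu (s \mid \theta ^\mu )\big |_{s=s_i} \]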

References

  1. Lee J, Ardakani HD, Yang S, Bagheri B (2015) Industrial big data analytics and cyber-physical systems for future maintenance & service innovation. Procedia CIRP 38:3–7. https://doi.org/10.1016/j.procir.2015.08.026 (Proceedings of the 4th International Conference on Through-life Engineering Services)

  2. Spielberg S, Tulsyan A, Lawrence NP, Loewen PD, Bhushan Gopaluni R (2019) Toward self-driving processes: a deep reinforcement learning approach to control. AIChE Journal 65(10):e16689. https://doi.org/10.1002/aic.16689

  3. Yacout S (2019) Industrial value chain research and applications for industry 4.0. In: 4th North America Conference on Industrial Engineering and Operations Management, Toronto, Canada

  4. Elsheikh A, Yacout S, Ouali MS, Shaban Y (2020) Failure time prediction using adaptive logical analysis of survival curves and multiple machining signals. J Intell Manuf 31(2):403–415. https://doi.org/10.1007/s10845-018-1453-4

  5. Xiong G, Li ZL, Ding Y, Zhu L (2020) Integration of optimized feedrate into an online adaptive force controller for robot milling. Int J Adv Manuf Technol 106(3):1533–1542. https://doi.org/10.1007/s00170-019-04691-1

  6. Abbas AT, Abubakr M, Elkaseer A, Rayes MME, Mohammed ML, Hegab H (2020) Towards an adaptive design of quality, productivity and economic aspects when machining aisi 4340 steel with wiper inserts. IEEE Access 8:159206–159219. https://doi.org/10.1109/ACCESS.2020.3020623

  7. Abbas AT, Sharma N, Anwar S, Hashmi FH, Jamil M, Hegab H (2019) Towards optimization of surface roughness and productivity aspects during high-speed machining of Ti-6Al-4V. Materials 12(22):3749. https://doi.org/10.3390/ma12223749

  8. Park HS, Tran NH (2014) Development of a smart machining system using self-optimizing control. Int J Adv Manuf Technol 74(9–12):1365–1380. https://doi.org/10.1007/s00170-014-6076-0

  9. Ridwan F, Xu X, Liu G (2012) A framework for machining optimisation based on STEP-NC. J Intell Manuf 23(3):423–441. https://doi.org/10.1007/s00170-014-6076-0

  10. Stemmler S, Abel D, Adams O, Klocke F (2016) Model predictive feed rate control for a milling machine. IFAC-PapersOnLine 49(12):11–16. https://doi.org/10.1016/j.ifacol.2016.07.542

  11. Shaban Y, Aramesh M, Yacout S, Balazinski M, Attia H, Kishawy H (2014) Optimal replacement of tool during turning titanium metal matrix composites. In: Proceedings of the 2014 Industrial and Systems Engineering Research Conference

  12. Shaban Y, Meshreki M, Yacout S, Balazinski M, Attia H (2017) Process control based on pattern recognition for routing carbon fiber reinforced polymer. J Intell Manuf 28(1):165–179. https://doi.org/10.1007/s10845-014-0968-6

  13. Shaban Y, Yacout S, Balazinski M (2015) Tool wear monitoring and alarm system based on pattern recognition with logical analysis of data. J Manuf Sci Eng 137(4). https://doi.org/10.1115/1.4029955

  14. Sadek A, Hassan M, Attia M (2020) A new cyber-physical adaptive control system for drilling of hybrid stacks. CIRP Ann 69(1):105–108. https://doi.org/10.1016/j.cirp.2020.04.039

  15. Shaban Y, Aramesh M, Yacout S, Balazinski M, Attia H, Kishawy H (2017) Optimal replacement times for machining tool during turning titanium metal matrix composites under variable machining conditions. Proc Inst Mech Eng B J Eng Manuf 231(6):924–932. https://doi.org/10.1177/0954405415577591

  16. Taha HA, Yacout S, Shaban Y (2022) Autonomous self-healing mechanism for a CNC milling machine based on pattern recognition. J Intell Manuf 1–21. https://doi.org/10.1007/s10845-022-01913-4

  17. Ma Y, Zhu W, Benton MG, Romagnoli J (2019) Continuous control of a polymerization system with deep reinforcement learning. J Process Control 75:40–47. https://doi.org/10.1016/j.jprocont.2018.11.004

  18. Zhang Y, Li Y, Xu K (2022) Reinforcement learning-based tool orientation optimization for five-axis machining. Int J Adv Manuf Technol 119(11):7311–7326. https://doi.org/10.1007/s00170-022-08668-5

  19. Xiao Q, Li C, Tang Y, Li L (2021) Meta-reinforcement learning of machining parameters for energy-efficient process control of flexible turning operations. IEEE Trans Autom Sci Eng 18(1):5–18. https://doi.org/10.1109/TASE.2019.2924444

  20. Ochella S, Shafiee M, Sansom C (2021) Adopting machine learning and condition monitoring P-F curves in determining and prioritizing high-value assets for life extension. Expert Syst Appl 176:114897. https://doi.org/10.1016/j.eswa.2021.114897

  21. Bennane A, Yacout S (2012) LAD-CBM; new data processing tool for diagnosis and prognosis in condition-based maintenance. J Intell Manuf 23(2):265–275. https://doi.org/10.1007/s10845-009-0349-8

  22. Singh G, Gupta MK, Mia M, Sharma VS (2018) Modeling and optimization of tool wear in MQL-assisted milling of Inconel 718 superalloy using evolutionary techniques. Int J Adv Manuf Technol 97(1):481–494. https://doi.org/10.1007/s00170-018-1911-3

  23. M’Saoubi R, Axinte D, Soo SL, Nobel C, Attia H, Kappmeyer G, Engin S, Sim WM (2015) High performance cutting of advanced aerospace alloys and composite materials. CIRP Ann 64(2):557–580. https://doi.org/10.1016/j.cirp.2015.05.002

  24. Aramesh M, Shaban Y, Yacout S, Attia M, Kishawy H, Balazinski M (2016) Survival life analysis applied to tool life estimation with variable cutting conditions when machining titanium metal matrix composites (TI-MMCS). Mach Sci Technol 20(1):132–147. https://doi.org/10.1080/10910344.2015.1133916

  25. Montgomery DC (2007) Introduction to statistical quality control. John Wiley & Sons

  26. Satopaa V, Albrecht J, Irwin D, Raghavan B (2011) Finding a “kneedle” in a haystack: detecting knee points in system behavior. In: 2011 31st International Conference on Distributed Computing Systems Workshops. pp 166–171. https://doi.org/10.1109/ICDCSW.2011.20

  27. Lejeune M, Lozin V, Lozina I, Ragab A, Yacout S (2019) Recent advances in the theory and practice of logical analysis of data. Eur J Oper Res 275(1):1–15. https://doi.org/10.1016/j.ejor.2018.06.011

  28. Shaban Y, Yacout S, Balazinski M, Jemielniak K (2017) Cutting tool wear detection using multiclass logical analysis of data. Mach Sci Technol 21(4):526–541. https://doi.org/10.1080/10910344.2017.1336177

  29. Lillicrap TP, Hunt JJ, Pritzel A, Heess N, Erez T, Tassa Y, Silver D, Wierstra D (2015) Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971

  30. Barde SR, Yacout S, Shin H (2019) Optimal preventive maintenance policy based on reinforcement learning of a fleet of military trucks. J Intell Manuf 30(1):147–161. https://doi.org/10.1007/s10845-016-1237-7

  31. Shafto M, Conroy M, Doyle R, Glaessgen E, Kemp C, LeMoigne J, Wang L (2012) Modeling, simulation, information technology & processing roadmap. National Aeronautics and Space Administration 32(2012):1–38

  32. Taha HA, Sakr AH, Yacout S (n.d.) Aircraft engine remaining useful life prediction framework for industry 4.0

  33. Yao J, Lu B, Zhang J (2022) Tool remaining useful life prediction using deep transfer reinforcement learning based on long short-term memory networks. Int J Adv Manuf Technol 118(3):1077–1086. https://doi.org/10.1007/s00170-021-07950-2

  34. Glaessgen E, Stargel D (2012) The digital twin paradigm for future NASA and US Air Force vehicles. In: 53rd AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics and Materials Conference 20th AIAA/ASME/AHS Adaptive Structures Conference 14th AIAA. p 1818. https://doi.org/10.2514/6.2012-1818

  35. AboElHassan A, Sakr A, Yacout S (2021) A framework for digital twin deployment in production systems. In: Weißgraeber P, Heieck F, Ackermann C (eds) Advances in Automotive Production Technology - Theory and Application. Springer, Berlin, Heidelberg, pp 145–152

  36. Qi Q, Tao F, Hu T, Anwer N, Liu A, Wei Y, Wang L, Nee A (2021) Enabling technologies and tools for digital twin. J Manuf Syst 58:3–21. https://doi.org/10.1016/j.jmsy.2019.10.001

  37. Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT press

  38. Hagan MT, Demuth HB, Beale MH et al (2002) Neural network design. University of Colorado at Boulder

  39. Brownlee J (2016) Deep learning with Python: develop deep learning models on Theano and TensorFlow using Keras. Machine Learning Mastery

  40. Traue A, Book G, Kirchgässner W, Wallscheid O (2022) Toward a reinforcement learning environment toolbox for intelligent electric motor control. IEEE Trans Neural Netw Learn Syst 33(3):919–928. https://doi.org/10.1109/TNNLS.2020.3029573

Acknowledgements

The authors acknowledge the support of the Natural Sciences and Engineering Research Council of Canada (NSERC). The authors would like to thank Dr. Hussien Hegab for his valuable discussions and his support of this research.

Funding

This research was funded by the Natural Sciences and Engineering Research Council of Canada (NSERC), Prof. Soumaya Yacout, grant reference RGPIN-05785-2017.

Author information

Corresponding author

Correspondence to Hussein A. Taha.

Ethics declarations

Ethics approval

The authors confirm that this work does not contain any studies with human participants performed by any of the authors.

Consent to participate

Not applicable.

Consent for publication

The author grants the Publisher the sole and exclusive license of the full copyright in the Contribution, which license the Publisher hereby accepts.

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1

1.1 Pre-Failure DDPG agent architecture and hyperparameters

The proposed Pre-Failure DDPG architecture consists of two deep actor-critic networks with two hidden layers, using the hyperparameters listed in Table A. DRL performance depends on these hyperparameters, which are related to the dimensionality of the environment’s (application’s) action and state spaces [2, 17, 29, 40]. The hyperparameters are adopted from the literature on computer game applications whose action and sensor dimensions match those of the CNC turning machine [29]. Figure 16 shows the developed Pre-Failure agent architecture; a minimal illustrative sketch of such an actor-critic pair is given after the figure.

Table A DDPG hyperparameters

Hidden neurons: 256/128
Discount factor: 0.995
Batch size: 128
Learning rate (critic): 1e-4
Learning rate (actor): 1e-4
Target update rate: 1e-3
Memory size: 1e6

Fig. 16 Pre-Failure DDPG agent architecture for CNC tool performance
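
As a companion to Table A and Fig. 16, the sketch below builds an actor-critic pair with two hidden layers of 256 and 128 neurons and the listed learning, discount, and target-update rates. It is a minimal PyTorch sketch under assumed state and action sizes (three force signals in, one feed-rate action out, chosen for illustration only) and is not the authors’ implementation.

```python
# Minimal DDPG-style actor-critic pair using the hyperparameters of Table A.
# Assumptions: PyTorch; a 3-dimensional state and a 1-dimensional action
# (illustrative sizes only, not taken from the paper).
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 3, 1
H1, H2 = 256, 128            # hidden neurons (Table A: 256/128)
GAMMA = 0.995                # discount factor
BATCH_SIZE = 128
LR_CRITIC = 1e-4
LR_ACTOR = 1e-4
TAU = 1e-3                   # target update rate
MEMORY_SIZE = int(1e6)       # replay buffer capacity

class Actor(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, H1), nn.ReLU(),
            nn.Linear(H1, H2), nn.ReLU(),
            nn.Linear(H2, ACTION_DIM), nn.Tanh(),  # bounded action in [-1, 1]
        )
    def forward(self, state):
        return self.net(state)

class Critic(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, H1), nn.ReLU(),
            nn.Linear(H1, H2), nn.ReLU(),
            nn.Linear(H2, 1),                      # Q-value estimate
        )
    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

def soft_update(target: nn.Module, source: nn.Module, tau: float = TAU) -> None:
    """Polyak averaging of target network parameters (standard DDPG practice)."""
    with torch.no_grad():
        for t, s in zip(target.parameters(), source.parameters()):
            t.mul_(1.0 - tau).add_(tau * s)

actor, critic = Actor(), Critic()
actor_target, critic_target = Actor(), Critic()
actor_target.load_state_dict(actor.state_dict())
critic_target.load_state_dict(critic.state_dict())
actor_opt = torch.optim.Adam(actor.parameters(), lr=LR_ACTOR)
critic_opt = torch.optim.Adam(critic.parameters(), lr=LR_CRITIC)
```

In a full DDPG training loop, soft_update would be applied to both target networks after each minibatch update, with transitions drawn from a replay buffer of capacity MEMORY_SIZE.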

Appendix 2

1.1 Figures of the other runs

This section presents the detailed results of Runs II, III, IV, and V.

1.1.1 Run II: spindle speed 7500 RPM

The Pre-Failure agent interaction in this run is demonstrated in Fig. 17.

Fig. 17 Pre-Failure agent interaction in Run II, spindle speed 7500 RPM

1.1.2 Run III: spindle speed 10,000 RPM

The Pre-Failure agent interaction in this run is given in Fig. 18.

Fig. 18 Pre-Failure agent interaction in Run III, spindle speed 10,000 RPM

1.1.3 Run IV: spindle speed 12,500 RPM

The Pre-Failure agent interaction in this run is depicted in Fig. 19.

Fig. 19 Pre-Failure agent interaction in Run IV, spindle speed 12,500 RPM

1.1.4 Run V: spindle speed 15,000 RPM

The Pre-Failure agent interaction in this run is shown in Fig. 20.

Fig. 20 Pre-Failure agent interaction in Run V, spindle speed 15,000 RPM


Cite this article

Taha, H.A., Yacout, S. & Shaban, Y. Deep Reinforcement Learning for autonomous pre-failure tool life improvement. Int J Adv Manuf Technol 121, 6169–6192 (2022). https://doi.org/10.1007/s00170-022-09700-4
