Journal of Mechanical Science and Technology, Volume 33, Issue 11, pp 5415–5423

Automatic control of cardiac ablation catheter with deep reinforcement learning method

  • Hyeonseok You
  • EunKyung Bae
  • Youngjin Moon
  • Jihoon Kweon
  • Jaesoon Choi


To reduce the radiation exposure of personnel during interventional procedures for arrhythmia, robots have been developed for use in such procedures, and studies on robotic control of electrophysiology catheters are ongoing. However, robotic catheter control has limited precision owing to external forces exerted on the catheter by blood flow and the heartbeat. This study implements a reinforcement learning method for automated robotic control of a catheter. Using reinforcement learning, this study aims to show that a robot can learn to manipulate a catheter to reach a target in a simulated environment and subsequently control a catheter in the actual environment. Randomization noise is applied during simulation to reduce the differences between the simulated and actual learning environments. Each environment is implemented with different movement values depending on the insertion angles and steps of the catheter model. When the model trained in simulation is deployed in the actual environment, the success rate of the catheter reaching the designated target is 73 %. The model trained with randomization noise increases the success rate to 87 %. These experiments verify that a model trained in simulation can be implemented in a robot system to control an actual catheter, and that the success rate of the model can be increased using randomization noise.
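The randomization-noise idea described above can be illustrated with a minimal sketch: during simulated training, the nominal per-step catheter movement is perturbed with Gaussian noise so the learned policy does not overfit the simulator's exact dynamics. The function names and parameter values here are illustrative assumptions, not taken from the paper.

```python
import random

def randomized_step(base_step_deg, noise_scale=0.1, rng=random):
    """Apply randomization noise to a simulated catheter bending step.

    base_step_deg: nominal bending-angle change per actuation step
    noise_scale: relative standard deviation of the Gaussian perturbation
    """
    return base_step_deg * (1.0 + rng.gauss(0.0, noise_scale))

def sample_episode_dynamics(nominal=2.0, noise_scale=0.1, n_steps=5, seed=0):
    """Sample perturbed step sizes for one training episode.

    Each episode sees slightly different dynamics, which is the essence of
    domain randomization for sim-to-real transfer.
    """
    rng = random.Random(seed)
    return [randomized_step(nominal, noise_scale, rng) for _ in range(n_steps)]
```

In this sketch each training episode draws its own perturbed movement values around the nominal 2.0-degree step, mimicking the unmodeled disturbances (blood flow, heartbeat) the real catheter experiences.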

Key words

Catheter control · Deep learning · Reinforcement learning · Catheter simulation


Nomenclature

Pr[X] : Probability of a random variable X
E[X] : Expectation of a random variable X
≐ : Equality relationship that is true by definition
argmax_x f(x) : A value of x at which f(x) takes its maximal value
S : Set of all states
A : Set of all actions
R : Set of all possible rewards
γ : Discount-rate parameter
a : An action
r : A reward
t : Discrete time step
S_t : State at time step t
A_t : Action at time step t
R_t : Reward at time step t
Q(s, a) : Expected value from state and action
A(s, a) : Expected advantage value from state and action under policy π
V(s) : Expected state value under policy π




Acknowledgments

This research was supported by a grant of the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (HI17C2410) and the Ministry of Trade, Industry and Energy, Republic of Korea (10077502).



Copyright information

© KSME & Springer 2019

Authors and Affiliations

  • Hyeonseok You (1, 2)
  • EunKyung Bae (1, 2)
  • Youngjin Moon (2, 3)
  • Jihoon Kweon (2, 3)
  • Jaesoon Choi (1, 2)

  1. Department of Biomedical Engineering, College of Medicine, University of Ulsan, Ulsan, Korea
  2. Biomedical Engineering Research Center, Asan Institute for Life Sciences, Asan Medical Center, Seoul, Korea
  3. Department of Convergence Medicine, College of Medicine, University of Ulsan, Ulsan, Korea
