ROS Rescue: Fault Tolerance System for Robot Operating System

Robot Operating System (ROS)

Part of the book series: Studies in Computational Intelligence (SCI, volume 895)

Abstract

This tutorial chapter addresses the problem of master failure in ROS 1.0 and its impact on real-world robotic deployments. We outline, design and demonstrate a fault-tolerance mechanism for recovering from a ROS master failure. Previous solutions rely on primary-backup replication or external checkpointing libraries, both of which are resource demanding; our mechanism instead adds lightweight functionality to the ROS master itself so that it can recover from failure. We present a modified version of the ROS master equipped with a logging mechanism that records the meta information and network state of ROS nodes, together with a recovery mechanism that restores the previous state without aborting or restarting all the nodes. We also implement an additional master monitor node that detects master failure by polling the master for its availability. Our code is implemented in Python, and preliminary tests were conducted successfully on a variety of land, aerial and underwater robots and a teleoperated computer running ROS Kinetic on Ubuntu 16.04. The code is publicly available under a Creative Commons license on GitHub at https://github.com/PushyamiKaveti/fault-tolerant-ros-master.
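The polling-based failure detection described in the abstract can be illustrated with a minimal sketch. It is not the authors' monitor node: the function names, the `/master_monitor` caller ID, the timeout, and the "three consecutive misses" rule are all assumptions made for illustration, and the code is written for Python 3 rather than the Python 2 environment of ROS Kinetic. The only ROS-specific facts it relies on are that the ROS 1 master is an XML-RPC server (by default at port 11311) and that its Master API exposes `getUri(caller_id)`.

```python
import http.client
import time
import xmlrpc.client

DEFAULT_MASTER_URI = "http://localhost:11311"  # default ROS 1 master address


class TimeoutTransport(xmlrpc.client.Transport):
    """XML-RPC transport whose HTTP connection gives up after `timeout` seconds."""

    def __init__(self, timeout):
        super().__init__()
        self._timeout = timeout

    def make_connection(self, host):
        conn = super().make_connection(host)
        conn.timeout = self._timeout  # applied when the socket is opened
        return conn


def is_master_alive(master_uri=DEFAULT_MASTER_URI, timeout=1.0,
                    caller_id="/master_monitor"):
    """Return True if the master answers a getUri() call (hypothetical helper)."""
    proxy = xmlrpc.client.ServerProxy(master_uri,
                                      transport=TimeoutTransport(timeout))
    try:
        # Master API calls return a (code, statusMessage, value) triple.
        code, _status, _uri = proxy.getUri(caller_id)
    except (OSError, http.client.HTTPException, xmlrpc.client.Error):
        # Connection refused, timeout, or malformed reply: treat as unavailable.
        return False
    return code == 1  # 1 means success in the ROS XML-RPC convention


def wait_for_failure(probe, period=1.0, max_misses=3, sleep=time.sleep):
    """Poll `probe` until it fails `max_misses` times in a row, then return.

    Requiring consecutive misses is an assumed policy that avoids reacting
    to a single dropped request.
    """
    misses = 0
    while True:
        if probe():
            misses = 0
        else:
            misses += 1
            if misses >= max_misses:
                return
        sleep(period)
```

A monitor built on this sketch would call `wait_for_failure(is_master_alive)` and, once it returns, trigger whatever restart and recovery procedure is in use; that step is deliberately left out because the abstract does not describe it.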

Corresponding author

Correspondence to Pushyami Kaveti.

Copyright information

© 2021 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Kaveti, P., Singh, H. (2021). ROS Rescue: Fault Tolerance System for Robot Operating System. In: Koubaa, A. (eds) Robot Operating System (ROS). Studies in Computational Intelligence, vol 895. Springer, Cham. https://doi.org/10.1007/978-3-030-45956-7_12
