Staggered Checkpointing and Recovery in Cluster Based Mobile Ad Hoc Networks

  • Conference paper
Advances in Parallel Distributed Computing (PDCTA 2011)

Abstract

Checkpointing uses the stable storage available in a distributed system to save consistent states of processes, to which they can roll back at recovery time. However, checkpointing techniques designed for wired and cellular mobile systems are not trivially applicable to ad hoc networks, because these networks have limited stable storage and low-bandwidth wireless links. Moreover, if synchronous checkpointing is employed, the processes contend for these limited resources at checkpointing time. This paper addresses the application of checkpointing to ad hoc networks and proposes a staggered approach that avoids simultaneous contention for resources. Staggering causes events that would normally happen at the same time to start or happen at different times. The proposed protocol does not need FIFO channels and logs a minimum number of messages. It supports concurrent checkpoint initiation and successfully handles overlapping failures in ad hoc networks.
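
The idea of staggering can be illustrated with a minimal sketch. This is not the protocol proposed in the paper, whose details are not given in the abstract; it only contrasts a synchronous schedule, where every process writes its checkpoint in the same time slot, with a staggered one, where each process gets its own slot. The function names, the slot model, and the process identifiers are all assumptions made for illustration.

```python
from collections import Counter


def synchronous_schedule(process_ids, start=0):
    """Hypothetical baseline: every process checkpoints in the same slot."""
    return {pid: start for pid in process_ids}


def staggered_schedule(process_ids, start=0, slot=1):
    """Hypothetical staggering: process i checkpoints at start + i * slot."""
    return {pid: start + i * slot for i, pid in enumerate(process_ids)}


def peak_contention(schedule):
    """Largest number of processes writing to stable storage in one slot."""
    return max(Counter(schedule.values()).values())


if __name__ == "__main__":
    nodes = ["p0", "p1", "p2", "p3"]  # illustrative process names
    print(peak_contention(synchronous_schedule(nodes)))  # 4
    print(peak_contention(staggered_schedule(nodes)))    # 1
```

In this toy model the staggered schedule lowers peak contention for storage from the number of processes to one, at the cost of a longer overall checkpointing interval. A real protocol must also keep the resulting global checkpoint consistent, which this sketch does not attempt.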

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jaggi, P.K., Singh, A.K. (2011). Staggered Checkpointing and Recovery in Cluster Based Mobile Ad Hoc Networks. In: Nagamalai, D., Renault, E., Dhanuskodi, M. (eds) Advances in Parallel Distributed Computing. PDCTA 2011. Communications in Computer and Information Science, vol 203. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24037-9_13

  • DOI: https://doi.org/10.1007/978-3-642-24037-9_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24036-2

  • Online ISBN: 978-3-642-24037-9

  • eBook Packages: Computer Science, Computer Science (R0)
