High-throughput state-machine replication using software transactional memory

The Journal of Supercomputing

Abstract

State-machine replication is a common way of constructing general-purpose fault-tolerant systems. To ensure replica consistency, requests must be executed sequentially according to some total order at all non-faulty replicas. Unfortunately, this can severely limit system throughput. This issue has been partially addressed by identifying non-conflicting requests based on application semantics and executing these requests concurrently. However, identifying and tracking non-conflicting requests requires intimate knowledge of application design and implementation, and a custom fault tolerance solution developed for one application cannot be easily adopted by other applications. Software transactional memory offers a new way of constructing concurrent programs. In this article, we present the mechanisms needed to retrofit existing concurrency control algorithms designed for software transactional memory for state-machine replication. The main benefit of using software transactional memory in state-machine replication is that general-purpose concurrency control mechanisms can be designed without deep knowledge of application semantics. As such, new fault tolerance systems based on state-machine replication with excellent throughput can be easily designed and maintained. We introduce three concurrency control mechanisms for state-machine replication using software transactional memory, namely ordered strong strict two-phase locking, conventional timestamp-based multiversion concurrency control, and speculative timestamp-based multiversion concurrency control. Our experiments show that the speculative timestamp-based multiversion concurrency control mechanism has the best performance for all types of workload, whereas the conventional timestamp-based multiversion concurrency control mechanism offers the worst performance because of its high abort rate in the presence of even moderate contention between transactions. The ordered strong strict two-phase locking mechanism offers the simplest solution, with excellent performance under low-contention workloads and fairly good performance under high-contention workloads.
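
To make the abort behaviour of conventional timestamp-based multiversion concurrency control more concrete, the following is a minimal sketch of the textbook multiversion timestamp-ordering rule. It is not the article's implementation. The names `MVItem`, `Version`, and `AbortTransaction` are hypothetical, and the sketch assumes that each transaction's timestamp is a positive integer, for example the position of its request in the total order.

```python
from dataclasses import dataclass, field


class AbortTransaction(Exception):
    """Raised when a write arrives too late to be ordered correctly."""


@dataclass
class Version:
    write_ts: int   # timestamp of the transaction that created this version
    value: object
    read_ts: int    # largest timestamp of any transaction that has read it


@dataclass
class MVItem:
    # Hypothetical shared data item; starts with one initial version at timestamp 0.
    versions: list = field(default_factory=lambda: [Version(0, None, 0)])

    def _visible(self, ts):
        # The version a transaction with timestamp ts sees:
        # the one with the largest write_ts not exceeding ts.
        candidates = (v for v in self.versions if v.write_ts <= ts)
        return max(candidates, key=lambda v: v.write_ts)

    def read(self, ts):
        # Reads never abort; they only record that the version was read at ts.
        v = self._visible(ts)
        v.read_ts = max(v.read_ts, ts)
        return v.value

    def write(self, ts, value):
        v = self._visible(ts)
        if v.read_ts > ts:
            # A later transaction already read the version this write
            # would supersede, so the writer must abort.
            raise AbortTransaction(f"write at ts={ts} invalidated by read at ts={v.read_ts}")
        if v.write_ts == ts:
            v.value = value  # same transaction overwriting its own version
        else:
            self.versions.append(Version(ts, value, ts))


item = MVItem()
item.write(1, "a")
item.read(3)             # transaction 3 reads the version written by transaction 1
try:
    item.write(2, "b")   # arrives after the read by transaction 3
except AbortTransaction as exc:
    print("aborted:", exc)
```

In this sketch a single late write is enough to force an abort, which gives some intuition for why the abort rate rises quickly once transactions contend for the same items.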

References

  1. Bernstein PA, Goodman N (1981) Concurrency control in distributed database systems. ACM Comput Surv 13(2):185–221

  2. Brito A, Fetzer C, Felber P (2009) Multithreading-enabled active replication for event stream processing operators. In: Proceedings of the 28th IEEE international symposium on reliable distributed systems. IEEE, New York, pp 22–31

  3. Castro M, Liskov B (2002) Practical byzantine fault tolerance and proactive recovery. ACM Trans Comput Syst 20(4):398–461

  4. Chai H, Zhang H, Zhao W, Melliar-Smith PM, Moser LE (2013) Toward trustworthy coordination for web service business activities. IEEE Trans Serv Comput 6(2):276–288

  5. Chai H, Zhao W (2012) Byzantine fault tolerance as a service. In: Kim TH, Mohammed S, Ramos C, Abawajy J, Kang BH, Slezak D (eds) Computer applications for web, human computer interaction, signal and image processing, and pattern recognition. Communications in computer and information science, vol 342. Springer, Berlin, pp 173–179

  6. Chai H, Zhao W (2013) Byzantine fault tolerance for session-oriented multi-tiered applications. Int J Web Sci 2(1/2):113–125

  7. Chai H, Zhao W (2014) Byzantine fault tolerance for services with commutative operations. In: Proceedings of the IEEE international conference on services computing. IEEE, Anchorage, pp 219–226

  8. Chai H, Zhao W (2014) Byzantine fault tolerant event stream processing for autonomic computing. In: Proceedings of the 12th IEEE international conference on dependable, autonomic and secure computing. IEEE, New York, pp 109–114

  9. Chai H, Zhao W (2014) Towards trustworthy complex event processing. In: Proceedings of the 5th IEEE international conference on software engineering and service science. IEEE, New York, pp 758–761

  10. Dice D, Shalev O, Shavit N (2006) Transactional locking II. In: Distributed computing. Springer, Berlin, pp 194–208

  11. Ennals R (2006) Software transactional memory should not be obstruction-free. Technical report IRC-TR-06-052, Intel Research Cambridge

  12. Gray J, Reuter A (1992) Transaction processing: concepts and techniques, 1st edn. Morgan Kaufmann Publishers Inc., San Francisco

  13. Harris T, Fraser K (2003) Language support for lightweight transactions. In: ACM SIGPLAN notices, vol 38, pp 388–402. ACM, New York

  14. Lamport L (2001) Paxos made simple. ACM SIGACT News (Distrib Comput Column) 32(4):18–25

  15. Ramadan HE, Roy I, Herlihy M, Witchel E (2009) Committing conflicting transactions in an STM. In: ACM SIGPLAN notices, vol 44. ACM, New York, pp 163–172

  16. Raz Y (1992) The principle of commitment ordering. In: Proceedings of the 18th international conference on very large data bases, pp 292–312

  17. Shavit N, Touitou D (1995) Software transactional memory. In: Proceedings of the 14th ACM symposium on principles of distributed computing, pp 204–213

  18. Yin J, Martin JP, Venkataramani A, Alvisi L, Dahlin M (2003) Separating agreement from execution for byzantine fault tolerant services. In: Proceedings of the ACM symposium on operating systems principles. Bolton Landing, NY, pp 253–267

  19. Zhang H, Chai H, Zhao W, Melliar-Smith PM, Moser LE (2012) Trustworthy coordination for web service atomic transactions. IEEE Trans Parall Distrib Syst 23(8):1551–1565

  20. Zhang H, Zhao W (2012) Concurrent byzantine fault tolerance for software-transactional-memory based applications. Int J Future Comput Commun 1(1):47–50

  21. Zhang H, Zhao W, Melliar-Smith PM, Moser LE (2011) Design and implementation of a byzantine fault tolerance framework for non-deterministic applications. IET Softw 5:342–356

  22. Zhao W (2009) Design and implementation of a Byzantine fault tolerance framework for web services. J Syst Softw 82(6):1004–1015

  23. Zhao W (2014) Application-aware byzantine fault tolerance. In: Proceedings of the 12th IEEE international conference on dependable, autonomic and secure computing. IEEE, New York, pp 45–50

  24. Zhao W (2014) Building dependable distributed systems. Wiley-Scrivener, New York

  25. Zhao W (2015) Optimistic byzantine fault tolerance. Int J Parall Emerg Distrib Syst, pp 1–14 (preprint)

  26. Zhao W (2016) Performance optimization for state machine replication based on application semantics: a review. J Syst Softw 112:96–109

  27. Zhao W, Babi M (2013) Byzantine fault tolerant collaborative editing. In: Proceedings of the IET international conference on information and communications technologies. IET, UK, pp 233–240

  28. Zhao W, Melliar-Smith PM, Moser LE (2013) Low latency fault tolerance system. Comput J 56(6):716–740

  29. Zhao W, Moser LE, Melliar-Smith PM (2005) Unification of transactions and replication in three-tier architectures based on CORBA. IEEE Trans Depend Secure Comput 2(1):20–33

  30. Zhao W, Zhang H, Chai H (2009) A lightweight fault tolerance framework for web services. Web Intell Agent Syst Int J 7(3):255–268

  31. Zhao W, Zhang H, Luo X, Zhu Y (2015) Enable concurrent Byzantine fault tolerance computing with software transactional memory. In: Proceedings of the 8th international conference on advanced software engineering & its applications. IEEE, New York, pp 67–72

Acknowledgments

We sincerely thank the anonymous reviewers for their invaluable comments and suggestions. An earlier version of this paper was presented at the 8th International Conference on Advanced Software Engineering & Its Applications [31]. This work was supported in part by a Graduate Faculty Travel Award from Cleveland State University, by the National Key Technologies R&D Program of China under Grant 2015BAK38B01, and by a grant from the special funds project for scientific research of public welfare industry from the Ministry of Land and Resources of the People's Republic of China, No. 201511079.

Author information

Corresponding author

Correspondence to Wenbing Zhao.

About this article

Cite this article

Zhao, W., Yang, W., Zhang, H. et al. High-throughput state-machine replication using software transactional memory. J Supercomput 72, 4379–4398 (2016). https://doi.org/10.1007/s11227-016-1747-2

  • DOI: https://doi.org/10.1007/s11227-016-1747-2
