Quantifying eventual consistency with PBS

  • Special Issue Paper
The VLDB Journal

All good ideas arrive by chance.—Max Ernst.

Abstract

Data store replication results in a fundamental trade-off between operation latency and data consistency. At the weak end of the consistency spectrum is eventual consistency, which provides no limit on the staleness of data returned. However, anecdotally, eventual consistency is often “good enough” for practitioners given its latency and availability benefits. In this work, we explain why eventually consistent systems are regularly acceptable in practice, analyzing both the staleness of data they return and the latency benefits they offer. We introduce Probabilistically Bounded Staleness (PBS), a consistency model which provides expected bounds on data staleness with respect to both versions and wall clock time. We derive a closed-form solution for versioned staleness and model real-time staleness under Internet-scale production workloads for a large class of quorum-replicated, Dynamo-style stores. Using PBS, we measure the latency–consistency trade-off for partial, non-overlapping quorum systems, including limited multi-object operations. We quantitatively demonstrate how and why eventually consistent systems frequently return consistent data within tens of milliseconds while offering significant latency benefits.
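To make the idea of a closed-form bound on versioned staleness concrete, the following is a minimal sketch rather than the article's own derivation. It assumes N replicas, that each write reaches exactly W replicas chosen uniformly at random, that each read contacts R replicas chosen uniformly at random, and that quorums for successive writes are independent; no anti-entropy or read repair is modeled. The function names are hypothetical.

```python
from math import comb


def prob_miss_latest(n: int, r: int, w: int) -> float:
    """Probability that a random read quorum of size r contains none of the
    w replicas holding the latest write (assumes uniform random quorums)."""
    # comb(n - w, r) is 0 whenever r > n - w, i.e. whenever R + W > N,
    # so strict (overlapping) quorums never miss the latest write.
    return comb(n - w, r) / comb(n, r)


def prob_within_k_versions(n: int, r: int, w: int, k: int) -> float:
    """Probability that a read returns one of the last k versions, assuming
    the write quorums of the last k writes are chosen independently."""
    return 1.0 - prob_miss_latest(n, r, w) ** k


if __name__ == "__main__":
    for k in (1, 2, 3, 5, 10):
        print(k, round(prob_within_k_versions(3, 1, 1, k), 4))
```

Under these assumptions, with N = 3 and R = W = 1 a read misses the latest write with probability 2/3, so the chance of reading one of the last three versions is 1 − (2/3)³ = 19/27. The bound is pessimistic for real systems because writes usually continue propagating to the remaining replicas.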

Notes

  1. Many systems providing consistency semantics such as serializability, linearizability, and convergent causal consistency provide eventual consistency (often along with additional safety and liveness properties which result in stronger models than “basic” eventual consistency).

  2. In prior versions of this work [15], we used different terminology; here, we leverage definitions from the existing literature. We believe these metrics will both provide greater clarity and allow for cleaner integration with other metrics. For readers familiar with our prior terminology, \(k\)-staleness has become \((K, p)\)-regular semantics, \(t\)-visibility has become \((\varDelta , p)\)-regular semantics, and \(\langle k, t \rangle \)-staleness has become \((\varDelta , K, p)\)-regular semantics.

  3. LinkedIn. www.linkedin.com.

  4. Yammer. www.yammer.com.
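The time-based notion in Note 2, \((\varDelta , p)\)-regular semantics (formerly \(t\)-visibility), can be explored with a small Monte Carlo simulation. The sketch below is an illustration under stated assumptions, not the article's experimental setup: it models four one-way message delays per replica (write request, write acknowledgment, read request, read response), draws them from exponential distributions with hypothetical means, and asks how often a read issued `t_ms` after a write commits observes that write. The function name, parameter names, and default means are all assumptions.

```python
import random


def simulate_consistency(n, r, w, t_ms, trials=20000,
                         mean_ms=(5.0, 1.0, 1.0, 1.0), seed=0):
    """Estimate the probability that a read started t_ms after a write
    commits returns that write. Delays are exponential (an assumption)."""
    rng = random.Random(seed)
    mean_w, mean_a, mean_r, mean_s = mean_ms
    consistent = 0
    for _ in range(trials):
        # One-way delays for each replica: write, ack, read, response.
        w_delay = [rng.expovariate(1.0 / mean_w) for _ in range(n)]
        a_delay = [rng.expovariate(1.0 / mean_a) for _ in range(n)]
        r_delay = [rng.expovariate(1.0 / mean_r) for _ in range(n)]
        s_delay = [rng.expovariate(1.0 / mean_s) for _ in range(n)]

        # The write commits once the coordinator has received w acks.
        commit = sorted(w_delay[i] + a_delay[i] for i in range(n))[w - 1]
        read_start = commit + t_ms

        # The read coordinator uses the first r responses to arrive.
        first = sorted(range(n), key=lambda i: r_delay[i] + s_delay[i])[:r]

        # A replica answers with the new value if the write reached it
        # before the read request did.
        if any(w_delay[i] <= read_start + r_delay[i] for i in first):
            consistent += 1
    return consistent / trials


if __name__ == "__main__":
    for t in (0.0, 1.0, 5.0, 10.0, 50.0):
        print(t, simulate_consistency(3, 1, 1, t))
```

Because the seed fixes the sampled delays, the estimate is non-decreasing in `t_ms`, and any configuration with R + W > N always reports 1.0. Replacing the exponential draws with measured latency distributions would be the natural next step for a realistic estimate.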

References

  1. Apache Cassandra 1.0 documentation: About Data Consistency in Cassandra. http://datastax.com/docs/1.0/dml/data_consistency

  2. Apache Cassandra Jira: cassandra-876: Support Session (Read-After-Write) Consistency. https://issues.apache.org/jira/browse/CASSANDRA-876. October 2010. Accessed 13 Dec 2011

  3. Abadi, D.J.: Consistency tradeoffs in modern distributed database system design: CAP is only part of the story. IEEE Comput. 45(2), 37–42 (2012)

  4. Abraham, I., Malkhi, D.: Probabilistic quorums for dynamic systems (extended abstract). In: DISC, pp. 60–74 (2003)

  5. Agrawal, D., Abbadi, A.E.: The tree quorum protocol: An efficient approach for managing replicated data. In: VLDB, pp. 243–254 (1990)

  6. Agrawal, D., Abbadi, A.E.: Resilient logical structures for efficient management of replicated data. In: VLDB, pp. 151–162 (1992)

  7. Ahamad, M., Neiger, G., Burns, J.E., Kohli, P., Hutto, P.: Causal memory: definitions, implementation and programming. Distrib. Comput. 9(1), 37–49 (1995)

  8. Aiyer, A., Alvisi, L., Bazzi, R.A.: On the availability of non-strict quorum systems. In: DISC, pp. 48–62 (2005)

  9. Aiyer, A.S., Alvisi, L., Bazzi, R.A.: Byzantine and multi-writer k-quorums. In: DISC, pp. 443–458 (2006)

  10. Alpern, B., Schneider, F.B.: Defining liveness. Inf. Process. Lett. 21(4), 181–185 (1985)

  11. Alvaro, P., Conway, N., Hellerstein, J.M., Marczak, W.R.: Consistency analysis in Bloom: a CALM and collected approach. In: CIDR, pp. 249–260 (2011)

  12. Armbrust, M., Curtis, K., Kraska, T., Fox, A., Franklin, M.J., Patterson, D.A.: PIQL: Success-tolerant query processing in the cloud. In: VLDB, pp. 181–192 (2012)

  13. Basho Riak: http://basho.com/products/riak-overview/ (2012)

  14. Bailis, P., Fekete, A., Ghodsi, A., Hellerstein, J.M., Stoica, I.: The potential dangers of causal consistency and an explicit solution. In: SOCC (2012)

  15. Bailis, P., Venkataraman, S., Franklin, M.J., Hellerstein, J.M., Stoica, I.: Probabilistically bounded staleness for practical partial quorums. PVLDB 5(8), 776–787 (2012)

  16. Bailis, P., Venkataraman, S., Franklin, M.J., Hellerstein, J.M., Stoica, I.: PBS at work: advancing data management with consistency metrics. In: SIGMOD (2013). Demo

  17. Basho Technologies, Inc.: Riak wiki: Riak concepts replication. http://wiki.basho.com/Replication.html. Accessed Jan 2013

  18. Basho Technologies, Inc.: riak_kv 1.0 application. https://github.com/basho/riak_kv/blob/1.0/src/riak_kv_app.erl

  19. Bermbach, D., Tai, S.: Eventual consistency: how soon is eventual? An evaluation of Amazon S3’s consistency behavior. In: MW4SOC, pp. 1:1–1:6 (2011)

  20. Birman, K., Chockler, G., van Renesse, R.: Toward a cloud computing research agenda. SIGACT News 40(2), 68–80 (2009)

  21. Blomstedt, J.: Bringing consistency to Riak. Talk at RICON 2012 (http://vimeo.com/51973001)

  22. Cassandra 1.0 Thrift Configuration. https://github.com/apache/cassandra/blob/cassandra-1.0/interface/cassandra.thrift

  23. Cassandra wiki: Operations. http://wiki.apache.org/cassandra/Operations#Repairing_missing_or_inconsistent_data. Accessed Jan 2013

  24. Cooper, B.F., Ramakrishnan, R., Srivastava, U., Silberstein, A., Bohannon, P., Jacobsen, H.A., Puz, N., Weaver, D., Yerneni, R.: PNUTS: Yahoo!’s hosted data serving platform. Proc. VLDB Endow. 1(2), 1277–1288 (2008). http://dl.acm.org/citation.cfm?id=1454159.1454167

  25. Davidson, S., Garcia-Molina, H., Skeen, D.: Consistency in partitioned networks. ACM Comput. Surv. 17(3), 314–370 (1985)

  26. Dean, J.: Designs, lessons, and advice from building large distributed systems. In: Keynote from LADIS (2009)

  27. DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Sivasubramanian, S., Vosshall, P., Vogels, W.: Dynamo: Amazon’s highly available key-value store. In: SOSP, pp. 205–220 (2007)

  28. Demers, A., Greene, D., Hauser, C., Irish, W., Larson, J., Shenker, S., Sturgis, H., Swinehart, D., Terry, D.: Epidemic algorithms for replicated database maintenance. In: PODC, pp. 1–12 (1987)

  29. Ellis, J.B.: Revision 986783: revert ’per-connection read-your-writes “session” consistency’. http://svn.apache.org/viewvc?view=revision&revision=986783. 18 August 2010, one week after the original patch was accepted

  30. Feinberg, A.: Personal communication. 23, 24 October, 14, 19, 21, 30 November, 1 December 2011

  31. Feinberg, A.: Project Voldemort: Reliable distributed storage. In: ICDE (2011). Project site: http://www.project-voldemort.com (2012)

  32. Fu, A.W.: Delay-optimal quorum consensus for distributed systems. IEEE Trans. Parallel Distrib. Syst. 8(1), 59–69 (1997)

  33. Gibbons, P.B., Korach, E.: Testing shared memories. SIAM J. Comput. 26(4), 1208–1244 (1997)

  34. Gifford, D.K.: Weighted voting for replicated data. In: SOSP, pp. 150–162 (1979)

  35. Golab, W., Li, X., Shah, M.A.: Analyzing consistency properties for fun and profit. In: PODC, pp. 197–206 (2011)

  36. Gupta, A., Maggs, B.M., Oprea, F., Reiter, M.K.: Quorum placement in networks to minimize access delays. In: PODC, pp. 87–96 (2005)

  37. Hale, C.: Personal Communication. 16 November 2011

  38. Hale, C., Kennedy, R.: Using Riak at Yammer. http://dl.dropbox.com/u/2744222/2011-03-22_Riak-At-Yammer.pdf. 23 March 2011

  39. Hamilton, J.: Perspectives: I love eventual consistency but... http://perspectives.mvdirona.com/2010/02/24/ILoveEventualConsistencyBut.aspx. 24 February 2010

  40. Helland, P., Campbell, D.: Building on quicksand. In: CIDR (2009)

  41. Herlihy, M.: Dynamic quorum adjustment for partitioned data. ACM Trans. Database Syst. 12(2), 170–194 (1987)

  42. Herlihy, M., Wing, J.M.: Linearizability: a correctness condition for concurrent objects. ACM Trans. Program. Lang. Syst. 12(3), 463–492 (1990)

  43. Hunt, P., Konar, M., Junqueira, F.P., Reed, B.: ZooKeeper: wait-free coordination for internet-scale systems. In: USENIX ATC, pp. 145–158 (2010)

  44. Jiménez-Peris, R., Patiño Martínez, M.: Are quorums an alternative for data replication? ACM Trans. Database Syst. 28(3), 257–294 (2003)

  45. King, D.: keltranis comment on “reddit’s now running on Cassandra”. http://www.reddit.com/r/programming/comments/bcqhi/reddits_now_running_on_cassandra/c0m3wh6. March 2010

  46. Kirkell, J.: Consistency or bust: breaking a Riak cluster. http://www.oscon.com/oscon2011/public/schedule/detail/19762. Talk at O’Reilly OSCON 2011, 27 July 2011

  47. Kraska, T., Hentschel, M., Alonso, G., Kossmann, D.: Consistency rationing in the cloud: pay only when it matters. In: Proceedings of the VLDB Endowment, vol. 2, issue 1, pp. 253–264 (2009)

  48. Krishnamurthy, S., Sanders, W.H., Cukier, M.: An adaptive quality of service aware middleware for replicated services. IEEE Trans. Parallel Distrib. Syst. 14(11), 1112–1125 (2003)

  49. Lakshman, A., Malik, P.: Cassandra—a decentralized structured storage system. In: LADIS, pp. 35–40 (2008). Project site: http://cassandra.apache.org (2012)

  50. Lamport, L.: On interprocess communication. Distrib. Comput. 1(2), 86–101 (1986)

  51. Lamport, L.: The part-time parliament. ACM Trans. Comput. Syst. 16(2), 133–169 (1998)

  52. Linden, G.: Make Data Useful. https://sites.google.com/site/glinden/Home/StanfordDataMining.2006-11-29.ppt. 29 November 2006

  53. Linden, G.: Marissa Mayer at Web 2.0. http://glinden.blogspot.com/2006/11/marissa-mayer-at-web-20.html. 9 November 2006

  54. Lloyd, W., Freedman, M.J., Kaminsky, M., Andersen, D.G.: Don’t settle for eventual: scalable causal consistency for wide-area storage with COPS. In: SOSP, pp. 401–416 (2011)

  55. Lynch, J.: Rolling with Riak. http://sdruby.org/podcast/81. Talk presented at SD Ruby meeting (Podcast 81), 2010

  56. Metafilter Infodump: http://stuff.metafilter.com/infodump/. Combination of all available comment datasets: mefi, askme, meta, music. User count from usernames

  57. Mahajan, P., Alvisi, L., Dahlin, M.: Consistency, availability, convergence. Tech. Rep. TR-11-22, Computer Science Department, University of Texas at Austin (2011)

  58. Malkhi, D., Reiter, M., Wool, A., Wright, R.: Probabilistic quorum systems. Inf. Comput. 170(2), 184–206 (2001)

  59. Marcus, A.: The NoSQL ecosystem. In: The Architecture of Open Source Applications, pp. 185–205 (2011)

  60. Merideth, M., Reiter, M.: Selected results from the latest decade of quorum systems research. In: Replication, LNCS, vol. 5959, pp. 185–206. Springer, Berlin (2010)

  61. Naor, M., Wool, A.: The load, capacity, and availability of quorum systems. SIAM J. Comput. 27(2), 214–225 (1998)

  62. Olston, C., Widom, J.: Offering a precision-performance tradeoff for aggregation queries over replicated data. In: VLDB, pp. 144–155 (2000)

  63. Outbrain Inc.: Introduction to no:sql [sic] and Cassandra (and Outbrain). https://docs.google.com/present/view?id=ahbp3bktzpkc_220f7v26vg7. January 2010

  64. Papadimitriou, C.: The serializability of concurrent database updates. J. ACM (JACM) 26(4), 631–653 (1979)

  65. Rahman, M., Golab, W., AuYoung, A., Keeton, K., Wylie, J.: Toward a principled framework for benchmarking consistency. In: Proceedings of the 8th Workshop on Hot Topics in System Dependability (2012)

  66. Ritter, A., Cherry, C., Dolan, B.: Unsupervised modeling of Twitter conversations. In: HLT (2010)

  67. Schurman, E., Brutlag, J.: Performance related changes and their user impact. Presented at Velocity Web Performance and Operations Conference (June 2009)

  68. Sovran, Y., Power, R., Aguilera, M.K., Li, J.: Transactional storage for geo-replicated systems. In: SOSP, pp. 385–400 (2011)

  69. Stonebraker, M.: Urban Myths About SQL. http://voltdb.com/_pdf/VoltDB-MikeStonebraker-SQLMythsWebinar-060310.pdf. VoltDB Webinar (June 2010)

  70. Sumbaly, R.: Writing Own Client for Voldemort. https://github.com/voldemort/voldemort/wiki/Writing-own-client-for-Voldemort. 16 June 2011. Accessed 21 Dec 2011

  71. Taylor, R.N.: Complexity of analyzing the synchronization structure of concurrent programs. Acta Informatica 19, 57–84 (1983)

  72. Terry, D.B., Demers, A.J., Petersen, K., Spreitzer, M.J., Theimer, M.M., Welch, B.B.: Session guarantees for weakly consistent replicated data. In: PDIS, pp. 140–149 (1994)

  73. Torres-Rojas, F.J., Ahamad, M., Raynal, M.: Timed consistency for shared distributed objects. In: PODC, pp. 163–172 (1999)

  74. Vogels, W.: Eventually consistent. CACM 52, 40–44 (2009)

  75. Wada, H., Fekete, A., Zhao, L., Lee, K., Liu, A.: Data consistency properties and the trade-offs in commercial cloud storage: the consumers’ perspective. In: CIDR, pp. 134–143 (2011)

  76. Wester, B., Cowling, J., Nightingale, E.B., Chen, P.M., Flinn, J., Liskov, B.: Tolerating latency in replicated state machines through client speculation. In: NSDI, pp. 245–260 (2009)

  77. Williams, D.: HBase vs Cassandra: Why We Moved. http://ria101.wordpress.com/2010/02/24/hbase-vs-cassandra-why-we-moved. 24 February 2010

  78. Yu, H., Vahdat, A.: Design and evaluation of a conit-based continuous consistency model for replicated services. ACM Trans. Comput. Syst. 20(3), 239–282 (2002)

  79. Yu, H., Vahdat, A.: The costs and limits of availability for replicated services. ACM Trans. Comput. Syst. 24(1), 70–113 (2006)

  80. Zellag, K., Kemme, B.: How consistent is your cloud application? In: SOCC (2012)

  81. Zhang, C., Zhang, Z.: Trading replication consistency for performance and availability: an adaptive approach. In: ICDCS, pp. 687–695 (2003)

Acknowledgments

The authors would like to thank Alex Feinberg and Coda Hale for their cooperation in providing real-world distributions for experiments and for exemplifying positive industrial–academic relations through their conduct and feedback. The authors would also like to thank the following individuals whose discussions and feedback improved this work: Marcos Aguilera, Peter Alvaro, Eric Brewer, Neil Conway, Aaron Davidson, Greg Durrett, Jonathan Ellis, Andy Gross, Hariyadi Gunawi, Sam Madden, Bill Marczak, Kay Ousterhout, Vern Paxson, Mark Phillips, Christopher Ré, Aviad Rubenstein, Justin Sheehy, Scott Shenker, Sriram Srinivasan, Doug Terry, Anirudh Todi, Greg Valiant, and Patrick Wendell. We would especially like to thank Bryan Kate for his extensive comments and Ali Ghodsi, who, in addition to providing feedback, originally piqued our interest in theoretical quorum systems. This work was supported by gifts from Google, SAP, Amazon Web Services, Blue Goji, Cloudera, Ericsson, General Electric, Hewlett Packard, Huawei, IBM, Intel, MarkLogic, Microsoft, NEC Labs, NetApp, NTT Multimedia Communications Laboratories, Oracle, Quanta, Splunk, and VMware. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant DGE 1106400, National Science Foundation Grants IIS-0713661, CNS-0722077 and IIS-0803690, NSF CISE Expeditions award CCF-1139158, the Air Force Office of Scientific Research Grant FA95500810352, and by DARPA contract FA865011C7136.

Author information

Corresponding author

Correspondence to Peter Bailis.

About this article

Cite this article

Bailis, P., Venkataraman, S., Franklin, M.J. et al. Quantifying eventual consistency with PBS. The VLDB Journal 23, 279–302 (2014). https://doi.org/10.1007/s00778-013-0330-1

  • DOI: https://doi.org/10.1007/s00778-013-0330-1
