All good ideas arrive by chance.—Max Ernst.
Abstract
Data store replication results in a fundamental trade-off between operation latency and data consistency. At the weak end of the consistency spectrum is eventual consistency, which places no limit on the staleness of the data returned. Anecdotally, however, eventual consistency is often “good enough” for practitioners given its latency and availability benefits. In this work, we explain why eventually consistent systems are regularly acceptable in practice, analyzing both the staleness of the data they return and the latency benefits they offer. We introduce Probabilistically Bounded Staleness (PBS), a consistency model that provides expected bounds on data staleness with respect to both versions and wall clock time. We derive a closed-form solution for versioned staleness and model real-time staleness under Internet-scale production workloads for a large class of quorum-replicated, Dynamo-style stores. Using PBS, we measure the latency–consistency trade-off for partial, non-overlapping quorum systems, including limited multi-object operations. We quantitatively demonstrate how and why eventually consistent systems frequently return consistent data within tens of milliseconds while offering significant latency benefits.
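The abstract mentions a closed-form solution for versioned staleness but does not state it. As a rough illustration only, the sketch below assumes the standard probabilistic-quorum model: N replicas, a write quorum of W replicas and a read quorum of R replicas, each chosen uniformly at random and independently across versions. Under those assumptions, a read misses the latest write with probability C(N-W, R) / C(N, R), and misses all of the last k writes with that probability raised to the power k. The function names are hypothetical and are not taken from the article.

```python
from math import comb


def p_nonintersect(n: int, r: int, w: int) -> float:
    """Probability that a uniformly random read quorum of size r contains
    none of the w replicas that acknowledged the latest write.

    Assumption (not from the abstract): quorums are uniform random subsets.
    """
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("require 1 <= r <= n and 1 <= w <= n")
    # comb(n - w, r) is 0 when r > n - w, i.e. when the quorums must overlap.
    return comb(n - w, r) / comb(n, r)


def p_stale_beyond_k(n: int, r: int, w: int, k: int) -> float:
    """Probability that a read misses all of the last k committed versions,
    assuming each version's write quorum is chosen independently."""
    if k < 1:
        raise ValueError("require k >= 1")
    return p_nonintersect(n, r, w) ** k


if __name__ == "__main__":
    # Partial quorum (R + W <= N): staleness is possible but shrinks with k.
    for k in (1, 2, 3, 5):
        print(k, p_stale_beyond_k(3, 1, 1, k))
    # Strict quorum (R + W > N): the quorums always intersect.
    print(p_stale_beyond_k(3, 2, 2, 1))
```

The exponent on k is what makes small partial quorums tolerable in this model: even when a single read is fairly likely to miss the newest version, the chance of being many versions behind drops off geometrically.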
Notes
Many systems providing consistency semantics such as serializability, linearizability, and convergent causal consistency provide eventual consistency (often along with additional safety and liveness properties which result in stronger models than “basic” eventual consistency).
In prior versions of this work [15], we used different terminology; here, we leverage definitions from the existing literature. We believe these metrics will both provide greater clarity and allow for cleaner integration with other metrics. For readers familiar with our prior terminology, \(k\)-staleness has become \((K, p)\)-regular semantics, \(t\)-visibility has become \((\varDelta , p)\)-regular semantics, and \(\langle k, t \rangle \)-staleness has become \((\varDelta , K, p)\)-regular semantics.
LinkedIn. www.linkedin.com.
Yammer. www.yammer.com.
References
Apache Cassandra 1.0 documentation: About Data Consistency in Cassandra. http://datastax.com/docs/1.0/dml/data_consistency
Apache Cassandra Jira: cassandra-876: Support Session (Read-After-Write) Consistency. https://issues.apache.org/jira/browse/CASSANDRA-876. October 2010. Accessed 13 Dec 2011
Abadi, D.J.: Consistency tradeoffs in modern distributed database system design: CAP is only part of the story. IEEE Comput. 45(2), 37–42 (2012)
Abraham, I., Malkhi, D.: Probabilistic quorums for dynamic systems (extended abstract). In: DISC, pp. 60–74 (2003)
Agrawal, D., Abbadi, A.E.: The tree quorum protocol: An efficient approach for managing replicated data. In: VLDB, pp. 243–254 (1990)
Agrawal, D., Abbadi, A.E.: Resilient logical structures for efficient management of replicated data. In: VLDB, pp. 151–162 (1992)
Ahamad, M., Neiger, G., Burns, J.E., Kohli, P., Hutto, P.: Causal memory: definitions, implementation and programming. Distrib. Comput. 9(1), 37–49 (1995)
Aiyer, A., Alvisi, L., Bazzi, R.A.: On the availability of non-strict quorum systems. In: DISC, pp. 48–62 (2005)
Aiyer, A.S., Alvisi, L., Bazzi, R.A.: Byzantine and multi-writer k-quorums. In: DISC, pp. 443–458 (2006)
Alpern, B., Schneider, F.B.: Defining liveness. Inf. Process. Lett. 21(4), 181–185 (1985)
Alvaro, P., Conway, N., Hellerstein, J.M., Marczak, W.R.: Consistency analysis in Bloom: a CALM and collected approach. In: CIDR, pp. 249–260 (2011)
Armbrust, M., Curtis, K., Kraska, T., Fox, A., Franklin, M.J., Patterson, D.A.: PIQL: Success-tolerant query processing in the cloud. In: VLDB, pp. 181–192 (2012)
Basho Riak: http://basho.com/products/riak-overview/ (2012)
Bailis, P., Fekete, A., Ghodsi, A., Hellerstein, J.M., Stoica, I.: The potential dangers of causal consistency and an explicit solution. In: SOCC (2012)
Bailis, P., Venkataraman, S., Franklin, M.J., Hellerstein, J.M., Stoica, I.: Probabilistically bounded staleness for practical partial quorums. PVLDB 5(8), 776–787 (2012)
Bailis, P., Venkataraman, S., Franklin, M.J., Hellerstein, J.M., Stoica, I.: PBS at work: advancing data management with consistency metrics. In: SIGMOD (2013). Demo
Basho Technologies, Inc.: Riak wiki: Riak concepts replication. http://wiki.basho.com/Replication.html. Accessed Jan 2013
Basho Technologies, Inc.: riak_kv 1.0 application. https://github.com/basho/riak_kv/blob/1.0/src/riak_kv_app.erl
Bermbach, D., Tai, S.: Eventual consistency: how soon is eventual? An evaluation of Amazon S3’s consistency behavior. In: MW4SOC, pp. 1:1–1:6 (2011)
Birman, K., Chockler, G., van Renesse, R.: Toward a cloud computing research agenda. SIGACT News 40(2), 68–80 (2009)
Blomstedt, J.: Bringing consistency to Riak. Talk at RICON 2012 (http://vimeo.com/51973001)
Cassandra 1.0 Thrift Configuration. https://github.com/apache/cassandra/blob/cassandra-1.0/interface/cassandra.thrift
Cassandra wiki: Operations. http://wiki.apache.org/cassandra/Operations#Repairing_missing_or_inconsistent_data. Accessed Jan 2013
Cooper, B.F., Ramakrishnan, R., Srivastava, U., Silberstein, A., Bohannon, P., Jacobsen, H.A., Puz, N., Weaver, D., Yerneni, R.: PNUTS: Yahoo!’s hosted data serving platform. Proc. VLDB Endow. 1(2), 1277–1288 (2008). http://dl.acm.org/citation.cfm?id=1454159.1454167
Davidson, S., Garcia-Molina, H., Skeen, D.: Consistency in partitioned networks. ACM Comput. Surv. 17(3), 314–370 (1985)
Dean, J.: Designs, lessons, and advice from building large distributed systems. In: Keynote from LADIS (2009)
DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Sivasubramanian, S., Vosshall, P., Vogels, W.: Dynamo: Amazon’s highly available key-value store. In: SOSP, pp. 205–220 (2007)
Demers, A., Greene, D., Hauser, C., Irish, W., Larson, J., Shenker, S., Sturgis, H., Swinehart, D., Terry, D.: Epidemic algorithms for replicated database maintenance. In: PODC, pp. 1–12 (1987)
Ellis, J.B.: Revision 986783: revert ’per-connection read-your-writes “session” consistency’. http://svn.apache.org/viewvc?view=revision&revision=986783. 18 August 2010, one week after the original patch was accepted
Feinberg, A.: Personal communication. 23, 24 October, 14, 19, 21, 30 November, 1 December 2011
Feinberg, A.: Project Voldemort: Reliable distributed storage. In: ICDE (2011). Project site: http://www.project-voldemort.com (2012)
Fu, A.W.: Delay-optimal quorum consensus for distributed systems. IEEE Trans. Parallel Distrib. Syst. 8(1), 59–69 (1997)
Gibbons, P.B., Korach, E.: Testing shared memories. SIAM J. Comput. 26(4), 1208–1244 (1997)
Gifford, D.K.: Weighted voting for replicated data. In: SOSP, pp. 150–162 (1979)
Golab, W., Li, X., Shah, M.A.: Analyzing consistency properties for fun and profit. In: PODC, pp. 197–206 (2011)
Gupta, A., Maggs, B.M., Oprea, F., Reiter, M.K.: Quorum placement in networks to minimize access delays. In: PODC, pp. 87–96 (2005)
Hale, C.: Personal Communication. 16 November 2011
Hale, C., Kennedy, R.: Using Riak at Yammer. http://dl.dropbox.com/u/2744222/2011-03-22_Riak-At-Yammer.pdf. 23 March 2011
Hamilton, J.: Perspectives: I love eventual consistency but... http://perspectives.mvdirona.com/2010/02/24/ILoveEventualConsistencyBut.aspx. 24 February 2010
Helland, P., Campbell, D.: Building on quicksand. In: CIDR (2009)
Herlihy, M.: Dynamic quorum adjustment for partitioned data. ACM Trans. Database Syst. 12(2), 170–194 (1987)
Herlihy, M., Wing, J.M.: Linearizability: a correctness condition for concurrent objects. ACM Trans. Program. Lang. Syst. 12(3), 463–492 (1990)
Hunt, P., Konar, M., Junqueira, F.P., Reed, B.: ZooKeeper: wait-free coordination for internet-scale systems. In: USENIX ATC, pp. 145–158 (2010)
Jiménez-Peris, R., Patiño Martínez, M.: Are quorums an alternative for data replication? ACM Trans. Database Syst. 28(3), 257–294 (2003)
King, D.: keltranis comment on “reddit’s now running on Cassandra”. http://www.reddit.com/r/programming/comments/bcqhi/reddits_now_running_on_cassandra/c0m3wh6. March 2010
Kirkell, J.: Consistency or bust: breaking a Riak cluster. http://www.oscon.com/oscon2011/public/schedule/detail/19762. Talk at O’Reilly OSCON 2011, 27 July 2011
Kraska, T., Hentschel, M., Alonso, G., Kossmann, D.: Consistency rationing in the cloud: pay only when it matters. In: Proceedings of the VLDB Endowment, vol. 2, issue 1, pp. 253–264 (2009)
Krishnamurthy, S., Sanders, W.H., Cukier, M.: An adaptive quality of service aware middleware for replicated services. IEEE Trans. Parallel Distrib. Syst. 14(11), 1112–1125 (2003)
Lakshman, A., Malik, P.: Cassandra—a decentralized structured storage system. In: LADIS, pp. 35–40 (2008). Project site: http://cassandra.apache.org (2012)
Lamport, L.: On interprocess communication. Distrib. Comput. 1(2), 86–101 (1986)
Lamport, L.: The part-time parliament. ACM Trans. Comput. Syst. 16(2), 133–169 (1998)
Linden, G.: Make Data Useful. https://sites.google.com/site/glinden/Home/StanfordDataMining.2006-11-29.ppt. 29 November 2006
Linden, G.: Marissa Mayer at Web 2.0. http://glinden.blogspot.com/2006/11/marissa-mayer-at-web-20.html. 9 November 2006
Lloyd, W., Freedman, M.J., Kaminsky, M., Andersen, D.G.: Don’t settle for eventual: scalable causal consistency for wide-area storage with COPS. In: SOSP, pp. 401–416 (2011)
Lynch, J.: Rolling with Riak. http://sdruby.org/podcast/81. Talk presented at SD Ruby meeting (Podcast 81), 2010
Metafilter Infodump: http://stuff.metafilter.com/infodump/. Combination of all available comment datasets: mefi, askme, meta, music. User count from usernames
Mahajan, P., Alvisi, L., Dahlin, M.: Consistency, availability, convergence. Tech. Rep. TR-11-22, Computer Science Department, University of Texas at Austin (2011)
Malkhi, D., Reiter, M., Wool, A., Wright, R.: Probabilistic quorum systems. Inf. Comput. 170(2), 184–206 (2001)
Marcus, A.: The NoSQL ecosystem. In: The Architecture of Open Source Applications, pp. 185–205 (2011)
Merideth, M., Reiter, M.: Selected results from the latest decade of quorum systems research. In: Replication, LNCS, vol. 5959, pp. 185–206. Springer, Berlin (2010)
Naor, M., Wool, A.: The load, capacity, and availability of quorum systems. SIAM J. Comput. 27(2), 214–225 (1998)
Olston, C., Widom, J.: Offering a precision-performance tradeoff for aggregation queries over replicated data. In: VLDB, pp. 144–155 (2000)
Outbrain Inc.: Introduction to no:sql [sic] and Cassandra (and Outbrain). https://docs.google.com/present/view?id=ahbp3bktzpkc_220f7v26vg7. January 2010
Papadimitriou, C.: The serializability of concurrent database updates. J. ACM (JACM) 26(4), 631–653 (1979)
Rahman, M., Golab, W., AuYoung, A., Keeton, K., Wylie, J.: Toward a principled framework for benchmarking consistency. In: Proceedings of the 8th Workshop on Hot Topics in System Dependability (2012)
Ritter, A., Cherry, C., Dolan, B.: Unsupervised modeling of Twitter conversations. In: HLT (2010)
Schurman, E., Brutlag, J.: Performance related changes and their user impact. Presented at Velocity Web Performance and Operations Conference (June 2009)
Sovran, Y., Power, R., Aguilera, M.K., Li, J.: Transactional storage for geo-replicated systems. In: SOSP, pp. 385–400 (2011)
Stonebraker, M.: Urban Myths About SQL. http://voltdb.com/_pdf/VoltDB-MikeStonebraker-SQLMythsWebinar-060310.pdf. VoltDB Webinar (June 2010)
Sumbaly, R.: Writing Own Client for Voldemort. https://github.com/voldemort/voldemort/wiki/Writing-own-client-for-Voldemort. 16 June 2011. Accessed 21 Dec 2011
Taylor, R.N.: Complexity of analyzing the synchronization structure of concurrent programs. Acta Informatica 19, 57–84 (1983)
Terry, D.B., Demers, A.J., Petersen, K., Spreitzer, M.J., Theimer, M.M., Welch, B.B.: Session guarantees for weakly consistent replicated data. In: PDIS, pp. 140–149 (1994)
Torres-Rojas, F.J., Ahamad, M., Raynal, M.: Timed consistency for shared distributed objects. In: PODC, pp. 163–172 (1999)
Vogels, W.: Eventually consistent. CACM 52, 40–44 (2009)
Wada, H., Fekete, A., Zhao, L., Lee, K., Liu, A.: Data consistency properties and the trade-offs in commercial cloud storage: the consumers’ perspective. In: CIDR, pp. 134–143 (2011)
Wester, B., Cowling, J., Nightingale, E.B., Chen, P.M., Flinn, J., Liskov, B.: Tolerating latency in replicated state machines through client speculation. In: NSDI, pp. 245–260 (2009)
Williams, D.: HBase vs Cassandra: Why We Moved. http://ria101.wordpress.com/2010/02/24/hbase-vs-cassandra-why-we-moved. 24 February 2010
Yu, H., Vahdat, A.: Design and evaluation of a conit-based continuous consistency model for replicated services. ACM Trans. Comput. Syst. 20(3), 239–282 (2002)
Yu, H., Vahdat, A.: The costs and limits of availability for replicated services. ACM Trans. Comput. Syst. 24(1), 70–113 (2006)
Zellag, K., Kemme, B.: How consistent is your cloud application? In: SOCC (2012)
Zhang, C., Zhang, Z.: Trading replication consistency for performance and availability: an adaptive approach. In: ICDCS, pp. 687–695 (2003)
Acknowledgments
The authors would like to thank Alex Feinberg and Coda Hale for their cooperation in providing real-world distributions for experiments and for exemplifying positive industrial–academic relations through their conduct and feedback. The authors would also like to thank the following individuals whose discussions and feedback improved this work: Marcos Aguilera, Peter Alvaro, Eric Brewer, Neil Conway, Aaron Davidson, Greg Durrett, Jonathan Ellis, Andy Gross, Hariyadi Gunawi, Sam Madden, Bill Marczak, Kay Ousterhout, Vern Paxson, Mark Phillips, Christopher Ré, Aviad Rubenstein, Justin Sheehy, Scott Shenker, Sriram Srinivasan, Doug Terry, Anirudh Todi, Greg Valiant, and Patrick Wendell. We would especially like to thank Bryan Kate for his extensive comments and Ali Ghodsi, who, in addition to providing feedback, originally piqued our interest in theoretical quorum systems. This work was supported by gifts from Google, SAP, Amazon Web Services, Blue Goji, Cloudera, Ericsson, General Electric, Hewlett Packard, Huawei, IBM, Intel, MarkLogic, Microsoft, NEC Labs, NetApp, NTT Multimedia Communications Laboratories, Oracle, Quanta, Splunk, and VMware. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant DGE 1106400, National Science Foundation Grants IIS-0713661, CNS-0722077 and IIS-0803690, NSF CISE Expeditions award CCF-1139158, the Air Force Office of Scientific Research Grant FA95500810352, and by DARPA contract FA865011C7136.
Cite this article
Bailis, P., Venkataraman, S., Franklin, M.J. et al. Quantifying eventual consistency with PBS. The VLDB Journal 23, 279–302 (2014). https://doi.org/10.1007/s00778-013-0330-1
DOI: https://doi.org/10.1007/s00778-013-0330-1