Concise Server-Wide Causality Management for Eventually Consistent Data Stores

  • Ricardo GonçalvesEmail author
  • Paulo Sérgio Almeida
  • Carlos Baquero
  • Victor Fonte
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9038)


Large scale distributed data stores rely on optimistic replication to scale and remain highly available in the face of network partitions. Managing data without coordination results in eventually consistent data stores that allow for concurrent data updates. These systems often use anti-entropy mechanisms (like Merkle Trees) to detect and repair divergent data versions across nodes. However, in practice hash-based data structures are too expensive for large amounts of data and create too many false conflicts.

Another aspect of eventual consistency is detecting write conflicts. Logical clocks are often used to track data causality, necessary to detect causally concurrent writes on the same key. However, there is a non-negligible metadata overhead per key, which also keeps growing with time, proportional with the node churn rate. Another challenge is deleting keys while respecting causality: while the values can be deleted, per-key metadata cannot be permanently removed without coordination.

We introduce a new causality management framework for eventually consistent data stores, that leverages node logical clocks (Bitmapped Version Vectors) and a new key logical clock (Dotted Causal Container) to provides advantages on multiple fronts: 1) a new efficient and lightweight anti-entropy mechanism; 2) greatly reduced per-key causality metadata size; 3) accurate key deletes without permanent metadata.


Distributed Systems Key-Value Stores Eventual Consistency Causality Logical Clocks Anti-Entropy 


  1. 1.
    Almeida, P.S., Baquero, C., Gonçalves, R., Preguiça, N., Fonte, V.: Scalable and accurate causality tracking for eventually consistent stores. In: Magoutis, K., Pietzuch, P. (eds.) DAIS 2014. LNCS, vol. 8460, pp. 67–81. Springer, Heidelberg (2014)CrossRefGoogle Scholar
  2. 2.
    DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Vosshall, P., Vogels, W.: Dynamo: amazon’s highly available key-value store. In: ACM SIGOPS Operating Systems Review, vol. 41, pp. 205–220 (2007)Google Scholar
  3. 3.
    Gilbert, S., Lynch, N.: Brewer’s conjecture and the feasibility of consistent, available, partition-tolerant web services. ACM SIGACT News 33(2), 51–59 (2002)CrossRefGoogle Scholar
  4. 4.
    Golding, R.A.: Weak-consistency group communication and membership. Ph.D. thesis, University of California Santa Cruz (1992)Google Scholar
  5. 5.
    Johnson, P.R., Thomas, R.H.: The maintenance of duplicate databases. Internet Request for Comments RFC 677, Information Sciences Institute (1976)Google Scholar
  6. 6.
    Karger, D., Lehman, E., Leighton, T., Panigrahy, R., Levine, M., Lewin, D.: Consistent hashing and random trees. In: Proceedings of the Twenty-Ninth Annual ACM Symposium on Theory of Computing, pp. 654–663. ACM (1997)Google Scholar
  7. 7.
    Klophaus, R.: Riak core: building distributed applications without shared state. In: ACM SIGPLAN Commercial Users of Functional Programming. ACM (2010)Google Scholar
  8. 8.
    Ladin, R., Liskov, B., Shrira, L., Ghemawat, S.: Providing high availability using lazy replication. ACM Trans. Comput. Syst. 10(4), 360–391 (1992)CrossRefGoogle Scholar
  9. 9.
    Lakshman, A., Malik, P.: Cassandra: a decentralized structured storage system. ACM SIGOPS Operating Systems Review 44(2), 35–40 (2010)CrossRefGoogle Scholar
  10. 10.
    Lamport, L.: Time, clocks, and the ordering of events in a distributed system. Communications of the ACM 21(7), 558–565 (1978)CrossRefzbMATHGoogle Scholar
  11. 11.
    Malkhi, D., Terry, D.: Concise version vectors in winFS. In: Fraigniaud, P. (ed.) DISC 2005. LNCS, vol. 3724, pp. 339–353. Springer, Heidelberg (2005)CrossRefGoogle Scholar
  12. 12.
    Merkle, R.C.: A certified digital signature. In: Brassard, G. (ed.) Advances in Cryptology - CRYPTO 1989. LNCS, vol. 435, pp. 218–238. Springer, Heidelberg (1990),
  13. 13.
    Parker Jr., D.S., Popek, G., Rudisin, G., Stoughton, A., Walker, B., Walton, E., Chow, J., Edwards, D.: Detection of mutual inconsistency in distributed systems. IEEE Transactions on Software Engineering, 240–247 (1983)Google Scholar
  14. 14.
    Schwarz, R., Mattern, F.: Detecting causal relationships in distributed computations: In search of the holy grail. Distributed Computing 7(3), 149–174 (1994)CrossRefzbMATHGoogle Scholar
  15. 15.
    Wuu, G., Bernstein, A.: Efficient solutions to the replicated log and dictionary problems. In: Symp. on Principles of Dist. Comp. (PODC), pp. 233–242 (1984)Google Scholar

Copyright information

© IFIP International Federation for Information Processing 2015

Authors and Affiliations

  • Ricardo Gonçalves
    • 1
    Email author
  • Paulo Sérgio Almeida
    • 1
  • Carlos Baquero
    • 1
  • Victor Fonte
    • 1
  1. 1.HASLab, INESC Tec and Universidade do MinhoBragaPortugal

Personalised recommendations