Confluently Persistent Tries for Efficient Version Control

Algorithmica

Abstract

We consider a data-structural problem motivated by version control of a hierarchical directory structure in a system like Subversion. The model is that directories and files can be moved and copied between two arbitrary versions in addition to being added or removed in an arbitrary version. Equivalently, we wish to maintain a confluently persistent trie (where internal nodes represent directories, leaves represent files, and edge labels represent path names), subject to copying a subtree between two arbitrary versions, adding a new child to an existing node, and deleting an existing subtree in an arbitrary version.

Our first data structure represents an n-node degree-Δ trie with O(1) “fingers” in each version while supporting finger movement (navigation) and modifications near the fingers (including subtree copy) in O(lg Δ) time and space per operation. This data structure is essentially a locality-sensitive version of the standard practice, path copying, which costs O(d lg Δ) time and space for modification of a node at depth d and is therefore expensive when performing many deep but nearby updates. Our second data structure supports finger movement in O(lg Δ) time and no space, while modifications take O(lg n) time and space. This data structure is substantially faster for deep updates, i.e., unbalanced tries. Both of these data structures are functional, which is a stronger property than confluent persistence. Without this stronger property, we show how both data structures can be sped up to support movement in O(lg lg Δ) time, which is essentially optimal. Along the way, we present a general technique for global rebuilding of fully persistent data structures, which is nontrivial because amortization and persistence do not usually mix. In particular, this technique improves the best previous result for fully persistent arrays and obtains the first efficient fully persistent hash table.
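To make the baseline concrete, the following is a minimal sketch of plain path copying on a functional trie, the standard practice that the abstract compares against. It is not the paper's data structure. The names `Node`, `lookup`, `insert`, `delete`, and `copy_subtree` are hypothetical, and each node stores its children in an ordinary Python dict, so copying a node costs O(Δ) here rather than the O(lg Δ) per level that the abstract's bound assumes.

```python
from typing import Dict, Optional, Sequence


class Node:
    """Trie node, treated as immutable once built (hypothetical helper)."""
    __slots__ = ("children",)

    def __init__(self, children: Optional[Dict[str, "Node"]] = None):
        self.children: Dict[str, "Node"] = dict(children or {})


def lookup(root: Node, path: Sequence[str]) -> Node:
    """Walk from the root along the edge labels in `path`."""
    node = root
    for name in path:
        node = node.children[name]
    return node


def insert(root: Node, path: Sequence[str], subtree: Node) -> Node:
    """Return a new root in which `path` leads to `subtree`.

    Only the nodes on `path` are copied (one per level, so d copies for
    depth d); every other subtree is shared with the old version.
    """
    if not path:
        return subtree
    head, rest = path[0], path[1:]
    child = root.children.get(head, Node())
    new_children = dict(root.children)  # assumption: O(Δ) dict copy
    new_children[head] = insert(child, rest, subtree)
    return Node(new_children)


def delete(root: Node, path: Sequence[str]) -> Node:
    """Return a new root without the subtree at `path` (must be non-empty)."""
    head, rest = path[0], path[1:]
    new_children = dict(root.children)
    if rest:
        new_children[head] = delete(root.children[head], rest)
    else:
        del new_children[head]
    return Node(new_children)


def copy_subtree(src_root: Node, src_path: Sequence[str],
                 dst_root: Node, dst_path: Sequence[str]) -> Node:
    """Confluent update: graft a subtree of one version into another."""
    return insert(dst_root, dst_path, lookup(src_root, src_path))


# Each version is just a root pointer; old versions remain usable.
v0 = Node()
v1 = insert(v0, ["trunk", "src", "main.c"], Node())
v2 = insert(v1, ["trunk", "doc", "README"], Node())
v3 = copy_subtree(v1, ["trunk"], v2, ["branches", "b1"])
v4 = delete(v3, ["trunk", "src"])
```

In this sketch a version is identified by its root, and the subtree copy shares the source subtree instead of duplicating it. Every update at depth d still rebuilds d nodes, which is the cost that the finger-based structures described above are designed to avoid for nearby updates.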

Author information

Correspondence to Erik D. Demaine.

Additional information

A preliminary version of this paper appears in Proceedings of the 11th Scandinavian Workshop on Algorithm Theory, LNCS 5124, July 2008, pp. 160–172.

E.D. Demaine and E. Price were supported in part by MADALGO—Center for Massive Data Algorithmics—a Center of the Danish National Research Foundation.

S. Langerman is Maître de recherches du F.R.S.-FNRS.

About this article

Cite this article

Demaine, E.D., Langerman, S. & Price, E. Confluently Persistent Tries for Efficient Version Control. Algorithmica 57, 462–483 (2010). https://doi.org/10.1007/s00453-008-9274-z

  • DOI: https://doi.org/10.1007/s00453-008-9274-z
