Versioning Version Trees: The Provenance of Actions that Affect Multiple Versions

  • David Koop
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9672)


Change-based provenance captures how an entity is constructed; it can be used not only as a record of the steps taken but also as a guide during the development of derivative or new analyses. This provenance is captured as a version tree, which stores a set of related entities and the exact changes made in deriving one from another. Version trees are generally viewed as monotonic: new nodes may be added, but none are modified or deleted. However, for a number of operations (e.g., upgrades), this constraint leads to inefficient and unintuitive new versions. To address this, we propose a version tree without monotonicity, in which nodes may be modified and new actions inserted. We also propose to track the provenance of these tree changes to ensure that past version trees are not lost. This provenance is itself change-based; it links versions of version trees by the actions that transform the trees. Thus, we continue to track every change that impacts the evolution of an entity, but the actions are split between direct edits and changes to the version tree that affect multiple entity definitions. We show how this provenance leads to more intuitive and efficient operations on workflows and how this hybrid provenance may be understood.
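The idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract gives no data model, so every name here (`VersionTree`, `add_version`, `modify_action`, `materialize`, `tree_log`) and the modeling of an entity as a list of module names are assumptions made for illustration. Only node modification is sketched; inserting new actions is omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Assumption: an entity (e.g., a workflow) is just a list of module names,
# and an action is an (operation, argument) pair.
Entity = List[str]
Action = Tuple[str, str]


def apply_action(entity: Entity, action: Action) -> Entity:
    """Apply one change to an entity and return the new entity."""
    op, arg = action
    if op == "add":
        return entity + [arg]
    if op == "delete":
        return [x for x in entity if x != arg]
    raise ValueError(f"unknown operation: {op}")


@dataclass
class VersionTree:
    # Node 0 is the root and stands for the empty entity.
    parent: Dict[int, Optional[int]] = field(default_factory=lambda: {0: None})
    action: Dict[int, Action] = field(default_factory=dict)
    # Provenance of tree changes: (node id, old action, new action).
    tree_log: List[Tuple[int, Action, Action]] = field(default_factory=list)
    next_id: int = 1

    def add_version(self, parent_id: int, action: Action) -> int:
        """Monotonic operation: append a new child version."""
        vid = self.next_id
        self.next_id += 1
        self.parent[vid] = parent_id
        self.action[vid] = action
        return vid

    def modify_action(self, vid: int, new_action: Action) -> None:
        """Non-monotonic operation: change an existing node's action.

        Every descendant of the node is affected, so the old action is
        logged to keep the past version tree recoverable.
        """
        self.tree_log.append((vid, self.action[vid], new_action))
        self.action[vid] = new_action

    def actions_at(self, tree_version: Optional[int] = None) -> Dict[int, Action]:
        """Return node actions as they were after `tree_version` tree changes."""
        actions = dict(self.action)
        if tree_version is not None:
            # Undo logged tree changes from newest to oldest.
            for vid, old, _new in reversed(self.tree_log[tree_version:]):
                actions[vid] = old
        return actions

    def materialize(self, vid: int, tree_version: Optional[int] = None) -> Entity:
        """Rebuild an entity by replaying the actions from the root to `vid`."""
        actions = self.actions_at(tree_version)
        chain = []
        while vid != 0:
            chain.append(actions[vid])
            vid = self.parent[vid]
        entity: Entity = []
        for act in reversed(chain):
            entity = apply_action(entity, act)
        return entity


# Usage: one tree change "upgrades" a shared ancestor for both branches.
t = VersionTree()
v1 = t.add_version(0, ("add", "Reader_v1"))
v2 = t.add_version(v1, ("add", "Filter"))
v3 = t.add_version(v1, ("add", "Plot"))
t.modify_action(v1, ("add", "Reader_v2"))
print(t.materialize(v2))                  # entity under the current tree
print(t.materialize(v2, tree_version=0))  # entity under the original tree
```

In this sketch, a single logged tree change updates both descendant versions at once, whereas a strictly monotonic tree would need a new upgraded node for each affected version. The log plays the role of the provenance of tree changes: replaying it backwards recovers any earlier state of the tree.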


Keywords: Provenance, Version tree, Workflows



The author thanks Juliana Freire for her suggestions and the anonymous reviewers for their helpful comments. This work was supported in part by NSF CNS-1405927.



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. University of Massachusetts Dartmouth, Dartmouth, USA
