Multimedia Tools and Applications, Volume 43, Issue 3, pp 253–274

Versioning XML-based office documents

An efficient, format-independent, merge-capable approach
  • Sebastian Rönnau
  • Uwe M. Borghoff


The ability to reliably merge independent updates of a document is a crucial prerequisite to efficient collaboration in office work. However, merge support for common office document standards like OpenDocument or OfficeOpenXML is still in its infancy. In this paper, we present a consistent versioning model for XML documents in general, including merge support. This is achieved by using context-aware fingerprints that identify edit operations and allow for conflict detection. We show how to extract tracked changes from office documents and map them onto our delta model. Experimental results indicate that our fingerprinting technique is efficient and reliable.
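The abstract does not spell out how a context-aware fingerprint is built, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes the document is flattened into a list of serialized nodes in document order, and that a fingerprint is the list of hashes of the nodes within a fixed radius of the edited node. The names `node_hash`, `context_fingerprint`, `apply_update`, and the `radius` parameter are hypothetical.

```python
import hashlib
from typing import List, Sequence


def node_hash(node: str) -> str:
    """Hash one serialized node (assumption: nodes are plain strings)."""
    return hashlib.sha256(node.encode("utf-8")).hexdigest()


def context_fingerprint(nodes: Sequence[str], index: int, radius: int = 2) -> List[str]:
    """Hashes of all nodes within `radius` positions of `index`, anchor included.

    Positions that fall outside the document are represented by the hash
    of the empty string, so fingerprints near the borders keep a fixed length.
    """
    fingerprint = []
    for i in range(index - radius, index + radius + 1):
        value = nodes[i] if 0 <= i < len(nodes) else ""
        fingerprint.append(node_hash(value))
    return fingerprint


def apply_update(
    nodes: Sequence[str],
    index: int,
    new_value: str,
    fingerprint: List[str],
    radius: int = 2,
) -> List[str]:
    """Replace nodes[index] only if its surrounding context is unchanged.

    A mismatch means another edit touched the same neighborhood, which this
    sketch reports as a conflict instead of merging silently.
    """
    if context_fingerprint(nodes, index, radius) != fingerprint:
        raise ValueError("conflict: the context of the edit has changed")
    result = list(nodes)
    result[index] = new_value
    return result


# Usage: record the fingerprint on the base version, then replay the edit
# on a version that was modified independently.
base = ["<h1>Title</h1>", "<p>a</p>", "<p>b</p>", "<p>c</p>"]
fp = context_fingerprint(base, 2, radius=1)

far_edit = ["<h1>New title</h1>", "<p>a</p>", "<p>b</p>", "<p>c</p>"]
merged = apply_update(far_edit, 2, "<p>B</p>", fp, radius=1)
```

In this sketch an independent edit outside the fingerprinted neighborhood (the heading) does not block the merge, while an edit to an adjacent node would raise a conflict. A larger radius makes the check stricter at the cost of reporting more conflicts.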


XML diff, XML patch, Fingerprint, Office applications, Version control, Document merging



The authors would like to thank their students Geraint Philipp and Maik Teupel, who showed exceptional enthusiasm when implementing parts of the tool-set presented in this paper.



Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. Universität der Bundeswehr München, Neubiberg, Germany
