DTD-Diff: A Change Detection Algorithm for DTDs

  • Erwin Leonardi
  • Tran T. Hoai
  • Sourav S. Bhowmick
  • Sanjay Madria
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3882)


The DTD of a set of XML documents may change for many reasons, such as changes in real-world events, changes to the user's requirements, and mistakes in the initial design. In this paper, we present a novel algorithm called DTD-Diff that detects changes to DTDs, which define the structure of a set of XML documents. Such a change detection tool can be useful in several ways, such as maintenance of XML documents, incremental maintenance of relational schemas for storing XML data, and XML schema integration. We compare DTD-Diff with existing XML change detection approaches and show that converting a DTD to XML Schema (XSD), which is itself in XML document format, and then detecting the changes with existing XML change detection algorithms is not a feasible option. Our experimental results show that DTD-Diff is 5–325 times faster than X-Diff when the latter detects the changes to the XSD files. We also study the result quality of the detected deltas.
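To make the notion of a DTD delta concrete, the following is a minimal sketch of a naive comparison of element type declarations between two DTD versions. It is not the DTD-Diff algorithm, whose details are not given in this abstract; the function names `element_declarations` and `naive_dtd_delta`, the regular expression, and the sample DTDs are all assumptions made for illustration, and the sketch ignores attributes, entities, and move operations.

```python
import re

# Assumption: declarations have the simple form <!ELEMENT name content-model>
ELEMENT_DECL = re.compile(r"<!ELEMENT\s+([^\s>]+)\s+([^>]+)>")

def element_declarations(dtd_text):
    """Map each declared element name to its content model, whitespace removed."""
    return {name: "".join(model.split())
            for name, model in ELEMENT_DECL.findall(dtd_text)}

def naive_dtd_delta(old_dtd, new_dtd):
    """Report element types that were inserted, deleted, or given a new content model."""
    old = element_declarations(old_dtd)
    new = element_declarations(new_dtd)
    return {
        "inserted": sorted(new.keys() - old.keys()),
        "deleted": sorted(old.keys() - new.keys()),
        "updated": sorted(n for n in old.keys() & new.keys() if old[n] != new[n]),
    }

# Hypothetical sample DTDs, invented for this example
old_dtd = """
<!ELEMENT book (title, author+)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
"""
new_dtd = """
<!ELEMENT book (title, author+, year?)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT year (#PCDATA)>
"""

delta = naive_dtd_delta(old_dtd, new_dtd)
print(delta)
```

A textual comparison like this treats any rewritten content model as an update, so it cannot tell a reordering from a genuine structural change; this is one reason a dedicated algorithm is of interest.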


Keywords: Change Detection, Element Type, Content Tree, Bipartite Match, Move Operation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




  1. Rivest, R.L.: The MD5 Message Digest Algorithm. Internet RFC 1321 (April 1992), http://www.faqs.org/rfcs/rfc1321.html
  2. UW XML Repository. Database Research Group, University of Washington, http://www.cs.washington.edu/research/xmldatasets/
  3. XML Schema. World Wide Web Consortium, http://www.w3.org/XML/Schema
  4. XML.ORG Registry and Repository for XML Schemas, http://www.xml.org/xml/registry.jsp
  5. Choi, B.: What are real DTDs like? In: WebDB (2002)
  6. Cobena, G., Abiteboul, S., Marian, A.: Detecting Changes in XML Documents. In: ICDE (2002)
  7. Leonardi, E., Bhowmick, S.S.: Detecting Changes on XML Documents Using Relational Databases: A Schema-Conscious Approach. In: ACM CIKM (2005)
  8. Leonardi, E., Hoai, T.T., Bhowmick, S.S., Madria, S.: DTD-Diff: A Change Detection Algorithm for DTDs. Technical Report, Center for Advanced Information System, Nanyang Technological University, Singapore (2005)
  9. Shanmugasundaram, J., Tufte, K., Zhang, C., He, G., DeWitt, D.J., Naughton, J.F.: Relational Databases for Querying XML Documents: Limitations and Opportunities. In: VLDB (1999)
  10. Wang, Y., DeWitt, D.J., Cai, J.: X-Diff: An Effective Change Detection Algorithm for XML Documents. In: ICDE (2003)

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Erwin Leonardi (1)
  • Tran T. Hoai (1)
  • Sourav S. Bhowmick (1)
  • Sanjay Madria (2)

  1. School of Computer Engineering, Nanyang Technological University, Singapore
  2. Department of Computer Science, University of Missouri-Rolla, USA
