Searching Multi-hierarchical XML Documents: The Case of Fragmentation

Abstract

To properly encode properties of textual documents using XML, multiple markup hierarchies must be used, often leading to conflicting markup in encodings. The Text Encoding Initiative (TEI) Guidelines [1] recognize this problem and suggest a number of ways to incorporate multiple hierarchies in a single well-formed XML document. In this paper, we present a framework for processing XPath queries over multi-hierarchical XML documents represented using fragmentation, one of the TEI-suggested techniques. We define the semantics of XPath over DOM trees of fragmented XML, extend the path expression language to cover overlap in markup, and describe FragXPath, our implementation of the proposed XPath semantics over fragmented markup.
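As a rough illustration of why fragmentation complicates querying, the sketch below regroups the physical fragments of a sentence element into logical elements. It is not FragXPath and does not reproduce the semantics defined in the paper. The sample document, the helper names `logical_elements` and `text_of`, and the use of a `part` attribute with the values I (initial), M (medial), F (final), and N (not fragmented) are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a sentence <s> crosses a <line> boundary, so it is
# split into two fragments marked part="I" (initial) and part="F" (final).
DOC = """
<text>
  <line n="1"><s part="I">The storm broke</s></line>
  <line n="2"><s part="F">before dawn.</s> <s part="N">We waited.</s></line>
</text>
""".strip()


def logical_elements(root, tag):
    """Group the physical fragments of `tag` into logical elements.

    Assumes fragments of one logical element appear consecutively in
    document order and are labelled I, then zero or more M, then F.
    """
    groups, current = [], None
    for el in root.iter(tag):
        part = el.get("part", "N")
        if part in ("N", "I"):
            # Unfragmented element, or first fragment of a new one.
            current = [el]
            groups.append(current)
        else:
            # "M" or "F": continues the element that is currently open.
            if current is None:
                raise ValueError("fragment without an initial part")
            current.append(el)
        if part in ("N", "F"):
            current = None  # the logical element is now complete
    return groups


def text_of(group):
    """Concatenate the text of all fragments of one logical element."""
    return " ".join("".join(el.itertext()).strip() for el in group)


root = ET.fromstring(DOC)
groups = logical_elements(root, "s")

print(len(root.findall(".//s")))  # 3 physical <s> nodes in the DOM tree
print(len(groups))                # 2 logical sentences
print(text_of(groups[0]))         # The storm broke before dawn.
```

A plain path expression such as `.//s` counts three nodes here, although the text contains only two sentences. This gap between physical fragments and logical elements is what an XPath semantics over fragmented markup has to account for.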

The work of the first author has been supported in part by NSF grants ITR-0219924 and ITR-0325063. The work of the second author has been supported in part by NEH grant RZ-20887-02.