An XML query engine for network-bound data

The VLDB Journal

Abstract.

XML has become the lingua franca for data exchange and integration across administrative and enterprise boundaries. Nearly all data providers are adding XML import or export capabilities, and standard XML Schemas and DTDs are being promoted for all types of data sharing. The ubiquity of XML has removed one of the major obstacles to integrating data from widely disparate sources: the heterogeneity of data formats. However, general-purpose integration of data across the wide area also requires a query processor that can query data sources on demand, receive streamed XML data from them, and combine and restructure the data into new XML output, while providing good performance for both batch-oriented and ad hoc, interactive queries. This is the goal of the Tukwila data integration system, the first system that focuses on network-bound, dynamic XML data sources. In contrast to previous approaches, which must read, parse, and often store entire XML objects before querying them, Tukwila can return query results even as the data is streaming into the system. Tukwila is built with a new system architecture that extends adaptive query processing and relational-engine techniques into the XML realm, as facilitated by a pair of operators that incrementally evaluate a query's input path expressions as data is read. In this paper, we describe the Tukwila architecture and its novel aspects, and we experimentally demonstrate that Tukwila provides better overall query performance and faster initial answers than existing systems, and has excellent scalability.
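The idea of evaluating a path expression incrementally, while the XML is still arriving, can be illustrated with a small sketch. This is not Tukwila's implementation or its operators. It is a minimal illustration that assumes Python's standard `xml.etree.ElementTree.XMLPullParser`, a hypothetical helper name `stream_matches`, and a deliberately simplified path syntax (an absolute, slash-separated list of tag names with no predicates, wildcards, or namespaces).

```python
import xml.etree.ElementTree as ET


def stream_matches(chunks, path):
    """Yield the text of each element matching `path` as soon as it closes.

    `chunks` is any iterable of XML text fragments (for example, reads from
    a network socket). `path` is a simplified absolute path such as
    "db/book/title"; this syntax is an assumption made for illustration.
    """
    target = path.split("/")
    parser = ET.XMLPullParser(events=("start", "end"))
    stack = []  # tag names from the root down to the current element

    def drain():
        # Process whatever parse events the data fed so far has produced.
        for event, elem in parser.read_events():
            if event == "start":
                stack.append(elem.tag)
            else:
                if stack == target:
                    yield (elem.text or "").strip()
                stack.pop()
                elem.clear()  # drop content that is no longer needed

    for chunk in chunks:
        parser.feed(chunk)
        yield from drain()

    # Signal end of input so any buffered data is parsed, then drain again.
    parser.close()
    yield from drain()
```

Because `stream_matches` is a generator, a caller can consume the first matching value before later chunks have been read, which mirrors the "faster initial answers" behaviour the abstract describes. A real engine would also need to handle richer path expressions and bind multiple variables per match, which this sketch does not attempt.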

Additional information

Received: December 15, 2001 / Accepted: July 1, 2002 / Published online: December 13, 2002

Supported in part by an IBM Research Fellowship.

About this article

Cite this article

Ives, Z., Halevy, A. & Weld, D. An XML query engine for network-bound data. The VLDB Journal 11, 380–402 (2002). https://doi.org/10.1007/s00778-002-0078-5

  • DOI: https://doi.org/10.1007/s00778-002-0078-5
