Evaluating XPath Queries on XML Data Streams

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4587)

Abstract

Whenever queries have to be evaluated on XML data streams, or when the memory available for evaluating the XML data is small compared to the document, DOM-based approaches that have to load and store large parts of the document in main memory will fail. In contrast, we present an approach to evaluate XPath queries on SAX streams that supports all axes of core XPath, including the sibling axes. Starting from the XPath query, our approach generates a stack of automata that takes the SAX stream as input and produces the result of the query as an output SAX stream. An evaluation of our implementation shows that, in general, our approach needs less main memory and at the same time is faster than both Saxon and YFilter.
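
To make the idea of evaluating a path query directly on SAX events concrete, the following is a minimal sketch in Python. It is not the stack-of-automata construction of the paper: it assumes a much smaller query language (only the child axis `/` and the descendant axis `//`, with name tests or `*`, and no predicates or sibling axes) and reports the root-to-node path of each match instead of emitting an output SAX stream. The names `parse_path`, `PathMatcher`, and `evaluate` are hypothetical and chosen only for this illustration.

```python
import xml.sax


def parse_path(path):
    """Split a simplified path such as '/site//item/name' into steps.

    Each step is a pair (axis, name_test), where axis is 'child' or
    'descendant'. Predicates and other axes are not supported here.
    """
    steps = []
    i = 0
    while i < len(path):
        if path.startswith("//", i):
            axis, i = "descendant", i + 2
        elif path.startswith("/", i):
            axis, i = "child", i + 1
        else:
            raise ValueError("expected '/' or '//' at position %d" % i)
        j = i
        while j < len(path) and path[j] != "/":
            j += 1
        if j == i:
            raise ValueError("missing name test at position %d" % i)
        steps.append((axis, path[i:j]))
        i = j
    return steps


class PathMatcher(xml.sax.ContentHandler):
    """Runs a small NFA over SAX events.

    State k means "the first k steps have been matched". One set of
    active states is kept per open element, so memory grows with the
    nesting depth of the document, not with its size.
    """

    def __init__(self, path):
        super().__init__()
        self.steps = parse_path(path)
        self.state_stack = [{0}]   # states active at the document root
        self.names = []            # names of the currently open elements
        self.matches = []          # root-to-node paths of matched elements

    def startElement(self, name, attrs):
        self.names.append(name)
        final = len(self.steps)
        next_states = set()
        for k in self.state_stack[-1]:
            if k == final:
                continue  # a completed match does not extend to children
            axis, test = self.steps[k]
            if test in ("*", name):
                next_states.add(k + 1)   # this element satisfies step k
            if axis == "descendant":
                next_states.add(k)       # step k may still match deeper
        self.state_stack.append(next_states)
        if final in next_states:
            self.matches.append("/" + "/".join(self.names))

    def endElement(self, name):
        self.names.pop()
        self.state_stack.pop()


def evaluate(xml_bytes, path):
    """Parse xml_bytes as a SAX stream and return the matched paths."""
    handler = PathMatcher(path)
    xml.sax.parseString(xml_bytes, handler)
    return handler.matches
```

Calling `evaluate(document_bytes, "/site//item/name")` returns one path string per matching `name` element, in document order. Because the handler keeps only one state set per open element, it never materializes the document tree, which is the property that distinguishes stream evaluation from the DOM-based approaches mentioned above.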

Editor information

Richard Cooper, Jessie Kennedy

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Böttcher, S., Steinmetz, R. (2007). Evaluating XPath Queries on XML Data Streams. In: Cooper, R., Kennedy, J. (eds) Data Management. Data, Data Everywhere. BNCOD 2007. Lecture Notes in Computer Science, vol 4587. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73390-4_10

  • DOI: https://doi.org/10.1007/978-3-540-73390-4_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73389-8

  • Online ISBN: 978-3-540-73390-4

  • eBook Packages: Computer Science (R0)
