An Analysis of Real-World XML Queries

  • Conference paper
On the Move to Meaningful Internet Systems: OTM 2016 Conferences (OTM 2016)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 10033)

Abstract

The aim of our research was to gather a representative set of real-world XQuery queries and analyze it in order to confirm or refute various hypotheses about query complexity. The data were gathered using a modified crawler, then cleaned, corrected, and validated. The main subject of the analysis was the usage of XQuery grammar symbols. We also analyzed the XML documents referenced from the XQuery queries as well as their outputs. To the best of our knowledge, this is the first analysis of this kind and extent.

Supported by the grant SVV-2016-260331.

Notes

  1. Due to space limitations we refer the interested reader to [15].

  2. Note that the Google search engine imposes limits: a results page shows at most 100 results, and at most 10 such pages are available, so a single search query yields at most 1,000 URLs.

  3. https://metavo.metacentrum.cz/ – MetaCentrum Virtual Organization.

  4. Due to space limitations, the list of all XQuery grammar symbols and their descriptions is provided in [15]. However, their names are mostly self-explanatory.

References

  1. Common Crawl. https://commoncrawl.org/

  2. XMark - An XML Benchmark Project. http://www.xml-benchmark.org/

  3. XML Query Test Suite 1.0. https://dev.w3.org/2011/QT3-test-suite/

  4. Afanasiev, L., Marx, M.: An analysis of XQuery benchmarks. Inf. Syst. 33(2), 155–181 (2008)

  5. Arias, M., Fernández, J.D., Martínez-Prieto, M.A., de la Fuente, P.: An Empirical Study of Real-World SPARQL Queries. CoRR, abs/1103.5043 (2011)

  6. Barbieri, D.F., Braga, D., Ceri, S., Della Valle, E., Grossniklaus, M.: Continuous queries and real-time analysis of social semantic data with C-SPARQL. In: SDoW2009, vol. 520 of CEUR Workshop Proceedings. CEUR-WS.org (2009)

  7. Boag, S., Chamberlin, D., Fernandez, M.F., Florescu, D., Robie, J., Simeon, J.: XQuery 1.0: An XML Query Language (2010). http://www.w3.org/TR/xquery/

  8. Bonifati, A., Ceri, S.: Comparative analysis of five XML query languages. SIGMOD Rec. 29, 2000 (2000)

  9. Bray, T., Paoli, J., Sperberg-McQueen, C.M., Maler, E., Yergeau, F.: Extensible Markup Language (XML) 1.0 (2008). http://www.w3.org/TR/xml/

  10. Chamberlin, D., Fankhauser, P., Florescu, D., Marchiori, M., Robie, J.: XML Query Use Cases (2007). http://www.w3.org/TR/xquery-use-cases/

  11. Clark, J., DeRose, S.: XML Path Language (XPath) Version 1.0, November 1999. http://www.w3.org/TR/xpath/

  12. Ganjisaffar, Y.: Crawler4j v. 3.5. http://code.google.com/p/crawler4j/

  13. Groppe, S., Groppe, J., Klein, N., Bettentrupp, R., Böttcher, S., Gruenwald, L.: Transforming XSLT stylesheets into XQuery expressions and vice versa. Comput. Lang. Syst. Struct. 37(2), 76–111 (2011)

  14. Halfond, W.G.J., Orso, A.: Combining static analysis and runtime monitoring to counter SQL-injection attacks. In: WODA 2005, pp. 1–7. ACM, New York, NY, USA (2005)

  15. Hlista, P.: Analysis of Real-World XML Queries. Master thesis, Charles University in Prague, Czech Republic (2016). https://is.cuni.cz/webapps/zzp/detail/173957/?lang=en

  16. Kratky, M., Kosek, J., Snasel, V.: Struktura realnych XML dokumentu a metody indexovani. In: ITAT: Workshop on Information Technologies Applications and Theory. High Tatras, Slovakia (2003)

  17. Manegold, S.: An empirical evaluation of XQuery processors. Inf. Syst. 33(2), 203–220 (2008)

  18. Masicek, V.: XSLT Benchmarking. Master thesis, Charles University in Prague, Czech Republic (2012)

  19. Mlynkova, I., Toman, K., Pokorny, J.: Statistical Analysis of Real XML Data Collections. Technical report, Charles University, Prague, Czech Republic (2006)

  20. Robie, J., Chamberlin, D., Dyck, M., Florescu, D., Melton, J., Simeon, J.: XQuery Update Facility 1.0 (2011). http://www.w3.org/TR/xquery-update-10/

  21. Schejbal, J.: A System for Analysis of Collections of XML Queries. Master thesis, Charles University in Prague, Czech Republic (2010)

  22. Schejbal, J., Sochna, J., Starka, J., Svoboda, M., Mlynkova, I.: Analyzer - A Tool for Batch File Analysis 1.0. http://analyzer.kenai.com/

Corresponding author

Correspondence to Irena Holubová.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Hlísta, P., Holubová, I. (2016). An Analysis of Real-World XML Queries. In: Debruyne, C., et al. On the Move to Meaningful Internet Systems: OTM 2016 Conferences. OTM 2016. Lecture Notes in Computer Science, vol 10033. Springer, Cham. https://doi.org/10.1007/978-3-319-48472-3_36

  • DOI: https://doi.org/10.1007/978-3-319-48472-3_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-48471-6

  • Online ISBN: 978-3-319-48472-3

  • eBook Packages: Computer Science, Computer Science (R0)
