Schema-Driven Evaluation of Approximate Tree-Pattern Queries

  • Conference paper
Advances in Database Technology — EDBT 2002 (EDBT 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2287)

Abstract

We present a simple query language for XML that supports hierarchical, Boolean-connected query patterns. The interpretation of a query is founded on cost-based query transformations: the total cost of a sequence of transformations measures the similarity between the query and the data and is used to rank the results. We introduce two polynomial-time algorithms that efficiently find the best n answers to the query. The first algorithm finds all approximate results, sorts them by increasing cost, and prunes the result list after the nth entry. The second algorithm uses a structural summary of the database (the schema) to estimate the best k transformed queries, which in turn are executed against the database. We compare both approaches and show that the schema-based evaluation outperforms the pruning approach for small values of n. The pruning strategy is the better choice if n is close to the total number of approximate results for the query.
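
The contrast between the two strategies can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the (answer, cost) and (cost, query) pair representations, and the `execute` callback are all hypothetical, and the sketch assumes that every answer to a transformed query inherits that query's transformation cost.

```python
import heapq

def prune_top_n(results, n):
    """Pruning strategy (sketch): take ALL approximate results as
    (answer, cost) pairs, sort by increasing cost, cut after the nth entry."""
    return sorted(results, key=lambda pair: pair[1])[:n]

def schema_driven_top_n(transformed_queries, execute, n, k):
    """Schema-driven strategy (sketch): pick the k cheapest transformed
    queries, given as (cost, query) pairs, and execute only those.

    `execute` is a hypothetical callback that maps a query to the list of
    answers it retrieves from the database. Assumption: each answer is
    assigned the cost of the transformed query that retrieved it.
    """
    answers = []
    seen = set()
    # nsmallest returns the k lowest-cost pairs in increasing cost order.
    for cost, query in heapq.nsmallest(k, transformed_queries):
        for answer in execute(query):
            if answer in seen:
                continue  # keep only the cheapest occurrence of an answer
            seen.add(answer)
            answers.append((answer, cost))
            if len(answers) == n:
                return answers
    return answers

# Toy data (invented for illustration only).
all_results = [("d3", 4), ("d1", 0), ("d2", 2), ("d4", 7)]
transformed = [(2, "q_b"), (0, "q_a"), (7, "q_c")]
database = {"q_a": ["d1"], "q_b": ["d2", "d3"], "q_c": ["d4"]}

def run_query(query):
    """Stand-in for executing a transformed query against the database."""
    return database.get(query, [])

top_by_pruning = prune_top_n(all_results, 2)
top_by_schema = schema_driven_top_n(transformed, run_query, n=2, k=2)
```

In this toy setting both calls return the same two lowest-cost answers, but the pruning variant has to materialize every approximate result first, whereas the schema-driven variant touches only the k cheapest transformed queries. That mirrors the trade-off stated in the abstract: the second approach pays off when n is small relative to the total number of approximate results.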

This research was supported by the German Research Society, Berlin-Brandenburg Graduate School in Distributed Information Systems (DFG grant no. GRK 316).

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Schlieder, T. (2002). Schema-Driven Evaluation of Approximate Tree-Pattern Queries. In: Jensen, C.S., et al. Advances in Database Technology — EDBT 2002. EDBT 2002. Lecture Notes in Computer Science, vol 2287. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45876-X_33

  • DOI: https://doi.org/10.1007/3-540-45876-X_33

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43324-8

  • Online ISBN: 978-3-540-45876-0

  • eBook Packages: Springer Book Archive
