Tree Pattern Relaxation

  • Sihem Amer-Yahia
  • SungRan Cho
  • Divesh Srivastava
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2287)

Abstract

Tree patterns are fundamental to querying tree-structured data like XML. Because of the heterogeneity of XML data, it is often more appropriate to permit approximate query matching and return ranked answers, in the spirit of Information Retrieval, than to return only exact answers. In this paper, we study the problem of approximate XML query matching, based on tree pattern relaxations, and devise efficient algorithms to evaluate relaxed tree patterns. We consider weighted tree patterns, where exact and relaxed weights, associated with nodes and edges of the tree pattern, are used to compute the scores of query answers. We are interested in the problem of finding answers whose scores are at least as large as a given threshold. We design data pruning algorithms where intermediate query results are filtered dynamically during the evaluation process. We develop an optimization that exploits scores of intermediate results to improve query evaluation efficiency. Finally, we show experimentally that our techniques outperform rewriting-based and post-pruning strategies.

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    S. Al-Khalifa, H. V. Jagadish, N. Koudas, J. M. Patel, D. Srivastava, and Y. Wu. Structural joins: Efficient matching of XML query patterns. In Proceedings of ICDE, 2002.Google Scholar
  2. 2.
    S. Boag, D. Chamberlin, M. Fernandez, D. Florescu, J. Robie, J. Simeon, and M. Stefanescu. XQuery 1.0: An XML query language. http://www.w3.org/TR/xquery.
  3. 3.
    Y. Chiaramella, P. Mulhem, and F. Fourel. A model for multimedia information retrieval. Technical report, FERMI ESPRIT BRA 8134, University of Glasgow. www.dcs.gla.ac.uk/fermi/tech_reports/reports/fermi96-4.ps.gz.
  4. 5.
    C. Delobel and M. C. Rousset. A uniform approach for querying large treestructured data through a mediated schema. In Proceedings of International Workshop on Foundations of Models for Information Integration (FMII), 2001.Google Scholar
  5. 6.
    A. Deutch, M. Fernandez, D. Florescu, A. Levy, and D. Suciu. A query language for XML. In Proceedings of WWW, 1999.Google Scholar
  6. 7.
    C. Faloutsos. Access methods for text. ACM Computing Surveys, 17(1), 1985.Google Scholar
  7. 8.
    N. Fuhr and K. Grossjohann. XIRQL: An extension of XQL for information retrieval. In Proceedings of SIGIR, 2001.Google Scholar
  8. 9.
    Y. Hayashi, J. Tomita, and G. Kikui. Searching text-rich XML documents with relevance ranking. In Proceedings of SIGIR Workshop on XML and Information Retrieval, 2000.Google Scholar
  9. 10.
    Y. Kanza and Y. Sagiv. Flexible queries over semistructured data. In Proceedings of PODS, 2001.Google Scholar
  10. 11.
    J. McHugh, S Abiteboul, R. Goldman, D. Quass, and J. Widom. Lore: A database management system for semistructured data. SIGMOD Record, 26(3):45–66, 1997.CrossRefGoogle Scholar
  11. 12.
    S. Myaeng, D.-H. Jang, M.-S. Kim, and Z.-C. Zhoo. A flexible Model for retrieval of SGML documents. In Proceedings of SIGIR, 1998.Google Scholar
  12. 13.
    M. Persin. Document filtering for fast ranking. In Proceedings of SIGIR, 1994.Google Scholar
  13. 14.
    J. Robie, J. Lapp, and D. Schach. XML query language (XQL). Available from http://www.w3.org/TandS/QL/QL98/pp/xql.html.
  14. 15.
    G. Salton and M. J. McGill. Introduction to modern information retrieval. McGraw-Hill, New York, 1983.MATHGoogle Scholar
  15. 16.
    T. Schlieder. Similarity search in XML data using cost-based query Transformations. In Proceedings of SIGMOD WebDB Workshop, 2001.Google Scholar
  16. 17.
    A. Theobald and G. Weikum. Adding relevance to XML. In Proceedings of SIGMOD WebDB Workshop, 2000.Google Scholar
  17. 18.
    H. Turtle and J. Flood. Query evaluation: Strategies and optimization. Information Processing & Management, Nov. 1995.Google Scholar
  18. 19.
    C. Zhang, J. Naughton, D. DeWitt, Q. Luo, and G. Lohman. On supporting containment queries in relational database management systems. In Proceedings of SIGMOD, 2001.Google Scholar
  19. 20.
    K. Zhang and D. Shasha. Tree pattern matching. Pattern Matching Algorithms, Apostolico and Galil (Eds.), Oxford University Press, 1997.Google Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Sihem Amer-Yahia
    • 1
  • SungRan Cho
    • 2
  • Divesh Srivastava
    • 1
  1. 1.AT&T Labs-ResearchFlorham ParkUSA
  2. 2.Stevens Institute of TechnologyHobokenUSA

Personalised recommendations