Journal of Intelligent Information Systems, Volume 32, Issue 2, pp 139–162

A coherent query language for XML

Abstract

Text search engines are inadequate for indexing and searching XML documents because they ignore metadata and aggregation structure implicit in the XML documents. On the other hand, the query languages supported by specialized XML search engines are very complex. In this paper, we present a simple yet flexible query language, and develop its semantics to enable intuitively appealing extraction of relevant fragments of information while simultaneously falling back on retrieval through plain text search if necessary. Our approach combines and generalizes several available techniques to obtain precise and coherent results.

Keywords

Query languages; XML/RDF; Information search and retrieval; Indexing and search

Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  • Krishnaprasad Thirunarayan (1)
  • Trivikram Immaneni (1)
  1. Metadata and Languages Laboratory, Department of Computer Science and Engineering, Wright State University, Dayton, USA