Feedback-Driven Structural Query Expansion for Ranked Retrieval of XML Data

  • Ralf Schenkel
  • Martin Theobald
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3896)

Abstract

Relevance feedback is an important way to enhance retrieval quality by integrating relevance information provided by a user. In XML retrieval, feedback engines usually generate an expanded query from the content of elements marked as relevant or nonrelevant. This approach, inspired by text-based IR, completely ignores the semistructured nature of XML. This paper makes the important step from content-based to structural feedback. It presents an integrated solution for expanding keyword queries with new content, path, and document constraints. An extensible framework evaluates such query conditions with existing keyword-based XML search engines while making it easy to integrate new dimensions of feedback. Extensive experiments with the established INEX benchmark show the feasibility of our approach.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Ralf Schenkel (1)
  • Martin Theobald (1)
  1. Max-Planck-Institut für Informatik, Saarbrücken, Germany
