Relevance feedback revisited: dealing with content and structure in XML documents

  • Lobna Hlaoua
  • Karen Pinel-Sauvagnat
  • Mohand Boughanem
Article

Abstract

Relevance feedback (RF) is a technique that enriches an initial query according to user feedback, with the goal of expressing the user’s needs more precisely. Some open issues arise when considering semi-structured documents such as XML documents. They relate mainly to the form of XML documents, which mix content and structure information, and to the new granularity of information: the main objective of XML retrieval is to select relevant elements in XML documents rather than whole documents. Most RF approaches proposed for XML retrieval are simple adaptations of traditional RF to this new granularity. They usually enrich queries by adding terms extracted from relevant elements instead of terms extracted from whole documents. In this article, we describe a new approach to RF that takes advantage of two sources of evidence: content and structure. We propose using query term proximity to select the terms to be added to the initial query, and using generic structures to express structural constraints. The two sources of evidence are used in different combined forms. Experiments were carried out within the INEX evaluation campaign, and the results show the effectiveness of our approaches.
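
To make the idea of proximity-based term selection concrete, here is a minimal Python sketch. It is not the article's actual scoring scheme, which the abstract does not specify. It assumes that each relevant element is available as a list of tokens, and that a candidate term earns one point each time it occurs within a fixed window of a query term. The function name `select_expansion_terms` and the parameters `window` and `top_k` are hypothetical, introduced only for illustration.

```python
from collections import Counter


def select_expansion_terms(relevant_elements, query_terms, window=2, top_k=2):
    """Rank candidate terms by how often they occur near a query term.

    relevant_elements: list of token lists, one per element judged relevant.
    window: maximum token distance to a query term (assumed parameter).
    top_k: number of expansion terms to return (assumed parameter).
    """
    query = {t.lower() for t in query_terms}
    scores = Counter()
    for tokens in relevant_elements:
        tokens = [t.lower() for t in tokens]
        query_positions = [i for i, t in enumerate(tokens) if t in query]
        for i, term in enumerate(tokens):
            if term in query:
                continue  # query terms are already in the query
            # Count this occurrence only if a query term lies within the window.
            if any(abs(i - q) <= window for q in query_positions):
                scores[term] += 1
    # Sort by descending score, then alphabetically for a deterministic result.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]


elements = [
    "xml retrieval ranks elements using structure and content".split(),
    "feedback on xml elements improves retrieval of relevant elements".split(),
]
expanded = select_expansion_terms(elements, ["xml", "retrieval"], window=2, top_k=2)
print(expanded)  # ['elements', 'feedback']
```

In this toy input, "elements" appears near a query term in both relevant elements and therefore ranks first. A realistic system would also filter stop words and weight terms by distance or frequency; those refinements are left out of this sketch.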

Keywords

XML retrieval, Relevance feedback, Structure

Copyright information

© Springer-Verlag 2010

Authors and Affiliations

  • Lobna Hlaoua (1)
  • Karen Pinel-Sauvagnat (2)
  • Mohand Boughanem (2)

  1. École Supérieure des Sciences et de Technologie de H. Sousse, H. Sousse, Tunisia
  2. IRIT-SIG, Toulouse Cedex 4, France
