Abstract
For the INEX 2005 project, our XML retrieval system EXTIRP was only slightly modified from its 2004 version. For the first time, the system is completely independent of the document type of the XML documents in the collection, which justifies the term “heterogeneous” in describing our methodology. Nevertheless, the 2005 version of EXTIRP is still incomplete: it includes neither query expansion nor dynamic determination of the answer size. The latter is a serious limitation because the XCG-based metrics favour systems that can adjust the size of the answer according to its relevance to the query. Our main focus was on the CO.Focussed task of the adhoc track, although runs were also submitted for other tasks. Perhaps because of this incompleteness, the initial results bring out the characteristics of our system better than in earlier years. Even when partially stripped, EXTIRP ranks the most obvious highly relevant answers at the top better than many other systems. This relatively high precision at the top ranks comes at the cost of losing sight of marginally relevant content, which shows in some exceptionally steep curves and in rankings relative to other systems that sink from the top at low recall levels towards the bottom at higher recall levels. Supporting this observation, our runs are ranked higher with the strict quantisation than with any other quantisation function, regardless of the metric.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lehtonen, M. (2006). When a Few Highly Relevant Answers Are Enough. In: Fuhr, N., Lalmas, M., Malik, S., Kazai, G. (eds) Advances in XML Information Retrieval and Evaluation. INEX 2005. Lecture Notes in Computer Science, vol 3977. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-34963-1_22
DOI: https://doi.org/10.1007/978-3-540-34963-1_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34962-4
Online ISBN: 978-3-540-34963-1
eBook Packages: Computer Science, Computer Science (R0)