Abstract
In this paper, we examine the methodological issues involved in constructing test collections of structured documents and obtaining best entry points for the evaluation of the focussed retrieval of document components. We describe a pilot test of the proposed test collection construction methodology performed on a document collection of Shakespeare plays. In our analysis, we examine the effect of query complexity and type on overall query difficulty, the use of multiple relevance judges for each query, the problem of obtaining exhaustive relevance assessments from participants, and the method of eliciting relevance assessments and best entry points. Our findings indicate that the methodology is indeed feasible in this small-scale context, and merits further investigation.
© 2003 Springer-Verlag Berlin Heidelberg
Cite this paper
Kazai, G., Lalmas, M., Reid, J. (2003). Construction of a Test Collection for the Focussed Retrieval of Structured Documents. In: Sebastiani, F. (eds) Advances in Information Retrieval. ECIR 2003. Lecture Notes in Computer Science, vol 2633. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36618-0_7
DOI: https://doi.org/10.1007/3-540-36618-0_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-01274-0
Online ISBN: 978-3-540-36618-8
eBook Packages: Springer Book Archive