Abstract
Hummingbird participated in the Bulgarian and Hungarian monolingual information retrieval tasks of the Ad-Hoc Track of the Cross-Language Evaluation Forum (CLEF) 2005. In the ad hoc retrieval tasks, the system was given 50 natural language queries, and the goal was to find all of the relevant documents (with high precision) in a particular document set. We conducted diagnostic experiments with different techniques for matching word variations and handling stopwords. We found that the experimental stemmers significantly increased mean average precision for both languages. Analysis of individual topics found that the algorithmic Bulgarian and Hungarian stemmers encountered some unanticipated stopword collisions. A comparison to an experimental 4-gram technique suggested that Hungarian stemming would further benefit from decompounding.
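For readers unfamiliar with the measure, the following is a minimal sketch of how mean average precision is conventionally computed over a set of topics. It follows the standard textbook definition and is not necessarily the exact evaluation procedure used at CLEF 2005. The function names and document identifiers are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision for one topic: the mean of the precision values
    at each rank where a relevant document is retrieved, divided by the
    total number of relevant documents for the topic."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Mean of the per-topic average precision over all judged topics.
    A topic with no retrieved documents contributes a score of 0."""
    if not qrels:
        return 0.0
    scores = [
        average_precision(runs.get(topic, []), relevant)
        for topic, relevant in qrels.items()
    ]
    return sum(scores) / len(scores)


# Hypothetical two-topic example (identifiers are illustrative only).
example_qrels = {"T1": {"d1", "d3"}, "T2": {"d9"}}
example_runs = {"T1": ["d1", "d2", "d3"], "T2": ["d7", "d9"]}
example_map = mean_average_precision(example_runs, example_qrels)
```

Under this definition, a stemmer raises the score for a topic when it moves relevant documents higher in the ranking or retrieves relevant documents that exact matching would miss.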
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tomlinson, S. (2006). Bulgarian and Hungarian Experiments with Hummingbird SearchServer™ at CLEF 2005. In: Peters, C., et al. Accessing Multilingual Information Repositories. CLEF 2005. Lecture Notes in Computer Science, vol 4022. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11878773_22
DOI: https://doi.org/10.1007/11878773_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45697-1
Online ISBN: 978-3-540-45700-8
eBook Packages: Computer Science, Computer Science (R0)