Monolingual Retrieval Experiments with a Domain-Specific Document Corpus at the Chemnitz University of Technology

  • Jens Kürsten
  • Maximilian Eibl
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4730)

Abstract

This article describes the first participation of the Chair Media Informatics of the Chemnitz University of Technology in the Cross Language Evaluation Forum (CLEF). It introduces an experimental prototype that implements several methods for optimizing search results. The configuration of the prototype is tested with the CLEF training data. The results of the Domain-Specific Monolingual German task suggest that combining suffix-stripping stemming with decompounding is very useful. A local document clustering (LDC) approach, used to improve query expansion (QE) based on pseudo-relevance feedback (PRF), also appears to be beneficial. However, the evaluation of the English task with the same configuration suggests that the quality of the results is highly language dependent.

Keywords

Evaluation, Pseudo-Relevance Feedback, Local Clustering, Data Fusion

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Jens Kürsten (1)
  • Maximilian Eibl (1)
  1. Chemnitz University of Technology, Faculty of Computer Science, Chair Media Informatics, Straße der Nationen 62, 09107 Chemnitz, Germany