Data Fusion for Effective European Monolingual Information Retrieval

  • Conference paper
Multilingual Information Access for Text, Speech and Images (CLEF 2004)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3491)

Abstract

For our fourth participation in the CLEF evaluation campaigns, our first objective was to propose an effective and general stopword list and a light stemming procedure for the Portuguese language. Our second objective was to obtain a better picture of the relative merit of various search engines when processing documents in the Finnish and Russian languages. Finally, based on the Z-score method, we suggested a data fusion strategy intended to improve monolingual searches in various European languages.
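To make the idea of Z-score data fusion concrete, here is a minimal sketch in Python. It is not taken from the paper: the abstract does not give the formula, so the details are assumptions. Each result list is standardized as (score - mean) / standard deviation, then shifted so that the lowest retrieved document in that list scores zero, and the normalized scores are summed per document. The function names `z_score_normalize` and `fuse`, the optional per-list `weights`, and the toy scores are all hypothetical.

```python
from statistics import mean, pstdev


def z_score_normalize(run):
    """Standardize one result list (doc id -> raw score).

    Assumption: scores are shifted so the lowest retrieved document
    gets 0, which keeps unretrieved documents (implicitly 0) from
    outranking retrieved ones. The paper may define this differently.
    """
    scores = list(run.values())
    mu = mean(scores)
    sigma = pstdev(scores)  # population standard deviation
    if sigma == 0:
        # All scores identical: no ranking information in this list.
        return {doc: 0.0 for doc in run}
    shift = (mu - min(scores)) / sigma
    return {doc: (s - mu) / sigma + shift for doc, s in run.items()}


def fuse(runs, weights=None):
    """Sum normalized scores across result lists and rank documents."""
    if weights is None:
        weights = [1.0] * len(runs)  # hypothetical: equal trust in each list
    fused = {}
    for run, w in zip(runs, weights):
        for doc, z in z_score_normalize(run).items():
            fused[doc] = fused.get(doc, 0.0) + w * z
    # Highest fused score first.
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Toy result lists from two hypothetical search models with different score scales.
run_a = {"d1": 10.0, "d2": 8.0, "d3": 2.0}
run_b = {"d2": 0.9, "d3": 0.5, "d4": 0.1}

ranking = fuse([run_a, run_b])
for doc, score in ranking:
    print(doc, round(score, 3))
```

Because each list is rescaled by its own mean and standard deviation, a model that produces scores in the range 0 to 1 can be combined with one that produces scores in the range 0 to 10 without either dominating the sum purely because of its scale.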

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Savoy, J. (2005). Data Fusion for Effective European Monolingual Information Retrieval. In: Peters, C., Clough, P., Gonzalo, J., Jones, G.J.F., Kluck, M., Magnini, B. (eds) Multilingual Information Access for Text, Speech and Images. CLEF 2004. Lecture Notes in Computer Science, vol 3491. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11519645_24

  • DOI: https://doi.org/10.1007/11519645_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-27420-9

  • Online ISBN: 978-3-540-32051-7

  • eBook Packages: Computer Science (R0)
