Accessing Multilingual Information Repositories

Volume 4022 of the series Lecture Notes in Computer Science pp 783-791

UNED@CL-SR CLEF 2005: Mixing Different Strategies to Retrieve Automatic Speech Transcriptions

  • Fernando López-Ostenero, NLP Group, ETSI Informática, UNED
  • Víctor Peinado, NLP Group, ETSI Informática, UNED
  • Valentín Sama, NLP Group, ETSI Informática, UNED
  • Felisa Verdejo, NLP Group, ETSI Informática, UNED


In this paper we describe UNED’s participation in the CLEF CL-SR 2005 track. First, we explain how we tried several strategies to clean up the automatic transcriptions. Then, we describe how we performed 84 different runs mixing these strategies with named entity recognition and different pseudo-relevance feedback approaches, in order to study the influence of each method on the retrieval process, in both monolingual and cross-lingual environments. Named entity recognition had a greater influence in the cross-lingual environment, where MAP scores doubled when an entity recognizer was used. The best pseudo-relevance feedback approach was the one using manual keywords. The different cleaning strategies had very similar effects, except for character 3-grams, which obtained poor scores compared with the other approaches.
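
As an illustration of two notions mentioned in the abstract, the following minimal sketch shows one way to split a transcription into character 3-grams and one way to compute mean average precision (MAP) over a set of topics. It is not the authors' implementation: the function names, the choice to keep n-grams inside word boundaries, the handling of short tokens, and the toy document identifiers are all assumptions made for the example.

```python
from typing import Dict, List, Set


def char_ngrams(text: str, n: int = 3) -> List[str]:
    """Split each whitespace-delimited token into overlapping character n-grams.

    Assumption: n-grams do not cross word boundaries, and tokens of
    length n or less are kept whole. The abstract specifies neither.
    """
    grams: List[str] = []
    for token in text.lower().split():
        if len(token) <= n:
            grams.append(token)
        else:
            grams.extend(token[i:i + n] for i in range(len(token) - n + 1))
    return grams


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against a set of relevant ids."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Mean of the per-topic average precision over all topics in `runs`."""
    if not runs:
        return 0.0
    total = sum(
        average_precision(ranked, qrels.get(topic, set()))
        for topic, ranked in runs.items()
    )
    return total / len(runs)


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    print(char_ngrams("Speech to text"))
    runs = {"q1": ["d1", "d2", "d3", "d4"], "q2": ["d5", "d6"]}
    qrels = {"q1": {"d1", "d3"}, "q2": {"d6"}}
    print(round(mean_average_precision(runs, qrels), 4))
```

Under these assumptions a 3-gram index breaks every word of a transcription into short overlapping fragments, so misrecognized words can still share fragments with query terms, at the cost of many more spurious matches. MAP then summarizes, in a single number per run, how early the relevant segments appear in each topic's ranking.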