Evaluation of Multilingual and Multi-modal Information Retrieval

Volume 4730 of the series Lecture Notes in Computer Science, pp. 744–758

Overview of the CLEF-2006 Cross-Language Speech Retrieval Track

  • Douglas W. Oard, College of Information Studies and Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742
  • Jianqiang Wang, Department of Library and Information Studies, State University of New York at Buffalo, Buffalo, NY 14260
  • Gareth J. F. Jones, School of Computing, Dublin City University, Dublin 9
  • Ryen W. White, Microsoft Research, One Microsoft Way, Redmond, WA 98052
  • Pavel Pecina, MFF UK, Charles University, Malostranske namesti 25, Room 422, 118 00 Praha 1
  • Dagobert Soergel, College of Information Studies, University of Maryland, College Park, MD 20742
  • Xiaoli Huang, College of Information Studies, University of Maryland, College Park, MD 20742
  • Izhak Shafran, OGI School of Science & Engineering, Oregon Health & Science University, 20000 NW Walker Rd, Portland, OR 97006


The CLEF-2006 Cross-Language Speech Retrieval (CL-SR) track included two tasks: to identify topically coherent segments of English interviews in a known-boundary condition, and to identify time stamps marking the beginning of topically relevant passages in Czech interviews in an unknown-boundary condition. Five teams participated in the English evaluation, performing both monolingual and cross-language searches of speech recognition transcripts, automatically generated metadata, and manually generated metadata. Results indicate that the 2006 English evaluation topics are more challenging than those used in 2005, but that cross-language searching continued to pose no unusual challenges when compared with monolingual searches of the same collection. Three teams participated in the monolingual Czech evaluation using a new evaluation measure based on differences between system-suggested and ground truth replay start times, with results that were broadly comparable to those observed for English.
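The Czech task's evaluation measure, which credits a system according to how close its suggested replay start times fall to ground-truth start times, could be sketched roughly as follows. This is only an illustrative stand-in: the linear penalty, the 150-second tolerance, and both function names are assumptions for the sketch, not the track's actual mGAP definition, which is given in the overview paper itself.

```python
def start_time_score(suggested, truth, tolerance=150.0):
    """Credit for one suggested replay start time, in seconds.

    Hypothetical linear penalty: full credit for an exact match,
    decaying to zero once the suggestion is more than `tolerance`
    seconds from the ground-truth start.
    """
    return max(0.0, 1.0 - abs(suggested - truth) / tolerance)

def generalized_average_precision(ranked_starts, truth_starts, tolerance=150.0):
    """GAP-style score over a ranked list of suggested start times.

    Each ground-truth start is credited at most once, by the
    highest-ranked nearby suggestion; partial credit accumulates
    at each crediting rank and is averaged over the number of
    ground-truth starts.
    """
    remaining = list(truth_starts)
    gained = 0.0
    total = 0.0
    for rank, s in enumerate(ranked_starts, start=1):
        # Match this suggestion to the closest still-uncredited truth.
        best = max(remaining,
                   key=lambda t: start_time_score(s, t, tolerance),
                   default=None)
        credit = start_time_score(s, best, tolerance) if best is not None else 0.0
        if credit > 0.0:
            remaining.remove(best)
            gained += credit
            total += gained / rank
    return total / len(truth_starts) if truth_starts else 0.0
```

Under this kind of measure, a suggestion a minute away from the true start still earns partial credit, so systems are rewarded for landing near a relevant passage even when they miss its exact onset.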