About Sound and Vision: CLEF Beyond Text Retrieval Tasks

  • Gareth J. F. Jones
Chapter
Part of The Information Retrieval Series book series (INRE, volume 41)

Abstract

CLEF was initiated with the intention of providing a catalyst for research in Cross-Language Information Retrieval (CLIR) and Multilingual Information Retrieval (MIR). Focusing principally on European languages, it initially provided CLIR benchmark tasks to the research community within an annual cycle of task design, conduct and reporting. While the early focus was on textual data, the emergence of technologies to enable collection, archiving and processing of multimedia content led to several initiatives which sought to address search for spoken and visual content. As with multilingual search for text, interest arose in working multilingually with multimedia content. To support research in these areas, CLEF introduced a number of tasks in multilingual search for multimedia content. While image retrieval has been the focus of the ImageCLEF task over many years, this chapter reviews tasks examining speech and video retrieval carried out within CLEF during its first 10 years, and gives an overview of related work reported at other information retrieval benchmarks.

Notes

Acknowledgements

The success of the CLEF and MediaEval tasks described in this chapter would not have been possible without the work of the task co-chairs Marcello Federico, Douglas W. Oard, Martha Larson, Maria Eskevich, Robin Aly and Roeland Ordelman.


Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. ADAPT Centre, School of Computing, Dublin City University, Dublin, Ireland
