Experiences from the ImageCLEF Medical Retrieval and Annotation Tasks

  • Henning Müller
  • Jayashree Kalpathy-Cramer
  • Alba García Seco de Herrera
Chapter
Part of The Information Retrieval Series book series (INRE, volume 41)

Abstract

The medical tasks in ImageCLEF have been run every year from 2004 to 2018, and many different tasks and data sets have been used over these years. The resources created are used by many researchers well beyond the actual evaluation campaigns and allow the performance of many techniques to be compared on the same grounds and in a reproducible way. Many of the larger data sets come from the medical literature, as such images are easier to obtain and to share than clinical data; clinical data were used in a few smaller ImageCLEF challenges and are specifically marked with the disease type and anatomic region. This chapter describes the main results of the various tasks over the years, including the data, the participants, the types of tasks evaluated, and the lessons learned in organizing such tasks for the scientific community.

Acknowledgements

We would like to thank the various funding organizations that helped make ImageCLEF a reality (EU FP6 & FP7, SNF, RCSO, Google and others), as well as all the volunteer researchers who helped organize the tasks. Another big thank you goes to the data providers who ensured that medical data could be shared with the participants. A final thank you goes to all the participants who worked on the tasks, providing us with techniques to compare and with lively discussions at the ImageCLEF workshops.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Henning Müller, HES–SO Valais, Sierre, Switzerland
  • Jayashree Kalpathy-Cramer, MGH Martinos Center for Biomedical Imaging, Charlestown, USA
  • Alba García Seco de Herrera, University of Essex, Colchester, UK
