Crowdtruth validation: a new paradigm for validating algorithms that rely on image correspondences
Feature tracking and 3D surface reconstruction are key enabling techniques for computer-assisted minimally invasive surgery. One of the major bottlenecks in training and validating new algorithms is the lack of large sets of annotated images that fully capture the wide range of anatomical and scene variance seen in clinical practice. To address this issue, we propose a novel approach to obtaining large numbers of high-quality reference image annotations at low cost in an extremely short period of time.
The concept is based on outsourcing the correspondence search to a crowd of anonymous users from an online community (crowdsourcing) and comprises four stages: (1) feature detection, (2) correspondence search via crowdsourcing, (3) merging multiple annotations per feature by fitting Gaussian finite mixture models, (4) outlier removal using the result of the clustering as input for a second annotation task.
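Stage (3) can be illustrated with a minimal sketch. It assumes that each feature receives several 2D click positions from different workers and that the merged position is the mean of the heaviest mixture component. The function name `merge_annotations`, the fixed two-component isotropic model, the initialisation, and the sample clicks are all illustrative assumptions, not the configuration used in this work; selecting the number of components (for example by BIC) is omitted for brevity.

```python
import math
from statistics import median

def merge_annotations(points, iters=50, min_var=0.25):
    """Fit a 2-component isotropic Gaussian mixture to 2D clicks with EM
    and return (mean, weight) of the heaviest component.
    Hypothetical helper for illustration only."""
    n = len(points)
    # Deterministic start: component 0 at the coordinate-wise median,
    # component 1 at the click farthest from that median.
    m0 = (median(p[0] for p in points), median(p[1] for p in points))
    m1 = max(points, key=lambda p: (p[0] - m0[0]) ** 2 + (p[1] - m0[1]) ** 2)
    means = [m0, m1]
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    spread = sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 for p in points) / (2 * n)
    variances = [max(spread, min_var)] * 2
    weights = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each click.
        resp = []
        for x, y in points:
            dens = [
                w * math.exp(-((x - mx) ** 2 + (y - my) ** 2) / (2 * v))
                / (2 * math.pi * v)
                for w, (mx, my), v in zip(weights, means, variances)
            ]
            total = sum(dens)
            resp.append([d / total for d in dens] if total > 0 else [0.5, 0.5])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(2):
            nj = sum(r[j] for r in resp)
            if nj < 1e-9:
                continue  # component lost all support; keep old parameters
            mx = sum(r[j] * p[0] for r, p in zip(resp, points)) / nj
            my = sum(r[j] * p[1] for r, p in zip(resp, points)) / nj
            v = sum(
                r[j] * ((p[0] - mx) ** 2 + (p[1] - my) ** 2)
                for r, p in zip(resp, points)
            ) / (2 * nj)
            means[j] = (mx, my)
            variances[j] = max(v, min_var)  # floor avoids collapse onto one click
            weights[j] = nj / n
    best = max(range(2), key=lambda j: weights[j])
    return means[best], weights[best]

# Made-up clicks for one feature: six near (100, 50) and one stray click.
clicks = [(100.4, 50.2), (99.1, 49.6), (100.9, 50.8), (99.7, 49.3),
          (100.2, 50.5), (99.5, 50.1), (140.0, 90.0)]
merged, weight = merge_annotations(clicks)
naive_x = sum(p[0] for p in clicks) / len(clicks)
print(merged, weight, naive_x)
```

In this toy input the stray click ends up in its own component, so the merged position stays close to the main cluster, whereas the plain average of all clicks is pulled several pixels towards the stray one.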
On average, 10,000 annotations were obtained within 24 h at a cost of $100. After clustering and before outlier removal, the crowd annotations were of expert quality, with a median distance of about 1 pixel to a publicly available reference annotation. The threshold for the outlier removal task directly determines the maximum annotation error, but also the number of points removed.
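The trade-off governed by the outlier-removal threshold can be sketched as follows. The sketch assumes, purely for illustration, that the second annotation task yields for each merged point the fraction of workers who flagged it as wrong, and that a point is kept when this fraction does not exceed the threshold. The scoring scheme, the function name `sweep_thresholds`, and all numbers are hypothetical and are not taken from the study.

```python
def sweep_thresholds(points, thresholds):
    """points: list of (flag_fraction, error_px) pairs (hypothetical data).
    For each threshold, keep points whose flag_fraction <= threshold and
    report how many were removed and the largest remaining error."""
    rows = []
    for t in thresholds:
        kept = [err for frac, err in points if frac <= t]
        rows.append({
            "threshold": t,
            "removed": len(points) - len(kept),
            "max_error_px": max(kept) if kept else None,
        })
    return rows

# Made-up points: (fraction of workers flagging the point, error in pixels).
points = [(0.0, 0.6), (0.1, 0.9), (0.0, 1.2), (0.3, 2.5), (0.6, 7.8), (0.9, 31.0)]
table = sweep_thresholds(points, [0.2, 0.5, 1.0])
for row in table:
    print(row)
```

With these made-up values, a stricter threshold lowers the largest remaining error but discards more points, which is the qualitative behaviour described above.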
Our concept is a novel and effective method for fast, low-cost and highly accurate correspondence generation that could be adapted to various other applications related to large-scale data annotation in medical image computing and computer-assisted interventions.
Keywords: Crowdsourcing, Validation, Benchmarking, Endoscopy, Image correspondences, Feature tracking
This work was conducted within the setting of SFB TRR 125: Cognition-guided surgery funded by the German Research Foundation (DFG) (Projects A02 and A01). It was further sponsored by the European Social Fund of the State of Baden-Württemberg and the Klaus Tschira Foundation.