Abstract
There has been growing interest in applying human computation – particularly crowdsourcing techniques – to assist in solving multimedia, image processing, and computer vision problems that are still too difficult for fully automatic algorithms, yet relatively easy for humans. In this paper we focus on a specific problem – object segmentation within color images – and compare different solutions that combine color image segmentation algorithms with human effort, either in the form of an explicit interactive segmentation task or through the implicit collection of valuable human traces with a game. We use Click’n’Cut, a friendly, web-based, interactive segmentation tool that allows segmentation tasks to be assigned to many users, and Ask’nSeek, a game with a purpose designed for object detection and segmentation. The two main contributions of this paper are: (i) we use the results of Click’n’Cut campaigns with different groups of users to examine and quantify the crowdsourcing loss, i.e., the drop in quality incurred when an interactive segmentation task is assigned to paid crowd-workers rather than to computer vision experts performing the same tasks; (ii) since interactive segmentation tasks are inherently tedious and prone to fatigue, we compare the quality of the results obtained with Click’n’Cut against those obtained with a (fun, interactive, and potentially less tedious) game designed for the same purpose. We call this contribution the assessment of the gamification loss, since it refers to how much segmentation quality may be lost when switching to a game-based approach to the same task.
We demonstrate that the crowdsourcing loss is significant when all data points from workers are used, but decreases substantially – becoming comparable to the quality achieved by expert users on similar tasks – after a modest amount of data analysis that filters out users whose data are clearly not useful. We also show that, on the other hand, the gamification loss is significantly more severe: the quality of the results drops roughly by half when switching from a focused (yet tedious) task to a more fun and relaxed game environment.
Acknowledgments
This work has been developed in the framework of the project TEC2013-43935-R, financed by the Spanish Ministerio de Economía y Competitividad and the European Regional Development Fund (ERDF).
Appendix A: Images used in our experiments
We used five images taken from the PASCAL VOC dataset (see Fig. 15) as the gold standard in our experiments. The textual descriptions provided to the users during our experiments were:
- Extract the woman from the image. Include her hair, her clothes, and the part of her arm that holds the bottle.
- Extract the cat from the image. Try to discard the dog’s paw lying on the cat.
- Extract the woman from the image. Include her hair.
- Extract the bus from the image. Do not include the mirrors on the front of the bus.
- Extract the cat from the image.
Carlier, A., Salvador, A., Cabezas, F. et al. Assessment of crowdsourcing and gamification loss in user-assisted object segmentation. Multimed Tools Appl 75, 15901–15928 (2016). https://doi.org/10.1007/s11042-015-2897-6