Abstract
Image annotation is the process of assigning metadata to images, allowing effective retrieval by text-based search techniques. Despite considerable effort in automatic multimedia analysis, automatic semantic annotation of multimedia is still inefficient because of the difficulty of modeling high-level semantic terms. In this paper, we examine the factors affecting the quality of annotations collected through crowdsourcing platforms. An image dataset was manually annotated using: (1) a vocabulary consisting of a preselected set of keywords, (2) a hierarchical vocabulary and (3) free keywords. The results show that annotation quality is affected by the image content itself and by the lexicon used. As expected, annotation using the hierarchical vocabulary is more representative, while the use of free keywords leads to more invalid annotations. Finally, it is shown that images requiring annotations that are not directly related to their content (i.e., annotation using abstract concepts) lead to increased annotator inconsistency, revealing that the difficulty of annotating such images is not limited to automatic annotation but is a generic problem of annotation.
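The abstract does not say how annotator inconsistency is quantified. As an illustration only, the sketch below computes a free-marginal multirater kappa, one common chance-corrected agreement statistic for several annotators. The function name, the count-matrix layout and the toy data are assumptions made for this example; they are not taken from the paper.

```python
from typing import Sequence


def free_marginal_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Free-marginal multirater kappa (illustrative sketch).

    counts[i][j] is the number of annotators who assigned item i to
    category j. Every item must be rated by the same number of annotators.
    """
    n_items = len(counts)
    if n_items == 0:
        raise ValueError("at least one item is required")
    n_categories = len(counts[0])
    n_raters = sum(counts[0])
    if n_categories < 2 or n_raters < 2:
        raise ValueError("need at least two categories and two annotators")
    for row in counts:
        if len(row) != n_categories or sum(row) != n_raters:
            raise ValueError("every item needs the same categories and annotators")

    # Observed agreement: share of annotator pairs that agree, over all items.
    agreeing_pairs = sum(c * (c - 1) for row in counts for c in row)
    observed = agreeing_pairs / (n_items * n_raters * (n_raters - 1))

    # Chance agreement when category marginals are free: 1 / number of categories.
    expected = 1.0 / n_categories
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy data: three annotators decide, for four images, whether a
# given keyword applies (column 0 = "applies", column 1 = "does not apply").
toy_counts = [[3, 0], [0, 3], [2, 1], [1, 2]]
kappa = free_marginal_kappa(toy_counts)
```

A value of 1 means all annotators agree on every item, while values near 0 mean agreement is no better than chance. Under this kind of measure, images that call for abstract concepts would be expected to score lower than images with concrete content.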
Notes
“The History of Commandaria: Digital Journeys Back to Time”, Project funded by the Cyprus Research Promotion Foundation (CRPF) under the Contract ANTHRO/0308(BE)/04.
Acknowledgements
This work has been partly supported by the project that has received funding from the European Union’s Horizon 2020 research and innovation programme under Grant Agreement No. 739578 (RISE-Call: H2020-WIDE-SPREAD-01-2016-2017-TeamingPhase2) and the Government of the Republic of Cyprus through the Directorate General for European Programmes, Coordination and Development.
About this article
Cite this article
Theodosiou, Z., Tsapatsoulis, N. Image annotation: the effects of content, lexicon and annotation method. Int J Multimed Info Retr 9, 191–203 (2020). https://doi.org/10.1007/s13735-020-00193-z
DOI: https://doi.org/10.1007/s13735-020-00193-z