Image annotation: the effects of content, lexicon and annotation method

Theodosiou, Zenonas; Tsapatsoulis, Nicolas

doi:10.1007/s13735-020-00193-z


Regular Paper
Published: 02 March 2020

Volume 9, pages 191–203, (2020)

International Journal of Multimedia Information Retrieval


Abstract

Image annotation is the process of assigning metadata to images, allowing effective retrieval by text-based search techniques. Despite considerable effort in automatic multimedia analysis, automatic semantic annotation of multimedia is still inefficient because high-level semantic terms are difficult to model. In this paper, we examine the factors affecting the quality of annotations collected through crowdsourcing platforms. An image dataset was manually annotated using (1) a vocabulary consisting of a preselected set of keywords, (2) a hierarchical vocabulary and (3) free keywords. The results show that annotation quality is affected by the image content itself and by the lexicon used. As expected, annotation using the hierarchical vocabulary is more representative, while the use of free keywords leads to more invalid annotations. Finally, it is shown that images requiring annotations that are not directly related to their content (i.e., annotation using abstract concepts) lead to greater annotator inconsistency, revealing that the difficulty in annotating such images is not limited to automatic annotation but is a generic problem of annotation.


Notes

“The History of Commandaria: Digital Journeys Back to Time”, Project funded by the Cyprus Research Promotion Foundation (CRPF) under the Contract ANTHRO/0308(BE)/04.


Acknowledgements

This work has been partly supported by the project that has received funding from the European Union’s Horizon 2020 research and innovation programme under Grant Agreement No. 739578 (RISE-Call: H2020-WIDE-SPREAD-01-2016-2017-TeamingPhase2) and the Government of the Republic of Cyprus through the Directorate General for European Programmes, Coordination and Development.

Author information

Authors and Affiliations

Research Centre on Interactive Media, Smart Systems and Emerging Technologies (RISE), Nicosia, Cyprus
Zenonas Theodosiou
Department of Communication and Internet Studies, Cyprus University of Technology, Limassol, Cyprus
Nicolas Tsapatsoulis


Corresponding author

Correspondence to Zenonas Theodosiou.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Theodosiou, Z., Tsapatsoulis, N. Image annotation: the effects of content, lexicon and annotation method. Int J Multimed Info Retr 9, 191–203 (2020). https://doi.org/10.1007/s13735-020-00193-z


Received: 23 July 2019
Revised: 05 December 2019
Accepted: 16 February 2020
Published: 02 March 2020
Issue Date: September 2020
DOI: https://doi.org/10.1007/s13735-020-00193-z
