Suggesting an Integration System for Image Annotation

Ghostan Khatchatoorian, Artin; Jamzad, Mansour

doi:10.1007/s11042-021-11571-y

1207: Innovations in Multimedia Information Processing & Retrieval
Published: 15 July 2022

Volume 82, pages 8323–8343, (2023)

Multimedia Tools and Applications

Abstract

The number of digital images uploaded online grows rapidly every day, so automatic image annotation systems that can retrieve information from these images are in high demand. One challenge in this field is that data sets are imbalanced, which makes it difficult to learn every tag successfully. Even with a nearly balanced data set, a single learner is unlikely to learn all tags with the same accuracy. In this paper, we suggest a novel integration system that selects an elite group of models from the existing annotation models and then combines them to take full advantage of each model's learning technique. The proposed system studies the data sets of the selected models without needing direct access to those data sets. Because the algorithm is independent of the annotation models and data sets, it can combine currently available annotation models with those developed in the future, along with their data sets and learning models. We believe the proposed approach has the potential to become a common ground for integrating automatic image annotation models.
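The idea of choosing an elite group of models and combining them can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes, purely for illustration, that each annotation model is a black box returning a set of tags per image, that a small labelled validation set is available, and that models are ranked per tag by F1 score. The names `f1_for_tag`, `select_elite`, and `combine` are hypothetical.

```python
from typing import Dict, List, Set


def f1_for_tag(preds: List[Set[str]], truth: List[Set[str]], tag: str) -> float:
    """F1 score of one model for one tag over a validation set (assumed metric)."""
    tp = fp = fn = 0
    for p, t in zip(preds, truth):
        if tag in p and tag in t:
            tp += 1
        elif tag in p:
            fp += 1
        elif tag in t:
            fn += 1
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


def select_elite(val_preds: Dict[str, List[Set[str]]],
                 val_truth: List[Set[str]],
                 tags: List[str],
                 k: int = 1) -> Dict[str, List[str]]:
    """For every tag, keep the k models with the highest validation F1."""
    elite = {}
    for tag in tags:
        ranked = sorted(
            val_preds,
            key=lambda m: f1_for_tag(val_preds[m], val_truth, tag),
            reverse=True,
        )
        elite[tag] = ranked[:k]
    return elite


def combine(outputs: Dict[str, Set[str]], elite: Dict[str, List[str]]) -> Set[str]:
    """Assign a tag when a strict majority of its elite models predict it."""
    result = set()
    for tag, models in elite.items():
        votes = sum(tag in outputs[m] for m in models)
        if 2 * votes > len(models):
            result.add(tag)
    return result


# Toy validation data: model "A" is reliable for "sky", model "B" for "cat".
val_truth = [{"sky"}, {"cat"}, {"sky", "cat"}]
val_preds = {
    "A": [{"sky"}, set(), {"sky"}],
    "B": [set(), {"cat"}, {"cat"}],
}
elite = select_elite(val_preds, val_truth, tags=["sky", "cat"], k=1)
annotation = combine({"A": {"sky"}, "B": {"cat"}}, elite)
```

In this toy setup each tag is delegated to the model that learned it best, so a prediction of "cat" from model "A" would be ignored. The sketch only reads model outputs, never the training data of the individual models, which loosely mirrors the abstract's point about not needing direct access to those data sets.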

Author information

Authors and Affiliations

Department of Computer Engineering, Sharif University of Technology, 11365-11155, Tehran, Iran
Artin Ghostan Khatchatoorian & Mansour Jamzad

Corresponding author

Correspondence to Artin Ghostan Khatchatoorian.

About this article

Cite this article

Ghostan Khatchatoorian, A., Jamzad, M. Suggesting an Integration System for Image Annotation. Multimed Tools Appl 82, 8323–8343 (2023). https://doi.org/10.1007/s11042-021-11571-y

Received: 27 November 2020
Revised: 29 July 2021
Accepted: 20 September 2021
Published: 15 July 2022
Issue Date: March 2023
DOI: https://doi.org/10.1007/s11042-021-11571-y
