Suggesting an Integration System for Image Annotation

  • 1207: Innovations in Multimedia Information Processing & Retrieval
  • Multimedia Tools and Applications

Abstract

The number of digital images uploaded online grows rapidly every day, so automatic image annotation systems that can extract information from these images are in high demand. One challenge in this field is that the available data sets are imbalanced, which makes it difficult to learn tags from them. Even when a nearly balanced data set is available, a single learner is unlikely to learn all tags with the same accuracy. In this paper, we suggest a novel integration system that selects an elite group of models from all existing annotation models and combines them to take full advantage of each model’s learning technique. The proposed system studies the data sets of the selected models without needing direct access to those data sets. Because the algorithm is independent of the annotation models and their data sets, it can combine currently available annotation models with those developed in the future, along with their data sets and learning models. We believe the proposed approach has the potential to become an integrated foundation for automatic image annotation models.
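The general idea of selecting an elite group of models per tag and then combining them can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes each annotation model is a black box whose binary tag predictions on a shared held-out image set are available, ranks models per tag by F1, and combines the top `k` by an F1-weighted vote. The names `f1_score`, `select_elite`, `combine`, the parameter `k`, the threshold, and all data below are hypothetical.

```python
from typing import Dict, List

Preds = Dict[str, Dict[str, List[int]]]  # model -> tag -> 0/1 per validation image


def f1_score(pred: List[int], truth: List[int]) -> float:
    """F1 of binary predictions against binary ground truth."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def select_elite(val_preds: Preds, val_truth: Dict[str, List[int]],
                 k: int) -> Dict[str, Dict[str, float]]:
    """For each tag, keep the k models with the highest validation F1."""
    elite: Dict[str, Dict[str, float]] = {}
    for tag, truth in val_truth.items():
        scores = {m: f1_score(p[tag], truth) for m, p in val_preds.items()}
        # Sort by descending score; break ties by model name for determinism.
        ranked = sorted(scores, key=lambda m: (-scores[m], m))
        elite[tag] = {m: scores[m] for m in ranked[:k]}
    return elite


def combine(elite: Dict[str, Dict[str, float]],
            outputs: Dict[str, Dict[str, int]],
            threshold: float = 0.5) -> List[str]:
    """F1-weighted vote of each tag's elite models for a single image."""
    tags = []
    for tag, weights in elite.items():
        total = sum(weights.values())
        if total == 0:
            continue  # no elite model learned this tag at all
        vote = sum(w * outputs[m][tag] for m, w in weights.items()) / total
        if vote >= threshold:
            tags.append(tag)
    return tags


# Hypothetical validation data: three models, two tags, four images.
val_truth = {"sky": [1, 1, 0, 0], "cat": [0, 1, 0, 1]}
val_preds = {
    "A": {"sky": [1, 1, 0, 0], "cat": [0, 0, 0, 0]},
    "B": {"sky": [0, 0, 1, 1], "cat": [0, 1, 0, 1]},
    "C": {"sky": [1, 0, 0, 0], "cat": [0, 1, 0, 0]},
}
elite = select_elite(val_preds, val_truth, k=2)

# Hypothetical outputs of the three models for one new image.
new_image = {
    "A": {"sky": 1, "cat": 1},
    "B": {"sky": 0, "cat": 0},
    "C": {"sky": 0, "cat": 1},
}
predicted_tags = combine(elite, new_image)
print(predicted_tags)
```

In this toy setup each tag ends up with its own elite group, so a model that is weak on one tag can still contribute to another. Only model outputs are used, which loosely mirrors the abstract's point that the underlying training data sets need not be accessed directly.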



Author information

Corresponding author

Correspondence to Artin Ghostan Khatchatoorian.

Cite this article

Ghostan Khatchatoorian, A., Jamzad, M. Suggesting an Integration System for Image Annotation. Multimed Tools Appl 82, 8323–8343 (2023). https://doi.org/10.1007/s11042-021-11571-y

  • DOI: https://doi.org/10.1007/s11042-021-11571-y
