CNN-feature based automatic image annotation method

Ma, Yanchun; Liu, Yongjian; Xie, Qing; Li, Lin

doi:10.1007/s11042-018-6038-x

Published: 28 April 2018

Volume 78, pages 3767–3780, (2019)

Multimedia Tools and Applications

Abstract

Automatic image annotation (AIA) methods are regarded as an efficient way to bridge the semantic gap between raw images and their semantic information. However, traditional annotation models work well only with carefully hand-crafted features. To address this problem, we incorporate the CNN feature of an image into our proposed model, which we refer to as SEM, using the well-known CNN model AlexNet. We extract the CNN feature by removing the network's final layer, and it proves useful in our SEM model. In addition, drawing on experience with traditional KNN models, we propose a model that handles image tag refinement and tag assignment simultaneously while retaining the simplicity of the KNN model. The proposed model groups images with similar features into a semantic neighbor group. Then, using a self-defined Bayesian-based model, we distribute the tags of the neighbor group to a test image according to the distance between the test image and its neighbors. Finally, experiments on three standard image datasets, Corel5k, ESP Game and IAPR TC-12, verify the effectiveness of the proposed model.
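The neighbor-based tag distribution described in the abstract can be illustrated with a minimal sketch. This is not the SEM model itself, whose Bayesian formulation is not given in the abstract. It assumes that feature vectors have already been extracted (for example from a CNN with its final layer removed), that neighbors are found by Euclidean distance, and that each neighbor votes for its tags with a weight of exp(-distance) normalized over the neighbor group. The function names `score_tags` and `annotate`, the weighting scheme, and the toy data are all hypothetical.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def score_tags(test_feature, train_features, train_tags, k=3):
    """Score candidate tags for one test image from its k nearest neighbors.

    Each neighbor votes for its own tags with weight exp(-distance), and the
    weights are normalized over the neighbor group. This weighting is an
    illustrative assumption, not the Bayesian model defined in the paper.
    """
    # Rank all training images by distance to the test image.
    ranked = sorted(
        (euclidean(test_feature, f), i) for i, f in enumerate(train_features)
    )
    neighbors = ranked[:k]  # the "semantic neighbor group" in this sketch

    weights = [math.exp(-d) for d, _ in neighbors]
    total = sum(weights)

    scores = defaultdict(float)
    for w, (_, idx) in zip(weights, neighbors):
        for tag in train_tags[idx]:
            scores[tag] += w / total  # closer neighbors contribute more
    return dict(scores)


def annotate(test_feature, train_features, train_tags, k=3, n_tags=2):
    """Return the n_tags highest-scoring tags (ties broken alphabetically)."""
    scores = score_tags(test_feature, train_features, train_tags, k)
    ordered = sorted(scores, key=lambda t: (-scores[t], t))
    return ordered[:n_tags]


# Toy 2-D "features" standing in for CNN features (hypothetical data).
train_features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
train_tags = [
    ["sky", "sea"],
    ["sky", "beach"],
    ["tiger", "grass"],
    ["tiger", "forest"],
]

predicted = annotate([0.02, 0.0], train_features, train_tags, k=2, n_tags=2)
print(predicted)
```

In this toy setup the test image lies closest to the first two training images, so the tag they share receives the full normalized weight, and the remaining tags are ranked by how close their source image is to the test image.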

Acknowledgments

This research is partially supported by the Natural Science Foundation of China (Grant No. 61602353) and the Fundamental Research Funds for the Central Universities (WUT:2017IVA053, WUT:2017IVB028 and WUT:2017YB028).

Author information

Authors and Affiliations

School of Computer Science and Technology, Wuhan University of Technology, Wuhan, China
Yanchun Ma, Yongjian Liu, Qing Xie & Lin Li

Corresponding author

Correspondence to Qing Xie.

About this article

Cite this article

Ma, Y., Liu, Y., Xie, Q. et al. CNN-feature based automatic image annotation method. Multimed Tools Appl 78, 3767–3780 (2019). https://doi.org/10.1007/s11042-018-6038-x

Received: 31 August 2017
Revised: 21 March 2018
Accepted: 20 April 2018
Published: 28 April 2018
Issue Date: February 2019
DOI: https://doi.org/10.1007/s11042-018-6038-x
