Abstract
Automatic image annotation is a vital and challenging problem in pattern recognition and image understanding. Existing models extract visual features directly from segmented image regions. Since a segmented region may still contain multiple objects, the extracted visual features may not effectively describe that region. In addition, existing models do not consider the visual representations of the corresponding keywords, which leads to many irrelevant annotations in the final results, annotations that do not relate to any part of the image's visual content. To overcome these problems, an image annotation model based on multi-scale salient regions and relevant visual keywords is proposed. In this model, each image is segmented with a multi-scale grid segmentation method, and a global-contrast-based method is used to extract a saliency map from each image region. Visual features are then extracted from each salient region. In addition, each keyword is classified as either abstract or non-abstract. Visual seeds are established for each non-abstract word, and a new method is proposed to extract visual keyword collections from the corresponding seeds. Given the traits of abstract words, an algorithm based on subtraction regions is proposed to extract the visual seeds and the corresponding visual keyword collections of each abstract word. An adaptive parameter method and a fast solution algorithm are proposed to determine the similarity threshold of each keyword. Finally, multi-scale visual features and the combination of the above methods are used to improve annotation performance. The model improves the object descriptions of images and image regions, and experimental results verify its effectiveness.
Acknowledgments
This work is partially supported by the National Natural Science Foundation of China under Grant No. 61103175, the Natural Science Foundation of Fujian Province under Grant No. 2013J05088, the Key Project of the Chinese Ministry of Education under Grant No. 212086, the Fujian Province High School Science Fund for Distinguished Young Scholars under Grant No. JA12016, the Program for New Century Excellent Talents in Fujian Province University under Grant No. JA13021, and the Fujian Natural Science Funds for Distinguished Young Scholars under Grant No. 2014J06017.
Cite this article
Ke, X., Guo, W. Multi-scale salient region and relevant visual keywords based model for automatic image annotation. Multimed Tools Appl 75, 12477–12498 (2016). https://doi.org/10.1007/s11042-014-2318-2