Image Annotation Algorithm Based on Semantic Similarity and Multi-features

  • Jingxiu Ni
  • Dongxing Wang
  • Guoying Zhang
  • Yanchao Sun
  • Xinkai Xu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11344)


This paper proposes an image annotation algorithm based on semantic similarity and multi-feature fusion. Borrowing from semantic extraction methods in natural language processing, the algorithm builds a semantic tree for each of several common scenes. Each scene semantic tree is constructed from the visual features of that scene's images in the image set. First, the visual features of the scene images are extracted and grouped by fuzzy clustering. The images are then assigned to nodes according to the clustering results, and the images at each node are clustered again on their visual features to split them into finer groups. Once the scene semantic tree has been built, the algorithm extracts the visual features of the image to be annotated. The image then moves from the root node to a leaf node of the scene semantic tree according to its visual features, and the semantic keywords encountered along the route form the tags of the image.
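The two steps described in the abstract, fuzzy clustering of visual features and a root-to-leaf walk that collects keywords, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the clustering variant, the distance measure, or how keywords are attached to nodes, so the sketch assumes standard fuzzy c-means, Euclidean distance, and a nearest-centre descent rule. The names `fuzzy_c_means`, `hard_labels`, `Node`, and `annotate`, the toy feature vectors, and the keywords are all hypothetical.

```python
import math
import random
from dataclasses import dataclass, field


def fuzzy_c_means(points, c, m=2.0, iters=50, seed=0):
    """Standard fuzzy c-means (assumed variant). Returns (centers, memberships)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Each centre is the mean of all points weighted by membership**m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )
        # Memberships are recomputed from relative distances to the centres.
        for i, p in enumerate(points):
            dists = [max(math.dist(p, ctr), 1e-12) for ctr in centers]
            for j in range(c):
                u[i][j] = 1.0 / sum(
                    (dists[j] / dk) ** (2.0 / (m - 1.0)) for dk in dists
                )
    return centers, u


def hard_labels(memberships):
    """Assign each point to the cluster with its highest membership."""
    return [max(range(len(row)), key=row.__getitem__) for row in memberships]


@dataclass
class Node:
    keyword: str                 # semantic keyword attached to this node
    center: list                 # representative feature vector of the node
    children: list = field(default_factory=list)


def annotate(root, feature):
    """Walk from the root to a leaf, always taking the child whose centre
    is nearest to the feature vector, and collect keywords on the way."""
    tags = [root.keyword]
    node = root
    while node.children:
        node = min(node.children, key=lambda ch: math.dist(ch.center, feature))
        tags.append(node.keyword)
    return tags


# Toy 2-D "visual features" forming two well-separated groups (made-up data).
features = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
            [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
centers, memberships = fuzzy_c_means(features, c=2)
labels = hard_labels(memberships)

# Hand-built scene tree with hypothetical keywords and centres.
scene_tree = Node("outdoor", [0.0, 0.0], [
    Node("beach", [5.0, 5.0], [Node("sea", [5.0, 6.0]),
                               Node("sand", [6.0, 5.0])]),
    Node("forest", [0.0, 0.0]),
])
tags = annotate(scene_tree, [5.2, 5.9])
```

In a fuller version, the centres and child nodes would come from repeated clustering at each node rather than being written by hand, and the feature vector would be a fusion of several visual descriptors, for example their concatenation.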


Keywords: Semantic tree; Image annotation; Multi-feature fusion; Semantic similarity; Fuzzy clustering


Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Jingxiu Ni (1, 2), corresponding author
  • Dongxing Wang (2)
  • Guoying Zhang (2)
  • Yanchao Sun (2)
  • Xinkai Xu (1, 2)

  1. Beijing Union University, Beijing, China
  2. School of Mechanical, Electronic and Information Engineering, China University of Mining and Technology, Beijing, China
