Can Deep Neural Networks Learn Broad Semantic Concepts of Images?

  • Longzheng Cai
  • Shuyun Lim
  • Xuan Wang
  • Longmei Tang
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1107)


Much research uses DNNs to learn high-level semantic concepts of images, such as categories, from low-level visual properties. Images carry more semantic concepts than categories alone, such as whether two images complement each other, serve the same purpose, or occur in the same place or situation. In this work, we conduct an experimental study to evaluate whether DNNs can learn these broad semantic concepts of images. We perform experiments with the POPORO image dataset. Our results show that, overall, DNNs have limited capability to learn the above-mentioned broad semantic concepts from image visual features. Among the DNN models we tested, Inception models and their variants learn broad semantic concepts of images better than VGG, ResNet, and DenseNet models. We think one of the main reasons for the weak performance in our experiments is that the POPORO dataset used in this work is too small for DNN models. Large image datasets with rich and broad semantic labels and measures are the key to successful research in this area.


DNN; Image matching; Image semantic property; Image similarity; Image visual property


Compliance with Ethical Standards

This research does not involve human participants and/or animals.



Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  • Longzheng Cai (1, corresponding author)
  • Shuyun Lim (2)
  • Xuan Wang (1)
  • Longmei Tang (1)
  1. College of Information Science and Engineering, Fujian University of Technology, Fuzhou, China
  2. Faculty of Business and Technology, Unitar International University, Petaling Jaya, Malaysia
