Can Deep Neural Networks Learn Broad Semantic Concepts of Images?
Many studies use DNNs to learn high-level semantic concepts of images, such as categories, from low-level visual properties. Images carry more semantic concepts than categories, such as whether two images complement each other, serve the same purpose, or occur in the same place or situation. In this work, we experimentally evaluate whether DNNs can learn these broad semantic concepts of images. We perform experiments with the POPORO image dataset. Our results show that, overall, DNNs have limited capability to learn these broad semantic concepts from image visual features. Among the DNN models we tested, Inception models and their variants learn broad semantic concepts of images better than VGG, ResNet, and DenseNet models. We think one of the main reasons for the poor performance in our experiments is that the POPORO dataset used in this work is too small for DNN models. Large image datasets with rich and broad semantic labels and measures are the key to successful research in this area.
Keywords: DNN; Image matching; Image semantic property; Image similarity; Image visual property
Compliance with Ethical Standards
This research does not involve human participants and/or animals.