Learning Semantic Features for Classifying Very Large Image Datasets Using Convolution Neural Network

  • Original Research
SN Computer Science

Abstract

Advancements in sensors and image acquisition devices have led to a tremendous increase in the creation of unlabeled image databases, and traditional image retrieval approaches are inefficient at retrieving semantically relevant images. Neural networks are emerging as a popular method for bridging the gap between low-level features and high-level semantics. A convolution neural network (CNN) is a category of neural network that automatically extracts the important features without any human intervention, using a considerably reduced set of parameters. In this paper, a model is built on the VGG-16 CNN architecture, which combines convolution and max pooling layers at different levels, and is trained with effective regularization, transfer learning, and data augmentation. The effectiveness of the proposed model is demonstrated on the benchmark Caltech 101 and Caltech 256 datasets. The results demonstrate the model's capacity to understand high-level semantics, and the model outperforms several contemporary techniques.
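As a rough illustration of how convolution and max pooling layers are combined in VGG-16, the sketch below walks the standard published VGG-16 layer layout and counts parameters. It is not the authors' model: the 224x224 RGB input, the two 4096-unit hidden layers, and the 101-way head used for the transfer-learning comparison are assumptions taken from the standard architecture and from the Caltech 101 category count, and the function names are hypothetical.

```python
# Standard VGG-16 layout (configuration "D"): numbers are conv output
# channels, "M" is a 2x2 max pooling layer with stride 2.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]


def conv_stack_summary(cfg=VGG16_CFG, in_channels=3, size=224):
    """Return (parameter count, output channels, output spatial size)."""
    params = 0
    channels = in_channels
    for layer in cfg:
        if layer == "M":
            size //= 2  # 2x2 pooling halves height and width
        else:
            # 3x3 kernel with padding 1 keeps the spatial size unchanged
            params += 3 * 3 * channels * layer + layer  # weights + biases
            channels = layer
    return params, channels, size


def head_params(in_features, hidden=(4096, 4096), num_classes=1000):
    """Parameter count of the fully connected classifier head."""
    params = 0
    for width in (*hidden, num_classes):
        params += in_features * width + width  # weights + biases
        in_features = width
    return params


conv_params, channels, size = conv_stack_summary()
flat = channels * size * size  # flattened feature vector fed to the head

# Full network with the original 1000-class head.
imagenet_total = conv_params + head_params(flat)

# Assumed transfer-learning setup: keep the convolutional layers frozen
# and train only a new 101-way head (hypothetical choice for Caltech 101).
caltech101_trainable = head_params(flat, num_classes=101)

print(conv_params, flat, imagenet_total, caltech101_trainable)
```

Under these assumptions the convolutional stack holds about 14.7 million of the roughly 138 million parameters, so most of the weights sit in the fully connected head. This is one reason transfer learning setups often reuse the pretrained convolutional layers and retrain only the classifier.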

Author information

Corresponding author

Correspondence to K. Mahantesh.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Data Science and Communication” guest edited by Kamesh Namudri, Naveen Chilamkurti, Sushma S J and S. Padmashree.

About this article

Cite this article

Rao, A.S., Mahantesh, K. Learning Semantic Features for Classifying Very Large Image Datasets Using Convolution Neural Network. SN COMPUT. SCI. 2, 187 (2021). https://doi.org/10.1007/s42979-021-00589-6

  • DOI: https://doi.org/10.1007/s42979-021-00589-6
