Content-based image retrieval using Group Normalized-Inception-Darknet-53

  • Regular Paper
  • Published in: International Journal of Multimedia Information Retrieval

Abstract

In recent research, deep learning methods have shown promising performance in various fields of computer vision, including content-based image retrieval (CBIR). In this paper, an improved version of Darknet-53, called Group Normalized-Inception-Darknet-53 (GN-Inception-Darknet-53), is proposed to extract features for a CBIR model. To capture more detailed image features, an inception layer comprising 1 × 1, 3 × 3, and 5 × 5 kernels replaces an existing 3 × 3 kernel; the output of this inception layer is the concatenation of the responses of the three kernels. To make the normalization process of the proposed model less dependent on batch size, group normalization (GN) layers are used instead of batch normalization. A total of five such inception layers are used in the proposed GN-Inception-Darknet-53, and their outputs are depth-concatenated to extract more detailed features of the image. The proposed model is trained using a transfer learning mechanism. Five standard performance measures are computed to evaluate its efficiency: average precision rate, average recall rate, F-measure, average normalized modified retrieval rank, and total minimum retrieval epoch. Seven challenging image datasets are used for evaluation: three natural-image datasets (Corel-1K, Corel-5K, and Corel-10K), three subsets of the ImageNet dataset, and the UKBench dataset. On all of these datasets, the proposed method yields better results than nineteen compared methods, comprising both traditional and CNN-based CBIR approaches.
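
The abstract describes each added inception layer as three parallel convolutions (1 × 1, 3 × 3, and 5 × 5) whose outputs are depth-concatenated, with group normalization replacing batch normalization. The following is a minimal PyTorch sketch of one such block, assuming LeakyReLU activations and eight normalization groups (common Darknet-53/GN defaults, not stated in the abstract); it illustrates the idea rather than reproducing the authors' implementation.

```python
import torch
import torch.nn as nn

class GNInceptionBlock(nn.Module):
    """Illustrative GN-Inception block: parallel 1x1, 3x3 and 5x5 convolutions,
    each followed by group normalization (independent of batch size) and a
    LeakyReLU, with the three branch outputs depth-concatenated."""

    def __init__(self, in_channels, branch_channels, groups=8):
        super().__init__()
        # branch_channels must be divisible by `groups` for GroupNorm.
        def branch(kernel_size):
            return nn.Sequential(
                nn.Conv2d(in_channels, branch_channels, kernel_size,
                          padding=kernel_size // 2, bias=False),
                nn.GroupNorm(groups, branch_channels),
                nn.LeakyReLU(0.1, inplace=True),
            )
        self.b1 = branch(1)
        self.b3 = branch(3)
        self.b5 = branch(5)

    def forward(self, x):
        # Concatenate branch responses along the channel (depth) dimension.
        return torch.cat([self.b1(x), self.b3(x), self.b5(x)], dim=1)


# Example: a 256-channel feature map passed through one block yields
# 3 * 128 = 384 output channels with unchanged spatial size.
block = GNInceptionBlock(in_channels=256, branch_channels=128)
out = block(torch.randn(1, 256, 52, 52))  # shape: (1, 384, 52, 52)
```

A full GN-Inception-Darknet-53 would embed five such blocks inside the Darknet-53 backbone and depth-concatenate their outputs to form the retrieval feature vector, as described above.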


Author information

Corresponding author

Correspondence to U. S. N. Raju.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Pathak, D., Raju, U.S.N. Content-based image retrieval using Group Normalized-Inception-Darknet-53. Int J Multimed Info Retr 10, 155–170 (2021). https://doi.org/10.1007/s13735-021-00215-4

