Logo detection using weakly supervised saliency map

Kumar, Gautam; Keserwani, Prateek; Roy, Partha Pratim; Dogra, Debi Prosad

doi:10.1007/s11042-020-09813-6

Published: 29 September 2020

Multimedia Tools and Applications, volume 80, pages 4341–4365 (2021)

Gautam Kumar¹, Prateek Keserwani¹, Partha Pratim Roy¹ & Debi Prosad Dogra²

Abstract

Box-level annotation of a large number of logo images for training a typical deep learning architecture is highly challenging. A method that learns to detect logos without box-level annotations is therefore helpful. In this paper, we present a logo detection method that uses weakly supervised learning of a Convolutional Neural Network (CNN) to generate a deep saliency map. The saliency map is generated from the back-propagated response of a CNN trained on the classification task, and it produces responses for the logo regions. The GrabCut segmentation method is then applied to obtain the bounding box corresponding to the logo class predicted by the CNN for a given image. The AlexNet, CaffeNet, and VGGNet deep architectures have been fine-tuned for classification. The fine-tuned classifiers are then reused for detection through the back-propagated saliency map. The performance of the proposed methodology has been validated on the FlickrLogos-32 logo benchmark dataset. The proposed method outperforms the state-of-the-art baseline fully supervised methods, with a mean average precision (mAP) of 75.83%.
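
The pipeline described in the abstract (back-propagate a class score to the input to get a saliency map, then turn the salient region into a bounding box) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a toy two-layer ReLU network in place of the fine-tuned AlexNet, CaffeNet, or VGGNet, and it swaps in a simple relative threshold where the paper applies GrabCut. The function names `class_saliency` and `saliency_to_box` and the parameter `rel_threshold` are hypothetical.

```python
import numpy as np


def class_saliency(x, W1, b1, W2, class_idx):
    """Back-propagate one class score to the input of a toy ReLU network.

    Assumed model (illustrative only): scores = W2 @ relu(W1 @ x.ravel() + b1).
    x has shape (H, W, C); W1 is (hidden, H*W*C); W2 is (classes, hidden).
    Returns an (H, W) map: max over channels of |d score / d pixel|.
    """
    flat = x.reshape(-1)
    pre = W1 @ flat + b1                    # hidden pre-activations
    active = (pre > 0).astype(float)        # derivative of ReLU
    grad = W1.T @ (W2[class_idx] * active)  # d score_c / d input
    return np.abs(grad).reshape(x.shape).max(axis=-1)


def saliency_to_box(saliency, rel_threshold=0.5):
    """Tight box (x_min, y_min, x_max, y_max) around salient pixels.

    A plain threshold at rel_threshold * max stands in for GrabCut here.
    Returns None when the map has no positive response.
    """
    peak = saliency.max()
    if peak <= 0:
        return None
    ys, xs = np.nonzero(saliency >= rel_threshold * peak)
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```

Calling `saliency_to_box(class_saliency(x, W1, b1, W2, c))` for the predicted class `c` yields one box per image. In a fuller version, the thresholded map would more plausibly serve as the foreground seed for a GrabCut-style segmentation rather than as the final region.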

Acknowledgments

The authors would like to acknowledge the support of DST-SERB. The Project ID is SB/S3/EECE/099/2016.

Author information

Authors and Affiliations

Department of CSE, Indian Institute of Technology Roorkee, Roorkee, India
Gautam Kumar, Prateek Keserwani & Partha Pratim Roy
School of Electrical Sciences, IIT Bhubaneswar, Bhubaneswar, India
Debi Prosad Dogra

Corresponding author

Correspondence to Gautam Kumar.

Ethics declarations

Conflict of interests

The authors declared that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kumar, G., Keserwani, P., Roy, P.P. et al. Logo detection using weakly supervised saliency map. Multimed Tools Appl 80, 4341–4365 (2021). https://doi.org/10.1007/s11042-020-09813-6

Received: 17 July 2018
Revised: 11 August 2020
Accepted: 02 September 2020
Published: 29 September 2020
Issue Date: January 2021
DOI: https://doi.org/10.1007/s11042-020-09813-6
