Abstract
Identifying products on supermarket racks is easy for humans but remains a hard problem for computer vision systems. This article proposes a method that identifies products on supermarket racks by detecting the text blocks in product labels using Faster R-CNN with more than one region proposal network (RPN) and then recognizing the text with a recurrent neural network (RNN) classifier. The traditional Faster R-CNN creates regions of interest (ROIs) using a single RPN and therefore cannot accurately detect labels whose text blocks vary in size. To detect these varying-sized text blocks, several RPNs of different sizes are proposed. The novelty of this work lies in adding multiple RPNs of different sizes to the traditional Faster R-CNN to detect the text blocks in product labels, and in recognizing the detected text with an RNN classifier. The method is evaluated on three public datasets, namely GroZi-120, Grocery Products, and Grocery Dataset, and it outperforms state-of-the-art results on text block detection. It achieves text recognition accuracies of 99.18%, 99.21%, and 99.12% on GroZi-120, Grocery Products, and Grocery Dataset, respectively.
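The idea of drawing ROIs from several RPNs of different sizes can be illustrated with a minimal sketch. The abstract does not state how the proposals from the individual RPNs are combined, so the pooling step and the greedy non-maximum suppression below are assumptions made for illustration only, and the names `make_anchors`, `merge_rpn_proposals`, and the threshold value are hypothetical rather than taken from the article.

```python
from itertools import product


def make_anchors(center, scales, ratios):
    """Return (x1, y1, x2, y2) anchor boxes centred on `center`.

    Each scale s and width/height ratio r gives a box of area s * s.
    """
    cx, cy = center
    boxes = []
    for s, r in product(scales, ratios):
        w = s * r ** 0.5
        h = s / r ** 0.5
        boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_rpn_proposals(per_rpn_proposals, iou_threshold=0.7):
    """Pool (box, score) proposals from several RPNs, then apply greedy NMS.

    Assumption: the article may combine proposals differently.
    """
    pooled = [p for proposals in per_rpn_proposals for p in proposals]
    kept = []
    for box, score in sorted(pooled, key=lambda p: p[1], reverse=True):
        # Keep a box only if it does not heavily overlap a higher-scoring one.
        if all(iou(box, kept_box) <= iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept


# Hypothetical proposals from a small-scale RPN and a large-scale RPN.
small_rpn = [((0, 0, 10, 10), 0.90), ((0, 0, 10, 11), 0.80)]
large_rpn = [((0, 0, 100, 40), 0.95)]
rois = merge_rpn_proposals([small_rpn, large_rpn])
```

In this sketch each RPN would be given its own set of anchor scales through `make_anchors`, so that small and large text blocks are each covered by at least one network before the pooled proposals are passed on as ROIs.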
Data Availability
Data sharing not applicable to this article as no datasets were generated during the current study.
Ethics declarations
Funding and/or Conflicts of interests/Competing interests
The author has no conflicts of interest or competing interests to declare that are relevant to the content of this article.
About this article
Cite this article
Ghosh, R. Product identification in retail stores by combining Faster R-CNN and recurrent neural network. Multimed Tools Appl 83, 7135–7158 (2024). https://doi.org/10.1007/s11042-023-15633-1
DOI: https://doi.org/10.1007/s11042-023-15633-1