Product identification in retail stores by combining faster r-cnn and recurrent neural network

Multimedia Tools and Applications

Abstract

Identifying products on supermarket racks is easy for humans but remains a major challenge for computer-vision systems. This article proposes a method that identifies products on supermarket racks by detecting the text blocks in product labels using Faster R-CNN with multiple region proposal networks (RPNs) and then recognizing the text with a recurrent neural network (RNN) classifier. Several RPNs of different sizes are proposed to detect the text blocks of varying size found in product labels. The traditional Faster R-CNN generates regions of interest (ROIs) using a single RPN and therefore cannot accurately detect labels whose text blocks vary in size. The novelty of this work lies in adding multiple RPNs of different sizes to the traditional Faster R-CNN to detect the text blocks in product labels, and in recognizing the detected text with an RNN classifier. Three public datasets, namely GroZi-120, Grocery Products, and Grocery Dataset, are used to assess the method, which outperforms state-of-the-art results on text block detection. The proposed system achieves text recognition accuracies of 99.18%, 99.21%, and 99.12% on GroZi-120, Grocery Products, and Grocery Dataset, respectively.
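
The abstract does not state how the candidate regions produced by the several RPNs are combined before classification. One plausible reading, shown here only as a minimal sketch, is to pool the scored proposals from every RPN and keep the non-overlapping ones with greedy non-maximum suppression. The function names `iou` and `merge_proposals`, the box format, and the 0.5 overlap threshold are illustrative assumptions and are not taken from the article.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2), assumed format
Proposal = Tuple[Box, float]              # (box, objectness score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_proposals(per_rpn: List[List[Proposal]],
                    iou_thresh: float = 0.5) -> List[Proposal]:
    """Pool proposals from all RPNs and apply greedy NMS.

    Hypothetical merging step: the article may combine RPN outputs
    differently. Proposals are visited in descending score order and
    kept only if they do not overlap an already kept box by more
    than iou_thresh.
    """
    pooled = sorted((p for rpn in per_rpn for p in rpn),
                    key=lambda p: p[1], reverse=True)
    kept: List[Proposal] = []
    for box, score in pooled:
        if all(iou(box, k[0]) <= iou_thresh for k in kept):
            kept.append((box, score))
    return kept


# Two hypothetical RPNs: one tuned to small text blocks, one to large ones.
small_rpn = [((10.0, 10.0, 30.0, 20.0), 0.90)]
large_rpn = [((11.0, 10.0, 31.0, 20.0), 0.80),      # near-duplicate of the small box
             ((100.0, 100.0, 300.0, 180.0), 0.95)]  # large text block
merged = merge_proposals([small_rpn, large_rpn])
```

In this toy input the two near-identical small boxes collapse to the higher-scoring one, while the large block found only by the second RPN survives, which is the behaviour a multi-RPN detector would need for labels with mixed text sizes.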

Data Availability

Data sharing is not applicable to this article, as no datasets were generated during the current study.

Notes

  1. http://deeplearning.net/software/theano/

Author information

Corresponding author

Correspondence to Rajib Ghosh.

Ethics declarations

Funding and/or Conflicts of interests/Competing interests

The author has no conflicts of interest or competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ghosh, R. Product identification in retail stores by combining faster r-cnn and recurrent neural network. Multimed Tools Appl 83, 7135–7158 (2024). https://doi.org/10.1007/s11042-023-15633-1

  • DOI: https://doi.org/10.1007/s11042-023-15633-1
