
Retail Product Classification on Distinct Distribution of Training and Evaluation Data

  • APPLICATION PROBLEMS
  • Published in: Pattern Recognition and Image Analysis

Abstract

Retail product classification has practical value in commerce, for example in assisting visually impaired shoppers or in evaluating product placement strategies. However, few datasets are available for this task, and some exhibit very different distributions between their training and evaluation data, which poses a considerable challenge in itself. Research on the subject is likewise scarce and leaves room for improvement. This paper aims to improve on previous approaches to retail product classification under strongly mismatched training and evaluation distributions by employing convolutional neural network (CNN) models inspired by architectures that perform well on general image classification, which can later be fine-tuned for other computer vision tasks such as object detection. The results show that VGG-16 achieves 66.9167% accuracy, while a new modified VGG-16 model, named VGG-16-D, attains 66.83% accuracy with 85% fewer parameters than VGG-16, outperforming most existing approaches across several comparison baselines.
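The abstract does not specify how VGG-16-D achieves its 85% parameter reduction. As a rough illustration of where such savings can come from, the sketch below (an assumption-level accounting exercise, not the authors' actual architecture) tallies VGG-16's parameters layer by layer and shows that the vast majority sit in the fully connected head, which is the natural place to trim:

```python
# Parameter accounting for standard VGG-16 (ImageNet, 1000 classes).
# The paper's VGG-16-D is NOT specified here; this only illustrates
# that ~89% of VGG-16's ~138M parameters live in the FC head, so an
# 85% overall reduction is plausible by shrinking or replacing it.

def conv2d_params(k, c_in, c_out):
    """Weights plus biases of a k x k convolution layer."""
    return (k * k * c_in + 1) * c_out

def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return (n_in + 1) * n_out

# VGG-16's 13 convolutional layers as (input channels, output channels).
conv_cfg = [(3, 64), (64, 64),
            (64, 128), (128, 128),
            (128, 256), (256, 256), (256, 256),
            (256, 512), (512, 512), (512, 512),
            (512, 512), (512, 512), (512, 512)]

conv_total = sum(conv2d_params(3, ci, co) for ci, co in conv_cfg)

# Standard head: flatten 7x7x512 feature map -> 4096 -> 4096 -> 1000.
fc_total = (dense_params(7 * 7 * 512, 4096)
            + dense_params(4096, 4096)
            + dense_params(4096, 1000))

total = conv_total + fc_total
print(f"conv: {conv_total:,}  fc: {fc_total:,}  total: {total:,}")
print(f"share of parameters in the FC head: {fc_total / total:.1%}")
```

Since the convolutional layers hold only about 14.7M of the roughly 138M parameters, replacing the dense head with a lighter one (for instance, global average pooling followed by a small classifier over the dataset's product classes) can cut the model to roughly 15% of its original size while leaving the feature extractor intact, consistent with the reduction the paper reports.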



Author information


Correspondence to Jonathan or Gede Putra Kusuma.

Ethics declarations

COMPLIANCE WITH ETHICAL STANDARDS

This article is a completely original work of its authors; it has not been published before and will not be submitted to other publications unless the PRIA Editorial Board decides not to accept it for publication.

Conflict of Interest

The authors declare that they have no conflict of interest.

Additional information

Jonathan is a candidate of Master of Computer Science at Bina Nusantara University’s BINUS Graduate Program. He has published several studies on computer vision, pattern recognition, and software engineering in various international journals. His research interests include computer vision, pattern recognition, deep learning, and software engineering.

Gede Putra Kusuma received his PhD degree in Electrical and Electronic Engineering from Nanyang Technological University (NTU), Singapore, in 2013. He is currently working as a Lecturer and Research Coordinator in the Computer Science Department, Bina Nusantara University, Indonesia. Before joining Bina Nusantara University, he worked as a Research Scientist at I2R–A*STAR, Singapore. His research interests include pattern recognition, machine learning, face recognition, appearance-based object recognition, mobile learning, and gamification of learning.


Cite this article

Jonathan, Kusuma, G.P. Retail Product Classification on Distinct Distribution of Training and Evaluation Data. Pattern Recognit. Image Anal. 32, 142–152 (2022). https://doi.org/10.1134/S105466182104012X
