Recognizing Products: A Per-exemplar Multi-label Image Classification Approach

  • Marian George
  • Christian Floerkemeier
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8690)

Abstract

Large-scale instance-level image retrieval aims at retrieving specific instances of objects or scenes. Simultaneously retrieving multiple objects in a test image adds to the difficulty of the problem, especially if the objects are visually similar. This paper presents an efficient approach for per-exemplar multi-label image classification, which targets the recognition and localization of products in retail store images. We achieve runtime efficiency through the use of discriminative random forests, deformable dense pixel matching, and genetic algorithm optimization. We perform cross-dataset recognition: the training images are taken in ideal conditions, with a single training image per product label, while the evaluation set is taken with a mobile phone in real-life scenarios under completely different conditions. In addition, we provide a large novel dataset and labeling tools for product image search, to motivate further research on multi-label retail product image classification. The proposed approach achieves promising results in terms of both accuracy and runtime efficiency on 680 annotated images of our dataset and 885 test images of the GroZi-120 dataset. We make our dataset of 8350 different product images and the 680 test images from retail stores, with complete annotations, available to the wider community.

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Marian George (1)
  • Christian Floerkemeier (1)
  1. Department of Computer Science, ETH Zurich, Switzerland
