
Egocentric Vision for Visual Market Basket Analysis

  • Vito Santarcangelo
  • Giovanni Maria Farinella
  • Sebastiano Battiato
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9913)

Abstract

This paper introduces a new application scenario for egocentric vision: Visual Market Basket Analysis (VMBA). The main goal in the proposed application domain is the understanding of customer behaviour in retail stores from videos acquired with cameras mounted on shopping carts (which we call narrative carts). To properly study the problem and to set the first VMBA challenge, we introduce the VMBA15 dataset. The dataset is composed of 15 different egocentric videos acquired with narrative carts during users' shopping in a retail store. The frames of each video have been labelled by considering 8 possible behaviours of the carts. The considered cart behaviours reflect the behaviour of the customers from the beginning (cart picking) to the end (cart releasing) of their shopping in the store. The inferred information related to the time of stops of the carts within the store, or to the stops at the cash desks, could be coupled with classic Market Basket Analysis information (i.e., receipts) to help retailers better manage spaces and marketing strategies. To benchmark the proposed problem on the introduced dataset, we have considered classic visual and audio descriptors to represent video frames at each instant. Classification has been performed by exploiting the Directed Acyclic Graph SVM learning architecture. Experiments pointed out that an accuracy of more than 93% can be obtained on the 8 considered classes.
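
The classification stage described in the abstract (a Directed Acyclic Graph SVM over the 8 cart-behaviour classes) can be sketched as follows. This is a minimal illustrative implementation, not the authors' code: it assumes per-frame visual and audio descriptors have already been extracted into NumPy arrays (the names visual_feats, audio_feats and frame_labels are hypothetical), and it uses an RBF kernel only as a placeholder choice.

```python
# Minimal sketch of a DAG-SVM multiclass classifier over per-frame descriptors.
# Assumes features are already extracted; kernel and names are placeholder assumptions.
from itertools import combinations

import numpy as np
from sklearn.svm import SVC


class DAGSVM:
    """One binary SVM per class pair; prediction walks the decision DAG,
    discarding one candidate class at each node (Platt et al.-style DAG)."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.pair_clfs_ = {}
        for a, b in combinations(self.classes_, 2):
            mask = np.isin(y, [a, b])
            clf = SVC(kernel="rbf")  # kernel choice is an assumption, not from the paper
            clf.fit(X[mask], y[mask])
            self.pair_clfs_[(a, b)] = clf
        return self

    def predict(self, X):
        preds = []
        for x in X:
            candidates = list(self.classes_)
            # Compare the first and last remaining classes; the loser is
            # eliminated, until a single class survives.
            while len(candidates) > 1:
                a, b = candidates[0], candidates[-1]
                winner = self.pair_clfs_[(a, b)].predict(x[None, :])[0]
                candidates.remove(b if winner == a else a)
            preds.append(candidates[0])
        return np.array(preds)


# Hypothetical usage: frames represented by concatenated visual + audio descriptors,
# each labelled with one of the 8 cart behaviours.
# X = np.hstack([visual_feats, audio_feats]); y = frame_labels
# model = DAGSVM().fit(X_train, y_train)
# accuracy = (model.predict(X_test) == y_test).mean()
```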

Keywords

Visual Feature · Classification Modality · Audio Feature · Indoor Location · Shopping Cart

Notes

Acknowledgment

The authors would like to thank Antonino Furnari for his support in the development of this work.


Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Vito Santarcangelo (1, 2)
  • Giovanni Maria Farinella (1)
  • Sebastiano Battiato (1)

  1. Department of Mathematics and Computer Science, University of Catania, Catania, Italy
  2. Centro Studi S.r.l., Buccino, Italy
