Kitchen Scene Context Based Gesture Recognition: A Contest in ICPR2012

  • Atsushi Shimada
  • Kazuaki Kondo
  • Daisuke Deguchi
  • Géraldine Morin
  • Helman Stern
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7854)


This paper introduces a new open dataset, the “Actions for Cooking Eggs (ACE) Dataset”, and summarizes the results of the contest on “Kitchen Scene Context based Gesture Recognition”, held in conjunction with ICPR2012. The dataset consists of naturally performed actions in a kitchen environment: five cooking menus were each prepared by five different actors, and the cooking actions were recorded with a Kinect sensor. Both color image sequences and depth image sequences are available, and an action label is given to each frame. To estimate these action labels, a recognition method has to analyze not only the actor’s motion but also scene context such as ingredients and cooking utensils. We compare the submitted algorithms and their results in this paper.
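Since every frame carries a ground-truth action label, the natural way to score a submission is frame-wise agreement with the ground truth. The exact metric used by the contest is not spelled out in this summary, so the following is only a minimal sketch; the function name and string-label encoding are assumptions.

```python
from typing import Sequence


def frame_accuracy(pred: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of frames whose predicted action label matches ground truth.

    `pred` and `gold` are equal-length per-frame label sequences
    (hypothetical encoding; the dataset's actual label format may differ).
    """
    if len(pred) != len(gold):
        raise ValueError("prediction and ground truth must cover the same frames")
    correct = sum(p == g for p, g in zip(pred, gold))
    return correct / len(gold)
```

A per-frame metric like this penalizes boundary errors between consecutive actions as heavily as gross misclassifications, which is one reason per-frame labeling is harder than clip-level classification.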


Keywords: Action Recognition, Depth Image, Kinect Sensor, Action Label, Motion History Image
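Motion History Image appears above only as a keyword. For readers unfamiliar with it, the sketch below shows the classic Davis–Bobick update rule on plain nested lists of grayscale values; the function name, parameter defaults, and data layout are illustrative assumptions, not the contest organizers’ code.

```python
def update_mhi(mhi, prev, curr, tau=30.0, thresh=15):
    """One Motion History Image update step (after Davis & Bobick).

    mhi        : 2-D list of floats holding decaying motion timestamps
    prev, curr : 2-D lists of grayscale pixel values (same shape)
    tau        : value a freshly moving pixel is stamped with
    thresh     : frame-difference threshold that marks a pixel as moving
    """
    out = []
    for mhi_row, prev_row, curr_row in zip(mhi, prev, curr):
        row = []
        for m, p, c in zip(mhi_row, prev_row, curr_row):
            if abs(c - p) >= thresh:
                row.append(tau)                 # moving pixel: reset to tau
            else:
                row.append(max(m - 1.0, 0.0))   # static pixel: decay toward 0
        out.append(row)
    return out
```

Stacking these updates turns a sequence of frames into a single image whose brightness encodes recency of motion, which is why MHIs are a common compact feature for action recognition.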





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Atsushi Shimada¹
  • Kazuaki Kondo²
  • Daisuke Deguchi³
  • Géraldine Morin⁴
  • Helman Stern⁵

  1. Department of Advanced Information Technology, Kyushu University, Japan
  2. Academic Center for Computing and Media Studies, Kyoto University, Japan
  3. Strategy Office, Information and Communications Headquarters, Nagoya University, Japan
  4. IRIT, University of Toulouse, France
  5. Department of Industrial Engineering and Management, Ben-Gurion University of the Negev, Israel
