Workshop at the European Conference on Computer Vision

ECCV 2014: Computer Vision – ECCV 2014 Workshops, pp. 459–473

ChaLearn Looking at People Challenge 2014: Dataset and Results

  • Sergio Escalera
  • Xavier Baró
  • Jordi Gonzàlez
  • Miguel A. Bautista
  • Meysam Madadi
  • Miguel Reyes
  • Víctor Ponce-López
  • Hugo J. Escalante
  • Jamie Shotton
  • Isabelle Guyon
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8925)

Abstract

This paper summarizes the ChaLearn Looking at People 2014 challenge data and the results obtained by the participants. The competition was split into three independent tracks: human pose recovery from RGB data, action and interaction recognition from RGB data sequences, and multi-modal gesture recognition from RGB-Depth sequences. For all the tracks, the goal was to perform user-independent recognition in sequences of continuous images, using the overlapping Jaccard index as the evaluation measure. In this edition of the ChaLearn challenge, two large novel datasets were made publicly available, and the Microsoft CodaLab platform was used to manage the competition. Outstanding results were achieved in the three challenge tracks, with best scores of 0.20, 0.50, and 0.85 for pose recovery, action/interaction recognition, and multi-modal gesture recognition, respectively.
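The overlapping Jaccard index used across all three tracks can be sketched as follows. This is a minimal illustration, not the organizers' official evaluation script; the interval representation (frame spans as half-open `(start, end)` pairs) is an assumption for clarity:

```python
def jaccard_overlap(gt, pred):
    """Jaccard index between a ground-truth and a predicted frame interval,
    each given as (start, end) with start inclusive and end exclusive.

    Returns intersection / union of the two intervals, in [0, 1].
    """
    # Length of the overlapping region (0 if the intervals are disjoint).
    inter = max(0, min(gt[1], pred[1]) - max(gt[0], pred[0]))
    # Union = sum of lengths minus the overlap counted twice.
    union = (gt[1] - gt[0]) + (pred[1] - pred[0]) - inter
    return inter / union if union > 0 else 0.0

# A prediction spanning frames 10-30 against ground truth 20-40:
print(jaccard_overlap((20, 40), (10, 30)))  # -> 0.3333333333333333
```

In the challenge setting, a score like this is computed per annotated action, gesture, or limb instance and then averaged over the test sequences, so a perfect temporal (or spatial) alignment yields 1.0 and a disjoint prediction yields 0.0.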

Keywords

Human pose recovery · Behavior analysis · Action and interactions · Multi-modal gestures · Recognition



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Sergio Escalera (1, 2, 3)
  • Xavier Baró (1, 4)
  • Jordi Gonzàlez (1, 5)
  • Miguel A. Bautista (1, 2)
  • Meysam Madadi (1, 2)
  • Miguel Reyes (1, 2)
  • Víctor Ponce-López (1, 2, 4)
  • Hugo J. Escalante (3, 6)
  • Jamie Shotton (7)
  • Isabelle Guyon (3)

  1. Computer Vision Center, Campus UAB, Barcelona, Spain
  2. Department of Mathematics, University of Barcelona, Barcelona, Spain
  3. ChaLearn, Berkeley, California, USA
  4. EIMT/IN3, Open University of Catalonia, Barcelona, Spain
  5. Department of Computer Science, Univ. Autònoma de Barcelona, Barcelona, Spain
  6. INAOE, Puebla, Mexico
  7. Microsoft Research, Cambridge, UK
