ChaLearn Looking at People Challenge 2014: Dataset and Results
Abstract
This paper summarizes the ChaLearn Looking at People 2014 challenge data and the results obtained by the participants. The competition was split into three independent tracks: human pose recovery from RGB data, action and interaction recognition from RGB data sequences, and multi-modal gesture recognition from RGB-Depth sequences. For all tracks, the goal was user-independent recognition in sequences of continuous images, evaluated with the overlapping Jaccard index. In this edition of the ChaLearn challenge, two large novel data sets were made publicly available, and the Microsoft CodaLab platform was used to manage the competition. Outstanding results were achieved in the three challenge tracks, with scores of 0.20, 0.50, and 0.85 for pose recovery, action/interaction recognition, and multi-modal gesture recognition, respectively.
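The overlapping Jaccard index used for evaluation measures how well a predicted temporal (or spatial) region matches the ground truth: the size of their intersection divided by the size of their union. A minimal sketch for two frame intervals is shown below; the function name and the interval representation (inclusive start/end frame pairs) are illustrative, not the challenge's official evaluation code.

```python
def jaccard_index(pred, gt):
    """Overlapping Jaccard index between two frame intervals.

    Each interval is an (start, end) pair of inclusive frame numbers.
    Returns |pred ∩ gt| / |pred ∪ gt|, in [0, 1].
    Note: representation is illustrative, not the official challenge code.
    """
    inter_start = max(pred[0], gt[0])
    inter_end = min(pred[1], gt[1])
    # Number of frames in the overlap (0 if the intervals are disjoint).
    intersection = max(0, inter_end - inter_start + 1)
    # Inclusion-exclusion: |A| + |B| - |A ∩ B|.
    union = (pred[1] - pred[0] + 1) + (gt[1] - gt[0] + 1) - intersection
    return intersection / union

# Example: prediction covers frames 10-19, ground truth frames 15-24.
# Overlap is 5 frames, union is 15 frames, so the score is 1/3.
print(jaccard_index((10, 19), (15, 24)))
```

A prediction scores 1.0 only when it matches the ground-truth interval exactly, and 0.0 when the two intervals do not overlap at all, which is why the measure rewards both correct detection and precise temporal localization.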
Keywords
Human pose recovery · Behavior analysis · Actions and interactions · Multi-modal gestures · Recognition