Kinect and Episodic Reasoning for Human Action Recognition

  • Ruben Cantarero
  • Maria J. Santofimia
  • David Villa
  • Roberto Requena
  • Maria Campos
  • Francisco Florez-Revuelta
  • Jean-Christophe Nebel
  • Jesus Martinez-del-Rincon
  • Juan C. Lopez
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 474)


This paper presents a method for rational behaviour recognition that combines vision-based pose estimation with knowledge modeling and reasoning. The proposed method consists of two stages. First, RGB-D images are used to estimate body postures. Then, the estimated actions are evaluated to verify that they make sense. The method requires that the observed behaviour be rational. To comply with this requirement, this work proposes a rational RGB-D dataset with two types of sequences, some for training and some for testing. Preliminary results show that the addition of knowledge modeling and reasoning leads to a significant increase in recognition accuracy when compared to a system based only on computer vision.
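
The two-stage idea can be illustrated with a minimal sketch. It is not the authors' implementation: the action names, the posture rules in `RULES`, the scores, and the function `recognise` are all hypothetical, and the reasoning stage is reduced to a simple consistency check against the current posture. The sketch assumes the vision stage returns, for each time step, a list of candidate actions with confidence scores.

```python
from typing import Dict, List, Tuple

# Hypothetical common-sense rules (illustrative only):
# action -> (posture required before the action, posture after it).
RULES: Dict[str, Tuple[str, str]] = {
    "sit down": ("standing", "sitting"),
    "stand up": ("sitting", "standing"),
    "walk": ("standing", "standing"),
}


def recognise(
    candidates_per_step: List[List[Tuple[str, float]]],
    posture: str = "standing",
) -> List[str]:
    """Pick, at each step, the best-scoring candidate that makes sense.

    `candidates_per_step` stands in for the output of the vision stage:
    one list of (action, score) pairs per time step.
    """
    recognised: List[str] = []
    for candidates in candidates_per_step:
        ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
        # Keep the first candidate consistent with the current posture;
        # fall back to the top vision guess if none is consistent.
        chosen = next(
            (action for action, _ in ranked if RULES[action][0] == posture),
            ranked[0][0],
        )
        recognised.append(chosen)
        posture = RULES[chosen][1]
    return recognised


# Made-up vision output for three time steps.
vision_output = [
    [("walk", 0.60), ("sit down", 0.40)],
    [("stand up", 0.50), ("sit down", 0.45)],
    [("walk", 0.55), ("stand up", 0.45)],
]
vision_only = [max(step, key=lambda c: c[1])[0] for step in vision_output]
with_reasoning = recognise(vision_output)
print(vision_only)
print(with_reasoning)
```

In this toy run the vision-only choice at the second step is "stand up", which is inconsistent with a person who is already standing, so the reasoning stage selects the lower-scoring but plausible "sit down" instead.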


Keywords: Action recognition, Common sense, Episodic reasoning, Artificial intelligence, Kinect, Computer vision





Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Ruben Cantarero (1)
  • Maria J. Santofimia (1)
  • David Villa (1)
  • Roberto Requena (1)
  • Maria Campos (1)
  • Francisco Florez-Revuelta (2)
  • Jean-Christophe Nebel (2)
  • Jesus Martinez-del-Rincon (3)
  • Juan C. Lopez (1)

  1. School of Computing Science, University of Castilla-La Mancha, Ciudad Real, Spain
  2. Digital Imaging Research Centre, Kingston University, London, UK
  3. The Institute of Electronics, Communications and Information Technology (ECIT), Queens University of Belfast, Belfast, UK
