Human Activity Recognition by Fusion of RGB, Depth, and Skeletal Data

  • Pushpajit Khaire
  • Javed Imran
  • Praveen Kumar
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 703)


A significant increase in research on human activity recognition can be seen in recent years, owing to the availability of low-cost RGB-D sensors and advances in deep learning algorithms. In this paper, we augment our previous work on human activity recognition (Imran et al., IEEE International Conference on Advances in Computing, Communications, and Informatics (ICACCI), 2016) [1] by incorporating skeletal data into the fusion. Three main approaches are used to fuse skeletal data with RGB and depth data, and their results are compared with each other. The challenging UTD-MHAD activity recognition dataset, which exhibits intraclass variations and comprises twenty-seven activities, is used for testing and experimentation. The proposed fusion achieves an accuracy of 95.38% (nearly a 4% improvement over our previous work), supporting the observation that recognition improves as the number of supporting sources of evidence increases.
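The multimodal fusion summarized above can be sketched in a few lines. This is a minimal illustration only: it assumes each modality stream (RGB, depth, skeleton) outputs a per-class probability vector and shows decision-level fusion by score averaging, one common scheme; the scores and the `late_fuse` helper below are illustrative placeholders, and the paper's three actual fusion approaches are not detailed in this excerpt.

```python
# Hypothetical per-class scores (4 classes) from three independently
# trained streams: RGB motion history image, depth motion map, skeleton.
rgb_scores = [0.10, 0.60, 0.20, 0.10]
depth_scores = [0.05, 0.55, 0.30, 0.10]
skel_scores = [0.15, 0.50, 0.25, 0.10]

def late_fuse(*score_vectors):
    """Average per-class scores across modalities; return winning class index."""
    n = len(score_vectors)
    fused = [sum(col) / n for col in zip(*score_vectors)]
    return max(range(len(fused)), key=fused.__getitem__)

# All three streams lean toward class 1, so the fused decision is class 1.
predicted = late_fuse(rgb_scores, depth_scores, skel_scores)
```

Averaging at the decision level keeps the streams independent, so adding a third modality (skeleton) requires no retraining of the RGB and depth networks, which is consistent with the paper's finding that more evidence sources improve recognition.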


Keywords: Convolutional neural networks · Deep learning · Depth motion map · RGB-D sensors · Skeleton · UTD-MHAD · Motion history image · Fusion



This research was supported by the Science and Engineering Research Board (SERB) under project no. ECR/2016/000387, in cooperation with the Department of Science & Technology (DST), Government of India. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of DST-SERB or the Government of India. The DST-SERB or Government of India is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation thereon.


References

  1. Imran, J., Kumar, P.: Human Action Recognition Using RGB-D Sensor and Deep Convolutional Neural Networks. In: IEEE International Conference on Advances in Computing, Communications and Informatics (ICACCI), Jaipur, India, pp. 144–148 (2016)
  2. Chen, C., Jafari, R., Kehtarnavaz, N.: UTD-MHAD: A Multimodal Dataset for Human Action Recognition Utilizing a Depth Camera and a Wearable Inertial Sensor. In: IEEE International Conference on Image Processing (ICIP), pp. 168–172 (2015)
  3. Li, W., Zhang, Z., Liu, Z.: Action Recognition Based on a Bag of 3D Points. In: IEEE Conference on Computer Vision and Pattern Recognition Workshops, San Francisco, CA, USA, pp. 9–14 (2010)
  4. Ofli, F., Chaudhry, R., Kurillo, G., Vidal, R., Bajcsy, R.: Berkeley MHAD: A Comprehensive Multimodal Human Action Database. In: IEEE Workshop on Applications of Computer Vision, pp. 53–60 (2013)
  5. Sung, J., Ponce, C., Selman, B., Saxena, A.: Unstructured Human Activity Detection from RGBD Images. In: IEEE International Conference on Robotics and Automation, Saint Paul, Minnesota, USA, pp. 842–849 (2012)
  6. Wang, P., Li, W., Gao, Z., Zhang, J., Tang, C., Ogunbona, P.O.: Action Recognition from Depth Maps Using Deep Convolutional Neural Networks. IEEE Transactions on Human-Machine Systems 46(4), 498–509 (2016)
  7. Chen, C., Liu, M., Zhang, B., Han, J., Jiang, J., Liu, H.: 3D Action Recognition Using Multi-temporal Depth Motion Maps and Fisher Vector. In: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI), pp. 3331–3337 (2016)
  8. Xia, L., Chen, C.C., Aggarwal, J.K.: View Invariant Human Action Recognition Using Histograms of 3D Joints. In: IEEE Conference on Computer Vision and Pattern Recognition Workshops, Providence, RI, USA, pp. 20–27 (2012)
  9. Gaglio, S., Lo Re, G., Morana, M.: Human Activity Recognition Process Using 3-D Posture Data. IEEE Transactions on Human-Machine Systems 45(5), 586–597 (2015)
  10. Cippitelli, E., Gasparrini, S., Gambi, E., Spinsante, S.: A Human Activity Recognition System Using Skeleton Data from RGBD Sensors. Computational Intelligence and Neuroscience 2016, Article ID 4351435, pp. 1–14 (2016)
  11. Farhad, M.B., Jiang, Y., Ma, J.: DMMs-Based Multiple Features Fusion for Human Action Recognition. International Journal of Multimedia Data Engineering and Management (IJMDEM) 6(4), 23–39 (2015)
  12. Aggarwal, J.K., Xia, L.: Human Activity Recognition from 3D Data: A Review. Pattern Recognition Letters 48, 70–80 (2014)

Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  1. Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology, Nagpur, India
  2. Department of Computer Science and Engineering, Indian Institute of Technology, Roorkee, India
