Comprehensive Representation and Efficient Extraction of Spatial Information for Human Activity Recognition from Video Data

  • Shobhanjana Kalita
  • Arindam Karmakar
  • Shyamanta M. Hazarika
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 460)


Of late, human activity recognition (HAR) in video has generated much interest. A fundamental step is to develop a computational representation of interactions. The human body is often abstracted using minimum bounding rectangles (MBRs) and approximated as a set of MBRs corresponding to different body parts. Such approximations treat each MBR as an independent entity, which defeats the idea that these are parts of the whole body. A representation schema for interactions between entities, each of which is considered as a set of related rectangles (referred to as an extended object), holds promise. We propose an efficient representation schema for extended objects together with a simple recursive algorithm to extract spatial information. We evaluate our approach and demonstrate that, for HAR, the spatial information thus extracted leads to better models compared to CORE9 [1], a compact and comprehensive representation schema for video understanding.
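To make the notion of an extended object concrete, the following is a minimal Python sketch, not the algorithm proposed in the paper. It assumes axis-aligned MBRs given by corner coordinates, and it treats an extended object simply as a list of part MBRs. The names `MBR`, `bounding_mbr`, and `overlapping_parts` are hypothetical and chosen only for illustration. The sketch first compares the whole-body rectangles and descends to the individual parts only when those intersect.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class MBR:
    """Axis-aligned rectangle with x1 < x2 and y1 < y2 (assumed convention)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def intersects(self, other: "MBR") -> bool:
        # True only when the interiors overlap; touching edges do not count.
        return (self.x1 < other.x2 and other.x1 < self.x2
                and self.y1 < other.y2 and other.y1 < self.y2)


def bounding_mbr(parts: List[MBR]) -> MBR:
    """Smallest rectangle enclosing all part MBRs of one extended object."""
    return MBR(min(p.x1 for p in parts), min(p.y1 for p in parts),
               max(p.x2 for p in parts), max(p.y2 for p in parts))


def overlapping_parts(a: List[MBR], b: List[MBR]) -> List[Tuple[int, int]]:
    """Index pairs (i, j) where part i of `a` overlaps part j of `b`.

    The whole-body rectangles are compared first, so the part-level
    comparison is skipped when the two objects are clearly apart.
    """
    if not bounding_mbr(a).intersects(bounding_mbr(b)):
        return []
    return [(i, j)
            for i, pa in enumerate(a)
            for j, pb in enumerate(b)
            if pa.intersects(pb)]


# Hypothetical example: two people, each described by two body-part MBRs.
person_a = [MBR(0, 4, 2, 6), MBR(0, 0, 2, 4)]      # head, torso
person_b = [MBR(1.5, 1, 3, 2), MBR(3, 0, 5, 4)]    # hand, torso
pairs = overlapping_parts(person_a, person_b)
```

Here only the torso of the first person and the hand of the second overlap, which is the kind of part-level information that is lost when each person is reduced to a single rectangle.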


Latent Dirichlet Allocation; Component Relation; Human Activity Recognition; Extended Object; Partially Overlap
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


  1. Cohn, A.G., Renz, J., Sridhar, M.: Thinking inside the box: A comprehensive spatial representation for video analysis. In: Proc. 13th Int. Conf. on Principles of Knowledge Representation and Reasoning (KR2012). pp. 588–592. AAAI Press (2012)
  2. Aggarwal, J., Ryoo, M.: Human activity analysis: A review. ACM Computing Surveys 43(3), 16:1–16:43 (Apr 2011)
  3. Dubba, K.S.R., Bhatt, M., Dylla, F., Hogg, D.C., Cohn, A.G.: Interleaved inductive-abductive reasoning for learning complex event models. In: ILP. Lecture Notes in Computer Science, vol. 7207, pp. 113–129. Springer (2012)
  4. Kusumam, K.: Relational Learning using body parts for Human Activity Recognition in Videos. Master’s thesis, University of Leeds (2012)
  5. Schneider, M., Behr, T.: Topological relationships between complex spatial objects. ACM Trans. Database Syst. 31(1), 39–81 (2006)
  6. Skiadopoulos, S., Koubarakis, M.: On the consistency of cardinal directions constraints. Artificial Intelligence 163, 91–135 (2005)
  7. Chen, L., Nugent, C., Mulvenna, M., Finlay, D., Hong, X.: Semantic smart homes: Towards knowledge rich assisted living environments. In: Intelligent Patient Management, vol. 189, pp. 279–296. Springer Berlin Heidelberg (2009)
  8. Cohn, A.G., Hazarika, S.M.: Qualitative spatial representation and reasoning: An overview. Fundam. Inform. 46(1-2), 1–29 (2001)
  9. Randell, D.A., Cui, Z., Cohn, A.G.: A spatial logic based on regions and connection. In: Proc. of 3rd Int. Conf. on Principles of Knowledge Representation and Reasoning (KR’92). pp. 165–176. Morgan Kaufmann (1992)
  10. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)
  11. al Harbi, N., Gotoh, Y.: Describing spatio-temporal relations between object volumes in video streams. In: Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence (2015)
  12. Sokeh, H.S., Gould, S., Renz, J.: Efficient extraction and representation of spatial information from video data. In: Proc. of the 23rd Int. Joint Conf. on Artificial Intelligence (IJCAI’13). pp. 1076–1082. AAAI Press/IJCAI (2013)
  13. Fei-Fei, L., Perona, P.: A bayesian hierarchical model for learning natural scene categories. In: IEEE Comp. Soc. Conf. on Computer Vision and Pattern Recognition (CVPR). vol. 2, pp. 524–531 (2005)
  14. Phan, X.H., Nguyen, C.T.: GibbsLDA++: A C/C++ implementation of latent Dirichlet allocation (LDA) (2007)

Copyright information

© Springer Science+Business Media Singapore 2017

Authors and Affiliations

  • Shobhanjana Kalita (1)
  • Arindam Karmakar (1)
  • Shyamanta M. Hazarika (1)
  1. Biomimetic and Cognitive Robotics Lab, Computer Science and Engineering, Tezpur University, Tezpur, India
