
Feature-level combination of skeleton joints and body parts for accurate aggressive and agitated behavior recognition


This paper presents a novel and practical approach for aggressive and agitated behavior recognition using skeleton data. Our approach is based on a feature-level combination of joint-based features and body part-based features. To characterize spatiotemporal information, our approach first extracts meaningful joint-based features by computing pairwise distances between the 3D skeleton joint positions at each time frame. Then, distances between body parts as well as joint angles are computed to incorporate body part features. These features are then combined using an ensemble learning method based on rotation forests. A singular value decomposition method is used for feature selection and dimensionality reduction. The proposed approach is validated through extensive experiments on a variety of challenging 3D action datasets for human behavior recognition. We empirically demonstrate that our proposed approach accurately discriminates between behaviors and performs better than several state-of-the-art algorithms.
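The feature pipeline summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 20-joint layout, the random input data, and the choice to reduce dimensionality by projecting centered features onto the top right singular vectors are all assumptions made for illustration. Body-part distances and the rotation forest classifier are omitted because the abstract does not specify them in enough detail.

```python
import numpy as np


def pairwise_joint_distances(frame):
    """Return the J*(J-1)/2 Euclidean distances between all joint pairs.

    frame: array of shape (J, 3) holding the 3D joint positions of one frame.
    """
    diff = frame[:, None, :] - frame[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    upper = np.triu_indices(frame.shape[0], k=1)  # each unordered pair once
    return dist[upper]


def joint_angle(a, b, c):
    """Return the angle in radians at joint b between segments b->a and b->c."""
    u, v = a - b, c - b
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(cos, -1.0, 1.0))


def svd_reduce(X, k):
    """Project the centered rows of X onto the top-k right singular vectors.

    This is one common SVD-based reduction, assumed here for illustration.
    """
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:k].T


# Hypothetical input: 30 frames of a 20-joint skeleton with random coordinates.
rng = np.random.default_rng(0)
sequence = rng.normal(size=(30, 20, 3))

# One feature row per frame: 20 * 19 / 2 = 190 pairwise distances.
features = np.stack([pairwise_joint_distances(f) for f in sequence])
reduced = svd_reduce(features, k=10)
```

In a full system, the reduced features would be passed to the ensemble classifier; here `reduced` simply has one 10-dimensional row per frame.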




  1. Here we use the terms Behavior and Action interchangeably.

  2. Confidence factor C = 0.25.

  3. SVM with radial basis function kernel.

  4. BN with K2 search algorithm.

  5. Number of trees n = 10.

  6. Decision tree as base classifier.

  7. ZeroR as base classifier.

  8. Decision stump tree as base classifier.

  9. RepTree as base classifier.





Corresponding author

Correspondence to Belkacem Chikhaoui.


Cite this article

Chikhaoui, B., Ye, B. & Mihailidis, A. Feature-level combination of skeleton joints and body parts for accurate aggressive and agitated behavior recognition. J Ambient Intell Human Comput 8, 957–976 (2017).



  • Singular Value Decomposition
  • Recognition Accuracy
  • Challenging Behavior
  • Ensemble Method
  • Kinect Sensor