Mossar: motion segmentation by using splitting and remerging strategies


This paper presents Mossar, a novel approach to motion segmentation based on splitting and remerging strategies. Mossar hybridizes two existing approaches to combine their advantages while covering their weaknesses: (1) the velocity-based approach, one of the most widely used, which is computationally simple but has fairly low accuracy, and (2) the graph-based approach, a state-of-the-art method that provides outstanding accuracy but bears high computational complexity and burdens the user with threshold setting. An initial set of key frames is generated by a velocity-based splitting process and then fed into a graph-based remerging process for refinement. We present mechanisms that improve key-frame capture in the velocity-based approach, and we detail how the graph-based approach is modified and applied to remerging. The proposed approach also allows users to interactively increase or reduce the number of key frames to control the segmentation hierarchy, without changing threshold values and re-running segmentation as existing approaches usually require. Our experimental results show that the hybrid approach is more accurate than both the velocity-based and graph-based approaches. Compared with the graph-based approach, it also has lower complexity and fewer thresholds, whose values are much simpler to determine.
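The split-then-remerge idea can be illustrated with a minimal sketch. It is not the Mossar implementation: the function names, the thresholds, and the local-minimum rule for speed are assumptions made for illustration, and the comparison of mean poses is only a simple stand-in for the graph-based matching that the paper actually uses in the remerging step.

```python
import math


def speeds(frames):
    """Per-frame speed: distance between consecutive poses (first value repeated)."""
    steps = [math.dist(a, b) for a, b in zip(frames, frames[1:])]
    return steps[:1] + steps


def split_by_velocity(frames, speed_threshold):
    """Splitting step (assumed rule): key frames at slow local minima of speed."""
    v = speeds(frames)
    return [
        i
        for i in range(1, len(v) - 1)
        if v[i] <= speed_threshold and v[i] <= v[i - 1] and v[i] < v[i + 1]
    ]


def mean_pose(frames, start, end):
    """Average pose over frames[start:end]."""
    count = end - start
    return [sum(f[d] for f in frames[start:end]) / count
            for d in range(len(frames[0]))]


def remerge(frames, keys, min_dissimilarity):
    """Remerging step: repeatedly drop the key frame whose two neighbouring
    segments are most alike, until every remaining pair differs enough.
    Mean-pose distance is a hypothetical placeholder for graph-based matching."""
    keys = list(keys)
    while keys:
        bounds = [0] + keys + [len(frames)]
        scores = [
            math.dist(mean_pose(frames, bounds[j], bounds[j + 1]),
                      mean_pose(frames, bounds[j + 1], bounds[j + 2]))
            for j in range(len(keys))
        ]
        weakest = min(range(len(keys)), key=scores.__getitem__)
        if scores[weakest] >= min_dissimilarity:
            break
        del keys[weakest]
    return keys
```

Here each frame is a list of pose coordinates. Calling `split_by_velocity` first and passing its result to `remerge` mirrors the two stages described above: the cheap velocity pass tends to over-segment, and the remerging pass removes key frames that separate similar segments.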



  1. Ground truth: a set of key frames located where the data should be partitioned to create segments.




The authors wish to thank Professor Kingkarn Sookhanaphibarn, Bangkok University, for her assistance in the technical editing of the manuscript. We would also like to express special thanks to the anonymous reviewers for their valuable comments and suggestions to improve the quality of the paper.


Correspondence to Pujana Paliyawan.


Cite this article

Paliyawan, P., Choensawat, W. & Thawonmas, R. Mossar: motion segmentation by using splitting and remerging strategies. Multimed Tools Appl 77, 27761–27788 (2018).



  • Motion segmentation
  • Motion representation
  • Graph Kernel matching