
Multimedia Tools and Applications, Volume 77, Issue 21, pp 27761–27788

Mossar: motion segmentation by using splitting and remerging strategies

  • Pujana Paliyawan
  • Worawat Choensawat
  • Ruck Thawonmas

Abstract

This paper presents a novel approach to motion segmentation based on splitting and remerging strategies. The proposed approach, Mossar, hybridizes two existing approaches to combine their strengths while compensating for their weaknesses: (1) the velocity-based approach, one of the most widely used, which is computationally simple but only moderately accurate, and (2) the graph-based approach, a state-of-the-art method that achieves outstanding accuracy but incurs high computational complexity and a heavy burden in threshold setting. An initial set of key frames is generated by a velocity-based splitting process and then fed into a graph-based remerging process for refinement. We present mechanisms that improve key-frame capture in the velocity-based approach, as well as details of how the graph-based approach is modified and applied to remerging. The proposed approach also allows users to interactively increase or decrease the number of key frames to control the segmentation hierarchy, without changing threshold values and re-running segmentation as required by existing approaches. Our experimental results show that the proposed hybrid approach outperforms both the velocity-based and graph-based approaches in terms of accuracy; compared with the graph-based approach, it also has lower complexity and fewer thresholds, whose values can be determined much more simply.
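
The split-then-remerge pipeline described above can be sketched in a few lines of code. The following is a minimal illustration, not the authors' implementation: it assumes motion capture data as an (n_frames × n_joints × 3) NumPy array of joint positions, and the function names, default thresholds, and the mean-pose cosine similarity used in place of the paper's graph-kernel comparison are all placeholder assumptions.

```python
import numpy as np

def velocity_based_split(positions, fps=120, vel_threshold=0.05):
    """Over-segment the motion at low-velocity frames.

    positions: (n_frames, n_joints, 3) array of joint positions.
    Returns a strictly increasing list of candidate key-frame indices.
    """
    # Per-frame speed: mean joint displacement between consecutive frames.
    disp = np.linalg.norm(np.diff(positions, axis=0), axis=2)  # (n_frames-1, n_joints)
    speed = disp.mean(axis=1) * fps                            # mean joint speed per frame

    # Candidate key frames: local minima of speed below the threshold.
    keys = [0]
    for t in range(1, len(speed) - 1):
        if speed[t] < vel_threshold and speed[t] <= speed[t - 1] and speed[t] <= speed[t + 1]:
            keys.append(t)
    keys.append(len(positions) - 1)
    return keys

def remerge(positions, keys, sim_threshold=0.9):
    """Greedily drop boundaries between adjacent segments that look alike
    (a simple stand-in for the paper's graph-based remerging)."""
    def descriptor(seg):
        # Placeholder descriptor: per-joint mean position, flattened.
        # In practice poses would first be made root-relative/normalized.
        return seg.mean(axis=0).ravel()

    def similarity(a, b):
        # Cosine similarity between descriptors (placeholder metric; the
        # paper compares segments via graph kernels instead).
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

    merged = [keys[0]]
    for i in range(1, len(keys) - 1):
        left = descriptor(positions[merged[-1]:keys[i]])
        right = descriptor(positions[keys[i]:keys[i + 1]])
        if similarity(left, right) < sim_threshold:
            merged.append(keys[i])   # keep boundary: segments differ
        # else: drop the boundary, i.e. remerge the two segments
    merged.append(keys[-1])
    return merged

# keys = velocity_based_split(positions)   # over-segment first
# keys = remerge(positions, keys)          # then merge spurious splits back
```

In this sketch, raising `sim_threshold` preserves more boundaries and lowering it merges more aggressively, which loosely mirrors the interactive add/reduce control over the number of key frames described in the abstract.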

Keywords

Motion segmentation · Motion representation · Graph · Kernel matching

Acknowledgements

The authors wish to thank Professor Kingkarn Sookhanaphibarn, Bangkok University, for her assistance in the technical editing of the manuscript. We would also like to express special thanks to the anonymous reviewers for their valuable comments and suggestions to improve the quality of the paper.


Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Intelligent Computer Entertainment Lab, Graduate School of Information Science and Engineering, Ritsumeikan University, Kusatsu, Japan
  2. Multimedia Intelligent Technology Lab, School of Information Technology and Innovation, Bangkok University, Bangkok, Thailand
