Averaging of motion capture recordings for movements’ templates generation
Abstract
In this paper we propose, describe and evaluate a novel motion capture (MoCap) data averaging framework. It incorporates a hierarchical kinematic model, angle coordinate preprocessing methods that recalculate the original MoCap recording so that it can be passed to the averaging algorithms, and finally the signal averaging itself. We have tested two signal averaging methods, namely the Kalman Filter (KF) and Dynamic Time Warping barycenter averaging (DBA). The proposed methods have been tested on MoCap recordings of an elite karate athlete, a multiple champion of Oyama karate knockdown kumite, who performed 28 different karate techniques repeated 10 times each. The proposed methods not only proved highly effective, as measured by root-mean-square deviation (4.04 ± 5.03 degrees for KF and 5.57 ± 6.27 for DBA) and normalized Dynamic Time Warping distance (0.90 ± 1.58 degrees for KF and 0.93 ± 1.23 for DBA), but the reconstruction and visualization of the averaged recordings also preserve all crucial aspects of those complicated actions. The proposed methodology has many important applications in classification, clustering, kinematic analysis and coaching. Our approach generates an averaged full-body motion template that can be used in practice, for example for human action recognition. To demonstrate this, we have evaluated templates generated by our method in human action classification tasks using a DTW classifier. We performed two experiments. In the first, a leave-one-out cross-validation, we obtained 100% correct recognitions. In the second, in which we classified recordings of one person using templates of another, a recognition rate of 94.2% was obtained.
Keywords
Signal averaging, Movements’ templates, Motion capture, Kalman filter, Dynamic time warping, Barycenter averaging, Karate
1 Introduction
Much research on physical activity is based on the evaluation of one or several kinetic or kinematic parameters, such as maximal force, velocity or acceleration, for which the mean value and standard deviation among participants are calculated [16, 64, 66]. Even when an empirical model is presented, very often it does not cover the full body [27, 39, 49, 57, 58, 65]. This is a considerable simplification, because in martial arts, for example, elite fighters use the whole body to achieve the high speed and impact of their techniques. Aspects such as the movement trajectory, which is a crucial part of accurate technique performance, are often not taken into account. There are of course some exceptions to that trend [47]; however, experiments are very often performed on low-precision hardware, which makes accurate results impossible to obtain. What is more, kinematic models are often evaluated on groups of athletes with long experience, and it is assumed that they perform an action “correctly” and “optimally” [50]. When analyzing such results, however, we see how shallow the evaluation is: we know the average results for some activities, but we know nothing about the averaged body motion trajectories with which those results were obtained. In this paper we present motion capture signal averaging techniques that allow action templates to be generated from a group of motion capture (MoCap) recordings. Those templates have many potential applications, for example kinematic analysis, coaching, action classification and clustering.
Human motion models trained on MoCap data typically use Markov models [42], graph representations [13] or Dynamic Time Warping (DTW) [1, 51, 61]. Very often those methods do not evaluate the full body, or they operate in a reduced PCA space [14] in which not all features are taken into account during calculation.
Authors usually use a forward kinematic model to determine the position and orientation of body parts from given joint angles. Conversely, an inverse kinematic model can be used to estimate joint angles from desired or known positions of body parts [56]. In this paper we decided to use local coordinate systems (a hierarchical model) for angle calculation instead of projecting angles onto the sagittal, frontal and transverse planes [66]. This simplifies the calculation of movement features, because in a hierarchical model the data (apart from the root joint) are invariant to any outer coordinate system. Of course, a hierarchical model can be recalculated into a kinematic model that describes angles relative to a common fixed axis, and vice versa. However, with the fixed-axis description, if we compare two MoCap recordings of the same action gathered from two persons who perform the movements facing different directions (for example, when the vector that links the shoulders of one person is perpendicular to the same vector defined by the shoulders of the other person), we obtain different angle values. In the hierarchical kinematic model, all angles apart from those of Hips can be compared directly without additional calculation, because Hips is the root of the hierarchical model and only this joint is responsible for the global rotation of the whole body. The local coordinate system is also more intuitive for coaches and athletes: it is easier to explain a movement (especially a three-dimensional one) relative to other parts of the body than relative to virtual planes, which are more useful for describing one- or at most two-dimensional actions.
We have tested two signal averaging approaches. The first is the Kalman Filter (KF) [34], a very popular signal processing algorithm with many applications, for example in Earth science [43], biomedical engineering [31], robotics [40], kinematic model synthesis [8, 11, 33, 37] and object tracking [60]. The second averaging approach is based on DTW barycenter averaging (DBA) [53], which has already been applied to movement analysis [41, 59]. Our new approach was evaluated on a MoCap dataset of karate techniques. There were two main reasons for choosing this particular type of physical activity. First, even a few months of sport training can visibly improve the gait and posture of people who practice activities such as fitness, Tai Chi or karate [55]. As a result, martial arts training is becoming more and more popular in all age groups, from preschool children to retirees. Because of this popularity there is also a need for computer-aided methods and applications that support training [67, 69]. The second reason is that karate has a large group of well-defined, “standardized” attacking and defense techniques that are practiced by athletes. The model performance of a technique constitutes the natural template of that action. By averaging a group of MoCap recordings of the same action we can not only compare the averaged performance of an athlete with the model performance, but also generate that template numerically.
The main novelty of this paper is the proposal and evaluation of a motion capture data averaging framework. It incorporates a hierarchical kinematic model, angle coordinate preprocessing methods that recalculate the original MoCap recording so that it can be passed to the averaging algorithms, and finally the signal averaging itself. We have tested two signal averaging methods, namely the Kalman Filter (KF) and Dynamic Time Warping barycenter averaging (DBA). We have tested our method on MoCap recordings of an elite karate athlete, a multiple champion of Oyama karate knockdown kumite, who performed 28 different karate techniques repeated 10 times each. The proposed methods not only proved highly effective, as measured by root-mean-square deviation and normalized Dynamic Time Warping distance, but the reconstruction and visualization of the averaged recordings also preserve all crucial aspects of those complicated actions. The method we introduce incorporates reliable and already known approaches (KF and DTW); applied in accordance with our research idea, however, they yield valuable practical output, with applications in, for example, classification, clustering, kinematic analysis and coaching. To demonstrate this, we successfully use templates generated by the proposed framework in a human action recognition task. In this paper we also introduce the “Lastchance” nonlinear averaging algorithm, which is likewise our contribution. To the best of our knowledge, no other published paper has addressed the averaging of whole-body motion capture recordings of the same activity for movement template generation. To generate motion capture templates, our algorithm requires several high-quality MoCap recordings of one person who performs the action to be averaged several times. However, we did not find an available dataset that satisfies those needs and could serve as a general benchmark for our research. Because of that we decided to create our own dataset and make it available for download [68]. It can be used as a reference dataset for future research.
In the next section we define the problem we want to solve and describe the dataset used for validating the methodology, as well as the MoCap hardware and kinematic model used to acquire it. We also introduce the data preprocessing methods and template generation algorithms we have designed. The third section presents the validation results of our approaches. In the fourth section we discuss the results by analyzing the numerical and visual data obtained. The last section summarizes the paper and outlines potential applications and directions for further research.
2 Material and methods
 1. Boundary condition: p_{1} = (1, 1) and p_{L} = (N, M).
 2. Monotonicity condition: n_{1} ≤ n_{2} ≤ ... ≤ n_{L} and m_{1} ≤ m_{2} ≤ ... ≤ m_{L}.
 3. Step size condition: $$p_{l + 1} - p_{l} \in \{(1, 0), (0, 1), (1, 1)\}\quad \text{for}\quad l \in [1 : L - 1].$$
Intuitively, the sequences are warped in a nonlinear fashion to match each other [48]. However, if N = M and p_{l+1} − p_{l} = (1, 1) for every l, the warping is linear. Our goal is to generate a new set of time series that is averaged from many time series representing the same human activity. We consider two cases: in the first the signals can be warped linearly, and in the second the signals are warped nonlinearly. The corresponding averaging algorithms are called the linear averaging algorithm and the nonlinear averaging algorithm, respectively.
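The three conditions on a warping path, and the special case of linear warping, can be checked mechanically. The following is a minimal sketch in Python rather than the R implementation used in this work; the function names `is_valid_warping_path` and `is_linear_warping` are illustrative and do not come from any library.

```python
from typing import List, Tuple

Path = List[Tuple[int, int]]  # list of (n_l, m_l) pairs, 1-based as in the text

ALLOWED_STEPS = {(1, 0), (0, 1), (1, 1)}


def is_valid_warping_path(path: Path, n: int, m: int) -> bool:
    """Check the boundary and step size conditions of a warping path.

    The step size condition already implies the monotonicity condition,
    because every allowed step is non-decreasing in both indices.
    """
    if not path or path[0] != (1, 1) or path[-1] != (n, m):
        return False  # boundary condition violated
    for (n0, m0), (n1, m1) in zip(path, path[1:]):
        if (n1 - n0, m1 - m0) not in ALLOWED_STEPS:
            return False  # step size condition violated
    return True


def is_linear_warping(path: Path, n: int, m: int) -> bool:
    """True when N = M and every step equals (1, 1)."""
    if n != m or not is_valid_warping_path(path, n, m):
        return False
    return all((b[0] - a[0], b[1] - a[1]) == (1, 1)
               for a, b in zip(path, path[1:]))
```

For example, the path (1, 1), (2, 2), (3, 3) is a linear warping of two sequences of length 3, whereas (1, 1), (2, 1), (3, 2), (3, 3) is valid but nonlinear.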
2.1 MoCap hardware and kinematic model
In the hierarchical model we used, rotations of body joints are described relative to their parent joints in a tree-like fashion. In our case (see Fig. 1) the root joint is the Hips joint. The lower body hierarchy goes as follows:
Hips → Thigh (left or right) → Leg (left or right) → Foot (left or right). The upper part of the body is: Hips → SpineLow → SpineMid → Chest → Neck → Head; and: Chest → Shoulder (left or right) → Arm (left or right) → Forearm (left or right) → Hand (left or right). The hierarchy description is symmetrical for the left and right leg and hand. We did not use data from the hand and foot sensors. These signals were omitted because our athlete performed the actions barehanded and barefoot, and we had no means of attaching sensors to those body parts and keeping them in place during kicks and punches performed at full speed. The MoCap system we used for data acquisition is based on inertial sensors. With this technology it is crucial that the tracking sensors remain in a fixed position on the body. It turned out that the construction of our MoCap suit does not prevent the hand and foot sensors from falling off bare limbs. After falling off, those sensors dangled freely, making the acquired signals useless. This is of course a limitation of our MoCap acquisition hardware; valid hand and foot tracking data could be processed with our method. The measurements from the points named SpineLow, SpineMid and Neck are interpolated from the neighboring sensors by the Shadow software. In Fig. 1 we present the local coordinate system of each sensor (all of them are right-handed). In the bottom-right part of the figure we also present the orientations of rotations in a right-handed coordinate system. The tracking frequency was 100 Hz, with 0.5 degrees static accuracy and 2 degrees dynamic accuracy. All angle measurements in the remainder of this paper are given in degrees. The codomain of the signals is [−180, 180).
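The hierarchy described above can be written down as a parent map. The sketch below is only an illustration in Python: the joint identifiers (for example `LeftThigh`) are hypothetical names chosen here, the spine, neck and head joints are assumed to be single (unpaired) joints, and hands and feet are left out because their signals were not used. Under those assumptions the map contains the 16 joints whose three-dimensional signals are analyzed later. The helper `wrap_degrees` maps any angle into the codomain [−180, 180).

```python
from typing import Dict, List, Optional

# Parent of each joint; Hips is the root. Names are illustrative only.
PARENT: Dict[str, Optional[str]] = {
    "Hips": None,
    "LeftThigh": "Hips", "LeftLeg": "LeftThigh",
    "RightThigh": "Hips", "RightLeg": "RightThigh",
    "SpineLow": "Hips", "SpineMid": "SpineLow", "Chest": "SpineMid",
    "Neck": "Chest", "Head": "Neck",
    "LeftShoulder": "Chest", "LeftArm": "LeftShoulder",
    "LeftForearm": "LeftArm",
    "RightShoulder": "Chest", "RightArm": "RightShoulder",
    "RightForearm": "RightArm",
}


def chain_to_root(joint: str) -> List[str]:
    """Return the joint followed by all its ancestors up to the root."""
    chain = [joint]
    while PARENT[chain[-1]] is not None:
        chain.append(PARENT[chain[-1]])
    return chain


def wrap_degrees(angle: float) -> float:
    """Map an angle in degrees into the half-open interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0
```

Only the rotation of `Hips` is expressed in the global frame; every other rotation is relative to the parent given by `PARENT`, which is why those angles can be compared between recordings regardless of the direction the performer faces.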
2.2 Dataset

Karate stances: kibadachi, zenkutsudachi, kokutsudachi. Each stance was preceded by fudodachi. The stances were performed on the left and right side, so there were 6 types of recordings.

Defense techniques: gedanbarai, jodanuke, sotouke and uchiuke. Those techniques were performed with the left and right hand, so there were 8 types of recordings.

Kicks: hizageri, maegeri, mawashigeri and yokogeri. Those techniques were performed with the left and right leg, so there were 8 types of recordings.

Punches: furiuchi, shitauchi and tsuki. Those techniques were performed with the left and right hand, so there were 6 types of recordings. In all punches the rear hand was used.
The athlete performed ten repetitions of each action. Between recordings the MoCap system was recalibrated to maintain adequate motion tracking. The acquired data were then segmented into separate recordings, each containing a single repetition of an action.
2.3 Data preprocessing
The algorithm we propose in this paper makes it possible to average Euler angles. Euler angles are commonly used in biomechanics and kinematic analysis [27, 42, 58, 65, 66], because they are a very intuitive and easily interpreted way of describing rotation. When averaging time-varying signals we have to deal with two factors: the signals may differ in length and in periodicity. A MoCap signal is commonly saved with its codomain limited to 360 degrees. This is sufficient for describing rotation; however, it may be insufficient for direct distance-based comparison of Euler angles, and without such a comparison signal averaging is impossible. We will explain this with the help of images from our dataset.
 1. signal resampling,
 2. angle correction,
 3. angle rescaling.
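The three steps can be illustrated with a short Python sketch. It is not the procedure of this paper, whose details are not reproduced here, but one plausible reading under stated assumptions: resampling is done by linear interpolation to a common length, angle correction removes jumps across the ±180 degree boundary (unwrapping), and angle rescaling shifts each signal by a multiple of 360 degrees so that its mean lies closest to that of the first signal in the group. All function names are hypothetical. The sketch unwraps before interpolating, which differs from the order of the list above, so that linear interpolation never crosses a ±180 jump.

```python
import math
from statistics import mean
from typing import List


def unwrap_degrees(signal: List[float]) -> List[float]:
    """Remove jumps across the +/-180 boundary (assumed angle correction)."""
    out = [signal[0]]
    for prev, cur in zip(signal, signal[1:]):
        step = (cur - prev + 180.0) % 360.0 - 180.0  # shortest signed step
        out.append(out[-1] + step)
    return out


def resample(signal: List[float], length: int) -> List[float]:
    """Linearly interpolate a signal onto `length` evenly spaced samples."""
    if length == 1 or len(signal) == 1:
        return [signal[0]] * length
    scale = (len(signal) - 1) / (length - 1)
    out = []
    for i in range(length):
        pos = i * scale
        lo = min(int(math.floor(pos)), len(signal) - 2)
        frac = pos - lo
        out.append(signal[lo] * (1.0 - frac) + signal[lo + 1] * frac)
    return out


def rescale_to_reference(signal: List[float],
                         reference: List[float]) -> List[float]:
    """Shift by a multiple of 360 so the mean is closest to the reference."""
    shift = 360.0 * round((mean(reference) - mean(signal)) / 360.0)
    return [value + shift for value in signal]


def preprocess(group: List[List[float]], length: int) -> List[List[float]]:
    """Apply the three assumed steps to a group of repetitions of one angle."""
    corrected = [resample(unwrap_degrees(s), length) for s in group]
    reference = corrected[0]  # assumption: first repetition is the reference
    return [rescale_to_reference(s, reference) for s in corrected]
```

After such preprocessing, two repetitions that hover around the ±180 boundary (one stored near 179 degrees and one near −179 degrees) end up on the same branch and can be compared sample by sample.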
2.4 Linear averaging algorithm
2.5 Nonlinear averaging algorithm
 1. Computes DTW between each individual sequence and the temporary average sequence to be refined, in order to find associations between coordinates of the average sequence and coordinates of the set of sequences.
 2. Updates each coordinate of the average sequence as the barycenter of the coordinates associated with it during the first step.
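The two steps above can be sketched for one-dimensional signals as follows. This is a minimal illustration in Python, not the 'dtwclust' implementation used in this work: it assumes the basic DTW recursion with unit step weights, absolute difference as the local distance, the first sequence as the initial average and a fixed number of iterations. The barycenter is an arithmetic mean, so for angles it assumes that the signals have already been preprocessed onto a common branch.

```python
from typing import List, Tuple


def dtw_path(a: List[float], b: List[float]) -> List[Tuple[int, int]]:
    """Return an optimal warping path between a and b as 0-based index pairs."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1],
                                 cost[i - 1][j - 1])
    # Backtrack from the last cell to the first one.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return path


def dba(sequences: List[List[float]], n_iter: int = 10) -> List[float]:
    """DTW barycenter averaging of one-dimensional sequences (sketch)."""
    average = list(sequences[0])  # assumption: start from the first sequence
    for _ in range(n_iter):
        # Step 1: associate coordinates of every sequence with the average.
        buckets: List[List[float]] = [[] for _ in average]
        for seq in sequences:
            for i, j in dtw_path(average, seq):
                buckets[i].append(seq[j])
        # Step 2: replace each coordinate by the barycenter of its bucket.
        average = [sum(b) / len(b) for b in buckets]
    return average
```

Because every warping path visits each index of the average at least once, no bucket is ever empty, and the averaged sequence keeps the length of the initial one.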
2.6 “Lastchance” nonlinear averaging algorithm
3 Results
The proposed approach has been implemented in the R language. We used the ‘dtw’ package [17], which implements DTW, and the ‘dtwclust’ package for DBA. The Kalman filter implementation comes from the ‘KFAS’ package [29]. In DTW we used the Euclidean distance and the well-known symmetric2 step pattern. These are the most typical parameters, and in our experiment they also returned very good results.
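For readers unfamiliar with the symmetric2 step pattern, the following Python sketch shows the recursion in the form commonly attributed to Sakoe and Chiba: horizontal and vertical steps have weight 1, the diagonal step has weight 2, and the accumulated cost is normalized by N + M. It is an illustration only, not the code of the ‘dtw’ package; in particular, the treatment of the first cell is an assumption, so values may differ slightly from those returned by the package. The function name `dtw_symmetric2` is hypothetical.

```python
import math
from typing import Sequence, Tuple

Vector = Sequence[float]


def dtw_symmetric2(a: Sequence[Vector],
                   b: Sequence[Vector]) -> Tuple[float, float]:
    """Return (accumulated cost, cost normalized by N + M).

    a and b are sequences of equal-dimension vectors; the local distance
    between two frames is the Euclidean distance.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    g = [[inf] * (m + 1) for _ in range(n + 1)]
    g[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            g[i][j] = min(g[i - 1][j] + d,          # vertical step, weight 1
                          g[i][j - 1] + d,          # horizontal step, weight 1
                          g[i - 1][j - 1] + 2 * d)  # diagonal step, weight 2
    return g[n][m], g[n][m] / (n + m)
```

With these weights every warping path has the same total weight N + M, which is what makes the normalized distance comparable between recordings of different lengths.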
3.1 Expert evaluation

Furiuchi with the right hand – the right hand is positioned over the left hand in LA (linear averaging) and NA (nonlinear averaging). This turned out to be caused by a MoCap error, so we do not consider it an averaging error.

Gedanbarai with the right hand – the right arm movement is jerky in LA. This is caused by a situation similar to that described in Fig. 5. The LCAC corrected this error, leaving only small jumps of the angle value at the end.

Hizageri with the left leg – the left arm movement is jerky in LA. This is caused by a situation similar to that indicated in Fig. 5. The LCAC corrected this.

Hizageri with the right leg – the left arm movement is jerky in LA and there is a minimal value jump in NA. The whole body posture is shifted to one side; however, this is caused by a MoCap error, so we do not consider it an averaging error.

Jodanuke with the left hand – the left arm movement is jerky in LA. This is caused by a situation similar to that indicated in Fig. 5. The LCAC corrected this error, leaving only a small jump of the angle value at the end.

Tsuki with the right hand – the left arm movement is jerky in LA. There is also an error in the hips rotation (depicted in Fig. 5) in both LA and NA. The LCAC corrected this error, leaving only a small jump of the angle value in the left hand in LA at the end.

Yokogeri with the left and right legs – the movement of the left and right arms is jerky in LA. The LCAC corrected those errors.
3.2 Data preprocessing evaluation
Each of the 28 karate actions was recorded 10 times, and we measured 16 three-dimensional signals (see Section 2.1). In total we had 28 × 10 × 16 = 4480 signals, grouped in tens. We applied the signal preprocessing algorithms from Section 2.3. As mentioned in Section 2.6, seven of the 4480 signals failed the angle rescaling step, which means that they had a different period from the other signals in the same group. The LCAC corrected those errors. The wrongly calculated signals were the same signals that had initially been spotted by an expert.
3.3 Linear evaluation
The diff formula follows from the fact that the distance between two angles expressed in degrees cannot be greater than 180.
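One common way to express such an angular difference, and a root-mean-square deviation built on it, is sketched below in Python. The exact formulas of this paper are not reproduced in this section, so the functions `angle_diff` and `rmsd` are assumptions that merely respect the stated property: the difference between two angles never exceeds 180 degrees.

```python
import math
from typing import Sequence


def angle_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return 360.0 - d if d > 180.0 else d


def rmsd(template: Sequence[float], recording: Sequence[float]) -> float:
    """Root-mean-square deviation between two equally long angle signals."""
    if len(template) != len(recording):
        raise ValueError("signals must have the same length")
    squared = [angle_diff(t, r) ** 2 for t, r in zip(template, recording)]
    return math.sqrt(sum(squared) / len(squared))
```

For example, the angles 179 and −179 degrees differ by 2 degrees under this definition, not by 358.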
3.4 Nonlinear evaluation
3.5 Application for human actions recognition
Table 1 Results of the evaluation of the proposed algorithms (all values in degrees)
Statistic    RMSD linear    RMSD nonlinear    DTW linear     DTW nonlinear
MAX          61.76          65.89             26.49          23.40
MIN          0.23           0.26              0.05           0.04
Mean ± SD    4.04 ± 5.03    5.57 ± 6.27       0.90 ± 1.58    0.93 ± 1.23
Median       2.60           3.70              0.59           0.66
The 12 angles we have chosen to classify the karate kicks from our dataset cover the major possible movements of the thigh and shin well (with our MoCap hardware setup we cannot track the position of the feet). As shown later in this section, this description is sufficient for successful classification of lower-body motions.
This table presents the classification results for the set of karate kicks obtained with the DTW classifier. Rows give the true class and columns the class assigned by the classifier. Altogether there are 20 exemplars of each class (10 for the Oyama master and 10 for the Shorin-Ryu master).
Mae geri right  Mae geri left  Mawashi geri right  Mawashi geri left  Hiza geri right  Hiza geri left  

Mae geri right  20  
Mae geri left  19  1  
Mawashi geri right  20  
Mawashi geri left  20  
Hiza geri right  3  17  
Hiza geri left  3  17 
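The overall recognition rate can be recomputed from the correctly classified exemplars in the table. The short Python sketch below assumes that the larger number in each row is the diagonal entry (the count of correct assignments) and that the last row refers to Hiza geri left; the column positions of the misclassified exemplars are not needed for this calculation.

```python
# Correctly classified exemplars per class, read from the table above.
# Assumption: the larger value in each row is the diagonal entry.
correct = {
    "Mae geri right": 20,
    "Mae geri left": 19,
    "Mawashi geri right": 20,
    "Mawashi geri left": 20,
    "Hiza geri right": 17,
    "Hiza geri left": 17,
}
exemplars_per_class = 20  # 10 per master, as stated in the table caption

total = exemplars_per_class * len(correct)
recognition_rate = sum(correct.values()) / total

print(f"{sum(correct.values())} of {total} correct: {recognition_rate:.1%}")
# prints "113 of 120 correct: 94.2%"
```

The result agrees with the recognition rate of 94.2% reported for this experiment.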
4 Discussion
We have to remember that, although the averaged recordings represent the same body actions, it is virtually impossible for them to be aligned ideally (we discussed this in Section 2.4). That means we cannot expect (5) and (9) to be zero. As can be seen in Table 1, the mean value is comparable to the standard deviation. This relatively large variance of the kinematic parameters among the recordings in our dataset is not a surprise. It is of course several times smaller than the deviation of kinematic variables within a group of karate athletes performing the same activity [65], but it is noticeable even when the data come from a single, experienced athlete. This is, however, an inevitable attribute of precise MoCap measurement, and it makes signal processing of full-body motion capture data a challenging task. Because of this, it is hardly possible to mutually align the source recordings in the averaging process so that (5) and (9) fall below certain thresholds.
As can be seen in Table 1 and Figs. 11–14, both averaging methods gave very satisfying numerical results. The mean value of RMSD between the averaging results and the original MoCap data was 4.04 ± 5.03 degrees for LA and 5.57 ± 6.27 degrees for NA. The median values of those parameters were lower: 2.60 and 3.70 degrees, respectively. This can be explained by the fact that the larger mean value, together with the high standard deviation (larger than the mean value), is caused by a limited group of techniques with highly dynamic movements of the arms (rotation about the X and Z axes) and legs (rotation about the X and Z axes). Those were gedanbarai, jodanuke, hizageri and yokogeri. The highest deviations in those features are easily explained. The shoulder and hip joints have an anatomical ball-and-socket construction, which allows a wide range of rotation about all three axes. In punches and blocks the arm position changes rapidly, which in combination with the periodicity of angles makes the averaging of those joints the most difficult.
Similar conclusions can be drawn from the analysis of the normalized DTW distance. The mean value was 0.90 ± 1.58 degrees for LA and 0.93 ± 1.23 degrees for NA. Here too LA had a smaller mean value but a higher standard deviation than NA. It seems that in our case, in which an elite karate athlete is analyzed, there are not many nonlinear translocations of action components in the analyzed techniques. In summary, there are no large differences between LA and NA in the numerical evaluation. We nevertheless recommend the second one (NA), which obtained better results in the expert evaluation.
5 Conclusion

It can be used to create templates for pattern recognition purposes, especially for distance-based methods and clustering [18, 19, 22, 24, 35, 52]. Classification can be done, for example, with DTW and a feature set designed in a similar way to the one presented in Section 3.5 of this paper.

The averaging ability confirmed by numerical evaluation, together with the possibility of creating correct visual output, makes our methodology usable for coaching purposes. With it a trainer can visualize the averaged performance of an athlete and evaluate his or her kinematic parameters [9, 38].

Many recent studies evaluate only statistical kinematic parameters of actions, such as mean displacement, velocity or acceleration and their angular analogs. With our approach it is possible to compare and display kinematic differences between participants of experiments using the well-established DTW framework (see Figs. 8 and 9).

With the proposed methodology it is possible to compare the kinematic parameters of actions recorded over some period of time, for example to examine the growth of flexibility in the performance of dynamic actions. The proposed averaging approach might be a component of a procedure for evaluating an athlete's progress during training.

The method we propose is a universal approach that can be applied directly to any MoCap dataset (not only karate) that can be represented with a hierarchical model. The methods from Sections 2.3–2.6 can be used without any further generalization for any hierarchical kinematic model.
Notes
Acknowledgment
This work has been supported by the National Science Center, Poland, under project number 2015/17/D/ST6/04051.
References
 1.Adistambha K, Ritz C, Burnett IS (2008) Motion classification using dynamic time warping, international workshop on multimedia signal processing, MMSP 2008, October 810, 2008, Shangrila Hotel, Cairns, Queensland Australia, pp 622–627. https://doi.org/10.1109/MMSP.2008.4665151
 2.Arici T, Celebi S, Aydin A S, Temiz TT (2014) Robust gesture recognition using feature preprocessing and weighted dynamic time warping. Multimedia Tools Appl 72(3):3045–3062CrossRefGoogle Scholar
 3.Arici T, Celebi S, Aydin AS, Temiz TT (2014) Robust gesture recognition using feature preprocessing and weighted dynamic time warping, vol 72Google Scholar
 4.Arulkarthick V J, Sangeetha D (2012) Sign language recognition using kmeans clustered haarlike features and a stochastic context free grammar. Eur J Sci 78:74–84Google Scholar
 5.Berretti S, Daoudi M, Turaga P, Basu A (2018) Representation, Analysis, and Recognition of 3D Humans: A Survey. ACM Trans. Multimedia Comput. Commun. Appl. 14, 1s, Article 16 (March 2018), 36 pages. https://doi.org/10.1145/3182179 Google Scholar
 6.Bianco S, Tisato F (2013) Karate moves recognition from skeletal motion. Inproceedings of the SPIE 8650, ThreeDimensional Image Processing (3DIP) and Applications Burlingame, CA, USAGoogle Scholar
 7.Bielecka M, Piórkowski A (2015) Automatized fuzzy evaluation of CT scan heart slices for creating 3D/4D heart model. Appl Soft Comput 30:179–189. https://doi.org/10.1016/j.asoc.2015.01.023 CrossRefGoogle Scholar
 8.Burke M, Lasenby J (2016) Estimating missing marker positions using low dimensional Kalman smoothing. J Biomech 49:1854–1858CrossRefGoogle Scholar
 9.Burns AM, Kulpa R, Durny A, Spanlang B, Slater M, Multon F (2011) Using virtual humans and computer animations to learn complex motor skills: a case study in karate SKILLSGoogle Scholar
 10.Celiktutan O, Akgul CB, Wolf C, Sankur B (2013) Graphbased analysis of physical exercise actions. In: Proceedings of the 1st ACM international workshop on Multimedia indexing and information retrieval for healthcare (MIIRH ’13), Barcelona, Catalunya, Spain, 21–25, pp. 23–32 ppGoogle Scholar
 11.ChangWhan S, SoonKi J, Kwangyun W Synthesis of human motion using kalman filter, modelling and motion capture techniques for virtual environments: International Workshop, CAPTECH’98 Geneva, Switzerland, November 2627, 1998 Proceedings, Springer Berlin Heidelberg, pp 100112, 1998. https://doi.org/10.1007/3540493840_8 CrossRefGoogle Scholar
 12.Chen X, Koskela M (2015) Skeletonbased action recognition with extreme learning machines. Neurocomputing 149(Part A):387–396CrossRefGoogle Scholar
 13.Endres F, Hess J, Burgard W (2012) Graphbased action models for human motion classification. 7th German Conference on Robotics; Proceedings of ROBOTIK 2012, pp 1–6Google Scholar
 14.Firouzmanesh A, Cheng I, Basu A (2011) Perceptually Guided Fast Compression of 3D Motion Capture Data. IEEE Trans Multimedia 13(4):829–834. https://doi.org/10.1109/TMM.2011.2129497 CrossRefGoogle Scholar
 15.Furlanello C, Merler S, Jurman G (2006) Combining feature selection and DTW for timevarying functional genomics. IEEE Trans Signal Process 54(6):2436–2443. https://doi.org/10.1109/TSP.2006.873715 CrossRefGoogle Scholar
 16.Gheller RG, Dal Pupo J, AcheDias J, Detanico D, Padulo J, dos Santos SG (2015) Effect of different knee starting angles on intersegmental coordination and performance in vertical jumps. Hum Mov Sci 42:71–80CrossRefGoogle Scholar
 17.Giorgino T (2009) Computing and visualizing dynamic time warping alignments in R: The dtw package. J Stat Softw 31(7):1–24. https://doi.org/10.18637/jss.v031.i07 CrossRefGoogle Scholar
 18.Głowacz A (2015) Recognition of acoustic signals of synchronous motors with the use of MoFS and selected classifiers. Measurement Sci Rev 15 (4):167–175. https://doi.org/10.1515/msr20150024 CrossRefGoogle Scholar
 19.Głowacz A, Głowacz Z (2016) Recognition of images of finger skin with application of histogram, image filtration and KNN classifier. Biocybernetics Biomed Eng 36(1):95–101. https://doi.org/10.1016/j.bbe.2015.12.005 CrossRefGoogle Scholar
 20.Guodong L, Leonard M (2006) Estimation of missing markers in human motion capture. Vis Comput 22(9):721–728. https://doi.org/10.1007/s0037100600809 Google Scholar
 21.Gupta S, Jaafar J, Fatimah W, Ahmad W (2012) Static hand gesture recognition using local Gabor filter. Proc Eng 41:827–832CrossRefGoogle Scholar
 22.Hachaj T, Ogiela MR (2012) Semantic Description and Recognition of Human Body Poses and Movement Sequences with Gesture Description Language, Computer Applications for Biotechnology, Multimedia, and Ubiquitous City: International Conferences MulGraB, BSBT and IUrC 2012 Held as Part of the Future Generation Information Technology Conference, FGIT 2012, Gangneug, Korea, December 1619, 2012. Proceedings, Springer Berlin Heidelberg, pp 18. https://doi.org/10.1007/9783642355219_1 Google Scholar
 23.Hachaj T, Ogiela MR, Koptyra K Human actions modeling and recognition in lowdimensional feature space. In: Proceedings of the BWCCA 2015, 10th International Conference on Broadband and Wireless Computing, Communication and Applications, Krakow, Poland, 4–6, November 2015, pp 247–254Google Scholar
 24.Hachaj T, Ogiela MR, Koptyra K (2015) Application of assistive computer vision methods to oyama karate techniques recognition. Symmetry 7(4):1670–1698. https://doi.org/10.3390/sym7041670 MathSciNetCrossRefGoogle Scholar
 25.Hachaja T, Ogiela MR (2015) Full body movements recognition – unsupervised learning approach with heuristic RGDL method, Digital Signal Processing, Volume 46, pp. 239252. https://doi.org/10.1016/j.dsp.2015.07.004 MathSciNetCrossRefGoogle Scholar
 26.Hachaja T, Ogiela MR (2016) Human actions recognition on multimedia hardware using anglebased and coordinatebased features and multivariate continuous hidden Markov model classifier. Multimedia Tools Appl 75(23):16265–16285. https://doi.org/10.1007/s1104201529283 CrossRefGoogle Scholar
 27.Hadizadeh M, Amri S, Mohafez H, Roohi SA, Mokhtar AH (2016) Gait analysis of national athletes after anterior cruciate ligament reconstruction following three stages of rehabilitation program: Symmetrical perspective. Gait Posture 48:152–158CrossRefGoogle Scholar
 28.Helske J KFAS: Kalman Filter and Smoother for Exponential Family State Space Models, 2016, http://cran.rproject.org/package=KFAS (access date 16 October 2016)
 29.Helske J (2016) KFAS: Exponential family state space models in R, Accepted to journal of statistical softwareGoogle Scholar
 30.Huu PC, Le QK, Le TH (2014) Human action recognition using dynamic time warping and voting algorithm. VNU J Sci Comp Sci Com Eng 30(3):22–30MathSciNetGoogle Scholar
 31.Izzetoglu M, Chitrapu P, Bunce S, Onaral B (2010) Motion artifact cancellation in NIR spectroscopy using discrete Kalman filtering. BioMedical Engineering OnLine 9(1):934–938. https://doi.org/10.1186/1475925X916 CrossRefGoogle Scholar
 32.Ji S, Xu W, Yang M, Yu K (2013) 3d convolutional neural networks for human action recognition. IEEE Trans Pattern Anal Mach Intell 35:221–231CrossRefGoogle Scholar
 33.Jin M, Zhao J, Jin J, Yu G, Li W (2014) The adaptive Kalman filter based on fuzzy logic for inertial motion capture system. Measurement 49:196–204CrossRefGoogle Scholar
 34.Kalman RE (1960) A New Approach to Linear Filtering and Prediction Problems. Trans ASME J Basic Eng 82:35–45CrossRefGoogle Scholar
 35.Kanungo T, Mount DM, Netanyahu NS, Piatko CD, Silverman R, Wu AY (2002) An efficient kmeans clustering algorithm: analysis and implementation. IEEE Trans Pattern Analysis Mach Intell Archive 24(7):881–892. https://doi.org/10.1109/TPAMI.2002.1017616 CrossRefGoogle Scholar
 36.Ke SR, Le H, Thuc U, Lee YJ, Hwang JN, Yoo JH, Choi KH (2013) A Review on VideoBased Human Activity Recognition. Computers 2 (2):88–131. https://doi.org/10.3390/computers2020088 CrossRefGoogle Scholar
 37.Kok M, Hol JD, Schön TB (2014) An optimizationbased approach to human body motion capture using inertial sensors, Proceedings of the 19th World Congress, The International Federation of Automatic Control Cape Town, South Africa, August, vol 2429, 2014, pp 79–85CrossRefGoogle Scholar
 38.Kwon DY, Gross M (2005) Combining body sensors and visual sensors for motion training. Proceedings of the 2005 ACM SIGCHI international conference on advances in computer entertainment technology, pp 94–101. https://doi.org/10.1145/1178477.1178490
 39.Lachlan PJ, Haff GG, Kelly VG, Beckman EM (2016) Towards a determination of the physiological characteristics distinguishing successful mixed martial arts athletes: A systematic review of combat sport literature. Sports Medicine 46(10):1525–1551. https://doi.org/10.1007/s4027901604931 CrossRefGoogle Scholar
 40.Larouche BP, Zhu ZH (2014) Autonomous robotic capture of noncooperative target using visual servoing and motion predictive control. Auton Robot 37(2):157–167. https://doi.org/10.1007/s1051401493832 CrossRefGoogle Scholar
 41.Laurent E, Thomas D, Maike B, Gavin M (2016) https://doi.org/10.1080/13658816.2015.1081205. Int J Geogr Inf Sci 30(5):835–853CrossRefGoogle Scholar
 42.Lehrmann AM, Gehler PV, Nowozin S (2014) Efficient nonlinear markov models for human motion, IEEE conference on computer vision and pattern recognition (CVPR)Google Scholar
 43.Li YM, Dong YF, Lai M (2007) Instantaneous spectrum estimation of earthquake ground motions based on unscented Kalman filter method. Appl Math Mech 28(11):1535–1543. https://doi.org/10.1007/s1048300711135 MathSciNetCrossRefGoogle Scholar
44. López-Méndez A, Casas JR (2012) Model-based recognition of human actions by trajectory matching in phase spaces. Image Vis Comput 30:808–816
45. Mead R, Atrash A, Matarić MJ (2013) Automated proxemic feature extraction and behavior recognition: applications in human-robot interaction. Int J Soc Robot 5:367–378
46. Miranda L, Vieira T, Martinez D, Lewiner T, Vieira AW, Campos MFM (2014) Online gesture recognition from pose kernel learning and decision forests. Pattern Recognit Lett 39:65–73
47. Mitsuhashi K, Hashimoto H, Ohyama Y (2014) The curved surface visualization of the expert behavior for skill transfer using Microsoft Kinect. ICINCO 2:550–555. https://doi.org/10.5220/0005101305500555
48. Müller M (2007) Information retrieval for music and motion. Springer-Verlag New York, Inc. ISBN 3540740473
49. Neto OP, Magini M, Pacheco MTT (2008) Electromiographic and kinematic characteristics of Kung Fu Yau-Man palm strike. J Electromyogr Kinesiol 18:1047–1052
50. Neto OP, Silva JH, Marzullo ANAC, Bolander RP, Bir CA (2012) The effect of hand dominance on martial arts strikes. Hum Mov Sci 31:824–833
51. Palma C, Salazar A, Vargas F (2016) HMM and DTW for evaluation of therapeutical gestures using Kinect. arXiv:1602.03742
52. Peng L, Chen L, Wu X, Guo H, Chen G (2016) Hierarchical complex activity representation and recognition using topic model and classifier level fusion. IEEE Trans Biomed Eng PP(99). https://doi.org/10.1109/TBME.2016.2604856
53. Petitjean F, Ketterlin A, Gançarski P (2011) A global averaging method for dynamic time warping, with applications to clustering. Pattern Recogn 44(3):678–693. https://doi.org/10.1016/j.patcog.2010.09.013
54. Piórkowski A, Jajesnica L, Szostek K (2009) Creating 3D web-based viewing services for DICOM images. Computer Networks, Communications in Computer and Information Science, vol 39, pp 218–224. https://doi.org/10.1007/978-3-642-02671-3_26
55. Pliske G, Emmermacher P, Weinbeer V, Witte K (2015) Changes in dual-task performance after 5 months of karate and fitness training for older adults to enhance fall prevention. Aging Clin Exp Res 28:1–8. https://doi.org/10.1007/s40520-015-0508-z
56. Qi Y, Soh CB, Gunawan E, Low KS (2014) A wearable wireless ultrasonic sensor network for human arm motion tracking. 2014 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)
57. Quinzi F, Camomilla V, Felici F, Di Mario A, Sbriccoli P (2013) Differences in neuromuscular control between impact and no impact roundhouse kick in athletes of different skill levels. J Electromyogr Kinesiol 23:140–150
58. Sbriccoli P, Camomilla V, Di Mario A, Quinzi F, Figura F, Felici F (2009) Neuromuscular control adaptations in elite athletes: the case of top level karateka. Eur J Appl Physiol 108(6):1269–1280. https://doi.org/10.1007/s00421-009-1338-5
59. Seto S, Zhang W, Zhou Y, Müller M (2015) Multivariate time series classification using dynamic time warping template selection for human activity recognition. IEEE Symposium Series on Computational Intelligence, SSCI 2015, Cape Town, South Africa, December 7–10, 2015, pp 1399–1406. https://doi.org/10.1109/SSCI.2015.199
60. Shimin Y, Hee NJ, Young CJ, Songhwai O (2011) Hierarchical Kalman-particle filter with adaptation to motion changes for object tracking. Comput Vis Image Underst 115:885–900
61. Slama R, Wannous H, Daoudi M (2013) 3D human video retrieval: from pose to motion matching. Eurographics Workshop on 3D Object Retrieval. https://doi.org/10.2312/3DOR/3DOR13/033-040
62. Stasinopoulos S, Maragos P (2012) Human action recognition using histographic methods and hidden Markov models for visual martial arts applications. In: Proceedings of the 2012 19th IEEE International Conference on Image Processing (ICIP), Orlando, FL, USA, 30 September–3
63. Su CJ, Chiang CY, Huang JY (2014) Kinect-enabled home-based rehabilitation system using dynamic time warping and fuzzy logic. Appl Soft Comput 22:652–666
64. Timmi A, Pennestrí E, Valentini PP, Aschieri P (2011) Biomechanical analysis of two variants of the karate reverse punch (gyaku tsuki) based on the evaluation of the body kinetic energy from 3D mocap data. Multibody Dynamics, ECCOMAS
65. Vences-Brito AM, Rodrigues Ferreira MA, Cortes N, Fernandes O, Pezarat-Correia P (2011) Kinematic and electromyographic analyses of a karate punch. J Electromyogr Kinesiol 21:1023–1029
66. Vieira P, Moreira S, Goethel FM, Gonçalves M (2016) Neuromuscular performance of Bandal Chagui: comparison of subelite and elite taekwondo athletes. J Electromyogr Kinesiol 30:55–65
67. Vignais N, Kulpa R, Brault S, Presse D, Bideau B (2015) Which technology to investigate visual perception in sport: video vs. virtual reality. Hum Mov Sci 39:12–26
68. Website of the GDL project, which hosts the MoCap dataset we used to validate our method. http://gdl.org.pl/ (access date: 11.11.2017)
69. Witte K, Emmermacher P, Bandow N, Masik S (2012) Usage of virtual reality technology to study reactions in Karate-Kumite. Int J Sports Sci Eng 6(1):17–24
70. Yang X, Tian Y (2014) Effective 3D action recognition using EigenJoints. J Visual Commun Image Represent 25:2–11
71. Zhang S, Wei Z, Nie J, Huang L, Wang S, Li Z (2017) A review on human activity recognition using vision-based method. J Healthc Eng 2017:3090343. https://doi.org/10.1155/2017/3090343
Copyright information
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.