Human motion recognition based on SVM in VR art media interaction environment
To address human motion recognition in multimedia interaction scenarios in virtual reality environments, a motion classification and recognition algorithm based on linear decision and support vector machine (SVM) is proposed. First, a kernel function is introduced into linear discriminant analysis for nonlinear projection, mapping the training samples into a high-dimensional subspace to obtain the best classification feature vector; this effectively handles the nonlinear problem and enlarges the differences between samples. A genetic algorithm is then used to optimize the SVM parameter search, exploiting the strength of genetic algorithms in multi-dimensional space optimization. The test results show that, compared with other classification and recognition algorithms, the proposed method performs well on multiple performance indicators of human motion recognition, with higher recognition accuracy and better robustness.
Keywords: Human motion recognition; Virtual reality; Interactive technology; Support vector machine; Linear decision
LDA: Linear discriminant analysis
SVM: Support vector machine
LDA-GA-SVM: Linear discriminant analysis-genetic algorithm-support vector machine algorithm
K-means-SVM: K-means clustering-support vector machine algorithm
In digital performance, body language often expresses an actor's true feelings better than natural language. Accurate recognition in human–computer interaction is therefore especially important in the virtual environment. At this stage, mainstream human motion recognition methods mainly use machine vision technology, drawing on image processing, pattern recognition, and machine learning. Among them, image processing methods based on spatiotemporal features and machine learning methods based on representation features are more robust and have become the mainstream of current research [25, 26, 27, 28, 29]. Although their computational complexity is high, both kinds of method can recognize continuous motion and interaction. This paper takes the machine learning approach. For example, using the Kinect sensor, Shi et al. [27] proposed a human motion recognition method based on the skeleton characteristics of key frames. The method uses the K-means clustering algorithm to extract key frames and two features from human motion video sequences and uses an SVM classifier to classify action sequences. Qin and Li [28] proposed a portable real-time recognition system for human gestures based on DSP. It uses a combination of wavelet packet principal component analysis and Linear Discriminant Analysis (LDA). All of the above methods achieve a certain degree of precision and efficiency in human motion recognition. However, human body movements in VR multimedia art scenes are more complicated and change more irregularly, so the motion data are massive and high-dimensional and carry nonlinear feature information. Spatial feature extraction therefore needs to reduce the dimension as much as possible while still reflecting the various types of actions. In addition, SVM classifier parameter optimization has room for improvement.
In view of the spatio-temporal continuity of human motion data, two recent CNN-based approaches [30, 31] have been proposed. They use convolutional neural networks (CNN) to solve the problem of coherent motion recognition and use spatiotemporal sequences of convolutional neurons to capture the dependence between input data. However, the size of the convolution kernel limits the range of dependencies that can be captured between data samples. Therefore, typical CNN models are not well suited to recognizing multiple complex motions. Murad and Pyun proposed a human motion classification and recognition algorithm based on Deep Recurrent Neural Networks (DRNN). Although its recognition rate is high, training and recognition rely heavily on parallel GPU operations. This introduces delay and degrades real-time performance, especially in large digital performances. Thus, their algorithm is not suitable for use in real-time evaluation systems.
In this paper, we propose a human motion recognition method based on LDA and SVM (named LDA-GA-SVM) to improve the efficiency and accuracy of human motion recognition in VR human–computer interaction applications. The method addresses two aspects: (1) improving the recognition rate of motion features, and (2) improving the accuracy of motion classification. First, a kernel function is introduced into LDA for nonlinear projection, mapping the training samples into a high-dimensional subspace and obtaining the best classification feature vector. This effectively handles the nonlinear problem, enlarges the differences between samples, and reduces the dimensionality of the vector space, which improves operating efficiency. Second, a genetic algorithm is used to optimize the SVM parameter search, which exploits the strength of genetic algorithms in multi-dimensional space optimization and improves the recognition rate. The experimental results verify the validity and accuracy of the proposed method.
In addition, during the experiment in the VR environment, the motion data of the virtual character in human–computer interaction are acquired mainly with an inertial capture device. Wearable inertial sensors capture the posture data of the main bone joints of the human body. Once the motion capture data have been obtained, the data file can be imported into the skeleton virtual human model to drive the bone movement of that model.
The rest of this paper is organized as follows. The second section introduces the kernel decision LDA algorithm used to extract effective human motion features. The third section introduces the genetically optimized SVM algorithm used for accurate motion classification. The fourth section presents the experimental analysis in the VR environment, in which the traditional K-means-SVM algorithm and the LDA-GA-SVM algorithm proposed in this paper are compared in terms of accuracy, specificity, and sensitivity, demonstrating the advantages of the proposed method.
Feature extraction based on kernel decision LDA
Linear discriminant analysis is a linear method commonly used for feature extraction. The LDA algorithm is insensitive to changes in illumination and pose and is therefore widely used in image recognition tasks. However, algorithms such as traditional LDA are basically linear.
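For reference, the standard textbook form of the LDA criterion can be written as below. This is the generic formulation and not necessarily the notation used by the authors; the symbols $S_b$, $S_w$, $\mu_c$, $\mu$, $n_c$, $X_c$ and $w$ are introduced here only for illustration ($C$ classes, class $c$ containing the $n_c$ samples $X_c$ with mean $\mu_c$, and overall mean $\mu$).

```latex
S_b = \sum_{c=1}^{C} n_c\,(\mu_c - \mu)(\mu_c - \mu)^{\top},
\qquad
S_w = \sum_{c=1}^{C} \sum_{x_i \in X_c} (x_i - \mu_c)(x_i - \mu_c)^{\top}

J(w) = \frac{w^{\top} S_b\, w}{w^{\top} S_w\, w},
\qquad
S_w^{-1} S_b\, w = \lambda\, w
```

The projection $w^{\top} x$ is linear in $x$, so classes that are not linearly separable remain entangled after projection. In addition, $S_w$ is singular whenever there are fewer samples than feature dimensions, so $S_w^{-1}$ does not exist without some form of regularization.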
Due to the complexity and diversity of human motion in VR scenes, some important high-dimensional nonlinear feature information hidden in motion data cannot be extracted. Therefore, this paper introduces a kernel function in the LDA algorithm for nonlinear projection to extract expression features. Combined with the genetically optimized SVM classifier, the complex action classification and recognition is finally realized.
The LDA algorithm is essentially a linear method, so it does not perform well on nonlinear problems, and it also suffers from singularity problems. To extract the nonlinear characteristics of the data efficiently, we use kernel decision LDA to extract features.
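A minimal sketch of kernel-based discriminant feature extraction is given below. It implements the generic kernel Fisher discriminant with a Gaussian (RBF) kernel and a small ridge term, which is an assumption on our part: the source does not state which kernel or regularization the authors use. The function names (`rbf_kernel`, `fit_kernel_lda`, `project`), the parameters `gamma` and `reg`, and the synthetic two-ring data are all hypothetical and serve only to illustrate the idea.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_kernel_lda(X, y, gamma=0.5, reg=1e-3):
    """Return (X, alphas, gamma); alphas has one column per discriminant direction."""
    y = np.asarray(y)
    classes = np.unique(y)
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    mean_all = K.mean(axis=1)
    M = np.zeros((n, n))  # between-class scatter expressed through the kernel
    N = np.zeros((n, n))  # within-class scatter expressed through the kernel
    for c in classes:
        Kc = K[:, y == c]                      # kernel columns of class c, shape (n, n_c)
        nc = Kc.shape[1]
        diff = (Kc.mean(axis=1) - mean_all)[:, None]
        M += nc * (diff @ diff.T)
        centering = np.eye(nc) - np.full((nc, nc), 1.0 / nc)
        N += Kc @ centering @ Kc.T
    N += reg * np.eye(n)                       # ridge term: N alone is singular in general
    # Solve M a = lambda N a by whitening with the Cholesky factor of N.
    L = np.linalg.cholesky(N)
    A = np.linalg.solve(L, np.linalg.solve(L, M).T)   # equals L^{-1} M L^{-T}
    evals, V = np.linalg.eigh((A + A.T) / 2.0)
    top = np.argsort(evals)[::-1][: len(classes) - 1]
    alphas = np.linalg.solve(L.T, V[:, top])          # map back: a = L^{-T} v
    return X, alphas, gamma

def project(model, X_new):
    """Project samples onto the learned nonlinear discriminant directions."""
    X_train, alphas, gamma = model
    return rbf_kernel(X_new, X_train, gamma) @ alphas

def make_rings(rng, n_per_class):
    """Synthetic data: two concentric rings, which no linear projection separates."""
    angles = rng.uniform(0.0, 2.0 * np.pi, size=2 * n_per_class)
    radii = np.repeat([1.0, 3.0], n_per_class) + rng.normal(0.0, 0.1, size=2 * n_per_class)
    X = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])
    return X, np.repeat([0, 1], n_per_class)

rng = np.random.default_rng(0)
X, y = make_rings(rng, 40)
model = fit_kernel_lda(X, y)
Z = project(model, X)  # one discriminant feature per sample for two classes
print("class means after projection:", Z[y == 0].mean(), Z[y == 1].mean())
```

With C classes the sketch keeps at most C - 1 discriminant directions, so the projected features are low-dimensional regardless of the input dimension. The ridge term `reg` is one common way to sidestep the singularity of the within-class scatter.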
Proposed human motion recognition method
Motion data collection
Table: motion capture device parameters (number of sensors, maximum angular frequency)
Motion data classification based on genetic optimization SVM
Human motion recognition realization
Step 1. Collect human motion data.
Step 2. Perform kernel matrix feature extraction based on LDA algorithm.
Step 3. Search for SVM parameters according to the genetic algorithm and determine whether it is optimal.
Step 4. If the parameters are optimal, the search is complete and the parameters are recorded. Otherwise, the search continues.
Step 5. Classify based on the optimized SVM classifier and output the classification result.
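Steps 3 and 4 above can be sketched as a small real-coded genetic algorithm. Several things here are assumptions rather than details from the source: the searched parameters are taken to be the SVM penalty `C` and an RBF kernel width `gamma`, the search runs in log10 space, and termination is a fixed generation budget instead of an explicit optimality test. The fitness function is a toy stand-in; in practice it would be something like the cross-validated accuracy of an SVM trained with the candidate parameters. All names (`ga_search`, `toy_fitness`) are hypothetical.

```python
import math
import random

def ga_search(fitness, bounds, pop_size=20, generations=30,
              crossover_rate=0.8, mutation_rate=0.2, seed=0):
    """Maximize fitness(C, gamma) with a simple real-coded genetic algorithm.

    bounds holds (low, high) pairs for log10(C) and log10(gamma).
    """
    rng = random.Random(seed)

    def evaluate(ind):
        return fitness(10 ** ind[0], 10 ** ind[1])

    def tournament(scored):
        # Pick two distinct individuals and keep a copy of the fitter one.
        a, b = rng.sample(scored, 2)
        return list(a[1] if a[0] >= b[0] else b[1])

    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    best_score, best = -math.inf, None
    for _ in range(generations):
        scored = [(evaluate(ind), ind) for ind in population]
        gen_score, gen_best = max(scored, key=lambda s: s[0])
        if gen_score > best_score:
            best_score, best = gen_score, list(gen_best)
        children = [list(best)]  # elitism: the best individual so far survives
        while len(children) < pop_size:
            p1, p2 = tournament(scored), tournament(scored)
            if rng.random() < crossover_rate:
                w = rng.random()  # blend crossover between the two parents
                child = [w * g1 + (1 - w) * g2 for g1, g2 in zip(p1, p2)]
            else:
                child = p1
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    # Gaussian mutation, clipped to the search bounds.
                    step = rng.gauss(0.0, 0.1 * (hi - lo))
                    child[i] = min(hi, max(lo, child[i] + step))
            children.append(child)
        population = children
    return (10 ** best[0], 10 ** best[1]), best_score

def toy_fitness(C, gamma):
    """Stand-in for cross-validated SVM accuracy; peaks at C = 10, gamma = 0.1."""
    return -((math.log10(C) - 1.0) ** 2 + (math.log10(gamma) + 1.0) ** 2)

(best_C, best_gamma), score = ga_search(toy_fitness, bounds=[(-2.0, 3.0), (-4.0, 1.0)])
print("best C:", best_C, "best gamma:", best_gamma, "fitness:", score)
```

Searching in log space is a common choice because useful values of such parameters typically span several orders of magnitude. Swapping `toy_fitness` for a function that trains and scores a classifier leaves the search loop unchanged.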
Experimental analysis and comparison in VR environment
Software and hardware parameters of the experimental environment
Hardware: AMD FX-8350 CPU; 8 GB RAM; 300 GB hard disk; NVIDIA GeForce GTX 1060, 6 GB DRAM
Software: Windows 10 64-bit; Visual C++ 6.0; DirectX 3D processing software
Rond De Jambe A Terre
Rond De Jambe En l'Air
Battement Releve Lent
Port De Bras
Comparison of experimental results of motion recognition (%), by motion type number
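The comparison relies on per-class indicators such as accuracy, sensitivity, and specificity. The sketch below shows one conventional way to compute them for a single motion type treated as the positive class against all others. The one-vs-rest convention, the function name `one_vs_rest_metrics`, and the sample labels are assumptions made for illustration; they are not data or definitions taken from the paper.

```python
def one_vs_rest_metrics(y_true, y_pred, positive):
    """Accuracy, sensitivity and specificity for one motion type against all others."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,  # true positive rate
        "specificity": tn / (tn + fp) if tn + fp else 0.0,  # true negative rate
    }

# Hypothetical labels for three motion types, numbered 1 to 3.
y_true = [1, 1, 1, 2, 2, 3, 3, 3]
y_pred = [1, 1, 2, 2, 2, 3, 3, 1]
metrics = one_vs_rest_metrics(y_true, y_pred, positive=1)
print(metrics)
```

For motion type 1 in this toy example there are 2 true positives, 1 false negative, 1 false positive and 4 true negatives, giving an accuracy of 0.75, a sensitivity of 2/3 and a specificity of 0.8.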
In this paper, we combine the kernel decision LDA algorithm with the genetic optimization-based SVM algorithm to achieve human motion classification and recognition, with the aim of improving the accuracy of human motion recognition in VR human–computer interaction applications. A kernel function is introduced into LDA for nonlinear projection, mapping the training samples into a high-dimensional subspace and obtaining the best classification feature vector. This effectively handles the nonlinear problem, enlarges the differences between samples, and reduces the dimensionality of the vector space, which improves operating efficiency. In addition, a genetic algorithm is used to optimize the parameter search of the SVM. The experimental results verify the effectiveness and advancement of the proposed method. However, the real-time performance of the algorithm in sample training and testing remains to be studied, and the complexity and scalability of the proposed algorithm will be studied further.
The authors thank the handling editor for great support and all reviewers for their careful reviewing and constructive suggestions.
Fuquan Zhang received the Ph.D. degree from the School of Computer Science & Technology, Beijing Institute of Technology, China in 2019. Currently, he is a professor at Minjiang University, China. He received the silver medal of the 6.18 cross strait staff innovation exhibition and the gold medal of the nineteenth National Invention Exhibition in 2010. In 2012, his proposed project won the gold award of the seventh international invention exhibition. He was awarded the “top ten inventor of Fuzhou” honorary title by Fuzhou, China. He is now a director of Fujian Artificial Intelligence Society. His research interests include artificial intelligence and computer vision.
Tsu-Yang Wu received the Ph.D. degree from the Department of Mathematics, National Changhua University of Education, Taiwan in 2010. Currently, he is an associate professor in the College of Computer Science and Engineering, Shandong University of Science and Technology, China. Previously, he was an assistant professor in the Innovative Information Industry Research Center at Shenzhen Graduate School, Harbin Institute of Technology. He serves as executive editor of Journal of Network Intelligence and as associate editor of Data Science and Pattern Recognition. His research interests include artificial intelligence and information security.
Jeng-Shyang Pan received the Ph.D. degree in Electrical Engineering from the University of Edinburgh, U.K. in 1996. Currently, he is the Director of the Fujian Provincial Key Lab of Big Data Mining and Applications, the Dean of the College of Information Science and Engineering, and an Assistant President at Fujian University of Technology, China. He is an IET Fellow, UK, and was offered the Thousand Talent Program in China in 2010. His research interests include artificial intelligence, pattern recognition, and computer vision.
Gangyi Ding is a professor and doctoral tutor. He received the Ph.D. degree from Beijing Institute of Technology, China in 1993. In December 2008, he became Dean of the School of Software, Beijing Institute of Technology. He has served as a member of the General Technology Department’s Simulation Technology Expert Group, Vice Chairman of the China Computer Simulation Association, Editor of the Computer Simulation Magazine, member of the Quality and Reliability Expert Group of the National Defense Science and Technology Commission, National 863 Information Technology Specialist, and expert for the Beijing Multimedia Public Service platform, among other roles. In 2011, under his leadership, the Ministry of Education approved the establishment of “Digital Performance” as an interdisciplinary discipline. In 2008, he was awarded the title of Olympic Liberation Model, Beijing Mass Economic and Technological Innovation Model, and Beijing Education Innovation Model Award by the Beijing Federation of Trade Unions. In 2009, he was named one of the “Top Ten Capital Education News Figures”. In 2010, he was awarded the title of Beijing Advanced Worker. He won the “Support for Contribution Unit Award” and “Innovation Achievement Award” for the National Day of the Capital.
Zuoyong Li, Ph.D., is a Professor, Executive Deputy Director of the Information Processing and Intelligent Control Key Laboratory of Fujian Province, Director of the E-health Research Center of the Internet Innovation Institute of Minjiang College, and Executive Director of Fujian Artificial Intelligence Society. In July 2010, he received a Ph.D. degree in computer application from Nanjing University of Science and Technology. He is mainly engaged in image processing, pattern recognition, and machine learning. He was selected for the 2013 Outstanding Youth Research Talents Cultivation Program of Fujian Province and the 2015 New Century Excellent Talents Supporting Program of Fujian Province University. In 2015, he was selected for the Young Scholar Program of Minjiang College. He also won the title of 2013 Fuzhou Education System Advanced Worker and, in 2014, the title of Fuzhou City advanced educator.
FZ and TYW designed the flowchart and main algorithms and completed the revisions. JSP designed the experimental environment. GD analyzed the previous related works. ZL analyzed the experimental results. All authors read and approved the final manuscript.
This work was supported by the Research Program Foundation of Minjiang University under Grants No. MYK17021, MYK18033, MJW201831408, and MJW201833313, by the Major Project of Sichuan Province Key Laboratory of Digital Media Art under Grant No. 17DMAKL01, and by the Fujian Province Guiding Project under Grant No. 2018H0028. We also acknowledge the support of the National Natural Science Foundation of China (61772254 and 61871204), the Key Project of College Youth Natural Science Foundation of Fujian Province (JZ160467), the Fujian Provincial Leading Project (2017H0030), the Fuzhou Science and Technology Planning Project (2016-S-116), the Program for New Century Excellent Talents in Fujian Province University (NCETFJ), and the Program for Young Scholars in Minjiang University (Mjqn201601).
The authors declare that they have no competing interests.
- 3. Pan JS, Kong L, Sung TW, Tsai PW, Snasel V (2018) alpha-Fraction first strategy for hierarchical wireless sensor networks. J Internet Technol 19(6):1717–1726
- 12. Lin JC, Fournier-Viger P, Wu L, Gan W, Djenouri Y, Zhang J (2018) PPSF: an open-source privacy-preserving and security mining framework. In: IEEE international conference on data mining workshops (ICDMW), pp 1459–1463, 17–20 Nov. 2018, Singapore
- 22. Zhang F, Ding G, Lin Q, Lin X, Li Z, Li L (2018) Research of simulation of creative stage scene based on the 3DGans technology. J Inf Hiding Multimed Signal Process 9(6):1430–1443
- 24. Zhang F, Ding G, Ma L, Zhu Y, Li Z, Xu L (2018) Research on stage creative scene model generation based on series key algorithms. In: Zhao Y, Wu TY, Chang TH, Pan JS, Jain L (eds) Advances in smart vehicular technology, transportation, communication and applications, vol 128. VTCA. Smart Innovation, Systems and Technologies, Springer, pp 170–177
- 26. Zhang F, Ding G, Lin X, Chen B, Li Z (2018) An effective method for the abnormal monitoring of stage performance based on visual sensor network. Int J Distrib Sens Netw 14(4):1–11
- 27. Shi X, Liu Y, Zhang D (2015) Human body motion recognition method based on key frames. J Syst Simul 27(10):2401–2408
- 28. Qin Q, Li Y (2014) Real-time recognition system of human gestures based on DSP. Electron Technol Appl 40(7):75–78
- 30. Zhang R, Cao S (2019) Real-time human motion behavior detection via CNN using mmWave radar. IEEE Sensors Lett 3(2):3500104
- 33. Li C, Lu Y, Wu J, Zhang Y, Xia Z, Wang T, Yu D, Chen X, Liu P, Guo J (2018) LDA meets Word2Vec: a novel model for academic abstract clustering. In: International World Wide Web Conferences, in the 2018 web conference companion (WWW 2018). April 23–27, 2018, Lyon, France, ACM, New York, pp 1699–1706
- 34. Yu Y, Pan Z, Hu G, Mo X, Xue J (2016) Kernel dimensionality reduction method based on KLDA. J Univ Sci Technol China 9:749–756
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.