Research on the Positioning Technology of Sports 3D Teaching Action Based on Machine Vision

This paper presents a method of action positioning for three-dimensional (3D) sports teaching. Machine vision is used to address the low positioning accuracy and long positioning time of traditional 3D teaching methods. The method works as follows: (a) machine vision is used to determine the world coordinate system of the image; (b) the Markov random field (MRF) algorithm is used to extract the features of the 3D teaching action image; (c) the spatial correlation of the 3D teaching action data is determined. Smoothing filtering is used to suppress and eliminate the noise in the 3D teaching action image. A convolutional neural network (CNN) is then used to reconstruct the 3D teaching action image, and the entropy of the 3D teaching action of physical education is determined by the CNN. Extensive computer simulations confirm the effectiveness of the proposed system. The results show that the system achieves 95% accuracy with a positioning time of 1.9 s.


Introduction
With the development of video and biometric recognition technology, human posture recognition has been widely used in traffic, sports, medicine and other fields, and human motion recognition is among its most widely applied forms. In recent years, the recognition and positioning of human motion has become an active research field with considerable room for development. In sports, positioning 3D teaching actions can help athletes and coaches improve their training level, since coaches can give effective guidance according to the performance of athletes. Since the end of the twentieth century, network-supported distance teaching of knowledge and skills has become a reality, and various online courses and related research have emerged. The network can break through the limitations of time and space, provide personalized teaching and diversified modes of interaction, and make up for the defects of traditional teaching activities. This paper analyzes and summarizes sports skills from the perspective of sports skill teaching theory. A great deal of research has been carried out on the positioning of 3D teaching actions in physical education, with some success [1][2][3].
In reference [4], an action positioning method for CNNs based on global spatiotemporal features is proposed. It is designed to solve the problem of poor local feature extraction of human motion. The method first constructs a dual-channel recognition model of global time-domain and spatial features. In the spatial feature channel, a multi-frame fusion method is introduced to fuse the motion action. The global time-domain channel is then used to analyze the motion action, and the fused data are trained to address this deficiency. With the configured recognition model, the method can quickly complete the pre-positioning step of action recognition, but its scope is narrow: it performs well only within a certain range. In reference [5], a human motion positioning method based on the multi-scale trajectories of optical flow key points is proposed. It targets the complex feature extraction of traditional recognition methods. The scale trajectories of key points in the optical flow region are combined with the human motion trajectory, and constraint conditions for human motion recognition are designed to determine the local characteristics of the action. The resulting action data are then input into the designed classifier to complete the motion recognition. This method can effectively improve the positioning accuracy of the action, but it is subject to many interference factors during operation, de-noising takes a long time, and the positioning time is long, so it still needs further improvement. In reference [6], an action feature tracking algorithm based on Markov random fields is proposed. The algorithm extracts feature sub-blocks of the various parts of the moving human body, selects the tracking sub-blocks with higher significance as the basis of action positioning, obtains the spatial constraint relationship between different sub-blocks through the Markov random field, and positions the action features based on this constraint relationship. The algorithm can accurately divide the different feature sub-blocks of an action, but the resolution of the action image it obtains is low, which leads to low positioning accuracy.
To address these problems, this paper proposes a motion positioning technique that uses machine vision to determine the world coordinate system of the image and uses the MRF algorithm both to extract the features of the three-dimensional teaching action image and to determine the spatial correlation of the three-dimensional teaching action data. Smoothing filtering effectively suppresses and eliminates the noise in the three-dimensional teaching action image, after which the three-dimensional teaching action image is reconstructed. The entropy value of the three-dimensional image is determined by a convolutional neural network. In this way, the teaching behavior is visualized and the three-dimensional positioning of physical education teaching behavior is realized.
The key contributions of this paper are as follows:
• Firstly, the world coordinate system of the three-dimensional teaching action image is determined by machine vision and used to position the motion. The three-dimensional sports teaching action image is first obtained through the camera, and the action is then positioned by an artificial intelligence algorithm.
• Secondly, the MRF algorithm is used to extract the features of the 3D teaching action image of physical education and to determine the spatial correlation of the 3D teaching action data. Smoothing filtering effectively suppresses and eliminates the noise in the image, which completes the preprocessing of the 3D teaching action image of physical education.
• Thirdly, a convolutional neural network is used to determine the entropy value of the sports three-dimensional teaching action image and to realize the positioning of the sports three-dimensional teaching action.
• Finally, computer simulations are performed to evaluate the performance of the proposed system.
The rest of the paper is organized as follows: Section 2 presents the acquisition and preprocessing of the 3D teaching action image of physical education; Section 3 discusses the positioning of 3D teaching actions in physical education; Section 4 describes the experimental analysis; and Section 5 draws the conclusion.

Acquisition method
This paper uses machine vision to position the three-dimensional teaching actions of physical education. Machine vision is a key technology for robots in industrial detection, target recognition, positioning, tracking and other fields [7]. It is non-contact, precise, flexible and reliable, and it adapts well to different environments. Machine vision obtains the scene image through the camera and then preprocesses the image to determine the location of the object under study, which effectively solves the problem of human detection and positioning. In this paper, the three-dimensional sports teaching action image is first obtained through the camera, and an artificial intelligence algorithm is then used to position the action [8,9]. The coordinate systems are first determined through camera calibration, and the sports three-dimensional teaching action image is then obtained. Three coordinate systems are involved: the three-dimensional teaching action image coordinate system, the camera coordinate system and the world coordinate system. The coordinate system of the sports 3D teaching action image is expressed in pixel units, and its origin is located in the lower left corner of the action image.
Let (u, v) denote the coordinates of a point in the action image, both measured in pixels. Let the origin of the coordinate system be at pixel (u_0, v_0); the coordinates of a point in the action image are then given by Eq. (1).
Here, the pixel size of the action image is k × l. Transforming the coordinates of Eq. (1) yields Eq. (2).
In Eq. (2), f represents the focal length of the camera and θ represents the skew of the camera coordinate system. On this basis, the relationship between the action image and the camera coordinate system is determined using Eq. (3).
Where, M represents the intrinsic parameter matrix of the camera, and (k, l, u_0, v_0, f, θ) are the intrinsic parameters of the camera.
Through the world coordinate system, the acquisition of three-dimensional teaching action image of physical education is completed. The world coordinate system is a three-dimensional coordinate system with any point in space as the origin. The relationship between the camera coordinate system and the world coordinate system is represented through Eq. (4).
Where, R_{3×3} represents the rotation matrix, t_{3×1} represents the translation vector, (X_C, Y_C, Z_C) represents the camera coordinates, and (X_W, Y_W, Z_W) represents the world coordinates.
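The chain from world coordinates to pixel coordinates can be sketched as follows. Since Eqs. (1) to (4) are not reproduced here, this is only a minimal sketch under the common textbook pinhole model with skew. It assumes that k and l act as scale factors in pixels per unit length; the function names `intrinsic_matrix` and `world_to_pixel` are hypothetical and not taken from the paper.

```python
import numpy as np

def intrinsic_matrix(f, k, l, u0, v0, theta):
    """Intrinsic matrix M in the common textbook form with skew angle theta.

    Assumption: k and l are scale factors in pixels per unit length,
    so alpha = f * k and beta = f * l are the focal lengths in pixels.
    """
    alpha, beta = f * k, f * l
    return np.array([
        [alpha, -alpha / np.tan(theta), u0],
        [0.0, beta / np.sin(theta), v0],
        [0.0, 0.0, 1.0],
    ])

def world_to_pixel(p_world, R, t, M):
    """Map a world point (X_W, Y_W, Z_W) to pixel coordinates (u, v)."""
    # Rigid transform into the camera frame: P_C = R P_W + t
    p_cam = R @ np.asarray(p_world, dtype=float) + t
    # Homogeneous pixel coordinates, then perspective division by depth
    uvw = M @ p_cam
    return uvw[:2] / uvw[2]

# Illustrative values only: no skew (theta = 90 degrees), camera at the world origin
M = intrinsic_matrix(f=1.0, k=800.0, l=800.0, u0=320.0, v0=240.0, theta=np.pi / 2)
u, v = world_to_pixel([0.1, -0.05, 2.0], np.eye(3), np.zeros(3), M)
```

With θ = 90° the skew term vanishes and M reduces to the familiar form with two focal lengths and a principal point.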
In three-dimensional space, rotations about the x, y and z coordinate axes obtained above are applied, the coordinate angle of the three-dimensional teaching action image of physical education is determined, and the image acquisition is completed using Eq. (5).
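As an illustration of rotating about the three coordinate axes, the sketch below composes the elementary rotation matrices into a single rotation. The composition order (x first, then y, then z) and the function names are assumptions made for this example; Eq. (5) itself is not reproduced here and may use a different convention.

```python
import numpy as np

def rot_x(a):
    """Rotation by angle a (radians) about the x axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]], dtype=float)

def rot_y(b):
    """Rotation by angle b (radians) about the y axis."""
    c, s = np.cos(b), np.sin(b)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]], dtype=float)

def rot_z(g):
    """Rotation by angle g (radians) about the z axis."""
    c, s = np.cos(g), np.sin(g)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], dtype=float)

def rotation_from_angles(a, b, g):
    """Compose the three axis rotations.

    Assumption: the x rotation is applied first, then y, then z,
    so the combined matrix is Rz @ Ry @ Rx.
    """
    return rot_z(g) @ rot_y(b) @ rot_x(a)

R = rotation_from_angles(0.1, 0.2, 0.3)
```

Any matrix built this way is orthonormal with determinant 1, so it can serve as the rotation part of the camera-to-world relationship.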

Image feature extraction
In order to realize the 3D teaching action positioning of physical education, it is necessary to obtain the 3D teaching action data [10] of physical education in the coordinate system of the action image determined above. The characteristic data of the three-dimensional teaching action image of physical education are obtained from the image.
Imaging of physical education 3D teaching actions is complex because the characteristics of these actions are not typical features of target images. Therefore, it is necessary to extract the features of the three-dimensional teaching action image of physical education. This paper uses the MRF algorithm [11] to extract the features of the three-dimensional teaching action image of physical education and to determine the spatial correlation of the three-dimensional teaching action data of physical education.
Suppose A = {a_1, a_2, a_3, …, a_i} represents the set of three-dimensional physical education action data, and the threshold of any value is Δ(1, 2, ∞). The selected value is a random value in the three-dimensional teaching action data of physical education and is regarded as the joint value of these data. The probability of the joint value in the MRF algorithm is then G_a, as represented by Eq. (6).
In the formula, a_i a_n represents the conditional probability of random values in the three-dimensional physical education action data.
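To make the role of conditional probabilities in a Markov random field concrete, the following sketch computes the probability of a label at one site given its four neighbours. It is not the paper's Eq. (6): it assumes a Potts-style prior with a hypothetical coupling strength `beta`, chosen only because it is a common textbook instance of an MRF.

```python
import numpy as np

def local_conditional(labels, i, j, beta=1.0, n_labels=2):
    """P(label at (i, j) = s | 4-neighbours) for s = 0 .. n_labels - 1.

    Assumption: Potts prior, where each neighbour that agrees with the
    candidate label lowers the energy by beta.
    """
    h, w = labels.shape
    neighbours = [labels[y, x]
                  for y, x in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                  if 0 <= y < h and 0 <= x < w]
    # Energy of each candidate label given the neighbouring labels
    energy = np.array([-beta * sum(n == s for n in neighbours)
                       for s in range(n_labels)], dtype=float)
    # Gibbs form: probability proportional to exp(-energy)
    p = np.exp(-energy)
    return p / p.sum()

labels = np.array([[1, 1, 1],
                   [1, 0, 1],
                   [1, 1, 1]])
p = local_conditional(labels, 1, 1)
```

In this example all four neighbours of the centre pixel carry label 1, so the conditional probability of label 1 at the centre is much larger than that of label 0, which is the spatial correlation the MRF is meant to capture.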
According to the joint value of the physical education 3D teaching action random field obtained above, the image threshold of the physical education 3D teaching action is determined. Suppose the three-dimensional teaching action image of physical education contains three parts: target, shadow and background [12]. Set the threshold of these three parts to Z and divide the pixels into two classes p(e, r), as shown by Eq. (7).
$$p(e, r) = \begin{cases} 1, & g(e, r) \geq Z \\ 0, & g(e, r) < Z \end{cases} \qquad (7)$$
Where, 1 represents the characteristic target value of the 3D teaching action image of physical education and 0 represents the background value of the 3D teaching action of physical education.
After obtaining the threshold of the 3D teaching action image of physical education, the feature extraction of the image is completed according to the characteristic target value and the background value, as shown in Eq. (8).
Where, M_k represents the characteristic target value and background value of the 3D teaching action image of physical education.
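The thresholding step that separates target pixels from background pixels can be sketched as below. The comparison direction is an assumption: pixels at or above the threshold are treated as target (1) and the rest as background (0). The function name `binarize` is hypothetical.

```python
import numpy as np

def binarize(gray, z):
    """Return a mask with 1 for target pixels and 0 for background pixels.

    Assumption: a pixel belongs to the target when its gray value
    is greater than or equal to the threshold z.
    """
    return (np.asarray(gray) >= z).astype(np.uint8)

gray = np.array([[10, 200],
                 [128, 50]], dtype=np.uint8)
mask = binarize(gray, 128)
```

If the target is darker than the background in a given image, the comparison would simply be reversed.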
After the above analysis, the three-dimensional teaching action image of physical education is obtained. It is simulated here to better show the target segmentation of three-dimensional actions in physical education teaching, as shown in Fig. 1. Based on the course goal and the depth of the preprocessing of the sample video images, a general teaching requirement is put forward, using the histogram of oriented gradients feature to reflect the concrete behavior. Teaching objectives also refer to the direction of the histogram features and the expected results. For successful image feature extraction, the teaching methods and strategies of output feature selection and the arrangement of the teaching process play an important guiding role. On this basis, the extraction process of the sports 3D teaching action image is set, as shown in Fig. 2.

Pre-processing of image feature data
The original images obtained above do not meet the standards of image processing in terms of format, pixels, brightness, clarity and so on; without preprocessing, the line shape features [13] of the sports 3D teaching action image are degraded. Therefore, it is very important to preprocess the features of the 3D teaching action image. The preprocessing of the 3D teaching action image of physical education is completed by converting the image lines to grayscale and de-noising them.
Although the acquired color 3D teaching action image contains richer edge information than the gray image, there is still no mature algorithm that can preprocess edge line features directly on the color image, and such processing is difficult [14]. Therefore, it is necessary to convert the color image lines into gray image lines in the 3D teaching action images of physical education. In the gray processing of image lines, the three component pixel values of the image lines are extracted as the gray values of three gray images by the component method, as presented by Eq. (9). On this basis, the maximum of the three component pixel values is taken as the gray value, as shown in Eq. (10).
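The two grayscale conversions described above can be sketched as follows. The sketch assumes the color image is stored as an array of shape (height, width, 3) with the three color components along the last axis; the function names are hypothetical.

```python
import numpy as np

def component_grays(rgb):
    """Component method: each color component is used as its own gray image."""
    rgb = np.asarray(rgb)
    return rgb[..., 0], rgb[..., 1], rgb[..., 2]

def max_gray(rgb):
    """Maximum method: the gray value is the largest of the three components."""
    return np.asarray(rgb).max(axis=-1)

rgb = np.array([[[10, 20, 30], [200, 50, 100]]], dtype=np.uint8)
first, second, third = component_grays(rgb)
gray = max_gray(rgb)
```

For the two pixels in this example, the maximum method yields gray values of 30 and 200.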
On this basis, the three-dimensional physical education teaching action image features are preprocessed. In this paper, smoothing filtering is used to effectively suppress and eliminate the noise in the 3D teaching action image of physical education. A pixel is randomly selected from the 3D teaching action image of physical education as the center point, and a square neighborhood is determined around it. All the pixel values in the square neighborhood are collected and sorted, and the median value replaces the pixel value of the selected center point. The square neighborhood is called the filter window [15]; common choices are 3 × 3 and 5 × 5 filter windows. The filter window is moved across the 3D teaching action image of physical education, the gray value of the original image is replaced by the median value obtained, and the image is thus filtered and de-noised, as shown in Eq. (11).
Figure caption: The image feature acquisition of 3D teaching action in physical education
$$G(i, j) = \operatorname{median}\{\, h(k, l) \mid f(i, j) \in N_{f(i, j)} \,\} \qquad (11)$$
Where, G(i, j) is the gray value of any pixel in the 3D teaching action image of physical education after median filtering; h(k, l) is the gray value of any pixel in the initial 3D teaching action image; N_f(i, j) represents the first-order neighborhood of f(i, j).
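A direct, unoptimized sketch of the median filter is given below. The border handling is an assumption: edge pixels are replicated so that the output has the same size as the input. The function name `median_filter` is hypothetical.

```python
import numpy as np

def median_filter(gray, size=3):
    """Replace each pixel by the median of its size x size filter window.

    Assumption: borders are handled by replicating the edge pixels.
    """
    gray = np.asarray(gray)
    pad = size // 2
    padded = np.pad(gray, pad, mode="edge")
    out = np.empty_like(gray)
    for i in range(gray.shape[0]):
        for j in range(gray.shape[1]):
            # Window centered on (i, j) in the original image
            window = padded[i:i + size, j:j + size]
            out[i, j] = np.median(window)
    return out

# A flat image with one impulse-noise pixel in the middle
img = np.full((5, 5), 10, dtype=np.uint8)
img[2, 2] = 255
filtered = median_filter(img, size=3)
```

The isolated bright pixel is removed because it is never the median of any window, which is why median filtering suits impulse noise better than simple averaging.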
This completes the preprocessing of the sports 3D teaching action image data, which is a key step for the subsequent research and design. The extraction also requires multi-feature fusion of the sports 3D teaching action image data, which is studied below.
After extracting the color features, geometric features, and corner features from the 3D teaching action image of physical education, we need to integrate the features before tracking them, and then track the features using Eq. (12).
The process of feature fusion and matching of sports 3D action is shown in Fig. 3.

Physical education using 3-D teaching action positioning
On the basis of the three-dimensional teaching action image of physical education obtained above, the three-dimensional teaching action of physical education is positioned. This paper uses a convolutional neural network to position the three-dimensional teaching action of physical education. A deep convolutional neural network is an artificial neural network of connected neurons with a certain hierarchical structure [16]. Its structure comprises the receptive field, channels, filter bank, feature maps, activation function, convolutional layers and pooling layers. The range of local connection in a convolutional layer is its receptive field. The principle is shown in Fig. 4. In Fig. 4, the receptive field size of the convolutional layer is ξ × l, and the corresponding shared weights are the convolution kernels, also known as filters. Moving different convolution kernels over the spatial dimensions generates multiple feature maps. The convolution operation in Fig. 4 generates feature maps of different sizes; their number is the number of channels, which is determined by the number of convolution kernels in the convolutional layer. Thus, after passing through a convolutional layer, the k-th feature map of layer T has the following element at spatial location (i, j), as presented by Eq. (13). In Eq. (13), ϕ(·) represents the activation function of the deep convolutional neural network, ξ and l represent the size of the receptive field, b represents the number of input layer channels, y is the bias term, and ϖ represents the weights. The pooling layer has no learnable parameters. In essence, the convolutional and pooling layers perform linear filtering and multi-feature transformation [17].
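How a convolutional layer turns an input with several channels into k feature maps can be sketched as follows. This is a generic illustration and not a transcription of Eq. (13): it assumes stride 1, no padding, and ReLU as the activation function, none of which is specified in the text. The array layout and the name `conv_layer` are likewise hypothetical.

```python
import numpy as np

def relu(z):
    """Assumed activation function; the text does not name a specific one."""
    return np.maximum(z, 0.0)

def conv_layer(x, weights, bias, activation=relu):
    """Apply one convolutional layer.

    x:       input of shape (H, W, B), with B input channels
    weights: kernels of shape (K, xi, l, B), one kernel per feature map
    bias:    array of shape (K,)
    Returns feature maps of shape (H - xi + 1, W - l + 1, K).
    """
    K, xi, l, B = weights.shape
    H, W, _ = x.shape
    out = np.empty((H - xi + 1, W - l + 1, K))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Receptive field of size xi x l across all B input channels
            patch = x[i:i + xi, j:j + l, :]
            # Weighted sum over the receptive field for each of the K kernels
            out[i, j, :] = np.tensordot(weights, patch,
                                        axes=([1, 2, 3], [0, 1, 2])) + bias
    return activation(out)

x = np.ones((4, 4, 1))
weights = np.ones((2, 3, 3, 1))
bias = np.array([0.0, -10.0])
maps = conv_layer(x, weights, bias)
```

Here the number of output channels equals the number of kernels (two), matching the statement that the channel count is determined by the number of convolution kernels.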
Assuming that v^(1) = a is the input layer of the deep convolutional neural network, v^(n) represents the response of layer n, as presented in Eq. (14).
Where, e, v and τ are vectors representing the responses and biases, and Q denotes the weight matrices.
First, every candidate box in the 3D teaching action image of physical education is divided into M × N parts, and each region is then pooled, so that candidate boxes of different sizes are transformed into data of the same size and input to the next layer of the convolutional neural network [18][19][20]. Suppose the nodes of the input layer are x_i and node i of the output layer is y_i; the cost function is then obtained as shown in Eq. (15).
Where, δ(i, j) is the decision function indicating whether node i is the maximum output value of the candidate box. The gradient of the cost with respect to x_i is the sum of the related gradients of the following layer. It is assumed that the original gray scale range of the three-dimensional teaching action image of physical education is [0, L-1]. The maximum value obtained is taken as the threshold of the 3D teaching action image of physical education, and the positioning of the action point is completed as given in Eqs. (16) to (18).
Where, σ_0 is the proportion of foreground pixels in the whole picture, σ_1 is the proportion of background pixels in the whole picture, ϑ[i] represents the probability of gray value i, P_1(x) represents the foreground entropy, and P_2(x) represents the background entropy.
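The quantities defined above (class proportions, gray-level probabilities, and the entropies of the two classes) are those of a maximum-entropy threshold selection. Because Eqs. (16) to (18) are not reproduced here, the sketch below follows the standard maximum-entropy formulation and is only an assumed reading of the step. It treats gray values up to the threshold as one class and the rest as the other; which class is foreground depends on the image.

```python
import numpy as np

def max_entropy_threshold(gray, levels=256):
    """Return the gray level that maximizes the sum of the two class entropies.

    Assumption: gray holds non-negative integers in [0, levels - 1].
    """
    hist = np.bincount(np.asarray(gray).ravel(), minlength=levels).astype(float)
    prob = hist / hist.sum()              # probability of each gray value
    best_t, best_h = 0, -np.inf
    for t in range(levels - 1):
        s0 = prob[:t + 1].sum()           # proportion of pixels in class 0
        s1 = 1.0 - s0                     # proportion of pixels in class 1
        if s0 <= 0.0 or s1 <= 0.0:
            continue                      # skip thresholds with an empty class
        q0 = prob[:t + 1] / s0
        q1 = prob[t + 1:] / s1
        h0 = -np.sum(q0[q0 > 0] * np.log(q0[q0 > 0]))   # entropy of class 0
        h1 = -np.sum(q1[q1 > 0] * np.log(q1[q1 > 0]))   # entropy of class 1
        if h0 + h1 > best_h:
            best_t, best_h = t, h0 + h1
    return best_t

# Two well separated groups of gray values: dark (0..3) and bright (200..203)
gray = np.array([0, 1, 2, 3, 200, 201, 202, 203] * 4, dtype=np.uint8).reshape(4, 8)
t = max_entropy_threshold(gray)
```

For this bimodal example the selected threshold falls between the dark and bright groups, so thresholding at `t` separates them cleanly.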

Experimental analysis
To verify the effectiveness of the proposed method, it is compared experimentally with the method of literature [5] and the method of literature [6] under a fixed experimental environment and a fixed set of experimental indicators.

Experimental environment
The experiment is carried out on the MATLAB 7.2 platform. The operating system is Windows XP, with 8 GB of running memory and a 3.6 GHz CPU. To ensure the accuracy of the experimental data analysis, SPSS 13.0 is used to analyze the experimental data. Images of basketball players at a sports school are acquired, and the three methods are used to position their actions.

Experimental parameters
In order to ensure the reliability of the experiment, the relevant parameters used in the experiment are designed, as shown in Table 1.

Result analysis
In order to verify the advantages of the proposed method, the experiment compares it with the method of literature [5] and the method of literature [6], taking the accuracy and the time consumption of action positioning in the sample images as the experimental indicators.

Accuracy analysis
To verify the effectiveness of the proposed method, its positioning accuracy on the sample images is compared with that of the methods of literature [5] and literature [6]. A total of 100 iterations are carried out so that the values are more objective. The results are shown in Fig. 5. As the number of tests changes, there is a clear gap in the positioning accuracy of the sample action images between the proposed method, the method of literature [5] and the method of literature [6]. When the number of tests is 50, the positioning accuracy of the methods of reference [5] and reference [6] is about 55% and 52% respectively, while that of the proposed method is about 95%; when the number of tests is 100, the positioning accuracy of the methods of reference [5] and reference [6] is about 71% and 83% respectively, while that of the proposed method is about 97%. The comparison shows that the positioning accuracy of the proposed method is higher. This is because the proposed method obtains the 3D teaching action image of physical education through machine vision and determines the world coordinate system of the image, uses the MRF algorithm to extract the features of the image and determine the spatial correlation of the 3D teaching action data, and effectively suppresses and eliminates the noise in the image by smoothing filtering. This completes the preprocessing of the 3D teaching action image and improves the positioning accuracy of the method.

Time-consuming analysis
In order to further verify the effectiveness of the proposed method, the experiment analyzes the time the three methods take to position the sample teaching actions.
The data in Fig. 6 show that, in the same experimental environment, the positioning times of the three methods differ. The time consumed by the proposed method is always lower than that of the other two traditional methods, with a minimum of about 1.9 s. This is because the proposed method uses a convolutional neural network to input and output the processed three-dimensional motion image and thus achieves fast positioning.

Fusion accuracy analysis
After the above two indexes are determined, the accuracy of the proposed method and of the methods in literature [5] and literature [6] in the fusion of three-dimensional sports action features is verified in the experiment. The image resolutions are 720 × 400 and 384 × 288, and the experimental results are obtained by running the three methods for 100 iterations, as shown in Fig. 7. Fig. 7a shows that, when the resolution of the 3D sports action image is 720 × 400, the fusion accuracy of the proposed method, the method in reference [5] and the method in reference [6] changes with the number of iterations. When the number of iterations is 20, the fusion accuracy of the proposed method is 95%, that of reference [5] is 62%, and that of reference [6] is 64%; when the number of iterations is 40, the fusion accuracy of the proposed method is 94%, that of reference [5] is 66%, and that of reference [6] is 64%; when the number of iterations is 80, the fusion accuracy of the proposed method is 94% [...] the fusion accuracy of the proposed method is 97%, that of reference [5] is about 70%, and that of reference [6] is about 75%.
When the resolution of the three-dimensional sports action image is 384 × 288, the fusion accuracy of the three methods again changes with the number of iterations. When the number of iterations is 20, the fusion accuracy of the proposed method is 96%, that of reference [5] is 47%, and that of reference [6] is 59%; when the number of iterations is 40, the fusion accuracy of the proposed method is 93%, that of reference [5] is 52%, and that of reference [6] is 45%; when the number of iterations is 80, the fusion accuracy of the proposed method is 93% [...] the fusion accuracy of the proposed method is 95%, that of reference [5] is about 62%, and that of reference [6] is about 54%.
Comparing the results for the two image resolutions, the fusion accuracy of the proposed method is always higher than that of reference [5] and reference [6], which verifies the effectiveness of the proposed method.
Fig. 7 Comparison of feature fusion accuracy of 3D sports action image: (a) fusion accuracy of different methods at 720 × 400 resolution; (b) fusion accuracy of different methods at 384 × 288 resolution

Conclusion
To address the low positioning accuracy and long positioning time of traditional 3D sports teaching, a 3D teaching motion positioning method based on machine vision is proposed. The world coordinate system of the image is determined by machine vision. The MRF algorithm is used to extract the characteristics of 3D teaching action images and determine the spatial correlation of 3D teaching action data. Smoothing filtering effectively suppresses and eliminates the noise in the 3D teaching action image, and on this basis the reconstruction of the 3D teaching action image is completed. A convolutional neural network is used to determine the entropy value of the 3D teaching action image and realize the positioning of the 3D teaching action. Compared with traditional methods, the proposed method has the following advantages: (a) its highest accuracy is about 95%, so the positioning accuracy is high; (b) it takes about 1.9 s, so the positioning time is relatively short. Further research on 3D teaching in physical education is needed to improve the method.

Conflict of interest Authors declare no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.