1 Introduction

With the development of video and biometric recognition technology, human posture recognition has been widely applied in traffic, sports, medical and other fields, with human motion recognition being the most extensive application. In recent years, the recognition and localization of human motion has become a hot research field with considerable room for development. In sports, locating three-dimensional teaching movements can help athletes and coaches improve their training: coaches can give effective guidance according to an athlete's performance. Since the end of the twentieth century, network-supported distance teaching of knowledge and skills has become a reality, and a variety of online teaching courses and related research have emerged. The network can break through the limitations of time and space, provide personalized teaching and diversified modes of interaction, and compensate for the shortcomings of traditional teaching activities. This paper analyzes and summarizes sports skills from the theory of sports skill teaching. Accordingly, much research has been devoted to locating three-dimensional teaching actions in physical education, with notable achievements [1,2,3].

In reference [4], a CNN-based action localization method using global spatiotemporal features is proposed to address the poor local feature extraction of human motion. The method first constructs a dual-channel recognition model of global temporal and spatial features: the spatial feature channel introduces a multi-frame fusion method to fuse the motion actions, and the global temporal channel then analyzes the motion and trains the fused data effectively to compensate for missing motion information. This method quickly completes the pre-positioning of action recognition through the predefined recognition model, but its scope is narrow; it performs well only within a limited range, which is a clear limitation. In reference [5], a human motion localization method based on multi-scale trajectories of optical-flow key points is proposed to address the complex feature extraction of optical flow in traditional recognition methods. The method combines the scale trajectories of key points in the optical-flow region with the human motion trajectory, designs constraint conditions for human motion recognition to determine the local characteristics of an action, and then feeds the resulting action data into a classifier to complete recognition. This method effectively improves positioning accuracy, but many interference factors arise during its operation; denoising takes a long time and the positioning time is correspondingly long, so further improvement is still needed. In reference [6], an action feature tracking algorithm based on Markov random fields is proposed. The algorithm extracts feature sub-blocks of the various parts of the human body in motion, selects the most salient sub-blocks as the basis for action localization, obtains the spatial constraint relationships between sub-blocks through a Markov random field, and localizes the action features from these constraints. The algorithm accurately partitions the feature sub-blocks of an action, but the resolution of the resulting action images is low, which leads to low positioning accuracy.

To address these problems, this paper proposes a novel motion localization technique. Machine vision is used to determine the world coordinate system of the image, and the MRF algorithm is used both to extract the features of the three-dimensional teaching action image and to determine the spatial correlation of the three-dimensional teaching action data. With the help of smoothing filter technology, the noise in the three-dimensional teaching action image is effectively suppressed and eliminated, and the image is then reconstructed. A convolutional neural network determines the entropy value of the three-dimensional image, realizing both the visualization of teaching behavior and the three-dimensional positioning of physical education teaching actions.

The key contributions of this paper are presented as follows:

  • Firstly, the world coordinate system of the three-dimensional teaching action image is determined by machine vision and used to locate the motion. The three-dimensional sports teaching action image is first captured by a camera, and the action is then located using an artificial intelligence algorithm.

  • Secondly, the MRF algorithm is used to extract the features of the 3D teaching action image of physical education and to determine the spatial correlation of the 3D teaching action data. With the help of smoothing filter technology, the noise in the 3D teaching action image is effectively suppressed and eliminated, completing the preprocessing of the image.

  • Thirdly, a convolutional neural network is used to determine the entropy value of the sports three-dimensional teaching action image and to realize the positioning of the three-dimensional teaching action.

  • Finally, computer simulations have been performed to evaluate the performance of the proposed system.

The rest of the paper is organized as follows: Section 2 presents the acquisition and preprocessing of the 3-D teaching action image of physical education; Section 3 discusses the 3-D teaching action positioning for physical education; experimental analysis is described in Section 4; and finally, the conclusion is drawn in Section 5.

2 Acquisition and preprocessing for 3-D teaching action image of physical education

2.1 Acquisition method

In order to realize the three-dimensional teaching action positioning of physical education, this paper uses machine vision. Machine vision is a key technology in industrial detection, target recognition, positioning, tracking and other robotic applications [7]. It offers the advantages of non-contact operation, high precision, flexibility, reliability, and good environmental adaptability. Machine vision captures the scene image through a camera and then preprocesses the image to determine the location of the research object, which effectively solves the problem of human detection and positioning. In this paper, we first capture the three-dimensional sports teaching action image through the camera and then use an artificial intelligence algorithm to locate the action [8, 9].

In this paper, the camera is first calibrated to determine its coordinate systems, and the sports three-dimensional teaching action image is then acquired. Three coordinate systems are involved: the teaching action image coordinate system, the camera coordinate system, and the world coordinate system. The coordinate system of the sports 3D teaching action image is expressed in pixel units, with its origin in the lower-left corner of the action image.

Let (u, v) denote the pixel coordinates of a point in the action image, and let (u0, v0) denote the pixel coordinates of the origin of the coordinate system. The coordinates of an image point are then given by Eq. (1).

$$\left[\begin{array}{c}u\\ v\\ 1\end{array}\right]=\left[\begin{array}{ccc}\frac{1}{k}& 0& u_0\\ 0& \frac{1}{l}& v_0\\ 0& 0& 1\end{array}\right]\left[\begin{array}{c}x\\ y\\ 1\end{array}\right]$$
(1)

where k × l is the physical size of a single pixel of the action image.

Applying the camera's perspective projection to the coordinates in Eq. (1) yields Eq. (2).

$$\left[\begin{array}{c}x\\ y\\ 1\end{array}\right]=\frac{1}{Z_C}\left[\begin{array}{cccc}f& -f\cot\theta& 0& 0\\ 0& \frac{f}{\sin\theta}& 0& 0\\ 0& 0& 1& 0\end{array}\right]\left[\begin{array}{c}X_C\\ Y_C\\ Z_C\\ 1\end{array}\right]$$
(2)

In Eq. (2), f represents the focal length of the camera and θ represents the skew angle of the camera coordinate system. On this basis, the relationship between the action image and the camera coordinate system is determined using Eq. (3).

$$\left[\begin{array}{c}u\\ v\\ 1\end{array}\right]=\frac{1}{Z_C}\left[\begin{array}{cccc}\frac{f}{k}& \frac{-f\cot\theta}{k}& u_0& 0\\ 0& \frac{f}{l\sin\theta}& v_0& 0\\ 0& 0& 1& 0\end{array}\right]\left[\begin{array}{c}X_C\\ Y_C\\ Z_C\\ 1\end{array}\right]=\frac{1}{Z_C}M\left[\begin{array}{c}X_C\\ Y_C\\ Z_C\\ 1\end{array}\right]$$
(3)

where M is the camera intrinsic parameter matrix and (k, l, u0, v0, f, θ) are the camera's internal parameters.

The acquisition of the three-dimensional teaching action image of physical education is completed through the world coordinate system, a three-dimensional coordinate system with an arbitrary point in space as its origin. The relationship between the camera coordinate system and the world coordinate system is given by Eq. (4).

$$\left[\begin{array}{c}X_C\\ Y_C\\ Z_C\\ 1\end{array}\right]=\left[\begin{array}{cc}R_{3\times 3}& t_{3\times 1}\\ 0& 1\end{array}\right]\left[\begin{array}{c}X_W\\ Y_W\\ Z_W\\ 1\end{array}\right]$$
(4)

where R3 × 3 represents the rotation matrix, t3 × 1 represents the translation vector, (XC, YC, ZC) are the camera coordinates, and (XW, YW, ZW) are the world coordinates.

Finally, the x, y coordinates obtained above are rotated in the plane, the coordinate angle of the three-dimensional teaching action image of physical education is determined, and image acquisition is completed using Eq. (5).

$$\left[\begin{array}{c}x^{\prime}\\ y^{\prime}\end{array}\right]=\left[\begin{array}{cc}\cos\theta& \sin\theta\\ -\sin\theta& \cos\theta\end{array}\right]\left[\begin{array}{c}x\\ y\end{array}\right]$$
(5)
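To make the projection pipeline of Eqs. (1)–(4) concrete, the following is a minimal sketch in Python. All calibration values (focal length, pixel size, principal point, skew angle) are illustrative assumptions, not measured parameters of the system described above.

```python
import numpy as np

def intrinsic_matrix(f, k, l, u0, v0, theta):
    """Build the 3 x 4 intrinsic matrix M of Eq. (3) from the
    internal parameters (k, l, u0, v0, f, theta)."""
    return np.array([
        [f / k, -f / (k * np.tan(theta)), u0, 0.0],
        [0.0,   f / (l * np.sin(theta)),  v0, 0.0],
        [0.0,   0.0,                      1.0, 0.0],
    ])

def world_to_pixel(Xw, R, t, M):
    """Project a world point to pixel coordinates via Eqs. (3)-(4)."""
    ext = np.vstack([np.hstack([R, t.reshape(3, 1)]),   # extrinsic matrix of Eq. (4)
                     np.array([[0.0, 0.0, 0.0, 1.0]])])
    Xc = ext @ np.append(Xw, 1.0)                       # homogeneous camera coordinates
    uv = (M @ Xc) / Xc[2]                               # divide by Z_C as in Eq. (3)
    return uv[:2]

# Purely illustrative calibration: identity rotation, 2 m translation,
# 35 mm focal length, 10 um square pixels, zero skew (theta = pi / 2).
R = np.eye(3)
t = np.array([0.0, 0.0, 2.0])
M = intrinsic_matrix(f=0.035, k=1e-5, l=1e-5, u0=320.0, v0=240.0, theta=np.pi / 2)
print(world_to_pixel(np.array([0.1, 0.2, 1.0]), R, t, M))
```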

2.2 Image feature extraction

In order to realize the 3D teaching action positioning of physical education, the 3D teaching action data [10] must be obtained in the image coordinate system determined above. From the three-dimensional teaching action image of physical education, the characteristic data of the image are then extracted.

The imaging of the physical education 3D teaching action image is complex because the teaching action characteristics are not typical target-image features. It is therefore necessary to extract the features of the image explicitly. This paper uses the MRF algorithm [11] to extract the features of the three-dimensional teaching action image of physical education and to determine the spatial correlation of the three-dimensional teaching action data.

Suppose A = {a1, a2, a3, …, ai} represents the set of three-dimensional physical education action data, and the threshold of any value is Δ(1, 2, ∞). A random value from the three-dimensional teaching action data is selected and regarded as the joint value of the data. The probability of this joint value in the MRF algorithm is Ga, as given by Eq. (6).

$${G}_a=G\left(\frac{a_i}{a_n}\right)/\sum_{a=1}^i\left(\frac{a_i}{a_n}\right)$$
(6)

In the formula, \(\frac{a_i}{a_n}\) represents the conditional probability of random values in the three-dimensional physical education action data.

According to the joint value of the physical education 3D teaching action random field obtained above, the image threshold of the 3D teaching action is determined. Suppose the image contains three parts: target, shadow and background [12]. A threshold Z is set for these parts, and each pixel p(e, r) is binarized as shown by Eq. (7).

$$p\left(e,r\right)=\left\{\begin{array}{ll}1,& g\left(e,r\right)\ge Z\\ 0,& g\left(e,r\right)<Z\end{array}\right.$$
(7)

where 1 represents the characteristic target value of the 3D teaching action image of physical education and 0 represents the background value.

After the threshold of the 3D teaching action image of physical education is obtained, feature extraction is completed according to the characteristic target value and the background value, as shown in Eq. (8).

$${p}^k=\frac{\sum_{p\left(e,r\right)\le {t}_k}p\left(e,r\right)}{M_k}$$
(8)

Where, Mk represents the characteristic target value and background value of 3D teaching action image of physical education.
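The thresholding of Eq. (7) and the feature statistic of Eq. (8) can be sketched as follows; the threshold Z and the level t_k are illustrative assumptions, since the paper does not fix their values.

```python
import numpy as np

def binarize(g, Z):
    """Eq. (7): label each pixel 1 (characteristic target) if its gray
    value g(e, r) reaches the threshold Z, otherwise 0 (background)."""
    return (g >= Z).astype(np.uint8)

def feature_statistic(g, t_k):
    """Eq. (8): average of the pixel values not exceeding the level t_k,
    normalised by their count M_k."""
    selected = g[g <= t_k]
    M_k = selected.size
    return selected.sum() / M_k if M_k else 0.0

# Illustrative 8-bit action-image patch.
g = np.random.randint(0, 256, size=(64, 64))
mask = binarize(g, Z=128)        # target / background split of Eq. (7)
print(feature_statistic(g, t_k=128))
```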

Through the above analysis, the three-dimensional teaching action image of physical education is obtained. To better show the target segmentation of the three-dimensional action in physical education teaching, this paper simulates it, as shown in Fig. 1.

Fig. 1 The image feature acquisition of 3D teaching action in physical education

Based on the course goal and the depth of the preprocessed sample video images, a general teaching requirement is formulated using the histogram of oriented gradients (HOG) feature to reflect concrete behavior. Teaching objectives likewise refer to the oriented-gradient features and the expected results. From the standpoint of successful image feature extraction, the teaching methods and strategies of output feature selection, together with the arrangement of the teaching process, play an important guiding role. On this basis, the extraction process of the sports 3D teaching action image is established, as shown in Fig. 2.

Fig. 2 Feature extraction process of the 3D teaching action image of physical education

2.3 Pre-processing of image feature data

The raw physical education 3D teaching action image data obtained above do not meet the standards required by image processing technology in terms of format, pixels, brightness, or clarity; without preprocessing, the linear features [13] of the image would be degraded. Preprocessing the features of the 3D teaching action image is therefore essential. In this work, preprocessing is completed by grayscaling the image lines and denoising them.

Although the acquired 3D teaching action image contains richer edge information than a grayscale image, no mature algorithm yet exists for preprocessing edge-line features directly on color images, and such processing is difficult [14]. The color image lines must therefore be converted into grayscale image lines. In grayscale processing, the component method extracts the three component pixel values of the image lines as the gray values of three grayscale images, as given by Eq. (9).

$$\left\{\begin{array}{c}\mathrm{Gray}1\left(i,j\right)=R\left(i,j\right)\\ {}\mathrm{Gray}2\left(i,j\right)=G\left(i,j\right)\\ {}\mathrm{Gray}3\left(i,j\right)=B\left(i,j\right)\end{array}\right.$$
(9)

where Gray1, Gray2 and Gray3 represent the gray values of the three component images of the three-dimensional teaching action image of physical education; R, G, B correspond to the three primary colors of the image lines; and R(i, j), G(i, j), B(i, j) are the pixel values of the image line at the point (i, j).

On this basis, the maximum of the three component pixel values is taken as the gray value, as shown in Eq. (10).

$$\boldsymbol{Gray}\left(i,j\right)=\mathit{\max}\left[R\left(i,j\right),G\left(i,j\right),B\left(i,j\right)\right]$$
(10)
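A minimal sketch of the two grayscale conversions in Eqs. (9) and (10); the input image here is random and purely illustrative.

```python
import numpy as np

def component_grays(rgb):
    """Eq. (9): treat each colour component of the image lines as its
    own grayscale image (Gray1, Gray2, Gray3)."""
    return rgb[..., 0], rgb[..., 1], rgb[..., 2]

def max_gray(rgb):
    """Eq. (10): take the per-pixel maximum of R, G, B as the gray value."""
    return rgb.max(axis=-1)

# Illustrative colour frame of a teaching action.
rgb = np.random.randint(0, 256, size=(480, 640, 3), dtype=np.uint8)
gray1, gray2, gray3 = component_grays(rgb)
gray = max_gray(rgb)
```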

On this basis, the features of the three-dimensional physical education teaching action image are preprocessed. This paper uses smoothing filter technology to effectively suppress and eliminate the noise in the 3D teaching action image. A pixel is selected from the image as the center point, a square neighborhood is determined around it, and all pixel values in this neighborhood are collected and sorted. The median value then replaces the center pixel value. The square neighborhood is called the filter window [15]; common choices are 3 × 3 and 5 × 5 windows. By moving the filter window across the 3D teaching action image and replacing the gray value of the original image with the obtained median, the image is filtered and denoised as shown in Eq. (11).

$$G\left(i,j\right)=\boldsymbol{median}\left\{h\left(k,l\right)\left|f\left(i,j\right)\in Nf\left(i,j\right)\right.\right\}$$
(11)

where G(i, j) is the gray value of any pixel in the 3D teaching action image of physical education after median filtering; h(k, l) is the gray value of any pixel in the initial 3D teaching action image; and Nf(i, j) represents the filter neighborhood of f(i, j).
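The median filtering of Eq. (11) can be sketched as follows, assuming a 3 × 3 filter window; the edge-padding strategy is an assumption not specified in the text.

```python
import numpy as np

def median_filter(gray, window=3):
    """Eq. (11): slide a square filter window over the image and replace
    each center pixel with the median of its neighborhood N_f(i, j)."""
    pad = window // 2
    padded = np.pad(gray, pad, mode="edge")   # edge padding is an assumption
    out = np.empty(gray.shape)
    for i in range(gray.shape[0]):
        for j in range(gray.shape[1]):
            out[i, j] = np.median(padded[i:i + window, j:j + window])
    return out

gray = np.random.randint(0, 256, size=(64, 64))
denoised = median_filter(gray, window=3)      # 3 x 3 filter window
```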

This completes the preprocessing of the sports 3D teaching action image data, a key step for the subsequent research and design. The extraction also requires multi-feature fusion of the sports 3D teaching action image data, which this paper studies in depth.

After the color features, geometric features, and corner features are extracted from the 3D teaching action image of physical education, the features must be fused before tracking; they are then tracked using Eq. (12).

$$U\left(i,j\right)=\alpha p(x)\sum_{i=1}^n(pv)$$
(12)
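Eq. (12) does not fully specify the fusion operator, so the following is only a hedged reading: the color, geometric, and corner features are concatenated, normalised, and scaled by a fusion weight α. All feature vectors, the normalisation, and the value of α are illustrative assumptions.

```python
import numpy as np

def fuse_features(features, alpha=0.5):
    """A hedged reading of Eq. (12): concatenate the per-feature
    responses and scale by the fusion weight alpha before matching.
    Both the normalisation and alpha are illustrative assumptions."""
    stacked = np.concatenate([np.ravel(f) for f in features])
    return alpha * stacked / (np.linalg.norm(stacked) + 1e-12)

color_feat = np.random.rand(16)    # e.g. a colour histogram (illustrative)
geom_feat = np.random.rand(8)      # e.g. geometric moments (illustrative)
corner_feat = np.random.rand(12)   # e.g. corner responses (illustrative)
fused = fuse_features([color_feat, geom_feat, corner_feat])
```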

The process of feature fusion and matching of sports 3D action is shown in Fig. 3.

Fig. 3 The process of feature fusion and matching of sports 3D action

3 Physical education using 3-D teaching action positioning

On the basis of the preprocessed three-dimensional teaching action images, the three-dimensional teaching actions of physical education are positioned. This paper uses a convolutional neural network to locate them.

A deep convolutional neural network is an artificial neural network with fixed neuron connectivity and a layered structure [16]. Its structural elements are the receptive field, channels, filter banks, feature maps, activation functions, convolutional layers, and pooling layers. The convolutional layer operates over a range of local connections, i.e., its receptive field. The principle is shown in Fig. 4.

Fig. 4 A schematic diagram of convolution and pooling in a deep convolutional neural network

In Fig. 4, the receptive field size of the convolutional layer is ξ × l, and the corresponding shared weights are the convolution kernels, also known as filters. Moving different convolution kernels across the spatial dimensions generates multiple feature maps; their number equals the number of channels, which is determined by the number of kernels in the convolutional layer. Thus, after passing through a convolutional layer, the k-th feature map of layer t has the following element at spatial location (i, j), as presented by Eq. (13).

$${x}_{i,j,k}^{(t)}=\phi \left(\sum_{c=1}^{b}\sum_{m=1}^{\xi}\sum_{n=1}^{l}{x}_{i+m,j+n,c}^{\left(t-1\right)}\times {\varpi}_{m,n,c,k}^{(t)}+{y}_{k}^{(t)}\right)$$
(13)

In Eq. (13), ϕ(·) represents the activation function of the deep convolutional neural network, ξ and l represent the size of its receptive field, b represents the number of input-layer channels, y is the bias term, and ϖ represents the weights. The pooling layer of a deep convolutional neural network has no learnable parameters. In essence, the convolutional and pooling layers perform linear filtering and multi-feature transformation [17]. Assuming that v(1) = a is the input layer of the network, v(n) represents the response of layer n, as presented in Eq. (14).

$$\begin{array}{c}{e}^{\left(n+1\right)}={Q}^{(n)}{v}^{(n)}+{\tau}^{(n)}\\ {v}^{\left(n+1\right)}={\phi}^{\left(n+1\right)}\left({e}^{\left(n+1\right)}\right)\end{array}$$
(14)

where e and v are response vectors, τ is the bias vector, and Q is the weight matrix.
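The convolution of Eq. (13) can be sketched directly in Python; the ReLU activation and the random kernels are illustrative assumptions, since the paper does not name a specific ϕ.

```python
import numpy as np

def conv_layer(x, w, y, phi=lambda z: np.maximum(z, 0.0)):
    """Eq. (13): convolve the input feature maps x (H x W x b) with the
    kernels w (xi x l x b x K), add the bias y (K,), and apply the
    activation phi (ReLU here is an illustrative assumption)."""
    H, W, b = x.shape
    xi, l, _, K = w.shape
    out = np.empty((H - xi + 1, W - l + 1, K))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = x[i:i + xi, j:j + l, :]     # receptive field at (i, j)
            for k in range(K):
                out[i, j, k] = phi(np.sum(patch * w[..., k]) + y[k])
    return out

x = np.random.rand(8, 8, 3)                      # previous-layer feature maps
w = np.random.rand(3, 3, 3, 4) * 0.1             # four convolution kernels
print(conv_layer(x, w, y=np.zeros(4)).shape)     # (6, 6, 4) feature maps
```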

Firstly, in the 3D teaching action orientation of physical education, every candidate box of the 3D teaching action image is divided into an M × N grid. Each region is then pooled, so that candidate boxes of different sizes in the image are transformed into fixed-size data and input to the next layer of the convolutional neural network [18,19,20]. Suppose the input-layer nodes are xi and the output-layer node i is yi; the cost function is then given by Eq. (15).

$$\frac{\partial L}{\partial {x}_i}=\left\{\begin{array}{ll}0,& \delta \left(i,j\right)=F\\ \frac{\partial L}{\partial {y}_j},& \delta \left(i,j\right)=T\end{array}\right.$$
(15)

where δ(i, j) indicates whether node xi produced the maximum output value of the candidate box at output node j. The gradient of the cost with respect to xi is the sum of the correlated gradients from the following layer.
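The gradient routing of Eq. (15) corresponds to backpropagation through max pooling: only the input position that produced the pooled maximum (δ(i, j) = T) receives the upstream gradient. A minimal sketch, assuming a 2 × 2 non-overlapping window:

```python
import numpy as np

def max_pool_backward(x, grad_out, size=2):
    """Eq. (15): route the upstream gradient only to the input position
    that produced the pooled maximum (delta(i, j) = T); every other
    position receives zero gradient."""
    grad_in = np.zeros_like(x)
    for i in range(grad_out.shape[0]):
        for j in range(grad_out.shape[1]):
            patch = x[i * size:(i + 1) * size, j * size:(j + 1) * size]
            m, n = np.unravel_index(np.argmax(patch), patch.shape)
            grad_in[i * size + m, j * size + n] = grad_out[i, j]
    return grad_in

x = np.random.rand(4, 4)            # pooling-layer input
grad_out = np.ones((2, 2))          # upstream gradient from the next layer
print(max_pool_backward(x, grad_out))
```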

It is assumed that the original grayscale range of the three-dimensional teaching action image of physical education is [0, L − 1]. The threshold that maximizes the total entropy is taken as the threshold of the 3D teaching action image, and the localization of the action points is completed as given in Eqs. (16) to (18).

$${P}_1(x)=-\sum_{i=0}^{n-1}\left(\vartheta \left[i\right]/{\sigma}_0(x)\right)\times \log \left(\vartheta \left[i\right]/{\sigma}_0(x)\right)$$
(16)
$${P}_2(x)=-\sum_{i=1}^{L-1}\left(\vartheta \left[i\right]/{\sigma}_1(x)\right)\times \log \left(\vartheta \left[i\right]/{\sigma}_1(x)\right)$$
(17)
$$P=\mathit{\operatorname{Max}}\left({P}_1(x)+{P}_2(x)\right)$$
(18)

where σ0(x) is the proportion of foreground pixels in the whole image, σ1(x) is the proportion of background pixels, ϑ[i] is the probability of gray value i, P1(x) is the foreground entropy, and P2(x) is the background entropy.
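Eqs. (16)–(18) amount to maximum-entropy threshold selection: the threshold whose foreground and background entropies sum to the maximum P is chosen. A minimal sketch, assuming an 8-bit grayscale image:

```python
import numpy as np

def max_entropy_threshold(gray, L=256):
    """Eqs. (16)-(18): pick the threshold whose foreground and
    background entropies P1 + P2 are maximal."""
    hist, _ = np.histogram(gray, bins=L, range=(0, L))
    theta = hist / hist.sum()                  # gray-value probabilities
    best_t, best_P = 1, -np.inf
    for t in range(1, L):
        sigma0 = theta[:t].sum()               # foreground proportion
        sigma1 = 1.0 - sigma0                  # background proportion
        if sigma0 <= 0.0 or sigma1 <= 0.0:
            continue
        p0 = theta[:t][theta[:t] > 0] / sigma0
        p1 = theta[t:][theta[t:] > 0] / sigma1
        P = -(p0 * np.log(p0)).sum() - (p1 * np.log(p1)).sum()
        if P > best_P:
            best_t, best_P = t, P
    return best_t

gray = np.random.randint(0, 256, size=(128, 128))
print(max_entropy_threshold(gray))
```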

4 Experimental analysis

To verify the effectiveness of the proposed method, the experiment compares it with the methods of literature [5] and literature [6] under a controlled experimental environment, using defined experimental indicators.

4.1 Experimental environment

The experiment is carried out on the MATLAB 7.2 platform. The operating system is Windows XP with 8 GB of running memory and a 3.6 GHz CPU. To ensure the accuracy of the experimental data analysis, SPSS 13.0 is used to analyze the data. Images of basketball players at a sports school are acquired, and the three methods are used to locate their actions.

4.2 Experimental parameters

In order to ensure the reliability of the experiment, the relevant parameters used in the experiment are designed, as shown in Table 1.

Table 1 Data sheets for experiments

4.3 Result analysis

To demonstrate the advantages of the proposed method, the experiment compares it with the methods of literature [5] and literature [6], taking the accuracy and time consumption of sample-image action positioning as the experimental indicators.

4.3.1 Accuracy analysis

To verify the effectiveness of the proposed method, its sample-image localization accuracy is compared experimentally with that of the methods in literature [5] and literature [6]. A total of 100 test iterations are carried out so that the values are more objective. The results are shown in Fig. 5.

Fig. 5 Accuracy analysis of three-dimensional teaching action positioning of physical education by different methods

The test results show that, as the number of tests changes, there is a clear gap in the positioning accuracy of the sample action images among the proposed method and the methods of literature [5] and literature [6]. At 50 tests, the localization accuracy of the reference [5] and reference [6] methods is about 55% and 52% respectively, while that of the proposed method is about 95%; at 100 tests, the accuracy of the reference [5] and reference [6] methods is about 71% and 83% respectively, while that of the proposed method is about 97%. The comparison shows that the positioning accuracy of the proposed method is higher. This is because the method obtains the 3D teaching action image of physical education through machine vision and determines the world coordinate system of the image; uses the MRF algorithm to extract the image features and determine the spatial correlation of the 3D teaching action data; and effectively suppresses and eliminates the image noise with smoothing filter technology, thereby completing the preprocessing of the 3D teaching action image and improving the positioning accuracy.

4.3.2 Time-consuming analysis

To further verify the scientific effectiveness of the proposed method, the experiment analyzes the time each of the three methods takes to locate the sample teaching actions.

Analysis of the data in Fig. 6 shows that, in the same experimental environment, the positioning times of the three methods differ. The time consumption of the proposed method is always lower than that of the other two traditional methods, with a minimum of about 1.9 s, while the other two methods take longer. This is because the proposed method uses a convolutional neural network to process the three-dimensional motion images from input to output, achieving fast positioning.

Fig. 6 Time-consuming comparison of three-dimensional physical education movement localization by different methods

4.3.3 Feature fusion accuracy analysis

After the above two indicators are evaluated, the experiment verifies the accuracy of the proposed method and the methods of literature [5] and literature [6] in fusing three-dimensional sports action features. Images with resolutions of 720 × 400 and 384 × 288 are used, and the results of 100 iterations of the three methods are shown in Fig. 7.

Fig. 7 Comparison of feature fusion accuracy of 3D sports action images

Figure 7a shows that, at an image resolution of 720 × 400, the fusion accuracy of the proposed method and the methods of reference [5] and reference [6] changes with the number of iterations. At 20 iterations, the fusion accuracy of the proposed method is 95%, that of reference [5] is 62%, and that of reference [6] is 64%; at 40 iterations, the values are 94%, 66% and 64% respectively; at 80 iterations, they are 97%, about 70% and about 75%. At a resolution of 384 × 288, the accuracy again varies with the number of iterations: at 20 iterations, the proposed method reaches 96%, reference [5] 47%, and reference [6] 59%; at 40 iterations, the values are 93%, 52% and 45%; at 80 iterations, they are 95%, about 62% and about 54%. Comparing the results for the two images, it is easy to see that the fusion accuracy of the proposed method is always higher than that of reference [5] and reference [6], which verifies its effectiveness.

5 Conclusion

Aiming at the low positioning accuracy and long positioning time of traditional 3D teaching of sports, a 3D teaching motion positioning method based on machine vision was proposed. The world coordinate system of the image is determined by machine vision. The MRF algorithm is used to extract the characteristics of the 3D teaching action images and determine the spatial correlation of the 3D teaching action data. By means of smoothing filter technology, the noise in the 3D teaching action image is effectively suppressed and eliminated, and on this basis the reconstruction of the image is completed. A convolutional neural network is used to determine the entropy value of the 3D teaching action image and realize the positioning of the 3D teaching action. Compared with traditional methods, the proposed method has the following advantages: (a) its highest accuracy is about 95%, so the positioning accuracy is high; (b) it takes about 1.9 s, which is relatively short. In the future, further research on PE 3D teaching is needed to improve the method further.