Elite male table tennis matches diagnosis using SHAP and a hybrid LSTM–BPNN algorithm

Song, Honglin; Li, Yutao; Zou, Xiaofeng; Hu, Ping; Liu, Tianbiao

doi:10.1038/s41598-023-37746-1

Article
Open access
Published: 17 July 2023

Volume 13, article number 11533, (2023)

Scientific Reports

Honglin Song¹,
Yutao Li¹,
Xiaofeng Zou²,
Ping Hu³ &
…
Tianbiao Liu¹

Abstract

This study adopts a new approach, SHapley Additive exPlanation (SHAP), to diagnose the table tennis matches based on a hybrid algorithm, namely Long Short-Term Memory–Back Propagation Neural Network (LSTM–BPNN). 100 male singles competitions (8535 rallies) from 2019 to 2022 are analyzed by a hybrid technical–tactical analysis theory, which hybridizes the double three-phase and four-phase evaluation theories. A k-means cluster analysis is conducted to classify 59 players’ winning rates into three levels (high, medium, and low). The results show that LSTM–BPNN has excellent performance (MSE = 0.000355, MAE = 0.014237, RMSE = 0.018853, and ${\mathrm{R}}^{2}$ = 0.988311) compared with six typical artificial intelligence algorithms. Using LSTM–BPNN to calculate the SHAP value of each feature, the global results find that the receive-attack and serve-attack phases of the ending match have essential impacts on the mutual winning probabilities. Finally, case applications show that the SHAP can directly obtain each feature importance on one or more matches, which is more objective and reliable than the traditional simulation method. This research explores an innovative way to understand and analyze matches, and these results have implications for the performance analysis of table tennis and related racket sports.

Introduction

In racket sports, players’ technical–tactical performance is a crucial factor in winning the game, and much research on singles matches of table tennis has focused on it for the past few decades^{1,2,3,4,5,6,7,8,9,10}. Techniques in table tennis include the non-constraining techniques of serving and the constraining techniques of attacking, controlling and defending, the most representative of which are fast attacking, pushing and chopping¹¹. Tactics combine different techniques that players use to gain superiority in matches, and technical–tactical performance in table tennis matches is complex. Researchers have categorized the strokes in each rally into phases and performed notational analysis on each phase^{3,4,5,12}. As a strong table tennis country, China has many excellent players, which mainly benefits from continuous analysis of relevant technical–tactical performances¹³. The most common method is the traditional three-phase evaluation theory¹. Phase-based theories have since been proposed to improve on it, such as the four-phase evaluation theory³ and the double three-phase evaluation theory⁷, both within the scope of descriptive analysis¹⁴.

With the rapid development of deep learning and machine learning, many models and algorithms have been applied to study table tennis matches^{15,16,17,18,19,20,21,22,23,24,25}. Researchers have built models and adjusted input values through simulations to achieve technical–tactical diagnosis^{15,16,18,19,21} based on sports training and competition methods or theories. Data mining helps coaches organize more purposeful training programs for athletes during preparation and provides targeted coaching during competitions. It also uncovers critical aspects that are easily overlooked in the competition.

However, previous studies on evaluating or diagnosing players’ technical–tactical performance overlook three problems. First, they fail to explain how the interaction between two players affects their performance in the match^{10,26,27,28}. Current diagnosis models account for only one side’s performance in table tennis matches and do not consider the mutual influence and interaction between the two sides. Meanwhile, players’ technical–tactical performance is correlated with their ability levels and technical styles. Therefore, diagnosing only one side’s technical–tactical performance, or ignoring both sides’ styles and abilities, may yield biased results.

The second problem is that previous studies have not considered the serial characteristics of the data when selecting algorithms. Two improved theories are based on the traditional three-phase evaluation theory: the four-phase evaluation theory³ and the double three-phase evaluation theory⁷. The former reclassifies the match into four phases (i.e. serve-attack phase, receive-attack phase, stalemate I phase, and stalemate II phase). The latter adds the concept of match progression to the evaluation (i.e. the beginning, middle, and end phases), which gives a serial character to the collected data and to each phase. Meanwhile, the recurrent neural network (RNN) was proposed to model contextual relations in serial data²⁹, but it suffers from gradient explosion and preserves information poorly over long periods. To solve this problem, a particular type of RNN, the Long Short-Term Memory (LSTM) neural network³⁰, was proposed. Previous studies have adopted LSTM to deal with series data, with excellent performance in finance, motion recognition, and wearables^{31,32,33}. Therefore, LSTM may be more appropriate for constructing a diagnosis model from technical–tactical analysis data with serial characteristics, and may lead to higher model accuracy.

In addition, previous studies have focused too much on the performance of the models and have neglected the importance of interpreting them³⁴. They have used simulation analysis to find the key influencing factors in matches based on artificial intelligence algorithms^{17,21}, which significantly reduces the reliability of the diagnostic results. Fortunately, many different approaches to increasing the interpretability of models have been developed. Among them, the SHapley Additive exPlanation (SHAP) is widely recognized; it uses local approximation and cooperative game theory to explore the interpretability of complex models³⁵. Explaining models through SHAP has been used in many industries with excellent results^{36,37,38,39}. However, few studies have used SHAP in sports performance analysis, where it has the potential to substantially improve the applicability of technical–tactical diagnosis in table tennis.

Inspired by previous studies, this study adopts a hybrid algorithm, namely LSTM–BPNN, to diagnose table tennis matches using the SHAP method. To further embody the interaction between two players, this study hybridizes the four-phase evaluation theory and the double three-phase evaluation theory to build an interactive technical–tactical model that includes the players on both sides. LSTM–BPNN is the most appropriate algorithm here because it shows excellent performance and robustness in prediction compared with six typical artificial intelligence algorithms. In terms of SHAP, this study visualizes all features to understand the technical–tactical model and uses local interpretation to diagnose and analyze table tennis matches. Finally, this study conducts a case application to compare the simulation and SHAP analyses.

Method

The study design is shown in Fig. 1.

Data resources

All videos are taken from publicly available platforms on the Internet. A total of 100 international men’s singles table tennis competitions from 2019 to 2022, comprising 8535 rallies and involving 59 elite male players, are analyzed. The dataset contains 200 samples (training data = 160, testing data = 40), each with 28 input values and 1 output value. Meanwhile, the technical styles (TS) are obtained from the International Table Tennis Federation website (https://www.ittf.com/), namely right-handed-penhold attacker (RPA), left-handed-penhold attacker (LPA), left-handed-shakehand attacker (LSA), right-handed-shakehand attacker (RSA), and right-handed-shakehand defender (RSD). The players’ singles winning rates (WR) from 2019 to 2022 are obtained from the same website. They are calculated by filtering all matches from the players’ careers, excluding doubles and team matches.

Data reliability

The intraclass correlation coefficient (ICC) is a statistical estimate that measures the extent of agreement between at least two quantitative measurements⁴⁰. In order to test the validity of data collection and match analysis, 10 matches are randomly selected for observation and reanalysis, and the ICC test is conducted using IBM SPSS Statistics 27. The model and type of the ICC test are a 2-way mixed-effects model and absolute agreement, where the results of the single rater are 0.944 and 0.995, indicating excellent agreement⁴¹.

Data clustering

K-means clustering is performed to classify the different levels of players’ singles WRs via Python 3.8 and the scikit-learn library. Before K-means clustering, this study adopted the Min–Max scaling for the players’ singles WRs. Comparing the performance of the binary, triple and four classifications, the best is the triple classification (high, medium, and low), and its silhouette coefficient is 0.604, as shown in Table 1.
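The clustering step described above can be sketched as follows. This is a minimal illustration that assumes scikit-learn's `MinMaxScaler`, `KMeans`, and `silhouette_score`; the winning rates are synthetic placeholders rather than the study's data, and the `random_state` and `n_init` settings are arbitrary choices.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import MinMaxScaler

# Synthetic winning rates for 59 hypothetical players (NOT the study's data).
rng = np.random.default_rng(0)
win_rates = np.concatenate([
    rng.normal(0.45, 0.03, 20),
    rng.normal(0.62, 0.03, 20),
    rng.normal(0.80, 0.03, 19),
]).reshape(-1, 1)

# Min-Max scaling to [0, 1] before clustering.
scaled = MinMaxScaler().fit_transform(win_rates)

# Compare two, three and four clusters by silhouette coefficient.
scores = {}
for k in (2, 3, 4):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(scaled)
    scores[k] = silhouette_score(scaled, labels)

best_k = max(scores, key=scores.get)
print(scores, best_k)
```

The number of clusters with the highest silhouette coefficient is kept, which mirrors the comparison of the binary, triple and four classifications.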

Table 1 The results of the K-means clustering.

Full size table

Data formatting and normalization

In feature coding, label encoding is used to encode the three levels of players’ singles WRs (high: 3; medium: 2; low: 1), and frequency encoding is used to encode the five technical styles (RPA: 11; RSA: 128; RSD: 10; LPA: 18; LSA: 33), which is consistent with the data encoding in previous studies^{42,43,44}. For data normalization, the study uses Min–Max scaling.
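A minimal sketch of the encoding and normalization steps, using only the Python standard library. The function names are hypothetical, and the list of styles is rebuilt from the frequencies quoted above purely for illustration.

```python
from collections import Counter

# Label encoding of the three winning-rate levels, as stated in the text.
LEVEL_CODES = {"high": 3, "medium": 2, "low": 1}

def frequency_encode(values):
    """Replace each category by the number of times it occurs."""
    counts = Counter(values)
    return [counts[v] for v in values]

def min_max_scale(values):
    """Rescale numbers linearly to [0, 1]; constant input maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Illustrative style list rebuilt from the reported frequencies.
styles = ["RPA"] * 11 + ["RSA"] * 128 + ["RSD"] * 10 + ["LPA"] * 18 + ["LSA"] * 33
encoded = frequency_encode(styles)
scaled = min_max_scale(encoded)
```

The five reported frequencies sum to 200, which matches the number of samples in the dataset.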

Traditional three-phase evaluation theory

The traditional three-phase evaluation theory is one of the most representative theoretical approaches in table tennis tactical diagnosis¹. It is based on the serial characteristics of the last stroke scored or lost in each rally and divides the match into three phases: the serve-attack phase, the receive-attack phase, and the stalemate phase. The criteria for each phase are as follows: (1) the serve-attack phase: the points won and lost by the player on the serve (first stroke) and the third stroke, and the points lost on the fifth stroke; (2) the receive-attack phase: the points won and lost by the player on the receive (second stroke) and the fourth stroke; (3) the stalemate phase: the points won by the serving player on the fifth stroke and the points won and lost after the fifth stroke.

Match progress phase in the double three-phase evaluation theory

As the rules of table tennis continue to evolve and the equipment continues to change, the original theory of technical–tactical analysis may need to develop accordingly. Dr Dandan Xiao proposed the double three-phase evaluation theory⁷, which adds the concept of match progression to the traditional three-phase evaluation theory (the serve-attack, receive-attack, and stalemate phases) and classifies the whole match into the beginning phase, the middle phase, and the end phase. It is a more comprehensive and concise statistical method for analyzing matches. The criteria for each phase are as follows: (1) the beginning phase: from the start of a game until one player scores four points; (2) the middle phase: from the time one player scores four points until one player scores eight points; (3) the end phase: from the time one player scores eight points until the end of the game.
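The progression criteria can be sketched as a small classifier. The function name is hypothetical, and the boundary handling (classifying a rally by the score before it is played, so that the rally producing a player's fourth point still counts as the beginning phase) is one reading of the criteria rather than a rule stated in the source.

```python
def progress_phase(score_a, score_b):
    """Return the match progression phase of a rally from the game score
    before the rally is played (assumed boundary convention)."""
    leading = max(score_a, score_b)
    if leading < 4:
        return "beginning"
    if leading < 8:
        return "middle"
    return "end"

print(progress_phase(3, 2))   # beginning
print(progress_phase(4, 2))   # middle
print(progress_phase(9, 10))  # end
```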

Stroke classification in the four-phase evaluation theory

Many studies use the traditional three-phase evaluation theory with great success. However, one problem has arisen in its long-term application: the technical–tactical data of the players on the two sides of the match are unequal. To solve this problem and further optimize the theory, the four-phase evaluation theory was proposed to provide a complementary and symmetrical relationship between the two sides in each technical–tactical phase³. It produces matching data between the two players and more detailed information on the stalemate phase. The four-phase evaluation theory classifies the whole match into four phases according to the following criteria: (1) the serve-attack phase: the points won and lost at the first and third strokes, as well as the points lost at the fifth stroke; (2) the receive-attack phase: the points won and lost at the second and fourth strokes; (3) the stalemate I phase: the points won at the fifth stroke, as well as the points won and lost after the fifth stroke; (4) the stalemate II phase: the points won and lost on the sixth stroke and beyond.

Observation points

The observation points are based on the number of strokes and the last stroke scored to describe the profile of the table tennis match. It allows data to be collected quickly and more easily understood. The traditional three-phase evaluation theory and its various derivatives (the four-phase evaluation theory and the double three-phase evaluation theory) are proposed based on such an observation criterion for data collection. The observation points are as follows: (1) the odd number of strokes for the serving player (first, third, fifth, and beyond); (2) the even number of strokes for the receiving player (second, fourth, sixth, and beyond); (3) the last stroke won or lost by both players in each rally.

Hybrid technical–tactical analysis theory

The double three-phase evaluation theory adds the concept of match progression to the traditional three-phase evaluation theory to analyze the match of table tennis in a multi-dimensional way⁷. However, it still follows the principles of the traditional three-phase evaluation theory for stroke classification in each phase, which indirectly retains the disadvantage that causes the mismatch of data between two players. Thus, it reflects that the usage rates (UR) in the corresponding phase of both players are not equal, and the sum of two scoring rates (SR) values is not 1, which is disadvantageous to the interactive analysis between both sides in the match. Nevertheless, the four-phase evaluation theory can better address the above issue³. However, the four-phase evaluation theory does not have the advantage that reflects the match progression proposed in the double three-phase evaluation theory. In order to provide an interactive and multi-perspective analysis of the match between both sides, this study hybridizes the advantages of two analysis theories into one hybrid technical–tactical analysis theory.

This study proposes a hybrid analysis theory by combining the match progression phase in the double three-phase evaluation theory and the four-phase evaluation theory for stroke classification for a total of twelve phases: the beginning and serve-attack phase (BSAP), the middle and serve-attack phase (MSAP), the end and serve-attack phase (ESAP), the beginning and receive-attack phase (BRAP), the middle and receive-attack phase (MRAP), the end and receive-attack phase (ERAP), the beginning and stalemate I phase (BSI), the middle and stalemate I phase (MSI), the end and stalemate I phase (ESI), the beginning and stalemate II phase (BSII), the middle and stalemate II phase (MSII), and the end and stalemate II phase (ESII). The framework of the hybrid technical–tactical analysis theory is shown in Table 2 and Fig. 2.

Table 2 The framework of the hybrid technical–tactical analysis theory.

Full size table

Evaluation indicators

UR and SR are important indicators for evaluating a table tennis player’s performance. Whether it is the traditional three-phase evaluation theory, the double three-phase evaluation theory, or the four-phase evaluation theory, each of these phases’ indicators can be calculated by SR and UR, which are calculated as follows:

$${\text{SR}} = {\frac{{\text{phase }}\;{\text{score}}}{{\text{phase }}\;{\text{score }} + {\text{ phase}}\;{\text{points}}\;{\text{lost}}}\times 100\%}$$

(1)

$$\text{UR} = {\frac{\text{phase } \; \text{score } + \text{ phase } \; \text{points } \; \text{lost}} {\text{points} \; \text{scored} \; \text{and} \; \text{lost} \; \text{in} \; \text{the} \; \text{match} \; \text{progression}} }\times 100\%$$

(2)

The SR is expressed as phase score/(phase score + phase points lost) * 100%, and the UR is calculated as (phase score + phase points lost)/points scored and lost in the match progression (i.e., beginning, middle, or end).

As shown in Eq. (3), to further construct the link between SR and UR, the technique effectiveness (TE) is proposed to evaluate their contribution to the match results, which can help us better understand the effectiveness of players’ performances in each match phase⁴⁵. Therefore, this study constructs a linkage between SR and UR through TE. Subsequently, the TEs of each phase are used as feature input to the model. TE ensures the interrelationship between SR and UR within each phase and considers the non-linear relationship between each phase.

$${\text{TE}} = - \left( {1 + \frac{{\sqrt 2 }}{2}} \right) + \left( {1.5 + \sqrt 2 } \right)\left[ {\left( {1 + {\text{UR}}} \right)^{{{\text{SR}} - 0.5}} } \right] - \frac{{\sqrt 2 }}{2}\left[ {\left( {1 + {\text{UR}}} \right)^{{2\left( {{\text{SR}} - 0.5} \right)}} } \right]$$

(3)
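Equations (1) to (3) can be sketched in a few lines. The function names are hypothetical, and SR and UR are assumed to be passed to Eq. (3) as fractions in [0, 1] rather than percentages, which is the reading under which the exponent SR − 0.5 is centred on an even scoring rate.

```python
import math

def scoring_rate(won, lost):
    """Eq. (1) as a fraction: phase score / (phase score + phase points lost)."""
    total = won + lost
    # An unused phase has no defined SR; 0.0 is a placeholder (assumption).
    # It does not affect TE, because UR = 0 makes the base (1 + UR) equal to 1.
    return won / total if total else 0.0

def usage_rate(won, lost, progression_points):
    """Eq. (2) as a fraction of all points played in the progression phase."""
    return (won + lost) / progression_points if progression_points else 0.0

def technique_effectiveness(sr, ur):
    """Eq. (3): technique effectiveness from SR and UR (both fractions)."""
    half_root2 = math.sqrt(2) / 2
    base = 1.0 + ur
    return (-(1.0 + half_root2)
            + (1.5 + math.sqrt(2)) * base ** (sr - 0.5)
            - half_root2 * base ** (2 * (sr - 0.5)))

sr = scoring_rate(won=6, lost=4)
ur = usage_rate(won=6, lost=4, progression_points=40)
print(sr, ur, technique_effectiveness(sr, ur))
```

Under this reading, TE equals 0.5 whenever SR = 0.5 regardless of UR, and with UR = 1 it ranges from 0 (SR = 0) to 1 (SR = 1).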

The overall winning probability (WP) indicates one player’s scoring ability. The mutual winning probability indicates the athlete’s scoring ability in a confrontational situation. One player’s overall winning probability (WP) is the total points scored/(total points scored + total points lost), as shown in Eq. (4). Moreover, the mutual winning probability (MWP) is ${\mathrm{WP}}_{\mathrm{player A}}$ − ${\mathrm{WP}}_{\mathrm{player B}}$, referring to the winning probability of player A over that of player B, as shown in Eq. (5).

$${\text{WP}} = \frac{{{\text{player}}\;{\text{A}}\;{\text{total}}\;{\text{points}}\;{\text{scored }}}}{{{\text{total}}\;{\text{points}}\;{\text{scored }} + {\text{ total}}\;{\text{points}}\;{\text{lost}}}}$$

(4)

$${\text{MWP}} = {\text{WP}}_{{{\text{player}}\;{\text{A}}}} - {\text{WP}}_{{{\text{player}}\;{\text{B}}}}$$

(5)
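As a small worked example of Eqs. (4) and (5), the sketch below computes both quantities from the total points of the two players. The function names are hypothetical; it relies on the fact that in a singles match the points lost by one player are the points scored by the other.

```python
def winning_probability(points_scored, points_lost):
    """Eq. (4): share of all points in the match won by one player."""
    return points_scored / (points_scored + points_lost)

def mutual_winning_probability(points_a, points_b):
    """Eq. (5): WP of player A minus WP of player B."""
    wp_a = winning_probability(points_a, points_b)
    wp_b = winning_probability(points_b, points_a)
    return wp_a - wp_b

# Hypothetical match in which player A wins 55 of 100 points.
print(mutual_winning_probability(55, 45))
```

Because the two WPs sum to 1, MWP is antisymmetric: swapping the players flips its sign.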

KNN

The k-nearest neighbour (KNN) algorithm is a supervised learning algorithm for classification and regression problems. It is easy to implement and does not rely on any specific model, making it simple and popular. To classify a new sample, the algorithm finds the k training samples nearest to it and assigns the new sample to the class to which most of these k neighbours belong⁴⁶; for regression, the target values of the k neighbours are averaged. Distance is usually computed with the Euclidean formula given in Eq. (6).

$${\text{d}} = \sqrt {\mathop \sum \limits_{{{\text{i}} = 1}}^{{\text{n}}} \left( {{\text{x}}_{{\text{i}}} - {\text{y}}_{{\text{i}}} } \right)^{2} }$$

(6)
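A minimal NumPy sketch of KNN regression with the Euclidean distance of Eq. (6). The function name and the toy data are illustrative only; the study itself uses the scikit-learn implementation.

```python
import numpy as np

def knn_predict(x_train, y_train, x_new, k=3):
    """Average the targets of the k training samples closest to x_new."""
    distances = np.sqrt(((x_train - x_new) ** 2).sum(axis=1))  # Eq. (6)
    nearest = np.argsort(distances)[:k]
    return float(y_train[nearest].mean())

# Toy data: three points near the origin and one far away.
x_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
y_train = np.array([0.0, 1.0, 1.0, 10.0])
prediction = knn_predict(x_train, y_train, np.array([0.1, 0.1]), k=3)
```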

DT

The Decision Tree (DT) is a non-parametric supervised learning algorithm. The Classification and Regression Trees (CART) type of DT is used in this study for the regression task. It is a hierarchical tree structure consisting of a root node, branches, internal nodes and leaf nodes.

RF

The Random Forest (RF) is a fast and robust ensemble learning method first described by Breiman⁴⁷ that can be used for classification and regression. RF regression builds multiple decision trees from bootstrap samples of the training dataset and averages the outputs of the C trees to produce the final prediction⁴⁸, as shown in Eq. (7).

$$\hat{f}_{RF}^{C} \left( x \right) = \frac{1}{C}\mathop \sum \limits_{i = 1}^{C} T_{i} \left( x \right)$$

(7)

where x is the vector input variable, C is the number of trees, and ${T}_{i}(x)$ is a single regression tree constructed based on a subset of the input variables and the bootstrapped samples.
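Eq. (7) can be checked against scikit-learn's `RandomForestRegressor`, whose prediction is the mean of its individual trees. The data below are random placeholders with 28 features, chosen only to match the input width mentioned in the text; the number of trees is arbitrary.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Random placeholder data (NOT the study's dataset).
rng = np.random.default_rng(0)
X = rng.random((200, 28))
y = X[:, 0] - X[:, 1] + 0.05 * rng.normal(size=200)

forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Eq. (7): average the C single-tree predictions T_i(x).
per_tree = np.stack([tree.predict(X[:5]) for tree in forest.estimators_])
manual_average = per_tree.mean(axis=0)
forest_prediction = forest.predict(X[:5])
```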

ET

The extra trees (ET) algorithm is a relatively new machine learning technique developed as an extension of the Random Forest algorithm. It follows the same principles as Random Forest, training each base estimator on a random subset of features, but additionally randomizes the choice of features and cut-point values used to split the nodes.

XGB

The eXtreme Gradient Boosting (XGB) algorithm, a scalable tree boosting system proposed by Dr Chen, is now widely used by data scientists and provides state-of-the-art results on many problems⁴⁹. Given the training data ${\left\{\left({x}_{i},{y}_{i}\right)\right\}}_{i=1}^{N}$, the objective function of XGBoost consists of two parts:

$${\text{Obj}} = \mathop \sum \limits_{i = 1}^{N} {\text{l}}\left( {{\text{y}}_{{\text{i}}} ,{\hat{\text{y}}}_{{\text{i}}} } \right) + \mathop \sum \limits_{k = 1}^{K} {\Omega }\left( {{\text{f}}_{{\text{k}}} } \right)$$

(8)

where Obj is the objective to be minimized, $\mathrm{l}\left({\mathrm{y}}_{\mathrm{i}},{\widehat{\mathrm{y}}}_{\mathrm{i}}\right)$ is the loss function for each sample, which measures the error between the actual value and the predicted value, K is the number of trees, and $\Omega \left({\mathrm{f}}_{\mathrm{k}}\right)$ is the regularization term used to prevent overfitting.

Another significant enhancement of the XGB algorithm is that it optimizes the objective function using both the first-order and the second-order derivatives of the loss function. The gain of a candidate split is:

$${\text{Gain}} = \frac{1}{2}\left[ {\frac{{\left( {\mathop \sum \nolimits_{{{\text{i}} \in {\text{I}}_{{\text{L}}} }} {\text{g}}_{{\text{i}}} } \right)^{2} }}{{\mathop \sum \nolimits_{{{\text{i}} \in {\text{I}}_{{\text{L}}} }} {\text{h}}_{{\text{i}}} + {\uplambda }}} + \frac{{\left( {\mathop \sum \nolimits_{{{\text{i}} \in {\text{I}}_{{\text{R}}} }} {\text{g}}_{{\text{i}}} } \right)^{2} }}{{\mathop \sum \nolimits_{{{\text{i}} \in {\text{I}}_{{\text{R}}} }} {\text{h}}_{{\text{i}}} + {\uplambda }}} - \frac{{\left( {\mathop \sum \nolimits_{{{\text{i}} \in {\text{I}}}} {\text{g}}_{{\text{i}}} } \right)^{2} }}{{\mathop \sum \nolimits_{{{\text{i}} \in {\text{I}}}} {\text{h}}_{{\text{i}}} + {\uplambda }}}} \right]$$

(9)

where ${g}_{i}$ and ${h}_{i}$ denote the first- and second-order partial derivatives of the loss function evaluated at the prediction of the previous tree model, ${\mathrm{I}}_{\mathrm{L}}$ and ${\mathrm{I}}_{\mathrm{R}}$ denote the sets of samples in the left and right nodes, respectively, $\mathrm{I}={\mathrm{I}}_{\mathrm{L}}\cup {\mathrm{I}}_{\mathrm{R}}$, and $\uplambda$ is the regularization coefficient.
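The split gain of Eq. (9) can be evaluated directly from the per-sample derivatives. The sketch below follows the equation as written (without any additional complexity penalty); the function name and the example values are illustrative, with g = prediction − target and h = 1 corresponding to a squared-error loss.

```python
def split_gain(g_left, h_left, g_right, h_right, lam=1.0):
    """Eq. (9): gain of splitting a node into left and right children."""
    def score(g, h):
        return sum(g) ** 2 / (sum(h) + lam)

    parent = score(g_left + g_right, h_left + h_right)
    return 0.5 * (score(g_left, h_left) + score(g_right, h_right) - parent)

# Two samples under-predicted (g = -1) go left, two over-predicted go right.
gain = split_gain([-1.0, -1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], lam=1.0)
print(gain)
```

Here the gradients cancel in the parent node, so separating the two groups yields a positive gain of 0.5 × (4/3 + 4/3 − 0) = 4/3.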

BPNN

Back Propagation Neural Network (BPNN) is also a prevalent neural network technique for engineering applications, which is a multi-layer artificial neural network including an input layer, hidden layer and output layer⁵⁰. The BPNN algorithm minimizes the mean square error between the predicted and desired outputs. For a set of input data x(t) and output data y(t), the mapping relationship between them can be obtained by using BPNN models as follows:

$$y_{k}^{t + T} = \mathop \sum \limits_{j = 1}^{M} f_{2} \left[ {w_{jk} f_{1} \left( {\mathop \sum \limits_{i = 1}^{N} w_{ij} x_{i}^{t} - \theta_{j} } \right) - \theta_{k} } \right]$$

(10)

where ${x}_{i}^{t}$ is the input value of ith node at the time of t, ${w}_{ij}$ is the weight value between the ith node of input layer and the jth node of hidden layer, ${\theta }_{j}$ is threshold value at the jth node of hidden layer, ${w}_{jk}$ is the weight value between the jth node of hidden layer and the kth node of output layer, ${\theta }_{k}$ is threshold value at the kth node of output layer, ${y}_{k}^{t+T}$ is the output value at the kth node at the time of t + T, and T is the forecast period. ${f}_{1}$(·) and ${f}_{2}$(·) are activation functions.

LSTM and LSTM–BPNN

The long short-term memory neural network (LSTM) was proposed by Hochreiter et al. to eliminate the vanishing gradient problem³⁰. Unlike the general RNN, which consists of three types of layers (input, hidden, and output), the LSTM consists of three gates (forget gate, input gate, and output gate) and an additional memory cell, as shown in Eqs. (11–18) and Fig. 3. The forget gate determines what information from the previous cell state is discarded, the input gate determines what new information is stored in the cell state, and the output gate determines the final output. The equations of the LSTM at time step t are summarized as follows, t = 0, …, τ.

$${\text{Forget gate}} {:}\;{\text{f}}_{{\text{t}}} =\upsigma \left( {{\text{x}}_{{\text{t}}} {\text{W}}_{{\text{f}}} + {\text{h}}_{{{\text{t}} - 1}} {\text{W}}_{{\text{f}}} + {\text{b}}_{{\text{f}}} } \right)$$

(11)

$${\text{Input gate }}{:}\;{\text{ i}}_{{\text{t}}} = {\upsigma }\left( {{\text{x}}_{{\text{t}}} {\text{W}}_{{\text{i}}} + {\text{h}}_{{{\text{t}} - 1}} {\text{W}}_{{\text{i}}} + {\text{b}}_{{\text{i}}} } \right)$$

(12)

$${\text{Candidate cell state }}{:}\;{\tilde{\text{C}}}_{{\text{t}}} = \tanh \left( {{\text{x}}_{{\text{t}}} {\text{W}}_{{\text{c}}} + {\text{h}}_{{{\text{t}} - 1}} {\text{W}}_{{\text{c}}} + {\text{b}}_{{\text{c}}} } \right)$$

(13)

$${\text{New cell state }}{:}\;{\text{ C}}_{{\text{t}}} = {\text{f}}_{{\text{t}}} \odot {\text{C}}_{{{\text{t}} - 1}} + {\text{i}}_{{\text{t}}} \odot {\tilde{\text{C}}}_{{\text{t}}}$$

(14)

$${\text{Output gate }}{:}\;{\text{ o}}_{{\text{t}}} = {\upsigma }\left( {{\text{x}}_{{\text{t}}} {\text{W}}_{{\text{o}}} + {\text{h}}_{{{\text{t}} - 1}} {\text{W}}_{{\text{o}}} + {\text{b}}_{{\text{o}}} } \right)$$

(15)

$${\text{Hidden state }}{:}\;{\text{ h}}_{{\text{t}}} = {\text{o}}_{{\text{t}}} \odot \tanh \left( {{\text{C}}_{{\text{t}}} } \right)$$

(16)

$${\text{S}} {-} {\text{type function}}{:}\;\upsigma \left( {\text{z}} \right) = \frac{1}{{1 + {\text{e}}^{{ - {\text{z}}}} }}$$

(17)

$${\text{Tanh}} {-} {\text{type function}}{:}\; {\text{tanh}}\left( z \right) = \frac{{e^{z} - e^{ - z} }}{{e^{z} + e^{ - z} }}$$

(18)

where W denotes the weight matrices, b denotes the bias vectors, t is the time step, and $\odot$ denotes the Hadamard product.
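One LSTM step following Eqs. (11–18) can be sketched in NumPy. Two points are assumptions rather than details from the source: the input and the previous hidden state are given separate weight matrices (named `W` and `U` here, the common convention, since the two generally differ in dimension), and the three-step sequence with four features per step is a purely hypothetical input layout.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))  # Eq. (17)

def init_params(n_in, n_hidden, seed=0):
    """Small random weights for the forget, input, candidate and output parts."""
    rng = np.random.default_rng(seed)
    gates = "fico"
    return {
        "W": {g: rng.normal(0, 0.1, (n_in, n_hidden)) for g in gates},
        "U": {g: rng.normal(0, 0.1, (n_hidden, n_hidden)) for g in gates},
        "b": {g: np.zeros(n_hidden) for g in gates},
    }

def lstm_step(x_t, h_prev, c_prev, params):
    """Advance the hidden and cell state by one time step."""
    W, U, b = params["W"], params["U"], params["b"]
    f_t = sigmoid(x_t @ W["f"] + h_prev @ U["f"] + b["f"])    # forget gate
    i_t = sigmoid(x_t @ W["i"] + h_prev @ U["i"] + b["i"])    # input gate
    c_hat = np.tanh(x_t @ W["c"] + h_prev @ U["c"] + b["c"])  # candidate state
    c_t = f_t * c_prev + i_t * c_hat                          # new cell state
    o_t = sigmoid(x_t @ W["o"] + h_prev @ U["o"] + b["o"])    # output gate
    h_t = o_t * np.tanh(c_t)                                  # hidden state
    return h_t, c_t

# Hypothetical sequence: 3 time steps with 4 features each.
params = init_params(n_in=4, n_hidden=8)
h, c = np.zeros(8), np.zeros(8)
for x_t in np.full((3, 4), 0.5):
    h, c = lstm_step(x_t, h, c, params)
```

The final hidden state `h` is the kind of sequence summary that a downstream regression network could take as input.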

This study proposes a hybrid algorithm, namely LSTM–BPNN, which uses LSTM to extract features of the serial relation of each technical–tactical phase in the match. After feature extraction for both sides, the BPNN is used for non-linear regression. The specific framework of the model is shown in the Supplementary file and Fig. 4. KNN, DT, RF, ET, and XGB modelling is done via the scikit-learn library in Python 3.8. BPNN and LSTM–BPNN modelling is done via the PyTorch library in Python 3.8.

SHAP

SHAP is an additive explanatory model based on cooperative game theory⁵², which expresses the model output as a linear sum of the contributions of the input features. In SHAP, all feature values in a sample are considered “contributors” to the prediction of the machine learning model, and for each predicted sample, SHAP outputs a unique solution as the evaluation value of the sample features, i.e., the SHAP value³⁵. Based on the additive feature attribution methods, SHAP constructs a linear function g of binary variables to approximate the target function f. The explanation model is as follows:

$${\text{f}}\left( {\text{x}} \right) \approx {\text{g}}\left( {{\text{z}}^{\prime } } \right) = \phi_{0} + \mathop \sum \limits_{{{\text{i}} = 1}}^{{\text{M}}} \phi_{{\text{i}}} {\text{z}}_{{\text{i}}}^{\prime }$$

(19)

where $z' \approx x'$, and $x'$ is the simplified input that is mapped to the original input through $x = h_x(x')$, which ensures that $g(z') \approx f(h_x(z'))$; $\phi_0$ is the contribution with zero inputs, and M is the number of simplified input features. Equation (19) is illustrated in Fig. 5.

As noted by Lundberg and Lee³⁵, a single solution exists for Eq. (19) that has three desirable properties: local accuracy, missingness, and consistency. Local accuracy requires that, for any specific input x, the output of the explanation model g matches the output of the original model f when $x = h_x(x')$, i.e. $f(x) = g(x')$. Missingness requires that features missing from the simplified input have no attributed impact, i.e. $x'_i = 0$ implies $\phi_i = 0$. Consistency requires that if a model changes so that the contribution of a simplified input increases or stays the same regardless of the other inputs, the attribution of that input should not decrease⁵³. Specifically, for any two models f and $f'$, if $f'_x(z') - f'_x(z' \backslash i) \ge f_x(z') - f_x(z' \backslash i)$ for all inputs $z'$, then $\phi_i(f', x) \ge \phi_i(f, x)$, where $z' \backslash i$ denotes the input without $z'_i$. Consequently, the only possible model that satisfies these properties is Eq. (20).

$$\phi_{i} \left( {{\text{f}},{\text{x}}} \right) = \mathop \sum \limits_{{{\text{z}}^{\prime } \subseteq x^{\prime}}} \frac{{\left| {z^{\prime}} \right|!\left( {{\text{M}} - \left| {z^{\prime}} \right| - 1} \right)!}}{{{\text{M}}!}}\left[ {{\text{f}}_{{\text{x}}} \left( {z^{\prime}} \right) - {\text{f}}_{{\text{x}}} \left( {z^{\prime}\backslash i} \right)} \right]$$

(20)

where $\left|{z}^{^{\prime}}\right|$ is the number of non-zero entries in ${z}^{^{\prime}}$, and ${\phi }_{i}$ is the Shapley value for feature ${x}_{i}$.
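For a handful of features, Shapley values can be computed exactly by enumerating coalitions. The sketch below uses the classic form of the formula, in which the sum runs over coalitions S that exclude feature i and the weight is |S|!(M − |S| − 1)!/M!. Replacing "missing" features by fixed baseline values is an illustrative simplification, not the estimator used by the SHAP library or by the study, and all names are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x, with absent features set to baseline."""
    m = len(x)

    def value(subset):
        z = [x[j] if j in subset else baseline[j] for j in range(m)]
        return f(z)

    phi = []
    for i in range(m):
        others = [j for j in range(m) if j != i]
        total = 0.0
        for size in range(m):
            weight = factorial(size) * factorial(m - size - 1) / factorial(m)
            for s in combinations(others, size):
                # Marginal contribution of feature i to coalition s.
                total += weight * (value(set(s) | {i}) - value(set(s)))
        phi.append(total)
    return phi

def linear_model(z):
    return 2.0 * z[0] - 3.0 * z[1] + 0.5 * z[2] + 1.0

x, baseline = [1.0, 2.0, 3.0], [0.0, 0.0, 0.0]
phi = shapley_values(linear_model, x, baseline)
```

For a linear model each value reduces to the coefficient times the feature's deviation from the baseline, and the values sum to f(x) − f(baseline), which illustrates the local accuracy property.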

Traditional technical–tactical diagnosis based on simulation method

Two players’ TEs in each phase and their mutual correlations are shown in Table 3. For both players’ technical–tactical diagnosis in matches, Eq. (21) is used to calculate the magnitude Y by which one player’s SR in one phase is increased or decreased while the other indicator values remain unchanged^{17,21}. If the SR is less than or equal to 50%, the adjusted value is Z = SR + Y; otherwise, the adjusted value is Z = SR − Y¹⁷. The player’s recalculated SR in that phase is then used to obtain the new SR of the opposing player in the corresponding phase as one minus the recalculated SR, with the indicator values of all other phases kept unchanged. The recalculated SRs, together with the original URs of both sides, are then applied through Eq. (3) to calculate the adjusted TEs. These two adjusted TEs and the other unchanged TEs are combined and re-input into the model to output the adjusted mutual winning probability (AMWP). The weight value is calculated via Eq. (22). The greater the absolute value of the weight, the greater the influence of the adjusted phase on the MWP of both sides and the more significant its impact on whether the player can win the match.

$${{\text{Y}} = 0.238 }{ \times } \cos \left( {{ - 1.32}{ \times }{\text{SR}} + 0.66} \right) - 0.178$$

(21)

$${\text{Weight}} = \frac{{\left| {{\text{MWP}} - {\text{AMWP}}} \right|}}{{{\text{MWP}}}}{ \times }100\%$$

(22)
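The adjustment and weight calculations of Eqs. (21) and (22) can be sketched as follows. The function names are hypothetical, SR is assumed to be a fraction in [0, 1], and the AMWP passed to the second function would in practice come from re-running the trained model, which is outside this sketch.

```python
import math

def adjusted_scoring_rate(sr):
    """Eq. (21) plus the adjustment rule: raise SR if it is at most 0.5,
    lower it otherwise."""
    y = 0.238 * math.cos(-1.32 * sr + 0.66) - 0.178
    return sr + y if sr <= 0.5 else sr - y

def phase_weight(mwp, amwp):
    """Eq. (22): relative change of the mutual winning probability, in %.
    Undefined when MWP is zero."""
    return abs(mwp - amwp) / mwp * 100.0

z = adjusted_scoring_rate(0.5)      # player A's adjusted SR in one phase
opponent_sr = 1.0 - z               # corresponding SR of player B
print(z, opponent_sr, phase_weight(0.10, 0.12))
```

With SR as a fraction, the step Y is largest (0.06) at SR = 0.5 and shrinks towards roughly 0.01 at the extremes.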

Table 3 Two players’ technique effectiveness indicators in each phase and their mutual correlations.

Full size table

Technical–tactical diagnosis based on SHAP method

Unlike traditional table tennis diagnostic methods, which analyze players’ technical–tactical performance indirectly through simulation, the current study constructs an artificial intelligence model and interprets it with the state-of-the-art SHAP method. SHAP assigns Shapley values to all table tennis technical–tactical indicators and supports a global analysis of how each indicator affects the game. Unlike the feature importance reported by tree-based machine learning models, which shows only the magnitude of each feature’s importance, the SHAP value also reflects whether a feature pushes the model output in a positive or negative direction. In addition, SHAP can calculate feature importance from a single observation or a local group of observations⁵³, allowing the analysis to focus on a specific player and a specific game and bringing table tennis technical–tactical diagnosis to a more precise level. Finally, because the influencing factors and indicators in table tennis competitions are non-linear, traditional research methods provide only a rough interpretation of the game profile through feature weight values, whereas the SHAP method allows us to visualize the relationship between the independent and dependent variables and to discuss the non-linear relationships between the influencing factors and indicators. Thus, the SHAP method can address the shortcomings of traditional methods and enhance the benefits of technical–tactical diagnosis.

Results and discussion

Performance comparison

In order to find the best algorithm for prediction accuracy, this study compares the performance of LSTM–BPNN with six classical artificial intelligence algorithms, which are BPNN, XGB, RF, ET, DT and KNN, in terms of the mean squared error (MSE), mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (${\mathrm{R}}^{2}$), as shown in Eqs. (23–26).

$$\text{R}^{2} = 1 - \frac{\sum_{\text{i}=1}^{\text{n}} \left(\text{y}_{\text{i}} - \hat{\text{y}}_{\text{i}}\right)^{2}}{\sum_{\text{i}=1}^{\text{n}} \left(\text{y}_{\text{i}} - \overline{\text{y}}\right)^{2}}$$

(23)

$$\text{MSE} = \frac{1}{\text{n}} \sum_{\text{i}=1}^{\text{n}} \left(\hat{\text{y}}_{\text{i}} - \text{y}_{\text{i}}\right)^{2}$$

(24)

$$\text{RMSE} = \sqrt{\frac{1}{\text{n}} \sum_{\text{i}=1}^{\text{n}} \left(\text{y}_{\text{i}} - \hat{\text{y}}_{\text{i}}\right)^{2}}$$

(25)

$$\text{MAE} = \frac{1}{\text{n}} \sum_{\text{i}=1}^{\text{n}} \left|\hat{\text{y}}_{\text{i}} - \text{y}_{\text{i}}\right|$$

(26)

${\mathrm{y}}_{\mathrm{i}}$: the original value; ${\widehat{\mathrm{y}}}_{\mathrm{i}}$: the simulated value; ${\overline{\mathrm{y}} }$: the sample mean; n: the number of samples.
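The four metrics in Eqs. (23–26) can be computed directly from paired observed and predicted values. The following is a minimal sketch using only the Python standard library; the function name and the three-point example are hypothetical and are not taken from the study's data.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return R^2, MSE, RMSE and MAE as defined in Eqs. (23)-(26)."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    sq_err = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    abs_err = sum(abs(p - y) for y, p in zip(y_true, y_pred))
    total_var = sum((y - mean_y) ** 2 for y in y_true)
    mse = sq_err / n
    return {
        "R2": 1.0 - sq_err / total_var,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": abs_err / n,
    }

# Hypothetical observed and predicted values.
print(regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]))
```

Because RMSE is the square root of MSE, the two always rank models identically; MAE is less sensitive to a few large errors, which is one reason to report both.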

This study uses tenfold cross-validation to determine the hyper-parameters of XGB, RF, ET, DT, KNN, LSTM–BPNN, and BPNN and thereby improve model performance. The dataset is shuffled and split into a training set and a testing set (training:testing, 8:2). The training set is then shuffled and split into 10 consecutive folds. Each fold in turn is treated as the validation set, with the remaining 9 folds used as the training set. The best hyper-parameters of each model are selected according to the mean performance over these ten validations.
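The splitting scheme described above can be sketched as follows. This is an illustrative standard-library version, not the authors' code: the function names, the random seed, and the sample count of 100 are assumptions, and `score_fn` is a hypothetical callback that would train a model on the training indices and return its validation score.

```python
import random

def train_test_split_indices(n_samples, test_ratio=0.2, seed=0):
    """Shuffle sample indices and split them 8:2 into training and testing sets."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_test = round(n_samples * test_ratio)
    return idx[n_test:], idx[:n_test]

def k_fold(indices, k=10, seed=0):
    """Shuffle, cut into k consecutive folds, and yield (train, validation) pairs."""
    idx = list(indices)
    random.Random(seed).shuffle(idx)
    base, extra = divmod(len(idx), k)
    start = 0
    for i in range(k):
        stop = start + base + (1 if i < extra else 0)
        yield idx[:start] + idx[stop:], idx[start:stop]
        start = stop

def mean_cv_score(indices, score_fn, k=10):
    """Average a validation score over the k folds for one hyper-parameter setting."""
    scores = [score_fn(train, val) for train, val in k_fold(indices, k)]
    return sum(scores) / len(scores)

train_idx, test_idx = train_test_split_indices(100)
print(len(train_idx), len(test_idx))
```

Hyper-parameter selection would call `mean_cv_score` once per candidate setting and keep the best one; the held-out testing set is touched only for the final comparison.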

The final hyper-parameters of LSTM–BPNN and BPNN are as follows: (1) LSTM–BPNN (1 LSTM layer with 50 hidden layer neurons, 40 BPNN hidden layer neurons, 0.01 learning rate and 200 epochs), with Adam as the optimizer and MSELoss as the criterion; (2) BPNN (31 hidden neurons, 0.01 learning rate and 200 epochs), with Adam as the optimizer and MSELoss as the criterion. The hyper-parameters of DT, ET, XGB, RF and KNN are as follows: (1) DT (criterion = ‘squared_error’, max_depth = 2, min_samples_split = 2); (2) ET (n_estimators = 2500, max_depth = 15, min_samples_split = 2, min_samples_leaf = 1); (3) XGB (learning_rate = 1, max_depth = 1, min_child_weight = 4, n_estimators = 2500); (4) RF (n_estimators = 1000, min_samples_split = 2, max_depth = 15); (5) KNN (n_neighbors = 9).

Table 4 shows the testing results of LSTM–BPNN, BPNN, DT, XGB, RF, KNN and ET. The results show that LSTM–BPNN outperforms the others on the testing data, with MSE = 0.000355, MAE = 0.014237, RMSE = 0.018853, and ${\mathrm{R}}^{2}$ = 0.988311. Therefore, LSTM–BPNN can provide more reliable feedback. Using LSTM–BPNN as the predicting model to calculate SHAP values for further interpretation and analysis is applicable and consistent with comparative studies⁵⁴.

Table 4 Comparison of 7 artificial intelligence models.

Global model interpretation with SHAP

Previous studies have shown that multiple factors influence the results of ball games and that these factors are non-linearly related to each other²⁶,⁵⁵⁻⁵⁹. This view is supported by related studies¹⁷,²¹, which all used artificial intelligence algorithms capable of handling non-linearities. The results of this study are consistent with the design and viewpoints of those studies. Although artificial intelligence algorithms achieve considerable accuracy, their lack of interpretability makes it difficult to understand the internal mechanism of the model and the relationships between individual features⁶⁰. Therefore, SHAP is adopted in this study to explain the technical–tactical diagnosis model based on LSTM–BPNN, which allows us to understand which features or phases are essential in current table tennis matches.

Figure 6 shows the relative importance of each feature on the training dataset, measured as the mean magnitude of its SHAP value in the LSTM–BPNN model. The higher a feature’s importance, the more it contributes to the model’s prediction. The 28 features are sorted by importance: TE21 ranks top, followed by TE9, TE22, TE6 and TE10. The results show that player B’s ESAP, player A’s ESAP, player B’s ERAP, player A’s MRAP and player A’s ERAP greatly impact current table tennis matches. This implies that the receive-attack and serve-attack phases of the ending match have an essential impact on the MWP. At the same time, in terms of the mutual correlations between the phases of player A and player B (Table 3), the ESAP-ERAP phase has a high impact on the MWP.
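Ranking features by the mean magnitude of their SHAP values amounts to averaging absolute values column by column. The sketch below assumes the per-sample SHAP values have already been computed; the function name is hypothetical, and the two rows of numbers are invented purely to show the mechanics and are not results from this study.

```python
def rank_by_mean_abs_shap(shap_rows, names):
    """Sort features by the mean absolute SHAP value across samples (descending)."""
    n = len(shap_rows)
    means = {
        name: sum(abs(row[k]) for row in shap_rows) / n
        for k, name in enumerate(names)
    }
    return sorted(means.items(), key=lambda item: item[1], reverse=True)

# Invented SHAP values for three features over two samples.
names = ["TE9", "TE21", "TE22"]
rows = [
    [0.05, -0.08, -0.02],
    [-0.03, 0.06, 0.04],
]
for name, score in rank_by_mean_abs_shap(rows, names):
    print(name, round(score, 3))
```

Taking the absolute value before averaging matters: a feature that pushes the prediction up in some matches and down in others would otherwise average out to near zero and look unimportant.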

As shown in Fig. 7, the SHAP summary plots are provided to concisely describe the distribution of the feature’s impact on the MWP and the relationship between the features’ SHAP values and their impact. The dots with bluer colour refer to lower feature values. Figure 7 implies that the higher player A’s TEs are, the higher their SHAP values are, and the bigger MWPs are; the higher player B’s TEs are, the lower their SHAP values and the smaller MWPs are. Two players’ TEs exhibit mutually constraining impacts on the MWP in their corresponding phase (Table 3). Furthermore, high TEs can lead to low SHAP values for player B rather than for player A because of Eq. (5).

As shown in Fig. 8, the SHAP decision plot provides more detail on how each feature impacts the MWP in each data sample. To some extent, Fig. 8 implies that each feature contributes differently to the MWP in each data sample. This is consistent with comparative studies showing that the quality of opponents influences players’ performance, results, and rankings⁶¹⁻⁶⁵.
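A decision plot traces, for one sample, how the prediction moves from a base value to the final output as feature contributions are added one at a time. The sketch below reproduces that running sum under assumptions: the function name, base value and SHAP numbers are hypothetical, and features are accumulated from smallest to largest magnitude, which is one common ordering.

```python
def decision_path(base_value, shap_row, names):
    """Cumulative prediction path for one sample, adding small contributions first."""
    order = sorted(range(len(names)), key=lambda k: abs(shap_row[k]))
    path = [("base", base_value)]
    running = base_value
    for k in order:
        running += shap_row[k]
        path.append((names[k], running))
    return path

# Invented base value and SHAP values for a single match.
for step, value in decision_path(0.5, [0.05, -0.08, 0.02], ["TE9", "TE21", "TE22"]):
    print(step, round(value, 3))
```

Each segment's direction shows whether that feature raised or lowered the predicted MWP for this sample, and its length shows by how much, which is what the slope of a curve in the plot conveys.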

To further investigate whether the relationships among these features’ SHAP values are linear or non-linear, Fig. 9 is provided; each TE’s SHAP value represents its degree of impact on the final MWP. As shown in Fig. 9, the SHAP values of the two players’ TEs are approximately linearly related in each corresponding phase. Comparative studies showed that the phase-based analysis theory used two indicators, UR and SR, to compare the technical–tactical performance of both players better and more directly. The theory reveals a near-complementary relationship between the SRs of both players at each corresponding phase, with URs presenting a balanced and equal relationship. TEs construct a non-linear relationship between URs and SRs. However, the TEs of both players also correspond at each phase, so the higher the TE of one side, the lower the TE of his opponent, which explains the approximately linear relationship above. Furthermore, the result implies that the SHAP values of both players’ TEs, WRs and TSs show non-linear relationships, consistent with related studies⁶⁶,⁶⁷. Meanwhile, the SHAP values of one side’s WR and TS present non-linear relationships with those of the other side, consistent with related studies⁵⁷,⁵⁹.

Case diagnosis and local analysis

Regarding the technical–tactical diagnosis and analysis, SHAP provides the local interpretation to explain outputs, which is a different way to analyze and diagnose table tennis matches. Using SHAP reveals the essential feature in one match or the most crucial feature of one player in many matches. In the case application, as shown in Table 5, this study uses the traditional method and the SHAP method to analyze and diagnose the German elite male player Ovtcharov’s eight matches from 2019 to 2022 against Harimoto, Lin Yun-Ju, Morizono, Achanta, Alamian, Lin Gaoyuan, Lebesson, and Hirano.

Table 5 The specific information about Ovtcharov against eight players.

As shown in Fig. 10, the SHAP method shows that TE9 ranks top, followed by TE18, TE22, TE11 and TE5, which illustrates that the ESAP, ESI, and MSAP of Ovtcharov’s performance and the MRAP, ERAP, and ESAP of his opponents’ performance have a high impact on the MWP. According to the two players’ mutual correlations in Table 3, Fig. 10 implies that Ovtcharov’s performance in the ESAP-ERAP and MSAP-MRAP phases against his opponents greatly impacts the MWPs. Meanwhile, Fig. 10 shows that each feature’s contribution to the MWPs varies among samples, which verifies that one player’s performance in matches is closely related to their opponents¹⁰ and that both sides in the match have interactive and logical relationships¹⁰,²⁸.

As shown in Fig. 11, the traditional diagnosis method also shows that each weight’s effect on the MWPs differs among players. However, Fig. 11 reveals only the most crucial phase within a single match; it is hard to tell which phase is the most important across the eight matches. Moreover, the weight is calculated from the MWP’s value based on Eq. (5), so its size depends on the size of the MWP. Figure 11 shows that the weights between Ovtcharov and Lin Gaoyuan are almost all higher than the other diagnostic results, which illustrates that the importance of phases or features cannot be compared across different matches. The SHAP method can fill this gap through local interpretation and analysis. As shown in Fig. 11, SHAP decision plots show how LSTM–BPNN arrives at each output. Therefore, the trend and degree of each feature’s impact on the output MWP in different matches can be identified from the direction and slope of each curve.

To further investigate the difference between the traditional and the SHAP methods, this study selects one match, between Ovtcharov and Harimoto, for analysis and diagnosis. With the traditional diagnosis method, as shown in Table 6, TE5&TE18 and TE6&TE17 have a high impact on the MWP, while TE12&TE23 and TE11&TE24 have a low impact. With the SHAP method, as shown in Fig. 12, TE11, TE6, TE21, TE17, TE10 and TE24 have a high and positive impact on the MWP, and TE23 has a high and negative impact. Comparing the diagnostic results of both methods shows that the traditional method cannot distinguish positive from negative impacts. The two methods also disagree: the traditional one shows that TE12&TE23 and TE11&TE24 have low impacts, whereas the SHAP method shows that they all have high impacts on the MWP. Further analysis suggests that the traditional method produces this result because of Eq. (21). The weight values that the traditional method uses to measure the importance of different features are obtained by adjusting the input SRs through Eq. (21) and comparing the winning probabilities before and after the adjustment. According to Eq. (21), the increment Y is smaller when the input SR approaches 1 or 0 than when it is near 0.5, so the simulated change in SR is also smaller. The simulated TE is in turn calculated from the simulated SR and the original UR. This indirect adjustment and simulation likely lead to the above results for the traditional method. The black-box character of models built by deep learning algorithms⁶⁰ has led traditional diagnosis methods to rely heavily on such indirect simulation, and obtaining direct feedback on the match conditions in this way requires much work. The SHAP method used in this study is based on local interpretation, which enables direct and objective reflection of the match conditions without simulation.
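The behaviour of Eq. (21) at the middle and at the extremes of the SR range can be checked by direct substitution, assuming SR is expressed as a fraction between 0 and 1 and the cosine argument is in radians:

$$\text{Y}(0.5) = 0.238 \times \cos(0) - 0.178 = 0.060$$

$$\text{Y}(0) = \text{Y}(1) = 0.238 \times \cos(0.66) - 0.178 \approx 0.238 \times 0.790 - 0.178 \approx 0.010$$

Under these assumptions, a phase with SR = 0.5 is shifted by about 6 percentage points, whereas a phase with SR close to 0 or 1 is shifted by only about 1 percentage point. A phase with an extreme SR therefore receives a much smaller perturbation, which may partly account for its low weight under the traditional method.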

Table 6 Using traditional simulation method to diagnose Ovtcharov’s technical–tactical performance against Harimoto.

Overall, the shortcoming of the traditional table tennis diagnostic method is that it cannot directly determine which feature has the most critical impact; it only indirectly determines feature importance in one match through simulation analysis. Each feature’s importance in one or more matches, which can be obtained directly by the SHAP method, is more objective and reliable than that revealed by the traditional simulation method.

Limitation and Contribution

This study explores an innovative way to understand and analyze matches, and these results have implications for the performance analysis of table tennis and related racket sports. In detail, the specific contributions of this work are: (1) This study proposed a hybrid technical–tactical analysis through the four-phase evaluation theory and the double three-phase evaluation theory and built a diagnosis model based on LSTM–BPNN; (2) This study used SHAP to interpret the model’s output and diagnose table tennis matches; (3) SHAP’s global interpretation of the model shows that the receive-attack and serve-attack phases of the ending match have an essential impact on current table tennis matches; (4) The local interpretation and case application found that the SHAP method and the traditional simulation method came up with different diagnostic results. SHAP could directly obtain each feature’s importance in one or more matches, which is more objective and reliable than the traditional simulation method.

Although this study has made some contributions, it has some limitations. The data obtained from the hybrid analysis theory proposed in this study and from previous analysis theories, such as the four-phase and double three-phase evaluation theories, are all outcome-focused data. Although technical–tactical diagnosis of this type of data with artificial intelligence algorithms can find the key phases that impacted the match, it cannot reflect the progress of the match or the antecedents and consequences of the critical phases. Therefore, further study is suggested to focus more on progress analysis in matches and to construct theoretical models for progress analysis. Moreover, many factors can affect players’ technical and tactical performance in matches, including psychological and physiological factors, venue factors, and equipment factors. Further studies that consider these factors will need to be undertaken.

Conclusion

The interpretation of the technical–tactical diagnosis model is essential for players and coaches to trust analysis results. This study uses SHAP to interpret the model’s output and diagnose table tennis matches. Meanwhile, this study proposes a hybrid technical–tactical analysis theory through the four-phase and double three-phase evaluation theories and builds an interactive diagnosis model based on a hybrid deep learning model, namely LSTM–BPNN. Our main findings demonstrate that LSTM–BPNN achieves the best performance among the seven algorithms. Moreover, the global interpretation of the model shows that the receive-attack and serve-attack phases of the ending match have an essential impact on current male table tennis matches. The local interpretation and case application find that the SHAP and traditional simulation methods produce different diagnostic results. SHAP could directly obtain each feature’s importance in one or more matches, which is more objective and reliable than the traditional simulation method. These results have implications for the technical–tactical analysis theories, the diagnostic method, and algorithm selection for table tennis and related racket sports performance analysis.

Data availability

The datasets generated during the current study are not publicly available but are available from the corresponding author upon reasonable request.

References

1. Wu, H. & Li, Z. Research on technical diagnosis method for table tennis players. Int. J. Table Tennis Sci. 1, 99–103 (1992).
2. Hsu, M. H., Chen, Y. F. & Wang, S. C. Offense-defense mode analysis of the world top male table tennis player—A case study by Chuang Chih-Yuan who participated in 2012 London Olympic Male Single Games. J. Sci. Innov. 4, 41–49 (2014).
3. Yang, Q. & Zhang, H. Construction and application of “Four Phase Evaluation Theory” technique and tactics for table tennis. J. Tianjin Univ. Sport 29, 439–442. https://doi.org/10.13297/j.cnki.issn1005-0000.2014.05.013 (2014).
4. Huang, W. & Shi, Z. Three-stage index evaluation about Ding Ning’s table tennis playing. China Sport Sci. Technol. 52, 126–130. https://doi.org/10.16470/j.csst.201605017 (2016).
5. Tamaki, S., Yoshida, K. & Yamada, K. A shot number based approach to performance analysis in table tennis. J. Hum. Kinet. 55, 7–18 (2017).
6. Ley, C., Dominicy, Y. & Bruneel, W. Mutual point-winning probabilities (MPW): A new performance measure for table tennis. J. Sports Sci. 36, 2684–2690 (2018).
7. Xiao, D., Zhou, X., Liu, H., Qin, Z. & Yu, Y. The construction and application of double three-phase method on table tennis technique and tactics. China Sport Sci. Technol. 54, 112–116. https://doi.org/10.16470/j.csst.201805017 (2018).
8. Zhang, X., Xiao, D., Zhou, X. & Fang, W. The construction and application of dynamic three-phase method on table tennis technique and tactics. China Sport Sci. Technol. 54, 80–83. https://doi.org/10.16470/j.csst.201801011 (2018).
9. Yang, Q. & Lü, Y. Construction of the subsection theory for table tennis chop stroke. Sports. Sci. Res. 24, 44–52. https://doi.org/10.19715/j.tiyukexueyanjiu.2020.06.006 (2020).
10. Yu, J. & Gao, P. Interactive three-phase structure for table tennis performance analysis: Application to elite men’s singles matches. J. Hum. Kinet. 81, 177–188 (2022).
11. Zhang, H., Dai, J., Shi, F., Liu, Y. & Wang, J. Research on technical & tactical characteristics of racket games. J. Shanghai Univ. Sport https://doi.org/10.16099/j.cnki.jsus.2007.04.010 (2007).
12. Hughes, M. D. & Bartlett, R. M. The use of performance indicators in performance analysis. J. Sports Sci. 20, 739–754. https://doi.org/10.1080/026404102320675602 (2002).
13. Zhang, H. & Zhou, Z. How is table tennis in China successful?. Ger. J. Exerc. Sport Res. 49, 244–250 (2019).
14. Zhang, H., Zhou, Z. & Yang, Q. Match analyses of table tennis in China: A systematic review. J. Sports Sci. 36, 2663–2674 (2018).
15. Zhang, H. & Hohmam, A. Theory and practice of performance diagnosis through mathematical simulation in ball game. China Sport Sci. https://doi.org/10.16469/j.css.2005.08.009 (2005).
16. Zhang, H. & Hohmam, A. Athletic diagnosis of table tennis matches through mathematic simulation. J. Shanghai Univ. Sport https://doi.org/10.16099/j.cnki.jsus.2004.02.016 (2004).
17. Xiao, Y. & Zhang, H. Research report on the preparations of chinese table tennis team for the olympics—On the diagnostic model of table tennis competition based on artificial neural network. Sport Sci. Res. 29, 19–22 (2008).
18. Pfeiffer, M., Zhang, H. & Hohmann, A. A Markov chain model of elite table tennis competition. Int. J. Sports Sci. Coa. 5, 205–222 (2010).
19. Wenninger, S. & Lames, M. Performance analysis in table tennis-stochastic simulation by numerical derivation. Int. J. Comput. Sci. Sport 15, 22–36 (2016).
20. Yang, Q. & Zhang, H. Application of BP neural network and multiple regression in table tennis technical and tactical ability analysis. J. Chengdu Sport Univ. 42, 78–82. https://doi.org/10.15942/j.jcsu.2016.01.015 (2016).
21. Huang, W., Lu, M., Zeng, Y., Hu, M. & Xiao, Y. Technical and tactical diagnosis model of table tennis matches based on BP neural network. BMC Sports Sci. Med. Rehabil. 13, 1–11. https://doi.org/10.1186/s13102-021-00283-3 (2021).
22. Qiao, F. Application of deep learning in automatic detection of technical and tactical indicators of table tennis. PLoS ONE 16, 1–16 (2021).
23. Zhang, J. Automatic detection method of technical and tactical indicators for table tennis based on trajectory prediction using compensation fuzzy neural network. Comput. Intell. Neurosci. 2021, 3155357 (2021).
24. Zhao, H. & Liu, S. Tracing mechanism of sports competition pressure based on backpropagation neural network. Complexity 2021, 1–12. https://doi.org/10.1155/2021/6652896 (2021).
25. Cao, Y. et al. Application of tactics in technical and tactical analysis of table tennis mixed doubles based on artificial intelligence graph theory model. J. Environ. Public Health 2022, 1–9 (2022).
26. Glazier, P. S. Game, set and match? Substantive issues and future directions in performance analysis. Sports Med. 40, 625–634 (2010).
27. Gómez, M. A., García-de-Alcaráz, A. & Furley, P. Analysis of contextual-related variables on serve and receiving performances in elite men’s and women’s table tennis players. Int. J. Perform. Anal. Sport 17, 919–933 (2017).
28. Lvanek, V., Đukić, B., Mikić, B., Smajic, M. & Doder, D. Effects of technical and tactical characteristics on the performance of the table tennis players. Facta Univ. Ser. Phys. Ed. Sport 16, 157–166 (2018).
29. Elman, J. L. Finding structure in time. Cogn. Sci. 14, 179–211 (1990).
30. Hochreiter, S. & Schmidhuber, J. Long short-term memory. Neural Comput. 9, 1735–1780 (1997).
31. Moghar, A. & Hamiche, M. Stock market prediction using LSTM recurrent neural network. Procedia Comput. Sci. 170, 1168–1173 (2020).
32. Sezer, O. B., Gudelek, M. U. & Ozbayoglu, A. M. Financial time series forecasting with deep learning: A systematic literature review: 2005–2019. Appl. Soft Comput. 90, 106181 (2020).
33. Lim, S. M., Oh, H. C., Kim, J., Lee, J. & Park, J. LSTM-guided coaching assistant for table tennis practice. IEEE Sens. J. 18, 4112 (2018).
34. Li, K., Xu, H. & Liu, X. Analysis and visualization of accidents severity based on LightGBM-TPE. Chaos Solitons Fractals https://doi.org/10.1016/j.chaos.2022.111987 (2022).
35. Lundberg, S. M. & Lee, S.-I. in 31st Annual Conference on Neural Information Processing Systems (NIPS). (2017).
36. Tseng, P.-Y. et al. Prediction of the development of acute kidney injury following cardiac surgery by machine learning. Crit. Care https://doi.org/10.1186/s13054-020-03179-9 (2020).
37. Sun, T. & Wu, H. Reconciling the actual and nominal exposure concentrations of microplastics in aqueous phase: Implications for risk assessment and deviation control. J. Hazard. Mater. https://doi.org/10.1016/j.jhazmat.2022.130246 (2023).
38. Yao, Y., Qiu, Y., Cui, Y., Wei, M. & Bai, B. Insights to surfactant huff-puff design in carbonate reservoirs based on machine learning modeling. Chem. Eng. J. https://doi.org/10.1016/j.cej.2022.138022 (2023).
39. Guo, Y. et al. Effects of microplastics on growth, phenanthrene stress, and lipid accumulation in a diatom, Phaeodactylum tricornutum. Environ. Pollut. https://doi.org/10.1016/j.envpol.2019.113628 (2020).
40. Bujang, M. A. & Baharum, N. A simplified guide to determination of sample size requirements for estimating the value of intraclass correlation coefficient: A review. Arch. Orofac. Sci. 12, 1–11 (2017).
41. Koo, T. K. & Li, M. Y. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J. Chiropr. Med. 15, 155–163 (2016).
42. Rawat, S., Rawat, A., Kumar, D. & Sabitha, A. S. Application of machine learning and data visualization techniques for decision support in the insurance sector. Int. J. Inf. Manag. Data Insights 1, 100012 (2021).
43. Hancock, J. T. & Khoshgoftaar, T. M. Survey on categorical data for neural networks. J. Big Data 7, 1–41 (2020).
44. Pargent, F., Pfisterer, F., Thomas, J. & Bischl, B. Regularized target encoding outperforms traditional methods in supervised machine learning with high cardinality features. Comput. Stat. 37, 2671–2692 (2022).
45. Huang, W., Zhang, H. & Liu, W. Evaluation of Table Tennis Olympic Winner ZHANG Ji-ke’s technique effectiveness. China Sport Sci. Technol. 50, 31–34+39. https://doi.org/10.16470/j.csst.2014.03.006 (2014).
46. Balli, S. & Ozdemir, E. A novel method for prediction of EuroLeague game results using hybrid feature extraction and machine learning techniques. Chaos Solitons Fractals https://doi.org/10.1016/j.chaos.2021.111119 (2021).
47. Breiman, L. Random forests. Mach. Learn. 45, 5–32. https://doi.org/10.1023/a:1010933404324 (2001).
48. Ahmad, M. W., Reynolds, J. & Rezgui, Y. Predictive modelling for solar thermal energy systems: A comparison of support vector regression, random forest, extra trees and regression trees. J. Clean. Prod. 203, 810–821. https://doi.org/10.1016/j.jclepro.2018.08.207 (2018).
49. Chen, T., Guestrin, C. & Assoc Comp, M. in 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD). 785–794 (2016).
50. Wang, L., Zeng, Y. & Chen, T. Back propagation neural network with adaptive differential evolution algorithm for time series forecasting. Expert Syst. Appl. 42, 855–863. https://doi.org/10.1016/j.eswa.2014.08.018 (2015).
51. Song, X. et al. Spatial-temporal behavior of precipitation driven karst spring discharge in a mountain terrain. J. Hydrol. https://doi.org/10.1016/j.jhydrol.2022.128116 (2022).
52. Lipovetsky, S. & Conklin, M. Analysis of regression in game theory approach. Appl. Stoch. Model. Bus. Ind. 17, 319–330 (2001).
53. Antwarg, L., Miller, R. M., Shapira, B. & Rokach, L. Explaining anomalies detected by autoencoders using Shapley Additive Explanations. Expert Syst. Appl. https://doi.org/10.1016/j.eswa.2021.115736 (2021).
54. Fu, X., Wu, M., Ponnarasu, S. & Zhang, L. A hybrid deep learning approach for dynamic attitude and position prediction in tunnel construction considering spatio-temporal patterns. Expert Syst. Appl. https://doi.org/10.1016/j.eswa.2022.118721 (2023).
55. Winter, C., Rasche, C. & Pfeiffer, M. Linear vs. non-linear classification of winners, drawers and losers at FIFA World Cup 2014. Sci. Med. Footb. 1, 164–170. https://doi.org/10.1080/24733938.2017.1283435 (2017).
56. Soo, J., Woods, C. T., Arjunan, S. P., Aziz, A. R. & Ihsan, M. Identifying the performance characteristics explanatory of fight outcome in elite Pencak Silat matches. Int. J. Perform. Anal. Sport 18, 973–985. https://doi.org/10.1080/24748668.2018.1539381 (2018).
57. Wallace, J. L. & Norton, K. I. Evolution of World Cup soccer final games 1966–2010: Game structure, speed and play patterns. J. Sci. Med. Sport 17, 223–228. https://doi.org/10.1016/j.jsams.2013.03.016 (2014).
58. Cui, Y., Gomez, M.-A., Goncalves, B. & Sampaio, J. Performance profiles of professional female tennis players in grand slams. PLoS ONE https://doi.org/10.1371/journal.pone.0200591 (2018).
59. Palut, Y. & Zanone, P. G. A dynamical analysis of tennis: Concepts and data. J. Sports Sci. 23, 1021–1032. https://doi.org/10.1080/02640410400021682 (2005).
60. Castelvecchi, D. Can we open the black box of AI?. Nature News 538, 20 (2016).
61. Lai, M., Meo, R., Schifanella, R. & Sulis, E. The role of the network of matches on predicting success in table tennis. J. Sports Sci. 36, 2691–2698. https://doi.org/10.1080/02640414.2018.1482813 (2018).
62. Hewitt, A., Norton, K. & Lyons, K. Movement profiles of elite women soccer players during international matches and the effect of opposition’s team ranking. J. Sports Sci. 32, 1874–1880. https://doi.org/10.1080/02640414.2014.898854 (2014).
63. Yi, Q. et al. Situational and positional effects on the technical variation of players in the UEFA Champions League. Front. Psychol. https://doi.org/10.3389/fpsyg.2020.01201 (2020).
64. Lago-Ballesteros, J., Lago-Penas, C. & Rey, E. The effect of playing tactics and situational variables on achieving score-box possessions in a professional soccer team. J. Sports Sci. 30, 1455–1461. https://doi.org/10.1080/02640414.2012.712715 (2012).
65. Aquino, R., Munhoz Martins, G. H., Palucci Vieira, L. H. & Menezes, R. P. Influence of match location, quality of opponents, and match status on movement patterns in Brazilian professional football players. J. Strength Cond. Res. 31, 2155–2161. https://doi.org/10.1519/jsc.0000000000001674 (2017).
66. Kidokoro, S., Inaba, Y., Yoshida, K., Yamada, K. & Ozaki, H. A topspin rate exceeding 110 rps reduces the ball time of arrival to the opponent: A table tennis rally study. Sports Biomech. https://doi.org/10.1080/14763141.2022.2156916 (2022).
67. Menayo Antunez, R., Moreno Hernandez, F. J., Fuentes Garcia, J. P., Reina Vaillo, R. & Damas Arroyo, J. S. Relationship between motor variability, accuracy, and ball speed in the tennis serve. J. Hum. Kinet. 33, 45–53. https://doi.org/10.2478/v10078-012-0043-3 (2012).

Funding

This research was funded by the National Social Science Fund of China (NSSFC) under Grant Number 18CTY011.

Author information

Authors and Affiliations

College of Physical Education and Sports, Beijing Normal University, Beijing, 100084, China
Honglin Song, Yutao Li & Tianbiao Liu
School of Physical Education, Jilin University, Jilin, 130015, China
Xiaofeng Zou
Microsoft, Beijing, 100080, China
Ping Hu

Authors

Honglin Song
Yutao Li
Xiaofeng Zou
Ping Hu
Tianbiao Liu

Contributions

H.S.: Conceptualization, Methodology, Formal analysis, Visualization, AI modelling, Writing—original draft, Writing—review & editing. Y.L.: Conceptualization, Methodology, Formal analysis, Visualization, Writing—original draft, Writing—review & editing. X.Z.: Formal analysis, Writing—original draft. P.H.: Methodology, Supervision. T.L.: Methodology, Supervision, Funding acquisition, Writing—review & editing.

Corresponding author

Correspondence to Tianbiao Liu.

Ethics declarations

Competing interests

Ping Hu was employed by the company Microsoft, Beijing. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Song, H., Li, Y., Zou, X. et al. Elite male table tennis matches diagnosis using SHAP and a hybrid LSTM–BPNN algorithm. Sci Rep 13, 11533 (2023). https://doi.org/10.1038/s41598-023-37746-1

Received: 24 February 2023
Accepted: 27 June 2023
Published: 17 July 2023
DOI: https://doi.org/10.1038/s41598-023-37746-1
Springer Nature Limited

This article is cited by

Real-Time Prediction of Elbow Motion Through sEMG-Based Hybrid BP-LSTM Network
- Yiyuan Ma
- Huaiyuan Chen
- Weidong Chen
Journal of Shanghai Jiaotong University (Science) (2024)

Deep learning-based public transit passenger flow prediction model: integration of weather and temporal attributes
- Nithin K. Shanthappa
- Raviraj H. Mulangi
- Harsha M. Manjunath
Public Transport (2024)

Long-term gridded land evapotranspiration reconstruction using Deep Forest with high generalizability
- Qiaomei Feng
- Junyong Shen
- Zhenzhong Zeng
Scientific Data (2023)
