Introduction

Sporting events are a popular source of entertainment and attract immense interest from the general public. Sports analysts, coaching staff, franchises, and fans alike seek to forecast winners and losers in upcoming match-ups based on previous records. The interest in predicting sporting outcomes is particularly pronounced for professional team sport leagues, including Major League Baseball (MLB), the National Football League (NFL), the National Hockey League (NHL), and the National Basketball Association (NBA); postseason play in these leagues, namely the playoffs, is of greater interest than regular-season games because teams compete directly for prestigious championship titles.

The development of statistical models that robustly predict the outcome of playoff games from year to year is a challenging machine learning task because of the plethora of individual, team, and external factors that together confound the propensity of a given team to win a given game in a given year. In this work, we develop MambaNet: a large hybrid neural network for predicting the outcome of a basketball match during the playoffs. MambaNet combines several deep learning techniques, including convolutional neural networks, recurrent neural networks, and dense layers. Crucially, it incorporates Feature Imitating Networks (FINs), an approach that initializes network weights to simulate specific statistical features, effectively bridging statistical methodologies with machine learning capabilities. This integration is pivotal in enhancing MambaNet’s ability to process complex patterns in sports data, leading to more accurate and robust predictions.

FINs have emerged as powerful tools in specialized tasks, especially in overcoming challenges in transfer learning. Their effectiveness spans a broad spectrum of applications, from EEG data analysis [12] to biomedical image processing [11], Bitcoin price prediction, and speech emotion recognition [9]. In the context of MambaNet, the integration of FINs enables more effective capturing and utilization of complex statistical relationships in basketball game data, crucial for accurate playoff outcome predictions. This innovative approach distinguishes MambaNet from conventional predictive models, highlighting its transformative potential in sports analytics and beyond.

To assess the value of our proposed approach, we performed three experiments using NBA and Iranian Super League data. The first experiment demonstrated that Feature Imitating Networks (FINs) significantly enhanced predictive accuracy, achieving an AUC improvement of up to 0.15 across various NBA seasons. In our second experiment, MambaNet’s generalizability was confirmed with an AUC improvement of up to 0.12 in the Iranian Basketball Super League playoffs. Lastly, integrating various deep learning techniques into MambaNet led to an increase in predictive accuracy, with AUC scores reaching up to 0.82 in different NBA seasons. These results highlight MambaNet’s exceptional performance and transformative potential in the realm of sports analytics.

Contributions

There are five main differences between our work and previous studies [3, 4, 6, 7, 14]: (1) we use a combination of both player and team statistics; (2) we account for the evolution in player and team statistics over time using a signal processing approach; (3) we utilize Feature Imitating Networks (FINs) [12] to embed feature representations into the network; (4) we predict the outcome of playoff results, as opposed to season games; and (5) we test the generalizability of our model across two distinct national basketball leagues (NBA and Iranian Basketball Super League).

The remainder of this paper is organized as follows. Section two reviews the existing literature on game outcome prediction and recent advancements in FINs. Section three introduces our methodology. Section four evaluates our approach, section five discusses our findings, and section six concludes with a summary of our key insights and their implications.

Related Work

Advancements in Feature Imitation for Diverse Neural Network Applications

FINs have revolutionized transfer learning in various specialized tasks, demonstrating their adaptability across multiple domains [9, 12]. These networks, which initialize weights to mimic specific statistical features, have shown significant efficacy in both biomedical and broader applications. Pioneering work in this field [12] revealed FINs’ superiority in ECG and EEG artifact detection, outperforming traditional models using features like Kurtosis and Shannon’s Entropy. Subsequent studies, such as those by [11], extended their application to biomedical image processing, achieving state-of-the-art results in CT and MRI scan analysis for COVID-19 and brain tumor detection. Recent research [9] has further broadened the scope of FINs, successfully applying them to financial time series, speech emotion recognition, and chronic neck pain detection. This expansion highlights the versatility of FINs, demonstrating substantial improvements in areas like Bitcoin price prediction and speech classification accuracy, thereby confirming their utility in a wide range of time series datasets.

Predictive Modeling of NBA Game Outcomes

The NBA is the most popular contemporary basketball league [8, 15]. Several previous studies have examined the impact of different game statistics on a team’s propensity to win or lose a game [13, 16]. More specifically, previous studies have identified teams’ defensive rebounds, field goal percentage, and assists as crucial contributing factors to succeeding in a basketball game [10]; for machine learning workflows, these game attributes may be used as valuable input features to predict the outcome of a given basketball game [2, 5].

Probabilistic models to predict the outcome of basketball games have been proposed by several previous studies. Jain and Kaur [7] developed a Support Vector Machine (SVM) and a Hybrid Fuzzy-SVM model (HFSVM) and reported 86.21% and 88.26% accuracy, respectively, in predicting the outcome of basketball games. More recently, Houde [6] experimented with SVM, Gaussian Naive Bayes (GNB), Random Forest (RF) Classifier, K Neighbors Classifier (KNN), Logistic Regression (LR), and XGBoost Classifier (XGB) over fifteen game statistics across the last ten games of both home and away teams. Houde also experimented over a more extended period of NBA season data, from 2018 to 2021, and reported 65.1% accuracy in winner/loser classification. In contrast to Jain and Kaur and to Houde, who addressed game outcome prediction as a binary classification task, Chen et al. [3] identified the winner and loser by predicting the exact final game scores. They used a data mining approach, experimenting with 13 NBA game statistics from the 2018 to 2019 season. After feature selection, this number shrank to 6 critical basketball statistics for predicting the outcome. In terms of models, the authors experimented with KNN, XGB, Stochastic Gradient Boosting (SGB), Multivariate Adaptive Regression Splines (MARS), and Extreme Learning Machine (ELM) to predict the winner of NBA matchups. The authors also studied the effect of different game-lag values (from 1 to 6) on the success of these models and found that a game-lag of 4 performed best on their feature set.

Fewer studies have used Neural Networks to predict the outcome of basketball games; this is mostly due to challenges of over-fitting in the presence of (relatively) small basketball training datasets. Thabtah et al. [14] trained Artificial Neural Networks (ANN) on a wide span of data where they extracted 20 team stats per NBA matchup played from 1980 to 2017. Their model obtained 83% accuracy in predicting NBA game outcomes; they also demonstrated the significance of three-point percentage, free throws made, and total rebounds as features that enhanced their model’s accuracy rate.

Limitations of Existing Methods

While previous studies have laid a significant groundwork in the field of sports outcome prediction, especially within the NBA, they predominantly rely on classical machine learning techniques. These approaches, although valuable, often do not leverage the full potential of advanced neural network architectures, such as Recurrent Neural Networks (RNNs), which are particularly adept at handling the sequential nature of data. The capacity of RNNs to model the temporal dynamics of player and team performance across a season could potentially offer substantial improvements in predictive accuracy, yet this avenue remains underexplored.

Furthermore, existing methodologies have largely overlooked the rich dataset available in player statistics. By focusing primarily on aggregated team statistics, these studies miss out on the nuanced insights individual player performance data can provide, which could significantly inform game outcome predictions. This oversight limits the depth of analysis possible, particularly in understanding the contributions of individual players to the game’s outcome.

Another critical aspect often neglected is the specific focus on playoff games. Playoffs represent a unique subset of matches characterized by higher stakes, increased intensity, and often, different dynamics compared to regular-season games. The failure to distinguish between these contexts can lead to models that, while effective for regular-season predictions, struggle when applied to the more unpredictable nature of playoff matchups. By not addressing these limitations, existing predictive models miss the opportunity to fully capture and analyze the complexities inherent in sports competitions, underscoring the need for more sophisticated, nuanced approaches.

Methods

Baseline approach: The majority of existing studies use a similar methodological approach. For each team (home and away), a set of s game statistics (the features) is extracted over the n previous games (the game-lag value [3]), forming an \(n \times s\) matrix. Then, the mean of each statistic is calculated across the n games, resulting in a \(1 \times s\) feature vector for each team. The two feature vectors are concatenated, yielding a \(1 \times 2s\) vector for each matchup between a given pair of teams. Stacking these vectors over all training matchups results in a \(trainSize \times 2s\) matrix, which is used to train a classification model (each experiment reports the train/test set sizes in more detail). The label of each sample indicates whether the home team won (\(y=1\)) or lost (\(y=0\)) the game.
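The baseline feature construction described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in any of the cited studies: the function names and the toy statistics are hypothetical, and only the standard library is used.

```python
from statistics import fmean

def team_feature_vector(last_games):
    """Average each of the s statistics over the n most recent games.

    last_games is an n x s list of lists (one row per game);
    the result is a list of length s.
    """
    return [fmean(column) for column in zip(*last_games)]

def matchup_features(home_games, away_games):
    """Concatenate the home and away mean vectors into one row of length 2s."""
    return team_feature_vector(home_games) + team_feature_vector(away_games)

# Toy example with n = 3 games and s = 2 statistics (made-up values).
home = [[100, 40], [110, 44], [90, 42]]
away = [[95, 38], [105, 41], [100, 41]]
row = matchup_features(home, away)  # one training sample of length 2s = 4
label = 1                           # y = 1 means the home team won
```

Stacking one such row per training matchup gives the \(trainSize \times 2s\) matrix passed to the classifier.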

Fig. 1 An overview of MambaNet’s architecture. First, the home (column 1, yellow boxes) and away (column 1, purple boxes) teams’ stats and the two teams’ players’ stats are fed to the network. Next, four FINs are utilized to represent the input stats’ signal features, which contain trainable (column 1, dark circles) and non-trainable (column 2, light circles) layers. These representations are further processed with convolutional and Dense layers. Raw time-domain signal features are also extracted from input stats using LSTM networks. Finally, the aforementioned features are incorporated to make the final prediction.

Table 1 A description of game statistics used in this work

FIN Training: Our method follows the same steps as the baseline approaches, with one critical difference: instead of calculating the mean of each feature across the last n games directly, we feed the entire \(n \times s\) matrix to a pretrained mean FIN and stack hidden layers on top of it to perform binary classification (hereafter, this FIN-based deep feedforward architecture is referred to as FINDFF). In addition to the mean, we also imitate the standard deviation, variance, and skewness.

All FINs are trained using the same neural architecture: a sequence of dense layers with 64, 32, 16, 8, and 4 units, respectively, is stacked before connecting to a single-unit sigmoid layer. The activation function is ReLU for the first two hidden layers and linear for the rest. Each model is trained in a regression setting on a dataset of 100,000 randomly generated signals, partitioned into an 80-10-10 split for training, validation, and testing, respectively. The training labels are the handcrafted feature values computed for each signal. The dataset size was determined through experimentation: preliminary tests with various quantities of data indicated that 100,000 signals offered enough diversity and volume for the FINs to generalize well across our analyses without overwhelming computational resources. After training, we freeze the first three layers, fine-tune the fourth layer, and remove the remaining two layers before integrating each FIN within the larger MambaNet network.
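As a rough illustration of how the FIN regression data could be assembled, the sketch below generates random signals, labels each one with a handcrafted feature value, and applies an 80-10-10 split. Several details are assumptions not stated in the text: signals are drawn uniformly from [0, 1), each signal has length 10, and standard deviation, variance, and skewness use their population definitions. The names `FEATURES`, `make_fin_dataset`, and `split_80_10_10` are hypothetical.

```python
import random
from statistics import fmean, pstdev, pvariance

def skewness(signal):
    """Population skewness: third central moment divided by the std cubed."""
    mu = fmean(signal)
    sigma = pstdev(signal)
    if sigma == 0:
        return 0.0
    return fmean([(x - mu) ** 3 for x in signal]) / sigma ** 3

# The four handcrafted features that the FINs are trained to imitate.
FEATURES = {"mean": fmean, "std": pstdev, "variance": pvariance, "skewness": skewness}

def make_fin_dataset(feature, num_signals=100_000, length=10, seed=0):
    """Return (signals, labels); each label is the feature value of its signal."""
    rng = random.Random(seed)
    signals = [[rng.random() for _ in range(length)] for _ in range(num_signals)]
    labels = [FEATURES[feature](s) for s in signals]
    return signals, labels

def split_80_10_10(signals, labels):
    """Partition the data into train, validation, and test (signals, labels) pairs."""
    n_train = len(signals) * 8 // 10
    n_val = len(signals) // 10
    cuts = [(0, n_train), (n_train, n_train + n_val), (n_train + n_val, len(signals))]
    return [(signals[a:b], labels[a:b]) for a, b in cuts]

# Small example; the paper reports 100,000 signals per FIN.
signals, labels = make_fin_dataset("skewness", num_signals=1_000)
train, val, test = split_80_10_10(signals, labels)
```

A regression network with the layer sizes given above would then be fitted to each `(signals, labels)` pair, one network per feature.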

MambaNet: Fig. 1 illustrates MambaNet, our proposed approach. The complete set of player and team statistics used in this study can be found in Table 1. The input to the network is a \(10 \times 35\) team stats matrix, which is passed both to the pretrained FINs and to LSTM layers with a hidden dimension size of 8 that extract the sequential features of a team’s statistics. For each team, we also extract a stats matrix (\(n=10\), \(s=34\)) for each of its roster’s top ten players and pass them to the same FINs and LSTM layers. Next, we flatten the teams’ signal feature representations and feed them to dense layers, whereas for players, we stack them and feed them to 1D convolutional layers with a kernel size of 3 and a filter count of 5. Finally, all latent representations of a team and its ten players are concatenated in the network before being connected to the last sigmoid layer. MambaNet is trained with a batch size of 32 and a learning rate of 0.001 using the Adam optimizer. These hyperparameters were selected based on extensive experimentation to ensure the model’s robust performance across different datasets.
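To make the player branch more concrete, the following sketch applies a 1D convolution with kernel size 3 and 5 filters to a stack of ten player representations. It is only an illustration under stated assumptions: the text does not specify the padding, stride, convolution axis, or the size of each player representation, so the sketch assumes no padding, stride 1, convolution along the player axis, and a hypothetical 4-dimensional representation per player. The function name `conv1d_valid` is also hypothetical, and the weights are fixed to 1.0 purely so the result is easy to check by hand.

```python
def conv1d_valid(stacked, kernels, biases):
    """1D convolution (cross-correlation) along the first axis, no padding, stride 1.

    stacked: P rows (one per player), each with C channels.
    kernels: F filters, each a K x C list of weights.
    Returns (P - K + 1) rows with F values each.
    """
    k = len(kernels[0])
    out = []
    for start in range(len(stacked) - k + 1):
        window = stacked[start:start + k]
        row = []
        for kernel, bias in zip(kernels, biases):
            acc = bias
            for w_row, x_row in zip(kernel, window):
                acc += sum(w * x for w, x in zip(w_row, x_row))
            row.append(acc)
        out.append(row)
    return out

# Ten players, each summarised by an assumed 4-dimensional representation.
players = [[float(p)] * 4 for p in range(10)]
# Five filters with kernel size 3 (all weights 1.0 for illustration only).
kernels = [[[1.0] * 4 for _ in range(3)] for _ in range(5)]
biases = [0.0] * 5
features = conv1d_valid(players, kernels, biases)
# Under these assumptions the output has 10 - 3 + 1 = 8 positions and 5 filter values each.
```

In a trained network the kernel weights would be learned, and the resulting feature map would be flattened before concatenation with the team representations.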

Experiments and Results

We performed three experiments to assess the performance of our proposed method. To demonstrate the advantage of leveraging FINs in deep neural networks, we first compare the performance of FINDFF against a diverse set of other basketball game outcome prediction models trained using NBA data. In the second experiment, these models are tested for generalization on unseen basketball playoff games from the Iranian Super League. Finally, we assess the performance of MambaNet for accurate playoff outcome prediction. In all three experiments, the Area Under the ROC Curve (AUC) was used as our primary evaluation metric.
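For readers less familiar with the metric, AUC can be computed as the probability that a randomly chosen positive sample (home win) receives a higher score than a randomly chosen negative sample (home loss), with ties counted as one half. The sketch below implements this pairwise definition directly; it is a didactic version with a hypothetical function name, not the evaluation code used in the experiments, and library implementations are preferable for large test sets.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))

# Four games: true outcomes and predicted home-win probabilities (made-up values).
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 3 of 4 pairs ordered correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of wins above losses.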

Table 2 Comparison of training and testing dataset sizes across 5 years of NBA data
Table 3 A performance comparison between FINDFF and other previously-developed machine learning models on 5 years of NBA Playoffs, from the 2017 to 2018 season (17–18) to the 21–22 season

Experiment I

This experiment aims to determine whether using FINs in conjunction with deep neural networks can enhance playoff outcome prediction. We followed the same machine learning pipeline as previous studies to compare FINDFF. However, we applied a pretrained mean FIN to the \(n \times s\) matrix instead of taking the mean directly, providing an almost identical setting when comparing FINDFF with other classic machine learning algorithms. Since the FIN is the only differing component in this setting, its effects can be easily studied.

Dataset: All data were gathered from NBA games played over five seasons, from 2017–2018 to 2021–2022. We used each year’s regular-season games as training data and playoff games as testing data (dataset sizes can be found in Table 2), leaving us with five different NBA datasets.
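The per-season split described above might be organized as follows. The record fields `season`, `is_playoff`, and `home_won` are hypothetical placeholders for however the game logs are actually stored, and the function name is illustrative.

```python
def split_by_stage(games):
    """Map each season to a (train, test) pair: regular-season vs. playoff games."""
    datasets = {}
    for game in games:
        train, test = datasets.setdefault(game["season"], ([], []))
        if game["is_playoff"]:
            test.append(game)
        else:
            train.append(game)
    return datasets

# Made-up records for illustration.
games = [
    {"season": "17-18", "is_playoff": False, "home_won": 1},
    {"season": "17-18", "is_playoff": True, "home_won": 0},
    {"season": "18-19", "is_playoff": False, "home_won": 1},
]
datasets = split_by_stage(games)  # one (train, test) pair per season
```

With five seasons of data, this yields the five separate train/test datasets used in the experiment.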

Results: In Table 3, we compare FINDFF models with five other methods from the literature using different features (game statistics), game-lag values, and classification algorithms. The FINDFF network successfully outperformed all other methods with a 0.05 to 0.15 AUC margin in every year of NBA data, demonstrating the advantage of the feature imitation technique in game outcome prediction.

Table 4 A performance comparison between FINDFF and other previously-developed machine learning models trained on 5 years of NBA (from 17–18 to 21–22) on the 2020–2021 Iranian Super League Playoffs
Table 5 Comparing the performance of different MambaNet versions in 5 years of NBA Playoffs from the 2017 to 2018 season (17-18) to the 21–22 season

Experiment II

The purpose of this experiment is to examine the generalizability of the methodologies from the first experiment. As described in Experiment I, each model is trained and tested on five different years of NBA data. In this experiment, we still train these models on the five NBA datasets but test them on the Iranian Super League playoffs. This allows us to compare how well each method generalizes when predicting test cases from a significantly different data source.

Dataset: For training purposes, we used the same NBA datasets discussed in Experiment I. For testing, we used the 2020–2021 Iranian Basketball Super League playoffs.

Results: As shown in Table 4, FINDFF models outperformed almost all other methodologies in predicting the outcome of the Iranian Basketball Super League playoffs, by a margin of 0.02 to 0.12 AUC.

Experiment III

The first two experiments illustrated the superior and more generalizable performance of FINs in predicting playoff outcomes compared to baseline methods. Building on the FINDFF architecture, MambaNet was developed to explore how integrating additional components affects the effectiveness of our hybrid model.

Dataset: For this experiment, we employed the same NBA datasets introduced in Experiment I, ensuring consistency in our data source.

Results: The results of our comprehensive experiment are summarized in Table 5. Initially, MambaNet employed 35 team statistics, passed through a FINDFF network that imitates the mean (m), as outlined in the first row of Table 5. This approach utilized a broad spectrum of basketball game statistics to form the team’s feature vector, meeting the data-intensive needs of neural networks. The subsequent enhancement involved training additional FINs to imitate Standard Deviation (std), Variance (v), and Skewness (s), which resulted in significant improvements in predictive accuracy, as detailed in the second row of Table 5.

Moreover, in the third stage of our experiment, we extended our feature set to include individual player statistics in addition to team stats, as comprehensively listed in Table 1. This integration of player features alongside team statistics marked a significant expansion of our model’s capacity, leading to an average increase of 0.02 in AUC across four NBA datasets. The detailed incorporation of both team and player statistics allowed for a more nuanced analysis, leveraging the complete set of features available.

Finally, the integration of RNN layers enabled the creation of a dynamic time-series representation of both game and individual player statistics. This methodological enhancement, depicted in the fourth row of Table 5, offered further improvements in the model’s predictive performance, with AUC increases of 0.03 and 0.02 for the 2019–2020 and 2021–2022 NBA seasons, respectively. This layered approach underscored the effectiveness of employing a comprehensive array of features in enhancing the predictive accuracy of MambaNet.

Discussion

In the first experiment, our focus was on comparing the FINDFF model, which utilizes FINs, against traditional basketball game outcome prediction models. Crucially, to ensure a fair comparison with prior work, where all models calculated the mean of game stats across the last n games, we employed feature imitation solely for the mean function in FINDFF. The results were significant, with FINDFF consistently outperforming conventional methods. Notably, the model demonstrated an increase of up to 0.15 in AUC across various NBA seasons. This substantial improvement underscores the advantage of feature imitation, even when applied to basic statistical functions like the mean. It suggests that feature imitation is more effective than models built solely on handcrafted features, offering a more nuanced approach to capturing and representing the complex statistical patterns inherent in sports data, thereby enhancing predictive modeling in sports analytics.

The second experiment examined the generalizability of the FINDFF model by applying it to the Iranian Basketball Super League playoffs. In this test, FINDFF continued to demonstrate its efficacy, showing an increase of up to 0.12 in AUC compared to other methodologies. This significant improvement highlights the robustness of the FINDFF model and its applicability beyond a single league.

Our third experiment was pivotal in showcasing the incremental benefits of integrating various deep learning techniques into MambaNet. By progressively adding new features and layers, particularly RNNs for time-series data processing, we observed a substantial improvement in the model’s predictive accuracy. RNNs are particularly beneficial for their ability to capture the temporal dynamics in data, an essential factor in sports analytics. For instance, if a player has recently improved their shooting rate, RNNs can effectively capture this temporal shift, providing a more accurate and current representation of the player’s performance. This nuanced understanding of time-dependent changes in game statistics is crucial for accurate predictions. The culmination of these enhancements in MambaNet resulted in AUC scores reaching up to 0.82, marking a significant advancement in the accuracy of sports outcome predictions. This highlights not only the efficacy of combining different machine learning techniques but also underscores the potential of deep learning in extracting deeper insights from complex, time-sensitive datasets.
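The shooting-rate example can be made concrete with a small numerical illustration. The sketch below is not an RNN; it only shows that two sequences with opposite trends have the same mean, so an order-insensitive summary cannot distinguish an improving player from a declining one, whereas any order-aware quantity (here a least-squares slope, chosen purely for illustration) can. The shooting rates are made-up values.

```python
from statistics import fmean

def slope(series):
    """Least-squares slope of the series against the game index 0..n-1."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = fmean(series)
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den

# Shooting rate over the last five games (hypothetical numbers).
improving = [0.30, 0.35, 0.40, 0.45, 0.50]
declining = list(reversed(improving))

same_mean = abs(fmean(improving) - fmean(declining)) < 1e-12  # mean ignores order
trend_up = slope(improving)    # positive: the player is improving
trend_down = slope(declining)  # negative: the player is declining
```

A recurrent layer processes the games in order and can therefore learn representations that reflect such trends, which a mean over the same window discards.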

Conclusion

In this work, we tackled playoff basketball game outcome prediction from a signal processing standpoint. We introduced MambaNet, which incorporated historical player and team statistics and represented them through signal feature imitation using FINs. To compare our method with the baseline, we used NBA and Iranian Super League data which enabled us to demonstrate the performance and generalizability of our method. Future studies will potentially use fusion techniques or other suitable data modeling techniques, such as graphs, to develop more advanced neural networks that integrate team and player representations more efficiently to predict playoff outcomes more accurately.