NBA Game Result Prediction Using Feature Analysis and Machine Learning

Annals of Data Science

Abstract

In recent years, sports outcome prediction has gained popularity, as demonstrated by the massive financial transactions in sports betting. One of the world’s most popular sports, drawing both betting and millions of fans worldwide, is basketball, particularly the National Basketball Association (NBA) of the United States. This paper proposes a new intelligent machine learning framework for predicting the results of NBA games by discovering the influential feature set that affects their outcomes. We investigate whether machine learning methods are applicable to forecasting the outcome of an NBA game using historical data (previous games played), and which factors significantly affect that outcome. To achieve these objectives, we select several machine learning methods that use different learning schemes to derive their models, including Naïve Bayes, artificial neural networks, and decision trees. By comparing the performance of the derived models across different feature sets related to basketball games, we identify the key features that contribute to better performance of the prediction model, such as accuracy and efficiency. Based on the results analysis, the DRB (defensive rebounds) feature was selected as the most significant factor influencing the result of an NBA game. Furthermore, other crucial factors such as TPP (three-point percentage), FT (free throws made), and TRB (total rebounds) were also selected, which subsequently increased the model’s prediction accuracy by 2–4%.
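The comparison of feature sets described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the data below are synthetic, the generator deliberately builds in an effect for DRB, TPP, and FT (an assumption made only so the comparison has something to find), and all function names and the NOISE feature are hypothetical. A small Gaussian Naïve Bayes classifier is written from scratch and scored on a simple hold-out split for each candidate feature set.

```python
import math
import random


def make_synthetic_games(n, seed=0):
    """Generate hypothetical box-score rows. This is NOT real NBA data."""
    rng = random.Random(seed)
    games = []
    for _ in range(n):
        row = {
            "DRB": rng.gauss(33, 5),       # defensive rebounds
            "TPP": rng.gauss(0.35, 0.08),  # three-point percentage
            "FT": rng.gauss(18, 5),        # free throws made
            "NOISE": rng.gauss(0, 1),      # feature unrelated to the result
        }
        # Assumed ground truth: the win margin depends on DRB, TPP and FT.
        margin = (0.25 * (row["DRB"] - 33)
                  + 10 * (row["TPP"] - 0.35)
                  + 0.1 * (row["FT"] - 18)
                  + rng.gauss(0, 1))
        games.append((row, 1 if margin > 0 else 0))
    return games


def fit_gaussian_nb(rows, features):
    """Estimate a class prior and per-feature mean/variance for each label."""
    model = {}
    for label in (0, 1):
        subset = [x for x, y in rows if y == label]
        params = {}
        for f in features:
            values = [x[f] for x in subset]
            mean = sum(values) / len(values)
            # Small constant keeps the variance strictly positive.
            var = sum((v - mean) ** 2 for v in values) / len(values) + 1e-9
            params[f] = (mean, var)
        model[label] = (len(subset) / len(rows), params)
    return model


def predict(model, x):
    """Return the label with the highest log-posterior under the model."""
    def log_posterior(label):
        prior, params = model[label]
        total = math.log(prior)
        for f, (mean, var) in params.items():
            total += (-0.5 * math.log(2 * math.pi * var)
                      - (x[f] - mean) ** 2 / (2 * var))
        return total
    return max(model, key=log_posterior)


def holdout_accuracy(games, features, train_fraction=0.7):
    """Train on the first part of the data and score on the remainder."""
    cut = int(len(games) * train_fraction)
    train, test = games[:cut], games[cut:]
    model = fit_gaussian_nb(train, features)
    hits = sum(predict(model, x) == y for x, y in test)
    return hits / len(test)


def compare_feature_sets(games, feature_sets):
    """Map each named feature set to its hold-out accuracy."""
    return {name: holdout_accuracy(games, feats)
            for name, feats in feature_sets.items()}


if __name__ == "__main__":
    games = make_synthetic_games(2000)
    results = compare_feature_sets(games, {
        "noise only": ["NOISE"],
        "DRB only": ["DRB"],
        "DRB + TPP + FT": ["DRB", "TPP", "FT"],
    })
    for name, acc in results.items():
        print(f"{name}: {acc:.3f}")
```

Because the relationship between features and outcome is built into the synthetic generator, the numbers printed here say nothing about real NBA games; the sketch only shows the shape of the procedure, namely training the same learner on different feature subsets and comparing accuracy on held-out games.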

Author information

Corresponding author

Correspondence to Fadi Thabtah.

About this article

Cite this article

Thabtah, F., Zhang, L. & Abdelhamid, N. NBA Game Result Prediction Using Feature Analysis and Machine Learning. Ann. Data. Sci. 6, 103–116 (2019). https://doi.org/10.1007/s40745-018-00189-x

  • DOI: https://doi.org/10.1007/s40745-018-00189-x
