Forecasting Player Behavioral Data and Simulating In-Game Events

  • Anna Guitart
  • Pei Pei Chen
  • Paul Bertens
  • África Periáñez
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 886)


Understanding player behavior is fundamental in game data science. Video games evolve as players interact with them, so being able to foresee player experience would help ensure successful game development. In particular, game developers need to evaluate the impact of in-game events beforehand. Simulation optimization of these events is crucial to increase player engagement and maximize monetization. We present an experimental analysis of several methods to forecast game-related variables, with two main aims: to obtain accurate predictions of in-app purchases and playtime in an operational production environment, and to simulate in-game events in order to maximize sales and playtime. Our ultimate purpose is to take a step towards the data-driven development of games. The results suggest that, even though traditional approaches such as ARIMA still perform better, the outcomes of state-of-the-art techniques like deep learning are promising. Deep learning emerges as a well-suited general model that could be used to forecast a variety of time series with different dynamic behaviors.


Keywords: Social games, Time series, Forecasting, Sequential analysis, Deep learning, ARIMA models, Gradient boosting



We thank Sovannrith Lay for helping to gather the data and Javier Grande for his careful review of the manuscript.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Anna Guitart (1)
  • Pei Pei Chen (1)
  • Paul Bertens (1)
  • África Periáñez (1)
  1. Yokozuna Data unit, Silicon Studio, Shibuya-ku, Japan
