Cluster Evaluation, Description, and Interpretation for Serious Games

Player Profiling in Minecraft
  • David J. Cornforth
  • Marc T. P. Adam
Part of the Advances in Game-Based Learning book series (AGBL)


This chapter describes cluster evaluation, description, and interpretation techniques for assessing player profiles derived from log files available from a game server. Calculated variables were extracted from these logs to characterize players. Using circular statistics, we show how to extract measures that characterize each player by the mean and standard deviation of the time at which they interacted with the server. Feature selection was carried out through a correlation study of the variables extracted from the log data; judged by the clustering results, this process favored a small number of the features. The techniques are demonstrated on a log file data set from the popular online game Minecraft. Automated clustering suggested groups into which Minecraft players fall. Cluster evaluation, description, and interpretation techniques were then applied to provide further insight into distinct behavioral characteristics, and the quality of the clusters was determined using the Silhouette Width measure. We conclude by discussing how the techniques presented in this chapter can be applied in different areas of serious games analytics.
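The two quantitative ideas named in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: it assumes that "time" means time of day on a 24-hour clock, that distances between players are Euclidean, and the function names `circular_mean_std` and `silhouette_width` are hypothetical. The first function treats each timestamp as an angle so that 23:00 and 01:00 average to midnight rather than noon; the second computes the mean Silhouette Width of a given clustering.

```python
import math

HOURS_PER_DAY = 24.0  # assumption: times are hours of the day in [0, 24)


def circular_mean_std(hours):
    """Return (mean_hour, std_hours) of times of day using circular statistics."""
    if not hours:
        raise ValueError("need at least one timestamp")
    # Map each time of day onto an angle on the unit circle.
    angles = [2.0 * math.pi * h / HOURS_PER_DAY for h in hours]
    c = sum(math.cos(a) for a in angles) / len(angles)
    s = sum(math.sin(a) for a in angles) / len(angles)
    # Mean resultant length R in [0, 1]; values near 1 mean concentrated times.
    r = min(math.hypot(c, s), 1.0)
    mean_angle = math.atan2(s, c) % (2.0 * math.pi)
    # Circular standard deviation sqrt(-2 ln R), in radians; infinite if R == 0.
    std_angle = math.sqrt(max(0.0, -2.0 * math.log(r))) if r > 0.0 else math.inf
    to_hours = HOURS_PER_DAY / (2.0 * math.pi)
    return mean_angle * to_hours, std_angle * to_hours


def silhouette_width(points, labels):
    """Return the mean Silhouette Width of a clustering (Euclidean distance)."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        raise ValueError("silhouette width needs at least two clusters")
    total = 0.0
    for i, p in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, points[j]) for j in members) / len(members)
            for lab, members in clusters.items()
            if lab != labels[i]
        )
        if max(a, b) > 0.0:
            total += (b - a) / max(a, b)
    return total / len(points)


if __name__ == "__main__":
    # Two sessions either side of midnight: the circular mean is near 0 h,
    # whereas the arithmetic mean would misleadingly be 12 h.
    print(circular_mean_std([23.0, 1.0]))
    # Made-up points for illustration only, not data from the chapter.
    pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    print(silhouette_width(pts, [0, 0, 1, 1]))
```

A Silhouette Width close to 1 indicates compact, well-separated clusters, while values near or below 0 suggest that points sit between clusters or have been assigned to the wrong one.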


Cluster evaluation; Cluster description; Cluster interpretation; Player profiles; Cognitive performance



We would like to thank Masahiro Takatsuka, who supplied the data used in this study, and Samuel Cornforth, who assisted with decoding server logs and provided details of gameplay in Minecraft.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. University of Newcastle, Callaghan, Australia
  2. University of Newcastle, Callaghan, Australia
