Abstract
The puzzle game 2048, a single-player stochastic game played on a \(4\,\times \,4\) grid, is the most popular among similar slide-and-merge games. One of the strongest computer players for 2048 uses temporal difference learning (TD learning) with N-tuple networks, and it matters a great deal how to design N-tuple networks. In this paper, we study the N-tuple networks for the game 2048. In the first set of experiments, we conduct TD learning by selecting 6- and 7-tuples exhaustively, and evaluate the usefulness of those tuples. In the second set of experiments, we conduct TD learning with high-utility tuples, varying the number of tuples. The best player with ten 7-tuples achieves an average score 234,136 and the maximum score 504,660. It is worth noting that this player utilize no game-tree search and plays a move in about 12 \(\upmu \)s.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsNotes
- 1.
Since it requires 30 GB of memory to conduct the experiment, we used a PC with 32 GB memory for this additional experiment.
References
GPCC (games and puzzles competitions on computers) problems for 2015 (2015, in Japanese). http://hp.vector.co.jp/authors/VA003988/gpcc/gpcc15.htm
Abdelkader, A., Acharya, A., Dasler, P.: On the complexity of slide-and-merge games, [cs.CC] (2015). arXiv:1501.03837
Chabin, T., Elouafi, M., Carvalho, P., Tonda, A.: Using linear genetic programming to evolve a controller for the game 2048 (2015). http://www.cs.put.poznan.pl/wjaskowski/pub/2015-GECCO-2048-Competition/Treecko.pdf
Cirulli, G.: 2048 (2014). http://gabrielecirulli.github.io/2048/
Jaśkowski, W., Szubert, M.: Game 2048 AI controller competition @ GECCO 2015 (2015). http://www.cs.put.poznan.pl/wjaskowski/pub/2015-GECCO-2048-Competition/GECCO-2015-2048-Competition-Results.pdf
Langerman, S., Uno, Y.: Threes!, fives, 1024!, and 2048 are hard. CoRR abs/1505.04274 (2015)
Oka, K., Matsuzaki, K., Haraguchi, K.: Exhaustive analysis and Monte-Carlo tree search player for two-player 2048. Kochi Univ. Technol. Res. Bull. 12(1), 123–130 (2015, in Japanese)
Oka, K., Matsuzaki, K.: An evaluation function for 2048 players: evaluation for the original game and for the two-player variant. In: Proceedings of the 57th Programming Symposium, pp. 9–18 (2016, in Japanese)
van der Ree, M., Wiering, M.: Reinforcement learning in the game of Othello: learning against a fixed opponent and learning from self-play. In: IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), pp. 108–115 (2013)
Rodgers, P., Levine, J.: An investigation into 2048 AI strategies. In: 2014 IEEE Conference on Computational Intelligence and Games, pp. 1–2 (2014)
Samuel, A.L.: Some studies in machine learning using the game of checkers. IBM J. Res. Dev. 44(1), 206–227 (1959)
Schraudolph, N.N., Dayan, P., Sejnowski, T.J.: Learning to evaluate go positions via temporal difference methods. In: Computational Intelligence in Games, pp. 77–98 (2001)
Sutton, R.S.: Learning to predict by the methods of temporal differences. Mach. Learn. 3(1), 9–44 (1988)
Szubert, M., Jaśkowski, W.: Temporal difference learning of N-tuple networks for the game 2048. In: 2014 IEEE Conference on Computational Intelligence and Games, pp. 1–8 (2014)
Tesauro, G.: TD-Gammon, a self-teaching backgammon program, achieves master-level play. Neural Comput. 6(2), 215–219 (1994)
Wu, I.C., Yeh, K.H., Liang, C.C., Chang, C.C., Chiang, H.: Multi-stage temporal difference learning for 2048. In: Cheng, S.-M., Day, M.-Y. (eds.) Technologies and Applications of Artificial Intelligence. LNCS, vol. 8916, pp. 366–378. Springer, Cham (2014). doi:10.1007/978-3-319-13987-6_34
Xiao, R.: nneonneo/2048-ai (2015). https://github.com/nneonneo/2048-ai
Yeh, K.H., Liang, C.C., Wu, K.C., Wu, I.C.: 2048-bot tournament in Taiwan (2014). https://icga.leidenuniv.nl/wp-content/uploads/2015/04/2048-bot-tournament-report-1104.pdf
Yeh, K.H., Wu, I.C., Hsueh, C.H., Chang, C.C., Liang, C.C., Chiang, H.: Multi-stage temporal difference learning for 2048-like games, [cs.LG] (2016). arXiv:1606.07374
Zaky, A.: Minimax and expectimax algorithm to solve 2048 (2014). http://informatika.stei.itb.ac.id/~rinaldi.munir/Stmik/2013-2014-genap/Makalah2014/MakalahIF2211-2014-037.pdf
Acknowledgment
Most of the experiments in this paper were conducted on the IACP cluster of the Kochi University of Technology.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Oka, K., Matsuzaki, K. (2016). Systematic Selection of N-Tuple Networks for 2048. In: Plaat, A., Kosters, W., van den Herik, J. (eds) Computers and Games. CG 2016. Lecture Notes in Computer Science(), vol 10068. Springer, Cham. https://doi.org/10.1007/978-3-319-50935-8_8
Download citation
DOI: https://doi.org/10.1007/978-3-319-50935-8_8
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50934-1
Online ISBN: 978-3-319-50935-8
eBook Packages: Computer ScienceComputer Science (R0)