
Developing a 2048 Player with Backward Temporal Coherence Learning and Restart

  • Kiminori Matsuzaki
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10664)

Abstract

The puzzle game 2048 is a single-player stochastic game played on a \(4\times 4\) grid, and it is the most popular among similar slide-and-merge games. After the appearance of the game, several researchers developed computer players for 2048 based on reinforcement learning methods with N-tuple networks. The state-of-the-art player developed by Jaśkowski combines several techniques, as the title of his paper implies. In this paper, we show that backward learning is very useful for 2048, since a single play of the game consists of quite a long sequence of moves. We also propose a restart strategy that improves learning by focusing on the later stage of the game. The learned player achieved better average scores than the existing players with the same set of N-tuple networks.
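The combination of ideas summarized above can be sketched in a few lines of Python. This is a hypothetical, heavily simplified illustration, not the paper's implementation: the class and method names (`TCLearner`, `learn_backward`) are invented, the N-tuple network is reduced to a flat table of weights indexed by abstract features, and the restart strategy is omitted. It shows only the two core mechanisms: temporal coherence (per-weight step sizes \(|E|/A\) following Beal and Smith) and a backward pass over an episode, so that values learned near the end of a long game propagate toward earlier states in a single sweep.

```python
class TCLearner:
    """Toy N-tuple-style value function trained with temporal
    coherence (TC) learning, applied backward over one episode."""

    def __init__(self, n_weights, meta_rate=1.0):
        self.w = [0.0] * n_weights   # one weight per tuple feature
        self.E = [0.0] * n_weights   # accumulated signed TD errors
        self.A = [0.0] * n_weights   # accumulated absolute TD errors
        self.meta_rate = meta_rate

    def value(self, features):
        # State value = sum of the weights of its active features.
        return sum(self.w[i] for i in features)

    def update(self, features, delta):
        # TC rule: the effective step size |E|/A stays near 1 while
        # errors keep the same sign and decays once they oscillate.
        for i in features:
            alpha = abs(self.E[i]) / self.A[i] if self.A[i] > 0 else 1.0
            self.w[i] += self.meta_rate * alpha * delta / len(features)
            self.E[i] += delta
            self.A[i] += abs(delta)

    def learn_backward(self, episode):
        """episode: list of (features, reward) transitions.
        Process from the final transition toward the first, so
        late-game values propagate backward in a single pass."""
        next_value = 0.0             # value of the terminal state
        for features, reward in reversed(episode):
            delta = reward + next_value - self.value(features)
            self.update(features, delta)
            next_value = self.value(features)
```

With forward updates, the value of an early state only becomes accurate after many episodes, because each pass moves information one step back; with the backward sweep above, a two-step episode with rewards 1 and 2 already assigns the first state its full return of 3 after a single pass.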

Notes

Acknowledgments

Most of the experiments in this paper were conducted on the IACP cluster of the Kochi University of Technology.

References

  1. Beal, D.F., Smith, M.C.: Temporal coherence and prediction decay in TD learning. In: Proceedings of the 16th International Joint Conference on Artificial Intelligence, vol. 1, pp. 564–569 (1999)
  2. Cirulli, G.: 2048 (2014). http://gabrielecirulli.github.io/2048/
  3. Jaśkowski, W.: Mastering 2048 with delayed temporal coherence learning, multi-stage weight promotion, redundant encoding and carousel shaping. IEEE Transactions on Computational Intelligence and AI in Games (2017, accepted for publication)
  4. Matsuzaki, K.: Systematic selection of N-tuple networks with consideration of interinfluence for game 2048. In: Technologies and Applications of Artificial Intelligence (TAAI 2016), pp. 186–193 (2016)
  5. Oka, K., Matsuzaki, K.: Systematic selection of N-tuple networks for 2048. In: Plaat, A., Kosters, W., van den Herik, J. (eds.) CG 2016. LNCS, vol. 10068, pp. 81–92. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-50935-8_8
  6. van der Ree, M., Wiering, M.: Reinforcement learning in the game of Othello: learning against a fixed opponent and learning from self-play. In: IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), pp. 108–115 (2013)
  7. Rodgers, P., Levine, J.: An investigation into 2048 AI strategies. In: 2014 IEEE Conference on Computational Intelligence and Games, pp. 1–2 (2014)
  8. Samuel, A.L.: Some studies in machine learning using the game of checkers. IBM J. Res. Dev. 3, 210–229 (1959)
  9. Schraudolph, N.N., Dayan, P., Sejnowski, T.J.: Learning to evaluate Go positions via temporal difference methods. In: Baba, N., Jain, L.C. (eds.) Computational Intelligence in Games. Studies in Fuzziness and Soft Computing, pp. 77–98. Springer, Heidelberg (2001). https://doi.org/10.1007/978-3-7908-1833-8_4
  10. Sutton, R.S.: Learning to predict by the methods of temporal differences. Mach. Learn. 3, 9–44 (1988)
  11. Szubert, M., Jaśkowski, W.: Temporal difference learning of N-tuple networks for the game 2048. In: 2014 IEEE Conference on Computational Intelligence and Games, pp. 1–8. IEEE (2014)
  12. Tesauro, G.: TD-Gammon, a self-teaching backgammon program, achieves master-level play. Neural Comput. 6, 215–219 (1994)
  13. Wu, I.C., Yeh, K.H., Liang, C.C., Chiang, H.: Multi-stage temporal difference learning for 2048. In: Cheng, S.M., Day, M.Y. (eds.) Technologies and Applications of Artificial Intelligence. LNCS, pp. 366–378. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-13987-6_34
  14. Xiao, R., Vermaelen, W., Morávek, P.: AI for the 2048 game (2015). https://github.com/nneonneo/2048-ai
  15. Yeh, K., Wu, I., Hsueh, C., Chang, C., Liang, C., Chiang, H.: Multi-stage temporal difference learning for 2048-like games. IEEE Transactions on Computational Intelligence and AI in Games (2016, accepted for publication)

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Kochi University of Technology, Kami, Japan
