Improved Diversity in Nested Rollout Policy Adaptation

  • Conference paper
KI 2016: Advances in Artificial Intelligence (KI 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9904)

Abstract

For combinatorial search in single-player games, nested Monte-Carlo search is a natural alternative to algorithms like UCT, which are applied in two-player and general games. To trade off exploration against exploitation, the randomized search procedure intensifies the search with increasing recursion depth. If a concise mapping from states to actions is available, integrating policy learning yields nested rollout policy adaptation (NRPA), while Beam-NRPA keeps a bounded number of solutions at each recursion level. In this paper we propose refinements for Beam-NRPA that improve the runtime and the solution diversity.
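To make the recursion concrete, the following is a minimal sketch of plain NRPA as it is commonly described: level 0 performs a softmax rollout under the current policy, and each higher level repeatedly calls the level below and shifts the policy toward the best sequence found so far. It is not the implementation evaluated in the paper and does not include the beam or the diversity refinements. The toy routing problem, the names `CITIES`, `code`, `rollout`, `adapt` and `nrpa`, and the values `alpha=1.0` and `iterations=10` are assumptions chosen for illustration.

```python
import math
import random

# Hypothetical toy problem (not from the paper): visit every city once
# and maximise the negated path length.
CITIES = [(0, 0), (3, 0), (3, 4), (0, 4), (1, 2)]

def legal_moves(seq):
    """Cities not yet visited in the partial sequence."""
    return [c for c in range(len(CITIES)) if c not in seq]

def code(seq, move):
    """Policy key for a move: (previous city, next city); -1 marks the start."""
    return (seq[-1] if seq else -1, move)

def score(seq):
    """Negated path length, so that larger is better."""
    return -sum(math.dist(CITIES[a], CITIES[b]) for a, b in zip(seq, seq[1:]))

def rollout(policy, rng):
    """Level 0: sample a full sequence with softmax weights from the policy."""
    seq = []
    while len(seq) < len(CITIES):
        moves = legal_moves(seq)
        weights = [math.exp(policy.get(code(seq, m), 0.0)) for m in moves]
        seq.append(rng.choices(moves, weights=weights)[0])
    return score(seq), seq

def adapt(policy, seq, alpha=1.0):
    """Return a copy of the policy shifted toward the moves of seq."""
    new = dict(policy)
    prefix = []
    for move in seq:
        moves = legal_moves(prefix)
        z = sum(math.exp(policy.get(code(prefix, m), 0.0)) for m in moves)
        for m in moves:
            k = code(prefix, m)
            # Lower every legal move in proportion to its old probability.
            new[k] = new.get(k, 0.0) - alpha * math.exp(policy.get(k, 0.0)) / z
        k = code(prefix, move)
        new[k] = new.get(k, 0.0) + alpha  # raise the move that was played
        prefix.append(move)
    return new

def nrpa(level, policy, rng, iterations=10):
    """Nested search: each level adapts its own copy of the policy."""
    if level == 0:
        return rollout(policy, rng)
    best_score, best_seq = -math.inf, []
    for _ in range(iterations):
        s, seq = nrpa(level - 1, policy, rng, iterations)
        if s >= best_score:
            best_score, best_seq = s, seq
        policy = adapt(policy, best_seq)
    return best_score, best_seq
```

A call such as `nrpa(2, {}, random.Random(0))` returns a score together with the visiting order that produced it. Because `adapt` returns a new dictionary, each recursion level learns on its own copy of the policy and leaves the caller's policy untouched, which is what lets deeper levels intensify the search without disturbing the levels above.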

Notes

  1. We used one core of an Intel® Core™ i5-2520M CPU @ 2.50 GHz × 4. The computer has 8 GB of RAM, but every invocation of the algorithm on any problem instance used less than 10 MB of main memory. The software infrastructure was as follows. Operating system: Ubuntu 14.04 LTS; Linux kernel: 3.13.0-74-generic; compiler: g++ version 4.8.4; compiler options: -O3 -march=native -funroll-loops -std=c++11 -Wall.

  2. http://www.js-games.de/eng/games/samegame.

  3. https://www.sintef.no/projectweb/top/vrptw/solomon-benchmark

    http://web.cba.neu.edu/~msolomon/problems.htm.

  4. The sequence of cities we found was 73, 22, 72, 54, 24, 80, 12, 0, 65, 71, 71, 20, 32, 70, 0, 92, 37, 98, 91, 16, 86, 85, 97, 13, 0, 83, 45, 61, 84, 5, 60, 89, 0, 94, 96, 99, 6, 0, 50, 33, 30, 51, 9, 67, 1, 0, 14, 44, 38, 43, 100, 95, 0, 27, 69, 76, 79, 68, 0, 52, 7, 11, 19, 49, 48, 82, 0, 28, 29, 78, 34, 35, 3, 77, 0, 62, 88, 8, 46, 17, 93, 59, 0, 36, 47, 18, 0, 39, 23, 67, 55, 4, 25, 26, 0, 63, 64, 90, 10, 31, 0, 87, 57, 2, 58, 0, 40, 53, 0, 42, 15, 41, 75, 56, 74, 21, 0.

Author information

Corresponding author

Correspondence to Stefan Edelkamp.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Edelkamp, S., Cazenave, T. (2016). Improved Diversity in Nested Rollout Policy Adaptation. In: Friedrich, G., Helmert, M., Wotawa, F. (eds) KI 2016: Advances in Artificial Intelligence. KI 2016. Lecture Notes in Computer Science, vol 9904. Springer, Cham. https://doi.org/10.1007/978-3-319-46073-4_4

  • DOI: https://doi.org/10.1007/978-3-319-46073-4_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-46072-7

  • Online ISBN: 978-3-319-46073-4

  • eBook Packages: Computer Science, Computer Science (R0)
