We propose a new approach for search tree exploration in the context of combinatorial optimization, specifically Mixed Integer Programming (MIP), that is based on UCT, an algorithm for the multi-armed bandit problem designed for balancing exploration and exploitation in an online fashion. UCT has recently been highly successful in game tree search. We discuss the differences that arise when UCT is applied to search trees as opposed to bandits or game trees, and provide initial results demonstrating that the performance of even a highly optimized state-of-the-art MIP solver such as CPLEX can be boosted using UCT’s guidance on a range of problem instances.
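To make the exploration/exploitation trade-off concrete, the following minimal sketch shows the textbook UCB1 selection rule that UCT applies at each tree node. It is not the scoring used in the paper or anything exposed by CPLEX. The names `ucb1_score` and `select_child`, the dictionary fields `mean` and `visits`, and the constant `c` are hypothetical and chosen only for illustration.

```python
import math

def ucb1_score(mean_reward, visits, parent_visits, c=math.sqrt(2)):
    """Textbook UCB1 score: average reward plus an exploration bonus.

    All parameter names are illustrative assumptions, not the paper's notation.
    """
    if visits == 0:
        return math.inf  # unvisited children are always tried first
    # The bonus shrinks as a child is visited more often relative to its parent.
    return mean_reward + c * math.sqrt(math.log(parent_visits) / visits)

def select_child(children, parent_visits, c=math.sqrt(2)):
    """Return the child (a dict with 'mean' and 'visits') with the highest score."""
    return max(
        children,
        key=lambda ch: ucb1_score(ch["mean"], ch["visits"], parent_visits, c),
    )

# Hypothetical example: one well-explored child with a good average,
# one barely explored child with a mediocre average.
children = [
    {"name": "left", "mean": 0.9, "visits": 50},
    {"name": "right", "mean": 0.5, "visits": 1},
]
chosen = select_child(children, parent_visits=51)
greedy = select_child(children, parent_visits=51, c=0.0)
```

With the default constant the rarely visited child wins because of its large exploration bonus, while setting `c=0.0` reduces the rule to pure exploitation. In a MIP search tree the reward would have to be derived from solver information such as relaxation bounds; how to do that is one of the differences from bandits and game trees that the paper discusses, and it is left out of this sketch.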


Keywords (machine-generated, not supplied by the authors): Mixed Integer Programming; Linear Programming Relaxation; Node Selection; Game Tree; Open Node





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Ashish Sabharwal (1)
  • Horst Samulowitz (1)
  • Chandra Reddy (1)
  1. IBM Watson Research Center, Yorktown Heights, USA
